Fig. 3: Prediction accuracy of PGS methods obtained with real phenotypes. | Nature Communications

From: Improving polygenic score prediction for underrepresented groups through transfer learning

Prediction squared correlation (± SE) with real phenotypes varies by trait, method, and the ancestry of the testing data set. Within uses training data from AOU of the same ancestry as the testing data set. Cross uses EU-ancestry data from the UK-Biobank. Combined pools the data used in Within and Cross. The three Transfer Learning (TL-) methods take the estimates obtained with UK-Biobank data (Cross) as priors for a TL algorithm applied to the data set used in the Within method: Gradient Descent with Early Stopping (GDES), Penalized Regression (PR), or Bayesian Mixture Model (BMM). Standard error (SE) was computed analytically via the standard large-sample formula for Pearson’s correlation coefficient. Details of PGS construction are described in Pipeline 1 in Methods. Sample sizes vary among target ancestry groups and traits and are summarized in Supplementary Table 3. Detailed numerical results are reported in Supplementary Data 1. WHR waist-to-hip ratio, DBP diastolic blood pressure, SBP systolic blood pressure.
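The caption does not spell out the SE formula, so the following is only a minimal sketch of one common reading, not the authors' code. It assumes the large-sample approximation SE(r) ≈ (1 − r²)/√(n − 1) for Pearson's r, and it further assumes that an SE for the squared correlation is obtained by the delta method, SE(r²) ≈ 2|r|·SE(r). The function name `r2_with_se` and the simulated data are hypothetical.

```python
import numpy as np


def r2_with_se(phenotype, pgs):
    """Return (r^2, SE of r, SE of r^2) for a phenotype and a polygenic score.

    Assumptions (not confirmed by the source):
      - SE(r) uses the large-sample approximation (1 - r^2) / sqrt(n - 1).
      - SE(r^2) is derived from SE(r) by the delta method: 2 * |r| * SE(r).
    """
    phenotype = np.asarray(phenotype, dtype=float)
    pgs = np.asarray(pgs, dtype=float)
    n = phenotype.size

    # Pearson correlation between observed phenotype and predicted score.
    r = np.corrcoef(phenotype, pgs)[0, 1]

    se_r = (1.0 - r**2) / np.sqrt(n - 1)  # large-sample SE of r
    se_r2 = 2.0 * abs(r) * se_r           # delta-method SE of r^2
    return r**2, se_r, se_r2


# Hypothetical usage on simulated data (values are illustrative only).
rng = np.random.default_rng(0)
score = rng.normal(size=2000)
trait = 0.3 * score + rng.normal(size=2000)
r2, se_r, se_r2 = r2_with_se(trait, score)
print(f"r^2 = {r2:.4f}, SE(r) = {se_r:.4f}, SE(r^2) = {se_r2:.4f}")
```

Because the SE shrinks roughly with 1/√n, ancestry groups with smaller testing sets will show wider error bars at the same r², which is worth keeping in mind when comparing panels with different sample sizes.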
