Fig. 1: Graphical representation of the pipeline used to benchmark transfer learning (TL) algorithms for polygenic scores (PGS) derived using p-value thresholding.
From: Improving polygenic score prediction for underrepresented groups through transfer learning

The pipeline used European (EU) and non-EU training data from the UK-Biobank (UKB) and the All of Us (AOU) cohorts to train six types of polygenic scores: cross-ancestry, within-ancestry, combined-ancestry, and three TL methods implemented in the GPTL R-package (TL-GDES, TL-PR, and TL-BMM). Prediction accuracy was evaluated on test data from non-EU ancestries, including African Americans and Hispanics. The pipeline consists of three main steps: (1) variant selection based on GWAS results; (2) variant effect estimation using sufficient statistics (SS); and (3) evaluation of prediction accuracy in an independent test data set.
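
The three steps can be illustrated with a minimal sketch in Python. It is not the authors' pipeline and does not use the GPTL R-package or any TL method. The genotypes and phenotypes are simulated, and the p-value threshold (1e-3), the ridge penalty (lam = 1.0), and all function names are assumptions chosen for illustration. Step 2 is shown as ridge regression computed from the sufficient statistics X'X and X'y, a stand-in for whichever estimator is actually applied.

```python
import math
import random


def marginal_pvalue(x, y):
    """Two-sided p-value for the slope of y on one variant (normal approximation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:  # monomorphic variant: no test possible
        return 1.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    rss = sum((b - my - beta * (a - mx)) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return math.erfc(abs(beta / se) / math.sqrt(2))


def sufficient_stats(X, y, cols):
    """Centered X'X and X'y restricted to the selected variant columns."""
    n = len(y)
    my = sum(y) / n
    means = [sum(row[j] for row in X) / n for j in cols]
    Z = [[row[j] - m for j, m in zip(cols, means)] for row in X]
    yc = [v - my for v in y]
    k = len(cols)
    xtx = [[sum(z[a] * z[b] for z in Z) for b in range(k)] for a in range(k)]
    xty = [sum(z[a] * v for z, v in zip(Z, yc)) for a in range(k)]
    return xtx, xty


def solve_ridge(xtx, xty, lam):
    """Solve (X'X + lam*I) b = X'y by Gaussian elimination with partial pivoting."""
    k = len(xty)
    A = [[xtx[i][j] + (lam if i == j else 0.0) for j in range(k)] + [xty[i]]
         for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k + 1):
                A[r][j] -= f * A[c][j]
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (A[i][k] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b


def r_squared(pred, obs):
    """Squared Pearson correlation between predicted and observed phenotypes."""
    n = len(obs)
    mp, mo = sum(pred) / n, sum(obs) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    vp = sum((p - mp) ** 2 for p in pred)
    vo = sum((o - mo) ** 2 for o in obs)
    return cov * cov / (vp * vo)


def simulate(n, effects, rng, maf=0.3):
    """Simulated genotypes (0/1/2 allele counts) and phenotypes; illustrative only."""
    X = [[sum(rng.random() < maf for _ in range(2)) for _ in effects]
         for _ in range(n)]
    y = [sum(b * g for b, g in zip(effects, row)) + rng.gauss(0, 1) for row in X]
    return X, y


rng = random.Random(1)
effects = [0.5] * 5 + [0.0] * 45  # 5 causal variants, 45 null variants
X_train, y_train = simulate(500, effects, rng)
X_test, y_test = simulate(300, effects, rng)

# Step 1: variant selection by p-value thresholding on single-variant tests.
pvals = [marginal_pvalue([row[j] for row in X_train], y_train)
         for j in range(len(effects))]
selected = [j for j, p in enumerate(pvals) if p < 1e-3]

# Step 2: effect estimation from sufficient statistics (ridge as a stand-in).
xtx, xty = sufficient_stats(X_train, y_train, selected)
beta = solve_ridge(xtx, xty, lam=1.0)

# Step 3: prediction accuracy in an independent test set.
pgs = [sum(b * row[j] for b, j in zip(beta, selected)) for row in X_test]
accuracy = r_squared(pgs, y_test)
print(f"{len(selected)} variants selected; test R^2 = {accuracy:.3f}")
```

Because step 2 only touches X'X and X'y, the same estimator could in principle be run without access to individual-level training data, which is the practical appeal of working with sufficient statistics.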