Fig. 2 | Scientific Reports

Fig. 2

From: Polygenic prediction of human longevity on the supposition of pervasive pleiotropy

Overview of the iLGS model construction pipeline. We derived integrated Longevity Genetic Scores (iLGS) in three steps: (a) First, across unrelated individuals in the Einstein LonGenity cohort, we calculated polygenic scores (PGS) for a total of 87 traits (53 associated traits from PheWAS analysis plus 34 additional blood and urine biomarkers). We used LD clumping and p-value thresholding to derive the best score for each trait; (b) Next, we applied a stacked Elastic-net regression framework to combine the polygenic scores and derive iLGS. In this approach, the output of one model is passed as input to the next to increase model accuracy and interpretability. By sandwiching a polynomial regression between two Elastic-net regressions, we created a search space from which the model selects the most informative PGS and their interactions for predicting longevity. Starting from 87 PGS as input, the model derived 369 PGS and interaction terms, with an accuracy of around 91% in the derivation set under fivefold cross-validation; (c) Using the validation portion of the Einstein LonGenity cohort (n = 242) and two additional cohorts, the Wellderly cohort (n = 510) and the Australian Healthy Ageing cohort (MRGB; n = 2,570), we replicated the predictive performance of iLGS for distinguishing differential lifespan. Our model achieved an area under the curve (AUC) of 87% in the validation portion of the Einstein LonGenity cohort.
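Step (b) can be illustrated with a minimal sketch in Python using scikit-learn. This is one possible reading of the "sandwich" design, not the authors' implementation: a first Elastic-net selects PGS, a polynomial expansion adds pairwise interaction terms, and a second Elastic-net selects among the expanded terms. The function names (`build_stacked_pipeline`, `demo_auc`), the hyperparameters (`alpha`, `l1_ratio`), the restriction to pairwise interactions, and the synthetic data are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import ElasticNet
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler


def build_stacked_pipeline(alpha=0.01, l1_ratio=0.5):
    """Elastic-net -> polynomial expansion -> Elastic-net (hypothetical settings)."""
    return Pipeline([
        ("scale", StandardScaler()),
        # Stage 1: keep only the PGS with a non-zero Elastic-net coefficient.
        ("select", SelectFromModel(
            ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=10000),
            threshold=1e-5)),
        # Stage 2: add pairwise interaction terms between the retained PGS.
        ("interact", PolynomialFeatures(
            degree=2, interaction_only=True, include_bias=False)),
        ("rescale", StandardScaler()),
        # Stage 3: a second Elastic-net shrinks uninformative terms to zero.
        ("final", ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=10000)),
    ])


def demo_auc(n_samples=300, n_pgs=20, seed=0):
    """Fivefold cross-validated AUC on synthetic stand-in data (not real PGS)."""
    X, y = make_classification(
        n_samples=n_samples, n_features=n_pgs, n_informative=5,
        n_redundant=0, n_clusters_per_class=1, random_state=seed)
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    # Out-of-fold continuous scores, pooled across folds before computing AUC.
    scores = cross_val_predict(build_stacked_pipeline(), X, y, cv=cv)
    return roc_auc_score(y, scores)


if __name__ == "__main__":
    print(f"Cross-validated AUC on synthetic data: {demo_auc():.3f}")
```

Because the selection and expansion steps sit inside the pipeline, they are refit within each cross-validation fold, so the held-out fold does not influence which terms are retained. The AUC printed here reflects only the synthetic data and says nothing about the performance reported in the caption.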
