Fig. 1
From: Polygenic prediction of human longevity on the supposition of pervasive pleiotropy

Schematic overview of the analysis. Our analysis comprises six major steps. In Step 01, we used the sliding regression framework to track allele frequency changes across 48 age bins in the Einstein LonGenity cohort. We fitted a robust linear regression with an MM-estimator for each variant to derive the slope and p-value of its association with chronological age. Variants surpassing the genome-wide significance threshold (p < 5 × 10⁻⁸) were followed up in the UK Biobank. In Step 02, the association of 34 candidate variants with 614 disease endpoints and traits was investigated in a PheWAS analysis. For each variant, we used Bonferroni-corrected association p-values to shortlist associated traits. In Step 03, we used GWAS summary statistics for the 87 traits (53 associated traits from the PheWAS analysis + 34 additional blood and urine biomarkers) to calculate polygenic scores (PGS) for unrelated participants in the Einstein LonGenity cohort (n = 957). In Step 04, we split our cohort into a derivation set (consisting of training and test sets, n = 715) and a validation set (n = 242). Using the derivation set, we applied a stacked Elastic-net regression framework to combine PGS across the 87 traits and construct a composite prognostic score to distinguish survival chances. We trained the ensemble model on 65% of the derivation set and tested it on the remaining 35%. Our model achieved an AUC of 0.87 in the validation set. Using coefficients derived from our stacked model, we computed the integrated Longevity Genetic Scores (iLGS). Subsequently, using an external cohort (the GERA cohort), we converted the scores to a set of 3.8 million variant weights.
In Step 05, we carried out survival analysis to test the performance of iLGS in predicting survival in the validation set as well as in two external cohorts, the Wellderly and MRGB cohorts. In Step 06, we carried out a sex-stratified association analysis to identify proteomic correlates of iLGS. Proteins significantly associated with iLGS were subsequently queried in the DrugBank database to identify druggable targets. Drugs targeting or interacting with iLGS-associated proteins were investigated for potential repurposing as senolytics.
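
The sample sizes quoted for Step 04 and the per-variant Bonferroni correction in Step 02 can be checked with a short sketch. It is illustrative only and not the authors' pipeline. The function names, the placeholder participant IDs, the random seed, the shuffling strategy, and the 0.05 significance level are all assumptions, as is the reading that each variant is corrected for 614 tests; the caption gives only the counts (957, 715, 242, 614) and the 65%/35% split.

```python
import random


def split_cohort(ids, n_validation, train_fraction, seed=0):
    """Shuffle IDs, hold out a validation set, then split the rest into train/test.

    Hypothetical helper: the caption does not say how participants were assigned.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    validation = shuffled[:n_validation]
    derivation = shuffled[n_validation:]
    n_train = round(train_fraction * len(derivation))
    return derivation[:n_train], derivation[n_train:], validation


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Placeholder IDs standing in for the 957 unrelated participants.
participants = [f"P{i:04d}" for i in range(957)]
train, test, validation = split_cohort(
    participants, n_validation=242, train_fraction=0.65
)

# Assumed: alpha = 0.05 and 614 endpoints tested per variant.
phewas_threshold = bonferroni_threshold(0.05, 614)

print(len(train) + len(test), len(validation))  # derivation and validation sizes
print(len(train), len(test))                    # 65% / 35% of the derivation set
print(f"{phewas_threshold:.2e}")                # per-variant PheWAS threshold
```

Under these assumptions the derivation set of 715 participants divides into roughly 465 for training and 250 for testing, with 242 held out for validation.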