Fig. 5: T1GRS reveals nonlinear interactions and genetic complexity in T1D.
From: Genetic association and machine learning improve the prediction of type 1 diabetes risk

a, SHAP analysis showing feature importance for the top 25 features in discovery cohort individuals assessed with T1GRS-cov. Colors indicate the contribution of 2, 1 or 0 copies of the risk allele. Positions at either end of the x axis indicate the greatest impact on T1D classification. b, Chord diagram mapping the strongest interactions between T1GRS variants. Three categories are highlighted: blue represents MHC–MHC interactions, green represents MHC–non-MHC interactions and orange represents non-MHC–non-MHC interactions. Variants with a key interaction, rs1064173/INS, are highlighted with a star. c, Top ten feature interactions, ranked by interaction value, shown for each of the categories defined in b. The FDR < 0.05 threshold is indicated by a dashed red line. The same key interaction, rs1064173/INS, is highlighted with a star. d, Two-dimensional partial dependence plot of INS and rs1064173 as a function of allele count. Values between allele counts of 0 and 1, and between 1 and 2, have been interpolated for visualization purposes. e, Dueling density plots of SHAP values, averaged across individuals for each feature, for MHC and non-MHC variants, where individuals are binned by HLA-DR3 or HLA-DR4 haplotype status. A blue asterisk indicates that the MHC distribution is significantly different from 0, while a red asterisk indicates a significant difference from 0 for the non-MHC distribution. P values were calculated by a one-sample, two-tailed t test. Error bars on the plot represent the interquartile range. f, AUC values for T1GRS (purple), GRS2 (red) and a logistic regression model (LogReg, orange) plotted at increasing Complexity Score deciles (Methods). Results of two-sided DeLong tests (*P < 0.05, **P < 0.001, ***P < 0.0001) for T1GRS versus GRS2 and GRS2 versus LogReg are shown between the AUC points at each decile. Specific P values for AUC and DeLong tests can be found in Supplementary Table 14.
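
The per-decile comparison in panel f can be illustrated with a minimal sketch. It is not the authors' code and rests on several assumptions: the Complexity Score (defined in the paper's Methods) is taken as a precomputed per-individual value, individuals are split into equal-sized rank-based bins, and AUC is computed separately within each bin rather than cumulatively. The function names `auc` and `auc_by_decile` are hypothetical, and the DeLong test is not implemented here.

```python
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of case-control pairs in which the case has the
    higher score (ties count as 0.5). Labels: 1 = case, 0 = control."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def auc_by_decile(
    labels: Sequence[int],
    scores: Sequence[float],
    complexity: Sequence[float],
    n_bins: int = 10,
) -> List[float]:
    """Sort individuals by a precomputed complexity value (assumed input),
    split them into n_bins equal-sized rank bins, and return one AUC per bin,
    ordered from lowest to highest complexity."""
    order = sorted(range(len(labels)), key=lambda i: complexity[i])
    results = []
    for b in range(n_bins):
        lo = b * len(order) // n_bins
        hi = (b + 1) * len(order) // n_bins
        idx = order[lo:hi]
        results.append(auc([labels[i] for i in idx], [scores[i] for i in idx]))
    return results
```

Calling `auc_by_decile` once per model (for example with T1GRS, GRS2 and logistic regression scores for the same individuals) would give three curves over the same bins, which is the layout the panel describes. Each bin must contain at least one case and one control, otherwise `auc` raises a `ValueError`.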