Extended Data Fig. 2: Quality check of the LASSO modeling. | Nature Medicine

From: Multiomic signatures of body mass index identify heterogeneous health phenotypes and responses to a lifestyle intervention

a,b, Pairwise correlation of all plasma analytes (a; Metabolomics: 766 metabolites, Proteomics: 274 proteins, Clinical labs: 71 clinical laboratory tests, Combined omics: 1,111 analytes) or the analytes that were retained across all ten LASSO models (b; Metabolomics: 62 metabolites, Proteomics: 30 proteins, Clinical labs: 20 clinical laboratory tests, Combined omics: 132 analytes). Each violin is scaled to have the same width across the omics categories and represents the kernel density distribution with the standard boxplot (Methods). c, Hierarchical clustering and heatmap for the pairwise correlations of the analytes that were retained across all ten CombiBMI models (132 analytes: 77 metabolites, 51 proteins and four clinical laboratory tests). Of note, both the upper and lower triangles of the symmetric matrix are visualized. d, Model performance of each fitted BMI model with sex stratification. Out-of-sample R² was calculated from each corresponding hold-out testing set. Standard measures: OLS linear regression model with sex, age, triglycerides, HDL cholesterol, LDL cholesterol, glucose, insulin and HOMA-IR as regressors; Padj: adjusted P value of two-sided Welch’s t-test with the Benjamini–Hochberg method across the eight (four comparisons × two sexes) comparisons. Data: mean with 95% confidence interval, n = 10 models. All exact values of test summaries are found in Supplementary Data 10. Note that the sample size for modeling differed between female and male participants (Female: 821 participants versus Male: 456 participants). e–h, Transition of out-of-sample R² in the LASSO-modeling iteration analysis (Methods) for metabolomics (e), proteomics (f), clinical labs (g) or combined omics (h). An iteration is highlighted with shading when the removed analyte is a variable that was retained across all the original ten models. Data: mean with 95% confidence interval, n = 10 models.
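
For readers unfamiliar with the two summary statistics named in panel d, the following is a minimal sketch of how an out-of-sample R² and Benjamini–Hochberg-adjusted P values can be computed. It is not the authors' code: the function names are hypothetical, and the R² baseline is assumed here to be the mean of the hold-out set (the legend does not specify which mean was used).

```python
def out_of_sample_r2(y_true, y_pred):
    """R^2 on a hold-out set: 1 - SS_res / SS_tot.

    Assumption: SS_tot is taken around the hold-out mean.
    """
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg-adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In panel d the adjustment would run over the eight P values (four comparisons × two sexes); any P values passed to `benjamini_hochberg` in an illustration are placeholders, not results from the study.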
