Fig. 2: Flowchart of genetic-image integrative analysis. A Data partitioning and model training.
From: AI-enhanced integration of genetic and medical imaging data for risk assessment of Type 2 diabetes

Phenotype Definition IV is used as an example to illustrate the process. Data from 7,786 individuals were divided into four subsets: a training dataset (N = 4689), a validation dataset (N = 1175), and two independent testing datasets (N = 1469 and N = 444, respectively). The best XGBoost model was then established. B Flowchart of PRS construction. The polygenic risk score (PRS) was constructed with PRS-CSx, using genome-wide association study (GWAS) summary statistics for the European (EUR), East Asian (EAS), and South Asian (SAS) populations obtained from the analysis of the DIAGRAM Project. Source data are provided as a Source Data file.
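
The partitioning in panel A can be illustrated with a minimal sketch that cuts a pool of individuals into disjoint subsets of the four sizes given in the caption. The caption does not state how individuals were assigned to each subset, so the seeded random shuffle is an assumption, and the second testing dataset may in practice come from a separate source rather than from the same pool. The names `SUBSET_SIZES` and `partition_ids` and the integer IDs are hypothetical and used for illustration only.

```python
import random

# Subset sizes as stated in the caption (Phenotype Definition IV).
SUBSET_SIZES = {
    "train": 4689,
    "validation": 1175,
    "test_1": 1469,
    "test_2": 444,
}


def partition_ids(ids, sizes, seed=0):
    """Shuffle ids and cut them into disjoint, named subsets of the given sizes.

    Assumption: assignment is a simple seeded random shuffle; the caption
    does not say how the authors assigned individuals to subsets.
    """
    shuffled = list(ids)
    if len(shuffled) != sum(sizes.values()):
        raise ValueError("subset sizes must sum to the number of ids")
    random.Random(seed).shuffle(shuffled)
    subsets, start = {}, 0
    for name, size in sizes.items():
        subsets[name] = shuffled[start:start + size]
        start += size
    return subsets


# Hypothetical integer IDs standing in for individuals.
ids = range(sum(SUBSET_SIZES.values()))
subsets = partition_ids(ids, SUBSET_SIZES)
```

In a workflow like the one in the figure, the training and validation subsets would be used to fit and tune the model, and the two testing subsets would be held out for evaluation only.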