Fig. 1: Workflow and overall performance of ProtAIDe-Dx on GNPC.
From: A deep joint-learning proteomics model for diagnosis of six conditions associated with dementia

a, Project workflow. (1) Model development on GNPC with a tenfold cross-validation procedure. The joint-learning framework trains multiple neurodegenerative classification tasks together, so that information is shared across tasks during training. (2) Model evaluation on GNPC (blue panels) and the BioFINDER-2 cohort (orange panel). Model outputs include not only classifications but also probabilities for each class, and the contributing proteins and embeddings can be probed to better understand model choices. (3) Individual neurological disease risk report based on the developed model. b, Overall model performance on GNPC. Left: BCA score for each diagnostic task. Right: AUC score for each diagnostic task. Box plots were drawn across ten cross-validation folds, showing the median, interquartile range (IQR; 25th to 75th percentiles) and whiskers extending to 1.5× IQR. Two-sided corrected resample t-tests were applied; P values were FDR corrected. c, Normalized AD probabilities stratified by APOE ε2/ε4 group, shown for participants with (lower diagonal) and without (upper diagonal) an AD diagnosis. d, Correlation between normalized AD probabilities and MMSE. Error bands represent the 95% confidence interval of the regression line. e, Receiver operating characteristic (ROC) curve of model generalization to a new task: predicting longitudinal clinical progression (from no cognitive impairment to future cognitive impairment). Shaded regions represent the mean ROC ± 1 s.d. across ten cross-validation folds. Schematic and logo in a created in BioRender; An, L. https://biorender.com/q2by4y5 (2026).
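The box-plot elements described for panel b can be illustrated with a minimal sketch. It assumes the common Tukey convention, in which each whisker ends at the most extreme observation that still lies within 1.5× IQR of the box; the caption does not state which quartile interpolation or whisker convention the authors used. The function name `box_plot_summary` and the ten fold scores are hypothetical and are not taken from the paper.

```python
from statistics import quantiles


def box_plot_summary(scores):
    """Summarize per-fold scores as box-plot elements (hypothetical helper)."""
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    ordered = sorted(scores)
    # Quartiles by linear interpolation between order statistics
    # (assumption: the caption does not name the interpolation method).
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Tukey convention (assumption): whiskers stop at the most extreme
    # observations that are still inside the fences.
    whisker_low = min(x for x in ordered if x >= low_fence)
    whisker_high = max(x for x in ordered if x <= high_fence)
    outliers = [x for x in ordered if x < low_fence or x > high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": whisker_low,
        "whisker_high": whisker_high,
        "outliers": outliers,
    }


# Made-up scores for ten cross-validation folds, for illustration only.
example_fold_scores = [0.80, 0.82, 0.81, 0.79, 0.83, 0.84, 0.80, 0.78, 0.82, 0.60]
summary = box_plot_summary(example_fold_scores)
print(summary)
```

With these made-up scores, the fold at 0.60 falls below the lower fence and would be drawn as an individual point rather than covered by the whisker.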