Table 2. Results of model prediction performance. (A) Internal test using the GARD cohort dataset. The baseline models consisted of three logistic regression models using (1) MMSE scores, (2) expert-assessed RCFT scores, and (3) AI-generated RCFT scores produced by a previously developed AI-based scoring model, as well as (4) a deep learning model using only the spatial stream. All baseline models included chronological age, sex, and education as covariates. The data were split 6:2:2 into training, validation, and test sets, and this split was repeated 50 times. (B) External test using the WUH cohort dataset. "Experts-assessed RCFT scores" refers to the models using the initial expert-assessed scores before QC, while "Expert-corrected RCFT scores" refers to the models using the scores that the experts corrected after re-evaluating them against the AI-generated RCFT scores.
(A) Internal test using the GARD cohort dataset

Input modality | AUC | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|
MMSE scores | 0.714 [0.706–0.721] | 0.660 [0.652–0.667] | 0.625 [0.613–0.636] | 0.694 [0.685–0.704] |
Experts-assessed RCFT scores | 0.776 [0.768–0.782] | 0.705 [0.699–0.712] | 0.700 [0.689–0.711] | 0.711 [0.700–0.722] |
AI-generated RCFT scores | 0.777 [0.770–0.783] | 0.710 [0.703–0.717] | 0.699 [0.689–0.709] | 0.721 [0.710–0.731] |
RCFT images only | 0.803 [0.768–0.837] | 0.731 [0.702–0.761] | 0.701 [0.661–0.741] | 0.762 [0.720–0.804] |
RCFT images + AI-generated scores (Our method) | 0.852 [0.837–0.869] | 0.771 [0.755–0.787] | 0.742 [0.718–0.767] | 0.800 [0.774–0.823] |

(B) External test using the WUH cohort dataset

Input modality | AUC | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|
Experts-assessed RCFT scores | 0.750 [0.750–0.751] | 0.709 [0.707–0.712] | 0.832 [0.829–0.835] | 0.575 [0.571–0.579] |
Expert-corrected RCFT scores | 0.813 [0.812–0.814] | 0.750 [0.748–0.753] | 0.849 [0.845–0.852] | 0.643 [0.639–0.648] |
AI-generated RCFT scores | 0.804 [0.803–0.805] | 0.722 [0.721–0.725] | 0.799 [0.797–0.802] | 0.639 [0.634–0.644] |
RCFT images only | 0.837 [0.814–0.860] | 0.744 [0.719–0.768] | 0.743 [0.690–0.800] | 0.745 [0.697–0.792] |
RCFT images + AI-generated scores (Our method) | 0.872 [0.862–0.882] | 0.781 [0.768–0.795] | 0.836 [0.807–0.864] | 0.722 [0.687–0.757] |
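
The evaluation protocol described in the caption (a 6:2:2 split repeated 50 times, with accuracy, sensitivity, and specificity reported per model) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names `split_622` and `binary_metrics`, the use of one random seed per repetition, the absence of stratification, and the sample count of 100 are all assumptions made for the example.

```python
import random


def split_622(n_samples, seed):
    """Shuffle sample indices and cut them 60/20/20 (hypothetical helper)."""
    rng = random.Random(seed)          # assumption: one seed per repetition
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = n_samples * 6 // 10      # integer arithmetic avoids float rounding
    n_val = n_samples * 2 // 10
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]   # remainder goes to the test set
    return train, val, test


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from 0/1 labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return NaN instead of raising when a class is absent.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
    }


N_REPEATS = 50  # number of repetitions stated in the caption
splits = [split_622(100, seed) for seed in range(N_REPEATS)]
```

In such a setup, a model would be fitted on each training set, tuned on the validation set, and scored on the test set with `binary_metrics`, and the 50 per-split scores would then be summarized.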