Table 2 Results of model prediction performance. (A) Internal test using the GARD cohort dataset. The baseline models consisted of three logistic regression models using (1) MMSE scores, (2) expert-assessed RCFT scores, and (3) AI-generated RCFT scores produced by a previously developed AI-based scoring model, as well as (4) a deep learning model using only the spatial stream. All baseline models included chronological age, sex, and education as covariates. The data were split 6:2:2 into training, validation, and testing sets, and this process was repeated 50 times. (B) External test using the WUH cohort dataset. "Experts-assessed RCFT scores" refers to the models using the initial expert-assessed scores before QC, while "Expert-corrected RCFT scores" refers to the models using the expert-corrected scores obtained after re-evaluation based on comparisons with the AI-generated RCFT scores.
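The 6:2:2 split repeated 50 times, as described in the caption, can be illustrated with a minimal sketch. The function names, the use of the repeat index as the random seed, and the sample count of 1000 are assumptions made for illustration; the caption does not state whether the authors stratified the split or how they seeded it.

```python
import random


def split_622(n_samples, seed):
    """Shuffle indices 0..n_samples-1 and cut them 60/20/20 (hypothetical helper)."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = n_samples * 6 // 10  # integer arithmetic avoids float rounding
    n_val = n_samples * 2 // 10
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def repeated_splits(n_samples, n_repeats=50):
    """One independent 6:2:2 split per repeat; seeding by repeat index is an assumption."""
    return [split_622(n_samples, seed) for seed in range(n_repeats)]


# n_samples=1000 is a placeholder, not the size of the GARD cohort.
splits = repeated_splits(n_samples=1000)
train, val, test = splits[0]
print(len(splits), len(train), len(val), len(test))  # 50 600 200 200
```

Each repeat would train on `train`, tune on `val`, and report metrics on `test`; the per-repeat metrics are then aggregated into the values shown in the table.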

From: Multi-stream deep learning framework integrating images and feature representations to predict mild cognitive impairment using the rey complex figure test

(A) Internal test using the GARD cohort dataset

| Input modality | AUC | Accuracy | Sensitivity | Specificity |
| --- | --- | --- | --- | --- |
| MMSE scores | 0.714 [0.706–0.721] | 0.660 [0.652–0.667] | 0.625 [0.613–0.636] | 0.694 [0.685–0.704] |
| Experts-assessed RCFT scores | 0.776 [0.768–0.782] | 0.705 [0.699–0.712] | 0.700 [0.689–0.711] | 0.711 [0.700–0.722] |
| AI-generated RCFT scores | 0.777 [0.770–0.783] | 0.710 [0.703–0.717] | 0.699 [0.689–0.709] | 0.721 [0.710–0.731] |
| RCFT images only | 0.803 [0.768–0.837] | 0.731 [0.702–0.761] | 0.701 [0.661–0.741] | 0.762 [0.720–0.804] |
| RCFT images + AI-generated scores (Our method) | 0.852 [0.837–0.869] | 0.771 [0.755–0.787] | 0.742 [0.718–0.767] | 0.800 [0.774–0.823] |

(B) External test using the WUH cohort dataset

| Input modality | AUC | Accuracy | Sensitivity | Specificity |
| --- | --- | --- | --- | --- |
| Experts-assessed RCFT scores | 0.750 [0.750–0.751] | 0.709 [0.707–0.712] | 0.832 [0.829–0.835] | 0.575 [0.571–0.579] |
| Expert-corrected RCFT scores | 0.813 [0.812–0.814] | 0.750 [0.748–0.753] | 0.849 [0.845–0.852] | 0.643 [0.639–0.648] |
| AI-generated RCFT scores | 0.804 [0.803–0.805] | 0.722 [0.721–0.725] | 0.799 [0.797–0.802] | 0.639 [0.634–0.644] |
| RCFT images only | 0.837 [0.814–0.860] | 0.744 [0.719–0.768] | 0.743 [0.690–0.800] | 0.745 [0.697–0.792] |
| RCFT images + AI-generated scores (Our method) | 0.872 [0.862–0.882] | 0.781 [0.768–0.795] | 0.836 [0.807–0.864] | 0.722 [0.687–0.757] |
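For readers unfamiliar with the four reported metrics (AUC, accuracy, sensitivity, specificity), the sketch below shows how each can be computed from binary labels and predicted probabilities. It is an illustration, not the authors' evaluation code: the function names, the 0.5 decision threshold, the coding of MCI as the positive class, and the toy data are all assumptions.

```python
def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity after thresholding (0.5 is an assumed cutoff)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Toy data (1 = MCI, 0 = cognitively normal); not taken from the study.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.4, 0.3, 0.6, 0.1, 0.8, 0.2]

print(auc(labels, scores))                # 0.9375
print(threshold_metrics(labels, scores))  # accuracy, sensitivity, specificity all 0.75
```

Because AUC is threshold-free while the other three depend on the chosen cutoff, a model can rank cases well (high AUC) and still trade sensitivity against specificity, as the external-test rows with high sensitivity and lower specificity suggest.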