Table 3 Data extraction agent performance as a multi-class classifier

From: CARE-AD: a multi-agent large language model framework for Alzheimer’s disease prediction using longitudinal clinical notes

| Symptom category | Precision | Recall | F1-score |
|---|---|---|---|
| Cognitive impairment | 0.77 (0.75, 0.78) | 0.82 (0.83, 0.84) | 0.79 (0.77, 0.81) |
| Notice/Concern by others | 0.84 (0.83, 0.84) | 0.39 (0.37, 0.41) | 0.53 (0.52, 0.55) |
| Requires assistance | 0.69 (0.67, 0.70) | 0.65 (0.63, 0.67) | 0.67 (0.65, 0.68) |
| Physiological changes | 0.74 (0.73, 0.75) | 0.76 (0.74, 0.77) | 0.75 (0.73, 0.77) |
| Neuropsychiatric symptoms | 0.78 (0.77, 0.80) | 0.83 (0.81, 0.85) | 0.80 (0.78, 0.82) |

Overall accuracy

Micro-average: 0.75 (0.73, 0.77)

Macro-average: 0.76 (0.74, 0.78)
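
As a reading aid, the per-category F1-scores in Table 3 can be cross-checked against the reported precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the source does not state how the authors computed it. The names `TABLE_3`, `f1_score`, and `macro_average` are hypothetical and introduced only for illustration. The point estimates are copied from the table, and the intervals in parentheses are not used.

```python
from statistics import mean

# Point estimates copied from Table 3: (precision, recall, reported F1).
# The intervals in parentheses are omitted here.
TABLE_3 = {
    "Cognitive impairment": (0.77, 0.82, 0.79),
    "Notice/Concern by others": (0.84, 0.39, 0.53),
    "Requires assistance": (0.69, 0.65, 0.67),
    "Physiological changes": (0.74, 0.76, 0.75),
    "Neuropsychiatric symptoms": (0.78, 0.83, 0.80),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed standard definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


def macro_average(values) -> float:
    """Unweighted mean across categories (every category counts equally)."""
    return mean(values)


if __name__ == "__main__":
    for category, (p, r, reported) in TABLE_3.items():
        recomputed = f1_score(p, r)
        print(f"{category}: recomputed F1 = {recomputed:.2f}, reported = {reported:.2f}")

    precisions = [p for p, _, _ in TABLE_3.values()]
    print(f"Unweighted mean precision = {macro_average(precisions):.2f}")
```

Under this assumption, each recomputed F1 rounds to the value reported in the table. The table does not say which metric the micro- and macro-average rows summarize, so `macro_average` is applied to precision here purely as an illustration of an unweighted per-category mean. A micro-average would additionally need the per-category counts, which the table does not provide.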