Table 5. Performance comparison of the proposed CARE-AD method with baseline models for the −10-year prediction

Method	LLM calls	AD cases (precision/recall/F-score)	Controls (precision/recall/F-score)	Accuracy
Zero-shot	1	0.09 (0.08, 0.09)/0.25 (0.23, 0.28)/0.13 (0.11, 0.14)	0.56 (0.54, 0.57)/0.26 (0.25, 0.27)/0.35 (0.34, 0.37)	0.26 (0.25, 0.27)
Chain of thought (CoT)	1	0.11 (0.10, 0.12)/0.27 (0.25, 0.29)/0.15 (0.13, 0.17)	0.65 (0.63, 0.67)/0.37 (0.35, 0.39)/0.47 (0.45, 0.49)	0.35 (0.33, 0.37)
Self-consistency	6 reasoning paths	0.13 (0.11, 0.15)/0.29 (0.26, 0.32)/0.18 (0.16, 0.20)	0.70 (0.69, 0.71)/0.47 (0.45, 0.49)/0.56 (0.54, 0.58)	0.43 (0.42, 0.44)
Self-refine	6 refine rounds	0.16 (0.14, 0.18)/0.36 (0.33, 0.39)/0.22 (0.19, 0.25)	0.73 (0.72, 0.74)/0.47 (0.45, 0.49)/0.57 (0.55, 0.59)	0.45 (0.44, 0.46)
AutoGen multi-agent (1 round)	6 doctor agents (6 LLM calls)	0.16 (0.15, 0.17)/0.36 (0.33, 0.39)/0.22 (0.20, 0.24)	0.73 (0.72, 0.74)/0.48 (0.47, 0.50)/0.58 (0.57, 0.59)	0.45 (0.44, 0.47)
AutoGen multi-agent (2 rounds)	6 doctor agents (12 LLM calls)	0.20 (0.18, 0.21)/0.38 (0.35, 0.41)/0.26 (0.23, 0.28)	0.77 (0.76, 0.78)/0.58 (0.56, 0.59)/0.66 (0.65, 0.67)	0.53 (0.52, 0.55)
AutoGen multi-agent (3 rounds)	6 doctor agents (18 LLM calls)	0.20 (0.18, 0.21)/0.38 (0.35, 0.41)/0.26 (0.24, 0.28)	0.77 (0.76, 0.78)/0.58 (0.56, 0.59)/0.66 (0.64, 0.67)	0.53 (0.52, 0.55)
CARE-AD	6 doctor agents	0.20 (0.18, 0.21)/0.38 (0.35, 0.41)/0.26 (0.24, 0.28)	0.77 (0.76, 0.78)/0.57 (0.55, 0.59)/0.65 (0.64, 0.67)	0.53 (0.51, 0.54)
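
The F values in the table can be sanity-checked against the precision and recall columns. The sketch below assumes that F denotes the F1 score (the harmonic mean of precision and recall), which the table does not state explicitly. The helper name `f1_score` and the choice of rows are illustrative only. Because the inputs are the rounded point estimates copied from the AD-cases column, a recomputed value may differ from the reported one in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) point estimates copied from the AD-cases column.
# Assumption: the reported F is the F1 score.
rows = {
    "Zero-shot": (0.09, 0.25, 0.13),
    "Self-consistency": (0.13, 0.29, 0.18),
    "CARE-AD": (0.20, 0.38, 0.26),
}

for method, (p, r, reported_f) in rows.items():
    recomputed = f1_score(p, r)
    print(f"{method}: recomputed F1 = {recomputed:.3f}, reported F = {reported_f:.2f}")
```

For these rows the recomputed values agree with the reported F to within rounding, which is consistent with the F1 reading of the column.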
