Extended Data Fig. 4: Head-to-head comparison between model and clinicians.
From: AI-based differential diagnosis of dementia etiologies on multimodal data

Comparison between model-predicted probability scores and the assessments provided by practicing clinicians is shown. a, For the analysis, neurologists (n = 12) were given 100 randomly selected cases encompassing individual-level demographics, health history, neurological tests, physical as well as neurological examinations, and multisequence MRI scans. The neurologists were then tasked with assigning confidence scores for NC, MCI, DE, and the 10 dementia etiologies: AD, LBD, VD, PRD, FTD, NPH, SEF, PSY, TBI, and ODE (see Glossary 1). Neurologists’ confidence scores were averaged to produce a single consensus confidence score for each case. In the visual representation, the boxplot in blue indicates the distribution of confidence scores for true negative cases, while the boxplot in red indicates the distribution for true positive cases. The symbol ‘+’ represents true positive cases, and ‘x’ denotes true negative cases. Significance levels are denoted as: ns (not significant) for p ≥ 0.05; * for p < 0.05; ** for p < 0.01; *** for p < 0.001; and **** for p < 0.0001. These levels were determined using pairwise comparisons via the unadjusted two-sided Brunner-Munzel test, for which detailed p values and test statistics can be found in Table S17. b, Similarly, in a separate analysis, radiologists (n = 7) were given 70 randomly selected cases with a confirmed dementia diagnosis encompassing individual-level demographics and multisequence MRI scans. The radiologists were tasked with assigning confidence scores for the 10 dementia etiologies. As in a, the visual representation consists of boxplots and scatterplots that represent the distribution of model and radiologists’ consensus confidence scores for true negative and true positive cases. Unadjusted two-sided Brunner-Munzel test results are shown as pairwise annotations of ns, *, **, ***, or ****, and more detailed statistics and p values can be found in Table S18.
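The consensus averaging and significance annotation described above can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the data are randomly generated, the function names `consensus_scores` and `significance_label` are hypothetical, and comparing true positive against true negative consensus scores is an assumption about which pairs were tested. It relies on `scipy.stats.brunnermunzel` for the unadjusted two-sided Brunner-Munzel test.

```python
import numpy as np
from scipy.stats import brunnermunzel


def consensus_scores(ratings):
    """Average confidence scores over raters (rows) to get one score per case (columns)."""
    return np.asarray(ratings, dtype=float).mean(axis=0)


def significance_label(p):
    """Map a p value to the annotation scheme given in the caption."""
    if p < 0.0001:
        return "****"
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


# Hypothetical data: 12 raters scoring 100 cases for a single label (for example AD).
rng = np.random.default_rng(0)
ratings = rng.uniform(0, 100, size=(12, 100))
is_positive = rng.integers(0, 2, size=100).astype(bool)  # made-up ground truth

consensus = consensus_scores(ratings)

# Unadjusted two-sided Brunner-Munzel test between the two groups of scores.
statistic, p_value = brunnermunzel(
    consensus[is_positive], consensus[~is_positive], alternative="two-sided"
)
annotation = significance_label(p_value)
print(f"W = {statistic:.3f}, p = {p_value:.4f}, annotation = {annotation}")
```

The same two functions would apply unchanged to the radiologist analysis in b by passing a 7 × 70 array of ratings.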
Each boxplot shows the median and interquartile range (IQR) as a box, with whiskers extending from the box to the most extreme data points lying within 1.5 times the IQR of the box edges.
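The boxplot convention described above can be reproduced numerically with a short sketch. The function name `boxplot_stats` and the sample values are hypothetical, and the quartiles use NumPy's default linear interpolation, which may differ slightly from the plotting software used for the figure.

```python
import numpy as np


def boxplot_stats(values, whis=1.5):
    """Return the median, quartiles, and whisker ends for a 1.5 x IQR boxplot."""
    v = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(v, [25, 50, 75])
    iqr = q3 - q1
    lower_fence = q1 - whis * iqr
    upper_fence = q3 + whis * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    lower_whisker = v[v >= lower_fence].min()
    upper_whisker = v[v <= upper_fence].max()
    outliers = v[(v < lower_fence) | (v > upper_fence)]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "outliers": outliers,
    }


# Hypothetical confidence scores with one extreme value.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
print(stats)
```

In this example the value 50 lies beyond the upper fence (7.75 + 1.5 × 4.5 = 14.5), so the upper whisker ends at 9 and 50 would be drawn as an individual point.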