Fig. 6: The performance of the AI system and of radiologists in identifying pneumonia conditions from CXR images.

Performance comparison of four groups: the AI system; the average of a group of four junior radiologists; the average of a group of four senior radiologists; and the average of the same group of four junior radiologists with AI assistance. a, The ROC curves for distinguishing viral pneumonia from other types of pneumonia and from the absence of pneumonia. The star denotes the operating point of the AI system. The filled dots denote the performance of the junior and senior radiologists, and the hollow dots denote the performance of the junior group with assistance from the AI system. The dashed lines link the paired performance values of the junior group. Inset: magnification of the plot. b, Weighted errors of the four groups on the basis of a penalty metric. The grey dashed line represents the performance of the AI system, and the grey shaded region represents the 95% confidence interval. P < 0.001, computed using a two-sided permutation test with 10,000 random resamplings. c, An evaluation experiment on diagnostic performance when the AI system acted as a ‘second reader’ or an ‘arbitrator’.
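
For readers unfamiliar with the test named in panel b, the following is a minimal sketch of a two-sided permutation test with 10,000 random resamplings. The caption does not state which test statistic or which groups were compared, so this sketch assumes the absolute difference in mean per-case weighted error between two groups. The function name `permutation_test`, its parameters, and the input values are hypothetical and are not taken from the study.

```python
import random


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Assumption: the statistic is |mean(a) - mean(b)|; the caption does not
    specify the statistic actually used.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    # Observed absolute difference in means between the two groups.
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))

    extreme = 0
    for _ in range(n_resamples):
        # Randomly reassign group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(perm_a) / n_a - sum(perm_b) / len(perm_b))
        if diff >= observed:
            extreme += 1

    # Add-one correction so the estimated P value is never exactly zero.
    return (extreme + 1) / (n_resamples + 1)


# Hypothetical per-case weighted errors, for illustration only.
errors_without_ai = [1.0] * 20
errors_with_ai = [0.0] * 20
p_value = permutation_test(errors_without_ai, errors_with_ai)
```

With the add-one correction, the smallest P value that 10,000 resamplings can report is 1/10,001, which is consistent with a result being stated as P < 0.001 rather than as an exact value.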