Fig. 6: Comparison of labeling efficiency/confidence metrics for each of the 5 clinical output labels. | Nature Communications

From: Accurate auto-labeling of chest X-ray images based on quantitative similarity to an explainable AI model

For each of the five auto-labeled clinical output labels, namely cardiomegaly (blue), pleural effusion (orange), atelectasis (gray), pulmonary edema (green), and pneumonia (yellow), we compared: (i) the percent of positively auto-labeled CXRs “captured” from the three pooled, full public datasets (i.e., “Pooled Capture%”, from Supplementary Table 3, C); (ii) the percent of cases with complete agreement between the model and all seven expert readers (i.e., “Full Agree%”, from Supplementary Fig. 2); (iii) the lowest pSim value such that PPV = 1 (graphed as “1-pSim”, from Figs. 2–4, c); and (iv) the lowest pSim value such that NPV = 1 (graphed as “1-pSim”, from Figs. 2–4, d). Clinical output labels with higher y-axis values (e.g., cardiomegaly, pleural effusion) correspond to those with greater model auto-labeling efficiency/confidence; clinical output labels with lower y-axis values (e.g., pneumonia, pulmonary edema) correspond to those with lower model auto-labeling efficiency/confidence. Of note, in the graph for atelectasis, “1-pSim@PPV1” is higher than “1-pSim@NPV1”, which can be interpreted as greater confidence that the model is correct in “ruling-in” the clinical output label (i.e., correctly auto-labeling true-positives) than in “ruling-out” the clinical output label (i.e., correctly auto-labeling true-negatives). This relationship is reversed for the other four clinical output labels (e.g., greater confidence that the model can correctly “rule-out” than “rule-in” pneumonia or pulmonary edema).
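
Items (iii) and (iv) both describe a threshold search: find the lowest pSim value at which every retained auto-label is still correct. The Python sketch below illustrates one way such a threshold could be computed. It is not the authors' implementation. It assumes that a threshold t retains every case with pSim >= t, and that each case carries a flag saying whether the model's label matched the reference standard; the function name and the sample values are hypothetical. Passing only the model-positive cases corresponds to the PPV = 1 threshold, and passing only the model-negative cases corresponds to the NPV = 1 threshold.

```python
from typing import Optional, Sequence, Tuple


def lowest_psim_with_perfect_precision(
    cases: Sequence[Tuple[float, bool]],
) -> Optional[float]:
    """Return the lowest observed pSim t such that every case with
    pSim >= t is correct, or None if no such threshold exists.

    Each case is (psim, is_correct). The ">= t" retention rule is an
    assumption made for this sketch, not something stated in the caption.
    """
    best = None
    # Walk from the highest pSim downward. Among ties, visit incorrect
    # cases first so that a tied error blocks that pSim value entirely.
    for psim, is_correct in sorted(cases, key=lambda c: (-c[0], c[1])):
        if not is_correct:
            break
        best = psim
    return best


# Hypothetical model-positive cases for a single clinical output label.
positives = [(0.95, True), (0.90, True), (0.80, True), (0.70, False), (0.60, True)]
threshold = lowest_psim_with_perfect_precision(positives)
if threshold is not None:
    # The figure plots 1 - pSim, so a lower threshold gives a taller bar.
    print(f"lowest pSim with PPV = 1: {threshold:.2f}")
```

Under these assumptions, a label whose errors only appear at low pSim values yields a low threshold and therefore a high “1-pSim” bar, which matches the caption's reading of higher y-axis values as greater auto-labeling confidence.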