Figure 3
From: Assessing clinical applicability of COVID-19 detection in chest radiography with deep learning

Confusion matrices between ground truth labels and radiologist annotations. (a) Comparison between ground truth and radiologists’ consensus on each dataset; (b) Inter- and intraobserver variability across all datasets (left and right, respectively). N: Normal; P: Not indicative of COVID-19 (pathological); C: Indicative of COVID-19; U: Undetermined. Cases annotated as Compromised are not shown. Color intensity corresponds to the percentage of cases within each column.
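
The column-wise percentages that drive the color intensity can be illustrated with a minimal sketch. This is not the authors' code. The function name `column_normalized_confusion`, the choice of ground truth as columns and annotations as rows, and the toy labels are all assumptions made for illustration; the caption does not state which axis holds which labeling.

```python
from collections import Counter

# Label codes from the caption: Normal, Pathological (not COVID-19),
# COVID-19, Undetermined. "Compromised" cases are deliberately left out.
CLASSES = ["N", "P", "C", "U"]


def column_normalized_confusion(column_labels, row_labels, classes=CLASSES):
    """Return a matrix of percentages in which each non-empty column sums to 100.

    Assumption: columns index one labeling (here, ground truth) and rows
    index the other (here, radiologist annotation). Pairs containing a
    label outside `classes` (e.g. Compromised) are ignored.
    """
    counts = Counter(zip(row_labels, column_labels))
    matrix = []
    for r in classes:
        row = []
        for c in classes:
            column_total = sum(counts[(other, c)] for other in classes)
            cell = 100.0 * counts[(r, c)] / column_total if column_total else 0.0
            row.append(cell)
        matrix.append(row)
    return matrix


# Hypothetical toy data, not taken from the paper.
ground_truth = ["N", "N", "P", "C", "C", "C", "C", "P"]
consensus = ["N", "P", "P", "C", "C", "U", "N", "P"]

for label, row in zip(CLASSES, column_normalized_confusion(ground_truth, consensus)):
    print(label, [f"{value:5.1f}" for value in row])
```

Normalizing per column means each cell reads as "of the cases with this column label, what share received this row label", so columns with very different case counts remain visually comparable.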