Figure 3

Test performance of the pathology deep learning system in each of the 3 test labs. (Left) Specimen classification accuracy at each confidence level. (Right) Percentage of specimens whose confidence score remains above the confidence threshold at each of the 3 confidence levels. Note that even at baseline, fewer than 100% of specimens are classified: in approximately 3–6% of specimens from each lab, CNN-2 detected no ROI, so those specimens remain unclassified.
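
The two quantities plotted in this figure can be illustrated with a minimal sketch. It is not the authors' code. The function name `accuracy_and_coverage`, the record layout, the class labels, and the threshold values are all hypothetical, and treating a score equal to the threshold as retained is an assumption. A confidence of `None` stands in for a specimen in which no ROI was detected, which therefore counts against the classified percentage even at baseline.

```python
import math
from typing import List, Optional, Tuple

# Hypothetical record: (confidence score or None, predicted label, true label).
# None models a specimen with no ROI detected, so it is never classified.
Record = Tuple[Optional[float], Optional[str], str]


def accuracy_and_coverage(records: List[Record], threshold: float) -> Tuple[float, float]:
    """Return (accuracy among retained specimens, fraction of specimens retained)."""
    total = len(records)
    # Assumption: a specimen is retained when its score is >= threshold.
    kept = [(pred, truth) for conf, pred, truth in records
            if conf is not None and conf >= threshold]
    coverage = len(kept) / total if total else 0.0
    if not kept:
        return math.nan, coverage  # accuracy is undefined with nothing retained
    accuracy = sum(pred == truth for pred, truth in kept) / len(kept)
    return accuracy, coverage


# Illustrative data only; not taken from the study.
records: List[Record] = [
    (0.95, "A", "A"),
    (0.80, "B", "B"),
    (0.60, "A", "B"),
    (None, None, "A"),  # no ROI detected
    (0.99, "B", "B"),
]

for level in (0.0, 0.7, 0.9):  # hypothetical baseline and two stricter levels
    acc, cov = accuracy_and_coverage(records, level)
    print(f"threshold={level:.1f} accuracy={acc:.2f} classified={cov:.0%}")
```

In this toy data, raising the threshold increases accuracy among retained specimens while lowering the classified percentage, which is the trade-off the left and right panels show side by side.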