Fig. 6: Area under the curve (AUC) distribution of five models on the testing set of the PatchCamelyon data set (number of samples/patches per testing set = 32,768; number of experiments per model = 8).
From: Accurate recognition of colorectal cancer with semi-supervised deep learning on pathological images

The boxes indicate the upper and lower quartile values, and the whiskers indicate the minimum and maximum values. The horizontal bar in the box indicates the median, while the cross indicates the mean. The circles represent data points, and the scatter dots indicate outliers. * indicates a significant difference, and ** indicates no significant difference. The Wilcoxon signed-rank test (sample size/group = 8) is used to evaluate the significance of the difference in AUCs between two models. Two-sided P values are reported, and no adjustment is made. The mean AUC and standard deviation (sample size = 8) are calculated for each model. Pcam-1%-SSL vs. Pcam-1%-SL: 0.947 ± 0.008 vs. 0.912 ± 0.008, P value = 0.012; Pcam-5%-SSL vs. Pcam-5%-SL: 0.960 ± 0.002 vs. 0.943 ± 0.009, P value = 0.011; Pcam-5%-SSL vs. Pcam-100%-SL: 0.960 ± 0.002 vs. 0.961 ± 0.004, P value = 0.888.
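
As an illustration of the paired test named in this caption, the following is a minimal sketch of a two-sided Wilcoxon signed-rank test for eight paired AUC values, written with the Python standard library only. The AUC lists are hypothetical placeholders, not the per-experiment values behind the figure, and the function names are introduced here for illustration. The caption does not say whether an exact or an approximate P value was used, so both are shown.

```python
import math
from itertools import product


def signed_rank_statistic(x, y):
    """Return (W_plus, W_minus, n) for paired samples; zero differences are dropped."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Extend j over a run of tied absolute differences.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-indexed ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


def p_exact(w, n):
    """Two-sided exact P value by enumerating all 2**n sign patterns (assumes no ties)."""
    count = 0
    for signs in product((0, 1), repeat=n):
        rank_sum = sum(rank for rank, keep in zip(range(1, n + 1), signs) if keep)
        if rank_sum <= w:
            count += 1
    return min(1.0, 2 * count / 2 ** n)


def p_normal(w, n):
    """Two-sided P value from the normal approximation, without continuity correction."""
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical AUCs from 8 paired experiments (NOT the values from the paper).
auc_ssl = [0.951, 0.943, 0.948, 0.939, 0.955, 0.946, 0.950, 0.944]
auc_sl = [0.915, 0.905, 0.921, 0.900, 0.910, 0.918, 0.903, 0.913]

w_plus, w_minus, n = signed_rank_statistic(auc_ssl, auc_sl)
w = min(w_plus, w_minus)  # test statistic: the smaller of the two rank sums

print(f"W+ = {w_plus}, W- = {w_minus}, n = {n}")
print(f"exact two-sided P  = {p_exact(w, n):.4f}")
print(f"normal approx. P   = {p_normal(w, n):.4f}")
```

With only eight pairs, the smallest attainable exact two-sided P value is 2/256 ≈ 0.0078, reached when every pair favors the same model, as in the hypothetical data above. The normal approximation gives about 0.012 for that same case, so the choice of method matters at this sample size.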