Table 2 Classification performance across different cancer types.

Organ	Number of cases	Number of positive cases	Number of negative cases	AUC (area under the curve)	TP/FP/TN/FN	Sensitivity	Specificity	Accuracy
Lung	207	153	54	0.8895 (0.8466–0.9324)	153/48/6/0	100%	11.1%	76.8%
Breast	173	35	138	0.9986 (0.9958–1.000)	35/10/128/0	100%	92.8%	94.2%
Lymph node	121	16	105	0.8774 (0.7570–0.9978)	11/3/102/5	68.8%	97.1%	93.4%
Female adnexal tumors	50	12	38	0.9930 (0.9762–1.000)	11/6/32/1	91.7%	84.2%	86.0%
Thyroid	42	23	19	0.9144 (0.8266–1.000)	17/2/17/6	73.9%	89.5%	81.0%
Other organs	76	22	54	0.8242 (0.6816–0.9668)	15/3/51/7	68.2%	94.4%	86.8%
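
The sensitivity, specificity, and accuracy columns follow from the TP/FP/TN/FN counts by the standard definitions. The following is a minimal sketch of that arithmetic, not the authors' evaluation code; the names `Counts`, `metrics`, and `table2` are hypothetical and introduced only for illustration.

```python
from typing import Dict, NamedTuple


class Counts(NamedTuple):
    """Confusion-matrix counts in the order used by Table 2: TP/FP/TN/FN."""
    tp: int
    fp: int
    tn: int
    fn: int


def metrics(c: Counts) -> Dict[str, float]:
    """Return sensitivity, specificity, and accuracy as fractions in [0, 1]."""
    positives = c.tp + c.fn          # all truly positive cases
    negatives = c.tn + c.fp          # all truly negative cases
    total = positives + negatives
    return {
        "sensitivity": c.tp / positives,
        "specificity": c.tn / negatives,
        "accuracy": (c.tp + c.tn) / total,
    }


# Counts copied from the TP/FP/TN/FN column of Table 2.
table2 = {
    "Lung": Counts(153, 48, 6, 0),
    "Breast": Counts(35, 10, 128, 0),
    "Lymph node": Counts(11, 3, 102, 5),
    "Female adnexal tumors": Counts(11, 6, 32, 1),
    "Thyroid": Counts(17, 2, 17, 6),
    "Other organs": Counts(15, 3, 51, 7),
}

for organ, c in table2.items():
    m = metrics(c)
    print(
        f"{organ}: sensitivity={m['sensitivity']:.1%}, "
        f"specificity={m['specificity']:.1%}, accuracy={m['accuracy']:.1%}"
    )
```

For the lung row, for example, specificity is 6 / (6 + 48) and accuracy is (153 + 6) / 207, which match the 11.1% and 76.8% reported in the table.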

The metrics in Table 2 represent the raw, per-organ diagnostic performance of the model. The overall metrics in Table 1/3, by contrast, are derived from a multi-stage MIL (multiple-instance learning) aggregation, which includes a top-10% instance selection cutoff and a synchronized slide-level threshold and is tuned to optimize global specificity for clinical safety. Because of this hierarchical thresholding, the sum of the per-organ counts in Table 2 may differ slightly from the globally optimized metrics.
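
To make the thresholding logic concrete, here is a minimal sketch of one way a top-10% instance selection followed by a slide-level threshold could work. It is an illustration under stated assumptions, not the authors' pipeline: the text does not say how the selected instances are pooled or what threshold is used, so the mean pooling, the default `threshold=0.5`, and the function names `slide_score` and `classify_slide` are all hypothetical.

```python
from typing import Sequence


def slide_score(instance_probs: Sequence[float], top_percent: int = 10) -> float:
    """Pool a slide's instance-level probabilities into one slide-level score.

    Keeps the top `top_percent` percent of instances by probability (at least
    one) and averages them. Mean pooling is an assumption made for this sketch.
    """
    n = len(instance_probs)
    if n == 0:
        raise ValueError("slide has no instances")
    # Integer ceiling of n * top_percent / 100, so at least one instance is kept.
    k = max(1, (n * top_percent + 99) // 100)
    top = sorted(instance_probs, reverse=True)[:k]
    return sum(top) / k


def classify_slide(
    instance_probs: Sequence[float],
    threshold: float = 0.5,  # hypothetical value; the real threshold is not given
    top_percent: int = 10,
) -> bool:
    """Call the slide positive when its pooled score reaches the threshold."""
    return slide_score(instance_probs, top_percent) >= threshold


# Toy slide: 20 instances, two of which score high.
probs = [0.9, 0.8] + [0.1] * 18
print(slide_score(probs))
print(classify_slide(probs))
print(classify_slide(probs, threshold=0.9))
```

In this sketch the same slide is called positive or negative depending on the threshold applied to the pooled score. That is the sense in which counts obtained under a single global threshold need not equal the sum of counts obtained per organ.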
