Fig. 5
From: Comprehensive Benchmark Dataset for Pathological Lymph Node Metastasis in Breast Cancer Sections

Benchmark comparison of MIL models with various feature encoders. (a) Mean AUC scores across 12 MIL models and 9 feature encoders: ResNet-50 [26], ViT-S [12], Ctranspath [6], PLIP [3], CONCH [7], CONCH-V1.5 [21], UNI [4], Gigapath [5], and Virchow [14]. (b) Mean F1-score under the same evaluation setup. Each cell represents the averaged performance across datasets. Warmer colors denote higher values. CLAM-MB [25], TransMIL [24], and AMD-MIL [33] exhibit consistently strong performance across multiple encoders, while newer foundation models such as UNI, Gigapath, and Virchow lead to higher AUC and F1-scores than conventional encoders.