Fig. 1: Experimental design of the study. | Nature Biomedical Engineering

From: Benchmarking foundation models as feature extractors for weakly supervised computational pathology

Benchmarking of 19 histopathology foundation models using 13 cohorts and 31 tasks. a, Number of slides used from each of the 13 cohorts, which together cover 4 cancer types. b, A total of 9,528 haematoxylin and eosin (H&E)-stained whole-slide images (WSIs) were preprocessed using the standardized STAMP pipeline (ref. 19). Features were extracted from the processed tiles using each of the 19 foundation models analysed in this study. The TCGA features were used to train downstream transformer models with fivefold cross-validation on 31 classification tasks using STAMP. All models were subsequently applied to external features from CPTAC, Bern, Kiel, DACHS and IEO. The transformer architecture schematic shows layer normalization (Norm) and multi-headed self-attention (MSHA), followed by a multilayer perceptron (MLP). c, All experiments were analysed using the area under the receiver operating characteristic curve (AUROC), supplemented by the area under the precision-recall curve (AUPRC), Pearson’s correlation coefficient, DeLong’s test, balanced accuracy and F1 score. CONCH achieves the highest average AUROC across all tasks, followed by Virchow2, Prov-GigaPath and DinoSSLPath. The star indicates that Panakeia was tested on all tasks despite being specifically designed for BRCA and CRC. Attention heatmaps were generated for selected slides to interpret differences between foundation models.
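
As a rough illustration of the ranking described in panel c, the sketch below computes an AUROC from labels and prediction scores and then orders models by their mean AUROC across tasks. It is a minimal sketch rather than the STAMP implementation: the function names `auroc` and `rank_models`, the model names and all numbers are hypothetical placeholders, and the pairwise (Mann-Whitney) formulation is just one standard way to compute AUROC.

```python
from statistics import mean


def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def rank_models(per_task_auroc):
    """Sort models by mean AUROC over tasks, best first.

    per_task_auroc maps a model name to a list with one AUROC per task.
    """
    means = {model: mean(values) for model, values in per_task_auroc.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data, not results from the study.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
toy_auroc = auroc(labels, scores)  # 3 of 4 positive-negative pairs ordered correctly: 0.75

toy_ranking = rank_models({
    "model_a": [0.70, 0.80],  # hypothetical per-task AUROCs
    "model_b": [0.60, 0.95],
})
```

The pairwise form is quadratic in the number of samples, which is fine for a toy example; a rank-based implementation would be the usual choice for full cohorts.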
