Fig. 4: Performance evaluation of FLEX when incorporated with different pathology VLMs. | Nature Communications

From: Knowledge-guided adaptation of pathology foundation models effectively improves cross-domain generalization and demographic fairness

a Comparison of AUROC performance between original pathology vision-language models (VLMs) and their counterparts enhanced with FLEX. The 16 tasks span four datasets: TCGA-BRCA (n = 937), TCGA-NSCLC (n = 958), TCGA-STAD (n = 414), and TCGA-CRC (n = 606). Each box plot summarizes results from n = 15 independent cross-validation folds derived from SP-MCCV. Box plots display the median (center line), the interquartile range (IQR; box limits at the 25th and 75th percentiles), and whiskers extending to 1.5 × IQR; individual data points for each fold are overlaid. Indicated P-values were calculated using a two-sided paired-samples t-test with multiple hypothesis correction. C, P, and Q correspond to CONCH, PathGen-CLIP, and QuiltNet, respectively.

b UMAP visualizations illustrating the effect of FLEX on the patch feature space for the STAD-EBV task. For each VLM, parallel subplots are colored by site (left) to visualize batch effect mitigation, and by EBV status (right) to visualize class separability. Two scores quantify these effects: the Local Inverse Simpson's Index (LISI) for site integration (higher is better) and the Silhouette Score for class separation (higher is better). Together, the visualizations and scores indicate that FLEX reduces site-specific clustering while improving the discriminability of task-relevant classes.

Source data are provided as a Source Data file.
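To make the site-integration score in panel b more concrete, the following is a minimal sketch of an inverse Simpson index computed over each point's nearest neighbours. It is not the authors' implementation: LISI as usually defined weights neighbours with a Gaussian kernel at a fixed perplexity, whereas this simplified version counts the `k` nearest neighbours equally. The function name `knn_inverse_simpson`, the parameter `k`, and the toy coordinates are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def knn_inverse_simpson(points, labels, k):
    """Mean inverse Simpson index of `labels` among each point's k nearest neighbours.

    Simplified, unweighted stand-in for LISI (assumption: equal neighbour weights).
    Returns 1.0 when every neighbourhood contains a single label and approaches
    the number of distinct labels when neighbourhoods are evenly mixed.
    """
    if not 0 < k < len(points):
        raise ValueError("k must satisfy 0 < k < number of points")
    scores = []
    for i, p in enumerate(points):
        # Euclidean distance from point i to every other point (self excluded).
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        counts = Counter(labels[j] for _, j in dists[:k])
        # Simpson index: chance that two neighbours drawn with replacement share a label.
        simpson = sum((c / k) ** 2 for c in counts.values())
        scores.append(1.0 / simpson)
    return sum(scores) / len(scores)


# Toy 2-D embedding: two well-separated groups of three points each.
embedding = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Case 1: site labels coincide with the groups (strong site-specific clustering).
site_clustered = ["A", "A", "A", "B", "B", "B"]
# Case 2: site labels are interleaved within each group (sites are mixed).
site_mixed = ["A", "B", "A", "B", "A", "B"]

print(knn_inverse_simpson(embedding, site_clustered, k=2))
print(knn_inverse_simpson(embedding, site_mixed, k=2))
```

When the labels are sites, a higher value means each neighbourhood draws from more sites, which is the sense in which "higher is better" for integration. The second call therefore scores higher than the first.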