Fig. 1: Image feature-based clustering segments complex WSIs into relatively uniform tissue partitions. | Nature Communications

From: PHARAOH: A collaborative crowdsourcing platform for phenotyping and regional analysis of histology

A (i) Workflow highlighting mapping of tissue patterns across entire Whole Slide Images (WSIs). Briefly, a pre-trained Convolutional Neural Network (CNN) is used as a feature extractor, and the generated Deep Learning Feature Vectors (DLFVs) are used to cluster image patches and map them back onto the WSI. (ii–iv) Cartoon schematic of the PHARAOH workflow. (ii) Unlabeled WSIs are uploaded to the online portal. Users receive tile-clustered maps to help decipher proposed groupings. Users provide cluster-level annotations, which are aggregated across multiple WSIs and used to fine-tune custom CNN models. The process can be repeated to refine accuracy/desired outputs. (iii) Once developed, trained classifiers are made publicly available. In addition to tissue segmentation, various regional histomic (DLFs) and cell-based phenotyping outputs are provided to serve as biomarkers of disease (e.g., tumor-infiltrating lymphocytes). (iv) In addition to core PHARAOH outputs, users can also export segmented target regions of interest and carry out custom image analyses using other third-party tools on companion platforms (CODIDO; codido.co). Panels (ii–iv) created in BioRender. Diamandis, P. (2025) https://BioRender.com/y70k830. B, C Demonstrative input (WSI) (B) and output (tissue heterogeneity map) (C) images of a sample colorectal adenocarcinoma from The Cancer Genome Atlas (TCGA). Scale bars = 2 mm. (n = 984 tiles extracted/clustered from this sample). D Representative image patches highlighting stereotypical morphology from different partitions. Tiles = 256 × 256 pixels. E, F The relative degree of histomorphological similarity/difference aligns with cluster positioning on dimensionality reduction plots (UMAP) (E) and pairwise Pearson correlation coefficients (r) (F) of the partitions' DLFVs. G–I Box plots highlighting quantitative cellularity (G), epithelial (DLF66) (H), and fibrosis (DLF215) (I) marker differences between defined regions. Box plots show minimum, first quartile, median, third quartile, and maximum. Counts represent nuclear instances or overall activation per 67,488 µm². ***p < 0.001 (2-sided t-test). J Regional cell composition differences (HoVer-Net outputs). All relevant source data, including the number of unique image patches (technical replicates) for each comparison group, are provided as Supplementary Data files.
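
The cluster-and-map step in panel A(i) and the pairwise Pearson comparison in panel F can be illustrated with a minimal Python sketch. This is not the PHARAOH implementation: the caption does not name the clustering algorithm, so k-means is an assumption here, and small synthetic 6-dimensional vectors stand in for CNN-derived DLFVs. All function and variable names (`kmeans`, `pearson`, `label_map`, and so on) are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=10):
    """Lloyd's k-means with deterministic farthest-first initialisation.

    ASSUMPTION: k-means is a stand-in; the source does not specify
    which clustering method is used.
    """
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        # Add the vector that is farthest from every centroid chosen so far.
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = []
    for _ in range(iters):
        # Assignment step: each tile joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def pearson(x, y):
    """Pearson correlation coefficient r between two equal-length vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Synthetic stand-in for per-tile feature vectors on a 4 x 6 tile grid.
# Three vertical bands of tiles share a prototype plus small Gaussian noise.
rng = random.Random(0)
prototypes = [[4, 0, 0, 1, 0, 2], [0, 4, 0, 2, 1, 0], [0, 0, 4, 0, 2, 1]]
ROWS, COLS = 4, 6
coords = [(r, c) for r in range(ROWS) for c in range(COLS)]
true_group = [c // 2 for _, c in coords]
features = [[p + rng.gauss(0, 0.1) for p in prototypes[g]] for g in true_group]

# Cluster the tiles, then map each cluster label back to its grid position.
labels, centroids = kmeans(features, k=3)
label_map = [[None] * COLS for _ in range(ROWS)]
for (r, c), lab in zip(coords, labels):
    label_map[r][c] = lab

# Pairwise Pearson r between the mean feature vectors of the partitions.
corr = [[pearson(a, b) for b in centroids] for a in centroids]

for row in label_map:
    print(row)
```

In a real pipeline the synthetic `features` would be replaced by vectors extracted from each 256 × 256 tile, and `label_map` would be rendered as a color overlay on the slide, analogous to the heterogeneity map in panel C.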
