Integrative whole slide image and spatial transcriptomics analysis with QuST and QuPath

Huang, Chao-Hui; Lichtarge, Sara; Fernandez, Diane

doi:10.1038/s41698-025-00841-9

Brief Communication
Open access
Published: 12 March 2025

npj Precision Oncology volume 9, Article number: 70 (2025)


Abstract

The integration of AI in digital pathology, particularly in whole slide image (WSI) and spatial transcriptomics (ST) analysis, holds immense potential for enhancing disease understanding. Despite challenges such as training pattern preparation and resolution disparities, the convergence of these technologies can unlock insights. We introduce QuST, a QuPath extension that bridges the gap between WSI and ST at single-cell level, highlighting the power of this integrated approach in disease biology.

Introduction

Spatial analysis, a critical component of pathology, has greatly enhanced our understanding of complex biological processes. Traditional pathology, which involves scrutinizing tissue slides with high-power microscopy, is labor-intensive. However, the advent of digital image analysis (DIA) and machine learning (ML) technologies has broadened the scope of artificial intelligence (AI) in this field. Over the past few years, a number of deep learning (DL) based whole slide image (WSI) analysis tools, such as QuPath¹, TIA Toolbox², MONAI³, SlideFlow⁴, PHARAOH⁵, and WSInfer⁶, have been introduced.

One of the significant hurdles in DL-based WSI analysis is the creation of training patterns. Hematoxylin & eosin (H&E), the standard tissue staining technique, provides structural information but rarely offers direct biological evidence like gene expressions and transcription factors. As a result, the success of DL-based WSI analysis hinges largely on the expertise of those conducting manual annotation tasks on the WSI H&E images.

On the other hand, spatial transcriptomics (ST) has seen significant advancements, as it enables the visualization and analysis of histological sections with gene expression features. ST provides valuable spatial context to molecular data, making it vital for studying complex biological processes, such as cell-cell interactions. ST also presents a unique opportunity to address the challenges of DL-based WSI analysis, as subcellular ST technologies are already available. However, merging these two powerful modalities has been challenging due to differences in data formats and analysis methods.

Numerous studies have applied such tools to the analysis of ST in WSI. For example, Wood et al. used QuPath for image analysis, paired with GeoMx ST, to investigate gene expression variability in colorectal cancer and liver metastases⁷. Tippani et al. highlighted the need to bridge high-resolution images with spatially resolved transcriptomics data, and noted that QuPath lacks capabilities for preprocessing multichannel fluorescent images. Despite these constraints, QuPath’s image analysis functionality has been widely recognized. Nonetheless, these studies mainly use QuPath for initial analysis, with detailed ST analysis done using other tools.

Providers of ST technologies have developed various platforms for visually examining samples and researching biological insights, such as Loupe Browser (https://www.10xgenomics.com/support/software/loupe-browser/latest), Xenium Explorer (https://www.10xgenomics.com/support/software/xenium-explorer/latest), and AtoMx Spatial Informatics Platform (https://nanostring.com/products/atomx-spatial-informatics-platform/atomx-sip-overview/). However, these tools have not fully exploited the functionalities of DIA, and existing DIA tools and platforms remain poorly integrated into ST research. To address this gap, we present QuST, a QuPath extension that offers a comprehensive platform for integrating and analyzing WSI and ST data. QuST is designed to enable more in-depth spatial-omics analysis, including cell-cell interactions, cell spatial profiling, and visualization. Furthermore, QuST’s implementation of DL-based cell categorization and region segmentation methods could facilitate image annotation based on biological evidence.

Methods

QuST is designed to integrate WSI and ST analysis with QuPath, enhancing its capabilities with tools specifically tailored for spatial biology. The extension supports the visualization of spatial gene expression data within the context of histopathological images, enabling users to explore the molecular landscape of tissues at single-cell resolution (see Fig. 1). Below, we introduce some of the analysis tools and use cases available in QuST.

**Fig. 2: Workflow for ST and WSI coordination alignment based on image registration.**

Integrative WSI and ST analysis at single-cell level

QuST is being developed with a focus on single-cell level analysis. It has the capability to load single-cell level ST data formats, including those from 10x Genomics Xenium and NanoString CosMx. Furthermore, it can also load 10x Genomics Visium datasets for whole-slide, full-spectrum ST analysis.

A significant challenge in single-cell level ST analysis lies in aligning ST data with WSI, primarily because different image modalities are involved. For instance, in Xenium and CosMx, cell localization relies on DAPI staining, while WSIs are often H&E-stained. Acquiring multiple stainings of the same sample requires multiple rounds of sample preparation and scanning. Consequently, loading ST data requires extra steps to align the ST data with the provided WSI. To address this issue, we took inspiration from the guidelines provided by 10x Genomics (https://www.10xgenomics.com/analysis-guides/he-to-xenium-dapi-image-registration-with-fiji), and proposed a method that aligns the coordinates of ST data to the reference image, as illustrated in Fig. 2.
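The coordinate alignment step can be illustrated with a 2D affine transform applied to ST cell centroids. The following is a minimal sketch, assuming the image registration yields a 3×3 homogeneous affine matrix; the function name `apply_affine` and the matrix values are hypothetical and are not taken from QuST.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def apply_affine(matrix: Sequence[Sequence[float]], points: List[Point]) -> List[Point]:
    """Map (x, y) points through a 3x3 homogeneous affine matrix."""
    # Only the first two rows matter for an affine map; the last row is [0, 0, 1].
    (a, b, tx), (c, d, ty) = matrix[0], matrix[1]
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


# Hypothetical registration result: scale by 2, then shift by (10, -5).
M = [[2.0, 0.0, 10.0],
     [0.0, 2.0, -5.0],
     [0.0, 0.0, 1.0]]

# Hypothetical ST cell centroids in the DAPI image frame.
st_centroids = [(0.0, 0.0), (1.0, 2.0)]

# Centroids expressed in the reference (H&E) image frame.
aligned = apply_affine(M, st_centroids)
```

Once the centroids are in the reference frame, each ST cell can be matched to the nearest detected cell in the WSI.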

To test the proposed approach, we first ran a cell detection algorithm (e.g., StarDist⁸ or Cellpose⁹) on the H&E image. We then loaded the transcriptomic data twice, once with and once without the image registration information. The results are shown in Fig. 3, and the statistical evaluation of cell displacement between the H&E image and transcriptomic data is shown in Fig. 4. In the experiment, 723,384 cells were detected from H&E images. Without image registration, 122,288 cells were missing. The root causes include: 1) different quality control approaches for the two data modalities; and 2) the location information obtained from the transcriptomic data did not match the cells detected on the given H&E images. With image registration, this number dropped to 99,405, representing an 18.71% reduction in missing cells. In addition, the deformation appears closely related to the grid-like artifacts¹⁰ resulting from stitching the DAPI images (see Fig. 3e, f). As a result, integrating the proposed ST and WSI registration can mitigate the noise introduced during data acquisition.
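The reported improvement follows directly from the two missing-cell counts. The short calculation below reproduces it; the variable names are illustrative only, and the fractions of all detected cells are derived here from the reported counts rather than stated in the text.

```python
detected = 723_384         # cells detected on the H&E image
missing_without = 122_288  # missing cells without image registration
missing_with = 99_405      # missing cells with image registration

# Relative reduction in the number of missing cells.
relative_reduction = (missing_without - missing_with) / missing_without
print(f"{relative_reduction:.2%}")  # 18.71%

# Share of all detected cells that are missing in each setting.
share_without = missing_without / detected
share_with = missing_with / detected
```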

**Fig. 3: Image example for analyzing the performance of image registration.**

**Fig. 4: Statistics for cell displacement with and without image registration.**

For performing whole-slide, full-spectrum ST analysis using 10x Genomics Visium datasets, QuST requires the corresponding affine matrix, generated manually via the Visium Image Alignment function in the Loupe Browser (see https://www.10xgenomics.com/support/software/space-ranger/latest/analysis/inputs/image-fiducial-alignment). In addition, in the work of HEST-1k¹¹, an automatic fiducial detection using the YOLOv8 model¹² was introduced, which enables alignment inference for full-spectrum ST and WSI interactive analysis using Visium datasets.

Cellular spatial profiling

Cell spatial profiling plays a critical role in spatial-omics analysis. In QuST, it provides the foundation for all other spatial computations. First, Delaunay clustering is performed to obtain the connectivity between neighboring cells. Next, the edge distance of each chosen cell to the cluster boundary is computed. As a result, the position of each cell within its cluster is obtained and can be used in subsequent analysis tasks. The detailed algorithm is shown in Algorithm 1.

The results shown in Fig. 5 represent insights from cellular spatial analysis. The heat map indicates the boundary distance of individual cells, e.g., the distance from a cancer epithelial cell to the boundary of the corresponding tumor. Based on the heat map, one can explore the differential gene expression patterns between the intratumoral tumor cells and the tumor cells present in the immune-invasive region, which are located on the surface of the tumor.

**Fig. 5: Results showing functions of spatial profiling provided by QuST.**

A use case of QuST is spatial profiling of the tumor micro-environment (TME). The TME encompasses the surrounding cellular and non-cellular components that interact with cancer cells. It plays a crucial role in tumor growth, progression, and response to therapy. By understanding the complex interactions between cancer cells, immune cells, stromal cells, and the extracellular matrix, researchers can identify potential targets for therapeutic intervention. Given the rich information provided by an ST dataset, the functions that QuST provides are well suited to TME studies.

Algorithm 1

Cell spatial profiling based on Delaunay clustering

1: definition

2: \(C=\left\{{c}_{1},{c}_{2},\cdots \right\}\): all chosen cells

3: \(T=\left\{{t}_{1},{t}_{2},\cdots \right\}\): all cell types

4: \(d\in {{\mathbb{Z}}}_{ > 0}\): a threshold of edge distance defining the neighborhood of a cell

5: \(k\in \left[0,1\right]\): a threshold for determining whether \(c\) is located at the boundary of a cluster

6: \({\rm{EdgeDist}}\left(c,c^{\prime} \right)\in {{\mathbb{Z}}}_{ > 0}\): edge distance between \(c\) and \(c^{\prime}\)

7: \({\rm{CellType}}\left(c\right)\in T\): function for obtaining cell type of \(c\)

8: \({e}_{c}\in {{\mathbb{Z}}}_{\ge 0}\): the computed edge distance to the boundary of clusters, \(\forall c\in C\)

9: procedure \({\rm{CellSpatialProfiling}}\left(t\in T:{\rm{a\; chosen\; cell\; type}}\right)\)

10: \({e}_{c}\leftarrow 0,\forall c\in C\)

11: for each \(c\in C\) where \({\rm{CellType}}\left(c\right)=t\), do

12: \({N}_{c}\leftarrow \left\{{c}^{{\prime} }|{\rm{EdgeDist}}\left(c,{c}^{{\prime} }\right)\le {d}\,{\bf{and}}\,{c}^{{\prime} }\ne c,\forall c^{\prime} \in C\right\}\) ⊳ all neighbors of \({c}\)

13: \({O}_{c}\leftarrow \left\{{c}^{{\prime} }|{\text{CellType}}\left({c}^{{\prime} }\right)\,\ne\, t,\forall c^{\prime} \in {N}_{c}\right\}\) ⊳ all other types in neighbors of c

14: \({e}_{c}\leftarrow 1,{\bf{if}}\frac{\left|{O}_{c}\right|}{\left|{N}_{c}\right|}\ge k\) ⊳ initialize e_c the distance to the cluster boundary

15: \(i\leftarrow 1\) ⊳ i: distance indicator

16: repeat

17: for each \(c\in C\) where \(\text{CellType}\left(c\right)=t\), do

18: \({N}_{c}\leftarrow \left\{{c}^{{\prime} }|{\rm{EdgeDist}}\left(c,{c}^{{\prime} }\right)\le {d}\,{\bf{and}}\,{c}^{{\prime} }\ne c,\forall c^{\prime} \in C\right\}\) ⊳ all neighbors of c

19: \({O}_{c}\leftarrow \left\{{c}^{{\prime} }|\text{CellType}\left({c}^{{\prime} }\right)\,\ne\, t,\forall c^{\prime} \in {N}_{c}\right\}\) ⊳ all other types in neighbors of c

20: \({e}_{c}\leftarrow i+1,{\bf{if}}\frac{\left|{O}_{c}\right|}{\left|{N}_{c}\right|}\le k\,{\bf{and}}\,\exists {c}^{{\prime} }\in {N}_{c},{e}_{{c}^{{\prime} }}=i\)

21: \(i\leftarrow i+1\)

22: until \(\forall c\in C,{e}_{c}\) are obtained

return \({e}_{c},\forall c\in C\)
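Algorithm 1 can be sketched in plain Python as follows. This is an illustrative sketch, not QuST's implementation. It assumes that the Delaunay connectivity is already available as an adjacency dictionary, that edge distance means the number of graph edges between two cells, that only cells not yet labeled are updated in each round, and that the loop stops once a round labels no further cell. All function and variable names are hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def neighbors_within(graph: Graph, c: Hashable, d: int) -> Set[Hashable]:
    """N_c: all cells within d graph edges of c, excluding c itself."""
    seen = {c}
    frontier = deque([(c, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == d:
            continue  # do not expand beyond the edge-distance threshold
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    seen.discard(c)
    return seen


def cell_spatial_profiling(graph: Graph, cell_type: Dict[Hashable, str],
                           t: str, d: int = 1, k: float = 0.5) -> Dict[Hashable, int]:
    """Return e_c, the edge distance of each type-t cell to its cluster boundary."""
    e = {c: 0 for c in graph}
    targets = [c for c in graph if cell_type[c] == t]
    hoods = {c: neighbors_within(graph, c, d) for c in targets}

    # Boundary cells: the fraction of other-type neighbors is at least k.
    for c in targets:
        n_c = hoods[c]
        if n_c:
            others = sum(1 for x in n_c if cell_type[x] != t)
            if others / len(n_c) >= k:
                e[c] = 1

    # Propagate inward: an unlabeled cell next to a level-i cell gets level i + 1.
    i = 1
    while True:
        updates = [c for c in targets
                   if e[c] == 0 and any(e[x] == i for x in hoods[c])]
        if not updates:
            break
        for c in updates:
            e[c] = i + 1
        i += 1
    return e


# Toy example: one stroma cell followed by a chain of three tumor cells.
graph = {"s": {"a"}, "a": {"s", "b"}, "b": {"a", "c"}, "c": {"b"}}
types = {"s": "stroma", "a": "tumor", "b": "tumor", "c": "tumor"}
distances = cell_spatial_profiling(graph, types, t="tumor", d=1, k=0.5)
```

In the toy example, the tumor cell adjacent to stroma is labeled 1 and the labels increase by one per step toward the interior, while cells of other types keep the initial value 0.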

Cellular neighborhood analysis

Living tissues are composed of various cellular communities that coexist in complex spatial structures. Clusters of cells, each specialized for particular tissues, work together as functional units to maintain and regulate organ functions. Thus, the analysis of spatial context is critical for a thorough understanding of tissue biology. Schapiro et al. proposed HistoCAT¹³, which highlighted the potential of studying cellular neighborhoods using multiplex immunohistochemistry (mIHC) imaging. In addition, Ruitenberg et al. explored the analysis of cellular neighbors in spatial omics, yielding new biological insights¹⁴. Inspired by these works, we introduce Cellular Neighborhood Analysis (CNA) (see Algorithm 2).

Algorithm 2

Cellular neighborhood analysis

1: definition

2: \(C=\left\{{c}_{1},{c}_{2},\cdots \right\}\): all chosen cells

3: \(T=\left\{{t}_{1},{t}_{2},\cdots \right\}\): all cell types

4: \(d\in {{\mathbb{Z}}}_{ > 0}\): a threshold of edge distance defining the neighborhood of a cell

5: \({\rm{EdgeDist}}\left(c,c^{\prime} \right)\in {{\mathbb{Z}}}_{ > 0}\): edge distance between \(c\) and \(c^{\prime}\)

6: \({\rm{CellType}}\left(c\right)\in T\): function for obtaining cell type of \(c\)

7: \({h}_{c,t}\in \left[\mathrm{0,1}\right]\): the probability that \(t\) exists in the neighborhood of \(c\)

8: procedure CellularNeighborhoodAnalysis

9: \({h}_{c,t}\leftarrow 0,\forall c\in C,\forall t\in T\)

10: for each \(c\in C\), do

11: \({N}_{c}\leftarrow \left\{{c}^{{\prime} }|{\rm{EdgeDist}}\left(c,{c}^{{\prime} }\right)\le {d}\,{\bf{and}}\,{c}^{{\prime} }\,\ne\, c,\forall c^{\prime} \in C\right\}\) ⊳ all neighbors of c

12: for each \(c^{\prime} \in {N}_{c}\), do

13: \(t^{\prime} \leftarrow {\rm{CellType}}\left(c^{\prime} \right)\)

14: \({h}_{c,t^{\prime} }\leftarrow {h}_{c,t^{\prime} }+1\)

15: \({h}_{c,t^{\prime} }\leftarrow {h}_{c,{t}^{{\prime} }}/\left|{N}_{c}\right|,\forall t^{\prime} \in T\)

return \({h}_{c,t},\forall c\in C,\forall t\in T\)
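A compact Python sketch of Algorithm 2 is given below. It assumes that the neighborhood \(N_c\) of each cell has already been computed (for example, from the Delaunay graph with the edge-distance threshold \(d\)) and is passed in as a dictionary. The names are hypothetical and the code is illustrative rather than QuST's implementation.

```python
from collections import Counter
from typing import Dict, Hashable, Set


def cellular_neighborhood_analysis(
    neighborhoods: Dict[Hashable, Set[Hashable]],
    cell_type: Dict[Hashable, str],
) -> Dict[Hashable, Dict[str, float]]:
    """Return h[c][t], the fraction of cells of type t among the neighbors of c."""
    all_types = sorted(set(cell_type.values()))
    h = {}
    for c, n_c in neighborhoods.items():
        counts = Counter(cell_type[x] for x in n_c)  # count each neighbor type
        size = len(n_c)
        # Normalize by |N_c|; a cell without neighbors keeps 0 for every type.
        h[c] = {t: (counts[t] / size if size else 0.0) for t in all_types}
    return h


# Toy example: tumor cell "a" with two lymphocyte neighbors and one tumor neighbor.
neighborhoods = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
types = {"a": "tumor", "b": "lymphocyte", "c": "lymphocyte", "d": "tumor"}
h = cellular_neighborhood_analysis(neighborhoods, types)
```

Here `h["a"]["lymphocyte"]` equals 2/3, which is the kind of per-cell value that can be rendered as a heat map.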

An outcome is depicted in Fig. 6. In the experiment, we initially emphasized the epithelial and tumor cells. We then carried out CNA and used a heat map to display the count of lymphocytes that could be identified in the vicinity of a cell. By overlaying the tumor regions and the CNA results, one can determine the likelihood of a tumor cell having at least one lymphocyte in its immediate surroundings.

**Fig. 6: A result of cellular neighborhood analysis.**

Cell-cell interaction analysis

ST is a powerful tool for understanding cell-cell interactions (CCIs) within tissues. By mapping gene expression patterns and spatial organization, researchers gain insights into how cells communicate and influence each other. This knowledge has implications for drug development, disease research, and personalized medicine.

QuST uses the datasets provided by CellTalkDB¹⁵, a manually curated database that provides a comprehensive collection of ligand-receptor (LR) pairs in both humans and mice. The database includes 3398 human LR pairs and 2033 mouse LR pairs, which were obtained through a combination of text mining, manual verification of known protein-protein interactions using the STRING database, and literature-supported evidence for each pair.

QuST uses the results of cellular spatial profiling to compute CCI, incorporating information about cell neighborhoods within specific regions of interest. When analyzing a receptor-expressing cell, QuST takes into account all ligand-expressing cells within a designated neighboring distance, determined using Delaunay clustering, to compute the corresponding CCI. Algorithm 3 details how QuST calculates the LR product. Our future work includes implementing more advanced methods¹⁵.

Figure 7 shows a CCI result for the CEACAM6-EGFR pair. CCI analysis offers an additional layer of investigation by generating a heat map that illustrates the intensity of CCI for a specific ligand-receptor pair. The generated heat map provides a quantitative measure of the strength and significance of communication among a cluster of cells.

**Fig. 7: An example of analyzing CEACAM6-EGFR CCI using QuST.**

Algorithm 3

Ligand/Receptor expression computing for CCI based on LR product

1: definition

2: \(C=\left\{{c}_{1},{c}_{2},\cdots \right\}\): all chosen cells

3: \(Q=\left\{{\left(l,r\right)}_{1},{\left(l,r\right)}_{2},\cdots \right\}\): all chosen ligand-receptor pairs

4: \(d\in {{\mathbb{Z}}}_{ > 0}\): a threshold of edge distance defining the neighborhood of a cell

5: \({\rm{EdgeDist}}\left(c,c^{\prime} \right)\in {{\mathbb{Z}}}_{ > 0}\): edge distance between \(c\) and \(c^{\prime}\)

6: \(\text{LigExpr}\left(c,l\right)\in {{\mathbb{R}}}_{\ge 0}\): ligand \(l\) expression of \(c\)

7: \(\text{RecExpr}\left(c,r\right)\in {{\mathbb{R}}}_{\ge 0}\): receptor \(r\) expression of \(c\)

8: \({v}_{c,\left(l,r\right),l}\in {{\mathbb{R}}}_{\ge 0}\): ligand expression of the given \(c\) and \(\left(l,r\right)\)

9: \({v}_{c,\left(l,r\right),r}\in {{\mathbb{R}}}_{\ge 0}\): receptor expression of the given \(c\) and \(\left(l,r\right)\)

10: procedure \({\rm{CCIProfiling}}\)

11: \({v}_{c,\left(l,r\right),l}\leftarrow 0,\forall c\in C,\forall \left(l,r\right)\in Q\)

12: \({v}_{c,\left(l,r\right),r}\leftarrow 0,\forall c\in C,\forall \left(l,r\right)\in Q\)

13: for each \(c\in C\), do

14: \({N}_{c}\leftarrow \left\{{c}^{{\prime} }|{\rm{EdgeDist}}\left(c,{c}^{{\prime} }\right)\le {d}\,{\bf{and}}\,{c}^{{\prime} }\ne c,\forall c^{\prime} \in C\right\}\) ⊳ all neighbors of c

15: for each \(\left(l,r\right)\in Q\), do

16: for each \(c^{\prime} {\in N}_{c}\), do

17: \({v}_{c,\left(l,r\right),l}\leftarrow {v}_{c,\left(l,r\right),l}+\text{LigExpr}\left(c,l\right)\times \text{RecExpr}\left(c^{\prime} ,r\right)\)

18: \({v}_{c,\left(l,r\right),r}\leftarrow {v}_{c,\left(l,r\right),r}+\text{RecExpr}\left(c,r\right)\times \text{LigExpr}\left(c^{\prime} ,l\right)\)

19: \({v}_{c,\left(l,r\right),l}\leftarrow {v}_{c,\left(l,r\right),l}/\left|{N}_{c}\right|\)

20: \({v}_{c,\left(l,r\right),r}\leftarrow {v}_{c,\left(l,r\right),r}/\left|{N}_{c}\right|\)

return \({v}_{c,\left(l,r\right),l},{v}_{c,\left(l,r\right),r},\forall c\in C,\forall \left(l,r\right)\in Q\)
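The LR product of Algorithm 3 can be sketched in Python as follows. The sketch assumes precomputed neighborhoods \(N_c\) and a single per-cell expression dictionary from which both ligand and receptor expression are read; the function name and data layout are hypothetical and do not reflect QuST's internal API.

```python
from typing import Dict, Hashable, List, Set, Tuple

Pair = Tuple[str, str]  # (ligand gene, receptor gene)


def cci_profiling(
    neighborhoods: Dict[Hashable, Set[Hashable]],
    expr: Dict[Hashable, Dict[str, float]],
    pairs: List[Pair],
):
    """Return (v_l, v_r), keyed by (cell, (ligand, receptor))."""
    v_l, v_r = {}, {}
    for c, n_c in neighborhoods.items():
        for l, r in pairs:
            # c acts as the ligand cell, its neighbors as receptor cells.
            lig_sum = sum(expr[c].get(l, 0.0) * expr[x].get(r, 0.0) for x in n_c)
            # c acts as the receptor cell, its neighbors as ligand cells.
            rec_sum = sum(expr[c].get(r, 0.0) * expr[x].get(l, 0.0) for x in n_c)
            size = len(n_c)
            v_l[(c, (l, r))] = lig_sum / size if size else 0.0
            v_r[(c, (l, r))] = rec_sum / size if size else 0.0
    return v_l, v_r


# Toy example with two neighboring cells and made-up expression values.
neighborhoods = {"a": {"b"}, "b": {"a"}}
expr = {"a": {"CEACAM6": 2.0}, "b": {"EGFR": 3.0}}
pair = ("CEACAM6", "EGFR")
v_l, v_r = cci_profiling(neighborhoods, expr, [pair])
```

In this toy case, cell "a" receives a ligand-side score of 6.0 and cell "b" a receptor-side score of 6.0, while the opposite directions are 0.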

Cell clustering analysis

Recent research has highlighted the relationship between biological processes, such as cell-cell interactions, and the local densities and positions of cells within cellular monolayers and stratified epithelia. To study this relationship, Küchenhoff et al. introduced DBSCAN-CellX¹⁶, a density-based clustering algorithm that extends the DBSCAN method¹⁷ to analyze cell localization and tissue physiology. For example, it has been noted that Tertiary Lymphoid Structures (TLSs) are associated with a favorable cancer prognosis¹⁸. Furthermore, a technique for identifying TLSs and evaluating their density on H&E-stained digital slides of lung cancer has been proposed¹⁹.

To facilitate cell clustering analysis, we integrated DBSCAN-CellX into QuST. Given the classes of the cells, QuST is able to compute local densities and positions of cells accordingly. The results are shown as the measurements in the detection table. Hence, the results can also be visually investigated and exported using the native QuPath functions of measurement maps.
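To make the density-based idea concrete, the following is a minimal pure-Python sketch of the original DBSCAN method¹⁷ applied to cell centroids. It is not DBSCAN-CellX and does not include its extensions; the parameter names `eps` and `min_pts` follow common DBSCAN usage, and the coordinates are made up.

```python
from math import dist
from typing import List, Optional, Tuple

NOISE = -1


def dbscan(points: List[Tuple[float, float]], eps: float, min_pts: int) -> List[int]:
    """Label each point with a cluster index, or NOISE (-1)."""
    n = len(points)
    labels: List[Optional[int]] = [None] * n

    def region(i: int) -> List[int]:
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(n) if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neigh = region(i)
        if len(neigh) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        seeds = [j for j in neigh if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neigh = region(j)
            if len(j_neigh) >= min_pts:
                seeds.extend(j_neigh)  # j is a core point, keep expanding
    return labels


# Two dense groups of hypothetical lymphocyte centroids and one isolated cell.
cells = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(cells, eps=1.5, min_pts=3)
```

The resulting labels separate the two dense groups and mark the isolated cell as noise.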

The result is shown in Fig. 8, indicating that clusters of lymphocytes (50+) were identified from the given sample. This integration allows researchers to utilize DBSCAN-CellX directly for quantitatively analyzing lymphocyte clusters, potentially identifying imaging biomarkers that correlate with disease prognosis.

**Fig. 8: An example of performing DBSCAN-CellX for lymphocytes.**

Pseudo spot generation based on single cell ST data

While sub-cellular ST technologies exist, they often have limitations in terms of the range of the genome they can cover. Consequently, lower-resolution technologies capable of analyzing the entire genome continue to be widely used. As a result, there is a requirement to generate data for evaluating gene expression deconvolution approaches.

QuST provides an opportunity to evaluate spatial single-cell deconvolution methods by mimicking Visium datasets (see Fig. 9). For example, Huang et al. proposed an ST auto-encoder and deconvolution method that integrates a scalable deep generative model to predict gene expression at the cellular or nuclear level based on H&E imaging and in situ RNA capturing, thus allowing a better understanding of the tissue micro-environment²⁰.
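The idea of pseudo spots can be sketched as aggregating single-cell counts into circular spots. The sketch below is a simplification under stated assumptions: it places spot centers on a square grid rather than the hexagonal Visium layout, uses a hypothetical pitch of 100 and radius of 27.5 in the same units as the cell coordinates, and discards cells that fall between spots. It is not QuST's implementation.

```python
from collections import defaultdict
from math import hypot
from typing import Dict, List, Tuple

Cell = Tuple[float, float, Dict[str, float]]  # (x, y, {gene: count})


def make_pseudo_spots(cells: List[Cell], pitch: float = 100.0,
                      radius: float = 27.5) -> Dict[Tuple[int, int], Dict[str, float]]:
    """Sum gene counts of all cells whose centroid lies inside a spot."""
    spots = defaultdict(lambda: defaultdict(float))
    for x, y, counts in cells:
        # Index of the nearest spot center on the square grid.
        gx, gy = round(x / pitch), round(y / pitch)
        if hypot(x - gx * pitch, y - gy * pitch) <= radius:
            for gene, n in counts.items():
                spots[(gx, gy)][gene] += n
    return {key: dict(genes) for key, genes in spots.items()}


# Made-up single-cell data: two cells in spot (0, 0), one between spots, one in spot (1, 1).
cells = [
    (5.0, 5.0, {"EGFR": 2}),
    (-10.0, 0.0, {"EGFR": 1, "CD3E": 4}),
    (50.0, 50.0, {"EGFR": 9}),
    (110.0, 95.0, {"CD3E": 1}),
]
spots = make_pseudo_spots(cells)
```

Because the single-cell composition of every pseudo spot is known, the output of a deconvolution method on these spots can be compared against ground truth.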

**Fig. 9: Pseudo spots generated using single-cell data.**

DL-based image classification and segmentation with QuPath

QuST’s primary utility lies in using transcriptomic data as the biological basis for training DL-based object categorization and region segmentation models for H&E images. To serve this purpose, QuST features functions for extracting single-cell H&E images for cell classification, as well as WSI patches for region segmentation. The QuPath annotation and detection measurement tables can be exported as files, serving as image label data for DL model training. Consequently, QuST reduces the workload for users training their own models for WSI analysis.

QuST includes a DL capability based on PyTorch²¹ to perform image classification, which supports cell categorization and region segmentation tasks. The training procedure requires two inputs: first, a folder containing an image set generated using the aforementioned image sampling approach; second, the detection/annotation measurement table, which can be generated directly from QuPath measurement functions. QuST offers a wide range of neural networks, including resnet²², vgg²³, densenet²⁴, and several variants of modern vision transformers (ViT)²⁵.
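How the two training inputs fit together can be sketched as a join between image files and table rows. The example below is only an illustration: the column names `Object ID` and `Classification`, the tab-separated layout, and the convention that each image file is named after its object ID are all assumptions, not the documented QuST or QuPath export format.

```python
import csv
import io
from pathlib import PurePath
from typing import List, Tuple


def build_label_index(image_names: List[str], table_text: str,
                      id_col: str = "Object ID",
                      label_col: str = "Classification") -> List[Tuple[str, str]]:
    """Pair each image with the label of the table row sharing its object ID."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    labels = {row[id_col]: row[label_col] for row in reader}
    pairs = []
    for name in image_names:
        object_id = PurePath(name).stem  # file name without folder and extension
        if object_id in labels:          # skip images without a table entry
            pairs.append((name, labels[object_id]))
    return pairs


# Hypothetical measurement table and image folder listing.
table = "Object ID\tClassification\ncell_001\tTumor\ncell_002\tLymphocyte\n"
images = ["patches/cell_001.png", "patches/cell_002.png", "patches/cell_999.png"]
pairs = build_label_index(images, table)
```

The resulting list of (image, label) pairs is the kind of input a PyTorch dataset class would typically wrap.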

Object classification

In the experiment, we first acquired H&E image patches for each detected object (e.g., a cell). Next, we used the genotype information provided by the chosen datasets to train the DL model for object classification. In this experiment, we used ViT. A confusion matrix, shown in Fig. 10, was computed based on 10-fold cross-validation. Some examples of single-cell genotype classification based on H&E are shown in Fig. 11. As shown in the confusion matrix, cell types 1 and 10 are predicted more accurately from single-cell H&E images, while type 4 is predicted poorly. This result suggests that certain cell types, such as lymphocytes and blood cells, are more readily distinguishable in H&E staining than others. Furthermore, differentiating B-cells from T-cells based on H&E staining is recognized as a particularly challenging task.
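For readers unfamiliar with how such a confusion matrix is read, the sketch below builds one from true and predicted labels and derives the per-class recall. The labels and predictions are made up for illustration and are unrelated to the experiment reported here.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     labels: Sequence[str]) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    index = {lab: i for i, lab in enumerate(labels)}
    m = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        m[index[t]][index[p]] += 1
    return m


def per_class_recall(m: List[List[int]]) -> List[float]:
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(m)]


# Made-up labels: "B" for B-cells, "T" for T-cells.
y_true = ["B", "B", "T", "T", "T"]
y_pred = ["B", "T", "T", "T", "B"]
m = confusion_matrix(y_true, y_pred, labels=["B", "T"])
recall = per_class_recall(m)
```

With cross-validation, the matrices of the individual folds can be summed element-wise before computing the recall.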

**Fig. 10: Confusion matrix of single-cell genotype classification based on H&E.**

**Fig. 11: Some examples of cell genotype classification.**

Region segmentation

In the experiment, we used manual annotation as shown in Fig. 12a. The chosen model was resnet50. The testing target is shown in Fig. 12b. Based on 10-fold cross-validation, we obtained the confusion matrix shown in Fig. 13.

**Fig. 12: Results showing the heat map representing the probability of various classification results.**

**Fig. 13: Confusion matrix for region segmentation.**

In addition, QuST provides region segmentation with an arbitrary tile size (i.e., resolution). The higher the resolution, the longer the processing time. Figure 14 shows an example of region segmentation at various resolutions.
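The trade-off between resolution and processing time can be seen by counting tiles: halving the tile size roughly quadruples the number of tiles to classify. The sketch below is illustrative only; the region dimensions and tile sizes are arbitrary, and `tile_grid` is a hypothetical helper, not a QuST function.

```python
from typing import List, Tuple

Tile = Tuple[int, int, int, int]  # (x, y, width, height)


def tile_grid(width: int, height: int, tile: int) -> List[Tile]:
    """Cover a width x height region with tiles, clipping tiles at the border."""
    return [
        (x, y, min(tile, width - x), min(tile, height - y))
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]


coarse = tile_grid(1000, 600, tile=256)  # 4 columns x 3 rows
fine = tile_grid(1000, 600, tile=128)    # 8 columns x 5 rows
```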

**Fig. 14: Image example for DL-based image region prediction.**

Data availability

Given the availability of H&E images, we mainly used 10x Genomics datasets in the experiments. The datasets used are as follows. 10x Genomics dataset FFPE Human Breast using the Entire Sample Area (https://www.10xgenomics.com/datasets/ffpe-human-breast-using-the-entire-sample-area-1-standard): the sample was a 5 μm section from an FFPE human breast resected tumor mass sample of infiltrating ductal carcinoma, provided by Avaden Biosciences. 10x Genomics dataset FFPE Human Breast with Custom Add-on Panel (https://www.10xgenomics.com/datasets/xenium-ffpe-human-breast-with-custom-add-on-panel-1-standard): the sample was a 5 μm section from an FFPE human infiltrating ductal carcinoma and ductal carcinoma in situ sample, provided by BioIVT. NanoString CosMx dataset FFPE Human NSCLC (https://staging.nanostring.com/products/cosmx-spatial-molecular-imager/ffpe-dataset/nsclc-ffpe-dataset/), file ID: Lung 5-2.

Code availability

QuST is developed based on QuPath 0.5.1 and Python 3.10+ and is available under the Apache 2.0 license (https://github.com/huangch/qust).

References

1. Bankhead, P. et al. QuPath: Open source software for digital pathology image analysis. Sci. Rep. 7, 16878 (2017).
2. Pocock, J. et al. TIAToolbox as an end-to-end library for advanced tissue image analytics. Commun. Med. 2, 120 (2022).
3. Cardoso, M. J. et al. MONAI: An open-source framework for deep learning in healthcare. Preprint at https://arxiv.org/abs/2211.02701 (2022).
4. Dolezal, J. M. et al. Slideflow: deep learning for digital histopathology with real-time whole-slide visualization. BMC Bioinforma. 25, 134 (2024).
5. Faust, K. et al. PHARAOH: A collaborative crowdsourcing platform for phenotyping and regional analysis of histology. Nat. Commun. 16, 742 (2025).
6. Kaczmarzyk, J. R. et al. Open and reusable deep learning for pathology with WSInfer and QuPath. npj Precis. Oncol. 8, 9 (2024).
7. Wood, C. S. et al. Spatially resolved transcriptomics deconvolutes prognostic histological subgroups in patients with colorectal cancer and synchronous liver metastases. Cancer Res. 83, 1329–1344 (2023).
8. Schmidt, U., Weigert, M., Broaddus, C. & Myers, G. Cell detection with star-convex polygons. In Medical Image Computing and Computer Assisted Intervention (MICCAI), 265–273 (Springer, 2018).
9. Stringer, C., Wang, T., Michaelos, M. & Pachitariu, M. Cellpose: a generalist algorithm for cellular segmentation. Nat. Methods 18, 100–106 (2020).
10. Wang, S. et al. A deep learning-based stripe self-correction method for stitched microscopic images. Nat. Commun. 14, 5393 (2023).
11. Jaume, G. et al. HEST-1k: A dataset for spatial transcriptomics and histology image analysis. In Advances in Neural Information Processing Systems (2024).
12. Varghese, R. & Sambath, M. YOLOv8: A novel object detection algorithm with enhanced performance and robustness. In International Conference on Advances in Data Engineering and Intelligent Computing Systems (ADICS), 1–6 (IEEE, 2024).
13. Schapiro, D. et al. HistoCAT: Analysis of cell phenotypes and interactions in multiplex image cytometry data. Nat. Methods 14, 873–876 (2017).
14. Ruitenberg, M. J. & Nguyen, Q. H. Cellular neighborhood analysis in spatial omics reveals new tissue domains and cell subtypes. Nat. Genet. 56, 362–364 (2024).
15. Shao, X. et al. CellTalkDB: a manually curated database of ligand-receptor interactions in humans and mice. Brief. Bioinforma. 22, bbaa269 (2021).
16. Küchenhoff, L. et al. Extended methods for spatial cell classification with DBSCAN-CellX. Sci. Rep. 13, 18868 (2023).
17. Ester, M., Kriegel, H.-P., Sander, J. & Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, KDD’96, vol. 1, 226–231 (AAAI Press, 1996).
18. Rodriguez, A. B. et al. Immune mechanisms orchestrate tertiary lymphoid structures in tumors via cancer-associated fibroblasts. Cell Rep. 36, 109422 (2021).
19. Barmpoutis, P. et al. Tertiary lymphoid structures (TLS) identification and density assessment on H&E-stained digital slides of lung cancer. PLoS ONE 16, e0256907 (2021).
20. Huang, C.-H., Park, Y., Pang, J. & Bienkowska, J. R. Single-cell gene expression prediction using H&E images based on spatial transcriptomics. In Proceedings SPIE 12471, Medical Imaging 2023: Digital and Computational Pathology, vol. 12471, 1247105 (SPIE, 2023).
21. Paszke, A. et al. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32 (eds. Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché Buc, F., Fox, E. & Garnett, R.) 8024–8035 (Curran Associates, 2019).
22. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In 29th IEEE Computer Vision and Pattern Recognition (CVPR), vol. 2016, 770–778 (IEEE, 2016).
23. Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. In 3rd International Conference on Learning Representations (ICLR2015), Computational and Biological Learning Society, 1–14 (2015).
24. Huang, G., Liu, Z., Maaten, L. V. D. & Weinberger, K. Q. Densely connected convolutional networks. In 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2017, 2261–2269 (IEEE, 2017).
25. Dosovitskiy, A. et al. An image is worth 16x16 words: Transformers for image recognition at scale. In 9th International Conference on Learning Representations (ICLR2021), Computational and Biological Learning Society (2021).

Author information

Authors and Affiliations

Oncology Research & Development, Pfizer Inc., 10777 Science Center Dr., San Diego, CA, 92121, USA
Chao-Hui Huang, Sara Lichtarge & Diane Fernandez

Contributions

CHH conceived the project, developed the algorithms and the QuPath plugin. SL trained and evaluated the machine learning models for image analysis, and maintained the documentation for the project. DF provided consultations and pathological opinions for the experiments. CHH wrote the manuscript with revisions from all authors. All authors approved the final version of the manuscript and agreed to submission.

Corresponding author

Correspondence to Chao-Hui Huang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Huang, CH., Lichtarge, S. & Fernandez, D. Integrative whole slide image and spatial transcriptomics analysis with QuST and QuPath. npj Precis. Onc. 9, 70 (2025). https://doi.org/10.1038/s41698-025-00841-9

Received: 30 May 2024
Accepted: 17 February 2025
Published: 12 March 2025
DOI: https://doi.org/10.1038/s41698-025-00841-9