Abstract
The integration of AI in digital pathology, particularly in whole slide image (WSI) and spatial transcriptomics (ST) analysis, holds immense potential for enhancing disease understanding. Despite challenges such as training pattern preparation and resolution disparities, the convergence of these technologies can unlock insights. We introduce QuST, a QuPath extension that bridges the gap between WSI and ST at single-cell level, highlighting the power of this integrated approach in disease biology.
Introduction
Spatial analysis, a critical component of pathology, has greatly enhanced our understanding of complex biological processes. Traditional pathology, which involves scrutinizing tissue slides with high-power microscopy, is labor-intensive. However, the advent of digital image analysis (DIA) and machine learning (ML) technologies has broadened the scope of artificial intelligence (AI) in this field. Over the past few years, numerous deep learning (DL)-based whole slide image (WSI) analysis tools have been introduced, including QuPath [1], TIA Toolbox [2], MONAI [3], SlideFlow [4], PHARAOH [5], and WSInfer [6].
One of the significant hurdles in DL-based WSI analysis is the creation of training patterns. Hematoxylin & eosin (H&E), the standard tissue staining technique, provides structural information but rarely offers direct biological evidence such as gene expression and transcription factors. As a result, the success of DL-based WSI analysis hinges largely on the expertise of those conducting manual annotation tasks on the WSI H&E images.
On the other hand, spatial transcriptomics (ST) has seen significant advancements as it enables the visualization and analysis of histological sections with gene expression features. ST provides valuable spatial context to molecular data, making it vital for studying complex biological processes, such as cell-cell interactions. ST also presents a unique opportunity to address the challenges of DL-based WSI analysis, as subcellular ST technologies are already available. However, merging these two powerful modalities has been challenging due to differences in data formats and analysis methods.
Numerous studies have explored tools for analyzing ST together with WSI. For example, Wood et al. examined the use of QuPath for image analysis, paired with GeoMx ST, to investigate gene expression variability in colorectal cancer and liver metastases [7]. Tippani et al. also highlighted the need to bridge high-resolution images and spatially resolved transcriptomics data, and concluded that QuPath lacks capabilities for preprocessing multichannel fluorescent images. Despite these constraints, QuPath’s image analysis functionality has been widely recognized. Nonetheless, these studies mainly use QuPath for initial analysis, with detailed ST analysis done using other tools.
While providers of ST technologies have developed various platforms for visually examining and researching biological insights from given samples, such as Loupe Browser (https://www.10xgenomics.com/support/software/loupe-browser/latest), Xenium Explorer (https://www.10xgenomics.com/support/software/xenium-explorer/latest), and AtoMx Spatial Informatics Platform (https://nanostring.com/products/atomx-spatial-informatics-platform/atomx-sip-overview/), these tools have not fully exploited the functionalities of DIA, resulting in ineffective integration of existing DIA tools and platforms into ST research. To address this gap, we present QuST, a QuPath extension that offers a comprehensive platform for integrating and analyzing WSI and ST data. QuST is designed to enable more in-depth spatial-omics analysis, including cell-cell interactions, cell spatial profiling, and visualization. Furthermore, QuST’s implementation of Deep Learning (DL)-based cell categorization and region segmentation methods could facilitate image annotation based on biological evidence.
Methods
QuST is designed to seamlessly integrate WSI and ST analysis with QuPath, enhancing its capabilities with tools specifically tailored for spatial biology. The extension supports the visualization of spatial gene expression data within the context of histopathological images, enabling users to explore the molecular landscape of tissues at single-cell resolution (see Fig. 1). Below, we introduce some of the analysis tools and use cases available in QuST.
a Users begin by importing ST data into QuPath using QuST. This step may require additional spatial alignment data, which can be generated via FIJI (see Fig. 2), if the user is working on a Xenium dataset (see text). b Once the ST data is successfully loaded, users can perform analysis and visualization via QuPath and QuST. c Given the biological evidence provided by ST, users can generate the training set for image-based cell classification and region segmentation based on H&E. Finally, the results generated using the DL module can be further analyzed using the functions described in b.
Integrative WSI and ST analysis at single-cell level
QuST is being developed with a focus on single-cell level analysis. It has the capability to load single-cell level ST data formats, including those from 10x Genomics Xenium and NanoString CosMx. Furthermore, it can also load 10x Genomics Visium datasets for whole-slide, full-spectrum ST analysis.
A significant challenge in single-cell level ST analysis lies in aligning ST data with WSI, primarily because different image modalities are involved. For instance, in Xenium and CosMx, cell localization relies on DAPI staining, while the WSI may be H&E-stained. Acquiring both stains requires multiple rounds of sample preparation and scanning, so loading ST data requires an extra step that aligns the ST data with the provided WSI. To address this issue, we took inspiration from the guidelines provided by 10x Genomics (https://www.10xgenomics.com/analysis-guides/he-to-xenium-dapi-image-registration-with-fiji) and proposed a method that aligns the coordinates of the ST data to the reference image, as illustrated in Fig. 2.
To test the proposed approach, we first ran a cell detection algorithm (e.g., StarDist [8] or Cellpose [9]) on the H&E image. We then loaded the transcriptomic data twice, once with and once without the image registration information. The results are shown in Fig. 3, and the statistical evaluation of cell displacement between the H&E image and the transcriptomic data is shown in Fig. 4. In the experiment, 723,384 cells were detected from the H&E images. Without image registration, 122,288 cells were missing. The root causes include: 1) different quality control approaches for the two data modalities; and 2) location information from the transcriptomic data that did not match the cells detected on the given H&E images. With image registration, this number dropped to 99,405, an 18.71% reduction ((122,288 − 99,405) / 122,288). In addition, the deformation appears closely related to the grid-like artifacts [10] that result from stitching the DAPI images (see Fig. 3e, f). Integrating the proposed ST and WSI registration can therefore mitigate noise introduced during data acquisition.
For performing whole-slide, full-spectrum ST analysis using 10x Genomics Visium datasets, QuST requires the corresponding affine matrix, generated manually via the Visium Image Alignment function in the Loupe Browser (see https://www.10xgenomics.com/support/software/space-ranger/latest/analysis/inputs/image-fiducial-alignment). In addition, HEST-1k [11] introduced automatic fiducial detection using a YOLOv8 model [12], which enables alignment inference for full-spectrum ST and WSI interactive analysis using Visium datasets.
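As a minimal sketch of what coordinate alignment can look like, the following applies a 3x3 homogeneous affine matrix to cell centroids. The function name, the matrix values, and the direction of the mapping (ST coordinates into H&E pixel coordinates) are illustrative assumptions, not QuST's actual API or the matrix produced by FIJI for any real dataset.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def apply_affine(matrix: Sequence[Sequence[float]], points: List[Point]) -> List[Point]:
    """Map (x, y) points through a 3x3 homogeneous affine matrix.

    Only the first two rows are used; the last row of an affine
    matrix is always [0, 0, 1].
    """
    (a, b, tx), (c, d, ty) = matrix[0], matrix[1]
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


# Hypothetical registration matrix: scale by 0.5, then shift by (100, 200).
example_matrix = [
    [0.5, 0.0, 100.0],
    [0.0, 0.5, 200.0],
    [0.0, 0.0, 1.0],
]

# Hypothetical ST cell centroids (DAPI image coordinates).
st_centroids = [(0.0, 0.0), (400.0, 800.0)]

# Centroids expressed in the H&E reference image frame.
he_centroids = apply_affine(example_matrix, st_centroids)
print(he_centroids)  # [(100.0, 200.0), (300.0, 600.0)]
```

If the registration tool exports the inverse mapping instead, the matrix would need to be inverted before use.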
Cellular spatial profiling
Cell spatial profiling plays a critical role in spatial-omics analysis. In QuST, it provides the foundation for all other spatial computations. First, Delaunay clustering is performed to obtain the neighboring cell connectivity. Next, the edge distance from each chosen cell to the cluster boundary is computed. As a result, the position of each cell within its cluster is obtained and can be used in subsequent analysis tasks. The detailed algorithm is shown in Algorithm 1.
The results shown in Fig. 5 represent insights from cellular spatial analysis. The heat map indicates the boundary distance of individual cells, e.g., the distance from a cancer epithelial cell to the boundary of the corresponding tumor. Based on the heat map, one can explore the differential gene expression patterns between the intratumoral tumor cells and the tumor cells present in the immune-invasive region, which are located on the surface of the tumor.
a Neighboring cell connectivity based on Delaunay clustering. Various single cell analyses available in QuST are based on the neighboring cell connectivity. b QuST's cellular spatial profiling generates a heat map indicating the distance to boundary of a specific cell type, e.g., tumor-epithelial cells to the corresponding tumor boundary.
A use case of QuST is spatial profiling of the tumor micro-environment (TME). The TME encompasses the surrounding cellular and non-cellular components that interact with cancer cells. It plays a crucial role in tumor growth, progression, and response to therapy. By understanding the complex interactions between cancer cells, immune cells, stromal cells, and the extracellular matrix, researchers can identify potential targets for therapeutic intervention. Given the rich information provided by an ST dataset, the functions QuST provides are well suited to TME studies.
Algorithm 1
Cell spatial profiling based on Delaunay clustering
1: definition
2: \(C=\left\{{c}_{1},{c}_{2},\cdots \right\}\): all chosen cells
3: \(T=\left\{{t}_{1},{t}_{2},\cdots \right\}\): all cell types
4: \(d\in {{\mathbb{Z}}}_{ > 0}\): a threshold of edge distance defining the neighborhood of a cell
5: \(k\in \left[0,1\right]\): a threshold for determining whether \(c\) is located at the boundary of a cluster
6: \({\rm{EdgeDist}}\left(c,c^{\prime} \right)\in {{\mathbb{Z}}}_{ > 0}\): edge distance between \(c\) and \(c^{\prime}\)
7: \({\rm{CellType}}\left(c\right)\in T\): function for obtaining cell type of \(c\)
8: \({e}_{c}\in {{\mathbb{Z}}}_{\ge 0}\): the computed edge distance to the boundary of clusters, \(\forall c\in C\)
9: procedure \({\rm{CellSpatialProfiling}}\left(t\in T:{\rm{a\; chosen\; cell\; type}}\right)\)
10: \({e}_{c}\leftarrow 0,\forall c\in C\)
11: for each \(c\in C\) where \({\rm{CellType}}\left(c\right)=t\), do
12: \({N}_{c}\leftarrow \left\{{c}^{{\prime} }|{\rm{EdgeDist}}\left(c,{c}^{{\prime} }\right)\le {d}\,{\bf{and}}\,{c}^{{\prime} }\ne c,\forall c^{\prime} \in C\right\}\) ⊳ all neighbors of \({c}\)
13: \({O}_{c}\leftarrow \left\{{c}^{{\prime} }|{\text{CellType}}\left({c}^{{\prime} }\right)\,\ne\, t,\forall c^{\prime} \in {N}_{c}\right\}\) ⊳ all other types in neighbors of c
14: \({e}_{c}\leftarrow 1,{\bf{if}}\frac{\left|{O}_{c}\right|}{\left|{N}_{c}\right|}\ge k\) ⊳ initialize \({e}_{c}\), the distance to the cluster boundary
15: \(i\leftarrow 1\) ⊳ \(i\): distance indicator
16: repeat
17: for each \(c\in C\) where \(\text{CellType}\left(c\right)=t\), do
18: \({N}_{c}\leftarrow \left\{{c}^{{\prime} }|{\rm{EdgeDist}}\left(c,{c}^{{\prime} }\right)\le {d}\,{\bf{and}}\,{c}^{{\prime} }\ne c,\forall c^{\prime} \in C\right\}\) ⊳ all neighbors of c
19: \({O}_{c}\leftarrow \left\{{c}^{{\prime} }|\text{CellType}\left({c}^{{\prime} }\right)\,\ne\, t,\forall c^{\prime} \in {N}_{c}\right\}\) ⊳ all other types in neighbors of c
20: \({e}_{c}\leftarrow i+1,{\bf{if}}\frac{\left|{O}_{c}\right|}{\left|{N}_{c}\right|}\le k\,{\bf{and}}\,\exists {c}^{{\prime} }\in {N}_{c},{e}_{{c}^{{\prime} }}=i\)
21: \(i\leftarrow i+1\)
22: until \(\forall c\in C,{e}_{c}\) are obtained
return \({e}_{c},\forall c\in C\)
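The idea behind Algorithm 1 can be sketched in a few lines of Python. This is an illustrative reading, not QuST's implementation: the neighbor graph is passed in directly (standing in for the Delaunay connectivity with \(d=1\)), the function name is hypothetical, and two liberties are taken that the pseudocode does not state: a cell is only labeled once, and the loop stops when no further cell can be reached, so that clusters without any boundary cell do not loop forever.

```python
from typing import Dict, Hashable, Set


def cell_spatial_profiling(
    neighbors: Dict[Hashable, Set[Hashable]],
    cell_type: Dict[Hashable, str],
    target: str,
    k: float = 0.5,
) -> Dict[Hashable, int]:
    """Edge distance of each `target` cell to its cluster boundary (0 = not profiled)."""
    e = {c: 0 for c in neighbors}
    targets = [c for c in neighbors if cell_type[c] == target]

    def other_ratio(c: Hashable) -> float:
        nbrs = neighbors[c]
        if not nbrs:
            return 1.0  # assumption: an isolated cell counts as a boundary cell
        return sum(cell_type[n] != target for n in nbrs) / len(nbrs)

    # Boundary cells: enough neighbors of a different type.
    for c in targets:
        if other_ratio(c) >= k:
            e[c] = 1

    # Grow inward one layer at a time.
    i = 1
    while True:
        layer = [c for c in targets
                 if e[c] == 0 and any(e[n] == i for n in neighbors[c])]
        if not layer:
            break
        for c in layer:
            e[c] = i + 1
        i += 1
    return e


# Toy example: a chain stroma - 5 tumor cells - stroma.
edges = [("s0", "t1"), ("t1", "t2"), ("t2", "t3"),
         ("t3", "t4"), ("t4", "t5"), ("t5", "s6")]
neighbors: Dict[str, Set[str]] = {}
for u, v in edges:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)
cell_type = {c: ("tumor" if c.startswith("t") else "stroma") for c in neighbors}

depth = cell_spatial_profiling(neighbors, cell_type, target="tumor", k=0.5)
print(depth)
```

In the toy chain, the two tumor cells touching stroma get distance 1, their inner neighbors get 2, and the central cell gets 3.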
Cellular neighborhood analysis
Living tissues are composed of various cellular communities that coexist in complex spatial structures. Clusters of cells, each specialized for particular tissues, work together as functional units to maintain and regulate organ functions. Thus, the analysis of spatial context is critical for a thorough understanding of tissue biology. Schapiro et al. proposed HistoCAT [13], highlighting the potential of studying cellular neighbors using multiplex immunohistochemistry (mIHC) imaging. In addition, Ruitenberg et al. explored the analysis of cellular neighbors in spatial omics, yielding new biological insights [14]. Inspired by their work, we introduced Cellular Neighborhood Analysis (CNA) (see Algorithm 2).
Algorithm 2
Cellular neighborhood analysis
1: definition
2: \(C=\left\{{c}_{1},{c}_{2},\cdots \right\}\): all chosen cells
3: \(T=\left\{{t}_{1},{t}_{2},\cdots \right\}\): all cell types
4: \(d\in {{\mathbb{Z}}}_{ > 0}\): a threshold of edge distance defining the neighborhood of a cell
5: \({\rm{EdgeDist}}\left(c,c^{\prime} \right)\in {{\mathbb{Z}}}_{ > 0}\): edge distance between \(c\) and \(c^{\prime}\)
6: \({\rm{CellType}}\left(c\right)\in T\): function for obtaining cell type of \(c\)
7: \({h}_{c,t}\in \left[\mathrm{0,1}\right]\): the probability that \(t\) exists in the neighborhood of \(c\)
8: procedure CellularNeighborhoodAnalysis
9: \({h}_{c,t}\leftarrow 0,\forall c\in C,\forall t\in T\)
10: for each \(c\in C\), do
11: \({N}_{c}\leftarrow \left\{{c}^{{\prime} }|{\rm{EdgeDist}}\left(c,{c}^{{\prime} }\right)\le {d}\,{\bf{and}}\,{c}^{{\prime} }\,\ne\, c,\forall c^{\prime} \in C\right\}\) ⊳ all neighbors of c
12: for each \(c^{\prime} \in {N}_{c}\), do
13: \(t^{\prime} \leftarrow {\rm{CellType}}\left(c^{\prime} \right)\)
14: \({h}_{c,t^{\prime} }\leftarrow {h}_{c,t^{\prime} }+1\)
15: \({h}_{c,t^{\prime} }\leftarrow {h}_{c,{t}^{{\prime} }}/\left|{N}_{c}\right|,\forall t^{\prime} \in T\)
return \({h}_{c,t},\forall c\in C,\forall t\in T\)
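A minimal Python sketch of Algorithm 2 follows. It assumes the neighbor graph has already been computed (for example, from the Delaunay connectivity) and is passed in as a dictionary; the function name and the toy data are hypothetical. Cells without neighbors are given a fraction of 0 for every type, a case the pseudocode leaves open.

```python
from collections import Counter
from typing import Dict, Hashable, Set


def neighborhood_composition(
    neighbors: Dict[Hashable, Set[Hashable]],
    cell_type: Dict[Hashable, str],
) -> Dict[Hashable, Dict[str, float]]:
    """For each cell c, return h[c][t]: the fraction of c's neighbors with type t."""
    types = sorted(set(cell_type.values()))
    h = {}
    for c, nbrs in neighbors.items():
        counts = Counter(cell_type[n] for n in nbrs)
        total = len(nbrs)
        h[c] = {t: (counts[t] / total if total else 0.0) for t in types}
    return h


# Toy example: one tumor cell surrounded by four neighbors.
neighbors = {
    "a": {"b", "c", "d", "e"},
    "b": {"a"}, "c": {"a"}, "d": {"a"}, "e": {"a"},
}
cell_type = {"a": "tumor", "b": "lymphocyte", "c": "tumor",
             "d": "lymphocyte", "e": "stroma"}

h = neighborhood_composition(neighbors, cell_type)
print(h["a"])  # {'lymphocyte': 0.5, 'stroma': 0.25, 'tumor': 0.25}
```

Coloring each tumor cell by its lymphocyte fraction gives a per-cell map of lymphocyte presence in the immediate neighborhood.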
An outcome is depicted in Fig. 6. In the experiment, we initially emphasized the epithelial and tumor cells. We then carried out CNA and used a heat map to display the count of lymphocytes that could be identified in the vicinity of a cell. By overlaying the tumor regions and the CNA results, one can determine the likelihood of a tumor cell having at least one lymphocyte in its immediate surroundings.
Cell-cell interaction analysis
ST is a powerful tool for understanding cell-cell interactions (CCIs) within tissues. By mapping gene expression patterns and spatial organization, researchers gain insights into how cells communicate and influence each other. This knowledge has implications for drug development, disease research, and personalized medicine.
QuST uses the datasets provided by CellTalkDB [15], a manually curated database that provides a comprehensive collection of ligand-receptor (LR) pairs in both humans and mice. The database includes 3398 human LR pairs and 2033 mouse LR pairs, which were obtained through a combination of text mining, manual verification of known protein-protein interactions using the STRING database, and literature-supported evidence for each pair.
QuST uses the results of cellular spatial profiling to compute CCI, incorporating crucial information about cell neighborhoods within specific regions of interest. When analyzing a receptor-expressing cell, QuST takes into account all ligand-expressing cells situated within a designated neighboring distance, determined using Delaunay clustering, when computing the corresponding CCI. Algorithm 3 details how QuST calculates the LR product. Our future work includes implementing more advanced methods [15].
Figure 7 shows a CCI result, focusing on the CEACAM6-EGFR pair. CCI analysis offers an additional layer of investigation by generating a heat map that illustrates the intensity of CCI for a specific ligand-receptor pair. The generated heat map provides a quantitative measure of the strength and significance of communication among a cluster of cells.
Algorithm 3
Ligand/Receptor expression computing for CCI based on LR product
1: definition
2: \(C=\left\{{c}_{1},{c}_{2},\cdots \right\}\): all chosen cells
3: \(Q=\left\{{\left(l,r\right)}_{1},{\left(l,r\right)}_{2},\cdots \right\}\): all chosen ligand-receptor pairs
4: \(d\in {{\mathbb{Z}}}_{ > 0}\): a threshold of edge distance defining the neighborhood of a cell
5: \({\rm{EdgeDist}}\left(c,c^{\prime} \right)\in {{\mathbb{Z}}}_{ > 0}\): edge distance between \(c\) and \(c^{\prime}\)
6: \(\text{LigExpr}\left(c,l\right)\in {{\mathbb{R}}}_{\ge 0}\): ligand \(l\) expression of \(c\)
7: \(\text{RecExpr}\left(c,r\right)\in {{\mathbb{R}}}_{\ge 0}\): receptor \(r\) expression of \(c\)
8: \({v}_{c,\left(l,r\right),l}\in {{\mathbb{R}}}_{\ge 0}\): ligand expression of the given \(c\) and \(\left(l,r\right)\)
9: \({v}_{c,\left(l,r\right),r}\in {{\mathbb{R}}}_{\ge 0}\): receptor expression of the given \(c\) and \(\left(l,r\right)\)
10: procedure \({\rm{CCIProfiling}}\)
11: \({v}_{c,\left(l,r\right),l}\leftarrow 0,\forall c\in C,\forall \left(l,r\right)\in Q\)
12: \({v}_{c,\left(l,r\right),r}\leftarrow 0,\forall c\in C,\forall \left(l,r\right)\in Q\)
13: for each \(c\in C\), do
14: \({N}_{c}\leftarrow \left\{{c}^{{\prime} }|{\rm{EdgeDist}}\left(c,{c}^{{\prime} }\right)\le {d}\,{\bf{and}}\,{c}^{{\prime} }\ne c,\forall c^{\prime} \in C\right\}\) ⊳ all neighbors of c
15: for each \(\left(l,r\right)\in Q\), do
16: for each \(c^{\prime} {\in N}_{c}\), do
17: \({v}_{c,\left(l,r\right),l}\leftarrow {v}_{c,\left(l,r\right),l}+\text{LigExpr}\left(c,l\right)\times \text{RecExpr}\left(c^{\prime} ,r\right)\)
18: \({v}_{c,\left(l,r\right),r}\leftarrow {v}_{c,\left(l,r\right),r}+\text{RecExpr}\left(c,r\right)\times \text{LigExpr}\left(c^{\prime} ,l\right)\)
19: \({v}_{c,\left(l,r\right),l}\leftarrow {v}_{c,\left(l,r\right),l}/\left|{N}_{c}\right|\)
20: \({v}_{c,\left(l,r\right),r}\leftarrow {v}_{c,\left(l,r\right),r}/\left|{N}_{c}\right|\)
return \({v}_{c,\left(l,r\right),l},{v}_{c,\left(l,r\right),r},\forall c\in C,\forall \left(l,r\right)\in Q\)
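The LR product in Algorithm 3 can be sketched as follows. This is a hedged illustration rather than QuST's code: the function name, the dictionary-based expression lookup, and the toy expression values are assumptions, and cells without neighbors are assigned a score of 0.

```python
from typing import Dict, Hashable, List, Set, Tuple

Pair = Tuple[str, str]  # (ligand gene, receptor gene)


def lr_product(
    neighbors: Dict[Hashable, Set[Hashable]],
    expr: Dict[Hashable, Dict[str, float]],
    pairs: List[Pair],
):
    """Return (v_lig, v_rec), each keyed by (cell, (ligand, receptor)).

    v_lig: the cell's ligand expression times the mean receptor expression of its neighbors.
    v_rec: the cell's receptor expression times the mean ligand expression of its neighbors.
    """
    v_lig, v_rec = {}, {}
    for c, nbrs in neighbors.items():
        total = len(nbrs)
        for lig, rec in pairs:
            lig_sum = sum(expr[c].get(lig, 0.0) * expr[nb].get(rec, 0.0) for nb in nbrs)
            rec_sum = sum(expr[c].get(rec, 0.0) * expr[nb].get(lig, 0.0) for nb in nbrs)
            v_lig[(c, (lig, rec))] = lig_sum / total if total else 0.0
            v_rec[(c, (lig, rec))] = rec_sum / total if total else 0.0
    return v_lig, v_rec


# Toy example with made-up expression values.
neighbors = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
expr = {
    "a": {"CEACAM6": 2.0},
    "b": {"EGFR": 3.0},
    "c": {"EGFR": 1.0},
}
pair = ("CEACAM6", "EGFR")

v_lig, v_rec = lr_product(neighbors, expr, [pair])
print(v_lig[("a", pair)])  # 4.0
print(v_rec[("b", pair)])  # 6.0
```

Here cell `a` scores (2×3 + 2×1)/2 = 4.0 as a ligand source, and cell `b` scores 3×2/1 = 6.0 as a receptor target.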
Cell clustering analysis
Recent research has highlighted the close relationship between biological processes, such as cell-cell interactions, and the local densities and positions of cells within cellular monolayers and stratified epithelia. To study this relationship, Küchenhoff et al. introduced DBSCAN-CellX [16], a density-based clustering algorithm that extends the DBSCAN method [17] to analyze cell localization and tissue physiology. For example, it has been noted that Tertiary Lymphoid Structures (TLSs) are associated with a favorable cancer prognosis [18]. Furthermore, a technique for identifying TLSs and evaluating their density on H&E-stained digital slides of lung cancer has been proposed [19].
To facilitate cell clustering analysis, we integrated DBSCAN-CellX into QuST. Given the classes of the cells, QuST is able to compute local densities and positions of cells accordingly. The results are shown as the measurements in the detection table. Hence, the results can also be visually investigated and exported using the native QuPath functions of measurement maps.
The result is shown in Fig. 8, indicating that clusters of lymphocytes (50+) were identified from the given sample. This integration allows researchers to utilize DBSCAN-CellX directly for quantitatively analyzing lymphocyte clusters, potentially identifying imaging biomarkers that correlate with disease prognosis.
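To make the density-based idea concrete, here is a small, self-contained sketch of plain DBSCAN applied to cell centroids. It is not DBSCAN-CellX, which adds further steps on top of DBSCAN, and it is not the code QuST runs; the parameter values and coordinates are invented for illustration.

```python
import math
from typing import List, Tuple


def dbscan(points: List[Tuple[float, float]], eps: float, min_pts: int) -> List[int]:
    """Plain DBSCAN. Returns one label per point: -1 for noise, 0..n-1 for clusters."""
    UNVISITED, NOISE = -2, -1
    labels = [UNVISITED] * len(points)

    def region(i: int) -> List[int]:
        # Indices within eps of point i (the point itself is included).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] != UNVISITED:
            continue
        seeds = region(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but does not expand
            if labels[j] != UNVISITED:
                continue
            labels[j] = cluster
            nbrs = region(j)
            if len(nbrs) >= min_pts:  # core point: keep expanding
                queue.extend(nbrs)
    return labels


# Hypothetical lymphocyte centroids: two dense groups and one isolated cell.
centroids = [(0, 0), (1, 0), (0, 1), (1, 1),
             (10, 10), (11, 10), (10, 11),
             (50, 50)]
labels = dbscan(centroids, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, -1]
```

Cluster labels of this kind could then be filtered by cluster size, for instance to keep only aggregates above a chosen cell count.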
Pseudo spot generation based on single cell ST data
While sub-cellular ST technologies exist, they often have limitations in terms of the range of the genome they can cover. Consequently, lower-resolution technologies capable of analyzing the entire genome continue to be widely used. As a result, there is a requirement to generate data for evaluating gene expression deconvolution approaches.
QuST provides an opportunity to evaluate spatial single-cell deconvolution methods by mimicking Visium datasets (see Fig. 9). For example, Huang et al. proposed an ST auto-encoder and deconvolution approach that integrates a scalable deep generative model to predict gene expression at the cellular or nuclear level based on H&E imaging and in situ RNA capture, thus allowing a better understanding of the tissue micro-environment [20].
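One way to mimic spot-level data from single-cell data is to sum the counts of all cells whose centroids fall inside each spot. The sketch below is an assumption-laden illustration, not QuST's procedure: it uses a square grid rather than Visium's staggered layout, takes the nominal Visium spot geometry (55 µm diameter, 100 µm center-to-center) as defaults, and uses made-up cells and counts.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Cell = Tuple[float, float, Dict[str, int]]  # (x in µm, y in µm, gene counts)


def make_pseudo_spots(
    cells: List[Cell],
    pitch: float = 100.0,    # center-to-center spacing in µm (assumed)
    diameter: float = 55.0,  # spot diameter in µm (assumed)
) -> Dict[Tuple[int, int], Dict[str, int]]:
    """Sum gene counts of cells whose centroid lies inside a spot on a square grid."""
    radius = diameter / 2.0
    spots = defaultdict(lambda: defaultdict(int))
    for x, y, counts in cells:
        # Nearest grid node is the only spot that could contain this cell.
        col, row = round(x / pitch), round(y / pitch)
        if math.hypot(x - col * pitch, y - row * pitch) <= radius:
            for gene, n in counts.items():
                spots[(col, row)][gene] += n
    return {key: dict(value) for key, value in spots.items()}


# Made-up cells: two fall in spot (0, 0), one in spot (1, 1), one between spots.
cells = [
    (10.0, 5.0, {"EPCAM": 3}),
    (8.0, 12.0, {"CD3E": 4}),
    (95.0, 110.0, {"EPCAM": 1, "CD3E": 2}),
    (50.0, 50.0, {"EPCAM": 9}),  # in the gap between spots, so it is dropped
]
pseudo_spots = make_pseudo_spots(cells)
print(pseudo_spots)
```

Because the single-cell composition of every pseudo spot is known, the output of a deconvolution method run on the pseudo spots can be compared against it.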
DL-based image classification and segmentation with QuPath
QuST’s primary utility lies in using transcriptomic data as the biological basis for training deep learning-based object categorization and region segmentation models for H&E images. To serve this purpose, QuST features functions for extracting single-cell H&E images for cell classification, as well as WSI patches for region segmentation. The QuPath annotation and detection measurement tables can be exported as files, serving as image label data for DL model training. Consequently, QuST significantly reduces the workload for users training their models for WSI analysis.
QuST includes a DL capability based on PyTorch [21] to perform image classification, which supports cell categorization and region segmentation tasks. Training requires two inputs: first, a folder of images generated using the aforementioned image sampling approach; second, the detection/annotation measurement table, which can be generated directly from the QuPath measurement functions. QuST offers a wide range of neural networks, including resnet [22], vgg [23], densenet [24], and several variants of modern vision transformers (ViT) [25].
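The two training inputs can be joined into (image, label) pairs before they are handed to a training loop. The following sketch shows one way to do that with the standard library only. The column names `Object ID` and `Classification`, the tab-separated layout, and the convention that each image file is named after its object ID are all assumptions made for illustration; the actual files produced by QuST and QuPath may differ.

```python
import csv
import io
from pathlib import PurePosixPath
from typing import Dict, List, Tuple


def load_labels(table_text: str,
                id_col: str = "Object ID",
                label_col: str = "Classification") -> Dict[str, str]:
    """Read a tab-separated measurement table into {object id: class label}."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    return {row[id_col]: row[label_col] for row in reader}


def pair_images_with_labels(image_names: List[str],
                            labels: Dict[str, str]) -> List[Tuple[str, str]]:
    """Keep only images whose file stem matches an object id in the table."""
    pairs = []
    for name in image_names:
        object_id = PurePosixPath(name).stem
        if object_id in labels:
            pairs.append((name, labels[object_id]))
    return pairs


# Hypothetical measurement table and image folder listing.
table_text = (
    "Object ID\tClassification\n"
    "cell-001\tTumor\n"
    "cell-002\tLymphocyte\n"
)
image_names = ["cell-001.png", "cell-002.png", "cell-999.png"]

training_pairs = pair_images_with_labels(image_names, load_labels(table_text))
print(training_pairs)  # [('cell-001.png', 'Tumor'), ('cell-002.png', 'Lymphocyte')]
```

Images without a matching table row are skipped rather than given a default label, which avoids silently training on unlabeled patches.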
Object classification
In the experiment, we first acquired H&E image patches for each detected object (e.g., a cell). Next, we used the genotype information provided by the chosen datasets to train the DL model for object classification. In this experiment, we used ViT. A confusion matrix, shown in Fig. 10, was computed based on 10-fold cross-validation. Some examples of single-cell genotype classification based on H&E are shown in Fig. 11. As the confusion matrix shows, cell types 1 and 10 are predicted well from single-cell H&E images, while type 4 is predicted poorly. This result indicates that certain cell types, such as lymphocytes and blood cells, are more readily distinguishable in H&E staining than others. Furthermore, differentiating B-cells from T-cells based on H&E staining is recognized as a particularly challenging task.
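For readers unfamiliar with how such a confusion matrix is read, the sketch below builds one from true and predicted labels and derives per-class recall. It assumes the held-out predictions from all cross-validation folds have already been pooled into two flat lists; the class names and labels are toy values, not results from the experiment.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     classes: Sequence[str]) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_recall(matrix: List[List[int]]) -> List[float]:
    """Diagonal entry divided by row sum: fraction of each true class recovered."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


# Toy pooled predictions (L = lymphocyte, T = tumor, S = stromal).
classes = ["L", "T", "S"]
y_true = ["L", "L", "L", "T", "T", "S", "S", "S", "S", "T"]
y_pred = ["L", "L", "T", "T", "T", "S", "T", "S", "S", "S"]

cm = confusion_matrix(y_true, y_pred, classes)
recall = per_class_recall(cm)
print(cm)      # [[2, 1, 0], [0, 2, 1], [0, 1, 3]]
print(recall)
```

A class with low recall, like type 4 in the experiment, shows up as a row whose counts are spread away from the diagonal.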
Region segmentation
In the experiment, we used manual annotation as shown in Fig. 12a. The chosen model was resnet50. The testing target is shown in Fig. 12b. Based on 10-fold cross-validation, we obtained the confusion matrix shown in Fig. 13.
In addition, QuST provides region segmentation with an arbitrary tile size (i.e., resolution). The higher the resolution, the longer the processing time. Figure 14 shows an example of various resolutions for region segmentation.
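The cost of a finer resolution follows from the number of tiles that must be classified. As a rough sketch, assuming that a higher resolution means a smaller tile and that each tile costs about the same to process, the tile count for a region can be computed as below; the region size is hypothetical.

```python
import math


def tile_count(width: int, height: int, tile_size: int) -> int:
    """Number of square tiles needed to cover a width x height pixel region."""
    return math.ceil(width / tile_size) * math.ceil(height / tile_size)


# Hypothetical 40,000 x 30,000 pixel region at three tile sizes.
counts = {size: tile_count(40_000, 30_000, size) for size in (512, 256, 128)}
print(counts)  # {512: 4661, 256: 18526, 128: 73555}
```

Halving the tile size roughly quadruples the number of tiles, and with it the inference time.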
Data availability
Given the availability of H&E images, we mainly used 10x Genomics datasets in the experiments. The datasets used are listed below.
10x Genomics dataset FFPE Human Breast using the Entire Sample Area (https://www.10xgenomics.com/datasets/ffpe-human-breast-using-the-entire-sample-area-1-standard). The sample was a 5 μm section from an FFPE human breast resected tumor mass sample of Infiltrating Ductal Carcinoma, provided by Avaden Biosciences.
10x Genomics dataset FFPE Human Breast with Custom Add-on Panel (https://www.10xgenomics.com/datasets/xenium-ffpe-human-breast-with-custom-add-on-panel-1-standard). The sample was a 5 μm section from an FFPE human Infiltrating ductal carcinoma, Ductal carcinoma in situ, provided by BiolVT.
NanoString CosMx dataset FFPE Human NSCLC (https://staging.nanostring.com/products/cosmx-spatial-molecular-imager/ffpe-dataset/nsclc-ffpe-dataset/), file ID: Lung 5-2.
Code availability
QuST is developed based on QuPath 0.5.1 and Python 3.10+ and is available under the Apache 2.0 license (https://github.com/huangch/qust).
References
1. Bankhead, P. et al. Qupath: Open source software for digital pathology image analysis. Sci. Rep. 7, 16878 (2017).
2. Pocock, J. et al. Tiatoolbox as an end-to-end library for advanced tissue image analytics. Commun. Med. 2, 120 (2022).
3. Cardoso, M. J. et al. Monai: An open-source framework for deep learning in healthcare. Preprint at https://arxiv.org/abs/2211.02701 (2022).
4. Dolezal, J. M. et al. Slideflow: deep learning for digital histopathology with real-time whole-slide visualization. BMC Bioinforma. 25, 134 (2024).
5. Faust, K. et al. PHARAOH: A collaborative crowdsourcing platform for phenotyping and regional analysis of histology. Nat. Commun. 16, 742 (2025).
6. Kaczmarzyk, J. R. et al. Open and reusable deep learning for pathology with wsinfer and qupath. Nat. Precis. Oncol. 8, 9 (2024).
7. Wood, C. S. et al. Spatially resolved transcriptomics deconvolutes prognostic histological subgroups in patients with colorectal cancer and synchronous liver metastases. Cancer Res. 83, 1329–1344 (2023).
8. Schmidt, U., Weigert, M., Broaddus, C. & Myers, G. Cell detection with star-convex polygons. In Medical Image Computing and Computer Assisted Intervention (MICCAI), 265–273 (Springer, 2018).
9. Stringer, C., Wang, T., Michaelos, M. & Pachitariu, M. Cellpose: a generalist algorithm for cellular segmentation. Nat. Methods 18, 100–106 (2020).
10. Wang, S. et al. A deep learning-based stripe self-correction method for stitched microscopic images. Nat. Commun. 14, 5393 (2023).
11. Jaume, G. et al. Hest-1k: A dataset for spatial transcriptomics and histology image analysis, Advances in Neural Information Processing Systems (2024).
12. Varghese, R. & Sambath, M. Yolov8: A novel object detection algorithm with enhanced performance and robustness. In International Conference on Advances in Data Engineering and Intelligent Computing Systems (ADICS), 1–6 (IEEE, 2024).
13. Schapiro, D. et al. Histocat: Analysis of cell phenotypes and interactions in multiplex image cytometry data. Nat. Methods 14, 873–876 (2017).
14. Ruitenberg, M. J. & Nguyen, Q. H. Cellular neighborhood analysis in spatial omics reveals new tissue domains and cell subtypes. Nat. Genet. 56, 362–364 (2024).
15. Shao, X. et al. Celltalkdb: a manually curated database of ligand-receptor interactions in humans and mice. Brief. Bioinforma. 22, bbaa269 (2021).
16. Küchenhoff, L. et al. Extended methods for spatial cell classification with dbscan-cellx. Sci. Rep. 13, 18868 (2023).
17. Ester, M., Kriegel, H.-P., Sander, J. & Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, KDD’96. 226–231 vol. 1 (AAAI Press, 1996).
18. Rodriguez, A. B. et al. Immune mechanisms orchestrate tertiary lymphoid structures in tumors via cancer-associated fibroblasts. Cell Rep. 36, 109422 (2021).
19. Barmpoutis, P. et al. Tertiary lymphoid structures (tls) identification and density assessment on h&e-stained digital slides of lung cancer. PLoS ONE 16, e0256907 (2021).
20. Huang, C.-H., Park, Y., Pang, J. & Bienkowska, J. R. Single-cell gene expression prediction using h&e images based on spatial transcriptomics. In Proceeding SPIE 12471, Medical Imaging 2023: Digital and Computational Pathology, vol. 12471, 1247105 (SPIE, 2023).
21. Paszke, A. et al. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32, (eds. Wallach, H. Larochelle, H. Beygelzimer, A. d’Alché Buc, F. Fox, E. Garnett, R.) 8024–8035 (Curran Associates, 2019).
22. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In 29th IEEE Computer Vision and Pattern Recognition (CVPR), vol. 2016, 770–778 (IEEE, 2016).
23. Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. In 3rd International Conference on Learning Representations (ICLR2015), Computational and Biological Learning Society 1–14. (2015).
24. Huang, G., Liu, Z., Maaten, L. V. D. & Weinberger, K. Q. Densely connected convolutional networks. In 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2017, 2261–2269 (IEEE, 2017).
25. Dosovitskiy, A. et al. An image is worth 16x16 words: Transformers for image recognition at scale. In 9th International Conference on Learning Representations (ICLR2021), Computational and Biological Learning Society (2021).
Author information
Contributions
CHH conceived the project, developed the algorithms and the QuPath plugin. SL trained and evaluated the machine learning models for image analysis, and maintained the documentation for the project. DF provided consultations and pathological opinions for the experiments. CHH wrote the manuscript with revisions from all authors. All authors approved the final version of the manuscript and agreed to submission.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Huang, CH., Lichtarge, S. & Fernandez, D. Integrative whole slide image and spatial transcriptomics analysis with QuST and QuPath. npj Precis. Onc. 9, 70 (2025). https://doi.org/10.1038/s41698-025-00841-9