Abstract
Multiplexed imaging and spatial transcriptomics enable highly resolved spatial characterization of cellular phenotypes, but still largely depend on laborious manual annotation to understand higher-order patterns of tissue organization. As a result, higher-order patterns of tissue organization are poorly understood and not systematically connected to disease pathology or clinical outcomes. To address this gap, we developed an approach called UTAG to identify and quantify microanatomical tissue structures in multiplexed images without human intervention. Our method combines information on cellular phenotypes with the physical proximity of cells to accurately identify organ-specific microanatomical domains in healthy and diseased tissue. We apply our method to various types of images across healthy and disease states to show that it can consistently detect higher-level architectures in human tissues, quantify structural differences between healthy and diseased tissue, and reveal tissue organization patterns at the organ scale.
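One way to picture how phenotype and proximity can be combined is neighborhood averaging: each cell's marker profile is replaced by the mean profile of the cells around it, so that downstream clustering groups cells by their local context rather than by their own phenotype alone. The following is a minimal sketch under that assumption and is not the UTAG implementation. The function name `aggregate_by_proximity`, the `max_dist` threshold and the toy data are hypothetical.

```python
from math import dist


def aggregate_by_proximity(coords, features, max_dist):
    """Average each cell's feature vector with those of cells within max_dist.

    coords:   list of (x, y) positions, one per cell
    features: list of equal-length marker-intensity lists, one per cell
    max_dist: hypothetical distance threshold, in the same units as coords
    """
    n = len(coords)
    aggregated = []
    for i in range(n):
        # A cell is always its own neighbor (distance 0), so no group is empty.
        neighbors = [j for j in range(n) if dist(coords[i], coords[j]) <= max_dist]
        n_markers = len(features[i])
        mean = [
            sum(features[j][k] for j in neighbors) / len(neighbors)
            for k in range(n_markers)
        ]
        aggregated.append(mean)
    return aggregated


# Toy example: two cells close together and one isolated cell.
coords = [(0.0, 0.0), (5.0, 0.0), (100.0, 0.0)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
smoothed = aggregate_by_proximity(coords, features, max_dist=10.0)
```

In this toy case the two adjacent cells end up with the same averaged profile even though their own phenotypes differ, while the isolated cell keeps its original profile. Any standard clustering method could then be run on the averaged profiles.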
Data availability
Datasets used in this manuscript are publicly available at the repositories from the original publications: healthy lung IMC (ref. 24): https://doi.org/10.5281/zenodo.6376766; COVID-19 lung IMC (ref. 27): https://doi.org/10.5281/zenodo.4110559; lung cancer t-CyCIF (ref. 30): https://doi.org/10.7303/syn17865732; upper tract urothelial carcinoma IMC (ref. 33): https://doi.org/10.5281/zenodo.5719187. For convenience and reproducibility, we make available a repository containing all processed datasets in h5ad format here: https://doi.org/10.5281/zenodo.6376766.
Code availability
Source code is publicly available at the following URL: https://github.com/ElementoLab/utag.
Change history
21 November 2022
A Correction to this paper has been published: https://doi.org/10.1038/s41592-022-01717-7
References
Giesen, C. et al. Highly multiplexed imaging of tumor tissues with subcellular resolution by mass cytometry. Nat. Methods 11, 417–422 (2014).
Angelo, M. et al. Multiplexed ion beam imaging of human breast tumors. Nat. Med. 20, 436–442 (2014).
Goltsev, Y. et al. Deep profiling of mouse splenic architecture with CODEX multiplexed imaging. Preprint at bioRxiv https://doi.org/10.1101/203166 (2018).
Lin, J.-R., Fallahi-Sichani, M. & Sorger, P. K. Highly multiplexed imaging of single cells using a high-throughput cyclic immunofluorescence method. Nat. Commun. 6, 8390 (2015).
Gerdes, M. J. et al. Highly multiplexed single-cell analysis of formalin-fixed, paraffin-embedded cancer tissue. Proc. Natl Acad. Sci. USA 110, 11982–11987 (2013).
Chen, K. H., Boettiger, A. N., Moffitt, J. R., Wang, S. & Zhuang, X. RNA imaging. Spatially resolved, highly multiplexed RNA profiling in single cells. Science 348, aaa6090 (2015).
Eng, C.-H. L. et al. Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH. Nature 568, 235–239 (2019).
Stickels, R. R. et al. Highly sensitive spatial transcriptomics at near-cellular resolution with Slide-seqV2. Nat. Biotechnol. https://doi.org/10.1038/s41587-020-0739-1 (2020).
Merritt, C. R. et al. Multiplex digital spatial profiling of proteins and RNA in fixed tissue. Nat. Biotechnol. 38, 586–599 (2020).
Salmén, F. et al. Barcoded solid-phase RNA capture for spatial transcriptomics profiling in mammalian tissue sections. Nat. Protoc. 13, 2501–2534 (2018).
Rizzardi, A. E. et al. Quantitative comparison of immunohistochemical staining measured by digital image analysis versus pathologist visual scoring. Diagn. Pathol. 7, 42 (2012).
Rakhlin, A., Shvets, A., Iglovikov, V. & Kalinin, A. A. Deep convolutional neural networks for breast cancer histology image analysis. Preprint at bioRxiv https://doi.org/10.1101/259911 (2018).
Kiemen, A. et al. In situ characterization of the 3D microanatomy of the pancreas and pancreatic cancer at single cell resolution. Preprint at bioRxiv https://doi.org/10.1101/2020.12.08.416909 (2020).
Keren, L. et al. A structured tumor-immune microenvironment in triple negative breast cancer revealed by multiplexed ion beam imaging. Cell 174, 1373–1387.e19 (2018).
Jackson, H. W. et al. The single-cell pathology landscape of breast cancer. Nature https://doi.org/10.1038/s41586-019-1876-x (2020).
Raza Ali, H. et al. Imaging mass cytometry and multiplatform genomics define the phenogenomic landscape of breast cancer. Nat. Cancer 1, 163–175 (2020).
Schürch, C. M. et al. Coordinated cellular neighborhoods orchestrate antitumoral immunity at the colorectal cancer invasive front. Cell 182, 1341–1359.e19 (2020).
Ash, J. T., Darnell, G., Munro, D. & Engelhardt, B. E. Joint analysis of expression levels and histological images identifies genes associated with tissue morphology. Nat. Commun. 12, 1609 (2021).
Brbić, M. et al. Annotation of spatially resolved single-cell data with STELLAR. Preprint at bioRxiv https://doi.org/10.1101/2021.11.24.469947 (2021).
Fischer, D. S., Schaar, A. C. & Theis, F. J. Learning cell communication from spatial graphs of cells. Preprint at bioRxiv https://doi.org/10.1101/2021.07.11.451750 (2021).
Innocenti, C. et al. An unsupervised graph embeddings approach to multiplex immunofluorescence image exploration. Preprint at bioRxiv https://doi.org/10.1101/2021.06.09.447654 (2021).
Traag, V. A., Waltman, L. & van Eck, N. J. From Louvain to Leiden: guaranteeing well-connected communities. Sci. Rep. 9, 5233 (2019).
Stassen, S. V. et al. PARC: ultrafast and accurate clustering of phenotypic data of millions of single cells. Preprint at bioRxiv https://doi.org/10.1101/765628 (2019).
Rustam, S. et al. A unique cellular organization of human distal airways and its disarray in chronic obstructive pulmonary disease. Preprint at bioRxiv https://doi.org/10.1101/2022.03.16.484543 (2022).
Liu, Q., Hsu, C.-Y. & Shyr, Y. Scalable and model-free detection of spatial patterns and colocalization. Preprint at bioRxiv https://doi.org/10.1101/2022.04.20.488961 (2022).
Chen, Z., Soifer, I., Hilton, H., Keren, L. & Jojic, V. Modeling multiplexed images with Spatial-LDA reveals novel tissue microenvironments. J. Comput. Biol. 27, 1204–1218 (2020).
Rendeiro, A. F. et al. The spatial landscape of lung pathology during COVID-19 progression. Nature 593, 564–569 (2021).
Halawa, S. et al. Potential long-term effects of SARS-CoV-2 infection on the pulmonary vasculature: a global perspective. Nat. Rev. Cardiol. https://doi.org/10.1038/s41569-021-00640-2 (2021).
Ackermann, M. et al. Pulmonary vascular endothelialitis, thrombosis, and angiogenesis in Covid-19. N. Engl. J. Med. https://doi.org/10.1056/NEJMoa2015432 (2020).
Rashid, R. et al. Highly multiplexed immunofluorescence images and single-cell data of immune markers in tonsil and lung cancer. Sci. Data 6, 323 (2019).
Lehmann, M. et al. Human small intestinal infection by SARS-CoV-2 is characterized by a mucosal infiltration with activated CD8+ T cells. Mucosal Immunol. 14, 1381–1392 (2021).
Damond, N. et al. A map of human type 1 diabetes progression by imaging mass cytometry. Cell Metab. 29, 755–768.e5 (2019).
Ohara, K. et al. The evolution of genomic, transcriptomic, and single-cell protein markers of metastatic upper tract urothelial carcinoma. Preprint at bioRxiv https://doi.org/10.1101/2021.11.16.468622 (2021).
Weigert, M., Schmidt, U., Haase, R., Sugawara, K. & Myers, G. Star-convex polyhedra for 3D object detection and segmentation in microscopy. Preprint at arXiv https://doi.org/10.48550/arXiv.1908.03636 (2019).
Mandal, S. & Uhlmann, V. SplineDist: automated cell segmentation with spline curves. Preprint at bioRxiv https://doi.org/10.1101/2020.10.27.357640 (2020).
Stringer, C., Wang, T., Michaelos, M. & Pachitariu, M. Cellpose: a generalist algorithm for cellular segmentation. Nat. Methods 18, 100–106 (2021).
Greenwald, N. F. et al. Whole-cell segmentation of tissue images with human-level performance using large-scale data annotation and deep learning. Nat. Biotechnol. https://doi.org/10.1038/s41587-021-01094-0 (2021).
Chen, C. S., Tan, J. & Tien, J. Mechanotransduction at cell-matrix and cell-cell contacts. Annu. Rev. Biomed. Eng. 6, 275–302 (2004).
Snijder, B. et al. Population context determines cell-to-cell variability in endocytosis and virus infection. Nature 461, 520–523 (2009).
Imle, A. et al. Experimental and computational analyses reveal that environmental restrictions shape HIV-1 spread in 3D cultures. Nat. Commun. 10, 2144 (2019).
Zanotelli, V. R. T. et al. A quantitative analysis of the interplay of environment, neighborhood, and cell state in 3D spheroids. Mol. Syst. Biol. 16, e9798 (2020).
Bhate, S. S., Barlow, G. L., Schürch, C. M. & Nolan, G. P. Tissue schematics map the specialization of immune tissue motifs and their appropriation by tumors. Cell Syst. https://doi.org/10.1016/j.cels.2021.09.012 (2021).
Regev, A. et al. The Human Cell Atlas. eLife 6, e27041 (2017).
HuBMAP Consortium. The human body at cellular resolution: the NIH Human Biomolecular Atlas Program. Nature 574, 187–192 (2019).
Ardini-Poleske, M. E. et al. LungMAP: The Molecular Atlas of Lung Development Program. Am. J. Physiol. Lung Cell. Mol. Physiol. 313, L733–L740 (2017).
Currlin, S. et al. 3D-mapping of human lymph node and spleen reveals integrated neuronal, vascular, and ductal cell networks. Preprint at bioRxiv https://doi.org/10.1101/2021.10.20.465151 (2021).
Maric, D. et al. Whole-brain tissue mapping toolkit using large-scale highly multiplexed immunofluorescence imaging and deep neural networks. Nat. Commun. 12, 1550 (2021).
Kuett, L. et al. Three-dimensional imaging mass cytometry for highly multiplexed molecular and cellular mapping of tissues and the tumor microenvironment. Nat. Cancer https://doi.org/10.1038/s43018-021-00301-w (2021).
Palla, G. et al. Squidpy: a scalable framework for spatial single cell analysis. Preprint at bioRxiv https://doi.org/10.1101/2021.02.19.431994 (2021).
Johnson, W. E., Li, C. & Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8, 118–127 (2007).
Nirmal, A. J., Chen, Y.-A. & Sokolov, A. labsyspharm/scimap: Release v.0. 19. (2022); https://doi.org/10.5281/zenodo.6410307
Hirschberg, J. B. & Rosenberg, A. V-Measure: a conditional entropy-based external cluster evaluation. Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 410–420 https://doi.org/10.7916/D80V8N84 (2007).
Hagberg, A. A., Schult, D. A. & Swart, P. J. Exploring network structure, dynamics, and function using NetworkX. In Proc. 7th Python in Science Conference (SciPy2008) (eds Varoquaux, G., Vaught, T. & Millman, J.) 11–15 (Pasadena, CA, USA, 2008).
Vallat, R. Pingouin: statistics in Python. J. Open Source Softw. 3, 1026 (2018).
Berg, S. et al. ilastik: interactive machine learning for (bio)image analysis. Nat. Methods 16, 1226–1232 (2019).
Davidson-Pilon, C. lifelines: survival analysis in Python. J. Open Source Softw. 4, 1317 (2019).
van der Walt, S. et al. scikit-image: image processing in Python. PeerJ 2, e453 (2014).
Pedregosa, F. & Varoquaux, G. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
Acknowledgements
A.F.R. is supported by an NCI T32CA203702 grant. O.E. is supported by NIH grants UL1TR002384 and R01CA194547, and Leukemia and Lymphoma Society SCOR 7012-16, SCOR 7021-20 and SCOR 180078-02 grants.
Author information
Contributions
J.K., A.F.R. and O.E. planned the study; J.K. and A.F.R. performed analysis. S.R., J.M.M., S.H.R., and R.S. provided samples, expertise in pulmonary biology, histology and definition of microanatomical domains. O.E. supervised the research. J.K., A.F.R. and O.E. wrote the manuscript.
Ethics declarations
Competing interests
O.E. is scientific advisor and equity holder in Freenome, Owkin, Volastra Therapeutics and OneThree Biotech. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks Raza Ali and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Rita Strack, in collaboration with the Nature Methods team. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 UTAG analysis of IMC images of healthy lung.
a) UMAP representation of all cells across all images based on cellular phenotypes only (left), or cellular phenotypes and positional information combined with UTAG (right). b) Labeling of domains from clustering indices. Leiden clustering at resolution 0.3 was mapped to domains based on expression profiles as it performed well on both Rand and homogeneity scores. Data in boxplots are presented as minimum, 25th percentile, median, 75th percentile, and maximum. **p < 0.01, *p < 0.05, two-sided Mann-Whitney U test, Benjamini-Hochberg adjusted. c) Deciding optimal resolution for healthy lung IMC data. Leiden clustering at a resolution of 0.1 was selected as the ideal resolution because it had the greatest median Rand score across all slides.
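The selection rule in panel c (pick the resolution with the greatest median Rand score across slides) can be sketched as follows. This is an illustration only: it assumes the unadjusted Rand index, the helper names `rand_index` and `best_resolution` are hypothetical, and the per-slide scores are invented numbers rather than results from the paper.

```python
from itertools import combinations
from statistics import median


def rand_index(labels_true, labels_pred):
    """Fraction of cell pairs on which two labelings agree.

    A pair agrees if both labelings put the two cells together,
    or both labelings put them apart.
    """
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def best_resolution(scores_by_resolution):
    """Return the resolution whose per-slide scores have the greatest median."""
    return max(scores_by_resolution, key=lambda r: median(scores_by_resolution[r]))


expert = ["airway", "airway", "vessel", "vessel"]
clusters_a = [0, 0, 1, 1]  # matches the expert labels up to renaming
clusters_b = [0, 1, 0, 1]  # splits every expert domain

score_a = rand_index(expert, clusters_a)
score_b = rand_index(expert, clusters_b)

# Invented per-slide Rand scores for three candidate resolutions.
per_slide = {0.1: [0.80, 0.90, 0.85], 0.3: [0.70, 0.95, 0.75], 0.5: [0.60, 0.65, 0.70]}
chosen = best_resolution(per_slide)
```

Because the Rand index compares pairs of cells rather than label names, it can score an unsupervised clustering against expert annotation without first matching cluster identifiers to domain names.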
Extended Data Fig. 2 Illustration of UTAG results on IMC images of healthy lung.
a) Lung IMC images where the first column shows three channels (KRT5, αSMA, DNA), the second column cell type identities, the third column cells colored by manual annotation of microanatomical domains, and the fourth column cells colored by UTAG domains. In the raw signal, keratin 5 is shown in red, alpha smooth muscle actin in green, and DNA in blue. Scale bars represent 200 µm.
Extended Data Fig. 3 Benchmarking UTAG and competing methods against expert labels.
a) Results of each method on healthy lung data to segment microanatomical domains. The number of latent topics for SpaGene was set to 10 to capture the diverse target phenotypes. Because SpaGene supports only single images, its topics were relabeled using agglomerative clustering to label topics consistently across slides. b) Results of each method on tumor versus stroma in upper tract urothelial carcinoma. The number of latent topics for SpaGene was set to four to differentiate tumor from stroma. c) Example of running UTAG, SpatialLDA, and SpaGene to demonstrate the difference in performance. The color mapping in this panel is different for each method as all three methods are unsupervised. d) Same as c) but with domain colors remapped to correspond to the ones from expert labels for ease of visual comparison. For a) and b), data in boxplots are presented as minimum, 25th percentile, median, 75th percentile, and maximum. Values outside of 1.5 times the interquartile range are classified as outliers and are denoted as fliers.
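The boxplot convention described in this caption can be made concrete with a short sketch. It assumes linearly interpolated quartiles and assumes that the reported minimum and maximum are the most extreme values that are not fliers, as in matplotlib's default boxplot. The function names and the sample values are hypothetical.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def boxplot_summary(values, whisker=1.5):
    """Five-number summary plus fliers beyond whisker * IQR from the quartiles."""
    data = sorted(values)
    q1, med, q3 = (quantile(data, q) for q in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    low_fence, high_fence = q1 - whisker * iqr, q3 + whisker * iqr
    inliers = [v for v in data if low_fence <= v <= high_fence]
    fliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "min": inliers[0],    # lowest value that is not a flier
        "q1": q1,
        "median": med,
        "q3": q3,
        "max": inliers[-1],   # highest value that is not a flier
        "fliers": fliers,
    }


summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```

Here the value 30 lies above the upper fence (third quartile plus 1.5 times the interquartile range), so it is reported as a flier and the upper whisker stops at 9.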
Extended Data Fig. 4 Application of UTAG to quantify domain co-localization frequency.
a) Full comparison of domain colocalization frequency for all pairwise microanatomical domains in lung infection data grouped by disease type. Data in boxplots are presented by minimum, 25th percentile, median, 75th percentile, and maximum. Values outside of 1.5 times interquartile range are classified as outliers and are denoted as fliers.
Extended Data Fig. 5 Application of UTAG to various data and tissue types.
a) Discovery of tumor and stromal domains in CyCIF images of two types of lung cancer. The top row illustrates the intensity of three selected channels, while the bottom row displays the UTAG domains. Scale bars represent 200 µm. b) Discovery of structural domains in 15 intestine IMC images of COVID-19 infected patients (ref. 30). The first row shows three channels of representative IMC images. The second row shows the corresponding segmented microanatomical domains. Scale bars represent 500 µm. c) Discovery of micro-anatomy in a dataset of 100 IMC images from pancreatic tissue of diabetes patients (ref. 31). Each row represents a different region of interest. The first column shows three channels of IMC images. The second column shows identified cell types in the dataset. The third column shows supervised islet segmentation results from a trained random forest using manual labels available in the original publication. The fourth column shows unsupervised islet segmentation results from UTAG. Scale bars represent 200 µm.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kim, J., Rustam, S., Mosquera, J.M. et al. Unsupervised discovery of tissue architecture in multiplexed imaging. Nat Methods 19, 1653–1661 (2022). https://doi.org/10.1038/s41592-022-01657-2
DOI: https://doi.org/10.1038/s41592-022-01657-2