Abstract
Deciphering cellular microenvironments at atlas scale remains challenging because molecular identity, spatial context, and platform heterogeneity are tightly coupled. Here we present CellNiche, a scalable contrastive-learning framework that identifies and characterizes cellular microenvironments from spatial omics data using cell-centric spatial-proximity subgraphs. CellNiche combines spatial co-localization and molecular co-expression cues to learn microenvironment-aware embeddings. Across spatial omics datasets from multiple platforms (>10 million cells in total), scaling experiments show that representations improve with more training data, and CellNiche achieves competitive clustering and embedding quality with efficient computation. In a multi-sample human non-small-cell lung cancer (NSCLC) cohort, CellNiche identifies conserved and sample-specific tumor and immune microenvironments and captures localized spatial transitions. Across four independent mouse brain atlases, CellNiche integrates 293 slices into a unified virtual brain map for cross-atlas annotation transfer and spatial refinement.
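To make the idea of contrasting cell-centric spatial-proximity subgraphs concrete, the following toy sketch may help. It is not the CellNiche implementation or its API: the function names (`knn_neighbours`, `niche_profile`, `info_nce`), the fixed mean-pooling "encoder", the choice of the nearest spatial neighbour as the positive pair, and the synthetic data are all assumptions made purely for illustration.

```python
import numpy as np

def knn_neighbours(coords, k):
    """Indices of the k nearest spatial neighbours of every cell (self excluded)."""
    diff = coords[:, None, :] - coords[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)      # dense pairwise squared distances (toy scale only)
    np.fill_diagonal(d2, np.inf)       # a cell is not its own neighbour
    return np.argsort(d2, axis=1)[:, :k]

def niche_profile(expr, neighbours):
    """Hypothetical stand-in for a learned encoder: own profile + mean neighbour profile."""
    context = expr[neighbours].mean(axis=1)   # (n_cells, n_genes) spatial context
    return np.concatenate([expr, context], axis=1)

def info_nce(z, positive, temperature=0.5):
    """InfoNCE loss: each cell should be most similar to its designated positive cell."""
    z = z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-12)
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)     # exclude self-similarity from the softmax
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - m - np.log(np.exp(sim - m).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(z)), positive].mean()

# Synthetic data (assumption): random 2D coordinates and Poisson counts.
rng = np.random.default_rng(0)
n_cells, n_genes, k = 200, 20, 6
coords = rng.uniform(0, 100, size=(n_cells, 2))
expr = rng.poisson(2.0, size=(n_cells, n_genes)).astype(float)

neighbours = knn_neighbours(coords, k)
z = niche_profile(np.log1p(expr), neighbours)
loss = info_nce(z, positive=neighbours[:, 0])  # nearest neighbour as the positive pair
print(f"toy InfoNCE loss: {loss:.3f}")
```

In a real pipeline the mean-pooling step would be replaced by a trainable graph encoder, and the dense distance matrix by a spatial index, since pairwise distances do not scale to millions of cells.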
Data availability
The osmFISH dataset of mouse somatosensory cortex is available at https://github.com/drieslab/spatial-datasets (ref. 28). The mouse spleen CODEX dataset is available at https://data.mendeley.com/datasets/zjnpwh8m5b/1 (ref. 30). The STARmap dataset of mouse brain is available at https://singlecell.broadinstitute.org/single_cell/study/SCP1830 (ref. 32). The human CRC CODEX dataset is available at https://data.mendeley.com/datasets/mpjzbtfgfr/1 (ref. 37). The NSCLC CosMx dataset is available at https://nanostring.com/products/cosmx-spatial-molecular-imager/nsclc-ffpe-dataset/ (ref. 27). The spatial transcriptomics atlases of mouse brain are available at https://singlecell.broadinstitute.org/single_cell/study/SCP1830 (Atlas 1; ref. 32), https://doi.brainimagelibrary.org/doi/10.35077/g.610 (Atlas 2; ref. 31), https://doi.brainimagelibrary.org/doi/10.35077/act-bag (Atlas 3; ref. 33), and https://info.vizgen.com/mouse-brain-map (Atlas 4; ref. 34). The mouse E16.5 whole-embryo Stereo-seq data are available at https://db.cngb.org/stomics/mosta/download/ (ref. 36). Source data are provided with this paper.
Code availability
The software package implementing the CellNiche algorithm has been deposited on GitHub (https://github.com/Super-LzzZ/CellNiche) under the MIT license. The version associated with this study has been archived at Zenodo (https://doi.org/10.5281/zenodo.19143524; ref. 65).
References
Lopez, R., Regier, J., Cole, M. B., Jordan, M. I. & Yosef, N. Deep generative modeling for single-cell transcriptomics. Nat. Methods 15, 1053–1058 (2018).
Xu, C. et al. Probabilistic harmonization and annotation of single-cell transcriptomics data with deep generative models. Mol. Syst. Biol. 17, e9620 (2021).
Gayoso, A. et al. A Python library for probabilistic analysis of single-cell omics data. Nat. Biotechnol. 40, 163–166 (2022).
Chappell, L., Russell, A. J. & Voet, T. Single-cell (multi) omics technologies. Annu. Rev. Genom. Human Genet. 19, 15–41 (2018).
Theodoris, C. V. et al. Transfer learning enables predictions in network biology. Nature 618, 616–624 (2023).
Cui, H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat. Methods 21, 1470–1480 (2024).
Heimberg, G. et al. A cell atlas foundation model for scalable search of similar human cells. Nature 638, 1085–1094 (2025).
Marx, V. Method of the Year: spatially resolved transcriptomics. Nat. Methods 18, 9–14 (2021).
Moffitt, J. R., Lundberg, E. & Heyn, H. The emerging landscape of spatial profiling technologies. Nat. Rev. Genet. 23, 741–759 (2022).
Hamilton, W., Ying, Z. & Leskovec, J. Inductive representation learning on large graphs. In Proceedings Advances in Neural Information Processing Systems (2017).
Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations (2017).
Veličković, P. et al. Graph attention networks. In International Conference on Learning Representations (2018).
Dong, K. & Zhang, S. Deciphering spatial domains from spatially resolved transcriptomics with an adaptive graph attention auto-encoder. Nat. Commun. 13, 1739 (2022).
Long, Y. et al. Spatially informed clustering, integration, and deconvolution of spatial transcriptomics with GraphST. Nat. Commun. 14, 1155 (2023).
Kim, J. et al. Unsupervised discovery of tissue architecture in multiplexed imaging. Nat. Methods 19, 1653–1661 (2022).
Singhal, V. et al. BANKSY unifies cell typing and tissue domain segmentation for scalable spatial omics data analysis. Nat. Genet. 56, 431–441 (2024).
Hu, Y. et al. Unsupervised and supervised discovery of tissue cellular neighborhoods from cell phenotypes. Nat. Methods 21, 267–278 (2024).
Xia, C.-R., Cao, Z.-J. & Gao, G. DECIPHER for learning disentangled cellular embeddings in large-scale heterogeneous spatial omics data. Nat. Commun. 16, 7991 (2025).
Wen, H. et al. CellPLM: pre-training of cell language model beyond single cells. In International Conference on Learning Representations (2024).
Wang, C. et al. scGPT-spatial: Continual pretraining of single-cell foundation model for spatial transcriptomics. Preprint at https://doi.org/10.1101/2025.02.05.636714 (2025).
Chen, T., Kornblith, S., Norouzi, M. & Hinton, G. A simple framework for contrastive learning of visual representations. In Proceedings International Conference on Machine Learning (2020).
Liu, Y. et al. Knowledge discovery and data mining. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2024).
Palla, G. et al. Squidpy: a scalable framework for spatial omics analysis. Nat. Methods 19, 171–178 (2022).
Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 1–5 (2018).
Vaswani, A. et al. Attention is all you need. In Proceedings 31st Conference on Neural Information Processing Systems (2017).
Chen, K. H., Boettiger, A. N., Moffitt, J. R., Wang, S. & Zhuang, X. Spatially resolved, highly multiplexed RNA profiling in single cells. Science 348, aaa6090 (2015).
He, S. et al. High-plex imaging of RNA and proteins at subcellular resolution in fixed tissue by spatial molecular imaging. Nat. Biotechnol. 40, 1794–1806 (2022).
Codeluppi, S. et al. Spatial organization of the somatosensory cortex revealed by osmFISH. Nat. Methods 15, 932–935 (2018).
Wang, X. et al. Three-dimensional intact-tissue sequencing of single-cell transcriptional states. Science 361, eaat5691 (2018).
Goltsev, Y. et al. Deep profiling of mouse splenic architecture with CODEX multiplexed imaging. Cell 174, 968–981 (2018).
Yao, Z. et al. A high-resolution transcriptomic and spatial atlas of cell types in the whole mouse brain. Nature 624, 317–332 (2023).
Shi, H. et al. Spatial atlas of the mouse central nervous system at molecular resolution. Nature 622, 552–561 (2023).
Zhang, M. et al. Molecularly defined and spatially resolved cell atlas of the whole mouse brain. Nature 624, 343–354 (2023).
Vizgen Data Release V1.0, https://info.vizgen.com/mouse-brain-map (2021).
Dwivedi, V. P. et al. Graph transformers for large graphs. Preprint at https://doi.org/10.48550/arXiv.2312.11109 (2023).
Chen, A. et al. Spatiotemporal transcriptomic atlas of mouse organogenesis using DNA nanoball-patterned arrays. Cell 185, 1777–1792 (2022).
Schürch, C. M. et al. Coordinated cellular neighborhoods orchestrate antitumoral immunity at the colorectal cancer invasive front. Cell 182, 1341–1359 (2020).
Varrone, M., Tavernari, D., Santamaria-Martínez, A., Walsh, L. A. & Ciriello, G. CellCharter reveals spatial cell niches associated with tissue remodeling and cell plasticity. Nat. Genet. 56, 74–84 (2024).
Schumacher, T. N. & Thommen, D. S. Tertiary lymphoid structures in cancer. Science 375, eabf9419 (2022).
Zhao, L. et al. Tertiary lymphoid structures in diseases: immune mechanisms and therapeutic advances. Signal Transduct. Target. Ther. 9, 225 (2024).
Leone, P. et al. MHC class I antigen processing and presenting machinery: organization, function, and defects in tumor cells. J. Natl. Cancer Inst. 105, 1172–1187 (2013).
Dustin, M. L. The immunological synapse. Cancer Immunol. Res. 2, 1023–1033 (2014).
Jin, S. et al. Inference and analysis of cell-cell communication using CellChat. Nat. Commun. 12, 1088 (2021).
Roche, P. A. & Furuta, K. The ins and outs of MHC class II-mediated antigen processing and presentation. Nat. Rev. Immunol. 15, 203–216 (2015).
Li, D. et al. Cancer-associated fibroblast-secreted IGFBP7 promotes gastric cancer by enhancing tumor associated macrophage infiltration via FGF2/FGFR1/PI3K/AKT axis. Cell Death Discov. 9, 17 (2023).
Liu, T., Zhou, L., Li, D., Andl, T. & Zhang, Y. Cancer-associated fibroblasts build and secure the tumor microenvironment. Front. Cell Dev. Biol. 7, 60 (2019).
Maiques, O. et al. Matrix mechano-sensing at the invasive front induces a cytoskeletal and transcriptional memory supporting metastasis. Nat. Commun. 16, 1394 (2025).
Zhan, Q. et al. New insights into the correlations between circulating tumor cells and target organ metastasis. Signal Transduct. Target. Ther. 8, 465 (2023).
Zhao, D. et al. CEACAM6 expression and function in tumor biology: a comprehensive review. Discov. Oncol. 15, 186 (2024).
Hänzelmann, S., Castelo, R. & Guinney, J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinform. 14, 1–15 (2013).
Liberzon, A. et al. The molecular signatures database hallmark gene set collection. Cell Syst. 1, 417–425 (2015).
Chen, G., Wu, K., Li, H., Xia, D. & He, T. Role of hypoxia in the tumor microenvironment and targeted therapy. Front. Oncol. 12, 961637 (2022).
Taki, M. et al. Tumor immune microenvironment during epithelial–mesenchymal transition. Clin. Cancer Res. 27, 4669–4679 (2021).
Yuan, Z. et al. Extracellular matrix remodeling in tumor progression and immune escape: from mechanisms to treatments. Mol. Cancer 22, 48 (2023).
Zhao, J. et al. Glycolysis in the tumor microenvironment: a driver of cancer progression and a promising therapeutic target. Front. Cell Dev. Biol. 12, 1416472 (2024).
He, Y. et al. Towards a universal spatial molecular atlas of the mouse brain. Preprint at https://doi.org/10.1101/2024.05.27.594872 (2024).
Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
Devaux, J., Fykkolodziej, B. & Gow, A. Claudin proteins and neuronal function. Curr. Top. Membr. 65, 229–253 (2010).
González de San Román, E. et al. Anatomical location of LPA1 activation and LPA phospholipid precursors in rodent and human brain. J. Neurochem. 134, 471–485 (2015).
Gargareta, V.-I. et al. Conservation and divergence of myelin proteome and oligodendrocyte transcriptome profiles between humans and mice. Elife 11, e77019 (2022).
Stern, D. B., Wilke, A. & Root, C. M. Anatomical connectivity of the intercalated cells of the amygdala. eNeuro 10, ENEURO.0238-23 (2023).
Mu, L. et al. SoxC transcription factors are required for neuronal differentiation in adult hippocampal neurogenesis. J. Neurosci. 32, 3067–3080 (2012).
Kuerbitz, J. et al. Loss of intercalated cells (ITCs) in the mouse amygdala of Tshz1 mutants correlates with fear, depression, and social interaction phenotypes. J. Neurosci. 38, 1160–1177 (2018).
Haslinger, A., Schwarz, T. J., Covic, M. & Chichung Lie, D. Expression of Sox11 in adult neurogenic niches suggests a stage-specific role in adult neurogenesis. Eur. J. Neurosci. 29, 2103–2114 (2009).
Zhongming, L. et al. CellNiche represents cellular microenvironments in atlas-scale spatial omics data with contrastive learning. CellNiche: v0.1.7. Zenodo https://doi.org/10.5281/zenodo.19143524 (2025).
Acknowledgements
This work was supported by the National Key Research and Development Program of China (2022YFA1004800), Strategic Priority Research Program of the Chinese Academy of Sciences (XDB1350000), the CAS Project for Young Scientists in Basic Research (YSBR-077), Zhejiang Province Vanguard Goose-Leading Initiative (no. 2025C01114), and the National Natural Science Foundation of China (12025107, 12571550, 12326610).
Author information
Contributions
Z.M.L. and S.P.L. conceived the study. Z.M.L. designed and developed CellNiche. Z.M.L. assembled the data and performed computational analyses. Z.M.L., Y.W., and S.P.L. designed analyses and supervised the work. B.X.Z. and M.Q.J. participated in the discussion and provided suggestions. Z.M.L., Y.W., and S.P.L. wrote the manuscript. All authors edited and approved the manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Peer review
Peer review information
Nature Communications thanks Zhana Duren and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Liang, Z., Zhong, B., Jiao, M. et al. CellNiche represents cellular microenvironments in atlas-scale spatial omics data with contrastive learning. Nat Commun (2026). https://doi.org/10.1038/s41467-026-71759-4
DOI: https://doi.org/10.1038/s41467-026-71759-4


