Nature Communications
Systematic background selection with BasCoD enhances contrastive dimension reduction in single cell genomics
  • Article
  • Open access
  • Published: 17 March 2026

  • Kwangmoon Park¹
  • Zhongxuan Sun² (ORCID: orcid.org/0009-0004-5682-2078)
  • Ruiqi Liao³ (ORCID: orcid.org/0000-0002-9553-3349)
  • Emery H. Bresnick³ (ORCID: orcid.org/0000-0002-1151-5654)
  • Sündüz Keleş²,⁴ (ORCID: orcid.org/0000-0001-9048-0922)

Nature Communications (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Cell biology
  • Computational biology and bioinformatics
  • Immunology
  • Scientific community

Abstract

In single-cell experiments spanning diverse conditions, distinguishing variation specific to one condition (e.g., treatment) from shared or background variation (e.g., control) is critical for uncovering treatment-specific molecular responses. However, these studies typically yield ultra-high-dimensional data, necessitating effective dimension reduction for reliable biological interpretation. Contrastive dimension reduction methods address this challenge by identifying low-dimensional features enriched in a target dataset relative to a background dataset that captures shared variation. Despite their growing utility, the success of such methods critically depends on the choice of background, yet no formal criterion exists for evaluating or selecting backgrounds. To address this gap, we introduce BasCoD, a statistical testing framework based on spectral subspace inclusion theory, that enables rigorous evaluation and systematic selection of background datasets. Applying BasCoD across a range of single-cell datasets, we show that it effectively identifies suitable backgrounds, substantially improving the contrast and interpretability of the resulting target representations. We further demonstrate how BasCoD can guide the design of contrastive analyses in large-scale single-cell experiments conducted under heterogeneous conditions and elucidate potential interaction effects in perturbation studies.
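To make the contrastive setting concrete, the following minimal sketch uses contrastive PCA (cPCA, ref. 4), the simplest linear method in this family. It is not BasCoD and is not taken from the authors' repository. The function names (`top_subspace`, `cpca_directions`, `principal_angle_cosines`), the contrast weight `alpha`, and the simulated data are all assumptions made for illustration. The final principal-angle check is only a descriptive look at whether the background's leading subspace sits inside the target's leading subspace; it is not the statistical test proposed in the paper.

```python
import numpy as np


def top_subspace(X, k):
    """Orthonormal basis (features x k) of the top-k principal subspace of X."""
    Xc = X - X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k].T


def cpca_directions(target, background, k=1, alpha=1.0):
    """Top-k eigenvectors of C_target - alpha * C_background (cPCA)."""
    c_t = np.cov(target, rowvar=False)
    c_b = np.cov(background, rowvar=False)
    evals, evecs = np.linalg.eigh(c_t - alpha * c_b)  # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]
    return evecs[:, order]


def principal_angle_cosines(A, B):
    """Cosines of the principal angles between span(A) and span(B).

    A and B must have orthonormal columns.
    """
    return np.linalg.svd(A.T @ B, compute_uv=False)


# Simulated data (hypothetical): 2000 cells, 10 features.
rng = np.random.default_rng(0)
n, p = 2000, 10
shared = np.zeros(p)
shared[0] = 1.0    # nuisance direction present in both datasets
specific = np.zeros(p)
specific[1] = 1.0  # direction present only in the target

background = rng.normal(size=(n, 1)) * 5.0 * shared + rng.normal(size=(n, p))
target = (rng.normal(size=(n, 1)) * 5.0 * shared
          + rng.normal(size=(n, 1)) * 3.0 * specific
          + rng.normal(size=(n, p)))

pca_dir = top_subspace(target, 1)[:, 0]
cpca_dir = cpca_directions(target, background, k=1, alpha=1.0)[:, 0]

# Descriptive check: is the background's leading direction (nearly) contained
# in the target's top-2 principal subspace? A cosine near 1 suggests yes.
cosines = principal_angle_cosines(top_subspace(background, 1),
                                  top_subspace(target, 2))

print("PCA loading on shared feature:   ", abs(pca_dir[0]))
print("cPCA loading on specific feature:", abs(cpca_dir[1]))
print("cosine of principal angle:       ", cosines[0])
```

In this toy setting, ordinary PCA on the target is dominated by the shared direction, while cPCA recovers the target-specific one because the background carries the same nuisance variation. A background whose dominant variation is absent from the target would cancel nothing, which is the kind of mismatch a formal background-selection criterion is meant to detect.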

Data availability

  • Mouse protein expression data. The processed mouse protein expression data used in this study are available at the cPCA GitHub repository.
  • Perturb-seq data. The processed Perturb-seq data used in this study are available at Figshare through the ContrastiveVI tutorial.
  • Mouse intestinal single-cell RNA-seq data. The processed mouse intestinal single-cell RNA-seq data used in this study are available at the ContrastiveVI tutorial.
  • Human Cell Atlas bone marrow (HCA-BM) data. The processed HCA-BM data used in this study are available at the Lamian GitHub repository.
  • Population-scale single-cell RNA-seq data with ROT treatment. The processed population-scale single-cell RNA-seq data used in this study are available in the Zenodo database under accession code 4333872.
  • Single-cell RNA-seq data with inflammation treatment. The processed single-cell RNA-seq data used in this study are available at Zenodo: https://zenodo.org/records/18776758.

Source data are provided with this paper.

Code availability

The code used to develop the model, perform the analyses and generate results in this study is publicly available and has been deposited in the GitHub repository at https://github.com/keleslab/BasCoD, under the MIT license. The specific version of the code associated with this publication is archived in Zenodo and is accessible via https://doi.org/10.5281/zenodo.18291183 (ref. 31).

References

  1. Jerber, J. et al. Population-scale single-cell RNA-seq profiling across dopaminergic neuron differentiation. Nat. Genet. 53, 304–312 (2021).

  2. Soskic, B. et al. Immune disease risk variants regulate gene expression dynamics during CD4+ T cell activation. Nat. Genet. 54, 817–826 (2022).

  3. Dixit, A. et al. Perturb-seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866 (2016).

  4. Abid, A., Zhang, M. J., Bagaria, V. K. & Zou, J. Exploring patterns enriched in a dataset with contrastive principal component analysis. Nat. Commun. 9, 2134 (2018).

  5. Abid, A. & Zou, J. Contrastive variational autoencoder enhances salient features. arXiv preprint arXiv:1902.04601 (2019).

  6. Severson, K. A., Ghosh, S. & Ng, K. Unsupervised learning with contrastive latent variable models. In: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, 4862–4869 (2019).

  7. Weinberger, E., Lin, C. & Lee, S. I. Isolating salient variations of interest in single-cell data with contrastiveVI. Nat. Methods 20, 1336–1345 (2023).

  8. Weinberger, E., Covert, I. & Lee, S. I. Feature selection in the contrastive analysis setting. Adv. Neural Inf. Process. Syst. 36, 66102–66126 (2023).

  9. Zhang, B., Nyquist, S., Jones, A., Engelhardt, B. E. & Li, D. Contrastive linear regression. Ann. Appl. Stat. 19, 1868 (2025).

  10. Ebrahimi, A., Siahpirani, A. F. & Montazeri, H. scin: a contrastive learning framework for single-cell multi-omics data integration. Brief. Bioinforma. 26, bbaf411 (2025).

  11. Li, W., Murtaza, G. & Singh, R. scContrast: A contrastive learning based approach for encoding single-cell gene expression data. bioRxiv 2025–04 (2025).

  12. He, K., Fan, H., Wu, Y., Xie, S. & Girshick, R. Momentum contrast for unsupervised visual representation learning. In: Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 9729–9738 (2020).

  13. Hawke, S., Zhang, E., Chen, J. & Li, D. Contrastive dimension reduction: A systematic review. arXiv preprint arXiv:2510.11847 (2025).

  14. Townes, F. W., Hicks, S. C., Aryee, M. J. & Irizarry, R. A. Feature selection and dimension reduction for single-cell RNA-seq based on a multinomial model. Genome Biol. 20, 295 (2019).

  15. Ahmed, M. M. et al. Protein dynamics associated with failed and rescued learning in the Ts65Dn mouse model of Down syndrome. PLoS ONE 10, e0119491 (2015).

  16. Higuera, C., Gardiner, K. J. & Cios, K. J. Self-organizing feature maps identify proteins critical to learning in a mouse model of Down syndrome. PLoS ONE 10, e0129126 (2015).

  17. Norman, T. M. et al. Exploring genetic interaction manifolds constructed from rich single-cell phenotypes. Science 365, 786–793 (2019).

  18. Lopez, R., Regier, J., Cole, M. B., Jordan, M. I. & Yosef, N. Deep generative modeling for single-cell transcriptomics. Nat. Methods 15, 1053–1058 (2018).

  19. Wilson, D. J. The harmonic mean p-value for combining dependent tests. Proc. Natl. Acad. Sci. 116, 1195–1200 (2019).

  20. Trapnell, C. et al. Pseudo-temporal ordering of individual cells reveals dynamics and regulators of cell fate decisions. Nat. Biotechnol. 32, 381 (2014).

  21. Ji, Z. & Ji, H. TSCAN: Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis. Nucleic Acids Res. 44, e117 (2016).

  22. Street, K. et al. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics 19, 1–16 (2018).

  23. Hou, W. et al. A statistical framework for differential pseudotime analysis with multiple single-cell RNA-seq samples. Nat. Commun. 14, 7286 (2023).

  24. DeTomaso, D. & Yosef, N. Hotspot identifies informative gene modules across modalities of single-cell genomics. Cell Syst. 12, 446–456 (2021).

  25. Ficara, F. et al. Pbx1 restrains myeloid maturation while preserving lymphoid potential in hematopoietic progenitors. J. Cell Sci. 126, 3181–3191 (2013).

  26. Yu, G., Wang, L. G., Han, Y. & He, Q. Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS: A J. Integr. Biol. 16, 284–287 (2012).

  27. Murray, C. W. et al. LKB1 drives stasis and C/EBP-mediated reprogramming to an alveolar type II fate in lung cancer. Nat. Commun. 13, 1090 (2022).

  28. Lara-Astiaso, D. et al. In vivo screening characterizes chromatin factor functions during normal and malignant hematopoiesis. Nat. Genet. 55, 1542–1554 (2023).

  29. Hawke, S., Ma, Y. & Li, D. Contrastive dimension reduction: when and how? Adv. Neural Inf. Process. Syst. 37, 74034–74057 (2024).

  30. Liu, Y. & Xie, J. Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures. J. Am. Stat. Assoc. 115, 393–402 (2020).

  31. Park, K., Sun, Z., Liao, R., Bresnick, E. H. & Keleş, S. Systematic background selection with BasCoD enhances contrastive dimension reduction in single cell genomics. GitHub repository: BasCoD. https://doi.org/10.5281/zenodo.18291183 (2026).

  32. Haber, A. L. et al. A single-cell survey of the small intestinal epithelium. Nature 551, 333–339 (2017).

Acknowledgements

We thank Shuchen Yan from the University of Wisconsin–Madison for sharing the processed scRNA-seq dataset from Jerber et al. (ref. 1). We thank Dr. Siqi Shen (Fred Hutchinson Cancer Center) and Coleman Breen (University of Wisconsin–Madison) for insightful discussions. This work was supported by NIH grants R01HG003747 (S.K.) and R21HG012881 (S.K.), and a Chan Zuckerberg Initiative Data Insights Award (S.K.).

Author information

Authors and Affiliations

  1. Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA

    Kwangmoon Park

  2. Department of Biostatistics and Medical Informatics, University of Wisconsin - Madison, Madison, WI, USA

    Zhongxuan Sun & Sündüz Keleş

  3. Wisconsin Blood Cancer Research Institute, Department of Cell and Regenerative Biology, Carbone Cancer Center, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA

    Ruiqi Liao & Emery H. Bresnick

  4. Department of Statistics, University of Wisconsin - Madison, Madison, WI, USA

    Sündüz Keleş

Contributions

K.P. and S.K. conceived the project. K.P. and S.K. designed the research and developed the method. K.P. performed the experiments and simulation studies. K.P. and S.K. contributed to the preparation of the manuscript. R.L. and E.B. generated the single-cell RNA-seq dataset with inflammation treatment. Z.S. processed the dataset and designed experimental ideas involving double-perturbed cells.

Corresponding author

Correspondence to Sündüz Keleş.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Chaojie Wang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (PDF)

Reporting Summary (PDF)

Transparent Peer Review file (PDF)

Source data

Source Data (XLSX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Park, K., Sun, Z., Liao, R. et al. Systematic background selection with BasCoD enhances contrastive dimension reduction in single cell genomics. Nat Commun (2026). https://doi.org/10.1038/s41467-026-70652-4

  • Received: 30 July 2025

  • Accepted: 01 March 2026

  • Published: 17 March 2026

  • DOI: https://doi.org/10.1038/s41467-026-70652-4

Nature Communications (Nat Commun)

ISSN 2041-1723 (online)

© 2026 Springer Nature Limited
