Abstract
Understanding how rare genetic variants influence complex traits remains a major challenge, particularly when these variants lie in noncoding regions of the genome. The effects of variants within candidate cis-regulatory elements (cCREs) often depend on the cell type, making interpretation difficult. Here we introduce cellSTAAR, which integrates whole-genome sequencing data with single-cell assay for transposase-accessible chromatin using sequencing data to capture variability in chromatin accessibility across cell types via the construction of cell-type-specific functional annotations and regulatory elements. To reflect the uncertainty in cCRE–gene linking, cellSTAAR uses a comprehensive strategy to link cCREs to their target genes. We applied cellSTAAR to data from the Trans-Omics for Precision Medicine consortium (n ≈ 60,000) and replicated our findings using the UK Biobank (n ≈ 190,000). Across four lipid traits, cellSTAAR improved the detection of biologically meaningful associations and enhanced biological interpretability. These results demonstrate the potential of cell-type-aware approaches to boost discovery in rare variant whole-genome sequencing association studies.
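The abstract does not state how evidence from different cCRE–gene linking approaches is combined. The reference list cites ACAT and the Cauchy combination test (Liu & Xie, 2020), so one plausible way to aggregate per-approach P values is sketched below. This is a minimal illustration under that assumption and not the cellSTAAR implementation; the function name `cauchy_combination` and the example P values are hypothetical.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine P values with the Cauchy combination test (Liu & Xie, 2020).

    Assumption: this is only an illustration of the general technique; it is
    not taken from the cellSTAAR package.
    """
    if not p_values:
        raise ValueError("need at least one P value")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("P values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values) or any(w <= 0 for w in weights):
        raise ValueError("weights must be positive and match the P values")
    total = sum(weights)
    weights = [w / total for w in weights]  # normalize so the weights sum to 1
    # Map each P value to a standard Cauchy variate and take the weighted sum;
    # under the null the sum is approximately standard Cauchy, even when the
    # individual tests are correlated.
    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values)
    )
    # Convert the combined statistic back to a P value via the Cauchy tail.
    return 0.5 - math.atan(statistic) / math.pi


# Hypothetical P values for one gene, one per cCRE-gene linking approach.
linking_p = {"approach_A": 0.04, "approach_B": 0.20, "approach_C": 0.001}
combined = cauchy_combination(list(linking_p.values()))
print(f"combined P value: {combined:.4g}")
```

With equal weights the combined P value is dominated by the smallest inputs, which is why this kind of combination suits settings where only some linking approaches capture the true regulatory link.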
Data availability
This paper used the TOPMed Freeze 8 WGS data and lipid phenotype data. Genotype and phenotype data are both available in the database of Genotypes and Phenotypes. The TOPMed WGS data were from the following 20 study phases (accession numbers provided in parentheses): Old Order Amish (phs000956.v1.p1), Atherosclerosis Risk in Communities Study (phs001211), Mt Sinai BioMe Biobank (phs001644), Coronary Artery Risk Development in Young Adults (phs001612), Cleveland Family Study (phs000954), Cardiovascular Health Study (phs001368), Diabetes Heart Study (phs001412), FHS (phs000974), Genetic Study of Atherosclerosis Risk (phs001218), Genetic Epidemiology Network of Arteriopathy (phs001345), Genetic Epidemiology Network of Salt Sensitivity (phs001217), Genetics of Lipid Lowering Drugs and Diet Network (phs001359), Hispanic Community Health Study - Study of Latinos (phs001395), Hypertension Genetic Epidemiology Network and Genetic Epidemiology Network of Arteriopathy (phs001293), JHS (phs000964), Multi-Ethnic Study of Atherosclerosis (phs001416), San Antonio Family Heart Study (phs001215), Genome-wide Association Study of Adiposity in Samoans (phs000972), Taiwan Study of Hypertension using Rare Variants (phs001387) and Women’s Health Initiative (phs001237). UKB WGS data are available from the UKB Research Analysis Platform, and the UKB analyses were conducted using the UKB resource under application 52008. The single-cell ATAC-seq data from CATlas are publicly available at http://catlas.org/humanenhancer/.
Code availability
cellSTAAR is freely available as an R package at https://github.com/edvanburen/cellSTAAR/. vcf2agds (ref. 68) was used to preprocess the UKB WGS data and is freely available as a collection of applets in the UKB RAP at https://github.com/drarwood/vcf2agds_overview. GENESIS (ref. 18), available at https://bioconductor.org/packages/release/bioc/html/GENESIS.html, and FastSparseGRM (ref. 69), available at https://github.com/rounakdey/FastSparseGRM, are freely available as R packages and were used to calculate the ancestral principal components and sparse GRMs for TOPMed and UKB data, respectively. Code used in the analysis has been archived on Zenodo at https://doi.org/10.5281/zenodo.16113567 (ref. 70).
References
Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290–299 (2021).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
All of Us Research Program Investigators; Denny, J. C. et al. The “All of Us” Research Program. N. Engl. J. Med. 381, 668–676 (2019).
Li, Z. et al. A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nat. Methods 19, 1599–1611 (2022).
Bansal, V., Libiger, O., Torkamani, A. & Schork, N. J. Statistical analysis strategies for association studies involving rare variants. Nat. Rev. Genet. 11, 773–785 (2010).
Kiezun, A. et al. Exome sequencing and the genetic basis of complex traits. Nat. Genet. 44, 623–630 (2012).
Lee, S., Abecasis, G. R., Boehnke, M. & Lin, X. Rare-variant association analysis: study designs and statistical tests. Am. J. Hum. Genet. 95, 5–23 (2014).
Wu, M. C. et al. Rare-variant association testing for sequencing data with the sequence kernel association test. Am. J. Hum. Genet. 89, 82–93 (2011).
Morris, A. P. & Zeggini, E. An evaluation of statistical approaches to rare variant analysis in genetic association studies. Genet. Epidemiol. 34, 188–193 (2010).
Liu, Y. et al. ACAT: a fast and powerful P value combination method for rare-variant analysis in sequencing studies. Am. J. Hum. Genet. 104, 410–421 (2019).
Lee, S., Wu, M. C. & Lin, X. Optimal tests for rare variant effects in sequencing association studies. Biostatistics 13, 762–775 (2012).
Sun, J., Zheng, Y. & Hsu, L. A unified mixed-effects model for rare-variant association in sequencing studies. Genet. Epidemiol. 37, 334–344 (2013).
Li, X. et al. Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nat. Genet. 52, 969–983 (2020).
Weissbrod, O. et al. Functionally informed fine-mapping and polygenic localization of complex trait heritability. Nat. Genet. 52, 1355–1363 (2020).
Kichaev, G. et al. Integrating functional data to prioritize causal variants in statistical fine-mapping studies. PLoS Genet. 10, e1004722 (2014).
Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
Hormozdiari, F. et al. Colocalization of GWAS and eQTL signals detects target genes. Am. J. Hum. Genet. 99, 1245–1260 (2016).
Gogarten, S. M. et al. Genetic association testing using the GENESIS R/Bioconductor package. Bioinformatics 35, 5346–5348 (2019).
Zhou, H. et al. FAVOR: functional annotation of variants online resource and annotator for variation across the human genome. Nucleic Acids Res. 51, D1300–D1311 (2023).
Preissl, S., Gaulton, K. J. & Ren, B. Characterizing cis-regulatory elements using single-cell epigenomics. Nat. Rev. Genet. https://doi.org/10.1038/s41576-022-00509-1 (2022).
Nasser, J. et al. Genome-wide enhancer maps link risk variants to disease genes. Nature 593, 238–243 (2021).
Gasperini, M., Tome, J. M. & Shendure, J. Towards a comprehensive catalogue of validated and target-linked human enhancers. Nat. Rev. Genet. 21, 292–310 (2020).
Zhang, K. et al. A single-cell atlas of chromatin accessibility in the human genome. Cell 184, 5985–6001 (2021).
Buenrostro, J. D. et al. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523, 486–490 (2015).
Corces, M. R. et al. Single-cell epigenomic analyses implicate candidate causal variants at inherited risk loci for Alzheimer’s and Parkinson’s diseases. Nat. Genet. 52, 1158–1168 (2020).
Schilder, B. M. & Raj, T. Fine-mapping of Parkinson’s disease susceptibility loci identifies putative causal variants. Hum. Mol. Genet. 31, 888–900 (2022).
Selvaraj, M. S. et al. Whole genome sequence analysis of blood lipid levels in >66,000 individuals. Nat. Commun. 13, 5995 (2022).
Moore, J. E., Pratt, H. E., Purcaro, M. J. & Weng, Z. A curated benchmark of enhancer-gene interactions for evaluating enhancer-target gene prediction methods. Genome Biol. 21, 17 (2020).
Fulco, C. P. et al. Activity-by-contact model of enhancer–promoter regulation from thousands of CRISPR perturbations. Nat. Genet. 51, 1664–1669 (2019).
Boix, C. A., James, B. T., Park, Y. P., Meuleman, W. & Kellis, M. Regulatory genomic circuitry of human disease loci by integrative epigenomics. Nature 590, 300–307 (2021).
Abascal, F. et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).
Jagadeesh, K. A. et al. Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics. Nat. Genet. https://doi.org/10.1038/s41588-022-01187-9 (2022).
Zhang, M. J. et al. Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data. Nat. Genet. https://doi.org/10.1038/s41588-022-01167-z (2022).
Hamel, A. R. et al. Integrating genetic regulation and single-cell expression with GWAS prioritizes causal genes and cell types for glaucoma. Nat. Commun. 15, 396 (2024).
Yin, M. et al. sc2GWAS: a comprehensive platform linking single cell and GWAS traits of human. Nucleic Acids Res. https://doi.org/10.1093/nar/gkae1008 (2024).
Das, A. C. et al. Single-cell chromatin accessibility data combined with GWAS improves detection of relevant cell types in 59 complex phenotypes. Int. J. Mol. Sci. 23, 11456 (2022).
Fishilevich, S. et al. GeneHancer: genome-wide integration of enhancers and target genes in GeneCards. Database 2017, bax028 (2017).
Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).
Forrest, A. R. R. et al. A promoter-level mammalian expression atlas. Nature 507, 462–470 (2014).
Dunham, I. et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
IGVF Consortium. The impact of genomic variation on function (IGVF) Consortium. Nature 633, 47–57 (2024).
Regev, A. et al. The Human Cell Atlas. eLife 6, e27041 (2017).
Rozenblatt-Rosen, O., Stubbington, M. J. T., Regev, A. & Teichmann, S. A. The Human Cell Atlas: from vision to reality. Nature 550, 451–453 (2017).
Chen, C.-H. et al. Determinants of transcription factor regulatory range. Nat. Commun. 11, 2472 (2020).
Schaffner, S. F. et al. Calibrating a coalescent simulation of human genome sequence variation. Genome Res. 15, 1576–1583 (2005).
Li, X. et al. A statistical framework for multi-trait rare variant analysis in large-scale whole-genome sequencing studies. Nat. Comput. Sci. https://doi.org/10.1038/s43588-024-00764-8 (2025).
Graham, S. E. et al. The power of genetic diversity in genome-wide association studies of lipids. Nature 600, 675–679 (2021).
Klarin, D. et al. Genetics of blood lipids among ~300,000 multi-ethnic participants of the Million Veteran Program. Nat. Genet. 50, 1514–1523 (2018).
Maestri, A. et al. Lipid droplets, autophagy, and ageing: a cell-specific tale. Ageing Res. Rev. 94, 102194 (2024).
Molenaar, M. R., Penning, L. C. & Helms, J. B. Playing Jekyll and Hyde—the dual role of lipids in fatty liver disease. Cells 9, 2244 (2020).
Schulze, R. J., Schott, M. B., Casey, C. A., Tuma, P. L. & McNiven, M. A. The cell biology of the hepatocyte: a membrane trafficking machine. J. Cell Biol. 218, 2096–2112 (2019).
Rutkowski, J. M., Stern, J. H. & Scherer, P. E. The cell biology of fat expansion. J. Cell Biol. 208, 501–512 (2015).
He, Q. et al. Role of liver sinusoidal endothelial cell in metabolic dysfunction-associated fatty liver disease. Cell Commun. Signal. 22, 346 (2024).
Hussain, M. M. Intestinal lipid absorption and lipoprotein formation. Curr. Opin. Lipidol. 25, 200–206 (2014).
Jones, R. C. et al. The Tabula Sapiens: a multiple-organ, single-cell transcriptomic atlas of humans. Science https://doi.org/10.1126/science.abl4896 (2022).
Ignatiadis, N. & Huber, W. Covariate powered cross-weighted multiple testing. J. R. Stat. Soc. Ser. B Stat. Methodol. 83, 720–751 (2021).
Hao, Y. et al. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat. Biotechnol. 42, 293–304 (2024).
Engreitz, J. M. et al. Deciphering the impact of genomic variation on function. Nature 633, 47–57 (2024).
Hindy, G. et al. Rare coding variants in 35 genes associate with circulating lipid levels—a multi-ancestry analysis of 170,000 exomes. Am. J. Hum. Genet. 109, 81–96 (2022).
Safarova, M. et al. Advances in targeting LDL cholesterol: PCSK9 inhibitors and beyond. Am. J. Prev. Cardiol. 19, 100701 (2024).
Wadhera, R. K., Steen, D. L., Khan, I., Giugliano, R. P. & Foody, J. M. A review of low-density lipoprotein cholesterol, treatment strategies, and its impact on cardiovascular disease morbidity and mortality. J. Clin. Lipidol. 10, 472–489 (2016).
Liu, Y. & Xie, J. Cauchy combination test: a powerful test with analytic P-value calculation under arbitrary dependency structures. J. Am. Stat. Assoc. 115, 393–402 (2020).
Breslow, N. E. & Clayton, D. G. Approximate inference in generalized linear mixed models. J. Am. Stat. Assoc. 88, 9–25 (1993).
Chen, H. et al. Efficient variant set mixed model association tests for continuous and binary traits in large-scale whole-genome sequencing studies. Am. J. Hum. Genet. 104, 260–274 (2019).
Chen, H. et al. Control for population structure and relatedness for binary traits in genetic association studies via logistic mixed models. Am. J. Hum. Genet. 98, 653–666 (2016).
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
Conomos, M. P., Reiner, A. P., Weir, B. S. & Thornton, T. A. Model-free estimation of recent genetic relatedness. Am. J. Hum. Genet. 98, 127–148 (2016).
Li, X. et al. Streamlining large-scale genomic data management: Insights from the UK Biobank whole-genome sequencing data. Cell Genom. https://doi.org/10.1016/j.xgen.2025.101009 (2025).
Lin, X. et al. Scalable analysis of large multi-ancestry biobanks by leveraging sparse ancestry-adjusted sample-relatedness. Preprint at Research Square https://doi.org/10.21203/rs.3.rs-5343361/v1 (2024).
Van Buren, E. cellSTAAR paper analysis code. Zenodo https://doi.org/10.5281/zenodo.16113567 (2025).
Acknowledgements
This work was supported by grants R35-CA197449, U19-CA203654, U01-HG012064 and U01-HG009088 (to X. Lin); R01-HL142711 and R01-HL127564 (to P.N. and G.M.P.); 75N92020D00001, HHSN268201500003I, N01-HC-95159, 75N92020D00005, N01-HC-95160, 75N92020D00002, N01-HC-95161, 75N92020D00003, N01-HC-95162, 75N92020D00006, N01-HC-95163, 75N92020D00004, N01-HC-95164, 75N92020D00007, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169, UL1-TR-000040, UL1-TR-001079, UL1-TR-001420, UL1-TR001881, DK063491, R01-HL071051, R01-HL071205, R01-HL071250, R01-HL071251, R01-HL071258, R01-HL071259 and UL1-RR033176 (to J.R. and Y.C.); 1R35-HL135818, R01-HL113338 and HL046389 (to S.R.); HL105756 (to B.P.); HHSN268201600018C, HHSN268201600001C, HHSN268201600002C, HHSN268201600003C and HHSN268201600004C (to C.K.); R01-MD012765 and R01-DK117445 (to N.F.); R01-HL153805 and R03-HL154284 (to B.E.C.); HHSN268201700001I, HHSN268201700002I, HHSN268201700003I, HHSN268201700005I and HHSN268201700004I (to E.B.); U01-HL072524, R01-HL104135-04S1, U01-HL054472, U01-HL054473, U01-HL054495, U01-HL054509 and R01-HL055673-18S1 (to D.K.A.); U01-HL72518, HL087698, HL49762, HL59684, HL58625, HL071025, HL112064, NR0224103 and M01-RR000052 (to the Johns Hopkins General Clinical Research Center); R01-HL133040 (to R.L.M.); R01-HL093093 (to S. T. McGarvey); R01-HL173044 and R01-AG085581 (to X. Li); and NHLBI TOPMed Fellowship 75N92021F00229 (to X. Li and M.S.S.). The Cardiovascular Health Study research was supported by NHLBI contracts HHSN268201200036C, HHSN268200800007C, HHSN268201800001C, N01HC55222, N01HC85079, N01HC85080, N01HC85081, N01HC85082, N01HC85083, N01HC85086 and 75N92021D00006; and NHLBI grants R01-HL172803, U01HL080295, R01-HL087652, R01-HL105756, R01-HL103612, R01-HL120393 and U01HL130114, with additional contribution from the National Institute of Neurological Disorders and Stroke. Additional support was provided through R01AG023629 from the National Institute on Aging.
A full list of principal CHS investigators and institutions can be found at CHS-NHLBI (https://chs-nhlbi.org/). This work was also supported by R01-HL92301, R01-HL67348, R01-NS058700, R01-AR48797, R01-DK071891, R01-AG058921, the General Clinical Research Center of the Wake Forest University School of Medicine (M01-RR07122, F32 HL085989), the American Diabetes Association and a pilot grant from the Claude Pepper Older Americans Independence Center of Wake Forest University Health Sciences (P60 AG10484). The Coronary Artery Risk Development in Young Adults Study (CARDIA) is conducted and supported by the NHLBI in collaboration with the University of Alabama at Birmingham (75N92023D00002, 75N92023D00005), Northwestern University (75N92023D00004), University of Minnesota (75N92023D00006) and Kaiser Foundation Research Institute (75N92023D00003). The FHS acknowledges the support of contracts NO1-HC-25195, HHSN268201500001I and 75N92019D00031 from the NHLBI and grant supplement R01-HL092577-06S1 for this research. We also acknowledge the dedication of the FHS study participants without whom this research would not be possible. R.S.V. is supported in part by the Evans Medical Foundation and the Jay and Louis Coffman Endowment from the Department of Medicine, Boston University School of Medicine. The JHS is supported and conducted in collaboration with Jackson State University (HHSN268201800013I), Tougaloo College (HHSN268201800014I), the Mississippi State Department of Health (HHSN268201800015I) and the University of Mississippi Medical Center (HHSN268201800010I, HHSN268201800011I and HHSN268201800012I) contracts from the NHLBI and the National Institute on Minority Health and Health Disparities. We also thank the staff and participants of the JHS. Support for GENOA was provided by the NHLBI (U01HL054457, U01HL054464, U01HL054481, R01-HL119443 and R01-HL087660) of the National Institutes of Health. 
Collection of the San Antonio Family Study data was supported in part by National Institutes of Health grants P01 HL045522, MH078143, MH078111 and MH083824; and WGS of SAFS participants was supported by U01 DK085524 and R01-HL113323. The Diabetes Heart Study was supported by R01-HL92301, R01-HL67348, R01-NS058700, R01-AR48797, R01-DK071891, R01-AG058921, the General Clinical Research Center of the Wake Forest University School of Medicine (M01-RR07122, F32 HL085989), the American Diabetes Association and a pilot grant from the Claude Pepper Older Americans Independence Center of Wake Forest University Health Sciences (P60 AG10484). Molecular data for the TOPMed program was supported by the NHLBI. Genome sequencing for ‘NHLBI TOPMed: Coronary Artery Risk Development in Young Adults (CARDIA)’ (phs001612.v1.p1) was performed at the Baylor Sequencing Center (HHSN268201600033I). Core support, including centralized genomic read mapping and genotype calling, variant quality metrics and filtering, was provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1; contract HHSN268201800002I). Core support, including phenotype harmonization, data management, sample-identity quality control and general program coordination, was provided by the TOPMed Data Coordinating Center (R01-HL120393, U01HL-120393; contract HHSN268201800001I). Support for the Multi-Ethnic Study of Atherosclerosis was provided by contracts 75N92025D00022, 75N92025D00026, 75N92025D00024, 75N92025D00027, 75N92025D00025 and 75N92025D00028. We gratefully acknowledge the studies and participants who provided biological samples and data for TOPMed. The full study-specific acknowledgements are detailed in the Supplementary Note.
Author information
Contributions
E.V.B., Y.Z., X. Li, Z. Li and X. Lin designed the experiments. E.V.B., Y.Z., X. Li and X. Lin performed the experiments. E.V.B., Y.Z., X. Li, Z. Li, H.Z., M.S.S., N.D.P., D.K.A., J.B., E.B., B.E.C., J.C.C., A.P.C., Y.D.I.C., J.C., R.D., M.F., N.F., M.G., C.G., X.G., J.H., N.H.C., L.H., Y.J.H., R.R.K., S.L.R.K., E.K., C.K., B.G.K., L.L., D.L., C.L., S.L., D.L.J., R.J.F.L., A.W.M., L.M., R.L.M., R.J.M., B.M., J.C.M., T.N., K.N., J.O., J.P., P.P., B.P., L.R., R.S.V., S.R., A.R., S.S.R., J.S., B.S., H.T., K.D.T., R.T., S.V., L.Y., W.Z., J.R., G.M.P., P.N. and X. Lin acquired, analyzed or interpreted data. J.R., G.M.P., P.N. and the NHLBI TOPMed Lipids Working Group provided administrative, technical or material support. E.V.B. and X. Lin drafted the manuscript and revised it according to suggestions by the coauthors. All authors critically reviewed the manuscript, suggested revisions as needed and approved the final version.
Ethics declarations
Competing interests
E.K. has received personal fees from Regeneron Pharmaceuticals, 23&Me, Allelica and Illumina; has received research funding from Allelica; and serves on the advisory boards for Encompass Biosciences, Overtone and Galateo Bio. P.N. reports research grants from Allelica, Amgen, Apple, Boston Scientific, Cleerly, Genentech/Roche, Ionis, Novartis and Silence Therapeutics, personal fees from AIRNA, Allelica, Apple, AstraZeneca, Bain Capital, Blackstone Life Sciences, Bristol Myers Squibb, Creative Education Concepts, CRISPR Therapeutics, Eli Lilly & Co, Esperion Therapeutics, Foresite Capital, Foresite Labs, Genentech/Roche, GV, HeartFlow, Magnet Biomedicine, Merck, Novartis, Novo Nordisk, TenSixteen Bio and Tourmaline Bio; equity in Bolt, Candela, Mercury, MyOme, Parameter Health, Preciseli and TenSixteen Bio; royalties from Recora for intensive cardiac rehabilitation; and spousal employment at Vertex Pharmaceuticals, all unrelated to the present work. B.M.P. serves on the Steering Committee of the Yale Open Data Access Project funded by Johnson & Johnson. L.M.R. is a consultant for the TOPMed Administrative Coordinating Center (ACC) through Westat. X. Lin is a consultant of AbbVie Pharmaceuticals and Verily Life Sciences. The other authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editor: Lin Tang, in collaboration with the Nature Methods team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Figs. 1–18 and Note.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Van Buren, E., Zhang, Y., Li, X. et al. cellSTAAR: incorporating single-cell-sequencing-based functional data to boost power in rare variant association testing of noncoding regions. Nat Methods (2025). https://doi.org/10.1038/s41592-025-02919-5
DOI: https://doi.org/10.1038/s41592-025-02919-5


