Abstract
Large-scale, unbiased single-cell genomics studies of complex developmental compartments, such as hematopoiesis, have inferred novel cell states and trajectories; however, further characterization has been hampered by difficulty isolating cells corresponding to discrete genomic states. To address this, we present a framework that integrates multimodal single-cell analyses (RNA, surface protein and chromatin) with high-dimensional flow cytometry and enables semiautomated enrichment and functional characterization of diverse cell states. Our approach combines transcription factor expression with chromatin activity to uncover hierarchical gene regulatory networks driving these states. We delineated and isolated rare bone marrow Lin−Sca−CD117+CD27+ multilineage cell states (‘MultiLin’), validated predicted lineage trajectories and mapped differentiation potentials. Additionally, we used transcription factor activity on chromatin to trace and isolate multilineage progenitors undergoing multipotent to oligopotent lineage restriction. In the proposed model of steady-state hematopoiesis, discrete states governed developmental trajectories. This framework provides a scalable solution for isolating and characterizing novel cell states across different biological systems.
Data availability
All genomics and flow cytometry data (raw and processed) have been deposited in the following open-access repositories: Gene Expression Omnibus (GSE266609), Synapse (syn60529836), mouse hematopoietic CITE-seq interactive browser (https://altanalyze.org/MarrowAtlas/) and mouse MultiLin GRN visualization web application (https://multilin-grn-viewer-6c053f707717.herokuapp.com/). Source data are provided with this paper.
Code availability
Scripts and associated documentation necessary to reproduce the genomics and InfinityFlow bioinformatics analyses have been deposited in GitHub at https://github.com/KyleFerchen/MultiLin_Project_Code_Repository.
Acknowledgements
This work was partially supported by RC2DK122376 and R01HL122661 to H.L.G. NIH training grant T32CA117846 partially supported K.F. Flow cytometric data were acquired using equipment maintained by the CCHMC Research Flow Cytometry Core supported by NIH S10OD025045. Sequencing was performed by the CCHMC DNA Sequencing and Genotyping Core or by Novogene US. We thank M. Daud Khan (CCHMC) for assistance with genomics analyses. The CellTag lentivirus was packaged and purified by the CCHMC Translational Core Laboratory Vector Production Facility. We thank B. Song (CCHMC) for contributing to ADT titration and cocktail formulation, CITE-seq atlases, initial InfinityFlow and CellTag culture and library work. We thank D. Schnell for providing statistical guidance and enrichment analysis. We thank K. Jindal (Washington University) for advice on and troubleshooting of CellTag protocols. We thank J. Butler (University of Florida) for the gift of BMEC-Akt cells. We thank M. DeLay (Cytek Biosciences) and K. Weller (Ohio State University Comprehensive Cancer Center) for help gaining access to the Cytek Aurora and Cytek Aurora CS for InfinityFlow and FS-scRNA-seq captures. Cytek Biosciences financially supported the use of the CS at Ohio State University Comprehensive Cancer Center. We thank BioLegend and former BioLegend employees B. Z. Yeung (BioTuring) and K. Nazor (Proteintech Genomics) for providing the prototype cocktails used for antibody titration. Acknowledgement of individuals and companies does not imply their endorsement of the study’s data and conclusions.
Author information
Contributions
Conceptualization: K.F., N.S. and H.L.G. Methodology: K.F., X.Z., N.S. and H.L.G. Software: K.F., G.L., K.T., P.R., S.S. and N.S. Validation: K.F., X.Z. and D.B. Formal analysis: K.F. and N.S. Investigation: K.F., X.Z., D.B., A.O., S.N.B. and C.P. Resources: J.C., S.M. and F.D.F. Data curation: K.F., K.T. and N.S. Writing, original draft: K.F., N.S. and H.L.G. Writing, review and editing: K.F., S.M., H.S., N.S. and H.L.G. Visualization: K.F. and N.S. Supervision: H.S., N.S. and H.L.G. Project administration: K.F., N.S. and H.L.G. Funding acquisition: H.L.G.
Ethics declarations
Competing interests
J.C. declares that they are an employee and stakeholder of BioLegend (a part of the Revvity group of companies). S.M. is a co-founder of CapyBio Inc. The other authors declare no competing interests.
Peer review
Peer review information
Nature Immunology thanks Shalin Naik and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Ioana Staicu, in collaboration with the rest of the Nature Immunology team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Experimental and bioinformatics rubric to define, isolate and resolve genomics linkages between murine progenitors.
a, Comprehensive and concordant atlases of murine hematopoietic progenitors were created using multiomic single-cell (CITE-seq) and flow-cytometric profiling (InfinityFlow), then integrated via a unified computational workflow for label transfer (KDE+cellHarmony). b, Trimodal multiomic single-cell data (TEA-seq) were analyzed to derive GRNs (ChromLinker). c, Based on their transcriptional and chromatin profiles, genomically defined MultiLin populations were isolated by flow cytometry and validated. Single-cell lineage tracing established the developmental potential of these cells. d, Discrete states govern hematopoietic developmental trajectories. Markers responsive to nascent transcription factor activity (CD55, CD371) are highlighted. e-f, Gates used to selectively enrich HSC-MPP cells (e) and CD127+ lymphoid cells (f). g, Fluidigm captures for the populations represented as a gene expression heatmap of marker genes using the named marker strategies (indicated as a tick-map at the bottom). h, Initial MultiLin enrichment strategy (Lin−CD117+CD34+CD115−Ly6C−) based upon Fluidigm captures.
Extended Data Fig. 2 CITE-seq encompassing early hematopoietic states.
a-c, Cluster comparison across TotalVI-processed surface protein ADT signals from the hand-titrated 60-ADT antibody panel (a), the universal mix (v1.0) (b) and the final product after molecular sequence-based titration (c). d, Cartoon illustrating the experimental design of the titration experiment. HASH antibodies were used to multiplex titration samples from the same population. e, Bar plot illustrating the ARI score for re-classification of transcriptome-defined CITE-seq labels using ADT feature values for the BioLegend Universal mix versus the final titrated mix. f, Outline illustrating the populations captured to generate the CITE-seq atlas, the collected features of the CITE-seq atlas and the clustering resolved using scTriangulate. g-k, The integrated transcriptome UMAP embedding, illustrating the sort gate used to enrich for the targeted population (g), scTriangulate confidence scores (h), the RNA (i) and ADT (j) contribution values for the final cluster definitions and source annotations by final stable clusters (k). l, Marker heatmap of cells from qHSC, HSC-Mac-1 and Mac-Nr1h3 (CITE-seq titrated), for RNAs (top) and ADTs (bottom). m, Flow plots of HSC-MPP gated bone marrow cells (Lin−Sca1+CD117+ gating out CD150−CD48+ for MPP3-MPP4-gate cells) reveal rare CD193+ and CD115+ populations, in agreement with predicted HSC-Macrophage populations observed with CITE-seq.
Extended Data Fig. 3 Predicted cluster-defined populations within published flow-defined populations.
Heatmap shows percentage overlap between InfinityFlow atlas populations and in-silico gated populations from the indicated publications (below). CITE-seq atlas population labels (vertical axis) and published flow-cytometry-defined populations (horizontal axis).
Extended Data Fig. 4 Identification of new gating schemes for CITE-seq defined populations.
a, Gating scheme for initial “FS-scRNA-seq” validation focused on myeloid end-state populations. b-e, Expression of Epx-CRE ROSA-LSL-tdTomato reporter (b,d) and eosinophil marker CD125 (IL5RA) (c,e) for gates within the Lin−CD16-32+Irf8lowLy6C− gated cells (b,c) and Lin−CD16-32+Irf8lowLy6C+ gated cells (d,e). f-g, UMAP of gene expression data from HIVE captures of CD117+, eosinophil trajectory and BMCP trajectories, clustered (f) and illustrating Ly6c2 expression (g). h, UMIs for genes from selected FS-scRNA-seq populations (Unknown MultiLin, EoP, Ly6C+ EoP1, Ly6C+ EoP2). i, UMIs for cell HASH oligos from selected FS-scRNA-seq populations (Unknown MultiLin, EoP, Ly6C+ EoP1, Ly6C+ EoP2).
Extended Data Fig. 5 FS-scRNA-seq validation steps and creation of benchmark dataset.
a-f, Gating schemes derived from Ab-MarkerFinder with targeted enrichment for the MPP3-IER (a), ML-1b and ML-2 (b), MEP (c), BMCP (d), IG2-MP (e), and IG2-proNeu1 (f) clusters. g, InfinityFlow in silico gated populations replicating mutually-exclusive (non-overlapping) FS-scRNA-seq definitions projected over the UMAP embedding. h, Gene expression profiles of CITE-seq cells (cluster colors are the same as those pointed to in g) that map to FS-scRNA-seq sorted populations by their marker genes. i, Example heatmap of pairwise overlap between the true gate label to the predicted gate label (confusion matrix) in InfinityFlow data used to calculate ARI scores.
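The ARI calculation mentioned in panel i can be illustrated with a minimal sketch. It assumes the inputs are two per-cell label lists (the true gate and the predicted gate); the function name and example labels are hypothetical and are not taken from the authors' code repository. The adjusted Rand index is computed directly from the contingency (confusion) table counts.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, predicted_labels):
    """Adjusted Rand index from the true-vs-predicted contingency table."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    n = len(true_labels)
    if n < 2:
        return 1.0  # no pairs of cells to compare

    # Confusion-matrix cells: (true gate, predicted gate) -> number of cells
    cells = Counter(zip(true_labels, predicted_labels))
    row_totals = Counter(true_labels)
    col_totals = Counter(predicted_labels)

    # Count pairs of cells that co-occur in a cell, a row or a column
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in row_totals.values())
    sum_cols = sum(comb(c, 2) for c in col_totals.values())

    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical example: four cells, two gates, perfect agreement
true_gate = ["MEP", "MEP", "BMCP", "BMCP"]
pred_gate = ["MEP", "MEP", "BMCP", "BMCP"]
ari_perfect = adjusted_rand_index(true_gate, pred_gate)
```

A value of 1 indicates identical partitions, and values near 0 indicate agreement no better than chance.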
Extended Data Fig. 6 “ChromLinker” analysis scheme integrates CITE-seq atlas populations with TEA-seq and then infers GRN.
a, Input data (CITE-seq and TEA-seq) and their underlying components. b, CITE-seq clusters defined by scTriangulate. c, harmonypy integration of CITE-seq labels to TEA-seq (RNA). d, UMAP representation of clusters captured by TEA-seq gates (HSC-MPP and MultiLin gates). e, TEA-seq BAM files (ATAC) were split according to pseudobulk cluster definitions. f, Peaks were called on individual pseudobulk cluster BAM files. g-h, Peaks were tested for association with genes within pre-defined TADs (g) using Pearson correlation of pseudobulk TEA-seq ATAC accessibility profile to pseudobulk CITE-seq gene expression across the 57 cluster profiles (h). i, A set of ~100,000 peaks significantly correlated to gene expression values (p-value < 0.001). j, ChromBPNet bias models were generated using the total merged peak set and the combined 10X Cell Ranger output BAM files from the MultiLin sort gate. k, ChromBPNet bias-factorized models were successfully generated for 32 of the 57 pseudobulk profiles. l, Contribution scores were calculated for each of the 32 models. m, TF-MoDISco was used to cluster the contribution score seqlets and identify CWM patterns, which were annotated with known transcription factor DNA-binding motifs using the CIS-BP2 database. n, The dynamic peak set was scanned and scored for the CWM profiles identified by TF-MoDISco. o-p, A merged database of seqlets was generated (o) and, within TADs, the dot product of the contribution scores was correlated to gene expression (p). q-r, Significantly correlated seqlets (r > 0.4) were identified for each gene (q) and annotated by their underlying transcription factors to generate a pairwise correlation matrix (r). s, These connections were filtered to significant connections to build an initial gene regulatory network. t, The connections were scored for each cluster and aggregated to generate activity scores: Z-score integrating target gene expression, transcription factor expression, and regulatory contribution of the transcription factor to its putative target genes in each of the 32 clusters.
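The peak-to-gene association in panels g-h can be sketched as follows. This is a minimal illustration, not the ChromLinker implementation: the dictionary layout, the function names and the toy values are assumptions, and the 0.4 cutoff is borrowed from the seqlet step in panel q purely for illustration (the caption reports a p-value threshold for peaks, which is not computed here).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length pseudobulk profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)


def link_peaks_to_genes(peak_accessibility, gene_expression, tads, min_r=0.4):
    """Correlate each peak only with genes that share its TAD.

    peak_accessibility / gene_expression: name -> values across clusters.
    tads: TAD name -> {"peaks": [...], "genes": [...]} (hypothetical layout).
    """
    links = []
    for members in tads.values():
        for peak in members["peaks"]:
            for gene in members["genes"]:
                r = pearson_r(peak_accessibility[peak], gene_expression[gene])
                if r > min_r:
                    links.append((peak, gene, r))
    return links


# Toy example with four cluster pseudobulks (values are made up)
peaks = {"peak1": [1, 2, 3, 4], "peak2": [4, 3, 2, 1]}
genes = {"GeneA": [2, 4, 6, 8]}
tads = {"tad1": {"peaks": ["peak1", "peak2"], "genes": ["GeneA"]}}
links = link_peaks_to_genes(peaks, genes, tads)
```

Restricting the comparison to peak-gene pairs within the same TAD keeps the number of tests manageable and limits links to plausible regulatory distances.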
Extended Data Fig. 7 Label Transfer from CITE-seq to TEA-seq.
a, UMAP projections of merged CITE-seq (blue) and TEA-seq (orange) transcriptome profiles prior to harmonypy integration (integration was done separately for corresponding HSC-MPP and MultiLin gates between CITE-seq and TEA-seq). b, Distributions of principal components of CITE-seq (blue) and TEA-seq (orange) (an X denotes the removal of principal component 1 prior to harmonypy integration, ‘run_harmony’). c, UMAP projections of merged CITE-seq (blue) and TEA-seq (orange) profiles after modified harmonypy algorithm implementation with removal of principal component 1. d-e, Validation of label propagation by pairwise comparison of ranking (Spearman correlation as blue-white-red color scale) of marker genes within each cluster for both HSC/MPP TEA-seq replicates (d) and MultiLin TEA-seq replicates (e).
Extended Data Fig. 8 Prior evidence of epigenetic activity and transcription factor ChIP-seq binding validates gene regulatory network.
a, Cluster-specific GRN with gene nodes colored and scaled according to their relative expression levels: qHSC, ML-1b, IG2-proNeu1, ML-MDP, BMCP, MEP. b, Dot plot comparisons of the InfinityFlow fluorescent reporter level (vertical axis) and activity (horizontal axis) across each of the 32 mapped clusters for MYC. c, Stacked bar plots showing candidate cCREs in the proposed GRN and ENCODE v4, with the green area illustrating the overlap. Gray bars indicate non-overlapping regions. d, Stacked bar plots showing candidate cCREs in the proposed GRN for the 32 clusters. Overlap with ENCODE v4 cCRE colored according to cluster, with unique cCRE colored gray. e, ChIP-seq experiments targeting transcription factors (CistromeDB) were tested for corresponding transcription factor/seqlet-enrichment using GIGGLE to index and Fisher’s exact test to score. f, A heatmap of all pairwise comparisons of seqlet instances (rows) and CistromeDB ChIP-seq peak sets (columns). Transcription factors are grouped with color bars to annotate families. Red outlines are used to highlight direct family to self-comparisons. The color indicates the rank across all ChIP-seq datasets in CistromeDB for the enrichment of the given transcription factor/seqlet instances by GIGGLE score. g, Dot plot showing the Log2 fold enrichment of the GIGGLE score between the seqlet instances and its corresponding ChIP-seq peak set (self-to-self) over all other ChIP-seq peak sets (self-to-others). Significance is given by a one-sided (positive enrichment) Mann-Whitney U test (significant: p < 0.05).
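The overlap scoring in panel e relies on Fisher’s exact test. The sketch below shows a one-sided (enrichment) test on a 2×2 overlap table using only the standard library. It is not GIGGLE’s own scoring code; the function name and the table layout (seqlet instances inside or outside ChIP-seq peaks versus background regions) are assumptions made for illustration.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Assumed layout (hypothetical):
      a = seqlet instances overlapping ChIP-seq peaks
      b = seqlet instances not overlapping peaks
      c = background regions overlapping peaks
      d = background regions not overlapping peaks
    Returns P(overlap >= a) under the hypergeometric null.
    """
    row1 = a + b          # all seqlet instances
    col1 = a + c          # all regions overlapping peaks
    n = a + b + c + d     # all regions
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, col1)


# Toy example: every seqlet instance overlaps a peak, no background region does
p_enriched = fisher_exact_greater(3, 0, 0, 3)
```

Smaller p-values indicate that seqlet instances fall inside the ChIP-seq peak set more often than expected by chance.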
Extended Data Fig. 9 MultiLin populations display distinct lineage priming.
a, sc-Hrödinger scores using the top 100 marker genes per cluster for mixed-lineage priming of the indicated cluster gene expression programs (top) across the indicated CITE-seq atlas populations (left) using CITE-seq gene expression data. b, Bar plots show the expression of specified progenitor marker genes within MultiLin clusters. c, Capybara predictions, using scaled quadratic programming (QP) and multiple identity (Multi-ID) percentages for each cell state relative to the same restricted cell states as in (a). d, The integrated scRNA-seq gene expression UMAP illustrating PAGA-defined linkages (edges) between states colored as high confidence (blue), medium confidence (green), and not recapitulated by CellTag (dotted). e, CellTag workflow: two progenitor subsets (Lin−Kit+Sca+, Lin−Kit+Sca−) were independently transduced with the CellTag-multi lentiviral barcoding vector and cultured, and GFP+ cells were then sorted and captured for scRNA-seq analysis (n = 2 technical replicates). f, Cells in the MultiLin Lin−Kit+Sca−CD27+ gate were index-sorted for CD55−CD371−, CD55+CD371− and CD55−CD371+ populations, and single cells were cultured for 5 days. The output of each single cell is classified as all tdTomato− (gray), all tdTomato+ (red) or a mix of tdTomato+ and tdTomato− cells (purple).
Extended Data Fig. 10 Flow cytometric and genomic dissection of MultiLin heterogeneity and lineage restriction.
a, 11-color flow cytometry panel used with Sony MA900 sorter to monitor activity of lineage-defined-CRE activation of ROSA-LSL-tdTomato reporters. b, Heatmap comparison of the predicted cluster content of in-silico-sorted InfinityFlow cell populations (left) and FS-scRNA-seq analysis (right) for the sort gates (A-G). c, Nippostrongylus brasiliensis infection model schematic shows full spectrum flow analysis and scRNA-seq capture timepoints after infection. d, Cell frequency curve plot for 4 out of 22 in vivo perturbation scRNA-seq datasets. MultiLin cell populations are re-scaled among themselves to 1 (left side of the plot). Unadjusted cell frequency is shown for selected non-MultiLin clusters. e, Chi-square residuals for MultiLin scaled cell-population frequency versus all non-HSPC-MultiLin clusters with significant associations (*).
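The chi-square residuals in panel e can be illustrated with a minimal sketch. It assumes a contingency table of counts (rows as conditions, columns as cell populations) and computes Pearson residuals, (observed − expected) / √expected. Whether the authors used Pearson or standardized residuals is not stated in the caption, so this choice, the function name and the toy counts are all assumptions.

```python
from math import sqrt


def pearson_residuals(table):
    """Pearson chi-square residuals for a contingency table of counts.

    table: list of rows, each a list of non-negative counts
    (hypothetical layout: rows = conditions, columns = cell populations).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one count")

    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / grand_total
            residual_row.append((observed - expected) / sqrt(expected))
        residuals.append(residual_row)
    return residuals


# Toy example: two conditions by two populations (counts are made up)
counts = [[10, 20], [20, 10]]
residual_table = pearson_residuals(counts)
```

Positive residuals mark populations that are over-represented in a condition relative to independence, and negative residuals mark under-represented ones.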
Supplementary information
Supplementary Table 1
CITE-seq antibody information. (1) Antibody information for the CITE-seq 60-ADT panel. (2) Antibody information for the Universal Mix v1.0 195-ADT panel and titration panel. (3) Notes concerning the selection of antibodies for final CITE-seq and TEA-seq panels. (4) List of antibodies removed for the final ADT panel. (5) Antibody information for the final product CITE-seq 112-ADT panel (also used for TEA-seq). (6) Annotation of corresponding feature names used for the same target across the different CITE-seq antibody panels.
Supplementary Table 2
Sequencing metrics. (1) Sequencing metrics for next-generation sequencing experiments.
Supplementary Table 3
InfinityFlow antibody information. (1) Antibody information for the initial backbone panel used for InfinityFlow. (2) Biotin-conjugated lineage markers used for InfinityFlow panels. (3) A list of transgenic fluorescent reporter mouse models incorporated into the InfinityFlow object. (4) Surface marker antibodies used as Infinity markers (those imputed during InfinityFlow). (5) Panel used to capture the curated reference panel for the final InfinityFlow object. (6) Backbone panel used to impute CD115.
Supplementary Table 4
CITE-seq InfinityFlow integration. (1) Pearson correlation coefficients for corresponding transcription factors, comparing nearest-neighbor logicle-scaled transgenic fluorescent reporter signal intensities with CITE-seq mRNA log2(CPTT) count values after supervised KDE mapping normalization.
Supplementary Table 5
Durable cell states, markers and transcriptional regulators. Annotated CITE-seq-defined cell states and broad lineage classifications, with associated RNA and ADT markers. Ab-MarkerFinder-defined predicted and validated flow cytometry markers. Top ChromLinker-predicted transcriptional regulators.
Supplementary Table 6
Cell-state-specific differentially expressed genes after in silico perturbation of select transcription factors. Reports the differentially expressed genes (adjusted P < 0.05) across all tested MultiLin and HSC/MPP gate populations after in silico perturbation of Gata1, Gata2, Irf8, Spi1, Cebpa and Cebpe.
Supplementary Table 7
PAGA and CellTag connection scores. (1) PAGA and CellTag edge scores between cell-type pairs with two replicates from CellTag for both Ly6a− (Sca− input) and Ly6a+ (Sca+ input) captures. The ‘Type’ indicates known or new connections, and each has been labeled with MPP, MultiLin (ML) or bipotential (Bi) hypothesized labels. CellTag scores are reported as clonal coupling (CC) scores.
Source data
Source Data Fig. 1
Underlying source data (multiple data types).
Source Data Fig. 2
Underlying source data (multiple data types).
Source Data Fig. 3
Underlying source data (multiple data types).
Source Data Fig. 4
Underlying source data (multiple data types).
Source Data Fig. 5
Underlying source data (multiple data types).
Source Data Fig. 6
Underlying source data (multiple data types).
Source Data Fig. 7
Underlying source data (multiple data types).
Source Data Extended Data Fig. 1
Underlying source data (multiple data types).
Source Data Extended Data Fig. 2
Underlying source data (multiple data types).
Source Data Extended Data Fig. 3
Underlying source data (multiple data types).
Source Data Extended Data Fig. 4
Underlying source data (multiple data types).
Source Data Extended Data Fig. 5
Underlying source data (multiple data types).
Source Data Extended Data Fig. 7
Underlying source data (multiple data types).
Source Data Extended Data Fig. 8
Underlying source data.
Source Data Extended Data Fig. 9
Underlying source data.
Source Data Extended Data Fig. 10
Underlying source data.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ferchen, K., Zhang, X., Thakkar, K. et al. A unified multimodal single-cell framework reveals a discrete state model of hematopoiesis in mice. Nat Immunol 26, 2086–2099 (2025). https://doi.org/10.1038/s41590-025-02307-3
DOI: https://doi.org/10.1038/s41590-025-02307-3