Extended Data Fig. 3: Transcriptome-based cell type annotation of human CD34+ HSPCs.

(a) Unsupervised Louvain clustering was used to identify 22 transcriptionally distinct HSPC clusters, which were annotated based on marker gene expression and position in UMAP space. For lineages with multiple progenitor clusters, a lower number (for example, E-Prog-1) indicates less apparent differentiation and a higher number (for example, E-Prog-3) indicates an overall greater degree of commitment to that lineage. Clusters are organized into lineage groups based on priming towards a single lineage or on multipotency. (b) singleCellNet classification of HSPC states against a healthy human adult bone marrow scRNA-seq dataset (ref. 21). (c) The five lymphoid progenitor clusters that make up the two lymphoid trajectories in UMAP space were analyzed for cell cycle status based on gene expression signatures. Upon cell cycle regression of the dataset, the two lymphoid trajectories merge into a single lymphoid trajectory in UMAP space. (d) Potential doublets were removed from the dataset using DoubletFinder (ref. 82), and UMAP was performed on the remaining cells. The relationship of lineages and their trajectories is similar between the full dataset (top UMAP) and the dataset after removal of potential doublets (bottom UMAP). The color scheme for lineages is the same as in panel c. (e) CD34+ HSPCs were isolated at the indicated ages and cultured in methylcellulose medium with cytokines for 14 days, at which time colony frequency and morphology were tabulated in a blinded fashion. * indicates samples for which scRNA-seq was also performed. CFU-GEMM, colony-forming unit granulocyte/erythroid/monocyte/megakaryocyte; BFU-E, burst-forming unit erythroid; CFU-GM, colony-forming unit granulocyte/monocyte; CFU-M, colony-forming unit monocyte; CFU-G, colony-forming unit granulocyte.