Extended Data Fig. 1: Strategies to capture the molecular repertoire of erythroid cells from distinct developmental stages.

a, Schematic illustration of the experimental design. b, c, Flow cytometry plots showing the gating strategies for isolation of erythroid precursors (GYPAloCD71+ and GYPAhiCD71+) and progenitors (early erythroid progenitors (EEP): CD34+CD36−CD123−GYPA− and later erythroid progenitors (LEP): CD34−CD36+CD123−GYPA−) from FL and UCB; red boxes indicate the sorted cell populations. d–f, t-SNE plots of cells collected from YS (d, left), FL (e, left) and UCB (f, left). The erythroid cell cluster was defined according to the marker genes (right). Ery, erythroid cells. g, Representative significantly enriched GO terms for the erythroid cluster; dot size represents the count of identified signature genes and dot color indicates the adjusted P value. P values were determined by hypergeometric test and adjusted for multiple testing using the Benjamini–Hochberg method. h, Violin plots showing the number of UMIs captured in erythroid precursors of YS, FL and UCB. i, FeaturePlots showing the expression of GYPA in YS, FL and UCB cells. j, Spearman correlation analysis of GYPA+ erythroid precursors of each individual sample from YS, FL and UCB. k, UMAP plots showing each individual sample. l, Stacked bar chart indicating the proportion of cell clusters in YS, FL and UCB. m, Heatmap showing the relative expression (scaled by row) of signature genes in each cluster. n, Boxplot illustrating the erythroid maturation score (GO:0043249) for each cluster (C1, n = 8,373 cells; C2, n = 10,328 cells; C3, n = 10,918 cells; C4, n = 1,502 cells). The horizontal line across the box indicates the median value and the box spans the first to third quartiles. o, Correlation analysis between UCB erythroid precursors in each cluster and the FACS-sorted, stage-defined erythroid precursors (ref. 23). The circle area represents the absolute value of the corresponding correlation coefficient. p, Pseudotime analysis of C4 cells. C4 cells were divided into relatively immature ‘C4-A’ (pseudotime < 20) and mature ‘C4-B’ (pseudotime ≥ 20) groups. q, The dynamic expression of HBB and HBA2 along the inferred pseudotime axis. r, Violin plots showing the enrichment of C2 and C3 signatures (adopted from Extended Data Fig. 1m) for the indicated groups of cells.
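
The enrichment statistics described for panel g (hypergeometric test followed by Benjamini–Hochberg adjustment) can be illustrated with a minimal sketch. The legend does not state which software was used, so this is not the authors' implementation; the function names `hypergeom_pvalue` and `benjamini_hochberg` and all gene counts are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_pvalue(k, M, n, N):
    """Upper-tail probability P(X >= k) for a hypergeometric draw.

    M: genes in the background, n: background genes annotated to the term,
    N: signature genes tested, k: signature genes annotated to the term.
    """
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Illustrative counts only (not taken from the study):
    # (overlap k, background M, term size n, signature size N)
    terms = {
        "hypothetical term A": (12, 2000, 60, 100),
        "hypothetical term B": (4, 2000, 80, 100),
        "hypothetical term C": (9, 2000, 40, 100),
    }
    raw = [hypergeom_pvalue(*counts) for counts in terms.values()]
    for name, p, q in zip(terms, raw, benjamini_hochberg(raw)):
        print(f"{name}: P = {p:.3g}, adjusted P = {q:.3g}")
```

In this sketch the dot size in a plot like panel g would correspond to the overlap count `k` and the dot color to the adjusted P value.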