Introduction

Red blood cells (RBCs) fulfill the crucial role of exchanging oxygen for CO2 in the human body and are continually produced in the bone marrow in a process termed erythropoiesis1. Erythropoiesis progresses via at least two distinct differentiation phases2. In the first phase, multipotent hematopoietic stem and progenitor cells (HSPCs) gradually undergo commitment to the erythroid lineage and produce erythroid progenitors capable of limited self-renewal. In the second phase, known as terminal erythroid maturation, proerythroblasts (proE) undergo a series of cell divisions, differentiating sequentially into basophilic erythroblasts (basoE), polychromatic erythroblasts (polyE), and orthochromatic erythroblasts (orthoE). OrthoE shed their nuclei, giving rise to reticulocytes, which develop into mature RBCs, also known as erythrocytes.

Erythropoiesis is regulated by external signaling molecules that influence the ongoing intracellular processes responsible for RBC production3. While much is known about the external signaling molecules important for RBC production, the repertoire of genes that regulate terminal erythroid differentiation remains only partly defined. Several genes have already been identified as essential for RBC development through loss-of-function studies in the mouse or in human erythroid cells in vitro. However, these studies have been limited in scope to a small number of genes that were already suspected or known to influence RBC production. Since there are ~10,000 genes expressed in erythroid cells4, it was unclear how many of them are functionally required for normal erythroid development.

In an effort to identify new genes that regulate erythroid development, genome-wide association studies (GWAS) were performed, identifying thousands of variants associated with RBC traits5,6,7. To validate the GWAS findings, targeted small-scale loss-of-function screens linked genetic perturbations in genes initially nominated by GWAS with disrupted erythropoiesis8. This approach validated the contributions of 77 different genes to human erythroid development. However, 93% of the genetic variants identified by GWAS are located in non-coding regions5, which creates considerable uncertainty in ascribing phenotypes to any specific gene amongst all nearby genes that are in linkage disequilibrium.

Recently, a CRISPR knock-out screen examined the influence of chromatin factors on murine hematopoietic differentiation9. This targeted screen (interrogating 680 genes) was performed in vitro and was followed by a focused perturb-seq experiment (interrogating 80 genes) in vivo9. This study identified candidates with roles in HSPC maintenance or in priming progenitor cells along different lineage trajectories9,10. Though powerful, similar in vivo screening strategies can only be performed at a small scale due to limitations arising from genetic modification and transplantation of HSPCs into recipient mice.

Here, we performed a genome-scale CRISPR knock-out screen in human erythroid progenitors to comprehensively identify genes that are required for terminal erythroid development. This screen was followed by a targeted secondary screen dedicated to validating candidate genes revealed in the original genome-wide analysis. Through this approach, we identified 277 genes that are required for terminal erythroid differentiation. The top ranked genes from the screen included previously characterized regulators of erythroid development as well as several genes with no prior documented involvement in this process. Further validation studies uncovered a requirement for VAC14, a member of the PIKfyve protein kinase complex, in erythropoiesis in vitro and in vivo, through hematopoietic stem cell transplantation experiments in mice. The results of the CRISPR screen reported here are included on a website (https://cdb-rshiny.med.umich.edu/Khoriaty_Erythropoiesis/), which we believe will serve as an important asset to investigators interrogating hematopoiesis.

Results

A genome-scale CRISPR knock-out screen identifies regulators of erythroid development

To identify the repertoire of genes that are required for erythroid differentiation, we developed a genome-scale pooled CRISPR knock-out screen, in which we deleted one gene per erythroid cell, and evaluated the impact of the mutation on erythroid differentiation. This screen was performed using the human erythroid cell line, HUDEP-2. HUDEP-2 cells are maintained at the proE stage when cultured in ‘maintenance media’, but undergo relatively synchronous differentiation when cultured in ‘differentiation media’, reaching the orthoE stage after 12 days of differentiation (Fig. 1A, B).

Fig. 1: Genome-scale CRISPR knock-out screen in HUDEP-2 cells.

A Cytospin images of HUDEP-2 cells at differentiation days 0, 4, 8, and 12. Images are representative of 10 independent experiments. B Expression of CD233 and CD49d revealed by flow cytometry in HUDEP-2 cells before differentiation and at days 8 and 12 of differentiation. C Strategy of genome-scale CRISPR knock-out screen, in which HUDEP-2 cells were transduced with the h-GeCKOv2 lentiviral library, cultured for 9 days in maintenance media, and subsequently differentiated for 12 days. Cells were collected prior to differentiation, and live CD49dlow cells were sorted at day 12 of differentiation. D Schematic representation of relative sgRNA abundance in library, HUDEP-2 cells prior to differentiation, and HUDEP-2 cells at day 12 of differentiation. Similar to common essential genes, sgRNAs targeting erythroid essential genes are depleted in day 0 cells compared to library, while sgRNAs targeting genes important for erythroid differentiation are depleted in day 12 compared to day 0 HUDEP-2 cells. E, F Volcano plots displaying MAGeCK gene-level enrichment scores on the y-axis and average log2-fold change representing abundance of all sgRNAs targeting each gene on the x-axis. Genes implicated in CDA and DBA are highlighted for (E) day 0 versus library analysis and (F) day 12 versus day 0 analysis. Average log2-fold change of sgRNA abundance for common essential genes, GWAS-nominated genes, and genes at various expression levels for both (G) day 0 versus library analysis and (H) day 12 versus day 0 analysis. CPM = counts per million reads. Source data are provided in Source Data file.

To perform the screen, we expanded and packaged the GeCKOv2 genome-scale CRISPR knockout library into lentiviral particles. Each viral particle from this library delivers one of 123,411 single-guide RNAs (sgRNAs) targeting virtually all the genes in the human genome (6 sgRNA/gene), Cas9, and a puromycin resistance cassette. HUDEP-2 cells cultured in maintenance media were transduced with the GeCKOv2 lentiviral library at a multiplicity of infection (MOI) of 0.3–0.5; this low MOI allowed the majority of transduced cells to receive only one sgRNA in order to mutate a single gene per cell (Fig. 1C). Transduced cells were selected with puromycin and maintained at the proE stage for 9 days to allow for turnover of proteins encoded by deleted genes and for negative selection of cells with deletions in essential genes (Fig. 1C). Surviving HUDEP-2 cells were then differentiated for 12 days. Nine days post-transduction, cells were collected at the onset of differentiation (day 0, D0). Additionally, orthochromatic cells were sorted at D12 of differentiation on the basis of CD49d (α4 integrin) downregulation (Fig. 1C).
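The rationale for a low MOI can be illustrated with a short calculation. The sketch below assumes that the number of viral integrations per cell follows a Poisson distribution with mean equal to the MOI; this is a common modeling assumption and is not stated in the text. The function names are illustrative only.

```python
import math

def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k integrations (Poisson assumption)."""
    return math.exp(-moi) * moi**k / math.factorial(k)

def single_integrant_fraction(moi: float) -> float:
    """Among transduced cells (k >= 1), the fraction carrying exactly one sgRNA."""
    p_transduced = 1.0 - poisson_pmf(0, moi)
    return poisson_pmf(1, moi) / p_transduced

# MOI range quoted in the text
for moi in (0.3, 0.5):
    frac = single_integrant_fraction(moi)
    print(f"MOI {moi}: {frac:.0%} of transduced cells carry a single sgRNA")
```

Under this assumption, roughly 86% (MOI 0.3) to 77% (MOI 0.5) of transduced cells would carry a single sgRNA, consistent with the statement that the majority of transduced cells receive only one guide.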

This screen was performed in biological triplicate, and importantly, coverage was maintained at >200x (>200 cells/sgRNA) throughout, which was achieved by transducing 75 × 10^6 cells with the genome-scale CRISPR library and by maintaining >25 × 10^6 cells in culture at all times after puromycin selection of transduced cells.
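The coverage figure follows from simple arithmetic on the library size and the number of cells kept in culture. A minimal sketch, using only the numbers quoted in the text (the constant and function names are illustrative):

```python
LIBRARY_SIZE = 123_411      # sgRNAs in the genome-scale library, as stated in the text
CELLS_MAINTAINED = 25e6     # lower bound on cells kept in culture after selection

def coverage(cells: float, n_sgrnas: int) -> float:
    """Average number of cells representing each sgRNA."""
    return cells / n_sgrnas

def cells_needed(target_coverage: int, n_sgrnas: int) -> int:
    """Minimum number of cells required to reach a target coverage."""
    return target_coverage * n_sgrnas

maintained = coverage(CELLS_MAINTAINED, LIBRARY_SIZE)
print(f"Maintained coverage: ~{maintained:.0f} cells/sgRNA")
print(f"Cells needed for 200x: {cells_needed(200, LIBRARY_SIZE):,}")
```

Keeping at least 25 million cells in culture corresponds to slightly more than 200 cells per sgRNA, which matches the stated >200x threshold.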

Analysis of the genome-scale CRISPR knock-out screen

By assessing the abundance of sgRNAs in the lentiviral CRISPR library, at D0 proE, and D12 orthoE, we sought to identify genes that are essential for proE survival (designated as ‘erythroid essential’) and genes that are required for differentiation from proE to orthoE (designated as ‘required for differentiation’). sgRNAs targeting essential erythroid genes are expected to be depleted in D0 proE compared to the CRISPR library (Fig. 1D), while sgRNAs targeting genes required for erythroid differentiation are expected to be reduced at D12 when compared to D0 (Fig. 1D).
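The two comparisons described above (D0 versus library, and D12 versus D0) can be sketched as a log2-fold change of normalized sgRNA abundance. This is a simplified illustration only: it uses counts-per-million normalization with a pseudocount, and it does not reproduce MAGeCK's own normalization or ranking statistics. The guide names and read counts are made up for the example.

```python
import math

def cpm(counts: dict[str, int]) -> dict[str, float]:
    """Normalize raw read counts to counts per million reads."""
    total = sum(counts.values())
    return {guide: c * 1e6 / total for guide, c in counts.items()}

def log2_fold_change(after: dict[str, int], before: dict[str, int],
                     pseudocount: float = 1.0) -> dict[str, float]:
    """log2(after / before) per sgRNA on CPM-normalized counts."""
    a, b = cpm(after), cpm(before)
    return {g: math.log2((a[g] + pseudocount) / (b[g] + pseudocount)) for g in b}

# Hypothetical read counts for three guides (not data from the study)
library = {"sgEssential": 500, "sgDiff": 500, "sgNeutral": 500}
day0 = {"sgEssential": 50, "sgDiff": 475, "sgNeutral": 475}
day12 = {"sgEssential": 10, "sgDiff": 90, "sgNeutral": 900}

lfc_d0_vs_library = log2_fold_change(day0, library)   # flags 'erythroid essential'
lfc_d12_vs_d0 = log2_fold_change(day12, day0)         # flags 'required for differentiation'

for guide in library:
    print(guide, round(lfc_d0_vs_library[guide], 2), round(lfc_d12_vs_d0[guide], 2))
```

In this toy example, the guide labeled sgEssential is already depleted at D0 relative to the library, whereas sgDiff is retained at D0 and lost only by D12, mirroring the two categories defined in the text.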

Analysis of the genome-scale CRISPR knockout screen demonstrated that sgRNAs targeting well-known regulators of erythropoiesis, such as EPOR, GATA1 and KLF1 were underrepresented in D0 proE in comparison to the CRISPR library, as predicted (Fig. 1E). Similarly, sgRNAs targeting genes known to be required for heme biosynthesis, such as HMBS, UROD and ALAS2, were depleted in D12 orthoE when compared to D0 proE (Fig. 1F).

We analyzed the set of genes that are known to be mutated in congenital dyserythropoietic anemia (CDA), a group of disorders characterized by anemia that results, at least in part, from ineffective erythropoiesis11. Certain CDA genes such as CDIN1 (previously denoted C15ORF41) and KIF23 appeared to be erythroid essential in this screen, while others, such as SEC23B and RACGAP1 appeared to be required for erythroid differentiation (Fig. 1E, F, Supplementary Fig 1A, B, and Supplementary table 1). We performed similar analyses for genes that are commonly mutated in Diamond-Blackfan Anemia (DBA)12. As expected, DBA genes appeared to be erythroid essential in the genome-scale screen13 (Fig. 1E, F, Supplementary Fig 1C, D, and Supplementary table 1). A minority of genes implicated in CDA or DBA trended towards being erythroid essential or required for erythroid differentiation, but did not reach statistical significance in the genome-scale screen. This finding could be attributed to variabilities in individual sgRNA targeting efficiencies (Supplementary Fig 1A–D), as we demonstrated for other genes (Supplementary Fig 1E). Taken together, these findings validate this CRISPR screening strategy.

We next identified the list of “common essential” genes, defined by their viability requirement in more than 90% of cell lines included in the Dependency Map Consortium database14,15,16,17, and examined a requirement for these genes in terminal erythroid maturation. Analysis was stratified to define subsets of the “common essential” genes that are erythroid essential or required for erythroid differentiation. We performed a similar analysis for genes nominated by GWAS to be associated with RBC counts or hemoglobin levels18. All remaining genes were stratified by their expression levels in erythroid cells19 and analyzed for their requirements for erythropoiesis, as above.

As expected, “common essential” genes were the most required in terminal erythroid maturation (Fig. 1G, H). Notably, not all “common essential” genes were erythroid essential, an observation that could be attributable either to variable efficiencies of the sgRNAs included in the genome-scale screen (Supplementary Fig 1E) or to differences between the cancer cell lines included in the Dependency Map Consortium versus HUDEP-2 cells. Genes that are more abundantly expressed in erythroid cells were more likely than genes expressed at lower levels to be erythroid essential or required for erythroid differentiation (Fig. 1G, H), suggesting a correlation between expression level and functional relevance in erythropoiesis.

Validation of the genome-wide CRISPR screen results

While CRISPR screens may be limited by variabilities in individual sgRNA deletion efficiencies, this shortcoming can be mitigated by using a larger number of sgRNAs targeting each gene. As such, we excluded previously identified core fitness genes20,21 and designed a custom CRISPR-knockout library containing 10 sgRNAs per gene targeting the top 1100 candidate positive regulators of erythropoiesis nominated by the results from the genome-scale screen (300 candidate erythroid essential genes and 800 candidate genes required for differentiation), as well as 500 non-targeting (control) sgRNAs.

The new sgRNA pool was cloned into LentiCRISPRv222 and the pooled plasmid library was deep sequenced, confirming the presence and adequate relative abundance of all sgRNAs (Supplementary Fig 2). Following packaging into lentiviral particles, the secondary screen was performed in a manner similar to the genome-scale screen with 2 changes (Fig. 2A). First, day 12 orthoE were selected on the basis of BAND3 (CD233) expression and CD49d downregulation (i.e., CD49dloCD233+) (Fig. 2A). Second, coverage was maintained at >1000x (>1000 cells/sgRNA) throughout the secondary screen. The high depth of coverage combined with the increased number of sgRNAs/gene was intended to enhance the statistical confidence in the results of this validation CRISPR screen. The secondary screen was performed in biological triplicate. sgRNA abundance was compared among the secondary CRISPR library, D0 proE, and D12 orthoE, to identify erythroid essential genes and genes required for differentiation, as described above (Fig. 1D).

Fig. 2: Secondary targeted CRISPR knock-out screen in HUDEP-2 cells.

A Overview of the secondary (validation) CRISPR knock-out screen, in which HUDEP-2 cells were transduced with the custom targeted lentiviral library, cultured for 9 days in maintenance media, and subsequently differentiated for 12 days. Cells were collected prior to differentiation, and live CD49dlow CD233+ cells were sorted at day 12 of differentiation. sgRNA abundance was compared among the secondary CRISPR library and the latter two cell populations. Volcano plots displaying MAGeCK gene-level enrichment scores on the y-axis and average log2-fold change of sgRNA abundance of each gene on the x-axis for (B) day 0 versus library analysis and (C) day 12 versus day 0 analysis. Improved False Discovery Rates (FDR) in the secondary versus genome-scale screens for several characterized (D) erythroid essential genes and (E) genes required for erythroid differentiation. Improved FDRs in the targeted compared to genome-scale screen for virtually all genes included in both screens, for both (F) day 0 versus library analysis and (G) day 12 versus day 0 analysis. Source data are provided in Source Data file.

Comparison between the initial genome-wide screen and the secondary validation screen

Analysis of the secondary screen showed that well-characterized regulators of erythropoiesis such as EPOR, GATA1, MYB, and JAK2 were among the highest ranked genes whose sgRNAs were depleted in D0 proE compared to the secondary library (Fig. 2B). Similarly, sgRNAs that disrupted genes known to be important for erythroid maturation such as ALAS2, ALAD, and ZFPM1, were highly depleted in D12 orthoE compared to D0 proE (Fig. 2C).

To ascertain whether the secondary screen performed better than the genome-scale screen at distinguishing true hits from random background noise, we first selected several well-defined regulators of erythropoiesis and compared the statistical confidence (False Discovery Rate; FDR) achieved for each gene in both screens. Notably, the FDRs for genes included in this analysis were overall lower in the secondary compared to the genome-wide screen, both for erythroid essential genes (Fig. 2D) and for genes required for erythroid differentiation (Fig. 2E).

Similarly, and on a broader level, FDRs for all genes incorporated in the secondary library were compared between the genome-scale and secondary screens. This analysis demonstrated a pattern overall consistent with improved FDRs in the secondary screen, particularly for the genes tested for their requirement for erythroid differentiation (Fig. 2F, G). Taken together, these results suggest that the validation screen indeed improved the statistical power to identify regulators of erythropoiesis when compared to the genome-scale screen.

Comparison between erythroid genes nominated by CRISPR screens versus GWAS

Numerous genes have been previously implicated in erythropoiesis by GWAS in which single nucleotide variants were found to associate with various RBC traits. We selected the set of genes that were nominated by GWAS to be associated with RBC count or hemoglobin concentration (NHGRI-EBI GWAS Catalog). One hundred and fifty-nine of these GWAS-identified genes were included in the secondary library, 20 and 88 of which were validated (at FDR < 0.05) as either erythroid essential or required for differentiation, respectively.

A recently published study identified 412 GWAS-nominated candidate genes with potential roles in erythropoiesis. This elegant study evaluated the impact of downregulating these genes using a short hairpin RNA (shRNA) screen in CD34+ HSPCs that were then differentiated into erythroid cells in vitro8. In this shRNA screen, 77 of the 412 genes were found to have an impact on erythroid differentiation. In comparison, 69 or 177 of these 412 genes appeared to impact erythropoiesis in the CRISPR screen reported here, using MAGeCK statistical cutoffs of 0.01 or 0.1, respectively. Notably, the majority of the GWAS-nominated genes that were identified as putative regulators of erythropoiesis in our CRISPR screens (51 of 69 genes or 137 of 177 genes, at MAGeCK statistical cutoffs of 0.01 or 0.1, respectively) did not appear to validate in the published shRNA screen (Fig. 3A), suggesting that the two screening strategies are complementary.
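The proportions behind this comparison can be made explicit with a few lines of arithmetic. The sketch below uses only the counts quoted in the paragraph above; the helper name is illustrative.

```python
def share(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole

# Counts taken from the text
shrna_validated = share(77, 412)        # genes validated by the published shRNA screen
crispr_only_strict = share(51, 69)      # CRISPR hits (cutoff 0.01) not validated by shRNA
crispr_only_relaxed = share(137, 177)   # CRISPR hits (cutoff 0.1) not validated by shRNA

print(f"shRNA-validated fraction of GWAS candidates: {shrna_validated:.1f}%")
print(f"CRISPR hits absent from shRNA hits (0.01):   {crispr_only_strict:.1f}%")
print(f"CRISPR hits absent from shRNA hits (0.1):    {crispr_only_relaxed:.1f}%")
```

At either cutoff, roughly three quarters of the GWAS-nominated CRISPR hits were not recovered by the shRNA screen, which is the basis for describing the two approaches as capturing largely different gene sets.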

Fig. 3: GWAS and genome-scale CRISPR screens provide complementary approaches to identify regulators of erythropoiesis.

A Number of candidate genes nominated to have a role in erythropoiesis by either GWAS or the CRISPR knock-out screen performed in HUDEP-2 cells. ES Enrichment Score. B Number of genes identified in the CRISPR screen as erythroid essential (EE) or required for differentiation (RD), and their distributions among common essential genes, GWAS-nominated genes, and others. Shared genes denotes genes that are both common essential and nominated by GWAS. Data shown at 2 statistical cutoffs: FDR < 0.05 and FDR < 0.01 + log2-fold change < -1.

Several factors may explain the differences between the findings of these diverse screens. Genes that were identified as regulators of erythropoiesis in the HSPC screen but not in the HUDEP-2 screen may impact erythropoiesis at an earlier stage of erythroid differentiation (such as BFU-E or CFU-E cells) or may impact erythropoiesis in a non-erythroid cell autonomous manner. In contrast, the screen performed in HUDEP-2 cells may uniquely identify genes that impact terminal erythroid differentiation but that otherwise may result in lethality in early progenitor cells (such as multipotent progenitors) in HSPC cultures. Additionally, differences in shRNA-mediated mRNA down-regulation versus CRISPR/Cas9-mediated gene inactivation may account for some of the differences between these screens.

Based on the present screen, we estimate that 40 genes are erythroid essential genes and that 237 genes are required for erythroid differentiation (using a log2 fold change of <−1 and FDR < 0.01 as cutoffs) (Fig. 3B). Importantly, 83% (FDR < 0.01) of the genes that were identified in the current CRISPR genome-scale screen as candidate regulators of erythropoiesis were not nominated by GWAS. Therefore, the unbiased genome-wide screen undertaken here identified new regulators of erythropoiesis in addition to previously known and GWAS-nominated genes.
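The hit-calling rule stated above (log2 fold change < −1 and FDR < 0.01) amounts to a two-condition filter on gene-level results. A minimal sketch follows; the gene names and values are placeholders rather than results from the screen, and the record layout is an assumption made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GeneResult:
    gene: str
    log2_fc: float   # average log2-fold change of the gene's sgRNAs
    fdr: float       # gene-level false discovery rate

def call_hits(results: list[GeneResult],
              max_log2_fc: float = -1.0,
              max_fdr: float = 0.01) -> list[str]:
    """Return genes passing both cutoffs (strong depletion AND low FDR)."""
    return [r.gene for r in results
            if r.log2_fc < max_log2_fc and r.fdr < max_fdr]

# Placeholder values for illustration only
toy_results = [
    GeneResult("GENE_A", -2.3, 0.001),   # passes both cutoffs
    GeneResult("GENE_B", -0.4, 0.0005),  # significant but weakly depleted
    GeneResult("GENE_C", -1.8, 0.2),     # depleted but not significant
    GeneResult("GENE_D", -1.2, 0.009),   # passes both cutoffs
]

hits = call_hits(toy_results)
print(hits)
```

Requiring both conditions excludes genes with a small but statistically confident effect as well as genes with a large but noisy effect, which is why the resulting estimate (40 + 237 = 277 genes) is a conservative one.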

Discovery of erythroid differentiation genes I: NHLRC2 expression is required in human erythroid progenitors

Surprisingly, several genes previously implicated as contributing to hemolytic anemia were identified in the CRISPR screen as required for erythroid differentiation23. We sought to validate the requirement of one such gene, NHLRC2, for erythroid differentiation (Fig. 4A). Loss of NHLRC2 in HUDEP-2 cells, using any of four shRNAs, resulted in cell death before differentiation induction (Fig. 4B).

Fig. 4: NHLRC2 is required for erythroid differentiation.

A Mean sgRNA abundance in library, HUDEP-2 cells prior to differentiation (day 0), and HUDEP-2 cells at day 12 of differentiation, as recovered from the genome-scale CRISPR screen. B Fold change in counts of HUDEP-2 cells grown in maintenance media following transduction with one of 4 shRNAs targeting NHLRC2 (resulting in 40–80% reduction in NHLRC2 mRNA) or scramble control shRNA. Statistical analysis was performed using two-way ANOVA followed by Dunnett’s multiple comparison test (n = 3 biological replicates per condition, data represented as mean ± standard deviation). C Differentiation strategy of CD34+ HSPCs into erythroid cells in vitro. D Fold change in counts of erythroid cells differentiated from CD34+ HSPCs following transduction with one of 4 shRNAs targeting NHLRC2 or scramble control shRNA. Statistical analysis was performed compared to scramble control, using two-way ANOVA followed by Dunnett’s multiple comparison test (n = 3 biological replicates per condition, data represented as mean ± standard deviation). Erythroid differentiation of HSPCs assessed by flow cytometry following transduction with one of 4 shRNAs targeting NHLRC2 or scramble control shRNA at (E) day 14 and (F) day 18 of differentiation. Live cells were analyzed for CD233 and CD49d expression. Statistical analysis was performed using one-way ANOVA followed by Dunnett’s multiple comparison test (n = 3 biological replicates per condition, data represented as mean ± standard deviation). G Cytospin images of day 18 erythroid cells differentiated from CD34+ HSPCs following transduction with one of 4 shRNAs targeting NHLRC2 or scramble control shRNA. Images are representative of 4 independent experiments. ns, not significant; *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001. Source data are provided in Source Data file.

We subsequently verified a requirement for NHLRC2 in primary human CD34+ cells undergoing erythroid differentiation (Fig. 4C). Knockdown of NHLRC2 using one of four shRNAs resulted in reduced cell numbers in culture (Fig. 4D and Supplementary Fig 3A, B) and reduced numbers of BAND3+ erythroid cells at day 14 of differentiation (Fig. 4E). The reduction in cell number can be attributed to increased apoptosis (Supplementary Fig 3C–F). At day 18 of differentiation, the surviving NHLRC2-depleted cells appeared less differentiated than wild-type cells, based on α4 integrin staining (Fig. 4F) and morphologic analysis (Fig. 4G). We next transduced CD34+ HSPCs cultured in expansion media with NHLRC2-targeting or scramble control shRNA, and subsequently subjected HSPCs to erythroid differentiation. NHLRC2 downregulation resulted in an erythroid defect at the progenitor stage (Supplementary Fig 4). Altogether, these results demonstrate that NHLRC2 is required for efficient erythroid differentiation.

Discovery of erythroid genes II: VAC14 is required for human erythroid maturation

Among genes that are essential in a small proportion of cell lines (<5% of 1095 cell lines examined by the DepMap project), 107 were predicted to be required for erythropoiesis by the CRISPR screen reported here (using a log2 fold change of <−1 and FDR < 0.01 as cutoffs), including VAC14 (Fig. 5A). To evaluate the role of VAC14 in human erythroid maturation, we first transduced HUDEP-2 cells with one of three VAC14-targeting shRNAs. Compared to cells transduced with scramble control shRNAs, down-regulation of VAC14 in HUDEP-2 led to decreased cell proliferation (Fig. 5B).

Fig. 5: VAC14 is required for human erythroid differentiation.

A Mean sgRNA abundance in library, HUDEP-2 cells prior to differentiation, and HUDEP-2 cells at day 12 of differentiation, obtained from the secondary custom CRISPR screen. B Fold change in counts of HUDEP-2 cells grown in maintenance media following transduction with one of 3 shRNAs targeting VAC14 (resulting in 50–70% reduction in VAC14 mRNA levels) or scramble control shRNA. Statistical analysis was performed compared to scramble control, using two-way ANOVA followed by Dunnett’s multiple comparison test (n = 3 biological replicates per condition, data represented as mean ± standard deviation). C Fold change in counts of erythroid cells differentiated from CD34+ HSPCs following transduction with one of 3 shRNAs targeting VAC14 or scramble control shRNA. Statistical analysis was performed using two-way ANOVA followed by Dunnett’s multiple comparison test (n = 3 biological replicates per condition, data represented as mean ± standard deviation). D Erythroid differentiation as assessed by flow cytometry. E Cytospin images of day 18 erythroid cells differentiated from HSPCs following transduction with one of 3 shRNAs targeting VAC14 or scramble control shRNA. Images are representative of 3 independent experiments. ***p < 0.001; ****p < 0.0001. Source data are provided in Source Data file.

To further define the role of VAC14 in human erythroid cells, we transduced CD34+ HSPCs with one of three VAC14-targeting shRNAs and then induced erythroid differentiation. VAC14 downregulation resulted in reduced erythroid cell numbers (Fig. 5C and Supplementary Fig 5A, B) and increased apoptosis (Supplementary Fig 5C–F). Additionally, whereas the majority of wildtype or scramble shRNA-transduced cells were polychromatic erythroblasts by day 18 of differentiation, VAC14-downregulated cells exhibited delayed differentiation, as shown by reduced CD49d downregulation on flow cytometry (Fig. 5D) and by morphology on cytospin evaluation (Fig. 5E). VAC14 down-regulation also resulted in an erythroid defect at the progenitor stage (Supplementary Fig 5G). These findings suggest a role for VAC14 in erythroid maturation.

Vac14 is required for murine erythroid cell maturation

We next sought to define the role of Vac14 in erythropoiesis in vivo. Since Vac14 null mice exhibit perinatal lethality24, we harvested fetal liver cells from E15.5 Vac14 null (or wildtype control) murine embryos and transplanted them into lethally irradiated adult recipient mice (Fig. 6A).

Fig. 6: Vac14 is required for mouse erythropoiesis.

A Lethally irradiated WT mice were transplanted with Vac14−/− or WT fetal liver cells. B Absolute numbers of terminally differentiated erythroid cells in bone marrows of recipient mice at 23–26 weeks post-transplantation. Stages I to V correspond to proE, basoE, polyE, orthoE, and reticulocytes/erythrocytes, respectively. C Red blood cell counts, hemoglobin levels, and mean corpuscular volume (MCV) in recipients of Vac14−/− or WT fetal liver cells at weeks 9–10 or 23–26 post-transplantation. D Spleen weights of mice at 23–26 weeks following transplantation with Vac14−/− or WT fetal liver cells. E Absolute numbers of terminally differentiated erythroid cells in spleens of recipient mice at 23–26 weeks post-transplantation. Statistical analysis for (B–E) was performed using unpaired t-tests without correction for multiple comparisons (n = 3 WT mice and n = 4 Vac14−/− mice, data represented as mean ± standard deviation). F Histology of sternum bone marrows and spleens harvested from recipients of Vac14−/− fetal liver cells demonstrating decreased hematopoiesis and large intracellular vacuoles. G Vacuolated erythroid progenitors (proE) present in bone marrows of mice transplanted with Vac14−/− fetal liver cells. Images in (F, G) are representative of 2 independent experiments. Arrows indicate erythroid progenitors. ns, not significant; *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001. Source data are provided in Source Data file.

Transplant recipients of Vac14 null fetal liver cells demonstrated a defect in erythroid development, with a significant reduction in absolute numbers of proE, basoE, polyE, and orthoE in the bone marrow (Fig. 6B and Supplementary Fig 6A). Despite a significant reduction in terminally differentiated erythroid cells in the bone marrow, mice transplanted with Vac14 null fetal liver cells did not exhibit anemia (Fig. 6C), presumably due to the massive splenomegaly observed in these mice (Fig. 6D) that is associated with expanded splenic extramedullary erythropoiesis (Fig. 6E). Additional peripheral blood counts and flow cytometric analyses of bone marrow and spleen hematopoiesis are summarized in Supplementary Fig 6B–F and Supplementary Fig 7A–I.

On histological evaluation, bone marrows of mice transplanted with Vac14-null HSCs were diffusely infiltrated with a population of large vacuolated cells, morphologically consistent with phagocytically active macrophages (Fig. 6F). These cells frequently contained intracytoplasmic erythrocytes, consistent with hemophagocytosis. Cytoplasmic vacuoles were also noted in erythroid cells (Fig. 6G) with no apparent stage predilection. The number of myeloid lineage cells was decreased overall, but their maturation appeared normal, even though cytoplasmic vacuoles were noted in these cells as well (Supplementary Fig 6C–E). Megakaryocytes were reduced in numbers but did not exhibit vacuolation.

Histologic evaluation of spleens harvested from mice transplanted with Vac14-null HSPCs demonstrated markedly enlarged spleens, with diffuse expansion of the red pulp by histiocytes (macrophages) and by extramedullary hematopoietic cells of erythroid and granulocytic lineages (Fig. 6F and Supplementary Fig 8). The vacuoles within tissue resident macrophages stained negatively for lipids (by Sudan Black), lipid droplet proteins (perilipin-1 stain), and polysaccharides (Periodic acid-Schiff stain), while a moderate number of vacuolated macrophages stained positively for iron (Perls’ stain) to varying degrees, further supportive of hemophagocytosis (Supplementary Fig 9).

Discussion

In this report, we designed an unbiased genome-wide CRISPR knockout screen to define the repertoire of genes that are required for erythroid differentiation. Since the erythroleukemia cell lines K562 and HEL exhibit poor erythroid differentiation capacity and since several well established erythroid essential genes (such as EPOR, KLF1, and ALAS2) are dispensable in both these cell lines14,15,16,25, we elected to perform the screen in HUDEP-2 cells, which represent human erythroid cells more accurately. The HUDEP-2 cell line is HSPC-derived26 and undergoes semi-synchronous erythroid differentiation, faithfully recapitulating key aspects of human terminal erythroid maturation, including hemoglobinization and chromatin condensation27. Though primary HSPCs can be efficiently differentiated into erythroid cells in vitro, their differentiation using standard differentiation protocols is asynchronous, and HSPCs are difficult to obtain in sufficiently large numbers to perform a genome-scale CRISPR screen.

The screen performed in this report was designed to identify genes that are essential for erythroid survival (erythroid essential genes) as well as genes required for erythroid differentiation. Reassuringly, several genes known to be required for the survival of erythroid progenitors (such as EPOR, GATA1, and MYB) were among the top ranked erythroid essential genes in this screen. Similarly, as expected, genes encoding heme biosynthesis enzymes (such as ALAS2, ALAD, HMBS, and PPOX) were among the top ranked genes that are needed for erythroid differentiation. These findings provided confidence in the results of the screen. Importantly, based on this screen, we estimate that 40 and 237 genes are erythroid essential genes or genes required for erythroid differentiation, respectively (using a log2 fold change of <−1 and FDR < 0.01 as cutoffs), with the majority of these genes not previously implicated in erythropoiesis.

Using GWAS that aimed to associate genomic variants with RBC traits, numerous candidate genes have been nominated to play important roles in erythropoiesis. However, assigning GWAS-identified variants to target genes is challenging, since ~93% of variants associated with blood cell traits reside in non-coding regions5. Additionally, validating hundreds of thousands of variants for their roles in regulating erythroid traits would be extraordinarily challenging, since each variant may regulate several genes and each gene could be regulated by numerous variants. Notwithstanding these complications, previous efforts evaluated 412 genes nominated by GWAS for their role in erythropoiesis, demonstrating that 77 of them appear to regulate erythroid development8. The relatively low percentage of genes that validated could be due to misassignment of genes that correspond to non-coding variants identified by GWAS. Unlike GWAS, the unbiased CRISPR knock-out screen reported here was designed to target virtually all protein-coding genes in the human genome, and is thus poised to identify genes that regulate erythroid differentiation without prior knowledge or implication of any connection between the function of these genes and erythroid differentiation. Indeed, the secondary screen reported here identified 277 genes that appear to play a role in erythropoiesis, with most of these genes (~85%) not previously implicated in erythroid differentiation by GWAS. Thus, the present study demonstrates that CRISPR screens represent a powerful complement to GWAS for identifying genes that are required for erythroid development.

The primary and secondary screens included 6 and 10 sgRNAs/gene, respectively, thereby reducing the impact of variable efficiencies of individual sgRNAs as well as off-target effects, and the analysis was performed at a conservative FDR. Nevertheless, as with any large-scale screen, validation assays are required prior to drawing firm conclusions about the role of any particular gene in erythropoiesis. Additionally, findings observed in HUDEP-2 cells need to be validated in erythroid cells differentiated from primary HSPCs or in vivo. As stated above, the fact that genes known to be critical for erythropoiesis, such as EPOR, GATA1, and hemoglobin biosynthesis genes, were among the top ranked genes in our screen provides confidence that the overwhelming majority of the findings reported here are likely to validate in primary erythroid cells. It is important to note that HUDEP-2 cells approximate the ProE stage of erythropoiesis, and therefore genes that may be critical for earlier erythroid progenitors but not for terminally differentiated erythroid cells are not expected to be identified in this screen.

Surprisingly, many genes previously implicated in hemolytic anemia were identified in this screen as essential for erythroid differentiation. For example, in 2018, a disease named FINCA was reported to result from mutations in NHLRC2, with affected individuals presenting with chronic hemolytic anemia among other symptoms23. To date, the function of NHLRC2 and its role in erythropoiesis remain to be elucidated28. NHLRC2 was one of the top hits in our screen, and functional validation demonstrated that downregulation of NHLRC2 resulted in delayed erythroid differentiation, a reduction in overall erythroid output, and increased apoptosis of differentiating erythroid cells. Consistent with a more recent report29, these data suggest that the anemia in FINCA may result at least in part from a defect in erythroid differentiation (possibly in combination with hemolytic anemia). Data from the CRISPR screen reported here may help classify erythroid disorders as resulting from erythroid differentiation defects versus other etiologies.

Of the genes that are essential in a small proportion (<5%) of cell lines examined by the DepMap project, 107 genes were predicted to be required for erythropoiesis by this CRISPR screen (using a log2 fold change of < −1 and FDR < 0.01 as cutoffs). We validated a requirement for one of these genes, VAC14, in both human and mouse erythroid development. VAC14 is a component of the PIKfyve complex, which is essential for endosomal trafficking to the trans-Golgi network30. The accumulation of large intracellular vesicles within VAC14 null erythroid progenitors suggests that failure to recycle endosomal content results in an erythropoiesis defect. Our data demonstrate that the utility of the CRISPR screen we report here extends to genes with no prior annotation or implication in erythropoiesis.

Defining genes that are critical for erythroid differentiation has additional implications. For example, knowledge of whether inhibition of a therapeutic target may have adverse effects on erythroid development may lead to a better, more restricted selection of potential therapeutic targets. Additionally, data from the CRISPR screen reported here may act as a reference to readily identify genes that, when mutated, might result in human erythroid disease. Furthermore, improved understanding of the genetic circuits that regulate erythropoiesis may result in enhanced RBC production in vitro with the ultimate goal to alleviate supply shortages for transfusion support. Therefore, to make our findings easily accessible, we have included the results of the CRISPR screen reported here on a website (https://cdb-rshiny.med.umich.edu/Khoriaty_Erythropoiesis/), which we hope will be valuable to investigators in this field.

Methods

Cell culture

The HUDEP-2 cell line was a gift from Y. Nakamura26. HUDEP-2 cells were cultured and differentiated as previously described31. Briefly, cells were maintained in maintenance media composed of StemSpan SFEM (StemCell Technologies, cat. 09650) supplemented with Epogen (3 U/mL; Amgen), recombinant human stem cell factor (50 ng/mL; R&D, cat. #255-SC-01M), dexamethasone (1 mM; Cayman, cat. #11015), doxycycline (1 µg/mL; Sigma, cat. #D9891), and penicillin-streptomycin (100 U/mL; ThermoFisher, cat. 15140122). Cells were maintained below 10^6 cells/mL and passaged every 48–72 h. HUDEP-2 cells were differentiated in erythroid differentiation media (EDM) composed of IMDM (Gibco, cat. 12440061), pooled human AB plasma (5%; Rhode Island Blood Center, cat. X00004), Epogen (3 U/mL; Amgen), recombinant human insulin (10 mg/mL; Sigma, cat. 91077 C), holo-transferrin (330 mg/mL; Gemini, cat. 800–131 P), heparin (2 U/mL; Sigma, cat. H3149), L-glutamine (2 mM; ThermoFisher, cat. 25030081), and penicillin-streptomycin (100 U/mL; ThermoFisher). From days 0–4 of differentiation, cells were grown in EDM supplemented with recombinant human stem cell factor (100 ng/mL) and doxycycline (1 µg/mL). From days 4–8 of differentiation, cells were cultured in EDM supplemented with doxycycline (1 µg/mL) without stem cell factor. From days 8–12 of differentiation, HUDEP-2 cells were maintained in EDM without doxycycline or stem cell factor.

CD34+ HSPCs were purchased from the Fred Hutchinson Cancer Research Center (Seattle, WA). HSPCs were differentiated in EDM supplemented with hydrocortisone (1 mM; StemCell Technologies, cat. 07904), recombinant human IL-3 (5 ng/mL; R&D, cat. 203-IL), and recombinant human stem cell factor (100 ng/mL; R&D, cat. 255-SC) from days 0–7 of differentiation. At day 7, hydrocortisone and recombinant human IL-3 were removed from the culture media. From days 11–18 of differentiation, cells were maintained in EDM without additional supplements. Cells were maintained at a concentration of 10^4–10^6 cells/mL from days 0 to 7, 10^5–10^6 cells/mL from days 7 to 11, and 2.5 × 10^5–10^6 cells/mL from days 11 to 18.

To evaluate early erythroid progenitors, CD34+ HSPCs were expanded in HSPC cytokine-rich medium consisting of SFEM (StemCell Technologies, cat. 09650) supplemented with StemSpan CC100 (StemCell Technologies, cat. 02690), TPO (100 ng/mL; Peprotech, cat. 300-18), and UM729 (350 nM; StemCell Technologies, cat. 72332) for 5 days to allow for selection of transduced cells prior to inducing erythroid differentiation, as described above.

HEK293T cells were cultured in Dulbecco’s modified Eagle’s medium (DMEM; Gibco, cat. 11965118) supplemented with 10% fetal bovine serum and were passaged every 2–3 days.

Lentiviral transduction

Lentiviral transduction of HUDEP-2 cells was performed as previously described32. Briefly, HUDEP-2 cells were mixed with lentivirus and polybrene (5 mg/mL; Sigma, cat. 107689) and centrifuged at 1000 g for 2 h at 33 °C. Following spinfection, cells were washed and resuspended in fresh HUDEP-2 expansion media. Human CD34+ HSPCs undergoing erythroid differentiation were mixed with lentivirus and polybrene (10 mg/mL; Sigma, cat. 107689), centrifuged at 1420 g for 2 h at 33 °C, incubated overnight at 37 °C (5% CO2), and then washed with fresh media.

Genome-Scale CRISPR knockout screen

The GeCKOv2 human whole-genome CRISPR library (Addgene, cat. 1000000048/1000000049; a gift from F. Zhang) was expanded and packaged into lentiviral particles as previously described33. Briefly, the library was expanded by electroporation into competent E. coli (Lucigen, cat. 60242) and plated on bioassay dishes (Corning, cat. 431301). Colonies were harvested and pooled, and plasmid DNA was isolated using the EndoFree Plasmid Maxi Kit (Qiagen, cat. 12362). Pooled plasmids were then deep sequenced, confirming minimal skewing during sgRNA expansion (90th/10th percentile ratio of 6.14). Pooled plasmids were then packaged into lentiviral particles by co-transfection with psPAX2 (Addgene, cat. #12260; a gift from D. Trono) and pCMV-VSV-G (Addgene, cat. #8454; a gift from R. Weinberg) into HEK293T cells using Lipofectamine LTX with PLUS reagent (ThermoFisher, cat. 15338100). Viral supernatants were collected and pooled prior to aliquoting and snap-freezing in liquid nitrogen.
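
The skew metric reported above is the ratio of the 90th to the 10th percentile of sgRNA read counts in the plasmid pool. The sketch below shows one way such a ratio could be computed, assuming a flat list of per-sgRNA read counts; the function name `skew_ratio` is hypothetical, and the percentile interpolation used by Python's `statistics.quantiles` may differ from the method used for the values reported here.

```python
import statistics


def skew_ratio(read_counts):
    """Ratio of the 90th to the 10th percentile of sgRNA read counts.

    A value of 1 means a perfectly uniform library; larger values mean
    that some sgRNAs are over-represented relative to others.
    """
    deciles = statistics.quantiles(read_counts, n=10)  # 9 cut points
    p10, p90 = deciles[0], deciles[-1]
    if p10 <= 0:
        raise ValueError("10th percentile is zero; ratio is undefined")
    return p90 / p10


# Made-up read counts for illustration only.
counts = [120, 95, 210, 180, 60, 150, 300, 75, 140, 110, 90, 250]
print(round(skew_ratio(counts), 2))
```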

1.2 × 10^8 HUDEP-2 cells were spinfected with pooled LentiCRISPRv2 virus at an MOI of 0.4 ± 0.1. Cells were then washed with PBS and centrifuged for 5 min at 350 g, resuspended in fresh media and incubated overnight. Cells were subsequently treated with puromycin (Sigma, cat. P8833) at a concentration of 0.33 mg/mL for 2–4 days until all non-transduced control cells treated with puromycin had died. HUDEP-2 cells were then cultured in maintenance media for 7 days and subsequently in differentiation media for 12 days. 4 × 10^7 cells were collected prior to differentiation (on Day 0 = D0) and 2.4 × 10^6 orthochromatic erythroblasts were sorted after 12 days of differentiation (D12) on the basis of α4 integrin reduction.

Cell numbers were maintained to keep sgRNA coverage at >200x throughout the entirety of the screen, which was performed in biological triplicates. Genomic DNA was isolated from cell populations using phenol-chloroform extraction. Isolated sgRNA sequences were amplified and sequenced as previously described33.
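
The relationship between cell number, MOI, and sgRNA coverage at the transduction step can be sketched as follows. This is an illustrative calculation only: it assumes that viral integrations per cell follow a Poisson distribution with mean equal to the MOI, and `LIBRARY_SIZE` is a hypothetical placeholder, since the number of sgRNAs in the library is not stated in this section.

```python
import math


def transduced_fraction(moi):
    """Fraction of cells with at least one integration (Poisson model)."""
    return 1.0 - math.exp(-moi)


def single_integrant_fraction(moi):
    """Among transduced cells, the fraction carrying exactly one sgRNA."""
    p_exactly_one = moi * math.exp(-moi)
    return p_exactly_one / transduced_fraction(moi)


def coverage_at_transduction(cells, moi, library_size):
    """Average number of transduced cells per sgRNA in the library."""
    return cells * transduced_fraction(moi) / library_size


LIBRARY_SIZE = 120_000  # hypothetical placeholder, not a value from the text

print(f"transduced: {transduced_fraction(0.4):.1%}")
print(f"single integrant: {single_integrant_fraction(0.4):.1%}")
print(f"coverage: {coverage_at_transduction(1.2e8, 0.4, LIBRARY_SIZE):.0f}x")
```

Under this model, a low MOI trades transduction efficiency for a higher proportion of cells carrying a single sgRNA, which keeps each cell's phenotype attributable to one gene.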

A custom secondary CRISPR knock-out screen

Following exclusion of core fitness genes (as previously defined20,21), a list of candidate genes was curated for further evaluation in a secondary screen. The list included the top 300 genes required for HUDEP-2 cell viability prior to the onset of differentiation, the top 800 genes required during erythroid differentiation, and genes with 2 or more sgRNAs exhibiting increased abundance at differentiation D12 versus D0. A custom secondary library was obtained from CustomArray as an oligonucleotide pool containing 10 sgRNAs targeting each gene, designed using the GPP sgRNA designer34,35, and 500 non-targeting sgRNAs. The oligo pool was cloned into pLentiCRISPRv2 (Addgene, cat. 52961; a gift from F. Zhang) by Gibson assembly (NEB, cat. E2611) as previously described36. The assembled vectors were then electroporated into Endura electrocompetent E. coli (Lucigen, cat. 60242-2), expanded, and sequenced as described above for the GeCKOv2 library. Absence of significant skewing in sgRNA expansion was confirmed (90th/10th percentile ratio of 3.1) and lentiviral pools were generated as described above.

The secondary CRISPR screen was performed identically to the genome-scale screen with the following modifications: 4 × 10^7 HUDEP-2 cells were spinfected with the pooled secondary library, 2 × 10^7 HUDEP-2 cells were collected at D0 of differentiation, and 1.5 × 10^7 orthochromatic erythroblasts were sorted on differentiation D12 on the basis of CD233 expression and α4 integrin downregulation. Cell numbers were maintained to provide sgRNA coverage at >1000x throughout the entirety of the screen. Genomic DNA was isolated from cell populations using a Qiagen Blood and Tissue Kit (Qiagen, cat. 69506). The secondary screen was performed in biological triplicates.

Data analysis

Integrated viral sgRNA sequences were amplified from genomic DNA in a one-step PCR reaction using NEBNext (NEB, cat. M0541L) as described elsewhere36. Following PCR amplification, DNA was mixed with 5 volumes of DNA Binding Buffer (Zymo, cat. D4004-1-L) and isolated using Zymo-Spin V columns with reservoir (Zymo, cat. C1016-25) and then washed once with DNA wash buffer (Zymo, cat. D4003-2-24). Samples were subsequently evaluated on a bioanalyzer (Agilent) and sequenced using a NextSeq 2000 HO 75-cycle kit (Illumina, cat. 20024906). Samples produced an average of 3 × 10^6 reads per cell population for the genome-scale screen and an average of 3.5 × 10^6 reads per cell population collected in the secondary screen. A custom Perl script identified and mapped sgRNA sequences, as previously described37; 99.5% of sgRNA sequences were identified in cell populations in the genome-scale screen and 99.8% of sgRNA sequences were identified in cell populations in the secondary screen. Gene enrichment levels were analyzed using MAGeCK to determine differences in sgRNA abundance between cell populations38,39, normalizing to the median of the population in the genome-scale screen and to non-targeting sgRNAs in the secondary screen.
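
The two normalization strategies described above can be illustrated with a simplified sketch. This is not MAGeCK's implementation: it scales each sample by a single median (of all sgRNAs, or of a reference set such as non-targeting sgRNAs) and then computes a per-sgRNA log2 fold change between D12 and D0. The function names, the pseudocount, and the example counts are assumptions made for illustration.

```python
import math
import statistics


def normalize(counts, reference_ids=None, pseudocount=1.0):
    """Scale counts by the median of all sgRNAs or of a reference subset.

    counts: dict mapping sgRNA id -> raw read count.
    reference_ids: ids to normalize against (e.g. non-targeting sgRNAs);
        if None, the median of the whole population is used.
    """
    reference = counts.keys() if reference_ids is None else reference_ids
    scale = statistics.median(counts[sg] + pseudocount for sg in reference)
    return {sg: (c + pseudocount) / scale for sg, c in counts.items()}


def log2_fold_changes(day0, day12, reference_ids=None):
    """Per-sgRNA log2(D12 / D0) after normalizing each sample."""
    n0 = normalize(day0, reference_ids)
    n12 = normalize(day12, reference_ids)
    return {sg: math.log2(n12[sg] / n0[sg]) for sg in day0}


# Made-up counts for illustration only.
d0 = {"sgA_1": 99, "sgB_1": 99, "NT_1": 99, "NT_2": 99}
d12 = {"sgA_1": 24, "sgB_1": 99, "NT_1": 99, "NT_2": 99}

print(log2_fold_changes(d0, d12))                    # population median
print(log2_fold_changes(d0, d12, ["NT_1", "NT_2"]))  # non-targeting controls
```

Normalizing to non-targeting sgRNAs is a natural choice for a secondary library enriched for hits, where the population median itself may shift because many targeted genes have a real effect.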

Analysis of HUDEP-2 and HSPC differentiation by flow cytometry

Assessment of human terminal erythroid maturation was performed by staining cells with CD233-FITC (IBGRL, cat. 9439FI, 1:50 dilution) and CD49d-PE-Cy7 (Biolegend, cat. 304314, 1:50 dilution), as previously described32. Viability was assessed using either DAPI (4’,6-diamidino-2-phenylindole; Sigma, cat. D8417) or propidium iodide (Calbiochem, cat. 537059). Apoptosis was assessed using Annexin V-PE (Biolegend, cat. 640908) and viability dye Helix NP NIR (Biolegend, cat. 425301). Characterization of early and late human erythroid maturation was performed using CD117-PE (BD, cat. 323408, 1:12.5 dilution), CD123-PE-Cy7 (BD, cat. 25-1239-42, 1:12.5 dilution), CD34-APC (BD, cat. 555824, 1:12.5 dilution), CD71-AF700 (BD, cat. 563769, 1:15 dilution), CD41a-BV510 (BD, cat. 563250, 1:25 dilution), CD105-BV421 (BD, cat. 563810, 1:25 dilution), and CD235a-BUV395 (BD, cat. 563810, 1:75 dilution) in brilliant stain buffer (ThermoFisher, cat. 00-4409-42). Flow cytometry was performed on a BD LSR Fortessa and data analysis was done using FlowJo (BD Biosciences). Cell sorting was performed using a BD FACS ARIA III or Sony MA900.

Targeted gene deletion using CRISPR/Cas9 or mRNA reduction using shRNA

Individual sgRNAs targeting specific genes were cloned into the pLentiCRISPRv2 plasmid (Addgene) as previously described31 and packaged as described above. Independent shRNA sequences targeting NHLRC2 were purchased from Sigma in a modified pLV vector. Independent shRNA sequences targeting VAC14 were cloned into a modified pLV vector containing the PGK promoter and BFP reporter gene (pLV[shRNA]-TagBFP2-U6 > hVAC14[shRNA#1/3/5]) by VectorBuilder. Relevant pLV construct vector IDs are available in Supplementary Table 2. Lentiviral particles were generated by co-transfecting HEK293T cells with the pLV vector, psPAX2 (Addgene, cat. #12260; a gift from D. Trono), and pCMV-VSV-G (Addgene, cat. #8454; a gift from R. Weinberg) using Fugene HD transfection reagent (Promega, cat. E2311) according to the manufacturer’s instructions. Cells transduced with shRNA targeting NHLRC2 were selected using 1 μg/mL puromycin for 3 days or until all non-transduced control cells treated with puromycin had died, while cells transduced with shRNA targeting VAC14 were sorted based on expression of BFP 3 days after transduction.

qPCR analysis

Erythroid cells differentiated from HSPCs were collected at day 14 of differentiation. qRT-PCR was performed as previously described40. Briefly, RNA was isolated using the RNeasy microkit (Qiagen, cat. 74304) and cDNA was synthesized using Superscript III Reverse Transcriptase (Invitrogen, cat. 18080044). qRT-PCR was performed using PowerSYBR (ThermoFisher, cat. A25777) on an ABI QuantStudio 3. qRT-PCR primers are listed in Supplementary Table 2.

Vac14 null mice

We obtained the previously described Vac14 loss-of-function allele (Vac14-) in which a gene trap cassette was inserted into the first intron of the gene24. Mice heterozygous for the Vac14 null allele (Vac14+/- mice) were crossed to wild-type C57BL/6J mice for more than 8 generations. Genotyping for the Vac14 null allele was performed using a common forward primer located in intron 1 (Vac14InF: GACATCCAACTTGGTAACAGCGGACCATATGTG) and 2 reverse primers, one specific for the wild-type allele (Vac14InR: CCAAAAGCTACAGGTCCAACTGACCACATACTACCC) and one specific for the Vac14 gene-trap allele (Vac14VecR: CGAAGTTATCGATGCGATCTGCGTTCTTCTTC). This PCR reaction results in 904 and 634 base pair amplicons representing the wild-type and Vac14 gene-trap alleles, respectively. Mice were housed in a 12 h light/dark cycle and mouse work was approved by the University of Michigan Animal Care and Use Committee (protocol number 09841-2).
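
Because the wild-type and gene-trap alleles yield amplicons of different sizes, genotypes follow directly from which bands are present. The sketch below encodes that logic, assuming band sizes have already been estimated from a gel; the `call_genotype` function and the 30 bp tolerance are hypothetical and serve only to illustrate the interpretation.

```python
WT_AMPLICON_BP = 904    # wild-type allele (Vac14InF + Vac14InR)
TRAP_AMPLICON_BP = 634  # gene-trap allele (Vac14InF + Vac14VecR)


def call_genotype(band_sizes_bp, tolerance_bp=30):
    """Infer the Vac14 genotype from the observed PCR band sizes."""

    def has_band(expected):
        return any(abs(b - expected) <= tolerance_bp for b in band_sizes_bp)

    has_wt = has_band(WT_AMPLICON_BP)
    has_trap = has_band(TRAP_AMPLICON_BP)
    if has_wt and has_trap:
        return "Vac14+/-"
    if has_wt:
        return "Vac14+/+"
    if has_trap:
        return "Vac14-/-"
    return "no call"


print(call_genotype([900]))       # only the wild-type band
print(call_genotype([910, 630]))  # both bands
print(call_genotype([640]))       # only the gene-trap band
```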

Fetal liver cell transplant

Vac14+/- mice were intercrossed and timed pregnancies were performed as previously described41. Embryos were collected at E15.5 and genotyped as above. Vac14−/− and WT fetal livers were harvested and homogenized into single cell suspensions as previously described41. Following RBC lysis (ACK buffer; Gibco), 10^6 cells were injected into the tail veins of lethally irradiated (900 rads) B6-SJL-Ly5.2 female recipient mice (NCI Charles River, strain 01B96). Transplanted mice were monitored over time by serial complete blood count analyses and were subsequently euthanized for bone marrow and spleen histology and flow cytometry evaluations.

Complete blood count analysis

Blood was obtained through retroorbital bleeding using EDTA-treated tubes as previously described42. Complete blood count analysis was done using the Advia 120 Hematology System (Siemens).

Analysis of murine hematopoiesis by flow cytometry

Bone marrows were flushed into DMEM (Gibco) supplemented with 4% fetal bovine serum, as previously described41. For flow cytometry analysis, cells were stained as described41 with various combinations of antibodies recognizing the following antigens at 1:100 dilution: CD3 (Biolegend 100308), GR-1 (Biolegend 108408), CD4 (Biolegend 116006 or 115904), CD150 (Biolegend 115904 or 115914), CD11b (Biolegend 101208 or 101204, or eBioscience 17-0112-82), CD8 (Biolegend 100708 or 100714), CD11c (Biolegend 117308), B220 (Biolegend 103208 or 103224), TCRB (Biolegend 109208 or 109212), TCRgd (Biolegend 118108), Ter119 (Biolegend 116208), CD19 (Biolegend 11508 or 115512), NK1.1 (Biolegend 108708), CD117 (Biolegend 135107 or 105826), CD44 (Biolegend 103012), CD45.2 (Biolegend 109824, 109828, or 109806), Sca-1 (Biolegend 108124), Ter119 (Biolegend 116228 or 116204), CD45.1 (eBioscience 25-0453-82), CD16/32 (Biolegend 101318), CD48 (Biolegend 103404), CD34 (eBioscience 11-0341-85), Thy1.2 (Biolegend 105304), and CD71 (Biolegend 113803) (SA Biolegend 405214). The lineage cocktail included antibodies against CD4, CD8, CD11c, GR-1, Ter119, B220, CD19, NK1.1, TCRB, TCRgd, and CD3. Flow analysis was performed on either BD FACS Canto or Aria II cytometers and data analysis was done using FlowJo (BD Biosciences).

Histology

Mouse spleens were collected at necropsy, fixed in aqueous buffered formalin (Z-Fix, Anatech), embedded in paraffin, and sectioned at 4 µm prior to staining with Hematoxylin and Eosin (Vector H-3502), as previously described43. Bone marrow cytospins were prepared by centrifuging 5 × 10^4 to 10^5 cells onto glass slides at 500 rpm for 3 min (Shandon Cytospin 2). Slides were dried and then stained using Hema 3 Stat Pack per manufacturer’s instructions (ThermoFisher, cat. 122-122911). Bone marrow sections were prepared for histological evaluation as previously described43.

Special stains were performed by the ULAM Pathology Core at the University of Michigan. Staining for glycogen, iron, and ceroid was performed on fixed, paraffin-embedded sections using McManus periodic acid Schiff (PAS), Perl’s Prussian blue, and May-Grunwald Giemsa methods, respectively, as detailed below. All sections were cut at 4 µm, mounted on glass slides, deparaffinized, and hydrated through xylene and graded ethanols to water.

For PAS, sections were incubated at room temperature in 0.5% periodic acid (#375810, Sigma-Aldrich, Darmstadt, Germany) for 5 min, rinsed, and stained in McManus Schiff reagent (#1371B, Newcomer Supply, Middleton, WI) for 15 min. Slides were rinsed, incubated in prewarmed deionized water for 2 min, rinsed, and counterstained with Harris hematoxylin, followed by Clarifier 1 (22-050-116, Epredia Signature Series, Fisher Scientific) and bluing reagent (22-050-114, Epredia Signature Series, Fisher Scientific).

For Perl’s Prussian blue, sections were successively incubated at room temperature through 10% potassium ferrocyanide (J-606-1, Rowley Biochemical, Danvers, MA) for 5 min, potassium ferrocyanide-hydrochloric acid (J-606-2, Rowley Biochemical) for 20 min, water rinses, and nuclear fast red (J-606-1, Rowley Biochemical) for 5 min.

For May-Grunwald Giemsa staining, sections were rinsed in water followed by methanol. Sections were then incubated in 100% May-Grunwald solution for 3 min, in 1:1 May-Grunwald:deionized water solution for 6 min, and in 6% working Giemsa solution for 45 min. Sections were rinsed and differentiated with 1% acetic acid. Following staining, all paraffin sections were dehydrated through graded ethanols, cleared with xylene, and placed under coverslips.

Staining for lipid was performed on fresh frozen sections using Sudan Black B. Frozen sections were cut at 4 µm thickness on a cryostat. Sections were dehydrated in 100% propylene glycol (H-502-4, Rowley Biochemical, Danvers, MA) for 10 min, stained in Sudan Black B solution (H-502, Rowley Biochemical), differentiated in 85% propylene glycol (H-502-2) for 2–3 min with agitation, rinsed, and counterstained with nuclear fast red (H-502-3, Rowley Biochemical) for 5 min. Sections were rinsed and mounted with an aqueous-based mounting media.

Perilipin-1 immunohistochemistry was performed on paraffin-embedded spleen sections 5 µm thick as described elsewhere using perilipin-1 primary antibody (Cell Signaling Technology, cat. 9349)44.

Statistical analysis

Statistical analysis comparing proliferation or differentiation across various conditions was done using two-way ANOVA. Experiments in mice comparing recipients of wild-type or Vac14−/− fetal liver cells were analyzed using unpaired t-test. Analysis of supplemental Figs. S3A and S5A was performed using one-way ANOVA. Experiments in supplemental Figs. S3B-F, S4A, and S5B-G were analyzed using unpaired t-test. Where applicable, the sample size for each experiment is evident from the figure, since each sample is shown as a separate dot and the value for each data point is reported in the supplemental Source Data File. For Figs. 4B, D, 5B, C, S3B, S4A, S5B, and S5G, the sample size is n = 3 per group. For Fig. 1G, H, the median and interquartile ranges are represented. All other data are reported as mean ± standard deviation. Each data point was measured from a distinct sample.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.