Main

Human radial glia (RG) evolved an increased proliferative capacity, altered cell fate potential and protracted period of maturation, supporting the increased number and complexity of daughter cells2,15,16. Gene regulatory networks governing RG self-renewal, differentiation and maturation have long been implicated in cortical expansion17,18. RG sequentially produce distinct subtypes of neuron followed by glial cell types19,20. Although cortical inhibitory neurons (INs) are generated mainly in the ganglionic eminences and migrate to the cortex21,22,23, recent lineage tracing studies1,24,25,26,27 and developmental cell atlases9,12 have demonstrated that human RG also produce cortical-like INs at late stages of neurogenesis that share transcriptional signatures with caudal ganglionic eminence (CGE) and lateral ganglionic eminence derived INs. However, the role of TFs in regulating RG differentiation decisions, lineage plasticity to produce INs and maturation remains largely unexplored.

Genetic perturbations combined with measurements of single-cell gene expression provide a powerful approach, termed Perturb-seq, for high-throughput dissection of gene function13,14. Cas9-based CRISPR loss-of-function perturbation approaches have been applied recently to study gene regulatory networks influencing cortical development using induced pluripotent stem (iPS)-cell-derived organoid28,29,30 and mouse31,32 models. However, Cas9-induced double-stranded DNA breaks can cause cytotoxicity, influencing proliferation and differentiation decisions33,34, whereas acquired genetic and epigenetic variation in source iPS cells35,36, and patterning biases37 and cell stress38 during differentiation, can influence cell-type fidelity and fate specification in organoid models. CRISPR interference (CRISPRi) screens using dCas9-KRAB enable efficient and uniform repression of target genes with limited cellular toxicity39,40, whereas primary cortical culture41,42 captures physiological aspects of human cortical neurogenesis, supporting sensitive detection of cell fate choices.

Here we established a primary cell model system that recapitulates in vivo differentiation dynamics and performed Perturb-seq to measure the transcriptional and cell fate consequences of repressing 44 TFs that are expressed robustly in the human cortical RG lineage. Extending human primary cell culture approaches1,42,43, we targeted TFs in a homogeneous RG population and then removed growth factors to permit cortical neurogenesis, cell fate choice and early subtype specification. Our screening revealed the role of ZNF219—not previously described in cortical development—in repressing neuronal differentiation, opposing roles for NR2E1 and ARX in regulating the balance of human excitatory versus inhibitory neurogenesis, and the role of ARX in safeguarding IN subtype specification through transcriptional repression of downstream transcription cofactor LMO1. Intersecting dysregulated genes under different perturbations revealed candidate hub effector genes downstream of several TFs enriched for roles in neurodevelopmental and neuropsychiatric disorders. Coupling CRISPRi screening with barcoded lineage tracing demonstrated the potential to engineer lineage plasticity and developmental tempo of individual RG through TF perturbation. Together with parallel screening in rhesus macaque, our data illuminate conserved mechanisms governing cortical RG lineage progression across primates.

Systematic TF repression during human corticogenesis

We designed a primary cell model of neurogenesis and lineage progression to evaluate the impacts of TF repression on cell fate choice during human cortical development. To direct gene targeting to RG at the start of differentiation, we first enriched for RG isolated from primary human tissue samples by adding epidermal growth factor (EGF) and fibroblast growth factor 2 for 5 days before infecting with an all-in-one CRISPRi lentivirus (Fig. 1a), and we further expanded RG for 1 week allowing for target gene knockdown (KD) before differentiation39 (Extended Data Fig. 1a–c). We then replaced growth factors with brain-derived neurotrophic factor (BDNF) to support spontaneous differentiation of perturbed RG. Consistent with recent studies1,42,43, this model recapitulates in vivo RG lineage progression and generation of excitatory neurons (ENs) and INs (Fig. 1a and Extended Data Fig. 1a,b), enabling detection of perturbations that affect differentiation dynamics and cell fate choice. Repression of PAX6 and EOMES recapitulated the effects of these TFs described in other model systems in promoting excitatory neurogenesis, highlighting the potential of our system to uncover additional regulators44,45,46,47 (Extended Data Fig. 1d).

Fig. 1: Single-cell Perturb-seq on 44 TFs in primary human neuronal differentiation.
figure 1

a, Experimental design for high-throughput perturbation of 44 TFs in primary human cortical progenitors from four individuals, with immunocytochemical labelling of EOMES and NEUROD2 in in vitro 2D culture of human cortical RG before (day 0) and after (day 3, day 7) induced differentiation. Representative images from four individuals are shown. FACS, fluorescence-activated cell sorting. b, Uniform manifold approximation and projections (UMAPs) of cells collected on day 0 (21,151 cells, n = 2 individuals) and day 7 (116,166 cells, n = 4 individuals), coloured by cell class, individual and sex. c, UMAPs highlighting cells from different timepoints. d, UMAP coloured by supervised cell type, with stacked barplots (bottom) showing cell-type distributions across individual at each timepoint. e, UMAPs showing expression of HOPX, MKI67, NEUROD2, DLX2 and BCAS1. f, Left, dotplot of marker gene expression across annotated clusters; right, barplot of cell numbers. Dot size denotes the expressing-cell fraction and colour denotes mean expression. g, Heatmap of normalized expression of 44 TFs targeted in this study along pseudotime with branches for IN and EN (left) lineages, with a tree plot showing cell-type distribution along pseudotime (right). Pseudotime was calculated using NT cells on day 7. h, Stacked barplot showing sgRNA distributions per TF target on day 7, coloured by individual sgRNA. i, Dotplot showing target gene KD efficiency compared with NT cells on day 7, filled by individual sgRNA with borders highlighting significance. log2FC values were calculated with DEseq2 in each cell class in n = 4 individuals and the lowest value for each sgRNA was used for visualization and filtering. sgRNAs with less than 25% reduction were removed from downstream analyses; remaining active sgRNAs showed a median 72% KD (28% residual expression). Astro, astrocytes; DIV, dividing; ImN, immature neuron; MG, microglia; OPC, oligodendrocyte progenitor cell; Vasc, vascular. Scale bars, 200 μm. Panel a created in BioRender. Nowakowski, T. (https://BioRender.com/7cr6o12).

To systematically identify regulators of human cortical neurogenesis, we first prioritized TFs using single-cell RNA sequencing (scRNA-seq) and single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq) multiome data10,12. We selected 44 TFs based on robust expression, motif accessibility, gene regulatory network size, target gene expression and predicted transcriptional consequences in the RG lineage (Supplementary Table 1; Methods). We then adapted an all-in-one CRISPRi vector co-expressing green fluorescent protein (GFP)48 by inserting a capture sequence in the single guide RNA (sgRNA) scaffold, enabling direct capture of sgRNAs during scRNA-seq using a 10x Genomics platform49, thereby supporting screening in primary cell culture systems50. We synthesized and cloned a library containing 164 sgRNAs, targeting the 44 TFs, and included 20 non-targeting control (NT) sgRNAs. For each TF, we targeted active promoters with accessible chromatin during human cortical neurogenesis9 using three sgRNAs per promoter51,52 (Supplementary Table 2). We derived primary human cultures from cryopreserved cortical tissue of four individuals at stages of peak neurogenesis from gestational weeks (GW) 16–18 (ref. 41). We delivered the CRISPRi library by lentivirus into primary RG, targeting less than 30% infection rate to generate libraries with mostly singly infected cells (Extended Data Fig. 1a and Supplementary Table 3), removed growth factors, and confirmed efficient gene repression before and throughout differentiation (Extended Data Fig. 1c,e).

scRNA-seq confirmed a 95% population of cycling RG on day 0 of differentiation, marked by co-expression of RG marker HOPX and proliferation marker MKI67 (Fig. 1b–f). By day 7, three principal cortical cell class trajectories emerged—EN, IN and oligodendrocyte lineages marked by NEUROD2, DLX2 and BCAS1, respectively (Fig. 1c–e)—in addition to a continuum from RG to astrocytes marked by AQP4 (Extended Data Fig. 1f). Integrating both experimental timepoints highlighted homogeneity of day 0 populations with minimal spontaneous differentiation (Fig. 1c and Extended Data Fig. 1g), as well as comparable representation of cells derived from independent technical and biological replicates across cell types with minimal batch effects (Fig. 1b,d and Extended Data Fig. 1h).

Pearson correlation and reference mapping to a developmental cortical cell atlas12 confirmed the recapitulation of in vivo-like gene expression, cell types, states, differentiation dynamics and cell fate choice in the primary culture differentiation system (Extended Data Fig. 1i,j). Most neurons exhibited immature states (Extended Data Fig. 1j), consistent with recent generation from RG after differentiation. These differentiation dynamics were also recovered by RNA velocity and pseudotime analysis (Fig. 1g and Extended Data Fig. 1l,m). Notably, comparison with iPS-cell-based models revealed a near twofold increase in the Pearson correlation coefficients of marker gene expression compared with in vivo cell types in the primary two-dimensional (2D) system, supporting improved fidelity to normal development (Extended Data Fig. 1i,k). In addition, primary culture in 2D and organotypic slice both showed reduced transcriptional signatures of cellular stress across principle dimensions, including glycolysis, endoplasmic reticulum stress, oxidative stress and apoptosis53,54,55 (Extended Data Fig. 1n), supporting the physiological relevance of the primary cell models.

We assigned sgRNAs to 118,456 cells (Methods) across timepoints, with a mean of 200 singly infected cells per sgRNA on differentiation day 7 (Fig. 1h and Extended Data Fig. 1o). KD efficiency was calculated in each cell class and 18 sgRNAs with less than 25% KD (log2 fold change (FC) > −0.4) were removed from downstream analyses (Fig. 1i and Supplementary Table 4; Methods). Remaining active sgRNAs exhibited a median KD efficiency of 72%, with comparable transcriptional responses to independent sgRNAs targeting the same promoter (Fig. 1i and Extended Data Fig. 2). For further analysis, we collapsed all active sgRNAs sharing the same target TF, which yielded a median KD efficiency of 80% and a mean of 600 cells per gene (Extended Data Fig. 1o,p).

Transcriptional regulators of human corticogenesis

We next examined the consequences of TF repression on gene expression and cell-type composition by differentiation day 7 (Fig. 2a and Supplementary Table 5Methods). Gene expression changes were correlated between individual sgRNAs and genes (average Pearson R = 0.88) and cell-type abundance was preserved upon downsampling of cell number (average Pearson R = 0.94), supporting the power of the screen to detect changes in both modalities (Extended Data Fig. 3a,b).

Fig. 2: Convergent transcriptional changes reveal neuropsychiatric-disorder-enriched hub effector genes.
figure 2

a, Schematics showing analytical workflow applied to the day 7 Perturb-seq dataset obtained from Fig. 1. b, Scatterplot for summed |estimated coefficient| in cell abundance (x axis) and number of DEGs (y axis) across seven main cell types in the EN and IN lineage for each TF perturbation. Dot size reflects perturbations with the strongest phenotypes calculated by (summed |estimated coefficient| × total number of DEGs), with embedded pie charts indicating fractions of up- and downregulated DEGs. c, Volcano plot of DEGs shared across different numbers of perturbations. DEGs with |log2FC | > 0.2 are shown. DEGs that overlap with seven sets of neuropsychiatric disorder-associated gene sets are coloured red and SCZ-associated genes are labelled. d, Dotplot showing gene ontology biological processes enrichment among convergent DEGs that were detected in at least two TF perturbations, using non-convergent DEGs as background. Dot size denotes Z score; colour denotes −log10[adjusted P]. e, Dotplot showing enrichment of seven disease-associated gene sets (Methods) among convergent DEGs versus non-convergent DEGs. One-sided Fisher’s exact test; dot size denotes odds ratio; colour denotes −log10[P]. f, Line plot showing odds ratio of disease-associated gene enrichment in DEGs shared across the minimum number of perturbations indicated. g, Regulatory network plot showing eight prioritized TFs (NR2E1, ARX, SOX2, ZNF219, NEUROD2, PHF21A, SOX9 and CTCF) and connected convergent DEGs. Edge colour indicates direction of regulation (pink, positive regulation (where DEGs were downregulated upon TF perturbation)). Pink nodes or labels denote disease-associated genes or TFs. AD, Alzheimer’s disease; ADHD, attention-deficit/hyperactivity disorder; BP, bipolar disorder; NDD, neurodevelopmental disorder. Panel a created in BioRender. Nowakowski, T. (https://BioRender.com/7cr6o12).

We observed a positive correlation between the effects of individual TFs on cell-type composition and on gene expression across cell types (Fig. 2b) and within cell classes (Extended Data Fig. 3c). Comparing the extent of changes in both dimensions prioritized TFs whose depletion caused the strongest phenotypes: NR2E1, ARX, ZNF219, SOX2, SOX9, CTCF, NEUROD2 and PHF21A (Fig. 2b). These TFs also showed the strongest impact on other molecular phenotypes, including Euclidean distance and energy distance56 to NT, which compare the average expression between groups and the distance between and within groups, respectively, as well as maximum composition change among all cell types in the RG lineage (Extended Data Fig. 3d). Notably, the TFs with strong cellular and transcriptional phenotypes have been implicated previously in neurological disorders: ARX in X-linked lissencephaly57, epilepsy58,59 and intellectual disability60,61, and autism spectrum disorder (ASD)58; NR2E1 in schizophrenia (SCZ)62; SOX2 in intellectual disability and epilepsy63,64; CTCF in intellectual disability with microcephaly65,66; NEUROD2 in intellectual disability67 and early infantile epileptic encephalopathy68; PHF21A in intellectual disability with epilepsy and ASD69 and ZNF219 in a case of low IQ ASD70. Although we cannot rule out that additional TFs may impact differentiation or maturation phenotypes not captured in this assay due to incomplete KD, redundancy or stage-specific roles, these results highlight the functional importance and disease relevance of TFs prioritized by screening.

At the gene expression level, the predominance of upregulated differentially expressed genes (DEGs) following NR2E1 (88%) and ARX (85%) KD was consistent with their role as transcriptional repressors71,72 (Fig. 2b). Approximately 25% of DEGs were affected by the perturbation of more than one TF (Fig. 2c). These convergent DEGs showed enrichment for cell adhesion- and synaptic development-related terms, suggesting that effector genes regulating neurogenesis, migration and maturation are highly interconnected across TF regulatory networks (Fig. 2d). Moreover, intersecting these convergent DEGs with seven sets of neurodevelopmental and neuropsychiatric disorder-related genes revealed significant overlaps with SCZ and major depressive disorder (MDD)-associated genes, in comparison with non-convergent DEGs (one-sided Fisher’s exact test, P < 0.05) (Fig. 2e,f). This overlap includes PTPRD, modulated by ARX, SOX2 and NEUROD2; and IL1RAPL1, modulated by SOX2, ZNF219, CTCF and TBR1 (Fig. 2c,g). Both PTPRD and IL1RAPL1 are associated with SCZ and depletion of both genes have been shown to induce aberrant neurogenesis, maturation and behaviours in mouse models73,74,75. This finding highlights strong connections among disease-related genes in TF-driven regulatory networks, indicating their potential roles as hub effector genes downstream of developmental TFs.

Distinct fate outcomes of RG upon TF perturbations

We further investigated the impact of TF perturbations on cell-type composition using cluster-free approaches76 to map cell fate changes with higher resolution along developmental trajectories (Fig. 3a). These approaches yielded consistent results with cluster-aware methods at the gene and individual sgRNA level and were robust to downsampling (Extended Data Figs. 3e,f and 4).

Fig. 3: TF perturbations alter EN/IN output of human cortical RG.
figure 3

a, Neighbourhood graphs based on the UMAP in Fig. 1d showing results of differential abundance testing using Milo under perturbation of eight prioritized TFs at the gene level. NT cells were downsampled to match perturbed cell numbers. Node sizes denote cell numbers in each neighbourhood; edges depict the number of cells shared between neighbourhoods; node colour denotes log2FC (false discovery rate (FDR) = 0.1) when perturbed. b, Top, schematics illustrating RG of distinct fate biases; bottom, experimental design for combined TF perturbation and lineage tracing in cortical RG of nine human individuals. c, UMAP of lineage-resolved 129,003 human cells collected on differentiation day 7, coloured by supervised cell type. A median of 15,469 cells per TF perturbation were recovered after filtering. d, Upset plot of cell class compositions in clones containing cells from more than one class (multi-class clones). e, Heatmap for lineage coupling score matrix in main cell types. f, UMAPs showing cell fate distributions in clonal clusters identified using scLiTr89: RG-, EN-, EN/IN-, IN- and oligodendrocyte-biased clusters. g, Box plots showing distributions of clonal cluster fractions in eight human individuals in NR2E1, ARX and ZNF219 KD versus NT. Two-sided paired Wilcoxon tests; P = 0.0078, 0.64, 0.016 (NR2E1); 0.0078, 0.0078, 1 (ARX) and 0.38, 0.023, 0.0078 (ZNF219). Centre line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; points, outliers. *P < 0.05; **P < 0.01; ***P < 0.001. Panel b created in BioRender. Nowakowski, T. (https://BioRender.com/7cr6o12).

Different perturbations elicited a range of composition consequences in differentiated cell types (Fig. 3a and Extended Data Fig. 3e,f). These phenotypes included examples consistent with mechanistic studies. For example, repression of SOX2 resulted in accumulation of RG77, whereas repression of the proneural factor NEUROD2 enriched for progenitors and immature ENs at the expense of mature ENs (Fig. 3a and Extended Data Fig. 3e). However, screening also revealed additional roles for previously characterized genes implicated in disease. For example, NR2E1 knockout was previously shown to drive premature neural differentiation in mouse models78,79, but repression in human RG specifically enriched for post-mitotic INs, in addition to depleting RG (Fig. 3a and Extended Data Fig. 3e). Similarly, ARX knockout was previously shown to affect IN differentiation and migration in mouse models80,81,82, but repression in human RG specifically enriched for the EN lineage at the expense of INs.

Our screen also uncovered disease-linked genes with composition phenotypes, including ZNF219 and PHF21A, that to our knowledge have yet to be examined functionally during cortical neurogenesis. Repression of ZNF219 mirrored the effects of repressing SOX9—a known ZNF219 interaction partner in chondrocytes83—by increasing the proportion of ENs, but showed different effects from SOX9 in dividing RG and INs (Fig. 3a and Extended Data Fig. 3e). Repression of PHF21A—a member of the BRAF/HDAC complex implicated in synapse formation84—led to an overall trend towards EN over IN differentiation (Fig. 3a and Extended Data Fig. 3e,f), matching the effects of targeting other ASD causal genes ARX and CTCF85,86.

Flow cytometry analysis of the top TF candidates, ARX, NR2E1 and ZNF219, with finer temporal resolution validated the cellular phenotypes observed in our Perturb-seq (Extended Data Fig. 5 and Supplementary Table 6), and further revealed early manifestations of compositional changes in KI67+ and EOMES+ progenitors on differentiation day 4, confirming that these perturbations affect self-renewal and cell fate decisions at the stage of neurogenesis instead of neuronal maturation (Extended Data Fig. 5d).

TF perturbations alter EN to IN output of individual RG

The influence of multiple TFs on the relative abundance of ENs and INs nominated RG lineage plasticity as a candidate developmental mechanism for observed composition differences. Composition differences could arise from perturbations of RG that preferentially affect EN-biased clones, mixed clones containing both ENs and INs, or IN-biased clones (Fig. 3b). To distinguish between these possibilities, we combined Perturb-seq with lineage tracing using STICR—a GFP-expressing lineage tracing library with static barcodes that can be measured by scRNA-seq1. We constructed an mCherry-expressing sub-library targeting NR2E1, ARX and ZNF219, and also included SOX2 and NEUROD2 as controls, the former known to impact both lineages and the latter known to promote the EN lineage (Fig. 3b).

As RG fate plasticity changes with maturation, we extended the experiments using this targeted dual library to primary human RG culture isolated from nine human individuals from GW16 to GW22, spanning the peak of excitatory and inhibitory neurogenesis12,27 to early stages of gliogenesis. Cells co-expressing mCherry and GFP were isolated on day 7 for scRNA-seq, and transcriptomes were reference mapped to the initial Perturb-seq data for cell-type annotation. This analysis yielded 129,003 cells with assigned sgRNAs and lineage barcodes (Fig. 3c and Extended Data Fig. 6a–j; Methods) and recapitulated the gene expression and cell-type composition consequences observed in the initial screen (Extended Data Fig. 6k,l).

We next investigated the landscape of clonal lineage relationships. We considered multicellular clones containing at least three cells with consistent sgRNA assignment, recovering 8,937 clones (Extended Data Fig. 7a; Methods). One person with low overall cellular coverage was dropped from this analysis. Remaining clones contained a mean of five cells in NT conditions (Extended Data Fig. 7b). The most abundant multi-class clones contained RG and ENs (35%) and RG and INs (20%) but, consistent with recent studies, we also observed around 15% clones containing both ENs and INs (Fig. 3d). Lineage coupling, defined as the normalized barcode covariance between cell types87,88, further supported the presence of mixed clones with a comparable linkage of dividing RG between both excitatory and inhibitory intermediate progenitor cells (IPCs) and immature neurons (Fig. 3e). Unsupervised clustering of clones based on cell-type composition using scLiTr89 (Methods) revealed five categories of fate biases, comprising RG, EN, IN, oligodendrocytes and EN/IN dual fate biased clonal clusters (Fig. 3f and Extended Data Fig. 7c). A small percentage of INs were observed in oligodendrocyte-biased clones and vice versa, suggesting their close lineage relationship12,27,90.

Intersecting perturbation and lineage tracing information revealed fate alterations by different TF perturbations. NR2E1 KD significantly enriched for IN-biased clones at the expense of RG- and EN- biased clones, while ARX KD strongly enriched for EN-biased clones at the expense of EN/IN dual fate biased clones (Fig. 3g and Extended Data Fig. 7d). A significant decrease in the fraction of EN/IN dual fate biased clones was also observed upon ZNF219 KD, suggesting that it shapes the balance of EN/IN output by increasing the diversity of lineage output from individual RG. In each case, the alterations in RG lineage biases were consistent with the observed cell-type composition changes (Fig. 3a and Extended Data Fig. 6k), supporting lineage plasticity as the developmental mechanism underlying the effects of these TFs on EN and IN fates.

TF perturbations alter RG lineage progression

RG undergo temporal fate restriction sequentially, producing distinct subtypes of neuron as well as glial cell types. Along the developmental axis, we observed a stage-dependent shift in RG fate biases marked by a decrease of EN-biased clones and increase of IN-clones around GW18–19 (refs. 12,27) (Fig. 4a), supported by increased lineage coupling between dividing RG and IN after GW19 (Extended Data Fig. 7e). At GW19, ARX KD delayed the timing of IN-biased cluster expansion, whereas NR2E1 KD promoted this transition (Fig. 4a), suggesting stage-dependent effects of TF perturbations on lineage decisions. Furthermore, there seemed to be a critical time window for altering EN/IN fate before GW21–22 (Extended Data Fig. 7d). Pseudotemporal ordering based on clonal types recapitulated the sequence of temporal cell fate choices beyond the EN/IN transition (Fig. 4b–e). In addition to increasing IN-biased clones at early stages, we found that NR2E1 KD also increased oligodendrocyte-biased clones at later stages, nominating this TF as a regulator of developmental tempo (Fig. 4a,f and Extended Data Fig. 7d). Conversely, ARX KD shifted the clonal pseudotime distribution significantly to more immature stages. Together, analysing lineage progression across differentiation stages, revealed opposing effects of NR2E1 and ARX on delaying and accelerating RG lineage progression, respectively (Fig. 4f).

Fig. 4: Heterochronic RG lineage progression under TF perturbations.
figure 4

a, Stacked barplot showing distributions of clonal cluster fractions along human developmental stages in NT, NR2E1, ARX, SOX2, ZNF219 and NEUROD2 KD. Legend is shown in c. bd, UMAPs of human clones coloured by pseudotime (b), fate biases (c) and individuals (d). e, Ridge plots of pseudotime distribution in clones grouped by stages. f, Boxplot showing median clonal pseudotime in each individual under TF perturbation, highlighting opposing roles of ARX and NR2E1 in regulating maturation of RG fate potential. Two-sided paired Wilcoxon tests; P = 0.0078 (NR2E1), 0.016 (ARX), 0.84 (SOX2), 0.0016 (ZNF219), 0.25 (NEUROD2). Centre line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; points, outliers. g, Heatmap showing changes in gene expression in RG along clonal pseudotime. h, Violin plots showing gene expression changes in RG in ARX and NR2E1 KD. *P < 0.05; **P < 0.01; ***P < 0.001.

To identify candidate effector genes involved in RG lineage progression, we compared gene expression in RG with different lineage histories along clonal pseudotime. Lineage tracing studies in rodents and gene expression data in humans have suggested that the fate transition of RG from EN to olfactory bulb IN neurogenesis involves a shift from GLI3 to EGFR signalling12,90,91,92. Similarly, RG in EN-biased clones preferentially expressed GLI3, whereas those involved in IN production (EN/IN- and IN-biased clones) expressed EGFR and PDGFRA highly (Fig. 4g and Extended Data Fig. 7f). As the INs in these clones expressed mainly cortical-like markers, including SCGN and PAX6 (Extended Data Fig. 6d,g), this observation suggests reuse of a conserved signalling pathway transition for olfactory bulb and local IN production by cortical RG. We further identified downstream genes under NR2E1 and ARX KD with lineage-dependent expression (Fig. 4h), nominating shared effector gene programs contributing to RG lineage progression.

To investigate conservation of RG lineage plasticity and underlying mechanisms among primates, we extended our screening approach to rhesus macaque RG (Extended Data Fig. 6a). The production of local INs by cortical RG has been suggested in rhesus macaque93,94,95, but not yet demonstrated by lineage tracing. We observed conservation in EN/IN lineage plasticity (Extended Data Fig. 7g–i), including the production of cortical-like INs expressing SCGN and PAX6, and conserved changes in cell-type compositions (Extended Data Fig. 6g,k,m), gene expression (Extended Data Fig. 6n,o and Supplementary Table 7) with a minority of divergent changes reflecting gene network evolution (Extended Data Fig. 6p) and conserved effects of ARX and NR2E1 on heterochrony of RG clonal output (Extended Data Fig. 7j). Collectively, these results suggest conserved lineage plasticity of cortical RG in cortical-like IN neurogenesis across primates and conserved roles of ARX and NR2E1 in modulating cortical developmental tempo.

ARX safeguards IN identity by repressing LMO1

Beyond lineage plasticity and progression, we also investigated the roles of TFs in subtype specification of cortical neurons. Both ARX and SOX2 KD drove local composition changes among INs (Fig. 3a), motivating us to further examine their effects on subtype abundance. Iterative clustering of INs revealed populations expressing subtype markers including ST18 (immature), PAX6 (lateral ganglionic eminence-like) and CALB2 (CGE-like), as well as a small population expressing PBX3 (olfactory bulb-like) (Fig. 5a,b and Extended Data Fig. 8a,b). We observed two distinct subclusters defined exclusively by perturbations (‘ectopic’): ARX KD created a cluster distinguished by LMO1 (a previously identified target gene repressed by ARX in ventral telencephalon72,80) and RIC3 (a chaperone for nicotinic acetylcholine receptors), whereas SOX2 KD created a subcluster highly expressing SPOCK1 (a neuronally produced and secreted proteoglycan) (Fig. 5b–d and Extended Data Fig. 8c). Consistent results were observed across several sgRNAs sharing the same target, between individuals, batches and species (Extended Data Fig. 8d–g), highlighting conserved roles of ARX and SOX2 in not only regulating neurogenesis but also specifying normal IN subtype identity, as well as the distinct ectopic transcriptional states formed by their loss.

Fig. 5: ARX safeguards IN identity by repressing LMO1.
figure 5

a, UMAP of integrated human and rhesus macaque IN subtypes identified in the lineage-resolved 2D screen. b, Left, dotplot showing marker gene expression in each IN subtype; right, barplot of cell numbers. Dot size denotes the expressing-cell fraction and colour denotes mean expression. c, Stacked barplots showing distributions of IN subtypes across TF perturbations in human (top) and macaque (middle) 2D culture and in human slice culture (bottom). d, UMAPs showing results of differential abundance testing in IN subtypes in ARX KD in human (top) and macaque (bottom). Left, UMAP of downsampled NT and perturbed cells from day 7, coloured by perturbation condition; right, neighbourhood graph of differential abundance testing coloured by log2FC (FDR = 0.05). e, Heatmaps showing log2FC in top DEGs in the IN_LMO1/RIC3 cluster versus other IN subtypes in human (left) and macaque (right). Asterisks indicate Benjamini–Hochberg (BH)-adjusted P values identified from DEseq2. f, Schematics illustrating gene expression regulation under NT, ARX KD and ARX, LMO1 dKD. g, UMAPs showing results of differential abundance testing in IN subtypes in NT, ARX KD and dKD. Left, UMAP of downsampled NT and perturbed cells coloured by conditions. Right, neighbourhood graph of differential abundance testing coloured by log2FC (FDR = 0.05) for each contrast: ARX KD versus NT, dKD versus ARX KD and dKD versus NT. h, Beeswarm plots showing differential abundance in neighbourhoods identified in g across IN subtypes, with significant changes coloured by log2FC. i, Dotplot showing the IN_LMO1/RIC3 marker expression in ARX, LMO1 dKD. j, Graphical summary of findings: human cortical RG gives rise sequentially to EN, IN and oligodendrocytes along developmental stages. NR2E1 KD promotes the RG lineage progression, whereas ARX KD delays the transition and impairs normal IN subtype specification. ZNF219 KD promotes differentiation of both EN and IN with an overall preference to EN. *P < 0.05; **P < 0.01; ***P < 0.001. Illustrations in ce and schematic in f created in BioRender. Nowakowski, T. (https://BioRender.com/7cr6o12). Illustration in j created by H. Pinheiro (www.hpinheiro.com).

To further test the relevance of these findings to in vivo differentiation trajectories, we employed organotypic cortical slice culture, which preserved the three-dimensional organization of cortical tissue. To preferentially label progenitor populations we locally transduced germinal zone, where progenitors and immature cell types reside, with the targeted CRISPRi and lineage tracing libraries (Extended Data Fig. 9a–c). Although the sensitivity of this model for detecting cell fate consequences is limited by heterogeneous starter cells and spontaneous differentiation in advance of gene repression39, the slice culture experiment recapitulated the transcriptomic changes observed in 2D culture (Extended Data Fig. 9d,e) and enabled studies of subtype specification. We observed immature cortically derived INs displaying low expression for migration markers ERBB4 and CXCR4 and lineage coupling to ENs (IN_local), along with mature CGE- and medial ganglionic eminence (MGE)-derived IN clusters that were probably labelled post-mitotically and not coupled to ENs (Extended Data Fig. 9f,g). Among INs, ARX repression again induced the LMO1/RIC3 subtype, while SOX2 repression induced the CALB1/SPOCK1 subtype, as observed in the 2D model (Fig. 5c and Extended Data Fig. 9h).

Differential gene expression analysis focusing on the LMO1/RIC3 subtype revealed dysregulated activin and TGFβ signalling pathways driven by ACVR1 and SMAD2/3, as well as impaired cell migration (SMAD3, FOXP4, CDH20) and synaptic membrane adhesion (PTPRD, EFNA5) (Fig. 5e, Extended Data Fig. 8h,i and Supplementary Table 8). We observed consistent patterns of dysregulated genes in organotypic slice culture and conservation in rhesus macaque, consistent with previous reports of migration defects of INs in ARX mutant mice57 (Fig. 5e and Extended Data Fig. 8j). Despite dysregulation of genes in these pathways, we did not observe an elevated cellular stress response (Extended Data Fig. 8k).

To examine the developmental origin of these ectopic clusters, we analysed lineage coupling and infection during differentiation. We observed similar lineage coupling across cell types in the RG lineage following ARX and SOX2 perturbations in 2D culture (Extended Data Fig. 8l), indicating that effects on subtype identity probably emerge post-mitotically rather than by altering initial progenitor fate specification. Further supporting a post-mitotic effect, the ectopic state emerged in cortical born INs in slice culture under conditions in which RG were not expanded and were permitted to differentiate immediately (Fig. 5c and Extended Data Fig. 9f,h). Furthermore, we observed a similar ectopic cluster characterized by LMO1 induction in CGE and MGE-derived INs that were infected post-mitotically (Extended Data Fig. 9j–l). Consistent with this observation, ARX expression is retained in post-mitotic INs (Extended Data Fig. 9m–o). Reference mapping to an in vivo atlas of cortical INs96 showed modest similarity between LMO1/RIC3 subtypes and MGE-derived LHX6+ INs, but missing canonical MGE marker expression suggests this subtype is unlikely to have a physiological counterpart (Fig. 5b and Extended Data Fig. 9p).

We next examined the molecular mechanisms underlying the emergence of the LMO1/RIC3 cluster. As LMO1 is a conserved downstream target of ARX across species whose expression is dysregulated in ARX mutant mice72,97,98 and acts as a transcriptional coactivator, we reasoned that LMO1 may mediate the transcriptional programs driving the ectopic cell state. We performed double KD (dKD) with ARX and LMO1 (Fig. 5f) by combining CRISPRi vectors expressing GFP and mCherry and integrated the resulting cell populations with previous data (Extended Data Fig. 10a,b). Reference mapping of IN subtypes revealed enrichment of an intermediate cell state between the LMO1/RIC3 subtype and other subtypes, suggesting a partial rescue of ectopic cell state driven by ARX KD (Fig. 5g,h). Consistent with this model, LMO1 dKD rescued the expression of many conserved marker genes identified in the LMO1/RIC3 subtype, including RIC3 and EFNA5 (Fig. 5i and Extended Data Fig. 10c–e). These results indicate that ARX safeguards normal post-mitotic IN identity, in part by repressing LMO1, which is necessary for inducing additional genes characterizing the ectopic cluster. Together, our study introduces a primary cell model of cortical neurogenesis with improved fidelity to normal development and reveals a broadly conserved landscape of responses to TF perturbations in gene regulatory networks, cell fate decisions and neuronal subtype specification (Fig. 5j).

Discussion

Systematic profiling of single-cell gene expression and chromatin accessibility during human cortical development has supported construction of human cell atlases, inference of developmental trajectories and prioritization of gene regulatory networks9,10,99,100. However, functionally examining candidate regulators requires scalable approaches for perturbing gene expression in physiological models of human cortical neurogenesis101. We established a primary culture system with improved fidelity and decreased cellular stress response, and applied single-cell Perturb-seq in primary human and rhesus macaque RG undergoing differentiation to examine the role of 44 TFs in cortical development at the level of gene expression, gene network interactions, cell fate biases and subtype specification. Convergence of effector genes downstream of several TFs and their enrichment in neuropsychiatric and neurodevelopmental processes and disorders highlighted the connectedness of TF networks during neurogenesis. Different perturbations elicited distinct RG fate outcomes. By coupling TF perturbation and barcoded lineage tracing, we showed the lineage plasticity of individual human and macaque RG. We identified ZNF219 as a new regulator in human cortical neurogenesis, and NR2E1 and ARX as known regulators with new functions in regulating inhibitory neurogenesis and guiding temporal lineage progression of individual cortical RG, with post-mitotic roles for ARX in IN subtype specification, partially through repressing LMO1.

RG undergo waves of neurogenesis and gliogenesis, sequentially generating distinct subtypes of neuron, oligodendrocyte and astrocyte. We derived a culture system that sensitively captured the fate transition of EN to IN production and showed the lineage plasticity of individual RG in response to TF perturbations. ARX repression extended the window of EN production, while NR2E1 repression accelerated the transition to IN and oligodendrocyte production. Mapping clonal clusters along the developmental axis revealed variation in effect sizes of the fate switch potential in RG at different maturation stages, suggesting a restricted time window (GW18–19) for lineage plasticity, corresponding to the timing of endogenous EN/IN transition12,27. Combining perturbations with multi-modal profiling including chromatin accessibility of RG at different stages could reveal the underlying cis-regulatory changes that account for different cell fate and developmental tempo responses downstream of TF depletion, and extending to genome scales could reveal additional regulators of protracted maturation of human RG.

The dorsal origin of a subset of human cortical INs has been supported through lineage tracing using both lentiviral-based static barcodes and somatic mosaicism1,12,25,26,27, but underlying mechanisms regulating this process remained elusive. Previous studies highlighted RG lineage plasticity as a potentially human-specific process in comparison to mouse1,24, but our lineage tracing experiments in rhesus macaque revealed conservation of lineage plasticity and responses to TF perturbations among primates. Further studies of IN migration and molecular identity will help illuminate possible quantitative differences between species and to distinguish between cortical and olfactory bulb-bound INs generated by cortical RG.

Beyond cell fate commitment and maturation, our screening platform robustly detected changes in IN subtype specification, exemplified by conserved phenotypes observed in ARX and SOX2 KD across primates. Pathogenic ARX mutations associated with neurologic disorders show altered DNA binding preferences and loss of LMO1 repression97,98. Dual repression of ARX and LMO1 partially reverted transcriptional programs driving the ARX-mediated ectopic IN state, highlighting LMO1 repression as crucial for normal IN identity. Increased LMO1 expression has been associated with tumour invasion and metastasis102,103, highlighting its potential role in ectopic IN migration. Although Ebf3—another Arx target—has previously been implicated in Arx-mediated migration deficits104, future studies combining time lapse imaging and human cellular models can examine the role of LMO1 as a master regulator mediating aberrant cell behaviours. More generally, the observation that the TFs with the strongest transcriptional and cell composition consequences have also been implicated in neurodevelopmental disorders highlights the correspondence of cellular phenotypes in a simplified culture system to organismal consequences. Collectively, our study provides a framework for functional dissection of gene regulatory networks in human cortical neurogenesis.

Methods

Tissue processing and cell culture

Tissue samples

De-identified human tissue samples were collected with previous patient consent in strict observance of the legal and institutional ethical regulations. Protocols were approved by the Human Gamete, Embryo and Stem Cell Research Committee (Institutional Review Board) at the University of California, San Francisco.

The Primate Center at the University of California, Davis, provided four specimens of cortical tissue from PCD60 (n = 1), PCD75 (n = 1) and PCD80 (n = 2) rhesus macaques. All animal procedures conformed to the requirements of the Animal Welfare Act, and protocols were approved before implementation by the Institutional Animal Care and Use Committee at the University of California, Davis.

Cell culture and lentiviral transduction

For cryopreservation, tissue samples were cut into small pieces, placed in Bambanker (NIPPON Genetics, BB02), frozen in CoolCell at −80 and transferred to liquid nitrogen within 2 weeks. For single-cell dissociation, each cryovial was thawed in 37 °C warm water and placed in a vial containing a pre-warmed solution of Papain (Worthington Biochemical Corporation, catalogue no. LK003153) solution supplemented with 5% trehalose (Fisher Scientific, catalogue no. BP268710) that was prepared according to the manufacturer protocol for 10 min at 37 °C. After approximately 30 min incubation, tissue was triturated following the manufacturer protocol. Cells were plated on a 24-well tissue culture dish coated with 0.1% PEI (Sigma, catalogue no. P3143), 5 μg ml−1 Biolamina LN521 (Invitrogen, catalogue no. 23017-015) and at a density of 500,000 cells cm−2. Expansion medium contained insulin (Thermo Fisher, catalogue no. A1138IJ), transferrin (Invitria, catalogue no. 777TRF029-10G), selenium (Sigma, catalogue no. S5261-10G), 1.23 mM ascorbic acid (Fujifilm/Wako, catalogue no. 321-44823), 1% polyvinyl alcohol (PVA) (Sigma, catalogue no. P8136-1KG), 100 μg ml−1 primocin (Invivogen, catalogue no. ant-pm-05), 20 ng ml−1 fibroblast growth factor 2 (Preprotech, catalogue no. 100-18B) and 20 ng ml−1 EGF (Preprotech, catalogue no. AF100-15) in DMEM-F12 (Corning, catalogue no. MT10092CM), supplemented with ROCK inhibitor CEPT cocktail105. For lentiviral transduction, CRISPRi and STICR lentivirus was added to culture media on day 5 at roughly 1:500 and 1:5,000 dilution, respectively. After 6 h, the virus-containing medium was removed and replaced with fresh medium. Seven days after infection, expansion medium was removed and replaced with differentiation medium containing insulin-transferrin-selenium, 1.23 mM ascorbic acid, 1% PVA, 100 μg ml−1 primocin, 20 ng ml−1 BDNF (Alomone Labs, catalogue no. B-250) in DMEM-F12. At 7 days after differentiation, cultures were dissociated using papain supplemented with 5% trehalose and GFP-positive or GFP and mCherry co-positive cells were isolated by fluorescence-activated cell sorting (FACS) on BD Aria Fusion, resuspended in 0.2% bovine serum albumin (BSA) in PBS and captured with 10x Chromium v.3.1 HT kit (catalogue no. PN-1000348) or Illumina Single Cell CRISPR Library kit (Illumina Single Cell CRISPR Prep, T10, catalogue no. 20132435).

Organotypic slice culture and lentiviral transduction

For organotypic slice culture experiments, samples were embedded in 3% low-melting-point agarose (Fisher, catalogue no. BP165-25) and then cut into 300-μm sections perpendicular to the ventricle on a Leica VT1200S vibrating blade microtome in oxygenated artificial cerebrospinal fluid containing 125 mM NaCl, 2.5 mM KCl, 1 mM MgCl2, 1 mM CaCl2 and 1.25 mM NaH2PO4. Slices were cultured slice medium containing in insulin-transferrin-selenium, 1.23 mM ascorbic acid, 1% PVA, 100 μg ml−1 primocin, Glutamax (Invitrogen, catalogue no. 35050061), 1 mg ml−1 BSA, 15 μM uridine (Sigma, catalogue no. U3003-5G), 1 μg ml−1 reduced glutathione (Sigma), 1 μg ml−1 (+)-α-tocopherol acetate (Sigma, catalogue no. t3001-10g), 0.12 μg ml−1 linoleic (Sigma, catalogue no. L1012) and linolenic acid (Sigma, catalogue no. L2376), 10 mg ml−1 docosahexaenoic acid (Cayman, catalogue no. 10006865), 5 mg ml−1 arachidonic acid (Cayman, catalogue no. 90010.1), 20 ng ml−1 BDNF in DMEM-F12. CEPT cocktail was added on the first day. Lentiviral transduction was performed on the following day locally at the germinal zone to preferentially label neural progenitor cells. CRISPRi and STICR lentivirus was added at 1:20 and 1:100 dilution, respectively. At 24 h after transduction, virus-containing medium was replaced with fresh medium and daily half-medium replacement was performed. At 12–14 days after transduction, cultures were dissociated using papain supplemented with 5% trehalose, and GFP- and mCherry co-positive cells were isolated by FACS and captured with 10x Chromium v.3.1 HT kit (catalogue no. PN-1000348).

Immunocytochemistry

Cells were fixed in 2% paraformaldehyde for 15 min at room temperature and then in ice cold 90% methanol for 10 min. After a 1-h incubation in blocking solution (5% BSA, 0.3% Triton-X in PBS), cells were incubated with primary antibodies with the following dilution in blocking solution at room temperature for 1 h: mouse-EOMES (Thermo Fisher, catalogue no. 14-4877-82) 1:500, rabbit-NEUROD2 (Abcam, catalogue no. ab104430) 1:500, goat-SOX9 (R&D, catalogue no. AF3075) 1:300, mouse-KI67 (BD, catalogue no. 550609) 1:500, mouse-DLX2 (Santa Cruz Biotechnology, catalogue no. sc-393879) 1:200. The cells were then washed with PBS three times and incubated with secondary antibodies in blocking solution for 1 h at room temperature, and counterstained with 4′,6-diamidino-2-phenylindole. Finally, after PBS washes, digital image acquisition was performed using an Evos M7000 microscope and Evos M7000 Software.

Flow cytometry

Intracellular staining was performed using Foxp3 transcription factor staining kit (Invitrogen, catalogue no. 00-5523-00) according to the manufacturer’s protocol. Briefly, cell culture was dissociated with Accutase (STEMCELL Technologies, catalogue no. 07920) supplemented with 5% trehalose (Fisher Scientific, catalogue no. BP268710) and fixed in Foxp3 fixation buffer for 30 min at room temperature. Cells were then washed twice with Foxp3 permeabilization buffer and then stained with primary antibodies mouse-DLX2 (Santa Cruz Biotechnology, catalogue no. sc-393879) and rabbit-NEUROD2 (Abcam, catalogue no. ab104430) at 1:100 dilution. After washing with Foxp3 permeabilization buffer, cells were stained with secondary antibodies donkey anti-mouse-488 and donkey anti-rabbit-647 (Invitrogen) at 1:200. Finally, cells were stained with conjugated antibodies KI67-421 (BD, catalogue no. 565929), SOX2-PerCP-Cy5.5 (BD, catalogue no. 561506), EOMES-PE-Cy7 (Invitrogen, catalogue no. 25-4877-42) at 1:100, washed twice with Foxp3 permeabilization buffer and resuspended in 0.2% BSA in PBS. Cells were then analysed directly by flow cytometry. Data were acquired through BD FACSDiva software and analysed with FlowJo. Fractions of marker positive populations were normalized to the mean value of two NT sgRNAs per individual sample at each timepoint. Two-sided Wilcoxon tests were performed on replicate mean against NT after collapsing two NT sgRNAs to identify significant changes in cell-type composition.

Immunohistochemistry

Primary cortical tissue were fixed with 4% paraformaldehyde in PBS overnight, washed three times with PBS, then placed in 10% and 30% sucrose in PBS overnight, sequentially, and embedded in optimal cutting temperature compound for sectioning to 20 μm.

All samples were blocked in blocking solution (5% BSA, 0.3% Triton-X in PBS) for 1 h. Primary and secondary antibodies were diluted in blocking solution. Samples were incubated in primary antibody solution overnight at 4 °C, then washed three times with PBS at room temperature. Samples were then incubated in a secondary antibody solution with 4′,6-diamidino-2-phenylindole for 1 h at room temperature and then washed three times with PBS before being mounted with Fluoromount (Invitrogen, catalogue no. 0100-20). Primary antibodies used include mouse-DLX2 (Santa Cruz Biotechnology, catalogue no. sc-393879) 1:50; rabbit-SCGN (Millipore-Sigma, catalogue no. HPA006641) 1:500; rabbit-KI67 (Vector, catalogue no. VP-K451) 1:500; mouse-HOPX (Santa Cruz Biotechnology, catalogue no. sc-398703) 1:50; sheep-ARX (R&D, catalogue no. AF7068SP) 1:500. Secondary antibodies in this study include: donkey anti-sheep 647 (Thermo Fisher, catalogue no. A21448) 1:1,000, donkey anti-mouse 488 (Thermo Fisher, catalogue no. A32766) 1:1,000, donkey anti-rabbit 555 (Thermo Fisher, catalogue no. A32794) 1:1,000. Images were collected using ×20 air objectives on an Evos M7000 microscope, and processed using ImageJ/Fiji.

Target TF selection through enhancer-driven gene regulatory network inference

Two public datasets of single-cell multi omic developing human cortex10,12 were used for enhancer-driven gene regulatory network (eGRN) inference. The dataset from ref. 12 was subset randomly to 50,000 cells. Cells were grouped into meta cells using SEACells106 to overcome data sparsity and noise. On average, one meta cell contains 75 single cells and meta cells representing all the cell types originally presented in the dataset were input to the SCENIC+ workflow. Peak calling was performed using MACS2 in each cell type and a consensus peak set was generated using the TGCA iterative peak filtering approach following the pycisTopic workflow. The resulting consensus peaks were then summarized into a peak-by-nuclei matrix and used as input for topic modelling using default parameters in pycisTopic. The optimal number of topics (15) was determined based on four different quality metrics provided by SCENIC+. We applied three different methods in parallel to identify candidate enhancer regions by selecting regions of interest through (1) binarizing the topics using the Otsu method; (2) taking the top 3,000 regions per topic; (3) calling differentially accessible peaks on the imputed matrix using a Wilcoxon rank sum test (log2FC > 1.5 and Benjamini–Hochberg-adjusted P < 0.05). Pycistarget and discrete-element-method-based motif enrichment analysis were used to link candidate enhancers to TFs. Next, eGRNs, defined as TF-region-gene network consisting of (1) a specific TF, (2) all regions that are enriched for the TF-annotated motif and (3) all genes linked to these regions, were determined by a wrapper function provided by SCENIC+ using the default parameters and subsequently filtered according to the following criteria: (1) only eGRNs with more than ten target genes and positive region-to-gene relationships were retained; (2) eGRNs with an extended annotation was kept only if no direct annotation is available and (3) only genes with top TF-to-gene importance scores (rho > 0.05) were selected as the target genes for each eGRN. Specificity scores were calculated using the RSS algorithm based on region-or gene-based eGRN enrichment scores (area under the curve scores). To infer potential effects of TF repression on RG differentiation, in silico KD simulation was applied following the SCENIC+ workflow to a subset of TFs that showed high correlation between TF expression and target region enrichment scores. Briefly, a simulated gene expression matrix was generated by predicting the expression of each gene using the expression of the predictor TFs, while setting the expression of the TF of interest to zero. The simulation was repeated over five iterations to predict indirect effects. Cells were then projected onto an eGRN target gene-based PCA embedding and the shift of the cells in the original embedding was estimated based on eGRN gene-based area under the curve values calculated using the simulated gene expression matrix. Metrics including (1) normalized TF expression in RG, (2) scaled TF motif accessibility and target gene expression in RG, (3) predicted eGRN sizes, (4) cell-type specificity (RSS) of eGRNs and (5) predicted transcriptomic consequences in the RG lineage were used to select TF targets for this study (Supplementary Table 1).

Plasmids and lentivirus production

Plasmids

The all-in-one CRISPRi plasmid encoding dCas9-KRAB and sgRNA were obtained from Addgene (catalogue no. 71237). Capture sequence 1 was cloned into the vector by Vectorbuilder to enable sgRNA capture with 10x Genomics single-cell capture50. mCherry-H2B was cloned into the vector to substitute GFP for lineage tracing and flow cytometry experiments. For single sgRNA cloning, the oligonucleotides containing 20 bp protospacer and overhang were obtained from Integrated DNA Technologies and cloned into the BsmBI-v2 (New England Biolabs, catalogue no. R0739)-digested backbone through T4 ligation (T4 DNA ligase, New England Biolabs, catalogue no. M0202). Protospacers were obtained from dual sgRNA CRISPRi libraries107 or Dolcetto52. For experiments in rhesus macaques, protospacers that were mapped uniquely to the rheMac10 genome were selected. STICR plasmids (Addgene catalogue numbers 180483, 186334, 186335) used for lineage tracing were obtained from the Nowakowski laboratory.

sgRNA library construction

Oligonucleotide pools were synthesized by Twist Bioscience. BsmBI recognition sites were appended to each sgRNA sequence along with the appropriate overhang sequences for cloning into the sgRNA expression plasmids, as well as primer sites to allow differential amplification of subsets from the same synthesis pool. The final oligonucleotide sequence was: 5′-(forward primer)CGTCTCACACCG(sgRNA, 20 nt)GTTTCGAGACG(reverse primer).

Primers were used to amplify individual subpools using 50 μl 2× NEBNext Ultra II Q5 Master Mix (New England Biolabs, catalogue no. M0544S), 20 μl of oligonucleotide pool (approximately 20 ng), 0.5 μl of primer mix at a final concentration of 0.5 μM, and 29 μl nuclease-free water. PCR cycling conditions: (1) 98 °C for 30 s; (2) 98 °C for 10 s; (3) 68 °C for 30 s; (4) 72 °C for 30 s; (5) go to (2), eight times; (6) 72 °C for 2 min.

The resulting amplicons were PCR-purified (Zymo, catalogue no. D4060) and cloned into the library vector using Golden Gate cloning with Esp3I (Thermo Fisher Scientific, catalogue no. ER0451) and T7 ligase (New England Biolabs, catalogue no. M0318); the library vector was pre-digested with BsmBI-v2 (New England Biolabs, catalogue no. R0739). The ligation product was electroporated into MegaX DH10B T1R Electrocomp cells (Invitrogen, catalogue no. C640003) and grown at 30 °C for 24 h in 200 ml LB broth with 100 μg ml−1 carbenicillin. Library diversity and sgRNA representation were assessed through PCR amplicon sequencing.

Lentivirus production

Lentivirus was produced in HEK293T cells (Takaro Bio). HEK293Ts were seeded at a density of 80,000 cells cm−2 24 h before transfection. Transfection was performed using Lipofectamine 3000 (Invitrogen, catalogue no. L3000001) transfection reagent according to the manufacturer’s protocol. After 6 h, the medium was removed and replaced with fresh medium supplemented with 1× ViralBoost (Alstem, catalogue no. VB100). Supernatant was collected at 48 h after transfection and concentrated roughly at 1:100 with lentivirus precipitation solution (Alstem, catalogue no. VC100).

Generation and analysis of scRNA-seq libraries

scRNA-seq library generation

The manufacturer-provided protocol (CG000421 Rev D) was used to generate 10x single-cell gene expression and sgRNA libraries. Samples from several individuals were pooled (Supplementary Table 3) and each 10x lane was loaded with ~100,000 cells in total. sgRNA libraries were amplified separately from each 10x cDNA library using the 10x 3’ Feature Barcode Kit (catalogue no. PN-1000262). To generate STICR barcode libraries, 10 μl of 10x cDNA library was used as template in a 50 μl PCR reaction containing 25 μl Q5 Hot Start High Fidelity 2× master mix (NEB, catalogue no. M0494) and STICR barcode read 1 and 2 primers (0.5 μM, each) described in Delgado et al.1 using the following program: 1, 98 °C, 30 s; 2, 98 °C, 10 s; 3, 62 °C, 20 s; 4, 72 °C, 10 s; 5, repeat steps 2–4 15 times; 6, 72 °C, 2 min; 7, 4 °C, hold. A 0.8–0.6 dual-sided size selection was performed using SPRIselect Bead (Beckman Coulter, catalogue no. B23318) after PCR amplification.

For the ARX, LMO1 dKD experiment, cells are captured with Illumina Single Cell CRISPR Prep (T10, catalogue no. 20132435) where direct sgRNA capture was enabled. Gene expression and sgRNA libraries were prepared following the manufacturer-provided protocol (FB0004762; FB0002130).

Alignments and quality control

Libraries were sequenced on Illumina NovaSeq platforms to the depth of roughly 20,000–30,000 reads per cell for gene expression and 5,000 reads per cell for sgRNA and STICR libraries. Data were acquired using Illumina sequencer control software and bcl2fastq software.

The 10x gene expression libraries, together with sgRNA and STICR libraries, were aligned to the hg38 genome obtained from CellRanger (refdata-gex-GRCh38-2024-A) with feature barcode reference (Supplementary Table 2) using CellRanger v.7.2.0. Aligned counts were then processed with Cellbender108 for ambient removal. The resulting counts were processed by Scanpy109 to remove low quality cells containing fewer than 1,000 genes, a high abundance of mitochondrial reads (greater than 15% of total transcripts) or a high abundance of ribosomal reads (greater than 40% of total transcripts). Illumina single-cell CRISPR libraries were aligned using DRAGEN single-cell RNA v.4.4.5 with additional arguments --single-cell-barcode-tag and --single-cell-umi-tag to support downstream individual demultiplexing. Low quality cells that had fewer than 500 genes or high abundance of mitochondrial or ribosomal reads were removed.

To align 10x runs with pooled human and rhesus macaque individual samples (Supplementary Table 3), we made a composite genome using CellRanger mkref function with (1) hg38 (CellRanger refdata-gex-GRCh38-2024-A) and (2) a version of rheMac10 included in a hierarchical alignment and annotated with the Comparative Annotation Toolkit110 as part of ref. 111, then filtered with litterbox (https://github.com/nkschaefer/litterbox). Cells identified as cross-species multiplets were removed from downstream analyses.

Demultiplexing of individuals from pooled sequencing and doublet removal

CellSNP-lite followed by Vireo112,113 were used to identify different individuals based on reference-free genotyping using candidate SNPs identified from 1000 Genome Project. Sex information was acquired through PCR-based genotyping using genomic DNA extracted from each person. Each person was assigned based on sex and pooling information (Supplementary Table 3). The same individuals from different 10x lanes were merged based on identical SNP genotypes. Inter-individual doublets and unassigned cells were removed from downstream analyses.

sgRNA and lineage barcode assignments

Cellbouncer114 was used for sgRNA assignment using the sgRNA count matrix obtained from CellRanger. ‘effective_sgRNA’ was defined for each cell based on assigned sgRNAs after collapsing NT sgRNA. For example, cells assigned with NT sgRNA and SOX2-targeting sgRNA are classified as ‘effective_sgRNA’=SOX2. ‘Target_gene’ was defined for each cell based after collapsing NT sgRNA and sgRNAs sharing the same target genes. For example, cells infected with two different SOX2-targeting sgRNAs are classified as ‘Target_gene’=SOX2 and included for analysing SOX2 KD phenotypes at the gene level. Multi-infected cells with sgRNAs targeting different genes were assigned to have more than one ‘Target Gene’ and removed from downstream analyses.

KD efficiency in the initial screen was calculated in each cell class following the DEseq2 (ref. 115) workflow, where single-cell data were pseudobulked by cell class per sgRNA per individual. We reasoned that KD efficiency can vary between cell types and classes depending on the baseline expression level of the target gene, and that KD effects may be difficult to detect in cell types with low baseline expression. log2FC output by the DEseq2 Wald test was obtained in each cell class and the lowest log2FC among cell classes was used for visualization and downstream filtering (Supplementary Table 4). sgRNAs that had KD efficiency less than 25% (log2FC > −0.4, 18 sgRNAs) were considered inactive and excluded from downstream analyses. KD efficiency at the gene level was calculated pseudobulking each supervised cell type per TF per gene per individual after collapsing all active sgRNAs targeting the same gene. Pearson correlation coefficients comparing log2FC of DEGs (DEseq2 Wald test, Benjamini–Hochberg-adjusted P < 0.05) at the sgRNA level and at the gene level were calculated to evaluate variabilities between sgRNAs.

STICR barcodes for lineage tracing experiments were aligned and assigned using a modified NextClone116 workflow that allows for whitelisting. The pipeline is available at https://github.com/cnk113/NextClone. Individual barcodes were filtered by at least three reads supporting a single unique molecular identifier and at least two unique molecular identifiers to call cells with a barcode. Clone calling was done using CloneDetective116.

Comparison with public datasets and cell-type annotation

The processed developing human cortex multiomic dataset, including metadata and aligned CellRanger output, was obtained from ref. 12 and used for reference mapping. The reference model was built with scvi-tools117 using the top 2,500 variable genes, and label transfer to query datasets generated in this study was performed for data integration and to examine correspondence of cell-type assignment to the reference dataset. Cell-type annotation in the initial human screen generated in this study was performed based on marker expression as well as scANVI predictions. The initial human screen dataset was then used as a reference dataset to map other 2D populations collected from lineage-resolved targeted screen in human and rhesus macaque, and the ARX, LMO1 dKD experiment to minimize impacts of batch effects and species differences on cell-type annotation.

Public datasets of processed single-cell RNA-seq from iNeuron53, FeBO54 and iPS-cell-derived organoid55 studies were used to compare fidelity of in vitro specified cell types across different systems. Reference mapping and label transfer to the ref. 12 in vivo dataset were performed to examine correspondence to the in vivo cell types. Pearson correlation coefficients per cell type between datasets were calculated using expression of the top 25 markers identified from the in vivo dataset to compare the fidelity of cell identities. To assess the level of cellular stress, gene scores for glycolysis, endoplasmic reticulum stress, reactive oxygen species (ROS), apoptosis and senescence were calculated using the score_genes function in Scanpy with gene sets obtained from the MSigDB database118,119. NT control cells of all cell types were subset from data obtained in this study for this comparison. Each dataset was then downsampled randomly to 5,000 cells to ensure balanced representation between datasets.

Trajectory analysis using pseudotime and RNA velocity

The initial human screen data was subset to NT cells for pseudotime and RNA velocity inference. Pseudotime was calculated using scFates120 by tree learning with SimplePPT and setting the root node within the RG class on ForceAtlas2 embedding. Excitatory and inhibitory trajectories were defined as main branches of the principal graph that led to distinct sets of clusters. The velocyto121 pipeline was implemented to quantify spliced and unspliced reads from CellRanger output. Highly variable genes were separately calculated using spliced and unspliced matrices and the top 3,000 genes were used for inferring RNA velocity using scVelo122.

Differential composition and gene expression analysis

Cells with single ‘Target_gene’ were subset and sgRNAs targeting the same gene were collapsed for downstream analyses at the gene level. Composition changes in each cluster per TF perturbation were quantified using DCATS123, which detects differential abundance using a beta-binomial generalized linear model model and returns the estimated coefficients and likelihood ratio test P values. To detect compositional changes in a finer resolution, Milo76 was used to quantify differential abundance in a label-free manner. Briefly, perturbed and/or NT cells were downsampled randomly to ensure the balance of total cell numbers between the two groups. Neighbourhoods were constructed based on k-nearest-neighbour graphs, and differential cell abundance in each neighbourhood was then tested against NT using design = ~stage + sex + batch + perturbation. Robustness to downsampling was tested in four perturbations through downsampling randomly to 200–600 cells.

Cluster-aware differential gene expression analysis was performed following the DEseq2 (ref. 115) pipeline using NT cells as reference and individual as replicates. When pseudobulking within cell types, conditions that have fewer than five cells per cell type or fewer than two biological replicates were removed from downstream analyses. DEGs (DEseq2 Wald test, Benjamini–Hochberg-adjusted P < 0.05) identified under more than one perturbation across 44 TFs perturbations at the gene level in the initial human screen were defined as ‘convergent DEGs’ and gene ontology enrichment analysis was performed using non-convergent DEGs as background with richR (https://github.com/guokai8/richR). One-sided Fisher’s exact test was performed to identify significant overlap between convergent DEGs and seven sets of disease-associated genes (see below) using non-convergent DEGs as background.

To prioritize TFs whose repression led to strongest cellular and transcriptional and consequences in the initial human screen, |estimated coefficient| from DCATS and numbers of DEGs from DEseq2 across seven cell types in the EN and IN lineage (RG_DIV, RG-Astro, IPC_EN, EN_ImN, EN, IPC_IN, IN) were summed to quantify accumulated effects of TF repression in composition and gene expression, respectively. To test the robustness of prioritized TFs, scCoda124 was applied to estimate log2FC in cell abundance per cell type. Euclidean and energy distance between each TF perturbation were calculated with Pertpy125 to compare the global transcriptional landscape post perturbation.

The DCATS and DEseq2 pipeline described above was also implemented in the datasets generated in lineage-resolved targeted screens in human and macaque 2D culture and in human organotypic slice culture to compare results between batches, cultural systems and species, and to support robustness and reproducibility of the phenotypes reported.

Disease-associated genes from previous studies

Genes significantly associated with neurodevelopmental and neuropsychiatric disorders were obtained from:

  • ASD: SFARI gene database85, score 1

  • NDD: Supplementary Table 11 in ref. 86

  • MDD: Supplementary Table 9 in ref. 126

  • SCZ: Supplementary Table 12 in ref. 127

  • BP: Supplementary Table 4 in ref. 128

  • ADHD: Supplementary Table 7 in ref. 129

  • AD: Supplementary Table 5 in ref. 130

Lineage coupling and clonal clustering

Clones with fewer than three cells and clones that have conflicted sgRNA assignments were removed from the clonal analysis to consider only multicellular clones generated post infection. One human individual (GW17 187-M) with overall low cellular coverage was dropped from the subsequent analyses. Cospar88 was used to calculate the fate coupling scores per perturbation, defined as the normalized barcode covariance between different cell types.

scLiTr89 was used to identify and cluster fate biased clones by training a neural network to predict clonal labels of nearest neighbours for each clonally labelled cell. To minimize batch effects in cluster identification, we used integrated lineage-resolved data to define five clusters with distinct fate biases. The results obtained were exported and a two-sided paired Wilcoxon test was performed comparing the mean fractions of clonal clusters per TF perturbation against NT. Wilcoxon rank sum test through Scanpy was used to identify marker genes in the RG class (RG-Astro and RG_DIV) between different clonal clusters in human. Monocle3 (ref. 131) was used to construct pseudotime trajectory at the clonal level per species. The median pseudotime per individual was calculated under each perturbation and used for two-sided paired Wilcoxon tests to evaluate the changes on clonal pseudotime. To fit gene expression along the clonal pseudotime, cells in the RG class from multicellular clones were subset and assigned with pseudotime values based on their clone identities.

IN subtype identification and gene expression analysis

To identify IN subtypes, the IN cluster from the initial human screen was subset and re-clustered into six subtypes, each expressing distinct markers identified through Scanpy using Wilcoxon rank sum test. Reference mapping and label transfer were then performed querying INs generated in the lineage-resolved human and macaque datasets and the dKD dataset to identify equivalent clusters. For cell abundance testing in IN subtypes in each perturbation, NT and perturbed INs were downsampled and the Milo pipeline described above was applied to identify differential abundance within the IN cluster. Differential expression analysis was performed using DEseq2 contrasting the IN_LMO1/RIC3 cluster with other physiological subtypes identified in unperturbed INs (IN_NFIA/ST18, IN_PAX6/ST18, IN_CALB2/FBN2, IN_PBX3/GRIA1) in each species, using individuals as replicate. Gene ontology terms enrichment in biological processes in the human IN_LMO1/RIC3 cluster were identified using pathfindR132 (one-sided hypergeometric test).

To survey physiological counterparts of the IN_LMO1/RIC3 cluster, an in vivo atlas of the developing macaque telencephalon96 was used as reference and label transfer using scvi-tools was performed to map INs generated in the targeted 2D screen. Gene scores for glycolysis, endoplasmic reticulum stress, ROS, apoptosis and senescence were calculated in INs per perturbation as described above to assess changes in cellular stress induced by TF perturbation.

Comparison with organotypic slice culture dataset

The lineage-resolved targeted screen dataset in human organotypic slice culture was preprocessed as described above. Cell-type annotation was done based on marker gene expression and reference mapping to the ref. 12 atlas. Only cells derived from multicellular clones were included for visualizing cell-type compositions per TF perturbation to consider the effects on dividing progenitors.

Four IN clusters, including those derived from RG in vitro (IN_local) and those labelled post-mitotically (IN_MGE, IN_CGE,IN_OB), were used to compare the pre- and post-mitotic effects of ARX KD on IN gene expression and subtype specification using Milo and DEseq2 as described above. IN_local was used for iterative clustering to identify IN subtypes described in 2D and compare between perturbations.

ARX and LMO1 dKD

A GFP expressing CRISPRi library targeting LMO1 and NT controls, together with an mCherry-expressing vector targeting ARX, were used to achieve dKD and double infected cells expressing both colours were sorted and captured with Illumina Single Cell CRISPR Library Kit. The resulting library was aligned with DRAGEN Single Cell RNA v.4.4.5, reference mapped to the initial human screen data and integrated with previous data from the targeted human screen. sgRNA assignment was done using cellbouncer and double infected INs (ARX, LMO1 dKD or ARX, NT KD) were used for downstream analysis. Differential gene expression in INs was performed using the DEseq2 Wald test by contrasting ARX, LMO1 dKD against ARX KD. Cluster-free differential abundance testing was performed using Milo after integrating with previous batches to compare differential abundance of IN subtypes in NT, ARX KD and dKD.

Ethics and inclusion statement

All studies were approved by UCSF GESCR (Gamete, Embryo, and Stem Cell Research) Committee.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.