Introduction

Schizophrenia (SCZ) is a severe, chronic psychiatric disorder with neurodevelopmental origins1, affecting ~1% of the global population2. Its complex etiology and heterogeneous clinical presentation have limited progress in therapeutic development3. Since therapies targeting genetically associated disease mechanisms have a higher success rate4, understanding SCZ’s genetic architecture and pathogenic mechanisms is essential for new treatments.

SCZ has a strong genetic basis. Genome-wide association studies (GWAS) have identified 287 common variant loci associated with SCZ risk5. Although each contributes modest risk, their cumulative burden is substantial. Among biological pathways enriched for SCZ-associated variants, the histone H3K4 methylation pathway is prominent6. Genes encoding H3K4 methyltransferases, demethylases, and readers are implicated in several neurodevelopmental disorders (NDDs), including autism spectrum disorder (ASD) and intellectual disability (ID)7. The frequent implication of H3K4me3 regulatory proteins in neurodevelopmental disorders suggests a critical role for H3K4me3-associated pathways in human brain development, which may occur through both enzymatic and non-enzymatic mechanisms.

In addition to common variants, rare copy number variations (CNVs) and loss-of-function (LoF) mutations in individual genes also contribute to risk. Rare heterozygous LoF mutations in SETD1A are among the strongest single-gene risk factors for SCZ (OR = 20.1)8 and are also implicated in NDDs8,9,10,11. SETD1A encodes a methyltransferase that catalyzes H3K4me2/3 as part of the COMPASS complex12. A recurrent heterozygous patient mutation, SETD1A c.4582-2delAG, affects the exon 16 splice acceptor and has been identified in multiple SCZ and NDD cohorts9,10,11,13,14. Despite this genetic evidence, SETD1A’s targets and function in the developing human brain remain poorly understood. Whether its regulatory actions converge on SCZ polygenic risk loci or contribute to sporadic SCZ in the broader population remains unclear.

In mouse models, Setd1a is required for neural progenitor proliferation15,16 and its deficiency leads to impaired neuronal morphology and synaptic function16,17,18,19. These phenotypes are rescued by LSD1 (KDM1A) inhibition but not by KDM5 inhibition16, implicating H3K4me3 dynamics in disease mechanisms.

SETD1A has also been implicated in genome stability, but its role in non-dividing neuronal cells is unknown. In yeast and non-neuronal human cells, SETD1A localizes to DNA damage sites and mediates H3K4me3 to promote repair20,21,22. During early brain development, neural progenitors undergo extensive proliferation, making accurate DNA repair critical. Defective repair can lead to somatic mutations, particularly at recurrent DNA break clusters (RDCs) that overlap with common fragile sites (CFSs), mosaic CNVs, and long neural genes linked to psychiatric disorders23,24,25,26,27,28. Recent studies in human postmortem neurons revealed increased somatic single nucleotide variants (SNVs) in SCZ brains at transcription factor binding sites active during fetal development29. Similarly, DNA repair hotspots (DRHs) have been identified in open chromatin of postmitotic neurons30,31.

Here, we used CRISPR to engineer human iPSC-derived isogenic lines carrying the SETD1A c.4582-2delAG mutation. We characterized the effects of SETD1A haploinsufficiency on neurodevelopment, gene expression, chromatin occupancy, and DNA repair. Integrating GWAS, CUT&Tag profiling, and postmortem transcriptomic data, we show that SETD1A regulates genes involved in synaptic signaling and genome maintenance, converging with both rare and common SCZ genetic risk.

Results

SETD1A c.4582-2delAG mutation causes SETD1A haploinsufficiency and reduced H3K4me3

To investigate the consequences of SETD1A haploinsufficiency during early human neurodevelopment, we used CRISPR-Cas9 to introduce heterozygous LoF mutations (SETD1A+/−) in two neurotypical iPSC lines (LIBD2c1 and LIBD7c6)32. We generated 12 isogenic SETD1A+/− lines, including two clones carrying the recurrent patient-specific SETD1A c.4582-2delAG mutation at the exon 16 splice acceptor site, which causes intron retention and a premature stop codon in exon 16 (SETD1Ac.4582-2delAG/+) (Fig. 1a–c; Supplementary Fig. 1a, b and Supplementary Data 1), along with six unedited wild-type (WT) clones. No off-target edits or CNVs were detected (Supplementary Fig. 1c and Supplementary Data 2), and six SETD1A+/− and six WT lines were selected for downstream analysis (Supplementary Data 3).

Fig. 1: SETD1A haploinsufficiency causes neurodevelopmental abnormalities.

a Schematic of targeted pathogenic SETD1A loss-of-function variant (c.4582-2delAG). Intron 15 retention causes frame-shifted exon 16 with premature stop codon. b Isogenic iPSC lines from two individuals. Six SETD1A+/− and three WT from LIBD2c1; two SETD1Ac.4582-2delAG/+, four SETD1A+/−, three WT from LIBD7c6. All SETD1A+/− carry premature stop codons. c Chromatograms showing WT and mutant alleles with c.4582-2delAG mutation. d Representative images of iPSCs, NPCs, neurons, cerebral organoids from WT and SETD1Ac.4582-2delAG/+. Nuclei: Hoechst. Scale: 50 μm (iPSC, organoids); 20 μm (NPC); 10 μm (neurons). e SETD1A expression in iPSCs (n = 2 technical replicates), NPCs (n = 3 technical replicates), 4-week neurons (n = 1 or 4 technical replicates), day-48 cerebral organoids (n = 2 technical replicates) from six WT, six SETD1A+/− including two SETD1Ac.4582-2delAG/+, normalized to GAPDH. Mann–Whitney test, **p = 0.0051; ***p = 0.0005; ****p < 0.0001. Unpaired t-test, *p = 0.015. f SETD1A transcript frequencies between exons 15–16 in NPCs and 4–6-week neurons from six WT, six SETD1A+/−. Binomial test, ****p < 0.0001. g Western blots showing reduced full-length SETD1A and absent truncated protein in SETD1Ac.4582-2delAG/+ NPCs/neurons. h Histone H3K4 methylation in NPCs from two WT (n = 2 technical replicates), two SETD1Ac.4582-2delAG/+ (n = 3 technical replicates), normalized to H3. Unpaired t-test, *p = 0.0208. i NPC proliferation over 3 days from six WT (n = 3 or 7 technical replicates), six SETD1A+/− (n = 3, 6, or 7 technical replicates). Unpaired t-test, ***p = 0.0003. j Schematic of experimental procedures. k SYN-RFP-labeled 6-week neurons. Scale: 100 μm. l Sholl analysis from three WT (n = 56 cells), two SETD1Ac.4582-2delAG/+ (n = 40 cells). Two-way ANOVA with Bonferroni’s test, ***p = 0.0003 (60 µm); ***p = 0.0002 (70 µm); **p = 0.0019 (160 µm); *p = 0.0191 (170 µm); ****p < 0.0001 (50, 80–150 µm). m Raster plots showing neuronal bursts (60 s) at weeks 8, 11. 
Blue = network bursts. n MEA recordings (6–13 weeks): firing rates, synchrony index, inter-network burst intervals from WT (six lines), SETD1A+/− (six lines) including two SETD1Ac.4582-2delAG/+. Scheirer-Ray-Hare test with Dunn’s tests, **p = 0.0038 (week 8); **p = 0.0024 (week 9) for mean firing rate; **p = 0.0275 (week 9) for synchrony index; *p = 0.0223 (week 7); **p = 0.0086 (week 8); *p = 0.0102 (week 9) for inter-network burst-interval. All tests two-sided except (f) (one-sided). Mean ± S.D. (e, h, i); Mean ± S.E.M. (l); Median with interquartile range (n). Source data provided as Source Data file.

Upon differentiation into 2D NPCs, 2D forebrain neurons, and 3D cerebral organoids (COs) (Fig. 1d), SETD1A mRNA was significantly reduced in all heterozygous lines across all cell types (Fig. 1e). RNA-seq analysis confirmed allele-specific downregulation of the SETD1A c.4582-2delAG allele, consistent with nonsense-mediated decay in SETD1Ac.4582-2delAG/+ lines33 (Fig. 1f). Western blotting showed reduced full-length SETD1A protein with no detectable truncated protein products in SETD1Ac.4582-2delAG/+ NPCs and neurons (Fig. 1g).

We next examined the impact of SETD1A haploinsufficiency on H3K4 methylation. A specific and significant reduction in H3K4me3 was observed in NPCs from SETD1Ac.4582-2delAG/+ lines (Fig. 1h). H3K4me1/2 levels were also reduced, although these did not reach statistical significance.

SETD1A haploinsufficiency disrupts NPC proliferation and alters neuronal morphology and network activity

We next assessed whether SETD1A LoF affects neurodevelopmental cellular phenotypes. Neural progenitor cell (NPC) proliferation was significantly decreased in SETD1A+/− lines compared to WT (Fig. 1i), consistent with prior mouse studies15,16,34. To evaluate neuronal maturation and function, we performed Sholl analysis and multielectrode array (MEA) recordings on neurons co-cultured with astrocytes (Fig. 1j). Neurons from SETD1Ac.4582-2delAG/+ lines showed reduced neurite complexity (Fig. 1k, l) and exhibited earlier and more robust increases in spontaneous activity and synchrony than WT neurons (Fig. 1m, n). This hyperexcitability phenotype was most pronounced between weeks 7 and 9 of culture and aligns with findings from CRISPR-engineered neurons carrying a heterozygous SETD1A mutation not found in patients35. Collectively, these data indicate that SETD1A haploinsufficiency leads to disrupted NPC proliferation and altered neuronal morphology and activity during early neurodevelopment.

Patient-specific SETD1A c.4582-2delAG mutation alters NPC fate, neuronal maturation, and subtype specification

To define the neurodevelopmental impact of SETD1A haploinsufficiency, we performed single-cell RNA-seq (scRNA-seq) on COs at days 30 and 48 derived from three WT and two SETD1Ac.4582-2delAG/+ lines (Fig. 2a and Supplementary Fig. 2a–c). We identified 13 cell types based on established marker genes and cell-state identifiers36,37 (Fig. 2b; Supplementary Fig. 2d and Supplementary Data 4). Both genotypes showed increased neuronal proportions over time, but SETD1Ac.4582-2delAG/+ COs had a persistently lower fraction of S/G2/M-phase NPCs by day 30, indicative of impaired developmental trajectory (Supplementary Fig. 2e). G1/postmitotic NPCs were also significantly reduced in mutant lines (Fig. 2c), and cycling NPCs skewed toward G2/M-phase, suggesting mitotic delay (Supplementary Fig. 2f).

Fig. 2: SETD1A patient mutation increases genomic instability in neuronal cells.

a Schematic of COs on days 30, 48 analyzed in scRNA-seq. Three WT lines and two SETD1Ac.4582-2delAG/+ lines. b UMAP showing 12,803 cells from COs of three WT and two SETD1Ac.4582-2delAG/+ lines on days 30, 48, colored by cell types. IPC: intermediate progenitor cell; NEC: neuroectoderm cell. c Cell type proportion of COs on days 30, 48. Two-way ANOVA with Bonferroni’s test, ****p < 0.0001. d Top: volcano plot showing 624 DEGs in NPC (S/G2/M) between genotypes. 299 genes downregulated, 325 genes upregulated in SETD1Ac.4582-2delAG/+ vs. WT with FDR < 0.05. Bottom: volcano plot showing 309 DEGs in Neuron between genotypes. 108 genes downregulated, 201 genes upregulated in SETD1Ac.4582-2delAG/+ vs. WT with FDR < 0.05. Dashed lines indicate FDR = 0.05. e GSEA results of gene expression patterns NPC (S/G2/M) and Neuron. NES: normalized enrichment score. *adjp < 0.05; **adjp < 0.01; ****adjp < 0.0001. f Representative image of CO on day 48 from WT iPSC line and schematic of organoid layers. Nuclei: Hoechst. Scale: 50 μm. g Representative images showing γH2AX expression in NPCs in COs from WT and SETD1Ac.4582-2delAG/+ lines on day 48. WT organoids treated with 0.25 μM aphidicolin (APH) for 24 h as positive control. Scale: 10 μm. h Quantification of NPCs with γH2AX foci in COs. APH-treated WT organoid (n = 1 technical replicate) as positive control. Each data point indicates average of 6 ROI from two organoids per iPSC line (six WT; six SETD1A+/− including two SETD1Ac.4582-2delAG/+). Unpaired t-test, ****p < 0.0001. i Representative images showing γH2AX foci in 6-week neurons from WT and SETD1Ac.4582-2delAG/+. WT neurons treated with 100 μM H2O2 for 60 min. Arrowheads indicate neurons with γH2AX foci. Scale: 20 μm (low magnification); 10 μm (high magnification). j Quantification of neurons with γH2AX foci, n = 3 technical replicates; two WT (n = 1229 cells); two SETD1Ac.4582-2delAG/+ (n = 825 cells). Unpaired t-test, ***p = 0.0002. 
k Histogram showing fraction of nuclei with γH2AX foci (≥1 foci) for neurons from n = 3 batches; two WT (n = 356 cells) and two SETD1Ac.4582-2delAG/+ (n = 397 cells). Kolmogorov–Smirnov test, statistic = 0.1916; p = 2.55e-16. All tests two-sided. Mean ± S.D. (c, h, j). Source data provided as Source Data file.

Differential expression analysis of S/G2/M NPCs identified 624 DEGs (FDR < 0.05) with downregulation of DNA replication/repair genes and upregulation of mitotic and synaptic programs (Fig. 2d, e and Supplementary Data 5). Analysis of G1/postmitotic NPCs revealed 788 DEGs (FDR < 0.05), but gene set enrichment analysis (GSEA)38 did not detect significantly enriched pathways (Supplementary Fig. 2g and Supplementary Data 5). To validate findings, we performed bulk RNA-seq on 2D NPCs from six WT and six SETD1A+/− lines, including SETD1Ac.4582-2delAG/+ and CRISPR-engineered mutations not found in patients (Supplementary Fig. 3a). A total of 161 DEGs (FDR < 0.1) were detected in NPCs (Supplementary Fig. 3b and Supplementary Data 6), with GSEA revealing downregulation of DNA replication, chromatin remodeling, histone modification, and telomere maintenance pathways, and upregulation of neurogenesis-related genes in SETD1A+/− lines (Supplementary Fig. 3c).

Given that cell cycle length influences the transition of NPCs from proliferation to neurogenesis39,40,41,42, we next evaluated the effect of SETD1A LoF on neuronal differentiation. Flow cytometry of PSA-NCAM immunostaining in day 30 COs showed a higher proportion of cells with high PSA-NCAM expression in SETD1A+/−, suggesting increased immature neuron presence without a change in total PSA-NCAM+ cells (Supplementary Fig. 3d–f). Neuronal pseudo-bulk analysis of the scRNA-seq revealed 309 DEGs (FDR < 0.05; Supplementary Data 5), including downregulation of glutamatergic synapse genes and upregulation of early synapse and metabolic genes (Fig. 2d, e). To validate this with more isogenic lines, we performed bulk RNA-seq of 2D neurons from six WT and six SETD1A+/− lines (Supplementary Fig. 4a). While the heterogeneous cell types and states in the 2D neuronal cultures reduced the power of bulk RNA-seq to detect DEGs, the analysis confirmed this trend with 41 DEGs (FDR < 0.1; Supplementary Fig. 4b and Supplementary Data 6) and showed downregulation of chromatin regulation, DNA repair, and glutamatergic pathways in SETD1A+/− neurons compared to WT neurons (Supplementary Fig. 3c). The neuron projection guidance pathway was also downregulated in SETD1A+/− neurons, aligning with the observed neurite deficits (Fig. 1k, l).

Additionally, consistent with the significant downregulation of GABAergic differentiation genes in 2D SETD1A+/− neurons (Supplementary Fig. 3c), subtype classification of neurons in CO scRNA-seq revealed an increased glutamatergic-to-GABAergic neuron ratio in SETD1Ac.4582-2delAG/+, with no change in dopaminergic cells (Supplementary Fig. 4c–e). This excitatory shift aligns with elevated spontaneous activity observed in neurons carrying SETD1Ac.4582-2delAG/+ and CRISPR-engineered mutations not found in patients (Fig. 1n), reinforcing the functional impact of subtype imbalance.

Patient-specific SETD1A c.4582-2delAG mutation increases double-strand DNA breaks and leads to a signature of senescence

Given the transcriptional downregulation of DNA repair pathways observed in SETD1Ac.4582-2delAG/+ COs, we next investigated whether SETD1A haploinsufficiency leads to the accumulation of DNA damage in developing neural cells. GSEA of bulk RNA-seq data from 2D SETD1A+/− haploinsufficient NPCs and neurons revealed significant downregulation of genes involved in double-strand break (DSB) repair, chromatin remodeling, histone modification, and telomere maintenance (Supplementary Fig. 3c). A moderate but significant correlation between transcriptomic changes in SETD1A+/− NPCs and neurons (Supplementary Fig. 4f) further supports shared dysregulation of gene programs across developmental stages. Genes downregulated in both SETD1A+/− NPCs and neurons were enriched for DNA damage response and DSB repair pathways (Supplementary Fig. 4g), suggesting that SETD1A haploinsufficiency disrupts these pathways throughout neural development.

To directly assess DNA damage accumulation, we examined the presence of γH2AX foci, a marker of DSBs. Immunostaining of day 30 COs revealed a significantly higher proportion of γH2AX-positive NPCs in SETD1A+/− COs compared to WT (Fig. 2f–h). Similarly, 2D neurons derived from SETD1Ac.4582-2delAG/+ lines showed a greater proportion of γH2AX-positive cells and an increased number of γH2AX foci per nucleus relative to WT neurons (Fig. 2i–k). These results indicate persistent DNA damage in both proliferative progenitor and postmitotic cell populations under conditions of SETD1A haploinsufficiency.

To assess broader biological consequences of this damage, we analyzed aging-related transcriptional signatures in 2D SETD1A+/− NPCs and neurons. These included downregulation of telomere maintenance genes and upregulation of genes involved in inflammatory response pathways (Supplementary Figs. 3c and 4g), both of which are associated with cellular senescence.

Together, these data demonstrate that SETD1A haploinsufficiency is associated with impaired DNA repair and increased accumulation of DNA damage across developmental stages in human neural cells. The accompanying transcriptomic changes suggest premature activation of stress response and aging-like gene expression programs, potentially contributing to altered neurodevelopmental trajectories.

Identification of genome-wide SETD1A targets in the human developing cortex and isogenic SETD1A c.4582-2delAG/+ neuronal cells

To understand the mechanisms underlying the neurodevelopmental perturbation and transcriptional alterations in SETD1A haploinsufficient neuronal cells, we next sought to identify genome-wide targets of SETD1A. We performed CUT&Tag assays43 on 2D NPCs and neurons derived from two WT and two SETD1Ac.4582-2delAG/+ lines and compared SETD1A binding and H3K4me3 binding in SETD1Ac.4582-2delAG/+ cells to those observed in WT cells. We also performed CUT&Tag on three second-trimester human prenatal frontal cortical samples, given that cortical SETD1A expression is higher during the prenatal stage than at other points in the human lifespan (Fig. 3a and Supplementary Fig. 5a, b). To investigate neuronal cells at different stages of the developmental trajectory, including NPCs, intermediate progenitor cells (IPCs), and neurons, we dissected the developing cortical layers from the ventricular zone to the marginal zone (Fig. 3a).

Fig. 3: SETD1A targets active promoter regions of common SCZ risk genes, regulating chromatin remodeling, DNA repair, and synaptic signaling pathways in human brain development.

a Schematic of brain tissue dissected and Allen Human Brain Atlas image at post-conception week (PCW) 21 https://atlas.brain-map.org/atlas?atlas=3#atlas=3&plate=101892621&structure=11585&x=10116&y=23872&zoom=-7&resolution=128.08&z=372&zoom=-7&resolution=128.08&z=3. Human prenatal cortices were dissected to include ventricular zone through marginal zone. b Proportion of SETD1A peaks at active promoters in human prenatal cortices and isogenic iPSC-derived NPCs and neurons. c Representative IGV tracks of CUT&Tag signals for H3K4me3, SETD1A_CST, SETD1A_Atlas, and control IgG on chr16. Black bars indicate peaks at active promoters. d Overrepresentation of SETD1A peaks in human prenatal cortices and isogenic iPSC-derived NPCs and neurons on 287 SCZ GWAS loci5. Permutation test, ****p < 0.0001. e LDSC analysis showing heritability enrichment for psychiatric disorders (SCZ, BD, MDD), neurodevelopmental disorders (ASD, ADHD), neurodegenerative disorders (AD, PD), stroke, and T2DM across SETD1A binding sites. *p < 0.05; ***p < 0.001; ****p < 0.0001. f GO:BP terms enriched for SCZ GWAS significant genes with SETD1A-bound active promoters in human prenatal cortices (165 genes), WT NPCs (199 genes) and WT neurons (199 genes) among 682 SCZ GWAS significant genes5. *FDR < 0.05; **FDR < 0.01; ***FDR < 0.001; ****FDR < 0.0001. g Comparison of SETD1A peaks at active promoters between genotypes (one WT and one SETD1Ac.4582-2delAG/+ lines with technical duplicates for NPCs; two WT and two SETD1Ac.4582-2delAG/+ lines for neurons). Unpaired t-test, **p = 0.0085; n.s. not significant. h Venn diagram showing overlap of SETD1A peaks at active promoters between genotypes. Numbers indicate peak counts; parentheses indicate annotated gene counts. Red parentheses indicate SCZ GWAS-significant genes. i Representative IGV tracks showing lost SETD1A binding in SETD1Ac.4582-2delAG/+ cells (dashed boxes). j Comparison of SETD1A and H3K4me3 binding densities at lost/gained regions.
Wilcoxon signed-rank test, ***p = 0.00012; ****p < 0.0001. k GO:BP terms enriched in genes with lost promoter regions and enrichment of 116 SCZ GWAS significant genes. *FDR < 0.05; **FDR < 0.01; ***FDR < 0.001; ****FDR < 0.0001. l Expression of 116 SCZ GWAS significant genes with lost promoter regions. Wilcoxon signed-rank test, ****p < 0.0001. Data presented as mean ± S.E.M. (b), mean ± S.D. (g), median with 25th–75th percentiles and min-max whiskers (j, l). All tests two-sided. RPKM reads per kilobase per million, TPM transcripts per million.

We first validated the specificity of anti-SETD1A antibodies by western blot analysis on adult postmortem dorsolateral prefrontal cortex (DLPFC) tissue (Supplementary Fig. 5c). We found that two antibodies (CST and Sigma Atlas, which we will refer to as SETD1A_CST and SETD1A_Atlas) showed higher specificity than the other three antibodies, including the antibody from Bethyl that was widely used in previous studies15,16,18,19. The specificity of the two anti-SETD1A antibodies, SETD1A_CST and SETD1A_Atlas, was further confirmed by western blot of HEK293T cells with SETD1A knockdown by siRNA (Supplementary Fig. 5d). We then performed CUT&Tag on the prenatal cortical tissue using five antibodies: H3K27ac, H3K4me1, H3K4me3, and SETD1A (SETD1A_CST and SETD1A_Atlas) (Supplementary Data 7), and on 2D NPCs and neurons using three antibodies: H3K4me3 and SETD1A (SETD1A_CST and SETD1A_Atlas) (Supplementary Data 8).

We confirmed that peaks of H3K27ac, a marker for open chromatin regions, accumulated on both promoters and introns, including enhancers; peaks of H3K4me1, a marker for enhancers, accumulated on introns; and peaks of H3K4me3, indicating active promoters, showed marked accumulation on promoters (Supplementary Fig. 3e). We then annotated the CUT&Tag peaks in genomic regulatory elements, based on the histone modification patterns. We defined three classes of genomic regulatory elements: active promoters, open chromatin regions enriched in H3K4me3 peaks and annotated as promoter [from −1 kb to +100 bp of the transcription start site (TSS)] by ChIPseeker44; active enhancers, regions enriched in H3K27ac but depleted in H3K4me3 and not annotated as promoter; and other genomic regions, comprising the 5′ UTR, 3′ UTR, exon, intron, and distal intergenic regions as annotated by ChIPseeker44.
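
The three-way definition above can be expressed as a small decision rule. The following is a minimal Python sketch, not the authors' pipeline: the `Peak` record, its field names, and the handling of promoter-window peaks that lack H3K4me3 are assumptions made for illustration, and the genomic annotation is assumed to have been produced upstream (for example by ChIPseeker).

```python
from dataclasses import dataclass

# Promoter window relative to the TSS, as stated in the text: -1 kb to +100 bp.
PROMOTER_UPSTREAM_BP = 1000
PROMOTER_DOWNSTREAM_BP = 100


@dataclass(frozen=True)
class Peak:
    """Hypothetical peak record; field names are illustrative only."""
    signed_distance_to_tss: int   # bp, negative = upstream of the TSS
    overlaps_h3k4me3: bool        # peak lies in an H3K4me3-enriched region
    overlaps_h3k27ac: bool        # peak lies in an H3K27ac-enriched region
    annotation: str = "distal intergenic"  # fallback class from the annotator


def in_promoter_window(distance: int) -> bool:
    """True if the signed distance falls inside the promoter window."""
    return -PROMOTER_UPSTREAM_BP <= distance <= PROMOTER_DOWNSTREAM_BP


def classify(peak: Peak) -> str:
    """Assign a peak to active promoter, active enhancer, or its fallback class."""
    promoter = in_promoter_window(peak.signed_distance_to_tss)
    if promoter and peak.overlaps_h3k4me3:
        return "active promoter"
    if peak.overlaps_h3k27ac and not peak.overlaps_h3k4me3 and not promoter:
        return "active enhancer"
    # Assumption: anything else keeps the annotator's label
    # (5' UTR, 3' UTR, exon, intron, or distal intergenic).
    return peak.annotation
```

For example, `classify(Peak(-500, True, True))` returns `"active promoter"`, while an intronic H3K27ac-only peak 25 kb from the nearest TSS is labeled `"active enhancer"`.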

SETD1A preferentially binds active promoter regions and regulates chromatin remodeling and DNA repair in the developing human cortex

SETD1A peaks from the human prenatal cortices accumulated more on active promoter regions (Fig. 3b, c). In all three brains, we found that SETD1A peaks from the two antibodies accumulated more at active promoter regions (~80% of SETD1A_CST peaks and ~40% of SETD1A_Atlas peaks) compared to active enhancers (~3.5% of SETD1A_CST peaks and ~12% of SETD1A_Atlas peaks) (Supplementary Fig. 5f, g). Although SETD1A_CST peaks exhibited more specific accumulation on promoters than SETD1A_Atlas peaks, there was a highly significant overlap of SETD1A peaks between the two antibodies around TSSs in each of the three brains (Supplementary Fig. 5h).

Similarly, we observed that SETD1A predominantly binds active promoter regions (more than 60% of both SETD1A_CST and SETD1A_Atlas peaks) in both NPCs and neurons from both WT and SETD1Ac.4582-2delAG/+ (Fig. 3b, c and Supplementary Fig. 5i–k). SETD1A peaks overall and at active promoter regions in NPCs and neurons showed significant overlap with the peaks in the human prenatal cortex (Supplementary Fig. 5l). These findings contrast with a previous study, which showed that 66% of Setd1a peaks overlap with enhancer marks in the PFC of 6-week-old mice16, perhaps due to differences in methodology, antibodies, and species.

Gene ontology (GO) enrichment analysis with GREAT45 revealed that both active promoters and active enhancers bound by SETD1A in the prenatal cortical tissue genome-wide were significantly enriched in pathways that regulate DNA repair, histone modification, chromatin remodeling, the mitotic cell cycle, and autophagy (Supplementary Fig. 6a, b). The SETD1A-bound active promoters were also specifically enriched in pathways such as the ERAD pathway and telomere maintenance (Supplementary Fig. 6a). In contrast, the SETD1A-bound active enhancers showed specific enrichment in pathways regulating neurogenesis and synaptic functions, consistent with the previous mouse study16 (Supplementary Fig. 6b).

Similar to the results from prenatal brains, genome-wide active promoters bound by SETD1A exhibit significant enrichment for genes involved in chromatin remodeling, DNA repair, proteolysis, and telomere maintenance in both 2D NPCs and neurons from WT lines (Supplementary Fig. 6c, d). These findings indicate that SETD1A regulates these biological pathways throughout early neurodevelopment, consistent with the downregulation of genes associated with these pathways in SETD1A+/− NPCs and neurons compared to WT cells (Supplementary Fig. 3c). Additionally, we found specific enrichment of SETD1A-bound active promoters for genes regulating DNA replication and mitotic cell cycle in NPCs, and for genes involved in H3K4me pathways in neurons, aligning with the downregulation of these pathways in SETD1A+/− neuronal cells (Supplementary Fig. 3c).

To identify gene modules associated with SETD1A expression in the human developing brain, we performed weighted gene co-expression network analysis (WGCNA) on human prenatal cortical transcriptomes (n = 50) (Supplementary Fig. 7a). Among 63 co-expression modules (Supplementary Fig. 7b), 11 modules showed strong eigengene correlation with SETD1A expression (|Pearson correlation coefficient (r)| > 0.65) (Supplementary Fig. 7c). The genes in the modules positively correlated with SETD1A were associated mainly with chromatin organization and DNA repair pathways. Moreover, overrepresentation analysis revealed a significant enrichment of genes harboring SETD1A-bound active promoters and/or active enhancers among genes in the positively correlated modules. Genes in the negatively correlated modules, regulating synaptic signaling, mitochondrial function, and translation pathways, were not enriched for genes whose active gene regulatory regions are bound by SETD1A (Supplementary Fig. 7d). These findings suggest that: (i) genes in the positively correlated modules are direct targets of SETD1A in the prenatal cortex, and (ii) SETD1A LoF leads to downregulation of the genes regulating chromatin modification and DNA repair. Network analysis also validated the strong positive association of SETD1A with SETD1A-bound genes in the “lightyellow” module that includes SETD1A as a hub gene and regulates chromatin organization (Supplementary Fig. 7e).
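
The module-selection step described above (keeping modules whose eigengene correlates with SETD1A expression at |r| > 0.65) can be sketched in plain Python. This is an illustration under stated assumptions rather than the authors' code: module eigengenes are assumed to have been computed already (for example by WGCNA), and the function names and input layout are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


def select_modules(
    eigengenes: Dict[str, Sequence[float]],  # module name -> eigengene per sample
    setd1a_expression: Sequence[float],      # SETD1A expression per sample
    threshold: float = 0.65,                 # |r| cutoff quoted in the text
) -> Dict[str, Tuple[str, float]]:
    """Return modules with |r| above the threshold and the sign of the correlation."""
    selected = {}
    for name, eigengene in eigengenes.items():
        r = pearson_r(eigengene, setd1a_expression)
        if abs(r) > threshold:
            selected[name] = ("positive" if r > 0 else "negative", r)
    return selected
```

Splitting the result by sign mirrors the distinction drawn in the text between positively and negatively correlated modules.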

Overall, these data suggest that SETD1A targets gene regulatory regions of actively transcribed genes, preferentially at promoter regions, and regulates chromatin remodeling and DNA repair pathways in the developing human cortex.

SETD1A targets common SCZ risk genes that regulate chromatin remodeling, DNA repair, and synaptic signaling pathways in neuronal cells during human cortical development

To investigate how SETD1A contributes to SCZ risk, we examined the overlap between SETD1A-bound genomic regions and SCZ-associated loci and genes. SETD1A binding sites in prenatal cortical tissue, as well as in 2D NPCs and neurons (WT and SETD1Ac.4582-2delAG/+), showed significant enrichment at 287 SCZ GWAS loci5 (Fig. 3d). This overlap was more pronounced in neurons than in NPCs (Fisher’s exact test, OR = 1.21; p = 2.44 × 10⁻⁵ in WT; OR = 1.24; p = 0.0003 in SETD1Ac.4582-2delAG/+), underscoring SETD1A’s prominent role in regulating SCZ risk loci specifically in neurons.
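
For readers unfamiliar with the statistic quoted above, the sketch below shows how an odds ratio and a two-sided Fisher's exact p-value are obtained from a 2 × 2 table. It is a generic illustration, not the authors' analysis: the table layout (cell type by GWAS-locus overlap) is an assumption, and the odds ratio computed here is the sample odds ratio, which can differ slightly from the conditional estimate some statistical packages report.

```python
from math import comb
from typing import Tuple


def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Assumed layout (hypothetical): rows = cell type (neuron, NPC),
    columns = peak overlaps a GWAS locus (yes, no).
    Returns (sample odds ratio, two-sided p-value).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum all tables at least as extreme (probability <= observed),
    # with a small relative tolerance for floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7)
    )
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)
```

With the textbook table `[[3, 1], [1, 3]]`, the function returns an odds ratio of 9.0 and a p-value of 34/70.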

We also observed that active promoters and enhancers in the prenatal cortex globally overlapped with SCZ GWAS loci5 (Supplementary Fig. 8a), consistent with the enrichment of SCZ risk variants in non-coding regulatory regions46. Notably, SETD1A-bound active promoters and enhancers exhibited greater enrichment for GWAS loci than unbound regions (OR = 1.50; p = 1.65 × 10⁻⁵ for promoters; OR = 1.39; p = 2.54 × 10⁻³ for enhancers) (Supplementary Fig. 8a), suggesting that SETD1A preferentially targets regulatory elements harboring common SCZ risk variants.

To test whether SETD1A-bound regions contribute to the heritability of neuropsychiatric disorders, we performed linkage disequilibrium score regression (LDSC)47. SETD1A-bound loci in the prenatal cortex and WT neuronal cells were significantly enriched for heritability of SCZ5, bipolar disorder (BD)48, major depressive disorder (MDD)49, ASD50, and attention-deficit/hyperactivity disorder (ADHD)51, but not for neurodegenerative disorders [e.g., Alzheimer’s disease (AD)52, Parkinson’s disease (PD)53], or non-brain traits such as stroke54 and type 2 diabetes (T2DM)55 (Fig. 3e and Supplementary Data 9, 10). Active promoters and enhancers in the prenatal cortex and 2D neuronal cells were likewise enriched for SCZ heritability (Supplementary Fig. 8b), aligning with prior reports that SCZ risk genes are more active in fetal than postnatal life56,57.

Among active promoters, SETD1A-bound regions in the prenatal cortex and WT neuronal cells were enriched for SCZ, BD, MDD, and ASD heritability, whereas unbound regions lacked such enrichment (for all four disorders in neuronal cells, and particularly for BD in the prenatal cortex) (Fig. 3e). Interestingly, only SETD1A-unbound enhancers showed enrichment for ADHD heritability (Supplementary Fig. 8c), suggesting that SETD1A-bound enhancer regions are not major contributors to ADHD genetic risk.

SETD1A binding was also enriched at active promoters of eight out of ten genes carrying rare coding variants linked to high SCZ risk in the SCHEMA study8 (CACNA1G, CUL1, HERC1, RB1CC1, SETD1A, SP4, TRIO, and XPO7), emphasizing its relevance to both common and rare genetic risk.

In contrast, in SETD1Ac.4582-2delAG/+ NPCs and neurons, while active promoters remained enriched for SCZ, BD, and ASD heritability (Supplementary Fig. 8b and Supplementary Data 10), SETD1A binding itself no longer specifically contributed to this enrichment (Fig. 3e), indicating that the patient-specific c.4582-2delAG mutation disrupts SETD1A’s binding to disease-associated loci.

To explore the biological processes involved, we examined SETD1A-bound genes among the 682 SCZ GWAS-significant genes5. We identified 165 such genes in the prenatal cortex and 199 in both WT NPCs and neurons (Supplementary Data 11). These genes were enriched in pathways regulating DNA damage response and repair, chromatin remodeling, synaptic signaling, cell cycle, inflammation, and telomere maintenance (Fig. 3f and Supplementary Fig. 8d). Examples include KAT5, encoding TIP60, a histone acetyltransferase involved in chromatin remodeling and DNA repair58; DOT1L, encoding a methyltransferase critical for H3K79 methylation and genome integrity59; KCNB1, a voltage-gated potassium channel gene; and LRRC4, encoding a synaptic adhesion molecule that regulates excitatory synapse development60.

Together, these findings demonstrate that SETD1A targets a subset of SCZ-associated genes involved in key neurodevelopmental and epigenetic processes. Loss of SETD1A binding in SETD1Ac.4582-2delAG/+ neurons disrupts its association with these loci, suggesting that SETD1A LoF and common SCZ variants converge on a shared molecular network crucial for early brain development and psychiatric disease risk.

Patient-specific SETD1A c.4582-2delAG mutation affects SETD1A-mediated regulation of common SCZ risk genes involved in DNA damage response, chromatin remodeling, and synaptic signaling

To investigate how the SETD1Ac.4582-2delAG/+ LoF mutation disrupts the regulation of SCZ-associated pathways, we compared SETD1A binding and H3K4me3 levels at active promoter regions between mutants and WT in 2D NPCs and neurons. SETD1Ac.4582-2delAG/+ lines exhibited a global reduction in SETD1A peaks, including at active promoters (Fig. 3g and Supplementary Fig. 9a). In NPCs, both SETD1A and H3K4me3 binding densities were significantly reduced genome-wide in SETD1Ac.4582-2delAG/+ versus WT (Supplementary Fig. 9b), consistent with lower H3K4me3 levels in SETD1Ac.4582-2delAG/+ NPCs (Fig. 1h). In contrast, SETD1A and H3K4me3 binding densities in neurons were not diminished but instead showed a small and significant increase in SETD1Ac.4582-2delAG/+ cells compared to WT (Supplementary Fig. 9b), suggesting a cell-type-specific compensatory response.

We confirmed this by examining active promoter regions that gained or lost SETD1A peaks in SETD1Ac.4582-2delAG/+ cells relative to WT cells, which revealed cell-type-specific alterations in SETD1A binding. We classified SETD1A binding sites at active promoters into three groups: those lost in SETD1Ac.4582-2delAG/+ cells compared to WT, those gained, and those shared between the two genotypes (Fig. 3h, i). Even though some low-signal peaks might individually fall below detection thresholds, analyzing these loci as a group revealed a significant alteration of SETD1A binding overall. As expected, regions lost in SETD1Ac.4582-2delAG/+ cells showed significantly less SETD1A binding, while gained regions showed significantly more binding compared to WT (Fig. 3j). In NPCs, both loss and gain of SETD1A binding at active promoters in SETD1Ac.4582-2delAG/+ were associated with reduced H3K4me3. Unexpectedly, the opposite pattern was observed in neurons: both loss and gain of SETD1A binding at active promoters in SETD1Ac.4582-2delAG/+ neurons were associated with increased H3K4me3 (Fig. 3j). This underscores the distinct epigenetic consequences of the SETD1A mutation in NPCs versus neurons.
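The three-way grouping described above can be sketched as a set comparison. This is a minimal illustration, assuming each active promoter has already been reduced to an identifier with a called SETD1A peak in one or both genotypes; the function name and the placeholder identifiers are hypothetical, and the sketch does not reproduce the coordinate-based peak comparison used in the study.

```python
def classify_promoters(wt_bound, mut_bound):
    """Split promoter IDs into lost, gained, and shared SETD1A-binding groups.

    wt_bound / mut_bound: iterables of promoter IDs with a called SETD1A
    peak in WT or mutant cells (hypothetical inputs for illustration).
    """
    wt, mut = set(wt_bound), set(mut_bound)
    return {
        "lost": wt - mut,     # peak called in WT only
        "gained": mut - wt,   # peak called in mutant only
        "shared": wt & mut,   # peak called in both genotypes
    }

# Placeholder promoter IDs, not genes from the study.
groups = classify_promoters(
    wt_bound=["promA", "promB", "promC"],
    mut_bound=["promB", "promD"],
)
for name, ids in groups.items():
    print(name, sorted(ids))
```

Grouping by presence or absence of a called peak is what makes the follow-up signal comparison informative: as noted above, shared sites can still differ in binding density between genotypes.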

GO analysis of genes with promoter regions that lost SETD1A binding in SETD1Ac.4582-2delAG/+ (3,771 genes in NPCs; 4,357 in neurons) revealed strong enrichment for pathways regulating DNA repair, cell cycle, chromatin remodeling, and synaptic signaling (Fig. 3k). Many of these pathways were downregulated in SETD1A+/− neuronal cells (Supplementary Fig. 3c), consistent with their dependence on SETD1A function. In neurons, regions that lost SETD1A binding in SETD1Ac.4582-2delAG/+ were also enriched for telomere maintenance (Fig. 3k). Active promoter regions bound by SETD1A in both SETD1Ac.4582-2delAG/+ and WT cells (shared regions) also mapped to similar pathways (Supplementary Fig. 9c), yet SETD1A and H3K4me3 levels at these sites were significantly reduced in SETD1Ac.4582-2delAG/+ NPCs compared to WT NPCs (Wilcoxon signed-rank test, p < 0.0001), indicating reduced regulatory activity despite peak retention. No specific biological pathways were enriched in either NPCs or neurons for genes whose promoters gained SETD1A binding in SETD1Ac.4582-2delAG/+.

While SETD1A’s catalytic function regulates gene expression via H3K4 methylation, non-catalytic roles have also been described. A previous study showed that SETD1A interacts with cyclin K to regulate DNA repair gene expression during S-phase independently of H3K4me361. We tested this by assessing the correlation between changes in H3K4me3 and gene expression for DNA repair genes whose active promoter regions lost SETD1A binding in SETD1Ac.4582-2delAG/+ neurons. No significant correlation was found (Supplementary Fig. 9d, e), suggesting that SETD1A may regulate these genes via non-catalytic mechanisms.

We identified 116 SCZ GWAS-significant genes5 with active promoter regions where SETD1A binding was lost in SETD1Ac.4582-2delAG/+ neurons (Fig. 3h and Supplementary Data 11). These genes showed significantly reduced expression in patient mutant neurons (Fig. 3l), and were enriched for biological processes including histone modification, chromatin remodeling, DNA damage response, and synaptic signaling and vesicle cycling (Fig. 3k). These data support the idea that SCZ risk—whether from common variants or SETD1A haploinsufficiency—converges on shared pathways involving chromatin and synaptic regulation.

To assess the persistence of these regulatory relationships in the mature brain, we analyzed adult human DLPFC transcriptomes. Genes whose expression positively correlated with SETD1A in adult DLPFC were also enriched for chromatin remodeling, histone modification, DNA repair, and synaptic pathways, including cognition, memory, and social behavior (Supplementary Fig. 10 and Supplementary Data 12). This suggests that SETD1A continues to regulate these core neurobiological functions into adulthood, and that disruption of these processes underlies broader SCZ pathophysiology.

To test whether the loss of H3K4me3 contributes to impaired NPC proliferation, we treated mutant NPCs with KDM5 inhibitors (KDOAM-25 and JQKD82), which block demethylation of H3K4me362,63. Following 48-hour treatment, we observed a significant rescue of proliferation in mutant NPCs to levels comparable with WT cells (Supplementary Fig. 9f–k). These findings suggest that failure to maintain H3K4me3 at promoters of SETD1A target genes underlies the NPC proliferation deficit in SETD1A+/− lines.

SETD1A functions at DNA damage-prone sites and DNA repair hotspots in neuronal cells

Our findings thus far support a role for SETD1A in safeguarding genomic stability in NPCs and neurons. DNA damage can lead to somatic mutations, especially in RDCs that overlap with CFSs and mosaic CNVs23,27. In mature neurons, thousands of DRHs occur in open chromatin30,31. We first assessed whether active gene regulatory regions in the prenatal cortex co-localize with these DNA damage-prone regions. Active promoters and enhancers in prenatal cortex significantly overlapped with human CFSs64 and DRHs identified by SAR-seq30 and Repair-seq31, as well as with their shared intersection (Supplementary Fig. 11a). In agreement with previous reports30, SAR-seq DRHs showed stronger enrichment at enhancers than at promoters. These regions also overlapped with somatic SNVs found in the DLPFC of individuals with SCZ, but not with SNVs from neurotypical controls29 (Supplementary Fig. 11a), suggesting that gene regulatory elements active during fetal development are vulnerable to DNA damage in SCZ and may require tightly regulated repair mechanisms.

A recent study reported that SCZ-associated somatic SNVs are increased in frequency and enriched at fetal transcription factor binding sites, suggesting early progenitor vulnerability29. We therefore examined whether SETD1A binds at these DNA damage-prone regions. Indeed, SETD1A-bound sites in the prenatal cortex and WT NPCs significantly overlapped with hCFSs64 and somatic SNVs from SCZ cases29 (Fig. 4a, b), indicating a potential protective role for SETD1A at fragile sites during early cortical development. This overlap was markedly reduced in SETD1Ac.4582-2delAG/+ NPCs (Fig. 4a, b), suggesting that the patient mutation impairs SETD1A’s ability to associate with and possibly stabilize these fragile genomic regions.
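The logic of testing whether peaks overlap fragile sites more often than expected by chance can be sketched with a simple permutation scheme. This is only an illustration under stated assumptions: a single toy chromosome, uniform random relocation of peaks with their widths preserved, and a one-sided empirical p-value. The permutation design actually used for Fig. 4a, b is not described in this section, and all coordinates below are invented.

```python
import random

def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]

def count_overlapping(peaks, sites):
    """Number of peaks that intersect at least one site."""
    return sum(any(overlaps(p, s) for s in sites) for p in peaks)

def permutation_pvalue(peaks, sites, genome_len, n_perm=1000, seed=0):
    """One-sided empirical p-value for enrichment of peak/site overlap.

    Each permutation moves every peak to a uniformly random position
    (keeping its width) on a toy chromosome of length genome_len.
    """
    rng = random.Random(seed)
    observed = count_overlapping(peaks, sites)
    hits = 0
    for _ in range(n_perm):
        shuffled = []
        for start, end in peaks:
            width = end - start
            new_start = rng.randrange(0, genome_len - width + 1)
            shuffled.append((new_start, new_start + width))
        if count_overlapping(shuffled, sites) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Invented coordinates: three of four peaks fall inside two "fragile sites".
sites = [(1_000, 2_000), (50_000, 51_000)]
peaks = [(1_100, 1_300), (1_500, 1_700), (50_200, 50_400), (80_000, 80_200)]
observed, p = permutation_pvalue(peaks, sites, genome_len=100_000)
print(observed, p)
```

A genome-scale analysis would normally restrict shuffling to mappable regions and match peak numbers per chromosome; those refinements are omitted here for brevity.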

Fig. 4: SETD1A functions at fragile genomic sites in NPCs and postmitotic neurons.

a Overrepresentation of SETD1A peak loci from prenatal cortical tissue and isogenic iPSC-derived NPCs at human common fragile sites (hCFSs)64. Permutation test, ***p = 0.0009; ****p < 0.0001; n.s. not significant. b Overrepresentation of SETD1A peak loci from prenatal cortical tissue and isogenic iPSC-derived NPCs within ±75 bp of somatic single nucleotide variants (sSNVs) identified in postmortem brain tissue from control and SCZ individuals29. Permutation test, **p = 0.0080; ****p < 0.0001; n.s. not significant. c Overrepresentation of SETD1A peak loci from prenatal cortical tissue and isogenic iPSC-derived neurons at DNA repair hotspots (DRHs) identified by both SAR-seq30 and Repair-seq31. Permutation test, ****p < 0.0001. d Quantification of DNA damage in six-week-old neurons comparing WT and SETD1Ac.4582-2delAG/+ lines with or without KDM5 inhibitor treatment. SETD1Ac.4582-2delAG/+ lines were treated with vehicle (mock), 50 μM KDOAM-25, or 0.4 μM JQKD82 for 48 h. Each data point indicates result from independent batch (n = 3 batches; two WT lines (n = 1229 cells); two SETD1Ac.4582-2delAG/+ lines (n = 825 cells for mock; n = 546 cells for KDOAM-25; n = 588 cells for JQKD82)). One-way ANOVA followed by Tukey’s multiple comparisons test, ****p < 0.0001; n.s. not significant. e Comparison of gene length on SCZ GWAS loci5, stratified by fold change (FC) of H3K4me3 density (SETD1Ac.4582-2delAG/+ /WT). Analysis includes 37 genes on 52 loci with |log2FC| > 0.415 and 1574 genes on 235 loci with |log2FC| < 0.415. Kolmogorov–Smirnov test, ****p < 0.0001. Red line indicates 100 kb. f SynGO analysis results of genes on SCZ GWAS loci, stratified by FC of H3K4me3 density (SETD1Ac.4582-2delAG/+ /WT). *FDR < 0.05; **FDR < 0.01; ***FDR < 0.001. Data presented as mean ± S.D. (d). Box plots show median (middle line), 25th and 75th percentiles (box), and minimum and maximum values (whiskers) (e). All tests two-sided.

We next tested whether SETD1A binds neuronal DRHs in postmitotic neurons. SETD1A-bound regions in prenatal cortex and 2D neurons significantly overlapped with DRHs identified via SAR-seq30 and Repair-seq31, both individually and in combination (Fig. 4c and Supplementary Fig. 11b, c), independent of genotype. The magnitude of overlap was reduced in SETD1Ac.4582-2delAG/+ neurons compared to WT, indicating that SETD1A LoF may compromise its recruitment to neuronal repair hotspots, potentially impairing DNA repair efficiency.

Given this, we hypothesized that impaired SETD1A binding at DRHs may result in reduced local H3K4me3 deposition. Coupled with decreased expression of DNA repair genes in SETD1A+/− neurons (Supplementary Figs. 3c and 4g), this could lead to increased DNA damage accumulation (Fig. 2i–k). To test whether stabilizing H3K4me3 could rescue this phenotype, we treated neurons with KDM5 inhibitors (KDOAM-25 or JQKD82) for 48 h. Both compounds significantly increased H3K4me3 levels in WT and SETD1Ac.4582-2delAG/+ neurons (Supplementary Fig. 11d, e), and no genotype-specific differences in H3K4me3 stabilization were observed. Moreover, consistent with our findings (Supplementary Fig. 9b), baseline H3K4me3 levels in neurons were not significantly different between WT and SETD1Ac.4582-2delAG/+ cells (Supplementary Fig. 11e), underscoring the importance of locus-specific regulation rather than global changes.

Importantly, H3K4me3 stabilization significantly reduced DNA damage in SETD1Ac.4582-2delAG/+ neurons. Following treatment, γH2AX-positive neurons in SETD1Ac.4582-2delAG/+ were significantly reduced to levels comparable to WT (Fig. 4d), suggesting that pharmacological stabilization of H3K4me3 can ameliorate DNA damage accumulation in the context of SETD1A LoF.

In summary, our results indicate that SETD1A binds to DNA damage-prone regions in NPCs and postmitotic neurons, including CFSs and DRHs, and plays a role in promoting DNA repair through both transcriptional regulation and H3K4me3 deposition. SETD1A haploinsufficiency likely leads to increased genomic instability across development, with implications for neuronal function and disease risk in SCZ.

Patient-specific SETD1A c.4582-2delAG mutation disrupts H3K4me3 regulation at long SCZ risk-associated genes

We next examined whether SETD1A c.4582-2delAG mutation affects H3K4me3 levels at common SCZ risk loci. Although overall H3K4me3 levels at the 287 SCZ GWAS loci5 were not significantly different between WT and mutant neurons (Supplementary Fig. 12a), we identified a subset of 52 loci showing pronounced changes in H3K4me3 density (|log₂FC| > 0.415) in patient mutants—hereafter referred to as the “most affected loci” (Supplementary Fig. 12b). Strikingly, genes located at these most affected loci were significantly longer than those at minimally affected loci (Fig. 4e and Supplementary Data 13), implicating gene length as a potential vulnerability factor. These long genes included several SCZ-associated loci such as AKAP6, DCC, GRIN2A, SGCD, and PTPRD. Many of these genes exceed 100 kb in length (20/37), aligning with prior reports that long neural genes are particularly susceptible to DSBs23.
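To make the |log₂FC| > 0.415 cutoff concrete, the following sketch converts it back to a linear fold change and applies it to two hypothetical loci. The function name and the density values are illustrative assumptions; only the threshold itself comes from the text.

```python
import math

THRESHOLD = 0.415  # |log2 fold change| cutoff quoted in the text

def is_most_affected(h3k4me3_mut, h3k4me3_wt, threshold=THRESHOLD):
    """True if the mutant/WT H3K4me3 density ratio exceeds the cutoff in either direction."""
    log2fc = math.log2(h3k4me3_mut / h3k4me3_wt)
    return abs(log2fc) > threshold

# Linear fold changes at the cutoff: roughly a 4/3-fold gain or a 3/4-fold loss.
print(round(2 ** THRESHOLD, 3))
print(round(2 ** -THRESHOLD, 3))

# Hypothetical densities (arbitrary units), not values from the study.
print(is_most_affected(1.40, 1.00))  # 40% gain in the mutant
print(is_most_affected(0.90, 1.00))  # 10% loss in the mutant
```

Because the cutoff is symmetric on the log scale, a locus qualifies as most affected when its H3K4me3 density rises by about one third or falls by about one quarter relative to WT.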

GO analysis of the most affected loci revealed significant enrichment for presynaptic functions, including regulation of synaptic transmission and membrane potential (Fig. 4f). In contrast, genes from both affected and unaffected loci were enriched for postsynaptic processes, suggesting a selective disruption of presynaptic regulatory programs in neuronal cells carrying SETD1A LoF mutations.

Together, these findings indicate that SETD1A-mediated H3K4me3 is especially critical for epigenetic stability at long, fragile genes involved in synaptic function. The disruption of H3K4me3 at these loci in SETD1Ac.4582-2delAG/+ may render these genes more susceptible to transcriptional dysregulation and genomic instability, further contributing to the convergence between SETD1A haploinsufficiency and SCZ genetic risk.

SETD1A downregulation is associated with SCZ pathogenesis in the broader patient population without rare LoF mutations

Finally, we asked whether the altered SETD1A activity might contribute to SCZ pathogenesis beyond rare LoF mutations. Using large-scale postmortem cortical transcriptomic datasets from PsychENCODE (PEC)65 and BrainSeq66, we evaluated SETD1A expression and its downstream targets in adult individuals with SCZ who do not carry SETD1A LoF variants.

In the PEC dataset, SETD1A was modestly but significantly downregulated in SCZ cortex compared to neurotypical controls65 (Fig. 5a), implicating dysregulation of SETD1A in the broader adult patient population. Among DEGs in SCZ cortex, only downregulated genes, not upregulated genes, were significantly enriched for SCZ GWAS loci5 (Fig. 5b). Notably, SETD1A target genes were significantly overrepresented among these downregulated SCZ GWAS genes (37/68) compared to the upregulated ones (19/53) (Fig. 5c), indicating that SETD1A targets are preserved between prenatal and adult patient brains. These SETD1A target genes were also specifically enriched for synaptic function (Fig. 5d), a pattern not observed among the non-SETD1A targets (Supplementary Fig. 13a), suggesting that SETD1A downregulation in SCZ may impact synaptic gene networks associated with genetic risk.
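The comparison of 37/68 versus 19/53 can be laid out as a 2 × 2 table. The sketch below computes the sample odds ratio and a two-sided Fisher's exact p-value using only the Python standard library. The function names are illustrative, and the p-value definition used here (summing all tables no more probable than the observed one) is an assumption, so the result need not match the published value exactly.

```python
from math import comb

def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value with fixed margins: sum of the
    probabilities of all tables no more probable than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Counts quoted in the text: SETD1A targets among downregulated vs.
# upregulated SCZ GWAS DEGs in the PEC dataset.
down_target, down_total = 37, 68
up_target, up_total = 19, 53
table = (down_target, down_total - down_target, up_target, up_total - up_target)

print(round(odds_ratio(*table), 2))
print(fisher_two_sided(*table))
```

With these counts the odds ratio rounds to 2.14, the value given in the legend of Fig. 5c.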

Fig. 5: Downregulation of SETD1A target SCZ GWAS significant synaptic genes contributes to common variant SCZ pathogenesis.

a Schematic summary of psychENCODE (PEC) postmortem brain dataset65 and SETD1A expression patterns in control (CTRL) versus SCZ cohorts. Statistical information from original PEC study65. b Overrepresentation of DEGs (FDR < 0.05) between SCZ and CTRL individuals from PEC study among SETD1A target genes at active promoter regions in WT neurons and SCZ GWAS significant genes. Fisher’s exact test, **p = 0.0086; ****p < 0.0001; n.s. not significant. No specific enrichment observed between upregulated and downregulated genes for both sets (Fisher’s exact test, n.s.). c Percentage of SETD1A target genes identified in WT neurons among SCZ GWAS significant genes5 differentially expressed in SCZ versus neurotypical individuals in PEC dataset. Fisher’s exact test, OR = 2.14; *p = 0.0320. Number of SETD1A target genes among SCZ GWAS significant DEGs indicated. d SynGO analysis (GO:BP) of 37 SETD1A target SCZ GWAS significant genes downregulated in SCZ versus CTRL individuals. e Schematic strategy for stratification of CTRL and SCZ cohorts from PEC65 and BrainSeq66 postmortem brain datasets based on SETD1A expression level. f Overrepresentation of DEGs (FDR < 0.05) between SCZ-low and CTRL-high groups in PEC and BrainSeq datasets with SETD1A target genes in WT neurons and SCZ GWAS significant genes. Fisher’s exact test, ****p < 0.0001; n.s. not significant. Specific enrichments of downregulated genes observed: Fisher’s exact test, *p = 0.0158; ****p < 0.0001. g Percentage of SETD1A target genes in WT neurons among SCZ GWAS significant genes differentially expressed in SCZ-low versus CTRL-high groups in PEC and BrainSeq datasets. Fisher’s exact test, PEC: OR = 1.54; *p = 0.0371, BrainSeq: OR = 2.67; ***p = 0.0004. Numbers of SETD1A target genes among SCZ GWAS significant DEGs indicated. h SynGO analysis (GO:BP) of SETD1A target SCZ GWAS-significant genes downregulated in SCZ-low versus CTRL-high groups in PEC and BrainSeq datasets. 
Gene symbols annotated for enriched GO terms indicated. i Graphical summary of SETD1A functions in neurons contributing to SCZ pathogenesis when dysfunctional. Box plots show median (middle line), 25th and 75th percentiles (box), and ±1.5 × IQR (whiskers) (a). All tests two-sided. Schematics created in BioRender. Sawada, T. (2025) https://BioRender.com/m17s1as; https://BioRender.com/0ek3ajj.

To test whether low SETD1A expression drives this effect, we stratified both control and SCZ samples from PEC65 and BrainSeq66 into high, middle, and low SETD1A expression groups. We then compared gene expression between high-SETD1A controls (CTRL-high) and low-SETD1A SCZ (SCZ-low) samples (Fig. 5e; Supplementary Fig. 13b; and Supplementary Data 14, 15). Although the smaller sample size in BrainSeq66 yielded fewer significant DEGs overall (245 vs. 4,821 in PEC65), similar patterns emerged. In both datasets, downregulated genes in the SCZ-low group were significantly enriched for SETD1A targets and SCZ GWAS genes, whereas upregulated genes were not (Fig. 5f). Among SCZ GWAS-significant DEGs, SETD1A target genes were consistently overrepresented among downregulated genes (PEC65: 96/197; BrainSeq66: 91/146) compared to upregulated ones (PEC65: 52/136; BrainSeq66: 31/81) (Fig. 5g).
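The stratification step can be sketched as a rank-based split. This is a minimal illustration, assuming that samples are ranked by SETD1A expression separately within each diagnosis group and divided into three near-equal groups; the exact cut-offs used in the study are not stated in this section, and the sample IDs and expression values are invented.

```python
def stratify_by_expression(samples):
    """Split samples into low / middle / high groups of near-equal size.

    samples: dict mapping sample ID -> SETD1A expression (any consistent unit).
    Equal-sized tertiles are an assumption made for this sketch.
    """
    ordered = sorted(samples, key=samples.get)
    n = len(ordered)
    cut1, cut2 = n // 3, (2 * n) // 3
    return {
        "low": ordered[:cut1],
        "middle": ordered[cut1:cut2],
        "high": ordered[cut2:],
    }

# Hypothetical expression values, not data from PEC or BrainSeq.
ctrl = {"C1": 5.1, "C2": 6.3, "C3": 4.8, "C4": 5.9, "C5": 6.0, "C6": 5.5}
scz = {"S1": 4.2, "S2": 5.0, "S3": 5.7, "S4": 4.6, "S5": 5.3, "S6": 4.9}

ctrl_high = stratify_by_expression(ctrl)["high"]
scz_low = stratify_by_expression(scz)["low"]
print(ctrl_high, scz_low)
```

Contrasting only the two extreme groups (CTRL-high versus SCZ-low) maximizes the difference in SETD1A expression between the compared samples, at the cost of a smaller sample size.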

GO analysis revealed that SETD1A target genes downregulated in the SCZ-low group were significantly enriched for DNA damage response, chromatin remodeling, and telomere maintenance—processes not enriched among non-SETD1A targets (Supplementary Fig. 13c). Both gene sets were enriched for synaptic functions (Fig. 5h and Supplementary Fig. 13d), but only the SETD1A targets showed consistent overlap in synaptic terms across datasets and stratification conditions (Fig. 5d,h). Shared synaptic SETD1A target genes included ASIC1, DGKZ, IGSF9B, LRRC4, and ZDHHC5, which were consistently downregulated in the SCZ cortex in association with low SETD1A expression.

Taken together, these findings demonstrate that reduced SETD1A activity contributes to SCZ pathogenesis in a broader subset of patients. Even in the absence of rare LoF variants, lower SETD1A expression—potentially driven by common genetic variation or environmental factors—results in downregulation of SETD1A target genes at SCZ risk loci. These genes are enriched for synaptic and genomic stability functions, suggesting that disruption of SETD1A-dependent regulation may represent a convergent mechanism in SCZ pathogenesis (Fig. 5i).

Discussion

In this study, we integrated isogenic neuronal models carrying a patient-specific SETD1A c.4582-2delAG mutation, epigenomic profiling of human prenatal brain and neuronal cells, and statistical genetic analyses to dissect SETD1A’s role in SCZ pathogenesis. We confirmed that the c.4582-2delAG mutation causes a true LoF, resulting in transcript degradation and reduced SETD1A protein. Our findings demonstrate that SETD1A directly regulates SCZ GWAS-linked genes involved in synaptic signaling and DNA repair, providing a convergent molecular mechanism across rare and common SCZ risk.

SETD1A preferentially binds active promoters at SCZ risk loci and maintains genomic stability in neural cells. Notably, individuals without rare SETD1A mutations show reduced SETD1A expression in the cortex65 and serum67. This downregulation correlates with reduced expression of synaptic SCZ risk genes, including ASIC1, DGKZ, and LRRC4, as well as 8 of 10 SCHEMA high-confidence genes8. These data support a model in which SETD1A acts as a central regulator of a polygenic SCZ risk network, and suggest that restoring SETD1A function or its related H3K4 methylation levels could be therapeutically beneficial.

One strategy may involve modulating H3K4 methylation dynamics through inhibition of histone demethylases such as KDM5 and KDM1A (LSD1)16. KDM5 inhibition rescued DNA damage phenotypes in SETD1A mutant neurons, and preclinical studies show that LSD1 inhibition improves cognition and synaptic plasticity. Together with findings from Setd1a+/− mouse models16,18 and human SETD1A LoF variant carriers10,11, our results suggest that targeting H3K4 methylation may offer a promising avenue to treat cognitive and negative symptoms in SCZ.

Despite these insights, our study has limitations. While we used γH2AX staining to detect increased DNA damage in SETD1Ac.4582-2delAG/+ neurons, additional markers such as 53BP1 would provide stronger validation. For our CUT&Tag experiments, although we validated antibody specificity by western blot and observed reduced peak numbers in isogenic SETD1A c.4582-2delAG/+ haploinsufficient lines (Fig. 3g and Supplementary Fig. 9a), we did not perform antibody validation using knockout or knockdown cells in the same assay, limiting our ability to completely exclude potential cross-reactivity. Moreover, differences between our CUT&Tag findings and the ChIP-seq findings of Mukai et al.16 may reflect antibody specificity, species differences (mouse vs. human), and methodological differences, as CUT&Tag is known to preferentially enrich for promoter regions due to accessibility biases, whereas ChIP may capture broader regulatory landscapes including enhancers.

Our data show that SETD1A predominantly binds promoters in human prenatal cortex and iPSC-derived NPCs and neurons. This contrasts with initial mouse studies suggesting enhancer targeting but is supported by recent reports of promoter enrichment in hippocampal and PFC neurons18,19,68. While methodological and species differences may contribute, promoter-focused binding appears conserved.

The patient-specific SETD1A c.4582-2delAG mutation reduced genome-wide H3K4me3 in NPCs but not in neurons, with no significant reduction at promoters of downregulated genes. This may reflect cell type heterogeneity, as prior work found H3K4me3 loss restricted to Foxp2+ neurons19. Future work using cell-type-resolved profiling will clarify these effects.

SETD1A haploinsufficiency also altered neurodevelopmental trajectories, increasing excitability and impairing GABAergic differentiation, phenotypes that converge with previous findings in patient-derived neurons33,35 and Setd1a+/− mice16,18. Discrepancies across models, such as the increased dendritic complexity reported in human iPSC-derived neurons carrying an exon 7 mutation not found in patients35 (unlike our results) or findings from other mouse studies16,17,18,19, may reflect mutation-specific or technical factors and underscore the need to study diverse SETD1A variants.

In line with previous mouse data15,16, NPCs with SETD1A LoF showed reduced proliferation. Because microcephaly has not been reported in humans with SETD1A mutations11, the developmental impact of this proliferation defect remains unclear. Notably, lymphoblastoid cells from patients with SETD1A variants show impaired DNA repair11, and mouse transcriptomic studies reveal consistent downregulation of DNA repair genes (Supplementary Data 16)16,19,69, suggesting that genomic instability is a conserved phenotype.

We show that SETD1A binds fragile genomic regions in both NPCs and neurons—including sites enriched for SCZ-associated somatic SNVs29 and neuronal DRHs30,31—and that KDM5 inhibition mitigates DNA damage, likely by restoring H3K4me3 locally. Future studies are needed to define SETD1A’s role in regulating DNA repair genes and its impact on gene expression and neuronal function.

Genomic instability is a hallmark of aging70 and is implicated in SCZ, which is associated with grey matter loss and accelerated brain aging71,72,73,74,75,76. SETD1A haploinsufficient mutant neurons exhibit transcriptional signatures of aging, including reduced synaptic and telomere-related gene expression and increased inflammation. These features suggest that SETD1A deficiency may promote an accelerated aging-like state in the brain.

Finally, SETD1A binds regulatory regions that overlap with SCZ-enriched somatic SNVs arising in fetal progenitors29. Since SCZ risk genes are more highly expressed during fetal than postnatal life56,57, our data support a model in which SETD1A haploinsufficiency causes transcriptional dysregulation and developmental genomic instability that propagates through neural lineages, leading to cumulative dysfunction over time. These findings bridge developmental and aging-related mechanisms and highlight SETD1A as a key regulator of brain health across the lifespan.

Methods

Cell lines and tissue samples

Human iPSC lines from two neurotypical individuals (LIBD2c1 and LIBD7c6) were established by reprogramming postmortem dura-derived fibroblasts32. Fibroblast collection and use for research were approved by the Western Institutional Review Board (#WIRB 20111080).

Postmortem adult human DLPFC was obtained by autopsy from a 27-year-old Hispanic male with informed consent from the legal next of kin (under Protocol No. 12-24 from the State of Maryland Department of Health and Mental Hygiene)66,77. Clinical characterization, diagnoses, and macro- and microscopic neuropathological examinations were performed on all samples using a standardized paradigm, and this subject was free of any evidence of macro- or microscopic neuropathology, as well as any neurological or psychiatric diagnoses. Fetal brain tissue samples from three European males in the second trimester were obtained in conjunction with the Medical University of Sofia, with informed consent of the mother, under a protocol approved by the Research Ethics Committee of the university. Karyotyping or Aneufast QF-PCR (Genomed-Biotech) was performed on all fetal samples to rule out any major chromosomal abnormalities. For the fetal samples, full-thickness cortical samples from the developing frontal neocortex were dissected using a hand-held dental drill, over the dorsal convexity halfway between the frontal and temporal poles.

Generation of isogenic SETD1A LoF mutant iPSCs

iPSC lines were maintained on a tissue culture plate coated with Cultrex Reduced Growth Factor Basement Membrane Matrix (R&D Systems) in StemFlex medium (Thermo Fisher Scientific). To generate isogenic SETD1A mutant lines, an oligonucleotide (sgRNA) targeting the region around the splice acceptor of intron 15 in SETD1A was designed and cloned into the U6-gRNA:CMV-CAS9-2A-tGFP vector by MilliporeSigma. A single-stranded oligonucleotide (ssODN) carrying the SETD1A c.4582-2delAG mutation and a PAM-blocking mutation (Supplementary Fig. 1a and Supplementary Data 2) was also designed by MilliporeSigma. Following a recent study78, iPSCs were treated with 1× RevitaCell Supplement (Thermo Fisher Scientific) for 2 h at 37 °C and dissociated into single cells with 0.75× TrypLE Select Enzyme (Thermo Fisher Scientific) diluted in 0.5 mM EDTA/PBS for 3 min at 37 °C. The single-cell suspension (1 × 10⁶ iPSCs per reaction) was electroporated with 5 μg of U6-gRNA:CMV-CAS9-2A-tGFP plasmid and 100 pmol of ssODN with a 4D Nucleofector Unit X (Lonza) using the P3 Primary Cell 4D-Nucleofector X Kit (Lonza) and programs CA-137 and CM-130. Cells were plated and cultivated on a tissue culture plate coated with recombinant human Laminin-521 (Thermo Fisher Scientific) in StemFlex medium for 4 days. For the first 24 h, the medium was supplemented with 1× RevitaCell. Four days post-electroporation, the cells were dissociated into single cells with 0.75× TrypLE Select Enzyme and plated on a 96-well plate coated with Cultrex by targeting <0.5 cells per well by limiting dilution. Cells were maintained in StemFlex medium supplemented with 1× RevitaCell for the first 24 h. Single iPSC colonies were manually picked and expanded on a tissue culture plate coated with Cultrex in StemFlex medium. DNA was extracted from 148 subclones for LIBD2c1 and 229 subclones for LIBD7c6 and screened for genome editing by SURVEYOR assay with a Surveyor Mutation Detection Kit (IDT) and HinfI (NEB) enzyme digestion.
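The rationale for plating at fewer than 0.5 cells per well can be illustrated with a Poisson model of cell seeding. The sketch below assumes that the number of cells per well follows a Poisson distribution with mean 0.5 (the upper bound stated above); this model is a common simplification and is not described in the protocol itself.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k cells in a well when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def clonal_fraction(lam):
    """Among wells that received at least one cell, the fraction seeded by exactly one."""
    p_empty = poisson_pmf(0, lam)
    p_single = poisson_pmf(1, lam)
    return p_single / (1.0 - p_empty)

lam = 0.5  # assumed mean cells per well (upper bound given in the text)
print(round(poisson_pmf(0, lam), 3))   # expected fraction of empty wells
print(round(clonal_fraction(lam), 3))  # occupied wells that are single-cell derived
```

Under this assumption, most wells are empty and roughly three quarters of occupied wells start from a single cell, which is why colonies are still validated by sequencing after picking.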

We validated the SETD1A mutation in the selected candidate iPSC subclones by TA cloning with the TOPO TA Cloning Kit for Sequencing (Thermo Fisher Scientific) and Sanger sequencing using Cel-1 primers. Potential off-target mutations at the NR4A1 and PTPRN2 loci (Supplementary Data 2) were assessed by Sanger sequencing. Primers used for PCR and Sanger sequencing are listed in Supplementary Data 17. All 18 lines were karyotyped by genome-wide genotyping array using an Infinium Omni2.5-8 Kit (Illumina) by Psomagen. iPSC lines used in each assay are listed in Supplementary Data 3.

Neuronal differentiation

To generate NPCs, iPSCs were treated with 10 μM Y-27632 (BioGems) at 37 °C overnight when they reached 50–60% confluency. Cells were then incubated with StemPro Accutase (Thermo Fisher Scientific) at 37 °C for 7 min and dissociated into single cells. Dissociated iPSCs were plated on 96-well V-bottom plates (S-bio) at 5000 cells/150 μL per well in StemFlex medium to form embryoid bodies (EBs). The medium was supplemented with 30 μM Y-27632 for the first 24 h and was gradually switched to DFN2, consisting of DMEM/F12 + GlutaMAX (Thermo Fisher Scientific), 1× N-2 supplement (Thermo Fisher Scientific), 1× non-essential amino acids (NEAA, Thermo Fisher Scientific), 100 μM 2-mercaptoethanol (2-ME, Thermo Fisher Scientific) and 1× Antibiotic-Antimycotic (Anti-Anti, Thermo Fisher Scientific), supplemented with 10 μM SB 431542 (BioGems) and 100 nM LDN-193189 (BioGems) from day 2. The EBs were fed every other day until day 8. The EBs were transferred onto a 6-well plate (48 EBs/well) coated with Matrigel Basement Membrane Matrix Growth Factor Reduced (Matrigel GFR, Corning) in DFN2B27-VA media consisting of DMEM/F12 + GlutaMAX, 1× NeuroCult SM1 without vitamin A (StemCell Technologies), 1× N-2 supplement and 1× Anti-Anti, supplemented with 2 μg Heparin (MilliporeSigma) and 2 μM Cyclopamine (BioGems) on day 10 and were fed every other day. On day 16, neuronal rosettes were manually picked, dissociated in StemPro Accutase at 37 °C for 15 min, and plated on a 6-well plate coated with 0.001% Poly-L-Ornithine (MilliporeSigma) and 5 μg/mL mouse Laminin (Thermo Fisher Scientific) in NPC medium: DFN2B27-VA supplemented with 20 ng/mL human basic FGF (PeproTech) and 1 μg/mL mouse Laminin. The medium was supplemented with 10 μM Y-27632 for the first 24 h. The cells were fed every other day and passaged using StemPro Accutase on a Matrigel GFR-coated plate. At passage 2–3, NPCs were characterized by immunostaining with antibodies against SOX2, PAX6, and NESTIN.
To measure cell proliferation, NPCs (passage 6–10) were plated on a 12-well plate at 200,000 cells/well in the NPC medium. Seventy-two hours later, the cells were dissociated with StemPro Accutase and counted on Countess II (Thermo Fisher Scientific) using a Trypan Blue solution (Thermo Fisher Scientific). Cell proliferation was also measured by CyQUANT Cell Proliferation Assay (Thermo Fisher Scientific) according to the manufacturer’s instructions. NPCs were plated at 10,000 cells/well in a 96-well plate. The cells were maintained for 72 h and subjected to assay. Cell viability was measured at 24 h post-plating. For KDM5 inhibition, SETD1A hetero mutant NPCs were treated with KDOAM-25 (MilliporeSigma) or JQKD82 (MedKoo) at 24 h post-plating for 48 h.

For differentiation of NPCs into neurons, NPCs were plated on a cell culture plate coated with 0.001% Poly-L-Ornithine and 5 μg/mL mouse Laminin at 20,000 cells/cm2 in the NPC medium. Two days post-plating, medium was switched to neuronal medium consisting of Neurobasal (Thermo Fisher Scientific), 1× GlutaMAX-I supplement (Thermo Fisher Scientific), NeuroCult SM1 (StemCell Technologies) and 1× Anti-Anti, supplemented with 20 ng/mL human BDNF (PeproTech), 20 ng/mL human GDNF (PeproTech), 100 μM dibutyryl-cAMP (BioGems) and 200 μM ascorbic acid (MilliporeSigma). Cells were fed every other day. For KDM5 inhibition, 6-week-old SETD1A patient mutant neurons were treated with 50 μM KDOAM-25 or 0.4 μM JQKD82 for 48 h.

Cerebral organoid (CO) differentiation

CO differentiation was performed as previously described79 with small modifications. In brief, iPSCs were treated with 10 μM Y-27632 at 37 °C overnight when they reached 50–60% confluency. Cells were then incubated with StemPro Accutase at 37 °C for 7 min and dissociated into single cells. Dissociated iPSCs were plated on 96-well V-bottom plates at 9000 cells/150 μL per well in StemFlex medium to form EBs. The medium was supplemented with 30 μM Y-27632 for the first 24 h. Forty-eight hours later, the medium was switched to DFN2 supplemented with 10 μM SB 431542 and 100 nM LDN-193189 (day 0). Organoids were fed every three days. On day 12, each EB was embedded in a droplet of Matrigel GFR and transferred to a 60-mm culture dish in DFN2 supplemented with 100 nM LDN-193189. Organoids were fed on day 14. On day 16, the medium was switched to differentiation medium consisting of a 1:1 mixture of DMEM/F12 + GlutaMAX and Neurobasal supplemented with 0.5× N-2 Supplement, 0.5× NeuroCult SM1 without vitamin A, 0.5× NEAA, 100 μM 2-ME, and 2.65 μg/mL Insulin (Thermo Fisher Scientific). Organoids were fed every 3 days and maintained with agitation in an incubator shaker (Eppendorf, New Brunswick S41i) at 80 r.p.m. On day 30, the medium was switched to neuronal medium as described above. From day 60, dibutyryl-cAMP and ascorbic acid were withdrawn from the medium and the organoids were fed every 4 days. COs were treated with 0.25 μM aphidicolin (Cayman Chemical) for 24 h before fixation (Fig. 2g, h).

Immunostaining

Cells were fixed with 4% PFA in 1× PBS at room temperature for 15 min and permeabilized and blocked with 10% normal donkey serum in PBS containing 0.3% Triton X-100 for 30 min at room temperature. COs were fixed with 4% PFA in PBS overnight at 4 °C. After washing with PBS, organoids were placed in serial dilutions of PBS-buffered sucrose (10, 20, and 30%, in sequence) at 4 °C. Each solution was replaced every day. The dehydrated organoids were maintained in 30% sucrose solution at 4 °C until embedding with OCT compound (Sakura Finetek). One day before cryosectioning, fixed organoids were placed in a 1:2 mixture of 30% sucrose solution and OCT compound and left overnight at 4 °C. Then, organoids were embedded in a 1:2 mixture of 30% sucrose solution and OCT compound, frozen immediately in dry ice/acetone, and cryosectioned at 10 μm. Tissue sections were subjected to additional fixation with 4% PFA in 1× PBS for 3 min at room temperature and permeabilized and blocked with 10% normal donkey serum in 1× PBS containing 0.3% Triton X-100 for 30 min at room temperature. After washing with 5% serum in PBS containing 0.01% Tween-20, cells or sections were incubated with primary antibodies at 4 °C overnight and with secondary antibodies (Jackson ImmunoResearch) at room temperature for 90 min. Detailed information regarding primary antibodies is available in Supplementary Data 18. Image analyses were carried out with CellProfiler80 for COs (Fig. 2h) and ImageJ (ver 1.54) for 2D neurons (Fig. 2j, k).

Real-time qPCR

Total cellular RNA was extracted from COs using TRIzol Reagent (Thermo Fisher Scientific) and a Direct-zol RNA MiniPrep kit (Zymo Research), following the manufacturer’s instructions. cDNA was prepared by reverse transcription using the SuperScript IV VILO Master Mix (Thermo Fisher Scientific). Real-time qPCR was carried out using QuantiTect SYBR Green PCR Kit (Qiagen) on QuantStudio 3 Real-Time PCR System (Applied Biosystems). The primer sequences are shown in Supplementary Data 17.

Single-cell RNA sequencing (scRNA-seq)

COs were differentiated from three WT and two SETD1Ac.4582-2delAG/+ lines from the LIBD7c6 prenatal iPSC line. Five organoids from each iPSC line on days 30 and 48 were dissociated into single cells with the Papain dissociation system (Worthington Biochemical)81. Cells from all five iPSC lines were pooled into one library per time point by cell hashing82 with TotalSeq-A Hashtag antibodies (BioLegend). The libraries were generated with Chromium Next GEM Single Cell 3′ Reagent Kits v3.1 (10× Genomics). We prepared three libraries (three technical replicates) from pooled cells for each time point. Libraries were sequenced on an Illumina NovaSeq 6000 (SP, 2 × 150 bp).

scRNA-seq data analysis

Raw sequencing data were preprocessed with Cell Ranger 7.2 (10× Genomics), in which reads were aligned to the hg38 human reference genome. Expression data were processed with Scanpy83 (ver. 1.9.3) in Python (ver. 3.11.3). Initially, we removed cells that contained fewer than 2000 total UMIs, fewer than 1000 detected genes, more than 20% of reads mapping to mitochondrial genes, or a log10(Genes per UMI) score, a measure of cell complexity, below 0.8. Read counts were then normalized to counts per 10,000 in each cell and subsequently log-transformed. We used Scanpy’s function sc.pp.highly_variable_genes to identify the top 4000 highly variable genes using the parameter “flavor = ‘seurat_v3’”. We performed dimensionality reduction with principal components analysis on the highly variable genes, followed by nearest neighbor embedding in two dimensions using uniform manifold approximation and projection (UMAP). We then pooled data across multiple libraries and batches and performed dataset integration using the Harmonypy84 (ver. 0.0.9) workflow. Cells were grouped into clusters using the Leiden algorithm85 and then annotated manually into 13 different cell types based on the expression of marker genes (Supplementary Data 4). Cell cycle phase scoring was performed using Seurat86 (4.3.0.1) in R (4.3.1). Cell type-specific pseudo-bulk differential gene expression analysis was performed to compare SETD1Ac.4582-2delAG/+ and WT cells using edgeR (3.42.4)87. Pseudo-bulk gene expression was calculated for each iPSC line at each time point by pooling data from three libraries. We used time point (“day”) as a covariate for differential expression analysis. GSEA38 was performed using the sign of the fold change multiplied by the inverse of the p-value as the ranking metric.
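The four cell-level filters described above can be illustrated with a short sketch. It is not the authors' pipeline: the function name and arguments are hypothetical, and the complexity score is assumed to be log10(genes detected) divided by log10(total UMI), a common definition that the text does not spell out.

```python
import math


def passes_qc(total_umi, n_genes, pct_mito,
              min_umi=2000, min_genes=1000,
              max_pct_mito=20.0, min_complexity=0.8):
    """Return True if a cell survives the four filters described in the text.

    Thresholds default to the values stated in the Methods; the complexity
    formula below is an assumption (log10 genes / log10 UMI).
    """
    # Depth filters: too few UMIs or too few detected genes.
    if total_umi < min_umi or n_genes < min_genes:
        return False
    # Mitochondrial filter: more than 20% mitochondrial reads.
    if pct_mito > max_pct_mito:
        return False
    # Complexity filter: assumed definition, see lead-in.
    complexity = math.log10(n_genes) / math.log10(total_umi)
    return complexity >= min_complexity


# Hypothetical cells: (total UMI, genes detected, % mitochondrial reads).
cells = [(5000, 2500, 5.0), (1500, 1200, 5.0), (100000, 1000, 2.0)]
kept = [c for c in cells if passes_qc(*c)]
```

In a Scanpy workflow the same logic would typically be applied as a boolean mask over per-cell QC columns rather than cell by cell.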

Bulk RNA-seq

Total RNA was extracted from NPCs and four-week-old neurons derived from iPSCs using TRIzol Reagent and a Direct-zol RNA MiniPrep kit, following the manufacturer’s instructions. Libraries were prepared using a TruSeq Stranded Total RNA with RiboZero Human/Mouse/Rat Library Prep Kit (Illumina) and sequenced on a NovaSeq 6000 with a read length of 150-bp paired-end by Psomagen. Fastq files were quality-checked using FastQC (ver. 0.11.3) and MultiQC88 (ver. 1.12). The reads were aligned to the human genome version hg38.p12 using HISAT289 (ver. 2.2.1). For count quantification, we used the featureCounts algorithm90 (ver. 2.0.1) with options for paired-end reads (option -P), ignoring chimeric reads (option -C), stranded alignment on the reverse strand (option -s 2), and considering only primary alignments (option --primary). For differential expression analysis between WT and SETD1A+/− cells, we used DESeq291 (ver. 1.40.2). Sequencing batch was included as a covariate. Differentially expressed genes were defined with a threshold of Benjamini and Hochberg corrected p-value (false discovery rate, FDR) < 0.1 or nominal p-value < 0.05. The t-statistic was calculated as log2FoldChange divided by the standard error of log2FoldChange. GSEA38 was performed using the sign of the fold change multiplied by the inverse of the p-value as the ranking metric. No technical replicates were analyzed.
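The GSEA ranking metric stated above (sign of the fold change multiplied by the inverse of the p-value) can be written out as a minimal sketch. The function names are hypothetical, and the floor on very small p-values is an added assumption to avoid division by zero; the text does not say how such cases were handled.

```python
def gsea_rank_score(log2_fold_change, p_value, min_p=1e-300):
    """sign(log2FC) * (1 / p); min_p is an assumed floor, not from the text."""
    sign = (log2_fold_change > 0) - (log2_fold_change < 0)
    return sign / max(p_value, min_p)


def rank_genes(results):
    """results maps gene -> (log2FC, p-value).

    Returns gene names ordered from most strongly up-regulated to most
    strongly down-regulated, the pre-ranked list a GSEA run would consume.
    """
    scores = {g: gsea_rank_score(lfc, p) for g, (lfc, p) in results.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical differential expression results.
example = {"GENE_A": (1.0, 0.01), "GENE_B": (-1.0, 0.001), "GENE_C": (0.5, 0.2)}
ranked = rank_genes(example)
```

With this metric, highly significant up-regulated genes sit at the top of the list and highly significant down-regulated genes at the bottom, while non-significant genes cluster near zero.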

Flow cytometry

Five COs per iPSC line were washed once with 1× PBS (Thermo Fisher Scientific) and incubated in 1 mL Cell Recovery Solution (Corning) at 4 °C for 1 h with agitation at 300 r.p.m. The organoids were washed once with ice-cold 1× PBS and incubated in 0.5 mL StemPro Accutase at 37 °C for 35 min to dissociate into single cells. Dissociated cells were resuspended in 200 μL of 1% BSA (Cell Signaling Technology) in 1× PBS. One million cells per iPSC line were fixed in 100 μL Click-iT Fixative (Thermo Fisher Scientific) at room temperature for 15 min. The cells were then washed twice with 0.5% BSA in 1× PBS, resuspended in 100 μL of 0.5% BSA in 1× PBS and incubated with 2 μL anti-PSA-NCAM-PE antibody (Miltenyi Biotec) at 4 °C for 15 min. We washed the cells twice with 0.5% BSA in 1× PBS and incubated them in 2.5 μL 7-AAD (BD) in 100 μL of 0.5% BSA in 1× PBS at room temperature for 10 min. Stained cells were analyzed on an LSR II flow cytometer (BD). Data were analyzed using FlowJo (ver. 10.7.1) (BD).

Multi-electrode array (MEA) recording

Four-week-old neurons were dissociated with a 1:1 mixture of StemPro Accutase and Accumax (Thermo Fisher Scientific) at 37 °C for 30 min, triturated and resuspended in wash buffer-1 consisting of 1× PBS, 2.5 mM EDTA pH 8.0 (Thermo Fisher Scientific), 25 mM HEPES-KOH pH 7.5 and 1% FBS (Thermo Fisher Scientific). The cell suspension was passed through a 70 μm cell strainer (Falcon) and the cells were resuspended in wash buffer-2 consisting of DMEM (Thermo Fisher Scientific), 1% FBS and 20 U/mL DNase I (NEB). After centrifugation, the cells were suspended in neuronal medium and passed through a 35 μm cell strainer (Falcon)92. Dissociated neurons were plated with human astrocytes (ScienCell #1800) at 4:1 on a 48-well MEA plate (Axion Biosystems) coated with 0.1% Polyethylenimine solution (MilliporeSigma) and 10 μg/mL mouse Laminin, following Elixirgen Scientific’s instructions. The cells were fed with the neuronal medium supplemented with 20 μg/mL mouse Laminin twice a week. From week 2 (week 5 of neuronal differentiation), the medium was switched to BrainPhys (Stem Cell Technologies) supplemented with 1× NeuroCult SM1, 10 mM Glucose (Thermo Fisher Scientific), 1× Anti-Anti, 20 ng/mL human BDNF and 20 ng/mL human GDNF, and the cells were fed twice a week. Recordings were performed once a week from week 3 until week 10 (from week 6 until week 13 of neuronal differentiation) using a Maestro MEA system and AxIS Software Spontaneous Neural Configuration (Axion Biosystems). Spikes were detected with AxIS software using an adaptive threshold crossing set to 5.5 times the standard deviation of the estimated noise for each electrode. The plate was first allowed to rest for 3 min in the Maestro device, and then data were recorded for 5 min. For recording, electrodes exhibiting a minimum firing rate of 5 spikes per minute were designated as active using Axion Biosystems’ Neural Metrics Tool.
Burst detection on individual electrodes was performed by applying an inter-spike interval (ISI) criterion that required at least 5 spikes occurring within a maximum ISI of 100 ms. Network burst identification required a minimum of 10 spikes meeting the same ISI parameters, with at least 25% of electrodes in the well showing activity. Cross-correlogram analysis with a 20 ms synchrony window was used to determine the synchrony index93.
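The single-electrode burst criterion above (at least 5 spikes with a maximum ISI of 100 ms) can be illustrated with a simple run-based detector. This is a sketch of the ISI rule only, not the implementation in Axion's Neural Metrics Tool; the function name and the spike train are hypothetical, and spike times are assumed to be in seconds.

```python
def detect_bursts(spike_times, max_isi=0.1, min_spikes=5):
    """Return (start, end, n_spikes) for each run of consecutive spikes
    whose inter-spike intervals are all <= max_isi and which contains
    at least min_spikes spikes."""
    bursts = []
    if not spike_times:
        return bursts
    times = sorted(spike_times)
    run_start = 0
    for i in range(1, len(times) + 1):
        # Close the current run at the end of the train or when the gap
        # to the next spike exceeds the maximum ISI.
        if i == len(times) or times[i] - times[i - 1] > max_isi:
            n = i - run_start
            if n >= min_spikes:
                bursts.append((times[run_start], times[i - 1], n))
            run_start = i
    return bursts


# Hypothetical spike train (seconds): one 5-spike burst, a 2-spike run
# that is too short, and one 6-spike burst.
train = [0.00, 0.05, 0.10, 0.15, 0.20,
         1.00, 1.05,
         2.00, 2.02, 2.04, 2.06, 2.08, 2.10]
bursts = detect_bursts(train)
```

A network-burst detector would apply the same ISI rule to spikes pooled across electrodes with `min_spikes=10` and add the requirement that at least 25% of electrodes participate.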

Sholl analysis

Three-week-old neurons were dissociated and co-cultured with human astrocytes as described above. Ten days after starting co-culture, neurons were transduced with Incucyte Neurolight Orange Lentivirus (Sartorius) and incubated for ten days. Neuronal morphology at six weeks of differentiation was analyzed using Neurolucida software (MBF Bioscience). The Sholl profile was obtained by applying a series of concentric circles at 10 μm intervals from the soma center and counting the number of neurite intersections at each radius.
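The Sholl procedure can be sketched for a traced neuron represented as short straight segments. This is an illustrative approximation, not the Neurolucida algorithm: a segment is counted as crossing a circle when its two endpoints lie on opposite sides of that radius, which holds for finely sampled tracings but can miss segments that enter and leave a circle between endpoints. All names and coordinates are hypothetical.

```python
import math


def sholl_profile(segments, soma, step=10.0, max_radius=100.0):
    """Count neurite crossings of concentric circles centred on the soma.

    segments: list of ((x1, y1), (x2, y2)) in micrometres.
    Returns a list of (radius, n_intersections).
    """
    n_circles = int(max_radius // step)
    profile = []
    for k in range(1, n_circles + 1):
        radius = step * k
        crossings = 0
        for (x1, y1), (x2, y2) in segments:
            d1 = math.hypot(x1 - soma[0], y1 - soma[1])
            d2 = math.hypot(x2 - soma[0], y2 - soma[1])
            # Endpoints on opposite sides of the circle imply one crossing.
            if min(d1, d2) < radius <= max(d1, d2):
                crossings += 1
        profile.append((radius, crossings))
    return profile


# Two hypothetical neurites leaving a soma at the origin.
neurites = [((0.0, 0.0), (25.0, 0.0)), ((0.0, 0.0), (0.0, 15.0))]
profile = sholl_profile(neurites, soma=(0.0, 0.0), step=10.0, max_radius=30.0)
```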

SETD1A knockdown by siRNA

HEK293T cells (ATCC #CRL-3216) were transfected with a pool of three SETD1A siRNAs or a scrambled siRNA (Silencer® Select (ThermoFisher Scientific)) at 10 nM, following the manufacturer’s protocol. Cells were harvested at 72 h after transfection.

Western blotting

For postmortem DLPFC whole tissue lysate, frozen tissue was homogenized and lysed in RIPA buffer (150 mM NaCl, 1% NP-40, 0.5% sodium deoxycholate, 0.1% SDS, 50 mM Tris-HCl (pH 7.5) and 1× protease inhibitor (MilliporeSigma)) and centrifuged at 20,000 × g for 30 min at 4 °C. For postmortem DLPFC nuclei lysate, frozen tissue was homogenized in 5× tissue volume of nuclei extraction buffer (NEB) (320 mM sucrose, 5 mM CaCl2, 3 mM Mg(Ac)2, 0.1 mM EDTA, 0.1% NP-40, 10 mM Tris-HCl (pH 7.5) and 1× protease inhibitor) with a Dounce homogenizer on ice. Brain homogenate was centrifuged at 94 × g for 8 min at 4 °C. The pellet was resuspended gently in 500 μL NEB and debris was strained out with a 35 μm filter. The tissue suspension was mixed with iodixanol (Optiprep, MilliporeSigma) to a final iodixanol concentration of 25%. The suspension was layered over 29% iodixanol and centrifuged at 13,500 × g for 20 min at 4 °C. The pellet was lysed in RIPA buffer and centrifuged at 20,000 × g for 30 min at 4 °C. Protein concentration was measured with a BCA protein assay kit (Pierce). The whole tissue lysate and nuclei lysate (26 μg of protein for each) were loaded onto an SDS-PAGE gel (Thermo Fisher Scientific). NPCs, neurons, and HEK293T cells were lysed in RIPA buffer and centrifuged at 20,000 × g for 30 min at 4 °C. For detection of histone modifications, cell lysates were sonicated on a Bioruptor Pico (Diagenode). The lysates were loaded on an SDS-PAGE gel. Signals were detected on an Odyssey DLx (LI-COR Biosciences). ImageJ (ver 1.54) was used to quantify signals. Primary antibodies used for WB are listed in Supplementary Data 18. The SETD1A_CST antibody was used in Fig. 1g.

CUT&Tag

CUT&Tag43 was carried out following the step-by-step protocol from the Henikoff lab (dx.doi.org/10.17504/protocols.io.wnufdew). Tn5-adapter complex was purchased from EpiCypher. Nuclei were isolated from human prenatal frontal lobes (110–170 mg tissue) as described above. The nuclear pellets were resuspended gently in nuclei storage buffer (NSB) (167 mM sucrose, 5 mM CaCl2, 3 mM Mg(Ac)2, Tris-HCl (pH 7.5) and 1× protease inhibitor) and incubated on ice until the pellet loosened. NPCs were dissociated with Accutase at 37 °C for 2 min. Four-week-old neurons were dissociated with Papain (Worthington Biochemical) dissolved in a 1:1 solution of Accutase and 1× PBS at 37 °C for 30 min. Dissociated neurons were then washed with Neurobasal medium supplemented with 1× GlutaMAX and 2,000 U/mL DNase I (Worthington Biochemical). We used 1 × 10^5 nuclei or cells for histone marks and 1 × 10^6 nuclei or cells for SETD1A. Antibodies used for CUT&Tag are listed in Supplementary Data 18. The control IgG and antibodies against H3K4me marks were used at 1:100 and the antibodies against H3K27ac and SETD1A were used at 1:50. Index primer mix (Nextera DNA UD Indexes Set A) was purchased from Illumina. Libraries were sequenced on a HiSeq2500 (Illumina) with a read length of 50-bp paired-end by Psomagen.

CUT&Tag data analysis

CUT&Tag data were processed and analyzed following a tutorial (dx.doi.org/10.17504/protocols.io.bjk2kkye) with modifications. Fastq files were pre-processed using FastQC (ver. 0.11.3) and MultiQC88 (ver. 1.12). Paired-end reads were aligned to the human genome version hg38.p12 using bowtie294 (ver. 2.4.5) with the following parameters: “--end-to-end --very-sensitive --no-mixed --no-discordant --phred33 -I 10 -X 700”. The reads were not trimmed, and we did not perform spike-in calibration. Duplicated reads were marked and removed from each sample using Picard (ver. 2.21.1) to generate .sam files. Each file was sorted and filtered based on read quality (“minQualityScore = 2”) with samtools95 (ver. 1.14) and converted to .bam and .bedgraph files with bedtools96 (ver. 2.30.0). To evaluate reproducibility among the samples, read counts were split into 500 base pair (bp) bins and Spearman’s rank correlation was calculated for all pairwise combinations of brain samples and antibodies.
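The reproducibility check (500 bp bins followed by Spearman's rank correlation) can be sketched with the standard library alone. This is an illustration under simplifying assumptions, not the authors' code: a single chromosome, reads represented by 0-based start positions, and hypothetical function names.

```python
def bin_counts(read_starts, chrom_length, bin_size=500):
    """Count reads per fixed-width bin along one chromosome."""
    n_bins = (chrom_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for pos in read_starts:
        counts[pos // bin_size] += 1
    return counts


def _ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Hypothetical read start positions for two samples on a 2 kb region.
sample_a = bin_counts([0, 10, 499, 500, 1200, 1201, 1900], 2000)
sample_b = bin_counts([5, 20, 30, 480, 700, 1300, 1310, 1350, 1950], 2000)
rho = spearman(sample_a, sample_b)
```

In practice the binned counts for every sample would be assembled into a matrix and the correlation computed for all sample pairs.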

For the human prenatal cortical samples, sparse enrichment analysis for CUT&RUN (SEACR)97 was used for peak calling. Peaks for the reads from each antibody were identified using the reads from control IgG as background. The SEACR “norm” and “stringent” arguments were used. For iPSC-derived NPCs and neurons, we used MACS298 (v.2.2.9.1) with the command “macs2 callpeak -t <sample_input.bam> -c <IgG_input.bedgraph> -f BAM -g hs --keep-dup all -n <sample_id> --outdir ./ -B --trackline” on the .bam files of each sample. Peaks were visualized using Integrative Genomics Viewer99 (ver. 2.17.3). To annotate peaks into genomic regions, we initially used the “annotatePeak()” function of ChIPseeker44 (ver. 1.36.0). For annotation by ChIPseeker, promoters were defined as genomic regions from −1 kb to 100 bp of the TSS for each gene. We then manually defined an “active promoter” as an H3K4me3 peak overlapping a promoter annotated by ChIPseeker and an “active enhancer” as an H3K27ac peak not overlapping our “active promoters”. Peaks outside active regulatory regions were annotated based on the ChIPseeker annotation and defined as “other genomic regions”. To find overlapping peaks between samples, we used GenomicRanges100 (ver. 1.52.0) and the “findOverlapsOfPeaks()” function of ChIPpeakAnno101 (ver. 3.28.1) with the arguments “ignore.strand = T” and “connectedPeaks = “merge””. The overrepresentation test was carried out using the permTest() function of regioneR102 (ver. 1.26.1) with the arguments: function = randomizeRegions, evaluate.function = numOverlaps, count.once = TRUE and 10,000 permutations. As a background genome, we used a custom version of hg38 with 910 genomic regions removed based on the DAC exclusion list regions (https://www.encodeproject.org/files/ENCFF356LFX/). For SCZ GWAS loci, we used merged regions (LD r2 > 0.1 clumps plus a 50 kb window)5. SCZ GWAS loci and DRHs identified by SAR-seq30 were converted to hg38 using the liftOver package103 (ver. 1.30.0).
Pathway enrichment analysis was performed using rGREAT104 (ver. 1.27.1). We selected pathways with p-value < 0.05 from binomial test as enriched pathways.
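The logic behind the permutation-based overrepresentation test described above can be illustrated with a deliberately simplified sketch. It is not regioneR's implementation: it assumes a single linear genome with half-open (start, end) intervals, places each randomized region uniformly without masking excluded regions or preventing overlaps among randomized regions, and uses a (count + 1)/(permutations + 1) empirical p-value. All names and coordinates are hypothetical.

```python
import random


def count_overlapping(regions, targets):
    """Number of regions overlapping at least one target (counted once each)."""
    return sum(
        any(s < te and ts < e for ts, te in targets)
        for s, e in regions
    )


def permutation_overlap_test(regions, targets, genome_length,
                             n_perm=1000, seed=0):
    """Compare the observed overlap count with counts from randomly
    repositioned regions of the same widths. Returns (observed, p_value)."""
    rng = random.Random(seed)
    observed = count_overlapping(regions, targets)
    at_least = 0
    for _ in range(n_perm):
        shuffled = []
        for s, e in regions:
            width = e - s
            new_start = rng.randint(0, genome_length - width)
            shuffled.append((new_start, new_start + width))
        if count_overlapping(shuffled, targets) >= observed:
            at_least += 1
    p_value = (at_least + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical peaks that all sit on hypothetical risk loci.
peaks = [(100, 200), (1000, 1100), (5000, 5100)]
loci = [(150, 160), (1050, 1060), (5050, 5060)]
observed, p_value = permutation_overlap_test(peaks, loci, 1_000_000, n_perm=200)
```

The smallest attainable p-value is bounded by the number of permutations, which is one reason a large permutation count such as 10,000 is used.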

To compare H3K4me3 and SETD1A binding density between the genotypes, the diffbind package105 (ver. 3.10.0) was used to obtain RPKM value for each sample. The binding density was compared using the Wilcoxon signed-rank test in R (ver. 4.3.1).

For GO enrichment analysis of SCZ GWAS significant genes, we used g:profiler106 (ver. 0.2.0) by using all 682 SCZ GWAS significant (FINEMAP + SMR) genes5 as “background”. For synaptic gene enrichment, SynGO107 was used with “brain expressed” genes as background.

Linkage disequilibrium (LD) score regression

We performed stratified LD score (S-LDSC) regression108 to evaluate the enrichment of heritability of brain-related traits for CUT&Tag peaks. We selected eight brain-related traits with GWAS sample sizes >40,000, including a wide range of behavioral traits, and neurodevelopmental, psychiatric, and neurological disorders109. We also included five non-brain traits as controls. We downloaded GWAS summary statistics of each trait from the sources listed in Supplementary Data 19. Following recommendations from the LDSC resource website (https://alkesgroup.broadinstitute.org/LDSCORE), S-LDSC was run using LDSC (ver. 1.0.1)47 for each list of variants that included 97 annotations to control for the LD between variants with other functional annotations in the genome. We used HapMap Project Phase 3 SNPs as regression SNPs, and 1000 Genomes SNPs of European ancestry samples as reference SNPs, downloaded from the LDSC resource website.

WGCNA on human prenatal cortical transcriptomic data

To identify gene expression patterns in human prenatal cortices, we performed WGCNA. Transcriptome data of fetal cortices (n = 50 individuals in the second trimester)77 (Supplementary Fig. 7a) were used. Genes with sufficient abundance (RPKM ≥ 0.1) in more than 80% of samples (n = 20,675 genes) were then normalized by log2(RPKM + 1) transformation. We then removed 10 samples outlying for the standardized connectivity Z.k for a threshold of min Z.k = −4; Z.k was computed by hierarchical clustering of the Euclidean distances measured from the expression data110. Normalized expression data from all 40 samples were adjusted to remove unwanted variance potentially explained by RNA quality with cleaningY function (ver. 0.99.30). Residuals calculated from the “cleaning” model were then used as expression input in WGCNA. For the co-expression network analysis, we used the WGCNA package111 (ver. 1.68). The beta power threshold was selected by analysis of scale-free topology for multiple soft thresholding powers. The networks were automatically created in a step-wise manner with functions implemented in WGCNA: (1) Calculation of adjacency matrix: correlation type = bi-weight midcorrelation; type of network = signed; power = 12 selected with soft thresholding to correspond to an R2 ≥ 0.8; (2) Calculation of topological overlap matrix dissimilarity; (3) Detect modules by hierarchical clustering (deep split parameter = 2, minimum module size = 50) and summarize them by the first principal component (i.e., the “eigengene”- ME). The MEs can be regarded as expression profiles that best characterize the gene correlations within modules. Biological inference can be drawn from the genes in the constructed modules by using gene set enrichment analyses and by correlating module eigengenes with biological covariates. (4) Merging modules whose eigengenes are highly correlated for a height of cut dendrogram of 0.15. 
Genes were finally assigned to 63 modules (gene size per module: minimum = 61, maximum = 1802); 159/20,675 genes were not assigned to modules (“grey” genes). Detected modules were labeled with pre-specified colors implemented in WGCNA routines. GO analysis was then carried out on the modules with g:Profiler106 (ver. 0.2.0) to find biological processes and functions enriched in the gene set of each module. The whole list of genes (n = 20,675) was used as background. Expression data for 875 genes, comprising SETD1A and 874 SETD1A-bound genes in the top 11 modules showing high correlation with SETD1A (|Pearson correlation coefficient (r)| > 0.65), were used to calculate a sub-network, that is, an 875 × 875 co-expression matrix of Pearson’s correlation coefficients. This sub-network was subsequently converted into a graph object (an adjacency matrix) using functions from the igraph package112. Edges (i.e., correlation coefficients) with an absolute value below 0.6 were removed. In this sub-network, the 875 genes are the nodes and the correlation coefficients are the edge weights (i.e., links between nodes weighted by the strength of the correlation between them).
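The construction of the thresholded co-expression sub-network can be sketched in a few lines. This is an illustration rather than the authors' R/igraph code: gene names and expression values are hypothetical, and the edge list stands in for the weighted adjacency matrix.

```python
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def correlation_edges(expression, min_abs_r=0.6):
    """expression maps gene -> expression values across samples.

    Returns (gene_a, gene_b, r) for every gene pair with |r| >= min_abs_r;
    weaker edges are dropped, as described in the text.
    """
    edges = []
    for a, b in combinations(sorted(expression), 2):
        r = pearson(expression[a], expression[b])
        if abs(r) >= min_abs_r:
            edges.append((a, b, r))
    return edges


# Hypothetical residualized expression for three genes in four samples.
expr = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [2.0, 4.0, 6.0, 8.5],
    "G3": [1.0, -1.0, 1.0, -1.0],
}
edges = correlation_edges(expr)
```

Genes are the nodes and each retained correlation is a weighted edge, so the resulting list can be passed to any graph library for layout or centrality analysis.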

Gene expression correlation analysis

Postmortem DLPFC samples from 165 European ancestry individuals66,77 were analyzed (Supplementary Fig. 10a). Gene expression was normalized and transformed into log2(TPM + 1). We used lmFit in the limma package113 (ver. 3.54.2) to remove covariate effects, including diagnosis, gender, five genetic principal components (PCs), and 17 expression PCs. The number of expression PCs was determined using the sva package114 (ver. 3.46.0). Co-expression was examined using Pearson correlation (cor.test). GO enrichment analysis was then performed with g:Profiler106 (ver. 0.2.0).

Differential gene expression analysis of psychENCODE (PEC) and BrainSeq datasets

We analyzed DEGs in postmortem cortices between neurotypical control individuals and individuals with SCZ identified in the PEC project65. Fisher’s exact test was performed for the enrichment analyses. Neurotypical control and SCZ cohorts were stratified into three subgroups based on SETD1A expression level on the quantile ranges (<25% [low], >25 to <75% [middle] and >75% quantiles [high]). Differential expression analysis between Control-high vs. SCZ-low was performed following the PEC study65. Briefly, we used DESeq291 (ver. 1.40.2) with RIN, sex, ethnicity, and age of death as covariates, selected based on principal component analysis using factoextra (v.1.0.7) (https://CRAN.R-project.org/package=factoextra) and singular value decomposition using the “svd()” function in R. Genes were filtered to include only those on autosomes longer than 250 base pairs with transcripts per million reads (TPM) > 0.1 in at least 25% of samples, removing immunoglobulin isotypes. For the BrainSeq datasets66, transcriptome profiles of postmortem DLPFC tissue from 226 neurotypical control individuals and 155 individuals with SCZ (≥17 years old at death) were analyzed. SETD1A expression levels were evaluated by generating an edgeR87 (ver. 4.2.2) object [DGEList()] from raw counts, which were normalized to library size using the calcNormFactors() function to generate trimmed mean of M-values counts. The normalized SETD1A counts were subjected to voom normalization115 using the limma package113 (ver. 3.62.1) with a linear model. Voom-normalized counts were fitted to the null model using limma’s lmFit() function. After regressing out covariates, the z-score transformation was applied to the residualized counts116. We then stratified both the neurotypical control and SCZ cohorts into three subgroups on the quantile ranges (<25% [low], >25 to <75% [middle] and >75% quantiles [high]) of the SETD1A residualized expression.
Differential gene expression analysis was performed using edgeR and limma by using RIN, sex, age of death, mtDNA rate, rRNA rate, and number of total assigned genes as covariates. DEGs were defined with a threshold of FDR < 0.05. For GO enrichment analysis, we used g:profiler106 (ver. 0.2.0) by using all 682 SCZ GWAS significant genes5 as “background”. For synaptic gene enrichment, SynGO107 was used with “brain expressed” genes as background.
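The quantile-based stratification of samples by SETD1A expression described above can be illustrated as follows. This is a sketch, not the authors' R code: the quartile method ("inclusive" interpolation) and the assignment of values that fall exactly on a cut-point to the middle group are assumptions, since the text does not specify either, and the expression values are hypothetical.

```python
import statistics


def stratify_by_quartile(values):
    """Label each value 'low' (< Q1), 'high' (> Q3) or 'middle' (otherwise)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    labels = []
    for v in values:
        if v < q1:
            labels.append("low")
        elif v > q3:
            labels.append("high")
        else:
            # Includes values exactly on Q1 or Q3 (assumed convention).
            labels.append("middle")
    return labels


# Hypothetical residualized SETD1A z-scores for nine samples in one cohort.
z_scores = [-1.8, -1.1, -0.6, -0.2, 0.0, 0.3, 0.7, 1.2, 1.9]
groups = stratify_by_quartile(z_scores)
```

Applying the function separately to the control and SCZ cohorts yields the Control-high and SCZ-low subgroups that are then contrasted in the differential expression model.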

Statistics

Raw data were tested for normality of distribution, and statistical analyses were performed with a method described in figure legends using GraphPad Prism (ver. 10), Python (ver. 3.11.3), and R (ver. 4.3.1). P-values of less than 0.05 were considered to indicate a statistically significant difference between groups unless otherwise indicated.

Statistics and reproducibility

The results shown as representative images (Figs. 1g and 2f) were validated and reproduced in at least two independent batches of experiments.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.