Abstract
During mouse embryonic brain development, the Evf2 ultraconserved enhancer (UCE) lncRNA guides the Dlx5/6UCE to ~129 sites across chr6. However, previous work identified only 4 transcriptionally regulated targets associated with Evf2-Dlx5/6UCE-gene guidance, raising questions about the significance of Evf2-regulated Dlx5/6UCE-gene interactions. Here, single-cell transcriptomics reveal far greater alignment between Evf2-Dlx5/6UCE-gene guidance and transcriptional regulation than previously reported. Evf2 divides chr6 into short-range ( <10 Mb distant), activated genes, and long/super-long-range (10-129 Mb distant), repressed genes, identifying seizure regulating genes in the embryonic subventricular zone that predict adult phenotypes. Evf2-regulated Dlx5/6UCE-gene distances and directions (closer to or further from gene targets) can be decoupled from gene target transcriptional effects. Evf2 regulates Evf2-ribonucleoprotein (RNP) binding in a combinatorial manner to key regulatory sites, including chr6 Evf2-Dlx5/6UCE-gene guided sites, Evf1/2 RNA-directly bound sites (RBSs), and inter-chromosomal HiC looping interactions. RBSs divide chromosomes into multi-megabase domains enriched for Evf2-regulated RNP recruitment, transcription factor motifs, and HiC looping interactions. Together with Evf2-controlled homeobox motif recognition at Evf2-RNP recruitment sites and transcription factor motif enrichment in RBSs with DNA identity, this work supports direct roles for Evf2 in enhancer-gene guidance and transcriptional regulation, with the potential for both site-specific and chromosomal domain specific RNP recruitment.
Introduction
With the discovery that DNA regulatory sequences selectively regulate genes across megabase distances of the genome, it is now clear that co-regulation does not require co-linear gene organization, but rather 3D organization. Understanding how key regulatory sequences, specifically enhancers, select specific target genes, while skipping others has become a major focus in the field. Multiple 3D gene organizational mechanisms have been proposed, including but not limited to, formation of intra-chromosomal and inter-chromosomal enhancer-gene 3D hubs, loop extrusion through CTCF and cohesin recruitment, R-loop formation, and CTCF-RNA binding1,2,3,4,5.
Initial studies on short enhancer non-coding RNAs (eRNAs, <200 bp length) demonstrated enhancer regulating activities6,7 and short-range chromosome looping activity8. However, genome-wide roles for enhancer long non-coding RNAs (e-lncRNAs, >200 bp) as organizers have also been proposed9,10, supported by enrichment at chromosomal loop anchors, increased frequency of interactions with promoters, RNA Pol II and YY1 binding, and G-quadruplexes11. A human atlas of 45,411 eRNAs (HeRA) from 9577 samples across 54 human tissues12, and a compendium of eRNA functional assays highlight their potential significance13 and dual regulatory functions at the level of RNA and DNA.
Our work on Evf2 identified nuclear e-lncRNA cloud formation with DLX homeodomain transcription factors (TFs), and trans-regulation of enhancer activity through ultraconserved sequences3. These findings were based on the initial identification and characterization of the ultraconserved Dlx5/6 intergenic enhancer I56i (Dlx5/6UCE)14. Genome-wide UCEs were initially described as >200 bp sequences of 100% conserved DNA between human, mice, and rat genomes15,16. The results from deleting UCEs in mice are variable, from no change in viability or phenotype17, to neurological/growth abnormalities18 or limb defects19. Although the absence of viability or phenotypic effects in mice may be explained by enhancer redundancy20, most UCE polymorphisms exhibit low minor allele frequencies, suggesting that UCE polymorphisms may contribute to human disease21. Against a requirement for ultraconservation, many UCEs are functionally robust to mutagenesis, with little loss of function at 5% mutation of ultraconserved base pairs22,23. In addition, most UCEs show only modest levels of purifying selection against point mutations24, with >300 mutations identified in a database of 4273 human UCEs25. While the reason for ultraconservation remains a mystery, the subset of transcribed UCEs, with dual functions at both the RNA and DNA level, may provide further insight.
Functional roles for individual lncRNAs and eRNAs support roles in 3D organization. Investigations into lncRNA and e-lncRNA regulators of 3D organization reveal both intra-chromosomal [Xist26, ThymoD lncRNA27, Evf25] and inter-chromosomal [Firre28, DRReRNA29] mechanisms. The Xist lncRNA is among the most well-characterized lncRNAs that regulate intra-chromosomal 3D organization26,30. Evf2 and Xist exhibit several similarities, both forming RNA clouds at sites of transcription initiation, altering chromosome topology and repressing genes through effects on chromatin remodeling3,5,26,30,31,32. Given that Xist appears to form clouds using conventional confocal microscopy, but can be resolved into discrete puncta that colocalize with H3K27me3 and PRC2 by super-resolution microscopy33, it remains to be determined whether Evf2 clouds are also composed of multiple discrete puncta. However, in visualizing Evf2 RNA clouds using near super-resolution microscopy, we could not resolve Evf2 RNA clouds into discrete puncta beyond the capability of Zeiss LSM 880 Airyscan confocal imaging (which nears super-resolution detection, enabling 140 nm resolution laterally and 400 nm resolution axially5).
Proteomic analysis of the Xist-RNP has revealed the roles of several RNPs34,35,36,37,38. Another commonality between Xist and Evf2 is that the Xist-RNP85 contains 16 proteins34, shared with Evf2-RNP8732, including SMARCA4, the catalytic subunit of BAF chromatin-remodeling complex, and 3D organizer cohesin (SMC1a and SMC3)32,35. Evf2 RNA- and G-rich RNA-SMARCA4 interactions inhibit chromatin remodeling activity through inhibition of ATPase activity32, a mechanism also shared during X-inactivation31. Furthermore, Evf2-dependent recruitment of cohesin near key topologically controlled sites supports involvement in Evf2-Dlx5/6UCE long-range gene guidance5; both intra- (ThymoD lncRNA27) and inter-(DRReRNA29) chromosomal RNA regulators also recruit cohesin, supporting shared lncRNA-3D mechanisms. In addition to its cis acting effects on the X chromosome39, Xist downregulates autosomal genes in trans40. Therefore, Evf2 cis- and trans- mediated gene regulation of chr6 genes5 and direct Evf2 RNA binding and transcriptional regulation of genes on non-chr6 chromosomes (this work) provide additional parallels between Xist and Evf2.
While our previous work showed that Evf2 guides Dlx5/6UCE to sites across an entire chromosome, in part through SOX2 and cohesin recruitment4,5, the significance of the majority of Dlx5/6UCE guidance events was not known. In addition, despite the demonstration of Evf2 RNA cloud formation at transcriptionally regulated target genes, directly bound Evf2 sites that would distinguish between direct and cascade mechanisms had not been determined. In this work, we combine scRNAseq, ChIRPseq, Cut&Run, and HiC, with previously reported Dlx5/6UCE chromosome conformation capture (4Cseq) and Cut&Run/ChIPseq datasets4,5. We identify a role for 1D and 3D enhancer-gene relationships in Evf2 e-lncRNA-mediated selectivity in transcriptional regulation, and RBS-defined chromosomal domains (RCDs) that are differentially enriched for Evf2-regulated RNPs and chromosomal interactions. Taken together, these data link multiple Evf2 regulatory mechanisms, including Evf2 direct RNA binding, RNP recruitment, TF-DNA motif recognition, RCD formation, and Dlx5/6UCE guidance, that in combination are responsible for selectively regulating gene transcription.
Results
Cellular heterogeneity masks the extent of Evf2 transcriptional regulation in mouse E13.5 ganglionic eminences
Chromosome conformation capture (4Cseq) studies using Dlx5/6UCE as the bait identified 129 Evf2-regulated Dlx5/6UCE sites located within 50 kb of gene targets on chr65. However, bulk analysis (microarray) of E13.5GEs identified only 6 Evf2 regulated chr6 genes, 2 adjacent genes (Dlx5 and Dlx6), and 4 long-range genes5. One possibility was that Dlx5/6UCE regulation poises genes for regulation later in development41. However, heterogeneity of Evf2 RNA nuclear cloud numbers and sizes4,42, Evf2-regulated SOX2 protein pool/condensate (PPC) heterogeneity4, Evf2-regulated enhancer-gene distance heterogeneity5, and transcriptional heterogeneity among E13.5GE subpopulations in mice and primates43,44, supported the possibility that E13.5GE cellular heterogeneity masks the extent of Evf2 transcriptional effects. In order to determine the full extent of Evf2 transcriptional regulation, we used 10X genomics scRNAseq to sequence 16,294 single E13.5GE cells from wildtypes (Evf2+/+, 8GEs) and mice with a targeted transcription stop in Evf-exon 1, preventing Evf2 expression45, (Evf2TS/TS, 4GEs) (Fig. 1A). We used differential analysis to identify Evf2 gene regulation at single-cell resolution (Fig. 1A–D, Fig. S1, S2). In the subventricular zone (SVZ), where Evf2 is first activated, Evf2 is decreased by >5-fold, consistent with previous characterizations of Evf2TS/TS 45. ScRNAseq analysis shows that low levels of Evf2 transcripts detected in Evf2TS/TS GEs (Fig. 1D) contain the common region between Evf1 and Evf2 (exon4) (Fig. S2D, blue peaks). These results are consistent with previously reported TAQman qRT-PCR5,45. Analysis of Evf1/2TS/TS E13.5GE, where both Evf1 and Evf2 transcription are prevented (Fig. 1A, mice containing TS insertions in exon 1 and exon 3), confirms that exon 4-containing transcripts are substantially reduced compared to wildtype and Evf2TS/TS (Fig. S2D, pink peaks).
A 10Xgenomics scRNAseq datasets generated from embryonic brain tissues dissected as shown: E13.5GEs (lateral/medial/caudal, LGE/MGE/CGE). Mice containing a triplepolyA transcription stop signal inserted into Evf2 exon1 (Evf2TS/TS, described in45) are compared with wildtypes (Evf2+/+). B Colors assigned to major scRNAseq E13.5GE subpopulations. C UMAP identifies clustering of major subpopulations (VZ, SVZ, and Lhx6 + , Lhx6/8 + , or Isl1+ clusters), Evf2 activation in cells transiting from cycling progenitor zones (VZ, light pink, and SVZ, yellow). Arrow represents transition from the SVZ to differentiated populations. D UMAPs comparing Evf2+/+ and Evf2TS/TS subpopulations, and overlapping Evf2 co-expression with Dlx5 and Dlx6. E. Toppgene analysis of SVZ and SVZ-derived subpopulations, identifying shared and distinct functional groups (cell cycle genes, chromosomal organization, clonic seizures). F In Evf2TS/TS, cell cycle target genes (8) decrease in differentiated populations, while 2/8 increase in the subventricular zone. G, H Evf2 regulation of histone modification (H3K27me3, native ChIPseq), and/or Evf2-RNP recruitment (SMARCA4, SMARCC2, SOX2, DLX, Cut&Run) at cell cycle genes, peaks comparing Evf2+/+ vs. Evf2TS/TS. I Toppgene analysis of Evf2 regulated genes involved in chromosomal organization. J Evf2 differential regulation of genes involved in chromosomal organization among E13.5GE subpopulations. K. Evf2 regulation of histone modification (H3K27me3, ChIPseq), and Evf2-RNP recruitment (SMARCA4, SMARCC2, SOX2, DLX, Cut&Run) at chromosome organization regulator Top2a gene, peaks comparing Evf2+/+ vs. Evf2TS/TS. L Toppgene analysis of Evf2-regulated genes in E13.5GE SVZ and human phenotypes associated with seizures. Relative expression of 24 genes involved in interictal epileptiform activity and the subset of epilepsy genes, colors corresponding to chromosomal assignment. 
M–R Evf2 regulation of histone modification (H3K27me35), and Evf2-RNP recruitment (SMARCA4, SOX2, DLX, SMC34,5, SMARCC2, NONO [Cut&Run, this work, see Fig. 2C]) at 6 genes involved in interictal epileptiform activity, as indicated in peaks comparing Evf2+/+ vs. Evf2TS/TS. Toppgene hypergeometric distribution calculation of p values (two-tailed, E, I, L).
Cell type annotations are determined using marker specificity within clusters of cells, as determined by genesorteR (https://github.com/mahmoudibrahim/genesorteR). Subpopulations corresponding to previously annotated cell types43, and cycling cells (ventricular zone [VZ], SVZ) are shown in UMAPs for each genotype (Fig. 1B, C). Evf2 is activated and co-expressed with Dlx5 and Dlx6 in cells transitioning from VZ to SVZ (Fig. 1C, D), consistent with previous RNA in situ analysis3,42,45. The UMAP identifies Evf2(+) populations in cycling progenitors (SVZ) and differentiated neuronal populations (Lhx6, Lhx6/8 clusters) (Fig. 1C), populations that have exited the SVZ. The Isl1 cluster expresses low levels of Evf2 and high levels of Dlx6, consistent with Evf2 (also known as Dlx6OS1) - mediated Dlx6 anti-sense gene repression5.
We also identify a major NeuroD6 cluster that is largely Evf2 (-), in which only 3.8% of all 3747 Evf2 + E13.5GE cells co-express NeuroD6 (Fig. S1D, F). Given this low percentage of cells co-expressing NeuroD6 and Evf2, and previous analysis of GABAergic progenitor scSeq analysis46, we excluded the NeuroD6 cluster from further analysis. However, the origin of the NeuroD6 subpopulation in embryonic GEs is controversial, as fatemapping studies using mice containing a cre knockin into NeuroD6 exon2 (NEX-cre)47,48 did not detect SVZ cells without viral infection, and may have disrupted regulation of the locus. In our examination of the NeuroD6 locus (located on chr6 upstream of Evf2/Dlx5/6UCE), there is a high level of histone methylation and acetylation in E13.5GEs, corresponding to where NEX-cre integrates (NeuroD6 intron2, Fig. S3). Together with Evf2 and Evf2/Sox2-Dlx5/6UCE guidance upstream and downstream of NeuroD6, RNP recruitment to the NeuroD6−5’ promoter4,5, and studies shown later in this report identifying an Evf1/2 RNA binding site downstream of NeuroD6 (Fig. S3), we cannot rule out the possibility of a direct role for Evf2 in NeuroD6 gene regulation in the SVZ, prior to differentiation. Experiments to investigate whether NeuroD6 populations derive from ventral SVZ in mice where the NeuroD6 locus is not altered will be important to resolve the question of the origin of the NeuroD6 subpopulation.
Differential scRNAseq comparing Evf2+/+ vs. Evf2TS/TS E13.5GE subpopulations reveals a greater extent of Evf2 transcriptionally regulated target genes than previously identified (Figs. 1 and S1). In Evf2(+) and Dlx5 UMAP-defined clusters (well-established derivatives of the SVZ), there are 559 SVZ [293(activated)/265(repressed)], 107 Lhx6, and 101 Lhx6/8 Evf2 transcriptional targets (cutoffs of >100 cells, >1.5-fold change, qval < 10−5). Evf2 transcriptionally regulates genes in clusters that either no longer express Evf2, or express Evf2 at low levels after exiting the SVZ (362 (Isl1) and 60 (NeuroG2)). Therefore, it is possible that Evf2 is required in SVZ progenitors prior to differentiation, at low levels in differentiated clusters, and/or indirect effects including cell non-autonomous or downstream effects are involved. Despite extensive gene expression effects within clusters, Evf2 does not appear to affect overall cell type or cell cycle proportions in SVZ (Fig. S1D, E) arguing against major roles in fate specification or cell cycle control. Toppgene analysis of Evf2-regulated genes identifies both common and unique GO functional terms (Fig. 1E). In differentiated neuronal clusters (Lhx6, Lhx6/8, Isl1), mitotic cell cycle genes or nuclear division are the number 1 Evf2-regulated GO Biological Process. In Evf2TS/TS differentiated clusters, but not SVZ, cell cycle regulators are decreased, supporting effects on transitory cells (cells expressing both cell cycle and differentiated genes) (Fig. 1F). Previously reported native ChIPseq, crosslinked ChIPseq, and Cut&Run4,5 combined with newly generated Cut&Run profiles for Evf2 RNPs SMARCC2 and NONO, show that Evf2 regulates H3K27me3 and/or Evf2 RNP recruitment to cell cycle genes, (Figs. 1G–H and S4A–D). 
Although bulk E13.5GE tissues are used for ChIPseq/Cut&Run methods, Evf2-regulated RNP recruitment changes are detected at targets with cluster-specific effects (Ccnf/Cdk1, increased in SVZ, decreased in differentiated clusters, Fig. S4B, D). Alignment of Evf2-regulated RNP binding and histone modifications (HMs) at scRNAseq gene targets supports transcriptional rather than post-transcriptional mechanisms.
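The target-calling cutoffs quoted above (>100 cells, >1.5-fold change, qval < 10−5) can be illustrated with a minimal Python sketch. The `DiffResult` record, the function names, and the gene labels are hypothetical and are not part of the published pipeline. The sketch also assumes that the fold change is expressed on a linear scale as Evf2TS/TS ÷ Evf2+/+ and that the 1.5-fold cutoff applies symmetrically to increases and decreases.

```python
from dataclasses import dataclass


@dataclass
class DiffResult:
    """One gene in one cluster (hypothetical record layout)."""
    gene: str
    n_cells: int        # cells expressing the gene in the cluster
    fold_change: float  # Evf2TS/TS / Evf2+/+, linear scale, > 0 (assumption)
    qval: float         # multiple-testing adjusted p value


def is_target(r, min_cells=100, min_fc=1.5, max_q=1e-5):
    """Apply the three cutoffs; fold change is tested in either direction."""
    magnitude = max(r.fold_change, 1.0 / r.fold_change)
    return r.n_cells > min_cells and magnitude > min_fc and r.qval < max_q


def split_targets(results):
    """Return (activated, repressed) gene lists.

    A gene that decreases in Evf2TS/TS (fold_change < 1) is treated as
    activated by Evf2; a gene that increases is treated as repressed.
    """
    activated = [r.gene for r in results if is_target(r) and r.fold_change < 1]
    repressed = [r.gene for r in results if is_target(r) and r.fold_change > 1]
    return activated, repressed


# Illustrative input only; these are not values from the dataset.
example = [
    DiffResult("geneA", 250, 0.4, 1e-8),  # passes, decreased in mutant
    DiffResult("geneB", 250, 2.0, 1e-8),  # passes, increased in mutant
    DiffResult("geneC", 80, 3.0, 1e-8),   # too few cells
    DiffResult("geneD", 500, 1.2, 1e-9),  # fold change too small
    DiffResult("geneE", 500, 2.0, 1e-3),  # q value too large
]
activated, repressed = split_targets(example)
```

Because all three cutoffs are keyword arguments, the same function can be rerun with looser or stricter thresholds to check how sensitive target counts are to each cutoff.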
The number 1 Evf2-regulated GO molecular function (chromatin binding) is shared among Lhx6/Isl1 clusters, while the GO Biological process, chromosome organization (Fig. 1I–K, Fig. S4E (Venn diagram), Fig. S4F–H) is present in SVZ-derived clusters, supporting transcriptional regulation of chromosome organizers during Evf2-regulated chromosome organization4,5. Overall, transcripts encoding chromosome organizers increase in Evf2TS/TS SVZ, and decrease in differentiated subpopulations, except for Brd9, which increases independent of differentiation (Fig. 1J). Evf2 regulates H3K27me3/H3K27ac and Evf2-RNP binding at genes involved in chromosome organization, supporting transcriptional rather than post-transcriptional effects (Fig. 1K, Fig. S4F–H). Thus, cluster-specific Evf2 transcriptional regulation of key chromosome organizers supports the idea that heterogeneity of chromosomal organization in E13.5GEs is established, in part, through transcriptional regulation during neuronal differentiation.
The SVZ is unique in that the number 1 functional pathway is neuron projection development (GO Biological Process) (Fig. 1E) and the top 5 human phenotypes are seizure-related (Fig. 1L). Among 24 Evf2-regulated genes involved in interictal epileptiform activity, 20 decrease in Evf2TS/TS SVZ; 7 involved in epilepsy also decrease. In differentiated Evf2TS/TS subpopulations, NeuroD2 decreases in both Lhx6/8 and Lhx6 clusters, Cntn2 decreases in the Lhx6 cluster, while Tubb2a increases in Isl1 and NeuroD6 clusters (marked on graph in Fig. 1L, detailed in Fig. S4O). Therefore, Evf2 maintains activation of Cntn2 and NeuroD2, but represses Tubb2a in differentiated subpopulations. Profiles of Evf2-regulated H3K27me3/H3K27ac and Evf2-RNP binding at 12/24 genes involved in interictal epileptiform activity are shown (Fig. 1M–R, Fig. S4I–N). Evf2-regulated DLX binding at Cacna1h, Gabrb2, Atp1a3, Cacna2d1, NeuroD2, and Reln is consistent with known roles of the DLX family of homeodomain TFs and downstream targets for their regulation of excitation/inhibition balance, epilepsy, and neurodevelopmental disorders49,50. Together, scRNAseq and Cut&Run analysis support that a large group of Evf2 target genes are altered in embryonic Evf2TS/TS SVZ, predicting adult phenotypes of Evf2TS/TS mice that we previously reported (GABAergic synaptic defects in slice cultures of hippocampus and prefrontal cortex by electrophysiology, and increased seizure susceptibility after treatment with the GABA antagonist pentylenetetrazole5,45).
Analysis of genes from all three functional groups (cell cycle, chromosome organization, and interictal epileptiform activity), indicates that Evf2 loss downregulates gene expression associated with decreased H3K27me3, a repressive mark, (Ccna2, Cdca2, Top2a, Scn3a, Syt1, Gabrb2, Cacna2d1, Fig. 1, and Fig. S4). However, at most sites, Evf2 recruits one or both chromatin remodelers (SMARCC2 and SMARCA4) and alters binding of one or more additional RNPs. Some striking examples include Cdca2-5’, where SMARCC2 (at a peak height of 700) is lost in Evf2TS/TS, accompanied by loss of SMARCA4 and DLX (Fig. 1H), and Atp1a3-5’, where loss of SMARCC2 (at a peak height of 1520) is accompanied by SMARCA4 loss and DLX increase in intron 1 (Fig. 1Q). In a few cases, Evf2 loss causes positional shifts in RNP binding (SOX2, Fig. S4M, SMARCA4, Fig. S4G). These data suggest that Evf2-RNP regulation is highly associated with transcriptional regulation. However, given that Evf2-regulation of H3K27me3 does not follow the expected transcriptional changes, Evf2-mediated subtle regulation of H3K27me3 may result from cells where gene regulation does not occur, cells in a state of transition, or be related to chromosomal looping changes that have been linked to subtle H3K27me3 regulation51, among many possibilities.
Transcriptionally regulated gene targets align with multimodal Evf2-Dlx5/6UCE-gene guidance and Evf2-RNP recruitment
We next determined how Evf2 transcriptionally regulated targets align with Evf2-regulated Dlx5/6UCE interactions on mouse chr6, combining differential scRNAseq results in E13.5GE subpopulations and Dlx5/6UCE-4Cseq results that identified 129 Evf2-regulated Dlx5/6UCE sites4,5. Here, we define EGG as Evf2-Dlx5/6UCE enhancer-gene guidance to a gene that Evf2 transcriptionally regulates; each “EGG” therefore refers to an Evf2 transcriptionally regulated gene together with its closest Evf2-regulated Dlx5/6UCE sites. In addition to “EGG”, we provide a list of abbreviations and definitions of terms in Supplementary Note 1 for increased clarity. We first focused on EGGs in which Dlx5/6UCE is located within 5 kb of gene bodies (GBs, defined as the region of the gene from 5’ promoter to the 3’UTR), based on the premise that the Evf2-regulated Dlx5/6UCE sites that are closest to GBs are expected to have the highest probability of transcriptional effects. For EGG-GBs, the Dlx5/6UCE loops to a site within 5 kb of the GB in either Evf2TS/TS or Evf2+/+, as previously determined by Dlx5/6UCE-4Cseq5. EGG-GBs align with 15 transcriptionally regulated targets on chr6 across 4 clusters (colored boxes, Fig. 2A); 7/15 are also regulated by Sox2 (bi-colored boxes, previously identified by Dlx5/6UCE-4Cseq in Sox2fl/fl;Dlx5/6cre E13.5GEs4). Gray boxes (Fig. 2A) identify 3 transcriptionally regulated targets associated with Evf2-independent Dlx5/6UCE interactions. These 15 Evf2 transcriptionally regulated genes associated with Dlx5/6UCE locations within 5 kb of the GB are referred to as Dlx5/6UCE-GB15. Analysis of Dlx5/6UCE-GB15 targets reveals differences in transcriptional effects based on relative positions on chr6 (fixed distances from Dlx5/6UCE in one dimension (1D)). 
Specifically, Evf2 activates short-range genes located within 10 Mb of Dlx5/6UCE (Ccdc132-Foxp2) by 6-fold, and represses long/super-long-range targets located >10 Mb from Dlx5/6UCE (Gm15594-Grin2b) by 1.8-fold, dividing chr6 into Evf2-regulated activated and repressed domains (Fig. 2A). Importantly, although 4Cseq is performed on bulk E13.5GE chromatin, EGGs associated with transcriptionally regulated subpopulation-specific scRNAseq gene targets are identified. In order to align bulk 4Cseq5 and scRNAseq targets, we determined which Evf2 regulated Dlx5/6UCE looping interactions are located within ±5 kb of Evf2 differentially regulated transcriptional targets on chr6 for each subpopulation (Fig. 2A). Also listed are the total number of cells that express each gene (applying a cut off ≥100 cells) and the number of cells in each cell type (Fig. 2A). Previous work verified Evf2 RNA/Dlx5/6UCE/SMC3 association with the Cttnbp2 region, with associated adjacent gene regulation, but not effects on Cttnbp25. However, scRNAseq identifies Evf2 regulation of Gm15594[Cttnbp2-AS] in Isl1 and Dlx5 subpopulations, supporting the validity of aligning 4Cseq bulk results with scRNAseq cluster-specific gene targets.
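The two positional rules used in this section, the ±5 kb gene-body window for Dlx5/6UCE contacts and the 10 Mb boundary between short-range and long/super-long-range genes, can be written as a short Python sketch. The function names are hypothetical, the coordinates in the example are illustrative placeholders rather than annotated positions, and assigning a gene at exactly 10 Mb to the long-range class is an assumption.

```python
SHORT_RANGE_MAX_BP = 10_000_000  # 10 Mb boundary between range classes
GB_WINDOW_BP = 5_000             # +/- 5 kb window around the gene body


def range_class(gene_pos_bp, uce_pos_bp):
    """Classify a gene by its 1D distance from Dlx5/6UCE on chr6."""
    distance = abs(gene_pos_bp - uce_pos_bp)
    if distance < SHORT_RANGE_MAX_BP:
        return "short-range"
    return "long/super-long-range"


def contact_near_gene_body(contact_bp, gb_start_bp, gb_end_bp,
                           window_bp=GB_WINDOW_BP):
    """True if a Dlx5/6UCE contact falls within the gene body +/- window."""
    return gb_start_bp - window_bp <= contact_bp <= gb_end_bp + window_bp


# Placeholder coordinates (assumptions, not genome annotations).
uce_bp = 6_800_000
near_gene = range_class(3_500_000, uce_bp)    # about 3.3 Mb away
far_gene = range_class(134_000_000, uce_bp)   # about 127 Mb away
hit = contact_near_gene_body(1_004_000, 900_000, 1_000_000)
miss = contact_near_gene_body(1_006_000, 900_000, 1_000_000)
```

Keeping the two tests separate mirrors the text: the range class depends only on fixed 1D positions, whereas the gene-body test depends on where the 4Cseq contact lands in each genotype.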
Differential scRNAseq analysis of Evf2-regulated genes located on chr6 (Evf2TS/TS÷ Evf2+/+) identifies cell type specificity, represented according to color (VZ, SVZ, Isl1, and Lhx6). A 15 Evf2 target gene locations on chr6 from 3.5 Mb (Ccdc132) −134Mb (Etv6), vertically represented with strand information (+) or (-), colored boxes on the left corresponding to cell types, Dlx5/6UCE locations within ±5 kb of the gene body (GB) indicated, with Evf2 guidance of Dlx5/6UCE to 13/15 GBs. 3D-Dlx5/6UCE positional shifts relative to GBs in the presence (Evf2+/+, pink boxes) and absence (Evf2TS/TS, blue boxes) of Evf2, co-regulated by SOX2 (bi-colored boxes), or gray boxes (independent of Evf2), genes associated with enhancers, E, or an ultraconserved sequence, U (Foxp2). Chr6 qA1/A2 (dashed line) division of chr6 into short-range and long-range genes, decrease (dotted arrow) and increase (gray) in Evf2TS/TS; average fold change (FC). B Previous proteomics sequencing of Evf2 RNP87 (yellow circle) identified diverse RNA binding proteins (dashed lines) and protein-protein interactions (dotted lines), also identifying SMARCC2 and NONO as Evf2 RNPs32, schematic modified from ref. 4. C Violin Plots compare Evf2 regulated RNP binding at Evf2 (+) and (-) regulated Dlx5/6UCE sites in E13.5GEs (Evf2TS/TS vs. Evf2+/+) (n = 2 libraries sequenced of 4 independent biological replicates of biologically pooled 6-8 E13.5GEs for each genotype, and each antibody, IgG subtracted). Cut&Run ≤120 bp fragments (directly bound) and ≥150 bp fragments (indirectly bound) profiles of SMARCC2 and NONO peaks at Evf2 positively (+) and negatively (-) regulated Dlx5/6UCEins (4Cseq, previously reported in ref. 5) across chr6. Two-tailed Student’s t test, significant pvals in red: (-) SMARCC2 120 bp, p = 4.1E-16, (-) SMARCC2 150 bp, p = 2.3E-08, (+)SMARCC2 120 bp, p = 5.7E-06, ( + ) SMARCC2 150 bp, p = 5.0E-07, ( + ) NONO 120 bp, p = 0.0009), blue (Evf2TS/TS) yellow (Evf2+/+). 
(+) SMARCC2 150 bp [Evf2TS/TS]: min = 0, max=2.13, median =1.92, Q1 = 1.87, Q3 = 1.97, lower boundary= 1.72, upper boundary= 2.13, (+) SMARCC2 150 bp [Evf2+/+]: min = 1.68, max=5.22, median =2.20, Q1 = 1.98, Q3 = 2.57, lower boundary= 1.09, upper boundary= 3.47). (-) SMARCC2 150 bp [Evf2TS/TS]: min=1.36, max=2.13, median = 1.92, Q1 = 1.87, Q3 = 1.97, lower boundary= 1.73, upper boundary= 2.11, (-) SMARCC2 150 bp [Evf2+/+]: min = 1.69, max = 4.35, median = 2.08, Q1 = 1.94, Q3 = 2.52, lower boundary= 1.06, upper boundary= 3.40. (+) SMARCC2 120 bp [Evf2TS/TS]: min=0, max=4.02, median =0, Q1 = 0, Q3 = 3.57, lower boundary = -5.36, upper boundary=8.93, (+) SMARCC2 120 bp [Evf2+/+]: min=0, max=8.37, median =3.69, Q1 = 0, Q3 = 4.06, lower boundary = −6.08, upper boundary = 10.14. (-) SMARCC2 120 bp [Evf2TS/TS]: min=0, max=4.00, median=0, Q1 = 0, Q3 = 0, lower boundary=0, upper boundary=0, (-) SMARCC2 120 bp [Evf2+/+]: min=0, max=6.01, median =3.56, Q1 = 2.37, Q3 = 3.98, lower boundary = -0.06, upper boundary= 6.41. (+) NONO 150 bp [Evf2TS/TS]: min=0, max=2.22, median =1.88, Q1 = 1.81, Q3 = 1.94, lower boundary= 1.60, upper boundary= 2.15, (+) NONO 150 bp [Evf2+/+]: min=1.74, max=2.10, median=1.94, Q1 = 1.89, Q3 = 1.98, lower boundary=1.74, upper boundary= 2.13. (-) NONO 150 bp [Evf2TS/TS]: min=1.67, max = 2.43, median =1.90, Q1 = 1.85, Q3 = 1.99, lower boundary=1.65, upper boundary = 2.19, (-) NONO 150 bp [Evf2+/+]: min=1.74, max=2.29, median=1.93, Q1 = 1.86, Q3 = 1.99, lower boundary=1.67, upper boundary=2.18. (+) NONO 120 bp [Evf2TS/TS]: min=0, max=4.34, median =0, Q1 = 0, Q3 = 0, lower boundary=0, upper boundary=0, (+) NONO 120 bp [Evf2+/+]: min=0, max=4.03, median =0, Q1 = 0, Q3 = 3.89, lower boundary = −5.83, upper boundary = 9.72. (-) NONO 120 bp [Evf2TS/TS]: min=0, max=4.99, median =0, Q1 = 0, Q3 = 0, lower boundary= 0, upper boundary=0, (-) NONO 120 bp [Evf2+/+]: min=0, max=4.37, median =0, Q1 = 0, Q3 = 3.88, lower boundary = −5.82, upper boundary= 9.70.
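The lower and upper boundaries listed in the legend above appear consistent, up to rounding, with Tukey fences computed from the quartiles (Q1 − 1.5 × IQR and Q3 + 1.5 × IQR). This reading is an inference from the reported numbers, not a statement from the methods. A minimal Python sketch, with a hypothetical function name, is:

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences from the first and third quartiles.

    k = 1.5 is the conventional multiplier; it is an assumption that the
    legend's boundaries were computed this way.
    """
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles reported for (+) SMARCC2 120 bp [Evf2TS/TS]: Q1 = 0, Q3 = 3.57
lower, upper = tukey_fences(0.0, 3.57)

# Quartiles reported for (-) SMARCC2 120 bp [Evf2TS/TS]: Q1 = 0, Q3 = 0
flat_lower, flat_upper = tukey_fences(0.0, 0.0)
```

For the first example the fences are approximately −5.36 and 8.93, matching the reported boundaries; when Q1 and Q3 are both 0, both fences collapse to 0, as listed for several of the 120 bp distributions.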
In order to further understand the mechanism of EGG regulation, we investigated the relationship between EGGs and Evf2 RNP recruitment. The Evf2 RNA cloud is a scaffold for the assembly of a diverse DLX-bound, RNP complex initially identified by proteomic analysis of E13.5GEs (Evf2-RNP8732) (Fig. 2B, modified from ref. 4). Our previous work showed that Evf2 RNA directly binds chromatin remodelers SMARCA4, SMARCC2/1, and binds the SOX2 homeodomain TF through promiscuous RNA-protein interactions, while protein-protein interactions indirectly link Evf2 RNA to the DLX homeodomain TF4,5,32. We also showed that Evf2 recruitment of diverse Evf2-RNPs (SOX2, DLX, SMARCA4, SMC1A, SMC3) is enriched at sites where Dlx5/6UCE looping is regulated4,5, and that Evf2-SOX2 RNP mechanisms contribute to Dlx5/6UCE guidance and transcriptional repression on chr64. Here, we expand our previous analysis of Evf2-RNP recruitment by performing Cut&Run of SMARCC2 (a chromatin remodeler with promiscuous RNA binding properties similar to SMARCA432), and NONO (the Neat-1 arcRNA binding, paraspeckle organizer52). Violin plots compare SMARCC2 and NONO binding at Evf2 positively regulated Dlx5/6UCE looping sites (+) and negatively regulated Dlx5/6UCE looping sites (-), in Evf2TS/TS (blue) and Evf2+/+ (yellow) (Fig. 2C). The Y-axis indicates normalized read counts of SMARCC2 bound peaks (left panel) and NONO bound peaks (right panel). The Cut&Run method distinguishes between directly bound SMARCC2 and NONO fragments (top panel, <120 bp) and indirectly bound fragments (bottom panel, >150 bp), identifying differences in significance and profiles. SMARCC2 and NONO bound peaks within ±2 kb of Evf2-regulated Dlx5/6UCE looping sites across chr6 are included in this analysis; significant p value differences between genotypes are indicated in red (SMARCC2, Dlx5/6UCE (+) and (-) regulated looping sites in the presence of Evf2 (Evf2+/+, yellow) compared to absence (Evf2TS/TS, blue), Fig. 2C). 
While Evf2 also regulates NONO binding at Dlx5/6UCE (+) regulated looping sites, regulation is limited compared to SMARCC2, identified in directly bound sites (<120 bp fragments), thereby predicting differential roles for SMARCC2 and NONO.
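The fragment-size split and the ±2 kb peak window used in this analysis can be sketched in Python as follows. The function names are hypothetical. The sketch uses the ≤120 bp and ≥150 bp limits given in the Fig. 2C legend, and leaving fragments of 121–149 bp unassigned is an assumption, since the text does not say how they are handled.

```python
def fragment_class(length_bp):
    """Assign a Cut&Run fragment to a binding class by its length."""
    if length_bp <= 120:
        return "direct"      # short fragments: directly bound
    if length_bp >= 150:
        return "indirect"    # longer fragments: indirectly bound
    return "unassigned"      # 121-149 bp: handling not stated (assumption)


def peak_near_site(peak_bp, site_bp, window_bp=2_000):
    """True if a peak lies within +/- window_bp of a looping site."""
    return abs(peak_bp - site_bp) <= window_bp


# Illustrative fragment lengths and coordinates only.
classes = [fragment_class(n) for n in (85, 120, 135, 150, 210)]
kept = peak_near_site(5_001_500, 5_000_000)
dropped = peak_near_site(5_002_500, 5_000_000)
```

Separating the two steps makes it easy to tabulate direct and indirect signal independently for the (+) and (-) regulated looping sites.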
We next created EGG plots in order to visualize Dlx5/6UCE-GB15 targets on chr6 (1D fixed positions, X-axis) in relation to Evf2-regulated Dlx5/6UCE positional shifts with respect to gene targets (3D shifted positions, Y-axis), and alignment with Evf2-regulated RNP binding (SMARCC2, SMARCA4, SOX2, NONO, DLX (anti-DLX1/2/5/6)) (Fig. 3). Yellow and blue solid circles indicate Dlx5/6UCE positions relative to GBs in 3D (Evf2+/+ (yellow solid circle) and Evf2TS/TS (blue solid circle)), while gray (increase) and red (decrease) rings indicate transcriptional regulation in Evf2TS/TS (Fig. 3A). EGG plots identify one or more Evf2-regulated RNPs within ±50 kb of GBs (Fig. 3A [chr6 0-150 Mb], Fig. 3B and Fig. S5 [chr6 3–9 Mb region: Ccdc132-Nxph1]), consistent with diverse Evf2 RNP roles in chromosome organization, chromatin remodeling and transcriptional regulation4,5,32, and Evf2-regulated HMs, consistent with transcriptional regulation (Fig. S6).
Differential scRNAseq analysis of Evf2 regulated genes located on chr6 (Evf2TS/TS÷ Evf2+/+) identifies cell type specificity, represented according to color (VZ, SVZ, Isl1, Lhx6). A Evf2-EGG-GB plots align Dlx5/6UCE shifts at each chr6 transcriptionally regulated gene target and distances of associated Evf2 regulated Evf2-RNP binding sites. Colored loops highlight Evf2 differentially regulated scRNAseq gene targets on chr6 (Evf2TS/TS÷ Evf2+/+) in E13.5GE subpopulations (VZ, SVZ, Isl1, Lhx6) coincident with Evf2-Dlx5/6UCE guidance: Top panel (Evf2 positively regulated RNPs). Bottom panel (Evf2 negatively regulated RNPs), chr6: 0-150 Mb. B Panel focus is zoomed on chr6 proximal region (X-axis, chr6: 3.2-9.1 Mb). Evf2 positively and negatively regulated RNP binding are shown in one panel, with red rings around negatively regulated RNPs. Legend in (A) is also used in (B), individual RNPs. In both (A) and (B): Dlx5/6UCE distances are plotted on the Y-axis (X 1 Mb up (upstream) and down (downstream) distances from gene (0)). Evf2(+) regulated Dlx5/6UCE position (yellow dots), Evf2(-) regulated Dlx5/6UCE position (blue dots). Transcriptional effects of Evf2 loss (Evf2TS/TS): shifts associated with decreased gene expression (red rings), shifts associated with increased gene expression (gray rings). Right Y axis (gene expression): log2 (fold change in gene expression), black diamonds (−3 to +3), red diamonds (<−3). Y-axis Evf2-regulated RNP binding sites (X10kb distances from gene (0)) (SMARCC2, SMARCA4, DLX, SOX2, NONO, SMC1): solid ovals (Evf2(+) regulated binding), red rings around solid ovals (Evf2(-) regulated binding). Blue line indicates location of Dlx5/6UCE/Evf1/2 on chr6. Results are based on previously published datasets (4,5: Evf2 regulated Dlx5/6UCE-4Cseq, Cut&Run [Evf2 regulated RNP binding, SMARCA4, DLX, SOX2], ChIPseq [Evf2 regulated RNP binding, Smc1/Smc3], this work: scRNAseq, Cut&Run [Evf2 regulated RNP binding, NONO, SMARCC2]).
Since Evf2 is activated in cells exiting from the VZ to SVZ3,45, and expression levels are highest in SVZ progenitors, SVZ transcriptional targets constitute the most likely direct targets of Evf2 regulation. However, transcriptional effects of only 4 SVZ EGG-GB targets are detected, raising the possibility that long-distance Evf2-regulated Dlx5/6UCE positional shifts may also contribute to gene regulation in the SVZ. We next compared EGG plots of chr6 Evf2 regulated target genes in SVZ (29 genes, Fig. 4A) with Dlx5 populations (UMAP differentiated clusters [Isl1, Lhx6, Lhx6/8] derived from the SVZ, 23 genes, Fig. 4B), aligning genes on chr6 (1D fixed positions, X-axis) and their nearest Evf2 regulated Dlx5/6UCE positional shifts (3D shifted positions, Y-axis). In the SVZ, as transcriptionally regulated target gene distances with respect to the location of Evf2/Dlx5/6UCE increase in 1D on chr6 (X-axis), 3D distances between Dlx5/6UCE and transcriptionally regulated target genes also increase (Y-axis), averaging 380 kb for genes located between Tac1-Lsm5 and 5.1 Mb for genes located between Mad2l1-Fam60a, a 13.5-fold increase (Fig. 4A). Arrows indicate the direction of each Evf2-regulated Dlx5/6UCE positional shift (in the absence of Evf2) with respect to the closest transcriptionally regulated target gene. In the SVZ, Evf2 regulated long-distance Dlx5/6UCE positional shifts in relation to GBs range between 5 kb–15.4 Mb, characterized by various positional changes (upstream, downstream, and across-GB). Gene clusters were defined as groups of ≥2 transcriptionally regulated target genes located within 2.5 Mb of each other. Gene clusters are associated with Dlx5/6UCE positional shifts that are equidistant (two-color dashed arrows), or similar (to within ±50 kb, solid arrows), consistent with co-regulatory 3D changes that cause transcriptional changes.
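The gene cluster definition above (≥2 regulated genes within 2.5 Mb of each other) can be illustrated with a minimal sketch. It assumes that "within 2.5 Mb of each other" is applied to neighbouring genes along chr6, so that genes can chain into a cluster; the text could also be read as requiring every pair to lie within 2.5 Mb. The function name, gene names and positions are hypothetical and are not taken from the datasets.

```python
from typing import List, Tuple

Gene = Tuple[str, float]  # (gene name, position on the chromosome in Mb)


def find_gene_clusters(genes: List[Gene], max_gap_mb: float = 2.5) -> List[List[str]]:
    """Group genes whose neighbouring positions are <= max_gap_mb apart.

    Assumption: neighbouring genes chain into one cluster (single linkage).
    Only groups with at least two genes are reported.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    clusters: List[List[str]] = []
    current: List[Gene] = []
    for name, pos in ordered:
        # A gap larger than the threshold closes the running group.
        if current and pos - current[-1][1] > max_gap_mb:
            if len(current) >= 2:
                clusters.append([n for n, _ in current])
            current = []
        current.append((name, pos))
    if len(current) >= 2:
        clusters.append([n for n, _ in current])
    return clusters


# Hypothetical gene positions (Mb), for illustration only.
example_genes = [
    ("geneA", 10.0), ("geneB", 11.5), ("geneC", 13.2),
    ("geneD", 40.0), ("geneE", 90.0), ("geneF", 91.0),
]
clusters = find_gene_clusters(example_genes)
print(clusters)
```

In this toy input, geneD has no neighbour within 2.5 Mb and is therefore not assigned to a cluster.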
The largest group of equidistant genes consists of the furthest 4-gene cluster (Eno2-Cdca3-Rad51ap1-Prmt8), in which Evf2 loss shifts Dlx5/6UCE towards the gene targets, decreasing Eno2 and Prmt8, and increasing Cdca3 and Rad51ap1 expression. Therefore, even though Evf2 shifts Dlx5/6UCE away from a gene, transcriptional effects can be positive or negative, a characteristic also observed for short-distance Evf2-regulated Dlx5/6UCE positional shifts in EGG-GBs. These results support the idea that Evf2 regulated Dlx5/6UCE-3D shift distances and shift direction can be uncoupled from transcriptional regulation.
Evf2-EGG plots of (A) SVZ progenitors and (B) Dlx5 differentiated (Lhx6, Lhx6/8, Isl1) cell types at transcriptionally regulated target genes located on chr6 (X-axis, 0-150 Mb: gene locations), Dlx5/6UCE 3D-distances from gene bodies (0) on left Y-axis (X10Mb), gene expression (log2[FC]) on right Y-axis. Evf2-regulated Dlx5/6UCE positional shift directions towards or away from gene bodies (arrows), arrow colors reflecting shared shift distances. Boxed gene clusters (2-4) with equidistant Dlx5/6UCE shifts (dashed two-color arrows) or shifts within 50 kb (solid color arrows). Evf2(+)regulated Dlx5/6UCE position (yellow dots), Evf2(-)regulated Dlx5/6UCE position (blue dots), distance from the gene (0). H conservation with human developing forebrain chr7 Dlx5/6UCE interaction sites HiC62. Red rings (shifts associated with decreased gene expression Evf2TS/TS), gray rings (increased gene expression). Right Y-axis: log2 (fold change in gene expression), black diamonds (−3 to +3), and two off-scale targets in Dlx5+ are marked (Gm9825, −3.8, Aldh1l1, −25.8). Stars highlight shared regulation between SVZ and Dlx5+ cells (gray, increased, red, decreased). C–E SVZ multimodal Evf2-EGGs (located ±10 Mb from gene target) coincident with Evf2-regulated binding of SOX2, SMARCC2, SMARCA4, NONO (±50 kb) and changes in transcriptional effects. Evf2 negatively regulated RNP (superimposed red rings). C Similar distance (pink arrows) Evf2-Dlx5/6UCE guidance across Mad2l1 and Gng12, with SOX2/SMARCC2/SMARCA4 RNP recruitment effects at GBs, and asymmetric SMARCC2/NONO downstream effects. D 15.4 Mb (gray dashed arrow) Evf2-Dlx5/6UCE guidance across Plxnd1, with RNP effects: GB (SMARCC2), upstream and downstream (SMARCA4), asymmetric downstream (SMARCC2/NONO). E Upstream Evf2-Dlx5/6UCE guidance, with RNP effects: GB (SOX2, SMARCC2/SMARCA4, DLX, NONO), upstream (SOX2) and downstream (NONO). 
F–I Upset plots of 7 Evf2-regulated RNPs (SOX2, SMARCC2, SMARCA4, NONO, DLX, SMC1, SMC3) and 4 HMs (H3K4me1/3, H3K27me3/ac) associated with Evf2-Dlx5/6UCE guidance. F ±50 kb to Evf2(+)regulated Dlx5/6UCE. G ±5 kb to Evf2(+)regulated Dlx5/6UCE. H ±50 kb to Evf2(-)regulated Dlx5/6UCE. I ±5 kb to Evf2(-)regulated Dlx5/6UCE. Results based on Evf2-regulated Dlx5/6UCE sites, HMs, and RNP binding (SOX2, Dlx, SMARCA4, Smc1, Smc3) in previous reports4,5, aligned with scRNAseq and Evf2 regulated NONO and SMARCC2 binding (this work).
EGG plot comparisons of SVZ progenitors and Dlx5 differentiated subpopulations identify both changes in and maintenance of transcriptional targets and transcriptional effects, as indicated by gray stars (+, increased in Evf2TS/TS) or red stars (-, decreased in Evf2TS/TS), which mark 9 shared target genes (Fig. 4A, B). While transcriptional regulation of 5/9 shared target genes (Gm9825, NeuroD6, Snrpg, Aldh1l1, Gm4604) is maintained, the majority of targets change during differentiation. For example, analysis of the furthest 4-gene cluster shows that Eno2 and Prmt8 are regulated in the SVZ, while Ccnd2 and Foxm1 are regulated in Dlx5 populations. In addition, transcriptional reversal of Cdca3 and Rad51ap1 occurs during differentiation (SVZ to Dlx5 populations) (Fig. 4A, B). Alignment of Evf2-regulated RNP binding sites near SVZ and Dlx5 population transcriptional targets identifies RNP binding regulation within ±50 kb of 28/29 SVZ targets, and within ±5 kb of 26 SVZ targets (Fig. S7A–C, SVZ, Fig. S7D–F, Dlx5 population). Examples of Evf2 RNP recruitment associated with different types of Evf2-regulated Dlx5/6UCE positional shifts include one-sided downstream RNP recruitment/long distance-across gene shift (Fig. 4C), two-sided up- and downstream RNP recruitment/long distance-across gene shift (Fig. 4D) and two-sided up- and downstream RNP recruitment/upstream shift (Fig. 4E), supporting multi-modal mechanisms.
Gene clusters associated with equidistant or similar distance Evf2-regulated Dlx5/6UCE positional shifts and RNP recruitment support functional links between long-distance 3D co-regulation, RNP binding, and transcriptional regulation. Evf2 regulates RNPs to positively and negatively regulated Dlx5/6UCE looping sites (+ and - [SMARCC2, SOX2, SMARCA4], - [DLX], + [NONO]; Fig. 2C5,4), raising the possibility that RNP signatures may also distinguish EGG regulated transcriptional targets. While upset plots of RNPs and HMs identify only a few shared and unique signatures, SOX2 and SMARCC2 are the top recruited RNPs associated with Evf2-Dlx5/6UCE positive regulation (Fig. 4F, G). NONO recruitment and H3K4me3 are associated with Evf2-Dlx5/6UCE negative regulation (Fig. 4H, I). Therefore, differential Evf2 RNP recruitment at Evf2-regulated Dlx5/6UCE looping sites supports differential roles for individual RNPs and HMs in Evf2-Dlx5/6UCE positive and negative looping regulation, consistent with previous results showing that RNPs singly (as shown for the Evf2-SOX2 RNP4), or in combination, contribute to both 3D organization and transcription.
Evf2 directly binds long-range EGGs on chr6 and genome-wide sites, associated with multiple RNP recruitment
We next asked whether Evf2 direct binding contributes to EGG selectivity. We used ChIRPseq, an unbiased method to identify RNA direct binding sites in the genome53. Evf2 and Evf1, an alternatively spliced form, contain unique 5’ and common 3’ ends (Figs. 1A and S2)3, and are functionally redundant in a subset of regulatory events, including RNA cloud formation, RNP recruitment, and transcriptional regulation4,5. In order to identify Evf1/2-RNA binding sites in E13.5GEs, without the confounding contributions of 5’ or 3’ end RNA binding sites that are present in Evf1TS/TS or Evf2TS/TS single mutants5, we used double mutant mice lacking both Evf2 and Evf1 (Evf1/2TS/TS). Evf1/2TS/TS contains transcription stop insertions into Evf-exon1 and exon 3 (Fig. 1A, Fig. S2), providing an important negative control in ChIRP experiments. Duplicate mixtures of 12-16 E13.5GEs (6-8 embryonic brains) for each genotype are probed with alternating anti-sense oligonucleotides spanning Evf1/2, then the results are grouped into odd and even samples. Only sites recognized by both odd and even probes, not bound by Gapdh anti-sense probes (negative control), and absent from Evf1/2TS/TS (negative control) are considered Evf1/2 RNA binding sites (hereafter referred to as RBSs). ChIRP results in main figures show the significance of peaks from computationally combined results from duplicates, odd/even probe overlap, and Gapdh subtraction. The peaks obtained from the combined set of odd probes and the peaks obtained from the combined set of even probes in Evf1/2+/+, and the corresponding peaks in Evf1/2TS/TS for chr6 and key non-chr6 RBSs, are shown as individual tracks in Fig. S8 and Fig. S9. ChIRPseq comparing Evf1/2+/+ and Evf1/2TS/TS E13.5GEs identifies 148 RBSs across the genome (purple circles, Fig. 5A). On autosomes, the highest number of RBSs is located on chr11 (22), and lowest on chr18 (3). 
On the X chromosome, there are 3 RBSs, and one on the Y chromosome. Since Evf1/2 is transcribed on chr6, it was expected that the majority of RBSs would be detected on chr6. However, only 6 RBSs are detected on chr6 at 41-129 Mb distances from Dlx5/6UCE and Evf1/2 transcription (Fig. 5B), near long-range transcriptionally regulated targets (RBSs 1-3, clustered at Pdia4 [Fig. 5C], RBS 4 at Creb5 [Fig. 5D], RBS 5 at Suclg2 [Fig. 5E], and RBS 6 near Etv6 [in Grin2b intron, Fig. 5F]). Analysis of Evf2 regulated RNP binding and Dlx5/6UCE guidance associated with chr6 RBSs supports a role for direct RNA binding in long-distance regulatory events (Fig. 5C–F). This finding is also consistent with our previous report that Evf2 preferentially regulates long distance Dlx5/6UCE guidance across chr6, at distances >1 Mb from the enhancer5. As with Evf2 EGGs, SOX2 and SMARCC2 are the top RNPs recruited to 148 RBSs, while DLX and NONO are inhibited from binding to a subset of sites (Upset plots, Fig. 5G). HOMER de novo motif analysis identifies two motifs, including G-rich stretches that can potentially form G-quadruplexes (as shown for 4/6 chr6 RBSs [Fig. 5B, H, I]). Furthermore, sequence alignments between RBSs on chr6 identify 174 bases of 100% identity between chr6 RBS 1 and 3 (Fig. 5J), raising the possibility of homologous DNA base-pairing interactions, as has been reported in Drosophila54.
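The RBS criteria described above (recognized by both odd and even probes, not bound by Gapdh probes, absent from Evf1/2TS/TS) amount to an interval filter. The following is a minimal sketch of that logic only, not the peak-calling pipeline used in this work; the function names, the use of simple overlap as the matching rule, the choice to report odd-probe coordinates, and all peak coordinates are assumptions made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open coordinates


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals are on the same chromosome and share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def call_rbs(odd: List[Interval], even: List[Interval],
             gapdh: List[Interval], mutant: List[Interval]) -> List[Interval]:
    """Keep odd-probe peaks that overlap an even-probe peak and no control peak.

    Assumption: a 1 bp overlap counts as the same site, and the odd-probe
    coordinates are reported for each retained site.
    """
    controls = gapdh + mutant
    kept = []
    for peak in odd:
        in_even = any(overlaps(peak, e) for e in even)
        in_control = any(overlaps(peak, c) for c in controls)
        if in_even and not in_control:
            kept.append(peak)
    return kept


# Hypothetical peaks, for illustration only.
odd_peaks = [("chr6", 100, 200), ("chr6", 500, 600), ("chr11", 50, 150), ("chr13", 10, 60)]
even_peaks = [("chr6", 150, 250), ("chr6", 520, 580), ("chr11", 60, 120)]
gapdh_peaks = [("chr6", 510, 530)]           # negative control probes
mutant_peaks = [("chr11", 100, 200)]         # peaks still present in the double mutant
rbs = call_rbs(odd_peaks, even_peaks, gapdh_peaks, mutant_peaks)
print(rbs)
```

Only the first toy peak passes all three criteria; the others fail because of a Gapdh overlap, a mutant overlap, or missing even-probe support.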
A E13.5GE ChIRPseq identification of 148 Evf2 RBSs (purple ovals), (low to high gene density: blue to red range), highest (22 RBSs on chr11), lowest (1, Y-chr). Blue star (Evf2/Dlx5/6UCE location, chr6 proximal end). B RBSs (1-6) on chr6 map to long/super-long-range targets. C–F RBS-associated chr6 EGGs, transcriptional effects, RNP binding effects, and H3K4me1 marks. C chr6 RBS 1-3 at (1), multi-RNP binding regulation sites (1) and at (2, Pdia4-5’), guiding Dlx5/6UCE from site 2 (red bar, Zfp786) to Evf2(+) reg Dlx5/6UCE (green bar) in the Cntnap2-Ezh2 intronic region (green arrow), coincident with Sox2-Dlx5/6UCE negative regulation (red S, from ref. 4); in Evf2TS/TS Cntnap2, Ezh2, Pdia4, and Krba1 transcripts increase. D chr6 RBS 4 at site (1), marked by H3K4me1, multi-RNP binding regulation, site (1), and sites (2, Scrn1 intron and site 3, Creb5-5’), Dlx5/6UCE guidance to sites 2 and 3 (green arrow, green bar), coincident with Sox2-Dlx5/6UCE positive regulation (green S); in Evf2TS/TS NeuroD6 and Creb5 transcripts decrease. E chr6 RBS 5 at site (1), marked by H3K4me1, Evf2(+) regulated H3K27ac; multi-RNP binding regulation, site (1), and sites (2: Kbtbd8-5’, 3: Evf2 (-) reg SMC3, 4: Suclg2 intron7); Evf2(-) regulated Dlx5/6UCE between sites 3&4 (red line); in Evf2TS/TS Suclg2 transcripts decrease. F chr6 RBS 6 (upstream of Etv6) at site (1), marked by H3K4me1, multi-RNP binding regulation, site (1), and sites (2: Grin2b intron6 [Evf2(-) reg. Dlx5/6UCE], 3: Grin2b intron5/6 [Evf2(+) reg. Dlx5/6UCE], 4: Grin2b intron5, 5: Etv6-5’); Evf2-Dlx5/6UCE guidance (site 2 to site 3, green arrow); Evf2 prevents Dlx5/6UCE looping to the Etv6 intron (red dashed bar), in Evf2TS/TS Etv6 transcripts increase. G Upset plots of 5 Evf2-RNPs (SOX2, SMARCC2, SMARCA4, NONO, DLX), 4 HMs (H3K4me1/3, H3K27me3/ac) within ±5 kb of Evf1/2 RBSs. H Homer de novo analysis identifies G-rich sequences and GTTAGGGTTAGG motif enrichment at RBSs. I chr6 RBSs 1, 3, 5, 6 (G’s, with potential to form G-quadruplex). J chr6 RBSs 1-3 (174 bases of 100% DNA:DNA identity).
Investigating the relationship between RBS DNA identity, DNA motifs, and chromosomal looping interactions
We next asked whether RBS DNA identity is limited to chr6 or detected genome-wide. A search for 100% identity of greater than 25 bp length identifies identical sequences in 70/148 RBSs, with 78 inter-chromosomal and 8 intra-chromosomal alignments (Fig. 6A, Fig. S10A). The majority of RBS sites match with one other site (21); chr13-RBS-8 has the greatest number of matches (identity with 7 other sites, pink star). chr6-RBS-2 shares identity with 2 RBS sites, and chr6-RBS-5 and chr4-RBS-4 share identity with 4 sites. Distances between intra-chr identical matches range from 4 kb to 50 Mb (one pair on chr1, chr6, chr10, chr12, chrX, and three pairs on chr11, detailed in Fig. S10 A–G). As observed at intra-chr6 RBSs (RBS-1 & 3, Fig. 5C, D), Evf2 also regulates RNP binding at non-chr6, intra-chr RBS identical matches (chrX-RBS-2&3, chr11 RBS12-13, Fig. S10B, D). While the significance of RBS identity remains to be determined, it is interesting to note that the one chrY RBS at the distal end is 100% DNA matched with chrX-RBS 2 (Fig. S10B), in the pseudoautosomal region involved in sex chromosome pairing during meiosis55,56. RBSs coincide with multi-RNP regulation (>3 RNPs) and/or RNP hotspots (>100 peak heights), as seen at the inter-chr4 RBS-4 site (shared identity with 4 RBSs) (Fig. 6A and S10H), and the chr1 RBS-3 RNP hotspot (Fig. S10I). While only 3 chrX RBSs are identified, multiple RNP hotspots and transcriptionally regulated targets are identified at a distance from RBSs, suggesting long-distance effects that spread from initial binding sites (Fig. S10B). In a gene-dense region of chr11, 3 intra-chr DNA matches are detected, with RBS 17-19 (5 kb distance) nested within RBS 16-21 (11.6 Mb distance) (Fig. S10D–F).
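The identity criterion above (100% identity over more than 25 bp) is equivalent to asking whether two RBS sequences share an exact common substring of at least 26 bases. A minimal sketch of that test follows; it is not the alignment tool used for Fig. 6A, it considers only the given strand (reverse complements are ignored, an assumption), and the function names and toy sequences are hypothetical.

```python
def longest_identical_run(seq_a: str, seq_b: str) -> int:
    """Length of the longest stretch of bases found identically in both sequences.

    Classic longest-common-substring dynamic programming, O(len_a * len_b).
    """
    a, b = seq_a.upper(), seq_b.upper()
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous base of both sequences.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best:
                    best = curr[j]
        prev = curr
    return best


def share_identity(seq_a: str, seq_b: str, min_len: int = 26) -> bool:
    """True if the sequences share an exact match longer than 25 bp."""
    return longest_identical_run(seq_a, seq_b) >= min_len


# Toy sequences, for illustration only: a shared 30 bp core with different flanks.
core = "GTTAGG" * 5
toy_a = "AAAA" + core + "CCCC"
toy_b = "TTTT" + core + "TTTT"
print(longest_identical_run(toy_a, toy_b), share_identity(toy_a, toy_b))
```

For RBS-sized sequences the quadratic cost is small; a genome-scale search would normally use an indexed aligner instead.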
A RBSs with 100% matched DNA sequences (identity) >25 bp. Left: Graph showing the distribution of inter-chr alignments. Right: Chr6 related chromosomal identity: inter-chr (pink-black dashed) and intra-chr (pink loop), identity network based on DNA:DNA matches at RBSs (purple ovals), Evf1/2/Dlx5/6UCE (blue star), chr13-RBS-8 (pink star, 6/7 inter-chr identity shown). B chr13-RBS-8 and chr13-RBS-7 region: Intra-HiC significant loops are detected within ±20 kb of chr13-RBS-7 (pink loops, counts and qvals indicated), but not chr13-RBS-8 (pink star in A). PWMEnrich DNA motif analysis of chr13-RBS-8 (pval<1E-03). C PWMEnrich DNA motif identifies the top 3 ranked TF motifs (pval<1E-03) detected in RBSs with ≥3 inter-chr matches. D HiC analysis of genome-wide inter- and intra-chromosomal significant looping interactions within ±20 kb of RBSs in E13.5GEs. 77 RBSs are associated with intra-chr HiC loops (pink half ovals), while 3 RBSs form 11 inter-chr HiC loops (pink-black lines): Left: graph of the number of inter-chr HiC counts at RBS sites. Right: RBS looping within the chr6 network. HiC was performed on crosslinked chromatin from a mixture of 6-8 wildtype embryos (12-16 E13.5GEs), in duplicate, with count averages indicated, and qval < 0.02. E–J Multi-RNP and RNP Hotspot (100 minimum peak heights) recruitment sites (RNP Cut&Run) associated with HiC-inter-chr loops at RBSs (Evf1/2 RNA binding, ChIRP) and their non-RBS sites of interaction. E chr4-RBS-1, F chr15-RBS-3 (RNP hotspot), G chr5-RBS-7, H–J HiC inter-chr sites that interact with RBSs (yellow, identifies inter-chr HiC loops), H chr6 proximal region (RNP hotspot), multi-RBS and non-RBS inter-chr looping interactions, I chr2 distal region (181 Mb, two RNP hotspots across a 5 kb region) interacts with chr15-RBS-3 and chr16. J chr16-57 Mb interacts with chr15-RBS-3, chr6-3.15 Mb proximal region, chr9-123.4 Mb (distal region). 
Results are based on this work (ChIRPseq, Cut&Run (Evf2 regulated NONO/SMARCC2), scRNAseq, HiC in wildtypes). Results of Evf2-regulated HMs and Evf2 RNP binding regulation (SOX2, DLX, SMARCA4, SMC1, and SMC3) are based on our previous reports4,5.
The number of inter-chr identical DNA matches for a given RBS varies; 21 RBSs match with only one other RBS (paired), whereas the maximum number of identical DNA matches is detected with chr13 RBS-8 (7 RBSs, pink star, Fig. 6A). The network of inter-chromosomal identity involving chr6 (RBS-2 and 5) raised the possibility of multi-hub DNA looping interactions involving chr13 RBS-8, the RBS with the highest number of matches (Fig. 6A). However, HiC analysis57,58 performed on E13.5GEs fails to identify inter- or intra-chr interactions involving RBS DNA:DNA identity (loops located within ±20 kb of RBSs, qval < 0.01, duplicates of 12–16 pooled E13.5GEs). Analysis of the chr13-RBS-8 and chr13-RBS-7 regions shows that intra-HiC significant loops are detected within ±20 kb of chr13-RBS-7 (pink loops, counts, and qvals indicated), but not chr13-RBS-8 (pink star in A). Evf2 transcriptional targets nearest to intra-HiC loops are Sncb (Evf2TS/TS-SVZ, -2.4-fold change) and Mxd3 (Evf2TS/TS-SVZ, +1.6-fold change). Evf2-regulated SMC1 (red bars) and SMC3 (green bar) sites are indicated. Evf1/2 RNA binding at RBS-7 and RBS-8 is shown as light blue peaks. The focus of the chr13-RBS-8 site identifies regulation of H3K4me3 (green) and RNP binding (DLX, green and SMARCA4, blue). A PWMEnrich DNA motif search for the top 3 ranked motifs (pval<1E-03) in RBSs with ≥2 matches identifies Zfp281 and Zfp740 motifs in chr13-RBS-8 (Fig. 6B), motifs common to 4 matched RBSs (Fig. 6C). In addition to DNA motif enrichment at RBSs with ≥2 matches, shared motifs among all matched RBSs include 31 Zfp740 motifs (11/31 with two Zfp740 motifs), and 21 Zfp281, 7 Atf1, and 14 Tcf3 single motifs (Figs. 6C and S11), supporting future investigation of these TFs in mechanisms involving RBSs and chromosomal interactions.
Although looping interactions between pairs of identical RBSs are not detected by HiC analysis, looping interactions involving RBSs are detected (80/148 RBSs). In one example, intra-chr interactions between chr13-RBS-7 and the region between Evf2-regulated SVZ genes (Sncb and Mxd3) indicate that looping reduces the distance between regulated targets and the RBS that is ~0.38 Mb away (Fig. 6B). Of the 80 RBSs involved in looping interactions, 3 RBSs are involved in 11 inter-chr interactions (lines, Fig. 6D) and 77 RBSs are involved in intra-chr interactions (a subset shown as pink loops, Fig. 6D). Importantly, the highest number of RBS-containing inter-chr interactions is detected at the proximal end of chr6 (6 with chr4 RBS-1 and 1 with chr15 RBS-3, Fig. 6D, boxed region in the graph, chr6-3.05–3.33 Mb, pink star (3.15 Mb)). Chr4-RBS-1 and chr15-RBS-3 interact with the chr6-3.15 Mb proximal site, while chr15-RBS-3 connects proximal chr6 with distal chr2 and chr16. RBS looping outside of the chr6 network shows chr5-RBS-7 interaction with chr8. The location of Evf1/2/Dlx5/6UCE is marked by the blue star (Fig. 6D). Evf2-multi-RNP recruitment and recruitment at RNP hotspots (peak heights greater than 100) are detected near or overlapping with RBSs involved in inter-chr interactions and their non-RBS sites of interaction as follows: chr15-RBS-3 (Fig. 6F) interacts with 3 sites (chr6-3.15 Mb (6H), chr2-181.65 Mb (6I) and chr16-57.4 Mb (6J)), chr4 RBS-1 (6E) interacts with chr6-3.15 Mb (6H), and chr5-RBS-7 (6G) interacts with chr8. Together, these data support a potential role for shared TF recognition amongst RBSs containing matched sequences, a cluster of inter-chr interactions involving RBSs at the chr6 proximal end, and genome-wide roles for RBS involvement in chromosome organization.
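The assignment of RBSs to intra- or inter-chr loops rests on a proximity rule: a loop is attributed to an RBS when one of its anchors lies within ±20 kb of that RBS. A minimal sketch of this bookkeeping is given below, assuming that each RBS and each loop anchor is reduced to a single coordinate; the function name, the data layout and all coordinates are hypothetical and do not reproduce the HiC analysis itself.

```python
from typing import Dict, List, Set, Tuple

Site = Tuple[str, int]    # (chromosome, position in bp)
Loop = Tuple[Site, Site]  # the two anchors of a significant HiC loop


def classify_rbs_loops(rbs: Dict[str, Site], loops: List[Loop],
                       window: int = 20_000) -> Dict[str, Set[str]]:
    """For each RBS, report whether nearby loops are intra- and/or inter-chromosomal."""
    result: Dict[str, Set[str]] = {name: set() for name in rbs}
    for name, (chrom, pos) in rbs.items():
        for anchor_a, anchor_b in loops:
            for anchor in (anchor_a, anchor_b):
                # An anchor counts if it is on the RBS chromosome and within the window.
                if anchor[0] == chrom and abs(anchor[1] - pos) <= window:
                    kind = "intra" if anchor_a[0] == anchor_b[0] else "inter"
                    result[name].add(kind)
    return result


# Hypothetical RBSs and loops, for illustration only.
toy_rbs = {
    "rbsA": ("chr4", 1_000_000),
    "rbsB": ("chr13", 5_000_000),
    "rbsC": ("chr6", 90_000_000),
}
toy_loops = [
    (("chr4", 1_010_000), ("chr6", 3_150_000)),     # inter-chr loop near rbsA
    (("chr13", 4_990_000), ("chr13", 5_400_000)),   # intra-chr loop near rbsB
    (("chr6", 50_000_000), ("chr6", 51_000_000)),   # loop far from every RBS
]
loop_classes = classify_rbs_loops(toy_rbs, toy_loops)
print(loop_classes)
```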
Evf1/2 RNA binding partitions chromosomes into RNP regulated domains
Evf2-chr6 division into gene activation and repression domains (Figs. 2, 3), Evf2-regulated RNP binding at key sites (RBSs, Dlx5/6UCE/HiC interactions, transcriptionally regulated genes), and SOX2 protein/pool condensate size regulation4 led us to investigate whether Evf2-regulated RNPs are enriched according to RBS-defined chromosomal partitions. Based on RBS spacing (>10⁵ bp), RBSs define chromosomal domains (RCDs). Spacing on chr6 between RBS-1-2-3 is <10⁵ bp; therefore, RBS 1-3 are combined, generating 5 chr6 RCDs as follows: RCD1 (0-RBS-1, 47.7 Mb), RCD2 (RBS-1–4, 6.96 Mb), RCD3 (RBS-4-5, 40.1 Mb), RCD4 (RBS-5–6, 41.0 Mb), RCD5 (RBS-6-end, 13.8 Mb). Evf2-regulated RNP binding across chr6 RCDs identifies altered distributions of (+, positively) regulated chromatin remodelers SMARCC2 and SMARCA4 and (-, negatively) regulated DLX and NONO in RCD2 (Fig. 7A). Many chromosomes contain RNP-regulated RCD gradients consisting of ≥4 consecutive RCDs that increase or decrease the number of binding sites in a step-wise manner (triangles, Fig. 7A, B and S12A-L). However, surprisingly, the DLX RCD gradient on chr6 and SOX2 RCD gradient on chr7 are unique in that these span the entire chromosome (7A, 7B, and red stars in S12C (SOX2), S12E (DLX)). Although rare, two RCD gradients can be detected on a single chromosome (NONO (chr10, chr15, Fig. S12K, L) and SMARCC2 (chr11, S12I)). On chrX, where only 3 RBSs are detected, the frequencies of Evf2 (+) and (-) regulated SMARCC2, NONO, and SOX2 binding increase on RCD 3 (Fig. 7C). On chr11, where the highest number of RBSs are detected, frequencies of Evf2 (+) regulated RNPs (SMARCC2, NONO, SMARCA4, DLX) increase on RCD 1, and frequencies of Evf2 (+) regulated chromatin remodelers SMARCC2 and SMARCA4 increase on RCD 8 (Fig. 7D). Thus, altered distributions of Evf2-RNP recruitment frequencies among RCDs are not only chr-specific, but RNP-specific, and can follow both cross-chromosomal and sub-chromosomal gradients.
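The RCD definition above can be sketched as a simple partition: RBSs closer together than the spacing threshold are merged into one boundary, and the chromosome is cut at each boundary. The sketch below takes the threshold as 10⁵ bp and uses the first RBS of a merged group as the boundary (consistent with RCD1 ending at RBS-1), both stated here as assumptions. The coordinates are reconstructed from the reported chr6 RCD sizes, except for the RBS-2 and RBS-3 positions, which are hypothetical placeholders within 10⁵ bp of RBS-1.

```python
from typing import List, Tuple


def define_rcds(rbs_positions: List[int], chrom_length: int,
                min_spacing: int = 100_000) -> List[Tuple[int, int]]:
    """Partition a chromosome into RBS-defined domains (start, end) in bp.

    RBSs separated by less than min_spacing are merged into one boundary group;
    the first position of each group is used as the domain boundary.
    """
    groups: List[List[int]] = []
    for pos in sorted(rbs_positions):
        if groups and pos - groups[-1][-1] < min_spacing:
            groups[-1].append(pos)   # too close to the previous RBS: merge
        else:
            groups.append([pos])     # far enough away: start a new boundary
    edges = [0] + [group[0] for group in groups] + [chrom_length]
    return [(edges[i], edges[i + 1]) for i in range(len(edges) - 1)]


# Positions (bp) reconstructed from the reported RCD sizes; the second and third
# entries stand in for RBS-2 and RBS-3 and are hypothetical.
chr6_rbs = [47_700_000, 47_730_000, 47_760_000, 54_660_000, 94_760_000, 135_760_000]
chr6_length = 149_560_000  # sum of the five reported RCD sizes, not an assembly value
rcds = define_rcds(chr6_rbs, chr6_length)
rcd_sizes = [end - start for start, end in rcds]
print(rcd_sizes)
```

With six input RBSs, merging the first three leaves four boundaries and therefore five domains.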
A Evf2 regulated RNP binding across chr6 RCDs, based on Cut&Run results of single RNP binding to <120 bp fragments (SOX2, DLX, SMARCA44, NONO, SMARCC2), the numbers of Evf2 positively regulated sites (+, gray) and negatively regulated (-, red) sites across chr6 and in each RCD, fold changes for each RNP and RCD relative to the expected number set to 1 (dashed line, E). Pvals and χ² statistic values are indicated for each RNP. gray triangle (decreased gradient of positive-regulated Dlx sites in the proximal-distal direction). B Evf2-regulated RNP binding across RCDs in a gradient (defined as ≥4 consecutive RCDs) is shown. C. Evf2 regulated RNP binding across chrX-RBS-defined domains: Evf2(+, red) and (-, gray) Pval and χ² values are indicated for each RNP; χ² values are not calculated for RNPs that are not regulated (zero sites, SMARCA4, DLX). D Evf2 (+) regulated RNP binding across chr11 RBS-defined domains, the chromosome with the highest number of RCDs (17). Chi-square analysis, Evf2 (+) regulated (p = 0, all RNPs): SOX2, χ²=1947, SMARCA4, χ²= 14387, NONO, χ²= 2712, SMARCC2, χ²= 4055; χ² values are not calculated for RNPs that are not regulated at one of the RCDs (zero sites, Evf2 (+) reg. DLX, Evf2 (-) reg. SMARCA4, Evf2 (-) reg. SMARCC2). E JASPAR analysis of Sox2, Dlx2, and Dlx5 consensus DNA binding homeobox motif enrichment at Evf2 regulated sites, 120 fragment Cut&Run peaks in Evf2TS/TS ÷ Evf2+/+. F Chr6: Top-ranked PWMEnrich DNA motif analysis identified TF DNA motif enrichment at Evf2-regulated NONO or SMARCC2 sites (pval<1E-04, FDR < 0.05, ≥2 domains). Only motifs identified in 2 or more domains are marked. RCDs 1 and 2 are combined (gray region= domains 1 + 2), expected number of motifs for each domain based on proportional domain size. G, H Chr-RCD mapping of frequencies of HiC intra-chr (light gray) and inter-chr (dark gray) interaction sites in E13.5 wildtype GEs. E = 1 (expected number, dashed line). 
HiC, 6-8 wildtype embryos (12-16 E13.5GEs), n = 2 pools, FDR < 0.01, and qval<0.02 determined between biological replicates.
Given the associations of Evf2 regulated RNPs with RCDs (Fig. 7A–D and S12), Evf2 RBSs (Fig. 5C–F), and direct effects on homeodomain RNPs SOX2 and DLX family transcriptional activities3,4,5, we next asked whether Evf2 regulates homeodomain motif recognition in vivo. Cut&Run can distinguish between TF direct (<120 bp fragment) and indirect (>150 bp fragment) binding to DNA, thereby identifying effects on motif recognition in vivo59. In previous work, we showed that SOX2 consensus motifs are reduced only at DLX and SMARCA4 Evf2 co-recruited sites, but not DLX-only or SMARCA4-only Evf2 recruited sites, supporting that Evf2-RNP co-recruitment affects Sox2-motif recognition, in vivo4. Here, analysis of Sox2, Dlx2, and Dlx5 motifs in peaks from singly bound RNPs confirms that Evf2 increases SOX2 recruitment, without affecting Sox2 motif recognition (Fig. 7E). However, Evf2 decreases the number of DLX-bound sites by 2.63-fold, and also decreases the % of Dlx2/5 motifs. In peaks bound by the chromatin remodeler SMARCA4, an RNA/DNA binding protein with no known DNA sequence-specific binding consensus, Evf2 decreases both Dlx2/5 and Sox2 motif recognition. Therefore, not only does Evf2 affect Sox2 motif recognition at RNP co-recruited sites, but Evf2 also inhibits DLX recruitment by decreasing consensus Dlx2/5 DNA motif recognition at singly-bound sites, in vivo. Upset plots indicate that of 148 RBSs, DLX is the top negatively regulated RNP (12/148), and SOX2 is the second highest positively regulated RNP (28/148), supporting the idea that Evf2-homeodomain motif recognition may contribute to RNP binding at a subset of sites.
In contrast to homeobox motif effects at SOX2, DLX, and SMARCA4-regulated sites, Evf2-regulated SMARCC2 and NONO BSs contain very low percentages of homeobox sequences (Fig. 7E), raising the possibility that Evf2 recruitment may involve unknown TF motifs. We next used PWMEnrich DNA motif analysis to identify the top-ranked TF DNA motifs at Evf2-regulated NONO or SMARCC2 sites on chr6 (pval < 1E-04, FDR < 0.05), and to determine whether TF motif frequencies at Evf2-regulated NONO or SMARCC2 sites are altered according to RCDs. Because significant TF motifs are not identified in chr6-RCD 2 Evf2 (+/-) regulated SMARCC2 sites, and Evf2 (-) regulated NONO sites, RCD 1 and RCD 2 are combined in this analysis (gray region includes RCD 1&2, Fig. 7F). The distribution of TF motif frequencies in Evf2 regulated NONO and SMARCC2 sites among RCDs is significantly different from the expected distribution (NONO: χ² = 8.99, p = 0.03; SMARCC2: χ² = 38.7, p = 1.98E-08), with RCD-specific increases (NONO BSs in RCDs 1&2 and 4, SMARCC2 BSs in RCDs 3 and 5), and decreases (NONO BSs in RCDs 3 and 5, SMARCC2 BSs in RCDs 1&2) (Fig. 7F). TF motifs that are in common between NONO and SMARCC2 regulated BSs and shared between RCDs are colored, with Hic1 (RCD 3) and Hoxb9 (RCD 4) motifs shared in the same RCD for both NONO and SMARCC2 regulated sites. The number 1 ranked motifs associated with each chr6 RCD, including unique motifs, are listed (Fig. S13); the significance of shared and unique TFs at NONO and SMARCC2 regulated sites across chr6 RCDs remains to be determined.
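The comparison of observed and expected motif distributions across RCDs can be illustrated as a chi-square goodness-of-fit test in which the expected count in each domain is proportional to domain size. The sketch below assumes four domain groups (RCD 1&2 combined, RCD 3, RCD 4, RCD 5), hence 3 degrees of freedom; that choice, the function names and the observed counts are assumptions for illustration and are not the values underlying Fig. 7F. The p value uses the closed-form survival function of the chi-square distribution for 3 degrees of freedom.

```python
import math
from typing import List, Tuple


def chi_square_vs_domain_size(observed: List[float],
                              domain_sizes: List[float]) -> Tuple[float, List[float]]:
    """Chi-square statistic with expected counts proportional to domain size."""
    total = sum(observed)
    genome = sum(domain_sizes)
    expected = [total * size / genome for size in domain_sizes]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Domain sizes in Mb from the chr6 RCD definitions (RCD 1 and 2 combined).
sizes_mb = [47.7 + 6.96, 40.1, 41.0, 13.8]
# Hypothetical motif counts per domain group, for illustration only.
toy_observed = [30, 12, 25, 3]
toy_stat, toy_expected = chi_square_vs_domain_size(toy_observed, sizes_mb)
print(round(toy_stat, 2), chi2_sf_df3(toy_stat))

# Under the 3 degrees of freedom assumption, the reported statistics give p values
# close to those stated in the text.
print(chi2_sf_df3(8.99), chi2_sf_df3(38.7))
```

A statistics library would normally be used for arbitrary degrees of freedom; the closed form is included only to keep the sketch dependency-free.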
Inter-chr6 HiC interactions are enriched at the proximal end of chr6 (Fig. 6D, H), coinciding with an RNP hotspot, where Evf2 regulated RNP recruitment results in SMARCC2 peak heights of up to 1215. In addition, RNP recruitment hotspots identified at additional HiC interactions and RBSs led to further investigation of the relationship between HiC interaction frequencies and RCDs. RCD mapping of the frequencies of HiC intra-chr (light gray) and inter-chr (dark gray) interaction sites in E13.5 wildtype GEs identifies unexpected profiles of inter- and intra-chr interaction-RCD alignments. RCD-HiC mapping profiles are shown for chr6, and the following chromosomes of interest: the inter-chr6 interaction network from Fig. 6D (chr 15, 16, 4, 2), chrX, chr11 (with the highest number of RCDs), and chr13 (the location of the RBS with the greatest number of DNA identity matches) (Fig. 7G, H). Proximal-end inter-chr enrichment is detected at chr6, chr16, chr13, chr11, chr4 (Fig. 7G, H), and chr18 (Fig. S14–D). As with Evf2-RNP recruitments, RCD gradients (groups of ≥4 consecutive RCDs) of increasing or decreasing intra-chromosomal interaction frequencies are observed (Fig. S14B).
Discussion
Chromosome capture led to the identification of 129 Evf2 e-lncRNA regulated Dlx5/6UCE looping sites spanning chr6, but only 4 non-adjacent transcriptional targets across ~40 Mb5, leading us to question whether Evf2-Dlx5/6UCE guidance is permissive, priming future transcriptional changes, or instructive, causing transcriptional changes that are either masked by E13.5GE tissue heterogeneity or below limits of detection. ScRNAseq analysis of E13.5GE subpopulations defines relationships between Evf2 differentially regulated genes and multiple modes of Evf2-Dlx5/6UCE-gene guidance, supporting instructive roles. GO analysis of Evf2-regulated genes identifies functional groups that are both shared and unique among E13.5GE subpopulations. Evf2 transcriptional regulation of chromosome organizers adds complexity to previously identified mechanisms (RNP recruitment and direct effects on ATPase activity of the chromatin remodeler, SMARCA432). The finding that Evf2 regulation of genes in the embryonic progenitor zone (SVZ cluster) predicts synaptic and seizure defects (Fig. 1E, L–R), detected later in adult mice lacking Evf25,45, has important implications for the etiology of neurodevelopmental disorders. Deletion of Dlx5/6UCE in embryonic brains reduces Evf2, Dlx5, Dlx6, and Gad1/2 transcripts, while a Dlx5/6UCE-autism mutation reduces Dlx5 and Dlx6 more subtly60. Therefore, deletion of Dlx5/6UCE resulting in loss of Evf2 has distinct adjacent gene regulatory effects (decreased Dlx5 and Dlx6) compared to loss of Evf2 transcripts that retains Dlx5/6UCE sequences (increased Dlx5 and Dlx6, Evf2TS/TS mice45). These models, together with previously reported Evf2 rescue transgenic models5,42 provide important tools to distinguish between the DNA regulatory role of the Dlx5/6UCE and its Evf2 RNA transcript during normal embryonic brain development and disease.
Focus on Evf2-directed Dlx5/6UCE guidance to target genes within 5 kb of gene bodies, the most likely direct instructive events, reveals unexpected enhancer-gene 1D/3D relationships and subpopulation-specific effects. Surprisingly, Evf2 divides chr6 into short-range activated and long/super-long-range repressed targets (Fig. 8A), regulating RNP binding at nearly every target. SOX2, an Evf2 RNP functionally linked to Evf2 enhancer guidance4, co-regulates Dlx5/6UCE positioning near gene bodies at 7/15 transcriptionally regulated target genes. SOX2 and SMARCC2 are the top two co-recruited RNPs, with SOX2 recruitment effects (within ±50 kb of the GB) observed at all long-range targets, and 4/6 short-range targets, consistent with our previous report of SOX2 repression and multi-level contributions of Evf2-SOX2 interactions4. Furthermore, heterogeneous Evf2 regulation of SOX2 RNP PPC formation and target gene localization4 raises the possibility of a role for SOX2 in subpopulation-specific transcriptional regulation.
A–C Models of multi-level Evf2 regulation on mouse chr6 resulting from RBS stabilized divisions, EGGs and RNP recruitment. A Relationships between RBSs (purple ovals), DNA sequence identity, G-quadruplex (GQ), RNP recruitment, EGG-GBs corresponding to scRNAseq transcriptional targets (colored loops, top), short-range activation and long-range repression domains, aligned with all EGGs (Evf2 regulated +/- Dlx5/6UCE sites). B 1. Evf2 (blue arrow) expressed from the Dlx5/6UCE enhancer E, recruits RNPs/PPCs (yellow oval), to E, 2. Evf1/2 binds at chr6-RBS 1−3, facilitating DNA looping (purple oval/hairpin), recruiting RNPs, 3. Multi-modal EGG−3D changes (black arrows), 4–5. RNP recruitment and gene transcription (repression (red), activation (green)). C RBSs spaced at >10^5 bp apart define 5 chromosomal domains on mo-chr6. The black arrow points to the cluster of RBS 1-3. In wildtype (WT) E13.5GE nuclei, the majority of inter-chr loops on chr6 (29/35) are located in RCD 1, proximal to Evf2/Dlx5/6UCE (blue square), overlapping with Evf2-positively regulated DLX ((+), green) enrichment. In wildtype E13.5GE nuclei, RCD 2 is enriched for intra-chromosomal looping, and both positive and negative regulation of RNP binding. RCDs define boundaries for TF motif enrichment at Evf2-regulated SMARCC2 and NONO sites. D i. Inter-chr interactions with chr6 proximal end (P) detected by HiC. RNP recruitment at hotspots near interaction sites, indicated by arrows to yellow ovals, and transcriptional effects on genes located on chr6 and on interacting chromosomes are indicated by the black boxes. ii–iii. Models of intra-chr super-looping through simultaneously paired inter-chr looping. ii. The blue ovals indicate pairs of chr6-chr4 HiC interaction sites. The distance between HiC sites on chr4 is 38.4 Mb, but on chr6 is 108 Mb. iii. The blue ovals indicate pairs of chr6-chr8 HiC interaction sites.
E Potential RCD-defined 3-D stabilized looping (1–4) of enhancer-gene groups, with effects of the enhancer on looped domains indicated by arrows. i. sequential, ii. Nested loops, iii. Clover, iv. In the absence of Evf2, relaxed looping increases enhancer-gene targets within domains.
While Evf2-directed Dlx5/6UCE looping to gene bodies is the most likely 3D event to cause transcriptional changes, effects may occur in cells where 3D regulation occurs or subsequently in differentiated populations. Evf2-Dlx5/6UCE guidance to gene bodies that causes transcriptional effects is identified in proliferative zones where Evf2 is first activated, as well as in differentiated populations continuing to express Evf2 at high levels (Lhx6 cluster), supporting immediate transcriptional effects of 3D regulation. However, transcriptional effects in populations with decreased expression levels of Evf2 (Isl1 cluster) and/or low percentages of Evf2+ cells (NeuroG2) (Figs. 1, 2 and S1, S8A) raise the possibility of long-lasting or downstream effects in cells after exiting the SVZ. Given that Evf2 is activated to the highest levels in the SVZ, the highest number of direct effects would be expected in this population. However, the distribution of Evf2-directed Dlx5/6UCE looping to gene bodies resulting in transcriptional changes among subpopulations does not show a preference for SVZ-regulated chr6 targets (4/15). In addition, Evf2-directed Dlx5/6UCE looping to genes at distances greater than 5 kb (3D distance) is observed in clusters of 2–4 SVZ-regulated chr6 target genes that are distributed across chr6. Therefore, gene clusters located at >120 Mb (1D distance) from Evf2/Dlx5/6UCE may be co-regulated by Dlx5/6UCE 3D shifts (ranging between 95 kb and 3 Mb for the Eno-Cdca3-Rad51ap1-Prmt8 4-gene cluster).
While examples of enhancer RNA-enhancer gene guidance to genes within 1 Mb distances are known, few reports span the entire chromosome, raising the question of the underlying basis for Evf2 long-distance effects. Our data support a role for direct RNA binding at key long/super-long-range sites, a subset characterized by G-quadruplex and DNA identity, and overlapping with multi-RNP recruitment. Linear models of chr6 that incorporate past and present work delineate gene activation and repression divisions of Evf2-Dlx5/6UCE guidance at gene bodies (Fig. 8A), long distance Dlx5/6UCE intra-chromosomal guidance (Fig. 8B), and alignment with chr6-RCDs enriched for key features such as Evf2-RNP binding regulation/DLX recruitment gradient, TF motif enrichment at Evf2-regulated NONO and SMARCC2 BSs, and inter-chr and intra-chr interactions, (Fig. 8C). Chr6 proximal-end enrichment of HiC inter-chr interactions and RCD 2 enrichment of intra-chr interactions raise the possibility of roles for direct RNA binding and RNP recruitment in higher-order chromosomal organization (Fig. 8C, D). A 3D model of chr6 inter-chr interactions (Fig. 8D i, pink dashed lines, green dotted lines) supports the possibility of hub formation involving the proximal ends of chr6, chr4 (RBS-1, purple oval), additional chr4 proximal sites, proximal end of chr13, the distal end of chr2, chr15 RBS-3 (purple oval), and two sites on chr16, with arrows representing Evf2-mediated RNP recruitment effects (Fig. 8Di). Despite the lack of chr6-RCD 1 RBSs, Evf2 transcripts from its transcriptional start site (blue box) and interactions with chr4 RBS-1 and chr15 RBS-3 may facilitate Evf2 regulation of the chr6 proximal end through inter-chr interactions.
In HiC analysis, the identification of 2 or more paired inter-chr interactions, where the distance between the sites on one chromosome is greater than the other, supports the possibility that simultaneously paired inter-chr interactions mediate 3D organizational effects (green dotted lines, Fig. 8Di). Simultaneously paired inter-chr interactions would support a model where chr6 RCD 1 and RCD 4 can be brought into close proximity through interactions with 2 sites on chr4 (Fig. 8Di, ii), or 2 sites on chr8 (Fig. 8Di, iii). In Fig. 8Dii and iii, HiC inter-chr interaction sites are indicated as blue ovals and pink dashed line (Fig. 8Dii), and chr6 super-loop formation is shown as black dotted lines. Similarly, the chr6 proximal end interaction with two sites on chr16 may help stabilize ~40 Mb loop on chr16 (Fig. 8Di, pink dashed line intersecting on two chr16 sites). These studies raise several questions that will be important to answer in the future, including whether a multi-chr hub forms at the chr6 proximal end, and whether simultaneously paired inter-chr interactions contribute to intra-chr super-looping, facilitating contact between proximal and distal ends of chromosomes. Equally important will be to determine whether Evf2 contributes to inter-chr or intra-chr interactions (in addition to Dlx5/6UCE guidance), and whether formation of RCDs is related to megadomains identified on the X chromosome30,61. However, given that loss of Evf2 results in increased mis-directed long-range Dlx5/6UCE interactions, rather than loss5, our model is consistent with the idea that differential RNP recruitment within RCDs contributes to Dlx5/6UCE precision, distribution of TFs, and transcriptional regulation, rather than establishing overall 3D organization (Fig. 8E, i). Through stabilization of RCDs, Evf2 may facilitate enhancer hub configurations that fine-tune transcriptional regulation of a subset of genes within a region (Fig. 8E, ii nested, or iii clover). 
The identification of chr6 inter-chr interactions with two sites on chr4 and chr8 supports the possibility of nested formations, where the distance between RCD 1 and RCD 4 can be shortened in 3D (Fig. 8Eii, 8Di–iii, green dotted lines). In the absence of Evf2, looping configurations within a region may become more relaxed, decreasing enhancer distances to a subset of regional targets, consistent with chromosome capture evidence of ectopic Dlx5/6UCE interactions (Fig. 8E, iv)5. Also consistent with the formation of RCDs are multimodal enhancer-gene target relationships that regulate transcription through changes in 3D relationships, rather than simply shifting enhancers closer to genes. The model supports the idea that RNP recruitment (yellow ovals) to key sites occurs during RNA-directed enhancer positional shifting on chr6 (intra-chr interactions, Fig. 8B), while transcriptional effects on target genes may be independent of the relative 3D distances between the enhancer and target genes.
RCD-dependent Evf2-RNP recruitment effects and enrichment of inter- and intra-chr interactions coincide on many chromosomes, and are not limited to chr6. Together with Evf2-mediated genome-wide effects on gene expression, our data raise the possibility that RCDs may constitute an organizational principle not readily detected by existing methods such as HiC mapping. Distinguishing between Evf2-regulated RNP hotspots that coincide with RBSs and RNP hotspots that do not coincide with RBSs may be key to identifying which factors determine Evf1/2 RNA binding. In addition, while the functional significance of genome-wide RCD based-Evf2 RNP recruitment gradients (consecutively arranged increasing or decreasing RCDs) is unknown, chr6 has the only chr-spanning DLX recruitment gradient, decreasing from the proximal to distal end. The chr6-DLX recruitment gradient coincides with gene activation (RCD 1, highest Evf2-regulated DLX recruitment) and repression divisions, that together with known direct Dlx binding to and positive regulation of Dlx5/6UCE activity3,5,14,45, and Evf2-regulated DLX homeobox recognition in vivo (Fig. 7E), support that RCD-RNP recruitment enrichment contributes to transcriptional effects. The identification of a chr7-spanning RCD-SOX2 recruitment gradient and multiple sub-chromosomal RNP gradients spanning 4 or more RCDs warrants future investigation as to the functional significance of RCD-RNP recruitment. RNP-enriched RCDs that span multi-megabase chromosomal domains and Evf2 RNA cloud formation together with differences between directly and indirectly bound RNPs may explain why RNP recruitment effects are adjacent to or distant from RBSs. RCDs enriched for intra-chr and inter-chr interactions raise important questions as to whether RCDs influence looping interactions (or vice versa), and whether Evf2-RCD partitioning is unique or a shared property of a subset of the larger class of e-lncRNAs. 
While the data obtained to date and Evf2 models presented here are consistent with the finding that Evf2 contributes to regional refinement5, rather than establishment of large-scale chromosome-spanning structures, future experiments to determine the role of Evf2 in Dlx5/6UCE independent looping interactions will be important.
Conclusions
Results in this report: (1) Evf2-transcriptional regulation of embryonic SVZ seizure-regulating genes predicts adult effects. (2) Evf2-Dlx5/6UCE-gene guidance divides chr6 into short-range transcriptionally activated and long-range transcriptionally repressed genes. (3) Evf2-regulated Dlx5/6UCE-gene distances and directions can be decoupled from gene target transcriptional effects. (4) Evf2 controls homeobox motif recognition in vivo. (5) Evf2 RNA binding is coincident with DNA identity enriched by specific transcription factor motifs. (6) Evf2 RNA binding reveals an added layer of chromosomal organization characterized by domains of enrichment for RNP recruitment, transcription factor motifs, and potentially HiC looping interactions.
Limitations: Only 29 chr6 SVZ Evf2-regulated genes are identified, which is not only a small percentage (~5%) of the total SVZ-regulated genes, but also a small percentage of the total possible gene targets on chr6. This was surprising, indicating that Evf2 transcriptional regulation is not enriched on chr6 over other chromosomes, and supporting the idea that trans mechanisms (as observed by genome-wide RBSs, RNP recruitment, and potentially, RCD-partitions of RNP recruitment) play a significant role in Evf2 gene regulatory events. Given that the extent of Evf2 regulation of all 3D interactions (HiC) remains to be determined, the significance of RCD enrichment of looping interactions is not known. Furthermore, on chr6, we have not ruled out the possibility that cascade effects on additional enhancers, rather than Dlx5/6UCE positioning alone, contribute to transcriptional regulation of long-distance targets on chr6 and in trans. However, previous work showed that a subset of Dlx5/6UCE-gene interactions in mouse E13.5GE (4Cseq5) is conserved in human fetal brain (HiC data62). Furthermore, the human Dlx5/6 gene interacts with four Evf2-SVZ-regulated targets (Tac1, Plxna4, Gm17829, Gm4604) and three Dlx5-regulated targets (Tac1, Creb5, Gm4604) (purple H, Fig. 4A, B), supporting conservation, and adding to the significance of Evf2-Dlx5/6UCE guidance for neurological disorders in humans. Future experiments combining chromosome capture methods and genetic manipulations in mice and human iPSCs can distinguish between cascading and direct Dlx5/6UCE mechanisms, and will be important for determining how RBS identity involving TFs and RNA-regulated homeobox recognition may distinguish cis and trans regulatory mechanisms, as well as their significance to human neurological disorders.
Methods
For methods where differential analysis is performed, experiments are performed in parallel, using numbered tubes so that the genotypes and conditions are unknown (blinded to conditions until final analysis).
Summary of software used
Monocle (monocle_2.24.1)63,64,65, Seurat_5.0.1, biomaRt66,67, the CellCycleScoring function on the Seurat data object68, Bowtie2 v2.1.069, MACS270, the RIdeogram package71, the ‘ggpubr’ R package (Kassambara, ggpubr R Package: ggplot2-Based Publication Ready Plots), UpSetR (UpSetR: An R Package for the Visualization of Intersecting Sets and their Properties; https://github.com/hms-dbmi/UpSetR), G4hunter72, bowtie2 (v2.4.4)69, the HiC-Pro pipeline (v3.1.0)73, Hi-C 3.058, HiCLift74, Fithic 2.075, and Bedtools v2.30.076,77.
ScRNAseq
Cell Dissociation: E13.5GE dissections from embryonic brains collected from timed-pregnant females were collected into ice-cold Leibovitz’s L-15 medium and 1% fetal bovine serum. For the single-cell dissociation, papain dissociation kit, protocol II (Worthington Biochemical, LK003150) was followed. In the last step, the supernatant was removed and the pelleted cells were resuspended in 1 ml pre-chilled L15 medium + 1% fetal bovine serum. Cells were counted using a Luna cell counter, and only GEs with viability determined ≥90% were processed. The 10X Genomics microfluidics system and the 10X Genomics Chromium Single Cell 3’ Reagent Kit v3 were used for single cell separation and library preparation, following all protocol recommendations, including PCR cycles (12 cycles for cDNA amplification of 5000 cells and 14 cycles for index PCR). E13.5GE libraries from Evf2+/+ and Evf2TS/TS were sequenced on the Illumina NovaSeq 6000 (Sequencing Core Facility at the La Jolla Institute).
ScRNAseq analysis
A custom mm9 transcriptome, in which the Evf1 and Evf2 transcripts were annotated as separate genes, was built using the Cellranger (version 3.1.0) mkref function. Sequencing reads were aligned to the custom transcriptome using Cellranger (version 6.1.1). A gene by cell matrix was generated from Cellranger output files within Monocle (monocle_2.24.1)63,64,65. Dimension reduction was performed on the 2763 most highly variable genes, determined from the matrix normalized to transcripts per 10,000 transcripts. The uniform manifold approximation and projection (UMAP) method from the uwot package78,79 was applied to the top 12 principal components after regressing out the effect of read depth on gene expression variance (total number of RNAs and the number of genes expressed). Cells were clustered using standard approaches, with clusters assigned cell type annotations based on marker gene enrichment as performed previously80,81,82. Differential expression was determined using the differentialGeneTest function within monocle, using fullModelFormulaStr = “~genotype” and reducedModelFormulaStr = “~genotype + Total_mRNAs” to reduce the impact of read depth on differential expression testing. Differential expression was considered significant at q-values < 1e-5. Determination of cell cycle phase was conducted in Seurat (Seurat_5.0.1). The monocle cell dataset was converted into a Seurat object, using the same variable genes as determined for input into UMAP dimension reductions. Cell cycle score and phase were determined using the mouse orthologs, as determined by biomaRt66,67, of the human cell cycle genes, using the CellCycleScoring function on the Seurat data object68. Cell cycle score and phase were then transferred back to the monocle data object for subsequent analyses and plotting.
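The two numerical steps named above (scaling each cell to transcripts per 10,000 transcripts, and applying the q-value cutoff of 1e-5) can be illustrated with a minimal Python sketch. This is not the Monocle or Seurat code used in the study; the function names, the dictionary layout, and the example counts are hypothetical and serve only to make the arithmetic explicit.

```python
# Minimal sketch (hypothetical helper names, not the Monocle/Seurat implementation).

def normalize_per_10k(counts):
    """Scale raw counts so that each cell sums to 10,000 transcripts.

    counts: dict mapping cell barcode -> dict mapping gene -> raw count.
    """
    normalized = {}
    for cell, genes in counts.items():
        total = sum(genes.values())
        if total == 0:
            # A cell with no detected transcripts stays at zero.
            normalized[cell] = {gene: 0.0 for gene in genes}
            continue
        normalized[cell] = {gene: count * 10000 / total for gene, count in genes.items()}
    return normalized


def significant_genes(qvalues, cutoff=1e-5):
    """Return genes whose q-value is strictly below the cutoff (1e-5 in the text)."""
    return sorted(gene for gene, q in qvalues.items() if q < cutoff)


# Hypothetical example values, for illustration only.
example_counts = {"cell_1": {"Dlx5": 5, "Gad1": 15}}
example_q = {"Dlx5": 1e-6, "Gad1": 1e-5, "Tac1": 0.01}
print(normalize_per_10k(example_counts))
print(significant_genes(example_q))
```

Note that the cutoff is applied as a strict inequality, matching "q-values < 1e-5" in the text, so a gene with q exactly equal to 1e-5 is not retained.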
Cut&Run
Anti-SMARCC2 (Bethyl, Cat# A301-038A; RRID:AB_817992, https://www.fortislife.com/products/primary-antibodies/rabbit-anti-smarcc2-baf170-antibody/BETHYL-A301-038?selected=A301-038A) and anti-NONO (Bethyl, Cat#A300-587A; RRID:AB_495510, https://www.fortislife.com/products/primary-antibodies/rabbit-anti-nono-antibody/BETHYL-A300-587) were used for Cut&Run, based on methods described in refs. 59,83, and modified for E13.5GEs4. Modifications were as follows: Concanavalin A magnetic beads (10 μl per sample) were mixed with 1.5 ml of ice-cold Binding Buffer (20 mM HEPES-KOH pH 7.9, 10 mM KCl, 1 mM CaCl2, 1 mM MnCl2), placed on a magnetic stand, washed with 1.5 ml Binding Buffer, and resuspended in 10 μl Binding Buffer. E13.5 ganglionic eminences were isolated from embryos in L15 medium (6 embryos per genotype), pooled, triturated by pipetting, and filtered through a cell-strainer capped 5 ml polystyrene round-bottom tube (BD Falcon) to generate single-cell suspensions. Cells were counted using the Luna Automated Cell Counter (Logos Biosystems). 250,000 cells per sample were centrifuged at 600 g for 3 min at 4 °C, and the pellet was resuspended in Wash Buffer (20 mM HEPES pH 7.5, 150 mM NaCl, 0.5 mM spermidine, EDTA-free protease inhibitor cocktail), washed a second time, and resuspended in 1 ml of Wash Buffer. The Concanavalin A bead suspension was added to the cells while gently vortexing (~1000 rpm), and the tube was incubated with rotation for 10 min at 4 °C, split into aliquots, and placed on a magnetic stand. Pellets were resuspended in 50 μl of Antibody Buffer (20 mM HEPES pH 7.5, 150 mM NaCl, 0.5 mM spermidine, 2 mM EDTA, 0.02% digitonin, EDTA-free protease inhibitor cocktail) containing 0.5 μg of antibody, while gently vortexing, followed by rotation for 2 h at 4 °C.
The samples were centrifuged for 5 s at 1000 rpm, placed on a magnetic stand, pellet resuspended in 1 ml ice-cold Digitonin Buffer (20 mM HEPES pH 7.5, 150 mM NaCl, 0.5 mM spermidine, 0.02% digitonin, EDTA-free protease inhibitor cocktail), repeat washed in Digitonin Buffer, pelleted on the magnetic stand, supernatant removed, and 50 μl of pA-Mnase solution (final concentration 700 ng/ml in Digitonin Buffer) added to each tube while gently vortexing, followed by rotation for 1 h at 4 °C. The samples were centrifuged for 5 s at 1000 rpm, pelleted on a magnetic stand, supernatant removed, and pellets washed in ice-cold Digitonin Buffer two times, supernatant removed, 150 μl of Digitonin Buffer added to each sample while gently vortexing, followed by incubation on ice for 5 min. 3 μl of 100 mM CaCl2 was added while gently vortexing. Samples were incubated at 0 °C for 30 min, and 100 μl of Stop Buffer (340 mM NaCl, 20 mM EDTA, 4 mM EGTA, 0.02% digitonin, 0.05 mg/ml RNAseA, 0.05 mg/m, 2 pg/ml heterologous spike-in DNA) added, followed by incubation for 10 min at 37 °C, then centrifugation for 5 min at 4 °C at 16,000 rpm. The samples were placed on the magnetic stand and the supernatant was transferred to a clean 1.5 ml microcentrifuge tube, DNA extraction was performed utilizing standard phenol chloroform and ethanol precipitation methods as described84, ethanol precipitated overnight at −20 °C. DNA pellets were dissolved in 20 μl 0.1x TE Buffer (1 mM Tris-HCl pH 8, 0.1 mM EDTA). The Qubit High-Sensitivity Assay was used for DNA quantification. Cut&Run libraries were prepared using the KAPA Hyper Prep Kit protocol, where adapter ligation, 5 μl of 3 μM Adapter stock from the KAPA Dual-Indexed Adapter Kit was used, and libraries amplified using the following cycling conditions: 98 °C—45 s, 98 °C—15 s, 60 °C—10 s, 14 cycles, 72 °C for 1 min. After library amplification, the libraries were purified using 50 μl of KAPA Pure Beads and eluted in 20 μl of water. 
Cut&Run sample quality was analyzed by TapeStation prior to sequencing on a NovaSeq 6000 (SP 100 cycles). Two independent biological replicates containing pools of 6-8 E13.5GEs for each genotype and each antibody, as well as anti-IgG negative controls for each genotype, were analyzed. The method for analysis is summarized as follows: mapping was performed using Bowtie2 v2.1.069 on mm9 with options “--local --very-sensitive-local --no-unal --no-mixed --no-discordant --phred33 -I 10 -X 700”. The Picard toolkit command ‘MarkDuplicates’ was used to mark PCR duplicates, which were removed from the final mm9 genome-mapped bam files. We separated the sequence fragments into ≤120 and ≥150 bp, fragments mapping to repeat elements were removed, and replicates were joined before peak calling. The peak calling was performed using MACS270 callpeak options “-t -c -f BED -g mm --keep-dup all --bdg --nomodel --slocal 500 --llocal 5000 --extsize 120/150”, with an FDR cutoff of 0.05; peaks obtained from anti-IgG probed chromatin were used as negative controls. Differential Cut&Run analysis of transcription factor-binding peaks between genotypes was performed using the MACS2 program by treating one of the samples as the ‘control’ for the other (qpois<0.05).
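The fragment size separation described above (≤120 bp and ≥150 bp) can be sketched in Python as follows. This is an illustration only, not the pipeline used in the study. It assumes BED-style half-open coordinates, so that fragment length is end minus start, and it assumes that fragments of 121 to 149 bp are set aside, which the text does not state explicitly. The function name and example coordinates are hypothetical.

```python
# Minimal sketch (hypothetical function; assumes BED-style half-open coordinates).

def split_fragments(fragments, short_max=120, long_min=150):
    """Split fragments into a short class (<= short_max bp) and a long class (>= long_min bp).

    fragments: iterable of (chrom, start, end) tuples.
    Fragments with lengths between the two cutoffs are not returned (an assumption).
    """
    short_fragments, long_fragments = [], []
    for chrom, start, end in fragments:
        length = end - start
        if length <= short_max:
            short_fragments.append((chrom, start, end))
        elif length >= long_min:
            long_fragments.append((chrom, start, end))
    return short_fragments, long_fragments


# Hypothetical fragments of 120, 121, 150 and 300 bp.
example = [("chr6", 100, 220), ("chr6", 100, 221), ("chr6", 100, 250), ("chr6", 0, 300)]
short_fragments, long_fragments = split_fragments(example)
print(short_fragments)
print(long_fragments)
```

The two size classes correspond to the two "--extsize" values (120 and 150) listed in the MACS2 options above.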
Chromatin Isolation by RNA Purification (ChIRP)
The method described in ref. 53 was followed and modified for embryonic day 13.5 ganglionic eminences (13.5GEs). For each genotype, n = 2 pools of biological replicates containing 12 Evf1/2+/+ E13.5GEs or 16 Evf1/2TS/TS E13.5GEs were dissected in ice cold L15, pipetted to produce a single cell suspension, pelleted for 4 min, X 800 RCF, resuspended in PBS, pelleted for 4 min, X 800 RCF, and resuspended in 5 mL of 1% glutaraldehyde. Cells were cross-linked for 10 min, quenched with 500 µL of 1.25 M glycine for 5 min, pelleted for 5 min X 2000 RCF, 4 °C, washed with PBS, pelleted for 5 min X 2000 RCF for 5 min, 4 °C, resuspended 1 mL 4 °C PBS per 8 embryos, pelleted X 2000 RCF for 3 min at 4 °C, supernatant aspirated, and cell pellets flash frozen in liquid nitrogen and stored at −80 °C.
Cell Lysis and sonication
Lysis buffer consisted of 50 mM Tris-HCl pH 7, 10 mM EDTA, and 1% SDS, with freshly added 0.01X protease inhibitor cocktail (SIGMA), 1 mM AEBSF (SIGMA A8456), and 0.01X RNAse inhibitor (SIGMA). Cell pellets were resuspended in 900 µL lysis buffer per 8 embryos and homogenized with a motorized pellet mixer. Cell lysate was sonicated in a Bioruptor (Diagenode) on high setting, with 24 cycles of 30 s ON and 45 s OFF for a total run time of 30 min. After sonication, 5 µL lysate was transferred to a microcentrifuge tube with 90 µL lysis buffer and 5 µL Proteinase K, incubated at 50 °C for 45 min, and purified with the ChIP DNA Clean and Concentrator Kit (Zymo Research). DNA size was checked on TapeStation, with an expected bulk of DNA around 100–500 bp. Remaining sonicated chromatin was centrifuged at 16,100 RCF for 10 min, 4 °C, and flash frozen in liquid nitrogen.
Chromatin immunoprecipitation
Chromatin was thawed at RT and 10 µL was removed for DNA INPUT. 4 mL Hybridization buffer (750 mM NaCl, 50 mM Tris-HCl pH 7, 1 mM EDTA, 1% SDS, 15% formamide, with freshly added 0.01X protease inhibitor cocktail, 1 mM AEBSF, and 0.01X RNAse inhibitor) was prepared for each pair of ChIRP reactions. 890 µL chromatin was split evenly into three tubes for Evf2 odd, Evf2 even, and Gapdh probes, and 2X volume Hybridization buffer was added. ChIRP probes were thawed at RT and 1 µL of 100 µM Evf2 odd, Evf2 even, or Gapdh probe pool per 890 µL volume chromatin was added to separate aliquots. Aliquots were incubated at 37 °C for 4 h with end-to-end rotation. 100 µL Ampure XP beads per 100 pmol probe were prepared by washing three times with lysis buffer, resuspended in complete lysis buffer, added to chromatin, and incubated for an additional 30 min with rotation. Pelleted beads were washed five times with 1 ml of 2X SSC, 0.5% SDS (with freshly added 1 mM AEBSF) at 37 °C, incubating 5 min between each wash. The wash buffer was removed completely from the beads with the last wash.
DNA isolation, libraries, and sequencing
1 mL complete DNA Elution buffer (CDE: 50 mM NaHCO3, 1% SDS + 10 µL of 10 mg/mL RNase A and 10 µL of 10 U/µl RNase H) was prepared. DNA INPUT was diluted with 140 µL CDE and incubated at 37 °C for 30 min on a thermomixer at 300 rpm. Beads were resuspended in 150 µL of CDE, pelleted on a magnetic stand, and the supernatant was kept (elution 1). Beads were then resuspended in 150 µL CDE, incubated at 37 °C for 30 min at 300 rpm on a thermomixer, pelleted, and the supernatant was kept (elution 2). Elutions 1 and 2 were combined, 15 µL Proteinase K was added, and samples were incubated at 50 °C for 45 min at 300 rpm (thermomixer). Samples were purified with the ChIP DNA Clean and Concentrator Kit (Zymo Research). Libraries were created with NEBNext Multiplex Oligos for Illumina (Dual Index Primers Set 1), and 10 M reads were sequenced (PE100) on a NovaSeq. Raw sequencing reads for ChIRP datasets were aligned using the bowtie269 mapper with the following settings: “--local --very-sensitive-local --no-mixed --no-discordant --phred33 -I 10 -X 700”. The datasets were mapped against the mm9 version of the mouse genome. After mapping the reads, peak calling was performed using the MACS2 program70. An FDR cutoff of 0.05 was used to call the final set of peaks. Following mapping and peak calling, the mouse ideogram visualization was created using the RIdeogram package71. The UCSC browser was used to visualize peaks and alignments as shown.
Probe design
Anti-sense oligo probes for Evf2 were designed using the Biosearch Technologies’ Stellaris FISH Probe Designer. A total of 20 20-mer Evf2 probes and 10 20-mer Gapdh probes with a GC content of ~45% and relatively even spacing between probes were chosen and ordered with 3’-Biotin-TEG modification and HPLC purification. Probes were reconstituted at 100 µM. Evf2 probes were numbered by position from 5’ to 3’, and odd and even probe pools were created using 50 µL of each reconstituted odd and even probe, respectively. 50 µL of each of the 10 Gapdh probes was combined to create the Gapdh probe pool.
Probe Sequences for Dlx6os1 [Evf2]:
1. CAG TGC CAT CCA ATT TGA AG
2. CTG TGA AAC TTT GGG TTC GT
3. CAG TCA GTC TTC AGA ATG GT
4. AGT CTT CTT GAA GTT GGT GT
5. TGG TTC ATC TCT GAT CTG AC
6. GTT AGC ACT CTA AGA GGT CA
7. CTA GTT CTG TGT TCT GTG AT
8. CTA CAG GGT ACA CTC AAG GA
9. GGC TTA GAG AAC ATA GCC AT
10. TAC AGT CGC ATA GCT CTT TA
11. TTG ATA AAA GAG ACC CTC CC
12. TAC CTG ATG CAT ACT GCA TA
13. ATA TTG TAT GTC AGT GCT CC
14. GAG ATA GTT AGA GCC CTT AG
15. ATA TGG ATT TGC TGA CTC CA
16. GAA CAC ACC AGA CCA TTC AT
17. CCA CCA AGA GAG TAC ATT CA
18. GAT TTC TCT TGA GGG TAC GA
19. CCT TTT TCT GTA AAC TGG CG
20. TGG TAC TCA TTT TTT CCA GG
Probe Sequences for Gapdh:
1. AAA GGA GAT TGC TAC GCC AT
2. TTT TGA AAT GTG CAC GCA CC
3. ATT ACG GGA TGG GTC TGA AC
4. TAG AAT ACG CAT TAT GCC CG
5. ATT TAA CCT CAG ATC AGG GC
6. CAG ACC TGT GAA CTC ATT CA
7. AAT GCT TGG ATG TAC AAC CC
8. TCT CAT GTT CTT CAG AGT GG
9. ACT CAT GGC AGG GTA AGA TA
10. CCC AGT TGC TCT TAA AAG TC
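The probe design criteria described above (GC content of ~45% and odd/even pooling by 5’ to 3’ position) can be checked with a short Python sketch. This is an illustration with hypothetical helper names, not the Stellaris designer's algorithm. The four sequences are Evf2 probes 1 to 4 as listed above.

```python
# Minimal sketch (hypothetical helpers; sequences are Evf2 probes 1-4 from the list above).

def gc_fraction(probe):
    """Fraction of G and C bases in a probe; spaces between triplets are ignored."""
    sequence = probe.replace(" ", "").upper()
    return sum(base in "GC" for base in sequence) / len(sequence)


def odd_even_pools(probes):
    """Split probes, ordered 5' to 3' and numbered from 1, into odd and even pools."""
    odd_pool = probes[0::2]   # probes 1, 3, 5, ...
    even_pool = probes[1::2]  # probes 2, 4, 6, ...
    return odd_pool, even_pool


evf2_probes = [
    "CAG TGC CAT CCA ATT TGA AG",
    "CTG TGA AAC TTT GGG TTC GT",
    "CAG TCA GTC TTC AGA ATG GT",
    "AGT CTT CTT GAA GTT GGT GT",
]
odd_pool, even_pool = odd_even_pools(evf2_probes)
for probe in evf2_probes:
    print(probe, round(gc_fraction(probe), 2))
```

Keeping odd and even probes in separate pools allows the two pools to serve as independent pull-downs of the same RNA, so that peaks recovered by both are less likely to reflect off-target hybridization of any single probe.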
Multiomics alignments
All scRNAseq, Cut&Run, and ChIRPseq raw and processed datasets, Metadatafile, and supplementary Excel files are available at NCBI GEO (GSE261585). 4Cseq chromosome capture datasets using Dlx5/6UCE as the bait were previously reported (comparing E13.5GE Evf2+/+ vs Evf2TS/TS 5, and comparing Sox2fl/fl;Dlx5/6cre vs Sox2fl/fl4). SMC1a, SMC3, HMs (H3K27me3, H3K27ac, H3K4me1 and H3K4me3) ChIPseq datasets from ref. 5, and SOX2, SMARCA4, DLX Cut&Run datasets comparing E13.5GE Evf2+/+ vs Evf2TS/TS were used. Dlx5/6UCE 4Cseq and CUT&RUN signal overlap and violin plots: Evf2+/+ and Evf2TS/TS Dlx5/6UCE 4Cseq peaks were first mapped on CUT&RUN differential TF peaks using the bedtools “intersect” function76,77. The overlapping set of 4C-seq peaks from a Dlx5/6UCE condition was then mapped on the respective CUT&RUN IgG-normalized TF signal data. The log2 fold-enrichment of IgG-normalized signal was generated using the MACS2 “bdgcmp” command70. The violin plots were made using the “ggpubr” R package (Kassambara, ggpubr R Package: ggplot2-Based Publication Ready Plots). Quantification and statistical analysis were performed using R. The colored dotted line represents the mean of the respective class. An unpaired t-test was used to measure the significance. For ChIP-seq and CUT&RUN, a peak is defined as a region with q < 0.05; for a 4Cseq experiment, a significant peak is defined with FDR < 0.05 and an absolute log2 fold enrichment of ≥2.
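The peak definitions stated above can be written as two small predicates. The Python sketch below is illustrative only; the function names are hypothetical, and the inputs are assumed to be the q-value, FDR, and log2 fold enrichment already reported by the peak callers.

```python
# Minimal sketch (hypothetical function names) of the peak definitions given in the text.

def is_chip_or_cutrun_peak(q_value):
    """ChIP-seq and CUT&RUN: a peak is a region with q < 0.05."""
    return q_value < 0.05


def is_significant_4c_peak(fdr, log2_fold_enrichment):
    """4Cseq: FDR < 0.05 and an absolute log2 fold enrichment of at least 2."""
    return fdr < 0.05 and abs(log2_fold_enrichment) >= 2


# Hypothetical values, for illustration only.
print(is_chip_or_cutrun_peak(0.01))
print(is_significant_4c_peak(0.01, -2.5))
print(is_significant_4c_peak(0.01, 1.9))
```

Using the absolute value means that both gained and lost Dlx5/6UCE contacts qualify, provided the change is at least four-fold in either direction.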
UpSet plots were used to visualize intersections across multiple datasets. In our analysis, UpSet plots display the overlapping statistics among Evf2-RNPs, HMs, 4Cseq, and ChIRP peaks from Evf2+/+ and Evf2TS/TS Dlx5/6UCE, as well as ChIRP peaks from Evf1/2TS/TS samples as negative controls. We employed the ‘intersect’ function in bedtools76,77 with two different window sizes (50 kb and 5 kb) to identify overlaps among these elements. The visualization of these overlaps was then generated using the UpSetR package (UpSetR: An R Package for the Visualization of Intersecting Sets and their Properties; https://github.com/hms-dbmi/UpSetR). The UCSC Browser was used to align peaks at key regions of the mouse genome as indicated. G-quadruplex analysis was performed using https://bioinformatics.ibp.cz/#/analyse/quadruplex, based on G4hunter72.
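One way to read the windowed overlap step is sketched below in Python. This is not the bedtools implementation, and the text does not specify how the 50 kb and 5 kb windows were applied. The sketch assumes that the first interval is extended by the window on both sides and that coordinates are BED-style half-open. The function name and example coordinates are hypothetical.

```python
# Minimal sketch (hypothetical function; window handling is an assumption, see lead-in).

def overlaps_within_window(interval_a, interval_b, window):
    """Return True if interval_b overlaps interval_a after extending A by `window` bp per side.

    Each interval is a (chrom, start, end) tuple with half-open coordinates.
    """
    chrom_a, start_a, end_a = interval_a
    chrom_b, start_b, end_b = interval_b
    if chrom_a != chrom_b:
        return False
    return start_a - window < end_b and start_b < end_a + window


# Two hypothetical peaks separated by a 39 kb gap on chr6.
peak_a = ("chr6", 1_000_000, 1_001_000)
peak_b = ("chr6", 1_040_000, 1_041_000)
print(overlaps_within_window(peak_a, peak_b, 50_000))
print(overlaps_within_window(peak_a, peak_b, 5_000))
```

Under this reading, the 50 kb window captures regional co-occurrence of features, while the 5 kb window restricts overlaps to features that are close to one another.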
E13.5GE HiC
HiC was performed using the Arima-HiC kit (A510008), according to the manufacturer's protocols with modifications as described previously58. 12-16 E13.5 ganglionic eminences were isolated from 6 to 8 mouse embryonic brains and dissociated in L15 media gently first with a 1000 µL tip and then with a 200 µL tip 10-15 times to create a single cell suspension. Cells were spun for 2.5 min at 1000 rpm and 4 °C. The L15 was aspirated and cells were resuspended in 500 µL of ice-cold Nuclear Extraction Buffer (0.32 M sucrose, 5 mM CaCl2, 3 mM Mg(Ac)2, 0.1 mM EDTA, 20 mM Tris-HCl pH 8.0, 0.1% TritonX-100) and incubated on ice for 10 min. Nuclei were pelleted at 1000 rpm for 2.5 min at 4 °C, and resuspended in 1X PBS. Nuclei were fixed in 10 mL of PBS with 1% formaldehyde with rotation for 10 min at room temperature. Crosslinking was then quenched with glycine to a final concentration of 125 mM for 5 minutes at room temperature. Nuclei were washed with 1x PBS and pelleted for 15 min at 500 g and 4 °C. Nuclei were then crosslinked with fresh 3 mM DSG (disuccinimidyl glutarate) in PBS for 40 min with rotation at room temperature. Crosslinking was quenched with 0.4 M glycine for 5 min, and then washed with 1X PBS. The nuclei were flash frozen in liquid nitrogen. The input amount estimation and nuclei processing follow Arima-HiC instructions. Briefly, DNA was extracted using AMPure XP Beads and then quantified with Qubit. 75 ng was reserved for quality control to determine the percentage of biotin labeling. Library preparation was performed using the Arima Library Prep for Arima HiC+ Kit (A510008, A303011). DNA was fragmented with Covaris for an average fragment size of 400 bp. Volumes for the protocol were adjusted for an initial volume of 130 µL for DNA fragmentation. DNA was size selected with AMPure XP Beads and then enriched with biotin using T1 beads (ARIMA kit). 
Library preparation was performed according to the protocol and a unique index pair was added to each sample before the samples were amplified with PCR and purified with AMPure XP Beads.
HiC data analysis
Adaptor sequences were removed from fastq files by trimming the first 5 base pairs from each read. The HiC-Pro pipeline (v3.1.0)73 was used with default parameters to obtain the normalized contact maps. Briefly, trimmed reads were mapped to the mouse mm9 reference genome using bowtie2 (v2.4.4)69 with minimum mapping quality ≥ 10 and ‘--very-sensitive’ local parameters. Valid pairs were then extracted from the BAM file, and the contact map was built considering the ligation sites ‘GATCGATC, GANTGATC, GANTANTC, GATCANTC’. HiCLift74 was used to generate .hic files from the matrices. Loop analysis: loops are defined as interactions between two anchors in the HiC map. FitHiC 2.075 was used to identify significant loops at 20 kb resolution. For stringency, we first filtered loops with FDR ≤ 0.01 and contact count ≥ 5, and then kept only significant loops that were found in both replicates. Bedtools v2.30.0 “pairtopair” commands were used for all overlaps76,77.
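The loop-filtering step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the `Loop` record, its field names, and the rule that two loops are "the same" when both 20 kb anchor bins match exactly are assumptions made for illustration (the study used bedtools “pairtopair” for overlaps).

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Loop(NamedTuple):
    """Hypothetical record for one called interaction between two 20 kb bins."""
    chrom1: str
    bin1: int            # start coordinate of the first anchor bin
    chrom2: str
    bin2: int            # start coordinate of the second anchor bin
    contact_count: int
    q_value: float       # FDR-adjusted significance


LoopKey = Tuple[str, int, str, int]


def significant(loops: Iterable[Loop], max_fdr: float = 0.01,
                min_count: int = 5) -> Set[LoopKey]:
    """Keep loops with FDR <= max_fdr and contact count >= min_count."""
    return {
        (l.chrom1, l.bin1, l.chrom2, l.bin2)
        for l in loops
        if l.q_value <= max_fdr and l.contact_count >= min_count
    }


def reproducible_loops(rep1: Iterable[Loop], rep2: Iterable[Loop],
                       max_fdr: float = 0.01, min_count: int = 5) -> Set[LoopKey]:
    """Return loops that pass both thresholds in both replicates.

    Assumption: replicates share the same fixed binning, so identical
    anchor bins are treated as the same loop.
    """
    return significant(rep1, max_fdr, min_count) & significant(rep2, max_fdr, min_count)
```

Exact bin matching is the simplest choice when both replicates are binned identically; an overlap-based comparison would be needed if anchors could differ in size or offset.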
Evf1/2TS/TS mice
All mouse work was approved by and followed Northwestern’s IACUC guidelines.
Evf2TS/+ mice, previously reported45, are available at MMRRC. Evf2TS/+ mice (Dlx6os1tm1.1Jhdk/Mmmh) and Evf1/2TS/+ mice (MMRRC, submitted) are maintained on a mixed Bl6/129 background. All samples used in this study are from embryonic brain (E13.5). Evf1/2TS/TS mice containing triple polyA sequences inserted into exon 1 and exon 3 were generated by combining previously published recombineering constructs (exon 1 TS insertion, construct described in ref. 45; exon 3 TS insertion, construct described in ref. 5). The constructs were electroporated into ES cells, which were screened for correctly recombined cells containing both insertions by Southern blot and PCR, as described for Evf2TS/TS and Evf1TS/TS 5,45. Mice were crossed to EIIAcreBl6 (JAX 003724) to remove floxed neo, and then crossed to Bl6 to remove the EIIAcre. Loss of both Evf1 and Evf2 transcripts in E13.5GEs was verified by Taqman RT-PCR (not shown) and RNAseq (Fig. S2). Both the exon 1 (Evf2TS) and exon 3 (Evf1TS) insertions are independently verified in progeny using the primers listed below and the KAPA mouse genotyping kit (55 °C annealing, 75 °C elongation, 5 min, 35 cycles):
Exon 1 insertion (Evf2TS) forward: 5’ ACT CGC AGA AAC CTG GTC CTT TCT 3’
Exon 1 insertion (Evf2TS) reverse: 5’ AAC TGG ATG TGC TCA CTC AAC CAA 3’
Exon 3 insertion (Evf1TS) forward: 5’ TGT TCC CTG TAC CCA CTG AAA GCA 3’
Exon 3 insertion (Evf1TS) reverse: 5’ GCC TTC GTC CTA AAT TTG TCT CTC G 3’
PCR product sizes
Exon 1: Evf2+/+, 400 bp; Evf2TS/TS, 900 bp.
Exon 3: Evf1+/+, 180 bp; Evf1TS/TS, 300 bp.
RCD determination
The number of RBSs per chromosome varies from 1 (chrY) to 22 (chr11) (see Fig. 5A). We define RCDs (RBS-defined chromosomal domains) on the basis of the number and positioning of Evf2 RNA binding sites (RBSs) spaced at distances of >100,000 bp from each other on any given chromosome. To reduce the effects of small regional variations resulting from closely spaced RBSs, we define RCDs as domains with lengths >100,000 bp. Using chr6 as an example, where 6 RBSs were identified, RBSs 1-3 are spaced less than 100,000 bp from each other and are therefore combined (3XRBS, Fig. 8C). Thus, although chr6 has 6 RBSs, only 5 RCDs of differing sizes are defined (Fig. 8C).
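One way to read the RCD definition is sketched below in Python. It is a minimal illustration under stated assumptions, not the code used in the study: each RBS is reduced to a single coordinate, consecutive RBSs no more than 100,000 bp apart are chained into one cluster, and the domains are the stretches between clusters (plus the two chromosome ends). The function names, coordinates, and chromosome length are hypothetical.

```python
from typing import List, Sequence, Tuple


def merge_rbs(positions: Sequence[int], min_gap: int = 100_000) -> List[Tuple[int, int]]:
    """Chain RBSs into clusters; a gap larger than min_gap starts a new cluster."""
    clusters: List[List[int]] = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][1] <= min_gap:
            clusters[-1][1] = pos          # extend the current cluster
        else:
            clusters.append([pos, pos])    # open a new cluster
    return [(start, end) for start, end in clusters]


def define_rcds(positions: Sequence[int], chrom_length: int,
                min_gap: int = 100_000) -> List[Tuple[int, int]]:
    """Return (start, end) domains bounded by RBS clusters and chromosome ends.

    Assumption: the interior of a merged cluster is not itself a domain,
    and only domains longer than min_gap are kept.
    """
    clusters = merge_rbs(positions, min_gap)
    edges = [0] + [x for cluster in clusters for x in cluster] + [chrom_length]
    # Pair the edges as (0, c1.start), (c1.end, c2.start), ..., (cN.end, chrom_length).
    domains = [(edges[i], edges[i + 1]) for i in range(0, len(edges), 2)]
    return [(s, e) for s, e in domains if e - s > min_gap]


# Hypothetical coordinates: six RBSs, the first three within 100 kb of each other.
example_rbs = [10_000_000, 10_050_000, 10_090_000, 50_000_000, 90_000_000, 120_000_000]
example_rcds = define_rcds(example_rbs, chrom_length=149_000_000)
```

With these made-up coordinates the six RBSs collapse into four boundaries and five domains, mirroring the chr6 case described in the text.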
To determine whether RCDs demarcate functional domains, we used Cut&Run datasets to count the Evf2 (+) and Evf2 (-) regulated RNP binding sites in each RCD (the observed number of binding sites for each condition). The total number of RNP binding sites across the chromosome was determined, and the expected number of RNP binding sites in each RCD was calculated from the length (bp) of the RCD as a proportion of the full length of the chromosome on which it is located:
expected # RNP binding sites = total # RNP binding sites across chr × (RCD bp / chromosome bp).
The observed numbers of Evf2 (+) and Evf2 (-) regulated RNP binding sites from the Cut&Run data for each RCD and RNP were then compared to the expected numbers (based on the size of the RCD). Observed values were divided by expected values (E = 1, dashed black line, Fig. 7A) to give the fold increase or decrease in binding (Evf2 regulation).
Chr-RCD mapping of frequencies of HiC intra-chr and inter-chr interaction sites: The expected number of interaction sites, assuming uniformly distributed sites, is calculated as the total number of HiC sites on the chromosome × RCD proportion, where RCD proportion = RCD length / chromosome length. Fold change for each domain is calculated as the number of HiC intra- or inter-chromosomal sites observed in the domain in E13.5GEs divided by the expected number for the domain. For graphs in Fig. 7G/H, E = 1 (expected number) is indicated as a dashed line.
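The observed/expected calculation used for both RNP binding sites and HiC interaction sites can be sketched in a few lines of Python. This is a minimal illustration assuming each site is represented by a single coordinate and each domain by a half-open (start, end) interval; the function names and the toy numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def expected_sites(total_sites: int, domain_bp: int, chrom_bp: int) -> float:
    """Expected count if sites were uniformly distributed along the chromosome."""
    return total_sites * (domain_bp / chrom_bp)


def fold_enrichment(site_positions: Sequence[int],
                    domains: Sequence[Tuple[int, int]],
                    chrom_bp: int) -> List[float]:
    """Observed / expected site counts for each domain (1.0 means as expected)."""
    total = len(site_positions)
    folds = []
    for start, end in domains:
        # Count sites falling inside the half-open interval [start, end).
        observed = sum(start <= pos < end for pos in site_positions)
        expected = expected_sites(total, end - start, chrom_bp)
        folds.append(observed / expected if expected > 0 else float("nan"))
    return folds


# Toy example: 4 sites on a 1,000 bp "chromosome" split into two equal domains.
toy_folds = fold_enrichment([10, 20, 30, 600], [(0, 500), (500, 1000)], chrom_bp=1000)
```

In the toy example each domain is expected to hold 2 of the 4 sites, so the first domain (3 observed) is enriched and the second (1 observed) is depleted relative to E = 1.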
Summary of sampling
ScRNAseq
Two pools of Evf2+/+ (8 GEs from 4 brains), and one pool of Evf2TS/TS (4 GEs from 2 brains). For scRNAseq differential expression analysis, we use individual cells as replicate observations, rather than pseudo-bulked clusters/cell-types as biological replicates. The reason for this is that timed-pregnant homozygous genetic models impose considerable limitations that do not occur with cell lines, post-natal brains and/or adult tissues. Since the main focus was on differential analysis, we prepared samples in parallel for best comparisons. We were constrained not only by the numbers of Evf2+/+ and Evf2TS/TS embryos obtained on a given day from timed-pregnant dams (E13.5), but also by the time required for microdissections to meet the ≥90% viability cut-off after single-cell suspension of tissues (only those that met ≥90% viability were loaded into the fluidics chamber), and by the 4-sample limit of the fluidics chamber (10X Genomics). As a result, our analysis is based on the three pools listed above. In total, we sequenced 16,294 cells, which, when clustered in an unbiased manner using UMAP dimension reduction, reflected the VZ, SVZ (Evf2+/Dlx+), and Dlx5+ differentiated (Islet1+/Lhx6+/Lhx6/8+) subpopulations observed in the developing brain, confirming previously published work. In addition, cell type proportions and cell cycle proportions do not vary across 3 pooled E13.5GE samples (2 shown here in Fig. S1D & E: Evf2+/+ rep1 + Evf2+/+ rep2, Evf2TS/TS rep1, and not shown here: Evf1TS/TS rep1), supporting not only a lack of overall effects on cell-type and cell cycle proportions, but also the reproducibility of scRNAseq in identifying subpopulations.
Cut&Run
Two pools of Evf2+/+ (6–8 GEs from 3 to 4 brains), and two pools of Evf2TS/TS (6-8 GEs from 3 to 4 brains). Published results (anti-SOX2, anti-DLX, anti-SMARCA4) compared reproducibility of samples amongst n = 2–4 pooled samples of Evf2+/+ and Evf2TS/TS after IgG normalization (subtraction of chromatin probed with anti-IgG), and found high levels of reproducibility between replicates (FDR < 0.05 peaks, Qpois<0.05 differential peaks).
ChIRP
Two pools of Evf2+/+ (12–16 GEs from 6–8 brains), and two pools of Evf2TS/TS (12–16 GEs from 6–8 brains); tissues were collected and chromatin pooled to maximize the starting amount. ChIRP is prone to artifacts, due in large part to non-specific binding of RNA, so independent duplicates were performed. However, duplicates alone do not sufficiently address the specificity issues with this method. We use the ChIRPseq method described in ref. 53, in which ODD and EVEN probe sets designed throughout Evf1 and Evf2 are used (each in duplicate on separate pools of E13.5GEs) and only overlapping peaks are retained. Only peaks detected with both the ODD and EVEN probe sets, and not detected when Gapdh mRNA is used as the probe, are initially considered positive Evf1/2 RNA binding sites. These are the standard ChIRP controls for non-specific binding. We also performed ChIRP in Evf1/2TS/TS E13.5GE, adding another level of specificity that further eliminated non-specific Evf1/2 RNA binding. Therefore, only peaks detected with both ODD and EVEN probes, with Gapdh peaks subtracted, and not detected in Evf1/2TS/TS are considered positive for Evf1/2 RNA binding (FDR < 0.05 peaks, Qpois < 0.05 differential peaks).
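The layered specificity filter described above can be expressed as a short Python sketch. It is a simplified illustration rather than the analysis code: peaks are assumed to be half-open (chromosome, start, end) intervals, "subtraction" is treated as discarding any peak that overlaps a control peak by at least 1 bp, and retained sites are reported using the ODD peak coordinates. All names and coordinates are hypothetical.

```python
from typing import Iterable, List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two peaks share a chromosome and at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def overlaps_any(peak: Peak, others: Iterable[Peak]) -> bool:
    return any(overlaps(peak, other) for other in others)


def specific_binding_sites(odd: List[Peak], even: List[Peak],
                           gapdh: List[Peak], knockout: List[Peak]) -> List[Peak]:
    """Keep ODD peaks that are also found with EVEN probes and are absent
    from both the Gapdh control and the Evf1/2TS/TS (knockout) sample."""
    return [
        peak for peak in odd
        if overlaps_any(peak, even)
        and not overlaps_any(peak, gapdh)
        and not overlaps_any(peak, knockout)
    ]


# Hypothetical peaks: only the first ODD peak survives all three filters.
odd_peaks = [("chr6", 100, 200), ("chr6", 500, 600), ("chr6", 900, 1000), ("chr1", 100, 200)]
even_peaks = [("chr6", 150, 250), ("chr6", 550, 650), ("chr6", 950, 1050)]
gapdh_peaks = [("chr6", 590, 700)]
knockout_peaks = [("chr6", 940, 960)]
kept = specific_binding_sites(odd_peaks, even_peaks, gapdh_peaks, knockout_peaks)
```

The pairwise scan is quadratic in the number of peaks, which is adequate for a sketch; an interval-based tool would be the practical choice for genome-wide peak sets.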
HiC
Two pools of Evf2+/+ (12–16 GEs from 6–8 brains); tissues were collected and chromatin pooled to maximize the starting amount. Peak calls with FDR < 0.01 that were detected in both samples were considered to be positive chromosomal interactions, at qvals < 0.05.
Note
Biological replicates were pooled to account for biological variation in every experiment (from pools of n = 2 to n = 8, depending on the specific experiment and experimental constraints). Independent pooled experimental replicates were analyzed.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
All raw and processed datasets (E13.5GE Evf2+/+ vs Evf2TS/TS scRNAseq and Cut&Run (SMARCC2, NONO), E13.5GE Evf1/2+/+ vs Evf1/2TS/TS ChIRPseq, and E13.5GE Evf1/2+/+ HiC), as well as the metadata file and supplementary Excel files, have been made available at the NCBI GEO database under accession code GSE261585.
Code availability
The code used in the paper is provided in the Supplementary Software.
References
Lim, B. & Levine, M. S. Enhancer-promoter communication: hubs or loops?. Curr. Opin. Genet. Dev. 67, 5–9 (2021).
Uyehara, C. M. & Apostolou, E. 3D enhancer-promoter interactions and multi-connected hubs: organizational principles and functional roles. Cell Rep. 42, 112068 (2023).
Feng, J. et al. The Evf-2 noncoding RNA is transcribed from the Dlx-5/6 ultraconserved region and functions as a Dlx-2 transcriptional coactivator. Genes Dev. 20, 1470–1484 (2006).
Cajigas, I. et al. Sox2-Evf2 lncRNA-mediated mechanisms of chromosome topological control in developing forebrain. Development 148, dev197202 (2021).
Cajigas, I. et al. The Evf2 ultraconserved enhancer lncRNA functionally and spatially organizes megabase distant genes in the developing forebrain. Mol. Cell 71, 956–972.e9 (2018).
Orom, U. A. & Shiekhattar, R. Noncoding RNAs and enhancers: complications of a long-distance relationship. Trends Genet. 27, 433–439 (2011).
Orom, U. A. et al. Long noncoding RNAs with enhancer-like function in human cells. Cell 143, 46–58 (2010).
Lai, F. et al. Activating RNAs associate with Mediator to enhance chromatin architecture and transcription. Nature 494, 497–501 (2013).
Mattick, J. S. et al. Long non-coding RNAs: definitions, functions, challenges and recommendations. Nat. Rev. Mol. Cell Biol. 24, 430–447 (2023).
Mattick, J. S. Enhancers are genes that express organizational RNAs. Front. RNA Res. 1 (2023).
Hou, Y., Zhang, R. & Sun, X. Enhancer LncRNAs influence chromatin interactions in different ways. Front. Genet. 10, 936 (2019).
Zhang, Z. et al. HeRA: an atlas of enhancer RNAs across human tissues. Nucleic Acids Res. 49, D932–D938 (2021).
Yao, L. et al. A comparison of experimental assays and analytical methods for genome-wide identification of active enhancers. Nat. Biotechnol. 40, 1056–1065 (2022).
Zerucha, T. et al. A highly conserved enhancer in the Dlx5/Dlx6 intergenic region is the site of cross-regulatory interactions between Dlx genes in the embryonic forebrain. J. Neurosci. 20, 709–721 (2000).
Bejerano, G. et al. Ultraconserved elements in the human genome. Science 304, 1321–1325 (2004).
Woolfe, A. et al. Highly conserved non-coding sequences are associated with vertebrate development. PLoS Biol. 3, e7 (2005).
Ahituv, N. et al. Deletion of ultraconserved elements yields viable mice. PLoS Biol. 5, e234 (2007).
Dickel, D. E. et al. Ultraconserved enhancers are required for normal development. Cell 172, 491–499.e15 (2018).
Nolte, M. J. et al. Functional analysis of limb transcriptional enhancers in the mouse. Evol. Dev. 16, 207–223 (2014).
Osterwalder, M. et al. Enhancer redundancy provides phenotypic robustness in mammalian development. Nature 554, 239–243 (2018).
Habic, A. et al. Genetic variations of ultraconserved elements in the human genome. OMICS 23, 549–559 (2019).
Snetkova, V. et al. Perfect and imperfect views of ultraconserved sequences. Nat. Rev. Genet. 23, 182–194 (2022).
Snetkova, V. et al. Ultraconserved enhancer function does not require perfect sequence conservation. Nat. Genet. 53, 521–528 (2021).
Dukler, N. et al. Extreme purifying selection against point mutations in the human genome. Nat. Commun. 13, 4312 (2022).
Fedorova, L. et al. Nucleotide composition of ultra-conserved elements shows excess of GpC and depletion of GG and CC dinucleotides. Genes 13, 2053 (2022).
Nora, E. P. et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381–385 (2012).
Isoda, T. et al. Non-coding transcription instructs chromatin folding and compartmentalization to dictate enhancer-promoter communication and T cell fate. Cell 171, 103–119.e18 (2017).
Hacisuleyman, E. et al. Topological organization of multichromosomal regions by the long intergenic noncoding RNA Firre. Nat. Struct. Mol. Biol. 21, 198–206 (2014).
Tsai, P. F. et al. A muscle-specific enhancer RNA mediates cohesin recruitment and regulates transcription in trans. Mol. Cell 71, 129–141.e8 (2018).
Giorgetti, L. et al. Structural organization of the inactive X chromosome in the mouse. Nature 535, 575–579 (2016).
Jegu, T. et al. Xist RNA antagonizes the SWI/SNF chromatin remodeler BRG1 on the inactive X chromosome. Nat. Struct. Mol. Biol. 26, 96–109 (2019).
Cajigas, I. et al. Evf2 lncRNA/BRG1/DLX1 interactions reveal RNA-dependent inhibition of chromatin remodeling. Development 142, 2641–2652 (2015).
Sunwoo, H., Wu, J. Y. & Lee, J. T. The Xist RNA-PRC2 complex at 20-nm resolution reveals a low Xist stoichiometry and suggests a hit-and-run mechanism in mouse cells. Proc. Natl. Acad. Sci. USA 112, E4216–E4225 (2015).
Chu, C. et al. Systematic discovery of Xist RNA binding proteins. Cell 161, 404–416 (2015).
Minajigi, A. et al. Chromosomes. A comprehensive Xist interactome reveals cohesin repulsion and an RNA-directed chromosome conformation. Science 349, 10 (2015).
Dossin, F. et al. SPEN integrates transcriptional and epigenetic control of X-inactivation. Nature 578, 455–460 (2020).
Chen, C. K. et al. Xist recruits the X chromosome to the nuclear lamina to enable chromosome-wide silencing. Science 354, 468–472 (2016).
Yi, W. et al. CRISPR-assisted detection of RNA-protein interactions in living cells. Nat. Methods 17, 685–688 (2020).
Brockdorff, N. Localized accumulation of Xist RNA in X chromosome inactivation. Open Biol. 9, 190213 (2019).
Dror, I. et al. XIST directly regulates X-linked and autosomal genes in naive human pluripotent cells. Cell 187, 110–129.e31 (2024).
Ghavi-Helm, Y. et al. Enhancer loops appear stable during development and are associated with paused polymerase. Nature 512, 96–100 (2014).
Berghoff, E. G. et al. Evf2 (Dlx6as) lncRNA regulates ultraconserved enhancer methylation and the differential transcriptional control of adjacent genes. Development 140, 4407–4416 (2013).
Mayer, C. et al. Developmental diversification of cortical inhibitory interneurons. Nature 555, 457–462 (2018).
Schmitz, M. T. et al. The development and evolution of inhibitory neurons in primate cerebrum. Nature 603, 871–877 (2022).
Bond, A. M. et al. Balanced gene regulation by an embryonic brain ncRNA is critical for adult hippocampal GABA circuitry. Nat. Neurosci. 12, 1020–1027 (2009).
Bandler, R. C. et al. Single-cell delineation of lineage and genetic identity in the mouse brain. Nature 601, 404–409 (2022).
Goebbels, S. et al. Genetic targeting of principal neurons in neocortex and hippocampus of NEX-Cre mice. Genesis 44, 611–621 (2006).
Wu, S. X. et al. Pyramidal neurons of upper cortical layers generated by NEX-positive progenitor cells in the subventricular zone. Proc. Natl. Acad. Sci. USA 102, 17172–17177 (2005).
Baraban, S. C. Emerging epilepsy models: insights from mice, flies, worms and fish. Curr. Opin. Neurol. 20, 164–168 (2007).
Sohal, V. S. & Rubenstein, J. L. R. Excitation-inhibition balance as a framework for investigating mechanisms in neuropsychiatric disorders. Mol. Psychiatry 24, 1248–1257 (2019).
Cai, Y. et al. H3K27me3-rich genomic regions can function as silencers to repress gene expression via chromatin interactions. Nat. Commun. 12, 719 (2021).
Yamazaki, T. et al. Functional Domains of NEAT1 Architectural lncRNA Induce Paraspeckle Assembly through Phase Separation. Mol. Cell 70, 1038–1053.e7 (2018).
Chu, C. et al. Genomic maps of long noncoding RNA occupancy reveal principles of RNA-chromatin interactions. Mol. Cell 44, 667–678 (2011).
AlHaj Abed, J. et al. Highly structured homolog pairing reflects functional organization of the Drosophila genome. Nat. Commun. 10, 4485 (2019).
Bonora, G. & Disteche, C. M. Structural aspects of the inactive X chromosome. Philos. Trans. R. Soc. Lond. B Biol. Sci. 372, 20160357 (2017).
Ellis, N. & Goodfellow, P. N. The mammalian pseudoautosomal region. Trends Genet. 5, 406–410 (1989).
Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
Lafontaine, D. L. et al. Hi-C 3.0: improved protocol for genome-wide chromosome conformation capture. Curr. Protoc. 1, e198 (2021).
Meers, M. P., Janssens, D. H. & Henikoff, S. Pioneer factor-nucleosome binding events during differentiation are motif encoded. Mol. Cell 75, 562–575.e5 (2019).
Fazel Darbandi, S. et al. Increased sociability in mice lacking intergenic Dlx enhancers. Front Neurosci. 15, 718948 (2021).
Froberg, J. E. et al. Megadomains and superloops form dynamically but are dispensable for X-chromosome inactivation and gene escape. Nat. Commun. 9, 5004 (2018).
Won, H. et al. Chromosome conformation elucidates regulatory relationships in developing human brain. Nature 538, 523–527 (2016).
Trapnell, C. et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 32, 381–386 (2014).
Qiu, X. et al. Single-cell mRNA quantification and differential analysis with census. Nat. Methods 14, 309–315 (2017).
Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 14, 979–982 (2017).
Durinck, S. et al. Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat. Protoc. 4, 1184–1191 (2009).
Durinck, S. et al. BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics 21, 3439–3440 (2005).
Hao, Y. et al. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat. Biotechnol. 42, 293–304 (2024).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
Hao, Z. et al. RIdeogram: drawing SVG graphics to visualize and map genome-wide data on the idiograms. PeerJ Comput. Sci. 6, e251 (2020).
Bedrat, A., Lacroix, L. & Mergny, J. L. Re-evaluation of G-quadruplex propensity with G4Hunter. Nucleic Acids Res. 44, 1746–1759 (2016).
Servant, N. et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 16, 259 (2015).
Wang, X. & Yue, F. HiCLift: a fast and efficient tool for converting chromatin interaction data between genome assemblies. Bioinformatics 39, btad389 (2023).
Kaul, A., Bhattacharyya, S. & Ay, F. Identifying statistically significant chromatin contacts from Hi-C data with FitHiC2. Nat. Protoc. 15, 991–1012 (2020).
Quinlan, A. R. BEDTools: the Swiss-Army Tool for genome feature analysis. Curr. Protoc. Bioinforma. 47, 11.12.1–11.12.34 (2014).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Melville, J. uwot: the uniform manifold approximation and projection (UMAP) method for dimensionality reduction. R package version 0.1.16, https://CRAN.R-project.org/package=uwot (2023).
McInnes, L., Healy, J. & Melville, J. UMAP: uniform manifold approximation and projection for dimension reduction. https://doi.org/10.48550/arXiv.1802.03426 (2020).
Brodie-Kommit, J. et al. Atoh7-independent specification of retinal ganglion cell identity. Sci. Adv. 7 (2021).
Lu, Y. et al. Single-cell analysis of human retina identifies evolutionarily conserved and species-specific mechanisms controlling development. Dev. Cell 53, 473–491.e9 (2020).
Clark, B. S. et al. Single-Cell RNA-Seq analysis of retinal development identifies NFI factors as regulating mitotic exit and late-born cell specification. Neuron 102, 1111–1126.e5 (2019).
Meers, M. P. et al. Improved CUT&RUN chromatin profiling tools. Elife 8 (2019).
Skene, P. J., Henikoff, J. G. & Henikoff, S. Targeted in situ genome-wide profiling with high efficiency for low cell numbers. Nat. Protoc. 13, 1006–1019 (2018).
Acknowledgements
This work was funded by NIMH R01MH111267 and R03MH126145 to J.D.K., and RF1AG068140 to R.J.V. B.S.C. is supported by an individual career development award and an unrestricted grant to the Department of Ophthalmology and Visual Sciences from Research to Prevent Blindness. Bioinformatic resources were supplemented by the National Eye Institute of the National Institutes of Health under award number P30EY002687. NovaSeq6000 was acquired through the LaJolla Institute Shared Instrumentation Grant (SIG) Program (S10OD025052). All mouse work was approved by and followed Northwestern’s IACUC guidelines.
Author information
Contributions
Conceptualization, J.D.K., B.S.C., A.C., Methodology, I.C., B.S.C, E.L., L.G.H., R.B., J.D.K, Software, B.S.C., A.C., L.G.H., F.A., S.J.K., Validation, I.C., M.L., M.B., L.C., Formal Analysis, F.S., B.S.C., A.C., L.G.H., S.J.K., F.A., J.D.K., Investigation, E.L., I.C., M.L, Data Curation, J.D.K., A.C., F.S., B.S.C., F.A., Writing-original draft, J.D.K., Writing review & editing, J.D.K., B.S.C., A.C., E.L., I.C., J.A.K., R.J.V., Visualization, J.D.K. S.J.K., B.S.C., Supervision, J.D.K., Project Admin. J.D.K., R.J.V., Funding acquisition, J.D.K., R.J.V., F.A., B.S.C.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Piero Carninci, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Li, E., Chakraborty, A., Kohtz, S.J. et al. Single-cell transcriptomics of ventral forebrain progenitors identifies Evf2 enhancer lncRNA-enhancer gene guidance through direct RNA binding and RNP recruitment domains. Nat Commun 16, 6902 (2025). https://doi.org/10.1038/s41467-025-62205-y
DOI: https://doi.org/10.1038/s41467-025-62205-y