Abstract
Activation-induced deaminase (AID) mutates immunoglobulin genes to initiate antibody diversification by class switch recombination and somatic hypermutation but can also target non-immunoglobulin loci with oncogenic consequences. The mechanisms determining gene susceptibility to AID remain unclear. Here, we show that the H3K79 histone methyltransferase DOT1L is proximal to nuclear AID and promotes both class switch recombination and off-target AID activity, including Igh-cMyc translocations in mouse B cells. AID-mutated genes display high DOT1L activity. In the absence of DOT1L, nascent transcription largely increases despite reduced RNA Polymerase II (RNAPII) occupancy. Integrative genomic analyses reveal that DOT1L locally restricts transcription elongation velocity proportionally to H3K79me2/3 levels and extends RNAPII pausing. These transcriptional conditions enhance AID occupancy and thereby its activity. Our findings provide a harmonizing explanation for the bidirectional gene expression changes observed in DOT1L-deficient cells, and link attenuated transcriptional elongation velocity and prolonged RNAPII pausing to productive AID engagement at target loci.
Introduction
Upon activation, B cells modify the immunoglobulin (Ig) genes to enable the production of high-affinity antibodies with distinct effector functions. The variable exons (IgV) encoding the antigen-binding domain accumulate point mutations through the process of somatic hypermutation (SHM), thereby modulating antibody affinity1. The antibody heavy chain IgH locus additionally undergoes class switch recombination (CSR) to replace the constant region that defines the isotype, thus diversifying its effector function2.
SHM and CSR are initiated by Activation-Induced Deaminase (AID), which deaminates cytosine to uracil in single-stranded DNA exposed during transcription1. SHM results from uracil processing at the IgV1. Transcription from the main IgH promoter upstream of IgV enables SHM and produces transcripts encoding IgM by default. The IgH locus contains a tandem array of independent germline transcription units (GLTUs), each comprising a cytokine-inducible promoter, a repetitive intronic switch (S)-region, and the constant region exons of each isotype2. GLTUs produce sterile transcripts that enable AID activity. Uracil processing leads to DNA double-strand breaks (DSBs) simultaneously at two S-regions, triggering CSR1. While Sµ, the first S-region downstream from the IgVH is constitutively transcribed, transcription of the other S-regions is controlled by cytokine signaling, enabling the B cell microenvironment to influence isotype switching3. GLTU transcription interplays with cohesin-mediated chromatin loop extrusion to bring S-regions into proximity, facilitating recombination4.
AID can also introduce mutations and DSBs in a few hundred genes in normal B cells5,6,7,8,9,10,11. AID-induced DSBs can lead to oncogenic chromosomal translocations involving the Ig, such as IgH-cMyc fusions12. Notably, AID occupancy is not sufficient to predict mutagenesis, indicating that additional regulatory steps determine its activity6,13,14. Genes susceptible to AID activity share transcriptional features with the Ig, including super-enhancer regulation, antisense transcription, and high occupancy of RNA polymerase II (RNAPII) and its co-factor SPT57,8,9,15,16. We have proposed that AID activity requires a licensing step linked to the transition of RNAPII into productive elongation13, although the underlying mechanism remains unresolved. Histone modifications may facilitate AID activity. However, the histone marks correlating with AID activity, such as H3K4me3, H3K27ac, H3K36me3, and H3K79me2, are associated with active transcription9,17,18,19. Dissecting whether these modifications actively promote AID activity or reflect the transcriptional state of targeted genes is not trivial.
DOT1L is the only methyltransferase producing H3K79me1, me2, and me3 via a distributive mechanism, while demethylation is thought to occur largely by nucleosome dilution20,21,22. B cell-specific ablation of Dot1l in mice has pleiotropic effects, hampering B cell development, germinal center formation, and humoral responses, with reduced switched antibodies but also IgM23,24. Dot1l−/− mouse B cells stimulated ex vivo have reduced IgG1 switching, supporting a B cell-intrinsic role of DOT1L in CSR, albeit the effect depended on the stimulus, and the mechanism was not explored24. DOT1L ablation or inhibition in a human B cell lymphoma line reduced SHM25. Thus, while DOT1L has been implicated in CSR and SHM24,25, and AID off-target genes have H3K79me2 among other epigenetic marks typical of high transcription9,18, whether DOT1L contributes to off-target mutations, and the underlying mechanism remain unknown.
H3K79me2/3 are associated with actively transcribed genes26,27,28. However, DOT1L deficiency produces relatively few gene expression changes in stem cells, and results in at least as many upregulated as downregulated genes in somatic cells22,29,30,31, including B cells24. Although genes with higher elongation rates are associated with higher H3K79me2 levels27,28, some evidence suggests that DOT1L may limit transcription32,33. Moreover, DOT1L activity has been proposed to regulate transcription, either positively or negatively, at initiation34, at promoter-proximal pause escape32,33, and, indirectly, at elongation32,33. In addition, DOT1L might promote transcription by methylation-independent functions in defined contexts35,36,37. Consequently, interpreting the role of DOT1L in AID function remains challenging for lack of consensus on how DOT1L modulates transcription at the molecular level.
Here, we identify DOT1L and the super elongation complex (SEC) as nuclear AID-proximal factors. We find that the DOT1L-MLLT10 complex promotes CSR independently of SEC components. DOT1L catalytic activity is essential for this function, implicating H3K79 methylation. DOT1L also facilitates AID off-target activity, with high local DOT1L activity marking AID-targeted genes. Mechanistically, DOT1L deficiency causes transcriptional changes indicative of increased RNAPII elongation velocity over H3K79me2/3 regions and shorter pausing. These changes coincide with reduced AID occupancy and activity. Furthermore, because increased transcription elongation velocity increases or decreases transcriptional output depending on the gene context, it could explain expression changes caused by DOT1L deficiency. Our findings reveal transcription elongation effects by DOT1L and a link between RNAPII elongation dynamics and AID activity.
Results
DOT1L and the SEC are proximal to nuclear AID and promote CSR
We hypothesized that proteins enabling AID activity genome-wide would be enriched in its vicinity in the nucleus. To identify such factors, we performed proximity-dependent biotinylation (BioID) using AID fused to the biotin ligase BirA* at either terminus (AID-BirA, BirA-AID) in Flp-In T-REx 293 cells38. This approach yielded 382 preys significantly enriched over BirA-GFP and BirA controls (Supplementary Data 1).
To refine this list, we first excluded preys also identified in BioID experiments using APOBEC2, a structurally similar but functionally distinct AID paralog39. Second, we exploited the inability of BirA-AID to access the nucleus to exclude preferentially cytoplasmic interactions. AID shuttles between nucleus and cytoplasm, but large N-terminal fusions block its nuclear import40. Accordingly, both AID-BirA and BirA-AID were mainly cytoplasmic under steady state, but only AID-BirA accumulated in the nucleus upon treatment with drugs that prevent AID cytoplasmic retention and nuclear export (Supplementary Fig. 1a)40. These filters yielded 124 preys preferentially labeled by AID-BirA (Supplementary Data 1), including known AID partners (e.g., Exosome-associated NEXT complex41). We prioritized transcription-associated factors for further analysis, in particular the DOT1L complex (Fig. 1a, b). We also identified histone readers MLLT6 and MLLT10, which can associate with DOT1L29,42,43, and components of the super elongation complex (SEC), which regulates promoter-proximal pause release (Fig. 1a–c)44. The SEC consists of one scaffold protein (AFF1 or AFF4), one histone reader (MLLT1 or MLLT3), and transcription elongation factors, notably P-TEFb (Fig. 1c)44. AFF3 forms a distinct SEC-like complex, which also recruits P-TEFb44 and supports CSR to select isotypes45. DOT1L can interact with MLLT1 or MLLT3 independently of the SEC or MLLT6/10 (Fig. 1c), suggesting multiple functionally distinct DOT1L complexes29,42,46,47,48,49.
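The two filtering steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function name, the dictionary inputs, the 2-fold enrichment threshold, and the pseudocount of 1 are assumptions made for illustration only, and the prey names and spectral counts in the example are invented.

```python
def filter_nuclear_preys(aid_bira, bira_aid, apobec2_preys, min_fold=2.0):
    """Return preys preferentially labeled by the nuclear-competent AID-BirA bait.

    aid_bira, bira_aid: dict mapping prey name -> average spectral count.
    apobec2_preys: set of preys also found with the APOBEC2 bait.
    min_fold: hypothetical enrichment threshold (an assumption, not from the study).
    """
    kept = []
    for prey, c_fusion_count in aid_bira.items():
        # Step 1: drop preys shared with the APOBEC2 paralog.
        if prey in apobec2_preys:
            continue
        # Step 2: require enrichment over BirA-AID, which cannot enter the nucleus.
        # A pseudocount of 1 avoids dividing by zero for preys absent from BirA-AID.
        n_fusion_count = max(bira_aid.get(prey, 0.0), 1.0)
        if c_fusion_count / n_fusion_count >= min_fold:
            kept.append(prey)
    return sorted(kept)


# Invented toy counts, for illustration only.
toy_aid_bira = {"preyA": 12.0, "preyB": 8.0, "preyC": 30.0, "preyD": 10.0}
toy_bira_aid = {"preyA": 1.0, "preyC": 28.0}
toy_apobec2 = {"preyD"}
nuclear_candidates = filter_nuclear_preys(toy_aid_bira, toy_bira_aid, toy_apobec2)
```

In this toy case, preyD is removed by the APOBEC2 filter and preyC by the BirA-AID comparison, because it is labeled almost equally by both fusions.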
a Protein–protein interaction network from BioGRID for AID-BirA-specific preys identified by BioID in Flp-In T-REx 293 cells. Nodes represent proteins; edges indicate known interactions. Selected complexes are highlighted. b Heat map of average spectral counts (n = 4 replicates) for selected chromatin-associated proteins detected in AID and APOBEC2 (A2) BioID experiments. W–D specificity scores for each prey are shown. SEC and DOT1L complex components are highlighted. c Composition of DOT1L, SEC, and SEC-L3 complexes. Arrows indicate known physical interactions. d Frequency of IgA⁺ CH12F3 cells expressing the indicated shRNAs, measured by flow cytometry 3 days after cytokine (CIT) stimulation. Mean values (bars) from n = 3 to 5 biological replicates (symbols). e Relative levels of unspliced Sμ and Sα germline transcripts (GLTs) in CH12F3 cells expressing the indicated shRNAs (red), normalized to control cells expressing a non-targeting shRNA (grey), measured by RT–qPCR 48 h post-CIT. Mean values (bars) from n = 3 biological replicates (symbols). Schematic of the Igh locus shows amplicon positions. f Western blot of the indicated proteins in total lysates from parental and DOT1L-deficient CH12F3 cell clones. Uncropped blots in Source Data. g Frequency of IgA⁺ cells in parental and DOT1L-deficient CH12F3 clones measured by flow cytometry 48 h post-CIT. Mean values (bars) from n = 4 biological replicates (symbols). h Frequency of IgA⁺ cells in WT or DOT1L-deficient CH12F3 cells expressing the indicated shRNAs, measured by flow cytometry 48 h post-CIT. Mean values (bars) from n = 3 biological replicates (symbols). i Mutation frequency in a region 5′ of the Sμ segment in CH12F3 cells. Pie charts show the proportion of sequences with the indicated number of mutations; total number of sequences is shown at center. 
d, g (left-hand side), h P values shown for significant differences by one-way ANOVA with Dunnett’s multiple comparison test; e, g (right-hand side) by unpaired two-tailed Student’s t-test.
To test the function of these complexes in CSR, we knocked down each factor in the CH12F3 mouse B cell line, which switches to IgA upon stimulation with anti-CD40, IL-4, and TGF-β1 (CIT)50. This approach showed that DOT1L, MLLT10, MLLT1, AFF1, and AFF4 were each required for optimal CSR, without affecting cell division or AID mRNA levels, except for MLLT10 knockdown, which reduced AID by ~25% (Fig. 1d and Supplementary Fig. 1b, c). Apoptosis increased upon CIT stimulation after depletion of DOT1L, MLLT1, MLLT6, MLLT10, or AFF4 (Supplementary Fig. 1d), suggesting a protective role after acute stimuli. However, CSR was quantified in viable (DAPI−) cells and apoptosis did not correlate with CSR defects. Sµ or Sα GLT levels did not significantly change after depleting DOT1L or the SEC components contributing the most to CSR (Fig. 1e). We measured unspliced GLTs, which have <75 min half-lives and would reflect large transcriptional changes (Supplementary Fig. 1e).
We conclude that DOT1L, MLLT10, MLLT1, AFF1, AFF3, and AFF4 are proximal to nuclear AID and have non-redundant contributions to CSR.
DOT1L functions independently of the SEC in CSR
Because DOT1L and the SEC can function cooperatively, and MLLT1/3 can be part of either complex49, we performed epistasis analysis to determine whether DOT1L acts with or independently of the SEC in CSR. We inactivated Dot1l in two parental WT CH12F3 populations to obtain two independent clones with identical homozygous in-frame deletions in the DOT1L methyltransferase domain (Dot1lΔ/Δ−1 and Dot1lΔ/Δ−2), as well as a Dot1l−/− clone with frameshifting deletions that truncate the protein at amino acid ~100 (Supplementary Fig. 2a). Although no antibody detected mouse DOT1L, the protein variant potentially produced by Dot1lΔ/Δ clones had no effect on CSR when overexpressed in WT CH12F3 cells, excluding dominant-negative activity (Supplementary Fig. 2b). H3K79me2 was undetectable in all clones, confirming DOT1L functional deficiency (Fig. 1f). ChIP-qPCR showed that H3K79me2 and H3K79me3 were present at the IgH Sµ and Sα regions in WT but absent in DOT1L-deficient cells (Supplementary Fig. 2c).
CSR to IgA was reduced by 40–60% in all DOT1L-deficient clones (Fig. 1g), without affecting AID protein expression (Fig. 1f) or cell proliferation (Supplementary Fig. 2d). Transcriptional activity monitored by unspliced GLT levels was unchanged for Sμ and showed clone-specific variation for Sα (Supplementary Fig. 2e), which did not correlate with CSR impairment (Fig. 1g).
To test for epistatic relationships between DOT1L and SEC, we knocked down Aff1 or Mllt1 in two DOT1L-deficient clones, which further reduced CSR by ~50% (Fig. 1h), demonstrating an independent role from DOT1L in CSR. In contrast, Mllt10 knockdown had no additive effect to DOT1L deficiency, suggesting they act as a complex for CSR (Fig. 1h).
We conclude that a DOT1L-MLLT10 complex is required for efficient CSR, playing a distinct role from SEC components AFF1 and MLLT1.
DOT1L is dispensable for DNA repair during CSR
Like Dot1l knockdown, CIT-stimulated DOT1L-deficient CH12F3 cells showed increased apoptosis, which was not prevented by AID depletion (Supplementary Fig. 2f). To test if a potential DNA repair defect might contribute to reducing CSR in DOT1L-deficient cells, we used CRISPR/Cas9 to introduce DSBs directly at the 5′ Sμ and 3′ Sα regions, bypassing upstream events51. DOT1L-deficient cells displayed increased CRISPR-triggered CSR (Supplementary Fig. 2g), likely reflecting enhanced NHEJ activity selected during clonal expansion. This was confirmed using an NHEJ-specific reporter (Supplementary Fig. 2h) and is consistent with the reduction in homologous recombination repair in DOT1L-deficient cells52,53, which we confirmed in CH12F3 cells via a reporter assay (Supplementary Fig. 2h). Treatment with the DOT1L-specific inhibitor pinometostat54 reduced CSR in CH12F3 cells but did not impair NHEJ or CRISPR/Cas9-induced CSR, confirming that DOT1L is dispensable for end-joining (Supplementary Fig. 2i). This places DOT1L function upstream of DSB formation during CSR, consistent with reduced accumulation of mutations in the Sµ region of DOT1L-deficient CH12F3 cells (Fig. 1i).
DOT1L methyltransferase activity is essential for CSR
To dissect how DOT1L promotes CSR, we reconstituted DOT1L-deficient CH12F3 cells with DOT1L mutants.
WT DOT1L rescued CSR, confirming the specificity of the defect (Fig. 2a). In contrast, catalytically impaired mutants CD-3 (Y312A) and CD-4 (N241A)55, though well expressed and able to occupy Sµ and Sα, failed to rescue CSR or restore H3K79 methylation (Fig. 2a, b and Supplementary Fig. 3a). A nucleosome binding–deficient mutant (R278E/R282E, ΔNucleosome)56,57 was expressed at lower levels than WT DOT1L and also failed to rescue CSR (Fig. 2a and Supplementary Fig. 3a).
a Scheme of DOT1L variants and their relative CSR activity when expressed in Dot1lΔ/Δ1 or Dot1l−/− CH12F3 cells. Proportion of IgA+ cells 48 h post-CIT for each mutant is normalized to cells expressing WT DOT1L. Mean values (bars) and individual biological replicates (symbols) from n = 3 to 6 experiments. P values for significant differences by one-way ANOVA with Dunnett’s multiple comparison test versus cells expressing WT DOT1L. b Occupancy of flag-DOT1L WT or catalytically inactive variants in CH12F3 Dot1l−/− cells at the indicated Igh locus amplicons, and intergenic region as negative (Neg) control from 3 biological replicates. P values by two-way ANOVA with Dunnett’s multiple comparison test versus EV samples (****p < 0.0001, ***p < 0.001, **p < 0.01). c Experimental set-up for measuring CSR to IgA in mouse splenic B cells treated with DMSO or different doses of the DOT1L inhibitor pinometostat. d Representative western blot of the indicated proteins in mouse B cells treated as in (c). Uncropped blots in Source Data. e Representative flow cytometry plots of surface IgA staining versus CTV dilution in B cells treated as in (c). The plot shows mean ± SD proportion of IgA+ B cells per cell division from 3 mice from two independent experiments. P values for significant differences by two-way ANOVA with Tukey’s multiple comparisons test. f Experimental set-up for measuring CSR to IgG1 in mouse splenic B cells. g Representative western blot of the indicated proteins in mouse B cells treated as in (f). h Representative flow cytometry plots of surface IgG1 staining versus CTV dilution in splenic B cells treated as in (f). The plot shows the mean ± SD proportion of IgG1+ cells per cell division from five mice from two independent experiments. P values for significant differences by two-way ANOVA with Sidak’s multiple comparisons test.
To extend these findings to primary B cells, we treated mouse splenic B cells with pinometostat during CSR to IgA, induced with a cytokine cocktail (Fig. 2c). Pinometostat depleted H3K79me2 without reducing AID protein levels (Fig. 2d), and inhibited switching in a dose-dependent manner, independent of cell proliferation effects (Fig. 2e). For CSR to IgG1, which has faster kinetics, we first stimulated splenic B cells with anti-CD180 to induce proliferation58, allowing time for H3K79me2 depletion before inducing AID and CSR with LPS and IL-4 (Fig. 2f). Again, pinometostat significantly reduced switching per cell division (Fig. 2h).
We then investigated how DOT1L is recruited to Igh. DOT1L can be recruited to chromatin via MLLT1/3 (H3K27ac readers), MLLT6/10 (H3K27 readers), or by direct interaction with phosphorylated RNAPII C-terminal domain33,46,47,48,59. DOT1L mutants unable to bind RNAPII (ΔRNAPII)59, or MLLT1/3 (ΔMllt1/3)46, fully rescued global H3K79me2 and CSR (Fig. 2a and Supplementary Fig. 3a), indicating that these interactions are individually dispensable. A mutant unable to bind MLLT6/10 (ΔMllt6/10)47,48 restored CSR to ~90% of the WT levels despite only partially restoring global H3K79me2 levels (~60%) (Fig. 2a and Supplementary Fig. 3a). This discrepancy prompted a dose-response analysis of DOT1L activity. In CH12F3 cells, increasing doses of pinometostat reduced global H3K79me2 levels to a larger extent than they reduced CSR, with 50% H3K79me2 depletion reducing CSR by ~20% (Supplementary Fig. 3b). The effect of Dot1l knockdown was also consistent with a proportional but nonlinear relationship between global H3K79me2 levels and CSR efficiency. Interestingly, loci with high H3K79me2 are more resistant to pinometostat inhibition60.
Together, these findings show that while DOT1L may associate with chromatin through multiple, partially redundant interactions (MLLTs, RNAPII), it promotes CSR via its enzymatic activity, rather than any potential scaffolding function, with the possibility that the Igh might be particularly enriched in H3K79me2.
High DOT1L activity marks and facilitates AID activity
AID’s proximity to DOT1L in HEK293T cells, which lack active Ig loci, suggested that DOT1L might facilitate off-target AID activity. To investigate this, we examined the enrichment of DOT1L-catalyzed histone marks at a curated set of AID off-target genes previously identified in activated mouse splenic B cells7,8,11,61,62, compared to a control set matched for gene length and transcriptional activity inferred from GRO-seq signal, which reflects the occupancy of transcriptionally active RNAPII (Supplementary Fig. 3c). AID off-target genes exhibited significantly higher levels of H3K79me2 and H3K79me3 downstream of the transcription start site (TSS) compared to the control set (Fig. 3a, b).
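One way to build such a matched control set is greedy nearest-neighbour matching, sketched below in Python. This section does not describe the matching procedure actually used, so the approach itself, the function name, the log-scale distance, and the toy gene values are all assumptions for illustration.

```python
import math


def matched_controls(targets, candidates):
    """Pick one control gene per target, matched on gene length and GRO-seq signal.

    targets, candidates: dict mapping gene -> (length_bp, gro_seq_signal).
    candidates must not contain target genes and must be at least as numerous.
    Matching is greedy and without replacement (a simplifying assumption).
    """
    available = dict(candidates)
    chosen = {}
    for gene, (length, signal) in targets.items():

        def distance(name):
            cand_length, cand_signal = available[name]
            # Compare on a log10 scale so fold differences are weighted equally;
            # +1 keeps the logarithm defined for genes with zero signal.
            d_length = math.log10(cand_length) - math.log10(length)
            d_signal = math.log10(cand_signal + 1) - math.log10(signal + 1)
            return d_length ** 2 + d_signal ** 2

        best = min(available, key=distance)
        chosen[gene] = best
        del available[best]  # each control gene is used only once
    return chosen


# Invented toy values: (gene length in bp, GRO-seq signal).
toy_targets = {"t1": (10_000, 50.0), "t2": (100_000, 5.0)}
toy_candidates = {"c1": (12_000, 45.0), "c2": (90_000, 6.0), "c3": (1_000, 500.0)}
toy_matches = matched_controls(toy_targets, toy_candidates)
```

Greedy matching depends on the order in which targets are processed. An optimal assignment would avoid that, at the cost of a more involved implementation.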
a Metaplots showing enrichment of each H3K79 methylation form downstream from the TSS for AID off-target genes and control group in mouse splenic B cells activated with LPS + IL-4. Half-eye plots show the distribution of average signal from the TSS to intron 2, with the median (yellow), 66% of the data (thick line), and 95% of the data (thin line). P values for significant differences by unpaired two-tailed Student’s t-test. b UCSC genome browser snapshot for representative genes of each category in (a). c Metaplots for enrichment of H3K79me2/3 downstream from the TSS for AID off-target genes and control group in CH12F3 cells. Half-eye plots show the distribution of average signal from the TSS to intron 2, with the median (yellow), 66% of the data (thick line), and 95% of the data (thin line). P values for significant differences by unpaired two-tailed Student’s t-test. d H3K79me3 levels of individual AID off-target genes within each quintile (Q1-5). Genes are color coded according to their frequency of mutation by AID in CH12F3 cells. e Metaplots of H3K79me2 and H3K79me3 distribution centered on the center of AID hotspot regions in CH12F3 cells. f Relative fraction of GFP+ CH12F3 cells measured by flow cytometry in control and DOT1L knockdown (mean ± SEM of two experiments), or in WT and DOT1L-deficient (mean ± SD of four independent clones from two experiments). Expression of AIDΔC or its catalytically inactive variant E58A was linked to GFP via an IRES. Values are normalized to the maximum GFP proportion after transduction. P values are from two-way ANOVA with Sidak’s multiple comparison test. g Trp53−/− mouse splenic B cells activated with LPS + IL-4 and treated with DMSO or 10 µM pinometostat. Relative quantification of cMyc transcripts at 48 h, proportion of surface IgG1+ cells, and frequency of Igh-cMyc fusions 72 h post-activation in 2 mice from 2 experiments.
Representative agarose gels displaying amplification products indicative of Igh-cMyc translocation are shown. Only translocations confirmed by Sanger sequencing were counted in the calculations.
Because DOT1L activity is distributive, the enrichment of higher-order H3K79 methylation marks at AID off-targets suggested locally elevated DOT1L activity relative to similarly transcribed genes, as inferred for the CH12F3 Igh.
To validate this hypothesis, we profiled genome-wide H3K79me2 and H3K79me3 distributions in CH12F3 cells. Unlike in primary B cells, the two marks substantially overlapped in CH12F3 (Supplementary Fig. 3d), likely reflecting increased DOT1L activity in this cell line. Nonetheless, both marks were significantly enriched at a published list of genomic regions frequently targeted by AID in CH12F310 compared to an equivalent GRO-seq control gene set (Fig. 3c and Supplementary Fig. 3e). Stratifying genes by H3K79me3 levels revealed that the top two quintiles contained >75% of all AID targets and ~95% of those more frequently mutated (Fig. 3d and Supplementary Fig. 3f, g). Furthermore, H3K79me2/3 peaked at AID-targeted regions (Fig. 3e).
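The quintile stratification can be sketched in a few lines of Python. This is an illustrative reconstruction rather than the analysis code used here: the function names are hypothetical, quintile 5 is assumed to be the highest H3K79me3 bin, and the toy signal values are invented.

```python
def assign_quintiles(signal):
    """Rank genes by H3K79me3 level and split them into five equal-sized bins.

    signal: dict mapping gene -> H3K79me3 level.
    Returns dict mapping gene -> quintile, from 1 (lowest) to 5 (highest).
    """
    ranked = sorted(signal, key=signal.get)
    n_genes = len(ranked)
    return {gene: (rank * 5) // n_genes + 1 for rank, gene in enumerate(ranked)}


def fraction_in_top_quintiles(quintiles, targets, top=(4, 5)):
    """Fraction of target genes (among those with a quintile) falling in `top`."""
    scored = [gene for gene in targets if gene in quintiles]
    if not scored:
        return 0.0
    hits = sum(1 for gene in scored if quintiles[gene] in top)
    return hits / len(scored)


# Invented toy data: ten genes with increasing signal, four of them AID targets.
toy_signal = {f"g{i}": float(i) for i in range(10)}
toy_quintiles = assign_quintiles(toy_signal)
toy_fraction = fraction_in_top_quintiles(toy_quintiles, {"g9", "g8", "g7", "g1"})
```

With these toy values, three of the four targets fall in the top two quintiles.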
To test whether DOT1L activity was required for off-target AID activity, we employed a cell fitness assay using a hyperactive AID variant (AIDΔC)13. Retrovirally transduced AIDΔC linked to GFP induces genome-wide DNA damage and genotoxicity, resulting in selective loss of GFP+ cells over time. Accordingly, the catalytically inactive AIDΔC E58A mutant does not affect cell fitness. GFP+ cells declined more slowly in DOT1L-deficient CH12F3 cells compared to WT, indicating reduced AIDΔC activity (Fig. 3f).
To assess DOT1L’s role in AID off-target activity in primary B cells, we measured AID-dependent chromosomal translocations between Igh and cMyc12. Using Trp53−/− B cells to enhance detection sensitivity63, we found that treatment with pinometostat reduced cMyc expression by ~10% while reducing CSR to IgG1 by ~40% and Igh-cMyc translocations by ~75% without altering breakpoint location preference (Fig. 3g and Supplementary Fig. 3h).
We conclude that DOT1L facilitates AID activity at off-target genomic loci characterized by elevated H3K79me3 levels, reflecting locally increased DOT1L activity relative to non-targeted genes with comparable levels of transcribing RNAPII.
Expression changes from DOT1L loss cannot explain impaired AID activity
Although H3K79me2/3 is typically associated with actively transcribed genes, DOT1L deficiency does not uniformly suppress their expression and often leads to gene upregulation22,29,31. To investigate this in B cells, we performed RNA-seq in Dot1l−/− CH12F3 cells and reanalyzed published RNA-seq data from activated splenic B cells24.
CH12F3 Dot1l−/− cells showed 1037 downregulated and 1483 upregulated genes, whereas Dot1l−/− mouse B cells displayed 1232 downregulated and 2639 upregulated genes, with most gene expression remaining unchanged across all H3K79me3 quintiles in either system (Fig. 4a and Supplementary Fig. 4a). Downregulated genes tended to be highly expressed and longer, while shorter and lowly expressed genes were more commonly upregulated (Supplementary Fig. 4b, c), consistent with trends observed in other DOT1L-deficient systems24,30,64. Upregulated genes predominantly overlapped with active chromatin states and were depleted of the repressive mark H3K27me3 (Supplementary Fig. 4d), ruling out upregulation due to reduced EZH2 function24,65.
a MA plot of mRNA level differences between WT and Dot1l−/− CH12F3, by RNA-seq, showing up- (red) and down- (blue) regulated genes. Stacked bar plot shows gene expression changes per H3K79me3 quintile. b Number of AID off-target genes with altered expression by RNA-seq in each system. c RNA-seq of all Igh GLTUs from WT or Dot1l−/− murine activated B cells and CH12F3 cells. Significant (adjusted p value < 0.1) upregulation (red) or downregulation (blue) is indicated. d GLT levels for all Igh GLTUs in murine activated B cells stimulated for CSR to IgG1, measured by RT-qPCR and plotted as fold change over DMSO controls from 3 biological replicates. e Representative flow cytometry of the proportion of IgG3+ or IgG2b+ mouse B cells stimulated as in (d) and treated with DMSO or 10 µM pinometostat. Bar plots show means and individual mouse values (symbols) for IgG3, or mean ± SD per cell division determined by CTV staining for 3 mice for IgG2b. f GLTU transcript levels of unstimulated WT and DOT1L-deficient CH12F3 cells by RT-qPCR, analyzed as in (d) from 4-6 biological replicates. g UCSC genome browser snapshot of Igh transcripts by RNA-seq and H3K79me2, H3K79me3, RAD21, NIPBL, and CTCF signals in WT and Dot1l−/− CH12F3 cells. CBE, CTCF binding element; 3’ RR, 3’ regulatory region enhancer. h Fold change of each GLT in Dot1l−/− CH12F3 cells expressing WT DOT1L, variants thereof, or empty vector (EV). GLT by RT-qPCR was normalized to cells complemented with WT DOT1L from 3 to 4 biological replicates. i Metaplot of H3K79me2/3 distribution over architectural stripes in CH12F3 cells, aligned to the strongest stripe anchor (i.e., highest RAD21 peak). H3K79me2/3 and CTCF levels for the equivalent Igh region and orientation are shown at the top. j Metaplots showing the distribution of RAD21, NIPBL, and CTCF levels at stripes in Dot1l−/− and WT CH12F3 cells. The last metaplot shows the signal difference in Dot1l−/− minus WT cells.
d–f, h P values for significant differences by two-way mixed ANOVA with Sidak’s multiple comparisons test.
The expression of most AID off-target genes remained unchanged in DOT1L-deficient cells, while a minority were either downregulated (e.g., cMyc, Pax5) or upregulated (e.g., Ly6e, Il21, Bcl6) (Fig. 4b). Neither of the S-regions engaged in CSR (Ighm and Igha in CH12F3, Ighm and Ighg1 in LPS/IL-4 B cells) was downregulated in the Dot1l−/− B cell RNA-seq data (Fig. 4c). Importantly, transcript levels of genes encoding factors implicated in SHM and CSR showed either minimal or no significant changes in Dot1l−/− B cells (Supplementary Fig. 4e).
Thus, gene expression changes in DOT1L-deficient CH12F3 and B cells follow the pattern observed in other cell types and do not account for reduced CSR or diminished AID mutagenesis.
DOT1L deficiency uncouples germline transcription from CSR
Cytokine-induced GLT expression is the primary determinant guiding AID activity to specific S-regions. However, DOT1L-deficient B cells exhibited a dissociation between GLT and CSR. Specifically, GLTUs that are weakly induced by LPS/IL-4 in WT cells, e.g., Sγ3 and Sγ2b, were upregulated in activated Dot1l−/− B cells (Fig. 4c, Supplementary Fig. 5a). This finding was validated in primary B cells treated with pinometostat under similar stimulation conditions (Fig. 4d).
Notably, the Sγ2b GLT increased nearly 5-fold in pinometostat-treated cells, reaching absolute transcript levels comparable to Sγ1 (Fig. 4c, d). Yet, switching to IgG3 was not significantly higher, and switching to IgG2b was markedly impaired (Fig. 4e). Using IgA-switching conditions showed similar results: pinometostat increased Sγ3 GLT ~3-fold without corresponding increases in IgG3 switching (Supplementary Fig. 5b). Moreover, GLT deregulation was evident even in unstimulated naïve splenic Dot1l−/− B cells (Supplementary Fig. 5a), indicating that this effect was independent of cytokine signaling and reflected intrinsic deregulation of the Igh locus in the absence of DOT1L activity.
GLT deregulation was also evident in DOT1L-deficient CH12F3 cells. While Sμ GLTs were unchanged and Sα GLTs were modestly reduced in one clone but not the other, several internal GLTUs (e.g., Sγ3, Sγ2b, Sγ2c) were consistently upregulated (Fig. 4f, g and Supplementary Fig. 5c). Although both clones showed a slight increase in IgG3 switching (Supplementary Fig. 5d), this was not proportional to the Sγ3 upregulation.
Importantly, GLT deregulation was not due to long-term adaptation to DOT1L loss but to the absence of DOT1L enzymatic activity. Re-expression of WT DOT1L or methyltransferase-competent DOT1L mutants restored Sγ3 and Sγ2b GLT levels, whereas catalytically inactive variants did not (Fig. 4h and Supplementary Fig. 5e).
These findings demonstrate that GLT levels alone do not dictate optimal AID activity but require a DOT1L-dependent transcriptional environment that supports AID function, beyond regulating transcriptional output.
DOT1L activity supports cohesin trafficking at chromatin stripes
Although GLT deregulation was consistently observed across all DOT1L-deficient systems, the specific S-regions affected and extent of deregulation varied (Fig. 4c, d, f and Supplementary Fig. 5c). For instance, Igha GLT was upregulated in primary B cells but not CH12F3, while Ighg3 and Ighg2b were consistently upregulated, though to varying degrees. Moreover, these changes in GLT expression did not correlate with local H3K79me3 levels, which were very low at unscheduled GLTUs (Fig. 4g and Supplementary Fig. 5a), suggesting an indirect effect of DOT1L loss.
The Igh region spanning the Eµ enhancer to the 3’ CTCF binding elements (CBE) showed very high H3K79me2/3 at these boundaries and overall higher H3K79me2/3 than flanking regions (Fig. 4g and Supplementary Fig. 5a). This region overlaps with the Igh super-enhancer7,8 and aligns with an architectural stripe, a chromosomal region undergoing asymmetric cohesin-mediated chromatin loop extrusion essential for CSR4,66. Genomic datasets from primary B cells showed H3K79me2 enrichment both in super-enhancers and in architectural stripes, though it overlapped H3K27ac with distinct patterns: distributed uniformly over super-enhancers but asymmetrically in architectural stripes, peaking near the strongest anchor (Supplementary Fig. 5f). CH12F3 cells showed a similar H3K79me2/3 pattern within stripes (Fig. 4i).
Despite the H3K79me2 enrichment, Igh enhancer RNA showed minimal changes in DOT1L-deficient CH12F3 cells, which were unlikely to explain CSR or GLT deregulation (Supplementary Fig. 5g). As a proxy for loop extrusion, we profiled genomic occupancy of cohesin components RAD21 and NIPBL, and the anchor protein CTCF. Dot1l−/− CH12F3 cells showed modest reductions in RAD21 and CTCF occupancy, especially at the strongest anchor, which was adjacent to but did not overlap with H3K79me2/3 peaks (Fig. 4i, j). Additionally, RAD21 was increased at NIPBL peaks (cohesin loading sites) and reduced at CTCF-defined anchors genome-wide, including at the Igh (Supplementary Fig. 5h, i and Fig. 4g). These modest alterations were reminiscent of impaired cohesin trafficking induced by ATP depletion in B cells66. Accordingly, DOT1L-deficient CH12F3 cells and pinometostat-treated primary B cells showed reduced microhomology usage at Sμ–Sα junctions (Supplementary Fig. 5j), a process regulated by cohesin67.
Together, these results show that H3K79me2 marks architectural stripes and may aid cohesin trafficking, whereby DOT1L deficiency might indirectly affect the H3K79me2/3-poor unscheduled GLTUs. However, the modest impact of DOT1L loss on cohesin distribution is unlikely to explain reduced AID activity.
DOT1L slows transcription elongation and extends RNAPII pausing
Since transcription impacts chromatin loop extrusion68, we hypothesized that transcriptional alterations at H3K79me2/3-rich regions might underlie cohesin distribution changes in DOT1L-deficient B cells. We analyzed transcriptional dynamics by transient transcriptome sequencing (TT-seq), which uses a short pulse of 4-thiouridine (4SU) to label and measure nascent RNA69.
DOT1L-deficient live cells showed increased 4SU incorporation, suggesting globally elevated nascent transcription, as confirmed by sequencing with spike-in normalization (Fig. 5a, Supplementary Fig. 6a). The increase in nascent RNA levels in Dot1l−/− cells positively correlated with H3K79me3 quintiles in WT cells, implicating a direct, mark-dependent effect (Fig. 5a). Notably, this increase extended beyond H3K79me2/3 peaks, encompassing entire gene bodies (Fig. 5a). Upregulation was predominantly observed in sense transcripts, with only a minor increase in antisense RNA, which typically originates from regions lacking H3K79 methylation (Fig. 5a). These results indicate a direct suppressive role for DOT1L on nascent transcription across methylated gene bodies.
a Metagene plots showing average ChIP-seq coverage of H3K79me3, TT-seq, RNAPII, S2P-RNAPII and S5P-RNAPII signals for genes belonging to each H3K79me3 quintile (Q1-5) in WT and Dot1l−/− CH12F3 cells. b Metagene plots as in (a) for the −1 to +3 kb region around the TSS of Q1 genes. The TSS and the peak H3K79me3 regions are shaded. c Quantification of RNAPII, S2P-RNAPII and S5P-RNAPII ChIP-seq signals in the TSS −0.1 kb to TSS + 0.3 kb region of Q1 genes (n = 2266) and H3K79me3 peak (TSS + 0.5 kb to TSS + 3 kb). P values for significant differences by paired two-sided Wilcoxon rank sum test. d Inferred RNAPII promoter-proximal pause duration in WT and Dot1l−/− CH12F3 cells for genes in H3K79me3 Q1 (n = 1378), Q2 (n = 1383) and Q3 (n = 861). P values by Kruskal–Wallis test with paired two-sided Wilcoxon rank sum test for multiple comparisons. e Metaplots showing elongation velocity calculated as the ratio of TT-seq to RNAPII ChIP-seq signals for genes in Q1-3 in WT and Dot1l−/− CH12F3 cells. f Metaplots overlaying elongation velocity changes in Dot1l−/− compared to WT CH12F3 cells, with H3K79me3 distribution in WT for genes in quintiles Q1-3. a, b, e, f Metaplots show 5% trimmed means. c, d In box and whisker plots, the line inside the box represents the median; bottom and top edges of the box indicate the 25th and 75th percentiles; and whiskers extend to the largest value up to 1.5 times the inter-quartile range.
To understand this increase, we performed ChIP-seq for RNAPII and its C-terminal domain phosphorylation marks at Ser2 (S2P-RNAPII), marking elongation, and Ser5 (S5P-RNAPII), associated with initiation but also promoter-proximal and internal pausing70. Western blotting excluded global changes in total RNAPII or its phosphorylated forms (Supplementary Fig. 6b). Metagene analysis revealed modest but significant reductions in total RNAPII, S2P-RNAPII, and S5P-RNAPII at promoter-proximal regions and over the H3K79me3-enriched sites in Dot1l−/− CH12F3 cells (Fig. 5a–c). These subtle reductions were confirmed by ChIP-qPCR at selected high H3K79me3 (Q1) genes (Supplementary Fig. 6c), with the limitation that this technique has lower statistical power to detect small variations than the aggregated genomic data.
Given increased nascent transcription, reduced S5P-RNAPII was not consistent with transcription initiation defects, suggesting instead increased release from promoter-proximal pausing71. While the traditional RNAPII pausing index was not significantly altered (Supplementary Fig. 6d), this metric lacks temporal resolution. We therefore estimated RNAPII pause duration by integrating TT-seq and ChIP-seq data72, which revealed a significant reduction in RNAPII pause duration in Dot1l−/− cells (Fig. 5d). While this result could indicate enhanced pause release or increased early termination, the former is more likely given the increased nascent transcription.
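The logic behind a pause-duration estimate can be illustrated with a minimal sketch. It assumes, as a simplification, that pause duration scales as RNAPII occupancy in the promoter-proximal window divided by the rate at which polymerases leave that window, with the latter approximated from TT-seq signal. The function name, the input values, and the arbitrary units are hypothetical, and the cited method72 includes calibration steps that are not reproduced here.

```python
def pause_duration(pause_window_occupancy, initiation_frequency):
    """Relative pause duration in arbitrary units.

    Assumption: dwell time in the pause window scales as
    occupancy / flux, where flux is a TT-seq-derived proxy for
    the rate at which polymerases enter productive elongation.
    """
    if initiation_frequency <= 0:
        raise ValueError("initiation frequency must be positive")
    return pause_window_occupancy / initiation_frequency


# Hypothetical values for a single gene (not taken from the study).
wt = pause_duration(pause_window_occupancy=10.0, initiation_frequency=2.0)
ko = pause_duration(pause_window_occupancy=9.0, initiation_frequency=3.0)
print(wt, ko)
```

Under this assumption, slightly lower occupancy combined with higher nascent output gives a smaller ratio, which is the pattern interpreted here as a shorter pause.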
The ratio of nascent RNA production (TT-seq) to RNAPII occupancy (ChIP-seq) yields transcription elongation velocity72, which revealed significantly increased elongation velocity in Dot1l−/− cells, most pronounced within the TSS + 1–4 kb region corresponding to the H3K79me2/3 peak (Fig. 5e and Supplementary Fig. 6e). Faster elongation could also explain reduced S2P-RNAPII downstream from the TSS (Fig. 5a–e) and increased RNAPII occupancy downstream of transcription termination sites (TTS) in Dot1l−/− cells73 (Supplementary Fig. 6f). As in other systems27,28, higher H3K79me2/3 was associated with faster elongation in CH12F3 cells. Yet, DOT1L deficiency further increased elongation velocity in these regions, suggesting that H3K79 methylation restricts elongation velocity under physiological conditions. Indeed, the magnitude of the increase in velocity in Dot1l−/− cells steadily followed H3K79me3 levels in WT cells, strongly suggesting a direct role for H3K79me2/3 in limiting RNAPII elongation velocity (Fig. 5f).
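As a rough illustration of the ratio described above, the sketch below computes a per-bin velocity proxy and a 5% trimmed mean across genes, as used for the metaplots. The binning, function names, and input values are hypothetical; normalization and spike-in scaling are assumed to have been applied upstream and are not shown.

```python
def elongation_velocity(tt_seq, rnapii):
    """Per-bin velocity proxy: nascent RNA signal / RNAPII occupancy.

    Bins without RNAPII signal are returned as NaN rather than guessed.
    """
    return [t / r if r > 0 else float("nan") for t, r in zip(tt_seq, rnapii)]


def trimmed_mean(values, trim=0.05):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return sum(kept) / len(kept)


def metaprofile(per_gene_profiles, trim=0.05):
    """Trimmed mean across genes at each bin position."""
    return [trimmed_mean(column, trim) for column in zip(*per_gene_profiles)]


# Hypothetical binned signals for one gene (arbitrary units).
wt_velocity = elongation_velocity([4.0, 6.0, 8.0], [2.0, 2.0, 4.0])
ko_velocity = elongation_velocity([6.0, 9.0, 10.0], [2.0, 1.5, 4.0])
print(wt_velocity)  # [2.0, 3.0, 2.0]
print(ko_velocity)  # [3.0, 6.0, 2.5]
```

In this toy case the proxy rises where nascent signal increases while occupancy stays flat or falls, which mirrors the combination reported for H3K79me2/3-rich regions.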
We conclude that DOT1L suppresses nascent transcription by limiting RNAPII elongation velocity proportionally to H3K79me2/3 levels, which likely prolongs RNAPII promoter-proximal pausing, as suggested by other systems in which faster elongation reduces pausing74.
Transcription kinetics correlate with expression changes in Dot1l−/−
To test if altered transcriptional kinetics could explain gene expression changes in Dot1l−/− cells, we compared nascent transcriptional activity to steady-state transcript levels. Most genes with significant nascent transcription changes in Dot1l−/− cells showed increased TT-seq (n = 8655), comprising a greater proportion of each H3K79me3 quintile as methylation increased (Fig. 6a and Supplementary Fig. 6g). While modest TT-seq increases did not always significantly change steady-state mRNA levels, larger increases correlated well with the mRNA increase of upregulated genes (Fig. 6b and Supplementary Fig. 6h). Similarly, downregulated genes were largely within a smaller subset of genes (n = 1192) with reduced nascent transcription (Fig. 6b and Supplementary Fig. 6g, h).
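The comparison between nascent and steady-state changes rests on a rank correlation. A minimal sketch of a Spearman coefficient is given below, computed as the Pearson correlation of average ranks; the fold-change values are hypothetical and the helper names are illustrative, not taken from the analysis pipeline.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical log2 fold changes (Dot1l-/- over WT) for five genes.
tt_seq_lfc = [0.2, 0.5, 1.1, -0.4, 1.8]
rna_seq_lfc = [0.0, 0.3, 0.9, -0.6, 0.7]
rho = spearman(tt_seq_lfc, rna_seq_lfc)
print(round(rho, 3))
```

A rank-based measure is a natural choice here because it does not require the two assays to be on the same scale.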
a Stacked bar plots per H3K79me3 quintile showing number of genes with significant changes in nascent transcription, as measured by TT-seq in Dot1l−/− CH12F3 compared to WT cells. b Correlation between TT-seq and RNA-seq log2 fold changes (Dot1l−/− over WT) in CH12F3 cells. Each dot represents a gene, color-coded by expression change in Dot1l−/− versus WT cells by RNA-seq. Spearman correlation coefficient shown. c Metagene plots showing 5% trimmed means of ChIP-seq coverage of H3K79me3, TT-seq, RNAPII, S2P-RNAPII and S5P-RNAPII signals in WT and Dot1l−/− CH12F3 cells for genes categorized by their gene expression change. d Metaplots showing average elongation velocities for genes categorized as in (c). e Quantification for average elongation velocities at indicated gene segments downstream from the TSS for genes categorized as in (c). f Inferred RNAPII promoter-proximal pause duration for genes categorized as in (c). e, f In box and whisker plots, the line inside the box represents the median; bottom and top edges of the box indicate the 25th and 75th percentiles; and whiskers extend to the largest value up to 1.5 times the inter-quartile range. P values by Kruskal–Wallis test with paired two-sided Wilcoxon rank sum test for multiple comparisons.
Metagene analyses revealed distinct transcriptional profiles for up- and downregulated genes. Upregulated genes had the lowest transcriptional activity, slowest elongation velocities, and longest RNAPII pausing in WT cells (Fig. 6c–f). Conversely, downregulated genes were characterized by the highest levels of nascent transcription, fastest elongation velocities, and shortest RNAPII pause durations (Fig. 6c–f).
Importantly, DOT1L deficiency caused increased elongation velocity across all gene categories, including those downregulated (Fig. 6d, e). In contrast, pause duration was reduced in upregulated genes but unchanged in downregulated ones (Fig. 6f).
Thus, increased elongation velocity is a consistent outcome of DOT1L loss, yet transcriptional consequences depend on the gene’s basal activity. In less active genes, higher elongation velocity enhances productive transcription, while in highly transcribed genes the additional increase in speed reduces transcriptional output. This provides a mechanistic explanation for the long-standing observation that DOT1L deficiency can lead to gene upregulation.
DOT1L regulates transcription kinetics at AID target genes
As observed genome-wide, most AID targets in CH12F3 cells (66 of 78 genes) were upregulated or unchanged in Dot1l−/− cells and exhibited increased nascent transcription in Dot1l−/− compared to WT cells (Fig. 7a and Supplementary Fig. 7a). A minority (12 genes) was downregulated, corresponding with reduced TT-seq signal. Regardless of expression changes, transcription elongation velocity was consistently increased in AID off-targets, closely overlapping H3K79me3 abundance, again suggesting a direct effect of the mark (Fig. 7b, c). This increase was accompanied by a shorter RNAPII pause duration (Supplementary Fig. 7b). RNAPII distribution was largely similar for these genes in Dot1l−/− and WT cells. However, S2P-RNAPII at TSS was significantly reduced in DOT1L-deficient cells (Fig. 7d), as observed genome-wide. These genes also showed greater and more extended total RNAPII accumulation downstream of TTS (Supplementary Fig. 7c), consistent with accelerated elongation. Moreover, S5P-RNAPII occupancy was significantly reduced both at the TSS, evidencing decreased RNAPII promoter-proximal pausing, and within the gene bodies in the region overlapping H3K79me3 marks (Fig. 7d). Intragenic S5P-RNAPII accumulation likely reflects stalled RNAPII, which has been linked to AID activity15,17,75,76,77.
a TT-seq log2 fold changes (Dot1l−/− over WT) at AID target genes categorized according to their differential expression by RNA-seq in CH12F3 cells. P values by one-sample two-sided Wilcoxon test, comparing the mean of distributions to 0. b Metaplots showing average elongation velocities for AID target genes in WT and Dot1l−/− CH12F3 cells. Box plots show quantification of average elongation velocities in the 3 kb region from TSS + 1 kb to TSS + 4 kb for AID target genes. P values by paired two-sided Wilcoxon rank sum test. c Metaplots overlaying elongation velocity changes in Dot1l−/− compared to WT CH12F3 cells, with H3K79me3 distribution in WT for AID target genes. d Metagene plots showing average ChIP-seq coverage of RNAPII, S2P-RNAPII, and S5P-RNAPII signals in WT and Dot1l−/− CH12F3 cells for AID target genes. Box plots show quantification of total RNAPII, S2P-RNAPII, and S5P-RNAPII ChIP-seq signals at the TSS (from TSS − 0.1 kb to TSS + 0.3 kb) and at the peak of H3K79me3 mark (TSS + 0.5 kb to TSS + 3 kb) in AID target genes (n = 63). P values by paired two-sided Wilcoxon rank sum test. b–d Metaplots show 5% trimmed means. a, b, d In box and whisker plots, the line inside the box represents the median; bottom and top edges of the box indicate the 25th and 75th percentiles; and whiskers extend to the largest value up to 1.5 times the inter-quartile range. e UCSC snapshot of H3K79me2, H3K79me3, RNAPII elongation velocity, TT-seq, RNAPII, S2P-RNAPII, and S5P-RNAPII ChIP-seq in WT and Dot1l−/− CH12F3 cells, as well as signal difference between Dot1l−/− and WT at the Sμ and Sα GLTUs of the Igh. f AID occupancy by ChIP-qPCR at the indicated Igh positions for WT and Dot1l−/− CH12F3 cells at 16 h post-CIT. Mean (bars) and individual values (symbols) for n = 3 biological replicates. g As in f for selected AID off-targets. Mean (bars) and individual values (symbols) for n = 3 biological replicates. h Model for DOT1L function in limiting RNAPII elongation velocity, RNAPII pausing, and AID occupancy. f, g P values for significant differences by one-way ANOVA with Dunnett’s multiple comparison test.
Thus, DOT1L deficiency increases elongation velocity, decreases promoter-proximal pausing, and reduces RNAPII stalling within the gene bodies of AID off-targets, similarly to other H3K79me2/3-enriched genes.
Faster elongation without DOT1L lowers AID occupancy
We next examined transcriptional changes at the Igh locus in Dot1l−/− CH12F3 cells. The unscheduled GLTUs that have little H3K79me2/3 and are upregulated in Dot1l−/− cells (Sγ2b, Sγ2a, Sγ3) showed modest increases in TT-seq, RNAPII, and S2P-RNAPII, with no significant changes in S5P-RNAPII (Supplementary Fig. 7d, e). Transcription velocity increased for Sγ2b and Sγ2a, which were essentially silent in WT cells, but was unchanged for Sγ3, which is transcribed in WT cells (Supplementary Fig. 7e). This is consistent with our interpretation that changes outside H3K79me2/3-enriched regions are secondary to cohesin trafficking alterations driven by transcription deregulation in the H3K79me2/3-enriched Igh regions.
In contrast, the Sµ and Sα GLTUs enriched in H3K79me2/3 and engaged in switching exhibited transcriptional changes characteristic of most H3K79me2/3-rich genes: increased TT-seq and modestly reduced RNAPII occupancy, indicating increased elongation velocity (Fig. 7e and Supplementary Fig. 7d). It is noteworthy that RNAPII and S5P-RNAPII signals in WT cells were broadly distributed throughout the Sµ and Sα GLTUs without any accumulation at the TSS (Fig. 7e), consistent with transcription proceeding with little canonical promoter-proximal pausing and instead featuring frequent RNAPII stalling linked to AID recruitment15,17,75,76,77. This S5P-RNAPII signal was reduced at the Sµ and Sα of DOT1L-deficient cells (Fig. 7e and Supplementary Fig. 7d).
To test whether DOT1L affected AID occupancy, we performed ChIP-qPCR at Sµ on two DOT1L-deficient clones at 16 h post-CIT, when AID occupancy peaks in CH12F3 cells. AID occupancy was significantly reduced at both the promoter and internal Sµ region, as well as at Sα (Fig. 7f and Supplementary Fig. 7f). AID occupancy was also significantly reduced at AID off-target genes cMyc and Ly6e (Fig. 7g), despite cMyc being only modestly downregulated and Ly6e upregulated in Dot1l−/− cells, uncoupling transcription output from AID occupancy.
Collectively, our data support a model in which DOT1L-catalyzed H3K79me2/3 limits RNAPII elongation velocity in a dose-dependent manner, thereby prolonging RNAPII pausing. This extended dwell time of RNAPII at the TSS and internal stalling sites likely facilitates AID recruitment and activity (Fig. 7h).
Discussion
We report that DOT1L and the SEC contribute independently to CSR, significantly extending previous observations that implicated DOT1L in CSR and SHM24,25. We show that DOT1L facilitates AID activity during CSR through its methyltransferase activity, rather than by a scaffolding role35,36,37. While DOT1L can form complexes with several histone readers42,43,46,47,48, its function in CSR is epistatic with its canonical partner MLLT10. We also provide mechanistic insight into how DOT1L and H3K79me2/3 contribute to transcription regulation, thereby fine-tuning gene expression and facilitating AID activity at selected loci.
Our data indicate that DOT1L activity, likely via H3K79me2/3, limits RNAPII elongation velocity. This is supported by elevated levels of nascent transcription in most H3K79me2/3-rich genes, accompanied by a modest decrease in RNAPII occupancy and increased readthrough at the TTS73,74. While nascent transcription is uniformly elevated throughout the gene bodies in DOT1L-deficient cells, the increase in elongation velocity coincides spatially with, and quantitatively correlates with, H3K79me2/3. Notably, elongation velocity increases regardless of the effect of DOT1L deficiency on gene expression. These findings are consistent with H3K79me2/3 locally restraining RNAPII elongation velocity. The exact mechanism remains speculative. H3K79me2/3 may alter nucleosome dynamics to create a more resistant chromatin environment, increasing pausing or backtracking. Alternatively, it may affect recruitment or activity of other factors that indirectly modulate RNAPII velocity, such as the H3K79me2 reader Menin60,78.
According to our data, the enrichment of H3K79me2/3 at genes with high transcription elongation rate26,27,28 does not reflect a causative role in promoting elongation speed. Since DOT1L likely methylates co-transcriptionally29, it rather suggests that DOT1L activity provides negative feedback to transcription. This aligns with work showing that DOT1L inhibition in MEFs led to increased 5-ethynyl uridine (EU) incorporation after DRB release, measured by flow cytometry, while DOT1L overexpression in mouse ESCs reduced EU incorporation32. Similarly, C. elegans larvae carrying a mutated MLLT10 that cannot recruit DOT1L to chromatin exhibited globally increased GRO-seq signal33. In both studies, analysis of selected DOT1L target genes and RNAPII traveling ratio suggested that DOT1L deficiency reduced promoter-proximal pausing32,33. However, these studies lacked the spatial and temporal resolution to assess transcription kinetics. Moreover, consistent with our observations, others have reported no significant changes in pausing index in DOT1L-deficient human leukemia cells34. Our findings could harmonize these apparent discrepancies. In contrast to changes in elongation velocity, which dovetail with H3K79me3 abundance, the TSS and promoter-proximal pausing regions are relatively poor in H3K79me2/3. We propose that increased elongation velocity in DOT1L-deficient cells shortens pause duration through more frequent RNAPII release from the paused state. Supporting this idea, Arabidopsis thaliana cells engineered to express a faster-than-WT RNAPII exhibit reduced promoter-proximal RNAPII occupancy74. This indirect effect may sometimes manifest as a detectable reduction in traveling ratio32,33 or not, as in DOT1L-deficient CH12F3 and leukemia cells34.
Our model could explain the long-standing paradox of DOT1L-deficient cells exhibiting bidirectional changes in gene expression, despite the correlation between H3K79me2/3 and highly active transcription22. In CH12F3 Dot1l−/− cells, nascent RNA levels correlate with steady-state transcript changes and are better explained by faster elongation, a feature common to all affected genes, than by changes in promoter-proximal pausing, which do not affect downregulated genes.
As the modest increase in elongation velocity in DOT1L-deficient cells is a deviation from normal transcription, it is reasonable to assume that limiting elongation has some functional benefit. One possibility is that transcription without DOT1L might not always lead to mature mRNA, and therefore the TT-seq signal includes both productive and non-productive transcription. We propose that the consequences of increased elongation velocity resulting from DOT1L loss vary according to gene expression levels and gene length. Genes significantly upregulated in DOT1L-deficient cells tend to be shorter, have the lowest elongation velocities and transcript abundance in WT cells, as well as the longest pausing duration. In these conditions, transcription increase, even if not always productive, would produce detectable gains in gene transcript abundance. Genes with unchanged expression fall within intermediate expression ranges, where modest gains in nascent transcription are insufficient to alter steady-state mRNA levels, possibly because productive and non-productive transcripts offset each other. Additionally, cellular mRNA buffering mechanisms, whereby changes in transcription or decay trigger compensatory responses, may further stabilize RNA levels79. In contrast, a small subset of genes that in WT cells show the highest elongation velocities, short pausing durations, and elevated H3K79me3 enrichment, may already be operating near the maximal transcriptional output and might not tolerate further increases in elongation velocity upon DOT1L deficiency, thus resulting in reduced mRNA abundance.
We have not addressed the mechanism by which faster elongation downregulates this small subset of genes in DOT1L-deficient cells. While we cannot rule out a defect in transcription initiation in these genes, it seems more likely that RNAPII deceleration over H3K79me2/3-rich regions has a regulatory role. One possibility is that DOT1L supports a transcriptional checkpoint, potentially by influencing promoter-proximal pausing. However, this scenario is not consistent with our data, as downregulated genes showed unchanged pausing in DOT1L-deficient CH12F3 cells. Alternatively, RNAPII deceleration where H3K79 methylation peaks downstream from the TSS may facilitate coupling with processivity or splicing factors80. As a further alternative, excess nascent RNA may disrupt elongation dynamics in DOT1L-deficient cells, for instance by triggering condensate dissolution81. In both of these models, longer genes are more vulnerable to early termination or reduced processivity, consistent with findings in DOT1L-deficient CH12F3, primary B cells, and embryonic stem cells37.
While our findings emphasize DOT1L’s catalytic role and high-order H3K79 methylation in facilitating AID activity and regulating gene expression, DOT1L may also regulate gene expression through non-enzymatic mechanisms in specific settings35,36,37, which merit further investigation.
AID-targeted regions are encompassed within H3K79me2/3-enriched regions, where DOT1L regulates elongation velocity. Despite increased transcription in DOT1L-deficient cells, AID occupancy is reduced, uncoupling transcription output from AID activity. Instead, the evidence supports a direct role for DOT1L activity in facilitating AID activity: (1) The expression of AID and most genes encoding CSR factors remains intact in DOT1L-deficient cells. Similarly, the events downstream from the DSB are functional. (2) Igh and off-target loci exhibit high DOT1L activity, reflected in H3K79me3 enrichment. Accordingly, inactive DOT1L fails to rescue CSR despite localizing to the S-regions. (3) The genomic regions mutated by AID overlap with H3K79me2/3-enriched regions where elongation velocity increases in DOT1L-deficient cells. (4) The same regions show reduced S5P-RNAPII and reduced AID occupancy in the absence of DOT1L. This suggests that faster elongation in the absence of DOT1L reduces RNAPII stalling over intragenic regions where AID would normally deaminate. This is consistent with the link between AID activity and RNAPII stalling within S-regions15,17,75,76,77.
Our findings support a model in which DOT1L moderates transcription elongation velocity, thus facilitating RNAPII pausing and stalling, which stabilize AID association with the target loci to facilitate its activity. This mechanism parallels recent findings on ELOF1, a transcription-coupled repair factor that also facilitates AID activity, likely by stabilizing paused RNAPII82,83. Although DOT1L and ELOF1 act through distinct mechanisms (ELOF1 loss consistently reduces transcription), both favor RNAPII pausing, creating a transcriptional context that favors deamination despite AID’s low catalytic efficiency84. These findings strengthen a model in which productive AID targeting is determined by a specific transcriptional context found at the Ig and a limited number of other loci, thus sparing most of the genome from AID.
Methods
Cell culture
Flp-In T-REx 293 cells (Invitrogen cat: R78007) and HEK293T cells were cultured in DMEM media (Wisent) supplemented with 10% fetal bovine serum (FBS; Wisent) and 1% penicillin/streptomycin (Wisent). CH12F3 cells (a gift from Dr T. Honjo, Kyoto University)50 and CH12F3-G (a subclone with higher CSR activity derived by limiting dilution from CH12F3) were cultured in RPMI 1640 media (Wisent) supplemented with 10% fetal bovine serum (Wisent), 1% penicillin/streptomycin (Wisent), and 0.1 mM 2-mercaptoethanol (Bioshop). All cells were grown at 37 °C with 5% (vol vol−1) CO2.
Animals
Primary mouse B cells were purified from splenocytes obtained from C57BL/6J or C57BL/6J Trp53−/− (IMSR_JAX:002101) mice housed in an SPF+ facility under controlled environmental conditions. Animals were maintained on a 12:12 h light/dark cycle, with an ambient temperature of 20–24 °C and relative humidity of 40–60%. Mice were euthanized by CO2 chamber, and splenocytes were obtained by mashing the spleen. Naïve B cells were purified using EasySep™ Mouse B Cell Isolation Kit (STEMCELL Technologies) and were cultured with RPMI 1640 (Wisent), supplemented with 10% fetal bovine serum (Wisent), 1% penicillin/streptomycin (Wisent), 0.1 mM 2-mercaptoethanol (Bioshop), 10 mM HEPES, and 1 mM sodium pyruvate. Work was reviewed and approved by the IRCM animal protection committee (protocol 2019-05) in accordance with the guidelines from the Canadian Council of Animal Care.
Antibodies
Details about antibodies used are in Supplementary Table 1. While the anti-H3K79me3 antibody used here and for the ChIP-seq data we reanalyzed66 displays some cross-reactivity with an H3K79me2 peptide, a trimethylated peptide was 10-fold more efficient at competing for binding than a dimethylated one, indicating a strong preference for H3K79me385. Given this difference, we do not expect the H3K79me3 profile to be overly influenced by H3K79me2.
Immunofluorescence
Cells were cultured on coverslips in 24-well plates with 1 µg mL−1 tetracycline for 24 h, followed by 50 ng mL−1 Leptomycin B (LC laboratories cat no. L-6100) and/or 100 nM Didemnin B (NCI) for 2 h. Cells were fixed in 3.7% PFA (Sigma) for 10 min at RT, washed 3× with PBS, and blocked with 0.5% Triton X-100, 5% goat serum, 1% BSA in PBS for 30 min at RT. Cells were then incubated overnight at 4 °C with anti-FLAG (1:200) diluted in blocking solution. After 3 × 5 min washes with 0.1% Triton X-100 in PBS (PBS-T), cells were incubated with anti-mouse IgG Alexa-546 (1:1000) diluted in blocking solution for 1 h at RT. After 3 × 5 min washes with PBS-T, cells were incubated with 300 nM DAPI (ThermoFisher) in PBS for 10 min at RT. Finally, coverslips were washed with PBS followed by ddH2O before mounting onto slides using Lerner Aqua-Mount (ThermoFisher). Confocal microscopy images were acquired with a Leica SP8 microscope with 400X or 630X (with oil immersion) objectives using Leica LAS X software.
Monitoring and analysis of CSR
CSR to IgA in CH12F3 cells was induced with CIT [1 μg mL−1 rat anti-CD40 (clone 1C10, eBioscience or prepared in-house from hybridoma FGK45), 10 ng mL−1 murine recombinant interleukin-4 (mrIL-4) and 1 ng mL−1 transforming growth factor-β1 (R&D Systems)]. Purified mouse splenic B cells were labeled with 2.5 μM CellTrace™ Violet (CTV) (Invitrogen #C34557) in PBS for 20 min at 37 °C before quenching as recommended by the supplier. For switching to IgG1, CTV-stained naïve B cells were stimulated with LPS (5 μg mL−1, Sigma) and mrIL-4 (25 ng mL−1, Peprotech). For switching to IgA, cells were stimulated with mouse recombinant IL-21 (20 ng mL−1, Peprotech), TGF-β1 (5 ng mL−1), retinoic acid (1 μM, Sigma), F(ab’)2 goat anti-mouse IgM (5 μg mL−1, Jackson Immunoresearch), and anti-CD40 mAb 1C10 (5 μg mL−1). Switching to various isotypes was measured by flow cytometry at indicated time points. Cells were treated with mouse FcR blocking reagent (Miltenyi Biotec #130-092-575), then stained with antibodies listed in Supplementary Table 1. Dead cells were excluded using propidium iodide, DAPI, 7-AAD, or eBioscience™ Fixable Viability Dye eFluor™ 780 (Invitrogen #65-0865-14) as appropriate. CRISPR/Cas9 CSR assays were done by transfecting pX330 plasmids with either SpCas9 (for blunt ends) or Cas9N863A (for staggered ends) and gRNAs targeting Sμ or Sα regions (gift from Dr Alberto Martin, University of Toronto) into CH12F3 cells using Amaxa Nucleofector V (Lonza). pCAG-mCherry vector was co-transfected to control for transfection efficiency, and CSR was measured 2 days after transfection. Pinometostat (10 μM) treatment was started 2 days before transfection. The gRNA sequences are indicated in Supplementary Table 2.
Sequencing mutations and switch junctions
DNA from cells was extracted using DirectPCR lysis reagent (Viagen biotech #301-C). For sequencing mutations, a region 5’ of Sµ was PCR amplified with primers OJ353 and OJ354 (oligonucleotides are listed in Supplementary Table 2) using KOD DNA polymerase (Millipore Sigma #71086) with 35 cycles of denaturation, annealing, and extension at 95 °C, 60 °C, and 72 °C for 30 s each. The 650 bp PCR product was cloned into pGEMT-Easy (Promega) and submitted for Sanger Sequencing at Genome Quebec (Montreal). The sequences were analyzed for mutations using DNA Baser Assembler v5.15.0. Clonal mutations were removed, and the frequency and distribution of mutations were summarized using a custom tool. Sμ-Sα switch junctions were amplified with OJ523 and OJ524 (see Supplementary Table 2) using the Expand Long Template PCR system with Buffer 1 (Roche). PCR products 0.5–1 kb were gel-purified, cloned, and sequenced. Sequences were aligned to the mouse Ighm and Igha loci. Switch junctions were identified at the boundary of sequences with an uninterrupted alignment at both loci. Microhomology was scored by counting the number of nucleotides that aligned to both loci at the switch junction. Duplicate junctions were removed from the analysis, except for when sequences had a different mutation profile in the surrounding regions.
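The microhomology scoring rule can be sketched as follows. This is a simplified illustration, not the custom tool used here: it assumes gapless alignments, with the donor (Sμ) and acceptor (Sα) reference bases already laid out position by position against the junction read, and all sequences and names are hypothetical.

```python
def microhomology_length(read, donor_ref, acceptor_ref):
    """Count junction bases that align to both the donor and acceptor loci.

    All three strings are assumed to be in read coordinates and of equal
    length: donor_ref matches the read from its 5' end, acceptor_ref
    matches the read from its 3' end.
    """
    if not (len(read) == len(donor_ref) == len(acceptor_ref)):
        raise ValueError("sequences must be pre-aligned to equal length")
    n = len(read)
    # Extend the uninterrupted donor match from the left.
    left_end = 0
    while left_end < n and read[left_end] == donor_ref[left_end]:
        left_end += 1
    # Extend the uninterrupted acceptor match from the right.
    right_start = n
    while right_start > 0 and read[right_start - 1] == acceptor_ref[right_start - 1]:
        right_start -= 1
    # Bases covered by both matches are the microhomology.
    return max(0, left_end - right_start)


# Hypothetical junction with the shared bases "GAG".
mh = microhomology_length("ACGTTGAGCTCCA", "ACGTTGAGTATAT", "GGCCCGAGCTCCA")
# Hypothetical blunt junction with no shared bases.
blunt = microhomology_length("AAAACCCC", "AAAAGGGG", "TTTTCCCC")
print(mh, blunt)  # 3 0
```

Junctions with inserted nucleotides, where neither locus matches the intervening bases, score zero under this rule.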
DNA constructs
Target sequences for shRNA were obtained from the RNAi consortium (TRC) library database (https://portals.broadinstitute.org/gpp/public/). Two complementary oligonucleotides encoding the antisense RNA were annealed and ligated into the lentiviral pLKO.1 TRC cloning vector (Addgene plasmid # 10878) using AgeI and EcoRI (New England Biolabs) sites. CRISPR guide RNAs (gRNA) were designed using the CHOPCHOP v3 webtool. For mouse Dot1l, we targeted exon 4. Complementary oligonucleotides encoding each guide RNA were annealed and cloned into pSpCas9(BB)-2A-GFP (Addgene plasmid # 48138). Mouse DOT1L (NM_199322.1) was cloned by PCR using cDNA from LPS + IL4-activated primary B cells into pBlueScript KS (+). DOT1L mutants were made using Q5 mutagenesis or NEBuilder® HiFi DNA Assembly kit (New England Biolabs) and were cloned into pMX-Puro retroviral vector. Oligonucleotide sequences are in Supplementary Table 2. Details of mutations introduced in each DOT1L mutant are as follows, based on amino acid numbers of mouse DOT1L (accession: NM_199322.1):
| DOT1L mutant | Mutations |
|---|---|
| CD-1 | G163R/S164C |
| CD-2 | G165R |
| CD-3 | Y312A |
| CD-4 | N241A |
| ΔNucleosome binding | R278E/R282E |
| ΔRNAPolII binding | N616A/K619A/K621A/R625A |
| ΔMllt1/3 binding | L638A/I640A/V865A/I867A/V879A/I881A |
| ΔMllt6/10 binding | L523D/L530D/I569D/L576D/L583D/L638D/L645D |
Gene targeting and overexpression in CH12F3 cells
Lentiviruses were produced by cotransfecting HEK293 cells with pMD2.G, psPAX2, and pLKO.1 vectors (0.25:0.75:1 ratio, 2 μg DNA total) using Trans-IT LT-1 (Mirus Bio, Cat# MIR 2305), and the supernatant was harvested 48 h later. CH12F3 cells (0.5 × 10⁶ cells in 24-well plates) were infected by adding 1.5 mL of HEK293 supernatant in the presence of 8 μg mL−1 Hexadimethrine bromide (Sigma, Cat# H9268) and spinning at 600 × g for 90 min at 30 °C. Medium was replaced 4 h later, and 2 days after transduction, transduced cells were selected with 0.4 µg mL−1 puromycin for ≥48 h. For CRISPR/Cas9, CH12F3 cells were co-transfected using Amaxa Nucleofector Kit V (Lonza) with two pSpCas9(BB)-2A-GFP vectors, each containing one gRNA, intended to create a small gene deletion. Individual GFP+ cells were sorted by FACS 48 h post-transfection. Cell clones were genotyped by PCR (see primers in Supplementary Table 2), and knockouts were confirmed by western blot. Complementation assays were performed by transduction with retroviral vectors. Virions were produced by co-transfecting the plasmids VSV-G, MLV gag-pol, and pMXs vector (1:1:2 ratio, 2.5 μg DNA total) into HEK293 cells using Trans-IT LT-1 and then following the same procedure as for lentiviral infection.
Measuring AID off-target activity
Igh-cMyc chromosomal translocations were detected in DNA from Trp53−/− mouse B cells, stimulated for 3 days with LPS (5 μg mL−1, Sigma) and mrIL-4 (25 ng mL−1, Peprotech), by nested PCR amplification as previously described12,40. PCR products were purified and sequenced to confirm each translocation. The hyperactive AIDΔC or its catalytically inactive variant AIDΔCE58A cloned into pMXs-IRES-GFP plasmid were retrovirally transduced into CH12F3 cells as described above. The rate of loss of GFP+ cells in the infected cell populations was monitored daily for one week by flow cytometry.
Monitoring apoptosis and proliferation
Cell survival and proliferation were estimated by cell counting. Proliferation of CH12F3 cells was measured by dilution of CFSE dye 2 days after staining cells with 2.5 µM CFSE (Invitrogen). Apoptosis was monitored using AnnexinV-APC (BD Biosciences). Dead cells were stained using propidium iodide (PI), DAPI, or 7-AAD as appropriate.
DNA repair assays
CH12F3 cells were stably transfected with 2 μg of reporter constructs for NHEJ: pCMV6-AC EJ7-GFP (Addgene plasmid #113617), A-EJ: pCMV6-AC 4-μHOM (Addgene plasmid #113619), or HR: pDR-GFP (Addgene plasmid #26475). For CRISPR/Cas9-based DNA repair assays86, cells were transiently transfected with 1 μg each of 7a and 7b sgRNA plasmids for NHEJ (Addgene plasmid #113620 and #113624). For HR, cells were transiently transfected with 2 μg of linearized plasmid coding for I-SceI (gift from Dr. Maria Jasin, Memorial Sloan Kettering Cancer Center, NY). Transfections were done with Amaxa Nucleofector V (Lonza) for CH12F3 cells or Trans-IT LT-1 (Mirus Bio, Cat# MIR 2305) for HEK293T cells, along with pCAG-mCherry vector to control for transfection efficiency. GFP+ cells were measured 2 days post-transfection.
Western blotting
Protein extracts were prepared by lysing cells in RIPA lysis buffer (1% NP-40, 10% glycerol, 20 mM Tris pH 8.0, 137 mM NaCl, 2 mM EDTA) containing protease and phosphatase inhibitor (ThermoFisher), followed by sonication for 10 min. For histone western blots, proteins were extracted from cells using RIPA buffer with 2 µl/ml Benzonase® Nuclease HC, Purity > 90% (Sigma cat. no. 71205-3). Protein extracts were separated by SDS-PAGE and transferred to nitrocellulose membranes (BIO-RAD). Equal protein loading was controlled by Ponceau red staining of the membrane after transfer. Membranes were blocked in TBS 5% BSA and probed with primary antibodies in TBS 0.1% Tween, as indicated in Supplementary Table 1. After 4 × 5 min washes, signal was developed by secondary antibody incubation and read on the Odyssey CLx imaging system (LI-COR) for fluorophore-conjugated antibodies.
RT-qPCR
RNA was isolated using TRIzol reagent (Life Technologies) and reverse transcribed using M-MuLV reverse transcriptase (New England Biolabs). Quantitative PCR (qPCR) was performed with PowerUp™ SYBR™ Green Master Mix (Applied Biosystems) on a ViiA™ 7 instrument with its software. For spike-in control, RNA from S2 (Drosophila melanogaster) cells was added in a 1:5 ratio to mouse cells prior to TRIzol extraction, and Ct values were normalized to Drosophila Act5c levels. To evaluate half-lives of unspliced or spliced GLT, CH12F3 cells were treated with 100 μM DRB (Cayman Chemical, # 10010302-50) to inhibit transcription by RNAPII. The levels of each transcript before and after DRB addition were assessed by RT-qPCR using 18S ribosomal RNA as normalization control. For each sample, transcript expression levels were normalized to time 0. Primers used for qPCR are listed in Supplementary Table 2.
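As an illustration of the normalization described above, the following Python sketch computes transcript levels relative to time 0 with the standard 2^(−ΔΔCt) formula and then estimates a half-life. Two assumptions are not stated in the text: 100% amplification efficiency, and a log-linear least-squares fit for first-order decay after DRB addition. Function names are hypothetical.

```python
import math


def relative_level(ct_target, ct_ref, ct_target_t0, ct_ref_t0):
    """Transcript level relative to time 0, normalized to a reference (e.g. 18S rRNA).

    Assumes perfect doubling per PCR cycle (2^-ddCt).
    """
    delta_delta_ct = (ct_target - ct_ref) - (ct_target_t0 - ct_ref_t0)
    return 2.0 ** (-delta_delta_ct)


def half_life(times, levels):
    """Half-life from a least-squares line through ln(level) versus time.

    Assumes first-order decay; returns None if the fitted slope is not negative.
    """
    logs = [math.log(v) for v in levels]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    sxx = sum((t - mean_t) ** 2 for t in times)
    slope = sxy / sxx
    if slope >= 0:
        return None
    return math.log(2) / -slope
```

The half-life is returned in the same units as `times`.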
BioID
Human AID and APOBEC2 were cloned into pDEST-pcDNA5-BirA-FLAG N-term and pDEST-pcDNA5-BirA-FLAG C-term (gifts from Dr. Anne-Claude Gingras, Lunenfeld-Tanenbaum Research Institute, Toronto) to produce C- and N-terminally BirA-tagged baits by gateway cloning (Invitrogen). Flp-In T-REx 293 cells were transfected with 200 ng of either construct together with 2 μg of pOG44 Flp-recombinase expression vector (ThermoFisher cat: V600520) and selected with 200 μg mL−1 hygromycin B (Wisent cat: 450-141-XL) to generate stable cell lines. Baits were integrated at the same unique locus, as described by supplier. Flp-In T-REx 293 cells with BirA-FLAG-eGFP and BirA-FLAG were similarly produced. Expression of the baits was induced with 1 μg mL−1 tetracycline (Bioshop cat: TET701.10), and 50 μM biotin (BioBasic cat:BB0078) was simultaneously added to the medium. The cells were harvested 24 h later. After cell lysis, biotin-labeled proteins were captured by streptavidin coupled to magnetic beads (Sigma cat: GE17-5113-01) and processed as described previously87.
MS data analysis
BioID samples were injected into a Q Exactive Quadrupole Orbitrap (ThermoFisher), and raw files were analyzed with the search engines Mascot and X! Tandem through the iProphet pipeline integrated in Prohits, using the human RefSeq database (version 57) supplemented with “common contaminants” from the Max Planck Institute (http://maxquant.org/downloads.htm), the Global Proteome Machine (http://www.thegpm.org/crap/index.html), and decoy sequences. All references for software packages are provided in Supplementary Table 3. The search parameters were set with trypsin specificity (two missed cleavage sites allowed), and variable modifications oxidation (M) and deamidation (NQ). The mass tolerances for precursor and fragment ions were set to 15 ppm and 0.6 Da, respectively, and peptide charges of +2, +3, and +4 were considered. The search results were individually processed by PeptideProphet, and peptides were assembled into proteins as described in ProteinProphet using the Trans-Proteomic Pipeline with settings: -p 0.05 -x20 -PPM -d"DECOY", iprophet options: pPRIME and PeptideProphet: pP. Four biological replicates each of AID-BirA and BirA-AID were compared to 4 replicates of the BirA and BirA-GFP controls. Two biological replicates each of APOBEC2-BirA and BirA-APOBEC2 were jointly compared to the same controls. Interaction scoring was performed by two algorithms on proteins with an iProphet protein probability ≥ 0.9 and unique peptides ≥ 2. We used SAINTexpress (version 3.6.1) with settings: nControl:4 (2-fold compression), nCompressBaits:4 (no bait compression). Interactions displaying a BFDR ≤ 0.01 were considered statistically significant. We also used DESeq2 (version 1.32.0), an R package that applies a negative binomial distribution to calculate enrichments over controls. DESeq2 was run using default settings, and prey significantly enriched over BirA and BirA-eGFP were selected by applying a p value cut-off of ≤ 0.1.
The combined list of significant prey obtained from SAINTexpress and DESeq2 was defined as potential proximity interactors. Remaining unfiltered contaminants, such as keratins, hornerin, carboxylases, BirA, GFP, streptavidin, decoys, trypsin, albumin, lysozyme C, or beta galactosidase were removed post hoc.
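The selection logic in the two preceding paragraphs can be summarized in a short Python sketch. The record layout (keys `name`, `iprophet`, `unique_peptides`, `bfdr`, `deseq2_p`) is hypothetical, and collapsing the DESeq2 comparison against both controls into a single p value per prey is a simplifying assumption.

```python
def potential_interactors(preys, contaminants=frozenset()):
    """Return prey names passing the identification filters and at least one scoring method.

    preys: iterable of dicts with hypothetical keys
        name, iprophet, unique_peptides, bfdr (SAINTexpress), deseq2_p (DESeq2).
    contaminants: names removed post hoc (e.g. keratins, BirA, GFP).
    """
    selected = set()
    for prey in preys:
        # Identification filters applied before scoring
        if prey["iprophet"] < 0.9 or prey["unique_peptides"] < 2:
            continue
        saint_hit = prey["bfdr"] is not None and prey["bfdr"] <= 0.01
        deseq_hit = prey["deseq2_p"] is not None and prey["deseq2_p"] <= 0.1
        # Union of the two scoring approaches, minus known contaminants
        if (saint_hit or deseq_hit) and prey["name"] not in contaminants:
            selected.add(prey["name"])
    return sorted(selected)
```

Taking the union rather than the intersection of the two scoring methods follows the "combined list" wording above; an intersection would be a stricter alternative.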
Graphical representations of protein networks were generated with Cytoscape (version 3.7.2). Network augmentations of our BioID screens were performed by extracting prey-prey interactions from the human BioGRID network (version 3.5.176), and from Cytoscape’s PSICQUIC built-in web service client (October 2019 release) after searching against the IntAct and iRefIndex databases. Clusters were extracted by the Markov Clustering Algorithm (MCL) from Cytoscape’s ClusterMaker2 application (version 1.3.1). WD-Scores were calculated on the ProHits-Viz webpage (prohits-viz.org) by uploading SAINT results. For graphical representation of specific protein abundances, each prey abundance was represented by the averaged spectral counts of the replicates, the values were Log2-transformed, and used to generate heatmaps in the ProHits-viz web tool.
Chromatin immunoprecipitation
CH12F3 cells (4 × 10⁵) were stimulated in 10 mL of media with CIT and cross-linked with 1% formaldehyde for 8 min at RT, at 16 h post-stimulation. The reaction was stopped with glycine (125 mM final concentration). For unstimulated CH12F3 cells, 1 × 10⁷ cells were cross-linked with 1% formaldehyde for 8 min at room temperature in 10 mL of media. In all cases, cells were washed twice with cold PBS, harvested, resuspended in RIPA buffer, and sonicated in a Covaris E220 apparatus with the following parameters: Time: 10 min, Duty Factor: 10%, PIP: 140 W, and CPB: 200, to generate 200–600 bp DNA fragments. Samples were then clarified by addition of 1/10 volume of 10% Triton X-100 and centrifugation at 20,000 × g (4 °C) for 15 min. For immunoprecipitation, samples were pre-cleared with 20 μL of Dynabeads Protein G (ThermoFisher, 10004D) for 1 h (4 °C). The different antibodies (listed in Supplementary Table 1) were coupled to 25 μL of Dynabeads Protein G for 30 min and blocked for 1 h (4 °C) with BT buffer (0.5% Triton X-100, 10 mM Tris pH8, 100 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 15 mg/mL BSA, 3 mg/mL tRNA, and 1×CPI). IP was performed by overnight incubation of 20 μg of pre-cleared chromatin with blocked Ab-bead complexes. Beads were then washed twice (15 min at 4 °C) with RIPA buffer, three times with ChIP Wash Buffer (100 mM Tris-HCl [pH 8.5], 0.5 M LiCl, 1% [v/v] Igepal CA-630, and 1% [w/v] sodium deoxycholate), and once with 1×TE. Immunocomplexes were eluted for 10 min at 65 °C with 100 μL of Elution buffer (1% [w/v] SDS, 200 mM NaCl, 1 mM EDTA, and 1 mM DTT, 5 μg of proteinase K) and overnight incubation at 65 °C. DNA was purified using QIAquick PCR purification kit (Qiagen), and DNA resuspended in 50 μl Tris-HCl (pH 8) was used as template in qPCR reactions. Briefly, 10 μL PCR reactions containing 1×SYBR Green Mix (ThermoFisher, A25741), 1/10 fraction of the ChIP-enriched DNA, and 100 nM primers were set up in 384-well plates. Primer sequences are listed in Supplementary Table 2.
ChIP-seq
ChIP-seq was performed as described for ChIP with the following modifications to include an exogenous control (spike-in) of Drosophila S2 cells. Briefly, 1 μg of S2 chromatin was spiked into 24 μg of CH12F3 chromatin (1:24 ratio) before IP. For the IP, 30 μL Dynabeads Protein G were coupled with the target Ab and 2 μg of a spike-in Ab against H2Av. For the S2 chromatin preparation, 8 × 10⁶ cells were seeded in 100 mm plates and grown for 4 days. Cells were crosslinked in 1X PBS containing 1% formaldehyde at RT for 20 min and quenched with 125 mM glycine for 5 min. Crosslinked cells were harvested and washed twice with cold 1X PBS. Cell pellets were resuspended in 130 μL of RIPA buffer and fragmented using Covaris E220 with the following parameters to obtain chromatin fragments between 100 and 500 bp: PIP = 140 W, DF = 20%, CPB = 200, duration = 150 s. Soluble fragmented chromatin was recovered by centrifugation, and 25 μL of the sample was taken to assess sonication pattern and DNA content. DNA was purified as per standard ChIP. For ChIP-seq, size distribution and molarity of IP and input samples were evaluated on a 2100 bioanalyzer (Agilent Technologies). Libraries were prepared using the KAPA HyperPrep library kit (Roche Diagnostics) as per manufacturer’s instructions. Normalization of the sample quantities was done after quantification of the ligation products by qPCR, and library size distribution was assessed on a 2100 bioanalyzer. Equimolar libraries were sequenced on Illumina NovaSeq 6000, with an S4 flowcell at the McGill University and Génome Québec Innovation Centre to generate 100 bp paired-end reads.
CUT&RUN
CUT&RUN was performed following a protocol described previously88, with the following modifications: 5 × 10⁵ viable CH12F3 cells were used per condition. Quality controls of cell integrity, bead conjugation, and cell permeabilization were performed. CH12F3 cells were permeabilized with 0.001% digitonin as determined by optimization. Following incubation with Calcium Incubation Buffer for 30 min at 0 °C, the reaction was stopped with STOP buffer containing 1 ng of E. coli spike-in DNA (Epicypher, 15-1016). The released chromatin fragments were purified with Nucleotide Purification Kit (Qiagen) and eluted in 25 µL of Tris-HCl (pH 8). CUT&RUN libraries were prepared and sequenced as for ChIP-seq.
RNA-seq preparation and analysis
Briefly, 0.5 million CH12F3 WT or Dot1l−/− cells and 0.1 million Kc167 Drosophila spike-in cells were mixed. RNA was extracted using Qiagen RNeasy Micro kit, mRNA was enriched using NEBNext Poly(A) mRNA Magnetic Isolation Module (NEB), and libraries were prepared using KAPA RNA HyperPrep kit (Roche Diagnostics) as per manufacturer’s instructions. Equimolar libraries were sequenced on Illumina NovaSeq 6000, with an S4 flowcell at McGill University and Génome Québec Innovation Centre to generate 100 bp paired-end reads. CH12F3 and publicly available RNA-seq datasets (listed in Supplementary Table 4) were analyzed as follows (all references for software packages are provided in Supplementary Table 3): Sequencing adapters were trimmed with fastp (0.23.1 or 0.23.2+galaxy0) before alignment with STAR (2.7.9a or 2.7.10b+galaxy1) to the following genomes, where applicable: a custom reference genome reflecting the recombined VDJ locus in CH12, produced by modifying the mouse (mm10) reference genome based on the Igh cDNA sequences from CH12F3 cells using Reform (https://gencore.bio.nyu.edu/reform/); the mouse (mm10) genome; and the Drosophila (dm6) genome. Gene expression counts were determined with featureCounts (Rsubread 2.12.2 or 2.0.3+galaxy0) in a strand-specific fashion. Differentially expressed genes were determined with DESeq2 (version 1.38.2). Genes that did not pass the independent filtering criteria of DESeq2 were considered “silent”. Mapped reads were filtered for mapping quality (MAPQ ≥ 1) using samtools (1.16.1 or 1.15.1+galaxy0) to generate bigwig files using bamCoverage and bamCompare from deepTools 3.5.0 with a bin size of 1 bp.
ChIP-seq and CUT&RUN analysis
Quality of libraries was verified with FastQC 0.11.9 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). All references for software packages are provided in Supplementary Table 3. For ChIP-seq and CUT&RUN, sequencing adapters and first 10 bp were trimmed with fastp 0.23.1 before alignment with bowtie2 2.4.4 to a custom reference genome reflecting the recombined Igh in CH12F3, produced as described in RNA-seq analysis. Duplicate reads were removed with MarkDuplicatesSpark from Picard tools through GATK 4.2.4.0 and filtered for mapping quality (MAPQ ≥ 1) using Samtools 1.16.1. Alignment statistics were compiled and checked using MultiQC 1.13. Peaks were called with MACS2 2.2.7.1. Normalization was performed using DiffBind package 3.8.3, specifically by using DESeq2 (version 1.38.2) to calculate scaling factors on “background bin” signals estimated by the csaw package 1.32.0. These scaling factors were then used to generate bigwig files using bamCoverage and bamCompare from deepTools 3.5.0 with a bin size of 20 bp.
Publicly available bigwig files for ChIP-seq data from primary mouse splenic B cells were lifted over to mm10 using UCSC kentutils (https://github.com/ucscGenomeBrowser/kent) liftOver tool. Snapshots of ChIP-seq, CUT&RUN, TT-seq, and RNA-seq profiles across different loci were visualized using track hubs in UCSC genome browser.
For further analysis, data tables were handled using the R package data.table 1.14.6, genomic intervals were handled with the R package GenomicRanges 1.50.2, and rtracklayer 1.58.0 was used to import GTF, GFF3, and BED format files. H3K79me3 quintiles were defined based on the average H3K79me3 signal from TSS to the end of second intron or transcription termination site (TTS), whichever is shorter, for all expressed genes using multiBigwigSummary from deepTools.
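A minimal Python sketch of the quintile assignment follows. It assumes the per-gene average H3K79me3 signal has already been computed (for example from a multiBigwigSummary output) and that quintiles are equal-sized rank bins; the function names are hypothetical.

```python
def signal_window_end(second_intron_end_offset, tts_offset):
    """End of the averaging window, as an offset from the TSS: whichever is shorter."""
    return min(second_intron_end_offset, tts_offset)


def assign_quintiles(signal_by_gene):
    """Map each gene to a quintile (1 = lowest signal, 5 = highest) by rank."""
    ranked = sorted(signal_by_gene, key=signal_by_gene.get)
    n = len(ranked)
    return {gene: (rank * 5) // n + 1 for rank, gene in enumerate(ranked)}
```

Genes with tied signal are split by sort order in this sketch, which may differ from how ties were handled in the original analysis.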
AID off-target genes in in vitro activated mouse primary B cells were compiled from earlier translocation capture studies7,8,61,62. To compose the list of AID off-target genes in CH12F3 cells, we used AID-dependent genomic translocation target regions previously identified10. Genes overlapping with or within 20 kb of each region were found using GenomicRanges. Most (63/83) regions were associated with only one gene. For regions associated with more genes, the gene with the highest active transcription based on mean GRO-seq signal was chosen as the likeliest target for AID off-target activity. Control genes for AID targets with matching GRO-seq or TT-seq signals, as well as matching gene length distributions, were selected randomly using the sample function in R, with probability weights for non-target genes adjusted based on AID target features, approximated either using the density (kernel density estimation) and approxfun functions or the kde2d and interp.surface functions from the MASS and fields R packages.
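The idea behind the weighted control selection can be sketched in Python for a single feature (for example log-transformed nascent transcription signal). This is not the R code used in the study: the Gaussian kernel with a Silverman-type bandwidth, the density-ratio weights, and the exponential-race method for weighted sampling without replacement are all assumptions chosen for illustration, and the two-feature case (signal plus gene length) is not covered.

```python
import math
import random


def gaussian_kde(values):
    """Return a density function from a Gaussian kernel estimate (Silverman-type bandwidth)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    bw = 1.06 * sd * n ** (-0.2)
    norm = n * bw * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - v) / bw) ** 2) for v in values) / norm

    return density


def sample_matched_controls(target_signal, pool_signal, n_controls, seed=0):
    """Pick n_controls non-target genes whose signal distribution resembles the targets.

    target_signal: list of feature values for AID target genes.
    pool_signal: dict mapping non-target gene name -> feature value.
    """
    target_density = gaussian_kde(list(target_signal))
    pool_density = gaussian_kde(list(pool_signal.values()))
    rng = random.Random(seed)
    keyed = []
    for gene, x in pool_signal.items():
        background = pool_density(x)
        weight = target_density(x) / background if background > 0 else 0.0
        if weight > 0:
            # Exponential race: smaller key wins, with probability proportional to weight
            keyed.append((rng.expovariate(1.0) / weight, gene))
    keyed.sort()
    return [gene for _, gene in keyed[:n_controls]]
```

Weighting each candidate by the ratio of target density to background density makes over-represented signal ranges in the pool less likely to dominate the control set.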
Mean signals at specified regions of each gene were calculated using R package Megadepth. R package ggplot2 was used for plotting of results from high-throughput sequencing experiments. For strand-sensitive metagene profiles, trimmed means of signals at a trim value of 10% were calculated from the normalized bigwig files with a custom script using Megadepth and plotted with ggplot2. Half-eye plots were produced with R package ggdist 3.2.1.
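The trimmed-mean metagene calculation can be sketched as follows. The sketch assumes that a trim value of 10% means discarding the lowest and highest 10% of genes in each bin, as the `trim` argument of R's `mean` does, and that signals are already arranged as one row per gene and one column per bin; the function names are hypothetical.

```python
import math


def trimmed_mean(values, trim=0.10):
    """Mean after dropping floor(n * trim) values from each end of the sorted data."""
    ordered = sorted(values)
    k = math.floor(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def metagene_profile(matrix, trim=0.10):
    """Per-bin trimmed mean across genes; matrix is a list of equal-length rows."""
    return [trimmed_mean(column, trim) for column in zip(*matrix)]
```

Trimming limits the influence of a few genes with extreme signal on the average profile.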
Transient transcriptome sequencing (TT-seq)
4SU enriched RNA was extracted following TT(chem)-seq protocol with minor modifications89. Briefly, RNA was extracted by TRIzol (ThermoFisher, #15596026) following manufacturer’s instructions. As a control for equal sample preparation, 4-thiouracil (4TU)-labelled RNA from S. cerevisiae (strain BY4741) was spiked-in. S. cerevisiae was grown in YPD medium overnight, diluted to an OD600 of 0.1, and grown to mid-log phase (OD600 of 0.8) and labelled with 5 mM 4TU (Sigma-Aldrich, 440736) for 5 min. Total RNA was extracted following a hot-phenol purification. For purification of 4SU labelled RNA, 100 μg mammalian 4SU labelled RNA was spiked-in 1/100 of 4TU-labelled S. cerevisiae RNA. The RNA was fragmented by addition of 1 M NaOH and left on ice for 30 min. Fragmentation was stopped by addition of 1 M Tris, pH 6.8, and the RNA was cleaned up twice with Micro Bio-Spin™ P-30 Gel Columns (BioRad, #7326223) according to the manufacturer’s instructions. Biotinylation of 4SU-residues was carried out in 10 mM Tris-HCl [pH 7.4], 1 mM EDTA, and 5 mg MTSEA biotin-XX linker (Biotium, #BT90066) for 30 min at room temperature in the dark. RNA was then purified by phenol:chloroform extraction, denatured by 10 min incubation at 65 °C, and added to 200 μL μMACS Streptavidin MicroBeads (Miltenyi, #130-074-101). RNA was incubated with beads for 15 min at room temperature, and beads applied to a μColumn in the magnetic field of a μMACS magnetic separator. Beads were washed twice with pull-out wash buffer (100 mM Tris-HCl, pH 7.4, 10 mM EDTA, 1 M NaCl, and 0.1% Tween 20). 4SU-RNA was eluted twice by addition of 100 mM DTT, and RNA was cleaned up using the RNeasy MinElute kit (Qiagen, #74204). RNA integrity was assessed on a 2100 bioanalyzer (Agilent Technologies) with an RNA 6000 Pico kit. Libraries were prepared from an equal quantity of total RNA for each sample (190 or 260 ng) with the KAPA RNA Hyperprep Kit (Roche Diagnostics) with a 7 or 8-cycle final amplification.
Library size distribution was assessed on a 2100 bioanalyzer (Agilent Technologies) using a High Sensitivity DNA Kit, and libraries were quantified by qPCR. Equimolar libraries were sequenced as paired-end reads (PE100) on a NovaSeq system (Illumina), with 60 M fragments per library.
TT-seq analysis
Quality controls were performed with FastQC (v0.12.1). Reads were trimmed with Trimmomatic (v0.36) and parameters “PE ILLUMINACLIP:adapters.fa:2:30:10: TRAILING:3 MINLEN:25” before alignment with STAR (v2.7.9a) to custom CH12 genome (see RNA-seq analysis for details) and S. cerevisiae genome R64-1-1 with default parameters. Poorly mapped reads, low-quality mapped, and duplicates were removed with SAMtools (v1.20) using the parameters “-F 2048 -F 256 -F 1024 -f 2 -q 2”. Genome coverage BigWig files from filtered reads were made with deepTools bamCoverage (v 3.5.1) using the parameters “--extendReads --scaleFactor $scaleFactor --binSize 10 -p”, where “$scaleFactor” was replaced by the scale factor computed using 2,000,000 divided by the number of filtered reads aligned to S. cerevisiae genome.
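The spike-in scale factor described above is a single division; the Python sketch below shows it together with its effect on coverage values. The function names are hypothetical, and applying the factor to a list of coverage values only mimics what `--scaleFactor` does inside bamCoverage.

```python
def spike_in_scale_factor(n_spike_in_reads, numerator=2_000_000):
    """Scale factor = 2,000,000 / number of filtered reads aligned to the S. cerevisiae genome."""
    if n_spike_in_reads <= 0:
        raise ValueError("spike-in read count must be positive")
    return numerator / n_spike_in_reads


def scale_coverage(coverage, scale_factor):
    """Multiply each coverage value by the scale factor."""
    return [value * scale_factor for value in coverage]
```

A library with twice as many spike-in reads receives half the scale factor, so signals become comparable across samples under the assumption that an equal amount of spike-in RNA was added to each.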
GRO-seq analysis
A publicly available GRO-seq dataset from primary activated mouse B cells (GSE130266) was analyzed following the publicly available pipeline https://nf-co.re/nascent/2.1.1 without modification to create alignment BAM files and genome coverage bigwig files, and to design a control group of genes with similar levels of transcription to AID off-target genes, as described in the corresponding section.
Architectural stripes
Stripes were called from published Hi-C data using Stripenn 1.1.65.1590 with default parameters and filtering for stripes with a p value < 0.05 and “Stripiness” value > 0. Published CH12F3 Hi-C data10 was realigned to CH12F3 custom genome using the distiller-nf pipeline version 0.3.4 (https://zenodo.org/records/7309110). Stripes in activated B cells were obtained from the results of published analysis using Stripenn. Super-enhancer regions in LPS + IL4-activated B cells were obtained from GSE620638. CH12F3 super-enhancers were identified as described with the software package ROSE. Briefly, H3K27Ac ChIP-seq peaks from dataset GSE121355 were called with MACS2 callpeak 2.2.7.1 with the --qvalue “0.001”. Super-enhancers were then identified with ROSE, with “TSS_EXCLUSION_ZONE_SIZE = 2000” and default STITCHING_DISTANCE (12.5 kb). For ChIP-seq metaplots, signal across the selected regions was summarized using computeMatrix from deepTools and was plotted using plotHeatmap from deepTools, or trimmed means (±5%) were calculated for each bin and plotted using ggplot2 package using a custom R script. Compartment analysis was carried out with Hi-C data using CALDER2 version 0.7 at a resolution of 10 kb with default settings. RAD21 loading sites were defined as regions with overlapping RAD21 and NIPBL peaks, and anchor sites were defined as regions with overlapping RAD21 and CTCF peaks. The union of peaks in WT and Dot1l−/− CH12F3 for each factor was considered for this analysis. For CTCF peaks, the peaks from CTCF CUT&RUN were supplemented with peaks from a published CTCF ChIP-seq in WT CH12F3 cells66.
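The overlap-based definition of RAD21 loading and anchor sites can be illustrated with a small Python sketch. It assumes half-open `(chrom, start, end)` intervals and reports the RAD21 peak itself as the site; whether the original analysis kept the RAD21 peak, the intersection, or the merged region is not specified, and the function names are hypothetical.

```python
def overlaps(a, b):
    """True if two half-open intervals (chrom, start, end) share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def peaks_overlapping(query_peaks, subject_peaks):
    """Return the query peaks that overlap at least one subject peak."""
    return [q for q in query_peaks if any(overlaps(q, s) for s in subject_peaks)]


def classify_rad21_sites(rad21, nipbl, ctcf):
    """Loading sites: RAD21 peaks overlapping NIPBL; anchor sites: RAD21 peaks overlapping CTCF."""
    return {
        "loading": peaks_overlapping(rad21, nipbl),
        "anchor": peaks_overlapping(rad21, ctcf),
    }
```

The quadratic scan is adequate for a toy example; genome-scale peak sets would call for an interval index such as the one GenomicRanges provides in R.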
Calculation of transcription kinetics parameters
RNAPII pausing index, pausing duration, and elongation velocity were calculated using custom scripts. For calculating pausing index (traveling ratio), only genes with a minimum length of 1 kb were considered. Pausing index was calculated as the ratio of mean RNAPII ChIP-seq signal around the TSS (−100 bp to +300 bp) to mean RNAPII ChIP-seq signal in the gene body (+300 bp to TTS) using R packages GenomicRanges and Rsubread 2.12.2.
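The pausing index defined above can be written as a short Python function. The sketch assumes a per-base RNAPII signal vector oriented 5'→3' that starts 100 bp upstream of the TSS and ends at the TTS; the function name and the choice to return `None` for excluded genes are hypothetical.

```python
def pausing_index(signal, upstream=100, promoter_end=300, min_gene_length=1000):
    """Ratio of mean promoter-proximal signal (-100 to +300 bp) to mean gene-body signal.

    signal: per-base RNAPII ChIP-seq signal from TSS - upstream to the TTS.
    Returns None for genes shorter than min_gene_length or with no gene-body signal.
    """
    gene_length = len(signal) - upstream
    if gene_length < min_gene_length:
        return None
    split = upstream + promoter_end
    promoter = signal[:split]
    body = signal[split:]
    body_mean = sum(body) / len(body)
    if body_mean == 0:
        return None
    return (sum(promoter) / len(promoter)) / body_mean
```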
For analyses in Figs. 5 and S6, pausing duration and RNAPII elongation velocity were calculated as described before72,91. To ensure accurate analysis and sufficient coverage, overlapping genes, genes shorter than 10 kb, genes with TT-seq coverage of less than 4 reads per base, and genes with zero RNA-seq reads in three or more replicates were not considered in the analysis. For each gene, the isoform with the highest paused RNAPII ChIP-seq signal at the TSS (−200 bp to +400 bp) of the transcript, as well as the highest mean TT-seq signal, was defined as the major expressed isoform, breaking ties in favor of the longer isoform. For each major transcript, productive initiation frequency was calculated as the mean TT-seq coverage in exons in the window TSS + 500 bp up to TSS + 25 kb, excluding the first exon. The pause site was determined as the base pair with the highest RNAPII ChIP-seq signal in a search region from TSS − 200 bp to either TSS + 400 bp, the end of the first exon, or TSS + 1 kb if the first exon is longer than 1 kb, whichever is longer; the RNAPII pause window was assigned as the region ±200 bp of the pause site. Only transcripts with a strong RNAPII pausing signal (3-fold or higher peak RNAPII signal over the mean RNAPII signal in the pause site search region defined above) were kept for pause duration calculations. Pause duration for each major transcript was calculated as the mean RNAPII ChIP-seq signal at the pausing window divided by the productive initiation frequency. Elongation velocities at specified windows were determined as the TT-seq signal divided by the RNAPII ChIP-seq signal at these windows.
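The two ratios defined in this paragraph can be sketched in Python as follows. The sketch assumes per-base signal vectors oriented 5'→3', with the RNAPII vector starting at TSS − 200 bp, and takes the productive initiation frequency as a precomputed number; the function names are hypothetical and the outputs are in arbitrary units, not seconds or kb per minute.

```python
def pause_duration(rnapii, search_len, initiation_freq, half_window=200, min_fold=3.0):
    """Mean RNAPII signal in the pause window divided by productive initiation frequency.

    rnapii: per-base RNAPII ChIP-seq signal starting at TSS - 200 bp.
    search_len: length in bp of the pause-site search region, counted from that start.
    Returns None when the peak is below min_fold times the search-region mean.
    """
    region = rnapii[:search_len]
    region_mean = sum(region) / len(region)
    peak_idx = max(range(len(region)), key=lambda i: region[i])
    if initiation_freq <= 0 or region_mean <= 0 or region[peak_idx] < min_fold * region_mean:
        return None
    # Pause window: +/- half_window around the pause site, clipped to the vector
    lo = max(0, peak_idx - half_window)
    hi = min(len(rnapii), peak_idx + half_window + 1)
    window = rnapii[lo:hi]
    return (sum(window) / len(window)) / initiation_freq


def elongation_velocity(tt_seq_window, rnapii_window):
    """TT-seq signal divided by RNAPII ChIP-seq signal over the same window."""
    rnapii_mean = sum(rnapii_window) / len(rnapii_window)
    if rnapii_mean == 0:
        return None
    return (sum(tt_seq_window) / len(tt_seq_window)) / rnapii_mean
```

The intuition is that, for a given rate of nascent RNA production, more polymerase occupancy implies slower movement, so a higher RNAPII signal lowers the velocity estimate and raises the pause duration estimate.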
For calculating pause duration and elongation velocity for differentially expressed genes and AID target genes (Figs. 6, 7 and S7), some of the above filters were removed as detailed below, for a more comprehensive characterization of these gene sets, as full filter sets led to removal of a large fraction of these genes. Whether calculations were made with or without these filters, results for these categories of genes were qualitatively identical. Thus, for differentially expressed genes, minimum transcript length was reduced to 4 kb (from 10 kb), RNA-seq and TT-seq coverage limits were removed, and overlapping genes as well as genes without a strong pausing signal were included. For AID target genes, minimum transcript length was reduced to 4 kb for elongation velocity calculations and to 1 kb for pause duration calculations; pause window was assigned as the region from TSS to 200 bp; and overlapping genes as well as genes without a strong pausing signal were included. These modifications were made because AID targets include some convergently transcribed genes, several genes less than 10 kb, and tend to have atypical RNAPII pausing15,17,76.
Elongation velocity tracks for UCSC genome browser were calculated as stranded TT-seq signal divided by RNAPII ChIP-seq signal in 10-bp windows for each gene, using GenomicRanges and data.table, and exported as bigwig files using rtracklayer. For elongation velocity metaplots ranging from TSS to TSS + 10 kb, only major transcripts longer than 10 kb were included to avoid including intergenic signal.
Statistics
The sample size used for each experiment was based on the magnitude of the effect, the availability of mice, and/or to ensure reproducibility. All data points plotted in the figures represent biological replicates. Male and female mice older than 6 weeks were chosen randomly for experiments and were distinguished only based on their genotype. Blinding was not performed, nor was it necessary since most measurements were performed objectively using automated software. No data was excluded from the analysis. The statistical tests were performed using bioinformatics packages and tools mentioned above, or the GraphPad Prism software version 7 or 8. Groups were tested for normality using the Shapiro-Wilk test, normal Q-Q plot, and/or homoscedasticity of residuals plot (implemented in GraphPad Prism). If all groups passed the normality test, parametric tests were used (unpaired or paired two-tailed Student’s t-test with Welch’s correction for comparing two groups; one-way ANOVA for comparing ≥3 groups; two-way ANOVA for comparing ≥2 groups split between two variables; one-sample t-test for comparing groups to a fixed value). Otherwise, non-parametric tests were used (unpaired or paired two-tailed Wilcoxon rank sum test for comparing two groups; Kruskal-Wallis test with post-hoc Wilcoxon test for ≥3 groups). For multiple comparisons, the optimal post-hoc test based on study design was determined with GraphPad Prism. Statistical tests and significance cut-off used are indicated in the corresponding figure legends and text.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The proteomics data generated in this study has been deposited in the MassIVE database under accession code MSV000092065. The genomics and RNA-seq datasets generated have been deposited in the GEO database under accession code GSE233975. Accession codes and references for all public datasets analyzed are provided in Supplementary Table 4. Source data are provided with this paper.
Code availability
Pipelines and custom code used for analyses, including TT-seq, ChIP-seq, RNA-seq, and transcription kinetics calculations, have been deposited in GitHub at https://github.com/tellyalogicalguy/dot1l_project and https://github.com/francoisrobertlab/dinoia_dot1l_project. Custom code is also available as a persistent archive at https://doi.org/10.5281/zenodo.17764465.
References
Methot, S. P. & Di Noia, J. M. Molecular mechanisms of somatic hypermutation and class switch recombination. Adv. Immunol. 133, 37–87 (2017).
Yu, K. & Lieber, M. R. Current insights into the mechanism of mammalian immunoglobulin class switch recombination. Crit. Rev. Biochem. Mol. Biol. 54, 333–351 (2019).
Xu, M. Z. & Stavnezer, J. Regulation of transcription of immunoglobulin germ-line gamma 1 RNA: analysis of the promoter/enhancer. EMBO J. 11, 145–155 (1992).
Zhang, X. et al. Fundamental roles of chromatin loop extrusion in antibody class switching. Nature 575, 385–389 (2019).
Liu, M. et al. Two levels of protection for the B cell genome during somatic hypermutation. Nature 451, 841–845 (2008).
Yamane, A. et al. Deep-sequencing identification of the genomic targets of the cytidine deaminase AID and its cofactor RPA in B lymphocytes. Nat. Immunol. 12, 62–69 (2011).
Meng, F.-L. et al. Convergent transcription at intragenic super-enhancers targets AID-initiated genomic instability. Cell 159, 1538–1548 (2014).
Qian, J. et al. B cell super-enhancers and regulatory clusters recruit AID tumorigenic activity. Cell 159, 1524–1537 (2014).
Álvarez-Prado, ÁF. et al. A broad atlas of somatic hypermutation allows prediction of activation-induced deaminase targets. J. Exp. Med. 215, 761–771 (2018).
Peycheva, M. et al. DNA replication timing directly regulates the frequency of oncogenic chromosomal translocations. Science 377, eabj5502 (2022).
Hakim, O. et al. DNA damage defines sites of recurrent chromosomal translocations in B lymphocytes. Nature 484, 69–74 (2012).
Ramiro, A. R. et al. AID is required for c-myc/IgH chromosome translocations in vivo. Cell 118, 431–438 (2004).
Methot, S. P. et al. A licensing step links AID to transcription elongation for mutagenesis in B cells. Nat. Commun. 9, 1248 (2018).
Matthews, A. J., Husain, S. & Chaudhuri, J. Binding of AID to DNA does not correlate with mutator activity. J. Immunol. 193, 252–257 (2014).
Pavri, R. et al. Activation-induced cytidine deaminase targets DNA at sites of RNA polymerase II stalling by interaction with Spt5. Cell 143, 122–133 (2010).
Basu, U. et al. The RNA exosome targets the AID cytidine deaminase to both strands of transcribed duplex DNA substrates. Cell 144, 353–363 (2011).
Wang, L., Wuerffel, R., Feldman, S., Khamlichi, A. A. & Kenter, A. L. S region sequence, RNA polymerase II, and histone modifications create chromatin accessibility during class switch recombination. J. Exp. Med. 206, 1817–1830 (2009).
Wang, Q. et al. Epigenetic targeting of activation-induced cytidine deaminase. Proc. Natl. Acad. Sci. USA 111, 18667–18672 (2014).
Sheppard, E. C., Morrish, R. B., Dillon, M. J., Leyland, R. & Chahwan, R. Epigenomic modifications mediating antibody maturation. Front. Immunol. 9, 355 (2018).
Frederiks, F. et al. Nonprocessive methylation by Dot1 leads to functional redundancy of histone H3K79 methylation states. Nat. Struct. Mol. Biol. 15, 550–557 (2008).
Stulemeijer, I. J. E. et al. Dot1 histone methyltransferases share a distributive mechanism but have highly diverged catalytic properties. Sci. Rep. 5, 9824 (2015).
Wille, C. K. & Sridharan, R. Connecting the DOTs on cell identity. Front. Cell Dev. Biol. 10, 906713 (2022).
Kealy, L. et al. The histone methyltransferase DOT1L is essential for humoral immune responses. Cell Rep. 33, 108504 (2020).
Aslam, M. A. et al. Histone methyltransferase DOT1L controls state-specific identity during B cell differentiation. EMBO Rep. 22, e51184 (2021).
Duan, Z. et al. Role of Dot1L and H3K79 methylation in regulating somatic hypermutation of immunoglobulin genes. Proc. Natl. Acad. Sci. USA 118, 2022 (2021).
Jonkers, I., Kwak, H. & Lis, J. T. Genome-wide dynamics of Pol II elongation and its interplay with promoter proximal pausing, chromatin, and exons. eLife 3, e02407 (2014).
Veloso, A. et al. Rate of elongation by RNA polymerase II is associated with specific gene features and epigenetic modifications. Genome Res. 24, 896–905 (2014).
Fuchs, G. et al. 4sUDRB-seq: measuring genomewide transcriptional elongation rates and initiation frequencies within cells. Genome Biol. 15, R69 (2014).
Vlaming, H. & van Leeuwen, F. The upstreams and downstreams of H3K79 methylation by DOT1L. Chromosoma 125, 593–605 (2016).
Wang, X., Chen, C.-W. & Armstrong, S. A. The role of DOT1L in the maintenance of leukemia gene expression. Curr. Opin. Genet. Dev. 36, 68–72 (2016).
Kealy, L., Runting, J., Thiele, D. & Scheer, S. An emerging maestro of immune regulation: how DOT1L orchestrates the harmonies of the immune system. Front. Immunol. 15, 1385319 (2024).
Wille, C. K., Zhang, X., Haws, S. A., Denu, J. M. & Sridharan, R. DOT1L is a barrier to histone acetylation during reprogramming to pluripotency. Sci. Adv. 9, eadf3980 (2023).
Cecere, G., Hoersch, S., Jensen, M. B., Dixit, S. & Grishok, A. The ZFP-1(AF10)/DOT-1 complex opposes H2B ubiquitination to reduce Pol II transcription. Mol. Cell 50, 894–907 (2013).
Wu, A. et al. DOT1L complex regulates transcriptional initiation in human erythroleukemic cells. Proc. Natl. Acad. Sci. USA 118, e2106148118 (2021).
Malcom, C. A. et al. Primitive erythropoiesis in the mouse is independent of DOT1L methyltransferase activity. Front. Cell Dev. Biol. 9, 813503 (2021).
Borosha, S. et al. DOT1L mediated gene repression in extensively self-renewing erythroblasts. Front. Genet. 13, 828086 (2022).
Cao, K. et al. DOT1L-controlled cell-fate determination and transcription elongation are independent of H3K79 methylation. Proc. Natl. Acad. Sci. USA 117, 27365–27373 (2020).
Roux, K. J., Kim, D. I., Raida, M. & Burke, B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. J. Cell Biol. 196, 801–810 (2012).
Pecori, R., Di Giorgio, S., Lorenzo, J. P. & Papavasiliou, F. N. Functions and consequences of AID/APOBEC-mediated DNA and RNA deamination. Nat. Rev. Genet. 23, 505–518 (2022).
Methot, S. P. et al. Consecutive interactions with HSP90 and eEF1A underlie a functional maturation and storage pathway of AID in the cytoplasm. J. Exp. Med. 212, 581–596 (2015).
Lim, J. et al. Nuclear proximity of Mtr4 to RNA exosome restricts DNA mutational asymmetry. Cell 169, 523–537.e15 (2017).
Mohan, M. et al. Linking H3K79 trimethylation to Wnt signaling through a novel Dot1-containing complex (DotCom). Genes Dev. 24, 574–589 (2010).
Uğurlu-Çimen, D. et al. AF10 (MLLT10) prevents somatic cell reprogramming through regulation of DOT1L-mediated H3K79 methylation. Epigenetic Chromatin 14, 32 (2021).
Luo, Z., Lin, C. & Shilatifard, A. The super elongation complex (SEC) family in transcriptional control. Nat. Rev. Mol. Cell Biol. 13, 543–547 (2012).
Tsukumo, S.-I. et al. AFF3, a susceptibility factor for autoimmune diseases, is a molecular facilitator of immunoglobulin class switch recombination. Sci. Adv. 8, eabq0008 (2022).
Kuntimaddi, A. et al. Degree of recruitment of DOT1L to MLL-AF9 defines level of H3K79 di- and tri-methylation on target genes and transformation potential. Cell Rep. 11, 808–820 (2015).
Zhang, H. et al. Structural and functional analysis of the DOT1L-AF10 complex reveals mechanistic insights into MLL-AF10-associated leukemogenesis. Genes Dev. 32, 341–346 (2018).
Song, X. et al. A higher-order configuration of the heterodimeric DOT1L-AF10 coiled-coil domains potentiates their leukemogenenic activity. Proc. Natl. Acad. Sci. USA 116, 19917–19923 (2019).
He, N. et al. Human Polymerase-Associated Factor complex (PAFc) connects the Super Elongation Complex (SEC) to RNA polymerase II on chromatin. Proc. Natl. Acad. Sci. USA 108, E636–E645 (2011).
Nakamura, M. et al. High frequency class switching of an IgM+ B lymphoma clone CH12F3 to IgA+ cells. Int. Immunol. 8, 193–201 (1996).
Ling, A. K. et al. Double-stranded DNA break polarity skews repair pathway choice during intrachromosomal and interchromosomal recombination. Proc. Natl. Acad. Sci. USA 115, 2800–2805 (2018).
Tang, H. et al. DOT1L-mediated RAP80 methylation promotes BRCA1 recruitment to elicit DNA repair. Proc. Natl. Acad. Sci. USA 121, e2320804121 (2024).
Kari, V. et al. The histone methyltransferase DOT1L is required for proper DNA damage response, DNA repair, and modulates chemotherapy responsiveness. Clin. Epigenet. 11, 4 (2019).
Daigle, S. R. et al. Potent inhibition of DOT1L as treatment of MLL-fusion leukemia. Blood 122, 1017–1025 (2013).
Min, J., Feng, Q., Li, Z., Zhang, Y. & Xu, R. M. Structure of the catalytic domain of human Dot1L, a non-SET domain nucleosomal histone methyltransferase. Cell 112, 711–723 (2003).
Valencia-Sánchez, M. I. et al. Structural basis of Dot1L stimulation by histone H2B lysine 120 ubiquitination. Mol. Cell 74, 1010–1019.e6 (2019).
Anderson, C. J. et al. Structural basis for recognition of ubiquitylated nucleosome by Dot1L methyltransferase. Cell Rep. 26, 1681–1690.e5 (2019).
Chan, V. W. et al. The molecular mechanism of B cell activation by toll-like receptor protein RP-105. J. Exp. Med. 188, 93–101 (1998).
Kim, S.-K. et al. Human histone H3K79 methyltransferase DOT1L protein binds actively transcribing RNA polymerase II to regulate gene expression. J. Biol. Chem. 287, 39698–39709 (2012).
Lin, J. et al. Menin “reads” H3K79me2 mark in a nucleosomal context. Science 379, 717–723 (2023).
Klein, I. A. et al. Translocation-capture sequencing reveals the extent and nature of chromosomal rearrangements in B lymphocytes. Cell 147, 95–106 (2011).
Chiarle, R. et al. Genome-wide translocation sequencing reveals mechanisms of chromosome breaks and rearrangements in B cells. Cell 147, 107–119 (2011).
Ramiro, A. R. et al. Role of genomic instability and p53 in AID-induced c-myc-Igh translocations. Nature 440, 105–109 (2006).
Wille, C. K. & Sridharan, R. DOT1L inhibition enhances pluripotency beyond acquisition of epithelial identity and without immediate suppression of the somatic transcriptome. Stem Cell Rep. 17, 384–396 (2022).
Kwesi-Maliepaard, E. M. et al. The histone methyltransferase DOT1L prevents antigen-independent differentiation and safeguards epigenetic identity of CD8+ T cells. Proc. Natl. Acad. Sci. USA 117, 20706–20716 (2020).
Vian, L. et al. The energetics and physiological impact of cohesin extrusion. Cell 173, 1165–1178.e20 (2018).
Thomas-Claudepierre, A.-S. et al. The cohesin complex regulates immunoglobulin class switch recombination. J. Exp. Med. 210, 2495–2502 (2013).
Busslinger, G. A. et al. Cohesin is positioned in mammalian genomes by transcription, CTCF, and Wapl. Nature 544, 503–507 (2017).
Schwalb, B. et al. TT-seq maps the human transient transcriptome. Science 352, 1225–1228 (2016).
Chen, F. X., Smith, E. R. & Shilatifard, A. Born to run: control of transcription elongation by RNA polymerase II. Nat. Rev. Mol. Cell Biol. 19, 464–478 (2018).
Harlen, K. M. & Churchman, L. S. The code and beyond: transcription regulation by the RNA polymerase II carboxy-terminal domain. Nat. Rev. Mol. Cell Biol. 18, 263–273 (2017).
Velychko, T. et al. CDK7 kinase activity promotes RNA polymerase II promoter escape by facilitating initiation factor release. Mol. Cell 84, 2287–2303.e10 (2024).
Fong, N., Saldi, T., Sheridan, R. M., Cortazar, M. A. & Bentley, D. L. RNA pol II dynamics modulate co-transcriptional chromatin modification, CTD phosphorylation, and transcriptional direction. Mol. Cell 66, 546–557.e3 (2017).
Leng, X. et al. Organismal benefits of transcription speed control at gene boundaries. EMBO Rep. 21, e49315 (2020).
Wang, X., Fan, M., Kalis, S., Wei, L. & Scharff, M. D. A source of the single-stranded DNA substrate for activation-induced deaminase during somatic hypermutation. Nat. Commun. 5, 4137 (2014).
Rajagopal, D. et al. Immunoglobulin switch mu sequence causes RNA polymerase II accumulation and reduces dA hypermutation. J. Exp. Med. 206, 1237–1244 (2009).
Kodgire, P., Mukkawar, P., Ratnam, S., Martin, T. E. & Storb, U. Changes in RNA polymerase II progression influence somatic hypermutation of Ig-related genes by AID. J. Exp. Med. 210, 1481–1492 (2013).
Jin, B. et al. MEN1 is a regulator of alternative splicing and prevents R-loop-induced genome instability through suppression of RNA polymerase II elongation. Nucleic Acids Res. 51, 7951–7971 (2023).
Berry, S. & Pelkmans, L. Mechanisms of cellular mRNA transcript homeostasis. Trends Cell Biol. 32, 655–668 (2022).
Jonkers, I. & Lis, J. T. Getting up to speed with transcription elongation by RNA polymerase II. Nat. Rev. Mol. Cell Biol. 16, 167 (2015).
Schede, H. H., Natarajan, P., Chakraborty, A. K. & Shrinivas, K. A model for organization and regulation of nuclear condensates by gene activity. Nat. Commun. 14, 4152 (2023).
Dai, P. et al. Transcription-coupled AID deamination damage depends on ELOF1-associated RNA polymerase II. Mol. Cell 85, 1280–1295.e9 (2025).
Wu, L. et al. Transcription elongation factor ELOF1 is required for efficient somatic hypermutation and class switch recombination. Mol. Cell 85, 1296–1310.e7 (2025).
Larijani, M. et al. AID associates with single-stranded DNA with high affinity and a long complex half-life in a sequence-independent manner. Mol. Cell Biol. 27, 20–30 (2007).
Steger, D. J. et al. DOT1L/KMT4 recruitment and H3K79 methylation are ubiquitously coupled with gene transcription in mammalian cells. Mol. Cell Biol. 28, 2825–2839 (2008).
Bhargava, R. et al. C-NHEJ without indels is robust and requires synergistic function of distinct XLF domains. Nat. Commun. 9, 2484 (2018).
Couzens, A. L. et al. Protein interaction network of the mammalian Hippo pathway reveals mechanisms of kinase-phosphatase interactions. Sci. Signal. 6, rs15 (2013).
Hogan, A. K. et al. UBR7 acts as a histone chaperone for post-nucleosomal histone H3. EMBO J. 40, e108307 (2021).
Gregersen, L. H., Mitter, R. & Svejstrup, J. Q. Using TTchem-seq for profiling nascent transcription and measuring transcript elongation. Nat. Protoc. 15, 604–627 (2020).
Yoon, S., Chandra, A. & Vahedi, G. Stripenn detects architectural stripes from chromatin conformation data using computer vision. Nat. Commun. 13, 1602 (2022).
Gressel, S. et al. CDK9-dependent RNA polymerase II pausing controls transcription initiation. eLife 6, e29736 (2017).
Acknowledgements
We thank Dr. Nicole Francis, Dr. Ramiro Verdun, and Dr. Ivan D’Orso for critical reading, and Dr. Rafael Casellas and Mani Larijani for discussions. We thank E. Massicotte and J. Lord for technical assistance with flow cytometry, D. Faubert with mass spectrometry, and S. Boissel, M. Rondeau, P. Gingras-Gélinas, and F. Couderc with Sanger and NGS experiments. Computations were made on the supercomputer Narval from École de technologie supérieure, managed by Calcul Québec and Compute Canada. The operation of this supercomputer is funded by the Canada Foundation for Innovation (CFI), Ministère de l’Économie, des Sciences et de l’Innovation du Québec (MESI) and le Fonds de recherche du Québec – Nature et technologies (FRQ-NT). The distiller-nf pipeline was run on Cedar using computing resources provided by the Digital Research Alliance of Canada, the organization responsible for digital research infrastructure in Canada, and ACENET, the regional partner in Atlantic Canada. ACENET is funded by Industry Science & Economic Development, the provinces of New Brunswick, Newfoundland & Labrador, Nova Scotia, and Prince Edward Island, as well as the Atlantic Canada Opportunities Agency. This work was supported by grants from the Canadian Institutes of Health Research (PJ-155944 and PJ-497331 to J.M.D.N., and PJ-156383 to F.R.) and the Natural Sciences and Engineering Research Council of Canada (RGPIN-2016-04808 to J.-F.C.). P.G.S. was supported by a doctoral fellowship from the Cole Foundation. N.S., J.R., and P.A.D. were supported by graduate student fellowships from the IRCM Foundation. P.G.S., N.S., and H.B. were supported by doctoral training awards from the Fonds de Recherche du Québec-Santé (FRQ-S). J.-F.C. holds the Canada Research Chair in Cellular Signaling and Cancer Metastasis and the Alain Fontaine Chair in Cancer Research from the IRCM Foundation. J.M.D.N. held a Distinguished Research Scholarship from the FRQ-S.
Author information
Contributions
P.G.S., N.S., and J.M.D.N. conceived the study, designed experimentation, and performed data analysis. P.G.S., N.S., J.R., P.D., M.P., and H.B. conducted the investigation. P.G.S. and J.B. performed bioinformatic analysis of proteomic data. P.G.S. and C.P. performed bioinformatic analysis of genomic data. J.-F.C. supervised H.B. F.R. provided theoretical guidance and co-supervised N.S. J.M.D.N. wrote the manuscript. P.G.S., N.S., and F.R. corrected the manuscript. All authors provided comments on and approved the manuscript. J.M.D.N. procured funding.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks David Schatz, Tanja Vogel, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Subramani, P.G., Seija, N., Ridani, J. et al. DOT1L activity limits transcription elongation velocity and favors RNAPII pausing to facilitate mutagenesis by AID. Nat Commun 17, 1623 (2026). https://doi.org/10.1038/s41467-026-68332-4
DOI: https://doi.org/10.1038/s41467-026-68332-4