Main

RNAs are decorated with various chemical modifications that determine their fate, and specific m5C deposition is important for regulating gene expression necessary for cell proliferation and differentiation, development, stress response and immunity3,4. The NOP2/Sun (NSUN) family of methyltransferases add m5C modifications to RNAs5, which can then be detected by reader proteins to regulate RNA function, transport or stability and also be removed by TET enzymes or ALKBH1 (refs. 6,7,8,9). Thus, m5C is a dynamic epitranscriptomic modification that provides a handle for regulating RNAs post-transcriptionally.

NSUN2 is the primary m5C writer for mRNAs, and its dysregulation has been implicated in numerous human diseases10,11,12,13. Loss-of-function mutations in NSUN2 cause intellectual disability and Dubowitz-like syndrome, as NSUN2 is important for neural circuit formation and synaptic function14,15,16,17. NSUN2 also regulates ageing, tissue regeneration and tumourigenesis18,19,20,21,22,23,24. Notably, NSUN2 copy number is amplified in more than 10% of non-small-cell lung cancers25 (Extended Data Fig. 1a), and dysregulated m5C metabolism also promotes leukaemogenesis26. Therefore, NSUN2 is an important biological regulator and a potential therapeutic target.

Each m5C writer exhibits distinct substrate preferences5. NSUN2 is an especially versatile methyltransferase that can modify many types of RNA, including mRNAs, tRNAs, various ncRNAs and viral RNAs27,28,29. Even within tRNAs, NSUN2 can methylate multiple positions30,31,32. However, the mechanism underlying the specific and regulated methylation activity of NSUN2 remains unknown, and transcriptome-wide studies yield no clear consensus motif10,11,13,33. Deciphering the molecular mechanisms underlying NSUN2 activity is important for understanding its broader impact on human health and pathology.

Here, we describe the mechanism by which NSUN2 specifically methylates RNA. We present a high-resolution cryogenic electron microscopy (cryo-EM) structure (2.57 Å) of an NSUN2–RNA reaction intermediate. Moreover, we provide eight more cryo-EM structures of NSUN2–substrate complexes at various stages of the catalytic cycle to explain how an interplay among different RNA conformational states, cofactor binding and specific protein–RNA contacts drives reaction progress. Furthermore, we show how the methyltransferase can modify multiple sites within the same tRNA scaffold while maintaining specificity. We identify a dual-stem structure with a bulge-like linker and an m5CNNRR motif as the preferred features of NSUN2 substrates. We introduce a simplified, minimal substrate to demonstrate how NSUN2 targets are more generally recognized. Overall, our findings show the molecular basis of m5C modification, providing a framework for understanding its role in biology and disease.

Cryo-EM structure of NSUN2–tRNA

To understand the mechanism of NSUN2-mediated m5C modification of RNAs, we aimed to obtain an atomic model of the enzyme–substrate complex. We selected tRNAs as they are well-established substrates of NSUN2 (refs. 30,31) and are known for their stable, conserved fold. We purified recombinantly expressed wild-type full-length human NSUN2 protein and reconstituted its methylation activity with purified, in vitro transcribed tRNAs (Fig. 1a and Extended Data Fig. 1b). tRNALysCTT contains a known NSUN2 methylation site at Cyt48 in the variable region, and is robustly methylated in vitro. By contrast, tRNAIleTAT is a poor methylation substrate because it lacks a cytosine in this region. When Cyt48 of tRNALysCTT is mutated to uridine, the catalytic activity becomes undetectable, although the binding affinity remains comparable to that of wild-type substrates in the electrophoretic mobility shift assay (EMSA; Extended Data Fig. 1c,d), indicating that stable protein–RNA complex association is insufficient to drive methylation without cytosine at position 48 in tRNALysCTT. Thus, we successfully reconstituted the specific methylation activity of NSUN2 with a tRNA substrate.

Fig. 1: Cryo-EM structure of human NSUN2–tRNA reaction intermediate.

a, In vitro methylation activity of full-length wild-type (WT) human NSUN2 with indicated tRNA substrates, shown as mean ± standard deviation (s.d.) (n = 3). Statistical significance is indicated (two-tailed unpaired t-test). DPM, disintegrations per minute. b, In vitro crosslinking activity of NSUN2C271A with indicated tRNA substrates. Representative SDS–PAGE gel from three replicates is shown. For gel source data, see Supplementary Fig. 1. c, Domain organization of human NSUN2. d, Sharpened cryo-EM density map of the NSUN2C271A–tRNALysCTT complex with bound SAH (pink) and RNA (orange) in the D-arm conformation. The NSUN2 portion is coloured by domain as marked in c. e, Cloverleaf diagram of tRNALysCTT used in cryo-EM. Protein–RNA contacts (distance <3.8 Å) are coloured by the interacting protein domain, as in c, with the target cytosine highlighted in yellow. Unresolved nucleotide 76 is shown in grey. f–h, Models of the NSUN2C271A–tRNALysCTT–SAH complex in the D-arm conformation in cartoon representation from three different views, with SAH as pink sticks and m5Cyt48 as cyan sticks.


To gain insight into the catalytic activity of NSUN2, we sought to stabilize the reaction intermediate by leveraging its unique chemical mechanism. According to the proposed mechanism conserved for m5C methyltransferases34,35, NSUN2 forms a covalent linkage with the substrate cytosine base through a cysteine (C321), activating the methyl transfer from S-adenosylmethionine (SAM) to the cytosine and yielding S-adenosylhomocysteine (SAH) as a by-product; subsequently, a second cysteine (C271) is necessary to resolve the catalytic intermediate (Extended Data Fig. 1e,f). We mutated C271 to alanine (C271A) to inhibit the dissolution of the protein–RNA conjugate, as an intermolecular crosslink would increase the stability of the protein–RNA complex, enabling us to determine a higher-resolution complex structure and reveal a view of the catalytic intermediate. NSUN2–tRNA conjugate formation requires SAM and Cyt48, and the specific crosslinking activity is readily detectable as the conjugate is resistant to denaturing conditions (Fig. 1b).

We used single-particle cryo-EM to determine the structure of an NSUN2C271A–tRNA crosslinked complex at 2.57 Å resolution (Fig. 1c–e, Extended Data Table 1 and Extended Data Figs. 1 and 2). The overall architecture of the complex shows that NSUN2 contains four domains: an N-terminal domain (NTD), a methyltransferase domain (MTD) and two C-terminal domains (CTD1 and CTD2). The catalytic MTD is the largest domain and organizes the protein by making direct contact with NTD and CTD1, and CTD2 is wedged between CTD1 and NTD (Fig. 1f). NSUN2 uses all the domains to make extensive contact (2,878 Å2) with the tRNA, and most of the tRNA interacts with NSUN2, except for the D-arm (Fig. 1g,h). The lack of substantial protein contact probably leads to conformational variability in the D-arm; we observe two main classes of particles: one with a visible D-arm and the other with a disordered D-arm (Extended Data Fig. 1g,h). We also observe that the interactions between NSUN2 and the anticodon arm are distinct in these two conformations (Extended Data Fig. 1i,j), suggesting that the anticodon arm interacts with the protein more weakly. Nevertheless, despite weaker contact with D- and anticodon arms, the tRNA molecule is stably bound to NSUN2 to produce a high-resolution map of the RNA–protein complex (Extended Data Fig. 2).

A striking feature of the NSUN2–tRNA conjugate structure is the unusual conformation of the bound tRNA. In the NSUN2-bound state, the tRNA no longer adopts its canonical L-shape that depends on the contact between the T- and D-loops to form the elbow (Fig. 2a and Extended Data Fig. 1k). Structural superimposition indicates a large amount of unfolding, in which Gua19N1 and Cyt56N3, which typically form a hydrogen bond as part of the kissing loop interaction between the D- and T-loops, become separated by about 35 Å. This conformational change of the tRNA is probably stabilized by extensive protein–RNA contacts. Although NSUN2 envelops most of the tRNA by a large basic surface (Fig. 2b), the most intimate contact is observed near the NSUN2 regions of higher evolutionary conservation that bind the T-arm and parts of the acceptor arm (Fig. 2c–e). The T-stem is recognized by a row of basic side chains from the NTD that recognizes the pattern of phosphates from the duplexed RNA, ensuring a stem structure adjacent to the methylated cytidine (Fig. 2f and Extended Data Fig. 1l). Here, the guanidinium groups of R133 and R137 also contact N7 of guanine bases, which would cause purines to be preferred at these positions. The unpaired nucleotides in the T-loop form several favourable contacts with NSUN2, through ring-stacking and hydrophobic interactions, along with a few electrostatic contacts (Fig. 2g and Extended Data Fig. 1m). The base of the acceptor arm is buttressed by the MTD helix (α3) that provides a platform for the RNA duplex, and the phenol group of Y222 stabilizes Gua66 by ring stacking (Fig. 2h and Extended Data Fig. 1n). The backbone edge of the duplex RNA in the acceptor arm makes favourable contacts with the CTD1 (Fig. 2i and Extended Data Fig. 1o). 
At the 3′ end of the acceptor arm, the single-stranded RNA (ssRNA) is ordered enough to be modelled, although less defined than most of the RNA, suggesting that it may have loose interactions with the CTDs (Extended Data Fig. 1p). The key RNA–protein contacts can be summarized using a schematic (Fig. 2j). Several point mutations in these contact areas can diminish productive methylation, in agreement with our structural observations (Fig. 2k). The R133Q mutation has been observed in tumour samples36 (Fig. 2f), and reduced methylation activity may contribute to certain cancers. Therefore, the structure-guided biochemical analysis suggests that the NSUN2–tRNA complex is primarily stabilized by RNA structure-dependent interactions, including electrostatic interactions with the phosphate edges of the duplex (acceptor arm and T-arm) and sequence-independent stacking interactions with the T-loop.

Fig. 2: NSUN2 stabilizes a non-canonical tRNA conformation.

a, Alignment of unbound (PDB: 1FIR, grey) and NSUN2-bound (PDB: 9Z2N from this study, orange) tRNALysTTT by their acceptor arms. Dashed arrows show the conformational differences in key structural elements. b,c, Models of the NSUN2C271A–tRNALysCTT complex with RNA shown as ribbon and protein in surface representation, coloured by vacuum electrostatic potential (b) or evolutionary conservation score (c). Conservation score per residue calculated using ConSurf, with yellow indicating insufficient data. d, NSUN2 surface representation coloured by domain, except for the residues within 3.8 Å of RNA that are shown in orange. e, Similar to d, protein coloured by domain, with RNA–protein interfaces highlighted with black boxes and panel labels. f–i, Close-up views of protein–RNA interfaces highlighted in e. Protein side chains within 3.8 Å of RNA are shown as sticks, and electrostatic contacts (<3.8 Å) are indicated by black dashed lines. Cyt48 is shown in cyan. j, Schematic of protein–RNA contacts (<3.8 Å) in T- and acceptor arms with protein residues coloured according to d. Arrows indicate electrostatic (triangle) or hydrophobic (circle) interactions between main chain (solid) or side chain (dashed) protein atoms with RNA. k, In vitro methylation activity of NSUN2 mutants with substitutions in NTD (blue) or MTD (purple), with wild-type tRNALysCTT, shown as mean ± s.d. (n = 3). Ctr, no enzyme control. Significance tests are comparisons with NSUN2WT (two-tailed unpaired t-test).


Conformational changes during catalysis

Our NSUN2–tRNA conjugate structures provide a high-resolution view of the NSUN2 catalytic pocket in action, revealing the reaction intermediate (Fig. 3a and Extended Data Fig. 1q). The methylated cytosine ring and the by-product SAH are deeply buried in the catalytic pocket. Our maps are sufficiently resolved to model the SAH and the protein–RNA crosslink between Cyt48 and the catalytic residue C321. An intricate network of polar contacts and van der Waals interactions determines the conformations of the cofactor and the methylated cytosine ring. The specificity for the cytosine base is conferred through recognition of N4 by main-chain carbonyl interactions with V269 and S319 (Extended Data Fig. 1r). RNA drapes over the bound SAH, suggesting the bound cofactor helps organize the RNA near the catalytic pocket (Extended Data Fig. 1s). As expected for residues responsible for precise chemistry, point mutations of the amino acids involved in coordinating the catalytic pocket are deleterious for methylation activity, including the R220C mutation found in multiple types of cancer36 (Fig. 3a,b). Thus, a near-atomic-resolution view of the NSUN2–RNA conjugate provides valuable insight into how the key cytidine is recruited into the catalytic pocket to undergo methylation.

Fig. 3: RNA conformations change with the catalytic cycle.

a, View of the NSUN2C271A–tRNALysCTT catalytic pocket with protein side chains within 3.8 Å of SAH or m5Cyt48 shown as sticks. Electrostatic contacts are shown as black dashed lines, and other nucleotides are removed for clarity. b, In vitro methylation activity of NSUN2 catalytic site mutants, shown as mean ± s.d. (n = 3). Ctr, no enzyme control. Significance tests are comparisons with NSUN2WT (two-tailed unpaired t-test). c–e, Models of apo NSUN2WT–tRNALysCTT complex in two conformations (c), NSUN2WT–tRNALysCTT–SFG complex (d) and NSUN2C271A–tRNALysCTT–SAH complex (e). Protein coloured by domain, SFG and SAH shown as pink sticks, and Cyt48 shown as cyan sticks. f–i, Close-up views of the catalytic sites corresponding to the structures in c–e overlaid with their sharpened cryo-EM density map shown as transparent surfaces. Protein side chains, SFG, SAH and Cyt48 are shown as sticks.


Given the dynamic nature of the bound RNA, as evidenced by the two main conformations observed with the NSUN2C271A–tRNALysCTT complex, we investigated changes in RNA–protein interactions throughout the catalytic cycle. We determined the NSUN2 affinity for SAM to be about 10 µM by isothermal titration calorimetry, similar to previously reported values19, whereas that for tRNAs is >100-fold tighter regardless of cofactor-bound state (Extended Data Fig. 1c,d,t–v). We used wild-type NSUN2–tRNALysCTT complexes to determine cryo-EM structures with no cofactor (apo) or bound with a non-reactive SAM-analogue, sinefungin (SFG) (Fig. 3c–e, Extended Data Table 1 and Extended Data Figs. 3 and 4). The apo state exhibits the greatest RNA conformational heterogeneity, particularly near the T-arm and the catalytic pocket (Fig. 3c,f,g). The vacant cofactor-binding pocket allows the RNA to be more flexible near the active site, resulting in two distinct RNA conformations, especially in the T-arm. Each RNA segment shift is followed by the corresponding bound protein domain, underscoring the specific and tight interactions between the T-arm and the NTD, and between the acceptor arm and the CTD. Thus, the two structures from the apo state demonstrate that the domains of NSUN2 can recognize and bind the RNA substrate with sufficient specificity and affinity to bring the target base close to the catalytic pocket, though not enough to stabilize the active conformation (Fig. 3f,g). On binding SFG, Cyt48 docks properly into the catalytic pocket with the target carbon atom (C5) moving approximately 4 Å deeper into the catalytic groove, ready for methylation (Fig. 3d,h). These results align with our findings that efficient crosslinking in the presence of mutant NSUN2C271A requires SAM (Fig. 1b). The protein and RNA conformations in the pre-catalytic conformation with SFG are similar to the catalytic intermediate represented by the crosslinked complex structure with NSUN2C271A (Fig. 3e,i, root mean square deviation (RMSD) = 0.31 Å). Therefore, although the overall affinity of NSUN2 for substrate RNA is sufficient to form stable complexes with approximately correct architecture, establishing an active catalytic pocket requires the bound cofactor to position the cytosine ring with proper geometry.

NSUN2 methylates multiple tRNA positions

Analysis of multiple catalytic states of NSUN2 in complex with tRNALysCTT provided valuable insights into its mechanism and substrate specificity. However, tRNALysCTT contains only one cytidine at position 48, and NSUN2 has been reported to methylate nearby positions on the tRNA, such as positions 49 and 50 (refs. 30,31). To compare methylation efficiency among the different positions in the variable region, we reconstituted NSUN2 methylation of human wild-type tRNAs that contain a single cytidine within the 48–50 window. We found that methylation was more robust at positions 48 and 49, and almost undetectable for tRNAs with only Cyt50 (Fig. 4a). We also performed systematic mutagenesis studies, using three tRNA scaffolds, each with a different native cytidine position, and moved the Cyt to the other two positions (Fig. 4b). When we used tRNALysCTT (with Cyt48) as the scaffold, moving the cytidine to positions 49 or 50 did not support detectable levels of methylation. However, when we used tRNALysTTT (with Cyt49), moving the cytidine to position 48 did not significantly alter methylation efficiency, and position 50 yielded lower methylation activity. We did not observe methylation above background with wild-type tRNATrpCCA (Cyt50), but moving the cytidine to positions 48 or 49 in this scaffold greatly increased methylation efficiency. Similar differences in catalytic activity were observed when we used NSUN2C271A to detect the formation of a reaction intermediate (Fig. 4c). From these experiments, we conclude that NSUN2 generally methylates positions 48 and 49 of tRNAs more efficiently, whereas position 50 is less preferred.

Fig. 4: NSUN2 can methylate multiple tRNA positions.

a,b, In vitro methylation activity of NSUN2 with wild-type human tRNAs containing a single cytosine in tRNA positions 48, 49 or 50 (a), or mutant tRNAs in which the single cytosine was shifted to the other positions (b). Ctr, no enzyme control. Data are shown as mean ± s.d. (n = 3, two-tailed unpaired t-test). c, In vitro crosslinking activity of NSUN2C271A with the same tRNAs used in b. A representative gel from three replicates is shown. For gel source data, see Supplementary Fig. 1. d, Structure of NSUN2C271A–tRNALysTTT–SAH complex, in the D-arm conformation. Protein is shown as a cartoon, RNA as a grey ribbon and SAH as pink sticks. e, Superimposition of NSUN2C271A–tRNALysCTT and NSUN2C271A–tRNALysTTT D-arm conformation structures by MTD. The RNA conformational shift is shown as orange (tRNALysCTT) and grey (tRNALysTTT) ribbons with protein in surface representation. Cyt49 of tRNALysTTT is coloured cyan. f,g, Close-up views of the junctions of acceptor arm and T-arm in the NSUN2C271A–tRNALysCTT (C48, f) and NSUN2C271A–tRNALysTTT (C49, g) structures. Key nucleotides are shown as sticks for orientation, and target cytosines are coloured cyan with protein shown in surface representation.


To investigate how the same enzyme accommodates several tRNA positions given the extensive contacts and intricate atomic geometry observed in the NSUN2–tRNALysCTT (with Cyt48) complexes, we determined cryo-EM structures of NSUN2 crosslinked to wild-type tRNALysTTT with a Cyt49 (Fig. 4d, Extended Data Table 1 and Extended Data Fig. 5). Comparing the two tRNA structures with distinct cytidine positions shows that the broad specificity of NSUN2 as a methyltransferase arises from its ability to bind the tRNA in several distinct conformations (Fig. 4e). We find that the first base pair of the tRNA T-stem (position 49:65) is dissolved to allow for Cyt49 to enter the catalytic pocket, and the released position 65 pairs with the Ura8 to lengthen the acceptor arm by 1 base pair; the protein conformation remains constant (RMSD = 0.123 Å) in the two structures (Fig. 4f,g and Extended Data Fig. 5h,i). In both structures, NSUN2 recognizes the duplex RNAs surrounding the methylated cytidine, and the protein–RNA contact locations are determined by their distance from the methylated cytidine. To enable Cyt50 to access the catalytic pocket in a similar manner, the resulting three-base-pair T-stem may be too short to fully retain the observed favourable interactions in the structures (Fig. 2j and Extended Data Fig. 5j). Therefore, by comparing how NSUN2 methylates distinct tRNA positions, we identified key RNA features for productive m5C modification.

Designing a minimal substrate for NSUN2

By combining insights from structures across multiple states and different substrates, we could identify which protein–RNA contacts are important for productive m5C modification. We postulated that RNA structural features drive NSUN2 recognition because of the many interactions with the dsRNA. A linear tRNALysTTT segment (11 nt) containing the m5C site and its neighbouring primary sequence exhibits no detectable methylation activity, indicating the importance of a larger structural context (Fig. 5a,b). Disrupting the secondary structure of the T- and acceptor arms by introducing Cyt-to-Ade mutations that create mismatches in the RNA stems impairs methylation activity. Reinstating the Watson–Crick base pairing while changing the primary sequence (from CAGGG to CAAAG) adjacent to the methylated cytidine rescues the methylation activity, suggesting that the secondary structure is crucial for recognition. However, a mutant tRNALysTTT (DLoopMut) that cannot form proper kissing loop interactions between D- and T-loops required for tRNA tertiary structure37 is methylated with similar efficiency as the wild-type tRNA, suggesting that the full tRNA fold is not required for recognition by NSUN2 (Extended Data Fig. 6a–c). Another tRNA variable region methyltransferase, METTL1–WDR4, is known to recognize the intact tRNA tertiary fold through the elbow structure to methylate Gua46 (refs. 38,39). In contrast to NSUN2, methylation of the DLoopMut mutant by METTL1–WDR4 is undetectable, indicating that the tRNA elbow structure is disrupted (Extended Data Fig. 6d). These data suggest that NSUN2 recognizes RNA duplexes but does not require a full tRNA structure.

Fig. 5: Designing a miniature NSUN2 RNA substrate.

a, Schematics of tRNALysTTT variants. The 11-nt ssRNA ‘Oligo’ segment spans the region marked in blue, and ‘AC-arm’ refers to the anticodon arm. Mutations are coloured red, and compensatory mutations to preserve base pairing are coloured grey. b, In vitro methylation activity of NSUN2 with tRNA substrates shown in a (light grey) and Extended Data Fig. 6a (dark grey). The dashed line separates the two mutagenesis series. RNA sequence following the methylated Cyt (arrow) is shown on the right with position numbers. Data are shown as mean ± s.d. (n = 3, two-tailed unpaired t-test). c, Schematics of tRNALysTTT with arm deletions. d,e, In vitro methylation activity of NSUN2 (d) and the m7G methyltransferase METTL1–WDR4 complex (e) with the tRNAs from c. Significance tests are comparisons with wild-type tRNALysTTT. Data are shown as mean ± s.d. (n = 3, two-tailed unpaired t-test). f, Schematics of Mini NSUN2 substrate derived from NSUN2C271A–tRNALysTTT structures. g, In vitro methylation activity of NSUN2 with Mini (f), with additional deletions or modifications as described. Data are shown as mean ± s.d. (n = 3, two-tailed unpaired t-test). h,i, Schematics of two variants of Mini (f), with a closed C-stem and open bulge (h) or open N-loop (i). Ctr, no enzyme control.


Apart from structural recognition, we consistently observe ordered arginine side chains (R133 and R137) interacting with guanine bases 3′ of the methylated Cyt (Fig. 2f and Extended Data Fig. 1l). Previous transcriptome-wide studies and consensus analyses also suggested a preference for guanosines near NSUN2-dependent m5C sites10,11,12,13,33. We tested the importance of nucleotide sequence by systematically replacing each nucleotide position with Cyt (Fig. 5b and Extended Data Fig. 6e). Uridine was tested alongside cytidine at position 50 because Cyt50 is susceptible to methylation. Although the two nucleotides immediately following the m5C modification site could be replaced with a pyrimidine without altering methylation activity, purines are preferred at the positions that interact with the arginines (Gua52 and Gua53 in tRNALysTTT). Therefore, although single pyrimidine substitutions can still support m5C modification, the CNNRR sequence is preferred by NSUN2.
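As a rough illustration of what the CNNRR preference means at the sequence level, the sketch below scans an RNA string for the motif using the IUPAC codes N (any nucleotide) and R (purine, A or G). The function name `find_cnnrr_sites` and the regular expression are illustrative assumptions, not part of the analysis in this study; a sequence match alone does not imply methylation, because the RNA structural context is also required.

```python
import re
from typing import List

# IUPAC codes: N = any nucleotide, R = purine (A or G).
# A lookahead is used so that overlapping candidate sites are all reported.
# Hypothetical helper for illustration only.
CNNRR = re.compile(r"(?=(C[ACGU]{2}[AG]{2}))")


def find_cnnrr_sites(seq: str) -> List[int]:
    """Return 0-based indices of every C that begins a CNNRR match."""
    # Assumption: input may be written as DNA; convert T to U first.
    rna = seq.upper().replace("T", "U")
    return [m.start() for m in CNNRR.finditer(rna)]
```

For example, both the wild-type CAGGG sequence and the CAAAG variant match the motif, whereas a sequence with pyrimidines at the last two positions, such as CAACC, does not.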

The uneven distribution of protein contacts on the tRNA in our complex structures suggests that not every arm of the tRNA is important for methylation. We measured changes in enzymatic activity with systematic RNA deletions (Fig. 5c). When we delete each arm of tRNALysTTT in turn, the D- and anticodon arms are dispensable, whereas the T- and acceptor arms are necessary for NSUN2-mediated methylation (Fig. 5d). To examine whether the arm requirement was specific to tRNALysTTT, we made an equivalent deletion series for tRNAMetCAT and observed a similar trend (Extended Data Fig. 6f). We tested the truncated tRNALysTTT constructs with METTL1–WDR4 and found that any arm deletion abrogates m7G methylation by METTL1–WDR4, drawing a stark contrast between the enzymes in their substrate requirements, despite their similar target location (Fig. 5e). These findings corroborate previous reports that NSUN2 can methylate the mitochondrial tRNASerGCT that lacks a D-arm40,41 (Extended Data Fig. 6g). The in vitro methylation assay results can also be recapitulated using the orthogonal crosslinking assay with NSUN2C271A (Extended Data Fig. 6h). Therefore, unlike some other tRNA modification enzymes, NSUN2 does not require a full tRNA structure to catalyse m5C modification.

We aimed to design a minimal RNA substrate for NSUN2 to further dissect the key features that are important for methylation. Deleting both the D- and anticodon arms preserves methylation at levels comparable to those of the full-length tRNA (Fig. 5f,g). We term this double-deletion construct a Mini substrate for NSUN2, and it contains the two necessary RNA stem structures—N-stem and C-stem—which bind to the NTD or CTDs of NSUN2, respectively. Mini also binds NSUN2 significantly better than the constructs missing the T- or acceptor arm (Extended Data Fig. 7a,b). Shortening the Mini substrate by deleting the overhanging 3′ ssRNA causes a modest decrease in methylation (Mini∆ssRNA; Fig. 5g). We further truncated the blunt-ended RNA by shortening the C-stem to 7 bp, as observed in tRNALysCTT, and this shorter stem resulted in a minor decrease in methylation (Mini7bp). We then tested whether the location of the 5′ and 3′ ends in Mini is important. Closing the C-stem with a tetraloop while opening the bulge region (MiniOpen-Bulge; Fig. 5g,h) preserved methylation activity comparable to the Mini7bp construct, demonstrating that NSUN2 can recognize and methylate multiple types of dual-stem scaffold. However, closing the C-stem with the same tetraloop while opening the N-loop dramatically reduces methylation activity, indicating that the N-stem-loop structure is important for proper interaction with the NTD (Fig. 5g,i). We tested the importance of the N-loop sequence that is derived from the tRNA T-loop. Mutating the conserved AG motif to GA does not alter methylation propensity, suggesting that the stem-loop conformation is important but not the sequence (Extended Data Fig. 7c,d). The shortest RNA fragments with good reactivity (Mini7bp and MiniOpen-Bulge) exhibit about half the level of methylation as the full-length tRNALysTTT. Nevertheless, given the wide activity variance among wild-type tRNAs (Fig. 4a), they serve as reasonable minimal substrates for NSUN2, reducing tRNALysTTT from 76 to 39 nucleotides. Considering the important relative orientation of the two stems, we examined how bulge length affects their function. Shorter bulges are increasingly poor NSUN2 substrates, indicating that a sufficient length is required to correctly position the two stems at an angle that allows each to bind its respective protein domain (Extended Data Fig. 7e). These experiments indicate that a dual-stem structure with a bulge or flexible linker on one side is a key RNA element for NSUN2 recognition.

Leveraging our ability to use a minimal substrate, we investigated potential interactions between adjacent m5C sites by synthesizing oligonucleotides with a unique m5C modification adjacent to the target cytosine, analogous to positions 48 and 49 in tRNAs (Extended Data Fig. 7f). Methylation activity is similar between premethylated and unmethylated substrates when the pre-existing m5C is 5′ to the target cytidine and moderately higher when m5C is on the 3′ side (Extended Data Fig. 7g). This apparent activation is probably due to structural variation (Fig. 4), as the two premethylated oligonucleotides also differ in methylation activity. As the first m5C modification can be added to Cyt48 and Cyt49 with similar efficiencies (Fig. 4a,b), our combined data suggest that each NSUN2 methylation event on a tRNA is probably stochastic rather than sequential. Modelling the premethylated nucleotides by substituting the base with m5C does not generate notable clashes or favourable contacts, consistent with the model in which adjacent m5C modifications are independent of each other (Extended Data Fig. 7h).

In summary, through a structure-guided series of biochemical studies, we have identified a small RNA fragment that encapsulates the important features of an RNA substrate for m5C methylation: a dual-stem structure with a bulged connection that can support the stems to bind at an angle, with a preference for the CNNRR motif at the 5′ end of the N-stem loop.

NSUN2 recognition elements in other RNAs

Our ability to design a minimal substrate for NSUN2 suggests that the dual-stem model could be applied to other substrate RNAs lacking a tRNA fold. To extend our understanding of NSUN2 substrates, we investigated how sites outside the tRNA variable region are recognized. Human long noncoding RNA RP11 contains a high-confidence m5C site with a top methylation score, which has been reproduced across multiple studies11,42. Applying the dual-stem framework, we identified a small RP11 fragment that recapitulates methylation and binding at the same site in vitro (Extended Data Fig. 7i,j). Mutating the methylated cytidine to uridine (RP11CtoU) abrogates methylation. When the purines in the CNNRR motif of the RP11 fragment are mutated to pyrimidines (RP11GAGCU), with compensatory mutations to maintain the stem structure, we observe a loss in methylation activity, underscoring the preference for the CNNRR motif. Therefore, our dual-stem substrate-recognition model for NSUN2 can be applied to non-tRNA substrates.

Previous studies have shown that pre-tRNALeuCAA can be modified by NSUN2 in the anticodon loop at position 34, apart from position 48 (refs. 27,32) (Fig. 6a). Cyt34 methylation may appear inconsistent with our model, in which NSUN2 methylates dual-stem RNAs resembling T- and acceptor arms (Fig. 5d). However, the stems important for NSUN2 recognition are determined by their relative positions to the potential m5C site. In pre-tRNALeuCAA, the intron introduces two extra stem structures into the tRNA. As a result, pre-tRNALeuCAA has two separate dual-stem elements: one for Cyt34 and one for Cyt48. We can reconstitute the intron-dependent m5C modification of Cyt34 and the constitutive methylation of Cyt48 for pre-tRNALeuCAA with purified NSUN2, in agreement with previous studies27,32 (Fig. 6b). Moreover, we show that a dual-stem RNA fragment that mostly consists of the intron sequence (Intron, for clarity) can be methylated with similar efficiency as the full-length pre-tRNALeuCAA (Fig. 6c,d). Mutating Intron at the m5C site (IntronC34U) or the CNNRR motif (IntronCAACC) substantially diminished methylation activity, consistent with our NSUN2 substrate-recognition model.

Fig. 6: NSUN2 recognizes dual-stem elements in various RNAs.

a, Cloverleaf diagram of pre-tRNALeuCAA with the intron coloured blue and target cytosines highlighted in yellow. Stems recognized by NSUN2 for each m5C site are marked with dashed lines. b, In vitro methylation activity of NSUN2 with tRNALeuCAA mutants. Mature tRNA is without the intron. Data are shown as mean ± s.d. (n = 3, two-tailed unpaired t-test). c, Secondary structure diagrams of wild-type (top) and mutant (bottom) pre-tRNALeuCAA intron fragments. Mutations in the 5′ stem are coloured red, compensatory mutations to preserve base pairing are coloured grey and the intron sequence is coloured blue. d, In vitro methylation activity of NSUN2 with the substrates described in c and the same fragment with a C34U mutation. Data are shown as mean ± s.d. (n = 3, two-tailed unpaired t-test). e, Cryo-EM structure of the NSUN2C271A–IntronWT crosslinked complex. Protein shown as cartoon coloured by domain, RNA as orange ribbon, and SAH as pink sticks. f, Model of NSUN2–RNA recognition and methylation. NSUN2 recognizes dual-stem RNA structures. NTD of NSUN2 binds the N-stem-loop, and CTD of NSUN2 binds the C-stem. At the 5′ end of the N-stem, the CNNRR motif is preferred for efficient methylation of the first Cyt. NSUN2 can recognize the RNA in the absence of the SAM cofactor (apo state), although NSUN2 may also bind SAM first. The empty SAM-binding pocket adjacent to the α3-helix of the MTD allows RNA conformational variability in the apo state. Cofactor binding to NSUN2 (pre-catalytic state) rigidifies the protein–RNA complex and organizes the reaction centre to be poised for catalysis. NSUN2 forms a covalent bond (through C321) with the cytidine to initiate the catalytic reaction (catalytic intermediate state), enabling methyl transfer even in the absence of the second cysteine (C271).


To provide a structural model of NSUN2 recognizing an isolated dual-stem structure, we used cryo-EM to visualize its binding to RP11 or Intron. Although the NSUN2C271A–RP11 complex data did not converge on an interpretable three-dimensional reconstruction, we successfully obtained a cryo-EM structure of the NSUN2C271A–IntronWT complex (Fig. 6e, Extended Data Fig. 8 and Extended Data Table 1). The 30-nucleotide Intron fragment binds NSUN2 with the N- and C-stems positioned similarly to the corresponding stems of intact tRNAs (RMSD, overall: 0.1 Å, RNA: 1.3 Å; Extended Data Fig. 8e), and the purines of CNNRR are similarly recognized by the side chains of R133 and R137 (Extended Data Fig. 8f). During data processing, we observed another similar conformation of the Intron complex with a longer C-stem density (Extended Data Fig. 8b). We did not build a model for the second conformation, as the shorter oligonucleotides are more likely than tRNAs to adopt alternative conformations, including forming artificial concatemers. In summary, we demonstrated that the substrate-recognition model for NSUN2—based on detecting a bulged dual-stem RNA feature with a preference for the CNNRR sequence at the 5′ end of the N-stem—can be applied more generally to NSUN2 methylation sites other than in the tRNA variable region.

Discussion

Through a series of cryo-EM structures and complementary biochemical investigations, our work sheds light on the dynamic interactions between NSUN2, RNA and SAM cofactor across multiple stages of the methylation catalytic cycle and for different types of substrate (Fig. 6f and Supplementary Video 1). We demonstrate that although the NTD and CTDs robustly bind RNA stems independently of cofactors, the SAM cofactor is essential for precisely docking the target cytosine into the active site. By comparing diverse substrates, we establish that NSUN2 selectivity is primarily driven by a dual-stem RNA element, with a preference for the CNNRR sequence starting with the methylated Cyt, at the 5′ end of the N-stem.

Structure-dependent substrate recognition by NSUN2 explains why a clear consensus motif is difficult to identify with sequence-based methods. Deleting approximately 30 nucleotides in a tRNA between NSUN2 recognition elements maintains similar methylation activity, indicating that large gaps are allowed in sequence space. Nevertheless, our structural evidence for purine preference at positions 4 and 5 corroborates the G-rich sequence context reported by existing transcriptome-wide sequence analyses6,10,11,13. Therefore, by analysing the contributions of sequence within the structural framework, our work provides a unifying model for interpreting data on NSUN2-dependent m5C methylation of RNAs.

Our findings also provide a mechanistic understanding of clinically relevant mutations and regulatory modifications of NSUN2. The G679R mutation is linked to intellectual disability and probably destabilizes the CTD2 hydrophobic core, leading to the protein aggregation we observed in vitro15,43 (Extended Data Fig. 9a). Similarly, cancer-associated mutations and regulatory SUMOylation/ubiquitination sites are positioned at the substrate–enzyme interface, where they would interfere with substrate binding23,44 (Extended Data Fig. 9b,c). Conversely, the disordered N-terminal region reported to underlie glucose-mediated metabolic regulation did not influence in vitro activity, suggesting that this regulation requires additional cellular factors19 (Extended Data Fig. 9d).

Comparison of NSUN2 with the rest of the enzyme family highlights a divergent model35 (Extended Data Fig. 9e–g). Although the catalytic domains are conserved, substrate specificity is dictated by the unique arrangement of peripheral RNA-binding domains. Although our study primarily used unmodified in vitro transcripts and left the potential influence of crosstalk with other RNA modifications for future study, the series of snapshots provides a robust framework for understanding the catalytic cycle. Ultimately, defining the minimal dual-stem substrate with a CNNRR motif enables the precise identification of direct NSUN2 targets and the design of small-molecule therapeutics to modulate its activity in disease.

Methods

Cloning, protein expression and purification

The full-length human NSUN2 gene was cloned into pET-21(+) (Novagen) with an N-terminal hexahistidine tag and a TEV cleavage site. Protein was expressed in Escherichia coli Rosetta DE3 cells (Sigma) grown in ZYM-5052 auto-induction media at 18 °C for 18 h (ref. 45). Cells were lysed by sonication, and the clarified lysate was subjected to immobilized metal affinity chromatography using Ni-NTA resin (Thermo Fisher). The nickel-affinity chromatography eluate was further purified by ion-exchange chromatography using a linear gradient from 100 mM to 1 M NaCl. The protein was further polished by size-exclusion chromatography with a buffer containing 20 mM Tris-HCl, pH 7.5, 500 mM NaCl and 5 mM DTT. All NSUN2 constructs were purified similarly, and the representative gel-filtration profiles and sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS–PAGE) gels of the purified proteins are shown (Supplementary Figs. 1 and 2). The human METTL1–WDR4 complex was expressed and purified using previously reported methods38,39.

RNA transcription and purification

DNA templates containing human tRNA genes for in vitro transcription were assembled using primer extension polymerase chain reaction (PCR) with a 5′ extended T7 promoter sequence (GCGAAATTAATACGACTCACTATA-[tRNA gene]). The PCR products were purified using a GeneJET PCR purification kit (Thermo Fisher). DNA templates were transcribed in vitro using 0.1 mg ml−1 recombinant T7 RNA polymerase produced in-house in a final buffer containing 50 mM Tris-HCl, pH 8.0, 1 mM spermidine, 0.01% TritonX-100, 5 mM DTT, 2 mM rNTPs, 18–22 mM MgCl2 and 20 ng µl−1 of DNA template. RNA transcripts were extracted using acid phenol:chloroform pH 4.5 (with IAA, 125:24:1) and precipitated in 1.2 volumes of isopropanol supplemented with sodium acetate pH 5.2 at −20 °C. Precipitated RNA was resuspended in a buffer containing 20 mM Tris-HCl, pH 7.5, 100 mM NaCl and 2 mM MgCl2, then further purified using electrophoresis with a 10% urea-PAGE gel. Target RNAs were eluted with a buffer containing 20 mM Tris-HCl, pH 7.5, 100 mM NaCl and 2 mM MgCl2 for 16 h to extract RNA by passive diffusion. Extracted RNA was precipitated again with sodium acetate and isopropanol at −20 °C, and resuspended in a final RNA dissolving buffer of 20 mM Tris-HCl, pH 7.5, 100 mM NaCl and 2 mM MgCl2. Purified RNAs were analysed by native PAGE gels for evaluation (Supplementary Fig. 2). All RNAs in this study were annealed by incubating at 75 °C for 5 min, followed by slow cooling to 25 °C over 30 min before use. A list of DNA primers used for PCR is provided in Supplementary Table 1. Wild-type tRNA sequences and secondary structures were obtained from gtRNAdb46. The mitochondrial tRNASerGCT sequence and its secondary structure are adapted from ref. 47. Secondary structure diagrams of non-tRNA substrates were adapted from RNAfold calculations at 37 °C using the Andronescu 2007 energy model with coaxially stacked dangling ends unless stated otherwise48. 
For lncRNA RP11, the fragment was designed around the reported NSUN2-dependent m5C site Hg19, Chr1:854196.
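The template layout described above (extended T7 promoter followed directly by the gene) can be sketched as a short helper. The promoter string is taken from the text; the function names are hypothetical, and the transcript function rests on the assumption that transcription starts at the first nucleotide after the promoter, which is the usual behaviour of T7 RNA polymerase but is not stated explicitly here.

```python
T7_PROMOTER = "GCGAAATTAATACGACTCACTATA"  # 5' extended T7 promoter from the text

def build_template(gene_dna: str) -> str:
    """Return the sense-strand template: promoter followed by the gene."""
    gene = gene_dna.upper()
    bad = set(gene) - set("ACGT")
    if not gene or bad:
        raise ValueError(f"expected a non-empty DNA sequence; invalid: {sorted(bad)}")
    return T7_PROMOTER + gene

def expected_transcript(gene_dna: str) -> str:
    """Assumption: the transcript begins immediately after the promoter."""
    template = build_template(gene_dna)
    return template[len(T7_PROMOTER):].replace("T", "U")
```

For example, `build_template("GGCA")` returns the promoter with `GGCA` appended; any real use would substitute the gene sequences listed in the supplementary tables.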

In vitro methylation assay

In vitro methylation reactions with NSUN2 were performed with 50 nM enzyme and 100 nM RNA in 20 mM Tris-HCl, pH 7.5, 100 mM NaCl, 2 mM MgCl2, 5% glycerol, 5 mM DTT and a 200 nM [3H]-SAM:SAM mixture at a molar ratio of 1:4 (Figs. 1, 4–6 and Extended Data Figs. 6 and 8) or 1:3 (Figs. 2 and 3). Reactions with METTL1–WDR4 were performed under the same conditions for Extended Data Fig. 6d, and with 100 nM enzyme and 150 nM tRNA in 20 mM Tris-HCl, pH 7.5, 100 mM NaCl, 2 mM MgCl2, 5% glycerol, 5 mM DTT and a 1:3 molar ratio of [3H]-SAM:SAM mixture at a total concentration of 300 nM for Fig. 5e. Reactions were incubated at 37 °C for 1 h and blotted onto Hybond-N+ hybridization membranes. Membranes were washed with a buffer containing 20 mM Tris-HCl, pH 7.5, and 0.01% Triton X-100 before being placed in scintillation vials and suspended in 4 ml of a scintillation cocktail (RPI) for counting using an AccuFLEX LSC-8000 scintillation counter (Hitachi). A list of RNA sequences used is available in Supplementary Table 2.
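The cofactor mixtures above are specified as a total concentration plus a labelled:unlabelled molar ratio. As a worked illustration, the sketch below converts each stated condition into the individual concentrations of [3H]-SAM and unlabelled SAM. The function name `split_sam` and the condition labels are hypothetical; the totals and ratios are those given in the text.

```python
def split_sam(total_nM: float, hot_parts: int, cold_parts: int) -> tuple[float, float]:
    """Return ([3H]-SAM, unlabelled SAM) concentrations in nM for a hot:cold ratio."""
    hot = total_nM * hot_parts / (hot_parts + cold_parts)
    return hot, total_nM - hot

# (total nM, labelled parts, unlabelled parts), as stated in the text
CONDITIONS = {
    "NSUN2, 1:4": (200, 1, 4),
    "NSUN2, 1:3": (200, 1, 3),
    "METTL1-WDR4, Fig. 5e": (300, 1, 3),
}

for label, args in CONDITIONS.items():
    hot, cold = split_sam(*args)
    print(f"{label}: {hot:g} nM [3H]-SAM + {cold:g} nM SAM")
```

Under these conditions the labelled cofactor is 40 nM, 50 nM and 75 nM, respectively.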

NSUN2 in vitro crosslinking assay

In vitro crosslinking assays were performed with 500 nM NSUN2C271A, 1 µM RNA and 50 µM SAM in 20 mM Tris-HCl, pH 7.5, 100 mM NaCl, 500 µM MgCl2, 5% glycerol and 5 mM DTT. Reactions were incubated at 37 °C for 1 h, then quenched with an SDS–PAGE gel loading buffer containing 300 mM Tris-HCl, pH 6.8, 12% (w/v) SDS and 40% glycerol supplemented with 10 mM DTT and heated at 95 °C for 5 min. About 350 ng of protein from each sample was loaded and separated by SDS–PAGE, then stained with Coomassie Blue.

EMSA

Dephosphorylated RNAs were 5′ radiolabelled using 0.4 U µl−1 T4 polynucleotide kinase (PNK) in 1X PNK buffer (New England BioLabs) with 133 nM [γ-32P]ATP (Revvity). Radiolabelled RNAs were purified over Bio-Gel P-6 SEC beads (BioRad) equilibrated in 20 mM Tris-HCl, pH 7.5, 100 mM NaCl and 2 mM MgCl2. RNA–protein complexes were formed by mixing the indicated concentrations of protein with 1 nM radiolabelled RNA substrate in 20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 3 mM MgCl2, 10% glycerol, 5 mM DTT and 10 µM of non-specific DNA oligonucleotides at 25 °C for 1 h, then resolved with a native 8% Tris-borate polyacrylamide gel. Gels were dried and imaged using a Typhoon FLA 9500 phosphorimager (GE Healthcare). Band quantitation was performed with ImageLab v.6.1 (BioRad) using a background subtraction disk size of 8.0 mm. KD values were determined by fitting fractional binding data for three replicates to a Hill slope model with a maximum saturation value constrained to 1.0 using Prism 10.
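For readers who want to reproduce the form of the binding model, a minimal sketch follows. It assumes the standard Hill equation with saturation fixed at 1.0, θ = [P]^h / (KD^h + [P]^h), and treats total protein as free protein, which is reasonable only because the labelled RNA is at 1 nM. The fitting routine is a simple coarse-to-fine grid search written for illustration; it is not the nonlinear regression performed in Prism, and the function names, parameter bounds and any input data are hypothetical.

```python
import math

def hill(protein_nM: float, kd_nM: float, h: float) -> float:
    """Fraction of RNA bound, with maximum saturation fixed at 1.0."""
    if protein_nM <= 0:
        return 0.0
    return protein_nM**h / (kd_nM**h + protein_nM**h)

def fit_hill(conc, frac, kd_bounds=(1.0, 10000.0), h_bounds=(0.5, 4.0),
             rounds=6, grid=41):
    """Least-squares fit of KD (searched in log10 space) and Hill slope h."""
    lo_k, hi_k = (math.log10(b) for b in kd_bounds)
    lo_h, hi_h = h_bounds
    best = None  # (sum of squared residuals, log10 KD, h)
    for _ in range(rounds):
        for i in range(grid):
            log_k = lo_k + (hi_k - lo_k) * i / (grid - 1)
            for j in range(grid):
                h = lo_h + (hi_h - lo_h) * j / (grid - 1)
                sse = sum((hill(c, 10**log_k, h) - f) ** 2
                          for c, f in zip(conc, frac))
                if best is None or sse < best[0]:
                    best = (sse, log_k, h)
        # Halve the search window and recentre it on the current best point.
        _, bk, bh = best
        wk, wh = (hi_k - lo_k) / 4, (hi_h - lo_h) / 4
        lo_k, hi_k = bk - wk, bk + wk
        lo_h, hi_h = max(bh - wh, 0.05), bh + wh
    return 10 ** best[1], best[2]
```

In practice, `conc` would hold the protein concentrations of the titration and `frac` the quantified bound fraction for each lane, pooled across replicates.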

Isothermal titration calorimetry

Binding enthalpies for SAM and NSUN2 were measured at 25 °C in 20 mM Tris-HCl, pH 7.5, 500 mM NaCl and 2 mM TCEP using a MicroCal PEAQ-ITC instrument (Malvern Panalytical). For two independent runs, 425 µM SAM was titrated into 400 µl of 42.5 µM NSUN2 using a reference cell power of 5 µcal s−1. Both titrations spanned 21 injections with an initial injection volume of 0.5 µl of titrant over 1 s, and all subsequent titrations injected 1.9 µl over 3.8 s with a 120-s spacing. Peak integrations for both replicates were calculated using NITPIC v.2.0.7 (ref. 49) and subsequent KD determination with confidence intervals was performed in SEDPHAT v.15.2b (ref. 50). Plotting of thermogram data was done in GUSSI v.1.4.2 (ref. 51).
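The injection schedule can be checked with a few lines of arithmetic. The sketch below uses only the values stated above; the constant and function names are hypothetical. It stops short of computing a molar ratio because that would require the active cell volume, and the text does not specify whether the stated 400 µl is the active volume or the loaded volume.

```python
import math

FIRST_UL, FIRST_S = 0.5, 1.0   # initial injection (from the text)
LATER_UL, LATER_S = 1.9, 3.8   # each subsequent injection (from the text)
N_INJECTIONS = 21
SYRINGE_UM = 425.0             # SAM concentration in the syringe, µM

def injection_volumes() -> list[float]:
    """Volume of each injection in µl."""
    return [FIRST_UL] + [LATER_UL] * (N_INJECTIONS - 1)

def cumulative_sam_pmol() -> list[float]:
    """Cumulative SAM delivered after each injection (µM x µl = pmol)."""
    total, out = 0.0, []
    for volume in injection_volumes():
        total += volume * SYRINGE_UM
        out.append(total)
    return out

total_ul = sum(injection_volumes())
same_rate = math.isclose(FIRST_UL / FIRST_S, LATER_UL / LATER_S)
print(f"total titrant: {total_ul:.1f} µl; constant injection rate: {same_rate}")
```

The schedule delivers about 38.5 µl of titrant in total, at the same 0.5 µl s−1 rate for the first and subsequent injections.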

Cryo-EM sample preparation and data collection

For samples containing NSUN2WT and tRNALysCTT with and without cofactor, a single complex batch was assembled by mixing NSUN2WT and tRNALysCTT in a 1:1.2 molar ratio in a buffer containing 20 mM Tris-HCl, pH 7.5, 100 mM NaCl, 2 mM MgCl2, 5% glycerol and 5 mM DTT at 37 °C for 1 h. Complexes were then separated from excess RNA by gel filtration over a Superdex 200 pg column (Cytiva) equilibrated in 20 mM Tris-HCl, pH 7.5, 100 mM NaCl, 2 mM MgCl2 and 5 mM DTT. Fractions containing the complex were concentrated and then split into two to prepare apo and pre-catalytic (with 500 µM sinefungin) samples. For catalytic intermediate samples with tRNALysCTT and tRNALysTTT, NSUN2C271A–tRNA adducts were assembled by mixing protein and RNA in a 1:1.6 molar ratio in a buffer containing 20 mM Tris-HCl, pH 7.5, 100 mM NaCl, 500 µM MgCl2, 5% glycerol, 5 mM DTT and 500 µM SAM at 37 °C for 2 h. The crosslinked sample with Intron was assembled at a protein:RNA molar ratio of 1:2 under the same conditions. Complexes were then separated from excess RNA by gel filtration.

For all samples, the final enzyme concentration was 4 mg ml−1, estimated using a Bradford assay, and 0.015% (w/v) NP40 was added for cryo-EM grid preparations. A total of 3 µl of each sample was applied to Quantifoil Cu 1.2/1.3 400-mesh grids, which were glow-discharged at 30 mA for 1 min. These samples were blotted for 4.5 s at a blot force of 13 at 4 °C in 100% relative humidity, then plunge-frozen in liquid ethane using a Vitrobot Mark IV (Thermo). For the crosslinked sample with Intron, Quantifoil Cu 1.2/1.3 300-mesh grids were used, and the grids were blotted with a blot force of 10 for 4 s. Data collection was performed on an FEI Titan Krios equipped with a Falcon 4i detector, a Bioquantum energy filter and a cold-FEG using SerialEM v.4.0.8 (ref. 52).

Cryo-EM data processing, model building and analysis

Motion correction, particle picking and three-dimensional analysis were performed using cryoSPARC v.4.6.0 (ref. 53). Conical FSC curves were produced using the Orientation Diagnostics job in cryoSPARC v.4.6.0. Initial models (Extended Data Table 1) were docked into cryo-EM maps using ChimeraX v.1.8 and were further built and refined using COOT v.0.9.8.7 and Phenix v.1.21.1 (refs. 54,55,56). Guinier plot B-factor values from cryoSPARC non-uniform refinement are provided in Extended Data Table 1. Map sharpening B-factors were scaled from B-factor values determined by Phenix.Autosharpen to avoid oversharpening and overinterpretation from noise during modelling. For modelling protein–RNA covalent linkages, monomer restraints for the conjugated cytosine (5DC) were generated in Phenix.eLBOW and link restraints between the 5DC monomer and NSUN2 C321 were generated using AceDRG distributed through the CCP4 software suite v.8.0.011 (refs. 57,58). Buried surface area per residue was calculated using PDBePISA v.2.0 (ref. 59), and conservation score per residue was calculated using the ConSurf web server60. Human NSUN protein multi-sequence alignment was performed with Clustal Omega v.1.2.4 (ref. 61). Structural biology applications used in this project were compiled and configured by SBGrid62. To model pre-existing m5C adjacent to target cytosines (Extended Data Fig. 6o) using COOT v.0.9.8.7, we used the NSUN2WT–tRNALysTTT–SAH D-arm structure for m5Cyt48 and the NSUN2WT–tRNALysCTT–SAH D-arm structure for m5Cyt49.

Statistics and reproducibility

All in vitro methylation, in vitro crosslinking and EMSAs were performed at least three independent times (technical replicates) using protein and RNA reagents whose homogeneity and purity were verified using SDS–PAGE or native PAGE, respectively (Supplementary Fig. 2). Binding affinities determined using isothermal titration calorimetry were calculated from two independent titration series and reported with a 95% confidence interval. For gel-electrophoresis-based experiments, at least three independent experiments were performed. Uncropped gel images are provided in Supplementary Fig. 1. The micrographs shown are representative images. Representative data were used for figures and statistical tests, and replication attempts over all experiments were similar and successful. Statistical analysis for quantitative experiments was performed using GraphPad Prism 10. An unpaired, two-tailed Student’s t-test was used for each comparison, and the P-value is indicated. The number of replicates is included in each figure legend, with centre bars indicating the mean and error bars representing standard deviation.
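The comparison described above (mean ± s.d. with an unpaired, two-tailed Student's t-test) can be sketched without external libraries. This is an illustrative re-implementation, not the Prism routine used in the study: it assumes equal variances (the classical Student's form), obtains the P-value by numerical integration of the t density, and uses hypothetical function names. Any replicate values passed to it are placeholders rather than data from this work.

```python
import math
from statistics import mean, variance

def t_pdf(x: float, df: int) -> float:
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-tailed P-value via Simpson integration of the density over [0, |t|]."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1 - 2 * total * h / 3)

def student_t_test(a, b):
    """Return (t, degrees of freedom, two-tailed P) using a pooled variance."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_tailed_p(t, df)
```

With n = 3 per group, as in the figure legends, the test has four degrees of freedom, so |t| must exceed roughly 2.78 for P < 0.05.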

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.