Introduction

CRISPR-based base editing offers a potent method for editing genetic material by directly modifying single bases in DNA and RNA. This approach, involving C-to-U (cytidine-to-uridine) and A-to-I (adenosine-to-inosine) transitions mediated by deaminases, avoids the strand breaks typically induced by the CRISPR/Cas system1,2. Initially developed for DNA modification, CRISPR/Cas-based targeting has become an effective tool in genetic research and holds potential for therapeutic applications, such as correcting disease-associated mutations in animal models3,4,5,6,7,8,9.

Unlike the permanent modifications made by DNA base editing10,11,12, RNA base editing offers a transient method for in vivo base alterations, providing a relatively safer avenue for therapeutic strategies13,14,15,16,17. The A-to-I transition in RNA, mediated by Adenosine Deaminases Acting on RNA (ADAR), has been effectively demonstrated in vivo18,19,20. Two evolved ADAR2-based editing systems, RESCUE-S and xCBE, have been developed to achieve both A-to-I and C-to-U RNA editing21,22. However, RESCUE-S is strongly biased against editing GC and CC motifs, limiting its versatility23. To address this limitation, Stafforst’s group introduced the SNAP-CDAR-S system, a SNAP-tag-based tool designed to edit CC motifs in specific RNA targets24. While promising, evolved ADAR2-mediated base editing systems, including SNAP-CDAR-S, exhibit simultaneous C-to-U and A-to-I off-target editing, which constrains their broader application21,22,23,24. Engineering Apolipoprotein B mRNA editing enzyme catalytic polypeptide (APOBEC) family proteins offers a potential solution. Because these proteins catalyze only C-to-U deamination, they avoid A-to-I off-target editing and reduce undesired side effects.

More recently, an RNA base editing system called CURE was developed using the dCasRx protein and a native APOBEC3A to achieve limited yet specific C-to-U transitions within UC motifs23. However, Cas13-based C-to-U editing has not yet been shown to efficiently correct disease-causing mutations in mouse models23,25,26. One potential reason is the strong preference of APOBEC for single-stranded RNA, which conflicts with the double-stranded structures formed by the target mRNA and the guide RNA used in the CRISPR/Cas system27.

To address existing limitations, the gRNA-free system REWIRE (RNA editing with individual RNA-binding enzyme) was developed by exploiting the programmable RNA targeting capabilities of PUF proteins28. PUF proteins feature 8 or 10 repeat motifs, each of which can be programmed to specifically bind any RNA base via interaction with the Watson–Crick edge29,30,31,32,33. Various groups have developed PUF-based systems incorporating different functional domains to manipulate RNA splicing34, translation35, degradation36 and methylation37. Given the versatility of the PUF domain in recognizing nearly any short 8- or 10-nucleotide RNA sequence, its combination with ADAR or APOBEC enzymes allows REWIRE to perform precise and efficient A-to-I or C-to-U editing on specific nucleosides of RNA targets in cultured cells28. Notably, the use of mammalian-derived proteins in REWIRE minimizes the potential immune responses often triggered by bacterial Cas enzymes in therapeutic base editing scenarios38.

Current C-to-U RNA base editing systems rely on the enzymatic activity of APOBEC to hydrolytically deaminate cytidine13,28. Natural APOBECs, however, often exhibit deamination activity in a sequence context-dependent manner, primarily on single-stranded DNA or RNA39. For instance, human APOBEC3A preferentially edits cytidine within UC motifs in mRNA23,28. These context preferences limit further application of RNA base editors.

APOBEC enzymes, recognized for their critical role in inducing mutations within the genomes of retroviruses during infection, constitute a significant group of antiviral factors. The NCBI database currently documents over 8231 eukaryotic genes homologous to the AID/APOBEC family, highlighting their extensive presence and diversity in eukaryotic organisms40. These AID/APOBEC proteins share a common domain configuration, with a highly conserved catalytic domain located between the variable N-terminal domain (NTD) and C-terminal domain (CTD)40,41,42, which may enable rational design of cytidine deaminases using artificial intelligence-assisted protein engineering.

In this study, we use AlphaFold2-mediated structural engineering to develop Professional APOBECs (ProAPOBECs) with greatly expanded C-to-U editing capability. When integrated within the REWIRE system, ProAPOBECs demonstrate improved specificity and efficiency in multiple sequence contexts, including GC, CC, AC, and UC.

Importantly, we achieve effective in vivo C-to-U RNA editing in the liver and brain of mice using CU-REWIRE5s with ProAPOBECs delivered via adeno-associated virus (AAV). This in vivo RNA editing mediated by CU-REWIRE successfully reduces cholesterol levels in mice by targeting Pcsk9 mRNA and alleviates autistic-like behaviors by correcting a point mutation in Mef2c mRNA in an autism spectrum disorder (ASD) mouse model8. In conclusion, this study demonstrates that AI-assisted protein engineering can be applied to refine the activity of RNA base editors, enabling the correction of an autistic-like phenotype in a mouse model and paving the way for practical applications of RNA base editing in gene therapy.

Results

Enhanced stability and efficacy of CU-REWIRE via structural optimization of PUF domain

Our initial efforts focused on improving CU-REWIRE3.0, which combined a 10-repeat PUF domain (PUF10) with the cytidine deaminase enzyme APOBEC3A. Structural analysis revealed high similarity between the fourth repeat (R4) of PUF10 in CU-REWIRE3.0 and the fifth repeat (R5) of the native Pumilio 2 protein. Importantly, R4 lacked the Leucine-Proline (LP) peptide present in R5, a potentially crucial element for structural flexibility as suggested by previous research43. This insight led us to engineer an enhanced PUF10 (ePUF10) with the LP peptide integrated into R4, resulting in the development of CU-REWIRE4.0 (Fig. 1a; Supplementary Fig. 1a and Supplementary Data 1).

Fig. 1: Structure-based optimization of APOBEC3A enhances editing efficacy and reduces off-target effects of REWIRE-mediated RNA base editing.

a Schematic design of REWIREs with PUF variants. Top: Original CU-REWIRE 3.0 with PUF10, targeting 10-nt sequences. Bottom: Upgraded CU-REWIRE 4.0 with ePUF10. b Expression levels of CU-REWIREs in cultured cells (n = 3, technical replicates). Representative Western blot images using anti-Flag antibody, related to Supplementary Fig. 1b. c Crystal structure of human APOBEC3A (PDB: 4XXO) highlighting putative dimerization interactions. Amino acids involved in dimer formation are shown as spheres. d Crystal structure of human APOBEC3A (PDB: 5KEG) in complex with single-stranded DNA. Amino acids responsible for RNA recognition are shown as spheres. Zn marks the catalytic core. e Schematic illustration of APOBEC3A with its functional domains. f Schematic showing the interaction between target EGFP mRNA and CU-REWIRE4.0 and 4.X with modified APOBEC3A variants. g Base editing efficiency of EGFP mRNA by CU-REWIRE4.X. The PUF binding site in EGFP is underlined in blue, with adjacent cytosines marked in red (top). Heatmap displays editing rates of cytosines near the on-target site C459 for each CU-REWIRE and control (ePUF10 alone), with the C459 editing rate detailed on the right. Editing rates were measured by RNA-seq in triplicate (see Methods). Values represent mean ± SEM. h Global editing rates of transcriptome-wide RNA base editing in samples treated with different CU-REWIRE4.X (related to data from g; labels and cutoffs as in Supplementary Fig. 1e). Orange rhombuses indicate the on-target site EGFP-C459, and the mean number of off-target editing events is noted beside the dot plots. n represents the total number of edited cytosines detected. Values represent mean ± SD. i Correlation between on-target editing rates (Y-axis) and transcriptome-wide C-to-U off-target editing events (X-axis) for various CU-REWIRE4s.

Comparative analysis showed that CU-REWIRE4.0 exhibited higher expression levels than its predecessor, suggesting improved stability due to the LP insertion (Fig. 1b). We then assessed the editing efficacy of CU-REWIRE4.0 by targeting the C-to-U transition on the C459 site in the Enhanced Green Fluorescent Protein (EGFP) mRNA. As judged by mRNA sequencing (mRNA-seq), the CU-REWIRE4.0 achieved a substantial increase in editing efficiency, with an 82.3% success rate compared to 69.7% for CU-REWIRE3.0 (Supplementary Fig. 1b).
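
As a rough illustration of how a per-site editing rate and its mean ± SEM across triplicates might be computed from sequencing read counts, a minimal sketch follows. It assumes the rate is the fraction of reads carrying U (sequenced as T) among reads carrying C or T at the site; the function names and the read counts are hypothetical and are not taken from the Methods.

```python
from math import sqrt
from statistics import mean, stdev


def editing_rate(c_reads: int, t_reads: int) -> float:
    """Fraction of C-to-U edited reads at one site (U is sequenced as T)."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no coverage at this site")
    return t_reads / total


def mean_sem(values):
    """Mean and standard error of the mean for replicate measurements."""
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical read counts (unedited C, edited T) for three replicates.
replicates = [(20, 80), (25, 75), (15, 85)]
rates = [editing_rate(c, t) for c, t in replicates]
avg, sem = mean_sem(rates)
print(f"editing rate = {avg:.1%} +/- {sem:.1%} (SEM, n = {len(rates)})")
```

Reads showing other bases at the site are ignored in this sketch; an actual pipeline would also need coverage and base-quality filters.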

To evaluate off-target effects, we conducted RNA-seq analysis with 50X transcriptome coverage (see Methods). Four independent sample sets (APOBEC3A, ePUF10, CU-REWIRE3.0, and CU-REWIRE4.0) were analyzed with mRNA-seq, revealing 224 potential off-target events for CU-REWIRE3.0 and 731 for CU-REWIRE4.0 (Supplementary Fig. 1c–e). Notably, none of the off-target sites were located within 20 nt downstream of ePUF10-binding sequences (Supplementary Fig. 1f), suggesting that ePUF10 recognizes its target with high precision and contributes minimally to off-target editing, which supports its potential for therapeutic applications. Instead, off-target events were largely attributed to the basal activity of APOBECs, highlighting the need for further optimization.
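
The proximity check described above, that is, whether an off-target site lies within 20 nt downstream of a sequence matching the ePUF10 recognition site, can be sketched as follows. The recognition sequence, the toy transcript, the function name, and the 0-based coordinate convention are all hypothetical and serve only to illustrate the logic.

```python
def is_puf_dependent(transcript: str, edit_pos: int, puf_site: str, window: int = 20) -> bool:
    """Return True if the edited position lies within `window` nt downstream
    of any exact occurrence of `puf_site` in `transcript` (0-based positions)."""
    start = transcript.find(puf_site)
    while start != -1:
        site_end = start + len(puf_site) - 1   # last base of the binding site
        if 1 <= edit_pos - site_end <= window:  # strictly downstream, inside window
            return True
        start = transcript.find(puf_site, start + 1)
    return False


# Hypothetical 10-nt recognition sequence and toy transcript.
site = "UGUAUAUAGC"
rna = "GGGA" + site + "ACUGGACUUACG" + "A" * 30 + "C"
near = rna.index(site) + len(site) + 1  # cytosine 2 nt after the site
far = len(rna) - 1                      # cytosine far downstream
print(is_puf_dependent(rna, near, site), is_puf_dependent(rna, far, site))
```

An off-target site flagged as `True` would point to ePUF10-guided editing, whereas `False` is consistent with editing driven by the basal activity of the deaminase.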

While showing promise in editing efficiency, the original CU-REWIRE3.0 was restricted to C-to-U editing activity within the UC context28. Similar to its predecessor, CU-REWIRE4.0 maintained the preference for editing cytidines within the UC consensus motif (Supplementary Fig. 1g). Its precision was further improved in that the C-to-U conversions predominantly occurred at the second position downstream of the ePUF10 binding site (Supplementary Fig. 1h). We also applied CU-REWIRE4.0 to specifically target the C832 site on the reporter mRNA of the mouse Pcsk9 gene. Our results revealed a marked improvement in C-to-U editing efficiency at C832 with CU-REWIRE4.0 compared to CU-REWIRE3.0 (Supplementary Fig. 1i). This underscores the enhanced specificity and efficiency of CU-REWIRE4.0.

Structure-based optimization of APOBEC3A minimizes off-target effects

APOBEC3A proteins tend to form dimers that stabilize their interaction with single-stranded DNA or RNA39. While this interaction is key in inducing mutations in viral genomes, it also contributes to bystander editing and off-target effects in non-target genomic regions during base editing. Therefore, our next objective was to reduce APOBEC3A dimerization.

By investigating the crystal structure of APOBEC3A and building upon previous research, we identified two amino acid sites (H11, C171) critical for dimerization39. Mutating these sites could reduce dimer formation and the association of APOBEC3A with nucleic acids (Fig. 1c). Additionally, two amino acids (K30, H56) were found to be involved in substrate RNA recognition by APOBEC3A39. Thus, we speculated that mutations at these sites may diminish RNA recognition and, consequently, reduce off-target effects (Fig. 1d).

To test this possibility, we introduced various point mutations into human APOBEC3A (Fig. 1e). We then fused these mutated APOBEC3A versions with the ePUF10 domain, creating multiple CU-REWIRE4 variants that target C459 of EGFP mRNA expressed in HEK293T cells (Fig. 1f). After co-transfecting these constructs with the EGFP expressing plasmid, we evaluated the C-to-U editing efficiency at the C459 site of EGFP mRNA by these CU-REWIRE4 variants (Fig. 1g).

One variant, CU-REWIRE 4.1, with a C171A substitution in the C-terminal domain (CTD) of APOBEC3A, showed high editing efficiency while significantly reducing off-target effects (Fig. 1g, h). Interestingly, CU-REWIRE 4.2, 4.4, and 4.5, each containing point mutations in the NTD, showed moderate editing efficiency but substantially fewer off-target events (Fig. 1g, h). These findings indicate that strategically designed point mutations in the NTD and CTD of APOBEC3A can significantly decrease off-target effects while maintaining sufficient editing efficiency (Fig. 1h). We further compared the on-target editing rates and off-target editing events of each CU-REWIRE variant, and found that these newly engineered variants provide favorable ratios of on- vs. off-target editing, which may offer versatile options for base editing applications (Fig. 1i).

Like the original CU-REWIREs, the modified CU-REWIRE4 variants predominantly edited cytidines within the UC motif (Supplementary Fig. 2a), reflecting the intrinsic target preference of APOBEC3A13,28. To broaden the application scope of RNA base editing, we explored whether APOBEC-derived base editors can achieve C-to-U editing in other contexts (i.e., CC, GC, and AC motifs) by systematically screening different AID/APOBEC enzymes integrated into the CU-REWIRE platform (Supplementary Fig. 2b). We found that only the APOBEC3A-ePUF10 configuration showed detectable base editing near the ePUF10 binding site (Supplementary Fig. 2c), and therefore selected APOBEC3A as the chassis for additional modifications that may expand the editing capabilities of new-generation CU-REWIREs.

AI-assisted engineering of ProAPOBECs

Our analysis of AID/APOBEC proteins revealed that while the deaminase domain is relatively conserved, the NTD and CTD are prone to evolutionary changes40,41,42. This implies that the deaminase domain is crucial for catalyzing the C-to-U transition, while the NTD and CTD may specialize in recognizing different DNA/RNA targets40. Consistent with this notion, previous studies suggested that various APOBEC proteins might exhibit C-to-U activity outside of the UC context13, leading us to hypothesize that the core deaminase domain of APOBEC proteins may potentially edit cytosine in diverse contexts when combined with distinct NTD or CTD domains.

To test this hypothesis, we utilized AlphaFold2 to analyze the structures of APOBEC3 and APOBEC1 from both human and rodent origins (Fig. 2a), similar to a previously used approach that analyzes deaminases across different species44. These two enzymes are the only cytosine deaminases that are currently effective as active domains in DNA and RNA base editors13. We analyzed the phylogenetic tree of the AID/APOBEC family using 1160 full-length eukaryotic AID/APOBEC-related genes documented in the NCBI database, and found that APOBEC3 and APOBEC1 belong to closely related clades of this tree (Fig. 2b). There is striking similarity in the core deaminase domains of APOBEC3s and APOBEC1s, suggesting that functional modules may be interchangeable among these proteins. Therefore, we combined the core deaminase domain of different APOBECs with the NTD and CTD of human APOBEC3A. These hybrid deaminases were termed Professional APOBECs (ProAPOBECs), potentially applicable for both DNA and RNA base editing (Fig. 2c). Together, thousands of functional ProAPOBEC proteins could be predicted (Fig. 2d).
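
The modular assembly described above, in which the NTD and CTD of human APOBEC3A flank a core deaminase domain taken from a donor APOBEC, can be sketched as simple string concatenation. This is only a schematic under stated assumptions: the strings below are placeholders rather than real protein sequences, and the class and function names are hypothetical. The actual domain boundaries and sequences are those given in the Supplementary Data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domains:
    ntd: str   # N-terminal domain
    core: str  # core deaminase domain
    ctd: str   # C-terminal domain


def assemble_proapobec(scaffold: Domains, donor: Domains) -> str:
    """Chimera: scaffold NTD + donor core deaminase domain + scaffold CTD."""
    return scaffold.ntd + donor.core + scaffold.ctd


# Placeholder strings, NOT real protein sequences.
human_a3a = Domains(ntd="NTDA3A-", core="COREA3A", ctd="-CTDA3A")
donors = {
    "human_APOBEC1": Domains("x", "COREhA1", "x"),
    "mouse_APOBEC3": Domains("x", "COREmA3", "x"),
}
chimeras = {name: assemble_proapobec(human_a3a, d) for name, d in donors.items()}
for name, seq in chimeras.items():
    print(name, seq)
```

Because the scaffold is fixed, the number of candidate chimeras in this scheme grows with the number of donor deaminase domains available.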

Fig. 2: AI-Assisted design of programmable APOBEC platform for engineering ProAPOBEC enzymes.

a Crystal structure of human APOBEC3A (PDB: 4XXO) and AlphaFold2-predicted structures of native APOBEC proteins (mouse APOBEC3, human APOBEC1, mouse APOBEC1, and rat APOBEC1), showing the NTD, C-to-U deaminase domain, and CTD. b Phylogenetic tree of the AID/APOBEC family, with major clades highlighted in different colors. c Schematic representation of the assembly of ProAPOBECs. d Estimated number of potential ProAPOBEC candidates based on AI-predicted AID/APOBEC deaminase domains. e AlphaFold2-predicted structures of AID/APOBEC proteins. Top: AI-predicted structures of native AID/APOBECs. Bottom: AI-predicted structures of ProAPOBECs. f The programmable CU-REWIRE5 system with ProAPOBEC variants. Top: Upgraded CU-REWIRE5, featuring integration of ProAPOBEC variants for targeted C-to-U editing. Middle: Schematic depiction of the modular assembly process of CU-REWIRE5. The ProAPOBEC domain, constructed from a selection of natural APOBEC deaminase domains (APOBECDD) derived from various sources, is engineered for specific C-to-U base editing. The ePUF10 domain includes 10 RNA recognition units, each designed to recognize one specific RNA base. Bottom: Illustration of the engagement of CU-REWIRE5 with the target RNA, showing binding and editing of C459 of EGFP mRNA. g Illustrations depicting the components of CU-REWIRE5, ProAPOBECs, and their deaminase donors. h Editing rates on C459 of EGFP mRNA for CU-REWIRE5s assembled with ProAPOBECs in (i) and Supplementary Fig. 2d. Editing rates were measured by Sanger sequencing in triplicate (see Methods); values represent mean ± SEM. i Protein expression levels of CU-REWIRE5s in HEK293T cells. The CU-REWIREs were Flag-tagged and detected with anti-Flag antibody; GAPDH was used as a loading control.

We then applied AlphaFold2 to predict structures of ProAPOBECs derived from human and rodent deaminase domains (Fig. 2e and Supplementary Fig. 2d). These ProAPOBECs showed structural conservation with human APOBEC3A, suggesting their potential functionality for base editing. We subsequently fused various ProAPOBECs-C171A with the ePUF10 domain, creating a new generation of CU-REWIREs (version 5.1 to 5.17) (Figs. 2f, g and Supplementary Data 2). Upon expression analysis, most of the CU-REWIRE5 variants (CU5s) demonstrated stable expression and varying editing activity towards the C459 site of EGFP mRNA expressed in HEK293T cells (Fig. 2h, i). This result demonstrated that we have significantly expanded the AID/APOBEC protein repertoire, providing candidate RNA base editors with a potentially broader targeting context for C-to-U editing. Next, we aimed to characterize the editing properties and specificities of these newly engineered CU5s at additional target sites.

Enhanced targeting range and specificity in RNA base editing with CU5s toolkit

We evaluated the RNA editing activity of CU5s by testing their C-to-U editing capabilities at artificially modified sites of EGFP mRNA, where U458C459 was altered to GC, CC, and AC to test the sequence preference of CU5s (Fig. 3a). Surprisingly, multiple CU5 base editors containing ProAPOBECs exhibited C-to-U transitions in GC, CC, and AC contexts (Fig. 3b). Some new enzymes (like CU5.7 and 5.13) maintained the original preference for the UC context (Fig. 3b and Supplementary Fig. 3a, b), whereas CU5.16 showed a particular preference for the AC context (Supplementary Fig. 3c–f).

Fig. 3: Enhanced targeting range and specificity in RNA base editing with CU5s.

a Schematic illustration demonstrating the targeting of the C459 site in EGFP mRNA by CU5s. The native U458C context was manually altered to A458C, G458C, and C458C to test editing preferences. b Heatmap depicting the percent editing levels of CU5s at the C459 site of the EGFP transcript in AC, UC, GC, and CC contexts. The PUF10 binding sequence and on-target site are shown in (a). c Editing efficacy of CU5s at the on-target and bystander sites of the EGFP transcript in AC, UC, GC, and CC contexts. A heatmap displays editing rates of all cytosines near the on-target site C459 for each CU5 and control. Editing rates of C459 were measured by RNA-seq in triplicate. d Global editing rates of transcriptome-wide RNA-seq by CU5s (related to data from b, with labels and cutoffs the same as in Fig. 1g). Orange rhombuses indicate the on-target EGFP sites, and the average efficiency of total off-target editing events is noted on the right. n represents the number of edited cytosines detected. Values represent mean ± SD. e Sequence motif logos derived from cytosines edited by different CU5s, based on RNA-seq data (related to d).

Importantly, the new CU5s have high sequence specificity distinct from the original UC specificity21,28. For example, CU5.1 and 5.8, containing the deaminase domain of human APOBEC1 and APOBEC3H, respectively, demonstrated significant C-to-U editing activity within GC or AC motifs (Fig. 3b and Supplementary Fig. 4a, b). Our finding underscores that the editing capabilities of Pro.hAPOBEC1 differ significantly from those of natural human APOBEC3A (only editing UC motifs) and human APOBEC1 (editing UC motifs with low efficiency), indicating the emergence of new cytidine editing activities in engineered ProAPOBECs. In addition, CU5.15, containing the deaminase domain of mouse APOBEC3, demonstrated significant C-to-U editing activity within CC or UC motifs (Fig. 3b and Supplementary Fig. 4c). The other enzymes can edit cytosine in various sequence contexts, including two enzymes (CU5.3 and 5.17, containing the deaminase domain of mouse APOBEC1 and rat APOBEC1, respectively) that can edit cytosine in any context (i.e., NC motifs), suggesting they may function as base editors with a broad target range (Fig. 3b and Supplementary Fig. 4d, e). Such expansion of specificity is particularly useful for their application as therapeutic proteins (Fig. 3b and Supplementary Figs. 3 and 4).

We further used RNA-seq to assess the bystander editing effects of these new CU5s, with particular interest in the C-to-U editors with different sequence specificities, such as the GC editors CU5.1 and 5.8 (Fig. 3c, top), the CC editor CU5.15 (Fig. 3c, middle), and the NC editors CU5.3 and 5.17 (Fig. 3c, bottom).

The RNA-seq experiments also allowed us to assess the global off-target effects of CU5s across the entire transcriptome (Fig. 3d). Our results demonstrated that CU5.1, 5.17, and 5.16 showed higher off-target editing rates compared to CU4.1, while certain CU5s had a lower number of off-target editing sites (Fig. 3d). There seems to be a correlation between off-target editing rates and on-target editing efficiency (i.e., more active CU5s have a larger number of off-target editing sites), suggesting that these two properties of CU5s have to be balanced in practical applications. We further analyzed the off-target editing sites for their consensus sequence motifs (Fig. 3e) and found that these motifs are largely consistent with the sequence preference observed using C-to-U editing on a single target site (C459) of EGFP mRNA (Fig. 3b and Supplementary Fig. 4).
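
One simple way to summarize such a context preference is to tally the base immediately 5′ of each edited cytosine (the N in an NC dinucleotide). The sketch below illustrates this idea only; the site list is hypothetical, the function name is an assumption, and the motif logos in the figure were not necessarily produced this way.

```python
from collections import Counter


def context_counts(sites):
    """Count the dinucleotide context (5' neighbour + C) of edited cytosines.

    `sites` is an iterable of (rna_sequence, zero_based_position) pairs.
    """
    counts = Counter()
    for seq, pos in sites:
        seq = seq.upper().replace("T", "U")
        if pos < 1 or pos >= len(seq) or seq[pos] != "C":
            raise ValueError(f"position {pos} is not a cytosine with a 5' neighbour")
        counts[seq[pos - 1] + "C"] += 1
    return counts


# Hypothetical edited sites, for illustration only.
edited = [("AAUCGG", 3), ("GGUCAA", 3), ("AGCAUU", 2), ("UUACGA", 3)]
counts = context_counts(edited)
total = sum(counts.values())
for motif in ("AC", "CC", "GC", "UC"):
    print(motif, f"{counts[motif] / total:.0%}")
```

Comparing these fractions between a single reporter site and the transcriptome-wide off-target sites gives a quick consistency check on an editor's apparent motif preference.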

Editing windows and in vitro application of CU5s

To elucidate the editing characteristics of CU5s on the target RNA, we conducted a comprehensive evaluation of the editing windows associated with selected CU5s. Three variants, CU5.1, CU5.15, and CU5.17, were selected for this analysis, because they can edit cytosine in the new sequence context other than the original UC motifs. We found that CU5.1 (editing at GC motif) demonstrated the highest editing efficiency at positions 2- or 4-nucleotides (nt) downstream of the ePUF10 binding site in EGFP mRNA, as shown by different target reporters containing GC dinucleotides at different positions downstream of the ePUF10 binding site (Fig. 4a).

Fig. 4: Editing windows and application of CU5s.

a Determination of editing windows for CU5.1 on C459 of EGFP mRNA. The graph shows on-target editing rates by CU5.1 at various distances from the ePUF10 binding site. b Editing efficacy of CU5.1 on C526 of APOE (APOE4 & APOE3) reporter mRNA expressed in HEK293T cells. The ePUF10-binding sites are marked in blue. c Editing efficacy of CU5.1 (top) and the Cas13-based xCBE (bottom) editors on C526 of endogenous APOE3 mRNA in HepG2 cells. The ePUF10-binding site and gRNA-xCBE target regions are underlined. Editing efficacy of CU5.1 on C143 of SOD1 reporter mRNA (d) and C76 of Rhodopsin reporter mRNA (e) expressed in HEK293T cells. The ePUF10-binding sites are marked in blue. f Summary of the editing window for CU5.1 across EGFP, APOE4/3, SOD1 and Rhodopsin reporter mRNAs. The x-axis represents the distance from the editing site to the ePUF10 binding site; the y-axis shows the C-to-U editing rate. Editing rates were acquired by Sanger sequencing. g Determination of editing windows for CU5.15 on C459 of EGFP mRNA. Editing efficacy of CU5.15 on C104 of Mef2c reporter mRNA (h) and C823 of PCSK9 reporter mRNA (i) expressed in HEK293T cells. j Summary of the cytosine editing window mediated by CU5.15 around the target sites of EGFP, PCSK9, and Mef2c reporter mRNAs. k Determination of editing windows for CU5.17 on C459 of EGFP mRNA. Editing efficacy of CU5.17 on C1541 of DDX3X reporter mRNA (l) and C143 of SOD1 reporter mRNA (m) expressed in HEK293T cells. n Summary of the cytosine editing window mediated by CU5.17 around the target sites of EGFP, SOD1 and DDX3X reporter mRNAs. In a–n values represent mean ± SEM (n = 3).

We further evaluated the efficacy of CU5.1 at disease-relevant targets using both reporter constructs and endogenous genes. Focusing on Apolipoprotein E (APOE), a major genetic determinant of Alzheimer’s disease (AD), we targeted isoform-defining sites where single-nucleotide changes confer differential disease risk45,46. APOE4 substantially increases AD susceptibility and accelerates disease onset47,48, whereas APOE1 and APOE2 exhibit protective effects compared to APOE349. The c.C526T (p.R176C) variant converts pathogenic APOE4 to benign APOE148,50. The same variant (c.C526T) converts APOE3 to neuroprotective APOE245,49.

To examine whether CU-REWIRE is able to perform RNA base editing on the c.526 site, we designed CU5.1 to specifically target c.526 of APOE3/4. We found that CU5.1 binding 2 nt upstream of the mutated site effectively converted C526 to U526 with an editing efficiency of ~90% in reporter mRNA (Fig. 4b). Moreover, CU5.1 also effectively performed the c.526C > U transition in endogenous APOE3 mRNA in HepG2 cells with high efficiency (Fig. 4c). In contrast, the optimized RNA editor xCBE exhibited no detectable editing activity at this locus (Fig. 4c). These findings highlight the capacity of CU5.1 for precise isoform conversion and its potential as a therapeutic RNA editing platform for AD.

In another example, the c.T143C mutation in the SOD1 gene, resulting in the p.V48A substitution in the SOD1 protein, is the most common mutation associated with ALS in Southeastern China51. We designed two CU5.1 enzymes with ePUF10 domains targeting sequences at positions U132–U141 and A131–A140 (i.e., 2 or 3 nucleotides upstream of C143, respectively). The CU5.1 enzyme binding 2 nucleotides upstream of the mutated site successfully converted C143 to U143 with an editing efficiency of ~85%, effectively transforming the mutant SOD1 allele into the benign form52 (Fig. 4d). In addition, the c.C68A (p.P23H) mutation in the human Rhodopsin gene, known to cause Retinitis Pigmentosa (RP)53, can be effectively edited by CU5.1 that binds 2 nt upstream of the mutated site (Fig. 4e). By editing C76 to U76, we introduced a stop codon in the mutated mRNA, which should abolish the toxic protein mutant (Fig. 4e). Combining the results across different target mRNAs, we summarized the editing window for CU5.1 as 2–5 nt downstream of the ePUF10 binding site (Fig. 4f).
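
The coordinate arithmetic behind these designs can be sketched as below. The sketch assumes, based on the SOD1 example (site U132–U141 lying 2 nt upstream of C143), that "n nt upstream" means the target position equals the last base of the binding site plus n, with 1-based transcript coordinates. The function and parameter names are hypothetical.

```python
def puf_site_for_target(target_pos: int, gap: int = 2, site_len: int = 10):
    """1-based (start, end) of a binding site whose last base lies `gap` nt
    upstream of the target cytosine, so that target_pos = end + gap."""
    end = target_pos - gap
    start = end - site_len + 1
    if start < 1:
        raise ValueError("target too close to the 5' end for this site length")
    return start, end


print(puf_site_for_target(143, gap=2))  # (132, 141)
print(puf_site_for_target(143, gap=3))  # (131, 140)
print(puf_site_for_target(832, gap=2))  # (821, 830)
```

The third call reproduces the U821–G830 site used for C832 of the mouse Pcsk9 reporter, which follows the same 2-nt spacing.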

Furthermore, CU5.15 demonstrated the highest editing efficiency 2 nt downstream of the ePUF10 binding site in EGFP mRNA (Fig. 4g). We adopted a similar approach to examine the editing window of CU5.15, using reporters containing CC at different distances from the designed binding site (Fig. 4g) or using multiple CU5.15 variants targeting the c.T104C (p.L35P) mutation in the MEF2C gene (Fig. 4h)8. In addition, we employed CU5.15 to generate a premature stop codon at the C823AG position within a human Pcsk9 reporter in HEK293T cells (Fig. 4i). We found that CU5.15 effectively edited cytidine 2 nt or 3 nt downstream of the ePUF10 binding site, resulting in an editing window of 2–4 nt in various target mRNAs (Fig. 4j).

Finally, CU5.17, with editing activity in the “NC” context, demonstrated high efficiency at positions 2 nt downstream of the ePUF10 binding site, as judged by the EGFP reporter (Fig. 4k). Importantly, CU5.17 was highly efficient in editing the c.T1541C site of the human DDX3X mRNA linked to neurodevelopmental disorders54, particularly when CU5.17 bound 2 nt or 3 nt upstream of the mutated site (Fig. 4l). We also assessed the editing efficiency of various CU-REWIREs using a SOD1 reporter in HEK293T cells (Fig. 4m). We found that the editing window for CU5.17 is within 2–5 nt after the ePUF10 binding site across different mRNAs (Fig. 4n).

Improving the CU-REWIRE5 system with AI-assisted design of ProAPOBECs utilizing cytosine deaminases derived from mammals

We also explored the possibility of using ProAPOBECs derived from deaminases of other mammalian species using a similar approach (Supplementary Fig. 5 and Supplementary Data 2). AlphaFold2-based predictions revealed that the deaminase domains from many mammalian APOBEC1s and APOBEC3s may form functional structures with the NTD and CTD of human APOBEC3A (Supplementary Fig. 5a). We found that both Z1 and Z2 domains of mammalian APOBEC3s could form functional ProAPOBECs when combined with the NTD and CTD of human APOBEC3A (Supplementary Fig. 5b, c).

Furthermore, we integrated these newly engineered ProAPOBECs into CU-REWIRE (i.e., by fusing with ePUF10 domain) and tested their editing activity on EGFP mRNA. We found significant C-to-U editing activity in GC, AC, and CC contexts by different ProAPOBECs (Supplementary Fig. 5d–i). ProAPOBECs with deaminase domains from APOBEC1 showed strong activity in GC contexts, while some with APOBEC3-Z1 domains demonstrated activity in CC and AC contexts (Supplementary Fig. 5e–g). These results highlight the versatility of engineered ProAPOBECs for C-to-U base editing, suggesting an expansion of specificity is particularly useful in potential therapeutic applications (Supplementary Fig. 5h, i).

In vivo RNA editing of PCSK9 in mice reduces cholesterol levels

To demonstrate in vivo applications, we applied CU-REWIREs to edit Pcsk9, a therapeutic target gene, by introducing a premature stop codon to repress PCSK9 protein expression55. Inhibiting PCSK9 protein reduces serum low-density lipoprotein cholesterol (LDL-C) levels, which is associated with a lower risk of cardiovascular disease56. Our aim was to achieve precise RNA editing of PCSK9 without alterations of genomic DNA, addressing limitations commonly associated with DNA editing strategies57,58,59.
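
Because the three stop codons (UAA, UAG, UGA) all begin with U, a single C-to-U edit can create a stop codon only at the first position of a CAA, CAG, or CGA codon. The following minimal sketch scans an in-frame coding sequence for such candidate sites; the function name is hypothetical and the input is a toy sequence, not the Pcsk9 transcript.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def stop_creating_edits(cds: str):
    """List in-frame codons where a single C-to-U edit yields a stop codon.

    Returns (zero_based_position_of_C, original_codon, edited_codon) tuples.
    """
    rna = cds.upper().replace("T", "U")
    hits = []
    for i in range(0, len(rna) - len(rna) % 3, 3):
        codon = rna[i:i + 3]
        for offset, base in enumerate(codon):
            if base != "C":
                continue
            edited = codon[:offset] + "U" + codon[offset + 1:]
            if edited in STOP_CODONS:
                hits.append((i + offset, codon, edited))
    return hits


# Toy coding sequence, not the Pcsk9 transcript.
print(stop_creating_edits("AUGCAGGCCCGAUUU"))
# [(3, 'CAG', 'UAG'), (9, 'CGA', 'UGA')]
```

A candidate found this way still has to sit in a sequence context that the chosen editor accepts and within its editing window relative to a usable binding site.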

First, we assessed the editing efficiency of various CU-REWIREs and the Cas13-based C-to-U editors (CURE23, xCBE22) in generating premature stop codons at the C832AG position within a mouse Pcsk9 reporter in HEK293T cells (Fig. 5a). While no significant editing was observed with CURE or xCBE, CU4.1 (CU-REWIRE4.1) and CU5.21 (CU-REWIRE5 featuring ProAPOBEC from Ailuropoda melanoleuca) effectively converted C832AG to U832AG when ePUF10 domains bound to sequences spanning U821–G830 (Fig. 5a). Notably, CU5.21 reached an editing efficiency of 96% versus 37% for CU4.1, approximately 2.6 times as high (a 1.6-fold increase) (Fig. 5a). This highlights the superior application potential of ProAPOBEC in specific contexts compared to native APOBEC proteins.
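
The relationship between the two efficiencies (96% versus 37%) and the reported 1.6-fold increase can be checked with a short calculation. The sketch assumes that "fold increase" denotes the relative increase, (new − old) / old, rather than the plain ratio; the function name is hypothetical.

```python
def fold_metrics(new: float, old: float):
    """Return (ratio, relative_increase) for two positive measurements."""
    if old <= 0:
        raise ValueError("reference value must be positive")
    ratio = new / old
    return ratio, ratio - 1.0


ratio, increase = fold_metrics(96.0, 37.0)
print(f"ratio = {ratio:.2f}x, relative increase = {increase:.2f}-fold")
```

Under this reading, 96% is about 2.6 times 37%, which corresponds to an increase of about 1.6-fold over CU4.1.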

Fig. 5: In vivo RNA editing of PCSK9 in mice reduces cholesterol levels.

a Efficient editing of C832 in PCSK9 mRNA by CU-REWIREs and Cas13-based editor. The ePUF10 binding site in PCSK9 is underlined in blue, with on-target site marked in red (top). Editing rates of C832 are measured by Sanger sequencing with triplicates, values represent mean ± SEM. b Schematic illustration of CU5.21 vectors delivered into mice through intravenous injection of AAV8 virus (top) and the key experimental steps (bottom). c In vivo knockdown frequencies induced by AAV8-CU5.21. Unpaired two-sided Student’s t-test was used; statistical values represent the mean ± SEM (n = 5). *P < 0.05, P = 0.0094. d–g The levels of serum PCSK9 protein (P = 0.0005), cholesterol (PLDL-C = 0.0026, PHDL-C = 0.0148, PTotal-C = 0.0042), glyceride, and total protein levels in the mice injected with CU5.21. Unpaired two-sided Student’s t-test was used; statistical values represent the mean ± SEM (n = 5). *P < 0.05. h Statistical analysis of weight changes in mice following administration in the CU5.21 and EGFP control. Values represent mean ± SEM (n = 5). i Scatter plots showing transcriptome-wide C-to-U RNA base editing in liver samples treated with CU5.21 and EGFP control in mice. Values represent mean ± SD (n = 3).

To assess the biosafety profile of CU-REWIRE editors versus Cas13-based systems, we analyzed subcellular localization and potential DNA off-target effects. Immunofluorescence confirmed that CU-REWIRE constructs predominantly localize to the cytoplasm (Supplementary Fig. 6a), minimizing nuclear exposure and reducing potential genomic interactions. We engineered a nuclear-targeted variant (CU-REWIRE-NLS) that maintained editing capability in nuclear transcripts (Supplementary Fig. 6b, c), demonstrating programmable spatial control. Moreover, deep sequencing of genomic DNA from in vitro editing experiments (Fig. 5a) revealed no detectable C-to-T mutations, even at sites with 96% RNA editing efficiency (Supplementary Fig. 6d).

Next, we packaged CU5.21 into AAV8 vectors (Fig. 5b), leveraging the high liver tropism of AAV8 for in vivo RNA base editing. Mice injected with AAV8-CU5.21 were analyzed for changes in Pcsk9 mRNA, protein levels, and total cholesterol four weeks post-administration. Because editing introduces a premature stop codon, the edited Pcsk9 mRNA is expected to be degraded via nonsense-mediated decay60. Quantitative PCR revealed that AAV8-CU5.21 reduced Pcsk9 mRNA levels by over 70% compared to the EGFP control group (Fig. 5c). Consistently, this led to significant reductions in both PCSK9 protein (Fig. 5d) and cholesterol levels (Fig. 5e) in serum, without affecting glyceride or total protein levels (Fig. 5f, g).

As expected, the mice displayed no abnormalities in growth following AAV8-CU5.21 administration (Fig. 5h). Furthermore, transcriptome-wide RNA-seq identified approximately 400 C-to-U editing events associated with CU5.21 (Fig. 5i). Crucially, deep sequencing of genomic DNA from in vivo editing experiments revealed no detectable C-to-T mutations (Supplementary Fig. 6e). These results establish CU-REWIRE as a specific RNA-targeting platform with no evidence of DNA alterations.

Together, these results demonstrate that CU5.21 induces efficient and precise in vivo editing of mouse Pcsk9 mRNA, providing therapeutic benefits with minimal transcriptome-wide off-target effects.

In vivo RNA editing in an ASD mouse model using CU5.15

Effective RNA base editing in the brain has been a challenging endeavor20. To explore the potential of CU5s, powered by ProAPOBECs, for in vivo RNA editing, we utilized an ASD mouse model carrying the p.L35P (c.T104C) point mutation in the Mef2c gene, which has been reported to cause severe ASD in humans8. An APOBEC-embedded cytosine base editor for DNA has successfully performed C-to-T conversion at the C104 site of the Mef2c gene in Mef2c L35P mutant mice, effectively reversing the ASD-like phenotypes in this mouse model8. However, previous methods required co-injection of two AAV vectors for in vivo DNA base editing, posing practical challenges for clinical application6,8,61. Packaging the base editing system into a single AAV vector could significantly reduce costs and enhance delivery efficiency in clinical applications.

We first compared the C-to-U base editing efficiency of the CU-REWIRE system with that of Cas13-based RNA editing tools at the C104 site of a mouse Mef2c (c.104) reporter mRNA23,28. We observed substantially higher editing efficiencies with CU-REWIRE editors than with CRISPR/Cas-based tools: CU5.15 achieved 43% editing and CU1.15 achieved 18% (Supplementary Fig. 7a, b). By contrast, the CURE-X system, using either native APOBECs or Pro.mAPOBEC3-1, failed to edit the target site in Mef2c mRNA and exhibited upstream bystander editing (Supplementary Fig. 7c–e).

We further packaged CU5.15, featuring the ePUF10 module targeting the A92-A101 region of mouse Mef2c mRNA, into blood–brain barrier (BBB)-crossing AAV-PHP.eB vectors (Fig. 6a). These vectors, along with an EGFP reporter driven by the human synapsin 1 gene promoter (hSyn), were injected into the tail vein of 4-week-old Mef2c WT and L35P+/− mice8. Robust EGFP expression was observed in the cortical and hippocampal regions 4 and 16 weeks post-injection (Fig. 6b, c), confirming efficient gene delivery to the mouse brain.

Fig. 6: In vivo RNA base editing in Mef2c L35P+/− mice corrects genetic mutations and protein expression.

a Schematic illustration of AAV-CU-REWIRE5.15 (CU5.15) vectors delivered into Mef2c L35P+/− mice through intravenous injection of AAV-PHP.eB virus in the tail vein. b Immunofluorescence images showing CU5.15 (green) and DAPI (blue) in the cortical and hippocampal brain regions, 4 weeks post-viral injection (n = 3 brain slices). Scale bar: 200 µm. c Immunofluorescence images of CU5.15 (green) and DAPI (blue) in the cortical and hippocampal regions, 16 weeks post-viral injection (n = 3 brain slices). d Editing efficiency of C104 in endogenous mouse Mef2c mRNA by CU5.15. The ePUF10 binding site is underlined in blue, C104 marked in red (top). Lower panel: Editing rates of cytosines around the on-target site. Rates were measured by RNA-seq in triplicate. Values represent mean ± SEM. e Percentage of U104 in prefrontal cortex and hippocampus tissues of Mef2c WT or L35P+/− mice edited by ePUF10 or CU5.15, measured by RNA-seq in triplicate; values represent mean ± SEM. ePUF10 vs. CU5.15 in prefrontal cortex, P = 0.1196; ePUF10 vs. CU5.15 in hippocampus, P = 0.0034, *P < 0.05. C-to-U editing rate at the Mef2c C104 site: $C_{\text{edited}} = (U_{\text{CU5.15}} - U_{\text{ePUF10}}) / [C_{\text{total}}\,(1 - U_{\text{ePUF10}})]$. f Scatter plots showing transcriptome-wide C-to-U RNA base editing in prefrontal cortex or hippocampus samples from mice treated with CU5.15 or the ePUF10 control. Values represent mean ± SD (n = 3). g, h MEF2C protein levels in the prefrontal cortex or hippocampus of Mef2c WT or L35P+/− mice treated with ePUF10 or CU5.15. An anti-MEF2C antibody was used, with GAPDH as a loading control (Left). Quantification of MEF2C protein level (Right). ePUF10 vs. CU5.15 in prefrontal cortex, P = 0.0203; ePUF10 vs. CU5.15 in hippocampus, P = 0.0423. Unpaired two-sided Student’s t-test was used; statistical values represent the mean ± SEM (n = 3). *P < 0.05.

Sanger sequencing and RNA sequencing of prefrontal cortex and hippocampus tissue samples revealed that CU5.15 induced C-to-U editing at the C104 site in injected Mef2c L35P+/− mice (Supplementary Fig. 8a, b). RNA-seq showed a significant increase in the percentage of U104 in the prefrontal cortex (61.3%) and hippocampus (68%) of Mef2c L35P+/− mice injected with CU5.15, compared to those injected with ePUF10 (Fig. 6d, e). In addition, genomic DNA sequencing from the in vivo RNA editing experiments (Fig. 6d) showed no evidence of C-to-T substitutions at the genomic C104 site of the Mef2c locus (Supplementary Fig. 8c, d).

These data confirmed the induction of C-to-U editing events by CU5.15. Additionally, no bystander editing events were observed at the C nucleotides between positions C65 and C150 of Mef2c mRNA, demonstrating the precision of CU5.15-mediated RNA editing (Fig. 6d). C-to-U editing rates across the brain regions examined ranged from 30 to 50% (Fig. 6e). Because the brain tissues collected for RNA-seq are a mixture of neurons and glial cells, the actual C-to-U editing rate at the Mef2c L35P site in neurons is likely higher. Transcriptome-wide RNA sequencing revealed approximately 800 C-to-U editing events associated with CU5.15 (Fig. 6f).

Notably, MEF2C protein levels were fully restored in the cortical and hippocampal regions of Mef2c L35P+/− mice injected with CU5.15, as compared to those injected with ePUF10 and to Mef2c WT mice (Fig. 6g). This restoration was further validated by immunostaining using an anti-MEF2C antibody (Fig. 7a, b). Remarkably, the lifespans of Mef2c L35P+/− mice injected with CU5.15 were close to those of the WT group (Fig. 7c), an effect not observed in previous DNA editing work8, suggesting that RNA base editing facilitated by CU5.15 may exert a broader impact in this mouse model.

Fig. 7: In vivo RNA base editing in Mef2c L35P+/− mice rescued abnormal social behaviors.

a Immunohistochemical staining for MEF2C (red) and DAPI (blue) in the retrosplenial cortex (RSC) and hippocampus of Mef2c WT and L35P+/− mice receiving the indicated viral injections. b Left: Quantification of MEF2C protein fluorescence density in the RSC (ePUF10 vs. CU5.15, P = 0.0040). Right: In the hippocampus (ePUF10 vs. CU5.15, P < 0.0001. *P < 0.01. n = 6). c Lifespan of Mef2c L35P+/− mice with ePUF10 or CU5.15. Kaplan–Meier curves were compared between the L35P+/− (ePUF10) and L35P+/− (CU5.15) groups using the log-rank (Mantel–Cox) test (Chi square = 25.99, P < 0.0001, two-sided). WT (ePUF10) n = 23, L35P+/− (ePUF10) n = 17, L35P+/− (CU5.15) n = 23. d Heatmaps representing locomotion of Mef2c WT or L35P+/− mice in the sociability and social novelty sessions of the three-chamber test. e Cumulative interaction duration with mouse and empty cage in different groups of Mef2c mice (stranger vs. empty, WT+ePUF10, P < 0.0001; L35P+/−+ePUF10, P < 0.0001; L35P+/− + CU5.15, P < 0.0001. n = 14). f Cumulative interaction duration with stranger and familiar mouse in different groups of Mef2c mice (stranger vs. familiar, WT+ePUF10, P = 0.0007; L35P+/−+ePUF10, P = 0.9130; L35P+/− + CU5.15, P < 0.0001. Significantly different, P < 0.01. n = 14). g Diagram of the home cage experiment. h Sniffing time in the social intruder test for Mef2c WT + ePUF10 (T1 vs. T2, P < 0.0001; T1 vs. T3, P < 0.0001; T1 vs. T4, P < 0.0001; T4 vs. T5, P < 0.0001. n = 12). i Sniffing time in the social intruder test for Mef2c L35P+/− + ePUF10 (T1 vs. T2, P = 0.0153; T1 vs. T3, P = 0.0614; T1 vs. T4, P = 0.0026; T4 vs. T5, P < 0.0001. n = 12). j Sniffing time in the social intruder test for Mef2c L35P+/− + CU5.15 (T1 vs. T2, P < 0.0001; T1 vs. T3, P < 0.0001; T1 vs. T4, P < 0.0001; T4 vs. T5, P < 0.0001. n = 12). Paired two-sided Student’s t-test was used; statistical values represent the mean ± SEM. *P < 0.01, **P < 0.001, ***P < 0.0001.

We subsequently examined the influence of CU5.15 on autistic-like behaviors in Mef2c L35P mice using a series of social interaction tests. The investigation began with the three-chamber test, comprising both sociability and social novelty tests. In the sociability test, mice were given the choice between interacting with another mouse or an empty cage (Fig. 7d). This was followed by the social novelty test, where the mice had to choose between a familiar partner and a novel partner. Mice treated with AAV-CU5.15 demonstrated a significant improvement in social interaction performance during the social novelty test8 (Fig. 7e, f).

The study progressed to analyzing social interaction behaviors using the social intruder test, consisting of six trials (T0-T5) (Fig. 7g). After three days of social isolation (T0), the mice encountered the same partner for four consecutive trials (T1-T4, each lasting 5 minutes), and then a new partner in the fifth trial (T5). The duration of sniffing time between the subject mice and their partners was recorded. Notably, the aberrant social behaviors observed in Mef2c L35P mice during the initial four trials (T1-T4) with intruder partners were entirely mitigated following treatment with CU5.15 (Fig. 7g–j). These findings suggest that in vivo RNA base editing facilitated by CU5.15 effectively restores normal MEF2C protein expression and alleviates the social impairments associated with the Mef2c L35P mutation.

In summary, our study demonstrates that RNA base editing, as facilitated by CU5.15, provides a potent tool for correcting genetic mutations in the brain and has potential therapeutic applications for ASD and other genetic disorders.

Discussion

APOBEC3A is a highly active deaminase frequently used in DNA base editors. However, its application has been hindered by some intrinsic limitations of this enzyme, including off-target effects and RNA binding issues11,12. Although point mutations in APOBEC3A have reduced off-target effects in DNA base editing, its application in RNA base editing has remained largely unexplored62,63. We hypothesized that the high off-targeting rates of APOBEC3A might be largely due to its close interaction with target mRNA. Structurally, the dimerization of APOBEC3A plays a key role in stabilizing its mRNA binding. Therefore, our initial strategy was to modulate dimerization to minimize its off-target effects and decrease the strong associations with mRNA. Indeed, APOBEC3A mutants in the NTD or CTD, exhibiting reduced mRNA interaction and dimerization, showed fewer off-target effects. This provided a range of APOBEC3A variants for precise RNA base editing in conjunction with the REWIRE system (Fig. 1c–i). However, such variants were limited to editing cytidines in UC contexts (Supplementary Fig. 2a), indicating the need for a more fundamental protein engineering of cytosine base editors with broader targeting scopes.

Utilizing AlphaFold2-assisted protein structure analysis, we discovered that the deaminase domains of APOBECs could act as interchangeable modules (Fig. 2). This suggested that the NTD and CTD may contribute to the recognition of viral genetic materials in various organisms. Consequently, we anticipated that deaminase activities could emerge from new combinations of deaminases with NTDs and CTDs. This led to the identification of a series of APOBECs with cytidine-editing activity under various contexts, significantly broadening the application of REWIRE-mediated RNA base editing.

Compared to Cas13-based C-to-U base editors, the CU5s with installed ProAPOBECs demonstrated an impressive capacity for effecting C-to-U transitions. Notably, our study revealed that CU5.7 and CU5.13 can edit cytosines in UC motifs, akin to the Cas13-based RNA editing systems that use human APOBEC3A or an evolved ADAR2 enzyme21,23, whereas CU5.15 can edit cytosines in the CC sequence context, similar to the SNAP-CDAR-S system that employs an evolved ADAR2 enzyme24. Meanwhile, CU5.3 and CU5.17 can edit cytosines across NC sequence contexts, suggesting their potential as broad-spectrum editing tools. Importantly, multiple CU5 base editors showed robust C-to-U editing within GC, AC, and CC motifs, exhibiting high sequence specificity distinct from traditional RNA base editors (Fig. 3). For example, CU5.1 has shown significant C-to-U editing activity within GC motifs, albeit with lower efficiency in AC motifs. This enzyme notably reduces bystander editing effects and enhances editing specificity in the target region. Such expansion of specificity is particularly useful for therapeutic applications (Fig. 3, Supplementary Figs. 3 and 4).

While Cas13-mediated A-to-I RNA editing has been reported in mouse models of human genetic disorders18,20, effective C-to-U RNA editing had not been achieved, owing to the intrinsic properties of APOBECs (~20% editing efficiency achieved in mouse models with no phenotypic improvement23,25,26). With the new REWIRE system, we demonstrated that ProAPOBEC can effectively perform RNA base editing in the liver and brain (Figs. 5 and 6). With an editing rate of nearly 50% in vivo, Mef2c mutant mice showed restored protein levels and correction of autistic-like behaviors. Furthermore, effective RNA base editing prolonged the lifespan of Mef2c mutant mice, a result not observed with Mef2c L35P DNA editing8. The effects of in vivo RNA base editing in our study appear to last for over 6 months (Fig. 7c).

ProAPOBECs also have potential applications in the field of RNA epigenetics. The elucidation of epigenetic modifications in DNA and RNA has become crucial for understanding their diverse biological functions64. RNA modifications add a new layer to gene regulation, giving rise to “RNA epigenetics.” Techniques such as DART-seq (deamination adjacent to RNA modification targets)65, ACE-Seq (APOBEC-coupled epigenetic sequencing)66, and AMD-Seq (APOBEC3A-mediated deamination sequencing)67, all of which rely on APOBEC-mediated deamination, have been developed for comprehensive mapping of RNA epigenetics. Additionally, RNA-binding proteins (RBPs) are pivotal in gene expression and RNA processing. To elucidate RBP dynamics at the single-cell level, the Yeo lab developed STAMP (Surveying Targets by APOBEC-Mediated Profiling), enhancing the detection of RBP-RNA interactions68.

The ProAPOBECs engineered in this study have enhanced editing efficiency and specificity over their natural APOBEC counterparts, representing a significant improvement to the current repertoire of tools available for genetic modification. Our objective is to refine the ProAPOBECs’ activity and expand their application, facilitating the development of streamlined and accurate methods for conducting both DNA and RNA edits in vivo. In summary, the integration of REWIRE technology with ProAPOBEC-mediated base editing opens new possibilities in both fundamental research and applied therapeutic strategies. This work not only enriches the existing collection of tools for genetic engineering but also paves the way for novel approaches to treat and manage genetic disorders.

Methods

PUF10R domains construction

Utilizing codon optimized PUF10R repeats from human PUM1/2, which recognize all four RNA bases as detailed in our previously published work28, we assembled ten synthesized PUF repeats via PCR to construct the PUF10R domain. We enhanced the PUF10R domain by introducing a short peptide (LP) between the fourth and fifth repeats, thereby creating the enhanced PUF10 (ePUF10) domain. This ePUF10 domain was assembled through PCR amplification and substituted the PUF10R in the original CU-REWIRE 3.0 expression vectors, resulting in the upgraded CU-REWIRE 4.0.

Upgrading CU-REWIRE4s

Deaminase domains of human APOBEC3A were amplified from CU-REWIRE3.0 plasmids, with mutant variants of APOBEC3A produced using gene-specific primers for the desired mutations. These APOBEC3A fragments were cloned into the CU-REWIRE 4.0 vector to fabricate various CU-REWIRE 4s variants.

Development of CU-REWIRE5s

DNA fragments encoding the deaminase domains from mammalian AID/APOBECs were either amplified from existing CBE plasmids9 or synthesized by GenScript Inc. The C-to-U deaminase domain of APOBEC3A was then replaced with AID/APOBECs deaminase domains, creating a range of ProAPOBEC variants. These ProAPOBECs were used to replace APOBEC3A in the original CU-REWIRE 4.1 expression vectors, as described in our prior publication, thus generating the upgraded CU-REWIRE5s.

Development of CURE-X-ProAPOBECs

DNA fragments encoding the deaminase domains from mammalian AID/APOBECs were amplified from existing CU-REWIRE5 plasmids. The C-to-U deaminase domain of APOBEC3A was then replaced with AID/APOBECs deaminase domains, creating a range of ProAPOBEC variants. These ProAPOBECs were used to replace APOBEC3A in the original CURE-X expression vectors23,28, as described in our prior publication, thus generating the upgraded CURE-X-ProAPOBECs.

Construction of C-terminal mCherry-Fused CU-REWIRE and CURE-X editors

To visualize subcellular localization, mCherry was fused to the C terminus of the CU-REWIRE4.1 and CURE-X base editors via a flexible linker. The coding sequences of each editor and mCherry were amplified by PCR and cloned into the corresponding expression vectors using standard restriction enzyme or Gibson assembly methods. All constructs were verified by Sanger sequencing to ensure correct in-frame fusion and sequence integrity.

Reporter constructs

Wild-type and mutant EGFP reporter coding sequences were amplified from pEGFP-C1 (Clontech) using specific primers and cloned into the pCDH-CMV-MCS-EF1-Puro vector (Promega) via Gibson Assembly. Reporters bearing disease-relevant T > C mutations had their coding sequences synthesized and similarly cloned.

Assembly of various PUF domains

To incorporate various PUF domains targeting distinct mRNAs into the CU-REWIRE vectors, the MultiF Seamless Assembly Mix kit (ABclonal) was employed. DNA fragments for assembly were amplified using Q5 Hot Start High-Fidelity DNA Polymerase (NEB). All constructs were verified by Sanger sequencing.

The nucleotide sequences of all ProAPOBEC and CU-REWIRE constructs used in this study are provided in Supplementary Data 1.

Mammalian cell culture and transfection

HEK293T cells (ATCC CRL-3216) were cultured in Dulbecco’s Modified Eagle’s Medium (DMEM) with high glucose (Gibco), supplemented with 10% fetal bovine serum (FBS), and maintained at 37 °C with 5% CO2. HepG2 cells (ATCC HB-8065) were cultured in Minimum Essential Medium (MEM) with 1% NEAA (Gibco), supplemented with 10% FBS, and maintained at 37 °C with 5% CO2. To evaluate the editing efficiency on exogenous reporters, cells were seeded in 24-well plates and, after approximately 12 h (reaching around 70% confluency), were transfected with various CU-REWIRE expression vectors alongside reporter plasmids. The total DNA amount used for each well was 1 μg, with the CU-REWIREs and reporter plasmids being co-transfected into HEK293T cells at a ratio of 5:1. Forty-eight hours post-transfection, cells were washed and collected for further analysis. To edit endogenous mRNAs, HepG2 cells were transfected with 1 μg REWIRE expression plasmids and harvested 48 h post-transfection for subsequent analyses. All transfections were carried out in 24-well plates using Lipofectamine 3000 (Thermo Fisher Scientific), with the plasmid amounts adjusted accordingly. A mock group, receiving only reporter plasmids without any CU-REWIRE vectors, was included to control for nonspecific editing events.
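As a worked example of the co-transfection amounts, the sketch below splits 1 μg of total DNA at a 5:1 editor-to-reporter ratio. It assumes the ratio is by mass, which the text does not state explicitly; the function name `split_dna` is hypothetical.

```python
def split_dna(total_ng: float, editor_parts: float, reporter_parts: float) -> tuple[float, float]:
    """Split a total DNA mass (ng) between editor and reporter plasmids.

    Assumes the stated ratio is a mass ratio (an assumption, not confirmed by the Methods).
    """
    parts = editor_parts + reporter_parts
    if parts <= 0:
        raise ValueError("ratio parts must sum to a positive number")
    editor_ng = total_ng * editor_parts / parts
    return editor_ng, total_ng - editor_ng


# 1 ug (1000 ng) total DNA per well at a 5:1 editor:reporter ratio
editor_ng, reporter_ng = split_dna(1000.0, 5, 1)
print(f"editor: {editor_ng:.0f} ng, reporter: {reporter_ng:.0f} ng")
```

Under this assumption, each well would receive roughly 833 ng of CU-REWIRE plasmid and 167 ng of reporter plasmid.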

Structure predictions by AlphaFold2

Protein structure predictions were conducted using AlphaFold2 (version 2.3.1) on the HPC Biowulf Cluster at the National Institutes of Health. A customized sbatch job script was utilized for initiating structure predictions, as detailed below:

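#!/bin/bash
#SBATCH --cpus-per-task=8
#SBATCH --mem=60g
# Runtime limit in D-HH:MM
#SBATCH --time=4-00:00

module load alphafold2/2.3.1

# seq_dir (directory of input FASTA files) and out_dir (output directory) are not
# defined in the script as published; abort early if they have not been set.
: "${seq_dir:?set seq_dir}" "${out_dir:?set out_dir}"

# Run one monomer prediction per FASTA file
for file in "${seq_dir}"/*.fa; do
    run_singularity \
        --model_preset=monomer \
        --fasta_paths="${file}" \
        --max_template_date=2020-05-14 \
        --output_dir="${out_dir}/"
done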

The input for the predictions consisted of fasta sequence files containing modified APOBEC sequences. The models with the highest rankings were chosen for subsequent analyses.
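One way to pick the top-ranked model programmatically is to read the `ranking_debug.json` file that AlphaFold2 writes into each output directory, whose `order` list is sorted from best to worst (by mean pLDDT for the monomer preset). The following is a minimal sketch, not the authors' script: the helper names and the example scores are invented for illustration.

```python
import json
from pathlib import Path


def load_ranking(output_dir: str) -> dict:
    """Read AlphaFold2's ranking_debug.json from a prediction output directory."""
    return json.loads((Path(output_dir) / "ranking_debug.json").read_text())


def top_ranked_model(ranking: dict) -> tuple[str, float]:
    """Return (model_name, score) of the best model.

    Monomer runs store scores under "plddts"; multimer runs use "iptm+ptm".
    """
    score_key = "plddts" if "plddts" in ranking else "iptm+ptm"
    best = ranking["order"][0]  # "order" is sorted best-first
    return best, ranking[score_key][best]


# Invented example with the same layout as a monomer ranking_debug.json
example = {
    "plddts": {"model_1_pred_0": 88.2, "model_2_pred_0": 91.5, "model_3_pred_0": 85.0},
    "order": ["model_2_pred_0", "model_1_pred_0", "model_3_pred_0"],
}
print(top_ranked_model(example))
```

The same structure corresponds to the `ranked_0.pdb` file in the output directory, so either route should identify the same model.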

Phylogenetic tree construction

The orthologs of APOBEC1, APOBEC2, APOBEC3, and AID were obtained from NCBI Orthologs database (as of Dec 2023). Multiple sequence alignment was performed using Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/). A phylogenetic tree was reconstructed in MEGA 11.

Vector construction and production of AAV

For the generation of AAV8 vectors carrying CU5.21, the backbone plasmid pAV-TBG was obtained from OBiO Tech Inc. This plasmid was linearized using NcoI and HindIII restriction enzymes. The CU5.21 sequence was cloned into the linearized vector employing Gibson Assembly, resulting in the creation of the pAV-TBG-CU5.21 plasmid. Subsequently, the AAV8 particles were produced by OBiO Tech Inc. For the generation of PHP.eB-AAV vectors carrying CU5.15, the backbone plasmid, pAV-hSyn, was sourced from our prior publication8. This plasmid was linearized using NcoI and BglII restriction enzymes, after which the CU5.15 sequence was cloned into the linearized vector, resulting in the creation of the pAV-hSyn-CU5.15 plasmid. The PHP.eB-AAV particles were then produced by PackGene Biotech Inc. utilizing HEK 293T cells through a triple-transfection method. The particles were purified following standard protocols using an iodixanol gradient.

Animal experiments

All animal experiments received approval from the Institutional Animal Care and Use Committee (IACUC) of Shanghai Jiao Tong University School of Medicine, Songjiang Hospital. The study employed wild-type C57BL/6 Mus musculus and Mef2c L35P knock-in mice, with the latter specifically designed on a C57BL/6 background as described previously8.

AAV8 particles encoding either CU5.21 or EGFP-control were administered to 8-week-old wild-type C57BL/6 mice via tail vein injection at a dose of 2 × 10^12 vector genomes per mouse. Each AAV was adjusted to 200 µl per mouse with sterile PBS before injection. Four weeks post-injection, samples from the liver and blood were collected, and the serum was separated by centrifugation. Serum levels of PCSK9 were measured using the Mouse PCSK9 ELISA Kit (MPC-900, R&D Systems). Serum levels of total cholesterol, glyceride, and total protein were measured by BIOSSCI Inc. Total RNA from mouse liver tissues was extracted using TRIzol (Thermo Fisher) as per the manufacturer’s instructions for editing efficiency assessment through RNA-Seq.
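The dilution implied by this dosing scheme can be sketched as follows. The stock titer used here (5 × 10^13 vg/ml) is a hypothetical value chosen only to make the arithmetic concrete; the actual titers of the AAV preparations are not given in this section, and the function names are likewise assumptions.

```python
def dosing_mix(dose_vg: float, stock_titer_vg_per_ml: float,
               final_volume_ul: float = 200.0) -> tuple[float, float]:
    """Return (stock_ul, pbs_ul) needed to deliver dose_vg in final_volume_ul."""
    stock_ul = dose_vg / stock_titer_vg_per_ml * 1000.0  # ml -> ul
    if stock_ul > final_volume_ul:
        raise ValueError("stock titer too low to fit the dose in the final volume")
    return stock_ul, final_volume_ul - stock_ul


def final_concentration_vg_per_ml(dose_vg: float, final_volume_ul: float) -> float:
    """Concentration of the injected preparation in vector genomes per ml."""
    return dose_vg / (final_volume_ul / 1000.0)


dose = 2e12                  # vector genomes per mouse (from the Methods)
hypothetical_titer = 5e13    # vg/ml, assumed for illustration only
stock_ul, pbs_ul = dosing_mix(dose, hypothetical_titer)
injected_conc = final_concentration_vg_per_ml(dose, 200.0)
print(f"{stock_ul:.0f} ul stock + {pbs_ul:.0f} ul PBS; injected at {injected_conc:.1e} vg/ml")
```

Independent of the stock titer, delivering 2 × 10^12 vg in 200 µl corresponds to an injected concentration of 1 × 10^13 vg/ml.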

AAV-PHP.eB particles encoding either CU5.15 or PUF10-control were administered to both wild-type and Mef2c L35P knock-in mice (aged 4 weeks) via tail vein injection at a dose of 2 × 10^12 vector genomes per mouse. Four weeks post-injection, samples from the hippocampus and prefrontal cortex were collected from Mef2c L35P knock-in mice for editing efficiency assessment through RNA-Seq and Sanger sequencing.

Social behavioral tests

Reflecting prior findings that individuals with ASD exhibit challenges in social recognition tasks, our study aimed to evaluate social recognition as a marker of ASD in animal models. Behavioral tests were conducted on male mice four weeks after injection, following 3 days of acclimatization to handling. Behavioral assessments were performed in a controlled environment, with recordings analyzed using Ethovision XT software (Noldus) or by an investigator blinded to the genotype of the mice.

The three-chamber test was conducted under a gentle illumination of 80 lux in a three-chambered apparatus made of white plastic board, measuring 60 × 40 × 30 cm. Each behavioral trial comprised three 10-minute sessions: habituation, social approach, and social novelty. During the habituation session, each mouse was placed in the central chamber and given the freedom to explore all three chambers. Subsequently, during the social approach session, the same test mouse was positioned in the central chamber, while a social partner mouse (mouse-1) was placed in a small iron cage in one of the side chambers. In the social novelty session, the test mouse was retained in the central chamber, and this time a novel unfamiliar mouse (mouse-2) was introduced into the other side chamber. Video recordings captured using a Da Hua high-definition camera facilitated the analysis of interactions, with Ethovision XT software (Noldus) employed for data analysis and ImageJ used to generate locomotion heat maps.

For in-home-cage social reciprocity evaluations, a moderate illumination of 50 lux was used. To investigate the social discernment abilities of Mef2c L35P+/− mice, each test mouse was individually housed for three consecutive days to impose social isolation and was given one hour to acclimate to the testing environment before the task. The social recognition protocol consisted of two distinct segments: a session focused on social approach and familiarity, followed by a session on social novelty and recognition. Throughout the task, the test subject’s recognition of a social partner mouse was assessed at 5-min intertrial intervals (2 min for social testing and 3 min for relaxation). During the social approach and familiarity session (trials 1 to 4), an unfamiliar mouse (designated as “mouse-1”; an 8-week-old wild-type male) was introduced to the test home cage for 2 min, allowing unrestricted exploration and interaction during the initial trial. Subsequently, three additional trials were conducted, each separated by a 5-minute interval, to foster familiarity with mouse-1. In the subsequent social novelty and recognition session (trial 5), a different unfamiliar mouse (referred to as “mouse-2”) was introduced to the test subject. Social interactions were recorded, and cumulative sniffing time was measured by researchers blinded to the mice’s genotypes. In summary: T0, social isolation for three days; T1-T4, social interaction trials with the same partner mouse; T5, social interaction with a novel partner mouse. Sniffing time was recorded within each 2-min period.

Western blot and immunohistochemistry

For western blot analysis, HEK 293T cells were lysed using 1× SDS-PAGE loading buffer (Beyotime) and heated at 95 °C for 10 min. Mouse hippocampus and prefrontal cortex tissue samples were lysed with RIPA buffer (Roche, 04693159001), subjected to rotation at 4 °C for thorough digestion, and centrifuged at 12,000 rpm for 10 minutes at 4 °C. The supernatant containing protein samples was collected, mixed with 1× SDS-PAGE loading buffer (Beyotime), and heated at 95 °C for 20 min. Proteins were then separated on a 4–20% SDS-PAGE gel (GenScript) and transferred onto PVDF membranes (Millipore). Primary antibodies used included anti-FLAG (F1084, Sigma-Aldrich) and anti-GAPDH (14C10, CST) at a 1:2000 dilution, and anti-MEF2A + MEF2C (ab64644, Abcam) at a 1:1000 dilution, following the manufacturer’s guidelines. HRP-linked goat anti-mouse IgG (CST#7076) and goat anti-rabbit IgG (CST#7074) secondary antibodies were applied at a 1:5000 dilution. Visualization was achieved using an enhanced chemiluminescence detection kit (32106, Pierce) and a ChemiDoc Touch Imaging System (Bio-Rad).

For immunofluorescent staining of brain sections, detailed preparation methods are available in our previous publication. Sections were washed with 1×PBS, blocked (5% BSA, 0.3% Triton X-100 in 1×PBS) for 2 h at room temperature, and incubated overnight at 4 °C with primary antibodies (anti-MEF2A + MEF2C [ab64644, Abcam] and anti-PV [MAB1572, Millipore] at 1:1000; anti-GFP [E022030, Earth Life Sciences Inc.] at 1:5000) and the nuclear stain DAPI [28718-90-3, Sigma-Aldrich] at 1:5000. Following secondary antibody incubation for 2 h at room temperature, images were captured using a Nikon NiE-A1 plus upright confocal fluorescence microscope.

RNA extraction, reverse transcription, and sequencing

RNA from HEK293T cells and mouse tissues was extracted using TRIzol (Thermo Fisher) as per the manufacturer’s instructions. One microgram of RNA was reverse transcribed using the PrimeScript RT reagent Kit with gDNA Eraser (Takara). The resulting cDNA was amplified for deep sequencing on an Illumina NovaSeq6000 platform and for Sanger sequencing. Genomic DNA extraction was conducted using the Mammalian Genomic DNA Extraction Kit (Beyotime).

RNA-seq and Sanger sequencing of RNA amplicons

RNA purification employed TRIzol Reagent (Thermo Fisher), with mRNA-seq libraries prepared from 1 μg of total RNA using KAPA Stranded mRNA-seq Kits (Roche, KK8421). Sequencing was performed on an Illumina NovaSeq6000 platform (2 × 150 bp paired end; 20 M reads per sample). PCR amplification used gene-specific primers flanking target sequences, with PCR products purified and submitted for Sanger sequencing. C-to-U editing rates were determined using EditR (https://moriaritylab.shinyapps.io/editr_v10/) tools.

RNA editing analysis

RNA editing was analyzed using procedures described in our previously published work28. In this study, RNA editing analysis was conducted using RNA-seq data from HEK293T cells and mouse tissues (hippocampus and prefrontal cortex). Raw RNA-seq reads (FASTQ files) from HEK293T cells were first aligned to the GRCh38 human reference genome using STAR (v2.7.10b) to generate BAM files. The same process was applied to RNA-seq data from mouse tissues, except that reads were aligned to the GRCm39 reference genome. These BAM files were further processed with the Genome Analysis Toolkit (GATK v4.4.0.0): duplicates were removed using the MarkDuplicates tool, and reads containing Ns in their CIGAR string were split using the SplitNCigarReads tool. RNA editing events were detected, and base-substitution editing sites were quantified, using the REDItools2 package. The Variant Effect Predictor (VEP, release 110) was used for annotation to determine whether editing sites were located on the positive or negative strand of the genome, with editing sites enumerated using Excel and custom code. Additionally, to characterize sequences around editing sites in HEK293T cell RNA-seq experiments, BAM files were converted to FASTQ files using bcftools (v1.17), followed by statistical analysis of sequence characteristics around the editing sites using Excel and custom code.

To mitigate the impact of background SNVs in HEK293T cells, we aligned the whole-genome sequencing data of HEK293T cells to the GRCh38 human reference genome using bwa (version 0.7.17). Following this alignment, BAM files were generated according to the GATK Best Practices Workflows. Subsequently, REDItools2 was employed to identify mutation sites. Variants that appeared with a coverage greater than 20 reads in all three replicates of each sample group were identified as RNA editing sites. We used the total off-target events in the analysis of functional effects and sequence motifs. The sequence motifs were generated with R package ggseqlogo. All plots were generated with ggplot2 package in R.
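The replicate and coverage criterion described above can be illustrated with a small filter. This is a sketch under stated assumptions rather than the study's custom code: sites are represented as (chromosome, position) pairs, the per-replicate dictionaries map each candidate site to its read coverage, and background SNVs from whole-genome sequencing are assumed to be removed by simple set subtraction. All names and the toy data are hypothetical.

```python
Site = tuple[str, int]  # (chromosome, 1-based position)


def reproducible_sites(replicates: list[dict[Site, int]],
                       genomic_snvs: set[Site],
                       min_coverage: int = 20) -> set[Site]:
    """Keep sites with coverage > min_coverage in every replicate, minus known SNVs."""
    kept: set[Site] | None = None
    for rep in replicates:
        passing = {site for site, cov in rep.items() if cov > min_coverage}
        # Intersect across replicates so a site must pass in all of them
        kept = passing if kept is None else kept & passing
    return (kept or set()) - genomic_snvs


# Toy data: candidate site -> read coverage in each of three replicates
rep1 = {("chr1", 100): 35, ("chr1", 200): 50, ("chr2", 300): 21}
rep2 = {("chr1", 100): 40, ("chr1", 200): 18, ("chr2", 300): 30}
rep3 = {("chr1", 100): 22, ("chr1", 200): 60, ("chr2", 300): 25}
snvs = {("chr2", 300)}  # site also seen as a genomic variant in WGS

editing_sites = reproducible_sites([rep1, rep2, rep3], snvs)
print(sorted(editing_sites))
```

In the toy data, chr1:200 is dropped because one replicate has only 18 reads, and chr2:300 is dropped because it coincides with a genomic SNV, leaving chr1:100 as the only retained site.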

Statistics and reproducibility

All bar plots show individual biological replicates as dots. Detailed statistical information is provided in figure legends. No statistical methods were used to predetermine sample size, which was chosen based on experimental variability and standard practice. No data were excluded. Investigators were not blinded. Statistical significance was set at P < 0.05 (“ns” indicates not significant), with exact P values reported in the figure legends.

Ethics statement

All animal experiments received approval from the Institutional Animal Care and Use Committee (IACUC) of Shanghai Jiao Tong University School of Medicine, Songjiang Hospital.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.