Abstract
Base editing (BE) can permanently correct over half of known human pathogenic genetic variants without requiring a repair template, thus serving as a promising therapeutic tool to treat a broad spectrum of genetic diseases. However, the broad activity windows of current base editors pose a major challenge to their therapeutic application. Here, we show that integrating a naturally occurring oligonucleotide binding module into the deaminase active center of TadA-8e, a highly active deoxyadenosine deaminase, enhances its editing specificity. When conjugated with a Cas9 nickase or alternative PAM Cas9 variants, the engineered TadA variant—TadA-NW1—consistently achieves robust A-to-G editing efficiencies within an editing window consisting of four nucleotides, substantially narrower than the 10-bp editing window of the TadA-8e-derived ABEs. Moreover, compared to ABE8e, ABE-NW1 shows significantly decreased Cas9-dependent and -independent off-target activity while maintaining similar on-target editing efficiency. Further, TadA-NW1 can be reprogrammed to perform desired cytidine deamination and adenine transversion within a restricted editing window. Finally, in a cystic fibrosis (CF) cell model, ABE-NW1 outperforms existing ABEs in accurately and efficiently correcting the CFTR W1282X variant, one of the most common CF-causing mutations. In all, we engineered a suite of base editors with refined activity windows, enabling more precise base editing. Importantly, this study presents a streamlined genome editor re-engineering strategy to accelerate the development of therapeutic base editing.
Introduction
Single-nucleotide substitutions account for over 58% of human disease-causing genetic variations1; correcting these mutations may prevent or even reverse the associated disease progression. Directed by a guide RNA that specifies the target site, CRISPR-associated base editing (BE) can program targeted single-nucleotide conversions without creating double-stranded DNA breaks (DSBs) or requiring a repair template, thus serving as a powerful tool to correct pathogenic point mutations. Current base editors are constructed by fusing an evolved cytosine or adenine deaminase to a catalytically impaired Cas protein (e.g., nSpCas9). To date, four classes of base editors have been developed to mediate versatile single-nucleotide conversions in vitro and in vivo: cytosine base editors (CBEs) for C·G-to-T·A transitions2,3,4, adenine base editors (ABEs) for A·T-to-G·C transitions5,6, C-to-G base editors (CGBEs) for C·G-to-G·C transversions3,7, and adenine transversion editors (ACBEs) for A·T-to-C·G transversions8.
Despite the rapidly growing array of base editors with enhanced editing efficiencies and broader targeting scopes, the accuracy of the base editing technique is compromised by broad activity windows that span multiple nucleotides within the protospacer. The size of the base editing window is often positively correlated with editor activity9. For example, ABE8e, the most efficient ABE variant to date5, exhibits a 10-bp editing window10, much wider than the five-nucleotide activity window of canonical ABEs1. Base editors cannot discriminate the target base when multiple editable nucleotides are present within or near the activity window, resulting in bystander single-nucleotide conversions. Notably, approximately 82.3% of human disease-associated mutations that can be corrected by ABEs are located within regions containing multiple adenines (Fig. 1a), suggesting that ABEs may induce undesired mutations when correcting the majority of pathogenic variants. Bystander edits may adversely impact the editing outcome by disrupting the function of the corrected gene, presenting a significant hurdle to the therapeutic application of current base editors. Developing efficient base editors with highly focused activity at the intended nucleotide will help address this challenge.
a Predicted adenine base editing outcomes for correcting known human pathogenic genetic variants. b, General principle for re-engineering TadA-8e. An oligonucleotide-binding module (purple) is introduced into the substrate-binding pocket (yellow), forming stacking interactions (red lines) as well as hydrogen bonds and electrostatic contacts (black lines) with the nucleobases (blue) on the DNA nontarget strand. Cartoons were created in BioRender. jiang, t. (2025) https://BioRender.com/ol36n1g. c Overview of the amino acid substitutions introduced for engineering TadA-NW variants. d An enlarged view of the predicted interactions between the mutated residues (red and pink) with the nontarget DNA strand (green) in the TadA-NW1 substrate binding pocket (yellow) by Pymol (v3.1.3). The original amino acids are in gray. The predicted stacking interaction and other contacts are presented as red and blue dashed lines, respectively. The position of the target base is denoted as “0”, with the two non-target bases at its 5’ end as “−1” and “−2”, respectively. e A-to-G conversion efficiencies of ABE8e and ABE-NW variants at an endogenous locus. The conversion efficiencies were measured by targeted-amplicon high-throughput sequencing (HTS). The protospacer sequence is shown with all the editable adenines highlighted in red, and the PAM sequence is underlined. The most PAM-distal base within the protospacer is counted as protospacer position 1. The heat map represents average editing rates from three independent experiments. f Comparison of bystander (A3 or A9)-to-target (A5) editing ratios for ABE8e and ABE-NW variants within the genomic site shown in (e). Data represent mean ± SEM (n = 3 biologically independent experiments). Statistical significance was calculated using two-tailed Student’s t test comparing each ABE-NW variant to ABE8e; ns = no statistical difference, *P < 0.05. Source data are provided as a Source Data file where exact p values are provided.
Early efforts to improve the specificity of base editors focused on saturation mutagenesis of the key residues in the corresponding base editors to alter their deamination activity11 or substrate binding affinity10,12. However, the generation of large-scale libraries and the subsequent screening of a large number of base editor variants are both labor-intensive and time-consuming13, underscoring the need for a streamlined protein re-engineering strategy to accelerate the development of base editors with refined editing windows. Naturally occurring single-stranded DNA (ssDNA) or RNA binding proteins commonly utilize conserved amino acid side chains on the docking interface to form specific electrostatic or van der Waals contacts, hydrogen bonds, and/or stacking interactions with nucleobases14,15,16; these highly modular interactions allow the proteins to recognize and stabilize the target ssDNA or RNA. We envision that incorporating a well-characterized oligonucleotide binding module into the deaminase domain of a base editor can increase its binding affinity and specificity for the DNA nontarget strand and consequently reduce unintended bystander editing. Here, we introduce mutations into the substrate-binding pocket of TadA-8e5 to recapitulate the structural feature of the RNA-binding domain of human Pumilio1 protein16. The resulting TadA variant, TadA-NW1, consistently exhibits robust A-to-G conversion efficiency within a narrower editing window when fused to Cas9 variants recognizing different PAMs. Moreover, TadA-NW1-derived CBE and ACBE mediate C·G-to-T·A or A·T-to-C·G conversion in restricted editing windows, respectively. Further, compared to ABE8e, ABE-NW1 demonstrates significantly reduced Cas9-dependent and -independent off-target activity. To evaluate the therapeutic potential of ABE-NW1, we use it to correct the CFTR W1282X mutation, one of the most common CFTR mutations, in lung epithelial cells. Among all the tested ABEs, ABE-NW1 achieves the lowest bystander editing rate while maintaining robust on-target correction efficiency. In all, we expect that this generalizable protein engineering framework will accelerate the evolution of therapeutic base editing.
Results
Development of a structure-guided protein re-engineering strategy to narrow the base editing window
By analyzing the available structures of DNA-bound cytosine and adenine deaminases17,18, we reasoned that the highly flexible U-shaped conformation of the DNA nontarget strand in the active-site pocket potentially increases the accessibility of the nucleotides flanking the target base to the deaminase active center, thereby leading to bystander editing. Moreover, the rapid deamination kinetics of highly active deaminase variants, such as TadA-8e5,17, further promote bystander editing17. Based on these observations, we hypothesize that enhancing the binding affinity and specificity of the deaminase for the U-shaped nontarget strand can stabilize the substrate conformation, reduce the deamination rate, and thus mitigate bystander effects (Fig. 1b).
Oligonucleotide-binding proteins typically recognize and bind target single-stranded DNA or RNA through highly modular and specific intermolecular interactions14,16,19. For example, the human Pumilio homology domain, a well-characterized RNA binding domain, utilizes three amino acid side chains at conserved positions to recognize and contact the target nucleobases: (1) one aromatic residue (Phe or Tyr) stacks between adjacent nucleotides, contributing to the binding specificity, and (2) two amino acids (Asn, Gln, or Cys) form hydrogen-bonding and/or van der Waals contacts with the base opposite to increase the binding affinity14,19. We envisioned that integrating this binding module into the deaminase active center may enhance interactions between the base editor and its substrate, stabilize the U-shaped nontarget DNA strand, and consequently reduce bystander editing. To this end, we set out to engineer TadA-8e, the most active deoxyadenosine deaminase to date5 (Fig. 1b). Based on its structure17, different combinations of amino acid substitutions were introduced into the substrate-binding pocket of TadA-8e (Fig. 1c), establishing additional stacking interactions, hydrogen bonds, and electrostatic interactions with the nucleotides flanking the target base (Fig. 1d and Supplementary Fig. 1a-e). These engineered TadA-8e variants are referred to as TadAs-Narrow Editing Window or TadAs-NW.
To test the hypothesis that stabilizing the interaction of the deaminase with the substrate nucleotides can narrow the editing window, we conjugated TadAs-NW with a SpCas9 nickase and quantified the editing efficiencies of the resulting TadA-NW-derived adenine base editor variants (ABEs-NW) at an endogenous genomic site bearing multiple adenines in HEK293T cells using targeted-amplicon high-throughput sequencing (HTS). ABEs-NW and ABE8e exhibited comparable peak editing efficiencies at A5 and A7, counting the protospacer adjacent motif (PAM) as positions 21-23 (Fig. 1e). In contrast, compared to ABE8e, all the ABE-NW variants led to a decrease in the ratio of bystander to target editing at A3 (up to 15.2-fold) and A9 (up to 20.3-fold), indicating a narrowed editing window (Fig. 1f). After defining a bystander-to-target editing ratio threshold of 20%9,20, we nominated two ABE variants, ABE-NW1 and ABE-NW2, for further characterization (Fig. 1f). In all, we identified a general protein engineering strategy that can reduce bystander editing and improve base editing precision.
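The nomination step above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names, the use of a "less than or equal" comparison at the 20% threshold, and all editing rates are assumptions made for the example.

```python
def bystander_to_target_ratios(editing, target, bystanders):
    """Divide each bystander A-to-G rate by the rate at the target adenine."""
    target_rate = editing[target]
    if target_rate <= 0:
        raise ValueError("target editing rate must be positive")
    return {pos: editing[pos] / target_rate for pos in bystanders}


def passes_threshold(editing, target, bystanders, threshold=0.20):
    """True when every bystander-to-target ratio is at or below the threshold."""
    ratios = bystander_to_target_ratios(editing, target, bystanders)
    return all(ratio <= threshold for ratio in ratios.values())


# Hypothetical mean A-to-G rates (%), not values from the paper.
broad_editor = {"A3": 40.0, "A5": 80.0, "A9": 48.0}
narrow_editor = {"A3": 4.0, "A5": 80.0, "A9": 8.0}

for name, rates in [("broad", broad_editor), ("narrow", narrow_editor)]:
    ratios = bystander_to_target_ratios(rates, "A5", ["A3", "A9"])
    print(name, ratios, passes_threshold(rates, "A5", ["A3", "A9"]))
```

With these made-up numbers, only the narrow editor keeps both bystander ratios under 20% and would be carried forward.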
Characterization of TadA-NW-derived ABE variants
Next, we comprehensively analyzed the editing efficiency and selectivity of ABE-NW1 and ABE-NW2 at nine endogenous genomic sites containing multiple adenines in HEK293T cells. Across seven of the nine sites, ABE-NW1 maintained peak editing efficiencies comparable to ABE8e, but exhibited substantially decreased A-to-G conversion activities at bystander adenines (A2-A3 and A8-A12; Fig. 2a and Supplementary Fig. 2a). Although ABE-NW1 mediated significantly lower editing efficiencies at Sites 1 and 9, the peak-to-bystander editing ratio increased by up to 19.4-fold (Site 1) and 97.1-fold (Site 9) relative to ABE8e (Supplementary Fig. 2b), suggesting markedly reduced bystander editing. ABE-NW2 exhibited an editing pattern similar to that of ABE-NW1; however, its performance was poor at certain genomic sites, suggesting a potential sequence context preference of ABE-NW2 (Fig. 2a and Supplementary Fig. 2a). In summary, the ABE-NW variants’ activity windows, defined as positions within the protospacer that showed ≥20% of the peak editing efficiency2, were refined to protospacer positions 4 to 7, substantially narrower than the 10-bp editing window of ABE8e (from positions 3 to 12 of the protospacer, Fig. 2b). Given the robust editing activity of ABE-NW1 across variable genomic sites, we selected it for further characterization. At the coding regions of three genes, ABE-NW1 consistently achieved robust A-to-G conversions within its activity window, with markedly reduced bystander editing (Supplementary Fig. 2c), confirming the high efficiency of ABE-NW1 within a narrowed editing window.
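The activity-window definition used here (positions reaching at least 20% of the peak editing efficiency) can be illustrated with a short sketch. The function name and the per-position profile below are hypothetical, and reporting the window as the span from the first to the last qualifying position is a simplifying assumption.

```python
def activity_window(mean_editing, fraction=0.20):
    """Return (first, last) protospacer positions whose mean editing rate
    is at least `fraction` of the peak rate."""
    peak = max(mean_editing.values())
    if peak <= 0:
        raise ValueError("no editing detected")
    cutoff = fraction * peak
    inside = sorted(pos for pos, rate in mean_editing.items() if rate >= cutoff)
    return inside[0], inside[-1]


# Hypothetical average A-to-G rates (%) by protospacer position.
profile = {2: 1.0, 3: 5.0, 4: 30.0, 5: 60.0, 6: 45.0, 7: 20.0, 8: 6.0, 9: 2.0}

first, last = activity_window(profile)
print(f"window: positions {first}-{last} ({last - first + 1} nt)")
```

For this invented profile the cutoff is 12%, so the window covers positions 4 to 7, a four-nucleotide span.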
a A-to-G editing efficiencies of ABE8e and two ABE-NW variants (ABE-NW1 and ABE-NW2) at six endogenous genomic loci (n = 3 (site 1 and 3–6) or 4 (site 2) biologically independent experiments). The corresponding protospacer sequences are listed with editable adenines highlighted in red. b Base editing activity windows for ABE8e and ABEs-NW. The most PAM-distal base within the protospacer is numbered as protospacer position 1. Each dot represents the average A-to-G conversion rate across all the sites containing adenine at the indicated protospacer position. Data points used for this analysis are in Fig. 2a and Supplementary Fig. 2a, c. c ABE-NW1-to-ABE8e editing ratios at target and bystander adenines located within 9 NAN motifs from 12 target sites. Data used for this analysis are from Fig. 2a and Supplementary Fig. 2a, c. d Comparison of the editing efficiencies and specificities of ABE8e, ABE-NW1, and ABE-NW1-NL at three endogenous sites (n = 3 biologically independent experiments). The protospacer sequences are as indicated in (a). e Ratios between target editing (A5) and bystander editing (A7 or A8 for site 1; A7 or A10 for site 2; A3, A7, or A10 for site 3) in (d). The numbers above the bars represent the average fold changes in the target-to-bystander editing ratio mediated by ABE-NW1 and ABE-NW1-NL relative to ABE8e. All A-to-G conversion efficiencies were measured by targeted-amplicon HTS. Data represent mean ± SEM. Statistical significance was calculated using two-tailed Student’s t test comparing ABE-NW variants to ABE8e; ns = no statistical difference, *P < 0.05, **P < 0.01, ***P < 0.001. Source data are provided as a Source Data file where exact p values are provided.
To scrutinize the editing selectivity of ABE-NW1 for target and bystander adenines, we normalized the editing efficiencies of ABE-NW1 relative to ABE8e at the corresponding adenines across twelve endogenous genomic sites, and compared the ABE-NW1-to-ABE8e editing ratio at the same sequence motifs located inside or outside of the ABE-NW1 editing window. Consistently, across all the tested motifs, ABE-NW1 selectively mediated lower editing efficiencies at the bystander adenines relative to the target adenines in the same context (averaged ABE-NW1-to-ABE8e editing ratio: 0.223 (bystander) vs 0.817 (target); Fig. 2c and Supplementary Fig. 2d). These observations suggest that ABE-NW1 can discriminate between target and bystander adenines based on their positions within the protospacer, rather than nearby sequence contexts. TadA can induce cytosine substitution in a TCN motif at the target site, independent of its adenine deamination activity3. Next, we assessed the enzyme selectivity of ABE-NW1 by measuring C-to-T and C-to-G conversion rates at three genomic sites. As compared to ABE8e, ABE-NW1 induced an average of 13.6-fold and 18.1-fold decrease in C-to-T and C-to-G conversions, respectively, yielding minimal cytosine editing ranging from 0.00233% to 0.209% (Supplementary Fig. 2e). Collectively, these results define the high selectivity of ABE-NW1 for A-to-G conversion at positions 4 to 7 of the protospacer.
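The normalization described above can be sketched as follows. This is a simplified illustration under stated assumptions: adenines are classified only by protospacer position (inside or outside positions 4 to 7), the motif matching used in the paper is omitted, and all rates are made up.

```python
from statistics import mean


def normalized_ratios(nw1, abe8e, window=(4, 7)):
    """Average ABE-NW1-to-ABE8e editing ratio for adenines inside the
    window (target) and outside it (bystander)."""
    target, bystander = [], []
    for pos, reference in abe8e.items():
        if reference <= 0 or pos not in nw1:
            continue  # skip positions that cannot be normalized
        ratio = nw1[pos] / reference
        if window[0] <= pos <= window[1]:
            target.append(ratio)
        else:
            bystander.append(ratio)
    return mean(target), mean(bystander)


# Hypothetical A-to-G rates (%) keyed by protospacer position.
abe8e_rates = {3: 40.0, 5: 80.0, 9: 50.0}
nw1_rates = {3: 8.0, 5: 72.0, 9: 10.0}

target_ratio, bystander_ratio = normalized_ratios(nw1_rates, abe8e_rates)
print(round(target_ratio, 3), round(bystander_ratio, 3))
```

A target ratio near 1 combined with a much lower bystander ratio is the pattern the text reports for ABE-NW1.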
Shortening or removing the flexible linker between Cas9 nickase and deaminase can modulate the number of bases that are accessible to the deaminase3,11,21. To explore if the editing window of ABE-NW1 can be further narrowed, we constructed the ABE-NW1-NL variant by removing the linker region between TadA-NW1 and Cas9 nickase. Compared to ABE8e and ABE-NW1, ABE-NW1-NL mediated significantly lower A-to-G conversion rates at A3 and A7-A10 while maintaining high levels of peak editing at A5 across three endogenous genomic loci in HEK293T cells (Fig. 2d). We quantified the specificities of ABE8e, ABE-NW1, and ABE-NW1-NL by dividing the editing rate at target adenine (A5) by that at other editable adenines (target-to-bystander editing ratio). ABE-NW1-NL achieved an average 175-fold improvement in the target-to-bystander editing ratio relative to ABE8e, notably higher than the 19.4-fold increase by ABE-NW1, highlighting the further narrowed editing window of ABE-NW1-NL (Fig. 2e).
Expanding the targeting and editing scopes of TadA-NW1
Canonical SpCas9-associated genome editors require an NGG PAM sequence for target recognition. To date, various SpCas9 variants with altered PAM specificities have been engineered22,23,24. For example, NG-Cas9 recognizes a relaxed NG PAM23, while VRQR-Cas9 is compatible with an NGA PAM. To expand the targeting scope of TadA-NW-derived ABEs, we first fused TadA-NW1 with an NG-Cas9 nickase. The resulting ABE variant, NG-ABE-NW, exhibited a peak editing rate comparable to that of NG-ABE8e5 across four endogenous genomic loci in HEK293T cells (Fig. 3a), whereas its editing specificity was improved by an average of 31.8-fold (reaching a target-to-bystander editing ratio of up to 202, Fig. 3b). The major editing window of NG-ABE-NW spans protospacer positions 4 to 7, much narrower than the 9-bp editing window of NG-ABE8e (Supplementary Fig. 3a). Moreover, we conjugated TadA-NW1 with a VRQR-Cas9 nickase and compared the editing efficiency and specificity of VRQR-ABE8e and VRQR-ABE-NW at a genomic site in primary human lung epithelial cells. VRQR-ABE-NW yielded a 94.4 ± 2.33% peak editing rate at A4, comparable to the 95.1 ± 4.03% editing efficiency of VRQR-ABE8e (Fig. 3c). In the VRQR-ABE-NW-treated group, the target-to-bystander editing ratios were improved by 2.55-fold, 8.64-fold, and 13.6-fold at A8, A9, and A12, respectively (Fig. 3d). These results established a set of highly efficient and accurate ABE-NW variants with expanded targeting capabilities, suitable for editing a broad spectrum of genomic sites in diverse cell types.
a A-to-G conversion efficiencies of NG-ABE8e and NG-ABE-NW at four genomic sites in HEK293T cells. b Target-to-bystander editing ratios at the four genomic loci in (a). The numbers above the bars represent the average fold changes in the target-to-bystander editing ratio mediated by NG-ABE-NW compared to NG-ABE8e. c, A-to-G conversion efficiencies of VRQR-ABE8e and VRQR-ABE-NW at an endogenous locus in the primary human lung epithelial cells. d Target-to-bystander editing ratio at the site shown in (c). The numbers on top of the bar indicate the mean fold changes in the target-to-bystander editing ratio mediated by VRQR-ABE-NW relative to VRQR-ABE8e. e C-to-T conversion efficiencies of Td-CBEmax and Td-CBE-NW at four endogenous sites in HEK293T cells. f Target-to-bystander editing ratios at the sites in e. The numbers above the bars show the average fold changes in the target-to-bystander editing ratio mediated by Td-CBE-NW relative to Td-CBEmax. g A-to-C conversion efficiencies of ACBE and ACBE-NW at an endogenous locus in HEK293T cells. h Target-to-bystander editing ratio at the site shown in (g). The number above the bars represents the average fold change in the target-to-bystander editing ratio by ACBE-NW compared to ACBE. The corresponding protospacer sequences are listed with editable nucleotides highlighted in red. All single-nucleotide conversion efficiencies were measured by targeted-amplicon HTS. Data represent mean ± SEM (n = 3 biologically independent experiments). ns = no statistical difference, *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001 (two-tailed Student’s t test). Source data are provided as a Source Data file where exact p values are provided.
TadA-8e can be evolved to program cytidine deamination2,3,9,25 and adenine transversion26. For example, Td-CBEmax, constructed by introducing E27R and N46L mutations into the TadA-8e domain of ABE8e, mediates comparable C·G-to-T·A conversion rates with BE4max3,4, while ACBE, developed by conjugating an engineered mouse alkyladenine/3-methyladenine DNA glycosylase to the C-terminus of ABE8e, enables target A·T-to-C·G transversions26. Next, we assessed whether TadA-NW1 can be reprogrammed to perform versatile base conversions within narrowed editing windows. To this end, we first incorporated TadA-NW1 mutations (N108Q, M151C, and Q154Y) into the cytidine deaminase domain of Td-CBEmax, resulting in the Td-CBE-NW variant, and compared the editing efficiency and specificity of Td-CBEmax and Td-CBE-NW across four genomic sites in HEK293T cells. Although Td-CBE-NW showed decreased peak editing levels at certain loci (Fig. 3e), its specificity was substantially enhanced, with the target-to-bystander editing ratios ranging from 3.16 to 23, compared to the average ratio of 1.33 in the Td-CBEmax-treated groups (Fig. 3f), demonstrating a narrower editing window (Supplementary Fig. 3b). Next, we introduced TadA-NW1 mutations into ACBE, and compared the efficiency and specificity of ACBE and ACBE-NW in editing an endogenous genomic site. ACBE-NW achieved a 7.73 ± 1.17% A-to-C conversion rate at A4, comparable to the 6.99 ± 0.589% editing efficiency of ACBE (Fig. 3g), whereas the editing specificity was improved by 12.2-fold (Fig. 3h), demonstrating the superior A-to-C editing accuracy of ACBE-NW.
Taken together, these data suggest that the TadA-NW1 variant can be re-engineered into versatile base editors with broader targeting scopes and/or altered enzyme selectivity for diverse applications.
Comparison of the application scopes of ABE-NW1 and ABE9
To further benchmark the editing scope and specificity of ABE-NW1, we compared the efficiencies and specificities of ABE-NW1 and ABE9, an ABE variant harboring a highly stringent editing window10, in editing three endogenous loci in HEK293T cells. ABE9 induced significantly lower bystander editing than ABE-NW1 while maintaining comparable or slightly reduced editing efficiency at A5 (Supplementary Fig. 4a), demonstrating a narrower editing window than ABE-NW1, yet comparable to that of ABE-NW1-NL (Fig. 2d). Interestingly, we observed that ABE9 exhibited minimal editing activity (1.56 ± 0.458%) at the locus where no adenine is present at protospacer position 5 (Supplementary Fig. 4b). Moreover, we examined the compatibility of ABE9 with the NG-Cas9 nickase. The expression levels of NG-ABE9 and NG-ABE-NW were similar in HEK293T cells (Supplementary Fig. 4c). However, the editing activity of NG-ABE9 was compromised at an endogenous locus lacking an adenine at protospacer position 5, while NG-ABE-NW mediated robust target editing at A4-A7 (Supplementary Fig. 4d), confirming that the application of ABE9 is restricted to editing the adenine located at protospacer position 5. These results collectively demonstrate the broad application scopes of TadA-NW1-derived base editors.
DNA off-target editing analysis of ABE-NW1
Apart from bystander editing within the protospacer, base editors may induce unintended genome-wide mutations, i.e., DNA off-target effects27,28; the non-specific edits may elicit unpredictable cellular outcomes, raising concerns for the clinical applications of base editors. To assess the off-target effects of TadA-NW-derived base editors, we first compared Cas9-dependent off-target base editing by ABE8e and ABE-NW1 at previously reported off-target sites for HBG1, EMX1, and VEGFA3 loci5; the sequences of these off-target sites share high similarity with the corresponding target loci. Despite the comparable on-target editing efficiencies of ABE8e and ABE-NW1, ABE-NW1 induced significantly lower average A-to-G conversion rates at most off-target sites (Fig. 4a), suggesting that ABE-NW1 has reduced Cas9-dependent off-target editing activity. Given that the average editing rate at the VEGFA3 off-target site 1 was similar between the ABE8e- and ABE-NW1-treated groups, we analyzed the A-to-G conversion rates at individual adenines within the off-target loci. Compared to ABE8e, ABE-NW1 induced decreased A-to-G conversions across most adenines within the off-target loci (Fig. 4b). Nevertheless, at the EMX1 and VEGFA3 off-target sites, ABE-NW1 showed similar or slightly decreased editing levels at positions 5-7 (counting the 5’ end of the off-target site as position 1) compared to ABE8e (Fig. 4b). This result suggests that Cas9-dependent off-target base editing also occurs within a window spanning several nucleotides at the off-target site, and ABE-NW1 exhibits a narrower off-target editing window than ABE8e, consistent with its on-target editing profile.
a A-to-G editing frequencies by ABE8e and ABE-NW1 at three endogenous genomic loci and their corresponding off-target sites in HEK293T cells. The A-to-G conversion rates were determined by averaging the A-to-G editing frequencies across all adenines within the indicated sites. b ABE8e- or ABE-NW1-mediated A-to-G conversion frequencies at individual adenines within the off-target sites shown in (a). The heat map represents average editing rates from three independent experiments. c The overview of the orthogonal R-loop assay. dSaCas9: catalytically inactive SaCas9. d ABE8e- or ABE-NW1-mediated A-to-G conversion rates at each R-loop site. R-loop 1 and 2 were generated by dSaCas9 and a SaCas9 sgRNA targeting R-loop 1 or 2. All A-to-G conversion frequencies were measured by targeted-amplicon HTS. Data represent mean ± SEM (n = 3 biologically independent experiments). Statistical significance was calculated using two-tailed Student’s t test comparing ABE-NW1 to ABE8e; ns = no statistical difference, *P < 0.05, **P < 0.01. Source data are provided as a Source Data file where exact p values are provided.
In addition to Cas9-dependent off-target editing, Cas9-independent off-target DNA editing arises from the intrinsic DNA affinity of the deaminase domain, independent of the guide RNA-directed DNA binding of Cas929,30. To detect the propensity of ABE-NW1 to edit single-stranded DNA regions unrelated to the target loci, we employed an orthogonal R-loop assay31 (Fig. 4c). Specifically, HEK293T cells were transfected with plasmids encoding ABE-NW1 or ABE8e, an on-target sgRNA, a catalytically inactive Staphylococcus aureus Cas9 (dSaCas9), and a SaCas9 sgRNA targeting a genomic locus unrelated to the target site. Three days after transfection, we assessed the levels of adenine deamination in the R-loop generated by dSaCas9 and the SaCas9 sgRNA (Fig. 4c). In the ABE-NW1-treated groups, an average A-to-G conversion rate of 0.0561% was observed at two orthogonal R-loops, substantially lower than the 0.855% average editing rate by ABE8e (Fig. 4d). In all, these results support that ABE-NW1 is a precise base editor with attenuated DNA off-target activity.
ABE-NW1 may acquire altered editing characteristics due to the installed binding module, potentially leading to novel off-target mutations. To unbiasedly investigate the off-target effects of ABE-NW1, we performed whole-genome sequencing (WGS) at a 30x average read depth. Specifically, HEK293T cells were transfected with GFP plasmids, plasmids expressing ABE8e and VEGFA3-targeting sgRNA, and plasmids expressing ABE-NW1 and VEGFA3-targeting sgRNA, respectively. To reduce background genetic variations, we selected edited single cells after transfection, expanded the single-cell clones, and performed WGS analysis to detect ABE-induced single-nucleotide variants (SNVs, see “Methods”, Supplementary Fig. 5a). The WGS pipeline revealed comparable SNV numbers between ABE8e and ABE-NW1 groups (Supplementary Fig. 5b). To nominate high-confidence off-target mutations, we focused on the SNVs consistently detected across all the replicates within each group (Supplementary Fig. 5c), and subsequently categorized them as shared or unique between ABE8e and ABE-NW1 groups (Supplementary Fig. 5d). Targeted editing at the VEGFA3 locus was detected across all the ABE-treated groups (Supplementary Data 1), whereas 13 genomic sites with A•T>G•C mutations were specifically identified in the ABE-NW1 group (Supplementary Fig. 5d), located within intergenic (11) and intronic (2) regions (Supplementary Data 2). Given that the WGS detection pipeline may overestimate off-target editing due to sequencing error and allelic variation in HEK293T cells32,33, we further empirically validated ABE-NW1-induced A•T>G•C at two nominated sites by Sanger sequencing. At both sites, ABE-NW1 did not result in substantial increases in the A•T>G•C mutation rates compared to the GFP-treated control and ABE8e groups (Supplementary Fig. 5e). Collectively, these results demonstrate that ABE-NW1 did not exhibit a significantly altered off-target editing profile.
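The SNV nomination logic described above (keep variants seen in every replicate of a group, then split them into shared and group-specific sets) can be sketched with set operations. This is an illustrative sketch only: the tuple representation of an SNV, the function names, and the variants themselves are hypothetical, and the upstream variant calling and any control subtraction described in the Methods are not modeled.

```python
def consistent_snvs(replicates):
    """SNVs detected in every replicate of one treatment group."""
    if not replicates:
        raise ValueError("at least one replicate is required")
    return set.intersection(*(set(rep) for rep in replicates))


def shared_and_unique(group_a, group_b):
    """Split consistently detected SNVs into shared and group-specific sets."""
    a, b = consistent_snvs(group_a), consistent_snvs(group_b)
    return {"shared": a & b, "a_only": a - b, "b_only": b - a}


# Hypothetical SNVs as (chromosome, position, ref, alt); not real calls.
on_target = ("chr6", 1000, "A", "G")
abe8e_reps = [
    {on_target, ("chr1", 50, "A", "G"), ("chr2", 70, "T", "C")},
    {on_target, ("chr1", 50, "A", "G")},
]
nw1_reps = [
    {on_target, ("chr3", 90, "A", "G")},
    {on_target, ("chr3", 90, "A", "G"), ("chr4", 10, "T", "C")},
]

result = shared_and_unique(abe8e_reps, nw1_reps)
print(result)
```

In this toy example the on-target edit is shared, while one variant is specific to each group; group-specific candidates would then need orthogonal validation, as done in the text by Sanger sequencing.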
Therapeutic application of ABE-NW1 in correcting a cystic fibrosis cell model
Next, we assessed the therapeutic application of ABE-NW1 in correcting pathogenic mutations in the disease-relevant cells. Cystic Fibrosis (CF) is a monogenic disease caused by mutations in the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) gene. As of 25 September 2024, 1,085 CFTR variants have been classified as CF-causing34, 16.4% of which can be corrected by ABE (Supplementary Fig. 6a). With 80.5% of ABE-amenable CFTR variants remaining unresponsive to FDA-approved CFTR modulators34 (Supplementary Fig. 6a), developing adenine base editing strategies to accurately correct CFTR mutations can provide targeted therapies for a broad spectrum of CF patients without therapeutic options. Given that lung disease is the major cause of morbidity and mortality in patients with CF, we focused on investigating the application of ABE in lung airway epithelial cells in this study.
CFTR W1282X is one of the most common untreatable CFTR variants. It is caused by a G > A mutation in exon 23 of the CFTR gene, creating a premature stop codon that impairs CFTR protein production and abolishes Cl- transport activity35. The target CFTR locus contains multiple adenines, where A1 and A3 represent the primary bystander adenines and A2 is the targeted adenine (Fig. 5a). A-to-G conversions at A1 and A3 would result in Q1281R and R1283G amino acid substitutions, respectively, which may impair CFTR expression and function. Initially, we employed two highly efficient base editors, ABE8e and VRQR-ABE8e5, to correct CFTR W1282X in lung airway epithelial cells. ABE8e or VRQR-ABE8e mRNA and the corresponding sgRNA were electroporated into a human bronchial epithelial cell line homozygous for the CFTR W1282X mutation (CFF−16HBEge CFTR W1282X cells)36. ABE8e induced similar editing levels at target and bystander adenines (Supplementary Fig. 6b), while VRQR-ABE8e achieved comparable target editing with ABE8e at A2 but led to nearly 100% A-to-G conversion at A1 (Supplementary Fig. 6c). To assess functional rescue, we measured CFTR activity in the treated bronchial epithelial cells. Although both ABEs partially restored CFTR-mediated Cl- transport activity (Supplementary Fig. 6d), CFTR activity in the ABE8e-treated cells was two-fold lower than that in the VRQR-ABE8e-treated group (Supplementary Fig. 6e), suggesting that bystander editing at A3 adversely affected CFTR function. Indeed, the CFTR R1283G variant is classified as likely pathogenic in the SickKids CFTR database. These findings underscore the need to optimize the base editing strategy to minimize bystander editing for CFTR W1282X correction.
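The consequences of editing each adenine can be checked with a small translation sketch. The three codons below (CAG, TGA, AGG for residues 1281 to 1283) are an assumption chosen to be consistent with the substitutions named in the text; they should be verified against the CFTR reference sequence before any real use. The helper names are hypothetical.

```python
# Only the codons needed for this example (standard genetic code).
CODON_TABLE = {"CAG": "Q", "CGG": "R", "TGA": "*", "TGG": "W", "AGG": "R", "GGG": "G"}


def a_to_g(seq, index):
    """Return seq with the adenine at 0-based `index` converted to guanine."""
    if seq[index] != "A":
        raise ValueError(f"position {index} is not an adenine")
    return seq[:index] + "G" + seq[index + 1:]


def translate(seq):
    """Translate an in-frame sequence using the partial codon table above."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


mutant = "CAGTGAAGG"          # assumed codons 1281-1283 carrying W1282X
A1, A2, A3 = 1, 5, 6          # 0-based indices of the three adenines

print(translate(mutant))                           # Q*R
print(translate(a_to_g(mutant, A2)))               # QWR (intended correction)
print(translate(a_to_g(mutant, A1)))               # R*R (Q1281R bystander)
print(translate(a_to_g(a_to_g(mutant, A2), A3)))   # QWG (correction plus R1283G)
```

Only the A2 edit on its own restores the wild-type Q-W-R stretch; editing A1 or A3 alongside it replaces a neighboring residue.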
a The CFTR W1282X genomic sequence. The premature stop codon is highlighted in yellow. Protospacer sequences targeted by different CFTR-sgRNAs are listed, with the corresponding PAM sequences in bold. Target adenine (A2) is red, and the primary bystander adenines (A1 and A3) are blue. b Comparison of editing efficiencies at target (A2) and bystander (A1 and A3) adenines by different ABE variants and the corresponding sgRNAs in CFF−16HBEge CFTR W1282X cells. The A-to-G editing rates were quantified using EditR analysis60 of the Sanger sequencing traces, and the heat map represents the average editing frequencies from two (ABE8e+sgNGG, ABE-NW1+sgNGG, NG-ABE8e+sgNG, and NG-ABE-NW1+sgNG) or three (ABE9+sgNGG, VRQR-ABE8e+sgNGA, ABE8e+sgNAG, ABE-NW1+sgNAG, and ABE9+sgNAG) biologically independent samples. c ABE8e and ABE-NW1-mediated editing at the CFTR locus in CFF−16HBEge CFTR W1282X cells measured by targeted-amplicon HTS. The protospacer sequence is shown. The target adenine is red, the bystander adenines are blue, and PAM is underlined. Bars represent mean ± SEM (n = 3 biologically independent experiments). d Frequencies of CFTR alleles with perfect correction, both correction and bystander mutations, or bystander edits only, based on the HTS data in (c). e CFTR protein expression in ABE8e- or ABE-NW1-treated CFF−16HBEge CFTR W1282X cells by western blot. GAPDH serves as an internal control. Experiments were repeated three times, and one is shown. f Quantification of CFTR expression levels detected by western blot assays. CFTR band intensities were normalized to the corresponding GAPDH signals, and calculated relative to CFTR levels in the respective wild-type groups. Data are shown as mean ± SEM (n = 3 biologically independent experiments). ns = no statistical difference, *P < 0.05, **P < 0.01 (two-tailed Student’s t test). Source data are provided as a Source Data file where exact p values are provided.
To this end, we designed four CFTR-targeting sgRNAs that place the target adenine (A2) at different positions within the protospacer (Fig. 5a) and delivered them together with the corresponding ABE variants into CFF-16HBEge CFTR W1282X cells. We then measured A-to-G conversion rates across the entire protospacer by Sanger sequencing (Supplementary Fig. 7a, b) and compared editing efficiencies at A1, A2, and A3 (Fig. 5b). Guided by an sgRNA targeting the sequence upstream of a 5’-TGG-3’ PAM (sg-NGG), both ABE8e and ABE-NW1 showed minimal target editing at A2. Directed by an sgRNA targeting the sequence upstream of a 5’-AGT-3’ PAM (sg-NG), both NG-ABE variants (NG-ABE8e and NG-ABE-NW1) introduced high levels of unfavorable bystander editing at A3. Moreover, VRQR-ABE8e preferentially edited A1 by recognizing the 5’-GGA-3’ PAM. Given that Cas9 can recognize a non-canonical 5’-NAG-3’ PAM sequence37, we identified a 5’-GAG-3’ PAM at the CFTR locus and synthesized the corresponding sgRNA (sg-NAG, Fig. 5a). Encouragingly, ABE-NW1, together with sg-NAG, selectively edited A2 over A1 and A3 (Fig. 5b). Finally, we evaluated the editing efficiency and specificity of ABE9, an ABE variant featuring a single-nucleotide window, at the CFTR locus. ABE9 preferentially edited the bystander adenine (A1) in the presence of sg-NGG, whereas minimal editing was observed with sg-NAG, demonstrating that ABE-NW1 outperforms ABE9 in correcting the CFTR W1282X mutation. Thus, ABE-NW1 and sg-NAG were selected for further investigation.
To accurately quantify base editing efficiency, we used targeted-amplicon HTS to assess ABE-NW1 and sg-NAG-mediated A-to-G conversion rates at all editable adenines within the target CFTR locus, with the ABE8e and sg-NAG-treated group serving as a comparison. In the CFF-16HBEge CFTR W1282X cells, ABE-NW1 achieved 54.2 ± 2.35% target editing at A7 (adenines are numbered here by protospacer position, counting the PAM as positions 21−23), comparable to the 62.7 ± 3.17% target editing rate by ABE8e; in contrast, the bystander editing levels at A3, A8, and A11 were significantly lower in the ABE-NW1-treated groups (A3: 20.4 ± 6.87% (by ABE-NW1) vs 61.9 ± 5.77% (by ABE8e), A8: 14.2 ± 3.29% (by ABE-NW1) vs 41.7 ± 1.92% (by ABE8e), and A11: 0.343 ± 0.289% (by ABE-NW1) vs 9.42 ± 1.70% (by ABE8e); Fig. 5c). Consistently, ABE-NW1 achieved similar editing efficiency and specificity in CF patient-derived primary bronchial epithelial cells homozygous for the CFTR W1282X variant (Supplementary Fig. 7c). We further analyzed the HTS data to characterize the edited CFTR alleles in the CFF-16HBEge CFTR W1282X cells. ABE-NW1 generated 36.6 ± 0.551% perfectly corrected alleles (Fig. 5d), an average 6.21-fold improvement over ABE8e, and these alleles constituted the most frequent editing products in the ABE-NW1 group (Supplementary Fig. 7d), whereas most corrected CFTR alleles in the ABE8e group harbored one or more bystander mutations (Supplementary Fig. 7e). These bystander mutations may lead to unfavorable amino acid changes, such as R1283G, which can impair protein processing38 and ultimately lead to rapid protein degradation39. We then measured CFTR protein expression in the lung epithelial cells treated with ABE8e or ABE-NW1. ABE-NW1 rescued full-length CFTR protein expression to an average of 46.1% of the level in wild-type cells, which was significantly higher than that detected in the ABE8e-treated cells (Fig. 5e, f).
Taken together, our findings demonstrate that ABE-NW1 is the optimal base editor for correcting the untreatable CFTR W1282X mutation, paving the way for the development of therapeutic base editing to treat CF.
Discussion
Base editing technology presents unprecedented opportunities for permanently correcting a wide range of pathogenic point mutations. However, the wide activity windows of current base editors represent a major hurdle to their clinical translation. A generalizable engineering strategy that improves the editing specificity of base editors can accelerate the development of therapeutic base editing. Previous efforts to improve base editing efficiency or specificity relied on large-scale saturation mutagenesis of the key residues involved in substrate interactions within individual base editors, which can be labor-intensive and time-consuming2,3,12,40,41,42,43. Here, we embedded an RNA-binding module14,15,16,19 into the deaminase active center of TadA-8e to support additional interactions between the deaminase and the nontarget DNA strand. All the resulting base editor variants exhibited decreased bystander editing activities while maintaining their peak editing efficiencies. Thus, this protein engineering framework can streamline the evolution of various base editors toward higher specificity. Moreover, our study suggests that the structural features of naturally occurring oligonucleotide-binding domains may be useful in the design of novel genome editors. Indeed, a recent study reported improved prime editing efficiency by fusing an RNA recognition motif into the prime editor protein44.
This study established a set of base editors that can program efficient A-to-G (by ABE-NW), C-to-T (by Td-CBE-NW), or A-to-C (by ACBE-NW) conversions within narrowed editing windows across a broad spectrum of endogenous genomic sites, providing a versatile toolkit for correcting or modeling pathogenic genetic mutations. Although ABEs-NW and Td-CBE-NW exhibited compromised peak editing efficiencies at certain genomic sites, they substantially reduced bystander editing (Figs. 2a and 3e). Further, the editing efficiency and specificity of ABE-NW1 were benchmarked against ABE9, an adenine base editor with a single-nucleotide editing window. The restricted editing activity of ABE9 at position 5 within the protospacer limits its applicability for correcting a broad range of pathogenic mutations, such as CFTR W1282X (Fig. 5b and Supplementary Fig. 4b and d). In contrast, ABE-NW1 shows a broader targeting scope, serving as a more generalizable base editor applicable to a wide array of genomic sites. Importantly, ABE-NW1 can be further engineered to achieve a single-nucleotide activity window, comparable to that of ABE9, by removing the linker between the deaminase and the Cas9 nickase, demonstrating the high flexibility of ABE-NW1. Subsequent characterization of ABE-NW1 showed substantially reduced DNA off-target effects, suggesting improved binding affinity and/or specificity between the ABE and its ssDNA substrate; however, the detailed mechanism requires further investigation by resolving the structure of ABE-NW1 bound to DNA. Further, WGS did not detect a substantial number of novel SNVs induced by ABE-NW1, suggesting that the incorporated binding module did not significantly alter off-target editing outcomes. However, WGS of edited single-cell clones may lack the sensitivity to capture low-frequency off-target mutations present in the bulk population. Thus, future validation using alternative methods, such as EndoV-seq28 and CHANGE-seq-BE45, is required to rigorously assess genome-wide off-target editing profiles of BEs-NW in therapeutic contexts.
Our protein engineering strategy is based on the resolved structures of base editors in a substrate-bound state, which may not be available for all base editors. For example, since TadA-NW variants were designed based on the structure of TadA-8e conjugated to a SpCas9 nickase, they may not be compatible with compact Cas proteins46,47, such as NmeCas92. Thus, further studies are needed to re-engineer base editors of different sizes. Although the availability of resolved structures may limit the application of our structure-guided engineering strategy, advances in artificial intelligence-based protein structure prediction and modeling tools48,49, e.g., AlphaFold350, are expected to yield high-quality structural predictions for diverse base editors, thereby expanding the application scope of this base editor design framework.
Finally, we demonstrate the therapeutic application of ABE-NW1 in correcting one of the most common CFTR mutations, CFTR W1282X. Patients carrying this mutation do not have any therapeutic options. Correction of 6–10% of cells is sufficient to restore CF chloride ion transport levels to normal in vivo51. Thus, genome editing techniques hold great promise to treat this mutation. However, due to the high AT content at the CFTR locus, some genome editing approaches, such as prime editing, cannot efficiently correct this mutation52. In contrast, our previous work demonstrates the feasibility of using ABE to correct this mutation53. Existing base editors, such as ABE8e, tend to induce a high level of bystander editing when correcting CFTR W1282X, resulting in Q1281R and/or R1283G amino acid substitutions. Notably, R1283G is a documented mutation that adversely affects CFTR maturation and ion channel activity38. Given that bystander and on-target edits often occur on the same DNA strand54, bystander editing may undermine the overall correction efficacy. Indeed, in the ABE8e-edited cells, over 50% of the corrected alleles contain bystander mutations (Fig. 5d). In contrast, using ABE-NW1 and a guide RNA targeting the sequence upstream of a GAG PAM, we achieved 36.6 ± 0.551% perfect correction without any bystander mutations in human airway epithelial cells, which, to our knowledge, represents the highest correction rate at this site38,55. This finding lays the foundation for applying base editing to correct CF mutations in vivo.
In summary, we established a genome editor design paradigm for engineering base editors tailored for therapeutic applications.
Methods
Molecular cloning
ABE8e was a gift from David Liu (Addgene plasmid #138489; http://n2t.net/addgene:138489; RRID: Addgene_138489); NG-ABE8e was a gift from David Liu (Addgene plasmid #138491; http://n2t.net/addgene:138491; RRID: Addgene_138491); VRQR-ABEmax was a gift from David Liu (Addgene plasmid #119811; http://n2t.net/addgene:119811; RRID: Addgene_119811); Td-CBEmax was a gift from Dali Li (Addgene plasmid #196600; http://n2t.net/addgene:196600; RRID: Addgene_196600); ACBE was a gift from Dali Li (Addgene plasmid #204606; http://n2t.net/addgene:204606; RRID: Addgene_204606); ABE9 was a gift from Dali Li (Addgene plasmid #194208; http://n2t.net/addgene:194208; RRID: Addgene_194208). Other ABE, CBE, ACBE, and ABE9 plasmids used for mammalian cell transfection were generated using Gibson Assembly (New England Biolabs) based on ABE8e, NG-ABE8e, VRQR-ABEmax, Td-CBEmax, ACBE, or ABE9. All SpCas9 sgRNA plasmids were constructed by ligating annealed oligos into an Esp3I-digested expression vector (a gift from Keith Joung (Addgene plasmid #43860; http://n2t.net/addgene:43860; RRID: Addgene_43860)) using T4 DNA ligase (Thermo Fisher). SaCas9 sgRNA plasmids were constructed using Gibson Assembly based on a BsaI-digested expression vector (a gift from David Liu (Addgene plasmid #132777; http://n2t.net/addgene:132777; RRID: Addgene_132777)). The oligonucleotides used to generate the sgRNA plasmids are listed in Supplementary Data 3. Protospacer and PAM sequences for all the sgRNAs and their corresponding genomic loci (hg38) are described in Supplementary Data 4. All the plasmids used for mammalian cell culture were purified from HB101 Escherichia coli (Zymo) using Qiaprep Spin Miniprep kits (Qiagen). The base editor plasmids generated in this study and their associated sequence maps are available through Addgene (https://www.addgene.org/browse/article/28259363/).
Cell culture and transfection
Human embryonic kidney (HEK293T) cells (ATCC, CRL-3216) were maintained in Dulbecco’s Modified Eagle’s Medium (Corning) supplemented with 10% fetal bovine serum (Gibco) and 1% Penicillin/Streptomycin (Gibco). CFF-16HBEge CFTR W1282X and primary human bronchial epithelial cells were obtained from the Cystic Fibrosis Foundation’s Therapeutic Lab (Lexington, MA). Parental 16HBE14o- cells expressing wild-type CFTR were from Sigma (SCC150). These cells were cultured in Minimum Essential Medium (Gibco) supplemented with 10% fetal bovine serum (Gibco) and 1% Penicillin/Streptomycin (Gibco). Flasks were pre-coated by incubating with a thin layer of coating solution (LHC-8 basal medium (Thermo Fisher), 1.34 μl/ml bovine serum albumin 7.5% (Thermo Fisher), 10 μl/ml bovine collagen solution, Type 1 (Thermo Fisher), and 10 μl/ml fibronectin from human plasma (Advanced Biomatrix)) at 37 °C and 5% CO2 for 3 h. All the cells were incubated at 37 °C in a humidified 5% CO2 atmosphere. Cells were seeded at 70% confluence in 12-well cell culture plates one day before transfection. 1.5 μg base editor and 0.5 μg sgRNA were transfected with Lipofectamine 3000 reagent (Invitrogen). For the orthogonal R-loop assay, 1 µg ABE8e or ABE-NW1, 1 µg catalytically inactive SaCas9, 500 ng sgABE, and 500 ng SaCas9 sgRNA were transfected with Lipofectamine 3000 reagent (Invitrogen).
Selection of edited single-cell clones
GFP-expressing plasmids, plasmids encoding ABE8e and sgVEGFA3, and plasmids encoding ABE-NW1 and sgVEGFA3 were separately transfected into HEK293T cells. Three days later, the transfected cells were serially diluted to 10 cells/ml in culture medium and seeded into 96-well plates (100 µl medium/well). The plates were incubated at 37 °C in a humidified 5% CO2 atmosphere. The medium was changed every three days. After 20 days, the cell colonies were dissociated with 50 µl Trypsin (Gibco) and re-seeded into matching wells of two 24-well pre-coated plates (Plate 1 and Plate 2). After cell confluence reached ~50%, genomic DNA was extracted from the cells in Plate 1 with 50 µl Quick extraction buffer (Epicenter) by incubation in a thermocycler (65 °C for 15 min and 98 °C for 5 min). One microliter of extracted genomic DNA was used to amplify the targeted VEGFA3 amplicon with Phusion Flash PCR Master Mix (Thermo Fisher); the purified PCR products were analyzed by Sanger sequencing to identify the targeted A-to-G conversion. Single-cell clones harboring biallelic targeted A-to-G conversions were expanded from the corresponding wells in Plate 2, and the genomic DNA was extracted using the PureLink Genomic DNA Mini Kit (Thermo Fisher).
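As a rough check on the limiting-dilution step above, the expected number of cells per well follows directly from the stated concentration and volume (10 cells/ml × 100 µl). The sketch below additionally assumes that cells distribute across wells according to a Poisson distribution; this is a common modeling assumption and is not stated in the protocol, and the function names are illustrative.

```python
import math


def expected_cells_per_well(cells_per_ml: float, well_volume_ul: float) -> float:
    """Mean number of cells delivered to one well."""
    return cells_per_ml * (well_volume_ul / 1000.0)


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k cells in a well under the assumed Poisson model."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


# Values taken from the protocol: 10 cells/ml, 100 µl per well.
mean_cells = expected_cells_per_well(cells_per_ml=10, well_volume_ul=100)
p_empty = poisson_pmf(0, mean_cells)
p_single = poisson_pmf(1, mean_cells)
p_multiple = 1.0 - p_empty - p_single

print(f"expected cells per well: {mean_cells:.2f}")
print(f"P(empty) = {p_empty:.3f}, P(single) = {p_single:.3f}, P(multiple) = {p_multiple:.3f}")
```

Under this model a well is not guaranteed to hold exactly one cell, which is one reason genotyping each colony before expansion is useful.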
In vitro RNA synthesis
To construct in vitro transcription plasmids, the coding sequences for ABE8e, ABE-NW, and ABE9 were inserted between a T7 promoter and a 110-nt poly(A) tract in a cloning vector as described before56. Plasmids were completely linearized using BsmBI (New England Biolabs). In vitro transcription was performed at 37 °C using a HiScribe T7 High Yield RNA Synthesis kit (New England Biolabs) with the addition of CleanCap Reagent AG (Trilink Biotechnologies) for the Cap1 structure and with 100% replacement of UTP by N1-Methylpseudo-UTP (Trilink Biotechnologies). The reaction was terminated after 4 h by a 15-min incubation with DNase I (New England Biolabs). The RNA was then purified using a Monarch RNA Cleanup kit (New England Biolabs). Chemically modified single guide RNAs were synthesized by Synthego.
Electroporation of lung epithelial cells
The Neon Transfection System (Invitrogen) was used for electroporation. 2 µg ABE mRNA and 0.2 pmol of guide RNA were electroporated into 2 × 10⁵ CFF-16HBEge CFTR W1282X cells or primary human bronchial epithelial cells using 1300 V, 10 ms, and three pulses, and all the cells were then seeded in a 12-well plate. Sixteen hours post electroporation, the medium was replaced with fresh culture medium.
Western blot
Transfected cells were lysed with RIPA buffer (Boston BioProducts) supplemented with protease inhibitor (Roche) and phosphatase inhibitor (Thermo Fisher). Protein concentration was measured using a BCA assay kit (Thermo Fisher). Equal amounts of protein were loaded onto NuPAGE™ 4–12% Bis-Tris Protein Gels (Invitrogen) and run at 125 V for 90 min. After transfer to a PVDF membrane, the blots were incubated with the indicated antibodies, anti-GAPDH (Sigma, MAB347) or anti-CFTR (UNC-596), followed by incubation with HRP-conjugated secondary antibodies (Invitrogen 31430). The images were captured using a ChemiDoc imaging system (BioRad). To detect the expression of base editors, the proteins were transferred to a nitrocellulose membrane, and the blots were incubated with the indicated antibodies, anti-Cas9 (Epigentek, A-9000-050) or anti-GAPDH (Sigma, MAB347), followed by incubation with IRDye 680RD-conjugated secondary antibodies (LICORbio 926-68070). The images were captured using the Odyssey M system (LI-COR Biosciences).
High-throughput DNA sequencing of genomic DNA samples
Targeted-amplicon HTS: Genomic sites of interest were amplified from genomic DNA using specific primers containing Illumina forward and reverse adaptors (listed in Supplementary Data 5). 20 μL PCR 1 reactions were performed with 0.5 μM of each forward and reverse primer, 1 μL of genomic DNA extract, and 10 μL of Phusion Flash PCR Master Mix (Thermo Fisher). PCR 1 reactions were carried out as follows: 98 °C for 10 s, then 20 cycles of [98 °C for 1 s, 55 °C for 5 s, and 72 °C for 10 s], followed by a final 72 °C extension for 3 min. After the first round of PCR, unique Illumina barcoding reverse primers were added to each sample in a secondary PCR reaction (PCR 2). Specifically, each 20 μL PCR 2 reaction contained 0.5 μM of a unique reverse Illumina barcoding primer, 0.5 μM of the common forward Illumina barcoding primer, 1 μL of unpurified PCR 1 reaction mixture, and 10 μL of Phusion Flash PCR Master Mix. The barcoding PCR 2 reactions were carried out as follows: 98 °C for 10 s, then 20 cycles of [98 °C for 1 s, 60 °C for 5 s, and 72 °C for 10 s], followed by a final 72 °C extension for 3 min. PCR 2 products were purified on a 1% agarose gel using a QIAquick Gel Extraction Kit (Qiagen), eluting with 15 μL of Elution Buffer. DNA concentration was measured using a Bioanalyzer, and the libraries were sequenced on an Illumina MiSeq instrument (150 bp, single-end) according to the manufacturer’s protocols.
Whole-genome sequencing: WGS was performed on the genomic DNA extracted from two control single-cell clones, three ABE8e:sgVEGFA3-treated clones, and five ABE-NW1:sgVEGFA3-treated clones. Genomic DNA was quantified using the Qubit dsDNA BR Assay kit (Thermo Fisher Scientific Q32850), and DNA integrity was assessed using Agilent Genomic DNA Screentape Analysis (Agilent 5067-5365). DNA inputs for all samples were normalized to 200 ng total DNA in a total volume of 30 μl. Normalized DNA was used as input into the Illumina DNA Prep and Tagmentation kit (Illumina 20060059) for library preparation. All samples were prepared manually according to the manufacturer’s specifications. Final libraries were assessed for concentration and sizing by Qubit dsDNA BR Assay Kit (Thermo Fisher Scientific Q32850) and Agilent D1000 Screentape Assay Kit (Agilent 5067-5582), respectively. Final libraries were diluted to 2 nM and pooled for sequencing. Sequencing was performed on the Illumina NovaSeq 6000 System (Illumina 20012850), using the NovaSeq 6000 S2 Reagent Kit v1.5 (300 cycles) (Illumina 20028314) configured for 150 bp paired-end sequencing.
High-throughput sequencing data analysis
Quantification of base editing efficiency: Alignment of amplicon sequences to a reference sequence and calculation of single-nucleotide conversion rates were performed using a Matlab script6 (Matlab R2022b). To analyze the frequency of intended and bystander edits on individual CFTR alleles, CRISPResso257 was used to quantify the proportions of unedited alleles, alleles with perfect correction, and those containing other edits.
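The allele categories reported in Fig. 5d can be illustrated with a minimal sketch. This is not the CRISPResso2 interface or the authors' script; it is plain Python that assumes each read has already been trimmed to the 20-nt protospacer, uses the target (A7) and bystander (A3, A8, A11) positions given in the Results, and ignores indels and other edits. The reference string is a made-up placeholder, not the CFTR sequence.

```python
from collections import Counter

TARGET_POS = 7               # target adenine (protospacer position, from the text)
BYSTANDER_POS = (3, 8, 11)   # bystander adenines (from the text)

# Hypothetical 20-nt placeholder with A at positions 3, 7, 8, and 11 and no G.
PLACEHOLDER_REF = "CTACTCAATCATCCTTCCTC"


def with_edits(reference: str, positions) -> str:
    """Return the reference with A-to-G conversions at 1-based positions."""
    bases = list(reference)
    for pos in positions:
        bases[pos - 1] = "G"
    return "".join(bases)


def classify_allele(read: str) -> str:
    """Assign a protospacer-length read to one allele category."""
    target_edited = read[TARGET_POS - 1] == "G"
    bystander_edited = any(read[pos - 1] == "G" for pos in BYSTANDER_POS)
    if target_edited and not bystander_edited:
        return "perfect_correction"
    if target_edited and bystander_edited:
        return "correction_with_bystander"
    if bystander_edited:
        return "bystander_only"
    return "unedited"


def allele_fractions(reads) -> dict:
    """Fraction of reads in each category."""
    counts = Counter(classify_allele(read) for read in reads)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


# Toy read set (illustrative only, not experimental data).
reads = (
    [with_edits(PLACEHOLDER_REF, [7])] * 4
    + [with_edits(PLACEHOLDER_REF, [3, 7])] * 2
    + [with_edits(PLACEHOLDER_REF, [8])]
    + [PLACEHOLDER_REF] * 3
)
fractions = allele_fractions(reads)
print(fractions)
```

Classifying whole reads rather than individual positions is what distinguishes allele-level outcomes from per-base editing rates: a high A7 conversion rate can coexist with few perfectly corrected alleles if bystander edits fall on the same reads.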
Identification of de novo off-target edits: Quality control metrics were aggregated using MultiQC. Single-nucleotide variant calling from whole-genome sequencing data was conducted using the mpileup module from the nf-core/sarek pipeline (v3.5.1)58,59. To further identify novel single-nucleotide variants (SNVs) induced by ABE-NW1 with high confidence, we (1) selected variants with mapping quality > 30, sequencing depth > 10, and alternate allele frequency > 10%, (2) filtered out the variants overlapping with those detected in GFP- and ABE8e-treated HEK293T cell clones, (3) excluded the sites present in the NCBI dbSNP (v.153; http://www.ncbi.nlm.nih.gov/SNP/) database, and (4) restricted the analysis to canonical chromosomes (chromosomes 1–22, X, and Y). Shared SNVs detected across all five ABE-NW1-treated single-cell clones were selected for further validation.
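The four filtering steps and the final intersection across clones can be sketched as follows. This is a simplified illustration, not the nf-core/sarek or bcftools interface: variants are represented as plain dictionaries with hypothetical field names (`mapq`, `depth`, `alt_af`), the `chr`-prefixed contig naming is an assumption, and the toy records are invented for demonstration.

```python
CANONICAL = {f"chr{i}" for i in range(1, 23)} | {"chrX", "chrY"}


def variant_key(variant: dict) -> tuple:
    """Identify a variant by contig, position, and alleles."""
    return (variant["chrom"], variant["pos"], variant["ref"], variant["alt"])


def passes_quality(variant: dict) -> bool:
    """Step 1: thresholds taken from the text."""
    return variant["mapq"] > 30 and variant["depth"] > 10 and variant["alt_af"] > 0.10


def novel_snvs(clone_variants, control_keys: set, dbsnp_keys: set) -> set:
    """Apply steps 1-4 to the variants called in one clone."""
    return {
        variant_key(v)
        for v in clone_variants
        if passes_quality(v)
        and variant_key(v) not in control_keys   # step 2: seen in control clones
        and variant_key(v) not in dbsnp_keys     # step 3: known dbSNP site
        and v["chrom"] in CANONICAL              # step 4: canonical chromosomes
    }


def shared_snvs(per_clone_sets) -> set:
    """Variants present in every clone."""
    per_clone_sets = list(per_clone_sets)
    if not per_clone_sets:
        return set()
    return set.intersection(*per_clone_sets)


def record(chrom, pos, ref, alt, mapq=60, depth=30, alt_af=0.5):
    """Build a toy variant record (illustrative only)."""
    return {"chrom": chrom, "pos": pos, "ref": ref, "alt": alt,
            "mapq": mapq, "depth": depth, "alt_af": alt_af}


clone_a = [record("chr1", 100, "A", "G"), record("chr2", 200, "A", "G", mapq=20),
           record("chr3", 300, "T", "C"), record("chrUn_x", 5, "A", "G")]
clone_b = [record("chr1", 100, "A", "G"), record("chr4", 400, "A", "G")]
control_keys = {("chr3", 300, "T", "C")}
dbsnp_keys = set()

per_clone = [novel_snvs(c, control_keys, dbsnp_keys) for c in (clone_a, clone_b)]
print(shared_snvs(per_clone))
```

Requiring presence in every treated clone is a stringent criterion: it favors recurrent, editor-associated sites over clone-specific background mutations, at the cost of missing rare events.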
Sanger sequencing and analysis using EditR
The target genomic loci were amplified using Phusion Flash PCR Master Mix (Thermo Fisher) and sequenced by GENEWIZ. The Sanger sequencing traces were quantified using EditR60.
Electrophysiology assays
The 16HBE14ge CFTR W1282X cells were nucleofected with the Lonza 4D-Nucleofector system using the Lonza SG cell line 4D X Kit S and program CM-137. The edited and parental 16HBE14o- cells were seeded at a density of 4.5 × 10⁵ cells/cm² onto HTS Transwell 24-well filter inserts (Corning, 3378) pre-coated with human collagen type IV (Sigma-Aldrich, C5533). Cells were grown as submerged cultures in MEM (Gibco, 11095) containing 10% FBS (Hyclone, SH30071.03) and 1% Pen/Strep, and incubated at 37 °C and 5% CO2. After a total of 7 days, 16HBE cells typically formed electrically tight epithelia with a transepithelial resistance (Rt) around 1500 Ω·cm². CFTR-mediated Cl− equivalent current (Ieq) was determined as described below.
Prior to functional (Ieq) studies, MEM was replaced with fresh HEPES-buffered (pH 7.4) solutions (assay buffer). A driving force for chloride ions was established through application of a basolateral to apical chloride ion gradient (see buffer composition below). Cell plates were mounted onto an automated robotic assay platform and equilibrated at ~36 °C for 90 min. After equilibration, transepithelial voltage (Vt) and resistance (Rt) were monitored at ~5 min intervals using a 24-channel transepithelial current clamp amplifier (TECC-24, EP Design, Bertem, Belgium). Electrode potential differences for each pair of Ag/AgCl voltage electrodes were also monitored at 5 min intervals by taking voltage measurements from a control plate with matching buffer solutions and 16HBE cells that were left untreated. Ieq was calculated from values of Vt and Rt using Ohm’s law after correcting for series resistance and (electrode) voltage offsets unrelated to Vt. Ieq traces are plotted as mean ± SD (n = 3). The first four data points reflect baseline Ieq currents prior to sequential stimulation of CFTR with forskolin (10 μM) followed by VX-770/ivacaftor (1 μM). The last six data points were recorded in the presence of the CFTR inhibitor CFTRinh-172 (20 μM). CFTR-mediated changes in Ieq (the area under the curve (AUC) between forskolin and CFTRinh-172 addition) are used as a measure of functional CFTR surface expression or treatment-related functional rescue of mutant CFTR. Assay buffer: CFTR-mediated transepithelial currents were recorded using a Cl− concentration gradient. The basolateral solution contained (mM): 137 NaCl, 4 KCl, 1.8 CaCl2, 1 MgCl2, 10 HEPES and D-Glucose, adjusted to pH 7.4 with NaOH/HCl ([Cl−]total: 146.6 mM). The apical solution was matched to the basolateral except for (mM): 137 Na-gluconate replaced 137 NaCl ([Cl−]total: 9.6 mM).
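The calculations described above can be written out as a short sketch. The chloride totals follow from the listed salt concentrations. The exact form of the series-resistance and offset correction is not given in the text, so the subtraction used in `equivalent_current` is an assumption, as is the use of the trapezoidal rule for the AUC; the function names, sign convention, and example values are illustrative only.

```python
# Chloride ions contributed per formula unit of each salt.
CL_PER_SALT = {"NaCl": 1, "KCl": 1, "CaCl2": 2, "MgCl2": 2}


def total_chloride_mM(recipe: dict) -> float:
    """Sum the chloride contributed by each salt (concentrations in mM)."""
    return sum(conc * CL_PER_SALT.get(salt, 0) for salt, conc in recipe.items())


def equivalent_current(vt_mV, rt_ohm_cm2, v_offset_mV=0.0, r_series_ohm_cm2=0.0):
    """Ohm's law, Ieq = V / R, in µA/cm² (assumed form of the corrections)."""
    corrected_v = vt_mV - v_offset_mV
    corrected_r = rt_ohm_cm2 - r_series_ohm_cm2
    return 1000.0 * corrected_v / corrected_r  # mV / (Ω·cm²) = mA/cm²


def auc_trapezoid(times_min, values):
    """Area under a sampled trace by the trapezoidal rule (an assumption)."""
    area = 0.0
    for i in range(1, len(times_min)):
        dt = times_min[i] - times_min[i - 1]
        area += 0.5 * (values[i] + values[i - 1]) * dt
    return area


# Salt concentrations from the text (mM); HEPES contributes no chloride.
basolateral = {"NaCl": 137, "KCl": 4, "CaCl2": 1.8, "MgCl2": 1, "HEPES": 10}
apical = {**basolateral, "NaCl": 0, "Na-gluconate": 137}

print(round(total_chloride_mM(basolateral), 1))  # basolateral [Cl−] total
print(round(total_chloride_mM(apical), 1))       # apical [Cl−] total

# Hypothetical reading: Vt = -15 mV across Rt = 1500 Ω·cm².
print(equivalent_current(vt_mV=-15.0, rt_ohm_cm2=1500.0))
```

The computed totals reproduce the 146.6 mM and 9.6 mM values stated for the basolateral and apical solutions, respectively.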
ClinVar data analysis
Disease-associated variants that can be corrected by base editing were obtained from a ClinVar library described previously61. First, the list was filtered to exclude any variants classified as benign or of uncertain significance. Next, the variants caused by G>A or C>T single nucleotide conversions were selected. These variants were then categorized based on the number of adenines within a 10-bp base editing window (positions 3-12). The presence of multiple adenines within the editing window serves as an indicator of potential ABE-mediated bystander editing at the corresponding site. Finally, the fractions of the variants harboring single A (precise editing) or multiple As (bystander editing) within the editing window were calculated using GraphPad Prism 9.
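The categorization step can be sketched in a few lines. This is an illustrative reimplementation, not the authors' analysis: it assumes each variant is represented by a 20-nt protospacer written 5′ to 3′ on the strand carrying the target adenine, with position 1 at the PAM-distal end, and the example sequences are invented.

```python
from collections import Counter

WINDOW = range(3, 13)  # protospacer positions 3-12 (1-based, inclusive)


def adenines_in_window(protospacer: str) -> int:
    """Count adenines at protospacer positions 3-12."""
    return sum(1 for pos in WINDOW if protospacer[pos - 1].upper() == "A")


def categorize(protospacer: str) -> str:
    """Label a site by the number of editable adenines in the window."""
    n = adenines_in_window(protospacer)
    if n == 0:
        return "no_A_in_window"
    return "single_A" if n == 1 else "multiple_A"


# Hypothetical protospacers (illustrative only, not ClinVar entries).
sites = [
    "GCAGCTCGCTCCGGTCTTCC",  # one A in the window (position 3)
    "GCAACTAGCTCCGGTCTTCC",  # three As in the window (positions 3, 4, 7)
    "AAGGCTCGCTCCAGTCTTCC",  # As only outside the window (positions 1, 2, 13)
]
counts = Counter(categorize(site) for site in sites)
fractions = {label: n / len(sites) for label, n in counts.items()}
print(fractions)
```

Sites labeled `multiple_A` are those where an editor with a 10-bp window would be expected to risk bystander editing.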
Statistics and reproducibility
All bar plots and figures except for the Venn diagram were generated using GraphPad Prism 9. P values were calculated using Prism 9 by performing two-tailed Student’s t test (paired two-sample for means), with a statistical significance level represented on each figure as ns (not significant), *P < 0.05, **P < 0.01, ***P < 0.001, and ****P < 0.0001. Venn diagrams were generated using R (v4.3.2). All experiments were performed with at least three independent biological replicates, unless otherwise noted. The bar and dot plots were presented as the mean ± standard error of the mean (SEM), unless otherwise noted. No statistical method was used to predetermine sample size. No data were excluded from the analysis.
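For readers reproducing the summary statistics outside Prism, the mean ± SEM and the paired t statistic can be computed as sketched below with the Python standard library. The replicate values are hypothetical, and the p value is deliberately omitted because it requires the t distribution (available, for example, through `scipy.stats.ttest_rel`).

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample SD."""
    n = len(values)
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def paired_t_statistic(group_a, group_b):
    """t statistic for a paired two-sample test; degrees of freedom = n - 1."""
    if len(group_a) != len(group_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(group_a, group_b)]
    mean_diff, sem_diff = mean_sem(diffs)
    return mean_diff / sem_diff


# Hypothetical editing percentages from three paired replicates.
editor_a = [50.0, 55.0, 60.0]
editor_b = [40.0, 47.0, 48.0]

mean_a, sem_a = mean_sem(editor_a)
t_stat = paired_t_statistic(editor_a, editor_b)
print(f"editor A: {mean_a:.1f} ± {sem_a:.2f} (mean ± SEM)")
print(f"paired t = {t_stat:.3f}, df = {len(editor_a) - 1}")
```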
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The NCBI ClinVar database is accessible at https://www.ncbi.nlm.nih.gov/clinvar/. All the raw DNA sequencing data are available at the NCBI Sequence Read Archive database under accession number PRJNA1220701. The plasmids encoding the base editor variants generated in this study are available through Addgene [https://www.addgene.org/browse/article/28259363/]. All other data supporting this study are included within the paper, the Supplementary Information file, the Supplementary Data files, and the Source Data file. All datasets are publicly accessible and will be permanently available, with no restrictions on data availability. Source data are provided with this paper.
References
Rees, H. A. & Liu, D. R. Base editing: precision chemistry on the genome and transcriptome of living cells. Nat. Rev. Genet. 19, 770–788 (2018).
Neugebauer, M. E. et al. Evolution of an adenine base editor into a small, efficient cytosine base editor with low off-target activity. Nat. Biotechnol. 41, 673–685 (2023).
Chen, L. et al. Re-engineering the adenine deaminase TadA-8e for efficient and specific CRISPR-based cytosine base editing. Nat. Biotechnol. 41, 663–672 (2023).
Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A. & Liu, D. R. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420–424 (2016).
Richter, M. F. et al. Phage-assisted evolution of an adenine base editor with improved Cas domain compatibility and activity. Nat. Biotechnol. 38, 883–891 (2020).
Gaudelli, N. M. et al. Programmable base editing of A*T to G*C in genomic DNA without DNA cleavage. Nature 551, 464–471 (2017).
Kurt, I. C. et al. CRISPR C-to-G base editors for inducing targeted DNA transversions in human cells. Nat. Biotechnol. 39, 41–46 (2021).
Chen, L. et al. Adenine transversion editors enable precise, efficient A*T-to-C*G base editing in mammalian cells and embryos. Nat. Biotechnol. 42, 638–650 (2024).
Zhang, E., Neugebauer, M. E., Krasnow, N. A. & Liu, D. R. Phage-assisted evolution of highly active cytosine base editors with enhanced selectivity and minimal sequence context preference. Nat. Commun. 15, 1697 (2024).
Chen, L. et al. Engineering a precise adenine base editor with minimal bystander editing. Nat. Chem. Biol. 19, 101–110 (2023).
Kim, Y. B. et al. Increasing the genome-targeting scope and precision of base editing with engineered Cas9-cytidine deaminase fusions. Nat. Biotechnol. 35, 371–376 (2017).
Gehrke, J. M. et al. An APOBEC3A-Cas9 base editor with minimized bystander and off-target activities. Nat. Biotechnol. 36, 977–982 (2018).
Popa, S. C., Inamoto, I., Thuronyi, B. W. & Shin, J. A. Phage-assisted continuous evolution (PACE): a guide focused on evolving protein-DNA interactions. ACS Omega 5, 26957–26966 (2020).
Lin, M., Malik, F. K. & Guo, J. T. A comparative study of protein-ssDNA interactions. NAR Genom. Bioinform. 3, lqab006 (2021).
Tompkins, K. J. et al. Molecular underpinnings of ssDNA specificity by Rep HUH-endonucleases and implications for HUH-tag multiplexing and engineering. Nucleic Acids Res. 49, 1046–1064 (2021).
Wang, X., McLachlan, J., Zamore, P. D. & Hall, T. M. Modular recognition of RNA by a human pumilio-homology domain. Cell 110, 501–512 (2002).
Lapinaite, A. et al. DNA capture by a CRISPR-Cas9-guided adenine base editor. Science 369, 566–571 (2020).
Shi, K. et al. Structural basis for targeted DNA cytosine deamination and mutagenesis by APOBEC3A and APOBEC3B. Nat. Struct. Mol. Biol. 24, 131–139 (2017).
Koh, Y. Y. et al. Stacking interactions in PUF-RNA complexes. RNA 17, 718–727 (2011).
Pallaseni, A. et al. Predicting base editing outcomes using position-specific sequence determinants. Nucleic Acids Res. 50, 3551–3564 (2022).
Tan, J., Zhang, F., Karcher, D. & Bock, R. Engineering of high-precision base editors for site-specific single nucleotide replacement. Nat. Commun. 10, 439 (2019).
Hu, J. H. et al. Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature 556, 57–63 (2018).
Nishimasu, H. et al. Engineered CRISPR-Cas9 nuclease with expanded targeting space. Science 361, 1259–1262 (2018).
Kleinstiver, B. P. et al. High-fidelity CRISPR-Cas9 nucleases with no detectable genome-wide off-target effects. Nature 529, 490–495 (2016).
Lam, D. K. et al. Improved cytosine base editors generated from TadA variants. Nat. Biotechnol. 41, 686–697 (2023).
Chen, L. et al. Adenine transversion editors enable precise, efficient A*T-to-C*G base editing in mammalian cells and embryos. Nat. Biotechnol. 42, 638–650 (2024).
Kim, D. et al. Genome-wide target specificities of CRISPR RNA-guided programmable deaminases. Nat. Biotechnol. 35, 475–480 (2017).
Liang, P. et al. Genome-wide profiling of adenine base editor specificity by EndoV-seq. Nat. Commun. 10, 67 (2019).
Zuo, E. et al. Cytosine base editor generates substantial off-target single-nucleotide variants in mouse embryos. Science 364, 289–292 (2019).
Jin, S. et al. Cytosine, but not adenine, base editors induce genome-wide off-target mutations in rice. Science 364, 292–295 (2019).
Doman, J. L., Raguram, A., Newby, G. A. & Liu, D. R. Evaluation and minimization of Cas9-independent off-target DNA editing by cytosine base editors. Nat. Biotechnol. 38, 620–628 (2020).
Matuszek, Z. et al. Base editing of trinucleotide repeats that cause Huntington’s disease and Friedreich’s ataxia reduces somatic repeat expansions in patient cells and in mice. Nat. Genet. 57, 1437–1451 (2025).
Arbab, M. et al. Determinants of base editing outcomes from target library analysis and machine learning. Cell 182, 463–480.e430 (2020).
Castellani, C. & CFTR2 team. CFTR2: How will it help care? Paediatr. Respir. Rev. 14, 2–5 (2013).
Haggie, P. M. et al. Correctors and potentiators rescue function of the truncated W1282X-cystic fibrosis transmembrane regulator (CFTR) translation product. J. Biol. Chem. 292, 771–785 (2017).
Valley, H. C. et al. Isogenic cell models of cystic fibrosis-causing variants in natively expressing pulmonary epithelial cells. J. Cyst. Fibros. 18, 476–483 (2019).
Jiang, W., Bikard, D., Cox, D., Zhang, F. & Marraffini, L. A. RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nat. Biotechnol. 31, 233–239 (2013).
Mention, K. et al. Use of adenine base editing and homology-independent targeted integration strategies to correct the cystic fibrosis causing variant, W1282X. Hum. Mol. Genet. 32, 3237–3248 (2023).
Skach, W. R. Defects in processing and trafficking of the cystic fibrosis transmembrane conductance regulator. Kidney Int. 57, 825–831 (2000).
Doman, J. L. et al. Phage-assisted evolution and protein engineering yield compact, efficient prime editors. Cell 186, 3983–4002.e3926 (2023).
Li, J. et al. Structure-guided engineering of adenine base editor with minimized RNA off-targeting activity. Nat. Commun. 12, 2287 (2021).
Rees, H. A., Wilson, C., Doman, J. L. & Liu, D. R. Analysis and minimization of cellular RNA editing by DNA adenine base editors. Sci. Adv. 5, eaax5717 (2019).
Huang, T. P. et al. Circularly permuted and PAM-modified Cas9 variants broaden the targeting scope of base editors. Nat. Biotechnol. 37, 626–631 (2019).
Yan, J. et al. Improving prime editing with an endogenous small RNA-binding protein. Nature 628, 639–647 (2024).
Lazzarotto, C. R. et al. CHANGE-seq-BE enables simultaneously sensitive and unbiased in vitro profiling of base editor genome-wide activity. bioRxiv (2024).
Chen, J. S. et al. CRISPR-Cas12a target binding unleashes indiscriminate single-stranded DNase activity. Science 360, 436–439 (2018).
Hou, Z. et al. Efficient genome engineering in human pluripotent stem cells using Cas9 from Neisseria meningitidis. Proc. Natl. Acad. Sci. USA 110, 15644–15649 (2013).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).
Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).
Suzuki, S. et al. Highly efficient gene editing of cystic fibrosis patient-derived airway basal cells results in functional CFTR correction. Mol. Ther. 28, 1684–1695 (2020).
Li, C. et al. Prime editing-mediated correction of the CFTR W1282X mutation in iPSCs and derived airway epithelial cells. PLoS ONE 18, e0295009 (2023).
Jiang, T. et al. Chemical modifications of adenine base editor mRNA and guide RNA expand its application scope. Nat. Commun. 11, 1979 (2020).
Komor, A. C. et al. Improved base excision repair inhibition and bacteriophage Mu Gam protein yields C:G-to-T:A base editors with higher efficiency and product purity. Sci. Adv. 3, eaao4774 (2017).
Krishnamurthy, S. et al. Functional correction of CFTR mutations in human airway epithelial cells using adenine base editors. Nucleic Acids Res. 49, 10558–10572 (2021).
Liu, B. et al. A split prime editor with untethered reverse transcriptase and circular RNA template. Nat. Biotechnol. 40, 1388–1393 (2022).
Clement, K. et al. CRISPResso2 provides accurate and rapid genome editing sequence analysis. Nat. Biotechnol. 37, 224–226 (2019).
Hanssen, F. et al. Scalable and efficient DNA sequencing analysis on different compute infrastructures aiding variant discovery. NAR Genom. Bioinform. 6, lqae031 (2024).
Garcia, M. et al. Sarek: a portable workflow for whole-genome sequencing analysis of germline and somatic variants. F1000Res 9, 63 (2020).
Kluesner, M. G. et al. EditR: a method to quantify base editing from Sanger sequencing. CRISPR J. 1, 239–250 (2018).
Hanna, R. E. et al. Massively parallel assessment of human variants with base editor screens. Cell 184, 1064–1080.e1020 (2021).
Acknowledgements
We thank the Cystic Fibrosis Foundation Laboratory for providing the CFF-16HBEge CFTR W1282X cells and primary bronchial epithelial cells homozygous for W1282X. We thank the Genomics Core Facility at the Icahn School of Medicine at Mount Sinai for performing the WGS experiment. T.J. was supported by grants from the National Institutes of Health (R00HL153940) and the Cystic Fibrosis Foundation (005363I223).
Author information
Contributions
T.J. designed the study. T.J., I.V., I.O., D.P., K.G., J.H., and K.C. performed experiments and analyzed the data. E.E. and S.A.C. performed whole-genome sequencing. S.G.M. analyzed the WGS data. R.P.S., H.C.V., K.C., and M.M. provided suggestions for this project. T.J. wrote the manuscript with comments from all authors.
Ethics declarations
Competing interests
T.J. has submitted a patent application to the Mount Sinai patent office pertaining to the results reported in this work. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Valdez, I., O’Connor, I., Patel, D. et al. A streamlined base editor engineering strategy to reduce bystander editing. Nat Commun 16, 8115 (2025). https://doi.org/10.1038/s41467-025-63609-6
DOI: https://doi.org/10.1038/s41467-025-63609-6