AI-guided redesign of laboratory-evolved reverse transcriptases enhances prime editing

Tao, Y. Allen; Sakai, Holt A.; Jiang, Allen Y.; Krasnow, Nicholas A.; Vaganov, Vasilii S.; Shim, Brian; Barsdale, Zachary; Pandey, Smriti; Ahmed, Nouraiz; Na, Man; Liao, Ting-Wei; Oye, Keyede; Cristian, Ana; Zhang, Emily; Xu, Joy A.; Bulcaen, Mattijs; Liu, David R.

doi:10.1038/s41587-026-03149-6

Article
Open access
Published: 21 May 2026

Nature Biotechnology (2026)

Abstract

Although protein engineering and laboratory evolution have been used to optimize prime editors, we show that previous changes that improve prime editor efficiency also compromise protein stability and expression level, limiting performance. To address these limitations, we apply structure-informed artificial intelligence-guided methods such as the inverse-folding network ProteinMPNN to redesign the reverse transcriptase (RT) domains of engineered and evolved prime editors while preserving regions essential for catalysis. Redesigned RTs are extensively mutated, with 30–163 amino acid substitutions, and exhibit enhanced folding stability and soluble expression and up to twofold higher intracellular prime editor protein levels following mRNA delivery. Redesigned PE8 prime editors demonstrate enhanced editing efficiencies across multiple ex vivo contexts, including in several human primary cell types and via several delivery modalities. In mice, editing efficiency is up to 2.9-fold higher than that of state-of-the-art PE6, PE7 and PEmax prime editors. These findings demonstrate a generalizable approach for augmenting laboratory evolution to improve genome editing agents.

Main

Prime editors are versatile and precise molecular machines that perform search-and-replace gene editing by combining a programmable nickase and reverse transcriptase (RT), guided by a prime editing guide RNA (pegRNA), to replace a targeted DNA sequence with a user-defined alternative¹. In contrast to nuclease-based approaches, prime editors can install substitutions, deletions and small insertions through a unique mechanism that avoids double-stranded breaks, reducing the frequency of uncontrolled mixtures of insertions and deletions (indels) and chromosomal rearrangements¹. The precision and versatility of prime editing enable applications in functional genomics^2,3 and therapeutic genome editing in cells, animals and humans^4,5,6. For example, prime editing can rescue a variety of genetic diseases in animal models^{7,8,9,10,11,12,13} and has been used ex vivo to treat chronic granulomatous disease in humans¹⁴.

We and others have improved the properties of prime editors to increase their efficiency, product purity and versatility^{15,16,17,18,19,20,21}. Although previous prime editing protein engineering and laboratory evolution efforts have primarily focused on optimizing the catalytic properties of RTs^21,22,23,24, complementary approaches to improve expression levels or biophysical properties of prime editors are largely unexplored. For large protein cargoes such as prime editors, insufficient soluble expression and thermal instability can lead to misfolding or aggregation processes that reduce the concentration of active protein complex and impede research or therapeutic applications^25,26. Expression and stability limitations can constrain gene editing outcomes across many different delivery modalities, particularly in clinical applications in which dose and exposure may be intentionally minimized²⁷.

Phage‑assisted continuous evolution (PACE) is a powerful laboratory evolution system that has been applied to diverse genome engineering systems, including base editors, prime editors, recombinases and CRISPR-associated transposases^{21,28,29,30,31,32,33}. Laboratory-evolved proteins, however, often accumulate mutations that increase catalytic function at the expense of soluble expression levels and thermodynamic stability of the folded protein state^34,35,36,37. Indeed, proteins that acquire new or improved activities through laboratory evolution can require substantial re-engineering to recover losses in thermostability before they can be effectively deployed in their intended applications^38,39. For prime editors, PACE yielded RT variants with enhanced processivity and editing kinetics, resulting in PE6 prime editors that support improved editing efficiencies²¹. These variants have not yet been studied for potential bottlenecks that may limit activity due to intrinsic biophysical properties of evolved and engineered RTs. Suboptimal stability and folding properties in mammalian cells may limit prime editor performance in therapeutic genome editing applications, particularly in transient expression contexts in vivo⁴⁰.

Recent developments in computational protein design enable capabilities that in principle can address potential deficiencies in proteins emerging from directed evolution^41,42,43. Protein deep learning models have already shown promise in improving the catalytic properties of enzymes used in base editing and prime editing, such as TadA-8e deaminase and Moloney murine leukemia virus (M-MLV) RT^44,45. However, much like directed evolution, these approaches often prioritize mutations that maximize enzymatic activity at the expense of optimizing global biophysical properties. By contrast, we (N.A.K., manuscript in revision) and others have demonstrated that ProteinMPNN can enable structure‑informed sequence redesign that preserves protein catalytic function while improving stability and folding thermodynamics^46,47,48.

We hypothesized that application of ProteinMPNN to the RT domain of prime editors, starting from variants we previously optimized by laboratory evolution or engineering, may yield prime editors with editing efficiencies superior to those of state-of-the-art systems. Because ProteinMPNN is trained on experimentally determined structures that are enriched for reliably folded and expressible proteins, its designs can offer enhanced folding stability and solubility. Therefore, the global refinement of sequence features important for folding and solubility, when carefully restricted to residues that are not directly involved in catalysis^47,49, could enhance thermal stability and expression in mammalian cells without compromising enzymatic function, allowing us to combine each of the key strengths of artificial intelligence (AI)-guided protein design and directed evolution. Increasing the intracellular concentration of active prime editor complex could maximize editing efficiency per dosed molecule, especially in challenging transient in vivo settings. Due to the modularity of prime editor components, we anticipated that computationally redesigned RTs should maintain compatibility with other orthogonal advances that improve editing outcomes such as Cas domain variants, engineered pegRNAs (epegRNAs) and fusion constructs with RNA-binding domains such as La^{17,18,19,50,51}. Application of computational protein redesign to RTs (highly dynamic proteins that require precise geometric coordination between the conformationally labile fingers domain and catalytic palm domain, as well as template, primer and dNTP substrates⁵²) would also demonstrate the applicability of this approach to enzyme classes with complex, demanding multistep and multisubstrate mechanisms.

Here, we apply AI-guided sequence redesign to generate thermodynamically stabilized prime editor RT domains from starting points previously optimized for catalytic properties through engineering and evolution. Redesigned RTs based on those from Ec48, Tf1 and M-MLV retain robust reverse transcription activity while exhibiting substantially improved thermal stability and expression levels in mammalian cells. Incorporation of these stabilized RTs into prime editors and delivery of the corresponding mRNA using lipid nanoparticle (LNP)-based formulations⁵³ enhanced average prime editing correction efficiencies of 700 pathogenic variants in the ClinVar database^54,55. These redesigned prime editors also demonstrated improved performance in targeted applications of therapeutic relevance, including mRNA electroporation for ex vivo cell therapy in primary T cells and hematopoietic stem and progenitor cells (HSPCs), engineered virus-like particle (eVLP)-mediated ribonucleoprotein (RNP) delivery and in vivo LNP delivery of reagents programmed to orthogonalize interleukin-2 receptor-β (IL-2Rβ) and to correct pathogenic alleles that cause Bloom syndrome, Crigler–Najjar syndrome, Tay-Sachs disease, sickle cell disease and familial hypercholesterolemia. Based on their general superior performance over PE6 and PEmax–La (PE7) variants, we designate these new prime editor variants PE8max, PE8c and PE8d. Together, these results establish that computationally stabilized RTs expand the performance and applicability of prime editing and suggest that protein deep learning model-guided redesign may offer a general approach to enhancing complex, multisubstrate laboratory-evolved proteins.

Results

Laboratory-evolved RTs have reduced stability

We previously applied PACE to evolve RTs with enhanced activity for prime editing, yielding variants PE6a (evolved from the Ec48 retron RT), PE6b (evolved from the Tf1 retrotransposon RT), PE6c (engineered from PE6b) and PE6d (evolved from the engineered M-MLV retrovirus RT used in the original PE3 prime editor)²¹. These evolved and engineered RTs yielded up to 22-fold higher prime editing efficiencies than their wild-type RT starting points and have been successfully applied to therapeutic prime editing in cultured cells and in vivo, in which their compact size or ability to support high-efficiency prime editing have demonstrated broad utility²¹. However, the mutations introduced during directed evolution and protein engineering campaigns that improve activity often impair folding kinetics or thermodynamics, resulting in proteins that may suffer from decreased stability and impaired soluble expression^34,35,37,38. These issues are amplified in large enzymes such as multidomain prime editors that can approach 2,000 amino acids. Such proteins can rely heavily on co-translational folding and chaperone processes, rendering these proteins susceptible to misfolding, ribosome stalling and aggregation^56,57. We hypothesized that expression level and stability may be bottlenecks for current-generation prime editors, particularly those that have undergone many successive rounds of laboratory evolution in PACE.

To determine if soluble expression level may be a potential bottleneck for prime editing, we quantified the amount of Streptococcus pyogenes Cas9 prime editor protein produced in Hepa1-6 cells from LNP-delivered mRNA constructs encoding prime editors. These prime editors contained either wild-type RTs (WT-Ec48 or WT-Tf1) or laboratory-evolved or engineered RT variants (PEmax, PE6a, PE6c or PE6d). We also included an mRNA encoding Cas9 nuclease for comparison. We performed an anti-Cas9 enzyme-linked immunosorbent assay (ELISA) to quantify protein levels in cell lysates from treated Hepa1-6 cells (a mouse hepatocyte line) at 2, 8, 24 and 48 h after transfection of LNP-mRNA.

Compared to prime editors containing wild-type RTs or Cas9 nuclease, prime editors containing evolved or engineered RT variants expressed at substantially lower levels. The peak protein levels (8 h after transfection) for prime editors with evolved RTs were 1.5- to 2.0-fold lower than their counterparts containing wild-type RTs (Fig. 1a). Together, these observations establish that prime editors containing evolved or engineered RTs are expressed at lower levels than prime editors that use wild-type RTs. Because the prime editors tested in Fig. 1a all use the PEmax-based SpCas9 domain with only 3 amino acid substitutions (R221K, N394K and H840A), these data suggest that targeted redesign of the highly engineered and evolved RT domains of prime editors, which contain up to 17 amino acid substitutions, offers the greatest opportunity to improve their expression levels.

**Fig. 1: Sequence design of previous engineered and evolved RTs.**

Sequence redesign of RTs and in silico structural prediction

We hypothesized that mutations from laboratory evolution or engineering destabilize the folded state of RTs, resulting in faster degradation in cells and the lower levels of prime editor protein observed in Fig. 1a. To address these issues, we applied deep learning-based protein sequence redesign tools to the set of evolved and engineered RNA-dependent DNA polymerases in PEmax, PE6a, PE6c and PE6d. Building on previous strategies^47,48 for redesigning proteins to enhance stability while preserving enzymatic function, we developed a computational pipeline for streamlined design and in silico screening of RT candidates (Fig. 1b). First, for each RT starting point, we selected a set of amino acid residues to exclude from redesign on the basis of (1) proximity to the conserved catalytic core of the RT palm subdomain and (2) evolutionary conservation of residues that are likely essential for enzyme function. We anticipated that constraining these residues would increase the frequency of functional variants among redesigned RTs by avoiding alteration of amino acids critical for substrate binding or catalytic turnover. The remaining residues were allowed to vary during redesign with ProteinMPNN, using the AlphaFold3-predicted structures as protein backbone inputs.

We applied this computational workflow to prime editors, selecting the RTs from PEmax, PE6a, PE6c and PE6d as starting points for sequence redesign due to their diverse phylogenetic origins, prime editing performance characteristics and sizes (Fig. 1b). We began by testing distance cutoffs of 15, 18 and 20 Å, such that only amino acids farther from the DNA primer–RNA template complex than these distance thresholds were permitted to vary during redesign. These distance thresholds safely encompass the catalytic aspartate triad and the two Mg²⁺ cations within the active site of the palm subdomain in the pre- and postcatalytic conformation. We also implemented two sequence conservation thresholds of 25% and 50%, such that residues with conservation frequencies above these cutoffs in the multiple sequence alignment were also excluded from redesign (Fig. 1c). Combining the three distance thresholds with the two conservation cutoffs yielded six constraint configurations. For each configuration, we generated 16 ProteinMPNN-designed sequences, resulting in a total of 96 candidate RTs redesigned from each of the four RT starting points.
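The constraint grid described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the function name `designable_positions`, the per-residue input lists and the toy values are hypothetical, and the sketch assumes that each residue has already been annotated with its minimum distance to the primer–template complex and its conservation frequency in the alignment. Only the thresholds and counts (15, 18 and 20 Å; 25% and 50%; 16 designs per configuration; four starting RTs) come from the text.

```python
from itertools import product

# Thresholds and counts quoted from the text.
DISTANCE_CUTOFFS_A = (15.0, 18.0, 20.0)
CONSERVATION_CUTOFFS = (0.25, 0.50)
DESIGNS_PER_CONFIG = 16
N_STARTING_RTS = 4


def designable_positions(min_dist_a, conservation, dist_cutoff, cons_cutoff):
    """Return 0-based indices of residues that are allowed to vary.

    A residue may vary only if it lies farther than dist_cutoff (in angstroms)
    from the substrate complex AND its conservation frequency does not exceed
    cons_cutoff. Every other residue is held fixed. (Hypothetical helper.)
    """
    if len(min_dist_a) != len(conservation):
        raise ValueError("per-residue inputs must have equal length")
    return [
        i
        for i, (d, c) in enumerate(zip(min_dist_a, conservation))
        if d > dist_cutoff and c <= cons_cutoff
    ]


# Three distance cutoffs x two conservation cutoffs = six configurations.
configs = list(product(DISTANCE_CUTOFFS_A, CONSERVATION_CUTOFFS))
designs_per_rt = len(configs) * DESIGNS_PER_CONFIG
total_designs = designs_per_rt * N_STARTING_RTS

# Toy per-residue annotations (hypothetical values, for illustration only).
toy_dist = [5.0, 16.0, 19.0, 25.0, 30.0]
toy_cons = [0.90, 0.10, 0.40, 0.60, 0.20]

for dist_cutoff, cons_cutoff in configs:
    free = designable_positions(toy_dist, toy_cons, dist_cutoff, cons_cutoff)
    print(f"{dist_cutoff:.0f} A, {cons_cutoff:.0%} conservation -> designable {free}")

print(designs_per_rt, total_designs)
```

In this toy example, tightening either cutoff shrinks the designable set, which mirrors the trade-off between sequence exploration and preservation of functionally important residues discussed in the text.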

The resulting sequences were subjected to in silico validation using AlphaFold2 to assess structural fidelity. We used two metrics to filter low-confidence designs: the predicted local distance difference test (pLDDT), which reflects AlphaFold’s confidence in the predicted structure, and root mean squared deviation (r.m.s.d.) from the starting RT’s predicted structure. The 384 initial redesigned RTs explored novel sequence space while retaining high predicted structural similarity to the starting RTs (median r.m.s.d. of 1.48, 0.94, 1.94 and 1.71 Å for PE6a, PE6c, PE6d and PEmax, respectively) and high confidence in predicted structures (median pLDDT of 93.9, 85.4, 88.7 and 88.1 for PE6a, PE6c, PE6d and PEmax, respectively; Fig. 1d). More stringent constraints generally led to higher similarity but lower confidence outcomes, with the most highly constrained group (20 Å and 25% conservation) generally showing lower r.m.s.d. values but lower pLDDT scores than the least constrained group (15 Å and 50% conservation; Extended Data Fig. 1). For example, redesigned PE6c RT variants showed improved r.m.s.d. values but decreased pLDDT scores under the most stringent constraints (20 Å) compared to the least stringent (15 Å; Fig. 1e). We reason that allowing broader sequence redesign under less stringent constraints increased order at otherwise flexible termini or disordered regions, elevating average pLDDT scores.

We selected the top 48 redesigns per starting point for experimental validation based on the AlphaFold2-derived metrics r.m.s.d. and pLDDT. This workflow rapidly generated divergent RT designs with up to 40% redesigned residues (up to 202 amino acid substitutions) compared to engineered and evolved starting points, demonstrating that redesign with ProteinMPNN can access novel regions of sequence space while preserving overall predicted structures of prime editor RTs.
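One way such an in silico triage could be implemented is sketched below. The text states only that selection was based on r.m.s.d. and pLDDT, so the filter thresholds (`max_rmsd`, `min_plddt`) and the ranking rule (pLDDT descending, then r.m.s.d. ascending) are assumptions made for illustration, as are the `Design` class, the `select_top` function and the toy scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    name: str
    plddt: float  # predicted structure confidence (0-100)
    rmsd: float   # angstroms, relative to the starting RT's predicted structure


def select_top(designs, n, max_rmsd=2.0, min_plddt=80.0):
    """Filter out low-confidence designs, then keep the n best.

    The thresholds and the sort key are hypothetical choices; the paper
    does not specify how the two metrics were combined.
    """
    passing = [d for d in designs if d.rmsd <= max_rmsd and d.plddt >= min_plddt]
    return sorted(passing, key=lambda d: (-d.plddt, d.rmsd))[:n]


# Toy candidates with made-up scores (not data from the study).
candidates = [
    Design("toy-1", plddt=93.0, rmsd=1.2),
    Design("toy-2", plddt=85.0, rmsd=0.9),
    Design("toy-3", plddt=70.0, rmsd=0.8),  # fails the pLDDT filter
    Design("toy-4", plddt=90.0, rmsd=2.5),  # fails the r.m.s.d. filter
    Design("toy-5", plddt=93.0, rmsd=1.0),
]

top = select_top(candidates, n=2)
print([d.name for d in top])
```

In the study, the analogous step kept 48 designs per starting RT; here `n=2` simply keeps the toy example small.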

Redesigned RTs enhance prime editing efficiency in cultured cells

We assessed the performance of sequence redesigned RTs in prime editing experiments in cultured mammalian cells. Rather than using plasmid-based transfection experiments, which are convenient but lack therapeutic relevance and can achieve high prime editing efficiencies that are no longer primarily limited by prime editor properties, we sought to isolate the potential effects of translational enhancements (thermal stability, soluble expression and folding dynamics) by using LNP-mediated RNA delivery that is already widely used in preclinical and clinical nuclease, base editing and prime editing programs^{14,58,59,60,61,62}. We cloned gene fragments for each redesigned RT into an in vitro transcription (IVT) cassette, generated and purified prime editor mRNA and formulated LNPs using a previously optimized OF-02 ionizable lipid-based formulation^53,63. Of the 192 total selected designs, 174 sequences were amenable to arrayed gene synthesis (Fig. 2a).

**Fig. 2: Redesigned RTs improve prime editing in cultured cells by mRNA-LNP.**

We delivered an admixed formulation of prime editor mRNA with synthetic epegRNA and nick guide RNA (ngRNA) components designed to install a Pcsk9 +1 TTAC insertion that is protective against hypercholesterolemia into mouse Hepa1-6 cells and measured editing 3 days after transfection by high-throughput sequencing (HTS; Fig. 2b). We quantified prime editor performance improvement by normalizing raw percent editing values to that of the predesign starting point for each variant, calculating the fold improvement for each candidate.

Among 174 redesigned RTs, virtually all designs (165/174, 95%) supported prime editing efficiencies above 5%, and a substantial proportion (52/174, 30%) resulted in editing efficiencies above those of the engineered or evolved starting points (Fig. 2c). Top-performing RT variants supported efficient prime editing at Pcsk9 with improvements of 2.3-fold for PEmax, 1.7-fold for PE6a, 1.3-fold for PE6c and 1.4-fold for PE6d. These improvements are remarkable for an edit with an already robust baseline LNP-mediated prime editing efficiency (19–56%; Fig. 2d and Extended Data Fig. 2). Nevertheless, we recognize the possibility that relying on a single-target screen for initial triage may have filtered out variants with broader or more context-dependent advantages. Together, these data demonstrate that AI-guided sequence redesign of RTs can produce functional variants that support improved prime editing outcomes compared to current state-of-the-art prime editor RTs.
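The normalization and summary statistics used in this screen amount to simple arithmetic, sketched below. The function names and the toy editing values are hypothetical; only the counts 165, 52 and 174 and the 5% floor come from the text.

```python
def fold_improvement(variant_pct, parent_pct):
    """Editing efficiency of a redesigned variant relative to its starting point."""
    if parent_pct <= 0:
        raise ValueError("starting-point editing efficiency must be positive")
    return variant_pct / parent_pct


def summarize(variant_pcts, parent_pct, floor_pct=5.0):
    """Count variants above an absolute floor and above the starting point."""
    above_floor = sum(1 for v in variant_pcts if v > floor_pct)
    above_parent = sum(1 for v in variant_pcts if fold_improvement(v, parent_pct) > 1.0)
    return above_floor, above_parent


# Toy percent-editing values for one lineage (hypothetical, not study data).
toy_parent = 20.0
toy_variants = [46.0, 18.0, 3.0, 25.0]
print(fold_improvement(toy_variants[0], toy_parent))  # 46% vs 20% editing
print(summarize(toy_variants, toy_parent))

# Fractions reported in the text, recomputed from the stated counts.
pct_functional = round(100 * 165 / 174)
pct_improved = round(100 * 52 / 174)
print(pct_functional, pct_improved)
```

Normalizing to the starting point of each lineage makes designs from different RT families comparable even though their baseline efficiencies differ.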

To gain a better understanding of the parameters that resulted in improved prime editing, we examined the relationship between editing performance and the constraints for distance and conservation that were applied during redesign (Extended Data Fig. 3a). To aggregate data from different RTs, we normalized raw editing efficiencies for each variant to the activity of starting points from each lineage. Across all designs, higher stringency conservation thresholds were a strong determinant of prime editing efficiencies. Stringent conservation constraints (≥25% cutoff) resulted in better performance than more permissive designs (≥50% cutoff; Extended Data Fig. 3a). However, no substantial differences were observed based on distance constraints, suggesting that even the minimal distance threshold of 15 Å already retained sufficient structural residues (Extended Data Fig. 3a). Although optimal distance and conservation constraints likely differ among other classes of protein functions and folds, these observations underscore an important balance between leveraging evolutionary conservation and preserving enzymatic activity.

Although not an explicit redesign constraint, we also considered whether redesigned RTs preserved or mutated PACE-evolved or rationally engineered substitutions. Because activity-based directed evolution campaigns often result in mutations near the catalytic site, we anticipated that targeted redesign of substrate-distal residues would largely mutate orthogonal sets of amino acid residues. Indeed, in PEmax and PE6d, which are derived from M-MLV RT, all substitutions from earlier rational design or laboratory evolution were excluded from redesign based on distance constraints. Similarly, the redesign pipelines for PE6a and PE6c constrained most of the mutations identified during PACE: 7 of 8 for PE6a (Ec48 retron derived) and 15 of 17 for PE6c (Tf1 RT derived; Extended Data Fig. 3b). These observations show that our pipeline for structure-informed redesign produces sequence variants with mutations at sites that largely complement those that arise from laboratory evolution and engineering.

Redesigned RTs improve correction of pathogenic mutations

To broadly validate the performance of redesigned RTs in prime editing, we further selected the top four variants from each redesigned class identified in the Pcsk9 +1 TTAC primary screen and evaluated them at three additional loci representing transition, transversion and deletion edits: RUNX1 +5 G to T, HEK3 +1 T to A and DNMT1 1–15 deletions (Extended Data Fig. 3c). We observed general concordance in rank order of the top-performing variants across targets and decided to test the top two variants from each RT starting point for the ability to correct 700 pathogenic variants from the ClinVar database^54,55. This set represents a wide range of prime edits, including all possible types of single-nucleotide substitutions, insertions and negative controls. We designed a lentiviral cassette pairing each unique epegRNA with an adjacent synthetic target site such that prime editing outcomes could be directly linked to the epegRNA in a manner compatible with pooled evaluation. We transduced Hepa1-6 cells with the epegRNA library at low multiplicity of infection (≤0.3), enriched for transduced cells under puromycin selection for 7 days and transfected cellular pools with LNPs to deliver prime editor mRNAs of the top two redesigned variants for each prime editor RT, in addition to the starting point prime editors (Fig. 3a). We collected genomic DNA 72 h after transfection, performed HTS of the integrated target cassette and quantified editing outcomes for aligned reads (see Methods). We observed a strong correlation between biological replicates for all prime editors (Pearson R = 0.97; Fig. 3c). Editing outcomes revealed a broad distribution of PE activities dependent on the edit, as expected. The resulting dataset comprised 700 unique prime edits that use eight redesigned RTs and four starting RTs, performed in duplicate, in total, representing the outcomes of 16,800 prime editing experiments (Fig. 3b).
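The size of the pooled dataset and the replicate concordance statistic can be reproduced with a short sketch. The `pearson_r` helper and the toy replicate values are illustrative assumptions; the counts (700 edits, eight redesigned plus four starting RTs, duplicate measurements) are taken from the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Experiment count from the stated design of the pooled screen.
N_EDITS = 700
N_EDITORS = 8 + 4      # redesigned RTs + starting RTs
N_REPLICATES = 2
total_experiments = N_EDITS * N_EDITORS * N_REPLICATES
print(total_experiments)

# Toy percent-editing values for two replicates (hypothetical, not study data).
rep1 = [2.0, 10.5, 33.0, 47.0, 61.0]
rep2 = [3.0, 9.0, 35.5, 44.0, 63.5]
print(round(pearson_r(rep1, rep2), 3))
```

A coefficient near 1 indicates that per-edit efficiencies measured in one replicate closely track those in the other, which is the sense in which the text reports R = 0.97.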

**Fig. 3: Self-targeting PE pooled screen for validation of redesigned RT variants.**

For each unique prime edit, we normalized raw editing efficiencies by quantifying the fold improvement of the sequence redesigned RT variants versus the corresponding engineered or evolved starting point. Compared to each of the starting prime editors, average editing efficiency improved for at least one designed RT variant (Fig. 3d and Extended Data Fig. 4). Redesigned RTs substantially improved the compact Ec48-derived prime editor PE6a (D11 and D37 variants, 58 and 30 mutations from the PE6a RT starting point, 1.6- and 2.1-fold higher average editing efficiency, respectively) and the Tf1-based PE6c (D1 and D8, 133 and 117 mutations from the PE6c RT starting point, 1.3- and 1.4-fold higher average editing efficiency, respectively). PE6a and PE6c were heavily mutated in their original PACE campaigns, accumulating 8 and 17 amino acid substitutions, respectively, compared to their wild-type precursors^1,21. Given that most natural proteins have folding energies of only 5–10 kcal mol⁻¹, and mutations accumulated during laboratory evolution often degrade stability⁶⁴, we speculate that these heavily mutated RTs, including the Schizosaccharomyces pombe Tf1 retrotransposon-derived RT that naturally evolved at 32 °C (ref. ⁶⁵), may be particularly amenable to our computational redesign workflow. By contrast, more modest improvements were observed for M-MLV RT-based prime editors PEmax (D1 and D27, 109 and 163 mutations from the PEmax RT starting point, 1.1- and 1.2-fold average improvement) and PE6d (D16 and D34, 70 and 81 mutations from the PE6d RT starting point, 0.9- and 1.4-fold average editing efficiency change). The RTs in PEmax and PE6d contain only five and seven amino acid substitutions, respectively, from natural M-MLV RT, and both notably already contain thermostability-enhancing mutations added to the original PE2 prime editor^1,66. We speculate that, as a result, these M-MLV-derived RTs may offer fewer opportunities to enhance stability than the RTs in PE6a or PE6c.

To validate the results of the pooled experiments, we separately cloned and tested the top-performing RT variants with eight library elements from the pooled ClinVar variant screen via arrayed lentiviral transduction with self-targeting epegRNA cassettes, and we observed good correlation between arrayed and pooled experiments (Pearson R = 0.95; Extended Data Fig. 5a). We also observed substantial improvement in average editing efficiencies for redesigned PE6a-D37 (average 2.5-fold improvement), PE6c-D8 (average 2.3-fold improvement), PE6d-D34 (average 1.4-fold improvement) and PEmax-D27 (average 1.3-fold improvement) compared to their starting points for correction of pathogenic variants in MSH6, KIF7, LDLR, CYP4F22, KCNJ2, MOCS1, TRIOBP and MYH14 (Fig. 3e) associated with Lynch syndrome, acrocallosal syndrome, hypercholesterolemia, lamellar ichthyosis, Andersen–Tawil syndrome, molybdenum cofactor deficiency, autosomal recessive nonsyndromic deafness and autosomal dominant nonsyndromic deafness, respectively. These data demonstrate that computationally redesigned RTs can substantially enhance prime editor performance across hundreds of clinically relevant mutations. Based on their consistent improvement in prime editing performance over the corresponding PE6 or PEmax variants, we designated PE6a-D37 as PE8a, PE6c-D8 as PE8c, PE6d-D34 as PE8d and PEmax-D27 as PE8max.

Next, we compared the performance of the PE8 variants to the recently reported PEmax–La fusion (PE7)¹⁸. We cloned and tested these PE variants as LNP-delivered mRNAs across seven synthetic target sites containing self-targeting pegRNA/epegRNA cassettes genomically integrated from lentiviral transduction in two formats. First, as the original PE7 study¹⁸ reported greater editing efficiency improvement when La fusion was paired with pegRNAs than with epegRNAs, we compared PE8 variants using epegRNAs to PE7 using standard pegRNAs. The PE8 variants in this experiment resulted in an average of 3.5-fold higher editing efficiency for PE8c, 3.0-fold higher editing efficiency for PE8d and 1.3-fold higher editing efficiency for PE8max than for PE7 (Fig. 3f). We observed a consistent trend in two additional settings: plasmid DNA transfection across four endogenous sites in HEK293T cells, with 1.7-fold higher average editing efficiency for PE8c, 1.4-fold higher average editing efficiency for PE8d and 1.4-fold higher average editing efficiency for PE8max than for PE7 (Extended Data Fig. 5b), and LNP-mRNA transfection across three endogenous sites in Huh-7 cells, with 4.2-fold higher average editing efficiency for PE8c and 1.5-fold higher average editing efficiency for PE8max than for PE7 (Extended Data Fig. 5c). The second format in which we compared PE7 and PE8 constructs used epegRNAs for both editors. We observed similar trends, with 2.8-, 2.5- and 1.9-fold higher average editing efficiencies for PE8c, PE8d and PE8max using epegRNAs, respectively, than for PE7 using epegRNAs (Extended Data Fig. 5d). Although PE8 variants tested at hundreds of sites, on average, improved prime editing over PE6 variants and PE7, certain sites showed no improvement from PE8 (Figs. 3d,f and Extended Data Fig. 5c). Such site-dependent effects may reflect target-specific determinants, including specific pegRNA–RT interactions²¹.

We then assessed whether the redesigned PE8 variants could synergize with previous PE7 engineering efforts by fusing PE8 variants to the RNA-binding, N-terminal domain of La. As before, we evaluated PE8 and PE8–La constructs under two comparison formats. First, we compared PE8 variants using state-of-the-art epegRNA tevo2.0 motifs¹⁷ to PE8–La constructs using standard pegRNAs (Extended Data Fig. 5e). We also directly compared PE8 and PE8–La variants using epegRNA motifs for both (Extended Data Fig. 5f). These experiments showed that PE8–La, whether using pegRNAs or epegRNAs, did not further enhance editing efficiencies beyond those of PE8 alone using epegRNAs. We reason that the latest epegRNA designs likely enhance intended editing efficiencies through mechanisms similar to those provided by protein fusions such as La. Together, these results suggest that PE8 variants offer higher average editing efficiencies than La-fused PEmax (PE7).

Redesign improves expression level and thermostability of RTs

The redesign process extensively changed RT sequences among PE8 variants compared to their starting points. PE8a, PE8c, PE8d and PE8max showed 7.4, 23, 16 and 24.0% sequence divergence from their starting points, corresponding to 30, 117, 81 and 163 amino acid changes (Fig. 4a). By contrast, our previous directed evolution and engineering campaigns resulted in a maximum of 17 amino acid substitutions in any RT domain²¹.
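Percent sequence divergence of this kind can be computed as the number of substituted positions divided by the sequence length. The sketch below assumes the parent and redesigned sequences have equal length and align position by position without gaps, which the text does not state explicitly; the function names and toy sequences are hypothetical.

```python
def substitutions(parent, variant):
    """List (1-based position, parent residue, variant residue) for each mismatch.

    Assumes an ungapped, equal-length comparison (an assumption of this sketch).
    """
    if len(parent) != len(variant):
        raise ValueError("sequences must have equal length for this comparison")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(parent, variant))
        if a != b
    ]


def percent_divergence(parent, variant):
    """Substituted positions as a percentage of the parent sequence length."""
    return 100.0 * len(substitutions(parent, variant)) / len(parent)


# Toy eight-residue sequences (hypothetical, not RT sequences from the study).
toy_parent = "MKTLLDAW"
toy_variant = "MKSLLEAW"
print(substitutions(toy_parent, toy_variant))
print(percent_divergence(toy_parent, toy_variant))
```

Applied to full RT domains, the same ratio relates the reported substitution counts to the reported percentage divergence for each PE8 variant.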

**Fig. 4: Characterization of stabilized RT variants.**

To assess whether these PE8 variants displayed improved folding and stability properties, remedying deficiencies reflected in impaired soluble expression levels of PE6 and PEmax starting points (Fig. 1a), we analyzed soluble protein production levels in both mammalian and bacterial expression systems. In mammalian cells, anti-Cas9 ELISA performed on lysates from Hepa1-6 cells transfected with LNP-delivered mRNA revealed that peak protein levels of redesigned PE8a, PE8c, PE8d and PE8max variants at 8 h after transfection increased by 1.2‑, 2.3‑, 2.0‑ and 2.1‑fold, respectively, compared to their parental counterparts (Fig. 4b). The editing kinetic profiles following LNP-delivered mRNA were consistent with the ELISA measurements showing near-complete loss of editor protein by 48 h (Extended Data Fig. 6b).

Purification of the standalone RT domains from bacterial lysates revealed a decrease in soluble protein yield for evolved and engineered RTs compared to their parental wild-type enzymes (Fig. 4c and Extended Data Fig. 6a). Notably, PEmax ΔRNase H RT exhibited higher expression in bacterial cells than the evolved version PE6d RT; however, RNase H truncation has been reported to reduce enzyme processivity, whereas the evolved PE6d RT restored processivity, highlighting the tradeoff between enzyme activity and solubility⁶⁷. By contrast, purification of the redesigned RTs restored soluble protein yield by 2.3-fold and 1.1-fold from PE6c and PE6d, respectively, suggesting that the improved expression levels of the full-length prime editors are attributable to enhanced soluble expression of the redesigned RTs (Fig. 4c and Extended Data Fig. 6a). Because standalone PE8d RT showed only a modest increase in soluble yield during bacterial purification, we performed an orthogonal cell-free expression assay to evaluate the full-length proteins. In this system, full-length PE8d exhibited a consistent 1.3-fold increase in yield compared to PE6d at 2-, 4- and 5-h time points under 37 °C incubation (Extended Data Fig. 6c). Collectively, these observations show that sequence redesigned RTs possess improved soluble protein production properties, independent of codon usage contexts, across mammalian and bacterial systems.

Next, we measured the thermal stability of PE8 variants by differential scanning fluorimetry (DSF) and compared them to the engineered and evolved PEmax and PE6 starting points. Notably, the redesigned PE8c and PE8max variants showed melting temperature (T_m) increases of 8 °C and 2 °C, respectively, compared to their PE6c and PEmax starting points (Fig. 4d). Although the PE8d RT did not show improved thermodynamic T_m values compared to the PE6d RT, we sought to determine if its functional stability was enhanced (Fig. 4d). To this end, we performed a temperature-dependent reverse transcription activity assay using a pool of RNA templates of varying lengths and quantified the resulting cDNA products (Extended Data Fig. 6d). Although PE8d RT and PE6d RT showed comparable reverse transcription activity at 37 °C, PE8d RT retained higher activity at 57 °C at both 10- and 60-min time points (Extended Data Fig. 6d). Furthermore, when compared to PEmax ΔRNase H RT, computationally redesigned PE8d RT successfully restored high-temperature functional robustness that was lost during the evolution of PE6d RT (Extended Data Fig. 6d). These results confirm that computational sequence redesign can improve prime editor RT stability and expression levels compared to the previously reported state-of-the-art variants. Further mechanistic studies will be required to deconvolute the relative contribution of these properties toward enhancing prime editing outcomes.

Prime editors with redesigned RTs show improved efficacy in therapeutic applications

Many therapeutic applications of prime editors in animal models or humans occur under transient, limited-dose conditions in which prime editor performance is constrained by delivery efficiency and expression bottlenecks. We hypothesized that enhancing the foldability and thermal stability of prime editors would improve editing outcomes under these conditions relevant for clinical applications. To test the performance of PE8 prime editor variants under these conditions, we evaluated prime editing efficacy with transient mRNA and RNP delivery modalities, including electroporation, eVLPs and LNPs.

Across multiple genomic targets, ProteinMPNN-redesigned PE8 variants consistently outperformed their parental counterparts in a diverse array of primary human cell types. In ex vivo mRNA electroporation, PE8c and PE8max improved editing of the RECQL3 locus in human-derived fibroblasts, increasing correction of the 6-base-pair deletion at nucleotide 2281 that causes Bloom syndrome by 1.3-fold and 1.1-fold compared to PE6c and PEmax, respectively (Fig. 5a). In mRNA-electroporated human CD34⁺ HSPCs, PE8c targeting HBB achieved a 1.6-fold improvement in editing efficiency over PE6c to install a PAM-disrupting +5 G-to-A silent edit associated with sickle cell disease, surpassing the efficiency of extensively optimized previous conditions for installing this corrective edit with PEmax (Fig. 5b and Extended Data Fig. 7c). Importantly, we previously observed 40% editing efficiency at this site with 2 µg of PEmax mRNA⁵, whereas here we achieved 50% editing efficiency with a lower dose (1 µg) of PE8c mRNA, without an increase in indels, suggesting higher on-target activity of PE8c. In activated primary human T cells, PE8d yielded enhanced editing efficiencies at the IL2RB locus relative to PE6d to install an orthogonalizing edit for improving the safety of adoptive T cell therapy, showing a 1.2-fold increase at the high dose (1 µg of mRNA) and a 1.6-fold increase at the low dose (0.5 µg of mRNA; Fig. 5c and Extended Data Fig. 7d). We also observed comparable or enhanced editing efficiencies for PE8 variants in human-derived fibroblasts to correct mutations for Crigler–Najjar syndrome and Tay-Sachs disease (Extended Data Fig. 7a,b). Collectively, these data demonstrate that PE8 variants mediate enhanced therapeutically relevant editing outcomes, even for prime edits extensively optimized in previous studies.

**Fig. 5: PE8 variants improve ex vivo and in vivo prime editing across multiple delivery methods.**

eVLPs offer a transient, modular modality for the delivery of genome editing agents as RNP complexes based on a viral scaffold^13,68,69. In eVLP-mediated prime editor delivery, redesigned PE8 variants yielded substantial gains in prime editing efficiency compared to the corresponding PE6 or PEmax starting points. At the highest tested dose and across all sites that we evaluated, PE8c PE-eVLPs yielded higher editing efficiencies than PE6c PE-eVLPs, specifically at Dnmt1 (78% with PE8c and 50% with PE6c, 1.6-fold improvement) and Col12a1 (18% with PE8c and 6.2% with PE6c, 2.9-fold improvement) loci in Neuro-2A cells and at FANCF (21% with PE8c and 11% with PE6c, 1.8-fold improvement) and HEK293T site 3 (HEK3; 28% with PE8c and 18% with PE6c, 1.6-fold improvement) in HEK293T cells (Fig. 5d). We attribute the superior performance of PE8c compared to PE6c, in part, to a redesigned Tf1 RT that can maintain proper folding and efficient activity at 37 °C during the 48-h particle production process, whereas wild‑type Tf1 RT from S. pombe exhibits an activity optimum at 32 °C (ref. ⁶⁵). PE8c also outperformed the prior highest-efficiency PE-eVLP editing, which used PEmax, for three of four edits that we tested (78% versus 58% at the Dnmt1 locus, 1.4-fold improvement; 18% versus 8% at the Col12a1 locus, 2.3-fold improvement; 21% versus 15% at the FANCF locus, 1.4-fold improvement; Extended Data Fig. 8). These outcomes demonstrate that PE8 variants can also improve prime editing efficiencies under transient RNP-based delivery conditions such as eVLPs.

We further evaluated the performance of PE8 prime editors in vivo in mice. We selected Pcsk9 as a clinically relevant target for familial hypercholesterolemia⁷⁰. We separately formulated prime editor mRNA and accompanying epegRNA and ngRNA into LNPs using an in-house formulation based on the ionizable OF-02 lipid^53,63. The synthetic epegRNA was designed to install the +1 TTAC insertion in Pcsk9 that we previously used to screen designs in vitro. We admixed LNPs immediately before injection and delivered the admixture to adult C57BL/6 mice via retro-orbital injection at a modest total RNA dose of 1 mg per kg (body weight) to maximize differences among prime editors by avoiding high levels of bulk liver editing in which the editing agents are no longer efficiency limiting. We collected liver tissues 1 week after injection for genomic DNA extraction and bulk HTS (Fig. 5e). Redesigned PE8c (14% average bulk liver editing efficiency) and PE8d (10% editing) improved average in vivo editing efficiency by 1.4-fold and 2.9-fold, respectively, compared to the original PE6c (10% editing) and PE6d (3.5% editing) variants (Fig. 5f). Similarly, PE8max achieved a mean in vivo editing efficiency of 4.8%, a 2.5-fold improvement over PEmax (1.9%; Fig. 5f).

In the same liver tissue samples, we also characterized the precision of the PE8 prime editors. To assess potential changes in product purity at the Pcsk9 locus independent of editor activity, we quantified the edit:indel ratio and observed that PE8 editors did not produce any substantial changes to this product distribution compared to their PE6 or PEmax predecessors (Extended Data Fig. 9a,b). We also evaluated off-target editing at the top 14 CIRCLE-seq-nominated sites⁶⁹ and found that no detectable indels or substitutions were present at these candidate off-target loci at levels above background following editor treatment (Extended Data Fig. 9c,d), suggesting that PE8 editors can enhance editing efficiency without sacrificing product purity or target DNA specificity compared to their parental editors.

To determine the relationship between in vivo activity and expression, we formulated prime editor mRNAs encoding the parental and redesigned PE8 variants into LNPs, which we administered to adult C57BL/6 mice via retro-orbital injection (0.33 mg per kg (body weight)). This dose is equivalent to that of the isolated prime editor mRNA component used for our in vivo editing studies, which also include epegRNA and ngRNA LNPs (1.0 mg per kg (body weight) total RNA). Bulk liver samples were collected 4 h after injection, corresponding to the peak protein expression time point for this lipid formulation⁶³. We found that redesigned variants PE8c, PE8d and PE8max exhibited 1.7-, 2.4- and 1.8-fold higher peak protein levels in vivo, respectively, than their parental counterparts (Fig. 5g). These data demonstrate that the improved in vivo activity of the redesigned PE8 variants is accompanied by increased protein expression following mRNA-LNP delivery.

Finally, we compared the in vivo performance of a PE8 editor to the PE7 editor (PEmax–La) in mice. We selected PE8c for this comparison because it supported the highest editing efficiency among the PE8 variants for installing the +1 TTAC insertion at the Pcsk9 site (Fig. 5f). We dosed adult C57BL/6 mice with 2 mg per kg (body weight) total RNA-LNP and collected bulk liver 1 week after injection for HTS analysis. PE8c achieved 44% average bulk liver editing, a 1.8-fold improvement over PE7 (25% average bulk liver editing; Fig. 5h), consistent with our previous observation of higher editing efficiency of PE8c than PE7 in cultured cells (Extended Data Fig. 5).

Together, these results demonstrate that computational redesign of previously evolved and engineered RT domains in prime editors enhances their editing efficiency across a broad range of therapeutic contexts (including ex vivo mRNA electroporation, eVLP-based RNP delivery and in vivo LNP delivery) without apparent erosion of DNA specificity (via off-target editing) or product purity (via indel formation).

Discussion

Protein engineering and laboratory evolution have yielded prime editors with enhanced activity at the expense of reduced stability and expression, limiting their therapeutic potential. Here, we used ProteinMPNN to redesign the RT domains of these engineered and evolved prime editors to improve stability and intracellular expression properties, yielding variants with up to 163 amino acid changes from input sequences. Redesigned prime editors based on Ec48, Tf1 and M-MLV RTs resulted in improved efficiencies for the correction of hundreds of therapeutically relevant genetic mutations. We also demonstrated enhanced performance across several therapeutically relevant delivery systems, including mRNA electroporation for ex vivo cell therapy in primary T cells and HSPCs and eVLP-mediated RNP delivery for correction of pathogenic alleles. These improvements in cultured cells were further magnified in vivo, where delivery of PE-LNPs in adult mice resulted in 1.4- to 2.9-fold improvements in prime editing efficiency in bulk liver when using the redesigned PE8 variants compared to their respective starting points. These improvements expand the capabilities of prime editing, especially for in vivo or therapeutic applications.

Our recommendations for selecting a PE8 variant depend on editor size requirements and delivery context. We provide an operational decision schematic in Fig. 5i. If payload size is restricted, PE8a should be considered due to its shorter coding sequence (5.6 kilobases). Although PE8c and PE8d have larger RT domains, their improved potency and expression in vivo suggest that they should be prioritized to maximize activity per dosed molecule. This recommendation is consistent with our findings that the redesign process retained enhanced catalytic properties of parental PE6c and PE6d variants²¹ while increasing expression level in vivo (Fig. 5g). Notably, PE8c should be considered for eVLP-based delivery given its consistently superior performance (Fig. 5d). To further improve editing efficiencies, PE8c, PE8d and PE8max should all be evaluated, as editing outcomes may depend on context-specific interactions between the pegRNA and RT domains²¹. Further screening of the full panel of ProteinMPNN designs (Supplementary Table 1) may enhance context-dependent and target-specific editing outcomes. If only a single editor can be tested due to time or resource constraints, we recommend using PE8c as the starting point for all standard prime editing workflows in both research and therapeutic applications and testing PE8d, PE8a or PE8max when specific editing or size characteristics are preferred (Fig. 5i). Finally, because PE8 variants were primarily developed using transient mRNA-LNP delivery, we anticipate that their benefits will be most robust in transient contexts; performance in alternative modalities, such as plasmid delivery, should be evaluated independently.

The continued expansion of therapeutic genome editing in preclinical and clinical studies benefits from the development of improved prime editing systems that maximize editor efficacy while minimizing the lowest effective dose. Because LNPs represent a clinically validated transient delivery modality for therapeutic genome editors, we focused our efforts toward identifying enhanced redesigned RT variants that preserved or augmented prime editing functionality directly in the mRNA-LNP context. We found that the selected variants also provided enhanced outcomes for alternative experimental formats, including plasmid transfection, RNA electroporation and eVLP transduction. Future studies that apply screening approaches based on orthogonal delivery modalities may identify variants capable of addressing additional bottlenecks, such as the packaging or release of prime editors in eVLPs. Additional mechanistic experiments to distinguish the contributions of protein expression level and functional stability may also reveal which ProteinMPNN design variants are optimal in different therapeutic settings.

We focused our redesign efforts on the RT component of the prime editor, which we previously engineered and evolved extensively in our laboratory²¹ and which we therefore expected to benefit most from stabilization upon redesign. The modularity of enhancements to the prime editor, observed in previous studies that have modified both the protein and RNA components of the system, suggests that our redesigned RT scaffolds may support compatibility with recent expansions to the prime editing toolbox, including modifications to the Cas domain (vPE, PE6e-g and PAM-flexible variants)^21,51,71, the pegRNA component (epegRNAs, ngRNAs and packaging aptamers)^17,19,68, alternative synthetic fusions (La, Exo-PE and rPE)^18,44,72 and the RT catalytic core (PEmax**)²⁴. More extensive characterization for each of these combinations may be required, particularly with challenging targets for which improvements to the editor protein alone may be insufficient and optimal performance may depend on the specific pegRNA configuration and complementary toolkit components used. Nevertheless, our work demonstrates that the PE8 suite of prime editors, on average, offers state-of-the-art efficiencies that exceed those of PE6 and PE7 variants.

The consistent performance improvements across various delivery systems demonstrate that AI-guided redesign can ameliorate previous efficiency constraints of engineered and evolved prime editors. Notably, despite the complexity of the prime editing mechanism, the many precise conformational requirements for successful RT-catalyzed DNA polymerization and the high degree of change in the resulting RTs, redesign of functional prime editor variants was remarkably successful: 95% of initial redesigns supported editing efficiencies above 5%. This study thus provides a proof of concept for a generalizable framework to enhance variants emerging from prime editor-directed evolution. More broadly, this work demonstrates the power of protein deep learning models to traverse expansive regions in the mutational fitness landscape while preserving enzymatic activity outcomes.

Methods

General molecular biology and cloning

For all cloning experiments, gene fragments without adapters were synthesized by Twist Bioscience, and primers were ordered from Integrated DNA Technologies. Plasmid backbones were amplified by PCR using either Phusion U Green Multiplex PCR Master Mix (Thermo Fisher Scientific, F562) or Q5 Hot Start High-Fidelity 2× Master Mix (New England Biolabs, M0494). All plasmids, including IVT prime editor expression plasmids, epegRNA expression plasmids, eVLP plasmids and self-targeting lentiviral elements, were constructed by Gibson assembly with NEBuilder HiFi DNA Assembly Master Mix (New England Biolabs, E2621). For self-targeting lentiviral elements, plasmid pools were transformed into NEB 5-α competent cells (New England Biolabs, C2987), whereas all other plasmids were transformed into One Shot Mach1 cells (Invitrogen, C862003). Plasmids were purified using Qiagen Plasmid Plus 96 Miniprep kits (16181), Qiagen Plasmid Plus Midi kits (12943) or Qiagen Plasmid Plus Maxi kits (12963).

Cell culture

Hepa1-6 (CRL-1830), HEK293T (CRL-3216) and Neuro-2A (CCL-131) cell lines were purchased from ATCC, and HuH-7 cells were purchased from Cytion (300156). Gesicle 293T cells were purchased from Takara (632617). Hepa1-6, HEK293T, Neuro-2A and Gesicle 293T cells were grown in DMEM plus GlutaMAX (Thermo Fisher Scientific) supplemented with 10% (vol/vol) fetal bovine serum (FBS). The following media were used for each human-derived fibroblast line: EMEM (ATCC) supplemented with 10% (vol/vol) FBS for Tay-Sachs disease (GM00221) and EMEM (ATCC) supplemented with 15% (vol/vol) FBS for both Crigler–Najjar syndrome (GM09551) and Bloom syndrome (GM02085). For granulocyte colony-stimulating factor-mobilized human CD34⁺ HSPCs from unidentified healthy adult donors (Fred Hutchinson Cancer Center), cells were cultured in X-VIVO 10 medium (Lonza, 04-380Q) supplemented with 100 ng ml⁻¹ human stem cell factor (Peprotech, 300-07), 100 ng ml⁻¹ human thrombopoietin (Peprotech, 300-18) and 100 ng ml⁻¹ human FLT3 ligand (Peprotech, 300-19) at a density of 1 × 10⁶–2 × 10⁶ cells per ml. For peripheral blood mononuclear cells, cells from two independent anonymous healthy human donors were purchased from StemCell Technologies. Primary human lymphocytes were maintained in RPMI-1640 supplemented with GlutaMAX (Thermo Fisher Scientific, 61870036), 1% penicillin and streptomycin (Thermo Fisher Scientific, 15140122), 1× nonessential amino acids (Thermo Fisher Scientific, 11140050) and 100 IU recombinant human IL-2 (Peprotech, 200-02). Cell lines were maintained at 37 °C with 5% CO₂ and authenticated by their suppliers and were verified to be mycoplasma negative during the study.

ProteinMPNN sequence design

Procedures for sequence design with all associated code are available on GitHub (https://github.com/Allentaoyz/Redesigned_prime_editor_RTs). Briefly, AlphaFold3-predicted structures of each parental RT were used as input for ProteinMPNN. Residues within defined distances from the substrate and residues with high evolutionary conservation were constrained from redesign. Sequences were generated in groups based on the applied constraint cutoffs. Substrate distance thresholds of 15, 18 and 20 Å were measured in PyMOL. Residue conservation was calculated as amino acid frequency in a multiple sequence alignment of homologs identified by searching UniRef50 with the parental RT sequence as the query. Conservation cutoffs of 25% and 50% were applied for all RTs; residues meeting or exceeding the cutoff and representing the plurality amino acid at that position were constrained from redesign. During sequence generation, sampling temperatures of 0.1 and 0.3 were tested. Cysteine residues were excluded from redesign to prevent oxidation. Predicted structures of the designed sequences were generated using AlphaFold2 and evaluated by calculating pLDDT scores and r.m.s.d. values relative to the input structures in PyMOL.
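The constraint selection described above can be illustrated with a minimal Python sketch. This is not the authors' code (which is available in the GitHub repository cited above); the function names, the toy alignment and the toy coordinates are hypothetical. The sketch assumes that gaps are ignored when computing column frequencies, that ties count as plurality and that a residue is 'near' the substrate if any of its atoms lies within the distance threshold.

```python
import math
from collections import Counter


def conserved_positions(msa, cutoff, query_index=0):
    """Return 1-based query positions whose residue is a plurality residue of
    its alignment column and occurs at a frequency >= cutoff (gaps ignored)."""
    query = msa[query_index]
    fixed = set()
    position = 0
    for column, residue in enumerate(query):
        if residue == "-":
            continue  # gap in the query: no query position at this column
        position += 1
        counts = Counter(seq[column] for seq in msa if seq[column] != "-")
        frequency = counts[residue] / sum(counts.values())
        is_plurality = counts[residue] == max(counts.values())  # ties count (assumption)
        if is_plurality and frequency >= cutoff:
            fixed.add(position)
    return fixed


def positions_near_substrate(residue_atoms, substrate_atoms, threshold):
    """Return positions with any atom within `threshold` angstroms of any substrate atom."""
    return {
        position
        for position, atoms in residue_atoms.items()
        if any(math.dist(a, s) <= threshold for a in atoms for s in substrate_atoms)
    }


# Toy inputs (hypothetical): the first MSA row is the parental query sequence.
toy_msa = ["MKVLA", "MRILA", "MHTLG", "MQT-S"]
toy_atoms = {
    1: [(0.0, 0.0, 0.0)],
    2: [(10.0, 0.0, 0.0)],
    3: [(30.0, 0.0, 0.0)],
    4: [(40.0, 0.0, 0.0)],
    5: [(50.0, 0.0, 0.0)],
}
toy_substrate = [(0.0, 0.0, 5.0)]

conserved = conserved_positions(toy_msa, cutoff=0.5)
near = positions_near_substrate(toy_atoms, toy_substrate, threshold=15.0)
fixed = conserved | near  # positions constrained from redesign
designable = set(range(1, len(toy_msa[0]) + 1)) - fixed
```

In this toy case, positions 1, 4 and 5 are fixed by conservation and positions 1 and 2 by proximity, leaving only position 3 designable. Repeating the calculation for each combination of distance threshold (15, 18 or 20 Å) and conservation cutoff (25% or 50%) would give one constraint set per design group.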

Self-targeting lentiviral pool screen

For library design, we selected n = 700 edits from the ClinVar library⁵⁵ with previously validated baseline prime editing activity. Negative-control edits (n = 50) were included using the Staphylococcus aureus scaffold sequence or the scramble spacer sequence. For each selected edit, epegRNAs were generated and appended with a variable-length GC-balanced linker and a reverse-complemented target site (49 nucleotides) flanked by two halves of a unique 16-nucleotide barcode. The addition of 23-nucleotide homology arms on each side resulted in a uniform library length of 300 nucleotides. Final designs are listed in Supplementary Table 2.

All high-throughput screening libraries were synthesized as single-stranded oligonucleotide pools by Twist Bioscience and constructed as previously described¹⁷. For lentivirus production for self-targeting constructs, HEK293T cells were seeded in six-well plates at a density of 1 × 10⁶ cells per well. Sixteen hours after seeding, cells were transfected with Lipofectamine 2000 (12 µl; Invitrogen, 11668019), according to the manufacturer’s instructions, using a mixture of transfer plasmid (1,333 ng), pCMV-dR8.2 dvpr (1,000 ng; Addgene, 8455) and pMD2.G (667 ng; Addgene, 12259) in 250 µl of Opti-MEM I Reduced Serum medium (Gibco, 31985070). Viral supernatant was collected 48 h after transfection, centrifuged at 500g for 5 min to remove debris, filtered through a 0.45-µm polyvinylidene difluoride filter (MilliporeSigma, SLHVM33RS) and directly used without further concentration.

Cells were transduced at a target multiplicity of infection of 0.3 with ≥1,000× coverage of transduced cells. For Hepa1-6 cell transduction, spinfection was performed in six-well plates with 1 × 10⁶ cells in medium supplemented with lentiviral supernatant and 8 µg ml⁻¹ polybrene (MilliporeSigma, TR-1003-G) and centrifugation at 900g for 90 min at 37 °C. After spinfection, medium was replaced with fresh culture medium, and cells were passaged into T75 flasks 24 h later. Puromycin selection (1 µg ml⁻¹; InvivoGen, ant-pr-1) was initiated 48 h after transduction, and cells were cultured in selective medium for 7 days until >95% of viable cells were BFP⁺, as measured by a CytoFLEX S Flow Cytometer (Beckman Coulter).
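As a rough illustration of how the multiplicity of infection and coverage targets translate into cell numbers, the following Python sketch assumes Poisson-distributed integrations and a library of 750 elements (700 edits plus 50 negative controls). The function names are hypothetical, and the calculation is an estimate rather than the procedure used in this study.

```python
import math


def cells_to_transduce(library_size, coverage, moi):
    """Cells to expose to virus so that the expected number of transduced
    cells is at least coverage x library_size (Poisson assumption)."""
    fraction_transduced = 1.0 - math.exp(-moi)  # P(at least one integration)
    return math.ceil(library_size * coverage / fraction_transduced)


def single_copy_fraction(moi):
    """Fraction of transduced cells expected to carry exactly one integration."""
    return moi * math.exp(-moi) / (1.0 - math.exp(-moi))


cells_needed = cells_to_transduce(library_size=750, coverage=1000, moi=0.3)
single_copy = single_copy_fraction(0.3)
```

Under these assumptions, roughly 2.9 million cells would need to be exposed to virus, and about 86% of transduced cells would be expected to carry a single library element.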

IVT of PE mRNA

IVT of PE mRNA was performed as previously described⁷³. Editors were cloned into pT7 expression constructs (Addgene, 178113). To generate linear DNA templates for IVT, the pT7 editor plasmids were amplified by PCR using Q5 Hot Start High-Fidelity 2× Master Mix (New England Biolabs, M0494) with primers IVT-fwd and IVT-rev (Supplementary Table 3) and the following cycling conditions: 98 °C for 2 min; 35 cycles of 98 °C for 15 s, 68.8 °C for 30 s and 72 °C for 3 min and 30 s and a final extension at 72 °C for 5 min. PCR products were purified using a QIAquick PCR purification kit (Qiagen, 28104). IVT reactions were performed with a HiScribe T7 High Yield RNA synthesis kit (New England Biolabs, E2040) according to the manufacturer’s optional protocol, substituting UTP entirely with N¹-methylpseudouridine-5′-triphosphate (TriLink, N-1081) and incorporating co-transcriptional capping using CleanCap Reagent AG 3′ OMe (TriLink, N-7413). Reactions were incubated at 37 °C for 2 h, followed by DNase I (RNase free) treatment and RNA purification using a Monarch Spin RNA Cleanup kit (New England Biolabs, T2050).

Chemically synthesized guide RNA

Chemically synthesized pegRNAs used for in vitro experiments were ordered from Integrated DNA Technologies (desalted), and those used for in vivo experiments were ordered from GenScript (high-performance liquid chromatography purified). Both contained 2′-O-methyl modifications and 3′-phosphorothioate linkages at the first and last three nucleotides, unless stated otherwise. Chemically synthesized ngRNAs were ordered from Synthego and contained 2′-O-methyl modifications at the first and last three nucleotides and 3′-phosphorothioate linkages between the first three and last two nucleotides.

LNP production

LNPs were formulated by mixing an aqueous phase containing the RNA with an ethanol phase containing the lipids. The ethanol phase was prepared by solubilizing a mixture of OF-02 ionizable lipid (WuXi AppTec), 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE; Avanti Research, 850725), cholesterol (Sigma-Aldrich, C8667) and 1,2-dimyristoyl-sn-glycero-3-phosphoethanolamine-N-[methoxy(polyethylene glycol)-2000] (ammonium salt; C14-PEG2000; Avanti Research, 880150) at a molar ratio of OF-02:DOPE:cholesterol:C14-PEG2000 = 35:16:46.5:2.5 and ionizable lipid:RNA weight ratio of 10:1. The aqueous phase was prepared in a 10 mM citrate buffer containing the corresponding mRNA, pegRNA or ngRNA. To produce mRNA LNPs for arrayed transfection in vitro, the aqueous and ethanol phases were mixed at a 3:1 volume ratio by pipetting, diluted fivefold with 1× PBS and used for transfections. To produce all pegRNA and ngRNA LNPs as well as mRNA LNPs for lentivirus screening and in vivo experiments, the aqueous and ethanol phases were mixed in a microfluidic device at a 3:1 ratio by syringe pumps to a final RNA concentration of 0.15 mg ml⁻¹. The resultant formulation was dialyzed against 1× PBS overnight in a 20-kDa molecular weight cutoff dialysis cassette (Thermo Fisher Scientific, 66003) at 4 °C. For in vivo experiments, LNPs were concentrated following dialysis at 4 °C with a 100-kDa molecular weight cutoff Amicon Ultra Centrifugal Filter (Millipore, UFC210024).
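The mixing arithmetic implied by these ratios can be sketched in Python as follows. The sketch assumes that the 0.15 mg ml⁻¹ RNA concentration refers to the combined aqueous and ethanol volume immediately after mixing (before dialysis); the function and variable names are hypothetical. Converting the molar ratio into lipid masses would additionally require the molecular weight of each lipid, which is not given here and is therefore left out.

```python
# Molar ratio of the lipid components as stated in the text.
MOLAR_RATIO = {"OF-02": 35.0, "DOPE": 16.0, "cholesterol": 46.5, "C14-PEG2000": 2.5}


def mole_fractions(ratio):
    """Normalize a molar ratio to mole fractions that sum to 1."""
    total = sum(ratio.values())
    return {lipid: parts / total for lipid, parts in ratio.items()}


def lnp_mix_plan(total_volume_ml, final_rna_mg_per_ml=0.15,
                 aqueous_to_ethanol=3.0, ionizable_to_rna_wt=10.0):
    """Phase volumes and stock concentrations for one mixing run.

    Assumes final_rna_mg_per_ml is the RNA concentration in the combined
    volume right after mixing (an assumption, not stated in the text)."""
    ethanol_ml = total_volume_ml / (aqueous_to_ethanol + 1.0)
    aqueous_ml = total_volume_ml - ethanol_ml
    rna_mg = final_rna_mg_per_ml * total_volume_ml
    ionizable_mg = ionizable_to_rna_wt * rna_mg
    return {
        "aqueous_ml": aqueous_ml,
        "ethanol_ml": ethanol_ml,
        "rna_mg": rna_mg,
        "rna_mg_per_ml_in_aqueous": rna_mg / aqueous_ml,
        "ionizable_lipid_mg": ionizable_mg,
        "ionizable_lipid_mg_per_ml_in_ethanol": ionizable_mg / ethanol_ml,
    }


plan = lnp_mix_plan(total_volume_ml=1.0)
fractions = mole_fractions(MOLAR_RATIO)
```

For a 1-ml mixing run under these assumptions, the aqueous phase would be 0.75 ml of RNA at about 0.2 mg ml⁻¹ and the ethanol phase would be 0.25 ml containing about 1.5 mg of ionizable lipid.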

LNP characterization

Total RNA concentrations and particle size of LNP formulations were measured by UV-Vis spectrophotometry and dynamic light scattering, respectively, on a Stunner (Unchained Labs) using the ‘RNA-LNP Screen’ application with default settings in Stunner Client software (version 9.1.0.143). Samples were measured in duplicate with water blanking. Data analysis was performed with Stunner Analysis software (version 9.1.0.157).

In vitro transfection of LNPs

For arrayed screens, Hepa1-6 cells were seeded in 96-well plates (10,000 cells per well) in 100 µl of medium 24 h before transfection. Unless otherwise specified, PE-LNPs containing 25 ng of total RNA were incubated with recombinant apolipoprotein E3 (ApoE3; 6 µg ml⁻¹; R&D Systems, 4144-AE) for 10 min at 37 °C, and 20 µl of the PE-LNP/ApoE3 mixture was added to each well to achieve a final ApoE3 concentration of 1 µg ml⁻¹. Genomic DNA was collected 72 h after transfection. Unless otherwise specified, cells were lysed in 50 µl of lysis buffer (10 mM Tris-HCl (pH 8.0), 0.05% SDS and 25 µg ml⁻¹ proteinase K) by incubation at 37 °C for 1 h, followed by 55 °C for 30 min. Crude lysates were used directly as input for HTS library preparation.

For pooled screens, Hepa1-6 cells were seeded in six-well plates (1 × 10⁶ cells per well) in 1 ml of medium 24 h before transfection. PE-LNPs containing 180 ng of total RNA were incubated with recombinant ApoE3 (4.33 µg ml⁻¹) for 10 min at 37 °C, and 300 µl of the PE-LNP/ApoE3 mixture was added to each well to achieve a final ApoE3 concentration of 1 µg ml⁻¹. Genomic DNA was extracted 72 h after transfection using a QIAamp DNA Mini kit (Qiagen, 51304).

For arrayed validation, Hepa1-6 or Huh-7 cells were seeded in 48-well plates (35,000 cells per well) in 300 µl of medium 24 h before transfection. PE-LNPs containing 15 ng (Hepa1-6 transfections) or 100 ng (Huh-7 transfections) of total RNA were preincubated with recombinant ApoE3 (7 µg ml⁻¹) under identical conditions, and 50 µl of the mixture was added per well to yield a final ApoE3 concentration of 1 µg ml⁻¹.
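The ApoE3 concentrations used in the three transfection formats follow from a simple dilution calculation, sketched below in Python. The sketch assumes that the stated ApoE3 concentration is that of the PE-LNP/ApoE3 mixture before it is added to the well; the function name and dictionary labels are hypothetical.

```python
def final_concentration(mix_ug_per_ml, added_ul, medium_ul):
    """Concentration after adding `added_ul` of a mixture to `medium_ul` of medium."""
    return mix_ug_per_ml * added_ul / (added_ul + medium_ul)


# (ApoE3 in mixture in ug/ml, volume added in ul, medium already in well in ul)
formats = {
    "arrayed screen, 96-well": (6.0, 20.0, 100.0),
    "pooled screen, six-well": (4.33, 300.0, 1000.0),
    "arrayed validation, 48-well": (7.0, 50.0, 300.0),
}

final_apoe3 = {
    name: final_concentration(*values) for name, values in formats.items()
}
```

Each format works out to approximately 1 µg ml⁻¹ of ApoE3 in the well, matching the stated target.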

Electroporation of RNA in primary human fibroblasts, human T cells and HSPCs

Electroporation of primary human fibroblasts

Cells (200,000 per reaction) were dissociated with TrypLE Express Enzyme (Gibco, 12605036), washed once with 1 ml of PBS and resuspended in 20 µl of complete SE nucleofector solution (Lonza, V4XC-1032) supplemented with either 1 µg of prime editor mRNA, 90 pmol of epegRNA and 60 pmol of ngRNA (Bloom syndrome and Crigler–Najjar syndrome fibroblasts) or 0.5 µg of prime editor mRNA, 45 pmol of epegRNA and 30 pmol of ngRNA (Tay-Sachs disease fibroblasts). The cell–RNA mixture was electroporated using program CM-130 with a 4D-Nucleofector device (Lonza, AAF-1003X). Following electroporation, cells were suspended in 80 µl of prewarmed growth medium, transferred into 1 ml of prewarmed culture medium in 24-well plates and incubated for 72 h.

Electroporation of primary human T cells

Peripheral blood mononuclear cells were thawed and T cells were isolated using an EasySep Human T Cell Isolation kit (StemCell Technologies, 17951) according to the manufacturer’s protocols. Isolated T cells were rested overnight at a density of 1 × 10⁶ cells per ml. Twenty to 24 h after thawing, T cells were activated using Dynabeads Human T-Activator CD3/CD28 (Gibco) according to the manufacturer’s protocols. Forty-eight hours after activation, cells were debeaded, washed and resuspended in Opti-MEM. For each replicate and condition, 200,000 T cells were electroporated with either a high dose (1 µg of editor mRNA, 240 pmol of epegRNA and 160 pmol of ngRNA) or a low dose (0.5 µg of editor mRNA, 120 pmol of epegRNA and 80 pmol of ngRNA) using a MaxCyte ExPERT GTx and OC-25×3 Processing Assemblies (MaxCyte). Cells were rested in the processing assemblies at 37 °C for 20 min immediately following electroporation and subsequently cultured for 72 h.

Electroporation of human primary CD34⁺ HSPCs

Cells from healthy donors (200,000 per reaction) were washed twice with 10 ml of PBS and resuspended in 20 µl of complete P3 nucleofector solution (Lonza, V4XP-3032) supplemented with either a high dose (1 µg of editor mRNA, 180 pmol of epegRNA and 120 pmol of ngRNA) or a low dose (0.5 µg of editor mRNA, 180 pmol of epegRNA and 120 pmol of ngRNA). The cell–RNA mixture was transferred to a 16-well strip from a P3 4D-Nucleofector X Kit S (Lonza, V4XP-3032) and electroporated using program DS-130 on a 4D-Nucleofector device (Lonza, AAF-1003X). Following electroporation, cells were suspended in 80 µl of prewarmed growth medium and incubated for 72 h before downstream analysis.

Transfection for plasmid DNA delivery

HEK293T and Neuro-2A cells were seeded in 96-well plates at a density of 1 × 10⁴ cells per well in 100 µl of medium. After 18 h, cells were transfected with 1 µl of Lipofectamine 2000 (Thermo Fisher Scientific, 11668027) according to the manufacturer’s instructions, using 15 ng of PE plasmid and 15 ng of epegRNA plasmid per well. Genomic DNA was collected 72 h after transfection.

PE protein quantification by ELISA

Hepa1-6 cells cultured in 48-well plates were washed with 300 µl of ice-cold PBS and lysed in 100 µl of RIPA buffer supplemented with protease inhibitor cocktail (Roche, 11836170001) and PMSF (Sigma-Aldrich, 93482). Cell lysates were clarified, and 50 µl of each sample was used for ELISA using a FastScan Cas9 ELISA kit (Cell Signaling Technology, 29666) according to the manufacturer’s instructions. Absorbance was measured on a TECAN Spark microplate reader. A standard curve was generated by serial dilution of recombinant Cas9 nuclease protein (New England Biolabs, M0386) to calculate the PE protein concentrations.
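Conversion of absorbance to protein concentration through a standard curve can be sketched in Python as follows. The sketch assumes that the standards fall in a range where log absorbance is linear in log concentration; the curve model used in this study is not specified, and commercial ELISA kits often recommend a four-parameter logistic fit instead. The standard values and function names below are hypothetical placeholders, not measured data.

```python
import math


def fit_loglog(concentrations, absorbances):
    """Least-squares line through (log10 concentration, log10 absorbance)."""
    xs = [math.log10(c) for c in concentrations]
    ys = [math.log10(a) for a in absorbances]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_absorbance(absorbance, slope, intercept, dilution_factor=1.0):
    """Invert the fitted line and correct for any sample dilution."""
    return dilution_factor * 10 ** ((math.log10(absorbance) - intercept) / slope)


# Hypothetical Cas9 standards (arbitrary concentration units) and absorbances.
standard_conc = [1.0, 10.0, 100.0]
standard_abs = [0.01, 0.1, 1.0]

slope, intercept = fit_loglog(standard_conc, standard_abs)
sample_conc = concentration_from_absorbance(0.05, slope, intercept)
```

Fold changes between a redesigned editor and its parent would then be the ratio of their interpolated concentrations at the same time point.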

HTS and analysis

For arrayed experiments, target amplicons were amplified from isolated genomic DNA⁷³. The first PCR (PCR1) was performed using primers containing Illumina sequencing adapters and targeting the genomic locus of interest (Supplementary Table 3). Each 25-µl reaction contained 1 µl of cell lysate and Phusion U Green Multiplex PCR Master Mix (Thermo Fisher Scientific, F564L). Cycling conditions were as follows: 98 °C for 3 min; 30 cycles of 98 °C for 10 s, 58–68 °C (optimized per target) for 20 s and 72 °C for 30 s; followed by a final extension at 72 °C for 2 min. Unique Illumina sequencing barcodes were incorporated in a second PCR (PCR2) using 1 μl of PCR1 product as template under the following conditions: 98 °C for 3 min; 10 cycles of 98 °C for 10 s, 61 °C for 20 s and 72 °C for 30 s; with a final extension at 72 °C for 2 min. After PCR2, amplicons were pooled by size and gel purified from 1% agarose using a QIAquick Gel Extraction kit (Qiagen, 28704). Pooled library concentrations were quantified using a Qubit dsDNA HS Assay kit (Invitrogen, Q33231) and run using an Illumina MiSeq 300 V2 kit (Illumina, MS-102-2002) or AVITI Sequencing Kit Cloudbreak Freestyle Low Output (Element Biosciences, 860-00011) with 220–300 cycles. Reads were demultiplexed with Bases2Fastq (Element Biosciences) or Generate FASTQ analysis module (Illumina).

For pooled experiments, genomic DNA from each replicate was used to perform PCR1 using Q5 Hot Start High-Fidelity 2× Master Mix (New England Biolabs, M0494L) to amplify the integrated lentiviral cassette and append sequencing adapters. Each 50-µl reaction contained up to 0.6 µg of genomic DNA. Primer sequences are provided in Supplementary Table 3. PCR1 products were purified using a QIAquick PCR Purification kit (Qiagen, 28104), and 5 ng of purified DNA was used as input for PCR2, which incorporated unique sample indices and flow cell adapters. Final PCR products were again purified using a QIAquick PCR Purification kit (Qiagen, 28104) and assessed for quality using a TapeStation High-Sensitivity D1000 ScreenTape assay (Agilent). Libraries were then sequenced with a custom Read1 primer (Supplementary Table 3) on an Element Biosciences AVITI platform for 190 cycles for the R1 read. Reads were demultiplexed with Bases2Fastq (Element Biosciences). For further demultiplexing of individual elements, reads were separated into individual fastq files by alignment to a corresponding member of the library using Bowtie2 (ref. ⁷⁴) and trimming to exclude scaffold bases.

Data analysis was conducted using CRISPResso2 in HDR mode⁷⁵, using parameters ‘-q30’ and ‘discard_indel_reads TRUE’ and a quantification window ‘qwc’ spanning at least 10 nucleotides upstream and downstream of the pegRNA and/or ngRNA nick site. Prime editing efficiency was calculated as the percentage of reads containing the desired edit without indels relative to all reference-aligned reads. Indel frequency was calculated as the fraction of discarded reads relative to the total number of reference-aligned reads. For analysis of pooled libraries, in addition to applying minimum quality thresholds, raw data were filtered to exclude library elements that had dropped out (<1,000 reads) and elements that showed undetectable editing (<0.5% editing). Raw data for screens are included in Supplementary Table 2.
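
As a worked illustration of the two definitions above and of the pooled-library filters, the calculation can be sketched as follows. The function names and read counts are hypothetical. The sketch assumes that the <1,000-read dropout threshold applies to reference-aligned reads per element. It is not the CRISPResso2 implementation.

```python
def editing_metrics(edited_reads, discarded_reads, aligned_reads):
    """Return (percent editing, percent indels) relative to reference-aligned reads."""
    if aligned_reads <= 0:
        raise ValueError("aligned_reads must be positive")
    editing = 100.0 * edited_reads / aligned_reads
    indels = 100.0 * discarded_reads / aligned_reads
    return editing, indels


def keep_element(edited_reads, aligned_reads, min_reads=1000, min_editing=0.5):
    """Pooled-library filter: drop dropout elements and elements with undetectable editing."""
    if aligned_reads < min_reads:
        return False
    return 100.0 * edited_reads / aligned_reads >= min_editing


# Hypothetical counts per library element: (edited, discarded, aligned)
elements = {"el1": (300, 20, 2000), "el2": (5, 1, 400), "el3": (4, 0, 5000)}
kept = {
    name: editing_metrics(*counts)
    for name, counts in elements.items()
    if keep_element(counts[0], counts[2])
}
```

In this toy input, el2 is removed as a dropout and el3 is removed for editing below 0.5%, leaving only el1.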

Protein expression and purification

Recombinant RTs with a C-terminal noncleavable 6×His tag were expressed in E. coli BL21 Star (DE3) cells (Thermo Fisher Scientific, C601003). Single colonies were inoculated into 5 ml of LB medium and grown overnight at 37 °C with shaking at 220 rpm. The overnight cultures were diluted 1:5 in 500 ml of LB medium and incubated at 37 °C with shaking until the optical density at 600 nm reached 1.0. Cultures were then cold shocked on ice for 1 h and induced with 0.5 mM isopropyl-β-d-thiogalactopyranoside (Gold Biotechnology, I2481C). Cells were collected by centrifugation at 4,000g at 4 °C for 30 min, resuspended in buffer B (20 mM Tris-HCl (pH 8.0), 300 mM NaCl, 0.5% Triton X-100, 10% (vol/vol) glycerol, 20 mM imidazole, 1 mg ml⁻¹ lysozyme (Sigma-Aldrich, 12671-19-1) and protease inhibitor cocktail) and lysed by sonication (total 3 min 20 s; 10 s on/20 s off, amplitude 10%) using a Qsonica CL-334 tip with Fisher Scientific FB705 power module. Lysates were clarified by centrifugation and incubated with Ni-NTA affinity resin (Thermo Fisher Scientific) at 4 °C for 1 h before being loaded onto a gravity column (G-Biosciences, 82021-346).

The resin was washed sequentially with 10 ml each of buffer B and buffer C (20 mM Tris-HCl (pH 8.0), 300 mM NaCl, 0.5% Triton X-100, 10% (vol/vol) glycerol and 80 mM imidazole), and bound proteins were eluted with buffer D (20 mM Tris-HCl (pH 8.0), 300 mM NaCl, 0.5% Triton X-100, 10% (vol/vol) glycerol and 250 mM imidazole). Eluted fractions containing pure protein were concentrated using Amicon Ultra centrifugal filters (30-kDa molecular weight cutoff; Millipore, UFC9030) and buffer exchanged into buffer E (50 mM Tris-HCl (pH 8.0), 100 mM NaCl, 0.1% Triton X-100, 1 mM EDTA and 1 mM DTT). Protein purity was assessed by SDS–PAGE using NuPAGE Bis-Tris Mini Protein Gels (4–12%, 1.0–1.5 mm; Invitrogen, NP0321BOX), and concentrations were determined with a Pierce BCA Protein Assay kit Reducing Agent Compatible (Thermo Fisher Scientific, 23250).

DSF

DSF was performed using SYPRO Orange dye (Thermo Fisher Scientific, S6650) under an iterative heating protocol. Briefly, 20-µl reactions contained 25 µM protein and 5× SYPRO Orange in buffer composed of 100 mM NaCl, 50 mM Tris-HCl (pH 7.5) and 1% DMSO. Dye-only controls were included for background subtraction. Samples were heated from 25 °C to 95 °C in increments of 1 °C with 30-s holds at each step, followed by cooling to 25 °C for 10 s between increments to allow signal equilibration. Fluorescence was recorded after each cooling step using a CFX Opus 96 Real-Time PCR System (Bio-Rad). Raw fluorescence traces were background subtracted and normalized, and T_m values were determined from the peak of the first derivative of fluorescence with respect to temperature (dF/dT).
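
The derivative-based T_m assignment can be sketched as follows. This is a minimal illustration that assumes central finite differences and a single unfolding transition, so that the global maximum of dF/dT is the peak of interest. The function name and the synthetic sigmoid trace are hypothetical and do not represent measured data.

```python
import math


def melting_temperature(temps, fluorescence):
    """Return the temperature at the maximum of dF/dT, using central differences."""
    if len(temps) != len(fluorescence) or len(temps) < 3:
        raise ValueError("need matched sequences with at least three points")
    best_t, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_t, best_slope = temps[i], slope
    return best_t


# Synthetic normalized trace: sigmoid centred at 52 °C, sampled from 25 °C to 95 °C in 1 °C steps
temps = list(range(25, 96))
signal = [1.0 / (1.0 + math.exp(-(t - 52) / 2.0)) for t in temps]
tm = melting_temperature(temps, signal)
```

Real traces are often noisy and can show more than one transition, so smoothing or restricting the search window may be needed before taking the maximum.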

Cell-free protein expression

Prime editors were fused to monomeric enhanced green fluorescent protein (meGFP) and cloned into an expression plasmid. PE–meGFP proteins were expressed using a PURExpress In Vitro Protein Synthesis kit (New England Biolabs, E6800S) according to the manufacturer’s instructions. Reactions were prepared with the recommended component ratios and incubated at 37 °C for the indicated duration to optimize protein yield. Following expression, samples were kept on ice before analysis. Protein concentration was quantified via GFP fluorescence using a standard curve generated from purified GFP. Fluorescence measurements were obtained with a plate reader using excitation and emission wavelengths appropriate for GFP. Sample concentrations were interpolated from the standard curve to estimate the yield of PE–meGFP produced in each reaction. To assess the integrity of the expressed protein, samples were analyzed on TGX stain-free polyacrylamide gels (Bio-Rad) under semidenaturing conditions.

Temperature-dependent reverse transcription activity assay

Template mRNA was annealed to primers by combining 1 µl of Millennium RNA Markers (1 µg µl⁻¹; Thermo Fisher Scientific, AM7150) with 1 µl of 50 µM oligo(dT) and 13 µl of nuclease-free water. Annealing reactions were performed by incubation at 65 °C for 5 min followed by incubation on ice for 1 min. To prepare reverse transcription reactions, 4 µl of 5× SuperScript Buffer (Thermo Fisher Scientific, 18090050B), 1 µl of 100 mM DTT, 1 µl of RNase inhibitor (Thermo Fisher Scientific, EO0381) and 1 µl of purified RT or commercial SuperScript IV (Thermo Fisher Scientific) at 4.3 µM were added to annealing reactions. RT-free controls were included where water was added instead of RT. Reverse transcription reactions were run by incubation at the indicated temperature for the indicated time and inactivated by heating to 80 °C for 10 min. Template RNA was degraded by adding 1 µl of RNase H (Thermo Fisher Scientific) and incubating at 37 °C for 20 min.
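
The final component concentrations implied by these volumes can be worked out as follows. This is a sketch that assumes the listed components are the only contributions to reaction volume before RNase H addition. The variable names are illustrative.

```python
# Volumes in µl, taken from the protocol text above
annealing = {"RNA markers": 1, "oligo(dT)": 1, "water": 13}
additions = {"5x buffer": 4, "DTT": 1, "RNase inhibitor": 1, "RT": 1}

total_volume = sum(annealing.values()) + sum(additions.values())  # µl

# Final concentrations by simple dilution (stock concentration x volume / total volume)
rt_final_nm = 1000 * 4.3 * additions["RT"] / total_volume          # 4.3 µM stock, result in nM
dtt_final_mm = 100 * additions["DTT"] / total_volume               # 100 mM stock, result in mM
oligo_final_um = 50 * annealing["oligo(dT)"] / total_volume        # 50 µM stock, result in µM
rna_final_ng_per_ul = 1000 * annealing["RNA markers"] / total_volume  # 1 µg/µl stock
```

Under this assumption the reaction volume is 22 µl and the RT is present at roughly 195 nM.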

To analyze cDNA products, reactions were combined with 6× alkaline gel loading buffer to a final concentration of 1× and resolved on an alkaline agarose gel composed of 1% (wt/wt) agarose, 50 mM NaOH and 1 mM EDTA. Gels were run at 20 V for 12 h. Gels were neutralized with two washes of neutralization buffer (500 mM Tris-HCl (pH 7.5) and 1 M NaCl) for 20 min each and stained with SYBR Gold (Thermo Fisher Scientific) for 1 h before imaging on a Bio-Rad ChemiDoc.

eVLP production and transduction

Gesicle 293T cells were seeded in T75 flasks (Corning, 353136) at a density of 5 × 10⁶ cells per flask. After 18 h, cells were transfected using jetPRIME transfection reagent (Polyplus, 101000001), according to the manufacturer’s instructions, with plasmids expressing VSV-G (400 ng), wild-type M-MLV Gag–Pol (2,813 ng), Gag–COM–Pol (2,000 ng), Gag–Pol (422 ng), P4–PE (422 ng), COM–epegRNA (3,520 ng) and COM–ngRNA (880 ng). At 48 h after transfection, the culture supernatant was collected, centrifuged at 500g for 5 min to remove debris and filtered through a 0.45-µm polyvinylidene difluoride membrane (MilliporeSigma, SE1M003M00).

For cell culture transduction, 5× PEG-it Virus Precipitation Solution (System Biosciences, LV825A-1) was added to precipitate eVLPs at 4 °C for 18 h. eVLPs were pelleted at 1,500g for 30 min at 4 °C and resuspended in Opti-MEM I reduced serum medium (Gibco, 31985070) at a 100× concentration. HEK293T cells were plated at 24,000 cells per well and Neuro-2A cells at 40,000 cells per well in 48-well plates. PE-eVLPs were added directly to the culture medium 18 h after plating. Cells were collected 72 h after transduction, and genomic DNA was extracted by crude lysis for HTS.

Off-target amplicon sequencing and analysis

Nominated candidate off-target sites were previously identified using CIRCLE-seq for the mouse Pcsk9 +1 TTAC insertion using a single guide RNA (sgRNA) targeting the same protospacer sequence⁶⁹. Use of an sgRNA surrogate with an identical protospacer for prime editing was previously reported to nominate largely overlapping sets of genomic loci⁵. Targeted amplicon sequencing was performed as described above. Off-target editing events were identified with CRISPResso2 using the parameters ‘-q30’, ‘discard_indel_reads TRUE’ and ‘-w 25’ to capture events within 25 nucleotides of the epegRNA nick position. Reads binned as ‘discarded’ and ‘substitutions’ were assigned as off-target indels and substitutions, respectively, and normalized to all reference-aligned reads.

Animal care

All experiments involving live animals were approved by the Broad Institute Institutional Animal Care and Use Committee (0048-04-15-2). Six- to 8-week-old female C57BL/6J mice were purchased from The Jackson Laboratory (000664). Mouse housing facilities were maintained at 20–22 °C with 30–50% humidity on a 12-h light/12-h dark cycle with ad libitum access to standard rodent diet and water. Animals were randomly assigned to experimental groups.

Retro-orbital injections of LNPs

Before injection, individually formulated LNPs containing mRNA, epegRNA and ngRNA were admixed at a ratio of 1:1.8:0.2 (mRNA:epegRNA:ngRNA by total RNA mass). Anesthesia was first induced with 4% isoflurane. Following induction, the right eye was gently protruded, and the needle of a loaded insulin syringe, bevel facing away from the eye, was inserted into the retrobulbar sinus, and specified doses of LNP by total RNA mass were slowly injected. Following injection, a drop of proparacaine hydrochloride ophthalmic solution (Patterson Veterinary) was applied to the eye as an analgesic.
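
The 1:1.8:0.2 admixing ratio translates into per-component RNA masses as sketched below. The function name and the 3 µg total dose are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def split_rna_dose(total_rna_ug, ratio=(1.0, 1.8, 0.2)):
    """Split a total RNA mass into mRNA, epegRNA and ngRNA according to a mass ratio."""
    parts = sum(ratio)
    names = ("mRNA", "epegRNA", "ngRNA")
    return {name: total_rna_ug * r / parts for name, r in zip(names, ratio)}


dose = split_rna_dose(3.0)  # hypothetical total of 3 µg RNA per injection
```

With this ratio, one-third of the total RNA mass is mRNA, 60% is epegRNA and the remainder is ngRNA.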

Mouse tissue collection and processing

Mice were killed by CO₂ asphyxiation and perfused with PBS via the left ventricle. Bulk livers were collected, transferred to 2-ml tubes containing metal beads (Revvity) and mechanically lysed using a TissueLyser II (Qiagen). Genomic DNA was extracted from ground tissues using a DNAdvance kit (Beckman Coulter, A48705) according to the manufacturer’s protocol. The extracted genomic DNA was used as input for downstream HTS sample preparation.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Plasmids for IVT prime editor variants will be available from Addgene. Amino acid sequences of PE8 variants are listed in Supplementary Table 5. The HTS data generated during this study are available under NCBI Sequence Read Archive accession code PRJNA1369105 (ref. ⁷⁶). Raw data are available from the corresponding author on request. Source data are provided with this paper.

Code availability

The code used for sequence redesign is available at https://github.com/Allentaoyz/Redesigned_prime_editor_RTs (ref. ⁷⁷), and the code used for analysis of HTS data is available from GitHub at https://github.com/pinellolab/CRISPResso2 (ref. ⁷⁵).

References

Anzalone, A. V. et al. Search-and-replace genome editing without double-strand breaks or donor DNA. Nature 576, 149–157 (2019).
Choi, J. et al. A time-resolved, multi-symbol molecular recorder via sequential genome editing. Nature 608, 98–107 (2022).
Gould, S. I. et al. High-throughput evaluation of genetic variants with prime editing sensor libraries. Nat. Biotechnol. 43, 1648–1662 (2025).
Jang, H. et al. Application of prime editing to the correction of mutations and phenotypes in adult mice with liver and eye diseases. Nat. Biomed. Eng. 6, 181–194 (2021).
Everette, K. A. et al. Ex vivo prime editing of patient haematopoietic stem cells rescues sickle-cell disease phenotypes after engraftment in mice. Nat. Biomed. Eng. 7, 616–628 (2023).
Li, C. et al. In vivo HSC prime editing rescues sickle cell disease in a mouse model. Blood 141, 2085–2099 (2023).
Böck, D. et al. In vivo prime editing of a metabolic liver disease in mice. Sci. Transl. Med. 14, eabl9238 (2022).
Sousa, A. A. et al. In vivo prime editing rescues alternating hemiplegia of childhood in mice. Cell 188, 4275–4294 (2025).
Davis, J. R. et al. Efficient prime editing in mouse brain, liver and heart with dual AAVs. Nat. Biotechnol. 42, 253–264 (2024).
Rothgangl, T. et al. Treatment of a metabolic liver disease in mice with a transient prime editing approach. Nat. Biomed. Eng. 9, 1705–1718 (2025).
Qin, H. et al. Vision rescue via unconstrained in vivo prime editing in degenerating neural retinas. J. Exp. Med. 220, e20220776 (2023).
Fu, Y. et al. In vivo prime editing rescues photoreceptor degeneration in nonsense mutant retinitis pigmentosa. Nat. Commun. 16, 2394 (2025).
An, M. et al. Engineered virus-like particles for transient delivery of prime editor ribonucleoprotein complexes in vivo. Nat. Biotechnol. 42, 1526–1537 (2024).
Gori, J. L. et al. Prime editing for p47^phox-deficient chronic granulomatous disease. N. Engl. J. Med. 394, 1195–1203 (2025).
Velimirovic, M. et al. Peptide fusion improves prime editing efficiency. Nat. Commun. 13, 3512 (2022).
Liu, B. et al. A split prime editor with untethered reverse transcriptase and circular RNA template. Nat. Biotechnol. 40, 1388–1393 (2022).
Sakai, H. A. et al. Directed evolution of small RNA-stabilizing motifs that improve prime editing efficiency. Nat. Biotechnol. (in the press).
Yan, J. et al. Improving prime editing with an endogenous small RNA-binding protein. Nature 628, 639–647 (2024).
Nelson, J. W. et al. Engineered pegRNAs improve prime editing efficiency. Nat. Biotechnol. 40, 402–410 (2022).
Chen, P. J. et al. Enhanced prime editing systems by manipulating cellular determinants of editing outcomes. Cell 184, 5635–5652 (2021).
Doman, J. L. et al. Phage-assisted evolution and protein engineering yield compact, efficient prime editors. Cell 186, 3983–4002 (2023).
Grünewald, J. et al. Engineered CRISPR prime editors with compact, untethered reverse transcriptases. Nat. Biotechnol. 41, 337–343 (2023).
Zong, Y. et al. An engineered prime editor with enhanced editing efficiency in plants. Nat. Biotechnol. 40, 1394–1402 (2022).
Liu, P. et al. Increasing intracellular dNTP levels improves prime editing efficiency. Nat. Biotechnol. 43, 539–544 (2025).
Mittas, D. M. et al. Dual AAV vectors for efficient delivery of large transgenes. Nat. Protoc. 21, 1466–1522 (2026).
Roberts, C. J. Therapeutic protein aggregation: mechanisms, design, and control. Trends Biotechnol. 32, 372–380 (2014).
Hou, X., Zaks, T., Langer, R. & Dong, Y. Lipid nanoparticles for mRNA delivery. Nat. Rev. Mater. 6, 1078–1094 (2021).
Gaudelli, N. M. et al. Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature 551, 464–471 (2017).
Neugebauer, M. E. et al. Evolution of an adenine base editor into a small, efficient cytosine base editor with low off-target activity. Nat. Biotechnol. 41, 673–685 (2023).
Esvelt, K. M., Carlson, J. C. & Liu, D. R. A system for the continuous directed evolution of biomolecules. Nature 472, 499–503 (2011).
Huang, T. P. et al. High-throughput continuous evolution of compact Cas9 variants targeting single-nucleotide-pyrimidine PAMs. Nat. Biotechnol. 41, 96–107 (2023).
Pandey, S. et al. Efficient site-specific integration of large genes in mammalian cells via continuously evolved recombinases and prime editing. Nat. Biomed. Eng. 9, 22–39 (2024).
Witte, I. P. et al. Programmable gene insertion in human cells with a laboratory-evolved CRISPR-associated transposase. Science 388, eadt5199 (2025).
Tokuriki, N. et al. Diminishing returns and tradeoffs constrain the laboratory optimization of an enzyme. Nat. Commun. 3, 1257 (2012).
Serrano, L., Bycroft, M. & Fersht, A. R. Aromatic–aromatic interactions and protein stability. J. Mol. Biol. 218, 465–475 (1991).
Bloom, J. D. et al. Thermodynamic prediction of protein neutrality. Proc. Natl Acad. Sci. USA 102, 606–611 (2005).
Giver, L., Gershenson, A., Freskgard, P.-O. & Arnold, F. H. Directed evolution of a thermostable esterase. Proc. Natl Acad. Sci. USA 95, 12809–12813 (1998).
Tokuriki, N., Stricher, F., Serrano, L. & Tawfik, D. S. How protein stability and new functions trade off. PLoS Comput. Biol. 4, e1000002 (2008).
Teufl, M., Zajc, C. U. & Traxlmayr, M. W. Engineering strategies to overcome the stability–function trade-off in proteins. ACS Synth. Biol. 11, 1030–1039 (2022).
Wei, R. et al. Improved split prime editors enable efficient in vivo genome editing. Cell Rep. 44, 115144 (2025).
Huang, P.-S., Boyken, S. E. & Baker, D. The coming of age of de novo protein design. Nature 537, 320–327 (2016).
Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).
Ruffolo, J. A. et al. Design of highly functional genome editors by modelling CRISPR–Cas sequences. Nature 645, 518–525 (2025).
Yang, C. et al. Prime editor with rational design and AI-driven optimization for reverse editing window and enhanced fidelity. Nat. Commun. 16, 5144 (2025).
Fei, H. et al. Advancing protein evolution with inverse folding models integrating structural and evolutionary constraints. Cell 188, 4674–4692 (2025).
Goverde, C. A. et al. Computational design of soluble and functional membrane protein analogues. Nature 631, 449–458 (2024).
Sumida, K. H. et al. Improving protein expression, stability, and function with ProteinMPNN. J. Am. Chem. Soc. 146, 2054–2061 (2024).
Dauparas, J. et al. Robust deep learning-based protein sequence design using ProteinMPNN. Science 378, 49–56 (2022).
King, B. R., Sumida, K. H., Caruso, J. L., Baker, D. & Zalatan, J. G. Computational stabilization of a non-heme iron enzyme enables efficient evolution of new function. Angew. Chem. Int. Ed. 64, e202414705 (2025).
Kweon, J. et al. Engineered prime editors with PAM flexibility. Mol. Ther. 29, 2001–2007 (2021).
Chauhan, V. P., Sharp, P. A. & Langer, R. Engineered prime editors with minimal genomic errors. Nature 646, 1254–1260 (2025).
Nowak, E. et al. Structural analysis of monomeric retroviral reverse transcriptase in complex with an RNA/DNA hybrid. Nucleic Acids Res. 41, 3874–3887 (2013).
Jiang, A. Y. et al. Efficient prime editing in vivo and in vitro using lipid nanoparticles. Nat. Nanotechnol. (in the press).
Landrum, M. J. et al. ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Res. 44, D862–D868 (2016).
Yu, G. et al. Prediction of efficiencies for diverse prime editing systems in multiple cell types. Cell 186, 2256–2272 (2023).
Lyu, X., Yang, Q., Zhao, F. & Liu, Y. Codon usage and protein length-dependent feedback from translation elongation regulates translation initiation and elongation speed. Nucleic Acids Res. 49, 9404–9423 (2021).
Schiffrin, B. & Calabrese, A. N. Chaperones in concert: orchestrating co-translational protein folding in the cell. Mol. Cell 84, 2403–2404 (2024).
Kim, M. et al. Dual SORT LNPs for multi-organ base editing. Nat. Biotechnol. 44, 578–586 (2026).
Musunuru, K. et al. Patient-specific in vivo gene editing to treat a rare genetic disease. N. Engl. J. Med. 392, 2235–2243 (2025).
Longhurst, H. J. et al. CRISPR–Cas9 in vivo gene editing of KLKB1 for hereditary angioedema. N. Engl. J. Med. 390, 432–441 (2024).
Gillmore, J. D. et al. CRISPR–Cas9 in vivo gene editing for transthyretin amyloidosis. N. Engl. J. Med. 385, 493–502 (2021).
Beam Therapeutics. A phase 1/2 dose-exploration and dose-expansion study to evaluate the safety and efficacy of BEAM-302 in adult patients with α-1 antitrypsin deficiency (AATD)-associated lung disease and/or liver disease. https://investors.primemedicine.com/news-releases/news-release-details/prime-medicine-announces-breakthrough-clinical-data-showing (2025).
Fenton, O. S. et al. Bioinspired alkenyl amino alcohol ionizable lipid materials for highly potent in vivo mRNA delivery. Adv. Mater. 28, 2939–2943 (2016).
Privalov, P. L. & Khechinashvili, N. N. A thermodynamic approach to the problem of stabilization of globular protein structure: a calorimetric study. J. Mol. Biol. 86, 665–684 (1974).
Behrens, R. Fission yeast retrotransposon Tf1 integration is targeted to 5′ ends of open reading frames. Nucleic Acids Res. 28, 4709–4716 (2000).
Arezi, B. & Hogrefe, H. Novel mutations in Moloney murine leukemia virus reverse transcriptase increase thermostability through tighter binding to template-primer. Nucleic Acids Res. 37, 473–481 (2009).
Telesnitsky, A. & Goff, S. P. RNase H domain mutations affect the interaction between Moloney murine leukemia virus reverse transcriptase and its primer-template. Proc. Natl Acad. Sci. USA 90, 1276–1280 (1993).
Geilenkeuser, J. et al. Engineered nucleocytosolic vehicles for loading of programmable editors. Cell 188, 2637–2655 (2025).
Banskota, S. et al. Engineered virus-like particles for efficient in vivo delivery of therapeutic proteins. Cell 185, 250–265 (2022).
Abifadel, M. et al. Mutations in PCSK9 cause autosomal dominant hypercholesterolemia. Nat. Genet. 34, 154–156 (2003).
Silverstein, R. A. et al. Custom CRISPR–Cas9 PAM variants via scalable engineering and machine learning. Nature 643, 539–550 (2025).
Truong, D.-J. J. et al. Exonuclease-enhanced prime editors. Nat. Methods 21, 455–464 (2024).
Doman, J. L., Sousa, A. A., Randolph, P. B., Chen, P. J. & Liu, D. R. Designing and executing prime editing experiments in mammalian cells. Nat. Protoc. 17, 2431–2468 (2022).
Langdon, W. B. Performance of genetic programming optimised Bowtie2 on genome comparison and analytic testing (GCAT) benchmarks. BioData Min. 8, 1 (2015).
Clement, K. et al. CRISPResso2 provides accurate and rapid genome editing sequence analysis. Nat. Biotechnol. 37, 215–216 (2019).
Tao, Y. A. et al. AI-guided redesign of laboratory-evolved reverse transcriptases enhances prime editing. NCBI Sequence Read Archive https://www.ncbi.nlm.nih.gov/sra/PRJNA1369105 (2026).
Tao, Y. A. et al. AI-guided redesign of laboratory-evolved reverse transcriptases enhances prime editing. GitHub https://github.com/Allentaoyz/Redesigned_prime_editor_RTs (2026).

Acknowledgements

This work was supported by the US National Institutes of Health grants R01EB031172, R35GM118062 and RM1HG009490, the Bill and Melinda Gates Award INV-056974 and the Howard Hughes Medical Institute. A.C. and N.A.K. were supported by the National Science Foundation graduate research fellowship.

Author information

These authors contributed equally: Y. Allen Tao, Holt A. Sakai, Allen Y. Jiang, Nicholas A. Krasnow.

Authors and Affiliations

Merkin Institute of Transformative Technologies in Healthcare, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
Y. Allen Tao, Holt A. Sakai, Allen Y. Jiang, Nicholas A. Krasnow, Vasilii S. Vaganov, Brian Shim, Zachary Barsdale, Smriti Pandey, Nouraiz Ahmed, Man Na, Ting-Wei Liao, Keyede Oye, Ana Cristian, Emily Zhang, Joy A. Xu, Mattijs Bulcaen & David R. Liu
Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
Y. Allen Tao, Holt A. Sakai, Allen Y. Jiang, Nicholas A. Krasnow, Vasilii S. Vaganov, Brian Shim, Zachary Barsdale, Smriti Pandey, Nouraiz Ahmed, Ting-Wei Liao, Keyede Oye, Ana Cristian, Emily Zhang, Joy A. Xu, Mattijs Bulcaen & David R. Liu
Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA
Y. Allen Tao, Holt A. Sakai, Allen Y. Jiang, Nicholas A. Krasnow, Vasilii S. Vaganov, Brian Shim, Zachary Barsdale, Smriti Pandey, Nouraiz Ahmed, Ting-Wei Liao, Keyede Oye, Ana Cristian, Emily Zhang, Joy A. Xu, Mattijs Bulcaen & David R. Liu
Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
Ana Cristian

Contributions

Y.A.T., H.A.S. and A.Y.J. performed experiments and analyzed data. Y.A.T., N.A.K. and S.P. conceptualized the design. N.A.K. and S.P. helped with computational sequence redesign. V.S.V. and M.B. performed eVLP experiments. Z.B. helped with arrayed validation experiments. B.S. conducted experiments in primary T cells and helped with the major histocompatibility complex binding prediction, and N.A. conducted experiments in HSPCs. N.A.K., M.N., T.-W.L., E.Z. and J.A.X. assisted with protein biochemistry. A.C. helped with off-target analysis. K.O. helped with in vivo experiment analysis. D.R.L. supervised all research. Y.A.T., H.A.S., A.Y.J. and D.R.L. wrote the paper with input from all authors.

Corresponding author

Correspondence to David R. Liu.

Ethics declarations

Competing interests

Y.A.T., H.A.S., A.Y.J., N.A.K. and D.R.L. have filed patent applications related to this work through the Broad Institute. S.P. is an employee of CRISPR Therapeutics. D.R.L. is a cofounder of Beam Therapeutics, Prime Medicine, Editas Medicine, Pairwise Plants and nChroma Bio, companies that use or deliver genome editing agents. All other authors have no competing interests.

Peer review

Peer review information

Nature Biotechnology thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 In silico analysis of redesigned RT variants.

Relationship and distribution of pLDDT and RMSD values for redesigned RT candidates for (a) PE6a, (b) PE6d, and (c) PEmax across all 96 designs per starting point. See Fig. 1e for data for PE6c. Each dot in the central plot represents one redesigned variant. Histograms show the frequency distributions of RMSD (vertical bars) and pLDDT (horizontal bars) values. Color shadings for bars and dots represent designs produced using the indicated thresholds for exclusion from redesign on the basis of substrate proximity (20, 18, and 15 Å).

Extended Data Fig. 2 Experimental validation of redesigned RT variants in PE-LNP transfection in cell culture.

Prime editing efficiency for installation of a +1 TTAC edit at the Pcsk9 locus in Hepa1-6 cells treated with OF-02 LNPs (25 ng total RNA dose per well), assessed at 72 h post-transfection. Variants are ordered in descending rank by editing efficiency. Vertical bars show average editing (n = 3 independent biological replicates) and dots show individual values. Horizontal dashed lines show the editing with the starting point variant, which is also highlighted as a light vertical bar. The same data are replotted in Fig. 2d to focus on the top seven hits.

Extended Data Fig. 3 Effects of redesign constraints on prime editing efficiency.

a, Fold-change in prime editing efficiency for different conservation constraints (left) and distance constraints (right). Variants were more extensively redesigned with higher conservation thresholds (50%) and lower distance thresholds (15 Å). Each dot represents the average of n = 3 independent biological replicates. Thick vertical bars show the grand mean. b, Fraction of residues from laboratory evolution or engineering that were unchanged during the redesign process for each RT starting point. c, Top four redesigned prime editor variants identified from the Pcsk9 +1 TTAC insertion primary screen were evaluated across three additional endogenous genomic targets in Huh-7 cells by LNP transfection. Data are shown as the mean of n = 3 independent biological replicates.

Source data

Extended Data Fig. 4 ROC comparison of top variants.

a–d, Receiver operating characteristic (ROC) curves show the relative performance of redesigned variants from the pooled lentiviral screen (Fig. 3) compared to the input RT variant. True positive and false positive rates were calculated from spacer-matched differences in prime editing efficiency across all targets with >0.5% editing. The area under the curve (AUC) values are listed in parentheses and represent the probability that the variant outperforms the starting point for a randomly chosen target edit.
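The probabilistic reading of the AUC stated above can be illustrated with a minimal sketch. It does not reproduce the ROC construction used for these panels; it only computes the fraction of spacer-matched targets at which a variant beats its starting point. The function name, the example values, the tie handling (ties count as half) and the rule that a target is kept when either editor exceeds the 0.5% threshold are all assumptions made for illustration.

```python
def win_probability(variant, start, min_editing=0.5):
    """Fraction of matched targets where the variant out-edits the starting point.

    variant, start: percent editing at the same targets, in the same order.
    min_editing: targets where neither editor exceeds this value are dropped
                 (assumed filter; the caption only states ">0.5% editing").
    """
    wins = 0.0
    kept = 0
    for v, s in zip(variant, start):
        if v <= min_editing and s <= min_editing:
            continue  # too little editing to compare reliably
        kept += 1
        if v > s:
            wins += 1.0
        elif v == s:
            wins += 0.5  # assumption: ties count as half a win
    if kept == 0:
        raise ValueError("no targets above the editing threshold")
    return wins / kept


# Hypothetical percent-editing values for five matched targets.
variant_editing = [12.0, 8.5, 0.2, 30.1, 5.0]
start_editing = [10.0, 9.0, 0.1, 22.4, 5.0]
p = win_probability(variant_editing, start_editing)
print(p)  # 0.625: two wins, one loss, one tie, one target filtered out
```

A value of 0.5 would indicate no systematic difference between the variant and its starting point, and values approaching 1 would indicate that the variant wins at nearly every target.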

Extended Data Fig. 5 PE8 variants outperform PE6 or PEmax starting points following plasmid DNA or mRNA transfection.

a, Pearson correlation of percent editing at selected synthetic sites generated by pooled or arrayed transduction (two-sided P < 10⁻¹⁵). Each dot represents the average editing outcome for a unique combination of prime editor and programmed edit. b, Arrayed plasmid DNA transfection of HEK293T cells with PE8 variants paired with epegRNA (tevo2.0) and PE7 paired with pegRNA. Horizontal bars show average editing (n = 3 independent biological replicates) and dots show individual values. c, Arrayed LNP mRNA transfection of Huh-7 cells at endogenous targets with PE8 variants paired with synthetic epegRNA (eSBRNV1) and PE7 paired with synthetic pegRNA with La-accessible end. Vertical bars show average editing (n = 3 independent biological replicates) and dots show individual values. Editing data were collected in the same experimental batch as Extended Data Fig. 3c. d, Arrayed assessment of prime editing at self-targeting synthetic lentiviral target sites in Huh-7 cells using PE8 variants paired with epegRNA (tevo2.0) and PE8-La paired with pegRNA, delivered by LNP-mRNA transfection. Vertical bars show average editing (n = 3 independent biological replicates) and dots show individual values. Editing data were collected in the same experimental batch as Extended Data Fig. 3f. e,f, Arrayed assessment of prime editing at self-targeting synthetic lentiviral target sites in Huh-7 cells using PE8 variants paired with epegRNA (tevo2.0) and PE8-La paired with pegRNA with La-accessible end (e), or PE8 variants and PE8-La both paired with epegRNA (tevo2.0) (f), delivered by LNP-mRNA transfection. Horizontal bars show average editing (n = 3 independent biological replicates) and dots show individual values.

Source data

Extended Data Fig. 6 Biochemical and functional characterization of RT variants.

a, Steps of RT expression and purification from E. coli, showing RT from PE8c as an example on the far left. Elution fractions (E1–E3, which represent all elution fractions) from pre-evolution or pre-engineering, evolved, and redesigned RTs. Each elution fraction lane (E1–E3) shows 0.3% (15 µL) of the total eluted protein (5 mL) from each fraction analyzed by gel electrophoresis on a 4–12% NuPAGE Bis-Tris Mini Protein Gel. b, Editing kinetics of PE8 variants following LNP delivery of prime editor mRNA. Prime editor mRNA encoding PE8 variants, synthetic epegRNA and ngRNA targeting a Pcsk9 +1 TTAC insertion were formulated in LNPs and transfected into Hepa1-6 cells at a total RNA dose of 20 ng per well. Editing outcomes were quantified 48 and 72 h post-transfection. Dots show mean ± s.d. of n = 3 biological replicates. c, Comparison of protein yields for full-length PE6d and PE8d using an in vitro cell-free translation system (NEB PURExpress). Both variants were expressed as meGFP-fusion proteins at 37 °C and yields were quantified via GFP fluorescence. Concentration data are represented as mean ± s.d. from n = 3 independent replicates. d, Temperature-dependent reverse transcription activity assay for PEmax ΔRNaseH RT, PE6d, and PE8d RT domains. RT activity was evaluated using a pool of RNA templates of varying lengths at 37 °C and 57 °C. Millennium RNA Markers (1 µg) were annealed to an oligo(dT) primer and reverse transcribed using either purified reverse transcriptase or SuperScript IV at 4.3 µM. Reactions lacking reverse transcriptase (RT-free) served as negative controls. Following heat inactivation and template RNA degradation via RNaseH, the resulting cDNA products were resolved on a 1% alkaline agarose gel, neutralized, and visualized with SYBR Gold staining. Top and bottom panels represent two independent biological replicates.

Source data

Extended Data Fig. 7 PE8 variants offer improved editing efficiency in primary human cells.

a,b, Prime editing correction of mutations that cause Crigler-Najjar syndrome (a) and Tay-Sachs disease (b) in primary patient-derived fibroblasts. c, Prime editing efficiency at the HBB locus in CD34⁺ HSPCs from healthy donors following electroporation of low-dose RNA reagents programmed to install a PAM-disrupting +5 G-to-A silent edit in HBB (n = 3). d, Prime editing efficiency at the IL2RB locus in primary human T cells following electroporation of low-dose RNA reagents to install a +1 T>A, +5 G>C orthogonalizing edit for adoptive T-cell therapy (n = 3). Bars represent the mean of n = 3 independent biological replicates from a single fibroblast cell line, and dots represent individual replicates.

Source data

Extended Data Fig. 8 PE8c can outperform the prior best-performing PE-eVLPs based on PEmax.

Prime editing with v3b PE-eVLPs in HEK293T cells (HEK3, FANCF) and N2a cells (Dnmt1, Col12a1) at the indicated doses. Data show mean ± s.d. for n = 3 independent biological replicates. Best-fit lines show nonlinear regression to three-parameter logistic curves.
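For readers unfamiliar with the curve family named above, the following is a minimal sketch of one common three-parameter logistic dose-response form, in which the Hill slope is fixed at 1 and the free parameters are the lower plateau, the upper plateau and the half-maximal dose. The exact parameterization used for these fits is not stated in the caption, so the function name, parameter names and values here are assumptions for illustration only.

```python
def logistic3(dose, bottom, top, ec50):
    """Three-parameter logistic response with the Hill slope fixed at 1.

    dose:   eVLP dose in arbitrary units (must be positive)
    bottom: response as the dose approaches zero
    top:    response at saturating dose
    ec50:   dose giving a response halfway between bottom and top
    """
    if dose <= 0:
        raise ValueError("dose must be positive")
    return bottom + (top - bottom) / (1.0 + ec50 / dose)


# Hypothetical parameters: 0% baseline, 60% plateau, half-maximal dose of 10 units.
doses = [0.1, 1.0, 10.0, 100.0]
curve = [logistic3(d, bottom=0.0, top=60.0, ec50=10.0) for d in doses]
print(curve[2])  # 30.0: the response at the half-maximal dose
```

Under this form, a more potent editor shifts the curve left (lower half-maximal dose), raises the plateau, or both.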

Source data

Extended Data Fig. 9 Analysis of in vivo prime editing product purity and off-target editing.

PE-LNPs targeting the Pcsk9 +1 TTAC insertion using admix formulations of prime editor mRNA, epegRNA, and ngRNA were delivered as a single administration at 1 mg/kg into adult C57BL/6J mice by retro-orbital injection. a,b, Bulk liver tissues were analyzed at the on-target site for indels (a) and product purity (edit-to-indel ratio, b). Bars represent the mean of n = 3 independent biological replicates. c,d, Off-target loci previously nominated by CIRCLE-seq were assessed for indels (c) and substitutions (d) compared to the reference genome (GRCm39). Data show mean ± s.d. for n = 3 independent biological replicates.
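The product purity metric in panel b is a simple ratio of intended edits to indels at the on-target site. A minimal sketch follows; the function name and the input percentages are hypothetical, and returning infinity when no indels are detected is an assumption about how to handle that edge case, not a description of the analysis used here.

```python
def edit_to_indel_ratio(edit_pct, indel_pct):
    """Ratio of intended-edit frequency to indel frequency at one site.

    Both inputs are percentages of sequencing reads. Returns infinity when
    no indels are detected (assumed convention for this sketch).
    """
    if edit_pct < 0 or indel_pct < 0:
        raise ValueError("percentages must be non-negative")
    if indel_pct == 0:
        return float("inf")
    return edit_pct / indel_pct


# Hypothetical values: 40% intended edit and 2% indels.
ratio = edit_to_indel_ratio(40.0, 2.0)
print(ratio)  # 20.0
```

A higher ratio indicates a cleaner editing outcome at the same on-target efficiency.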

Source data

Supplementary information

Supplementary Information

Supplementary Discussion, Supplementary Figs. 1 and 2 and Supplementary Tables 1–5.

Reporting Summary

Supplementary Table 1

Design and data for computationally designed prime editors.

Supplementary Table 2

Design and data for self-targeting prime editing screen.

Supplementary Table 3

Sequences of primers for amplification and HTS.

Supplementary Table 4

Sequences of pegRNAs and sgRNAs.

Supplementary Table 5

PE8 prime editor sequences.

Source data

Source Data Figs. 2, 3 and 5 and Extended Data Figs. 2, 3 and 5–9

File names for fastq raw sequencing reads uploaded to NCBI Sequence Read Archive for all experiments.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Tao, Y.A., Sakai, H.A., Jiang, A.Y. et al. AI-guided redesign of laboratory-evolved reverse transcriptases enhances prime editing. Nat Biotechnol (2026). https://doi.org/10.1038/s41587-026-03149-6

Received: 25 November 2025
Accepted: 27 April 2026
Published: 21 May 2026
Version of record: 21 May 2026
DOI: https://doi.org/10.1038/s41587-026-03149-6
