Main

Co-translational incorporation of ncAAs via genetic code expansion (GCE) enables precise reprogramming of the proteome’s chemical diversity. By leveraging orthogonal aminoacyl-tRNA synthetases (aaRSs), a wide range of functionalities, including post-translational modifications (PTMs), bioorthogonal handles, crosslinking moieties, spectroscopic probes and photocaged amino acids, have been site-specifically introduced into proteins of interest (POIs) across all domains of life, typically via amber suppression1,2,3,4,5,6,7,8,9. These strategies offer powerful tools for studying and manipulating protein functions and for generating proteins with therapeutic and biotechnological importance.

Despite substantial progress, broad implementation of GCE remains limited by low protein production yields. Inefficiencies arise from insufficient substrate activation by orthogonal aaRSs, as well as unfavourable competition of aminoacylated tRNAs with release factors at introduced nonsense codons10. In addition, many ncAAs require advanced expertise in chemical synthesis and are used at high concentrations in typical experiments, making them prohibitively expensive, a factor that is exacerbated by the low incorporation yields.

Multiple efforts have addressed these limitations through optimized aaRS/tRNA expression systems combined with novel selection and evolution strategies that enhance suppression efficiencies11,12,13. Additional advances include orthogonal ribosomes14, release-factor knockouts15,16 and recoded genomes that permit sense codon reassignment17,18,19,20.

A less explored, but critical factor lies in the intracellular bioavailability of ncAAs. In most GCE applications, ncAAs are added exogenously to cells and taken up by passive diffusion or via native amino acid transporters. This often results in low intracellular concentrations, which is especially detrimental for aaRS/ncAA pairs with low catalytic efficiency, for which aminoacylation operates below optimal conditions10,21. Furthermore, reliance on passive diffusion and endogenous importers restricts the available design space for novel building blocks. For a small number of ncAAs, efforts in engineering biosynthetic pathways to produce them directly within cells have overcome some of these limitations, but such approaches require substantial strain development and are currently applicable to only a narrow range of functionalities22,23,24,25,26.

Engineering membrane transport systems presents a promising, yet underexplored, strategy to enhance ncAA uptake and has potential to be widely applicable. Prior work has investigated the substrate scope of a periplasmic leucine-binding protein towards known ncAAs27 and ‘Trojan horse’ strategies, in which ncAAs are conjugated to carrier groups that facilitate recognition and uptake by transporters28,29,30,31,32.

Here we leverage a modular propeptide-based strategy coupled with engineering of a bacterial ABC transporter for programmable import of ncAAs, enabling their efficient encoding in E. coli. We demonstrate that isopeptide-linked tripeptides (Z-XisoK, where Z and X are natural or non-canonical residues) are actively imported into E. coli via the oligopeptide permease (Opp) and processed intracellularly, resulting in high accumulation of Z and XisoK. Using G-XisoK scaffolds, we efficiently incorporate 11 previously inaccessible XisoK ncAAs bearing functionalities such as bioorthogonal handles, crosslinkers and PTMs. We further devise a directed evolution platform to reprogramme the periplasmic binding protein of the transporter (OppA) for preferential uptake of G-XisoK tripeptides over competing linear peptides that are present in commonly used growth media. Expanding this approach, we adapt our platform for importing Z-XisoK tripeptides with diverse Z groups, including bulky or negatively charged ncAAs that are cell-impermeable on their own. Genomic integration of evolved OppA variants creates E. coli strains that are tailored for efficient single and multi-site ncAA incorporation. Finally, we adapt our scaffolds for the incorporation of two distinct ncAAs, mediated by their concomitant transport via a single tripeptide. Together, our results establish transporter engineering as a powerful strategy to unlock and customize ncAA import for the efficient production of proteins with an expanded alphabet.

G-AisoK is transported into E. coli

Previous work in our group combined transpeptidases with GCE to generate defined protein–protein conjugates. By site-specifically encoding an azide-caged diglycine acceptor motif (AzGGisoK) (Supplementary Fig. 1a) followed by on-protein Staudinger reduction, GGisoK-bearing proteins can undergo transpeptidation with donor proteins bearing a C-terminal recognition sequence. We applied this strategy to generate ubiquitin (Ub)- and Ub-like modifier (Ubl)–POI conjugates using sortase or an asparaginyl endopeptidase as transpeptidases33,34,35.

To diversify the linker sequence in the generated protein conjugates, we explored site-specific incorporation of ncAAs resembling a general G-XisoK scaffold (Fig. 1a). Supplementing E. coli K12 with the alanine-bearing G-XisoK tripeptide (G-AisoK; Fig. 1a) enabled efficient amber suppression of superfolder GFP (sfGFP-N150TAG) using the wild-type Methanosarcina barkeri pyrrolysine-tRNA synthetase/tRNA pair (wt-MbPylRS/PylT), with yields comparable to those of wild-type sfGFP production and similar to those obtained with the gold-standard ncAA BocK (Fig. 1a,b). Mass spectrometric analysis revealed site-specific incorporation of AisoK (Fig. 1c), suggesting intracellular cleavage of the N-terminal glycine, either on the free ncAA, co-translationally or post-translationally. By contrast, direct supplementation of K12 with AisoK resulted in negligible sfGFP production (Fig. 1b). This was corroborated by live-cell sfGFP fluorescence measurements, which showed minimal signal with AisoK, whereas G-AisoK induced earlier and stronger fluorescence than BocK (Fig. 1d).

Fig. 1: Isopeptide-linked tripeptides are privileged scaffolds for efficient E. coli uptake.

a, Chemical structures of G-XisoK, XisoK and BocK. X is alanine in G-AisoK and AisoK. b, SDS–PAGE analysis of wild-type sfGFP (wt-sfGFP) and sfGFP-N150TAG expression in K12 bearing wt-MbPylRS/PylT in the presence of 2 mM BocK, AisoK or G-AisoK. Asterisk indicates truncated protein. Consistent results were obtained over three independent replicate experiments. c, LC–MS analysis of sfGFP-N150AisoK. Calc., calculated molecular mass; obs., observed molecular mass. d, Time-course measurements of sfGFP fluorescence from K12 cultures expressing sfGFP-N150TAG and wt-MbPylRS/PylT in the presence of G-AisoK, AisoK or BocK, or grown in the absence of ncAAs. Consistent results were obtained over three independent replicate experiments. e, Extracted ion chromatograms for determining intracellular concentrations of G-AisoK and AisoK by an LC–MS assay, performed on K12 cell extracts. Intracellular G-AisoK concentrations in K12 grown with 2 mM G-AisoK are negligible. Intracellular AisoK concentrations in K12 grown with 2 mM G-AisoK are 5- to 10-fold higher than when grown with 2 mM AisoK. Consistent results were obtained over three independent replicate experiments. f, Proposed model for increased AisoK incorporation. The tripeptide G-AisoK is actively taken up via an E. coli transporter. Within the cytosol, G-AisoK is processed to AisoK, which is a substrate for wt-MbPylRS/PylT and is incorporated site-specifically into a POI.

Similarly, G-AisoK-mediated AisoK incorporation was observed for other amber-containing target proteins (Supplementary Fig. 1b). To investigate the underlying mechanism, we performed liquid chromatography–mass spectrometry (LC–MS)-based uptake assays21,36. We did not detect any intracellular G-AisoK after G-AisoK supplementation, but AisoK accumulated at fivefold to tenfold higher concentrations compared with supplementing K12 with AisoK directly (Fig. 1e and Supplementary Fig. 1c).

These findings led us to hypothesize that a specific transport mechanism actively imports G-AisoK into cells. Within the cytosol, G-AisoK is enzymatically processed to AisoK, which accumulates in high concentrations and serves as a substrate for MbPylRS, leading to efficient AisoK encoding (Fig. 1f).

An ABC transporter enables G-XisoK uptake

In Gram-negative bacteria such as E. coli, small peptides enter the periplasm by diffusion through outer membrane porins37. Within the inner membrane, two major peptide-transporter classes facilitate peptide uptake into the cytosol: proton-dependent oligopeptide transporters (POTs) and ABC transporters5 (Supplementary Fig. 2a). To identify a potential uptake system for G-AisoK, we screened E. coli single-gene knockouts38 with deletions of individual transporters or transporter domains for amber suppression of sfGFP-N150TAG in the presence of the wt-MbPylRS/PylT pair and G-AisoK. We hypothesized that loss of a required transporter would reduce or abolish sfGFP expression. Whereas deletion of POT family members and dipeptide-specific ABC transporters had no effect, individual knockouts of genes constituting the opp operon completely abolished sfGFP expression with G-AisoK (Fig. 2a, Extended Data Fig. 1a and Supplementary Fig. 2b,c).

Fig. 2: The Opp transporter is responsible for efficient G-AisoK uptake.

a, SDS–PAGE analysis (top) and time-course fluorescence measurements (bottom) of sfGFP-N150TAG expression in the presence of BocK, AisoK or G-AisoK in wild-type K12 and in ΔoppA, ΔoppB or ΔoppD knockouts, indicating that the Opp transporter is responsible for G-AisoK uptake. Results for other knockouts can be found in Extended Data Fig. 1. Consistent results were obtained over three independent replicate experiments. Arrow indicates full-length sfGFP, asterisk indicates truncated sfGFP. b, AlphaFold2 predicted structure of the Opp transporter, consisting of the periplasmic binding protein OppA, two TMDs (OppB and OppC) and two NBDs (OppD and OppF). c, Extracted ion chromatograms of E. coli K12 lysates for determination of intracellular AisoK concentrations in wild-type K12 versus ΔoppA. Genomic deletion of oppA results in undetectable AisoK concentrations when growing cells with 2 mM G-AisoK. Consistent results were obtained over three independent replicate experiments. d, SDS–PAGE analysis of sfGFP-N150TAG expression with BocK or G-AisoK in single peptidase knockouts ΔpepN and ΔpepA and the double knockout ΔpepN/pepA. G-AisoK-dependent full-length sfGFP expression is significantly reduced in ΔpepN/pepA, indicating that pepA and pepN are the main peptidases responsible for cleavage of the N-terminal glycine. Results for other knockouts are presented in Supplementary Fig. 3. Consistent results were obtained over three independent replicate experiments. Arrow indicates full-length sfGFP, asterisk indicates truncated sfGFP. e, Proposed mechanism of Opp-mediated uptake. G-AisoK binds to OppA in the periplasm and is shuttled to membrane-bound OppB and OppC. The tripeptide is actively transported into the cytosol in an ATP-dependent manner, where it is cleaved by pepN and pepA to AisoK. OppA, in its apo-form, is released from the TMDs to allow binding of new G-AisoK.

The Opp ABC transporter comprises the periplasmic binding protein (OppA), two transmembrane domains (TMDs) that span the inner membrane (OppB and OppC) and two cytosolic nucleotide-binding domains (NBDs) that drive ATP hydrolysis (OppD and OppF) (Fig. 2b). Peptide-bound OppA docks to the TMDs, triggering ATP-binding and substrate translocation into the cytosol5. Individual deletions of OppA, or any of the two TMDs or NBDs led to complete loss of amber suppression and sfGFP fluorescence with G-AisoK, but not with BocK, indicating Opp-dependent G-AisoK uptake (Fig. 2a and Extended Data Fig. 1a). Uptake assays confirmed that intracellular BocK levels were unchanged in ΔoppA-K12 compared with wild-type K12, whereas AisoK, which accumulated in millimolar concentrations in K12 treated with G-AisoK, was undetectable when oppA was deleted (Fig. 2c and Extended Data Fig. 1b).

To identify the enzyme responsible for processing of G-AisoK to AisoK, we performed amber suppression experiments with G-AisoK using single-gene knockouts that lack specific aminopeptidases38. However, none of the ten tested knockouts exhibited an effect on the amber suppression yield (Fig. 2d and Supplementary Fig. 3a). We therefore generated multi-peptidase knockouts using a CRISPR–Cas12a-based genome editing platform for E. coli39. Notably, only cells with both pepN and pepA deleted (ΔpepN/pepA) showed a marked reduction in sfGFP expression with G-AisoK, whereas amber suppression yields with BocK remained unaffected (Fig. 2d and Supplementary Fig. 3b). Complementation with either pepA or pepN restored sfGFP expression with G-AisoK, indicating that either peptidase is sufficient for G-AisoK processing (Supplementary Fig. 3c).

Together, these findings support a model in which G-AisoK is actively imported via the Opp transporter into the cytosol of E. coli, where it is processed by endogenous peptidases, releasing AisoK for efficient amber suppression (Fig. 2e).

A versatile G-XisoK toolbox

Next, we tested whether amino acids in a general G-XisoK (Fig. 3a) scaffold behaved similarly. Indeed, SisoK, bearing serine instead of alanine, was similarly incorporated in a tripeptide (G-SisoK)-dependent manner (Supplementary Fig. 4a,b). OppA is known to promiscuously bind 2- to 5-amino-acid-long peptides, favouring positively charged side chains. To explore how OppA distinguishes G-XisoK from XisoK, we solved the crystal structure of OppA bound to G-SisoK (Protein Data Bank (PDB) ID: 9RD1; Supplementary Table 5). The structure shows a good overlap with previous ligand-bound OppA conformations40 and adopts the closed state, with G-SisoK enclosed in the binding pocket (Fig. 3b). G-SisoK engages in extensive interactions with OppA through its backbone and termini. The N-terminal glycine forms key hydrogen bonds and electrostatic contacts: its protonated α-amine interacts with D445, whereas the C-terminal carboxylate is stabilized by hydrogen bonds involving the side chains of R439, H397 and N392 (Fig. 3b and Supplementary Fig. 4c). To validate these interactions, we expressed OppA variants in ΔoppA. Expression of wild-type OppA fully restored sfGFP expression with G-SisoK, whereas the D445A variant, which disrupts the interaction with the N-terminal α-amine of G-SisoK, did not rescue expression. Mutations targeting the hydrogen-bonding network at the C terminus of G-SisoK (for example, R439A), had less pronounced effects, suggesting that the OppA binding site possesses some structural flexibility (Supplementary Fig. 4d). These results highlight the essential role of the interaction between the α-amine of glycine and D445 for effective OppA binding and transport.

Fig. 3: A versatile G-XisoK toolbox.

a, Structure of a generalized G-XisoK tripeptide. b, X-ray structure of OppA bound to G-SisoK (PDB ID: 9RD1). G-SisoK forms extensive interactions with OppA residues via its N and C termini and its backbone amide groups. For a detailed description of the interactions, see Supplementary Fig. 4c. c, The OppA–G-SisoK complex around the serine side chain reveals a large cavity that is capable of accommodating bulky side chains. d, All functional groups incorporated via the G-XisoK scaffold. e, SDS–PAGE analysis of sfGFP-N150TAG expression in the presence of either 2 mM XisoK or G-XisoK. All G-XisoK derivatives show higher levels of full-length sfGFP expression using the corresponding PylRS/PylT pairs compared with cells grown with the corresponding XisoK. Arrow indicates full-length sfGFP; asterisk indicates truncated sfGFP. LC–MS analyses of purified sfGFP and Ub variants confirming the incorporation of XisoK derivatives are shown in Supplementary Figs. 5, 6 and 8. MaPylRS, Methanomethylophilus alvus PylRS. f, LC–MS analysis of tyrosinase-mediated labelling of 3C-Ub bearing PisoK at K63 with p-cresol demonstrates quantitative conversion. For details and full data see Extended Data Fig. 2. g, SDS–PAGE analysis of CuAAC labelling of purified eGFPNb-R75PrgisoK with an Atto647-Azide fluorophore. No labelling is observed for eGFPNb-R75BocK. For details and full data see Extended Data Fig. 3a. h, Western blot analysis of GST-dimer crosslinking for GST-E51pLisoK after 365 nm UV illumination. No crosslink is observed for wild-type GST. For details and full data see Extended Data Fig. 3b. i, Western blot analysis of proximity-induced chemical crosslinking between Rab1b-R79ClAisoK and its interactor DrrA-D512C(339–522). Cells expressing both binding partners in the presence of G-ClAisoK display a higher molecular weight band corresponding to the crosslinked complex in both anti-H6 and anti-streptavidin (Strep) blots. Full data and further experiments can be found in Extended Data Fig. 4. e–i, Consistent results were obtained over three independent replicate experiments.

The OppA–G-SisoK crystal structure revealed no specific interactions with the serine side chain, which is accommodated in a spacious pocket (Fig. 3c). This suggests that OppA binding and uptake rely primarily on recognition of the tripeptide backbone and termini rather than side-chain identity. Accordingly, this mechanism may represent a more general concept that is applicable to a variety of ncAAs presented within a G-XisoK scaffold. We thus expanded our propeptide strategy to efficiently incorporate XisoK derivatives bearing functionalities commonly used in GCE, including moieties for site-specific protein conjugation and crosslinking (Fig. 3d). Supplementing E. coli K12 with G-XisoK derivatives—where X represents various side chains—enabled efficient suppression of sfGFP-N150TAG and Ub-K63TAG using either wt-MbPylRS/PylT or suitable synthetase variants identified from an in cellulo screen (Fig. 3e). Mass spectrometry analysis confirmed site-specific incorporation of the respective XisoK dipeptides (Supplementary Figs. 5 and 6), and supplementation with free XisoK derivatives led to minimal protein expression (Fig. 3e).

Efficient genetic encoding of XisoK derivatives is notable, as lysine aminoacylation (at the ε-amino group) with any of the 20 canonical amino acids is a recently identified reversible PTM41,42. Previous attempts at directly encoding SisoK, TisoK, PisoK or CisoK via GCE have proved highly inefficient23,41,43,44 (Fig. 3e). Our strategy offers a high-efficiency alternative, providing a foundation for functional studies on these PTMs. CisoK-modified proteins are also ideally suited for native chemical ligation approaches45. Comparing the CisoK-incorporation efficiencies obtained here with previous yields using specifically evolved PylRS variants23 highlights the benefit of actively importing G-CisoK (Supplementary Fig. 7a), indicating that intracellular ncAA concentration may be more crucial for efficient ncAA incorporation than extensive PylRS engineering.

Site-specific incorporation of XisoK derivatives enables installation of an amino acid with an α-amine moiety, effectively creating a second, artificial N terminus for internal labelling46. For example, G-PisoK uptake allows installation of an internal proline bearing a free α-amine and its labelling with phenol derivatives using a chemoenzymatic approach47. Tyrosinase oxidizes p-cresol to the corresponding o-quinone, which oxidatively couples to the α-amine of proline in PisoK, enabling specific and quantitative labelling of PisoK-modified POIs (Fig. 3f and Extended Data Fig. 2).

When we screened for PylRS variants for bulky or aromatic X side chains, such as HisoK, no hits emerged using G-HisoK. To probe whether this was due to poor Opp transport or lack of appropriate HisoK-specific PylRS variants, we performed directed evolution using a custom-designed MbPylRS library. A novel MbPylRS variant supported HisoK incorporation with G-HisoK, but not with HisoK (Fig. 3e), indicating that OppA also delivers G-XisoK derivatives with bulky or aromatic X side chains. Efficient encoding of synthetically easily accessible histidine-containing ncAAs may expand the range of metal coordination sites in artificial metalloenzymes48.

We further broadened the G-XisoK toolbox with non-canonical X side chains for bioorthogonal labelling6 and crosslinking7,49. A considerable advantage of our propeptide strategy is that G-XisoK derivatives can be easily synthesized at large scales via solid-phase peptide synthesis from commercially available building blocks, overcoming synthetic limitations of previous methods. A propargyl-containing derivative (G-PrgisoK) enabled efficient installation of PrgisoK (Fig. 3e and Supplementary Fig. 8) and subsequent fluorophore labelling via Cu(I)-catalysed azide–alkyne cycloaddition (CuAAC) on an eGFP-specific nanobody (eGFPNb) (Fig. 3g and Extended Data Fig. 3a). PrgisoK incorporation using G-PrgisoK compares favourably with recently reported efficiencies using a dedicated PrgisoK-PylRS variant41 (Supplementary Fig. 7b).

To map and trap protein–protein interactions (PPIs), we used our propeptide strategy to efficiently incorporate crosslinkers. Incorporating photoleucine (pL), a commercially available ncAA, as X in the G-XisoK scaffold enabled diazirine encoding into POIs using a Methanomethylophilus alvus PylRS variant (Fig. 3e and Supplementary Fig. 8). UV-induced crosslinking confirmed functionality by capturing PPIs, exemplified by successful crosslinking of glutathione-S-transferase (GST) and sfGFP dimers (Fig. 3h and Extended Data Fig. 3b).

For proximity-based crosslinking, we designed G-ClAisoK to endow POIs with chloroalanine (ClA). Supplementation of K12 with G-ClAisoK led to efficient ClAisoK incorporation (Fig. 3e and Supplementary Fig. 8), enabling SN2-mediated crosslinking with nearby nucleophiles (such as cysteines) in interacting proteins. By pairwise incorporation of ClAisoK and cysteine residues at protein–protein interfaces, we covalently stabilized various low-affinity PPIs (dissociation constant (Kd) in the micromolar to low millimolar range), including sfGFP homodimers, affibody–protein Z50, and Rab1b–DrrA51 complexes (Fig. 3i and Extended Data Fig. 4). Distances of 8–12 Å between the corresponding Cα atoms could be efficiently crosslinked.

Scalable XisoK encoding via OppA evolution

All tested G-XisoK tripeptides enabled efficient protein production in chemically defined autoinduction (AI) media52, but not in nutrient-rich conditions, such as 2-YT medium (Fig. 4a and Supplementary Fig. 9a,b). We hypothesized that short peptides in tryptone and peptone-rich medium may compete with G-XisoK tripeptides for OppA binding. Supporting this, intracellular SisoK levels were sixfold lower in 2-YT medium compared with AI medium after G-SisoK supplementation, indicating impaired OppA-mediated uptake under nutrient-rich conditions (Supplementary Fig. 10). This poses a challenge for scalable, cost-effective use of the G-XisoK toolbox, as AI media are expensive, cumbersome to prepare and lead to lower biomass, diminishing expression yields of ncAA-modified proteins.

Fig. 4: Scalable XisoK incorporation through OppA evolution.

a, SDS–PAGE analysis of sfGFP-N150TAG expression in the presence of SisoK or G-SisoK in AI or 2-YT medium. In wild-type K12, full-length sfGFP expression yields are significantly reduced when using tryptone-containing 2-YT medium owing to competition between tryptic peptides and G-SisoK for OppA binding. sfGFP expression in 2-YT medium is recovered when using the engineered IsoK12 strain. Consistent results were obtained over three independent replicate experiments. Arrow indicates sfGFP, asterisk indicates truncated sfGFP. b, Scheme for OppA evolution to improve G-SisoK uptake in tryptone-containing medium. OppA libraries were screened for increased G-SisoK uptake under increasing tryptone concentrations by monitoring amber suppression of sfGFP-N150TAG and sorting fluorescent cells. Initial screening of an error-prone library yielded four variants, which were used as basis for creating a cassette mutagenesis library. Screening of this library identified OppA-iso. c, Extracted ion chromatograms to determine intracellular SisoK concentrations. IsoK12 cells grown in 2-YT medium show 7- to 10-fold higher intracellular SisoK concentrations compared with K12 cells when adding 2 mM G-SisoK to 2-YT medium. Consistent results were obtained over three independent replicate experiments. d, Affinity measurements of G-SisoK and a linear GSK peptide (GSK(lin)) towards wild-type OppA (wt-OppA) and OppA-iso using microscale thermophoresis. Data are mean Kd values ± s.e.m. calculated from three biologically independent experiments (n = 3). All data processing was performed using GraphPad Prism 10 (GraphPad software) and MO.affinity Analysis (v.3.0.5, NanoTemper Technologies). e, SDS–PAGE analysis of sfGFP-N150TAG expression in wild-type K12 versus IsoK12 grown in 2-YT medium in the presence of G-XisoK derivatives with X = P, Prg, ClA, C or pL. Full-length sfGFP expression is greatly increased in IsoK12 for all G-XisoK derivatives. Consistent results were obtained over three independent experiments. Asterisk indicates truncated sfGFP. Full gels and gels for other G-XisoK derivatives are presented in Supplementary Fig. 9.

To overcome this, we engineered an OppA variant with increased selectivity for G-SisoK over linear tripeptides. We developed a fluorescence-activated cell sorting (FACS)-based platform to screen an error-prone OppA library in ΔoppA cells containing the wt-MbPylRS/PylT pair and sfGFP-N150TAG through successive enrichment in increasing tryptone concentrations. This system couples uptake of the G-SisoK tripeptide to sfGFP fluorescence (Fig. 4b). Four converging OppA variants with four or five mutations distributed all over the OppA-fold were identified. Mutational hotspots were targeted for saturation mutagenesis and the obtained library was subjected to multiple FACS-based enrichment steps in peptide-rich medium, yielding the final OppA variant (OppA-iso). OppA-iso contains seven mutations, with only R439Q occurring near the binding site (Extended Data Fig. 5a). Genomic integration of OppA-iso into the K12 genome via lambda red-mediated homologous recombination created the IsoK12 strain. IsoK12 and parental K12 showed similar doubling times in AI and 2-YT media (Supplementary Fig. 11a). In 2-YT medium, IsoK12 exhibited sevenfold to tenfold higher intracellular SisoK concentrations compared with K12 when supplemented with G-SisoK, whereas BocK levels remained unchanged, confirming enhanced G-SisoK transport with OppA-iso (Fig. 4c and Supplementary Fig. 10). Notably, IsoK12 restored amber suppression efficiency in 2-YT medium to levels observed for K12 in AI medium (Fig. 4a and Supplementary Fig. 9).

Analysis using microscale thermophoresis showed that wild-type OppA and OppA-iso bound SisoK with similarly low affinity (Kd of around 300 µM), whereas OppA-iso exhibited a slightly improved affinity towards G-SisoK (37 µM) compared with wild-type OppA (50 µM). Notably, the affinity for a linear GSK tripeptide (mimicking linear tryptone peptides) was roughly fourfold weaker for OppA-iso (275 µM) than for wild-type OppA (71 µM), validating its altered selectivity (Fig. 4d and Supplementary Fig. 11b).
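To make the selectivity shift concrete, the reported Kd values can be placed in a simple single-site competitive-binding model. The following is only an illustrative sketch and not part of the reported analysis: the function names are hypothetical, the 2 mM G-SisoK concentration is borrowed from the expression experiments, the 10 mM competitor concentration is an arbitrary assumption, and GSK(lin) stands in for the unknown mixture of peptides in tryptone-containing medium. Binding affinity is also not the same as transport rate, so the numbers indicate a trend only.

```python
# Kd values (in µM) as reported from microscale thermophoresis.
KD_UM = {
    "wt-OppA": {"G-SisoK": 50.0, "GSK(lin)": 71.0},
    "OppA-iso": {"G-SisoK": 37.0, "GSK(lin)": 275.0},
}


def selectivity(kd):
    """Kd(linear) / Kd(isopeptide); values above 1 favour G-SisoK."""
    return kd["GSK(lin)"] / kd["G-SisoK"]


def occupancy(kd, ligand_um, competitor_um):
    """Fraction of OppA bound to G-SisoK under single-site competition.

    Assumes free concentrations equal total concentrations
    (ligands in large excess over protein).
    """
    ligand_term = ligand_um / kd["G-SisoK"]
    competitor_term = competitor_um / kd["GSK(lin)"]
    return ligand_term / (1.0 + ligand_term + competitor_term)


if __name__ == "__main__":
    G_SISOK_UM = 2000.0      # 2 mM, as used in the expression experiments
    COMPETITOR_UM = 10000.0  # ASSUMPTION: hypothetical peptide load of the medium
    for name, kd in KD_UM.items():
        print(name,
              round(selectivity(kd), 2),
              round(occupancy(kd, G_SISOK_UM, COMPETITOR_UM), 2))
```

Under these assumptions, the weaker affinity for the linear peptide, rather than the modest gain in G-SisoK affinity, accounts for most of the predicted increase in G-SisoK occupancy.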

Structural analysis showed that most mutated residues in OppA-iso do not directly contact G-SisoK. One notable exception is R439, which lies within hydrogen-bonding distance of the C terminus of the ligand and is replaced by glutamine in OppA-iso. Since the R439A mutant still supports efficient G-SisoK uptake (Supplementary Fig. 4d), binding is likely to be maintained through compensatory interactions with H397 and N392, which are well positioned to interact with the lysine carboxylate of G-SisoK. By contrast, linear tripeptides such as GSK are dependent on R439 for binding, so its mutation reduces affinity, consistent with structural and binding data (Extended Data Fig. 5b).

Notably, the uptake benefit of OppA-iso extended to other G-XisoK derivatives. IsoK12 grown in 2-YT medium achieved efficient amber suppression for all tested X residues (A, S, T, C, V, L, P, H, Prg, pL and ClA) with minimal truncation and yields matching those obtained in AI medium (Fig. 4e and Supplementary Fig. 9). In fact, tripeptide uptake was so efficient in IsoK12 that G-SisoK concentrations as low as 50–100 µM matched incorporation levels seen in K12 with 1 mM G-SisoK, reducing required ncAA concentrations by a factor of around 10 (Extended Data Fig. 6a).
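The reduction in required tripeptide concentration follows directly from the concentrations quoted above. The short calculation below is only a worked example of that arithmetic; the variable and function names are hypothetical.

```python
def fold_reduction(reference_um, reduced_um):
    """How many times less compound is needed at the reduced concentration."""
    return reference_um / reduced_um


K12_REFERENCE_UM = 1000.0        # 1 mM G-SisoK used with K12
ISOK12_RANGE_UM = (50.0, 100.0)  # 50-100 µM G-SisoK used with IsoK12

# One fold-reduction value per end of the IsoK12 concentration range.
folds = [fold_reduction(K12_REFERENCE_UM, c) for c in ISOK12_RANGE_UM]
print(folds)
```

The quoted range corresponds to a 10- to 20-fold reduction, so the stated factor of around 10 is the conservative end of that range.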

The G-XisoK/IsoK12 system enabled high-yield XisoK incorporation at various positions in a wide range of target proteins (PCNA, β-lactamase, SUMO2, calmodulin, eGFPNb, interleukin-2, human growth hormone, RanGAP and Hsp82) ranging in size from 7 to 85 kDa, including therapeutically relevant examples. Suppression efficiencies surpassed those with BocK and matched wild-type levels (Extended Data Fig. 6b and Supplementary Figs. 12–14). Preparative large-scale production of eGFPNb bearing PrgisoK gave purified protein yields (44 mg l⁻¹) similar to those obtained for wild-type expression (41 mg l⁻¹), exceeding yields from a previously optimized alkyne-ncAA/PylRS combination (Extended Data Fig. 6c).

Increasing intracellular ncAA concentrations via efficient tripeptide uptake also facilitated multi-site amber suppression. We introduced up to three TAG codons into histone H3 (K27, K79 and K122) and expressed the corresponding variants with G-SisoK. IsoK12 outperformed K12 in single, double and triple suppression, achieving wild-type-like expression in the first two cases and significant H3 yields even for triple suppression. By contrast, BocK produced only trace amounts of doubly and triply suppressed variants (Extended Data Fig. 6d and Supplementary Fig. 15).
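One way to see why multi-site suppression is so sensitive to per-site efficiency is a simple independence model, in which each amber codon is read through with the same probability and every failure produces a truncated product. This is an illustrative assumption rather than a model taken from the data, and the function name and example efficiencies are hypothetical.

```python
def full_length_fraction(per_site_efficiency, n_sites):
    """Expected fraction of full-length protein for n independent amber codons.

    ASSUMPTION: every site is suppressed with the same probability and
    suppression events are independent of one another.
    """
    if not 0.0 <= per_site_efficiency <= 1.0:
        raise ValueError("per_site_efficiency must be between 0 and 1")
    if n_sites < 0:
        raise ValueError("n_sites must be non-negative")
    return per_site_efficiency ** n_sites


if __name__ == "__main__":
    # Hypothetical per-site efficiencies, for illustration only.
    for efficiency in (0.5, 0.9):
        print(efficiency,
              [round(full_length_fraction(efficiency, n), 3) for n in (1, 2, 3)])
```

In this picture, even a moderate gain in per-site efficiency compounds with each additional TAG codon, which is consistent with the larger advantage of IsoK12 for double and triple suppression.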

Generalized ncAA uptake using Z-AisoKs

Efficient uptake and processing of G-XisoK results in high intracellular concentrations of both the XisoK dipeptide and cleaved N-terminal glycine. Crystallographic analysis of the OppA–G-SisoK complex revealed a spacious cavity extending from the serine side chain to the N-terminal glycine (Extended Data Fig. 7a). Given the promiscuity of OppA, we hypothesized that other side chains, including those of ncAAs, could also be accommodated at this position, enabling efficient transport of diverse non-canonical tripeptides into the E. coli cytosol. If different N-terminal residues (Z) in a Z-AisoK scaffold (Fig. 5a) are also efficiently cleaved, the strategy could broadly enable intracellular delivery of various amino acids.

Fig. 5: Generalized ncAA uptake using Z-AisoK tripeptides.

a, Structure of tested Z residues within the Z-AisoK scaffold. b, SDS–PAGE analysis of sfGFP-N150TAG expression in the presence of 2 mM of BocK or Z-AisoK tripeptides with wt-MbPylRS/PylT. Expression levels of full-length sfGFP indicate successful transport of the Z-AisoK tripeptides, subsequent cleavage to Z and AisoK and AisoK incorporation. Top, expression in wild-type K12. Bottom, expression in the evolved strain K12-Z2. Consistent results were obtained over three independent replicate experiments. Arrow indicates full-length sfGFP, asterisk indicates truncated sfGFP. c, Scheme for OppA evolution to accommodate novel Z-AisoK substrates. Successful transport by OppA library variants was evaluated by incorporation of AisoK into sfGFP-N150TAG. Variants with high sfGFP fluorescence were enriched via three rounds of FACS. d, X-ray structure of the OppA–G-SisoK complex, highlighting four residues surrounding the N-terminal glycine of G-SisoK that were targeted for site-saturation mutagenesis to enable recognition of novel Z-AisoK substrates. e, Left, SDS–PAGE analysis of sfGFP-N150TAG expression with 0.25 mM LipK or the corresponding Z-AisoK tripeptide 13 with a LipK-specific Methanosarcina mazei PylRS/PylT (MmPylRS/PylT) variant (Y306A/Y384F). Expression levels are highest with peptide 13 in the K12-Z1 strain, which expresses OppA-Z1, evolved specifically for the transport of 13. Right, LC–MS analysis of sfGFP purified from K12-Z1 cultures grown with 13. Observed mass confirms incorporation of LipK. Consistent results were obtained over three independent replicate experiments. f, Dual stop codon suppression using a single isopeptide-linked tripeptide. Left, chemical structure of the tripeptide AcK-pLisoK (16). Middle, SDS–PAGE analysis of sfGFP-N40TAA-N150TAG expression in the presence of either AcK and G-pLisoK (or pLisoK) separately added to medium or in the presence of tripeptide 16 in IsoK12. Right, LC–MS analysis of purified sfGFP confirms dual ncAA incorporation (AcK and pLisoK) after the addition of tripeptide 16. Consistent results were obtained over three independent replicate experiments.

We synthesized a panel of 14 Z-AisoK tripeptides bearing either natural amino acids or ncAAs with diverse side chains, including bulky or negatively charged groups with poor or negligible cell permeability, as N-terminal Z residues (Fig. 5a). To monitor uptake and cleavage of Z, we assessed AisoK incorporation into sfGFP-N150TAG using the wt-MbPylRS/tRNA pair. For 8 out of 14 Z residues, supplementation of K12 with the corresponding Z-AisoK tripeptides yielded amber suppression efficiencies resembling those obtained with G-AisoK (Fig. 5b and Extended Data Fig. 7b), with LC–MS confirming AisoK incorporation in all cases (Supplementary Fig. 16a). No sfGFP expression was observed in ΔoppA cells, confirming dependence on OppA-mediated uptake (Extended Data Fig. 7b).

By contrast, tripeptides bearing bulkier or negatively charged Z side chains (compounds 5, 6 and 12–15; Fig. 5a) resulted in reduced (5 and 6) or completely abolished (12–15) amber suppression (Fig. 5b and Extended Data Fig. 7b), suggesting inefficient uptake and/or cleavage. To address this, we leveraged our OppA engineering platform to evolve new OppA variants capable of accommodating tripeptides with larger or negatively charged Z residues (Fig. 5c). Guided by our OppA–G-SisoK structure, we selected four residues around the N-terminal glycine of G-SisoK for site-saturation mutagenesis (Fig. 5d and Extended Data Fig. 7a) and subjected the corresponding library to multiple FACS-based enrichments in the presence of tripeptides 13 and 15. From these screens, we isolated two OppA variants with enhanced uptake: OppA-Z1 (evolved with 13) and OppA-Z2 (evolved with 15). Both variants feature small side chains at the targeted positions, which are likely to increase the size of the binding pocket to accommodate larger substrates, and in the case of OppA-Z2, a non-programmed R439H mutation. These findings provide further evidence that the R439 variant can enhance binding of isopeptide-linked scaffolds.

We genomically introduced these OppA variants, generating K12-Z1 and K12-Z2 strains, respectively. Supplementing K12-Z1 with bulky Z tripeptides (5, 6, 12 and 13) enabled efficient AisoK incorporation, but uptake of tripeptides with negatively charged Z residues (14 and 15) was not supported by this engineered strain (Extended Data Fig. 7c). By contrast, K12-Z2 enabled efficient uptake of all 14 Z-AisoK tripeptides, including those bearing negatively charged ncAAs such as SucK and GluK as Z residues (Fig. 5b, Extended Data Fig. 7c and Supplementary Fig. 16b).

To demonstrate that tripeptide-based uptake is superior to direct ncAA supplementation, we focused on selected Z residues that are known to have low suppression efficiencies, for which limited uptake was suspected to be the main bottleneck. We compared amber suppression yields after supplementing cells either with Z directly or with the Z-AisoK tripeptide. Compound 3 (with acetyl-lysine (AcK) as the Z residue) yielded higher AcK incorporation than AcK alone using the MbPylRS variant AcKRS353 (Extended Data Fig. 8a), highlighting the benefit of active delivery. Notably, compound 13 (bearing LipK as Z residue) or LipK alone led to negligible expression of full-length sfGFP in the presence of the LipKRS/tRNA pair54. By contrast, K12-Z1 supplemented with 13 enabled efficient sfGFP production, with LipK incorporation confirmed by LC–MS (Fig. 5e). Similar improvements were observed for tripeptides 5 and 6 in K12-Z2, confirming that evolved OppA variants enable effective delivery of tripeptides that are impermeable in wild-type strains (Extended Data Fig. 8b).

Efficient encoding of two distinct ncAAs

Given the adaptability of OppA in substrate recognition, we envisioned a broadly applicable strategy to co-deliver two ncAAs using a single, easily synthesized Z-XisoK tripeptide. To leverage such a mechanism for site-specific dual ncAA incorporation into a single protein, we designed tripeptide 16, bearing AcK (a PTM) as Z and pLisoK (a photocrosslinker) as XisoK (Fig. 5f). Such a setup would represent an ideal tool to investigate PTM-specific protein interactors or to chemically stabilize transient POI–reader complexes55.

Notably, the PylRS/PylT pairs for AcK and pLisoK are mutually orthogonal56 (Extended Data Fig. 8c). After OppA-mediated uptake and cleavage of 16, both AcK and pLisoK accumulate intracellularly, allowing dual suppression of TAA and TAG codons within the same target protein (sfGFP-N40AcK-N150pLisoK; Fig. 5f), using the respective PylRS/PylT pairs. Protein yields were significantly higher when cells were supplemented with tripeptide 16 than when free AcK and G-pLisoK were added separately, underscoring the enhanced efficiency of transporter-mediated delivery. LC–MS of full-length sfGFP confirmed dual incorporation of AcK and pLisoK (Fig. 5f). Mutual orthogonality and accurate decoding of both ncAAs were furthermore shown using a Ub–SUMO2 fusion construct containing a Tobacco Etch Virus (TEV) protease site (Ub-K48TAG-TEV-SUMO2-K11TAA). LC–MS analysis after TEV cleavage revealed specific pLisoK incorporation at TAG48 of Ub and AcK encoding in response to TAA11 of SUMO2 (Extended Data Fig. 8d).

Discussion

Low protein yields and the limited chemical accessibility of many ncAAs remain major obstacles to the routine application of GCE for the generation of proteins with therapeutic or biotechnological potential. Here we overcome these limitations by combining a modular propeptide strategy with the programmed ‘hijacking’ of the Opp transporter, enabling active and tailored import of a broad range of ncAAs, including those that have historically been refractory to efficient uptake and encoding. Isopeptide-linked Z-XisoK tripeptides act as Trojan horses that can be readily synthesized from commercially available building blocks via solid-phase peptide synthesis, and function as privileged ligands for the periplasmic binding protein OppA, enabling their ATP-driven transport into the cytosol. Once inside the cell, Z-XisoK tripeptides are enzymatically processed, leading to intracellular accumulation of isopeptide-linked XisoK ncAAs and Z residues. Using a FACS-based directed evolution approach, we reprogrammed OppA to selectively discriminate against linear peptides that are present in complex media. The resulting engineered E. coli strain, IsoK12, enables cost-effective, high-yield production of modified proteins in nutrient-rich conditions using minimal tripeptide concentrations. This platform allows robust incorporation of 11 previously inaccessible XisoK ncAAs, expanding the chemical space available through GCE. Among these functionalities are ncAAs bearing bioorthogonal handles, novel PTMs, chemical and photocrosslinkers, and functional groups for chemoenzymatic ligations, all of which we demonstrated in proof-of-principle applications. The positively charged side chains of XisoK ncAAs make them ideal moieties for applications such as protein labelling, for which site-directed incorporation at surface-exposed positions is essential. 
This strategy avoids aggregation and misfolding that are often associated with bulky, hydrophobic ncAAs commonly used for labelling purposes6. With these advantages, we anticipate that the XisoK toolbox can be further expanded through aaRS engineering to include functionalities for inverse-electron-demand Diels–Alder cycloadditions or spectroscopic probes.

Notably, uptake and processing of Z-XisoK tripeptides can also be leveraged for customized delivery and intracellular accumulation of challenging Z residues that typically lack cell permeability. By synthesizing a panel of 14 different Z-AisoK tripeptides, we demonstrate that AisoK incorporation into sfGFP serves as a straightforward readout for efficient tripeptide uptake and processing, thereby eliminating the need for intensive mass spectrometry-based uptake assays21. Our platform provides a foundation for developing OppA variants with specific binding sites for tripeptides carrying otherwise cell-impermeable Z residues, such as bulky and negatively charged groups. Thus, our innovation enables the intracellular delivery of these challenging ncAAs. Site-specific incorporation of Z residues from Z-XisoK significantly outperforms direct Z supplementation. As OppA evolution for a specific Z residue is decoupled from availability of a Z-specific orthogonal aaRS, our system presents a practical and modular strategy for dissecting ncAA uptake from its co-translational incorporation, complementing recent efforts in decoupling and optimizing individual GCE steps towards synthesis of non-canonical biopolymers in E. coli57,58. The current system relies on orthogonality of Z incorporation to AisoK-selective PylRS variants, but future efforts will focus on relaxing this constraint by diversifying the XisoK scaffold to further expand the versatility and applicability of the system. Notably, the ability to co-import two distinct ncAAs via a single Z-XisoK scaffold enables efficient dual ncAA incorporation. We envision extending our concept towards multi-ncAA encoding in synergy with advances in non-canonical polymer biosynthesis57,58 and E. coli strains with compressed genomes18,19,20.

Methods

Expression of ncAA-bearing proteins

Chemically competent E. coli K12 cells were co-transformed with pBAD_POI (encoding the POI with a C-terminal H6 tag) and pEVOL_PylRS (encoding two copies of the PylRS variant and tRNACUA). After recovery in 1 ml SOC medium for 1 h at 37 °C, cells were cultured in 5 ml of 2-YT medium supplemented with ampicillin (100 µg ml−1) and chloramphenicol (50 µg ml−1) and grown overnight at 37 °C. The overnight culture was then diluted to an OD600 of 0.05 in either 2-YT or AI media52 supplemented with ampicillin (100 µg ml−1), chloramphenicol (50 µg ml−1) and ncAA. Cells in 2-YT medium were grown to an OD600 of 0.6, induced with 0.05% L-arabinose and then grown overnight at 37 °C. Cells in AI media52 were directly grown overnight at 37 °C. Cells of the overnight culture were collected by centrifugation at 4,000g for 10 min at 4 °C and the pellets were analysed by SDS–PAGE or stored at −20 °C for further use.

For the incorporation of XisoK ncAAs into proteins for purification and downstream characterization, the corresponding G-XisoK peptides were used during expression. The dipeptide XisoK was only used for comparing expression levels in whole lysate. For the incorporation of Z ncAAs, both the free amino acid Z and the corresponding Z-AisoK tripeptide were used. Proteins expressed in the presence of Z-AisoK tripeptides were purified and analysed via LC–MS to confirm Z or AisoK incorporation. For dual incorporation, peptide 16 was used for purification experiments.

Purification of His6-tagged proteins

Cell pellets were resuspended in lysis buffer (20 mM Tris-HCl pH 8.0, 300 mM NaCl, 30 mM imidazole, 1 mM PMSF) and the cells were lysed via sonication in an ice water bath. Lysed cells were centrifuged at 14,000g for 20 min at 4 °C and cleared lysate was incubated with Ni Sepharose fast flow beads (1 ml slurry per 1 l culture, Cytiva) pre-equilibrated with Ni-NTA wash buffer (20 mM Tris-HCl pH 8.0, 300 mM NaCl, 30 mM imidazole) and incubated on a roller for 1 h at 4 °C. Beads were then washed with 10 column volumes of wash buffer 3 times and protein was eluted with elution buffer (20 mM Tris-HCl pH 8.0, 300 mM NaCl, 300 mM imidazole). Resulting protein was either directly used for mass determination or further purification was performed via gel filtration using a Superdex 75 increase 10/300 GL column (Cytiva) equilibrated with SE buffer (2× PBS pH 7.0 for eGFPNb or 20 mM potassium phosphate buffer pH 6.5, 100 mM NaCl for PisoK-bearing proteins or 1× PBS pH 7.4 for Affibody and ProteinZ). Fractions containing protein were pooled and concentrated using Amicon centrifugal filter units with the appropriate molecular weight cut-off (MWCO). Protein was then flash frozen in liquid nitrogen and stored at −80 °C till further use. Proteins bearing CisoK were treated with methoxyamine (10 µM protein, 100 mM methoxyamine at pH 4.0 ammonium acetate) and buffer exchanged in 1× PBS before storage.

Generation of an OppA error-prone library

An error-prone library of oppA was generated using an error-prone PCR kit (Jena Bioscience PP-102). Primers OppA_EP_fwd and OppA_EP_rev (Supplementary Table 2) were used to amplify oppA from pEVOL_MbPylRS_oppA according to the manufacturer’s instructions, running for 25 cycles and adding 2.5 µl error-prone solution. The resulting amplicon was purified on a 1% agarose gel and digested with NdeI and PstI-HF. The digested insert was then ligated with T4 ligase into a backbone generated by digesting pEVOL_MbPylRS_oppA with NdeI and PstI-HF, followed by dephosphorylation with Antarctic Phosphatase. Electrocompetent DH10β cells were then transformed with the purified ligation. After recovery in SOC medium for 1 h at 37 °C, cells were added to 50 ml of 2-YT medium supplemented with chloramphenicol (50 µg ml−1) and grown overnight at 37 °C. Library plasmid from the overnight culture was purified and used for subsequent steps. The size of the library was determined by plating the freshly transformed cells in a dilution series on LB agar plates supplemented with chloramphenicol (50 µg ml−1) and was calculated to be 4 × 10⁶. Multiple single clones were sequenced to determine the average error rate, which was optimized to ~0.8%, or an average of 5 amino acid mutations per gene.

Generation of OppA site-saturation libraries

For the tryptone resistance site-saturation library, four hotspot positions in OppA were selected for site-saturation, on the basis of sequenced variants from selection of the error-prone library. Positions were chosen if mutations occurred more than once (R439 and S460) or if mutations occurred close to each other in different variants (D221 and W222).

For the site-saturation library targeting the pocket of the Z position, four positions were selected on the basis of the X-ray crystal structure of wt-OppA–G-SisoK around the N-terminal glycine of G-SisoK (V60, S63, L530 and N532) to accommodate new N-terminal amino acids.

Sites were randomized using degenerate trimer primers (Ella Biotech) or NNK primers. For the tryptone resistance library, a mix of templates based on pEVOL_MbPylRS_oppA containing wild-type OppA and four variants from the error-prone library were amplified with library primers (Supplementary Table 2). For the Z position library, pEVOL_MbPylRS_oppA_wt was used as a template. Both tryptone resistance template mix and Z library wild-type OppA template were amplified with their respective library primers (Supplementary Table 2). The linear amplicon was then purified on a 1% agarose gel and digested with BbsI-HF and DpnI. Digested fragments were circularized via ligation with T4 ligase and electrocompetent DH10β cells were transformed with the circularized library plasmid. After recovery in SOC medium for 1 h at 37 °C, cells were added to 50 ml of 2-YT medium supplemented with chloramphenicol (50 µg ml−1) and grown overnight at 37 °C. Library plasmid was purified from the overnight culture and used as the template for the next round of amplification to randomize the next position(s).

For the tryptone resistance library, this process was repeated 3 times, each time randomizing a different position, to give a library with positions D221, W222, R439 and S460 mutated to all 20 amino acids on wild-type OppA and 4 error-prone variants. The library size was 8 × 10⁵. For the Z library the process was repeated 2 times to cover all 4 positions (V60, S63, L530 and N532). The library size was 1 × 10⁶.
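As a rough plausibility check on the library design, the theoretical diversity of a four-position site-saturation library can be sketched as below. This is an illustrative calculation, not the authors' method for determining library size. It assumes that trimer primers encode exactly one codon per amino acid (20 options per position) and that NNK primers give 32 codons per position; which primer type was used for which library is not stated here, so both cases are shown. The function names are hypothetical.

```python
# Illustrative diversity estimate for 4-position site-saturation libraries.
# Assumptions (not from the source): trimer primers give 20 variants per
# position; NNK gives 4 * 4 * 2 = 32 codons per position.

AMINO_ACIDS_PER_POSITION = 20
NNK_CODONS_PER_POSITION = 4 * 4 * 2  # N = A/C/G/T, K = G/T


def protein_diversity(positions: int, templates: int = 1) -> int:
    """Number of distinct protein variants across all template backgrounds."""
    return templates * AMINO_ACIDS_PER_POSITION ** positions


def nnk_codon_diversity(positions: int) -> int:
    """Number of distinct DNA sequences when every position is NNK."""
    return NNK_CODONS_PER_POSITION ** positions


# Tryptone resistance library: 4 positions on wild-type OppA plus
# 4 error-prone variants, i.e. 5 template backgrounds.
tryptone_library = protein_diversity(positions=4, templates=5)

# Z library: 4 positions on a single wild-type template.
z_library_protein = protein_diversity(positions=4)
z_library_nnk = nnk_codon_diversity(positions=4)

print(tryptone_library, z_library_protein, z_library_nnk)
```

Under these assumptions the theoretical diversities are on the same order of magnitude as the library sizes reported above, which suggests the libraries were large enough to sample most of the designed variants at least once.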

FACS-based screening protocol

Electrocompetent E. coli K12 ΔoppA cells containing sfGFP reporter plasmid pBAD_sfGFP_N150TAG_H6 were transformed with the OppA library (pEVOL_MbPylRS_oppA_EP_lib, pEVOL_MbPylRS_oppA_trimer_lib, pEVOL_MbPylRS_oppA_Z_lib or pEVOL_MbPylRS_oppA_wt). After recovery in 1 ml SOC for 1 h at 37 °C, transformed cells were diluted in 50 ml of non-inducing medium (AI medium19 without arabinose) supplemented with ampicillin (100 µg ml−1) and chloramphenicol (50 µg ml−1) and grown overnight. The overnight culture was then diluted in non-inducing medium and grown to an OD600 of 0.6.

At this point, 0.05% arabinose was added to induce sfGFP expression and the culture was split into smaller cultures. For the tryptone resistance screening campaign, cultures were supplemented with or without 0.5 mM G-SisoK and varying amounts of tryptone to apply a selection pressure towards OppA variants which preferably bound to G-SisoK.

For the screening of the Z library, cultures were supplemented with 2 mM of each Z-AisoK tripeptide of interest (13 or 15). Cultures were then grown for 4 h at 37 °C to allow for sfGFP expression. To halt growth, the cultures were cooled on ice for 10 min, centrifuged (4,000g, 5 min, 4 °C) and resuspended in ice cold PBS pH 7.0. The PBS cell suspension was sorted on a Sony cell sorter (SH800) using a 70-µm chip sorting for cells with highest sfGFP fluorescence intensity. Gating was decided on the basis of the positive control (tryptone resistance screening: wt-oppA, 0.5 mM G-SisoK, 0 g l−1 tryptone, Z library screening: wt-oppA, 2 mM G-AisoK, 0 g l−1 tryptone) and the negative controls (wt-oppA, 0.5 mM G-SisoK, X g l−1 tryptone, with X being the tryptone concentration used in that round of enrichment, and wt-oppA, 0 mM G-SisoK, 0 g l−1 tryptone). For each round, cells with the top 0.5–2% sfGFP fluorescence were sorted. Sorted cells were recovered in SOC medium supplemented with ampicillin (50 µg ml−1) and chloramphenicol (25 µg ml−1) overnight at 37 °C and the process was repeated for further enrichment. After multiple rounds of enrichment, cells were sorted into a 96-well plate containing SOC supplemented with ampicillin (50 µg ml−1) and chloramphenicol (25 µg ml−1) and grown overnight at 37 °C. Cultures grown from single cells were further evaluated via the fluorescence plate reader assay and variants which showed the desired phenotype were sent for Sanger sequencing. The error-prone library was subjected to 5 rounds of enrichment, each time doubling the tryptone concentration from 1 g l−1 (round 1) to 16 g l−1 (round 5). The site-saturation library was directly grown in LB medium (10 g l−1 tryptone) for 2 rounds of enrichment and 2-YT medium (16 g l−1) for 3 rounds of enrichment. The Z library was subjected to 3 rounds of enrichment at 2 mM of Z-AisoK tripeptide 13 or 15.
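The enrichment scheme described above can be summarized in a short sketch: the tryptone concentration doubles each round, and in each round only the brightest fraction of cells is kept. The code below is a simplified illustration under stated assumptions. On the instrument, gates were set relative to positive and negative controls, whereas here a plain rank-based cutoff on a list of fluorescence values stands in for the gate. Function names and the example intensities are hypothetical.

```python
# Simplified model of the FACS enrichment rounds (illustrative only).

def tryptone_schedule(start_g_per_l: float, rounds: int) -> list[float]:
    """Tryptone concentration per round when it doubles every round."""
    return [start_g_per_l * 2 ** i for i in range(rounds)]


def gate_threshold(intensities: list[float], top_fraction: float) -> float:
    """Lowest intensity still inside the top `top_fraction` of events."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    ranked = sorted(intensities, reverse=True)
    n_keep = max(1, round(len(ranked) * top_fraction))
    return ranked[n_keep - 1]


# Error-prone library: 5 rounds starting at 1 g/l tryptone.
schedule = tryptone_schedule(1.0, 5)

# Hypothetical fluorescence values for 1,000 events; keep the top 1%.
events = [float(i) for i in range(1, 1001)]
cutoff = gate_threshold(events, 0.01)
sorted_cells = [x for x in events if x >= cutoff]

print(schedule, cutoff, len(sorted_cells))
```

A rank-based cutoff makes the selection stringency explicit: sorting the top 0.5–2% of events corresponds to `top_fraction` values between 0.005 and 0.02.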

Preparation of E. coli lysates for LC–MS based uptake assays

The uptake assay protocol was adapted from previously published protocols21,36. Relevant E. coli strains were transformed with a pBAD plasmid to prevent contamination of cultures. After recovery in 1 ml SOC for 1 h at 37 °C, cells were cultured in 5 ml of 2-YT or AI medium supplemented with ampicillin (100 µg ml−1) and grown overnight at 37 °C. Overnight cultures were diluted to an OD600 of 0.05 in 5 ml of 2-YT or AI medium supplemented with ampicillin (100 µg ml−1) and ncAA or peptide and grown overnight at 37 °C. The OD600 of the overnight cultures was determined and 12 OD ml were collected by centrifugation at 4,000g for 10 min. Cell pellets were then washed 3 times with 1 ml of cold medium and resuspended in 400 µl of a methanol:water solution (60:40). Cells were lysed via 5 freeze thaw cycles in liquid nitrogen and a 42 °C water bath. Lysate was cleared by centrifugation at 17,900g for 20 min. Five-hundred microlitres of cleared lysate was passed through Amicon centrifugal filter units (Millipore, 3 kDa MWCO) and the flow through was injected onto the LC–MS for analysis. For cultures incubated with BocK, samples were injected onto a Zorbax SB-C18 (Agilent, 4.6 × 150 mm) column and a gradient of 5–95% was used. For cultures incubated with XisoK and G-XisoK peptides, samples were injected on a Poroshell 120 HILIC-Z (Agilent, 2.1 × 100 mm) column using a gradient of 95–10%. The mass spectrometer was set to single ion mode to detect the relevant m/z for each ncAA or peptide.

To determine intracellular concentrations, calibration points of lysate spiked with known ncAA or peptide concentrations were measured. Ion peaks were integrated, and the integral values were plotted against concentration to determine a linear calibration line. Integral values of unknown samples were interpolated on the calibration line to determine lysate concentrations. Intracellular concentrations were estimated assuming 1 OD600 = 8 × 10⁸ cells per ml and the volume of an E. coli cell (0.6 fl).
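The calculation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script. It assumes that "12 OD ml" means an OD600 × volume product of 12, and that the lysate volume equals the 400 µl resuspension volume. The calibration points in the example are invented purely to exercise the code, and all function names are hypothetical.

```python
# Sketch: calibration line -> lysate concentration -> intracellular estimate.

CELLS_PER_OD_ML = 8e8    # cells per ml at OD600 = 1 (from the text)
CELL_VOLUME_L = 0.6e-15  # 0.6 fl per E. coli cell (from the text)


def fit_line(conc: list[float], area: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit: area = slope * conc + intercept."""
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(area) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, area))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lysate_concentration(area: float, slope: float, intercept: float) -> float:
    """Read a concentration off the calibration line for a peak area."""
    return (area - intercept) / slope


def intracellular_concentration(
    lysate_conc: float,
    od_ml: float = 12.0,               # assumption: OD600 x ml harvested
    lysate_volume_l: float = 400e-6,   # assumption: resuspension volume
) -> float:
    """Scale lysate concentration by lysate volume / total cell volume."""
    total_cell_volume_l = od_ml * CELLS_PER_OD_ML * CELL_VOLUME_L
    return lysate_conc * lysate_volume_l / total_cell_volume_l


# Hypothetical calibration points (concentration in µM, peak area in counts).
slope, intercept = fit_line([0.0, 1.0, 2.0, 4.0], [10.0, 110.0, 210.0, 410.0])
sample_lysate_um = lysate_concentration(310.0, slope, intercept)
sample_cell_um = intracellular_concentration(sample_lysate_um)
print(sample_lysate_um, sample_cell_um)
```

With these assumptions, 12 OD·ml of culture corresponds to about 9.6 × 10⁹ cells and a total cell volume of about 5.8 µl, so the lysate concentration is multiplied by roughly 69 to estimate the intracellular concentration.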

Determination of Kd using microscale thermophoresis

Microscale thermophoresis was performed on the NanoTemper Monolith NT.115 (NanoTemper Technologies). Wild-type OppA and evolved variants were fluorescently labelled using the Monolith Protein Labeling Kit RED-NHS 2nd Generation (NanoTemper Technologies) and diluted to 100 nM in assay buffer (2× PBS pH 7.0, 0.02% Tween-20). Peptides were diluted to double the highest measured concentration in assay buffer and diluted twofold in a dilution series to give 16 peptide concentrations. Equal volumes of protein and peptide solutions were mixed (final protein concentration of 50 nM) and incubated at room temperature for 30 min. Samples were loaded into capillaries (Monolith NT.115 Capillaries, NanoTemper Technologies) and measured according to manufacturer’s instructions. Three independent replicates were measured for each peptide–protein combination and data analysis was performed with MO.affinity Analysis (v.3.0.5, NanoTemper Technologies).
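The titration layout described above (a peptide stock at twice the highest measured concentration, 16 twofold dilutions, then a 1:1 mix with labelled protein) can be written out as a short sketch. The highest concentration used in the example is a placeholder, since it depends on the peptide, and the function name is hypothetical.

```python
# Sketch of the 16-point twofold dilution series for the binding titration.

def twofold_series(highest_final_conc: float, points: int = 16):
    """Return (stock, final) peptide concentrations for each capillary.

    Stocks start at twice the highest final concentration and are halved
    at each step; mixing 1:1 with protein halves every concentration again.
    """
    top_stock = 2 * highest_final_conc
    stocks = [top_stock / 2 ** i for i in range(points)]
    finals = [c / 2 for c in stocks]
    return stocks, finals


LABELLED_PROTEIN_NM = 100.0                     # protein stock (from the text)
final_protein_nm = LABELLED_PROTEIN_NM / 2      # after 1:1 mixing

# Placeholder: highest final peptide concentration of 1,000 (arbitrary units).
stocks, finals = twofold_series(1000.0)
print(final_protein_nm, finals[0], finals[-1])
```

Sixteen twofold steps span a 2¹⁵-fold (about 33,000-fold) concentration range, which is what allows a single series to bracket the binding transition.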

Generating the IsoK12 strain via homologous recombination

Primers with 50 bp overhangs homologous to regions upstream and downstream of the oppA locus in the E. coli genome were used to amplify oppA-iso from pEVOL_MbPylRS_oppA-iso (Supplementary Table 2).

A single clone of E. coli K12 ΔoppA cells transformed with a pSIJ8 plasmid (Supplementary Table 2) was cultured in 2-YT medium supplemented with ampicillin (50 µg ml−1) and grown at 30 °C, 200 rpm until an OD600 of 0.3 was reached, followed by induction with 15 mM arabinose for the expression of lambda red recombineering genes. After incubation for 45 min at 37 °C, the culture was cooled on ice to halt growth and made electrocompetent. The resulting electrocompetent E. coli K12 ΔoppA cells with expressed recombineering genes were then transformed with the linear DNA fragment encoding oppA-iso. After recovery in 1 ml SOC for 2 h at 37 °C, cells were diluted in 2-YT medium and grown overnight at 37 °C for the curing of thermosensitive plasmid pSIJ8. The overnight culture (containing a mix of ΔoppA and knock-in cells) was diluted, grown to an OD600 of 0.6 and made electrocompetent. These cells were then co-transformed with aaRS plasmid pEVOL_MbPylRS and sfGFP reporter pBAD_sfGFP_N150TAG_H6. Transformed cells were recovered in 1 ml SOC for 1 h at 37 °C, diluted in 2-YT medium supplemented with ampicillin (100 µg ml−1) and chloramphenicol (50 µg ml−1) and grown to an OD600 of 0.6, at which point sfGFP expression was induced with 0.05% arabinose and 0.5 mM G-SisoK was added. The culture was grown for 4 h, cooled on ice, centrifuged (4,000g, 5 min, 4 °C) and resuspended in ice cold PBS pH 7.0. Cells with the highest sfGFP fluorescence were sorted as single cells into a 96-well plate containing 2-YT medium and grown overnight. Clones with the correct genomic insert were confirmed with sequencing of the genomic locus and whole-genome sequencing. Once confirmed, plasmids were cured from the strain via electroporation.

Generating K12-Z1 and K12-Z2 strains via CRISPR-mediated genome editing

Knock-in generation was adapted from a previously published protocol39. In brief, a single clone of E. coli K12 ΔoppA cells transformed with pSIMcpf1 was cultured in 2-YT medium supplemented with hygromycin (150 µg ml−1) and grown at 30 °C and 200 rpm until an OD600 of 0.2 was reached. At this point, expression of lambda red recombineering genes was induced by incubation at 42 °C for 15 min. Cultures were then cooled on ice for 20 min to halt the growth and cells were made electrocompetent. The resulting electrocompetent cells were transformed with pTF_oppA-Z1 or pTF_oppA-Z2, which carried a donor DNA with genes for the respective OppA variant along with 50 bp upstream and downstream homologous regions of the oppA locus, as well as a CRISPR array encoding a guide RNA (gRNA) targeting the FRT site present in the knocked out oppA locus (Supplementary Table 7). After electroporation, cells were rescued in SOC medium for 1 h, plated on LB agar with hygromycin (150 µg ml−1) and spectinomycin (120 µg ml−1) and incubated overnight at 30 °C. Successfully knocked-in clones were confirmed via colony PCR and grown in 2-YT medium with 0.05% arabinose for 5 h at 30 °C and then grown overnight at 37 °C to cure the cells of plasmids. To further confirm successful integration, cells were sent for whole-genome sequencing.

Generating peptidase double knockouts via CRISPR-mediated genome editing

Peptidase knockouts were generated analogously to the K12-Z1 and K12-Z2 strains. E. coli K12 ΔpepN cells from the Keio collection38 were transformed with pSIMcpf1 (Supplementary Table 2) and were prepared as previously described. pTF plasmids (Supplementary Table 1), which carry a donor DNA of 50 bp upstream and downstream of the peptidase genomic locus, as well as a CRISPR array encoding two gRNAs that target the corresponding peptidase, were used (Supplementary Table 7). Colonies were confirmed via colony PCR and whole-genome sequencing. Plasmids were cured as previously described.

Dual stop codon suppression for incorporation of AcK and pLisoK into proteins

Chemically competent E. coli K12 cells were co-transformed with pBAD_POI (either pBAD_sfGFP_N40TAA_N150TAG_H6 or pBAD_Ub_K48TAA_TEV_SUMO2_K11TAG_H6, each with a C-terminal His6 tag) and pEVOL_AcKRS3(TAA)_RBS_MaPylRS_IP(TAG) (encoding AcKRS3 and MaPylRS_IP polycistronically and their respective tRNAs). After recovery in 1 ml SOC medium for 1 h at 37 °C, cells were cultured in 5 ml of 2-YT medium supplemented with ampicillin (100 µg ml−1) and chloramphenicol (50 µg ml−1) and incubated overnight at 37 °C, 200 rpm. The overnight culture was then diluted to an OD600 of 0.05 in AI medium19 supplemented with ampicillin (100 µg ml−1), chloramphenicol (50 µg ml−1) and the respective ncAA and/or peptide. Cells were grown overnight at 37 °C, the overnight culture was collected by centrifugation at 4,000g for 10 min at 4 °C and the pellets were stored at −20 °C until further use. Ub-K48pLisoK-TEV-SUMO2-K11AcK-H6 was purified as described in ‘Purification of His6-tagged proteins’. For cleavage, Ub-K48pLisoK-TEV-SUMO2-K11AcK-H6 was diluted into buffer (PBS pH 7.0 with 3 mM DTT) and incubated with TEV protease (0.1 mg ml−1) for 30 min at room temperature.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.