Abstract
Selective functionalization is central to the structural and functional diversification of pharmacologically important alkaloids, yet the complexity of these scaffolds often hinders chemical modifications. Enzymes such as cytochrome P450 monooxygenases can overcome these challenges by catalyzing stereo- and regioselective transformations at typically inert C–H sites. Here, we present a targeted workflow for mining such enzymes by integrating orthologue-inference bioinformatics to leverage the increasingly available biological data. Comparative analysis of 15 alkaloid-producing plant species generated a focused library of 15 P450s, five of which exhibited activities toward alkaloid scaffolds. Four enzymes from Camptotheca acuminata and Tabernaemontana elegans selectively oxygenated two positions of the anticancer scaffold evodiamine, a compound not reported in either lineage. Structural modelling and mutagenesis revealed that hydrophobic bulk in the active site governs evodiamine positioning and catalytic selectivity. Overall, we demonstrated that combining biocatalysts and substrates from distantly related plants with orthology-guided discovery enables selective functionalization of pharmacologically active alkaloids.
Introduction
Bioactive chemicals from plants and derivatives thereof account for a substantial portion of all prescription drugs worldwide1. Among plant-derived natural products, alkaloids exhibit remarkable diversity in structures and functions2. For example, the quinazoline alkaloid scaffold demonstrates a variety of pharmacological activities, including anticancer (e.g., evodiamine), antimicrobial (e.g., vasicine), and antihypertensive effects (e.g., arborine)3,4,5,6. Similarly, aspidosperma (e.g., tabersonine) and iboga (e.g., ibogamine) alkaloid scaffolds are associated with anticancer (e.g., vinblastine) and psychoactive properties (e.g., ibogaine)7,8. However, further functionalization is often needed to optimize these natural products for therapeutic use. Due to their inherent structural complexity, modification of such molecules with regio- and stereoselectivity, particularly at unactivated C–H bonds, remains a significant synthetic challenge. In contrast, enzyme-based transformations of complex substrates offer high efficiency and selectivity under mild, physiological conditions. Furthermore, enzymatic C–H functionalization, often via oxidation, enables further bond formations and connections at otherwise inert positions9. These capabilities position enzymes as powerful tools for chemoenzymatic synthesis, enabling access to chemical and bioactive space with new and/or enhanced drug-like properties of natural products.
A common biocatalytic transformation is hydroxylation, which improves the hydrophilicity of natural products and facilitates further chemical derivatization of typically unreactive C–H bonds (Fig. 1). Such hydroxylations are frequently catalyzed by cytochrome P450 monooxygenases (P450s), an enzyme family known for their remarkable stereo-, regio-, and chemoselective transformation of unactivated C–H bonds10,11. This catalytic versatility allows chemoenzymatic approaches to overcome challenging or impractical conventional synthetic methods12, such as the enzymatic hydroxylation of the unactivated quinoline ring in camptothecin for the semi-synthesis of the anticancer drugs topotecan and irinotecan12,13 (Fig. 1). While several strategies exist to mine and engineer P450s from humans, bacteria, and fungi14,15,16, plant P450s represent a potentially underexplored source of biocatalysts.
Representative hydroxylation reactions catalyzed by cytochrome P450 enzymes are highlighted in yellow. The P450 enzyme model illustrates tabersonine 16-hydroxylase, predicted by AlphaFold2. Downstream structural modifications to the alkaloid scaffolds following hydroxylation are highlighted in blue. Figure was created using icons from Biorender.com.
Recent advances in sequencing technology and the increasing accessibility of omics data have fueled the elucidation of biosynthetic pathways for several important plant natural products, including the convulsant alkaloid strychnine from Strychnos nux-vomica17, the anticancer agent vinblastine from Catharanthus roseus18,19, and the anti-gout compound colchicine from Gloriosa superba20,21. Moreover, these newly available sequence datasets open avenues to uncover previously uncharacterized biocatalysts with specialized functions. Here, we employed OrthoFinder22, a tool that infers orthology relationships across multiple transcriptomes, to identify enzymes capable of functionalizing alkaloid scaffolds. Using publicly available and in-house transcriptomes from 15 alkaloid-producing plant species, we identified three orthogroups comprising 547 P450 genes. Of these, 39 were predicted to be involved in monoterpenoid indole alkaloid (MIA) oxidation. Fifteen of these candidates, selected from diverse clades based on their phylogenetic relationships, were screened against a panel of 30 multi-ring natural products, including alkaloids and flavonoids (Table 1). This screen revealed four P450s with distinct regioselective oxidation activity toward evodiamine, an alkaloid with anticancer and anti-inflammatory properties. Further structural analysis uncovered features associated with their catalytic selectivity, providing potential targets for future enzyme engineering. This proof-of-concept study highlights the utility of orthology-based enzyme discovery pipelines and systematic screening for uncovering biocatalysts with specialized functions.
Results
Selection of a focused library of P450s using OrthoFinder
To identify P450s capable of functionalizing monoterpenoid indole alkaloids (MIAs), a group of natural products with remarkable structural and pharmacological diversity, we focused on known alkaloid-modifying P450s. These include enzymes that hydroxylate the anticancer tabersonine scaffold (e.g., tabersonine 16- and 19-hydroxylases23,24, tabersonine 3-oxygenase25, and tabersonine 6,7-epoxidases26), the pre-psychoactive ibogamine scaffold (e.g., ibogamine 10-hydroxylase7), and the anticancer scaffold camptothecin (e.g., camptothecin 10- and 11-hydroxylases13). To generate a focused library of candidate alkaloid oxygenases for biochemical screening, we used OrthoFinder22 to identify orthologs of these enzymes. OrthoFinder assigned 547 sequences to orthogroups containing known alkaloid-modifying P450s through an all-versus-all BLAST search and refined these using gene tree inference algorithms, yielding 39 putative orthologs (Fig. 2a and Supplementary Data 1). From this set, we selected 15 candidate P450s from six plant species, prioritizing those with available plant material for cloning and ensuring representation across diverse phylogenetic clades (Fig. 2b, c).
a OrthoFinder analysis identified orthologs of previously characterized alkaloid oxygenases. Bar charts show, for each species, the number of genes assigned to orthogroups (green) and the number of genes identified as orthologs (purple). Plant species from which the final 15 P450 candidates were selected for substrate screening are marked with red dots. Species highlighted in orange encode P450s with confirmed activity against alkaloid scaffolds, while those in blue provided the “bait” sequences. b Fifteen candidate P450s (red) were selected from the ortholog pool and arranged in a neighbor-joining tree with other P450s, including previously characterized alkaloid oxygenases (green) and other functionally characterized P450s (black). c The final set of 15 candidate P450s was manually curated from a pool of 547 orthogroup members.
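The first step of the selection described above, keeping only orthogroups that contain a known "bait" enzyme, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes a tab-separated table laid out like OrthoFinder's Orthogroups.tsv (one orthogroup per row, one column per species, genes separated by commas), and all gene and species names in the toy table are hypothetical. The subsequent gene-tree-based refinement to orthologs is not covered.

```python
import csv
import io


def orthogroups_with_baits(orthogroups_tsv, bait_ids):
    """Return {orthogroup: [genes]} for orthogroups containing at least one bait."""
    baits = set(bait_ids)
    selected = {}
    reader = csv.reader(io.StringIO(orthogroups_tsv), delimiter="\t")
    next(reader)  # skip header: "Orthogroup" followed by one column per species
    for row in reader:
        if not row:
            continue
        group, cells = row[0], row[1:]
        # Each species cell holds zero or more comma-separated gene identifiers.
        genes = [g.strip() for cell in cells for g in cell.split(",") if g.strip()]
        if baits.intersection(genes):
            selected[group] = genes
    return selected


# Toy table with hypothetical identifiers (not taken from the study).
example = (
    "Orthogroup\tspeciesA\tspeciesB\n"
    "OG0000001\tbaitP450, geneA2\tgeneB1\n"
    "OG0000002\tgeneA3\tgeneB2, geneB3\n"
)
hits = orthogroups_with_baits(example, ["baitP450"])
print(hits)
```

In this toy case only the first orthogroup is retained because it is the only one containing the bait sequence.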
Discovery of alkaloid oxygenases
The candidate P450s were cloned into the dual-expression vector pESC-leu2d (45) carrying a cytochrome P450 reductase (CPR) gene. Yeast strains bearing these constructs with confirmed protein expression (Supplementary Fig. 1) were initially screened for oxidative activities against two substrates: the aspidosperma-type alkaloid tabersonine and the iboga-type alkaloid catharanthine (Table 1). In preliminary assays, a Vinca minor P450, Vmi1872, converted tabersonine into a product with an m/z and retention time matching 16-hydroxytabersonine (3), an enzymatic product of a previously reported C. roseus tabersonine 16-hydroxylase (CroT16H2; Supplementary Fig. 2 and Table 2). This result suggests that Vmi1872, which shares 72% amino acid identity with CroT16H2 (Supplementary Fig. 3), catalyzes the formation of 16-hydroxytabersonine (Supplementary Fig. 2, Table 2, and Supplementary Table 1). Intriguingly, two P450s from C. acuminata (Cac4918 and Cac4924) and one from T. elegans (Tel3451), sharing 60–77% sequence identity (Supplementary Fig. 3), also accepted tabersonine and converted it into two additional products, compounds 5 and 6, with m/z 369, consistent with dihydroxylated tabersonine derivatives (+32 amu relative to tabersonine, m/z 337) (Supplementary Figs. 4–6 and Table 2). The discovery of tabersonine-oxidizing activity in these enzymes is notable because their source species are not known to natively produce tabersonine or its derivatives.
Intrigued by the activities of Cac4918, Cac4924, and Tel3451 against tabersonine, we speculated that other P450s in our library might similarly exhibit unexpected substrate selectivity. To test this hypothesis, we screened all 15 candidate P450s against a panel of 30 plant-derived multi-ring natural products, mostly MIAs (Table 1). This screening revealed that Cac4916, Cac4918, and Cac4924 from C. acuminata, along with Tel3451 from T. elegans, were also active against evodiamine (m/z 304), an anticancer and anti-inflammatory alkaloid produced in the order Sapindales, rather than Cornales and Gentianales, the lineages from which these P450s were derived (Fig. 3a, b and Table 2). Cac4916, Cac4918, and Cac4924 share approximately 77% amino acid identity with each other and around 60% identity with Tel3451 (Supplementary Figs. 3, 7). Among these enzymes, Tel3451 showed the highest activity toward evodiamine, followed by Cac4918, while Cac4916 and Cac4924 exhibited substantially lower conversion levels (Fig. 3b and Supplementary Table 1). Tel3451 and Cac4918 each produced compounds 7 and 8, both with m/z 320, consistent with mono-oxidized evodiamine (+16 amu), with 7 as the major product (Fig. 3b). In addition, several minor products with the same m/z were also detected, potentially representing other oxidized derivatives of evodiamine (Supplementary Fig. 8). Cac4916 generated only compound 7, whereas Cac4924 yielded compounds 7, 8, and an additional product, compound 9 (Fig. 3b and Table 2). Based on their m/z values, compounds 7–9 were assigned as hydroxylated derivatives of evodiamine. In addition, Tel3451 and Cac4916 accepted rutaecarpine, a structurally related alkaloid that co-occurs with evodiamine in Evodia rutaecarpa, and produced compounds 10–13, which were predicted from their m/z values to be monooxygenated derivatives of rutaecarpine (Supplementary Fig. 9, Table 2, and Supplementary Table 1).
a Enzymatic reaction scheme shows the conversion of evodiamine into two main hydroxylated products. Structures of compounds 7 and 8 were elucidated by NMR spectroscopy. Bold bonds indicate COSY correlations, and single-headed arrows denote HMBCs. b Extracted ion chromatograms from in vivo feeding assays show substrate (m/z 304) and hydroxylated products (m/z 320). Molecular docking of evodiamine into AlphaFold2-predicted structures of Tel3451 using AutoDock Vina revealed binding conformations consistent with the formation of c compound 7 and d compound 8. The protein surface is shown within 5 Å of the docked substrate and clipped to enhance visualization. Evodiamine, the heme cofactor, phenylalanine residues involved in substrate binding, and the conserved cysteine liganded to the heme iron are shown as ball-and-stick models. Carbon atoms of evodiamine positioned near the heme iron are labeled.
Structure elucidation of evodiamine oxygenase reaction products
To enable structural characterization by NMR spectroscopy, we scaled up yeast cultures expressing Tel3451, which produced compounds 7 and 8 at substantially higher yields than Cac4916, Cac4918, and Cac4924 (Fig. 3b). As Cac4924 yielded products at very low abundance, we could not obtain sufficient materials for NMR-based structural analysis of compound 9. Extraction from a 1.6-L culture expressing Tel3451 with evodiamine as substrate, followed by preparative HPLC, yielded 1.8 mg (7.4% yield) of compound 7 and 0.6 mg (2.5% yield) of compound 8, with a total yield of 9.9%. Both compounds were obtained as brown, translucent crystals.
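The reported percentages are consistent with a mass-based yield relative to the evodiamine supplied. The sketch below reproduces that arithmetic under two assumptions that the text does not state explicitly: that the full 1.6 L of culture contained substrate at 50 µM (the concentration given in the NMR methods), and that yield was calculated as product mass divided by substrate mass. The molar mass used is that of evodiamine (C19H17N3O).

```python
EVODIAMINE_MW = 303.36  # g/mol for C19H17N3O


def substrate_mass_mg(volume_l, conc_um, mw_g_per_mol):
    """Mass of substrate (mg) in a culture of given volume (L) and concentration (µM)."""
    moles = volume_l * conc_um * 1e-6
    return moles * mw_g_per_mol * 1e3


def mass_yield_percent(product_mg, substrate_mg):
    """Isolated product mass as a percentage of the substrate mass supplied."""
    return 100.0 * product_mg / substrate_mg


fed_mg = substrate_mass_mg(1.6, 50.0, EVODIAMINE_MW)  # assumed feeding conditions
yield_7 = mass_yield_percent(1.8, fed_mg)  # compound 7, 1.8 mg isolated
yield_8 = mass_yield_percent(0.6, fed_mg)  # compound 8, 0.6 mg isolated
print(round(fed_mg, 1), round(yield_7, 1), round(yield_8, 1))
```

A molar yield would be slightly lower because each hydroxylated product is about 16 g/mol heavier than the substrate.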
NMR analysis of compound 7 (Supplementary Table 2, Fig. 3a, and Supplementary Data 2) revealed a key heteronuclear multiple bond correlation (HMBC) between the indolic proton H13 and C12 of the evodiamine scaffold. Proton H12 displayed strong coupling with H11 (J = 8.7 Hz), whereas no strong coupling was observed for H9, consistent with hydroxylation at C10. The spectral data aligned with previously reported 1H NMR assignments for 10-hydroxyevodiamine27, identifying compound 7 as 10-hydroxyevodiamine.
Compound 8 showed a similar HMBC from its indolic proton to C12 (Supplementary Table 3, Fig. 3a, and Supplementary Data 2), and COSY correlations among H9–H12 confirmed that the A-ring was unmodified. The presence of only seven aromatic proton signals suggested that hydroxylation occurred on the aromatic E-ring. An HMBC from the N-methyl protons to a quaternary carbon (C1a), together with three HMBCs from aromatic protons to C1a, supported hydroxylation at C3. Alternative hydroxylation sites on the E-ring would yield only two HMBCs to C1a. These data, consistent with previously reported 1H NMR spectra for 3-hydroxyevodiamine28, identified compound 8 as 3-hydroxyevodiamine. Similar to compound 9 produced by Cac4924 from evodiamine, the low yields of rutaecarpine conversion products precluded structural elucidation by NMR. However, based on the characterized products from evodiamine, we predicted that Tel3451 and Cac4916 hydroxylate rutaecarpine at analogous positions, resulting in products 10–13 (Supplementary Fig. 9 and Table 2).
Site-directed mutagenesis to probe substrate-positioning phenylalanine residues
Each of the four evodiamine oxygenases discovered here exhibits distinct catalytic activities and regioselectivities. These differences likely reflect variations in substrate affinity and specificity across the enzymes. Tel3451, Cac4916, and Cac4918 predominantly produced compound 7, whereas Cac4924 generated comparable amounts of three distinct evodiamine derivatives, suggesting divergent substrate-binding orientations and regioselective preferences. The structures of hydroxyevodiamines 7 and 8 revealed that the evodiamine oxygenase Tel3451 catalyzes hydroxylation at both the A- and E-rings of evodiamine, with a preference for A-ring modification (Fig. 3a, b). To investigate the structural basis for this unusual product promiscuity, we performed site-directed mutagenesis on Tel3451, the enzyme exhibiting the highest product yield (Fig. 3b). To guide mutagenesis, the structure of Tel3451 was predicted using AlphaFold229, and docking of heme and evodiamine into the model was performed with AutoDock Vina30 (Fig. 3c, d). Residues within 5 Å of the docked substrate were identified and compared across a multiple sequence alignment of Tel3451 and its close homologs in C. acuminata with varying hydroxylation activity (Cac4916, Cac4918, and Cac4924; Supplementary Fig. 7). The modeled active site of Tel3451 contained three phenylalanine residues (F102, F113, and F201) in proximity to the aromatic rings of evodiamine (Fig. 3c, d). These residues were predicted to contribute to substrate binding and orientation, and their low conservation among the orthologs implies structurally distinct modes of evodiamine engagement (Supplementary Fig. 7). Although these enzymes share the conserved overall fold typical of plant P450s, we hypothesized that differences in the shape and amino acid composition of their substrate-binding pockets (Supplementary Figs. 7, 10) likely account for their distinct substrate preferences and catalytic outcomes.
To test this hypothesis, we mutated each individual phenylalanine to leucine, a bulky and hydrophobic, but non-aromatic residue. To further assess the importance of hydrophobic bulk at these positions, we generated additional variants in which the phenylalanines were replaced with alanine, a smaller non-aromatic residue. In addition, due to the presence of conserved serine at the F102-equivalent position in Cac4916 and Cac4918 (Supplementary Fig. 7), Tel3451 F102 was also mutated to serine. The phenylalanine-to-leucine mutants did not exhibit appreciable changes in product profile compared to Tel3451 wild type (WT) (Fig. 4a and Supplementary Data 1). In contrast, Tel3451 F102A and F102S mutants produced noticeably lower levels of both hydroxyevodiamine products 7 and 8 (Fig. 4a and Supplementary Data 1), indicating that the hydrophobic bulk at position 102 is important for catalytic activity. While the F201A mutation did not substantially alter the product profile, the F113A variant disfavored the formation of 10-hydroxyevodiamine (compound 7) (Fig. 4a and Supplementary Data 1), suggesting a role for F113 in modulating the binding orientation of evodiamine within the active site. Western blot analysis confirmed that the observed differences in enzymatic activity were due to the mutations themselves rather than differences in protein expression (Supplementary Fig. 11). To investigate the structural basis for activity changes in Tel3451 variants, we generated AlphaFold2 models of these mutants (Fig. 4b–d and Supplementary Fig. 12). Unlike other variants, F102A and F102S showed docked evodiamine in a binding mode displaced from the catalytic heme, resulting in a greater distance between the substrate and the iron center (Fig. 4c, d). To evaluate the effect of this displacement, we calculated the distance between the iron center and evodiamine in each model and plotted it against the combined yields of compounds 7 and 8 (Supplementary Fig. 13). 
A distance-dependent decrease in catalytic activity was observed when the iron-evodiamine distance exceeded 6 Å. However, no clear trend was observed in the 3–5 Å range. Kinetic analysis further supported altered substrate binding: Tel3451 WT exhibited the lowest KM (12.3 µM), whereas the variants for which kinetic parameters could be measured showed elevated KM values (32–187 µM) (Supplementary Fig. 14). Kinetic parameters for F102A and F102S could not be determined due to insufficient activity. These findings suggest that phenylalanine substitutions variably affect evodiamine binding, with the F102A and F102S mutations most severely attenuating Tel3451 activity by destabilizing substrate positioning near the catalytic heme.
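One way to obtain an iron-substrate distance from a docked model is to take the shortest distance between the heme iron and any substrate atom. The sketch below illustrates that calculation. The text does not specify which substrate atom was used as the reference, so the nearest-atom choice is an assumption, and the coordinates are hypothetical rather than taken from the models.

```python
import math


def min_distance(point, atoms):
    """Shortest Euclidean distance (Å) from a point to any atom in a list of (x, y, z)."""
    return min(math.dist(point, atom) for atom in atoms)


def exceeds_cutoff(distance, cutoff=6.0):
    """True if the distance is beyond the cutoff (6 Å, the threshold noted in the text)."""
    return distance > cutoff


# Hypothetical coordinates in Å: heme iron at the origin, two substrate atoms.
iron = (0.0, 0.0, 0.0)
substrate_atoms = [(3.0, 4.0, 0.0), (0.0, 0.0, 7.0)]

d = min_distance(iron, substrate_atoms)
print(d, exceeds_cutoff(d))
```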
a In vivo assays showing conversion of evodiamine to 10-hydroxyevodiamine and 3-hydroxyevodiamine by Tel3451 WT and mutant variants. Data are presented as mean ± standard deviation from three independent biological replicates. b–d Structural overlays of evodiamine docked into AlphaFold-predicted structures of Tel3451 WT and mutants: b F102L; c F102A; d F102S. Docking was performed using AutoDock Vina. Evodiamine, heme, phenylalanine residues involved in substrate binding, position 102 substitutions, and the cysteine liganded to the heme iron are shown as ball-and-stick models. Carbon-10 of evodiamine docked into Tel3451 WT, positioned near the heme iron, is labeled. Substrate-binding pockets are shown as clipped surfaces to enhance visualization.
Tel3451 also accepted rutaecarpine, a structural analogue of evodiamine, as a substrate. AlphaFold2 modeling combined with AutoDock Vina docking revealed a steric clash between F201 and the planar scaffold of rutaecarpine (Supplementary Fig. 15). Substitution of F201 with smaller residues (alanine or leucine) enhanced the production of hydroxylated rutaecarpine derivatives (compounds 10–13), supporting the role of F201 in modulating substrate accommodation (Supplementary Fig. 14).
The results above indicate that F102 likely stabilizes evodiamine within the substrate-binding pocket, while F113 appears to modulate its binding orientation in the active site. To probe this further, and in an attempt to enhance the comparatively lower evodiamine oxygenase activity of Cac4916, Cac4918, and Cac4924, we substituted the residues at positions equivalent to F102 and F113 of Tel3451 in these C. acuminata P450 orthologs (Supplementary Fig. 16). While not all mutations led to increased activity relative to the corresponding WT enzymes, we observed an intriguing result from substitution at the F102-equivalent position in Cac4916 (S111F), which completely shifted the product profile from 10-hydroxyevodiamine (7) to 3-hydroxyevodiamine (8) (Supplementary Fig. 16). A similar mutation in Cac4918 (S111F) increased the yield of compound 8, whereas the reverse substitution in Cac4924 (F112S) led to a reduction in its formation (Supplementary Fig. 16). In contrast, substitutions at the F113-equivalent position disrupted activity in Cac4916 (A122F) and Cac4918 (G122F) and, in Cac4924 (V123F), reduced the yield of compound 8 while increasing production of compound 9 (Supplementary Fig. 16). These findings strongly support the role of F102 and F113 in Tel3451, and of their equivalent residues in related orthologs, in governing substrate binding and orientation. This substrate-binding and orienting role is further supported by the fact that substitution of F113 in Tel3451 with glycine (as in Cac4918) reduced its hydroxylation activity on evodiamine to a level comparable to that of Cac4916 WT, while replacement with valine (as in Cac4924) nearly abolished activity (Supplementary Fig. 17). These effects suggest that residue F113 may contribute to shaping the substrate-binding pocket or directly influence substrate positioning.
Discussion
Quinazoline is a common scaffold in clinically important alkaloids, including the antihypertensive drug prazosin31, the diuretic metolazone32, the anticancer agent erlotinib33, and the anti-pneumonia drug trimetrexate34. Evodiamine, a quinazoline indole alkaloid, occurs naturally as the primary bioactive constituent in the fruit of Tetradium ruticarpum, a species widely used in traditional Chinese medicine. Evodiamine exhibits diverse pharmacological properties, including analgesic, anti-inflammatory, anticancer, and antimicrobial activities35. Derivatives such as 10-hydroxyevodiamine (7) and evodiamine 10-phosphate have also been identified as promising multi-target anticancer leads36. Hydroxylation of evodiamine introduces a reactive handle that enables further functionalization, including phosphate conjugation toward promising pro-drugs (Fig. 1). However, conventional chemical hydroxylation methods rely on hazardous reagents such as phosphorus oxychloride and toluene and require elevated temperatures37. Alternative synthetic routes often depend on complex total syntheses starting from pre-hydroxylated intermediates5. Finding evodiamine hydroxylases would offer a sustainable and selective biocatalytic alternative and highlight the increasing potential of plant P450s as selective biocatalysts for scaffold diversification in synthetic biology.
Enzyme discovery efforts typically focus on biological sources known to produce the substrate of interest, frequently overlooking vast phylogenetic diversity that may encode catalytically relevant enzymes. By leveraging OrthoFinder to guide candidate selection across 15 medicinal plants and identify P450 orthologs, we identified four evodiamine hydroxylases from T. elegans (order Gentianales) and C. acuminata (order Cornales), species not known to produce evodiamine. This expands the scope of enzyme discovery beyond species or family-specific searches and provides proof-of-concept that phylogenomic mining can yield functionally relevant enzymes for synthetic biology. The four evodiamine oxygenases exhibit distinct regioselectivities and activities, with Tel3451 being the most active and selective. Different product profiles and substrate-binding preferences among the orthologs correlate with variations in active-site residues, particularly F102 and F113 in Tel3451. Mutational analysis confirmed their role in tuning both activity and product distribution, as exemplified by the S111F substitution in Cac4916, which completely shifted regioselectivity. Future structural studies will be crucial to elucidate the mechanistic basis of substrate recognition and regioselectivity.
In summary, this proof-of-concept study demonstrates how orthologue-inference bioinformatics can be integrated with large-scale omics datasets to enable effective enzyme discovery. The simultaneous identification of P450 variants that catalyze closely related yet distinct oxidative reactions facilitates comparative structure-function analysis, resulting in mechanistic insights as reported here. By expanding the search space beyond plant lineages known to produce the target metabolite, we broadened the scope of alkaloid metabolism and established a foundation for the chemoenzymatic production of quinazoline-based pharmaceuticals.
Materials and methods
Plant materials
Camptotheca acuminata Decne. seeds were a generous gift from Dr. Dean DellaPenna (Michigan State University, USA). Seeds were germinated and cultivated in a growth chamber under controlled conditions of 28 °C day/22 °C night temperature with a 16-h light/8-h dark photoperiod. Upon reaching approximately 15 cm in height, seedlings were acclimated in a greenhouse environment. Catharanthus roseus and Vinca minor plants were obtained from The Greenery (Kelowna, BC, Canada) and maintained under greenhouse conditions.
Chemicals
Tabersonine, catharanthine, voacangarine, rutaecarpine, mitragynine, rauwolscine, luotonin A, ellipticine, and isoliquiritigenin were purchased from Cayman Chemical Company (Ann Arbor, MI, USA). Evodiamine was obtained from Thermo Scientific (Waltham, MA, USA). Camptothecin, dehydroevodiamine, rhynchophylline, yohimbine, ajmalicine, vincamine, and quinazoline were purchased from Millipore Sigma (Oakville, ON, Canada). Strictosamide was acquired from ChemScene (Monmouth Junction, NJ, USA), while hirsuteine, quinine, and vindoline were obtained from Aobious (Gloucester, MA, USA). Pumiloside was purchased from BioCrick (Chengdu, Sichuan, China), and tetrahydroalstonine from AvaChem (San Antonio, TX, USA). Naringenin and quercetin were acquired from TCI Chemicals (Portland, OR, USA) and Acros Organics (Geel, Antwerp, Belgium), respectively. Eburnamonine, polyneuridine aldehyde, and ajmaline were generous gifts from Dr. Sarah O’Connor (Max Planck Institute for Chemical Ecology, Jena, Germany).
OrthoFinder analysis
Publicly available transcriptomes were obtained for various alkaloid-producing plants7,17,38,39,40,41,42,43,44,45,46,47,48,49,50 and the outgroup Arabidopsis thaliana51 (Supplementary Table 4). Transcriptomes lacking coding sequences were processed using TransDecoder v5.5.0 (github.com/TransDecoder/TransDecoder) to predict coding regions, incorporating homology-based open reading frame (ORF) identification via HMMER (hmmer.org) and BLAST+ v2.13.052 searches against the Pfam-A53 and Swiss-Prot54 databases (accessed September 2022). To obtain representative transcripts, coding sequences were clustered with CD-HIT-EST v4.8.155 using local sequence identity, a 98% identity threshold to accommodate next-generation sequencing errors, a minimum coverage ratio of 100% for the shorter sequence, and 0.5% for the longer sequence to allow clustering of transcript variants56. These representative transcripts were subsequently analyzed with OrthoFinder v2.5.522.
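The clustering settings described above map onto standard CD-HIT-EST options. The sketch below assembles such a command as an argument list. The exact invocation used in the study is not given, so the flag mapping is an interpretation of the stated parameters (`-c` identity threshold, `-G 0` for local identity, `-aS` and `-aL` for alignment coverage of the shorter and longer sequence), and the file names are hypothetical.

```python
def cd_hit_est_command(fasta_in, fasta_out):
    """Build a cd-hit-est argument list matching the parameters described in the text."""
    return [
        "cd-hit-est",
        "-i", fasta_in,    # input coding sequences (FASTA)
        "-o", fasta_out,   # output representative sequences
        "-c", "0.98",      # 98% sequence identity threshold
        "-G", "0",         # use local rather than global sequence identity
        "-aS", "1.0",      # alignment must cover 100% of the shorter sequence
        "-aL", "0.005",    # alignment must cover at least 0.5% of the longer sequence
    ]


# Hypothetical file names, for illustration only.
cmd = cd_hit_est_command("species_cds.fasta", "species_cds_nr.fasta")
print(" ".join(cmd))
```

The list form can be passed directly to `subprocess.run` on a system where CD-HIT is installed.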
Phylogenetic analysis
A neighbor-joining phylogenetic tree of the selected candidates and a representative set of plant P450s was generated using the Geneious Tree Builder tool in Geneious v2023.2.1 (Biomatters, Newark, NJ, USA), employing the Jukes-Cantor genetic distance model. The names, abbreviations, and accession numbers of the sequences used to generate the tree are summarized in Supplementary Table 5. Candidate gene sequences were retrieved from publicly available transcriptomes7,17,38,39,40,41,42,43,44,45,46,47,48,49,50.
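For reference, the Jukes-Cantor model converts the observed proportion of differing sites p between two aligned sequences into an estimated distance d = -((k - 1)/k) ln(1 - k p/(k - 1)), where k is the number of character states. The sketch below implements this general formula. Whether the tree was built from nucleotide (k = 4) or amino acid (k = 20) alignments is not stated, so the default of 4 is an assumption, and this is not a reproduction of the Geneious implementation.

```python
import math


def jukes_cantor_distance(p, n_states=4):
    """Jukes-Cantor corrected distance from the observed proportion of differing sites."""
    saturation = (n_states - 1) / n_states  # p at which the correction diverges
    if not 0.0 <= p < saturation:
        raise ValueError("p must satisfy 0 <= p < (k - 1) / k")
    return -saturation * math.log(1.0 - p / saturation)


# Example: 10% observed differences between two aligned nucleotide sequences.
d_nt = jukes_cantor_distance(0.10)
print(round(d_nt, 4))
```

The corrected distance is always at least as large as p because it accounts for multiple substitutions at the same site.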
Construction of plasmids expressing P450 candidates
Total RNA extracts were obtained from the leaves and stems of C. acuminata as described previously57. Full-length coding regions of Cac4918, Cac4919, Cac4924, Cac4925, Cac6007, Vmi1872, Vmi25813, Rse3867, Rse8182, CroT16H2, CroT19H, CroT3O, and CroTEX were amplified from cDNA synthesized by LunaScript reverse transcriptase (New England Biolabs, Whitby, ON, Canada) with total leaf or stem RNA from C. acuminata, V. minor, R. serpentina, or C. roseus. PrimeSTAR Max DNA polymerase (Takara Bio, San Jose, CA, USA) was used for PCR amplification. Synthetic coding sequences of Cac4916, Cac4921, Cac24044, Rst01014932, Tel3451, and Tib11390 were designed based on native sequences from publicly available transcriptomes7,39,47,50,58 and synthesized by Twist Bioscience (San Francisco, CA, USA; Supplementary Table 6). Primers and synthetic constructs incorporated the 5’-CAC TAA AGG GCG GCC AAC AAA ATG-3’ sequence at the 5’ end and 5’-ATC CAT CGA TAC TAG-3’ at the 3’ end to facilitate homology-based cloning (Supplementary Tables 6, 7). Coding sequences were cloned into the NotI and SpeI sites of the dual plasmid pESC-leu2d::CPR59 using InFusion cloning (Takara Bio), generating pESC-leu2d::CPR/P450 constructs for heterologous expression of FLAG-tagged P450 enzymes. Two versions of the dual-expression plasmid pESC-leu2d::CPR were used: one containing C. acuminata CPR1 for co-expression with C. acuminata candidates, and the other containing A. annua CPR for use with all other candidates (Supplementary Table 8).
Expression of P450s, microsome preparation, and immunoblot analysis
The protease-deficient Saccharomyces cerevisiae YPL154C:PEP4 KO strain60 was transformed with pESC-leu2d::CPR/P450 via heat shock. Transformed yeast cells were initially cultured in 5 mL of synthetic complete medium lacking leucine (SC-Leu; BioShop, Burlington, ON, Canada) supplemented with 2% (w/v) glucose at 30 °C overnight. Cultures were then diluted 1:50 into fresh SC-Leu containing 2% (w/v) glucose and grown overnight under the same conditions. Cells were harvested by centrifugation, resuspended in YPA medium (1% w/v yeast extract, 2% w/v peptone, and 0.04 g L–1 adenine sulfate) supplemented with 2% (w/v) galactose, and incubated overnight at 30 °C to induce heterologous protein expression. Microsomes were isolated, and immunoblotting was conducted as described previously13,61. Blots were imaged using a charge-coupled device camera system (Li-Cor, Lincoln, NE, USA).
Enzymatic assays
Transformed yeast bearing pESC-leu2d::CPR/P450 plasmids were cultured overnight at 30 °C in SC-Leu medium supplemented with 2% (w/v) glucose. Cultures were then diluted 1:10 into 100 µL of fresh SC-Leu medium containing 2% (w/v) galactose and 50 µM substrate (Table 1), dissolved in methanol or DMSO, in 96-well plates. The pH was maintained between 6.0 and 7.0 by buffering the medium with 50 mM HEPES-KOH pH 7.5. Plates were incubated for 48 h at 30 °C with shaking at 750 rpm on an Incu-Mixer MP Heated Microplate Vortexer (Benchmark Scientific Inc., Sayreville, NJ, USA) to induce heterologous protein expression and enable catalytic conversion of the substrate. Reactions were quenched by adding 0.5 volumes of methanol. Yeast bearing pESC-leu2d::CPR served as a negative control.
In vitro biochemical characterization of evodiamine oxygenases
Steady-state kinetic assays were conducted at 30 °C in 100 µL reactions containing 100 mM HEPES buffer (pH 7.0), 0.5 mg of total microsomal protein, 500 µM NADPH, and varying concentrations of evodiamine (0–500 µM). Reactions were initiated by substrate addition and quenched by the addition of an equal volume of methanol. Product concentrations were quantified by LC-MS using a standard curve. Initial velocities were plotted as a function of substrate concentration and fitted to the Michaelis–Menten equation using the nls function in R v4.4.162 to calculate kinetic parameters.
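The fit described above estimates Vmax and KM from initial velocities using v = Vmax [S] / (KM + [S]). The sketch below shows an equivalent least-squares fit in Python, using only the standard library, rather than the R `nls` call used in the study. For each trial KM, the best Vmax has a closed form, so the search reduces to one dimension (a coarse log-spaced scan followed by golden-section refinement). The data are synthetic and noise-free, generated with the KM reported for Tel3451 WT and an arbitrary Vmax.

```python
import math


def mm_rate(s, vmax, km):
    """Michaelis-Menten initial velocity."""
    return vmax * s / (km + s)


def _profile(km, s_vals, v_vals):
    """Sum of squared errors and best-fit Vmax for a fixed KM."""
    f = [s / (km + s) for s in s_vals]
    denom = sum(x * x for x in f)
    if denom == 0.0:
        raise ValueError("at least one substrate concentration must be > 0")
    vmax = sum(v * x for v, x in zip(v_vals, f)) / denom
    sse = sum((v - vmax * x) ** 2 for v, x in zip(v_vals, f))
    return sse, vmax


def fit_michaelis_menten(s_vals, v_vals, km_lo=0.01, km_hi=10000.0, grid=400):
    """Least-squares estimate of (Vmax, KM)."""
    # Coarse scan over log-spaced KM values to bracket the minimum.
    kms = [km_lo * (km_hi / km_lo) ** (i / (grid - 1)) for i in range(grid)]
    best = min(range(grid), key=lambda i: _profile(kms[i], s_vals, v_vals)[0])
    a, b = kms[max(best - 1, 0)], kms[min(best + 1, grid - 1)]
    # Golden-section refinement inside the bracket.
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(100):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if _profile(c, s_vals, v_vals)[0] < _profile(d, s_vals, v_vals)[0]:
            b = d
        else:
            a = c
    km = (a + b) / 2.0
    return _profile(km, s_vals, v_vals)[1], km


# Synthetic data: KM = 12.3 µM (reported for WT), Vmax = 2.0 (arbitrary units).
substrate = [0.0, 5.0, 10.0, 25.0, 50.0, 100.0, 250.0, 500.0]
velocity = [mm_rate(s, 2.0, 12.3) for s in substrate]
vmax_fit, km_fit = fit_michaelis_menten(substrate, velocity)
print(round(vmax_fit, 3), round(km_fit, 3))
```

With real, noisy data the estimates would carry uncertainty, which `nls` reports as standard errors and this sketch does not.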
LC-MS analysis
For standard ultra-performance liquid chromatography-mass spectrometry (UPLC-MS) analysis, assay extracts were analyzed using a Waters Acquity UPLC I-Class Plus system coupled to a Xevo TQ-S Cronos triple quadrupole mass spectrometer (Waters, Mississauga, ON, Canada). Chromatographic separation was performed on a Waters Acquity UPLC BEH C18 column (50 mm × 2.1 mm, 1.7 µm particle size) at 30 °C with a flow rate of 0.6 mL min–1. The column was equilibrated with 90% solvent A (water + 0.1% formic acid) and 10% solvent B (acetonitrile + 0.01% formic acid). The elution gradient was as follows: 0–8 min, 10–50% B; 8–8.5 min, 50–100% B; 8.5–9.5 min, 100% B; 9.5–11 min, 100–10% B.
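The gradient program above can be expressed as a small lookup, which is convenient when relating a retention time to solvent composition. This is a minimal sketch that assumes each segment is a linear ramp between the stated breakpoints; the names `GRADIENT` and `percent_b` are illustrative only.

```python
# (time in min, % solvent B) breakpoints taken from the gradient described above
GRADIENT = [(0.0, 10.0), (8.0, 50.0), (8.5, 100.0), (9.5, 100.0), (11.0, 10.0)]

def percent_b(t, program=GRADIENT):
    """Return % solvent B at time t (min), assuming linear ramps between breakpoints."""
    if t < program[0][0] or t > program[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)

# Example: composition at 4 min; solvent A is the remainder
b = percent_b(4.0)
print(f"{b:.1f}% B, {100.0 - b:.1f}% A")
```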
For each substrate, 10 µM authentic standards were analyzed to determine retention times and mass-to-charge ratios (m/z). Targeted analysis was performed in electrospray ionization positive (ESI+) mode using selected ion recording (SIR). The [M + H]+ adduct of each substrate was monitored along with the corresponding monooxygenated (+16 Da), dioxygenated (+32 Da), and dehydrogenated (–2 Da) species. The mass spectrometer was operated with a capillary voltage of 1.50 kV and a cone voltage of 30 V. Data were processed using MassLynx v4.2 (Waters, Mississauga, ON, Canada), and chromatograms were visualized using a modified R script62 developed by Chenxin Li63.
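The set of SIR channels monitored for each substrate follows directly from its [M + H]+ value and the nominal mass shifts listed above. The sketch below is illustrative: the function name `sir_channels` is hypothetical, nominal (integer) shifts are used rather than exact monoisotopic differences, and the evodiamine [M + H]+ value of m/z 304.1 is assumed from its molecular formula (C19H17N3O) rather than taken from the text.

```python
# Nominal mass shifts (Da) for the species monitored alongside each substrate
SHIFTS = {
    "substrate": 0,
    "monooxygenated": 16,
    "dioxygenated": 32,
    "dehydrogenated": -2,
}

def sir_channels(mh_substrate):
    """Return the m/z channels to monitor, given the substrate [M + H]+ value."""
    return {name: round(mh_substrate + shift, 1) for name, shift in SHIFTS.items()}

# Example for evodiamine (assumed [M + H]+ = 304.1)
for name, mz in sir_channels(304.1).items():
    print(f"{name}: m/z {mz}")
```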
NMR analysis
To obtain sufficient product for nuclear magnetic resonance (NMR) spectroscopy, sixteen 10 mL starter cultures of S. cerevisiae YPL154C:PEP4 KO strain expressing the gene of interest were grown overnight at 30 °C in SC-Leu medium containing 2% (w/v) glucose. A total of 1.6 L of fresh SC-Leu medium was distributed into sixteen 2-L flasks, inoculated with starter cultures, and incubated overnight at 30 °C. Cells were harvested by centrifugation at 4000 × g for 5 min and resuspended in SC-Leu medium supplemented with 1.8% (w/v) galactose, 0.2% (w/v) glucose, and 50 µM substrate. Cultures were incubated for 48 h at 30 °C with shaking at 220 rpm and clarified by centrifugation at 4000 × g for 5 min.
The supernatant was extracted six times with chloroform, and the pooled organic phases were washed twice with water, dried over brine, and concentrated in vacuo. The crude product was dissolved in 2 mL of a 1:1 methanol:DMSO mixture and purified by preparative HPLC (Agilent 1260 Infinity II) using a 250 mm × 10.0 mm Kinetex 5 µm EVO C18 100 Å LC column (Phenomenex, Torrance, CA, USA) at a flow rate of 1.5 mL min–1. The column was pre-equilibrated with 90% water and 10% acetonitrile and eluted with the following gradient: 0–5 min, 10–20% acetonitrile; 5–25 min, 20–70%; 25–27 min, 70–90%; 27–30 min, 90%; 30–31 min, 90–10%; 31–35 min, 10%. Fractions were analyzed by LC-MS, and those containing oxidized evodiamine products were pooled and dried using a Genevac EZ-2 evaporator (Genevac, NY, USA).
Dried samples were dissolved in 400 µL of DMSO-d₆ (Millipore Sigma, Oakville, ON, Canada) and analyzed on a Bruker Avance 600 MHz NMR spectrometer (Burlington, ON, Canada) equipped with a z-gradient TCI cryoprobe. Data were processed using MestReNova v14.2.0 (Mestrelab Research, Escondido, CA, USA). Chemical shifts were referenced to residual DMSO-d₆ signals (δH = 2.50 ppm; δC = 39.52 ppm).
Protein structure modeling and substrate docking
Tel3451, an evodiamine hydroxylase with high activity identified in this study, was selected for structural modeling. The nucleotide sequence of Tel3451 was retrieved from a publicly available T. elegans transcriptome50, and the structure of the encoded protein was predicted using AlphaFold229. Heme and substrate structures were obtained from the Human Metabolome Database64 and sequentially docked into the predicted protein structure using AutoDock Vina30. Mutant variants of Tel3451 were modeled and docked following the same procedure.
Site-directed mutagenesis
Point mutations were introduced into Tel3451 by overlap extension PCR using PrimeSTAR Max DNA polymerase (Takara Bio, San Jose, CA, USA) and mutagenesis primers listed in Supplementary Table 9. Overlapping primer pairs were designed to introduce specific mutations and enable homology-based assembly of full-length Tel3451 fragments. Primers targeting the 5′ and 3′ ends included the sequences 5′-CAC TAA AGG GCG GCC AAC AAA ATG-3′ and 5′-ATC CAT CGA TAC TAG-3′, respectively, to facilitate fragment assembly and insertion into the pESC-leu2d::CPR plasmid via In-Fusion cloning (Takara Bio). Mutant variants of Cac4916, Cac4918, and Cac4924 were generated using the same approach.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
References
Veeresham, C. Natural products derived from plants as a source of drugs. J. Adv. Pharm. Technol. Res. 3, 200–201 (2012).
Dewick, P. M. Chapter 6 - Alkaloids. In Medicinal Natural Products, 3rd ed. 311–420 (Wiley, 2009).
Aniszewski, T. Chapter 1 - Definition, Typology and Occurrence of Alkaloids. In Alkaloids - Secrets of Life: Alkaloid Chemistry, Biological Significance, Applications and Ecological Role 1–60 (Elsevier, 2007).
Aniszewski, T. Chapter 2 - Alkaloid Chemistry. In Alkaloids, 2nd ed., 99–193 (Elsevier, 2015).
Dong, G. et al. New tricks for an old natural product: discovery of highly potent evodiamine derivatives as novel antitumor agents by systemic structure–activity relationship analysis and biological evaluations. J. Med. Chem. 55, 7593–7613 (2012).
Khaliq, T. et al. Peganine hydrochloride dihydrate an orally active antileishmanial agent. Bioorg. Med. Chem. Lett. 19, 2585–2586 (2009).
Farrow, S. C. et al. Cytochrome P450 and O-methyltransferase catalyze the final steps in the biosynthesis of the anti-addictive alkaloid ibogaine from Tabernanthe iboga. J. Biol. Chem. 293, 13821–13833 (2018).
Svoboda, G. H., Barnes, J. A. J. & Armstrong, R. J. Leurosidine and leurocristine and their production. Patent US3205220A (1961).
Qin, Y., Zhu, L. & Luo, S. Organocatalysis in inert C-H bond functionalization. Chem. Rev. 117, 9433–9520 (2017).
Nguyen, T. D. & Dang, T. T. Cytochrome P450 enzymes as key drivers of alkaloid chemical diversification in plants. Front. Plant Sci. 12, 682181 (2021).
Rasool, S. & Mohamed, R. Plant cytochrome P450s: nomenclature and involvement in natural product biosynthesis. Protoplasma 253, 1197–1209 (2016).
Chubatsu Nunes, H. H., Nguyen, T.-D. & Dang, T.-T. T. Chemoenzymatic synthesis of natural products using plant biocatalysts. Curr. Opin. Green. Sustain. Chem. 35, 100627 (2022).
Nguyen, T. M. et al. Discovering and harnessing oxidative enzymes for chemoenzymatic synthesis and diversification of anticancer camptothecin analogues. Commun. Chem. 4, 177 (2021).
Charlton, S. N. & Hayes, M. A. Oxygenating biocatalysts for hydroxyl functionalisation in drug discovery and development. ChemMedChem 17, e202200115 (2022).
Di Nardo, G. & Gilardi, G. Natural compounds as pharmaceuticals: the key role of cytochromes P450 reactivity. Trends Biochem. Sci. 45, 511–525 (2020).
Fessner, N. D. P450 monooxygenases enable rapid late-stage diversification of natural products via C-H bond activation. ChemCatChem 11, 2226–2242 (2019).
Hong, B. et al. Biosynthesis of strychnine. Nature 607, 617–622 (2022).
Caputi, L. et al. Missing enzymes in the biosynthesis of the anticancer drug vinblastine in Madagascar periwinkle. Science 360, 1235–1239 (2018).
Zhang, J. et al. A microbial supply chain for production of the anti-cancer drug vinblastine. Nature 609, 341–347 (2022).
Nett, R. S., Lau, W. & Sattely, E. S. Discovery and engineering of colchicine alkaloid biosynthesis. Nature 584, 148–153 (2020).
Nett, R. S. & Sattely, E. S. Total biosynthesis of the tubulin-binding alkaloid colchicine. J. Am. Chem. Soc. 143, 19454–19465 (2021).
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).
Besseau, S. et al. A pair of tabersonine 16-hydroxylases initiates the synthesis of vindoline in an organ-dependent manner in Catharanthus roseus. Plant Physiol. 163, 1792–1803 (2013).
Giddings, L. A. et al. A stereoselective hydroxylation step of alkaloid biosynthesis by a unique cytochrome P450 in Catharanthus roseus. J. Biol. Chem. 286, 16751–16757 (2011).
Qu, Y. et al. Completion of the seven-step pathway from tabersonine to the anticancer drug precursor vindoline and its assembly in yeast. Proc. Natl. Acad. Sci. USA 112, 6224–6229 (2015).
Carqueijeiro, I. et al. Two tabersonine 6,7-epoxidases initiate lochnericine-derived alkaloid biosynthesis in Catharanthus roseus. Plant Physiol. 177, 1473–1486 (2018).
Li, L. et al. Microbial metabolism of evodiamine by Penicillium janthinellum and its application for metabolite identification in rat urine. Enzym. Microb. Technol. 39, 561–567 (2006).
Huang, G., Kling, B., Darras, F. H., Heilmann, J. & Decker, M. Identification of a neuroprotective and selective butyrylcholinesterase inhibitor derived from the natural alkaloid evodiamine. Eur. J. Med. Chem. 81, 15–21 (2014).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Trott, O. & Olson, A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31, 455–461 (2010).
Basquez, R. & Pippin, M. M. Prazosin https://www.ncbi.nlm.nih.gov/books/NBK555959/ (StatPearls Publishing, 2025).
Bond, G., Adnan, G., Dua, A., Singh, K. & Crew, C. M. Metolazone https://www.ncbi.nlm.nih.gov/books/NBK534203/ (StatPearls Publishing, 2025).
Carter, J. & Tadi, P. Erlotinib https://www.ncbi.nlm.nih.gov/books/NBK554484/ (StatPearls Publishing, 2025).
Allegra, C. J. et al. Trimetrexate for the treatment of Pneumocystis carinii pneumonia in patients with the acquired immunodeficiency syndrome. N. Engl. J. Med. 317, 978–985 (1987).
Sun, Q., Xie, L., Song, J. & Li, X. Evodiamine: a review of its pharmacology, toxicity, pharmacokinetics and preparation researches. J. Ethnopharmacol. 262, 113164 (2020).
Chen, S. et al. Water-soluble derivatives of evodiamine: discovery of evodiamine-10-phosphate as an orally active antitumor lead compound. Eur. J. Med. Chem. 220, 113544 (2021).
Pachter, I. & Suld, G. Notes- The structure and synthesis of rhetsinine (hydroxyevodiamine). J. Org. Chem. 25, 1680–1682 (1960).
Berardini, T. Z. et al. The Arabidopsis information resource: making and mining the “gold standard” annotated reference plant genome. Genesis 53, 474–485 (2015).
Kang, M. et al. A chromosome-level Camptotheca acuminata genome assembly provides insights into the evolutionary origin of camptothecin biosynthesis. Nat. Commun. 12, 3531 (2021).
Kellner, F. et al. Genome-guided investigation of plant natural product biosynthesis. Plant J. 82, 680–692 (2015).
Franke, J. et al. Gene discovery in Gelsemium highlights conserved gene clusters in monoterpene indole alkaloid biosynthesis. Chembiochem 20, 83–87 (2019).
Brose, J. et al. The Mitragyna speciosa (Kratom) Genome: a resource for data-mining potent pharmaceuticals that impact human health. G3 Bethesda 11, jkab058 (2021).
Rather, G. A. et al. De novo transcriptome analyses reveals putative pathway genes involved in biosynthesis and regulation of camptothecin in Nothapodytes nimmoniana (Graham) Mabb. Plant Mol. Biol. 96, 197–215 (2018).
Yang, X. et al. A chromosome-level genome assembly of the Chinese tupelo Nyssa sinensis. Sci. Data 6, 282 (2019).
Rai, A. et al. Chromosome-level genome assembly of Ophiorrhiza pumila reveals the evolution of camptothecin biosynthesis. Nat. Commun. 12, 405 (2021).
Gongora-Castillo, E. et al. Development of transcriptomic resources for interrogating the biosynthesis of monoterpene indole alkaloids in medicinal plant species. PLoS One 7, e52506 (2012).
Yates, S. A. et al. The temporal foliar transcriptome of the perennial C3 desert plant Rhazya stricta in its natural environment. BMC Plant Biol. 14, 2 (2014).
Stander, E. A. et al. The Vinca minor genome highlights conserved evolutionary traits in monoterpene indole alkaloid synthesis. G3 Bethesda 12, jkac268 (2022).
Cuello, C. et al. Genome assembly of the medicinal plant Voacanga thouarsii. Genome Biol. Evol. 14, https://doi.org/10.1093/gbe/evac158 (2022).
Xiao, M. et al. Transcriptome analysis based on next-generation sequencing of non-model plants producing specialized metabolites of biotechnological interest. J. Biotechnol. 166, 122–134 (2013).
Swarbreck, D. et al. The Arabidopsis Information Resource (TAIR): gene structure and function annotation. Nucleic Acids Res. 36, D1009–D1014 (2008).
Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinforma. 10, 421 (2009).
Sonnhammer, E. L., Eddy, S. R. & Durbin, R. Pfam: a comprehensive database of protein domain families based on seed alignments. Proteins 28, 405–420 (1997).
Bairoch, A. & Boeckmann, B. The SWISS-PROT protein sequence data bank, recent developments. Nucleic Acids Res. 21, 3093–3096 (1993).
Li, W. & Godzik, A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659 (2006).
Cerveau, N. & Jackson, D. J. Combining independent de novo assemblies optimizes the coding transcriptome for nonconventional model eukaryotic organisms. BMC Bioinforma. 17, 525 (2016).
Wan, C. Y. & Wilkins, T. A. A modified hot borate method significantly enhances the yield of high-quality RNA from cotton (Gossypium hirsutum L.). Anal. Biochem. 223, 7–12 (1994).
Zhao, D. et al. De novo genome assembly of Camptotheca acuminata, a natural source of the anti-cancer compound camptothecin. Gigascience 6, 1–7 (2017).
Ro, D. K. et al. Induction of multiple pleiotropic drug resistance genes in yeast engineered to produce an increased level of anti-malarial drug precursor, artemisinic acid. BMC Biotechnol. 8, 83 (2008).
Nguyen, D. T. et al. Biochemical conservation and evolution of germacrene A oxidase in Asteraceae. J. Biol. Chem. 285, 16588–16598 (2010).
Nguyen, T. M. et al. Discovery of a cytochrome P450 enzyme catalyzing the formation of spirooxindole alkaloid scaffold. Front. Plant Sci. 14, 1125158 (2023).
R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria https://www.R-project.org/ (2021).
Li, C. et al. Single-cell multi-omics in the medicinal plant Catharanthus roseus. Nat. Chem. Biol. 19, 1031–1041 (2023).
Wishart, D. S. et al. HMDB 5.0: the human metabolome database for 2022. Nucleic Acids Res. 50, D622–D631 (2022).
Acknowledgements
The authors thank Dr. Don Nguyen (IKB Faculty of Science, UBC) and Anh Nguyen (Department of Chemistry, IKB Faculty of Science, UBC) for helpful discussions. B.D.K. received funding from NSERC CGS-M. T.-T.T.D. received funding from the Michael Smith Foundation for Health Research Scholar (SCH-2020-0401) and NSERC-DFG (ALLRP 580347 – 22). J.F. received funding from DFG (FR 3720/8-1). J.W. received funding from the China Scholarship Council.
Author information
Contributions
T.-T.T.D. conceived the research idea. B.D.K., T.K., and T.-T.T.D. designed the methodology. B.D.K., T.K., H.N., and J.W. performed all experimental and computational work. Z.X. assisted with NMR analysis. B.D.K., T.K., J.F., and T.-T.T.D. prepared the manuscript together. All authors contributed to the editing and review of the manuscript.
Ethics declarations
Competing interests
T.-T.T.D. is an Editorial Board Member for Communications Chemistry and a Guest Editor for Communications Chemistry’s Biosynthesis Collection, but was not involved in the editorial review of, or the decision to publish, this article. All other authors declare no competing interests.
Peer review
Peer review information
Communications Chemistry thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Kwan, B.D., Kim, T., Nguyen, H.H. et al. Orthologue inference-based enzyme mining for diversification of the anti-cancer evodiamine scaffold. Commun Chem 9, 73 (2026). https://doi.org/10.1038/s42004-025-01876-6