Introduction

Isoprenoids are a class of plant-derived natural products which are used in the pharmaceutical and food industries1. Limonene is a monoterpene used as a flavour or fragrance based on its citrus aroma2,3 while its hydrogenated form is used as a biofuel4. Limonene exists in nature as two enantiomers, (d)-limonene and (l)-limonene, due to a chiral centre in its structure. Limonene is also a vital precursor to several compounds such as menthol, carveol, carvone, and perillyl alcohol, which are used in food and beverage, pharmaceuticals, and cosmetics5. Currently, limonene is obtained through the extraction of plant biomass, which is rather inefficient due to variations in seasons and conditions of agricultural land6. With the recent progress in metabolic engineering and synthetic biology, microbes such as Escherichia coli have emerged as bio-factories providing alternative routes to produce target terpenoids such as limonene3,7,8,9,10.

Escherichia coli has three glycolytic pathways, namely the Embden–Meyerhof–Parnas (EMP), Entner–Doudoroff (ED), and pentose phosphate (PP) pathways11 (Fig. 1). Glucose metabolism depends on the EMP and PP pathways while the ED pathway tends to be inactive except during growth with gluconate12,13. For the production of isoprenoids in E. coli, there are two main pathways: the native deoxyxylulose 5-phosphate (DXP) and the heterologous mevalonate (MVA) pathways (Fig. 1). The MVA and DXP pathways both produce the isoprene building blocks isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP), which are isomers used to biosynthesize isoprenoids14. To attain high isoprenoid yields in E. coli, the availability of co-factors, energy demands, and carbon flux need to be balanced in the DXP and MVA pathways15,16,17.

Fig. 1: Final metabolic network topology utilised in model development of wild-type E. coli engineered with the mevalonate (MVA) pathway to overproduce limonene.

There are several different metabolic pathways involved in limonene biosynthesis, including the glycolytic pathways, tricarboxylic acid cycle (TCA), the native deoxyxylulose 5-phosphate (DXP), and MVA pathways. Co-factor (in orange) consumption is represented by curved arrows. Intermediates: Glcex extracellular glucose, Glc glucose, G6P glucose-6-phosphate, 6PG 6-phosphogluconate, KDPG 2-keto-deoxy-6-phosphogluconate, X5P xylulose-5-phosphate, Ru5P ribulose-5-phosphate, R5P ribose-5-phosphate, F6P fructose-6-phosphate, F16BP fructose-1,6-biphosphate, GAP glyceraldehyde-3-phosphate, DHAP dihydroxyacetone phosphate, BPG 1,3-bisphosphoglycerate, 3PG 3-phosphoglycerate, PEP phosphoenolpyruvate, PYR pyruvate, DXP 1-deoxy-d-xylulose-5-phosphate, B flux to vitamin B6 pathway, MEP 2-C-methylerythritol-4-phosphate, CDPME 4-diphosphocytidyl-2-C-methylerythritol, CDPMEP 4-diphosphocytidyl-2-C-methylerythritol-2-phosphate, MEcPP 2-C-Methylerythritol-2,4-cyclodiphosphate, HMBPP hydroxymethylbutenyl 4-diphosphate, IPP isopentenyl diphosphate, DMAPP dimethylallyl diphosphate, GPP geranyl diphosphate, FPP farnesyl diphosphate, LIM limonene, LIMex extracellular limonene, AcCoA acetyl coenzyme A, AtAcCoA acetoacetyl-CoA, HMGCoA hydroxymethylglutaryl-CoA, MVA mevalonate, MVAP 5-phosphomevalonate, MVAPP 5-diphosphomevalonate, ACE acetic acid, ACEex extracellular acetic acid, ACTLD acetaldehyde, ETH ethanol, ETHex extracellular ethanol, LAC lactic acid, LACex extracellular lactic acid, AKG α-ketoglutarate, SucCoA succinyl-CoA, SUC succinate, SUCex extracellular succinate, FUM fumarate, OAA oxaloacetate.
Enzymes (in green): PTS phosphotransferase system, HK hexokinase, G6PDH lumped reactions of glucose-6-phosphate dehydrogenase and 6-phosphogluconolactonase, PGDH 6-phosphogluconate dehydrogenase, KDPGA 2-keto-deoxy-6-phosphogluconate aldolase, Tkb transketolase, PGI phosphoglucose isomerase, PFK phosphofructokinase, FBA fructose-1,6-biphosphate aldolase, GDH glutamate dehydrogenase, PGK phosphoglycerate kinase, ENO enolase, PYK pyruvate kinase, DXS DXP synthase, DXR DXP reductase, ISPD CDPME synthase, ISPE CDPME kinase, ISPF MEcPP synthase, ISPG HMBPP synthase, ISPH HMBPP reductase, IDI isopentenyl diphosphate isomerase, ISPA farnesyl diphosphate synthase, LS limonene synthase, PDH pyruvate dehydrogenase, AtoB acetyl-CoA acetyltransferase, HMGS HMGCoA synthase, HMGR HMGCoA reductase, MK mevalonate kinase, PMK phosphomevalonate kinase, PMD diphosphate mevalonate decarboxylase, LDH lactate dehydrogenase, PoxB pyruvate oxidase, PCK phosphoenolpyruvate carboxykinase, PPC phosphoenolpyruvate carboxylase, ACS acetyl-CoA synthetase, PTACK lumped reactions of phosphate acetyltransferase and acetate kinase, ALDHB aldehyde dehydrogenase B, ALDH aldehyde dehydrogenase, ADH alcohol dehydrogenase, CSICD lumped enzymatic reactions of citrate synthase, aconitate hydratase A, aconitate hydratase B and isocitrate dehydrogenase, AKGDH α-ketoglutarate dehydrogenase, SCS succinyl-CoA synthetase, FRD fumarate reductase, MDH malate dehydrogenase. The illustration was created using Biorender.com.

Limonene biosynthesis has been generally low for viable bioprocesses18, and this can be attributed to several issues, such as nominal metabolic fluxes flowing towards limonene production, enzyme inefficiency in metabolic biosynthesis pathways, and limonene cytotoxicity distressing the microbial chassis19. Although the native DXP pathway results in an enhanced theoretical yield, metabolic engineering of the DXP pathway in E. coli for improved terpenes yields has been challenging. This is probably due to pathway regulations and minimal concentrations of isoprenoid precursors20, especially geranyl pyrophosphate (GPP)7. Although gene modulation of the DXP pathway has improved isoprenoid production21, engineering a heterologous MVA pathway for expression alongside the native DXP pathway in E. coli has improved yields of target compounds through increased precursor supply3,9,20,22. In attempts to improve limonene yield in bacterial bio-factories, transcriptional tuning23, promoter optimisation24, and incorporating alternative enzymes3 have been explored. However, such strategies to improve limonene yield do not assess the involvement of competing pathways25 that affect the carbon flux entering the DXP and MVA pathways, thereby affecting limonene yield. Understanding the presence of metabolic bottlenecks and the loss of carbon flux26 are also vital in enhancing limonene yields.

Metabolic networks are dynamically regulated, complex, and highly interwoven. Due to this intricacy, metabolic engineering strategies are usually non-intuitive, and metabolic models such as constraint-based models and dynamic models can aid in addressing certain engineering questions27. Constraint-based models, which rely on a steady-state assumption, are founded on stoichiometry and do not require mechanistic information28. Thus, unlike dynamic or kinetic models, constraint-based models do not provide time-dependent information on cellular metabolism29,30. Dynamic models can reveal the relationship between metabolic fluxes, enzyme expression, metabolite concentrations, and regulation via mechanistic associations, making dynamic models of metabolism vital for comprehending, predicting, and optimising metabolic behaviours31,32. Dynamic models are created using linear and non-linear differential equations which are based on enzyme dynamics for each biochemical reaction within the metabolic network. These dynamic models have been utilised to understand aspects of systems biology such as metabolism and metabolic engineering33,34, protein signalling35,36,37,38, and gene regulatory networks39,40,41. Dynamic models created using time-series experimental data can thus provide a better mechanistic understanding of biological networks.

In a recent study, modelling of enzyme dynamics in photosynthetic cyanobacteria with the DXP pathway revealed limonene synthase, which catalyses the last reaction in limonene biosynthesis, as a key limiting step in limonene bioproduction42. The authors achieved higher limonene production in cyanobacteria by re-engineering the limonene synthase, compared to a previous study where multiple genes were engineered laboriously based on an experimental approach alone43. This shows that computational modelling together with synthetic biology-based engineering can be a good strategy for identifying important flux-controlling points in metabolic pathways. A systems biology approach which integrates experimental research and computational modelling can also be used to understand intricate biological systems, such as engineered bacterial bio-factories. Systems biology needs to be predictive, where hypotheses are refuted or confirmed through targeted system perturbation44.

The use of computational models, which is one of the defining features of systems biology, is vital in describing the inherent intricacy of biological systems45. Constraint-based models, such as genome-scale models, can generate pathway-level predictions which allow the identification and modification of pathway fluxes to improve the production of target products from engineered microbes. For instance, strain designs for lactic acid over-production have been predicted using a genome-scale model of E. coli46. Furthermore, algorithms such as OptKnock47, OptStrain48, OptORF49, OptFlux50, OptForce51, k-OptForce52, and DySScO53 have been developed to assist the process of microbial strain design and expand the predictive means of constraint-based models. Although these approaches predict the impact of genetic modifications on the microbial production of target products, the translation of these computationally predicted improved microbial strain designs and optimal flux states to the in vivo system has been rather challenging54. To enhance the predictive capability and accuracy of constraint-based models, cellular details such as signalling pathways55, thermodynamics56, and transcriptional regulation57 have been recently explored.

Dynamic or kinetic models, on the other hand, utilise more mechanistic details in a time-series manner, which could potentially prove to be more useful in designing microbial strains with improved titres. Furthermore, dynamic models based on the model organism E. coli can be easily created as many of the metabolic reactions involving substrates, products, enzymes, and co-factors are well-known and characterised. Nevertheless, the challenge for dynamic models lies in reliably determining appropriate rate laws and the corresponding parameter values, as many of the metabolites measured are either derived from steady-state conditions or produced extracellularly58,59.

In this paper, a wild-type E. coli strain was first engineered to produce l-limonene through the expression of the MVA pathway and a limonene synthase (EcoCTs03)3. In parallel, a comprehensive dynamic model, based on enzyme dynamics and differential equations, was developed for the limonene biosynthesis network by considering the numerous metabolic pathways involved, such as the native glycolytic, TCA, mixed fermentation, and DXP pathways as well as the engineered MVA pathway. Notably, to overcome some of the challenges of dynamic models mentioned earlier, both intracellular and extracellular time-series quantitative metabolomics data from cell cultures of EcoCTs03 were used to calibrate the dynamic model. [1-13C]-glucose tracer was also used to determine the extent of the ED pathway contributing to the upstream glycolytic pathway60. The metabolic network topology (Fig. 1) and dynamic model were constructed using Complex Pathway Simulator (COPASI)61, where the collected time-series data were fed into the model for parameter estimation and simulations. The calibrated model was next used to identify intracellular targets for optimising the production of limonene for the strain. Finally, new strains suggested by the model were generated and the model predictions were compared with their experimental yields.

Results

Generation of time-series metabolomics data

The engineered wild-type E. coli, EcoCTs03, was cultured in shake flasks and sacrificed in duplicate at various time points post IPTG induction. Intracellular metabolites of the cell cultures were analysed on the LC-TOF after extraction following fast filtration and quenching using liquid nitrogen. Extracellular metabolites were analysed on the HPLC after centrifugation of the cell cultures, while the dodecane layer of each cell culture was collected and subjected to GC-MS analysis for limonene quantitation. Figure 2 shows the time-series metabolomics data collected from cultures of EcoCTs03.

Fig. 2: Extracellular and intracellular time-series metabolomics data obtained from cell cultures of wild-type E. coli engineered to overproduce limonene (EcoCTs03) used for model development.

A Reflects the concentrations of extracellular metabolites glucose (Glcex), lactic acid (LACex), and acetic acid (ACEex). B Displays the time-series concentrations of secreted limonene (LIMex). C Shows the concentrations of intracellular metabolites glucose-6-phosphate (G6P), fructose-6-phosphate (F6P), fructose-1,6-biphosphate (F16BP), DHAP + GAP (pool of dihydroxyacetone phosphate and glyceraldehyde-3-phosphate), R5P + Ru5P + X5P (pool of ribose-5-phosphate, ribulose-5-phosphate, and xylulose-5-phosphate), and IPP + DMAPP (pool of isopentenyl pyrophosphate and dimethylallyl pyrophosphate) which are of higher concentrations. D Illustrates the intracellular time-series concentrations of pyruvate (PYR), deoxyxylulose 5-phosphate (DXP), geranyl diphosphate (GPP), and farnesyl diphosphate (FPP), which are of a lower concentration range. Average values are shown for biological replicates (±1 SD).

The extracellular metabolite glucose (Glcex) in the culture media decreased with time while the extracellular acetic acid (ACEex) concentration remained relatively constant (Fig. 2A). Extracellular lactic acid (LACex) was not observed at the initial time points of 0–3 h post IPTG induction; however, it was produced and detected from 6 to 8 h post IPTG induction (Fig. 2A). Limonene was secreted extracellularly (LIMex) and trapped in the dodecane layer, where its concentration was found to generally increase with time (Fig. 2B). The changes in and around the 6th hour of IPTG induction could also be linked to the E. coli doubling time, as shown in Supplementary Fig. 1.

Figure 2C mainly shows the concentrations of the intracellular metabolites through the glycolytic pathways until pyruvate at the various time points. Intracellular metabolites G6P (glucose-6-phosphate) and F6P (fructose-6-phosphate) concentrations were relatively constant from 0 to 3 h post IPTG induction and dropped at 6 h post IPTG induction and were relatively constant until 8 h post IPTG induction (Fig. 2C). Intracellular R5P + Ru5P + X5P (pool of ribose-5-phosphate, ribulose-5-phosphate, and xylulose-5-phosphate) concentrations were found to be higher at the later time points of 3 h to 8 h compared to 0 h to 2 h post IPTG induction (Fig. 2C). For the intracellular pool of dihydroxyacetone phosphate and glyceraldehyde-3-phosphate (DHAP + GAP), the concentration increased from 0 to 2 h post IPTG induction and then dropped continuously until 8 h post IPTG induction while intracellular F16BP (fructose-1,6-biphosphate) remained relatively constant through the time points (Fig. 2C). Intracellular PYR (pyruvate) remained relatively constant at 0 h and 2 h and decreased from 3 h to 8 h post IPTG induction (Fig. 2D).

Intracellular metabolites MVA (mevalonate) and MVAP (5-phosphomevalonate) were not observed in the bacterial cells during the harvesting time points. Intracellular MVAPP (5-pyrophosphomevalonate), on the other hand, was only detected at one time point of 8 h post IPTG induction at a low concentration of 0.05 ± 0.02 µM/g DCW. Figure 2D illustrates the time-series concentrations of the downstream intracellular metabolites through the MVA pathway and the intracellular metabolite DXP (1-deoxy-d-xylulose-5-phosphate). Intracellular FPP (farnesyl diphosphate) and DXP remained relatively constant through the time points (Fig. 2D). Intracellular GPP (geranyl diphosphate) was not detected at earlier time points of 0 h to 3 h, while it was detected and quantified from 6 h to 8 h post IPTG induction (Fig. 2D). Intracellular IPP + DMAPP (pool of isopentenyl pyrophosphate and dimethylallyl pyrophosphate) was found at high concentrations at later time points of 6 h to 8 h post IPTG induction and was not observed at the earlier time points (Fig. 2C).

The extracellular metabolite Glcex decreased with time whilst ACEex and LACex were detected and quantified from the cell culture media. LIMex was secreted in the dodecane layer and generally increased with time. The intracellular metabolites involved in the upper glycolytic pathways (G6P, F6P, F16BP, DHAP + GAP, and R5P + Ru5P + X5P) were found to be in higher concentrations compared to intracellular PYR since it marks the end-point of glycolysis. Furthermore, the intracellular metabolites involved in the upper glycolytic pathways (G6P, F6P, F16BP, DHAP + GAP and R5P + Ru5P + X5P) also had higher concentrations compared to the intracellular metabolites found downstream of the metabolic network in the DXP and MVA pathways (DXP, GPP, FPP, IPP + DMAPP). The generated time-series extracellular and intracellular metabolomics data from the engineered wild-type EcoCTs03 strain after IPTG induction was used to develop the dynamic model.

Dynamic model development with time-series metabolomics data and 13C-tracer studies

The developed dynamic model utilised the time-series metabolomics data from the engineered wild-type EcoCTs03 strain after IPTG induction. To simplify the dynamic model, the rate laws for most of the biochemical reactions in the metabolic network were designated as Michaelis–Menten reactions, with metabolite inhibition accounted for according to literature62,63. The PP pathway was also simplified by lumping the reactions producing the isomers Ru5P, R5P, and X5P, as the time-series data of these isomers were quantified as a pool on the LC-TOF-MS (liquid chromatography-time of flight-mass spectrometer)64.
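The Michaelis–Menten form with metabolite inhibition mentioned above can be sketched in Python as follows. This is a minimal illustration only: the text does not state which inhibition mechanism was assigned to each reaction, so competitive inhibition is assumed here, and the function names and parameter values are hypothetical rather than taken from the fitted model.

```python
def michaelis_menten(s, vmax, km):
    """Irreversible Michaelis-Menten rate: v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)


def competitive_inhibition(s, i, vmax, km, ki):
    """Michaelis-Menten rate in the presence of a competitive inhibitor I.

    The inhibitor raises the apparent Km by the factor (1 + I / Ki).
    Competitive inhibition is an assumption made for this sketch.
    """
    return vmax * s / (km * (1.0 + i / ki) + s)


# Hypothetical parameter values, chosen only for illustration.
v_free = michaelis_menten(s=2.0, vmax=10.0, km=2.0)
v_inhibited = competitive_inhibition(s=2.0, i=1.0, vmax=10.0, km=2.0, ki=1.0)
```

With these illustrative numbers the uninhibited rate is half of Vmax (because S equals Km), and the inhibitor lowers the rate without changing Vmax.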

During the initial development of the limonene metabolic model (called Model A), the network topology was created (Supplementary Fig. 2) and the parameters of all dynamic equations used in Model A were fitted with the time-series measured metabolite concentrations using the genetic algorithm in COPASI65 (Supplementary Fig. 3). Figure 3A shows that there was poor fitting of extracellular limonene (LIMex) data with the simulated LIMex concentrations with time. In our previous work on modelling of the TLR (Toll-like receptor) 4, TNF (tumour necrosis factor), and TRAIL (TNF-related apoptosis-inducing ligand) signalling, we had a similar situation where the initial model created failed to fit experimental profiles35,36,66,67. After carefully studying the local pathways where the model failed and using response rules, we were able to perform model iterations until satisfactory simulations were made. Here, we adopted a similar approach of performing model iterations for simulation improvements.
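The idea of fitting rate-law parameters to time-series concentrations can be illustrated with a deliberately small Python sketch. It is not the COPASI workflow: a single Michaelis–Menten reaction stands in for the network, an exhaustive grid search stands in for the genetic algorithm, and the "measurements" are synthetic values generated from a known Vmax. All names and numbers are hypothetical.

```python
def simulate_product(vmax, km=1.0, s0=5.0, dt=0.01, t_end=4.0, sample_every=100):
    """Euler-integrate S -> P with Michaelis-Menten kinetics; return sampled P."""
    s, p = s0, 0.0
    samples = []
    n_steps = int(round(t_end / dt))
    for step in range(1, n_steps + 1):
        v = vmax * s / (km + s)
        s -= v * dt
        p += v * dt
        if step % sample_every == 0:
            samples.append(p)  # one sample per simulated hour
    return samples


def sum_squared_error(simulated, observed):
    """Objective function: squared distance between simulation and data."""
    return sum((a - b) ** 2 for a, b in zip(simulated, observed))


# Synthetic "measurements" generated with a known Vmax (not experimental data).
TRUE_VMAX = 2.0
observed = simulate_product(TRUE_VMAX)

# Grid search over candidate Vmax values 0.5, 0.6, ..., 4.0
# (a simple stand-in for a genetic algorithm).
candidates = [round(0.5 + 0.1 * k, 1) for k in range(36)]
best_vmax = min(
    candidates,
    key=lambda v: sum_squared_error(simulate_product(v), observed),
)
```

In a real calibration the objective would sum the errors over all measured metabolites and many parameters would be estimated simultaneously, which is why a global optimiser is typically preferred over a grid.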

Fig. 3: Simulations from Models A, B, and C during model iterations showing improvements in the fitting of the simulations of extracellular limonene (LIMex) and the pooled metabolites DHAP + GAP (pool of dihydroxyacetone phosphate and glyceraldehyde-3-phosphate) with experimental results.

A Illustrates the poor fitting of LIMex simulation from Model A with the experimental time-series data. B Displays the metabolic network topology where acetaldehyde (ACTLD) and ethanol (ETH) reside, while C shows the simulated ACTLD and ETH concentrations with time from Model A. D Shows an improved LIMex simulation from Model B while E reflects the poor fitting of DHAP + GAP simulation from Model B. F Displays the improved simulation of DHAP + GAP from Model C with the Entner–Doudoroff (ED) pathway included in the metabolic network. Average values are shown for biological replicates (±1 SD) in (A, D–F).

In Model A, notably, acetaldehyde (ACTLD) and ethanol (ETH) were found to accumulate (Fig. 3B, C). To reduce the accumulation of ACTLD and ETH, biochemical reactions downstream of these metabolites were further studied, where the reverse reaction of enzyme AtoB (acetyl-CoA acetyltransferase) was found to be high. The reverse reaction of AtoB refers to the formation of AcCoA (acetyl coenzyme A) from AtAcCoA (acetoacetyl-CoA) and CoA (coenzyme A). Working from this reaction to the downstream reactions ensured that there was sufficient flux in the MVA pathway flowing towards LIMex. During model development, the approach of pulling the accumulated flux away from ACTLD and ETH towards the MVA pathway helped to improve the model further, resulting in a better refined Model B with improved fitting for LIMex (Fig. 3D). (The simulations for all measured metabolites for Model B can be found in Supplementary Fig. 4). Nevertheless, Model B had poor fitting for the simulations of the pooled metabolites GAP + DHAP (Fig. 3E). For both Models A and B, there was no initial inclusion of the ED pathway to the network topology (Supplementary Fig. 2) as this pathway seems to use gluconate instead of glucose12,13. However, some studies have shown that E. coli can involve the EMP and ED pathways to metabolise glucose68 while the isoprenoid synthesis pathways DXP and MVA69 can result in the biosynthesis of limonene.

We next checked this, through the parallel use of two different glucose tracers, [1-13C]-glucose and [4-13C]-glucose, supplied in the cell culture media, where 13C atoms can be incorporated into the metabolite IPP (or isomer DMAPP) when glycolysis occurs via the EMP pathway using the [1-13C]-glucose, and for the [4-13C]-glucose tracer its incorporation into the isoprenoids (IPP and DMAPP) occurs via the DXP pathway60. Since two IPP or DMAPP units are required for the biosynthesis of limonene, it is possible to determine molecular isotopologues having 13C atoms incorporated through the analysis of the mass spectra of limonene obtained from the GC-MS. Each m/z in the mass distribution vector (MDV) relates to a unique number of 13C-labelled IPPs integrated into limonene. By utilising [1-13C]-glucose as a positional label, there can be zero, one, two, or three 13C atoms incorporated into IPP and DMAPP60. A map describing the 13C incorporation into IPP/DMAPP by parallel [1-13C]-glucose and [4-13C]-glucose labelling cultivation has been previously published60. Supplementary Fig. 5 displays the MDV of limonene after correction for naturally occurring isotopes (MDV*). The interpretation of the MDV* of limonene needs the consideration of different labelling scenarios as predicted (Supplementary Fig. 6). As a result, conversion fraction ½ was included as a multiplier for the M + 1 fraction while ¾ was included as multiplier for the M + 2 and M + 3 fractions (Supplementary Table 1). Using these conversion factors and the MDV* values, the calculation for the flux ratio for the EMP pathway was then determined, as previously described60. The EMP pathway had a flux ratio of 0.945 ± 0.002, while the ED pathway which is complementary to the contribution of the EMP pathway was found to have a flux ratio of 0.055 ± 0.002. 
The results from the [1-13C]-glucose tracer studies showed that the EMP pathway was the main glycolysis pathway (94.5% contribution) while the ED pathway had a small contribution to glycolysis. Since the ED pathway was involved with a 5.5% contribution, it was included in the metabolic network of the next refined dynamic model C (Fig. 1). That is, the ED pathway, which results in the production of PYR and GAP + DHAP, was added to Model C. Notably, this helped in further improving the simulations of the pooled metabolites GAP + DHAP (Fig. 3F). Model C (Supplementary Fig. 7) also showed that the DXP pathway did contribute to limonene production, which was corroborated with experimental findings (Supplementary Fig. 8). The rate laws, rate law equations, and the parameter values of the final fitted model (Model C) can be found in Supplementary Table 2.
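Because the ED contribution is treated as complementary to the EMP contribution, the percentages quoted above follow from simple arithmetic. The short Python sketch below reproduces only that step from the reported EMP flux ratio; it does not reproduce the MDV* calculation itself, and the helper name is hypothetical.

```python
def complementary_ratio(ratio):
    """Return the share of a two-way split not accounted for by `ratio`."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("flux ratio must lie between 0 and 1")
    return 1.0 - ratio


emp_ratio = 0.945  # EMP flux ratio reported in the text
ed_ratio = complementary_ratio(emp_ratio)

print(f"EMP: {emp_ratio:.1%}, ED: {ed_ratio:.1%}")
```

This prints `EMP: 94.5%, ED: 5.5%`, matching the contributions stated in the text.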

For cell culture cultivation with [4-13C]-glucose, at most one 13C atom may be incorporated per IPP/DMAPP molecule60. The DXP flux ratio was obtained through the MDV* value for this positional label and the conversion fraction ½, which was included as a multiplier for the M + 1 fraction (Supplementary Table 1). The DXP flux ratio was determined as 0.094 ± 0.001, while the MVA flux ratio, which also contributed to the isoprenoid flux, was 0.906 ± 0.001. This showed that the MVA pathway contributed significantly to isoprenoid biosynthesis and therefore limonene biosynthesis. There was also a 9.4% contribution of the flux from the DXP pathway, indicating that this pathway is still involved in limonene biosynthesis. This contribution to limonene biosynthesis from the DXP pathway was also reflected in the final developed dynamic model (Model C), through the in silico knockout (KO) of the enzyme ISPH (hydroxymethylbutenyl 4-diphosphate reductase), where the simulation showed a slight reduction in limonene production as reflected in Supplementary Fig. 8 (the in silico KOs were performed by setting all the reaction parameters upstream of the model species, such as metabolites in this context, to zero35,70).
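One way to see why a factor of ½ attaches to the M + 1 fraction is a textbook binomial picture, sketched below in Python. It assumes that limonene is assembled from two C5 units that are labelled independently, each carrying at most one 13C atom with probability p. This is an illustrative assumption and not the calculation from Supplementary Table 1; the value of p is hypothetical and is not a measured result.

```python
def limonene_mdv(p):
    """Isotopologue fractions [M+0, M+1, M+2] for two independent C5 units,
    each carrying a single 13C atom with probability p."""
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]


def labelled_unit_fraction(mdv):
    """Recover the per-unit labelling probability from [M+0, M+1, M+2].

    An M+1 limonene has one labelled unit out of two (weight 1/2);
    an M+2 limonene has both units labelled (weight 1).
    """
    return 0.5 * mdv[1] + mdv[2]


p_assumed = 0.1  # hypothetical labelling probability, for illustration only
mdv = limonene_mdv(p_assumed)
p_recovered = labelled_unit_fraction(mdv)
```

Under this assumption the weighted sum returns the per-unit labelling probability that generated the distribution, which is the role a conversion fraction plays when moving from limonene isotopologues back to its C5 precursors.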

The [1-13C]-glucose and [4-13C]-glucose tracer studies showed that the main route of carbon flux for limonene biosynthesis was through the EMP glycolytic pathway and MVA pathway. Based on the [1-13C]-glucose tracer studies, the EMP pathway was preferred over the ED pathway for glycolysis. This preference is likely due to the high expression of EMP pathway enzymes observed in E. coli68. Furthermore, there was a possible synergy between the heterologous MVA and native DXP pathways, based on the results of the [4-13C]-glucose tracer studies, as both pathways contributed to limonene biosynthesis.

Utilising experimentally validated limonene model to identify key bottlenecks to enhance limonene yield

With Model C satisfactorily recapitulating experimental dynamics for all measured metabolite concentrations and fluxes, we next used the model to investigate novel in silico targets to enhance limonene concentrations compared to the engineered wild-type EcoCTs03 strain. We performed in silico KOs for all enzymes and found ALDH KO, ADH KO, and LDH KO as the top targets which significantly increased the bioproduction of limonene. Since the enzymes ALDH and ADH are encoded by the same gene, adhE71,72, the knockout of these enzymes was represented as the ALDH-ADH KO. Figure 4A shows simulations of extracellular limonene obtained from the in silico ALDH-ADH KO, where increased limonene yield was predicted. This was further qualitatively validated experimentally, where there was about eightfold more limonene produced 7 h post IPTG induction for the ALDH-ADH KO strain compared to the engineered wild-type EcoCTs03 (Fig. 4B). For the LDH KO, the simulations showed an overall increase in limonene production (Fig. 4A), which was corroborated by experimental results where there was a significant (11-fold) increase in limonene produced 7 h post IPTG induction for the LDH knockout strain compared to the EcoCTs03 strain (Fig. 4B). The model also indicated that there was flux loss to the mixed fermentation pathway involving lactic acid and ethanol formation, as the LDH KO and ALDH-ADH KO simulations resulted in enhanced limonene yield.
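The logic of an in silico knockout at a branch point can be sketched with a toy Python model. It is not Model C: a single pyruvate pool is fed at a constant rate and drained by two Michaelis–Menten reactions, one towards lactate (LDH) and one lumped route towards limonene. The knockout is represented by setting the LDH Vmax to zero; the topology, parameter values, and function name are all hypothetical.

```python
def simulate_branch(vmax_ldh, vmax_lim=1.5, km=1.0, influx=1.0, dt=0.01, t_end=8.0):
    """Euler-integrate a toy branch point: influx -> PYR -> {LAC, LIM}."""
    pyr, lac, lim = 0.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        v_ldh = vmax_ldh * pyr / (km + pyr)   # competing fermentation branch
        v_lim = vmax_lim * pyr / (km + pyr)   # lumped route to limonene
        pyr += (influx - v_ldh - v_lim) * dt
        lac += v_ldh * dt
        lim += v_lim * dt
    return {"PYR": pyr, "LAC": lac, "LIM": lim}


wild_type = simulate_branch(vmax_ldh=1.5)
ldh_ko = simulate_branch(vmax_ldh=0.0)  # knockout: LDH rate parameter set to zero

print(f"LIM wild type: {wild_type['LIM']:.2f}, LIM with LDH KO: {ldh_ko['LIM']:.2f}")
```

In this toy network, removing the competing drain leaves more pyruvate for the remaining branch, so the simulated limonene pool rises. This mirrors only the qualitative direction of the prediction described above, not its magnitude.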

Fig. 4: Extracellular limonene (LIMex) simulations of the in silico knockout (KO) of enzymes ALDH-ADH (aldehyde dehydrogenase-alcohol dehydrogenase) and LDH (lactate dehydrogenase), and overexpressed HK (hexokinase) from the developed dynamic model with the corresponding experimental LIMex concentrations.

A Compares the simulations of LIMex from the developed dynamic model (Model C) for ALDH-ADH KO, LDH KO, HK overexpressed, and engineered wild-type EcoCTs03, while B shows the LIMex concentrations produced experimentally from the bacterial strains over time. Average values are shown for biological replicates (±1 SD) in (B).

We also performed a study on in silico overexpression by raising the relevant parameters (e.g. Vmax). Notably, when overexpressing the enzyme HK (glk gene) in silico, the simulation of the extracellular concentration of limonene was found to increase (Fig. 4A) albeit less than that observed from the in silico ALDH-ADH KO and LDH KO. Experimentally, the HK overexpressed strain produced eightfold more limonene 7 h post IPTG induction compared to the engineered wild-type EcoCTs03 strain (Fig. 4B).
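In silico overexpression by raising Vmax can be illustrated in the same spirit with a toy two-step pathway in Python. This sketch is not Model C: glucose is converted to G6P by HK and then by one lumped downstream step to limonene, and the fold increase applied to the HK Vmax is an arbitrary illustrative choice. All names and values are hypothetical.

```python
def simulate_pathway(vmax_hk, vmax_down=1.0, km=1.0, glc0=10.0, dt=0.01, t_end=7.0):
    """Euler-integrate a toy linear pathway Glc -> G6P -> LIM; return final LIM."""
    glc, g6p, lim = glc0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        v_hk = vmax_hk * glc / (km + glc)        # hexokinase step
        v_down = vmax_down * g6p / (km + g6p)    # lumped downstream step
        glc -= v_hk * dt
        g6p += (v_hk - v_down) * dt
        lim += v_down * dt
    return lim


BASE_VMAX_HK = 0.5   # hypothetical baseline
FOLD = 4.0           # hypothetical overexpression factor

lim_base = simulate_pathway(BASE_VMAX_HK)
lim_over = simulate_pathway(BASE_VMAX_HK * FOLD)
```

Raising the HK Vmax increases the simulated product, but the gain is capped once the downstream step saturates. This is one reason an overexpression target can yield a smaller predicted improvement than removing a competing branch.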

Discussion

Traditionally, increasing the yields of targets in microbial bio-factories has been carried out through various laborious experimental approaches such as manipulating genes from different pathways19,43. However, such approaches may not be efficient in identifying mutant strains with the highest titres. A systems biology approach, which combines computational modelling with synthetic biology-based engineering, can help to steer metabolic engineering strategies to improve product yield through the identification of carbon flux-controlling points in the highly integrated metabolic network. In this work, we generated intracellular and extracellular time-series metabolomics data from an engineered wild-type limonene-producing strain (EcoCTs03). The intracellular metabolites were quantified after fast filtration and rapid quenching in liquid nitrogen. The fast filtration and liquid nitrogen quenching methodology was preferred over cold methanol quenching as it minimises cell leakage from microbes and metabolite degradation whilst enabling the removal of contamination from extracellular sources73,74,75,76. Quenching, or the complete stopping of biochemical reactions involved in cell metabolism, is vital for accurate quantitation of intracellular metabolites77. Furthermore, since limonene is secreted extracellularly and not accumulated intracellularly due to its toxicity to the bacterial cells, a dodecane overlay in the cell culture was used to trap secreted limonene3 and ensure reliable quantitation despite limonene’s volatility. LIMex was quantified from the dodecane layer and its concentration was found to generally increase through the sampling time points (Fig. 2B). LACex was only observed at the later time points from 6 to 8 h post IPTG induction (Fig. 2A), probably due to a rapid increase in cell density resulting in some depletion of the oxygen supply in the shake flasks. The fermentative conditions probably resulted in increased production of LACex from PYR78,79. This also correlated with the drop in intracellular PYR observed over the same time points (Fig. 2D), where there was a diversion of carbon flux from PYR to LACex. The measured intracellular metabolites upstream of PYR, such as G6P, F6P, and DHAP + GAP, also had reduced concentrations during similar time points, which correlated with more carbon flux being used for limonene production, thereby increasing its yield.

The intracellular metabolites MVA and MVAP were not detected over the time points while MVAPP was only detected at a low concentration (0.05 ± 0.02 µM/g DCW) at 8 h post IPTG induction. The lack of metabolite accumulation probably shows that the carbon flux directed through the MVA pathway from MVA to MVAPP was efficiently converted during the sampling time points. The presence of intracellular FPP (Fig. 2D) showed that there was a loss of flux from GPP to LIMex production. Previous work has shown that expressing GPP synthase instead of the native IspA produced only GPP rather than both GPP and FPP; redirecting the carbon flux towards GPP and preventing its loss to FPP increased the limonene titre3. The last two intracellular metabolites before conversion to LIMex, IPP + DMAPP and GPP, were only detected at the later time points of 6–8 h post IPTG induction (Fig. 2C, D), which correlated with the higher detection of LIMex (Fig. 2B). An increased supply of these precursor metabolites (IPP + DMAPP and GPP) has also increased the yield of target compounds in previous work7,80.

Here we developed a dynamic model based on sound biochemical reaction equations and fitted it to our in-house experimentally generated time-series metabolomics data. The dynamic model underwent a few iterations (Models A and B) before the final model (Model C). In the initial Model A, with the metabolic network topology of Supplementary Fig. 2, LIMex was poorly fitted (Fig. 3A) due to the carbon flux accumulation found in the time course simulations of ACTLD and ETH (Fig. 3B, C). For the next model iteration, carbon flux was moved from ACTLD and ETH downstream towards LIMex, and the resulting Model B gave an improved fit of LIMex (Fig. 3D). However, DHAP + GAP was poorly fitted in Model B (Fig. 3E). For the final iteration, Model C incorporated the results from the [1-13C]-glucose tracer studies by adding the ED pathway to the metabolic network topology (Fig. 1), which further improved the DHAP + GAP simulation (Fig. 3F). Model C was also consistent with the [4-13C]-glucose tracer studies, in which most of the carbon flux went through the MVA pathway with minimal carbon flux entering the DXP pathway (Supplementary Fig. 8). The dynamic model was validated by showing that in silico knockdowns of LDH and ALDH-ADH yielded enhanced limonene concentrations, which was corroborated qualitatively by experimental results. Furthermore, in silico overexpression of HK in the dynamic model also improved limonene yield, which was confirmed experimentally as well. The developed dynamic model was therefore efficient in identifying the enzyme knockouts and the overexpressed enzyme that increased limonene yield.

From this systems biology work, most of the carbon flux flowed through the MVA pathway towards limonene biosynthesis, while a small amount of carbon flux went through the native DXP pathway. The small amount of carbon flux that went through the DXP pathway is probably important for vitamin B6 synthesis, which is required for cellular processes and the biosynthesis and catabolism of amino acids81,82, as well as for the production of the metabolite FPP, which is required for peptidoglycan formation for cell wall biosynthesis during early development83 and for creating respiratory chain components84,85.

The contributions of both the DXP and MVA pathways to limonene biosynthesis in the EcoCTs03 strain suggest possible synergistic interactions between the two isoprenoid routes. This hypothesised synergy between the DXP and MVA pathways has been observed in previous work in the bacterium Rhodobacter sphaeroides60. The synergy of both pathways has also led to improved lycopene productivity in E. coli86. In assessing the DXP and MVA pathways for IPP/DMAPP synthesis from glucose, the MVA pathway has greater energy efficiency due to the net increase in NAD(P)H reducing equivalents16, while the DXP pathway has greater carbon efficiency as it requires only two GAP molecules for IPP/DMAPP synthesis compared to three GAP molecules for the MVA pathway16. The possible synergy between the DXP and MVA pathways could be due to the compatible nature of both pathways through co-factors, as the synthesis of one IPP/DMAPP from glucose through the MVA pathway results in a net of four NAD(P)H reducing equivalents, which could in turn generate ATP via respiration86. Even though the synthesis of the MVA molecule through glycolysis, pyruvate dehydrogenase, and the upper MVA pathway results in an NAD(P)H and ATP surplus, the lower MVA pathway gains no NAD(P)H and instead consumes ATP for the conversion of the MVA molecule to IPP/DMAPP. Therefore, the lower MVA pathway cannot supply the reducing equivalents and ATP demanded by the DXP pathway, which probably results in a diminished flux going through the DXP pathway86. The pairing of complementary reducing power and ATP requirements therefore probably plays a crucial part in the synergy of the dual pathway in limonene biosynthesis.

The presence of the ED pathway in EcoCTs03 was rather interesting as this pathway avoids the thermodynamic bottlenecks of fructose-1,6-bisphosphate aldolase and triose-phosphate isomerase faced by the EMP pathway11, and requires substantially fewer enzymatic proteins than the EMP pathway68. However, the ED pathway theoretically yields one ATP per molecule of glucose, compared to two ATP molecules for the EMP pathway. In EcoCTs03, ATP was required for various reactions in the MVA pathway, for instance, in the conversion of MVA (mevalonate) to MVAP (5-phosphomevalonate), MVAP to MVAPP (5-diphosphomevalonate), and MVAPP to IPP/DMAPP (Fig. 1). In previous work, ED upregulation was found to enhance isoprenoid yields by increasing precursor metabolites87,88,89 and alleviating oxidative stress90,91. However, the lower ATP yield of this pathway would make it difficult to improve limonene yield in this work. ATP concentrations at the various time points could not be determined because they were below the detection limit of the LC-TOF. The EMP pathway was much more dominant in EcoCTs03, probably due to the evolutionarily tuned EMP metabolic pathway found in E. coli, whereby EMP pathway enzymes have been observed to be highly expressed in the proteome68.
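
The ATP argument above can be made concrete with a rough bookkeeping sketch. It uses only the figures quoted in the text (two ATP per glucose for EMP, one for ED, and three ATP-consuming steps in the lower MVA pathway) plus one labelled assumption: that one IPP/DMAPP requires three acetyl-CoA, i.e. 1.5 glucose. ATP generated by respiration from NAD(P)H is deliberately ignored, so this is an illustration of the substrate-level balance only, not a result of the model.

```python
from fractions import Fraction

# Substrate-level ATP yield per glucose, as quoted in the text.
ATP_PER_GLUCOSE = {"EMP": 2, "ED": 1}

# ATP-consuming steps of the lower MVA pathway named in the text:
# MVA -> MVAP, MVAP -> MVAPP, MVAPP -> IPP/DMAPP (one ATP each).
LOWER_MVA_ATP_PER_IPP = 3

# Assumption (not stated in this section): one IPP/DMAPP needs
# three acetyl-CoA, i.e. 1.5 glucose.
GLUCOSE_PER_IPP = Fraction(3, 2)


def net_atp_per_ipp(route: str) -> Fraction:
    """Glycolytic ATP produced minus lower-MVA ATP consumed, per IPP/DMAPP."""
    produced = ATP_PER_GLUCOSE[route] * GLUCOSE_PER_IPP
    return produced - LOWER_MVA_ATP_PER_IPP


for route in ("EMP", "ED"):
    print(route, float(net_atp_per_ipp(route)))
```

Under these assumptions the EMP route breaks even on substrate-level ATP, whereas the ED route leaves a deficit that would have to be covered by respiration.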

The simulation of DXP pathway inactivation from the developed dynamic model for the EcoCTs03 strain showed a slight reduction in limonene production, indicating that the co-expression of both the DXP and MVA pathways helped to improve limonene production. This observation is further supported by experimental studies in R. sphaeroides, where the co-expression of the DXP and MVA pathways improved amorphadiene titres by 1.2-fold, whilst the inactivation of the DXP pathway caused a threefold reduction in amorphadiene titres92. Moreover, in another experimental study in R. sphaeroides, strains co-expressing the DXP and MVA pathways were found to produce the highest amorphadiene/biomass ratios compared to single-pathway strains cultivated in various conditions93.

From the final refined model, in silico knockouts of LDH and ALDH-ADH showed favourably enhanced limonene yields. This was verified experimentally, as eliminating the carbon flux losses by creating the ALDH-ADH and LDH KO strains improved limonene yield by between 8- and 11-fold. This was probably due to the availability of more carbon flux for limonene production with the elimination of competing pathways. With the LDH KO, there was less competition for pyruvate as a precursor and NADH as a co-factor for lactate production, ensuring more pyruvate availability for the MVA pathway. In a previous study, LDH and pyruvate formate lyase knockdowns in Enterobacter aerogenes improved NADH co-factor supply and 2,3-butanediol yield94. For the ALDH-ADH KO, with the elimination of ethanol production, there was probably more acetyl coenzyme A (AcCoA) and NADH co-factor available for the MVA pathway. The final model also showed an improvement in limonene yield through in silico overexpression of HK. Limonene yield was also found to increase eightfold in experimentally produced strains with HK overexpression, corroborating the developed dynamic model. In another study, overexpressing the HK enzyme in engineered yeast also led to increased production of the target compound, β-carotene95. Although HK overexpression improved limonene yield, there are further bottlenecks down the metabolic pathway, such as the loss of flux to mixed acid fermentation (lactic acid and ethanol formation), as shown by the improved limonene yield for the KO strains in Fig. 4A, B, and the possible reduction in ATP content due to its consumption in the increased phosphorylation of glucose by HK. Furthermore, the activity of HK could also be subject to feedback inhibition by G6P (glucose-6-phosphate), the product of the HK enzymatic reaction96,97,98. These factors could have prevented further improvements in limonene yield experimentally.

The identified knockout strains show that metabolic fluxes can be easily redirected through the deletion of competing pathways. Likewise, previous studies have shown that target compound yields can be improved through the elimination of by-products99,100. Besides the deletion of competing pathways, product yield can also be enhanced through overexpression of target enzymes as shown in this study. The use of our model, therefore, removed the need for a trial-and-error-based experimental approach of creating a library of knockout and overexpressed strains to identify the best limonene-producing strains. The model also identified three different strain manipulations which can enhance limonene production unlike previous work in cyanobacteria producing limonene which identified only one strain modification42. Further work on using these three identified strain manipulations as a combination in a single stable strain could aid in improving limonene yield even more. The supply of co-factors such as ATP or NADPH may also become limiting for the various biochemical reactions in the metabolic network. Fine-tuning the supply of such co-factors is another strategy that can aid in improving product yield42,94.

Here we present a systems biology approach for limonene biosynthesis of engineered wild-type strain EcoCTs03 using time-series intracellular and extracellular metabolomics data, as well as incorporating 13C-tracer studies to develop a refined dynamic model. The model enabled the identification of key strain modifications in silico (ALDH-ADH KO, LDH KO, and HK overexpression) that would enhance the overall limonene yield. Based on the in silico results, both upstream and downstream bottlenecks in the metabolic network were identified, which in turn guided the construction of the identified modifications in the bacterial strains leading to significant limonene bioproduction. This strategy of incorporating a dynamic model into the metabolic engineering workflow to determine strain modifications resulting in improved target yields is more efficient than experimentally creating a large library of overexpressed or knockout strains to identify such high titre-producing modifications. Moreover, the possibility of incorporating the identified modifications into a single stable strain would aid in further improving target yield. After identifying key strain modifications with a developed dynamic model, further engineering strategies such as protein engineering and process optimisation may be utilised to improve product titres. Protein engineering could involve using more active homologous enzymes from alternative organisms which can improve product yield101,102,103,104. Alternatively, targeted mutagenesis of key enzymes could also improve activity towards substrates105.

Methods

All chemicals, standards, solvents, and media components were purchased from Sigma-Aldrich (St. Louis, MO), Fisher Scientific (Fair Lawn, NJ, USA), or VWR (Radnor, PA, USA). Deionised water was filtered by Sartorius Arium Pro VF Type 1 water system (18.2 MΩ, 0.2 μm).

Strain construction

The engineered wild-type strain producing limonene, EcoCTs03, was produced by transforming E. coli K-12 MG1655 by heat shock with the pJBEI-6409 plasmid, which was a gift from Taek Soon Lee (Addgene plasmid # 47048; http://n2t.net/addgene:47048; RRID:Addgene_47048)3. pJBEI-6409 is a p15A plasmid expressing limonene from acetyl-CoA via geranyl pyrophosphate (GPP) and was used as-is in this study. A CRISPR Cas9-assisted recombineering method, as described previously106, was used to delete the ldhA (UniProt P52643) and adhE (UniProt P0A9Q7) genes from the E. coli MG1655 strain, resulting in LDH and ALDH-ADH knockouts, respectively. The ldhA and adhE regions of three successful knockout strains were PCR amplified and thereafter sequenced to confirm gene deletion (Supplementary Fig. 9). One colony each was selected and stored at −80 °C in 40% glycerol prior to further use. To produce limonene, the pJBEI-6409 plasmid was transformed by heat shock into the ΔldhA and ΔadhE strains. The Mix & Go! E. coli Transformation Kit (Zymo Chem) was utilised following the manufacturer’s instructions to prepare competent cells from the ldhA and adhE deletion strains. The pJBEI-6409 transformed strains (EcoCTs03, LDH KO, and ALDH-ADH KO) were plated onto individual chloramphenicol selective Luria-Bertani agar plates and left overnight at 37 °C.

In this study, the plasmids used for the expression of the glk enzyme were derived from the previously published plasmids pJBEI-6409 and pTrc-trGPPS(CO)-LS (pJBEI-3101), which were obtained from Addgene3. pJBEI-3101, a ColE1 plasmid expressing limonene from isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP) via GPP, was modified to express the selected enzyme of interest. A list of strains and plasmids can be found in Tables 1 and 2, respectively. The gene of interest was glk (UniProt P0A6V8).

Table 1 Bacterial strains used in this study
Table 2 Plasmids used in this study

EcoCTs-CPMS5 was created by restriction-free cloning107,108. Briefly, the plasmid backbone was amplified by PCR using iProof™ High-Fidelity DNA Polymerase (Bio-Rad), in which overlapping overhangs were created by primer design. The resulting DNA was treated with DpnI (New England Biolabs) overnight at 37 °C. From the DpnI-treated PCR product, 1 μL was transformed by heat shock into E. coli DH5α, plated on an ampicillin selective Luria-Bertani agar plate and incubated overnight at 37 °C. Positive colonies were identified by colony PCR using PCRBIO Taq mix red (PCR Biosystems), and were subsequently grown overnight in 2 mL Luria-Bertani medium supplemented with the appropriate antibiotic at 37 °C, 300 rpm. Cultures were spun down, and the plasmid was extracted using E.Z.N.A.® Plasmid Mini Kit I (Omega Bio-tek) and sequenced using Sanger sequencing (Bio Basic) to confirm the presence of the deletion. Finally, plasmids were confirmed by restriction enzyme (RE) digestion and comparison with an in silico agarose gel simulation in SnapGene 7.0.2.

The gene of interest was cloned in EcoCTs-CPMS5 by PCR-amplifying the backbone of the plasmid without limonene synthase (LS), as well as the gene of interest from E. coli K-12 MG1655 genomic DNA. Primers contained overhangs such that the backbone and inserts shared a 15 bp overlap region at the site of integration. Expected band sizes were confirmed on an agarose gel, and the PCR products were subsequently treated with DpnI overnight at 37 °C. DpnI-treated PCR products were purified using the E.Z.N.A.® gel extraction kit (Omega Bio-tek). The purified DNA backbone (50–100 ng) and inserts were added to a PCR tube with In-Fusion Snap Assembly Master Mix (Takara Bio). The mixture was incubated at 50 °C for 15 min, and 1 μL was transformed by heat shock into E. coli DH5α, plated on an ampicillin selective Luria-Bertani agar plate and incubated overnight at 37 °C. Colonies were analysed by colony PCR, amplified, extracted, sequenced, and RE digested as described above.

For expression, plasmids were transformed by heat shock into E. coli K-12 MG1655. Competent cells were prepared using the Mix & Go! E. coli Transformation Kit (Zymo Chem) following the manufacturer’s specifications. For the expression experiment, pJBEI-6409 was co-transformed with EcoCTs-CPMS5-GOI (gene of interest, glk), plated on ampicillin and chloramphenicol selective LB agar plates and incubated overnight at 37 °C.

Growth conditions

From the prepared E. coli strains which were placed onto plates, one colony from each plate was selected after overnight incubation at 37 °C. Each colony was inoculated into 5 mL Luria-Bertani medium (5 g/L yeast extract, 10 g/L tryptone, 10 g/L NaCl) with 30 µg/mL chloramphenicol and left overnight in a shaking incubator at 220 rpm and 37 °C. Thereafter, cell pellets were washed and resuspended in 50 mL M9 medium (12.7 g/L Na2HPO4.7H2O, 3.1 g/L KH2PO4, 1 g/L NH4Cl, 0.5 g/L NaCl, 0.25 g/L MgSO4.7H2O, 15 mg/L CaCl2.2H2O, 8.1 mg/L FeCl3, 0.89 mg/L MnCl2.4H2O, 1.7 mg/L ZnCl2, 0.34 mg/L CuCl2, 0.6 mg/L CoCl2.6H2O, 0.51 mg/L Na2MoO4) adapted from a previous study109 with 10 g/L glucose and left overnight in a shaking incubator at 30 °C and 220 rpm. Each strain was kept at −80 °C in 40% glycerol. For each glycerol-stored strain, 100 µL was added to 50 mL M9 medium in 250-mL flasks to form pre-cultures, which were left overnight at 220 rpm and 30 °C. For the collection of metabolomics data at various time points, numerous cell culture flasks were prepared to provide duplicate biological samples for each time-series collection. Cell cultures of each strain were prepared by adding 100 µL of the pre-culture to 50 mL M9 medium in 250-mL flasks and were kept overnight in a shaking incubator at 220 rpm and 30 °C. Isopropyl β-d-1-thiogalactopyranoside (IPTG) was added to a final concentration of 25 μM when cell cultures reached an optical density of 1 at 600 nm. To enable trapping of secreted limonene, dodecane overlays of 5 mL were added to the cell cultures3, which were kept at 30 °C and 220 rpm in a shaking incubator. To obtain the concentrations of the intracellular and extracellular metabolites and of secreted limonene for dynamic modelling of the EcoCTs03 strain, prepared cell culture flasks were sacrificed in duplicate at 2 h, 3 h, 6 h, 7 h and 8 h post IPTG induction.

Metabolomics extraction and analysis

For the analysis of secreted limonene and extracellular metabolites, cell cultures of 50 mL EcoCTs03 with dodecane overlay were sacrificed in duplicate at 0 h, 2 h, 3 h, 6 h, 7 h, and 8 h post IPTG induction and centrifuged at 3000 rpm for 10 min. For the mutant strains (HK overexpressed, ALDH-ADH knockout, and LDH knockout), only secreted limonene concentrations were analysed, by sacrificing shake flasks in duplicate at 2 h, 3 h, 6 h, and 7 h post IPTG induction. In addition, flasks with EcoCTs03 and HK overexpressed were analysed for protein expression. Briefly, the optical density at the final time point was measured and a normalised amount of cells was extracted (V = 250/OD μL). The cells were then spun down for 2 min at room temperature at 14,000 rpm. The medium was discarded, and the pellets were resuspended in 50 μL B-PER and vortexed for 10 min at room temperature. The resulting lysate was loaded on SDS-PAGE (sodium dodecyl sulfate-polyacrylamide gel electrophoresis), where the overexpression of HK was confirmed (Supplementary Fig. 10). The dodecane overlay which trapped the secreted limonene was removed from each sample and stored at −80 °C prior to analysis. Diluted limonene extracts in ethyl acetate were run on an Agilent 7890B gas chromatography-mass spectrometry (GC-MS) system equipped with a DB-5ms column. A 10:1 split ratio and 10 mL/min split flow were utilised for each run, with 1 µL of sample injected. The GC oven was held at 40 °C for 3 min, followed by a temperature gradient of 10 °C/min to 100 °C and a 60 °C/min ramp to 220 °C, which was held for 2 min. The temperatures of the injector and MS transfer line were set at 250 °C and 280 °C, respectively. Selected ion-monitoring (SIM) mode was utilised for the MS, with ions of m/z 136, 68 and 93 monitored, which represented the molecular ion and the two most abundant fragment ions of limonene. Calibration standards were prepared for limonene in ethyl acetate with concentrations ranging from 0.05 µg/mL to 10 µg/mL. Limonene concentrations from cell cultures were determined using Agilent Quantitative software.
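
External-standard quantitation of this kind can be sketched as a linear least-squares calibration followed by back-calculation of the sample concentration. The snippet below is a minimal illustration only: the intermediate standard levels, the peak areas, and the dilution factor are hypothetical, and the vendor software used in this work may apply a different regression model or weighting.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(peak_area, slope, intercept, dilution_factor=1.0):
    """Back-calculate the undiluted concentration from a measured peak area."""
    return (peak_area - intercept) / slope * dilution_factor


# Standards span the 0.05-10 ug/mL range given in the text; the
# intermediate levels and the synthetic peak areas are hypothetical.
standards = [0.05, 0.1, 0.5, 1.0, 5.0, 10.0]
areas = [2000.0 * c + 50.0 for c in standards]

slope, intercept = fit_line(standards, areas)

# Hypothetical sample: dodecane extract diluted 20-fold in ethyl acetate.
sample_conc = quantify(4050.0, slope, intercept, dilution_factor=20.0)
print(round(sample_conc, 3))
```

Diluted samples should fall within the calibrated range before the dilution factor is applied; values outside it would need re-dilution rather than extrapolation.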

After centrifugation, supernatants of the EcoCTs03 cell cultures were kept at −80 °C before subjecting the extracellular metabolites to quantitative analysis. Aliquots of 1 mL from the thawed supernatant samples were filtered using polyamide filters and analysed on an Agilent 1200 high-performance liquid chromatography (HPLC) system equipped with a Bio-Rad Aminex HPX-87H column (300 × 7.8 mm). The HPLC system was run together with a 1260 Infinity II Refractive Index Detector (RID). An isocratic elution was run for 28 min with 0.01 N sulphuric acid, a 0.6 mL/min flow rate, and 5 µL of sample injected. The HPLC column was held at 35 °C while the RID was operated with positive polarity at 30 °C. Concentrations of extracellular metabolites were determined from calibration mixtures of the following ranges: glucose, 0.5–80 g/L; acetic acid, 0.125–80 g/L; lactic acid, 0.125–8 g/L; ethanol, 0.5–80 g/L. Cell pellets left after centrifugation were oven-dried before weighing.

For the analysis of intracellular metabolites, cell cultures of EcoCTs03 were sacrificed in duplicate at 2 h, 3 h, 6 h, 7 h, and 8 h after IPTG induction, and 10 mL aliquots of cell cultures were subjected to fast filtration followed by rapid quenching in liquid nitrogen as described previously with modifications73,110. Briefly, for fast filtration, 10 mL aliquots of each cell culture were filtered using 0.2 μm polyamide membrane filters (Sartorius, Goettingen, Germany) followed by washing with 5 mL of wash solution (1 g/L NH4Cl, 0.5 g/L NaCl, 12.7 g/L Na2HPO4.7H2O, 3.1 g/L KH2PO4). The filter membrane containing the cells was placed onto aluminum foil, folded, and quickly plunged into liquid nitrogen. The membrane filters encased in aluminum foil were stored at −80 °C until the extraction of the intracellular metabolites. The intracellular metabolites were extracted by removing the aluminum foil and placing the membranes in 5 mL of a 4:4:2 mixture of methanol, acetonitrile, and water, which was left to stand in an ice bath for 10 min. The membranes were then vortexed for 1 min and subjected to three 3-min rounds of sonication, with samples placed in an ice bath for 1 min between rounds. An internal standard mixture consisting of 10 µg/mL thymolphthalein monophosphate (TMP) and 50 µg/mL mevalonic acid-d3 (MVA-d3) in a solvent mixture of methanol: 10 mM ammonium hydroxide (7:3) was prepared. Extracts were spiked with 20 µL of the internal standard mixture after decanting into glass tubes. The extracts were then dried in a vacuum concentrator and reconstituted with 200 µL methanol: 10 mM ammonium hydroxide mixture (7:3) prior to filtration into glass vials for quantitative liquid chromatography-mass spectrometry analysis.

An Agilent 6230 time of flight-mass spectrometer (TOF-MS) coupled with a Dual Agilent Jet Stream (AJS) ion source was used together with an Agilent ultra-performance liquid chromatography (UPLC) 1290 system. A VanGuard pre-column (2.1 × 5 mm) was utilised with a Waters Acquity UPLC BEH C18 column (2.1 × 150 mm, 1.7 µm). The chromatographic method utilised was modified from previous work73,110. Samples of 2 µL were injected into the system with mobile phase A consisting of 5 mM ammonium formate in water (pH 9.5) and mobile phase B consisting of 5 mM ammonium formate (pH 9.5) in acetonitrile: water (9:1). The solvent gradient utilised for each run started with a flow rate of 0.1 mL/min and 100% mobile phase A held from 0 to 3.5 min, followed by 100% mobile phase B at 12 min, which was then held for 8 min with the flow rate increased to 0.5 mL/min. The mobile phase was changed to 100% mobile phase A at 20 min for 5 min and kept for another 5 min. The column temperature was held at 35 °C throughout the analysis. For the TOF, negative electrospray ionisation was utilised with the following conditions: Gas flow, 11 L/min; Gas temperature, 325 °C; Nebuliser pressure, 35 psi; Sheath gas flow, 11 L/min; Sheath gas temperature, 375 °C; Vcap voltage, 3500 V; Nozzle voltage, 500 V; Skimmer, 65; OctopoleRFPeak, 750; Scan rate, 2 spectra/s. The fragmentor voltage was altered during each 35 min sample run: 2–7.5 min, 140 V; 7.5–15 min, 100 V, 140 V and 150 V. The following flow diversions for the UPLC were used: 0–2 min to waste, 2–15 min to TOF-MS, and 15–35 min to waste. A methanol: 10 mM ammonium hydroxide (7:3) mixture was used for preparing the standard mixtures used for intracellular metabolite quantitation. The intracellular metabolites determined were R5P + Ru5P + X5P (pool of ribose-5-phosphate, ribulose-5-phosphate, and xylulose-5-phosphate), F16BP (fructose-1,6-bisphosphate), DHAP + GAP (pool of dihydroxyacetone phosphate and glyceraldehyde-3-phosphate), PYR (pyruvate), DXP (1-deoxy-d-xylulose-5-phosphate), MVA (mevalonate), MVAP (5-phosphomevalonate), MVAPP (5-pyrophosphomevalonate), IPP + DMAPP (pool of isopentenyl pyrophosphate and dimethylallyl pyrophosphate), FPP (farnesyl diphosphate), and GPP (geranyl diphosphate). Calibration mixtures were made up to 100 µL with 10 µL of internal standard mixture consisting of 50 µg/mL MVA-d3 and 10 µg/mL TMP. The following concentration ranges were used for the calibration mixtures: R5P (R5P + Ru5P + X5P pool), 0.04–10 µg/mL; F16BP, 0.04–6 µg/mL; DHAP (DHAP + GAP pool), 0.04–5 µg/mL; PYR, 0.04–1.5 µg/mL; DXP, 0.04–10 µg/mL; MVA, 0.04–10 µg/mL; MVAP, 0.01–0.3 µg/mL; MVAPP, 0.01–0.3 µg/mL; IPP (IPP + DMAPP pool), 0.05–1.5 µg/mL; FPP, 0.05–1.5 µg/mL; GPP, 0.05–1.5 µg/mL. Agilent Masshunter Workstation Quantitative Analysis for TOF was utilised for metabolite quantitation.

G6P (glucose-6-phosphate) and F6P (fructose-6-phosphate) were quantified by injecting 2 µL of samples and running the samples through the LC-TOF-MS system with a ZIC-HILIC column (2.1 × 100 mm, 3.5 µm). The solvent gradient started with 10% mobile phase A at a flow rate of 0.5 mL/min at 0 min, followed by 25% mobile phase A at 1.5 min, 35% mobile phase A at 1.8 min held until 6 min, 10% mobile phase A from 6.5 min until 9.5 min with a flow rate of 0.6 mL/min, and 10% mobile phase A with a flow rate of 0.5 mL/min held until 10 min. The column temperature was constant at 35 °C. For the TOF, the same conditions were utilised as described for the other intracellular metabolites, with the exception of the fragmentor voltage, which was kept at 140 V from 0.1–6 min, while the flow diversion for the UPLC was 0–0.1 min to waste, 0.1–6 min to TOF-MS, and 6–10 min to waste. The following concentration ranges were used for the calibration mixtures: G6P, 0.04–10 µg/mL and F6P, 0.04–6 µg/mL, with TMP as the internal standard. Metabolite quantitation was executed using the Agilent Masshunter Workstation Quantitative Analysis for TOF.

For experiments involving 13C tracers, [1-13C]glucose (99 atom% 13C) and [4-13C]glucose (99 atom% 13C) tracers were used in parallel, in independent 50 mL cell cultures of the engineered wild-type strain (EcoCTs03), instead of glucose as described in the previous section. The cell cultures containing either [1-13C]glucose or [4-13C]glucose were prepared in duplicate and sacrificed 24 h after IPTG induction. The dodecane layers were removed after centrifugation and subjected to GC-MS analysis in scan mode for limonene. The metabolic flux ratios for the EMP and ED pathways were determined from the [1-13C]glucose experiments, and those for the DXP and MVA pathways from the [4-13C]glucose experiments, based on previous work60. The correction of limonene mass distribution vectors (MDVs) for natural 13C and 2H isotope incorporation was executed as previously described111, and the resulting corrected MDV (MDV*) was utilised for the calculation of the metabolic flux ratios.
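
To illustrate the idea behind the MDV correction, the sketch below removes the contribution of naturally occurring 13C from a measured mass distribution vector using a correction matrix. It is a simplified, carbon-only illustration and not the procedure of the cited reference: the 2H correction applied in this work is omitted, the natural abundance value is approximate, and the function names are hypothetical.

```python
from math import comb

P13C = 0.0107  # approximate natural abundance of 13C


def correction_matrix(n_carbons, p=P13C):
    """Column j holds the mass-shift distribution observed when j carbons
    are tracer-labelled and the other n - j carry natural 13C at rate p."""
    size = n_carbons + 1
    matrix = [[0.0] * size for _ in range(size)]
    for j in range(size):
        free = n_carbons - j
        for k in range(free + 1):
            matrix[j + k][j] = comb(free, k) * p ** k * (1 - p) ** (free - k)
    return matrix


def correct_mdv(measured, p=P13C):
    """Solve C @ corrected = measured by forward substitution
    (C is lower triangular), then renormalise to sum to 1."""
    n = len(measured) - 1
    c = correction_matrix(n, p)
    corrected = []
    for i in range(n + 1):
        residual = measured[i] - sum(c[i][j] * corrected[j] for j in range(i))
        corrected.append(residual / c[i][i])
    total = sum(corrected)
    return [value / total for value in corrected]


# Limonene has ten carbons, so an MDV has eleven entries (M+0 ... M+10).
# A fully unlabelled molecule measured with natural 13C corrects to M+0 = 1.
unlabelled = [row[0] for row in correction_matrix(10)]
print([round(v, 6) for v in correct_mdv(unlabelled)])
```

With noisy measurements the corrected vector can contain small negative entries; how these are handled (clipping, constrained fitting) is a separate choice not addressed here.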

Model construction

The dynamic model of limonene production by the E. coli strain EcoCTs03 was developed with the open-source, stand-alone program COPASI (build 26061) and was modified from our recently published model65. The dynamic model describes the carbon and energy metabolism of EcoCTs03 expressing the MVA pathway from the exponential growth phase after IPTG induction under aerobic conditions. The model contains 55 species and 56 reactions covering the pathways shown in Fig. 1: the EMP pathway, ED pathway, tricarboxylic acid cycle (TCA cycle), pentose phosphate pathway, acetate metabolism, MVA pathway and DXP pathway62,112,113. For simplicity, the model comprises only one compartment. Extracellular metabolites were denoted with an “-ex” suffix, for example “Glcex” or “ACEex”, which denote extracellular glucose and acetate, respectively. Concentration units of metabolites in the model were μmol/L/g dry cell weight (g DCW; reflected as μmol/l in the COPASI file), the same units as used in the time-series metabolomics data. Normalising metabolite concentrations to g DCW allowed for comparison across the different time points after IPTG induction. All reactions were described using the Michaelis–Menten rate law, except for the PDH, LDH, PoxB and LAC transport reactions, which were described by mass action instead. Supplementary Table 1 details the rate laws, rate law equations and the parameter values of the final fitted model.
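
The two rate-law types named above, and the way a knockout acts in such a model, can be illustrated with a deliberately small toy network. The sketch below is not the COPASI model: it lumps the network into three reactions with 1:1 stoichiometry, uses explicit Euler integration, and all parameter values and names are hypothetical. PDH and LDH follow mass action, as in the text, while a lumped AcCoA-to-limonene step follows Michaelis–Menten kinetics.

```python
def michaelis_menten(vmax, km, s):
    """Irreversible Michaelis-Menten rate: v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)


def mass_action(k, s):
    """Irreversible first-order mass-action rate: v = k * S."""
    return k * s


def simulate(k_ldh, k_pdh=1.0, vmax=2.0, km=0.5, pyr0=10.0,
             t_end=8.0, dt=0.001):
    """Toy network (hypothetical parameters, lumped 1:1 stoichiometry):
    PYR -> AcCoA (PDH, mass action), PYR -> LAC (LDH, mass action),
    AcCoA -> LIM (lumped synthesis step, Michaelis-Menten)."""
    pyr, accoa, lim, lac = pyr0, 0.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        v_pdh = mass_action(k_pdh, pyr)
        v_ldh = mass_action(k_ldh, pyr)
        v_syn = michaelis_menten(vmax, km, accoa)
        pyr += dt * (-v_pdh - v_ldh)
        accoa += dt * (v_pdh - v_syn)
        lim += dt * v_syn
        lac += dt * v_ldh
    return {"PYR": pyr, "AcCoA": accoa, "LIM": lim, "LAC": lac}


wild_type = simulate(k_ldh=0.5)
ldh_knockout = simulate(k_ldh=0.0)  # in silico knockout: rate constant set to zero
print(round(wild_type["LIM"], 2), round(ldh_knockout["LIM"], 2))
```

Setting the LDH rate constant to zero removes the competing branch, so more of the pyruvate pool reaches the product in the toy model; in the full model the same kind of perturbation is applied to the fitted parameters.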