Introduction

Diols are widely used as fuel additives1 and cosmetic components2,3, and in the synthesis of polymers4 and pharmaceuticals5,6. For instance, 1,3-propanediol (1,3-PDO) is used in the textile and polymer industries as a monomer and as an additive for cosmetics. The position of the hydroxyl groups is one key factor contributing to structural diversity, resulting in different physical and chemical properties7,8,9. Accordingly, diols can be classified as α,β-diols, α,γ-diols, α,ω-diols, α,n-diols, and β,γ-diols, depending on their hydroxyl positions7,8,10,11. As an example of the β,γ-diol class, 2,3-butanediol (2,3-BDO) is a promising chemical that is used in the production of softeners, plasticizers, polyesters, and cosmetics because of its attractive properties12,13,14,15. The carbon number of the alkyl chain and modification of the side chain also alter physical properties such as melting point, boiling point, and viscosity16,17. Because different alkyl chain structures and the isomerization of hydroxyl groups10,16,18 give diols distinct properties (Supplementary Table 1), the applications of diols are rapidly expanding, and some diols have even been proposed as ideal Martian rocket propellants12,13,14,15.

For a sustainable future, there is growing interest in synthesizing chemicals from renewable materials and agricultural waste as an alternative to petroleum-based methods19,20,21. Biosynthetic pathways have been explored for the synthesis of many C3 and C4 diols, including 2,3-BDO, 1,3-PDO, 1,2-propanediol, and 1,4-butanediol (1,4-BDO)6,17,22,23. Among them, 1,3-PDO has been produced biologically at an industrial scale by DuPont. Yim et al. developed a biosynthetic route for the production of 1,4-BDO from succinate, and the engineered Escherichia coli was capable of producing 18 g/L 1,4-BDO7. To expand the structural diversity of diols, more artificial biosynthetic routes have recently been established. Carbon number extension and side-chain modification of diols often involve the development of non-natural pathways that are compatible with biological systems. Wang et al. combined amino acid catabolism with carboxylic acid reductase and the endogenous reductase system to achieve the synthesis of C3-C5 straight-chain diols9. Liu et al. constructed a platform to synthesize C3-C5 branched-chain diols via hydroxylated amino acids, namely, isopentyldiol, 2-methyl-1,3-butanediol, 2-methyl-1,4-butanediol, 2-methyl-1,3-propanediol, 2-ethyl-1,3-propanediol, and 1,4-pentanediol24. This route comprises four reactions catalyzed by amino acid hydroxylase, L-amino acid deaminase, α-keto acid decarboxylase, and aldehyde reductase24. In the class of β,γ-diols, there are no reports on the biological synthesis of higher β,γ-diols (>C4), even though 2,3-BDO can be synthesized from the natural acetoin pathway in bacteria25. Considering the unique chemical and physical features of β,γ-diols, such as their chiral centers and their versatility in synthesizing complex molecules, it would be desirable to expand this category of β,γ-diols for potential applications in the development of novel pharmaceuticals, agrochemicals, or materials with distinct properties. Additionally, expanding the structural diversity of β,γ-diols could lead to breakthroughs in catalysis-related fields26.

In this study, focusing on the development of β,γ-diols with diverse alkyl chains, we report a recursive carboligation cycle mediated by acetohydroxyacid synthase (AHAS) to achieve de novo production of β,γ-diols from branched-chain amino acids (BCAA) metabolism (Fig. 1). This biosynthetic route is composed of AHAS-mediated synthesis of 2-ketoacids for producing aldehyde intermediates, followed by carboligation of aldehydes with pyruvate to form α-hydroxyketones, which are further reduced by aldo-keto reductases (AKRs) to yield branched-chain β,γ-diols. Through the mining of promiscuous AHAS for carboligation of isobutyraldehyde and pyruvate, we find that the catalytic domain of AHAS from Saccharomyces cerevisiae (Ilv2c) could effectively condense branched-chain aldehydes with pyruvate. Subsequently, we develop a de novo platform for the synthesis of branched-chain β,γ-diols from the BCAA metabolism in E. coli by innovating the recursive carboligation cycle mediated by Ilv2c. After systematic optimization of the BCAA pathway, we achieved the titer of 4-methylpentane-2,3-diol (4-M-PDO) at 74.1 mM, 5-methylhexane-2,3-diol (5-M-HDO) at 18.1 mM, and 4-methylhexane-2,3-diol (4-M-HDO) at 0.5 mM. Under the fed-batch condition, the engineered E. coli harboring a single plasmid expressing the 4-M-PDO biosynthetic pathway could produce 129.8 mM (15.3 g/L) 4-M-PDO at 144 h with a consumption of 267 mM glucose, reaching ~72% of the theoretical maximum. In summary, we report that diverse branched-chain β,γ-diols can be produced from BCAA pathway, and the AHAS-mediated carbon elongation system may be generalizable for the future production of other aldehyde-derived β,γ-diols.

Fig. 1: The design principle of recursive carboligation cycle for β,γ-diol synthesis.
a Proposed recursive carboligation mediated by bifunctional acetohydroxyacid synthase (AHAS) for the synthesis of β,γ-diols. b The L-valine pathway as an example to demonstrate the recursive carboligation mediated by AHAS for 4-M-PDO synthesis. AHAS catalyzes the binding of pyruvate to thiamine diphosphate (ThDP) to form lactyl-ThDP (process 1). AHAS then catalyzes the decarboxylation of lactyl-ThDP to form the hydroxyethyl-ThDP anion/enamine (HEThDP) (process 2). Subsequently, the HEThDP intermediate binds to aldehydes derived from the decarboxylation of 2-ketoacids to form α-hydroxyketone-ThDP (process 6). α-Hydroxyketones are released, and AHAS and ThDP return to the cycle (process 7). Diols such as 4-methylpentane-2,3-diol (4-M-PDO) are produced by further reduction of α-hydroxyketones by AKR and sADH (process 8). Processes 1-5 represent the steps for 2-ketoisovalerate synthesis and its decarboxylation to isobutyraldehyde. c Potential diols from the branched-chain amino acid (BCAA) metabolism. Abbreviations: AHAS acetohydroxyacid synthase, KDC 2-ketoacid decarboxylase, AKR aldo-keto reductase, sADH secondary alcohol dehydrogenase.

Results

Design of a recursive carboligation cycle inspired by naturally existing reactions

Inspired by the fact that natural 2,3-BDO is derived from acetoin via reduction by secondary alcohol dehydrogenase (sADH)27, we expected that more structurally diverse diols could be generated if carboligation of aldehydes with pyruvate gives α-hydroxyketones, which can be further reduced to diols in a similar way to 2,3-BDO synthesis (Fig. 1a). Since the metabolic pathway from pyruvate to branched-chain aldehydes is clearly elucidated28,29,30,31, the main bottleneck for constructing an artificial diol biosynthetic pathway is the carboligation of aldehyde and pyruvate. The family of thiamine diphosphate (ThDP)-dependent enzymes catalyzes a wide range of C-C bond ligations. For instance, AHAS not only catalyzes the carboligation of two molecules of pyruvate for α-acetolactate synthesis32,33, but also effectively condenses benzaldehyde with pyruvate to form L-phenylacetylcarbinol (L-PAC)34,35,36 (Supplementary Fig. 1). In theory, it is catalytically feasible to create a synthetic route adopting the mode of “aldehyde-hydroxyketone-diol”37 (Fig. 1a). This route brings two benefits in diol synthesis: first, the α-hydroxyketone product gains its first hydroxyl group at the γ position; second, the incorporation of pyruvate extends the carbon chain by two carbons ("+2"). Taking the L-valine biosynthetic pathway as an example, it should be possible to synthesize 4-M-PDO as the expected β,γ-diol via the recursive carboligation mediated by AHAS (Fig. 1b). As a great variety of branched-chain aldehydes can be generated through BCAA metabolism, such as the L-leucine and L-isoleucine pathways, it is possible to further expand the product diversity of diols beyond 4-M-PDO (Fig. 1c). Therefore, the recursive carboligation mediated by AHAS may create a distinct class of non-natural α-hydroxyketone scaffolds, which can be further reduced by AKR or sADH to form diols.
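The "+2" carbon bookkeeping described above can be illustrated with a minimal sketch. It assumes only that each carboligation adds one pyruvate (C3) and releases one CO2 during the decarboxylation of lactyl-ThDP; the function and variable names are illustrative and not part of the original work.

```python
# Carbon bookkeeping for the "aldehyde-hydroxyketone-diol" route (illustrative only).
PYRUVATE_CARBONS = 3  # pyruvate is a C3 acid
CO2_CARBONS = 1       # one carbon is lost when lactyl-ThDP is decarboxylated

# Carbon counts taken from the molecular formulas of the three BCAA-derived aldehydes.
ALDEHYDES = {
    "isobutyraldehyde": 4,       # C4H8O, L-valine branch
    "isovaleraldehyde": 5,       # C5H10O, L-leucine branch
    "2-methylbutyraldehyde": 5,  # C5H10O, L-isoleucine branch
}


def diol_carbons(aldehyde_carbons: int) -> int:
    """Carbons in the alpha-hydroxyketone (and hence the diol) after one carboligation."""
    return aldehyde_carbons + PYRUVATE_CARBONS - CO2_CARBONS


for name, n in ALDEHYDES.items():
    print(f"{name}: C{n} aldehyde -> C{diol_carbons(n)} diol")
```

Under these assumptions, isobutyraldehyde maps to a C6 diol and the two C5 aldehydes map to C7 diols, in line with the 4-M-PDO (C6), 5-M-HDO (C7), and 4-M-HDO (C7) products discussed in this work.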

Identification of AHAS for condensing branched-chain aldehydes

To find a suitable AHAS for carboligation of branched-chain aldehydes with pyruvate, several reported AHASs with broad substrate spectra, including IlvBN38 from E. coli, AlsS39 from Bacillus subtilis, and Ilv236 from S. cerevisiae, were selected for evaluating the activity towards the model substrate isobutyraldehyde (1a). As the full-length Ilv2 with the mitochondrion-localization signal peptide formed inclusion bodies when heterologously expressed in E. coli40, we truncated Ilv2 by removing the mitochondrion-localization signal peptide, and the resulting construct was named Ilv2c. As shown in Supplementary Fig. 2, we confirmed that all AHASs were successfully expressed in the soluble fraction in E. coli. We found that the whole-cell biocatalysts of both IlvBN and AlsS showed good activities, with >99% conversion of benzaldehyde to L-PAC as previously reported38,39 (Supplementary Fig. 3). Interestingly, Ilv2c heterologously expressed in E. coli also had a comparable carboligation activity for L-PAC formation, an activity previously observed in yeast36.

Although all three AHASs showed comparable activity towards benzaldehyde, their carboligation activities for condensing isobutyraldehyde and pyruvate were very different (Fig. 2a). Only Ilv2c efficiently converted substrate 1a into products 1b (9.44 min) and 1c (9.56 min, the dominant product). In comparison, the IlvBN system converted substrate 1a into the byproduct isobutanol (6.74 min, determined by the retention time of an authentic standard). AlsS produced a relatively small amount of 1c, with the majority of 1a being converted to the byproduct. As AlsS from B. subtilis was previously used for isobutyraldehyde overproduction in Synechococcus elongatus PCC794241, it is unsurprising that AlsS lacks good carboligation activity towards isobutyraldehyde; otherwise, isobutyraldehyde overproduction in S. elongatus would not have been feasible. We initially expected 3-hydroxy-4-methylpentan-2-one (3-H-4-MP-2-one) to be the main product when only AHAS was overexpressed in E. coli. Surprisingly, we found that the endogenous metabolism of E. coli could directly reduce the α-hydroxyketone intermediate (3-H-4-MP-2-one) into 4-M-PDO, as determined by GC-MS analysis (Fig. 2b). Compound 1c showed a characteristic fragmentation pattern in line with 4-M-PDO, whereas compound 1b likely corresponded to 3-H-4-MP-2-one (Supplementary Fig. 4).

Fig. 2: Identification of AHAS for condensing branched-chain aldehydes with pyruvate.
a Representative GC-FID chromatogram results of the carboligation of isobutyraldehyde (IBAL) and pyruvate catalyzed by MR-IlvBN, MR-AlsS, and MR-Ilv2c. Compound 1a represents substrate IBAL, 1b represents product 3-H-4-MP-2-one, 1c represents 4-M-PDO. b Representative GC-MS spectrum to validate the chemical structure of compound 1c. Red spectrum represents 1c produced by the biocatalytic system; blue spectrum represents 4-M-PDO from NIST library. c Representative GC-FID chromatogram results of the carboligation of isovaleraldehyde and pyruvate catalyzed by MR-Ilv2c. d Representative GC-FID chromatogram results of the carboligation of 2-methylbutyraldehyde and pyruvate catalyzed by MR-Ilv2c. Experiments were carried out in three biological repeats, and only representative results are shown.

Next, we further evaluated the catalytic promiscuity of Ilv2c towards other aldehydes from BCAA pathway, namely, isovaleraldehyde (2a) from L-leucine pathway, and 2-methylbutyraldehyde (3a) from L-isoleucine pathway. As shown in Fig. 2c, substrate 2a was transformed to products 2b (10.53 min) and 2c (10.60 min). Substrate 3a gave products 3b (10.70 min) and 3c (10.77 min) as depicted in Fig. 2d. GC-MS analysis suggested that 2b was 3-hydroxy-5-methylhexan-2-one (3-H-5-MH-2-one) formed from isovaleraldehyde with pyruvate, and 3b was compound 3-hydroxy-4-methylhexan-2-one (3-H-4-MH-2-one) formed from the condensation of 2-methylbutyraldehyde with pyruvate. Products 2c (5-methylhexane-2,3-diol, 5-M-HDO) and 3c (4-methylhexane-2,3-diol, 4-M-HDO) were the corresponding diol products, respectively, generated from 2b and 3b (Supplementary Fig. 5 and 6). We also found that compound 2c was a dominant product in the biocatalytic system, whereas compound 3b was not fully converted to 3c, indicating that the endogenous reductases in E. coli might have different activity towards various alkyl groups of α-hydroxyketones. Taken together, we report Ilv2c for condensing a variety of branched-chain aldehydes with pyruvate, which paves the way for the subsequent development of recursive carboligation cycle for diol synthesis from BCAA pathway.

Elucidation of the potential AKRs involved in the diol synthesis

As diols were the dominant product from the biocatalytic system, we suspected that the endogenous AKRs in E. coli42,43 might catalyze the reduction of α-hydroxyketones to diols. Although the E. coli RARE strain used in this study already carries deletions of several AKR-related genes (DkgB, YeaE, DkgA) and alcohol dehydrogenase (ADH) genes (YqhD, YahK, YjgB), it is still possible that residual reductase activities from other AKR enzymes remain. To test this hypothesis, we suppressed the expression of several known endogenous AKR genes (YdjG, YdbC, YgdS, YghZ, and YdhF)44 by a gene knock-down strategy using CRISPR interference (CRISPRi) mediated by deactivated Cas12a (dCas12a)45,46 (Fig. 3a), and observed the effects on the product distribution between α-hydroxyketones and diols. As shown in Fig. 3b, strains Ilv2c-dCpf1-gYdjG, Ilv2c-dCpf1-gYghZ, and Ilv2c-dCpf1-gYdhF increased the accumulation of 3-H-4-MP-2-one and reduced the content of 4-M-PDO compared to the control Ilv2c-dCpf1. These results indicated that YdjG, YghZ, and YdhF might play some role in the reduction of α-hydroxyketones such as 3-H-4-MP-2-one. Next, to further validate the activity of AKRs on α-hydroxyketones, we used purified YdjG, YghZ, and YdhF (Supplementary Fig. 7) to individually convert the substrate mixture obtained from biocatalysis, which contains 3-H-4-MP-2-one and 4-M-PDO (Supplementary Data 1). As can be seen from Fig. 3c–e, the percentage of diol product in the YdjG, YdhF, and YghZ systems was substantially higher than that of the control, whereas the amounts of α-hydroxyketone showed the opposite trend. Therefore, YdjG, YdhF, and YghZ likely contribute to 4-M-PDO synthesis. As a comparison, we further evaluated the catalytic activities of the purified AKRs towards substrate 3b (3-H-4-MH-2-one), but no inversion of the substrate-to-product peak ratio was observed (Supplementary Fig. 8). Therefore, we concluded that there might be other reductases acting synergistically with YdjG, YghZ, and YdhF to convert α-hydroxyketones into their corresponding diols.

Fig. 3: Elucidation of the potential AKRs involved in the diol synthesis.
a dCas12a-mediated CRISPRi strategy to repress the expression of AKR genes (YdjG, YdbC, YgdS, YghZ, and YdhF). b The distribution profile of the product 4-M-PDO (1c) and the intermediate 3-H-4-MP-2-one (1b). Reactions were carried out by strains carrying pRSF-Ilv2c and different CRISPRi inhibitory plasmids. The star indicates a clear change in product distribution. Data represent the mean value ± SD from three biological replicates. c–e The product distribution profile obtained by GC-FID analysis shows the enzyme activity of YdjG (c), YghZ (d) and YdhF (e). Experiments were conducted using the substrates of 3-H-4-MP-2-one and 4-M-PDO obtained from biocatalysis. For each reaction, 5 mg/mL of purified protein was individually added to the biocatalytic system. Experiments were carried out in three biological repeats, and representative results are shown. Source data are provided as a Source Data file.

De novo synthesis of branched-chain diols via recursive carboligation mediated by AHAS

Next, we aimed to assemble an upstream pathway for de novo synthesis of branched-chain aldehydes to achieve Ilv2c-mediated recursive carboligation for diol production. As shown in Fig. 4a, we introduced a dual-plasmid system into E. coli so that the 2-ketoacids from the natural network of BCAA pathway could be decarboxylated to give the desired aldehyde substrates. It is well known that the native metabolism in E. coli produces substantial amounts of 2-ketoacids during the synthesis of BCAA28. In order to convert 2-ketoacids into their corresponding aldehydes, we introduced a substrate-promiscuous decarboxylase Aro10 from S. cerevisiae47,48,49 into E. coli. As depicted in Supplementary Fig. 9, E. coli RARE strain harboring pET-Aro10 and pRSF-Ilv2c (MR-01) could give the expected diols after 108 h fermentation, with 4-M-PDO as the dominant product. To minimize the Aro10-mediated formation of byproduct such as phenylacetaldehyde47, we also used an alternative decarboxylase MdlC from Pseudomonas putida with a substrate preference towards smaller substrates50. However, the product profile characterized by GC-FID revealed that the activity of MdlC was not comparable with that of Aro10 for diol production (Supplementary Data 2).

Fig. 4: De novo synthesis of diols via the recursive carboligation cycle from BCAA metabolism.
a The dual-plasmid system was transformed into E. coli RARE to evaluate the recursive carboligation cycle for synthesizing diols from glucose. The biosynthetic route for diol synthesis is shown in the dotted box. b The distribution of L-valine pathway derivatives of 4-M-PDO and its precursors. c The distribution of L-leucine pathway derivatives of 5-M-HDO and its precursors. d The distribution of L-isoleucine pathway derivatives of 4-M-HDO and its precursors. e The percentage profile of diols produced by engineered strain MR-01. The average and standard deviation are obtained from three biological replicates. Source data are provided as a Source Data file.

Next, a time course study was carried out to evaluate the accumulation of pathway intermediates such as aldehydes and α-hydroxyketones, and of the diol end products (Supplementary Data 2). We found that only minimal isobutyraldehyde and 3-H-4-MP-2-one were observed (Fig. 4b), indicating the successful operation of the diol biosynthetic pathway for 4-M-PDO (C6) production. Similarly, 5-M-HDO (C7) was the dominant form, with only a marginal accumulation of 3-H-5-MH-2-one (Fig. 4c). We found that the majority of BCAA flux was diverted to the synthesis of 4-M-PDO and 5-M-HDO, whereas 4-M-HDO was not detectable. Instead, the engineered E. coli only accumulated 3-H-4-MH-2-one (Fig. 4d), indicating that the AKR activity was limiting for 4-M-HDO production, which is consistent with observations from the biocatalytic analysis (Fig. 2d). Overall, 4-M-PDO was the dominant product, which is consistent with previous observations that isobutanol always remains the dominant product over other branched-chain higher alcohols. As L-valine, L-leucine, and L-isoleucine synthesis share the same enzymes, such as ketol-acid reductoisomerase IlvC and dihydroxy-acid dehydratase IlvD, engineering the substrate preference of these shared enzymes would be required to redistribute the metabolic flux in BCAA metabolism. As shown in Fig. 4e, 4-M-PDO derived from the L-valine pathway accounted for 80% of the total amount of diols, and the remaining 20% was 5-M-HDO derived from the L-leucine pathway.

Systematic metabolic engineering of the BCAA pathway for improved diol synthesis

Since pyruvate is the key substrate in the diol biosynthetic pathway, we attempted to further engineer E. coli RARE to enhance the pyruvate supply. The genes Pta, PflB, LdhA, and AdhE, which contribute to the formation of fermentation byproducts such as lactate and ethanol, were deleted and verified by diagnostic PCR (Supplementary Fig. 10), and the resulting strain was designated as MR4Δ. Next, we strengthened the individual L-valine, L-leucine, and L-isoleucine pathways to enhance the supply of specific 2-ketoacids for increasing 4-M-PDO, 5-M-HDO, and 4-M-HDO production (Fig. 5a). To increase the synthesis of 4-M-PDO (C6), we constructed a recombinant plasmid for overexpressing IlvCD, and the resulting strains MR-Ilv2c-Aro10-IlvCD (MR-03) and MR4Δ-Ilv2c-Aro10-IlvCD (MR4Δ-01) were obtained. As shown in Fig. 5b, MR-03 produced 14.0 mM 4-M-PDO, which was 2.5 times higher than that of MR-01 without IlvCD overexpression. Deletion of the byproduct pathway-related genes Pta, PflB, LdhA, and AdhE further increased the production of 4-M-PDO to 27.8 mM (Supplementary Data 3). Moreover, we observed increased glucose consumption and better accumulation of biomass in MR4Δ-01 (Fig. 5c). Consistent with previous reports28, we found that overexpression of the L-valine pathway and deletion of byproduct pathways could be similarly implemented to achieve a higher amount of 4-M-PDO. As shown in Fig. 5d, the proportion of 4-M-PDO also increased from 92% to 96% upon deletion of Pta, PflB, LdhA, and AdhE.

Fig. 5: Metabolic engineering strategies to improve the precursor supply for enhanced diol production.
a Metabolic routes for the BCAA pathway towards the synthesis of 4-M-PDO, 5-M-HDO, and 4-M-HDO. b Comparison of 3-H-4-MP-2-one and 4-M-PDO production between strains MR-03 and MR4Δ-01. c The time course of glucose consumption and the optical density at 600 nm (OD600) of strains MR-03 and MR4Δ-01. d The proportion of different diols in the engineered strains MR-03 and MR4Δ-01. e The diol production and distribution of strain MR4Δ-02, in which the L-leucine and L-isoleucine pathways are strengthened by overexpressing IlvA and LeuABCD. f The diol production of strain MR4Δ-03 with LeuA*BCD overexpression and MR4Δ-04 with IlvCD and LeuA*BCD overexpression. LeuA* encodes the feedback-resistant LeuAG462D. Statistical analysis was performed using a two-sided unpaired t-test, and no adjustment was made for multiple comparisons. ** indicates p-value = 0.0053, *** indicates p-value = 0.0005. Source data are provided as a Source Data file.

To enhance the production of 5-M-HDO and 4-M-HDO (C7), we overexpressed IlvA together with LeuABCD. As shown in Fig. 5e, recombinant strain MR4Δ-Ilv2c-Aro10-IlvA-LeuABCD (MR4Δ-02) produced 5.4 mM 5-M-HDO (35% portion) and 0.5 mM 4-M-HDO (4% portion), with the majority of diols remaining in the form of 4-M-PDO (9.4 mM, 61% portion). Notably, the L-isoleucine-derived product accumulated in the form of 3-H-4-MH-2-one because of the insufficient activity of AKRs (Supplementary Data 4). To further increase 5-M-HDO production from the L-leucine branch, we implemented the feedback-resistant LeuAG462D (designated as LeuA* hereafter) to increase the flux towards the L-leucine pathway. We thus constructed two additional recombinant strains, MR4Δ-Ilv2c-Aro10-LeuA*BCD (MR4Δ-03) and MR4Δ-Ilv2c-Aro10-IlvCD-LeuA*BCD (MR4Δ-04). As shown in Fig. 5f, both MR4Δ-03 and MR4Δ-04 preferentially produced 5-M-HDO, whereas 4-M-PDO was close to the detection limit in strain MR4Δ-03. In particular, strain MR4Δ-04 with overexpression of IlvCD and LeuA*BCD reached 18.1 mM 5-M-HDO at 108 h, accounting for 96% of total diols (Supplementary Data 5). These findings suggested that the feedback-resistant LeuA* could condense 2-ketoisovalerate (2-KIV) more efficiently than the wild-type LeuA used in Fig. 5e, thereby preventing the metabolic flux from 2-KIV from prematurely entering 4-M-PDO synthesis.
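The product proportions quoted for MR4Δ-02 follow from the individual titers. A minimal sketch of that arithmetic is given below, using only the rounded titers stated in the text; the helper name `shares` is illustrative. Because the inputs are rounded to one decimal place, the computed percentages can differ from the reported ones by about a percentage point.

```python
def shares(titers_mM: dict) -> dict:
    """Return each product's share of the total diol titer, in percent."""
    total = sum(titers_mM.values())
    return {name: 100.0 * titer / total for name, titer in titers_mM.items()}


# Rounded titers for strain MR4Δ-02 as quoted in the text (mM).
mr4d_02 = {"4-M-PDO": 9.4, "5-M-HDO": 5.4, "4-M-HDO": 0.5}

result = shares(mr4d_02)
for name, pct in result.items():
    print(f"{name}: {pct:.1f}% of total diols")
```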

High-level production of 4-M-PDO through further metabolic engineering and process optimization

As the IlvD from Synechocystis sp. PCC 6803 (IlvD6803) has dual functionality as keto-3-deoxygluconate-6-phosphate (KDPG) aldolase51, we reasoned that the use of IlvCD from Synechocystis sp. PCC 6803 might favor the Entner-Doudoroff (ED) pathway with NADPH regeneration to improve diol synthesis. As shown in Supplementary Fig. 11, IlvCD6803 overexpression in recombinant strain MR4Δ-05 increased the titer of 4-M-PDO to 34.4 mM, which is 1.23-fold that obtained with E. coli IlvCD overexpression (Supplementary Data 6). The increased titer of 4-M-PDO might be caused by the improved NADPH supply from the ED pathway, combined with the different enzyme levels and distinct kinetics of the alternative IlvCD6803.

To further reduce the metabolic burden of the engineered strain, we reassembled the Ilv2c-Aro10-IlvCD6803 pathway into the low-copy plasmid pACYCDuet-1 to obtain pAC-Ilv2c-Aro10-IlvCD6803 (Fig. 6a). The resulting strain MR4Δ-06 substantially increased the titer of 4-M-PDO to 74.1 mM at 108 h, and the proportion of 4-M-PDO reached 91% of total diols (Fig. 6b, c). Finally, we scaled up the process using fed-batch fermentation. As shown in Fig. 6d, strain MR4Δ-06 produced 129.8 mM 4-M-PDO at 144 h of cultivation with pulse feeding of glucose. The proportion of 4-M-PDO increased as the fermentation progressed, eventually reaching 95% of total diols (Fig. 6d). The reduced cell density compared to our previous small-volume system (10 mL culture medium in a 50 mL flask) indicated that a high level of 4-M-PDO might inhibit the growth of E. coli (Fig. 6e). Taken together, we have systematically optimized the 4-M-PDO pathway in E. coli, and the yield of 4-M-PDO reached ~72% of the theoretical maximum (Supplementary Data 7).
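The fed-batch figures can be cross-checked with a short calculation. The sketch below is an approximation under stated assumptions, not the calculation used in Supplementary Data 7: it assumes that one 4-M-PDO requires three pyruvate (two for α-acetolactate and one for the carboligation with isobutyraldehyde), that glycolysis gives two pyruvate per glucose, and that cofactor and biomass demands are ignored. The glucose consumption of 267 mM is the value reported for this fermentation, and the molar mass is computed from the formula C6H14O2.

```python
MW_4_M_PDO = 118.17          # g/mol for C6H14O2, from standard atomic masses
TITER_MM = 129.8             # mM 4-M-PDO at 144 h (reported)
GLUCOSE_CONSUMED_MM = 267.0  # mM glucose consumed (reported)

# Assumed stoichiometry (see lead-in): 2 pyruvate per glucose, 3 pyruvate per diol.
PYRUVATE_PER_GLUCOSE = 2
PYRUVATE_PER_DIOL = 3

titer_g_per_l = TITER_MM * MW_4_M_PDO / 1000.0
observed_yield = TITER_MM / GLUCOSE_CONSUMED_MM               # mol diol per mol glucose
theoretical_yield = PYRUVATE_PER_GLUCOSE / PYRUVATE_PER_DIOL  # mol diol per mol glucose
fraction_of_max = observed_yield / theoretical_yield

print(f"titer: {titer_g_per_l:.1f} g/L")
print(f"observed yield: {observed_yield:.3f} mol/mol")
print(f"fraction of assumed theoretical maximum: {fraction_of_max:.1%}")
```

With these assumptions the titer converts to about 15.3 g/L and the yield falls in the low 70s percent of the assumed maximum, close to the reported ~72%; the small difference may reflect rounding or a different stoichiometric basis in the supplementary calculation.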

Fig. 6: High-level production of 4-M-PDO via further metabolic engineering and process optimization.
a The composition of plasmid pAC-Ilv2c-Aro10-IlvCD6803. b The distribution profile of diols in engineered strain MR4Δ-06. c The titer of diols and their precursors in strain MR4Δ-06. d The titer of isobutyraldehyde, 3-H-4-MP-2-one, 4-M-PDO and the proportion of diols. e OD600 and the concentration of glucose of MR4Δ-06 in the fed-batch fermentation. The average and standard deviation are obtained from three biological replicates. Source data are provided as a Source Data file.

Discussion

With the continuous effort towards a sustainable future, more microbial cell factories have been proposed for diol production from renewable feedstocks. The boundaries of cellular metabolism have been expanded by exploring artificial synthetic pathways. Besides the natural 2,3-BDO synthesis based on the acetoin pathway, the development of synthetic strategies for other β,γ-diols is limited by the difficulty of hydroxylation at the β,γ positions and of side-chain modification. In this study, we report a recursive carboligation cycle to produce structurally diverse β,γ-diols by harnessing E. coli BCAA metabolism. The starting unit, pyruvate, undergoes BCAA metabolism and decarboxylation to form branched-chain aldehydes. The aldehyde substrate further condenses with another molecule of pyruvate to form α-hydroxyketones with "+2" carbon elongation, which are further reduced to diols. These biomanufactured diols have two distinguishing features: β,γ-dihydroxy groups and carbon numbers of six or more. Owing to their special structure and physicochemical properties, these branched-chain diols may become good candidates for cosmetics, polymer materials, biofuels, and chemical reagents52,53.

In this study, we identified Ilv2c from S. cerevisiae, which has good activity toward the carboligation of branched-chain aldehydes with pyruvate. Efficient de novo synthesis of 4-M-PDO (C6) and 5-M-HDO (C7) was realized by the recursive carboligation mediated by AHAS and systematic metabolic engineering of E. coli metabolism. Taking the C6 diol as an example, we have achieved high-yield, high-specificity production of 129.8 mM 4-M-PDO from glucose through systematic metabolic engineering and process optimization. At present, the titer of 4-M-HDO from the L-isoleucine branch is relatively low, which might be debottlenecked by identifying effective AKRs to completely reduce 3-H-4-MH-2-one. Moreover, since a feedback-resistant IlvA was reported for L-threonine and L-isoleucine overproduction54,55, such a feedback-resistant enzyme may be required to increase the metabolic flux towards 4-M-HDO. In addition, as the 2-ketobutyrate precursor is generated by the dehydration of L-threonine catalyzed by IlvA, it is also necessary to increase the supply of L-threonine for efficient 4-M-HDO overproduction. Collectively, increasing the supply of the L-threonine precursor, introducing the feedback-resistant IlvA, and identifying effective AKRs are three key strategies for realizing the full potential of 4-M-HDO synthesis.

As a great variety of medium-chain aldehydes can be generated through the engineered BCAA metabolism29, these non-natural aldehydes can also serve as the substrates to further expand the product diversity of β,γ-diols. Besides BCAA-derived branched-chain aldehydes, more aldehyde-related chemicals (Supplementary Table 2) produced by microbes56 could be incorporated into the AHAS-mediated carboligation cycle. Aliphatic aldehydes such as short-chain (Cn ≤ 5), medium-chain (C6-C12), and long-chain (Cn > 12) fatty aldehydes have been extensively studied by microbial fermentation. Chaves et al.57 overexpressed coenzyme A-acylating aldehyde dehydrogenase (Aldh) from Clostridium beijerinckii in E. coli to synthesize 0.63 g/L butyraldehyde from glucose. Moreover, C6-C12 fatty aldehydes can be easily produced from fatty acid degradation pathway and reverse β-oxidation pathway58. These aliphatic aldehydes might also be explored to form other β,γ-diols via AHAS-mediated carboligation with pyruvate, which might provide distinct chemical properties.

To achieve more economical and practical applications, more engineering strategies are needed to increase the robustness of E. coli for industrial use. For instance, a phage-resistant E. coli host might be constructed using a CRISPR system59 or the DNA phosphorothioation-based Ssp defense module60 to protect the host from phage infection without compromising the fermentation activity of the strain. Besides the E. coli RARE platform, other industrial hosts with the ability to accumulate the aldehyde intermediate might be constructed to favor diol synthesis. For instance, the recently developed MARE yeast already showed promising results for aldehyde accumulation43, and might be explored as a chassis yeast for future diol production30,61.

In summary, the recursive carboligation cycle provides a generic platform for the synthesis of β,γ-diols from BCAA metabolism. With the discovery of more aldehyde biosynthetic pathways and the mining of alternative carboligases, we envision that more structurally diverse diols can be generated via a similar approach. Considering that diols are mainly used as important industrial solvents, polymer building blocks, and cosmetics, the strategy of synthesizing higher alcohols from alternative cheap resources such as lignocellulose, methanol, and waste proteins may provide a more economical route than that from glucose. With continuous efforts, we believe it will be possible to adopt cheap feedstocks such as lignocellulose for the sustainable production of diols in the future.

Methods

General reagents

E. coli Top10 was used for general cloning. E. coli RARE (∆dkgB, ∆yeaE, ∆dkgA, ∆yqhC, ∆yqhD, ∆yahK, ∆yjgB in E. coli MG1655-T7 strain) was a kind gift from Kristala L. J. Prather, Massachusetts Institute of Technology. Antibiotics were purchased from Sangon (Shanghai, China). Restriction enzymes and DNA polymerase were purchased from New England Biolabs (Ipswich, MA, USA). PCR purification kit, gel extraction kit, and plasmid DNA extraction kit were all purchased from BioFlux (Shanghai, China). All chemicals were purchased from Macklin and Aladdin (Shanghai, China) unless otherwise stated.

Luria-Bertani (LB) medium (10 g/L tryptone, 5 g/L yeast extract, and 10 g/L NaCl) was used for cultivating the E. coli cells with corresponding antibiotics. The antibiotic concentrations were ampicillin 100 μg/mL, kanamycin 50 μg/mL, and chloramphenicol 34 μg/mL. The modified M9 medium was used for bioproduction, which is composed of 17.08 g/L Na2HPO4·12H2O, 3 g/L KH2PO4, 1 g/L NH4Cl, 0.5 g/L NaCl, 11.1 µg/L CaCl2, 240.7 mg/L MgSO4, 50 mg/L EDTA, 8.3 mg/L FeCl3·6H2O, 0.84 mg/L ZnCl2, 0.13 mg/L CuCl2·2H2O, 0.1 mg/L CoCl2·6H2O, 0.1 mg/L H3BO3, 16 mg/L MnCl2·6H2O, 0.3 mg/L Na2MoO4·4H2O, 5 mg/L thiamin hydrochloride, 10 mg/L nicotinic acid, 0.1 mg/L biotin, 7 g/L yeast extract, 4% (w/v) glucose.
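
The recipe above is given per liter, so the amounts to weigh out scale linearly with the batch volume. The following is a minimal sketch of that arithmetic for a subset of the components; the function name and the choice of components are illustrative assumptions, not part of the original protocol.

```python
# Concentrations in g/L, transcribed from the modified M9 recipe above.
# Only a subset of components is listed; the others scale the same way.
RECIPE_G_PER_L = {
    "Na2HPO4·12H2O": 17.08,
    "KH2PO4": 3.0,
    "NH4Cl": 1.0,
    "NaCl": 0.5,
    "yeast extract": 7.0,
}
GLUCOSE_PERCENT_W_V = 4.0  # 4% (w/v) means 4 g per 100 mL


def grams_needed(volume_ml):
    """Return the mass in grams of each listed component for a batch of volume_ml."""
    litres = volume_ml / 1000.0
    amounts = {name: conc * litres for name, conc in RECIPE_G_PER_L.items()}
    # Percent (w/v) is grams per 100 mL, so it is converted separately.
    amounts["glucose"] = GLUCOSE_PERCENT_W_V / 100.0 * volume_ml
    return amounts


if __name__ == "__main__":
    for name, grams in grams_needed(500).items():
        print(f"{name}: {grams:.3f} g")
```

For a 500 mL batch this gives 20 g glucose and 8.54 g Na2HPO4·12H2O.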

Plasmids construction

The IlvBN, IlvCD, and LeuABCD operons and the genes encoding threonine deaminase (IlvA, Gene ID: 948287), NADH-specific methylglyoxal reductase (YdjG, Gene ID: 946283), oxidoreductase (YdhF, Gene ID: 946960), pyridoxine 4-dehydrogenase (YdbC, Gene ID: 945980), NADP(H)-dependent aldo-keto reductase (YgdS, Gene ID: 947306), and L-glyceraldehyde 3-phosphate reductase (YghZ, Gene ID: 947480) were cloned from the genomic DNA of E. coli MG1655. The gene encoding acetolactate synthase from B. subtilis (AlsS, Gene ID: 936852) was cloned from the genomic DNA of B. subtilis 168. The genes encoding the acetolactate synthase catalytic subunit (Ilv2c, Gene ID: 855135) and phenylpyruvate decarboxylase (Aro10, Gene ID: 851987) were cloned from the genomic DNA of S. cerevisiae BY4741. The gene encoding benzoylformate decarboxylase (MdlC, Gene ID: 45523605) was cloned from the genomic DNA of P. putida KT2440. Cyanobacterial IlvCD was obtained from the genomic DNA of Synechocystis sp. PCC 6803.

The primers used in this study can be found in Supplementary Data 8. Plasmids were routinely cloned by standard restriction enzyme digestion and ligation approaches. Ilv2c, YdjG, YdbC, YghZ, YdhF, and YgdS were inserted into the pRSFDuet-1 vector between the BamHI/XhoI sites. The IlvCD operon was inserted into the pACYCDuet-1 vector between the BamHI/XhoI sites to yield pACYC-IlvCD. To construct pACYC-IlvCD-LeuA*BCD, we first inserted LeuA*BCD between the BamHI/SalI sites and then inserted IlvCD between the BglII/XhoI sites. pRSF-Ilv2c-Aro10 and pACYC-Ilv2c-Aro10-IlvCD6803 were constructed by a BsaI-mediated two-fragment ligation method. pCDF*-dCpf1-gYdjG, pCDF*-dCpf1-YgdS, pCDF*-dCpf1-gYghZ, pCDF*-dCpf1-gYdhF, and pCDF*-dCpf1-gYdbC were obtained by inserting the guide RNA into the pCDF*-dCpf1 plasmid via a BsaI-mediated Golden Gate assembly method45. All plasmids used in this study are provided in Supplementary Data 9. The Pta, PflB, LdhA, and AdhE genes in E. coli RARE were deleted using a CRISPR-Cas9-mediated gene knockout method62,63. Briefly, the recipient E. coli cells carrying the pKD46-Cas9 plasmid were induced with L-arabinose for Cas9 expression, and the prepared competent cells were transformed with a gRNA plasmid targeting the desired gene locus. Gene-specific diagnostic primers were then used to confirm successful gene knockout. All strains used in this study are listed in Supplementary Data 10.

Biocatalysis procedures

The colonies were inoculated into liquid LB supplemented with the corresponding antibiotics and cultured at 37 °C, 250 rpm for 12–16 h as the seed culture. Fresh overnight culture (1%) was inoculated into Terrific Broth to prepare the whole-cell catalysts. The whole-cell biotransformation and purified enzyme biocatalysis were carried out as described previously39,64. Briefly, the whole-cell biotransformation was typically performed in a 1 mL volume, and the purified enzyme reaction was performed in a 0.5 mL volume. Whole-cell biocatalysts MR-IlvBN/MR-AlsS/MR-Ilv2c were used for AHAS-mediated carboligation of branched-chain aldehydes and pyruvate. A 1 mL reaction contained 10 mM branched-chain aldehydes, 2 equivalents of pyruvate, and 10 g cell dry weight per liter (cdw/L) whole-cell biocatalysts in phosphate buffer (200 mM, pH 8.0). Reactions were incubated at 30 °C with shaking at 250 rpm for 12 h. Whole-cell biocatalysts MR-Ilv2c-dCpf1/MR-Ilv2c-dCpf1-gYdjG/MR-Ilv2c-dCpf1-gYgdS/MR-Ilv2c-dCpf1-gYdbC/MR-Ilv2c-dCpf1-gYghZ/MR-Ilv2c-dCpf1-gYdhF were used for CRISPRi-mediated AKR screening. A 1 mL reaction contained 10 mM branched-chain aldehydes, 2 equivalents of pyruvate, and 10 g cdw/L whole-cell biocatalysts in phosphate buffer (200 mM, pH 8.0). Reactions were incubated at 20 °C with shaking at 250 rpm for 24 h. Purified YdjG, YghZ, and YdhF were used for the AKR-mediated reaction. A 0.5 mL reaction mixture contained ~5 mg/mL purified enzyme, 4 mM NADPH, and 200 μL of a mixture of α-hydroxyketone and diol (obtained from the biocatalysis system). Reactions were incubated at 25 °C with shaking at 250 rpm for 24 h. The control reaction omitted the purified enzyme; all other components and concentrations were identical to those of the purified enzyme reaction.
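
The concentrations stated above translate into absolute amounts per reaction through simple unit arithmetic (mM × mL = µmol; g/L × mL = mg). A minimal sketch follows; the helper names are illustrative assumptions, and "2 equivalents" is read here as twice the 10 mM aldehyde concentration.

```python
def micromoles(conc_mM, volume_mL):
    """mM x mL = (mmol/L) x (1e-3 L) = umol."""
    return conc_mM * volume_mL


def milligrams(conc_g_per_L, volume_mL):
    """g/L x mL = mg (numerically the same as mg/mL x mL)."""
    return conc_g_per_L * volume_mL


ALDEHYDE_MM = 10            # branched-chain aldehyde concentration
PYRUVATE_EQUIVALENTS = 2    # relative to the aldehyde

# 1 mL whole-cell biotransformation
whole_cell = {
    "aldehyde_umol": micromoles(ALDEHYDE_MM, 1.0),
    "pyruvate_umol": micromoles(PYRUVATE_EQUIVALENTS * ALDEHYDE_MM, 1.0),
    "cells_mg_cdw": milligrams(10, 1.0),
}

# 0.5 mL purified-enzyme reaction
purified = {
    "enzyme_mg": milligrams(5, 0.5),   # ~5 mg/mL, so an approximate value
    "NADPH_umol": micromoles(4, 0.5),
}

if __name__ == "__main__":
    print(whole_cell)
    print(purified)
```

Each 1 mL whole-cell reaction therefore holds 10 µmol aldehyde, 20 µmol pyruvate, and 10 mg cdw of cells, and each 0.5 mL enzyme reaction holds about 2.5 mg enzyme and 2 µmol NADPH.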

Fermentation production of diols

The colonies were inoculated into liquid LB supplemented with the corresponding antibiotics and cultured at 37 °C, 250 rpm for 12–16 h as the seed culture. Seed culture at 1% (volume/volume, vol/vol) was inoculated into 10 mL modified M9 medium for the bioproduction of diols, whereas 10% (vol/vol) seed culture was inoculated into the 500 mL fed-batch system. The cell culture was first cultivated in a rotary shaking incubator at 37 °C and 250 rpm for 4 h, and IPTG was added to a final concentration of 100 µM when the OD600 reached 0.6–1.0. The cells were then cultivated at 30 °C for the biosynthesis of diols.
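
The inoculum and inducer volumes implied by these percentages can be worked out as below. This is a rough sketch: the 100 mM IPTG stock concentration is a hypothetical value chosen for illustration (the text does not state the stock used), and the small volume change caused by the additions is ignored.

```python
def inoculum_ml(culture_ml, percent_vv):
    """Volume of seed culture for a given % (vol/vol) of the culture volume."""
    return culture_ml * percent_vv / 100.0


def stock_volume_ul(final_uM, culture_ml, stock_mM):
    """Stock volume in uL from C1*V1 = C2*V2 (stock in mM, target in uM)."""
    stock_uM = stock_mM * 1000.0
    volume_ml = final_uM * culture_ml / stock_uM
    return volume_ml * 1000.0


HYPOTHETICAL_IPTG_STOCK_MM = 100  # assumption, not from the protocol

if __name__ == "__main__":
    print(inoculum_ml(10, 1))      # seed volume for the 10 mL culture
    print(inoculum_ml(500, 10))    # seed volume for the 500 mL fed-batch
    print(stock_volume_ul(100, 10, HYPOTHETICAL_IPTG_STOCK_MM))
```

Under these assumptions the 10 mL culture takes 0.1 mL of seed culture and 10 µL of IPTG stock, and the 500 mL fed-batch system takes 50 mL of seed culture.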

To quantitate residual glucose, 300 μL of fermentation broth was mixed with 300 μL of distilled and deionized water (ddH2O) and centrifuged at 18407 × g for 10 min, and the supernatant was filtered through a 0.22 μm filter to remove residual cells. The samples were then analyzed by high-performance liquid chromatography (HPLC) on a Shimadzu LC-20A system equipped with a refractive index detector (RID) and an Aminex HPX-87H organic acid analysis column (300 mm × 7.8 mm, 9 µm). The column temperature was maintained at 50 °C, ultrapure water containing 5 mM sulfuric acid was used as the mobile phase, and the flow rate was maintained at 0.8 mL/min. The retention time of glucose was 6.60 min. Glucose levels were quantitated using an external standard curve, which is provided in Supplementary Data 11.
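
External-standard quantitation of this kind amounts to fitting a line to peak area versus known concentration and inverting it for each sample, then correcting for the 1:1 dilution with ddH2O (dilution factor 2). The sketch below illustrates the calculation; the standard concentrations and peak areas are invented placeholder numbers, not the values in Supplementary Data 11, and a linear detector response is assumed.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def glucose_in_broth(peak_area, slope, intercept, dilution_factor=2.0):
    """Invert the calibration line, then undo the 1:1 dilution with ddH2O."""
    injected_conc = (peak_area - intercept) / slope
    return injected_conc * dilution_factor


# Placeholder calibration data (g/L versus arbitrary peak-area units).
STANDARD_CONC = [1.0, 2.0, 5.0, 10.0, 20.0]
STANDARD_AREA = [1000.0, 2000.0, 5000.0, 10000.0, 20000.0]

if __name__ == "__main__":
    slope, intercept = fit_line(STANDARD_CONC, STANDARD_AREA)
    print(glucose_in_broth(5000.0, slope, intercept))
```

With these placeholder standards, a diluted sample giving a peak area of 5000 corresponds to 5 g/L in the injected sample and 10 g/L in the original broth.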

Metabolites such as diols, aldehydes, and α-hydroxyketones were quantitated by gas chromatography with flame ionization detection (GC-FID). The analysis was performed on a Nexis GC-2030 instrument equipped with an FID and an SH-Rtx-5 capillary column (25 m × 0.32 mm × 0.5 μm). The samples were extracted with ethyl acetate. For each analysis, 1 μL of the organic sample was injected using a splitless injector with the injection port set at 250 °C. Nitrogen was used as the carrier gas with a column flow rate of 1.0 mL/min. The oven temperature program was as follows: initial temperature 40 °C (hold for 2 min), followed by an increase of 15 °C/min to 250 °C (hold for 2 min). The FID was set at 350 °C. Authentic standards were used to plot the standard curves. Since standards were not commercially available for the C6 and C7 diols, their isomers 1,6-hexanediol and 1,7-heptanediol were used to plot the standard curves. The levels of α-hydroxyketones were calculated based on their corresponding diols. The standard curves of intermediates and end products are provided in Supplementary Data 11. Metabolites were quantitated using the external calibration curves.
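
The oven program stated above fixes the run time per injection: two hold periods plus the time needed to ramp from 40 °C to 250 °C at 15 °C/min. A small sketch of that calculation follows; the function name is an illustrative assumption, and cool-down and equilibration time between runs is not included.

```python
def oven_program_minutes(start_c, end_c, ramp_c_per_min,
                         initial_hold_min, final_hold_min):
    """Total duration of a single-ramp GC oven program, in minutes."""
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


if __name__ == "__main__":
    # 40 C (2 min hold), 15 C/min to 250 C, 2 min hold
    print(oven_program_minutes(40, 250, 15, 2, 2))
```

The ramp takes (250 − 40)/15 = 14 min, so each run lasts 18 min in total.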

Gas chromatography-mass spectrometry (GC-MS) was performed on an Agilent 7977B MSD system to confirm the identity of metabolites such as diols and α-hydroxyketones. The GC-MS system was equipped with an Agilent HP-5ms Ultra Inert GC column (30 m × 250 μm × 0.25 μm). For each sample, 1 μL was injected by the autosampler with a split ratio of 1:1. Helium was used as the carrier gas with a column flow rate of 1.0 mL/min. The column oven was held at an initial temperature of 40 °C for 2 min, then heated to 250 °C at a rate of 15 °C/min and held for 2 min. The MS was operated in scan mode with a solvent delay of 3.5 min. The temperatures of the MS ion source and MS quadrupole were set to 230 °C and 150 °C, respectively. The mass fragmentation patterns were compared with the mass spectra of chemical standards in the NIST 14 Mass Spec Library.

Statistical analysis

Statistical analyses were performed with GraphPad Prism 9 software. The measured data were analyzed using Student's t test at a significance level of α = 0.05.
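
For readers who want to see what such a test computes, the sketch below implements a two-sample Student's t test with pooled variance and compares the statistic against tabulated two-sided critical values for α = 0.05. It is an illustration only: the text does not state whether the tests were paired or unpaired, one- or two-sided, so the unpaired two-sided form is an assumption, the replicate values are invented placeholders, and the code is not a description of how GraphPad Prism performs the analysis.

```python
from math import sqrt
from statistics import mean, variance

# Standard two-sided critical values of Student's t at alpha = 0.05,
# keyed by degrees of freedom (only a few entries are included).
T_CRIT_TWO_SIDED_005 = {2: 4.303, 4: 2.776, 6: 2.447, 8: 2.306, 10: 2.228}


def student_t(a, b):
    """Return (t statistic, degrees of freedom) for an unpaired pooled-variance test."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, df


def is_significant(a, b):
    """True if |t| exceeds the tabulated critical value at alpha = 0.05."""
    t, df = student_t(a, b)
    return abs(t) > T_CRIT_TWO_SIDED_005[df]


# Placeholder triplicates (arbitrary units), for illustration only.
control = [1.0, 1.1, 0.9]
treated = [2.0, 2.1, 1.9]

if __name__ == "__main__":
    print(student_t(control, treated))
    print(is_significant(control, treated))
```

With triplicates in each group the test has 4 degrees of freedom, so a difference is called significant when |t| exceeds 2.776.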

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.