Abstract
Collagen glucosyltransferases catalyze collagen glucosylation critical for biology and diseases, yet their structural regulation remains unclear. Here, we report crystal structures of a mimiviral collagen glucosyltransferase in its apo form and in complexes with uridine diphosphate (UDP) and the disaccharide product. We reveal that the enzyme forms a homodimer, stabilized by a loop from one subunit locking into a cleft on the other, enabling UDP-glucose binding cooperativity and enzymatic activity, a property conserved in the human homolog. The structures support an induced fit model for UDP interaction. The dimerization also forms an extended cleft flanked by two active sites, likely facilitating collagen recognition. Unexpectedly, the mimiviral enzyme also synthesizes a prebiotic disaccharide kojibiose. An elongated pocket near the active site allows the enzyme to use UDP-glucose and glucose for kojibiose production. We confirm the enzyme’s kojibiose synthesis activity in vitro and in vivo. These insights inform glucosyltransferase function and open new avenues for inhibitor development and kojibiose biosynthesis.
Introduction
Collagens are the most abundant protein family in vertebrates by mass, performing a wide range of structural and biological functions1. During their biosynthesis, collagens acquire a series of lysine (Lys) post-translational modifications that are crucial for their functions2,3. Specific collagen lysine residues can be hydroxylated by lysyl hydroxylases, which are encoded by the procollagen-lysine 2-oxoglutarate 5-dioxygenase genes (PLODs), to form 5-hydroxylysine (Hyl). Certain Hyl residues in the collagen helical domain are further galactosylated by glycosyltransferase 25 domain-containing proteins 1 and 2 (GLT25D1 and GLT25D2)4. These galactosylated Hyl residues can be further glucosylated to form α-(1,2)-glucosyl-β-(1,O)-galactosyl-5-hydroxylysine, with 4′-epi-kojibiose linked to 5-hydroxylysine5. Collagen glucosylation is catalyzed by galactosylhydroxylysyl glucosyltransferases (GGTs), which reside on the same polypeptides as the lysyl hydroxylases6,7,8,9. This glucosylation process plays key roles in regulating fibrillogenesis, matrix mineralization, axon guidance, angiogenesis, platelet aggregation, and metastasis6,9,10,11,12,13. Dysregulation of Lys modifications is associated with various diseases. For example, mutations in the PLOD2 gene cause Bruck syndrome II, a rare form of osteogenesis imperfecta characterized by joint contractures14. Conversely, increased PLOD2 expression has been linked to fibrosis and cancer progression2,15.
The levels of collagen lysyl post-translational modifications are tightly regulated and play critical roles in both normal biology and disease. Studies in human patients with connective tissue disorders and heterozygous PLOD3 knock-out mice have shown that even a moderate reduction in PLOD3 levels leads to connective tissue abnormalities accompanied by developmental defects16,17,18,19. Our recent work has uncovered collagen glucosyltransferase (GGT) activity in proteins encoded by PLOD1 and an isoform of PLOD2 known as PLOD2b, which arises from alternative splicing of PLOD2 pre-mRNA9. The GGT activity of PLOD1 protein has been linked to vascular development20,21. In contrast, PLOD2b protein, characterized by the inclusion of exon 13a, exhibits cooperative binding of its co-substrate, uridine diphosphate glucose (UDP-glucose), with a higher affinity and demonstrates higher collagen glucosyltransferase activity compared to PLOD2a protein, the isoform lacking exon 13a9. PLOD2b is specifically upregulated in fibroblasts—the primary collagen-producing cells—during mesenchymal differentiation. This isoform is associated with epithelial-to-mesenchymal transition (EMT) and plays a critical role in promoting cancer progression of multiple types9,22,23,24,25,26. Since UDP-glucose levels are often depleted during mesenchymal differentiation, the cooperative binding of UDP-glucose by PLOD2b protein may enable efficient glucosylation of fibrillar collagen in fibroblasts under these conditions. However, the structural basis of collagen glucosyltransferase’s substrate binding and cooperativity remains largely unknown.
Due to their critical roles in tissue homeostasis, collagen and collagen lysyl modifying enzymes are highly conserved in the animal kingdom, from humans to sponges5,27. Remarkably, collagen-like proteins and collagen-modifying enzymes have also been identified in certain fungi, bacteria, and viruses, including Acanthamoeba polyphaga mimivirus5,28. Structural and functional studies of mimiviral collagen lysyl hydroxylase have provided valuable insights into the functions of human collagen-modifying enzymes28,29. Our previous work demonstrated that the active domain of mimiviral lysyl hydroxylase forms a dimer assembly similar to that of human PLOD family members29. More recently, genomic and biochemical analyses have identified a mimiviral collagen glucosyltransferase, R699, capable of modifying galactosyl Hyl residues in collagen30. These findings suggest that collagens and their lysyl post-translational modifications are conserved beyond the animal kingdom. Interestingly, during the life cycle of amoeba, the natural host of mimivirus, the levels of UDP-glucose fluctuate significantly31. However, it remains unclear whether mimivirus possesses mechanisms to sense and respond to these changes in UDP-glucose levels.
In this study, we present a series of high-resolution crystal structures of mimiviral collagen glucosyltransferase, complemented by protein biochemical studies. These structures, with and without UDP and the disaccharide product, provide key snapshots of critical events in the enzyme’s function. Our structural analyses revealed that mimiviral collagen glucosyltransferase forms a dimer, which binds UDP-glucose cooperatively. Comparisons between structures with and without UDP highlight changes in active site electrostatics, supporting an induced-fit model for UDP-glucose binding. Additionally, the dimer appears to form a continuous collagen-binding site, potentially enabling the enzyme to recognize and process long collagen peptides. These structural features are conserved in collagen glucosyltransferases derived from animals, underscoring their functional significance. Unexpectedly, we also discovered a novel kojibiose synthase activity in the mimiviral collagen glucosyltransferase, demonstrated both in vitro and in vivo. Structural analysis revealed that the disaccharide product, kojibiose, is engaged by an elongated sugar-binding pocket adjacent to the UDP-binding site. Together, these findings provide valuable structural insights into the dimer assembly and catalysis mechanism of collagen glucosyltransferases. These insights may enable the pharmacological exploitation of mechanistic differences among enzyme isoforms for the development of isoform-specific inhibitors. Finally, the newly identified kojibiose synthase activity warrants further investigation to establish a scalable process for its mass production.
Results
R699 is a dimer that binds UDP-glucose cooperatively
We have recently identified a novel collagen galactosylhydroxylysyl glucosyltransferase or GGT activity in mimiviral R699 protein using enzymatic activity assays and mass spectrometric analyses30. R699 contains two tandem Rossmann fold domains sharing moderate sequence similarity with human enzymes (Fig. 1a). Using microscale thermophoresis (MST), we determined that R699 binds UDP-glucose cooperatively (Fig. 1b, Kd = 3.9 ± 0.1 µM, Hill coefficient h = 2.0 ± 0.1). Competitive binding assays showed that unlabeled UDP-glucose competed with Glucose-UDP-(polyethylene glycol)6-Fluorescein Conjugate for binding to R699, with an IC50 of 114 µM (Supplementary Fig. 1). These findings suggest that the interaction is primarily mediated by UDP-glucose, with a potential contribution from the polyethylene glycol linker. The observed cooperativity and affinity of UDP-glucose binding recapitulate the findings on a recently identified collagen glucosyltransferase that is encoded by a collagen PLOD2 pre-mRNA alternatively spliced isoform b (PLOD2b)9. Since a previous study indicated PLODs bind UDP-glucose at a 1:1 ratio, suggesting each collagen GGT has one UDP-glucose binding site32, we thus hypothesized that collagen GGTs’ UDP-glucose binding cooperativity is due to enzyme oligomerization. We tested this hypothesis by analyzing the oligomeric states of R699 and PLOD2b using size exclusion chromatography and dynamic light scattering (SEC-DLS). Our results showed that both mimiviral R699 and human PLOD2b, but not PLOD2a, form dimers (Fig. 1c and d), exhibiting similar monomer–dimer equilibrium behavior. These findings suggest that dimerization of collagen glucosyltransferases may underlie their cooperative binding of UDP-glucose.
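The cooperative binding reported above can be illustrated with the Hill equation. The following is a minimal sketch, not the fitting procedure used for the MST data: it assumes that the reported Kd (3.9 µM) acts as the half-saturation concentration and that binding follows a simple Hill model with h = 2.0. The function name `hill_occupancy` and the example concentrations are hypothetical and chosen only for illustration.

```python
def hill_occupancy(ligand_um: float, k_half_um: float = 3.9, hill: float = 2.0) -> float:
    """Fractional saturation from the Hill equation.

    ligand_um  -- free UDP-glucose concentration in micromolar
    k_half_um  -- half-saturation concentration (assumed equal to the reported Kd)
    hill       -- Hill coefficient (2.0 reported for R699; 1.0 means no cooperativity)
    """
    if ligand_um < 0 or k_half_um <= 0:
        raise ValueError("concentrations must be non-negative and k_half_um positive")
    if ligand_um == 0:
        return 0.0
    ligand_term = ligand_um ** hill
    return ligand_term / (k_half_um ** hill + ligand_term)


if __name__ == "__main__":
    # Illustrative concentrations only; these are not values from the study.
    for conc in (0.5, 1.0, 3.9, 10.0, 40.0):
        cooperative = hill_occupancy(conc)
        non_cooperative = hill_occupancy(conc, hill=1.0)
        print(f"{conc:5.1f} uM  h=2.0: {cooperative:.3f}  h=1.0: {non_cooperative:.3f}")
```

Under these assumptions, occupancy at the half-saturation concentration is 0.5 for any Hill coefficient, whereas the cooperative curve lies below the non-cooperative one at lower concentrations and above it at higher concentrations, which is the steeper response expected for a dimer with coupled binding sites.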
a Schematic diagrams of mimiviral and human glucosyltransferase domain architecture (top) and glucosylation reaction (bottom). The glycosyltransferase (GLT in cyan), accessory (AC in pink), and lysyl hydroxylase (LH in green) domains are highlighted. Both R699 and animal PLOD (procollagen-lysine 2-oxoglutarate 5-dioxygenase) catalyze the conversion of peptidyl galactosylhydroxylysine (Gal-Hyl) to glucosylgalactosylhydroxylysine (Glc-Gal-Hyl). b R699’s UDP-glucose binding affinity and cooperativity were analyzed using microscale thermophoresis. The data represent the mean values obtained from triplicate measurements (n = 3). c, d Domain architecture and size exclusion chromatography with dynamic light scattering (SEC-DLS) of R699 (in c) and the human PLOD2 GLT-AC truncation (PLOD2-GA) of human PLOD2 isoforms (in d). On the basis of elution time (X-axis) and molar mass (Y-axis at the right), R699 (blue in c) and human PLOD2b truncation (PLOD2b-GA, light brown in d) form a dimer, whereas human PLOD2a truncation (PLOD2a-GA, blue in d) is monomeric.
R699’s antiparallel dimer is required for UDP-glucose binding cooperativity
To elucidate the structural basis of collagen GGT dimerization, we solved the crystal structure of R699 in the presence of Mn²⁺. We chose Mn²⁺ over Mg²⁺ because R699 prefers Mn²⁺ in the enzymatic activity assay (Supplementary Fig. 2). The structure was refined to a resolution of 1.80 Å, with diffraction data statistics and refinement details summarized in Supplementary Table 1. As anticipated, R699 consists of two tandem Rossmann fold domains arranged similarly to human GGT (Fig. 2a)32. The overall structure, comprising the catalytic and adjacent accessory domains, shares moderate homology with human PLOD3 protein (PDB ID 6FXR), with a root-mean-square deviation (RMSD) of 3.1 Å. The catalytic domain of R699 shows much higher structural similarity to the human PLOD3 GGT catalytic domain (PDB ID 6FXR), with an RMSD of 1.4 Å (Supplementary Fig. 3)32, despite only 34% amino acid sequence identity. For comparison, a previously determined structure of the human PLOD3 GGT catalytic domain (PDB ID 6WFV) under different crystallization conditions aligns with the full-length PLOD3 protein structure (PDB ID 6FXR), with an RMSD of 0.9 Å. These findings establish R699 as a close structural homolog of human GGT.
a Ribbon diagram of R699 bound to Mn²⁺ (royal purple). D83 and D86 coordinate the Mn²⁺ within the active site. b Ribbon diagram of the R699 homodimer. The Mn²⁺ binding site is indicated, and the inset highlights key residues at the dimerization interface. c SEC-DLS analysis of R699 wild type (R699WT in blue) and F204R/T243R mutant (orange). On the basis of elution time (X-axis) and molar mass (Y-axis at the right), R699WT forms a dimer, whereas the F204R/T243R mutant is monomeric. d UDP-glucose binding affinity and cooperativity were measured using microscale thermophoresis for the samples described in c. The F204R/T243R mutation disrupts cooperative binding observed in the wild type. Data represent mean values from triplicate samples (n = 3). e Enzymatic activity assay of R699 using galactosylhydroxylysine as a substrate. Enzymatic activity was measured by detecting UDP production with a luciferase assay. The F204R/T243R mutant shows a complete loss of activity. Data represent mean values (± S.D.) from triplicate biological samples (n = 3). Statistical significance was determined using a two-tailed Student’s t-test.
We observed that two R699 molecules form an anti-parallel dimer in the asymmetric unit, occupying a surface area of ~1425 Å² (Fig. 2b). This structural arrangement supports our hypothesis that UDP-glucose binding cooperativity arises from coordinated interactions between the two subunits within the dimer assembly. As a comparison, we generated AlphaFold models. Among the five models produced, three displayed an anti-parallel dimer configuration. Although the monomer subunit in our structure closely aligns with the AlphaFold prediction (RMSD = 1.4 Å), the overall dimeric assembly observed in our crystal structure is distinct from those predicted by AlphaFold (Supplementary Fig. 4a and 4b). To investigate how the R699 dimer regulates UDP-glucose binding cooperativity, we generated a monomeric R699 mutant based on our structural insights and compared its UDP-glucose binding properties to the wild-type enzyme. Examination of the R699 dimer interface revealed key hydrophobic interactions stabilizing the dimer. These include contacts between F102 in one subunit and I244 and T243 in the opposite subunit (Fig. 2b inset). Additional hydrophobic interactions involve F204 in one subunit and V395 and R396 in the opposite subunit. Notably, these hydrophobic residues are not strictly conserved in PLOD2b (Supplementary Fig. 5), suggesting that the dimerization of PLOD2b is regulated differently. Consistent with the structural findings, SEC-DLS analyses showed that the F204R/T243R double mutation abolished R699 dimerization (Fig. 2c). MST binding analyses and enzymatic activity assays, in which activity was measured by detecting UDP production with a luciferase assay, further revealed that loss of dimerization resulted in a significant decrease in UDP-glucose binding affinity, abolished UDP-glucose binding cooperativity, and eliminated GGT enzymatic activity (Fig. 2d and e).
To rule out the possibility that the F204R/T243R mutations caused protein misfolding, we performed circular dichroism spectroscopy, which showed no evidence of structural misfolding (Supplementary Fig. 6). These findings demonstrate that R699 dimerization is essential for UDP-glucose binding cooperativity and enzymatic activity.
The structural basis of UDP-glucose binding cooperativity
To investigate the basis of R699 UDP-glucose binding cooperativity, we co-crystallized R699 with UDP-glucose and Mn²⁺. The highest-resolution crystal diffracted to 1.75 Å, and the diffraction data were used to generate an electron density map via molecular replacement (Supplementary Table 2). The map allowed us to model UDP and Mn²⁺ into the structure, but the glucose moiety of UDP-glucose was not resolved (Fig. 3a and Supplementary Fig. 7a). Comparative structural analyses of the UDP-bound and unbound forms revealed significant conformational changes in K222 upon UDP binding (Fig. 3b and Supplementary Fig. 7b and c). In the unbound form, K222 points toward E54, positioning it within range for a potential salt bridge with E54. However, in this state, the electron density for E54 was absent, preventing precise modeling. Based on these findings, we speculate that, in the absence of UDP-glucose, E54 dynamically swings between K222 and K438, forming transient electrostatic interactions with these residues. In the UDP-bound form, K222 reorients to directly interact with the phosphate group of UDP-glucose in the R699 active site. Simultaneously, E54 forms a stable salt bridge with K438 in the accessory (AC) domain (Fig. 3b). These observations support an induced-fit model in which the “2K-1E triad” (K222, K438, and E54) undergoes conformational rearrangements upon UDP-glucose binding, optimizing the active site for catalysis. To probe the functional roles of this triad, we generated a targeted mutation E54A to eliminate both salt bridge options as well as K222E or K438E to disrupt individual salt bridges. Enzymatic activity assays revealed that all mutations severely impaired R699’s activity (Fig. 3c). Interestingly, the E54K/K438E double mutant, designed to restore a salt bridge between residues 54 and 438, showed slightly higher enzymatic activity than the single E54A or K438E mutants. 
These findings suggest that the E54-K438 salt bridge facilitates UDP binding, consistent with our structural data. To exclude the possibility that these mutations caused protein misfolding, we analyzed the mutants using circular dichroism spectrometry. The results showed no significant differences between the mutants and the wild-type protein (Supplementary Fig. 8), confirming that the loss of activity was not due to misfolding. These findings provide mechanistic insights into the role of the 2K-1E triad in R699’s UDP-glucose binding and enzymatic function.
a Ribbon diagram of the R699 active site bound to Mn²⁺ (royal purple) and UDP (yellow). The uridine ring of UDP is sandwiched between W48 and Y85. Additional key active site residues are labeled. b Structural comparison of R699 with and without UDP bound, shown as cartoon diagrams. The UDP-bound structure is in purple, and the apo form is in cyan. Key active site residues undergoing conformational changes are highlighted, with UDP shown in yellow. c Enzymatic activity assay of R699 using galactosylhydroxylysine as the substrate. Enzymatic activity was measured by detecting UDP production with a luciferase assay. Results are shown as mean values (± S.D.) from triplicate biological samples (n = 3). p values were determined using two-tailed Student’s t-tests. Although p values between wild-type (WT) and mutant proteins are not displayed, they are <0.001. d Sequence alignment of R699 with human PLOD2, highlighting K438 in R699 and its corresponding lysine residue in PLOD2 in bold. e Ribbon diagram of the K222-containing helix and the dimer-stabilizing loop (highlighted within a gray oval) in R699. Structures with UDP bound (purple) and without UDP (cyan) are overlaid. Residues undergoing conformational changes are shown as sticks and labeled. f Schematic model illustrating the proposed mechanism of positive cooperative interactions between the two active sites within the R699 dimer during UDP-glucose binding. GLT and AC domains are in purple and orange, respectively.
Interestingly, sequence analysis showed that K438 is present in PLOD2b but not in PLOD2a protein (Fig. 3d). K222 is strictly conserved in PLOD2b protein, while E54 may be substituted by a similar acidic residue, aspartate (Supplementary Fig. 5). These observations suggest that the 2K-1E triad is conserved in PLOD2b protein and may contribute to its unique UDP-glucose binding cooperativity. Structural analysis revealed that conformational changes in the active site upon UDP-glucose binding are accompanied by conformational changes in a loop connecting the GGT and AC domains. The residue in the loop that undergoes a major conformational change is I244 (Fig. 3e and Supplementary Fig. 7d and 7e), which plays a key role in stabilizing the R699 dimer (Fig. 2b inset). Since the loop resides between the two opposing active domains, it may regulate the UDP-glucose binding affinity of both domains via allosteric interactions and transmit UDP-glucose binding signals between them, thereby promoting positive cooperativity (Fig. 3f). These findings support a two-state induced-fit model for UDP-glucose binding cooperativity. However, we cannot rule out the possibility that the observed conformational differences are influenced by crystallization conditions. Therefore, these findings should be interpreted with caution.
The mechanism of collagen binding
Our structural data indicate that the R699 dimer assembly creates a continuous U-shaped surface cleft flanked by two active sites (Fig. 4a), suggesting that this cleft may function as a collagen-binding site. To investigate this possibility, we chose residue N193 located at the center of the cleft (Fig. 4a) and introduced an N193R mutation. The N193R mutant and wild-type proteins were expressed, purified, and subjected to enzymatic activity assays. The N193R mutation had minimal impact on R699’s activity toward galactosylhydroxylysine (Fig. 4b) but significantly reduced its activity toward collagen (Fig. 4c). Further analyses using circular dichroism suggested that N193R was not deleterious to protein folding (Supplementary Fig. 9). These results support a role of the surface cleft in binding large peptidyl substrates.
a Surface representation of the R699 homodimer. The dimerization interface creates a continuous cleft flanked by the active sites. One of 2 Mn²⁺ atoms (slate) is shown within an active domain (cyan), and both N193 (white) residues are highlighted within the cleft. b, c Enzymatic activity assay of R699 using galactosylhydroxylysine (Gal-Hyl in b) or denatured bovine type I collagen (PureCol in c) as a substrate. Enzymatic activity was measured by detecting UDP production with a luciferase assay. Results are expressed as mean values (± S.D.) from triplicate biological samples (n = 3). For both panels, p values were determined using two-tailed Student’s t-tests.
The enzymatic mechanism
To elucidate the structural basis of collagen glucosyltransferase substrate binding, we co-crystallized R699 with UDP-glucose and galactosyl Hyl. Despite screening numerous crystals, we were unable to obtain diffraction data with sufficient galactosyl Hyl electron density. Consequently, we explored R699 co-crystallization with monosaccharides such as glucose and galactose as sugar acceptor mimics. Surprisingly, although glucose and galactose were added in equal amounts for crystallization, electron density at the active site revealed bound UDP and kojibiose (α-D-glucopyranosyl-(1→2)-α-D-glucose) at 1.50 Å (Fig. 5a and b, Supplementary Table 3). These findings indicate that R699 transfers the glucose moiety from UDP-glucose to a glucose acceptor, generating kojibiose, a product distinct from the 4′-epi-kojibiose found on animal collagens.
a Ribbon diagram of the R699 active site highlighting key components, including Mn²⁺ (royal purple), UDP (yellow), and kojibiose (green). Critical residues involved in substrate interaction are labeled. b Kojibiose (green/red) was modeled on the basis of the electron density map (gray mesh, contour level σ = 1), demonstrating its fit. c LIGPLOT+ diagram illustrating the interaction network of kojibiose within the R699 active site. Key hydrogen bonds and hydrophobic interactions are highlighted, demonstrating the structural basis for substrate binding. d Enzymatic activity assay of R699 using galactosylhydroxylysine as the substrate. Enzymatic activity was measured by detecting UDP production with a luciferase assay. Data represent mean values (± S.D.) from triplicate biological samples. e Measurement of UDP-glucose hydrolysis by R699 using a luciferase-based assay. In this case, the leaky activity of R699 was measured in the absence of a sugar acceptor substrate, representing a deviation from the conditions used for the rest of the data in this study. A reaction lacking the enzyme was used as the control for background subtraction. Data are presented as mean values (± S.D.) from triplicate biological samples (n = 3). For panels d and e, p values were determined using two-tailed Student’s t-tests.
Our structural analysis showed that the first glucose of kojibiose occupies a pocket between Mn²⁺ and D163/Q164 (Fig. 5a). O2 of this glucose interacts with the ND2 of N138 and NE2 of H216 (Fig. 5c), while O3 and O6 form hydrogen bonds with K62 and W118/D162, respectively (Fig. 5c). Sequence alignment revealed that these residues are highly conserved, underscoring their importance in substrate recognition (Supplementary Fig. 5). Mutagenesis studies further confirmed their roles, as substitutions of these residues resulted in significant catalytic activity loss (Fig. 5d). The catalytic roles of W118 and the polyacidic residues (D162 and D163) have been investigated in previous studies30,32,33, providing additional support for their essential contributions to enzymatic function. These findings provide novel insights into R699’s substrate binding and catalytic activity.
The second glucose is positioned between W118 and a loop, forming extensive interactions with UDP and E114 of R699 (Fig. 5a). The O3’ hydroxyl group of the second glucose interacts with the ND2 atom of N218 (Fig. 5c). Both E114 and N218 are critical for R699 catalysis, as demonstrated by site-directed mutagenesis and enzymatic activity assays (Fig. 5d). While both residues are highly conserved across species (Supplementary Fig. 5), E114 is uniquely essential for collagen glucosylation by LH3, whereas N218 is not strictly required. Interestingly, N218 is the closest residue to the O1 hydroxyl group of the first glucose in kojibiose, located approximately 5 Å away, and 6.5 Å from the diphosphate moiety of UDP (Supplementary Fig. 10). These findings suggest that no general base is positioned close enough to directly participate in catalysis, consistent with previous reports that collagen GGTs are retaining-type glycosyltransferases21. Additionally, R699 exhibits moderate uncoupled UDP-glucose hydrolysis in the absence of an acceptor substrate, a property similar to human LH3. Our results suggest that E114 plays a specific role in acceptor glucosylation rather than in uncoupled UDP-glucose hydrolysis, consistent with its proposed function in binding the acceptor substrate (Fig. 5e). In contrast, residues essential for interactions with the first sugar moiety, such as K62 and H216, are critical for both coupled and uncoupled UDP-glucose hydrolysis (Fig. 5e). Circular dichroism spectrometry confirmed that none of the mutants exhibited signs of misfolding, ensuring that the observed effects are not attributable to structural instability (Supplementary Fig. 11).
R699 has kojibiose synthase activity
To confirm R699’s substrate specificity, we analyzed its activity using UDP-glucose as the sugar donor and either monosaccharides or disaccharides as the sugar acceptor, with galactosyl Hyl serving as a positive control. The results showed that R699 exhibited robust activity toward glucose but no detectable activity with galactose or maltose (Fig. 6a and b, and Supplementary Fig. 12), supporting R699’s kojibiose synthase activity. We also assayed R699 using UDP-glucose and UDP-galactose as sugar donors and glucose as a sugar acceptor. We found that R699 prefers UDP-glucose as a sugar donor (Supplementary Fig. 13). To evaluate R699’s potential kojibiose synthase activity in vivo, we transformed and overexpressed R699 in the E. coli BL21 strain and analyzed the products using gas chromatography-mass spectrometry (GC-MS). However, kojibiose was undetectable. We reasoned that the absence of kojibiose production might be due to low intracellular glucose levels, as glucose is rapidly phosphorylated by hexokinases upon entering E. coli cells. Based on these findings, we hypothesized that R699 is unable to modify phosphorylated glucose substrates, such as glucose-6-phosphate or glucose-1-phosphate. To test this hypothesis, we performed enzymatic activity assays using phosphorylated glucose as potential sugar acceptors. The results confirmed that R699 could not glycosylate glucose-6-phosphate or glucose-1-phosphate (Fig. 6b), supporting the hypothesis that R699 requires unphosphorylated glucose as its substrate. To address the issue of low intracellular glucose levels, we expressed R699 in the E. coli MEC143 strain (Supplementary Fig. 14), which is deficient in hexokinase activity and thus accumulates higher levels of unphosphorylated glucose. GC-MS analysis identified peaks unique to kojibiose exclusively in MEC143 cells expressing R699 (Fig. 6c–e, and Supplementary Fig. 15).
The identity of the major kojibiose peak was further confirmed by GC-MS analysis of chemical standards (Supplementary Fig. 16). Quantitative analysis indicated that kojibiose yield was approximately 15% relative to maltose (Fig. 6e and Supplementary Figs. 16 and 17), the most abundant disaccharide in E. coli. These findings demonstrate that kojibiose production occurred exclusively in MEC143 cells expressing R699 and that R699 functions as a kojibiose synthase that specifically uses unphosphorylated glucose as its substrate.
a Schematic representation of the proposed kojibiose synthesis reaction catalyzed by R699. b Enzymatic activity assay of R699 using various substrates, including galactosylhydroxylysine (Gal-Hyl), glucose, galactose, glucose-6-phosphate, and glucose-1-phosphate. Enzymatic activity was measured by detecting UDP production with a luciferase assay. Data are presented as mean values (± S.D.) from triplicate biological samples (n = 3). p values were calculated using two-tailed Student’s t-tests. c–e Gas chromatograms of standards (in c) and cell lysates from untransformed E. coli strain MEC143 (in d) versus MEC143 transformed with R699 (in e). Chromatograms were generated by extracting m/z 361, a characteristic ion of methyloximated, 8-trimethylsilyl derivatized disaccharides. The positions of maltose and kojibiose major peaks are marked in e. Quantitative analysis using the major peaks indicates that kojibiose yield is approximately 15% of maltose, the most abundant disaccharide in E. coli. Note there are two isomers of derivatized disaccharides.
Discussion
We found that both mimiviral and human collagen glucosyltransferases form dimers to bind UDP-glucose cooperatively9. Since the LH domains of PLODs also form dimers independently of the glucosyltransferase (GT + AC) domains29,32, PLOD2b may assemble into a fiber-like polymer. Notably, a recent study reported that GLT25D1 forms a homodimer that bridges the GT + AC domains of PLOD3, revealing that PLOD3 and GLT25D1 can form a distinct fiber-like polymer34. Together, these findings suggest that fiber-like polymer formation may be a shared structural feature among PLOD isoforms. The cooperative binding of UDP-glucose suggests that these enzymes can sharply adjust their collagen glucosylation activity in response to changes in UDP-glucose levels. Such regulation may be critical during mimivirus-host interactions, where fluctuations in UDP-glucose levels occur, and during fibroblast differentiation22, a process requiring precise collagen modifications. These findings highlight a potential conserved mechanism by which collagen glucosyltransferases respond dynamically to variations in their substrate availability.
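The sharpness implied by cooperative binding can be made concrete with the Hill equation. The derivation below is a sketch that assumes simple Hill-type binding, with fractional occupancy θ, free UDP-glucose concentration [L], half-saturation constant K, and Hill coefficient h; these symbols are introduced here for illustration, and the result is not a fit to the data in this study.

```latex
\begin{align}
\theta &= \frac{[L]^{h}}{K^{h} + [L]^{h}}
\quad\Longrightarrow\quad
[L]_{\theta} = K\left(\frac{\theta}{1-\theta}\right)^{1/h}, \\
\frac{[L]_{0.9}}{[L]_{0.1}} &= \left(\frac{0.9/0.1}{0.1/0.9}\right)^{1/h} = 81^{1/h}.
\end{align}
```

Under this model, a non-cooperative enzyme (h = 1) needs an 81-fold increase in UDP-glucose to move from 10% to 90% occupancy, whereas an enzyme with h = 2, the value measured for R699, needs only a 9-fold increase.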
The primary hosts of mimiviruses are amoebae and other protists35. The amoebae life cycle consists of two main stages: an active trophozoite stage and a dormant cyst stage. The latter enables amoebae to survive for days to weeks in external environments36. During the transition from trophozoite to cyst, UDP-glucose levels in amoebae increase fivefold31, potentially influencing the activity of UDP-glucose-dependent enzymes. Our findings demonstrated that mimiviral collagen glucosyltransferase binds UDP-glucose cooperatively, suggesting that mimiviral glucosyltransferase activity and collagen glucosylation are highly sensitive to changes in UDP-glucose levels. This implies that amoebae encystment may sharply upregulate mimiviral collagen glucosylation by elevating UDP-glucose concentrations. However, previous studies indicate that mimivirus infects amoebae during the trophozoite stage, not during the cyst stage37,38. Furthermore, mimiviral infection has been shown to prevent amoebae encystment38, leaving the virus within trophozoites where intracellular UDP-glucose levels are relatively low. These observations may explain why determining mimiviral collagen glycosylation has been challenging. These findings also raise intriguing questions about the specific roles of mimiviral collagen glycosylation in amoebae encystment and mimivirus-host interactions, warranting further investigation into this complex relationship.
In animal cells, intracellular UDP-glucose levels are tightly regulated. It has been reported that downregulation of UDP-glucose promotes mesenchymal differentiation, as UDP-glucose facilitates SNAIL1 mRNA stabilization by binding to Hu antigen R39. As a result, UDP-glucose levels are low in mesenchymal cells, such as fibroblasts. Paradoxically, fibroblasts are the major collagen producers and require UDP-glucose for collagen glucosylation. These findings raise the question of how collagen glucosylation occurs in the presence of low UDP-glucose levels.
Recent research, including our own, has shown that mesenchymal differentiation upregulates PLOD2b22, which binds UDP-glucose tightly and cooperatively9. These findings suggest that the cooperative binding of UDP-glucose enables the tight regulation of collagen glucosyltransferase activity and collagen glucosylation during mesenchymal differentiation. Alternatively, high PLOD2b levels could deplete UDP-glucose during collagen production in fibroblasts, because the biosynthesis of collagen, the most abundant protein in mammals, requires significant cellular resources. If this hypothesis is correct, it implies that collagen production may reduce UDP-glucose levels to stabilize SNAIL1 mRNA and reinforce the mesenchymal cell fate.
We present a comprehensive set of collagen glucosyltransferase crystal structures: (1) with Mn2+, (2) with Mn2+ and UDP, and (3) with Mn2+, UDP, and disaccharide product. These structures reveal key features and conformational changes involved in substrate binding and cooperativity. Our study provides critical insights into the structure-function of collagen glucosyltransferases, which could facilitate the development of antagonists for collagen glucosylation-associated diseases, such as cancer and fibrosis. Additionally, these insights may aid in engineering collagen glucosyltransferases to produce recombinant collagen and collagen peptides for biomedical research and industrial applications.
The structures presented here identify a continuous long cleft with two flanking active sites, suggesting that this cleft serves as a collagen-binding site. Because the cleft measures approximately 7–11 Å in diameter (Supplementary Fig. 18), we reason that it is too small to accommodate the collagen triple helix, which has a diameter of 15 Å. We speculate that R699 may recognize a single collagen alpha chain rather than the triple helix, consistent with its enzymatic activity toward denatured collagen. Interestingly, collagen lysyl hydroxylases form a similar collagen-binding cleft at the dimer interface29, indicating that collagen-modifying enzymes recognize their substrate in a similar manner. The two active sites are located in an anti-parallel orientation, which could allow them to bind independently to two unzipped collagen chains or to a single collagen chain via looping during collagen biosynthesis.
We crystallized R699 with an equal molar ratio of glucose and galactose as sugar acceptors. The structures presented here suggest that R699 prefers to glycosylate glucose over galactose, forming kojibiose. Kojibiose was initially isolated from Koji extract and was also found at low levels in honey40,41. Kojibiose has been shown to resist fermentation in the mouth and absorption in the small intestine while supporting the growth of a healthy gut microbiome; it may therefore help manage dental caries and blood sugar surges and promote gut health42,43. However, industrial production of kojibiose remains challenging due to the lack of low-cost and efficient production methods44. Our serendipitous results suggest that R699 is a potential kojibiose synthase, capable of producing kojibiose in a manner similar to sucrose production in plants. These findings open the possibility of engineering R699 and integrating this simple kojibiose synthesis step into sugar-producing microorganisms, such as baker’s yeast, for industrial production of kojibiose.
Additionally, our findings suggest that R699 may prefer glucosylhydroxylysine over galactosylhydroxylysine as a sugar acceptor. Consistent with these results, a previous report indicated that mimiviral L230 glycosylates hydroxylysine to glucosylhydroxylysine (glucosyl Hyl)28. Together, these findings suggest a testable model in which certain Lys residues in mimiviral collagen are hydroxylated and then glucosylated once or twice to form glucosyl Hyl and glucosylglucosyl Hyl. This is distinct from human collagen, which contains galactosyl Hyl and glucosylgalactosyl Hyl. Furthermore, since we also found that R699 can modify galactosyl Hyl-containing collagen peptides30, our results suggest that R699 is a promiscuous enzyme.
Methods
Cloning, protein expression and purification
The R699 gene was synthesized by Genscript and cloned into a modified version of the pET28 vector using BamHI and EcoRI sites for enzymatic activity assays. In this modified vector, the thrombin recognition site was replaced with PreScission and BamHI recognition sites, and the endogenous BamHI site was eliminated. Mutant constructs were generated using Agilent’s QuikChange Lightning Site-Directed Mutagenesis Kit. Primers are listed in Supplementary Table 4. For crystallization, R699 was cloned into a version of the pET28-mCherry vector using BamHI and EcoRI sites. This vector contains the mCherry gene sequence and a PreScission recognition site inserted between the NheI and BamHI sites. To express R699 in an E. coli strain that lacks T7 RNA polymerase, a His6-R699 fusion was synthesized by Gene Universal and cloned into the pMAL-C4X vector using NdeI and SalI sites. PLOD2-GA truncations were optimized with the PROSS server to improve their expression in E. coli45. The sequences of PLOD2-GA truncations are shown in Supplementary Fig. 19.
All plasmids were verified by Sanger sequencing and then transformed into E. coli strain BL21(DE3) (NEB) for protein expression. A small-scale BL21 overnight culture with 50 mg per liter of kanamycin (GoldBio) was prepared, and 10 ml of this culture was used to inoculate an 800 ml large-scale culture in Terrific Broth Medium (Alpha Biosciences) containing the same concentration of kanamycin. The culture was grown at 37 °C until reaching an OD600 of 1.5, then induced with 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG, GoldBio), and finally grown at 16 °C for 18 h. After growth, cells were pelleted and resuspended in binding buffer (20 mM Tris, pH 8.0, 200 mM NaCl, and 15 mM imidazole). Following cell lysis by sonication, the lysate was centrifuged at 23,000 g for 15 min. The recombinant R699 proteins (wild type or mutants) were purified using immobilized nickel affinity chromatography and eluted with elution buffer (200 mM NaCl and 300 mM imidazole, pH 8.0). For enzymatic activity assays, R699 protein was dialyzed at 16 °C for 18 h in 20 mM HEPES, pH 7.4, and 150 mM NaCl.
To test kojibiose synthesis in vivo, pMAL-C4X-His6-R699 was expressed in the hexokinase-deficient E. coli strain MEC143, a generous gift from Dr. Mark A. Eiteman from the University of Georgia46. Ten ml of an overnight culture of R699-transformed MEC143 was used to inoculate an 800 ml large-scale culture of Miller LB Broth (Midland Scientific) supplemented with 100 mM xylose (MilliporeSigma) and 90 μM MnCl2 (MilliporeSigma). The culture was grown at 37 °C until reaching an OD600 of 1.5, then induced with 1 mM IPTG at 37 °C for 18 h. R699 was purified using immobilized metal affinity chromatography, eluted with elution buffer, concentrated to 0.5 mg ml−1, separated by SDS-PAGE, and visualized with Bio-Safe™ Coomassie Stain (Bio-Rad). MEC143 cell lysates were also analyzed using GC-MS to detect kojibiose.
GC-MS analysis
Bacterial pellets were lysed by brief sonication in 80% methanol pre-chilled to −20 °C, followed by incubation at −20 °C for 1 h to precipitate proteins. Lysates were centrifuged at 17,000 g for 10 min at 4 °C and supernatants were transferred to 1.5 ml polypropylene tubes. For GC-MS derivatization, lysates were dried on a Centrivap (Labconco), followed by the addition of 50 μl of 20 mg ml−1 methoxyamine (Sigma) in pyridine (Thermo) and incubation at 30 °C for 90 min. Samples were centrifuged at 17,000 g for 10 min at room temperature, and 40 μl was transferred to a glass GC-MS vial, where 80 μl of N-methyl-N-(trimethylsilyl)trifluoroacetamide (MSTFA) + 1% chlorotrimethylsilane (TMCS) (Thermo) was added to each sample, followed by incubation at 37 °C for 30 min. For maltose and kojibiose standards, 20 nmol of each standard was dried and derivatized identically to cell extracts.
For GC-MS analysis, 1 μl of each sample or standard was injected into an Agilent 5977B GC-MS containing an Agilent HP-5MS column. The inlet was set at 250 °C with a pressure of 4.9 psi and septum purge flow of 3 ml min−1. Splitless inlet mode was set to 10.5 ml min−1 at 1 min. Initial oven temperature was set to 60 °C for 1 min, followed by a linear ramp to 325 °C at a rate of 10 °C min−1. Full-scan MS detection range was set to 50–550 m/z with source and quad temperatures of 230 °C and 150 °C, respectively. Data files were imported into MassHunter Qualitative Analysis (Agilent) for peak and mass spectrum extraction. MassHunter chromatograms were exported as text files and imported into GraphPad Prism to generate figures.
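As a worked example of the oven program described above, the short Python sketch below computes the total run time and the oven temperature at a given time for a program with one initial hold and one linear ramp. The function names are hypothetical, and the sketch assumes there is no hold at the final temperature, since none is stated.

```python
def oven_run_time(initial_c, hold_min, final_c, ramp_c_per_min):
    """Total run time (min) for an initial hold followed by one linear ramp."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    return hold_min + (final_c - initial_c) / ramp_c_per_min


def oven_temperature(t_min, initial_c, hold_min, final_c, ramp_c_per_min):
    """Oven temperature (deg C) at time t_min for the same program."""
    if t_min <= hold_min:
        return initial_c
    # Temperature rises linearly after the hold and is capped at the final value.
    return min(final_c, initial_c + (t_min - hold_min) * ramp_c_per_min)


if __name__ == "__main__":
    # Program from the text: 60 deg C for 1 min, then 10 deg C/min to 325 deg C.
    print(oven_run_time(60, 1, 325, 10))
    print(oven_temperature(10, 60, 1, 325, 10))
```

With the stated settings, the ramp takes 26.5 min, so the program lasts 27.5 min in total under the no-final-hold assumption.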
Crystallization, structure determination, and refinement
mCherry-R699 was first purified using immobilized metal affinity chromatography, as described above. The eluted recombinant protein was cleaved with PreScission protease at 4 °C for 18 h while dialyzing in gel filtration buffer (20 mM Tris, pH 8.0, 200 mM NaCl). After PreScission protease cleavage, R699 was purified again using reverse immobilized metal affinity chromatography to remove mCherry protein and other contaminants that bind to nickel resin. The eluted protein was further purified by gel filtration using a Hiload 16/60 Superdex 200 PG column at a flow rate of 1 ml per minute. Peak fractions were combined and concentrated for crystal trials. High-quality crystals were obtained via hanging drop vapor diffusion using a Mosquito liquid handling robot (TTP Labtech) with a 200-nL drop. For R699 with Mn2+ structure, R699 (30 mg ml−1), supplemented with 2 mM manganese(II) chloride, was mixed with 200 mM ammonium formate, 20% (w v−1) PEG 3350 at a 1:1 ratio and incubated at 18 °C. For R699 with Mn2+ and UDP structure, R699 (12 mg ml−1), supplemented with 10 mM manganese(II) chloride, 10 mM UDP-glucose, and 50% glycerol, was mixed with 180 mM ammonium chloride, 18% (w v−1) PEG 3350, and 4% (v v−1) 1-Propanol at a 1:1 ratio and incubated at 18 °C. For R699 with Mn2+, UDP, and kojibiose structure, R699 (28 mg ml−1), supplemented with 10 mM manganese(II) chloride, 10 mM UDP-glucose, 0.15 g ml−1 glucose, and 0.15 g ml−1 galactose, was mixed with 100 mM HEPES pH 7.0, 10% (w v−1) PEG 6000 at a 1:1 ratio and incubated at 18 °C. Diffraction data were collected on the 22-ID beamline of SERCAT at the Advanced Photon Source, Argonne National Laboratory (Supplementary Tables 1–3) using SERGUI at 110 K with a wavelength of 1.0 Å. Data were processed using XDSgui and DIALS included in CCP447,48,49,50. An initial R699 crystal structure was solved by molecular replacement with Phenix using RoseTTAFold and AlphaFold models as search templates51,52. 
The refined individual domains were then used as search models for solving the other R699 crystal structures. The structures were then fully built and refined iteratively using Coot and Phenix53,54,55, respectively. Protein structure similarity was compared using the Dali server56,57, while structure interface analysis was performed using the protein interfaces, surfaces, and assemblies’ service PISA at the European Bioinformatics Institute58. The initial cartoons of disaccharides in the R699 active site were constructed using Ligplot+59. Molecular graphics were prepared using Pymol.
AlphaFold prediction model
To generate an AlphaFold prediction model for structural comparison with R699, the R699 protein sequence (UniProt Accession ID: Q5UNV6) was submitted to the AlphaFold Server on May 10, 2025 (https://alphafoldserver.com/). Five models were generated, and the top-ranked model was used for comparison with the experimentally determined crystal structure. The pLDDT scores were color-coded and overlaid on the top AlphaFold model from two different viewing angles, and the predicted aligned error (PAE) plot was also generated and included. These visualizations are provided in Supplementary Fig. 20.
SEC-DLS analysis
The molar masses of R699 and human PLOD2 proteins were analyzed by size exclusion chromatography coupled with dynamic light scattering (SEC-DLS). A Superdex 200 (10/300) column was equilibrated with SEC-DLS buffer (20 mM Tris, pH 8, 200 mM NaCl) at 4 °C overnight using an Akta purifier (GE Healthcare) and connected downstream to a dynamic light scattering detector (miniDAWN TREOS) and a refractive index (RI) detector (Optilab T-rEX) (Wyatt Technology). Proteins (3 mg ml−1, 150 μl) were injected into the Superdex 200 column at a flow rate of 0.5 ml min−1, and the DLS data were analyzed with the ASTRA 7.1 program. The results are from a single biological sample; each protein was analyzed once.
GGT enzymatic activity assay
GGT activity was measured using a method previously described9. The assay was conducted in reaction buffer (100 mM HEPES buffer pH 8.0, 150 mM NaCl) at 37 °C for 1 h with 1 μM R699 enzyme, 100 μM MnCl2, 200 μM UDP-glucose (MilliporeSigma), 1 mM dithiothreitol, and one of the following sugar acceptors: 1.75 mM galactosyl hydroxylysine (Gal-Hyl, MedChemExpress), 2 μM PureCol® (Advanced BioMatrix), 5 mM glucose, or 5 mM glucose-phosphate. PureCol® was denatured at 95 °C for 5 min and chilled on ice immediately before use. For the UDP-glucose hydrolysis assay, no sugar acceptor was added. GGT activity was measured by detecting UDP production with an ATP-based luciferase assay (UDP-Glo™ Glycosyltransferase Assay, Promega), following the manufacturer’s instructions. In most figures, the reported activity represents the signal from the complete reaction mixture minus the signal from a parallel reaction lacking the sugar acceptor. An exception is shown in Fig. 5e, where we specifically measured UDP-glucose hydrolysis (i.e., leaky activity) in the absence of a sugar acceptor. In this case, the reaction without enzyme served as the control for background correction. Experiments were performed in triplicate from distinct samples, and an unpaired Student’s t-test was used to compare the enzymatic activity of different samples.
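The background correction and statistical comparison described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library: the function names and the luminescence readings are hypothetical, and the t statistic assumes equal variances, as in the classical Student's t-test.

```python
from math import sqrt
from statistics import mean, variance


def corrected_activity(complete, no_acceptor):
    """Subtract the mean no-acceptor signal from each complete-reaction signal."""
    background = mean(no_acceptor)
    return [signal - background for signal in complete]


def student_t(a, b):
    """Unpaired, equal-variance Student's t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


if __name__ == "__main__":
    # Hypothetical triplicate readings (arbitrary luminescence units).
    wild_type = corrected_activity([10.0, 11.0, 12.0], [1.0, 1.0, 1.0])
    mutant = corrected_activity([4.0, 5.0, 6.0], [1.0, 1.0, 1.0])
    print(student_t(wild_type, mutant))
```

The sketch stops at the t statistic and degrees of freedom; a p-value would then be read from a t distribution, for example with `scipy.stats.ttest_ind`, which performs the same equal-variance test by default.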
Microscale thermophoresis
To conduct microscale thermophoresis, glucose-UDP-(PEG)6-fluorescein conjugate (10 μl at 50 nM) was mixed with an equal volume of serially diluted unlabeled R699 protein in 20 mM Tris, pH 8.0, 200 mM NaCl, 5 mM Mn2+, and 0.05% Tween-20. After incubation at 25 °C for 15 min, the samples were loaded into silica capillaries (Nanotemper Technologies). For the competition assay, fixed concentrations of glucose-UDP-(PEG)6-fluorescein conjugate (50 nM) and R699 (20 μM) were titrated with different concentrations of unlabeled UDP-glucose. For Fig. 1b, glucose-UDP-(PEG)6-fluorescein conjugate was purchased from MilliporeSigma (Cat # SMB00284) and data collection was performed at 20 °C using Dianthus NT23.Pico (Nanotemper Technologies) with Nanotemper MO software. For Fig. 2d and Supplementary Fig. 1, glucose-UDP-(PEG)6-fluorescein conjugate was purchased from AAT Bioquest (Cat # 11706) and data collection was performed at 20 °C using Monolith NT.115 (Nanotemper Technologies) with Nanotemper MO software. Curves were analyzed using Prism 10 to fit Kd according to the law of mass action and to determine IC50. Experiments were performed in triplicate unless stated otherwise, and the results represent the mean values from repeated biological samples.
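For readers unfamiliar with Kd fitting by the law of mass action, the following Python sketch shows the standard quadratic solution for the fraction of a fixed-concentration fluorescent tracer bound by a titrated protein. It assumes simple 1:1 binding, which is a simplification that does not capture cooperativity. The function name and the Kd value are hypothetical; the 25 nM tracer concentration follows from mixing the 50 nM conjugate with an equal volume of protein.

```python
from math import sqrt


def fraction_bound(protein_total, tracer_total, kd):
    """Fraction of tracer in complex for 1:1 binding (quadratic mass-action solution).

    All three arguments must share the same concentration unit.
    """
    s = protein_total + tracer_total + kd
    complex_conc = (s - sqrt(s * s - 4.0 * protein_total * tracer_total)) / 2.0
    return complex_conc / tracer_total


if __name__ == "__main__":
    tracer_nm = 25.0   # 50 nM conjugate diluted 1:1 with protein
    kd_nm = 1000.0     # hypothetical dissociation constant
    for protein_nm in (10.0, 100.0, 1000.0, 10000.0, 100000.0):
        print(protein_nm, round(fraction_bound(protein_nm, tracer_nm, kd_nm), 3))
```

A curve-fitting program varies Kd until this predicted fraction bound, scaled to the measured signal, best matches the titration data.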
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Crystal structures have been deposited in the Worldwide Protein Data Bank under RCSB accession ID numbers 9DYT, 9DZS, and 9E92. GC-MS data can be found in the source data files. Unless otherwise stated, all data supporting the results of this study can be found in the article, supplementary, and source data files. Source data are provided with this paper.
References
Shoulders, M. D. & Raines, R. T. Collagen structure and stability. Annu Rev. Biochem. 78, 929–958 (2009).
Yamauchi, M., Barker, T. H., Gibbons, D. L. & Kurie, J. M. The fibrotic tumor stroma. J. Clin. Invest 128, 16–25 (2018).
Yamauchi, M. & Sricholpech, M. Lysine post-translational modifications of collagen. Essays Biochem. 52, 113–133 (2012).
Schegg, B., Hulsmeier, A. J., Rutschmann, C., Maag, C. & Hennet, T. Core glycosylation of collagen is initiated by two beta(1-O)galactosyltransferases. Mol. Cell Biol. 29, 943–952 (2009).
Hennet, T. Collagen glycosylation. Curr. Opin. Struct. Biol. 56, 131–138 (2019).
Sricholpech, M. et al. Lysyl hydroxylase 3-mediated glucosylation in type I collagen: molecular loci and biological significance. J. Biol. Chem. 287, 22998–23009 (2012).
Sricholpech, M. et al. Lysyl hydroxylase 3 glucosylates galactosylhydroxylysine residues in type I collagen in osteoblast culture. J. Biol. Chem. 286, 8846–8856 (2011).
Heikkinen, J. et al. Lysyl hydroxylase 3 is a multifunctional protein possessing collagen glucosyltransferase activity. J. Biol. Chem. 275, 36158–36163 (2000).
Guo, H. F. et al. A collagen glucosyltransferase drives lung adenocarcinoma progression in mice. Commun. Biol. 4, 482 (2021).
Isaacman-Beck, J., Schneider, V., Franzini-Armstrong, C. & Granato, M. The lh3 Glycosyltransferase directs target-selective peripheral nerve regeneration. Neuron 88, 691–703 (2015).
Schneider, V. A. & Granato, M. The myotomal diwanka (lh3) glycosyltransferase and type XVIII collagen are critical for motor growth cone migration. Neuron 50, 683–695 (2006).
Goveia, J. et al. An Integrated Gene Expression Landscape Profiling Approach to Identify Lung Tumor Endothelial Cell Heterogeneity and Angiogenic Candidates. Cancer Cell 37, 21–36 e13 (2020).
Katzman, R. L., Kang, A. H. & Beachey, E. H. Collagen-induced platelet aggregation: involvement of an active glycopeptide fragment (alpha1-CB5). Science 181, 670–672 (1973).
Ha-Vinh, R. et al. Phenotypic and molecular characterization of Bruck syndrome (osteogenesis imperfecta with contractures of the large joints) caused by a recessive mutation in PLOD2. Am. J. Med Genet A 131, 115–120 (2004).
Piersma, B. & Bank, R. A. Collagen cross-linking mediated by lysyl hydroxylase 2: an enzymatic battlefield to combat fibrosis. Essays Biochem. 63, 377–387 (2019).
Vahidnezhad, H. et al. Mutations in PLOD3, encoding lysyl hydroxylase 3, cause a complex connective tissue disorder including recessive dystrophic epidermolysis bullosa-like blistering phenotype with abnormal anchoring fibrils and type VII collagen deficiency. Matrix Biol. 81, 91–106 (2019).
Salo, A. M. et al. A connective tissue disorder caused by mutations of the lysyl hydroxylase 3 gene. Am. J. Hum. Genet 83, 495–503 (2008).
Rautavuoma, K. et al. Premature aggregation of type IV collagen and early lethality in lysyl hydroxylase 3 null mice. Proc. Natl. Acad. Sci. USA 101, 14120–14125 (2004).
Ruotsalainen, H. et al. Glycosylation catalyzed by lysyl hydroxylase 3 is essential for basement membranes. J. Cell Sci. 119, 625–635 (2006).
Koenig, S. N. et al. New mechanistic insights to PLOD1-mediated human vascular disease. Transl. Res. 239, 1–17 (2022).
Mattoteia, D. et al. Identification of Regulatory Molecular “Hot Spots” for LH/PLOD Collagen Glycosyltransferase Activity. Int. J. Mol. Sci. 24, 11213 (2023).
Venables, J. P. et al. MBNL1 and RBFOX2 cooperate to establish a splicing programme involved in pluripotent stem cell differentiation. Nat. Commun. 4, 2480 (2013).
Eisinger-Mathason, T. S. et al. Hypoxia-dependent modification of collagen networks promotes sarcoma metastasis. Cancer Discov. 3, 1190–1205 (2013).
Saito, T. et al. Aberrant Collagen Cross-linking in Human Oral Squamous Cell Carcinoma. J. Dent. Res. 98, 517–525 (2019).
Terajima, M. et al. Collagen molecular phenotypic switch between non-neoplastic and neoplastic canine mammary tissues. Sci. Rep. 11, 8659 (2021).
Chen, Y. et al. Lysyl hydroxylase 2 induces a collagen cross-link switch in tumor stroma. J. Clin. Invest. 125, 1147–1162 (2015).
Myllyharju, J. & Kivirikko, K. I. Collagens, modifying enzymes and their mutations in humans, flies and worms. Trends Genet. 20, 33–43 (2004).
Luther, K. B. et al. Mimivirus collagen is modified by bifunctional lysyl hydroxylase and glycosyltransferase enzyme. J. Biol. Chem. 286, 43701–43709 (2011).
Guo, H. F. et al. Pro-metastatic collagen lysyl hydroxylase dimer assemblies stabilized by Fe(2+)-binding. Nat. Commun. 9, 512 (2018).
Wu, W. et al. Comparative genomic and biochemical analyses identify a collagen galactosylhydroxylysyl glucosyltransferase from Acanthamoeba polyphaga mimivirus. Sci. Rep. 12, 16806 (2022).
Rudick, V. L. & Weisman, R. A. Uridine diphosphate glucose pyrophosphorylase of Acanthamoeba castellanii. Purification, kinetic, and developmental studies. J. Biol. Chem. 249, 7832–7840 (1974).
Scietti, L. et al. Molecular architecture of the multifunctional collagen lysyl hydroxylase and glycosyltransferase LH3. Nat. Commun. 9, 3163 (2018).
Wang, C. et al. Identification of amino acids important for the catalytic activity of the collagen glucosyltransferase associated with the multifunctional lysyl hydroxylase 3 (LH3). J. Biol. Chem. 277, 18568–18573 (2002).
Peng, J. et al. The structural basis for the human procollagen lysine hydroxylation and dual-glycosylation. Nat. Commun. 16, 2436 (2025).
Colson, P., La Scola, B., Levasseur, A., Caetano-Anolles, G. & Raoult, D. Mimivirus: leading the way in the discovery of giant viruses of amoebae. Nat. Rev. Microbiol. 15, 243–254 (2017).
Rodriguez-Zaragoza, S. Ecology of free-living amoebae. Crit. Rev. Microbiol. 20, 225–241 (1994).
Silva, L., Boratto, P. V. M., La Scola, B., Bonjardim, C. A. & Abrahao, J. S. Acanthamoeba and mimivirus interactions: the role of amoebal encystment and the expansion of the ‘Cheshire Cat’ theory. Curr. Opin. Microbiol. 31, 9–15 (2016).
Boratto, P. et al. Acanthamoeba polyphaga mimivirus prevents amoebal encystment-mediating serine proteinase expression and circumvents cell encystment. J. Virol. 89, 2962–2965 (2015).
Wang, X. et al. UDP-glucose accelerates SNAI1 mRNA decay and impairs lung cancer metastasis. Nature 571, 127–131 (2019).
Sato, A. & Aso, K. Kojibiose (2-O-alpha-D-glucopyranosyl-D-glucose): isolation and structure. Nature 180, 984–985 (1957).
Watanabe, T. & Aso, K. Isolation of kojibiose from honey. Nature 183, 1740 (1959).
Onyango, S. O. et al. Oral Microbiota Display Profound Differential Metabolic Kinetics and Community Shifts upon Incubation with Sucrose, Trehalose, Kojibiose, and Xylitol. Appl. Environ. Microbiol. 86, 1170 (2020).
Garcia, C. A. & Gardner, J. G. Bacterial alpha-diglucoside metabolism: perspectives and potential for biotechnology and biomedicine. Appl. Microbiol. Biotechnol. 105, 4033–4052 (2021).
Beerens, K. et al. Biocatalytic Synthesis of the Rare Sugar Kojibiose: Process Scale-Up and Application Testing. J. Agric Food Chem. 65, 6030–6041 (2017).
Goldenzweig, A. et al. Automated Structure- and Sequence-Based Design of Proteins for High Bacterial Expression and Stability. Mol. Cell 63, 337–346 (2016).
Xia, T. et al. Accumulation of d-glucose from pentoses by metabolically engineered Escherichia coli. Appl. Environ. Microbiol. 81, 3387–3394 (2015).
Brehm, W., Trivino, J., Krahn, J. M., Uson, I. & Diederichs, K. XDSGUI: a graphical user interface for XDS, SHELX and ARCIMBOLDO. J. Appl. Crystallogr 56, 1585–1594 (2023).
Winter, G. et al. DIALS: implementation and evaluation of a new integration package. Acta Crystallogr. D. Struct. Biol. 74, 85–97 (2018).
Agirre, J. et al. The CCP4 suite: integrative software for macromolecular crystallography. Acta Crystallogr. D. Struct. Biol. 79, 449–461 (2023).
Collaborative Computational Project, Number 4. The CCP4 suite: programs for protein crystallography. Acta Crystallogr. D. Biol. Crystallogr 50, 760–763 (1994).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).
Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta Crystallogr. D. Biol. Crystallogr. 60, 2126–2132 (2004).
Liebschner, D. et al. Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallogr. D. Struct. Biol. 75, 861–877 (2019).
Adams, P. D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D. Biol. Crystallogr. 66, 213–221 (2010).
Holm, L., Laiho, A., Toronen, P. & Salgado, M. DALI shines a light on remote homologs: One hundred discoveries. Protein Sci. 32, e4519 (2023).
Holm, L. Using Dali for Protein Structure Comparison. Methods Mol. Biol. 2112, 29–42 (2020).
Krissinel, E. & Henrick, K. Inference of macromolecular assemblies from crystalline state. J. Mol. Biol. 372, 774–797 (2007).
Laskowski, R. A. & Swindells, M. B. LigPlot+: multiple ligand-protein interaction diagrams for drug discovery. J. Chem. Inf. Model 51, 2778–2786 (2011).
Acknowledgements
We thank Dr. Mark A. Eiteman from the University of Georgia for sharing reagents. We thank Drs. Trevor Creamer, Emilia Galperin, Louis B. Hersh, and Martin Chow from the University of Kentucky for sharing equipment and helpful discussions. This work was supported by the National Institutes of Health grants R00CA225633 and R37CA278989 (H.G.) and an American Cancer Society Research Scholar Grant RSG-24-1156098-01-MM (H.G.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Author information
Contributions
J.K. and Z.C.: Methodology, Data collection and curation, draft Writing and Editing. C.B.: Data analyses, curation, and validation. S.J.R., B.Z., T.C., J.W., and M.E.G.: Data collection and analyses. S.E.G.: Methodology, Data collection and analyses. R.C.B.: Data collection and analyses, draft Writing and Editing. M.Y.: Conceptualization, funding acquisition, draft Writing and Editing. B.L.: Conceptualization, supervision, funding acquisition, draft Editing. H.G.: Conceptualization, supervision, funding acquisition, original draft Writing and Editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Luigi Scietti and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Kim, J.S., Chen, Z., Espinosa Garcia, S.A. et al. Structural basis of collagen glucosyltransferase function and its serendipitous role in kojibiose synthesis. Nat Commun 16, 6704 (2025). https://doi.org/10.1038/s41467-025-61973-x
DOI: https://doi.org/10.1038/s41467-025-61973-x