Abstract
Carotenoids are membrane-bound pigments that are essential for photosynthesizing plants and algae and are widely applied in food, feed and cosmetics due to their antioxidant and anti-inflammatory properties. The production of carotenoids, particularly C30 forms, has been documented in some non-photosynthetic prokaryotes. However, their function, distribution and ecology beyond photosynthesizing organisms remain understudied. In this study, we performed an eco-evolutionary analysis of terpenoid biosynthetic gene clusters in the Lactobacillaceae family, screening 4203 dereplicated genomes for terpenoid biosynthesis genes, and detected crtMN genes in 28/361 (7.7%) species across 14/34 (41.2%) genera. These genes encode key enzymes for producing the C30 carotenoid 4,4′-diaponeurosporene. crtMN genes appeared to be convergently gained within Fructilactobacillus and horizontally transferred across species and genera, including from Lactiplantibacillus to Levilactobacillus. The phenotype was confirmed in 87% of the predicted crtMN gene carriers (27/31). Nomadic and insect-adapted species, particularly those isolated from vegetable fermentations, e.g., Lactiplantibacillus, and floral habitats, e.g., Fructilactobacillus, contained crtMN genes, while vertebrate-associated species, including vagina-associated species, lacked this trait. This habitat association aligned with the observation that C30 carotenoid-producing strains were more resistant to UV stress. In summary, C30 carotenoid biosynthesis plays a role in habitat adaptation and is scattered across Lactobacillaceae in line with this habitat adaptation.
Introduction
Carotenoids are membrane-bound specialized metabolites found in all photosynthesizing organisms, such as plants, algae and cyanobacteria. They play vital roles in protecting producer cells against excess light energy and scavenging reactive oxygen species1,2,3. However, these molecules also perform functions that are unrelated to photosynthesis, such as being precursors to phytohormones4,5 or attracting pollinators and promoting seed dispersal by accumulating in flowers and fruits6. Moreover, carotenoids are found in non-photosynthetic organisms, for example in the pathogenic bacterium Staphylococcus aureus, where the carotenoid staphyloxanthin increases virulence by protecting the bacteria against reactive oxygen species and inactivating host neutrophils7. In leaf-dwelling non-photosynthetic bacteria, such as Clavibacter, Pseudomonas, and Methylobacteria, carotenoids protect bacterial cells and, in some cases, the plant host from UV radiation8,9. Animals require carotenoids but are generally unable to synthesize them and consequently depend on food intake for their supply, with the notable exception of aphids, which have acquired the ability to produce carotenoids by taking up fungal carotenoid biosynthetic genes into their genome10. In humans, dietary intake of carotenoids is essential for vitamin A biosynthesis11 and is associated with a reduction in inflammatory markers12 and a reduced risk of prostate cancer and breast cancer13, although opposite associations have also been found, for example for lung cancer14. Because the reported health effects are mostly positive, carotenoids are increasingly used as functional ingredients in food and feed, where they can also have additional technological benefits, for instance as natural pigments and antioxidants1.
The best-studied carotenoids are those found in photosynthetic organisms. These specialized metabolites are mostly C40 carotenoids, which have 40 carbon atoms within their backbone15. More recently, the full scope of carotenoid diversity and evolution has gained increasing attention15,16. Phylogenetic analyses revealed that carotenoid biosynthesis is an ancient process that evolved prior to photosynthesis, likely under increased UV radiation conditions16,17,18. According to Santana-Molina et al.18, the earliest evolved group of carotenoids might have had a backbone of 30 carbon atoms, i.e., they were triterpenoids. Triterpenoids are found in plants in various forms, such as triterpenoid saponins19 or oleanolic acids20, and they are often involved in the plant’s defense system. However, C30 carotenoid biosynthesis originated in prokaryotes16,18 and has been detected in the photosynthesizing bacteria Heliobacter21 and in several non-photosynthetic bacteria such as Lactiplantibacillus plantarum22, Bacillus subtilis23, Enterococcus faecium (formerly Streptococcus faecium)24, Staphylococcus aureus25, and Methylobacterium rhodinum (formerly Pseudomonas rhodos)26. In addition to these phenotypic observations, Santana-Molina and colleagues18 showed that both 4,4′-diapophytoene (synthesis encoded by the crtM gene) and squalene (encoded by Sqs or the HpnCDE cluster) can be precursors to C30 carotenoids, and that these pathways are now found scattered across prokaryotes, ranging from Firmicutes to Planctomycetes to Archaea, and are mobile via horizontal gene transfer.
While carotenoid biosynthesis originated in, and is now scattered across, nonphotosynthesizing prokaryotes, the function and ecological role of these specialized metabolites in these organisms have not been systematically studied. Within the Firmicutes, Lactobacillaceae are an interesting family of nonphotosynthesizing bacteria, with great potential for use in food, feed, and pharma27,28. Two phenotypical screenings of Lactobacillaceae, focusing on isolates from fermented foods, identified Lp. plantarum as a C30 carotenoid producer via the 4,4′-diapophytoene pathway22,29. The resulting carotenoid has antioxidant capacity, which can mitigate oxidative stress in Lp. plantarum30. However, the genomic and ecological diversity of the biosynthetic and phenotypic potential of this family has not yet been investigated.
In this study, an in-depth pangenome and evolutionary analysis of the presence of terpenoid biosynthetic gene clusters was performed on a large dataset of 4203 unique genomes of the Lactobacillaceae family. These genomic analyses were complemented with phenotypic confirmation using a biobank of Lactobacillaceae from various habitats by high-throughput screening based on absorbance maxima and high-performance liquid chromatography (HPLC) and high-resolution quadrupole time-of-flight mass spectrometry (Q-TOF MS). In addition, the role of C30 carotenoids in UV stress resilience was assessed for selected strains. Finally, associations between the presence of carotenoid biosynthesis genes in a species and its lifestyle were established, using assigned lifestyles from literature31,32, and new isolation and metabolic analyses from our in-house library of strains.
Results
C30 carotenoid biosynthesis genes are dispersed across the Lactobacillaceae family and show frequent horizontal transfer
To explore the carotenoid biosynthetic gene cluster potential within the Lactobacillaceae family, the presence of all known enzymes involved in carotenoid biosynthesis was evaluated. For this purpose, 6297 publicly available genomes were screened. Due to the high similarity among genomes in this dataset, genomes were dereplicated based on pairwise ANI < 99.99%, resulting in 4179 unique genomes, from which a pangenome was constructed. Two enzymes of the 4,4′-diapophytoene pathway for the production of C30 carotenoids were present: 4,4′-diapophytoene synthase (encoded by the crtM gene) and 4,4′-diapophytoene desaturase (crtN) (Fig. 1). Each gene belonged to the same orthogroup in the experimentally confirmed carotenoid producers Lp. plantarum WCFS122 and Lt. fragifolii AMBP162T (phenotype first observed during the description of the species33 and confirmed in this paper), validating the use of these orthogroups to detect crtMN genes across the Lactobacillaceae pangenome. In addition, analysis with antiSMASH confirmed that no other known clusters for carotenoid biosynthesis, such as the squalene pathway, were present in the Lactobacillaceae. The crtMN C30 carotenoid biosynthesis gene families were scattered across 28 species from 14 genera out of a total of 361 species and 34 genera analyzed within the Lactobacillaceae family (i.e., 7.7% species and 41.2% genus prevalence) (Fig. 2). We selected 37 in-house strains (Supplementary Table S1) for genome sequencing to complement the public dataset, based on the likelihood of crtMN gene presence (29 strains) and on membership of underrepresented species (8 strains). After genomic dereplication, 24 remained (Supplementary Table S1) and were included in the subsequent analysis (4203 unique genomes). The detected crtMN genes clustered into two clades (Fig. 3).
One clade included Lactiplantibacillus, Fructilactobacillus, Latilactobacillus and Companilactobacillus, while the other exhibited broader diversity, including genera such as Leuconostoc, Oenococcus, Holzapfelia and other taxa within Fructilactobacillus. Notably, the gene trees for crtM and crtN were highly similar, indicating that these genes were typically transferred or inherited together. The crtN tree was preferred for visualization because it was more accurate due to the greater length of the gene (i.e., 499 amino acids). Notably, the Fructilactobacillus genus was present in both clades, indicating two independent acquisition events of the crtMN genes in this genus and showing that this trait had been gained convergently in this genus. The crtMN phylogenetic trees did not align with the species tree based on the core genome (Fig. 2), suggesting that these genes were not only inherited through vertical transmission but also horizontally transferred across species and even genera. Clear examples of recent horizontal gene transfer (HGT) events were observed within clusters of multiple species that contain nearly identical crtMN genes (up to 100% similarity at the protein level), for example amongst Lp. plantarum, Levilactobacillus buchneri, Levilactobacillus brevis, Lactiplantibacillus pentosus, and Pediococcus pentosaceus, as well as amongst several Leuconostoc species. Of note, one cluster present in Apilactobacillus ozensis and Apilactobacillus xinyiensis seemed to contain two genes that were both annotated as crtN.
Fig. 2: The tree was inferred with IQ-TREE via the LG + F + G4 method. An orange tip indicates that carotenoid biosynthesis genes were present in at least one genome in a species. The inner circle corresponds to the assigned lifestyle, as described in Zheng and Wittouck et al.33; the second circle corresponds to the percentage of genomes that contain crtMN genes compared to all tested genomes of that species (n is given at the branch tip). The outer circle corresponds to genome size. Species for which the phenotype was experimentally confirmed are indicated in bold. The genera were collapsed when no crtMN genes were detected.
Fig. 3: The tree was inferred with IQ-TREE via the LG + F + G4 method. To reduce the number of branches, sequences were first clustered with CD-HIT with a 95% similarity threshold. When a cluster contained multiple species, the species were clustered into multi-species horizontal gene transfer (HGT) groups, as shown in a table for clarity. The numbers indicate the number of strains that collapsed for each cluster. For each tip, the biosynthetic cluster is also shown as predicted by BiG-SCAPE using one representative.
Identification of C30 carotenoid-producing species in Lactobacillaceae biobank
Having shown the taxonomic spread of C30 carotenoid biosynthesis genes within Lactobacillaceae, we subsequently aimed to phenotypically substantiate the carotenoid biosynthesis capacity in strains containing crtMN genes. First, C30 carotenoid biosynthesis was confirmed in Latilactobacillus fragifolii AMBP162T, originally isolated from the phyllosphere of a strawberry plant33 and compared to that of the known C30 carotenoid producer Lp. plantarum WCFS122. The pellets of both strains appeared yellow, providing the first indication of carotenoid biosynthesis (Fig. 4A). The extract was purified using HPLC coupled inline to a diode-array detector (DAD) and high-resolution quadrupole time-of-flight mass spectrometry (Q-TOF MS). The highest peak in the UV chromatogram of the HPLC eluent (Fig. 4B) had absorption maxima at 467, 438, and 414 nm (Fig. 4C), similar to the peaks prior to purification (465, 435, and 409 nm) and corresponding to the absorption maxima of 4,4′-diaponeurosporene22. The peak also had a mass-to-charge ratio (m/z) of 403.3338 (Fig. 4D), corresponding to the formula C30H42, which is identical to that of 4,4′-diaponeurosporene.
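The mass assignment above can be checked with a short calculation. The following is a minimal sketch that assumes the reported peak is the singly protonated ion [M+H]+ (the ionization mode is not stated in this section, so this is an assumption) and uses standard monoisotopic masses; the function names are hypothetical.

```python
# Standard monoisotopic masses in unified atomic mass units (u).
MASS_C = 12.0
MASS_H = 1.00782503207
MASS_PROTON = 1.00727646688


def protonated_mz(n_carbon, n_hydrogen):
    """m/z of the singly charged [M+H]+ ion of a hydrocarbon CnHm.

    Assumption: positive-mode ionization by addition of one proton.
    """
    neutral_mass = n_carbon * MASS_C + n_hydrogen * MASS_H
    return neutral_mass + MASS_PROTON


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


theoretical = protonated_mz(30, 42)   # C30H42, 4,4'-diaponeurosporene
observed = 403.3338                   # value reported for the main HPLC peak
print(f"theoretical [M+H]+ = {theoretical:.4f}")
print(f"mass error = {ppm_error(observed, theoretical):.1f} ppm")
```

Under this assumption the theoretical value lies close to 403.336, so the observed m/z deviates by only a few ppm, consistent with the assignment to C30H42.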
Fig. 4: Lt. graminis DSM20719T was used as a nonpigmented control. A: Washed cell pellets (left) and unpurified extracts (right) of 2-day cultures. B: UV chromatogram of the HPLC eluent of the extracted pigments. C: UV‒visible absorption spectrum of the highest peak in the HPLC chromatogram; absorption maxima (nm) are indicated above the peaks. D: High-resolution quadrupole time-of-flight mass spectrum of the highest peak in the HPLC eluent. The mass-to-charge ratios are indicated above the peaks.
Subsequently, a high-throughput method was developed to test phenotypically for C30 carotenoids, using less solvent and omitting purification of the extract (see Methods). From 575 in-house isolates and five publicly sourced strains (Supplementary Table S1), 45 in-house and 5 public Lactobacillaceae strains were selected based on taxonomy and isolation source and tested phenotypically. This selection included 17 related noncarriers to assess the absence of the phenotype. In total, the biosynthesis of 4,4′-diaponeurosporene-like carotenoids was substantiated in 27 strains belonging to the crtMN-harboring species Lt. fragifolii, Lp. plantarum, Leuconostoc citreum, Lc. pseudomesenteroides and Holzapfelia floricola, and was absent in all 17 tested noncarrier strains (Fig. 3, Supplementary Table S1). For only six of the isolates tested, the observed phenotype did not match the genotype. Specifically, the two tested strains of the species Lactiplantibacillus mudanjiangensis (AMBF-0197 and AMBF-0209) and one strain of Lc. mesenteroides (LMG 6893) did not appear to express the phenotype under the tested conditions (Fig. 1C), whereas two Lp. plantarum strains (AMBP-0214 and AMBP-0424) appeared to harbor an inactive 4,4′-diapophytoene desaturase caused by the deletion of 522 bp in the crtN gene. These latter strains were considered natural nonproducing mutants of Lp. plantarum, a characteristic employed in subsequent physiological tests. Finally, the Apilactobacillus xinyiensis strain AMBP-0461, which contains the unusual cluster with the duplicated crtN, did not appear to express the phenotype under the tested conditions.
C30 carotenoids are associated with UV stress resistance in Lp. plantarum strains
To assess whether carotenoid biosynthesis in Lactobacillaceae plays a role in resistance to UV stress, we compared seven carotenoid-producing Lp. plantarum strains to the natural nonproducing strains AMBP-0214 and AMBP-0424. A phylogenetic tree of the core genomes of all the strains tested confirmed that all the strains were closely related and that the non-producers were not an outgroup (Supplementary Fig. S1). A general linear model fit revealed that carotenoid-negative strains were significantly more susceptible (higher Δlog10) to UV-induced stress than carotenoid-positive strains for both 30 s (effect size = 0.5749, p = 0.003) and 40 s (effect size = 1.1474, p = 0.01) of UV exposure (Fig. 5A). A linear mixed-effects model (LMER) was used to include strain-specific variation as a random effect. A likelihood-ratio test showed that including strain as a random effect improved the fit significantly for both 30 s (p = 0.0001) and 40 s (p = 0.002) of UV exposure. The interaction term between carotenoid production and experiment did not significantly improve the model and was therefore removed from the final model. Including carotenoid production significantly improved the fit for both timepoints of UV exposure (p = 0.04 for both). Lastly, including the experiment as a term led to a significantly better fit only for 40 s of UV exposure (p = 0.002). To summarize, for 30 s of UV exposure, Δlog10 was significantly influenced by both strain and carotenoid production. For 40 s of UV exposure, Δlog10 was significantly influenced by strain, carotenoid production, and experiment.
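The model comparisons above rest on likelihood-ratio tests between nested models. As a minimal sketch of that principle (not the authors' R workflow), the function below compares two nested models that differ by a single parameter, using the closed form of the chi-square tail probability for one degree of freedom. The log-likelihood values are hypothetical and only illustrate the call.

```python
import math


def lrt_one_parameter(loglik_reduced, loglik_full):
    """Likelihood-ratio test for nested models differing by ONE parameter.

    Returns the test statistic and its p-value under a chi-square
    distribution with 1 degree of freedom, for which the survival
    function has the closed form erfc(sqrt(x / 2)).
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods, for illustration only (not from the study).
statistic, p_value = lrt_one_parameter(loglik_reduced=-52.0, loglik_full=-50.0)
print(f"LRT statistic = {statistic:.2f}, p = {p_value:.4f}")
```

A term is retained when adding it raises the log-likelihood by more than chance alone would explain. Note that for variance components such as a random strain effect, the chi-square reference distribution is generally regarded as conservative, because the null value lies on the boundary of the parameter space.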
Fig. 5: A: Survival rate of Lp. plantarum strains under ultraviolet (UV) stress. All the strains were tested in three separate trials. Seven producers and two natural knockout mutants that do not produce carotenoids were included (Supplementary Table S1). Mean and standard error of the mean are visualized per strain. Significance (p < 0.05) between conditions is shown with a “*”. B: Percentage of species within a lifestyle (based on Zheng et al.33) harboring the crtMN gene as a core or accessory trait. The total number of species analyzed is given for each lifestyle. C: The genome sizes of species in relation to their lifestyle and the presence of crtMN genes.
C30 carotenoid biosynthesis is associated with nomadic and insect-adapted lifestyles
We subsequently evaluated whether the C30 carotenoid biosynthesis genes were associated with a particular Lactobacillaceae lifestyle and habitat adaptation, using the categories defined by Zheng et al.: free-living, vertebrate-adapted, invertebrate-adapted or nomadic32. Such lifestyle assignments were available for 170 out of 361 species in our public genome datasets. The presence of C30 carotenoid biosynthesis genes (crtMN) was most common in nomadic and insect-adapted lactobacilli, where they appeared as core (e.g., in nomadic Lp. plantarum, insect-adapted Fructilactobacillus lindneri and uncategorized Lc. citreum) and accessory traits. In contrast, crtMN genes were completely absent in vertebrate-associated lactobacilli such as Lactobacillus crispatus and Limosilactobacillus reuteri. They were also found in a few genomes representing free-living species such as Lv. buchneri (Fig. 5B). A Kruskal-Wallis test showed a significant association between the proportion of genomes with C30 carotenoid biosynthesis and the lifestyle of a species (p < 0.001). Pairwise comparison with a Dunn test with Bonferroni correction for multiple testing (Supplementary Fig. S2A) revealed the strongest association with nomadic and insect-adapted lifestyles. When the phylogenetic background was taken into account with phylogenetic generalized least squares (PGLS), the same trends were found, but only the contrast between insect-adapted and free-living species approached significance (p = 0.059) (Supplementary Fig. S2B).
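To make the test in the preceding paragraph concrete, the sketch below computes the Kruskal-Wallis H statistic (with tie correction) for per-species crtMN proportions grouped by lifestyle, and applies a Bonferroni adjustment to a set of pairwise p-values. It is a plain-Python illustration, not the authors' analysis code; the function names are hypothetical and the numbers in the usage example are made up.

```python
from itertools import combinations


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a dict of group -> observations."""
    pooled = [v for obs in groups.values() for v in obs]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for obs in groups.values():
        rank_sum = sum(ranks[start:start + len(obs)])
        total += rank_sum ** 2 / len(obs)
        start += len(obs)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1.0 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    return h / correction if correction > 0 else float("nan")


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Made-up proportions of crtMN-positive genomes per species (illustration only).
example = {"nomadic": [1.0, 0.8, 0.6], "insect": [0.9, 0.7, 0.5],
           "free-living": [0.1, 0.0, 0.2], "vertebrate": [0.0, 0.0, 0.0]}
print(round(kruskal_wallis_h(example), 3))
print(len(list(combinations(example, 2))), "pairwise comparisons to correct for")
```

With four lifestyle categories there are six pairwise comparisons, so a Bonferroni-adjusted p-value is six times the raw pairwise value, capped at 1.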
In addition, nomadic and insect-associated crtMN-harboring strains differed in genome size, with the nomadic strains having large genomes, similar to other nomadic Lactobacillaceae, and the insect-adapted strains having the smallest genomes within the family. These findings point to positive selection of these genes after horizontal gain in species with nomadic and insect-adapted lifestyles (Fig. 5C). To further investigate the lifestyle association, we also screened the isolation sources of 575 in-house isolates and three publicly sourced isolates and related these to the C30 carotenoid phenotype and genotype. This analysis showed that leaves and flowers carried a high number of carotenoid producers belonging to the species Lt. fragifolii (n = 13), Lc. pseudomesenteroides (n = 1), Lc. citreum (n = 3), and H. floricola (n = 1) (Supplementary Fig. S3). A second major habitat of carotenoid producers was plant-based fermentations, which harbored carotenoid-producing Lc. citreum (n = 2), Lp. plantarum (n = 6), and Lactiplantibacillus paraplantarum (n = 2). In contrast, in our laboratory, carotenoid-producing Lactobacillaceae strains were only sporadically isolated from vertebrate habitats (e.g., the human vagina34 or respiratory tract35) and were all nomadic Lp. plantarum strains.
Discussion
In this study, we investigated the biosynthesis of C30 carotenoids in the Lactobacillaceae family, the largest family of beneficial bacteria known to date, and linked the crtMN-positive genotype to their lifestyles. For this purpose, we used an integrated comparative genomic approach combined with a phenotypic screening of a diverse in-house Lactobacillaceae strain collection and an assessment of the functional role of carotenoids in UV resistance.
Pangenome analysis of Lactobacillaceae revealed that crtMN-mediated carotenoid biosynthesis was a rare and scattered trait within the family: it occurred in 41% of all the genera but only in a few species or strains within each genus. It was found to be a core property in some nomadic species, such as Lp. plantarum; insect-adapted species, such as Fl. lindneri; and accessory in many others. Phylogenetic analyses indicated that the biosynthetic crtMN genes are frequently transferred horizontally across species and genera, for example, from Lp. plantarum to Lv. brevis and within the Leuconostoc genus. This finding is consistent with the high mobility of carotenoid pathways observed at higher taxonomic levels16,17,18. Moreover, the trait appeared to have been gained convergently in Fructilactobacillus, with this genus having acquired crtMN genes from distinct donors during distinct events. Using a high-throughput extraction and analysis method, the crtMN genotype was matched with the synthesis of 4,4′-diaponeurosporene in five species. Functionally, we showed that Lp. plantarum strains that do not produce carotenoids were less resistant to UV stress, in line with the general knowledge on carotenoids16. Previous research has also demonstrated that 4,4′-diaponeurosporene biosynthesis protects against oxidative stress in Lp. plantarum30. Finally, the scattered distribution and mobility of this trait across the family, coupled with its advantages in UV stress, prompted us to systematically investigate the link between carotenoid biosynthesis and the lifestyle and ecology of Lactobacillaceae.
As we have done here for carotenoids, phylogeny needs to be considered when testing for associations between two features associated with a set of species. This was done in our study by applying a phylogenetic generalized least squares (PGLS) approach36. Such an approach takes into account that closely related species will likely be similar in any two traits (such as the crtMN prevalence and lifestyle studied here) because of “phylogenetic inertia”, not necessarily because the traits are correlated37. Since the lifestyles of Lactobacillaceae species are mostly conserved at genus level, the independent units of information are in general the genera rather than the species, resulting in a lower effective sample size and limited statistical power in our dataset. Another limitation of our work is that a lifestyle has not yet been attributed by Duar et al. and Zheng et al. to 191 of the 361 Lactobacillaceae species in our dataset31,32. This is due to a lack of sufficient data on the isolation sources, metabolic potential, and related properties for these species31,32. For many of these species, only a single strain has been isolated from a single source. Repeated isolation of a species from the same environment, as well as substantiation with specific metabolic and experimental validation, is required to attribute lifestyles, as the environment of isolation does not necessarily reflect the environment of adaptation (niche) for various reasons, such as random dispersal events and increasing anthropogenic effects on the biosphere. Such detailed information will have to be collected for these 191 Lactobacillaceae species to be able to attribute a lifestyle in the future and further substantiate our analyses.
Despite these shortcomings in public data, and taking into account the phylogeny, a near-significant association was found between carotenoid biosynthesis genes and an insect-adapted lifestyle (p = 0.056). When phylogenetic relatedness was not taken into account, we found that crtMN genes were mostly absent in free-living species and completely absent in vertebrate-associated species, such as L. crispatus, which is dominant in the human vagina34, and Lm. reuteri, which typically colonizes the vertebrate gut31. The complete absence of carotenoids in well-studied vertebrate-associated Lactobacillaceae suggests that carotenoid production is not selected for in mucosal and low-oxygen habitats, such as the gut and vagina. In contrast, carotenoid biosynthesis genes were strongly associated with nomadic and insect-adapted Lactobacillaceae, indicating that oxygen- and UV-rich environments encountered by nomadic and insect-adapted strains can select for this trait. The habitat-adaptation strategy of nomadic carotenoid producers appeared to differ from that of insect-adapted species. Nomadic Lactobacillaceae species typically have large genomes (between 2.4 and 3.6 Mbp), making them metabolically versatile and adaptable to various environments. In other studies, nomadic species have been found to typically occur in low numbers in oligotrophic environmental niches, such as plant surfaces28,38, and in high abundances once carbohydrates become more available, such as in vegetable fermentation products39. Such fermentations are characterized by intense microbial competition and high salt concentrations39, possibly leading to osmotic and oxidative stress. Based on the data obtained here, we speculate that in such fermentations and in outdoor oligotrophic environments, C30 carotenoids could provide a fitness advantage to nomadic lactobacilli by reducing susceptibility to oxidative stress.
Insect-adapted lactobacilli have smaller genomes (1.2–2.2 Mbp) with a concomitant decrease in carbohydrate metabolic capacity31,32. This difference was also observed in our present study among crtMN carriers: the insect-adapted crtMN carriers had remarkably small genomes (between 1.6 and 2.1 Mbp) but still contained crtMN genes, suggesting an evolutionary advantage. Interestingly, the insect-adapted crtMN carriers were all part of one clade within the Fructilactobacillus genus, a genus known to be transferred between pollinators via the environment, with flowers serving as key hubs31,40,41. Our data, together with previous knowledge on C30 carotenoids30, indicate that the biosynthesis of these terpenes, and their associated protection against UV radiation and oxidative stress, could be a significant advantage for these environmentally dispersed, insect-adapted Lactobacillaceae species. Our hypothesis is in line with the adaptation strategy previously described for leaf-dwelling bacteria, such as Clavibacter and Pseudomonas, which produce C40 carotenoids and other pigments to increase their survival in this UV-stressed environment9. Notably, carotenoid biosynthesis was absent in Lactobacillaceae associated with social pollinators, such as Lactobacillus apis and Bombilactobacillus. These bacteria are vertically passed down to offspring within the hive42, where they are protected from UV and oxidative stress. An environmental survival strategy does not seem to be required for these bacteria42. An exception is the Apilactobacillus genus, which is also known to be dispersed among solitary bees via flowers43. This genus was shown in our study to have an unusual putative terpenoid cluster, characterized by a duplicated crtN and an absent crtM gene. However, it remains to be substantiated whether this duplication is associated with a particular phenotype or pigment.
Our hypothesis that Lactobacillaceae bacteria that are dispersed to plants via insects gain a competitive advantage from expressing C30 carotenoids was supported by our extensive culture approach and collection. A diverse array of carotenoid-producing Lactobacillaceae were isolated from flowers and leaves, with a relatively high prevalence found for Lc. citreum in flowers. To the best of our knowledge, this species has not yet been assigned to a lifestyle, but our data add support to an insect- or flower-adapted lifestyle. In contrast, Lactobacillaceae isolates from the vertebrate habitats studied here (mainly the human vagina and respiratory tract) were predominantly non-producers. Among the producing strains isolated from these habitats, the nomadic Lp. plantarum was the predominant species.
In addition to the ecological role, the presence of crtMN genes in Lactobacillaceae is of interest from an applied perspective, especially considering the beneficial properties and lack of virulence factors in this family of bacteria. For example, incorporating carotenoid-producing bacteria into food fermentations could add functional properties to these foods. In fact, 4,4′-diaponeurosporene is already present in many vegetable fermentations, as Lp. plantarum, a core producer, dominates the later stages of most typical vegetable fermentations, and Leuconostoc species generally dominate in the early stages39. Although the added benefits of 4,4′-diaponeurosporene in these fermented food ecosystems have not yet been studied, this metabolite has been connected to health-promoting effects via immune modulation. For instance, the introduction of crtMN genes originating from Staphylococcus aureus into Bacillus subtilis has been shown to reduce colitis in mice44 and increase resistance to Salmonella typhimurium infection45. Furthermore, in piglets, B. subtilis heterologously expressing crtMN has been shown to improve the mucosal immune system of the gut46 and the respiratory tract47. These studies were carried out with genetically modified bacteria, which are unlikely to reach large market applications, especially in Europe. In contrast, our results indicate that natural carotenoid-producing Lactobacillaceae constitute an interesting alternative. Moreover, microbial biosynthesis can offer advantages over traditional production methods, as it can be safer and less reliant on fossil resources than chemical synthesis and less influenced by seasonality or climate than plant-based biosynthesis48, and it has been applied to C30 carotenoids49.
In summary, this study on the ecology and evolution of carotenoid biosynthesis in the Lactobacillaceae family revealed a scattered distribution of crtMN-mediated C30 carotenoid 4,4′-diaponeurosporene biosynthesis across 28 species and 14 genera and highlighted the mobility of this trait. C30 carotenoid biosynthesis appears to have emerged as a core property in several species, notably Lp. plantarum, Lc. citreum, and Fl. lindneri. Furthermore, carotenoid biosynthesis was strongly associated with nomadic and insect-adapted lifestyles, where it offers an advantage via protection from UV stress.
Methods
Pangenome and phylogenetic analysis of the terpenoid biosynthetic gene clusters
All publicly available Lactobacillaceae genomes were downloaded from the Genome Taxonomy Database (GTDB, gtdb.ecogenomic.org, version r214). With CheckM, any incomplete (<90%) or contaminated (>5%) genomes were excluded50. This resulted in a dataset of 6259 public genomes. To avoid pseudoreplication, the sample function of SCARAP (github.com/SWittouck/SCARAP) was used with an average nucleotide identity (ANI) cutoff of 99.99%, retaining only genomes with ANI values below this cutoff. Next, the pangenome of this family was inferred using the SCARAP tool51. Within this pangenome, the orthogroups of the detected genes, i.e., those encoding 4,4′-diapophytoene desaturase (crtN) and 4,4′-diapophytoene synthase (crtM), were determined (Table 1)32. For reference, the genes from two experimentally confirmed species, Lp. plantarum WCFS122 and Latilactobacillus fragifolii AMBP162 (experimentally confirmed in this study), were used as queries. Based on the results of this pangenome analysis, 37 in-house isolates were selected for whole genome sequencing and were included in the subsequent analysis. Finally, all genomes were checked for other terpene biosynthesis genes using antiSMASH 6.052, focusing on terpenes and terpene pathways.
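The quality filter and dereplication step described above can be illustrated with a small sketch. The thresholds (completeness of at least 90%, contamination of at most 5%, ANI cutoff of 99.99%) come from the text. Everything else is an assumption for illustration: the greedy strategy is a generic stand-in and not SCARAP's actual sampling algorithm, and the genome records and ANI values are hypothetical.

```python
def passes_quality(genome, min_completeness=90.0, max_contamination=5.0):
    """Keep genomes that are neither incomplete (<90%) nor contaminated (>5%)."""
    return (genome["completeness"] >= min_completeness
            and genome["contamination"] <= max_contamination)


def dereplicate(genome_ids, ani, cutoff=99.99):
    """Greedy dereplication: keep a genome only if its ANI to every genome
    already kept is below the cutoff. Pairs missing from `ani` are treated
    as dissimilar (an assumption of this sketch)."""
    kept = []
    for gid in genome_ids:
        if all(ani.get(frozenset((gid, other)), 0.0) < cutoff for other in kept):
            kept.append(gid)
    return kept


# Hypothetical genome records and pairwise ANI values (not from the study).
genomes = [
    {"id": "g1", "completeness": 95.0, "contamination": 1.0},
    {"id": "g2", "completeness": 85.0, "contamination": 1.0},  # incomplete
    {"id": "g3", "completeness": 99.0, "contamination": 6.0},  # contaminated
    {"id": "g4", "completeness": 98.0, "contamination": 0.0},
    {"id": "g5", "completeness": 97.0, "contamination": 2.0},
]
ani = {
    frozenset(("g1", "g4")): 99.995,  # near-identical pair
    frozenset(("g1", "g5")): 98.0,
    frozenset(("g4", "g5")): 98.2,
}

high_quality = [g["id"] for g in genomes if passes_quality(g)]
unique = dereplicate(high_quality, ani)
print(high_quality, unique)
```

In a greedy scheme like this the result depends on the input order, so a real pipeline would typically sort genomes by quality first; that ordering is a design choice of the sketch, not something stated in the Methods.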
The core tree of all Lactobacillaceae was constructed following the method described in Eilers et al.53. The core genome, consisting of 296 core genes, was inferred using SCARAP. Protein sequences were extracted, aligned with MAFFT and trimmed with trimAl with a gap threshold of 10%. A maximum-likelihood tree was subsequently inferred from the aligned and trimmed core proteins with IQ-TREE using the LG + F + G4 model.
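To illustrate gap-based column trimming, the sketch below drops alignment columns that are dominated by gaps. It is not trimAl itself. The function name `trim_gappy_columns` and the toy alignment are hypothetical, and the reading of the 10% threshold (a column is kept when at least 10% of the sequences carry a residue in it) is an assumption about how the threshold was applied.

```python
def trim_gappy_columns(alignment, min_residue_fraction=0.10, gap_chars="-"):
    """Remove columns in which too few sequences have a residue.

    alignment: list of equal-length aligned sequences (strings).
    min_residue_fraction: assumed meaning of the gap threshold; a column is
        kept if at least this fraction of sequences is not a gap there.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(alignment)
    kept_columns = []
    for col in range(length):
        residues = sum(1 for seq in alignment if seq[col] not in gap_chars)
        if residues / n_seqs >= min_residue_fraction:
            kept_columns.append(col)

    return ["".join(seq[c] for c in kept_columns) for seq in alignment]


# Toy alignment, for illustration only: the last column is all gaps.
toy = ["MK-A-", "MR-A-", "MKTA-"]
trimmed = trim_gappy_columns(toy)
stricter = trim_gappy_columns(toy, min_residue_fraction=0.5)
```

With the assumed 10% setting only the all-gap column is removed, whereas a stricter 50% setting would also drop the column in which a single sequence has a residue.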
First, the prevalence of crtMN genes within all species of the family Lactobacillaceae was calculated and integrated with lifestyle data from Zheng et al.32 and metadata from the GTDB using tidygenomes (github.com/SWittouck/tidygenomes), based on the ggtree package in R. Genera were collapsed when none of their species contained the crtMN genes.
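The prevalence calculation and the collapsing rule can be sketched in a few lines. This is an illustrative Python sketch rather than the R workflow used in the study. The function names, the input layout (one tuple per genome) and the placeholder taxon names are all hypothetical.

```python
from collections import defaultdict


def species_prevalence(genomes):
    """Fraction of genomes per species that carry crtMN.

    genomes: iterable of (genus, species, has_crtmn) tuples (assumed layout).
    Returns a dict mapping (genus, species) to a prevalence between 0 and 1.
    """
    counts = defaultdict(lambda: [0, 0])  # [carriers, total genomes]
    for genus, species, has_crtmn in genomes:
        entry = counts[(genus, species)]
        entry[0] += int(has_crtmn)
        entry[1] += 1
    return {key: carriers / total for key, (carriers, total) in counts.items()}


def genera_to_collapse(prevalence):
    """Genera in which no species contains crtMN (prevalence 0 everywhere)."""
    by_genus = defaultdict(list)
    for (genus, _species), fraction in prevalence.items():
        by_genus[genus].append(fraction)
    return sorted(g for g, fractions in by_genus.items()
                  if all(f == 0 for f in fractions))


# Placeholder taxa, for illustration only.
toy_genomes = [
    ("GenusA", "sp1", True),
    ("GenusA", "sp1", False),
    ("GenusA", "sp2", False),
    ("GenusB", "sp3", False),
    ("GenusB", "sp4", False),
]
prevalence = species_prevalence(toy_genomes)
collapsed = genera_to_collapse(prevalence)
```

Here `GenusA` stays expanded because one of its species carries the genes in half of its genomes, while `GenusB` would be collapsed in the tree.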
Second, to examine the phylogeny of crtM and crtN within the Lactobacillaceae family, a gene tree was inferred at the amino acid level using a procedure similar to that used for the species tree. Since the crtM and crtN trees were highly similar, the crtN tree was used as a model due to its larger size and thus higher resolution. To reduce the number of branches, sequences were first clustered with CD-HIT54 at a 95% similarity threshold. In instances where sequences from different species were present in the same CD-HIT cluster, they are shown in the adjacent table. The biosynthetic gene clusters obtained from antiSMASH were visualized using BiG-SCAPE55 based on dereplicated genomes, and the clusters were mapped onto the crtN tree.
Strains used in this study
Phenotypic characterization started from the crtMN carriers within 575 in-house Lactobacillaceae isolates and three publicly available strains (Supplementary Table S1). These strains were previously isolated from vegetable fermentations54, the human vagina34, the human respiratory tract35, the phyllosphere33, the anthosphere56 and liquid compost fermentations57. Forty-eight isolates, taxonomically related to carotenoid producers based on the pangenome analysis, were phenotypically screened. To complement the publicly available data, the genomes of 25 crtMN-containing and 17 non-crtMN-containing in-house isolates were sequenced and included in the pangenome analysis (Supplementary Table S1, study number PRJEB57255). UV stress assays were performed using nine Lp. plantarum strains, as indicated in Supplementary Table S1.
C30 carotenoid extraction and identification
Lipophilic compounds were extracted, and their absorption spectra were measured, based on the methods of Garrido-Fernández and colleagues22. In brief, cells were harvested from a 50 ml overnight culture in Weissella Medium Broth (WMB) by centrifugation for 15 min at 2000 × g. The cells were subsequently washed with 50 ml of sterile distilled water. Afterward, 10 ml of N,N-dimethylformamide was added to the washed cells, which were incubated for 15 min at 65 °C. The cell debris was separated by centrifugation at 3000 × g for 10 min, after which the supernatant was transferred to a separatory funnel. The extraction of the remaining cell debris with N,N-dimethylformamide was repeated four times. All the extracts were pooled and mixed with 100 ml of diethyl ether, and 10% NaCl was added to aid the separation of the liquid phases. The organic phase was dried with anhydrous Na2SO4, followed by solvent evaporation in a rotary evaporator. The resulting residue was dissolved in 2 ml of methanol/tert-butyl methyl ether (1:1 v:v) containing 1% BHT. The absorption spectrum was measured between 300 nm and 550 nm using a spectrophotometer to determine the characteristic absorption maxima. Carotenoids were further purified and identified using high-performance liquid chromatography (HPLC) coupled inline to a diode array detector (DAD) and high-resolution quadrupole time-of-flight mass spectrometry (Q-TOF MS), a procedure conducted by the RIC group.
For high-throughput identification, a similar procedure was used with different extraction solvents and without further purification. The washed cells were extracted with 2 ml of molecular biology grade ethanol, vortexed thoroughly and incubated for 20 min at 65 °C. The cell debris was separated by centrifugation at 8603 × g for 5 min, after which the supernatant was transferred to a 15 ml tube. The ethanol extraction step was repeated once on the remaining cell debris. Both extracts were pooled, and 2 ml of heptane was added and mixed vigorously for 1 min. Next, 2 ml of distilled water was added and mixed. The hydrophilic and lipophilic phases were then separated by centrifuging the tubes for 5 min at 4000 × g. The top layer (lipophilic phase) was carefully transferred to a clean 2 ml tube and dried with anhydrous Na2SO4. Finally, the absorption spectrum between 300 and 550 nm was measured using a spectrophotometer to detect characteristic absorption peaks.
UV stress resistance assays
Cells were harvested by centrifugation (10 min at 1500 × g) from a 10 ml two-day culture grown in MRS medium at 28 °C with shaking, washed with sterile phosphate-buffered saline (PBS) and resuspended to an optical density (OD) of 0.16 at 600 nm. Each suspension (500 µl) was dispensed together with 8 ml of sterile PBS into small Petri plates and placed on an orbital shaker with a gentle swirling motion inside a laminar air flow cabinet, after which the lids were removed. The swirling Petri plates were exposed to UV treatment, and samples were collected at three timepoints: before UV exposure, after 30 s and after 40 s of UV exposure. The number of colony-forming units (CFUs) was determined for all the samples through serial dilution and plating out in triplicate on MRS agar. The entire assay was repeated three times.
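The conversion from triplicate plate counts to a CFU concentration follows the standard serial-dilution arithmetic, sketched below. The function name `cfu_per_ml`, the dilution fold, the plated volume and the colony counts are all hypothetical, since the text does not report them.

```python
def cfu_per_ml(colony_counts, dilution_fold, plated_volume_ml):
    """Estimate CFU per ml of the undiluted sample from replicate plates.

    colony_counts: colonies counted on replicate plates of one dilution.
    dilution_fold: total dilution of the plated sample (e.g. 10_000 for 10^-4).
    plated_volume_ml: volume spread on each plate, in ml (assumed parameter).
    """
    if not colony_counts:
        raise ValueError("at least one plate count is required")
    if dilution_fold <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution fold and plated volume must be positive")
    mean_colonies = sum(colony_counts) / len(colony_counts)
    return mean_colonies * dilution_fold / plated_volume_ml


# Made-up triplicate counts at an assumed 10^-4 dilution with 0.1 ml plated.
example_cfu = cfu_per_ml([42, 38, 40], dilution_fold=10_000, plated_volume_ml=0.1)
```

With these made-up inputs, a mean of 40 colonies corresponds to roughly 4 × 10^6 CFU per ml of the undiluted suspension.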
The rate of decrease in CFU counts, indicating susceptibility to UV stress, was determined by transforming the CFU counts to a logarithmic scale and calculating the log10 reduction (Δlog10) at 30 and 40 s of UV irradiation relative to the initial CFU counts. A general linear model was used to evaluate the effect of carotenoid production and experimental variation on the Δlog10. To account for strain-specific variation, a linear mixed-effects model (LMER) was used, in which strain was included as a random intercept. The significance of the carotenoid production, experiment and strain effects was assessed using log-likelihood-ratio tests.
These tests compared the full model with simpler models from which individual terms had been removed. If the more complex model did not fit significantly better, the term was removed; if it did, the term was considered to have a significant effect and was retained.
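The two calculations in this analysis, the log10 reduction and the comparison of nested models, can be sketched with the Python standard library. This is not the authors' R code. The function names and all numerical inputs are hypothetical, the sign convention (a positive Δlog10 means fewer CFUs after irradiation) is an assumption, and the chi-square reference distribution with one degree of freedom is an assumption that fits the removal of a single fixed-effect term.

```python
import math


def log10_reduction(cfu_initial, cfu_after):
    """Δlog10 = log10(initial CFU) - log10(CFU after UV exposure)."""
    if cfu_initial <= 0 or cfu_after <= 0:
        raise ValueError("CFU counts must be positive to take a logarithm")
    return math.log10(cfu_initial) - math.log10(cfu_after)


def likelihood_ratio_test(loglik_full, loglik_reduced):
    """Compare a full model with a model lacking one term (1 df).

    Returns the test statistic 2 * (logLik_full - logLik_reduced) and its
    p-value under a chi-square distribution with one degree of freedom,
    for which the survival function equals erfc(sqrt(x / 2)).
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


def keep_term(loglik_full, loglik_reduced, alpha=0.05):
    """Retain the term only if the full model fits significantly better."""
    _, p_value = likelihood_ratio_test(loglik_full, loglik_reduced)
    return p_value < alpha


# Made-up numbers, for illustration only.
delta = log10_reduction(1e8, 1e6)
stat, p = likelihood_ratio_test(loglik_full=-100.0, loglik_reduced=-103.0)
```

In this toy case the counts drop by two log units, and removing the term lowers the log-likelihood by 3, which gives a statistic of 6 and a p-value of about 0.014, so the term would be retained at the 0.05 level.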
Association of C30 carotenoid biosynthesis capacity and habitat
To further assess the association between carotenoid biosynthesis capacity and habitat, multiple statistical analyses were performed on the dereplicated dataset in R with the packages dunn.test, caper, nlme and geiger. These analyses evaluated the correlation of crtMN prevalence with lifestyle and with genome size, taking phylogenetic effects into account. A gene was considered a core trait of a species when more than 95% of the genomes within that species had this trait and an accessory trait when the prevalence was between 5% and 95%.
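The core and accessory definitions translate directly into a small classification rule, sketched below. The function name `classify_trait` is hypothetical, as are the label for prevalences below 5% (the text does not name this category) and the choice to treat the 5% and 95% boundaries as accessory.

```python
def classify_trait(n_with_trait, n_genomes):
    """Classify a gene as a core or accessory trait of one species.

    core: present in more than 95% of the species' genomes (from the text).
    accessory: prevalence between 5% and 95% (boundaries included here,
        which is an assumption).
    rare_or_absent: below 5%; this label is hypothetical.
    """
    if n_genomes <= 0:
        raise ValueError("a species needs at least one genome")
    if not 0 <= n_with_trait <= n_genomes:
        raise ValueError("carrier count must lie between 0 and n_genomes")

    prevalence = n_with_trait / n_genomes
    if prevalence > 0.95:
        return "core"
    if prevalence >= 0.05:
        return "accessory"
    return "rare_or_absent"


# Made-up counts, for illustration only.
labels = [classify_trait(k, 100) for k in (100, 96, 95, 50, 5, 4, 0)]
```

A species with 96 carriers out of 100 genomes would be labelled core, whereas one with exactly 95 carriers falls into the accessory class under the boundary assumption stated above.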
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The publicly available Lactobacillaceae genomes were downloaded from the Genome Taxonomy Database (GTDB, version r214). All newly sequenced genomes from in-house isolates are available in the ENA database in project PRJEB75646 using the accession numbers given in Supplementary Table S1.
References
Sun, T., Rao, S., Zhou, X. & Li, L. Plant carotenoids: recent advances and future perspectives. Mol. Horticulture. https://doi.org/10.1186/s43897-022-00023-2 (2022).
Havaux, M. & Niyogi, K. K. The violaxanthin cycle protects plants from photooxidative damage by more than one mechanism. Proc. Natl Acad. Sci. USA 96, 8762–8767 (1999).
Fraser, N. J., Hashimoto, H. & Cogdell, R. J. Carotenoids and bacterial photosynthesis: the story so far. Photosynth. Res. 70, 249–256 (2001).
Schwartz, S. H., Qin, X. & Loewen, M. C. The biochemical characterization of two carotenoid cleavage enzymes from Arabidopsis indicates that a carotenoid-derived compound inhibits lateral branching. J. Biol. Chem. 279, 46940–46945 (2004).
Booker, J. et al. MAX3/CCD7 is a carotenoid cleavage dioxygenase required for the synthesis of a novel plant signaling molecule. Curr. Biol. 14, 1232–1238 (2004).
Schemske, D. W. & Bradshaw, H. D. Pollinator preference and the evolution of floral traits in monkeyflowers (Mimulus). Proc. Natl Acad. Sci. USA 96, 11910–11915 (1999).
Liu, G. Y. et al. Staphylococcus aureus golden pigment impairs neutrophil killing and promotes virulence through its antioxidant activity. J. Exp. Med. 202, 209–215 (2005).
Mohanty, S. R. et al. Methylotroph bacteria and cellular metabolite carotenoid alleviate ultraviolet radiation-driven abiotic stress in plants. Front. Microbiol. 13, 1–18 (2023).
Jacobs, J. L., Carroll, T. L. & Sundin, G. W. The role of pigmentation, ultraviolet radiation tolerance, and leaf colonization strategies in the epiphytic survival of phyllosphere bacteria. Micro. Ecol. 49, 104–113 (2005).
Moran, N. A. & Jarvik, T. Lateral transfer of genes from fungi underlies carotenoid production in aphids. Science 328, 624–627 (2010).
Karrer, P. Carotenoids, Flavins and Vitamin A and B2. https://www.nobelprize.org/prizes/chemistry/1937/karrer/lecture/ (1937).
Hajizadeh-Sharafabad, F., Zahabi, E. S., Malekahmadi, M., Zarrin, R. & Alizadeh, M. Carotenoids supplementation and inflammation: a systematic review and meta-analysis of randomized clinical trials. Crit. Rev. Food Sci. Nutr. 62, 8161–8177 (2022).
Dehnavi, M. K., Ebrahimpour-Koujan, S., Lotfi, K. & Azadbakht, L. The association between circulating carotenoids and risk of breast cancer: a systematic review and dose–response meta-analysis of prospective studies. Adv. Nutr. https://doi.org/10.1016/j.advnut.2023.10.007 (2024).
Zhang, Y., Yang, J., Na, X. & Zhao, A. Association between β-carotene supplementation and risk of cancer: a meta-analysis of randomized controlled trials. Nutr. Rev. 81, 1118–1130 (2023).
Yabuzaki, J. Carotenoids Database: structures, chemical fingerprints and distribution among organisms. Database 2017, 1–11 (2017).
Klassen, J. L. Phylogenetic and evolutionary patterns in microbial carotenoid biosynthesis are revealed by comparative genomics. PLoS ONE 5, e11257 (2010).
Armstrong, G. A. & Hearst, J. E. Genetics and molecular biology of carotenoid pigment biosynthesis. FASEB J. 10, 228–237 (1996).
Santana-Molina, C., Henriques, V., Hornero-Méndez, D., Devos, D. P. & Rivas-Marin, E. The squalene route to C30 carotenoid biosynthesis and the origins of carotenoid biosynthetic pathways. Proc. Natl Acad. Sci. USA 119, e2210081119 (2022).
Augustin, J. M., Kuzina, V., Andersen, S. B. & Bak, S. Molecular activities, biosynthesis and evolution of triterpenoid saponins. Phytochemistry 72, 435–457 (2011).
Pollier, J. & Goossens, A. Oleanolic acid. Phytochemistry 77, 10–15 (2012).
Takaichi, S. et al. The major carotenoid in all known species of heliobacteria is the C30 carotenoid 4,4′-diaponeurosporene, not neurosporene. Arch. Microbiol. 168, 277–281 (1997).
Garrido-Fernández, J., Maldonado-Barragán, A., Caballero-Guerrero, B., Hornero-Méndez, D. & Ruiz-Barba, J. L. Carotenoid production in Lactobacillus plantarum. Int. J. Food Microbiol. 140, 34–39 (2010).
Pramastya, H., Song, Y., Elfahmi, E. Y., Sukrasno, S. & Quax, W. J. Positioning Bacillus subtilis as terpenoid cell factory. J. Appl. Microbiol. 130, 1839–1856 (2021).
Taylor, R. F. & Davies, B. H. Triterpenoid carotenoids and related lipids. Triterpenoid carotenoid aldehydes from Streptococcus faecium UNH 564P. Biochem. J. 153, 233–239 (1976).
Pelz, A. et al. Structure and biosynthesis of staphyloxanthin from Staphylococcus aureus. J. Biol. Chem. 280, 32493–32498 (2005).
Kleinig, H., Schmitt, R., Meister, W., Englert, G. & Thommen, H. New C30-carotenoic acid glucosyl esters from Pseudomonas rhodos. Z. Naturforsch. - Sect. C. J. Biosci. 34, 181–185 (1979).
Lebeer, S., Vanderleyden, J. & De Keersmaecker, S. C. J. Genes and molecules of lactobacilli supporting probiotic action. Microbiol. Mol. Biol. Rev. 72, 728–764 (2008).
Yu, A. O., Leveau, J. H. J. & Marco, M. L. Abundance, diversity and plant-specific adaptations of plant-associated lactic acid bacteria. Environ. Microbiol. Rep. 12, 16–29 (2020).
Turpin, W. et al. PCR of crtNM combined with analytical biochemistry: An efficient way to identify carotenoid producing lactic acid bacteria. Syst. Appl. Microbiol. 39, 115–121 (2016).
Kim, M., Jung, D., Seo, D., Park, Y. & Seo, M. 4,4′-Diaponeurosporene from Lactobacillus plantarum subsp. plantarum KCCP11226: low temperature stress-induced production enhancement and in vitro antioxidant activity. J. Microbiol. Biotechnol. 31, 63–69 (2021).
Duar, R. M. et al. Lifestyles in transition: evolution and natural history of the genus Lactobacillus. FEMS Microbiol. Rev. 41, S27–S48 (2017).
Zheng, J. et al. A taxonomic note on the genus Lactobacillus: description of 23 novel genera, emended description of the genus Lactobacillus Beijerinck 1901, and union of Lactobacillaceae and Leuconostocaceae. Int. J. Syst. Evol. Microbiol. 70, 2782–2858 (2020).
Legein, M., Wittouck, S. & Lebeer, S. Latilactobacillus fragifolii sp. nov., isolated from leaves of a strawberry plant (Fragaria x ananassa). Int. J. Syst. Evol. Microbiol. 72, 005193 (2022).
Lebeer, S. et al. A citizen-science-enabled catalogue of the vaginal microbiome and associated factors. Nat. Microbiol. 8, 2183–2195 (2023).
De Boeck, I. et al. Lactobacilli have a niche in the human nose. Cell Rep. 31, 107674 (2020).
Grafen, A. The phylogenetic regression. Philos. Trans. R. Soc. Lond. B, Biol. Sci. 326, 119–157 (1989).
Felsenstein, J. Phylogenies and the comparative method. Am. Nat. 125, 1–15 (1985).
Miller, E. R. et al. Establishment limitation constrains the abundance of lactic acid bacteria in the Napa cabbage phyllosphere. Appl. Environ. Microbiol. 85, e00269-19 (2019).
Wuyts, S. et al. Carrot juice fermentations as man-made microbial ecosystems dominated by lactic acid bacteria. Appl. Environ. Microbiol. 84, 2021 (2018).
McFrederick, Q. S. et al. Flowers and wild megachilid bees share microbes. Micro. Ecol. 73, 188–200 (2017).
McFrederick, Q. S. et al. Environment or kin: whence do bees obtain acidophilic bacteria? Mol. Ecol. 21, 1754–1768 (2012).
Kwong, W. K. & Moran, N. A. Gut microbial communities of social bees. Nat. Rev. Microbiol. 14, 374–384 (2016).
Vuong, H. Q., McFrederick, Q. S. & Angert, E. Comparative genomics of wild bee and flower isolated Lactobacillus reveals potential adaptation to the bee host. Genome Biol. Evol. 11, 2151–2161 (2019).
Jing, Y., Liu, H., Xu, W. & Yang, Q. Amelioration of the DSS-induced colitis in mice by pretreatment with 4,4′-diaponeurosporene-producing Bacillus subtilis. Exp. Ther. Med. 14, 6069–6073 (2017).
Liu, H., Xu, W., Yu, Q. & Yang, Q. 4,4′-Diaponeurosporene-producing Bacillus subtilis increased mouse resistance against Salmonella typhimurium infection in a CD36-dependent manner. Front. Immunol. 8, 483 (2017).
Jing, Y., Liu, H., Xu, W. & Yang, Q. 4,4′-Diaponeurosporene-producing Bacillus subtilis promotes the development of the mucosal immune system of the piglet gut. Anat. Rec. 302, 1800–1807 (2019).
Zhang, P., Huang, L., Zhang, E., Yuan, C. & Yang, Q. Oral administration of Bacillus subtilis promotes homing of CD3+ T cells and IgA-secreting cells to the respiratory tract in piglets. Res. Vet. Sci. 136, 310–317 (2021).
Cravens, A., Payne, J. & Smolke, C. D. Synthetic biology strategies for microbial biosynthesis of plant natural products. Nat. Commun. https://doi.org/10.1038/s41467-019-09848-w (2019).
Siziya, I. N., Hwang, C. Y. & Seo, M. J. Antioxidant potential and capacity of microorganism-sourced C30 carotenoids—a review. Antioxidants 11, 1963 (2022).
Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015).
Wittouck, S., Wuyts, S., Meehan, C. J., van Noort, V. & Lebeer, S. A genome-based species taxonomy of the Lactobacillus genus complex. mSystems 4, e00264-19 (2019).
Blin, K. et al. antiSMASH 6.0: improving cluster detection and comparison capabilities. Nucleic Acids Res. 49, W29–W35 (2021).
Eilers, T., Dillen, J., Van de Vliet, N., Wittouck, S. & Lebeer, S. Lactiplantibacillus carotarum AMBF275T sp. nov. isolated from carrot juice fermentation. Int. J. Syst. Evol. Microbiol. 73, https://doi.org/10.1099/ijsem.0.005976 (2023).
Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
Navarro-Muñoz, J. C. et al. A computational framework to explore large-scale biosynthetic diversity. Nat. Chem. Biol. 16, 60–68 (2019).
Temmermans, J. et al. The biocontrol agent Lactiplantibacillus plantarum AMBP214 is dispersible to plants via bumblebees. Appl. Environ. Microbiol. https://doi.org/10.1128/aem.00950-23 (2023).
Legein, M. et al. Compost Teas: market survey and microbial analysis. Preprint at bioRxiv https://doi.org/10.1101/2022.08.05.503013 (2022).
Acknowledgements
The authors would like to thank Tim Van Rillaer for his help with the bioinformatic pipelines, Kato Michiels for her statistical insights, and Ines Tuyaerts, Nele Van de Vliet and Sam Bakelants for their help with the UV stress tests. This research has received funding from the following funding bodies: the European Research Council (ERC; starting grant Lacto-Be 852600 of S.L.), the Belgian Science Policy Office (BELSPO; BRAIN-be 2.0; B@SEBALL (B2/191/P3/B@SEBALL)), the Scientific Research Foundation – Flanders (FWO; doctoral grant of J.T. 1SC3623N and postdoctoral grant of S.W. 12AZ624N), the Industrial Research Fund of the University of Antwerp (IOF-SEP 48530), VLAIO (HBC.2022.1000), and from the revenue of our microbiome platform services.
Author information
Contributions
T.E., M.L., J.T., and S.L. conceptualized the project; S.L., S.W., and P.A.B. supervised the work; T.E. and M.L. drafted the manuscript, with contributions from S.L., P.A.B., S.W., J.T., and J.D. through writing and review; M.L. and J.T. conducted carotenoid extraction and stress resistance experiments; T.E. led pangenome and statistical analyses, with support from M.L., S.W., and S.L.; J.D. was responsible for whole genome sequencing of novel strains; I.V. and K.S. conducted MS analysis; T.E. and M.L. created figures and graphs; all authors reviewed and approved the final manuscript.
Ethics declarations
Competing interests
S.L. received funding from different probiotic companies that were not involved in this research. T.E. is partially funded through an industrial research VLAIO grant not related to this work. M.L. is employed part-time by Biobest Group NV; this company was not involved in this research. I.V. and K.S. are employed by the RIC and their work was compensated via a service contract. P.A.B. is an independent consultant for several companies in the food and pharmaceutical industry bound by confidentiality agreements.
Peer review
Peer review information
Communications Biology thanks Chad Leidy and the other, anonymous, reviewer for their contribution to the peer review of this work. Primary Handling Editor: Tobias Goris. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Eilers, T., Legein, M., Temmermans, J. et al. Distribution of C30 carotenoid biosynthesis genes suggests habitat adaptation function in insect-adapted and nomadic Lactobacillaceae. Commun Biol 7, 1610 (2024). https://doi.org/10.1038/s42003-024-07291-2
DOI: https://doi.org/10.1038/s42003-024-07291-2