Introduction

The gut microbiota (GM) consists of trillions of microorganisms that form a highly dynamic ecosystem, contributing to essential physiological processes such as digestion, immune modulation, and neuroendocrine regulation1,2. The GM of healthy individuals differs markedly from that of patients and is closely associated with the incidence and progression of various diseases1,3, including obesity4, atherosclerosis5, rheumatoid arthritis6, and Parkinson’s disease7. Cultivation-independent metagenomic studies have revealed numerous correlations between specific gut microbes and diseases. For instance, Akkermansia muciniphila is negatively associated with obesity8, Parabacteroides distasonis with rheumatoid arthritis9, and Faecalibacterium prausnitzii with Crohn’s disease10. However, while metagenomics reveals correlations, it often falls short in establishing causal relationships between a specific gut microbe and a defined disease. Recent advancements in culturomics have bridged this gap by enabling the cultivation and preservation of an increasing number of species associated with human health11,12,13, facilitating cause-and-effect studies to explore the mechanisms underlying these associations. For example, culture-dependent studies have revealed that A. muciniphila promotes intestinal homeostasis by producing propionate through the fermentation of fucose14. Similarly, P. distasonis was shown to inhibit non-alcoholic steatohepatitis via the production of the metabolite pentadecanoic acid15 and to alleviate obesity and metabolic dysfunction by releasing succinate and secondary bile acids16. These findings suggest that gut microbes have the potential to be developed as live biotherapeutic products (LBPs) for the treatment of various diseases.
LBPs, also known as next-generation probiotics, are defined by the United States Food and Drug Administration (FDA) as products that contain live organisms, are intended for the prevention, treatment, or cure of a disease or condition of human beings, and are not vaccines17,18,19,20. Despite considerable progress in developing LBPs, their clinical translation faces challenges such as strain-specific adaptability, scalability, and safety concerns. For instance, many LBPs under development are derived from a limited pool of culturable species, often lacking robust metabolic diversity or host-specific adaptability21. To harness the full potential of the LBPs, it is essential to expand the repertoire of cultured gut microbial strains while thoroughly characterizing their metabolic functions and adaptability to the host. This approach will be pivotal for identifying and developing promising LBPs for targeted disease prevention and treatment.

Members of the human gut commensal family Christensenellaceae are promising candidates for LBPs. Their abundance is significantly decreased across multiple disease cohorts, such as obesity22,23, inflammatory bowel diseases23,24, cardiovascular disease25, and non-alcoholic fatty liver disease26. Notably, Christensenella minuta27, the first described species in the family Christensenellaceae, has been shown to alleviate host metabolic disorders through the production of a novel acylated secondary bile acid28. In our previous research, we obtained a strain resource of gut Christensenellaceae termed ChrisGMB (https://hgmb.nmdc.cn/subject/christensenellaceae)12. ChrisGMB (short for Christensenellaceae Gut Microbial Biobank) is a collection of Christensenellaceae strains and genomes comprising 87 strains that represent 14 species. In ChrisGMB, Luoshenia tenuis showed strain diversity and exhibited significantly reduced abundance in cohorts with metabolic diseases, such as obesity and type 2 diabetes12, and was shown to reduce the weight gain of high-fat diet-induced obese mice29, highlighting its potential as an LBP for obesity treatment.

In this study, we collected a total of 27 strains of L. tenuis from ChrisGMB12 and sequenced their complete genomes. By combining genomic analysis with experimental validation, we identified marked diversity in genomic, metabolic, and physiological-biochemical traits across the strains, laying a foundation for further research and potential therapeutic applications of L. tenuis.

Results

Genomic characterization reveals substantial intraspecies diversity among Luoshenia tenuis strains

In our previous study12, we found that treating fecal samples with 25% alcohol effectively facilitated the isolation of strains of L. tenuis, and we isolated 27 strains of L. tenuis from the feces of 23 humans, 3 mice, and 1 monkey. These strains show high 16S rRNA identities (ranging from 99.7 to 100%) to the type strain NSJ-44T13. We sequenced the genomes of 27 L. tenuis strains, and the basic information and general genome features were collected and are described in Supplementary Tables S1 and S2. All genomes were assembled into circular forms, with completeness ranging from 92.53 to 99.19% and contamination from 0 to 0.81%. The genome sizes range from 2.58 Mb (SW57) to 2.77 Mb (SW56), with GC contents ranging from 55.87 mol % (SW56) to 57.79 mol % (SW43, SW50, Fig. 1a). Moreover, as depicted in the phylogenomic trees based on whole-genome sequences (Fig. 1a), L. tenuis strains cluster into three distinct clades, which are independent of the host origins. Further co-linearity analysis (Supplementary Fig. S1) reveals that the genomes within three clusters generally retain a considerable number of conserved regions, except for SW56, which exhibits a higher number of unique sequence elements at local collinear blocks (LCBs) compared to the other genomes. The Average Nucleotide Identity (ANI) values for 27 genomes range from 91.27% to 99.99% (Fig. 1b). Notably, strain SW56 exhibits distinct differences from the other 26 strains, as the pairwise ANI values range from 91.27% to 93.00%. Although this divergence falls below conventional species thresholds of 95%30, we maintained SW56’s classification as L. tenuis based on its 99.85% 16S rRNA identity with the type strain and absence of distinct phenotypic differences. Similarly, SW44 (ANI > 95% with 25 strains but 93.5% with SW56) was also assigned to L. tenuis, demonstrating the complex genomic diversity within this species. 
We then determined the Percentage of Conserved Proteins (POCP) across all genomes, revealing a broad range from 79.34% to 99.44% (Fig. 1c). Notably, genomes of strains SW52, SW56, and SW62 display the least conservation relative to the other strains. Interestingly, the strains with low POCP values do not completely overlap with those exhibiting low ANI conservation. This discrepancy arises because ANI measures nucleotide-level similarity across homologous genomic regions, thus primarily capturing vertical inheritance31, whereas POCP reflects the proportion of shared protein-coding genes32, which can be substantially affected by horizontal gene transfer (HGT) and gene content variation. Therefore, the non-overlapping patterns between ANI and POCP suggest that some L. tenuis strains have undergone differential genetic flux, such as HGT-driven acquisition or loss of accessory genes. These findings highlight the underlying genetic plasticity and evolutionary diversification of L. tenuis at the strain level.
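As an illustration of how a POCP value is obtained, the following minimal sketch applies the commonly used definition, POCP = (C1 + C2) / (T1 + T2) × 100, where C1 and C2 are the numbers of conserved proteins in each genome and T1 and T2 are the total protein counts. The function name and the example counts are hypothetical and are not taken from this study; the criteria used to call a protein conserved are assumed to have been applied upstream.

```python
def pocp(conserved_a: int, conserved_b: int, total_a: int, total_b: int) -> float:
    """Percentage of Conserved Proteins between two genomes.

    conserved_a: proteins of genome A with a qualifying match in genome B
    conserved_b: proteins of genome B with a qualifying match in genome A
    total_a, total_b: total numbers of proteins in each genome
    """
    total = total_a + total_b
    if total <= 0:
        raise ValueError("genomes must contain at least one protein")
    if conserved_a > total_a or conserved_b > total_b:
        raise ValueError("conserved counts cannot exceed total counts")
    return 100.0 * (conserved_a + conserved_b) / total


# Hypothetical protein counts for illustration only (not data from this study).
example = pocp(conserved_a=2000, conserved_b=1900, total_a=2400, total_b=2500)
print(f"POCP = {example:.2f}%")
```

Because the value depends on how many proteins each genome shares rather than on how similar the shared sequences are, two genomes can have a high ANI and still a reduced POCP when one of them has gained or lost many accessory genes.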

Fig. 1: General genome features and pan-genome analysis of L. tenuis.

a Phylogenomic tree constructed from whole-genome sequences and genomic overview of the L. tenuis strains. The first ring represents distinct clusters. The second to fourth rings represent accessory genes, unique genes, and exclusively absent genes, respectively. The fifth and sixth rings are heatmaps showing the distribution of genome sizes and GC contents. The seventh ring represents the origins of the strains. b Average Nucleotide Identity (ANI) values among L. tenuis strains. The heatmap color gradient from yellow to green represents the increasing % ANI values. c Percentage of Conserved Proteins (POCP) values among L. tenuis strains. d Curves for pan-genome (blue) and core-genome (red) sets of L. tenuis strains.

Subsequently, we conducted a pan-core genome analysis. The gene accumulation curve in Fig. 1d indicates that the pan-genome of L. tenuis is open, meaning that the addition of new strains continues to contribute novel genes. In contrast, the number of core genes, which are essential and conserved across all strains, remains stable, indicating a consistent genetic backbone within the species. Specifically, the 27 L. tenuis genomes contain a total of 6659 orthologous genes, of which 1546 (23.22%) are core genes conserved across all strains. Of the remaining genes, 2456 (36.88%) are accessory genes present in multiple genomes, while 2657 (39.90%) are unique to a single genome. The genetic diversity among L. tenuis strains is highlighted by the marked variation in the numbers of unique and exclusively absent genes (Fig. 1a). The count of unique genes ranges from 0 in strain SW66 to 677 in strain SW56. Of the 27 genomes, 16 have exclusively absent genes, varying from 1 (for strains SW45, SW55, and others) to 174 (for strain SW56). Notably, strains SW56, SW52, and SW62, which exhibited the three lowest POCP values, also possess the highest counts of unique and exclusively absent genes. This observation further suggests that these strains may have experienced more gene transfer and loss events throughout their evolutionary history, potentially setting them apart from the others in terms of functions and host-adaptation roles.
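The gene categories used above can be made concrete with a short sketch. It assumes a presence/absence map from each orthologous gene family to the set of genomes carrying it; the function names, the toy families, and the strain labels A, B, and C are hypothetical, and the code is not the pipeline used in this study. The final lines only re-derive the reported percentages from the reported counts.

```python
from collections import Counter


def classify_families(presence: dict[str, set[str]], n_genomes: int) -> dict[str, str]:
    """Label each gene family by the number of genomes that carry it."""
    labels = {}
    for family, strains in presence.items():
        k = len(strains)
        if k == 0:
            raise ValueError(f"family {family!r} is present in no genome")
        if k == n_genomes:
            labels[family] = "core"        # present in every genome
        elif k == 1:
            labels[family] = "unique"      # present in exactly one genome
        else:
            labels[family] = "accessory"   # present in several, but not all
    return labels


def exclusively_absent(presence: dict[str, set[str]], genomes: list[str]) -> Counter:
    """Count, per genome, the families found in every genome except that one."""
    counts = Counter()
    all_genomes = set(genomes)
    for strains in presence.values():
        missing = all_genomes - strains
        if len(missing) == 1:
            counts[next(iter(missing))] += 1
    return counts


# Toy presence/absence data with hypothetical strain labels.
toy = {"fam1": {"A", "B", "C"}, "fam2": {"A", "B"}, "fam3": {"C"}}
labels = classify_families(toy, n_genomes=3)
absent = exclusively_absent(toy, ["A", "B", "C"])

# Re-deriving the percentages from the counts reported in the text.
reported = {"core": 1546, "accessory": 2456, "unique": 2657}
total = sum(reported.values())
shares = {name: round(100 * n / total, 2) for name, n in reported.items()}
print(total, shares)
```

In this sketch an exclusively absent family is a special case of an accessory family, namely one that is missing from a single genome only.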

Luoshenia tenuis demonstrates extensive horizontal gene transfers and plasticity

Horizontal gene transfer (HGT) enables bacteria to acquire genes beneficial for their survival from the surrounding microenvironment33 and thereby leads to increased genetic diversity. Our analysis reveals that the number of HGT events within L. tenuis genomes varies across strains, ranging from 105 to 153 events and constituting 3.76% to 5.55% of their genomes (Fig. 2a). Notably, strain SW56, which exhibits the highest genetic variation, also harbors the most HGT events (153), underscoring HGT as a critical driver of genetic variation in this strain. Functional annotation using the Clusters of Orthologous Groups (COG) database (Fig. 2b) indicates that HGT genes are enriched in pathways related to energy production and conversion (C), cell wall/membrane/envelope biogenesis (M), and other essential functions. To investigate the origins of horizontally transferred genes in L. tenuis, we performed homology-based taxonomic annotation of all predicted HGT events (Supplementary Table S3). While most donor species could only be resolved at higher taxonomic levels (class Clostridia or order Eubacteriales), several HGT genes, particularly in strain SW56, were confidently assigned to human gut-associated Clostridia species, including Faecousia intestinalis, Evtepia gabavorous, and Maliibacterium massiliense. These results suggest that L. tenuis strains acquire foreign genes primarily from gut-resident Clostridia, consistent with their shared niche and phylogenetic proximity34,35.

Fig. 2: Horizontal gene transfer analysis of the L. tenuis genomes.

a Distribution of HGTs and mobile genetic elements (MGEs). b Distribution of COG functional categories among potential horizontal gene transfer (HGT) families. c Comparative genomic analysis of 27 L. tenuis genomes, with the reference genome of strain NSJ-44T.

Since HGT is typically mediated by mobile genetic elements (MGEs)28,29, which facilitate site-specific recombination at tRNA-coding genes36,37, we characterized the tRNA loci and MGEs in L. tenuis genomes. The number of tRNA loci varies across strains, with the majority of strains containing 49 loci and strain SW56 uniquely possessing 51 loci (Supplementary Table S2). The presence of MGEs also varies markedly among genomes (Fig. 2a): insertion sequences (ISs) range from 0 to 24 (Supplementary Table S4), genomic islands (GIs) range from 5 to 13 (Supplementary Table S5), and 1–2 prophages are found in strains SW43, SW47, SW48, SW52, SW62, SW65, SW66, and SW67 (Supplementary Table S6). Genomic islands constitute approximately 10% of the genome (compared to <0.02‰ for ISs and <3% for prophages) and appear to be the primary basis for genetic evolution. These GIs harbor genes related to genetic material regulation and carbohydrate metabolism, including α-amylase (27 strains), ribose/galactose isomerase (14 strains), L-fucose isomerase (10 strains), β-galactosidase (8 strains), and cellulase (8 strains), as well as antibiotic resistance genes. Comparative genomic analysis reveals that most of the highly variable regions in the L. tenuis genomes are located within the GIs (Fig. 2c). This result reinforces the conclusion that GIs drive genetic evolution in L. tenuis.

In addition to MGEs, L. tenuis has also evolved multiple antiviral defense mechanisms to combat viral threats. We identified 10 distinct types of defense systems (Supplementary Fig. S2). These antiviral systems exhibit heterogeneous distributions, with the CRISPR-Cas system being the most prevalent, followed by the DRT and RM systems. This diverse repertoire of antiviral systems highlights the adaptability and resilience of L. tenuis in dynamic host environments.

In-silico profiling of metabolic characteristics and nutritional adaptations of Luoshenia tenuis

To profile the functional characteristics of L. tenuis, we analyzed the pan-genomes of 27 strains utilizing the COG database38. Our analysis reveals that metabolic functions form the core of the genome, with 42.21% of core genes associated with metabolism, particularly carbohydrate transport and metabolism (G), amino acid transport and metabolism (E), and energy production and conversion (C) (Fig. 3a, b). Strains SW52 and SW62 exhibit enrichment of metabolic genes in Class (G), (E), and Coenzyme transport and metabolism (H) (Supplementary Fig. S3). The strain SW56, despite its high level of HGT and genetic variation, shows fewer (G)-related genes. Interestingly, 36.75% of accessory genes and 41.63% of unique genes are linked to information storage and processing, highlighting the role of genetic variation in these functions.

Fig. 3: In-silico prediction and experimental validation of carbon source utilization of L. tenuis.

a COG functional categories distribution in the pangenome of L. tenuis. b Comparison of COG functional categories in the pan-genome of L. tenuis. c CAZymes analysis in L. tenuis. The colour change from blue to red and the size of the circle correspond to the number of CAZymes, with red and larger circles indicating a greater number. d Carbon source utilization profile of 27 L. tenuis strains (Biolog), with blue indicating utilization and white indicating non-utilization. Data are presented from single independent experiments.

Further metabolic characterization reveals that L. tenuis has a specialized ability to degrade plant polysaccharides. CAZyme analysis identifies 64 CAZyme families, with glycoside hydrolases (GHs) making up 39.88% of CAZyme-encoding genes (Fig. 3c). GHs involved in the degradation of plant-derived polysaccharides, such as cellulose (GH1, GH3, GH5, GH31), hemicellulose (GH43, GH51, GH78), and starch (GH13, GH127), are abundant, whereas GHs targeting animal polysaccharides (GH18, GH23, GH25) are limited. This suggests that L. tenuis is highly adapted to plant-based diets, the digestion of which is a key role of the GM. Among shared GHs, GH109, with α-N-acetylgalactosaminidase activity, is the most prevalent (17–19 copies), indicating a potential link to neurological, liver, and developmental health in hosts39. Additionally, we performed microbial trait prediction for the 27 L. tenuis genomes (Supplementary Fig. S4). Over 26 strains possess the genetic basis for utilizing glucose, lactose, raffinose, mannose, and xylose, but lack that for utilizing D-sorbitol, myo-inositol, salicin, or cellobiose. The genetic basis for utilizing carbon sources such as rhamnose, glycerol, trehalose, and D-mannitol varies among strains. This reflects the evolutionary adaptation strategies of L. tenuis to different ecological niches.

Amino acid metabolism analysis reveals that L. tenuis primarily biosynthesizes essential amino acids, including arginine, lysine, cysteine, histidine, and branched-chain amino acids, while some strains (e.g., SW48, SW51, SW60, and SW64-SW67) also possess an ornithine synthesis pathway (Supplementary Table S7). In addition, all 27 strains are annotated with gene clusters for the production of betalactone and ranthipeptide (Supplementary Table S8). Betalactone has attracted extensive attention due to its antimicrobial, anticancer, and anti-obesity properties40. Notably, the betalactone natural product derivative tetrahydrolipstatin (Xenical, Alli) is approved by the FDA as a weight loss drug41. This further highlights the potential differential effects of distinct strains on the host.

To validate genomic predictions of carbon utilization, the BIOLOG AN MicroPlate was used (Fig. 3d). The results show that more than 25 strains can utilize glucose, lactose, mannose, and melibiose, which is consistent with the genome annotation results (Supplementary Table S9). However, we also found some discrepancies between phenotypic data and genomic predictions: although in vitro experiments detected metabolic activity of strains towards myo-inositol, maltose, salicin, and D-sorbitol, the corresponding traits are not identified in genome annotations. These differences suggest that, in L. tenuis, enzymes with substantial sequence divergence but conserved functionality might mediate the metabolism of these carbon sources, or that non-canonical metabolic pathways exist. These findings clarify the nutrient metabolism traits of L. tenuis strains, which are crucial for the GM to adapt to and interact with the host42,43.

Luoshenia tenuis exhibits robust acidity tolerance but limited bile salts and antibiotic resistance

The high acidity in the host stomach and the high concentration of bile salts in the proximal intestine are crucial factors that affect the colonization and functionality of orally administered LBPs44. We systematically evaluated the tolerance of L. tenuis to high acidity and bile salts. Genomic analysis reveals that the core genome of L. tenuis contains many genes related to the bacterial acid tolerance response (ATR), such as GrpE (number of gene copies = 5), DnaK (6), DnaJ (5), UvrA (3), Ssb (3), hsp90 (3), and GroES (3) (Supplementary Table S10)45,46. After treatment under different pH conditions for 4 hours (Fig. 4a), most strains (18 strains) maintained viability at pH 2.0, which approximates the highly acidic conditions of the human stomach. However, most strains show survival below 10% at pH levels between 2.0 and 3.5, indicating a general sensitivity to acidic conditions within this range despite retained viability. Notably, strains SW62 and SW67 demonstrate superior acid tolerance, with substantially higher survival rates of 52.86% and 71.32%, respectively, at pH 3.5. In contrast to their widespread acid resistance, the bile salt tolerance of L. tenuis strains is generally poor. In mmGAM medium containing 0.3% (m/v) ox bile powder, only four strains (SW43, SW54, SW57, and SW66) are able to survive (Fig. 4b). Notably, these same four strains also exhibit robust acid resistance, suggesting they may possess unique stress response mechanisms. However, the genomic annotation results contradict our experimental phenotypes: Traitar annotations indicate that all strains have bile tolerance (Supplementary Fig. S4). Although all strains could perform bile acid hydrolysis reactions, and bile salt hydrolase (BSH) activity is generally considered to be associated with bacterial bile acid tolerance47,48, it is clear that the presence of this activity does not fully equate to an actual tolerance phenotype.
This finding emphasizes the importance of combining experimental validation with genomic analysis, particularly when assessing bacterial environmental adaptability. Additionally, genes related to spore formation are conserved across strains (Supplementary Table S10 and Supplementary Fig. S4). Using Browne’s established method49, we calculated spore formation signature scores based on 66 conserved sporulation marker genes, with all strains scoring between 0.55 and 0.58, strongly supporting their spore-forming capacity (Supplementary Table S11). Further spore staining results of seven representative strains from three different evolutionary clusters (Supplementary Fig. S5) confirm the above analysis and indicate that sporulation plays an important role in the survival and persistence of L. tenuis.

Fig. 4: Genome annotation and test of tolerances to high acidity and bile salts and antibiotic resistance of L. tenuis.

a The percentage survival of L. tenuis strains after a 4-h treatment at different pH (2.0, 2.5, 3.0, 3.5). Values of the heatmap represent mean percentages relative to the pH 7.0 control (set as 100% survival). Values on the heatmap indicate standard deviations from three replicates. b Bile salts tolerance of L. tenuis strains. The time required to reach OD600 = 0.3 was determined (maximum 120 h). Strains cultured in bile-free broth served as the control. Strains not displaying orange bars (indicating no detectable growth in bile salt-supplemented medium) were considered bile salt-sensitive. Values on the panel indicate standard deviations from three replicates. c Antibiotic resistance profiles of the 27 L. tenuis strains (disk diffusion, n = 3). Numbers in parentheses indicate resistant strains for each antibiotic. All strains remained susceptible to the other 8 tested antibiotics. Detailed antibiotic susceptibility data are provided in Supplementary Fig. S6.
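The normalization described for panel a can be sketched as follows. This is a minimal illustration only: it assumes that each replicate viability reading is divided by the mean of the pH 7.0 control readings, that the readout is any quantity proportional to viable cells, and that the sample standard deviation is reported. The function name and the readings are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def percent_survival(treated: list[float], control: list[float]) -> tuple[float, float]:
    """Return (mean, SD) of survival in percent relative to the control mean.

    treated: replicate viability readings after acid treatment
    control: replicate viability readings at pH 7.0 (defined as 100% survival)
    """
    if len(treated) < 2:
        raise ValueError("at least two treated replicates are needed for an SD")
    control_mean = mean(control)
    if control_mean <= 0:
        raise ValueError("control readings must have a positive mean")
    values = [100.0 * t / control_mean for t in treated]
    return mean(values), stdev(values)


# Hypothetical triplicate readings in arbitrary units (illustration only).
avg, sd = percent_survival(treated=[50.0, 60.0, 70.0], control=[100.0, 100.0, 100.0])
print(f"survival = {avg:.1f}% (SD {sd:.1f})")
```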

To assess the risk of antibiotic resistance dissemination, we analyzed the genomes of the 27 L. tenuis strains and performed in vitro antibiotic tests using the disk diffusion method (see Supplementary Table S9 for detailed data). The in vitro susceptibility tests show that all strains are sensitive, to varying degrees, to penicillin, vancomycin, carbenicillin, ampicillin, amoxicillin, cefoperazone, ciprofloxacin, rifampin, and chloramphenicol, but are resistant to gentamicin, streptomycin, polymyxin B, and bacitracin (Fig. 4c and Supplementary Fig. S6). Because the CARD database is limited in annotating resistance genes in non-pathogenic bacteria, we also employed the KEGG database for annotation47. Genomic analysis reveals that all 27 strains carry resistance genes for gentamicin, streptomycin, polymyxin B, and bacitracin, but no resistance genes are detected for vancomycin, ciprofloxacin, or rifampin. These results are highly consistent with the in vitro susceptibility tests (Supplementary Table S9). Further analysis reveals that although β-lactam resistance genes are annotated in the genomes, all tested β-lactam antibiotics (penicillin, carbenicillin, ampicillin, amoxicillin) effectively inhibited L. tenuis growth in vitro, except for cefixime. For macrolides (azithromycin, clarithromycin, erythromycin), the resistance phenotype and genotype correspond only partially: all strains except SW59, SW61, and SW64 exhibit co-occurring resistance phenotypes in vitro, but only five strains have corresponding resistance genes detected in their genomes.
Additionally, while no clindamycin resistance genes are annotated in the genomes, strains SW47, SW53, SW57, and SW66 display in vitro resistance. Tetracycline resistance shows a similar discrepancy: 16 strains carry tetracycline resistance genes, but only three demonstrate actual resistance. These results highlight the importance of integrating genomic analysis with experimental validation to fully understand L. tenuis resistance mechanisms: genomic predictions can reveal potential resistance risks, while phenotypic testing reflects actual resistance profiles. Their combination provides more precise guidance for clinical treatment and resistance control. To assess the potential transmission risk of these antibiotic resistance genes, we systematically analyzed the distribution of resistance genes within MGEs. The results show that resistance genes for aminoglycosides, β-lactams, and chloramphenicol are all detected in genomic islands and phage regions of L. tenuis (Supplementary Tables S5 and S6), suggesting that these resistance genes may have been acquired through HGT.
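The genotype–phenotype comparison described above amounts to a two-by-two tabulation per antibiotic, which the following sketch illustrates. It assumes one set of strains with an annotated resistance gene and one set of strains that are resistant in vitro; the function names and the strain labels S1 to S6 are hypothetical and do not correspond to the strains or results of this study.

```python
def concordance(gene_pos: set[str], resistant: set[str], all_strains: set[str]) -> dict[str, int]:
    """Tabulate genotype against phenotype for one antibiotic."""
    if not (gene_pos <= all_strains and resistant <= all_strains):
        raise ValueError("every strain must be listed in all_strains")
    return {
        "gene+/resistant": len(gene_pos & resistant),
        "gene+/susceptible": len(gene_pos - resistant),
        "gene-/resistant": len(resistant - gene_pos),
        "gene-/susceptible": len(all_strains - gene_pos - resistant),
    }


def agreement(table: dict[str, int]) -> float:
    """Fraction of strains whose genotype and phenotype agree."""
    total = sum(table.values())
    if total == 0:
        raise ValueError("no strains to compare")
    return (table["gene+/resistant"] + table["gene-/susceptible"]) / total


# Hypothetical strains and calls (illustration only).
strains = {"S1", "S2", "S3", "S4", "S5", "S6"}
table = concordance(gene_pos={"S1", "S2", "S3"}, resistant={"S1", "S4"}, all_strains=strains)
print(table, f"agreement = {agreement(table):.2f}")
```

A large gene+/susceptible cell corresponds to the pattern reported for tetracycline, where annotated genes outnumber resistant phenotypes, whereas a large gene-/resistant cell corresponds to the clindamycin pattern.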

Beneficial metabolite production and bile acid biotransformation with Luoshenia tenuis

Finally, we profiled the production of volatile metabolites and the conversion of bile acids with 27 L. tenuis strains. Detection was performed using solid-phase microextraction-gas chromatography-mass spectrometry (SPME-GCMS) and gas chromatography-mass spectrometry (GCMS). A total of 94 characteristic volatile metabolites were identified (Fig. 5a and Supplementary Table S12). The numbers of metabolites produced vary from 24 (SW66) to 54 (SW53, SW58) (Fig. 5b). These metabolites were classified into 9 chemical groups, with esters (20) and alcohols (13) being the most abundant classes (Fig. 5c). Notably, all strains consistently produce key metabolites, including L-lactic acid, 3-phenyl-1-propanol, and trans-farnesol, suggesting their importance in the core metabolic activities of L. tenuis. The production of L-lactic acid is corroborated by our identification of the L-lactate dehydrogenase gene (EC:1.1.1.27) in the genome annotations. This metabolite may be mechanistically linked to weight control by reducing voluntary food intake in mice, as shown in our previous research29. Furthermore, short-chain fatty acids (SCFAs) production was not detected in any of the strains.

Fig. 5: Metabolic profiling of L. tenuis strains.

a Volatile metabolite heatmap. The color gradient indicates peak area intensities from GC-MS analysis (dark green, high abundance; white, not detected). Compounds are grouped by chemical class. Categories containing fewer than three compounds are classified as “Others.” Numbers in parentheses indicate the quantity of compounds within each category. b The total number of metabolites of different L. tenuis strains. c The number of different types of metabolites from different L. tenuis strains. d Quantitative detection of bile acid biotransformation. The substrates used in this study were cholic acid (CA), chenodeoxycholic acid (CDCA), and taurocholic acid (TCA).

Gut microbe-mediated biotransformation of bile acids plays a crucial role in the host’s digestion, metabolism, and immune regulation50,51. Through liquid chromatography-mass spectrometry (LC-MS) analysis, we found that L. tenuis could catalyze the dehydroxylation, dehydrogenation, 24’-amidation, and hydrolysis of bile acids (Fig. 5d and Supplementary Table S13). These activities differ remarkably from those of members of the genus Christensenella12. Interestingly, although L. tenuis strains exhibit hydrolysis activity, no bile salt hydrolase (BSH) was identified in their genomes through either annotation or BLAST searches, suggesting the presence of less conserved BSH homologs responsible for this function. Moreover, the absence of 3-O-acylation modification of bile acids in L. tenuis further suggests that L. tenuis and C. minuta employ different mechanisms in regulating host metabolism, highlighting the need for further investigation.

Discussion

In our previous work, we collected 27 L. tenuis strains using an improved sample pretreatment and bacterial cultivation method12. In this study, by integrating phenotypic assays, genomic analysis, and metabolic profiling, we characterized the strain-level diversity, ecological adaptations, and potential functional attributes of L. tenuis as a promising next-generation probiotic. Our prior in vivo research demonstrated its ability to alleviate obesity-related metabolic disturbances in mice29. Analyses of human cohorts revealed that while the overall prevalence of L. tenuis in healthy and diseased populations remained relatively stable, its relative abundance was significantly reduced in individuals with atherosclerotic cardiovascular disease, obesity, non-alcoholic fatty liver disease, and type 2 diabetes12. These findings underscore the ecological resilience of L. tenuis and suggest that shifts in abundance or functional capacity, rather than mere presence, may underlie its association with metabolic health. The genomic variation and metabolic diversity at the strain level revealed in this study complicate the associations of L. tenuis with host health and disease and need further investigation.

Although the 16S rRNA gene sequences are highly conserved, whole-genome phylogenomic analysis reveals substantial genomic divergence, forming three distinct evolutionary clusters. Specifically, strain SW56 forms a separate phylogenetic branch with ANI values below the 95% species threshold30, resembling intraspecies differentiation seen in Streptococcus species52. However, these genomic differences are not reflected in basic phenotypic traits such as carbon source utilization or stress tolerance, suggesting conserved core metabolic functions across strains and arguing against assigning SW56 to a separate taxonomic unit. Moreover, we acknowledge that deeper phenotypic assays are needed to explore whether subtle functional differences exist among the phylogenetic clusters identified in Fig. 1a.

We observed frequent HGT events in L. tenuis, accounting for 3.76–5.55% of each genome. This proportion is slightly higher than that observed in some Gram-negative bacteria, such as Bacteroides cellulosilyticus, in which HGT accounts for 3% of the pangenome53, but lower than that in known HGT-active species like Akkermansia muciniphila, whose genome contains 6.63–10.3% HGT54. Notably, the genes acquired through HGT are predominantly enriched in categories involved in genetic material regulation, energy utilization, and defense mechanisms. This finding is consistent with studies showing that HGT is not only associated with bacterial pathogenicity but also plays a central role in the evolution of new functions within microbial communities33. The significant variation in HGT proportions among different microbial groups may reflect the diversity of their ecological strategies. For example, the archaeon Methanosarcina acetivorans shows an exceptionally high HGT proportion of 25.2%, which may be related to its requirements for complex ecological interactions55. Moreover, although L. tenuis strains were isolated from different hosts, including humans, mice, and monkeys, phenotypic analysis revealed no host-specific adaptations. Comparative genomics shows that human-derived strains are enriched in genes encoding solute-binding proteins (SBPs), potentially reflecting enhanced substrate selection linked to the diverse human diet56. SBPs may also aid in identifying prebiotics that promote L. tenuis colonization. However, the small number of non-human isolates (n = 4) limits the generalizability of these findings.

Notably, our comparative analysis between in silico genomic predictions and in vitro phenotypic assays across L. tenuis strains shows both expected concordance and informative discrepancies. Genomic annotations and spore signature scores consistently predict sporulation capability49, which is validated by microscopy-based spore staining in representative strains spanning all phylogenetic clusters. Similarly, BIOLOG assays confirm the utilization of major carbohydrates such as glucose, lactose, and mannose, aligning with genome-encoded repertoires. However, certain carbon sources (e.g., salicin, myo-inositol, sorbitol) show positive metabolic activity in vitro despite lacking annotated degradation pathways, suggesting the involvement of divergent or non-canonical enzymes, a phenomenon that is widely observed in non-model prokaryotes57. In the case of bile tolerance, although all strains exhibit bile salt hydrolysis activity and are predicted to tolerate bile by Traitar, only four strains demonstrate actual growth in bile-supplemented media, highlighting a disconnect between genomic potential and phenotypic expression that may reflect strain-specific regulation58,59. Furthermore, antibiotic resistance profiling reveals partial correspondence: while resistance to aminoglycosides is supported by both genomic islands and phenotypic assays, tetracycline and macrolide resistance show only limited genotype–phenotype agreement. These findings underscore the importance of complementing in silico predictions with empirical validation to fully resolve functional capacities in candidate probiotic strains11,60.

Metabolomic analysis indicates that L. tenuis produces various metabolites, including L-lactic acid and trans-farnesol. Lactic acid is recognized as an important regulator of the immune system, capable of inhibiting pro-inflammatory responses in macrophages61,62, reducing gut permeability63, and promoting hippocampal neurogenesis and cognitive function in adults64. Importantly, lactic acid-derived Lac-Phe can act as a signaling molecule to suppress feeding and obesity65. Our previous research demonstrated that L. tenuis improves host metabolism by suppressing appetite in model mice29. We hypothesize that L. tenuis promotes metabolic improvement in the host by metabolizing complex carbohydrates into various products, including lactic acid. Additionally, L. tenuis can modify bile acids through dehydroxylation, dehydrogenation, amidation, and hydrolysis. Microbiome-targeted bile acid therapies have proven successful, prompting further investigation of bile acid-based treatments66. Various bile acids interact as signaling molecules with the farnesoid X receptor67 and the vitamin D receptor68, thereby regulating host metabolism. Notably, we also found that some strains possess PTPS, the key enzyme for synthesizing the cofactor tetrahydrobiopterin (BH4) (Fig. 2c). However, our metabolomics analysis failed to detect BH4 production, likely due to the detection bias of the GC-MS-based method used in this study. According to previous studies, BH4 plays a crucial role in nervous69, immune70, cardiovascular71, and metabolic72 diseases, and has occasionally been reported to be produced by gut microbes such as Bifidobacterium longum and Bifidobacterium adolescentis73. Although the clinical application of BH4 is limited by its instability and susceptibility to oxidation74, BH4 produced endogenously by GM may overcome these challenges, potentially preserving its biological activity more effectively in situ.
Overall, our study provides insights into the application of L. tenuis in modulating host metabolism and immunity, highlighting its potential value in maintaining GM health.

While this work provides essential strain resources, genomic insights, and metabolic traits, supporting the further development of L. tenuis as a potential LBP, several limitations remain. This study is largely based on in silico and in vitro analyses. Many of the functional predictions based on genome annotation, such as biosynthetic capacities and metabolic traits, have not yet been systematically validated through experiments, and their actual physiological relevance and impact on host health remain to be fully elucidated. In vivo studies are necessary to verify the therapeutic potential of L. tenuis under physiological conditions.

Methods

Taxonomic assignment of isolates

All 27 isolates were taxonomically assigned to Luoshenia tenuis based on three criteria: (i) 16S rRNA gene similarity >98.65% to the type strain NSJ-44T; (ii) ANI values ≥ 95% among most strains, with minor exceptions considered acceptable given documented cases (e.g., Streptococcus mitis/oralis) where ANI < 95% does not correspond to species divergence; and (iii) phenotypic consistency across key traits including carbon source utilization, growth kinetics, and antibiotic susceptibility. Only when all three lines of evidence support divergence would a new species designation be considered.
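The decision rule in the last sentence above can be written as a short sketch. The function name and the example values are hypothetical, and treating "phenotypic consistency" as a single yes/no input is a simplifying assumption; only the 98.65% and 95% cutoffs come from the text.

```python
def supports_new_species(ssu_similarity_pct, ani_pct, phenotype_consistent):
    """Return True only if all three lines of evidence indicate divergence.

    ssu_similarity_pct: 16S rRNA gene similarity to the type strain (%)
    ani_pct: ANI to the reference strain (%)
    phenotype_consistent: True if key traits match the species (assumed boolean)
    """
    divergent_16s = ssu_similarity_pct <= 98.65   # criterion (i) not met
    divergent_ani = ani_pct < 95.0                # criterion (ii) not met
    divergent_phenotype = not phenotype_consistent  # criterion (iii) not met
    return divergent_16s and divergent_ani and divergent_phenotype


# Hypothetical isolate: ANI slightly below 95%, but 16S and phenotype agree
# with the species, so it is not proposed as a new species.
borderline = supports_new_species(99.5, 94.2, True)
```

Under this rule, a single sub-threshold ANI value is not sufficient on its own to split an isolate from the species.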

Test of the biological characteristics of Luoshenia tenuis strains

The pH tolerance, bile acid tolerance, and antibiotic resistance of the strains were tested based on previous studies12. All experiments were performed with three biological replicates. Briefly: 1) Fresh bacterial cell pellets were resuspended in PBS at different pH values (2.0/2.5/3.0/3.5/7.0) and incubated for 4 h at 37 °C, then plated on agar plates (pH = 7.0), and the number of colonies was counted. The average number of bacterial colonies (three replicates) obtained after incubation at pH = 7.0 served as the control. Percentage survival = (number of bacterial colonies under different pH conditions [mean of triplicates]/number of bacterial colonies at pH = 7 [mean of triplicates]) × 100%. 2) Bacterial suspensions were inoculated at 3% (v/v) into liquid medium containing 0.3% commercial ox bile powder (Shyuanye, CHN) and cultured anaerobically at 37 °C. Strains inoculated in bile-free broth served as the control. The OD600 was measured until it reached 0.3 (maximum cultivation time: 120 h). 3) Bacterial suspensions were plated on agar plates, and antibiotic discs containing quantified antimicrobial agents were placed on the surface, allowing the agents to diffuse through the agar. After incubation at 37 °C for 48 h, the inhibition zone diameters were measured. The antibiotic resistance experiments were performed using the disk diffusion method75 following the Clinical and Laboratory Standards Institute (CLSI) M100 guidelines, with the presence/absence of inhibition zones as the resistance criterion.
The 20 antibiotics used were (Bioroyee, CHN; μg per disc unless otherwise stated): Ampicillin (30), Penicillin (10 IU), Cefoperazone (30), Cefixime (5), Amoxicillin (10), Carbenicillin (100), Vancomycin (30), Bacitracin (0.04 IU), Polymyxin B (300 IU), Gentamicin (10), Kanamycin (30), Streptomycin (10), Tetracycline (30), Chloramphenicol (30), Clindamycin (2), Erythromycin (15), Clarithromycin (15), Azithromycin (15), Ciprofloxacin (5), and Rifampin (5). Unless otherwise noted, all strains used in the in vitro experiments were cultured in mmGAM medium12 at 37 °C under anaerobic conditions (85% N₂, 10% H₂, 5% CO₂), with the medium pH = 7.0. The pH of the medium was adjusted using 1 M HCl.
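The percentage survival calculation in the pH tolerance assay can be illustrated with a minimal sketch. The function name and the colony counts are hypothetical and are not data from this study.

```python
from statistics import mean


def percent_survival(treated_counts, control_counts):
    """Survival (%) = mean treated colony count / mean pH 7 control count x 100."""
    control_mean = mean(control_counts)
    if control_mean <= 0:
        raise ValueError("control plates must contain colonies")
    return mean(treated_counts) / control_mean * 100.0


# Hypothetical triplicate colony counts (illustrative only)
ph3_counts = [42, 38, 40]   # after 4 h at pH 3.0
ph7_counts = [80, 78, 82]   # after 4 h at pH 7.0 (control)
survival_ph3 = percent_survival(ph3_counts, ph7_counts)
```

Because the ratio is taken between triplicate means, it yields one survival value per condition rather than one per replicate.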

Spores were identified using a spore staining kit (Hopebio, CHN) following the manufacturer’s instructions. After staining, the results were observed and recorded using an optical microscope (Eclipse Ci-L plus; Nikon, JPN) equipped with a 100× oil immersion objective. Carbon source utilization was evaluated using the BIOLOG AN MicroPlate (BIOLOG, USA) according to the manufacturer’s protocol. Optical density was measured at both 590 nm and 750 nm. Background correction was performed by subtracting the A1 control well values. The final metabolic activity was calculated as OD590 − OD750. A carbon source was considered utilized when the corrected OD value reached ≥0.05. One strain was tested per plate without replicates.
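The BIOLOG scoring rule can be sketched as follows. The function names and the readings are hypothetical. The sketch takes the per-well difference OD590 − OD750 and then subtracts the same difference for well A1, which is algebraically equivalent to subtracting the A1 value at each wavelength first.

```python
def corrected_activity(od590, od750, well, blank="A1"):
    """Background-corrected metabolic activity for one well."""
    signal = od590[well] - od750[well]
    background = od590[blank] - od750[blank]
    return signal - background


def utilized_wells(od590, od750, threshold=0.05, blank="A1"):
    """Wells whose corrected activity reaches the utilization threshold."""
    return sorted(
        well for well in od590
        if well != blank
        and corrected_activity(od590, od750, well, blank) >= threshold
    )


# Hypothetical plate readings (illustrative only)
od590 = {"A1": 0.10, "A2": 0.45, "A3": 0.13}
od750 = {"A1": 0.05, "A2": 0.08, "A3": 0.07}
positive = utilized_wells(od590, od750)
```

In this example well A2 (corrected value about 0.32) is scored as utilized, whereas well A3 (about 0.01) is not.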

Genome sequencing and analysis

Bacterial genomic DNA was extracted using the DNeasy Blood & Tissue Kit (Qiagen, GER) following the manufacturer’s instructions. DNA concentration was measured with a Qubit™ 4 Fluorometer (Thermo Fisher Scientific, USA). Complete genome sequencing and assembly were performed by Magigene (Guangdong Magigene Biotechnology Co., Ltd., CHN) using both Illumina NovaSeq 6000 and Nanopore MinION sequencing platforms. The sequencing depth was 1G for all strains. The phylogenomic tree was constructed from whole-genome sequences using the CVTree method76 and visualized with the iTOL online tool77. Genome completeness and contamination were assessed using CheckM78. ANI values were calculated using FastANI v1.3331. POCP values were calculated using computePOCP+AAI_v2.1_win (https://github.com/2015qyliang/POCP). Genomic sequence alignment and synteny analysis were performed using Mauve software (version 20150226)79. Pan-genomic analysis was conducted using Roary80 and BPGA-1.381, with a 90% sequence identity cutoff for gene clustering. Functional annotation of the coding genes was carried out using the eggNOG 5.0 database82, the CAZy83, COG38, and CARD84 databases, and METABOLIC 4.0 software85. Phenotypes were predicted from the genomes of the 27 strains using Traitar 3.0.186, which covers 67 diverse microbial traits. Bacterial secondary metabolite biosynthetic gene clusters (BGCs) were annotated using antiSMASH 8.087. Comparative genomic analysis was performed using BRIG-0.95-dist software88. The spore signature score was calculated according to the method described by Browne et al.49. Briefly, the authors identified 66 signature sporulation genes through large-scale data analysis and assigned a weight to each gene. By performing functional annotation of the genomes, the presence of these 66 genes was examined, and a spore signature score (0–1) was generated based on the detection percentage (weighted score) of these genes.
Combined with experimental validation, the authors proposed a threshold of ≥0.5 to classify a strain as spore-forming.
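The scoring idea can be sketched as follows, assuming the score is the weight-normalized fraction of signature genes detected in a genome. The exact weighting scheme is that of Browne et al.49 and is not reproduced here; the gene identifiers and weights below are hypothetical placeholders, and only the ≥0.5 threshold comes from the text.

```python
def spore_signature_score(detected_genes, gene_weights):
    """Weighted fraction (0-1) of signature genes found in a genome.

    gene_weights: mapping of signature gene -> weight (placeholder values here)
    detected_genes: set of signature genes present in the genome annotation
    """
    total_weight = sum(gene_weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    found_weight = sum(
        weight for gene, weight in gene_weights.items() if gene in detected_genes
    )
    return found_weight / total_weight


def is_predicted_spore_former(score, threshold=0.5):
    """Apply the >=0.5 classification threshold described in the text."""
    return score >= threshold


# Hypothetical three-gene panel standing in for the 66 signature genes
weights = {"gene_1": 2.0, "gene_2": 1.0, "gene_3": 1.0}
score = spore_signature_score({"gene_1", "gene_3"}, weights)
```

With these placeholder values the genome scores 0.75 and would be classified as spore-forming.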

NeoHGT289 (https://github.com/cgneo/neoHGT) was used to identify potential horizontal gene transfer events in L. tenuis. Luoshenia (Rank: genus; Taxon ID: 2944144) and Christensenellaceae (Rank: family; Taxon ID: 990719) were set as self- and close-group references, respectively. tRNAscan-SE (Version 1.3.1)90 was used to predict tRNA genes. Insertion sequences were predicted using the ISfinder91 web server. Phages were predicted using PHAST92. Genomic islands were predicted using IslandPath-DIMOB93. The antiviral defense systems were identified by DefenseFinder (v1.3.0)94.

Metabolite determination

Metabolic products were determined using methods described in previous studies12,95. Briefly, the concentrations of SCFAs (including acetate, propionate, butyrate, valerate, isobutyrate, and isovalerate) were determined using GC-MS. Bacterial cells were grown in mmGAM medium to stationary phase; 1 mL of bacterial suspension was then mixed with 1 mL ethyl acetate and vortexed to extract fermentation products. After centrifugation, the supernatant was analyzed for SCFAs using GC-MS (GCMS-QP2010PLUS, JPN). Sterile liquid medium was used as a blank control. Additionally, other volatile metabolic products were measured using GC-MS after SPME (Q Exactive GC Orbitrap GC-MS/MS, Thermo Fisher, USA) under identical culture conditions95, with comparisons based solely on peak area. Sterile, non-inoculated mmGAM medium was used as the control. For consistency, all results were normalized by dividing each value by the maximum OD600 reached by the corresponding strain.
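The OD600 normalization can be illustrated with a minimal sketch. The function name, peak areas, and growth curve are hypothetical and are not data from this study.

```python
def normalize_to_max_od600(values, od600_readings):
    """Divide each metabolite value by the strain's maximum OD600."""
    max_od = max(od600_readings)
    if max_od <= 0:
        raise ValueError("maximum OD600 must be positive")
    return {metabolite: value / max_od for metabolite, value in values.items()}


# Hypothetical peak areas and growth curve for one strain (illustrative only)
peak_areas = {"L-lactic acid": 1200.0, "trans-farnesol": 300.0}
growth_curve = [0.1, 0.6, 2.0, 1.9]
normalized = normalize_to_max_od600(peak_areas, growth_curve)
```

Dividing by the maximum OD600 makes values comparable between strains that reach different cell densities in the same medium.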

For the bile acid transformation experiments, methods from previous research12,28 were used with minor modifications. Briefly, strains were cultured in a modified PYG medium (see Supplementary Table S14) for 2 days to reach the stationary phase. Cells were then harvested as pellets and resuspended in fresh broth supplemented with 1 mM of bile acids (CA, CDCA, and TCA), 1 mM SCFA mixture, and 1 mM amino acid mixture. The suspension was adjusted to an OD600 of 1.4 and incubated at 37 °C for 48 h for biotransformation. After centrifugation, the fermentation broth was collected for detection of secondary bile acid formation using an Agilent Accurate-Mass-Q-TOF LC/MS 6520B system. Data analysis was performed using Agilent MassHunter Qualitative Analysis v6.0 software. LC-MS methodological details are described in our previous study28. No replicates were performed for the metabolomics analysis.

All data visualization was performed using RStudio 2023.06.1+524 (https://www.R-project.org/) and GraphPad Prism 9.0.0 (https://www.graphpad.com/), unless otherwise specified.