Background & Summary

The coral reef ecosystem, often called the “tropical rainforest of the ocean”, provides habitat and shelter for approximately 30% of marine organisms1,2. However, nearly 30% of reef-building corals are currently endangered by climate change and human activities, and coral bleaching continues to intensify3,4,5. Elevated ocean temperatures increase coral susceptibility to pathogens and alter nutrient cycling dynamics, particularly nitrogen metabolism6,7. Nitrogen is essential for maintaining the symbiotic relationship between corals and their microbial partners8, and disrupting nitrogen cycling between symbiont and coral can lead to coral bleaching and disease9,10.

Among various nitrogen sources, urea stands out as a critical nutrient in coral ecosystems11. Unlike nitrate and ammonium, urea uptake by coral symbioses increases under thermal stress, even as the absorption of other nitrogen forms declines12. This shift induced by thermal stress makes urea a pivotal compensatory nitrogen source, yet the microbial taxa responsible for urea mobilization and the specific pathways through which they facilitate this process remain poorly defined. Focusing on urea-utilizing bacteria is thus essential to unraveling coral nitrogen cycling under climate change.

Vibrio species, ubiquitous in coral microbiota, exhibit dual ecological roles: they can act as pathogens causing coral diseases13,14,15,16,17,18, but they can also participate in nutrient provisioning for coral hosts19,20. Critically, Vibrio strains dominate nitrogen-fixing communities in coral symbioses21,22 and are known to participate in regulating nitrogen cycling. However, the mechanisms by which Vibrio utilize urea, including the genetic basis such as urease-encoding genes and their regulatory pathways, and their quantitative contribution to coral urea-derived nitrogen uptake under thermal stress, remain largely uncharacterized.

To address this knowledge gap, this study isolated and characterized 18 urea-utilizing Vibrio strains from typical corals at four locations in Fujian, Guangdong, and Hainan provinces, China. Species identification was performed using the 16S rRNA gene, and phylogenetic trees were constructed by Multilocus Sequence Analysis (MLSA) of nine conserved genes. Bacterial genomes were sequenced using Illumina technology, and Benchmarking Universal Single-Copy Orthologs (BUSCO)23 was used to evaluate genome completeness. These genomic data will provide a molecular blueprint for deciphering Vibrio’s evolutionary trajectories, adaptive mechanisms, and roles in coral nitrogen cycling.

Methods

Sample collection

Coral samples (a-h) were collected from four sites: Dongshan Eryu Island (23.7°N, 117.4°E) in Fujian Province; Sanmen Island (22.5°N, 114.2°E) and Egong Bay (22.5°N, 113.9°E) in Shenzhen; and Luhuitou (18.1°N, 109.4°E) in Sanya (Fig. 1A). Samples for live bacterial screening were kept in seawater and transported to the laboratory. Each coral was washed three times with autoclaved seawater and transferred into a 50 mL centrifuge tube. Next, 30 mL of 2216E liquid medium was added to the tube, and the contents were homogenized using steel beads. A 5 mL aliquot of the tissue homogenate was incubated overnight at 25 °C with shaking at 130 rpm. The overnight culture was diluted 1000-fold with 2216E liquid medium, spread on 2216E solid medium, and incubated overnight at 25 °C to obtain single colonies. One hundred single colonies were selected, and each was transferred into a 1.5 mL centrifuge tube containing 1 mL of 2216E liquid medium for overnight cultivation. The bacterial cultures were then spread on urea agar indicator medium and incubated overnight at 25 °C. Urease-positive bacteria were selected and transferred to 2216E solid medium for quadrant streaking to obtain single colonies. This process was repeated twice to ensure pure cultures of urea-utilizing bacteria (Fig. 1B). Finally, the strains were preserved at −80 °C in 2216E liquid medium containing 25% glycerol.

Fig. 1 Sampling sites (A) and workflow for the isolation and purification of urea-utilizing bacteria (B).

Species identification

Genomic DNA (gDNA) from the 18 bacterial strains was extracted using a commercial bacterial genomic DNA extraction kit (Tiangen, China). PCR amplification of the bacterial 16S rRNA gene was performed using primers 27F (5′-AGAGTTTGATCMTGGCTCAG-3′) and 1492R (5′-TACGGYTACCTTGTTACGACTT-3′). The PCR products were sequenced, and the sequences were searched against the EzBioCloud database using the EzBioCloud platform (https://www.ezbiocloud.net/). The 18 strains were assigned to six Vibrio species: Vibrio harveyi (8), Vibrio campbellii (1), Vibrio rotiferianus (4), Vibrio natriegens (1), Vibrio owensii (3), and Vibrio jasicida (1).
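Both primers contain one IUPAC degenerate base (M in 27F, Y in 1492R), so each primer is in practice a mixture of two oligonucleotides. The following minimal Python sketch expands the two primer sequences given above into their unambiguous variants. The function name `expand_primer` is hypothetical and introduced only for illustration; the ambiguity table follows the standard IUPAC nucleotide codes.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}

def expand_primer(primer: str) -> list[str]:
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Primer sequences as given in the text (5' to 3').
PRIMER_27F = "AGAGTTTGATCMTGGCTCAG"
PRIMER_1492R = "TACGGYTACCTTGTTACGACTT"

variants_27f = expand_primer(PRIMER_27F)
variants_1492r = expand_primer(PRIMER_1492R)

for name, variants in (("27F", variants_27f), ("1492R", variants_1492r)):
    print(name, variants)
```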

Genome sequencing and assembly

The Vibrio gDNA samples were used to construct an Illumina paired-end (PE) library with a 350 bp insert size, following standard protocols provided by Illumina. The PE library was then sequenced on the Illumina NovaSeq™ X Plus platform (Illumina Inc., USA) in 150 bp PE mode. A total of 16.2 Gbp of raw reads was generated from the 18 Vibrio strains, with a Q30 score of ≥85% and over 100× coverage of the whole genome. Raw reads were filtered with Trimmomatic v0.33 (ref. 24) to obtain clean reads before the final assembly. A total of 14.9 Gbp of clean reads was obtained and assembled using SPAdes v3.6.2 (ref. 25). Coding gene prediction was performed using Prodigal v2.6.3 (ref. 26).
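The filtering, assembly, and gene prediction steps can be chained per strain roughly as in the sketch below. It only builds the three command lines and does not run them. The file names, output layout, and the Trimmomatic trimming steps (`SLIDINGWINDOW:4:20`, `MINLEN:50`) are assumptions chosen for illustration, because the exact parameters are not stated in this section; `build_pipeline` is a hypothetical helper name.

```python
from pathlib import Path

def build_pipeline(sample: str, r1: str, r2: str, outdir: str = "results") -> list[list[str]]:
    """Build (but do not run) trimming, assembly and gene-prediction commands."""
    out = Path(outdir) / sample
    clean1, clean2 = out / "clean_R1.fq.gz", out / "clean_R2.fq.gz"
    unpaired1, unpaired2 = out / "unpaired_R1.fq.gz", out / "unpaired_R2.fq.gz"
    assembly_dir = out / "spades"

    # Trimmomatic paired-end mode; the two trimming steps are assumed values.
    trim = [
        "java", "-jar", "trimmomatic-0.33.jar", "PE", "-phred33",
        r1, r2, str(clean1), str(unpaired1), str(clean2), str(unpaired2),
        "SLIDINGWINDOW:4:20", "MINLEN:50",
    ]
    # SPAdes writes contigs.fasta into the directory given with -o.
    assemble = ["spades.py", "-1", str(clean1), "-2", str(clean2),
                "-o", str(assembly_dir)]
    # Prodigal: -a protein translations, -d nucleotide genes, -f output format.
    predict = [
        "prodigal", "-i", str(assembly_dir / "contigs.fasta"),
        "-a", str(out / "proteins.faa"), "-d", str(out / "genes.fna"),
        "-o", str(out / "genes.gff"), "-f", "gff",
    ]
    return [trim, assemble, predict]

# Hypothetical sample name and read files.
commands = build_pipeline("strain01", "strain01_R1.fq.gz", "strain01_R2.fq.gz")
for cmd in commands:
    print(" ".join(cmd))
```

Each command list could then be passed to `subprocess.run(cmd, check=True)` once the output directory exists and the tools are installed.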

The genomic assembly results are shown in Table 1, with an average genome size of 5.67 Mbp, an average coverage depth of 146.1×, and GC content of 45.3%. Genome completeness was assessed using BUSCO (version 5.8.2) with the bacteria_odb12 lineage, and the average match rate was 98.9% (Table 2).
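As a rough consistency check on these figures, mean sequencing depth can be estimated as total bases divided by the number of genomes and the mean genome size. The sketch below uses only the totals reported above and assumes, for simplicity, that reads were distributed evenly among the 18 strains, which the text does not state. Under that assumption the clean-read estimate comes out close to the reported average depth of 146.1×.

```python
def mean_depth(total_bases: float, n_genomes: int, mean_genome_size: float) -> float:
    """Average depth per genome, assuming reads are split evenly among genomes."""
    return total_bases / n_genomes / mean_genome_size

N_STRAINS = 18
MEAN_GENOME_SIZE = 5.67e6   # bp, average genome size from Table 1

raw_depth = mean_depth(16.2e9, N_STRAINS, MEAN_GENOME_SIZE)    # raw reads
clean_depth = mean_depth(14.9e9, N_STRAINS, MEAN_GENOME_SIZE)  # clean reads

print(f"raw: {raw_depth:.1f}x, clean: {clean_depth:.1f}x")
# prints "raw: 158.7x, clean: 146.0x"
```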

Table 1 Genomic assembly statistics of 18 Vibrio strains.
Table 2 BUSCO completeness evaluation of 18 Vibrio genomes.

Gene prediction and functional annotation

For functional annotation, the predicted proteins were searched with BLAST (e-value threshold: 1e-5) against the Gene Ontology (GO; Blast2GO was used for GO annotation), eggNOG, KEGG, Nr, Pfam, Swiss-Prot, and TrEMBL databases. The results of the functional annotation are shown in Table 3. The total gene content of the 18 Vibrio strains averaged 5,087 genes, with V. harveyi-1 possessing the largest repertoire (5,541) and V. natriegens the smallest (4,657). Functional annotation coverage (the ratio of annotated genes to the total gene number) exceeded 95% for all strains. Seventeen strains achieved >99% coverage, whereas V. jasicida exhibited the lowest coverage at 95.15%.
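The coverage figure defined above can be illustrated with a short sketch that filters BLAST tabular output (`-outfmt 6`, where column 1 is the query ID and column 11 is the e-value) and computes the share of genes with at least one accepted hit. The function names and the toy records are hypothetical, and treating the threshold as inclusive (≤ 1e-5) is an assumption.

```python
def annotated_queries(blast_tab: str, max_evalue: float = 1e-5) -> set[str]:
    """Collect query IDs that have at least one hit at or below max_evalue."""
    hits = set()
    for line in blast_tab.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:  # column 11 holds the e-value
            hits.add(fields[0])              # column 1 holds the query ID
    return hits

def annotation_coverage(annotated: set[str], total_genes: int) -> float:
    """Percentage of predicted genes with at least one annotation."""
    return 100.0 * len(annotated) / total_genes

# Toy records for illustration only; gene4 has no hit at all.
toy = "\n".join([
    "gene1\tsp|P1\t98.5\t200\t3\t0\t1\t200\t1\t200\t2e-90\t350",
    "gene2\tsp|P2\t35.0\t80\t50\t2\t1\t80\t5\t84\t0.5\t30",
    "gene3\tsp|P3\t60.0\t150\t60\t0\t1\t150\t1\t150\t1e-5\t120",
])
hits = annotated_queries(toy)
coverage = annotation_coverage(hits, total_genes=4)
print(sorted(hits), coverage)
```

In the study, coverage counts a gene as annotated if any of the listed databases returns a hit, so the per-database hit sets would be merged before computing the ratio.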

Table 3 Functional annotation statistics of 18 Vibrio genomes.

Furthermore, the species-level distribution of Nr database hits for the genomes of the 18 Vibrio strains is shown in Fig. 2. The figure uses different colors to distinguish species and labels the first two or three species in each pie. Notably, these genome-level annotation results are generally consistent with the 16S rRNA gene identification results.

Fig. 2 Species annotation of the genomes of 18 Vibrio strains. The first two or three species of the annotations against the Nr database are marked in each pie.

Data Records

Raw Illumina sequencing data, generated on the Illumina NovaSeq™ X Plus platform (150 bp paired-end reads) for 18 coral-isolated Vibrio strains, have been deposited in the NCBI Sequence Read Archive (SRA) under accession numbers SRR31721119, SRR31721118, SRR31721109, SRR31721108, SRR31721107, SRR31721106, SRR31721105, SRR31721104, SRR31721103, SRR31721102, SRR31721117, SRR31721116, SRR31721115, SRR31721114, SRR31721113, SRR31721112, SRR31721111, and SRR31721110 (refs. 27–44, in the order listed). The raw reads are provided in gzip-compressed FASTQ format.

The clean reads were assembled de novo using SPAdes v3.6.2, yielding draft genomes for the same 18 Vibrio strains. These whole genome shotgun (WGS) assemblies, provided in FASTA format, have been submitted to NCBI GenBank under accession numbers JBJYIK000000000, JBJYIL000000000, JBJYIM000000000, JBJYJJ000000000, JBJYJK000000000, JBJYJL000000000, JBJYJM000000000, JBJYJN000000000, JBJYJO000000000, JBJYJP000000000, JBJYJQ000000000, JBJYJR000000000, JBJYJS000000000, JBJYJT000000000, JBJYJU000000000, JBJYJV000000000, JBJYJW000000000, and JBJYJX000000000 (refs. 45–62, in the order listed).

All datasets are publicly available through the NCBI website: https://www.ncbi.nlm.nih.gov/. A complete list of the SRA accession numbers (for the raw reads) and GenBank accession numbers (for the assembled genomes) for each of the 18 strains is provided in Table 4.

Table 4 SRA and GenBank accession numbers of 18 Vibrio strains.

Technical Validation

Quality assessment of genome assembly

Raw sequencing data were quality-filtered using Trimmomatic, and the resulting clean reads were assembled into genomes with SPAdes v3.6.2. Illumina sequencing metrics indicated high data quality, with Q30 scores of ≥85% and a sequencing depth exceeding 100×, both well above the thresholds for reliable genome assembly. Genome completeness was further validated using BUSCO, which assesses the presence of conserved single-copy orthologs to confirm assembly integrity (Tables 1, 2).

Subsequently, gene prediction was performed on the assembled genomes using Prodigal v2.6.3, a tool optimized for de novo gene identification in newly sequenced genomes via dynamic programming algorithms. The distribution of predicted gene lengths across the 18 Vibrio strains is detailed in Fig. 3: the distribution is relatively consistent among strains, with genes shorter than 100 bp accounting for less than 5%, reflecting high gene integrity and supporting the reliability of the sequencing data.
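A length distribution of this kind can be derived from the predicted gene sequences as sketched below. The bin boundaries are hypothetical: Fig. 3 uses 11 intervals, but their exact limits are not given in the text, so ten 100 bp steps plus an open-ended last bin are assumed here. The function names and the toy FASTA records are also illustrative only.

```python
from bisect import bisect_right

# Assumed bin edges (bp): <100, 100-199, ..., 900-999, >=1000 (11 bins in total).
BIN_EDGES = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]

def fasta_lengths(fasta_text: str) -> list[int]:
    """Return the sequence length of every record in a FASTA string."""
    lengths, current = [], None
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line.strip())
    if current is not None:
        lengths.append(current)
    return lengths

def length_proportions(gene_lengths: list[int], edges: list[int] = BIN_EDGES) -> list[float]:
    """Fraction of genes falling into each length bin."""
    counts = [0] * (len(edges) + 1)
    for length in gene_lengths:
        counts[bisect_right(edges, length)] += 1
    return [count / len(gene_lengths) for count in counts]

# Toy input: four genes of 90, 150, 450 and 1200 bp.
toy_fasta = "\n".join([
    ">g1", "A" * 90,
    ">g2", "A" * 100, "A" * 50,
    ">g3", "A" * 450,
    ">g4", "A" * 1200,
])
lengths = fasta_lengths(toy_fasta)
proportions = length_proportions(lengths)
print(lengths, proportions[0])  # proportions[0] is the share of genes < 100 bp
```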

Fig. 3 Proportion of gene length in each Vibrio sp. The gene length is divided into 11 intervals and each interval is distinguished by a different color.

To address our core objective of characterizing genomic features of coral-associated Vibrio, we note that over 95% of the predicted genes were successfully annotated, underscoring the robustness of our gene prediction pipeline. These high-quality datasets, validated by rigorous quality control (Q30 ≥ 85%, BUSCO completeness ≥ 95%), complete assembly, and comprehensive functional annotation, not only fulfill the goal of providing reliable genomic profiles for these strains but also serve as a valuable resource for future investigations into their evolution, molecular mechanisms, and ecological roles in coral holobionts.

Orthologue and phylogenetic analyses

We further performed orthologue and phylogenetic analyses based on the high-quality genomic data to explore the evolutionary relationships among these coral-associated Vibrio strains. Phylogenetic relationships among the 18 Vibrio strains were inferred using the Multilocus Sequence Analysis (MLSA) approach, a robust method widely applied in bacterial systematics63,64,65,66. Nine conserved genes were selected for this analysis: 16S rRNA, gapA, gyrB, ftsZ, mreB, pyrH, recA, rpoA, and topA. These genes were chosen based on their extensive use in Vibrio phylogenetics, as their conserved nature minimizes phylogenetic artifacts that can distort evolutionary inference67,68. The accession number of each reference gene is provided in Table 5.

Table 5 Reference Vibrio strain information used in Multilocus Sequence Analysis (MLSA).

Gene sequences were aligned individually using MAFFT v7 with the --auto parameter to optimize alignment accuracy69. Aligned sequences were concatenated and curated in BioEdit v7.0.9.0 to remove ambiguously aligned regions and gaps, ensuring high-quality data for tree inference70.
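The concatenation and gap-removal step can be sketched in a few lines of Python. This is a simplified stand-in for the manual curation done in BioEdit: it joins the per-gene alignments end to end and then drops every column that contains a gap in any strain, but it does not attempt to detect ambiguously aligned regions. Function names and toy sequences are hypothetical.

```python
def concatenate(alignments: list[dict[str, str]]) -> dict[str, str]:
    """Join per-gene alignments end to end for strains present in every gene."""
    shared = set.intersection(*(set(aln) for aln in alignments))
    return {strain: "".join(aln[strain] for aln in alignments)
            for strain in sorted(shared)}

def drop_gap_columns(alignment: dict[str, str]) -> dict[str, str]:
    """Remove every alignment column in which at least one strain has a gap."""
    names = list(alignment)
    columns = zip(*(alignment[name] for name in names))
    kept = [column for column in columns if "-" not in column]
    return {name: "".join(column[i] for column in kept)
            for i, name in enumerate(names)}

# Toy alignments for two genes and two strains (illustrative only).
gene_a = {"strain1": "ATG-C", "strain2": "ATGGC"}
gene_b = {"strain1": "TTA", "strain2": "T-A"}

supermatrix = concatenate([gene_a, gene_b])
curated = drop_gap_columns(supermatrix)
print(supermatrix)
print(curated)
```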

To assess phylogenetic robustness, two complementary methods were employed using a combination of raxmlGUI v2.0.16 (ref. 71) and MEGA v7 (ref. 72): Neighbor-Joining (NJ) analysis and Maximum Likelihood (ML) analysis.

The NJ tree was inferred using MEGA v7, with genetic distances calculated via the Maximum Composite Likelihood (MCL) method73, which accounts for nucleotide substitution biases across sites. Branch reliability was validated with 1000 bootstrap replicates.
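Neighbor-joining operates on a matrix of pairwise distances between aligned sequences. The sketch below illustrates the idea with a deliberately simpler distance than the one used in this study: the uncorrected p-distance followed by the Jukes-Cantor (JC69) correction, d = −(3/4) ln(1 − 4p/3). It is not the Maximum Composite Likelihood method implemented in MEGA, and the toy sequences are hypothetical.

```python
import math

def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites, ignoring gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    return sum(a != b for a, b in pairs) / len(pairs)

def jc69_distance(p: float) -> float:
    """Jukes-Cantor corrected distance; only defined for p < 0.75."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Two toy aligned sequences differing at 2 of 10 sites.
p = p_distance("ACGTACGTAC", "ACGTACGTTT")
d = jc69_distance(p)
print(f"p = {p:.2f}, JC69 d = {d:.4f}")
```

The corrected distance is larger than the raw proportion because it accounts for multiple substitutions at the same site.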

The ML tree was constructed using raxmlGUI v2.0.16, with the GTR + G + I nucleotide substitution model (general time-reversible model incorporating gamma-distributed rate heterogeneity [G] and a proportion of invariant sites [I]74). Branch support was evaluated via 1000 bootstrap replicates.
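Bootstrap support in both analyses rests on the same resampling idea: alignment columns are drawn with replacement to build pseudo-alignments of the original length, a tree is inferred from each, and the support for a branch is the percentage of replicate trees that contain it. The sketch below shows only the resampling step; tree inference is left to MEGA or RAxML. The function name, random seed, and toy alignment are hypothetical.

```python
import random

def bootstrap_replicate(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Resample alignment columns with replacement, keeping columns intact."""
    n_columns = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_columns) for _ in range(n_columns)]
    return {name: "".join(seq[i] for i in picks)
            for name, seq in alignment.items()}

# Toy alignment (illustrative only).
toy_alignment = {
    "strain1": "ATGCTAGC",
    "strain2": "ATGTTAGC",
    "strain3": "ACGCTTGC",
}

rng = random.Random(42)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_replicate(toy_alignment, rng) for _ in range(1000)]
print(len(replicates), replicates[0])
```

Because whole columns are sampled, each site keeps its pattern across strains in every replicate.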

The NJ and ML trees (Fig. 4A,B) consistently resolved distinct species-specific clades, with Photobacterium phosphoreum strain LMG4233 serving as the outgroup to root the trees.

Fig. 4 Phylogenetic trees of 18 Vibrio strains. (A) Neighbor-joining (NJ) tree; (B) Maximum likelihood (ML) tree. The gapA, gyrB, ftsZ, mreB, pyrH, recA, rpoA, topA and 16S rRNA gene sequences from the 18 Vibrio strains were concatenated and used for tree reconstruction by Multilocus Sequence Analysis (MLSA). Photobacterium phosphoreum (strain LMG4233) was used as the outgroup. Both the NJ tree and the ML tree were constructed with 1000 bootstrap replicates. Bootstrap support values lower than 50 are not shown.

In the NJ tree, V. harveyi strains clustered into a monophyletic clade with bootstrap support ranging from 30 to 93, and grouped closely with the type strain V. harveyi LMG4044 (100 bootstrap support). V. campbellii and its type strain LMG11216 formed a clade with weak support (39 bootstrap). V. owensii strains clustered with their type strain LMG25443 with 100 bootstrap support. V. natriegens formed a distinct clade relative to other species, positioned closer to them than V. jasicida, with a relatively low bootstrap support of 48. V. rotiferianus strains grouped with type strain LMG21460 (78–100 bootstrap), and V. jasicida clustered with its type strain LMG25398 (100 bootstrap support).

The ML tree showed topological patterns consistent with the NJ tree: V. harveyi strains formed a cohesive clade; V. campbellii and V. owensii each clustered with their respective type strains; and V. rotiferianus and V. jasicida exhibited strong monophyly (with bootstrap support up to 100 for key nodes). Similarly, V. natriegens formed a distinct clade relative to other species in the ML tree.

In conclusion, the congruent topologies of the ML and NJ trees confirmed the robustness of phylogenetic relationships among these coral-associated Vibrio strains. The well-resolved species-specific clades, supported by high bootstrap values, provide a solid framework for subsequent investigations into their evolutionary trajectories, functional specializations, and ecological niches within coral ecosystems. Notably, this phylogenetic resolution, underpinned by high-quality genomic data, reinforces the reliability of our taxonomic and evolutionary inferences for these coral-isolated strains.

Table 6 Database, software, and parameters used in this study.