Abstract
A chromosome-level genome assembly of Bohadschia ocellata, a member of the family Holothuriidae, was constructed by integrating MGI DNBSEQ-T7 short-read sequencing, PacBio HiFi long-read sequencing, and Hi-C scaffolding. After redundant sequences were removed, the assembly was anchored to 23 chromosomes, with a total size of 909.18 Mb. The contig and scaffold N50 values were 12.00 Mb and 38.97 Mb, respectively, indicating a highly contiguous assembly. According to Merqury and BUSCO evaluations, the assembly reached a QV of 64.44 and a completeness of 94.40%. From this assembly, 31,277 protein-coding genes were identified, and the predicted proteome was 98.10% complete based on BUSCO assessment. Functional annotations were obtained from at least one database for more than 99% of these genes. This high-quality B. ocellata genome assembly provides a valuable resource for further genetic and evolutionary studies of this sea cucumber species.
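The two headline metrics above can be made concrete with a short sketch. N50 is the sequence length at which the sequences of that length or longer account for at least half of the total assembly, and a Phred-scaled QV relates to a per-base error probability E through QV = -10 log10(E). The function names below are illustrative and are not taken from the authors' pipeline; the example values other than the QV of 64.44 are invented for demonstration.

```python
import math


def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Lengths are sorted from longest to shortest and accumulated until
    the running sum reaches at least half of the total length.
    """
    lengths = sorted(lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length
    return lengths[-1]  # defensive fallback for degenerate input (e.g. all-zero lengths)


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value into a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate):
    """Convert a per-base error probability into a Phred-scaled quality value."""
    return -10 * math.log10(error_rate)


if __name__ == "__main__":
    # Toy lengths (hypothetical), not the B. ocellata contigs.
    print(n50([2, 3, 4, 5, 6]))
    # Error probability implied by the reported QV of 64.44.
    print(qv_to_error_rate(64.44))
```

Under this relation, a QV of 64.44 corresponds to roughly 3.6 errors per 10 million bases.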
Data availability
All sequencing and assembly data generated in this study have been deposited in public repositories. Raw sequencing data, including BGI short reads, PacBio HiFi long reads, Hi-C reads and RNA-seq data, are available in the NCBI Sequence Read Archive under accession number SRP586938⁵⁴ (https://identifiers.org/ncbi/insdc.sra:SRP586938). The whole genome shotgun project has been deposited in DDBJ/ENA/GenBank under accession number JBOCEH000000000⁵⁵ (https://identifiers.org/ncbi/insdc:JBOCEH000000000). Additionally, the genome assembly, gene annotation and functional annotation are available in the Figshare repository⁶⁵ (https://doi.org/10.6084/m9.figshare.29124434.v1) and on Baiduyun⁶⁶ (https://pan.baidu.com/s/10DBoB_GQQhThYBiloGInsA?pwd=wmhs).
Code availability
All commands and workflows used for data processing were executed in accordance with the respective software manuals and protocols, with the relevant settings and parameters detailed below:
SOAPnuke (v2.1.4): Employed to filter out low-quality reads from MGI raw sequencing data using the software’s default configurations.
SMRT Link (v13.1): Employed to process and filter PacBio raw sequencing data using default configurations.
Jellyfish (v2.3.0): Utilized to count 21-mers for estimating genome size and heterozygosity.
GenomeScope (v2.0.0): Utilized to process the k-mer frequency histogram for estimating genome size, heterozygosity, and repeat content using default configurations.
Hifiasm (v0.19.8-r603): Utilized to assemble the PacBio HiFi data after read comparison and self-correction using built-in configurations.
Bwa (v0.7.17-r1188): Utilized to map the MGI short read data onto the draft assembly using built-in configurations.
Pilon (v1.23): Utilized to correct residual errors based on the Bwa alignment results using built-in configurations.
Purge_dups (v1.2.5): Utilized to reduce redundant haplotigs and determine heterozygosity for the draft genome under a configuration of -j 80 -s 80.
ALLHiC (v1.1): Utilized to assign and orient scaffolds using Hi-C reads into chromosome-level assemblies.
Merqury (v1.3): Utilized to assess k-mer coverage and QV of the assembled genome using a best-fit k-mer size of 19.
BUSCO (v5.7.1): Utilized to estimate genome completeness using the metazoa_odb10 dataset.
Circos (v0.69): Utilized to display chromosomal structure and visualize the distribution of gene regions, repeat sequences, SNP percentage and NGS sequencing depth.
RepeatMasker (v4.09): Utilized to annotate transposable elements using built-in configurations.
EDTA: Utilized to annotate transposable elements de novo using built-in configurations.
Barrnap (v0.9): Utilized to identify ribosomal RNAs (rRNAs) using built-in configurations.
tRNAscan-SE (v2.0.11): Utilized to search for transfer RNAs (tRNAs) using built-in configurations.
Infernal (v1.1.4): Utilized to identify microRNAs (miRNAs) and small nuclear RNAs (snRNAs) using built-in configurations.
Braker (v3.0.8): Utilized to integrate gene prediction results with 9 selected proteomes and RNAseq reads from tissues with parameters set to gff3, threads 48, prot_seq = pep.fasta, bam = bams and UTR = on.
HISAT2 (v2.2.1): Utilized to map transcriptomic data for genome annotation using built-in configurations.
StringTie (v2.1.7): Utilized to assemble the transcripts for the prediction of gene structures using built-in configurations.
MAKER3 (v3.01.03): Utilized to combine outputs from various prediction modes into the final gene collection using built-in configurations.
BLAST (v2.11.0+): Employed for synteny analysis and functional annotation of predicted genes using the BLASTP module with an E-value threshold of 1e-5.
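The k-mer counting step listed above (Jellyfish, 21-mers) produces the frequency histogram that GenomeScope then models. The following is a minimal pure-Python sketch of that idea, not the Jellyfish implementation. It assumes canonical counting, in which a k-mer and its reverse complement are treated as the same k-mer; whether the authors enabled this is not stated. All function names are hypothetical.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_VALID_BASES = set("ACGT")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k=21):
    """Count canonical k-mers across reads (assumption: canonical mode).

    Windows containing any base other than A, C, G or T are skipped.
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for start in range(len(seq) - k + 1):
            kmer = seq[start:start + k]
            if set(kmer) <= _VALID_BASES:
                counts[canonical(kmer)] += 1
    return counts


def kmer_histogram(counts):
    """Map each multiplicity to the number of distinct k-mers seen that many times."""
    return dict(sorted(Counter(counts.values()).items()))


if __name__ == "__main__":
    # Toy reads (hypothetical) with k=3 so the result is easy to inspect.
    toy_counts = count_kmers(["AAAT", "ATTT"], k=3)
    print(toy_counts)
    print(kmer_histogram(toy_counts))
```

In a real k-mer spectrum, the position of the main peak reflects sequencing depth, and profiling tools such as GenomeScope fit a model to this histogram to estimate genome size and heterozygosity.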
References
Mercier, A., Gebruk, A., Kremenetskaia, A. & Hamel, J-F. in The World of Sea Cucumbers (ed. Mercier, A., Hamel, J-F., Suhrbier, A. D. & Pearce, C. M.) Ch. 1, https://doi.org/10.1016/B978-0-323-95377-1.00001-1 (London Academic Press, 2023).
Mercier, A. et al. Revered and Reviled: The Plight of the Vanishing Sea Cucumbers. Annu. Rev. Mar. Sci. 17, 115–142, https://doi.org/10.1146/annurev-marine-032123-025441 (2025).
Miller, A. K. et al. Molecular phylogeny of extant Holothuroidea (Echinodermata). Mol. Phylogenet. Evol. 111, 110–131, https://doi.org/10.1016/j.ympev.2017.02.014 (2017).
Pearce, C. M., William Gartrell, J., King, X. K. & Zaklan Duff, S. D. in The World of Sea Cucumbers (ed. Mercier, A., Hamel, J-F., Suhrbier, A. D. & Pearce, C. M.) Ch. 2, https://doi.org/10.1016/B978-0-323-95377-1.00014-X (London Academic Press, 2023).
Purcell, S. W. et al. Commercially important sea cucumbers of the world 2nd edn, https://doi.org/10.4060/cc5230en (FAO, 2023).
Gamboa, R. U., Halun, S. Z. B. & Vularika, A. S. in The World of Sea Cucumbers (ed. Mercier, A., Hamel, J-F., Suhrbier, A. D. & Pearce, C. M.) Ch. 9, https://doi.org/10.1016/B978-0-323-95377-1.00021-7 (London Academic Press, 2023).
Conand, C. Tropical sea cucumber fisheries: Changes during the last decade. Mar. Pollut. Bull. 133, 590–594, https://doi.org/10.1016/j.marpolbul.2018.05.014 (2018).
Slater, M. in The World of Sea Cucumbers (ed. Mercier, A., Hamel, J-F., Suhrbier, A. D. & Pearce, C. M.) Ch. 41, https://doi.org/10.1016/B978-0-323-95377-1.00022-9 (London Academic Press, 2023).
Wolfe, K. in The World of Sea Cucumbers (ed. Mercier, A., Hamel, J-F., Suhrbier, A. D. & Pearce, C. M.) Ch. 28, https://doi.org/10.1016/B978-0-323-95377-1.00028-X (London Academic Press, 2023).
Phelps Bondaroff, T. N. & Morrow, F. in The World of Sea Cucumbers (ed. Mercier, A., Hamel, J-F., Suhrbier, A. D. & Pearce, C. M.) Ch. 13, https://doi.org/10.1016/B978-0-323-95377-1.00009-6 (London Academic Press, 2023).
Hamel, J. F. et al. Global knowledge on the commercial sea cucumber Holothuria scabra. Adv. Mar. Biol. 91, 1–286, https://doi.org/10.1016/bs.amb.2022.04.001 (2022).
Yang, Y. et al. Pipeline for identification of genome-wide microsatellite markers and its application in assessing the genetic diversity and structure of the tropical sea cucumber Holothuria leucospilota. Aquaculture Reports. 37, 102207, https://doi.org/10.1016/j.aqrep.2024.102207 (2024).
Nocillado, J. et al. Spawning induction of the high-value white teatfish sea cucumber, Holothuria fuscogilva, using recombinant relaxin-like gonad stimulating peptide (RGP). Aquaculture. 547, 737422, https://doi.org/10.1016/j.aquaculture.2021.737422 (2022).
Osathanunkul, M. & Suwannapoom, C. Sustainable fisheries management through reliable restocking and stock enhancement evaluation with environmental DNA. Sci. Rep. 13, 11297, https://doi.org/10.1038/s41598-023-38218-2 (2023).
Javanmardi, S., Rezaei Tavabe, K., Moradi, S. & Abed-Elmdoust, A. The effects of dietary levels of the sea cucumber (Bohadschia ocellata Jaeger, 1833) meal on growth performance, blood biochemical parameters, digestive enzymes activity and body composition of Pacific white shrimp (Penaeus vannamei Boone, 1931) juveniles. Iranian Journal of Fisheries Sciences. 19, 2366–2383, https://doi.org/10.22092/ijfs.2020.122330 (2020).
Kim, S. W., Kerr, A. M. & Paulay, G. Colour, confusion, and crossing: resolution of species problems in Bohadschia (Echinodermata: Holothuroidea). Zoological Journal of the Linnean Society. 168, 81–97, https://doi.org/10.1111/zoj.12026 (2013).
Thinh, P. D. et al. Fucosylated Chondroitin Sulfate from Bohadschia ocellata: Structure Analysis and Bioactivities. Processes. 12, 2108, https://doi.org/10.3390/pr12102108 (2024).
Samyn, Y. & Vandenspiegel, D. Sublittoral and bathyal sea cucumbers (Echinodermata: Holothuroidea) from the Northern Mozambique Channel with description of six new species. Zootaxa. 4196, 451–497, https://doi.org/10.11646/zootaxa.4196.4.1 (2016).
Liao, Y. & Clark, A. M. The Echinoderms of Southern China (Science Press, Beijing & New York, 1995).
Amin, A. & Thalib, B. Marine of dentistry: pemanfaatan stichopus hermanii dalam bidang kedokteran gigi (Nas Media Pustaka Press, Indonesia, 2024).
Cheng, H. et al. Taxonomic status and phylogenetic analyses based on complete mitochondrial genome and microscopic ossicles: Redescription of a controversial tropical sea cucumber species (Holothuroidea, Holothuria Linnaeus, 1767). Zoosyst. Evol. 101, 791–804, https://doi.org/10.3897/zse.101.137781 (2025).
Patantis, G., Dewi, A. S., Fawzya, Y. N. & Nursid, M. Identification of Beche-de-mers from Indonesia by molecular approach. Biodiversitas. 20, 537–543, https://doi.org/10.13057/BIODIV/D200233 (2019).
Sun, L., Jiang, C., Su, F., Cui, W. & Yang, H. Chromosome-level genome assembly of the sea cucumber Apostichopus japonicus. Sci. Data. 10, 454, https://doi.org/10.1038/s41597-023-02368-9 (2023).
Chen, T. et al. The Holothuria leucospilota genome elucidates sacrificial organ expulsion and bioadhesive trap enriched with amyloid-patterned proteins. PNAS. 120, e2213512120, https://doi.org/10.1073/pnas.2213512120 (2023).
Zhong, S. et al. Chromosomal-level genome assembly and annotation of the tropical sea cucumber Holothuria scabra. Sci. Data. 11, 474, https://doi.org/10.1038/s41597-024-03340-x (2024).
Chen, T. et al. Chromosome-level genome assembly and annotation of the tropical sea cucumber Stichopus monotuberculatus. Sci. Data. 11, 1245, https://doi.org/10.1038/s41597-024-03985-8 (2024).
Zhang, L. et al. The genome of an apodid holothuroid (Chiridota heheva) provides insights into its adaptation to a deep-sea reducing environment. Commun. Biol. 5, 224, https://doi.org/10.1038/s42003-022-03176-4 (2022).
Ma, B. et al. Analysis of Complete Mitochondrial Genome of Bohadschia argus (Jaeger, 1833) (Aspidochirotida, Holothuriidae). Animals. 12, 1437, https://doi.org/10.3390/ani12111437 (2022).
Chen, Y. et al. SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data. Gigascience. 7, 1–6, https://doi.org/10.1093/gigascience/gix120 (2017).
Wenger, A. M. et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat. Biotechnol. 37, 1155–1162, https://doi.org/10.1038/s41587-019-0217-9 (2019).
Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 27, 764–770, https://doi.org/10.1093/bioinformatics/btr011 (2011).
Vurture, G. W. et al. GenomeScope: fast reference-free genome profiling from short reads. Bioinformatics. 33, 2202–2204, https://doi.org/10.1093/bioinformatics/btx153 (2017).
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods. 18, 170–175, https://doi.org/10.1038/s41592-020-01056-5 (2021).
Guan, D. et al. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics. 36, 2896–2898, https://doi.org/10.1093/bioinformatics/btaa025 (2020).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 25, 1754–1760, https://doi.org/10.1093/bioinformatics/btp324 (2009).
Walker, B. J. et al. Pilon: An Integrated Tool for Comprehensive Microbial Variant Detection and Genome Assembly Improvement. PLoS ONE. 9, e112963, https://doi.org/10.1371/journal.pone.0112963 (2014).
Krzywinski, M. et al. Circos: an information aesthetic for comparative genomics. Genome research. 19, 1639–1645, https://doi.org/10.1101/gr.092759.109 (2009).
Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. PNAS. 117, 9451–9457, https://doi.org/10.1073/pnas.1921046117 (2020).
Ou, S. et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 20, 275, https://doi.org/10.1186/s13059-019-1905-y (2019).
Aylward, F. O. Introduction to Prokaryotic gene prediction (CDS and rRNA) V. 2. BMC Bioinformatics. 11, 1, https://doi.org/10.17504/protocols.io.pjrdkm6 (2010).
Lowe, T. M. & Eddy, S. R. tRNAscan-SE: A Program for Improved Detection of Transfer RNA Genes in Genomic Sequence. Nucleic Acids Research. 25, 955–964, https://doi.org/10.1093/nar/25.5.955 (1997).
Kalvari, I. et al. Rfam 13.0: shifting to a genome-centric resource for non-coding RNA families. Nucleic Acids Research. 46, D335–D342, https://doi.org/10.1093/nar/gkx1038 (2018).
Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics. 29, 2933–2935, https://doi.org/10.1093/bioinformatics/btt509 (2013).
Cantarel, B. L. et al. MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes. Genome research. 18, 188–196, https://doi.org/10.1101/gr.6743907 (2008).
Kim, D. et al. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915, https://doi.org/10.1038/s41587-019-0201-4 (2019).
Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295, https://doi.org/10.1038/nbt.3122 (2015).
Brůna, T., Hoff, K. J., Lomsadze, A., Stanke, M. & Borodovsky, M. BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. NAR Genomics and Bioinformatics. 3, lqaa108, https://doi.org/10.1093/nargab/lqaa108 (2021).
Mitchell, A. L. et al. InterPro in 2019: improving coverage, classification and access to protein sequence annotations. Nucleic Acids Research. 47, D351–D360, https://doi.org/10.1093/nar/gky1100 (2019).
The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Research. 45, D158–D169, https://doi.org/10.1093/nar/gkw1099 (2017).
Boutet, E. et al. UniProtKB/Swiss-Prot, the Manually Annotated Section of the UniProt KnowledgeBase: How to Use the Entry View. Methods in Molecular Biology. 1374, 23–54, https://doi.org/10.1007/978-1-4939-3167-5_2 (2016).
Kanehisa, M., Sato, Y. & Morishima, K. BlastKOALA and GhostKOALA: KEGG Tools for Functional Characterization of Genome and Metagenome Sequences. Journal of molecular biology. 428, 726–731, https://doi.org/10.1016/j.jmb.2015.11.006 (2016).
Camacho, C. et al. BLAST plus: architecture and applications. BMC Bioinformatics. 10, 421, https://doi.org/10.1186/1471-2105-10-421 (2009).
The Gene Ontology Consortium. The Gene Ontology Resource: 20 years and still GOing strong. Nucleic Acids Research. 47, D330–D338, https://doi.org/10.1093/nar/gky1055 (2019).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP586938 (2025).
Chen, T. Holothuria ocellata isolate DDF-2025, whole genome shotgun sequencing project. GenBank https://identifiers.org/ncbi/insdc:JBOCEH000000000 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR33662152 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR33662151 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR35940911 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR33662153 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR33662154 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR33662155 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR33662156 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR33662157 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR33662158 (2025).
Fan, D. Genome sequence of sea cucumber Bohadschia ocellata (Holothuria ocellata). Figshare https://doi.org/10.6084/m9.figshare.29124434.v1 (2025).
Fan, D. Genome sequence of sea cucumber Bohadschia ocellata (Holothuria ocellata). Baiduyun https://pan.baidu.com/s/10DBoB_GQQhThYBiloGInsA?pwd=wmhs (2025).
Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 245, https://doi.org/10.1186/s13059-020-02134-9 (2020).
Acknowledgements
This study was graciously supported by grants from the National Natural Science Foundation of China (W2512089 to A.Y., and 42176132 and 32573487 to T.C.), the Research on breeding technology of candidate species for Guangdong modern marine ranching (2024-MRB-00-001 to T.C.), and the Guangdong Province Project (2024A1515011418 to T.C.).
Author information
Authors and Affiliations
Contributions
Chunhua Ren, Chaoqun Hu, Ting Chen, and Aifen Yan planned and conceptualized the research. Qianying Huang, Xuan Wang, Zhou Qin, Hua Ge, Yingxin Lin, Junyan Wang, Yun Yang, Da Huo, Xiaoli Zhang and Xiangxing Zhu acquired and processed the samples. Zhou Qin and Dingding Fan constructed the genome and performed annotations. Qianying Huang, Xuan Wang, Zhou Qin and Dingding Fan analysed gene functions. Qianying Huang, Xuan Wang, Zhou Qin, Dingding Fan and Ting Chen conducted bioinformatic analyses. Zhenyu Xie, Chang Chen, Haipeng Qin, Dongsheng Tang, Chunhua Ren, Chaoqun Hu, Aifen Yan, and Ting Chen offered experimental materials and computational resources. Qianying Huang, Xuan Wang, Dingding Fan, Aifen Yan and Ting Chen composed the manuscript. Ting Chen and Aifen Yan carried out revisions. All authors have reviewed and consented to the final version of the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Huang, Q., Wang, X., Qin, Z. et al. Chromosome-level genome assembly and annotation of the tropical sea cucumber Bohadschia ocellata. Sci Data (2025). https://doi.org/10.1038/s41597-025-06453-z
DOI: https://doi.org/10.1038/s41597-025-06453-z


