Abstract
Mytella strigata, a bivalve mollusk native to the Atlantic coast of South America, has recently become a globally significant marine invasive species, posing serious threats to native ecosystems and aquaculture operations. Here, we report a haplotype-resolved, chromosome-level genome assembly of M. strigata (2n = 30), generated using high-fidelity (HiFi) long-read sequencing and high-throughput chromosome conformation capture (Hi-C). Two haplotypes were independently assembled: haplotype 1 (Hap1) spans 692.37 Mb with a contig N50 of 6.93 Mb, and haplotype 2 (Hap2) spans 683.91 Mb with a contig N50 of 7.61 Mb. Both assemblies were anchored to 15 chromosomes, with anchoring rates of 93.84% (Hap1) and 97.08% (Hap2). Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis indicated high completeness, recovering 92.33% and 93.22% of expected single-copy orthologs in Hap1 and Hap2, respectively. We annotated 27,887 protein-coding genes and characterized their functions. This high-quality genomic resource provides a foundation for investigating the genetic mechanisms underlying invasiveness and environmental adaptability in M. strigata.
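For readers less familiar with the assembly metrics quoted above, the following is a minimal sketch of how a contig N50 and an anchoring rate are conventionally computed. It is not the authors' pipeline: the function names `n50` and `anchoring_rate` are hypothetical, and the contig lengths are toy values chosen for illustration, not data from this study.

```python
def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly size."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def anchoring_rate(anchored_bp, total_bp):
    """Return the percentage of assembled bases placed on chromosomes."""
    if total_bp <= 0:
        raise ValueError("total_bp must be positive")
    return 100.0 * anchored_bp / total_bp


# Toy contig lengths (arbitrary units), NOT values from this assembly.
toy_contigs = [8, 7, 5, 3, 2]
toy_n50 = n50(toy_contigs)            # total 25, half 12.5; 8 + 7 = 15 >= 12.5
toy_rate = anchoring_rate(20, 25)     # 20 of 25 units anchored
print(toy_n50, toy_rate)
```

With these toy values the two largest contigs already cover more than half of the total, so the N50 equals the length of the second-largest contig.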
Data availability
The data presented in this manuscript have not been previously published. The raw PacBio HiFi and Hi-C sequencing reads generated in this study have been deposited in the NCBI SRA under accession number SRP631514. The haplotype-resolved genome assemblies of M. strigata are available in ENA under accession numbers GCA_979236015.1 (Hap2) and GCA_979236015.3 (Hap1).
Code availability
Data analysis was carried out using established pipelines and tools, in accordance with official documentation. Specific software versions and parameters are listed in the Methods section.
References
Wang, Z., Nong, D., Countryman, A. M., Corbett, J. J. & Warziniack, T. Potential impacts of ballast water regulations on international trade, shipping patterns, and the global economy: An integrated transportation and economic modeling assessment. Journal of Environmental Management 275, 110892, https://doi.org/10.1016/j.jenvman.2020.110892 (2020).
Liu, D., Rong, H. & Guedes Soares, C. Shipping route modelling of AIS maritime traffic data at the approach to ports. Ocean Engineering 289, 115868, https://doi.org/10.1016/j.oceaneng.2023.115868 (2023).
Sanpanich, K. & Wells, F. E. Mytella strigata (Hanley, 1843) emerging as an invasive marine threat in Southeast Asia. BioInvasions Records 8, 343–356, https://doi.org/10.3391/bir.2019.8.2.16 (2019).
Lim, J. Y. et al. Mytella strigata (Bivalvia: Mytilidae): an alien mussel recently introduced to Singapore and spreading rapidly. Molluscan Research 38, 170–186, https://doi.org/10.1080/13235818.2018.1423858 (2018).
Ma, P.-Z. et al. First confirmed occurrence of the invasive mussel Mytella strigata (Hanley, 1843) in Guangdong and Hainan, China, and its rapid spread in Indo-West Pacific regions. BioInvasions Records 11, https://doi.org/10.3391/bir.2022.11.4.13 (2022).
Boudreaux, M. L. & Walters, L. J. Mytella charruana (Bivalvia: Mytilidae): a new, invasive bivalve in Mosquito Lagoon, Florida. Nautilus 120, https://stars.library.ucf.edu/scopus2000/8375 (2006).
Jayachandran, P. R. et al. First record of the alien invasive biofouling mussel Mytella strigata (Hanley, 1843) (Mollusca: Mytilidae) from Indian waters. BioInvasions Records 8, https://doi.org/10.3391/bir.2019.8.4.11 (2019).
Rice, M. A., Rawson, P. D., Salinas, A. D. & Rosario, W. R. Identification and Salinity Tolerance of the Western Hemisphere Mussel Mytella charruana (D’Orbigny, 1842) in the Philippines. Journal of Shellfish Research 35, 865–873, https://doi.org/10.2983/035.035.0415 (2016).
Vallejo, B. Jr et al. First record of the Charru mussel Mytella charruana d’Orbignyi, 1846 (Bivalvia: Mytilidae) from Manila Bay, Luzon, Philippines. BioInvasions Records 6, https://doi.org/10.3391/bir.2017.6.1.08 (2017).
Huang, Y.-C. et al. First record of the invasive biofouling mussel Mytella strigata (Hanley, 1843) (Bivalvia: Mytilidae) from clam ponds in Taiwan. BioInvasions Records 10, https://doi.org/10.3391/bir.2021.10.2.08 (2021).
Joyce, P., Lee, S. & Falkenberg, L. First record of the alien invasive mussel Mytella strigata (Hanley, 1843) in Hong Kong. BioInvasions Records 12, 385–391, https://doi.org/10.3391/bir.2023.12.2.03 (2023).
Zheng, Z. et al. The first high-quality chromosome-level genome of the Sipuncula Sipunculus nudus using HiFi and Hi-C data. Sci Data 10, 317, https://doi.org/10.1038/s41597-023-02235-7 (2023).
Marcais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27(6), 764–770, https://doi.org/10.1093/bioinformatics/btr011 (2011).
Vurture, G. W. et al. GenomeScope: fast reference-free genome profiling from short reads. Bioinformatics 33, 2202–2204, https://doi.org/10.1093/bioinformatics/btx153 (2017).
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175, https://doi.org/10.1038/s41592-020-01056-5 (2021).
Guan, D. et al. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics 36(9), 2896–2898, https://academic.oup.com/bioinformatics/article/36/9/2896/5714742 (2020).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100, https://doi.org/10.1093/bioinformatics/bty191 (2018).
Durand, N. C. et al. Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C Experiments. Cell Syst. 3, 95–98 (2016).
Durand, N. C. et al. Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom. Cell Syst. 3(1), 99–101, https://www.sciencedirect.com/science/article/pii/S240547121500054X (2016).
Krzywinski, M. et al. Circos: An information aesthetic for comparative genomics. Genome Res. 19(9), 1639–1645, https://pubmed.ncbi.nlm.nih.gov/19541911/ (2009).
Benson, G. Tandem Repeats Finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580, https://academic.oup.com/nar/article/27/2/573/1061099 (1999).
Tempel, S. Using and understanding RepeatMasker. Methods Mol. Biol. 859, 29–51, https://link.springer.com/protocol/10.1007/978-1-61779-603-6_2 (2012).
Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families in large genomes. Bioinformatics 21(Suppl_1), i351–i358, https://pubmed.ncbi.nlm.nih.gov/15961478/ (2005).
Xu, Z. & Wang, H. LTR_Finder: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268, https://academic.oup.com/nar/article/35/suppl_2/W265/2920813 (2007).
MolluscDB. Genomic data for multiple Mytilidae and related molluscan species (Bathymodiolus platifrons, Mytilisepta virgata, Mytilus chilensis, Mytilus coruscus, Mytilus galloprovincialis, etc.). MolluscDB (Qingdao National Laboratory for Marine Science and Technology). Available at: http://mgbase.qnlm.ac (accessed 2025-11-05).
Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439, https://pubmed.ncbi.nlm.nih.gov/16845043/ (2006).
Holt, C. & Yandell, M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12, 491, https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-12-491 (2011).
Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60, https://www.nature.com/articles/nmeth.3176 (2015).
Zdobnov, E. M. & Apweiler, R. InterProScan–an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17, 847–848, https://pubmed.ncbi.nlm.nih.gov/11590104/ (2001).
Johnson, L. S., Eddy, S. R. & Portugaly, E. Hidden Markov model speed heuristic and iterative HMM search procedure. BMC Bioinformatics 11, 1–8, https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-11-431 (2010).
Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212, https://academic.oup.com/bioinformatics/article/31/19/3210/211866 (2015).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP631514 (2025).
European Nucleotide Archive https://identifiers.org/insdc.gca:GCA_979236015.1 (2026).
European Nucleotide Archive https://identifiers.org/insdc.gca:GCA_979236015.3 (2026).
Li, R. et al. The whole-genome sequencing and hybrid assembly of Mytilus coruscus. Frontiers in Genetics 11, 440, https://doi.org/10.3389/fgene.2020.00440 (2020).
Acknowledgements
This work was supported by the project Research on Industrial Innovation Technology for Guangdong Modern Marine Ranching (Grant no. 2024-MRI-001-03), the Shellfish & Algae Industry Innovation Team of the Guangdong Modern Agricultural Technology System (Grant no. 2024CXTD23), the Guangdong Basic and Applied Basic Research Foundation (Grant nos. 2024A1515011617 and 2023A1515030048), and Guangdong Ocean University scientific research project funding (Grant no. 060302022305).
Author information
Contributions
Zheng Z., Wang Q.H. and Deng Y.W. designed the study; Zhang J.W. and Li S.Y. performed genome sequencing, data processing, and genome analysis; Wang Y.W. and Zhong S.J. performed the assembly quality validation and improved gene annotation; Liao Y.S. and Yang C.Y. collected and prepared the samples; Zhang J.W. and Li S.Y. wrote the paper. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhang, J., Li, S., Wang, Y. et al. A haplotype-resolved genome of Mytella strigata, a globally invasive marine bivalve. Sci Data (2026). https://doi.org/10.1038/s41597-026-07174-7