A haplotype-resolved genome of Mytella strigata, a globally invasive marine bivalve
  • Data Descriptor
  • Open access
  • Published: 06 April 2026

  • Jiawei Zhang1,2,3 na1,
  • Siyao Li1 na1,
  • Yiwei Wang1,
  • Shijie Zhong1,
  • Chuangye Yang1,4,5,6,
  • Yongshan Liao1,4,5,6,
  • Deng Yuewen1,4,5,6,
  • Qingheng Wang (ORCID: orcid.org/0000-0001-9148-6613)1,4,5,6 &
  • Zhe Zheng1,4,5,6 

Scientific Data (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Genome
  • Phylogenetics

Abstract

Mytella strigata, a bivalve mollusk native to the Atlantic coast of South America, has recently become a globally significant marine invasive species, posing serious threats to native ecosystems and aquaculture operations. Here, we report a haplotype-resolved, chromosome-level genome assembly of M. strigata (2n = 30), generated using high-fidelity (HiFi) long-read sequencing and high-throughput chromosome conformation capture (Hi-C). Two haplotypes were independently assembled: haplotype 1 (Hap1) spans 692.37 Mb with a contig N50 of 6.93 Mb, and haplotype 2 (Hap2) spans 683.91 Mb with a contig N50 of 7.61 Mb. Both assemblies were anchored to 15 chromosomes, achieving anchoring rates of 93.84% (Hap1) and 97.08% (Hap2). Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis revealed high completeness, identifying 92.33% and 93.22% of expected single-copy orthologs in Hap1 and Hap2, respectively. We annotated 27,887 protein-coding genes and conducted analyses of gene functions. This high-quality genomic resource provides a foundation for investigating the genetic mechanisms underlying invasiveness and environmental adaptability in M. strigata.
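
The contig N50 and anchoring-rate figures quoted above follow standard definitions. As a minimal sketch, assuming the usual conventions (N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly; anchoring rate is anchored bases divided by total assembly bases), the arithmetic can be written as below. The function names and the toy contig lengths are hypothetical and are not taken from the authors' pipeline. The anchored lengths derived at the end are back-calculated from the rounded values in the abstract; they are estimates, not reported figures.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of sequence lengths.

    Sequences are sorted from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("at least one positive sequence length is required")
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: running total always reaches the total")


def anchoring_rate(anchored_bp: float, assembly_bp: float) -> float:
    """Percentage of the assembly placed on chromosomes."""
    return 100.0 * anchored_bp / assembly_bp


def anchored_mb(assembly_mb: float, rate_percent: float) -> float:
    """Anchored length implied by an assembly size and an anchoring rate."""
    return assembly_mb * rate_percent / 100.0


if __name__ == "__main__":
    # Hypothetical toy contig lengths, chosen only to illustrate the definition.
    toy_contigs = [8, 7, 5, 4, 3, 2, 1]
    print("toy N50:", n50(toy_contigs))

    # Assembly sizes and anchoring rates as stated in the abstract.
    # The implied anchored lengths are approximations from rounded inputs.
    for name, size_mb, rate in [("Hap1", 692.37, 93.84), ("Hap2", 683.91, 97.08)]:
        print(name, "implied anchored length (Mb):", round(anchored_mb(size_mb, rate), 2))
```

In practice the contig lengths would be read from the assembly FASTA or its index rather than typed in, and assembly statistics tools may differ slightly in how they handle ties or minimum-length filters.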

Data availability

The data presented in this manuscript have not been previously published. The raw PacBio HiFi and Hi-C sequencing reads generated in this study have been deposited in the NCBI SRA under accession number SRP631514. The haplotype-resolved genome assemblies of M. strigata are available in ENA under accession numbers GCA_979236015.1 (Hap2) and GCA_979236015.3 (Hap1).

Code availability

Data analysis was carried out using established pipelines and tools, in accordance with official documentation. Specific software versions and parameters are listed in the Methods section.

Acknowledgements

This work was supported by the Research on industrial innovation technology for Guangdong modern marine ranching (Grant no. 2024-MRI-001-03), the Shellfish & Algae Industry Innovation Team of Guangdong Modern Agricultural Technology System (Grant no. 2024CXTD23), the Guangdong Basic and Applied Basic Research Foundation (Grant nos. 2024A1515011617 and 2023A1515030048), and Guangdong Ocean University scientific research project funding (Grant no. 060302022305).

Author information

Author notes
  1. These authors contributed equally: Jiawei Zhang, Siyao Li.

Authors and Affiliations

  1. Fisheries College, Guangdong Ocean University, Zhanjiang, 524088, China

    Jiawei Zhang, Siyao Li, Yiwei Wang, Shijie Zhong, Chuangye Yang, Yongshan Liao, Deng Yuewen, Qingheng Wang & Zhe Zheng

  2. Guangdong Provincial Key Laboratory of Aquatic Animal Disease Control and Healthy culture, Zhanjiang, 524088, China

    Jiawei Zhang

  3. Key Laboratory of Marine Ecology and Aquaculture Environment of Zhanjiang, Zhanjiang, 524088, China

    Jiawei Zhang

  4. Guangdong Science and Innovation Center for Pearl Culture, Zhanjiang, 524088, China

    Chuangye Yang, Yongshan Liao, Deng Yuewen, Qingheng Wang & Zhe Zheng

  5. Pearl Breeding and Processing Engineering Technology Research Centre of Guangdong Province, Zhanjiang, 524088, China

    Chuangye Yang, Yongshan Liao, Deng Yuewen, Qingheng Wang & Zhe Zheng

  6. Pearl Research Institute, Guangdong Ocean University, Zhanjiang, 524088, China

    Chuangye Yang, Yongshan Liao, Deng Yuewen, Qingheng Wang & Zhe Zheng

Contributions

Zheng Z., Wang Q.H. and Deng Y.W. designed the study; Zhang J.W. and Li S.Y. performed genome sequencing, data processing, and genome analysis; Wang Y.W. and Zhong S.J. performed the assembly quality validation and improved gene annotation; Liao Y.S. and Yang C.Y. collected and prepared the samples; Zhang J.W. and Li S.Y. wrote the paper. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Qingheng Wang or Zhe Zheng.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Zhang, J., Li, S., Wang, Y. et al. A haplotype-resolved genome of Mytella strigata, a globally invasive marine bivalve. Sci Data (2026). https://doi.org/10.1038/s41597-026-07174-7

  • Received: 23 June 2025

  • Accepted: 30 March 2026

  • Published: 06 April 2026

  • DOI: https://doi.org/10.1038/s41597-026-07174-7

Scientific Data (Sci Data)

ISSN 2052-4463 (online)
