Scientific Data
Chromosome-Level Genome Assembly of the Japanese Zacco platypus for Comparative Genomics
  • Data Descriptor
  • Open access
  • Published: 23 December 2025

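  • Jui-Hung Tai (ORCID: 0000-0003-4283-3118; affiliations 1, 2, 3; equal contribution)
  • Tsung-Han Yu (affiliations 2, 4; equal contribution)
  • Feng-Yu Wang (affiliation 5; equal contribution)
  • Yohey Terai (ORCID: 0000-0003-3353-3420; affiliation 6)
  • Tabata Ryoichi (affiliation 6)
  • Shih-Pin Huang (affiliation 3)
  • Tzi-Yuan Wang (affiliation 3)
  • Hurng-Yi Wang (affiliations 1, 2, 4, 7)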

Scientific Data (2025)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Genome
  • Ichthyology

Abstract

Zacco platypus is a freshwater minnow widely distributed across East Asia, noted for its high environmental adaptability, strong reproductive capacity, and ability to hybridize across genera. The type specimen was described from Japan, yet no representative genome from the Japanese lineage has been reported to date. Introduced to Taiwan in the 1980s, Z. platypus rapidly established a stable population. In this study, we assembled a chromosome-level genome from a Taiwanese population and resequenced a native Japanese individual, providing genomic evidence that the Taiwanese lineage originated from Japan. The assembly and gene annotation achieved BUSCO completeness scores of 98.9% and 96.6%, respectively, representing the highest quality reported among published Z. platypus genomes. Furthermore, we identified chromosomal structural variation among populations, and both PCA and genetic distance analyses revealed that the Japanese lineage is distinct from continental populations, indicating the importance of representative genomes across geographic lineages. This high-quality genome provides a valuable resource for future research in comparative genomics, population genetics, hybridization, speciation, and evolutionary biology.

Data availability

The chromosome-level genome assembly is available at GenBank under the accession number JBHGZP000000000 (ref. 36). The genome annotation file has been deposited in the Figshare database (ref. 37). For Taiwanese Z. platypus, the Nanopore long reads are available under accession SRR30669803 (ref. 38), the Illumina short reads under SRR30669440 (ref. 39), the Hi-C reads under SRR30669902 (ref. 40), and the RNA-seq reads under SRR30669437–SRR30669439 (refs. 41–43). The Illumina short reads of Japanese Z. platypus are available under accession SRR35182058 (ref. 44).
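
For readers who want to collect the read sets programmatically, the following is a minimal sketch, not part of the authors' workflow. It maps the SRA run accessions listed here to the identifiers.org URLs used in the reference list (refs. 38 to 44). The helper names `sra_url` and `expand_range` are hypothetical and introduced only for illustration.

```python
import re

# URL prefix as it appears in the article's reference list (refs. 38-44).
SRA_PREFIX = "https://identifiers.org/ncbi/insdc.sra:"

# Run accessions and data types as stated under "Data availability".
SRA_RUNS = {
    "SRR30669803": "Nanopore long reads (Taiwanese Z. platypus)",
    "SRR30669440": "Illumina short reads (Taiwanese Z. platypus)",
    "SRR30669902": "Hi-C reads (Taiwanese Z. platypus)",
    "SRR30669437": "RNA-seq reads (Taiwanese Z. platypus)",
    "SRR30669438": "RNA-seq reads (Taiwanese Z. platypus)",
    "SRR30669439": "RNA-seq reads (Taiwanese Z. platypus)",
    "SRR35182058": "Illumina short reads (Japanese Z. platypus)",
}


def sra_url(accession: str) -> str:
    """Return the identifiers.org URL for an SRA run accession (hypothetical helper)."""
    if not re.fullmatch(r"SRR\d+", accession):
        raise ValueError(f"not an SRA run accession: {accession!r}")
    return SRA_PREFIX + accession


def expand_range(first: str, last: str) -> list[str]:
    """Expand an inclusive accession range such as SRR30669437 to SRR30669439."""
    start, end = int(first[3:]), int(last[3:])
    if end < start:
        raise ValueError("range end precedes range start")
    return [f"SRR{n}" for n in range(start, end + 1)]


if __name__ == "__main__":
    # Print one line per run: accession, URL, and data type.
    for accession, description in SRA_RUNS.items():
        print(accession, sra_url(accession), description, sep="\t")
```

The range helper reflects how the RNA-seq runs are given as a span of three consecutive accessions rather than listed individually.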

Code availability

No specific scripts were developed for this project. All data processing and bioinformatics analyses were conducted using publicly available software, following protocols and manuals provided by each respective tool.

References

  1. Ma, G. C., Watanabe, K., Tsao, H. S. & Yu, H. T. Mitochondrial phylogeny reveals the artificial introduction of the pale chub (Cyprinidae) in Taiwan. Ichthyol Res 53, 323–329, https://doi.org/10.1007/s10228-006-0353-3 (2006).

  2. Fu, S.-J., Cao, Z.-D., Yan, G.-J., Fu, C. & Pang, X. Integrating environmental variation, predation pressure, phenotypic plasticity and locomotor performance. Oecologia 173, 343–354, https://doi.org/10.1007/s00442-013-2626-7 (2013).

  3. Arao, K. & Shimoyama, J. Hybrids between Zacco platypus and Z. temminckii from Aichi prefecture, Japan. Sci Rep Toyohashi Mus Nat Hist 16, 53–54 (2006).

  4. Liao, N. L., Huang, S. P. & Wang, T. Y. Interspecific mating behavior between introduced Zacco platypus and native Opsariichthys evolans in Taiwan. Zool Stud 59, e6, https://doi.org/10.6620/zs.2020.59-6 (2020).

  5. Wang, C.-F., Chang, G.-C., Wang, Y.-Q., Chen, Q.-Y. & Lin, G.-Y. Do introduced Opsariichthys and native Opsariichthys interbreed naturally? (New Taipei Municipal Ming Der High School, New Taipei City, Taiwan, 2011).

  6. Siebold, P. F. V., Haan, W. D., Schlegel, H. & Temminck, C. J. Fauna japonica, sive, Descriptio animalium, quae in itinere per Japoniam, jussu et auspiciis, superiorum, qui summum in India Batava imperium tenent, suscepto, annis 1823-1830. Vol. v.[2] Pisces (Apud Auctorem, 1835).

  7. Nam, S.-E. & Rhee, J.-S. Chromosomal-level genome assembly data from the pale chub, Zacco platypus (Jordan & Evermann, 1902). Data in Brief 55, 110596, https://doi.org/10.1016/j.dib.2024.110596 (2024).

  8. Xu, X. et al. A chromosome-level genome assembly of East Asia endemic minnow Zacco platypus. Scientific Data 11, 317, https://doi.org/10.1038/s41597-024-03163-w (2024).

  9. Perdices, A. & Coelho, M. M. Comparative phylogeography of Zacco platypus and Opsariichthys bidens (Teleostei, Cyprinidae) in China based on cytochrome b sequences. Journal of Zoological Systematics and Evolutionary Research 44, 330–338, https://doi.org/10.1111/j.1439-0469.2006.00368.x (2006).

  10. Wick, R. R., Judd, L. M., Gorrie, C. L. & Holt, K. E. Completing bacterial genome assemblies with multiplex MinION sequencing. Microbial Genomics 3, https://doi.org/10.1099/mgen.0.000132 (2017).

  11. Andrews, S. FastQC: a quality control tool for high throughput sequence data. (2010).

  12. Kolmogorov, M., Yuan, J., Lin, Y. & Pevzner, P. A. Assembly of long, error-prone reads using repeat graphs. Nature Biotechnology 37, 540–546, https://doi.org/10.1038/s41587-019-0072-8 (2019).

  13. Zimin, A. V. & Salzberg, S. L. The genome polishing tool POLCA makes fast and accurate corrections in genome assemblies. PLOS Computational Biology 16, e1007981, https://doi.org/10.1371/journal.pcbi.1007981 (2020).

  14. Laetsch, D. & Blaxter, M. BlobTools: Interrogation of genome assemblies [version 1; peer review: 2 approved with reservations]. F1000Research 6, https://doi.org/10.12688/f1000research.12232.1 (2017).

  15. Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Systems 3, 95–98, https://doi.org/10.1016/j.cels.2016.07.002 (2016).

  16. Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95, https://doi.org/10.1126/science.aal3327 (2017).

  17. Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A. & Zdobnov, E. M. BUSCO Update: Novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Molecular Biology and Evolution 38, 4647–4654, https://doi.org/10.1093/molbev/msab199 (2021).

  18. Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proceedings of the National Academy of Sciences 117, 9451–9457, https://doi.org/10.1073/pnas.1921046117 (2020).

  19. Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Research 35, W265–W268, https://doi.org/10.1093/nar/gkm286 (2007).

  20. Ou, S. & Jiang, N. LTR_FINDER_parallel: parallelization of LTR_FINDER enabling rapid identification of long terminal repeat retrotransposons. Mobile DNA 10, 48, https://doi.org/10.1186/s13100-019-0193-0 (2019).

  21. Ou, S. & Jiang, N. LTR_retriever: A highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiology 176, 1410–1422, https://doi.org/10.1104/pp.17.01310 (2018).

  22. Shao, F., Wang, J., Xu, H. & Peng, Z. FishTEDB: a collective database of transposable elements identified in the complete genomes of fish. Database 2018, bax106, https://doi.org/10.1093/database/bax106 (2018).

  23. Hubley, R. et al. The Dfam database of repetitive DNA families. Nucleic Acids Research 44, D81–D89, https://doi.org/10.1093/nar/gkv1272 (2016).

  24. Bao, W., Kojima, K. K. & Kohany, O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mobile DNA 6, 11, https://doi.org/10.1186/s13100-015-0041-9 (2015).

  25. Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Current Protocols in Bioinformatics 25, 4.10.1–4.10.14, https://doi.org/10.1002/0471250953.bi0410s25 (2009).

  26. Brůna, T., Hoff, K. J., Lomsadze, A., Stanke, M. & Borodovsky, M. BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. NAR Genomics and Bioinformatics 3, lqaa108, https://doi.org/10.1093/nargab/lqaa108 (2021).

  27. Gabriel, L., Hoff, K. J., Brůna, T., Borodovsky, M. & Stanke, M. TSEBRA: transcript selector for BRAKER. BMC Bioinformatics 22, 566, https://doi.org/10.1186/s12859-021-04482-0 (2021).

  28. Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240, https://doi.org/10.1093/bioinformatics/btu031 (2014).

  29. Ashburner, M. et al. Gene Ontology: tool for the unification of biology. Nature Genetics 25, 25–29, https://doi.org/10.1038/75556 (2000).

  30. Kanehisa, M., Furumichi, M., Sato, Y., Matsuura, Y. & Ishiguro-Watanabe, M. KEGG: biological systems database as a model of the real world. Nucleic Acids Research 53, D672–D677, https://doi.org/10.1093/nar/gkae909 (2025).

  31. Dierckxsens, N., Mardulyn, P. & Smits, G. NOVOPlasty: de novo assembly of organelle genomes from whole genome data. Nucleic Acids Research 45, e18–e18, https://doi.org/10.1093/nar/gkw955 (2017).

  32. Marçais, G. et al. MUMmer4: A fast and versatile genome alignment system. PLOS Computational Biology 14, e1005944, https://doi.org/10.1371/journal.pcbi.1005944 (2018).

  33. Purcell, S. et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. The American Journal of Human Genetics 81, 559–575, https://doi.org/10.1086/519795 (2007).

  34. Tai, J.-H. et al. The VCF file of four Zacco platypus lineages. figshare. https://doi.org/10.6084/m9.figshare.30600653 (2025).

  35. He, W. et al. VCF2PCACluster: a simple, fast and memory-efficient tool for principal component analysis of tens of millions of SNPs. BMC Bioinformatics 25, 173, https://doi.org/10.1186/s12859-024-05770-1 (2024).

  36. Wang, T.-Y. et al. Zacco platypus isolate ZpHy926, whole genome shotgun sequencing project. GenBank https://identifiers.org/ncbi/insdc:JBHGZP000000000 (2025).

  37. Tai, J.-H. et al. Annotation files of Japan lineage Zacco platypus from Taiwan. figshare. https://doi.org/10.6084/m9.figshare.30011686.v1 (2025).

  38. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR30669803 (2025).

  39. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR30669440 (2025).

  40. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR30669902 (2025).

  41. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR30669437 (2025).

  42. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR30669438 (2025).

  43. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR30669439 (2025).

  44. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR35182058 (2025).

Acknowledgements

This study was supported by grants from the National Science and Technology Council (NSTC), Taiwan (113-2327-B-002-003-, MOST 109-2311-B-002-023-MY3, MOST 105-2311-B-001-064, MOST 106-2311-B-001-022, and MOST 107-2311-B-001-007), and National Taiwan University (113L7223). It was also partially supported by the Taiwan BioGenome Project, funded by Academia Sinica, Taiwan (AS-Grant 23-23). Additional support was provided through the National Key Area International Cooperation Alliance: University Academic Alliance in Taiwan (UAAT) - Kyushu-Okinawa Open University (KOOU) - Medicine and Life Sciences Integrative Program, funded by the Ministry of Education, Taiwan, to promote international collaboration in cutting-edge research.

Author information

Author notes
  1. These authors contributed equally: Jui-Hung Tai, Tsung-Han Yu, Feng-Yu Wang.

Authors and Affiliations

  1. Genome and Systems Biology Degree Program, National Taiwan University and Academia Sinica, Taipei, Taiwan

    Jui-Hung Tai & Hurng-Yi Wang

  2. Graduate Institute of Clinical Medicine, College of Medicine, National Taiwan University, Taipei, Taiwan

    Jui-Hung Tai, Tsung-Han Yu & Hurng-Yi Wang

  3. Biodiversity Research Center, Academia Sinica, Taipei, Taiwan

    Jui-Hung Tai, Shih-Pin Huang & Tzi-Yuan Wang

  4. Institute of Ecology and Evolutionary Biology, National Taiwan University, Taipei, Taiwan

    Tsung-Han Yu & Hurng-Yi Wang

  5. Taiwan Ocean Research Institute, National Institutes of Applied Research, Kaohsiung, Taiwan

    Feng-Yu Wang

  6. Research Center for Integrative Evolutionary Science, The Graduate University for Advanced Studies, SOKENDAI, Hayama, Kanagawa, Japan

    Yohey Terai & Tabata Ryoichi

  7. Department of Entomology, National Taiwan University, Taipei, Taiwan

    Hurng-Yi Wang

Contributions

H.Y.W. and T.Y.W. conceived and designed the study. T.Y.W., S.P.H., F.Y.W., Y.T., R.T., J.H.T. and T.H.Y. collected samples. J.H.T. and T.H.Y. performed the data analysis. T.Y.W. and Y.T. conducted experiments. J.H.T. wrote the manuscript. H.Y.W. and T.Y.W. revised the manuscript.

Corresponding authors

Correspondence to Tzi-Yuan Wang or Hurng-Yi Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Tai, JH., Yu, TH., Wang, FY. et al. Chromosome-Level Genome Assembly of the Japanese Zacco platypus for Comparative Genomics. Sci Data (2025). https://doi.org/10.1038/s41597-025-06467-7

  • Received: 05 September 2025

  • Accepted: 10 December 2025

  • Published: 23 December 2025

  • DOI: https://doi.org/10.1038/s41597-025-06467-7

Scientific Data (Sci Data)

ISSN 2052-4463 (online)