Scientific Data
An annotated genome of freshwater amphipod (Gammarus nekkensis)
  • Data Descriptor
  • Open access
  • Published: 02 April 2026

  • Decai Lu1,2,
  • Hongguang Liu1,
  • Yan Tong1,2,
  • Zeyu Liu1,2,
  • Chao-Dong Zhu1 &
  • …
  • Zhonge Hou3 

Scientific Data (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

Freshwater Gammarus species are keystone organisms in aquatic ecosystems and are sensitive to environmental change. Here, we present the first pseudo-chromosome-level genome of Gammarus nekkensis, a species endemic to northern China. We combined PacBio HiFi long-read sequencing, Illumina short-read sequencing, and Hi-C scaffolding to generate a high-quality assembly. The assembled genome is approximately 6.24 Gb in size, with a scaffold N50 of 233.63 Mb, and 96.76% of the sequence was anchored to 26 pseudo-chromosomes. A total of 39,474 protein-coding genes were predicted, of which approximately 70% received functional annotations. Repetitive elements constitute 63.93% of the genome, with long interspersed nuclear elements (LINEs) the most abundant class (23.98%). BUSCO analysis indicated that both the assembly and the annotation are highly complete relative to published amphipod genomes. This genome enables studies of sex determination, adaptive evolution, and genomic diversity in G. nekkensis, with applications for its conservation and breeding.
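For readers less familiar with assembly statistics, the following is a minimal sketch of how a scaffold N50 is conventionally computed and of what the reported percentages imply in absolute terms. The function name `n50`, the toy scaffold lengths, and the derived quantities are illustrative assumptions and are not taken from the authors' pipeline; only the constants (6.24 Gb, 96.76%, 26 pseudo-chromosomes, 63.93%, 23.98%) come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a set of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy input (hypothetical lengths, NOT the real G. nekkensis scaffolds).
toy_lengths = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_lengths)

# Figures reported in the abstract.
GENOME_GB = 6.24
ANCHORED_FRACTION = 0.9676
N_PSEUDOCHROMOSOMES = 26
REPEAT_FRACTION = 0.6393
LINE_FRACTION = 0.2398

# Derived quantities (simple arithmetic on the reported values).
anchored_gb = GENOME_GB * ANCHORED_FRACTION
mean_chrom_mb = anchored_gb * 1000 / N_PSEUDOCHROMOSOMES
repeat_gb = GENOME_GB * REPEAT_FRACTION
line_gb = GENOME_GB * LINE_FRACTION

print(f"toy N50: {toy_n50}")
print(f"anchored sequence: {anchored_gb:.2f} Gb")
print(f"mean pseudo-chromosome size: {mean_chrom_mb:.1f} Mb")
print(f"repeats: {repeat_gb:.2f} Gb, of which LINEs: {line_gb:.2f} Gb")
```

Under these figures, roughly 6.04 Gb of sequence is anchored, which gives a mean pseudo-chromosome size of about 232 Mb. That is close to the reported scaffold N50 of 233.63 Mb, as would be expected when nearly all of the assembly sits in 26 chromosome-scale scaffolds.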

Data availability

The raw sequencing data supporting this study have been deposited in the European Nucleotide Archive (ENA) under the following accession numbers: PacBio data, ERR16720729; Illumina data, ERR16722743; and Hi-C data, ERR16729500. The assembled genome sequence is available from the ENA under accession number ERZ29160471 (GCA_980912865) and from the China National GeneBank under accession number CNA0509588. The functional annotation of the predicted genes is available from Figshare at https://doi.org/10.6084/m9.figshare.31698019.
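As a convenience for readers who want to locate the raw reads programmatically, the sketch below builds ENA browser links and ENA Portal API file-report queries for the three run accessions listed above. It is an illustration only and not part of the authors' workflow: the helper names `browser_url` and `filereport_url` are hypothetical, and the choice of report fields is an assumption about what is useful. The code only constructs URLs and performs no network requests.

```python
from urllib.parse import urlencode

# Run accessions as listed in the Data availability statement.
RUN_ACCESSIONS = {
    "PacBio": "ERR16720729",
    "Illumina": "ERR16722743",
    "Hi-C": "ERR16729500",
}

ENA_BROWSER = "https://www.ebi.ac.uk/ena/browser/view/"
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def browser_url(accession):
    """Return the human-readable ENA browser page for an accession."""
    return ENA_BROWSER + accession


def filereport_url(accession,
                   fields=("run_accession", "fastq_ftp", "submitted_ftp")):
    """Return an ENA Portal API query listing the files of a read run.

    The default fields are an assumption; adjust them as needed.
    """
    query = urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": ",".join(fields),
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


for platform, accession in RUN_ACCESSIONS.items():
    print(platform, browser_url(accession))
    print(platform, filereport_url(accession))
```

Each printed file-report URL can be opened in a browser or passed to a download tool; the response is expected to be a tab-separated table with one row per run.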

Code availability

All analyses used the software and pipelines specified in the Methods section; no custom code was generated.

Acknowledgements

This study was supported by the National Natural Science Foundation of China (grant numbers 32470474 and 32500387), the International Partnership Program of the Chinese Academy of Sciences (grant number 073GJHZ2024043MI), the Institute of Zoology, Chinese Academy of Sciences (grant numbers 2023IOZ0104 and 2024IOZ0108), and the Beijing Natural Science Foundation (grant number 5244045).

Author information

Authors and Affiliations

  1. State Key Laboratory of Animal Biodiversity Conservation and Integrated Pest Management, Institute of Zoology, Chinese Academy of Sciences, 100101, Beijing, China

    Decai Lu, Hongguang Liu, Yan Tong, Zeyu Liu & Chao-Dong Zhu

  2. University of Chinese Academy of Sciences, 100049, Beijing, China

    Decai Lu, Yan Tong & Zeyu Liu

  3. College of Life Sciences, Capital Normal University, 100048, Beijing, China

    Zhonge Hou

Contributions

D.L. and Z.H. conceived the study. D.L., Z.L. and H.L. collected the samples. Y.T., H.L. and Z.L. dissected the tissue and sent the separated samples to the sequencing company. D.L. performed the genome assembly, annotation, and analysis. C.-D.Z. provided guidance on the conceptualization and writing of the article. All authors read, edited, and approved the final manuscript.

Corresponding author

Correspondence to Zhonge Hou.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Lu, D., Liu, H., Tong, Y. et al. An annotated genome of freshwater amphipod (Gammarus nekkensis). Sci Data (2026). https://doi.org/10.1038/s41597-026-07126-1

  • Received: 10 October 2025

  • Accepted: 25 March 2026

  • Published: 02 April 2026

  • DOI: https://doi.org/10.1038/s41597-026-07126-1

Scientific Data (Sci Data)

ISSN 2052-4463 (online)
