Scientific Data
Genome assembly and annotation of the naked mole rat Heterocephalus glaber reared in Japan
  • Data Descriptor
  • Open access
  • Published: 18 March 2026

  • Kouhei Toga 1,2
  • Kaori Oka 3,4
  • Hiroyuki Tanaka 5
  • Takehiko Itoh 5 (ORCID: orcid.org/0000-0002-6113-557X)
  • Atsushi Toyoda 6,7 (ORCID: orcid.org/0000-0002-0728-7548)
  • Hidemasa Bono 1,2 (ORCID: orcid.org/0000-0003-4413-0651)
  • Kyoko Miura 3,4 (ORCID: orcid.org/0000-0003-2208-149X)

Scientific Data (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Comparative genomics
  • Genome

Abstract

The naked mole rat (NMR, Heterocephalus glaber) is a eusocial rodent native to northeastern Africa. NMRs exhibit extraordinary traits such as longevity, resistance to age-related decline, and remarkable hypoxia tolerance. Although a reference genome has been determined for this species because of these unique characteristics, the significance and role of intraspecific genomic variation remain unknown. In this study, we used PacBio long-read sequencing to generate a genome assembly of an NMR reared in Japan. The assembled genome is 2.56 Gb. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis revealed high completeness (95.2%). BRAKER3 predicted 26,714 protein-coding genes, and we added functional annotations to 26,232 of them using the Fanflow functional annotation workflow. We identified 417 gene models that were previously undetectable in the reference genome of this species. We also identified structural and amino acid sequence variations between our assembly and the reference genome, suggesting the presence of intraspecific genomic variation. This new genomic resource could help uncover the molecular mechanisms underlying the behavioral and physiological traits of NMRs.

Data availability

All raw sequencing data generated in this study, including Illumina short reads and PacBio CLR long reads, have been deposited in the DNA Data Bank of Japan (DDBJ) Sequence Read Archive under accession numbers DRR401650–DRR401653 (https://ddbj.nig.ac.jp/search/entry/sra-study/DRP012667; ref. 63). The genome assembly has been deposited under accession GCA_053883595.1, and the whole-genome shotgun (WGS) project under BAAHMU010000000–BAAHMU010000382. Functional annotation results produced using Fanflow (TSV format) and gene models predicted by BRAKER3 (GTF format and FASTA format) are available on figshare (https://doi.org/10.6084/m9.figshare.28171166 and https://doi.org/10.6084/m9.figshare.28180430).

The Fanflow annotation file includes the following columns:

  • braker_id: BRAKER3 transcript ID
  • human_pid / human_gene / human_description: human best-hit protein, gene symbol, and description
  • mouse_pid / mouse_gene / mouse_description: mouse best-hit protein, gene symbol, and description
  • guinea_pig_pid / guinea_pig_gene / guinea_pig_description: guinea pig best-hit protein, gene symbol, and description
  • Hgla_female_pid / Hgla_female_gene / description: naked mole-rat female best-hit protein, gene symbol, and description
  • Hgla_male_pid / Hgla_male_gene / description: naked mole-rat male best-hit protein, gene symbol, and description
  • uniprot_id / uniprot_description: UniProtKB best-hit protein and description
  • pfam_id / pfam_description: Pfam domain IDs and domain annotations
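As an illustration of how the Fanflow TSV described above might be inspected, the following minimal sketch counts, for each best-hit gene-symbol column, how many transcripts have a hit. It is not part of the authors' pipeline. The column names are taken from the description above; the two sample rows, the gene symbols in them, and the tokens treated as "missing" are assumptions made only for this example and should be checked against the actual figshare file.

```python
import csv
import io

# Gene-symbol columns named in the Data availability text.
GENE_COLUMNS = ["human_gene", "mouse_gene", "guinea_pig_gene",
                "Hgla_female_gene", "Hgla_male_gene"]

# Assumption: a missing hit is an empty field or one of these tokens.
MISSING = {"", "NA", "-"}


def count_hits(handle):
    """Return (number of rows, per-column count of non-missing gene symbols)."""
    reader = csv.DictReader(handle, delimiter="\t")
    counts = {col: 0 for col in GENE_COLUMNS}
    total = 0
    for row in reader:
        total += 1
        for col in GENE_COLUMNS:
            value = (row.get(col) or "").strip()
            if value not in MISSING:
                counts[col] += 1
    return total, counts


# Made-up two-row table for illustration only; not real annotation data.
SAMPLE = (
    "braker_id\thuman_gene\tmouse_gene\tguinea_pig_gene\t"
    "Hgla_female_gene\tHgla_male_gene\n"
    "g1.t1\tGENE_A\tGene_a\t\tGENE_A\tGENE_A\n"
    "g2.t1\t\t\t\t\tNA\n"
)

total, counts = count_hits(io.StringIO(SAMPLE))
print(total, counts)
```

To apply the same function to a downloaded copy of the annotation file, pass an open text handle, for example `open(path, newline="")`, where `path` is wherever the TSV was saved locally.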

Structural variants identified in this study (VCF format) have been deposited in the European Variation Archive (EVA) under project accession numbers ERZ28787178 (https://identifiers.org/ena.embl:ERZ28787178; ref. 53), ERZ28787179 (https://identifiers.org/ena.embl:ERZ28787179; ref. 54), ERZ28787181 (https://identifiers.org/ena.embl:ERZ28787181; ref. 48), and ERZ28787182 (https://identifiers.org/ena.embl:ERZ28787182; ref. 50). All datasets are publicly accessible at the repositories listed above.
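For readers who want a quick overview of a structural-variant VCF such as those deposited above, the sketch below tallies records by variant type. It is a generic example, not the authors' code. It assumes each record carries the `SVTYPE` key in the INFO column, as reserved for structural variants in the VCF specification; whether every deposited file does so has not been verified here. The three sample records are invented for illustration.

```python
import io
from collections import Counter


def count_sv_types(handle):
    """Tally the SVTYPE value found in the INFO column of each VCF record."""
    counts = Counter()
    for line in handle:
        if line.startswith("#"):  # skip meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:  # INFO is the 8th fixed VCF column
            continue
        # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags.
        info = dict(
            item.split("=", 1) if "=" in item else (item, "")
            for item in fields[7].split(";")
        )
        counts[info.get("SVTYPE", "unknown")] += 1
    return counts


# Invented records for illustration only; not taken from the deposited data.
SAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "ctg1\t100\tsv1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=350;SVLEN=-250\n"
    "ctg1\t900\tsv2\tN\t<INS>\t.\tPASS\tSVTYPE=INS;END=900;SVLEN=120\n"
    "ctg2\t50\tsv3\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=80;SVLEN=-30\n"
)

sv_counts = count_sv_types(io.StringIO(SAMPLE_VCF))
print(dict(sv_counts))
```

If a downloaded file is gzip-compressed, the same function can be given a handle from `gzip.open(path, "rt")` in the Python standard library.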

Code availability

All programs and pipelines were executed following their official manuals or help pages. The versions and parameters used in our analysis are provided in the Methods section. No custom scripts were used.

References

  1. Oka, K., Yamakawa, M., Kawamura, Y., Kutsukake, N. & Miura, K. The Naked Mole-Rat as a Model for Healthy Aging. Annu. Rev. Anim. Biosci. 11, 207–226 (2023).

  2. Buffenstein, R. The Naked Mole-Rat: A New Long-Living Model for Human Aging Research. The Journals of Gerontology Series A: Biological Sciences and Medical Sciences 60, 1369–1377 (2005).

  3. Jarvis, J. U. M. Eusociality in a Mammal: Cooperative Breeding in Naked Mole-Rat Colonies. Science 212, 571–573 (1981).

  4. Park, T. J. et al. Fructose-driven glycolysis supports anoxia resistance in the naked mole-rat. Science 356, 307–311 (2017).

  5. Fang, X. et al. Adaptations to a Subterranean Environment and Longevity Revealed by the Analysis of Mole Rat Genomes. Cell Reports 8, 1354–1364 (2014).

  6. Zhou, X. et al. Beaver and Naked Mole Rat Genomes Reveal Common Paths to Longevity. Cell Reports 32, 107949 (2020).

  7. Oka, K. et al. Resistance to chemical carcinogenesis induction via a dampened inflammatory response in naked mole-rats. Commun Biol 5, 287 (2022).

  8. Tian, X. et al. High-molecular-mass hyaluronan mediates the cancer resistance of the naked mole rat. Nature 499, 346–349 (2013).

  9. Kim, E. B. et al. Genome sequencing reveals insights into physiology and longevity of the naked mole rat. Nature 479, 223–227 (2011).

  10. Sokolowski, D. J. et al. An updated reference genome sequence and annotation reveals gene losses and gains underlying naked mole-rat biology. Preprint at https://doi.org/10.1101/2024.11.26.625329 (2024).

  11. Lewin, H. A. et al. The Earth BioGenome Project 2020: Starting the clock. Proc. Natl. Acad. Sci. USA. 119, e2115635118 (2022).

  12. Hoffmann, A. A. & Rieseberg, L. H. Revisiting the Impact of Inversions in Evolution: From Population Genetic Markers to Drivers of Adaptive Shifts and Speciation? Annu Rev Ecol Evol Syst 39, 21–42 (2008).

  13. Plessy, C. et al. Extreme genome scrambling in marine planktonic Oikopleura dioica cryptic species. Genome Res. https://doi.org/10.1101/gr.278295.123 (2024).

  14. Dobigny, G., Britton-Davidian, J. & Robinson, T. J. Chromosomal polymorphism in mammals: an evolutionary perspective. Biol Rev 92, 1–21 (2017).

  15. Faulkes, C. G. et al. Micro‐ and macrogeographical genetic structure of colonies of naked mole‐rats Heterocephalus glaber. Molecular Ecology 6, 615–628 (1997).

  16. Ingram, C. M., Troendle, N. J., Gill, C. A., Braude, S. & Honeycutt, R. L. Challenging the inbreeding hypothesis in a eusocial mammal: population genetics of the naked mole‐rat, Heterocephalus glaber. Molecular Ecology 24, 4848–4865 (2015).

  17. Gabriel, L. et al. BRAKER3: Fully automated genome annotation using RNA-seq and protein evidence with GeneMark-ETP, AUGUSTUS, and TSEBRA. Genome Res. 34, 769–777 (2024).

  18. Bono, H., Sakamoto, T., Kasukawa, T. & Tabunoki, H. Systematic Functional Annotation Workflow for Insects. Insects 13, 586 (2022).

  19. Ranallo-Benavidez, T. R., Jaron, K. S. & Schatz, M. C. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat Commun 11, 1432 (2020).

  20. Koren, S. et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27, 722–736 (2017).

  21. Walker, B. J. et al. Pilon: An Integrated Tool for Comprehensive Microbial Variant Detection and Genome Assembly Improvement. PLoS ONE 9, e112963 (2014).

  22. Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol 21, 245 (2020).

  23. European Nucleotide Archive. https://identifiers.org/insdc.gca:GCA_944319715.1 (2022).

  24. European Nucleotide Archive. https://identifiers.org/insdc.gca:GCA_000230445.1 (2011).

  25. Manni, M., Berkeley, M. R., Seppey, M. & Zdobnov, E. M. BUSCO: Assessing Genomic Data Quality and Beyond. Current Protocols 1, e323 (2021).

  26. Ou, S. et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol 20, 275 (2019).

  27. Toga, K. Repetitive sequences statistic of NMR, mice and human. figshare https://doi.org/10.6084/m9.figshare.28170533.v1 (2025).

  28. Toga, K. List of public RNA-Seq data of naked mole rat. figshare https://doi.org/10.6084/m9.figshare.28171121.v1 (2025).

  29. Krueger, F. et al. FelixKrueger/TrimGalore: v0.6.10 - add default decompression path. Zenodo https://doi.org/10.5281/ZENODO.5127898 (2023).

  30. Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37, 907–915 (2019).

  31. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  32. Mistry, J. et al. Pfam: The protein families database in 2021. Nucleic Acids Research 49, D412–D419 (2021).

  33. Li, H. Protein-to-genome alignment with miniprot. Bioinformatics 39, btad014 (2023).

  34. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).

  35. Katoh, K., Rozewicki, J. & Yamada, K. D. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Briefings in Bioinformatics 20, 1160–1166 (2019).

  36. Toga, K. Differences of gene model between our assembly and Ensembl assembly. figshare https://doi.org/10.6084/m9.figshare.28171271.v1 (2025).

  37. European Nucleotide Archive. https://identifiers.org/insdc.gca:GCA_944319725.1 (2022).

  38. European Nucleotide Archive. https://identifiers.org/insdc.gca:GCA_000001635.9.

  39. NCBI GenBank. https://identifiers.org/ncbi/insdc.gca:GCA_001624185.1.

  40. NCBI GenBank. https://identifiers.org/ncbi/insdc.gca:GCA_001624215.1.

  41. NCBI GenBank. https://identifiers.org/ncbi/insdc.gca:GCA_001624295.1.

  42. NCBI GenBank. https://identifiers.org/ncbi/insdc.gca:GCA_001632525.1.

  43. NCBI GenBank. https://identifiers.org/ncbi/insdc.gca:GCA_921997135.2.

  44. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

  45. Toga, K. Gap regions at chromosome 19. figshare https://doi.org/10.6084/m9.figshare.29670398.v1 (2025).

  46. Toga, K. Gap regions at chromosome 18. figshare https://doi.org/10.6084/m9.figshare.29670218.v1 (2025).

  47. Heller, D. & Vingron, M. SVIM-asm: structural variant detection from haploid and diploid genome assemblies. Bioinformatics 36, 5519–5521 (2021).

  48. European Variation Archive. https://identifiers.org/ena.embl:ERZ28787181 (2026).

  49. Hickey, G. et al. Pangenome graph construction from genome alignments with Minigraph-Cactus. Nat Biotechnol 42, 663–673 (2024).

  50. European Variation Archive. https://identifiers.org/ena.embl:ERZ28787182 (2026).

  51. DDBJ. https://identifiers.org/ncbi/insdc.gca:GCA_053883595.1 (2025).

  52. Smolka, M. et al. Detection of mosaic and population-level structural variants with Sniffles2. Nat Biotechnol 42, 1571–1580 (2024).

  53. European Variation Archive. https://identifiers.org/ena.embl:ERZ28787178 (2026).

  54. European Variation Archive. https://identifiers.org/ena.embl:ERZ28787179 (2026).

  55. Shumate, A. & Salzberg, S. L. Liftoff: accurate mapping of gene annotations. Bioinformatics 37, 1639–1643 (2021).

  56. Toga, K. Full information on functional annotation added using Fanflow. figshare https://doi.org/10.6084/m9.figshare.28171166.v1 (2025).

  57. Khan, A. & Mathelier, A. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets. BMC Bioinformatics 18, 287 (2017).

  58. Zhou, Y. et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun 10, 1523 (2019).

  59. Toga, K. Transcripts with intraspecific variations in NMR. figshare https://doi.org/10.6084/m9.figshare.28218143.v1 (2025).

  60. Toga, K. Results of enrichment analysis for 77 transcripts with intraspecific variants in NMR. figshare https://doi.org/10.6084/m9.figshare.28180361.v1 (2025).

  61. Kawamura, Y. et al. Cellular senescence induction leads to progressive cell death via the INK4a‐RB pathway in naked mole‐rats. The EMBO Journal 42, e111133 (2023).

  62. Toga, K. SV detection at Gpr143 and Htr2b gene loci. figshare https://doi.org/10.6084/m9.figshare.29835212.v2 (2025).

  63. DDBJ. https://identifiers.org/ncbi/insdc.sra:DRP012667 (2025).

  64. Toga, K. Gene models generated from BRAKER3. figshare https://doi.org/10.6084/m9.figshare.28180430.v1 (2025).

  65. Ruan, J. & Li, H. Fast and accurate long-read assembly with wtdbg2. Nat Methods 17, 155–158 (2020).

  66. Kolmogorov, M., Yuan, J., Lin, Y. & Pevzner, P. A. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol 37, 540–546 (2019).

  67. Guan, D. et al. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics 36, 2896–2898 (2020).

Acknowledgements

This work was supported by JSPS KAKENHI Grant Number JP16H06279 (PAGS) and in part by JSPS KAKENHI Grant JP18H02365 to K.M., JSPS KAKENHI Grant JP24H00542 and JST COI‐NEXT (JPMJPF2010) to K.M. and H.B., and the JST FOREST Program (JPMJFR216C) to K.M. We thank all laboratory members at Hiroshima University and Kumamoto University for their valuable comments. Computations were performed on the computers at the Hiroshima University Genome Editing Innovation Center and, in part, on the NIG supercomputer at the ROIS National Institute of Genetics.

Author information

Authors and Affiliations

  1. Laboratory of BioDX, PtBio Co-Creation Research Center, Genome Editing Innovation Center, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima city, Hiroshima, 739-0046, Japan

    Kouhei Toga & Hidemasa Bono

  2. Laboratory of Genome Informatics, Graduate School of Integrated Sciences for Life, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima city, Hiroshima, 739-0046, Japan

    Kouhei Toga & Hidemasa Bono

  3. Department of Aging and Longevity Research, Faculty of Life Sciences, Kumamoto University, 2-2-1 Honjo, Chuo-ku, Kumamoto city, Kumamoto, 860-0811, Japan

    Kaori Oka & Kyoko Miura

  4. Department of Stem Cell Biology and Medicine, Graduate School of Medical Sciences, Kyushu University, Fukuoka, 812-8582, Japan

    Kaori Oka & Kyoko Miura

  5. Department of Life Science and Technology, Institute of Science Tokyo, Tokyo, 152-8550, Japan

    Hiroyuki Tanaka & Takehiko Itoh

  6. Comparative Genomics Laboratory, National Institute of Genetics, Yata 1111, Mishima, Shizuoka, 411-8540, Japan

    Atsushi Toyoda

  7. Advanced Genomics Center, National Institute of Genetics, Yata 1111, Mishima, Shizuoka, 411-8540, Japan

    Atsushi Toyoda

Contributions

K.M. and H.B. coordinated and designed this study. K.O. conducted the sampling of Heterocephalus glaber, and H.T., T.I., and A.T. performed sequencing and de novo assembly. K.T. and H.B. performed bioinformatics analyses and generated figures and tables. K.T. wrote the first draft of the manuscript. All authors revised, edited, and approved the final manuscript.

Corresponding authors

Correspondence to Hidemasa Bono or Kyoko Miura.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Toga, K., Oka, K., Tanaka, H. et al. Genome assembly and annotation of the naked mole rat Heterocephalus glaber reared in Japan. Sci Data (2026). https://doi.org/10.1038/s41597-026-06996-9

  • Received: 20 May 2025

  • Accepted: 27 February 2026

  • Published: 18 March 2026

  • DOI: https://doi.org/10.1038/s41597-026-06996-9
