Abstract
The naked mole rat (NMR, Heterocephalus glaber) is a eusocial rodent native to northeastern Africa. NMRs exhibit extraordinary traits such as longevity, resistance to age-related decline, and remarkable hypoxia tolerance. Although a reference genome has been determined for this species because of its unique characteristics, the significance and role of intraspecific genomic variation remain unknown. In this study, we used PacBio long-read sequencing to generate a genome assembly of an NMR reared in Japan. The assembled genome is 2.56 Gb. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis indicated high completeness (95.2%). BRAKER3 predicted 26,714 protein-coding genes, and we added functional annotations to 26,232 of them using the functional annotation workflow. We identified 417 gene models that were previously undetectable in the reference genome of this species. We also identified structural and amino acid sequence variations between our assembly and the reference genome, suggesting the presence of intraspecific genomic variation. This new genomic resource could help uncover the molecular mechanisms underlying the behavioral and physiological traits of NMRs.
Data availability
All raw sequencing data generated in this study, including Illumina short reads and PacBio CLR long reads, have been deposited in the DNA Data Bank of Japan (DDBJ) Sequence Read Archive under accession numbers DRR401650–DRR401653 (https://ddbj.nig.ac.jp/search/entry/sra-study/DRP012667; ref. 63). The genome assembly has been deposited under accession GCA_053883595.1, and the whole-genome shotgun (WGS) project under BAAHMU010000000–BAAHMU010000382. Functional annotation results produced using Fanflow (TSV format) and gene models predicted by BRAKER3 (GTF and FASTA formats) are available on figshare (https://doi.org/10.6084/m9.figshare.28171166 and https://doi.org/10.6084/m9.figshare.28180430).
The Fanflow annotation file includes the following columns: braker_id (BRAKER3 transcript ID); human_pid / human_gene / human_description (human best-hit protein, gene symbol, and description); mouse_pid / mouse_gene / mouse_description (mouse best-hit protein, gene symbol, and description); guinea_pig_pid / guinea_pig_gene / guinea_pig_description (guinea pig best-hit protein, gene symbol, and description); Hgla_female_pid / Hgla_female_gene / description (naked mole-rat female best-hit protein, gene symbol, and description); Hgla_male_pid / Hgla_male_gene / description (naked mole-rat male best-hit protein, gene symbol, and description); uniprot_id / uniprot_description (UniProtKB best-hit protein and description); and pfam_id / pfam_description (Pfam domain IDs and domain annotations).
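As an illustration of how the Fanflow TSV might be read, the following minimal Python sketch counts how many transcripts carry a value in a given annotation column. It is not part of the analysis pipeline of this study. The function name count_annotated, the in-memory SAMPLE rows (placeholder values, not real results), and the assumption that a missing hit is stored as an empty field are all hypothetical; only the column names braker_id, human_gene, and pfam_id are taken from the description above.

```python
import csv
import io


def count_annotated(tsv_handle, column="human_gene"):
    """Return (rows_with_value, total_rows) for one annotation column.

    Assumption: a transcript without a hit has an empty field in that column.
    """
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    if column not in (reader.fieldnames or []):
        raise KeyError(f"column {column!r} not found in header")
    total = 0
    annotated = 0
    for row in reader:
        total += 1
        # Short rows yield None for missing trailing fields; treat as empty.
        if (row[column] or "").strip():
            annotated += 1
    return annotated, total


# Placeholder rows for demonstration only; these are not values from the dataset.
SAMPLE = (
    "braker_id\thuman_gene\tpfam_id\n"
    "tx_placeholder_1\tGENE_A\tPF00000\n"
    "tx_placeholder_2\t\tPF00000\n"
    "tx_placeholder_3\tGENE_B\t\n"
)

annotated, total = count_annotated(io.StringIO(SAMPLE), column="human_gene")
print(annotated, total)
```

To apply the same function to the downloaded file, the io.StringIO object would be replaced by an open file handle, for example open(path, newline="", encoding="utf-8"), where path is wherever the figshare TSV was saved.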
Structural variants identified in this study (VCF format) have been deposited in the European Variation Archive (EVA) under project accession numbers ERZ28787178 (https://identifiers.org/ena.embl:ERZ28787178; ref. 53), ERZ28787179 (https://identifiers.org/ena.embl:ERZ28787179; ref. 54), ERZ28787181 (https://identifiers.org/ena.embl:ERZ28787181; ref. 48), and ERZ28787182 (https://identifiers.org/ena.embl:ERZ28787182; ref. 50). All datasets are publicly accessible at the repositories listed above.
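For readers who wish to summarize the deposited VCF files, a minimal Python sketch for tallying records by structural variant type is given below. It assumes that each record carries an SVTYPE key in the INFO column, as defined in the VCF specification; whether every deposited record does so has not been verified here. The function name count_sv_types and the SAMPLE_VCF records (placeholder contig names and coordinates, not real calls) are hypothetical.

```python
import io
from collections import Counter


def count_sv_types(vcf_handle):
    """Count VCF records by the SVTYPE value in the INFO column (column 8)."""
    counts = Counter()
    for line in vcf_handle:
        # Skip meta-information lines, the header line, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        info = {}
        for item in fields[7].split(";"):
            key, _, value = item.partition("=")  # flag-type keys get value ""
            info[key] = value
        counts[info.get("SVTYPE", "UNKNOWN")] += 1
    return counts


# Placeholder records for demonstration only; these are not calls from the dataset.
SAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "contig_x\t100\tsv1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=600;SVLEN=-500\n"
    "contig_x\t900\tsv2\tN\t<INS>\t.\tPASS\tSVTYPE=INS;SVLEN=300\n"
    "contig_y\t50\tsv3\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=150;SVLEN=-100\n"
)

sv_counts = count_sv_types(io.StringIO(SAMPLE_VCF))
print(dict(sv_counts))
```

For a gzip-compressed download, the handle could instead be obtained with gzip.open(path, "rt") from the Python standard library.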
Code availability
All programs and pipelines were executed following their official manuals or help pages. The software versions and parameters used in our analysis are provided in the Methods section. No custom scripts were used.
References
Oka, K., Yamakawa, M., Kawamura, Y., Kutsukake, N. & Miura, K. The Naked Mole-Rat as a Model for Healthy Aging. Annu. Rev. Anim. Biosci. 11, 207–226 (2023).
Buffenstein, R. The Naked Mole-Rat: A New Long-Living Model for Human Aging Research. The Journals of Gerontology Series A: Biological Sciences and Medical Sciences 60, 1369–1377 (2005).
Jarvis, J. U. M. Eusociality in a Mammal: Cooperative Breeding in Naked Mole-Rat Colonies. Science 212, 571–573 (1981).
Park, T. J. et al. Fructose-driven glycolysis supports anoxia resistance in the naked mole-rat. Science 356, 307–311 (2017).
Fang, X. et al. Adaptations to a Subterranean Environment and Longevity Revealed by the Analysis of Mole Rat Genomes. Cell Reports 8, 1354–1364 (2014).
Zhou, X. et al. Beaver and Naked Mole Rat Genomes Reveal Common Paths to Longevity. Cell Reports 32, 107949 (2020).
Oka, K. et al. Resistance to chemical carcinogenesis induction via a dampened inflammatory response in naked mole-rats. Commun Biol 5, 287 (2022).
Tian, X. et al. High-molecular-mass hyaluronan mediates the cancer resistance of the naked mole rat. Nature 499, 346–349 (2013).
Kim, E. B. et al. Genome sequencing reveals insights into physiology and longevity of the naked mole rat. Nature 479, 223–227 (2011).
Sokolowski, D. J. et al. An updated reference genome sequence and annotation reveals gene losses and gains underlying naked mole-rat biology. Preprint at https://doi.org/10.1101/2024.11.26.625329 (2024).
Lewin, H. A. et al. The Earth BioGenome Project 2020: Starting the clock. Proc. Natl. Acad. Sci. USA. 119, e2115635118 (2022).
Hoffmann, A. A. & Rieseberg, L. H. Revisiting the Impact of Inversions in Evolution: From Population Genetic Markers to Drivers of Adaptive Shifts and Speciation? Annu Rev Ecol Evol Syst 39, 21–42 (2008).
Plessy, C. et al. Extreme genome scrambling in marine planktonic Oikopleura dioica cryptic species. Genome Res., https://doi.org/10.1101/gr.278295.123 (2024).
Dobigny, G., Britton-Davidian, J. & Robinson, T. J. Chromosomal polymorphism in mammals: an evolutionary perspective. Biol Rev 92, 1–21 (2017).
Faulkes, C. G. et al. Micro‐ and macrogeographical genetic structure of colonies of naked mole‐rats Heterocephalus glaber. Molecular Ecology 6, 615–628 (1997).
Ingram, C. M., Troendle, N. J., Gill, C. A., Braude, S. & Honeycutt, R. L. Challenging the inbreeding hypothesis in a eusocial mammal: population genetics of the naked mole‐rat, Heterocephalus glaber. Molecular Ecology 24, 4848–4865 (2015).
Gabriel, L. et al. BRAKER3: Fully automated genome annotation using RNA-seq and protein evidence with GeneMark-ETP, AUGUSTUS, and TSEBRA. Genome Res. 34, 769–777 (2024).
Bono, H., Sakamoto, T., Kasukawa, T. & Tabunoki, H. Systematic Functional Annotation Workflow for Insects. Insects 13, 586 (2022).
Ranallo-Benavidez, T. R., Jaron, K. S. & Schatz, M. C. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat Commun 11, 1432 (2020).
Koren, S. et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27, 722–736 (2017).
Walker, B. J. et al. Pilon: An Integrated Tool for Comprehensive Microbial Variant Detection and Genome Assembly Improvement. PLoS ONE 9, e112963 (2014).
Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol 21, 245 (2020).
European Nucleotide Archive. https://identifiers.org/insdc.gca:GCA_944319715.1 (2022).
European Nucleotide Archive. https://identifiers.org/insdc.gca:GCA_000230445.1 (2011).
Manni, M., Berkeley, M. R., Seppey, M. & Zdobnov, E. M. BUSCO: Assessing Genomic Data Quality and Beyond. Current Protocols 1, e323 (2021).
Ou, S. et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol 20, 275 (2019).
Toga, K. Repetitive sequences statistic of NMR, mice and human. figshare https://doi.org/10.6084/m9.figshare.28170533.v1 (2025).
Toga, K. List of public RNA-Seq data of naked mole rat. figshare https://doi.org/10.6084/m9.figshare.28171121.v1 (2025).
Krueger, F. et al. FelixKrueger/TrimGalore: v0.6.10 - add default decompression path. Zenodo https://doi.org/10.5281/ZENODO.5127898 (2023).
Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37, 907–915 (2019).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Mistry, J. et al. Pfam: The protein families database in 2021. Nucleic Acids Research 49, D412–D419 (2021).
Li, H. Protein-to-genome alignment with miniprot. Bioinformatics 39, btad014 (2023).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Katoh, K., Rozewicki, J. & Yamada, K. D. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Briefings in Bioinformatics 20, 1160–1166 (2019).
Toga, K. Differences of gene model between our assembly and Ensembl assembly. figshare https://doi.org/10.6084/m9.figshare.28171271.v1 (2025).
European Nucleotide Archive. https://identifiers.org/insdc.gca:GCA_944319725.1 (2022).
European Nucleotide Archive. https://identifiers.org/insdc.gca:GCA_000001635.9.
NCBI GenBank. https://identifiers.org/ncbi/insdc.gca:GCA_001624185.1.
NCBI GenBank. https://identifiers.org/ncbi/insdc.gca:GCA_001624215.1.
NCBI GenBank. https://identifiers.org/ncbi/insdc.gca:GCA_001624295.1.
NCBI GenBank. https://identifiers.org/ncbi/insdc.gca:GCA_001632525.1.
NCBI GenBank. https://identifiers.org/ncbi/insdc.gca:GCA_921997135.2.
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Toga, K. Gap regions at chromosome 19. figshare https://doi.org/10.6084/m9.figshare.29670398.v1 (2025).
Toga, K. Gap regions at chromosome 18. figshare https://doi.org/10.6084/m9.figshare.29670218.v1 (2025).
Heller, D. & Vingron, M. SVIM-asm: structural variant detection from haploid and diploid genome assemblies. Bioinformatics 36, 5519–5521 (2021).
European Variation Archive. https://identifiers.org/ena.embl:ERZ28787181 (2026).
Hickey, G. et al. Pangenome graph construction from genome alignments with Minigraph-Cactus. Nat Biotechnol 42, 663–673 (2024).
European Variation Archive. https://identifiers.org/ena.embl:ERZ28787182 (2026).
DDBJ. https://identifiers.org/ncbi/insdc.gca:GCA_053883595.1 (2025).
Smolka, M. et al. Detection of mosaic and population-level structural variants with Sniffles2. Nat Biotechnol 42, 1571–1580 (2024).
European Variation Archive. https://identifiers.org/ena.embl:ERZ28787178 (2026).
European Variation Archive. https://identifiers.org/ena.embl:ERZ28787179 (2026).
Shumate, A. & Salzberg, S. L. Liftoff: accurate mapping of gene annotations. Bioinformatics 37, 1639–1643 (2021).
Toga, K. Full information on functional annotation added using Fanflow. figshare https://doi.org/10.6084/m9.figshare.28171166.v1 (2025).
Khan, A. & Mathelier, A. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets. BMC Bioinformatics 18, 287 (2017).
Zhou, Y. et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun 10, 1523 (2019).
Toga, K. Transcripts with intraspecific variations in NMR. figshare https://doi.org/10.6084/m9.figshare.28218143.v1 (2025).
Toga, K. Results of enrichment analysis for 77 transcripts with intraspecific variants in NMR. figshare https://doi.org/10.6084/m9.figshare.28180361.v1 (2025).
Kawamura, Y. et al. Cellular senescence induction leads to progressive cell death via the INK4a‐RB pathway in naked mole‐rats. The EMBO Journal 42, e111133 (2023).
Toga, K. SV detection at Gpr143 and Htr2b gene loci. figshare https://doi.org/10.6084/m9.figshare.29835212.v2 (2025).
DDBJ. https://identifiers.org/ncbi/insdc.sra:DRP012667 (2025).
Toga, K. Gene models generated from BRAKER3. figshare https://doi.org/10.6084/m9.figshare.28180430.v1 (2025).
Ruan, J. & Li, H. Fast and accurate long-read assembly with wtdbg2. Nat Methods 17, 155–158 (2020).
Kolmogorov, M., Yuan, J., Lin, Y. & Pevzner, P. A. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol 37, 540–546 (2019).
Guan, D. et al. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics 36, 2896–2898 (2020).
Acknowledgements
This work was supported by JSPS KAKENHI Grant Number JP16H06279 (PAGS) and in part by JSPS KAKENHI Grant JP18H02365 to K.M.; JSPS KAKENHI Grant JP24H00542 and JST COI-NEXT (JPMJPF2010) to K.M. and H.B.; and the JST FOREST Program (JPMJFR216C) to K.M. We thank all laboratory members at Hiroshima University and Kumamoto University for their valuable comments. Computations were performed on the computers at the Hiroshima University Genome Editing Innovation Center and, in part, on the NIG supercomputer at the ROIS National Institute of Genetics.
Author information
Contributions
K.M. and H.B. coordinated and designed this study. K.O. conducted the sampling of Heterocephalus glaber, and H.T., T.I., and A.T. performed sequencing and de novo assembly. K.T. and H.B. performed bioinformatics analyses and generated figures and tables. K.T. wrote the first draft of the manuscript. All authors revised, edited, and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Toga, K., Oka, K., Tanaka, H. et al. Genome assembly and annotation of the naked mole rat Heterocephalus glaber reared in Japan. Sci Data (2026). https://doi.org/10.1038/s41597-026-06996-9
DOI: https://doi.org/10.1038/s41597-026-06996-9


