Scientific Data
Assembling a chromosome-level genome for the Microtus fortis using PacBio HiFi and Hi-C technologies
  • Data Descriptor
  • Open access
  • Published: 14 February 2026

  • Du Zhang 1,2 (ORCID: orcid.org/0000-0002-3350-9108)
  • Qi Hu 3
  • Tianqiong He 4,5
  • Junkang Zhou 4
  • Yixin Wen 4,5
  • Qian Liu 4,5
  • Jing Zhang 4,5 (ORCID: orcid.org/0009-0002-7758-9960)
  • Wenlin Zhi 4,5
  • Lingxuan Ouyang 4,5
  • Suisui Gao 4,5
  • Ruotong Guan 4,5
  • Zhijun Zhou 4,5

Scientific Data (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Genome
  • Genomics

Abstract

The reed vole (Microtus fortis) is an important rodent model for studying unique biological traits, such as its natural resistance to Schistosoma japonicum. To facilitate genetic study of these phenotypes, we have produced the first high-quality, chromosome-level genome assembly for this species. The genome was assembled from PacBio HiFi long reads and scaffolded to the chromosome level with Hi-C data. The final 2.29 Gb assembly shows high continuity (contig N50 = 68.89 Mb; scaffold N50 = 91.23 Mb), with 97.7% of the sequence anchored into 26 pseudomolecules, consistent with the species’ karyotype. Genome completeness was estimated at 96.3% by BUSCO analysis (glires_odb10). The annotation includes 23,678 protein-coding genes, 97.5% of which were assigned a putative function. This publicly available genomic resource provides a foundation for exploring the genetic mechanisms behind the unique adaptations of M. fortis, including its innate immunity and digestive physiology, and for its use in disease models. The assembly will also serve as a key reference for comparative genomics, enriching our understanding of rodent evolution.
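
As an illustration of how the continuity and anchoring figures quoted in the abstract are conventionally defined, the minimal Python sketch below computes an N50 and an anchored fraction from a list of sequence lengths. It is not the authors' code (no custom code was generated for this study); the function names `n50` and `anchored_fraction` and the toy lengths are assumptions made purely for the example.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def anchored_fraction(chromosome_lengths, all_lengths):
    """Fraction of the total assembly length placed in pseudomolecules."""
    return sum(chromosome_lengths) / sum(all_lengths)


# Hypothetical toy lengths (arbitrary units), not the M. fortis assembly.
toy_scaffolds = [50, 40, 30, 20, 10]
toy_chromosomes = [50, 40, 30]

print(n50(toy_scaffolds))
print(anchored_fraction(toy_chromosomes, toy_scaffolds))
```

Applied to real contig or scaffold lengths (for example, read from a FASTA index), the same two functions would yield values comparable to the contig N50, scaffold N50, and anchoring percentage reported above.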

Data availability

The raw sequencing data (including Illumina, PacBio HiFi, Hi-C, and transcriptomic reads) generated in this study have been deposited in the NCBI Sequence Read Archive (SRA) under study accession number SRP589748 (ref. 44). The complete mitochondrial genome is available in GenBank under accession number PX549189.1 (ref. 50). The final chromosome-level genome assembly and annotation have been deposited in GenBank under accession number JBQVRV000000000.1 (ref. 51) and in the Genome Warehouse (GWH) at the National Genomics Data Center (NGDC) under accession number GWHESEF00000000 (ref. 52). All data are associated with BioProject accession number PRJNA1271721.

Code availability

No custom code was generated for this study. All analyses were performed using publicly available bioinformatics tools as described in the methods section.

References

  1. Wang, S. et al. The feeding preference and bite response between Microtus fortis and Broussonetia papyrifera. Frontiers in Plant Science 15, 1361311 (2024).

  2. Okada, K. & Kageyama, A. Assisted reproductive technologies in Microtus genus. Reproductive Medicine and Biology 18, 121–127 (2019).

  3. Hu, Q. et al. De novo assembly and transcriptome characterization: Novel insights into the mechanisms of primary ovarian cancer in Microtus fortis. Molecular Medicine Reports 25, 64 (2022).

  4. Ueoka, I., Pham, H. T. N., Matsumoto, K. & Yamaguchi, M. Autism spectrum disorder-related syndromes: modeling with Drosophila and rodents. International Journal of Molecular Sciences 20, 4071 (2019).

  5. Zhu, L., Qi, Z., Wen, Y. C., Min, J. Z. & Song, Q. K. The complete mitochondrial genome of Microtus fortis pelliceus (Arvicolinae, Rodentia) from China and its phylogenetic analysis. Mitochondrial DNA Part B 4, 2039–2041 (2019).

  6. He, T. et al. Metabolomic analysis of the intrinsic resistance mechanisms of Microtus fortis against Schistosoma japonicum infection. Scientific Reports 15, 7147 (2025).

  7. Shen, J. et al. Macrophage-mediated trogocytosis contributes to destroying human schistosomes in a non-susceptible rodent host, Microtus fortis. Cell Discovery 9, 101 (2023).

  8. Xiong, D. et al. Transcriptional profiling of Microtus fortis responses to S. japonicum: New sight into Mf‐Hsp90 α resistance mechanism. Parasite Immunology 43, e12842 (2021).

  9. Hu, Y. et al. De novo assembly and transcriptome characterization: novel insights into the natural resistance mechanisms of Microtus fortis against Schistosoma japonicum. BMC Genomics 15, 1–13 (2014).

  10. Li, H. et al. Genome assembly and transcriptome analysis provide insights into the antischistosome mechanism of Microtus fortis. Journal of Genetics and Genomics 47, 743–755 (2020).

  11. Rhie, A. et al. Towards complete and error-free genome assemblies of all vertebrate species. Nature 592, 737–746 (2021).

  12. Wenger, A. M. et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nature Biotechnology 37(10), 1155–1162 (2019).

  13. Burton, J. N. et al. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nature Biotechnology 31(12), 1119–1125 (2013).

  14. Rao, S. S. P. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).

  15. Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011).

  16. Ranallo-Benavidez, T. R., Jaron, K. S. & Schatz, M. C. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nature Communications 11, 1432 (2020).

  17. Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nature Methods 18, 170–175 (2021).

  18. Guan, D. et al. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics 36, 2896–2898 (2020).

  19. Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Systems 3, 95–98 (2016).

  20. Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).

  21. Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Systems 3, 99–101 (2016).

  22. Pan, Y. Q. et al. Analysis of chromosome number and chromosome bands of Microtus fortis from different regions in China. Chinese Journal of Laboratory Animal Science 12(3), 147–150 (2002).

  23. Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A. & Zdobnov, E. M. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Molecular Biology and Evolution 38, 4647–4654 (2021).

  24. Slater, G. S. C. & Birney, E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics 6, 1–11 (2005).

  25. Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Research 34, W435–W439 (2006).

  26. Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. Journal of Molecular Biology 268(1), 78–94 (1997).

  27. Holt, C. & Yandell, M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12, 1–14 (2011).

  28. Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biology 14, 1–13 (2013).

  29. Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nature Biotechnology 33, 290–295 (2015).

  30. Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biology 9, 1–22 (2008).

  31. The UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Research 47, D506–D515 (2019).

  32. Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240 (2014).

  33. Huerta-Cepas, J. et al. Fast genome-wide functional annotation through orthology assignment by eggNOG-mapper. Molecular Biology and Evolution 34, 2115–2122 (2017).

  34. Bergman, C. M. & Quesneville, H. Discovering and detecting transposable elements in genome sequences. Briefings in Bioinformatics 8, 382–392 (2007).

  35. Bao, W., Kojima, K. K. & Kohany, O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mobile DNA 6, 1–6 (2015).

  36. Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proceedings of the National Academy of Sciences 117, 9451–9457 (2020).

  37. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Research 27, 573–580 (1999).

  38. Lowe, T. M. & Chan, P. P. tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Research 44(W1), W54–W57 (2016).

  39. Griffiths-Jones, S. et al. Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Research 33, D121–D124 (2005).

  40. Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).

  41. Meng, G., Li, Y., Yang, C. & Liu, S. MitoZ: a toolkit for animal mitochondrial genome assembly, annotation and visualization. Nucleic Acids Research 47(11), e63 (2019).

  42. Bernt, M. et al. MITOS: Improved de novo metazoan mitochondrial genome annotation. Molecular Phylogenetics and Evolution 69(2) (2013).

  43. Greiner, S., Lehwark, P. & Bock, R. OrganellarGenomeDRAW (OGDRAW) version 1.3.1: expanded toolkit for the graphical visualization of organellar genomes. Nucleic Acids Research 47(W1), W59–W64 (2019).

  44. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP589748 (2025).

  45. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra/SRR33821528 (2025).

  46. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra/SRR33821526 (2025).

  47. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra/SRR33821527 (2025).

  48. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra/SRR33821525 (2025).

  49. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra/SRR33821524 (2025).

  50. NCBI GenBank https://identifiers.org/ncbi/insdc:PX549189.1 (2025).

  51. NCBI GenBank https://identifiers.org/ncbi/insdc:JBQVRV000000000.1 (2025).

  52. CSDC Genome Warehouse https://ngdc.cncb.ac.cn/gwh/Assembly/84475/show (2025).

  53. Chen, Y., Zhang, Y., Wang, A. Y., Gao, M. & Chong, Z. Accurate long-read de novo assembly evaluation with Inspector. Genome Biology 22, 1–21 (2021).

  54. Challis, R. et al. BlobToolKit – interactive quality assessment of genome assemblies. G3: Genes, Genomes, Genetics 10(4), 1361–1374 (2020).

Acknowledgements

This work was supported by the Changsha Major Special Project of Science and Technology (Grant No. kh2301027), the Natural Science Foundation of Hunan Province (Grant Nos. 2024JJ5422 and 2024JJ6494), and the Key Research and Development Program of Hunan Province (Grant No. 2024DK2001). The authors are grateful to Benagen Technology (Wuhan, China) for their valuable advice on bioinformatic analysis. We also sincerely thank the journal editors and the reviewers for their constructive suggestions and insightful comments, which significantly contributed to the improvement of this manuscript.

Author information

Authors and Affiliations

  1. Department of Medical Genetics, The Second Xiangya Hospital of Central South University, Changsha, 410011, China

    Du Zhang

  2. Hunan Province Clinical Medical Research Center for Genetic Birth Defects and Rare Diseases, Department of Medical Genetics, The Second Xiangya Hospital of Central South University, Changsha, 410011, China

    Du Zhang

  3. E-gene Biotechnology Co., Ltd., Shenzhen, 518038, China

    Qi Hu

  4. Department of Laboratory Animal Science, Xiangya School of Medicine College, Central South University, Changsha, 410013, China

    Tianqiong He, Junkang Zhou, Yixin Wen, Qian Liu, Jing Zhang, Wenlin Zhi, Lingxuan Ouyang, Suisui Gao, Ruotong Guan & Zhijun Zhou

  5. Hunan Key Laboratory of Animal Models for Human Diseases, Central South University, Changsha, 410013, China

    Tianqiong He, Yixin Wen, Qian Liu, Jing Zhang, Wenlin Zhi, Lingxuan Ouyang, Suisui Gao, Ruotong Guan & Zhijun Zhou

Contributions

Z.J.Z. and Q.H. conceived and designed the study; T.Q.H. and J.K.Z. collected the reed vole samples; J.K.Z., Y.X.W., Q.L., J.Z., W.L.Z., L.X.O., S.S.G. and R.T.G. contributed to experimental design and data collection; D.Z. analyzed the data and wrote the draft manuscript; D.Z., T.Q.H., Q.H. and Z.J.Z. discussed the results and revised the manuscript. All authors reviewed the manuscript.

Corresponding author

Correspondence to Zhijun Zhou.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Materials

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Zhang, D., Hu, Q., He, T. et al. Assembling a chromosome-level genome for the Microtus fortis using PacBio HiFi and Hi-C technologies. Sci Data (2026). https://doi.org/10.1038/s41597-026-06813-3

  • Received: 11 June 2025

  • Accepted: 03 February 2026

  • Published: 14 February 2026

  • DOI: https://doi.org/10.1038/s41597-026-06813-3
