Near telomere to telomere genome assembly of Chinese yellow rabbit (Oryctolagus cuniculus)

Chen, Xuequn; Yu, Chunying; Liu, Deqiang; Zhong, Qiuming; Zhou, Jianhua; Lin, Wei; Yang, Yanyan; Xie, Jindong; Wang, Xunli; Wu, Sheng; Qian, Liping; Lin, Yao; Xie, Xiping; Huang, Zhen

doi:10.1038/s41597-025-06058-6

Data Descriptor
Open access
Published: 13 November 2025

Xuequn Chen¹^na1,
Chunying Yu¹^na1,
Deqiang Liu¹,
Qiuming Zhong¹,
Jianhua Zhou¹,
Wei Lin¹,
Yanyan Yang¹,
Jindong Xie¹,
Xunli Wang¹,
Sheng Wu²,
Liping Qian²,
Yao Lin ORCID: orcid.org/0000-0002-0493-0155¹,
Xiping Xie³ &
Zhen Huang¹

Scientific Data volume 12, Article number: 1786 (2025)


Abstract

The Fujian Yellow Rabbit (Oryctolagus cuniculus; FJY), the only indigenous rabbit breed in China, is known for its yellow coat, adaptability, and high meat quality. However, comprehensive genomic datasets for O. cuniculus-FJY are lacking. In this study, we assembled a near telomere-to-telomere (T2T) genome of O. cuniculus-FJY (68% T2T) using PacBio HiFi, ONT ultra-long, and Hi-C data. The 2.88 Gb assembly was anchored to 23 chromosomes (21 autosomes, X, and Y), with 99.80% of the sequence assigned to chromosomal positions. The scaffold N50 reached 148.90 Mb, and BUSCO completeness against the glires lineage was 98.3%. Repeats account for 47.09% of the genome. We identified 22,674 genes, of which 99.74% were functionally annotated. Additionally, we annotated 1,341 tRNAs, 922 rRNAs, 635 miRNAs, and 2,421 snRNAs. This genome represents the rabbit assembly closest to T2T completeness to date and the first successful assembly of the Y chromosome in O. cuniculus. It not only enhances our understanding of the genetic diversity and evolutionary history of rabbit domestication, but also provides crucial data support for future breeding research on O. cuniculus-FJY.

Background & Summary

European rabbits (Oryctolagus cuniculus, 2n = 44) have distinctive physiological characteristics and ecological habits and are widely used in agriculture, medicine, and biological research, making them an important subject in genetics and evolutionary biology¹. The Fujian Yellow Rabbit (O. cuniculus-FJY), a population endemic to Fujian, China, exhibits distinct morphological traits and inhabits a unique and relatively isolated geographical environment. Over a long period, it has been exposed to complex but relatively stable ecological conditions and human activities, which may have led to significant genetic drift and local adaptation within the population. Ren et al. compared the genetic diversity and population structure of four Chinese rabbit breeds and found that O. cuniculus-FJY exhibits the highest level of genetic diversity². O. cuniculus-FJY is also distinguished from other rabbit breeds by its yellow coat, slow growth rate, small body size, and high-protein, low-fat muscle, traits that reflect its domestication and adaptive evolution^3,4. Therefore, resolving the genomic architecture of O. cuniculus-FJY could inform conservation strategies and selective breeding programs, ensuring the preservation and sustainable utilization of its valuable genetic resources.

To date, several O. cuniculus genome assemblies have been released in public databases such as NCBI (https://www.ncbi.nlm.nih.gov/)^1,5. The most complete among them is mOryCun1.1 (GenBank: GCF_964237555.1), which delivers nine T2T chromosomes and nine chromosomes with a single telomere. However, as this assembly was generated from a female individual, it lacks Y-linked sequences, and it still contains 195 unplaced scaffolds. This incomplete genome information limits our understanding of rabbit genetic diversity and evolutionary history and hinders the elucidation of the genetic mechanisms underlying the unique traits of O. cuniculus-FJY. Unlike the rabbit breeds with published reference genomes, O. cuniculus-FJY is a local breed with distinctive characteristics and considerable breeding value. It differs markedly in appearance from commercial breeds such as the New Zealand White Rabbit (O. cuniculus-NZW), exhibiting, for example, a yellow coat and black eyes. Its genome is therefore expected to differ from that of O. cuniculus-NZW, which limits the usefulness of existing references for breeding and for improving production performance. Generating a high-quality genome for O. cuniculus-FJY therefore not only helps us understand the molecular basis of its origin and differentiation, but also provides a scientific basis for building regional genetic resources, protecting rare populations, and improving breeding strategies.

In this study, we integrated PacBio HiFi reads, ONT ultra-long reads, and Hi-C sequencing data to generate a near-T2T assembly of the O. cuniculus genome, which includes the first complete assembly of its Y chromosome. The assembly is nearly T2T-complete, with only 30 unplaced scaffolds remaining, fewer than in any previously published O. cuniculus genome. This study not only enhances our understanding of rabbit biology but also serves as a valuable resource for future studies aimed at conservation and genetic improvement, ultimately benefiting both the species and the agricultural practices associated with it.

Methods

Ethics statement

All animal experiments were performed in strict compliance with the guidelines established by the Committee on Animal Research and Ethics of Fujian University of Traditional Chinese Medicine (approval number: FJTCM IACUC 2024330).

Sample collection and sequencing

A four-month-old male O. cuniculus-FJY was obtained from the National Fujian Yellow Rabbit Conservation Farm (Fuzhou, Fujian, China). Genomic DNA and transcriptome sequencing libraries were prepared from blood samples. Sequencing data were generated on multiple platforms and comprised PacBio high-fidelity (HiFi) reads, Oxford Nanopore (ONT) ultra-long reads, short paired-end reads, and Hi-C reads (Table 1). For PacBio HiFi sequencing, a standard SMRTbell library was constructed using the SMRTbell Express Template Prep Kit 2.0 (Pacific Biosciences, USA) following the manufacturer’s protocol. Sequencing was performed on the PacBio Revio platform in circular consensus sequencing (CCS) mode. Two SMRT cells were sequenced, yielding approximately 11 million high-quality CCS reads (184.14 Gb, ~64× coverage). The average CCS read length was 16.96 Kb, with an N50 of 16.82 Kb. For ONT sequencing, a PromethION library was prepared following the Oxford Nanopore SQK-LSK109 kit protocol (Oxford Nanopore Technologies). Four flow cells were sequenced, and raw data were processed using the Dorado server (v7.2.13) (https://github.com/nanoporetech/dorado/) with adapter trimming and quality filtering (parameter: ‘--min-qscore 7’). This generated 568,258 high-quality ONT reads (56.90 Gb), with maximum and average read lengths of 813.30 Kb and 100.13 Kb, respectively (Fig. 1). The Hi-C library was constructed from cross-linked genomic DNA and sequenced on the DNBSEQ-T7 platform (MGI) using 2 × 150 bp paired-end reads. After processing with fastp (v0.26.0; parameters: ‘fastp -Q -L --adapter_sequence AAGTCGGAGGCCAAGCGGTCTTAGGAAGACAA --adapter_sequence_r2 AAGTCGGATCGTAGCCATGTCGTTCTGTGAGCCAAGGAGTTG --out1 [output read1] --out2 [output read2] --in1 [input read1] --in2 [input read2]’)⁶, 166.79 Gb of clean data (~58× coverage) were obtained. Additionally, short-read DNA sequencing and RNA-seq were performed on the DNBSEQ-T7 platform, yielding 216.05 Gb and 10.99 Gb of clean data, respectively (Table 1).
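The fold-coverage values quoted in the Methods can be reproduced by dividing each platform's yield by the genome size. The following is a minimal sketch, assuming the 2.88 Gb final assembly size is the denominator; the function name and dictionary are illustrative and are not part of the authors' pipeline.

```python
# Sequencing depth = total bases sequenced / genome size.
# Yields (Gb) are taken from the text; the 2.88 Gb denominator is an
# assumption (the final assembly size reported in this paper).
GENOME_SIZE_GB = 2.88


def fold_coverage(yield_gb: float, genome_size_gb: float = GENOME_SIZE_GB) -> float:
    """Return the approximate fold coverage for a given sequencing yield."""
    return yield_gb / genome_size_gb


yields_gb = {
    "PacBio HiFi": 184.14,
    "ONT ultra-long": 56.90,
    "Hi-C": 166.79,
    "Short-read DNA": 216.05,
}

for platform, gb in yields_gb.items():
    print(f"{platform}: ~{fold_coverage(gb):.0f}x")
```

Under this assumption the sketch gives roughly 64×, 58×, and 75× for the HiFi, Hi-C, and short-read data, matching the reported figures. The ONT value it prints is derived here and is not stated in the text.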

Table 1 The summary of the sequencing data for O. cuniculus-FJY genome assembly and annotation.


Genome assembly and evaluation

Initial chromosome-level genome assembly

Contigs were initially assembled with Hifiasm (v0.19.6; parameters: ‘--h1 hic.R1.fq.gz --h2 hic.R2.fq.gz --ul ont.fa.gz HiFi-reads.fa.gz’)⁷, incorporating HiFi reads, ONT reads, and clean Hi-C data. To eliminate contaminant and mitochondrial sequences, the initial assembly was aligned against the NT database (https://ftp.ncbi.nlm.nih.gov/blast/db/) using BLASTN (v2.11.0+; parameters: ‘-evalue 0.00001 -num_alignments 5 -max_hsps 1’)⁸. Contigs in which more than 50% of the sequence was identified as contamination or as organelle-derived were removed from the assembly entirely. For chromosomal scaffolding, low-quality Hi-C reads were removed using HiCUP (v0.7.2; parameter: ‘–NM 3’)⁹, and the filtered Hi-C data were aligned to the contig assembly with BWA (v0.7.12; default parameters)¹⁰. Valid Hi-C reads were then analyzed using Juicer (v1.6; default parameters)¹¹ and 3D-DNA (parameter: ‘-r 0’)¹² to scaffold contigs into chromosomes. Contig order and assembly accuracy were verified and manually corrected in JuiceBox (v1.11.08; default parameters)¹³. Following refinement and adjustment, the final assembly was anchored to 23 chromosomes (21 autosomes, X, and Y) (Table 2).

Table 2 Genome assembly statistics of O. cuniculus-FJY.


T2T genome assembly

To approach a T2T assembly, telomeric regions were extended by aligning ONT ultra-long reads to the scaffolded genome with Minimap2 (v2.24; default parameters)¹⁴, followed by refinement with medaka consensus (v1.7.2; https://github.com/nanoporetech/medaka) and BLASTN (v2.11.0+; default parameters)⁸. Gaps between contigs were closed using TGS-GapCloser (v1.2.0; parameter: ‘--min_nread 10’)¹⁵, leveraging ONT ultra-long reads and coverage-based relationships among contigs. The extended and gap-filled assembly was subsequently polished with 216.05 Gb of clean short-read data (~75× coverage) using Pilon (v1.23; default parameters)¹⁶. The final near-T2T assembly comprised 23 chromosomes totaling 2.88 Gb, with 30 contigs remaining unplaced (anchoring rate of 99.80%), a scaffold N50 of 148.90 Mb, and a GC content of 44.0%. Notably, it includes the first assembled rabbit Y chromosome, which is 34.43 Mb in length (Fig. 2a–c, Tables 2, 3).
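For readers unfamiliar with the scaffold N50 statistic reported here, it is the length L such that sequences of length L or longer together contain at least half of the assembled bases. The following is a minimal sketch of that definition; the toy lengths are made up for illustration and are not taken from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths, for illustration only.
toy_lengths = [100, 80, 60, 40, 20]
print(n50(toy_lengths))
```

In practice the lengths would be read from the assembly FASTA index, and the same function applies to contigs or scaffolds.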

Table 3 Summary of chromosomal and telomere information for the O. cuniculus-FJY genome.


Telomere identification

Telomeric regions were identified from the conserved telomere repeat signature (CCCTAA/TTAGGG). The genome was scanned with quarTeT (v1.1.36; parameter: ‘-m 1’)^17,18 for runs of at least four repeat units located within 50 Kb of either end of each chromosome. Telomeres were annotated across 23 chromosomes, with each chromosome showing detectable telomeric sequences (Fig. 2c, Table 3).
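The search criterion described above (at least four tandem repeat units within 50 Kb of a chromosome end) can be illustrated with a simple regular-expression scan. This is a simplified sketch, not quarTeT's algorithm; the function names are hypothetical, and only perfect tandem copies of the unit are matched.

```python
import re

# Telomere unit and its reverse complement, as given in the text.
# Only perfect tandem runs of >= 4 units are matched (a simplification).
TELOMERE_RE = re.compile(r"(?:CCCTAA){4,}|(?:TTAGGG){4,}")


def telomere_status(seq: str, window: int = 50_000) -> dict:
    """Report whether a telomere-like run lies within `window` bp of each end."""
    seq = seq.upper()
    left_end = seq[:window]
    right_end = seq[-window:]
    return {
        "left": TELOMERE_RE.search(left_end) is not None,
        "right": TELOMERE_RE.search(right_end) is not None,
    }


def is_t2t(seq: str, window: int = 50_000) -> bool:
    """A chromosome counts as T2T here only if both ends carry a telomere run."""
    status = telomere_status(seq, window)
    return status["left"] and status["right"]


# Toy sequences with a small window, for illustration only.
both_ends = "CCCTAA" * 5 + "ACGT" * 50 + "TTAGGG" * 5
one_end = "CCCTAA" * 5 + "ACGT" * 50
print(is_t2t(both_ends, window=60), is_t2t(one_end, window=60))
```

A real scan would iterate over the chromosome records of the assembly FASTA and would usually tolerate degenerate repeat copies, which this sketch does not.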

Repeat sequence annotation

Repetitive sequences in the O. cuniculus-FJY genome were annotated using three methods: (1) Homology-based prediction at the DNA level was conducted with RepeatMasker (v4.0.9; default parameters) (http://www.repeatmasker.org) against the RepBase database (https://www.girinst.org/repbase/), identifying 1.23 Gb of repetitive sequences, approximately 42.90% of the genome (Fig. 2d); (2) Homology-based prediction at the protein level was performed with the RepeatProteinMask tool integrated in RepeatMasker (v4.0.9; default parameters), yielding 210.71 Mb of repetitive sequences, about 7.32% of the genome; (3) A custom repeat library was constructed using RepeatModeler (v1.0.11; default parameters)¹⁹ and LTR_FINDER_parallel (v1.0.7; default parameters)²⁰, and de novo prediction was carried out with RepeatMasker, identifying 924.13 Mb of repetitive sequences, approximately 32.11% of the genome. Finally, the three result sets were merged and overlapping annotations were counted only once, giving a non-redundant total of approximately 1.36 Gb of repetitive sequences, or 47.09% of the genome. The repetitive sequences were categorized by type; the most abundant classes were short interspersed nuclear elements (SINEs), long interspersed nuclear elements (LINEs), long terminal repeats (LTRs), and DNA transposons (Table 4).
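The non-redundant total is smaller than the sum of the three per-method totals because annotations from different methods overlap. One common way to compute such a total is to take the union of the annotated intervals. The following is a minimal sketch of that idea on a single sequence; the coordinates are hypothetical, and this is not necessarily how the authors computed the figure.

```python
def merged_length(intervals):
    """Total bases covered by half-open (start, end) intervals, counting overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged block before starting a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


# Hypothetical repeat annotations on one chromosome from three methods.
repeatmasker = [(0, 100), (200, 300)]
proteinmask = [(50, 120)]
de_novo = [(90, 250)]

naive_sum = sum(merged_length(x) for x in (repeatmasker, proteinmask, de_novo))
non_redundant = merged_length(repeatmasker + proteinmask + de_novo)
print(naive_sum, non_redundant)
```

For a whole genome the same merge would be applied per chromosome and the results summed.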

Table 4 Statistical results of annotated repetitive sequences in O. cuniculus-FJY genome.


Protein coding gene annotation

Gene structure annotation

Gene structure annotation combined four lines of evidence: (1) De novo prediction: We conducted de novo gene prediction using Augustus (v3.3.2)²¹ and Genscan²² with default parameters, identifying 31,072 and 52,508 genes, respectively. (2) Homology-based prediction: We downloaded protein sequences from related lagomorph assemblies, including Ochotona princeps-mOchPri1 (GenBank: GCF_030435755.1), O. princeps-OchPri4 (GenBank: GCF_014633375.1), O. cuniculus-OryCun2 (GenBank: GCF_000003625.3), and O. cuniculus-NZW (GenBank: GCF_009806435). Gene structures were annotated from these homologous proteins using miniprot (v0.11; parameters: ‘--gff-only -O 11 -E 1 -F 23 -C 1 -B 5 -G 200000 -j 1’)²³ and Liftoff (v1.6.3; parameters: ‘-a 0.5 -s 0.5’)²⁴. (3) RNA-seq-based prediction: Using Stringtie (v1.3.5; default parameters)²⁵ and TransDecoder (v5.5.0; default parameters)²⁶, we predicted a total of 11,285 coding genes from the RNA-seq data. (4) Protein library comparison: We obtained 14,108 genes through comparison against the Benchmarking Universal Single-Copy Orthologs (BUSCO; v5.5.0)²⁷ protein library. Finally, a non-redundant gene set was generated by integrating the above evidence with MAKER (v2.31.10; default parameters)²⁸, followed by refinement using HiFAP (v2.4.1; Wuhan OneMore Tech Co., Ltd., https://www.onemore-tech.com/) to obtain a final high-quality set of 22,674 genes. The average gene length was 42,977 bp, with an average of approximately nine exons per gene (Table 5).

Table 5 Summary of gene structure prediction results in O. cuniculus-FJY genome.


Gene function annotation

Gene function annotation was conducted using two main approaches: sequence similarity-based and domain/motif similarity-based methods. (1) Sequence similarity-based annotation: Diamond (v2.0.14; parameters: ‘--evalue 1e-05’)²⁹ was used to align annotated protein sequences against several databases, including NR (version 2023-04-01), SwissProt (version 2023-03-01), TrEMBL (version 2023-03-01), AnimalTFDB (version 4.0), KOG (version 2023-03-01), and KEGG (version 2023-01-01). Additionally, KOBAS (v3.0; parameters: ‘-t blastout:tab -s ko’)³⁰ was used to link the KEGG hits to KEGG ortholog and pathway information. (2) Domain/motif similarity-based annotation: i) We employed InterProScan (v5.61-93.0; parameters: ‘--seqtype p --formats TSV --goterms --pathways -dp’)³¹ to search a series of member databases, obtaining information on conserved sequences, motifs, and domains of proteins. ii) We used HMMER3 (v3.3.1; parameters: ‘hmmsearch -E 1e-05 --domE 1e-05’)³² to annotate conserved sequences, including transcription factors and Pfam domains/motifs, based on multiple sequence alignments and hidden Markov models. Of the 22,674 genes, 22,616 (99.74%) were assigned functional information (Table 6, Fig. 2e). In the Gene Ontology (GO) annotation, the top three terms in the biological process category were cellular process, metabolic process, and biological process, while the top three terms in the molecular function category were binding, catalytic activity, and molecular transducer activity (Fig. 3a). In the KEGG annotation, the most represented pathways in the cellular processes, metabolism, and organismal systems categories were cell growth and death, lipid metabolism, and immune system, respectively (Fig. 3b).

Table 6 Summary of gene functional annotation results in O. cuniculus-FJY genome.


Non-coding RNA annotation

Non-coding RNAs (ncRNAs) were annotated using methods suited to each RNA type: (1) tRNA: Based on the structural features of tRNA, we used tRNAscan-SE (v1.3.1)³³ for annotation, identifying a total of 1,341 tRNAs with an average length of 78 bp. (2) rRNA: Given the high conservation of rRNA sequences, we used rRNA sequences from closely related species as references. Annotation was performed with BLASTN (v2.11.0+; parameters: ‘-evalue 0.01’)⁸, identifying 922 rRNAs with an average length of 186 bp. (3) miRNA and snRNA: Using covariance models from the Rfam database, we annotated miRNAs and snRNAs with the INFERNAL software included in Rfam (v14.8)³⁴. This process identified 635 miRNAs and 2,421 snRNAs, with average lengths of 87 bp and 117 bp, respectively (Table 7).

Table 7 Summary of non-coding RNA annotations in O. cuniculus-FJY genome.


Whole-genome collinearity analysis

For comparative genomic analysis, two additional annotated O. cuniculus genome assemblies, mOryCun1.1 (GenBank: GCA_030258805.1) and NZW (GenBank: GCA_030258775.1), were retrieved from NCBI. Collinearity relationships among the three genomes were constructed and visualized using the JCVI software³⁵ with the following parameters: for ortholog identification, ‘-m jcvi.compara.catalog ortholog --no_strip_names --cscore=.99’; for synteny screening, ‘-m jcvi.compara.synteny screen --minspan=30’. The results indicated that the O. cuniculus-FJY assembly was of markedly higher quality than the O. cuniculus-NZW assembly and comparable to the mOryCun1.1 assembly (Fig. 4a).

Gene family clustering and analysis

To identify gene families, we compared the protein sequences of O. cuniculus-FJY with those of six related lagomorph genomes: O. cuniculus-mOryCun1.1, O. cuniculus-NZW, Lepus europaeus (GenBank: GCF_033115175.1), L. oiostolus (GenBank: GCA_036325905.1), Ochotona curzoniae (GenBank: GCF_017591425.1), and O. princeps (GenBank: GCF_030435755.1). Prior to analysis, the gene sets of all genomes were filtered for quality: for genes with multiple transcripts (resulting from alternative splicing), only the longest coding transcript was retained, and genes encoding proteins shorter than 30 amino acids or containing internal stop codons were excluded. Subsequently, an all-vs-all BLASTp⁸ search (v2.11.0+; parameter: ‘-evalue 1e-5’) was performed to determine similarity relationships among protein sequences across all genomes. Finally, the BLASTp results were clustered using OrthoFinder (v2.5.5; parameter: ‘-S blast -I 1.5’)³⁶. Of the 22,674 protein-coding genes in O. cuniculus-FJY, 22,480 (approximately 99.14%) were grouped into 18,478 orthologous clusters (Fig. 4b).
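The gene-set filtering described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' script (the paper reports that no custom scripts were used): the `Transcript` record and function name are hypothetical, and `*` is assumed to mark a stop codon in the translated protein, with a single trailing `*` treated as the normal terminal stop.

```python
from collections import namedtuple

# Hypothetical record type: one translated transcript per entry.
Transcript = namedtuple("Transcript", ["gene_id", "transcript_id", "protein"])


def filter_gene_set(transcripts, min_len=30):
    """Keep the longest protein per gene, then drop short or stop-containing ones."""
    longest = {}
    for t in transcripts:
        best = longest.get(t.gene_id)
        # Compare lengths without the terminal stop symbol.
        if best is None or len(t.protein.rstrip("*")) > len(best.protein.rstrip("*")):
            longest[t.gene_id] = t

    kept = []
    for t in longest.values():
        protein = t.protein.rstrip("*")
        if len(protein) < min_len:
            continue  # shorter than 30 amino acids
        if "*" in protein:
            continue  # internal stop codon
        kept.append(t)
    return kept


# Toy data, for illustration only.
toy = [
    Transcript("g1", "g1.t1", "M" + "A" * 40 + "*"),
    Transcript("g1", "g1.t2", "M" + "A" * 20),
    Transcript("g2", "g2.t1", "M" + "A" * 10),
    Transcript("g3", "g3.t1", "M" + "A" * 20 + "*" + "A" * 20),
]
print([t.transcript_id for t in filter_gene_set(toy)])
```

The filtered protein sets would then be passed to the all-vs-all BLASTp search and OrthoFinder as described.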

Data Records

The genome assembly of O. cuniculus-FJY has been deposited at NCBI GenBank under accession number GCA_051225735.1³⁷. Raw sequencing data are available in the Genome Sequence Archive (GSA) of the National Genomics Data Center under accession number CRA023120³⁸, and the annotation files have been archived in figshare³⁹.

Technical Validation

Evaluation of genome assembly quality

The accuracy and completeness of the near-T2T assembly (O. cuniculus-FJY) were validated through multiple approaches. Hi-C contact maps revealed uniform chromosomal interactions with strong intra-chromosomal signal enrichment, supporting correct contig scaffolding and orientation (Fig. 2b). Furthermore, short reads and long reads were realigned to the assembly using BWA (v0.7.12; default parameters)¹⁰ and Minimap2 (v2.24; default parameters)¹⁴, respectively, achieving alignment rates of over 99.95%. Genome-wide completeness was further evaluated with BUSCO (v5.5.0)²⁷ based on the glires (odb10) lineage set, recovering 98.3% (13,563/13,800) of conserved orthologs as complete (Table 8).

Table 8 BUSCO assessment results of O. cuniculus-FJY genome and annotation.


Validation of genome annotation

The final annotation comprised 22,674 protein-coding genes, of which 99.7% (22,616) were functionally annotated across multiple reference databases. Notably, 84.8% (19,235 genes) received consistent support from all five databases, indicating robust cross-validation (Fig. 2e, Table 6). Protein-level assessment using BUSCO (v5.5.0)²⁷ further confirmed annotation quality, with 97.80% of expected orthologs classified as complete. In addition, we compared the length distributions of genes, coding sequences (CDS), exons, and introns in O. cuniculus-FJY with those in four related genome assemblies: Ochotona princeps-mOchPri1, O. princeps-OchPri4, O. cuniculus-OryCun2, and O. cuniculus-NZW. The structural characteristics of genes in O. cuniculus-FJY were highly consistent with those of the related assemblies (Fig. 5). Collectively, these results support high confidence in both the structural and the functional annotation.

Phylogenetic analysis

Genes from single-copy gene families were selected, and their individual alignments were concatenated to construct a super-alignment matrix. From this matrix, relevant site data were extracted, including all unprocessed sites, phase 1 sites, and 4D sites. A phylogenetic tree was then constructed using RAxML (v8.2.12; parameters: ‘-f a -N 100 -m GTRGAMMA’)⁴⁰ with 1000 bootstrap replications. The phylogenetic analysis highlights the evolutionary position of O. cuniculus-FJY and the distinct features of its genome. These findings, which are closely associated with its unique biological characteristics, further underscore the importance of decoding its genome (Fig. 4c).
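The concatenation step that produces the super-alignment matrix can be sketched as follows. This is a minimal illustration, assuming each single-copy family has already been aligned and is held as a dictionary from species name to aligned sequence; the function name, species labels, and toy sequences are hypothetical.

```python
def concatenate_alignments(gene_alignments, species):
    """Join per-gene alignments (species -> aligned sequence) into one sequence per species."""
    parts = {sp: [] for sp in species}
    for aln in gene_alignments:
        if set(aln) != set(species):
            raise ValueError("single-copy families need exactly one sequence per species")
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError("sequences within an alignment must have equal length")
        for sp in species:
            parts[sp].append(aln[sp])
    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Toy alignments for two single-copy families, for illustration only.
species = ["FJY", "NZW", "Lepus"]
gene1 = {"FJY": "ATG-CA", "NZW": "ATGACA", "Lepus": "ATGTCA"}
gene2 = {"FJY": "GGC", "NZW": "GGT", "Lepus": "GGC"}

supermatrix = concatenate_alignments([gene1, gene2], species)
for sp in species:
    print(sp, supermatrix[sp])
```

The resulting matrix, written out in PHYLIP or FASTA format, is the kind of input a tool such as RAxML takes; site subsets such as 4D sites would be extracted from it before tree inference.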

Data availability

The data that support the findings of this study are openly available in figshare at https://doi.org/10.6084/m9.figshare.29413133³⁹.

Code availability

In this study, all software and commands were used in line with the published tools’ guidelines. Any elements not specified in the Methods were run with default settings, and no custom scripts were used. The versions of all software and databases are detailed in the Methods section.

References

Carneiro, M. et al. Rabbit genome analysis reveals a polygenic basis for phenotypic change during domestication. Science 345, 1074–1079, https://doi.org/10.1126/science.1253714 (2014).
Ren, A. et al. Genetic diversity and population structure of four Chinese rabbit breeds. PLoS One 14, e0222503, https://doi.org/10.1371/journal.pone.0222503 (2019).
Zhang, J. et al. Integrated metabolome and transcriptome provide insights into meat quality differences between Fujian yellow and New Zealand white rabbits. Food Bioscience 67, 105707, https://doi.org/10.1016/j.fbio.2024.105707 (2025).
X, X. The characteristics and development strategies of meat rabbit industry in Fujian Province. Chin J Rabbit Farm 9, 33–36, https://doi.org/10.3969/j.issn.1005-6327.2009.09.010 (2009).
Bai, Y. et al. Improving the genome assembly of rabbits with long-read sequencing. Genomics 113, 3216–3223, https://doi.org/10.1016/j.ygeno.2021.05.031 (2021).
Chen, S. Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. Imeta 2, e107, https://doi.org/10.1002/imt2.107 (2023).
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods 18, 170–175, https://doi.org/10.1038/s41592-020-01056-5 (2021).
Mount, D. W. Using the Basic Local Alignment Search Tool (BLAST). CSH Protoc 2007, pdb top17, https://doi.org/10.1101/pdb.top17 (2007).
Wingett, S. et al. HiCUP: pipeline for mapping and processing Hi-C data. F1000Res 4, 1310, https://doi.org/10.12688/f1000research.7334.1 (2015).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760, https://doi.org/10.1093/bioinformatics/btp324 (2009).
Durand, N. C. et al. Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C Experiments. Cell Syst 3, 95–98, https://doi.org/10.1016/j.cels.2016.07.002 (2016).
Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95, https://doi.org/10.1126/science.aal3327 (2017).
Durand, N. C. et al. Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom. Cell Syst 3, 99–101, https://doi.org/10.1016/j.cels.2015.07.012 (2016).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100, https://doi.org/10.1093/bioinformatics/bty191 (2018).
Xu, M. Y. et al. TGS-GapCloser: A fast and accurate gap closer for large genomes with low coverage of error-prone long reads. Gigascience 9, giaa094, https://doi.org/10.1093/gigascience/giaa094 (2020).
Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9, e112963, https://doi.org/10.1371/journal.pone.0112963 (2014).
Lin, Y. et al. quarTeT: a telomere-to-telomere toolkit for gap-free genome assembly and centromeric repeat identification. Hortic Res 10, uhad127, https://doi.org/10.1093/hr/uhad127 (2023).
Jiang, C., Du, Y., Lou, Z., Zhang, Y. & Wang, T. Telomere-to-telomere reference genome of Rhinogobio nasutus, an endangered endemic fish from the Yellow River. Sci Data 12, 462, https://doi.org/10.1038/s41597-025-04793-4 (2025).
Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families in large genomes. Bioinformatics 21(Suppl 1), i351–358, https://doi.org/10.1093/bioinformatics/bti1018 (2005).
Ou, S. & Jiang, N. LTR_FINDER_parallel: parallelization of LTR_FINDER enabling rapid identification of long terminal repeat retrotransposons. Mob DNA 10, 48, https://doi.org/10.1186/s13100-019-0193-0 (2019).
Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res 34, W435–439, https://doi.org/10.1093/nar/gkl200 (2006).
Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J Mol Biol 268, 78–94, https://doi.org/10.1006/jmbi.1997.0951 (1997).
Slater, G. S. & Birney, E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics 6, 31, https://doi.org/10.1186/1471-2105-6-31 (2005).
Shumate, A. & Salzberg, S. L. Liftoff: accurate mapping of gene annotations. Bioinformatics 37, 1639–1643, https://doi.org/10.1093/bioinformatics/btaa1016 (2021).
Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol 33, 290–295, https://doi.org/10.1038/nbt.3122 (2015).
Haas, B. J. et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc 8, 1494–1512, https://doi.org/10.1038/nprot.2013.084 (2013).
Simao, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212, https://doi.org/10.1093/bioinformatics/btv351 (2015).
Holt, C. & Yandell, M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12, 491, https://doi.org/10.1186/1471-2105-12-491 (2011).
Buchfink, B., Reuter, K. & Drost, H. G. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nature Methods 18, 366–368, https://doi.org/10.1038/s41592-021-01101-x (2021).
Bu, D. et al. KOBAS-i: intelligent prioritization and exploratory visualization of biological functions for gene enrichment analysis. Nucleic Acids Res 49, W317–W325, https://doi.org/10.1093/nar/gkab447 (2021).
Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240, https://doi.org/10.1093/bioinformatics/btu031 (2014).
Mistry, J., Finn, R. D., Eddy, S. R., Bateman, A. & Punta, M. Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions. Nucleic Acids Res 41, e121, https://doi.org/10.1093/nar/gkt263 (2013).
Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25, 955–964, https://doi.org/10.1093/nar/25.5.955 (1997).
Daub, J., Eberhardt, R. Y., Tate, J. G. & Burge, S. W. Rfam: annotating families of non-coding RNA sequences. Methods Mol Biol 1269, 349–363, https://doi.org/10.1007/978-1-4939-2291-8_22 (2015).
Tang, H. et al. JCVI: A versatile toolkit for comparative genomics analysis. Imeta 3, e211, https://doi.org/10.1002/imt2.211 (2024).
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol 20, https://doi.org/10.1186/s13059-019-1832-y (2019).
NCBI GenBank https://identifiers.org/ncbi/insdc.gca:GCA_051225735.1 (2025).
National Genomics Data Center (NGDC) Genome Sequence Archive https://bigd.big.ac.cn/gsa/browse/CRA023120 (2025).
Genome annotation of Chinese yellow rabbit (Oryctolagus cuniculus). Figshare, https://doi.org/10.6084/m9.figshare.29413133 (2025).
Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313, https://doi.org/10.1093/bioinformatics/btu033 (2014).

Acknowledgements

This study was supported by the Youth Science and Technology Innovation Talent Cultivation Program of FJTCM (no. XQC2023007 and XQC2024003); Rehabilitation technology innovation center by joint collaboration of ministry of education and Fujian province, FJTCM (X2022009-Collaboration); Fujian Province ‘Eaglet Plan’ Young Top Talents Project; the Natural Science Foundation of Fujian Province (2024J01743 and 2024J01121); China Agriculture Research System (CARS-43-5). We thank Wuhan Onemore-tech Co., Ltd. for their assistance with genome sequencing and analysis.

Author information

These authors contributed equally: Xuequn Chen, Chunying Yu.

Authors and Affiliations

The Affiliated People’s Hospital of Fujian University of Traditional Chinese Medicine, College of Integrative Medicine, Fujian-Macao Science and Technology Cooperation Base of Traditional Chinese Medicine-Oriented Chronic Disease Prevention and Treatment, Fujian University of Traditional Chinese Medicine, Fuzhou, Fujian, 350122, China
Xuequn Chen, Chunying Yu, Deqiang Liu, Qiuming Zhong, Jianhua Zhou, Wei Lin, Yanyan Yang, Jindong Xie, Xunli Wang, Yao Lin & Zhen Huang
Fujian Academy of Medical Sciences, Fuzhou, Fujian, 350001, China
Sheng Wu & Liping Qian
Institute of Animal Husbandry and Veterinary Medicine, Fujian Academy of Agricultural Sciences, Fuzhou, Fujian, 350013, China
Xiping Xie

Contributions

Z.H. conceived and designed the project. X.C., C.Y., D.L., Q.Z., J.Z., W.L., Y.Y. and Y.L. contributed to the data analysis and visualization. X.C. and C.Y. wrote the manuscript. X.C. and Z.H. edited the manuscript. J.X., X.W., S.W., L.Q. and X.X. contributed to the collection of rabbit blood samples.

Corresponding author

Correspondence to Zhen Huang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Chen, X., Yu, C., Liu, D. et al. Near telomere to telomere genome assembly of Chinese yellow rabbit (Oryctolagus cuniculus). Sci Data 12, 1786 (2025). https://doi.org/10.1038/s41597-025-06058-6

Received: 07 April 2025
Accepted: 26 September 2025
Published: 13 November 2025
Version of record: 13 November 2025
DOI: https://doi.org/10.1038/s41597-025-06058-6
