The chromosome-level genome assembly of Broad-Leaf Fern (Dipteris shenzhenensis)

Shu, Jiangping; Zhang, Yongxia; Huang, Tengbo; Yan, Yuehong

doi:10.1038/s41597-025-04812-4

Data Descriptor
Open access
Published: 21 March 2025

Jiangping Shu^1,2,3,
Yongxia Zhang¹,
Tengbo Huang ORCID: orcid.org/0000-0002-4762-9653¹ &
…
Yuehong Yan³

Scientific Data volume 12, Article number: 475 (2025)

This article has been updated

Abstract

Dipteris is a relict plant genus and an important indicator of global climate warming and plant geography during the Mesozoic era. However, the lack of genomic resources has hindered the study of the paleoclimate, systematic evolution, and medicinal value of this genus. Here, we sequenced and assembled the first chromosome-level genome of Dipteris shenzhenensis. The assembled genome was 1.9 Gb with a contig N50 length of 4.75 Mb, a GC content of 42.28% and a BUSCO value of 98.3%, and 98.37% of the assembled sequences were anchored onto 33 pseudochromosomes. Repetitive sequences were predicted to make up 71.97% of the genome, and 45 telomeres were identified, with 15 pseudochromosomes carrying telomeres at both ends. A total of 26,471 protein-coding genes were predicted, of which 24,485 (92.5%) were functionally annotated. This first high-quality genome of Dipteris will provide important genomic resources for understanding the systematic evolution, paleoclimate and medicinal value of ferns.

Background & Summary

Dipteris, commonly known as Broad-Leaf Fern, is an early-diverging genus of leptosporangiate ferns with only eight species worldwide^1,2. It has a restricted distribution in the Indo-Malay archipelago, including northeastern India, southern China, and the southern Ryukyu Islands to northeastern Queensland and the Fiji Islands³. In contrast to the extant taxa, fossils of Dipteris are extremely abundant and widely distributed throughout the world, and they are important indicators of global climate warming and plant geography during the Mesozoic era^4,5. Furthermore, Dipteris is a key transitional group in the evolution of the “sporangial annulus”, a key morphological trait, from horizontal to vertical⁶, and it is one of the most controversial branches in the fern phylogeny^7,8. Importantly, the rhizomes of Dipteris plants can be used to treat edema, kidney deficiency, low back pain and other diseases⁹, and plant extracts also have antioxidant, antibacterial, cholesterol-degrading and anti-lipolytic activities^9,10 and show potential for treating Alzheimer’s disease¹¹. However, the lack of genomic resources has hindered the study of the systematic evolution, paleoclimate, paleogeology, and ornamental and medicinal value of this genus.

Dipteris shenzhenensis is a critically endangered plant endemic to China¹², and a peculiar and beautiful plant whose leaves are split into two fan shapes (Fig. 1A). Its chromosome number is 2n = 2x = 66 according to the Chromosome Counts Database (CCDB, https://ccdb.tau.ac.il)¹³, and its genome size was estimated as 2.14 Gb by flow cytometry (Fig. 1B) and 1.94 Gb by genome survey (Fig. 1C). In this study, we sequenced and assembled its chromosome-level genome using Illumina short-read sequencing (56× according to the genome survey), PacBio single-molecule real-time (SMRT) long-read sequencing (35×) and high-throughput chromosome conformation capture (Hi-C) technology (134×) (Table 1). The assembled genome was 1.9 Gb with a contig N50 length of 4.75 Mb and a GC content of 42.28% (Table 2). Of the assembled sequences, 98.37% were anchored onto 33 pseudochromosomes (Figs. 1D, 2), and 1.37 Gb (71.97%) of the genome was predicted to be repetitive, including 699.52 Mb (36.82%) of LTR retrotransposons and 424.14 Mb (22.33%) of DNA transposons (Table 3). LTR insertions mainly occurred about 0.24 million years ago (MYA). In total, 45 telomeres were identified on the 33 pseudochromosomes: 15 pseudochromosomes had telomeres at both ends, 15 had only one telomere, and 3 had no identifiable telomere (Table 2). A total of 26,471 protein-coding genes were predicted with an average CDS length of 1,164 bp, and 24,485 (92.5%) of them could be functionally annotated. In addition, 11,215 non-coding RNAs were identified in the genome, including 5,063 miRNAs, 4,700 tRNAs, 580 rRNAs and 872 snRNAs (Table 3). This first high-quality genome of Dipteris will be of great significance for studies of plant evolution, paleoclimate and paleogeology since the Mesozoic era, and provides important genomic resources for understanding the systematic evolution and the ornamental and medicinal value of ferns.

Table 1 The information of Dipteris shenzhenensis genome sequencing.

Table 2 The information of genome assembly and estimation of Dipteris shenzhenensis.

Table 3 The information of Dipteris shenzhenensis genome annotation.

Methods

Plant materials and genome sequencing

Fresh leaves were collected from a mature plant of D. shenzhenensis (voucher specimen number: YYH24624) at the China National Orchid Conservation Center (CNOCC), Shenzhen, China, and were sent to Novogene Co., Ltd. (Tianjin, China) for genome sequencing. DNA was extracted using a modified cetyltrimethylammonium bromide (CTAB) protocol. Short-read sequencing libraries with an insert size of 350 bp were pooled and sequenced on Illumina HiSeq platforms with a PE150 strategy. After quality control and filtering, 108.72 Gb of Illumina short reads (56×) and 260.68 Gb of Hi-C reads (134×) were generated. PacBio long-read sequencing libraries with fragment sizes of 15–18 kb were sequenced on PacBio Sequel II/IIe platforms in circular consensus sequencing (CCS) mode, and 68.31 Gb of HiFi reads were obtained (Table 1).
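
The fold-coverage values quoted for each library can be reproduced by dividing the data yield by the genome size. The following is a minimal sketch, assuming the 1.94 Gb genome survey estimate is the denominator for all three libraries (the text states this explicitly only for the Illumina reads); the function and variable names are illustrative.

```python
def sequencing_depth(yield_gb, genome_size_gb):
    """Return mean sequencing depth (fold coverage) = total bases / genome size."""
    if genome_size_gb <= 0:
        raise ValueError("genome size must be positive")
    return yield_gb / genome_size_gb


# Assumption: the k-mer (genome survey) estimate is used as the genome size.
GENOME_SIZE_GB = 1.94

# Data yields reported in the text, in Gb.
yields_gb = {"Illumina": 108.72, "Hi-C": 260.68, "PacBio HiFi": 68.31}

depths = {name: sequencing_depth(gb, GENOME_SIZE_GB) for name, gb in yields_gb.items()}

for name, depth in depths.items():
    print(f"{name}: {depth:.0f}x")
```

Under this assumption the rounded depths agree with the reported 56×, 134× and 35×.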

Genome size estimation

The genome size of D. shenzhenensis was estimated by flow cytometry (BD FACSCalibur) and k-mer analysis. For flow cytometry, Solanum lycopersicum L. (1C = 0.9 Gb) was used as the internal reference, the coefficient of variation (CV%) was kept within 5%, and Modifit v3.0 was used to calculate the ratio and plot the histogram. Flow cytometry gave a genome size of 2.14 Gb (Fig. 1B, Table 2). After high-quality Illumina HiSeq sequencing data (108.72 Gb) were obtained, k-mer analysis was conducted with Jellyfish v2.3.0¹⁴, and the 17-mer spectrum was fitted using GenomeScope¹⁵, which indicated a genome size of 1.94 Gb (Fig. 1C).
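
The idea behind k-mer based size estimation can be illustrated with a simplified calculation: the total number of k-mers divided by the depth of the main peak in the k-mer spectrum. This is only a sketch of the principle and is not the model fitted by GenomeScope; the histogram values and the error cutoff below are toy assumptions, not data from this study.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate haploid genome size as (total k-mers) / (main peak depth).

    histogram maps k-mer depth -> number of distinct k-mers seen at that depth.
    Depths below min_depth are treated as sequencing errors and ignored
    (min_depth is an illustrative cutoff, not a value from the paper).
    """
    kept = {depth: count for depth, count in histogram.items() if depth >= min_depth}
    if not kept:
        raise ValueError("no k-mers above the error cutoff")
    peak_depth = max(kept, key=kept.get)  # depth with the most distinct k-mers
    total_kmers = sum(depth * count for depth, count in kept.items())
    return total_kmers / peak_depth


# Toy histogram: an error spike at depths 1-2 and a single peak at depth 50.
toy_hist = {
    1: 900_000, 2: 100_000,
    48: 10_000, 49: 30_000, 50: 40_000, 51: 30_000, 52: 10_000,
}

toy_size = estimate_genome_size(toy_hist)
print(f"estimated size: {toy_size:.0f} bp")
```

Model-based tools additionally account for heterozygosity, repeats and error k-mers, which is why a fitted estimate is preferred over this simple ratio for real data.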

Genome assembly and annotation

The raw data were split at the junctions, and the junction sequences were filtered out to obtain subreads with minimum length = 50. High-quality HiFi reads were then obtained with the ccs software (https://github.com/PacificBiosciences/ccs) using the criteria min-passes = 3 and min-rq = 0.99. The HiFi reads that passed quality control were assembled using Hifiasm¹⁶, and the resulting contigs were combined with the Hi-C data for chromosome clustering, orientation and ordering using ALLHiC v0.9.8¹⁷ (parameters: enz = DpnII, CLUSTER = n). Juicebox was then used for manual correction based on chromosome interaction strength to obtain the chromosome-level genome. In total, 98.37% of the assembled genome (1.87 Gb) was anchored onto 33 pseudochromosomes. The completeness of the genome assembly was evaluated with BUSCO v5.2.2¹⁸ using the viridiplantae_odb10 database and with OMArk¹⁹ using the Viridiplantae.h5 database, and QV scores were calculated with MERQURY v1.3²⁰ to measure assembly accuracy.
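
Contig N50, the continuity statistic reported for this assembly, is the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. A minimal sketch with toy contig lengths (not taken from the assembly) is given below.

```python
def n50(lengths):
    """Return N50: the length L such that sequences of length >= L
    together cover at least half of the total length."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy contig lengths for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # total is 300; 80 + 70 reaches 150, so N50 is 70
```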

Tandem repeats in the genome were predicted de novo using TRF v4.09.1²¹. LTR_FINDER v1.07²², RepeatScout v1.0.5²³ and RepeatModeler v2.0.3²⁴ were then used to predict the repeat sequences of the D. shenzhenensis genome, and sequences with a length of less than 100 bp and an unknown base (N) content greater than 5% were filtered out to construct the unique repeat database. The UCLUST method in USEARCH v10²⁵ was used to merge this database with the Repbase database²⁶ into a non-redundant repeat sequence database, and RepeatMasker v4.1.2²⁷ was used to predict the repeats in the genome based on homologous sequence alignment.
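
The length and N-content filter applied to the de novo repeat library can be sketched as follows. This is an illustration only: it assumes a sequence is discarded if it is shorter than 100 bp or if more than 5% of its bases are N (the text does not say whether both conditions must hold), and the function names and toy sequences are hypothetical.

```python
def keep_repeat(seq, min_len=100, max_n_frac=0.05):
    """Return True if a candidate repeat passes the length and N-content filter.

    Assumption: a sequence is dropped when EITHER it is shorter than min_len
    OR its fraction of unknown bases (N) exceeds max_n_frac.
    """
    seq = seq.upper()
    if not seq or len(seq) < min_len:
        return False
    return seq.count("N") / len(seq) <= max_n_frac


def filter_library(records):
    """Filter a {name: sequence} mapping, keeping only passing sequences."""
    return {name: seq for name, seq in records.items() if keep_repeat(seq)}


# Toy sequences for illustration only.
toy_library = {
    "short": "ACGT" * 10,               # 40 bp, too short
    "clean": "ACGT" * 30,               # 120 bp, no N
    "gappy": "ACGT" * 25 + "N" * 20,    # 120 bp, about 16.7% N
}

kept = filter_library(toy_library)
print(sorted(kept))
```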

De novo prediction of gene structure was performed with Augustus v2.5.5²⁸, GlimmerHMM v3.0.4²⁹, SNAP³⁰, Geneid v1.4.4³¹ and GENSCAN³² based on statistical characteristics of the genome sequence, such as codon frequency and exon and intron distribution. BLAST v2.2.26³³ was used to align the D. shenzhenensis genome against a homologous gene dataset constructed from the protein-coding sequences of Alsophila spinulosa (Figshare, 19075346.v6), Ceratopteris richardii (Phytozome v13, C.richardii v2.1), Adiantum capillus-veneris (Figshare, 24619215.v1), Salvinia cucullata (FernBase, Salvinia_asm_v1.2), and Arabidopsis thaliana (NCBI, TAIR10.1). The protein-coding sequences of the D. shenzhenensis genome were then predicted with GeneWise v2.4.1³⁴. To further optimize the structural annotation, transcriptome data from different tissues (root, bulb, and leaf) were aligned to the genome sequence using HISAT2 v2.2.1³⁵ to identify exon regions and splice sites. Based on the alignment results, transcripts were assembled using StringTie v1.3.3b³⁶, and gene prediction was performed using PASA v2.5.2³⁷. Finally, EvidenceModeler v1.1.1³⁸ was used to combine the three gene datasets with weights (TRANSCRIPT: 50, PROTEIN: 20, ABINITIO PREDICTION: 2) to obtain the final non-redundant gene set. InterProScan v5.54-87.0³⁹ was used to annotate the conserved motifs and domains of the proteins and to obtain the GO terms of each gene. The gene set was compared with the KEGG database (https://www.genome.jp/kegg) to annotate the metabolic pathways of each gene. Transcription factors were predicted with iTAK v1.7⁴⁰.

Telomeres were identified with the TeloExplorer module of quarTeT v1.2.5⁴¹ using the parameter “-c plant”. EDTA v2.2.2⁴² was used to estimate the LTR insertion time with the parameters “--anno 1 --u 6.5e-9 --force 1 --sensitive 1”, and the LTR Assembly Index (LAI)⁴³ was calculated with the LAI program using the parameters “-genome genome.fa -intact genome.fa.mod.pass.list -all genome.fa.mod.out”.
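
LTR insertion times are commonly derived from the sequence divergence between the two LTRs of an intact element as T = K / (2μ), where K is the corrected divergence and μ is the neutral mutation rate. The sketch below uses the Jukes-Cantor correction and the rate of 6.5e-9 substitutions per site per year given in the parameters above. That EDTA applies exactly this correction internally is an assumption here, and the divergence value is a hypothetical example rather than a measurement from this genome.

```python
import math

# Mutation rate from the EDTA parameters quoted in the text
# (substitutions per site per year).
MUTATION_RATE = 6.5e-9


def jukes_cantor_distance(p):
    """Correct an observed proportion of differing sites p for multiple hits."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the Jukes-Cantor model")
    return -0.75 * math.log(1 - 4 * p / 3)


def insertion_time_years(p, mu=MUTATION_RATE):
    """Insertion time T = K / (2 * mu): both LTRs accumulate changes independently."""
    return jukes_cantor_distance(p) / (2 * mu)


# Hypothetical divergence between the two LTRs of one element.
toy_p = 0.0031
age_years = insertion_time_years(toy_p)
print(f"{age_years / 1e6:.2f} MYA")
```

With this rate, a divergence of roughly 0.3% between the paired LTRs corresponds to an insertion about 0.24 million years ago.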

Data Records

The whole-genome sequencing datasets have been deposited in the Genome Sequence Archive⁴⁴ (GSA) at the National Genomics Data Center⁴⁵ (NGDC), China National Center for Bioinformation/Beijing Institute of Genomics, Chinese Academy of Sciences. The raw Illumina, PacBio HiFi and Hi-C reads are available under the GSA accession numbers CRA020015⁴⁶, CRA019940⁴⁷ and CRA019992⁴⁸, respectively, which correspond to the BioProject accession number PRJCA031597⁴⁹. The genome assembly has been deposited at DDBJ/ENA/GenBank under the accession JBLQTB000000000⁵⁰. The genome assembly and annotation files have been deposited in Figshare⁵¹.

Technical Validation

The sequencing depth was sufficient, with 108.72 Gb of Illumina short reads (56×), 260.68 Gb of Hi-C reads (134×) and 68.31 Gb of PacBio HiFi reads (35×). The genome assembly was evaluated in three ways: contig N50 for continuity (4.75 Mb); the QV score calculated by MERQURY v1.3²⁰ for sequence accuracy (QV = 37.986, above the QV = 30 level that corresponds to 99.9% accuracy); and the mapping rate of Illumina paired-end reads for consistency with the raw data (99.1%). The completeness and consistency of the genome assembly were estimated with BUSCO¹⁸ using the viridiplantae_odb10 database and with OMArk¹⁹ using the Viridiplantae.h5 database. In total, 98.3% of BUSCOs and 92.58% of conserved HOGs were present in the D. shenzhenensis genome, and only 1.7% of BUSCOs were missing. In addition, 65.03% of gene families were consistent with the known gene families of the Viridiplantae.h5 database, and no contamination was detected among the protein-coding genes. The LAI value was 12.16, and 45 telomeres were identified on the 33 pseudochromosomes, with 15 pseudochromosomes carrying telomeres at both ends.
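
The relationship between a QV score and base accuracy follows the Phred scale, in which the per-base error probability is 10^(-QV/10). The short sketch below converts the two QV values mentioned above into accuracies; the function names are illustrative.

```python
def qv_to_error_rate(qv):
    """Per-base error probability implied by a Phred-scaled quality value."""
    return 10 ** (-qv / 10)


def qv_to_accuracy(qv):
    """Base accuracy implied by a Phred-scaled quality value."""
    return 1.0 - qv_to_error_rate(qv)


# QV = 30 is the 99.9% reference level; QV = 37.986 is the value reported
# for this assembly.
for qv in (30.0, 37.986):
    print(f"QV {qv}: accuracy = {qv_to_accuracy(qv):.4%}")
```

A QV of 37.986 therefore implies fewer than two errors per 10,000 bases, compared with one error per 1,000 bases at QV = 30.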

Code availability

The study used publicly available software, and the parameters are explicitly described in the Methods section and Supplementary Table 1. No custom scripts or code were used.

Change history

23 April 2025
In this article the following affiliation was missing for Jiangping Shu, ‘College of Physics and Optoelectronic Engineering, Shenzhen University, Shenzhen, 518060, China’. The original article has been corrected.

References

Hassler, M. World Ferns. Synonymic Checklist and Distribution of Ferns and Lycophytes of the World. www.worldplants.de/ferns/ (1994-2025).
PPG I. A community-derived classification for extant lycophytes and ferns. J. Syst. Evol. 54, 563–603, https://doi.org/10.1111/jse.12229 (2016).
Zhang, X., Kato, M. & Nooteboom, H. Dipteridaceae. in Flora of China vols 2–3 116–117 (Beijing: Science Press; St. Louis: Missouri Botanical Garden Press, 2013).
Choo, T. & Escapa, I. Assessing the evolutionary history of the fern family Dipteridaceae (Gleicheniales) by incorporating both extant and extinct members in a combined phylogenetic study. Am. J. Bot. 105, 1315–1328, https://doi.org/10.1002/ajb2.1121 (2018).
Zhou, N., Wang, Y., Li, L. & Zhang, X. Diversity variation and tempo-spatial distributions of the Dipteridaceae ferns in the Mesozoic of China. Palaeoworld 25, 263–286, https://doi.org/10.1016/j.palwor.2015.11.008 (2016).
Shen, H. et al. Large-scale phylogenomic analysis resolves a backbone phylogeny in ferns. GigaScience 7, gix116, https://doi.org/10.1093/gigascience/gix116 (2018).
Nitta, J., Schuettpelz, E., Ramírez-Barahona, S. & Iwasaki, W. An open and continuously updated fern tree of life. Front. Plant Sci. 13, 909768, https://doi.org/10.3389/fpls.2022.909768 (2022).
Shu, J. et al. Phylogenomic Analysis Reconstructed the Order Matoniales from Paleopolyploidy Veil. Plants 11, 1529, https://doi.org/10.3390/plants11121529 (2022).
Wang, K. et al. A New ent-Kaurane Diterpenoid from the Aerial of Dipteris chinensis (Dipteridaceae). Acta Botanica Yunnanica 31, 279–283 (2009).
Jarial, R., Singh, L. & Thakur, S. Applications of fern Dipteris conjugata in anti-bacterial and anti-lipolytic purpose. in Proceedings of the National Conference for Postgraduate Research (NCON-PGR 2016). Universiti Malaysia Pahang (UMP), Pekan, Pahang 24–25 (2016).
Chetia, P., Mazumder, M., Mahanta, S., De, B. & Dutta, C. A novel phytochemical from Dipteris wallichii inhibits human β-secretase 1: Implications for the treatment of Alzheimer’s disease. Med. Hypotheses 143, 109839, https://doi.org/10.1016/j.mehy.2020.109839 (2020).
Wei, Z. et al. Dipteris shenzhenensis, a new endangered species of Dipteridaceae from Shenzhen, southern China. PhytoKeys 186, 111–120, https://doi.org/10.3897/phytokeys.186.73739 (2021).
Rice, A. et al. The Chromosome Counts Database (CCDB) – a community resource of plant chromosome numbers. New Phytol. 206, 19–26, https://doi.org/10.1111/nph.13191 (2015).
Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770, https://doi.org/10.1093/bioinformatics/btr011 (2011).
Ranallo-Benavidez, T., Jaron, K. & Schatz, M. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat. Commun. 11, 1432, https://doi.org/10.1038/s41467-020-14998-3 (2020).
Cheng, H., Concepcion, G., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175, https://doi.org/10.1038/s41592-020-01056-5 (2021).
Zhang, X., Zhang, S., Zhao, Q., Ming, R. & Tang, H. Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data. Nat. Plants 5, 833–845, https://doi.org/10.1038/s41477-019-0487-8 (2019).
Manni, M., Berkeley, M., Seppey, M., Simão, F. & Zdobnov, E. BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes. Mol. Biol. Evol. 38, 4647–4654, https://doi.org/10.1093/molbev/msab199 (2021).
Nevers, Y. et al. Quality assessment of gene repertoire annotations with OMArk. Nat. Biotechnol. 1–10, https://doi.org/10.1038/s41587-024-02147-w (2024).
Rhie, A., Walenz, B., Koren, S. & Phillippy, A. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 245, https://doi.org/10.1186/s13059-020-02134-9 (2020).
Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580, https://doi.org/10.1093/nar/27.2.573 (1999).
Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–268, https://doi.org/10.1093/nar/gkm286 (2007).
Price, A., Jones, N. & Pevzner, P. De novo identification of repeat families in large genomes. Bioinformatics 21, i351–358, https://doi.org/10.1093/bioinformatics/bti1018 (2005).
Flynn, J. et al. RepeatModeler2 for automated genomic discovery of transposable element families. PNAS 117, 9451–9457, https://doi.org/10.1073/pnas.1921046117 (2020).
Edgar, R. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461, https://doi.org/10.1093/bioinformatics/btq461 (2010).
Bao, W., Kojima, K. & Kohany, O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mobile DNA 6, 11, https://doi.org/10.1186/s13100-015-0041-9 (2015).
Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinform. 25, p4.10.1–4.10.14, https://doi.org/10.1002/0471250953.bi0410s25 (2009).
Stanke, M. & Morgenstern, B. AUGUSTUS: a web server for gene prediction in eukaryotes that allows user-defined constraints. Nucleic Acids Res. 33, W465–467, https://doi.org/10.1093/nar/gki458 (2005).
Majoros, W., Pertea, M. & Salzberg, S. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics 20, 2878–2879, https://doi.org/10.1093/bioinformatics/bth315 (2004).
Korf, I. Gene finding in novel genomes. BMC Bioinform. 5, 59, https://doi.org/10.1186/1471-2105-5-59 (2004).
Blanco, E., Parra, G. & Guigó, R. Using geneid to identify genes. Curr. Protoc. Bioinform. 18, p4.3.1–4.3.28, https://doi.org/10.1002/0471250953.bi0403s18 (2007).
Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94, https://doi.org/10.1006/jmbi.1997.0951 (1997).
Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421, https://doi.org/10.1186/1471-2105-10-421 (2009).
Birney, E., Clamp, M. & Durbin, R. GeneWise and Genomewise. Genome Res. 14, 988–995, https://doi.org/10.1101/gr.1865504 (2004).
Kim, D., Paggi, J., Park, C., Bennett, C. & Salzberg, S. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915, https://doi.org/10.1038/s41587-019-0201-4 (2019).
Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295, https://doi.org/10.1038/nbt.3122 (2015).
Haas, B. et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 31, 5654–5666, https://doi.org/10.1093/nar/gkg770 (2003).
Haas, B. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 9, R7, https://doi.org/10.1186/gb-2008-9-1-r7 (2008).
Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240, https://doi.org/10.1093/bioinformatics/btu031 (2014).
Zheng, Y. et al. iTAK: A Program for Genome-wide Prediction and Classification of Plant Transcription Factors, Transcriptional Regulators, and Protein Kinases. Mol. Plant 9, 1667–1670, https://doi.org/10.1016/j.molp.2016.09.014 (2016).
Lin, Y. et al. quarTeT: a telomere-to-telomere toolkit for gap-free genome assembly and centromeric repeat identification. Hortic. Res. 10, uhad127, https://doi.org/10.1093/hr/uhad127 (2023).
Ou, S. et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 20, 275, https://doi.org/10.1186/s13059-019-1905-y (2019).
Ou, S., Chen, J. & Jiang, N. Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res. 46, e126, https://doi.org/10.1093/nar/gky730 (2018).
Chen, T. et al. The Genome Sequence Archive Family: Toward Explosive Data Growth and Diverse Data Types. Genom. Proteom. Bioinform. 19, 578–583, https://doi.org/10.1016/j.gpb.2021.08.001 (2021).
CNCB-NGDC Members and Partners. Database Resources of the National Genomics Data Center, China National Center for Bioinformation in 2022. Nucleic Acids Res. 50, D27–D38, https://doi.org/10.1093/nar/gkab951 (2022).
NGDC Genome Sequence Archive. https://ngdc.cncb.ac.cn/gsa/browse/CRA020015 (2024).
NGDC Genome Sequence Archive. https://ngdc.cncb.ac.cn/gsa/browse/CRA019940 (2024).
NGDC Genome Sequence Archive. https://ngdc.cncb.ac.cn/gsa/browse/CRA019992 (2024).
NGDC BioProject. https://ngdc.cncb.ac.cn/bioproject/browse/PRJCA031597 (2024).
NCBI GenBank. https://identifiers.org/ncbi/insdc/JBLQTB000000000 (2025).
Shu, J. The genome assembly and annotation of Dipteris shenzhenensis. figshare https://doi.org/10.6084/m9.figshare.27419877.v1 (2024).

Acknowledgements

The authors are very grateful to Minyu Li, Xueying Wei and Juan Li of the Orchid Conservation & Research Center of Shenzhen for their assistance during sample collection. We thank the Instrumental Analysis Center of Shenzhen University (Lihu Campus) and the Central Research Facilities of the College of Life Sciences and Oceanography for helping us with the experiments. Y.Y. was supported by the National Natural Science Foundation of China (32170216) and the Shenzhen Fundamental Research Program (JCYJ20220818103212025), J.S. was supported by the Wild Plant Conservation and Management Program of the National Forestry and Grassland Administration (2023070302), and T.H. was supported by the Guangdong Basic and Applied Basic Research Foundation (2023A1515010498).

Author information

Authors and Affiliations

Guangdong Provincial Key Laboratory for Plant Epigenetics, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, 518061, China
Jiangping Shu, Yongxia Zhang & Tengbo Huang
College of Physics and Optoelectronic Engineering, Shenzhen University, Shenzhen, 518060, China
Jiangping Shu
Key Laboratory of National Forestry and Grassland Administration for Orchid Conservation and Utilization, the Orchid Conservation & Research Center of Shenzhen, Shenzhen, 518114, China
Jiangping Shu & Yuehong Yan

Authors

Jiangping Shu
Yongxia Zhang
Tengbo Huang
Yuehong Yan

Contributions

J.S., T.H. and Y.Y. developed the idea and designed the experiment; J.S. collected the plant materials; J.S., Y.Z., T.H. and Y.Y. performed the analyses; J.S. interpreted the results and wrote the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Tengbo Huang or Yuehong Yan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Table 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Shu, J., Zhang, Y., Huang, T. et al. The chromosome-level genome assembly of Broad-Leaf Fern (Dipteris shenzhenensis). Sci Data 12, 475 (2025). https://doi.org/10.1038/s41597-025-04812-4

Received: 11 November 2024
Accepted: 12 March 2025
Published: 21 March 2025
Version of record: 21 March 2025
DOI: https://doi.org/10.1038/s41597-025-04812-4