Chromosome-level Genome Assembly of the Halophytic Turfgrass Zoysia macrostachya

Park, Hyeonseon; Bae, Eunji; Jung, Jae Gyeong; Choi, Bae Young; Lee, Tae-Ho; Shim, Donghwan

doi:10.1038/s41597-025-05702-5

Download PDF

Data Descriptor
Open access
Published: 31 July 2025

Chromosome-level Genome Assembly of the Halophytic Turfgrass Zoysia macrostachya

Hyeonseon Park¹^na1,
Eunji Bae²^na1,
Jae Gyeong Jung¹,
Bae Young Choi³,
Tae-Ho Lee⁴ &
…
Donghwan Shim ORCID: orcid.org/0000-0002-0223-595X^1,5

Scientific Data volume 12, Article number: 1335 (2025) Cite this article

2152 Accesses
1 Altmetric
Metrics details

Subjects

Abstract

Zoysia macrostachya Franch. & Sav. is a halophytic perennial turfgrass in the Poaceae family, commonly found in the coastal regions of Korea, Japan, and East Asia. Z. macrostachya thrives in high-salinity environments, making it an excellent model for studying abiotic stress resilience. In this study, we present a chromosome-level genome assembly of Z. macrostachya, constructed using Oxford Nanopore long reads, Illumina short reads, and Omni-C sequencing data. The assembly spans 329.78 Mb across 20 chromosomes, with a scaffold N50 of 19.24 Mb, and includes complete telomeric sequences at both ends. The assembly showed 97.8% complete BUSCOs, indicating high genome completeness. Repeat element and gene annotation identified 44.03% of the genome as repetitive elements and 33,474 protein-coding genes. The gene annotation showed 97.1% complete BUSCOs and 86.92% functionally characterized genes. Macrosynteny analysis highlighted highly collinear relationships with related species, providing a foundational understanding of the Z. macrostachya genomic structure. This high-quality genome serves as a valuable resource for advancing salinity tolerance research and improving the genetic diversity of Zoysia species.

Chromosome-level genome assembly and annotation of Zicaitai (Brassica rapa var. purpuraria)

Article Open access 03 November 2023

Chromosome-scale genome assembly of Zoysia japonica uncovers cold tolerance candidate genes

Article Open access 03 April 2025

Haplotype-resolved chromosome-level genome assembly of creeping bentgrass, Agrostis stolonifera

Article Open access 16 January 2026

Background & Summary

Zoysia macrostachya Franch. & Sav. is a perennial grass species in the Poaceae family and the Chloridoideae subfamily¹. Zoysia is a genus of warm-season grasses widely cultivated for turf applications, including golf courses, residential lawns, and commercial landscapes^2,3. As of 2015, an estimated 10,375 ha of zoysiagrass were planted in the US, with 78% of it used on golf courses in the transition zone and 17% in the Southeast region⁴. This species is native to the coastal regions of Korea, Japan, and East Asia, where it thrives in saline environments⁵. Like other species in the Zoysia genus, Z. macrostachya exhibits traits of a recretohalophyte, featuring specialized salt glands on its leaves that enable salt excretion under saline conditions^6,7,8. This adaptation, common in halophytes, allows the species to thrive in challenging habitats with high salinity. Morphologically, it is a predominantly littoral species growing up to 21 cm tall⁹. These characteristics make it an excellent model for exploring plant resilience to abiotic stresses.

Salinity stress is a critical challenge in agriculture, as it significantly reduces plant productivity and limits species distribution by inducing osmotic imbalance, ion toxicity, and oxidative stress¹⁰. Studies on the Zoysia genus have revealed diverse mechanisms of salinity tolerance^11,12,13,14. For example, Z. japonica exhibits adaptive traits such as the exclusion and isolation of sodium ions from sensitive tissues and the effective compartmentalization of potassium ions¹⁵. Z. macrostachya exhibits enhanced tolerance, as shown by its lower membrane conductivity and reduced malondialdehyde accumulation under stress¹⁶. Furthermore, this species increases proline and soluble sugar levels while boosting peroxidase (POD) activity, further enhancing its resistance to salinity. The basis of this salinity tolerance may lie in species-specific structural variations and sequence variations, underscoring the importance of genomic studies to further explore these adaptations.

Advances in genomic studies have illuminated the evolutionary and adaptive mechanisms of Zoysia species. Notably, Z. japonica is the only Zoysia species with a chromosome-level reference genome, consisting of a 334 Mb draft assembly and 59,271 predicted protein-coding genes¹⁷. Genomic synteny analyses suggest that the Zoysia genus underwent a species-specific whole-genome duplication (WGD) event approximately 20.8 Mya, which contributed to enhanced salt tolerance through expanded cytochrome P450 and ABA biosynthetic gene families¹⁸. While these findings provide insights into genetic diversity and salinity tolerance, the genome of Z. macrostachya has the potential to offer a deeper understanding of the ecological and physiological adaptations to salinity within the genus.

In this study, we present a chromosome-level genome assembly for Zoysia macrostachya constructed using Oxford Nanopore long-reads, Illumina short-reads, and Omni-C sequencing data. Z. macrostachya has a tetraploid karyotype with 2n = 4x = 40 chromosomes^1,19. The assembly spans 329.78 Mb across 20 haploid chromosomes, with a scaffold N50 of 19.24 Mb and complete telomeric sequences at both ends. Genome validation showed 97.8% complete BUSCOs, highlighting the high quality of genome assembly. Additionally, gene annotation was validated with BUSCO, identifying 4,754 (97.1%) complete BUSCOs. A total of 33,474 protein-coding genes were annotated, with 86.92% functionally characterized using Swiss-Prot and eggNOG databases. Macrosynteny analysis revealed high gene-based synteny with related species, reflecting the evolutionary genome structure dynamics within Chloridoideae species. This high-quality genome establishes a valuable resource for future research on abiotic stress resilience and evolutionary genomics within the Zoysia genus.

Methods

Plant materials and sequencing

Z. macrostachya (ZN3169) was used in this study, collected from Seonyu Island, Gunsan-si, Jeonbuk-do, Republic of Korea (Fig. 1a). Genomic DNA was isolated from leaf samples using the SmartGene Plant DNA Extraction Kit II (SmartGene, Daejeon, Korea), following the manufacturer’s instructions. A 151 bp paired-end library was prepared using the xGen DNA Library Prep Kit for genomic DNA sequencing. Sequencing on the Illumina NovaSeq. 6000 platform generated 20.40 Gb of raw data (Table S1). This data was utilized for genome size estimation and heterozygosity analysis (Fig. 1b). For Oxford Nanopore long-read sequencing, a genomic DNA library was prepared using the ONT Ligation Sequencing Kit (SQK-LSK110) following the manufacturer’s guidelines. Sequencing was conducted on a MinION device with R9.4.1 flow cells, producing 30.48 Gb (92.32×) (Table S1). For Omni-C library sequencing, the Dovetail Omni-C Kit (Dovetail Genomics, USA) was utilized following the provided instructions. Approximately 300 mg of Z. macrostachya leaf was ground into a fine powder under liquid nitrogen. The powdered tissue was crosslinked with 37% formaldehyde and incubated for 10 minutes at room temperature. Chromatin was digested in situ using DNase supplied in the kit, followed by end-repair, ligation, and purification of the resulting DNA fragments. The ligated DNA underwent proximity ligation to form long-range interactions and was subsequently reverse crosslinked to release the DNA. The extracted DNA was then purified and quantified with a Qubit Fluorometer (Thermo Fisher Scientific, USA). The Omni-C libraries were constructed using the Dovetail™ Library Module for Illumina (Dovetail Cat. No. 21004) and indexed with the Dovetail Dual Index Primer Set for Illumina (Dovetail Cat. No. 25005), with amplification performed according to the manufacturer’s guidelines. Library quality and quantity were evaluated using a Qubit 4.0 Fluorometer (Invitrogen Ltd, Paisley, UK) and a 4150 TapeStation (Agilent, Santa Clara, CA, USA), respectively. The library was sequenced on an Illumina NovaSeq 6000 platform (Illumina, San Diego, CA, USA) using paired-end reads of 151 bp, generating 27.43 Gb (83.07×) (Table S1). For gene annotation analysis, total RNA was extracted from shoot and root samples using the SmartGene Plant RNA Extraction Kit (SmartGene, Daejeon, Korea) following the manufacturer’s instructions. To prepare for mRNA sequencing, mRNA was isolated using the Poly(A) RNA Selection Kit (Lexogen, Vienna, Austria). The selected mRNA was subsequently used to construct 151 bp paired-end libraries with the xGen™ RNA Lib Prep Kit (Integrated DNA Technologies, Coralville, LA, USA. Sequencing of the RNA libraries was conducted on the Illumina NovaSeq 6000 platform, producing 8.88 Gb of raw data with 151 bp paired-end reads (Table S1).

Genome assembly, and chromosome-level scaffolding

A total of 67,546,030 Illumina paired-end reads were cleaned using Trimmomatic v0.39²⁰. The parameters included removal of adapter sequences (ILLUMINACLIP:TruSeq. 3-PE.fa:2:30:10:2:keepBothReads), trimming of leading and trailing low-quality bases (Phred score < 10; LEADING:10, TRAILING:10), and removal of reads shorter than 50 bp (MINLEN:50). K-mer frequency analysis was performed using Jellyfish v2.3.0²¹ with 21-mers to facilitate genome size and heterozygosity rate estimation for the Z. macrostachya genome. Subsequently, GenomeScope 2.0²² was employed with the options -k 21 -p 2 to analyze genome size and heterozygosity, including sub-genomes. The k-mer distribution exhibited two peaks, predicting a genome size of 324,246,238 bp and a heterozygosity rate of 2.58% (Fig. 1b).

The genome of Z. macrostachya was de novo assembled using NextDeNovo v2.5.0²³ with Oxford Nanopore long reads exceeding 1 kb in length. Minimap2 with a k-mer size of 19 was employed during consensus building to balance sensitivity and specificity in sequence alignments. Genome polishing was subsequently conducted using NextPolish v1.4.1²⁴, employing DNA-seq data from Illumina NovaSeq, comprising 135,092,060 reads with a total yield of 20.40 Gb, achieving a coverage depth of approximately 61.79X. Polishing was carried out over three iterative rounds, with BWA-MEM utilized for alignment of the short-read data. This initial assembly consisted of 95 contigs with a total length of 340,935,610 Mb and a contig N50 of 7.50 Mb (Table 1). To refine the assembly and remove haplotypic duplications, contig overlaps were resolved using Purge_Dups²⁵, which resulted in a purged assembly of 69 contigs. This refined draft genome had a total length of 330,146,791 Mb and an improved contig N50 of 7.56 Mb

Table 1 Genome assembly statistics of Z. macrostachya.

Full size table

For chromosome-level genome scaffolding, Omni-C reads were aligned to the draft genome using BWA v0.7.17-r1188²⁶ with the -5SP option. Alignments were filtered with MAPQ ≥ 1 (mapping quality ≥ 1) and NM ≤ 3. The filtered Omni-C reads were then processed using the HapHiC v1.0.6²⁷ pipeline with the options–correct_nrounds 2,–max_inflation 5, and a target chromosome count of 20 for scaffolding analysis (Fig. 2a). This step successfully anchored the 69 contigs into 20 scaffolds, dramatically increasing contiguity to a scaffold N50 of 19.24 Mb. The resulting scaffolded assembly initially contained 49 gaps. Subsequently, we performed manual curation in Juicebox_1.11.0821²⁸. For gap closing, we used YAGCloser (https://github.com/merlyescalona/yagcloser) with Oxford Nanopore long reads corrected by Ratatosk v0.9.0²⁹, which successfully filled 23 of these gaps. The final chromosome-level assembly consists of 20 scaffolds with a total length of 329,773,341 bp and a scaffold N50 of 19.24 Mb, containing only 26 remaining gaps that total approximately 2,600 bp in length. The final chromosome-level assembly was validated using the TeloExplorer module of the quarTeT³⁰ toolkit to confirm its structural integrity. Using the plant telomeric repeat sequence (AAACCCT), the analysis successfully identified these canonical repeats at both ends of all 20 chromosomes, with copy numbers ranging from 12 to 650 times (Fig. 2b). The comprehensive detection of these terminal repeat structures provides strong evidence for a high-quality chromosome-level assembly.

Genome annotation

To identify both known and novel repetitive elements in the Z. macrostachya genome, a database was constructed using RepeatModeler v2.0.4 (www.repeatmasker.org/RepeatModeler/) with default settings. Repeat sequences were subsequently predicted and masked utilizing RepeatMasker v4.1.5 (http://www.repeatmasker.org/). A total of 145,185,180 bp of repetitive sequences were identified, accounting for 44.03% of the entire Z. macrostachya genome (Table S2). Long terminal repeat (LTR) elements were the most abundant, with 61,197 elements spanning 69,542,551 bp, representing 21.09% of the genome.

For gene annotation, the genome with softmasked repetitive sequences was utilized. The BRAKER3 pipeline^{31,32,33,34,35} was used to predict protein-coding genes in Z. macrostachya. This approach integrated two types of evidence: short-read RNA-seq data and protein homology information. Short RNA-seq reads were trimmed and filtered using Trimmomatic v0.39²⁰ with parameters ILLUMINACLIP:TruSeq. 3-PE.fa:2:30:10:2:keepBothReads, LEADING:10, TRAILING:10, and MINLEN:50. Subsequently, the reads were further filtered with PRINSEQ-lite v0.20.4³⁶ using the parameters: -min_len 50, -min_qual_mean 15, -derep 14, -trim_qual_left 15, and -trim_qual_right 15. The cleaned reads were then aligned to the genome with HISAT2 v2.2.1³⁷, and the resulting alignment files were provided to the BRAKER3 pipeline^{31,32,33,34,35} along with protein sequences from 27 Liliopsida species obtained from NCBI (Table S3). These hints were applied to train GeneMark-ETP and AUGUSTUS for gene prediction. Statistical analysis of the gene annotation results was performed using the agat_sq_stat_basic.pl script from the AGAT software³⁸. A total of 33,474 protein-coding genes with an average length of 2,544.83 bp were predicted in the Z. macrostachya genome (Table 2). The average exon length was 238.33 bp, while the average intron length was 381.49 bp.

Table 2 Statistics of Z. macrostachya protein-coding gene annotation.

Full size table

Functional annotation of protein-coding genes was conducted using EnTAP v1.1.1³⁹. Protein sequences were compared against the UniProt Swiss-Prot⁴⁰ and eggNOG⁴¹ databases via DIAMOND⁴², with an e-value threshold of 1 \({\text{e}}^{-5}\). Furthermore, KEGG⁴³ terms and GO⁴⁴ terms were assigned to the genes through eggNOG-mapper^45,46. The result was visualized using TBtools⁴⁷. Out of the 33,474 protein-coding genes, 29,095 genes (86.92% of total genes) were annotated based on the Swiss-Prot and EggNOG databases (Fig. 3). Among these, 18,921 genes were annotated using Swiss-Prot, while Gene Ontology (GO) annotations were assigned as follows: 19,582 genes for Biological Process (BP) terms, 20,059 genes for Cellular Component (CC) terms, and 18,740 genes for Molecular Function (MF) terms. Additionally, 13,182 genes were assigned to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways.

Macrosynteny analysis

To compare collinearity among closely related species (Oropetium thomaeum and Zoysia japonica cv. Compadre), the genome data were first downloaded from CoGe (gid: 51527) and NCBI GenBank (GCA_040438285.1), respectively. Gene homology-based macrosynteny was then analyzed using MCScan (JCVI v1.3.9⁴⁸) with the parameters (–cscore 0.99 and–minspan 30) (Fig. 4). The plants used for the analysis included three species belonging to the Chloridoideae subfamily of Poaceae. Among them, O. thomaeum is a diploid species with a chromosome number of n = 10, while Z. japonica, being a member of the same genus as Z. macrostachya, is an allotetraploid species with 20 chromosomes (n = 20). Macrosynteny analysis revealed 18,316 homologous gene pairs between O. thomaeum and Z. japonica, covering 60.40% of the 28,930 genes in O. thomaeum and 37.26% of the 49,074 genes in Z. japonica (Fig. 4a). Similarly, 27,286 gene pairs were identified between Z. japonica and Z. macrostachya, representing 55.59% of the genes in Z. japonica and 81.30% of the 33,474 genes in Z. macrostachya. These findings highlight the synteny conserved among these species. Syntenic depth analysis, performed without the minimum collinearity restriction, indicated a 1:2 gene synteny between O. thomaeum and Z. japonica, consistent with the difference in their chromosome numbers (Fig. 4b,c). Moreover, a 1:1 gene synteny pattern observed between Z. japonica and Z. macrostachya confirms that Z. macrostachya exhibits tetraploid ploidy and contains two distinct subgenome types across its 20 chromosomes.

Data Records

All raw sequencing data have been deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database under the accession numbers SRP555321⁴⁹. The genome assembly information has been deposited in and is publicly available from the GenBank database under the accession number (GCA_049640385.1)⁵⁰. The genome annotation files, including GFF and FASTA formats, are available in figshare⁵¹.

Technical Validation

To validate the genome assembly, we aligned Oxford Nanopore long reads and Illumina short reads using Minimap v2.28-r1209 and Bowtie2 v2.4.4, respectively. The alignment results showed 91.26% and 91.35% mapping rates, indicating a high level of completeness in the assembled genome. Genome coverage and sequence quality were verified by aligning 4,896 Poales orthologs using Benchmarking Universal Single-Copy Orthologs (BUSCO) v5.2.2 in genome mode with the odb10 dataset⁵². Out of the total BUSCOs, 4,792 (97.8%) were identified as complete, with 3,595 (73.4%) classified as single-copy and 1,197 (24.4%) identified as duplicated (Fig. 5a). This demonstrates significantly higher completeness compared to previously reported genomes of Zoysia species¹⁷. To assess genome quality and the impact of heterozygosity, we performed a k-mer-based analysis using Merqury v1.3⁵³. Given the high heterozygosity rate of Z. macrostachya (2.58%), we specifically implemented a haplotype-purging strategy to mitigate potential assembly artifacts and enhance base-level accuracy. We evaluated the assembly quality both before and after this purging step. Quality was assessed with Merqury using a k-mer database (k = 21) generated by meryl v1.3. For context, the initial assembly prior to polishing had a k-mer completeness of 71.81% and a QV of 29.02. After polishing, these metrics increased significantly. The initial polished assembly showed a k-mer completeness of 73.54% and a consensus quality value (QV) of 36.15. After purging, the purged assembly showed a slight decrease in completeness to 72.49% but an increase in QV to 36.31. This trade-off demonstrates that while approximately 1% of k-mer completeness was lost, the process successfully improved the base-level accuracy of the primary contigs. The final scaffolded assembly maintained these improvements, presenting a completeness score of 72.46% and a QV of 36.31. Validation of the gene annotation was performed using BUSCO v5.2.2. Out of the total Poales BUSCOs, 4,754 (97.1%) were identified as complete, with 3,154 (64.4%) classified as single-copy and 1,600 (32.7%) identified as duplicated (Fig. 5b). In the context of the Earth BioGenome Project (EBP) quality standards⁵⁴, these metrics classify our assembly as a high-quality draft genome. It successfully meets the EBP criteria for chromosome-level contiguity, achieving a 100% chromosome assignment rate into 20 final scaffolds (Contig N50 = 7.56 Mb, Scaffold N50 = 19.24 Mb). Furthermore, it exceeds the benchmark for completeness with a genome BUSCO score of 97.8%. While its base accuracy is high at QV 36.31 (>99.97% precision), it is slightly below the EBP’s highest Q40 target. Collectively, these results establish our assembly as a robust and reliable foundation for subsequent genomic studies.

Code availability

All analyses were conducted following the official guidelines of the software described in the Methods section. No customizations were made to the commands or pipelines used.

References

Loch, D. S., Ebina, M., Choi, J. S. & Han, L. Ecological Implications of Zoysia Species, Distribution, and Adaptation for Management and Use of Zoysiagrasses. International Turfgrass Society Research Journal 13, 11–25, https://doi.org/10.2134/itsrj2016.10.0857 (2017).
Article Google Scholar
Choi, J.-S. Distribution, classification, breeding, and current use of zoysiagrass species and cultivars in Korea. Weed & Turfgrass Science 6, 283–291 (2017).
Google Scholar
Patton, A. J., Schwartz, B. M. & Kenworthy, K. E. Zoysiagrass (Zoysia spp.) History, Utilization, and Improvement in the United States: A Review. Crop Science 57, S-37–S-72, https://doi.org/10.2135/cropsci2017.02.0074 (2017).
Article Google Scholar
Gelernter, W. D., Stowell, L. J., Johnson, M. E. & Brown, C. D. Documenting trends in land‐use characteristics and environmental stewardship programs on US golf courses. Crop, Forage & Turfgrass Management 3, 1–12 (2017).
Article Google Scholar
Shi-lun, Y. & Ji-yu, C. Coatal salt marshes and mangrove swamps in China. Chinese Journal of Oceanology and Limnology 13, 318–324, https://doi.org/10.1007/BF02889465 (1995).
Article ADS Google Scholar
Céccoli, G. et al. Salt Glands in the Poaceae Family and Their Relationship to Salinity Tolerance. The Botanical Review 81, 162–178, https://doi.org/10.1007/s12229-015-9153-7 (2015).
Article Google Scholar
Wang, R. et al. Comparative Transcriptome Analysis of Halophyte Zoysia macrostachya in Response to Salinity Stress. Plants 9, 458 (2020).
Article CAS PubMed PubMed Central Google Scholar
Marcum, K. B., Anderson, S. J. & Engelke, M. C. Salt Gland Ion Secretion: A Salinity Tolerance Mechanism among Five Zoysiagrass Species. Crop Science 38, cropsci1998.0011183X003800030031x, https://doi.org/10.2135/cropsci1998.0011183X003800030031x (1998).
Article Google Scholar
Bae, E.-J. et al. Distribution and morphology characteristics of native zoysiagrasses (Zoysia spp.) grown in South Korea. Asian Journal of Turfgrass Science 24, 97–105 (2010).
Google Scholar
Parihar, P., Singh, S., Singh, R., Singh, V. P. & Prasad, S. M. Effect of salinity stress on plants and its tolerance strategies: a review. Environmental Science and Pollution Research 22, 4056–4075, https://doi.org/10.1007/s11356-014-3739-1 (2015).
Article CAS PubMed Google Scholar
Xie, Q. et al. De novo assembly of the Japanese lawngrass (Zoysia japonica Steud.) root transcriptome and identification of candidate unigenes related to early responses under salt stress. Frontiers in Plant Science 6, https://doi.org/10.3389/fpls.2015.00610 (2015).
Yamamoto, A. et al. The relationship between salt gland density and sodium accumulation/secretion in a wide selection from three Zoysia species. Australian Journal of Botany 64, 277–284, https://doi.org/10.1071/BT15261 (2016).
Article CAS Google Scholar
Hooks, T. et al. Salt tolerance of seven genotypes of zoysiagrass (Zoysia spp.). Technology in Horticulture 2, 1–7, https://doi.org/10.48130/TIH-2022-0008 (2022).
Article ADS Google Scholar
Maeda, Y. Effects of calcium application on the salt tolerance and sodium excretion from salt glands in zoysiagrass (Zoysia japonica). Grassland Science 65, 189–196, https://doi.org/10.1111/grs.12234 (2019).
Article CAS Google Scholar
Li, X. et al. Na+ and K+ homeostasis in different organs of contrasting Zoysia japonica accessions under salt stress. Environmental and Experimental Botany 214, 105455, https://doi.org/10.1016/j.envexpbot.2023.105455 (2023).
Article CAS Google Scholar
Xiao-Ying, D., Yan-Xin, D., Jing-Yuan, L. & Ying, Z. Responses of three <em>Zoysia</em> grass species to salt stress. Acta Prataculturae Sinica 24, 109–117, https://doi.org/10.11686/cyxb2015161 (2015).
Article Google Scholar
Tanaka, H. et al. Sequencing and comparative analyses of the genomes of zoysiagrasses. DNA Research 23, 171–180, https://doi.org/10.1093/dnares/dsw006 (2016).
Article CAS PubMed PubMed Central Google Scholar
Wang, W., Shao, A., Xu, X., Fan, S. & Fu, J. Comparative genomics reveals the molecular mechanism of salt adaptation for zoysiagrasses. BMC Plant Biology 22, 355, https://doi.org/10.1186/s12870-022-03752-0 (2022).
Article CAS PubMed PubMed Central Google Scholar
Tateoka, T. Karyotaxonomy in Poaceae III. Further studies of somatic chromosomes. Cytologia 20, 296–306 (1955).
Article Google Scholar
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120, https://doi.org/10.1093/bioinformatics/btu170 (2014).
Article CAS PubMed PubMed Central Google Scholar
Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770, https://doi.org/10.1093/bioinformatics/btr011 (2011).
Article CAS PubMed PubMed Central Google Scholar
Ranallo-Benavidez, T. R., Jaron, K. S. & Schatz, M. C. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nature Communications 11, 1432, https://doi.org/10.1038/s41467-020-14998-3 (2020).
Article ADS CAS PubMed PubMed Central Google Scholar
Hu, J. et al. NextDenovo: an efficient error correction and accurate assembly tool for noisy long reads. Genome Biology 25, 107, https://doi.org/10.1186/s13059-024-03252-4 (2024).
Article PubMed PubMed Central Google Scholar
Hu, J., Fan, J., Sun, Z. & Liu, S. NextPolish: a fast and efficient genome polishing tool for long-read assembly. Bioinformatics 36, 2253–2255, https://doi.org/10.1093/bioinformatics/btz891 (2019).
Article CAS Google Scholar
Guan, D. et al. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics 36, 2896–2898, https://doi.org/10.1093/bioinformatics/btaa025 (2020).
Article CAS PubMed PubMed Central Google Scholar
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760, https://doi.org/10.1093/bioinformatics/btp324 (2009).
Article CAS PubMed PubMed Central Google Scholar
Zeng, X. et al. Chromosome-level scaffolding of haplotype-resolved assemblies using Hi-C data without reference genomes. Nature Plants 10, 1184–1200, https://doi.org/10.1038/s41477-024-01755-3 (2024).
Article CAS PubMed Google Scholar
Durand, N. C. et al. Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom. Cell Systems 3, 99–101, https://doi.org/10.1016/j.cels.2015.07.012 (2016).
Article CAS PubMed PubMed Central Google Scholar
Holley, G. et al. Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly. Genome Biology 22, 28, https://doi.org/10.1186/s13059-020-02244-4 (2021).
Article CAS PubMed PubMed Central Google Scholar
Lin, Y. et al. quarTeT: a telomere-to-telomere toolkit for gap-free genome assembly and centromeric repeat identification. Horticulture Research 10, https://doi.org/10.1093/hr/uhad127 (2023).
Quinlan, A. R. BEDTools: The Swiss-Army Tool for Genome Feature Analysis. Current Protocols in Bioinformatics 47, 11.12.11–11.12.34, https://doi.org/10.1002/0471250953.bi1112s47 (2014).
Article Google Scholar
Kovaka, S. et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biology 20, 278, https://doi.org/10.1186/s13059-019-1910-1 (2019).
Article CAS PubMed PubMed Central Google Scholar
Pertea, G. & Pertea, M. GFF Utilities: GffRead and GffCompare [version 2; peer review: 3 approved]. F1000Research 9, https://doi.org/10.12688/f1000research.23297.2 (2020).
Bruna, T., Lomsadze, A. & Borodovsky, M. GeneMark-ETP: Automatic Gene Finding in Eukaryotic Genomes in Consistency with Extrinsic Data. bioRxiv, 2023.2001.2013.524024, https://doi.org/10.1101/2023.01.13.524024 (2024).
Gabriel, L. et al. BRAKER3: Fully automated genome annotation using RNA-seq and protein evidence with GeneMark-ETP, AUGUSTUS, and TSEBRA. Genome Research, 34, 769-777, https://doi.org/10.1101/gr.278090.123 (2024).
Schmieder, R. & Edwards, R. Quality control and preprocessing of metagenomic datasets. Bioinformatics 27, 863–864, https://doi.org/10.1093/bioinformatics/btr026 (2011).
Article CAS PubMed PubMed Central Google Scholar
Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nature Biotechnology 37, 907–915, https://doi.org/10.1038/s41587-019-0201-4 (2019).
Article CAS PubMed PubMed Central Google Scholar
Dainat, J. (2021).
Hart, A. J. et al. EnTAP: Bringing faster and smarter functional annotation to non-model eukaryotic transcriptomes. Molecular Ecology Resources 20, 591–604, https://doi.org/10.1111/1755-0998.13106 (2020).
Article CAS PubMed Google Scholar
Poux, S. et al. On expert curation and scalability: UniProtKB/Swiss-Prot as a case study. Bioinformatics 33, 3454–3460, https://doi.org/10.1093/bioinformatics/btx439 (2017).
Article CAS PubMed PubMed Central Google Scholar
Huerta-Cepas, J. et al. eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. Nucleic Acids Research 47, D309–D314, https://doi.org/10.1093/nar/gky1085 (2018).
Article CAS PubMed Central Google Scholar
Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nature Methods 12, 59–60, https://doi.org/10.1038/nmeth.3176 (2015).
Article CAS PubMed Google Scholar
Kanehisa, M., Furumichi, M., Tanabe, M., Sato, Y. & Morishima, K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Research 45, D353–D361, https://doi.org/10.1093/nar/gkw1092 (2016).
Article CAS PubMed PubMed Central Google Scholar
Consortium, T. G. O. Gene Ontology Consortium: going forward. Nucleic Acids Research 43, D1049–D1056, https://doi.org/10.1093/nar/gku1179 (2014).
Article CAS Google Scholar
Huerta-Cepas, J. et al. Fast Genome-Wide Functional Annotation through Orthology Assignment by eggNOG-Mapper. Molecular Biology and Evolution 34, 2115–2122, https://doi.org/10.1093/molbev/msx148 (2017).
Article CAS PubMed PubMed Central Google Scholar
Cantalapiedra, C. P., Hernández-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: Functional Annotation, Orthology Assignments, and Domain Prediction at the Metagenomic Scale. Molecular Biology and Evolution 38, 5825–5829, https://doi.org/10.1093/molbev/msab293 (2021).
Article CAS PubMed PubMed Central Google Scholar
Chen, C. et al. TBtools: An Integrative Toolkit Developed for Interactive Analyses of Big Biological Data. Molecular Plant 13, 1194–1202, https://doi.org/10.1016/j.molp.2020.06.009 (2020).
Article CAS PubMed Google Scholar
Tang, H. et al. JCVI: A versatile toolkit for comparative genomics analysis. iMeta 3, e211, https://doi.org/10.1002/imt2.211 (2024).
Article CAS PubMed PubMed Central Google Scholar
Park, H. Chromosome-level genome assembly of the halophytic grass Zoysia macrostachya, NCBI Sequence Read Archive http://identifiers.org/insdc.sra:SRP555321 (2025).
Park, H. Chromosome-level genome assembly of the halophytic grass Zoysia macrostachya, GenBank http://identifiers.org/assembly:GCA_049640385.1 (2025).
Park, H. Chromosome-level genome assembly of the halophytic grass Zoysia macrostachya, figshare https://doi.org/10.6084/m9.figshare.28151138.v3 (2025).
Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A. & Zdobnov, E. M. BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes. Molecular Biology and Evolution 38, 4647–4654, https://doi.org/10.1093/molbev/msab199 (2021).
Article CAS PubMed PubMed Central Google Scholar
Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biology 21, 245, https://doi.org/10.1186/s13059-020-02134-9 (2020).
Article CAS PubMed PubMed Central Google Scholar
Rhie, A. et al. Towards complete and error-free genome assemblies of all vertebrate species. Nature 592, 737-746, https://doi.org/10.1038/s41586-021-03451-0 (2021).

Download references

Acknowledgements

This research was funded by the Forest Biomaterial Research Center, National Institute of Forest Science (Project No. FE0100- 2025-03). And this work was carried out with the support of “Cooperative Research Program for Agriculture Science and Technology Development (Project No. RS-2021-RD009840)” Rural Development Administration, Republic of Korea.

Author information

These authors contributed equally: Hyeonseon Park, Eunji Bae.

Authors and Affiliations

Department of Biological Science, Chungnam National University, Daejeon, 34134, Republic of Korea
Hyeonseon Park, Jae Gyeong Jung & Donghwan Shim
Forest Biomaterials Research Center, National Institute of Forest Science, Jinju, 52817, Republic of Korea
Eunji Bae
School of Liberal Arts and Sciences, Korea National University of Transportation, Chungju, 27469, Republic of Korea
Bae Young Choi
Department of Agricultural Biotechnology Supercomputing Center, National Institute of Agricultural Sciences, Rural Development Administration, Jeonju, 54874, Republic of Korea
Tae-Ho Lee
Center for Genome Engineering, Institute for Basic Science, Daejeon, 34126, Republic of Korea
Donghwan Shim

Authors

Hyeonseon Park
View author publications
Search author on:PubMed Google Scholar
Eunji Bae
View author publications
Search author on:PubMed Google Scholar
Jae Gyeong Jung
View author publications
Search author on:PubMed Google Scholar
Bae Young Choi
View author publications
Search author on:PubMed Google Scholar
Tae-Ho Lee
View author publications
Search author on:PubMed Google Scholar
Donghwan Shim
View author publications
Search author on:PubMed Google Scholar

Contributions

H.P., E.B. and D.S. designed the study and led the research. E.B. and J.G.J. collected and developed plant materials. B.Y.C. performed genome sequencing. H.P. performed genome assembly, genome annotation, and genomics analyses, and drafted the manuscript. D.S., E.B. and T.-H.L. provided the research funding. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Donghwan Shim.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary table 1. raw sequencing data

Supplementary table 2. Zoysia macrostachya Repeat elements

Supplementary table 3. Liliopsida NCBI protein sets

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Cite this article

Park, H., Bae, E., Jung, J.G. et al. Chromosome-level Genome Assembly of the Halophytic Turfgrass Zoysia macrostachya. Sci Data 12, 1335 (2025). https://doi.org/10.1038/s41597-025-05702-5

Download citation

Received: 17 March 2025
Accepted: 25 July 2025
Published: 31 July 2025
Version of record: 31 July 2025
DOI: https://doi.org/10.1038/s41597-025-05702-5