Abstract
Cyperus difformis is a globally problematic weed in rice fields, posing a significant threat to rice yield. While chemical herbicides are commonly used for its control, the species often escapes management due to its rapid evolution of resistance to widely used herbicides. To better understand the mechanisms underlying herbicide resistance, insights into the genetics and genomics of C. difformis are essential. In this study, we present a telomere-to-telomere genome assembly of C. difformis, generated by combining PacBio HiFi, Oxford Nanopore Technologies (ONT), MGI short reads and high-throughput chromatin conformation capture (Hi-C) technologies. The assembled genome spans 226 Mb with a scaffold N50 of 13.08 Mb. Utilizing Hi-C interaction data, 97.24% of the contigs were anchored to 18 chromosomes, with 35 telomeres successfully defined. Further analysis identified 75.73 Mb of repetitive sequences and 21,069 protein-coding genes, of which 91.8% (19,347 genes) were functionally annotated. This high-quality genome provides a valuable resource for studies in population genetics, phylogeny, comparative genomics, adaptive evolution, and functional genomics of C. difformis.
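As a reading aid for the assembly statistics quoted above, the following is a minimal sketch of how a scaffold N50 is conventionally defined: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study, whose own tools and parameters are described in its Methods section. The last lines recompute the annotated-gene percentage from the two gene counts given in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (hypothetical helper)."""
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half = sum(ordered) / 2                  # half of the total assembly length
    running = 0
    for length in ordered:
        running += length
        if running >= half:                  # this sequence crosses the halfway point
            return length
    raise ValueError("no sequence lengths given")


# Toy scaffold lengths (arbitrary units), not real C. difformis data.
toy_lengths = [10, 8, 6, 4, 2]
toy_n50 = n50(toy_lengths)

# Gene counts as reported in the abstract.
annotated_genes = 19_347
total_genes = 21_069
annotated_pct = round(100 * annotated_genes / total_genes, 1)

print(f"toy N50: {toy_n50}")
print(f"functionally annotated: {annotated_pct}%")
```

For the toy input the total length is 30, so the halfway point of 15 is crossed by the second-longest scaffold.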
Data availability
The raw sequence data are available in the Genome Sequence Archive at https://ngdc.cncb.ac.cn/gsa/browse/CRA031324. The assembly and annotation files of Cyperus difformis are available from Figshare at https://doi.org/10.6084/m9.figshare.29072027. The assembly file is also available in NCBI at https://www.ncbi.nlm.nih.gov/nuccore/JBTANI000000000.
Code availability
All bioinformatic analyses were conducted in accordance with the respective software manuals and standard protocols. Software versions and critical parameters are reported in the Methods section, with unspecified settings retained as defaults. No custom code was developed during this study.
Acknowledgements
This study was supported by the National Natural Science Foundation of China (32360681 and 32460693), the Guangxi Natural Science Foundation (2021GXNSFDA220007 and 2024GXNSFAA010013), the Research Funding of Guangxi Academy of Agriculture Sciences (2021YT066, 2022JM43 and 2024ZX17), the Guangxi Key Laboratory of Biology for Crop Diseases and Insect Pests (22-035-31-23ST02), the Research Funding of Guangxi Vocational University of Agriculture (XKJ2305), and the Guangxi Innovation Group of National Modern Agricultural Industry Technology System (nycytxgxcxtd-2024-16-03).
Author information
Contributions
Y.W. and J.L. designed this research. J.Z., W.Z. and Y.M. analyzed the data. M.Q. and W.L. collected the samples. Y.W. and J.L. drafted and revised the manuscript. All co-authors contributed to and approved this manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Li, J., Zhao, J., Zheng, W. et al. A telomere-to-telomere genome assembly for Cyperus difformis. Sci Data (2026). https://doi.org/10.1038/s41597-026-06582-z
DOI: https://doi.org/10.1038/s41597-026-06582-z

