Super-pangenome analyses across 35 accessions of 23 Avena species highlight their complex evolutionary history and extensive genomic diversity

Abstract

Common oat, belonging to the genus Avena with 30 recognized species, is a nutritionally important cereal crop and high-quality forage worldwide. Here, we construct a genus-level super-pangenome of Avena comprising 35 high-quality genomes from 14 cultivated oat accessions and 21 wild species. The fully resolved phylogenomic analysis unveils the origin and evolutionary scenario of Avena species, and the super-pangenome analysis identifies 26.62% of genes and 59.93% of haplotypes as specific to wild species. We delineate the landscape of structural variations (SVs) and the transcriptome profile based on 1,401 RNA-sequencing (RNA-seq) samples from diverse abiotic stress treatments in oat. We highlight the crucial role of SVs in modulating gene expression and shaping adaptation to diverse stresses. By further integrating SV-based genome-wide association studies (GWASs), we characterize 13 candidate genes associated with drought resistance, such as AsARF7, which was validated in transgenic oat lines. Our study provides unprecedented genomic resources to facilitate genomic, evolutionary and molecular breeding research in oat.

Fig. 1: Phylogeny, population structure and diverse phenotypes of cultivated oats and wild species in Avena.
Fig. 2: Inferred phylogeny, origin and evolution of Avena species.
Fig. 3: Super-pangenome of Avena and rich specific genetic resources in wild species.
Fig. 4: Characteristics of SVs among the 17 hexaploid oats.
Fig. 5: The impact of SVs on gene expression alteration.
Fig. 6: Identification and validation of key SVs and genes related to oat drought tolerance based on the SV-based GWAS.

Data availability

The raw data (CRA015654 and CRA015559) and genome assemblies (GWHEROY00000000-GWHGDJH00000000.1) in this study have been deposited at the Chinese National Genomics Data Center (https://bigd.big.ac.cn/) under the BioProject accession number PRJCA024628. Raw sequencing data (SRR31832837–SRR31832882) and genome assemblies (JBKZJG000000000–JBKZJZ000000000 and JBKZKA000000000–JBKZKH000000000) have been deposited at the NCBI under the BioProject PRJNA1200805. Transcriptome sequencing data under normal growth (SRR31798633–SRR31798686 and SRR31814753–SRR31815091) have been deposited at the NCBI under the BioProject PRJNA1201252. Transcriptome sequencing data under different stresses (SRR31832975–SRR31833946) have been deposited at the NCBI under the BioProject PRJNA1203203. Whole-genome resequencing data of 1,078 hexaploid oat accessions were obtained from the OatOmics database (http://www.oatomics.com). Source data are provided with this paper.

Code availability

The scripts used in this study are available via GitHub (https://github.com/HongyuZhang-HBU/oat-pangenome) and Zenodo (ref. 111; https://doi.org/10.5281/zenodo.15761906).

References

  1. Rasane, P., Jha, A., Sabikhi, L., Kumar, A. & Unnikrishnan, V. S. Nutritional advantages of oats and opportunities for its processing as value added foods — a review. J. Food Sci. Technol. 52, 662–675 (2015).

  2. Shi, J., Tian, Z., Lai, J. & Huang, X. Plant pan-genomics and its applications. Mol. Plant 16, 168–186 (2023).

  3. Kamal, N. et al. The mosaic oat genome gives insights into a uniquely healthy cereal crop. Nature 606, 113–119 (2022).

  4. Peng, Y. et al. Reference genome assemblies reveal the origin and evolution of allohexaploid oat. Nat. Genet. 54, 1248–1258 (2022).

  5. Liu, Q., Lin, L., Zhou, X., Peterson, P. M. & Wen, J. Unraveling the evolutionary dynamics of ancient and recent polyploidization events in Avena (Poaceae). Sci. Rep. 7, 41944 (2017).

  6. Yan, H. et al. High-density marker profiling confirms ancestral genomes of Avena species and identifies D-genome chromosomes of hexaploid oat. Theor. Appl. Genet. 129, 2133–2149 (2016).

  7. Loskutov, I. G., Gnutikov, A. A., Blinova, E. V. & Rodionov, A. V. The application of Vavilov’s approaches to the phylogeny and evolution of cultivated species of the genus Avena L. Vavilovskii Zhurnal Genet. Selektsii 27, 921–932 (2023).

  8. Fu, Y. B. Oat evolution revealed in the maternal lineages of 25 Avena species. Sci. Rep. 8, 4252 (2018).

  9. Liu, Q. et al. Comparative chloroplast genome analyses of Avena: insights into evolutionary dynamics and phylogeny. BMC Plant Biol. 20, 406 (2020).

  10. Morikawa, T. & Nishihara, M. Genomic and polyploid evolution in genus Avena as revealed by RFLPs of repeated DNA sequences. Genes Genet. Syst. 84, 199–208 (2009).

  11. Yan, H. H. et al. Phylogenetic analysis of the genus Avena based on chloroplast intergenic spacer psbA-trnH and single-copy nuclear gene Acc1. Genome 57, 267–277 (2014).

  12. Liu, B. B. et al. Phylogenomic conflict analyses in the apple genus Malus s.l. reveal widespread hybridization and allopolyploidy driving diversification, with insights into the complex biogeographic history in the Northern Hemisphere. J. Integr. Plant Biol. 64, 1020–1043 (2022).

  13. Meleshko, O. et al. Extensive genome-wide phylogenetic discordance is due to incomplete lineage sorting and not ongoing introgression in a rapidly radiated bryophyte genus. Mol. Biol. Evol. 38, 2750–2766 (2021).

  14. Liu, S. Y. et al. An integrative framework reveals widespread gene flow during the early radiation of oaks and relatives in Quercoideae (Fagaceae). J. Integr. Plant Biol. 67, 1119–1141 (2025).

  15. Li, W. et al. Plant pan-genomics: recent advances, new challenges, and roads ahead. J. Genet. Genomics 49, 833–846 (2022).

  16. Tieman, D. et al. A chemical genetic roadmap to improved tomato flavor. Science 355, 391–394 (2017).

  17. Zhuang, Y. et al. Phylogenomics of the genus Glycine sheds light on polyploid evolution and life-strategy transition. Nat. Plants 8, 233–244 (2022).

  18. Yu, H. et al. A route to de novo domestication of wild allotetraploid rice. Cell 184, 1156–1170 (2021).

  19. Li, N. et al. Super-pangenome analyses highlight genomic diversity and structural variation across wild and cultivated tomato species. Nat. Genet. 55, 852–860 (2023).

  20. Tang, D. et al. Genome evolution and diversity of wild and cultivated potatoes. Nature 606, 535–541 (2022).

  21. Shi, T. et al. The super-pangenome of Populus unveils genomic facets for its adaptation and diversification in widespread forest trees. Mol. Plant 17, 725–746 (2024).

  22. Fominaya, A., Vega, C. & Ferrer, E. Giemsa C-banded karyotypes of Avena species. Genome 30, 627–632 (1988).

  23. Latta, R. G., Bekele, W. A., Wight, C. P. & Tinker, N. A. Comparative linkage mapping of diploid, tetraploid, and hexaploid Avena species suggests extensive chromosome rearrangement in ancestral diploids. Sci. Rep. 9, 12298 (2019).

  24. Yan, H. et al. Genome size variation in the genus Avena. Genome 59, 209–220 (2016).

  25. Xiong, Z., Gaeta, R. T. & Pires, J. C. Homoeologous shuffling and chromosome compensation maintain genome balance in resynthesized allopolyploid Brassica napus. Proc. Natl Acad. Sci. USA 108, 7908–7913 (2011).

  26. Chester, M. et al. Extensive chromosomal variation in a recently formed natural allopolyploid species, Tragopogon miscellus (Asteraceae). Proc. Natl Acad. Sci. USA 109, 1176–1181 (2012).

  27. Peng, Y. et al. Phylogenetic relationships in the genus Avena based on the nuclear Pgk1 gene. PLoS ONE 13, e0200047 (2018).

  28. Morikawa, T., Gushiken, Y. & Tsurukawa, N. Chromosomal diversity and morphological dimorphism in Moroccan wild oat, Avena agadiriana. Plant Syst. Evol. 281, 107–113 (2009).

  29. Paczos-Grzęda, E. & Bednarek, P. Comparative analysis of hexaploid Avena species using REMAP and ISSR methods. Turk. J. Bot. 38, 1103–1111 (2014).

  30. Loskutov, I. G. On evolutionary pathways of Avena species. Genet. Resour. Crop Evol. 55, 211–220 (2008).

  31. Badaeva, E. D., Shelukhina, O. Y., Dedkova, O. S., Loskutov, I. G. & Pukhalskyi, V. A. Comparative cytogenetic analysis of hexaploid Avena L. species. Russ. J. Genet. 47, 691–702 (2011).

  32. Ociepa, T. The oat gene pools—review about the use of wild species in improving cultivated oat/Pule genowe owsa—przeglad informacji na temat wykorzystania dzikich gatunkow w ulepszaniu owsa uprawnego. J. Cent. Eur. Agric. 20, 251–261 (2019).

  33. Ladizinsky, G. & Fainstein, R. Domestication of the protein-rich tetraploid wild oats Avena magna and A. murphyi. Euphytica 26, 221–223 (1977).

  34. Ladizinsky, G. Domestication via hybridization of the wild tetraploid oats Avena magna and A. murphyi. Theor. Appl. Genet. 91, 639–646 (1995).

  35. Okoń, S., Ociepa, T., Nucia, A., Cieplak, M. & Kowalczyk, K. Is every wild species a rich source of disease resistance? Avena fatua L.—potential donor of resistance to powdery mildew. Plants 10, 560 (2021).

  36. Zhai, C. et al. The isolation and characterization of Pik, a rice blast resistance gene which emerged after rice domestication. New Phytol. 189, 321–334 (2011).

  37. Ramírez-González, R. H. et al. The transcriptional landscape of polyploid wheat. Science 361, eaar6089 (2018).

  38. Ma, P.-F. et al. Genome assemblies of 11 bamboo species highlight diversification induced by dynamic subgenome dominance. Nat. Genet. 56, 710–720 (2024).

  39. Qin, P. et al. Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations. Cell 184, 3542–3558 (2021).

  40. Yan, H. et al. Pangenomic analysis identifies structural variation associated with heat tolerance in pearl millet. Nat. Genet. 55, 507–518 (2023).

  41. Uga, Y. et al. Control of root system architecture by DEEPER ROOTING 1 increases rice yield under drought conditions. Nat. Genet. 45, 1097–1102 (2013).

  42. Lo, S. F. et al. Ectopic expression of specific GA2 oxidase mutants promotes yield and stress tolerance in rice. Plant Biotechnol. J. 15, 850–864 (2017).

  43. Wei, H., Wang, X., He, Y., Xu, H. & Wang, L. Clock component OsPRR73 positively regulates rice salt tolerance by modulating OsHKT2;1-mediated sodium homeostasis. EMBO J. 40, e105086 (2021).

  44. Kim, J.-J. et al. Overexpression of a proton pumping gene OVP1 enhances salt stress tolerance, root growth and biomass yield by regulating ion balance in rice (Oryza sativa L.). Environ. Exp. Bot. 175, 104033 (2020).

  45. An, X., Zhang, L., Luo, J., Zhao, S. & Jiao, T. Effects of oat hay content in diets on nutrient metabolism and the rumen microflora in sheep. Animals 10, 2341 (2020).

  46. Katsiotis, A., Hagidimitriou, M. & Heslop-Harrison, J. S. The close relationship between the A and B genomes in Avena L. (Poaceae) determined by molecular cytogenetic analysis of total genomic, tandemly and dispersed repetitive DNA sequences. Ann. Bot. 79, 103–109 (1997).

  47. Zhang, C. et al. High-quality genome of a modern soybean cultivar and resequencing of 547 accessions provide insights into the role of structural variation. Nat. Genet. 56, 2247–2258 (2024).

  48. Li, W. et al. Genomic insights into the divergence between hulled and hulless oats. Cell Rep. 44, 116055 (2025).

  49. He, Q. et al. The near-complete genome assembly of hexaploid wild oat reveals its genome evolution and divergence with cultivated oats. Nat. Plants 10, 2062–2078 (2024).

  50. Li, Y. et al. Subtelomeric assembly of a multi-gene pathway for antimicrobial defense compounds in cereals. Nat. Commun. 12, 2563 (2021).

  51. Maughan, P. J. et al. Genomic insights from the first chromosome-scale assemblies of oat (Avena spp.) diploid species. BMC Biol. 17, 92 (2019).

  52. Li, H. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  53. Li, H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27, 2987–2993 (2011).

  54. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  55. Alexander, D. H., Novembre, J. & Lange, K. L. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).

  56. Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).

  57. Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).

  58. Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 3, 99–101 (2016).

  59. Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A. & Zdobnov, E. M. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol. Biol. Evol. 38, 4647–4654 (2021).

  60. Ou, S. & Jiang, N. LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiol. 176, 1410–1422 (2018).

  61. Robert Hubley, A. S. RepeatModeler Open-1.0 www.repeatmasker.org/RepeatModeler/ (2010).

  62. Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268 (2007).

  63. Ellinghaus, D., Kurtz, S. & Willhoeft, U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics 9, 18 (2008).

  64. Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinformatics Chapter 4, 4.10.1–4.10.14 (2009).

  65. Zhang, R.-G. et al. TEsorter: an accurate and fast method to classify LTR-retrotransposons in plant genomes. Hortic. Res. 9, uhac017 (2022).

  66. Haas, B. J. et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512 (2013).

  67. Haas, B. J. et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 31, 5654–5666 (2003).

  68. Stanke, M. & Waack, S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics 19, ii215–ii225 (2003).

  69. Gremme, G., Brendel, V., Sparks, M. E. & Kurtz, S. Engineering a software tool for gene structure prediction in higher organisms. Inf. Softw. Technol. 47, 965–978 (2005).

  70. Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).

  71. Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).

  72. Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 9, R7 (2008).

  73. Aramaki, T. et al. KofamKOALA: KEGG Ortholog assignment based on profile HMM and adaptive score threshold. Bioinformatics 36, 7 (2020).

  74. Jia, K. H. et al. SubPhaser: a robust allopolyploid subgenome phasing method based on subgenome-specific k-mers. New Phytol. 235, 801–809 (2022).

  75. Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).

  76. Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).

  77. Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).

  78. Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).

  79. He, Q. et al. High-quality genome of allotetraploid Avena barbata provides insights into the origin and evolution of B subgenome in Avena. J. Integr. Plant Biol. 67, 1515–1532 (2025).

  80. Han, M. V., Thomas, G. W., Lugo-Martinez, J. & Hahn, M. W. Estimating gene gain and loss rates in the presence of error in genome assembly and annotation using CAFE 3. Mol. Biol. Evol. 30, 1987–1997 (2013).

  81. Sun, P. et al. WGDI: a user-friendly toolkit for evolutionary analyses of whole-genome duplications and ancestral karyotypes. Mol. Plant 15, 1841–1851 (2022).

  82. Zhang, R.-G. et al. Subgenome-aware analyses suggest a reticulate allopolyploidization origin in three Papaver genomes. Nat. Commun. 14, 2204 (2023).

  83. Doyle, J. & Egan, A. Dating the origins of polyploidy events. New Phytol. 186, 73–85 (2009).

  84. Jin, J. J. et al. GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol. 21, 241 (2020).

  85. Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, giab008 (2021).

  86. Danecek, P. & McCarthy, S. BCFtools/csq: haplotype-aware variant consequences. Bioinformatics 33, 2037–2039 (2017).

  87. Wang, D., Zhang, Y., Zhang, Z., Zhu, J. & Yu, J. KaKs_Calculator 2.0: a toolkit incorporating γ-series methods and sliding window strategies. Genomics Proteomics Bioinformatics 8, 77–80 (2010).

  88. Zhang, Y. et al. Telomere-to-telomere Citrullus super-pangenome provides direction for watermelon breeding. Nat. Genet. 56, 1750–1761 (2024).

  89. Huang, Y. et al. Pangenome analysis provides insight into the evolution of the orange subfamily and a key gene for citric acid accumulation in citrus fruits. Nat. Genet. 55, 1964–1975 (2023).

  90. Zhang, Z. et al. ParaAT: a parallel tool for constructing multiple protein-coding DNA alignments. Biochem. Biophys. Res. Commun. 419, 779–781 (2012).

  91. Potter, S. C. et al. HMMER web server: 2018 update. Nucleic Acids Res. 46, W200–W204 (2018).

  92. Li, P. et al. RGAugury: a pipeline for genome-wide prediction of resistance gene analogs (RGAs) in plants. BMC Genomics 17, 852 (2016).

  93. Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).

  94. Marçais, G. et al. MUMmer4: a fast and versatile genome alignment system. PLoS Comput. Biol. 14, e1005944 (2018).

  95. Goel, M., Sun, H., Jiao, W. B. & Schneeberger, K. SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies. Genome Biol. 20, 277 (2019).

  96. Li, X. et al. Large-scale gene expression alterations introduced by structural variation drive morphotype diversification in Brassica oleracea. Nat. Genet. 56, 517–529 (2024).

  97. Zhou, X. & Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 44, 821–824 (2012).

  98. Zhang, J. et al. Vacuum and co-cultivation agroinfiltration of (germinated) seeds results in tobacco rattle virus (TRV) mediated whole-plant virus-induced gene silencing (VIGS) in wheat and maize. Front. Plant Sci. 8, 393 (2017).

  99. Liu, N. et al. Comprehensive co-expression network reveals the fine-tuning of AsHSFA2c in balancing drought tolerance and growth in oat. Commun. Biol. 8, 393 (2025).

  100. Tian, T. et al. Arabidopsis FAR-RED ELONGATED HYPOCOTYL3 integrates age and light signals to negatively regulate leaf senescence. Plant Cell 32, 1574–1588 (2020).

  101. Liu, N. et al. A lncRNA fine-tunes salicylic acid biosynthesis to balance plant immunity and growth. Cell Host Microbe 30, 1124–1138 (2022).

  102. Zhao, X. et al. Global identification of Arabidopsis lncRNAs reveals the regulation of MAF4 by a natural antisense RNA. Nat. Commun. 9, 5056 (2018).

  103. Zhou, J. et al. Intronic heterochromatin prevents cryptic transcription initiation in Arabidopsis. Plant J. 101, 1185–1197 (2020).

  104. Li, Q. et al. DEAD-box helicases modulate dicing body formation in Arabidopsis. Sci. Adv. 7, eabc6266 (2021).

  105. Wang, K. et al. Author Correction: The gene TaWOX5 overcomes genotype dependency in wheat genetic transformation. Nat. Plants 8, 717–720 (2022).

  106. Liu, X. et al. Uncovering the transcriptional regulatory network involved in boosting wheat regeneration and transformation. Nat. Plants 9, 908–925 (2023).

  107. Yu, Y. et al. Enhancing wheat regeneration and genetic transformation through overexpression of TaLAX1. Plant Commun. 5, 100738 (2024).

  108. Ishida, Y., Tsunashima, M., Hiei, Y. & Komari, T. Wheat (Triticum aestivum L.) transformation using immature embryos. Methods Mol. Biol. 1223, 189–198 (2015).

  109. Said, M. et al. The Agropyron cristatum karyotype, chromosome structure and cross-genome homoeology as revealed by fluorescence in situ hybridization with tandem repeats and wheat single-gene probes. Theor. Appl. Genet. 131, 2213–2227 (2018).

  110. Xi, W. et al. New ND-FISH-positive oligo probes for identifying Thinopyrum chromosomes in wheat backgrounds. Int. J. Mol. Sci. 20, 2031 (2019).

  111. Zhang, H. et al. Super-pangenome analyses across 35 accessions of 23 Avena species highlight their complex evolutionary history and extensive genomic diversity. Zenodo https://doi.org/10.5281/zenodo.15761906 (2025).

Acknowledgements

This work was supported by the Natural Science Foundation of Hebei Province (grant nos. C2023201074 and C2021201048 to H.D.), the Young Elite Scientists Sponsorship Program by CAST (grant YESS20210080 to H.D.), the National Natural Science Foundation of China (32100500 to H.D.), open funds of the State Key Laboratory of Plant Environmental Resilience (grant no. SKLPERKF2406 to Z.G.), the Interdisciplinary Research Program of Natural Science of Hebei University (grant no. 513201422004 to H.D.) and Post-graduate’s Innovation Fund Project of Hebei University (grant no. HBU2025BS011 to H.Z.). We sincerely thank Y. Fan at Chengdu University, Y. Liu at the Dingxi Academy of Agricultural Sciences and B. Wu at the Institute of Crop Sciences, Chinese Academy of Agricultural Sciences for providing the oat accession resources. We also thank K. Wang at the Institute of Crop Sciences, Chinese Academy of Agricultural Sciences for providing the TaWOX5 plasmid.

Author information

Contributions

H.D. conceived and supervised the project; H.D., Q.H. and Z.G. designed the study. Yaru Wang, X.Z., Q.H., Yu Wang and H.Z. sequenced and processed the raw data. Q.H., H.Z., Yaru Wang, M.W. and X.Z. assembled and annotated the genome. H.Z., X.Z., Z.L., Q.Z. and L.X. performed phylogenetic and genome evolution analyses; Yaru Wang, Q.H. and H.Z. conducted the transcriptome analysis; H.Z., Yaru Wang, Q.H., X.Z. and J.L. conducted pangenome analyses. Q.H., Yaru Wang, T.L., W.L. and N.L. performed TE and SV analyses. W.L., J.L. and Q.H. performed the population genetics analysis. N.L., Q.H., Y.Y., T.L., Y.Q. and J.Y. conceived of and designed the experiments. H.D. composed the outline of the paper. Q.H., W.L. and H.Z. wrote the paper. H.D. revised the paper.

Corresponding authors

Correspondence to Zhizhong Gong, Qiang He or Huilong Du.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Mark Chapman and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Geographical distribution and phenotype diversity of cultivated and wild oat accessions.

a, Principal component analysis (PCA) of 1,079 globally collected hexaploid oat accessions. PC1 (32.14%) clearly separates wild oat accessions (G1) from cultivated oat accessions (G2-G6); PC2 (19.45%) clearly separates the G5 group of hulless oats from the other groups. b, Geographic distribution of the 17 diverse representative accessions among the 1,079 hexaploid oat accessions. The world map was generated using WorldClim (https://worldclim.org/), whose data are freely available for academic use (https://worldclim.org/about.html). The color of the points corresponds to Fig. 1a. c-e, The highly diverse agronomic phenotypes of representative wild oat accessions, including plant architecture (c), spike architecture (d) and spikelet (e).

Extended Data Fig. 2 Circos plot of the reference genome C0191 and assembly assessment of genomes in our study.

a, Circos plot showing chromosome-level features of the C0191 reference genome. Tracks represent (I) chromosome length, (II) GC content, (III) gene density, (IV) TE density, (V) LTR density, (VI) DNA TE density, (VII) other TE density and (VIII) syntenic blocks. b, BUSCO completeness assessment for the genome annotation of 35 oat accessions.

Extended Data Fig. 3 Phylogenetic relationships and karyotype evolution of Avena species.

a, Distribution of Ks for orthologue duplicates identified between two (sub)genomes to estimate divergence time. Ad (A. damascena), As (A. brevis), D (A. maroccana-D), B (A. vaviloviana-B). b, SNP-based phylogenetic tree constructed with A. longiglumis as the reference genome based on our resequencing data and published resequencing data. c, A specific FISH probe targeting the As subtypes can label A. hispanica, A. brevis and A. nuda, but not A. hirtula, as observed in cells at the metaphase of mitosis. Scale bars, 5 μm. d, The reconstructed ancestral karyotype of the Avena genus using rice as the outgroup. Gene families with expansion and contraction in each species and node are highlighted in blue and orange, respectively. The red pentagrams represent the position of the inferred ancestor of the A, B, C and D lineages. The arrows indicate that chromosomes 1 and 4 in the C subgenomes of tetraploids exhibited large-scale rearrangements. The red dashed boxes represent a large segment that is specific to the D subgenome of A. insularis but absent in that of A. maroccana and A. murphyi, and that was then translocated to the end of chromosome 2D during polyploidization and retained in the hexaploid oats. e, The density of SVs in C1071, Afat, Aste and Aocc with the C0191 genome as reference. The average sequence similarity between the subgenomes of C0191 and those of C1071, Afat, Aste and Aocc is shown in the heatmap. f, Proportions of contrasting gene tree topologies for 100 randomly selected single-copy genes (1,000 times) with regard to three major conflicting relationships. The values in the upper right corner of each tree show the percentage of windows that recovered the topology.
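
As a rough illustration of how a Ks peak such as those in panel a can be converted to a divergence time, the standard molecular-clock relation is sketched here. The legend does not state the substitution rate used, so the rate $r = 6.5 \times 10^{-9}$ substitutions per synonymous site per year (a value often applied to grasses) and the example peak $K_s = 0.13$ are assumptions for illustration only, not values from this study.

```latex
T = \frac{K_s}{2r}, \qquad
T = \frac{0.13}{2 \times 6.5 \times 10^{-9}\ \text{yr}^{-1}} = 1.0 \times 10^{7}\ \text{yr} = 10\ \text{Myr}
```

The factor of 2 reflects that synonymous substitutions accumulate along both lineages after they split.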

Extended Data Fig. 4 The divergent genome composition between A. agadiriana and AABB tetraploid species.

a, Unsupervised hierarchical clustering of differential 15-mers validates that A. vaviloviana, A. barbata and A. abyssinica are successfully phased into two subgenomes based on clearly distinct patterns, whereas the two subgenomes of A. agadiriana show higher sequence identity, making subgenome phasing relatively challenging and error-prone. b, The coalescent-based tree reconstructed from 100 randomly selected single-copy genes (1,000 times). The values in the upper right corner of each tree show the percentage of windows that recovered the topology. As (A. hispanica, A. brevis, A. atlantica, A. nuda). c, Intergenomic synteny among A. agadiriana, A. abyssinica, A. vaviloviana and A. barbata. Some large inversions and translocations between A. agadiriana and species with the AABB genome constitution are shown in blue. d, The average sequence similarity between the subgenomes of A. agadiriana and three AABB species, as well as other subgenomes. e, Distribution of synonymous divergence (Ks) for orthologous genes identified between each subgenome of A. agadiriana (Aaga-Ac’ and Aaga-As’) and other subgenomes.

Extended Data Fig. 5 Refined model for the origins, polyploidizations and evolutionary trajectories of Avena species.

a, Phylogenetic trees of 24 oat species with L. perenne and T. aestivum as outgroups, based on 26 chloroplast (cp) genomes and 7,137 mitochondrial (mt) SNP data sets. b–f, Divergence and polyploidization time estimation based on the 2,456 single-copy orthologous groups using MCMCTree, and distribution of synonymous divergence (Ks) for orthologue duplicates identified between two subgenomes.

Extended Data Fig. 6 Genetic diversity in cultivated and wild oats.

a, Venn diagram showing the numbers of common and unique gene families identified in panA, panC and panD of cultivated oats and panA, panB, panC and panD of wild oats. b, The distribution of CDS length in core, dispensable and private genes in cultivated and wild pangenomes. Cul-core (n = 1,460,017), Cul-dispensable (n = 394,369), Cul-private (n = 8,665), Wild-core (n = 776,159), Wild-dispensable (n = 553,189), Wild-private (n = 35,786). c, Proportion of genes with Pfam domains in core, dispensable and private genes in cultivated and wild oats. Blue histograms indicate genes with Pfam domain annotation; gray histograms indicate genes without Pfam domain annotation. d, Comparison of gene expression levels in core, dispensable and private genes in cultivated and wild oats. e–h, Comparison of TE coverage with a window of 5 Mb (e), complete BUSCOs (n = 17) (f), Ka/Ks (n = 10,000) (g) and NBS-encoding gene number (n = 17) (h) among the three subgenomes of hexaploid oats. i, The number of specific gene families derived from each wild oat species compared to cultivated oats. The pie chart represents the proportion of gene families unique to wild oats that are derived from different numbers of subgenomes. j, A candidate introgression region from A. maroccana. The sequence similarity and the SNP dataset both support a close relationship between A. maroccana and cultivated oats in this region.

Extended Data Fig. 7 Verification and analysis of structural variations.

a, HiFi reads were used to validate the borders of 6 deletions randomly selected in C0080 and C0298. b, Schematic diagrams showing the distribution of inversions between two oat genomes (x-axis: C0191). Red boxes represent the validated inversion events relative to C0191. c, Illustration of inversions identified between two oat genomes by Hi-C contact maps. The chromatin interaction heatmaps reveal inversion signals appearing after manual flipping. These maps support the inversions on chromosomes 7D, 3A, 5D and 3C. d, Ten insertions were randomly selected for validation by PCR amplification. Detailed information on the validation of the 10 SVs is presented in Supplementary Table 17. SV8, SV9 and SV10 came from other gels. Uncropped images are shown in Supplementary Figure 20. Different bands show the presence or absence of the SVs amplified in two oat genomes. The asterisk indicates the target band. The experiments were repeated independently three times, with similar results. e, Length distribution of deletions and insertions among 16 oat accessions. f, The haplotypes of chromosomes 4A, 1C, 2C and 1D among the C0355, C0298 and A. sterilis genomes indicate that the genotype of C0298 is similar to that of A. sterilis. g, Density of SV number per 100 bp in gene bodies and 3 kb flanking regions among 16 oat accessions.

Extended Data Fig. 8 The effects of SVs on the gene expression.

a, Phylogenetic tree of RNA-seq samples. b, Principal-component analysis (PCA) of the RNA-seq samples based on the FPKM of each sample. c, The relative expression percentage of each subgenome based on the 1:1:1 (1:1) gene triads (pairs) in 18 oat accessions. The relative expression of the different subgenomes of each accession sums to 100%. Error bars represent the SD of all RNA-seq samples of each accession. d, Comparison of average gene expression between SV genes and no-SV genes among all RNA-seq samples in 16 oat accessions. e, The number of SV genes with suppressed and promoted expression under normal growth conditions and four different stress treatments.
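As a minimal sketch of the quantity plotted in panel c, the snippet below computes each subgenome's share of triad expression per RNA-seq sample and then the mean and SD across samples of one accession. The function names and toy FPKM values are hypothetical, and the choice to sum FPKM over all triads before normalizing is an assumption; the legend does not state whether shares were computed per triad or on totals.

```python
from statistics import mean, stdev


def subgenome_shares(triads):
    """Percentage of total triad expression contributed by each subgenome.

    triads: list of dicts, one per 1:1:1 triad, mapping a subgenome label
    (e.g. "A", "C", "D") to that homoeolog's FPKM in a single sample.
    ASSUMPTION: FPKM is summed over triads, then normalized to 100%.
    """
    totals = {}
    for triad in triads:
        for subgenome, fpkm in triad.items():
            totals[subgenome] = totals.get(subgenome, 0.0) + fpkm
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no expression detected in any triad")
    return {sub: 100.0 * value / grand_total for sub, value in totals.items()}


def summarize_accession(samples):
    """Mean and SD of each subgenome's share across samples (needs >= 2)."""
    shares = [subgenome_shares(sample) for sample in samples]
    return {
        sub: (mean([s[sub] for s in shares]), stdev([s[sub] for s in shares]))
        for sub in sorted(shares[0])
    }


if __name__ == "__main__":
    # Two toy samples with one triad each; values are illustrative only.
    samples = [
        [{"A": 10.0, "C": 20.0, "D": 10.0}],
        [{"A": 30.0, "C": 30.0, "D": 40.0}],
    ]
    for sub, (avg, sd) in summarize_accession(samples).items():
        print(f"{sub}: {avg:.1f}% (SD {sd:.1f})")
```

Because each sample's shares sum to 100%, the per-subgenome means for an accession also sum to 100%.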

Extended Data Fig. 9 Gene expression analysis under different stress treatments.

a, Heatmap of gene expression levels under normal conditions and different stress treatments in both leaf and root. b, Functional analysis of differentially expressed genes from four different stress treatments. c, RT-qPCR detection of the A105905C025039.1 expression level under normal and drought conditions is shown on the left. Comparison of the expression level (FPKM) of A105905C025039.1 between the two haplotypes is shown on the right. d–f, RT-qPCR detection of A105902D033372.1, A105904C0105377.1 and A105902A0179076.1 expression levels in control and different stress treatments in two tissues is shown on the left (two-tailed Student’s t test). Error bars represent the SD of three biological replicates. Comparison of the expression levels (FPKM) of A105902D033372.1, A105904C0105377.1 and A105902A0179076.1 between the two haplotypes in two tissues under the corresponding stress treatment is shown on the right (two-sided Wilcoxon rank-sum test).

Extended Data Fig. 10 Functional validation of AsARF7.

a, The relationship among drought phenotype, SVs and gene expression. b, Expression heatmap of AsARF7 under ABA, drought and PEG6000 treatments. PEG600_S represents the aerial part after PEG6000 treatment. PEG600_R represents the root after PEG6000 treatment. c, The expression of AsARF7 in different tissues including root, stem, leaf, flag leaf, flower, fruit pod and seed. d, Detection of the AsARF7 expression level in knockdown mutant lines. e, Water loss rate of AsARF7 knockdown lines with three biological replicates. f, Chlorophyll content of AsARF7 knockdown lines under drought treatment. g, Nuclear run-on assay showing that the transcription of AsARF7 was decreased in PEG6000-treated plants. h, Genetic transformation in A. sativa. Upper panel: schematic representation of the pUBI::AsARF7-GFP vector with the UBI promoter and NOS terminator. Lower panel: detection of the GFP fluorescence signal in callus induced from mature embryos infected with the pUBI::AsARF7-GFP vector (left) and regeneration phenotypes of mature embryos infected with the pUBI::AsARF7-GFP vector (right). Scale bar, 1 cm (right). Error bars represent the SD of three (c,d,g) or five (f) biological replicates.

Supplementary information

Source data

Source Data Fig. 1: Statistical source data (XLSX).
Source Data Fig. 2: Statistical source data (XLSX).
Source Data Fig. 3: Statistical source data (XLSX).
Source Data Fig. 4: Statistical source data (XLSX).
Source Data Fig. 5: Statistical source data (XLSX).
Source Data Fig. 6: Statistical source data (XLSX).
Source Data Extended Data Fig. 2: Statistical source data (XLSX).
Source Data Extended Data Fig. 3: Statistical source data (XLSX).
Source Data Extended Data Fig. 4: Statistical source data (XLSX).
Source Data Extended Data Fig. 6: Statistical source data (XLSX).
Source Data Extended Data Fig. 7: Statistical source data (XLSX).
Source Data Extended Data Fig. 7: Unprocessed gels (PDF).
Source Data Extended Data Fig. 8: Statistical source data (XLSX).
Source Data Extended Data Fig. 9: Statistical source data (XLSX).
Source Data Extended Data Fig. 10: Statistical source data (XLSX).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, H., Liu, N., Wang, Y. et al. Super-pangenome analyses across 35 accessions of 23 Avena species highlight their complex evolutionary history and extensive genomic diversity. Nat Genet 57, 2276–2288 (2025). https://doi.org/10.1038/s41588-025-02294-z

  • DOI: https://doi.org/10.1038/s41588-025-02294-z
