Abstract
Common oat, belonging to the genus Avena with 30 recognized species, is a nutritionally important cereal crop and high-quality forage worldwide. Here, we construct a genus-level super-pangenome of Avena comprising 35 high-quality genomes from 14 cultivated oat accessions and 21 wild species. The fully resolved phylogenomic analysis unveils the origin and evolutionary scenario of Avena species, and the super-pangenome analysis identifies 26.62% of genes and 59.93% of haplotypes as specific to wild species. We delineate the landscape of structural variations (SVs) and the transcriptome profile based on 1,401 RNA-sequencing (RNA-seq) samples from diverse abiotic stress treatments in oat. We highlight the crucial role of SVs in modulating gene expression and shaping adaptation to diverse stresses. By further combining these data with SV-based genome-wide association studies (GWASs), we characterize 13 candidate genes associated with drought resistance, such as AsARF7, which is validated in transgenic oat lines. Our study provides unprecedented genomic resources to facilitate genomic, evolutionary and molecular breeding research in oat.
Data availability
The raw data (CRA015654 and CRA015559) and genome assemblies (GWHEROY00000000-GWHGDJH00000000.1) in this study have been deposited at the Chinese National Genomics Data Center (https://bigd.big.ac.cn/) under the BioProject accession number PRJCA024628. Raw sequencing data (SRR31832837–SRR31832882) and genome assemblies (JBKZJG000000000–JBKZJZ000000000 and JBKZKA000000000–JBKZKH000000000) have been deposited at the NCBI under the BioProject PRJNA1200805. Transcriptome sequencing data under normal growth (SRR31798633–SRR31798686 and SRR31814753–SRR31815091) have been deposited at the NCBI under the BioProject PRJNA1201252. Transcriptome sequencing data under different stresses (SRR31832975–SRR31833946) have been deposited at the NCBI under the BioProject PRJNA1203203. Whole-genome resequencing data of 1,078 hexaploid oat accessions were obtained from the OatOmics database (http://www.oatomics.com). Source data are provided with this paper.
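The run accessions above are given as inclusive ranges. As a minimal sketch of how such a range could be expanded into individual run IDs (for example, to build a download list), the helper below assumes that the runs in a range are numbered consecutively with no gaps, which the source does not confirm. The function name `expand_accession_range` is hypothetical and is not part of any database client.

```python
import re


def expand_accession_range(first: str, last: str) -> list[str]:
    """Expand an inclusive accession range, e.g. SRR31832837 to SRR31832882.

    Assumption: both endpoints share one alphabetic prefix and every integer
    between the two numeric parts is a valid run (gaps are not detected).
    """
    pattern = re.compile(r"^([A-Za-z]+)(\d+)$")
    m_first, m_last = pattern.match(first), pattern.match(last)
    if not m_first or not m_last:
        raise ValueError("accessions must be a letter prefix followed by digits")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("accession prefixes differ")
    start, stop = int(m_first.group(2)), int(m_last.group(2))
    if stop < start:
        raise ValueError("range end precedes range start")
    prefix = m_first.group(1)
    width = len(m_first.group(2))  # keep zero padding if the IDs use it
    return [f"{prefix}{n:0{width}d}" for n in range(start, stop + 1)]


# Raw sequencing runs listed under BioProject PRJNA1200805
runs = expand_accession_range("SRR31832837", "SRR31832882")
```

Each returned ID can then be passed to whichever download tool is preferred; the sketch itself performs no network access.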
Code availability
The scripts used in this study are available via GitHub (https://github.com/HongyuZhang-HBU/oat-pangenome) and Zenodo (ref. 111; https://doi.org/10.5281/zenodo.15761906).
References
Rasane, P., Jha, A., Sabikhi, L., Kumar, A. & Unnikrishnan, V. S. Nutritional advantages of oats and opportunities for its processing as value added foods — a review. J. Food Sci. Technol. 52, 662–675 (2015).
Shi, J., Tian, Z., Lai, J. & Huang, X. Plant pan-genomics and its applications. Mol. Plant 16, 168–186 (2023).
Kamal, N. et al. The mosaic oat genome gives insights into a uniquely healthy cereal crop. Nature 606, 113–119 (2022).
Peng, Y. et al. Reference genome assemblies reveal the origin and evolution of allohexaploid oat. Nat. Genet. 54, 1248–1258 (2022).
Liu, Q., Lin, L., Zhou, X., Peterson, P. M. & Wen, J. Unraveling the evolutionary dynamics of ancient and recent polyploidization events in Avena (Poaceae). Sci. Rep. 7, 41944 (2017).
Yan, H. et al. High-density marker profiling confirms ancestral genomes of Avena species and identifies D-genome chromosomes of hexaploid oat. Theor. Appl. Genet. 129, 2133–2149 (2016).
Loskutov, I. G., Gnutikov, A. A., Blinova, E. V. & Rodionov, A. V. The application of Vavilov’s approaches to the phylogeny and evolution of cultivated species of the genus Avena L. Vavilovskii Zhurnal Genet. Selektsii 27, 921–932 (2023).
Fu, Y. B. Oat evolution revealed in the maternal lineages of 25 Avena species. Sci. Rep. 8, 4252 (2018).
Liu, Q. et al. Comparative chloroplast genome analyses of Avena: insights into evolutionary dynamics and phylogeny. BMC Plant Biol. 20, 406 (2020).
Morikawa, T. & Nishihara, M. Genomic and polyploid evolution in genus Avena as revealed by RFLPs of repeated DNA sequences. Genes Genet. Syst. 84, 199–208 (2009).
Yan, H. H. et al. Phylogenetic analysis of the genus Avena based on chloroplast intergenic spacer psbA-trnH and single-copy nuclear gene Acc1. Genome 57, 267–277 (2014).
Liu, B. B. et al. Phylogenomic conflict analyses in the apple genus Malus s.l. reveal widespread hybridization and allopolyploidy driving diversification, with insights into the complex biogeographic history in the Northern Hemisphere. J. Integr. Plant Biol. 64, 1020–1043 (2022).
Meleshko, O. et al. Extensive genome-wide phylogenetic discordance is due to incomplete lineage sorting and not ongoing introgression in a rapidly radiated bryophyte genus. Mol. Biol. Evol. 38, 2750–2766 (2021).
Liu, S. Y. et al. An integrative framework reveals widespread gene flow during the early radiation of oaks and relatives in Quercoideae (Fagaceae). J. Integr. Plant Biol. 67, 1119–1141 (2025).
Li, W. et al. Plant pan-genomics: recent advances, new challenges, and roads ahead. J. Genet. Genomics 49, 833–846 (2022).
Tieman, D. et al. A chemical genetic roadmap to improved tomato flavor. Science 355, 391–394 (2017).
Zhuang, Y. et al. Phylogenomics of the genus Glycine sheds light on polyploid evolution and life-strategy transition. Nat. Plants 8, 233–244 (2022).
Yu, H. et al. A route to de novo domestication of wild allotetraploid rice. Cell 184, 1156–1170 (2021).
Li, N. et al. Super-pangenome analyses highlight genomic diversity and structural variation across wild and cultivated tomato species. Nat. Genet. 55, 852–860 (2023).
Tang, D. et al. Genome evolution and diversity of wild and cultivated potatoes. Nature 606, 535–541 (2022).
Shi, T. et al. The super-pangenome of Populus unveils genomic facets for its adaptation and diversification in widespread forest trees. Mol. Plant 17, 725–746 (2024).
Fominaya, A., Vega, C. & Ferrer, E. Giemsa C-banded karyotypes of Avena species. Genome 30, 627–632 (1988).
Latta, R. G., Bekele, W. A., Wight, C. P. & Tinker, N. A. Comparative linkage mapping of diploid, tetraploid, and hexaploid Avena species suggests extensive chromosome rearrangement in ancestral diploids. Sci. Rep. 9, 12298 (2019).
Yan, H. et al. Genome size variation in the genus Avena. Genome 59, 209–220 (2016).
Xiong, Z., Gaeta, R. T. & Pires, J. C. Homoeologous shuffling and chromosome compensation maintain genome balance in resynthesized allopolyploid Brassica napus. Proc. Natl Acad. Sci. USA 108, 7908–7913 (2011).
Chester, M. et al. Extensive chromosomal variation in a recently formed natural allopolyploid species, Tragopogon miscellus (Asteraceae). Proc. Natl Acad. Sci. USA 109, 1176–1181 (2012).
Peng, Y. et al. Phylogenetic relationships in the genus Avena based on the nuclear Pgk1 gene. PLoS ONE 13, e0200047 (2018).
Morikawa, T., Gushiken, Y. & Tsurukawa, N. Chromosomal diversity and morphological dimorphism in Moroccan wild oat, Avena agadiriana. Plant Syst. Evol. 281, 107–113 (2009).
Paczos-Grzęda, E. & Bednarek, P. Comparative analysis of hexaploid Avena species using REMAP and ISSR methods. Turk. J. Bot. 38, 1103–1111 (2014).
Loskutov, I. G. On evolutionary pathways of Avena species. Genet. Resour. Crop Evol. 55, 211–220 (2008).
Badaeva, E. D., Shelukhina, O. Y., Dedkova, O. S., Loskutov, I. G. & Pukhalskyi, V. A. Comparative cytogenetic analysis of hexaploid Avena L. species. Russ. J. Genet. 47, 691–702 (2011).
Ociepa, T. The oat gene pools—review about the use of wild species in improving cultivated oat/Pule genowe owsa—przeglad informacji na temat wykorzystania dzikich gatunkow w ulepszaniu owsa uprawnego. J. Cent. Eur. Agric. 20, 251–261 (2019).
Ladizinsky, G. & Fainstein, R. Domestication of the protein-rich tetraploid wild oats Avena magna and A. murphyi. Euphytica 26, 221–223 (1977).
Ladizinsky, G. Domestication via hybridization of the wild tetraploid oats Avena magna and A. murphyi. Theor. Appl. Genet. 91, 639–646 (1995).
Okoń, S., Ociepa, T., Nucia, A., Cieplak, M. & Kowalczyk, K. Is every wild species a rich source of disease resistance? Avena fatua L.—potential donor of resistance to powdery mildew. Plants 10, 560 (2021).
Zhai, C. et al. The isolation and characterization of Pik, a rice blast resistance gene which emerged after rice domestication. New Phytol. 189, 321–334 (2011).
Ramírez-González, R. H. et al. The transcriptional landscape of polyploid wheat. Science 361, eaar6089 (2018).
Ma, P.-F. et al. Genome assemblies of 11 bamboo species highlight diversification induced by dynamic subgenome dominance. Nat. Genet. 56, 710–720 (2024).
Qin, P. et al. Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations. Cell 184, 3542–3558 (2021).
Yan, H. et al. Pangenomic analysis identifies structural variation associated with heat tolerance in pearl millet. Nat. Genet. 55, 507–518 (2023).
Uga, Y. et al. Control of root system architecture by DEEPER ROOTING 1 increases rice yield under drought conditions. Nat. Genet. 45, 1097–1102 (2013).
Lo, S. F. et al. Ectopic expression of specific GA2 oxidase mutants promotes yield and stress tolerance in rice. Plant Biotechnol. J. 15, 850–864 (2017).
Wei, H., Wang, X., He, Y., Xu, H. & Wang, L. Clock component OsPRR73 positively regulates rice salt tolerance by modulating OsHKT2;1-mediated sodium homeostasis. EMBO J. 40, e105086 (2021).
Kim, J.-J. et al. Overexpression of a proton pumping gene OVP1 enhances salt stress tolerance, root growth and biomass yield by regulating ion balance in rice (Oryza sativa L.). Environ. Exp. Bot. 175, 104033 (2020).
An, X., Zhang, L., Luo, J., Zhao, S. & Jiao, T. Effects of oat hay content in diets on nutrient metabolism and the rumen microflora in sheep. Animals 10, 2341 (2020).
Katsiotis, A., Hagidimitriou, M. & Heslop-Harrison, J. S. The close relationship between the A and B genomes in Avena L. (Poaceae) determined by molecular cytogenetic analysis of total genomic, tandemly and dispersed repetitive DNA sequences. Ann. Bot. 79, 103–109 (1997).
Zhang, C. et al. High-quality genome of a modern soybean cultivar and resequencing of 547 accessions provide insights into the role of structural variation. Nat. Genet. 56, 2247–2258 (2024).
Li, W. et al. Genomic insights into the divergence between hulled and hulless oats. Cell Rep. 44, 116055 (2025).
He, Q. et al. The near-complete genome assembly of hexaploid wild oat reveals its genome evolution and divergence with cultivated oats. Nat. Plants 10, 2062–2078 (2024).
Li, Y. et al. Subtelomeric assembly of a multi-gene pathway for antimicrobial defense compounds in cereals. Nat. Commun. 12, 2563 (2021).
Maughan, P. J. et al. Genomic insights from the first chromosome-scale assemblies of oat (Avena spp.) diploid species. BMC Biol. 17, 92 (2019).
Li, H. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Li, H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27, 2987–2993 (2011).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Alexander, D. H., Novembre, J. & Lange, K. L. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).
Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 3, 99–101 (2016).
Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A. & Zdobnov, E. M. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol. Biol. Evol. 38, 4647–4654 (2021).
Ou, S. & Jiang, N. LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiol. 176, 1410–1422 (2018).
Robert Hubley, A. S. RepeatModeler Open-1.0 www.repeatmasker.org/RepeatModeler/ (2010).
Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268 (2007).
Ellinghaus, D., Kurtz, S. & Willhoeft, U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics 9, 18 (2008).
Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinformatics Chapter 4, 4.10.1–4.10.14 (2009).
Zhang, R.-G. et al. TEsorter: an accurate and fast method to classify LTR-retrotransposons in plant genomes. Hortic. Res. 9, uhac017 (2022).
Haas, B. J. et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512 (2013).
Haas, B. J. et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 31, 5654–5666 (2003).
Stanke, M. & Waack, S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics 19, ii215–ii225 (2003).
Gremme, G., Brendel, V., Sparks, M. E. & Kurtz, S. Engineering a software tool for gene structure prediction in higher organisms. Inf. Softw. Technol. 47, 965–978 (2005).
Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).
Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).
Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 9, R7 (2008).
Aramaki, T. et al. KofamKOALA: KEGG Ortholog assignment based on profile HMM and adaptive score threshold. Bioinformatics 36, 7 (2020).
Jia, K. H. et al. SubPhaser: a robust allopolyploid subgenome phasing method based on subgenome-specific k-mers. New Phytol. 235, 801–809 (2022).
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).
Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).
Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).
Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).
He, Q. et al. High-quality genome of allotetraploid Avena barbata provides insights into the origin and evolution of B subgenome in Avena. J. Integr. Plant Biol. 67, 1515–1532 (2025).
Han, M. V., Thomas, G. W., Lugo-Martinez, J. & Hahn, M. W. Estimating gene gain and loss rates in the presence of error in genome assembly and annotation using CAFE 3. Mol. Biol. Evol. 30, 1987–1997 (2013).
Sun, P. et al. WGDI: a user-friendly toolkit for evolutionary analyses of whole-genome duplications and ancestral karyotypes. Mol. Plant 15, 1841–1851 (2022).
Zhang, R.-G. et al. Subgenome-aware analyses suggest a reticulate allopolyploidization origin in three Papaver genomes. Nat. Commun. 14, 2204 (2023).
Doyle, J. & Egan, A. Dating the origins of polyploidy events. New Phytol. 186, 73–85 (2009).
Jin, J. J. et al. GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol. 21, 241 (2020).
Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, giab008 (2021).
Danecek, P. & McCarthy, S. BCFtools/csq: haplotype-aware variant consequences. Bioinformatics 33, 2037–2039 (2017).
Wang, D., Zhang, Y., Zhang, Z., Zhu, J. & Yu, J. KaKs_Calculator 2.0: a toolkit incorporating γ-series methods and sliding window strategies. Genomics Proteomics Bioinformatics 8, 77–80 (2010).
Zhang, Y. et al. Telomere-to-telomere Citrullus super-pangenome provides direction for watermelon breeding. Nat. Genet. 56, 1750–1761 (2024).
Huang, Y. et al. Pangenome analysis provides insight into the evolution of the orange subfamily and a key gene for citric acid accumulation in citrus fruits. Nat. Genet. 55, 1964–1975 (2023).
Zhang, Z. et al. ParaAT: a parallel tool for constructing multiple protein-coding DNA alignments. Biochem. Biophys. Res. Commun. 419, 779–781 (2012).
Potter, S. C. et al. HMMER web server: 2018 update. Nucleic Acids Res. 46, W200–W204 (2018).
Li, P. et al. RGAugury: a pipeline for genome-wide prediction of resistance gene analogs (RGAs) in plants. BMC Genomics 17, 852 (2016).
Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).
Marçais, G. et al. MUMmer4: a fast and versatile genome alignment system. PLoS Comput. Biol. 14, e1005944 (2018).
Goel, M., Sun, H., Jiao, W. B. & Schneeberger, K. SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies. Genome Biol. 20, 277 (2019).
Li, X. et al. Large-scale gene expression alterations introduced by structural variation drive morphotype diversification in Brassica oleracea. Nat. Genet. 56, 517–529 (2024).
Zhou, X. & Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 44, 821–824 (2012).
Zhang, J. et al. Vacuum and co-cultivation agroinfiltration of (germinated) seeds results in tobacco rattle virus (TRV) mediated whole-plant virus-induced gene silencing (VIGS) in wheat and maize. Front. Plant Sci. 8, 393 (2017).
Liu, N. et al. Comprehensive co-expression network reveals the fine-tuning of AsHSFA2c in balancing drought tolerance and growth in oat. Commun. Biol. 8, 393 (2025).
Tian, T. et al. Arabidopsis FAR-RED ELONGATED HYPOCOTYL3 integrates age and light signals to negatively regulate leaf senescence. Plant Cell 32, 1574–1588 (2020).
Liu, N. et al. A lncRNA fine-tunes salicylic acid biosynthesis to balance plant immunity and growth. Cell Host Microbe 30, 1124–1138 (2022).
Zhao, X. et al. Global identification of Arabidopsis lncRNAs reveals the regulation of MAF4 by a natural antisense RNA. Nat. Commun. 9, 5056 (2018).
Zhou, J. et al. Intronic heterochromatin prevents cryptic transcription initiation in Arabidopsis. Plant J. 101, 1185–1197 (2020).
Li, Q. et al. DEAD-box helicases modulate dicing body formation in Arabidopsis. Sci. Adv. 7, eabc6266 (2021).
Wang, K. et al. Author Correction: The gene TaWOX5 overcomes genotype dependency in wheat genetic transformation. Nat. Plants 8, 717–720 (2022).
Liu, X. et al. Uncovering the transcriptional regulatory network involved in boosting wheat regeneration and transformation. Nat. Plants 9, 908–925 (2023).
Yu, Y. et al. Enhancing wheat regeneration and genetic transformation through overexpression of TaLAX1. Plant Commun. 5, 100738 (2024).
Ishida, Y., Tsunashima, M., Hiei, Y. & Komari, T. Wheat (Triticum aestivum L.) transformation using immature embryos. Methods Mol. Biol. 1223, 189–198 (2015).
Said, M. et al. The Agropyron cristatum karyotype, chromosome structure and cross-genome homoeology as revealed by fluorescence in situ hybridization with tandem repeats and wheat single-gene probes. Theor. Appl. Genet. 131, 2213–2227 (2018).
Xi, W. et al. New ND-FISH-positive oligo probes for identifying Thinopyrum chromosomes in wheat backgrounds. Int. J. Mol. Sci. 20, 2031 (2019).
Zhang, H. et al. Super-pangenome analyses across 35 accessions of 23 Avena species highlight their complex evolutionary history and extensive genomic diversity. Zenodo https://doi.org/10.5281/zenodo.15761906 (2025).
Acknowledgements
This work was supported by the Natural Science Foundation of Hebei Province (grant nos. C2023201074 and C2021201048 to H.D.), the Young Elite Scientists Sponsorship Program by CAST (grant YESS20210080 to H.D.), the National Natural Science Foundation of China (32100500 to H.D.), open funds of the State Key Laboratory of Plant Environmental Resilience (grant no. SKLPERKF2406 to Z.G.), the Interdisciplinary Research Program of Natural Science of Hebei University (grant no. 513201422004 to H.D.) and Post-graduate’s Innovation Fund Project of Hebei University (grant no. HBU2025BS011 to H.Z.). We sincerely thank Y. Fan at Chengdu University, Y. Liu at the Dingxi Academy of Agricultural Sciences and B. Wu at the Institute of Crop Sciences, Chinese Academy of Agricultural Sciences for providing the oat accession resources. We also thank K. Wang at the Institute of Crop Sciences, Chinese Academy of Agricultural Sciences for providing the TaWOX5 plasmid.
Author information
Contributions
H.D. conceived and supervised the project; H.D., Q.H. and Z.G. designed the study. Yaru Wang, X.Z., Q.H., Yu Wang and H.Z. sequenced and processed the raw data. Q.H., H.Z., Yaru Wang, M.W. and X.Z. assembled and annotated the genome. H.Z., X.Z., Z.L., Q.Z. and L.X. performed phylogenetic and genome evolution analyses; Yaru Wang, Q.H. and H.Z. conducted the transcriptome analysis; H.Z., Yaru Wang, Q.H., X.Z. and J.L. conducted pangenome analyses. Q.H., Yaru Wang, T.L., W.L. and N.L. performed TE and SV analyses. W.L., J.L. and Q.H. performed the population genetics analysis. N.L., Q.H., Y.Y., T.L., Y.Q. and J.Y. conceived of and designed the experiments. H.D. composed the outline of the paper. Q.H., W.L. and H.Z. wrote the paper. H.D. revised the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Mark Chapman and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Geographical distribution and phenotype diversity of cultivated and wild oat accessions.
a, Principal component analysis (PCA) of 1,079 globally collected hexaploid oat accessions. PC1 (32.14%) clearly separates wild oat accessions (G1) from cultivated oat accessions (G2–G6); PC2 (19.45%) clearly separates the G5 group of hulless oats from the other groups. b, Geographic distribution of the 17 diverse representative accessions among the 1,079 hexaploid oat accessions. The world map was generated using WorldClim (https://worldclim.org/), whose data are freely available for academic use (https://worldclim.org/about.html). The color of the points corresponds to Fig. 1a. c–e, The highly diverse agronomic phenotypes of representative wild oat accessions, including plant architecture (c), spike architecture (d) and spikelet (e).
Extended Data Fig. 2 Circos plot of the reference genome C0191 and assembly assessment of genomes in our study.
a, Circos plot showing chromosome-level features of the C0191 reference genome. Tracks represent (I) chromosome length, (II) GC content, (III) gene density, (IV) TE density, (V) LTR density, (VI) DNA TE density, (VII) other TE density and (VIII) syntenic blocks. b, BUSCO completeness assessment for genome annotation of 35 oat accessions.
Extended Data Fig. 3 Phylogenetic relationships and karyotype evolution of Avena species.
a, Distribution of Ks for orthologue duplicates identified between two (sub)genomes to estimate divergence time. Ad (A. damascena), As (A. brevis), D (A. maroccana-D), B (A. vaviloviana-B). b, SNP-based phylogenetic tree constructed with A. longiglumis as the reference genome, based on our resequencing data and published resequencing data. c, A specific FISH probe targeting the As subtypes labels A. hispanica, A. brevis and A. nuda, but not A. hirtula, as observed in cells at mitotic metaphase. Scale bars, 5 μm. d, The reconstructed ancestral karyotype of the Avena genus using rice as the outgroup. Gene families with expansion and contraction in each species and node are highlighted in blue and orange, respectively. The red pentagrams represent the position of the inferred ancestor of the A, B, C and D lineages. The arrows indicate that chromosomes 1 and 4 in the C subgenomes of tetraploids exhibit large-scale rearrangements. The red dashed boxes mark a large segment that is specific to the D subgenome of A. insularis and absent from those of A. maroccana and A. murphyi; this segment was then translocated to the end of chromosome 2D during polyploidization and retained in hexaploid oats. e, The density of SVs in C1071, Afat, Aste and Aocc with the C0191 genome as reference. The average sequence similarity between the subgenomes of C0191 and those of C1071, Afat, Aste and Aocc is shown in the heatmap. f, Proportions of contrasting gene tree topologies for 100 randomly selected single-copy genes (repeated 1,000 times) with regard to three major conflicting relationships. The values in the upper right corner of each tree show the percentage of windows that recovered the topology.
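Panel a uses Ks distributions to estimate divergence time. One widely used conversion, which is not necessarily the exact procedure applied in this study, is T = Ks / (2r), where r is the synonymous substitution rate per site per year. The sketch below assumes that relation; the function name and the example rate are illustrative placeholders and are not taken from the paper.

```python
def divergence_time_years(ks_peak: float, rate_per_site_per_year: float) -> float:
    """Convert a Ks peak into a divergence time with T = Ks / (2 * r).

    The factor of 2 reflects that substitutions accumulate independently
    along both diverging lineages.
    """
    if ks_peak < 0:
        raise ValueError("Ks must be non-negative")
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    return ks_peak / (2.0 * rate_per_site_per_year)


# Placeholder rate for illustration only (assumption, not from the paper).
EXAMPLE_RATE = 6.5e-9

# Hypothetical Ks peak of 0.13 under the placeholder rate.
t_years = divergence_time_years(0.13, EXAMPLE_RATE)
```

Because the result scales inversely with r, any estimate obtained this way is only as reliable as the assumed substitution rate.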
Extended Data Fig. 4 The divergent genome composition between A. agadiriana and AABB tetraploid species.
a, Unsupervised hierarchical clustering of differential 15-mers validates that A. vaviloviana, A. barbata and A. abyssinica are successfully phased into two subgenomes with clearly distinct patterns, whereas the two subgenomes of A. agadiriana show higher sequence identity, making subgenome phasing relatively challenging and error-prone. b, The coalescent-based tree reconstructed from 100 randomly selected single-copy genes (repeated 1,000 times). The values in the upper right corner of each tree show the percentage of windows that recovered the topology. As (A. hispanica, A. brevis, A. atlantica, A. nuda). c, Intergenomic synteny among A. agadiriana, A. abyssinica, A. vaviloviana and A. barbata. Some large inversions and translocations between A. agadiriana and species with the AABB genome constitution are shown in blue. d, The average sequence similarity between the subgenomes of A. agadiriana and those of the three AABB species, as well as other subgenomes. e, Distribution of synonymous divergence (Ks) for orthologous genes identified between each subgenome of A. agadiriana (Aaga-Ac’ and Aaga-As’) and other subgenomes.
Extended Data Fig. 5 Refined model for the origins, polyploidizations and evolutionary trajectories of Avena species.
a, Phylogenetic trees of 24 oat species, with L. perenne and T. aestivum as outgroups, based on 26 chloroplast (cp) genomes and 7,137 mitochondrial (mt) SNP data sets. b–f, Divergence and polyploidization time estimation based on the 2,456 single-copy orthologous groups using MCMCTree, and distribution of synonymous divergence (Ks) for orthologue duplicates identified between two subgenomes.
Extended Data Fig. 6 Genetic diversity in cultivated and wild oats.
a, Venn diagram showing the numbers of common and unique gene families identified in panA, panC and panD of cultivated oats and panA, panB, panC and panD of wild oats. b, The distribution of CDS length in core, dispensable and private genes in cultivated and wild pangenomes. Cul-core (n = 1,460,017), Cul-dispensable (n = 394,369), Cul-private (n = 8,665), Wild-core (n = 776,159), Wild-dispensable (n = 553,189), Wild-private (n = 35,786). c, Proportion of genes with Pfam domains in core, dispensable and private genes in cultivated and wild oats. Blue histograms indicate genes with Pfam domain annotation; gray histograms indicate genes without Pfam domain annotation. d, Comparison of gene expression levels in core, dispensable and private genes in cultivated and wild oats. e–h, Comparison of TE coverage with a window of 5 Mb (e), complete BUSCOs (n = 17) (f), Ka/Ks (n = 10,000) (g) and NBS-encoding gene number (n = 17) (h) among the three subgenomes of hexaploid oats. i, The number of specific gene families derived from each wild oat compared to cultivated oats. The pie chart represents the proportion of gene families unique to wild oats that are derived from different numbers of subgenomes. j, A candidate introgression region from A. maroccana. The sequence similarity and the SNP dataset both support a close relationship between A. maroccana and cultivated oats in this region.
Extended Data Fig. 7 Verification and analysis of structural variations.
a, HiFi reads were used to validate the borders of 6 deletions randomly selected in C0080 and C0298. b, Schematic diagrams showing the distribution of inversions between two oat genomes (x axis: C0191). Red boxes represent the validated inversion events relative to C0191. c, Illustration of inversions identified between two oat genomes by Hi-C contact maps. Chromatin interaction heatmaps reveal inversion signals appearing after manual flipping. These maps support the inversions on chromosomes 7D, 3A, 5D and 3C. d, Ten insertions were randomly selected for validation by PCR amplification. Detailed information on the validation of the 10 SVs is presented in Supplementary Table 17. SV8, SV9 and SV10 came from other gels. Uncropped images are shown in Supplementary Figure 20. Different bands show the presence or absence of the SVs amplified in two oat genomes. The asterisk indicates the target band. The experiments were repeated independently three times, with similar results. e, Length distribution of deletions and insertions among 16 oat accessions. f, The haplotypes of chromosomes 4A, 1C, 2C and 1D among the C0355, C0298 and A. sterilis genomes indicate that the genotype of C0298 is similar to that of A. sterilis. g, Density of SV number per 100 bp in gene bodies and 3 kb flanking regions among 16 oat accessions.
Extended Data Fig. 8 The effects of SVs on gene expression.
a, Phylogenetic tree of RNA-seq samples. b, Principal component analysis (PCA) of the RNA-seq samples based on the FPKM of each sample. c, The relative expression percentage of each subgenome based on the 1:1:1 (1:1) gene triads (pairs) in 18 oat accessions. The relative expression of the different subgenomes of each accession sums to 100%. Error bars represent the SD of all RNA-seq samples of each accession. d, Comparison of average gene expression between SV genes and no-SV genes among all RNA-seq samples in 16 oat accessions. e, The number of SV genes with suppressed and promoted expression under normal growth and four different stress treatments.
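Panel c reports subgenome expression shares that sum to 100% per accession. A minimal sketch of one way such shares could be computed is given below: it pools expression values across 1:1:1 triads (or 1:1 pairs) and normalizes by the total. Pooling across triads is an assumption made here for illustration; the study may instead average per-triad proportions. The function name and the example FPKM values are hypothetical.

```python
from collections import defaultdict


def subgenome_expression_share(triads: list[dict[str, float]]) -> dict[str, float]:
    """Return the percentage of pooled expression contributed by each subgenome.

    Each element of `triads` maps a subgenome label (e.g. "A", "C", "D")
    to the expression value (e.g. FPKM) of its homoeolog in one triad.
    """
    totals: dict[str, float] = defaultdict(float)
    for triad in triads:
        for subgenome, value in triad.items():
            totals[subgenome] += value
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no expression detected in any triad")
    return {sub: 100.0 * value / grand_total for sub, value in totals.items()}


# Two made-up triads for one hexaploid sample (values are not real data).
shares = subgenome_expression_share([
    {"A": 10.0, "C": 5.0, "D": 5.0},
    {"A": 2.0, "C": 6.0, "D": 12.0},
])
```

For a tetraploid accession the same function applies to 1:1 pairs, since the labels are taken from the input rather than fixed in the code.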
Extended Data Fig. 9 Gene expression analysis under different stress treatments.
a, Heatmap of the expression levels of genes under normal conditions and different stress treatments in both leaf and root. b, Functional analysis of differentially expressed genes from four different stress treatments. c, RT-qPCR detection of the A105905C025039.1 expression level under normal and drought conditions is shown on the left. Comparison of the expression level (FPKM) of A105905C025039.1 between the two haplotypes is shown on the right. d–f, RT-qPCR detection of A105902D033372.1, A105904C0105377.1 and A105902A0179076.1 expression levels under control and different stress treatments in two tissues is shown on the left (two-tailed Student’s t test). Error bars represent the SD of three biological replicates. Comparison of the expression levels (FPKM) of A105902D033372.1, A105904C0105377.1 and A105902A0179076.1 between the two haplotypes in two tissues under the corresponding stress treatment is shown on the right (two-sided Wilcoxon rank-sum test).
Extended Data Fig. 10 Functional validation of AsARF7.
a, The relationship among drought phenotype, SVs and gene expression. b, The expression heatmap of AsARF7 under ABA, drought and PEG6000 treatment. PEG600_S represents the aerial part after PEG6000 treatment. PEG600_R represents the root part after PEG6000 treatment. c, The expression of AsARF7 in different tissues, including root, stem, leaf, flag leaf, flower, fruit pod and seed. d, Detection of the AsARF7 gene expression level in knockdown lines. e, Water loss rate of AsARF7 knockdown lines with three biological replicates. f, Chlorophyll content of AsARF7 knockdown lines under drought treatment. g, Nuclear run-on assay showing that the transcription of AsARF7 was decreased in PEG6000-treated plants. h, Genetic transformation in A. sativa. Top: schematic representation of the pUBI::AsARF7-GFP vector with the UBI promoter and NOS terminator. Bottom: detection of GFP fluorescence signal in callus induced from mature embryos infected with the pUBI::AsARF7-GFP vector (left). Regeneration phenotypes of mature embryos infected with the pUBI::AsARF7-GFP vector. Scale bar, 1 cm (right). Error bars represent the SD of three (c, d, g) or five (f) biological replicates.
Supplementary information
Supplementary Information
Supplementary Notes 1–9 and Figs. 1–19.
Supplementary Tables 1–26
Supplementary Tables 1–26.
Source data
Source Data Fig. 1
Statistical source data.
Source Data Fig. 2
Statistical source data.
Source Data Fig. 3
Statistical source data.
Source Data Fig. 4
Statistical source data.
Source Data Fig. 5
Statistical source data.
Source Data Fig. 6
Statistical source data.
Source Data Extended Data Fig. 2
Statistical source data.
Source Data Extended Data Fig. 3
Statistical source data.
Source Data Extended Data Fig. 4
Statistical source data.
Source Data Extended Data Fig. 6
Statistical source data.
Source Data Extended Data Fig. 7
Statistical source data.
Source Data Extended Data Fig. 7
Unprocessed gels.
Source Data Extended Data Fig. 8
Statistical source data.
Source Data Extended Data Fig. 9
Statistical source data.
Source Data Extended Data Fig. 10
Statistical source data.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, H., Liu, N., Wang, Y. et al. Super-pangenome analyses across 35 accessions of 23 Avena species highlight their complex evolutionary history and extensive genomic diversity. Nat Genet 57, 2276–2288 (2025). https://doi.org/10.1038/s41588-025-02294-z
DOI: https://doi.org/10.1038/s41588-025-02294-z