Pan-genomic analysis highlights genes associated with agronomic traits and enhances genomics-assisted breeding in alfalfa

Abstract

Alfalfa (Medicago sativa L.), a globally important forage crop, is valued for its high nutritional quality and nitrogen-fixing capacity. Here, we present a high-quality pan-genome constructed from 24 diverse alfalfa accessions, encompassing a wide range of genetic backgrounds. This comprehensive analysis identified 433,765 structural variations and characterized 54,002 pan-gene families, highlighting the pivotal role of genomic diversity in alfalfa domestication and adaptation. Key structural variations associated with salt tolerance and quality traits were discovered, with functional analysis implicating genes such as MsMAP65 and MsGA3ox1. Notably, overexpression of MsGA3ox1 led to a reduced stem–leaf ratio and enhanced forage quality. The integration of genomic selection and marker-assisted breeding strategies improved genomic estimated breeding values across multiple traits, offering valuable genomic resources for advancing alfalfa breeding. These findings provide insights into the genetic basis of important agronomic traits and establish a solid foundation for future crop improvement.

Fig. 1: Layout of the alfalfa graph pan-genome study.
Fig. 2: Distribution and diversity of representative alfalfa accessions.
Fig. 3: Detection of SVs and construction of the pan-genome based on the 24 de novo assembled alfalfa genomes.
Fig. 4: Functional impact of SV in alfalfa leaf morphology under salt stress.
Fig. 5: Functional validation of a key gene identified by pan-genomic analysis of SLR phenotype using SNP-GWAS and SV-GWAS.
Fig. 6: GWAS and genomic prediction accuracies using SV and SNP markers across 54 phenotypic traits.

Data availability

The raw sequencing data have been deposited in the NCBI database under BioProject accession code PRJNA1197171. The haploid reference genome is derived from a previously published study (ref. 5). The assembled data have been deposited in the NCBI database under BioProject accession code PRJNA1220045. Additionally, the data are available via Zenodo at https://doi.org/10.5281/zenodo.14118213 (ref. 85) and via Figshare at https://figshare.com/articles/dataset/Alfalfa/28426967 (ref. 86). The resequencing data used in this study were obtained from Zhang et al. and are provided in the published article (ref. 45). The RNA-sequencing data from this study have been deposited in the NCBI database under BioProject accession code PRJNA1083622. The phenotypes used in the GWAS and GS studies are available via Zenodo at https://doi.org/10.5281/zenodo.14869063 (ref. 87).

Code availability

All code associated with this project is available via GitHub at https://github.com/hefei0609-afk/Alfalfa and via Zenodo at https://doi.org/10.5281/zenodo.14800545 (ref. 88).

References

  1. Annicchiarico, P., Barrett, B., Brummer, E. C., Julier, B. & Marshall, A. H. Achievements and challenges in improving temperate perennial forage legumes. Crit. Rev. Plant Sci. 34, 327–380 (2015).

  2. Shen, C. et al. The chromosome-level genome sequence of the autotetraploid alfalfa and resequencing of core germplasms provide genomic resources for alfalfa research. Mol. Plant 13, 1250–1261 (2020).

  3. Li, X. & Brummer, E. C. Applied genetics and genomics in alfalfa breeding. Agronomy 2, 40–61 (2012).

  4. Chen, H. et al. Allele-aware chromosome-level genome assembly and efficient transgene-free genome editing for the autotetraploid cultivated alfalfa. Nat. Commun. 11, 2494 (2020).

  5. Long, R. et al. Genome assembly of alfalfa cultivar Zhongmu-4 and identification of SNPs associated with agronomic traits. Genomics Proteomics Bioinformatics 20, 14–28 (2022).

  6. Jayakodi, M., Schreiber, M., Stein, N. & Mascher, M. Building pan-genome infrastructures for crop plants and their use in association genetics. DNA Res. 28, dsaa030 (2021).

  7. Pang, A. W. et al. Towards a comprehensive structural variation map of an individual human genome. Genome Biol. 11, R52 (2010).

  8. Zhang, Z. et al. Genome-wide mapping of structural variations reveals a copy number variant that determines reproductive morphology in cucumber. Plant Cell 27, 1595–1604 (2015).

  9. Zhou, Y. et al. The population genetics of structural variants in grapevine domestication. Nat. Plants 5, 965–979 (2019).

  10. Saxena, R. K., Edwards, D. & Varshney, R. K. Structural variations in plant genomes. Brief. Funct. Genomics 13, 296–307 (2014).

  11. Gabur, I., Chawla, H. S., Snowdon, R. J. & Parkin, I. A. Connecting genome structural variation with complex traits in crop plants. Theor. Appl. Genet. 132, 733–750 (2019).

  12. Chen, S. et al. Gene mining and genomics-assisted breeding empowered by the pangenome of tea plant Camellia sinensis. Nat. Plants 9, 1986–1999 (2023).

  13. Gaut, B. S., Seymour, D. K., Liu, Q. & Zhou, Y. Demography and its effects on genomic variation in crop domestication. Nat. Plants 4, 512–520 (2018).

  14. Wellenreuther, M., Mérot, C., Berdan, E. & Bernatchez, L. Going beyond SNPs: the role of structural genomic variants in adaptive evolution and species diversification. Mol. Ecol. 28, 1203–1209 (2019).

  15. Huang, K. & Rieseberg, L. H. Frequency, origins, and evolutionary role of chromosomal inversions in plants. Front. Plant Sci. 11, 296 (2020).

  16. Kirkpatrick, M. & Barton, N. Chromosome inversions, local adaptation and speciation. Genetics 173, 419–434 (2006).

  17. Zhang, X. et al. Haplotype-resolved genome assembly provides insights into evolutionary history of the tea plant Camellia sinensis. Nat. Genet. 53, 1250–1259 (2021).

  18. Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).

  19. Ou, S., Chen, J. & Jiang, N. Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res. 46, e126 (2018).

  20. Alonge, M. et al. Automated assembly scaffolding using RagTag elevates a new tomato system for high-throughput genome editing. Genome Biol. 23, 258 (2022).

  21. Li, A. et al. A chromosome-scale genome assembly of a diploid alfalfa, the progenitor of autotetraploid alfalfa. Hortic. Res. 7, 194 (2020).

  22. Finn, R. D. et al. Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230 (2014).

  23. Cantalapiedra, C. P., Hernández-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829 (2021).

  24. Zhou, S., Chen, Q., Li, X. & Li, Y. MAP65-1 is required for the depolymerization and reorganization of cortical microtubules in the response to salt stress in Arabidopsis. Plant Sci. 264, 112–121 (2017).

  25. Liang, M. et al. Comprehensive analyses of microtubule-associated protein MAP65 family genes in Cucurbitaceae and CsaMAP65s expression profiles in cucumber. J. Appl. Genet. 64, 393–408 (2023).

  26. Dwiningsih, Y. & Al-Kahtani, J. Genome-wide association study of complex traits in maize detects genomic regions and genes for increasing grain yield and grain quality. Adv. Sustain. Sci. Eng. Technol. 4, 0220209 (2022).

  27. Liu, R. et al. GWAS analysis and QTL identification of fiber quality traits and yield components in upland cotton using enriched high-density SNP markers. Front. Plant Sci. 13, 1067 (2018).

  28. Kephart, K. D., Buxton, D. & Hill, R. Jr Digestibility and cell‐wall components of alfalfa following selection for divergent herbage lignin concentration. Crop Sci. 30, 207–212 (1990).

  29. Han, R.-H., Lu, X.-S., Gao, G.-J. & Yang, X.-J. Analysis of the principal components and the subordinate function of alfalfa drought resistance. Acta Agrestia Sin. 14, 142 (2006).

  30. Reinecke, D. M. et al. Gibberellin 3-oxidase gene expression patterns influence gibberellin biosynthesis, growth, and development in pea. Plant Physiol. 163, 929–945 (2013).

  31. Wu, H., Bai, B., Lu, X. & Li, H. A gibberellin-deficient maize mutant exhibits altered plant height, stem strength and drought tolerance. Plant Cell Rep. 42, 1687–1699 (2023).

  32. Ameur, A. Goodbye reference, hello genome graphs. Nat. Biotechnol. 37, 866–868 (2019).

  33. Garrison, E. et al. Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nat. Biotechnol. 36, 875–879 (2018).

  34. Qin, P. et al. Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations. Cell 184, 3542–3558.e16 (2021).

  35. Alonge, M. et al. Major impacts of widespread structural variation on gene expression and crop improvement in tomato. Cell 182, 145–161.e23 (2020).

  36. He, Q. et al. A graph-based genome and pan-genome variation of the model plant Setaria. Nat. Genet. 55, 1232–1242 (2023).

  37. Huang, Y. et al. Pangenome analysis provides insight into the evolution of the orange subfamily and a key gene for citric acid accumulation in citrus fruits. Nat. Genet. 55, 1964–1975 (2023).

  38. Liu, Y. et al. Pan-genome of wild and cultivated soybeans. Cell 182, 162–176.e13 (2020).

  39. Hu, J. et al. Potential sites of bioactive gibberellin production during reproductive growth in Arabidopsis. Plant Cell 20, 320–336 (2008).

  40. Sun, H. et al. Gibberellins inhibit flavonoid biosynthesis and promote nitrogen metabolism in Medicago truncatula. Int. J. Mol. Sci. 22, 9291 (2021).

  41. Dalmadi, Á. et al. Dwarf plants of diploid Medicago sativa carry a mutation in the gibberellin 3-β-hydroxylase gene. Plant Cell Rep. 27, 1271–1279 (2008).

  42. Israelsson, M., Mellerowicz, E., Chono, M., Gullberg, J. & Moritz, T. Cloning and overproduction of gibberellin 3-oxidase in hybrid aspen trees. Effects on gibberellin homeostasis and development. Plant Physiol. 135, 221–230 (2004).

  43. Zheng, L. et al. From model to alfalfa: gene editing to obtain semidwarf and prostrate growth habits. Crop J. 10, 932–941 (2022).

  44. He, X. et al. Accuracy of genomic selection for alfalfa biomass yield in two full-sib populations. Front. Plant Sci. 13, 1037272 (2022).

  45. Zhang, F. et al. Evolutionary genomics of climatic adaptation and resilience to climate change in alfalfa. Mol. Plant 17, 867–883 (2024).

  46. Li, H. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  47. Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  48. Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).

  49. Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).

  50. Alexander, D. H. & Lange, K. Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinformatics 12, 246 (2011).

  51. Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).

  52. Zhang, X., Zhang, S., Zhao, Q., Ming, R. & Tang, H. Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data. Nat. Plants 5, 833–845 (2019).

  53. Zhang, J. et al. Allele-defined genome of the autopolyploid sugarcane Saccharum spontaneum L. Nat. Genet. 50, 1565–1573 (2018).

  54. Wu, T. D. & Watanabe, C. K. GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics 21, 1859 (2005).

  55. Tang, H. et al. An improved genome release (Version Mt4.0) for the model legume Medicago truncatula. BMC Genomics 15, 312 (2014).

  56. Wang, Y. et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 40, e49 (2012).

  57. Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).

  58. Guan, D. et al. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics 36, 2896–2898 (2020).

  59. Zhao, X. & Hao, W. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268 (2007).

  60. Ellinghaus, D., Kurtz, S. & Willhoeft, U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics 9, 18 (2008).

  61. Ou, S. & Jiang, N. LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiol. 176, 1410–1422 (2018).

  62. Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinformatics 5, 4.10.11–14.10.14 (2004).

  63. Brůna, T., Hoff, K. J., Lomsadze, A., Stanke, M. & Borodovsky, M. BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. NAR Genom. Bioinform. 3, lqaa108 (2021).

  64. Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439 (2006).

  65. Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240 (2014).

  66. Ou, S. et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 20, 275 (2019).

  67. Su, W., Gu, X. & Peterson, T. TIR-Learner, a new ensemble method for TIR transposable element annotation, provides evidence for abundant new transposable elements in the maize genome. Mol. Plant 12, 447–460 (2019).

  68. Xiong, W. et al. HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes. Proc. Natl Acad. Sci. USA 111, 10263–10268 (2014).

  69. Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl Acad. Sci. USA 117, 9451–9457 (2020).

  70. Lavigne, R., Seto, D., Mahadevan, P., Ackermann, H.-W. & Kropinski, A. M. Unifying classical and molecular taxonomic classification: analysis of the Podoviridae using BLASTP-based tools. Res. Microbiol. 159, 406–414 (2008).

  71. Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).

  72. Wang, D., Zhang, Y., Zhang, Z., Zhu, J. & Yu, J. KaKs_Calculator 2.0: a toolkit incorporating gamma-series methods and sliding window strategies. Genomics Proteomics Bioinformatics 8, 77–80 (2010).

  73. Wang, D.-P., Wan, H.-L., Zhang, S. & Yu, J. γ-MYN: a new algorithm for estimating Ka and Ks with consideration of variable substitution rates. Biol. Direct 4, 20 (2009).

  74. Sedlazeck, F. J. et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat. Methods 15, 461–468 (2018).

  75. Heller, D. & Vingron, M. SVIM: structural variant identification using mapped long reads. Bioinformatics 35, 2907–2915 (2019).

  76. Jiang, T. et al. Long-read-based human genomic structural variation detection with cuteSV. Genome Biol. 21, 189 (2020).

  77. Jeffares, D. C. et al. Transient structural variations have strong effects on quantitative traits and reproductive isolation in fission yeast. Nat. Commun. 8, 14061 (2017).

  78. Marçais, G. et al. MUMmer4: a fast and versatile genome alignment system. PLoS Comput. Biol. 14, e1005944 (2018).

  79. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).

  80. Zadeh, L. A. Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets Syst. 1, 3–28 (1978).

  81. VanRaden, P. M. Efficient methods to compute genomic predictions. J. Dairy Sci. 91, 4414–4423 (2008).

  82. Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).

  83. Fu, C., Hernandez, T., Zhou, C. & Wang, Z.-Y. Alfalfa (Medicago sativa L.). Methods Mol. Biol. 1223, 213–221 (2015).

  84. Abràmoff, M. D., Magalhães, P. J. & Ram, S. J. Image processing with ImageJ. Biophotonics Int. 11, 36–42 (2004).

  85. He, F. Pan-genomic analysis highlights genes associated with agronomic traits and enhances genomics-assisted breeding in alfalfa. Zenodo https://doi.org/10.5281/zenodo.14118212 (2024).

  86. He, F. Alfalfa. Figshare https://doi.org/10.6084/m9.figshare.28426967.v1 (2025).

  87. Fei, H. Alfalfa. Zenodo https://doi.org/10.5281/zenodo.14869062 (2025).

  88. Fei, H. Alfalfa pan-genome. Zenodo https://doi.org/10.5281/zenodo.14800544 (2025).

Acknowledgements

This work was supported by the China Agriculture Research System of MOF and MARA (grant no. CARS-34 to Q.Y.), the Biological Breeding-National Science and Technology Major Project (grant no. 2022ZD04011 to R.L.), the Key Projects in Science and Technology of Inner Mongolia (grant no. 2021ZD0031 to R.L.) and the Agricultural Science and Technology Innovation Program of CAAS (grant no. ASTIP-IAS14 to Q.Y.).

Author information

Contributions

Q.Y., R.L. and X.Z. designed this project and coordinated the research activities. F.Z., J.K., H.L., L.C., Xianyang Li, M.L., X.W., X.J., B.S., M.X. and Y.L. collected and provided plant materials. F.Z., R.L. and X.Z. participated in the genome sequencing and resequencing. S.C., S.Q. and K.C. assembled the genomes. S.C., W.K., Q.Z., K.C. and S.Q. performed the gene annotation. S.C. and F.H. analyzed RNA-seq data. F.H. constructed the sequence and gene-based pan-genome. F.Z., S.C. and F.H. contributed to population GWAS analysis. Y.Z. performed functional verification. X.H., Xiao Li and T.Z. conducted a whole-genome selection analysis. F.H., S.C., X.Z., R.L. and Q.Y. interpreted the data and contributed to the manuscript writing.

Corresponding authors

Correspondence to Xingtan Zhang, Ruicai Long or Qingchuan Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Eric von Wettberg and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Population structure and Fixation Index of the global alfalfa diversity panel.

a, Population structure of the alfalfa panel, inferred assuming three subpopulations (K = 3). Each color represents a different subpopulation. b, Word cloud of the primary countries of origin for alfalfa varieties in Group1, Group2 and Group3. Font size represents the relative proportion of varieties from each country. Group1 is predominantly from the United States, Group2 from China and Group3 from Turkey, with contributions from other countries as well. c, PCA scatter plot showing the distribution of accessions along PC1 and PC2, with different colors representing different groups (Group1, Group2, Group3). d, Fixation index (FST) values among Group1, Group2 and Group3 alfalfa accessions.

Extended Data Fig. 2 Genomic structural variations (SVs) between species of alfalfa.

a, Chromosomes. b–h, Distributions of repeat density, gene density, SNP/indel density, deletions, insertions, duplications and inversions, respectively.

Extended Data Fig. 3 Genome-wide association study (GWAS) for monosaccharide content, In Vitro True Dry Matter Degradability at 24 h (IVTDMD24), and In Vitro True Dry Matter Degradability at 30 h (IVTDMD30).

a, c, e, Manhattan and QQ plots of the GWAS results for monosaccharide, IVTDMD24 and IVTDMD30, respectively, using structural variation (SV) markers. b, d, f, Manhattan and QQ plots for the same traits using single-nucleotide polymorphism (SNP) markers. The red dashed line indicates the Bonferroni-corrected genome-wide significance threshold (α = 0.05/n, where n is the total number of independent SNPs and effective SVs). g, i, k, Scatter plots of the peak structural variations on chromosome 1 for the three traits, with the horizontal line marking the Bonferroni-corrected genome-wide significance threshold. h, j, l, Boxplots of the three traits across accessions, grouped by the alleles they carry. The sample sizes for the REF and ALT groups are 171 and 5, respectively. In boxplots, the 25% and 75% quartiles are shown as the lower and upper edges of boxes, respectively, and central lines denote the median. The whiskers extend to 1.5 times the interquartile range. P values were computed using a two-tailed Student’s t-test.
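As a minimal sketch of the threshold described in this caption, the Bonferroni-corrected cutoff α = 0.05/n and its height on a Manhattan plot (−log10 scale) can be computed as follows. The marker count used here is a hypothetical placeholder, not a value reported in the study, and the function names are illustrative only.

```python
import math


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Return the per-test significance threshold alpha / n."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


def neg_log10(p_value: float) -> float:
    """Convert a P value to the -log10 scale used on Manhattan plots."""
    return -math.log10(p_value)


# Hypothetical number of independent SNPs and effective SVs (assumption).
n_markers = 1_000_000

threshold = bonferroni_threshold(n_markers)
line_height = neg_log10(threshold)

print(f"threshold = {threshold:.2e}, -log10(threshold) = {line_height:.2f}")
```

A marker would be called genome-wide significant in this scheme when its P value falls below `threshold`, which is equivalent to its −log10(P) lying above `line_height`.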

Extended Data Fig. 4 Impact of MsGA3ox1 overexpression on alfalfa morphology traits.

a, Comparison between WT alfalfa plants and overexpression lines (OE3, OE7 and OE12). b–e, Quantitative measurements of MsGA3ox1 expression levels, plant height, SLR and biomass. f, Photographs of leaves from WT, OE3, OE7 and OE12 lines at the 3rd, 4th and 5th stem nodes. g–i, Comparative assessments of leaf area, leaf length and leaf width between WT and MsGA3ox1 overexpression lines as shown in f. j, Comparison of WT and MsGA3ox1 overexpression lines in the number of trifoliolate leaves. The scale bar represents 5 cm. Asterisks denote statistical significance, with ‘*’, ‘**’ and ‘***’ indicating P < 0.05, P < 0.01 and P < 0.001, respectively. Data are presented as means ± SEM, with three independent experimental replicates for panel b, six independent experimental replicates for panels c, d, e and j, and nine independent experimental replicates for panels g, h and i. The control group (WT) is the Zhongmu No.1 variety of Medicago sativa L.
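For readers unfamiliar with the "means ± SEM" convention used in these captions, the following sketch shows how the two quantities are conventionally computed from replicate measurements, with SEM taken as the sample standard deviation divided by √n. The replicate values are invented for illustration and are not data from the study.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed to estimate SEM")
    mean = statistics.fmean(values)
    # statistics.stdev uses the n - 1 (sample) denominator.
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


# Hypothetical replicate measurements for one line (assumption).
replicates = [18.0, 20.0, 22.0, 24.0]

mean, sem = mean_sem(replicates)
print(f"{mean:.2f} ± {sem:.2f}")
```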

Extended Data Fig. 5 Phenotypic characterization of alfalfa quality traits in MsGA3ox1 overexpression lines.

The bar graphs compare crude protein (CP) (a), acid detergent fiber (ADF) (b), neutral detergent fiber (NDF) (c), lignin content (d), total digestible nutrients (TDN) (e) and net energy for gain (NEg) (f) between WT and overexpression lines OE1 and OE3. Asterisks denote levels of statistical significance compared to WT (*P < 0.05, **P < 0.01, ***P < 0.001). Data are presented as means ± SEM, with four biological replicates per group. The control group (WT) is the Zhongmu No.1 variety of Medicago sativa L.

Supplementary information

Supplementary Information

Supplementary Figs. 1–7 and Tables 1–10.

Reporting Summary

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

He, F., Chen, S., Zhang, Y. et al. Pan-genomic analysis highlights genes associated with agronomic traits and enhances genomics-assisted breeding in alfalfa. Nat Genet 57, 1262–1273 (2025). https://doi.org/10.1038/s41588-025-02164-8

  • DOI: https://doi.org/10.1038/s41588-025-02164-8
