Pan-genomic analysis highlights genes associated with agronomic traits and enhances genomics-assisted breeding in alfalfa

He, Fei; Chen, Shuai; Zhang, Yangyang; Chai, Kun; Zhang, Qing; Kong, Weilong; Qu, Shenyang; Chen, Lin; Zhang, Fan; Li, Mingna; Wang, Xue; Lv, Huigang; Zhang, Tiejun; He, Xiaofan; Li, Xiao; Li, Yajing; Li, Xianyang; Jiang, Xueqian; Xu, Ming; Sod, Bilig; Kang, Junmei; Zhang, Xingtan; Long, Ruicai; Yang, Qingchuan

doi:10.1038/s41588-025-02164-8

Article
Published: 23 April 2025

Fei He¹,
Shuai Chen² (ORCID: orcid.org/0000-0002-6861-2682),
Yangyang Zhang¹,
Kun Chai²,
Qing Zhang³ (ORCID: orcid.org/0000-0002-1160-2715),
Weilong Kong²,
Shenyang Qu²,
Lin Chen¹,
Fan Zhang¹,
Mingna Li¹,
Xue Wang¹,
Huigang Lv¹,
Tiejun Zhang⁴,
Xiaofan He⁴,
Xiao Li⁴,
Yajing Li¹,
Xianyang Li¹,
Xueqian Jiang¹,
Ming Xu¹,
Bilig Sod¹,
Junmei Kang¹,
Xingtan Zhang² (ORCID: orcid.org/0000-0002-5207-0882),
Ruicai Long¹ (ORCID: orcid.org/0000-0001-6920-2979) &
Qingchuan Yang¹ (ORCID: orcid.org/0000-0002-5926-9798)

Nature Genetics volume 57, pages 1262–1273 (2025)

Abstract

Alfalfa (Medicago sativa L.), a globally important forage crop, is valued for its high nutritional quality and nitrogen-fixing capacity. Here, we present a high-quality pan-genome constructed from 24 diverse alfalfa accessions, encompassing a wide range of genetic backgrounds. This comprehensive analysis identified 433,765 structural variations and characterized 54,002 pan-gene families, highlighting the pivotal role of genomic diversity in alfalfa domestication and adaptation. Key structural variations associated with salt tolerance and quality traits were discovered, with functional analysis implicating genes such as MsMAP65 and MsGA3ox1. Notably, overexpression of MsGA3ox1 led to a reduced stem–leaf ratio and enhanced forage quality. The integration of genomic selection and marker-assisted breeding strategies improved genomic estimated breeding values across multiple traits, offering valuable genomic resources for advancing alfalfa breeding. These findings provide insights into the genetic basis of important agronomic traits and establish a solid foundation for future crop improvement.

**Fig. 1: Layout of the alfalfa graph pan-genome study.**

**Fig. 2: Distribution and diversity of representative alfalfa accessions.**

**Fig. 3: Detection of SVs and construction of the pan-genome based on the 24 de novo assembled alfalfa genomes.**

**Fig. 4: Functional impact of SV in alfalfa leaf morphology under salt stress.**

**Fig. 5: Functional validation of a key gene identified by pan-genomic analysis of SLR phenotype using SNP-GWAS and SV-GWAS.**

**Fig. 6: GWAS and genomic prediction accuracies using SV and SNP markers across 54 phenotypic traits.**

Data availability

The raw sequencing data have been deposited in the NCBI database under BioProject accession code PRJNA1197171. The haploid reference genome is derived from a previously published study⁵. The assembled data have been deposited in the NCBI database under BioProject accession code PRJNA1220045. Additionally, the data are available via Zenodo at https://doi.org/10.5281/zenodo.14118213 (ref. ⁸⁵) and via Figshare at https://figshare.com/articles/dataset/Alfalfa/28426967 (ref. ⁸⁶). The resequencing data used in this study were obtained from a previously published study by Zhang et al.⁴⁵. The RNA-seq data from this study have been deposited in the NCBI database under BioProject accession code PRJNA1083622. The phenotypes used in the GWAS and GS analyses are available via Zenodo at https://doi.org/10.5281/zenodo.14869063 (ref. ⁸⁷).

Code availability

All code associated with this project is available via GitHub at https://github.com/hefei0609-afk/Alfalfa and via Zenodo at https://doi.org/10.5281/zenodo.14800545 (ref. ⁸⁸).

References

1. Annicchiarico, P., Barrett, B., Brummer, E. C., Julier, B. & Marshall, A. H. Achievements and challenges in improving temperate perennial forage legumes. Crit. Rev. Plant Sci. 34, 327–380 (2015).
2. Shen, C. et al. The chromosome-level genome sequence of the autotetraploid alfalfa and resequencing of core germplasms provide genomic resources for alfalfa research. Mol. Plant 13, 1250–1261 (2020).
3. Li, X. & Brummer, E. C. Applied genetics and genomics in alfalfa breeding. Agronomy 2, 40–61 (2012).
4. Chen, H. et al. Allele-aware chromosome-level genome assembly and efficient transgene-free genome editing for the autotetraploid cultivated alfalfa. Nat. Commun. 11, 2494 (2020).
5. Long, R. et al. Genome assembly of alfalfa cultivar Zhongmu-4 and identification of SNPs associated with agronomic traits. Genomics Proteomics Bioinformatics 20, 14–28 (2022).
6. Jayakodi, M., Schreiber, M., Stein, N. & Mascher, M. Building pan-genome infrastructures for crop plants and their use in association genetics. DNA Res. 28, dsaa030 (2021).
7. Pang, A. W. et al. Towards a comprehensive structural variation map of an individual human genome. Genome Biol. 11, R52 (2010).
8. Zhang, Z. et al. Genome-wide mapping of structural variations reveals a copy number variant that determines reproductive morphology in cucumber. Plant Cell 27, 1595–1604 (2015).
9. Zhou, Y. et al. The population genetics of structural variants in grapevine domestication. Nat. Plants 5, 965–979 (2019).
10. Saxena, R. K., Edwards, D. & Varshney, R. K. Structural variations in plant genomes. Brief. Funct. Genomics 13, 296–307 (2014).
11. Gabur, I., Chawla, H. S., Snowdon, R. J. & Parkin, I. A. Connecting genome structural variation with complex traits in crop plants. Theor. Appl. Genet. 132, 733–750 (2019).
12. Chen, S. et al. Gene mining and genomics-assisted breeding empowered by the pangenome of tea plant Camellia sinensis. Nat. Plants 9, 1986–1999 (2023).
13. Gaut, B. S., Seymour, D. K., Liu, Q. & Zhou, Y. Demography and its effects on genomic variation in crop domestication. Nat. Plants 4, 512–520 (2018).
14. Wellenreuther, M., Mérot, C., Berdan, E. & Bernatchez, L. Going beyond SNPs: the role of structural genomic variants in adaptive evolution and species diversification. Mol. Ecol. 28, 1203–1209 (2019).
15. Huang, K. & Rieseberg, L. H. Frequency, origins, and evolutionary role of chromosomal inversions in plants. Front. Plant Sci. 11, 296 (2020).
16. Kirkpatrick, M. & Barton, N. Chromosome inversions, local adaptation and speciation. Genetics 173, 419–434 (2006).
17. Zhang, X. et al. Haplotype-resolved genome assembly provides insights into evolutionary history of the tea plant Camellia sinensis. Nat. Genet. 53, 1250–1259 (2021).
18. Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
19. Ou, S., Chen, J. & Jiang, N. Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res. 46, e126 (2018).
20. Alonge, M. et al. Automated assembly scaffolding using RagTag elevates a new tomato system for high-throughput genome editing. Genome Biol. 23, 258 (2022).
21. Li, A. et al. A chromosome-scale genome assembly of a diploid alfalfa, the progenitor of autotetraploid alfalfa. Hortic. Res. 7, 194 (2020).
22. Finn, R. D. et al. Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230 (2014).
23. Cantalapiedra, C. P., Hernández-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829 (2021).
24. Zhou, S., Chen, Q., Li, X. & Li, Y. MAP65-1 is required for the depolymerization and reorganization of cortical microtubules in the response to salt stress in Arabidopsis. Plant Sci. 264, 112–121 (2017).
25. Liang, M. et al. Comprehensive analyses of microtubule-associated protein MAP65 family genes in Cucurbitaceae and CsaMAP65s expression profiles in cucumber. J. Appl. Genet. 64, 393–408 (2023).
26. Dwiningsih, Y. & Al-Kahtani, J. Genome-wide association study of complex traits in maize detects genomic regions and genes for increasing grain yield and grain quality. Adv. Sustain. Sci. Eng. Technol. 4, 0220209 (2022).
27. Liu, R. et al. GWAS analysis and QTL identification of fiber quality traits and yield components in upland cotton using enriched high-density SNP markers. Front. Plant Sci. 13, 1067 (2018).
28. Kephart, K. D., Buxton, D. & Hill, R. Jr Digestibility and cell‐wall components of alfalfa following selection for divergent herbage lignin concentration. Crop Sci. 30, 207–212 (1990).
29. Han, R.-H., Lu, X.-S., Gao, G.-J. & Yang, X.-J. Analysis of the principal components and the subordinate function of alfalfa drought resistance. Acta Agrestia Sin. 14, 142 (2006).
30. Reinecke, D. M. et al. Gibberellin 3-oxidase gene expression patterns influence gibberellin biosynthesis, growth, and development in pea. Plant Physiol. 163, 929–945 (2013).
31. Wu, H., Bai, B., Lu, X. & Li, H. A gibberellin-deficient maize mutant exhibits altered plant height, stem strength and drought tolerance. Plant Cell Rep. 42, 1687–1699 (2023).
32. Ameur, A. Goodbye reference, hello genome graphs. Nat. Biotechnol. 37, 866–868 (2019).
33. Garrison, E. et al. Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nat. Biotechnol. 36, 875–879 (2018).
34. Qin, P. et al. Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations. Cell 184, 3542–3558.e16 (2021).
35. Alonge, M. et al. Major impacts of widespread structural variation on gene expression and crop improvement in tomato. Cell 182, 145–161.e23 (2020).
36. He, Q. et al. A graph-based genome and pan-genome variation of the model plant Setaria. Nat. Genet. 55, 1232–1242 (2023).
37. Huang, Y. et al. Pangenome analysis provides insight into the evolution of the orange subfamily and a key gene for citric acid accumulation in citrus fruits. Nat. Genet. 55, 1964–1975 (2023).
38. Liu, Y. et al. Pan-genome of wild and cultivated soybeans. Cell 182, 162–176.e13 (2020).
39. Hu, J. et al. Potential sites of bioactive gibberellin production during reproductive growth in Arabidopsis. Plant Cell 20, 320–336 (2008).
40. Sun, H. et al. Gibberellins inhibit flavonoid biosynthesis and promote nitrogen metabolism in Medicago truncatula. Int. J. Mol. Sci. 22, 9291 (2021).
41. Dalmadi, Á. et al. Dwarf plants of diploid Medicago sativa carry a mutation in the gibberellin 3-β-hydroxylase gene. Plant Cell Rep. 27, 1271–1279 (2008).
42. Israelsson, M., Mellerowicz, E., Chono, M., Gullberg, J. & Moritz, T. Cloning and overproduction of gibberellin 3-oxidase in hybrid aspen trees. Effects on gibberellin homeostasis and development. Plant Physiol. 135, 221–230 (2004).
43. Zheng, L. et al. From model to alfalfa: gene editing to obtain semidwarf and prostrate growth habits. Crop J. 10, 932–941 (2022).
44. He, X. et al. Accuracy of genomic selection for alfalfa biomass yield in two full-sib populations. Front. Plant Sci. 13, 1037272 (2022).
45. Zhang, F. et al. Evolutionary genomics of climatic adaptation and resilience to climate change in alfalfa. Mol. Plant 17, 867–883 (2024).
46. Li, H. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
47. Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
48. Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
49. Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).
50. Alexander, D. H. & Lange, K. Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinformatics 12, 246 (2011).
51. Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).
52. Zhang, X., Zhang, S., Zhao, Q., Ming, R. & Tang, H. Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data. Nat. Plants 5, 833–845 (2019).
53. Zhang, J. et al. Allele-defined genome of the autopolyploid sugarcane Saccharum spontaneum L. Nat. Genet. 50, 1565–1573 (2018).
54. Wu, T. D. & Watanabe, C. K. GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics 21, 1859 (2005).
55. Tang, H. et al. An improved genome release (Version Mt4.0) for the model legume Medicago truncatula. BMC Genomics 15, 312 (2014).
56. Wang, Y. et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 40, e49 (2012).
57. Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
58. Guan, D. et al. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics 36, 2896–2898 (2020).
59. Zhao, X. & Hao, W. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268 (2007).
60. Ellinghaus, D., Kurtz, S. & Willhoeft, U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics 9, 18 (2008).
61. Ou, S. & Jiang, N. LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiol. 176, 1410–1422 (2018).
62. Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinformatics 5, 4.10.11–14.10.14 (2004).
63. Brůna, T., Hoff, K. J., Lomsadze, A., Stanke, M. & Borodovsky, M. BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. NAR Genom. Bioinform. 3, lqaa108 (2021).
64. Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439 (2006).
65. Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240 (2014).
66. Ou, S. et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 20, 275 (2019).
67. Su, W., Gu, X. & Peterson, T. TIR-Learner, a new ensemble method for TIR transposable element annotation, provides evidence for abundant new transposable elements in the maize genome. Mol. Plant 12, 447–460 (2019).
68. Xiong, W. et al. HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes. Proc. Natl Acad. Sci. USA 111, 10263–10268 (2014).
69. Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl Acad. Sci. USA 117, 9451–9457 (2020).
70. Lavigne, R., Seto, D., Mahadevan, P., Ackermann, H.-W. & Kropinski, A. M. Unifying classical and molecular taxonomic classification: analysis of the Podoviridae using BLASTP-based tools. Res. Microbiol. 159, 406–414 (2008).
71. Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).
72. Wang, D., Zhang, Y., Zhang, Z., Zhu, J. & Yu, J. KaKs_Calculator 2.0: a toolkit incorporating gamma-series methods and sliding window strategies. Genomics Proteomics Bioinformatics 8, 77–80 (2010).
73. Wang, D.-P., Wan, H.-L., Zhang, S. & Yu, J. γ-MYN: a new algorithm for estimating Ka and Ks with consideration of variable substitution rates. Biol. Direct 4, 20 (2009).
74. Sedlazeck, F. J. et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat. Methods 15, 461–468 (2018).
75. Heller, D. & Vingron, M. SVIM: structural variant identification using mapped long reads. Bioinformatics 35, 2907–2915 (2019).
76. Jiang, T. et al. Long-read-based human genomic structural variation detection with cuteSV. Genome Biol. 21, 189 (2020).
77. Jeffares, D. C. et al. Transient structural variations have strong effects on quantitative traits and reproductive isolation in fission yeast. Nat. Commun. 8, 14061 (2017).
78. Marçais, G. et al. MUMmer4: a fast and versatile genome alignment system. PLoS Comput. Biol. 14, e1005944 (2018).
79. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
80. Zadeh, L. A. Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets Syst. 1, 3–28 (1978).
81. VanRaden, P. M. Efficient methods to compute genomic predictions. J. Dairy Sci. 91, 4414–4423 (2008).
82. Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
83. Fu, C., Hernandez, T., Zhou, C. & Wang, Z.-Y. Alfalfa (Medicago sativa L.). Methods Mol. Biol. 1223, 213–221 (2015).
84. Abràmoff, M. D., Magalhães, P. J. & Ram, S. J. Image processing with ImageJ. Biophotonics Int. 11, 36–42 (2004).
85. He, F. Pan-genomic analysis highlights genes associated with agronomic traits and enhances genomics-assisted breeding in alfalfa. Zenodo https://doi.org/10.5281/zenodo.14118212 (2024).
86. He, F. Alfalfa. Figshare https://doi.org/10.6084/m9.figshare.28426967.v1 (2025).
87. Fei, H. Alfalfa. Zenodo https://doi.org/10.5281/zenodo.14869062 (2025).
88. Fei, H. Alfalfa pan-genome. Zenodo https://doi.org/10.5281/zenodo.14800544 (2025).

Acknowledgements

This work was supported by the China Agriculture Research System of MOF and MARA (grant no. CARS-34 to Q.Y.), the Biological Breeding-National Science and Technology Major Project (grant no. 2022ZD04011 to R.L.), the Key Projects in Science and Technology of Inner Mongolia (grant no. 2021ZD0031 to R.L.) and the Agricultural Science and Technology Innovation Program of CAAS (grant no. ASTIP-IAS14 to Q.Y.).

Author information

These authors contributed equally: Fei He, Shuai Chen, Yangyang Zhang.

Authors and Affiliations

Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
Fei He, Yangyang Zhang, Lin Chen, Fan Zhang, Mingna Li, Xue Wang, Huigang Lv, Yajing Li, Xianyang Li, Xueqian Jiang, Ming Xu, Bilig Sod, Junmei Kang, Ruicai Long & Qingchuan Yang
National Key Laboratory for Tropical Crop Breeding, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
Shuai Chen, Kun Chai, Weilong Kong, Shenyang Qu & Xingtan Zhang
State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, College of Agriculture, Guangxi Key Laboratory of Sugarcane Biology, Guangxi University, Nanning, China
Qing Zhang
School of Grassland Science, Beijing Forestry University, Beijing, China
Tiejun Zhang, Xiaofan He & Xiao Li

Contributions

Q.Y., R.L. and X.Z. designed this project and coordinated the research activities. F.Z., J.K., H.L., L.C., Xianyang Li, M.L., X.W., X.J., B.S., M.X. and Y.L. collected and provided plant materials. F.Z., R.L. and X.Z. participated in the genome sequencing and resequencing. S.C., S.Q. and K.C. assembled the genomes. S.C., W.K., Q.Z., K.C. and S.Q. performed the gene annotation. S.C. and F.H. analyzed RNA-seq data. F.H. constructed the sequence and gene-based pan-genome. F.Z., S.C. and F.H. contributed to population GWAS analysis. Y.Z. performed functional verification. X.H., Xiao Li and T.Z. conducted a whole-genome selection analysis. F.H., S.C., X.Z., R.L. and Q.Y. interpreted the data and contributed to the manuscript writing.

Corresponding authors

Correspondence to Xingtan Zhang, Ruicai Long or Qingchuan Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Eric von Wettberg and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Population structure and Fixation Index of the global alfalfa diversity panel.

a. Population structure of the alfalfa panel was inferred by assuming three subpopulations (K). Each color represents a different subpopulation. b. Word cloud of the primary origin countries for alfalfa varieties in Group1, Group2, and Group3. Font size represents the relative proportion of varieties from each country. Group1 is predominantly from the United States, Group2 from China, and Group3 from Turkey, with contributions from other countries as well. c. The PCA scatter plot shows the distribution of PC1 and PC2, with different colors representing different groups (Group1, Group2, Group3). d. Fixation Index (F_ST) values among Group1, Group2, and Group3 alfalfa accessions.
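As a point of reference for panel d, the fixation index compares the heterozygosity expected within subpopulations with that expected in the pooled population. The sketch below uses the textbook definition F_ST = (H_T − H_S) / H_T for a single biallelic locus and two groups. The caption does not say which estimator the authors used, so this is an illustration only; the function name and the example allele frequencies are hypothetical.

```python
def fst_biallelic(p1: float, p2: float, n1: int = 1, n2: int = 1) -> float:
    """Wright's F_ST for one biallelic locus and two subpopulations.

    p1, p2: frequency of the same allele in each subpopulation.
    n1, n2: relative subpopulation sizes used as weights (assumed equal by default).
    """
    w1 = n1 / (n1 + n2)
    w2 = n2 / (n1 + n2)
    # Expected heterozygosity of the pooled population.
    p_bar = w1 * p1 + w2 * p2
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    # Weighted mean of the within-subpopulation expected heterozygosities.
    h_s = w1 * 2.0 * p1 * (1.0 - p1) + w2 * 2.0 * p2 * (1.0 - p2)
    if h_t == 0.0:
        # Locus is monomorphic overall, so there is no differentiation to measure.
        return 0.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical allele frequencies, not values from the study.
    print(fst_biallelic(0.8, 0.2))
```

Identical frequencies in both groups give 0, and alternate fixed alleles give 1. Genome-wide estimates are usually obtained by combining many loci with a sample-size-corrected estimator rather than by averaging this per-locus ratio.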

Extended Data Fig. 2 The genome structure variations (SVs) between species of alfalfa.

a, Chromosomes. b–h, Distribution of repeat density, gene density, SNP/indel density, deletions, insertions, duplications and inversions, respectively.

Extended Data Fig. 3 Genome-wide association study (GWAS) for monosaccharide content, In Vitro True Dry Matter Degradability at 24 h (IVTDMD24), and In Vitro True Dry Matter Degradability at 30 h (IVTDMD30).

a, c, e, Manhattan and QQ plots of the GWAS results for monosaccharide, IVTDMD24 and IVTDMD30, respectively, using structural variation (SV) markers. b, d, f, Manhattan and QQ plots for the same traits using single nucleotide polymorphism (SNP) markers. The red dashed line indicates the Bonferroni-corrected genome-wide significance threshold (α = 0.05/n, where n is the total number of independent SNPs and effective SVs). g, i, k, Scatter plots of the peak structural variations on chromosome 1 for the three traits, with the horizontal line marking the Bonferroni-corrected genome-wide significance threshold. h, j, l, Boxplots of the three traits across different accessions, categorized by the alleles they carry. The sample sizes for the REF and ALT groups are 171 and 5, respectively. In boxplots, the lower and upper edges of the boxes show the 25th and 75th percentiles, respectively, and the central lines denote the median. The whiskers extend to 1.5 times the interquartile range. P values were computed using a two-tailed Student’s t-test.
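The two conventions named in this caption, the Bonferroni threshold α = 0.05/n and boxplot whiskers at 1.5 times the interquartile range, can be made concrete with a short standard-library sketch. The number of tests and the data values below are hypothetical, and the quartile method (`inclusive`) is an assumption, since plotting tools differ in how they interpolate quartiles.

```python
import math
import statistics


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold alpha / n."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def box_stats(values: list[float]) -> dict:
    """Quartiles, whisker ends and outliers using the 1.5 x IQR rule."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme data points that lie inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }


if __name__ == "__main__":
    # Hypothetical marker count; the real n is reported in the article, not here.
    threshold = bonferroni_threshold(1_000_000)
    print(f"threshold = {threshold:.2e}, -log10 = {-math.log10(threshold):.2f}")
    print(box_stats([1, 2, 3, 4, 5, 100]))
```

On a Manhattan plot the threshold is drawn at −log10(α/n), which is why the dashed line rises as more markers are tested.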

Extended Data Fig. 4 Impact of MsGA3ox1 overexpression on alfalfa morphology traits.

a, Comparison between WT alfalfa plants and overexpression lines (OE3, OE7 and OE12). b–e, Quantitative measurements of MsGA3ox1 expression levels, plant height, SLR and biomass. f, Photographs of leaves from WT, OE3, OE7 and OE12 lines at the 3rd, 4th and 5th stem nodes. g–i, Comparative assessments of leaf area, leaf length and leaf width between WT and MsGA3ox1 overexpression lines as shown in f. j, Comparison of WT and MsGA3ox1 overexpression lines in the number of trifoliolate leaves. The scale bar represents 5 cm. Asterisks denote statistical significance, with ‘*’, ‘**’ and ‘***’ indicating P < 0.05, P < 0.01 and P < 0.001, respectively. Data are presented as means ± SEM, with three independent experimental replicates for panel b, six independent experimental replicates for panels c, d, e and j, and nine independent experimental replicates for panels g, h and i. The control group (WT) is the Zhongmu No.1 variety of Medicago sativa L.
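For readers reproducing this style of summary, the following is a minimal sketch of how a mean ± SEM and the asterisk labels described in the caption can be computed. The replicate values are hypothetical, the helper names are illustrative, and the P value itself would come from whichever test the authors applied, which this sketch does not perform.

```python
import math
import statistics


def mean_sem(values: list[float]) -> tuple[float, float]:
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for an SEM")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def significance_stars(p_value: float) -> str:
    """Map a P value to the asterisk convention used in the caption."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


if __name__ == "__main__":
    # Hypothetical plant-height replicates (cm), not data from the study.
    heights = [61.0, 58.5, 63.2, 60.1, 59.4, 62.3]
    m, s = mean_sem(heights)
    print(f"{m:.2f} ± {s:.2f} (n = {len(heights)})")
    print(significance_stars(0.004))
```

The SEM shrinks with the square root of the replicate count, so panels with nine replicates will show tighter error bars than panels with three for the same underlying spread.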

Extended Data Fig. 5 Phenotypic characterization of alfalfa quality traits in MsGA3ox overexpression lines.

The bar graphs compare crude protein (CP) (a), acid detergent fiber (ADF) (b), neutral detergent fiber (NDF) (c), lignin content (d), total digestible nutrients (TDN) (e) and net energy for gain (NEg) (f) between WT and the overexpression lines OE1 and OE3. Asterisks denote levels of statistical significance compared to WT (*P < 0.05, **P < 0.01, ***P < 0.001). Data are presented as means ± SEM, with four biological replicates per group. The control group (WT) is the Zhongmu No.1 variety of Medicago sativa L.

Supplementary information

Supplementary Information

Supplementary Figs. 1–7 and Tables 1–10.

Reporting Summary

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

He, F., Chen, S., Zhang, Y. et al. Pan-genomic analysis highlights genes associated with agronomic traits and enhances genomics-assisted breeding in alfalfa. Nat Genet 57, 1262–1273 (2025). https://doi.org/10.1038/s41588-025-02164-8

Received: 21 February 2024
Accepted: 13 March 2025
Published: 23 April 2025
Issue date: May 2025
DOI: https://doi.org/10.1038/s41588-025-02164-8
