Evolution of the super-pangenome concept: the birth of a new reference genome for fast-forward breeding

The super-pangenome concept represents a major breakthrough in plant genomics that provides a comprehensive foundation for harnessing genetic diversity and fast-tracking crop improvement1. In the 1990s, the launch of the “Human Genome Project” opened a transformative age in genomics that guided the development of the earliest reference genomes for Arabidopsis (Arabidopsis thaliana L.) and rice (Oryza sativa L.). During the past three decades, plant genomics has entered a new era, progressing from chromosome-level to telomere-to-telomere (T2T) assemblies and eventually yielding high-quality genome assemblies for hundreds of plant species2,3,4. These advances have enhanced our understanding of plant diversity and supported genomics-assisted breeding (GAB) applications for future crops5. Nonetheless, single reference genomes fail to capture the full spectrum of rare and structurally complex genomic regions that underlie plant genetic diversity1,6. This limitation directly constrains GAB progress, as it hinders the identification of the full allelic series and structural variations (SVs) associated with complex agronomic traits and stress adaptations, thus creating bottlenecks for the development of climate-smart, high-yielding future cultivars.

To address this limitation, the pan-genome concept has emerged, encompassing the complete genomic content of a species7. Originating in bacterial research, the pan-genome concept was later extended to human and plant studies, which provided insights into population diversity by integrating core, dispensable, and private genomic elements8. Core genomes, which are conserved across all individuals of a species, encode vital biological processes. In contrast, dispensable genomes confer adaptability to specific stress environments, and private genomes harbor rare traits with substantial evolutionary and applied implications7,9. Moreover, pan-genomic studies have discovered SVs, including presence/absence variations (PAVs), copy number variations (CNVs), insertions/deletions (InDels), and chromosomal rearrangements, which play critical roles in plant stress adaptation and agronomic trait development. For centuries, cultivated germplasms have undergone diverse adaptations, which have shaped modern agricultural landscapes3,4,5,7,9.
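The core/dispensable/private partition described above can be illustrated with a minimal sketch. It assumes a binary presence/absence (PAV) matrix in which each gene has one 0/1 call per accession; the function name, the gene names, and the strict thresholds (present in all accessions = core, present in exactly one = private) are illustrative assumptions, since published studies often use softer frequency cut-offs.

```python
from typing import Dict, List


def classify_pangenome(pav: Dict[str, List[int]]) -> Dict[str, str]:
    """Label each gene from its 0/1 presence calls across accessions.

    Assumed (hypothetical) thresholds:
      present in every accession   -> "core"
      present in exactly one       -> "private"
      present in 2..(n-1)          -> "dispensable"
      present in none              -> "absent"
    """
    labels: Dict[str, str] = {}
    for gene, calls in pav.items():
        n_present = sum(calls)
        if n_present == len(calls):
            labels[gene] = "core"
        elif n_present == 1:
            labels[gene] = "private"
        elif n_present > 1:
            labels[gene] = "dispensable"
        else:
            labels[gene] = "absent"
    return labels


# Hypothetical toy matrix: rows are genes, columns are four accessions.
toy_pav = {
    "geneA": [1, 1, 1, 1],
    "geneB": [1, 0, 1, 0],
    "geneC": [0, 0, 0, 1],
}
labels = classify_pangenome(toy_pav)
print(labels)
```

In practice the PAV calls themselves come from assembly or read-mapping pipelines, and the choice of threshold changes how many genes fall into each class.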

However, traditional pan-genomes frequently fail to capture the full breadth of genetic diversity needed for future crop tolerance and miss key interspecies variations that drive long-term adaptability. Moving beyond species-level assessment, the super-pangenome expands its scope to incorporate multiple species within a genus by integrating genomic datasets from both cultivated crops and their wild relatives (Fig. 1)1. This extensive dataset offers an exceptional opportunity to explore genus-wide genetic diversity and bridges the gap between cultivated crops and their wild relatives1,6,9,10. Super-pangenomes have revealed novel evolutionary insights that species-specific pan-genomes fail to capture. These extensive genomic datasets improve our understanding of complex traits, including stress tolerance and yield stability, across diverse settings.

Fig. 1: Overview of super-pangenome development and classification of super pan-genes.

The germplasm collection from distinct geographic backgrounds includes cultivated species, landraces, and wild relatives, which serve as a foundation for super-pangenome analysis. The genomic diversity discovered across multiple genomes (e.g., genomes 1–10 from different species within a genus) includes diverse structural variations. This diversity is then partitioned within the super-pangenome into core, dispensable, and private genomes. Similarly, genes within the super-pangenome are classified into super pan-genes, including genus-conserved, genus-variable, and species-specific genes. This dataset facilitates the examination of genetic diversity and evolutionary relationships at the genus level, which offers insights for crop improvement and breeding programs. Created with BioRender.com.

Super-pangenomes are built by collecting genomic datasets from diverse species within a genus, allowing the discovery of super-pangenes, i.e., genus-wide conserved and variable genes and species-specific genomic regions (Fig. 1). By capturing the large genetic diversity present in wild relatives, super-pangenomes provide valuable resources for germplasm conservation, evolutionary studies, and the discovery of unique alleles for crop advancement via fast-forward breeding methods1,6,10. However, how do we determine the ideal balance between sequencing complexity and diversity in building a super-pangenome? Integrating multi-species genomic data into a combined foundation poses computational and analytical challenges that must be solved for effective super-pangenome construction.
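As a rough illustration of how super pan-genes might be sorted into the three classes named above, the following sketch assigns each gene family a label from the set of species in which it occurs. The exact definitions are an assumption made here for illustration (genus-conserved = found in every sampled species, species-specific = found in exactly one, genus-variable = everything in between); the function name and the species and gene identifiers are hypothetical.

```python
from typing import Dict, Set


def classify_super_pangenes(
    gene_species: Dict[str, Set[str]], all_species: Set[str]
) -> Dict[str, str]:
    """Assign a genus-level class to each gene family.

    Assumed (hypothetical) definitions:
      found in all sampled species -> "genus-conserved"
      found in exactly one species -> "species-specific"
      found in some but not all    -> "genus-variable"
    """
    labels: Dict[str, str] = {}
    for gene, species in gene_species.items():
        found = species & all_species  # ignore species outside the sampled genus
        if found == all_species:
            labels[gene] = "genus-conserved"
        elif len(found) == 1:
            labels[gene] = "species-specific"
        elif found:
            labels[gene] = "genus-variable"
        else:
            labels[gene] = "unplaced"
    return labels


# Hypothetical genus with three sampled species.
genus = {"sp1", "sp2", "sp3"}
families = {
    "fam1": {"sp1", "sp2", "sp3"},
    "fam2": {"sp1", "sp3"},
    "fam3": {"sp2"},
}
super_labels = classify_super_pangenes(families, genus)
print(super_labels)
```

Real analyses first cluster genes into orthologous families across species, a step this sketch takes as given.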

The key differences between pan-genomes and super-pangenomes are their sampling scope, dataset alignment, and biological associations. While pan-genomes capture diversity within a single species7, super-pangenomes include multiple species1 and thus serve as a meta-collection of pan-genomes. These richer genetic datasets facilitate the discovery of novel haplotypes11,12,13, offer insights into evolutionary processes14,15,16, accelerate gene transfer across species through hybridization17, and advance genome engineering techniques1,6. These datasets hold enormous potential for improving crop tolerance, increasing productivity, and preserving genetic diversity for future generations5,18. Because species-specific genetic bottlenecks may limit the applicability of single-species references, super-pangenomes can serve as truly universal reference genomes for breeding. Now the question is: what ethical and biosafety factors must be addressed before crop engineering can extensively adopt super-pangenomes?

This review highlights recent advancements in super-pangenome research across diverse plant species and explores their contributions to crop improvement, particularly in terms of enhancing abiotic/biotic stress tolerance and agronomic traits. Additionally, we discuss several key aspects, including (1) the significance of unique genomic features identified in super-pangenomes, (2) the role of SVs in domestication and breeding, (3) insights from quantitative trait loci (QTLs) and genome-wide association studies (GWASs) in super-pangenomics, (4) how super-pangenomes can guide novel breeding approaches, (5) key challenges in adopting super-pangenomes as new reference genomes, and (6) their key short- and long-term benefits for future breeding. We further hypothesize that super-pangenomes, as a “surprise package” (because they go beyond conventional pan-genomes and uncover hidden genetic treasures crucial for crop improvement), will fast-track future crop breeding strategies by offering more dynamic and inclusive reference datasets, accelerating the documentation of functionally important alleles, and enabling the design of stress-smart, high-yielding future crops to accomplish global food security targets.

Recent progress on super-pangenomes across plant species: now it’s time to act and design future crops

Since 2020, super-pangenomes have been developed for several plant species (Table 1), significantly expanding the pan-genomic sequences and pan-genome sources available across diverse germplasms, especially wild relatives. However, these expanded genomic resources are not yet exploited efficiently for crop improvement because persistent bottlenecks hinder their integration into fast-forward breeding programs. Nevertheless, we anticipate that these new resources can play a key role as “surprise packages” for improving diverse traits, e.g., stress tolerance/disease resistance and agronomic traits, via modern breeding and biotechnological tools (Fig. 2). Because traditional bioinformatics analysis is time-consuming and complex, we suggest the integration of AI/ML-driven automated super-pangenome analysis, e.g., graph neural networks for SV detection and predictive modeling of trait associations, to rapidly develop improved crop cultivars (Fig. 2)19,20,21.

Fig. 2: A multistep process for super-pangenomics to deliver improved crop cultivars.

This figure displays the systematic development of super-pangenomes, which is based on widely used methods and integrates raw genomic data, biological insights, and advanced breeding applications to increase crop tolerance and productivity. The process includes six key steps: (1) initial data acquisition involves various high-throughput sequencing methods to capture genomic complexity and structural variations (SVs); (2) the information identified in the second step serves as the foundation for understanding genetic diversity and evolutionary adaptations across species; (3) the extracted genomic features are integrated into variation networks, regulatory elements affecting gene expression, epistatic interactions, trait-associated variant catalogs, functional gene annotations, and panomics to harness genotype-phenotype relationships, which enhances the discovery of key functional genes linked to agronomic traits, stress tolerance, and breeding targets; (4) the integration of panomics datasets leads to super-pangenomics (a surprise package), which provides new insights into hidden genetic diversity across multiple species and serves as a genetic toolbox for fast-forward breeding strategies; (5) the super-pangenomes facilitate diverse GAB methods that can accelerate crop domestication and improvement; and (6) in the final stage, these comprehensive resources can be leveraged to achieve the goal of climate-smart, high-yielding crops with enhanced abiotic stress tolerance, biotic stress resistance, and improved agronomic traits. Created with BioRender.com.

Table 1 Summary of super-pangenomes across plant species and their role in crop improvement

Contribution to abiotic stress tolerance

Harnessing the genetic basis of abiotic stress tolerance is essential for improving crop tolerance and ensuring future food security22,23. Super-pangenomes have uncovered novel stress-responsive genes and SVs that were previously undetectable via traditional genomic approaches. These new elements are greatly increasing our knowledge of stress adaptation and tolerance mechanisms across species. In this context, super-pangenomes have become a powerful “surprise package” for finding novel genetic elements that contribute to stress responses/tolerance and thus provide new insights for fast-tracking stress-smart breeding. For instance, the super-pangenome of maize (termed the pan-Zea genome) discovered a critical SV, i.e., a Harbinger transposon-like insertion (PZ00001aSV02097079INS) upstream of Zm00001d023299, which encodes a zinc finger CCCH domain-containing protein involved in drought and UV stress responses. By disrupting an abscisic acid-responsive element and modifying transcription factor binding, this insertion suppressed gene expression under drought conditions, which improved survival. Additionally, the authors highlighted novel PME-like genes enriched in teosintes, suggesting their role in stress adaptation. Bayesian fine-mapping emphasized the significance of insertions and gene-PAVs over SNPs/InDels in regulating stress-responsive traits24. In rice, the super-pangenome revealed distinct genetic pathways for submergence adaptation in Asian and African rice via integrated pan-GWAS and expression profiling. In contrast to Asian rice (O. sativa), African rice (Oryza glaberrima and Oryza barthii) lacks Sub1A, the primary regulator of submergence quiescence, but carries SNORKEL1/2 and ACE1, which promote internode elongation. Furthermore, a 54 bp in-frame insertion in the OgDEC1 gene (a negative elongation regulator) resulted in selection-driven functional adjustments in African rice for flooding stress adaptation11.

The super-pangenomes of wild grape (Vitis) species identified a key locus associated with chloride exclusion (crucial for salinity tolerance) through a graph-based pan-GWAS approach. A significant SNP (chr08:13598495) associated with a homologue of Arabidopsis CHX20 highlighted the role of this gene in ion transport and salinity adaptation25. In the Populus super-pangenome, species-specific private genes are associated with extreme environmental adaptation. Among the different species, Populus qiongdaoensis, which can survive in tropical climates, possesses genes associated with heat tolerance, whereas P. pseudoglauca, which is native to the Qinghai-Tibet Plateau, harbors genes regulating responses to hypoxia and rapid temperature variations. Population genomic investigations further discovered that climate-driven selection played a noteworthy role in guiding private gene variation, with Pokor23334 strongly associated with precipitation patterns, which indicates a role in adaptation to water availability26.

Similarly, the Cochlearia super-pangenome detected climate-associated SVs that direct local adaptation across northern and southern populations. Notably, many SVs have shown positive selection, and future climate projections (2061–2080) suggest that SV-based adaptation may intensify, particularly in southern populations, which underscores the value of SVs for plant tolerance under climate change27. These investigations highlight the magic of super-pangenomes in identifying genetic variations governing abiotic stress tolerance/adaptation. By capturing SVs and species-specific adaptations, super-pangenomes provide massive datasets for increasing crop stress tolerance and offer new opportunities for breeding stress-smart future cultivars.

Contribution to biotic stress resistance

Plant super-pangenomes have provided unique insights into the evolutionary activities, SVs, and PAVs of biotic stress resistance genes across diverse species. However, the critical question is: how do SVs in resistance genes guide long-term biotic stress resistance? This remains an open but vital question, as predicting long-term pathogen/disease adaptation will define how super-pangenomes shape future resistance breeding. To answer this question, rice super-pangenome exploration of the NLRome showed extensive nucleotide-binding leucine-rich repeat (NLR) gene diversity, with wild rice retaining more dispensable NLRs than cultivated varieties do. The core NLRs presented increased expression, which emphasizes their functional importance in disease resistance. Furthermore, PAVs can be used to identify species-specific resistance loci, such as Pi36-like domains, in African rice, emphasizing the potential for breeding disease-resistant rice cultivars11. Similarly, a recent Vitis super-pangenome study found 104,046 NLR genes across 72 accessions, which suggests a strong link between TIR-NBARC-LRR domain-containing NLR genes and resistance to downy mildew. Resistant accessions (e.g., Muscadine and North American grapes) harbored more of these genes, which were correlated with known downy mildew resistance loci (Rpv1 and Rpv3). In addition to NLR gene diversity, NLR gene expression patterns and interactions with “pattern recognition receptors” also play key roles in defense activation that suggest novel insights into the breeding of disease-resistant grape cultivars28.

In watermelon, the Citrullus super-pangenome identified key SVs associated with disease resistance, including a major QTL (Qfon1.1) correlated with Fusarium oxysporum resistance29. Notably, wild watermelon species harbor disease resistance-related genes that are largely absent in cultivated C. lanatus, which highlights the importance of wild watermelon reservoirs for improving disease resistance14. In apples, the super-pangenome revealed that cultivated varieties contain significantly more resistance gene analogs (RGAs) than wild species, particularly in the TNL family. A key genomic fragment (MDgdA02G000320) introgressed from Malus baccata was identified, carrying resistance genes associated with downy mildew resistance, highlighting the evolutionary role of gene duplications and hybridization in disease resistance17.

In addition to fruit crops, legume and oilseed species have also benefited from super-pangenome insights. In the chickpea super-pangenome, a genome-wide analysis identified 7,866 candidate resistance (R) genes across the Cicer genus, with significant SVs affecting 67.07% of them. Notably, a 72 bp deletion in the RGA1 gene (crucial for stress responses) was detected in wild accessions but was lost in cultivated chickpea, suggesting its role in domestication-related susceptibility15. Similarly, in the sesame super-pangenome, cultivated varieties harbored fewer resistance genes than did their wild relatives, with significant contractions in the RLK, RLP, and NBS clusters. GWAS identified two SNPs, “SNP1738690 (on ScaChr.16 in S. calycinum) and SNP131559 (on SauChr.14 in S. angustifolium)”, and an SV (SV_297658) correlated with Fusarium resistance, highlighting their role in disease resistance13. An evaluation of the cotton super-pangenome reported that a 444 bp deletion in the GoNe promoter was associated with the absence of foliar nectaries in Gossypium gossypioides (D6) and G. schwendimanii (D11), which shed light on the evolutionary divergence of natural defense mechanisms in cotton16. The magical role of dispensable genes in biotic stress resistance was further characterized in Populus, where private genes enriched in disease resistance functions presented increased methylation and chromatin accessibility variation, suggesting the key role of epigenetic regulation in pathogen defense26.

Additionally, the super-pangenomes of lettuce and kiwifruit demonstrated the ability of CNVs and PAVs to increase plant defense. In lettuce, PAV-GWAS identified key loci for Bremia lactucae (downy mildew) resistance, including the g3691.t1 gene encoding a TIR-NBS-LRR protein that is absent in reference genomes (Lactuca sativa and Lactuca serriola). CNV in the RLL2B gene was also associated with anthocyanin content variation in L. sativa, which demonstrated the multifaceted role of super-pangenomes beyond disease resistance30. Integrated genomic and transcriptomic analysis within the Actinidia super-pangenome identified substantial variation in NBS-LRR gene content (ranging from 115 to 360 genes), with higher counts generally associated with improved resistance to Pseudomonas syringae pv. actinidiae (Psa). Across seven kiwifruit materials, A. × leiocacarpae (a highly resistant genotype) presented up to 360 NBS-LRR genes, including 22 RPM1 copies, with GZ02841 showing Psa-inducible expression. This gene is positioned within a tandem-repeat cluster on Chr08, harboring a unique ~3 kb SV absent in other accessions that possibly drives its increased expression and functional divergence31. In another Actinidia super-pangenome, an assessment of 18,858 RGAs revealed species-specific resistance genes (NBSs, RLKs, RLPs, and TM-CCs), including Ach27g03710DH, an RLK gene exhibiting PAVs within a known P. syringae resistance QTL. A paired NLR gene set (Ach19g11900DH and Ach19g11910DH) was also identified, indicating functionally allied roles in immunity. Furthermore, pan-RGA investigations highlighted that 575 dispensable resistance genes were absent in A. chinensis, which provides valuable genetic resources for disease resistance breeding32. Together, these insights emphasize the functional value of SVs and RGAs in shaping species-specific disease immunity and breeding future disease-resistant kiwifruit cultivars.
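To make the idea behind PAV-based association mapping concrete, the sketch below tests whether the presence of a single gene is associated with a binary resistance phenotype using a two-sided Fisher's exact test. This is a deliberately simplified stand-in and not the method of the cited studies: real PAV-GWAS analyses typically use mixed models that correct for population structure and kinship, and they test many loci with multiple-testing control. The function names and the toy data are hypothetical.

```python
from math import comb
from typing import Sequence


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def pav_association(presence: Sequence[int], resistant: Sequence[int]) -> float:
    """Test one gene's 0/1 presence calls against a 0/1 resistance phenotype."""
    pairs = list(zip(presence, resistant))
    a = sum(1 for p, r in pairs if p and r)          # gene present, resistant
    b = sum(1 for p, r in pairs if p and not r)      # gene present, susceptible
    c = sum(1 for p, r in pairs if not p and r)      # gene absent, resistant
    d = sum(1 for p, r in pairs if not p and not r)  # gene absent, susceptible
    return fisher_exact_two_sided(a, b, c, d)


# Hypothetical toy data: ten accessions, gene presence perfectly tracks resistance.
presence = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
resistant = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
p_value = pav_association(presence, resistant)
print(f"p = {p_value:.4f}")
```

A per-gene test like this ignores relatedness among accessions, so it would inflate false positives in structured germplasm panels; it is meant only to show what a PAV genotype looks like as a test variable.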

Contribution to agronomic trait improvement

With the advent of super-pangenomes, researchers have uncovered previously hidden SVs and PAVs that influence key agronomic traits. Super-pangenomes provide a powerful bridge between domesticated and wild species to improve yield and quality. However, challenges exist in translating these genomic discoveries into practical breeding applications. By capturing novel genetic diversity across species and accessions, super-pangenomic studies have discovered vital genomic elements influencing yield, flowering time, plant architecture, and nutritional quality. For example, in Glycine, the super-pangenome identified several key genes under strong purifying selection, such as PLANT HOMOLOGOUS TO PARAFIBROMIN (PHP) and DWARF14 (D14), which play crucial roles in flowering regulation and plant architecture, respectively, offering insights into perenniality-annuality transitions and stress adaptation33. Similarly, in rice, the super-pangenome found that SVs influence agronomic traits, including a 520 bp deletion in DNR1 that enhances nitrogen uptake and an insertion near HGW that reduces thousand-grain weight (TGW). A 1.3 kb SV near qTGW1.2a identified the LOC_Os01g57250 gene as a negative regulator of TGW. Sub-population-specific SVs, such as a 1116 bp deletion in DTH8 affecting heading date and a 17.1 kb CNV in GL7 promoting grain length, have breeding potential. Additionally, GWAS on pan-SVs revealed novel yield-associated variants, including a 1.4 kb deletion near OsNPY2 that is undetectable by SNP-based methods, highlighting the magic of super-pangenomes for fast-tracking rice breeding11. In the tomato super-pangenome, SVs affecting fruit quality were identified, including a CNV influencing NSGT and a 4151 bp allele upstream of TomLoxC, which improves fruit flavor. Moreover, a 244 bp deletion in the cytochrome-P450 gene “Sgal12g015720” was discovered in cultivated tomatoes, which potentially reduced its function. Overexpression of this gene increased lateral branching and fruit number, thus indicating its potential for yield improvement34. Additionally, Solanum super-pangenome analysis correlated clade-specific genes with cold stress tolerance, domestication-related flowering, and tuberization traits35.

In foxtail millet, the super-pangenome identified SVs and key QTLs associated with domestication and agronomic traits. Notably, a 366 bp deletion near SiGW3 was linked to grain weight, whereas major QTLs for seed non-shattering (qSH5.1, qSH5.2, and qSH9.1) highlighted the role of sh1 in domestication. SV-GWAS outperformed SNP-based GWAS in identifying genetic signals for yield-related traits, including amylose content and heading date. The novel gene “Ghd7” was also identified as crucial for yield and stress adaptation36. In the Citrullus super-pangenomes, CNVs in TST2 and base changes in LCYB were associated with sugar accumulation and flesh colouration, whereas SVs in ClBt explained the loss of bitterness during domestication. Additionally, deletions in ClPHT4;2 and regulatory interactions with ClbZIPs contributed to carotenoid accumulation29. Another Citrullus study reported that domestication sweeps spanning 17.62 Mb included genes that are associated with fruit-quality traits, confirming that ClTST2 tandem duplication is a key factor in sugar content selection14. Similarly, in the Actinidia super-pangenome, a high vitamin C (VC) content-related gene (DTZ79_23g14810) was identified, and its overexpression in transgenic roots significantly increased VC levels, indicating the role of gene expansion in fruit nutritional enhancement37.

SVs identified in the Cicer super-pangenome significantly influence flowering time, a crucial trait for chickpea adaptation. Over 88% of the 263 flowering-related genes presented SVs, with key genes such as AGL9, FLOWERING LOCUS D, and FY affected by translocations and insertions. Variants in JMJ14 (a regulator of FT expression) further influence flowering regulation in different Cicer species15. In lettuce, a recent super-pangenome study discovered that CNV of the flowering repressor gene FLOWERING LOCUS C (FLC) is a major domestication signature. Cultivated leaf-type lettuces harbor 5–8 copies of FLC, compared to only three copies in their wild progenitor (L. serriola). This increase in repressor copy number is strongly associated with delayed flowering, which is a key trait selected to extend the vegetative harvesting period38. In sesame, SVs associated with branching (an essential yield determinant) were identified through GWAS, highlighting the role of SiPT1 in plant architecture. A Copia long terminal repeat-retrotransposon (LTR-RT) insertion in SiPT1 led to a frameshift mutation affecting branching, whereas a deletion in DT1 supported its role in determinate growth13. Similarly, in the Populus super-pangenome, dispensable genes play a role in secondary metabolism and growth regulation. In contrast, core genes are enriched in biomass-related pathways that provide potential targets for Populus breeding26. In the Hevea super-pangenome, the expansion of the REF/SRPP gene family (critical for rubber biosynthesis) was linked to tandem duplications, with high expression of REF1, REF3, REF7, and SRPP1 correlating with latex yield and stress tolerance39.

In cotton, the super-pangenome identified 321 SV hotspot regions influencing fiber traits, with the A2 genome exhibiting fewer SVs in fiber-associated loci, likely contributing to its superior lint yield and fiber quality16. In short, these findings highlight the importance of super-pangenomes in the discovery of genetic variations that contribute to the improvement of key agronomic traits. By integrating SV-based insights into breeding strategies, future crop improvement programs can harness structural diversity to increase productivity and quality.

Unique features identified in super-pangenomes and their significance

Several super-pangenomes have provided a wealth of unique genetic features, including transposable elements (TEs), cis-regulatory elements (CREs), motifs, conserved noncoding sequences (CNSs), and SVs, which contribute to genome plasticity, stress adaptation, and trait evolution. Notably, TE-driven genome expansion, SV-mediated functional diversity, and the influence of CREs and CNSs on gene regulation have emerged as key themes. This raises a critical question: how do TEs and SVs guide genome evolution in diverse plant species? Because these variations substantially influence agronomic traits and stress adaptation, insights from super-pangenomes can be leveraged for precision breeding.

The Glycine super-pangenome highlights the role of TEs (mainly LTR-RTs) in genome differentiation, with TEs comprising up to 49.9% of perennial genomes. Interestingly, allopolyploidization did not trigger massive TE amplification, suggesting that transcriptomic modifications constitute a primary adaptation mechanism33. In the Zea super-pangenome, massive SVs dominate genome diversity, with ~60% of identified SVs associated with TEs, particularly Helitron and TIR elements, which impact functional traits via CREs24. The interplay between TE expansion, DNA methylation, and genome size variation is elegantly demonstrated in the Lactuca super-pangenome38. The wild relative Lactuca indica possesses a massive 5.5 Gb genome, driven by recent proliferation of Copia LTR-RTs. This expansion was mechanistically associated with reduced CHH methylation levels, which result from lower expression of the CHROMOMETHYLASE 2 gene. This outcome provides a clear epigenetic mechanism for how TE activity can drive genome size diversity within a genus38. Similarly, the Dendrobium super-pangenome suggests that dispensable and private genes show relatively high TE coverage, emphasizing the role of TEs in genome variability and gene expression suppression. Notably, TE-driven gene duplications, such as the expansion of FAR1, are linked to adaptive traits, emphasizing how super-pangenomes capture genome plasticity and evolutionary innovation40. Another key question now arises: are SVs passive by-products of genome evolution, or do they actively guide adaptation by modulating regulatory networks?

The Solanum section Petota super-pangenome further emphasizes TE-driven genome evolution, with TE-silencing mechanisms regulating TE abundance across groups35. In the Setaria super-pangenome, TE-derived PAVs contribute to genomic plasticity, suggesting potential targets for improving agronomic traits36. Similarly, the Solanum super-pangenome shows TE content variation (64.3–74.5%) across species, with Gypsy LTR-RTs being particularly abundant in Solanum lycopersicoides, contributing to its large genome (1.2 Gb). A lineage-specific Gypsy LTR-RT burst ~2 million years ago likely influenced its genome post-divergence from potato34.

In addition to TEs, genome structural complexities extend to centromere evolution and hybridization effects. In the Malus super-pangenome, TE-rich centromeric regions (~61.5% of the genome) guide chromosomal organization, which suggests that SVs influence not only genome stability but also inheritance patterns17. Similarly, Populus species exhibit species-specific TE amplifications that contribute to genome diversification, supporting the hypothesis that recent TE bursts facilitate adaptive divergence26. Another novel question emerges: to what extent do species-specific TE landscapes determine ecological niche specialization? Similar patterns are observed in the Citrus super-pangenome, where TE (or DNA satellite)-rich chromosome regions influence genome expansion and species differentiation41, and in the Vitis super-pangenome, where Gypsy LTR-RTs dominate the private genome, which highlights their role in genetic divergence and trait variability25. Moreover, lineage-specific Gypsy LTR-RT bursts, such as the ~2 million-year-old expansion in wild Solanum species (S. lycopersicoides, S. corneliomulleri, S. peruvianum, and S. chilense), have been shown to contribute to their increased genomic diversity (e.g., 64.3–74.5% TE content) and adaptation to arid niches, thereby indicating the importance of wild tomatoes as valuable genetic resources for breeding future cultivars34. This provides a mechanistic hypothesis for how TE-driven genome expansion guides ecological niche specialization.

Super-pangenomes also provide a window into epigenetic regulation and CREs. In the Cochlearia super-pangenome, TE mobilization triggers SV formation. However, TE silencing patterns remain similar between diploid and tetraploid species despite higher genome-wide methylation in tetraploids, which indicates a conserved mechanism of TE regulation across ploidy levels27. In addition to being targets of silencing, TEs themselves can be vectors of transgenerational epigenetic inheritance. A recent breakthrough report demonstrates that hypomethylation at TE loci can be stably inherited and cause heritable changes in the expression of neighboring genes, which effectively creates epialleles that are frequent targets of natural selection and contribute to phenotypic variation42. The Hevea super-pangenome further supports this, revealing strong synteny despite high TE content (81.8% repetitive genome) and emphasizing that genome stability and plasticity can coexist39. In Citrullus, TE enrichment in centromeric regions affects chromosomal evolution, fruit quality, and stress tolerance29. This prompts a key question to explore: how do epigenetic modifications and TE silencing mechanisms vary between domesticated and wild species, and what are their implications for breeding programs?

Finally, comparative analysis highlights the role of CREs, including CNSs that act as CREs, in stress responses. For example, the Actinidia super-pangenome discovered 227,786 conserved regions with stress-responsive motifs enriched near transcription start sites37. These catalogs of evolutionarily conserved CREs and CNSs provide a high-value target resource for precision gene editing. By using CRISPR-based technologies to engineer these noncoding regions, breeders can potentially control gene expression networks (e.g., enhancing vitamin C biosynthesis in kiwifruit or stress-responsive pathways) without changing the coding sequence itself, which ultimately offers a unique platform for trait improvement37. The Dendrobium super-pangenome identified 4024 core genes harboring species-specific CNSs, which were significantly enriched in fatty acid and carbon metabolism pathways, both of which are necessary for energy production and plant growth40. This raises another critical question: can CNSs serve as universal regulatory elements across species, or are they largely species-specific? Resolving this is key to leveraging super-pangenome data for synthetic biology43,44. Species-specific CNSs could be used to engineer precise, context-dependent gene regulation. On the other hand, universal CNSs might allow for the design of broad-spectrum genetic modules for traits like stress tolerance. Addressing this could transform functional genomics and gene editing approaches in multiple species.

The studies discussed above highlight TEs, SVs, CNSs, and CREs as key elements of genome evolution, trait variability, and species adaptation. Thus, the integration of epigenetics and functional genomics will be crucial for understanding these regulatory mechanisms. Therefore, future research should translate genome plasticity into agronomic benefits and leverage super-pangenomes for targeted crop improvement and fast-forward breeding to ensure sustainable agriculture.

Role of structural variations in crop domestication and future breeding

Diverse SVs have substantially shaped crop domestication and improvement. In contrast to SNPs, SVs cause large-scale genomic rearrangements that influence gene function, expression, and genome architecture11,45,46. Genetic diversity lost during domestication can be reintroduced to improve modern crops, but doing so requires SV insights to balance productivity, tolerance, and nutritional value in breeding programs. This remains a major challenge, although emerging strategies now use super-pangenomes to reintroduce lost diversity via targeted breeding and gene editing.

In rice, SVs play a major role in domestication by modifying plant architecture and stress tolerance through large genomic rearrangements and PAVs11. The rice super-pangenome revealed that independent but parallel genetic modifications, such as deletions in the RPAD locus affecting plant structure and insertions in SHAT1 controlling seed shattering, shaped the domestication of both Asian and African rice. These findings demonstrate how SVs guide convergent domestication events, providing valuable targets for modern breeding11. Similarly, 4582 domestication-selected PAVs (domPAVs) and 152 improvement-selected PAVs (impPAVs) have been identified in foxtail millet, with stronger selection pressures observed during domestication. Key genes such as sh1 (seed shattering) and SiGW3 (grain weight) underwent fixed deletions, which explains how SVs contribute to the regulation of agronomic traits and provides insights into genetic strategies for yield enhancement36. The lettuce super-pangenome identified over 500,000 domestication-associated PAVs when comparing cultivated morphotypes to their wild progenitor. In addition to the well-characterized FLC CNV, these PAVs affect genes associated with vernalization response, delivering a comprehensive view of the genomic restructuring that underlies domestication syndromes38.
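
The logic behind calling a PAV "domestication-selected" can be illustrated with a small sketch. The cited studies do not describe their exact procedure here, so the approach below is an assumption: each PAV is tested for a presence-frequency shift between wild and cultivated accessions with a two-sided Fisher's exact test. The function names, the input layout, the thresholds (`alpha`, `min_freq_shift`), and the toy counts are all hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x "present" calls in the first group.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


def candidate_selected_pavs(pav_counts, alpha=0.001, min_freq_shift=0.5):
    """Flag PAVs whose presence frequency differs between wild and cultivated groups.

    pav_counts: {pav_id: (wild_present, wild_total, cult_present, cult_total)}
    (hypothetical layout). Thresholds are illustrative, not taken from the source.
    """
    hits = []
    for pav_id, (wp, wt, cp, ct) in pav_counts.items():
        shift = cp / ct - wp / wt  # negative = lost during domestication
        p = fisher_exact_two_sided(wp, wt - wp, cp, ct - cp)
        if p < alpha and abs(shift) >= min_freq_shift:
            hits.append((pav_id, round(shift, 3), p))
    return sorted(hits, key=lambda h: h[2])


# Invented toy counts, for illustration only.
toy = {
    "PAV_A": (28, 30, 2, 40),   # mostly present in wild, mostly absent in cultivated
    "PAV_B": (15, 30, 21, 40),  # similar frequency in both groups
}
hits = candidate_selected_pavs(toy)
for pav_id, shift, p in hits:
    print(pav_id, shift, f"{p:.2e}")
```

In practice, such a scan would also need multiple-testing correction and control for population structure, which this sketch omits.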

In tomato, SVs have led to significant genetic erosion, with a notable 244 bp deletion in the cytochrome-P450 gene affecting shoot architecture. Such deletions, which are fixed in cultivated varieties but absent in wild relatives, suggest that SV-driven loss of genetic elements plays a major role in defining domestication traits. Reintroducing lost diversity through gene editing or introgression could be a promising roadmap for enhancing agronomic performance34. Similarly, in potato, PAVs are associated with shifts in gene content, leading to the enrichment of 8684 genes related to photosynthesis, the stress response, and metabolic regulation, whereas 4814 genes were lost. This highlights how SV-mediated gene duplications and deletions influence crop evolution, and underscores the value of preserving genetic variation for future breeding35.

Watermelon domestication has also been influenced by SVs, particularly in terms of sugar accumulation, pigment composition, and disease resistance. Comparative genomics showed that CNVs of TST2 and SVs in key genes (e.g., ClSWEET3, ClVST1, and ClAGA2) are correlated with increased sugar content, whereas deletions in ClPHT4;2 are associated with carotenoid accumulation. Interestingly, while SVs strongly influence fruit sweetness and color, their impact on traits such as fruit shape and seed size is less consistent, which highlights the complexity of genomic changes in trait determination29. These discoveries raise a critical question: how do different types of SVs interact to influence complex traits, and how can this knowledge be harnessed to improve multiple traits simultaneously?

The above studies show how SVs have guided domestication by restructuring crop genomes in ways that impact both adaptive and agronomic traits. While domestication has led to the fixation of beneficial SVs, it has also resulted in the loss of genetic elements that could be crucial for stress tolerance and disease resistance. Hence, we propose that future breeding programs integrate super-pangenome insights to reintroduce beneficial SVs/genes lost during domestication while minimizing trade-offs between yield and stress tolerance. Advances in gene editing and targeted recombination approaches will be key to harnessing SV diversity, which can guide the design of next-generation crops.

Moving beyond genomics to super-pangenomics: insights from QTL and GWAS mapping

While traditional genomic and pan-genomic reports have considerably advanced our understanding of crop domestication and trait variation5,47, integrating super-pangenomics with QTL and GWAS mapping provides a finer resolution of the SVs, PAVs, and rare alleles determining key traits. This approach enables the identification of hidden genetic variations that conventional SNP-based GWASs often overlook, thereby increasing the accuracy of trait mapping, marker discovery, and genomic selection (GS). For instance, super-pangenomic examination is vital for dissecting complex agronomic traits in rice. By integrating GWAS and eQTL analysis within a pan-genome foundation, Shang et al.11 identified SVs regulating key yield-related traits, such as a 1.3 kb SV in LOC_Os01g57250 associated with TGW and a 4 kb insertion near LOC_Os06g04820 influencing grain length. Another rice super-pangenome study incorporating GWASs and eQTLs analyzed transcriptional shifts under salinity stress and identified STG5 as a crucial locus maintaining Na+/K+ homeostasis48. Through GWAS and QTL mapping, another study further demonstrated how natural allelic variations in TTL1 regulate thermotolerance and grain size in rice, supporting the role of super-pangenomes in identifying domestication signatures and novel breeding targets12. In maize, the integration of genome-wide and local association analysis found 21,255 QTLs, with SV-QTLs contributing strongly to trait variation24. Similarly, in Setaria, SV-based GWAS outperformed SNP-based approaches, identifying a 366 bp deletion associated with TGW and a 196 bp insertion near GBSSI that affects the amylose content36. These findings jointly stress the necessity of integrating SV-driven approaches to enhance trait dissection and precision breeding.
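
To make the idea of an SV-based association scan concrete, the sketch below ranks SVs by how strongly carrier status separates trait values, using Welch's t-statistic. This is a deliberately simplified stand-in: the cited studies used GWAS models that account for population structure and relatedness, which are not reproduced here. The function names and the trait and genotype values are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def sv_trait_association(genotypes, phenotypes):
    """Welch's t for the trait difference between SV carriers (1) and non-carriers (0)."""
    carriers = [y for g, y in zip(genotypes, phenotypes) if g == 1]
    others = [y for g, y in zip(genotypes, phenotypes) if g == 0]
    if len(carriers) < 2 or len(others) < 2:
        return None  # too few observations to estimate a variance
    effect = mean(carriers) - mean(others)
    se = sqrt(variance(carriers) / len(carriers) + variance(others) / len(others))
    t = effect / se if se > 0 else float("inf")
    return {"effect": effect, "t": t, "n_carriers": len(carriers)}


def scan_svs(sv_matrix, phenotypes):
    """Test every SV and rank by absolute t-statistic (largest first)."""
    results = []
    for sv_id, genotypes in sv_matrix.items():
        stats = sv_trait_association(genotypes, phenotypes)
        if stats is not None:
            results.append((sv_id, stats))
    return sorted(results, key=lambda r: abs(r[1]["t"]), reverse=True)


# Invented grain-weight-like values for eight accessions.
tgw = [22.1, 21.8, 22.5, 25.9, 26.3, 25.6, 26.0, 22.0]
# Invented presence (1) / absence (0) calls for two hypothetical SVs.
svs = {
    "SV_del": [1, 1, 1, 0, 0, 0, 0, 1],    # carriers have lower trait values
    "SV_other": [1, 0, 1, 0, 1, 0, 1, 0],  # no clear relationship
}
ranked = scan_svs(svs, tgw)
for sv_id, stats in ranked:
    print(sv_id, round(stats["effect"], 2), round(stats["t"], 2))
```

Because presence/absence calls for SVs missing from a single reference cannot be generated at all, a scan like this is only possible once the variants are represented in a pan-genome or super-pangenome.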

In addition to staple cereals, super-pangenomics has enabled the discovery of SVs influencing stress tolerance and metabolic pathways in diverse crop species. In Vitis, a pan-GWAS approach identified an SNP on chromosome 8 strongly associated with chloride exclusion, a key factor of salinity stress adaptation, which shows functional parallels with the Arabidopsis AtCHX20 transporter25. A recent graph-based Vitis super-pangenome integrated a reference genome with 132,518 SVs, facilitating high-resolution SV genotyping. SV-eQTL mapping in 113 accessions identified 63 significant SV-expression associations, including an 85 bp insertion associated with altered VvLHT8 expression, regulating downy mildew resistance and stomatal traits via SA-dependent pathways28. In tomato, SV-based GWAS analysis on a graph-based genome detected SVs associated with fruit flavor and metabolic traits, including a 347 bp deletion associated with the content of geranylacetone, a crucial volatile compound34. Additionally, PAV-GWAS in Lactuca uniquely identified disease resistance loci absent from the reference genome, emphasizing how reference bias in traditional investigations obscures novel SVs and highlighting the “magic” of super-pangenomes in revealing them30. The super-pangenome of sesame discovered key SVs regulating plant architecture and oil content, where a GWAS suggested that an LTR-RT insertion in SiPT1 controlled growth patterns and that a mutation in SiNAC1 significantly increased oil yield13. These examples show how super-pangenomics refines trait mapping, expands the genetic toolkit for crop improvement, and promotes the engineering of stress-smart, high-yielding future cultivars.

Despite this progress, super-pangenomics can be further integrated with panomics approaches to predict gene function under stress. See Raza et al.23 for more arguments on how we can integrate panomics with other modern breeding approaches for managing stresses (single or combined) and improving other traits in plants. Rare alleles captured through super-pangenomes considerably influence long-term crop adaptation, which raises a further question: can machine learning be leveraged to prioritize functionally relevant SVs in breeding programs? Answering this question will be crucial for advancing crop genomics beyond SNP-based limitations and uncovering hidden SVs for fast-forward breeding strategies. The studies discussed above demonstrate that moving beyond traditional genomics toward super-pangenomics is not merely an enhancement but a requirement for fully resolving trait architecture and supporting sustainable agriculture.

Super-pangenomes as catalysts for novel breeding approaches

Super-pangenomes offer a precise foundation for GS and haplotype-assisted breeding (HAB) by integrating SVs, rare alleles, and haplotype diversity across diverse germplasms. These advances improve the identification of functionally important variants, thereby increasing the predictive accuracy for desirable traits. However, the extent to which super-pangenomes outperform traditional GS and HAB models remains unclear, as practical constraints may still limit their widespread application. Below, we explore recent findings that address these questions and guide the future of fast-forward breeding.

In rice, haplotype analysis revealed lineage-specific elite genes and SVs with significant agronomic relevance. The rice super-pangenome showed that haplotype variation in LOC_Os01g57250 influences TGW, with the gene acting as a negative regulator. Moreover, haplotype differences in spd6 narrowed down candidate genes for grain length, which confirmed LOC_Os06g04820 as the causal gene11. Similarly, haplotype analysis of TTL1 identified two major haplotypes, hapL (indica) and hapS (japonica), which are associated with heat tolerance and grain size variation, with natural promoter variations driving these differences12. These insights highlight how HAB can optimize key traits by leveraging natural variations in regulatory and coding regions.

Beyond rice, super-pangenomes have also uncovered crucial haplotypes associated with oil content in sesame, yield-related traits in foxtail millet, and fruit quality in tomato. In sesame, a SiNAC1 haplotype carrying a C333A mutation significantly increased oil accumulation and demonstrated the potential of haplotype-based selection for improving seed composition13. In foxtail millet, integrating SVs with SNPs into GS models improved genomic-estimated breeding values by up to 50%, highlighting the role of super-pangenomes in refining trait prediction accuracy36. Moreover, extensive SVs in wild tomatoes present valuable genetic resources for breeding, but large inversions can suppress recombination and create linkage drag. For example, Solanum pennellii, which lacks a 7.1 Mb inversion on chromosome 3 found in other wild species, serves as an ideal donor for backcross breeding to introduce desirable alleles into cultivated varieties34. These findings raise a critical question: how can we overcome recombination barriers imposed by large genomic rearrangements while preserving beneficial alleles for crop improvement?

Super-pangenomes provide an advanced toolkit for fast-tracking breeding through GAB methods, e.g., HAB and GS models. By systematically integrating SVs and refining breeding strategies, we can improve the efficiency of trait selection while minimizing linkage drag. In the future, more efforts should focus on optimizing the use of super-pangenome data in breeding programs, developing computational tools to increase genomic prediction accuracy, and exploring how haplotype interactions influence complex agronomic traits. Addressing these challenges will be key to realizing the full potential of super-pangenomics for sustainable future agriculture.

Challenges in adopting super-pangenomes as reference genomes in basic omics analysis and beyond

Super-pangenomes deliver massive genomic reference datasets by integrating SVs, haplotypes, and dispensable genes, which collectively offer a dynamic platform to dissect genus-wide diversity. However, their adoption in basic omics analysis and breeding programs poses multifaceted challenges (see Fig. 3 for an overview). If unaddressed, these challenges risk perpetuating single-reference biases rather than harnessing the full “surprise package” potential of super-pangenomes for stress-smart, high-yielding crops. Below, we outline the key challenges and propose evidence-based strategies to overcome them.

Fig. 3: Key challenges in adopting super-pangenomes as new reference genomes in research and breeding.

Two-way thick black arrows show the influence or interaction, including (1) arrows from the center to the challenges signify how super-pangenomes address/amplify these challenges and (2) line-with-bars from challenges to the center suggest how these challenges impact/hinder the adoption of super-pangenomes. Circular colored lines highlight the interplay between challenges, as shown by the colored text. Created with BioRender.com.

A foundational challenge is finding the right balance between sequencing complexity and genetic diversity during super-pangenome construction. Over-representing closely related cultivars maximizes haplotype resolution but captures minimal novel diversity. Conversely, including highly divergent wild species adds numerous SVs and valuable alleles but notably increases computational complexity. Studies of the Vitis28 and Cicer15 super-pangenomes suggest a tiered strategy, i.e., 20–50 representative genomes (with ~30% wild progenitors) can recover >90% of SV diversity without vast resources, complemented by a larger panel of resequenced accessions to capture population-level diversity7. Beyond ~30–40 representative genomes, incremental variant discovery plateaus while computational burdens, e.g., terabyte-scale graphs and >100x alignment runtimes, increase sharply9,10. This challenge is intensified in polyploid species, where homoeologous exchanges blur haplotype boundaries. In species such as wheat49, specialized polyploid-aware or hybrid linear-graph algorithms are required to disentangle adaptive variants9, which can ensure scalability across ploidy levels and transform diversity into actionable trait-mining tools.
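
The plateau in variant discovery described above can be explored with a simple saturation analysis. The sketch below is one possible approach, not the procedure used in the cited studies: it greedily picks the genome that adds the most SVs not yet covered and stops once a target fraction of the total SV catalog is reached. The function name, the input layout, and the toy SV sets are hypothetical.

```python
def greedy_representatives(sv_sets, target=0.90):
    """Greedily pick genomes until `target` fraction of all SVs is covered.

    sv_sets: {genome_id: set of SV ids} (hypothetical input layout).
    Returns [(genome_id, cumulative_fraction_covered), ...] in pick order.
    """
    all_svs = set().union(*sv_sets.values())
    if not all_svs:
        return []
    covered, picks = set(), []
    remaining = dict(sv_sets)
    while remaining and len(covered) / len(all_svs) < target:
        # Genome contributing the most uncovered SVs; ties go to the first name alphabetically.
        best = max(sorted(remaining), key=lambda g: len(remaining[g] - covered))
        gain = remaining.pop(best) - covered
        if not gain:
            break  # no remaining genome adds anything new
        covered |= gain
        picks.append((best, round(len(covered) / len(all_svs), 3)))
    return picks


# Invented toy catalog: three similar cultivars and two divergent wild accessions.
toy_sets = {
    "cult1": {1, 2, 3},
    "cult2": {1, 2, 3, 4},
    "cult3": {2, 3, 4},
    "wild1": {1, 5, 6, 7, 8},
    "wild2": {8, 9, 10},
}
picks = greedy_representatives(toy_sets)
for genome, fraction in picks:
    print(genome, fraction)
```

In this toy case, the two wild accessions and a single cultivar already cover the whole catalog, while the remaining cultivars add nothing new. This mirrors the point that closely related cultivars contribute little novel diversity.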

Although early concerns highlighted potential alignment ambiguities, graph-based references have been shown to substantially mitigate reference bias compared with linear genomes. For instance, tools like GraphAligner enable rapid long-read mapping to pangenome graphs, which reduces error rates by 25–40% in highly divergent regions50. Minigraph-Cactus pipelines, in turn, support the construction of scalable pangenome graphs from whole-genome alignments51. Contrary to these initial concerns, studies show that pangenome graphs reduce mapping bias in short-variant calling by up to ~38% compared with linear genomes. This enables more accurate genotyping of rare SVs across diverse haplotypes without favoring cultivated accessions52,53,54. Moreover, the variation graph (vg) toolkit enhances variant discovery in complex regions, turning potential biases into prospects for finding hidden trait loci overlooked by SNP-based methods52. To handle the massive computational load, distributed computing frameworks (e.g., Apache Spark on cloud platforms like AWS or Google Cloud) are being adapted for parallel SV calling, offering significant (e.g., 10–20x) speedups for processing terabyte- to petabyte-scale graphs55, as demonstrated in single-cell transcriptomics extensions to pangenomes56. Nevertheless, alignment accuracy can diminish in hypervariable regions with extreme sequence divergence, implying that hybrid linear-graph approaches may be beneficial for specific taxonomic groups. The development of more robust algorithms and optimized, scalable pipelines for these graph-based genomes remains a critical need.
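
A toy example helps show why a graph reference reduces reference bias. The sketch below is not how GraphAligner, Minigraph-Cactus, or vg work internally (those tools rely on indexing and approximate alignment). It is a minimal illustration in which an insertion is stored as an alternative path, so a read spanning the insertion matches the graph exactly but not the linear reference. The class, node IDs, and sequences are invented.

```python
class VariationGraph:
    """Minimal acyclic sequence graph: nodes carry sequence, edges link them."""

    def __init__(self):
        self.seq = {}    # node id -> sequence
        self.edges = {}  # node id -> list of successor node ids

    def add_node(self, node_id, sequence):
        self.seq[node_id] = sequence
        self.edges.setdefault(node_id, [])

    def add_edge(self, src, dst):
        self.edges[src].append(dst)

    def walks(self, start):
        """Yield the spelled sequence of every walk from `start` to a sink node."""
        stack = [(start, self.seq[start])]
        while stack:
            node, spelled = stack.pop()
            if not self.edges[node]:
                yield spelled
            for nxt in self.edges[node]:
                stack.append((nxt, spelled + self.seq[nxt]))

    def contains(self, read, start):
        """Exact-match check: is `read` a substring of any haplotype walk?"""
        return any(read in walk for walk in self.walks(start))


graph = VariationGraph()
graph.add_node(1, "ACGT")    # shared flank
graph.add_node(2, "TTAGGC")  # insertion carried by some haplotypes only
graph.add_node(3, "GATC")    # shared flank
graph.add_edge(1, 3)         # reference path skips the insertion
graph.add_edge(1, 2)         # alternative path enters the insertion
graph.add_edge(2, 3)

linear_reference = graph.seq[1] + graph.seq[3]
read = "GTTTAGGCGA"          # read sampled across the insertion
print(read in linear_reference, graph.contains(read, 1))
```

Enumerating every walk is only feasible for tiny graphs. It is used here to keep the example short and should not be read as a scalable mapping strategy.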

Furthermore, integrating super-pangenomes into breeding programs for field-ready varieties faces significant feasibility obstacles that must be tackled for equitable adoption, particularly in resource-limited settings. Protracted breeding cycles (5–10 years) for trait introgression, high upfront costs (i.e., super-pangenome construction and large-scale resequencing), and limited accessibility for smallholder farmers all risk widening the genomic divide and remain major barriers to equitable adoption. To address these constraints, we need a multi-pronged strategy: (1) international collaborations (e.g., CGIAR’s orphan crop hubs, https://www.cgiar.org/) and public-sector investment in pre-breeding programs for capacity building; (2) the development of cost-effective, simplified genotyping panels targeting informative, super-pangenome-informed haplotypes; and (3) the adoption of easy-to-use open-source computational tools that enable low-compute GS and can potentially reduce cycle times19,57,58. These strategies are important to ensure that the demonstrated benefits of super-pangenomes, such as significant gains in GS accuracy (e.g., 50% GS accuracy gains in foxtail millet via SV integration36), are accessible to a wider range of breeders and institutions, which can ensure scalable progress toward modern crop design.

Standardization and functional annotation also limit adoption. The lack of consistent methodologies for super-pangenome construction, annotation, and analysis hinders reproducibility and comparability across studies. In addition, functional annotation gaps, particularly for accessory genes and SVs, restrict biological interpretation19,20,53. Addressing these gaps requires AI-driven functional predictions19,20,53, integration with panomics23,59, and validation through high-throughput phenotyping to link genomic signals with agronomic performance1,19,60.

Finally, the adoption of super-pangenomes raises pressing ethical and biosafety considerations that must be proactively addressed. The power to introgress alleles from wild relatives or engineer crops based on super-pangenome data demands a robust ethical framework to ensure equitable access to genetic resources and benefits, particularly for farmers and communities in the Global South hotspots where many crop wild relatives originate61. This is crucial to prevent a widening of the genomic divide, as the current research focus skews heavily towards major crops, overlooking the potential of orphan species. International frameworks, including the Convention on Biological Diversity (CBD, https://www.cbd.int/convention), Nagoya Protocol (https://www.cbd.int/abs/default.shtml), and Cartagena Protocol (https://bch.cbd.int/protocol), provide a starting point for sharing. Still, clearer crop-focused policies are necessary to address data ownership and intellectual property. Biosafety protocols must also evolve to examine the potential ecological impacts of introducing complex SVs from distant relatives, including unintended gene flow, allergenicity, or herbicide resistance spillover, in line with the Cartagena Protocol on Biosafety62,63. Moreover, ethical frameworks must prioritize global genetic diversity conservation to avoid biopiracy and promote fair and equitable benefit-sharing, such as ensuring royalties flow to Global South custodians of wild germplasm as outlined in the Nagoya Protocol (https://www.cbd.int/abs/default.shtml)61. This is especially important for orphan crops, because genomic coverage remains skewed toward major staples even though early results are promising; for instance, the chickpea super-pangenome uncovered 263 genes for flowering in underutilized accessions15, and SV integration in foxtail millet delivered GS gains of up to 50% for different agronomic traits36. A dedicated push for super-pangenomes in these species, via CBD-aligned repositories, will conserve critical domestication sweeps and help discover the untapped potential of orphan crops for a more food-secure and biodiverse agricultural future64,65. Establishing transparent governance and stakeholder engagement will be crucial for gaining public trust and ensuring the sustainable and equitable deployment of this technology.

By addressing these challenges through advancements in computational tools, standardized pipelines, feasibility strategies, and interdisciplinary collaborations, including inputs from ethicists and social scientists, super-pangenomes can move from “promise to practice” and truly become the new foundation (i.e., reference genomes) for modern omics research and breeding, and for a sustainable, food-secure world.

Short- and long-term benefits of super-pangenomes: hope or hype?

Given the rapid innovations in super-pangenome construction and interpretation, breeders can expect substantial real benefits from these complex genomic datasets, both now and in the coming years. In this context, Table 2 briefly summarizes these benefits from a breeding perspective, on the basis of our own understanding and ongoing breeding efforts, while weighing whether super-pangenomes are a convincing hope or more speculative hype1. In the short term, super-pangenomes already offer various practical gains that breeders can begin leveraging. In the longer term, super-pangenomes can guide how we characterize and exploit genetic diversity for crop improvement programs (Table 2).

Table 2 Short- vs long-term benefits of super-pangenomes for fast-forward breeding

Super-pangenomes are not “silver bullets”; their magical power lies in complementing traditional and modern breeding approaches rather than replacing them. However, their implementation remains in its early stages, and the pathway from “promise to practice” involves substantial bottlenecks (see Section “Challenges in adopting super-pangenomes as reference genomes in basic omics analysis and beyond” and Fig. 3 for detailed discussions on key technical and computational challenges). Additionally, regulatory constraints, the cost of implementation, and public perception of genome editing tools must be addressed. Notably, uneven implementation (mainly in resource-constrained settings) poses a risk of widening the global innovation divide. Nevertheless, cumulative evidence supports a strong case for optimism. With continued improvements in data integration, user-friendly analytical tools, and functional genomics, the magical potential of super-pangenomes is progressively within reach. Therefore, we anticipate that in the near future the “hope” of super-pangenomes will begin to outweigh the “hype”, which will guide a new era of precision, tolerance, and inclusivity in plant breeding programs.

Concluding remarks and future perspectives

Super-pangenomes have transformed crop genomics by harnessing SVs, novel genes, and rare alleles that were previously neglected by traditional SNP-based methods. At the same time, their use has exposed technical, computational, and ethical bottlenecks that must be addressed for their extensive adoption (Fig. 3). In addition to these variations, super-pangenomes have also played a surprising role in identifying unique genetic features, e.g., TEs, CREs, CNSs, and species-specific SVs. These elements not only shape genome plasticity but also contribute to trait evolution, stress adaptation, epigenetic regulation, and transgenerational inheritance, offering new targets for precision breeding. Moreover, these variations play important roles in deciphering complex agronomic traits and domestication, and in improving GS, HAB, and other GAB methods. The successful application of these methods in diverse crop plants highlights their ability to capture trait-associated variations and fast-track the design of high-yielding, stress-smart, and nutritionally enriched future crops (Fig. 2).

By integrating pan-genomic resources across diverse species, super-pangenomes provide comprehensive genetic datasets that enable in-depth examination of species-specific and dispensable genes. These datasets allow scientists to discover key adaptation genes and develop strategies for introducing or cultivating climate-smart varieties. Furthermore, the variations detected via super-pangenomes can serve as molecular markers in marker-assisted breeding, which enables the efficient transfer of desirable traits from wild relatives to domesticated cultivars.

Despite these breakthroughs, several challenges hinder the extensive adoption of super-pangenomes (Fig. 3), including the following:

  1. The massive amount of genomic data demands high-performance computing, scalable algorithms, and cost-effective storage solutions.

  2. Global policy frameworks for equitable data sharing and benefit-sharing are still needed to ensure that the innovations enabled by super-pangenomes are available to all, particularly breeding programs in developing, lower-income countries.

  3. Owing to the imperfect annotation of many SVs, determining their functional significance in trait advancement is difficult.

  4. The linkage drag caused by SVs complicates breeding efforts and demands innovative strategies to retain beneficial alleles while minimizing undesirable genetic hitchhiking.

  5. The limited focus on underutilized and orphan crop species suggests the need to expand super-pangenomic investigations to harness novel genetic elements.

  6. Protracted breeding cycles (5–10 years) and high costs for super-pangenome construction are major barriers to equitable adoption in resource-limited settings.

We argue that several fast-forward and forward-looking research directions should be prioritized to fully exploit the “surprising magic” of super-pangenomes:

  1. Integrating insights from TE- and SV-based genome evolution into functional genomics and epigenetic studies to discover their precise roles in creating heritable epialleles and their impact on stress adaptation and trait improvement.

  2. Integrating graph-based pan-genomes with ML- and AI-driven genome analysis, including deep learning algorithms for imputing or predicting rare alleles in dispensable genomes, to enhance trait mapping, genomic prediction, and breeding design19,20,21.

  3. Integrating panomics tools (genomics, transcriptomics, epigenomics, proteomics, metabolomics, phenomics, and others) to gain a meaningful understanding of genome architecture, stress-adaptive mechanisms, and agronomic trait regulation.

  4. Harnessing single-cell sequencing and tissue/cell-specific phenotyping to validate super-pangenome predictions and dissect specific gene regulatory networks and cellular responses under single and combined stress conditions.

  5. Developing standardized, community-approved pipelines for SV-aware breeding to guide more precise genomic predictions and targeted crop improvement.

  6. Harnessing catalogs of CNSs and CREs and testing their potential as universal regulatory elements for cross-species gene regulation. This will enable CRISPR-based fine-tuning of gene expression for trait enhancement and the design of synthetic genetic circuits in synthetic biology applications, which is a way forward from editing coding sequences to mastering the regulatory genome.

  7. Building super-pangenomes for underutilized species to increase genetic diversity, conserve domestication sweeps, and contribute to future-proofing of global food security.

To fully harness the power of super-pangenomes, future research must bridge the gap between genomic discoveries and practical breeding applications (i.e., discovery of practical products). By integrating panomics, high-throughput phenotyping, and AI-assisted genome analysis (e.g., for predictive modeling of phenome-genome relationships), we can fast-track precision breeding and guide the design of future crops21,23,60,66. In conclusion, super-pangenomes are not merely a “surprise package” for insightful genomic exploration; they constitute the foundation for next-generation crop breeding, guiding more efficient, precise, and sustainable agricultural innovations.