A tomato telomere-to-telomere super-pangenome empowers stress resilience breeding

Abstract

Tomato (Solanum lycopersicum), one of the world’s most valuable vegetable crops, has suffered from diminished genetic diversity and stress resistance. Wild tomatoes serve as an invaluable genetic reservoir, yet their potential for stress resilience remains largely unexploited in tomato breeding. Here we report a genus-wide super-pangenome across 16 tomato species by integrating 20 telomere-to-telomere genomes and 27 published chromosome-scale genomes. Genus-wide population analysis demonstrates broad genetic diversity with limited gene flows among principal clades. Pan-centromere analysis reveals a diverse landscape and dynamic evolution of the mysterious tomato centromeres involving rapid diversification, satellite emergence and repositioning. A comprehensive catalog of structural variants uncovers extensive rearrangements, especially from wild tomatoes, and discovers key molecular markers associated with salinity resistance. Structural-variant-based genome-wide association studies identified a leucine-rich repeat receptor gene SlGMAK conferring gray mold resistance. Our telomere-to-telomere super-pangenome will accelerate exploiting the untapped potential of wild relatives to improve modern tomatoes for stress resilience.

Fig. 1: Population analysis of wild and cultivated tomatoes.
Fig. 2: T2T genome assemblies of wild and cultivated tomatoes.
Fig. 3: Pan-centromere landscape and diversity of tomatoes.
Fig. 4: Super-pangenome analysis of 20 tomato accessions (16 species).
Fig. 5: Catalog of structural variations in tomato genomes reveals functional SVs affecting salt stress tolerance.
Fig. 6: SV-GWAS identified a GMR gene.

Data availability

All raw sequencing data and genome assemblies generated in this study have been deposited in the China National Center for Bioinformation (CNCB, https://ngdc.cncb.ac.cn/, BioProject accession PRJCA030093). Whole-genome resequencing, CENH3 ChIP–seq, ATAC-seq and RNA-seq data have also been deposited in the National Center for Biotechnology Information (NCBI, BioProject accession PRJNA1201608). The genome assemblies and annotations are available via Zenodo at https://doi.org/10.5281/zenodo.17878268 (ref. 101). Tomato whole-genome resequencing data used in this study were downloaded from NCBI under BioProject PRJEB5235. Source data are provided with this paper.

Code availability

The scripts used in this study are available via GitHub at https://github.com/ChunmeiShi02/TomatoT2Tsuperpangenome and via Zenodo at https://doi.org/10.5281/zenodo.17935735 (ref. 102).

References

  1. Lin, T. et al. Genomic analyses provide insights into the history of tomato breeding. Nat. Genet. 46, 1220–1226 (2014).

  2. Knapp, S. & Peralta, I. E. in The Tomato Genome. Compendium of Plant Genomes (eds Causse, M. et al.) 7–21 (Springer, 2016).

  3. Ercolano, M. R., Di Matteo, A., Andolfo, G. & Frusciante, L. in The Wild Solanums Genomes (eds Carputo, D. et al.) 35–49 (Springer, 2021).

  4. Martin, G. B. et al. Map-based cloning of a protein kinase gene conferring disease resistance in tomato. Science 262, 1432–1436 (1993).

  5. Yang, H. et al. The Sm gene conferring resistance to gray leaf spot disease encodes an NBS-LRR (nucleotide-binding site-leucine-rich repeat) plant resistance protein in tomato. Theor. Appl. Genet. 135, 1467–1476 (2022).

  6. Powell, A. F. et al. A Solanum lycopersicoides reference genome facilitates insights into tomato specialized metabolism and immunity. Plant J. 110, 1791–1810 (2022).

  7. van Rengs, W. M. J. et al. A chromosome scale tomato genome built from complementary PacBio and Nanopore sequences alone reveals extensive linkage drag during breeding. Plant J. 110, 572–588 (2022).

  8. Yu, X. et al. Chromosome-scale genome assemblies of wild tomato relatives Solanum habrochaites and Solanum galapagense reveal structural variants associated with stress tolerance and terpene biosynthesis. Hortic. Res. 9, uhac139 (2022).

  9. The Tomato Genome Consortium. The tomato genome sequence provides insights into fleshy fruit evolution. Nature 485, 635–641 (2012).

  10. Zhou, Y. et al. Graph pangenome captures missing heritability and empowers tomato breeding. Nature 606, 527–534 (2022).

  11. Chen, Y., Tian, J., Zhao, Y., Zhang, J. & Liang, C. A telomere-to-telomere reference genome assembly of tomato cultivar ‘Heinz 1706’. Plant Commun. 6, 101618 (2025).

  12. Zhang, Y. et al. Telomere-to-telomere Citrullus super-pangenome provides direction for watermelon breeding. Nat. Genet. 56, 1750–1761 (2024).

  13. Shang, L. et al. A super pan-genomic landscape of rice. Cell Res. 32, 878–896 (2022).

  14. Jayakodi, M. et al. Structural variation in the pangenome of wild and domesticated barley. Nature 636, 654–662 (2024).

  15. Tong, X. et al. High-resolution silkworm pan-genome provides genetic insights into artificial selection and ecological adaptation. Nat. Commun. 13, 5619 (2022).

  16. Gao, L. et al. The tomato pan-genome uncovers new genes and a rare allele regulating fruit flavor. Nat. Genet. 51, 1044–1051 (2019).

  17. Alonge, M. et al. Major impacts of widespread structural variation on gene expression and crop improvement in tomato. Cell 182, 145–161.e23 (2020).

  18. Li, N. et al. Super-pangenome analyses highlight genomic diversity and structural variation across wild and cultivated tomato species. Nat. Genet. 55, 852–860 (2023).

  19. Aflitos, S. et al. Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing. Plant J. 80, 136–148 (2014).

  20. Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. J. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).

  21. Benoit, M. et al. Solanum pan-genetics reveals paralogues as contingencies in crop engineering. Nature 640, 135–145 (2025).

  22. Henikoff, S., Ahmad, K. & Malik, H. S. The centromere paradox: stable inheritance with rapidly evolving DNA. Science 293, 1098–1102 (2001).

  23. Chen, W. et al. Two telomere-to-telomere gapless genomes reveal insights into Capsicum evolution and capsaicinoid biosynthesis. Nat. Commun. 15, 4295 (2024).

  24. Chen, W. et al. The complete genome assembly of Nicotiana benthamiana reveals the genetic and epigenetic landscape of centromeres. Nat. Plants 10, 1928–1943 (2024).

  25. Zhang, H. et al. Boom-bust turnovers of megabase-sized centromeric DNA in Solanum species: rapid evolution of DNA sequences associated with centromeres. Plant Cell 26, 1436–1447 (2014).

  26. Chang, S. B. et al. FISH mapping and molecular organization of the major repetitive sequences of tomato. Chromosome Res. 16, 919–933 (2008).

  27. Kasinathan, S. & Henikoff, S. Non-B-form DNA is enriched at centromeres. Mol. Biol. Evol. 35, 949–962 (2018).

  28. Mandáková, T., Hloušková, P., Koch, M. A. & Lysak, M. A. Genome evolution in Arabideae was marked by frequent centromere repositioning. Plant Cell 32, 650–665 (2020).

  29. Liu, Y. et al. Pan-centromere reveals widespread centromere repositioning of soybean genomes. Proc. Natl Acad. Sci. USA 120, e2310177120 (2023).

  30. Pedley, K. F. & Martin, G. B. Molecular basis of Pto-mediated resistance to bacterial speck disease in tomato. Annu. Rev. Phytopathol. 41, 215–243 (2003).

  31. Goel, M., Sun, H., Jiao, W. B. & Schneeberger, K. SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies. Genome Biol. 20, 277 (2019).

  32. Hickey, G. et al. Genotyping structural variants in pangenome graphs using the vg toolkit. Genome Biol. 21, 35 (2020).

  33. Wang, Z. et al. Loss of salt tolerance during tomato domestication conferred by variation in a Na+/K+ transporter. EMBO J. 39, e103256 (2020).

  34. Guo, M. et al. Loss of cold tolerance is conferred by absence of the WRKY34 promoter fragment during tomato evolution. Nat. Commun. 15, 6667 (2024).

  35. Mu, Q. et al. Fruit weight is controlled by cell size regulator encoding a novel protein that is expressed in maturing tomato fruits. PLoS Genet. 13, e1006930 (2017).

  36. Williamson, B. et al. Botrytis cinerea: the cause of grey mould disease. Mol. Plant Pathol. 8, 561–580 (2007).

  37. Khan, A. W. et al. Super-pangenome by integrating the wild side of a species for accelerated crop improvement. Trends Plant Sci. 25, 148–158 (2020).

  38. Davis, J. et al. Mapping of loci from Solanum lycopersicoides conferring resistance or susceptibility to Botrytis cinerea in tomato. Theor. Appl. Genet. 119, 305–314 (2009).

  39. Finkers, R. et al. Three QTLs for Botrytis cinerea resistance in tomato. Theor. Appl. Genet. 114, 585–593 (2007).

  40. Finkers, R. et al. Quantitative resistance to Botrytis cinerea from Solanum neorickii. Euphytica 159, 83–92 (2008).

  41. Strickler, S. R. et al. Comparative genomics and phylogenetic discordance of cultivated tomato and close wild relatives. PeerJ 3, e793 (2015).

  42. Darwin, S. C., Knapp, S. & Peralta, I. E. J. Taxonomy of tomatoes in the Galápagos Islands: native and introduced species of Solanum section Lycopersicon (Solanaceae). Syst. Biodivers. 1, 29–53 (2003).

  43. Lim, K. B. et al. Characterization of rDNAs and tandem repeats in the heterochromatin of Brassica rapa. Mol. Cells 19, 436–444 (2005).

  44. Hu, J. et al. NextDenovo: an efficient error correction and accurate assembly tool for noisy long reads. Genome Biol. 25, 107 (2024).

  45. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

  46. Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).

  47. Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).

  48. Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 3, 99–101 (2016).

  49. Alonge, M. et al. Automated assembly scaffolding using RagTag elevates a new tomato system for high-throughput genome editing. Genome Biol. 23, 258 (2022).

  50. Xu, M. et al. TGS-GapCloser: a fast and accurate gap closer for large genomes with low coverage of error-prone long reads. GigaScience 9, giaa094 (2020).

  51. Jain, C., Rhie, A., Hansen, N. F., Koren, S. & Phillippy, A. M. Long-read mapping to repetitive reference sequences using Winnowmap2. Nat. Methods 19, 705–710 (2022).

  52. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  53. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  54. Huang, N. & Li, H. Compleasm: a faster and more accurate reimplementation of BUSCO. Bioinformatics 39, btad595 (2023).

  55. Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 245 (2020).

  56. Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl Acad. Sci. USA 117, 9451–9457 (2020).

  57. Bao, W., Kojima, K. K. & Kohany, O. J. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob. DNA 6, 11 (2015).

  58. Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinformatics 25, 4.10.1–4.10.14 (2009).

  59. Cantarel, B. L. et al. MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 18, 188–196 (2008).

  60. Slater, G. S. & Birney, E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics 6, 31 (2005).

  61. Korf, I. Gene finding in novel genomes. BMC Bioinformatics 5, 59 (2004).

  62. Hoff, K. J., Lomsadze, A., Borodovsky, M. & Stanke, M. Whole-genome annotation with BRAKER. Methods Mol. Biol. 1962, 65–95 (2019).

  63. Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).

  64. Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).

  65. Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).

  66. Nevers, Y. et al. Quality assessment of gene repertoire annotations with OMArk. Nat. Biotechnol. 43, 124–133 (2025).

  67. Wlodzimierz, P. et al. Cycles of satellite and transposon evolution in Arabidopsis centromeres. Nature 618, 557–565 (2023).

  68. Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).

  69. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  70. Ramírez, F., Dündar, F., Diehl, S., Grüning, B. A. & Manke, T. DeepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 42, W187–W191 (2014).

  71. Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268 (2007).

  72. Ellinghaus, D., Kurtz, S. & Willhoeft, U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics 9, 18 (2008).

  73. Ou, S. & Jiang, N. LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiol. 176, 1410–1422 (2018).

  74. Vollger, M. R., Kerpedjiev, P., Phillippy, A. M. & Eichler, E. E. StainedGlass: interactive visualization of massive tandem repeat structures with identity heatmaps. Bioinformatics 38, 2049–2051 (2022).

  75. Tarasov, A., Vilella, A. J., Cuppen, E., Nijman, I. J. & Prins, P. Sambamba: fast processing of NGS alignment formats. Bioinformatics 31, 2032–2034 (2015).

  76. Yu, H., Shi, C., He, W., Li, F. & Ouyang, B. PanDepth, an ultrafast and efficient genomic tool for coverage calculation. Brief Bioinform. 25, bbae197 (2024).

  77. Van der Auwera, G. A. et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr. Protoc. Bioinformatics 43, 11.10.1–11.10.33 (2013).

  78. Li, H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27, 2987–2993 (2011).

  79. Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly 6, 80–92 (2012).

  80. Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).

  81. Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).

  82. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  83. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).

  84. Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).

  85. Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).

  86. Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).

  87. Kozlov, A. M., Darriba, D., Flouri, T., Morel, B. & Stamatakis, A. RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Bioinformatics 35, 4453–4455 (2019).

  88. Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).

  89. Särkinen, T., Bohs, L., Olmstead, R. G. & Knapp, S. A phylogenetic framework for evolutionary study of the nightshades (Solanaceae): a dated 1000-tip tree. BMC Evol. Biol. 13, 214 (2013).

  90. Chen, H. & Zwaenepoel, A. Inference of ancient polyploidy from genomic data. Methods Mol. Biol. 2545, 3–18 (2023).

  91. Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).

  92. Steuernagel, B. et al. The NLR-annotator tool enables annotation of the intracellular immune receptor repertoire. Plant Physiol. 183, 468–482 (2020).

  93. Smith, M., Jones, J. T. & Hein, I. Resistify: a novel NLR classifier that reveals Helitron-associated NLR expansion in Solanaceae. Bioinform. Biol. Insights 19, 11779322241308944 (2025).

  94. Ngou, B. P. M., Heal, R., Wyler, M., Schmid, M. W. & Jones, J. D. G. Concerted expansion and contraction of immune receptor gene repertoires in plant genomes. Nat. Plants 8, 1146–1152 (2022).

  95. Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).

  96. Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).

  97. Jeffares, D. C. et al. Transient structural variations have strong effects on quantitative traits and reproductive isolation in fission yeast. Nat. Commun. 8, 14061 (2017).

  98. Kang, H. M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010).

  99. Li, M. X., Yeung, J. M., Cherny, S. S. & Sham, P. C. Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets. Hum. Genet. 131, 747–756 (2012).

  100. Fu, J. et al. Resistance to RNA interference by plant-derived double-stranded RNAs but not plant-derived short interfering RNAs in Helicoverpa armigera. Plant Cell Environ. 45, 1930–1941 (2022).

  101. Shi, C., Chen, S. & Wang, J. Tomato telomere-to-telomere super pangenome empowers stress resilience breeding. Zenodo https://doi.org/10.5281/zenodo.17878268 (2025).

  102. Shi, C., Chen, S. & Wang, J. Codes for a tomato telomere-to-telomere super pangenome. Zenodo https://doi.org/10.5281/zenodo.17935735 (2025).

Acknowledgements

We thank S. Huang for constructive suggestions on the paper, and the Tomato Genetics Resource Center at the University of California, Davis, for providing seeds for wild tomato accessions. We thank the Bioinformatics Platform at Peking University Institute of Advanced Agricultural Sciences for providing the high-performance computing resources. This work was supported by the Key R&D Program of Shandong Province (projects 2025CXPT174 and 2025CXPT160 to C.Y.), the National Natural Science Foundation of China (32500545 to W.C.), the Shandong Provincial Natural Science Foundation (project SYS202206), the Yuandu Scholars Program, and the Taishan Scholars Program and Natural Science Foundation for Distinguished Young Scholars (ZR2023JQ010 to L.G.) of Shandong Province.

Author information

Contributions

C.Y. and L.G. conceived and supervised the project. Q.G. collected material phenotypes. D.M. and X.S. prepared the biological materials for sequencing. C. Shi, S.C., J.W. and W.C. performed bioinformatic analyses and prepared tables and figures. C. Sun conducted gene functional analysis. S.L., H.W., Y.M., X.S., J.Z. and L.D. carried out molecular experiments. C. Shi, L.G., C.Y. and W.C. wrote and revised the paper. L.Z. and S.H. participated in discussion and result interpretation. All authors read and approved the final paper.

Corresponding authors

Correspondence to Li Guo or Changxian Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Michael Bevan and Xingtan Zhang for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 An overview of the tomato graph super-pangenome in this study.

a, The SL6.0 T2T genome enables comprehensive population analysis across 294 wild and cultivated tomato accessions. b, Construction of the super-pangenome and investigation of pan-centromere evolution. c, A graph-based super-pangenome facilitates discovering genetic variations associated with salt tolerance and grey mold resistance. d, Editing and establishing new editing systems in wild species or cultivated varieties for de novo crop domestication.

Extended Data Fig. 2 Genome assembly and annotation of S. lycopersicum T2T gap-free genome.

a, Whole-genome coverage of HiFi reads across the SL6.0 assembly. Regions with read depth below 100 or above 250 are marked by black shading. The GC content is shown as a blue curve. 5S rDNA is marked by blue boxes. Satellites are marked by red boxes. b, Examples of gap closures on Chr04 and Chr05 in the SL6.0 genome assembly. c, Comparison of the annotated proteomes from the SL6.0 and SL5.0 genomes.
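The depth thresholds in panel a can be illustrated with a minimal Python sketch. It assumes depth has already been summarized per window as (start, end, mean depth) tuples; the function name, the window layout and the toy values are hypothetical and are not taken from the study's scripts.

```python
from typing import Iterable, List, Tuple

Window = Tuple[int, int, float]  # (start, end, mean read depth); hypothetical layout


def flag_depth_outliers(
    windows: Iterable[Window], low: float = 100, high: float = 250
) -> List[Tuple[int, int]]:
    """Return merged intervals whose depth is below `low` or above `high`.

    Windows are assumed to be sorted by start and non-overlapping.
    """
    flagged: List[Tuple[int, int]] = []
    for start, end, depth in windows:
        if low <= depth <= high:
            continue  # depth within the expected range, nothing to mark
        if flagged and flagged[-1][1] == start:
            # Extend the previous flagged interval when windows are adjacent.
            flagged[-1] = (flagged[-1][0], end)
        else:
            flagged.append((start, end))
    return flagged


# Toy windows (made-up values, not data from the paper).
toy = [(0, 100, 150), (100, 200, 80), (200, 300, 90), (300, 400, 180), (400, 500, 300)]
print(flag_depth_outliers(toy))
```

Adjacent outlier windows are merged so that each shaded region is reported once.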

Extended Data Fig. 3 Genome statistics and correlation with transposons in 20 wild and cultivated tomato genomes.

a, Genome size and N50 length statistics for the genomes analyzed in the study. b, Correlation between genome size and total TE length across the 20 newly assembled genomes. c-f, Correlation between genome size and the length of different TE types, including Gypsy (c), DNA (d), LINE (e) and Copia (f) elements. Pearson correlation coefficients and 95% confidence intervals are shown for each comparison.
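As an illustration of the statistic reported in panels b-f, the following Python sketch computes a Pearson correlation coefficient with a 95% confidence interval. The caption does not state how the interval was obtained; the Fisher z-transform used here is a common choice and is an assumption, as are the function name and the toy values.

```python
import math
from typing import Sequence, Tuple


def pearson_with_ci(
    x: Sequence[float], y: Sequence[float], z_crit: float = 1.959964
) -> Tuple[float, float, float]:
    """Return (r, lower, upper) using the Fisher z-transform (assumed method)."""
    n = len(x)
    if n != len(y) or n < 4:
        raise ValueError("need two equal-length samples with at least 4 points")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    r = max(-1.0, min(1.0, r))  # guard against floating-point overshoot
    if abs(r) == 1.0:
        return r, r, r  # atanh is undefined at exactly +/-1
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)  # standard error of z
    return r, math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Toy values only (not genome sizes or TE lengths from the paper).
r, lower, upper = pearson_with_ci([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(f"r = {r:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

With only a handful of points the interval is wide, which is why the interval is worth reporting alongside the coefficient.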

Extended Data Fig. 4 Evolution of centromeres in wild and cultivated tomatoes.

a-b, Distribution of Gypsy elements and specific satellites, as well as CENH3 enrichment, on chromosomes Chr02 (a) and Chr03 (b). Six species from five clades are shown as representatives. Clade5 includes a closely related wild species and a widely used domesticated tomato. The centromeres are marked with grey background boxes.

Extended Data Fig. 5 A proposed evolutionary model of satellite-based centromeres in tomatoes.

a, Evolution of specific satellites in wild and cultivated tomatoes. The boxes represent the type of specific 24-bp (Sat24) or 53-bp (Sat53) repeats. The circles show the presence of a specific satellite for each chromosome (column) and accession (row). b, Comparison of two 45S rDNA monomers from S. lycopersicum and N. benthamiana illustrating the origin of the specific Sat53 and centromere-specific CEN33/43 satellites, respectively. c, A proposed evolutionary pathway for the formation of satellite-type centromeres in cultivated tomatoes. The centromere of the Solanum ancestor was probably dominated by Gypsy retrotransposons; Sat24 then emerged in ancient wild tomatoes (for example, JUG) and acted as centromeres. Subsequently, Sat24 degenerated while Sat53 emerged in LYC, probably originating from a subrepeat of 45S rDNA; Sat53 then expanded toward the original centromeres, which then recruited CENH3 protein in wild tomatoes (for example, PER). Finally, satellite-type centromeres (almost exclusively dominated by Sat53) formed and matured in cultivated tomatoes, although only one or two were present.

Extended Data Fig. 6 Pan-NLRome analysis of 20 wild and cultivated tomatoes.

a, Statistics of pan- and core-NLR size across 20 tomato genomes, classified using OrthoFinder. The x-axis indicates the number of tomato accessions, while the y-axis indicates the number of gene families. The central line denotes the median, and the edges of the box define the first and third quartiles. The whiskers extend to the range of datapoints within 1.5 times the interquartile range (IQR) (the same for all subsequent boxplots). b, Core, softcore, dispensable and private NLR genes from 20 tomato genomes: core NLR genes (present in 20 accessions), softcore NLR genes (present in 18-19 accessions), dispensable NLR genes (present in 2-17 accessions) and private NLR genes (present in only one accession). c, Presence and absence information of NLR genes in 20 tomato genomes. d, Pan-NLRome landscape of tomato species from different clades. e, The microsyntenic relationships of one NLR gene cluster on Chr05 in three tomato accessions. SLL A, M82; SLL B, Heinz 1706; SLL C, TS-60; SLL D, AC.
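The category definitions in panel b map directly onto presence counts. The Python sketch below applies those thresholds to a presence table; the function name, the family identifiers and the toy table are hypothetical, and the thresholds are hard-coded for the 20 accessions named in the caption.

```python
from collections import Counter
from typing import Dict, Set

N_ACCESSIONS = 20  # number of genomes in the pan-NLRome, from the caption


def classify_family(n_present: int) -> str:
    """Map the number of accessions carrying a family to its category."""
    if not 1 <= n_present <= N_ACCESSIONS:
        raise ValueError(f"presence count must be between 1 and {N_ACCESSIONS}")
    if n_present == 20:
        return "core"
    if n_present >= 18:
        return "softcore"  # 18-19 accessions
    if n_present >= 2:
        return "dispensable"  # 2-17 accessions
    return "private"  # exactly one accession


def summarize(presence: Dict[str, Set[str]]) -> Counter:
    """Count families per category from a family -> accessions mapping."""
    return Counter(classify_family(len(accs)) for accs in presence.values())


# Toy table with placeholder accession names (not data from the paper).
accessions = [f"acc{i:02d}" for i in range(1, 21)]
toy = {
    "famA": set(accessions),       # present everywhere
    "famB": set(accessions[:19]),  # missing from one accession
    "famC": set(accessions[:5]),
    "famD": {"acc07"},
}
print(summarize(toy))
```

Applying the same function to each family in an orthogroup table would give the per-category totals of the kind plotted in panel b.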

Extended Data Fig. 7 Structural variation landscape and Pan-SV analysis of wild and cultivated tomatoes.

a, SV size distribution. TRA, translocation; INV, inversion; DEL, deletion; INS, insertion. b, Variation in pan- and core-SV counts as additional tomato genomes are incorporated. c, Composition of the pan-SV and individual genomes. The histogram shows the number of SVs in the 46 genomes at different frequencies. The pie chart shows the proportion of SVs in each category. d, Presence and absence information of pan-SVs across the 46 genomes. Red pentagrams represent published genomes. e, Number of private SVs in wild and cultivated tomatoes, showing a higher abundance of private SVs in wild tomatoes. Statistical significance was determined using a two-tailed Student’s t-test.
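The accumulation curve in panel b can be sketched in Python as a running union (pan-SV) and running intersection (core-SV) over per-genome SV sets. This is a minimal sketch for a single genome ordering; curves of this kind are often averaged over many random orderings, and whether the authors did so is not stated here. The function name and the toy SV identifiers are hypothetical.

```python
from typing import Hashable, List, Sequence, Set, Tuple


def pan_core_curve(sv_sets: Sequence[Set[Hashable]]) -> List[Tuple[int, int]]:
    """Return (pan count, core count) after each genome is added in order."""
    pan: Set[Hashable] = set()
    core: Set[Hashable] = set()
    curve: List[Tuple[int, int]] = []
    for i, svs in enumerate(sv_sets):
        pan |= svs  # pan-SV: seen in at least one genome so far
        # core-SV: shared by every genome so far
        core = set(svs) if i == 0 else core & svs
        curve.append((len(pan), len(core)))
    return curve


# Toy SV identifiers per genome (made up for illustration).
toy_genomes = [{"sv1", "sv2", "sv3"}, {"sv2", "sv3", "sv4"}, {"sv3", "sv5"}]
print(pan_core_curve(toy_genomes))
```

The pan count can only grow and the core count can only shrink as genomes are added, which gives the two diverging curves typical of such plots.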

Extended Data Fig. 8 Landscape and functional effects of structural variants.

a, Genomic distribution of SVs in relation to genes. b, SV distribution with respect to transposable elements (TEs). c, Frequency distribution of two SVs (upper panel) and three haplotypes (lower panel) on the SlHAK gene across 294 wild and cultivated accessions. n indicates the number of individuals per species. d, Na⁺ and K⁺ contents in roots and leaves under salt stress. From left to right, Na⁺ and K⁺ contents in roots and Na⁺ and K⁺ contents in leaves for 56 accessions carrying three haplotypes under salt stress. Numbers in brackets indicate the number of accessions per haplotype. Statistical significance was determined using a two-tailed Student’s t-test. e, Frequency distribution of SVs on known genes across 294 wild and cultivated accessions. n indicates the number of individuals per species.

Extended Data Fig. 9 Pangenome-assisted gene discovery and functional analysis for grey mold resistance in tomatoes.

a, Phenotype of tomato leaves after B. cinerea infection. Scale bar = 1 cm. b, Frequency distribution of lesion area on leaves of 209 accessions after B. cinerea infection. c, Comparison of lesion area on leaves of 209 tomato accessions after B. cinerea inoculation. Point colors represent tomato species, including 16 wild and one cultivated (SLC and SLL) species. n represents the number of accessions in each clade. d, PCR validation of a 937-bp deletion in 24 tomato accessions. Numbers 1–24 correspond to accessions in Supplementary Table 21. Three independent experiments were conducted with similar results. e, Functional domains of SlGMAK were determined using InterPro (https://www.ebi.ac.uk/interpro/) and SMART (https://smart.embl.de/) (upper panel). The putative transmembrane (TM) region was predicted with DeepTMHMM (https://dtu.biolib.com/DeepTMHMM/) (lower panel). f, Expression pattern of the SlGMAK gene in various tomato tissues using RNA-seq and RT-qPCR. RNA-seq data for 20 critical developmental stages (roots, stems, leaves, flowers and fruits) of the MicroTom accession were downloaded from NCBI under BioProject PRJCA001514 (n = 3). Upper right panel, quantification of SlGMAK gene expression using RT-qPCR (n = 3). g,h, Transient overexpression of SlGMAK in tomato (g) and N. benthamiana (h) leaves. In g, leaf phenotype (left panel), relative expression levels of SlGMAK (middle panel) (n = 3) and lesion area (right panel) (n = 8) on tomato leaves after B. cinerea infection. In h, leaf phenotype (left panel), relative expression levels of SlGMAK (middle panel) (n = 3) and lesion area (right panel) (n = 7) on N. benthamiana leaves after B. cinerea infection. Scale bar = 1 cm. i, VIGS of SlGMAK in tomato plants. Leaf phenotype (left panel), relative expression levels of SlGMAK (right upper panel) (n = 3) and lesion area (right lower panel) (n = 4) on leaves after B. cinerea infection in empty vector (EV) and VIGS-treated tomato plants. Scale bar = 1 cm.
In f, g, h and i, all data are presented as mean ± SD with n for the number of biological repeats. Statistical significance was determined by two-sided Student’s t-tests.

Source data

Supplementary information

Supplementary Information

Supplementary Notes 1–3, methods, Figs. 1–23, references and figures.

Reporting Summary

Peer Review File

Supplementary Tables

Supplementary Tables 1–25.

Source data

Source Data Figs. 5 and 6 and Extended Data Fig. 9

Statistical source data for Figs. 5 and 6 and Extended Data Fig. 9.

Source Data Extended Data Fig. 9

Unprocessed gels for Extended Data Fig. 9d.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shi, C., Chen, S., Wang, J. et al. A tomato telomere-to-telomere super-pangenome empowers stress resilience breeding. Nat Genet (2026). https://doi.org/10.1038/s41588-026-02508-y

  • DOI: https://doi.org/10.1038/s41588-026-02508-y
