Abstract
Limited pangenome and ambiguous genomic architecture constrain comprehensive genetic variation discovery and cotton improvement. Here we assembled a telomere-to-telomere (T2T) genome for elite cultivar NDM13 and near-T2T genomes for 27 additional representatives of Gossypium hirsutum over the recent century, with transcriptomic profiling of 15 distinct tissues from each. We uncovered 51,551 one-to-one conserved orthologs across all genomes and landscapes of telomere, centromere, 45S rDNA, segmental duplication and copy number variant. We revealed hotspots of structural variation (SV) and impacts of SV, segmental duplication and copy number variant on gene expression or content alteration, as well as adversity resistances. We identified thousands of divergent SVs and genes implicated in modern breeding evolution. Combining T2T-reference-based pangenome construction and 761,536 SVs identified across 1,671 worldwide accessions with phenotypic data from 22 environments, we captured a number of hidden SVs that potentially influence critical breeding traits. These will boost genetic study and biotechnological improvement of the crop.
Data availability
The raw sequencing and transcriptome data for 28 cottons have been deposited in the National Genomics Data Center (NGDC) under the BioProject accession PRJCA023347 and in the NCBI Sequence Read Archive (SRA) under the BioProject accession PRJNA1132390. The genome assemblies of 28 cottons and the CENH3 ChIP–seq data for NDM13 have been deposited in the NGDC under the BioProject accession PRJCA023347. The resequencing data for 1,671 accessions are available in the NCBI SRA under the BioProject accession PRJNA680449 (1,081 cotton accessions) and PRJNA1132397 (590 cotton accessions). Source data are provided with this paper.
Code availability
The scripts and software used in this study are all publicly available, as described in the Methods and Reporting Summary. All custom scripts and code associated with this project are available via Zenodo at https://doi.org/10.5281/zenodo.18357054 (ref. 121) and GitHub at https://github.com/SLBio/Analysis_pipeleine-NG-A66010.
References
Beckert, S. Empire of Cotton: A Global History (Alfred A. Knopf Press, 2014).
Fang, L. et al. Genomic analyses in cotton identify signatures of selection and loci associated with fiber quality and yield traits. Nat. Genet. 49, 1089–1098 (2017).
The International Wheat Genome Sequencing Consortium et al. Shifting the limits in wheat research and breeding using a fully annotated reference genome. Science 361, eaar7191 (2018).
Walkowiak, S. et al. Multiple wheat genomes reveal global variation in modern breeding. Nature 588, 277–283 (2020).
Goff, S. A. et al. A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296, 92–100 (2002).
Yu, J. et al. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296, 79–92 (2002).
Schnable, P. S. et al. The B73 maize genome: complexity, diversity, and dynamics. Science 326, 1112–1115 (2009).
Jiao, Y. et al. Improved maize reference genome with single-molecule technologies. Nature 546, 524–527 (2017).
Hufford, M. B. et al. De novo assembly, annotation, and comparative analysis of 26 diverse maize genomes. Science 373, 655–662 (2021).
Schmutz, J. et al. Genome sequence of the palaeopolyploid soybean. Nature 463, 178–183 (2010).
Paterson, A. H. et al. Repeated polyploidization of Gossypium genomes and the evolution of spinnable cotton fibres. Nature 492, 423–427 (2012).
Wang, K. et al. The draft genome of a diploid cotton Gossypium raimondii. Nat. Genet. 44, 1098–1103 (2012).
Li, F. et al. Genome sequence of the cultivated cotton Gossypium arboreum. Nat. Genet. 46, 567–572 (2014).
Li, F. et al. Genome sequence of cultivated Upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution. Nat. Biotechnol. 33, 524–530 (2015).
Zhang, T. et al. Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement. Nat. Biotechnol. 33, 531–537 (2015).
Ma, Z. et al. Resequencing a core collection of upland cotton identifies genomic variation and loci influencing fiber quality and yield. Nat. Genet. 50, 803–813 (2018).
Hu, Y. et al. Gossypium barbadense and Gossypium hirsutum genomes provide insights into the origin and evolution of allotetraploid cotton. Nat. Genet. 51, 739–748 (2019).
Wang, M. et al. Reference genome sequences of two cultivated allotetraploid cottons, Gossypium hirsutum and Gossypium barbadense. Nat. Genet. 51, 224–229 (2019).
Chen, Z. J. et al. Genomic diversifications of five Gossypium allopolyploid species and their impact on cotton improvement. Nat. Genet. 52, 525–533 (2020).
Huang, G. et al. Genome sequence of Gossypium herbaceum and genome updates of Gossypium arboreum and Gossypium hirsutum provide insights into cotton A-genome evolution. Nat. Genet. 52, 516–524 (2020).
Ma, Z. et al. High-quality genome assembly and resequencing of modern cotton cultivars provide resources for crop improvement. Nat. Genet. 53, 1385–1391 (2021).
Wang, M. et al. Genomic innovation and regulatory rewiring during evolution of the cotton genus Gossypium. Nat. Genet. 54, 1959–1971 (2022).
Sreedasyam, A. et al. Genome resources for three modern cotton lines guide future breeding efforts. Nat. Plants 10, 1039–1051 (2024).
Nurk, S. et al. The complete sequence of a human genome. Science 376, 44–53 (2022).
Chen, J. et al. A complete telomere-to-telomere assembly of the maize genome. Nat. Genet. 55, 1221–1231 (2023).
Naish, M. et al. The genetic and epigenetic landscape of the Arabidopsis centromeres. Science 374, eabi7489 (2021).
Shang, L. et al. A complete assembly of the rice Nipponbare reference genome. Mol. Plant 16, 1232–1236 (2023).
Hu, Y. et al. Post-polyploidization centromere evolution in cotton. Nat. Genet. 57, 1021–1030 (2025).
Hu, G. et al. A telomere-to-telomere genome assembly of cotton provides insights into centromere evolution and short-season adaptation. Nat. Genet. 57, 1031–1043 (2025).
Liu, Y. et al. Pan-genome of wild and cultivated soybeans. Cell 182, 162–176 (2020).
Alonge, M. et al. Major impacts of widespread structural variation on gene expression and crop improvement in tomato. Cell 182, 145–161 (2020).
Jayakodi, M. et al. The barley pan-genome reveals the hidden legacy of mutation breeding. Nature 588, 284–289 (2020).
Qin, P. et al. Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations. Cell 184, 3542–3558 (2021).
Shi, J., Tian, Z., Lai, J. & Huang, X. Plant pan-genomics and its applications. Mol. Plant 16, 168–186 (2023).
Tang, D. et al. Genome evolution and diversity of wild and cultivated potatoes. Nature 606, 535–541 (2022).
Zhou, Y. et al. Graph pangenome captures missing heritability and empowers tomato breeding. Nature 606, 527–534 (2022).
Li, N. et al. Super-pangenome analyses highlight genomic diversity and structural variation across wild and cultivated tomato species. Nat. Genet. 55, 852–860 (2023).
He, Q. et al. A graph-based genome and pan-genome variation of the model plant Setaria. Nat. Genet. 55, 1232–1242 (2023).
Yang, Z. et al. Graph pan-genome illuminates evolutionary trajectories and agronomic trait architecture in allotetraploid cotton. Nat. Genet. 58, 218–229 (2026).
Gu, Q. et al. A high-density genetic map and multiple environmental tests reveal novel quantitative trait loci and candidate genes for fibre quality and yield in cotton. Theor. Appl. Genet. 133, 3395–3408 (2020).
Gu, Q. et al. A stable QTL qSalt-A04-1 contributes to salt tolerance in the cotton seed germination stage. Theor. Appl. Genet. 134, 2399–2410 (2021).
Zhang, X. et al. Breeding of high-quality cotton in Hebei province during the past 70 years. China Cotton 47, 1–6 (2020).
Liao, W. W. et al. A draft human pangenome reference. Nature 617, 312–324 (2023).
Yang, Z. et al. Recent progression and future perspectives in cotton genomic breeding. J. Integr. Plant Biol. 65, 548–569 (2023).
Zhang, C. Y. et al. High-quality genome of a modern soybean cultivar and resequencing of 547 accessions provide insights into the role of structural variation. Nat. Genet. 56, 2247–2258 (2024).
Yang, Z. et al. Multi-omics provides new insights into the domestication and improvement of dark jute (Corchorus olitorius). Plant J. 112, 812–829 (2022).
Zhang, Y. et al. The telomere-to-telomere gap-free genome of four rice parents reveals SV and PAV patterns in hybrid rice breeding. Plant Biotechnol. J. 20, 1642–1644 (2022).
Aganezov, S. et al. A complete reference genome improves analysis of human genetic variation. Science 376, eabl3533 (2022).
Vollger, M. R. et al. Segmental duplications and their variation in a complete human genome. Science 376, eabj6965 (2022).
Bretani, G. et al. Segmental duplications are hot spots of copy number variants affecting barley gene content. Plant J. 103, 1073–1088 (2020).
Emanuel, B. S. & Shaikh, T. H. Segmental duplications: an ‘expanding’ role in genomic instability and disease. Nat. Rev. Genet. 2, 791–800 (2001).
Hosmani, P. S. et al. Dirigent domain-containing protein is part of the machinery required for formation of the lignin-based Casparian strip in the root. Proc. Natl Acad. Sci. USA 110, 14498–14503 (2013).
Paniagua, C. et al. Dirigent proteins in plants: modulating cell wall metabolism during abiotic and biotic stress exposure. J. Exp. Bot. 68, 3287–3301 (2017).
Wang, Y. et al. A dirigent family protein confers variation of Casparian strip thickness and salt tolerance in maize. Nat. Commun. 13, 2222 (2022).
Yang, X. et al. A loss-of-function of the dirigent gene TaDIR-B1 improves resistance to Fusarium crown rot in wheat. Plant Biotechnol. J. 19, 866–868 (2021).
Deng, J. et al. Dirigent gene family is involved in the molecular interaction between Panax notoginseng and root rot pathogen Fusarium solani. Ind. Crop. Prod. 178, 114544 (2022).
Lin, J. L. et al. Dirigent gene editing of gossypol enantiomers for toxicity-depleted cotton seeds. Nat. Plants 9, 605–615 (2023).
Li, S. et al. Genome-edited powdery mildew resistance in wheat without growth penalties. Nature 602, 455–460 (2022).
Li, Y. B. et al. The thioredoxin GbNRX1 plays a crucial role in homeostasis of apoplastic reactive oxygen species in response to Verticillium dahliae infection in cotton. Plant Physiol. 170, 2392–2406 (2016).
Chen, J. et al. NLR surveillance of pathogen interference with hormone receptors induces immunity. Nature 613, 145–152 (2023).
Wang, N. et al. An F-box protein attenuates fungal xylanase-triggered immunity by destabilizing LRR-RLP NbEIX2 in a SOBIR1-dependent manner. New Phytol. 236, 2202–2215 (2022).
Bian, Y. et al. Cancer SLC43A2 alters T cell methionine metabolism and histone methylation. Nature 585, 277–282 (2020).
Zhai, K. et al. NLRs guard metabolism to coordinate pattern- and effector-triggered immunity. Nature 601, 245–251 (2022).
Porubsky, D. et al. Recurrent inversion polymorphisms in humans associate with genetic instability and genomic disorders. Cell 185, 1986–2005 (2022).
Garrison, E. et al. Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nat. Biotechnol. 36, 875–879 (2018).
Jamshed, M. et al. Identification of stable quantitative trait loci (QTLs) for fiber quality traits across multiple environments in Gossypium hirsutum recombinant inbred line population. BMC Genomics 17, 197 (2016).
Rico, M. & Egelhoff, T. T. Myosin heavy chain kinase B participates in the regulation of myosin assembly into the cytoskeleton. J. Cell Biochem. 88, 521–532 (2003).
Song, X. et al. Genome-wide association analysis reveals loci and candidate genes involved in fiber quality traits under multiple field environments in cotton (Gossypium hirsutum). Front. Plant Sci. 12, 695503 (2021).
Shao, Q. et al. Identifying QTL for fiber quality traits with three upland cotton (Gossypium hirsutum L.) populations. Euphytica 198, 43–58 (2014).
Zhang, Z. et al. Genome-wide quantitative trait loci reveal the genetic basis of cotton fibre quality and yield-related traits in a Gossypium hirsutum recombinant inbred line population. Plant Biotechnol. J. 18, 239–253 (2020).
Ling, J. Karyotype Analysis by Telomere-FISH and Primary Development of High-Resolution Cytological Map in Cotton. PhD thesis, Chinese Academy of Agricultural Sciences (2008).
Dvořáčková, M., Fojtová, M. & Fajkus, J. Chromatin dynamics of plant telomeres and ribosomal genes. Plant J. 83, 18–37 (2015).
Sykorova, E. et al. The absence of Arabidopsis-type telomeres in Cestrum and closely related genera Vestia and Sessea (Solanaceae): first evidence from eudicots. Plant J. 34, 283–291 (2003).
Sykorová, E. et al. Minisatellite telomeres occur in the family Alliaceae but are lost in Allium. Am. J. Bot. 93, 814–823 (2006).
He, S. et al. The genomic basis of geographic differentiation and fiber improvement in cultivated cotton. Nat. Genet. 53, 916–924 (2021).
Yang, Z. et al. Extensive intraspecific gene order and gene structural variations in upland cotton cultivars. Nat. Commun. 10, 2989 (2019).
Harringmeyer, O. & Hoekstra, H. Chromosomal inversion polymorphisms shape the genomic landscape of deer mice. Nat. Ecol. Evol. 6, 1965–1979 (2022).
Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).
Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011).
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 245 (2020).
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010).
Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
Chang, X. et al. High-quality Gossypium hirsutum and Gossypium barbadense genome assemblies reveal the landscape and evolution of centromeres. Plant Commun. 5, 100722 (2023).
Marçais, G. et al. MUMmer4: a fast and versatile genome alignment system. PLoS Comput. Biol. 14, e1005944 (2018).
Mount, D. W. Using the basic local alignment search tool (BLAST). CSH Protoc. 2007, pdb.top17 (2007).
Smit, A., Hubley, R. & Green, P. RepeatMasker Open-4.0. Institute for Systems Biology http://www.repeatmasker.org (2013–2015).
Bao, W., Kojima, K. K. & Kohany, O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob. DNA 6, 11 (2015).
Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, 265–268 (2007).
Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families in large genomes. Bioinformatics 21, i351–i358 (2005).
Edgar, R. C. & Myers, E. W. PILER: identification and classification of genomic repeats. Bioinformatics 21, i152–i158 (2005).
Smit, A. & Hubley, R. RepeatModeler Open-1.0. Institute for Systems Biology http://www.repeatmasker.org (2008–2015).
Birney, E., Clamp, M. & Durbin, R. GeneWise and Genomewise. Genome Res. 14, 988–995 (2004).
Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).
Haas, B. J. et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 31, 5654–5666 (2003).
Stanke, M. & Waack, S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics 19, ii215 (2003).
Majoros, W. H., Pertea, M. & Salzberg, S. L. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics 20, 2878–2879 (2004).
Korf, I. Gene finding in novel genomes. BMC Bioinformatics 5, 59 (2004).
Guigó, R. Assembling genes from predicted exons in linear time with dynamic programming. J. Comput. Biol. 5, 681–702 (1998).
Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997).
Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36 (2013).
Trapnell, C. et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 7, 562–578 (2012).
Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 9, R7 (2008).
The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 46, 2699 (2018).
Finn, R. D. et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 44, 279–285 (2016).
The Gene Ontology Consortium. Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Res. 45, 331–338 (2017).
Kanehisa, M. et al. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res. 42, 199–205 (2014).
Išerić, H., Alkan, C., Hach, F. & Numanagić, I. Fast characterization of segmental duplication structure in multiple genome assemblies. Algorithms Mol. Biol. 17, 4 (2022).
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).
Chakraborty, M., Emerson, J. J., Macdonald, S. J. & Long, A. D. Structural variants exhibit widespread allelic heterogeneity and shape variation in complex traits. Nat. Commun. 10, 4872 (2019).
Goel, M., Sun, H., Jiao, W. B. & Schneeberger, K. SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies. Genome Biol. 20, 277 (2019).
Poplin, R. et al. A universal SNP and small-indel variant caller using deep neural networks. Nat. Biotechnol. 36, 983–987 (2018).
Jeffares, D. C. et al. Transient structural variations have strong effects on quantitative traits and reproductive isolation in fission yeast. Nat. Commun. 8, 14061 (2017).
Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).
Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360 (2015).
Putri, G. H., Anders, S., Pyl, P. T., Pimanda, J. E. & Zanini, F. Analysing high-throughput sequencing data in Python with HTSeq 2.0. Bioinformatics 38, 2943–2945 (2022).
Zhou, X. & Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 44, 821–824 (2012).
Kang, H. M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010).
McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
Ge, X. et al. Efficient genotype-independent cotton genetic transformation and genome editing. J. Integr. Plant Biol. 65, 907–917 (2023).
Sun, Z. Scripts and code used in ‘A pangenome reference and population studies link structural variants with breeding traits in Gossypium hirsutum’. Zenodo https://doi.org/10.5281/zenodo.18357054 (2026).
Acknowledgements
This work was supported by the National Key Research and Development Program of China (2022YFF1001403) to Y.Z., Z.M. and Xingfen Wang; the Science Research Project of Hebei Education Department (PTZX2026014) to Xingfen Wang, Y.Z., Z.S. and Z.M.; the Natural Science Foundation (C2022204205) to Xingfen Wang, Y.Z. and Z.S.; the Key Research and Development Program (21326314D) to Z.M., Xingfen Wang and Y.Z.; the Top Talent Project (031601801) of Hebei Province to Z.M.; the National Key Project of Bio-breeding of China (2023ZD04039) to Z.M., Xingfen Wang, Y.Z. and Z.S.; the China Agricultural Research System (CARS-15-03) to L.W., Xingfen Wang, Y.Z. and Z.M. and the Project for National Top Talent (0602019) and Shennong Plan of China to Y.Z.
Author information
Authors and Affiliations
Contributions
Y.Z., Z.S., Xingfen Wang and Z.M. performed most of the experiments and analyzed the data. Y.Z., Z.S., Xingfen Wang, L.W., Q.G., H.K., G.Z., B.C., Z.W., J.Z., X.Z., Z.L., J.Y., J.W., G.W., D.Z., Xingyi Wang, C.M., Y.L., Z.Z., W.C., M.J., H.J., J. Li, H.Z., Y.W., M.G., M.X., L.W., Z.L., Y.Y., Y.C. and J. Liu performed field trials, trait determination and sample preparation. S.T., X.L., Y.J., K.Z., Z.S., Y.Z. and Xingfen Wang performed the genome assembly and genomic analyses. Y.Z., Xingfen Wang, Z.S., S.T., X.L. and Z.M. identified genomic variations and constructed tables and figures. Y.Z., Z.S., Q.G. and Xingfen Wang conducted the genetic analyses of breeding traits. Y.Z., Xingfen Wang and Z.M. wrote the paper. Z.M. and Xingfen Wang conceived and supervised the project.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Phylogenetic tree of 1,671 accessions and highly diverse agronomic phenotypes across 28 accessions.
a, Phylogenetic tree built from genome-wide SNP data. G. barbadense cv. Pima90 served as the outgroup; branch lengths are annotated and given in substitutions per site. b–e, Diverse agronomic phenotypes among the 28 accessions, including seed size and color (b), length of boll handle (c), leaf size and shape (d) and boll size and shape (e). f, Fiber length. g, Fiber strength. h, Lint percentage. All scale bars represent 1 cm.
Extended Data Fig. 2 Density of SD blocks identified in the NDM13 and NDM8 genomes.
The blue lines indicate the synteny between NDM13 and NDM8 in each chromosome.
Extended Data Fig. 3 The density of gene models, Copia and Gypsy of the 28 genomes with 1,000 windows.
The vertical dashed lines mark the leftmost and rightmost 10% of windows on each chromosome.
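As a rough illustration of how a windowed density profile of this kind might be computed, the sketch below splits a chromosome into a fixed number of equal-width windows, counts feature start positions in each, and returns the index ranges for the outer 10% of windows on either side. The function names, the choice to count start positions, and the toy coordinates are assumptions made for illustration; they are not taken from the authors' pipeline.

```python
def window_counts(starts, chrom_len, n_windows=1000):
    """Count 0-based feature start positions in n_windows equal-width windows."""
    counts = [0] * n_windows
    for pos in starts:
        # Map the position to a window index; clamp so the last base
        # always falls in the final window.
        idx = min(pos * n_windows // chrom_len, n_windows - 1)
        counts[idx] += 1
    return counts


def flank_indices(n_windows=1000, fraction=0.10):
    """Return (left, right) index ranges covering the outer `fraction` of windows."""
    k = round(n_windows * fraction)
    return range(0, k), range(n_windows - k, n_windows)


# Toy example (hypothetical coordinates on a 1 Mb chromosome).
counts = window_counts([0, 999, 1000, 500_000, 999_999], chrom_len=1_000_000)
left, right = flank_indices()
```

With 1,000 windows, the left flank covers window indices 0 to 99 and the right flank covers 900 to 999, which corresponds to where the dashed lines would fall.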
Extended Data Fig. 4 Expression comparison among core, dispensable and private genes based on the averaged FPKM of 15 tissues in each cotton.
In the box plots, the center line denotes the median; box limits are the upper and lower quartiles; whiskers mark the range of the data. Statistical significance was determined using a two-sided Wilcoxon test.
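The box-plot elements and the test named in this legend can be sketched as follows. This is a minimal standard-library illustration, assuming the comparison is an unpaired Wilcoxon rank-sum (Mann-Whitney U) test, since the gene categories are distinct gene sets; the legend does not state the variant. The function names and the normal approximation are illustrative choices, not the authors' implementation.

```python
import math
import statistics


def box_summary(values):
    """Median, lower/upper quartiles and data range, as drawn in the box plots."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"median": med, "q1": q1, "q3": q3,
            "min": min(values), "max": max(values)}


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test using a normal approximation.

    Ties receive average ranks; no tie or continuity correction is applied,
    so the p-value is approximate (assumption: adequate for a sketch only).
    """
    pooled = sorted((v, g) for g, grp in enumerate((x, y)) for v in grp)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return u, p
```

For real expression data, an established statistics library is preferable because it handles ties and small samples exactly; the sketch only shows what the legend's terms refer to.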
Extended Data Fig. 5 An example for SV hotspots located in chromosome Dt01.
a, SV hotspots in the 60–61 Mb region of chromosome Dt01 in each accession. b, Disease resistance-related genes located in the hotspot.
Supplementary information
Supplementary Information
Supplementary Notes 1 and 2, and Figs. 1–15.
Supplementary Tables
Supplementary Tables 1–52.
Source data
Source Data Fig. 6
Unprocessed gel.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, Y., Sun, Z., Tian, S. et al. A pangenome reference and population studies link structural variants with breeding traits in Gossypium hirsutum. Nat Genet 58, 928–939 (2026). https://doi.org/10.1038/s41588-026-02523-z
DOI: https://doi.org/10.1038/s41588-026-02523-z


