Abstract
Upland cotton (Gossypium hirsutum) accounts for more than 90% of the world’s cotton production and, as an allotetraploid, serves as a model for studying polyploid crop domestication. Here we report a complete telomere-to-telomere (T2T) genome assembly of the Upland cotton accession Texas Marker-1 (T2T-TM-1), which has a total size of 2,299.6 Mb and contains 79,642 annotated genes. Using T2T-TM-1, we detected interspecific centromere divergence between the A- and D-subgenomes and their corresponding diploid progenitors. Centromere-associated repetitive sequences (CRCs) were enriched for Gypsy-like retroelements. Centromere size expansion, repositioning and structural variation occurred after polyploidization. Notably, CRC homologs were transferred from the diploid D-genome progenitor to the D-subgenome, invaded the A-subgenome and then proliferated after tetraploidization. These findings suggest an evolutionary advantage for the CRCs of the D-genome progenitor, indicate D-genome-adopted inheritance of centromere repeats after polyploidization and illustrate how the centromeric landscape is dynamically shaped during polyploidization in polyploid species.
Data availability
The T2T-TM-1 assembly is available at GenBank (accession no. JBJYSO000000000) and NCBI (accession no. PRJNA1197584). The assembly and annotation are also available at http://cotton.zju.edu.cn and on figshare (https://figshare.com/s/e7448929553e1073acaa). The genome-sequencing data used for the T2T-TM-1 assembly, including PacBio HiFi data and ultralong ONT data, were deposited in the NCBI database (accession nos. PRJNA1161022 and PRJNA1196658) and China National Genomics Data (https://ngdc.cncb.ac.cn) (accession no. PRJCA021201). The raw transcriptomics data used for annotation were deposited in the NCBI database (accession no. PRJNA1162026) and China National Genomics Data (accession no. PRJCA021339). The ChIP–seq data used for centromere identification were deposited in the NCBI database (accession no. PRJNA1162014) and China National Genomics Data (accession no. PRJCA021342).
Code availability
The code used in the present study is available via Zenodo at https://doi.org/10.5281/zenodo.13294045 (ref. 100).
References
Kursel, L. E. & Malik, H. S. Centromeres. Curr. Biol. 26, R487–R490 (2016).
Black, B. E. et al. Structural determinants for generating centromeric chromatin. Nature 430, 578–582 (2004).
Jiang, J., Birchler, J. A., Parrott, W. A. & Dawe, R. K. A molecular view of plant centromeres. Trends Plant Sci. 8, 570–575 (2003).
Comai, L., Maheshwari, S. & Marimuthu, M. P. A. Plant centromeres. Curr. Opin. Plant Biol. 36, 158–167 (2017).
Sullivan, L. L. & Sullivan, B. Genomic and functional variation of human centromeres. Exp. Cell Res. 389, 111896 (2020).
Hartley, G. A., Okhovat, M., O’Neill, R. J. & Carbone, L. Comparative analyses of gibbon centromeres reveal dynamic genus-specific shifts in repeat composition. Mol. Biol. Evol. 38, 3972–3992 (2021).
Capozzi, O. et al. A comprehensive molecular cytogenetic analysis of chromosome rearrangements in gibbons. Genome Res. 22, 2520–2528 (2012).
Bracewell, R., Chatla, K., Nalley, M. J. & Bachtrog, D. Dynamic turnover of centromeres drives karyotype evolution in Drosophila. eLife 8, e49002 (2019).
Perumal, S. et al. A high-contiguity Brassica nigra genome localizes active centromeres and defines the ancestral Brassica genome. Nat. Plant 6, 929–941 (2020).
Kohel, R., Richmond, T. R. & Lewis, C. F. Texas Marker-1. Description of a genetic standard for Gossypium hirsutum L. Crop Sci. 10, 670–671 (1970).
Zhang, T. Z. et al. Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement. Nat. Biotechnol. 33, 531–537 (2015).
Li, F. G. et al. Genome sequence of cultivated Upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution. Nat. Biotechnol. 33, 524–530 (2015).
Hu, Y. et al. Gossypium barbadense and Gossypium hirsutum genomes provide insights into the origin and evolution of allotetraploid cotton. Nat. Genet. 51, 739–748 (2019).
Huang, G. et al. Genome sequence of Gossypium herbaceum and genome updates of Gossypium arboreum and Gossypium hirsutum provide insights into cotton A-genome evolution. Nat. Genet. 52, 516–524 (2020).
Wang, M. et al. Reference genome sequences of two cultivated allotetraploid cottons, Gossypium hirsutum and Gossypium barbadense. Nat. Genet. 51, 224–229 (2019).
Yang, Z. et al. Extensive intraspecific gene order and gene structural variations in upland cotton cultivars. Nat. Commun. 10, 2989–3001 (2019).
Chen, Z. J. et al. Genomic diversifications of five Gossypium allopolyploid species and their impact on cotton improvement. Nat. Genet. 52, 525–533 (2020).
Simao, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
Shay, J. W. & Wright, W. E. Telomeres and telomerase: three decades of progress. Nat. Rev. Genet. 20, 299–309 (2019).
Han, J. et al. Rapid proliferation and nucleolar organizer targeting centromeric retrotransposons in cotton. Plant J. 88, 992–1005 (2016).
Shan, W., Jiang, Y., Han, J. & Wang, K. Comprehensive cytological characterization of the Gossypium hirsutum genome based on the development of a set of chromosome cytological markers. Crop J. 4, 256–265 (2016).
Zhang, Y. et al. Cysteine-rich receptor-like protein kinases: emerging regulators of plant stress responses. Trends Plant Sci. 28, 776–794 (2023).
Fan, M., Wang, M. & Bai, M.-Y. Diverse roles of SERK family genes in plant growth, development and defense response. Sci. China Life Sci. 59, 889–896 (2016).
Wang, R. et al. Genome-wide analysis of strictosidine synthase-like gene family revealed their response to biotic/abiotic stress in poplar. Int. J. Mol. Sci. 24, 10117 (2023).
Ma, Z. et al. Resequencing a core collection of upland cotton identifies genomic variation and loci influencing fiber quality and yield. Nat. Genet. 50, 803–813 (2018).
Shang, L. et al. A super pan-genomic landscape of rice. Cell Res. 32, 878–896 (2022).
Li, B. et al. Wheat centromeric retrotransposons: the new ones take a major role in centromeric structure. Plant J. 73, 952–965 (2013).
Hudakova, S. et al. Sequence organization of barley centromeres. Nucleic Acids Res. 29, 5029–5035 (2001).
Houben, A. et al. CENH3 interacts with the centromeric retrotransposon cereba and GC-rich satellites and locates to centromeric substructures in barley. Chromosoma 116, 275–283 (2007).
Su, H. et al. Centromere satellite repeats have undergone rapid changes in polyploid wheat subgenomes. Plant Cell 31, 2035–2051 (2019).
Naish, M. et al. The genetic and epigenetic landscape of the Arabidopsis centromeres. Science 374, eabi7489 (2021).
Song, J. M. et al. Two gap-free reference genomes and a global view of the centromere architecture in rice. Mol. Plant 14, 1757–1767 (2021).
Chen, J. et al. A complete telomere-to-telomere assembly of the maize genome. Nat. Genet. 55, 1221–1231 (2023).
Gong, Z. et al. Repeatless and repeat-based centromeres in potato: implications for centromere evolution. Plant Cell 24, 3559–3574 (2012).
Macas, J. et al. Next generation sequencing-based analysis of repetitive DNA in the model dioecious plant Silene latifolia. PLoS ONE 6, e27335 (2011).
Zhang, W. et al. Identification of centromeric regions on the linkage map of cotton using centromere-related repeats. Genomics 104, 587–593 (2014).
Udall, J. A. et al. De novo genome sequence assemblies of Gossypium raimondii and Gossypium turneri. G3 9, 3079–3085 (2019).
Gao, S. et al. HiCAT: a tool for automatic annotation of centromere structure. Genome Biol. 24, 58 (2023).
Altemose, N. et al. Complete genomic and epigenetic maps of human centromeres. Science 376, eabl4178 (2022).
Alexandrov, I., Kazakov, A., Tumeneva, I., Shepelev, V. & Yurov, Y. Alpha-satellite DNA of primates: old and new families. Chromosoma 110, 253–266 (2001).
Luo, S. et al. The cotton centromere contains a Ty3-gypsy-like LTR retroelement. PLoS ONE 7, e35261 (2012).
Fukagawa, T. & Earnshaw, W. C. The centromere: chromatin foundation for the kinetochore machinery. Dev. Cell 30, 496–508 (2014).
Shang, L. et al. A complete assembly of the rice Nipponbare reference genome. Mol. Plant 16, 1232–1236 (2023).
Deng, Y. et al. A telomere-to-telomere gap-free reference genome of watermelon and its mutation library provide important resources for gene discovery and breeding. Mol. Plant 15, 1268–1284 (2022).
Wang, Y. et al. Telomere-to-telomere and haplotype-resolved genome of the kiwifruit Actinidia eriantha. Mol. Hort. 3, 4 (2023).
Huang, H. et al. Telomere-to-telomere haplotype-resolved reference genome reveals subgenome divergence and disease resistance in triploid Cavendish banana. Hort. Res. 10, uhad153 (2023).
Fu, A. et al. Telomere-to-telomere genome assembly of bitter melon (Momordica charantia L. var. abbreviata Ser.) reveals fruit development, composition and ripening genetic characteristics. Hort. Res. 10, uhac228 (2022).
Zhou, Y. et al. The telomere-to-telomere genome of Fragaria vesca reveals the genomic evolution of Fragaria and the origin of cultivated octoploid strawberry. Hort. Res. 10, uhad027 (2023).
Wang, T. et al. A complete gap-free diploid genome in Saccharum complex and the genomic footprints of evolution in the highly polyploid Saccharum genus. Nat. Plant 9, 554–571 (2023).
Wlodzimierz, P. et al. Cycles of satellite and transposon evolution in Arabidopsis centromeres. Nature 618, 557–565 (2023).
Ahmed, H. et al. Einkorn genomics sheds light on history of the oldest domesticated wheat. Nature 620, 830–838 (2023).
Ma, H. et al. Centromere plasticity with evolutionary conservation and divergence uncovered by wheat 10+ genomes. Mol. Biol. Evol. 40, msad176 (2023).
Zhao, J. et al. Centromere repositioning and shifts in wheat evolution. Plant Commun. 4, 100556 (2023).
Ventura, M., Archidiacono, N. & Rocchi, M. Centromere emergence in evolution. Genome Res. 11, 595–599 (2001).
Ventura, M. et al. Recurrent sites for new centromere seeding. Genome Res. 14, 1696–1703 (2004).
Ventura, M. et al. Evolutionary formation of new centromeres in macaque. Science 316, 243–246 (2007).
Wang, K., Wu, Y., Zhang, W., Dawe, R. K. & Jiang, J. Maize centromeres expand and adopt a uniform size in the genetic background of oat. Genome Res. 24, 107–116 (2014).
Zhao, H. et al. Recurrent establishment of de novo centromeres in the pericentromeric region of maize chromosome 3. Chromosome Res. 25, 299–311 (2017).
Schneider, K. L., Xie, Z., Wolfgruber, T. K. & Presting, G. G. Inbreeding drives maize centromere evolution. Proc. Natl Acad. Sci. USA 113, E987–E996 (2016).
Xue, C. et al. De novo centromere formation in pericentromeric region of rice chromosome 8. Plant J. 111, 859–871 (2022).
Yang, X. et al. Amplification and adaptation of centromeric repeats in polyploid switchgrass species. New Phytol. 218, 1645–1657 (2018).
Paterson, A. H., Brubaker, C. L. & Wendel, J. F. A rapid method for extraction of cotton (Gossypium spp.) genomic DNA suitable for RFLP or PCR analysis. Plant Mol. Biol. Rep. 11, 122–127 (1993).
Cheng, H., Concepcion, G., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Zhang, X., Zhang, S., Zhao, Q., Ming, R. & Tang, H. Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data. Nat. Plant 5, 833–845 (2019).
Servant, N. et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 16, 259–269 (2015).
Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 3, 99–101 (2016).
Lin, Y. et al. quarTeT: a telomere-to-telomere toolkit for gap-free genome assembly and centromeric repeat identification. Hort. Res. 10, uhad127 (2023).
Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439 (2006).
Guigo, R. Assembling genes from predicted exons in linear time with dynamic programming. J. Comput. Biol. 5, 681–702 (1998).
Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997).
Majoros, W. H., Pertea, M. & Salzberg, S. L. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics 20, 2878–2879 (2004).
Korf, I. Gene finding in novel genomes. BMC Bioinf. 5, 59–67 (2004).
Birney, E., Clamp, M. & Durbin, R. GeneWise and genomewise. Genome Res. 14, 988–995 (2004).
Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240 (2014).
Bairoch, A. & Apweiler, R. The SWISS-PROT protein sequence data bank and its supplement TrEMBL. Nucleic Acids Res. 25, 31–36 (1997).
Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).
Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964 (1997).
Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Goel, M., Sun, H., Jiao, W. B. & Schneeberger, K. SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies. Genome Biol. 20, 277–289 (2019).
Goel, M. & Schneeberger, K. plotsr: visualizing structural similarities and rearrangements between multiple genomes. Bioinformatics 38, 2922–2926 (2022).
Fan, L. et al. A high-density genetic map of extra-long staple cotton (Gossypium barbadense) constructed using genotyping-by-sequencing based single nucleotide polymorphic markers and identification of fiber traits-related QTL in a recombinant inbred line population. BMC Genom. 19, 489–500 (2018).
Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Kang, H. M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010).
Fang, L. et al. Genomic analyses in cotton identify signatures of selection and loci associated with fiber quality and yield traits. Nat. Genet. 49, 1089–1098 (2017).
Fang, L. et al. Divergent improvement of two cultivated allotetraploid cotton species. Plant Biotechnol. J. 19, 1325–1336 (2021).
Yin, L. et al. rMVP: a memory-efficient, visualization-enhanced, and parallel-accelerated tool for genome-wide association study. Genom. Proteom. Bioinformat. 19, 619–628 (2021).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Wang, M. et al. Comparative genome analyses highlight transposon-mediated genome expansion and the evolutionary architecture of 3D genomic folding in cotton. Mol. Biol. Evol. 38, 3621–3636 (2021).
Stovner, E. B. & Sætrom, P. epic2 efficiently finds diffuse domains in ChIP-seq data. Bioinformatics 35, 4392–4393 (2019).
Ou, S. et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 20, 275 (2019).
Zhang, R.-G. et al. TEsorter: an accurate and fast method to classify LTR-retrotransposons in plant genomes. Hort. Res. 9, uhac017 (2022).
Vollger, M. R., Kerpedjiev, P., Phillippy, A. M. & Eichler, E. E. StainedGlass: interactive visualization of massive tandem repeat structures with identity heatmaps. Bioinformatics 38, 2049–2051 (2022).
Albert, P. S. et al. Whole-chromosome paints in maize reveal rearrangements, nuclear domains, and chromosomal relationships. Proc. Natl Acad. Sci. USA 116, 1679–1685 (2019).
Zhang, X. et al. Characterization of meiotic chromosome behavior in the autopolyploid Saccharum spontaneum reveals preferential chromosome pairing without distinct DNA sequence variation. Crop J. 11, 1550–1558 (2023).
Wang, K., Zhang, W., Jiang, Y. & Zhang, T. Systematic application of DNA fiber-FISH technique in cotton. PLoS ONE 8, e75674 (2013).
Yu, J. et al. CottonGen: the community database for cotton genomics, genetics, and breeding research. Plants 10, 2805–2820 (2021).
Dai, F. et al. COTTONOMICS: a comprehensive cotton multi-omics database. Database 2022, 1–8 (2022).
Hu, Y. Scripts used in ‘Post-polyploidization centromere evolution in cotton’. Zenodo https://doi.org/10.5281/zenodo.13294045 (2024).
Acknowledgements
The present study was financially supported by grants from the National Key R&D Program of China (grant no. 2022YFF1001400 to L.F.), the Fundamental Research Funds for the Central Universities (grant no. 226-2022-00100 to T.Z.), the National Natural Science Foundation of China (grant nos. 32130075 to T.Z. and 32070544 to K.W.), Xinjiang Production and Construction Corps (grant no. 2023AA008 to T.Z.) and postdoctoral innovative talents support program (grant no. 517000-X92308 to S.J.). We thank Y. X. Zhu (Wuhan University), Z. J. Chen (University of Texas at Austin), S. X. Yu (Institute of Cotton Research of the Chinese Academy of Agricultural Sciences), F. G. Li (Institute of Cotton Research of the Chinese Academy of Agricultural Sciences), Z. Y. Ma (Hebei Agricultural University), X. L. Zhang (Huazhong Agricultural University), J. A. Udall (Crop Germplasm Research Unit) and J. F. Wendel (Iowa State University), who kindly released cotton genomes for comparisons in this Article.
Author information
Contributions
T.Z. conceived the research project and designed the experiments. Y.H., S.Y., L.X., X.G. and L.F. assembled the T2T-TM-1 genome. K.W., J.H. and G.Y. conducted the molecular cytogenetic and centromeric analyses. J.H., S.J., Z.H., Z.S., X.G. and L.F. analyzed the bioinformatic data. T.Z., Y.H., S.J., J.H. and K.W. participated in writing and revising the paper. All authors discussed the results and commented on the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Marie Mirouze and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Figs. 1–29.
Supplementary Tables
Supplementary Tables 1–13.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yan, H., Han, J., Jin, S. et al. Post-polyploidization centromere evolution in cotton. Nat Genet 57, 1021–1030 (2025). https://doi.org/10.1038/s41588-025-02115-3
DOI: https://doi.org/10.1038/s41588-025-02115-3