Genetic diversity and evolution of rice centromeres

Abstract

Understanding the driving force of centromere dynamics is crucial for deciphering the complexity of eukaryotic evolution and speciation. Here we assembled 67 rice genomes from the Oryza AA group and analyzed >800 nearly complete centromeres. Through de novo annotation of centromeric satellite CEN155 sequences and employing a progressive compression strategy, we quantified the local homogenization and multilayer structures of rice satellite arrays. Our results indicate that genetic innovations in rice centromeres primarily arise from structural variations and centrophilic retrotransposon insertions. The single-base substitution rate in rice centromeres appears to be lower than that in chromosome arms. Comparisons of CEN155 arrays, retrotransposons and functional centromeres highlight their dynamic but correlated interplay. Contrary to the KARMA model for Arabidopsis centromere evolution, we hypothesize that retrotransposon invasion probably contributes to the decline of progenitor centromeric satellite arrays and promotes centromere repositioning, as evidenced by extended CENH3 chromatin immunoprecipitation sequencing enrichment beyond the native satellite arrays.

Fig. 1: Centromere diversity of Oryza AA genomes.
Fig. 2: Centromere haplotype, introgression and splitting.
Fig. 3: Genetic variation in CEN155 satellite sequences and centrophilic TEs in rice genomes.
Fig. 4: Multilayer nested structure of rice CEN155 satellite arrays.
Fig. 5: Sequence divergence and mutation rate in rice centromeres.
Fig. 6: Epigenetic profiling and centromere positioning.

Data availability

Raw PacBio HiFi reads for 46 rice accessions, raw ONT sequencing reads for 10 accessions and CENH3 ChIP–seq NGS reads for 10 accessions generated in this study have been deposited in the National Genomics Data Center under BioProject, accession no. PRJCA025388 (https://ngdc.cncb.ac.cn/bioproject/browse/PRJCA025388), with the Genome Sequence Archive nos. CRA016014 (https://ngdc.cncb.ac.cn/gsa/browse/CRA016014), CRA017638 (https://ngdc.cncb.ac.cn/gsa/browse/CRA017638) and CRA017653 (https://ngdc.cncb.ac.cn/gsa/browse/CRA017653), respectively. The newly generated genome assemblies in this study are available in the NCBI (BioProject, accession no. PRJNA1276249) and via Zenodo at https://doi.org/10.5281/zenodo.12770803 (ref. 81). The genome assemblies of NIP, MH63 and ZS97 are available at the RiceSuperPIRdb (http://ricesuperpir.com/) and the Rice Information GateWay (http://rice.hzau.edu.cn/). TE and gene annotation files of 70 rice genomes are available via Zenodo at https://doi.org/10.5281/zenodo.12698984 (ref. 82). Supporting materials for centromere assembly quality control for each sample are available via Zenodo at https://doi.org/10.5281/zenodo.14286880 (ref. 83), including read-mapping coverage plots, NucFreq plots, GCI coverage plots and VerityMap plots. Rice centromere annotation and comparison plots for all accessions and chromosomes are available via Zenodo at https://doi.org/10.5281/zenodo.12702715 (ref. 84), including similarity heatmap plots generated by StainedGlass for each centromere, whole-genome synteny to the NIP reference assembly, centromere synteny against NIP and NJ11 assemblies and centromere composition for all chromosomes. Source data are provided with this paper.

Code availability

The SynPan-CEN code is available via GitHub at https://github.com/Darlene1997/SynPan-CEN (ref. 85). Scripts for the progressive compression strategy used to decipher satellite organization, together with additional in-house code associated with this study (including assembly, annotation and visualization), are available via GitHub at https://github.com/dongyawu/CenTools (ref. 86). The code and scripts are also available via Zenodo at https://doi.org/10.5281/zenodo.16990314 (ref. 87). The visualization of centromere annotation and synteny tracks was performed using ggplot2 in R (v.4.3.1, https://www.r-project.org/).

References

  1. Barra, V. & Fachinetti, D. The dark side of centromeres: types, causes and consequences of structural abnormalities implicating centromeric DNA. Nat. Commun. 9, 4340 (2018).

  2. Naish, M. & Henderson, I. R. The structure, function, and evolution of plant centromeres. Genome Res. 34, 161–178 (2024).

  3. Melters, D. P. et al. Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution. Genome Biol. 14, R10 (2013).

  4. Ahmed, H. I. et al. Einkorn genomics sheds light on history of the oldest domesticated wheat. Nature 620, 830–838 (2023).

  5. Naish, M. et al. The genetic and epigenetic landscape of the Arabidopsis centromeres. Science 374, eabi7489 (2021).

  6. Altemose, N. et al. Complete genomic and epigenetic maps of human centromeres. Science 376, eabl4178 (2022).

  7. Huang, Z. et al. Evolutionary analysis of a complete chicken genome. Proc. Natl Acad. Sci. USA 120, e2216641120 (2023).

  8. Wang, T. et al. A complete gap-free diploid genome in Saccharum complex and the genomic footprints of evolution in the highly polyploid Saccharum genus. Nat. Plants 9, 554–571 (2023).

  9. Logsdon, G. A. et al. The variation and evolution of complete human centromeres. Nature 629, 136–145 (2024).

  10. Cheng, Z. et al. Functional rice centromeres are marked by a satellite repeat and a centromere-specific retrotransposon. Plant Cell 14, 1691–1704 (2002).

  11. Lv, Y. et al. A centromere map based on super pan-genome highlights the structure and function of rice centromeres. J. Integr. Plant Biol. 66, 196–207 (2024).

  12. Malik, H. S. & Henikoff, S. Major evolutionary transitions in centromere complexity. Cell 138, 1067–1082 (2009).

  13. Kursel, L. E. & Malik, H. S. Centromeres. Curr. Biol. 26, R487–R490 (2016).

  14. Gent, J. I., Wang, N. & Dawe, R. K. Stable centromere positioning in diverse sequence contexts of complex and satellite centromeres of maize and wild relatives. Genome Biol. 18, 121 (2017).

  15. Liu, Y. et al. Pan-centromere reveals widespread centromere repositioning of soybean genomes. Proc. Natl Acad. Sci. USA 120, e2310177120 (2023).

  16. Zhang, T. et al. The CentO satellite confers translational and rotational phasing on CENH3 nucleosomes in rice centromeres. Proc. Natl Acad. Sci. USA 110, E4875–E4883 (2013).

  17. Chen, J. et al. A complete telomere-to-telomere assembly of the maize genome. Nat. Genet. 55, 1221–1231 (2023).

  18. Stein, J. C. et al. Genomes of 13 domesticated and wild rice relatives highlight genetic conservation, turnover and innovation across the genus Oryza. Nat. Genet. 50, 1618 (2018).

  19. Song, J. et al. Two gap-free reference genomes and a global view of the centromere architecture in rice. Mol. Plant 14, 1757–1767 (2021).

  20. Shang, L. et al. A complete assembly of the rice Nipponbare reference genome. Mol. Plant 16, 1232–1236 (2023).

  21. Cheng, H. et al. Haplotype-resolved assembly of diploid genomes without parental data. Nat. Biotechnol. 40, 1332–1335 (2022).

  22. Rautiainen, M. et al. Telomere-to-telomere assembly of diploid chromosomes with Verkko. Nat. Biotechnol. 41, 1474–1482 (2023).

  23. Nurk, S. et al. The complete sequence of a human genome. Science 376, 44–53 (2022).

  24. Mikheenko, A., Bzikadze, A. V., Gurevich, A., Miga, K. H. & Pevzner, P. A. TandemTools: mapping long reads and assessing/improving assembly quality in extra-long tandem repeats. Bioinformatics 36, i75–i83 (2020).

  25. Vollger, M. R. et al. Long-read sequence and assembly of segmental duplications. Nat. Methods 16, 88–94 (2019).

  26. Cheng, Z., Buell, C. R., Wing, R. A., Gu, M. & Jiang, J. Toward a cytological characterization of the rice genome. Genome Res. 11, 2133–2141 (2001).

  27. Lian, Q. et al. A pan-genome of 69 Arabidopsis thaliana accessions reveals a conserved genome structure throughout the global species range. Nat. Genet. 56, 982–991 (2024).

  28. Wu, D. et al. A syntelog-based pan-genome provides insights into rice domestication and de-domestication. Genome Biol. 24, 179 (2023).

  29. Gong, H. & Han, B. Genetic introgression between different groups reveals the differential process of Asian cultivated rice. Sci. Rep. 12, 17662 (2022).

  30. Rosandić, M. et al. CENP-B box and pJα sequence distribution in human alpha satellite higher-order repeats (HOR). Chromosome Res. 14, 735–753 (2006).

  31. Rice, W. R. A game of thrones at human centromeres I. Multifarious structure necessitates a new molecular/evolutionary model. Preprint at bioRxiv https://doi.org/10.1101/731430 (2020).

  32. Masumoto, H., Masukata, H., Muro, Y., Nozaki, N. & Okazaki, T. A human centromere antigen (CENP-B) interacts with a short specific sequence in alphoid DNA, a human centromeric satellite. J. Cell Biol. 109, 1963–1973 (1989).

  33. Kipling, D., Wilson, H. E., Mitchell, A. R., Taylor, B. A. & Cooke, H. J. Mouse centromere mapping using oligonucleotide probes that detect variants of the minor satellite. Chromosoma 103, 46–55 (1994).

  34. Kugou, K., Hirai, H., Masumoto, H. & Koga, A. Formation of functional CENP-B boxes at diverse locations in repeat units of centromeric DNA in New World monkeys. Sci. Rep. 6, 27833 (2016).

  35. Cappelletti, E. et al. The localization of centromere protein A is conserved among tissues. Commun. Biol. 6, 963 (2023).

  36. Gaff, C. et al. A novel nuclear protein binds centromeric alpha satellite DNA. Hum. Mol. Genet. 3, 711–716 (1994).

  37. Wlodzimierz, P. et al. Cycles of satellite and transposon evolution in Arabidopsis centromeres. Nature 618, 557–565 (2023).

  38. Kubo, T. & Yoshimura, A. Genetic basis of hybrid breakdown in a Japonica/Indica cross of rice, Oryza sativa L. Theor. Appl. Genet. 105, 906–911 (2002).

  39. Bensasson, D. Evidence for a high mutation rate at rapidly evolving yeast centromeres. BMC Evol. Biol. 11, 211 (2011).

  40. Minton, K. Tandem repeat variation of human centromeres. Nat. Rev. Genet. 25, 455 (2024).

  41. Schneider, K. L., Xie, Z., Wolfgruber, T. K. & Presting, G. G. Inbreeding drives maize centromere evolution. Proc. Natl Acad. Sci. USA 113, E987–E996 (2016).

  42. Irvine, D. V. et al. Chromosome size and origin as determinants of the level of CENP-A incorporation into human centromeres. Chromosome Res. 12, 805–815 (2004).

  43. Plačková, K., Bureš, P. & Zedek, F. Centromere size scales with genome size across eukaryotes. Sci. Rep. 11, 19811 (2021).

  44. Wang, N., Liu, J., Ricci, W. A., Gent, J. I. & Dawe, R. K. Maize centromeric chromatin scales with changes in genome size. Genetics 217, iyab020 (2021).

  45. Bilinski, P. et al. Diversity and evolution of centromere repeats in the maize genome. Chromosoma 124, 57–65 (2015).

  46. Rice, W. R. A game of thrones at human centromeres II. A new molecular/evolutionary model. Preprint at bioRxiv https://doi.org/10.1101/731471 (2019).

  47. Talbert, P. & Henikoff, S. Centromeres organize (epi)genome architecture. Cell 185, 3083–3085 (2022).

  48. Wu, Z. et al. De novo genome assembly of Oryza granulata reveals rapid genome expansion and adaptive evolution. Commun. Biol. 1, 84 (2018).

  49. Zhang, Y. et al. The telomere-to-telomere gap-free genome of four rice parents reveals SV and PAV patterns in hybrid rice breeding. Plant Biotechnol. J. 20, 1642–1644 (2022).

  50. Sedeek, K. et al. Multi-omics resources for targeted agronomic improvement of pigmented rice. Nat. Food 4, 366–371 (2023).

  51. Shang, L. et al. A super pan-genomic landscape of rice. Cell Res. 32, 878–896 (2022).

  52. Nagaki, K., Talbert, P. B. & Zhong, C. X. Chromatin immunoprecipitation reveals that the 180-bp satellite repeat is the key functional DNA element of Arabidopsis thaliana centromeres. Genetics 163, 1221–1225 (2003).

  53. Sim, S. B., Corpuz, R. L., Simmonds, T. J. & Geib, S. M. HiFiAdapterFilt, a memory efficient read processing pipeline, prevents occurrence of adapter sequence in PacBio HiFi reads and their negative impacts on genome assembly. BMC Genom. 23, 157 (2022).

  54. Cheng, H., Asri, M., Lucas, J., Koren, S. & Li, H. Scalable telomere-to-telomere assembly for diploid and polyploid genomes with double graph. Nat. Methods 21, 967–970 (2024).

  55. Alonge, M. et al. Automated assembly scaffolding using RagTag elevates a new tomato system for high-throughput genome editing. Genome Biol. 23, 258 (2022).

  56. Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 245 (2020).

  57. Jain, C., Rhie, A., Hansen, N. F., Koren, S. & Phillippy, A. M. Long-read mapping to repetitive reference sequences using Winnowmap2. Nat. Methods 19, 705–710 (2022).

  58. Hu, J. et al. NextPolish2: a repeat-aware polishing tool for genomes assembled using HiFi long reads. Genom. Proteom. Bioinform. 22, qzad009 (2024).

  59. Bzikadze, A. V., Mikheenko, A. & Pevzner, P. A. Fast and accurate mapping of long reads to complete genome assemblies with VerityMap. Genome Res. 32, 2107–2118 (2022).

  60. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  61. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).

  62. Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol. Biol. Evol. 26, 1641–1650 (2009).

  63. Vollger, M. R., Kerpedjiev, P., Phillippy, A. M. & Eichler, E. E. StainedGlass: interactive visualization of massive tandem repeat structures with identity heatmaps. Bioinformatics 38, 2049–2051 (2022).

  64. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

  65. Lin, J. et al. SVision: a deep learning approach to resolve complex structural variants. Nat. Methods 19, 1230–1233 (2022).

  66. Edgar, R. C. Muscle5: high-accuracy alignment ensembles enable unbiased assessments of sequence homology and phylogeny. Nat. Commun. 13, 6968 (2022).

  67. Bailey, T. L., Johnson, J., Grant, C. E. & Noble, W. S. The MEME suite. Nucleic Acids Res. 43, W39–W49 (2015).

  68. Haas, B. J., Delcher, A. L., Wortman, J. R. & Salzberg, S. L. DAGchainer: a tool for mining segmental genome duplications and synteny. Bioinformatics 20, 3643–3646 (2004).

  69. Rice, P., Longden, L. & Bleasby, A. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 16, 276–277 (2000).

  70. Mistry, J. et al. Pfam: the protein families database in 2021. Nucleic Acids Res. 49, D412–D419 (2021).

  71. Katoh, K. & Standley, D. M. MAFFT Multiple Sequence Alignment Software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).

  72. Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).

  73. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296 (2021).

  74. Mayor, C. et al. VISTA: visualizing global DNA sequence alignments of arbitrary length. Bioinformatics 16, 1046–1047 (2000).

  75. Paradis, E. & Schliep, K. ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics 35, 526–528 (2019).

  76. Thompson, J. D., Gibson, T. J. & Higgins, D. G. Multiple sequence alignment using ClustalW and ClustalX. Curr. Protoc. Bioinform. Chapter 2, Unit 2.3 (2002).

  77. Chen, S. Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. iMeta 2, e107 (2023).

  78. Langmead, B., Wilks, C., Antonescu, V. & Charles, R. Scaling read aligners to hundreds of threads on general-purpose processors. Bioinformatics 35, 421–432 (2019).

  79. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  80. Ramírez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).

  81. Xie, L. The newly-generated genome assemblies. Zenodo https://doi.org/10.5281/zenodo.12770803 (2025).

  82. Xie, L. TE and gene annotation files. Zenodo https://doi.org/10.5281/zenodo.12698984 (2025).

  83. Xie, L. Centromere assembly quality plots. Zenodo https://doi.org/10.5281/zenodo.14286880 (2025).

  84. Xie, L. Rice centromere annotation and comparison plots. Zenodo https://doi.org/10.5281/zenodo.12702715 (2025).

  85. Xie, L. The SynPan-CEN code. GitHub https://github.com/Darlene1997/SynPan-CEN (2025).

  86. Wu, D. The CenTools code. GitHub https://github.com/dongyawu/CenTools (2025).

  87. Xie, L. The SynPan-CEN code. Zenodo https://doi.org/10.5281/zenodo.16990314 (2025).

Acknowledgements

This work was supported by Biological Breeding-Major Projects (grant no. 2023ZD04076), China National Postdoctoral Program for Innovative Talents (grant no. BX20220269), China Postdoctoral Science Foundation (grant no. 2023M743045), Young Scientists Fund of the National Natural Science Foundation of China (grant no. 32300490), National Key Research and Development Program of China (grant no. 2019YFA0903904) and CIC MIC. We thank G. Zhang (Zhejiang University), Y. Mao (Shanghai Jiao Tong University), B. Wu (Sun Yat-sen University) and K. Wu (Zhejiang University) for constructive suggestions.

Author information

Contributions

D.W. conceived and initiated this study. L.S. collected the samples. D.W., Q.C. and M.S. performed the sequencing data quality control, centromere assembly and quality evaluation. L.X., D.W. and Y.S. performed the analysis of satellite sequence identification and clustering and satellite array organization. Y.H. and D.W. performed the annotation and centromeric insertion analysis of TEs. W.H., L.X. and S.B. conducted the CENH3 ChIP experiments. L.X., D.W. and S.Z. processed the ChIP–seq data and analyzed the epigenetic profiling of rice centromeres. L.F. and D.W. supervised all analyses. Q.Q., W.J., C.Y., L.S. and X.Z. provided suggestions on analysis, organization and writing. D.W., L.X. and Y.H. wrote the manuscript with input from all the coauthors. All authors discussed the results and commented on the manuscript.

Corresponding authors

Correspondence to Longjiang Fan or Dongya Wu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Variation in rice centromere positioning.

a, Chromosome length and the ratio of long arm length to short arm length for each chromosome. b, Relationships between chromosome size and CEN155 satellite array size across different taxonomic groups. Linear regression analyses were performed to evaluate the relationships between variables, with P values based on two-sided t-tests. Shaded areas represent 95% confidence intervals of the fitted regression lines. c, Two megabase-scale inversions observed in the centromere region of chromosome Chr06.

Extended Data Fig. 2 StainedGlass sequence identity heat maps of centromeres from chromosomes Chr02, Chr05 and Chr04.

Representative centromere haplotypes for each chromosome are shown.

Extended Data Fig. 3 Centromere divergence and fission on rice chromosome Chr12.

a, Centromere similarity and structural variations compared to NIP on chromosome Chr12, showing divergent centromere haplotypes (CenHaps) and putative centromere introgression events (for example, CW15 and MH63). Left, a maximum-likelihood phylogenetic tree across rice accessions on chromosome Chr12. b, StainedGlass sequence similarity heat maps within and between CEN155 arrays on chromosome Chr12 from SL044, CW09 and NJ11, and their synteny, indicating a fission from the SL044-type centromere to the CW09 (X3) and NJ11 (X4) types. TEs (blue) and gene-like elements (red) are shown. Their commonly shared retrotransposon RETROSAT-2C around the junction is highlighted. c, Comparison of phylogenetic trees built using SNPs in the 500-kbp regions upstream and downstream of the Chr12 CEN155 array. Taxonomic information is represented by colored circles, with the position of SL044 highlighted by a dashed line. d, Schematic representation of centromeric structural alterations, including introgression, duplication and fission or splitting.

Extended Data Fig. 4 Divergence sites between CEN155 superfamilies.

The CENP-B box-like and pJα-like motif regions are shown.

Extended Data Fig. 5 Inference of multimers and muHRs in rice centromeres.

The de Bruijn graphs are constructed based on the dimer-compressed satellite string of each centromere.
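
As a rough illustration of the graph construction mentioned in this caption, the Python sketch below builds a de Bruijn graph from a string of satellite labels. The label alphabet, the choice of k and the function name `build_de_bruijn` are assumptions made for illustration only; they are not taken from the authors' CenTools scripts.

```python
from collections import defaultdict


def build_de_bruijn(labels, k=3):
    """Build a weighted de Bruijn graph from a sequence of satellite labels.

    Nodes are (k-1)-mers of labels. Each k-mer in the sequence adds one edge
    from its prefix (k-1)-mer to its suffix (k-1)-mer, and the edge weight
    counts how many times that k-mer occurs.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    graph = defaultdict(lambda: defaultdict(int))
    for i in range(len(labels) - k + 1):
        kmer = tuple(labels[i:i + k])
        graph[kmer[:-1]][kmer[1:]] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {node: dict(edges) for node, edges in graph.items()}


# Hypothetical compressed satellite string: each letter stands for one
# satellite (or dimer) class after compression.
array = list("ABCABCABD")
graph = build_de_bruijn(array, k=3)
for node, edges in graph.items():
    print(node, "->", edges)
```

In this toy input, the heavily weighted cycle AB → BC → CA → AB is the kind of pattern that would correspond to a repeated multimer, while the single AB → BD edge marks where the repeat breaks.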

Extended Data Fig. 6 Structural variations in centromere regions.

a, Schematic diagram of structural variations in satellite arrays (SaSVs). A query satellite array is aligned against a reference array using satellites and TEs as markers. Based on syntenic pairing of CEN155 satellites and TEs, SVs with more than 50 copies of CEN155 satellites are defined as large expansions (LEs) or contractions (LCs). Regions with continuously poor synteny are referred to as divergent blocks, compared to the reference array. b, SaSV number and involved CEN155 satellite size (top), and SaSV size distribution (bottom) in GJ and XI satellite arrays compared to their corresponding reference assemblies NIP and NJ11, respectively. c, SaSVs in individual genomes associated with the phylogenetic kinship.
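
The 50-copy threshold in panel a can be sketched as a small classifier. This is a minimal sketch under stated assumptions: the function name `classify_sasv`, the "small" label and the convention that an expansion means more copies in the query than in the reference are all hypothetical; only the threshold of more than 50 CEN155 copies comes from the caption.

```python
def classify_sasv(ref_copies, query_copies, threshold=50):
    """Classify a satellite-array SV between two syntenic anchors.

    ref_copies and query_copies are the numbers of CEN155 satellites found
    between the same pair of anchors in the reference and query arrays.
    Assumption: expansion means the query holds more copies than the reference.
    """
    diff = query_copies - ref_copies
    if diff > threshold:
        return "LE"  # large expansion: more than `threshold` extra copies
    if diff < -threshold:
        return "LC"  # large contraction: more than `threshold` copies lost
    return "small"   # hypothetical label for SVs at or below the threshold


print(classify_sasv(100, 180))  # query gained 80 copies
print(classify_sasv(200, 120))  # query lost 80 copies
print(classify_sasv(100, 150))  # difference of exactly 50 copies
```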

Extended Data Fig. 7 CENH3 ChIP-seq enrichment and element annotation across Chr05 centromeres.

Top, CENH3 ChIP-seq enrichment (log2(ChIP/input), two replicates). Bottom, CEN155 superfamily, TE and sati annotation along each centromere.
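
The enrichment measure log2(ChIP/input) in this caption can be illustrated with a short Python sketch. The per-bin read counts, the pseudocount of 1 and the function name `log2_enrichment` are assumptions for illustration; library-size normalization is omitted here, and the authors' actual processing may differ.

```python
import math


def log2_enrichment(chip_counts, input_counts, pseudocount=1.0):
    """Return per-bin log2(ChIP/input) ratios.

    A pseudocount (an assumption of this sketch) is added to both counts so
    that bins with zero reads do not cause division by zero or log of zero.
    """
    if len(chip_counts) != len(input_counts):
        raise ValueError("ChIP and input tracks must have the same number of bins")
    return [
        math.log2((chip + pseudocount) / (inp + pseudocount))
        for chip, inp in zip(chip_counts, input_counts)
    ]


# Hypothetical read counts in three genomic bins.
ratios = log2_enrichment([7, 1, 0], [1, 1, 3])
print(ratios)
```

Positive values mark bins where ChIP signal exceeds input, which is how CENH3-enriched regions would stand out along a centromere.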

Extended Data Fig. 8 A retrotransposon-induced centromere evolution model, summarized from the rice centromere analysis.

This model highlights the multilayer structures of rice satellite arrays formed by local homogenization and emphasizes the triggering role of LTR invasion in initiating satellite array degeneration and centromere repositioning, as determined by CENH3 occupancy.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xie, L., Huang, Y., Huang, W. et al. Genetic diversity and evolution of rice centromeres. Nat Genet 57, 2808–2818 (2025). https://doi.org/10.1038/s41588-025-02365-1

  • DOI: https://doi.org/10.1038/s41588-025-02365-1
