Extended Data Fig. 5: Epigenetic regulation of repeat-based holocentromeres and fine-scale correlation of CO positions with Tyba repeats.

(a) Chromosome distribution of the CO rate coupled with different (epi)genetic features. Top: recombination landscape (black line) created with COs detected in all single-pollen nuclei (n = 1,641), coupled with Omni-C chromosome conformation capture contacts. Synteny analysis and the structural variants detected between the two haplotypes are also shown. For the y-axes, all features were scaled to [0, 1], with 1 indicating a maximum of 2.34 for recombination frequency (cM/Mb), 5 for Tyba density, 6 for CENH3 density, 7,205 for SNP density, 88 for gene density, and 227 for TE density. GC [33.3, 46.6], H3K4me3 [–1.494, 0.231], H3K9me2 [–1.20, 1.84], and H3K27me3 [–0.671, 0.491] are scaled to [0, 1] by their minima and maxima. mCG, mCHG, and mCHH are shown as original values (0 to 100%). All features were smoothed using a 1-Mbp sliding window with a 250-kbp step size. (b) Distribution of the size (left) and spacing (right) of CENH3 domains and Tyba arrays. The median CENH3 domain size is 19,156 bp and the mean is 20,697 bp. The median Tyba array size is 17,424 bp and the mean is 18,220 bp. The median CENH3 domain spacing is 378,467 bp and the mean is 401,763 bp. The median Tyba array spacing is 354,850 bp and the mean is 374,310 bp. (c) Number of Tyba arrays (left) and CENH3 domains (right) for each chromosome annotated in the reference haplotype genome. (d) Enrichment of CENH3, H3K4me3, H3K9me2, and DNA methylation in the CpG, CHG, and CHH contexts from the start and end of different types of sequences: CENH3 domains, Tyba repeats, genes, LTRs, and TEs. ChIP-seq signals are shown as log2(normalised RPKM ChIP/input). Grey boxes highlight the modification enrichment over the body of each sequence type. (e) CO frequency within Tyba arrays. (f) Random distribution of the relative distance of CO positions to the end of the left and to the start of the right Tyba array. The median CO resolution is 334 bp and the mean is about 2 kb.
The solid blue line shows the local polynomial regression fit (loess function in R) using data from 63 F1 recombinant offspring and a total of 378 COs, whereas the dashed blue band represents the range of one standard error above and below the fitted line. Green-filled triangles schematically represent Tyba repeat arrays.
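
The smoothing and scaling described for panel (a) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`sliding_window_counts`, `scale_by_max`, `min_max_scale`) are hypothetical, and the sketch assumes that density features are counted per window and divided by their maximum, while features such as GC content are rescaled by their minimum and maximum. The window and step defaults are taken from the legend (1 Mbp and 250 kbp).

```python
from typing import List, Sequence


def sliding_window_counts(positions: Sequence[int], chrom_length: int,
                          window: int = 1_000_000,
                          step: int = 250_000) -> List[int]:
    """Count features whose position falls in each sliding window.

    Windows start at 0 and advance by `step`; the last windows are
    truncated at the chromosome end (an assumption of this sketch).
    """
    counts = []
    start = 0
    while start < chrom_length:
        end = min(start + window, chrom_length)
        counts.append(sum(1 for p in positions if start <= p < end))
        start += step
    return counts


def scale_by_max(values: Sequence[float]) -> List[float]:
    """Scale so that 1 corresponds to the maximum and 0 stays 0."""
    hi = max(values)
    if hi == 0:
        return [0.0 for _ in values]
    return [v / hi for v in values]


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scale to [0, 1] using the minimum and maximum of the values."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    # Toy data only: three feature positions on a 2-Mbp chromosome.
    toy_positions = [100_000, 300_000, 1_200_000]
    density = sliding_window_counts(toy_positions, chrom_length=2_000_000)
    print(scale_by_max(density))
    # GC range from the legend, plus the midpoint, as toy input.
    print(min_max_scale([33.3, 39.95, 46.6]))
```

Dividing by the maximum keeps zero-density windows at 0, whereas min-max scaling maps the lowest observed value to 0, which suits signals such as log2 ChIP/input ratios that can be negative.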