Fig. 2: Mutational processes of small mutations at TOP2B binding sites in cancer genomes.

a Analysis overview. Mutation frequency in aligned binding sites was compared to flanking sequences using RM2 (ref. 34), which quantifies mutational enrichments in genomic elements using trinucleotide context and megabase-scale mutation burden. b Mutational enrichments in TOP2B binding sites in primary and metastatic cancers (RM2, FDR < 0.05). Enrichment scores represent directional significance such that binding sites with enriched mutations are on the right and depleted sites on the left (OR, odds ratio). c Grassy hills plots show local mutation burden in pooled binding sites (colors; 600 bp) and flanking control sequences (grey; ±600 bp). FDR-adjusted two-tailed P-values (q-values) from RM2 are shown. d Mutations in TOP2B binding sites associate with conserved TOP2B binding in mice (left) and in vitro double-strand break (DSB) activity (right). e Mutational signatures of single base substitutions (SBS) in TOP2B binding sites. Indels were also included as a separate signature. In panels d and e, TOP2B-CTCF-RAD21 and TOP2B-RAD21 sites were compared to controls (CTCF-RAD21 and RAD21-only sites, respectively) using two-tailed hypergeometric tests (FDR < 0.05). Enriched signatures are displayed on the right and depleted signatures on the left. f Mutational processes of small mutations in TOP2B binding sites grouped by transcription or chromatin loops. Four bins of sites were analyzed (none, low, middle, high). Heatmap shows site types and cancer types having at least one bin with significant mutational enrichments (FDR < 0.05 from RM2). Positive associations are shown above (i.e., higher mutagenesis associated with more transcription or chromatin interactions) and negative associations are shown below. Color strips indicate site types, cancer types, and activity. g, h Examples of mutational processes at TOP2B-RAD21 binding sites. Grassy hills plots show mutation frequencies in sites binned by transcription or chromatin interactions.
TOP2B-RAD21 sites (dark red) were compared to flanking sequences (grey) using RM2. A control analysis using RAD21-only sites is also shown (light blue). FDR-adjusted two-tailed P-values (q-values) from RM2 are shown. Loess smoothing in panels c, g and h is shown with 95% standard error bands. Source data are provided as Source Data files.
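For readers who want to reproduce the kind of comparison described for panels d and e, the following is a minimal sketch, not the authors' code. It assumes a two-tailed hypergeometric test defined by summing the probabilities of all outcomes no more likely than the observed one, Benjamini-Hochberg adjustment for the FDR, and a directional score defined as the signed -log10 q-value (positive when OR ≥ 1). The function names, this score definition, and the example counts are all hypothetical; the caption does not specify them.

```python
from math import comb, log10


def hypergeom_two_tailed(k, M, n, N):
    """Two-tailed hypergeometric P-value (assumed definition, see lead-in).

    k: sites in the test group that carry the feature
    M: all sites (test group + control group)
    n: all sites that carry the feature
    N: size of the test group
    """
    denom = comb(M, N)

    def pmf(x):
        return comb(n, x) * comb(M - n, N - x) / denom

    lo, hi = max(0, N - (M - n)), min(n, N)
    p_obs = pmf(k)
    # Sum the probability of every outcome that is no more likely than
    # the observed one (small tolerance guards against float ties).
    p = sum(pmf(x) for x in range(lo, hi + 1) if pmf(x) <= p_obs * (1 + 1e-7))
    return min(p, 1.0)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg q-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q


def directional_score(q, odds_ratio):
    """Hypothetical signed score: enriched (OR >= 1) right, depleted left."""
    sign = 1 if odds_ratio >= 1 else -1
    return sign * -log10(q)


# Hypothetical counts: 5 of 5 test-group sites carry the feature,
# out of 10 sites in total, 5 of which carry it.
p_example = hypergeom_two_tailed(k=5, M=10, n=5, N=5)
q_example = benjamini_hochberg([p_example, 0.5, 0.9])
```

In this sketch, the test group would correspond to, for example, TOP2B-RAD21 sites and the control group to RAD21-only sites; only q-values below 0.05 would be plotted.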