Extended Data Fig. 4: Enrichment of CDMs compared to other AT-rich sequences and in nucleosome-free regions. | Nature Medicine

From: Colibactin DNA-damage signature indicates mutational impact in colorectal cancer

a, The enrichment of hexanucleotide sequences in close proximity to DSB positions (±7 nt) upon different treatments. Bars represent the enrichment of DSB-associated hexanucleotide motifs (mean log2 ratios of DSBs at each motif comparing two conditions) in pks+ E. coli-infected cells vs. non-infected (NT) cells, n=4 independent experiments. Error bars denote the 95% confidence interval (CI) around the mean log2 ratios. b, The enrichment of pentanucleotide sequences in close proximity to DSB positions (±7 nt) upon different treatments. Bars represent the enrichment of DSB-associated pentanucleotide motifs (mean log2 ratios of DSBs at each motif comparing two conditions) in pks+ E. coli-infected vs. pks- E. coli-infected cells (left panel) or in pks- E. coli-infected vs. non-infected (NT) cells (right panel), n=4 independent experiments. Error bars denote the 95% CI around the mean log2 ratios. c, Scaled mean log2 enrichment values (scaled to 1 for the maximum enrichment per replicate) of all nonamers in the context of DSBs detected by sBLISS, n=4 independent experiments. The x-axis represents the number of A/T bases in a nonamer. Red box plots represent nonamers containing an AAWWTT motif, while turquoise box plots represent all other nonamers. Each data point corresponds to one nonamer. d, The distributions of hexanucleotide patterns containing only A/T around the centers of nucleosome dyads in the human genome, as determined in ref. 80. The values are smoothed proportions of hexanucleotides within ±100 bp of the nucleosome dyad centers. Dashed vertical lines at ±73 bp indicate the extent of the nucleosome-covered DNA. Grey curves represent the mean ± 2 SD of all A/T hexanucleotides except for the AAWWTT motifs, AAAAAA, ATATAT, TATATA and their reverse complements, n=33 hexanucleotide motifs. Solid curves represent the AAWWTT motifs. Dashed curves represent the outlier motifs AAAAAA, ATATAT and TATATA, whose profiles are clearly distinct from those of all other hexanucleotides.
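
The quantities named in this legend can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function names are hypothetical, and the scaling rule (dividing each replicate by its own maximum) is an assumption about what "scaled to 1 for maximum enrichment per replicate" means. The sketch only shows how a nonamer could be classified by A/T content and AAWWTT presence (W is the IUPAC code for A or T), and how a mean log2 ratio across replicates could be formed.

```python
import math
import re
from statistics import mean

# IUPAC code W stands for A or T, so AAWWTT expands to AA[AT][AT]TT.
AAWWTT = re.compile(r"AA[AT][AT]TT")


def at_count(kmer: str) -> int:
    """Number of A or T bases in a k-mer (the x-axis of panel c)."""
    return sum(base in "AT" for base in kmer.upper())


def contains_aawwtt(kmer: str) -> bool:
    """True if an AAWWTT hexamer occurs anywhere in the k-mer."""
    return AAWWTT.search(kmer.upper()) is not None


def mean_log2_ratio(treated, control):
    """Mean of per-replicate log2(treated / control) ratios.

    Inputs are hypothetical per-replicate DSB frequencies at one motif.
    """
    return mean(math.log2(t / c) for t, c in zip(treated, control))


def scale_to_max(values):
    """Scale one replicate's enrichment values so its maximum equals 1.

    Assumption: scaling is a plain division by the replicate maximum,
    which must be positive for the result to be meaningful.
    """
    top = max(values)
    return [v / top for v in values]
```

For example, `at_count("GAAATTTCG")` counts six A/T bases and `contains_aawwtt("GAAATTTCG")` is true because the nonamer contains AAATTT, so under these assumptions that nonamer would fall in a red box plot at x = 6 in panel c.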