Extended Data Figure 6: Translocation hotspots are enriched for xTSS-RNA expression.
From: Noncoding RNA transcription targets AID to divergently transcribed loci in B cells

a, Strand-specific RNA-seq mapped reads at AID target genes Myc, Cd83, Pim1, Pax5 and Cd79b. Green and red peaks indicate sense and antisense reads, respectively. Red bars represent RefSeq annotation of gene exons. Asterisks indicate the location of TSSs. Arrows indicate the orientation of the coding-strand transcript. Data were compiled from two biological replicates. b, Boxplot analysis of the level of expression of xTSS-RNAs at various genes reported to undergo recurrent AID-dependent translocations at DNA double-strand breaks generated within the Igh (left panel) or Myc (right panel) loci. Boxplots represent median values compiled from two biological replicates. Whiskers represent 99% of data values. **P < 0.01 (Wilcoxon rank-sum test). c, The list of 40 genes that show an overlap of translocation hotspots and xTSS-RNA expression (from Fig. 3c) was evaluated directly for xTSS-RNA levels (left panel) and mRNA levels (right panel). Statistical analysis was as described in b. **P < 0.01; NS, not significant (Wilcoxon rank-sum test). d, xTSS-RNA expression levels in Exosc3-deficient B cells at non-recurrent and recurrent AID-dependent translocation sites in the B-cell genome. Data were compiled from two biological replicates. **P < 0.01 (Wilcoxon rank-sum test). e, Statistical analysis of the probability of identification of 40 random xTSS-RNA-expressing genes solely based on expression level. Ten thousand control gene groups were randomly selected, each made up of genes expressed at levels similar to those of the translocation hotspot genes. Specifically, to generate one random control group, we went through every translocation hotspot gene, identified genes with similar expression levels (RPKM difference < 0.5), and randomly picked one for each hotspot. This yielded ten thousand gene lists, each containing 88 genes and sharing the expression profile of the translocation hotspot list.
We then simulated the distribution of the number of genes containing xTSS-RNA by overlapping the random control groups with the actual xTSS-RNA gene list. The binomial fit (red curve) shows that the number of overlapping genes for the real translocation hotspots is significantly higher than for the random controls. **P < 0.01 (binomial distribution).
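
The expression-matched control procedure described for panel e can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the exclusion of hotspot genes from the candidate pools, the independent draw for each hotspot (so a gene may be picked more than once), and the estimation of the binomial probability from the mean simulated overlap are all hypothetical choices that the legend does not specify.

```python
import random
from math import comb


def candidate_pools(hotspots, rpkm, max_diff=0.5):
    """For each hotspot gene, list the genes whose RPKM differs by < max_diff.

    Assumption: hotspot genes themselves are excluded from the pools.
    """
    hotspot_set = set(hotspots)
    return {
        h: [g for g, v in rpkm.items()
            if g not in hotspot_set and abs(v - rpkm[h]) < max_diff]
        for h in hotspots
    }


def simulate_overlaps(pools, xtss_genes, n_groups=10_000, seed=0):
    """Draw n_groups matched control groups and count xTSS-RNA overlaps.

    Assumption: one gene is drawn independently per hotspot, so the same
    control gene can appear more than once within a group.
    """
    rng = random.Random(seed)
    xtss = set(xtss_genes)
    counts = []
    for _ in range(n_groups):
        group = [rng.choice(pool) for pool in pools.values() if pool]
        counts.append(sum(g in xtss for g in group))
    return counts


def empirical_p(counts, n):
    """Per-gene overlap probability estimated from the simulated groups."""
    return sum(counts) / (len(counts) * n)


def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))
```

With real data, `n` would be the size of each control list (88 in the legend) and `k` the number of actual hotspot genes that overlap the xTSS-RNA list (40 in panel c); `binomial_upper_tail(k, n, empirical_p(counts, n))` then gives a one-sided probability under the fitted binomial.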