Fig. 2

Transcriptome-wide analysis of 3′-UTR regions identifies thousands of stable cleaved tails. a Three experimental methods were used to measure separated body and tail RNA fragments at a transcriptome-wide scale. Poly(A)-selected RNA from U2OS cells was enriched for 5′-capped bodies using TEX treatment (TEX) or anti-Cap immunoprecipitation (CAP IP); uncapped tails were enriched by streptavidin bead pulldown of in vitro biotinylated-7-methylguanylate capped RNA (3′-PD). b RNA-seq read coverage (y-axis) across the putative cleavage point (arrow) of SLC38A2 and LBH shows reduced “tail” coverage in TEX- and CAP IP-treated RNA, enriched “tail” coverage in 3′-PD, and equal coverage in untreated RNA (control). Black horizontal lines mark average coverage across exons. c–e A Hidden Markov Model (HMM) was applied to identify the most probable cleavage point for each transcript, yielding 12,578 statistically significant transcripts (Kolmogorov–Smirnov p < 0.01) for TEX (6068 cleaved genes overall, FDR < 0.01); 11,108 transcripts for CAP IP (5222 genes); and 14,589 transcripts for 3′-PD (6501 genes). Shown are density plots comparing the relative read coverage (normalized over the coding region, genome wide) before and after treatment, for bodies (x-axis) and tails (y-axis), for TEX (c), CAP IP (d), and 3′-PD (e). Red dots correspond to transcripts predicted to be cleaved, and blue dots mark non-cleaved transcripts. Histograms above and to the right of each plot show the marginal distributions of the treated-to-untreated RNA-seq ratio for bodies and tails, respectively, in each population. f Venn diagram of genes with statistically significant differences in body vs. tail read coverage (FDR < 0.01) for TEX, CAP IP, and 3′-PD. g Meta-gene analysis of average read coverage (paired-end RNA-seq; red line) showing a dip at the putative cleavage point (predicted from the TEX treatment data). The blue line shows a meta-gene plot of exon annotations, suggesting that the observed dip is not due to exon–intron junctions.
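To illustrate the kind of significance test named in panels c–e, here is a minimal sketch of a two-sample Kolmogorov–Smirnov comparison between coverage values upstream ("body") and downstream ("tail") of a candidate cleavage point. The caption does not state which distributions were compared, so the inputs here (per-position treated-to-untreated coverage ratios on each side of the point), the function name `ks_two_sample`, and the asymptotic p-value approximation are all assumptions for illustration, not the authors' pipeline. The HMM step that proposes the cleavage point is not shown.

```python
import math


def ks_two_sample(body, tail):
    """Two-sample KS statistic D and an asymptotic p-value.

    `body` and `tail` are hypothetical lists of per-position
    treated-to-untreated coverage ratios on either side of a
    candidate cleavage point (an assumption, not from the caption).
    """
    a, b = sorted(body), sorted(tail)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")

    # Walk both sorted samples and track the largest gap between
    # the two empirical cumulative distribution functions.
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))

    # Asymptotic Kolmogorov distribution with a small-sample correction.
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        # The series below converges poorly here; the p-value is ~1.
        return d, 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)


if __name__ == "__main__":
    # Toy values only: a depleted tail relative to the body.
    body_ratio = [1.0 + 0.01 * k for k in range(50)]
    tail_ratio = [0.3 + 0.01 * k for k in range(50)]
    d_stat, p_value = ks_two_sample(body_ratio, tail_ratio)
    print(f"D = {d_stat:.2f}, significant at 0.01: {p_value < 0.01}")
```

Neighbouring positions in real coverage data are correlated, so a p-value computed this way is only indicative; a gene-level FDR correction such as the one reported in the caption would be applied as a separate step.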