Fig. 1: Mapping and characterization of mRNA 3′ end CS in 206 primary cell types.
From: Quantifying 3′UTR length from scRNA-seq data reveals changes independent of gene expression

a Read distribution at mRNA 3′ end CS from 10x Genomics compared with MWS data. Shown is the terminal exon of the mouse Vamp2 gene. nts, nucleotides. b Schematic of CS annotation from MWS data and generation of a truncated UTRome for downstream gene and 3′UTR isoform quantification. c Motif distribution surrounding human CS (position 0). Shown are the PAS (AWTAAA) in [−50,0], the CFI binding site (TGTA) in [−100,0], and the CFII binding site (TKTKTK) in [0,50] for the indicated annotation categories. d Calculated APARENT2 cleavage probabilities for all human CS in a 30-nt window, stratified by annotation category as shown in (b). Boxes show the interquartile range (IQR) with the median; whiskers extend to 1.5 × IQR. e Mean PhastCons score of 30 genomes in a 100-nt window centered on human CS, excluding coding sequences. Boxes show the IQR with the median; whiskers extend to 1.5 × IQR. f GENCODE transcript annotations depicting the last exons of the human NUDT21, PCF11, and ROCK1 genes. Shown are chromosome coordinates (hg38), CS from existing PAS databases and our MWS CS annotation, together with APARENT2 cleavage probability scores and CS usage scores. Major CS are highlighted by gray boxes. g CS usage score distribution for human CS that were identified by MWS. h Number of major MWS CS per gene compared with CS counts from two other databases. Shown are CS in all human protein-coding genes.
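
The window-based motif search in panel c can be illustrated with a minimal Python sketch. It is not the authors' code. The function names (`motif_regex`, `has_motif`), the toy sequence, and the rule that a motif must lie entirely inside the window are assumptions made for illustration. W and K are expanded using the standard IUPAC codes (W = A or T, K = G or T).

```python
import re

# IUPAC codes needed for the caption's motifs: W = A or T, K = G or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "K": "[GT]"}


def motif_regex(motif):
    """Translate a degenerate DNA motif such as AWTAAA into a regex."""
    return re.compile("".join(IUPAC[base] for base in motif))


def has_motif(seq, cs_index, motif, start, end):
    """Return True if `motif` occurs within [start, end] relative to the CS.

    `cs_index` is the 0-based index of the cleavage site (position 0) in
    `seq`. Assumption: the whole motif must fall inside the window.
    """
    lo = max(cs_index + start, 0)
    hi = min(cs_index + end + 1, len(seq))
    return motif_regex(motif).search(seq[lo:hi]) is not None


# Hypothetical toy sequence: 100 nt upstream, the CS base, 50 nt downstream.
upstream = "C" * 40 + "TGTA" + "C" * 31 + "AATAAA" + "C" * 19
downstream = "C" * 9 + "TGTGTG" + "C" * 35
seq = upstream + "C" + downstream
cs = len(upstream)  # index of position 0

windows = {"AWTAAA": (-50, 0), "TGTA": (-100, 0), "TKTKTK": (0, 50)}
hits = {m: has_motif(seq, cs, m, *w) for m, w in windows.items()}
print(hits)
```

Applying `has_motif` to every annotated CS and averaging the results per annotation category would give a motif frequency of the kind plotted in panel c.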