Extended Data Fig. 1: nuORFdb characteristics. | Nature Biotechnology

From: Unannotated proteins expand the MHC-I-restricted immunopeptidome in cancer

a. Hierarchical ORF prediction. Tree showing individual samples (leaves), combinations of samples (clades) and the entire dataset of all reads (root), representing the nodes used to make ORF predictions (arrowheads). #: samples used in nuORFdb construction, but later discovered to be of poor quality and not used in any subsequent analyses; CHX: samples pre-treated with cycloheximide; Harr: samples pre-treated with harringtonine; IFNγ: samples pre-treated with interferon gamma. b. nuORFdb size relative to the annotated proteome and to RNA-seq- and transcriptome-based databases. Number of ORFs (y axis) across four databases (x axis). c, d. Ribo-seq reveals mRNA reading frames. c. RNA-seq (blue) and Ribo-seq (green) reads aligned to the transcript of the MLEC gene. RNA-seq reads align to the entire length of the transcript, while Ribo-seq reads align exclusively to the translated portions. Ribo-seq supports translation of a 5′ uORF (red box, top). Histogram of +15 nt-shifted 5′ ends of Ribo-seq reads supporting translation of the MLEC 5′ uORF (colored), with the corresponding full-length aligned reads below. The 5′ ends of full-length reads are outlined in colors matching their +15 nt-shifted positions in the histogram (bottom). d. Histogram of 5′ ends of Ribo-seq reads supporting translation of annotated protein-coding ORFs at every third nucleotide (x axis) around the start codon (left) and the stop codon (right). The –12 position of the first peak indicates the placement of the ribosome at the start codon (position 0); this position is computationally adjusted to +3 by adding 15 nt to the 5′ end position of each read, as shown in (c).
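The 15 nt shift described in panels c and d can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `P_SITE_SHIFT` constant and the example read positions are hypothetical, and positions are assumed to be integers relative to the first nucleotide of the start codon (position 0). Only the 15 nt offset and the –12 to +3 adjustment come from the legend.

```python
from collections import Counter

# Offset added to each read's 5' end, as stated in the legend.
P_SITE_SHIFT = 15


def shifted_histogram(five_prime_ends, shift=P_SITE_SHIFT):
    """Count reads at each position after shifting their 5' ends by `shift` nt."""
    return Counter(pos + shift for pos in five_prime_ends)


def frame_counts(histogram, start_codon_pos=0):
    """Sum shifted read counts per reading frame (0, 1, 2) relative to the start codon."""
    frames = [0, 0, 0]
    for pos, n in histogram.items():
        # Python's % is non-negative for a positive modulus, so upstream
        # (negative) positions are assigned to a frame correctly as well.
        frames[(pos - start_codon_pos) % 3] += n
    return frames


# Hypothetical 5' end positions (illustrative values only, not data from the figure).
ends = [-12, -12, -12, -9, -9, -6, -5]
hist = shifted_histogram(ends)
frames = frame_counts(hist)
```

In this toy input, the reads whose 5′ ends sit at –12 land at +3 after the shift, matching the adjustment described for panel d, and most reads fall in frame 0, which is the kind of three-nucleotide periodicity the histograms are meant to show.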
