Fig. 3: Ribosome footprint profiling uncovers thousands of short open reading frames in the Chinese hamster genome. | Nature Communications

From: Detection of host cell microprotein impurities in antibody drug products

Examples are shown of (a) a uORF found in a Ddit3 transcript, (b) an ouORF in a Rab31 transcript, and (c) a sORF found in the transcript of a long non-coding RNA gene. Panels a, b and c show the full cycloheximide (CHX) coverage (coloured grey) and the P-site offset CHX coverage (coloured by reading frame relative to the annotated translation initiation site, TIS). (d) Many of the previously uncharacterized ORFs identified in this study were sORFs predicted to produce proteins of < 100 aa. We focused on short open reading frames found in the 5′ leader of protein-coding transcripts (i.e., upstream ORFs and start-overlapping uORFs) and on ORFs found in non-coding RNAs; in these classes, > 90% of all identified ORFs were sORFs. (e) Comparison of the amino acid frequencies of uORFs (both uORFs and ouORFs) and ncRNA sORFs with those of annotated, conventional protein-coding ORFs (≥ 100 aa), and with the expected amino acid frequencies for the Chinese hamster genome, revealed differences in amino acid usage, including for arginine and glycine. Supplementary Fig. 7 shows the frequency of all amino acids. (f) The sORF populations were also found to have a reduced codon adaptation index (CAI) compared to previously annotated canonical proteins. A two-sided Kolmogorov–Smirnov test was used to assess the CAI difference between ORF types; a p-value < 0.01 was considered significant. In the boxplots in (d) and (f), center lines show the median and whiskers extend to 1.5× the interquartile range. Source data are provided as a Source Data file.
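For readers unfamiliar with the CAI in panel (f), the following is a minimal sketch of the standard definition: the geometric mean of per-codon relative adaptiveness weights derived from a reference gene set. It is not the pipeline used in the study. The toy codon table (arginine and glycine only), the function names, and the choice to skip codons absent from the reference set are all assumptions made for illustration.

```python
import math
from collections import Counter

# Assumption: a toy table with only the arginine (R) and glycine (G)
# synonymous codon families; a real analysis would use the full genetic code.
SYNONYMOUS = {
    "R": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
}

def codons(seq):
    """Split a coding sequence into complete, in-frame codons."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def relative_adaptiveness(reference_seqs):
    """Weight each codon by its count relative to the most-used synonym."""
    counts = Counter(c for s in reference_seqs for c in codons(s))
    weights = {}
    for family in SYNONYMOUS.values():
        top = max(counts[c] for c in family)
        if top == 0:
            continue  # amino acid never seen in the reference set
        for c in family:
            if counts[c] > 0:
                weights[c] = counts[c] / top
    return weights

def cai(seq, weights):
    """Geometric mean of the weights of the codons in seq.

    Assumption: codons without a weight are skipped rather than
    given a pseudo-count.
    """
    logs = [math.log(weights[c]) for c in codons(seq) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))

# Hypothetical reference set standing in for annotated canonical proteins.
reference = ["CGCCGCCGCAGA", "GGCGGCGGA"]
w = relative_adaptiveness(reference)
optimal = cai("CGCGGC", w)      # built only from the most-used codons
suboptimal = cai("AGAGGA", w)   # built only from less-used codons
```

An ORF that uses only the most frequent synonymous codons scores 1, and rarer codons pull the score towards 0, which is the sense in which the sORF populations show a reduced CAI.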
