Fig. 6: nORFs dysregulated in cancer.

a Expression of the 14 probable ‘cancer markers’, which are differentially expressed in 19 cancers, analyzed using Xena’s TCGA-TARGET-GTEx dataset. These 14 markers are non-protein-coding transcripts that translate low-noise nORFs in 11 cell lines, as observed in the ribo-ORF datasets from RPFdb. Color key: black, transcripts expressed in neither the tumor nor the matched normal samples; red, transcripts not expressed in tumor samples but expressed in matched normal samples; green, transcripts expressed only in tumor samples; light blue, transcripts expressed in both with no differential expression between tumor and normal samples; dark blue, transcripts differentially expressed between tumor and normal samples. A transcript is defined as expressed if it has non-zero expression in at least 25% of the samples. b Structure of mPLsORF0000447155 (human ortholog), a peptide translated from ENST00000427352.1, predicted using EV-Fold and displayed with PyMOL. Red regions on the structure indicate amino acids affected by COSMIC mutations. Supplementary Table 6 lists the mutations associated with this sORF. Below are the structures of the highest-scoring ligands, compound 8462, compound 1491, and compound 1355 (right), and of the complex each forms with the sORF (left), predicted, respectively, from the libraries Immune-oncology, Targeted Oncology, and Signal pathway inhibitor.
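The expression rule and color key in panel a can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the function names `is_expressed` and `classify_transcript` are hypothetical, the expression values are assumed to be on a scale where 0 means no detected expression, and the differential-expression call is passed in as a precomputed flag because the caption does not specify the test used.

```python
from typing import Sequence


def is_expressed(values: Sequence[float], min_fraction: float = 0.25) -> bool:
    """Return True if at least `min_fraction` of samples have non-zero expression.

    Assumption: values are untransformed, so 0 means "not detected".
    """
    if not values:
        return False
    nonzero = sum(1 for v in values if v > 0)
    return nonzero / len(values) >= min_fraction


def classify_transcript(
    tumor: Sequence[float],
    normal: Sequence[float],
    differential: bool,
) -> str:
    """Map one transcript in one cancer type to a panel a color.

    `differential` is a hypothetical precomputed flag (tumor vs. normal);
    it is only consulted when the transcript is expressed in both groups.
    """
    in_tumor = is_expressed(tumor)
    in_normal = is_expressed(normal)
    if not in_tumor and not in_normal:
        return "black"       # expressed in neither group
    if in_normal and not in_tumor:
        return "red"         # not expressed in tumor only
    if in_tumor and not in_normal:
        return "green"       # expressed in tumor only
    return "dark blue" if differential else "light blue"
```

For example, a transcript detected in one of four tumor samples and in no normal samples meets the 25% threshold in tumor only and would be colored green.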