Fig. 4: Expression levels and ORF changes in unannotated transcripts.

a Box plots comparing the expression levels of annotated and unannotated transcripts in protein-coding genes, lncRNAs, and pseudogenes. Each box shows the IQR and median of the transcript TPM values; whiskers extend to 1.5 times the IQR. P, Wilcoxon’s rank-sum test. n = 443,200 annotated and n = 1,373,392 unannotated protein-coding transcripts, n = 41,168 annotated and n = 39,984 unannotated lncRNA transcripts, and n = 5472 annotated and n = 15,840 unannotated pseudogene transcripts. The individual expression values (data points) can be found in Supplementary Data 8. b Length distributions of ORFs from annotated and unannotated transcripts. c The percentage of transcripts with different ORF changes. d Bar plots showing the number of different ORF changes in each ASE type.