Fig. 1: Identification of novel, non-annotated, transcripts and proteins. | Nature Communications

From: Uncovering de novo gene birth in yeast using deep transcriptomics

a Experimental overview of our study. We grew 11 species of yeast in two conditions (rich media and oxidative stress), then performed RNA-Seq on all 22 samples. We also performed Ribo-Seq for S. cerevisiae. b Transcriptome assembly. We generated a combined transcriptome assembly that merges annotated genes with the subset of de novo assembled transcripts not present in the annotations. We then quantified the expression of all transcripts in the two conditions. c Transcriptomes per species. We obtained hundreds of novel, non-annotated transcripts for each species. d Prediction of novel translated ORFs. Using translation signatures in the ribosome profiling data, we predicted novel translated ORFs in S. cerevisiae. We found 236 non-annotated transcripts likely to encode novel, not yet characterized proteins. e Size of novel and annotated proteins. Novel proteins identified by ribosome profiling were significantly smaller than annotated proteins (two-sided Wilcoxon test, p-value < 2.2e-16). Protein length was computed from the longest coding sequence per transcript; values in black are the medians. f Number of ORFs per transcript. A sizable proportion of the novel transcripts were predicted to contain more than one ORF. Source data are provided as a Source Data file.
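The per-transcript quantities in panels e and f (protein length from the longest coding sequence, and the number of ORFs per transcript) can be illustrated with a minimal sketch. This is not the pipeline used in the study, which predicted translated ORFs from ribosome profiling signatures; it is a plain sequence scan under stated assumptions: ORFs start at ATG, end at the first in-frame stop codon, lie on the forward strand only, and must span a hypothetical minimum number of codons (`min_codons`). The function names and the toy sequences are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=10):
    """Return sorted (start, end) coordinates of forward-strand ORFs.

    Assumption: an ORF runs from the most upstream in-frame ATG to the
    next in-frame stop codon; `end` is exclusive and includes the stop.
    `min_codons` counts sense codons (ATG included, stop excluded).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


def protein_length(orf):
    """Length in amino acids of an ORF given as (start, end), stop excluded."""
    start, end = orf
    return (end - start) // 3 - 1


def summarize(transcripts, min_codons=10):
    """Per transcript: number of ORFs and length of the longest encoded protein."""
    summary = {}
    for name, seq in transcripts.items():
        orfs = find_orfs(seq, min_codons)
        longest = max((protein_length(o) for o in orfs), default=0)
        summary[name] = {"n_orfs": len(orfs), "longest_protein": longest}
    return summary


if __name__ == "__main__":
    # Toy sequences, not real yeast transcripts.
    toy = {
        "tx1": "ATGGCTGCTGCTTAA",
        "tx2": "ATGAAATAACCATGGGGGGGTGA",
        "tx3": "CCCCCC",
    }
    for name, stats in summarize(toy, min_codons=2).items():
        print(name, stats)
```

Taking the maximum over ORFs mirrors the caption's "longest coding sequence per transcript" rule, so a transcript with several ORFs contributes a single protein length to a comparison like the one in panel e, while `n_orfs` corresponds to the count shown in panel f.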
