Fig. 3: Identification of de novo transcripts using different approaches.
From: Uncovering de novo gene birth in yeast using deep transcriptomics

a Diagram of the different strategies, which depend on whether annotations or transcriptomics data are used. Using transcriptomics data to obtain de novo assembled, non-annotated transcripts increases the scope of the comparisons of expressed sequences across species. b Number of de novo transcripts identified with each approach. The same computational pipeline was applied in all cases; the only difference was whether we used transcriptomics (de novo assembled transcripts) and annotations in all species (our approach), transcriptomics only in the reference species, or annotations only. The other two approaches retrieved many more de novo transcripts than ours. c Comparison of annotated S. cerevisiae-specific de novo proteins that are common between previously published studies and this study. Two transcripts or ORFs were categorized as common between two studies if their genomic coordinates overlapped. The pairwise comparisons show moderate overlap; no two methods produced very similar results. Note that when a study provided several lists, we took the least stringent one. Source data are provided as a Source Data file.
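
The coordinate-overlap criterion used for panel c can be illustrated with a minimal sketch. It is not the pipeline used in the study: the caption does not state a minimum overlap length or whether strand was considered, so the sketch assumes that any overlap of at least one base on the same chromosome and strand counts as a match. The names `Feature`, `overlaps`, `count_common`, and `pairwise_common` are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, NamedTuple, Tuple


class Feature(NamedTuple):
    """A transcript or ORF given by genomic coordinates (1-based, inclusive)."""
    chrom: str
    start: int
    end: int
    strand: str


def overlaps(a: Feature, b: Feature) -> bool:
    # Assumption: same chromosome, same strand, and at least 1 bp shared.
    return (
        a.chrom == b.chrom
        and a.strand == b.strand
        and a.start <= b.end
        and b.start <= a.end
    )


def count_common(list_a: List[Feature], list_b: List[Feature]) -> int:
    # Number of features in list_a that overlap at least one feature in list_b.
    # Note that this count is not necessarily symmetric in its two arguments.
    return sum(1 for a in list_a if any(overlaps(a, b) for b in list_b))


def pairwise_common(
    studies: Dict[str, List[Feature]]
) -> Dict[Tuple[str, str], int]:
    # Compare every pair of studies once, with study names in sorted order.
    return {
        (s1, s2): count_common(studies[s1], studies[s2])
        for s1, s2 in combinations(sorted(studies), 2)
    }
```

Calling `pairwise_common` with a dictionary that maps each study name to its list of features returns one count per pair of studies, which is the kind of quantity a pairwise comparison such as panel c summarizes.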