Supplementary Figure 2: Features of the Euplotes transcriptome. | Nature Structural & Molecular Biology

From: Position-dependent termination and widespread obligatory frameshifting in Euplotes translation

(a) Euplotes Sec and Cys tRNAs that decode TGA codons. Cys tRNA with the GCA anticodon and mitochondrial Trp tRNA with the TCA anticodon are shown for comparison. In total, we identified 183 tRNA genes in E. crassus and 337 in E. focardii by analysis of their genomes. (b) Frequency of introns of different lengths. The X axis indicates intron length in nucleotides, and the Y axis shows how many introns of that length are found in the transcriptomes (log scale). Short introns (~25 nucleotides) are a characteristic feature of Euplotes transcriptomes. (c) Frequency of chromosomes with different numbers of RNA molecules transcribed from them. The X axis shows the number of transcripts per chromosome, and the Y axis shows how many such chromosomes are found in the genome. (d) E. crassus splice sites. Nucleotide conservation around the exon-intron and intron-exon junctions.

Transcriptomes were assembled de novo using Trinity (Haas, B. J. et al., Nature Protoc, 8, 1494-1512, 2013); no genomic template was used for the assembly, to ensure independence of the analysis. The assembly procedure produced 33,701 unique transcripts with an average length of 573 nucleotides in E. crassus. We obtained the E. focardii RNA-seq reads from Keeling, P. J. et al. (PLoS Biol, 12, e1001889, 2014); their assembly produced 28,869 unique transcripts with an average length of 667 nucleotides.

To identify introns, we carried out pairwise alignments between the genome and the transcriptome of each species using FASTA (Pearson, W. Curr Protoc Bioinf, Chapter 3, Unit 3.9, 2004). In total, we identified 21,798 introns in E. crassus and 18,747 in E. focardii. The most frequent intron length was 25 nucleotides in both species, with 2,895 occurrences in E. crassus and 2,631 in E. focardii. Using 10,000 intron sequences from E. crassus, we characterized sequence features of the exon-intron donor and intron-exon acceptor sites.

We further aligned 32,350 E. crassus transcripts or their fragments (96%) to 18,032 genomic contigs, and similarly aligned 21,233 E. focardii transcripts (74%) to 16,950 genomic contigs. The majority of chromosomes had a single transcript aligning to them: 10,495 in E. crassus and 14,082 in E. focardii. Some chromosomes contained two or more predicted transcripts, which could be due, at least in part, to insufficient sequence coverage. When reads matching internal positions are missing, low coverage can cause a single transcript to be misassembled as two or more.
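The tallies behind panels (b) and (c) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the input shapes (a list of intron lengths and a transcript-to-contig mapping), and the toy values are all assumptions made for illustration, and the step that derives these inputs from FASTA alignments is not shown.

```python
from collections import Counter


def length_histogram(intron_lengths):
    """Map each intron length (nt) to the number of introns of that length."""
    return Counter(intron_lengths)


def most_frequent_length(intron_lengths):
    """Return (length, count) for the most common intron length."""
    return length_histogram(intron_lengths).most_common(1)[0]


def chromosomes_by_transcript_count(transcript_to_contig):
    """Map 'transcripts per chromosome' to 'number of such chromosomes'.

    transcript_to_contig: dict of transcript ID -> ID of the genomic
    contig (chromosome) it aligned to. Hypothetical input format.
    """
    transcripts_per_contig = Counter(transcript_to_contig.values())
    return Counter(transcripts_per_contig.values())


# Toy inputs, invented for illustration only (not data from the study).
toy_lengths = [25, 25, 25, 26, 31, 25, 120, 26]
toy_alignment = {"t1": "c1", "t2": "c1", "t3": "c2", "t4": "c3"}

print(most_frequent_length(toy_lengths))  # (25, 4)
print(sorted(chromosomes_by_transcript_count(toy_alignment).items()))
```

Plotting the first histogram with a logarithmic Y axis would give a panel (b)-style view, and the second tally corresponds to the X and Y axes described for panel (c).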
