Supplementary Fig. 1: Evaluation of the sequencing mappers and RNA bisulfite treatment conditions.
From: Genome-wide identification of mRNA 5-methylcytosine in mammals

(a) Comparison of different sequencing mappers. The mapping efficiency and accuracy of the mappers were evaluated with 100 bp (left) or 50 bp (right) simulated paired-end reads. Simulated reads were generated with the R package polyester. Six mapping strategies were tested: Bowtie2 (2.2.9), HISAT2 (2.10.0), HISAT2 plus Bowtie2, meRanGh (HISAT2-2.0.4) and meRanGs (STAR-2.5.2b) from meRanTK-1.20, and BS-RNA (2.10.0). Mapping rates are shown on the y axis. Pseudogene denotes simulated reads that were generated from pseudogenes but mapped to their parent genes. Although these reads were not mapped to their original genomic coordinates, they need not be considered incorrectly mapped. (b) Comparison of the coverage of each ERCC mix between the medium-stringency and high-stringency conditions. Coverage was normalized to all detected ERCC reads in each library (Methods). The concentrations of the ERCC mixes reported in the manufacturer’s protocol are shown on the x axis. Each concentration is represented by several ERCC transcripts. (c) Distribution of coverage across the in vitro transcribed transcript. Coverage is normalized to the total base count of the in vitro transcript. Dashed lines indicate the positions of the 5 Cs. C380 was omitted because of a substantial decrease in sequencing depth at that position, which indicates a failure to synthesize the full-length transcript.