Fig. 3: RNA sequences and modification states are accurately annotated at single-molecule resolution in the DeepRM dataset.
From: Comprehensive discovery of m6A sites in the human transcriptome at single-molecule resolution

a The reconstruction algorithm of local sequence context blocks (LCBs). Sequenced reads deviate from the design due to sequencing errors (top). To locate the LCBs (green) accurately, each read is aligned to the design using fixed-sequence spacers (blue; middle). LCBs are then located between the aligned spacers (bottom). b The precision-recall (PR) measurement of the LCB reconstruction alignment using an in vitro-transcribed dataset (see “Methods”). The PR curves of A and m6A LCB reconstruction are shown (right). The graph-based reconstruction algorithm (blue) was compared against simple algorithms using the distance from the first spacer from the 3′ end (orange), and using the distance from the 5′ end of the poly(A) tail (green). Dashed lines indicate extrapolated values between the maximum achievable recall and 1.0. The maximum achievable precision is indicated by red dots, and its corresponding recall is depicted by gray lines. c The accuracy of the reconstructed LCB sequences. The substitution (first from the top), insertion (second), and deletion (third) rates are calculated between the reconstructed LCBs and DNA template sequences. 1.0 minus the sum of the substitution, insertion, and deletion frequencies is shown (bottom). d The length of reads obtained from A (gray) and m6A (red) sequencing runs. Boxes, center lines, and whiskers represent the 25th-75th percentiles, the median, and ±1.5 × interquartile range, respectively (n = 1000). e The correlation of electric currents for the central A between replicates. Signal intensities at the central A of an 11-mer motif at the center of the LCB are compared between the two replicates. Each point represents an 11-mer motif with depth ≥50 in both replicates. Pearson’s R² and Spearman’s ρ² values are shown in each plot. The color scale indicates Gaussian kernel-estimated density. n = 618,976 (A rep. 1 vs. A rep. 2), 138,480 (A rep. 1 vs. m6A rep. 1), 505,623 (A rep. 1 vs. m6A rep. 2), 144,246 (A rep. 2 vs. m6A rep. 1), 661,463 (A rep. 2 vs. m6A rep. 2), and 144,081 (m6A rep. 1 vs. m6A rep. 2).
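
The accuracy measure in panel c (1.0 minus the sum of the substitution, insertion, and deletion frequencies) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `edit_ops` and `lcb_accuracy` are hypothetical, the alignment is a plain unit-cost global (Levenshtein) alignment, and the rates are assumed to be normalized by the template length, which the caption does not specify.

```python
def edit_ops(template: str, read: str) -> dict:
    """Count matches, substitutions, insertions, and deletions in one
    optimal unit-cost global alignment of `read` against `template`."""
    n, m = len(template), len(read)
    # dp[i][j] = edit distance between template[:i] and read[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if template[i - 1] == read[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j - 1] + cost,  # match or substitution
                dp[i - 1][j] + 1,         # deletion (template base missing in read)
                dp[i][j - 1] + 1,         # insertion (extra base in read)
            )

    # Trace back from the bottom-right corner to classify each operation.
    counts = {"match": 0, "sub": 0, "ins": 0, "del": 0}
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if template[i - 1] == read[j - 1] else 1
            if dp[i][j] == dp[i - 1][j - 1] + cost:
                counts["match" if cost == 0 else "sub"] += 1
                i -= 1
                j -= 1
                continue
        if i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            counts["del"] += 1
            i -= 1
        else:
            counts["ins"] += 1
            j -= 1
    return counts


def lcb_accuracy(template: str, read: str) -> dict:
    """Per-template-base error rates and accuracy = 1 - (sub + ins + del).
    Normalizing by template length is an assumption of this sketch."""
    ops = edit_ops(template, read)
    length = len(template)
    rates = {k: ops[k] / length for k in ("sub", "ins", "del")}
    rates["accuracy"] = 1.0 - (rates["sub"] + rates["ins"] + rates["del"])
    return rates
```

For example, `lcb_accuracy("ACGTACGT", "ACGAACGT")` reports one substitution over eight template bases. When several optimal alignments exist, the traceback prefers the diagonal move, so the split between error types may differ from that of another aligner even though the total edit count is the same.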