Extended Data Fig. 2: Peak Calling & Central Dogma Integration.
From: The landscape of N6-methyladenosine in localized primary prostate cancer

a) The distribution of peaks identified by exomePeak and MeTPeak. Each dot in the scatterplot represents a sample; histograms show the distribution of m6A-methylated genes and peaks across samples. Bottom histogram shows the distribution of the number of peaks identified per gene. b) Spearman’s correlation between the number of peaks and the library size. The size of the disks corresponds to the magnitude of Spearman’s ρ and the background color indicates the P value (two-tailed test). c-d) The number of peaks called using three IP libraries when randomly subsampled to 50%, 25%, 10% and 1% of the original reads for exomePeak (c) and MeTPeak (d). e) Comparison of the coverage of MeTPeak and exomePeak peaks using the Simpson Index (intersection/smallest set, median = 94.80%) and the Jaccard Index (intersection/union, median = 53.01%) for each sample. Inner panel: Overlap by genomic coverage with previously identified peaks in the REPIC database (ref. 26). f) Distribution of peak density across a metagene as called by MeTPeak. Line shows the density of the peaks identified by HistogramZoo, while the blue background shows the variation of peak density across individual samples. Bottom plot shows the distribution of peaks annotated across transcriptomic elements. g-j) Exemplar plots for MALAT1 (g), FOXA1 (h), RB1 (i) and NKX3-1 (j). The median IP (green) and Input (purple) coverage are represented using lines, while background colors represent the range of IP and Input coverage across samples. Exons are annotated below in dark blue. Bottom heatmaps represent the distribution of IP and Input coverage (log1p-transformed) across samples. k) Relationship between the number of samples with an m6A peak at each gene and the corresponding gene-level RNA abundance. Box plots represent the median (center line) and upper and lower quartiles (box limits), and whiskers extend to the minimum and maximum values within 1.5× the IQR.
l) Distribution of Spearman’s correlation between m6A level and input abundance. m-n) Relationship between the gene-level correlation of RNA and protein abundance and (m) gene-level m6A abundance and (n) the number of samples with an m6A peak at each gene. Spearman’s correlation and the corresponding P value are displayed (n = 3,233).
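
The two overlap measures named in panel e can be illustrated with a minimal sketch. This is not the authors' code: it assumes peaks are half-open (start, end) intervals on a single chromosome, measures overlap in covered base pairs, and uses hypothetical function names (`merge_intervals`, `covered_bp`, `intersection_bp`, `simpson_index`, `jaccard_index`) and toy coordinates chosen only for illustration.

```python
from typing import List, Tuple

# Assumption: a peak is a half-open interval [start, end) on one chromosome.
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_bp(intervals: List[Interval]) -> int:
    """Number of base pairs covered by a peak set (each base counted once)."""
    return sum(end - start for start, end in merge_intervals(intervals))


def intersection_bp(a: List[Interval], b: List[Interval]) -> int:
    """Number of base pairs covered by both peak sets."""
    a, b = merge_intervals(a), merge_intervals(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def simpson_index(a: List[Interval], b: List[Interval]) -> float:
    """Intersection divided by the smaller of the two covered sets."""
    smallest = min(covered_bp(a), covered_bp(b))
    return intersection_bp(a, b) / smallest if smallest else 0.0


def jaccard_index(a: List[Interval], b: List[Interval]) -> float:
    """Intersection divided by the union of the two covered sets."""
    inter = intersection_bp(a, b)
    union = covered_bp(a) + covered_bp(b) - inter
    return inter / union if union else 0.0


if __name__ == "__main__":
    # Toy peak sets, not data from the study.
    peaks_a = [(100, 200), (300, 400)]
    peaks_b = [(150, 250), (300, 350), (500, 600)]
    print(f"Simpson: {simpson_index(peaks_a, peaks_b):.4f}")
    print(f"Jaccard: {jaccard_index(peaks_a, peaks_b):.4f}")
```

Because the Simpson Index divides by the smaller set, it stays high when one caller's peaks are largely contained in the other's, whereas the Jaccard Index drops whenever either caller covers regions the other does not. This is consistent with a high Simpson median alongside a lower Jaccard median.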