Extended Data Fig. 1: Data Analysis & Quality Control.
From: The landscape of N6-methyladenosine in localized primary prostate cancer

a) The study workflow: we investigate the role of the post-transcriptional RNA modification m6A in primary prostate tumors by examining global m6A patterns, germline origins, clinical associations, microenvironmental associations and functional impact. This builds on previous molecular profiling studies of this primary prostate tumor cohort including 133 genotypes18, 148 CNA and 143 somatic mutation profiles17, 146 DNA methylation18, 29 H3K27ac ChIP-seq20, 92 ultra-deep RNA-sequencing19 and 54 proteomic profiles2. b) Data processing pipeline for meRIP-sequencing. c) Quality control metrics derived from the output of STAR, RSeQC and cutadapt, and from the correlation of the Input library with previous transcriptome profiling datasets16,19. Top barplot: the sum of negative Z-scores for each sample. Bottom heatmap: metric-by-sample matrix where colors denote the magnitude of negative Z-scores. d) Left panel: sample-level correlation of transcript abundance in the 92 samples common to the Input libraries of this study and the ultra-deep bulk RNA-seq from Chen et al.19 (Spearman’s ρ). Right panel: gene-level correlation of transcript abundance in the same datasets (Spearman’s ρ)19. e) Validation of sample identity by comparing germline variants identified in the Input libraries with those from Houlahan et al.18. Both axes are ordered identically, first by ancestry (covariate: orange represents European ancestry, green represents non-European ancestry) and then by sample ID. Heatmap color represents precision (left) and sensitivity (right). Precision is calculated as the intersection of RNA-seq and WGS variants divided by all RNA-seq variants. Sensitivity is calculated as the intersection of RNA-seq and WGS variants divided by all WGS variants. f) Comparison of the PCR Duplicate Proportion (PDP), the Non-Redundant Fraction (NRF) and the PCR Bottlenecking Coefficients 1 and 2 (PBC1/2) for each sample. The metric values of the IP library (y-axis) are compared to the metric values of the Input library (x-axis).
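
The precision and sensitivity definitions in panel e can be illustrated with a minimal sketch. It assumes each variant is represented as a hashable key such as (chromosome, position, ref, alt); the function name, the key format and the example calls below are hypothetical and are not taken from the study's pipeline or data.

```python
def precision_sensitivity(rna_variants, wgs_variants):
    """Compare two variant call sets for one RNA-seq/WGS sample pair.

    precision   = |RNA-seq ∩ WGS| / |RNA-seq|
    sensitivity = |RNA-seq ∩ WGS| / |WGS|
    Returns NaN for a metric whose denominator set is empty.
    """
    rna = set(rna_variants)
    wgs = set(wgs_variants)
    shared = rna & wgs
    precision = len(shared) / len(rna) if rna else float("nan")
    sensitivity = len(shared) / len(wgs) if wgs else float("nan")
    return precision, sensitivity


# Hypothetical variant keys (chrom, pos, ref, alt); not data from the study.
rna_calls = [
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 50, "G", "A"),
    ("chr3", 10, "T", "C"),
]
wgs_calls = [
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 50, "G", "A"),
    ("chr2", 75, "T", "C"),
    ("chr4", 5, "A", "T"),
]

precision, sensitivity = precision_sensitivity(rna_calls, wgs_calls)
print(f"precision={precision:.2f}, sensitivity={sensitivity:.2f}")
```

Evaluating this function for every RNA-seq sample against every WGS sample would give the two sample-by-sample matrices shown as heatmaps, where a correctly matched sample is expected to score highest against its own WGS profile.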