Extended Data Fig. 8: MPRA data quality and comparisons to epigenomic landscapes.
From: The dynamic, combinatorial cis-regulatory lexicon of epidermal differentiation

(a) Distribution of barcodes in plasmid library, demonstrating the skew of barcode representation. (b) Number of barcodes per fragment in plasmid library, demonstrating on average 10 barcodes per fragment tested. (c) Number of reads per MPRA sample. (d) Number of barcodes per fragment in MPRA RNA reads, demonstrating on average 10 barcodes per fragment tested. (e) Average MPRA signal compared to controls, showing ATAC regions on average have activity in between negative controls (genomic negatives and shuffled sequences) and positive controls (promoter sequences). (f) MPRA replicate consistency. Left: Consistency by Pearson R across replicates and timepoints tested. Right: Consistency of MPRA replicate signal for two example replicates in timepoint day 0. (g) Correlation of MPRA signal to various genomic and/or modeling signals: ATAC signal, NN predictions of ATAC signal, H3K27ac, and regression predictions from linear model utilizing NN final layer activations as model inputs (results shown on held-out test data). (h) ATAC signal across timepoints day 0,3, and 6 for sequences containing HOXA1 motif and ETV5 motif. Box-and-whisker plots show all points, minimum to maximum, with 25th to 75th interquartile range.