Extended Data Fig. 1: Quality statistics of the MPRA experiments.

(a) Statistics for FLASH-merged reads in the association library. The plot shows that 46.1% of reads are 200 bp fragments, as designed. (b) Statistics of BWA-mapped reads in the association library. The plot shows that 44.1% of reads are 200 bp fragments, as designed. (c) Statistics of barcode types per oligo in the association library. On average, each oligo is linked with 126 different barcodes. (d) Statistics of barcode types per oligo in reference (n = 1102), alternative (n = 1103), negative control (n = 153), and positive control (n = 30) oligos. Data are from the association library. (e) Statistics of barcode counts per oligo in reference (n = 1102), alternative (n = 1103), negative control (n = 153), and positive control (n = 30) oligos. Data are from the association library. (f) The numbers of barcode types for reference and alternative alleles are comparable. Pearson’s r = 0.91, p < 2 × 10⁻¹⁶. (g) Principal component analysis of DNA and RNA libraries from MNT-1 and WM88 cells. Three replicates. (h) Summary of enhancer activities estimated by MPRA. Enhancer activities were defined as the barcode counts per million in the RNA library divided by the barcode counts per million in the DNA library. Alt: oligos containing alternative alleles (n = 1103). Ref: oligos containing reference alleles (n = 1102). Negative: negative control oligos (n = 148). Positive: positive control oligos (n = 30). For boxplots, central lines indicate the median, and boxes extend from the 25th to the 75th percentile. Whiskers extend 1.5 times the interquartile range beyond the limits of each box.
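
The enhancer activity definition in (h) can be illustrated with a minimal Python sketch. It assumes that barcode counts have already been summed per oligo and that counts per million are computed against the total of each library. The legend does not specify either detail, and the function names and toy counts below are hypothetical.

```python
def counts_per_million(counts):
    """Scale raw per-oligo barcode counts to counts per million (CPM)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no counts")
    return {oligo: n * 1e6 / total for oligo, n in counts.items()}


def enhancer_activity(rna_counts, dna_counts):
    """Return RNA CPM divided by DNA CPM for each oligo.

    Oligos with zero DNA counts are skipped (an assumption made here
    to avoid division by zero; the legend does not say how they are handled).
    """
    rna_cpm = counts_per_million(rna_counts)
    dna_cpm = counts_per_million(dna_counts)
    return {
        oligo: rna_cpm.get(oligo, 0.0) / cpm
        for oligo, cpm in dna_cpm.items()
        if cpm > 0
    }


# Toy counts for two hypothetical oligos.
dna = {"oligo_ref": 100, "oligo_alt": 300}
rna = {"oligo_ref": 300, "oligo_alt": 100}
activity = enhancer_activity(rna, dna)
```

Normalizing each library to counts per million before taking the ratio keeps differences in sequencing depth between the RNA and DNA libraries from shifting the activity values.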