Supplementary Figure 2: Validation of bead merging computational approach. | Nature Biotechnology

From: Droplet-based combinatorial indexing for massive-scale single-cell chromatin accessibility

(a) Browser shot of paired-end reads near the DIAPH1 and GAPDH loci. Reads are colored by bead barcode sequence. (b) Schematic of the verification experiment, in which a library of random oligonucleotides was encapsulated into droplets together with Tn5-transposed cells and barcoded beads. The schematic shows a droplet containing a library of random oligonucleotides, a cell and two beads with different barcode sequences. (c) The expected number of beads per drop as a function of bead concentration. The line was fitted by maximum likelihood estimation for a double-truncated Poisson distribution. (d) Percent of drops with one or more beads as a function of bead concentration. Values are estimated using the probability mass function of a Poisson distribution parameterized by the mean number of beads per drop from (c). (e) Jaccard index overlap metric for pairs of bead barcodes loaded at a concentration of 200 beads/μL. For each observed pair of bead barcodes, the Jaccard index was computed over the observed random oligonucleotide sequences. (f) The BAP overlap score computed from the dscATAC-seq data (agnostic to oligonucleotides) from the same experiment. In each panel, pairs of bead barcodes nominated for merging are highlighted in blue. Merged pairs were determined by computing a “knee” inflection point. The same two panels are shown in (g–j) for increased bead concentrations: (g, h) 800 beads/μL; (i, j) 5,000 beads/μL. (k) Left panel: area under the receiver operating characteristic curve (AUROC) values for true positive bead merges nominated from the random oligonucleotide sequences. Four metrics are compared, including Pearson and Spearman correlation and the Jaccard index of reads in peaks per pair of bead barcodes. The final metric is our novel computational approach, termed BAP. The bead concentration for each experimental condition is shown below the x-axis. Right panel: the same conditions and metrics, but showing the area under the precision-recall curve (AUPRC). (l) %TSS enrichment scores for the same pool of cells processed at different bead concentrations. Center line, median; box limits, first and third quartiles; whiskers, 1.5× interquartile range. (m) Per-cell library complexities across the same range of bead concentrations as in panel (l). Center line, median; box limits, first and third quartiles; whiskers, 1.5× interquartile range. Both panels (l, m) show the top 500 cells sorted by library size. (n) Species mixing plots and collision rates (text) for the same experiment (800 beads/μL) with and without bead merging. (o) The same plots as in (n) but at a bead concentration of 5,000 beads/μL.
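
The two quantities described in panels (d) and (e) can be illustrated with a minimal Python sketch. It assumes an untruncated Poisson model for bead loading and made-up oligonucleotide sets. The function names, bead labels and toy data are hypothetical and are not taken from the BAP software, and the sketch does not reproduce the BAP overlap score or the knee call.

```python
import math
from itertools import combinations


def fraction_with_beads(mean_beads_per_drop: float) -> float:
    """P(X >= 1) for X ~ Poisson(mean), i.e. 1 - P(X = 0) = 1 - exp(-mean)."""
    return 1.0 - math.exp(-mean_beads_per_drop)


def jaccard(a: set, b: set) -> float:
    """|a & b| / |a | b|; defined here as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(oligos_by_bead: dict) -> dict:
    """Jaccard index for every unordered pair of bead barcodes."""
    return {
        (b1, b2): jaccard(oligos_by_bead[b1], oligos_by_bead[b2])
        for b1, b2 in combinations(sorted(oligos_by_bead), 2)
    }


# Toy, made-up data: BEAD_A and BEAD_B mimic two beads in one droplet
# (largely shared oligos); BEAD_C mimics a bead from a different droplet.
toy = {
    "BEAD_A": {"o1", "o2", "o3", "o4"},
    "BEAD_B": {"o2", "o3", "o4", "o5"},
    "BEAD_C": {"o9"},
}
scores = pairwise_jaccard(toy)
```

In this toy example, the pair sharing most of its oligonucleotides scores far higher than pairs with no overlap, which is the separation that the ranked plots in panels (e), (g) and (i) rely on before a threshold is chosen.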