Fig. 3: Inference and effect of barcode multiplets in single-cell ATAC-seq data.
From: Inference and effects of barcode multiplets in droplet-based single-cell assays

a Default t-SNE depiction of the public scATAC-seq PBMC 5k dataset. Colors represent cluster annotations from the automated CellRanger output. b Quantification of barcodes affected by barcode multiplets for the same dataset (identified by bap). c Depiction of two multiplets, each composed of 9 oligonucleotide barcodes. Barcodes in each multiplet share a long common subsequence, denoted in black. d Visualization of the two barcode multiplets from c in t-SNE coordinates. e Visualization of all implicated barcode multiplets from this dataset. The zoomed panel shows a small group of cells affected by five multiplets, indicated by color. f Empirical distribution of the mean restricted longest common subsequence (rLCS) per multiplet. A cutoff of 6 was used to assign each multiplet to one of the two classes of barcode multiplets. g Percent difference of the mean log2 fragments between pairs of barcodes within a multiplet. The reported p-value is from a two-sided Kolmogorov–Smirnov test. The exact p-value is lower than machine precision. Analysis represents n = 5205 barcodes over 1 experimental replicate. Boxplots: center line, median; box limits, first and third quartiles; whiskers, 1.5× interquartile range. h Overall rates of barcode multiplets from additional scATAC-seq data comparing v1.0 and v1.1 (NextGEM) chip designs. Source data are available in the Source Data file.
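
The per-multiplet statistic in panel f can be illustrated with a minimal sketch. The caption does not define rLCS, so the code assumes it is the longest run of consecutive positions at which two equal-length barcodes carry the same base (no insertions or deletions). The function names, the class labels, the example barcodes, and the choice of `>=` at the cutoff are all hypothetical and are not taken from bap.

```python
from itertools import combinations
from statistics import mean


def rlcs(a, b):
    """Longest run of consecutive positions where a and b have the same base.

    Assumed definition of the restricted LCS: position-wise comparison only.
    """
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0  # extend the run or reset on mismatch
        best = max(best, run)
    return best


def mean_rlcs(barcodes):
    """Mean rLCS over all unordered pairs of barcodes in one multiplet."""
    return mean(rlcs(a, b) for a, b in combinations(barcodes, 2))


def classify(barcodes, cutoff=6):
    """Assign a multiplet to one of two classes by its mean rLCS.

    The labels and the >= comparison are illustrative assumptions.
    """
    return "high_rlcs" if mean_rlcs(barcodes) >= cutoff else "low_rlcs"


# Hypothetical barcodes, invented for illustration only.
similar = ["ACGTACGTACGTACGT", "ACGTACGTACGTTTTT", "ACGTACGTACGTACCA"]
dissimilar = ["ACGTACGTACGTACGT", "TGCATGCATGCATGCA"]

print(classify(similar), classify(dissimilar))
```

Under this assumed definition, multiplets whose barcodes share a long common stretch (as drawn in black in panel c) fall above the cutoff, while multiplets of unrelated barcode sequences fall below it.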