Extended Data Fig. 10: Evaluation of barcode replacement in IronThrone GoT processing. | Nature

From: Somatic mutations and cell identity linked by Genotyping of Transcriptomes

a, Fraction of reads with cell barcodes that do not perfectly match the whitelisted cell barcodes from the species-mixing experiment. ‘>Hamm-1’ denotes filtered reads with barcodes at a Hamming distance greater than one from every whitelisted barcode (n = 139,422 reads). ‘Not significant’ denotes filtered reads with barcodes at a Hamming distance of one from a whitelisted barcode, but which have a low probability of originating from that barcode (posterior probability < 0.99, n = 14,830 reads). ‘Replaced’ denotes rescued reads with barcodes that have candidates at a Hamming distance of one among the whitelisted barcodes, with statistical significance (posterior probability ≥ 0.99, n = 224,085 reads). b, c, Number of supporting reads per candidate barcode and base quality at the differing base positions (b) and across base positions (c). Two-sided Wilcoxon rank-sum tests were applied to compare not significant (n = 14,830) and replaced (n = 224,085) barcodes. d, Correlation between the number of supporting reads per candidate barcode and median base quality at the differing base (two-tailed Pearson’s correlation, F-test). e, Distribution of prior and posterior probabilities from not significant (n = 14,830) and replaced (n = 224,085) barcodes. The dashed red line represents the posterior probability cut-off (0.99). f–h, To further evaluate the efficiency of barcode replacement, we generated synthetic cell barcodes by randomly changing one base in whitelisted cell barcodes (n = 100 iterations). f, Percentage of reads with cell barcodes that are not identical to the whitelisted cell barcodes (n = 100 iterations). Percentages of replaced reads were 99.1% ± 0.001% (median ± absolute deviation) in simulations with 1 base changed, 1.1% ± 0.002% in simulations with 2 bases changed and 0.7% ± 0.001% in simulations with 3 bases changed. g, Determination of whether replaced cell barcodes are identical to the original cell barcodes. In simulations with 1 base changed, the percentage of reads with replaced cell barcodes that were identical to the original cell barcodes was 100% ± 0% (median ± absolute deviation of 100 iterations). h, Estimation of prediction power for classifying cell barcodes from simulations with 1 base changed (n = 100 iterations).
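
The three read categories in a and the synthetic barcodes in f–h can be illustrated with a minimal Python sketch. It is not the IronThrone GoT implementation: the legend gives only the Hamming-distance rule and the 0.99 posterior cut-off, so the posterior formula used here (prior from whitelist read counts, likelihood from the Phred quality at the differing base, normalized over all one-mismatch candidates) is an assumption, and the names `correct_barcode`, `mutate_one_base` and `whitelist_counts` are hypothetical.

```python
import random

BASES = "ACGT"


def hamming1_neighbors(barcode):
    """Yield (position, sequence) for every sequence one substitution away."""
    for i, base in enumerate(barcode):
        for alt in BASES:
            if alt != base:
                yield i, barcode[:i] + alt + barcode[i + 1:]


def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def correct_barcode(observed, quals, whitelist_counts, cutoff=0.99):
    """Classify one read barcode; return (status, corrected_barcode_or_None).

    ASSUMPTION: posterior ~ prior (read-count share of the candidate)
    x probability that the differing base is a sequencing error.
    """
    if observed in whitelist_counts:
        return "exact", observed
    total = sum(whitelist_counts.values())
    weights = {}
    for pos, cand in hamming1_neighbors(observed):
        if cand in whitelist_counts:
            prior = whitelist_counts[cand] / total
            weights[cand] = prior * phred_to_error(quals[pos])
    if not weights:
        # No whitelisted barcode within Hamming distance one.
        return ">Hamm-1", None
    best = max(weights, key=weights.get)
    posterior = weights[best] / sum(weights.values())
    if posterior >= cutoff:
        return "replaced", best
    return "not significant", None


def mutate_one_base(barcode, rng):
    """Build a synthetic barcode by changing one randomly chosen base."""
    pos = rng.randrange(len(barcode))
    alt = rng.choice([b for b in BASES if b != barcode[pos]])
    return barcode[:pos] + alt + barcode[pos + 1:]


if __name__ == "__main__":
    counts = {"AAAA": 90, "CCCC": 10}  # toy whitelist with read counts
    rng = random.Random(0)
    synthetic = mutate_one_base("AAAA", rng)
    print(synthetic, correct_barcode(synthetic, [30, 30, 30, 30], counts))
```

Under this assumed normalization, a barcode with a single one-mismatch candidate always receives a posterior of 1, so only barcodes with competing candidates can fall into the not significant class; the published method may weight the evidence differently.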
