Extended Data Fig. 8: Additional analyses of the ground truth perturbation dataset. | Nature Methods

From: Targeted Perturb-seq enables genome-scale genetic screens in single cells

a, Precision-recall curves, as in Fig. 2f. Potentially true gRNA off-target or downstream effects were identified by differential expression testing across all cells and then excluded from the analysis. Points indicate performance at a nominal FDR of 0.05. See Note S3 section ‘Sensitivity analysis (differential expression)’ for details on the statistical test used. b, Comparison of the area under the precision-recall curve (AUPRC) for n = 6,100 cells per perturbation, sampled to various read depths. Potential gRNA off-target and downstream effects were treated as false positives (solid lines, same as in Fig. 2g) or excluded (dashed lines). c, The absolute effect of a gRNA-mediated perturbation in UMIs/cell was quantified from non-downsampled whole transcriptome data (x axis). The probability of observing these effects as significant was then quantified by drawing 100 samples using 150 cells per sample and 1,000 average reads per cell (y axis). Lines derive from a logistic regression. The UMI difference required to achieve a 50% detection probability was used as a measure of molecular sensitivity (dotted line). Data from n = 660,106 cells and 9,750 sampling runs. d, As in Fig. 2g, but using molecular sensitivity as defined in panel c as the measure of sensitivity. Downsampling was restricted to 50–150 cells per perturbation, since estimates of molecular sensitivity were otherwise driven by excessive sampling noise. Data from n = 660,106 cells and 7,150 sampling runs. e, AUPRC plotted in relation to the number of cells per perturbation and the total number of reads (data from Fig. 2g). f, For each of n = 656 gRNA targets, the absolute and relative expression change elicited by the perturbation, as well as the expression baseline, were computed from whole transcriptome data without subsampling (x axis). Data from both methods were then downsampled repeatedly to 150 cells per perturbation and 10,000 (Perturb-seq) or 1,000 (TAP-seq) reads per cell to determine the probability of detecting a change (y axis). Refer to the Methods section on ‘data visualization’ for a definition of box plot elements.
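
The molecular sensitivity measure in panel c can be illustrated with a minimal sketch. It assumes that each perturbation contributes one effect size and a count of sampling runs in which the effect was called significant, and that the effect size enters the logistic regression untransformed; the legend does not state the exact model specification, so this is not necessarily the authors' procedure. The function names and the toy data are hypothetical. The fit uses plain Newton-Raphson iterations so that the example needs only the Python standard library.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(x, detected, trials, iterations=50, tol=1e-10):
    """Fit p = sigmoid(b0 + b1 * x) to binomial counts by Newton-Raphson.

    x        : effect size per perturbation (assumed: UMIs/cell, untransformed)
    detected : number of sampling runs in which the effect was significant
    trials   : number of sampling runs per perturbation (100 in panel c)
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # entries of the (negated) Hessian
        for xi, ki, ni in zip(x, detected, trials):
            p = sigmoid(b0 + b1 * xi)
            residual = ki - ni * p
            weight = ni * p * (1.0 - p)
            g0 += residual
            g1 += residual * xi
            h00 += weight
            h01 += weight * xi
            h11 += weight * xi * xi
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:
            break  # degenerate data; keep the current estimate
        # Solve the 2x2 Newton system for the parameter update.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def effect_at_half_detection(b0, b1):
    """Effect size at which the fitted detection probability equals 0.5."""
    return -b0 / b1


# Toy data (hypothetical): expected detection counts out of 100 sampling
# runs, generated from a known curve with b0 = -3 and b1 = 2.
effects = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
runs = [100] * len(effects)
hits = [100 * sigmoid(-3.0 + 2.0 * e) for e in effects]

b0, b1 = fit_logistic(effects, hits, runs)
x50 = effect_at_half_detection(b0, b1)
```

Because the fitted probability is 0.5 exactly where b0 + b1 * x = 0, the 50% detection threshold is -b0 / b1, which corresponds to the dotted line described in panel c. A smaller threshold means smaller expression changes are detectable under the same sampling conditions.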