Extended Data Fig. 6: Analysis with stringent thresholding of variant barcodes per cell for KRAS variants.
From: Massively parallel phenotyping of coding variants in cancer with Perturb-seq

a. Distribution of normalized variant expression (x axis, transcripts per 10,000 UMIs per cell (TP10K)). The vertical red line marks a stringent threshold for detecting variant barcodes per cell; the remaining panels show the results obtained with this threshold. b. Cumulative distribution function (CDF) of the number of cells (x axis) profiled for each variant after applying the threshold in a, considering either all cells (gray) or only cells with a single variant (black). c. Distribution of the number of variants detected per cell. d. Low-dimensional embedding of mean expression profiles of variants (dots), colored by variant class, as determined in Fig. 3a. e. sc-eVIP scores computed on the full dataset (x axis) versus on the thresholded dataset (y axis), colored by variant class. f. Sensitivity of sc-eVIP scores (y axis) for identifying impactful variants, at an FDR of 5%, as a function of the number of subsampled cells per variant (x axis) (black: all variants, green: Impactful I, purple: Impactful II, gold: Impactful III, red: Impactful IV (gain-of-function)). Mean sensitivity (lines) and 95% confidence intervals (shaded bands) are based on 10 subsampling iterations. g. Top: Hierarchical clustering of variants by their correlation profiles, for the thresholded dataset. Bottom: Average expression profile of all variable genes (rows) in each variant (columns), grouped into 12 gene programs (row colors). Program 8, which was higher in assigned than in unassigned cells, was enriched for translation, nonsense-mediated decay, and viral transcription, and may reflect the response to lentiviral transduction.
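The thresholding summarized in panels a to c can be illustrated with a minimal sketch. It assumes a cells-by-variants table of variant-barcode UMI counts and a total UMI count per cell. The function names, the toy counts, the threshold value of 2.0 TP10K, and the choice of an inclusive (`>=`) cutoff are all hypothetical and are not taken from the paper.

```python
def tp10k(variant_umis, total_umis):
    """Normalize variant-barcode UMI counts to transcripts per 10,000 UMIs per cell."""
    return [
        [10_000 * count / total for count in row]
        for row, total in zip(variant_umis, total_umis)
    ]


def summarize_thresholded(variant_umis, total_umis, threshold):
    """Apply a TP10K threshold and summarize detections.

    Returns (variants detected per cell,
             cells per variant using all cells,
             cells per variant using only single-variant cells).
    """
    norm = tp10k(variant_umis, total_umis)
    # Assumption: a variant counts as detected when TP10K >= threshold.
    detected = [[value >= threshold for value in row] for row in norm]
    n_per_cell = [sum(row) for row in detected]
    n_variants = len(variant_umis[0])
    cells_all = [sum(row[j] for row in detected) for j in range(n_variants)]
    cells_single = [
        sum(row[j] for row, n in zip(detected, n_per_cell) if n == 1)
        for j in range(n_variants)
    ]
    return n_per_cell, cells_all, cells_single


# Toy data (hypothetical): 4 cells (rows) x 3 variants (columns).
variant_umis = [
    [5, 0, 0],
    [2, 3, 0],
    [0, 0, 1],
    [0, 4, 0],
]
total_umis = [10_000, 10_000, 10_000, 20_000]

n_per_cell, cells_all, cells_single = summarize_thresholded(
    variant_umis, total_umis, threshold=2.0
)
print(n_per_cell)    # variants detected per cell (as in panel c)
print(cells_all)     # cells per variant, all cells (gray curve in panel b)
print(cells_single)  # cells per variant, single-variant cells (black curve in panel b)
```

Sorting `cells_all` or `cells_single` and plotting the cumulative fraction of variants against cell count would give curves of the kind shown in panel b.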