Fig. 3: Evaluation of ancestry-associated SNP mismatches in sgRNA targeting sequences. | Nature Communications

From: Germline variation contributes to false negatives in CRISPR-based experiments with varying burden across ancestries

a Association between East Asian (EAS) or European (EUR) ancestry and guide depletion scores for all guides targeting ancestry-associated dependencies. Guides with a single nucleotide variant (SNV) in the targeting sequence are indicated in red. Raw data are described in Source Data 3A.

b For each sgRNA sequence in the Avana library, the fraction of cell lines with an SNV in its targeting sequence. Guides that are affected in >10 cell lines (n = 3209) are indicated in red to the right of the dashed line, whereas those that are affected in <10 cell lines are not shown. Raw data are described in Source Data 3B.

c Variants (from WES/WGS) for all cell lines were mapped to the targeting sequences of all Avana guides (see Data Availability). Genes are stratified by the number of guides with a mismatch in at least one cell line. Raw data are described in Source Data 3C.

d Germline (red) and somatic (gray) variants from 32 tumor types profiled in TCGA were mapped to targeting sequences for guides in the Avana library. The total number of variants in each sample that map to Avana guides is plotted on the x-axis. In the boxplot, the box includes the second and third data quartiles divided by a median line, and whiskers represent the first and fourth quartiles. Boxplot summary values are described in Source Data 3Db. Raw data are described in Source Data 3D.

e Guides were stratified by the position of mismatches within the sgRNA targeting sequence, and the association between the SNV and the guide depletion score was computed for each sgRNA in the Avana library (black boxes). P-values were computed with two-sided t-tests between cell lines with and without each SNV. The impact of mismatches on guide activity from Doench et al. (ref. 3) is indicated in blue circles. In the boxplot, the box includes the second and third data quartiles divided by a median line, and whiskers represent the first and fourth quartiles. Boxplot summary values are described in Source Data 3Eb. Raw data are described in Source Data 3E.
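The mapping of variants to guide targeting sequences (panels b and c) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the coordinate convention (1-based, inclusive), and the toy guides and cell lines below are all assumptions made for illustration, and strand orientation is ignored when reporting the mismatch position.

```python
def map_variants_to_guides(guides, variants_by_line):
    """Find variants that fall inside each guide's targeting sequence.

    guides: dict of guide_id -> (chrom, start, end), hypothetical 1-based
        inclusive genomic coordinates of the targeting sequence.
    variants_by_line: dict of cell_line -> list of (chrom, pos) variants.
    Returns dict of guide_id -> {cell_line: sorted offsets within the guide},
    where offset 1 is the first base of the interval (strand is ignored).
    """
    hits = {guide_id: {} for guide_id in guides}
    for cell_line, variants in variants_by_line.items():
        for chrom, pos in variants:
            for guide_id, (g_chrom, start, end) in guides.items():
                if chrom == g_chrom and start <= pos <= end:
                    hits[guide_id].setdefault(cell_line, []).append(pos - start + 1)
    for per_line in hits.values():
        for offsets in per_line.values():
            offsets.sort()
    return hits


def fraction_affected(hits, n_cell_lines):
    """Fraction of cell lines with at least one variant in each guide."""
    return {guide_id: len(per_line) / n_cell_lines
            for guide_id, per_line in hits.items()}


# Toy input (made up): two 20-nt guides and four cell lines.
guides = {"sgA": ("chr1", 100, 119), "sgB": ("chr2", 500, 519)}
variants_by_line = {
    "line1": [("chr1", 105), ("chr2", 700)],
    "line2": [("chr1", 119)],
    "line3": [],
    "line4": [("chr3", 105)],
}

hits = map_variants_to_guides(guides, variants_by_line)
fractions = fraction_affected(hits, len(variants_by_line))
```

The nested loop is quadratic in guides and variants, which is fine for a toy example; a library-scale analysis would more plausibly use an interval index. Comparing depletion scores between cell lines with and without each SNV (panel e) would be a separate step on top of `hits`.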
