Fig. 2: Rule Set 3 (Sequence + Target) validation.

a Schematic depicting the construction of the essential/non-essential tiling library and the screening approach. b Spearman correlations between observed and predicted activity for all essential genes (n = 201) in each of the essential/non-essential screens, across previous models and Rule Set 3 models. Rule Set 3 models using the same tracrRNA feature as the screen are highlighted in pink and significantly outperformed other models (one-sided t-test, p < 0.002). Boxes span the 25th to 75th percentiles and the center line represents the median; whiskers show the 5th and 95th percentiles. c Percent of all sgRNAs targeting essential and non-essential genes, broken down by Rule Set 3 (Sequence + Target) bins for the Hsu and Chen tracrRNAs. d Spearman correlations between predicted scores and the growth phenotype for sgRNAs (n = 1964) targeting essential genes in a tiling CRISPRi dataset, across all sequence models. e Percent of quintiles for sgRNAs from essential genes (n = 199) with at least 20 guides, binned by Rule Set 3 (Sequence + Target) scores for each screen. f Average log-fold change of 4 sgRNAs per gene for essential genes (n = 201) and non-essential genes (n = 198), calculated by picking sgRNAs randomly, using Rule Set 2, or using Rule Set 3 (Sequence + Target), for the tiling library screened with the Hsu, Chen, and DeWeirdt tracrRNAs. Rule Set 3 (Sequence + Target) scores were computed with the tracrRNA matched to each screen, except for the screen performed with the DeWeirdt tracrRNA, for which we used on-target scores with the Chen tracrRNA. Boxes span the 25th to 75th percentiles and the center line represents the median; whiskers show the 10th and 90th percentiles. Heatmap shows the corresponding SSMD scores. Source data are provided as a Source Data file.
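
The gene-level summary in panel f can be illustrated with a minimal sketch. It assumes each sgRNA is represented as a (predicted score, log-fold change) pair, that the 4 highest-scoring sgRNAs per gene are averaged, and that SSMD is the standard estimator for two independent groups, (mean of group A minus mean of group B) divided by the square root of the sum of the two sample variances. The caption does not state which SSMD estimator or selection procedure was used, so the function names `mean_lfc_of_top_guides` and `ssmd` and the toy numbers are hypothetical.

```python
import math
import statistics


def mean_lfc_of_top_guides(guides, n_guides=4):
    """Average log-fold change of the n highest-scoring sgRNAs for one gene.

    guides: list of (predicted_score, log_fold_change) tuples (assumed layout).
    """
    # Rank sgRNAs by predicted on-target score, best first, and keep n of them.
    top = sorted(guides, key=lambda g: g[0], reverse=True)[:n_guides]
    return statistics.fmean([lfc for _, lfc in top])


def ssmd(essential_lfcs, nonessential_lfcs):
    """Strictly standardized mean difference between two independent groups.

    Assumed estimator: (mean_a - mean_b) / sqrt(var_a + var_b), sample variances.
    """
    mean_diff = statistics.fmean(essential_lfcs) - statistics.fmean(nonessential_lfcs)
    spread = math.sqrt(
        statistics.variance(essential_lfcs) + statistics.variance(nonessential_lfcs)
    )
    return mean_diff / spread


# Toy gene-level averages (illustrative values only, not data from the screens).
essential = [-2.0, -1.0, -3.0]
nonessential = [0.0, 1.0, -1.0]
print(ssmd(essential, nonessential))
```

A more negative SSMD indicates stronger separation of essential from non-essential genes, so under these assumptions a better sgRNA selection rule would yield a more negative value.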