Fig. 3 | Nature Communications

Fig. 3

From: A CRISPRi screen in E. coli reveals sequence-specific toxicity of dCas9

A machine-learning approach reveals a toxicity effect determined by the seed sequence. a A locally connected neural network was trained to predict the fitness effect of guide RNAs that target neutral regions, using the one-hot-encoded 60 nt sequence window around the target. The plot compares predicted and actual log2FC values on a held-out test set. b To identify the positions used by the model to make its predictions, we generated a set of 1000 random sequences, mutated each position in silico, and computed the effect of each mutation on the model prediction. The standard deviation of the effect of mutations at each position is plotted. The red bar indicates the position of the GG bases of the PAM. c The model was trained again using only the 20 nt of the guide sequence. The box plots show the distribution of the effects that mutations to all possible bases have on the model prediction. The effect of a specific mutation can be either positive or negative, revealing a strong dependence on the rest of the sequence. d To measure the level of interaction between positions, we generated all possible pairs of mutations for each sequence in a set of 100 random sequences and compared the effect of individual mutations to that of pairs of mutations. Positions are interacting if the effect of a double mutation (Eij) differs from the sum of the effects of the single mutations (Ei + Ej). The heat map shows the average Euclidean distance between Eij and Ei + Ej for all pairs of positions (see Supplementary Fig. 5). Note the strong network of interacting bases in the 5 nt of the seed sequence.
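The in silico mutagenesis of panel b and the interaction measure of panel d can be illustrated with a minimal sketch. It is not the authors' code. The function `toy_model` is a hypothetical stand-in for the trained network, with one additive position and one interacting pair built in, and all function names are made up for illustration. Computing the Euclidean distance over the 4 × 4 table of base pairs at positions i and j is one possible reading of the caption, assumed here.

```python
import numpy as np

BASES = "ACGT"
EYE = np.eye(4)


def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot array."""
    return EYE[[BASES.index(b) for b in seq]]


def toy_model(x):
    """Hypothetical stand-in for a trained predictor (not the paper's network).

    Position 0 contributes additively; the last two positions interact:
    the penalty applies only when both are G.
    """
    return x[0, 0] - 2.0 * x[-1, 2] * x[-2, 2]


def single_effects(model, x):
    """E[i, b] = model(x with position i set to base b) - model(x)."""
    base = model(x)
    effects = np.zeros((x.shape[0], 4))
    for i in range(x.shape[0]):
        for b in range(4):
            mutant = x.copy()
            mutant[i] = EYE[b]
            effects[i, b] = model(mutant) - base
    return effects


def position_importance(model, seqs):
    """Standard deviation of single-mutation effects at each position (panel b)."""
    per_seq = []
    for s in seqs:
        x = one_hot(s)
        effects = single_effects(model, x)
        # Keep only the three real substitutions per position (drop the no-op).
        per_seq.append(effects[x == 0].reshape(len(s), 3))
    return np.concatenate(per_seq, axis=1).std(axis=1)


def interaction_map(model, seqs):
    """Average Euclidean distance between Eij and Ei + Ej (panel d)."""
    length = len(seqs[0])
    dist = np.zeros((length, length))
    for s in seqs:
        x = one_hot(s)
        base = model(x)
        single = single_effects(model, x)
        for i in range(length):
            for j in range(i + 1, length):
                double = np.zeros((4, 4))
                for a in range(4):
                    for b in range(4):
                        mutant = x.copy()
                        mutant[i], mutant[j] = EYE[a], EYE[b]
                        double[a, b] = model(mutant) - base
                additive = single[i][:, None] + single[j][None, :]
                d = np.linalg.norm(double - additive)
                dist[i, j] += d
                dist[j, i] += d
    return dist / len(seqs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seqs = ["".join(rng.choice(list(BASES), size=6)) for _ in range(20)]
    print(np.round(position_importance(toy_model, seqs), 2))
    print(np.round(interaction_map(toy_model, seqs), 2))
```

With this toy model, only the last two positions should show a non-zero entry in the interaction map, because the model is purely additive elsewhere. A real analysis would substitute the trained network's prediction function for `toy_model`.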
