Supplementary Figure 11: Development of an algorithm to predict the AsCpf1-induced indel frequency for a given guide RNA-target sequence pair.
From: In vivo high-throughput profiling of CRISPR–Cpf1 activity

(a) The workflow for developing a model to predict the AsCpf1-induced indel frequency from the measured indel frequencies of 1,251 guide RNA-target sequence pairs. (b) Feature selection using Elastic Net for prediction model development. (Upper panel) The 57 features with non-zero coefficients that gave the fewest misclassification errors were automatically selected from the range of shrinkage values indicated between the two dotted lines. The misclassification error (y-axis) is the error rate associated with classifying an entity into one of two classes; the shrinkage parameter adds information to a raw estimate to regularize the ill-posed problem. (Lower panel) Coefficient values of the 57 selected features.
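The effect of the shrinkage parameter on feature selection in panel (b) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a squared-error loss instead of the misclassification error shown in the figure, uses synthetic data in place of the guide RNA-target sequence features, and the names `elastic_net`, `lam`, and `alpha` are hypothetical. It only shows that stronger shrinkage drives more coefficients to exactly zero, leaving a smaller set of selected features.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t; return exactly 0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def elastic_net(X, y, lam, alpha=0.5, n_iter=200):
    """Coordinate descent for the elastic net with squared-error loss.

    Minimizes (1/2n)*||y - X b||^2 + lam*(alpha*||b||_1 + (1-alpha)/2*||b||^2).
    lam is the shrinkage strength; alpha mixes the L1 and L2 penalties.
    (Assumed formulation for illustration, not taken from the paper.)
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, starting from b = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of feature j with the residual that excludes feature j
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

# Synthetic data: 10 features, only the first two influence the response.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(10)] for _ in range(100)]
y = [2.0 * row[0] - 1.5 * row[1] + random.gauss(0, 0.5) for row in X]

# Count the features that survive at several shrinkage values.
for lam in (0.01, 0.5, 100.0):
    beta = elastic_net(X, y, lam)
    n_selected = sum(b != 0.0 for b in beta)
    print(f"lam={lam}: {n_selected} non-zero coefficients")
```

In practice the shrinkage value would be chosen by cross-validation over a grid, as the upper panel suggests, rather than fixed by hand as in this sketch.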