Supplementary Figure 4: Performance of in vitro trained RBP models on in vivo data | Nature Biotechnology

From: Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning

Supplementary Figure 4

Performance of all RBP models for which RNAcompete in vivo data was available (cf. Ray et al.¹⁹, Fig. 1C). Figure 3d shows only the subset of RBPs for which the in vivo test sequences have an average length <1000. All AUCs are calculated with 100 bootstrap samples, and the standard deviation is shown as vertical lines. “Base counts” show the best performance achievable from ranking test sequences by the proportion of a single nucleotide or by sequence length; for example, ranking the QKI test sequences by 1/(fraction of Gs) gives an AUC of 0.95. There are 9 RBPs for which at least one method performs better than base counts on this test data. RNAcompete PFMs beat base counts for PUM2, SRSF1, FMR1, and Vts1p. DeepBind beats base counts for 8 RBPs (no significant improvement for FMR1). See Supplementary Table 3 (“In vivo AUCs”) for raw data for this plot.
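
The bootstrap AUC and the “base counts” baseline described in this caption can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the resampling scheme (sampling test sequences with replacement), the use of the sample standard deviation, and the choice to consider both a feature and its inverse are all assumptions. The 1/(fraction of Gs) example suggests that inverse rankings are allowed, which is equivalent to taking 1 - AUC of the forward ranking.

```python
import random
import statistics


def auc(scores, labels):
    """AUC from the Mann-Whitney rank sum; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc(scores, labels, n_boot=100, seed=0):
    """Mean and standard deviation of the AUC over n_boot resamples.

    Assumption: resample test sequences with replacement and skip any
    resample that lacks one of the two classes.
    """
    rng = random.Random(seed)
    n = len(scores)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            aucs.append(auc([scores[i] for i in idx], ys))
    return statistics.mean(aucs), statistics.stdev(aucs)


def best_base_count_auc(seqs, labels):
    """Best AUC from ranking by one nucleotide fraction or by length.

    Assumption: both a feature and its inverse are tried; reversing a
    ranking turns an AUC of a into 1 - a.
    """
    features = {nt: [s.count(nt) / len(s) for s in seqs] for nt in "ACGU"}
    features["length"] = [len(s) for s in seqs]
    best = (0.0, None)
    for name, values in features.items():
        a = auc(values, labels)
        for value, label in ((a, name), (1 - a, "1/" + name)):
            if value > best[0]:
                best = (value, label)
    return best
```

A method would then be said to beat base counts for an RBP when its bootstrap AUC exceeds the value returned by `best_base_count_auc` on the same test sequences; how the significance of that difference was judged is not specified here.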
