Fig. 2: Performance of DeepPBS for predicting binding specificity across protein families for experimentally determined structures. | Nature Methods

From: Geometric deep learning of protein–DNA binding specificity

a, Prediction performance of DeepPBS, along with the ‘groove readout’, ‘shape readout’ and ‘with DNA SeqInfo’ variations, on the benchmark set (biological assemblies corresponding to n = 130 protein chains for each box plot; Supplementary Section 1). MAE, mean absolute error; RMSE, root mean squared error. b, Performance of the DeepPBS and ‘with DNA SeqInfo’ models in the context of the PWM–co-crystal-derived DNA alignment score (Supplementary Section 2). The shaded regions indicate the 95% confidence interval for the corresponding linear fit. The MAE equivalent of this plot is available as Supplementary Fig. 12 and shows similar trends. c, Abundance of protein families (as they appear in PFAM annotations) in the constructed benchmark set (counts >3). d, Performance of the DeepPBS, groove readout and shape readout models across protein families (counts >3) (biological assemblies corresponding to n protein chains for each family, where n is as described in c; total unique n = 130). All benchmark predictions are made by an ensemble average of five models trained via cross-validation. Cross-validation performance of the individual trained models is shown in Supplementary Fig. 5a. For the box plots in a and d, the lower limit represents the lower quartile, the middle line represents the median and the upper limit represents the upper quartile.
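As a rough illustration of the quantities named in this caption, the sketch below averages the position weight matrices (PWMs) predicted by several models, scores the averaged PWM against a reference with MAE and RMSE, and computes the three box limits (lower quartile, median, upper quartile). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, errors are averaged over all PWM entries (the exact aggregation used in the paper is described in its Supplementary Information and is not reproduced here), and the quartile method is one of several common conventions.

```python
from math import sqrt
from statistics import quantiles


def ensemble_average(pwms):
    """Entry-wise mean of PWMs predicted by several models.

    `pwms` is a list of PWMs; each PWM is a list of columns, and each
    column is a list of four base probabilities (assumed layout).
    """
    n_models = len(pwms)
    first = pwms[0]
    return [
        [sum(m[i][j] for m in pwms) / n_models for j in range(len(first[i]))]
        for i in range(len(first))
    ]


def _flat_errors(pred, ref):
    # Entry-wise differences between predicted and reference PWMs.
    return [p - r for p_col, r_col in zip(pred, ref) for p, r in zip(p_col, r_col)]


def mae(pred, ref):
    """Mean absolute error over all PWM entries (assumed aggregation)."""
    errs = _flat_errors(pred, ref)
    return sum(abs(e) for e in errs) / len(errs)


def rmse(pred, ref):
    """Root mean squared error over all PWM entries (assumed aggregation)."""
    errs = _flat_errors(pred, ref)
    return sqrt(sum(e * e for e in errs) / len(errs))


def box_limits(scores):
    """Lower quartile, median and upper quartile of per-chain scores."""
    q1, median, q3 = quantiles(scores, n=4, method="inclusive")
    return q1, median, q3
```

In use, one would call `ensemble_average` on the five cross-validated models' predictions for a protein chain, score the result with `mae` or `rmse`, and pass the per-chain scores for a model or protein family to `box_limits` to obtain the values a box in panels a and d would display.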
