Fig. 2: Performance of DeepFRI in predicting MF-GO terms of experimental structures and protein models.
From: Structure-based protein function prediction using graph convolutional networks

a Precision-recall curves showing the performance of DeepFRI on ~700 protein contact maps (PDB700 dataset) from NATIVE PDB structures (CMAP_NATIVE, black), their corresponding Rosetta-predicted lowest energy (LE) models (CMAP-Rosetta_LE, orange) and DMPfold lowest energy (LE) models (CMAP-DMPFold_LE, red), in comparison to the sequence-only CNN-based method (SEQUENCE, blue). All DeepFRI models are trained only on experimental PDB structures. b Distribution of protein-centric Fmax score over 1500 different Rosetta models from the PDB700 dataset grouped by their TM-score computed against the native structures. Data are represented as boxplots with the center line representing the median, upper and lower edges of the boxes representing the interquartile range, and whiskers representing the data range (0.5 × interquartile range). c An example of DeepFRI predictions for Rosetta models of a lipid-binding protein (PDB id: 1IFC) with different TM-scores computed against its native structure. The DeepFRI output score >0.5 is considered as a significant prediction. Precision-recall curves showing the: d performance of our method, trained only on PDB experimental structures, and evaluated on homology models from SWISS-MODEL (red), in comparison to the CNN-based method (DeepGO) trained only on PDB sequences, and BLAST baselines are shown in blue and gray, respectively; e performance of DeepFRI trained on PDB (blue), SWISS-MODEL (orange) and both PDB and SWISS-MODEL (red) structures in comparison to the BLAST baseline (gray). The dot on the curve indicates where the maximum F-score is achieved (the perfect prediction should have Fmax = 1 at the top right corner of the plot).