Fig. 3: Evaluating prediction performance with MoleculeNet datasets. | Nature Communications
From: A systematic study of key elements underlying molecular property prediction

a Performance of RF on RDKit2D descriptors, MolBERT, GROVER and GROVER_RDKit (performance distribution in Supplementary Fig. 14a). b Performance of RF on RDKit2D descriptors, MolBERT, GROVER and GROVER_RDKit under scaffold split. c Statistical significance for pairwise model comparison in b. d Performance of RF, SVM & XGBoost on RDKit2D descriptors, RNN & MolBERT, and GCN, GIN & GROVER under scaffold split (performance distribution in Supplementary Fig. 14b). e Performance of RF on fixed representations (performance distribution in Supplementary Fig. 14c). f Statistical significance for pairwise model comparison in d. g Statistical significance for pairwise fixed-representation comparison in e. The default metric for the classification datasets (BACE, BBBP, HIV) is the area under the receiver operating characteristic curve (AUROC); for the regression datasets (ESOL, FreeSolv, Lipop) it is the root mean square error (RMSE). Other metrics include the area under the precision–recall curve (AUPRC), positive predictive value (Precision_PPV), negative predictive value (Precision_NPV), mean absolute error (MAE), coefficient of determination (R2) and Pearson correlation coefficient (Pearson_R). Error bars denote the standard deviation over 30 splits. The Mann–Whitney U test is applied in f, g. Data are provided in the Source Data.
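For readers reproducing the statistical comparisons, a minimal sketch of the evaluation steps named in the caption is given below. It computes RMSE for a regression dataset and applies the two-sided Mann–Whitney U test (via `scipy.stats.mannwhitneyu`) to per-split scores of two models, as in panels c, f and g. The score values here are synthetic placeholders, not data from the study.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Synthetic per-split AUROC scores for two hypothetical models over
# 30 splits (stand-ins for the 30 scaffold splits in the figure).
rng = np.random.default_rng(0)
model_a = rng.normal(loc=0.85, scale=0.02, size=30)
model_b = rng.normal(loc=0.83, scale=0.02, size=30)

# Two-sided Mann-Whitney U test for pairwise model comparison,
# as applied in panels f and g.
u_stat, p_value = mannwhitneyu(model_a, model_b, alternative="two-sided")

# RMSE, the default regression metric (illustrative values only).
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.2])
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
```

The same pattern extends to the other reported metrics (MAE, R2, Pearson_R), each computed per split before the error bars (standard deviation over 30 splits) are derived.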