Table 3 ROC AUC and average precision (AP) scores of the RFC models, evaluated by 5-fold cross-validation on the training set with the optimized hyperparameters (5-fold CV: mean performance over the 5 folds), on the validation set alone (Validation), and on the validation set augmented with DATASET2 (V + D2). The highest values obtained with the smallest number of features are highlighted in bold, and the values for the selected RFC model (20 descriptors) are highlighted in italic. A model with 0 features corresponds to a random prediction.
From: Ligand-based machine learning models to classify active compounds for prostaglandin EP2 receptor
Number of descriptors | ROC AUC (5-fold CV) | ROC AUC (Validation) | ROC AUC (V + D2) | AP (5-fold CV) | AP (Validation) | AP (V + D2) |
|---|---|---|---|---|---|---|
0 | 0.5 | 0.5 | 0.5 | 0.46 | 0.47 | 0.004 |
1 | 0.878 | 0.853 | 0.766 | 0.873 | 0.798 | 0.014 |
2 | 0.905 | 0.893 | 0.835 | 0.896 | 0.877 | 0.040 |
3 | 0.910 | 0.906 | 0.845 | 0.902 | 0.878 | 0.202 |
4 | 0.916 | 0.915 | 0.830 | 0.903 | 0.908 | 0.382 |
5 | 0.919 | 0.915 | 0.841 | 0.907 | 0.896 | 0.388 |
6 | 0.927 | 0.935 | 0.885 | 0.914 | 0.918 | 0.507 |
7 | 0.922 | 0.933 | 0.873 | 0.911 | 0.916 | 0.478 |
8 | 0.928 | 0.941 | 0.884 | 0.918 | 0.925 | 0.490 |
9 | 0.925 | 0.934 | 0.881 | 0.915 | 0.924 | 0.469 |
10 | 0.927 | 0.937 | 0.892 | 0.916 | 0.922 | 0.574 |
20 | 0.924 | 0.952 | 0.911 | 0.911 | 0.938 | 0.644 |
30 | 0.927 | 0.952 | 0.915 | 0.914 | 0.941 | 0.658 |
40 | 0.927 | 0.954 | 0.918 | 0.915 | 0.949 | 0.695 |
50 | 0.924 | 0.951 | 0.919 | 0.910 | 0.944 | 0.711 |
60 | 0.925 | 0.954 | 0.920 | 0.910 | 0.947 | 0.712 |
70 | 0.926 | 0.956 | 0.923 | 0.912 | 0.953 | 0.695 |
80 | 0.924 | 0.953 | 0.926 | 0.910 | 0.948 | 0.716 |
90 | 0.924 | 0.953 | 0.922 | 0.911 | 0.947 | 0.709 |
100 | 0.926 | 0.951 | 0.918 | 0.913 | 0.944 | 0.713 |
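
The 0-descriptor row can look odd at first: a random prediction gives a ROC AUC of 0.5 on every set, while its AP ranges from about 0.47 down to 0.004. This follows from a general property of the metrics: the expected AP of a random ranking is roughly the fraction of positives in the evaluated set, whereas ROC AUC does not depend on class balance. The sketch below illustrates this on synthetic labels using only the Python standard library. The function names and the prevalence values (0.47 and 0.004, read off the 0-descriptor row) are assumptions made for illustration; this is not the authors' evaluation code, and the tie handling in `average_precision` is simplified compared with library implementations.

```python
import random


def roc_auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) formulation, averaging tied ranks."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of identical scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive.

    Simplification: tied scores are ordered arbitrarily rather than grouped.
    """
    order = sorted(range(len(labels)), key=lambda k: scores[k], reverse=True)
    hits = 0
    total = 0.0
    for rank, k in enumerate(order, start=1):
        if labels[k] == 1:
            hits += 1
            total += hits / rank
    return total / hits


def random_baseline(prevalence, n=20000, seed=0):
    """Score n synthetic compounds at random and return (ROC AUC, AP)."""
    rng = random.Random(seed)
    n_pos = round(prevalence * n)
    labels = [1] * n_pos + [0] * (n - n_pos)
    scores = [rng.random() for _ in labels]
    return roc_auc(labels, scores), average_precision(labels, scores)


if __name__ == "__main__":
    # Assumed prevalences, taken from the AP of the 0-descriptor row.
    for prevalence in (0.47, 0.004):
        auc, ap = random_baseline(prevalence)
        print(f"prevalence={prevalence}: ROC AUC={auc:.3f}, AP={ap:.3f}")
```

Read this way, the low AP values in the V + D2 column are best compared against the 0.004 random baseline of that column rather than against the AP values of the more balanced validation set.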