Table 1 AUC (area under the curve) and test accuracy for all models, based on 71 nucleoside derivatives

From: Developing a machine learning model for accurate nucleoside hydrogels prediction based on descriptors

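| Model   | Features        | Test accuracy (mean) | Test accuracy (SEM) | AUC (mean) | AUC (SEM) |
|---------|-----------------|----------------------|---------------------|------------|-----------|
| DT*     | Descriptor_4175 | 0.65                 | 0.01                | 0.65       | 0.02      |
| LR      | Descriptor_4175 | 0.65                 | 0.02                | 0.67       | 0.02      |
| RF      | Descriptor_4175 | 0.63                 | 0.01                | 0.72       | 0.02      |
| XGBoost | Descriptor_4175 | 0.63                 | 0.01                | 0.69       | 0.02      |
| DT      | Descriptor_144  | 0.64                 | 0.01                | 0.64       | 0.01      |
| LR      | Descriptor_144  | 0.68                 | 0.02                | 0.80       | 0.02      |
| RF      | Descriptor_144  | 0.67                 | 0.01                | 0.75       | 0.02      |
| XGBoost | Descriptor_144  | 0.64                 | 0.02                | 0.72       | 0.02      |
| DT      | Descriptor_40   | 0.66                 | 0.02                | 0.69       | 0.02      |
| LR      | Descriptor_40   | 0.70                 | 0.01                | 0.81       | 0.02      |
| RF      | Descriptor_40   | 0.67                 | 0.01                | 0.74       | 0.02      |
| XGBoost | Descriptor_40   | 0.65                 | 0.01                | 0.75       | 0.02      |
| DT      | Descriptor_REF# | 0.59                 | 0.02                | 0.63       | 0.02      |
| LR      | Descriptor_REF# | 0.71                 | 0.01                | 0.84       | 0.02      |
| RF      | Descriptor_REF# | 0.67                 | 0.01                | 0.75       | 0.02      |
| XGBoost | Descriptor_REF# | 0.70                 | 0.02                | 0.79       | 0.02      |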

  1. *DT, decision tree; LR, logistic regression; RF, random forest; XGBoost, extreme gradient boosting; SEM, standard error of the mean.
  2. #Descriptor_REF: recursive feature elimination (REF) selects a different optimal number of descriptors for each algorithm: LR, n = 24; XGBoost, n = 16; DT, n = 30; RF, n = 37.
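
The mean and SEM columns above summarize scores over repeated evaluations. The table does not say how many runs were used or exactly how the SEM was derived, so the following is only a minimal sketch, assuming the conventional definition (sample standard deviation divided by the square root of the number of runs). The function name `mean_and_sem` and the example scores are hypothetical and are not taken from the paper.

```python
import math
import statistics


def mean_and_sem(scores):
    """Return (mean, SEM) for a list of per-run scores.

    Assumes the conventional SEM: sample standard deviation / sqrt(n).
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores to estimate the SEM")
    mean = statistics.fmean(scores)
    # statistics.stdev uses the n - 1 (sample) denominator.
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, sem


# Hypothetical per-run test accuracies, NOT values from the paper.
example_scores = [0.70, 0.72, 0.69, 0.71, 0.73]
mean, sem = mean_and_sem(example_scores)
print(f"mean = {mean:.2f}, SEM = {sem:.3f}")
```

Under this definition, a reported SEM of 0.01 to 0.02 next to means of 0.6 to 0.8 would indicate that the run-to-run spread is small relative to the differences between the best and worst models in the table.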