Table 3. Effect of various synthetic data generation and classification techniques on F1-score improvements and reductions in the COVID-19, kidney, and dengue datasets.

From: An enhancement of machine learning model performance in disease prediction with synthetic data generation

| Data | Highest F1-score upgrade | Combined method (upgrade) | Model (upgrade) | Synthetic data (upgrade) | Highest F1-score downgrade | Combined method (downgrade) | Model (downgrade) | Synthetic data (downgrade) |
|---|---|---|---|---|---|---|---|---|
| COVID | +0.15 | ADASYN + DCTGAN + ResNet (proposed) | TabNet | +50% | −0.30 | SMOTE + DCTGAN + ResNet | KNN | Only-synth |
| COVID | +0.12 | NC + DCTGAN + ResNet | XGB | +25% | −0.22 | BS + Sep. + DCTGAN + ResNet | TabNet | +200% |
| COVID | +0.10 | SMOTE + DCTGAN + ResNet | RF | +75% | −0.18 | NC + DCTGAN + ResNet | RF | +100% |
| Kidney | +0.20 | ADASYN + DCTGAN + ResNet (proposed) | TabNet | +100% | −0.25 | NC + Sep. + DCTGAN + ResNet | KNN | Only-synth |
| Kidney | +0.05 | BS + DCTGAN + ResNet | RF | +200% | −0.20 | ADASYN + DCTGAN + ResNet | TabNet | +400% |
| Kidney | +0.03 | NC + DCTGAN + ResNet | XGB | +50% | −0.15 | SMOTE + DCTGAN + ResNet | KNN | +100% |
| Dengue | +0.18 | SMOTE + DCTGAN + ResNet (proposed) | TabNet | +150% | −0.24 | NC + DCTGAN + ResNet | XGB | +150% |
| Dengue | +0.11 | NC + DCTGAN + ResNet | KNN | +75% | −0.19 | SMOTE + DCTGAN + ResNet | RF | +300% |
| Dengue | +0.08 | BS + DCTGAN + ResNet | RF | +100% | −0.17 | BS + Sep. + DCTGAN + ResNet | KNN | +200% |
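The upgrade and downgrade columns report signed changes in F1-score. As a rough illustration of how such a change could be computed, the sketch below assumes each entry is the absolute difference between the F1-score obtained with synthetic data and a baseline F1-score without it; the table itself does not state the baseline, so this reading is an assumption. The function names and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_change(baseline: float, augmented: float) -> float:
    """Signed F1 difference (augmented minus baseline), rounded to two decimals."""
    return round(augmented - baseline, 2)


# Hypothetical counts for illustration only (not results from the paper).
baseline = f1_score(tp=60, fp=20, fn=20)   # model trained on original data
augmented = f1_score(tp=72, fp=8, fn=8)    # model trained with added synthetic data

print(f"baseline F1  = {baseline:.2f}")
print(f"augmented F1 = {augmented:.2f}")
print(f"change       = {f1_change(baseline, augmented):+.2f}")
```

A positive result would correspond to an entry in the upgrade column and a negative result to an entry in the downgrade column, under the assumption stated above.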