Table 2. P-values for the AUROC comparison among DCNN models (DeLong’s test).

From: DCNN models with post-hoc interpretability for the automated detection of glossitis and OSCC on the tongue

| Model | VGG16 | ResNet50 | VGG19 | ResNet152 | Ensemble_2 | Ensemble_4 |
|---|---|---|---|---|---|---|
| VGG16 | | 0.153553 | 0.698381 | 0.537263 | 0.545622 | 0.715626 |
| ResNet50 | 0.153553 | | 0.069682 | 0.417893 | 0.042352* | 0.07332 |
| VGG19 | 0.698381 | 0.069682 | | 0.315307 | 0.828244 | 0.981488 |
| ResNet152 | 0.537263 | 0.417893 | 0.315307 | | 0.222236 | 0.326602 |
| Ensemble_2 | 0.545622 | 0.042352* | 0.828244 | 0.222236 | | 0.810211 |
| Ensemble_4 | 0.715626 | 0.07332 | 0.981488 | 0.326602 | 0.810211 | |

  1. VGG16: Visual Geometry Group 16-layer network; ResNet50: Residual Network with 50 layers; VGG19: Visual Geometry Group 19-layer network; ResNet152: Residual Network with 152 layers; Ensemble_2: ensemble model combining VGG16 and VGG19; Ensemble_4: ensemble model combining VGG16, ResNet50, VGG19, and ResNet152; AUROC: Area Under the Receiver Operating Characteristic curve; 95% CI: 95% Confidence Interval. A P-value of less than 0.05 was considered statistically significant and was marked with an asterisk (*) and bolded.
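
The table does not state how the pairwise P-values were computed beyond naming DeLong's test. As an illustration only, the sketch below shows one common way to run a two-sided DeLong test for two models scored on the same test set. It assumes binary labels (1 = positive, 0 = negative) with at least two samples per class and one continuous score per sample. The function names `placement_values` and `delong_test` are hypothetical and are not taken from the article.

```python
import math
import numpy as np


def placement_values(scores, labels):
    """Return per-positive and per-negative placement values for one model."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # psi[i, j] = 1 if positive i outscores negative j, 0.5 on a tie, else 0
    diff = pos[:, None] - neg[None, :]
    psi = (diff > 0).astype(float) + 0.5 * (diff == 0)
    return psi.mean(axis=1), psi.mean(axis=0)


def delong_test(labels, scores_a, scores_b):
    """Two-sided DeLong test for two correlated AUROCs (assumed binary labels).

    Returns (auc_a, auc_b, p_value).
    """
    labels = np.asarray(labels)
    v10_a, v01_a = placement_values(np.asarray(scores_a, dtype=float), labels)
    v10_b, v01_b = placement_values(np.asarray(scores_b, dtype=float), labels)

    # The AUROC is the mean placement value over the positive samples
    auc_a, auc_b = v10_a.mean(), v10_b.mean()
    m, n = len(v10_a), len(v01_a)

    # 2x2 covariance matrices of the placement values across the two models
    s10 = np.cov(np.vstack([v10_a, v10_b]))
    s01 = np.cov(np.vstack([v01_a, v01_b]))

    # Variance of the AUROC difference
    var = (s10[0, 0] + s10[1, 1] - 2 * s10[0, 1]) / m \
        + (s01[0, 0] + s01[1, 1] - 2 * s01[0, 1]) / n
    if var <= 0:
        # Degenerate case (e.g. identical scores): no evidence of a difference
        return auc_a, auc_b, 1.0

    z = (auc_a - auc_b) / math.sqrt(var)
    # Two-sided P-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return auc_a, auc_b, p_value
```

Calling `delong_test(y_true, scores_model_1, scores_model_2)` for every pair of models would fill a symmetric matrix shaped like the one above, with the diagonal left empty. The pairwise P-values in such a matrix are not adjusted for multiple comparisons.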