Table 3 Comparison of performance metrics of the DenseNet121-based classification model for each outcome class between non-tilted and tilted disc images in the test dataset.

From: Deep learning-based optic disc classification is affected by optic-disc tilt

| Outcome class | Metric | Non-tilted disc | Tilted disc | P-value |
|---|---|---|---|---|
| Normal | Accuracy | 0.962 ± 0.008 | 0.945 ± 0.009 | 0.05 |
| Normal | Sensitivity | 0.967 ± 0.015 | 0.841 ± 0.032 | < 0.01 |
| Normal | Specificity | 0.958 ± 0.013 | 0.976 ± 0.009 | 0.15 |
| Normal | Precision | 0.940 ± 0.017 | 0.914 ± 0.028 | 0.23 |
| Normal | F1 score | 0.953 ± 0.009 | 0.876 ± 0.020 | < 0.01 |
| Glaucoma | Accuracy | 0.962 ± 0.009 | 0.939 ± 0.008 | 0.01 |
| Glaucoma | Sensitivity | 0.964 ± 0.013 | 0.965 ± 0.010 | 0.44 |
| Glaucoma | Specificity | 0.955 ± 0.037 | 0.798 ± 0.044 | 0.02 |
| Glaucoma | Precision | 0.988 ± 0.010 | 0.962 ± 0.008 | 0.03 |
| Glaucoma | F1 score | 0.975 ± 0.006 | 0.964 ± 0.005 | 0.06 |
| Optic disc pallor | Accuracy | 0.960 ± 0.008 | 0.960 ± 0.009 | 0.43 |
| Optic disc pallor | Sensitivity | 0.981 ± 0.012 | 0.980 ± 0.010 | 0.44 |
| Optic disc pallor | Specificity | 0.820 ± 0.058 | 0.608 ± 0.133 | 0.03 |
| Optic disc pallor | Precision | 0.974 ± 0.008 | 0.978 ± 0.007 | 0.37 |
| Optic disc pallor | F1 score | 0.977 ± 0.005 | 0.979 ± 0.005 | 0.40 |
| Optic disc swelling | Accuracy | 0.988 ± 0.005 | 0.985 ± 0.004 | 0.31 |
| Optic disc swelling | Sensitivity | 0.994 ± 0.005 | 0.999 ± 0.002 | 0.38 |
| Optic disc swelling | Specificity | 0.899 ± 0.055 | 0.275 ± 0.189 | < 0.01 |
| Optic disc swelling | Precision | 0.993 ± 0.004 | 0.986 ± 0.004 | 0.10 |
| Optic disc swelling | F1 score | 0.994 ± 0.003 | 0.992 ± 0.002 | 0.31 |

  1. The table displays the means ± standard errors of accuracy, sensitivity, specificity, precision and F1 score for the DenseNet121-based classification model in predicting each outcome class. The P-value is the significance probability for testing the mean difference in each performance metric between classification models developed with the non-tilted vs. tilted disc images, and was derived from 100 bootstrap resamples of the test dataset.
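
For readers who want to see how per-class figures of this kind can be produced, the following is a minimal sketch rather than the authors' code. It assumes each outcome class is scored one-vs-rest (1 = the class, 0 = all other classes) and that the reported standard error is the standard deviation of the metric across bootstrap resamples of the test set. The function names `binary_metrics` and `bootstrap_summary`, the fixed seed, and the toy labels are hypothetical. The P-value calculation is not included.

```python
import random
import statistics


def binary_metrics(y_true, y_pred):
    """One-vs-rest metrics for one outcome class (1 = class, 0 = rest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def bootstrap_summary(y_true, y_pred, n_resamples=100, seed=0):
    """Return {metric: (mean, standard error)} over bootstrap resamples.

    Assumption: the standard error is taken as the standard deviation of
    the metric across resamples drawn with replacement from the test set.
    """
    rng = random.Random(seed)
    n = len(y_true)
    samples = {}
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        metrics = binary_metrics([y_true[i] for i in idx],
                                 [y_pred[i] for i in idx])
        for name, value in metrics.items():
            samples.setdefault(name, []).append(value)
    return {name: (statistics.mean(vals), statistics.stdev(vals))
            for name, vals in samples.items()}


# Toy labels (hypothetical, not from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0] * 10
y_pred = [1, 1, 1, 0, 0, 0, 0, 1] * 10
for name, (mean, se) in bootstrap_summary(y_true, y_pred).items():
    print(f"{name}: {mean:.3f} ± {se:.3f}")
```

Running the same summary separately on the non-tilted and tilted subsets of a test set would give two columns comparable in form to those in the table. Specificity is the least stable of the five metrics when the "rest" group is small, which is consistent with the wide standard errors reported for tilted discs.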