Table 3 Comparative performance for human (clinicians) vs machine (ML-based models) for diagnosing glaucoma.

From: OCT-based diagnosis of glaucoma and glaucoma stages using explainable machine learning

| Decision maker | Stage | Sensitivity | Specificity | Accuracy | AUC | F1-Score |
|---|---|---|---|---|---|---|
| Machine (RF) | Overall Glaucoma | 0.917 ± 0.069 | 0.958 ± 0.018 | 0.938 ± 0.033 | 0.967 ± 0.025 | 0.921 ± 0.038 |
| | Early | 0.909 ± 0.057 | 0.934 ± 0.044 | 0.921 ± 0.024 | 0.947 ± 0.034 | 0.860 ± 0.057 |
| | Moderate | 0.959 ± 0.039 | 0.960 ± 0.039 | 0.960 ± 0.025 | 0.982 ± 0.017 | 0.921 ± 0.035 |
| | Advanced | 1 ± 0 | 0.997 ± 0.007 | 0.998 ± 0.003 | 1 ± 0 | 0.990 ± 0.021 |
| Machine (SVM) | Overall Glaucoma | 0.919 ± 0.051 | 0.974 ± 0.031 | 0.946 ± 0.023 | 0.972 ± 0.018 | 0.935 ± 0.028 |
| | Early | 0.874 ± 0.052 | 0.962 ± 0.037 | 0.918 ± 0.015 | 0.956 ± 0.021 | 0.880 ± 0.027 |
| | Moderate | 0.959 ± 0.038 | 0.969 ± 0.037 | 0.964 ± 0.031 | 0.978 ± 0.020 | 0.945 ± 0.044 |
| | Advanced | 1 ± 0 | 1 ± 0 | 1 ± 0 | 1 ± 0 | 1 ± 0 |
| Machine (KNN) | Overall Glaucoma | 0.808 ± 0.071 | 0.963 ± 0.029 | 0.886 ± 0.032 | 0.926 ± 0.034 | 0.864 ± 0.038 |
| | Early | 0.680 ± 0.115 | 0.910 ± 0.058 | 0.795 ± 0.050 | 0.817 ± 0.051 | 0.689 ± 0.047 |
| | Moderate | 0.828 ± 0.052 | 0.976 ± 0.016 | 0.902 ± 0.021 | 0.908 ± 0.025 | 0.855 ± 0.025 |
| | Advanced | 0.861 ± 0.111 | 1 ± 0 | 0.931 ± 0.055 | 0.931 ± 0.055 | 0.922 ± 0.065 |
| Clinician-1 (Masked) | Overall Glaucoma | 0.797 | 0.838 | 0.816 | - | 0.816 |
| | Early | 0.444 | 0.864 | 0.774 | - | 0.457 |
| | Moderate | 0.714 | 0.966 | 0.917 | - | 0.769 |
| | Advanced | 1 | 1 | 1 | - | 1 |
| Clinician-2 (Masked) | Overall Glaucoma | 0.863 | 0.842 | 0.853 | - | 0.862 |
| | Early | 0.571 | 0.893 | 0.816 | - | 0.600 |
| | Moderate | 0.800 | 0.936 | 0.917 | - | 0.727 |
| | Advanced | 1 | 1 | 1 | - | 1 |
| Clinician-3 (Masked) | Overall Glaucoma | 0.826 | 0.842 | 0.834 | - | 0.837 |
| | Early | 0.500 | 0.893 | 0.795 | - | 0.550 |
| | Moderate | 0.777 | 0.951 | 0.929 | - | 0.786 |
| | Advanced | 1 | 0.983 | 0.985 | - | 0.947 |
| Mean Clinician Performance (Masked) | Overall Glaucoma | 0.826 | 0.842 | 0.834 | - | 0.838 |
| | Early | 0.500 | 0.893 | 0.795 | - | 0.550 |
| | Moderate | 0.777 | 0.951 | 0.929 | - | 0.786 |
| | Advanced | 1 | 0.983 | 0.985 | - | 0.947 |
  1. Performance is reported using a one-vs-one approach for clinicians (using RNFL and GC-IPL data). Highlighted numbers were used to compare the overall performance in terms of accuracy. Reported: mean ± standard deviation.
  2. Significant values are bold.
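
For readers who want to reproduce the kind of figures reported in the table, the following is a minimal sketch of how sensitivity, specificity, accuracy and F1-score follow from the counts of a binary (e.g. one-vs-one) confusion matrix, and how a "mean ± standard deviation" summary could be formed over repeated evaluations. The function names and the fold counts are hypothetical and are not taken from the paper, and the source does not state whether a sample or population standard deviation was used, so the sample form is assumed here.

```python
from statistics import mean, stdev


def binary_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fn count positive (e.g. glaucoma) cases classified correctly/incorrectly;
    tn/fp count negative (e.g. healthy) cases classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


def summarize(values):
    """Return (mean, sample standard deviation); the sample form is an assumption."""
    return mean(values), stdev(values)


# Hypothetical (tp, fp, tn, fn) counts for three evaluation runs; illustrative only.
hypothetical_folds = [(11, 1, 23, 1), (10, 0, 24, 2), (12, 2, 22, 0)]
per_fold = [binary_metrics(*counts) for counts in hypothetical_folds]

for name in ("sensitivity", "specificity", "accuracy", "f1"):
    m, s = summarize([fold[name] for fold in per_fold])
    print(f"{name}: {m:.3f} ± {s:.3f}")
```

AUC is omitted from the sketch because it requires per-sample scores rather than hard labels, which is consistent with the table reporting "-" for the clinicians, who provide only categorical decisions.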