Table 3 Per-class and weighted-average measures for the method in ref. 29, the stand-alone CNNs, and the stacking ensemble of our method.

From: Automatic detection of microaneurysms in optical coherence tomography images of retina using convolutional neural networks and transfer learning

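| Measure | Class | Inception V3 | VGG16 | VGG19 | Xception | Method in ref. 29 | MLP ensemble |
|---|---|---|---|---|---|---|---|
| Accuracy | Abnormal | 0.948 | 0.792 | 0.922 | 0.883 | 0.931 | 0.974 |
| Accuracy | MA | 0.883 | 0.883 | 0.987 | 0.922 | 0.962 | 0.987 |
| Accuracy | Normal | 0.935 | 0.948 | 0.922 | 0.857 | 0.943 | 0.961 |
| Accuracy | Vessel | 0.896 | 0.961 | 0.987 | 0.974 | 0.946 | 1 |
| Accuracy | Weighted average | 0.913 | 0.91 | 0.958 | 0.916 | 0.945 | 0.982 |
| Precision | Abnormal | 0.813 | 0.464 | 1 | 0.619 | 0.952 | 0.928 |
| Precision | MA | 0.682 | 1 | 0.945 | 0.739 | 0.964 | 0.944 |
| Precision | Normal | 0.895 | 0.944 | 0.797 | 1 | 0.97 | 0.947 |
| Precision | Vessel | 0.95 | 1 | 0.963 | 1 | 0.964 | 1 |
| Precision | Weighted average | 0.851 | 0.888 | 0.921 | 0.873 | 0.963 | 0.961 |
| Recall (sensitivity) | Abnormal | 0.929 | 0.929 | 0.571 | 0.929 | 0.86 | 0.928 |
| Recall (sensitivity) | MA | 0.882 | 0.471 | 1 | 1 | 0.913 | 1 |
| Recall (sensitivity) | Normal | 0.85 | 0.85 | 0.95 | 0.45 | 0.907 | 0.9 |
| Recall (sensitivity) | Vessel | 0.731 | 0.885 | 1 | 0.923 | 0.883 | 1 |
| Recall (sensitivity) | Weighted average | 0.831 | 0.792 | 0.909 | 0.818 | 0.890 | 0.961 |
| Specificity | Abnormal | 0.952 | 0.762 | 1 | 0.873 | 0.866 | 0.984 |
| Specificity | MA | 0.883 | 1 | 0.983 | 0.9 | 0.941 | 0.983 |
| Specificity | Normal | 0.964 | 0.982 | 0.912 | 1 | 0.86 | 0.982 |
| Specificity | Vessel | 0.980 | 1 | 0.980 | 1 | 0.887 | 1 |
| Specificity | Weighted average | 0.95 | 0.952 | 0.967 | 0.818 | 0.888 | 0.989 |
| F1-score | Abnormal | 0.867 | 0.619 | 0.727 | 0.743 | – | 0.928 |
| F1-score | MA | 0.769 | 0.64 | 0.971 | 0.85 | – | 0.971 |
| F1-score | Normal | 0.872 | 0.895 | 0.863 | 0.621 | – | 0.923 |
| F1-score | Vessel | 0.826 | 0.939 | 0.981 | 0.96 | – | 1 |
| F1-score | Weighted average | 0.833 | 0.803 | 0.902 | 0.808 | – | 0.961 |
| ROC-AUC score | Weighted average | 0.981 | 0.97 | 0.99 | 0.989 | – | 0.998 |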

  1. Accuracy = (TP + TN)/(TP + TN + FP + FN), Precision = TP/(TP + FP), Recall (sensitivity) = TP/(TP + FN), Specificity = TN/(TN + FP), F1-score = (2 × Recall × Precision)/(Recall + Precision) = (2TP)/(2TP + FP + FN), where TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives for the class in question. ROC-AUC score: the closer this value is to one, the better the model separates the class from the others; a value near 0.5 corresponds to chance-level ranking, and a value near zero means the predictions are systematically inverted.
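
As an illustration of how the per-class and weighted-average measures defined in the footnote can be computed, the following is a minimal sketch and not the authors' evaluation code. It assumes a one-vs-rest reading of TP, TN, FP and FN for each class and a weighted average in which each class is weighted by its number of true samples (its support). The confusion matrix `cm`, the helper names `per_class_metrics` and `weighted_average`, and all counts are hypothetical; only the class names are taken from the table.

```python
from typing import Dict, List

CLASSES = ["Abnormal", "MA", "Normal", "Vessel"]  # class names from the table

# Hypothetical confusion matrix (NOT the paper's data):
# cm[i][j] = number of samples with true class i predicted as class j.
cm = [
    [8, 1, 1, 0],
    [0, 9, 0, 1],
    [2, 0, 18, 0],
    [0, 0, 0, 10],
]


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """One-vs-rest measures for each class of a square confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true k, predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp  # other classes predicted as k
        tn = total - tp - fn - fp
        results.append({
            "support": tp + fn,
            "accuracy": (tp + tn) / total,
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0,
            "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
        })
    return results


def weighted_average(metrics: List[Dict[str, float]], key: str) -> float:
    """Average of a per-class measure, weighted by class support (assumed scheme)."""
    total_support = sum(m["support"] for m in metrics)
    return sum(m[key] * m["support"] for m in metrics) / total_support


metrics = per_class_metrics(cm)
for name, m in zip(CLASSES, metrics):
    print(name, {k: round(v, 3) for k, v in m.items() if k != "support"})
for key in ("accuracy", "precision", "recall", "specificity", "f1"):
    print("Weighted average", key, round(weighted_average(metrics, key), 3))
```

Under this weighting, the weighted-average recall equals the overall fraction of correctly classified samples, because each class's recall multiplied by its support is simply its TP count. This gives a quick consistency check on any reported set of per-class recalls.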