Table 1 Classification performance metrics for different CNN architectures on the NIH “ChestX-ray 14” database.

From: Automated abnormality classification of chest radiographs using deep convolutional neural networks

| Models | AUC | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | F1 score | Accuracy (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| AlexNet (P) | 0.9741 ± 0.0050 | 94.18 ± 0.47 | 87.70 ± 0.56 | 87.66 ± 0.61 | 94.10 ± 0.41 | 0.9091 ± 0.0057 | 90.85 ± 0.48 |
| AlexNet (S) | 0.9684 ± 0.0043 | 92.65 ± 0.45 | 87.99 ± 0.41 | 87.94 ± 0.57 | 92.68 ± 0.38 | 0.9023 ± 0.0052 | 90.25 ± 0.45 |
| VGG16 (P) | 0.9797 ± 0.0039 | 94.03 ± 0.36 | 90.74 ± 0.41 | 90.56 ± 0.45 | 94.14 ± 0.43 | 0.9226 ± 0.0038 | 92.34 ± 0.40 |
| VGG16 (S) | 0.9742 ± 0.0044 | 93.42 ± 0.40 | 91.46 ± 0.46 | 91.18 ± 0.50 | 93.63 ± 0.46 | 0.9228 ± 0.0040 | 92.41 ± 0.42 |
| VGG19 (P) | 0.9842 ± 0.0036 | 97.09 ± 0.39 | 87.99 ± 0.35 | 88.42 ± 0.41 | 96.97 ± 0.43 | 0.9255 ± 0.0035 | 92.41 ± 0.33 |
| VGG19 (S) | 0.9757 ± 0.0054 | 94.49 ± 0.59 | 88.86 ± 0.49 | 88.90 ± 0.56 | 94.46 ± 0.47 | 0.9161 ± 0.0048 | 91.59 ± 0.50 |
| ResNet18 (P) | 0.9824 ± 0.0043 | 96.50 ± 0.36 | 92.86 ± 0.48 | 92.84 ± 0.55 | 96.52 ± 0.30 | 0.9463 ± 0.0041 | 94.64 ± 0.45 |
| ResNet18 (S) | 0.9766 ± 0.0034 | 96.63 ± 0.41 | 85.09 ± 0.33 | 85.97 ± 0.47 | 96.39 ± 0.36 | 0.9099 ± 0.0034 | 90.70 ± 0.38 |
| ResNet50 (P) | 0.9837 ± 0.0048 | 96.94 ± 0.50 | 88.42 ± 0.61 | 88.78 ± 0.73 | 96.83 ± 0.39 | 0.9268 ± 0.0055 | 92.56 ± 0.54 |
| ResNet50 (S) | 0.9775 ± 0.0057 | 94.32 ± 0.54 | 90.59 ± 0.66 | 90.43 ± 0.75 | 94.42 ± 0.44 | 0.9233 ± 0.0059 | 92.40 ± 0.60 |
| Inception-v3 (P) | 0.9866 ± 0.0041 | 97.38 ± 0.35 | 87.57 ± 0.48 | 88.11 ± 0.55 | 97.26 ± 0.27 | 0.9250 ± 0.0051 | 92.33 ± 0.42 |
| Inception-v3 (S) | 0.9796 ± 0.0034 | 95.08 ± 0.32 | 89.58 ± 0.35 | 89.58 ± 0.42 | 95.08 ± 0.23 | 0.9225 ± 0.0047 | 92.25 ± 0.37 |
| DenseNet121 (P) | 0.9871 ± 0.0057 | 97.40 ± 0.53 | 87.55 ± 0.68 | 88.09 ± 0.74 | 97.27 ± 0.33 | 0.9251 ± 0.0056 | 92.34 ± 0.56 |
| DenseNet121 (S) | 0.9801 ± 0.0044 | 95.10 ± 0.38 | 90.01 ± 0.49 | 90.00 ± 0.61 | 95.11 ± 0.27 | 0.9248 ± 0.0041 | 92.49 ± 0.44 |

  1. CNN model predictions were compared with the consensus labels of three board-certified radiologists.
  2. AUC: area under the receiver operating characteristic curve. PPV: positive predictive value (or precision). NPV: negative predictive value.
  3. P: model weights were initialized from the ImageNet pre-trained model. S: random initialization of model weights, i.e., training from scratch.
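
For readers who want to relate the columns to one another, the following is a minimal sketch of how the threshold-based metrics in the table (sensitivity, specificity, PPV, NPV, F1 score, accuracy) are conventionally derived from a binary confusion matrix. It uses the standard textbook definitions; the function name `metrics_from_counts`, the `BinaryMetrics` container, and the example counts are hypothetical and are not taken from the article, and the article's own evaluation code may differ. AUC is not included because it depends on the ranked prediction scores rather than on the four counts alone.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    """Threshold-based metrics for a binary (normal vs. abnormal) classifier."""
    sensitivity: float  # true positive rate (recall)
    specificity: float  # true negative rate
    ppv: float          # positive predictive value (precision)
    npv: float          # negative predictive value
    f1: float           # harmonic mean of PPV and sensitivity
    accuracy: float     # fraction of all cases classified correctly


def metrics_from_counts(tp: int, fp: int, tn: int, fn: int) -> BinaryMetrics:
    """Compute the metrics from confusion-matrix counts (hypothetical helper).

    Assumes every denominator is non-zero, i.e. the data contain both
    classes and the model predicts both classes at least once.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return BinaryMetrics(sensitivity, specificity, ppv, npv, f1, accuracy)


if __name__ == "__main__":
    # Illustrative counts only; these are NOT results from the article.
    m = metrics_from_counts(tp=90, fp=20, tn=80, fn=10)
    print(f"Sensitivity: {100 * m.sensitivity:.2f}%")
    print(f"Specificity: {100 * m.specificity:.2f}%")
    print(f"PPV:         {100 * m.ppv:.2f}%")
    print(f"NPV:         {100 * m.npv:.2f}%")
    print(f"F1 score:    {m.f1:.4f}")
    print(f"Accuracy:    {100 * m.accuracy:.2f}%")
```

Because the table reports each metric as mean ± spread, presumably aggregated over repeated evaluations, recomputing one column from the reported means of the others (for example, F1 from the mean PPV and mean sensitivity) will generally reproduce the listed value only approximately.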