Table 3. Accuracy of the Google Teachable Machine algorithms (external validation; cutoff level = 0.5).

From: A deep learning-based algorithm for pulmonary tuberculosis detection in chest radiography

 

| | AUC | Sen | Sp | PPV | NPV | LR+ | LR− | OA | F1 score |
|---|---|---|---|---|---|---|---|---|---|
| Validation dataset 1 (TB vs. normal) | | | | | | | | | |
| Model 1 | 0.800 | 0.65 | 0.89 | 0.85 | 0.72 | 5.90 | 0.39 | 0.77 | 0.73 |
| Model 2 | 0.902 | 0.83 | 0.83 | 0.82 | 0.83 | 4.88 | 0.20 | 0.83 | 0.82 |
| Model 3 | 0.951 | 0.88 | 0.95 | 0.94 | 0.88 | 17.6 | 0.12 | 0.91 | 0.91 |
| Validation dataset 1 (TB vs. normal and other abnormality) | | | | | | | | | |
| Model 1 | 0.720 | 0.65 | 0.73 | 0.54 | 0.80 | 2.40 | 0.47 | 0.70 | 0.59 |
| Model 2 | 0.656 | 0.83 | 0.44 | 0.42 | 0.83 | 1.48 | 0.38 | 0.56 | 0.56 |
| Model 3 | 0.758 | 0.88 | 0.52 | 0.47 | 0.89 | 1.83 | 0.23 | 0.64 | 0.61 |
| Validation dataset 2 (TB vs. normal) | | | | | | | | | |
| Model 1 | 0.795 | 0.68 | 0.93 | 0.93 | 0.65 | 9.71 | 0.34 | 0.78 | 0.78 |
| Model 2 | 0.917 | 0.86 | 0.90 | 0.92 | 0.81 | 8.60 | 0.15 | 0.87 | 0.89 |
| Model 3 | 0.975 | 0.86 | 1.0 | 1.0 | 0.82 | Infinity | 0.14 | 0.91 | 0.92 |
| Validation dataset 2 (TB vs. normal and other abnormality) | | | | | | | | | |
| Model 1 | 0.752 | 0.68 | 0.81 | 0.76 | 0.73 | 3.57 | 0.39 | 0.74 | 0.72 |
| Model 2 | 0.718 | 0.86 | 0.58 | 0.65 | 0.82 | 2.04 | 0.24 | 0.71 | 0.74 |
| Model 3 | 0.828 | 0.86 | 0.65 | 0.69 | 0.83 | 2.45 | 0.21 | 0.75 | 0.76 |
AUC, area under the curve; LR, likelihood ratio; NPV, negative predictive value; OA, overall accuracy; PPV, positive predictive value; Sen, sensitivity; Sp, specificity.
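
The likelihood ratios and F1 score reported in the table follow from the other columns by the standard definitions: LR+ = Sen / (1 − Sp), LR− = (1 − Sen) / Sp, and F1 = 2 · PPV · Sen / (PPV + Sen). The sketch below is only an illustration of those definitions; the function names are hypothetical and are not taken from the article. Because the table reports inputs rounded to two decimals, values recomputed this way may differ from the published ones in the last digit.

```python
import math
from typing import Tuple


def likelihood_ratios(sen: float, sp: float) -> Tuple[float, float]:
    """Return (LR+, LR-) from sensitivity and specificity (hypothetical helper)."""
    # LR+ = Sen / (1 - Sp); it is unbounded when specificity is exactly 1,
    # which is why a row with Sp = 1.0 shows "Infinity".
    lr_pos = math.inf if sp >= 1.0 else sen / (1.0 - sp)
    # LR- = (1 - Sen) / Sp; unbounded when specificity is 0.
    lr_neg = math.inf if sp <= 0.0 else (1.0 - sen) / sp
    return lr_pos, lr_neg


def f1_score(ppv: float, sen: float) -> float:
    """Return F1 as the harmonic mean of PPV (precision) and Sen (recall)."""
    if ppv + sen == 0.0:
        return 0.0
    return 2.0 * ppv * sen / (ppv + sen)


if __name__ == "__main__":
    # Validation dataset 1 (TB vs. normal), Model 3: Sen = 0.88, Sp = 0.95, PPV = 0.94
    lr_pos, lr_neg = likelihood_ratios(0.88, 0.95)
    print(round(lr_pos, 2), round(lr_neg, 2), round(f1_score(0.94, 0.88), 2))

    # Validation dataset 2 (TB vs. normal), Model 3: Sen = 0.86, Sp = 1.0
    print(likelihood_ratios(0.86, 1.0))
```

Returning `math.inf` rather than raising on Sp = 1.0 mirrors how the table reports that case; a caller that prefers an explicit error could check for it with `math.isinf`.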