Table 2 Radiologist and Deep Learning System Performance for Chest Radiographs and Tuberculosis.

From: Using artificial intelligence to read chest radiographs for tuberculosis detection: A multi-site evaluation of the diagnostic accuracy of three deep learning systems

 

| Site | Human reader | Human readers: Accuracy | Human readers: Sensitivity (95% CI) | Human readers: Specificity (95% CI) | CAD4TB (v6): Accuracy | CAD4TB (v6): Sensitivity (95% CI) | CAD4TB (v6): Specificity (95% CI) | Lunit (v4.7.2): Accuracy | Lunit (v4.7.2): Sensitivity (95% CI) | Lunit (v4.7.2): Specificity (95% CI) | qXR (v2): Accuracy | qXR (v2): Sensitivity (95% CI) | qXR (v2): Specificity (95% CI) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Nepal | Senior Radiologist | 0.57 | 0.96 (0.89–0.99) | 0.48 (0.43–0.53) | 0.74 | 0.96 (0.89–0.99) | 0.69 (0.64–0.73) | 0.67 | 0.96 (0.89–0.99) | 0.6 (0.55–0.65) | 0.7 | 0.97* (0.91–0.99) | 0.65 (0.6–0.69) |
| Nepal | Junior Radiologist & Residents | 0.72 | 0.87 (0.79–0.93) | 0.69 (0.64–0.73) | 0.77 | 0.87 (0.79–0.93) | 0.75 (0.71–0.79) | 0.85 | 0.87 (0.79–0.93) | 0.78 (0.73–0.82) | 0.69 | 0.87 (0.79–0.93) | 0.81 (0.76–0.84) |
| Cameroon | Radiologist | 0.74 | 0.8 (0.52–0.96) | 0.74 (0.71–0.78) | 0.9 | 0.8 (0.52–0.96) | 0.9 (0.87–0.92) | 0.94 | 0.8 (0.52–0.96) | 0.94 (0.92–0.96) | 0.94 | 0.8 (0.52–0.96) | 0.95 (0.93–0.96) |
| Cameroon | Teleradiology Company | 0.74 | 0.8 (0.52–0.96) | 0.74 (0.71–0.77) | 0.9 | 0.8 (0.52–0.96) | 0.9 (0.87–0.92) | 0.94 | 0.8 (0.52–0.96) | 0.94 (0.92–0.96) | 0.94 | 0.8 (0.52–0.96) | 0.95 (0.93–0.96) |

*The qXR (v2) sensitivity closest to that of the senior radiologist was 97% rather than 96%.