Table 1 Summarized results of performance assessments of DL algorithm and radiologists.

From: Development and clinical application of deep learning model for lung nodules screening on CT images

| Data set | Rater | Time consumption per study (min) | FROC score (95% CI) | ROC sensitivity, % (95% CI) | ROC specificity, % (95% CI) | AUC (95% CI) |
| --- | --- | --- | --- | --- | --- | --- |
| LUNA (n = 888) | DL algorithm | 0.08 | 0.80 [0.78, 0.82] | 82 [73, 94] | 82 [70, 91] | 0.90 [0.88, 0.92] |
| Multi-centre validation (n = 582) | DL algorithm | 0.09 | 0.75 [0.73, 0.78] | 73 [63, 86] | 85 [73, 96] | 0.86 [0.84, 0.90] |
| Multi-centre validation (n = 582) | Radiologists | 1.71* [0.87, 3.23] | NA§ | 83 [75, 89] | 64 [51, 77] | 0.73 [0.68, 0.78] |

*Median [min, max] per-study time consumption, in minutes, in the multi-centre validation set.
§The FROC score could not be calculated for individual radiologists.