Table 2 Summary of the data extracted for each paper included in our systematic review
Reference | Diagnosis/prognosis | Data used in model | Predictors | Sample size development | Sample size test | Type of validation | Evaluation | Public code |
---|---|---|---|---|---|---|---|---|
| | Is this paper describing a COVID-19 diagnosis or prognosis model (or both)? | Does this use CXR or CT (or both)? | What are the predictors? In purely deep learning models, this is DL. | Total sample size used for development (that is, training and validation, NOT the test set), along with the number of positive outcomes. | Total sample size used for testing of the algorithm, along with the number of positive outcomes. | k-fold CV, external validation in k centres, no validation and so on. | Performance of the model: AUC, sensitivity, specificity and so on, with 95% CI if available. | Is there code available? (Is the trained model available?) |
Ghoshal and Tucker17 | Diagnosis | CXR | DL | 4,752 images, 54 COVID-19 | 1,189 images, 14 COVID-19 | Unclear validation procedure | Unclear in the paper | No |
Li et al.34 | Diagnosis | CXR | DL | 429 images, 143 COVID-19 | 108 images, 36 COVID-19 | Internal holdout validation | Accuracy, 0.880; AUC, 0.970 | Yes (Yes) |
Ezzat et al.28 | Diagnosis | CXR | DL | Unclear in the paper | Unclear in the paper | Internal holdout validation | Precision (w), 0.98; recall (w), 0.98; F1 score (w), 0.98 | No |
Tartaglione et al.16 | Diagnosis | CXR | DL | 231 images, 126 COVID-19 | 135 images, 90 COVID-19 | Internal holdout validation | Unclear in the paper | No |
Luz et al.30 | Diagnosis | CXR | DL | 13,569 images, 152 COVID-19 | 231 images, 31 COVID-19 | Internal holdout validation | Accuracy, 0.94; sensitivity, 0.97; PPV, 1.00 | Yes (Yes) |
Bassi and Attux31 | Diagnosis | CXR | DL | 2,724 images, 159 COVID-19 | 180 images, 60 COVID-19 | Internal holdout validation | Recall, 0.98; precision, 1.00 | No |
Gueguim Kana et al.32 | Diagnosis | CXR | DL | Unclear in the paper | Unclear in the paper | External validation | Accuracy, 0.99; recall, 1.00; precision, 0.99; F1 score, 1.00 | No |
Heidari et al.33 | Diagnosis | CXR | DL | 8,474 images, 415 COVID-19 | 848 images, 42 COVID-19 | Internal holdout validation | Precision (w), 0.95; recall (w), 0.94; F1 score (w), 0.94 | No
Farooq and Hafeez29 | Diagnosis | CXR | DL | Unclear in the paper | 637 images, 8 COVID-19 | Internal holdout validation | Accuracy, 0.96; sensitivity, 0.97; PPV, 0.99; F1 score, 0.98 | No |
Zhang et al.27 | Diagnosis | CXR | DL | 5,236 images, 2,582 COVID-19 | 5,869 images, 3,223 COVID-19 | Internal holdout validation | AUC, 0.92; sensitivity, 0.88; specificity, 0.79 | Yes (No) |
Zhang et al.37 | Diagnosis | CXR | DL | 386 images, 150 COVID-19 | 101 images, 39 COVID-19 | Internal holdout validation | Accuracy, 0.91 | No |
Wang et al.26 | Diagnosis | CXR | DL | 3,522 images, 204 COVID-19 | 61 images, 20 COVID-19 | Internal holdout validation | AUC, 1.00; accuracy, 0.99 | No |
Bararia et al.25 | Diagnosis | CXR | DL | Unclear in the paper | 1,000 images, 341 COVID-19 | Internal holdout validation | Accuracy, 0.81; sensitivity, 0.81; specificity, 0.90; precision, 0.74; recall, 0.77; F1 score, 0.75 | No |
Tsiknakis et al.21 | Diagnosis | CXR | DL | 458 (CV) images, 98 COVID-19 | 114 (CV) images, 24 COVID-19 | Fivefold internal cross-validation | AUC, 1.00; accuracy, 1.00; sensitivity, 0.99; specificity, 1.00 | Yes (No) |
Malhotra et al.18 | Diagnosis | CXR | DL | 26,464 images, 1,740 COVID-19a | 6,299 images, 125 COVID-19a | Internal holdout validation | Sensitivity, 0.87; specificity, 0.97 | No |
Sayyed et al.36 | Diagnosis | CXR | DL | 5,018 (CV) images, 334 COVID-19 | 1,255 (CV) images, 83 COVID-19 | Fivefold internal cross-validation | Accuracy, 0.99 ± 0.05 | Yes (No) |
Rahaman et al.19 | Diagnosis | CXR | DL | 720 images, 220 COVID-19 | 140 images, 40 COVID-19 | Internal holdout validation | Accuracy, 0.89; precision, 0.90; recall, 0.89; F1 score, 0.90 | No |
Amer et al.20 | Diagnosis | CXR | DL | Unclear in the paper | Unclear in the paper | Internal holdout validation | AUC, 0.98; accuracy, 0.94; sensitivity, 0.92; specificity, 0.97; PPV, 0.98 | No |
Elaziz et al.22 | Diagnosis | CXR | Hand-engineered radiomic features | Unclear in the paper | Unclear in the paper | Internal holdout validation and external validation | Internal validation: accuracy, 0.96; recall, 0.99; precision, 0.96. External validation: accuracy, 0.98; recall, 0.99; precision, 0.99 | No
Tamal et al.24 | Diagnosis | CXR | Hand-engineered radiomic features | 378 images, 226 COVID-19 | 165 images, 115 COVID-19 | Internal holdout validation | Sensitivity, 1.00; specificity, 0.85 | Nob
Gil et al.23 | Diagnosis | CXR | Hand-engineered radiomic features | Unclear in the paper | Unclear in the paper | Internal holdout validation | Accuracy, 0.96; sensitivity, 0.98; specificity, 0.93; precision, 0.96 | Yes (Yes) |
Zokaeinikoo et al.35 | Diagnosis | CXR and CT | DL | Unclear in the paper | Unclear in the paper | Tenfold internal cross-validation | Accuracy, 0.99; sensitivity, 0.99; specificity, 1.00; PPV, 1.00 | No |
Amyar et al.44 | Diagnosis | CT | DL | 944 patients, 399 COVID-19 | 100 patients, 50 COVID-19 | Internal holdout validation | Accuracy, 0.95; sensitivity, 0.96; specificity, 0.92; AUC, 0.97 | No |
Ardakani et al.45 | Diagnosis | CT | DL | Unclear as splits do not total correctly | Unclear as splits do not total correctly | Internal holdout validation | AUC, 0.99; sensitivity, 1.00; specificity, 0.99; accuracy, 1.00; PPV, 0.99; NPV, 1.00 | No |
Bai et al.81 | Diagnosis | CT | DL | 118,401 images, 60,776 COVID-19 | 14,182 images, 5,040 COVID-19 | Internal holdout validation | AUC, 0.95; accuracy, 0.96; sensitivity, 0.95; specificity, 0.96 | Yes (Yes) |
Jin et al.50 | Diagnosis | CT | DL | 1,136 images, 723 COVID-19 | 282 images, 154 COVID-19 | Internal holdout validation | Sensitivity, 0.97; specificity, 0.92; AUC, 0.99 | No |
Wang et al.42 | Diagnosis | CT | DL | 320 images, 160 COVID-19 | Internal validation: 455 images, 95 COVID-19. External validation: 290 images, 70 COVID-19 | Internal holdout validation and external validation | Internal validation: AUC, 0.93 [0.90, 0.96]. External validation: AUC, 0.81 [0.71, 0.84] | No
Ko et al.41 | Diagnosis | CT | DL | 3,194 (CV) images, 955 COVID-19 | Internal cross-validation: 799 (CV) images, 239 COVID-19. External validation: 264 images, all COVID-19 | Fivefold internal cross-validation and external validation | Internal validation: AUC, 1.00; accuracy, 1.00; sensitivity, 1.00; specificity, 1.00. External validation: accuracy, 0.97 | No
Acar et al.48 | Diagnosis | CT | DL | 2,552 images, 1,085 COVID-19 | 580 images, 246 COVID-19 | Internal holdout validation | AUC, 1.00; accuracy, 1.00; error, 0.01; precision, 1.00; recall, 1.00; F1 score, 1.00 | No |
Pu et al.43 | Diagnosis | CT | DL | Unclear in the paper | Unclear in the paper | Internal holdout validation | AUC, 0.70 [0.56, 0.85]; sensitivity, 0.98; specificity, 0.28 | No |
Chen et al.49 | Diagnosis | CT | DL | 770 (CV) images, 413 COVID-19 | Internal cross-validation: 86 (CV) images, 46 COVID-19 | Tenfold internal cross-validation | AUC, 0.94 ± 0.01; accuracy, 0.88 ± 0.01; precision, 0.90 ± 0.01; recall, 0.88 ± 0.01 | No |
Shah et al.52 | Diagnosis | CT | DL | 664 images, 314 COVID-19 | 74 images, 35 COVID-19 | Internal holdout validation | Accuracy, 0.95 | No |
Han et al.47 | Diagnosis | CT | DL | 368 (CV) images, 184 COVID-19 | 92 (CV) images, 46 COVID-19 | Fivefold internal cross-validation | AUC, 0.99; accuracy, 0.98 | Nob |
Wang et al.53 | Diagnosis | CT | DL | 3,997 images, 1,095 COVID-19 | 600 images, 200 COVID-19 | Internal holdout validation | AUC, 0.97; accuracy, 0.93; specificity, 0.96; precision, 0.88; recall, 0.88 | No |
Wang et al.54 | Diagnosis | CT | DL | 2,447 images, 1,647 COVID-19 | Internal validation: 639 images, 439 COVID-19. External validation: 2,120 images, 217 COVID-19 | Internal holdout validation and external validation | Internal validation: AUC, 0.99; sensitivity, 0.97; specificity, 0.85. External validation: AUC, 0.95; sensitivity, 0.92; specificity, 0.85 | No
Goncharov et al.71 | Diagnosis and severity prognosis | CT | DL | Unclear in the paper | Diagnosis: 101 images, 33 COVID-19. Severity: 38 images of differing severity | Internal holdout validation | Diagnosis model: AUC, 0.95. Severity model: correlation, 0.98 | Noc
Xie et al.61 | Diagnosis | CT | Hand-engineered radiomic features | 225 images, 27 COVID-19 | 76 images, 6 COVID-19 | Internal holdout validation | AUC, 0.91; accuracy, 0.90; sensitivity, 0.83; specificity, 0.90 | No |
Xu et al.62 | Diagnosis | CT | DL and hand-engineered radiomic features | 551 images, 289 COVID-19 | 138 images, 73 COVID-19 | Internal holdout validation | Accuracy, 0.98; F1 score, 0.99 | Nod |
Qin et al.60 | Diagnosis | CT | Hand-engineered radiomic features | 118 patients, 62 COVID-19 | 50 patients, 26 COVID-19 | Internal holdout validation | AUC, 0.85 [0.74, 0.96]; sensitivity, 0.89; specificity, 0.92 | No |
Georgescu et al.40 | Diagnosis | CT | DL and hand-engineered radiomic features | 1,902 patients, 1,050 COVID-19 | 194 patients, 100 COVID-19 | Internal holdout validation | AUC, 0.90; sensitivity, 0.86; specificity, 0.81 | No |
Guiot et al.58 | Diagnosis | CT | Hand-engineered radiomic features | Unclear in the paper | Unclear in the paper | Internal holdout validation | AUC, 0.94 [0.88, 1.00]; accuracy, 0.90 [0.84, 0.94]; sensitivity, 0.79; specificity, 0.91 | No |
Shi et al.57 | Diagnosis | CT | Hand-engineered radiomic features | 2,148 (CV) images, 1,326 COVID-19 | Internal cross-validation: 537 (CV) images, 332 COVID-19 | Fivefold internal cross-validation | AUC, 0.94; accuracy, 0.88; sensitivity, 0.91; specificity, 0.83 | No |
Mei et al.46 | Diagnosis | CT | DL and CNN extracted features and clinical data | 626 images, 285 COVID-19 | 279 images, 134 COVID-19 | Internal holdout validation | AUC, 0.92 [0.89, 0.95]; sensitivity, 0.843 [0.77, 0.90]; specificity, 0.83 [0.76, 0.89] | Yes (Yes) |
Chen et al.59 | Diagnosis | CT | Clinical features, qualitative imaging features and hand-engineered radiomic imaging features | 98 patients, 51 COVID-19 | 38 images, 19 COVID-19 | Internal holdout validation | AUC, 0.94 [0.87, 1.00]; accuracy, 0.76; sensitivity, 0.74; specificity, 0.79 | No |
Wang et al.51 | Diagnosis and prognosis for length of hospital stay | CT | Diagnosis model: DL. Prognosis model: 64 CNN features and clinical factors | 709 images, 560 COVID-19 | Validation 1: 226 images, 102 COVID-19. Validation 2: 161 images, 92 COVID-19. Validation 3: 53 images, all with length of hospital stay. Validation 4: 117 images, all with length of hospital stay | External validation | Validation 1 (diagnosis): AUC, 0.87. Validation 2 (diagnosis): AUC, 0.88. Validation 3 (prognosis): KM separation, P = 0.01. Validation 4 (prognosis): KM separation, P = 0.01 | Yes (Yes)
Li et al.66 | Prognosis for severity | CXR | DL | 354 images of differing severities | Internal validation: 108 images. External validation: 111 images | Internal holdout validation and external validation | Internal validation: correlation, 0.88. External validation: correlation, 0.90 | Yes (No)
Li et al.67 | Prognosis for severity | CXR | DL | 314 images of differing severities | Internal validation: 154 images. External validation: 113 images | Internal holdout validation and external validation | Internal validation: correlation, 0.86. External validation: correlation, 0.86 | Yes (No)
Schalekamp et al.68 | Prognosis for severity | CXR | Hand-engineered radiomic features and clinical factors | Unclear in the paper | Unclear in the paper | Internal holdout validation | AUC, 0.77 | No |
Cohen et al.76 | Prognosis of lung opacity and extent of lung involvement with GGOs for patients with COVID-19 | CXR | Features from a trained CNN extracted at various layers | 47 patients of varying severity | 47 patients of varying severity | Internal holdout validation | Opacity correlation, 0.80; extent correlation, 0.78 | Yes (Yes) |
Yue et al.74 | Prognosing short- and long-term (>10 days) hospital stay for patients with COVID-19 | CT | Hand-engineered radiomic features | 26 patients, 16 long term | Internal validation: 5 patients, 3 long term. Temporal-split internal validation: 6 patients, all long term | Internal holdout and temporal-split validation | AUC, 0.97 [0.83, 1.00]; sensitivity, 1.00; specificity, 0.89; NPV, 1.00; PPV, 0.80 | Yesd
Zhu et al.75 | Prognosis for whether patients will convert to a severe stage of COVID-19, plus regression to predict the time to that conversion | CT | Hand-engineered radiomic features | Unclear in the paper | Unclear in the paper | Fivefold internal cross-validation run 20 times, average reported | AUC, 0.86 ± 0.02; accuracy, 0.86 ± 0.02; sensitivity, 0.77 ± 0.03; specificity, 0.88 ± 0.015 | No
Lassau et al.73 | Prognosis for the risk of death, need for ventilation or requirement for more than 15 l min⁻¹ oxygen | CT | CNN extracted features and clinical data | 646 patients, all with COVID-19; 243 with severe outcomes | Internal validation: 150 images, all COVID-19, 48 with severe outcomes. External validation: 135 patients, all with COVID-19, unclear number of severe patients | Internal holdout validation and external validation | Internal validation: AUC, 0.76. External validation: AUC, 0.75 | Noc
Chassagnon et al.63 | Short-term prognosis: intubation and death within four days. Long-term prognosis: death within one month after CT | CT | Hand-engineered radiomic features and clinical data | 536 patients with COVID-19, 108 severe short-term outcomes, unclear for long term | 157 patients with COVID-19, 31 severe short-term outcomes, unclear for long term | External validation | Short-term prognosis: precision (w), 0.94; sensitivity (w), 0.94; specificity (w), 0.81; balanced accuracy, 0.88. Long-term prognosis: precision (w), 0.77; sensitivity (w), 0.94; specificity (w), 0.82; balanced accuracy, 0.71 | Nob
Chao et al.77 | Prognosing for ICU admission | CT | Hand-engineered radiomic features and clinical data | 236 (CV) images, 125 admitted to ICU | 59 (CV) images, 31 admitted to ICU | Fivefold internal cross-validation | Unclear in the paper | No |
Wu et al.78 | Prognosing for death, ventilation and ICU admission in early- and late-stage COVID-19 | CT | Hand-engineered radiomic features | 351 images, 25 severe outcomes | 141 images, 26 severe outcomes | External validation | Early-stage COVID-19: AUC, 0.86; sensitivity, 0.80; specificity, 0.86. Late-stage COVID-19: AUC, 0.98; sensitivity, 1.00; specificity, 0.94 | No
Zheng et al.79 | Prognosing for admission to an ICU, use of mechanical ventilation or death | CT | Hand-engineered radiomic features and clinical data | 166 images, 35 severe outcomes | 72 images, 10 severe outcomes | External validation | C index, 0.89 | No |
Chen et al.80 | Prognosis for acute respiratory distress syndrome | CT | Hand-engineered radiomic features and clinical data | 247 images, 36 severe cases | 105 images, 15 severe cases | Internal holdout validation | Accuracy, 0.88; sensitivity, 0.55; specificity, 0.95 | No |
Ghosh et al.64 | Prognosing COVID-19 severity | CT | Hand-engineered radiomic features | 36 images, unclear number of severe cases | 24 images, unclear number of severe cases | Internal holdout validation | Accuracy, 0.88 | No |
Ramtohul et al.72 | Prognosing mortality for patients with COVID-19 in a cancer population | CT | Hand-engineered radiomic features and clinical data | 35 (CV) patients, unclear number of deaths | 70 patients, unclear number of deaths | Twofold internal cross-validation | C index, 0.83 [0.73, 0.93] | No |
Wei et al.65 | Prognosing COVID-19 severity | CT | Hand-engineered radiomic features | Unclear in the paper | Unclear in the paper | One-hundred-fold leave-group-out cross-validation | AUC, 0.93; accuracy, 0.91; sensitivity, 0.81; specificity, 0.95 | No
Wang et al.69 | Prognosis for survival | CT | Hand-engineered radiomic features | 161 patients, 15 non-survivors | 135 patients, unclear number of non-survivors | External validation | C index, [0.92, 0.95]; accuracy, [0.85, 0.87]; sensitivity, [0.71, 0.76]; specificity, [0.91, 0.92] | No |
Yip et al.70 | Prognosing COVID-19 severity | CT | Hand-engineered radiomic features | 657 images of various severities | 441 images of various severities | Internal holdout validation | AUC, 0.85 | No |
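The Evaluation column reports metrics such as sensitivity, specificity, PPV, accuracy and AUC. As a point of reference for interpreting those values, the sketch below shows how these metrics are typically computed from binary labels and classifier scores. This is illustrative only, not code from any of the reviewed papers; the labels and scores are made up, and the 0.5 decision threshold is an assumption.

```python
# Illustrative computation of the metrics reported in the Evaluation column.
# Labels: 1 = COVID-19 positive, 0 = negative. All data below is synthetic.

def confusion_metrics(y_true, y_pred):
    """Sensitivity (recall), specificity, PPV (precision) and accuracy
    from binary labels and binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "accuracy": (tp + tn) / len(y_true),
    }

def auc(y_true, y_score):
    """AUC via the Mann-Whitney U statistic: the probability that a random
    positive case is scored higher than a random negative case (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic example: 8 cases with model scores, thresholded at 0.5.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

print(confusion_metrics(y_true, y_pred))  # each metric is 0.75 here
print(auc(y_true, y_score))               # 0.9375
```

Note that AUC is threshold-free while sensitivity, specificity, PPV and accuracy all depend on the chosen operating point, which is one reason papers in the table that report only accuracy are harder to compare than those reporting AUC with a confidence interval.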