Table 1 Dataset characteristics.
| | VinDr-CXR | ChestX-ray14 | CheXpert | MIMIC-CXR | PadChest |
|---|---|---|---|---|---|
Number of radiographs total (training set/test set) [n] | 18,000 (15,000/3,000) | 112,120 (86,524/25,596) | 157,878 (128,356/29,320) | 213,921 (170,153/43,768) | 110,525 (88,480/22,045) |
Number of patients (total) [n] | N/A | 30,805 | 65,240 | 65,379 | 67,213 |
Patient age [years] | | | | | |
Median | 42 | 49 | 61 | N/A | 63 |
Mean ± Standard deviation | 54 ± 18 | 47 ± 17 | 60 ± 18 | N/A | 59 ± 20 |
Range (minimum, maximum) | (2, 91) | (1, 96) | (18, 91) | N/A | (1, 105) |
Patient sex female/male [%] | | | | | |
Training set | 47.8/52.2 | 42.4/57.6 | 41.4/58.6 | N/A | 50.0/50.0 |
Test set | 44.1/55.9 | 41.9/58.1 | 39.0/61.0 | N/A | 48.2/51.8 |
Projections [%] | | | | | |
Anteroposterior | 0.0 | 40.0 | 84.5 | 58.2 | 17.1 |
Posteroanterior | 100.0 | 60.0 | 15.5 | 41.8 | 82.9 |
Country | Vietnam | USA | USA | USA | Spain |
Contributing hospitals [n] | 2 | 1 | 1 | 1 | 1 |
Clinical setting | N/A | N/A | Inpatient and outpatient | Intensive care unit | N/A |
Radiography systems [n] | ≥ 8 | N/A | N/A | N/A | N/A |
Labeling method | Manual | Automatic (NLP) | Automatic (NLP) | Automatic (NLP) | Partially manual, partially automatic (NLP) |
Radiographs with cardiomegaly [%] | 11.8 | 2.5 | 12.6 | 19.7 | 8.9 |
Radiographs with pleural effusion [%] | 4.1 | 11.9 | 41.3 | 22.6 | 6.3 |
Radiographs with pneumonia [%] | 4.0 | 1.3 | 2.5 | 6.5 | 4.7 |
Radiographs with atelectasis [%] | 0.8 | 10.3 | 16.7 | 19.9 | 5.6 |
Radiographs with consolidation [%] | 1.2 | 4.2 | 6.0 | 4.0 | 1.5 |
Radiographs with pneumothorax [%] | 0.4 | 4.7 | 10.3 | 4.6 | 0.4 |
Radiographs without abnormality [%] | 70.3 | 53.8 | 10.8 | 37.7 | 32.9 |