Table 1 Dataset characteristics.

From: Enhancing domain generalization in the AI-based analysis of chest radiographs with federated learning

 

| Characteristic | VinDr-CXR | ChestX-ray14 | CheXpert | MIMIC-CXR | PadChest |
| --- | --- | --- | --- | --- | --- |
| Number of radiographs, total (training set/test set) [n] | 18,000 (15,000/3,000) | 112,120 (86,524/25,596) | 157,878 (128,356/29,320) | 213,921 (170,153/43,768) | 110,525 (88,480/22,045) |
| Number of patients, total [n] | N/A | 30,805 | 65,240 | 65,379 | 67,213 |
| Patient age, median [years] | 42 | 49 | 61 | N/A | 63 |
| Patient age, mean ± standard deviation [years] | 54 ± 18 | 47 ± 17 | 60 ± 18 | N/A | 59 ± 20 |
| Patient age, range (minimum, maximum) [years] | (2, 91) | (1, 96) | (18, 91) | N/A | (1, 105) |
| Patient sex female/male, training set [%] | 47.8/52.2 | 42.4/57.6 | 41.4/58.6 | N/A | 50.0/50.0 |
| Patient sex female/male, test set [%] | 44.1/55.9 | 41.9/58.1 | 39.0/61.0 | N/A | 48.2/51.8 |
| Projections, anteroposterior [%] | 0.0 | 40.0 | 84.5 | 58.2 | 17.1 |
| Projections, posteroanterior [%] | 100.0 | 60.0 | 15.5 | 41.8 | 82.9 |
| Country | Vietnam | USA | USA | USA | Spain |
| Contributing hospitals [n] | 2 | 1 | 1 | 1 | 1 |
| Clinical setting | N/A | N/A | Inpatient and outpatient | Intensive care unit | N/A |
| Radiography systems [n] | ≥ 8 | N/A | N/A | N/A | N/A |
| Labeling method | Manual | Automatic (NLP) | Automatic (NLP) | Automatic (NLP) | Partially manual, partially automatic (NLP) |
| Radiographs with cardiomegaly [%] | 11.8 | 2.5 | 12.6 | 19.7 | 8.9 |
| Radiographs with pleural effusion [%] | 4.1 | 11.9 | 41.3 | 22.6 | 6.3 |
| Radiographs with pneumonia [%] | 4.0 | 1.3 | 2.5 | 6.5 | 4.7 |
| Radiographs with atelectasis [%] | 0.8 | 10.3 | 16.7 | 19.9 | 5.6 |
| Radiographs with consolidation [%] | 1.2 | 4.2 | 6.0 | 4.0 | 1.5 |
| Radiographs with pneumothorax [%] | 0.4 | 4.7 | 10.3 | 4.6 | 0.4 |
| Radiographs without abnormality [%] | 70.3 | 53.8 | 10.8 | 37.7 | 32.9 |

The table lists the included datasets, VinDr-CXR [28], ChestX-ray14 [29], CheXpert [30], MIMIC-CXR [31], and PadChest [32], and their characteristics. Only frontal chest radiographs (both anteroposterior and posteroanterior projections) were used for this study; lateral projections were disregarded. Multiple radiographs may have been included per patient. N/A: not available; NLP: natural language processing.
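
The selection rule in the note above (keep frontal projections, drop lateral ones, allow several radiographs per patient) can be illustrated with a minimal Python sketch. It is not the authors' preprocessing code: the record layout, the field names `patient_id` and `view`, and the view codes `AP`, `PA`, and `LATERAL` are assumptions made for illustration, and each dataset encodes projections in its own metadata format.

```python
# Hypothetical view codes; real datasets label projections differently.
FRONTAL_VIEWS = {"AP", "PA"}  # anteroposterior, posteroanterior


def keep_frontal(records):
    """Return only the records whose 'view' is a frontal projection."""
    return [r for r in records if r.get("view") in FRONTAL_VIEWS]


def projection_share(records):
    """Return the percentage of each frontal view among the given records."""
    total = len(records)
    if total == 0:
        return {}
    return {
        view: 100.0 * sum(r["view"] == view for r in records) / total
        for view in sorted(FRONTAL_VIEWS)
    }


# Toy metadata (illustrative only, not taken from any of the datasets).
records = [
    {"patient_id": "p1", "view": "PA"},
    {"patient_id": "p1", "view": "LATERAL"},
    {"patient_id": "p2", "view": "AP"},
    {"patient_id": "p3", "view": "PA"},
    {"patient_id": "p3", "view": "PA"},
]

frontal = keep_frontal(records)      # lateral image of p1 is dropped
shares = projection_share(frontal)   # AP/PA percentages, as in the "Projections [%]" rows
print(len(frontal), shares)
```

Patient p3 keeps both radiographs, since the filter operates per image and not per patient.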