Table 1 Summary of dataset statistics and results

From: Preserving fairness and diagnostic accuracy in private large-scale AI models for medical imaging

UKA-CXR

| | | Total | Male | Female | [0,30) | [30,60) | [60,70) | [70,80) | [80,100) |
|---|---|---|---|---|---|---|---|---|---|
| Train | N | 153,502 | 100,659 | 52,843 | 4279 | 42,340 | 36,882 | 48,864 | 21,137 |
| Test | N | 39,809 | 25,360 | 14,449 | 1165 | 10,291 | 10,025 | 12,958 | 5370 |
| | Cardiomegaly | 18,616 | 12,868 | 5748 | 334 | 3853 | 4714 | 6837 | 2876 |
| | Congestion | 3275 | 2206 | 1069 | 50 | 817 | 906 | 991 | 510 |
| | Pl. Eff. R. | 3275 | 2090 | 1185 | 52 | 709 | 847 | 1248 | 419 |
| | Pl. Eff. L. | 2602 | 1636 | 966 | 70 | 589 | 632 | 894 | 417 |
| | Pn. Inf. R. | 4847 | 3374 | 1473 | 184 | 1322 | 1367 | 1361 | 612 |
| | Pn. Inf. L. | 3562 | 2381 | 1181 | 143 | 1087 | 949 | 959 | 423 |
| | Atel. R. | 3920 | 2571 | 1349 | 127 | 1010 | 1056 | 1272 | 454 |
| | Atel. L. | 3166 | 2010 | 1156 | 119 | 867 | 774 | 961 | 444 |
| | ε | μ ± σ | μ ± σ | μ ± σ | μ ± σ | μ ± σ | μ ± σ | μ ± σ | μ ± σ |
| AUROC | 0.29 | 83.13 ± 3.9 | 82.66 ± 3.9 | 83.85 ± 4.0 | 86.47 ± 3.5 | 85.21 ± 3.9 | 83.03 ± 3.6 | 81.66 ± 4.5 | 81.27 ± 4.3 |
| | 0.54 | 84.00 ± 3.8 | 83.61 ± 3.8 | 84.61 ± 3.9 | 86.43 ± 3.1 | 85.96 ± 3.7 | 83.96 ± 3.5 | 82.69 ± 4.5 | 82.15 ± 4.2 |
| | 1.06 | 84.98 ± 3.9 | 84.69 ± 3.9 | 85.40 ± 4.0 | 87.69 ± 3.2 | 86.90 ± 3.7 | 84.95 ± 3.8 | 83.70 ± 4.5 | 82.96 ± 4.3 |
| | 2.04 | 85.80 ± 3.9 | 85.52 ± 3.9 | 86.19 ± 3.9 | 88.77 ± 3.3 | 87.53 ± 3.8 | 85.88 ± 3.8 | 84.47 ± 4.4 | 83.85 ± 4.3 |
| | 4.71 | 86.93 ± 4.0 | 86.73 ± 4.1 | 87.19 ± 4.0 | 89.11 ± 3.3 | 88.59 ± 3.9 | 86.80 ± 3.7 | 85.89 ± 4.7 | 85.08 ± 4.6 |
| | 7.89 | 87.36 ± 4.1 | 87.12 ± 4.2 | 87.66 ± 4.1 | 89.72 ± 4.1 | 88.97 ± 3.9 | 87.26 ± 3.9 | 86.30 ± 4.7 | 85.48 ± 4.8 |
| | | 89.71 ± 3.8 | 89.46 ± 3.9 | 90.06 ± 3.8 | 91.64 ± 3.5 | 90.99 ± 3.4 | 89.73 ± 3.8 | 88.73 ± 4.4 | 88.18 ± 4.5 |
| PtD | 0.29 | | −1.40 ± 0.22 | +1.40 ± 0.22 | +7.05 ± 0.18 | +0.98 ± 0.73 | +0.97 ± 1.75 | −1.73 ± 0.36 | −1.63 ± 1.00 |
| | 0.54 | | −1.56 ± 0.10 | +1.56 ± 0.10 | +7.20 ± 0.21 | +0.80 ± 0.52 | +1.95 ± 0.48 | −2.65 ± 0.31 | −1.23 ± 0.56 |
| | 1.06 | | −0.87 ± 0.73 | +0.87 ± 0.73 | +7.35 ± 0.51 | +2.56 ± 0.67 | +0.49 ± 0.23 | −1.92 ± 0.78 | −3.12 ± 0.13 |
| | 2.04 | | +0.15 ± 0.42 | −0.15 ± 0.42 | +6.12 ± 0.92 | +1.80 ± 0.39 | +1.50 ± 0.00 | −2.80 ± 0.30 | −1.61 ± 0.15 |
| | 4.71 | | −1.63 ± 0.31 | +1.63 ± 0.31 | +4.37 ± 0.18 | +2.15 ± 0.70 | +1.26 ± 1.38 | −2.27 ± 0.50 | −2.36 ± 2.38 |
| | 7.89 | | −0.66 ± 0.75 | +0.66 ± 0.75 | +5.53 ± 0.92 | +1.27 ± 0.04 | +1.21 ± 0.22 | −1.33 ± 0.06 | −2.89 ± 0.52 |
| | | | −0.34 ± 0.47 | +0.34 ± 0.47 | +4.00 ± 0.60 | +1.32 ± 0.65 | +0.21 ± 0.66 | −0.43 ± 0.95 | −2.67 ± 0.20 |

PDAC

| | | Total | Male | Female | Youngest 25% | Second 25% | Third 25% | Oldest 25% |
|---|---|---|---|---|---|---|---|---|
| Train | N | 975 | 552 | 423 | 231 | 290 | 228 | 226 |
| Test | N | 325 | 197 | 127 | 86 | 85 | 79 | 75 |
| | Tumor | 173 | 95 | 77 | 23 | 48 | 54 | 48 |
| | Control | 152 | 102 | 50 | 63 | 37 | 25 | 27 |
| | ε | μ ± σ | μ ± σ | μ ± σ | μ ± σ | μ ± σ | μ ± σ | μ ± σ |
| AUROC | 0.29 | 86.84 ± 4.0 | 88.11 ± 4.6 | 85.47 ± 2.5 | 87.92 ± 9.1 | 85.87 ± 3.0 | 84.44 ± 1.2 | 89.15 ± 7.2 |
| | 0.54 | 92.60 ± 1.3 | 93.62 ± 1.5 | 91.00 ± 0.9 | 93.77 ± 3.2 | 91.97 ± 1.2 | 90.05 ± 0.4 | 95.63 ± 2.3 |
| | 1.06 | 95.58 ± 0.9 | 96.70 ± 0.9 | 93.52 ± 1.3 | 96.57 ± 1.6 | 94.84 ± 1.3 | 93.83 ± 1.1 | 98.43 ± 0.9 |
| | 2.04 | 97.49 ± 0.4 | 98.50 ± 0.3 | 95.36 ± 0.9 | 97.98 ± 0.9 | 96.90 ± 0.8 | 97.06 ± 0.9 | 99.36 ± 0.6 |
| | 4.71 | 98.31 ± 0.2 | 99.19 ± 0.1 | 96.38 ± 0.7 | 98.48 ± 0.3 | 97.84 ± 0.2 | 98.30 ± 0.4 | 99.97 ± 0.0 |
| | 5.0 | 98.33 ± 0.2 | 99.20 ± 0.1 | 96.41 ± 0.7 | 98.48 ± 0.4 | 97.86 ± 0.1 | 98.37 ± 0.4 | 100.00 ± 0.0 |
| | 6.0 | 98.39 ± 0.3 | 99.22 ± 0.1 | 96.55 ± 0.8 | 98.57 ± 0.3 | 97.84 ± 0.2 | 98.35 ± 0.5 | 100.00 ± 0.0 |
| | 7.0 | 98.41 ± 0.3 | 99.22 ± 0.1 | 96.60 ± 0.8 | 98.62 ± 0.3 | 97.88 ± 0.1 | 98.25 ± 0.5 | 100.00 ± 0.0 |
| | 8.0 | 99.28 ± 0.7 | 99.77 ± 0.3 | 98.13 ± 1.6 | 99.59 ± 0.7 | 99.23 ± 1.2 | 98.37 ± 0.9 | 100.00 ± 0.0 |
| | | 99.70 ± 0.2 | 99.97 ± 0.1 | 99.01 ± 0.6 | 99.98 ± 0.0 | 99.94 ± 0.1 | 98.47 ± 0.9 | 100.00 ± 0.0 |
| PtD | 0.29 | | +3.27 ± 5.38 | −3.27 ± 5.38 | +9.03 ± 1.32 | +1.87 ± 2.12 | −9.54 ± 4.33 | −2.04 ± 4.74 |
| | 0.54 | | +1.02 ± 0.76 | −1.02 ± 0.76 | +3.17 ± 0.54 | +0.34 ± 1.44 | −7.02 ± 4.39 | +3.42 ± 3.28 |
| | 1.06 | | +1.29 ± 1.27 | −1.29 ± 1.27 | −0.18 ± 3.85 | +0.20 ± 1.66 | −3.58 ± 3.53 | +3.69 ± 2.83 |
| | 2.04 | | +3.00 ± 0.78 | −3.00 ± 0.78 | −1.97 ± 0.65 | −3.16 ± 2.47 | +1.55 ± 0.62 | +4.00 ± 3.47 |
| | 4.71 | | +4.58 ± 1.33 | −4.58 ± 1.33 | −3.29 ± 1.23 | −2.34 ± 1.20 | +1.47 ± 1.46 | +4.62 ± 1.46 |
| | 5.0 | | +4.85 ± 1.37 | −4.85 ± 1.37 | −2.62 ± 0.82 | −2.73 ± 1.16 | +1.61 ± 0.87 | +4.18 ± 1.90 |
| | 6.0 | | +4.41 ± 0.53 | −4.41 ± 0.53 | −2.10 ± 2.06 | −2.20 ± 0.64 | +1.05 ± 2.10 | +3.60 ± 1.14 |
| | 7.0 | | +3.19 ± 1.27 | −3.19 ± 1.27 | −1.99 ± 2.97 | −3.68 ± 1.21 | +1.20 ± 2.31 | +4.93 ± 2.02 |
| | 8.0 | | +3.28 ± 2.61 | −3.28 ± 2.61 | −2.45 ± 1.51 | −1.45 ± 2.28 | +1.58 ± 2.44 | +2.62 ± 1.61 |
| | | | +2.81 ± 2.38 | −2.81 ± 2.38 | −1.21 ± 1.59 | −0.18 ± 1.16 | −0.33 ± 1.71 | +1.87 ± 1.22 |

Diagnostic performance of patient subgroups for the UKA-CXR and PDAC datasets. We report the number of cases over subgroups and labels. All values refer to the test set. Total denotes the results on the entire test set. AUROC denotes the area under the receiver operating characteristic curve. PtD is the statistical parity difference of each subgroup. PDAC stands for presence of pancreatic ductal adenocarcinoma. μ denotes the mean and σ the standard deviation, calculated over 1000 bootstrapping samples (UKA-CXR) or over 3 independent model trainings (PDAC). All results are in percent.
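
As an illustration of how quantities like those in the table could be computed, the following is a minimal sketch and not the authors' code. It assumes binary labels, real-valued scores, and already-thresholded binary predictions. It also assumes the common definition of statistical parity difference as the positive-prediction rate inside a subgroup minus the rate outside it; that definition is at least consistent with the Male and Female PtD entries mirroring each other in sign. The function names `auroc`, `bootstrap_auroc`, and `parity_difference` are hypothetical.

```python
import random
import statistics


def auroc(labels, scores):
    """AUROC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc(labels, scores, n_boot=1000, seed=0):
    """Mean and standard deviation of AUROC, in percent, over bootstrap resamples."""
    if len(set(labels)) < 2:
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    n = len(labels)
    values = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample cases with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # AUROC is undefined for a one-class resample; draw again
            continue
        values.append(100 * auroc(ys, [scores[i] for i in idx]))
    return statistics.mean(values), statistics.stdev(values)


def parity_difference(preds, in_group):
    """Positive-prediction rate inside the subgroup minus the rate outside it, in percent.

    Assumed definition; both the subgroup and its complement must be non-empty.
    """
    inside = [p for p, g in zip(preds, in_group) if g]
    outside = [p for p, g in zip(preds, in_group) if not g]
    return 100 * (sum(inside) / len(inside) - sum(outside) / len(outside))
```

With a two-way split such as Male/Female, calling `parity_difference` on a subgroup and on its complement gives values of equal magnitude and opposite sign. For the PDAC-style setting, the resampling loop would be replaced by collecting one AUROC per independent training run and taking `statistics.mean` and `statistics.stdev` over those runs.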