Table 3. Dice scores and fairness scores for hip segmentation across different protected attributes, including race, sex, and age.

From: Fair AI-powered orthopedic image segmentation: addressing bias and promoting equitable healthcare

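(a) Racial group Dice scores and fairness scores

| U-Net Backbone | Model | White/Caucasian | Black/AA* | SER | SD |
| --- | --- | --- | --- | --- | --- |
| ResNet18 | Baseline | 0.927 | 0.920 | 1.094 | 0.003 |
| ResNet18 | Balanced | 0.914 | 0.910 | 1.039 | 0.002 |
| ResNet18 | Stratified | 0.919 | 0.913 | 1.073 | 0.003 |
| ResNet18 | Group | 0.911 | 0.908 | 1.030 | 0.001 |
| EfficientNet-B0 | Baseline | 0.922 | 0.918 | 1.061 | 0.002 |
| EfficientNet-B0 | Balanced | 0.903 | 0.903 | 1.009 | 0.000 |
| EfficientNet-B0 | Stratified | 0.927 | 0.922 | 1.077 | 0.003 |
| EfficientNet-B0 | Group | 0.917 | 0.909 | 1.096 | 0.004 |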

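(b) Sex group Dice scores and fairness scores

| U-Net Backbone | Model | Male | Female | SER | SD |
| --- | --- | --- | --- | --- | --- |
| ResNet18 | Baseline | 0.925 | 0.923 | 1.028 | 0.001 |
| ResNet18 | Balanced | 0.912 | 0.910 | 1.030 | 0.001 |
| ResNet18 | Stratified | 0.924 | 0.920 | 1.055 | 0.002 |
| ResNet18 | Group | 0.908 | 0.907 | 1.015 | 0.001 |
| EfficientNet-B0 | Baseline | 0.922 | 0.919 | 1.036 | 0.001 |
| EfficientNet-B0 | Balanced | 0.910 | 0.892 | 1.206 | 0.009 |
| EfficientNet-B0 | Stratified | 0.923 | 0.918 | 1.058 | 0.002 |
| EfficientNet-B0 | Group | 0.903 | 0.913 | 1.118 | 0.005 |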

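(c) Age group Dice scores and fairness scores

| U-Net Backbone | Model | Age 50 or Lower | Age 51–64 | Age 65–79 | SER | SD |
| --- | --- | --- | --- | --- | --- | --- |
| ResNet18 | Baseline | 0.921 | 0.925 | 0.923 | 1.058 | 0.002 |
| ResNet18 | Balanced | 0.912 | 0.896 | 0.897 | 1.180 | 0.007 |
| ResNet18 | Stratified | 0.920 | 0.921 | 0.917 | 1.059 | 0.002 |
| ResNet18 | Group | 0.817 | 0.911 | 0.910 | 2.049 | 0.044 |
| EfficientNet-B0 | Baseline | 0.919 | 0.921 | 0.919 | 1.028 | 0.001 |
| EfficientNet-B0 | Balanced | 0.894 | 0.882 | 0.880 | 1.136 | 0.006 |
| EfficientNet-B0 | Stratified | 0.922 | 0.925 | 0.922 | 1.039 | 0.001 |
| EfficientNet-B0 | Group | 0.785 | 0.917 | 0.903 | 2.581 | 0.059 |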

*AA: African American.
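The table does not define its two fairness columns, so the following is only a sketch of how they might be derived from the per-group Dice scores. It assumes that SER is a skewed error ratio (the largest group error divided by the smallest, with error taken as 1 − Dice) and that SD is the population standard deviation of the group Dice scores. Both definitions, and the function names `skewed_error_ratio` and `dice_sd`, are assumptions made for illustration and are not taken from the table itself.

```python
from statistics import pstdev


def skewed_error_ratio(dice_by_group):
    """Assumed SER: max group error / min group error, with error = 1 - Dice."""
    errors = [1.0 - dice for dice in dice_by_group.values()]
    smallest = min(errors)
    if smallest == 0.0:
        # A perfectly segmented group would make the ratio unbounded.
        return float("inf")
    return max(errors) / smallest


def dice_sd(dice_by_group):
    """Assumed SD: population standard deviation of the group Dice scores."""
    return pstdev(list(dice_by_group.values()))


# Rounded Dice scores from panel (c), ResNet18 backbone, Group model.
resnet18_group_age = {"<=50": 0.817, "51-64": 0.911, "65-79": 0.910}
print(round(skewed_error_ratio(resnet18_group_age), 3))
print(round(dice_sd(resnet18_group_age), 3))
```

Under these assumptions, the rounded Dice scores for that row give an SER of roughly 2.06 and an SD of roughly 0.044, close to the reported 2.049 and 0.044. The small gap in SER would be consistent with the published values having been computed from unrounded scores, though the table does not confirm this.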