Table 2 Bias performance comparison in the MTC dataset.

From: Does ChatGPT show gender bias in behavior detection?

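| Model | Gender label | Label | FPR | FNR | FPED | FNED | SUM-ED |
|---|---|---|---|---|---|---|---|
| GPT-4 | Yes | total | 0.0180 | 0.5203 | 0.0040 | 0.0095 | 0.0135 |
| GPT-4 | Yes | Male (0) | 0.0160 | 0.5250 | | | |
| GPT-4 | Yes | Female (1) | 0.0200 | 0.5155 | | | |
| GPT-4 | No | total | 0.0254 | 0.4958 | 0.0138 | 0.0415 | 0.0553 |
| GPT-4 | No | Male (0) | 0.0323 | 0.4750 | | | |
| GPT-4 | No | Female (1) | 0.0185 | 0.5165 | | | |
| GPT-3.5 | Yes | total | 0.0200 | 0.7513 | 0.0050 | 0.0125 | 0.0175 |
| GPT-3.5 | Yes | Male (0) | 0.0175 | 0.7575 | | | |
| GPT-3.5 | Yes | Female (1) | 0.0225 | 0.7450 | | | |
| GPT-3.5 | No | total | 0.0263 | 0.7263 | 0.0125 | 0.0525 | 0.0650 |
| GPT-3.5 | No | Male (0) | 0.0325 | 0.7000 | | | |
| GPT-3.5 | No | Female (1) | 0.0200 | 0.7525 | | | |
| Naive Bayes | Yes | total | 0.1183 | 0.3013 | 0.0262 | 0.0557 | 0.0819 |
| Naive Bayes | Yes | Male (0) | 0.1036 | 0.3329 | | | |
| Naive Bayes | Yes | Female (1) | 0.1298 | 0.2772 | | | |
| Naive Bayes | No | total | 0.1190 | 0.3010 | 0.0191 | 0.0530 | 0.0721 |
| Naive Bayes | No | Male (0) | 0.1017 | 0.3342 | | | |
| Naive Bayes | No | Female (1) | 0.1208 | 0.2812 | | | |
| SVM | Yes | total | 0.0341 | 0.3739 | 0.0119 | 0.0607 | 0.0726 |
| SVM | Yes | Male (0) | 0.0274 | 0.4083 | | | |
| SVM | Yes | Female (1) | 0.0393 | 0.3476 | | | |
| SVM | No | total | 0.0338 | 0.3754 | 0.0106 | 0.0581 | 0.0687 |
| SVM | No | Male (0) | 0.0278 | 0.4083 | | | |
| SVM | No | Female (1) | 0.0384 | 0.3502 | | | |
| Random Forest | Yes | total | 0.0383 | 0.3956 | 0.0099 | 0.0624 | 0.0723 |
| Random Forest | Yes | Male (0) | 0.0327 | 0.4310 | | | |
| Random Forest | Yes | Female (1) | 0.0426 | 0.3686 | | | |
| Random Forest | No | total | 0.0383 | 0.3971 | 0.0099 | 0.0622 | 0.0721 |
| Random Forest | No | Male (0) | 0.0327 | 0.4323 | | | |
| Random Forest | No | Female (1) | 0.0426 | 0.3701 | | | |
| XGBoost | Yes | total | 0.0454 | 0.3412 | 0.0097 | 0.0594 | 0.0691 |
| XGBoost | Yes | Male (0) | 0.0400 | 0.3749 | | | |
| XGBoost | Yes | Female (1) | 0.0497 | 0.3155 | | | |
| XGBoost | No | total | 0.0485 | 0.3441 | 0.0103 | 0.0579 | 0.0682 |
| XGBoost | No | Male (0) | 0.0427 | 0.3769 | | | |
| XGBoost | No | Female (1) | 0.0530 | 0.3190 | | | |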
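
The table does not define FPED, FNED, or SUM-ED. The reported values are consistent with the usual error-rate equality difference: the sum, over gender groups, of the absolute gap between the overall rate and the group's rate, with SUM-ED being FPED plus FNED. The Python sketch below reproduces the GPT-4 row with the gender label set to Yes under that assumption. The function name `equality_difference` is illustrative and does not come from the source.

```python
def equality_difference(overall_rate, group_rates):
    """Sum of absolute gaps between the overall rate and each group's rate.

    Assumed definition; the table itself does not state the formula.
    """
    return sum(abs(overall_rate - rate) for rate in group_rates)


# GPT-4, gender label = Yes (values copied from Table 2).
fpr_total, fpr_male, fpr_female = 0.0180, 0.0160, 0.0200
fnr_total, fnr_male, fnr_female = 0.5203, 0.5250, 0.5155

fped = equality_difference(fpr_total, [fpr_male, fpr_female])
fned = equality_difference(fnr_total, [fnr_male, fnr_female])
sum_ed = fped + fned

print(f"FPED={fped:.4f} FNED={fned:.4f} SUM-ED={sum_ed:.4f}")
# prints: FPED=0.0040 FNED=0.0095 SUM-ED=0.0135
```

Applying the same calculation to the other rows matches the reported figures to four decimal places, which supports the assumed definition but does not confirm that it is the authors' exact procedure.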