Table 3. Bias performance comparison on the Jigsaw dataset.
Model | Gender label | Label | FPR | FNR | FPED | FNED | SUM-ED |
---|---|---|---|---|---|---|---|
GPT-4 | Yes | total | 0.3972 | 0.2324 | 0.0413 | 0.0612 | 0.1025 |
GPT-4 | Yes | Male (0) | 0.3765 | 0.2630 | | | |
GPT-4 | Yes | Female (1) | 0.4178 | 0.2018 | | | |
GPT-4 | No | total | 0.4132 | 0.2247 | 0.0586 | 0.0737 | 0.1323 |
GPT-4 | No | Male (0) | 0.3839 | 0.2615 | | | |
GPT-4 | No | Female (1) | 0.4425 | 0.1878 | | | |
GPT-3.5 | Yes | total | 0.4220 | 0.4680 | 0.0440 | 0.0840 | 0.1280 |
GPT-3.5 | Yes | Male (0) | 0.4000 | 0.2760 | | | |
GPT-3.5 | Yes | Female (1) | 0.4440 | 0.1920 | | | |
GPT-3.5 | No | total | 0.4660 | 0.2240 | 0.0680 | 0.0960 | 0.1640 |
GPT-3.5 | No | Male (0) | 0.4120 | 0.2720 | | | |
GPT-3.5 | No | Female (1) | 0.4800 | 0.1760 | | | |
Naive Bayes | Yes | total | 0.2444 | 0.1336 | 0.2473 | 0.0713 | 0.3186 |
Naive Bayes | Yes | Male (0) | 0.2273 | 0.1427 | | | |
Naive Bayes | Yes | Female (1) | 0.4746 | 0.0714 | | | |
Naive Bayes | No | total | 0.2432 | 0.1333 | 0.1940 | 0.0437 | 0.2377 |
Naive Bayes | No | Male (0) | 0.2299 | 0.1389 | | | |
Naive Bayes | No | Female (1) | 0.4239 | 0.0952 | | | |
SVM | Yes | total | 0.0572 | 0.3358 | 0.0709 | 0.0763 | 0.1472 |
SVM | Yes | Male (0) | 0.0523 | 0.3455 | | | |
SVM | Yes | Female (1) | 0.1232 | 0.2692 | | | |
SVM | No | total | 0.0572 | 0.3353 | 0.0650 | 0.0632 | 0.1282 |
SVM | No | Male (0) | 0.0528 | 0.3434 | | | |
SVM | No | Female (1) | 0.1178 | 0.2802 | | | |
Random Forest | Yes | total | 0.0484 | 0.3850 | 0.0667 | 0.0361 | 0.1028 |
Random Forest | Yes | Male (0) | 0.0438 | 0.3896 | | | |
Random Forest | Yes | Female (1) | 0.1105 | 0.3535 | | | |
Random Forest | No | total | 0.0481 | 0.4002 | 0.0514 | 0.0346 | 0.0860 |
Random Forest | No | Male (0) | 0.0446 | 0.4046 | | | |
Random Forest | No | Female (1) | 0.0960 | 0.3700 | | | |
XGBoost | Yes | total | 0.0517 | 0.2889 | 0.0650 | 0.0982 | 0.1632 |
XGBoost | Yes | Male (0) | 0.0473 | 0.3015 | | | |
XGBoost | Yes | Female (1) | 0.1123 | 0.2033 | | | |
XGBoost | No | total | 0.0822 | 0.3365 | 0.0596 | 0.0811 | 0.1407 |
XGBoost | No | Male (0) | 0.0470 | 0.3731 | | | |
XGBoost | No | Female (1) | 0.1066 | 0.2920 | | | |
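
The table does not spell out how FPED, FNED, and SUM-ED are derived. A minimal sketch follows, assuming the commonly used equality-difference definition: the sum, over the gender groups, of the absolute gap between the overall rate and the group's rate, with SUM-ED taken as FPED + FNED. The helper name `equality_difference` is hypothetical; the input values are the GPT-4 rows with the gender label set to Yes.

```python
def equality_difference(total_rate, group_rates):
    """Sum of absolute gaps between the overall rate and each group's rate.

    Assumed definition; the table itself does not state the formula.
    """
    return sum(abs(total_rate - rate) for rate in group_rates)


# GPT-4, gender label = Yes (values copied from Table 3).
fpr_total, fpr_groups = 0.3972, [0.3765, 0.4178]  # [Male (0), Female (1)]
fnr_total, fnr_groups = 0.2324, [0.2630, 0.2018]

fped = equality_difference(fpr_total, fpr_groups)
fned = equality_difference(fnr_total, fnr_groups)
sum_ed = fped + fned  # assumed: SUM-ED is the sum of the two differences

print(f"FPED={fped:.4f} FNED={fned:.4f} SUM-ED={sum_ed:.4f}")
```

Under this assumption the computed values agree with the reported 0.0413, 0.0612, and 0.1025 for that row, and the same check can be repeated for any other model and setting in the table.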