Table 2 Performance comparison across additional race categories
From: Mitigating the risk of health inequity exacerbated by large language models
Model | Middle Eastern | Indigenous | African American | South Asian | East Asian |
---|---|---|---|---|---|
LLaMA3 8B w/ EquityGuard | 70.0 ± 0.6% | 69.9 ± 0.7% | 70.2 ± 0.5% | 70.2 ± 0.6% | 70.3 ± 0.5% |
LLaMA3 8B w/o EquityGuard | 68.7 ± 0.8% | 68.6 ± 0.8% | 69.2 ± 0.7% | 68.9 ± 0.8% | 69.2 ± 0.7% |
Mistral v0.3 w/ EquityGuard | 70.3 ± 0.6% | 70.4 ± 0.5% | 70.4 ± 0.6% | 70.6 ± 0.5% | 70.6 ± 0.5% |
Mistral v0.3 w/o EquityGuard | 68.9 ± 0.7% | 68.9 ± 0.8% | 69.2 ± 0.7% | 69.3 ± 0.8% | 69.5 ± 0.7% |
GPT-4 | 71.1 ± 0.5% | 71.1 ± 0.5% | 74.1 ± 0.6% | 73.1 ± 0.6% | 71.2 ± 0.5% |
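
The comparison in the table can be summarized with simple arithmetic. The following is a minimal sketch, not part of the original article: it copies the mean values from the rows above (the ± terms are omitted) and computes, for each model, the spread between the best and worst category, plus the per-category difference between the runs with and without EquityGuard. The names `SCORES`, `spread`, and `gain` are hypothetical helpers introduced for illustration, and using max minus min as a rough indicator of uniformity across categories is an assumption, not a metric defined in the source.

```python
# Mean values copied from Table 2 (the ± terms are omitted in this sketch).
CATEGORIES = ["Middle Eastern", "Indigenous", "African American",
              "South Asian", "East Asian"]

SCORES = {
    "LLaMA3 8B w/ EquityGuard":     [70.0, 69.9, 70.2, 70.2, 70.3],
    "LLaMA3 8B w/o EquityGuard":    [68.7, 68.6, 69.2, 68.9, 69.2],
    "Mistral v0.3 w/ EquityGuard":  [70.3, 70.4, 70.4, 70.6, 70.6],
    "Mistral v0.3 w/o EquityGuard": [68.9, 68.9, 69.2, 69.3, 69.5],
    "GPT-4":                        [71.1, 71.1, 74.1, 73.1, 71.2],
}


def spread(values):
    """Max minus min across categories, in percentage points (assumed summary)."""
    return round(max(values) - min(values), 1)


def gain(with_guard, without_guard):
    """Per-category difference (with minus without), in percentage points."""
    return [round(a - b, 1) for a, b in zip(with_guard, without_guard)]


# Print the spread for every row of the table.
for model, values in SCORES.items():
    print(f"{model}: spread = {spread(values)} points")

# Print the per-category difference for the two model families with both runs.
for family in ("LLaMA3 8B", "Mistral v0.3"):
    deltas = gain(SCORES[f"{family} w/ EquityGuard"],
                  SCORES[f"{family} w/o EquityGuard"])
    print(family, dict(zip(CATEGORIES, deltas)))
```

These figures are differences of reported means only. Because the ± terms are ignored, they should not be read as statements about statistical significance.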