Table 2. Performance evaluation of age-based counterfactual explanations obtained using different approaches. In each case, we report results averaged across 500 test samples.
Method | Validity \(\downarrow\) | Sparsity \(\downarrow\) | Proximity \(\downarrow\) | Realism \(\uparrow\) |
---|---|---|---|---|
Vanilla | 2.49 | 0.06±0.08 | 4.08±0.48 | 1.26±0.1 |
Mixup | 0.83 | 0.05±0.07 | 3.79±0.52 | 1.28±0.07 |
UWCC | 0.74 | 0.09±0.03 | 3.81±0.42 | 1.33±0.05 |
MC dropout | 1.44 | 0.07±0.08 | 4.13±0.29 | 1.26±0.06 |
Deep ensembles (5 models) | 0.45 | 0.05±0.09 | 3.89±0.32 | 1.32±0.06 |
TraCE | 0.16 | 0.05±0.03 | 3.66±0.35 | 1.38±0.06 |