Fig. 3: TWIX mitigates the underskilling bias across hospitals.
From: Human visual explanations mitigate bias in AI-based assessment of surgeon skills

We report the average performance of SAIS on the most disadvantaged sub-cohort (worst-case negative predictive value, NPV) before and after adopting TWIX, together with the percent change. An improvement (↑) in the worst-case NPV is considered bias mitigation. SAIS is tasked with assessing the skill level of (a) needle handling and (b) needle driving. Note that SAIS is trained on data from USC and deployed on data from St. Antonius Hospital and Houston Methodist Hospital. Results are averaged across ten Monte Carlo cross-validation folds.
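
To make the quantities in this caption concrete, the following is a minimal sketch of how a worst-case NPV and its percent change could be computed. It is not the authors' implementation. The label encoding (0 = low skill as the negative class, 1 = high skill), the sub-cohort labels "A" and "B", the toy predictions, and all function names are assumptions introduced purely for illustration, as is the choice to average across folds before taking the percent change.

```python
def npv(y_true, y_pred):
    """Negative predictive value: TN / (TN + FN).

    Assumed encoding: 0 = low skill (negative class), 1 = high skill.
    Returns None when no sample is predicted negative.
    """
    tn = sum(1 for t, p in zip(y_true, y_pred) if p == 0 and t == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if p == 0 and t == 1)
    return tn / (tn + fn) if (tn + fn) > 0 else None


def worst_case_npv(y_true, y_pred, groups):
    """Lowest NPV over all sub-cohorts that have at least one negative prediction."""
    scores = []
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        score = npv([y_true[i] for i in idx], [y_pred[i] for i in idx])
        if score is not None:
            scores.append(score)
    if not scores:
        raise ValueError("no sub-cohort has a negative prediction")
    return min(scores)


def average_worst_case_npv(folds):
    """Mean worst-case NPV over folds; each fold is (y_true, y_pred, groups)."""
    values = [worst_case_npv(*fold) for fold in folds]
    return sum(values) / len(values)


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Toy data for a single fold (hypothetical, not taken from the study).
y_true = [0, 0, 1, 0, 0, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
pred_before = [0, 0, 0, 1, 0, 0, 0, 1]  # stand-in for predictions without TWIX
pred_after = [0, 0, 1, 0, 0, 1, 0, 0]   # stand-in for predictions with TWIX

worst_before = average_worst_case_npv([(y_true, pred_before, groups)])
worst_after = average_worst_case_npv([(y_true, pred_after, groups)])
change = percent_change(worst_before, worst_after)
print(worst_before, worst_after, change)
```

In a setting like the one described here, the list passed to `average_worst_case_npv` would hold one entry per cross-validation fold, and a positive `change` would correspond to the improvement (↑) that the caption treats as bias mitigation.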