Fig. 1: The generalization challenge and potential solutions. | npj Digital Medicine

From: Generalization—a key challenge for responsible AI in patient-facing clinical applications

(a) ML models trained on biased or non-representative datasets may fail to generalize to a subset of patients. (b) Potential solutions to the generalization challenge (left to right). Data collection augments training datasets with additional (real or synthetic) data so that models can learn from all patients encountered during deployment. Limitation: data collection might be expensive or logistically challenging. Model-centric selection uses an additional ML model (e.g. an out-of-distribution (OOD) detector) or the ML model itself (e.g. its uncertainty) to select samples on which model outputs are trustworthy and to defer the others to a clinician. Limitation: the approach relies on the model that performs sample selection and patient exclusion. Sample-centric selection excludes samples for which untrustworthy model outputs are expected, either upfront or during deployment, and defers these samples to clinicians. Limitation: if sample exclusion leads to coverage gaps, it can harm model performance by exacerbating existing biases. Head icons from https://icons8.com/.
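The two selection strategies in panel b can be illustrated with a minimal sketch. It is not taken from the article: the function names, the 0.8 confidence threshold, the use of the maximum class probability as an uncertainty proxy, and the age-based inclusion rule are all hypothetical choices made for illustration only.

```python
from typing import Optional, Sequence, Tuple


def route_model_centric(
    probs: Sequence[float], threshold: float = 0.8
) -> Tuple[str, Optional[int]]:
    """Model-centric selection (hypothetical rule).

    Uses the model's own maximum class probability as a simple
    confidence proxy. Confident samples are handled by the model;
    all others are deferred to a clinician.
    """
    confidence = max(probs)
    if confidence >= threshold:
        # Index of the most probable class is the model's prediction.
        return ("model", list(probs).index(confidence))
    return ("clinician", None)


def route_sample_centric(
    patient: dict, supported_age_range: Tuple[int, int] = (18, 80)
) -> str:
    """Sample-centric selection (hypothetical rule).

    Excludes samples upfront based on a patient attribute, here an
    assumed age range covered by the training data, without
    consulting the model at all.
    """
    low, high = supported_age_range
    age = patient.get("age")
    if age is None or not (low <= age <= high):
        return "clinician"
    return "model"
```

In this sketch, `route_model_centric([0.05, 0.9, 0.05])` keeps the sample with the model, whereas a flatter distribution such as `[0.4, 0.35, 0.25]` is deferred. The sample-centric rule never inspects model outputs, which reflects the caption's distinction: the first strategy depends on the model being a reliable judge of its own trustworthiness, while the second depends on the exclusion criteria not creating coverage gaps.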
