Extended Data Fig. 2: Model architecture and workflow of bandit algorithms during reaction optimization.
From: Identifying general reaction conditions by bandit optimization

The bandit algorithm first suggests a condition (an arm) to evaluate. The chemist-designed reaction scope then suggests a reaction to run under the selected condition. The suggested reaction is tested experimentally, and the result is used to update both the reaction scope and the bandit algorithm before the next round of proposals. Optionally, a prediction model trained separately on existing experimental results proposes reactions to evaluate through other mechanisms (e.g., batch proposal).
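
The loop described in this legend can be illustrated with a minimal Python sketch. Several choices here are assumptions for illustration and are not taken from the figure: UCB1 is used as the bandit rule, the reaction scope picks an untested substrate uniformly at random, and `run_experiment`, `toy_experiment`, the condition names, and the yields are hypothetical stand-ins for a real experiment. The optional prediction model is left out.

```python
import math
import random


def select_arm(counts, sums, active):
    """Pick a condition (arm) with UCB1 among arms that still have untested reactions."""
    # Evaluate every available arm once before comparing confidence bounds.
    for arm in active:
        if counts[arm] == 0:
            return arm
    total = sum(counts[a] for a in active)
    return max(
        active,
        key=lambda a: sums[a] / counts[a] + math.sqrt(2 * math.log(total) / counts[a]),
    )


def run_campaign(conditions, substrates, run_experiment, n_rounds, seed=0):
    """Run the propose / test / update loop for at most n_rounds experiments."""
    rng = random.Random(seed)
    counts = [0] * len(conditions)    # experiments run per condition
    sums = [0.0] * len(conditions)    # summed results (e.g. yields) per condition
    untested = [list(substrates) for _ in conditions]  # reaction scope per condition
    history = []
    for _ in range(n_rounds):
        active = [a for a, pool in enumerate(untested) if pool]
        if not active:
            break  # every reaction in the scope has been tested
        # 1. The bandit algorithm suggests a condition.
        arm = select_arm(counts, sums, active)
        # 2. The scope suggests a reaction for that condition (assumed: random pick).
        substrate = untested[arm].pop(rng.randrange(len(untested[arm])))
        # 3. The reaction is tested (here: a hypothetical callback).
        result = run_experiment(conditions[arm], substrate)
        # 4. The result updates the bandit statistics; the scope was updated by pop().
        counts[arm] += 1
        sums[arm] += result
        history.append((conditions[arm], substrate, result))
    return counts, history


# Toy usage with made-up, noise-free yields that depend only on the condition.
TOY_YIELDS = {"cond_A": 0.9, "cond_B": 0.2, "cond_C": 0.1}


def toy_experiment(condition, substrate):
    return TOY_YIELDS[condition]


conditions = list(TOY_YIELDS)
substrates = [f"substrate_{i}" for i in range(30)]
counts, history = run_campaign(conditions, substrates, toy_experiment, n_rounds=30)
print(dict(zip(conditions, counts)))
```

In this toy setting most of the experimental budget goes to the condition with the highest simulated yield, while the other conditions are still sampled occasionally. A real campaign would replace `toy_experiment` with measured outcomes and could swap in a different bandit rule or scope-selection strategy.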