Table 1 Key components of reinforcement learning
From: A Primer on Reinforcement Learning in Medicine for Clinicians
Terms | Definitions | Examples in the context of “learning to ride a bike” |
|---|---|---|
Agent | The entity learning to make decisions and take actions within an environment. | The person learning to ride is the agent. |
Environment | The external system with which the agent interacts. | The physical world in which the agent rides the bike or, in machine learning, a simulated environment. |
State | A representation of the current situation or condition of the environment. | The state could include factors such as the rider’s speed, posture, and proximity to obstacles. |
Action | The decision or choice made by the agent that affects the state of the environment. | Actions could include pedaling, steering, or braking. |
Reward | Feedback from the environment that indicates how good or bad an action taken by the agent was. Rewards serve as signals to reinforce or discourage certain behaviors. | Falling off the bike could result in a negative reward, while successfully riding without falling could yield a positive reward. |
Policy | The strategy or rule that the agent follows to select an action based on its current state. | The rider's acquired way of choosing when to pedal, steer, or brake in each situation could be regarded as the “learned policy”. |
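
To show how the six terms in the table fit together, the sketch below runs one agent-environment loop for a toy version of the bike example. Everything in it is an illustrative assumption and not taken from the article: the names `BikeEnvironment`, `balancing_policy`, and `run_episode`, the integer "lean" used as the state, and the reward values (+1 per step upright, -10 for a fall). The policy is written by hand to keep the example short; in reinforcement learning it would be learned from the rewards.

```python
import random


class BikeEnvironment:
    """Toy environment (hypothetical): the state is how far the bike leans."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.lean = 0  # 0 means upright; the rider falls once |lean| reaches 3

    def reset(self):
        """Start a new ride and return the initial state."""
        self.lean = 0
        return self.lean

    def step(self, action):
        """Apply an action: -1 steer left, 0 hold, +1 steer right.

        Returns the new state, the reward, and whether the rider fell.
        """
        wobble = self.rng.choice([-1, 0, 1])  # random disturbance from the road
        self.lean += wobble + action
        fell = abs(self.lean) >= 3
        reward = -10 if fell else 1  # assumed values: penalty for falling, +1 otherwise
        return self.lean, reward, fell


def balancing_policy(state):
    """Hand-written policy: map the current state to an action that reduces the lean."""
    if state > 0:
        return -1
    if state < 0:
        return 1
    return 0


def run_episode(env, policy, max_steps=20):
    """The agent observes the state, acts, and collects rewards until it falls or time runs out."""
    state = env.reset()
    total_reward = 0
    for _ in range(max_steps):
        action = policy(state)                  # agent chooses an action from the state
        state, reward, done = env.step(action)  # environment returns new state and reward
        total_reward += reward
        if done:
            break
    return total_reward


total = run_episode(BikeEnvironment(seed=0), balancing_policy)
print("Total reward:", total)
```

Because this policy always cancels the previous wobble, the lean in this toy setup never leaves the range -1 to 1, so the rider stays upright for all 20 steps. Swapping in a different policy function is enough to see how the choice of actions changes the accumulated reward.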