Table 1 Key components of reinforcement learning

From: A Primer on Reinforcement Learning in Medicine for Clinicians

| Terms | Definitions | Examples in the context of “learning to ride a bike” |
| --- | --- | --- |
| Agent | The entity learning to make decisions and take actions within an environment. | The person learning to ride is the agent. |
| Environment | The external system with which the agent interacts. | The physical world in which the agent rides the bike, or a simulated environment in the context of machine learning. |
| State | A representation of the current situation or condition of the environment. | The state could include factors such as the rider’s speed, posture, and proximity to obstacles. |
| Action | The decision or choice made by the agent that affects the state of the environment. | Actions could include pedaling, steering, or braking. |
| Reward | Feedback from the environment that indicates how good or bad an action taken by the agent was. Rewards serve as signals to reinforce or discourage certain behaviors. | Falling off the bicycle could result in a negative reward, while successfully riding without falling could yield a positive reward. |
| Policy | The strategy or rule that the agent follows to select a sequence of actions based on its current state. | What the rider has learned about how to ride the bike can be regarded as the “learned policy”. |
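
The six terms in Table 1 can be tied together in a short Python sketch. This is only an illustration under stated assumptions, not code from the source article: the `BikeEnvironment` class, the integer "tilt" state, the three steering actions, the reward values (+1 per upright step, -10 for a fall), and the hand-written `balancing_policy` are all hypothetical. No learning takes place here; the policy is fixed by hand so that the roles of agent, environment, state, action, reward, and policy are easy to see.

```python
import random

# Hypothetical action encoding: steer left, hold, steer right.
ACTIONS = (-1, 0, 1)
# Hypothetical threshold: a tilt this far from upright counts as a fall.
FALL_TILT = 3


class BikeEnvironment:
    """Toy environment whose state is a single integer: the bike's tilt."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.tilt = 0

    def reset(self):
        """Start a new ride with the bike upright and return the state."""
        self.tilt = 0
        return self.tilt

    def step(self, action):
        """Apply the agent's action and return (state, reward, fallen)."""
        wobble = self.rng.choice((-1, 1))  # random disturbance from the road
        self.tilt += wobble + action
        fallen = abs(self.tilt) >= FALL_TILT
        reward = -10 if fallen else 1      # negative for falling, positive otherwise
        return self.tilt, reward, fallen


def balancing_policy(state):
    """Policy: map the current state to an action by steering against the tilt."""
    if state > 0:
        return -1
    if state < 0:
        return 1
    return 0


def run_episode(env, policy, max_steps=50):
    """The agent's interaction loop: observe state, act, receive reward."""
    state = env.reset()
    total_reward = 0
    for _ in range(max_steps):
        action = policy(state)
        state, reward, fallen = env.step(action)
        total_reward += reward
        if fallen:
            break
    return total_reward


env = BikeEnvironment(seed=0)
print(run_episode(env, balancing_policy))    # rider who corrects the tilt
print(run_episode(env, lambda state: 1))     # rider who always steers right
```

In this toy setup the balancing policy keeps the tilt within one unit of upright, so the rider never falls and collects the full 50 points. A reinforcement learning algorithm would aim to arrive at such a policy from the reward signal alone rather than having it written by hand.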