Fig. 4: Reward visualization. | Nature Communications

From: Discovery of the reward function for embodied reinforcement learning agents

a Reward signals during a single interaction episode of the CartPole-v1 task. The sparse reward signals (blue lines) and the reward signals discovered in this work (red lines) are shown. b Reward signals during a single interaction episode of the Acrobot-v1 task, showing the difference between the sparse and discovered reward signals. c Reward signals during a single interaction episode of the FourRoom-v0 task, illustrating the increased signal density achieved in this work. d Reward signals during a single interaction episode of the LunarLander-v2 task. e The surface of the sparse reward function. For any state and action, the sparse reward function always delivers a reward signal of -1. f The reward surface for action 0 (-1 torque) of Acrobot-v1 as a function of the joint angles θ1 and θ2 with angular velocities fixed at 0. g The reward surface for action 1 (0 torque). h The reward surface for action 2 (1 torque).
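As a rough illustration of how surfaces like those in panels e to h could be tabulated, the sketch below evaluates a reward function over a grid of joint angles θ1 and θ2 with both angular velocities held at 0. Only the sparse reward (always -1) comes from the caption. The `(theta1, theta2, omega1, omega2)` state layout, the grid resolution, and the names `sparse_reward` and `reward_surface` are assumptions made for this example and are not taken from the paper; the discovered reward function itself is not reproduced here.

```python
import math

def sparse_reward(state, action):
    """Sparse reward as described in the caption: -1 for any state and action."""
    return -1.0

def reward_surface(reward_fn, action, n=5):
    """Evaluate reward_fn on an n x n grid of (theta1, theta2) in [-pi, pi].

    Assumption: the state is a tuple (theta1, theta2, omega1, omega2),
    and both angular velocities are fixed at 0, as in panels f to h.
    """
    thetas = [-math.pi + 2 * math.pi * i / (n - 1) for i in range(n)]
    surface = [
        [reward_fn((t1, t2, 0.0, 0.0), action) for t2 in thetas]
        for t1 in thetas
    ]
    return thetas, surface

# Action 0 corresponds to -1 torque in the caption's numbering.
thetas, surface = reward_surface(sparse_reward, action=0)
```

Any callable with the same `(state, action)` signature, such as a learned reward model, could be passed in place of `sparse_reward` to tabulate one surface per action.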
