Fig. 7: Visualization of the final RL model.

The arrows indicate which state features are used to predict the different reward functions. The five reward functions can be combined by setting different weights α.
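
As a minimal sketch of how such a weighted combination might look, the snippet below assumes the five rewards are combined linearly, i.e. as a sum of α-weighted terms. The caption does not state the exact form of the combination, so the linear form, the function name `combine_rewards`, and the example values are all illustrative assumptions rather than the model's actual implementation.

```python
from typing import Sequence


def combine_rewards(rewards: Sequence[float], alphas: Sequence[float]) -> float:
    """Return the alpha-weighted sum of per-objective rewards.

    Assumption: the combination is linear. The source only says the
    reward functions are combined by setting different weights alpha.
    """
    if len(rewards) != len(alphas):
        raise ValueError("rewards and alphas must have the same length")
    return sum(a * r for a, r in zip(alphas, rewards))


# Hypothetical predicted values for the five reward functions.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]

# Putting all weight on the first reward recovers that reward alone.
only_first = combine_rewards(predicted, [1.0, 0.0, 0.0, 0.0, 0.0])

# Equal weights give the mean of the five rewards.
equal_mix = combine_rewards(predicted, [0.2] * 5)
```

Under this assumption, changing α shifts the trade-off between the five objectives without retraining the individual reward predictors.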
