The intelligent behavior of artificial agents, and their design, is usually framed as reward maximization; however, constructing a suitable reward function can be challenging. The authors introduce an alternative principle for agents’ behavior and design based on maximizing the occupancy of possible state and action paths.
- Jorge Ramírez-Ruiz
- Dmytro Grytskyy
- Rubén Moreno-Bote