Table 2 Average number of violated constraints and trajectory reward, and percentage of S2 solver use, for very low and very high risk aversion (ra = 0 and ra = 1)

From: Fast, slow, and metacognitive thinking in AI

 

| | ra = 0 | ra = 1 |
| --- | --- | --- |
| Violated constraints | 1.7520 (3.2495) | 0.9361 (2.1726) |
| Reward | −201.4358 (266.6511) | −121.4951 (191.1319) |
| S2 solver usage (fraction) | 0.2243 (0.1914) | 0.3431 (0.3214) |

  1. The S1 solver used for these results is the one with pRand = 0.25. Numbers in parentheses are standard deviations computed over 10 grids and 500 trajectories per grid.
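
To make the contrast between the two risk-aversion settings easier to read, the short sketch below computes the relative change in each mean when moving from ra = 0 to ra = 1. The means are copied from the table; the dictionary layout and the `relative_change` helper are illustrative assumptions and are not part of the original work.

```python
# Means copied from Table 2 (standard deviations are omitted here).
# The key names are hypothetical labels chosen for this sketch.
table2 = {
    "violated_constraints": {"ra0": 1.7520, "ra1": 0.9361},
    "reward": {"ra0": -201.4358, "ra1": -121.4951},
    "s2_usage": {"ra0": 0.2243, "ra1": 0.3431},
}


def relative_change(low_ra: float, high_ra: float) -> float:
    """Change from ra = 0 to ra = 1, relative to the magnitude of the ra = 0 value."""
    return (high_ra - low_ra) / abs(low_ra)


changes = {
    name: relative_change(values["ra0"], values["ra1"])
    for name, values in table2.items()
}

for name, change in changes.items():
    # A positive sign means the mean increased from ra = 0 to ra = 1.
    print(f"{name}: {change:+.1%}")
```

On these means, violated constraints fall by roughly 47%, the reward improves by roughly 40% of its magnitude, and S2 usage rises by roughly 53% in relative terms. The standard deviations in the table are large compared with the means, so these figures describe differences in averages only and say nothing about statistical significance.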