Fig. 1: Illustration of the game.

A Two rounds (denoted \(t\)) are illustrated (columns). In each, a mechanism allocates resources (blue flowers) from a pool of size \(R\) to \(p=4\) players, who each choose a quantity to reciprocate, with any remainder going to surplus (gold coins). For example, in the schematic, in round \(t\) the first player (left) receives 2 flowers, and reciprocates 1, generating a surplus of \(r=0.4\) (amount due to growth factor shown in grey). The pool size is depleted by the allocation and replenished by the reciprocations. Note that players who receive no resources cannot reciprocate (e.g. the centre-left player in round \(t\)). B Illustration of our approach. First, (1) we collected data from human participants under a range of mechanisms defined by different values of \(w\), and used imitation learning to create clones that behaved like people. Then (2) we used these clones to train the RL agent, and (3) conducted Exp. 1, in which we compared the RL agent to baselines. Next, (4) we analysed the RL agent policy and constructed a heuristic approximation that was more explainable (the ‘interpolation baseline’), which (5) we tested on behavioural clones, and (6) compared to the RL agent (and the proportional mechanism) in Exp. 2. Finally, (7) we used all of the data collected so far to retrain a new version of the RL agent, and (8) compared it to the interpolation baseline in Exp. 3. C Example games (using behavioural clones). Each game starts with \({R}_{0}=200\). Left: offers (full lines) and reciprocations (dashed lines) to four players (lower panels) over 40 trials (x-axis) in an example game with the equal baseline. The grey shaded area indicates where each player receives an offer of zero. The top panel shows the size of the pool (blue) and the total per-trial surplus (red). The middle and right panels show example games under the proportional baseline and the RL agent, respectively.
Note that in the example proportional baseline game, three players fall into poverty traps, leaving a single player to contribute, and increasing inequality.
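
The pool dynamics described in panel A can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the experiments: the function name `play_round` is hypothetical, and the caption does not state the exact update rule, so the code assumes that each reciprocated unit returns to the pool scaled by \(1+r\) with \(r=0.4\), and that whatever a player keeps counts as that player's surplus.

```python
GROWTH = 0.4  # growth factor r; how it enters the update is an assumption


def play_round(pool, offers, reciprocations, growth=GROWTH):
    """Advance the pool by one round (hypothetical helper, not the authors' code).

    offers[i]         -- resources the mechanism allocates to player i
    reciprocations[i] -- amount player i returns to the pool
    Returns the new pool size and each player's retained amount.
    """
    if len(offers) != len(reciprocations):
        raise ValueError("one reciprocation is required per offer")
    if sum(offers) > pool:
        raise ValueError("the mechanism cannot allocate more than the pool holds")
    for offer, given in zip(offers, reciprocations):
        # A player who receives nothing has nothing to reciprocate.
        if given < 0 or given > offer:
            raise ValueError("reciprocation must lie between 0 and the offer")

    # Whatever a player keeps is treated here as that player's surplus.
    kept = [offer - given for offer, given in zip(offers, reciprocations)]

    # Assumed update: allocation depletes the pool, and reciprocated
    # resources replenish it after growing by a factor of (1 + growth).
    new_pool = pool - sum(offers) + (1 + growth) * sum(reciprocations)
    return new_pool, kept


if __name__ == "__main__":
    # Four players; the second receives nothing and so cannot reciprocate.
    pool, kept = play_round(200, offers=[4, 0, 2, 2], reciprocations=[2, 0, 2, 0])
    print(pool, kept)
```

Under these assumptions, a player who is repeatedly offered zero can never contribute to the pool, which is the poverty-trap situation visible in the example proportional-baseline game.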