Fig. 2: Simulations of the decision-making model.
From: Decoding reward–curiosity conflict in decision-making from irrational behaviors

a, The two-choice task with constant reward probabilities. b, The selection probabilities for the left and right options, plotted as a moving average with a window width of 101. c, The recognized reward probabilities for the left and right options, compared with the ground truths depicted by dashed lines. d, The confidences of the reward probability recognitions for the left and right options. e–g, The expected belief updates (e), expected reward (f) and expected net utility (g) for the left and right options. h, The two-choice task with constant and temporally varying reward probabilities for the left and right options. i–n, The same as b–g, with parameter values of \(c = 1\), \(P_o = 0.8\), \(\alpha = 0.05\), \(\beta = 2\) and \(\sigma_w = 0.63\). o–q, The selected options in a space of left–right differences in the expected reward and information gain. The ReCU model was simulated with dynamically changing reward probabilities for different intensities of curiosity: \(c = -1\) (o), \(c = 0\) (p) and \(c = 3\) (q). The reward probabilities were generated by an Ornstein–Uhlenbeck process for \(w\) over 1,000 trials: \(w_{i,t} = w_{i,t-1} - 0.01w_{i,t-1} + 0.15\xi_t\), where \(\xi_t\) denotes standard Gaussian noise. The heatmap represents the probability of action selection in this space (equation (2)).
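
The Ornstein–Uhlenbeck update and the moving average described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code. The function names, the initial value \(w = 0\), the random seed, the independent noise draw for each option and the sigmoid used to map \(w\) to a reward probability are all assumptions that the caption does not state.

```python
import numpy as np


def simulate_ou(n_trials=1000, n_options=2, decay=0.01, noise_scale=0.15,
                w0=0.0, seed=0):
    """Iterate w_t = w_{t-1} - decay * w_{t-1} + noise_scale * xi_t.

    decay=0.01 and noise_scale=0.15 are the values given in the caption.
    w0, seed and per-option independent noise are assumptions.
    """
    rng = np.random.default_rng(seed)
    w = np.empty((n_trials, n_options))
    prev = np.full(n_options, w0, dtype=float)
    for t in range(n_trials):
        xi = rng.standard_normal(n_options)  # standard Gaussian noise
        prev = prev - decay * prev + noise_scale * xi
        w[t] = prev
    return w


def sigmoid(x):
    """Assumed link from w to a reward probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def moving_average(x, window=101):
    """Moving average over `window` samples, keeping only full windows."""
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")


w = simulate_ou()            # shape (1000, 2): trials x (left, right)
reward_prob = sigmoid(w)     # hypothetical reward probabilities per trial
smoothed_left = moving_average(reward_prob[:, 0])
```

With `mode="valid"`, the smoothed trace is 100 samples shorter than the input (900 values for 1,000 trials). A plot that spans all trials, as panel b appears to, would need a different edge treatment.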