Table 1 Model parameters
Parameter | Value | Units | Description
---|---|---|---
Ninp | 120 | – | Number of neurons in external network |
N | 120 | – | Number of neurons in hippocampal network |
Nact | 2 | – | Number of neurons in action network |
dx/dt | 1 | dv (arb.) | Velocity of agent in maze
L | 100 | dx (arb.) | Length of maze |
σ | 5 | dx (arb.) | Standard deviation of input Gaussians |
τe, τea | 10, 20 | dt (arb.) | Time constants of eligibility traces |
τa, τv | 40, 10 | dt (arb.) | Time constants of apical compartment, action neurons
σv | N(0,0.75) | – | Noise, action neuron activity |
σp | N(0,5) | – | Noise, policy selection |
ma, mβ | 1, 5 | – | Stretch coefficient, activation function |
ca, cβ | 5, 1.4 | – | Offset, activation function |
tchoice | 60 | dt (arb.) | Time of turn choice
prand | 10 | % | Chance of random turn |
ntrials | 4000 | – | Number of trials |
r0, rq | 0.5, 0.6 | – | Reward expectation
\(t_{\mathrm{plateau}}^{i}\) | i | dt (arb.) | Time of induced plateau for neuron i
Mik | δik; U(0, 1e-4) (Supplementary Fig. 2 only) | – | Weight matrix, input layer to representation layer. Identity matrix (fixed) except in Supplementary Fig. 2
Wij | U(0, 1e-4) | – | Initial values, recurrent weight matrix, representation layer (before learning). Uniform distribution between 0 and 1e-4
Qli | U(0, 1e-4) | – | Initial values, weight matrix, representation layer to action layer (before learning). All initial values the same
Iml | \(\left(\begin{array}{cc}0 & -0.125\\ -0.125 & 0\end{array}\right)\) | – | Recurrent weight matrix, action layer (fixed) |
Mmax | 0.75 | – | Upper bound for input weight (Supplementary Fig. 2 only) |
Wmax | 0.15 | – | Upper bound, recurrent weights |
Qmin, Qmax | −0.15, 0.15 | – | Lower and upper bounds, action weights
ηW, ηQ, ηM | 0.0006, 0.0003, 0.15 | – | Learning rates for recurrent weights, state-action weights and input weights (input learning in Supplementary Fig. 2 only)
λw | 0.025 | – | Decay constant, recurrent weights
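As a minimal sketch of how Table 1 could be encoded in code, the snippet below collects the parameters in a plain dictionary and initialises the four weight matrices as described (identity input weights, uniform recurrent weights, equal action weights, fixed action-layer inhibition). All identifier names (`PARAMS`, `init_weights`, etc.) are illustrative assumptions, not the authors' code.

```python
import numpy as np

# Table 1 parameters; names are hypothetical, values are from the table.
PARAMS = {
    "N_inp": 120, "N": 120, "N_act": 2,     # network sizes
    "v": 1.0, "L": 100, "sigma": 5.0,       # maze velocity, length, input Gaussian std
    "tau_e": 10, "tau_ea": 20,              # eligibility-trace time constants (dt, arb.)
    "tau_a": 40, "tau_v": 10,               # apical compartment, action neurons (dt, arb.)
    "sigma_v": 0.75, "sigma_p": 5.0,        # noise stds: action activity, policy selection
    "m_a": 1, "m_beta": 5,                  # stretch coefficients, activation function
    "c_a": 5, "c_beta": 1.4,                # offsets, activation function
    "t_choice": 60,                         # time of turn choice (dt, arb.)
    "p_rand": 0.10,                         # chance of random turn (10% as a fraction)
    "n_trials": 4000,
    "r_0": 0.5, "r_q": 0.6,                 # reward expectation
    "M_max": 0.75, "W_max": 0.15,           # weight upper bounds
    "Q_min": -0.15, "Q_max": 0.15,          # action-weight bounds
    "eta_W": 0.0006, "eta_Q": 0.0003, "eta_M": 0.15,  # learning rates
    "lambda_w": 0.025,                      # recurrent-weight decay constant
}

def init_weights(rng, p):
    """Initialise the weight matrices per Table 1 (pre-learning values)."""
    N, N_act = p["N"], p["N_act"]
    M = np.eye(N)                                # input -> representation: identity (fixed)
    W = rng.uniform(0.0, 1e-4, size=(N, N))      # recurrent, representation layer: U(0, 1e-4)
    Q = np.full((N_act, N), rng.uniform(0.0, 1e-4))  # representation -> action: one shared value
    I = np.array([[0.0, -0.125],
                  [-0.125, 0.0]])                # recurrent, action layer (fixed inhibition)
    return M, W, Q, I
```

A caller would draw a seeded generator (`np.random.default_rng(seed)`) and pass it to `init_weights`, keeping initialisation reproducible across the 4000 trials.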