Table 1 Optimised parameter values for Q-learning algorithm trained on each animal’s behavioural data

From: Post-learning replay of hippocampal-striatal activity is biased by reward-prediction signals

 

Animal   α        γ        ϵ        Error score
Rat H    0.0111   0.6805   2.6444   10.1367
Rat I    0.0132   1.0000   2.5555   5.1981
Rat J    0.0026   1.0000   2.7749   9.2751
Rat K    0.0319   0.6130   2.5299   3.7080
Rat L    0.0036   1.0000   2.2478   10.416
Rat M    0.0038   1.0000   2.6368   7.7669

α is the learning rate, γ is the discount factor, and ϵ is the exploration factor. Source data for this table are provided as a Source Data file.
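To illustrate how α and γ act in a Q-learning model, here is a minimal sketch of the standard textbook tabular update, using Rat H's fitted values. It is not the authors' implementation: the state and action names, the reward, and the helper `q_update` are hypothetical, and the table does not specify how ϵ enters action selection, so that step is left out.

```python
def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Apply one tabular Q-learning update and return the prediction error."""
    # Value of the best action available in the next state (0 if terminal).
    next_values = q.get(next_state, {})
    best_next = max(next_values.values()) if next_values else 0.0
    # Reward-prediction (temporal-difference) error.
    td_error = reward + gamma * best_next - q[state][action]
    # Move the current estimate a fraction alpha towards the target.
    q[state][action] += alpha * td_error
    return td_error


# Hypothetical toy task: names and initial values are assumptions.
q = {
    "start": {"left": 0.0, "right": 0.0},
    "arm": {"forward": 0.5},
    "goal": {},
}

# Rat H's fitted learning rate and discount factor from the table.
delta = q_update(q, "start", "left", reward=0.0, next_state="arm",
                 alpha=0.0111, gamma=0.6805)
```

With a small α such as those in the table, each trial shifts the value estimate only slightly, while γ = 1 (Rats I, J, L and M) means future value is not discounted at all in the update target.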