Figure 4 | Scientific Reports

Figure 4

From: Towards biologically plausible model-based reinforcement learning in recurrent spiking networks by dreaming new experiences

(A) Average reward over 10 realizations as a function of the number of “dreams” after the “awake” phase. Solid black line: median reward at the end of training. Dashed black lines: 20th and 80th percentiles of the reward over the 10 realizations. Dashed gray lines: comparison between the median rewards obtained with “dreaming” and without it (the latter being equivalent to e-prop). (B) Top: three consecutive frames in the environment. Bottom: three consecutive reconstructed frames while dreaming.
