Figure 4

Experimental results on the T-maze task. (a) Separation property of the evolving LSMs, averaged over all individuals in the population. (b) Reversal learning results. Green dots indicate that the agent obtained food, and red dots indicate that it obtained poison. (c) Performance of LSMs under different learning rules. Results for evolved models are averaged over the \(N_{opt}\) individuals; results for unevolved models are averaged over multiple runs. (d) The agent learns to change its behavior under DA regulation. (e) When the rule is reversed, the agent learns to avoid the poison after being punished once.