Extended Data Fig. 7: Performance of human participants in all curricula.
From: Shared sensitivity to data distribution during learning in humans and transformer networks

a. Four groups of human participants (Exp. 3, N = 50 per group) were exposed to a composite distribution (Pc = 0.5, αs = 2) under different training curricula (that is, different block orders), denoted C1 to C4 (‘uniform’, α = 0; ‘skewed’, αs = 2). b. Performance during training per curriculum. c. Double learning index per curriculum. n.s. p > 0.05, * p < 0.05, ** p < 0.01, *** p < 0.001. d. Training and test performance for humans per curriculum. A curriculum that promotes in-context learning first and in-weights learning second improves in-context performance without impairing in-weights learning. Small dots show individual participants; large dots show group averages. n.s. p > 0.05, * p < 0.05, ** p < 0.01, *** p < 0.001.