Extended Data Fig. 12: Similarity score with respect to idealised attention patterns. | Nature Human Behaviour

Extended Data Fig. 12: Similarity score with respect to idealised attention patterns.

From: Shared sensitivity to data distribution during learning in humans and transformer networks

(left) Similarity score between observed attention patterns (N = 10 transformers per training distribution) and idealised attention patterns performing in-context learning. (right) The same score computed against idealised attention patterns performing in-weights learning. The similarity score was the dot product between the observed and idealised patterns, normalised by the ℓ1-norm of the idealised head. Models trained on α < 1 were similar to in-context learning heads, whereas models trained on α > 1 were similar to in-weights learning heads. Results were less clear for in-weights learning head #1 because these heads tended to show more diverse patterns: attention was sometimes spread across all tokens or restricted to a subset of tokens, and most of the time restricted to the last token.
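
The similarity score described in the caption can be illustrated with a minimal sketch. This is not the authors' code: the function name `similarity_score`, the representation of attention patterns as nested lists (rows as query tokens, columns as key tokens), and the example values are all assumptions made for illustration. Only the definition itself (a dot product normalised by the ℓ1-norm of the idealised head) comes from the caption.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def similarity_score(observed: Matrix, idealised: Matrix) -> float:
    """Dot product of two attention maps, divided by the l1-norm of the idealised map.

    Assumption: both maps are same-shaped 2D arrays (query tokens x key tokens).
    """
    same_rows = len(observed) == len(idealised)
    same_cols = all(len(o) == len(i) for o, i in zip(observed, idealised))
    if not (same_rows and same_cols):
        raise ValueError("attention maps must have the same shape")

    # Element-wise product summed over all (query, key) positions.
    dot = sum(
        o * i
        for row_o, row_i in zip(observed, idealised)
        for o, i in zip(row_o, row_i)
    )
    # l1-norm of the idealised head: sum of absolute entries.
    l1 = sum(abs(i) for row in idealised for i in row)
    if l1 == 0:
        raise ValueError("idealised map must have a non-zero l1-norm")
    return dot / l1


# Hypothetical idealised head: the last query token attends only to token 1.
idealised_head = [
    [0, 0, 0],
    [0, 0, 0],
    [0, 1, 0],
]
# Hypothetical observed attention pattern (each row sums to 1).
observed_head = [
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.1, 0.8, 0.1],
]

score = similarity_score(observed_head, idealised_head)
```

Under this definition, an observed pattern that places all of its attention mass on the positions marked by a binary idealised head scores 1, and a pattern that places none there scores 0. In the hypothetical example, the score is the share of the last token's attention that falls on the idealised position.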
