Fig. 5: Correspondence between headwise brain and dependency predictions. | Nature Communications

From: Shared functional specialization in transformer-based language models and the human brain

A Correlation between headwise brain prediction and dependency prediction scores for example ROIs and dependencies: nominal subject (nsubj) in MFG and dmPFC, and direct object (dobj) in AngG and vmPFC (see Fig. S26 for correlations plotted for all ROIs and dependencies). Each point in the scatter plot represents the dependency prediction (x axis) and brain prediction (y axis) scores for one of the 144 heads. Brain prediction scores reflect cross-validated encoding model performance, expressed as a percentage of a noise ceiling estimated using intersubject correlation. Dependency prediction scores reflect the classification accuracy of a cross-validated logistic regression model trained to predict the occurrence of a given linguistic dependency at each TR from the 64-dimensional transformation vector for a given attention head. Each of these plots corresponds to a labeled cell in the dependencies-by-ROI correlation matrix in B. Error bands around the line of best fit represent 95% bootstrapped confidence intervals.

B Correlation between headwise brain prediction and dependency prediction scores for each language ROI and syntactic dependency. Dependencies (y axis) are ordered by their token distance; e.g., the adjectival modifier (amod) spans fewer tokens on average than the clausal complement (ccomp; see “Methods” for details). Cells with black borders contain significant correlations as determined by a two-tailed permutation test in which we shuffle assignments between headwise dependency prediction scores and brain prediction scores across heads (FDR controlled at p < 0.05). Labeled cells correspond to the example correlations in A. Dependencies are described in Table S5.

C We summarize the brain–dependency prediction correspondence for each ROI by averaging across syntactic dependencies (i.e., averaging each column of B). Error bars indicate 95% bootstrap confidence intervals around the mean across N = 12 dependencies. Each data point denotes a dependency, and black borders indicate dependencies with significant correspondence.

Source data are provided as a Source Data file. Figure made using Matplotlib, seaborn, and Inkscape.
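The permutation test described for panel B can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`pearson_r`, `permutation_pvalue`, `benjamini_hochberg`), the use of Pearson correlation, the number of permutations, and the choice of the Benjamini-Hochberg procedure for FDR control are all assumptions made for illustration, and the scores are synthetic stand-ins for the 144 headwise values.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation between two 1-D score vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))


def permutation_pvalue(dep_scores, brain_scores, n_perm=2000, seed=0):
    """Two-tailed permutation test: shuffle which head each brain score
    belongs to, and compare the observed correlation with the null."""
    rng = np.random.default_rng(seed)
    brain_scores = np.asarray(brain_scores, dtype=float)
    observed = pearson_r(dep_scores, brain_scores)
    null = np.empty(n_perm)
    for i in range(n_perm):
        null[i] = pearson_r(dep_scores, rng.permutation(brain_scores))
    # Add-one correction so the p-value is never exactly zero.
    p = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, float(p)


def benjamini_hochberg(pvals, alpha=0.05):
    """Assumed FDR procedure (Benjamini-Hochberg); returns a boolean
    mask of rejected hypotheses with the same shape as `pvals`."""
    p = np.asarray(pvals, dtype=float).ravel()
    m = p.size
    order = np.argsort(p)
    passed = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])  # largest rank passing its threshold
        reject[order[: k + 1]] = True
    return reject.reshape(np.shape(pvals))


if __name__ == "__main__":
    # Synthetic scores for 144 heads (illustrative only, not real data).
    rng = np.random.default_rng(1)
    dep = rng.uniform(0.5, 0.9, size=144)
    brain = 40 * dep + rng.normal(0, 5, size=144)
    r, p = permutation_pvalue(dep, brain)
    print(f"r = {r:.3f}, permutation p = {p:.4f}")
```

In a full analysis along these lines, one p-value would be computed per cell of the dependencies-by-ROI matrix and the whole matrix passed to the FDR step, so that the returned mask marks which cells receive a black border.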
