Fig. 8 | Scientific Data

From: An fMRI dataset in response to large-scale short natural dynamic facial expression videos

Semantic analysis of facial expression videos based on metadata encoding. (a) RSA approach based on metadata: Video metadata is fed to a video-language model to generate a vector embedding. In parallel, we extract brain responses evoked by the facial expression videos within a searchlight disk, producing a vector embedding at each vertex. Pairwise similarities among the video-evoked brain embeddings and among the metadata embeddings are computed separately, each yielding an n_stimuli × n_stimuli RDM. We then correlate the metadata RDM with the vertex-wise RDMs generated through the searchlight analysis. (b) Metadata RDMs: To produce vector embeddings, we fed the metadata for "expressions" and "actions" into a video-language model. The RDMs were computed using cosine distance; their visual representation is shown here. (c) Whole-brain correlation between vertex-wise RDMs and metadata RDMs: For all participants, we computed the Spearman correlation between the metadata RDM and the RDM at each vertex. (d) ROI-based correlation: The correlations averaged across participants within each ROI are shown. We conducted permutation tests to determine whether the Pearson correlation of voxel responses in each brain region is significantly higher than the null distribution. Voxels with a Pearson correlation above 0.0126 are considered accurately predictable, indicating a significant difference from the null distribution (p < 0.05). Error bars depict the 95% confidence interval.
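The RSA pipeline in panel (a) can be sketched as follows. This is a minimal illustration, not the authors' code: the embeddings are random stand-ins for the real video-language-model metadata embeddings and the searchlight brain embeddings, and the dimensions (`n_stimuli`, `emb_dim`) are assumed for demonstration.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_stimuli, emb_dim = 20, 64  # hypothetical sizes, not from the paper

# Stand-ins for the real embeddings:
# metadata embeddings from the video-language model, and
# brain-response embeddings from one searchlight disk.
meta_emb = rng.standard_normal((n_stimuli, emb_dim))
brain_emb = rng.standard_normal((n_stimuli, emb_dim))

# Build n_stimuli x n_stimuli RDMs using cosine distance, as in panel (b).
meta_rdm = squareform(pdist(meta_emb, metric="cosine"))
brain_rdm = squareform(pdist(brain_emb, metric="cosine"))

# Compare the two RDMs with a Spearman correlation over the
# upper triangles (the RDMs are symmetric with a zero diagonal).
iu = np.triu_indices(n_stimuli, k=1)
rho, p_value = spearmanr(meta_rdm[iu], brain_rdm[iu])
print(f"RDM Spearman correlation: {rho:.3f}")
```

In the full searchlight analysis, the `brain_rdm` step is repeated at every vertex, producing the whole-brain correlation map shown in panel (c).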