Abstract
We perceive emotions daily through facial expressions, often accompanied by body postures that provide additional emotional context. Congruent facial and bodily expressions (conveying the same emotion) enhance emotion recognition compared to incongruent ones, suggesting an interaction between these channels. Although behavioral evidence suggests that this integration occurs automatically, its underlying neural mechanisms remain unclear. This study investigated the automaticity of the integration of facial and bodily expressions by manipulating cognitive load. Twenty-eight participants completed an emotion recognition task with congruent or incongruent facial and bodily expressions while performing a memory task under low or high cognitive load. EEG recordings captured brain activity, and emotion recognition accuracy and reaction times were measured. Results revealed that congruent expressions improved recognition, with bodily expressions exerting a stronger influence on facial expression recognition than vice versa. Early neural responses (P100, N100, P250, N250) were stronger when attention was focused on facial expressions, while later responses reflected attention to bodily expressions. Bayesian analyses provided evidence for the absence of an interaction between congruence and cognitive load, supporting the automaticity of integration. These findings suggest that emotional expressions are integrated automatically, independently of cognitive resources, and emphasize the stronger influence of bodily over facial expressions in shaping emotional perception.
Introduction
In everyday life, we primarily perceive emotions through facial expressions, which serve as rapid and easily recognizable indicators of a person’s emotional state1,2. However, facial expressions are sometimes ambiguous, atypical, or difficult to identify, particularly when emotions are concealed because of social norms or involve complex feelings such as embarrassment or shame. For example, a smile may conceal underlying nervousness, complicating the accurate recognition of emotions based solely on facial cues. In such situations, additional emotional context becomes essential. Surrounding cues, such as body language3, background scenes4, vocal tone5, and even olfactory cues6, provide critical information that enhances the accurate recognition of emotions conveyed through facial expressions.
Emotions are typically perceived through multiple sensory channels simultaneously, which facilitates emotion recognition, particularly when these cues are congruent (i.e., expressing the same emotion)7. For example, in a horror movie, visual and auditory fear cues combine to amplify the emotional experience. In contrast, pairing a horror scene with cheerful or absent music can make it harder to perceive the fear conveyed in the film. Consistent with these observations, numerous studies have shown that congruent audio-visual cues improve emotion recognition, while incongruent cues make it more difficult6,8,9,10. This interaction also occurs within a single sensory modality, such as vision, where multiple emotional cues may be perceived simultaneously. For instance, facial expressions often appear alongside bodily expressions11,12, emotional background scenes (e.g., a disgusting context such as garbage)13,14, or a combination of all these cues15. When facial and bodily expressions are combined, facial emotion recognition is more accurate and faster when these cues are congruent, while incongruent bodily cues bias facial expression recognition12,15,16,17. Other studies have manipulated participants’ attentional focus by instructing them to attend either to the face to recognize the expressed emotion (face focus condition) or to the body posture to recognize the emotion conveyed (body focus condition). These studies have shown a bidirectional interaction between the two emotional channels: just as bodily expressions influence the recognition of facial emotions, facial expressions also affect the recognition of bodily emotions11,18. However, facial expressions seem to have a weaker influence on the recognition of bodily expressions than bodily expressions have on the recognition of facial expressions. According to Lecker et al. (2020), this asymmetry likely arises because individuals are naturally inclined to focus on facial expressions as the primary channel of emotional information11. In everyday interactions, people tend to rely on the face to recognize the emotions of others, while bodily expressions are typically processed as contextual cues. This default attentional bias facilitates the emergence of contextual effects when facial and bodily cues are combined. However, when participants are instructed to focus specifically on body posture, this strategy is less intuitive. As a result, bodily expressions are more likely to be interpreted in isolation, without being integrated into the broader emotional context11. These interactions, often described as the congruency effect11,12 or the contextual effect11, highlight the critical role of facial and bodily expression integration in shaping emotion recognition.
In the literature, it remains unclear whether the integration of facial and bodily expressions occurs automatically or requires voluntary effort. While some studies suggest automatic processing of emotional signal integration16,19, others show that focusing attention on either facial or bodily expressions influences this integration, contradicting the hypothesis of automaticity11. However, the lack of a clear definition of automaticity makes this concept difficult to investigate, and any definitive conclusion is hard to reach because automaticity probably involves multiple aspects. In this regard, three criteria have been proposed to evaluate the automaticity of cognitive processes: efficiency, unconsciousness, and speed20. While the criterion of speed has been less extensively explored, one study manipulated it and suggested that even when participants had very limited time to respond and were exposed to stimuli for only a brief moment, bodily expressions still influenced the recognition of facial expressions, suggesting that the integration of facial and bodily expressions occurs rapidly17. Additionally, the unconsciousness criterion evaluates whether a process operates without requiring attention or awareness20. Evidence suggests that even when facial and bodily expressions are presented subliminally (e.g., for 33 ms), bodily expressions influence the perception of facial expressions, indicating that this integration occurs outside conscious awareness19. The criterion of efficiency evaluates whether a process requires minimal cognitive or attentional resources20. This is often tested using a dual-task protocol, in which participants perform a primary task while simultaneously engaging in a secondary task that demands cognitive resources (e.g., a digit memorization task or a target detection task). If performance on the primary task remains unaffected by the cognitive load imposed by the secondary task, the process is considered efficient20. To date, only one study has assessed the efficiency of integrating facial and bodily expressions16. Participants were asked to recognize emotional facial expressions that were either congruent or incongruent with bodily expressions while performing a secondary task involving memorization of either a complex sequence (e.g., 183K65) or a simpler one (only the letter of that sequence, e.g., K). By comparing high and low cognitive load conditions, the researchers evaluated whether the contextual effect of bodily expressions on facial expression recognition was influenced by cognitive load. The findings showed no impact of cognitive load on this effect, indicating that the integration process is efficient16. Overall, these results suggest that the integration of facial and bodily expressions likely involves automatic processing, supported by evidence for both unconsciousness and efficiency.
Using electrophysiological methods, the temporal dynamics of emotional stimulus processing can be examined through event-related potentials (ERPs). Evidence suggests that the integration of facial and bodily expressions unfolds in three stages3. The first stage involves the automatic and rapid processing of bodily expressions, with a negativity bias favoring threatening stimuli. This is reflected by an increase in the amplitude of the P100, a positive component localized at occipital electrode sites and known to be sensitive to visual stimulation and its psychophysical parameters (such as color, visual field position, contrast, and spatial frequencies), when fearful bodily expressions are perceived compared to happy ones during combined face-body perception, regardless of whether attention is focused on the face or on the body3. This early response suggests that bodily expressions are processed automatically, even without visual awareness. In contrast, the authors did not observe significant differences in P100 amplitude between fearful and happy facial expressions, suggesting that the negativity bias operates differently for facial and bodily emotional cues3. The second stage is characterized by the rapid detection of incongruence between facial and bodily expressions3. This process is reflected in an increased amplitude of the N200, a negative component localized at fronto-central electrode sites and known to be sensitive to conflict detection21, for incongruent stimuli compared to congruent ones3,22. Conflict detection can occur as early as 100 ms, as indicated by an increase in P100 amplitude in incongruent conditions17,23. These findings indicate that facial and bodily expressions are integrated quickly and automatically17. The final stage involves selective attention and detailed evaluation of emotional stimuli. At this stage, processing depends on the focus of attention. When attention is directed toward facial expressions, P300 amplitude is greater for congruent conditions than for incongruent ones3,23. Conversely, when attention is directed toward bodily expressions, the amplitude of the P300, a positive component localized at centro-parietal electrode sites and sensitive to the probability of occurrence of stimuli24 as well as to their affective congruence and emotional valence25, is higher for fearful expressions than for happy ones, regardless of congruence3. This suggests that attentional focus modulates the integration of facial and bodily expressions during later stages of processing3,11. In summary, the simultaneous perception of facial and bodily expressions leads to distinct early emotional processing compared to isolated facial25,26 or bodily expressions27. Bodily expressions appear to dominate early processing when both cues are presented together3. Additionally, the rapid detection of incongruence highlights the automatic integration of facial and bodily expressions3,17,22,23. At later stages, attentional focus further modulates emotional processing, as reflected in P300 variations3.
All of these studies have highlighted the integration of facial and bodily expressions both behaviorally and at the neural level. However, it remains uncertain whether this integration meets the efficiency criterion for automatic processing, particularly in relation to brain activity20. While Aviezer et al. (2011) suggest automatic integration by manipulating cognitive load at the behavioral level, no study has yet investigated this process using ERP components to examine the time course of integration. Furthermore, most studies in this field are subject to several limitations: (1) First, the use of low-quality stimuli restricts the ecological validity of their findings and does not allow for a realistic assessment of emotional perception. (2) Second, their conclusions regarding the absence of significant effects rely on frequentist analyses, which are not suitable for confirming the absence of effects; in contrast, Bayesian analyses can estimate the likelihood of an effect occurring28,29,30,31. (3) Third, many studies frame bodily expressions only as an emotional context that influences the recognition of facial emotions17,22, often overlooking the bidirectional nature of this contextual effect. Specifically, the influence of facial expressions on the recognition of bodily emotions remains understudied3,11,32.
The objective of this study is to investigate the automaticity of integrating facial and bodily expressions, with a focus on the efficiency criterion20. To achieve this, participants will perform a task involving the recognition of either facial or bodily expressions (i.e., face focus vs. body focus conditions) while simultaneously engaging in a digit memorization task (i.e., memorization of seven different or identical digits). The manipulation of attentional focus on either facial or bodily expressions is used to observe the bidirectionality of the contextual effect11 during the manipulation of the automaticity criterion (high or low cognitive load); attentional focus itself does not represent a criterion of automaticity. The study will use ERP measures, high-quality congruent and incongruent facial and bodily expression stimuli33, and Bayesian statistical analyses to address gaps identified in previous research. Three hypotheses will be tested using both behavioral and EEG measures.
Hypothesis 1
We expect a classical contextual effect between facial and bodily cues, i.e., facilitated emotion recognition in the congruent condition and hindered recognition in the incongruent condition. At the behavioral level, we should observe higher accuracy and faster response times for emotionally congruent pairs compared to incongruent ones3,11. At the brain level, we expect incongruent expressions to elicit larger amplitudes in early ERP components (P100, N100, P200, N200), reflecting early detection of conflict3,17.
Hypothesis 2
We hypothesize that attentional focus/task instructions will modulate the contextual effect between facial and bodily cues11. We expect the influence of facial expressions on the recognition of bodily expressions (in the body focus condition) to be less pronounced than the influence of bodily expressions on the recognition of facial emotions (in the face focus condition). In terms of ERP components, we expect that when attention is directed to the face, congruent expressions may be associated with enhanced P300 amplitudes, consistent with facilitated extraction of emotional information and more efficient allocation of sustained attention3,23.
Hypothesis 3
We hypothesize that cognitive load will not influence the other factors16, suggesting the automaticity of these processes. This should yield strong evidence in favor of the null hypothesis, i.e., no differences in the contextual effect with respect to cognitive load for behavioral measures (recognition accuracy and reaction times) and brain measures (amplitudes of the ERPs of interest: P100, N100, P200, N200).
Methods
Participants
The sample size for this study was estimated using the Superpower package (version 0.2.0; https://CRAN.R-project.org/package=Superpower; power = 0.95, standard deviation = 2) in R (version 4.3.0)34. Thirty-five right-handed participants (25 females; Mage = 26 +/- 5 years; range 18–35 years) were recruited between June 2023 and November 2023 through social networks and a dedicated university platform for advertising experiments. Each participant received CAD 30 as compensation for a three-hour session. The inclusion criterion was an age between 18 and 35 years. Exclusion criteria included neurological disorders (e.g., traumatic brain injury), diagnosed psychological conditions (e.g., psychosis), uncorrected visual or auditory impairments, and the use of medications or substances (e.g., drugs) affecting the nervous system.
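For illustration, the sketch below shows a simulation-based power check for a single within-subject contrast in Python. It is not the Superpower procedure the authors used, and the assumed effect (a 0.06 accuracy advantage for congruent trials, SD = 0.08) and the sample sizes tested are placeholder values chosen for demonstration only.

```python
# Minimal simulation-based power sketch (placeholder parameters, not the
# authors' Superpower analysis).
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(42)

def simulated_power(n_participants, mean_diff=0.06, sd=0.08,
                    n_sims=5000, alpha=0.05):
    """Proportion of simulated experiments detecting the assumed congruence effect."""
    hits = 0
    for _ in range(n_sims):
        congruent = rng.normal(0.90, sd, n_participants)
        incongruent = congruent - rng.normal(mean_diff, sd, n_participants)
        _, p = ttest_rel(congruent, incongruent)
        hits += p < alpha
    return hits / n_sims

for n in (20, 28, 35):
    print(f"n = {n}: estimated power = {simulated_power(n):.2f}")
```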
Participants completed the Hospital Anxiety and Depression Scale (HADS; mean anxiety score = 6.77 +/- 3.3; mean depression score = 2.94 +/- 1.97) to assess mood35. This screening was conducted because anxiety and depression tendencies influence emotion processing36. Two participants were excluded due to high HADS anxiety scores (17 and 12, respectively)35.
Demographic information, including date of birth, gender, dominant hand, native language, and education level, was also collected (see https://osf.io/5at6g/ for details). All participants provided their informed consent. Moreover, the study was approved by the UQTR ethics committee (certificate number: CERPPE-22-06-07.03) and all methods were performed in accordance with the relevant guidelines and regulations.
Materials
Stimuli were sourced from the “Validation of the Emotionally Congruent and Incongruent Face-Body Static Set” (ECIFBSS)33. This dataset includes 1952 images of facial and bodily expressions (49 images/actors) presented in both congruent and incongruent situations. For this study, we selected 84 stimuli: 42 congruent (14 images per emotion: happiness, anger, and sadness) and 42 incongruent (seven images for each incongruent combination of happiness, anger, and sadness). We selected images so that facial and bodily expressions had relatively high and similar average recognition accuracy (94% for facial expressions and 92% for bodily expressions; OSF; https://osf.io/5at6g/). Anger, happiness, and sadness were selected because they are commonly employed in studies on contextual effects between facial and bodily cues and have been shown to elicit such effects18,37.
The ECIFBSS provided the evaluation of intensity, valence, and arousal (Table 1). Multiple ANOVAs were performed to explore differences in these emotional dimensions (valence, arousal, and intensity) as a function of emotion (happiness, sadness, and anger), congruence (congruent vs. incongruent), and mode of expression (facial vs. bodily expressions).
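As an illustration of this type of item-level analysis, the Python sketch below runs a three-way ANOVA (emotion × congruence × mode of expression) on a toy set of valence ratings with statsmodels. The paper does not state which software was used for these stimulus-rating analyses, and the data frame and values below are invented purely for demonstration.

```python
# Toy three-way ANOVA on item-level ratings (illustrative only; not the
# authors' analysis of the ECIFBSS norms).
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(0)
ratings = pd.DataFrame({
    "emotion":    np.repeat(["happiness", "anger", "sadness"], 28),
    "congruence": np.tile(np.repeat(["congruent", "incongruent"], 14), 3),
    "mode":       np.tile(np.repeat(["face", "body"], 7), 6),
    "valence":    rng.normal(5, 1, 84),   # placeholder ratings
})

model = ols("valence ~ C(emotion) * C(congruence) * C(mode)", data=ratings).fit()
print(sm.stats.anova_lm(model, typ=2))   # Type II ANOVA table
```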
In terms of valence, the ANOVA revealed a significant main effect of emotion (F (2,156) = 948.348, p < 0.001, η2 = 0.918), and a three-way interaction between emotion, mode of expression, and congruence (F (2,156) = 3.346, p = 0.038, η2 = 0.003). Bonferroni-adjusted paired t-tests showed that happiness was perceived as more positive than anger (t (156) = -39.411, p < 0.001, d = -7.448, CI [-3.255, -2.886]) and sadness (t (156) = 35.755, p < 0.001, d = 6.757, CI [2.601, 2.970]). Given the significant three-way interaction between emotion, congruence, and mode of expression, follow-up ANOVAs were conducted separately for each emotion to further examine the interaction between congruence and mode of expression. This analysis for sadness showed a significant main effect of mode of expression (F (1,52) = 4.106, p = 0.048, η2 = 0.069). Bonferroni-adjusted paired t-tests showed that sad faces were perceived as more negative than sad bodies (t (52) = 2.026, p = 0.048, d = 0.542, CI [0.002, 0.315]). No other significant effects were found (ps > 0.05).
In terms of arousal, the ANOVA showed a significant main effect of emotion (F (2,156) = 108.979, p < 0.001, η2 = 0.552) as well as a Congruence * Mode of expression interaction (F (1,156) = 9.668, p = 0.002, η2 = 0.024). Bonferroni-adjusted paired t-tests showed that happiness (t (156) = 14.239, p < 0.001, d = 2.691, CI [0.952, 1.332]) and anger (t (156) = 10.496, p < 0.001, d = 1.984, CI [0.652, 1.031]) were perceived as more arousing than sadness. Simple effects analysis also showed a significant effect of mode of expression during incongruent condition (F (1,156) = 9.714, p = 0.002) and a significant effect of congruence when emotions were expressed by the body (F (1,156) = 4.902, p = 0.028) and by the face (F (1,156) = 4.766, p = 0.031). No other significant effects were found (ps > 0.05).
In terms of intensity, the ANOVA revealed significant main effects of emotion (F (2,156) = 76.604, p < 0.001, η2 = 0.459) and mode of expression (F (1,156) = 8.667, p = 0.004, η2 = 0.026), as well as a Congruence * Mode of expression interaction (F (1,156) = 7.716, p = 0.006, η2 = 0.023). Bonferroni-adjusted paired t-tests showed that happiness was rated as more intense than anger (t (156) = -5.464, p < 0.001, d = -1.033, CI [-1.277, -0.505]), and anger more intense than sadness (t (156) = 6.886, p < 0.001, d = 1.301, CI [0.737, 1.509]). Moreover, bodily expressions were rated as more intense than facial expressions (t (156) = 2.944, p = 0.004, d = 0.454, CI [0.129, 0.665]). Simple effects analysis also showed a significant effect of mode of expression in the incongruent condition (F (1,156) = 16.369, p < 0.001), and a significant effect of congruence when emotions were expressed by the body (F (1,156) = 5.182, p = 0.024). No other significant effects were found (ps > 0.05).
Given that the brightness and contrast of images influence ERP component amplitudes38, we controlled these variables across stimuli (OSF; https://osf.io/5at6g/). Brightness and contrast values were measured using ImageJ software. The mean brightness was 232.67 (+/- 2.13) for congruent conditions and 232.03 (+/- 2.13) for incongruent conditions. The mean contrast was 62 (+/- 3.9) for congruent conditions and 63.13 (+/- 3.33) for incongruent conditions. Two one-way ANOVAs were conducted to examine potential differences in brightness and contrast between congruent and incongruent conditions. Results revealed no significant effect of congruence on either brightness or contrast (ps > 0.154).
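A minimal Python sketch of this kind of check is shown below. The authors used ImageJ; here the mean grey level stands in for brightness and its standard deviation for contrast, and toy arrays replace the actual stimuli (real images would be loaded, e.g., with PIL).

```python
# Illustrative brightness/contrast check on toy greyscale "images".
import numpy as np
from scipy.stats import f_oneway
# from PIL import Image  # real stimuli: np.asarray(Image.open(path).convert("L"))

def brightness_contrast(grey):
    """Mean grey level as brightness, its standard deviation as contrast."""
    grey = np.asarray(grey, dtype=float)
    return grey.mean(), grey.std()

rng = np.random.default_rng(3)
congruent = [rng.normal(232, 60, (600, 400)).clip(0, 255) for _ in range(5)]
incongruent = [rng.normal(232, 60, (600, 400)).clip(0, 255) for _ in range(5)]

cong = np.array([brightness_contrast(img) for img in congruent])
incong = np.array([brightness_contrast(img) for img in incongruent])

for name, col in (("brightness", 0), ("contrast", 1)):
    F, p = f_oneway(cong[:, col], incong[:, col])   # one-way ANOVA per measure
    print(f"{name}: F = {F:.2f}, p = {p:.3f}")
```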
Finally, we used 12 sequences of different digits and 12 sequences of identical digits, each consisting of seven digits. Two parameters were enforced during sequence generation: (1) sequences containing consecutive identical digits (e.g., 4497425) were excluded, and (2) ascending (e.g., 1234567) or descending (e.g., 7654321) sequences were avoided (OSF; https://osf.io/5at6g/). Given that this method differs slightly from previous studies2,16,39, we conducted several pilot tests to ensure that our methodology effectively manipulated participants’ cognitive load. These pilot tests helped us determine the best approach for our study.
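A minimal Python sketch of a generator satisfying these two constraints is given below; the exact procedure used by the authors is not reported, so this is only one possible implementation.

```python
# One possible generator for seven-digit memory sequences: no two identical
# digits in a row and no strictly ascending/descending runs.
import random

def make_sequence(length=7, rng=random.Random(1)):
    while True:
        digits = [rng.randint(0, 9) for _ in range(length)]
        no_repeats = all(a != b for a, b in zip(digits, digits[1:]))
        steps = [b - a for a, b in zip(digits, digits[1:])]
        monotone = all(s == 1 for s in steps) or all(s == -1 for s in steps)
        if no_repeats and not monotone:
            return "".join(map(str, digits))

high_load = [make_sequence() for _ in range(12)]   # e.g. '4916274'
low_load = [str(d) * 7 for d in range(10)]         # e.g. '1111111'
print(high_load[:3], low_load[:3])
```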
Experimental design
The experimental design followed a within-subject approach, manipulating four independent variables: attentional focus (on facial or bodily expressions), cognitive load (high or low), congruency (congruent or incongruent facial and bodily expressions), and type of emotional expression (happiness, sadness, and anger). The study consisted of 672 trials in total, evenly divided into 336 trials per attentional focus condition (facial or bodily expressions). Within each attentional focus condition, there were 168 trials per congruency condition (congruent and incongruent), 168 trials per cognitive load condition (high and low), 28 trials per emotion (happiness, sadness, or anger) in congruent situations, and 14 trials per pair of incongruent emotions (e.g., an angry face with a sad body).
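The Python sketch below reconstructs these trial counts. Reading the 28 congruent trials per emotion and 14 trials per incongruent pair as counts within each cognitive-load condition is an assumption made here so that the totals add up to 672.

```python
# Reconstruction of the trial counts under the stated assumption.
from itertools import product

emotions = ["happiness", "sadness", "anger"]
incongruent_pairs = [(f, b) for f, b in product(emotions, emotions) if f != b]

trials = []
for focus in ("face", "body"):
    for load in ("low", "high"):
        for emo in emotions:                      # congruent trials
            trials += [(focus, load, "congruent", emo)] * 28
        for pair in incongruent_pairs:            # incongruent trials
            trials += [(focus, load, "incongruent", pair)] * 14

print(len(trials))                                # 672 trials in total
print(sum(t[0] == "face" for t in trials))        # 336 per focus condition
print(sum(t[2] == "congruent" for t in trials))   # 336 congruent (168 per focus)
```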
General procedure
Prior to the experiment, participants completed three online questionnaires via LimeSurvey software (version 6.15.2; https://www.limesurvey.org): a socio-demographic questionnaire, the HADS35, and an informed consent and information questionnaire.
The experiment lasted approximately three hours, including one hour for the task and two hours for electroencephalography (EEG) setup, impedance checks, and signal quality control. It was conducted in a dedicated room equipped with an EEG system at the University of Quebec at Trois-Rivières (UQTR). Participants were seated comfortably inside a Faraday cage, 60 cm from a computer screen. The experimental task was programmed using E-Prime 2.0 software (version 2.0; https://pstnet.com/products/e-prime/).
The task began with on-screen instructions, followed by two identical parts, counterbalanced in order across participants. In one part, participants focused on recognizing facial expressions; in the other, they focused on bodily expressions. Each part started with a training session of two blocks, followed by an experimental session of 48 randomized blocks. At the beginning of each block, participants were presented with a sequence of seven digits to memorize, displayed for 3000ms at the center of the screen (“Courier New” font; bold; point size 25). These sequences consisted of either different digits (e.g., 4916274; high cognitive load) or identical digits (e.g., 1111111; low cognitive load). Following this, seven congruent and incongruent images were shown consecutively for 1500ms each, each preceded by a 500ms fixation cross. Participants were instructed to identify the emotion expressed in either the facial or the bodily expression based on their first impression; there was no response deadline, but they were asked to respond as quickly as possible by choosing among five emotion labels: the three target emotions (anger, happiness, and sadness) plus two additional options, disgust and surprise. Disgust was included because it is often confused with anger in incongruent face-body pairings40, which can amplify contextual effects. Surprise was included because it is often misperceived as happiness in both facial and bodily expressions33, thereby also potentially amplifying the contextual effect between these emotions. Given that surprise is an ambiguous emotion that can be evaluated as positive or negative, this option might introduce variability and potentially affect the experimental outcomes. However, we believe this influence would be minimal, given that surprise was not among our target emotions but only a supplementary response option. After completing the emotion recognition task for all seven images in the block, participants transcribed the digit sequence they had memorized at the beginning of the block. EEG and behavioral data (accuracy and response time) were recorded simultaneously throughout the experiment (Fig. 1).
Each block consisted of the presentation of the digit sequence, followed, on each trial, by a fixation cross, a picture of emotionally congruent or incongruent facial and bodily expressions, and a prompt for emotion identification. After every seven trials, participants wrote down the digits they had memorized.
Statistical analysis
The behavioral and EEG data were analyzed using MATLAB (R2021a; https://www.mathworks.com/products/matlab.html), JASP (version 0.18.1; RRID: SCR_015823; https://jasp-stats.org/), and R (version 4.3.0; https://www.r-project.org/). The R packages ggplot2 (version 3.4.2; RRID: SCR_014601; https://ggplot2.tidyverse.org/)41, tidyverse (version 1.2.0; RRID: SCR_019186; https://www.tidyverse.org/)42, dplyr (version 1.1.2; RRID: SCR_017102; https://dplyr.tidyverse.org/)43, and emmeans (version 1.8.7; RRID: SCR_018734; https://cran.r-project.org/package=emmeans)44 were used. The analysis scripts are available on the OSF platform (https://osf.io/5at6g/).
Behavioral analysis
First, we analyzed data from the memorization task. A paired t-test and a Bayesian t-test were performed on digit memorization scores to examine differences between cognitive load levels. Additionally, a Pearson correlation was conducted between digit memorization and emotion recognition accuracy to assess their association. Further analyses were conducted only on trials for which participants achieved a 100% digit memorization rate; trials with incorrect digit memorization were excluded.
Second, we analyzed the emotion recognition task. Behavioral data were assessed using two metrics: the accuracy of categorization and the response time. For accuracy, we calculated unbiased hit rates (Hu scores)45 for emotion recognition. This metric adjusts for potential response biases that could inflate accuracy scores. For instance, a participant who categorizes all facial expressions as “anger” may achieve a high accuracy score for anger, but this does not indicate true recognition. The Hu score corrects this bias by dividing the squared frequency of correct responses for a target emotion by the product of the total number of stimuli representing that emotion and the overall frequency with which that emotion category was selected45. To identify and exclude outliers in both Hu scores and response times, we used the median absolute deviation (MAD) function in R46,47.
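A minimal Python sketch of the Hu score computation and of a MAD-based outlier cutoff is shown below. The confusion matrix, the toy reaction times, and the 3-MAD threshold are illustrative assumptions (the paper does not report the exact cutoff used), and extra response options such as disgust and surprise would simply add columns to the matrix.

```python
# Unbiased hit rate (Hu) from a confusion matrix of counts
# (rows = presented emotion, columns = chosen emotion); toy values only.
import numpy as np
from scipy.stats import median_abs_deviation

confusion = np.array([
    # anger  happiness  sadness   (responses)
    [  24,       2,        2],    # anger presented
    [   1,      26,        1],    # happiness presented
    [   3,       1,       24],    # sadness presented
])

hits = np.diag(confusion).astype(float)
n_presented = confusion.sum(axis=1)      # stimuli per target emotion
n_chosen = confusion.sum(axis=0)         # how often each label was used
hu = hits ** 2 / (n_presented * n_chosen)
print(dict(zip(["anger", "happiness", "sadness"], hu.round(3))))

# MAD-style outlier screening: values more than 3 scaled MADs from the median
# are flagged (3 is a common cutoff, assumed here for illustration).
rts = np.array([620, 655, 700, 640, 2100, 610, 690])   # toy reaction times (ms)
mad = median_abs_deviation(rts, scale="normal")
outliers = np.abs(rts - np.median(rts)) > 3 * mad
print(rts[outliers])                                    # -> [2100]
```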
We conducted classical repeated measures ANOVAs on the arcsine-transformed Hu scores, as recommended in the literature45,48, and on response times, with three factors: congruency, attentional focus, and cognitive load. Since classical repeated measures ANOVAs do not allow direct support for the null hypothesis (i.e., no difference between conditions), we complemented them with repeated measures Bayesian ANOVAs using the same factors. For the Bayesian analyses, we used the default prior options for effects (r = 0.5 for fixed effects), as recommended29,31. The Bayesian ANOVAs compare a null model with alternative models that include one or more factors and their interactions. JASP computes a Bayes factor (BF10) for each model, quantifying the strength of evidence in favor of that model relative to the null model or the best-performing model30. In other words, BF10 indicates how much more likely the observed data are under an alternative model than under the null model. Evidence for the null hypothesis is indicated by BF10 values between 1 and 1/3 (weak evidence), between 1/3 and 1/10 (moderate evidence), and below 1/10 (strong evidence). Conversely, BF10 values between 1 and 3 (weak evidence), between 3 and 10 (moderate evidence), and above 10 (strong evidence) support the alternative hypothesis28,31. For example, a BF10 of 30 indicates that the data are thirty times more likely under the alternative model than under the null model29. Additionally, we calculated inclusion Bayes factors (BFinclusion) for each factor (congruency, focus, and cognitive load) to quantify the evidence for including these factors and their interactions across the set of models. This approach allowed us to assess the predictive value of each factor in the dataset29.
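The evidence categories used above can be summarized in a small helper; the Python sketch below simply restates these cut-offs and is not part of the JASP analysis pipeline.

```python
# Restating the Bayes-factor interpretation grid (cut-offs at 1/10, 1/3, 1, 3, 10).
def interpret_bf10(bf10: float) -> str:
    if bf10 < 1 / 10:
        return "strong evidence for the null hypothesis"
    if bf10 < 1 / 3:
        return "moderate evidence for the null hypothesis"
    if bf10 < 1:
        return "weak evidence for the null hypothesis"
    if bf10 < 3:
        return "weak evidence for the alternative hypothesis"
    if bf10 < 10:
        return "moderate evidence for the alternative hypothesis"
    return "strong evidence for the alternative hypothesis"

print(interpret_bf10(30))     # strong evidence for the alternative hypothesis
print(interpret_bf10(0.116))  # moderate evidence for the null hypothesis
```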
ERP recording, preprocessing, and analysis
BrainVision Recorder software (version 1.27.0001; https://www.brainproducts.com/downloads/recorder/) was used to record EEG data. EEG activity was recorded from 64 scalp electrodes, with Fpz serving as the ground electrode. Data were sampled at 500 Hz per channel for offline analysis. Additionally, the vertical electrooculogram (VEOG) and horizontal electrooculogram (HEOG) were recorded using three electrodes: below the left eye (Fp2), and on the left (FT9) and right (FT10) sides of the eyes. Two electrodes (TP9 and TP10) were placed on the earlobes as reference electrodes. All electrode impedances were kept below 25 kΩ.
Data pre-processing was performed using the EEGLAB (version 2024.2.1; https://eeglab.org/others/How_to_download_EEGLAB.html)49 and ERPLAB (version 12.01; https://github.com/ucdavis/erplab/releases/tag/12.01)50 toolboxes in MATLAB, following the pipeline outlined by Lopez-Calderon and Luck50. The data were re-referenced offline to the average reference and filtered with a 0.1 Hz high-pass filter. Independent component analysis (ICA) was performed by the first author (ASP) using AMICA51. The signals were segmented into epochs ([-200ms; 700ms]) around the target stimuli. A low-pass filter with a cutoff at 30 Hz was applied, followed by two rounds of artifact detection. The first round used the “moving window peak-to-peak threshold” method from ERPLAB, with a threshold of 100 µV, a window size of 200ms, and a window step of 100ms. The second round targeted eye movements at electrodes 5, 25, and 30, using the “step-like artifacts” detection method, with a threshold of 30 µV, a window size of 200ms, and a window step of 50ms. After artifact rejection, an average of 96.5% of the data was retained for further analysis (face focus: 97.57%; body focus: 95.46%; congruent: 96.72%; incongruent: 96.42%; high cognitive load: 96.53%; low cognitive load: 96.52%). As in the behavioral analyses, EEG analyses were conducted only on trials in which participants achieved 100% digit memorization; trials with incorrect digit memorization were excluded. On average, 77.71% of the data (M = 539.5 trials) was retained for analysis (face focus: 77.91% (M = 268.25 trials); body focus: 77.1% (M = 271.25 trials); congruent: 78.35% (M = 271.46 trials); incongruent: 77.14% (M = 268.04 trials); high cognitive load: 65.64% (M = 227.25 trials); low cognitive load: 89.91% (M = 312.25 trials)).
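The authors' pipeline was implemented in MATLAB with EEGLAB/ERPLAB; purely for illustration, the MNE-Python sketch below mirrors its main steps (0.1 Hz high-pass, average reference, [-200, 700] ms epochs, 30 Hz low-pass, 100 µV peak-to-peak rejection). The file name and event labels are hypothetical, the ICA step is omitted, and MNE's whole-epoch rejection only approximates ERPLAB's moving-window criterion.

```python
# Illustrative MNE-Python analogue of the preprocessing steps (not the
# authors' EEGLAB/ERPLAB pipeline).
import mne

raw = mne.io.read_raw_brainvision("sub-01.vhdr", preload=True)  # hypothetical file
raw.filter(l_freq=0.1, h_freq=None)          # 0.1 Hz high-pass
raw.set_eeg_reference("average")             # offline average reference

events, event_id = mne.events_from_annotations(raw)
epochs = mne.Epochs(
    raw, events, event_id=event_id,
    tmin=-0.2, tmax=0.7, baseline=(None, 0),
    reject=dict(eeg=100e-6),                 # whole-epoch peak-to-peak rejection
    preload=True,
)
epochs.filter(l_freq=None, h_freq=30.0)      # 30 Hz low-pass on the epochs

# Condition-wise averages (condition labels are placeholders):
# evoked_congruent = epochs["congruent"].average()
```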
The Factorial Mass Univariate Toolbox (FMUT; version 0.5.1; RRID: SCR_018734; https://github.com/ericcfields/FMUT/releases) was used to analyze the EEG data52. This analysis involves iteratively conducting thousands of inferential statistical tests and applying multiple comparison corrections across all electrodes within a temporal window of interest52,53,54,55,56. Mass univariate analyses provide an exploratory approach to identifying effects without requiring a priori assumptions and offer greater power for detecting effects compared to traditional spatiotemporal averaging methods52,57.
We first conducted an exploratory mass univariate analysis on all electrodes and time points from -100ms pre-stimulus to the end of the epoch (700ms). Based on this exploratory analysis and previous work3,17,58,59, we identified two early time windows of interest at fronto-central sites (F1, F2, F3, F5, F6, FC1, FC2, FC3, FC5, Fz, and FCz), corresponding to the N100 (70-140ms) and N250 (150-318ms), and two at parieto-occipital sites (PO7, PO8, PO4, PO3, P5, P6, P7, P8, O1, O2, and Oz), corresponding to the P100 (70-140ms) and P250 (150-318ms). We also identified one later time window (200-700ms) at centro-parietal (Cz, C1, C2, C3, C4, CPz, CP1, CP2, CP3, CP4, Pz, P1, P2, P3, and P4) and occipital sites (O1, O2, and Oz).
Mass univariate analysis used an alpha level of 0.05 and was performed with three within-subjects factors: cognitive load, focus, and congruence. In FMUT, ANOVAs were conducted using 100,000 permutations for each data point, with correction for multiple comparisons via permutation test, as recommended52,53,54. Finally, we also conducted repeated-measures Bayesian ANOVAs on the amplitude of components averaged over the same time windows and electrode sites as the FMUT analysis, using the three factors: cognitive load, focus, and congruence.
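To convey the logic of this permutation-based correction, the toy Python sketch below runs a paired test at every electrode × time point for a single congruence contrast and controls the family-wise error rate with the permutation distribution of the maximum statistic. FMUT implements the full factorial ANOVA version of this approach in MATLAB; the data here are simulated, and the dimensions and injected effect are arbitrary.

```python
# Toy max-statistic permutation test over electrodes x time points.
import numpy as np

rng = np.random.default_rng(7)
n_sub, n_elec, n_time = 28, 11, 125            # e.g. 11 sites, 250 ms at 500 Hz
# Per-subject congruent-minus-incongruent amplitude differences (simulated).
diff = rng.normal(0, 1, (n_sub, n_elec, n_time))
diff[:, :, 40:60] += 0.8                       # injected "effect" for illustration

def tmap(x):
    """One-sample t statistic across subjects at every electrode x time point."""
    return x.mean(0) / (x.std(0, ddof=1) / np.sqrt(x.shape[0]))

observed = tmap(diff)

n_perm = 2000
max_null = np.empty(n_perm)
for i in range(n_perm):
    signs = rng.choice([-1, 1], size=(n_sub, 1, 1))   # flip subjects' difference signs
    max_null[i] = np.abs(tmap(diff * signs)).max()    # max |t| over electrodes & time

threshold = np.quantile(max_null, 0.95)               # family-wise alpha = .05
significant = np.abs(observed) > threshold
print(f"critical |t| = {threshold:.2f}; "
      f"{significant.sum()} of {observed.size} points significant")
```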
Results
The behavioral and EEG analyses included 28 participants (22 females; Mage = 27 +/- 5 years; range 18–35 years), with mean HADS scores of 6.68 (+/- 3.45) for anxiety and 3.04 (+/- 1.84) for depression; five participants were excluded from the data analysis due to a low number of successful trials.
Memorization task
The mean digit memorization accuracy was 0.803 (+/- 0.36), with 0.929 (+/- 0.074) for the low cognitive load condition and 0.676 (+/- 0.185) for the high cognitive load condition.
A paired-samples t-test on accuracy revealed a significant difference between high and low cognitive load (t (27) = -7.065, p < 0.001, d = -1.335, CI [-0.326, -0.180]): digits were better memorized in the low cognitive load condition than in the high cognitive load condition. Similarly, a Bayesian paired-samples t-test on accuracy provided strong evidence in favor of H1 (BF10 = 113651.451), indicating a high likelihood that digits in the low cognitive load condition were better retained than those in the high cognitive load condition.
Pearson correlations were computed between memorization accuracy and emotion recognition accuracy to assess their association. The analysis revealed no significant correlation (r = 0.008, p = 0.281).
Preliminary analyses: behavioral and EEG data
We used the MAD function46,47 to identify outlier trials and participants. For Hu scores, no participants or trials were excluded. For reaction times, no participants were excluded, but 11.4% of trials were removed as response time outliers.
For ERP data, the exploratory analysis conducted with FMUT across all electrodes within the -100ms to 700ms time window revealed a significant effect of focus at fronto-central (F1, F2, F3, F5, F6, FC1, FC2, FC3, FC5, Fz, and FCz) and parieto-occipital (PO7, PO8, PO4, PO3, P5, P6, P7, P8, O1, O2, and Oz) sites between 70ms and 300ms, and at occipital sites (O1, O2, and Oz) from 400ms to 700ms, with a sustained potential around 500ms (p-values ranged from 0.048 to 0.0002). Additionally, a significant effect of cognitive load was found at centro-parietal sites (Cz, C1, C2, C3, C4, CPz, CP1, CP2, CP3, CP4, Pz, P1, P2, P3, and P4) between 200ms and 700ms, with a sustained potential around 500ms (p-values ranged from 0.034 to 0.0001). No other effects were significant (ps > 0.05; Fig. 2).
(A) Grand-averaged ERPs for all electrodes between -100 and 700ms post-stimulus. (C) Coloured sections correspond to significant F-values, as indicated by the colour bar.
Based on these exploratory analyses, more specific FMUT analyses were conducted in specific scalp zones and time windows (see the ERP recording, preprocessing, and analysis section). At the fronto-central sites, the N100 and N250 time windows were examined; at the parieto-occipital sites, the P100 and P250 time windows were analyzed; finally, the occipital sites were analyzed during a late-stage time window.
Hypothesis 1: contextual effect
The 2 (Focus) * 2 (Cognitive Load) * 2 (Congruence) repeated measures ANOVA on the arcsine-transformed Hu scores revealed a significant main effect of congruence (F (1,27) = 82.452, p < 0.001, η2 = 0.307). Bonferroni-adjusted paired t-tests on the congruence main effect showed that emotions were better recognized in congruent than in incongruent conditions (t (27) = 9.08, p < 0.001, d = 0.848, CI [0.110, 0.174]; Fig. 3). The Bayesian repeated measures ANOVA indicated strong evidence that the best model, i.e., the one with the largest Bayes factor, includes the main effects of congruence and focus and the Focus * Congruence interaction (BF10 = 9.281e+7). The inclusion Bayes factor showed strong evidence for including the congruence factor (BFinclusion = 1.159e+7).
The same repeated measures ANOVA on reaction times did not show any effect related to congruency (p = 0.095). Bayesian analyses provided moderate evidence against the inclusion of the congruence factor (BFinclusion =0.08; Fig. 3).
Finally, FMUT analyses on ERPs did not show an effect of congruence (ps > 0.065), and Bayesian analyses revealed weak evidence against the inclusion of the congruence factor across ERP components (N100: BFinclusion = 0.243; P100: BFinclusion = 0.256; N250: BFinclusion = 0.458; P250: BFinclusion = 0.387; sustained potential: BFinclusion = 0.249).
Hypothesis 2: attentional focus effect
The repeated measures ANOVA on the arcsine-transformed Hu scores revealed a significant Focus * Congruence interaction (F (1,27) = 8.46, p = 0.007, η2 = 0.039). No main effect of focus was found (p = 0.110). Simple effects analysis of the Focus * Congruence interaction showed a significant effect of focus in incongruent conditions (F (1,27) = 5.870, p = 0.022). The simple effects analysis also showed a significant effect of congruence when participants focused on the face (F (1,27) = 71.603, p < 0.001) and on the body posture (F (1,27) = 14.569, p < 0.001; Fig. 4). The Bayesian analyses revealed moderate evidence for including the Congruence * Focus interaction (BFinclusion = 8.381), and weak evidence against the inclusion of the focus factor (BFinclusion = 0.923).
For reaction times, the repeated measures ANOVA showed neither a main effect of focus (p = 0.578) nor a Congruence * Focus interaction (p = 0.571). The Bayesian analyses indicated strong evidence that the best model includes the main effects of cognitive load and focus (BF10 = 19.340), but revealed weak evidence against the inclusion of the focus factor (BFinclusion = 0.601) and the Focus * Congruence interaction (BFinclusion = 0.601; Fig. 4).
Finally, FMUT analyses on ERPs found a significant focus effect (ps < 0.041) at the fronto-central sites (F1, F2, F3, F5, F6, FC1, FC2, FC3, FC5, Fz, and FCz) for the N100 and N250 components, at the parieto-occipital sites (PO7, PO8, PO4, PO3, P5, P6, P7, P8, O1, O2, and Oz) for the P100 and P250 components, and at the occipital sites (O1, O2, and Oz) for a sustained potential during the late-stage time window (400-700ms). No other significant effects were observed (ps > 0.065). The Bayesian repeated measures ANOVA indicated strong evidence that the best model for the N100, P100, N250, P250, and the sustained potential (after 400ms post-stimulus) includes the main effect of focus (N100: BF10 = 114.105; P100: BF10 = 64.880; N250: BF10 = 6.985e+7; P250: BF10 = 1.407e+7; sustained potential: BF10 = 3941.140; Figs. 5, 6, and 7). The analyses revealed strong evidence for including the focus factor (N100: BFinclusion = 115.267; P100: BFinclusion = 66.916; N250: BFinclusion = 6.966e+7; P250: BFinclusion = 1.333e+7; sustained potential: BFinclusion = 3815.977).
Hypothesis 3: cognitive load effect
The repeated measures ANOVA on the arcsine-transformed Hu scores found no significant main effect of cognitive load (p = 0.761) and no Congruence * Cognitive Load interaction (p = 0.455). No other comparisons involving the cognitive load factor were significant (ps > 0.455; Fig. 8). The Bayesian repeated measures ANOVA indicated weak evidence against the inclusion of the Congruence * Cognitive Load interaction (BFinclusion = 0.311) and moderate evidence against the inclusion of the cognitive load factor (BFinclusion = 0.192).
The same repeated measures ANOVA on reaction times revealed a significant main effect of cognitive load (F (1,27) = 7.937, p = 0.009, η2 = 0.025); no other comparisons involving cognitive load were significant (ps > 0.095; Fig. 8). Bonferroni-adjusted paired t-tests showed that response times were longer under high than under low cognitive load (t (27) = 2.817, p = 0.009, d = 0.164, CI [5.908, 37.582]). The Bayesian analyses revealed moderate evidence for including the cognitive load factor (BFinclusion = 3.486), and moderate evidence against including the Congruence * Cognitive Load interaction (BFinclusion = 0.116).
Histograms of congruence as a function of focus for each cognitive load condition, for mean arcsine-transformed Hu scores (A) and response times (C), and violin plots of congruence as a function of cognitive load and focus, for mean arcsine-transformed Hu scores (B) and response times (D).
Finally, FMUT analyses on ERPs did not show a significant Congruence * Cognitive Load interaction (ps > 0.065). The Bayesian repeated measures ANOVA on ERPs revealed weak to moderate evidence against including the Congruence * Cognitive Load interaction (N100: BFinclusion = 0.265; P100: BFinclusion = 0.452; N250: BFinclusion = 0.992; P250: BFinclusion = 0.863; sustained potential: BFinclusion = 1.019).
Discussion
In our daily lives, we perceive emotions through both facial and bodily expressions, and these emotional cues interact to enhance the recognition of congruent emotions while making the recognition of incongruent ones more challenging11. However, it remains unclear whether these interactions occur automatically or voluntarily. Using high-quality stimuli, appropriate statistical methods (e.g., Bayesian statistics), and brain activity data with EEG analyses, we observed a contextual effect between facial and bodily expressions, with both facial and bodily expressions being recognized more accurately and quickly in congruent conditions compared to incongruent ones. Furthermore, bodily expressions had a greater influence on the recognition of facial expressions than the reverse. At the cerebral level, we observed two stages of emotional processing, depending on attentional focus. In the early stages (N100, P100, P250, N250), greater brain activity was associated with attention directed at facial expressions rather than bodily expressions. In contrast, later stages (after 400ms) indicated that sustained attention was more strongly linked to focus on bodily expressions than facial expressions. Finally, both behavioral and neural findings suggest that there was no interaction between congruence and cognitive load, implying that the contextual effect was processed automatically. We discuss these effects in the following sections.
The contextual effect between face and bodily cues
Consistent with previous behavioral research, our findings suggest that recognition of facial and bodily expressions is enhanced in congruent conditions and becomes biased in incongruent ones11,12,60. Specifically, we observed improved recognition accuracy in congruent conditions compared to incongruent ones. This supports the notion that emotional recognition is facilitated by the integration of co-occurring emotional cues from both facial and bodily expressions17,18. This contextual effect supports the hypothesis of a common and integrated processing of facial and bodily emotional signals27,61.
However, in contrast to earlier studies3,23, as well as to the behavioral findings and hypotheses of the present study, we did not observe significant differences in neural activity between congruent and incongruent conditions at early (P100, N100, P200, N200) or later (P300) stages, including in the face focus condition. Two non-exclusive explanations may account for this discrepancy. First, the high accuracy rate (90% correct) in the emotion recognition task suggests that participants found the task relatively easy, likely due to the extended stimulus presentation time (1500ms). While shorter presentation times can introduce artifacts or increase task difficulty11, longer durations often result in ceiling effects, minimizing observable differences in brain activity18. Thus, while the accuracy data confirm the impact of facial and bodily cues on recognition, the minimal ambiguity of the stimuli and the prolonged exposure time may have diminished neural differences between congruent and incongruent conditions.
Second, the inclusion of a concurrent memory task may have influenced the neural processing of emotional context. Although we varied cognitive load (low vs. high), our study lacked a baseline condition without a memory task. This omission prevents direct comparison of contextual effects with and without cognitive load. Performing a dual task, regardless of cognitive load level, may have uniformly influenced brain activity associated with emotional processing. Supporting this idea, a study by Cao et al. (2022) on the impact of cognitive load on processing congruent and incongruent emotional background scenes and facial expression found a significant contextual effect on N170 amplitude under no-load conditions, but this effect disappeared under cognitive load conditions62. These results suggest that the impact of emotional background scenes on facial expression processing primarily occurs when cognitive resources are not taxed by an additional task. This indicates that cognitive load can modulate emotional integration62. In the present study, it is possible that the contextual effects at the neural level were diminished not by differences in cognitive load but by the overall influence of performing a secondary task. Future research should include a no-task baseline condition to clarify how dual-task demands impact the integration of facial and bodily emotional cues at the neural level.
The role of attentional focus on the facial and bodily expressions integration
The contextual effect between facial and bodily expressions is modulated by attentional focus and operates bidirectionally. Bodily expressions influence the recognition of facial expressions and vice versa, suggesting that bodily expressions are not merely contextual cues for interpreting facial expressions but can also serve as primary targets for emotional recognition11,18. Notably, our findings indicate that facial emotion recognition is more strongly influenced by bodily expressions than the reverse11. This asymmetry may stem from a natural tendency to prioritize facial cues over bodily cues in everyday social interactions, with bodily expressions functioning as complementary rather than primary channels for emotion recognition11,63. Another possible explanation lies in the visual salience and consistency of bodily expressions, which often feature distinctive, easily recognizable cues, such as clenched fists in anger. The high reliability of these visual markers could reduce the reliance on facial expressions for emotional interpretation3. Finally, another possible explanation lies in the intensity and arousal differences between facial and bodily expressions in the incongruent condition. Bodily expressions were perceived as more intense and more arousing than facial expressions. Given that higher emotional intensity and arousal are associated with improved recognition64, the greater intensity of bodily expressions may have increased their influence on the recognition of less intense facial expressions. To our knowledge, the role of emotional intensity in contextual effects between facial and bodily cues remains unexplored, underscoring the need for further research.
At the neural level, the temporal processing of facial and bodily expressions follows a similar trajectory during both early and late stages, as evidenced by the consistent engagement of neural components such as P100, N100, P250, and N250. This activity in early neural encoding suggests substantial shared mechanisms for processing facial and bodily emotional cues, consistent with studies highlighting the involvement of common regions, such as the fusiform cortex61,65.
However, during the early stages (100 to 250ms), attentional focus on facial expressions elicits greater brain activity, as evidenced by higher amplitudes of the P100, N100, P250, and N250 compared to when attention is directed toward bodily expressions. The heightened activation around 100ms (P100 and N100) for face-focused attention aligns with prior findings17,65 and may reflect the sensitivity of these components to the physical properties of facial stimuli, such as color and contrast38,66, and to low-frequency spatial cues, such as whether the mouth is open or closed67. These differences in P100 and N100 amplitudes likely reflect the sensitivity of these components to facial features rather than being specific to emotional processing65. Finally, our findings diverge from studies reporting greater early activation for bodily expressions compared to facial expressions23,59. This discrepancy may arise from methodological differences, particularly in the types of emotional stimuli used. While previous studies predominantly examined negative emotions23,59, our study included both positive and negative emotions. This broader emotional range aligns with the findings of Meeren et al. (2005), who observed enhanced early brain activity for facial expressions across emotional valences, rather than for bodily expressions17. The differences between the present study and previous ones could also be attributed to the stimuli used. Unlike earlier research, we employed high-quality, color stimuli designed to closely mimic real-life emotional perception. Additionally, we controlled the contrast and brightness of the stimuli to ensure these factors did not influence the EEG signals. Future research should further investigate how different emotional valences affect the neural mechanisms underlying the processing of facial and bodily expressions, while also using higher-quality stimuli to obtain data that more accurately reflect real-world emotional processing.
In the later stages of processing (from 400ms), attentional focus on bodily expressions is associated with sustained neural activity, potentially reflecting the increased cognitive demands required to process bodily cues, which are often perceived as more ambiguous than facial expressions65. This finding supports the notion that focusing on bodily expressions requires greater attentional engagement and cognitive resources, as individuals are less accustomed to relying exclusively on body cues for emotional recognition. In contrast, focusing on facial expressions leads to more efficient and automatic processing, occurring earlier in the timeline of emotional processing65.
Finally, the greater influence of bodily expressions on facial emotion recognition compared to the reverse highlights the potential modulation of the contextual effect by attentional focus. Our results suggest that attentional focus may influence the processing of emotional cues from facial and bodily expressions, as suggested by previous behavioral11 and neural3,22 studies. Meta-analyses also support this idea, showing that attention and cognitive load modulate both early and late stages of facial emotion processing, potentially influencing the automaticity of these processes68. However, no previous study has examined the influence of cognitive load on the integration of emotional signals when the attentional focus is on bodily expressions rather than facial expressions. Therefore, future research should investigate these contextual effects in greater depth to clarify the neural mechanisms underlying the bidirectional integration of facial and bodily expressions and how attentional focus and cognitive load interact with this process.
The role of cognitive load on the facial and bodily expressions integration
Our behavioral results suggest that the contextual effect occurs automatically, satisfying the efficiency criterion for automaticity20. Specifically, we found no interaction between congruence and cognitive load for either the accuracy or response time. This indicates that the integration of facial and bodily expressions is similarly efficient under both high and low cognitive load, operating without significant cognitive effort, even during a concurrent task16. These findings support the hypothesis of automatic contextual integration, consistent with prior behavioral studies16.
At the neural level, no interaction between congruence and cognitive load was observed at either early or later stages, suggesting that the contextual effect may operate automatically, satisfying the efficiency criterion for automaticity20. However, this interpretation should be considered with caution due to the absence of a clear contextual effect in the EEG data. Indeed, while the level of cognitive load (low vs. high) does not seem to affect the integration of facial and bodily expressions, it remains possible that the mere presence of a concurrent task, regardless of its cognitive demands, may influence this integration. Moreover, the lack of a direct correspondence between the behavioral data and the EEG components studied, although not entirely uncommon, is somewhat surprising. Typically, ERP effects are detected even when behavioral measures fail to capture them, owing to the higher sensitivity of neural recordings. In the present study, this dissociation may partly stem from the use of conservative analytical techniques, which could have masked subtle neural effects. Again, a no-task baseline condition could, in the future, help clarify how dual-task demands impact the integration of facial and bodily emotional cues at the neural level.
Limitations
A limitation concerns the differences in perceived intensity among the emotions (anger, happiness, and sadness) and between modes of expression (facial vs. bodily), as well as the differences in intensity and arousal between facial and bodily expressions in the incongruent condition. Specifically, happiness was rated as more intense than anger, and anger more intense than sadness. Bodily expressions were rated as more intense than facial expressions. Moreover, in incongruent conditions, bodily expressions were rated as more intense and arousing than facial expressions. Given that higher emotional intensity and arousal can facilitate emotion recognition64, such differences may have influenced the emotion recognition task.
Conclusion
Emotions are an integral part of daily life, making it essential to understand how we perceive and integrate them to respond effectively. Our study aimed to explore the integration of facial and bodily expressions using behavioral and neural measures while manipulating cognitive load as a criterion for automaticity. Our findings strongly suggest an automatic integration of facial and bodily emotional signals at the behavioral level, although this effect appears more nuanced at the neural level. We also observed an asymmetry in the bidirectional influence between bodily and facial expressions, suggesting a role of attentional focus in modulating the interaction between facial and bodily cues. Furthermore, temporal differences were noted in emotional processing depending on attentional focus. Facial expressions were processed more quickly and efficiently, whereas bodily expressions required extended attentional engagement, likely due to their greater ambiguity and dependence on contextual interpretation. Our research contributes to the understanding of emotional signal integration by being the first study to use both behavioral and neural measures, alongside high-quality stimuli that closely resemble real-life perception, to characterize the automaticity of this integration. Future research should further investigate the neural mechanisms underlying these contextual effects, including a no-load control condition, which would help determine whether cognitive demands directly influence the integration process or whether the contextual effect functions independently of such demands. Additionally, exploring how attention and cognitive load interact with the processing of congruent and incongruent expressions could deepen our understanding of the automaticity of this integration and its role in emotion recognition.
Data availability
Supplementary material available at: https://osf.io/5at6g/.
References
Ekman, P. Are there basic emotions? Psychol. Rev. 99 (3), 550–553. https://doi.org/10.1037/0033-295X.99.3.550 (1992).
Tracy, J. L. & Robins, R. W. The automaticity of emotion recognition. Emotion 8 (1), 81–95. https://doi.org/10.1037/1528-3542.8.1.81 (2008).
Gu, Y., Mai, X. & Luo, Y. J. Do bodily expressions compete with facial expressions? Time course of integration of emotional signals from the face and the body. PLoS One. 8 (7), e66762. https://doi.org/10.1371/journal.pone.0066762 (2013).
Kret, M. E. & de Gelder, B. Social context influences recognition of bodily expressions. Exp. Brain Res. 203 (1), 169–180. https://doi.org/10.1007/s00221-010-2220-8 (2010).
Focker, J., Gondan, M. & Roder, B. Preattentive processing of audio-visual emotional signals. Acta Psychol. 137 (1), 36–47. https://doi.org/10.1016/j.actpsy.2011.02.004 (2011).
Seubert, J. et al. Processing of disgusted faces is facilitated by odor primes: a functional MRI study. Neuroimage 53 (2), 746–756. https://doi.org/10.1016/j.neuroimage.2010.07.012 (2010).
Belin, P., Campanella, S. & Ethofer, T. Integrating Face and Voice in Person Perception (Springer Science & Business Media, 2012).
Jessen, S. & Kotz, S. A. The Temporal dynamics of processing emotions from vocal, facial, and bodily expressions. Neuroimage 58 (2), 665–674. https://doi.org/10.1016/j.neuroimage.2011.06.035 (2011).
Paulmann, S., Jessen, S. & Kotz, S. A. Investigating the multimodal nature of human communication. J. Psychophysiol. 23 (2), 63–76. https://doi.org/10.1027/0269-8803.23.2.63 (2009).
Van den Stock, J., Righart, R. & de Gelder, B. Body expressions influence recognition of emotions in the face and voice. Emotion 7 (3), 487–494. https://doi.org/10.1037/1528-3542.7.3.487 (2007).
Lecker, M., Dotsch, R., Bijlstra, G. & Aviezer, H. Bidirectional contextual influence between faces and bodies in emotion perception. Emotion 20 (7), 1154–1164. https://doi.org/10.1037/emo0000619 (2020).
Mondloch, C. J., Nelson, N. L. & Horner, M. Asymmetries of influence: differential effects of body postures on perceptions of emotional facial expressions. PLoS One. 8 (9), e73605. https://doi.org/10.1371/journal.pone.0073605 (2013).
Righart, R. & de Gelder, B. Context influences early perceptual analysis of faces–an electrophysiological study. Cereb. Cortex. 16 (9), 1249–1257. https://doi.org/10.1093/cercor/bhj066 (2006).
Van den Stock, J., Vandenbulcke, M., Sinke, C. B., Goebel, R. & de Gelder, B. How affective information from faces and scenes interacts in the brain. Soc. Cognit. Affect. Neurosci. 9 (10), 1481–1488. https://doi.org/10.1093/scan/nst138 (2014).
Reschke, P. J. & Walle, E. A. The unique and interactive effects of faces, postures, and scenes on emotion categorization. Affect. Sci. 2 (4), 468–483. https://doi.org/10.1007/s42761-021-00061-x (2021).
Aviezer, H., Bentin, S., Dudarev, V. & Hassin, R. R. The automaticity of emotional face-context integration. Emotion 11 (6), 1406–1414. https://doi.org/10.1037/a0023578 (2011).
Meeren, H. K., Van Heijnsbergen, C. C. & de Gelder, B. Rapid perceptual integration of facial expression and emotional body language. Proc. Natl. Acad. Sci. 102 (45), 16518–16523. https://doi.org/10.1073/pnas.0507650102 (2005).
Kret, M. E., Stekelenburg, J. J., Roelofs, K. & de Gelder, B. Perception of face and body expressions using electromyography, pupillometry and gaze measures. Front. Psychol. 4, 28. https://doi.org/10.3389/fpsyg.2013.00028 (2013).
Karaaslan, A., Durmuş, B. & Amado, S. Does body context affect facial emotion perception and eliminate emotional ambiguity without visual awareness? Vis. Cogn. 28 (10), 605–620. https://doi.org/10.1080/13506285.2020.1846649 (2020).
Moors, A. & De Houwer, J. Automaticity: a theoretical and conceptual analysis. Psychol. Bull. 132 (2), 297–326. https://doi.org/10.1037/0033-2909.132.2.297 (2006).
Bago, B. et al. Fast and slow thinking: electrophysiological evidence for early conflict sensitivity. Neuropsychologia 117, 483–490. https://doi.org/10.1016/j.neuropsychologia.2018.07.017 (2018).
Chen, T., Sun, Y., Feng, C. & Feng, W. In identifying the source of the incongruent effect. J. Psychophysiol. 36 (3), 167–176. https://doi.org/10.1027/0269-8803/a000290 (2022).
Li, X. Recognition characteristics of facial and bodily expressions: evidence from ERPs. Front. Psychol. 12, 680959. https://doi.org/10.3389/fpsyg.2021.680959 (2021).
Polich, J. P300, probability, and interstimulus interval. Psychophysiology 27 (4), 396–403. https://doi.org/10.1111/j.1469-8986.1990.tb02333.x (1990).
Luo, W., Feng, W., He, W., Wang, N. Y. & Luo, Y. J. Three stages of facial expression processing: ERP study with rapid serial visual presentation. Neuroimage 49 (2), 1857–1867. https://doi.org/10.1016/j.neuroimage.2009.09.018 (2010).
Smith, N. K., Cacioppo, J. T., Larsen, J. T. & Chartrand, T. L. May I have your attention, please: electrocortical responses to positive and negative stimuli. Neuropsychologia 41 (2), 171–183. https://doi.org/10.1016/S0028-3932(02)00147-1 (2003).
De Gelder, B., de Borst, A. W. & Watson, R. The perception of emotion in body expressions. Wiley Interdiscip. Rev. Cogn. Sci. 6 (2), 149–158. https://doi.org/10.1002/wcs.1335 (2015).
Lee, M. D. & Wagenmakers, E. J. Bayesian Cognitive Modeling: A Practical Course (Cambridge University Press, 2013).
van den Bergh, D. et al. A tutorial on conducting and interpreting a Bayesian ANOVA in JASP. L'Année Psychologique 120 (1), 73–96 (2020).
van den Bergh, D., Wagenmakers, E. J. & Aust, F. Bayesian Repeated-Measures analysis of variance: an updated methodology implemented in JASP. Adv. Methods Practices Psychol. Sci. 6 (2). https://doi.org/10.1177/25152459231168024 (2023).
van Doorn, J. et al. The JASP guidelines for conducting and reporting a Bayesian analysis. Psychon. Bull. Rev. 28 (3), 813–826. https://doi.org/10.3758/s13423-020-01798-5 (2021).
Zhang, M. et al. The asynchronous influence of facial expressions on bodily expressions. Acta Psychol. 200, 102941. https://doi.org/10.1016/j.actpsy.2019.102941 (2019).
Puffet, A. S. & Rigoulot, S. Validation of the emotionally congruent and incongruent Face-Body static set (ECIFBSS). Behav. Res. Methods. 57 (1), 41. https://doi.org/10.3758/s13428-024-02550-w (2025).
Lakens, D. & Caldwell, A. R. Simulation-Based power analysis for factorial analysis of variance designs. Adv. Methods Practices Psychol. Sci. 4 (1). https://doi.org/10.1177/2515245920951503 (2021).
Zigmond, A. S. & Snaith, R. P. The hospital anxiety and depression scale. Acta Psychiatr. Scand. 67 (6), 361–370. https://doi.org/10.1111/j.1600-0447.1983.tb09716.x (1983).
Peschard, V., Maurage, P. & Philippot, P. Towards a cross-modal perspective of emotional perception in social anxiety: review and future directions. Front. Hum. Neurosci. 8, 322. https://doi.org/10.3389/fnhum.2014.00322 (2014).
Abo Foul, Y., Eitan, R. & Aviezer, H. Perceiving emotionally incongruent cues from faces and bodies: older adults get the whole picture. Psychol. Aging. 33 (4), 660–666. https://doi.org/10.1037/pag0000255 (2018).
Puce, A. et al. Multiple faces elicit augmented neural activity. Front. Hum. Neurosci. 7, 282. https://doi.org/10.3389/fnhum.2013.00282 (2013).
Lima, C. F., Anikin, A., Monteiro, A. C., Scott, S. K. & Castro, S. L. Automaticity in the recognition of nonverbal emotional vocalizations. Emotion 19 (2), 219–233. https://doi.org/10.1037/emo0000429 (2019).
Aviezer, H. et al. Angry, disgusted, or afraid? Studies on the malleability of emotion perception. Psychol. Sci. 19 (7), 724–732. https://doi.org/10.1111/j.1467-9280.2008.02148.x (2008).
Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer-Verlag New York, 2016). ISBN 978-3-319-24277-4. https://ggplot2.tidyverse.org.
Wickham, H. et al. Welcome to the tidyverse. J. Open. Source Softw. 4 (43). https://doi.org/10.21105/joss.01686 (2019).
Wickham, H., François, R., Henry, L., Müller, K. & Vaughan, D. dplyr: A Grammar of Data Manipulation. (2023).
Russell, A., Buerkner, P., Herve, M., Love, J. & Singmann, H. Package ‘emmeans’. R Package Vers. 1(4). (2021).
Wagner, H. L. On measuring performance in category judgment studies of nonverbal behavior. J. Nonverbal Behav. 17, 3–28. https://doi.org/10.1007/BF00987006 (1993).
Leys, C., Delacre, M., Mora, Y. L., Lakens, D. & Ley, C. How to classify, detect, and manage univariate and multivariate outliers, with emphasis on Pre-Registration. Int. Rev. Social Psychol. 32 (1). https://doi.org/10.5334/irsp.289 (2019).
Leys, C., Ley, C., Klein, O., Bernard, P. & Licata, L. Detecting outliers: do not use standard deviation around the mean, use absolute deviation around the median. J. Exp. Soc. Psychol. 49 (4), 764–766. https://doi.org/10.1016/j.jesp.2013.03.013 (2013).
Goeleven, E., De Raedt, R., Leyman, L. & Verschuere, B. The Karolinska directed emotional faces: A validation study. Cognition Emot. 22 (6), 1094–1118. https://doi.org/10.1080/02699930701626582 (2008).
Delorme, A. & Makeig, S. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J. Neurosci. Methods. 134 (1), 9–21 (2004).
Lopez-Calderon, J. & Luck, S. J. ERPLAB: an open-source toolbox for the analysis of event-related potentials. Front. Hum. Neurosci. 8, 213. https://doi.org/10.3389/fnhum.2014.00213 (2014).
Palmer, J. A., Kreutz-Delgado, K. & Makeig, S. AMICA: an adaptive mixture of independent component analyzers with shared components. Tech. Rep., Swartz Center for Computational Neuroscience, Univ. of California San Diego (2012).
Fields, E. C. & Kuperberg, G. R. Having your cake and eating it too: flexibility and power with mass univariate statistics for ERP data. Psychophysiology 57 (2), e13468. https://doi.org/10.1111/psyp.13468 (2020).
Durston, A. J. & Itier, R. J. The early processing of fearful and happy facial expressions is independent of task demands - Support from mass univariate analyses. Brain Res. 1765, 147505. https://doi.org/10.1016/j.brainres.2021.147505 (2021).
Hudson, A., Durston, A. J., McCrackin, S. D. & Itier, R. J. Emotion, gender and gaze discrimination tasks do not differentially impact the neural processing of angry or happy facial expressions: a mass univariate ERP analysis. Brain Topogr. 34 (6), 813–833. https://doi.org/10.1007/s10548-021-00873-x (2021).
Jaspers-Fayer, F. et al. Spatiotemporal dynamics of covert vs. overt emotional face processing in dysphoria. Front. Behav. Neurosci. 16, 920989. https://doi.org/10.3389/fnbeh.2022.920989 (2022).
McCrackin, S. D. & Itier, R. J. Feeling through another’s eyes: perceived gaze direction impacts ERP and behavioural measures of positive and negative affective empathy. Neuroimage 226, 117605. https://doi.org/10.1016/j.neuroimage.2020.117605 (2021).
Groppe, D. M., Urbach, T. P. & Kutas, M. Mass univariate analysis of event-related brain potentials/fields I: a critical tutorial review. Psychophysiology 48 (12), 1711–1725. https://doi.org/10.1111/j.1469-8986.2011.01273.x (2011).
Van Dillen, L. F. & Derks, B. Working memory load reduces facilitated processing of threatening faces: an ERP study. Emotion 12 (6), 1340–1349. https://doi.org/10.1037/a0028624 (2012).
Zhang, D., Zhao, T., Liu, Y. & Chen, Y. Comparison of facial expressions and body expressions: an event-related potential study. Acta Psychol. Sin. 47 (8). https://doi.org/10.3724/sp.J.1041.2015.00963 (2015).
Willis, M. L., Palermo, R. & Burke, D. Judging approachability on the face of it: the influence of face and body expressions on the perception of approachability. Emotion 11 (3), 514–523. https://doi.org/10.1037/a0022571 (2011).
De Gelder, B. Towards the neurobiology of emotional body language. Nat. Rev. Neurosci. 7 (3), 242–249. https://doi.org/10.1038/nrn1872 (2006).
Cao, F. et al. Influence of scene-based expectation on facial expression perception: the moderating effect of cognitive load. Biol. Psychol. 168, 108247. https://doi.org/10.1016/j.biopsycho.2021.108247 (2022).
Aviezer, H., Trope, Y. & Todorov, A. Body cues, not facial expressions, discriminate between intense positive and negative emotions. Science 338 (6111), 1225–1229. https://doi.org/10.1126/science.1224313 (2012).
Morningstar, M., Gilbert, A. C., Burdo, J., Leis, M. & Dirks, M. A. Recognition of vocal socioemotional expressions at varying levels of emotional intensity. Emotion 21 (7), 1570–1575. https://doi.org/10.1037/emo0001024 (2021).
Stekelenburg, J. J. & de Gelder, B. The neural correlates of perceiving human bodies: an ERP study on the body-inversion effect. Neuroreport 15 (5), 777–780. https://doi.org/10.1097/01.wnr.0000119730.93564.e8 (2004).
Yang, Y. F., Brunet-Gouet, E., Burca, M., Kalunga, E. K. & Amorim, M. A. Brain processes while struggling with evidence accumulation during facial emotion recognition: an ERP study. Front. Hum. Neurosci. 14, 340. https://doi.org/10.3389/fnhum.2020.00340 (2020).
Pourtois, G., Dan, E. S., Grandjean, D., Sander, D. & Vuilleumier, P. Enhanced extrastriate visual response to bandpass spatial frequency filtered fearful faces: time course and topographic evoked-potentials mapping. Hum. Brain. Mapp. 26 (1), 65–79. https://doi.org/10.1002/hbm.20130 (2005).
Schindler, S. & Bublatzky, F. Attention and emotion: an integrative review of emotional face processing as a function of attention. Cortex 130, 362–386. https://doi.org/10.1016/j.cortex.2020.06.010 (2020).
Author information
Contributions
A-S.P: Conceptualization, Formal Analysis, Investigation, Methodology, Project administration, Validation, Visualization, Writing-original draft, Writing-reviewing, and editing. S.R: Conceptualization, Funding acquisition, Resources, Supervision, Writing-reviewing and editing, Investigation, Methodology, Project administration.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Puffet, AS., Rigoulot, S. The role of cognitive load in automatic integration of emotional information from face and body. Sci Rep 15, 28184 (2025). https://doi.org/10.1038/s41598-025-12511-8