Abstract
Emotional prosody (EP) processing is vital for social communication. Seed-based functional connectivity has been widely used to probe its neural basis, yet most studies rely on a subset of predefined regions, introducing uncertainty and bias. Furthermore, although gender and task type modulate EP-related activation patterns, their network-level impact remains unclear. Using activation network mapping (a network-level analogue of meta-analysis), we identified a unified EP network and delineated its modulation by gender and task type (explicit or implicit). Results showed broader activation networks in females than in males, regardless of task type. Moreover, explicit tasks recruited additional frontal and sensorimotor regions beyond those engaged by implicit tasks, supporting hierarchical processing. We also identified associations with specific receptors and with diseases such as autism and Alzheimer’s disease. These findings underscore the importance of considering gender and task-type effects in emotional processing research and provide a network-level account of the neural mechanisms underlying emotional prosody.
Introduction
Emotion, as a fundamental aspect of human beings, plays a vital role in social interactions and communication. Accurate perception of emotional information is crucial for effective social interaction, and deficits in emotional perception are linked to neuropsychiatric disorders1,2,3,4. Emotional prosody (EP) is a key emotional cue and a primary channel for conveying paralinguistic emotion in spoken communication5, 6. It enriches the meaning of spoken language, enabling listeners to infer the speaker’s feelings and intentions more accurately. Accurate perception of EP is essential for successful social interactions, as it supports the recognition of emotional states and fosters empathy, thereby promoting conversational response7. Therefore, investigating the neural underpinnings of emotional prosody processing may facilitate the understanding of the neural mechanisms underlying emotional processes and provide insights into disease pathogenesis.
With advances in seed-based functional connectivity techniques, numerous studies have investigated the neural underpinnings of emotional prosody processing from a brain network perspective8,9,10,11,12,13,14,15. However, the predominant reliance on a subset of predefined regions drawn from the prior literature makes it difficult to map the network accurately and completely16. The core issue lies in the limited reproducibility of these regions across studies, a variability attributed to differences in experimental designs17 and participant demographics (e.g., gender). For example, an fMRI study involving an explicit emotional judgement task found peak activations in the right precentral gyrus (PreCG) for emotional prosody processing. In contrast, another fMRI study using a passive listening task showed different motor cortex activations in the bilateral postcentral gyrus (PostCG), along with additional peak activations in the bilateral anterior cingulate cortex and fusiform gyrus14, 18. The limited reproducibility of regions across studies may introduce spurious components into the network or lead to incomplete network characterization. To address this limitation, the activation network mapping (ANM) technique, analogous to a meta-analytic tool, has been developed to systematically integrate activation coordinates across multiple studies and delineate more robust and unified functional networks19. Critically, ANM not only delineates more robust networks but also provides a novel framework for explaining inconsistent findings across studies through variations in network-level information flow, by mapping directed connectivity from meta-analytic data.
While ANM effectively characterizes brain networks for emotional prosody processing, it remains unknown how these networks are modulated by critical factors such as gender and task type, both of which are known to shape regional activation patterns during emotional prosody processing20,21,22. Gender, in particular, has been repeatedly identified as an important source of variability. For example, some studies found that men exhibit stronger activation in the right frontal cortex, especially the inferior frontal gyrus (IFG), while women show greater activity in the left middle temporal gyrus (MTG) during emotional prosody processing. However, other studies reported that women show greater bilateral IFG activation than men20, 21, 23, 24. From an evolutionary perspective, gender differences in emotional prosody processing might be partially attributed to distinct social roles in ancestral environments25. Females, likely due to caregiving roles, may have evolved heightened sensitivity to emotional cues, whereas males, possibly through hunting and competition, may have become more responsive to threats and dominance signals26,27,28. Neuroimaging evidence provides partial support for this account, showing that women exhibit higher gray-to-white matter ratios in frontal and temporal regions, whereas men have larger total volumes in the frontomedial cortex, cerebellum, and amygdala29, 30. Furthermore, previous studies have proposed that such gender differences could reflect genetically mediated adaptations to ancestral environments25, 31,32,33. Task type, on the other hand, introduces differences in task demands, ranging from implicit processing to explicit emotional judgments (i.e., implicit emotional prosody [IEP] vs. explicit emotional prosody [EEP]), and is believed to engage distinct processing levels and neural circuits34, 35.
Based on previous findings, we put forward two hypotheses: 1) The EEP and IEP processing exhibit different activation network patterns; 2) There is a gender difference in the activation network patterns of EP processing. To test these hypotheses, we adopted the ANM technique to investigate the neural networks of EP processing from a network perspective based on the Human Connectome Project dataset (HCP)36. First, we converged the heterogeneous fMRI results into a common emotional prosody network under ANM analysis. Second, we assessed the effect of the two factors, gender and task type, on emotional prosody processing. Finally, to gain insight into the neurochemical mechanisms supporting the emotional prosody network, we examined its spatial correspondence with neurotransmitter receptor and gene expression profiles.
Results
Articles selection
Detailed information about the article selection is shown in Table 1 and Supplementary Table 1. Different experiments or contrasts from the same article were counted separately. For the emotional prosody analysis, a total of 40 articles were included, comprising 54 experiments, 725 subjects (334 females, 391 males), and 474 activation coordinates. To explore the difference in activation networks between EEP and IEP, we categorized the experiments into EEP experiments and IEP experiments. There were 32 EEP articles, with 41 experiments, 587 subjects (270 females, 317 males), and 390 coordinates, and 9 IEP articles, with 11 experiments, 162 subjects (78 females, 84 males), and 59 coordinates. Note that 2 experiments were excluded from the EEP and IEP analyses because their contrasts were not suitable for this comparison.
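The reported counts can be cross-checked with a few lines of arithmetic. The sketch below simply re-adds the numbers quoted in this paragraph; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Counts copied from the paragraph above; the data layout is illustrative only.
cohorts = {
    "EP": {"experiments": 54, "female": 334, "male": 391, "subjects": 725},
    "EEP": {"experiments": 41, "female": 270, "male": 317, "subjects": 587},
    "IEP": {"experiments": 11, "female": 78, "male": 84, "subjects": 162},
}
excluded_experiments = 2  # experiments left out of the EEP/IEP split


def sex_counts_add_up(entry):
    """True when the female and male counts sum to the reported total."""
    return entry["female"] + entry["male"] == entry["subjects"]


checks = {name: sex_counts_add_up(entry) for name, entry in cohorts.items()}
experiments_add_up = (
    cohorts["EEP"]["experiments"]
    + cohorts["IEP"]["experiments"]
    + excluded_experiments
    == cohorts["EP"]["experiments"]
)
print(checks, experiments_add_up)
```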
Activation networks of EP processing
The activation networks from ANM are shown in Fig. 1A. For EP, our analysis revealed a widespread activation network, including the somatomotor network (SMN; bilateral PreCG and PostCG, bilateral primary auditory cortex [PAC], bilateral planum temporale [PT], bilateral posterior part of the superior temporal gyrus [STGp], bilateral superior temporal sulcus [STS] and left Heschl’s gyrus [HG]), ventral attention network (VAN; bilateral insula), default mode network (DMN; right posterior middle temporal gyrus [MTGp]), frontoparietal network (FPN; bilateral opercular and triangular parts of the IFG [IFGpo and IFGpt], bilateral middle frontal gyrus [MFG] and bilateral posterior supramarginal gyrus [SMGp]), dorsal attention network (DAN; temporooccipital part of the bilateral MTG [MTGtp]), limbic network (LN; bilateral orbitofrontal cortex [OFC]), and subcortical network (amygdala).
Fig. 1. A The activation networks of EP, EEP and IEP. B The overlap of EEP and IEP.
For EEP, the unthresholded activation network was highly correlated with that of EP (\(r\) = 0.99, \({P}_{{spin}}\) < \(1.0 \times 10^{-4}\)) (Fig. 2B). In contrast, the activation network of IEP was more localized, primarily involving regions such as the bilateral PAC, bilateral insula, bilateral SMGp, bilateral PT, bilateral STGp and STS, resulting in a relatively lower spatial correlation with that of EP (\(r\) = 0.86, \({P}_{{spin}}\) < \(1.0 \times 10^{-4}\)) (Fig. 2B). The overlap between the activation networks of EEP and IEP showed that the EEP activation network encompassed the network of IEP (Fig. 2A), with an EEP-specific network primarily including regions such as the bilateral IFGpo, left IFGpt, bilateral MFG, bilateral OFC, right MTGtp, bilateral PreCG and bilateral PostCG. These findings indicate that EEP engages a more extensive neural network than IEP, especially in areas associated with higher-order cognitive processing.
Fig. 2. A The activation networks of EP processing (including EP, EEP and IEP) and the overlap of EEP and IEP based on female cohorts (left) and male cohorts (right). B The spatial correlation between unthresholded activation networks. C The overlap maps of females and males for EP, EEP and IEP. O-EP, F-EP, M-EP: emotional prosody activation networks for the overall, female and male cohorts; O-EEP, F-EEP, M-EEP: explicit emotional prosody activation networks for the overall, female and male cohorts; O-IEP, F-IEP, M-IEP: implicit emotional prosody activation networks for the overall, female and male cohorts.
To statistically test task-type effects on functional connectivity strength, voxel-wise two-sample t-tests were conducted on the averaged Fisher z maps in each cohort. Across all three cohorts, EEP processing showed consistently higher functional connectivity than IEP between the activation seeds and brain regions mainly located in the left IFG, MFG, and PreCG (Fig. 3A; overall cohort: peak MNI coordinates [−46, 4, 46], t (50) = 3.98; females: peak MNI coordinates [−46, 4, 46], t (50) = 4.44; males: peak MNI coordinates [−50, 4, 50], t (50) = 3.6971; all voxel-level p < 0.01, cluster-level FWE-corrected p < 0.05). The increased functional connectivity during EEP processing might be related to the additional emotional evaluative demands inherent to the explicit task.
Fig. 3. A Clusters showing significant differences between EEP (n = 41) and IEP (n = 11) in the overall, female, and male cohorts. B Clusters showing significant differences between female and male participants in EP (n = 54), EEP, and IEP. Cluster-forming threshold p < 0.01; cluster-level correction: FWE p < 0.05.
Widespread activation networks of EP in females
To test gender differences in the brain networks involved in EP, we separately estimated the activation networks for male and female cohorts using the activation seeds and the fMRI scans of the corresponding HCP subjects. As illustrated in Fig. 2A, C, the ANM results indicated that females engaged a broader brain network than males during EP processing. Specifically, for EP and EEP, the gender-shared activation network was mainly located in the bilateral PT and IFGpo/IFGpt, while for IEP it was confined to the bilateral PT. The female-specific network involved in EP and EEP was mainly distributed in regions including the bilateral MFG, right MTGtp, bilateral insula, left SMC, bilateral PreCG, and bilateral PostCG. Moreover, for IEP, brain regions including the right insula were female-specific.
To statistically test gender effects on functional connectivity strength, voxel-wise paired t-tests were conducted across experiments on the group-averaged Fisher z maps for each task type. Compared with males, females exhibited significantly increased functional connectivity between the activation seeds and most of the voxels (95.32% for EP, 90.30% for EEP) within the female-male union emotional prosody network for EP and EEP (Fig. 3B). For IEP, the significantly increased connectivity in females was mainly localized in the bilateral insula (left hemisphere: peak MNI coordinates [−36, −10, 4], t (10) = 8.80; right hemisphere: peak MNI coordinates [36, 12, 2], t (10) = 12.14; all voxel-level p < 0.01, cluster-level FWE-corrected p < 0.05).
Neuroreceptor mechanisms underlying EP processing
To investigate the neuroreceptor mechanisms underlying the activation networks of emotional prosody, we conducted a spatial correlation analysis between receptor/transporter maps and the activation networks. A common set of receptors/transporters related to emotional prosody was found, including \({5{HT}}_{1A}\), \({{CB}}_{1}\), \({{mGluR}}_{5}\) and \({NET}\) (\({5{HT}}_{1A}\): \(r\) = 0.32 ± 0.037, \({P}_{{spin}}\) = 0.014 ± 0.012; \({{CB}}_{1}\): \(r\) = 0.34 ± 0.074, \({P}_{{spin}}\) = 0.028 ± 0.012; \({{mGluR}}_{5}\): \(r\) = 0.31 ± 0.026, \({P}_{{spin}}\) = 0.020 ± 0.021; \({NET}\): \(r\) = 0.33 ± 0.013, \({P}_{{spin}}\) = 0.00013 ± \(5.0 \times 10^{-5}\)). Additionally, another set of neuroreceptors exhibited a gender-specific effect. For EP and EEP, \({5{HT}}_{1B}\), \({5{HT}}_{2A}\), and \({5{HT}}_{6}\) were significantly positively associated with spatial activation patterns in females (\({5{HT}}_{1B}\): [EP: \(r\) = 0.27, \({P}_{{spin}}\) = 0.018; EEP: \(r\) = 0.27, \({P}_{{spin}}\) = 0.013]; \({5{HT}}_{2A}\): [EP: \(r\) = 0.28, \({P}_{{spin}}\) = 0.036; EEP: \(r\) = 0.28, \({P}_{{spin}}\) = 0.032]; \({5{HT}}_{6}\): [EP: \(r\) = 0.21, \({P}_{{spin}}\) = 0.044; EEP: \(r\) = 0.20, \({P}_{{spin}}\) = 0.0496]). In contrast, \({VAChT}\) showed significant positive correlations with spatial activation patterns in males (EP: \(r\) = 0.31, \({P}_{{spin}}\) = 0.01; EEP: \(r\) = 0.34, \({P}_{{spin}}\) = 0.0030) (Fig. 4C).
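The logic of a spatial correlation with a permutation-based null can be sketched in a few lines. The example below is a simplified stand-in under stated assumptions: it shuffles regional values freely, whereas the spin test used here rotates maps on the cortical surface to preserve spatial autocorrelation. The regional values, function names and number of permutations are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of regional values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p(x, y, n_perm=1000, seed=0):
    """Two-sided p-value from a naive label-shuffling null.

    Assumption: plain shuffling ignores spatial autocorrelation, so this is
    NOT a spin test; it only illustrates how an empirical p-value is formed.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (exceed + 1) / (n_perm + 1)


# Hypothetical regional values for one activation network and one receptor map.
network_map = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.5]
receptor_map = [0.2, 0.5, 0.3, 0.9, 0.6, 0.1, 0.8, 0.4]

r = pearson_r(network_map, receptor_map)
p = permutation_p(network_map, receptor_map)
print(round(r, 3), p)
```

Because free shuffling tends to give more liberal p-values than a spatially constrained null, the sketch should not be read as reproducing the reported \({P}_{{spin}}\) values.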
Fig. 4. A Gene enrichment analysis results based on BP, CC and MF. B Gene enrichment analysis results based on pathways and diseases. Please refer to Supplementary Table 2 for the corresponding terms. Only enrichments with a significance of FDR p < 0.05 are displayed. Terms colored blue are related to energy metabolism; those in green pertain to synapse extensions; and those in red are associated with active transportation. C The correlation between activation networks and the receptor/transporter maps. Only correlations with a significance of Pspin < 0.01 are displayed. D The proposed hierarchical processing model of emotional prosody. PAC, primary auditory cortex.
Genetic mechanisms underlying EP processing
The first three PLS components were all significant under 10,000 permutations. For each activation network, the first three components of the PLS regression together explained 32.48–43.42% of the variance (Supplementary Fig. 2).
Supplementary Fig. 3 illustrates that within each component of the PLS regressions (PLS1, PLS2, and PLS3), the distributions are highly correlated, while the correlations between different components across all activation networks are relatively low. Specifically, PLS1 exhibited a transcriptional profile marked by under-expression predominantly in the bilateral LG and para-hippocampal regions. PLS2 displayed overexpression primarily in the bilateral Insula, TP, PreCG, and PostCG, with concurrent under-expression in the occipital lobe. PLS3 revealed overexpression chiefly in the bilateral PreCG, PostCG, posterior STG, and occipital lobe, while showing under-expression in the frontal lobe.
The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses revealed that the genes were enriched in biological processes (BP), cellular components (CC), molecular functions (MF) and pathways related to energy metabolism, synapse extensions and active transportation (Fig. 4A). Energy metabolism included fatty acid metabolic process [GO:0006631] in BP, mitochondrial matrix [GO:0005759] in CC, ATP hydrolysis activity [GO:0016887] in MF, and fatty acid degradation [hsa00071] in pathways. Synapse extensions included cell leading edge [GO:0031252] in CC and metallopeptidase activity [GO:0008237] in MF. Active transportation included active transmembrane transporter activity [GO:0022804] in MF and ABC transporters [hsa02010] in pathways (Fig. 4A).
Additional Disease Gene Network (DisGeNET) disease enrichment analyses showed that genes associated with emotional prosody processing are linked to Disease Progression [umls:C0242656], Autistic Disorder [umls:C0004352] and Alzheimer’s Disease [umls:C0002395] (Fig. 4B).
Discussion
By integrating previous neuroimaging findings, we identified a common activation network underlying EP processing, characterized its gender-specific features, and linked it to transcription profiles and neurotransmitter receptor patterns. Our analysis revealed a widespread activation network for EP processing including the DMN, DAN, LN, SMN, and subcortical network. Notably, females exhibited a relatively broader activation network than males. Furthermore, the task-type analysis showed that the explicit network extended beyond the implicit network, additionally recruiting executive frontal and sensorimotor regions, which echoes the hierarchical processing model of emotional prosody. Interestingly, these activation networks correlated significantly with the spatial patterns of receptors/transporters (\({5{HT}}_{1A}\), \({{CB}}_{1}\), \({{mGluR}}_{5}\), and \({NET}\)) and gene expression profiles (energy metabolism, synapse extension, active transmembrane transport, along with diseases such as autistic disorder, Alzheimer’s disease, and general disease progression). These findings enhance our understanding of the neurobiological and modular mechanisms underlying EP processing and suggest that both gender and task type modulate its neural architecture.
Extended hierarchical processing model of emotional prosody
Using ANM analysis, we identified an emotional prosody network that included the DMN (MTGp), DAN (MTGtp), FPN (IFG), LN (OFC), SMN (PreCG, PostCG, PT, PAC, STG, STS, and HG), VAN (bilateral insula), and subcortical network (amygdala). We found that most of the activated brain regions (PAC, STG, STS, IFG, OFC) are part of the prevailing three-stage model of emotional prosody processing37, jointly demonstrating the central functional importance of these regions in EP processing. Moreover, brain regions involved in the first two stages of this model (the first stage [sensory processing]: PAC; the second stage [integration processing]: STG and STS) are primarily located within the IEP network, while brain regions involved in the third stage ([evaluative judgment processing]: IFG, OFC) are primarily located within the EEP-specific network. The alignment between the processing models from these two perspectives not only validates the established three-stage hierarchical framework but also elucidates the distinct large-scale network mechanisms underpinning each processing stage.
Additionally, the EEP network included the PreCG and PostCG, which are not encompassed by the three-stage model. Although these two regions have been reported in previous studies12, 18, 38, 39, they have generally been overlooked or merely considered to reflect motor responses in explicit tasks, rather than components of emotional prosody processing40. However, our network-based analysis demonstrates an intrinsic functional association between these sensorimotor regions and emotional prosody activation points. This finding is in line with previous evidence that the PreCG and PostCG contribute to somatosensory monitoring and articulatory feedback during speech perception, particularly under demanding task conditions41, thereby enhancing the robustness of emotional prosody perception. Thus, these results suggest that the PreCG and PostCG may provide supportive contributions to emotional prosody recognition.
Taken together, we propose a hierarchical processing model of EP from the perspective of brain networks (Fig. 4D). This model refines the traditional three-stage framework by distinguishing regions primarily engaged during implicit processing (input integration process), regions primarily engaged during explicit processing (decision-making process), and newly extended sensorimotor components (auxiliary decision-making process). By delineating these subsystems, our findings clarify how task demands shape the recruitment of distinct but interconnected neural circuits for emotional prosody processing.
Broader emotional prosody activation network in females
Our study revealed significant gender differences in the neural mechanisms underlying emotional prosody processing. The female group showed broader activation in the bilateral MFG, right MTGtp, bilateral insula, left SMC, bilateral PreCG, and bilateral PostCG, regions that are critical for emotion processing and speech perception34, 35, 42,43,44. This pattern suggests that females engage more extensively in affective evaluation and speech-related sensorimotor integration than males. Supporting this interpretation, behavioral studies have shown that females tend to perform better in emotion recognition tasks, including the perception of vocal affect, facial expressions, and body language45,46,47,48,49. Taken together, these findings imply that females may utilize more complex neural networks to handle emotional tasks, resulting in a higher cognitive and emotional integration demand50.
Emotional prosody-related receptors/transporters
Our findings provide insights into the molecular underpinnings of emotional prosody, highlighting both shared and gender-specific neuroreceptor contributions. The identification of shared receptors/transporters, including the \({5{HT}}_{1A}\) receptor, \({{CB}}_{1}\) receptor, \({{mGluR}}_{5}\), and \({NET}\), suggests that emotional prosody processing may involve a foundational set of neuroreceptors linked to broader emotional regulation. These findings align with prior research showing that these receptors play critical roles in fear and anxiety regulation and in emotional perceptual biases51,52,53. For example, \({5{HT}}_{1A}\) receptor agonists reduce fear recognition in facial expressions, and \({{CB}}_{1}\) receptor agonists modulate anxiety responses, supporting their involvement in processing emotionally salient prosodic cues54,55,56. This shared receptor set points to conserved molecular mechanisms underlying various emotional processing tasks, extending their significance to the domain of emotional prosody. Future research should experimentally validate the causal involvement of these receptor systems in prosodic emotion processing.
In addition to the shared neuroreceptors, our results revealed gender-specific receptor effects, providing a nuanced understanding of the neurochemical basis of emotional prosody. In females, serotonin receptors (\({5{HT}}_{1B}\), \({5{HT}}_{2A}\), and \({5{HT}}_{6}\)) were significantly associated with activation patterns for both EP and EEP, suggesting heightened serotonergic modulation in emotional prosody. These findings align with previous studies demonstrating that serotonergic signaling is modulated by sex hormones such as estrogen57, which may enhance serotonin receptor sensitivity in females. Conversely, the cholinergic system, as indicated by the significant correlation of \({VAChT}\) with activation patterns in males, points to a distinct neurochemical pathway in male emotional prosody processing. This divergence could reflect structural or functional brain differences between genders, such as the density of cholinergic innervation or hormonal effects on acetylcholine receptor regulation58, 59.
The molecular mechanisms underlying emotional prosody
By applying GO and KEGG pathway enrichment analyses, we found that the genes correlated with emotional prosody processing are mainly enriched in pathways related to energy metabolism. These findings suggest that emotional prosody processing is a highly energy-demanding process. This high energy demand is likely associated with the fine-grained discrimination involved in emotional prosody processing60. We also found that genes involved in metallopeptidase activity and synapse extension pathways were associated with emotional prosody. The involvement of these pathways may reflect the structural plasticity required for efficient neural signaling during emotional communication. These molecular pathways also provide potential targets for further functional validation studies.
The identified genes were also enriched in diseases such as autistic disorder, Alzheimer’s disease, and general disease progression. While these associations are intriguing, they should be interpreted with caution. The enrichment results do not establish causality but rather suggest that emotional prosody processing shares molecular mechanisms with these disorders. For instance, deficits in emotional prosody are well-documented in autism and Alzheimer’s disease. Future research could investigate whether these shared molecular signatures contribute to the impaired emotional communication observed in these conditions.
Beyond co-activation: a connectivity-driven network paradigm for emotional prosody
This study employs ANM to systematically investigate the neural underpinnings of emotional prosody processing. Its central contribution lies in providing a higher-dimensional explanatory framework for existing findings. The analysis reveals that the core brain regions identified by ANM, such as bilateral frontotemporal areas and subcortical structures like the amygdala, show strong convergence with prior meta-analyses (Fig. 1). This consistency not only validates the reliability of the ANM approach but also confirms the robustness of the neural architecture for emotional prosody across different analytical methods. However, the novelty of this work does not rest on replicating established findings but rather on two key advances. Methodologically, ANM moves beyond coordinate-based meta-analysis by detecting stable subnetwork structures that may be obscured in averaged activation maps, thereby uncovering neural patterns that traditional approaches may miss. More critically, on a theoretical level, by analyzing functional connectivity patterns between regions, this study proposes a unified network model capable of explaining inconsistencies in the prior literature, such as the variable involvement of specific brain areas across studies. This model suggests that condition-specific modulation of information flow within a stable large-scale circuit may account for seemingly contradictory results.
Limitations
There are several limitations to our study. First, the activation networks for males and females were estimated using activation data from the overall cohorts because few studies investigated the EP activation within a single gender. As a result, our integrated findings may not fully capture the actual network patterns unique to each gender. Nevertheless, using the same activation data from the overall cohort, we still observed distinct activation networks for males and females, highlighting the different activation patterns during EP processing. Second, our analysis was restricted to statistically significant activation coordinates, so that weaker but reliable activations (e.g., in underpowered studies) might be overlooked. However, given the high temporal synchrony of BOLD signals among brain regions involved in the same cognitive process, network-based analytical approaches substantially reduce the risk of missing functionally relevant areas. Third, given the high heterogeneity observed across studies on emotional prosody, a larger number of experiments could yield more robust results. Future research can further explore the receptor involvement and genetic regulatory mechanisms in emotional prosody processing, to gain a deeper understanding of the neural mechanisms underlying emotional prosody.
Materials and methods
Articles selection
A systematic review was conducted by two independent reviewers to identify relevant articles. The primary search covered the PubMed database and was supplemented by a review of existing meta-analyses. Studies from January 1, 1993, to July 10, 2024, were searched using terms related to emotional prosody, including “prosody”, “emotional prosody”, and “affective prosody”, in conjunction with terms related to neuroimaging techniques, including “fMRI”, “functional MRI”, “functional magnetic resonance imaging”, “PET”, “positron emission tomography”, “neuroimaging”, and “BOLD”. In total, 248 research articles were obtained from the PubMed search and earlier meta-analyses40, 42, 44, 61. Articles were then screened and included if they met the following criteria: 1) Studies that employed tasks to assess the auditory processing of emotional prosody; 2) Studies that included a rest or control task; 3) Studies that involved healthy adult participants, or case-control studies with a healthy adult control group; 4) Studies that reported coordinates in a standardized 3D space from functional magnetic resonance imaging (fMRI) or positron emission tomography (PET). Additionally, studies that employed multi-modal or dichotic listening methodologies were excluded. After screening, 40 peer-reviewed articles remained. The literature search process is shown in Fig. 5.
Fig. 5. Flowchart showing the process of identifying the 40 articles included in the analysis.
Data extraction
Data extracted from each experiment included the number of participants and the activation coordinates (in MNI or Talairach space). Only coordinates reported as showing significant activation during emotional prosody processing were considered. To bring all coordinates into the same space, those initially reported in Talairach space were transformed to MNI space using the MNI2Tal web tool (https://bioimagesuiteweb.github.io/webapp/mni2tal.html). To examine the differences between EEP and IEP, we categorized the experiments into two groups based on the specific tasks performed and the contrasts analyzed.
MRI processing for HCP data
To construct a normative functional connectome, the MRI dataset from the HCP S1200 release was included in this study62. Here, 1084 participants with two runs including a left-to-right (LR) and a right-to-left (RL) phase encoding direction were used. All subjects provided written informed consent, and the research protocol was approved by the Institutional Review Board of Washington University. Please refer to Van Essen et al.36 for more details about the dataset.
MRI data were acquired using a Siemens Connectome Skyra 3 T scanner with a 32-channel head coil housed at Washington University in St. Louis. More specific scanning parameters are given in the Supplementary materials and in Glasser et al.63. The rs-fMRI and T1-weighted images were preprocessed using the HCP minimal-preprocessing pipelines63. The following preprocessing steps were then applied to the minimally preprocessed data. For the T1 images, skull-stripping was first performed using FSL bet2 (f = 0.5). The skull-stripped T1-weighted images were then segmented into three tissue types: white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF). For each individual, the averaged WM signal and CSF signal were extracted based on the segmentation results. For the fMRI images, detrending was first applied to minimize the effects of low-frequency drift. Then, to remove potential confounding effects on the gray matter signal, 24 movement parameters and the WM and CSF signals were regressed out. Finally, all images were bandpass filtered (0.01–0.1 Hz) to reduce the impact of high-frequency physiological noise and low-frequency drift.
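The text does not state how the 24 movement parameters were constructed. One widely used convention, assumed here purely for illustration, is the Friston expansion: the 6 realignment parameters, the same 6 at the previous time point, and the squares of both sets. The function name and toy values below are hypothetical.

```python
def expand_motion_24(motion):
    """Expand T rows of 6 realignment parameters into T rows of 24 regressors.

    Assumed layout (Friston-style expansion, not confirmed by the text):
    [current 6, previous 6, current squared 6, previous squared 6].
    The first time point has no predecessor, so its lagged terms are zero.
    """
    expanded = []
    for t, current in enumerate(motion):
        previous = motion[t - 1] if t > 0 else [0.0] * 6
        row = list(current) + list(previous)
        row += [v * v for v in current] + [v * v for v in previous]
        expanded.append(row)
    return expanded


# Toy realignment parameters: 3 time points x (3 translations, 3 rotations).
toy_motion = [
    [0.10, 0.00, -0.20, 0.001, 0.002, 0.000],
    [0.12, 0.01, -0.18, 0.001, 0.003, 0.001],
    [0.15, 0.02, -0.15, 0.002, 0.003, 0.001],
]
regressors = expand_motion_24(toy_motion)
print(len(regressors), len(regressors[0]))
```

In a full pipeline these 24 columns would be entered, together with the WM and CSF signals, as nuisance regressors in a voxel-wise linear model.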
Activation network mapping
To uncover the emotional prosody activation network hidden in inconsistent neuroimaging results, we conducted an ANM analysis using the “overlap” approach. The steps are summarized as follows. First, we identified the activation seed for each experiment by generating 4-mm-radius spheres centered on each coordinate and combining the spheres associated with the same experiment. Next, a normative functional connectivity map, that is, a standardized representation of the typical pattern of functional connectivity in the human brain across a population, was generated for each experiment. Specifically, for each participant in the HCP dataset, we computed a Fisher z-transformed correlation map by first calculating the temporal synchronization (Pearson correlation) between the time series of each voxel and that of the activation seed, and then applying the Fisher z-transform to normalize the correlation coefficients. Voxel-wise one-sample t-tests were then performed on the individual Fisher z maps, with age and gender effects controlled, followed by smoothing with a 6 mm full-width at half-maximum (FWHM) Gaussian kernel. We then binarized the normative functional connectivity map for each experiment at a threshold of Bonferroni-corrected P < 0.05. Finally, the activation network was generated by overlapping these binary maps and thresholding at 60%.
In addition, we investigated the effects of task type and gender on the strength of functional connectivity by performing voxel-wise statistical tests on the population-averaged Fisher z maps. We used a two-sample t-test for the task-type comparison and a paired t-test for the gender comparison, because the same seeds were used for females and males. The analyses were restricted to the union of the activation networks of the two groups being compared, and multiple comparisons were corrected at cluster-level FWE P < 0.05 (cluster-forming threshold at voxel-level P < 0.01, 10,000 permutations) using non-parametric permutation tests implemented in Statistical non-Parametric Mapping (SnPM13, http://warwick.ac.uk/snpm).
The overlap approach was chosen for the following advantages: 1) higher interpretability, 2) reduction of bias induced by autocorrelation, and 3) mitigation of bias driven by extreme values in a small number of experiment-level maps. To validate the results, the “t-test” approach was also employed. For details of the “t-test” approach, please refer to the Supplementary Methods or Peng et al.19.
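The core computations of the overlap approach (seed-to-voxel correlation, Fisher z-transform, and overlap thresholding) can be sketched in a few lines of Python. This is a simplified illustration rather than the published ANM implementation: the function names are hypothetical, voxels are treated as a flat array, the group-level t-test, smoothing and Bonferroni steps are omitted, and treating the 60% threshold as inclusive (at least 60%) is an assumption.

```python
import numpy as np

def fisher_z(r):
    """Fisher z-transform of Pearson correlation coefficients (arctanh)."""
    return np.arctanh(r)

def seed_connectivity(voxel_ts, seed_ts):
    """Fisher z map of each voxel's correlation with the seed time series.

    voxel_ts: (n_timepoints, n_voxels); seed_ts: (n_timepoints,)
    """
    v = voxel_ts - voxel_ts.mean(axis=0)
    s = seed_ts - seed_ts.mean()
    r = (v * s[:, None]).sum(axis=0) / (
        np.sqrt((v ** 2).sum(axis=0)) * np.sqrt((s ** 2).sum())
    )
    # Clip so that perfectly correlated voxels do not map to infinity.
    return fisher_z(np.clip(r, -0.999999, 0.999999))

def overlap_network(binary_maps, fraction=0.60):
    """Keep voxels present in at least `fraction` of the experiment-level maps."""
    binary_maps = np.asarray(binary_maps, dtype=bool)
    return binary_maps.mean(axis=0) >= fraction

# Five hypothetical experiment-level binary maps over four voxels.
maps = [[1, 1, 0, 1],
        [1, 0, 0, 1],
        [1, 1, 0, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 1]]
network = overlap_network(maps)  # voxels 0, 1 and 3 survive; voxel 2 does not
```

In this toy example, voxels 0 and 1 appear in 3 of 5 maps (60%) and voxel 3 in 4 of 5 (80%), so they are retained, whereas voxel 2 appears in only 1 of 5 maps and is dropped.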
Mapping neurotransmitter receptor maps to the activation networks of EP processing
To identify the molecular foundation of emotional prosody, we calculated the spatial correlation between the activation networks and the distribution of receptors. We used the publicly available Hansen receptors dataset64 (https://github.com/netneurolab/hansen_receptors), which includes PET images of 19 different neurotransmitter receptors and transporters across 9 neurotransmitter systems: serotonin (\({5{HT}}_{1A}\), \({5{HT}}_{1B}\), \({5{HT}}_{2A}\), \({5{HT}}_{4}\), \({5{HT}}_{6}\), \(5{HTT}\)), dopamine (\({D}_{1}\), \({D}_{2}\), \({DAT}\)), norepinephrine (\({NAT}\)), histamine (\({H}_{3}\)), acetylcholine (\({\alpha }_{4}{\beta }_{2}\), \({M}_{1}\), \({VAChT}\)), cannabinoid (\({{CB}}_{1}\)), opioid (\({MU}\)), glutamate (\({NMDA}\), \({{mGluR}}_{5}\)), and GABA (\({{GABA}}_{A/{BZ}}\)) systems. All PET images were collected and registered to MNI152 space.
To explore the relationship between the activation networks and the distribution of each receptor/transporter, we performed Spearman’s correlation analysis after normalization65. The significance of these correlations was assessed with a spin test (10,000 permutations), which controls for spatial autocorrelation.
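The logic of testing a Spearman correlation against a permutation null can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names are hypothetical, ranks are computed assuming no tied values, and the demonstration uses plain random permutations as a stand-in. A true spin test would instead supply spatially constrained permutation indices generated elsewhere (for example with netneurotools), and only such permutations preserve spatial autocorrelation.

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman correlation as the Pearson correlation of ranks (no ties assumed)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return np.corrcoef(rank_x, rank_y)[0, 1]

def permutation_p(network, receptor, perm_indices):
    """Observed rho and two-sided p-value against a permutation null.

    perm_indices: (n_perm, n_regions) integer array; each row reorders regions.
    """
    observed = spearman_rho(network, receptor)
    null = np.array([spearman_rho(network[idx], receptor) for idx in perm_indices])
    # Adding 1 to numerator and denominator counts the observed value in the null.
    p_value = (1 + np.sum(np.abs(null) >= abs(observed))) / (1 + len(null))
    return observed, p_value

# Synthetic demonstration: 100 regions and 1000 naive (non-spatial) permutations.
rng = np.random.default_rng(1)
network = rng.standard_normal(100)
receptor = network + 0.5 * rng.standard_normal(100)
perm_indices = np.array([rng.permutation(100) for _ in range(1000)])
rho, p_spin = permutation_p(network, receptor, perm_indices)
```

With 10,000 permutations, as used in the study, the smallest attainable p-value under this counting scheme is 1/10,001.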
Association between activation networks and gene expression
To identify the genetic foundation of emotional prosody, we used publicly available gene expression data from the Allen Human Brain Atlas (AHBA; http://human.brain-map.org). It offers transcriptomic data for 58,692 probes corresponding to 29,131 genes, derived from 3702 spatially distinct tissue samples collected from six postmortem adult brains. The gene expression data were processed with the abagen toolbox (https://abagen.readthedocs.io/) as follows: intensity-based probe filtering, representative probe selection, matching of tissue samples to an atlas (the Atlas of Intrinsic Connectivity of Homotopic Areas, AICHA)66, normalization, and aggregation. More details are provided in the Supplementary Methods. As a result, a 15,633 × 384 gene expression matrix was created.
To detect genes whose expression levels were significantly correlated with the activation networks of EP, partial least squares (PLS) regression was used, with the z-scored region-level gene expression matrix as the independent variable (X) and the z-scored parcellated activation network as the dependent variable (Y). A spatial permutation (spin) test with 10,000 permutations was used to assess the significance of each PLS component67. The first three components of all PLS regressions passed the spin test (\({P}_{{spin}}\) < 0.01), so subsequent analyses were based on these three components. Genes whose absolute weight ranked in the top 10% (1563 genes) in each PLS component were selected for subsequent enrichment analyses.
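The gene selection rule can be made concrete with a short Python sketch. It assumes that one weight per gene is already available from a fitted PLS component; the function name `top_fraction_by_abs_weight` and the use of floor rounding are assumptions, chosen because 10% of 15,633 genes (1563.3) then yields the 1563 genes reported above. The weights in the demonstration are random placeholders, not real PLS output.

```python
import numpy as np

def top_fraction_by_abs_weight(weights, fraction=0.10):
    """Indices of genes whose |weight| ranks in the top `fraction`, largest first."""
    weights = np.asarray(weights, dtype=float)
    n_keep = int(len(weights) * fraction)  # floor: 15,633 genes -> 1563
    order = np.argsort(-np.abs(weights))   # descending by absolute weight
    return order[:n_keep]

# Placeholder weights for 15,633 genes (random values, for illustration only).
rng = np.random.default_rng(2)
gene_weights = rng.standard_normal(15633)
selected = top_fraction_by_abs_weight(gene_weights)
signed_weights = gene_weights[selected]  # signs are kept for enrichment analysis
```

Ranking by absolute value while keeping the signed weights lets both positively and negatively weighted genes enter the enrichment analysis.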
Gene enrichment analysis
The selected gene sets and their signed gene weights were submitted to enrichment analysis to identify enriched GO (https://geneontology.org/) terms68, 69, KEGG pathways70 and DisGeNET (https://www.disgenet.com) diseases71 using the Web-based Gene Set Analysis Toolkit (WebGestalt: https://www.webgestalt.org)72. For GO terms, all three ontology categories were considered: biological process (BP), cellular component (CC) and molecular function (MF). For each PLS component, only the top 5 positively and negatively related terms that were significantly enriched at FDR \(p\) < 0.05 (10,000 permutations) were identified as enriched terms.
Statistics and reproducibility
For the computational part of ANM, statistical analyses, including Pearson correlation, one-sample t-tests, and Bonferroni correction, were performed using built-in functions of MATLAB R2023a. The Spearman correlation and spin test in the receptor analysis were implemented with functions from the Python scipy (https://docs.scipy.org) and netneurotools (https://github.com/netneurolab/netneurotools) libraries. All statistical analyses for gene enrichment were carried out using the WebGestalt web tool. Sample sizes and the number of replicates were determined according to the experimental requirements, and relevant data and code are available upon reasonable request from the corresponding author to ensure the reproducibility of the study.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
All data are available in the main text or the supplementary materials.
Code availability
The ANM algorithm is open-source and can be freely accessed at https://github.com/sailingpeng/2021_ActivationNetworkMapping.git. The custom code is available at https://github.com/PinyuanHu/Emotional-prosody-ANM.
References
Eddy, C. M. & Cook, J. L. Emotions in action: the relationship between motor function and social cognition across multiple clinical populations. Prog. Neuro Psychopharmacol. Biol. Psychiatry 86, 229–244 (2018).
Blonder, L. X., Gur, R. E. & Gur, R. C. The effects of right and left hemiparkinsonism on prosody. Brain Lang. 36, 193–207 (1989).
Shamay-Tsoory, S. G., Tomer, R., Goldsher, D., Berger, B. D. & Aharon-Peretz, J. Impairment in cognitive and affective empathy in patients with brain lesions: anatomical and cognitive correlates. J. Clin. Exp. Neuropsyc 26, 1113–1127 (2004).
Van Lancker, D. & Sidtis, J. J. The identification of affective-prosodic stimuli by left- and right-hemisphere-damaged subjects: all errors are not created equal. J. Speech Hear Res. 35, 963–970 (1992).
Grandjean, D., Bänziger, T. & Scherer, K. R. Intonation as an interface between language and affect. In Progress in Brain Research Vol. 156, 235–247 (Elsevier, 2006).
Pell, M. D. Judging emotion and attitudes from prosody following brain damage. In Progress in Brain Research, Vol. 156, 303–317 (Elsevier, 2006).
Mitchell, R. L. C. & Ross, E. D. Attitudinal prosody: what we know and directions for future study. Neurosci. Biobehav. R. 37, 471–479 (2013).
Ethofer, T. et al. Cerebral pathways in processing of affective prosody: a dynamic causal modeling study. NeuroImage 30, 580–587 (2006).
Leitman. “It’s not what you say, but how you say it”: a reciprocal temporo-frontal network for affective prosody. Front. Hum. Neurosci. https://doi.org/10.3389/fnhum.2010.00019 (2010).
Ethofer, T. et al. Emotional voice areas: anatomic location, functional properties, and structural connections revealed by combined fMRI/DTI. Cereb. Cortex 22, 191–200 (2012).
Pichon, S. & Kell, C. A. Affective and sensorimotor components of emotional prosody generation. J. Neurosci. 33, 1640–1650 (2013).
Péron, J., Frühholz, S., Ceravolo, L. & Grandjean, D. Structural and functional connectivity of the subthalamic nucleus during vocal emotion decoding. Soc. Cogn. Affect Neurosci. 11, 349–356 (2016).
Correia, A. I. et al. Resting-state connectivity reveals a role for sensorimotor systems in vocal emotional processing in children. NeuroImage 201, 116052 (2019).
Ceravolo, L., Frühholz, S., Pierce, J., Grandjean, D. & Péron, J. Basal ganglia and cerebellum contributions to vocal emotion processing as revealed by high-resolution fMRI. Sci. Rep. 11, 10645 (2021).
Leipold, S., Abrams, D. A., Karraker, S., Phillips, J. M. & Menon, V. Aberrant emotional prosody circuitry predicts social communication impairments in children with autism. Biol. Psychiatry 8, 531–541 (2023).
Li, M.-T. et al. The effect of seed location on functional connectivity: evidence from an image-based meta-analysis. Front. Neurosci. 17, 1120741 (2023).
Kotz, S. A., Meyer, M. & Paulmann, S. Lateralization of emotional prosody in the brain: an overview and synopsis on the impact of study design. In Progress in Brain Research, Vol. 156 (eds Anders, S., Ende, G., Junghofer, M., Kissler, J. & Wildgruber, D.) 285–294 (Elsevier, 2006).
Elizalde Acevedo, B. et al. Brain mapping of emotional prosody in patients with drug-resistant temporal epilepsy: an indicator of plasticity. Cortex 153, 97–109 (2022).
Peng, S., Xu, P., Jiang, Y. & Gong, G. Activation network mapping for integration of heterogeneous fMRI findings. Nat. Hum. Behav. 6, 1417–1429 (2022).
Schirmer, A., Zysset, S., Kotz, S. A. & Yves Von Cramon, D. Gender differences in the activation of inferior frontal cortex during emotional speech perception. NeuroImage 21, 1114–1123 (2004).
Beaucousin, V. et al. Sex-dependent modulation of activity in the neural networks engaged during emotional speech comprehension. Brain Res. 1390, 108–117 (2011).
Frühholz, S., Ceravolo, L. & Grandjean, D. Specific brain networks during explicit and implicit decoding of emotional prosody. Cereb. Cortex 22, 1107–1117 (2012).
Wildgruber, D., Pihan, H., Ackermann, H., Erb, M. & Grodd, W. Dynamic brain activation during processing of emotional intonation: influence of acoustic parameters, emotional valence, and sex. NeuroImage 15, 856–869 (2002).
Imaizumi, S., Homma, M., Ozawa, Y., Maruishi, M. & Muranaka, H. Gender differences in the functional organization of the brain for emotional prosody processing. In Speech Prosody 2004 605–608 (ISCA, 2004). https://doi.org/10.21437/SpeechProsody.2004-139.
Eagly, A. H. & Wood, W. The origins of sex differences in human behavior: evolved dispositions versus social roles. Am. Psychol. 54, 408–423 (1999).
Kret, M. E. & De Gelder, B. A review on sex differences in processing emotional signals. Neuropsychologia 50, 1211–1221 (2012).
Malezieux, M., Klein, A. S. & Gogolla, N. Neural circuits for emotion. Annu. Rev. Neurosci. 46, 211–231 (2023).
Taylor, S. E. et al. Biobehavioral responses to stress in females: tend-and-befriend, not fight-or-flight. Psychol. Rev. 107, 411–429 (2000).
Cosgrove, K. P., Mazure, C. M. & Staley, J. K. Evolving knowledge of sex differences in brain structure, function and chemistry. Biol. Psychiatry 62, 847–855 (2007).
Beani, L. & Zuk, M. Beyond sexual selection: The evolution of sex differences from brain to behavior. Neurosci. Biobehav. Rev. 46, 497–500 (2014).
Archer, J. The reality and evolutionary significance of human psychological sex differences. Biol. Rev. 94, 1381–1415 (2019).
Kappeler, P. M. et al. Sex roles and sex ratios in animals. Biol. Rev. 98, 462–480 (2023).
Ngun, T. C., Ghahramani, N., Sánchez, F. J., Bocklandt, S. & Vilain, E. The genetics of sex differences in brain and behavior. Front. Neuroendocr. 32, 227–246 (2011).
Brück, C., Kreifelts, B. & Wildgruber, D. Emotional voices in context: a neurobiological model of multimodal affective information processing. Phys. Life Rev. 8, 383–403 (2011).
Kotz, S. A. & Paulmann, S. Emotion, language, and the brain. Lang. Linguist Compas 5, 108–125 (2011).
Van Essen, D. C. et al. The WU-Minn Human Connectome Project: an overview. Neuroimage 80, 62–79 (2013).
Schirmer, A. & Kotz, S. A. Beyond the right hemisphere: brain mechanisms mediating vocal emotional processing. Trends Cogn. Sci. 10, 24–30 (2006).
Alba-Ferrara, L., Hausmann, M., Mitchell, R. L. & Weis, S. The neural correlates of emotional prosody comprehension: disentangling simple from complex emotion. PLOS One 6, e28701 (2011).
Escoffier, N., Zhong, J., Schirmer, A. & Qiu, A. Emotional expressions in voice and music: same code, same effect? Hum. Brain Mapp. 34, 1796–1810 (2013).
Liang, B. & Du, Y. The functional neuroanatomy of lexical tone perception: an activation likelihood estimation meta-analysis. Front. Neurosci. 12, 495 (2018).
Silva, A. B. et al. A neurosurgical functional dissection of the middle precentral gyrus during speech production. J. Neurosci. 42, 8416–8426 (2022).
Belyk, M. & Brown, S. Perception of affective and linguistic prosody: an ALE meta-analysis of neuroimaging studies. Soc. Cogn. Affect Neur 9, 1395–1403 (2014).
Wildgruber, D., Ackermann, H., Kreifelts, B. & Ethofer, T. Cerebral processing of linguistic and emotional prosody: fMRI studies. In Progress in Brain Research, Vol. 156, 249–268 (Elsevier, 2006).
Witteman, J., Van Heuven, V. J. P. & Schiller, N. O. Hearing feelings: a quantitative meta-analysis on the neuroimaging literature of emotional prosody perception. Neuropsychologia 50, 2752–2763 (2012).
Fischer, A. H. & Evers, C. The social costs and benefits of anger as a function of gender and relationship context. Sex. Roles 65, 23–34 (2011).
Forni-Santos, L. & Osório, F. L. Influence of gender in the recognition of basic facial expressions: a critical literature review. World J. Psychiatry 5, 342 (2015).
Lausen, A. & Schacht, A. Gender differences in the recognition of vocal emotions. Front. Psychol. 9, 882 (2018).
Sokolov, A. A., Krüger, S., Enck, P., Krägeloh-Mann, I. & Pavlova, M. A. Gender affects body language reading. Front. Psychol. 2, 16 (2011).
Thompson, A. E. & Voyer, D. Sex differences in the ability to recognise non-verbal displays of emotion: a meta-analysis. Cognit. Emot. 28, 1164–1195 (2014).
Pell, M. D. & Kotz, S. A. On the time course of vocal emotion recognition. PLoS One 6, e27256 (2011).
Goddard, A. W. et al. Current perspectives of the roles of the central norepinephrine system in anxiety and depression. Depress Anxiety 27, 339–350 (2010).
Harrison, N. A., Morgan, R. & Critchley, H. D. From facial mimicry to emotional empathy: a role for norepinephrine? Soc. Neurosci. 5, 393–400 (2010).
Rodrigues, S. M., Bauer, E. P., Farb, C. R., Schafe, G. E. & LeDoux, J. E. The Group I metabotropic glutamate receptor mGluR5 is required for fear memory formation and long-term potentiation in the lateral amygdala. J. Neurosci. 22, 5219–5229 (2002).
Banerjee, P., Mehta, M. & Kanjilal, B. The 5-HT1A receptor: a signaling hub linked to emotional balance. In Serotonin Receptors in Neurobiology (ed. Chattopadhyay, A.) (CRC Press/Taylor & Francis, 2007).
Bernasconi, F. et al. Spatiotemporal brain dynamics of emotional face processing modulations induced by the serotonin 1A/2A receptor agonist psilocybin. Cereb. Cortex 24, 3221–3231 (2014).
Lutz, B. Endocannabinoid signals in the control of emotion. Curr. Opin. Pharm. 9, 46–52 (2009).
Spies, M., Handschuh, P. A., Lanzenberger, R. & Kranz, G. S. Sex and the serotonergic underpinnings of depression and migraine. In Handbook of Clinical Neurology, Vol. 175, 117–140 (Elsevier, 2020).
Acosta, J. I. et al. Transitional versus surgical menopause in a rodent model: etiology of ovarian hormone loss impacts memory and the acetylcholine system. Endocrinology 150, 4248–4259 (2009).
Muth, E. A., Crowley, W. R. & Jacobowitz, D. M. Effect of gonadal hormones on luteinizing hormone in plasma and on choline acetyltransferase activity and acetylcholine levels in discrete nuclei of the rat brain. Neuroendocrinology 30, 329–336 (2008).
Schirmer, A. & Adolphs, R. Emotion perception from face, voice, and touch: comparisons and convergence. Trends Cogn. Sci. 21, 216–228 (2017).
Mauchand, M. & Zhang, S. Disentangling emotional signals in the brain: an ALE meta-analysis of vocal affect perception. Cogn., Affect. Behav. Neurosci. 23, 17–29 (2023).
Van Essen, D. C. et al. The Human Connectome Project: a data acquisition perspective. Neuroimage 62, 2222–2231 (2012).
Glasser, M. F. et al. The minimal preprocessing pipelines for the Human Connectome Project. Neuroimage 80, 105–124 (2013).
Hansen, J. Y. et al. Local molecular and global connectomic contributions to cross-disorder cortical abnormalities. Nat. Commun. 13, 4682 (2022).
Dong, X. et al. How brain structure–function decoupling supports individual cognition and its molecular mechanism. Hum. Brain Mapp. 45, e26575 (2024).
Joliot, M. et al. AICHA: an atlas of intrinsic connectivity of homotopic areas. J. Neurosci. Methods 254, 46–59 (2015).
Liu, J., Xia, M., Wang, X., Liao, X. & He, Y. The spatial organization of the chronnectome associates with cortical hierarchy and transcriptional profiles in the human brain. NeuroImage 222, 117296 (2020).
Aleksander, S. A. et al. The Gene Ontology knowledgebase in 2023. Genetics 224, iyad031 (2023).
Ashburner, M. et al. Gene Ontology: tool for the unification of biology. Nat. Genet. 25, 25–29 (2000).
Kanehisa, M. & Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
Piñero, J. et al. The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Res. gkz1021, https://doi.org/10.1093/nar/gkz1021 (2019).
Elizarraras, J. M. et al. WebGestalt 2024: faster gene set analysis and new support for metabolomics and multi-omics. Nucleic Acids Res. 52, W415–W421 (2024).
Acknowledgements
This research project is supported in part by the National Natural Science Foundation of China (81701783 to S.Z., 81801782 to X.S.), New Talent Project of Beijing University of Posts and Telecommunications (2021RC40, 2023RC59), STI 2030--Major Projects (2021ZD0200500), Science Foundation of Beijing Language and Culture University (supported by “the Fundamental Research Funds for the Central Universities”) (19YBB39, 20YJ090001).
Author information
Contributions
Conceptualization: S.Z., X.S., X.Z. Methodology: S.Z., S.P., P.H., Y.S., M.L., X.Z. Visualization: P.H., X.O. Supervision: S.Z., X.S. Writing – original draft: P.H., S.Z., X.S., X.O., X.Z. Writing – review & editing: P.H., S.Z., X.S., X.O., S.P., Y.S., W.Z.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Biology thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editor: Jasmine Pan. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Hu, P., Sun, X., Ouyang, X. et al. Gender and task type effects on the neural network of emotional prosody processing. Commun Biol 9, 351 (2026). https://doi.org/10.1038/s42003-026-09625-8
DOI: https://doi.org/10.1038/s42003-026-09625-8