Introduction

Emotion, as a fundamental aspect of human beings, plays a vital role in social interactions and communication. Accurate perception of emotional information is crucial for effective social interaction, and deficits in emotional perception are linked to neuropsychiatric disorders1,2,3,4. Emotional prosody (EP) is a key emotional cue and a primary channel for conveying paralinguistic emotion in spoken communication5, 6. It enriches the meaning of spoken language, enabling listeners to infer the speaker’s feelings and intentions more accurately. Accurate perception of EP is essential for successful social interactions, as it supports the recognition of emotional states and fosters empathy, thereby promoting conversational responses7. Therefore, investigating the neural underpinnings of emotional prosody processing may facilitate the understanding of the neural mechanisms underlying emotional processes and provide insights into disease pathogenesis.

With advances in seed-based functional connectivity techniques, numerous studies have investigated the neural underpinnings of emotional prosody processing from a brain network perspective8,9,10,11,12,13,14,15. However, these studies rely predominantly on a subset of regions predefined from the prior literature, which makes it difficult to map the network accurately and completely16. The core issue lies in the limited reproducibility of these regions across studies, a variability attributed to differences in experimental designs17 and participant demographics (e.g., gender). For example, an fMRI study involving an explicit emotional judgement task found peak activations in the right precentral gyrus (PreCG) for emotional prosody processing. In contrast, another fMRI study using a passive listening task showed different motor cortex activations in the bilateral postcentral gyrus (PostCG), with additional peak activations in the bilateral anterior cingulate cortex and fusiform gyrus14, 18. Such limited reproducibility may introduce spurious components into the network or leave the network incompletely characterized. To address this limitation, the activation network mapping (ANM) technique, analogous to a meta-analytic tool, has been developed to systematically integrate activation coordinates across multiple studies and delineate more robust and unified functional networks19. Critically, ANM not only delineates more robust networks but also provides a novel framework to explain inconsistent findings across studies through variations in network-level information flow, by mapping directed connectivity from meta-analytic data.

While ANM effectively characterizes brain networks for emotional prosody processing, it remains unknown how these networks are modulated by critical factors such as gender and task type, both of which are known to shape regional activation patterns of emotional prosody processing20,21,22. Gender, in particular, has been repeatedly identified as an important source of variability. For example, some studies found that men exhibit stronger activation in the right frontal cortex, especially the inferior frontal gyrus (IFG), while women show greater activity in the left middle temporal gyrus (MTG) during emotional prosody processing. However, other studies reported that women show greater bilateral IFG activation than men20, 21, 23, 24. From an evolutionary perspective, gender differences in emotional prosody processing might be partially attributed to distinct social roles in ancestral environments25. Females, likely due to caregiving roles, may have evolved heightened sensitivity to emotional cues, whereas males, possibly through hunting and competition, may have become more responsive to threats and dominance signals26,27,28. Neuroimaging evidence provides partial support for this account, showing that women exhibit higher gray-to-white matter ratios in frontal and temporal regions, whereas men have larger total volumes in the frontomedial cortex, cerebellum, and amygdala29, 30. Furthermore, previous studies have proposed that such gender differences could reflect genetically mediated adaptations to ancestral environments25, 31,32,33. Task type, on the other hand, introduces differences in task demands that range from implicit processing to explicit emotional judgments (i.e., implicit emotional prosody [IEP] vs. explicit emotional prosody [EEP]) and is believed to engage distinct processing levels and neural circuits34, 35.

Based on previous findings, we put forward two hypotheses: 1) EEP and IEP processing exhibit different activation network patterns; 2) there is a gender difference in the activation network patterns of EP processing. To test these hypotheses, we adopted the ANM technique to investigate the neural networks of EP processing based on the Human Connectome Project (HCP) dataset36. First, we converged the heterogeneous fMRI results into a common emotional prosody network using ANM analysis. Second, we assessed the effects of two factors, gender and task type, on emotional prosody processing. Finally, to gain insight into the neurochemical mechanisms supporting the emotional prosody network, we examined its spatial correspondence with neurotransmitter receptor and gene expression profiles.

Results

Article selection

Detailed information about the article selection is shown in Table 1 and Supplementary Table 1. Here, different experiments or contrasts from the same article were counted separately. For the emotional prosody analysis, a total of 40 articles were included, with 54 experiments, 725 subjects (334 females, 391 males), and 474 activation coordinates. To explore the difference in activation networks between EEP and IEP, we categorized the experiments into EEP experiments and IEP experiments. There were 32 EEP articles, with 41 experiments, 587 subjects (270 females, 317 males), and 390 coordinates, and 9 IEP articles, with 11 experiments, 162 subjects (78 females, 84 males), and 59 coordinates. Note that 2 experiments were excluded from the EEP and IEP analyses because their contrasts were not appropriate for this comparison.

Table 1 Detailed information about the article selection

Activation networks of EP processing

The activation networks from ANM are shown in Fig. 1A. For EP, our analysis revealed a widespread activation network, including the somatomotor network (SMN, bilateral PreCG and PostCG, bilateral primary auditory cortex [PAC], bilateral planum temporale [PT], bilateral posterior part of superior temporal gyrus [STGp], bilateral superior temporal sulcus [STS] and left Heschl’s gyrus [HG]), ventral attention network (VAN, bilateral insula), default mode network (DMN, right posterior middle temporal gyrus [MTGp]), frontoparietal network (FPN, bilateral opercular and triangular parts of the IFG [IFGpo and IFGpt], bilateral middle frontal gyrus [MFG] and bilateral posterior supramarginal gyrus [SMGp]), dorsal attention network (DAN, temporooccipital part of the bilateral MTG [MTGtp]), limbic network (LN, bilateral orbitofrontal cortex [OFC]), and subcortical network (amygdala).

Fig. 1: The activation networks of emotional prosody based on overall cohort.

A The activation networks of EP, EEP and IEP, B The overlap of EEP and IEP.

For EEP, the unthresholded activation network was highly correlated with that of EP (\(r\) = 0.99, \({P}_{{spin}}\) < \(1.0 \times 10^{-4}\)) (Fig. 2B). In contrast, the activation network of IEP was more localized, primarily involving regions such as the bilateral PAC, bilateral insula, bilateral SMGp, bilateral PT, bilateral STGp and STS, resulting in a relatively lower spatial correlation with that of EP (\(r\) = 0.86, \({P}_{{spin}}\) < \(1.0 \times 10^{-4}\)) (Fig. 2B). The overlap between the activation networks of EEP and IEP showed that the EEP activation network encompassed the network of IEP (Fig. 2A), with an EEP-specific network primarily including regions such as the bilateral IFGpo, left IFGpt, bilateral MFG, bilateral OFC, right MTGtp, bilateral PreCG and bilateral PostCG. These findings indicate that EEP engages a more extensive neural network than IEP, especially in areas associated with higher-order cognitive processing.

Fig. 2: Gender effects on emotional prosody activation network.

A The activation networks of EP processing (including EP, EEP and IEP) and the overlap of EEP and IEP based on female cohorts (left) and male cohorts (right) respectively. B The spatial correlation between unthresholded activation networks. C The overlap maps of female and male for EP, EEP and IEP. O-EP, F-EP, M-EP: Emotional prosody activation networks for overall, female and male cohort; O-EEP, F-EEP, M-EEP: Explicit emotional prosody activation networks for overall, female and male cohort; O-IEP, F-IEP, M-IEP: Implicit emotional prosody activation networks for overall, female and male cohort.

To statistically assess task-type effects on functional connectivity strength, voxel-wise two-sample t-tests were conducted on the averaged Fisher z maps in each cohort. Across all three cohorts, EEP processing showed consistently higher functional connectivity than IEP between the activation seeds and brain regions mainly located in the left IFG, MFG, and PreCG (Fig. 3A; overall cohort: peak MNI coordinates [−46, 4, 46], t (50) = 3.98; females: peak MNI coordinates [−46, 4, 46], t (50) = 4.44; males: peak MNI coordinates [−50, 4, 50], t (50) = 3.6971; all voxel-level p < 0.01, cluster-level FWE-corrected p < 0.05). The increased functional connectivity during EEP processing might be related to the additional emotional evaluative demands inherent to the explicit task.

Fig. 3: Statistical comparison.

A Clusters revealed significant differences between EEP (n = 41) and IEP (n = 11) in the overall, female, and male cohorts. B Clusters revealed significant differences between female and male participants in EP (n = 54), EEP, and IEP. Cluster-forming threshold p < 0.01; cluster-level correction: FWE p < 0.05.

Widespread activation networks of EP in females

To test gender differences in brain networks involved in EP, we separately estimated the activation networks for male and female cohorts using the activation seeds and the fMRI scans of the corresponding HCP subjects. As illustrated in Fig. 2A, C, the ANM results indicated that females engaged a broader brain network than males during EP processing. Specifically, for EP and EEP, the gender-shared activation network was mainly located in the bilateral PT and IFGpo/IFGpt, while for IEP it was confined to the bilateral PT. The female-specific network involved in EP and EEP was mainly distributed in regions including the bilateral MFG, right MTGtp, bilateral insula, left SMC, bilateral PreCG, and bilateral PostCG. Moreover, for IEP, brain regions including the right insula were female-specific.

To statistically assess gender effects on functional connectivity strength, voxel-wise paired t-tests were conducted across experiments on the group-averaged Fisher z maps for each task type. Compared with males, females exhibited significantly increased functional connectivity between the activation seeds and most of the voxels (95.32% for EP, 90.30% for EEP) within the female-male union emotional prosody network for EP and EEP (Fig. 3B). For IEP, the significantly increased connectivity in females was mainly localized in the bilateral insula (left cerebrum: peak MNI coordinates [−36, −10, 4], t (10) = 8.80; right cerebrum: peak MNI coordinates [36, 12, 2], t (10) = 12.14; all voxel-level p < 0.01, cluster-level FWE-corrected p < 0.05).

Neuroreceptor mechanisms underlying EP processing

To investigate the neuroreceptor mechanisms underlying the activation networks of emotional prosody, we conducted a spatial correlation analysis between receptor/transporter maps and the activation networks. A common neuroreceptor set related to emotional prosody was found, including \({5{HT}}_{1A}\), \({{CB}}_{1}\), \({{mGluR}}_{5}\) and \({NET}\) (\({5{HT}}_{1A}\): \(r\) = 0.32 ± 0.037, \({P}_{{spin}}\) = 0.014 ± 0.012; \({{CB}}_{1}\): \(r\) = 0.34 ± 0.074, \({P}_{{spin}}\) = 0.028 ± 0.012; \({{mGluR}}_{5}\): \(r\) = 0.31 ± 0.026, \({P}_{{spin}}\) = 0.020 ± 0.021; \({NET}\): \(r\) = 0.33 ± 0.013, \({P}_{{spin}}\) = 0.00013 ± \(5.0 \times 10^{-5}\)). Additionally, another set of neuroreceptors exhibited a gender-specific effect. For EP and EEP, \({5{HT}}_{1B}\), \({5{HT}}_{2A}\), and \({5{HT}}_{6}\) were significantly positively associated with spatial activation patterns in females (\({5{HT}}_{1B}\): [EP: \(r\) = 0.27, \({P}_{{spin}}\) = 0.018; EEP: \(r\) = 0.27, \({P}_{{spin}}\) = 0.013]; \({5{HT}}_{2A}\): [EP: \(r\) = 0.28, \({P}_{{spin}}\) = 0.036; EEP: \(r\) = 0.28, \({P}_{{spin}}\) = 0.032]; \({5{HT}}_{6}\): [EP: \(r\) = 0.21, \({P}_{{spin}}\) = 0.044; EEP: \(r\) = 0.20, \({P}_{{spin}}\) = 0.0496]). In contrast, \({VAChT}\) showed significant positive correlations with spatial activation patterns in males (EP: \(r\) = 0.31, \({P}_{{spin}}\) = 0.01; EEP: \(r\) = 0.34, \({P}_{{spin}}\) = 0.0030) (Fig. 4C).
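The spatial correlation analysis described above can be illustrated with a minimal sketch. It assumes that both maps have already been summarized over the same set of brain parcels and that the spatially constrained reorderings of a spin test are supplied by the caller; the function name `spatial_corr_pvalue` and the argument `perm_indices` are hypothetical and are not taken from the analysis pipeline of this study.

```python
import numpy as np

def spatial_corr_pvalue(network_map, receptor_map, perm_indices):
    """Pearson r between two parcel-wise maps plus a permutation p-value.

    network_map, receptor_map: (n_parcels,) arrays on the same parcellation.
    perm_indices: (n_perm, n_parcels) integer array; each row reorders the
        parcels. In a spin test these rows would come from random rotations
        of a spherical surface projection, which this sketch does NOT generate.
    """
    r_obs = np.corrcoef(network_map, receptor_map)[0, 1]
    # Null distribution: correlate the reordered network map with the receptor map.
    null = np.array([
        np.corrcoef(network_map[idx], receptor_map)[0, 1] for idx in perm_indices
    ])
    # Two-sided p-value; the +1 terms count the observed value as one permutation.
    p_value = (1 + np.sum(np.abs(null) >= abs(r_obs))) / (1 + len(null))
    return r_obs, p_value
```

Passing plain random shuffles as `perm_indices` would ignore spatial autocorrelation and tends to give overly optimistic p-values, which is the reason spin-based nulls are preferred for brain maps.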

Fig. 4: Receptor/transporter and gene enrichment analysis.

A Gene enrichment analysis results based on BP, CC and MF. B Gene enrichment analysis results based on Pathways and Disease. Please refer to Supplementary Table 2 for the corresponding terms. Only enrichments with a significance of FDR p < 0.05 are displayed. Terms colored blue are related to the physiological process of Energy metabolism; those in green pertain to Synapse extensions; and the red-colored terms are associated with Active transportation. C The correlation between activation networks and the receptor/transporter maps. Only correlations with a significance of Pspin < 0.01 are displayed. D The proposed hierarchical processing model of emotional prosody. Primary auditory cortex (PAC).

Genetic mechanisms underlying the EP processing

The first three PLS components were all significant under 10,000 permutations. For each activation network, the first three components of the PLS regression together explained 32.48–43.42% of the variance (Supplementary Fig. 2).
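As an illustration of how the variance explained by individual PLS components can be computed, the following is a minimal sketch for a single response variable using the NIPALS algorithm. The exact PLS implementation used in this study is not specified in this section, so the function `pls1_explained_variance` and its inputs are assumptions made only for the example.

```python
import numpy as np

def pls1_explained_variance(X, y, n_components=3):
    """Fraction of variance in y explained by each PLS component (NIPALS, one response).

    X: (n_regions, n_genes) regional gene expression (no constant columns assumed).
    y: (n_regions,) regional values of one activation network.
    """
    # Standardize predictors and response.
    Xk = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yk = (y - y.mean()) / y.std(ddof=1)
    total = yk @ yk
    explained = []
    for _ in range(n_components):
        w = Xk.T @ yk                 # weight vector: covariance of genes with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # component scores across regions
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        q = (yk @ t) / tt             # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - q * t               # deflate y
        explained.append(q ** 2 * tt / total)
    return np.array(explained)
```

Because the component scores are mutually orthogonal, the per-component fractions add up, so their sum over the first three components corresponds to the kind of cumulative percentage reported above.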

Supplementary Fig. 3 illustrates that within each component of the PLS regressions (PLS1, PLS2, and PLS3), the distributions are highly correlated, while the correlations between different components across all activation networks are relatively low. Specifically, PLS1 exhibited a transcriptional profile marked by under-expression predominantly in the bilateral LG and para-hippocampal regions. PLS2 displayed overexpression primarily in the bilateral Insula, TP, PreCG, and PostCG, with concurrent under-expression in the occipital lobe. PLS3 revealed overexpression chiefly in the bilateral PreCG, PostCG, posterior STG, and occipital lobe, while showing under-expression in the frontal lobe.

The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses revealed that the genes were enriched in biological processes (BP), cellular components (CC), molecular functions (MF) and pathways related to energy metabolism, synapse extensions and active transportation (Fig. 4A). Energy metabolism included fatty acid metabolic process [GO:0006631] in BP, mitochondrial matrix [GO:0005759] in CC, ATP hydrolysis activity [GO:0016887] in MF, and fatty acid degradation [hsa00071] in pathways. Synapse extensions included cell leading edge [GO:0031252] in CC and metallopeptidase activity [GO:0008237] in MF. Active transportation included active transmembrane transporter activity [GO:0022804] in MF and ABC transporters [hsa02010] in pathways (Fig. 4A).

Additional Disease Gene Network (DisGeNET) disease enrichment analyses showed that genes associated with emotional prosody processing are linked to Disease Progression [umls:C0242656], Autistic Disorder [umls:C0004352] and Alzheimer’s Disease [umls:C0002395] (Fig. 4B).

Discussion

By integrating previous neuroimaging findings, we identified a common activation network underlying EP processing, characterized its gender-specific features, and linked it to transcription profiles and neurotransmitter receptor patterns. Our analysis revealed a widespread activation network for EP processing including the DMN, DAN, LN, SMN, and subcortical network. Notably, females exhibited a relatively broader activation network than males. Furthermore, the task-type analysis showed that the explicit network extended beyond the implicit network, additionally recruiting executive frontal and sensorimotor regions, which echoes the hierarchical processing model of emotional prosody. Interestingly, these activation networks significantly correlated with the spatial patterns of receptors/transporters (\({5{HT}}_{1A}\), \({{CB}}_{1}\), \({{mGluR}}_{5}\), and \({NET}\)) and gene expression profiles (energy metabolism, synapse extension, active transmembrane transport, along with diseases such as autistic disorder, Alzheimer’s disease, and general disease progression). These findings enhance our understanding of the neurobiological and modular mechanisms underlying EP processing and suggest that both gender and task type modulate its neural architecture.

Extended hierarchical processing model of emotional prosody

Using ANM analysis, we identified an emotional prosody network, which included the DMN (MTGp), DAN (MTGtp), FPN (IFG), LN (OFC), SMN (PreCG, PostCG, PT, PAC, STG, STS, and HG), VAN (bilateral insula), and subcortical network (amygdala). We found that most of the activated brain regions (PAC, STG, STS, IFG, OFC) were involved in the prevailing three-stage model of emotional prosody processing37, jointly demonstrating the central functional importance of these regions in EP processing. Moreover, brain regions involved in the first two stages of this model (the first stage [sensory processing]: PAC; the second stage [integration processing]: STG and STS) are primarily located within the IEP network, while brain regions involved in the third stage ([evaluative judgment processing]: IFG, OFC) are primarily located within the EEP-specific network. The alignment between the processing models from these two perspectives not only validates the established three-stage hierarchical framework but also elucidates the distinct large-scale network mechanisms underpinning each processing stage.

Additionally, the EEP network included the PreCG and PostCG, which are not encompassed by the three-stage model. Although these two regions have been reported in previous studies12, 18, 38, 39, they have generally been overlooked or merely considered as motor responses in explicit tasks, rather than components of emotional prosody processing40. However, our network-based analysis demonstrates an intrinsic functional association between these sensorimotor regions and emotional prosody activation points. This finding is in line with previous evidence that the PreCG and PostCG contribute to somatosensory monitoring and articulatory feedback during speech perception, particularly under demanding task conditions41, thereby enhancing the robustness of emotional prosody perception. Thus, these results suggest that the PreCG and PostCG may provide supportive contributions to emotional prosody recognition.

Taken together, we propose a hierarchical processing model of EP from the perspective of brain networks (Fig. 4D). This model refines the traditional three-stage framework by distinguishing regions primarily engaged during implicit processing (input integration process), regions engaged during explicit processing (decision-making process), and newly extended sensorimotor components (auxiliary decision-making process). By delineating these subsystems, our findings clarify how task demands shape the recruitment of distinct but interconnected neural circuits for emotional prosody processing.

Broader emotional prosody activation network in females

Our study revealed significant gender differences in the neural mechanisms underlying emotional prosody processing. The female group showed broader activation in the bilateral MFG, right MTGtp, bilateral insula, left SMC, bilateral PreCG, and bilateral PostCG, regions that are critical for emotion processing and speech perception34, 35, 42,43,44. This pattern suggests that females engage more extensively in affective evaluation and speech-related sensorimotor integration compared with males. Supporting this interpretation, behavioral studies have shown that females tend to perform better in emotion recognition tasks, including the perception of vocal affect, facial expressions, and body language45,46,47,48,49. Taken together, these findings imply that females may utilize more complex neural networks to handle emotional tasks, resulting in a higher cognitive and emotional integration demand50.

The emotional prosody related receptors/transporters

Our findings provide insights into the molecular underpinnings of emotional prosody, highlighting both shared and gender-specific neuroreceptor contributions. The identification of shared receptors, including the \({5{HT}}_{1A}\) receptor, \({{CB}}_{1}\) receptor, \({{mGluR}}_{5}\), and \({NET}\), suggests that emotional prosody processing may involve a foundational set of neuroreceptors linked to broader emotional regulation. These findings align with prior research showing that these receptors play critical roles in fear and anxiety regulation and in emotional perceptual biases51,52,53. For example, \({5{HT}}_{1A}\) receptor agonists reduce fear recognition in facial expressions, and \({{CB}}_{1}\) receptor agonists modulate anxiety responses, supporting their involvement in processing emotionally salient prosodic cues54,55,56. This shared receptor set underscores the likelihood of conserved molecular mechanisms underlying various emotional processing tasks, extending their significance to the domain of emotional prosody. Future research should experimentally validate the causal involvement of these receptor systems in prosodic emotion processing.

In addition to the shared neuroreceptors, our results revealed gender-specific receptor effects, providing a nuanced understanding of the neurochemical basis of emotional prosody. In females, serotonin receptors (\({5{HT}}_{1B}\), \({5{HT}}_{2A}\), and \({5{HT}}_{6}\)) were significantly associated with activation patterns for both EP and EEP, suggesting heightened serotonergic modulation in emotional prosody. These findings align with previous studies demonstrating that serotonergic signaling is modulated by sex hormones such as estrogen57, which may enhance serotonin receptor sensitivity in females. Conversely, the cholinergic system, as indicated by the significant correlation of \({VAChT}\) with activation patterns in males, points to a distinct neurochemical pathway in male emotional prosody processing. This divergence could reflect structural or functional brain differences between genders, such as the density of cholinergic innervation or hormonal effects on acetylcholine receptor regulation58, 59.

The molecular mechanisms underlying emotional prosody

By applying GO and KEGG pathway enrichment analyses, we revealed that the genes correlated with emotional prosody processing are mainly enriched in pathways related to energy metabolism. These findings suggest that emotional prosody processing is an energy-demanding process. This high energy demand might be associated with the fine-grained discrimination involved in emotional prosody processing60. Additionally, we found that genes involved in metallopeptidase activity and synapse extension pathways were associated with emotional prosody. The involvement of these pathways may reflect the structural plasticity required for efficient neural signaling during emotional communication. These molecular pathways also provide potential targets for further functional validation studies.

The identified genes were also enriched in diseases such as autistic disorder, Alzheimer’s disease, and general disease progression. While these associations are intriguing, they should be interpreted with caution. The enrichment results do not establish causality but rather suggest that emotional prosody processing shares molecular mechanisms with these disorders. For instance, deficits in emotional prosody are well-documented in autism and Alzheimer’s disease. Future research could investigate whether these shared molecular signatures contribute to the impaired emotional communication observed in these conditions.

Beyond co-activation: a connectivity-driven network paradigm for emotional prosody

This study employs ANM to systematically investigate the neural underpinnings of emotional prosody processing. Its central contribution lies in providing a higher-dimensional explanatory framework for existing findings. The analysis reveals that the core brain regions identified by ANM, such as bilateral frontotemporal areas and subcortical structures like the amygdala, show strong convergence with prior meta-analyses (Fig. 1). This consistency not only supports the reliability of the ANM approach but also confirms the robustness of the neural architecture for emotional prosody across different analytical methods. However, the novelty of this work does not rest on replicating established findings but rather on two key advances. Methodologically, ANM moves beyond coordinate-based meta-analysis by detecting stable subnetwork structures that may be obscured in averaged activation maps, thereby uncovering neural patterns that traditional approaches may miss. More critically, on a theoretical level, by analyzing functional connectivity patterns between regions, this study proposes a unified network model capable of explaining inconsistencies in the prior literature, such as the variable involvement of specific brain areas across studies. This model suggests that condition-specific modulation of information flow within a stable large-scale circuit may account for seemingly contradictory results.

Limitations

There are several limitations to our study. First, the activation networks for males and females were estimated using activation data from the overall cohorts because few studies investigated the EP activation within a single gender. As a result, our integrated findings may not fully capture the actual network patterns unique to each gender. Nevertheless, using the same activation data from the overall cohort, we still observed distinct activation networks for males and females, highlighting the different activation patterns during EP processing. Second, our analysis was restricted to statistically significant activation coordinates, so that weaker but reliable activations (e.g., in underpowered studies) might be overlooked. However, given the high temporal synchrony of BOLD signals among brain regions involved in the same cognitive process, network-based analytical approaches substantially reduce the risk of missing functionally relevant areas. Third, given the high heterogeneity observed across studies on emotional prosody, a larger number of experiments could yield more robust results. Future research can further explore the receptor involvement and genetic regulatory mechanisms in emotional prosody processing, to gain a deeper understanding of the neural mechanisms underlying emotional prosody.

Materials and methods

Article selection

A systematic review was conducted by two independent reviewers to identify relevant articles. The primary search included a comprehensive examination of the PubMed database, as well as a thorough review of existing meta-analyses. Studies from January 1, 1993, to July 10, 2024, were searched using terms related to emotional prosody, including “prosody”, “emotional prosody”, and “affective prosody”, in conjunction with terms related to neuroimaging techniques, including “fMRI”, “functional MRI”, “functional magnetic resonance imaging”, “PET”, “positron emission tomography”, “neuroimaging”, and “BOLD”. As a result, a total of 248 research articles were obtained from the PubMed search and earlier meta-analyses40, 42, 44, 61. Articles were then screened and included if they fulfilled the following criteria: 1) Studies that employed tasks to assess the auditory processing of emotional prosody; 2) Studies that included a rest or control task; 3) Studies that involved healthy adult participants, or case-control studies with a healthy adult control group; 4) Studies that reported coordinates in a standardized 3D space from functional magnetic resonance imaging (fMRI) or positron emission tomography (PET). Additionally, studies that employed multi-modal or dichotic listening methodologies were excluded. As a result of the screening process, 40 peer-reviewed articles were identified. The process of the literature search is graphically represented in Fig. 5.

Fig. 5

The flowchart showing the process of identifying the 40 articles included in the analysis.

Data extraction

Data extracted from each experiment included the number of participants and the activation coordinates (in MNI or Talairach space). Only coordinates reported as showing significant activation during emotional prosody processing were considered. To bring all coordinates into the same space, those initially reported in Talairach space were transformed to MNI space using the MNI2Tal web tool (https://bioimagesuiteweb.github.io/webapp/mni2tal.html). To examine the differences between EEP and IEP, we categorized these experiments into two groups based on the specific tasks performed and the contrasts analyzed.

MRI processing for HCP data

To construct a normative functional connectome, the MRI dataset from the HCP S1200 release was included in this study62. Here, 1084 participants with two runs including a left-to-right (LR) and a right-to-left (RL) phase encoding direction were used. All subjects provided written informed consent, and the research protocol was approved by the Institutional Review Board of Washington University. Please refer to Van Essen et al.36 for more details about the dataset.

MRIs were acquired using a Siemens Connectome Skyra 3 T scanner with a 32-channel head coil housed at Washington University in St. Louis. More specific scanning parameters are provided in the Supplementary materials and in Glasser et al.63. The rs-fMRI and T1-weighted images were preprocessed using the HCP minimal-preprocessing pipelines63. The following preprocessing steps were then performed on the minimally preprocessed data. For the T1 images, skull-stripping was first performed using FSL bet2 (f = 0.5). The skull-stripped T1-weighted images were then segmented into three tissue types: white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF). For each individual, the averaged WM signal and CSF signal were extracted based on the segmentation results. For the fMRI images, detrending was first applied to minimize the effects of low-frequency drift. Then, to eliminate potential confounding effects on the gray matter signal, 24 movement parameters and the WM and CSF signals were regressed out. Finally, all images were bandpass filtered (0.01–0.1 Hz) to reduce the impact of high-frequency physiological noise and low-frequency drift.
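The fMRI denoising steps described above (detrending, nuisance regression, and bandpass filtering) can be sketched as follows. This is only an illustration under stated assumptions: the linear trend and the confounds are removed in a single regression rather than in two separate steps, the filter is an ideal frequency-domain bandpass (the filter type is not specified here), and the function name `clean_timeseries` is hypothetical. The default repetition time of 0.72 s corresponds to the HCP resting-state acquisition.

```python
import numpy as np

def clean_timeseries(ts, confounds, tr=0.72, band=(0.01, 0.1)):
    """Detrend, regress out confounds, and band-pass filter voxel time series.

    ts:        (n_timepoints, n_voxels) BOLD signals.
    confounds: (n_timepoints, n_confounds), e.g. 24 motion parameters + WM + CSF.
    tr:        repetition time in seconds.
    band:      pass band in Hz.
    """
    n_t = ts.shape[0]
    trend = np.linspace(-1.0, 1.0, n_t)
    # Design matrix: intercept, linear trend, and nuisance regressors.
    design = np.column_stack([np.ones(n_t), trend, confounds])
    beta, *_ = np.linalg.lstsq(design, ts, rcond=None)
    resid = ts - design @ beta
    # Ideal band-pass filter: zero all Fourier coefficients outside the band.
    freqs = np.fft.rfftfreq(n_t, d=tr)
    keep = (freqs >= band[0]) & (freqs <= band[1])
    spectrum = np.fft.rfft(resid, axis=0)
    spectrum[~keep] = 0.0
    return np.fft.irfft(spectrum, n=n_t, axis=0)
```

A typical call would pass one subject's gray matter time series together with a confound matrix of 26 columns (24 movement parameters plus the mean WM and CSF signals).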

Activation network mapping

To uncover the emotional prosody activation network hidden in inconsistent neuroimaging results, we conducted an ANM analysis using the “overlap” approach. The steps were as follows. First, we identified the activation seed for each experiment by generating 4-mm-radius spheres centered on each coordinate and combining the spheres belonging to the same experiment. Next, a normative functional connectivity map, that is, a standardized representation of the typical pattern of functional connectivity in the human brain across a population, was generated for each experiment. Specifically, for each HCP participant, we computed a Fisher z-transformed correlation map by first calculating the Pearson correlation between the time series of each voxel and that of the activation seed, and then applying the Fisher z-transform to normalize the correlation coefficients. Voxel-wise one-sample t-tests were then performed on the individual Fisher z maps, with age and gender effects controlled, followed by smoothing with a 6 mm full-width at half-maximum (FWHM) Gaussian kernel. We then binarized the normative functional connectivity map for each experiment at a threshold of Bonferroni-corrected P < 0.05. Finally, the activation network was generated by overlapping these binary maps and thresholding at 60%.
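
The overlap procedure described above can be sketched in Python on a small synthetic volume. The original analysis was run in MATLAB, so this is only an illustration under stated assumptions: the age and gender covariates and the 6 mm smoothing step are omitted, the t-test is taken as two-sided, "thresholding at 60%" is read as keeping voxels present in at least 60% of the experiment-level maps, and all function names are hypothetical.

```python
import numpy as np
from scipy import stats

def sphere_mask(shape, center, radius_mm=4.0, voxel_mm=2.0):
    """Flattened boolean mask of voxels within radius_mm of a voxel-index center."""
    grid = np.indices(shape).reshape(3, -1).T
    dist = np.linalg.norm((grid - np.asarray(center)) * voxel_mm, axis=1)
    return dist <= radius_mm

def experiment_seed(shape, foci):
    """Activation seed: union of the spheres around all foci of one experiment."""
    return np.any([sphere_mask(shape, f) for f in foci], axis=0)

def seed_fc_z(bold, seed):
    """Fisher z map for one subject (assumes no zero-variance voxels)."""
    seed_ts = bold[:, seed].mean(axis=1)
    b = bold - bold.mean(axis=0)
    s = seed_ts - seed_ts.mean()
    r = (b.T @ s) / (np.linalg.norm(b, axis=0) * np.linalg.norm(s))
    return np.arctanh(np.clip(r, -0.999999, 0.999999))

def experiment_map(subjects, seed, alpha=0.05):
    """Binary map: voxels surviving a Bonferroni-corrected one-sample t-test."""
    z = np.stack([seed_fc_z(bold, seed) for bold in subjects])
    _, p = stats.ttest_1samp(z, 0.0, axis=0)
    return p < alpha / z.shape[1]

def activation_network(subjects, seeds, overlap=0.6):
    """Voxels present in at least `overlap` of the experiment-level binary maps."""
    maps = np.stack([experiment_map(subjects, seed) for seed in seeds])
    return maps.mean(axis=0) >= overlap

# Synthetic data: voxels with x < 4 share a latent signal (the "true" network).
rng = np.random.default_rng(0)
shape = (8, 5, 5)
in_net = np.indices(shape)[0].ravel() < 4
subjects = []
for _ in range(20):
    latent = rng.standard_normal(100)
    bold = rng.standard_normal((100, in_net.size))
    bold[:, in_net] += latent[:, None]
    subjects.append(bold)

# Four experiments report foci inside the network, one reports a focus outside it.
experiments = [[(1, 1, 1), (1, 3, 3)], [(1, 2, 2)], [(1, 1, 3)], [(1, 3, 1)], [(6, 2, 2)]]
seeds = [experiment_seed(shape, foci) for foci in experiments]
network = activation_network(subjects, seeds)
```

Although the five toy experiments report different coordinates, the four seeded inside the shared-signal region converge on the same network, which is the property the overlap approach relies on.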

In addition, we investigated the effects of task type and gender on functional connectivity strength by performing voxel-wise statistical tests on the population-averaged Fisher z maps. We used a two-sample t-test for the task-type comparison and a paired t-test for the gender comparison, because the same seeds were used for females and males. The analyses were restricted to the union of the activation networks of the two groups being compared, and multiple comparisons were corrected at cluster-level FWE P < 0.05 (cluster-forming threshold of voxel-level P < 0.01, 10,000 permutations) using non-parametric permutation tests implemented in Statistical non-Parametric Mapping (SnPM13, http://warwick.ac.uk/snpm).

The overlap approach was chosen for the following advantages: 1) higher interpretability, 2) reduced bias from autocorrelation, and 3) reduced bias from extreme values in a small number of experiment-level maps. To validate the results, we also applied the “t-test” approach; for details, see Supplementary Methods or Peng et al.19.

Mapping neurotransmitter receptor maps to the activation networks of EP processing

To identify the molecular basis of emotional prosody processing, we calculated the spatial correlation between the activation networks and the distribution of receptors. We used the publicly available Hansen receptors dataset64 (https://github.com/netneurolab/hansen_receptors), which includes PET images of 19 different neurotransmitter receptors and transporters across 9 neurotransmitter systems: serotonin (\({5{HT}}_{1A}\), \({5{HT}}_{1B}\), \({5{HT}}_{2A}\), \({5{HT}}_{4}\), \({5{HT}}_{6}\), \(5{HTT}\)), dopamine (\({D}_{1}\), \({D}_{2}\), \({DAT}\)), norepinephrine (\({NAT}\)), histamine (\({H}_{3}\)), acetylcholine (\({\alpha }_{4}{\beta }_{2}\), \({M}_{1}\), \({VAChT}\)), cannabinoid (\({{CB}}_{1}\)), opioid (\({MU}\)), glutamate (\({NMDA}\), \({{mGluR}}_{5}\)), and GABA (\({{GABA}}_{A/{BZ}}\)) systems. All of these PET images were collected and registered to MNI152 space.

To explore the relationship between the activation networks and the receptor/transporter distributions, we performed Spearman’s correlation analysis after normalization65. Significance was assessed with 10,000 spin-test permutations, which control for spatial autocorrelation.
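
The logic of a spin-based significance test for a Spearman correlation can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the spin permutations are already available as an integer index array `spins` of shape (regions, permutations), and the function names are hypothetical. In the toy example the index array is filled with plain random permutations, which do NOT preserve spatial autocorrelation and stand in for real spins only to keep the example self-contained.

```python
import numpy as np
from scipy import stats

def _zrank(x):
    """Rank-transform and standardize, so a dot product yields Spearman's rho."""
    r = stats.rankdata(x)
    return (r - r.mean()) / r.std()

def spin_test_spearman(network, receptor, spins):
    """Observed Spearman rho, two-sided permutation p-value, and null distribution.

    network, receptor: (regions,) parcellated maps.
    spins: (regions, n_perm) region indices, one resampling per column.
    """
    zx, zy = _zrank(network), _zrank(receptor)
    n = zx.size
    rho = float(zx @ zy / n)
    # Reorder the network map with every spin at once and correlate with the receptor map.
    null = zx[spins].T @ zy / n
    p = (1 + np.sum(np.abs(null) >= abs(rho))) / (1 + spins.shape[1])
    return rho, float(p), null

# Toy example with 384 regions and 1000 stand-in permutations.
rng = np.random.default_rng(0)
n_regions, n_perm = 384, 1000
network = rng.standard_normal(n_regions)
receptor = 0.8 * network + 0.5 * rng.standard_normal(n_regions)
# ASSUMPTION: random permutations used here in place of true spatial spins.
spins = np.column_stack([rng.permutation(n_regions) for _ in range(n_perm)])
rho, p_spin, null = spin_test_spearman(network, receptor, spins)
```

Because Spearman's correlation depends only on ranks, ranking each map once and reordering the ranks avoids recomputing ranks for every permutation.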

Association between activation networks and gene expression

To identify the genetic basis of emotional prosody processing, we used the publicly available Allen Human Brain Atlas (AHBA) gene expression data (http://human.brain-map.org). The AHBA provides transcriptomic data for 58,692 probes corresponding to 29,131 genes, derived from 3702 spatially distinct tissue samples collected from six postmortem adult brains. The gene expression data were processed with the abagen toolbox (https://abagen.readthedocs.io/) in the following steps: intensity-based probe filtering, representative probe selection, matching of tissue samples to an atlas (the Atlas of Intrinsic Connectivity of Homotopic Areas, AICHA)66, normalization, and aggregation. Further details are provided in the Supplementary methods. This yielded a 15,633 × 384 (genes × regions) gene expression matrix.

To detect genes whose expression levels were significantly correlated with the EP activation networks, we used partial least squares (PLS) regression, with the z-scored region-level gene expression matrix as the independent variable (X) and the z-scored parcellated activation network as the dependent variable (Y). A spatial permutation (spin) test with 10,000 permutations was used to assess the significance of each PLS component67. The first three components of all PLS regressions passed the spin test (\({P}_{{spin}}\) < 0.01), so subsequent analyses were based on these three components. Genes whose absolute weight ranked in the top 10% (top 1563) in each PLS component were selected for subsequent enrichment analyses.
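
The PLS step and the gene selection rule can be illustrated with a small NumPy sketch. It is not the authors' pipeline and rests on several assumptions: PLS is implemented with a basic NIPALS loop for a single response, the test statistic for each component is taken to be the fraction of variance in Y it explains (the source does not state the statistic), the function names are hypothetical, and the toy "spins" are plain random permutations that do NOT preserve spatial autocorrelation.

```python
import numpy as np

def zscore(a):
    """Column-wise z-score (also works for 1-D arrays)."""
    return (a - a.mean(axis=0)) / a.std(axis=0)

def pls1(X, y, n_components=3):
    """NIPALS PLS for one response.

    X: (regions, genes), y: (regions,), both z-scored.
    Returns gene weights (genes, components) and variance in y explained per component.
    """
    Xk, yk = X.copy(), y.copy()
    total = y @ y
    weights, explained = [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)            # gene weights for this component
        t = Xk @ w                        # regional scores
        tt = t @ t
        q = (yk @ t) / tt
        Xk = Xk - np.outer(t, Xk.T @ t / tt)   # deflate X
        yk = yk - q * t                        # deflate y
        weights.append(w)
        explained.append(q * q * tt / total)
    return np.column_stack(weights), np.array(explained)

def pls_permutation_test(X, y, spins, n_components=3):
    """Permutation p-value per component, comparing explained variance to its null."""
    W, observed = pls1(X, y, n_components)
    null = np.array([pls1(X, y[spins[:, i]], n_components)[1]
                     for i in range(spins.shape[1])])
    p = (1 + (null >= observed).sum(axis=0)) / (1 + spins.shape[1])
    return W, observed, p

def top_genes(W, component, frac=0.10):
    """Indices of genes whose absolute weight ranks in the top `frac`."""
    k = int(round(frac * W.shape[0]))
    return np.argsort(-np.abs(W[:, component]))[:k]

# Toy example: 300 regions, 100 genes; the first 5 genes drive the network map.
rng = np.random.default_rng(0)
X = zscore(rng.standard_normal((300, 100)))
y = zscore(X[:, :5].sum(axis=1) + 0.5 * rng.standard_normal(300))
# ASSUMPTION: random permutations used here in place of true spatial spins.
spins = np.column_stack([rng.permutation(300) for _ in range(200)])
W, explained, p_perm = pls_permutation_test(X, y, spins)
top = top_genes(W, component=0)
```

With 15,633 genes, the same rounding rule in `top_genes` selects 1563 genes per component, matching the count stated above.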

Gene enrichment analysis

The selected gene sets and their signed gene weights were submitted to enrichment analysis to identify enriched GO (https://geneontology.org/) terms68, 69, KEGG pathways70 and DisGeNET (https://www.disgenet.com) diseases71 using the Web-based Gene Set Analysis Toolkit (WebGestalt: https://www.webgestalt.org)72. For GO terms, all three ontology categories (BP, CC and MF) were considered. For each PLS component, only the top 5 positively and top 5 negatively related terms that were significantly enriched at FDR \(p\) < 0.05 (10,000 permutations) were reported as enriched terms.

Statistics and reproducibility

For the computational part of ANM, statistical analyses, including Pearson correlation, one-sample t-tests, and Bonferroni correction, were performed using built-in MATLAB R2023a functions. The Spearman correlation and spin test in the receptor analysis were implemented with functions from the Python scipy (https://docs.scipy.org) and netneurotools (https://github.com/netneurolab/netneurotools) libraries. All statistical analyses for gene enrichment were carried out using the WebGestalt web tool. Sample sizes and the number of replicates were determined according to the experimental requirements. Relevant data and code are available from the corresponding author upon reasonable request to ensure the reproducibility of the study.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.