Introduction

What is social facilitation?

For over a century, social psychologists have examined how the presence of conspecifics (perhaps the most fundamental invariant across myriad animal species) affects behavior. These context-dependent behavioral modulations are commonly termed social facilitation/inhibition (SFI) effects, as others’ presence can either facilitate (enhance) or inhibit (degrade) behavioral performance1. Yet, despite research on SFI effects garnering attention1,2,3, the neural bases of these effects remain insufficiently charted. On a behavioral scale, Zajonc’s work was the first to demonstrate that the presence of others, whether as observers or coactors, typically promotes performance in easy or well-learned tasks and, conversely, impairs performance in complex or poorly-learned ones4. Drawing upon this observation, and the prevailing Hull-Spence behaviorist theory of learning, conditioning, and motivation, Zajonc proposed that the mere presence of conspecifics energizes the emission of dominant (i.e., habitual, prepotent, or automatic) responses. This framework proposes that presence-induced energization of dominant responses improves performance in well-learned tasks, where correct responses are by definition dominant, and deteriorates performance in poorly-learned tasks, where errors constitute the most prevalent responses. A multitude of studies across a range of species have lent support to Zajonc’s perspective of SFI effects, showing that dominant responses, whether correct or incorrect, increased in social presence relative to isolation3. And while this classic view remains the most common interpretation of SFI effects (see McFall et al. for a motivational account close to Zajonc’s5), there is ample evidence that these effects can also involve attentional mechanisms1, at least in primates. Specifically, the presence of others can reportedly facilitate attentional focusing or, in self-threatening circumstances, undermine cognitive control (which heavily relies on executive attention)6.

As for the neural correlates of others’ presence, the few existing studies provide further support for modulation of attentional regions/networks7,8 and of domain-general brain regions typically involved in mentalizing and reward9. Nonetheless, the specific neural mechanisms underlying SFI effects remain unclear. Given that recent inquiries have pointed to the existence of context-sensitive neural populations in prefrontal regions that are preferentially active during either a conspecific’s presence or absence10, it stands to reason that the representative characteristics of the ensembles recruited during others’ presence would differ from those recruited during isolation10.

A probabilistic portrait of SFI effects

At first glance, empirical investigations of these characteristics in higher primates face seemingly insurmountable challenges. Whole-brain recordings capture changes to large-scale brain dynamics, but are often insufficient for delineating granular neurobiological mechanisms, as the spatiotemporal precision of modalities such as EEG, MEG, or fMRI does not lend itself to estimating single-unit activity. Conversely, invasive recordings can illuminate neurobiological detail, but are inherently limited in the number of brain regions they can cover, as operating such recordings (e.g., in primates) concurrently throughout the brain would require a harmful and implausibly large electrode array. These constraints are compounded by the timescale of SFI effects, which inherently precludes in vitro analyses in non-human primates.

Enter dynamic causal modeling (DCM), a well-established model-based approach that enables us to overcome this inherent dichotomy of SFI research11. In essence, DCM is a Bayesian framework for generating and testing in-silico hypotheses regarding the latent (i.e., underlying) causal mechanisms within biological neural networks, and their contribution to various cognitive processes and behaviors. Despite the potential of DCM, one should acknowledge the intrinsic limitation associated with thorough representations of biological neural networks: an exhaustive mapping of all elements of such networks is, for all practical purposes, currently insurmountable. The notion of effective connectivity emerges from this very observation. Directional effective connectivity represents the causal influence exerted by one neural subsystem over another12,13. A large body of work supports associations between attentional modulation in the brain and changes in effective connectivity14. For instance, evidence points to associations between motion-oriented area V5 and posterior parietal cortex when attending to actions14, between prefrontal and premotor cortices during attention to visual motion15, and between visual cortex and the medial temporal lobe16.

Moreover, evidence supports a positive relation between increased attentional demands of a task and enhanced effective connectivity within the descending attention pathway17. This attentional modulation is supposedly achieved via optimization of neural communication at the synaptic level through selective modification of synaptic efficacy (SE), which serves as the neurobiological proxy for modulations in effective connectivity18. Since synaptic inputs can stem from either excitatory or inhibitory presynaptic cells, modulation of either type could bear strong, and potentially degenerate, consequences for Excitation/Inhibition (E/I) balance19. Indeed, the E/I balance in neural networks is critical for maintaining stable network activity, enabling efficient information processing. Disruptions in E/I balance are commonly linked to cognitive deficits and various neurological disorders, including autism spectrum disorder20. Critically, social behavior reportedly shapes the representations of conspecific traits at both synaptic and cellular levels, a finding that has been associated with perturbations in E/I balance21,22,23.

We therefore hypothesize that modulations of brain dynamics due to mere presence, and their subsequent influence on behavior, could be reliably detected as variations of SE across the principal spatiotemporal scales of the primate brain. To test this hypothesis, we employed the framework of DCM on data from our facilitation tasks involving non-human and human primates (performed either in the presence or absence of a social conspecific), in order to link modulations of task performance to adaptations of SE within single neurons, cortical columns, and functional brain networks.

Although challenging from a convergence perspective, conducting DCM across brain scales can provide a more comprehensive understanding of neural dynamics, potentially revealing a universal principle of brain function. When leveraged with advanced probabilistic machine learning techniques24,25,26, this approach can operate across brain scales, enhancing the accuracy of models inferring the neural mechanisms underlying social facilitation, as we demonstrate in this study. We achieved this by estimating synaptic efficacies across said scales using state-of-the-art deep neural density estimators26,27,28. Tailored to Bayesian learning, a family of these generative models called normalizing flows28 applies a series of invertible transformations in order to efficiently convert a simple base probability distribution (like a standard normal) into any complex target distribution. We employ this framework in the current study in order to calculate the probability distribution of synaptic efficacies, given the summary statistics of empirical evidence. This is motivated by the theory of structured flows on manifolds29,30, which posits that brain dynamics and behavior are both constrained to low-dimensional subspaces and are topologically equivalent.
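The change-of-variables step at the core of a normalizing flow can be illustrated with a single fixed transform; the neural spline flows used here instead learn a stack of such invertible maps, and the transform below (exponentiation of a standard normal) is purely illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Base distribution: standard normal samples
z = rng.standard_normal(50_000)

# One invertible transform: x = exp(z). The change-of-variables formula
# gives the implied density p_x(x) = p_z(log x) * |dz/dx| = p_z(log x) / x,
# which is exactly the log-normal density.
x = np.exp(z)

grid = np.linspace(0.1, 5.0, 200)
flow_density = stats.norm.pdf(np.log(grid)) / grid   # density implied by the flow
target_density = stats.lognorm.pdf(grid, s=1.0)      # scipy's reference log-normal

print(np.max(np.abs(flow_density - target_density)))  # ~0: the flow matches the target
```

A learned flow generalizes this idea: each layer is an invertible map with a tractable Jacobian, so the density of an arbitrarily complex target can be evaluated exactly from the base density.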

In the following, we outline the association-learning and lateral-interception tasks conducted in monkeys and humans, respectively. We then demonstrate how presence modulates behavioral dynamics and identify the neural correlates of this modulation in these tasks. Finally, we use these neural correlates as informative data features to agnostically estimate synaptic efficacies across three spatiotemporal brain scales, namely microscale (single neurons), mesoscale (cortical columns), and macroscale (whole brain).

Results

Association-learning in presence vs absence

Monkeys were trained on a touchscreen task to associate abstract cues with specific targets. The experiment consisted of two conditions: social presence and isolation (details on housing and experimental setup provided in Supplementary notes 1 and 2). During social presence trials (Fig. 1A), monkeys faced each other, with each taking on the role of either the actor (performing the task) or the passive spectator (observing the actor). Only the actor interacted with the touchscreen, receiving rewards for correct choices. Spectators remained unrewarded and never participated in the same session as actors to prevent observational learning. Conversely, during isolation trials (Fig. 1B), monkeys performed the task alone, deprived of any visual, auditory, or olfactory contact with their conspecific. The behavioral data obtained through this task indicated a social facilitation effect in both monkeys (see Supplementary Fig. 2, generated from Demolliens et al.10).

Fig. 1: Inference of microscale synaptic efficacy (monkey A).

A Outline of the association-learning task in monkeys, during which subjects had to learn which of the four corners of the touchpad was associated with the cue in order to obtain a food reward. B The two experimental conditions in the association-learning task. C Sample firing rates and spike-rasters of social and asocial neurons. D Correlation of the firing rate of context-oriented neurons (n = 219) with task performance. E Peak firing rate of all categorized neurons in the presence condition (blue) compared to the absence (red) (p < 0.001, n = 92). F Average firing rate of the same neurons as in panel E. G Linear latent dynamics of single-neuron firing rates in the two experimental conditions. H Sample firing rate histogram and spike-raster of simulations of the balanced spiking network. I Sample firing rate of simulations with random values of parameter g from prior distribution. J Peak firing rate of the model based on varying values of SE, and the histogram of said peaks. This peak firing rate is used for training the deep neural density estimators. K Empirical (dotted lines) and posterior predictive fits (solid lines) of average firing rates in presence/absence. L Pooled distribution of inferred SE across all neurons (WS = 0.2, pMWU < 0.0001, 1000 samples per neuron with nneurons = 92). M Distribution of inferred SE from the average firing rates of the recorded regions (WS = 0.15, pMWU < 0.0001, nsamples = 1000). Whiskers in all boxplots extend to 1.5 × IQR beyond the quartiles.

Presence/Absence oriented neural ensembles

We discovered neural subpopulations in both the dorsolateral prefrontal cortex (dlPFC) and the anterior cingulate cortex (ACC) that were preferentially active under one of the two experimental conditions10. Neurons that fired more during the presence condition were thus categorized as “social neurons”, and those that fired more under social isolation, as “asocial neurons” (Supplementary Note 3). The firing rate of these neurons strongly correlated with both learning speed and accuracy (trial-to-criterion) in the neurons’ preferred conditions (social neurons under presence; asocial neurons under isolation), while exhibiting negative correlations in the non-preferred conditions (Learning speed (LeS): p < 0.001 for both conditions, Accuracy (Acc): p < 0.01 in preferred, and p < 0.001 in non-preferred conditions; Fig. 1C, D). We were primarily interested in the per-neuron difference in firing rate between the conditions. As such, a time window of 500 ms before and after the feedback time was chosen, and neural firing rates were normalized by dividing by the maximum firing rate of each neuron. We found that the peak firing rates of neurons in both of these subpopulations were significantly elevated in the presence condition compared to the absence condition (p < 0.001, rrb = 0.29; Fig. 1E, F). To validate these findings, we fitted the firing rate of all neurons in each condition to a linear dynamical system (LDS) to infer the latent dynamics (see “Methods” section) in the presence and absence conditions. By sampling from these learned dynamics, we observed that changes in firing rate could be seen as variations in the decay rate (relaxation of population activity31,32), which reflects the rate of convergence to the stable fixed point of an LDS (Fig. 1G).
Functionally speaking, this framework enabled us to estimate the rate at which the linear system (i.e., neural activity) returns to its equilibrium after a perturbation in its measured output. This revealed a slower decay rate in the presence condition compared to absence, which could hint at a potential neuromodulatory role for others’ presence. However, none of the above findings can directly elucidate how SE is altered in the presence of others.
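The decay-rate readout can be sketched in one dimension: fit x(t+1) ≈ a·x(t) by least squares and convert a to a continuous-time decay rate (the synthetic trace and its true rate below are hypothetical, and the actual fits were multivariate LDS):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "population firing rate" relaxing to baseline at a known rate
dt = 0.01                                  # 10 ms bins
true_decay = 4.0                           # 1/s; slower decay = longer relaxation
t = np.arange(0.0, 2.0, dt)
x = np.exp(-true_decay * t) + 0.01 * rng.standard_normal(t.size)

# One-dimensional LDS fit: least-squares slope of x[t+1] on x[t]
a = np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1])
decay_hat = -np.log(a) / dt                # discrete-time a -> continuous decay rate

print(decay_hat)                           # close to true_decay = 4.0
```

In the multivariate case, the same quantity is read off the eigenvalues of the fitted transition matrix.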

Enhanced synaptic efficacy during conspecific presence at microscale

Understanding the neurobiological mechanisms that drive changes in neural firing rates is crucial for unraveling the complexities of neural computation and communication. Task-oriented single neurons exhibit specific firing rate patterns that are thought to influence coarse-grained network dynamics. However, delineating the precise relationship between individual neuron activity and network dynamics remains non-trivial. By using a balanced spiking neural network model33, we aim to bridge this gap. This model allows us to simulate and analyze how changes in SE within the network can lead to changes in firing rates. The balanced nature of the model ensures that excitatory and inhibitory inputs are finely tuned, mirroring the conditions found in biological neural networks. The model (see Eq. (1)) comprises one excitatory (n = 10,000) and one inhibitory (n = 2500) subpopulation of neurons (Fig. 1H). The strength of connections between the two subpopulations is scaled by −g, where the parameter g denotes the ratio of inhibitory to excitatory weights across the entire population. In other words, higher values of g translate to lower overall (regional) network activity (Fig. 1J). To estimate the posterior distribution of the parameter g, we trained deep neural density estimators (called neural spline flows28, see “Methods” section) on the maximum firing rate of the excitatory subpopulation from n = 10,000 random simulations (Fig. 1I, J).
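A toy version of such a balanced network can convey the role of g; the scale (500 neurons rather than 12,500) and all constants below are illustrative, not those of Eq. (1):

```python
import numpy as np

def simulate(g, seed=0, t_max=1.0):
    """Mean excitatory firing rate of a toy balanced LIF network in which
    inhibitory weights are scaled by -g relative to excitatory ones."""
    rng = np.random.default_rng(seed)
    n_exc, n_inh = 400, 100
    n = n_exc + n_inh
    w, p_conn = 0.05, 0.1                  # excitatory weight (a.u.), connection prob.
    dt, tau_m = 1e-3, 20e-3                # 1 ms steps, 20 ms membrane constant
    v_th, v_reset, i_ext = 1.0, 0.0, 1.2   # suprathreshold constant drive

    # Random sparse weights: +w from excitatory columns, -g*w from inhibitory ones
    weights = np.where(rng.random((n, n)) < p_conn, w, 0.0)
    weights[:, n_exc:] *= -g

    v = rng.random(n) * v_th
    spikes = np.zeros(n, dtype=bool)
    exc_rates = []
    for _ in range(int(t_max / dt)):
        v += dt / tau_m * (i_ext - v) + weights @ spikes
        spikes = v >= v_th
        v[spikes] = v_reset
        exc_rates.append(spikes[:n_exc].mean() / dt)   # instantaneous rate (Hz)
    return float(np.mean(exc_rates))

# Higher g (stronger relative inhibition) lowers overall network activity
print(simulate(g=2.0), simulate(g=20.0))
```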

After training, empirical data were used to estimate connectivity values that best represent the averaged firing rate in each experimental condition. By randomly sampling from the estimated posterior distribution, we achieved a close fit around the peaks while maintaining a consistent baseline (Fig. 1K). We found significant decreases in the parameter g not only in condition averages, but also when comparing conditions across all neurons (p < 0.0001, rrb = 0.28 for pooled firing rates, and p < 0.0001, rrb = 0.90 for average firing rates; Fig. 1L, M). In sum, these findings demonstrate that a conspecific’s presence leads to an increase in effective connectivity between single neurons in attention-oriented regions.
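The inference logic can be sketched with a rejection-ABC stand-in for the trained flow (the real pipeline trains neural spline flows on simulations of the spiking network; the simulator, observed peak rates, and tolerance below are all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in simulator: a monotone mapping from g to peak excitatory rate, plus noise
def summary(g):
    return 50.0 / (1.0 + 0.5 * np.asarray(g)) + rng.normal(0.0, 1.0, np.shape(g))

theta = rng.uniform(1.0, 8.0, 100_000)     # draws from the prior over g
x = summary(theta)                          # simulated peak firing rates

def posterior_samples(x_obs, eps=0.3):
    """Rejection ABC: keep prior draws whose simulated summary statistic
    lands within eps of the observation."""
    return theta[np.abs(x - x_obs) < eps]

# Hypothetical observed peak rates: higher in presence than in absence
g_presence = posterior_samples(22.0)
g_absence = posterior_samples(17.0)
print(np.median(g_presence), np.median(g_absence))   # presence -> lower inferred g
```

Neural density estimators replace the rejection step with a learned posterior, which is far more sample-efficient and, once trained, amortized over new observations.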

Increased local recurrent excitation during conspecific presence at mesoscale

A global change to effective connectivity between single neurons does not inform us of the intricacies of how connectivity between different cortical subpopulations might be altered within a cortical column. To this end, we computed event-related potentials (ERPs) during presence and absence conditions. We observed not only that ERPs in the presence condition possess a significantly higher peak compared to the absence condition (p < 0.001, rrb = 0.33; Fig. 2A, B), but also that the ratio of ERP peaks between conditions correlated with the ratio of behavioral performance (accuracy per session) between conditions (ρpearson = 0.33; Fig. 2C). This finding is further validated by the clear separation of the embeddings of ERPs (see “Methods” section) between the experimental conditions (Fig. 2D).

Fig. 2: Inference of mesoscale synaptic efficacies (monkey A).

A Feedback-locked empirical evoked potentials across all sessions. n = 92 for empirical ERPs in all related sub-panels. B Maximum amplitude of the empirical ERPs in presence vs absence (p < 0.001). C Correlation of between-condition ratio of ERP peak with behavioral performance (accuracy). D Embedding manifold of empirical ERPs. E Sample of simulations used for training the posterior density estimator. F Peak amplitude of pyramidal neuron activity from posterior predictive checks across the conditions (p < 0.001). G Nonlinear latent space of the posterior predictive check done on empirical ERPs. H Embedding manifold of pyramidal neuron activity. I Pooled distribution of mesoscale synaptic efficacies across sessions (nsamples = 2000 per ERP). J Computed excitatory and inhibitory postsynaptic potentials in presence versus absence. Whiskers in all boxplots extend to 1.5 × IQR beyond the quartiles.

Evoked potentials typically measure synaptic activity at the population level, making it computationally implausible to infer the synaptic connection between every possible pair of cortical subpopulations. Our model of choice is a modified variant of the Jansen-Rit neural mass model (NMM)34,35, capable of generating evoked potentials (Fig. 2E), with three subpopulations comprising pyramidal neurons (PNs), interneurons (INs), and stellate cells (SCs); this model provides a balance between computational efficiency and biological plausibility. The subpopulations are connected by four effective connectivity parameters g1−4. This biological plausibility, however, comes at the cost of model degeneracy, where different sets of input parameters can lead to identical model output36. By fixing the model time-scales at biological values (i.e., placing an informative prior, in the Bayesian setting), we trained neural density estimators on peak evoked values obtained from n = 10,000 random simulations to approximate the joint posterior distribution of effective connectivities. Our inferred distribution of peak evoked activity in each condition shares similar characteristics with that of the empirical data, up to the second order of statistical moments (p < 0.001, rrb = 0.33; Fig. 2F).
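A single-column Jansen-Rit model with the standard parameter set34 can be simulated in a few lines (the modified three-population variant used in this study differs in detail); the output y1 − y2 is the net PSP of the pyramidal population:

```python
import numpy as np

def sigmoid(v, e0=2.5, v0=6.0, r=0.56):
    """Population firing-rate response to the mean membrane potential."""
    return 2.0 * e0 / (1.0 + np.exp(r * (v0 - v)))

def jansen_rit(T=2.0, dt=1e-4, A=3.25, B=22.0, a=100.0, b=50.0, C=135.0, seed=0):
    C1, C2, C3, C4 = C, 0.8 * C, 0.25 * C, 0.25 * C   # connectivity constants
    rng = np.random.default_rng(seed)
    y = np.zeros(6)                        # (y0, y1, y2) and their derivatives
    out = []
    for _ in range(int(T / dt)):
        p = 220.0 + rng.normal(0.0, 22.0)  # stochastic input drive
        y0, y1, y2, y3, y4, y5 = y
        dy = np.array([
            y3, y4, y5,
            A * a * sigmoid(y1 - y2) - 2 * a * y3 - a**2 * y0,
            A * a * (p + C2 * sigmoid(C1 * y0)) - 2 * a * y4 - a**2 * y1,
            B * b * C4 * sigmoid(C3 * y0) - 2 * b * y5 - b**2 * y2,
        ])
        y = y + dt * dy                    # Euler step
        out.append(y1 - y2)                # pyramidal-population PSP (the "EEG" signal)
    return np.array(out)

v = jansen_rit()
print(v[-5000:].std())                     # fluctuating (alpha-band) activity
```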

To further validate the estimation, we generated nonlinear latent dynamics from the learned connectivity values for each experimental session/condition (Fig. 2G). These dynamics collectively formed the calculated embedding manifold, illustrating higher evoked activity in the presence condition compared to absence (Fig. 2H). Pooling the connectivity distributions across all sessions and in both monkeys, we found a significant difference between the experimental conditions only with respect to the parameter g2, which provides local recurrent excitation to PNs (Monkey A: WS = 0.10; Fig. 2I, Monkey M: WS = 0.16, Supplementary Fig. 4E). This parameter could therefore serve as the causal mechanism behind the observed increase in ERP peak during the presence of conspecifics. Since the NMM used at this scale makes the inference process computationally tractable, we also ran the current gold-standard sampling algorithm (the No-U-Turn Sampler37) for an asymptotically exact estimation of connectivities from ERPs (Supplementary Note 7). The results indicate similar findings regarding increased g2, and thus enhanced excitatory synaptic efficacy (SEe), during the presence of conspecifics (see Supplementary Fig. 8). This algorithmic consistency validates our Bayesian learning using low-dimensional data features for training neural density estimators to conduct causal inference on SE. Finally, we computed the excitatory (e) and inhibitory (i) postsynaptic potentials (EPSPs/IPSPs; Fig. 2J) by inserting the median of the pooled effective connectivity into the following equation: \(PS{P}_{e/i}=({g}_{e/i}\frac{{h}_{e/i}}{{\tau }_{e/i}})t{e}^{\frac{-t}{{\tau }_{e/i}}}\), where ge = g1 + g2 + g3 and gi = g4.
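Given inferred connectivity medians, the PSP kernels follow directly from the stated equation; the gains and time constants below are Jansen-Rit-style values (h_e ≈ 3.25 mV, τ_e ≈ 10 ms; h_i ≈ 22 mV, τ_i ≈ 20 ms), and the g values are illustrative rather than the inferred medians:

```python
import numpy as np

def psp(t, g, h, tau):
    """PSP(t) = (g * h / tau) * t * exp(-t / tau); the kernel peaks
    at t = tau with value g * h / e."""
    return (g * h / tau) * t * np.exp(-t / tau)

t = np.linspace(0.0, 0.1, 1000)                 # 100 ms of kernel
g_e = 135.0 + 108.0 + 33.75                     # g1 + g2 + g3 (illustrative values)
g_i = 33.75                                     # g4

epsp = psp(t, g_e, h=3.25, tau=0.010)           # excitatory kernel
ipsp = psp(t, g_i, h=22.0, tau=0.020)           # inhibitory kernel

print(t[np.argmax(epsp)], t[np.argmax(ipsp)])   # peaks near tau_e and tau_i
```

Because the peak amplitude scales linearly with g, a shift in g2 alone raises the EPSP while leaving the IPSP untouched, which is the E/I-ratio argument made below.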

The identified shift in g2 increases the E/I ratio, which could, in turn, be responsible for the observed increase in peak evoked activity within the column (Fig. 2J). While E/I ratios themselves can be non-identifiable, in our case the similar values of IPSPs, in conjunction with elevated EPSPs during the presence condition, alleviate this issue. To ensure the robustness of our findings in the association-learning task, we compared these results to parameters inferred from a permuted set of empirical observations (null-hypothesis test). We did not observe any significant elevation in any of the SE parameters at either of the two scales (Supplementary Fig. 7A, B). We therefore posit that SFI effects are potentially driven by enhanced SEe. In the following, we investigate this proposal by performing inference from EEG recordings of a lateral-interception task at the whole-brain scale.

Motor responses in presence vs absence

While an increase in effective connectivity at the micro/mesoscale is observed within single regions, this finding does not immediately translate to modulations of whole-brain dynamics. It is reasonable to hypothesize, however, that the social presence of conspecifics would induce disparate dynamics in functional brain networks involved in attention-modulated tasks. Moreover, the effects of social presence should theoretically manifest across a wide array of cognitive tasks38,39. To this end, we designed a counter-balanced lateral-interception task in which participants (n = 27, nfemale = 14, nmale = 13) played a virtual game with varying ball velocities and trajectories (ref. 40, Supplementary Fig. 3A). The task involved intercepting a downward-moving ball on a large display using a handheld slider to control a virtual paddle on-screen (Fig. 3A). After each interception attempt, visual feedback was provided by briefly changing the paddle color to green or red for successful or failed trials, respectively. In the presence condition, the “observer” sat in the participant’s peripheral vision, sometimes monitoring the hands of the “actor” but not the screen, so as to minimize evaluation-induced anxiety41 (Fig. 3B). Participants were unaware of the observer’s true purpose and were informed that the observer was present to monitor equipment. In general, participants learned the task goal very quickly and were consistent in pursuing the ball across trials (Fig. 3C). Kinematic and electrophysiological data (paddle movement and EEG activity) were recorded during the task (see “Methods” section).

Fig. 3: Inference of macroscale synaptic efficacy.

A Lateral-interception task outline. B The two experimental conditions in the task. C Covariance of ball and paddle trajectories in time, along the axis of interception. D Sample paddle movement trajectories. E Average paddle speed across conditions and subject groups (whiskers = 1.5 × IQR). Number of empirical observables in all panels (i.e., subjects) is n = 14 and n = 13 for the female and male subjects groups, respectively. F Sample functional connectivity from one participant. G Source time-courses of an example participant’s EEG. H Median of the integration of the two subject groups across the seven functional networks. I Structural connectivity template used for whole-brain simulations. J Sample simulation from the Jansen-Rit model. K Integration of random simulations used as the data feature for training the neural density estimators. L Pooled estimated posterior distribution of SEe in attentional networks across subject groups (whiskers = 1 × IQR, nsamples = 2000 for each observable data point). M Median of inferred SEe in female versus male subjects. N Correlation of between-condition ratios of inferred SEe with behavioral performance (average paddle speed).

Previous studies on social facilitation have outlined the importance of using kinematic measures of performance in motor tasks, such as movement speed/duration and features thereof, as opposed to outcome-based metrics such as accuracy or hit-rate42. We therefore analyzed average and peak paddle velocity within a 700 ms window (half-width of the longest trials) before the feedback time (Fig. 3D). Our chosen measure of performance (see “Methods” section), namely average paddle speed, was significantly higher in the female subject group in the presence condition compared to absence (p < 0.05, rrb = 0.53; Fig. 3E, left panel), whereas there was no significant difference between the experimental conditions in the male subject group (p > 0.05, rrb = 0.01; Fig. 3E, right panel), indicating a social facilitation effect for one subject group only.

Increased integration during conspecific presence

Previous investigations have outlined the potential role of attention-oriented regions/networks in social facilitation10. If attentional modulation indeed serves as the driving force underlying SFI effects, one would expect a higher correlation of activity in attention-oriented functional networks, such as the dorsal/ventral attention networks and the frontoparietal network (FPN), compared to other functional networks. We thus focused solely on alpha-band activity, not only due to the prominence of the alpha peak in the vast majority of participants (see Supplementary Fig. 3C), but also because alpha activity is prominently associated with distractor suppression, top-down attentional control, and selective attention43,44. To investigate how attention could affect the organization of brain networks, we source-localized the participants’ EEG (Fig. 3G) and extracted the source activity of seven functional networks45. These networks comprise the dorsal attention network (DAN), ventral attention network (VAN), somatomotor network (SMN), visual network (VIS), FPN, default-mode network (DMN), and limbic network (see Supplementary Fig. 3E). The source activity was subsequently used to compute functional connectivity (FC) for each subject (Fig. 3F). The sum of FC within a network quantifies the overall level of integration (or interconnectedness) of brain regions within that network. Higher integration suggests stronger communication across the regions of a network, and could thereby serve as a suitable “observable” proxy for increased effective connectivity within functional brain networks46. We found that conspecific presence significantly increases the integration of attentional networks such as DAN, VAN, and FPN, compared to other networks, in the female subject group (p < 0.001; Fig. 3H, left). In the male subject group, we found decreased integration during conspecific presence; however, this difference was not significant (p > 0.05; Fig. 3H, right).
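The integration measure reduces to a sum of within-network FC entries; a sketch on synthetic source time-courses (the network assignment and coupling strength are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic source activity: 20 regions x 5000 samples, with the first
# 8 regions (a mock "attentional network") sharing a common drive
n_regions, n_samples = 20, 5000
common = rng.standard_normal(n_samples)
data = rng.standard_normal((n_regions, n_samples))
data[:8] += 0.8 * common

# Functional connectivity: pairwise Pearson correlations
fc = np.corrcoef(data)

def integration(fc, idx):
    """Sum of within-network FC, excluding the diagonal self-correlations."""
    sub = fc[np.ix_(idx, idx)]
    return sub.sum() - np.trace(sub)

attentional = integration(fc, np.arange(8))
other = integration(fc, np.arange(8, 20))
print(attentional > other)                 # True: stronger within-network coupling
```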

Increased excitatory synaptic efficacy in attentional networks during conspecific presence

Integration might convey useful information about changes to effective brain connectivity47,48. However, it is not a direct measure and therefore cannot be readily relied upon for direct interpretation of changes to this connectivity. As mentioned before, we identified local recurrent excitation (the parameter g2 in the Jansen–Rit model, given by Eq. (2); Fig. 2) as the most significant driver of the difference between presence and absence in dlPFC and ACC. A reasonable prediction, therefore, is that recurrent excitation also increases in attentional networks during conspecific presence. To test this hypothesis, we employed the framework of Bayesian learning on a whole-brain model comprising interconnected Jansen–Rit mass models (given by Eq. (3), with 400 regions of the Schaefer atlas; Fig. 3I). We constrained the values of the connectivity (g2) to those corresponding to the alpha band (Fig. 3J), and trained the neural density estimators on per-network integration values obtained from random simulations (n = 20,000), given the linear relationship between g2 and integration values (Fig. 3K). Empirical integration values were subsequently used to obtain joint posterior distributions of local recurrent excitation in the functional brain networks. In the female subject group, pooling the estimated posterior distributions across subjects showed a significant increase in DAN, VAN, and FPN in the presence condition compared to the absence condition (p < 0.001, WS = 0.52, WS = 0.25, WS = 0.42, respectively; Fig. 3L left panel, Supplementary Fig. 7C top left panel). The male subject group, however, exhibited a significant decrease of SE in DAN and VAN, and a negligible decrease in FPN, during the presence condition (p < 0.001, WS = 0.14, WS = 0.22, WS = 0.08, respectively; Fig. 3L, right panel). In both subject groups, the aggregate distribution of g2 across the networks showed a striking resemblance to the mapping of empirical integration values (Fig. 3, panel H vs M).
We conducted a null-hypothesis test similar to that described for the previous scales, in order to test the validity of inferred parameter values, and did not observe any notable shift in g2 across any of the functional networks (Supplementary Fig. 7C).

Since raw values of effective connectivity might not necessarily be linked to the aforementioned behavioral modulations, we calculated the per-subject between-condition ratios (percent change) of average effective connectivity and of behavioral performance (average paddle speed). The most significant finding was in the facilitated subject group (females), where we found high, significant correlations between effective connectivity and behavior in both DAN and VAN (ρPearson = 0.79, ρSpearman = 0.45 for DAN, and ρPearson = 0.73, ρSpearman = 0.51 for VAN; Fig. 3N). As for FPN, while we found a strong linear correlation with behavioral performance, the significance of this correlation quickly diminished when correlations were computed on ranked data using the Spearman test (ρPearson = 0.68, ρSpearman = 0.12; Fig. 3N). These results support the potential role of effective connectivity in presence-induced attentional modulations. Similar to FPN, we found high linear correlations in SMN and DMN, which diminished substantially when correlations were computed on ranked data (ρSpearman = 0.31 and 0.09 for SMN and DMN in the female subject group, respectively; Fig. 3N). We found no significant correlation with behavior within any of the attentional networks in the non-facilitated group (Fig. 3N, lower panels). Finally, we found a negative correlation between effective connectivity and behavioral performance in the VIS network in both subject groups, with the female subject group exhibiting a much stronger negative correlation than the male subject group (ρSpearman = −0.54 and −0.34 for the female and male subject groups, respectively; Fig. 3N). This finding further alludes to the potential contribution of attentional modulation of visual stimuli, as opposed to sheer perception of visual cues, to the observed social facilitation effects in our visuomotor task.
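The Pearson-Spearman divergence reported above is the signature of an influential observation; a constructed example (not the study data) makes the mechanism explicit:

```python
import numpy as np
from scipy import stats

# Twelve subjects whose ranks actually reverse, plus one extreme subject
x = np.array(list(range(1, 13)) + [100], dtype=float)
y = np.array(list(range(12, 0, -1)) + [100], dtype=float)

r_p, _ = stats.pearsonr(x, y)    # high: dominated by the extreme point
r_s, _ = stats.spearmanr(x, y)   # negative: reflects the rank structure
print(r_p, r_s)
```

Reporting both coefficients, as done here, guards against mistaking one influential subject for a population-level relationship.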
However, we should note that due to the limited number of subjects per group, the generalizability of these findings needs to be further explored in future investigations.

Discussion

Despite the wide array of research regarding the influence of social stimuli on both brain and behavioral dynamics49,50,51, the majority of this work largely revolves around how humans (and non-human animals) process social information in isolation, based on the concepts and techniques of cognitive psychology, or how they process social information in social contexts involving some form of explicit interaction or communication between the social agents. Yet a basic, and perhaps the most fundamental, component of social cognition is the influence that the presence of others—in the immediate environment—may have on cognition, regardless of any overt interaction.

Modulations of performance during the presence of a conspecific have been hypothesized to largely stem from attentional processes, via the activity of context-specific neural substrates10,38. However, our understanding of the neurobiological mechanisms underlying such a modulation is still severely lacking. One reason could be the difficulty of performing Bayesian inference, particularly across scales. These challenges notwithstanding, recent advances in probabilistic machine learning, such as deep neural density estimators used in simulation-based inference (SBI24,25,26,52,53,54) and adaptive sampling techniques in probabilistic programming languages, have helped bridge this gap36,52,55,56, as also demonstrated in this work. We have further advanced this progress by opting for SBI that operates across brain scales. This enabled us to universally evaluate our hypothesis informed by low-dimensional features such as firing rate or level of integration. Notably, SBI bypasses the need for Markov chain Monte Carlo (MCMC) sampling, whose gradient-based variants become computationally expensive when dealing with inference at the whole-brain scale, or inapplicable at the micro-scale due to the discrete nature of spike events. Moreover, this approach is amenable to an amortized strategy25,53, allowing the use of the same trained model, validated on synthetic data (see Supplementary Note 6), immediately on empirical data.
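The core idea of simulation-based inference (draw parameters from a prior, simulate, and keep the draws whose simulated summary features match the observed ones) can be illustrated with a deliberately crude rejection sampler. This toy numpy sketch is a conceptual stand-in only: the study uses trained neural density estimators rather than rejection ABC, and the Gaussian simulator, prior bounds, and tolerance below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulator(theta, n=200):
    """Toy simulator: n observations drawn from Normal(theta, 1)."""
    return rng.normal(theta, 1.0, n)

def summary(x):
    """Low-dimensional data feature (here, just the sample mean)."""
    return x.mean()

def rejection_abc(x_obs, prior_lo, prior_hi, n_sims=20000, eps=0.05):
    """Crude simulation-based inference: draw parameters from a uniform
    prior, simulate, and keep draws whose simulated summary lands
    within eps of the observed summary."""
    thetas = rng.uniform(prior_lo, prior_hi, n_sims)
    s_obs = summary(x_obs)
    accepted = [t for t in thetas if abs(summary(simulator(t)) - s_obs) < eps]
    return np.array(accepted)

# "Observed" data generated with a known ground-truth parameter (1.5)
x_obs = simulator(1.5)
posterior_samples = rejection_abc(x_obs, prior_lo=0.0, prior_hi=3.0)
```

Neural density estimators replace the accept/reject step with a learned mapping from summary features to a posterior, which is what makes the amortized strategy mentioned above possible.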

In this study, we leveraged an efficient form of Bayesian learning with deep neural density estimators to investigate the link between attentional modulation and effective connectivity—as a proxy for the activity of context-specific neural populations—during others’ presence in visual/visuomotor tasks across three spatiotemporal brain scales. In single neurons (microscale), we found that others’ presence strongly increases the overall SEe (i.e., decreases the ratio of inhibitory to excitatory weights) in the dlPFC and ACC, both of which are highly involved in attentional modulation57,58. Our findings for the ERPs (mesoscale, validated via adaptive MCMC sampling) further supported these results, as we observed a significant increase in local recurrent excitation in the presence condition. The observed facilitation of learning during conspecific presence supports the hypothesis of re-allocation of attentional resources due to the suppression of visual distractors. These findings are consistent with previous research on the interplay between connectivity and attentional modulation in ACC and dlPFC. For one, stronger cooperation between the two regions (i.e., the presence of significant effective connectivity) has been reported to be critical for attention shifting57. In addition, alterations to the connectivity of these regions have also been linked to clinical measures of inattention and impaired cognitive control58,59. Finally, the synaptic specializations in ACC suggest its potentially substantial role in reducing noise in dorsolateral areas during challenging cognitive tasks, and the disruption of said mechanisms in neuropathologies such as schizophrenia and major depressive disorder60,61.

Given our hypothesis, one would expect functional brain networks involved in attention to be more predictive of behavioral performance than others. The lateral-interception task in humans complemented our findings at the preceding scales by attempting to answer this question. We found that the female subject group exhibited significantly faster average paddle movements during the presence condition compared to the absence condition, while the male subject group showed no difference in performance between the two. Similar to our findings at the micro- and mesoscales, we observed an increase in effective connectivity during conspecific presence, but interestingly only in the facilitated subject group (females). Remarkably, we found that effective connectivity was strongly correlated with task performance, but only in attention-oriented functional networks such as DAN and VAN. One can interpret the mutual correlation of both attentional networks with behavior as a dynamic interplay between these two networks62. This co-activation may involve the transfer of response decisions from focused targets to high-level centers63, and is often associated with changes in synchronization and FC64,65. The involvement of the DAN in attentional orienting and the modulatory influence of the VAN during reorienting62 further support the idea of a dynamic interplay between these networks. The significant elevation of SE in the FPN during conspecific presence in the facilitated subject group, coupled with the lack of significant correlations between SE and task performance, might seem initially perplexing. However, one potential explanation might stem from the lower cognitive demands of the lateral-interception task compared to the association-learning task in monkeys.
Taken together, these findings suggest that the social facilitation effect observed in the female subject group is possibly mediated by changes in the way attentional networks coordinate and process information during presence. The observed increase in effective connectivity within these networks could reflect a more synchronized and efficient neural processing mode, ultimately leading to faster and more accurate motor performance. In sum, we suggest facilitative effects arising from others’ mere presence could largely stem from an increase in SEe (i.e., effective connectivity) in attentional brain networks, the influences of which could be observed across spatiotemporal brain scales.

Theoretical and clinical implications

The present results obtained in both monkeys and humans provide a more nuanced perspective on the effects caused by the presence of conspecifics, with significant implications at both the theoretical and clinical levels. At the theoretical level, the finding of elevated excitatory SE under social presence—the building block of social context in all animal species—suggests its contribution to the regulation of neuronal activity. This socially-driven regulation deserves further investigation: is said modulation ubiquitous in the primate brain, regardless of the social versus non-social nature of the stimuli or tasks at hand? The two prefrontal regions selected for the present study in monkeys are well known for their role in the attentional mechanisms of learning66,67,68,69, irrespective of whether the information to be processed is social or non-social in nature70,71. Furthermore, presence-induced enhancement of SE in dorsal and ventral attentional networks in human participants implies a sensitivity of neurons to social context in these networks. Given the highly reactive and ancestral nature of the presence of conspecifics, especially in primates (human and non-human)72, it is plausible that the increased SE prompted by this presence could extend to a wider array of regions than those examined here—potentially encompassing the entire brain. Future research should therefore focus on exploring this critical issue. Indeed, if presence-induced elevations of SE were to be observed more widely in the brain, we would be able to rethink the social brain, which is currently defined mostly in reference to the neural regions that appear to be fairly specialized in the processing of social information73. A recurring debate concerning the social brain revolves around whether identical, different, or partly overlapping neuronal representations support the processing of social and non-social information (see Ruff et al. for a review74).
However, said debate neglects the possibility that the processing of both types of information, and the control of behavior—be it social or nonsocial—may rely on relatively distinct neuronal populations within the region, depending on basic features of the social context of performance (for evidence that the social context impacts the processing of both social and non-social information, see Monteil et al.75). As indicated by our findings, even non-social tasks (i.e., classic visuomotor association-learning tasks) can recruit social neurons. One possibility is that social neurons such as those reported here could be ubiquitous in the brain, but cannot be detected by the neuroimaging techniques typically used in social neuroscience in humans74,76 and nonhuman primates38. Said techniques—particularly fMRI—have been instrumental in identifying brain networks engaged in various social cognitive functions, but are ultimately unable to provide a granular breakdown such as the one provided here via single-unit recordings. Hence, it is entirely possible that the proportions of social neurons vary from one brain region to another, and that the regions identified by neuroimaging techniques as supporting the social brain77 actually reflect higher proportions of social neurons. These neurons may yet be present in myriad regions outside the social brain and operate efficiently under social circumstances, such that a variety of perceptual, cognitive, emotional and motor tasks would be modulated by the social context.

At the clinical level, the assumption that the behavioral modulation observed during conspecific presence stems from the activity of context-aware (or social) neurons, alongside our findings of altered E/I ratio during presence, leads one to expect perturbations to the functioning of said cells in social neuropathologies such as autism spectrum disorder and schizophrenia. Indeed, alterations in E/I balance within neural microcircuitry have been linked to these social neuropathologies78. Neural communication and disruptions in E/I balance in autism, stemming from atypical brain connectivity, have been linked to changes in attentional modulation and social interactions79. FC between the cerebellum and cortical social brain regions is also reported to be altered in autism80. In schizophrenia, elevated activation of frontal control networks and association cortices has been proposed as a compensatory mechanism for impairments in connectivity within the social brain networks81. In addition, resting-state networks are reported to be differentially affected in schizophrenia, with reduced segregation between the default mode and executive control networks82, as well as reduced connectivity in the dorsal attention and executive control networks83. Direct comparisons between the two pathologies have also found a reduction of activity in regions within the social-cognitive network in autism compared to schizophrenia84.

It should be noted, however, that the specific nature of these changes in connectivity, the direction of said changes, and their relationship to social neuropathologies remain a topic of ongoing research85,86. Could changes in neural activity be simply due to neuromodulation, as opposed to elevation of SE? As a matter of fact, neuromodulation is commonly recognized as context-driven alterations in neural excitability and presynaptic efficacy87,88. Neuromodulators such as neuropeptides and biogenic amines reportedly modulate SE via GPCRs (G protein-coupled receptors), and can profoundly influence both synaptic function and neuronal excitability89,90. Put another way, experimental conditions structure information flow by inducing neuromodulators to selectively activate a subset of synapses, which in turn adjusts neuronal excitability and presynaptic efficacy89.

Scope and limitations

As previously mentioned, the literature surrounding the neurobiological underpinnings of SFI—and mere presence—effects in higher primates is rather sparse. We attempted to address this sparsity by employing our Bayesian learning framework across three spatiotemporal brain scales—an endeavor which, to the best of our knowledge, has been undertaken neither in the field of computational neuroscience nor in the study of SFI effects. However, there are limitations in both of the previously outlined tasks that need to be addressed rigorously in order to obtain an accurate perspective on SFI effects. The longitudinal nature of the association-learning task is highly informative, as it provides insights into the social facilitation effect over a much longer timescale and partially complements our lateral-interception task, but it remains limited by factors such as the number of subjects, sex, and the number of brain areas investigated.

We addressed some of these restrictions in the lateral-interception task, but there remains ample ground for improvement in future SFI investigations: first, an optimal multiscale SFI study should entail closely matched tasks across species. It is pivotal to note that—by our current understanding—SFI effects should persist across tasks (and species), given well-learned and/or simple tasks. For the purposes of this investigation, the tasks were carefully chosen to fulfill the essential criteria of an SFI study (i.e., the existence of easy trials and “mere” presence of the spectator). While the design of said tasks was partially constrained by the availability of experimental equipment, in terms of SFI paradigms this design disparity could be interpreted as a measure of external validity (i.e., SFI effects are reportedly observed across a range of task outlines1). Second, our current framework rests on the prevalence of context-oriented “social/asocial” neurons across the brain. But despite the outstanding discoveries of similarly tuned neurons in various regions of the brain10,91, we are still far from establishing the “ubiquity” of such subpopulations. Future investigations in non-human primates should therefore aim to cover a wider range of brain regions in order to test the spatial distribution of said neurons.

Third, we primarily focused on alpha synchronization for inference of synaptic efficacies at the macroscale, based on several lines of evidence: (i) the alpha peak—as opposed to other spectral peaks—was the most prominent during task trials (Supplementary Fig. 3C); (ii) our aim of investigating network-wide measures of coherence required rigorous evidence for the involvement of the chosen oscillatory regime in large-scale network integration92; and (iii) alpha oscillations are critically involved in the modulation of visuospatial attention. On this last point, alpha-band synchronization is reportedly modulated by the orienting of attention and is associated with decreased reaction times to attended stimuli43,44. Moreover, attention has been reported to drive the synchronization of alpha and beta rhythms between the right inferior frontal and primary sensory neocortex93. Finally, alpha-band phase has been shown to modulate bottom-up feature processing and to depend on the nature of the recruited attentional mechanisms94,95. However, it should be noted that our findings regarding alpha-band synchronization are still limited in their scope, due to the absence of individualized connectomes for source localization and of per-node local-field activity to compare against EEG recordings. Therefore, a promising direction for future SFI inquiries would be to assess the contribution of presence-induced SE variations to large-scale brain oscillations via concurrent, multimodal and personalized recordings.

The fourth and final limitation concerns the control of confounding variables and the sample size of the lateral-interception task. During the design and conduct of the study, we took extra care to control for factors such as sex, age and occupation, with our initial hypothesis being the absence of any differences in performance or FC measures across the subject groups. When it comes to the analysis of mere-presence effects across subject groups, SFI investigations—by definition—call for an expansive participant pool (or, alternatively, within-group longitudinal investigations, as with the association-learning task). Moreover, SFI is first and foremost defined through modulations of task performance across conditions. As such, even with similar behavioral trends across groups, disparate scales of inter-individual variability across samples, in conjunction with a relatively small sample size, can confound the inference of genuine group-level statistics. Consequently, it is paramount not to associate the between-group differences outlined here with an inherent and prevalent disparity of facilitation/inhibition capacity across other tasks. Rather than stemming from sampling effects, the lack of improvement in kinematic performance during conspecific presence in the male subject group might partially stem from the level of perceived rank within the immediate social environment (in this case, the task space96). Alternatively, said absence of performance modulation could also be explained by recent findings on sex differences in social cognition. Notably, women exhibit greater empathy, more positive evaluation of faces, and greater interest in social information compared to men97,98, with said differences being observed in multiple facets of social cognition, including the processing of social information, emotion recognition, and face processing97.
Nonetheless, a substantial sample size remains the optimal way of diminishing the uncertainty surrounding the causal factors behind this seeming difference.

Concluding remarks

It is important to note that while social facilitation effects can range from extremely pronounced to barely noticeable2, such effects have been repeatedly observed in a vast array of previous investigations; as such, this phenomenon currently serves as the best proxy for investigating “pure” mere-presence effects. Nonetheless, the development of alternative frameworks on the underpinnings of SFI effects would be invaluable for gaining a deeper understanding of the contribution of such stimuli to social cognition as a whole. Moreover, a genuine implementation of DCM is not a one-directional estimation of parameters, but a cyclic exploration of potential drivers through both model-based estimations and neurobiological inquiries. Rather than conclusively championing SE changes as the only possible driver of social facilitation, our aim here has been to kickstart the conversation and subsequent investigations into SFI, which we believe is a central axis of social cognition. Future research should therefore explore the generalizability of the findings outlined here across a wider range of tasks, subject groups, and social contexts.

Materials and methods

Between-condition ratio

We defined the between-condition difference in a behavioral or neural metric X, at all three scales, as the percentage of X in the presence condition P relative to the sum across the presence and absence (A) conditions: \({X}_{Ratio}=\frac{{X}_{P}}{{X}_{P}+{X}_{A}}\times 100\). A value of 50 thus indicates no between-condition difference.
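In code, the ratio reads as follows (a one-line sketch; the function name is ours):

```python
import numpy as np

def between_condition_ratio(x_presence, x_absence):
    """X_ratio = X_P / (X_P + X_A) * 100; 50 means no
    between-condition difference, values above 50 indicate a
    larger metric in the presence condition."""
    x_p = np.asarray(x_presence, dtype=float)
    x_a = np.asarray(x_absence, dtype=float)
    return x_p / (x_p + x_a) * 100.0
```

The function works element-wise, so per-subject arrays of condition-averaged metrics can be passed directly.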

Association-learning: subjects

The subjects were two adult male Rhesus monkeys (Macaca mulatta). They had been housed together since the age of 3 years and weighed 8–12 kg at the time of the study. They had established stable and spontaneous social interactions, with monkey A being the dominant one, as revealed by the “access to water and food” test. Animal care, housing, and experimental procedures conformed to the European Directive (2010/63/UE) on the use of nonhuman primates in scientific research. We have complied with all relevant ethical regulations for animal use. The two monkeys were maintained on a dry diet, and their liquid consumption and weight were carefully monitored. Both subjects exhibited significant social facilitation effects. However, monkey M’s field recordings were heavily contaminated with noise, leading to a decrease in the total number of session-averaged condition ERPs. In addition, fewer recorded neurons in this subject were found to belong to either of the two context-dependent ensembles. We thus opted not to pool the estimated synaptic efficacies obtained from the two monkeys (see Supplementary Note 5, Fig. 1, Fig. 2).

Association-learning: behavioral procedures

During each session, one monkey (the actor) was trained to associate abstract images with targets on a touchscreen, either in the presence of the other monkey (the spectator) or in its absence (Fig. 1A, B). Under social presence, the two monkeys were positioned in primate chairs facing each other, with their heads immobilized (Supplementary Note 2). Only the actor had access to the touchscreen, and thus performed the task and received rewards on correct trials. The spectator was not rewarded, had no incentive to produce any particular behavior, and was never tested (as actor) during the same day; when tested, a new set of stimuli was used, thereby preventing any observational learning from occurring. When the actor was tested in the absence condition, the other monkey remained in the housing room, located at a distance such that the actor was truly alone in the testing box, deprived of any visual, auditory, or olfactory means of communication with the conspecific. During the task, the actor started trials by touching a white rectangle, which triggered the presentation of a cue at the center of the screen. The monkey was required to indicate, among the four white squares (targets), the one associated with this cue. After a variable delay (500–700 ms), the cue went off (go signal) and the monkey had to move the hand and touch the chosen target. If the choice was correct, a green circle (positive feedback) was shown, and a reward (fruit juice) was delivered after 1 s. If the choice was incorrect, a red circle signaled the error (negative feedback), and no reward was delivered.

Association-learning: neuronal firing rates

Social and asocial neurons were found in similar proportions in both prefrontal areas (n = 376 in dlPFC and n = 216 in ACC, before filtering and selection), and histological reconstruction revealed no spatial segregation within individual areas. Because of similar profiles and proportions of outcome-related activations in dlPFC and ACC, the data were pooled together into a single neuronal sample. The correlation of firing rates with behavior was computed on the entire neural dataset, from which we chose a subset of neurons (n = 92 for monkey A, and n = 29 for monkey M) oriented to negative feedback—as performance in this condition correlates highly with neural firing rate—which belonged to Monkey A and were strictly categorized as social or asocial.

Association-learning: event-related potentials

Local-field potentials (LFPs) across four channels were epoched by time-locking the signals to a 350 ms time window before the feedback timestamps. A 2 Hz high-pass filter was applied to the LFPs before epoching, and electrodes with flat signals were removed from the analysis. The electrodes were further filtered by including only those with excellent or good signal quality in the calculation of ERPs. Trials with negative feedback were averaged within the presence or absence condition to obtain the raw ERPs (n = 117 for monkey A, and n = 21 for monkey M) in each task condition. A low-pass Butterworth filter (20 Hz) was then applied to the evoked potentials, which were then up-sampled to 1000 time-steps. Subsequently, evoked potentials in each condition were normalized with respect to the maximum amplitude of their session in order to retain across-session comparability. These normalized evoked potentials were then used as input to the machine learning method CEBRA99 to obtain low-dimensional embeddings of the data. Using the PyMC55,100 package, the embeddings themselves were fed to a Bayesian neural network using variational inference, with a perceptron network with two hidden layers and five nodes per layer. This process aimed to obtain a smooth probability grid of the likelihood of an ERP embedding belonging to one of the experimental conditions.
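The normalization and up-sampling steps can be sketched as follows; this is a minimal numpy version that assumes linear interpolation for the resampling (the actual resampling method is not specified here), and the function name is ours.

```python
import numpy as np

def normalize_and_upsample(erp, session_max, n_out=1000):
    """Scale an ERP by its session's maximum amplitude (to retain
    across-session comparability) and resample it to n_out points.
    Linear interpolation is an assumption of this sketch."""
    erp = np.asarray(erp, dtype=float) / session_max
    t_old = np.linspace(0.0, 1.0, erp.size)
    t_new = np.linspace(0.0, 1.0, n_out)
    return np.interp(t_new, t_old, erp)
```

Applying this per condition and per session yields equal-length, amplitude-comparable traces suitable as input to an embedding method such as CEBRA.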

Lateral-interception: participants

A total of 28 healthy right-handed participants (14 women and 14 men, with an average age of 23.6 ± 2.1 years) were recruited for the study. All participants provided written consent before participating in the study. One male participant was removed from the study due to a recording error (nmale = 13). The study procedures were approved by the Central Ethics Board of the University Medical Centre of Groningen and conducted in accordance with the Declaration of Helsinki (World Medical Association). All ethical regulations relevant to human research participants were followed.

Lateral-interception: experimental setup

In the presence condition, a “spectator” sat across from the participant in their peripheral vision at a distance of 1.5 m, meaning that the spectator sat in the participant’s social space, as defined by Hall101. The spectator’s sex and age were controlled to be close to those of the participants, and participants were unaware of the true purpose of the spectator’s presence: they were informed through a cover story that the spectator was present to monitor the equipment, with the spectator appearing to focus on a laptop. To avoid inducing a sense of evaluation, the spectator never looked at the screen; instead, they spent approximately 70% of the time observing the participants’ hands, thereby creating a sense of being observed without performance evaluation.

Lateral-interception: data preprocessing

Behavioral data were epoched to a 700 ms time window preceding the feedback events. This duration was long enough to encompass the majority of paddle movements, but also allowed for the preclusion of the starting flat tails of paddle movements. A low-pass 5 Hz filter (order = 2) was subsequently applied to the movement data. Paddle data for each subject were then normalized with respect to ball arrival positions and subsequently converted to absolute values to ensure uniform computation of paddle speeds. In addition, artefactual trials (with misaligned starting positions) were excluded from the analysis. Previous inquiries into social facilitation have outlined the importance of using kinematic measures of performance in motor tasks, such as movement speed/duration—and features thereof—as opposed to outcome-based metrics such as accuracy or hit rate42. In light of these findings, we selected average paddle speed as our measure of behavioral performance.
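Average paddle speed from an epoched position trace can be sketched as below, assuming a uniformly sampled, already filtered trace; the function and argument names are ours.

```python
import numpy as np

def average_paddle_speed(position, dt):
    """Mean absolute speed over the pre-feedback epoch.

    position: 1-D paddle position trace (normalized units),
    dt: sampling interval in seconds. Central differences are
    used for interior samples, one-sided at the edges."""
    speed = np.abs(np.gradient(np.asarray(position, dtype=float), dt))
    return float(speed.mean())
```

Taking the absolute value before averaging makes the measure invariant to movement direction, consistent with the normalization to ball arrival positions described above.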

Preprocessing of whole-brain recordings was conducted automatically and agnostically, minimizing biases in the preprocessing pipeline due to the experimenter’s subjective choices, and was entirely performed using the MNE-python package102,103,104 (Supplementary Note 4). First, we identified and interpolated bad EEG channels. This was done (per subject) by creating trials from −500 ms to 1000 ms after ball movement (with a baseline of −500 ms to −200 ms), which were given to the Ransac module of the AutoReject package105. EEG channels were then re-referenced using average referencing. Second, we removed blink and saccade components from the EEG via independent component analysis (ICA; Supplementary Fig. 3D). EEG data were first filtered with an FIR band-pass filter from 1 to 45 Hz; epochs of equal length, alongside a rejection threshold, were then computed and passed to the ICA algorithm (method “fastica”) for computation of the independent components. General templates for blink and saccade components were initially computed and saved from a single subject (p01). Subsequently, these components were template-matched and projected out of the data (threshold = 0.9) for all participants. Due to the portable nature of the EEG setup, which was specifically designed for movement in the natural environment, we did not have access to subject-specific MRI scans. Therefore, source localization was performed via the freesurfer106 template “fsaverage” (for comparison between experimental conditions, the data need to be morphed to a representative “template subject” regardless, even in the presence of per-subject structural imaging). The source space model (“fsaverage-ico-5-src”) and boundary-element model (“fsaverage-5120-5120-5120-bem-sol”) were used to compute the general forward solution (mindist = 5 mm).
Noise covariance was computed from broadband epochs spanning −500 ms before to 1000 ms after ball movement, using the −500 ms to −200 ms baseline (tmax = −0.2 s). The forward solution and the noise covariance were then used to create the inverse operator (with regularization parameter loose = 0.2).

Source time-courses were extracted as follows. The data were first band-passed (multi-taper method) to the alpha band (8–12 Hz), and subsequently epoched in a 500 ms time window starting from ball movement (with a baseline from −200 to 0 ms, cropped before saving the epochs; Supplementary Fig. 3B). We chose this window because attentional modulation should theoretically occur before modulations of behavior, and closer to the visual cue. Subsequently, the precomputed inverse operator was used with the MNE algorithm to obtain source activity in vertex format, which was in turn used alongside the Schaefer 2018 atlas (400 parcels) to extract the source time-courses for the activity of the functional brain networks45,107.

Lateral-interception: functional connectivity

FC was calculated as the Pearson correlation between the sources/labels (or nodes, in the case of simulations). For the correlation matrix FC, the element FCij, representing the statistical dependency between two labels i and j, was calculated as \(F{C}_{ij}=\frac{{C}_{ij}}{\sqrt{{C}_{ii}* {C}_{jj}}}\), where Cij is defined as the covariance between said labels and Cii as the variance of label i12. Elements of the FC matrix were subsequently sorted and isolated into separate masks according to the aforementioned functional networks (with the left and right hemispheres concatenated, as we are interested in the activity of the entire network). The sum of the upper triangle of each mask was used to obtain: Integration = ∑FCnetwork.
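A minimal numpy version of these two computations might look as follows; excluding the diagonal from the upper-triangular sum is our assumption, and the function names are ours.

```python
import numpy as np

def functional_connectivity(ts):
    """FC_ij = C_ij / sqrt(C_ii * C_jj), i.e., the Pearson
    correlation between label time-courses.
    ts has shape (n_labels, n_times)."""
    return np.corrcoef(ts)

def integration(fc, network_idx):
    """Sum the upper triangle (diagonal excluded, an assumption)
    of the FC sub-matrix restricted to one functional network;
    network_idx lists the label indices of that network, with
    both hemispheres concatenated."""
    sub = fc[np.ix_(network_idx, network_idx)]
    iu = np.triu_indices(len(network_idx), k=1)
    return float(sub[iu].sum())
```

Repeating `integration` over each network mask yields one scalar coherence measure per network and condition.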

Our choice of this coherence measure is supported by previous research: not only is the strength of distractor suppression reportedly influenced by the intrinsic connectivity within and between attentional networks, but within-network connectivity was also found to be a better predictor of this suppression108.

Generative model of spiking neurons

The model employed at microscale is a network of spiking neurons with alpha synapses, comprising two subpopulations: excitatory (n = 10,000) and inhibitory (n = 2500) leaky integrate-and-fire neurons33:

$$\frac{d{V}_{{{{\rm{m}}}}}}{dt} = \, -\frac{{V}_{{{{\rm{m}}}}}-{E}_{{{{\rm{L}}}}}}{{\tau }_{{{{\rm{m}}}}}}+\frac{{I}_{{{{\rm{syn}}}}}+{I}_{{{{\rm{ext}}}}}+{I}_{{{{\rm{stim}}}}}}{{C}_{{{{\rm{m}}}}}}\\ {I}_{syn}(t)= \, {I}_{syn,e}(t)+{I}_{syn,i}(t)\\ {I}_{syn,x}(t) = \, {\sum}_{n}{J}_{n} {\sum}_{k}{i}_{syn,x}(t-{t}_{n}^{k}-{d}_{n})\\ {J}_{i}= \, -g{J}_{e}\\ {i}_{syn,x}(t) = \, \frac{e}{{\tau }_{syn,x}}t{e}^{-\frac{t}{{\tau }_{syn,x}}}\Theta (t).$$
(1)

If Vm(tk) < Vth and Vm(tk + 1) ≥ Vth, then a spike is emitted at timestep t* = tk + 1 and the membrane potential is reset to Vm(t) = Vreset for t* ≤ t < t* + tref.

With regards to the synaptic currents Isyn,x, the subscript x denotes either excitatory (e) or inhibitory (i) synapses, and both synaptic subtypes share an identical time constant τsyn. Here, n indexes neurons within either the inhibitory or excitatory subpopulation; for a neuron with index n, k indexes its spike times, and dn denotes the delay from that neuron. The remaining parameters and their representations are as follows:

Moreover, the external current Iext is modeled as a Poisson generator with a rate of prate = 13341.8 Hz, calculated from the in-degree of the excitatory synapses, the external rate relative to the threshold rate, and a number of other parameters109 (see Brunel33 for an in-depth discussion of the external current). To simulate the effects of feedback anticipation in the population, a step current Istim was applied to the population of neurons from 350 to 900 ms in each simulation, with a mean of 150 pA and a standard deviation of 1 pA. During each simulation, neurons received a different realization of the current: Istim(t) = μ + σNw, where Nw is a sample drawn from the zero-mean, unit-variance normal distribution at each time-step during the current's activation interval w.
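A drastically simplified, uncoupled version of this setup can be sketched as follows (a toy illustration, not the paper's NEST model: recurrent coupling and alpha synapses are dropped, external Poisson events are treated as instantaneous current pulses, and the weight J_ext and the neuron constants are hypothetical; only the Poisson rate and the step-current statistics follow the text):

```python
import numpy as np

rng = np.random.default_rng(1)

dt, T = 0.1, 1000.0                   # time-step and duration (ms), as in the text
n = 100                               # toy population size (the paper uses 12,500)
tau_m, C_m = 20.0, 250.0              # membrane time constant (ms), capacitance (pF); hypothetical
E_L, V_th, V_reset = 0.0, 20.0, 10.0  # leak, threshold, reset potentials (mV); hypothetical
t_ref = 2.0                           # refractory period (ms); hypothetical
p_rate = 13341.8                      # external Poisson rate (Hz), from the text
J_ext = 90.0                          # current per external event (pA); hypothetical
mu, sigma = 150.0, 1.0                # step current: mean 150 pA, SD 1 pA (from the text)

V = np.full(n, E_L)
ref_until = np.zeros(n)               # end of each neuron's refractory window
spike_times = []

for k in range(int(T / dt)):
    t = k * dt
    # Fresh realization of the stimulation current at every step, active 350-900 ms
    I_stim = (mu + sigma * rng.standard_normal(n)) if 350.0 <= t < 900.0 else np.zeros(n)
    # External drive: Poisson event counts converted to instantaneous current pulses
    I_ext = J_ext * rng.poisson(p_rate * dt * 1e-3, size=n)
    active = t >= ref_until           # neurons outside their refractory window
    V[active] += dt * (-(V[active] - E_L) / tau_m + (I_ext[active] + I_stim[active]) / C_m)
    fired = active & (V >= V_th)
    V[fired] = V_reset                # reset and hold for t_ref after a spike
    ref_until[fired] = t + t_ref
    spike_times.extend([t] * int(fired.sum()))
```

With these placeholder values the external drive alone keeps the population subthreshold, so spiking emerges only during the stimulation window, mimicking the anticipation-driven rate increase.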

We employed the NEST simulator109 and simulated the model (n = 10,000) for 1000 ms (dt = 0.1 ms) with varying values of the parameter g, the inhibitory-to-excitatory weight ratio. For each simulation, the value of g was randomly sampled from a uniform prior distribution over g ∈ [5, 8]. All other parameters, including the external input, leak time constant, and synaptic current time constants, were fixed (Table 1). Simulation firing rates were computed by first calculating the histogram of the spike times (nbins = 100), which was then smoothed with a Butterworth low-pass filter of order 5 and critical frequency 2 Hz (20 Hz in the original sampling rate). Firing rate time series were then normalized between 0 and 1, and the maximum value of each time series was computed. After training the deep neural density estimators (see “Methods” section), empirical firing rates were used as low-dimensional data features to efficiently obtain the posterior distributions of effective connectivity given each condition and neuron. The training took approximately 2 min, while the sampling took less than a minute.
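The feature-extraction steps (histogram, Butterworth smoothing, normalization, peak) can be sketched with scipy. The spike train below is synthetic, and the filter's critical frequency is interpreted here as 20 Hz at the histogram's 100 Hz sampling rate, which is one plausible reading of the text:

```python
import numpy as np
from scipy.signal import butter, filtfilt

rng = np.random.default_rng(2)
# Synthetic spike times (ms) over a 1000-ms run, denser during a 350-900 ms "stimulation"
spike_times = np.sort(np.concatenate([rng.uniform(0, 1000, 400),
                                      rng.uniform(350, 900, 600)]))

# Histogram with 100 bins over 1000 ms -> 10-ms bins, i.e., a 100 Hz sampling rate
counts, _ = np.histogram(spike_times, bins=100, range=(0, 1000))

# Order-5 Butterworth low-pass with a 20 Hz critical frequency at fs = 100 Hz;
# filtfilt applies it forward and backward for zero-phase smoothing
b, a = butter(5, 20, fs=100, btype="low")
rate = filtfilt(b, a, counts.astype(float))

peak = rate.max()                                            # per-series maximum feature
rate_norm = (rate - rate.min()) / (rate.max() - rate.min())  # normalized to [0, 1]
```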

Table 1 Parameters for LIF neurons

Generative model of ERPs

The generative model of ERPs used for the inference at the single-column (mesoscopic) level is based on a modified iteration of the Jansen–Rit NMM110 developed for DCM11. The model comprises ten parameters: g1,2,3,4 (connection strengths), τe/i (membrane rate constants), he/i (maximum amplitudes of the post-synaptic potentials), δ (intrinsic delay), and u (external input). The NMM comprises nine ordinary differential equations of hidden neuronal states x(t), using a first-order approximation to the delay-differential equations, i.e., \(x(t-\delta )=x(t)-\delta \dot{x}(t)\). The dynamics of the three neural populations, PNs (x9), spiny SCs (x1), and inhibitory interneurons (x7), are given by:

$$\frac{d{x}_{1}(t)}{dt}= \, {x}_{4}(t)\\ \frac{d{x}_{2}(t)}{dt}= \, {x}_{5}(t)\\ \frac{d{x}_{3}(t)}{dt}= \, {x}_{6}(t)\\ \frac{d{x}_{7}(t)}{dt}= \, {x}_{8}(t)\\ \frac{d{x}_{4}(t)}{dt}= \, \frac{{h}_{e}({g}_{1}(\frac{1}{{e}^{-0.56{x}_{9}(t-\delta )}+1}-0.5)+u)}{{\tau }_{e}}-\frac{{x}_{1}(t)}{{\tau }_{e}^{2}}-\frac{2{x}_{4}(t)}{{\tau }_{e}}\\ \frac{d{x}_{5}(t)}{dt}= \, \frac{{h}_{e}{g}_{2}(\frac{1}{{e}^{-0.56{x}_{1}(t-\delta )}+1}-0.5)}{{\tau }_{e}}-\frac{{x}_{2}(t)}{{\tau }_{e}^{2}}-\frac{2{x}_{5}(t)}{{\tau }_{e}}\\ \frac{d{x}_{6}(t)}{dt} = \, \frac{{h}_{i}{g}_{4}(\frac{1}{{e}^{-0.56{x}_{7}(t-\delta )}+1}-0.5)}{{\tau }_{i}}-\frac{{x}_{3}(t)}{{\tau }_{i}^{2}}-\frac{2{x}_{6}(t)}{{\tau }_{i}}\\ \frac{d{x}_{8}(t)}{dt}= \, \frac{{h}_{e}{g}_{3}(\frac{1}{{e}^{-0.56{x}_{9}(t-\delta )}+1}-0.5)}{{\tau }_{e}}-\frac{{x}_{7}(t)}{{\tau }_{e}^{2}}-\frac{2{x}_{8}(t)}{{\tau }_{e}}\\ \frac{d{x}_{9}(t)}{dt}= \, {x}_{5}(t)-{x}_{6}(t)$$
(2)

The fixed parameters for the simulations are given in Table 2. For the inference at the mesoscopic scale, samples were drawn from uniform distributions with the minima and maxima given in Table 3.
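A bare-bones Euler integration of Eq. (2) might look as follows (an illustrative sketch: the intrinsic delay δ is set to zero rather than using the first-order approximation, and the parameter values are classic Jansen–Rit-style placeholders, not the values in Tables 2 and 3):

```python
import numpy as np
from scipy.special import expit

def S(v):
    # Sigmoid from Eq. (2): 1 / (1 + exp(-0.56 v)) - 0.5
    return expit(0.56 * v) - 0.5

# Placeholder parameters (hypothetical; see Tables 2-3 for the values actually used)
tau_e, tau_i = 10.0, 16.0                    # excitatory/inhibitory rate constants (ms)
h_e, h_i = 3.25, 22.0                        # maximum post-synaptic amplitudes
g1, g2, g3, g4 = 135.0, 108.0, 33.75, 33.75  # connection strengths
u = 2.0                                      # constant external input

dt, T = 0.1, 300.0                           # ms
x = np.zeros(9)                              # states x1..x9 at indices 0..8
erp = np.empty(int(T / dt))

for k in range(erp.size):
    x1, x2, x3, x4, x5, x6, x7, x8, x9 = x
    dx = np.array([
        x4,                                                               # dx1/dt
        x5,                                                               # dx2/dt
        x6,                                                               # dx3/dt
        h_e * (g1 * S(x9) + u) / tau_e - x1 / tau_e**2 - 2 * x4 / tau_e,  # dx4/dt
        h_e * g2 * S(x1) / tau_e - x2 / tau_e**2 - 2 * x5 / tau_e,        # dx5/dt
        h_i * g4 * S(x7) / tau_i - x3 / tau_i**2 - 2 * x6 / tau_i,        # dx6/dt
        x8,                                                               # dx7/dt
        h_e * g3 * S(x9) / tau_e - x7 / tau_e**2 - 2 * x8 / tau_e,        # dx8/dt
        x5 - x6,                                                          # dx9/dt
    ])
    x = x + dt * dx
    erp[k] = x[8]                            # x9: pyramidal-population state, read out as the ERP
```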

Table 2 Parameters for ERP model
Table 3 Prior range for ERP model parameters

Generative model of whole-brain EEG

Taking a network-based approach with connectome-based modeling111,112,113, we generated whole-brain EEG simulations by placing a Jansen–Rit NMM at each parcellated brain region, connected through a structural connectivity (SC) matrix:

$$\frac{d{y}_{0,i}(t)}{dt}= \, {y}_{3,i}(t)\\ \frac{d{y}_{1,i}(t)}{dt}= \, {y}_{4,i}(t)\\ \frac{d{y}_{2,i}(t)}{dt}= \, {y}_{5,i}(t)\\ \frac{d{y}_{3,i}(t)}{dt}= \, Aa\,\,{{\mbox{Sig}}}\,({y}_{1,i}(t)-{y}_{2,i}(t))-2a{y}_{3,i}(t)-{a}^{2}{y}_{0,i}(t)\\ \frac{d{y}_{4,i}(t)}{dt}= \, Aa\left(P(t)+{g}_{2}\,\,{{\mbox{Sig}}}(\,{g}_{1}{y}_{0,i}(t))+G\times {{\mbox{Coupling}}}_{i}\right)-2a{y}_{4,i}(t)\\ -{a}^{2}{y}_{1,i}(t)\\ \frac{d{y}_{5,i}(t)}{dt}= \, Bb\left({g}_{4}\,\,{{\mbox{Sig}}}\,({g}_{3}{y}_{0,i}(t))\right)-2b{y}_{5,i}(t)-{b}^{2}{y}_{2,i}(t)\\ \,{\mbox{Sig}}\,(v)= \, \frac{{v}_{max}}{1+\exp (r({v}_{0}-v))}\\ {{\mbox{Coupling}}}_{i}= \, {\sum}_{j}{{{{\rm{SC}}}}}_{ij}\,{{\mbox{Sig}}}\,({y}_{1,j}-{y}_{2,j})$$
(3)

The SC matrix was obtained using tractography techniques114 with the Schaefer atlas (400 regions), averaged across subjects, and subsequently normalized as \(S{C}_{norm}=\log (SC+1)\). The external current P(t) was modeled as Gaussian random noise with a mean of 0.295 and zero standard deviation (i.e., effectively a constant drive). All simulations used an Euler integration method with dt = 0.05 s. The fixed model parameters are given in Table 4, whereas we inferred g2 (i.e., the average number of synapses from SCs to PNs). The prior distribution for this parameter was a truncated uniform distribution, g2 ∈ [101.25, 110.7]; this range was selected to induce alpha-band oscillations in all simulations. Specifically, we simulated the model (n = 20,000 simulations) for 3000 ms (the first 2000 ms of which were cropped as transients). Simulations were then used to compute FC and integration (see “Methods” section). Parameters sampled from the prior, alongside integration values as data features, were then used to train the posterior density estimators (see “Methods” section). Empirical integration values were then used to sample from the learned posterior distributions of effective connectivity for each brain subnetwork.
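The SC normalization and the network coupling term from Eq. (3) amount to a log transform and a weighted sum of sigmoid-transformed states (a toy sketch with a random 8-region matrix; the sigmoid parameters vmax, v0, and r here are illustrative stand-ins for the values in Table 4):

```python
import numpy as np

rng = np.random.default_rng(3)
n_regions = 8                                    # toy parcellation (the paper uses 400)
SC = rng.uniform(0, 50, (n_regions, n_regions))  # synthetic fiber counts
np.fill_diagonal(SC, 0)

SC_norm = np.log(SC + 1.0)                       # log-normalization from the text

def sig(v, vmax=5.0, v0=6.0, r=0.56):
    # Sig(v) = vmax / (1 + exp(r * (v0 - v))); parameter values are illustrative
    return vmax / (1.0 + np.exp(r * (v0 - v)))

y1 = rng.standard_normal(n_regions)              # toy excitatory PSP states
y2 = rng.standard_normal(n_regions)              # toy inhibitory PSP states
coupling = SC_norm @ sig(y1 - y2)                # Coupling_i = sum_j SC_ij Sig(y1_j - y2_j)
```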

Table 4 Parameters for whole-brain EEG model

Linear Dynamical Systems (LDS)

LDS115 are widely used to model data whose evolution over time is linear, making them mathematically tractable for learning and inference. At time-step t, we observe a high-dimensional emission \({y}_{t}\in {{\mathbb{R}}}^{N}\), which is driven by low-dimensional hidden states \({x}_{t}\in {{\mathbb{R}}}^{D}\). The state dynamics obey the equation xt = Axt−1 + Vut + b + wt, where A is the dynamics matrix, V is the (input-to-state) control matrix, ut is a control input, b is an offset vector, and \(w \sim {{{\mathcal{N}}}}(0,\,{\Sigma }_{w})\) denotes the process noise. To fully specify an LDS, we also need to describe how the emissions yt are generated from the hidden states xt. A simple linear-Gaussian emission model is given by yt = Cxt + Fut + d + vt, where C is the measurement matrix, F is the feed-through (input-to-emission) matrix, d is an offset or bias term, and \(v \sim {{{\mathcal{N}}}}(0,\,{\Sigma }_{v})\) denotes the observation noise.

We used the open-source state space modeling package (ssm116) for parameter estimation in LDS, streamlining the estimation of hidden states and model parameters from observed time series data (using the Expectation-Maximization algorithm). An LDS was constructed and fitted to the firing rate time series (obtained from monkey A) in each condition (n = 92), assuming Gaussian dynamics and emissions, to learn the vector fields driven by the dynamics matrix A for each condition. For stability, all eigenvalues of A must lie strictly within the unit circle in the complex plane, i.e., their magnitudes must be less than 1. The fitted systems were subsequently sampled (n = 5000) to forecast the system dynamics for each condition.
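The generative equations above can be simulated directly; the sketch below builds a small stable LDS (no control input, i.e., V, F, and ut omitted; dimensions and noise scales are illustrative) and checks the eigenvalue condition for stability:

```python
import numpy as np

rng = np.random.default_rng(4)
D, N, T = 2, 5, 200                      # latent dim, emission dim, time-steps

# A stable dynamics matrix: a damped rotation with eigenvalues inside the unit circle
theta = 0.2
A = 0.95 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
assert np.all(np.abs(np.linalg.eigvals(A)) < 1)  # stability criterion from the text

C = rng.standard_normal((N, D))          # measurement matrix
b, d = np.zeros(D), np.zeros(N)          # offsets
sw, sv = 0.1, 0.05                       # process / observation noise scales

x = np.zeros((T, D))
y = np.zeros((T, N))
for t in range(1, T):
    x[t] = A @ x[t - 1] + b + sw * rng.standard_normal(D)  # x_t = A x_{t-1} + b + w_t
    y[t] = C @ x[t] + d + sv * rng.standard_normal(N)      # y_t = C x_t + d + v_t
```

With the ssm package, a model of this shape is typically constructed along the lines of `ssm.LDS(N, D, emissions="gaussian")` and fitted with its `.fit` method (check the installed version's API).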

Simulation-based Inference (SBI)

We adopted a Bayesian framework for likelihood-free inference and uncertainty quantification of effective connectivity at the micro, meso, and macro scales. Across brain scales, the likelihood function (i.e., the conditional probability of the data given the parameters) can become computationally prohibitive, rendering Monte Carlo estimation of the posterior (i.e., the conditional probability of the parameters given the data) inapplicable. This is due either to the high-dimensional nature of the data (e.g., the large number of neurons or brain regions) or to the complex relationships between variables and parameters (e.g., nonlinearity, and numerous measures of effective connectivity), leading us to employ simulation-based inference (SBI24) leveraging deep neural density estimators25,26. Taking a prior distribution \(p(\overrightarrow{\theta })\) over the parameters of interest \(\overrightarrow{\theta }\), a limited number N of simulations \({\{({\overrightarrow{\theta }}_{i},{\overrightarrow{x}}_{i})\}}_{i = 1}^{N}\) are generated for the training step, where \(\overrightarrow{{\theta }_{i}} \sim p(\overrightarrow{\theta })\) and \({\overrightarrow{x}}_{i}\) are the simulated data features given model parameters \({\overrightarrow{\theta }}_{i}\). After training the generative models of probability distributions, so-called normalizing flows28, we can efficiently estimate the approximate posterior \({q}_{\phi }(\overrightarrow{\theta }| \overrightarrow{x})\) with learnable parameters ϕ, so that for the observed data features \({\overrightarrow{x}}_{obs}\): \({q}_{\phi }(\overrightarrow{\theta }| {\overrightarrow{x}}_{obs})\simeq p(\overrightarrow{\theta }| {\overrightarrow{x}}_{obs})\).

Through a series of invertible transformations implemented by deep neural networks, normalizing flows convert a simple initial distribution (e.g., a uniform prior) into a complex target distribution (e.g., a multimodal posterior). State-of-the-art neural spline flows (NSFs117) offer efficient and exact density evaluation and sampling from the joint distribution of high-dimensional random variables at low computational cost. NSFs use splines as coupling functions, enhancing the flexibility of the transformations while retaining exact invertibility. To conduct SBI, we used an NSF model consisting of 5 flow transforms, two residual blocks of 50 hidden units each, ReLU nonlinearity, and 10 spline bins, all as implemented in the public sbi toolbox118. By training NSFs on the spiking network, the neural mass model of ERPs, and the whole-brain network model of EEG, we were able to readily estimate the approximate posterior of effective connectivity from low-dimensional data features, such as maximum firing rates, peak ERP amplitudes, and measures of integration. For validation of SBI on synthetic data across brain scales, see Supplementary Note 6.
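The estimator used here is an NSF trained with the sbi toolbox; as a much simpler stand-in that conveys the likelihood-free idea (sample parameters from the prior, simulate data features, compare them to the observation), the following rejection-ABC toy uses a hypothetical one-parameter simulator:

```python
import numpy as np

rng = np.random.default_rng(5)

def simulator(theta):
    # Hypothetical simulator standing in for the mechanistic models:
    # maps each parameter to a noisy scalar data feature
    return np.log(theta) + 0.05 * rng.standard_normal(theta.shape)

# Uniform prior over [5, 8], mirroring the shape of the g prior in the spiking model
theta = rng.uniform(5.0, 8.0, 100_000)
x = simulator(theta)

x_obs = np.log(6.5)                         # "observed" feature; ground truth is 6.5
eps = 0.02
posterior = theta[np.abs(x - x_obs) < eps]  # keep parameters whose simulations match
```

An NSF replaces the rejection step with a trained conditional density \(q_\phi(\theta | x)\), so that once training is done, posteriors for new observations can be sampled without further simulation.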

Statistics and reproducibility

All statistical analyses were performed using scipy in Python119. For comparing the distributions of empirical observables (such as mean firing rates, maximum amplitudes of ERPs, and measures of integration in functional networks), the Mann–Whitney U (MWU) test was used. The sample size, denoted by n, refers to the number of biologically independent samples, such as the number of neurons, ERPs, or subjects. Effect sizes were reported as the rank-biserial correlation, computed as \({r}_{rb}=1-\frac{2U}{{n}_{1}{n}_{2}}\), where U is the test statistic and n1 and n2 are the sample sizes of the two groups (i.e., conditions). All tests were two-sided. Bayesian learning was conducted on processed and curated empirical observables: n = 92 neurons for monkey A and n = 29 for monkey M at the microscale, n = 117 in monkey A and n = 21 in monkey M at the mesoscale, and n = 14 female and n = 13 male subjects at the macroscale. Inference was conducted separately on informative data features computed from each observable (i.e., peak firing rate, maximum evoked amplitude, and per-network integration), with n = 1000, n = 2000, and n = 2000 posterior samples per observable at the microscale, mesoscale, and macroscale, respectively. To assess the between-condition difference of pooled distributions of parameters sampled from the simulation-based inference, the Wasserstein (WS) distance was used instead of traditional significance tests. This metric was chosen as an agnostic and unbiased measure of distance, as conventional p values from tests like the MWU often become misleadingly small and statistically significant when comparing large, concatenated distributions of sampled parameters. The WS distance thus provides a more robust and meaningful measure of the difference between posterior parameter distributions across the two experimental conditions.
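With scipy, the effect-size and distance computations above reduce to a few lines (the two samples here are synthetic placeholders for per-condition observables):

```python
import numpy as np
from scipy.stats import mannwhitneyu, wasserstein_distance

rng = np.random.default_rng(6)
a = rng.normal(0.0, 1.0, 92)    # e.g., a per-neuron observable in condition 1
b = rng.normal(0.6, 1.0, 92)    # the same observable, shifted, in condition 2

U, p = mannwhitneyu(a, b, alternative="two-sided")  # two-sided MWU test
r_rb = 1 - 2 * U / (len(a) * len(b))                # rank-biserial correlation, as in the text
ws = wasserstein_distance(a, b)                     # distance between the two distributions
```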

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.