Abstract
Bipolar disorder (BD) involves altered reward processing and decision-making, with inconsistencies across studies. Here, we integrated hierarchical Bayesian modelling with magnetoencephalography (MEG) to characterise maladaptive belief updating in this condition. First, we determined if previously reported increased learning rates in BD stem from a heightened expectation of environmental changes. Additionally, we examined if this increased expectation speeds up belief updating in decision-making, associated with modulation of rhythmic neural activity within the prefrontal, orbitofrontal, and anterior cingulate cortex (PFC, OFC, ACC). Twenty-two euthymic BD and 27 healthy control (HC) participants completed a reward-based motor decision-making task in a volatile setting. Hierarchical Bayesian modelling revealed BD participants anticipated greater environmental volatility, resulting in a more stochastic mapping from beliefs to actions and paralleled by lower win rates and a reduced tendency to repeat rewarded actions than HC. Despite this, BD individuals adjusted their expectations of action-outcome contingencies more slowly, but both groups invigorated their actions similarly. On a neural level, while healthy individuals exhibited an alpha-beta suppression and gamma increase during belief updating, BD participants showed dampened effects, extending across the PFC, OFC, and ACC regions. This was accompanied by an abnormally increased beta-band directed information flow in BD. Overall, the results suggest euthymic BD individuals anticipate environmental change without adequately learning from it, contributing to maladaptive belief updating. Alterations in frequency-domain amplitude and functional connectivity within the PFC, OFC, and ACC during belief updating underlie the computational effects and could serve as potential indicators for predicting relapse in future research.
Introduction
Bipolar disorder (BD) is a chronic affective condition characterised by episodes of elation, depression, and mixed states, interspersed with periods of clinical remission [1, 2]. Alterations in reward processing and impaired decision-making performance have been associated with the condition [3, 4], pointing to disrupted functional connectivity between the prefrontal cortex (PFC) and the mesolimbic reward system [5, 6]. Yet findings across studies are variable, even when considering euthymic periods alone [5,6,7]. BD research has reported both heightened sensitivity to negative feedback and decreased learning from rewards, or the reverse [8,9,10,11,12]. Recently, Ossola et al. [13] found that in euthymic BD, attenuated belief updating from positive feedback forecasts relapse, highlighting the importance of investigating dynamic belief updating during euthymia.
Influential proposals advocate for the application of computational models in BD research to understand fluctuations in mood and reward processing [14,15,16]. Short-lived emotional changes in response to rewards can accumulate to generate longer-lasting mood states, which further bias the processing of outcomes, favouring computations congruent with the valence of the current mood [17,18,19]. In BD, new frameworks, building upon previous neurocomputational work on mood instability [20], suggest that altered affective reactivity to reward and punishment may elevate learning rates, even during euthymia, predisposing individuals to form stronger expectations about rewards or punishments. Recent empirical work supports this, revealing a tendency in BD for reward perception to be biased by fluctuations in the momentum of recent reward prediction errors [21].
An increased learning rate could also reflect a heightened anticipation of environmental changes in BD [14]. Indeed, seminal modelling studies support that agents learn faster when anticipating more frequent transitions in the environment [22, 23]. In BD, persistent biased expectations during mood episodes will deviate from the true hidden state, increasing the likelihood of unsuccessful decisions. This could promote a perception that the environment is more volatile. However, the extent to which individuals with BD overestimate volatility and how this influences their belief updating and decision-making remains unexplored. Building on the proposal that moods can specify the prior probabilities of different computations [20, 24], we hypothesise that a history of experiencing mood extremes and high mood instability in BD will set a prior on high environmental volatility in this condition. Such inflated volatility estimates could introduce ‘noise’ into the decision-making process [25], leading to incorrect decisions. Alternatively, the reported shifts in decision-making performance during euthymic BD [3, 4] could be explained by slower belief updating, aligning with recent empirical observations in valence-dependent learning [13].
To test these hypotheses, we investigate the computational processes underlying altered decision-making in euthymic bipolar patients, compared to healthy participants, as they undertake a probabilistic reward-based learning task in a volatile environment. We employ the Hierarchical Gaussian Filter (HGF), a validated modelling framework based on Hierarchical Bayesian inference that describes individual learning dynamics in uncertain and volatile environments [26,27,28,29]. The HGF is based on influential theories of cortical function, which propose that the brain continuously makes and refines predictions about the states of the world using approximate Bayesian inference [30, 31]. We used the HGF to model how input about probabilistic reward outcomes and their change over time is integrated with prior beliefs during learning, resulting in posterior beliefs about the hidden states causing the observed outcomes [26, 27]. Belief updates in the HGF are driven by prediction errors (PE)—the discrepancy between predictions and outcomes—and are modulated by the precision weights, where precision is defined as the inverse variance of belief distributions. This computational framework has already proven useful for understanding psychiatric conditions [12, 32, 33], aligning with proposals that understand clinical and subclinical conditions as manifestations of aberrant belief updating and predictive processing [34, 35]. Integrating generative models of learning and inference, such as the HGF, with dynamic models of mood in BD could offer insights into how extreme changes in affective states and mood dynamically shape adaptive learning [16, 36,37,38,39].
To gain a more mechanistic understanding of the processes underlying the hypothesised computational alterations in euthymic BD, we additionally investigated the neural correlates of hierarchical belief updating using magnetoencephalography (MEG). Existing research supports the role of cortical oscillations in maintaining predictions and encoding PEs [40,41,42,43]. Specific frequency rhythms such as alpha (8–12 Hz) and beta (13–30 Hz) oscillations have been associated with the transmission of top-down predictions, and encoding precision, while gamma-band activity (>30 Hz) has been linked to the propagation of PEs and precision-weighted PEs, pwPEs [41, 43,44,45]. Importantly, disruptions in these rhythms are suggested to contribute to learning deficits observed in various psychiatric conditions, including anxiety, schizophrenia, and autism [45,46,47].
On a neural level, we hypothesised that biases in probabilistic reward-based learning in BD in a volatile setting can be reflected in alpha, beta, and gamma activity during the encoding of pwPE and precision. We anticipated between-group differences in gamma activity along with concomitant alpha/beta activity during pwPE processing, and in alpha/beta activity during the encoding of uncertainty. Faster (slower) belief updating in BD would be reflected in increased (decreased) gamma activity, with opposite directional modulation in alpha/beta during pwPE encoding [45, 46]. These alterations are expected to manifest in the orchestrated activity across decision-making brain areas, such as the prefrontal, anterior cingulate, and orbitofrontal cortex (PFC, ACC, OFC). These regions are involved in learning in volatile and uncertain settings [22, 45, 48, 49], and form part of the fronto-striatal reward circuit, which exhibits disturbed connectivity in BD [9, 50]. We therefore additionally hypothesised that changes in frequency-domain connectivity patterns between these regions during belief updating would occur in BD relative to healthy control participants.
Lastly, we aimed to determine whether the computations underlying decision-making deficits in euthymic BD influence the motivational aspects associated with the invigoration of movements. Evidence suggests that reward expectations can speed motor performance [51, 52], and the nigrostriatal dopamine pathway is crucial for invigorating future movements [53]. The ‘dopamine hypothesis’ of BD [54, 55] posits that dopamine dysregulation underlies both manic episodes and the broader episodic features of BD. Moreover, individuals with BD have been shown to exhibit heightened energy and effort following success, indicating enhanced motor vigour effects [4, 56]. Consequently, our final complementary hypothesis posits that the strength of predictions about reward contingencies will speed decision-related movements more in euthymic BD than in healthy individuals [57].
Methods and materials
Ethics declarations
The study was approved by the Institutional Review Board of National Research University Higher School of Economics and the Local Ethical Committee of the First Moscow State Medical University. All procedures contributing to this work comply with the relevant ethical guidelines and regulations for research involving human participants, including those of the approving institutional committees and the Helsinki Declaration of 1975, including its subsequent amendments. All participants provided written informed consent.
Participants
Participants included 22 bipolar patients (mean age: 29.1 years [SEM = 1.67], 17 females; Table 1), and 27 healthy participants (27.5 years [SEM = 1.18], 15 females). Bipolar participants were assessed by a consultant psychiatrist who confirmed the diagnosis of BD (I or II) using the structured clinical interview for the International Statistical Classification of Diseases and Related Health Problems (ICD-11) [58]. Patients included in the study were euthymic for at least 2.5 months before recruitment. Additional inclusion criteria were: most recent episode being depression, aged 18–50 years, absence of symptoms from other mental health conditions beyond BD, and no history of substance abuse. We assessed residual mood symptoms and cognitive performance using validated scales on mania, anxiety, depression and tasks on executive and general cognitive performance. See Table 1 for further details, and Supplementary Material for sample size estimates.
Reward-based motor sequence learning task
Participants underwent an initial fine motor control assessment, then completed a validated motor-based decision-making paradigm [57] (Fig. 1A), which combines probabilistic binary reward-based learning within a volatile setting (reminiscent of reversal learning) with the execution of motor sequences to express decisions. Participants learned two sequences of four finger presses (matched in difficulty, Supplementary Materials), followed by a 320-trial test phase. In each trial, they were required to choose and perform one of the sequences to potentially earn a reward (5 points; Fig. 1A). Reward probabilities for sequences were reciprocal (p, 1-p) and changed pseudorandomly every 26–38 trials (Fig. 1B). The aim was to infer the reward probability associated with each sequence (‘action-outcome’ contingencies henceforth) and adjust their choices considering changing contingencies. Accumulated points translated to monetary rewards. See timeline in Fig. 1C. The task, programmed in MATLAB using Psychtoolbox, recorded participants’ keypress timings to evaluate reaction time (RT) and performance tempo (Fig. 1D). See Supplementary Material.
A The initial phase of the task involved practicing two motor sequences, each linked to a distinct fractal image. The red fractal corresponded to sequence 1 (seq1: 1-3-2-4), and the blue fractal to sequence 2 (seq2: 4-1-3-2), with button presses producing sounds of varying pitches (E5, F5, G5, A5). B The stimulus-outcome mapping varied per participant across each block of 160 trials, with the win probability shifting every 26–38 trials through different phases (blue fractal: p(win|seq2) = 0.9, 0.7, 0.1, 0.3, 0.5) and the red fractal (seq1) having reciprocal probabilities (p(win|seq1) = 1-p(win|seq2)). Across both blocks participants encountered each contingency type twice. C Each trial presented the fractals on-screen, prompting participants to perform the sequence they believed most likely to win, aiming to maximize rewards. On average, participants performed the sequences within 1561 (SEM 40) ms, displayed as ~1600 ms. Binary feedback on reward acquisition was displayed 1000 [±200] ms after sequence performance, visible for 1900 [±100] ms, indicating either ‘You earned 5 points’ or ‘You earned 0 points’. D Trial-by-trial performance tempo (ms) for the healthy control (HC, green) and bipolar disorder (BD, purple) groups. Tempo, calculated as the mean inter-key press interval, is shown as trial-wise averages (black dots) with 95% confidence intervals represented by bars.
General task performance
General probabilistic task performance was assessed using the win rate (rate of rewarded trials) and the lose-shift and win-stay rates [45, 59], related to our first hypothesis (Supplementary Material). Better learning would be associated with higher values across all three variables. Separately, we controlled for between-group differences in error rates (performance errors and timeouts).
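The three performance measures can be illustrated with a minimal Python sketch. The exact definitions used in the study are given in the Supplementary Material; here we assume that the win-stay rate is the proportion of rewarded trials followed by the same choice, and the lose-shift rate the proportion of unrewarded trials followed by a different choice. The function name `task_rates` and the input coding are hypothetical, and error trials are not handled.

```python
def task_rates(choices, rewards):
    """Return (win_rate, win_stay_rate, lose_shift_rate).

    choices: chosen sequence per trial (e.g. 1 or 2); hypothetical coding.
    rewards: outcome per trial (1 = win, 0 = no win).
    Assumed definitions, not necessarily those of the original analysis.
    """
    if len(choices) != len(rewards) or len(choices) < 2:
        raise ValueError("need two equally long series with at least 2 trials")

    win_rate = sum(rewards) / len(rewards)

    wins = stays_after_win = 0
    losses = shifts_after_loss = 0
    for k in range(1, len(choices)):
        stayed = choices[k] == choices[k - 1]
        if rewards[k - 1] == 1:
            wins += 1
            stays_after_win += int(stayed)
        else:
            losses += 1
            shifts_after_loss += int(not stayed)

    win_stay = stays_after_win / wins if wins else float("nan")
    lose_shift = shifts_after_loss / losses if losses else float("nan")
    return win_rate, win_stay, lose_shift


if __name__ == "__main__":
    # Toy data: six trials, two options.
    print(task_rates([1, 1, 2, 2, 1, 1], [1, 0, 1, 1, 0, 1]))
```

In this toy series, two of the three wins that have a following trial are followed by a repeated choice, and one of the two losses is followed by a switch.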
Modelling decision-making behaviour using hierarchical Gaussian filters
To assess probabilistic learning in our task we used a validated hierarchical Bayesian model, the 3-level perceptual HGF for binary categorical inputs [26, 27] (Fig. 2A). This model describes how participants infer hidden states about the tendency of the action-outcome contingencies on trial k, x2(k) (level 2), and the rate of change in that tendency (log-volatility), x3(k). Level 1 represents the binary reward input. Gaussian belief distributions on levels 2 and 3 are represented by their posterior mean (μ2(k), μ3(k)) and posterior variance (uncertainty: σ2, σ3); precision, πi (i = 2, 3), is the inverse of this variance. The first-, second- and third-level variances (σ1, σ2, σ3) represent irreducible, estimation and volatility uncertainty, respectively [26, 60, 61]. Estimation and volatility uncertainty arise from imperfect information about the true states x2 and x3, respectively, and can be reduced as learning progresses. By contrast, σ1 cannot be reduced through learning, as it embodies the probabilistic nature of response-outcome relationships. See further details in Supplementary Materials.
A Overview of the winning model: 3-level binary categorical HGF perceptual model and coupled response model. In this model, agents infer true states about the current tendency of the action-outcome probabilistic mapping on trial k, x2(k), and its rate of change or log-volatility, x3(k). Beliefs about these true states are Gaussian distributions parametrised by their mean (μ2(k), μ3(k)) and variance (σ2(k), σ3(k)), representing uncertainty or the inverse of precision. These mean and precision variables are updated using one-step equations, with updates modulated by parameters such as κ, ω2, ω3. The response model maps these beliefs to decisions based on the expectation of log-volatility from the previous trial (μ3(k-1)), equivalent to the prediction for the current trial (denoted by "^", Supplementary Materials). B Trajectories used in further analyses include the strength of predictions about action-outcome contingencies, \(\left|{\hat{\mu }}_{2}^{\left(k\right)}\right|\) (top), for assessing motor vigour effects; the trajectory of unsigned precision-weighted prediction errors updating beliefs at level 2, labelled |ε2| here (centre), serving as a parametric regressor of source-reconstructed MEG activity, alongside uncertainty regressors σ2, σ3; and log-volatility estimates, μ3 (bottom), averaged to test the hypothesis that BD participants overestimate volatility in this setting. See expanded Supplementary Figs. S1 and S2. C Comparative win rates show BD participants (purple) were significantly less successful in achieving rewarding outcomes than their healthy counterparts (green; lower win rate, PFDR = 0.0014, permutation test). D BD patients exhibited a significantly higher tendency to switch after a win compared to the HC group (reduced win-stay behaviour, PFDR = 0.0194). Nonetheless, lose-shift behaviour was similar across groups (P = 0.0966, non-significant; BF10 = 0.8905; anecdotal evidence against group differences). Mean and SEM rates shown in black dots for panels C and D represent performance by ideal Bayesian observers with the same input as our participants (detailed in Supplementary Material), highlighting deviations from these ideal patterns in our actual participants, which, however, did not account for the observed between-group differences. E–G Between-group comparisons of HGF computational variables revealed that BD patients consistently overestimated environmental log-volatility (E; initially, μ3(0): PFDR = 0.0142, and throughout the task, F; mean μ3: PFDR = 0.0428), while showing an attenuation effect on tonic volatility, ω2 (G; significant reduction compared to HC, PFDR = 0.0174).
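Belief updating on each level i and trial k is driven by prediction errors and modulated by precision ratios, which weight the influence of precision or uncertainty in the current level and the level below. The resulting term is the precision-weighted PE, pwPE. For level 2, belief updating takes the simple form:
\[\Delta {\mu }_{2}^{\left(k\right)}={\mu }_{2}^{\left(k\right)}-{\mu }_{2}^{\left(k-1\right)}={\sigma }_{2}^{\left(k\right)}{\delta }_{1}^{\left(k\right)}\qquad (1)\]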
Thus, updating beliefs about the tendency of the action-outcome contingencies is proportional to the PE about action outcomes, δ1(k), weighted with the estimation uncertainty on that level, σ2(k). Here, pwPE is equal to σ2(k)δ1(k). See general equation, representing updates on level 3, in Supplementary Material, and ref. [27].
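The level-2 update can be made concrete with a short Python sketch of a single trial. It follows the standard binary HGF update equations from the HGF literature [27] rather than the exact code path of the toolbox used here, so details may differ from the fitted implementation. The function name `update_level2`, the default parameter values, and the coding of the input `u` (1 or 0 in contingency space) are illustrative assumptions.

```python
import math


def sigmoid(x):
    """Logistic function mapping a tendency to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def update_level2(mu2_prev, sigma2_prev, mu3_prev, u, kappa=1.0, omega2=-4.0):
    """One-trial level-2 belief update (standard binary HGF form; a sketch).

    mu2_prev, sigma2_prev: posterior mean and variance at level 2, trial k-1.
    mu3_prev: posterior mean of log-volatility on trial k-1.
    u: binary input on trial k (assumed coding: 1 or 0).
    kappa, omega2: coupling and tonic volatility (illustrative defaults).
    Returns (mu2, sigma2, pwpe).
    """
    mu1_hat = sigmoid(mu2_prev)   # predicted outcome probability
    delta1 = u - mu1_hat          # prediction error about the outcome

    # Predicted precision: previous variance inflated by the expected
    # step variance exp(kappa * mu3 + omega2).
    pi2_hat = 1.0 / (sigma2_prev + math.exp(kappa * mu3_prev + omega2))
    # Posterior precision adds the information carried by the binary input.
    pi2 = pi2_hat + mu1_hat * (1.0 - mu1_hat)
    sigma2 = 1.0 / pi2

    pwpe = sigma2 * delta1        # precision-weighted prediction error
    mu2 = mu2_prev + pwpe         # Eq. (1)
    return mu2, sigma2, pwpe


if __name__ == "__main__":
    low_vol = update_level2(0.0, 1.0, 0.0, u=1)
    high_vol = update_level2(0.0, 1.0, 2.0, u=1)
    print("update with mu3 = 0:", low_vol)
    print("update with mu3 = 2:", high_vol)
```

Under these assumptions, a higher log-volatility estimate inflates σ2 and therefore the size of the update for the same prediction error, while a lower ω2 has the opposite effect.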
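States x2, x3 evolve as Gaussian random walks, with volatility states x3 directly influencing the time evolution of x2 through its variance (conditional on past values):
\[{x}_{2}^{\left(k\right)} \sim {\mathcal{N}}\left({x}_{2}^{\left(k-1\right)},\,{f}_{2}\left({x}_{3}^{\left(k\right)}\right)\right)\qquad (2)\]
with (dropping k for simplicity)
\[{f}_{2}\left({x}_{3}\right)=\exp \left(\kappa {x}_{3}+{\omega }_{2}\right)\qquad (3)\]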
In (3), ω2 is the tonic portion of the log conditional variance of x2, and κ is a coupling constant that regulates how phasic volatility, x3, alters the magnitude of belief updates about action outcomes. The step size at level 3 is modulated by ω3, representing high-level tonic volatility. Larger values of ω2 and ω3 are associated with larger updates in beliefs about the probabilistic mapping at level 2 and about volatility, respectively, as demonstrated in previous simulations [59]. See also Results. Higher κ values increase the influence of log-volatility changes on belief updates at level 2. See further details in Supplementary Material.
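To illustrate the roles of ω2, ω3 and κ in the generative model, the following Python sketch simulates the hidden states as coupled Gaussian random walks. It assumes that the step variance at level 2 is exp(κ x3 + ω2) and, as a further assumption not spelled out in the main text, that the step variance at level 3 is exp(ω3). The function name `simulate_states` and all default values are hypothetical.

```python
import math
import random


def simulate_states(n_trials, kappa=1.0, omega2=-4.0, omega3=-6.0,
                    x2_0=0.0, x3_0=0.0, seed=0):
    """Simulate hidden states and binary outcomes (illustrative sketch).

    Returns a list of (x2, x3, u) tuples, one per trial.
    Assumed step variances: exp(omega3) at level 3,
    exp(kappa * x3 + omega2) at level 2.
    """
    rng = random.Random(seed)
    x2, x3 = x2_0, x3_0
    trajectory = []
    for _ in range(n_trials):
        # Level 3: log-volatility performs a random walk.
        x3 = rng.gauss(x3, math.sqrt(math.exp(omega3)))
        # Level 2: the tendency walks with a volatility-dependent step size.
        x2 = rng.gauss(x2, math.sqrt(math.exp(kappa * x3 + omega2)))
        # Level 1: binary outcome drawn with probability sigmoid(x2).
        p_win = 1.0 / (1.0 + math.exp(-x2))
        u = 1 if rng.random() < p_win else 0
        trajectory.append((x2, x3, u))
    return trajectory


if __name__ == "__main__":
    for label, om2 in (("low omega2", -6.0), ("high omega2", -2.0)):
        traj = simulate_states(200, omega2=om2, seed=1)
        steps = [abs(b[0] - a[0]) for a, b in zip(traj, traj[1:])]
        print(label, "mean |step| at level 2:", sum(steps) / len(steps))
```

With the same random seed, a larger ω2 scales every level-2 step by a constant factor, which is one way to see why larger ω2 values correspond to faster changes in the action-outcome tendency.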
To assess how beliefs mapped to decisions, we coupled this perceptual model to response models previously used in similar tasks [27, 45, 46]. First, we considered a unit-square sigmoid response model where choice probability is shaped by a free fixed (time-invariant) parameter ζ, interpreted as inverse decision noise: the sigmoid approaches a step function as ζ tends to infinity. This constituted our model M1. Model M2 was similar but employed a two-level HGF with constant volatility. M3 combined the 3-level HGF with a response model where the sigmoid function depends on the trial-wise prediction of log-volatility, \(\zeta ={e}^{-{\mu }_{3}^{\left(k-1\right)}}\) [25] (Fig. 2A). In this model, higher estimates of volatility lead to a more stochastic mapping from beliefs to decisions. As a result, there is an increased likelihood of choosing responses that deviate from predictions, consistent with increased exploration (exploring whether the contingency has changed). In models M1 and M3, parameters ω2 and ω3 were free; ω2 was also free in M2. Additionally, ζ was free in M1 and M2, while initial values μ3(0) and σ3(0) were free in M3. Higher initial values of μ3(0) indicate that an agent initially expects rapid changes in the probabilistic mapping, while σ3(0) represents the initial uncertainty an agent has about μ3(0). A fourth model, M4, was constructed similarly to M3 but replaced the free parameter ω2 with κ [32].
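The difference between the fixed-noise and volatility-dependent response models can be sketched in a few lines of Python. The unit-square sigmoid is written here in its usual form, p = m^ζ / (m^ζ + (1 − m)^ζ), where m is the predicted probability that a given option is rewarded; this form and the function names are assumptions for illustration, not a copy of the fitted code.

```python
import math


def unitsq_sigmoid(m, zeta):
    """Unit-square sigmoid: probability of choosing the option whose
    predicted reward probability is m (0 < m < 1), with inverse noise zeta."""
    return m ** zeta / (m ** zeta + (1.0 - m) ** zeta)


def choice_prob_m3(m, mu3_prev):
    """Volatility-dependent response model (as in M3; a sketch):
    zeta = exp(-mu3_prev), so higher expected log-volatility gives a
    flatter, more stochastic mapping from beliefs to choices."""
    return unitsq_sigmoid(m, math.exp(-mu3_prev))


if __name__ == "__main__":
    m = 0.8  # illustrative predicted reward probability
    for mu3 in (-2.0, 0.0, 2.0):
        print(f"mu3 = {mu3:+.1f} -> p(choice) = {choice_prob_m3(m, mu3):.3f}")
```

For a fixed belief m = 0.8, the choice probability equals m when μ3 = 0 (ζ = 1) and moves towards 0.5 as μ3 increases, which is the sense in which overestimating volatility makes choices more exploratory.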
Models were fitted to individual behavioural data (series of responses and observed outcomes) using priors described in Supplementary Table S1. Log model evidence from random-effects Bayesian model selection was used for model comparison (Supplementary Materials). Simulations, similar to previous work, quantified the estimability of free model parameters [32, 45]. Relevant belief and uncertainty trajectories were used subsequently for our MEG analysis (Fig. 2B; and expanded figures in Supplementary Figs. S1 and S2; see Results). The models were implemented as part of the TAPAS toolbox [62]. We used the HGF release v7.1 in MATLAB R2020b, and the function ‘tapas_ehgf_binary’.
Assessing motor invigoration
Details on assessing motor vigour are included in Supplementary Materials. Using Bayesian multilevel regression modelling, we investigated whether trial-by-trial predictions about the action-outcome contingencies, \({\hat{\mu }}_{2}^{\left(k\right)}\), differentially influenced the timing of motor performance in the groups, related to our motor vigour hypothesis. As in ref. [57], we hypothesised a negative association between the strength of predictions, \(|{\hat{\mu }}_{2}^{\left(k\right)}|\), and performance timing, suggesting that stronger expectations about reward contingencies speed performance. We also hypothesised a greater sensitivity to these predictions (steeper slope) in BD compared to HC. See Supplementary Tables S2, S3 and the Supplementary Material, which also includes control analyses on baseline motor performance.
MEG recording and preprocessing
MEG was performed using a 306-channel system (Elekta Neuromag VectorView), with head movements tracked by a head position indicator with four coils. Concurrently, ECG and EOG were recorded for MEG artefact rejection. Recordings were sampled at 1000 Hz and filtered between 0.1–330 Hz. MEG preprocessing involved head movement correction, noise reduction, and channel selection using standard methods ([63]; Elekta Maxfilter software; Supplementary Material). MEG data were further processed using MNE-Python [64] (Python version 3.11.5) and custom Python scripts: they were lowpass filtered at 125 Hz, downsampled to 250 Hz, and notch filtered at 50 and 100 Hz. Independent component analysis (FastICA) removed eye and heart artefacts (3.3 ICs on average per participant).
Source reconstruction of MEG signals
We reconstructed MEG signals using Linearly Constrained Minimum Variance (LCMV) beamforming [65] in MNE-Python, with individual T1-weighted MRI images processed in Freesurfer 6.0 ([66, 67]; http://surfer.nmr.mgh.harvard.edu/) to obtain cortical divisions. We aligned MRI and MEG coordinate systems, selected the Desikan-Killiany atlas for cortical parcellation (DK [68]), and performed forward modelling with boundary element models [45].
We focused on alpha and beta frequency bands, band-pass filtering signals between 1–40 Hz before LCMV beamforming. Theta-band activity was also examined given its robust association with feedback processing [42, 69], relevant for win/lose outcomes in our task. Gamma frequency analysis followed a similar process (30–124 Hz band-pass filter). Time courses were extracted for regions of interest (ROIs) associated with decision-making under uncertainty and reward processing [43, 70,71,72,73,74,75], and linked with impairments in fronto-striatal reward circuitry in BD [5, 9, 76, 77]. These included the (1) ACC, (2) OFC, including the ventromedial PFC (vmPFC), (3) dorsomedial PFC (dmPFC), and (4) dorsolateral PFC (dlPFC). We also included the (5) primary motor cortex (M1) and (6) premotor cortex (PMC), to assess motor activity during decision-making [78].
Our study’s ROIs comprised 16 bilateral labels in eight areas from the DK atlas: (1) rostral and caudal ACC, (2) lateral and medial OFC (including vmPFC), (3) superior frontal gyrus (dmPFC, and supplementary motor area, SMA), (4) rostral middle frontal gyrus (rMFG), (5) precentral gyrus (M1), and (6) caudal MFG. Time series extraction utilised the PCA flip method in MNE-Python. Although the ‘flip’ operator was not relevant for our time-frequency analysis, it was essential for preparing the source-reconstructed time series for subsequent connectivity analysis. See anatomical label references in Supplementary Material.
Convolution modelling of time-frequency responses during outcome processing
We used a validated convolution-modelling approach to analyse frequency-domain amplitude changes related to belief updating and uncertainty following outcome presentation [44, 46, 79]. Building on previous work [45], this frequency-domain general linear model (GLM) included as parametric regressors the unsigned pwPE updating beliefs on level 2 (representing precision-weighted Bayesian surprise; the absolute value is preferred for the binary HGF where sign on level 2 is arbitrary [59, 80]), and uncertainty measures (σ2, σ3). It also included discrete regressors for win/lose outcomes and error trials. To avoid regressor collinearity and potential GLM misspecification, we excluded the level 3 pwPE [45, 46], due to its high linear correlation with the unsigned pwPE on level 2 (Supplementary Materials).
The GLM was applied to concatenated epochs of source-reconstructed data in our ROIs, using Morlet wavelets for time-frequency (TF) analysis over 4–100 Hz and from −0.5 to 1.8 s relative to outcome onset (Supplementary Fig. S3). We conducted this analysis using SPM12 software (http://www.fil.ion.ucl.ac.uk/spm/), adapting original code by ref. [81], as used in ref. [45], with additional details available in the Supplementary Materials.
Frequency-resolved functional connectivity
To analyse directed functional connectivity between frequency-resolved activity in our ROIs, we employed time-reversed Granger causality (TRGC [82]; Supplementary Materials) as a robust metric for directed information flow [83]. Following Pellegrini et al. [83], we applied TRGC in the frequency domain to LCMV-based source-reconstructed time series from our 16 ROIs after the PCA flip transformation.
Our analysis focused on between-group differences in the directionality of information flow within the 8–30 Hz range during the 0.5–1 s interval of outcome processing for trials with large unsigned pwPEs updating beliefs at level 2. We employed a median split of unsigned pwPE values, yielding approximately 160 high-|pwPE| trials per participant. This frequency range was selected based on evidence that beta-band functional connectivity from the PFC effectively differentiates levels of predictability, exhibiting reduced values during unpredictable trials [42]. By examining TRGC in trials with high unsigned pwPE values, we anticipated a general decrease in beta-band TRGC in HC, in parallel with alpha/beta amplitude suppression during belief updating. We hypothesised that this pattern would be disrupted in BD. The TRGC analysis was conducted using the ROIconnect plugin for EEGLAB [83], adapted for our MNE-Python LCMV outputs. See Supplementary Materials.
Statistical analysis
Between-group analyses of behavioural, computational, and TRGC-derived variables used independent-sample permutation tests (5000 permutations, two-sided test) in MATLAB®. Within-subject analyses used paired permutation tests (two-sided). We maintained an alpha significance level at 0.05 and controlled false discovery rates (FDR) at q = 0.05 for multiple tests. Non-parametric effect sizes are reported as probability of superiority [84, 85] (Δ). Non-significant effects were further evaluated using Bayes Factors (BF10), interpreted following Wetzels and Wagenmakers [86].
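The between-group statistics can be illustrated with a minimal, standard-library Python sketch. It assumes a difference in group means as the permutation test statistic, the Benjamini-Hochberg step-up procedure for FDR control, and a probability of superiority computed over all cross-group pairs with ties counted as one half. These specific choices and the function names are assumptions; the original analyses were run in MATLAB and may differ in detail.

```python
import random


def perm_test_mean_diff(a, b, n_perm=5000, seed=0):
    """Two-sided independent-sample permutation test on the mean difference."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of group membership
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_perm + 1)


def prob_superiority(a, b):
    """Probability that a random value from a exceeds one from b
    (ties count one half)."""
    score = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    return score / (len(a) * len(b))


def bh_fdr(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure; returns one reject flag per test."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            max_rank = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        reject[i] = rank <= max_rank
    return reject


if __name__ == "__main__":
    group_a = [10, 11, 12, 13, 14]   # toy values, not study data
    group_b = [0, 1, 2, 3, 4]
    print("p =", perm_test_mean_diff(group_a, group_b))
    print("probability of superiority =", prob_superiority(group_a, group_b))
    print("BH rejections:", bh_fdr([0.001, 0.04, 0.03, 0.5]))
```

The toy inputs are illustrative only and bear no relation to the data reported here.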
Statistical analysis of source-level time-frequency images used cluster-based permutation testing in the FieldTrip Toolbox [87, 88] (1000 permutations). We averaged TF activity across frequency bins within each band (theta, alpha, beta; 60–100 Hz for gamma [45]). Temporal intervals of interest for statistical analyses were selected based on previous research [45, 46, 69]: 0.5–1.8 s for parametric regressors, 0.2–1 s for win/lose regressors. We controlled the family-wise error rate (FWER) at 0.05 (two-sided tests, effects considered significant if PFWER < 0.025). See Supplementary Materials.
Results
Demographics
BF analysis provided anecdotal evidence for a balanced distribution of age and sex across the groups. Furthermore, substantial to anecdotal evidence indicated similar scores in mania, anxiety, depression, and general cognitive functioning between groups. Significant differences were observed exclusively in executive functioning, with the BD group demonstrating lower performance. See Table 1.
Altered reward-based decision dynamics in bipolar disorder during euthymia
Euthymic BD participants exhibited lower win rates compared to HC individuals (PFDR = 0.0014; Δ = 0.79, CI = [0.60, 0.90]; Fig. 2C). They also demonstrated lower win-stay rates (PFDR = 0.0194; Δ = 0.71, CI = [0.55, 0.85]; Fig. 2D). This indicates that, after securing a win on a trial, BD individuals were less likely to repeat the sequence compared to HC. The tendency to switch after a loss did not differ significantly between groups, although the evidence for similarity was only anecdotal (lose-shift rate: P = 0.0966; BF10 = 0.8905; Fig. 2D), despite an overall increased total switch rate in BD relative to HC (see details in Supplementary Materials, including evidence for similar performance error rates).
To test our computational hypotheses, we used the HGF framework [27]. Bayesian model selection identified as the best model overall, and for each group separately, a three-level HGF with a response model in which the decisions depend on dynamic trial-by-trial expectations of log-volatility, μ3(k-1), and with ω2, ω3, μ3(0), and σ3(0) as free model parameters (M3; Supplementary Table S4). Simulation analyses confirmed good parameter recovery (Supplementary Fig. S4).
Using this model, we found that BD participants had higher expectations of log-volatility initially and on average (μ3(0): PFDR = 0.0142; Δ = 0.70, CI = [0.55, 0.85]; trial-average μ3: PFDR = 0.0428; Δ = 0.66, CI = [0.52, 0.82]; Fig. 2E, F). This suggests increased stochasticity in their responses, as also indicated by a positive correlation between log-volatility μ3 and the response switch rate in the total sample and within each group separately (Supplementary Fig. S5a). Parameter μ3 also exhibited a negative correlation with the win-stay rate (Supplementary Fig. S5b; Supplementary Materials), consistent with the finding that BD participants overestimated μ3 and had a lower win-stay rate than HC (Fig. 2D, F). Additionally, BD participants exhibited lower tonic volatility, ω2, compared to HCs (PFDR = 0.0174; Δ = 0.67, CI = [0.52, 0.82]; Fig. 2G), suggesting a slower adjustment of beliefs about action-outcome contingencies (see simulation analysis in Supplementary Fig. S6). No significant between-group differences were found in ω3.
We additionally assessed the association between residual symptoms in BD participants and relevant HGF variables. Prior work suggests a positive correlation between volatility and trait anxiety [45]. Accordingly, we analysed the relationship between trait anxiety levels in BD and μ3, confirming a significant positive correlation (Spearman’s rank correlation ρ = 0.46, 95% confidence interval, CI, [0.04, 0.75], PFDR = 0.030). For mania scores, we hypothesised a correlation with the precision weights term, σ2 (estimation uncertainty), which scales the influence of PEs on belief updates about action-outcome contingencies (Eq. (1)). We posited that higher mania levels in BD might be associated with an enhanced reactivity to PEs [21], speeding belief updating via σ2. Non-parametric regression analyses revealed a negative association between mania and σ2 (ρ = −0.46 [−0.75, −0.02], PFDR = 0.037). Conversely, we considered that depression scores might be associated with attenuated reward-based belief updating (lower σ2) yet found a lack of association between these variables (ρ = 0.04 [−0.34, 0.41], P = 0.836; BF10 = 0.464, anecdotal evidence). See Supplementary Fig. S7 and Supplementary Materials.
Control analyses revealed no medication effects (antipsychotics and dopamine-blocking/modulating drugs) on these associations or the main between-group computational results. See Supplementary Table S5.
Expectation about the tendency of the reward probability invigorates motor performance similarly in both groups
Bayesian multilevel modelling demonstrated that greater expectations about the tendency of the action-outcome probability sped up performance tempo, and did so similarly in the BD and HC groups (Supplementary Table S6, Supplementary Figs. S8, S9). RT was not modulated by trial-wise predictions, as in ref. [57].
Attenuated neural representation of precision-weighted prediction errors updating beliefs about the action-outcome contingencies in bipolar disorder
During the processing of unsigned pwPEs about the tendency of action-outcome contingencies, HC and BD participants exhibited suppression of 8–30 Hz activity across prefrontal, orbitofrontal, cingulate, and motor regions (negative cluster within 0.5–0.9 s, post relative to pre-outcome baseline, PFWER = 0.001, 0.024 in each group; Fig. 3A, B). This suppression effect was less widespread in BD, and the between-group difference was significant across the caudal and rostral ACC, MFG, and OFC; as well as in the SFG and M1 (BD − HC positive cluster at 0.6–0.9 s, PFWER = 0.0130; Fig. 3C, D; Supplementary Fig. S12). Alongside these 8–30 Hz effects, the BD group exhibited significantly attenuated high gamma activity (60–100 Hz) compared to HC (negative cluster, PFWER = 0.0090; Fig. 3E). The latency of the gamma effect coincided with the timing of the alpha-beta modulations, spanning 0.5–0.82 s, and overlapping within the aforementioned ROIs. No significant within-subject changes in gamma activity to the unsigned pwPE regressor were observed in either group (Supplementary Material).
A Source reconstruction of MEG signals was carried out with linearly constrained minimum variance (LCMV) beamforming. The statistical analysis of convolution GLM results targeted brain regions implicated in decision-making under uncertainty and reward processing [43, 71,72,73,74,75], associated with impairments in the fronto-striatal reward circuitry in BD [5, 9, 76, 77]: caudal and rostral ACC, OFC (lateral and medial portions: lOFC, mOFC), SFG, caudal and rostral MFG, M1. Panel A illustrates these regions using anatomical labels from the neuroanatomical Desikan-Killiany atlas (DK), utilised to parcellate the cerebral cortex of each participant based on their individual T1-weighted MRI. B Left and centre panels display within-subject effects in time-frequency (TF) images representing oscillatory amplitude responses to unsigned precision-weighted PEs about stimulus outcomes. TF images cover the 4–100 Hz range, including theta (4–6 Hz), alpha (8–12 Hz), beta (14–30 Hz), and gamma (32–100 Hz) activity. The TF images were normalised by subtracting the mean and dividing by standard deviation (SD) of the activity in the [−300, −50] ms pre-outcome interval, and thus are presented in SD units. Significant within-subject effects are outlined in black for the HC (left) and BD (centre) groups (cluster-based permutation tests, negative cluster within 0.5–0.9 s post relative to pre-outcome baseline, PFWER = 0.001, 0.024 in each group, respectively). Although no within-subject effects in BD were observed in the illustrated SFG label, there were effects across other ROIs. The right panel shows the between-group differences, significant in a cluster-based permutation test (positive cluster within 8–30 Hz, PFWER = 0.0130; negative cluster within 60–100 Hz, PFWER = 0.0090; N = 21 BD and 27 HC independent samples). The time point 0 s marks the onset of outcome presentation.
C, D Panels illustrate between-group effects in the alpha (C) and beta (D) ranges, attributed to more pronounced alpha and beta suppression in HC than in BD participants during encoding of unsigned pwPE on level 2. Effects are depicted in ROIs including the cACC, lOFC, SFG, M1. E Similar to panels C and D but in the gamma range, showing that unsigned pwPE were associated with increases in TF amplitude in gamma range for HC participants, yet with gamma attenuation in BD participants, and across a similar range of ROIs. Labels denote the rostral anterior cingulate cortex, rACC; caudal ACC, cACC; superior frontal gyrus, SFG; lateral and medial orbitofrontal cortex, lOFC and mOFC; primary motor cortex, M1; caudal and rostral middle frontal gyrus, cMFG, rMFG.
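The baseline normalisation described in the caption above (subtracting the mean and dividing by the SD of the [−300, −50] ms pre-outcome interval) can be sketched as follows. This is a minimal illustration, not the study's pipeline: it assumes that normalisation is applied separately to each frequency row and that the sample SD is used, and the function name and toy values are hypothetical.

```python
import statistics


def baseline_zscore(tf, times, baseline=(-0.300, -0.050)):
    """Express a time-frequency image in SD units of its pre-outcome baseline.

    tf: list of rows, one per frequency, each a list of amplitudes over time.
    times: sample times in seconds relative to outcome onset (0 s).
    baseline: inclusive (start, end) of the baseline window in seconds.
    """
    start, end = baseline
    idx = [i for i, t in enumerate(times) if start <= t <= end]
    if len(idx) < 2:
        raise ValueError("baseline window must contain at least two samples")
    normalised = []
    for row in tf:
        base = [row[i] for i in idx]
        mean = statistics.fmean(base)
        sd = statistics.stdev(base)  # assumption: sample SD (n - 1 denominator)
        if sd == 0.0:
            raise ValueError("baseline SD is zero; cannot normalise this row")
        normalised.append([(value - mean) / sd for value in row])
    return normalised


if __name__ == "__main__":
    # Toy single-frequency example with five time samples (not study data).
    times = [-0.3, -0.2, -0.1, 0.0, 0.1]
    tf = [[1.0, 2.0, 3.0, 4.0, 6.0]]
    print(baseline_zscore(tf, times))
```

Expressing amplitudes in baseline SD units puts frequencies with very different raw amplitudes on a common scale before group comparison.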
In addition, for the uncertainty regressors σ2 and σ3, despite a significant widespread increase in 8–30 Hz activity to estimation uncertainty σ2 in HC, no significant between-group differences were observed (Supplementary Fig. S13). Regarding theta modulation by win and lose events, no significant differences were observed between groups either. However, as expected, both groups showed significant increases in theta activity from baseline in the ACC, extending to prefrontal and orbitofrontal ROIs (Supplementary Fig. S14).
In a post-hoc analysis, we investigated alpha and beta raw power during inter-trial intervals. This aimed to determine whether the reduced suppression in the 8–30 Hz range to the pwPE regressor in BD indicated a limited dynamic range of activity at these frequencies. Significantly lower power was observed in BD compared to HC, yet exclusively at 13–20 Hz. This effect emerged in most of the ROIs where the pwPE effect was expressed (Supplementary Fig. S15; Supplementary Material).
Frequency-domain functional connectivity patterns during unsigned pwPE processing
We next assessed group differences in the directionality of information flow during outcome processing for trials with large unsigned pwPEs updating beliefs at level 2. The BD cohort exhibited significantly larger TRGC coefficients than HC participants from the cACC to the rMFG and rACC, as well as from the SFG to the cMFG, in the beta frequency range (PFDR = 0.0032, 0.0064, 0.0064, respectively; Fig. 4). The effect from the cACC to the rMFG extended to the alpha range (Fig. 4A, C). These findings indicate stronger evidence for statistical dependencies between sources in the identified directions for BD than HC in the beta (alpha) range. Importantly, these between-group effects were not attributable to differences in signal-to-noise ratio (Supplementary Fig. S16).
A Comparison of TRGC estimates in the alpha band for healthy control participants (HC, left column), bipolar disorder patients (BD, centre), and their difference (BD-HC, right column). The direction of information flow goes from rows to columns, with positive coefficients denoting increased predictability in that direction, while negative coefficients denote the reverse (increased predictability from column to row). Between-group statistical analysis was conducted in the above-diagonal values. Anatomical labels represent our regions of interest, bilaterally. Labels are displayed for one hemisphere. The coloured pixel in the right panel indicates a significant between-group difference in TRGC metric, after FDR control, due to increased evidence for TRGC from the caudal ACC to the rostral MFG in BD (PFDR = 0.0032). B Same as A but for the beta band, illustrating a significantly larger TRGC metric in BD than HC from the cACC to the rMFG and rACC, as well as from the SFG to the cMFG (PFDR = 0.0032, 0.0064, 0.0064, respectively). C Left: Illustration of the TRGC metric from cACC to the rMFG between 8–30 Hz for HC (green line: mean and SEM as shaded area) and BD (purple line: mean and SEM). The horizontal black line denotes the frequency bins of significant differences after FDR control, shown in A. Middle: Same as the left panel but for the TRGC metric from cACC to rACC, showing beta effects. Right: Same as left and middle panels, exhibiting larger TRGC metric values in BD than HC from SFG to the cMFG in the beta range. Labels: rACC, rostral anterior cingulate cortex; cACC, caudal ACC; cMFG, caudal middle frontal gyrus; rMFG, rostral MFG; lOFC, lateral orbitofrontal cortex; mOFC, medial OFC; SFG, superior frontal gyrus; M1, primary motor cortex.
Discussion
Completing our reward-based motor decision-making task, euthymic BD participants demonstrated lower win rates and a decreased tendency to repeat rewarded actions than HC, despite similar post-loss decision-making behaviour. Furthermore, employing the HGF to probe the computational processes underpinning decision-making, we found that BD participants expected more environmental volatility than HC, leading to a more stochastic mapping from beliefs to actions and higher switch rates, particularly after wins. These findings align with previous reports of heightened risk-taking and inconsistent behaviour in BD [89, 90], mirroring elevated win-switch tendencies in BD adolescents [91] and deficits in response reversal during remission [3, 4]. This suggests that decisions in euthymic BD are misaligned with their beliefs about recent successes, favouring suboptimal actions due to an overestimation of environmental changes, potentially overriding the influence of their beliefs about action-outcome contingencies on decisions.
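For concreteness, the behavioural indices referred to above can be computed along the following lines. This is a minimal sketch under assumed definitions: the win-stay rate is taken as the proportion of rewarded trials followed by the same choice, and the switch rate as the proportion of trial-to-trial transitions on which the choice changed. The function name and the example sequences are hypothetical and are not study data.

```python
def win_stay_and_switch_rates(choices, outcomes):
    """Compute (win_stay_rate, switch_rate) from one participant's trial sequence.

    choices: sequence of action labels, one per trial.
    outcomes: sequence of 1 (win) or 0 (lose), aligned with choices.
    Returns None for a rate whose denominator is zero.
    """
    if len(choices) != len(outcomes):
        raise ValueError("choices and outcomes must have the same length")
    wins = stays_after_win = switches = 0
    # Pair each trial with the choice made on the following trial.
    for prev_choice, prev_outcome, next_choice in zip(choices, outcomes, choices[1:]):
        stayed = next_choice == prev_choice
        if not stayed:
            switches += 1
        if prev_outcome == 1:
            wins += 1
            if stayed:
                stays_after_win += 1
    n_transitions = max(len(choices) - 1, 0)
    win_stay_rate = stays_after_win / wins if wins else None
    switch_rate = switches / n_transitions if n_transitions else None
    return win_stay_rate, switch_rate


if __name__ == "__main__":
    # Hypothetical five-trial sequence with two available actions.
    choices = ["A", "A", "B", "B", "A"]
    outcomes = [1, 0, 1, 1, 0]
    print(win_stay_and_switch_rates(choices, outcomes))
```

The outcome of the final trial is not used, because no subsequent choice exists against which to score it.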
Despite expecting increased volatility, BD participants were slower to adjust their expectations of action-outcome contingencies compared to HC, with a lower tonic volatility parameter ω2 indicating slower adaptation. Similar results in HGF modelling for paranoia [32] suggest that this propensity to anticipate change without learning from it appropriately may be a common feature across paranoia and BD. Additionally, despite similar residual symptom levels in BD and HC, trait anxiety in BD correlated with volatility estimates, aligning with findings that high trait anxiety exacerbates difficulties in adapting to environmental changes [45, 92].
Of relevance in BD, we observed that residual mania symptoms negatively correlated with estimation uncertainty, σ2, which scales the influence of PEs about action-outcomes on level-2 belief updating. Therefore, those with higher mania scores struggled more with updating these predictions. Given that early relapse in BD has been associated with a reduction in an empirical measure of belief updating in response to positive feedback [13], future work could investigate if computational metrics of belief updating like σ2 enhance prediction of clinical progression over behavioural indicators. Further investigations should also explore the effect of comorbid anxiety on volatility responses and relapse.
Despite deficits in decision-making and baseline executive function in our BD sample, motor performance invigoration was comparable to HC, indicating preserved motivational drive in euthymic BD. This contrasts with previous findings that rewards and success amplify energy and effort in BD [4, 56]. Our Bayesian analyses revealed a similar sensitivity of performance tempo to expectations about reward contingencies in both groups, highlighting that the alterations in euthymic BD were confined to decision-making processes.
On a neural level, convolution modelling on source-reconstructed time-frequency activity revealed that BD individuals had attenuated neural representations of the unsigned pwPEs that update beliefs about action-outcome contingencies, compared to HC. Relative to HC, this was marked by decreased gamma and increased alpha-beta amplitude changes (that is, weaker alpha-beta suppression) 0.5–0.9 s post-outcome across multiple PFC, OFC, ACC, and motor regions. Spatial effects in anatomical PFC labels corresponded with functional vmPFC, dmPFC, and dlPFC, aligning with the neural correlates of decision-making under uncertainty [43, 72,73,74], and BD-specific neural alterations during reward processing [5, 9, 76, 77].
Recent rhythm-based formulations of predictive coding suggest distinct roles of oscillatory activity at different frequencies in conveying predictions and PE during perception [42, 93]. Alpha and beta oscillations in deep cortical layers are implicated in conveying top-down predictions, while gamma oscillations in superficial layers are associated with the representation of PE, particularly in sensory cortices and related areas [42, 93]. This division has received empirical validation in both human and animal studies, supporting generative models like predictive coding [94,95,96] and hierarchical Bayesian inference [44, 97], extending across perceptual and cognitive domains [43, 98]. In models of hierarchical Bayesian inference, like the HGF, these oscillatory activities may underpin pwPE encoding, demonstrating an antithetical modulation of alpha/beta and gamma activity [44]. The observed dysregulation of these rhythms in conditions like anxiety [45, 46] suggests a neurophysiological basis for symptoms resulting from imbalances in belief updating.
Our findings indicate that in euthymic BD, exacerbated alpha and beta activity may inhibit gamma activity during unsigned pwPE encoding, potentially accounting for maladaptive belief updating. This may reflect an under-reliance on using predictions about action-outcome contingencies to optimise behaviour, in line with the computational results. Such rhythmic changes match electrophysiological evidence of heightened beta and reduced gamma activity in BD during oddball processing [99, 100]. Moreover, using the TRGC to assess directional influences in frequency-domain activity, we observed stronger evidence for beta-band directional flow in BD compared to HC, from cACC to both rACC and rMFG, and from SFG to cMFG, during trials with larger unsigned pwPEs. TRGC values increased in BD but decreased in HC, aligning with expectations from primate research where beta-band Granger Causality in the PFC decreases during unpredictable trials, a pattern suggesting normative responses [42]. Thus, our study revealed that euthymic BD was associated with both altered frequency-domain amplitude changes and altered functional connectivity during belief updating.
Insufficient GABAergic neurotransmission and excessive glutamatergic activity have been linked to electrophysiological alterations in BD [101]. Considering that mood stabilisers for BD, such as valproate and lithium, may have opposing effects on beta activity and potentially on beta/gamma connectivity [101, 102], a promising avenue for future research is to assess the modulation of alpha-beta and gamma amplitude and connectivity during pwPE encoding in BD as potential markers for tracking treatment response and for diagnostic purposes. A key limitation of our study is the inclusion of patients on diverse psychiatric medications, including mood stabilisers, antipsychotics, and antidepressants. These treatments impact various neurotransmitter systems like dopamine and serotonin, affecting neural and behavioural aspects of decision-making [50, 103]. The varied effects of these medications may have influenced the magnitudes of the effects reported, a factor future research should consider. However, control analyses showed that medication types did not account for differences in the main or exploratory computational analyses. Additionally, the study was not preregistered, yet all analyses followed established pipelines from our recent work involving similar tasks [45, 57], except for the TRGC analysis. This was specifically designed based on similar Granger-causality analyses that assess rhythm-based hypotheses of predictive processing [42]. Lastly, our study did not contrast the HGF framework with alternatives like Bayesian change-point models [104] or those jointly estimating volatility and stochasticity [23]. Future research would benefit from such comparisons to validate the computational processes underlying belief updating alterations in BD. 
Integrating dynamic models of mood in BD [16, 38, 39] with the HGF framework in longitudinal studies will also be crucial for determining how markers identified in euthymic BD—overestimation of volatility and lower ω2—relate to changes across BD episodes, including depression, mania/hypomania, and mixed states. We tentatively propose that overestimation of volatility may be a trait marker of BD, with a stronger effect on reward or punishment learning depending on the episode.
In sum, our findings highlight significant alterations in belief updating among BD individuals during euthymia, when learning reward-based probabilistic mappings in volatile environments, without affecting the motivational aspects of motor execution. Importantly, the identification of frequency-domain amplitude and functional connectivity alterations underpinning these computational maladaptations provides crucial insights for enhancing relapse prediction and monitoring treatment response in future research.
Data availability
Behavioural data are publicly available at the Open Science Framework (OSF) repository, https://osf.io/m63a8/. Participants did not consent to the open sharing of their MEG or MRI data. MEG source and convolution modelling analysis codes are publicly available at https://osf.io/m63a8/.
References
Grande I, Berk M, Birmaher B, Vieta E. Bipolar disorder. Lancet. 2016;387:1561–72.
Merikangas KR, Jin R, He JP, Kessler RC, Lee S, Sampson NA, et al. Prevalence and correlates of bipolar spectrum disorder in the world mental health survey initiative. Arch Gen Psychiatry. 2011;68:241–51.
Gorrindo T, Blair RJR, Budhani S, Dickstein DP, Pine DS, Leibenluft E. Deficits on a probabilistic response-reversal task in patients with pediatric bipolar disorder. Am J Psychiatry. 2005;162:1975–7.
Johnson SL, Edge MD, Holmes MK, Carver CS. The behavioral activation system and mania. Annu Rev Clin Psychol. 2012;8:243–67.
Mason L, O’Sullivan N, Montaldi D, Bentall RP, El-Deredy W. Decision-making and trait impulsivity in bipolar disorder are associated with reduced prefrontal regulation of striatal reward valuation. Brain. 2014;137:2346–55.
Trost S, Diekhof EK, Zvonik K, Lewandowski M, Usher J, Keil M, et al. Disturbed anterior prefrontal control of the mesolimbic reward system and increased impulsivity in bipolar disorder. Neuropsychopharmacology. 2014;39:1914–23.
Adida M, Jollant F, Clark L, Guillaume S, Goodwin GM, Azorin J-M, et al. Lithium might be associated with better decision-making performance in euthymic bipolar patients. Eur Neuropsychopharmacol. 2015;25:788–97.
Schreiter S, Spengler S, Willert A, Mohnke S, Herold D, Erk S, et al. Neural alterations of fronto-striatal circuitry during reward anticipation in euthymic bipolar disorder. Psychol Med. 2016;46:3187–98.
Bart CP, Titone MK, Ng TH, Nusslock R, Alloy LB. Neural reward circuit dysfunction as a risk factor for bipolar spectrum disorders and substance use disorders: a review and integration. Clin Psychol Rev. 2021;87:102035. https://doi.org/10.1016/j.cpr.2021.102035.
Adida M, Jollant F, Clark L, Besnier N, Guillaume S, Kaladjian A, et al. Trait-related decision-making impairment in the three phases of bipolar disorder. Biol Psychiatry. 2011;70:357–65.
Brambilla P, Perlini C, Bellani M, Tomelleri L, Ferro A, Cerruti S, et al. Increased salience of gains versus decreased associative learning differentiate bipolar disorder from schizophrenia during incentive decision making. Psychol Med. 2013;43:571–80.
Powers RL, Russo M, Mahon K, Brand J, Braga RJ, Malhotra AK, et al. Impulsivity in bipolar disorder: relationships with neurocognitive dysfunction and substance use history. Bipolar Disord. 2013;15:876–84.
Ossola P, Garrett N, Sharot T, Marchesi C. Belief updating in bipolar disorder predicts time of recurrence. Elife. 2020;9:1–17.
Mason L, Eldar E, Rutledge RB. Mood instability and reward dysregulation—a neurocomputational model of bipolar disorder. JAMA Psychiatry. 2017;74:1275.
Eldar E, Felso V, Cohen JD, Niv Y. A pupillary index of susceptibility to decision biases. Nat Hum Behav. 2021;5:653–62.
Pulcu E, Saunders KEA, Harmer CJ, Harrison PJ, Goodwin GM, Geddes GR, et al. Using a generative model of affect to characterize affective variability and its response to treatment in bipolar disorder. Proc Natl Acad Sci USA. 2022;119:e2202983119. https://doi.org/10.1073/pnas.2202983119.
Eldar E, Pessiglione M, van Dillen L. Positive affect as a computational mechanism. Curr Opin Behav Sci. 2021;39:52–57.
Erdman A, Eldar E. The computational psychopathology of emotion. Psychopharmacology. 2023;240:2231–8.
Pessiglione M, Heerema R, Daunizeau J, Vinckier F. Origins and consequences of mood flexibility: a computational perspective. Neurosci Biobehav Rev. 2023;147:105084. https://doi.org/10.1016/j.neubiorev.2023.105084.
Eldar E, Niv Y. Interaction between emotional state and learning underlies mood instability. Nat Commun. 2015;6:6149. https://doi.org/10.1038/ncomms7149.
Moningka H, Mason L. Misperceiving momentum: computational mechanisms of biased striatal reward prediction errors in bipolar disorder. Biol Psychiatry Glob Open Sci. 2024;4:100330. https://doi.org/10.1016/j.bpsgos.2024.100330.
Behrens TEJ, Woolrich MW, Walton ME, Rushworth MFS. Learning the value of information in an uncertain world. Nat Neurosci. 2007;10:1214–21.
Piray P, Daw ND. A model for learning based on the joint estimation of stochasticity and volatility. Nat Commun. 2021;12:6587. https://doi.org/10.1038/s41467-021-26731-9.
Emanuel A, Eldar E. Emotions as computations. Neurosci Biobehav Rev. 2023;144:104977. https://doi.org/10.1016/j.neubiorev.2022.104977.
Diaconescu AO, Mathys C, Weber LAE, Daunizeau J, Kasper L, Lomakina EI, et al. Inferring on the intentions of others by Hierarchical Bayesian learning. PLOS Comput Biol. 2014;10:e1003810. https://doi.org/10.1371/journal.pcbi.1003810.
Mathys C, Daunizeau J, Friston KJ, Stephan KE. A Bayesian foundation for individual learning under uncertainty. Front Hum Neurosci. 2011;5:39. https://doi.org/10.3389/fnhum.2011.00039.
Mathys CD, Lomakina EI, Daunizeau J, et al. Uncertainty in perception and the Hierarchical Gaussian filter. Front Hum Neurosci. 2014;8:825. https://doi.org/10.3389/fnhum.2014.00825.
Friston K, Kilner J, Harrison L. A free energy principle for the brain. J Physiol. 2006;100:70–87.
Friston K, Schwartenbeck P, FitzGerald T, Moutoussis M, Behrens T, Dolan RJ. The anatomy of choice: active inference and agency. Front Hum Neurosci. 2013;7:598. https://doi.org/10.3389/fnhum.2013.00598.
Friston K, Kiebel S. Predictive coding under the free-energy principle. Philos Trans R Soc B Biol Sci. 2009;364:1211–21.
Kok P, De Lange FP. Predictive coding in sensory cortex. In: An Introduction to Model-Based Cognitive Neuroscience 1st edn., 17, New York: Springer; 2015.
Reed EJ, Uddenberg S, Suthaharan P, Mathys CD, Taylor JR, Groman SM, et al. Paranoia as a deficit in non-social belief updating. Elife. 2020;9:1–55.
Deserno L, Boehme R, Mathys C, Katthagen T, Kaminski J, Stephan KE, et al. Volatility estimates increase choice switching and relate to prefrontal activity in Schizophrenia. Biol Psychiatry Cogn Neurosci Neuroimaging. 2020;5:173–83.
Friston K. Computational psychiatry: from synapses to sentience. Mol Psychiatry. 2022;28:256–68.
Kas MJH, Hyman S, Williams LM, Hidalgo-Mazzei D, Huys QJM, Hotopf M, et al. Towards a consensus roadmap for a new diagnostic framework for mental disorders. Eur Neuropsychopharmacol. 2024;90:16–27.
Clark JE, Watson S, Friston KJ. What is mood? A computational perspective. Psychol Med. 2018;48:2277–84.
Aylward J, Valton V, Ahn WY, Bond RL, Dayan P, Roiser JP, et al. Altered learning under uncertainty in unmedicated mood and anxiety disorders. Nat Hum Behav. 2019;3:1116–23.
Cochran AL, Schultz A, McInnis MG, Forger DB. A comparison of mathematical models of mood in bipolar disorder. Comput Neurol Psychiatry. 2017:315–41.
Nunes A, Singh S, Allman J, Becker S, Ortiz A, Trappenberg T, et al. A critical evaluation of dynamical systems models of bipolar disorder. Transl Psychiatry. 2022;12:1–12.
Arnal LH, Giraud AL. Cortical oscillations and sensory predictions. Trends Cogn Sci. 2012;16:390–8.
Sedley W, Gander PE, Kumar S, Kovach CK, Oya H, Kawasaki H, et al. Neural signatures of perceptual inference. Elife. 2016;5:e11476. https://doi.org/10.7554/eLife.11476.
Bastos AM, Lundqvist M, Waite AS, Kopell N, Miller EK. Layer and rhythm specificity for predictive routing. Proc Natl Acad Sci. 2020;117:31459–69.
Domenech P, Rheims S, Koechlin E. Neural mechanisms resolving exploitation-exploration dilemmas in the medial prefrontal cortex. Science. 2020;369:eabb0184. https://doi.org/10.1126/science.abb0184.
Auksztulewicz R, Friston KJ, Nobre AC. Task relevance modulates the behavioural and neural effects of sensory predictions. PLOS Biol. 2017;15:e2003143. https://doi.org/10.1371/journal.pbio.2003143.
Hein TP, Gong Z, Ivanova M, Fedele T, Nikulin V, Ruiz MH. Anterior cingulate and medial prefrontal cortex oscillations underlie learning alterations in trait anxiety in humans. Commun Biol. 2023;6:271. https://doi.org/10.1038/s42003-023-04628-1.
Hein TP, Ruiz MH. State anxiety alters the neural oscillatory correlates of predictions and prediction errors during reward-based learning. Neuroimage. 2022;249:118895. https://doi.org/10.1016/j.neuroimage.2022.118895.
Tarasi L, Trajkovic J, Diciotti S, Pellegrino G, Ferri F, Ursino M, et al. Predictive waves in the autism-schizophrenia continuum: A novel biobehavioral model. Neurosci Biobehav Rev. 2022;132:1–22.
Rushworth MFS, Behrens TEJ. Choice, uncertainty and value in prefrontal and cingulate cortex. Nat Neurosci. 2008;11:389–97.
Rouault M, Drugowitsch J, Koechlin E. Prefrontal mechanisms combining rewards and beliefs in human decision-making. Nat Commun. 2019;10:301. https://doi.org/10.1038/s41467-018-08121-w.
Jiménez E, Solé B, Arias B, Mitjans M, Varo C, Reinares M, et al. Characterizing decision-making and reward processing in bipolar disorder: a cluster analysis. Eur Neuropsychopharmacol. 2018;28:863–74.
Summerside EM, Shadmehr R, Ahmed AA. Vigor of reaching movements: reward discounts the cost of effort. J Neurophysiol. 2018;119:2347–57.
Sedaghat-Nejad E, Herzfeld DJ, Shadmehr R. Reward prediction error modulates saccade vigor. J Neurosci. 2019;39:5010–7.
da Silva JA, Tecuapetla F, Paixão V, Costa RM. Dopamine neuron activity before action initiation gates and invigorates future movements. Nature. 2018;554:244–8.
Ashok AH, Marques TR, Jauhar S, Nour MM, Goodwin GM, Young AH, et al. The dopamine hypothesis of bipolar affective disorder: the state of the art and implications for treatment. Mol Psychiatry. 2017;22:666–79.
Zhang CY, Cai X, Guo L, Wang L, Liu Z, Luo XJ, et al. Genetic evidence for the “dopamine hypothesis of bipolar disorder”. Mol Psychiatry. 2023;28:532–5.
Fulford D, Johnson SL, Llabre MM, Carver CS. Pushing and coasting in dynamic goal pursuit. Psychol Sci. 2010;21:1021–7.
Tecilla M, Großbach M, Gentile G, Holland P, Sporn S, Antonini A, et al. Modulation of motor vigor by expectation of reward probability trial-by-trial is preserved in healthy ageing and Parkinson’s disease patients. J Neurosci. 2023;43:1757–77.
World Health Organization. The ICD-10 classification of mental and behavioural disorders: Clinical descriptions and diagnostic guidelines. Geneva: World Health Organization; 1992.
Hein TP, de Fockert J, Ruiz MH. State anxiety biases estimates of uncertainty and impairs reward learning in volatile environments. Neuroimage. 2021;224:117424. https://doi.org/10.1016/j.neuroimage.2020.117424.
Pulcu E, Browning M. The misestimation of uncertainty in affective disorders. Trends Cogn Sci. 2019;23:865–75.
de Berker AO, Rutledge RB, Mathys C, Marshall L, Cross JF, Dolan RJ, et al. Computations of uncertainty mediate acute stress responses in humans. Nat Commun. 2016;7:1–11.
Frässle S, Aponte EA, Bollmann S, Brodersen KH, Do CT, Harrison OK, et al. TAPAS: an open-source software package for translational neuromodeling and computational psychiatry. Front Psychiatry. 2021;12:1–25.
Taulu S, Hari R. Removal of magnetoencephalographic artifacts with temporal signal-space separation: demonstration with single-trial auditory-evoked responses. Hum Brain Mapp. 2009;30:1524–34.
Gramfort A, Luessi M, Larson E, Engemann DA, Strohmeier D, Brodbeck C, et al. MEG and EEG data analysis with MNE-Python. Front Neurosci. 2013;7:267. https://doi.org/10.3389/fnins.2013.00267.
Van Veen BD, van Drongelen W, Yuchtman M, Suzuki A. Localization of brain electrical activity via linearly constrained minimum variance spatial filtering. IEEE Trans Biomed Eng. 1997;44:867–80.
Dale AM, Fischl B, Sereno MI. Cortical surface-based analysis. Neuroimage. 1999;9:179–94.
Fischl B, Sereno MI, Dale AM. Cortical surface-based analysis. Neuroimage. 1999;9:195–207.
Desikan RS, Ségonne F, Fischl B, Quinn BT, Dickerson BC, Blacker D, et al. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage. 2006;31:968–80.
Andreou C, Frielinghaus H, Rauh J, Mußmann M, Vauth S, Braun P, et al. Theta and high-beta networks for feedback processing: a simultaneous EEG–fMRI study in healthy male subjects. Transl Psychiatry. 2017;7:e1016. https://doi.org/10.1038/tp.2016.287.
Paulus MP, Feinstein JS, Simmons A, Stein MB. Anterior cingulate activation in high trait anxious subjects is related to altered error processing during decision making. Biol Psychiatry. 2004;55:1179–87.
Hayden BY, Heilbronner SR, Pearson JM, Platt ML. Surprise signals in anterior cingulate cortex: neuronal encoding of unsigned reward prediction errors driving adjustment in behavior. J Neurosci. 2011;31:4178–87.
Rolls ET, Deco G, Huang CC, Feng J. The human orbitofrontal cortex, vmPFC, and anterior cingulate cortex effective connectome: emotion, memory, and action. Cereb Cortex. 2022;33:330–56.
Nour MM, Dahoun T, Schwartenbeck P, Adams RA, FitzGerald THB, Coello C, et al. Dopaminergic basis for signaling belief updates, but not surprise, and the link to paranoia. Proc Natl Acad Sci USA. 2018;115:E10167–E10176. https://doi.org/10.1073/pnas.1809298115.
Schulreich S, Schwabe L. Causal role of the dorsolateral prefrontal cortex in belief updating under uncertainty. Cereb Cortex. 2021;31:184–200.
Grupe DW, Nitschke JB. Uncertainty and anticipation in anxiety: an integrated neurobiological and psychological perspective. Nat Rev Neurosci. 2013;14:488–501.
Dutra SJ, Man V, Kober H, Cunningham WA, Gruber J. Disrupted cortico‐limbic connectivity during reward processing in remitted bipolar I disorder. Bipolar Disord. 2017;19:661–75.
Acuff HE, Versace A, Bertocci MA, Ladouceur CD, Hanford LC, Manelis A, et al. Baseline and follow-up activity and functional connectivity in reward neural circuitries in offspring at risk for bipolar disorder. Neuropsychopharmacology. 2019;44:1570–8.
Kobayashi K, Hsu M. Neural mechanisms of updating under reducible and irreducible uncertainty. J Neurosci. 2017;37:6972–82.
Litvak V, Jha A, Flandin G, Friston K. Convolution models for induced electromagnetic responses. Neuroimage. 2013;64:388–98.
Stefanics G, Heinzle J, Horváth AA, Stephan KE. Visual mismatch and predictive coding: a computational single-trial ERP study. J Neurosci. 2018;38:4020–30.
Spitzer B, Blankenburg F, Summerfield C. Rhythmic gain control during supramodal integration of approximate number. Neuroimage. 2016;129:470–9.
Haufe S, Nikulin VV, Müller KR, Nolte G. A critical assessment of connectivity measures for EEG data: a simulation study. Neuroimage. 2013;64:120–33.
Pellegrini F, Delorme A, Nikulin V, Haufe S. Identifying good practices for detecting inter-regional linear functional connectivity from EEG. Neuroimage. 2023;277:120218. https://doi.org/10.1016/j.neuroimage.2023.120218.
Grissom RJ, Kim JJ. Effect Sizes for Research. New York: Routledge; 2012.
Ruscio J, Mullen T. Confidence intervals for the probability of superiority effect size measure and the area under a receiver operating characteristic curve. Multivariate Behav Res. 2012;47:201–23.
Wetzels R, Wagenmakers EJ. A default Bayesian hypothesis test for correlations and partial correlations. Psychon Bull Rev. 2012;19:1057–64.
Oostenveld R, Fries P, Maris E, Schoffelen JM. FieldTrip: open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput Intell Neurosci. 2011. https://doi.org/10.1155/2011/156869.
Maris E, Oostenveld R. Nonparametric statistical testing of EEG- and MEG-data. J Neurosci Methods. 2007;164:177–90.
Lasagna CA, Pleskac TJ, Burton CZ, McInnis MG, Taylor SF, Tso IF. Mathematical modeling of risk-taking in bipolar disorder: evidence of reduced behavioral consistency, with altered loss aversion specific to those with history of substance use disorder. Comput Psychiatry. 2022;6:96–116.
Yechiam E, Hayden EP, Bodkins M, O’Donnell BF, Hetrick WP. Decision making in bipolar disorder: a cognitive modeling approach. Psychiatry Res. 2008;161:142–52.
Dickstein DP, Finger EC, Brotman MA, Rich BA, Pine DS, Blair JR, et al. Impaired probabilistic reversal learning in youths with mood and anxiety disorders. Psychol Med. 2010;40:1089–1100.
Browning M, Behrens TE, Jocham G, O’Reilly JX, Bishop SJ. Anxious individuals have difficulty learning the causal statistics of aversive environments. Nat Neurosci. 2015;18:590–6.
Bastos AM, Usrey WM, Adams RA, Mangun GR, Fries P, Friston KJ. Canonical microcircuits for predictive coding. Neuron. 2012;76:695–711.
Xing D, Yeh CI, Burns S, Shapley RM. Laminar analysis of visually evoked activity in the primary visual cortex. Proc Natl Acad Sci. 2012;109:13871–6.
Michalareas G, Vezoli J, van Pelt S, Schoffelen JM, Kennedy H, Fries P. Alpha-beta and gamma rhythms subserve feedback and feedforward influences among human visual cortical areas. Neuron. 2016;89:384–97.
Roberts MJ, Lowet E, Brunet NM, Wal MT, Tiesinga P, Fries P, et al. Robust gamma coherence between macaque V1 and V2 by dynamic frequency matching. Neuron. 2013;78:523–36.
Palmer CE, Auksztulewicz R, Ondobaka S, Kilner JM. Sensorimotor beta power reflects the precision-weighting afforded to sensory prediction errors. Neuroimage. 2019;200:59–71.
van Pelt S, Heil L, Kwisthout J, Ondobaka S, van Rooij I, Bekkering H. Beta- and gamma-band activity reflect predictive coding in the processing of causal events. Soc Cogn Affect Neurosci. 2016;11:973–80.
Ethridge LE, Hamm JP, Shapiro JR, Summerfelt AT, Keedy SK, Stevens MC, et al. Neural activations during auditory oddball processing discriminating schizophrenia and psychotic bipolar disorder. Biol Psychiatry. 2012;72:766–74.
Özerdem A, Güntekin B, Tunca Z, Başar E. Brain oscillatory responses in patients with bipolar disorder manic episode before and after valproate treatment. Brain Res. 2008;1235:98–108.
Lu Z, Wang H, Gu J, Gao F. Association between abnormal brain oscillations and cognitive performance in patients with bipolar disorder: molecular mechanisms and clinical evidence. Synapse. 2022;76:e22247. https://doi.org/10.1002/syn.22247.
Özerdem A, Güntekin B, Atagün I, Turp B, Başar E. Reduced long distance gamma (28–48 Hz) coherence in euthymic patients with bipolar disorder. J Affect Disord. 2011;132:325–32.
Moncrieff J, Cohen D, Porter S. The psychoactive effects of psychiatric medication: the elephant in the room. J Psychoactive Drugs. 2013;45:409–15.
Moens V, Zénon A. Learning and forgetting using reinforced Bayesian change detection. PLOS Comput Biol. 2019;15:e1006713. https://doi.org/10.1371/journal.pcbi.1006713.
Acknowledgements
This research was partially supported by the Basic Research Programme of the National Research University Higher School of Economics (Russian Federation). The research used the Elekta Neuromag 306-channel MEG system at the Centre for Neurocognitive Research (MEG-Centre) in Moscow (Russian Federation).
Author information
Contributions
MI, DSP, AR, VVN and MHR conceptualized the study. MHR developed the methodology. The study was implemented by MI, with input from MHR, DSP, AR and BAV, and MI and GK conducted the experiment. MHR, MI and KG analysed the data, with MI and KG supervised by MHR and with input from VVN. MHR, MI and KG wrote the first draft of the manuscript, which was revised and edited by VVN, DSP, AR, BAV and GK.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ivanova, M., Germanova, K., Petelin, D.S. et al. Frequency-specific changes in prefrontal activity associated with maladaptive belief updating in volatile environments in euthymic bipolar disorder. Transl Psychiatry 15, 13 (2025). https://doi.org/10.1038/s41398-025-03225-6
DOI: https://doi.org/10.1038/s41398-025-03225-6