Abstract
This report sought to employ multi-modal integration of pre-treatment brain (electroencephalogram, resting-state functional magnetic resonance imaging) and blood (immune and metabolic) biomarkers to facilitate causal inference-based treatment selection by establishing predictability of remission to multi-stage antidepressant treatment. Data from two stages of pharmacotherapy in the ‘Establishing Moderators and Biosignatures of Antidepressant Response for Clinical Care for Depression’ (EMBARC) study from participants with both brain and blood biomarkers were included (N = 197). Participants were initially randomized to sertraline or placebo (Stage 1), and depending on clinical response at week 8, their therapy in Stage 2 was either maintained or switched (to sertraline, if a non-responder to placebo, or to bupropion, if a non-responder to sertraline). Three readily accessible clinical features combined with 15 multi-modal features associated with baseline depression severity predicted Stage 2 remission with an AUC of 0.74, 0.71, and 0.73 for sertraline, bupropion, and placebo treatment, respectively. Propensity score matching (causal inference) was conducted across Stage 2 treatment arms, and the same features were used to build a supervised model to produce the probability of remission to the given Stage 2 treatment (the factual outcome), as well as to the alternative treatment not given (the counterfactual). While the accuracy of observed outcomes across treatment arms was 82%, the accuracies of predicted counterfactual (unobserved) outcomes warrant future prospective studies. Prediction of remission at 16 weeks and the associated biomarker-based prediction of counterfactuals suggest that the selected markers are promising features for guiding antidepressant treatment selection.
Introduction
Major depressive disorder (MDD) is the leading cause of disability worldwide [1, 2]. Only a third of patients achieve remission from first-line antidepressant therapies [3], with treatment failures adding to disease and economic burden [4]. Given heterogeneity in disease symptom severity and response trajectories, there is keen interest in precision medicine approaches for treating MDD and individualizing the selection of antidepressants by utilizing prognostic biomarkers and/or biosignatures of disease pathophysiology [5].
Two primary modalities of biomarkers have been studied thus far for establishing predictability of antidepressant response, namely, blood (metabolomics [6], genomics [7], immune-proteomics [8]) and brain (magnetic resonance imaging (MRI) [9], electroencephalographic (EEG) [10]) biomarkers. Machine learning (ML) approaches utilizing these biomarkers as predictors of antidepressant (escitalopram, bupropion) response at 8 or 12 weeks achieved an area under the receiver operating characteristic curve (AUC) of 0.64 to 0.86 [11, 12]. Multimodal integration of functional MRI (fMRI) and blood biomarkers has demonstrated improved performance compared to single-modality models [13]. In a review of studies predicting MDD treatment outcomes using ML methods, only one study used a combination of predictors from three modalities (clinical, cognitive, and neuroimaging) [14]. Thus, there continues to be a knowledge gap on whether the integration of blood- and brain-derived biomarkers could (a) improve predictability of sequential treatment outcomes over a single modality, and (b) aid in treatment selection.
With the goal of elucidating blood or brain biomarkers that could predict multi-stage remission (i.e., 2 or more successive pharmacological treatments for MDD) and eventually facilitate treatment selection, the Establishing Moderators and Biosignatures of Antidepressant Response in Clinical Care (EMBARC) study collected both brain and blood measurements [15]. The EMBARC study was a multisite double-blind randomized clinical trial designed to identify potential biomarkers of antidepressant response [15]. In one prior report, a model using clinical features and pretreatment theta current density, obtained from electroencephalogram (EEG) data, predicted placebo responders and remitters at week 8 with AUCs of 0.73 and 0.76, respectively [16]. Another report using fMRI data from the EMBARC study predicted response to sertraline at week 8 with an AUC of 0.73 [17]. While prior reports using EMBARC data investigated response or remission outcomes at week 8, no study from EMBARC to date has (a) integrated blood and brain biomarkers to predict treatment outcomes, or (b) identified pre-treatment biomarkers that can facilitate treatment selection based on prediction of remission (or lack thereof) at 16 weeks.
The goal of this report (see Fig. 1A for overview) was to employ multi-modal integration of pre-treatment brain and blood biomarkers to facilitate treatment selection (using causal inference) by establishing predictability of remission to multi-stage antidepressant therapy (using supervised ML). The analytical emphasis was on answering the question (referred to as the counterfactual hypothesis in this work), “if a patient was a non-remitter to a treatment ‘x’, what was the likelihood of remission to the alternative treatment ‘y’, based on factual outcomes of participants with similar biological profiles (e.g., similar EEG, fMRI, immune profiles) who were given the alternative treatment ‘y’?” In this work’s context, the counterfactual is the hypothetical scenario that explores the likely outcome had the patient been treated with an alternative antidepressant. The overarching hypothesis of this work is that brain and blood biomarkers associated with depression severity at baseline would facilitate the prediction of Stage 2 remission, thereby generating the counterfactual hypothesis regarding the likelihood of remission to the other antidepressant treatment.
Fig. 1 A Breakdown of the study arms of EMBARC and the classification of Subgroups A, B, and C. The workflow of training the model on Subgroup A and testing on Subgroups B and C is repeated on several samples with varying levels of data completeness. B Receiver operating characteristic curves of training and testing sets. ACC: accuracy. AUC: area under the curve.
Materials and methods
Study design and population
This is a secondary analysis of data from the EMBARC study (clinicaltrials.gov ID: NCT01407094) [15]. Briefly, the EMBARC study was a multisite double-blind randomized clinical trial enrolling 296 outpatients diagnosed with nonpsychotic MDD, designed to identify pretreatment and early-treatment (week 1) clinical, neuroimaging, and blood moderators for predicting response to sertraline or placebo. The EMBARC trial was approved by the Institutional Review Board at each clinical site, and all patients provided informed consent upon study entry. Demographic information and clinical characteristics of participants are provided in Table 1. The EMBARC study comprised two stages. In Stage 1, the first 8 weeks of the study, all participants were randomized to either placebo (N = 150) or sertraline (N = 146). At week 8 (end of Stage 1), participants were assessed with the Clinical Global Impressions (CGI) improvement scale, and those receiving a score of less than “much improved” were considered non-responders [18]. In Stage 2, placebo non-responders switched to sertraline, and sertraline non-responders switched to bupropion. Responders at Stage 1 continued with their previously assigned treatment.
Primary outcome measure
The primary outcome measure of the EMBARC study was symptom severity, measured using the Hamilton Depression Rating Scale (HAM-D) [19]. Remission at week 16 was defined as a 17-item HAM-D total score ≤ 7 [20].
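As a minimal illustration of the remission criterion, the following Python sketch applies the ≤ 7 cutoff to week-16 HAM-D totals. The function name, participant identifiers, and scores are hypothetical and are not taken from the EMBARC analysis code.

```python
REMISSION_CUTOFF = 7  # 17-item HAM-D total at or below this value counts as remission


def is_remitter(hamd17_total: int) -> bool:
    """Return True when a week-16 HAM-D17 total meets the remission definition."""
    if hamd17_total < 0:
        raise ValueError("HAM-D total scores cannot be negative")
    return hamd17_total <= REMISSION_CUTOFF


# Hypothetical week-16 totals for three participants
week16_scores = {"P01": 5, "P02": 7, "P03": 12}
remission = {pid: is_remitter(score) for pid, score in week16_scores.items()}
```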
Brain and blood measures
Functional magnetic resonance imaging (fMRI)
MRI scans were collected on 3 Tesla MR systems at all EMBARC sites. Briefly, the data were preprocessed with a standard pipeline comprising slice-timing correction, motion correction (realignment and unwarping), spatial normalization, and smoothing with an 8 mm Gaussian kernel. Complete fMRI preprocessing details are listed in Supplementary Materials 1.
Electroencephalography (EEG)
As described in previous studies, different systems were used at each site to acquire high-density EEG (60 to 128 electrodes) [21], and harmonized using established procedures [22]. Additional details of EEG preprocessing are in Supplementary Materials 1.
Blood markers
Peripheral blood samples were collected in EDTA tubes at all EMBARC sites. The complete list of analytes measured and the data processing details are available in Supplementary Materials 1, Supplementary Tables 1 and 2, and Supplementary Fig. 1.
Missingness
Prior work with EMBARC data has reported the availability of fMRI and EEG data [23, 24]. Any missingness in the top plasma immune features identified by the predictive power score (PPS) feature selection algorithm was to be imputed using k-nearest neighbors.
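The planned imputation strategy can be sketched as follows. This is a simplified, standard-library Python illustration of k-nearest-neighbor imputation, not the implementation used in the study: the distance (root mean squared difference over the columns two rows share), the value of k, and the toy data are all assumptions, and in practice features would be put on a common scale first.

```python
import math


def knn_impute(rows, k=3):
    """Fill None entries with the mean of that column among the k nearest rows."""
    imputed = [list(r) for r in rows]  # copy so the input is not modified
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue  # a donor must have the missing column observed
                shared = [c for c in range(len(row))
                          if c != j and row[c] is not None and other[c] is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            if not candidates:
                continue  # leave the gap when no donor row is available
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Toy data: the last row is missing its third feature
data = [
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 12.0],
    [5.0, 6.0, 50.0],
    [1.05, 2.05, None],
]
filled = knn_impute(data, k=2)
```

With k = 2, the missing value is filled from the two rows closest on the observed features, so the distant third row does not contribute.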
Machine learning (ML) strategy for predicting stage-2 remission
The ML workflow (see Fig. 1A) identified a small set of pre-treatment brain-blood biomarkers associated with baseline depression severity to (a) predict 16-week outcomes, and (b) serve as inputs to a causal inference algorithm that can match a Stage 1 non-responder to another patient (based on biomarker similarity) who achieved remission to alternative therapies.
Feature selection
Feature selection was conducted using the PPS algorithm [25] to identify pre-treatment EEG, fMRI and blood measures associated with baseline HAM-D scores, which served as the target variable. The PPS algorithm identifies and ranks variables based on a data-type agnostic score for linear and non-linear relationships ranging from 0 to 1, with 1 indicating perfect predictive power. Baseline depression severity was selected as the target variable because it is often reported as the top predictor of remission [26]. For each modality, the algorithm was applied to all participants who had baseline symptom severity data. The top five features per modality as indicated by PPS values (listed in Supplementary Table 3) were used to train and test prediction models.
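To make the 0-to-1 scale concrete, the sketch below shows the normalization commonly used by PPS implementations for a numeric target: the held-out mean absolute error (MAE) of a single-feature model is compared with the MAE of a naive model that always predicts the median, and the result is floored at 0. The model fitting and resampling steps performed by the PPS package are omitted here, so the held-out predictions, feature names, and HAM-D values are hypothetical inputs.

```python
from statistics import median


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pps_regression(y_true, y_pred):
    """Normalize held-out MAE against a naive median predictor; floor at 0."""
    naive = median(y_true)
    mae_naive = mean_absolute_error(y_true, [naive] * len(y_true))
    if mae_naive == 0:
        return 0.0  # constant target: nothing to predict
    mae_model = mean_absolute_error(y_true, y_pred)
    return max(0.0, 1.0 - mae_model / mae_naive)


# Hypothetical baseline HAM-D scores and held-out predictions from two
# single-feature models (names are placeholders, not EMBARC features)
hamd = [14, 18, 22, 26, 30]
held_out_predictions = {
    "feature_a": [15, 18, 22, 25, 29],  # tracks the target closely
    "feature_b": [22, 22, 22, 22, 22],  # no better than predicting the median
}
scores = {name: pps_regression(hamd, preds)
          for name, preds in held_out_predictions.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Ranking features by this score and keeping the top entries mirrors the selection of the top five features per modality described above.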
Training and external validation cohorts
The dataset was stratified into three independent subgroups: Subgroup A for training prediction models, and Subgroups B and C for external validation of predictive performance (see Fig. 1A). The demographic characteristics of each subgroup are available in Supplementary Table 4. Subgroup A was defined as participants who received sertraline as a treatment at week 16, including sertraline responders and placebo non-responders who switched to sertraline at Stage 2. Subgroup B was defined as participants who were sertraline non-responders and were given bupropion during Stage 2. Subgroup C was defined as placebo responders.
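The subgroup definitions follow directly from the Stage 1 arm and the week-8 response status. A small Python sketch of that mapping is given here for clarity; the function name and return format are illustrative choices.

```python
def stage2_assignment(stage1_arm: str, responder: bool):
    """Map Stage 1 arm and week-8 response to (Stage 2 treatment, subgroup)."""
    if stage1_arm == "sertraline":
        # Responders stay on sertraline; non-responders switch to bupropion
        return ("sertraline", "A") if responder else ("bupropion", "B")
    if stage1_arm == "placebo":
        # Responders stay on placebo; non-responders switch to sertraline
        return ("placebo", "C") if responder else ("sertraline", "A")
    raise ValueError(f"unknown Stage 1 arm: {stage1_arm!r}")


assignments = {
    (arm, resp): stage2_assignment(arm, resp)
    for arm in ("sertraline", "placebo")
    for resp in (True, False)
}
```

Subgroup A therefore pools everyone on sertraline in Stage 2, whether they started on sertraline or on placebo.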
Algorithm selection
The goal was to identify a supervised ML algorithm that achieved the best predictive performance when combining all clinical and biological features identified by PPS. The tested suite of algorithms included random forest, K-nearest neighbors, extreme gradient-boosted decision tree-based ensembles (XGBoost), support vector machine (SVM), and penalized regression. Nested cross-validation was used to train and test all algorithms, using 75% of the sample in Subgroup A allocated for training. The training and testing split within Subgroup A was repeated 10 times, and results were averaged over splits. Averaged AUC, accuracy, sensitivity, and specificity for these models are listed in Table 2. The best performing algorithm (defined by highest AUC) was selected to be retrained and retested on other combinations of clinical and biological features (e.g., blood & EEG), with the sample sizes of these available features illustrated in Fig. 1A. This was done to assess the utility of multi-modal integration of features on model performance. Prediction performance (AUC, accuracy, sensitivity, specificity, null information rate) was reported for both training using Subgroup A samples and testing on Subgroup B & C (see Fig. 1B). The null information rate (NIR) is equal to the accuracy of a model that predicts the majority class and represents the null hypothesis for this work. Model-specific variable importance was measured using the ‘vip’ function available in R [27].
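Two of the reported metrics are simple to state in code. The sketch below, using only the Python standard library, computes the null information rate as the majority-class proportion and the AUC as the probability that a randomly chosen remitter receives a higher predicted score than a randomly chosen non-remitter (ties counted as half). The labels and scores are made-up values for illustration, and this pairwise formulation is one standard way to compute AUC, not necessarily the routine used in the study.

```python
from collections import Counter


def null_information_rate(labels):
    """Accuracy obtained by always predicting the majority class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def auc(labels, scores):
    """Share of (remitter, non-remitter) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case from each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical remission labels (1 = remitter) and predicted probabilities
labels = [1, 1, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.7, 0.6, 0.3]

nir = null_information_rate(labels)
model_auc = auc(labels, scores)
```

A model is only informative if its accuracy exceeds the null information rate, which is why that rate serves as the reference point for the reported accuracies.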
Causal inference for facilitating treatment selection
Propensity score
The goal of using causal inference methods was to generate a counterfactual (unobserved) hypothesis, i.e., whether subjects who did not achieve remission at Stage 2 would have had a higher likelihood (probability) of remission with an alternative Stage 2 treatment (see Fig. 2). On the placebo arm, probabilities of Stage 2 remission were calculated for placebo and sertraline, and on the sertraline arm, the same probability was calculated for sertraline and bupropion. Propensity score matching was done using the “MatchIt” package in R to match individual participants across Stage 2 treatment arms based on their biological profile (predictive features as determined by the supervised ML approach) [28]. It is important to note that biological variables used for matching should not be confused with biomarkers that inform drug mechanism.
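The matching step can be illustrated with a simplified sketch. The Python code below performs greedy 1:1 nearest-neighbor matching on precomputed propensity scores, without replacement and with an optional caliper. The text does not state which matching method or settings were used, so the algorithm, processing order, caliper, participant identifiers, and scores here are all assumptions made for illustration.

```python
def match_nearest(treated, control, caliper=None):
    """Greedy 1:1 nearest-neighbor matching on propensity score, no replacement."""
    available = dict(control)  # control units not yet matched
    pairs = []
    # Process treated units from highest to lowest propensity score
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1], reverse=True):
        if not available:
            break
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue  # closest control is too far away; leave unmatched
        pairs.append((t_id, c_id))
        del available[c_id]
    return pairs


# Hypothetical propensity scores for switched vs continued participants
switched = {"S1": 0.72, "S2": 0.55, "S3": 0.31}
continued = {"C1": 0.70, "C2": 0.50, "C3": 0.35, "C4": 0.10}
pairs = match_nearest(switched, continued, caliper=0.1)
```

Each matched pair links a participant to someone with a similar biomarker-derived score who received the other Stage 2 treatment, which is what allows the factual outcome of one to inform the counterfactual of the other.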
Fig. 2 Pipeline of counterfactual prediction production using biomarker-based propensity score matching. A Calculate propensity scores using predictive features. B Match participants across Stage 2 arms using propensity scores. C Utilize penalized regression to calculate conditional probabilities and produce counterfactual predictions.
Generating the counterfactual hypothesis
Causal models were built on both the sertraline arm and the placebo arm, predicting how subjects would respond not only to the given Stage 2 treatment but also to the alternative treatment. After matching, a penalized logistic regression model was used to produce the probability of remitting to both the treatment actually given and the alternative treatment. In this context, we asked the following questions (see Fig. 2): “would a sertraline non-remitter at Stage 2 have remitted to bupropion had they been switched from sertraline after Stage 1?”, “would a bupropion non-remitter at Stage 2 have eventually remitted from extended exposure to sertraline?”, and “would a non-remitter to placebo at Stage 2 have remitted to sertraline?”
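The idea of producing a factual and a counterfactual probability for the same participant can be sketched as below. This Python example assumes one set of logistic regression coefficients per treatment and evaluates both on the same feature vector; the coefficient values, feature names, and the per-treatment parameterization are hypothetical stand-ins, since the fitted models are not reported in the text.

```python
import math

# Hypothetical coefficients standing in for fitted penalized logistic models
MODELS = {
    "sertraline": {"intercept": -0.2, "coefs": {"eeg_conn": 1.0, "il8": -0.5}},
    "bupropion": {"intercept": 0.1, "coefs": {"eeg_conn": -0.8, "il8": 0.6}},
}


def remission_probability(features, intercept, coefs):
    """Logistic model: sigmoid of the intercept plus the weighted feature sum."""
    linear = intercept + sum(weight * features[name] for name, weight in coefs.items())
    return 1.0 / (1.0 + math.exp(-linear))


def factual_and_counterfactual(features, given, alternative):
    """Remission probability on the treatment received and on the one not received."""
    p_given = remission_probability(features, **MODELS[given])
    p_alt = remission_probability(features, **MODELS[alternative])
    return {
        "factual": p_given,
        "counterfactual": p_alt,
        "favors_alternative": p_alt > p_given,
    }


# Hypothetical standardized feature values for one participant
participant = {"eeg_conn": -1.0, "il8": 1.0}
result = factual_and_counterfactual(participant, given="sertraline", alternative="bupropion")
```

Only the factual probability can be checked against an observed outcome; the counterfactual probability remains a hypothesis until tested prospectively.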
Results
Participants and available features
After randomization, of the 296 participants, 278 had fMRI data (7,260 features), 212 had immune/metabolic data (50 features), and 275 had EEG data (29,700 features). A total of 197 out of 296 participants had a Stage 2 treatment plan and week 16 remission status reported. Of these 197 participants, 153 had immune/metabolic assays (blood), 144 had both immune/metabolic assays and fMRI, 126 had immune/metabolic assays and EEG, and 118 had immune/metabolic assays, EEG, and fMRI measures.
Pre-treatment features for prediction algorithms
Utilizing all available pre-treatment samples for each biomarker modality (fMRI [brain]: N = 278; EEG [brain]: N = 275; immune/metabolic [blood]: N = 212), the top five pre-treatment features associated with baseline depression severity were derived using the PPS feature selection algorithm (all features and their predictive power scores are listed in Supplementary Table 3). The top five fMRI features represented connectivity between various brain regions, including the anterior cingulate, striatum, and calcarine. The EEG features represented alpha band connectivity across different regions. The top five EEG and fMRI network connectivity regions are visualized in Supplementary Fig. 2. The top five blood features were chemokine ligand-1, interleukin-4, interleukin-8, chemokine ligand-26, and chemokine ligand-13. As the final features selected had no missingness across participants, imputation was not needed. These 15 biological pre-treatment features, with the addition of 3 demographic features (age, sex, and race), yielded 18 features for predicting remission of MDD at week 16 of treatment.
Prediction performance
In training
Models were trained using data exclusively from Subgroup A. The highest AUC was achieved by a penalized logistic regression integrating demographic, blood, fMRI and EEG brain connectivity measures (mean AUC = 0.74), with sensitivity of 0.74 and specificity of 0.63 (see Table 2).
In validation
Models trained on data from Subgroup A achieved AUCs of 0.71 and 0.73 on data from Subgroup B and Subgroup C, respectively. The receiver operating characteristic curves of all models are visualized in Fig. 1B. The predictive performance metrics averaged over 10 repeats for all models tested are tabulated in Table 2.
Top predictors
The top predictors of remission at 16 weeks in the fully integrated model were, in that order, two EEG connectivity features (Schaefer parcel 1 & 98, and Schaefer parcel 26 & 100), two fMRI connectivity features (Schaefer parcel 71 & 52, and Schaefer parcel 97 & 42), and interleukin-8. Additionally, the top predictors for all combinations of models are summarized in Supplementary Table 5.
Causal inference for treatment selection hypothesis
The result of propensity score matching was an equal number of responders and non-responders on the Stage 2 arms of the study (48 total in the sertraline arm and 48 in the placebo arm). There were no significant differences between feature values across matched groups (see Supplementary Table 6 for additional results from t-tests). The mean standardized difference in features between the sertraline-to-bupropion group and the sertraline-to-sertraline group was 17.7% before matching and 14.2% after matching. The mean standardized difference in features between the placebo-to-placebo group and the placebo-to-sertraline group was 43.4% before matching and 24.6% after matching. Ideally the mean standardized differences would be lower; however, after matching, both comparisons are under the 25% cutoff that is deemed adequate for balancing [29].
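For readers unfamiliar with the balance metric, the following Python sketch computes a standardized mean difference per feature and averages it across features, then compares the average with the 25% cutoff. Dividing by the pooled standard deviation of the two groups is one common convention and is an assumption here (other conventions divide by the standard deviation of one group); the feature names and values are invented.

```python
import math
from statistics import mean, variance


def standardized_difference(group_a, group_b):
    """Absolute mean difference as a percentage of the pooled standard deviation."""
    diff = abs(mean(group_a) - mean(group_b))
    pooled_sd = math.sqrt((variance(group_a) + variance(group_b)) / 2)
    if pooled_sd == 0:
        return 0.0 if diff == 0 else float("inf")
    return 100 * diff / pooled_sd


def mean_standardized_difference(features_a, features_b):
    """Average the per-feature standardized differences across shared features."""
    diffs = [standardized_difference(features_a[name], features_b[name])
             for name in features_a]
    return sum(diffs) / len(diffs)


# Hypothetical feature values in two matched groups
switched = {"il8": [1.0, 2.0, 3.0, 4.0, 5.0], "eeg_conn": [0.2, 0.4, 0.6, 0.8, 1.0]}
continued = {"il8": [2.0, 3.0, 4.0, 5.0, 6.0], "eeg_conn": [0.2, 0.4, 0.6, 0.8, 1.0]}

il8_smd = standardized_difference(switched["il8"], continued["il8"])
overall = mean_standardized_difference(switched, continued)
adequately_balanced = overall < 25.0
```

In this toy example one feature is perfectly balanced and the other is not, so the average sits above the cutoff and the groups would not be considered adequately balanced.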
Penalized logistic regression results, which produced averaged counterfactual probabilities by treatment and observed outcome, are presented in Table 3, and probabilities for each individual subject are in Supplementary Table 7. Within the placebo arm, one of the three placebo non-remitters had a higher likelihood of remitting to sertraline. Within the sertraline arm, two of the eight sertraline non-remitters had a higher likelihood of remitting to bupropion, and all bupropion non-remitters had a higher likelihood of remitting to sertraline (indicating that these participants were likely to remit had they continued sertraline treatment for a longer duration). Within the sertraline arm, three non-remitting subjects had a low probability (< 50%) of remitting to either pharmacotherapy (sertraline or bupropion). Factual outcomes (i.e., those observed in the study) were predicted with 82% accuracy.
Discussion
Antidepressant treatment selection continues to be “artisanal” in that treatment management is a “try-and-try-again” process depending on treatment outcomes in sequential pharmacotherapy [30]. Although ML algorithms show promise in establishing predictability of antidepressant response [14], there is a need to extend the analytical capabilities with biomarkers toward facilitating antidepressant selection. The work presented herein provides a basis for future research efforts aimed at utilizing predictive biomarkers for developing treatment selection models for MDD, work that is crucial for precision psychiatry. To do so, validation of biomarkers will be needed, involving studies evaluating the efficacy of AI-guided treatment selection versus standard of care. This report addresses this critical need in precision psychiatry for major depressive disorder through a novel application of causal inference approaches [31], integrating brain and blood biomarkers to generate counterfactual hypotheses of potential remission (or lack thereof) to alternative treatments using sequential treatment data from the EMBARC study.
Prior work by Iniesta et al. identified that prediction performance of AUC > 0.70 is considered clinically meaningful [32]. In this context, this report found that integrating blood with fMRI, or with fMRI and EEG, achieved AUC > 0.70 in all subgroups during both training and testing of sequential treatment outcome prediction. Ultimately, predicting outcomes to specific treatments serves as a first step towards developing effective treatment selection tools, and more work is being done to identify how best to select first- and second-line treatments. Data from the STAR*D trial demonstrated that several clinical features are predictive of whether patients will respond to second-line antidepressants [33]. A recent analysis created the first pharmacological differential treatment benefit model for MDD by using clinical and demographic data from 17 studies to predict remission and generate remission probabilities for five other treatments [34]. Other work has used ML to create a tool that assists clinicians in deciding between esketamine nasal spray and transcranial magnetic stimulation for treatment-resistant depression [35]. This report extends the literature on antidepressant response prediction by also utilizing predictive biomarkers to potentially aid in treatment selection. Finally, this report encourages future research to optimize the selection of predictive biomarkers that can be easily obtained in clinical practice.
The top predictor of the model was connectivity at the alpha frequency between the left fusiform region of the visual network and the right middle frontal region of the default mode network (Schaefer parcel 1 & 98). The default mode network (DMN) is believed to be activated during internal mental simulations such as thinking of one’s past or worrying about the future [36]. Previous work has shown that there is a significant difference in DMN activity between healthy control groups and individuals with MDD [37]. In a study attempting to uncover suicide-risk specific features from fMRI connectivity analysis, it was found that individuals with more suicide attempts had increased connectivity within the visual network, which extended to regions of the DMN [38]. The DMN has emerged as a signature that is predictive of depressive symptom improvement following ECT and of treatment response to escitalopram [39, 40]. Future exploration of the utility of this marker as a prognostic biomarker of treatment outcomes across subsets of patients who differ by baseline severity is warranted.
The top blood-based predictor, and fifth overall, was interleukin (IL)-8. Observational studies demonstrated that, among patients with treatment-resistant depression, higher levels of IL-8 are found in those with lower depressive symptom severity [41]. While individual studies highlighted a difference in IL-8 levels between patients with and without MDD, meta-analyses have found inconclusive results [42]. These analyses have shown that individuals with MDD do not have differing levels of IL-8 compared to those without; however, a limitation is the size of these studies [43, 44]. Despite this, IL-8 has previously been discussed in the literature as predictive of depression response to ketamine, electroconvulsive therapy, and paroxetine [45, 46]. While the role IL-8 plays in MDD is still not clear, prior research supports the assumption that IL-8 can be predictive of treatment response.
Limitations
Despite demonstrating cross-arm replication in predictive performance, there is no independent study with EEG, fMRI and blood data to validate the predictive performance and counterfactual hypothesis. Even with efforts to reduce the chance of overfitting through cross-validation and external replication across EMBARC study arms, future work with replication across studies is needed to establish the reliability of prediction models. Of the 197 subjects used to build and test the models, 38.6% were male, which is comparable to the proportion of male incident cases of MDD globally (37.7%) [47]. Additionally, several algorithms were only tested on the sample of participants with all possible biological data collected, and future efforts are needed to collect biological measures on larger cohorts with greater representation of ethnic populations that may facilitate generalization of biomarker-guided treatment selection models.
Finally, causal inference and counterfactual prediction constitute an evolving area of analysis in individualizing antidepressant response. Full verification of the validity and completeness of the model is not possible without observable data; however, there are aspects of the causal model which are empirically testable, such as the accuracy of predicting observed outcomes [48]. It is important to note that biological variables used for propensity score matching should not be confused with biomarkers that inform antidepressant response mechanism. Additional efforts are needed to prospectively evaluate counterfactual models, rather than solely calculating error on factual outcomes. While ML formulations offer limited interpretability by design, they offer insight into the predictability of treatment outcomes and rank-order predictor variables. Variable importance is therefore reported alongside predictions, providing a rank-ordered list of the variables that contribute most to them. Although the machine learning approach in this work identified that combining biomarkers of multiple modalities increases predictability of antidepressant remission and aids in treatment selection, future work is needed to elucidate the mechanistic effects of these interactions in the context of precision medicine for MDD.
Conclusions
This work demonstrates that a combination of EEG, fMRI, and plasma-derived immune markers can be important prognostic, integrative indicators to identify patients who are likely to experience remission following the antidepressant therapies studied in EMBARC. The model trained on sertraline-treated patients performed well on additional treatment subgroups within EMBARC, suggesting that the model may predict treatment response more generally and might not be useful in selecting specific treatments. A causal model with these biomarkers was constructed, providing estimates of the likelihood of an individual remitting to their assigned Stage 2 treatment, as well as to the alternative treatment. The model predicted factual (observed) events with high accuracy, and generated counterfactual predictions that can be validated in future studies. Computed estimates from the model demonstrated that some subjects had a low probability of achieving remission on either sertraline or bupropion, potentially identifying subjects who may benefit from more intensive intervention or monitoring.
Data availability
Data and code are available upon reasonable request to Dr. Madhukar H. Trivedi.
References
Friedrich MJ. Depression Is the leading cause of disability around the world. JAMA. 2017;317:1517.
Kupcova I, Danisovic L, Klein M, Harsanyi S. Effects of the COVID-19 pandemic on mental health, anxiety, and depression. BMC Psychol. 2023;11:108.
Trivedi MH, Rush AJ, Wisniewski SR, Nierenberg AA, Warden D, Ritz L, et al. Evaluation of outcomes with citalopram for depression using measurement-based care in STAR*D: implications for clinical practice. Am J Psychiatry. 2006;163:28–40.
Greenberg PE, Fournier AA, Sisitsky T, Pike CT, Kessler RC. The economic burden of adults with major depressive disorder in the United States (2005 and 2010). J Clin Psychiatry. 2015;76:155–62.
Phillips ML, Chase HW, Sheline YI, Etkin A, Almeida JR, Deckersbach T, et al. Identifying predictors, moderators, and mediators of antidepressant response in major depressive disorder: neuroimaging approaches. Am J Psychiatry. 2015;172:124–38.
Caspani G, Turecki G, Lam RW, Milev RV, Frey BN, MacQueen GM, et al. Metabolomic signatures associated with depression and predictors of antidepressant response in humans: A CAN-BIND-1 report. Commun Biol. 2021;4:903.
Bobo WV, Van Ommeren B, Athreya AP. Machine learning, pharmacogenomics, and clinical psychiatry: predicting antidepressant response in patients with major depressive disorder. Expert Rev Clin Pharm. 2022;15:927–44.
Kim EY, Ahn HS, Lee MY, Yu J, Yeom J, Jeong H, et al. An exploratory pilot study with plasma protein signatures associated with response of patients with depression to antidepressant treatment for 10 weeks. Biomedicines. 2020;8:455.
Cohen SE, Zantvoord JB, Wezenberg BN, Bockting C, van Wingen GA. Magnetic resonance imaging for individual prediction of treatment response in major depressive disorder: a systematic review and meta-analysis. Transl Psychiatry. 2021;11:168.
Watts D, Pulice RF, Reilly J, Brunoni AR, Kapczinski F, Passos IC. Predicting treatment response using EEG in major depressive disorder: A machine-learning meta-analysis. Transl Psychiatry. 2022;12:332.
Cook IA, Hunter AM, Caudill MM, Abrams MJ, Leuchter AF. Prospective testing of a neurophysiologic biomarker for treatment decisions in major depressive disorder: The PRISE-MD trial. J Psychiatr Res. 2020;124:159–65.
Wu H, Liu R, Zhou J, Feng L, Wang Y, Chen X, et al. Prediction of remission among patients with a major depressive disorder based on the resting-state functional connectivity of emotion regulation networks. Transl Psychiatry. 2022;12:391.
Sajjadian M, Uher R, Ho K, Hassel S, Milev R, Frey BN, et al. Prediction of depression treatment outcome from multimodal data: a CAN-BIND-1 report. Psychol Med. 2023;53:5374–84.
Sajjadian M, Lam RW, Milev R, Rotzinger S, Frey BN, Soares CN, et al. Machine learning in the prediction of depression treatment outcomes: a systematic review and meta-analysis. Psychol Med. 2021;51:2742–51.
Trivedi MH, McGrath PJ, Fava M, Parsey RV, Kurian BT, Phillips ML, et al. Establishing moderators and biosignatures of antidepressant response in clinical care (EMBARC): Rationale and design. J Psychiatr Res. 2016;78:11–23.
Trivedi MH, South C, Jha MK, Rush AJ, Cao J, Kurian B, et al. A novel strategy to identify placebo responders: prediction index of clinical and biological markers in the EMBARC trial. Psychother Psychosom. 2018;87:285–95.
Poirot MG, Ruhe HG, Mutsaerts H, Maximov II, Groote IR, Bjørnerud A, et al. Treatment response prediction in major depressive disorder using multimodal MRI and clinical data: secondary analysis of a randomized clinical trial. Am J Psychiatry. 2024;181:223–33.
Busner J, Targum SD. The clinical global impressions scale: applying a research tool in clinical practice. Psychiatry. 2007;4:28–37.
Bobo WV, Angleró GC, Jenkins G, Hall-Flavin DK, Weinshilboum R, Biernacka JM. Validation of the 17-item Hamilton Depression Rating Scale definition of response for adults with major depressive disorder using equipercentile linking to Clinical Global Impression scale ratings: analysis of Pharmacogenomic Research Network Antidepressant Medication Pharmacogenomic Study (PGRN-AMPS) data. Hum Psychopharmacol. 2016;31:185–92.
Rush AJ. STAR*D: what have we learned? Am J Psychiatry. 2007;164:201–4.
Rolle CE, Fonzo GA, Wu W, Toll R, Jha MK, Cooper C, et al. Cortical connectivity moderators of antidepressant vs placebo treatment response in major depressive disorder: secondary analysis of a randomized clinical trial. JAMA Psychiatry. 2020;77:397–408.
Tenke CE, Kayser J, Pechtel P, Webb CA, Dillon DG, Goer F, et al. Demonstrating test-retest reliability of electrophysiological measures for healthy adults in a multisite study of biomarkers of antidepressant treatment response. Psychophysiology. 2017;54:34–50.
Zhao K, Xie H, Fonzo GA, Tong X, Carlisle N, Chidharom M, et al. Individualized fMRI connectivity defines signatures of antidepressant and placebo responses in major depression. Mol Psychiatry. 2023;28:2490–9.
Wu W, Zhang Y, Jiang J, Lucas MV, Fonzo GA, Rolle CE, et al. An electroencephalographic signature predicts antidepressant response in major depression. Nat Biotechnol. 2020;38:439–47.
Greenwell BM, Boehmke BC. Variable Importance Plots-An Introduction to the vip Package. The R Journal. 2020;12:343–66.
Friedman ES, Davis LL, Zisook S, Wisniewski SR, Trivedi MH, Fava M, et al. Baseline depression severity as a predictor of single and combination antidepressant treatment outcome: results from the CO-MED trial. Eur Neuropsychopharmacol. 2012;22:183–99.
Laken Pvd. paulvanderlaken/ppsr: CRAN ppsr 0.0.5. 2024.
Zhao QY, Luo JC, Su Y, Zhang YJ, Tu GW, Luo Z. Propensity score matching with R: conventional methods and new features. Ann Transl Med. 2021;9:812.
Stuart EA. Matching methods for causal inference: A review and a look forward. Stat Sci. 2010;25:1–21.
Perlis RH. Abandoning personalization to get to precision in the pharmacotherapy of depression. World Psychiatry. 2016;15:228–35.
Krishnadas R, Leighton SP, Jones PB. Precision psychiatry: thinking beyond simple prediction models - enhancing causal predictions. Br J Psychiatry. 2025;226:184–8.
Iniesta R, Malki K, Maier W, Rietschel M, Mors O, Hauser J, et al. Combining clinical variables to optimize prediction of antidepressant treatment outcomes. J Psychiatr Res. 2016;78:94–102.
Rush AJ, Wisniewski SR, Warden D, Luther JF, Davis LL, Fava M, et al. Selecting among second-step antidepressant medication monotherapies: predictive value of clinical, demographic, or first-step treatment features. Arch Gen Psychiatry. 2008;65:870–80.
Perlman K, Mehltretter J, Benrimoh D, Armstrong C, Fratila R, Popescu C, et al. Development of a differential treatment selection model for depression on consolidated and transformed clinical trial datasets. Transl Psychiatry. 2024;14:263.
Pettorruso M, Di Lorenzo G, Benatti B, d'Andrea G, Cavallotto C, Carullo R, et al. Overcoming treatment-resistant depression with machine-learning based tools: a study protocol combining EEG and clinical data to personalize glutamatergic and brain stimulation interventions (SelecTool Project). Front Psychiatry. 2024;15:1436006.
Buckner RL. The brain’s default network: origins and implications for the study of psychosis. Dialogues Clin Neurosci. 2013;15:351–8.
Chou T, Deckersbach T, Dougherty DD, Hooley JM. The default mode network and rumination in individuals at risk for depression. Soc Cogn Affect Neurosci. 2023;18:nsad032.
Shiwei L, Xiaojing Z, Yingli Z, Shengli C, Xiaoshan L, Ziyun X, et al. Cortical hierarchy disorganization in major depressive disorder and its association with suicidality. Front Psychiatry. 2023;14:1140915.
Li Y, Yu X, Ma Y, Su J, Li Y, Zhu S, et al. Neural signatures of default mode network in major depression disorder after electroconvulsive therapy. Cereb Cortex. 2023;33:3840–52.
Martens MAG, Filippini N, Harmer CJ, Godlewska BR. Resting state functional connectivity patterns as biomarkers of treatment response to escitalopram in patients with major depressive disorder. Psychopharmacology. 2022;239:3447–60.
Kruse JL, Olmstead R, Hellemann G, Breen EC, Tye SJ, Brooks JO, et al. Interleukin-8 and lower severity of depression in females, but not males, with treatment-resistant depression. J Psychiatr Res. 2021;140:350–6.
Köhler CA, Freitas TH, Maes M, de Andrade NQ, Liu CS, Fernandes BS, et al. Peripheral cytokine and chemokine alterations in depression: a meta-analysis of 82 studies. Acta Psychiatr Scand. 2017;135:373–87.
Eyre HA, Air T, Pradhan A, Johnston J, Lavretsky H, Stuart MJ, et al. A meta-analysis of chemokines in major depression. Prog Neuropsychopharmacol Biol Psychiatry. 2016;68:1–8.
Dowlati Y, Herrmann N, Swardfager W, Liu H, Sham L, Reim EK, et al. A meta-analysis of cytokines in major depression. Biol Psychiatry. 2010;67:446–57.
Zhu ZH, Song XY, Man LJ, Chen P, Tang Z, Li RH, et al. Comparisons of Serum Interleukin-8 Levels in Major Depressive Patients With Drug-Free Versus SSRIs Versus Healthy Controls. Front Psychiatry. 2022;13:858675.
Kruse JL, Vasavada MM, Olmstead R, Hellemann G, Wade B, Breen EC, et al. Depression treatment response to ketamine: sex-specific role of interleukin-8, but not other inflammatory markers. Transl Psychiatry. 2021;11:167.
Li S, Zhang X, Cai Y, Zheng L, Pang H, Lou L. Sex difference in incidence of major depressive disorder: an analysis from the Global Burden of Disease Study 2019. Ann Gen Psychiatry. 2023;22:53.
Dickerman BA, Hernan MA. Counterfactual prediction is not only for causal inference. Eur J Epidemiol. 2020;35:615–7.
Funding
This work was funded in part by a Graduate Research Education Fellowship from the Mayo Clinic Center for Individualized Medicine and by the National Science Foundation under award IIS-2041339 (PIs: Athreya, Bobo). The Establishing Moderators and Biosignatures of Antidepressant Response for Clinical Care (EMBARC) study was supported by NIMH grants U01MH092221 (PI: Trivedi) and U01MH092250 (PIs: McGrath, Parsey, and Weissman), and in part by the Hersh Foundation. In addition, this work was partly funded by the Center for Depression Research and Clinical Care (PI: Madhukar Trivedi). CWG is supported by the Harry C. and Debra A. Stonecipher Predoctoral Fellowship at the Mayo Clinic Graduate School of Biomedical Sciences. Dr. Jha is supported by a career development award from the National Institute of Mental Health (MH126202) and the O’Donnell Clinical Neuroscience Scholar Award from UT Southwestern Medical Center.
Author information
Contributions
MB – Data curation, formal analysis, methodology, software, validation, visualization, writing – original draft and review & editing; CG – Data curation, formal analysis, methodology, software, validation, visualization, writing – review & editing; TC – Methodology; AM – Methodology; CCF – Methodology; RT – Data curation, methodology; JF – Methodology; PEC – Conceptualization, project administration, resources, writing – review & editing; WVB – Conceptualization, formal analysis, funding acquisition, project administration, resources, supervision, writing – review & editing; MJ – Conceptualization, formal analysis, funding acquisition, project administration, resources, supervision, writing – review & editing; APA – Conceptualization, formal analysis, methodology, project administration, resources, software, supervision, validation, writing – review & editing; MHT – Conceptualization, formal analysis, methodology, project administration, resources, software, supervision, validation, writing – review & editing.
Ethics declarations
Competing interests
Dr. Jha has the following conflicts to report: contract research grants from Neurocrine Biosciences, Navitor/Supernus, and Janssen Research & Development; consultant fees from Janssen Scientific Affairs and Boehringer Ingelheim; and fees for serving on Data Safety and Monitoring Boards for Worldwide Clinical Trials (Eliem, Skye and Inversargo), Vicore Pharma and IQVIA (Click). Dr. Croarkin has received research support from the National Institutes of Health (NIH), National Science Foundation (NSF), Agency for Healthcare Research and Quality (AHRQ), Brain and Behavior Research Foundation and the Mayo Clinic Foundation. Dr. Croarkin has received research support from Pfizer, Inc. He has received grant-in-kind equipment support from Neuronetics, Inc. and MagVenture, Inc. for investigator-initiated studies. He received grant-in-kind supplies and genotyping from Assurex Health, Inc. for an investigator-initiated study. He served as the principal investigator for a multicenter study funded by Neuronetics, Inc., a site principal investigator for a study funded by NeoSync, Inc., and site principal investigator for a study funded by Innosphere. Dr. Croarkin served as a paid consultant for Engrail Therapeutics, Meta Platforms, Inc., MindMed, Myriad Neuroscience, Procter & Gamble Company, and Sunovion. Dr. Croarkin is employed by Mayo Clinic. He receives compensation as the Editor-in-Chief for the Journal of Child and Adolescent Psychopharmacology. Dr. Chin Fatt has served as an advisor for Janssen Research & Development. Dr. Foster has served on the Scientific Advisory Board for MRM Health NL and has received consulting/speaker fees from Takeda Canada, Rothman, Benson, Hedges Inc., and WebMD. Dr. Trivedi has provided consulting services to Alkermes Inc, Axsome Therapeutics, Biogen MA Inc., Cerebral Inc., Circular Genomics Inc, Compass Pathfinder Limited, GH Research Limited, Heading Health Inc, Janssen, Legion Health Inc, Merck Sharp & Dohme Corp., Mind Medicine (MindMed) Inc, Merck Sharp & Dohme LLC, Naki Health, Ltd., Neurocrine Biosciences Inc, Noema Pharma AG, Orexo US Inc, Otsuka America Pharmaceutical Inc, Otsuka Canada Pharmaceutical Inc, Otsuka Pharmaceutical Development & Commercialization Inc, Praxis Precision Medicines Inc, SAGE Therapeutics, Sparian Biosciences Inc, Takeda Pharmaceutical Company Ltd, and WebMD. He sits on the Scientific Advisory Board of Alto Neuroscience Inc, Cerebral Inc., Compass Pathfinder Limited, Heading Health, GreenLight VitalSign6 Inc, Legion Health Inc, Merck Sharp & Dohme Corp, Orexo US Inc, and Signant Health. He holds stock in Alto Neuroscience Inc, Cerebral Inc, Circular Genomics Inc, GreenLight VitalSign6 Inc, and Legion Health Inc. Additionally, he has received editorial compensation from the American Psychiatric Association and Oxford University Press. Dr. Carmody has received consultant fees from Holmusk Technologies, Inc. Dr. Minhajuddin reports no conflicts of interest. Dr. Bobo has received research support from NIH, NSF, AHRQ, the Watzinger Foundation, the Blue Gator Foundation, the Myocarditis Foundation, and the Mayo Foundation for Medical Education and Research. He has contributed chapters to UpToDate on the pharmacological management of bipolar major depression. Dr. Toll reports no conflicts of interest. Dr. Athreya has received research support from NIH, NSF, the Watzinger Foundation, the Blue Gator Foundation, and the Mayo Foundation for Medical Education and Research. The remaining authors have nothing to disclose.
Prior Presentation
Earlier findings from this work were presented as a poster at the 2025 Annual Meeting of the American College of Neuropsychopharmacology, Scottsdale, AZ, USA.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Barac, M., Grant, C.W., Toll, R. et al. Towards causal inference-based antidepressant selection with brain and blood biomarkers. Neuropsychopharmacol. (2025). https://doi.org/10.1038/s41386-025-02183-3