Introduction

Electroconvulsive therapy (ECT) is a highly effective treatment for clinical depression, with remission rates as high as 60–70%. ECT has demonstrated superior efficacy over pharmacotherapy [1, 2] and non-invasive brain stimulation techniques such as transcranial magnetic stimulation and transcranial direct current stimulation [3], particularly for severe or treatment-resistant depression (TRD). Racemic ketamine, an off-patent and affordable N-methyl-D-aspartate receptor antagonist commonly used as an anaesthetic, has also shown promise in this patient population [4]. Complementing multiple smaller trials of racemic ketamine [5,6,7], a recent phase 3, double-blind, randomised, active-controlled multicentre trial confirmed the efficacy and safety of subcutaneous racemic ketamine in TRD [8]. As both ECT and ketamine are effective for TRD, understanding the comparative effectiveness of these interventions is important to guide treatment decisions.

Six randomised controlled trials (RCTs), and one cohort study, have compared the efficacy of ECT and ketamine [9,10,11,12,13,14,15]. Eight meta-analyses – all published since 2022 – have synthesised data from these trials to evaluate their efficacy in reducing depression severity scores. Despite substantial overlap in included studies (Table S1 in the Supplement), these meta-analyses have yielded conflicting results. Two report a significant advantage of ECT over ketamine [16, 17], while others suggest only a numerical, non-significant benefit [18, 19], or approximate equivalence [20,21,22]. In contrast, one found that ketamine provided greater acute improvements in depression scores than ECT [23]. Given that clinicians and clinical guidelines refer to meta-analyses as the ‘gold standard’ for appraising the totality of clinical evidence in a field, these conflicting conclusions are problematic.

A key factor contributing to discrepancies between meta-analyses, and a significant source of heterogeneity between trials, is the treatment of “time”, specifically inconsistencies in the timing of assessment periods. Meta-analyses often extract data from the primary endpoint as defined by each trial; however, these endpoints can vary widely due to differences in treatment course durations. It is important to compare like-for-like data between trials to avoid major temporal discrepancies, as these can significantly influence effect size estimates (e.g., see Nikolin, et al. [24]).

To address these issues, we conducted a systematic review and meta-analysis comparing the antidepressant efficacy of ECT and ketamine incorporating all available assessment data throughout the treatment course, using meta-regression techniques to model differences in the rate of improvement in depressive symptoms over time.

Methods

This meta-analysis was conducted in line with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [25]. A PRISMA checklist is provided in Table S2 and Table S3 in the Supplement.

Search strategy

To update the extensive body of reviews published in recent years, a systematic search was conducted focusing on identifying new articles released between January 1, 2023, and November 30, 2024. PubMed (MEDLINE), Embase, and the Cochrane Library were used in addition to a manual search of reference lists of included studies. The complete search strategy, including search terms, is detailed in Methods S1 in the Supplement.

Study selection criteria

Eligible studies had to meet the following inclusion criteria: 1) participants with a clinical diagnosis of major depressive disorder (MDD) based on standardised diagnostic criteria (e.g., DSM-IV) receiving treatment primarily for depressive symptoms; 2) treatment arms comprising ECT and ketamine administered via a parenteral route (intravenous, intramuscular, or subcutaneous), noting that a preliminary review of the literature, including prior meta-analyses, could not identify any studies comparing ECT with intranasal ketamine or esketamine; 3) provision of depression symptom severity scores for ECT and ketamine at a minimum of two time points (baseline and end of the acute phase), and use of equivalent treatment course durations and assessment periods for both interventions; 4) use of a standardised scale to measure depression symptom severity, such as the Montgomery–Åsberg Depression Rating Scale (MADRS) or Hamilton Depression Rating Scale (HDRS); and 5) study designs limited to either parallel-group RCTs or retrospective non-randomised cohort studies.

Data extraction and quality assessment

Studies were systematically screened according to their title, abstract, and full text. Two investigators (C.M-T., and L.B.) independently assessed risk of bias for each included study using the Cochrane Risk of Bias tool Version 2 (RoB 2) [26] for RCTs and Risk Of Bias In Non-randomized Studies – of Interventions Version 2 (ROBINS-I 2) [27] for cohort studies. Disagreements were resolved through discussion until consensus was achieved. Extracted data included the primary outcome measure (depression severity), depression rating scale, study design, ECT electrode placement and dose, ketamine route of administration and dose in mg/kg, and participant characteristics, including age and psychiatric background.

Outcomes

We extracted all available data on depression symptom severity outcomes. To assist with visualisation, scores were additionally converted to a common depression scale, the 17-item HDRS, using validated methods [28, 29]. These have been used in previous meta-analyses to investigate the efficacy of transcranial direct current stimulation, psychotherapy, and antidepressants in major depression [24, 30,31,32]. Although the MADRS was used for most participants, conversion to the HDRS-17 avoided the need for two-step conversions (e.g., from alternative HDRS scales such as the HDRS-25 to the HDRS-17, and then to the MADRS), which could compound error and inflate variance.

Statistical analysis

Statistical analyses were performed in open-source R software (Version 4.4.1) using the metafor and splines packages.

Visualisation of treatment using converted scores

To visualise the time-course of ECT and ketamine effects, we conducted separate meta-regressions for each intervention using converted HDRS-17 scores. The association between HDRS-17 and Time was modelled with a natural cubic spline with four degrees of freedom, allowing for a flexible, non-linear trajectory across the treatment period.
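As an illustration of what a natural cubic spline with four degrees of freedom involves, the sketch below builds the basis columns using the textbook truncated-power construction. It is not the authors' code: the analyses were run in R, whose spline routines use a different but equivalent basis, and the knot positions and the name `natural_spline_basis` are assumptions made for this example.

```python
def natural_spline_basis(x, knots):
    """Natural cubic spline basis at x, excluding the intercept column.

    Truncated-power construction. `knots` must be sorted; the first and
    last entries act as boundary knots, beyond which the fit is linear.
    """
    def pos_cube(u):
        return u ** 3 if u > 0 else 0.0

    last = knots[-1]

    def d(k):
        return (pos_cube(x - knots[k]) - pos_cube(x - last)) / (last - knots[k])

    penultimate = d(len(knots) - 2)
    # One linear column plus one curved column per knot except the last two.
    return [x] + [d(k) - penultimate for k in range(len(knots) - 2)]


# Hypothetical knots in days (assumed, not taken from the paper):
# two boundary knots and three interior knots give four basis columns.
KNOTS = [0.0, 7.0, 14.0, 21.0, 28.0]
basis_day_10 = natural_spline_basis(10.0, KNOTS)
```

Regressing HDRS-17 scores on these four columns, with inverse-variance weights, would give a flexible curve that is constrained to be linear outside the boundary knots, which is what distinguishes a natural spline from an unconstrained cubic spline.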

Meta-analysis of comparative efficacy

A multivariable mixed-effect meta-regression was conducted on standardised mean difference (SMD) scores obtained from the original (non-converted) depression severity measures for ECT and ketamine. The model incorporated Time (expressed in days since treatment initiation) as a fixed effect and specified random intercepts and slopes for Time within each Study to account for within-study clustering and potential variation in time trends across studies. This structure models the interdependence of multiple effect sizes contributed by the same study (e.g., see Nikolin, et al. [4]). We used a restricted maximum-likelihood estimation method to obtain unbiased estimates of variance parameters. All available time-points from eligible studies were included in the meta-regression analysis, excluding baseline (i.e., pre-treatment). Time was modelled as a linear effect as this improved model fit, defined by a lower Bayesian Information Criterion (BIC). A statistical primer on multivariable, multilevel mixed-effect meta-regression techniques is provided in Methods S3 in the Supplement.
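For readers unfamiliar with the effect size used here, the following minimal sketch computes a bias-corrected SMD (Hedges' g) and its approximate sampling variance from arm-level summary statistics. The function name and the input values are hypothetical. The sign convention (ketamine minus ECT, so that positive values favour ECT when lower scores mean less severe depression) is an assumption chosen to match the direction of effects reported in the Results, and the small-sample correction is a common choice rather than one confirmed by this section.

```python
import math


def hedges_g(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Bias-corrected standardised mean difference (arm A minus arm B)
    and its approximate sampling variance."""
    df = n_a + n_b - 2
    pooled_sd = math.sqrt(((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df)
    d = (mean_a - mean_b) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    g = correction * d
    var_g = (n_a + n_b) / (n_a * n_b) + g ** 2 / (2 * (n_a + n_b))
    return g, var_g


# Hypothetical depression scores at one assessment (not from any included trial):
# arm A = ketamine, arm B = ECT.
g, var_g = hedges_g(mean_a=14.0, sd_a=8.0, n_a=50, mean_b=12.0, sd_b=8.0, n_b=50)
```

One such pair (effect size and variance) would be computed for every post-baseline assessment in every study, and these pairs are the inputs to the meta-regression on Time.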

Sensitivity analyses

Sensitivity analyses included: 1) Performing meta-regressions limited to the primary endpoint of each included study (i.e., separate analyses for data obtained within 1-, 2-, 3-, and 4-week treatment durations); 2) Modelling ‘Time’ using a non-linear approach. SMDs were analysed using a natural cubic spline (four degrees of freedom), with Baseline Depression Severity included as a covariate; 3) Evaluating the robustness of our findings to assumptions about within-study correlations by using alternative covariance structures (Compound Symmetry (CS) and First-Order Autoregressive (AR1)) and a range of plausible intra-study correlation values (ρ = 0·0–0·9); 4) Leave-one-out analysis to assess the influence of individual studies; and 5) Excluding non-RCTs and repeating all main and sensitivity analyses.
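The covariance structures named in the third sensitivity analysis can be made concrete with a short sketch. Given the sampling variances of the repeated effect sizes from one study and an assumed correlation ρ, it builds the within-study variance-covariance matrix under compound symmetry (every pair of assessments shares the same correlation) or AR1 (correlation decays with the number of assessments separating the pair). The function name and values are hypothetical, and indexing AR1 by assessment order rather than by elapsed days is a simplifying assumption.

```python
import math


def within_study_cov(variances, rho, structure="CS"):
    """Variance-covariance matrix for repeated effect sizes from one study."""
    n = len(variances)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            lag = abs(i - j)
            if lag == 0:
                corr = 1.0
            elif structure == "CS":
                corr = rho            # same correlation for every pair
            elif structure == "AR1":
                corr = rho ** lag     # correlation decays with separation
            else:
                raise ValueError("structure must be 'CS' or 'AR1'")
            cov[i][j] = corr * math.sqrt(variances[i] * variances[j])
    return cov


# Hypothetical sampling variances for three assessments in one study.
cs_matrix = within_study_cov([0.04, 0.04, 0.04], rho=0.5, structure="CS")
ar1_matrix = within_study_cov([0.04, 0.04, 0.04], rho=0.5, structure="AR1")
```

Repeating the model fit across a grid of ρ values, as described above, shows whether the estimated effect of Time depends on the assumed correlation.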

Results

Our systematic review identified seven eligible studies, six RCTs and one cohort study [9,10,11,12,13,14,15], consisting of 731 participants (365 receiving ECT and 366 receiving ketamine). The PRISMA flow diagram of search results is displayed in Figure S1 in the Supplement. Racemic ketamine was delivered at 0·5 mg/kg in all studies, except one that allowed dose modification if clinically indicated [14]. Route of administration was intravenous (IV) for all studies except one that used an intramuscular (IM) route [10]. ECT stimulation parameters, including electrode placement, dosage, and pulse-width, varied substantially between studies and are summarised in Table 1. Treatment course durations ranged from 1–4 weeks, with session frequencies of 2–3 treatments per week adopted across all studies.

Table 1 Trials meeting inclusion criteria for the comparative efficacy of electroconvulsive therapy (ECT) and ketamine in the treatment of depression.

Assessment of study quality using the Cochrane RoB 2 tool found five (71%) studies to have a high overall risk of bias, with some concerns for the remaining two (Figure S2 and Figure S3 in the Supplement).

Visualisation of treatment using converted scores

Meta-regressions of converted HDRS-17 scores using a natural cubic spline to model the effect of Time are presented for ECT (Fig. 1a) and ketamine (Fig. 1b). These indicate a rapid improvement in mood in the first week, followed by a slight reduction in the rate of improvement. Visual inspection of the ECT and ketamine models (Fig. 1c) revealed lower depression severity scores at baseline for ketamine relative to ECT.

Fig. 1: Visualising ECT and ketamine treatment using a common scale of depression symptom severity.

Depression scores are plotted for all included studies. Black lines represent the estimated effect from meta-regression modelled with a natural cubic spline with four degrees of freedom, with the shaded grey regions representing the 95% confidence interval. Error bars indicate the standard error of the mean. A Depression scores converted to the 17-item Hamilton Depression Rating Scale (HDRS-17) as a common scale for the electroconvulsive therapy (ECT) arms. B HDRS-17 depression scores for the ketamine arms. C Results from cubic spline meta-regression models of converted HDRS-17 scores for both ECT and ketamine superimposed.

To further explore the observation of a possible pre-treatment difference between ECT and ketamine arms, we performed a meta-analysis of mood scores for the baseline assessment. This revealed a significant difference between interventions (SMD = −0·28; p = 0·018; 95% CI −0·51 to −0·05), indicating lower pre-treatment depression symptom severity for ketamine participants (Fig. 2). All studies showed numerically reduced baseline depression scores for the ketamine group relative to the ECT group, with only moderate heterogeneity (I² = 35·9%). Funnel plot analysis, Egger’s test, and the Trim-and-Fill method (Figure S4 and Figure S5 in the Supplement) suggested possible small study bias.
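The pooled baseline SMD and the I² statistic reported above come from a random-effects model. The sketch below shows one standard way to obtain such quantities, using the DerSimonian-Laird estimator of between-study variance. This estimator is swapped in for simplicity; the section does not state which estimator the baseline analysis used, and the input values are hypothetical rather than the trial data.

```python
def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling.

    Returns the pooled effect, its standard error, the between-study
    variance (tau^2), and I^2 as a percentage.
    """
    k = len(effects)
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = (1 / sum(w_re)) ** 0.5
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return pooled, se, tau2, i2


# Hypothetical baseline SMDs and sampling variances for three studies.
pooled, se, tau2, i2 = random_effects_pool([-0.6, 0.0, -0.3], [0.04, 0.04, 0.04])
ci_low, ci_high = pooled - 1.96 * se, pooled + 1.96 * se
```

I² expresses the share of total variability in effect sizes that is attributable to between-study differences rather than sampling error.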

Fig. 2: Forest plot of baseline depression severity.

Comparison of baseline depression severity scores using original scales within a random effects model. “Favours ketamine” indicates the mean depression score was lower at baseline in the ketamine group.

Meta-analysis of comparative efficacy

Our primary analysis used a mixed-effect meta-regression model to assess the SMD of depression severity scores for ECT compared to ketamine over time. Considering the significant baseline difference in depression severity scores between treatment arms, the meta-regression model incorporated a covariate of Baseline Depression Severity as a fixed effect. The model identified a significant effect of Time (β = 0·018; p < 0·0001; 95% CI 0·009–0·026), suggesting a benefit of ~0·02 SMD per day in favour of ECT (Fig. 3). The model predicted a difference between ECT and ketamine of SMD = 0·14 (95% CI −0·69–0·98) at day 1, which increased to SMD = 0·59 (95% CI −0·26–1·43) by the end of a four-week course.
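To show how a slope for Time translates into predicted differences at particular days, the sketch below fits an inverse-variance weighted straight line to SMDs over time. It is a deliberately simplified stand-in for the model described above: it omits the random effects for Study and the Baseline Depression Severity covariate, and the data values and function names are hypothetical.

```python
def weighted_linear_fit(days, smds, variances):
    """Inverse-variance weighted least-squares line: SMD = intercept + slope * day."""
    w = [1 / v for v in variances]
    total = sum(w)
    mean_x = sum(wi * x for wi, x in zip(w, days)) / total
    mean_y = sum(wi * y for wi, y in zip(w, smds)) / total
    sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, days))
    sxy = sum(wi * (x - mean_x) * (y - mean_y) for wi, x, y in zip(w, days, smds))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, day):
    """Predicted SMD at a given day since treatment initiation."""
    return intercept + slope * day


# Hypothetical SMDs (positive favours ECT) at four assessment days.
days = [1, 7, 14, 28]
smds = [0.10, 0.25, 0.40, 0.62]
variances = [0.04, 0.05, 0.05, 0.06]
intercept, slope = weighted_linear_fit(days, smds, variances)
day_1, day_28 = predict(intercept, slope, 1), predict(intercept, slope, 28)
```

In the full model the same logic applies, with predictions additionally conditioned on baseline severity and with confidence intervals widened by between-study variability.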

Fig. 3: Comparative efficacy of ECT and ketamine during a treatment course.

Meta-regression results comparing the efficacy of ECT and ketamine using standardised mean difference (SMD) values. SMDs were derived from the original depression rating scales (i.e., not converted scores). The model included Time as a fixed effect, Baseline Depression Severity as a covariate, and Study as a random effect. The analysis revealed a significant effect of Time, indicating a faster rate of improvement in depression severity for ECT. The black line represents the estimated effect from the meta-regression model, with the shaded grey region representing the 95% confidence interval. Data points are adjusted for Baseline Depression Severity, and error bars indicate the 95% confidence interval.

Sensitivity analyses

Considering the variability in treatment durations across studies (1–4 weeks), we conducted sensitivity analyses by restricting data to each time frame. Regardless of treatment course duration, the effect of Time remained significant in favour of ECT (β = 0·018–0·038; see Table S7 in the Supplement).

To account for non-linear effects, we performed a natural cubic spline meta-regression on the SMD of depression severity scores. This revealed significant non-linear effects over time (Fig. 4). Three spline terms were statistically significant: Week 1 (β W1 = 0·376; p = 0·0047; 95% CI 0·115–0·636), Week 3 (β W3 = 0·711; p = 0·0012; 95% CI 0·282–1·140), and Week 4 (β W4 = 0·350; p = 0·017; 95% CI 0·063–0·637). The Week 2 spline term was not significant (β W2 = 0·216; p = 0·14; 95% CI −0·071–0·503), suggesting a stable comparative difference between treatments during this period. Overall, the results agree with primary analysis outcomes, with inflection points indicating a general trend towards a faster rate of improvement for ECT relative to ketamine over a treatment course.

Fig. 4: Sensitivity analysis of the comparative efficacy of ECT and ketamine using non-linear modelling.

Results from a natural cubic spline meta-regression (four degrees of freedom). This revealed significant non-linear effects of Time, with an overall trend favouring ECT. The black line represents the estimated effect from the spline meta-regression model, with the shaded grey region representing the 95% confidence interval. Data points are adjusted for Baseline Depression Severity, and error bars indicate the 95% confidence interval.

Within-study correlations between assessments at different timepoints may lead to under- or over-estimation of standard errors and subsequently inaccurate confidence intervals in final model outputs. To assess the robustness of our findings to potential correlation differences between repeated measurements within studies, we conducted a sensitivity analysis using the same multilevel meta-regression model whilst incorporating several alternative plausible assumptions regarding the within-study covariance structure. Across all models, the estimated effect of Time remained statistically significant, with minimal variation in effect size estimates (β = 0·016–0·017; see Table S8 in the Supplement). These results suggest that our primary finding is robust to a range of within-study correlation structures.

A leave-one-out sensitivity analysis assessed the influence of individual studies on the overall meta-analytic findings (see Table S9). Here, we report the results following the exclusion of the two largest studies, given their substantial weighting in the dataset. When the Ekstrand, et al. [13] study (N = 186) was excluded, the effect of Time was no longer significant, although the direction of the effect continued to favour ECT (β = 0·009; p = 0·16; 95% CI −0·003–0·022). In contrast, when the largest study in the meta-analysis, Anand, et al. [14], was excluded, the effect of Time remained statistically significant and increased, indicating a greater daily benefit in favour of ECT (β = 0·024; p < 0·001; 95% CI 0·015–0·034).
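The leave-one-out procedure can be sketched as a loop that drops each study in turn and re-estimates the slope for Time. For brevity, the example below re-estimates a simple inverse-variance weighted slope rather than the full multilevel model used in the paper. The study labels and data are hypothetical.

```python
def weighted_slope(rows):
    """Inverse-variance weighted slope of SMD on day.

    Each row is (study, day, smd, variance).
    """
    w = [1 / r[3] for r in rows]
    total = sum(w)
    mean_x = sum(wi * r[1] for wi, r in zip(w, rows)) / total
    mean_y = sum(wi * r[2] for wi, r in zip(w, rows)) / total
    sxx = sum(wi * (r[1] - mean_x) ** 2 for wi, r in zip(w, rows))
    sxy = sum(wi * (r[1] - mean_x) * (r[2] - mean_y) for wi, r in zip(w, rows))
    return sxy / sxx


def leave_one_out(rows):
    """Re-estimate the slope with each study excluded in turn."""
    studies = sorted({r[0] for r in rows})
    return {s: weighted_slope([r for r in rows if r[0] != s]) for s in studies}


# Hypothetical data: studies A and B follow a rising trend, study C does not.
rows = [
    ("A", 7, 0.14, 0.05), ("A", 14, 0.28, 0.05),
    ("B", 7, 0.14, 0.05), ("B", 28, 0.56, 0.05),
    ("C", 7, 0.14, 0.05), ("C", 28, 0.00, 0.05),
]
slopes_without = leave_one_out(rows)
```

A slope that changes markedly when one study is removed, as with study C in this toy example, signals that the pooled estimate leans heavily on that study.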

Lastly, we repeated all analyses, including primary and previous sensitivity analyses, excluding the sole cohort study of Basso, et al. [15], leaving only RCTs. The mixed-effect meta-regression model found that the effect of Time remained significant (β = 0·017; p = 0·0002; 95% CI 0·008–0·025). The model predicted a difference between ECT and ketamine of SMD = 0·18 (95% CI −0·77–1·13) at day 1, which increased to SMD = 0·59 (95% CI −0·37–1·55) by the end of a four-week course. Re-analysis across varying treatment durations (1–4 weeks) continued to demonstrate a significant effect of Time in favour of ECT (β = 0·017–0·038; see Table S10). A cubic spline non-linear meta-regression again found statistically significant spline terms for Week 1 (β W1 = 0·392; p = 0·0028; 95% CI 0·135–0·649) and Week 3 (β W3 = 0·663; p = 0·0022; 95% CI 0·239–1·086). The Week 4 spline term marginally exceeded the significance threshold (β W4 = 0·278; p = 0·0646; 95% CI −0·017–0·573). The Week 2 spline term was not significant (β W2 = 0·225; p = 0·12; 95% CI −0·058–0·507). This pattern of results is consistent with results from analyses that included Basso, et al. [15]. To assess the robustness of our results to potential within-study correlation among repeated measures, we re-ran the models using alternative covariance structures (Compound Symmetry (CS) and First-Order Autoregressive (AR1)) and a range of plausible intra-study correlation values (ρ = 0·0–0·9). Across all models, the effect of Time remained statistically significant, with negligible variation in effect size estimates (β = 0·015–0·016; see Table S11).

Discussion

ECT and ketamine are highly effective and rapidly acting treatments for depression, including severe and/or difficult-to-treat subtypes. Several meta-analyses have attempted to synthesise the comparative efficacy of these interventions from the available RCTs and observational studies, drawing disparate conclusions. The present meta-regression incorporated the critical variable of time and demonstrated 1) a significantly faster rate of improvement in depressive symptoms with ECT compared to ketamine (~0·02 SMD per day), accumulating to a moderate effect size difference (SMD = 0·59) over a 4-week treatment course; and 2) a bias in participant baseline depression severity, where participants in the ketamine arm started with less severe symptoms at the outset (SMD = −0·28). This baseline imbalance is a new finding not hitherto acknowledged in the literature. By adjusting for time and baseline severity, this meta-analysis provides greater clarity on the conflicting results in the literature regarding the comparative efficacy of ECT and ketamine.

Baseline differences between treatment groups play a crucial role in interpreting the outcomes of comparative studies. Although statistical comparisons between RCT treatment arms at baseline are not necessary due to the randomisation procedure, which theoretically ensures any differences are due to chance [33,34,35,36], it is nevertheless recommended that relevant baseline variables are incorporated into statistical testing as covariates to improve the accuracy of effect estimates. The relevance of each variable in this case is determined according to whether the baseline factor is a known or suspected prognostic indicator for the outcome of interest [36, 37]. A lower pre-treatment depression severity score is indeed a strong predictor of antidepressant efficacy for several treatment modalities [24, 38], including ECT [39] and ketamine [40]. Therefore, there is a compelling argument that baseline depression severity should be accounted for in meta-analyses [41], as is recommended for analyses of RCTs [42]. Our finding of a significant small-sized (SMD = −0·28) difference in baseline depression scores between ECT and ketamine treatment arms may explain some of the disagreement in prior meta-analyses. For example, Rhee, et al. [17] calculated mood change scores at the study endpoint, thereby incorporating baseline severity, and concluded a significant benefit of ECT over ketamine, whereas Moreira, et al. [20] extracted endpoint scores without adjustment and reported no difference between interventions.

It is unclear whether baseline differences reflect a chance occurrence or indicate a systematic bias. In our analysis, RCTs with larger sample sizes [13, 14], and therefore reduced susceptibility to random fluctuations at baseline, showed smaller differences (SMD = −0·23; and SMD = −0·05). Conversely, there is reason to suspect the possibility of a bias. ECT continues to be a highly stigmatised therapy [43]. Notably, in the largest RCT (ELEKT-D), 20% of participants assigned to ECT withdrew prior to starting treatment, whereas only 2·5% of those allocated to ketamine withdrew [14]. Speculatively, individuals with more severe depression may be more inclined to adhere to ECT despite its stigma, while those with less severe symptoms may be more likely to withdraw prior to starting treatment if randomised to ECT, thus generating the observed bias. Future comparative effectiveness trials of ECT and ketamine should consider measuring patient preference and treatment expectations prior to study commencement to assess the possible influence of stigma on outcomes.

By meta-regressing depression scores over time, incorporating baseline severity as a covariate in our analysis, and enhancing statistical power through the collection of data from multiple assessment periods, we detected a significant effect of time favouring ECT. The benefit is estimated at SMD = 0·02 per day, accumulating to SMD = 0·59, or 3·86 points on the HDRS-17 using converted depression scores, by the end of a four-week treatment course (Fig. 3). Expert consensus suggests a range of 3–6 points on the HDRS-17 corresponds to a clinically meaningful difference in depression outcomes between two treatments [44,45,46]. ECT’s advantage over ketamine is at the lower end of this range, suggesting a clinically meaningful difference over a four-week treatment period. Importantly, although the four-week SMD suggests a moderate effect size advantage for ECT over ketamine, the uncertainty in the effect estimate (i.e., wide confidence interval) does not preclude equivalence between treatments, nor the potential for a large effect size difference. Furthermore, as the longest treatment course included in the present meta-analysis only extends to four weeks, it is unclear whether this trend would continue with longer treatments, in which case ECT may offer greater benefits over extended durations.

Over the 4-week treatment period, ECT showed an overall greater rate of improvement than ketamine (SMD = 0·02 per day). This does not contradict clinical observations and meta-analytic evidence suggesting that ketamine produces a “remarkable rapid-onset antidepressant effect” [47] within hours of the first administration [4, 48]. Our findings are consistent with the interpretation that ketamine provides a rapid early antidepressant effect within the first few hours, but that ECT catches up quickly, resulting in a negligible difference in symptom improvement in the days following the first treatment.

The use of converted HDRS-17 scores in this study allows for a visual comparison of intervention effects between studies for ketamine and ECT (Fig. 1c). Although not formally analysed, these figures reveal marked differences between studies, with potentially greater variability observed in ECT outcomes between studies (Fig. 1a), whereas ketamine studies included in this analysis appear relatively homogeneous (Fig. 1b). Most racemic ketamine interventions used intravenous administration, with only one instance of intramuscular delivery (Table 1), and all used a dose of 0·5 mg/kg, with one study permitting dose modification if clinically indicated [14]. In contrast, ECT protocols varied considerably in electrode placement, dosage, and pulse width, which are known to influence treatment outcomes [49, 50]. For example, Anand, et al. [14] primarily used ultrabrief pulse right unilateral ECT, a method associated with lower efficacy compared to brief pulse ECT [51,52,53], and reported less improvement with ECT than other studies in this review. This heterogeneity between studies is reflected in the visibly wider confidence intervals for ECT in Fig. 1c.

Limitations

Only six RCTs and one cohort study met eligibility criteria for quantitative meta-analysis. This meta-analysis and others highlight substantial heterogeneity in effects between these studies. Indeed, the two largest trials show a marked disagreement in outcomes [13, 14], partly due to differences in stimulation parameters (e.g., ultra-brief pulse vs brief pulse ECT), as well as due to patient-specific factors (e.g., comorbidities and medication use) [54]. This is a limitation common to all meta-analyses, particularly during early stages where a limited number of studies are available to investigate the influence of contributors to heterogeneity through the incorporation of effect modifiers in statistical analyses. Additional studies are needed to improve the precision of meta-regression effect estimates and to enable further analyses of moderators of comparative efficacy (e.g., ECT stimulation parameters, ketamine dose titration, or population sub-groups).

This meta-analysis uses accepted methods to convert between commonly used depression scales [28, 29], enabling visualisation of the time course of treatment effects for the ECT and ketamine arms of the included studies (Fig. 1). While these transformations may introduce additional variability into the source data used for arm-specific visualisations, it is important to note that the primary analyses – assessing the SMD in comparative depression severity scores between ECT and ketamine – were conducted using the original depression scales and therefore do not have this limitation.

Lastly, the studies included in this analysis, with the exception of Basso, et al. [15], are RCTs conducted under a tightly regulated research setting. As such, they have protocols that do not comprehensively mirror real-world clinical practice, in which treatment is adapted flexibly to optimise individual patient response. In practice, this flexibility extends to the number of treatments in a course, for both ECT and ketamine, whereas RCTs prescribe a set number.

Conclusions

This meta-analysis revealed a baseline difference in depression severity between ECT and ketamine treatment arms, finding lower pre-treatment depression scores for participants receiving ketamine. Compared to prior meta-analyses, our analysis accounted for baseline severity, incorporated data from all available mood assessments, and statistically modelled treatment effects over time. Our meta-regression analysis found that ECT offers a moderate advantage in efficacy over ketamine (SMD = 0·02 per day, equating to SMD = 0·59 over a 4-week treatment course). This advantage is at the lower end of the range considered a clinically meaningful benefit at the end of four weeks of treatment. Future RCTs should incorporate baseline assessments of patient preference and treatment expectations to determine whether patient perceptions of their allocated intervention influence their likelihood of withdrawing from treatment.