Introduction

Follicular lymphoma (FL) has an incidence of 20–25% worldwide and is the most common subtype of indolent B-cell lymphoma [1]. Despite its indolent nature, 15–20% of FL cases follow a more aggressive course, particularly those with progression of disease within 24 months (POD24). This subpopulation typically requires multiple lines of therapy, including autologous stem cell transplant [2]. Unfortunately, both progression-free survival (PFS) and overall survival (OS) decline markedly with each subsequent line of treatment. This was demonstrated by Batlevi et al., who reported a stepwise reduction in median PFS and OS with each successive line of chemotherapy [3]. Similarly, a multicentre study from the USA reported considerable variability in the types of therapies used in the third line or later, with a 2-year PFS of 40%; lymphoma remained the main cause of death (17% at 5 years) after a median follow-up of 71 months [4]. Both studies highlight the limitations of conventional immunochemotherapies in achieving and maintaining disease control.

The treatment paradigm of relapsed or refractory (R/R) FL has shifted remarkably since the introduction of chimeric antigen receptor T-cell therapy (CAR-T) and bispecific antibodies (BsAb), both of which have demonstrated unprecedentedly high metabolic complete response rates (CRR) in the third line (3L) or beyond (3L+) setting [5,6,7,8,9,10]. However, in their single-arm Phase 2 trials, these treatment modalities have been associated with significant adverse events, particularly cytokine-release syndrome (CRS), neurological toxicities such as immune effector cell-associated neurotoxicity syndrome (ICANS), and infective complications.

In the absence of Phase 3 randomised controlled trials comparing these two T-cell-mediated therapies, the clinical decision-making process in the 3L or beyond R/R FL setting remains challenging. While awaiting the availability of more Phase 3 clinical trials data [11], our group conducted this systematic review and comparative meta-analysis of all currently available Phase 1/2 CAR-T and BsAb trials in R/R FL as 3L+ treatment with the aim of comparing their relative strengths and weaknesses, focusing on their efficacy and safety profiles.

Materials and methods

Search strategy and study selection criteria

This systematic review and meta-analysis was conducted in accordance with the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols (PRISMA-P) [12] and the Meta-Analysis Of Observational Studies in Epidemiology guidelines [13], and was registered on the PROSPERO database (PROSPERO no: CRD42024608398). A comprehensive systematic search of Medline/PubMed, Embase and the Cochrane Library was conducted. To maximise the sensitivity of the search strategy, combinations of free text and medical subject headings were used. For example, the patient population was searched with ‘lymphoma, follicular’[Mesh] OR ‘lymphoma, b-cell’[MeSH Terms], relapsed refractory [MeSH Terms], AND the intervention with ‘chimeric t cell receptor’ [MeSH Terms] OR ‘CAR-T’, ‘antibodies, bispecifics’[MeSH Terms]. Full search terms used are summarised in Table 1. The search was also expanded by reviewing the reference lists of the initially included studies for further relevant articles.

Table 1 List of key eligibility criteria for studies to be included.

Eligibility criteria

Searches were conducted from 1 January 2010 to 30 June 2025, as clinical trials of CAR-T and BsAb started only after 2010. The study eligibility criteria were defined a priori. The population comprised adult (≥18 years) subjects with FL who had received ≥2 prior lines of therapy. Interventions of interest were anti-CD19 CAR-T therapy and CD20×CD3 BsAb treatment. Prospective interventional clinical trials that evaluated the efficacy of CD20×CD3 BsAb or anti-CD19 CAR-T therapy for R/R FL were included. Phase 1 dose escalation studies used to determine the recommended Phase 2 dose were also included. Studies with the following characteristics were excluded: (1) evaluated the therapy as first or second line; (2) studied a paediatric population; (3) assessed dual-targeting CAR-T or BsAb, or CAR-T or BsAb directed at other targets. Table 1 summarises the study eligibility criteria used.

Screening search results

The study selection process occurred in two phases. Level 1 screening: titles and abstracts of studies identified from the electronic database searches were double-screened in a blinded fashion by two researchers, independently and in parallel, to determine eligibility according to the eligibility criteria displayed in Table 1. Disagreements between any two research members were reviewed by a third reviewer for consensus. Level 2 screening: Full texts of studies selected in level 1 were retrieved and reviewed to determine eligibility in a similar manner as in level 1 screening. The final selected studies were identified for data extraction and further appraisal. The inclusion and exclusion processes were thoroughly documented, including completion of a PRISMA flow chart (Fig. 1).

Fig. 1: PRISMA inclusion/exclusion process.

Flow chart of the final selected studies is shown.

Data extraction and risk of bias appraisal

For studies that fulfilled the inclusion criteria, data were extracted by the research team (PLM, LXH, LCKN, TYC and MS) and tabulated to summarise key findings. Baseline data extracted from each study were: study characteristics (author and year of publication, study design, timing of study and location) and patient and disease characteristics (including age, stage of disease, percentage of patients with POD24, double refractory disease, disease refractory to the last line of treatment, FLIPI scores [14], disease bulk, lines of chemotherapy given, and percentage who had received prior hematopoietic stem cell transplantation (HSCT) or CAR-T (for the BsAb group)). Efficacy measurements of interest included: overall response rates (ORRs); complete response rates (CRRs) (for BsAb trials, CRRs in the CAR-T naïve population were also collected); 1-year PFS; and 1-year OS. Safety outcomes of interest were grade ≥3 adverse events, including CRS, ICANS and infection. Data were extracted by one reviewer from full-text versions of studies, where available, or from conference abstracts, and were then quality-checked by a second reviewer. Discrepancies between the two investigators were resolved by discussion and consensus. Risk of bias in individual studies was evaluated independently by two reviewers (PLM and LCKN) using the cohort checklist of the Newcastle-Ottawa quality assessment scale (NOS) [15].

Data synthesis and analysis

Results of this systematic literature review were summarised qualitatively and quantitatively as appropriate. Although we had planned to calculate risk ratios (RR) and 95% confidence intervals (CI), our search did not identify any comparative trials with which to perform such an analysis. Therefore, evidence from single-arm studies was summarised using descriptive summary statistics. When the demographics of the study populations and the inclusion/exclusion criteria were relatively similar between the single-arm cohort studies, a meta-analysis of proportions (expressed as percentages), with their 95% CI, was performed using Review Manager (RevMan), version 8.1.1 (available at revman.cochrane.org). To account for the heterogeneity anticipated among the included studies, transformed proportions were combined using DerSimonian-Laird random effects models.
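As an illustration of the pooling step described above, the following is a minimal sketch of a DerSimonian-Laird random effects meta-analysis of proportions. It is not the RevMan implementation. The text does not state which transformation was applied, so the logit transform and the 0.5 continuity correction used here are assumptions, and the study counts are hypothetical values chosen only to make the example run.

```python
import math

def logit_with_variance(events, total):
    """Logit-transformed proportion and its approximate variance.

    Assumption of this sketch: a 0.5 continuity correction is applied
    when a study has 0% or 100% events, so the logit stays finite.
    """
    if events == 0 or events == total:
        events, total = events + 0.5, total + 1.0
    p = events / total
    return math.log(p / (1.0 - p)), 1.0 / events + 1.0 / (total - events)

def dersimonian_laird(effects, variances):
    """Pool study effects with the DerSimonian-Laird estimator of tau^2."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random effects weights add the between-study variance to each study
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return pooled, se, tau2

def inverse_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def pool_proportions(studies):
    """studies: list of (events, total) pairs, one per single-arm trial."""
    transformed = [logit_with_variance(e, n) for e, n in studies]
    effects = [t[0] for t in transformed]
    variances = [t[1] for t in transformed]
    pooled, se, tau2 = dersimonian_laird(effects, variances)
    z = 1.959964  # two-sided 95% normal quantile
    return {
        "pooled": inverse_logit(pooled),
        "ci_low": inverse_logit(pooled - z * se),
        "ci_high": inverse_logit(pooled + z * se),
        "tau2": tau2,
    }

# Hypothetical (responders, patients) counts, not taken from any included trial
hypothetical_studies = [(45, 50), (80, 100), (27, 30)]
result = pool_proportions(hypothetical_studies)
print({name: round(value, 3) for name, value in result.items()})
```

Pooling on the logit scale and back-transforming keeps the confidence limits within 0–100%, which is why the resulting intervals are generally asymmetric around the pooled percentage.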

To assess whether effect sizes were consistent across the included studies, heterogeneity was quantified using the I² statistic, which estimates the proportion of variability between studies that is not explained by chance: 0% indicates that any variability is due to chance, whilst higher I² values (>50%) indicate increasing levels of unexplained variability. I² values of 25%, 50% and 75% correspond to low, moderate and high degrees of heterogeneity, respectively. Differences between subgroups were tested using the chi-square test, as reflected in the reported p values.
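The heterogeneity statistic described above can be sketched as follows, using the standard definition of I² as the excess of Cochran's Q over its degrees of freedom, expressed as a percentage of Q. The function names and the input values are hypothetical and serve only as an illustration. The effects are assumed to be on whatever transformed scale is used for pooling.

```python
def cochran_q(effects, variances):
    """Cochran's Q: inverse-variance weighted squared deviations
    of each study effect from the fixed-effect pooled estimate."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))

def i_squared(q, num_studies):
    """I^2 in percent: share of total variability beyond chance.
    Negative values are truncated to 0."""
    df = num_studies - 1
    if q <= 0 or df <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Hypothetical transformed effects and within-study variances
effects = [2.2, 1.4, 2.6, 1.9]
variances = [0.20, 0.06, 0.35, 0.10]
q = cochran_q(effects, variances)
print(f"Q = {q:.2f}, I2 = {i_squared(q, len(effects)):.1f}%")
```

Because I² is truncated at 0%, a value of 0% does not prove that the studies are homogeneous, particularly when only a few studies contribute to the pooled estimate.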

We carried out the outcome analyses, as far as possible, on an intention-to-treat basis, meaning we included all participants allocated to each group in the analyses, regardless of whether they received the allocated intervention or not. Missing data and drop-outs/attrition for each study were considered while evaluating the ‘Risk of bias’, and the extent to which the missing data could alter the results/conclusions of the review is discussed in this review [16].

Results

Literature search

A total of 3960 records were identified from database searches (Medline = 1197, Embase = 2415, The Cochrane Library = 348), of which 3459 were eligible for screening after removal of duplicates. Of these, 108 potentially eligible studies were retained for detailed review, and 12 of them met the eligibility criteria and were included in this review (CD19 CAR-T = 7; CD20×CD3 BsAb = 5). Details of the PRISMA inclusion/exclusion process are presented in Fig. 1.

Study characteristics

This review did not identify any Phase 3 randomised controlled trials comparing CAR-T therapy with BsAb in patients with R/R FL with a high risk of disease progression. Of the 12 trials, nine were Phase 2 single-arm studies (four of them combined Phase 1/2), two were Phase 1 studies and one was a prospective case series. A total of 795 patients were included in the analysis: 398 patients received CAR-T and 397 patients received BsAb, with the sample sizes of the studies ranging from 8 to 128 patients. CAR-T products from the 7 clinical trials include: Tisagenlecleucel (Tisa-cel) (Fowler et al. [17], Dreyling et al. [6]), Axicabtagene ciloleucel (Axi-cel) (Jacobsen et al. [36], Neelapu et al. [5]), Lisocabtagene maraleucel (Liso-cel) (Morschhauser et al., 2024), Relmacabtagene autoleucel, CD19 CAR-T with a fixed CD4:CD8 ratio of 1:1, utilising the 4-1BB co-stimulatory domain (later developed as Liso-cel) (Hirayama et al. [18]), CTL019 (later developed as Tisa-cel) (Schuster et al. [19], Chong et al. [20]) and the FMC63-derived single-chain variable fragment moiety (scFv) CD19-reactive CAR with a CD28 co-stimulatory receptor (Fried et al. [21]). BsAb trials include: two studies of Odronextamab (Bannerji et al. [22], Kim et al. [10]), one study of Epcoritamab (Linton et al. [9]) and two studies of Mosunetuzumab (Budde et al. [23] and Sehn et al. [8], Goto et al. [24]). Median age of patients ranged from 53 to 72 years. Median follow-up ranged from 4 months to 42 months. Details of the included studies are summarised in Table 2.

Table 2 Characteristics of the included studies.

Risk of bias assessment

Given the lack of quality assessment tools for single-arm trials, risk of bias was evaluated using the cohort checklist of the Newcastle-Ottawa scale (NOS) [15]. Key domains were assessed individually for each study, including selection of the study groups and ascertainment of either the exposure or the outcome of interest. Studies were then graded according to the NOS. This quality assessment is described in Table 3.

Table 3 Risk of bias assessment.

Efficacy outcomes

Response rates (RR)

All 12 included studies reported ORR and CRR. The pooled ORR was higher in patients treated with CAR-T therapy, at 93% (95% CI: 88–97%), than in patients treated with BsAb, at 82% (95% CI: 78–86%), p = 0.0002. The difference in treatment benefit was 11% (95% CI: 7–16%) in favour of CAR-T therapy. Similarly, the pooled CRR was higher with CAR-T therapy, at 82% (95% CI: 73–91%), than with BsAb, at 67% (95% CI: 61–73%), p = 0.005, a 15% (95% CI: 9–21%) greater CRR benefit with CAR-T. However, in the higher-risk group of patients who suffered progression within 24 months (POD24), the difference in treatment benefit fell to only 6% (95% CI: 0–13%), p = 0.56: the pooled CRR for the POD24 cohort was 75% (95% CI: 58–92%) from 4 CAR-T trials, compared to 69% (95% CI: 61–77%) from 3 BsAb trials (Fig. 2A–C).
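For readers who wish to sanity-check a between-group comparison such as the ORR result above, the sketch below applies a one-degree-of-freedom chi-square test for subgroup differences to two pooled estimates. It is an approximation, not the analysis performed in RevMan: standard errors are back-calculated from the rounded, published 95% CIs under an assumption of symmetry on the percentage scale, and the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile

def se_from_ci(lower, upper):
    """Approximate standard error from a 95% CI, assuming symmetry."""
    return (upper - lower) / (2.0 * Z_95)

def compare_two_subgroups(est_a, ci_a, est_b, ci_b):
    """Chi-square test (df = 1) for a difference between two pooled estimates.

    Returns the difference, the chi-square statistic and its p value.
    For df = 1 the chi-square tail equals erfc(sqrt(x / 2)).
    """
    se_diff = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    diff = est_a - est_b
    chi2 = (diff / se_diff) ** 2
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return diff, chi2, p_value

# Pooled ORR (%) and 95% CI as reported in the text: CAR-T vs BsAb
diff, chi2, p_value = compare_two_subgroups(93, (88, 97), 82, (78, 86))
print(f"difference = {diff} points, chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Because the inputs are rounded and the original intervals were derived on a transformed scale, the output should only be expected to land near the reported p value, not to reproduce it exactly.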

Fig. 2: Meta-analysis of response rate.

A Overall response rate (ORR), B complete response rate (CRR), C CRR in patients with POD24.

Progression-free survival (PFS)

There was significant heterogeneity among studies in terms of the time points used to assess PFS. All 12 studies reported 6-month PFS; however, not all trials reported PFS beyond this period. At 6 months, the pooled PFS estimate was 85% (95% CI: 80–90%) in patients treated with CAR-T, compared to 74% (95% CI: 68–80%) in those given BsAb, p = 0.006. This translates to an 11% (95% CI: 5–17%) higher PFS with CAR-T compared to BsAb at 6 months. At 1 year (11 studies: 6 CAR-T, 5 BsAb), the pooled PFS rate remained higher with CAR-T, at 74% (95% CI: 68–80%), versus 62% (95% CI: 57–66%) with BsAb, a risk difference of 12% (95% CI: 5–18%), p = 0.002.

Fewer studies reported PFS at 2 years (8 studies: 4 each for CAR-T and BsAb). In these single-arm trials, the pooled 2-year PFS was 62% (95% CI: 54–71%) with CAR-T compared to 47% (95% CI: 42–53%) with BsAb, a difference of 15% (95% CI: 8–22%), p = 0.004. Three-year PFS was reported in only two CAR-T and three BsAb studies; the pooled estimates were 54% (95% CI: 47–60%) and 42% (95% CI: 36–48%), respectively (risk difference 12%; 95% CI: 3–21%; p = 0.009) (Fig. 3A–D).

Fig. 3: Meta-analysis of progression-free survival (PFS).

A 6-month PFS, B 1-year PFS, C 2-year PFS, D 3-year PFS.

Overall survival (OS)

Patients treated with CAR-T again showed a potential OS benefit over those treated with BsAb. However, compared to RR and PFS, this difference in OS was smaller. Pooled 6-month, 1-year, 2-year and 3-year OS for CAR-T compared to BsAb were 98% (95% CI: 96–100%) vs 90% (95% CI: 86–94%), p = 0.002; 94% (95% CI: 92–96%) vs 87% (95% CI: 81–94%), p = 0.05; 87% (95% CI: 83–91%) vs 76% (95% CI: 62–90%), p = 0.13; and 80% (95% CI: 74–85%) vs 73% (95% CI: 54–91%), p = 0.48, respectively. Notably, the number of studies contributing to the OS analysis decreased with longer follow-up durations (Fig. 4A–D).

Fig. 4: Meta-analysis of overall survival (OS).

A 6-month OS, B 1-year OS, C 2-year OS, D 3-year OS.

Toxicity outcomes

Grade ≥3 toxicities, including CRS, ICANS and infections, were assessed across studies. The pooled incidence of grade ≥3 CRS was low for both modalities: 3% (95% CI: 0–6%) for CAR-T and 4% (95% CI: 1–7%) for BsAb, p = 0.75. The pooled rate of grade ≥3 ICANS was notably higher in CAR-T studies at 8% (95% CI: 1–15%), compared to 0% (95% CI: 0–2%) in BsAb studies, p = 0.04. In contrast, CAR-T therapies demonstrated a numerically lower pooled rate of severe infections (grade ≥3) compared to BsAb [9% (95% CI: 4–15%) versus 17% (95% CI: 3–31%), p = 0.31]. One-year non-relapse mortality (NRM) was similar between groups, with a pooled rate of 3% (95% CI: 1–5%) for both CAR-T and BsAb therapies, p = 0.92 (Fig. 5A–D).

Fig. 5: Meta-analysis of grade ≥3 toxicities and non-relapse mortality (NRM).

A Rate of ≥ grade 3 CRS, B rate of ≥ grade 3 ICANS, C rate of ≥ grade 3 infection, D 1-year NRM.

Discussion

This systematic review and meta-analysis of data from single-arm CAR-T and BsAb trials suggests that CAR-T therapies may be more effective than BsAb in the third line and beyond setting for adult patients with R/R FL. Pooled analyses of non-comparative trials showed that patients treated with CAR-T had higher ORR and CRR compared to those who received BsAb. In the high-risk POD24 subgroup, the benefit of CAR-T-cell therapy was less pronounced, with CAR-T and BsAb demonstrating comparable efficacy (CRRs of 75% and 69%, respectively; p = 0.56). The higher ORR and CRR in patients treated with CAR-T translated to consistently higher PFS across time points, from 6 months to 3 years. However, the difference in OS was smaller and was not statistically significant at 2 and 3 years.

While CAR-T-cell therapies achieved better ORR, CRR and PFS, they were also associated with higher rates of ICANS compared to BsAb. This may be driven in part by the ZUMA-5 trial, which reported a severe (grade ≥3) ICANS rate of 15%. This has been postulated to be related to the use of the CD28 co-stimulatory domain in this product [25]. Additionally, some of the CAR-T studies included in our analysis were conducted during the early clinical development of CAR-T therapies, when management strategies for ICANS were still evolving and potentially suboptimal. Since then, several clinical practice guidelines have been developed to reduce the incidence of severe ICANS, such as the early prophylactic use of high-dose corticosteroids (e.g. dexamethasone) and cytokine antagonists (e.g. anakinra) [26, 27]. In contrast, severe ICANS was rarely observed with BsAb. As such, BsAb may represent a safer treatment option for outpatient administration.

However, BsAb appear to be associated with a potentially higher risk of severe infective complications, with numerically higher pooled rates of ≥ grade 3 infections than CAR-T therapy, although this difference was not statistically significant (p = 0.31). This could possibly be attributed to prolonged B-cell suppression with BsAb, as they are typically administered over a longer period compared to CAR-T. The duration of BsAb therapy varies by agent and clinical response, ranging from 8 to 17 months with mosunetuzumab to continuous treatment approaches with agents like epcoritamab and odronextamab. Our findings are generally in agreement with a recent meta-analysis focusing specifically on the infective complications of BsAb and CAR-T [28], which also showed that both all-grade and ≥ grade 3 infections per patient-month were significantly higher in patients receiving BsAb compared to those treated with CAR-T. Moreover, continuous-treatment BsAb were associated with higher rates of ≥ grade 3 infections than fixed-duration ones.

In BCMA-targeted BsAb, preventive strategies such as vaccination and administration of intravenous immunoglobulin have been shown to help mitigate the risk of infections [29]. Furthermore, fixed-duration therapy, as seen with mosunetuzumab, may also reduce infective risk by allowing immune reconstitution after treatment cessation. A possible confounding factor behind the higher infection rate reported with BsAb is the timing of some BsAb trials, many of which took place during the early phases and peak of the COVID-19 pandemic. Importantly, none of the toxicities discussed led to excessive NRM, with 1-year NRM rates remaining below 5% for both treatment modalities.

These findings in FL are similar to a previous meta-analysis comparing CAR-T and BsAb in the treatment of relapsed/refractory aggressive large B-cell lymphoma, which also demonstrated that CAR-T was superior in terms of efficacy, particularly in CRR and 1-year PFS [30]. However, in that study, the toxicity profile favoured BsAb, showing lower rates of ≥ grade 3 CRS, ICANS and infections compared to CAR-T therapy.

Despite the encouraging findings, several critical factors and limitations should be considered when interpreting our results. First, this study represents a secondary analysis based on published data from prospective single-arm trials. Consequently, the analysis is subject to heterogeneity in study designs and enrolled patient populations. Due to the lack of individual patient data (IPD), propensity score matching could not be performed. Although efforts were made to minimise heterogeneity by restricting inclusion to studies evaluating third-line therapy and beyond, inherent differences among studies should still be acknowledged.

Secondly, due to significant heterogeneity within each treatment group, we were unable to conduct subgroup analyses beyond the CRR in the POD24 subgroup, one of the most frequently reported high-risk cohorts. Even so, the definition of POD24 differed slightly among the included studies (summarised in Table 2). While 4 of the 5 BsAb trials included patients who had failed CAR-T, the outcomes of these patients were not reported separately, precluding subgroup analyses for this cohort. However, given that these patients contributed a very small proportion of the overall trial population (≤5% for all except one study), it is unlikely that they would confound or affect the overall findings of this meta-analysis. Additionally, as the reported median duration of follow-up varied across studies, with some studies having a relatively short follow-up, the pooled analyses for 2-year and 3-year PFS and OS may not be fully representative of all studies. Given that FL is typically an indolent disease with a prolonged clinical course, a 3-year follow-up may be insufficient to detect differences in OS. Furthermore, the availability of multiple subsequent therapies (including switching to a CAR-T or BsAb not previously used, as well as targeted agents like tazemetostat and zanubrutinib) may further influence long-term outcomes [31, 32].

Thirdly, our analysis mainly focuses on efficacy and toxicity data between CAR-T and BsAb. Other important factors, such as logistics, frequency of clinic visits, total duration of active therapy and cost, were not compared here. Although important, these outcomes were not frequently reported in trials to allow for a systematic review and comparison. In general, CAR-T therapies involve complex logistics, including apheresis, manufacturing and potential inpatient care for acute toxicities, as well as post-treatment monitoring requirements such as proximity to treatment centres and driving restrictions. In contrast, BsAb, being off-the-shelf products, avoid many of these logistical challenges but require prolonged administration and frequent clinic visits.

Fourthly, a key consideration often under-reported in clinical trials is patient-reported outcomes and health-related quality of life, which were not reported in all the studies we included here. These factors may significantly influence patient preferences. A study led by the Lymphoma Epidemiology of Outcome Consortium found that patients with FL value a holistic and individualised approach to treatment that minimises side effects, fits their lifestyle and reduces overall treatment burden [33]. While both patients and physicians prioritise disease control and survival, patients are often more willing to trade some degree of efficacy for convenience, reduced toxicity and fewer monitoring visits [34, 35].

Finally, our study is unable to address the optimal sequencing of therapies, as data on PFS2 and OS2 were not available. It is likely that CAR-T and BsAb both serve important roles at different stages of FL management, rather than being mutually exclusive options. Ongoing clinical trials are evaluating both modalities in earlier lines of therapy, which may reshape future treatment sequencing.

In conclusion, this pooled analysis suggests that CAR-T therapy may be more effective and provide more durable remissions than BsAb in the third line and beyond setting for adult patients with R/R FL. Toxicity profiles vary not only between the two modalities but also among the different agents within each category. Future comparative randomised controlled clinical trials are essential to determine the optimal sequencing and integration of these treatments in the evolving therapeutic landscape.