Introduction

Multiple myeloma (MM) has long been thought to be an incurable disease; contemporary MM therapy involving prolonged maintenance therapy has challenged this assertion by leading to more exceptional responders, though the indefinite nature of maintenance makes identifying a cure even more challenging. Measurable residual disease (MRD), referring to low levels of cancer cells only detectable by advanced assays, serves as an important prognostic marker for progression-free survival (PFS) and overall survival (OS) in MM, with each increase in MRD sensitivity threshold leading to improved prognostication [1]. The depth of MRD testing is not the only factor. While a single instance of MRD negativity may be prognostic, sustained MRD negativity (defined by the International Myeloma Working Group [IMWG] as two consecutive MRD negative results at the 10⁻⁵ threshold performed at least one year apart) carries even greater significance with respect to outcomes [2,3,4].

Maintenance therapy in MM – typically with lenalidomide – can deepen responses after completion of induction and/or consolidation with autologous stem cell transplantation (ASCT), including converting patients with MRD positivity to MRD negativity [5,6,7]. Meta-analyses have shown that lenalidomide maintenance is associated with an OS benefit [8, 9], though some of these studies pre-date widespread use of triplet induction with a proteasome inhibitor and an immunomodulatory imide drug and all pre-date the incorporation of anti-CD38 monoclonal antibodies into frontline triplet or quadruplet therapies. With further improvements in the depth and duration of response using contemporary MM agents, should indefinite maintenance still be the standard of care? As of now, maintenance therapy has no established end date, may impact health-related quality of life (HRQoL) due to ongoing treatment-related side effects, may increase the risk of second cancers [10], and carries significant financial burden for patients and payors. For these reasons, there is emerging interest in discovering if patients can undergo maintenance cessation without excess risk of disease re-emergence based on the attainment of (sustained) MRD negativity.

In this prospective intervention study called MRD2STOP (NCT04108624), we evaluated the outcomes of patients with MM with sustained multimodal MRD negativity who underwent discontinuation of maintenance therapy.

Methods

Study design and participants

This was a prospective, pragmatic, single-institution trial that enrolled patients aged 18 or older with MM. Eligibility criteria for discontinuation included patients with MM who had received at least one year of maintenance, who were previously known to be in at least a complete response (CR) by IMWG criteria with no detectable disease by positron emission tomography (PET), and MRD negative by multiparametric flow cytometry (MFC) or next generation sequencing (NGS) at a sensitivity of at least 10⁻⁵ in the bone marrow (BM). Patients had to be receiving only single-agent maintenance therapy at the time of screening, though the 1-year duration of maintenance therapy could include time spent receiving multi-agent maintenance prior to de-escalation to single-agent maintenance therapy. Given the intentional pragmatic nature of this study, there were no stipulations regarding the type, duration, or intensity of induction or consolidation.

Patients meeting these criteria were screened with peripheral blood (PB) testing for paraproteins, a PET scan, and a BM biopsy and aspiration with MRD evaluation by MFC (limit of detection [LoD] 10⁻⁵) and by NGS using clonoSEQ (Adaptive Biotechnologies, LoD of 6.8 × 10⁻⁷ and limit of quantitation of 1.76 × 10⁻⁶ with 20 μg [~3 million cells] input). Discontinuation of therapy was permitted only for those who were MRD negative by PET, MFC, and NGS at the 10⁻⁶ threshold.

After discontinuation of therapy, patients underwent PB testing every 3 months (serum protein electrophoresis, immunofixation, and serum-free light chains), BM MRD testing by MFC and NGS every year, and PET imaging every year for 3 years.

All patients provided written informed consent; the study was conducted in accordance with the US Food and Drug Administration, the International Conference on Harmonization Guidelines for Good Clinical Practice, and the Declaration of Helsinki. The study was approved by the University of Chicago institutional review board (#19-0339) and was registered at ClinicalTrials.gov (NCT04108624).

Additional evaluations

Additional assessments were performed in the BM. At screening and at each timepoint, BM aspirate samples were also collected for CD138+ immunomagnetic enrichment using EasySep (Stem Cell Technologies, Vancouver, Canada) for the performance of clonoSEQ to achieve an approximate sensitivity of 10⁻⁷. A first-pull aspirate of 10 cc was divided into 2 cc for clonoSEQ and 8 cc for CD138+ selection. A fraction of the CD138+-selected sample was analyzed to target 10⁻⁷ sensitivity. To validate the results, the remaining fraction was also analyzed. Investigators and patients were blinded to individual 10⁻⁷ results at screening. HRQoL was assessed serially every 6 months for a subset of patients using the validated patient-reported outcome (PRO) survey tools EORTC QLQ-C30 and EORTC QLQ-MY20.

Study endpoints

The endpoints of interest in this study included the rate of MRD 10⁻⁶ resurgence (including progression) and PFS among those MRD negative by the standard non-enriched NGS assay (10⁻⁶) and the CD138+ enriched NGS assay (10⁻⁷). MRD resurgence was defined as MRD detectable at the 10⁻⁶ threshold. If a serial sample was indeterminate at the 10⁻⁶ threshold due to insufficient sample, it was not considered MRD resurgence. We also followed patients for overall survival. Secondary endpoints included the feasibility of the CD138+ enrichment method and the differences in MRD detection by NGS using both methods. Longitudinal changes in PRO scores, including global health status and within specific functional and symptom HRQoL domains, were an exploratory endpoint.

MRD sensitivity thresholds

MRD by NGS using the standard clonoSEQ assay carries an LoD of 6.8 × 10⁻⁷ and a limit of quantitation (LoQ) of 1.76 × 10⁻⁶ with an input of 20 µg DNA. For the CD138+ enriched sample processed using the clonoSEQ assay, feasibility was defined as at least 80% of samples having at least 20 million nucleated cells in the raw aspirate sample.
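The feasibility definition above is a simple proportion check, which can be sketched as follows. This is an illustrative sketch only: the function name, constants, and example counts are assumptions introduced here and are not study data or the study's analysis code.

```python
from typing import Sequence

# Thresholds taken from the feasibility definition in the text.
MIN_CELLS = 20_000_000      # at least 20 million nucleated cells per raw aspirate
REQUIRED_PERCENT = 80       # at least 80% of samples must reach MIN_CELLS


def feasibility_met(cell_counts: Sequence[int]) -> bool:
    """Return True if at least 80% of samples have >= 20 million nucleated cells."""
    if not cell_counts:
        raise ValueError("no samples supplied")
    adequate = sum(1 for count in cell_counts if count >= MIN_CELLS)
    # Integer arithmetic avoids floating-point edge cases at exactly 80%.
    return adequate * 100 >= REQUIRED_PERCENT * len(cell_counts)


# Hypothetical nucleated cell counts, for illustration only.
example_counts = [57_000_000, 38_000_000, 72_000_000, 15_000_000, 41_000_000]
print(feasibility_met(example_counts))  # 4 of 5 samples are adequate
```

Whether the threshold is inclusive (at least 80%) follows the wording of the definition; a stricter reading would change `>=` to `>`.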

Statistical analysis

The study was powered to assess MRD resurgence and PFS stratified by MRD 10⁻⁷. Based on the results of the IFM-2009 trial, we hypothesized that the 3-year progression-free survival (PFS) from maintenance cessation for patients who were MRD negative at the 10⁻⁶ threshold would be 75% [5]; as was the case for lower MRD sensitivity thresholds, we surmised that PFS could be further stratified based on 10⁻⁷ status. Assuming 40% of enrolled patients would be 10⁻⁷ positive, calculations suggested that if the 3-year PFS was 52% for patients with 10⁻⁶ negativity but 10⁻⁷ positivity and 90% for patients with double (10⁻⁶/10⁻⁷) negativity (for a 3-year PFS of 75% overall), a sample size of 45 patients would yield 80% power with a two-sided alpha of 0.05 to detect a difference of this magnitude. Interim analyses were prespecified to limit undue harm to enrolled patients in the event of excess relapse; if the 2-year PFS was significantly less than 85% at a one-sided 0.05 alpha level, the study would be terminated. The Kaplan–Meier method was used to analyze PFS and OS.
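For readers less familiar with the Kaplan–Meier method named above, a minimal sketch of the product-limit estimate is given below. The function name and the toy follow-up data are assumptions made for illustration; they are not study data, and this is not the code used for the reported analyses.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (event time, estimated survival probability) pairs.

    times  : follow-up time for each patient (e.g., months from discontinuation)
    events : 1 if the event (e.g., progression or death) occurred, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Patients still under observation just before time t.
        at_risk = sum(1 for x in times if x >= t)
        # Events observed exactly at time t.
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        if d > 0:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical follow-up data (months), for illustration only.
toy_times = [6, 12, 12, 18, 24, 30]
toy_events = [1, 0, 1, 0, 1, 0]
for time, prob in kaplan_meier(toy_times, toy_events):
    print(f"{time:>4} months: S(t) = {prob:.3f}")
```

Censored patients contribute to the at-risk count until their last follow-up but never reduce the survival estimate themselves, which is why the curve steps down only at observed event times.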

Conversion from MRD 10⁻⁶ negativity to positivity (MRD resurgence rate) within the first two years from the achievement of MRD negativity was 27% in one study, which has been confirmed in more recent analyses [11,12,13]. We posited that if patients with MRD 10⁻⁷ negativity had a 2-year MRD 10⁻⁶ resurgence rate of ≤10%, it would be considered clinically meaningful. A sample size of 45 patients would provide 84% power with a two-sided alpha of 0.05 to detect a reduction of this magnitude using the historical control of 27%. The cumulative incidence rate of progression or MRD 10⁻⁶ resurgence was calculated by the Fine-Gray method, with death without MRD resurgence considered a competing event. We also calculated MRD-free survival (MRD-FS), defined as the absence of MRD resurgence, progression, or death, using the Kaplan–Meier method.
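The comparison against a historical control rate can be illustrated with a one-sample exact binomial power calculation. The sketch below assumes a lower-tail rejection region using half of the two-sided alpha; the text does not state which method was used for the reported 84% power, so this function (a hypothetical helper) need not reproduce that figure exactly.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def lower_tail_power(n: int, p0: float, p1: float, alpha_two_sided: float = 0.05) -> float:
    """Power of an exact one-sample binomial test to detect a rate p1 below p0.

    Assumption: only the lower rejection region is considered, sized at
    alpha_two_sided / 2, since the alternative of interest is a reduction.
    """
    tail = alpha_two_sided / 2
    critical = -1  # largest event count that still rejects the null rate p0
    for k in range(n + 1):
        if binom_cdf(k, n, p0) <= tail:
            critical = k
        else:
            break
    if critical < 0:
        return 0.0  # no event count is extreme enough to reject
    return binom_cdf(critical, n, p1)


# Design values from the text: n = 45, historical rate 27%, target rate 10%.
print(f"power = {lower_tail_power(45, 0.27, 0.10):.2f}")
```

Because the binomial distribution is discrete, the attained type I error of an exact test is usually below the nominal level, so exact and approximate power calculations can differ by a few percentage points.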

Longitudinal changes in PRO scores between baseline and 6, 12, and 18 months were analyzed by one-way repeated-measures ANOVA and were visualized as mean score changes from baseline (Supplement page 7). Absolute score changes of 8 for the EORTC QLQ-C30 [14] and 9–13 for the EORTC QLQ-MY20 (10 for disease symptoms, 10 for side effects of treatment, 13 for body image, and 9 for future perspective) [15] were considered minimally important differences (MID), as previously validated in patients with MM. All statistical analyses were performed in STATA 17.0 (College Station, TX). The data cutoff for this report is January 21, 2024.
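The MID thresholds listed above can be applied to a mean score change as in the following sketch. The dictionary keys, function name, and example scores are illustrative assumptions, and treating a change equal to the threshold as meeting the MID is also an assumption.

```python
# MID thresholds as stated in the text (absolute score change).
MID_THRESHOLDS = {
    "QLQ-C30": 8,
    "QLQ-MY20 disease symptoms": 10,
    "QLQ-MY20 side effects of treatment": 10,
    "QLQ-MY20 body image": 13,
    "QLQ-MY20 future perspective": 9,
}


def meets_mid(scale: str, baseline_mean: float, followup_mean: float) -> bool:
    """Return True if the absolute mean change reaches the MID for the scale."""
    change = followup_mean - baseline_mean
    return abs(change) >= MID_THRESHOLDS[scale]


# Hypothetical mean scores, for illustration only.
print(meets_mid("QLQ-C30", 60.0, 70.0))              # change of 10 vs MID of 8
print(meets_mid("QLQ-MY20 body image", 80.0, 90.0))  # change of 10 vs MID of 13
```

The direction of a beneficial change depends on the domain (higher is better for functional scales, lower is better for symptom scales), so the sign of the change should be interpreted separately from whether the MID is reached.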

Results

Patient characteristics

A total of 83 patients signed consent and underwent screening, of whom 47 met protocol eligibility criteria and discontinued maintenance therapy. A total of 9/47 (19%) patients had MRD ≥ 10⁻⁷ at baseline. The median age of enrolled patients was 66 years (range, 39–84) (Table 1). At least one high-risk disease feature was present in 17 (36%) patients at diagnosis, and 2 (4%) had two or more high-risk cytogenetic abnormalities (HRCAs). Nearly all patients (45/47, 96%) were still in their first line of therapy at the time of treatment discontinuation; 26 (55%) received a triplet induction regimen, 20 (43%) received a quadruplet, and 1 (2%) received doublet therapy. Thirty (64%) received an ASCT, and 36 (77%) received multi-drug consolidation prior to single-agent maintenance therapy. Nearly all patients (45/47, 96%) received lenalidomide as their single-agent maintenance therapy. The median duration of post-induction/ASCT therapy prior to discontinuation was 36 months (range, 12–95); 14 (30%) patients received 2 years or less of consolidation/maintenance. There were no significant differences in patient characteristics based on baseline MRD 10⁻⁷ status, though numerically a higher percentage of patients received quadruplet induction therapy in the MRD < 10⁻⁷ group, and the median duration of consolidation/maintenance therapy was numerically longer for the MRD ≥ 10⁻⁷ group (Table 1).

Table 1 Patient Characteristics.

During screening, sustained MRD < 10⁻⁶ was confirmed in 22/47 (47%); the remaining patients had MRD < 10⁻⁶ during screening, and IMWG-defined sustained MRD < 10⁻⁵ but with indeterminate prior 10⁻⁶ status. There was a numerically higher percentage of patients with baseline MRD ≥ 10⁻⁷ in those with indeterminate sustained MRD < 10⁻⁶ (6/25) vs those with confirmed sustained MRD < 10⁻⁶ (3/22, p = 0.3).

The median follow-up was 30 months (range, 8–46). Of the 47 enrolled patients, 1 (2%) patient withdrew from the study due to geographic relocation and restarted maintenance (no evidence of disease resurgence), and 1 (2%) patient could not be followed for MRD surveillance due to insurance coverage issues; both were alive and free of disease progression at the data cutoff and were included in PFS and OS analyses. Because MRD status could not be ascertained, these two patients were not included in the MRD-FS analysis. An additional two patients were enrolled to account for these withdrawals and are included in this analysis (Fig. 1).

Fig. 1: Enrollment and evaluable patients.

A CONSORT diagram of patients assessed for eligibility and subsequently enrolled.

Feasibility of CD138+ selection

BM samples evaluated for MRD by NGS at 10⁻⁷ sensitivity passed quality control in 134/136 (99%) cases. The median estimated unselected nucleated cell count prior to CD138+ enrichment was 57 × 10⁶ (interquartile range, 38 × 10⁶–72 × 10⁶) cells; only 6 (4%) samples had fewer than 20 × 10⁶ cells. The median CD138+ cell count analyzed was 216,469 (range 1291–2,857,681) cells across both runs. The feasibility endpoint of ≥80% of samples having ≥20 million nucleated cells in the unselected aspirate sample, along with ≥80% evaluable 10⁻⁷ results, was met.

MRD resurgence

Among the 45 MRD evaluable patients, MRD resurgence at the 10⁻⁶ threshold was identified in 11 (24%) patients, which included 5 (11%) patients with disease progression. Characteristics of patients with disease progression or MRD 10⁻⁶ resurgence can be found in Table S1. Consolidation before maintenance was received by 8/11 (73%) and 6/11 (54%) had a known high-risk cytogenetic abnormality (1q gain n = 4, t(4;14) n = 1, and TP53 deletion n = 1). Among the 11 patients with progression or MRD resurgence, 5 (45%) had detectable MRD by the 10⁻⁷ assay at baseline, and an additional 3 (27%) had MRD 10⁻⁷ detectable at least one year prior to MRD 10⁻⁶ resurgence; the other 3 (27%) patients had MRD resurgence at 10⁻⁶ at the same time as MRD resurgence at 10⁻⁷. MRD 10⁻⁶ resurgence during follow-up preceded disease progression in 3/5 (60%) patients. MRD 10⁻⁷ positivity at baseline preceded disease progression in 3/5 (60%) patients by a median of 18 months (12–24 months), and MRD 10⁻⁷ resurgence during follow-up preceded disease progression in 1 (20%) other patient (Table S1).

Of the 9 patients with MRD positivity at 10⁻⁷ at baseline, 5 (56%) experienced progression (n = 3) or MRD resurgence at the 10⁻⁶ threshold (n = 2) (Fig. S1).

At 30 months of median follow-up, MRD 10⁻⁷ positivity at baseline was associated with inferior MRD-FS (HR 6.3, 95% CI 1.9–21.5, p = 0.003). The estimated 3-year MRD-FS in all patients was 68% (95% CI 49–81%) (Fig. 2): 77% (95% CI 55–89%) for MRD 10⁻⁷ negative patients and 25% (95% CI 1–63%) for MRD 10⁻⁷ positive patients (Fig. 3A). Univariate analyses found no differences in MRD-FS when stratifying patients by confirmed sustained MRD < 10⁻⁶ status, receipt of quadruplet induction, ASCT, or consolidation, or duration of consolidation and/or maintenance. The presence of a high-risk cytogenetic abnormality at baseline was associated with inferior MRD-FS (HR 3.7, 95% CI 1.2–11.7, p = 0.02).

Fig. 2: MRD-free survival and progression-free survival for all patients.

MRD-free survival (MRD-FS; red) refers to patients free of MRD 10⁻⁶ resurgence, progression, or death. Progression-free survival (PFS; blue) refers to patients free of progression or death.

Fig. 3: Outcomes by MRD 10⁻⁷ status.

(A) MRD-free survival (MRD-FS) and (B) progression-free survival (PFS) stratified by MRD 10⁻⁷ status at baseline. MRD-FS refers to patients free of MRD 10⁻⁶ resurgence, progression, or death.

The 3-year cumulative incidence rate of progression or MRD 10⁻⁶ resurgence for all patients was 30%, including a 16% rate of MRD 10⁻⁶ resurgence without progression. The cumulative incidence rate was higher for patients with MRD 10⁻⁷ positivity compared to 10⁻⁷ negativity at baseline (subdistribution hazard ratio [SHR] 7.8, 95% CI 2.2–27.6, p = 0.001); the 3-year cumulative incidence of progression or MRD resurgence was 20% for patients with MRD 10⁻⁷ negativity at baseline and 75% for patients with MRD 10⁻⁷ positivity at baseline (Fig. S2).

Progression-free survival and overall survival

With a median follow-up of 30 months, MRD 10⁻⁷ positivity at baseline was associated with inferior PFS compared to MRD 10⁻⁷ negativity (HR 10.1, 95% CI 1.6–62.3, p = 0.01). The estimated 3-year PFS for the whole cohort was 85% (95% CI 67–94%) (Fig. 2): 92% (95% CI 72–98%) for MRD 10⁻⁷ negative patients and 49% (95% CI 8–82%) for MRD 10⁻⁷ positive patients (Fig. 3B). Univariate analyses found no differences in PFS when stratifying patients by high-risk cytogenetics at baseline, confirmed sustained MRD < 10⁻⁶ status, receipt of quadruplet induction, ASCT, or consolidation, or duration of consolidation and/or maintenance.

Cumulative incidence of progression was higher for patients with MRD 10⁻⁷ positivity compared to 10⁻⁷ negativity at baseline (SHR 21.5, 95% CI 2.8–166.7, p = 0.003). The 3-year cumulative incidence rate of progression was 12%; 4% for patients with MRD 10⁻⁷ negativity and 51% for patients with MRD 10⁻⁷ positivity at baseline (Fig. S3).

Of the 5 progression events, 2 (40%) had biochemical progression alone, and 3 (60%) had hypermetabolic osseous lesions identified on PET scans. None of the 5 patients had a history of extramedullary disease at diagnosis. All five patients successfully underwent retreatment, and three again reached MRD < 10⁻⁶.

One patient died due to complications related to a second hematologic malignancy. Overall survival (OS) is immature; the estimated 3-year OS was 97% (Fig. S4).

Health-related quality of life

There were 18 and 16 patients with available QLQ-C30 and QLQ-MY20 data, respectively. Survey completion rates were 55% and 50% at 18 months post-discontinuation for the QLQ-C30 and QLQ-MY20, respectively.

There were statistically significant improvements in role functioning, insomnia, diarrhea, pain, and financial difficulties after lenalidomide discontinuation (all p < 0.05, Fig. 4), all of which exceeded the MID threshold. These improvements were sustained up to 18 months after maintenance discontinuation. There were no significant changes in other QLQ-C30 or QLQ-MY20 domains (Figs. S5–S7).

Fig. 4: Longitudinal mean score changes from baseline in selected Patient Reported Outcome Domains from the EORTC QLQ-C30 Instrument.

MID: minimally important difference.

Adverse events

Adverse events of interest included second cancers, including second hematologic malignancies, that occurred during the study period. Three patients were diagnosed with second cancers during the study: one patient was diagnosed with Hodgkin lymphoma after 14 months (in complete remission), one patient was diagnosed with B-cell acute lymphoblastic leukemia after 24 months (achieved complete remission but died due to an infectious complication), and one patient was diagnosed with melanoma. All three patients had received high-dose melphalan and had received 68, 48, and 23 months of maintenance therapy, respectively. All three patients had remained MRD negative with respect to MM.

Discussion

In this pragmatic prospective study of patients with MM with multimodal MRD < 10⁻⁶ who underwent discontinuation of maintenance therapy regardless of induction regimen, receipt of ASCT, or maintenance therapy, the estimated 3-year PFS was 85% and even higher (92%) for those with MRD < 10⁻⁷ using a CD138+-selected assay. The cumulative incidence of MRD 10⁻⁶ resurgence or disease progression at 3 years following maintenance discontinuation was 30%, though this was driven mainly by patients with MRD 10⁻⁷ positivity at baseline (20% for MRD < 10⁻⁷ vs 75% for MRD ≥ 10⁻⁷). The presence of an HRCA was associated with worse MRD-FS but not PFS. HRQoL improved among a subset of patients following maintenance cessation, specifically role functioning, insomnia, diarrhea, pain, and financial difficulties.

MRD is not a binary marker but exists on a continuous spectrum, and naturally, MRD negativity at deeper thresholds will be associated with better outcomes. However, there are practical constraints to the amount of sample that can be obtained owing to limitations with DNA/cell input, expense of evaluation, and potential harm to the patient. No MRD assay is currently validated to provide 10⁻⁷ sensitivity, but we show that the CD138+-selected NGS 10⁻⁷ assay was feasible and detected MRD when the standard 10⁻⁶ assay did not. Four out of five progression events were preceded by MRD 10⁻⁷ positivity at baseline (n = 3) or at least one year prior to progression (n = 1). CD138+ immunomagnetic separation is already performed in cytogenetic laboratories as part of the routine fluorescent in situ hybridization workflow, but efforts to increase its use are needed [16]. CD138+-selection to achieve a higher level of sensitivity was demonstrated in conjunction with next-generation flow cytometry (NGF) in PB of patients with MM but required substantially higher volumes (50 mL) and did not demonstrate superior sensitivity to bone marrow methods [17].

The results from our study build upon previously published data. In the IFM-2009 trial, patients received a fixed duration of one year of lenalidomide maintenance therapy. Patients with MRD 10⁻⁶ negativity following one year of maintenance lenalidomide had a 3-year PFS of approximately 75% from the time of maintenance discontinuation [5]. In the GEM2014MAIN trial of lenalidomide, dexamethasone, ±ixazomib maintenance following triplet induction and ASCT, patients with MRD < 10⁻⁶ by NGF after two years of maintenance underwent cessation of treatment; the 4-year PFS from treatment discontinuation was 82.8%, similar to our results [18]. These two studies examined the impact of cessation of maintenance therapy in patients who received triplet induction therapy. Our study included 40% of patients who received quadruplet induction and 36% who did not receive ASCT, along with variable durations of consolidation and maintenance therapy. This reflects a contemporary patient population that may be more generalizable to the current treatment landscape. Our results support the use of MRD < 10⁻⁶ as a minimum threshold for prompting a patient-centered discussion around treatment discontinuation. Notably, the MASTER trial employed a rapid MRD-guided discontinuation in patients who received quadruplet therapy and ASCT and found that sustained MRD 10⁻⁶ negativity was associated with improved PFS but sustained MRD 10⁻⁵ negativity was not [19].

Cost savings from the discontinuation of maintenance therapy could be substantial, which can be validated by a formal cost savings analysis. Second hematologic cancers occurred despite maintenance discontinuation; whether this risk can be reduced by earlier cessation of maintenance following exposure to high-dose melphalan is unknown.

There are several limitations to this study. This was a single institution, single-arm trial; the lack of randomization precludes our ability to determine if continuing lenalidomide would have led to better outcomes. There was a preponderance of patients who received multi-drug consolidation (77%), which is not universally done; the overrepresentation may be explained by multi-drug therapy being associated with superior PFS and sustained MRD negativity versus single agent maintenance [2, 12, 20, 21]. Few patients with two or more HRCAs were included in this study, which is reflective of the recalcitrant nature of this disease subset [12]. Our results are most applicable to patients in their first line of therapy without ultra-high-risk disease. It is possible hemodilution could have accounted for false negative NGS results; however, this is unlikely as we used first-pull aspirate for all NGS analyses, and the estimated quantity of cells in the CD138+-selected samples was much higher than what has been reported in PB in healthy individuals [22].

While patients could meet the criteria for death or progression at any point, MRD resurgence could have occurred earlier than the annual assessment. HRQoL surveys could have been biased as they were completed by a subset of patients. Lastly, 19% of patients had MRD positivity at 10⁻⁷, lower than the anticipated 40%; we have since expanded the study to enroll additional patients.

Our results suggest that MRD 10⁻⁷ status further informs on the risk of disease resurgence among patients discontinuing therapy. Decreasing the risk of disease resurgence may be important to patients. As such, this study identifies the need to further validate MRD < 10⁻⁷ as a potential guide for treatment discontinuation, performed both before and/or during maintenance therapy in the era of quadruplet induction which is responsible for exceptionally deep responses. We propose that MRD < 10⁻⁷ might further improve upon existing assays to guide earlier discontinuation of maintenance therapy or provide justification to forgo maintenance entirely. Considering that three patients had osseous lesions at relapse, incorporating optimal imaging and PB techniques to better characterize spatial heterogeneity [23] and its impact on disease re-emergence is another area for further study. Encouragingly, all five patients with relapse were successfully retreated, and three had again achieved MRD < 10⁻⁶.

If a cure is defined by the absence of disease in patients who have stopped treatment and with mortality similar to an age-matched population [24], patients with sustained MRD < 10⁻⁶ following discontinuation may ultimately fulfill this definition and prove that a cure is indeed possible in MM. Incorporation of MRD 10⁻⁷ testing, complementary imaging, and peripheral blood assays such as mass spectrometry using the intact light chain or clonotypic peptide approaches may help to further enhance our understanding of a cure.

In conclusion, discontinuation of maintenance therapy in patients with MM and multimodal MRD < 10⁻⁶ resulted in a low rate of disease progression or MRD resurgence. MRD-guided maintenance cessation may also improve HRQoL and could result in substantial cost savings. The results to date indicate that MRD assessed using the clonoSEQ assay with CD138+-enrichment to achieve 10⁻⁷ sensitivity may anticipate disease progression or conventional MRD resurgence, and consequently may help to better identify patients who can more confidently discontinue maintenance therapy with improved chances for a cure. Longer follow-up is needed to validate these preliminary results.