Abstract
Multiple sclerosis (MS) affects 2.9 million people. Traditional classification of MS into distinct subtypes poorly reflects its pathobiology and has limited value for prognosticating disease evolution and treatment response, thereby hampering drug discovery. Here we report a data-driven classification of MS disease evolution by analyzing a large clinical trial database (approximately 8,000 patients, 118,000 patient visits and more than 35,000 magnetic resonance imaging scans) using probabilistic machine learning. Four dimensions define MS disease states: physical disability, brain damage, relapse and subclinical disease activity. Early/mild/evolving (EME) MS and advanced MS represent two poles of a disease severity spectrum. Patients with EME MS show limited clinical impairment and minor brain damage. Transitions to advanced MS occur via brain damage accumulation through inflammatory states, with or without accompanying symptoms. Advanced MS is characterized by moderate to high disability levels, radiological disease burden and risk of disease progression independent of relapses, with little probability of returning to earlier MS states. We validated these results in an independent clinical trial database and a real-world cohort, totaling more than 4,000 patients with MS. Our findings support viewing MS as a disease continuum. We propose a streamlined disease classification to offer a unifying understanding of the disease, improve patient management and enhance drug discovery efficiency and precision.
Similar content being viewed by others
Main
MS is a debilitating disorder of the central nervous system (CNS), affecting approximately 2.9 million individuals worldwide (one in 3,000 people)1,2. MS course descriptors, defined and revised by clinical consensus3,4, have aided patient–physician dialogue in the clinic and the design and recruitment of clinical trials. These clinical disease course descriptors—relapsing-remitting MS (RRMS), secondary progressive MS (SPMS) and primary progressive MS (PPMS)—have had a major impact on drug development, as they were used to delineate trial populations, and on the practical management of people living with MS, as they define treatment indications and patient access to approved MS therapies. However, these traditional course descriptors are based on clinical presentations of the disease rather than on the underlying disease biology. Their value for prognostication is limited to the observation that patients with a progressive disease course (PPMS or SPMS) tend to have a worse prognosis than those with a relapsing-remitting disease course5, and responsiveness to treatments in the progressive spectrum of the disease depends on the presence of radiological disease activity rather than the subtype of MS6. Recent evidence from real-world studies and clinical trials shows that disease progression independent of relapse activity (PIRA) is common in RRMS7,8,9,10 and is associated with poor long-term prognosis11. Additionally, current MS course descriptors are categorical and mutually exclusive (with implications for patient access to approved medications), whereas, in reality, relapsing and progressive disease features often overlap10. Pathophysiological differences between relapsing and progressive MS are more quantitative than qualitative in nature, with many features associated with disease worsening, such as slowly expanding lesions, meningeal and compartmentalized inflammation, neuroaxonal injury and brain and spinal cord atrophy, being present from disease onset and shared between subtypes of MS12,13,14,15. This raises the important question of whether the current MS categorization into distinct subtypes is justified or whether MS would be better described as a disease continuum from a focal inflammatory to a progressive disease course.
Over the last three decades, progress has been made in the development of efficacious therapies, which have markedly improved the outlook for people living with MS10,16. Current therapies are predominantly licensed for the relapsing phase of MS, with demonstrated benefits in progressive MS (SPMS and PPMS) mostly confined to patients with recent disease activity. Based on heterogeneous treatment effects observed within progressive subtypes of MS and benefit–risk considerations, regulators in the United States and Europe have created complex subclassifications to tailor indications (for example ‘active SPMS’ and ‘early PPMS’) in recent approvals, thereby deviating from the original consensus definitions (Extended Data Fig. 1). Such inconsistent views of MS pose challenges in the drug development process, especially in the progressive spectrum of the disease where there remains a high unmet medical need. Differences in definitions between jurisdictions and deviations from the consensus definitions also have the potential to create confusion in clinical practice in terms of patient access to approved treatments17. To address these fundamental issues, it is crucial to reevaluate the current MS categorization using a data-driven and evidence-based approach15,18,19.
With the goal of achieving a data-driven reclassification of MS, we adopted an AI-based approach, more specifically a bespoke probabilistic machine learning method, to reclassify the disease trajectories of more than 8,000 patients from the Novartis-Oxford MS (NO.MS) database, which is currently the largest and most comprehensive MS clinical trial database20. Our methodology—a scalable probabilistic factor analysis hidden Markov model (FAHMM) agnostic to the traditional MS subtypes—uses a probabilistic latent factor analysis (PFA) to represent multimodal clinical and radiological trial data. This approach simplifies complex data by capturing the correlation between variables into composite scores, which are then assumed to follow a hidden Markov model (HMM). By applying this model, we report homogeneous multivariate disease states and the transition probability matrix between these states, thus providing quantitative insights into transition pathways and a data-driven view of the evolution of MS over time. Drawing from our findings and existing literature, we propose a reclassification of MS, which offers an opportunity to unify the understanding of this disease with implications for drug development and the practical management of people living with MS.
Results
Discovery and validation
A total of 8,023 patients with up to 15 years of follow-up (118,235 visits) from nine phase 2/3 MS clinical trials in the NO.MS database were included in the main analysis. Diagnoses at trial entry were RRMS (n = 5,761), SPMS (n = 1,550) or PPMS (n = 712). Baseline characteristics largely overlapped across studies, with more disabled patients in progressive MS trials (Supplementary Fig. 1.1). The dataset was split into discovery (n = 6,419) and replication (n = 1,604) samples. Results were externally replicated in an independent Roche clinical trial dataset21,22 (N = 2,243; see Extended Data Table 1 for source studies) and a real-world cohort, Multiple Sclerosis Partners Advancing Technology and Health Solutions (MS PATHS)23 (N = 2,080), based on predefined validation criteria (Methods). Demographics and MS feature distributions are compared across datasets in Extended Data Table 2 and Supplementary Fig. 1.2.
Four latent dimensions to characterize MS
The FAHMM model, which uses latent variable modeling, is presented alongside the validation results in Fig. 1 and is illustrated in Fig. 1a. Probabilistic latent variable modeling of all clinical and radiological features measured in the clinical trials identified four latent dimensions of MS (Table 1): physical disability, brain damage, relapse and asymptomatic radiological activity. These were named based on clinical and radiological features in the loading matrix (Fig. 1b).
a, Graphical summary of the eight statistical states of MS and the transition probabilities among them. b, Estimated loading matrices to identify ‘key dimensions of MS’. Bolded numbers refer to measures significantly (positively or negatively) associated with the dimensions based on the posterior probability of belonging to the slab component. Asymptomatic MS disease activity is identified based on the presence of Gd+ T1 lesions, in the absence of relapses. c, Descriptive summary of the percentage of patients with an MS subtype diagnosis (RRMS, SPMS or PPMS) and empirical means of the original variables characterizing the eight states; for more complete summary statistics, see Supplementary Table 2.1 and Extended Data Fig. 3. Note that, for MS PATHS, the diagnosis was self-reported by the patient and was missing for most (53%) and, therefore, not reported here. For a more detailed comparison of baseline features of the patients in NO.MS, Roche MS and MS PATHS datasets, see Supplementary Fig. 1.2 and Extended Data Table 2. d, Transition matrix between disease states as estimated by FAHMM, where each cell indicates the probability of a patient transitioning from their current state (row) to a subsequent state (column) over the course of 1 month. Patients may transition in any order between disease states. The thickness of the arrows in a is proportional to the probability of the transition between states as described in the transition probability matrix in d. In all figures, the color code refers to the clinical meta-states: blue indicates EME MS; yellow indicates asymptomatic MS disease activity; orange indicates relapse; and red indicates advanced MS. BPF, brain parenchymal fraction; Gd+, gadolinium-enhancing; s, seconds.
MS disease states and evolution over time
The longitudinal composite scores of the four latent dimensions (from the probabilistic model) were used to identify disease states (Fig. 1c). The number of states was selected using the Bayesian information criterion (BIC; Extended Data Fig. 2), which favored models with more states over simpler alternatives, particularly over the traditional MS subtypes (RRMS, SPMS and PPMS), which would resemble a three-state model with a single one-way transition from RRMS to SPMS and PPMS as a distinct, static entity. The more complex models revealed overlapping feature distributions between states that, in fact, represent a disease severity gradient in terms of disability and brain damage and identified distinct states only for relapses and for periods during which patients have high asymptomatic radiological disease activity (Extended Data Fig. 3). Models with eight or more states had similarly low BIC and adequately represented the data. We present an eight-state model and its external validation in the main text in Fig. 1 (with feature distributions in Supplementary Table 2.1), and nine-state and 10-state variants are detailed in Extended Data Fig. 4 and Extended Data Tables 3 and 4. The disease states were successfully reproduced in the holdout data (Supplementary Fig. 3.1).
The FAHMM model provides a data-driven, probabilistic assessment of how individuals with MS transition between disease states over time. This dynamic aspect of the FAHMM model is captured in the transition probability matrix (Fig. 1a,d), which allows patients to remain in or move between states from one visit to the next. Patients may transition between disease states in any order. We did not restrict the transitions before fitting the model to the data, allowing the discovery of transition patterns in a data-driven manner (see Methods for details).
A key finding is that there is no direct monthly transition from states 1–3 to states 6–8; patients must first pass through one of the active states (4 or 5) (Fig. 1a,d). Furthermore, once in states 6–8, patients do not return to earlier states (1–3). Hence, solely based on those transition patterns observed in the NO-MS dataset, the eight states can be grouped into four clinical (meta-)states, which we named by consensus of the co-authors and the disease characteristics of the patients in these states as ‘EME’ (states 1–3), ‘asymptomatic radiological MS disease activity’ (state 4), ‘relapse’ (state 5) and ‘advanced’ state of MS (states 6–8). Table 1 provides a clinical characterization of the patients in each clinical (meta-)state of MS, along with the transition probabilities. A summary is as follows:
-
EME MS (states 1–3) represents clinically stable, ambulatory patients with MS with low disability, minimal cognitive impairment and limited brain damage. Patients with EME MS are most likely to remain in an EME state from one visit to the next. Transitions to active states are possible by developing asymptomatic radiological activity (state 4) or by experiencing a relapse (state 5). Direct transition to advanced states of MS (6–8) without going to an activity state is highly unlikely (close to zero probability).
-
Asymptomatic radiological MS disease activity (state 4): a radiologically active but clinically silent state, marked by multiple gadolinium (Gd)-enhancing lesions in the absence of reported new or worsening neurological symptoms. Most patients in this state are young and have a diagnosis of RRMS, but those with SPMS or PPMS also reach the asymptomatic radiological activity state, as revealed by our clinical trial datasets, in which magnetic resonance imaging (MRI) scans are collected at set times irrespective of symptoms. Patients often enter from EME states (1–3) or after a relapse (state 5). They may remain in this state, return to EME states (1–3) or relapse by developing acute neurological develop symptoms (state 5) or transition to advanced MS (states 6–8). Notably, there is a measurable risk (approximately 11%) of progressing via asymptomatic disease activity directly to advanced states (6–8) of MS, showing that subclinical disease activity is a risk factor for disease evolution.
-
Relapse (state 5): characterized by acute new or worsening neurological symptoms—can occur from any state. Although relapses are more common early in the disease, they also occur at older ages and in patients with higher disability; they are seen most commonly in patients with a diagnosis of RRMS but are also experienced by patients with a diagnosis of SPMS or PPMS. A relapse is a risk factor for further disease evolution, as direct transitions from the relapse state to one of the advanced MS states (that is, state 5 to states 6–8) are possible with a 18% probability.
-
Advanced MS (states 6–8) is defined by higher levels of physical and cognitive impairment, greater brain atrophy and reduced focal inflammation. Once in an advanced state, patients are unlikely to return to earlier states. Although relapses and lesions can still occur, transitions out of the advanced meta-state are rare; most patients remain within these states. The model does not differentiate between SPMS and PPMS in these advanced stages—that is, patients with SPMS and patients with PPMS are distributed with similar frequency across states 6–8.
Demographics and MS disease characteristics from NO.MS are provided for the four clinical (meta-)states in Table 2.
The clinical (meta-)states were replicated in the independent external clinical trial and real-world datasets based on predefined validation (Methods) criteria as follows:
Validation in the Roche ocrelizumab phase 3 program (Roche MS) clinical trial dataset (N = 2,243) confirmed the four MS dimensions—physical disability, brain damage, relapse and asymptomatic disease activity—and showed that progression from EME to advanced MS occurred primarily through active states, despite the absence of patients with SPMS. The clinical interpretation aligned with the main findings and met validation criteria (Fig. 1).
Validation in the real-world MS PATHS cohort (N = 2,280). Despite using different measures (for example, new/enlarging T2 lesions, brain parenchymal fraction and no Expanded Disability Status Scale (EDSS)), lower-frequency assessments (6–12 monthly visits) and some patient-reported rather than physician-assessed measurements, the four MS dimensions and the EME-to-advanced disease gradient were also reproduced in the MS PATHS cohort. Transitions to advanced MS typically passed through active states, and direct worsening from early to advanced MS was confirmed to be rare. Minor differences emerged: only two advanced states were identified (versus three in NO.MS), and relapse was split into two relapsing states (one in earlier patients and one in more advanced patients). This is due to the fact that, in MS PATHS, which reflects real-world practice, visits were more symptom driven rather than scheduled independently, as is typically the case in clinical trials (Fig. 1).
Sensitivity analyses
An analysis for ‘bout-onset MS’ (RRMS and SPMS) and PPMS separately revealed similar disease states and transition patterns (Extended Data Fig. 5). The FAHMM requires complete data at each visit. In the main analysis, missing data were imputed only for partially observed visits, scheduled clinical visits and unscheduled relapse visits.
The missing data rates and mean absolute errors for imputed values on heldout data were as follows: Timed 25-Foot Walking Test (T25FWT): 32.3%, 0.14; 9-Hole Peg Test (9HPT): 43.6%, 0.17; Paced Auditory Serial Addition Test (PASAT): 53.9%, 0.30; T2 lesion volume: 72.5%, 0.08; brain volume: 69.9%, 0.08; and Gd-enhancing lesions: 73.1%, 0.08. These metrics suggest a reasonable level of imputation accuracy across variables.
EDSS and relapse data were complete. Results of a further sensitivity analysis without data imputation, mapping available assessments to annual visits based on the availability of annual brain scans, are given in Extended Data Fig. 6. It showed an EME MS to advanced MS disease severity gradient with transitions either via relapses or asymptomatic radiological disease activity, consistent with the main model despite a substantial loss of longitudinal information (88.5% of all patient visits and 87.1% of the relapses were missed due to the remapping to annual visits).
Progression independent of relapse activity by disease state
The time to the first PIRA event was analyzed by the disease state that patients were in at the baseline of a clinical trial (Fig. 2a). Patients starting in EME or active states (relapse or asymptomatic activity) had a lower risk and longer time to PIRA than those starting in advanced states, indicating that, although progression risk exists early, it increases markedly in advanced MS.
a, Time to first 3-month PIRA as a function of the clinical state in which the patient started at trial baseline. Kaplan–Meier estimates with shaded area representing 95% confidence intervals. b, Sankey plot of individual patient trajectories among the four clinical states of MS over a timeframe of 5 years. At year 0, patients are shown in the disease state in which they entered into a clinical trial: patients were in an EME state (blue), in a state of asymptomatic radiological disease activity (yellow), in a relapse (orange) or in an advanced state of MS (red). From left to right, the plot illustrates the proportion of patients who remain in the same disease state or move to another disease state. Please note that, for clarity of the graphic, only the yearly status of the patients is shown, and transitions between yearly visits are not displayed to avoid overcrowding the figure. If patients had relapses or radiological inflammation at these annual visits, this is correctly presented in orange and yellow, respectively. However, patients may have experienced relapses or asymptomatic radiological inflammation states between these annual points that contributed to their worsening, which cannot appear in this graphical representation; this explains why the figure displays blue connection lines between EME and advanced disease states in the figure even though the probability of a direct transition between EME and advanced states without passing through the inflammatory states is, in fact, low (see underlying transition matrix in Fig. 1d). c, Effect of (any) DMTs (versus placebo) on the transition probabilities among the four clinical states of MS. ‘Any DMT’ includes one of the following: interferon beta-1, glatiramer acetate, teriflunomide, fingolimod, siponimod or ofatumumab, which were compared to ‘no DMT’ (that is, placebo). The numbers refer to the percentage risk reduction (1 − HR, where HR refers to the hazard ratio between treated and untreated (placebo) patients and is reported with 95% confidence limits).
Prognostication of individual trajectories
Individual patient journeys through the clinical disease states over 5 years are visualized in a Sankey plot (Fig. 2b). At baseline of the clinical trials, most patients are in an EME MS state (blue) or in an active disease state (yellow or orange); only a minority of patients are already in an advanced MS disease state (red), reflecting the composition of the NO.MS database and the eligibility criteria of the various trials. Patients transition between states annually, with both worsening and improvement observed. Each visit to an inflammatory state (relapse or asymptomatic activity) increases the risk of progressing to advanced MS. Over time, more patients accumulate disease burden and enter advanced states. Similar patterns were observed across RRMS, SPMS and PPMS (Supplementary Fig. 4.1), with consistent transition pathways from early to advanced states via inflammatory activity, regardless of MS subtype.
We evaluated the predictive performance of the proposed classification for the prognostication of an individual patient’s risk to transition into an advanced state of MS and the effect of disease-modifying therapies (DMTs) based on the estimated FAHMM model (Methods). The analysis revealed that, overall, the model predicts individual patient trajectories with high concordance in independent holdout data, with a good out-of-sample performance (C-score = 0.82, Brier score = 0.06).
Impact of DMTs on MS disease evolution
Treating with a DMT significantly reduces the risk of patients with EME MS transitioning into the highly active asymptomatic MS disease state as well as their risk of transitioning into a relapse state (Fig. 2c). Compared to placebo, DMTs also lower the chance of remaining in an active state across visits. By interrupting the accumulation of damage to the CNS through these high-risk states, DMTs are associated with a higher probability of patients staying longer in the EME phase of MS.
Discussion
We developed a data-driven classification of MS by applying bespoke probabilistic machine learning methodology to the longitudinal multimodal disease trajectories of more than 8,000 patients, covering all classical subtypes of MS. The main analysis in NO.MS is based on approximately 120,000 standardized neurological assessments and more than 35,000 MRI scans.
Our study was conducted in a clinical trial dataset with protocol-defined eligibility criteria as previously described20, resulting in a more narrowly defined population than typically encountered in clinical practice. We addressed this possible limitation by validating and reproducing our findings not only on a holdout dataset but also in an independent clinical trial dataset (Roche MS dataset)21,22 and in a real-world cohort (MS PATHS)23. The population included in our analysis is broadly similar to the NO.MS full dataset and to the real-world cohort based on reported demographics and disease features (Extended Data Table 2 and Supplementary Fig. 1.2). By including MS PATHS and Roche MS, we markedly expanded the range of treatments used.
The model is agnostic to the diagnosed clinical subtype of MS, allowing us to contrast it with the consensus-based MS course descriptors (Table 3). By contextualizing our findings within the traditional MS classification and existing literature, we propose revisions to the classification and recommend actions that could positively impact patient management and clinical trials alike.
Our FAHMM model results are more compatible with the view of MS as a disease continuum10,13,15 than with the traditional view of distinct phenotypes. A three-state model of static MS subtypes (RMS, SPMS and PPMS) was found to be inferior to more complex and dynamic models. In these dynamic models, the frequency of disease activity varies between individuals, explaining some of the individual differences in the accumulation of damage to the CNS and the acquisition of physical and cognitive impairment over time. Based on four dimensions—physical disability, radiological disease burden (that is, focal and diffuse brain damage), relapse and subclinical disease activity—and the transition matrix between disease states, the FAHMM has good performance for predicting disease course at the individual level (C-score > 0.8). The model thus enables prognostication of time to advanced MS based on the frequency of visits into one of the active disease states and the accumulation of disease burden. The results of this approach support the proposal by Lublin24 that the time framing—that is, the frequency of disease activity and progression within a set time window (for example, within a year)—should be considered in the practical management of MS.
The disease continuum was identified by our modeling in the form of multiple overlapping states that represent a gradient in terms of disease severity as defined by the level of disability and damage to the brain. Thus, the accumulation of a radiological disease burden is a manifestation of disease worsening, which can be reduced (or even prevented) with efficacious DMTs. We identified the meta-states EME MS (states 1–3) versus advanced MS (states 6–8) as two poles of a disease severity gradient. Models with more than eight states show greater resolution of the same gradient with no additional value for the clinical interpretation. Separate models for ‘bout-onset MS’ (RRMS and SPMS) versus PPMS were essentially similar.
The FAHMM results indicate that focal inflammatory and progressive forms of MS form a continuum in their phenotypic presentation. The model grouped RRMS, ‘active SPMS’ and ‘early PPMS’ together in the EME stages of MS without distinction, whereas fully evolved PPMS was grouped with SPMS in the advanced stages, also without distinction. The primary transition pathways from EME MS to advanced MS were through focal inflammatory states, either with accompanying neurological symptoms or stealthily via clinically silent radiological disease activity. The FAHMM considers the highly active disease states (4 and 5) as distinct disease states set apart from all other states by the occurrence of acute neurological symptoms or the unequivocal evidence of high levels of inflammation on the MRI scan (Gd-enhancing T1 lesions mean (s.d.): 3.34 (5.22)), respectively. This differs from the traditional view of MS that attributes lesions and relapses primarily to RRMS and is justified based on abundant evidence that lesions and relapses are not exclusive to any specific subtype of MS but can occur across the entire MS spectrum, including in patients with PPMS10,20.
The finding that ‘early PPMS’—that is, relatively young patients with low to moderate levels of disability and/or evidence of radiological disease activity—can benefit from anti-inflammatory treatment is consistent with subgroup results of the first trials of B-cell-depleting antibodies in PPMS25,26. More recently, a meta-analysis of 12 studies in the progressive spectrum of the disease (SPMS or PPMS) in a total of 8,659 patients showed that patients with recent disease activity could benefit from available DMTs, irrespective of whether their disease course was diagnosed as SPMS or PPMS, whereas patients without such recent disease activity could not6.
The 2013 revisions by Lublin et al.4 to the definition of the MS disease course introduced disease activity measures (MRI lesions and relapses) and progression as ‘modifiers’ of the classical static phenotypes (RRMS, SPMS and PPMS), without substantiating the quantitative relevance of such modifiers for disease evolution. Our FAHMM results enhance understanding of MS by offering a data-driven examination of the latent dimensions for disease characterization and a probabilistic quantification of the transition pathways between disease states. In our analysis, focal inflammation, with or without accompanying neurological symptoms, is the key driver of worsening. This finding was validated in the independent clinical trial dataset as well as in the real-world cohort. Our FAHMM shows that clinical and radiological assessments provide complementary information that differs from the initial consensus-based classification of MS, which focused on only two clinical dimensions (relapses and disability progression)3. It also differs from a more recent data-driven classification of MS that used only MRI data based on the premise that this would better reflect the pathophysiology of MS27. FAHMM results suggest that, in addition to the level of disability and relapse, two MRI-based latent dimensions of MS are relevant and should be considered for disease characterization and disease evolution:
-
(1)
Brain damage: assessed by the total T2 lesion volume and the level of brain atrophy as cumulative measures of focal and diffuse radiological disease burden. As the disease evolves, disease-related damage to the brain gradually accumulates, progressively lowering the patient’s chances for disability improvement28 and increasing the risk of further disability worsening29 and progression30,31. This decreasing ability to compensate and improve as the disease advances will similarly contribute to the clinical presentation of a progressive disease course, much like active drivers of worsening would, albeit likely through a different pathophysiological mechanism28. Thus, the accumulation of a radiological disease burden is a form of disease worsening, associated with poorer long-term outcomes; it should be minimized through the use of efficacious DMTs.
-
(2)
Asymptomatic radiological disease activity was revealed as a major pathway for patients to worsen, as shown by the FAHMM transition matrix. Our results from the much larger NO.MS dataset confirm the finding of a previous report by Thorpe et al.32, showing that more than 90% of the MS disease activity visible on regularly scheduled MRI scans (brain and spinal cord) is not accompanied by new or worsening neurological symptoms. High lesion activity is predominantly, but not exclusively, seen in young patients and occurs in all subtypes of MS20 and contributes to neuronal injury33,34,35 and brain tissue loss36 and, thereby, to the overall accumulation of damage to the brain. Enhancing lesions contribute to disease worsening in the same direction as clinical relapses28. Such damage is, in turn, associated with a lower chance of improvement and an increased risk of disease worsening and progression28 (see the FAHMM-proposed brain damage-related latent dimension of MS). Therefore, monitoring and preventing subclinical disease activity to protect reserve capacities should be a priority in MS management, considering that brain reserve influences how structural damage translates into clinical symptoms37. However, this approach may require novel strategies considering the current era of low relapse rates and safety concerns associated with regular use of Gd38.
Our FAHMM model confirms that PIRA starts early in MS and becomes the dominant feature in the advanced disease states when brain reserve is depleted, in line with the findings of Kappos et al.9 and Lublin et al.10. Patients in all states of MS can experience PIRA events, but the absolute risk of PIRA is higher in the advanced states compared to the EME states of the disease. In relative terms, PIRA is the most frequent manifestation of disability accumulation across the full spectrum of traditional MS phenotypes, as summarized in a recent literature review39. Perhaps surprisingly, FAHMM did not identify any direct pathway from EME to advanced states of the disease without passing through focal inflammatory states (4–5). Among the various biological mechanisms for MS evolution that have been proposed40, this finding supports the view that inflammation is central to the pathogenesis of MS and that degenerative processes in MS are secondary in nature. The risk of PIRA was estimated to be highest in the advanced states of MS, when the amount of damage to the CNS is typically substantial and response to currently available treatments is least likely.
Regarding the advanced states of MS, our results do not support maintaining a distinction between SPMS and PPMS; patients with SPMS and patients with PPMS were similarly distributed among the advanced states of the disease, both in the NO.MS dataset and in the real-world validation set. Once patients reach the advanced states of the disease, the risk of progression is high, and the chances of a treatment response are low, regardless of whether the previous disease course was characterized by relapses or not. Our results align well with the accumulating evidence that the apparent evolution from EME states to advanced states of the disease reflects a partial shift from predominantly localized acute injury to widespread chronic inflammation and secondary neurodegeneration13,14,41,42,43,44,45,46. Aging has a modulating effect; it decreases the likelihood of focal inflammation, increases neural susceptibility to injury and reduces resilience15,20. Abandoning the distinction between primary and secondary progressive MS in the advanced spectrum would acknowledge that these patients likely progress for the same reasons. Treating advanced PPMS and advanced SPMS as a uniform population would facilitate trial recruitment and drug discovery and would simplify drug indications and access in the progressive spectrum of the disease (that is, advanced MS is seen as one indication).
One may consider how the classical disease course descriptors (at presentation) would map to the FAHMM states and how this relates to current and potentially future disease classification. Radiologically isolated syndrome (RIS), which, in specific situations, is now MS47, would be detected based on asymptomatic lesions (state 4). Clinically isolated syndrome (CIS) would be identified based on a first clinical episode (state 5). Fully ambulatory patients with MS with limited damage to the brain who are not in a relapse and not in a highly inflammatory state would be considered in one of the EME disease states (1–3), irrespective of whether they have a relapsing or a progressive disease onset. Patients who have progressed in their disease, are impaired in their walking ability, have cognitive deficits and/or have accumulated a substantial lesion load or brain atrophy would likely qualify as being in an advanced disease state (6–8). It should be noted that, under the FAHMM, the combinations of physical and cognitive impairment and radiological damage to the brain that determine the classification of patients to a particular disease state can vary, reflecting inter-individual differences. Such differences render it impossible to provide simple thresholds that would demark movement between states and be applicable to all patients with MS. For the practical management of patients with MS and for the conduct of clinical trials, we propose it to be adequate and sufficient to consider the two ends of the disease continuum (EME MS versus advanced MS) while acknowledging that the shift in pathobiology is gradual rather than sudden, with considerable overlap of focal inflammatory and degenerative biology.
EME MS would include RIS, CIS and RRMS as well as ambulatory and cognitively functional patients with SPMS or PPMS who had recent imaging features characteristic of inflammatory activity. The primary goal would be to prevent relapses and asymptomatic disease activity by addressing the evolving disease pathology15. Meta-analyses suggest that such patients can benefit from currently available anti-inflammatory drugs6,16,48.
Advanced MS would include SPMS and PPMS (without distinction) and patients with RRMS who progressed in absence of relapse activity (PIRA) and who have moderate to high levels of disability with evidence of brain damage as shown by T2 lesion burden or evidence of disease-related brain atrophy. The primary goal would be to minimize the risk of further progression and to maintain or restore function. Secondary goals could be to minimize the risk of lesion expansion and brain atrophy. Developing treatments for advanced MS remains an urgent unmet medical need.
We summarized our recommendations for clinical practice and for clinical trials based on our findings and the literature in Table 4.
We acknowledge the limitations in our dataset in terms of imaging frequency (typically only available in annual visits) and lack of advanced MRI features, including spinal cord imaging, cortical lesions and slowly expanding and paramagnetic rim lesions, which might account for some discordance between MRI disease activity and clinical outcomes. The overall limited data from spinal cord MRI in our database do not allow for definite or generalizable conclusions regarding its added value as a disease descriptor. Although there is evidence to show that spinal cord volume loss is significantly associated with present and future disability49,50,51,52,53,54, its potential added value over brain atrophy has not been studied or demonstrated. Regarding spinal cord lesions, based on a recent review55 systematic assessment of such lesions for monitoring of disease activity seems to have limited added value over monitoring of brain lesions and is considered optional based on current guidelines38.
To conclude, our model is a potentially important stepping stone to a data-driven disease characterization of MS. Although more comprehensive clinical information, such as spinal cord abnormalities or biological data, will be required before it can be considered definite, we think that if more advanced MRI measures and fluid biomarker data, such as proteomic or metabolomic data from banked samples, could be made available in this or any other large MS database, the FAHMM model would be suitable for a more in-depth pathophysiological description of MS and, provided availability of respective data, of any other diseases with similar complexities. With this comprehensive analysis, we hope to contribute to a more unified understanding of MS.
Methods
Analysis set from the NO.MS database
The NO.MS database was previously described20. In brief, it comprises 39 clinical trials from 2003 to April 2021, approved by institutional review boards (IRBs) or ethics committees (‘Ethics’ subsection and Supplementary Table 5.2) and conducted following the principles of the Declaration of Helsinki and Good Clinical Practice. All patients from all 39 trials provided written informed consent. Trial protocols prospectively defined the objectives, eligibility, endpoints, assessments and statistical analyses. The individual study results were previously published. Data were deidentified in a risk-based approach as reported elsewhere56,57. For this analysis, all phase 2 and 3 studies conducted in RRMS, SPMS or PPMS and their corresponding open-label extensions were selected based on the availability of protocol-defined standardized clinical assessments and regular MRI acquisitions. Studies contributing to the analysis are listed in Extended Data Table 1. In addition, an analysis of the wider NO.MS database, including all patients with clinical assessments of relapses and EDSS, investigated the risk of relapses and progression from pediatric MS to adult MS and from RRMS to SPMS and PPMS and identified a decreasing gradient of focal inflammation and an increasing gradient of the risk of progression10.
Ethics
The ethics committees and IRBs used in the nine NO.MS source studies included: Alta Bates Summit IRB; Asahikawa Medical Center IRB; Ascension Wisconsin IRB; Aurora IRB; Baltimore IRB; Biomedical IRB; CentraState IRB; Central Ethics Committee; Chiba University Hospital IRB; Christiana Care IRB Helen F.; Copernicus Group IRB; Crescent City IRB; Dean IRB; Ebara Hospital IRB; Ehime University Hospital IRB; Georgetown University IRB; Health Sciences Institutional Review Boards; Health Sciences Campus IRB; Health System IRB; Healthcare -IRB; Henry Ford Hospital IRB; Hospital IRB; IRB University of California Davis; IRB of Beijing Hospital; IRB of West China Hospital; IRB-WB2; IRB/OSA; IRBMED; Institutional Ethics Committee, Bakirkoy; Institutional Ethics Committee, Dokuz Eylul; Institutional Ethics Committee, Ege; Institutional Ethics Committee, Gazi; Institutional Ethics Committee, Gaziantep; Institutional Ethics Committee, Hacettepe; Institutional Ethics Committee, Istanbul; Institutional Ethics Committee, Mersin; Institutional Ethics Committee, Uludag; Iwate Medical University Hospital IRB N/A; Johns Hopkins IRB; Keio University Hospital IRB; Kyoto Min-iren Chuo Hospital IRB; Lifespan IRB; Local Ethics Committee of AHEPA; Multicentric Ethics Committee IKEM; NIMS Institutional Ethics Committee; National Ethics Committee; Network IRB; Osaka University Hospital IRB; Pro Health Care IRB Research; Providence Health & Service IRB; Providence Health & Services IRB; Psychiatry IRB; Quorum Review IRB; Research Ethics Committee; Saitama Medical Center IRB; Schulman Associates, IRB; Sone Clinic IRB; The Ethics Committee of Sri; University IRB; University of Colorado Health IRB; University of Utah IRB; WIRB; WakeMed IRB; Wayne State University IRB; and Wheaton Franciscan Healthcare IRB N/A (see Supplementary Table 5.2 for full list and further details).
Clinical assessments in NO.MS
For all the trials included in the NO.MS analysis set, the following clinical assessments, which are commonly used in MS clinical trials, have been regularly monitored (typically every 3 months or 6 months; for details, see the individual protocols and study designs) by specifically trained healthcare professionals:
-
EDSS58,59: a standard tool for assessing the neurological disability status and disability progression, ranging from 0 (neurologically normal) to 10 (death due to MS).
-
T25FWT60: an objective quantitative measure of neurological function (patient’s walking speed).
-
9HPT60: an objective quantitative measure of upper extremity (arm and hand) function.
-
PASAT61: an objective measure of cognitive function that specifically assesses auditory information processing speed and flexibility as well as calculation ability.
-
Relapse occurrence: defined as the appearance of a new neurological abnormality or worsening62, as experienced by the patients and reported by the study investigator. Patients who experienced new or worsening symptoms were instructed to come for an unscheduled visit where symptoms were assessed (with an EDSS assessment performed), and onset as well as end date were recorded by the physician. Patients were transferred to an EDSS rater (in phase 3 trials) and an independent physician for the EDSS assessment. In the present analysis, all patient-experienced and physician-reported new or worsening symptoms are considered, irrespective of the EDSS confirmation.
It should be noted that such visits could happen at any time and would typically occur in unscheduled visits between the regular scheduled visits. For this reason, to capture the timing of events adequately, a monthly grid was used for modeling purposes.
Radiological assessments in NO.MS
In NO.MS, all images obtained according to study-specific standardized protocols were reanalyzed centrally by the Big Data Institute in Oxford, United Kingdom, using a harmonized MRI pipeline on standard MRI outcomes in MS (normalized brain volume using SIENAX63, part of FSL 6.0; percentage brain volume change using SIENA64, part of FSL 6.0). Gd-enhancing lesions and T2 lesion volume were used as reported in the original trials.
Variables in NO.MS
The clinical and radiological variables used in our modeling are presented in Table 2, and their assessment is described in the previous two Methods subsections. Demographic and disease-related features are updated longitudinally. For each visit, the patient’s age is updated, and the normalized brain volume is calculated based on the normalized brain volume measured at baseline (using SIENAX63) and the percentage change from baseline measured at post-baseline visits (using SIENA64). Lesion assessments were done centrally as previously reported for each of the original trials.
Demographic features and the diagnosed phenotype of MS (RRMS, SPMS or PPMS) were not used in the modeling but are reported across visits for the disease states newly identified by the model to characterize the patients in a specific state and to help establish the link between the newly proposed FAHMM states and the traditional classification of MS. PIRA was derived as a 3-month EDSS-confirmed irreversible worsening of disability in the absence of relapses10.
FAHMM
The proposed hierarchical model uses a PFA65 model to find a parsimonious representation of data. It exploits the shared information among elements of observed data to find MS dimensions (loading matrix) and corresponding composite scores (latent variables) that are continuous and a posteriori following a normal distribution. The spike and slab prior with Laplace components on the loading matrix favors sparsity that helps with the interpretation of MS dimensions. The number of MS dimensions is determined in a data-driven manner by putting an Indian buffet process prior on the inclusion/exclusion binary variables of spike and slab prior. Moreover, it helps with assigning observed variables to the MS dimensions by using a posterior probability of inclusion to the slab component greater than 0.5.
Next, our model assumes that the composite scores follow an HMM with multivariate normal emission distribution66. For modeling purposes, only the time gap between two consecutive visits is assumed to be 1 month (except for MS PATHS where it is assumed to be 6 months; see Supplementary Information Section 5 for more details). The HMM models MS evolution over time by (1) finding homogeneous disease states (latent unobserved) where the distribution of longitudinal composite scores is similar in terms of mean and covariance and (2) characterizing the progression between states by a transition probability matrix where all transitions are a priori possible (we are not restricting the transition probability matrix or assuming any structure). The number of states is determined by using the BIC.
Our proposed probabilistic multivariate model for disease evolution using longitudinal data is capable of handling mixed data modalities (binary, count and continuous) and missing data. The allocation of the observed variables to the MS dimensions does not change over time, which translates into a fixed loading matrix across visits. However, the composite scores are changing over time. The PFA uses baseline data where there are no missing data to estimate the loading matrix, and then the rest of the model parameters are estimated conditionally on the estimated loading matrix.
The model parameters are estimated using an expectation–maximization algorithm67. To evaluate the proposed classification’s predictive performance for the prognostication of an individual patient’s risk to transition into an advanced state of MS, the effect of DMTs and the characterization of individual states, the estimated FAHMM model is used to assign each visit to the corresponding disease and clinical states using the Viterbi algorithm67. The states are characterized by calculating the mean and s.d. of the corresponding variables. The disease states and transition probabilities are illustrated in Supplementary Fig. 5.1.
A discrete time-to-event analysis using Bayesian Additive Regression Trees (BART)68 was used to evaluate the prognostication performance of the clinical (meta-)states (Results). Time to first transition to one of the advanced states for patients who are in the early, relapse or asymptomatic states at baseline was predicted using baseline radiological and clinical features and demographic characteristics, including age, sex, treatment, relapses and number of relapses before entering the trial.
A continuous-time Markov model was used to assess the association between the use versus non-use of a DMT on the transition probabilities between clinical (meta-)states (msm package69).
Replication in holdout data from NO.MS
The analysis was based on a total dataset of 8,023 patients that was randomly divided into a discovery set (6,419 patients) for analysis purposes and a holdout set (1,604 patients) for validation purposes. The k-means clustering method was used to identify a homogeneous group of patients using the average of longitudinal composite scores per patient. The clustering method found five different groups using the elbow approach, where 80% of patients in each group are randomly assigned to the discovery set and the remaining 20% to the validation set.
More methodological details can be found in Supplementary Information Section 5.
Sensitivity analysis
A sensitivity analysis was conducted to check whether the disease states and transition pattern for ‘bout-onset MS’ (RRMS and SPMS) is similar to that of PPMS. A separate model was fit to RRMS/SPMS (excluding patients with PPMS) and to patients with PPMS alone.
Another sensitivity analysis was conducted without data imputation. This approach presents inherent complexities as the model requires complete data for all visits, whereas relapses can occur at any time, and other assessments are often unavailable at these timepoints. To conduct an analysis without data imputation, it was, therefore, necessary to remap all available data to annual visits based on the availability of MRI scans. This approach has the limitation that all data points collected between these annual visits are either ignored or shifted in time. After remapping the data to annual visits, the model was fit to these ‘complete case’ data.
External validation on independent datasets
After submitting the initial version of this paper to Nature Medicine based on the NO.MS data, we sought to ensure the reproducibility and generalizability of our findings through validation using independent external datasets where we established predefined validation criteria prior to accessing these datasets. The model was then fitted to each external dataset, including an independent clinical trial dataset (Roche MS dataset) and a real-world cohort (MS PATHS), confirming the reproducibility of our results following the data preparatory steps described further in this section.
Validation step 1: replication of MS dimensions
The PFA part of FAHMM65 uses baseline data to find MS dimensions (loading matrix) and corresponding composite scores (latent variables). The FAHMM model was fitted to all datasets where validation was evaluated by examining whether the same disease dimensions would emerge in the external datasets. Specifically, we determined whether the same or similar sets of variables from the primary analysis were assigned to the corresponding latent variables in the validation datasets. The validation of the latent dimensions of MS would be considered successful (validation criterion 1) if we could re-identify four dimensions related to (1) physical disability, (2) brain damage, (3) relapse and (4) asymptomatic MS disease activity.
Validation step 2: replication of disease evolution modeling
In the main analysis, the MS evolution modeling using FAHMM discovered eight states that were grouped in four meta-states based solely on the patterns of the transition probability matrix using NO.MS data: EME MS, asymptomatic radiological MS disease activity, relapse and advanced MS. To replicate the main findings from the NO.MS dataset, we fit the FAHMM to either the Roche MS or the MS PATHS data with eight states as in the main model. A successful validation would entail finding meta-states with similar clinical interpretation and similar transition probability to NO.MS (validation criterion 2): the validation would be considered successful if we could re-identify an EME MS versus an advanced state of MS with a disease severity gradient and if the transition from the first to the second would primarily be through focal inflammatory disease states—that is, through a relapse or an asymptomatic radiological disease state—with little to no probability for patients to worsen from EME MS to advanced states without passing through these focal inflammatory states.
As described above, the validation focused on the qualitative similarity of the clinical interpretability rather than on numerical thresholds. By applying these predefined validation criteria to unseen data, we aimed to show the generalizability and robustness of our findings across independent datasets, including real-world data.
Variables in external datasets
In the Roche MS dataset, the same clinical and radiological variables as in the NO.MS dataset were available. As in the NO.MS dataset, the clinical measurements were collected by trained neurologists, and MRI assessments (lesions and brain volume change) were measured by a central reading center. All assessments were used as reported in the original trials.
For the real-world dataset from MS PATHS, data assertation was different than in NO.MS and in Roche MS. Specifically, no EDSS assessments were available (patient determined disease steps (PDDS) measurements were collected rather than EDSS, but this was not used in the modeling). For most other variables in NO.MS, corresponding similar measures in MS PATHS could be identified: an iPad version of the 9HPT test was used (labeled as ‘manual dexterity test’); the walking test (noted as ‘walking speed test’) was found to be similar to the T25FWT; and an iPad version of a cognitive text (noted as ‘processing speed test’) similar to the Symbol Digit Modality Test (SDMT) was used. In MS PATHS, brain parenchymal fraction70 was calculated instead of normalized brain volume, and the number of new/enlarging T2 lesions was used instead of the number of Gd-enhancing T1 lesions. Other differences between MS PATHS and the clinical trial datasets were a lower frequency of visits (typically every 6–12 months) and the fact that visit occurrence was not independent of the occurrence of clinical symptoms (scanning frequency seemed to depend on the occurrence of relapses). Therefore, whereas, in the NO.MS and the Roche MS datasets, the transition probabilities refer to the probability of changing from one disease state to another one within a period of 1 month, in MS PATHS they refer to the probability of changing from one disease states to another within 6 months.
Data preparatory steps
In each of the respective independent external validation datasets, visits were mapped to a regular grid to capture the timing of regularly scheduled visits as well as of unscheduled visits (for example, due to new or worsening neurological symptoms). For the Roche MS dataset, this was a monthly grid, similar to that of NO.MS, whereas, for MS PATHS, due to the lower visit frequency, this was a six-monthly visit grid (subsequently, probabilities in the transition matrix refer to monthly versus six-monthly transitions, respectively). To account for incomplete records and missing post-baseline data, the observed variable’s trajectory over time was used to impute missing values using generalized additive models. Such data imputation was done only at scheduled or unscheduled patient visits where partial patient data were available (Supplementary Information Section 5). The percentage of imputed values overall and for each variable is reported together with the mean absolute error for the imputation. In the clinical trial dataset, baseline was defined as the last assessment prior to randomization, whereas, in MS PATHS, baseline was defined as the first timepoint that patients had all the necessary measurements required for modeling; furthermore, the availability of serial post-baseline assessments was required for inclusion into the analysis set, which led to the total sample size of 2,080 patients from MS PATHS.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
For NO.MS, the reader is able to request the raw data (anonymized) and related documents (for example, protocol, reporting and analysis plan and clinical study report) of all the studies that underlie the modeling results reported in this article by connecting to CSDR (https://www.clinicalstudydatarequest.com) and signing a data-sharing agreement with Novartis. The data will be made available to researchers, with requests reviewed and approved by an independent review panel of CSDR. For Roche MS, including phase 3 ocrelizumab trial data used for the clinical trial validation, qualified researchers can request access to patient-level data by making a request via https://vivli.org/. The anonymized MS PATHS dataset used for the real-world validation can be obtained for purposes of replicating the findings of this study by contacting H.W. at heinz.wiendl@uniklinik-freiburg.de.
Code availability
SIENA/X, part of FSL 6.0, available at https://fsl.fmrib.ox.ac.uk/fsl/, was used to derive normalized brain volume and percentage of brain volume change. The FAHHM code used in this work is available at https://github.com/habib61/FAHMM.
References
MS International Federation. Atlas of MS. https://www.atlasofms.org
Walton, C. et al. Rising prevalence of multiple sclerosis worldwide: insights from the Atlas of MS, third edition. Mult. Scler. 26, 1816–1821 (2020).
Lublin, F. D. & Reingold, S. C. Defining the clinical course of multiple sclerosis: results of an international survey. Neurology 46, 907–911 (1996).
Lublin, F. D. et al. Defining the clinical course of multiple sclerosis: the 2013 revisions. Neurology 83, 278–286 (2014).
Rotstein, D. & Montalban, X. Reaching an evidence-based prognosis for personalized treatment of multiple sclerosis. Nat. Rev. Neurol. 15, 287–300 (2019).
Capanna, M., Signori, A. & Sormani, M. P. Is the effect of drugs in progressive MS only due to an effect on inflammation? A subgroup meta-analysis of randomised trials. Mult. Scler. 28, 1744–1751 (2022).
Cree, B. A. C. et al. Silent progression in disease activity-free relapsing multiple sclerosis. Ann. Neurol. 85, 653–666 (2019).
Kappos, L. et al. Greater sensitivity to multiple sclerosis disability worsening and progression events using a roving versus a fixed reference value in a prospective cohort study. Mult. Scler. 24, 963–973 (2018).
Kappos, L. et al. Contribution of relapse-independent progression vs relapse-associated worsening to overall confirmed disability accumulation in typical relapsing multiple sclerosis in a pooled analysis of 2 randomized clinical trials. JAMA Neurol. 77, 1132–1140 (2020).
Lublin, F. D. et al. How patients with multiple sclerosis acquire disability. Brain 145, 3147–3161 (2022).
Tur, C. et al. Association of early progression independent of relapse activity with long-term disability after a first demyelinating event in multiple sclerosis. JAMA Neurol. 80, 151–160 (2023).
Lassmann, H., van Horssen, J. & Mahad, D. Progressive multiple sclerosis: pathology and pathogenesis. Nat. Rev. Neurol. 8, 647–656 (2012).
Yong, H. Y. F. & Yong, V. W. Mechanism-based criteria to improve therapeutic outcomes in progressive multiple sclerosis. Nat. Rev. Neurol. 18, 40–55 (2022).
Filippi, M. et al. Diagnosis of progressive multiple sclerosis from the imaging perspective: a review. JAMA Neurol. 78, 351–364 (2021).
Kuhlmann, T. et al. Multiple sclerosis progression: time for a new mechanism-driven framework. Lancet Neurol. 22, 78–88 (2023).
He, A. et al. Timing of high-efficacy therapy for multiple sclerosis: a retrospective observational cohort study. Lancet Neurol. 19, 307–316 (2020).
Lublin, F. D., Coetzee, T., Cohen, J. A., Marrie, R. A. & Thompson, A. J. The 2013 clinical course descriptors for multiple sclerosis: a clarification. Neurology 94, 1088–1092 (2020).
Granziera, C., Derfuss, T. & Kappos, L. Time to change the current clinical classification of multiple sclerosis? JAMA Neurol. 80, 128–130 (2023).
Thompson, A. J. et al. Do the current MS clinical course descriptors need to change and if so how? A survey of the MS community. Mult. Scler. 29, 1363–1372 (2023).
Dahlke, F. et al. Characterisation of MS phenotypes across the age span using a novel data set integrating 34 clinical trials (NO.MS cohort): age is a key contributor to presentation. Mult. Scler. 27, 2062–2076 (2021).
Hauser, S. L. et al. Ocrelizumab versus interferon beta-1a in relapsing multiple sclerosis. N. Engl. J. Med. 376, 221–234 (2017).
Montalban, X. et al. Ocrelizumab versus placebo in primary progressive multiple sclerosis. N. Engl. J. Med. 376, 209–220 (2017).
Mowry, E. M. et al. Harnessing real-world data to inform decision-making: Multiple Sclerosis Partners Advancing Technology and Health Solutions (MS PATHS). Front. Neurol. 11, 632 (2020).
Lublin, F. D. Worsening MS—a reappraisal of how we characterize the MS disease course. Mult. Scler. 29, 1699–1700 (2023).
Hawker, K. et al. Rituximab in patients with primary progressive multiple sclerosis: results of a randomized double-blind placebo-controlled multicenter trial. Ann. Neurol. 66, 460–471 (2009).
Wolinsky, J. S. et al. Fingolimod effect on disability progression in primary progressive multiple sclerosis patients with inflammatory activity: a post-hoc subgroup analysis of the INFORMS study. Mult. Scler. 22, 679–680 (2016).
Eshaghi, A. et al. Identifying multiple sclerosis subtypes using unsupervised machine learning and MRI data. Nat. Commun. 12, 2078 (2021).
Ocampo, A. et al. Prognostic factors for worsening and improvement in multiple sclerosis using a multistate model. Mult. Scler. 30, 1455–1467 (2024).
Sormani, M. P. et al. Defining brain volume cutoffs to identify clinically relevant atrophy in RRMS. Mult. Scler. 23, 656–664 (2017).
Miller, D. H. et al. Brain atrophy and disability worsening in primary progressive multiple sclerosis: insights from the INFORMS study. Ann. Clin. Transl. Neurol. 5, 346–356 (2018).
Cagol, A. et al. Association of brain atrophy with disease progression independent of relapse activity in patients with relapsing multiple sclerosis. JAMA Neurol. 79, 682–692 (2022).
Thorpe, J. W. et al. Serial gadolinium-enhanced MRI of the brain and spinal cord in early relapsing-remitting multiple sclerosis. Neurology 46, 373–378 (1996).
Kuhle, J. et al. Serum neurofilament is associated with progression of brain atrophy and disability in early MS. Neurology 88, 826–831 (2017).
Kuhle, J. et al. Blood neurofilament light chain as a biomarker of MS disease activity and treatment response. Neurology 92, e1007–e1015 (2019).
van den Bosch, A. et al. Neurofilament light chain levels in multiple sclerosis correlate with lesions containing foamy macrophages and with acute axonal damage. Neurol. Neuroimmunol. Neuroinflamm. 9, e1154 (2022).
Radue, E. W. et al. Correlation between brain volume loss and clinical and MRI outcomes in multiple sclerosis. Neurology 84, 784–793 (2015).
Petracca, M. et al. Brain reserve and timing of clinical onset in multiple sclerosis. Mult. Scler. 30, 1290–1295 (2024).
Wattjes, M. P. et al. 2021 MAGNIMS–CMSC–NAIMS consensus recommendations on the use of MRI in patients with multiple sclerosis. Lancet Neurol. 20, 653–670 (2021).
Müller, J. et al. Harmonizing definitions for progression independent of relapse activity in multiple sclerosis: a systematic review. JAMA Neurol. 80, 1232–1245 (2023).
Stadelmann, C. Multiple sclerosis as a neurodegenerative disease: pathology, mechanisms and therapeutic implications. Curr. Opin. Neurol. 24, 224–229 (2011).
Cree, B. A. Genetics of primary progressive multiple sclerosis. Handb. Clin. Neurol. 122, 211–230 (2014).
Frischer, J. M. et al. The relation between inflammation and neurodegeneration in multiple sclerosis brains. Brain 132, 1175–1189 (2009).
Kuchling, J. et al. Identical lesion morphology in primary progressive and relapsing-remitting MS—an ultrahigh field MRI study. Mult. Scler. 20, 1866–1871 (2014).
Kuhlmann, T. et al. An updated histological classification system for multiple sclerosis lesions. Acta Neuropathol. 133, 13–24 (2017).
Lassmann, H. Targets of therapy in progressive MS. Mult. Scler. 23, 1593–1599 (2017).
Luchetti, S. et al. Progressive multiple sclerosis patients show substantial lesion activity that correlates with clinical disease severity and sex: a retrospective autopsy cohort analysis. Acta Neuropathol. 135, 511–528 (2018).
Montalban, X. 2024 revisions of the McDonald criteria. Presented at the European Committee for Treatment and Research in Multiple Sclerosis (ECTRIMS), Copenhagen, Denmark. https://ectrims.eu/mcdonald-diagnostic-criteria (2024).
Samjoo, I. A. et al. Comparative efficacy of therapies for relapsing multiple sclerosis: a systematic review and network meta-analysis. J. Comp. Eff. Res. 12, e230016 (2023).
Bussas, M. et al. Multiple sclerosis lesions and atrophy in the spinal cord: distribution across vertebral levels and correlation with disability. Neuroimage Clin. 34, 103006 (2022).
Ruggieri, S. et al. The added value of spinal cord lesions to disability accrual in multiple sclerosis. J. Neurol. 270, 4995–5003 (2023).
Sastre-Garriga, J. et al. Spinal cord reserve in multiple sclerosis. J. Neurol. Neurosurg. Psychiatry 94, 502–510 (2023).
Eden, D. et al. Spatial distribution of multiple sclerosis lesions in the cervical spinal cord. Brain 142, 633–646 (2019).
Bonacchi, R. et al. Clinical relevance of multiparametric MRI assessment of cervical cord damage in multiple sclerosis. Radiology 296, 605–615 (2020).
Bischof, A. et al. Spinal cord atrophy predicts progressive disease in relapsing multiple sclerosis. Ann. Neurol. 91, 268–281 (2022).
Hong, J. et al. Limited added value of systematic spinal cord MRI vs brain MRI alone to classify patients with MS as active or inactive during follow-up. J. Neurol. 272, 316 (2025).
Delbarre, D. J. et al. Application of a convolutional neural network to the quality control of MRI defacing. Comput. Biol. Med. 151, 106211 (2022).
Mallon, A. M. et al. Advancing data science in drug development through an innovative computational framework for data sharing and statistical analysis. BMC Med. Res. Methodol. 21, 250 (2021).
Kurtzke, J. F. Rating neurologic impairment in multiple sclerosis: an expanded disability status scale (EDSS). Neurology 33, 1444–1452 (1983).
D’Souza, M., Papadopoulou, A., Girardey, C. & Kappos, L. Standardization and digitization of clinical data in multiple sclerosis. Nat. Rev. Neurol. 17, 119–125 (2021).
Schwid, S. R., Goodman, A. D., McDermott, M. P., Bever, C. F. & Cook, S. D. Quantitative functional measures in MS: what is a reliable change. Neurology 58, 1294–1296 (2002).
Tombaugh, T. N. A comprehensive review of the Paced Auditory Serial Addition Test (PASAT). Arch. Clin. Neuropsychol. 21, 53–76 (2006).
McDonald, W. I. et al. Recommended diagnostic criteria for multiple sclerosis: guidelines from the international panel on the diagnosis of multiple sclerosis. Ann. Neurol. 50, 121–127 (2001).
Smith, S. M. et al. Accurate, robust, and automated longitudinal and cross-sectional brain change analysis. Neuroimage 17, 479–489 (2002).
Smith, S. M., De Stefano, N., Jenkinson, M. & Matthews, P. M. Normalized accurate measurement of longitudinal brain change. J. Comput. Assist. Tomogr. 25, 466–475 (2001).
Ročková, V. & George, E. I. Fast Bayesian factor analysis via automatic rotations to sparsity. J. Am. Stat. Assoc. 111, 1608–1622 (2016).
Malsiner-Walli, G., Frühwirth-Schnatter, S. & Grün, B. Model-based clustering based on sparse finite Gaussian mixtures. Stat. Comput. 26, 303–324 (2016).
Viterbi, A. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans. Inf. Theory 13, 260–269 (1967).
Sparapani, R. A., Logan, B. R., McCulloch, R. E. & Laud, P. W. Nonparametric survival analysis using Bayesian Additive Regression Trees (BART). Stat. Med. 35, 2741–2753 (2016).
Jackson, C. Multi-state modelling with R: the msm package. https://cran.r-project.org/web/packages/msm/index.html (2007).
Rudick, R. A., Fisher, E., Lee, J. C., Simon, J. & Jacobs, L. Use of the brain parenchymal fraction to measure whole brain atrophy in relapsing-remitting MS. Neurology 53, 1698–1704 (1999).
Acknowledgements
The authors acknowledge the physicians and patients who participated in these studies for their valuable contributions toward the advancement of our knowledge of MS. We also thank the many anonymous researchers and reviewers who have discussed the results and their interpretations with us along the way. Finally, we thank M.-C. Mousseau (Novartis Ireland Ltd.) for her writing support, including reviewing and incorporating author and reviewer comments under the guidance of the authors, as well as P. Coyle (Novartis Ireland Ltd.) and H. Prasad VS (Novartis Healthcare Pvt. Ltd.) for their editorial and design assistance, respectively.
Funding
Open access funding provided by Albert-Ludwigs-Universität Freiburg im Breisgau.
Author information
Authors and Affiliations
Contributions
D.A.H., H.G., T.E.N., F.D.L., L.K., L.G., A.B., D.L.A., R.A.B., W.S., B.C.K., H.W. and C.C.H. contributed substantially to conceptualizing the research. H.G. and C.C.H. developed the probabilistic machine learning method, with input and support from G.G., D.A.H., S.G., P.A. and Y.S. H.G., P.A., Y.S., C.B., S.P.T. and S.G. conducted the wrangling, the analysis and the visualization of the data, with support from G.G. and D.A.H. D.A.H., H.G., T.E.N., F.D.L., L.K., X.M., C.B., A.B., E.F., L.G., S.P.T., A.J.T., D.L.A., R.A.B., G.G., P.A., S.G., Y.S., W.S., B.C.K., H.W. and C.C.H. contributed substantially to the interpretation of the data and the review of the manuscript. C.B., A.B., E.F., L.G. and S.P.T. contributed to the external validation analyses and critically reviewed the final manuscript. D.A.H. and H.G. wrote the manuscript and incorporated co-author comments.
Corresponding author
Ethics declarations
Competing interests
A.J.T. receives fees from being Co-Chair, UCL-Eisai Steering Committee drug discovery collaboration; Member, National Multiple Sclerosis Society (NMSS) Research Programs Advisory Committee; Clinical Trials Committee, Progressive MS Alliance; Board Member, European Charcot Foundation; Editor in Chief, Multiple Sclerosis journal; and Editorial Board Member, Lancet Neurology. He receives no fees from being Chair (Scientific Ambassadors), ‘Stop MS’ Appeal Board, UK MS Society; research & academic counsellor, Fundació Privada Cemcat; and Ambassador, European Brain Council. A.J.T. additionally holds a patent for the MSIS-29 Impact Scale. X.M. has received speaking honoraria and travel expenses for participation in scientific meetings and has been a steering committee member of clinical trials or has participated in advisory boards of clinical trials in the past years with AbbVie, Actelion, Alexion, Biogen, Bristol Myers Squibb/Celgene, EMD Serono, Genzyme, F. Hoffmann-La Roche, Immunic, Janssen Pharmaceuticals, MedDay Pharmaceuticals, Merck, Mylan, NervGen, Novartis, Sandoz, Sanofi-Genzyme, Teva, TG Therapeutics, EXCEMED, the MS International Federation and the NMSS. F.D.L. reports as sources of funding for research: Novartis, Actelion, Biogen, Sanofi, NMSS, National Institutes of Health and Brainstorm Cell Therapeutics; consulting agreements/advisory boards/data and safety monitoring board: Biogen, EMD Serono, Novartis, Teva, Actelion/Janssen, Sanofi/Genzyme, Acorda, Roche/ Genentech, MedImmune/Viela Bio, Receptos/Celgene/Bristol Myers Squibb, TG Therapeutics, MedDay Pharmaceuticals, Atara Biotherapeutics, Mapi Pharma, Apitope, Orion Biotechnology, Brainstorm Cell Therapeutics, Jazz Pharmaceuticals, GW Pharma, Mylan, Immunic, Population Council, Avotres, Neurogene, Banner Life Sciences, LabCorp, Entelexo Biotherapeutics and NeuraLight; stock options: Avotres and NeuraLight; and speaker: Sanofi (non-promotional). L.K.’s institution (University Hospital Basel) has received the following exclusively for research support: steering committee, advisory board and/or consultancy fees (Biogen, EMD Serono Research and Development, Genentech, Janssen, Novartis, Clene Nanomedicine, Bayer, Bristol Myers Squibb, Celltrion, Eli Lilly (Suisse) SA, EMD Serono Research and Development, Galapagos NV, Kiniksa Pharmaceuticals, Merck Healthcare AG, Minoryx and Santhera, Neurostatus UHB AG, Roche, Sanofi, Santhera Pharmaceuticals, Shionogi BV, Wellmera AG and Zai Lab); speaker fees (Bristol Myers Squibb, Janssen, Novartis, Roche and Sanofi); grants (European Union, Innosuisse, Merck Healthcare AG, Novartis and Roche); and testimony (Df-mp Mplina & Pohlman). D.L.A. reports consulting fees from Biogen, Biohaven, Bristol Myers Squibb, Eli Lilly, EMD Serono, Find Therapeutics, Frequency Therapeutics, GlaxoSmithKline, Idorsia Pharmaceuticals, Kiniksa Pharmaceuticals, Merck, Novartis, Race to Erase MS, Roche, Sanofi-Aventis, Shionogi and Xfacto Communications as well as an equity interest in NeuroRx. R.A.B. has served as a consultant for AstraZeneca, Biogen, EMD Serono, Genzyme, Genentech, Novartis and VielaBio. He also receives research support from Biogen, Genentech and Novartis. H.W. has received honoraria for being a member of scientific advisory boards for Biogen, Evgen, Genzyme, MedDay Pharmaceuticals, Merck Serono, Novartis, Roche Pharma AG and Sanofi-Aventis as well as speaker honoraria and travel support from Alexion, Biogen, Cognomed, F. Hoffmann-La Roche, Gemeinnützige Hertie-Stiftung, Merck Serono, Novartis, Roche Pharma AG, Genzyme, Teva and WebMD Global. H.W. is also acting as a paid consultant for AbbVie, Actelion, Biogen, IGES, Johnson & Johnson, Novartis, Roche, Sanofi-Aventis and the Swiss Multiple Sclerosis Society. His research is funded by the German Ministry for Education and Research (BMBF), the Deutsche Forschungsgemeinschaft (DFG), the Else Kröner Fresenius Foundation, the European Union, the Hertie Foundation, the NRW Ministry of Education and Research, the Interdisciplinary Center for Clinical Studies (IZKF) Muenster and RE Children’s Foundation, Biogen, GlaxoSmithKline, Roche Pharma AG and Sanofi-Genzyme. D.A.H., P.A., G.G., W.S., E.F., L.G. and B.C.K. are employees of Novartis. H.G., Y.S., S.G., T.E.N. and C.C.H. are current employees of the Big Data Institute, which received funding from Novartis to collaborate on AI in medicine, including the work presented here. C.B. and S.P.T. are employees of Roche. A.B. received speaker honoraria from Amicus Therapeutics and Biogen. T.E.N. received consulting fees from Perspectum Ltd.
Peer review
Peer review information
Nature Medicine thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editor: Lorenzo Righetto, in collaboration with the Nature Medicine team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Disease classification of multiple sclerosis.
Consensus definitions from 1996 and in its ‘2013 revisions’ and variants of it as used in indication statements in US packet inserts, summaries of product characteristics by the European Medicines Agency and in scientific publications. Relapsing forms of multiple MS include CIS, RRMS, and aSPMS in adults. aRRMS, active RRMS; aSPMS, active SPMS; CIS, clinically isolated syndrome; haRRMS, highly active RRMS; IPPMS, late PPMS (as opposed to ‘early PPMS’); MS, multiple sclerosis; naSPMS, non-active SPMS (as opposed to ‘active SPMS’); PMS progressive MS (SPMS + PPMS); PPMS, primary progressive MS; PRMS, progressive relapsing MS; RRMS relapsing remitting MS; SPMS, secondary progressive MS; «progression in MS » refers to the process of progression, which occurs in all subtypes of MS.
Extended Data Fig. 2 Selecting the number of states.
BIC, Bayesian Information Criterion. A local minimum would indicate the optimal number of states. However, no single best number of states was identified. Models with more and more states with overlapping disease features (as illustrated in Extended Data Fig. 3) – representing a gradient in disease severity features – were found to be a better representation of the data than simpler models.
Extended Data Fig. 3 Graphical illustration of disease states and MS as a gradient.
The states represent a gradient of disease severity based on physical disability and brain damage with distinct inflammatory states without accompanying symptoms (state 4) or with such symptoms, that is relapse (state 5). a, Density plots in the eight-state model: Latent factor distribution in the eight states. Overlapping distributions form a gradient of disease severity based on physical disability and brain damage. Distinct inflammatory states for the clinical relapse (state 5), and for asymptomatic lesions (state 4) b, Endpoint distribution of the original clinical and radiological variables in the eight states. EDSS, Expanded Disability Status Scale; Gd, gadolinium-enhancing; MS, multiple sclerosis; PASAT, Paced Auditory Serial Addition Test.
Extended Data Fig. 4 Alternative models with nine or ten states.
a, b, model with nine states. c, d, model with 10 states. Composite score of MS dimensions and empirical means of the original variables characterising the states (a and c). Transition probability matrix from FAHMM (b and d). The transition probabilities refer to the probability of changing from one disease state to another one within a period of 1 month; the colour code refers to the clinical disease states as described in Fig. 1. Asympt., asymptomatic; EDSS, Expanded Disability Status Scale; Gd + , gadolinium-enhancing; PASAT, Paced Auditory Serial Addition Test.
Extended Data Fig. 5 Eight-state modelling by MS subtypes.
Clinical states (a) and transition matrices between states (b) for bout-onset MS (RRMS, SPMS) and PPMS separately. The disease severity gradient from EME to advanced states of MS, as well as the relapse and asymptomatic disease activity states, were re-discovered for PPMS, and similar to those observed for bout-onset MS. As a minor difference, only one EME state and several advanced states were discovered when fitting the model only to PPMS patients. This is expected, as studies in PPMS systematically excluded patients with an EDSS < 3.5. Overall, disease states and transition patterns were similar between RRMS-SPMS and PPMS patients.
Extended Data Fig. 6 Eight-state modelling with no imputation of missing data.
Analysis based on non-missing data, that is analysis without data imputation performed by mapping data to annual visits (based on the availability of annual MRI scans): (a) clinical states and (b) transition matrix between states. The frequency of subclinical disease activity is underestimated in this model due to the remapping of relapses from other timepoints were they occurred to the annual visits where MRI scans are available. Overall, the disease states and transition patterns observed were similar to those in the main model.
Supplementary information
41591_2025_3901_MOESM1_ESM.pdf
The supplementary file contains five sections as follows: Section 1 (Data sources): Fig. 1.1 and Fig. 1.2. Section 2 (Main model based on NO.MS): Table 2.1. Section 3 (Replication results in holdout NO.MS data): Fig. 3.1 and Table 3.1. Section 4 (Individual patient trajectories): Fig. 4.1. Section 5 (Methodology).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ganjgahi, H., Häring, D.A., Aarden, P. et al. AI-driven reclassification of multiple sclerosis progression. Nat Med (2025). https://doi.org/10.1038/s41591-025-03901-6
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41591-025-03901-6