Deep hierarchical subtyping of multi-organ systemic sclerosis trajectories - a EUSTAR study

Trottet, Cécile; Schürch, Manuel; Allam, Ahmed; Petelytska, Liubov; Castellví, Ivan; Bečvář, Radim; de Vries-Bouwstra, Jeska; Iannone, Florenzo; Carreira, Patricia; Truchetet, Marie-Elise; Cuomo, Giovanna; Rezus, Elena; Cantatore, Francesco Paolo; Simeón-Aznar, Carmen Pilar; Parvu, Magda; Dzhus, Marta; Distler, Oliver; Hoffmann-Vold, Anna-Maria; Krauthammer, Michael

doi:10.1038/s41746-025-01962-y


Article
Open access
Published: 01 September 2025


Cécile Trottet^1,2,
Manuel Schürch^3,4,
Ahmed Allam¹,
Liubov Petelytska^5,6,
Ivan Castellví⁷,
Radim Bečvář⁸,
Jeska de Vries-Bouwstra⁹,
Florenzo Iannone¹⁰,
Patricia Carreira¹¹,
Marie-Elise Truchetet¹²,
Giovanna Cuomo¹³,
Elena Rezus¹⁴,
Francesco Paolo Cantatore¹⁵,
Carmen Pilar Simeón-Aznar¹⁶,
Magda Parvu¹⁷,
Marta Dzhus¹⁸,
Oliver Distler⁵,
Anna-Maria Hoffmann-Vold^5,19,
Michael Krauthammer^1,2 &
EUSTAR Collaborators

npj Digital Medicine volume 8, Article number: 563 (2025)


Abstract

Systemic sclerosis (SSc) is a chronic autoimmune disease with multi-organ involvement. Historically, SSc classification has focused on the type of skin involvement (limited versus diffuse); however, growing evidence of organ-specific variability suggests the presence of more than two distinct subtypes. We propose a semi-supervised generative deep learning framework leveraging expert-driven definitions of organ-specific involvement and severity. We model SSc disease trajectories in the European Scleroderma Trials and Research (EUSTAR) database, containing 14,000 patients across 67,000 medical visits, and identify clinically meaningful subtypes to enhance patient stratification and prognosis. We systematically evaluate the model’s predictive accuracy, robustness to missing data, and clinical interpretability. We identified five patient clusters, separating patients based on the degree of organ involvement. Notably, a subset with limited skin involvement still showed high risks of lung and heart complications, underscoring the importance of data-driven methods and multi-organ models to complement established insights from clinical practice.


Introduction

Systemic sclerosis (SSc) is a chronic autoimmune disease marked by progressive fibrosis and vascular abnormalities in the skin and multiple internal organs such as the lungs, heart, kidneys, and gastrointestinal tract (GT)¹. These multi-organ manifestations vary widely among patients in terms of frequency, onset, and severity, leading to significant morbidity and mortality². Despite known clinical markers, such as skin involvement (limited cutaneous vs. diffuse cutaneous) and autoantibodies (e.g., anti-centromere, anti-topoisomerase I), it remains unclear which organs will become affected over time and how these manifestations might influence subsequent disease progression³. Early detection of at-risk individuals is therefore crucial for managing disease severity and potentially slowing progression⁴.

Traditional classification of SSc relies primarily on the extent of skin involvement: limited cutaneous SSc (lcSSc) is characterized by restricted areas of skin thickening, whereas diffuse cutaneous SSc (dcSSc) involves more widespread skin changes and often correlates with a higher risk of internal organ complications¹. Specific autoantibodies also serve as important biomarkers for SSc diagnosis, organ involvement, and disease progression^5,6. While anti-centromere antibodies (ACA) are predominantly linked with lcSSc and a higher likelihood of pulmonary arterial hypertension (PAH), anti-topoisomerase I antibodies (ATA) are often associated with dcSSc and an increased risk of interstitial lung disease (ILD), and anti-RNA polymerase III antibodies (ARA) are associated with rapid skin thickening and an increased risk of renal crisis⁷. However, because SSc involves complex, overlapping pathologies in multiple organs, subtyping remains a challenge; many crucial aspects of disease progression are not captured by skin-based classification alone⁸.

Recent work has leveraged artificial intelligence (AI), particularly deep learning (DL), to address the complexity of diseases with heterogeneous and longitudinal clinical data⁹ and identify patient subgroups with similar disease evolution^10,11. Fully unsupervised models detect latent (i.e. unobserved) patterns without any labels¹², while supervised approaches rely heavily on labeled outcomes. Neither paradigm alone is ideal for SSc, where labels (e.g., organ-specific damage) may be incomplete or imprecise, yet expert knowledge exists regarding clinically relevant markers and trajectories. Consequently, semi-supervised or hybrid methods have emerged as a promising alternative, combining partial labels and domain knowledge to guide latent representation learning^13,14. Most prior ML-based research for SSc has focused on single-organ complications, such as ILD¹⁵, or is limited by sample sizes¹⁶. A multi-organ model is needed to capture the true disease complexity and identify subtle, high-risk patient subgroups that might otherwise be overlooked⁴.

In this work, we propose a semi-supervised deep learning framework for analyzing and clustering multi-organ trajectories in SSc, leveraging the largest global SSc registry from the European Scleroderma Trials and Research (EUSTAR) group¹⁷. We build on a previously developed temporal variational autoencoder-based model^12,14,18 tailoring it to SSc and incorporating novel expert-guided definitions for two key dynamics, organ involvement and severity, each validated in a prior clinical study¹⁹. We model eight organs commonly affected by SSc: the skin, digital ulcers (DU), joints, muscles, lungs, heart, kidneys, and gastrointestinal tract (GT), and learn interpretable representations of patient disease trajectories. We then cluster these learned embeddings to identify clinically meaningful subtypes that may transcend conventional skin-based classification schemes. Figure 1 summarizes our approach.

**Fig. 1: Overview of the study pipeline.**

Our key contributions include:

Deep multi-organ SSc model: Development of a semi-supervised generative deep learning approach to model eight clinically relevant organs, capturing both involvement and severity over time, while merging data-driven discovery with expert clinical insights.
Deep SSc subtyping: Development of a hierarchical clustering approach for patient trajectories that highlights under-recognized high-risk subgroups and goes beyond traditional SSc subtyping.
Large-scale evaluation: Demonstrating predictive accuracy and generalizability through comprehensive training and evaluation on over 14,000 patients and 67,000 visits from the EUSTAR registry.
Clinical decision support: Demonstrating how additional features of our framework, such as patient similarity and predictive clustering, can support clinical decision-making and personalized medicine.

Results

As detailed in section "Training", we performed five-fold cross-validation (CV). We then trained a final model for each of the five folds, resulting in five final models. We specifically analyzed the model trained on the first fold and used the remaining models to assess result stability, particularly in terms of performance on unseen test data to evaluate generalizability. We support our disease subtyping approach with several analyses: (1) We first evaluate the model’s ability to reconstruct or predict the organ-related variables G (defined in section "Model overview and notations") and its robustness to missing data. (2) Next, we examine how different features and labels shape the structure of the latent space, (3) followed by an in-depth analysis of the identified disease subtypes through hierarchical clustering. We conclude by discussing how various model components can support clinical decision-making.

Predictive performance

We compared our approach against several baselines, including both ML and non-ML approaches in predicting the organ variables in G:

Ours – without feature masking: Uses the same architecture as our final approach but does not explicitly train for missing data imputation, unlike our main model, which uses feature masking (i.e. randomly masks 20% of the input features during training) and learns to reconstruct missing data (see subsection "Handling missing data"). As a result, this model is optimized purely for prediction rather than also for learning missing variables, and we expect it to perform slightly better on complete datasets.
Multilayer Perceptron (MLP): A non-temporal model using only the most recent clinical measurements (unlike our model, which considers the full patient history). It is optimized purely for prediction, and does not learn latent trajectories.
Non-ML baselines: Distribution-based predictions/heuristics are included to provide a benchmark for the general capabilities of ML models.
- Patient-specific: Predicts the future value of a variable based on its current value.
- Cohort mean: Uses the cohort mean of the feature as prediction.
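The two non-ML baselines can be illustrated with a minimal sketch. The toy values, the patient identifiers, and the choice to estimate the cohort mean on training patients only are assumptions made for illustration; the source does not specify these details.

```python
from statistics import mean


def mae(pred, true):
    """Mean absolute error over paired predictions and targets."""
    return mean(abs(p - t) for p, t in zip(pred, true))


# Hypothetical toy data: one continuous variable, one list of visits per patient.
train = {"p1": [60.0, 58.0, 55.0], "p2": [90.0, 92.0, 91.0]}
test = {"p3": [70.0, 68.0, 69.0], "p4": [85.0, 80.0, 78.0]}

# Cohort-mean baseline: a single constant, here estimated on training patients
# (an assumption; the text only says "cohort mean").
cohort_mean = mean(v for series in train.values() for v in series)


def baseline_rows(series, cohort_mean):
    """Return (target, patient-specific prediction, cohort-mean prediction)
    for every consecutive pair of visits."""
    return [(nxt, current, cohort_mean) for current, nxt in zip(series, series[1:])]


rows = [row for series in test.values() for row in baseline_rows(series, cohort_mean)]
targets = [r[0] for r in rows]
mae_patient = mae([r[1] for r in rows], targets)  # carry the current value forward
mae_cohort = mae([r[2] for r in rows], targets)   # always predict the cohort mean

print(f"patient-specific MAE: {mae_patient:.2f}, cohort-mean MAE: {mae_cohort:.2f}")
```

On this toy data the patient-specific baseline is the stronger of the two, which is the behavior one would expect for slowly changing clinical measurements.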

Table 1 presents the Mean Absolute Error (MAE) for continuous variables and weighted F₁ score for categorical variables for each model, averaged across five CV folds. Our final model and the variant without feature masking (i.e. missingness training) perform similarly and slightly outperform the MLP model. All ML models strongly outperform the non-ML baselines. Moreover, in Supplementary Table 4, we show that our approach outperforms all other models in terms of robustness to missing data.
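For readers unfamiliar with the weighted F₁ score, the following sketch computes it as the support-weighted average of per-class F₁ scores, which is the usual convention for this metric. The function name and the toy labels are hypothetical, and the source does not state which implementation was used.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Hypothetical binary label, e.g. an involvement indicator for one organ.
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
score = weighted_f1(y_true, y_pred)
print(round(score, 4))
```

Weighting by support keeps frequent classes from being drowned out by rare ones, at the cost of making the score less sensitive to errors on minority classes.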

Table 1 Predictive performance


Latent space analysis: ground truth vs. reconstructed values

As detailed in section “Model Architecture”, our model is trained to project raw patient trajectories into a latent (i.e. unobserved) space. In this section, we examine and interpret these latent representations. To facilitate the analysis, we computed the 2-dimensional UMAP²⁰ decomposition for each time point in the latent trajectories, providing a visualization aid for the latent space. In the resulting UMAP plots (for instance Fig. 2), each point corresponds to a patient at a specific time. By overlaying the UMAP plots with color-coded clinical measurement values, labels, or clusters, we can intuitively visualize patient trajectories, cluster patterns, and feature/label distributions within the latent space.

**Fig. 2: Ground truth versus reconstructed data.**

As discussed in section “Predictive Performance”, we train our model to infer values for missing variables. Fig. 2 shows a side-by-side UMAP visualization comparing the ground truth for masked values (i.e. not provided as input to the model) and the corresponding model reconstructions for two features related to lung fibrosis. The close alignment between ground truth and reconstructed values illustrates that the model reliably imputes missing data. Notably, this applies to all variables, whether available or not, thereby enriching the latent embeddings beyond what is present in the raw inputs. Supplementary Fig. 5 further demonstrates how the model learns to “fill in” gaps in the latent space.

Latent space regions

We observe that patients with different disease manifestations are mapped to distinct regions within the latent space. By overlaying the UMAP plots with specific feature values, we can identify the areas corresponding to different patient types, and gain insight into which features most strongly influence the latent space separation. In Fig. 3, the latent space is color-coded based on feature values inferred by our model, revealing a clear separation concerning “HRCT: Lung fibrosis” (true vs false) and the “Cutaneous SSc” (limited vs diffuse). Additionally, a subset of the patients with Digital Ulcers is mapped closely together, and we can distinctly identify regions associated with Esophageal symptoms. Additional plots and discussion for other variables are provided in Supplementary Note 5.

**Fig. 3: Regions of the latent space.**

Hierarchical disease subtyping: first hierarchy of clusters

To perform disease subtyping, we followed the hierarchical clustering approach described in section “Trajectory Clustering”. We identified two primary clusters, and then further subdivided each of these into more granular subtypes. The first hierarchy of clusters distinguishes between patients with milder and more severe disease trajectories. The second level divides the mild group into two subtypes and the more severe group into three subtypes (Fig. 1, Panel 3). In the following, we provide a detailed description of each cluster, followed by a discussion on the differences between clusters, highlighting the key variables driving cluster separation. For every organ, we plotted the empirical involvement and severity curves by averaging the model-inferred probabilities across all patient visits belonging to a given cluster at each follow-up visit.
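The averaging behind these empirical curves can be sketched as follows. The data layout (one list of probabilities per patient, indexed by follow-up visit), the function name, and the toy values are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean


def cluster_curves(probabilities, clusters):
    """Average model-inferred probabilities per cluster at each follow-up visit.

    probabilities: {patient_id: [p at visit 0, p at visit 1, ...]}
    clusters:      {patient_id: cluster label}
    Returns {cluster label: [mean p at visit 0, mean p at visit 1, ...]}.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for pid, probs in probabilities.items():
        for visit, p in enumerate(probs):
            buckets[clusters[pid]][visit].append(p)
    return {
        label: [mean(by_visit[v]) for v in sorted(by_visit)]
        for label, by_visit in buckets.items()
    }


# Hypothetical probabilities of lung involvement for three patients.
probabilities = {"a": [0.2, 0.4], "b": [0.4, 0.6, 0.8], "c": [0.9, 0.7]}
clusters = {"a": "mild", "b": "mild", "c": "severe"}
curves = cluster_curves(probabilities, clusters)
print(curves)
```

Because patients have different numbers of visits, later points on a curve are averaged over fewer patients, which is worth keeping in mind when reading the tails of such plots.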

In the first hierarchy of clusters, patients are split into two clusters (Fig. 4): a mild cluster (green) and a severe cluster (purple).

Mild Cluster (green): Patients have moderate to high probabilities of GT, heart and skin involvement, and an increasing likelihood of DU. They have a low risk of severe symptoms across all organs.
Severe Cluster (purple): Compared to the mild cluster, patients have a higher likelihood of lung involvement, and exhibit high severity of skin symptoms. Severity is additionally elevated for both heart and lung symptoms.

**Fig. 4: First hierarchy of clusters.**

These observations align with established SSc subtypes based on skin severity (limited vs. diffuse/severe)²¹ and previous findings linking severe skin involvement with earlier, more frequent internal-organ complications^22,23 as well as more pronounced ILD²⁴. Supplementary Fig. 11 compares the average feature values over time in both clusters. Overall, patients in the severe cluster exhibit higher modified Rodnan skin scores (mRSS), more dyspnea, increased lung fibrosis (on HRCT and X-ray) and lower forced vital capacity (FVC) compared to those in the mild cluster.

Second hierarchy of clusters

This hierarchy further subdivides the clusters: the mild disease trajectory cluster is split into two subtypes (pale and dark green, Fig. 5A), while the severe disease trajectory cluster is divided into three subtypes (pale blue, dark blue, and red, Fig. 5B).

Figure 5A shows the average label values over time for the patients categorized in the two mild disease subtypes. In particular, the clusters have the following characteristics:

Pale Green Cluster: Patients in this cluster have a high likelihood of skin involvement (non-severe). They have moderate probabilities of heart and GT involvement and experience an increasing probability of DU involvement over time. The probability of severe involvement remains low for all organs.
Dark Green Cluster: Patients have a comparatively higher likelihood of heart and, in particular, GT involvement. Additionally, there is a comparatively faster rise in kidney involvement. Symptom severity remains low across organs.

In summary, patients in the pale green cluster generally experience the mildest disease, while those in the dark green cluster exhibit slightly increased risks, particularly for GT and heart involvement. These patterns suggest that even among patients with limited (i.e. non-severe) skin involvement, a subgroup exists with higher probabilities of GT and cardiac issues²⁴. The dark green cluster shows an increasing trend in dyspnea, lower eGFR, and more frequent esophageal symptoms and recurrent DU (Supplementary Fig. 12).

Figure 5B shows the average label values over time for the patients categorized in the more severe disease subtypes. In particular, the clusters have the following characteristics:

Pale blue cluster: Patients in this cluster experience high probabilities of severe skin involvement, with slightly increased severity of lung symptoms. Given overall high organ involvement, these patients show prototypical characteristics of diffuse cutaneous SSc, with elevated risks for heart, ILD, GT, and DU²⁴.
Red cluster: Compared to the pale blue cluster, patients experience elevated but slightly lower skin severity, with higher severity of heart, lung, GT and DU symptoms. These diffuse cutaneous SSc patients are at high risk for multi-organ complications.
Dark blue cluster: Patients in this cluster have even lower skin severity, while still experiencing elevated levels of heart and lung symptoms. Importantly, using the current disease classification criteria based on skin severity, these patients may be overlooked despite facing a high risk of multi-organ complications^25,26.

In summary, while all three severe subtypes show high probabilities of skin involvement, only the pale blue and red clusters exhibit severe skin manifestations. Importantly, patients in the dark blue cluster may be overlooked due to their limited skin manifestations, even though they face high mortality risk from ILD and heart complications^25,26. Feature comparisons (Supplementary Fig. 12) show that the dark blue cluster has a higher likelihood of lung fibrosis on HRCT or X-ray, while the red cluster is more prone to esophageal or stomach symptoms. Both the red and dark blue clusters experience increasing dyspnea over time, and the pale blue cluster maintains higher eGFR levels compared to the other two groups.

Overall, cluster separation is primarily driven by lung, skin, heart, and gastrointestinal involvement. For mild trajectories, two clusters emerged, both with low probabilities of severe organ involvement, though one exhibits slightly higher overall organ involvement. Three subtypes of severe trajectories were identified: one cluster shows a high likelihood of severe skin involvement with minimal severe involvement elsewhere, while the other two present increased probabilities of severe lung and heart complications. Notably, we identified a high-risk cluster (dark blue) with limited skin severity.

Cluster stability

As described in section “Training”, we performed 5-fold CV, producing five models each trained on different subsets of the training data. We also reserved a hold-out test set, not included in the CV process, for an independent clinical evaluation of the clustering results. To assess how consistently the clusters formed across these models, we examined which features most strongly contributed to cluster separation. Specifically, for each cluster and each model, we computed the average value (or class probability) of every feature and then calculated the standard deviation of these averages across the clusters. A higher standard deviation indicates a greater influence on cluster separation. Ranking the features by this standard deviation revealed that the same subset of features consistently drove clustering across models. The bar charts in Fig. 6 illustrate the standard deviation of feature values, with larger bars indicating more pronounced variability across clusters and error bars capturing variation among the five models. Notably, the error bars are generally small, suggesting strong consistency in feature ranking across the models. These findings also confirm the trends discussed in section “Hierarchical Disease Subtyping: First Hierarchy of Clusters”, where skin- and lung-related features are the primary drivers of cluster separation.

**Fig. 6: Top eight features ranked by their variation across the five final clusters.**
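The feature ranking described above can be sketched in a few lines. The example assumes that features are already on a comparable scale (e.g. standardized), uses the population standard deviation, and relies on toy numbers with only two clusters and two models; none of these choices is confirmed by the source.

```python
from statistics import mean, pstdev


def separation_scores(cluster_means):
    """Standard deviation of per-cluster feature averages, for one model.

    cluster_means: {cluster label: {feature name: average value}}
    """
    features = list(next(iter(cluster_means.values())))
    return {f: pstdev([m[f] for m in cluster_means.values()]) for f in features}


def rank_features(models):
    """Rank features by their mean separation score across models.

    Returns [(feature, (bar height, error bar)), ...] in descending order,
    where the error bar is the spread of the score across models.
    """
    per_model = [separation_scores(cm) for cm in models]
    summary = {
        f: (mean([s[f] for s in per_model]), pstdev([s[f] for s in per_model]))
        for f in per_model[0]
    }
    return sorted(summary.items(), key=lambda kv: kv[1][0], reverse=True)


# Hypothetical per-cluster averages from two models (the study uses five).
models = [
    {"c1": {"mRSS": 0.0, "FVC": 1.0}, "c2": {"mRSS": 4.0, "FVC": 2.0}},
    {"c1": {"mRSS": 1.0, "FVC": 1.0}, "c2": {"mRSS": 3.0, "FVC": 1.0}},
]
ranking = rank_features(models)
print(ranking)
```

A small error bar relative to the bar height indicates that the feature separates clusters to a similar degree in every model, which is the stability property the paragraph above relies on.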

Clinical decision support system

Using our trained model, we can build a clinical decision support system (CDSS) that enables predicting future patient latent trajectories and early identification of disease subtypes. By comparing predicted cluster assignments at different stages of a patient’s journey to the final cluster assignment (after all medical visits have been encoded), we can anticipate the most likely disease subtype early in the disease course. Figure 7 illustrates these capabilities within a CDSS for a sample patient:

Panel A: Predicted (in blue) versus final (in red) latent trajectory, with corresponding cluster assignments.
Panel B: Final trajectory alongside nearest neighbors.
Panel C: Trajectories of key clinical variables for the patient and nearest neighbors.
Panel D: Trajectories of selected medical labels for the patient and nearest neighbors.

**Fig. 7: Clinical decision support system.**

For this patient, the CDSS suggests they likely belong to the purple subtype, characterized by a high risk of severe skin involvement (Fig. 7A). Similar patients are located in regions with likely lung fibrosis and esophageal symptoms (Fig. 7B). Moreover, as shown in Supplementary Fig. 15b, predicting cluster assignment at various stages of a patient’s journey to the final cluster yields a high F₁ score (around 0.8), demonstrating the model’s effectiveness in early severity stratification. This capability allows clinicians to intervene sooner, potentially mitigating organ involvement. Furthermore, following the procedure in section “Trajectory Clustering”, our model identifies the top-k similar patient trajectories (here, k = 3) to any given patient from the test set. Clinicians can leverage this feature to compare disease progressions, offering insights into a patient’s likely trajectory.
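The top-k retrieval of similar patients can be illustrated with a simple nearest-neighbor search over latent trajectories. The distance used here (mean Euclidean distance over the visits two trajectories share) is an assumption chosen for illustration, not necessarily the procedure of section “Trajectory Clustering”; the function names and the two-dimensional toy latents are likewise hypothetical.

```python
import math


def trajectory_distance(a, b):
    """Mean Euclidean distance over the visits both trajectories share."""
    n = min(len(a), len(b))
    return sum(math.dist(a[t], b[t]) for t in range(n)) / n


def top_k_similar(query, cohort, k=3):
    """Return the ids of the k cohort trajectories closest to the query."""
    ranked = sorted(cohort, key=lambda pid: trajectory_distance(query, cohort[pid]))
    return ranked[:k]


# Hypothetical latent trajectories: one 2-D latent point per visit.
query = [(0.0, 0.0), (1.0, 1.0)]
cohort = {
    "a": [(0.0, 0.0), (1.0, 1.0)],
    "b": [(0.0, 1.0), (1.0, 2.0)],
    "c": [(3.0, 4.0), (4.0, 5.0)],
    "d": [(0.0, 0.5), (1.0, 1.5)],
}
neighbors = top_k_similar(query, cohort, k=3)
print(neighbors)
```

Any trajectory-aware distance could be substituted in `trajectory_distance` without changing the retrieval logic.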

Discussion

In this work, we introduced a semi-supervised generative deep learning model that leverages expert-defined disease criteria to capture the complexity of systemic sclerosis across eight organs. Our approach uncovered five distinct hierarchical SSc subtypes spanning a mild-to-severe spectrum (Fig. 5). Among the two “mild” subtypes, one cluster showed little organ involvement, whereas the other displayed higher tendencies for GT and heart issues. In the “severe” subtypes, we found one cluster aligned with a classic diffuse disease profile and elevated multi-organ involvement, another marked by pronounced multi-organ severity, and a particularly noteworthy cluster with limited skin involvement yet elevated risks of lung and heart complications. This highlights the shortcomings of relying on skin phenotypes alone.

These findings underscore the clinical utility of combining expert-guided label definitions with data-driven representation learning. By leveraging even partially labeled information, the model aligned learned trajectories with known clinical patterns, while also revealing less apparent subtypes that may carry significant morbidity risk. Overall, our approach moves beyond skin-based distinctions, offering a framework for translating complex patient data into interpretable, actionable insights for personalized clinical decision support.

The primary limitation of our approach stems from the challenge of modeling highly imbalanced and sparse datasets. We observed that organ dynamics with highly imbalanced data tended to have less impact on subtyping, suggesting the need to investigate techniques like re-weighting minority classes during training. Alternatively, a more targeted model could be developed, focusing only on specific labels rather than the holistic approach used in this study.

Next, we plan to leverage the learned latent trajectories to answer questions specific to particular patient subsets, for instance, patients who develop ILD early in the disease course. By pretraining our model on the full dataset and subsequently clustering only within the ILD cohort, we can uncover ILD-specific subtypes.

Furthermore, our choice of five clusters, although guided by both mathematical and clinical validation, should not be interpreted as a definitive “ground truth”. For more fine-grained results, a similar hierarchical strategy could be extended through further sub-clustering, potentially revealing additional patterns in sparser organ dynamics.

Finally, the present study is purely retrospective, relying on observational patient data. A key limitation is the absence of a healthy-control reference: the EUSTAR registry does not enroll unaffected individuals, and no external cohort provides longitudinal, organ-specific assessments of comparable granularity. As a result, our analysis is confined to delineating phenotypic heterogeneity within the SSc population rather than benchmarking these trajectories against normative patterns. A possible next step would be to conduct a silent prospective evaluation in clinical practice to assess how well the model supports rheumatologists’ decision-making in real time.

Methods

Analyzing and comparing raw longitudinal patient trajectories presents significant challenges due to heterogeneity, temporality, missingness, and biases⁹. To overcome these issues, we propose a two-stage approach. First, we develop a deep learning model to transform raw, heterogeneous data into smoother temporal patient representations. These refined representations are then used for disease subtyping through temporal clustering. Supplementary Note 11 summarizes the key machine-learning concepts referenced in this work.

Cohort description

We use SSc patient data from the European Scleroderma Trials and Research group (EUSTAR) registry (database export from June 1, 2022), a comprehensive dataset detailed in refs. ^17,27. This study was conducted in accordance with the Declaration of Helsinki and was approved by the local ethical committees of the participating EUSTAR centers. All patients provided written informed consent for their data to be used for research purposes as required by the local ethics committees for this study. The project was approved by the EUSTAR board (project number: CP125).

After preprocessing, the database comprises 14,060 patients and 67,894 medical visits, averaging approximately 4.8 medical visits per patient (see Supplementary Fig. 2 for the distribution of the number of patient visits). We included demographic variables such as gender and age, along with temporal variables measuring the disease progression across different organs, following the variable selection approach detailed in section “Variable selection for organ-specific definitions”. Moreover, Supplementary Note 2 provides additional details about the database, such as feature distribution plots (Supplementary Figs. 3 and 4 and Supplementary Tables 2 and 3) and a list of variable names with brief descriptions (Supplementary Table 1). To facilitate comparison with other EUSTAR studies, we retained the original variable names from the EUSTAR database when they were sufficiently clear.

We excluded patients with fewer than two or with 15 or more medical visits, and removed outliers. The upper limit on the number of visits avoids biasing the model towards a few heavily sampled trajectories. Additionally, all patients included in the analysis were 18 years or older. A CONSORT diagram describing patient inclusion during the different steps of our analysis is shown in Supplementary Fig. 1. Prior to model training or application, continuous variables were standardized, and categorical variables were one-hot encoded.
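The inclusion rule and the two encoding steps can be sketched as follows. The helper names and example values are hypothetical, the use of the population standard deviation is an assumption, and outlier removal is omitted because its criteria are not described here.

```python
from statistics import mean, pstdev

# Patients with fewer than two, or with 15 or more, visits are excluded.
MIN_VISITS, MAX_VISITS = 2, 14


def keep_patient(visits):
    """Inclusion rule on the number of recorded medical visits."""
    return MIN_VISITS <= len(visits) <= MAX_VISITS


def standardize(values):
    """Rescale a continuous variable to zero mean and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant variable: nothing to scale
    return [(v - mu) / sigma for v in values]


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 indicator vector."""
    return [1 if value == c else 0 for c in categories]


scaled = standardize([1.0, 2.0, 3.0])
encoded = one_hot("diffuse", ["limited", "diffuse"])
print(scaled, encoded)
```

In practice the mean and standard deviation would typically be estimated on the training split and reused for the test split, so that no information leaks from held-out patients.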

Variable selection for organ-specific definitions

For each organ, we model two dynamics: (a) involvement and (b) severity stage (if applicable), representing organ-specific outcome labels. These labels are computed based on clinical definitions (i.e. list of criteria) applied to a set of organ-specific variables recorded in the dataset.

More specifically, to create these labels, (1) we first reviewed the literature to compile all clinical definitions for each organ, usually ending up with multiple definitions per label (i.e. definitions for organ involvement and organ severity). (2) We then identified the relevant clinical variables available in the EUSTAR database (list of variables per definition), resulting in an extensive set of input variables X to describe organ dynamics. (3) In the second stage, a steering committee of ten SSc experts from various EUSTAR centers selected the most clinically relevant definition for each organ and label¹⁹. The final definitions are provided in Supplementary Note 3, and this process yielded a refined subset of EUSTAR variables G ⊆ X, derived from the final definitions. A complete list of variables in X and G is available in Supplementary Note 2. Panel 1 in Fig. 1 illustrates the variable selection process of our study. Note that autoantibody profiles were intentionally omitted, as their prognostic value in SSc is already well-documented, and our objective was to derive patient subtypes exclusively from longitudinal organ-specific trajectories.

Model overview and notations

For each patient, our model learns to summarize raw medical measurements into organ-specific representations that encode both the presence and severity of organ involvement. A sequence of these representations yields a longitudinal trajectory for every patient, and clustering those trajectories uncovers five distinct SSc phenotypes, each with a characteristic pattern of multi-organ disease. Following standard ML practice, we develop and tune the model on a training partition of the data and reserve an independent test set for final evaluation, confirming that the identified phenotypes generalize to previously unseen patients. See Supplementary Note 11 for an overview of the key ML concepts.

As outlined in section “Variable selection for organ-specific definitions”, the set of temporal input variables X comprises a broad range of clinical measurements related to organ dynamics. Furthermore, a more refined subset of these variables, G ⊆ X, reflects the latest medical knowledge on organ impact in SSc. These variables are continuous, binary, or categorical, with all categorical variables being ordinal. For each patient, let x ≔ x_1:T ∈ X and g ≔ g_1:T ∈ G, where \(x \in \mathbb{R}^{D \times T}\) and \(g \in \mathbb{R}^{P \times T}\) represent the temporal clinical measurements, T is the index of the most recent measurement (i.e. the last available in the database), and D and P are the numbers of variables in X and G, respectively. Additionally, we define m ≔ m_1:T ∈ M, where \(m \in \mathbb{R}^{D \times T}\) is a boolean mask indicating the availability of clinical variables. We also incorporate N static demographic variables s ∈ S, \(s \in \mathbb{R}^{N}\). Our goal is to model the distribution of L latent, i.e. unobserved, variables z ≔ z_1:T ∈ Z, where \(z \in \mathbb{R}^{L \times T}\), that generate the observed X and G conditioned on S. These latent variables should contain the key information necessary to reconstruct X and predict G.

Model architecture

We adopt a probabilistic approach leveraging and adapting the well-established variational autoencoder (VAE) framework¹² to learn interpretable latent (unobserved) temporal organ-specific representations. Our method is designed to model organ behaviors in SSc by learning from the entire dataset while separately modeling each organ, thereby facilitating the analysis of organ-specific dynamics. We build on our prior deep probabilistic model¹⁴, in which we designed a temporal VAE-based approach to model the behavior of three organs (lungs, heart, and joints) in SSc to perform online patient monitoring. A key design element is “guiding” distinct latent dimensions for each organ (i.e. non-overlapping subsets of dimensions of the z vector), ensuring each subset of the latent dimension learns specialized organ-specific trajectories. In ref. ¹⁴, we used preliminary label definitions to guide these dimensions in a semi-supervised manner, training separate networks to predict all clinical variables. Here, we instead focus on learning predictive latent processes specifically for the organ-related variables G, with final label definitions aimed at improving disease subtyping.

We model eight organs (previously three), adapting the architecture to handle higher dimensionality and missing data. As in ref. ¹⁴, we dedicate separate latent dimensions to learn each organ’s dynamics (see Fig. 1). Following the bottleneck principle, the model is trained to reconstruct the variables in X. Additionally, we implement individual multilayer perceptrons (MLPs) as “guidance” networks for each variable in G. These networks receive the organ-specific latent subsets and learn to reconstruct and predict the current and future values of their respective variables. Intuitively, we integrate these organ-specific medical definitions as partial labels to guide the latent space for each organ dimension. We also train our model using an additional mask (denoted feature masking) by randomly dropping 20% of the input features to make the model more robust in reconstructing missing data (see subsection "Handling missing data"). In summary, for each patient, given x_1:t, s, m_1:t, and g_1:t, the model learns the distribution of z_1:T and uses a sampled z to reconstruct and predict x_1:t and g_1:T. The encoder network relies on MLPs and Long Short-Term Memory networks (LSTMs)²⁸, while the decoder and guidance networks are independent MLPs (Fig. 1). A separate neural network models the prior distribution of z (not shown in Fig. 1).
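The idea of guiding non-overlapping latent subsets can be sketched as follows. This is an untrained NumPy illustration, not the authors' implementation: the number of latent dimensions per organ, the hidden size, and the mapping from guidance variables to organs are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N_ORGANS = 8          # stated in the text
DIMS_PER_ORGAN = 2    # hypothetical; the actual latent sizes are not given here
L = N_ORGANS * DIMS_PER_ORGAN

# Non-overlapping slice of the latent vector for each organ.
organ_slices = {
    organ: slice(organ * DIMS_PER_ORGAN, (organ + 1) * DIMS_PER_ORGAN)
    for organ in range(N_ORGANS)
}


def init_mlp(in_dim, hidden_dim, out_dim):
    """Random weights for a one-hidden-layer MLP (untrained, illustration only)."""
    return {
        "w1": rng.normal(scale=0.1, size=(in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(scale=0.1, size=(hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }


def mlp_forward(params, inputs):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    hidden = np.tanh(inputs @ params["w1"] + params["b1"])
    return hidden @ params["w2"] + params["b2"]


# One guidance head per guidance variable; each head is tied to a single organ.
# Hypothetical mapping: guidance variable index -> organ index.
guidance_to_organ = {0: 0, 1: 0, 2: 3}
guidance_heads = {var: init_mlp(DIMS_PER_ORGAN, 8, 1) for var in guidance_to_organ}


def predict_guidance(z_t):
    """Predict every guidance variable from its own organ's latent slice only."""
    return {
        var: mlp_forward(guidance_heads[var], z_t[organ_slices[organ]])
        for var, organ in guidance_to_organ.items()
    }


z_t = rng.normal(size=L)  # latent state at one visit
predictions = predict_guidance(z_t)
```

Because each head reads only its organ's slice, perturbing the latent dimensions of one organ leaves the predictions for the other organs unchanged, which is what makes the per-organ trajectories separately interpretable.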

Handling missing data

The model expects a fixed-length input vector, so unobserved measurements are initially filled with cohort means computed on the training split. We then supply an accompanying missingness mask m that flags every imputed entry. The encoder therefore sees two channels per variable: its (possibly imputed) value and its missingness mask (i.e. a boolean indicator). During training, the reconstruction/prediction loss is computed exclusively on observed values; imputed placeholders are ignored. In addition, we randomly drop 20% of the observed inputs in every mini-batch (“feature masking”). This forces the decoder to learn the joint structure of the data and produces reliable model-based imputations. All analyses therefore operate on the reconstructed time series, preventing bias from simple mean imputation even for variables with very high missingness. A detailed ablation showing the resulting robustness is reported in Supplementary Table 4.
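The preprocessing described above can be sketched in a few lines of NumPy. The function names are hypothetical, and two details are assumptions rather than statements from the text: dropped entries are replaced by the training mean, and the original observation mask is kept so that the loss can still be evaluated on the dropped entries.

```python
import numpy as np

rng = np.random.default_rng(0)


def fit_train_means(train_values):
    """Per-variable means over all training patients and visits, ignoring NaNs.

    train_values: array of shape (n_patients, D, T) with NaN for missing entries.
    """
    return np.nanmean(train_values, axis=(0, 2))


def impute_with_mask(values, train_means):
    """Fill NaNs with the training means and return (filled, observed_mask)."""
    observed = ~np.isnan(values)
    filled = np.where(observed, values, train_means[None, :, None])
    return filled, observed


def feature_mask(filled, observed, train_means, drop_rate=0.2):
    """Hide a random fraction of the observed inputs from the encoder.

    Returns the two encoder channels (values, mask). The untouched `observed`
    mask stays available to the caller for computing the loss.
    """
    drop = observed & (rng.random(observed.shape) < drop_rate)
    encoder_mask = observed & ~drop
    encoder_values = np.where(encoder_mask, filled, train_means[None, :, None])
    return encoder_values, encoder_mask


# Toy data: 10 patients, 4 variables, 6 visits, roughly 30% missing.
train = rng.normal(size=(10, 4, 6))
train[rng.random(train.shape) < 0.3] = np.nan

means = fit_train_means(train)
filled, observed = impute_with_mask(train, means)
enc_values, enc_mask = feature_mask(filled, observed, means)
```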

Training

We first split the full dataset into a combined training and validation set (85%) and a test set (15%); the training portion was used exclusively for model development and tuning, while the test set remained untouched until final evaluation. We then performed five-fold cross-validation (CV) on the training data: the training set was divided into five equal folds, and in turn, one fold served as a validation set while the model was trained on the other four. Within each training split, we executed a random search over hyperparameter combinations, selecting the configuration that minimized validation loss. This procedure yielded five separate final models, one per fold. To assess the stability and consistency of the results, each of the five models was then evaluated on the independent 15% hold-out test set that was never seen during training and tuning.
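A minimal sketch of this splitting scheme is shown below. It assumes that the split is made at the patient level, so that all visits of a patient fall into the same set; the function name and the seed are hypothetical.

```python
import numpy as np


def split_patients(n_patients, test_fraction=0.15, n_folds=5, seed=0):
    """Return held-out test ids and a list of (train_ids, val_ids) CV splits."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_patients)

    # Hold out the test patients first; they are never used for tuning.
    n_test = int(round(test_fraction * n_patients))
    test_ids = order[:n_test]
    dev_ids = order[n_test:]

    # Divide the remaining patients into folds of (nearly) equal size.
    folds = np.array_split(dev_ids, n_folds)
    cv_splits = []
    for i in range(n_folds):
        val_ids = folds[i]
        train_ids = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        cv_splits.append((train_ids, val_ids))
    return test_ids, cv_splits


test_ids, cv_splits = split_patients(n_patients=1000)
```

Each of the five (train, validation) pairs would then drive one hyperparameter search and yield one final model, and all five models are scored on the same `test_ids`.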

To train our model, we adapted the objective function from refs. ¹⁴,¹⁸ to our specific setting. We outline the key aspects of the optimization process here and refer the reader to ref. ¹⁴ for detailed computational information. Consider observational patient data x_1:T, g_1:T and s, where T is the index of the most recent clinical measurement. For each time step t = 1, …, T, given x_1:t, the model is trained to predict the distribution of the full latent trajectory z_1:T. Using a sample of this latent distribution, the guidance decoders are then trained to reconstruct and predict g_t:T, minimizing the cross-entropy loss for binary or categorical variables and the mean squared error (MSE) for continuous variables. Similarly, the decoder is trained to reconstruct x_1:t given z_1:T, also using cross-entropy or MSE depending on the variable type. The model learns the distribution of the latent space by minimizing the Kullback-Leibler (KL) divergence, a regularization term that aligns the prior assumptions about the latent space with the distribution learned by the encoder. Following the approach in ref. ¹⁴, we assume a Gaussian distribution with constant variance for continuous variables, Bernoulli or categorical distributions for binary and categorical variables, and a Gaussian prior distribution for the latent space. During model training, the parameters of these predefined distributions are learned and optimized.
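Schematically, the per-time-step objective described above has the form of a guided negative evidence lower bound. The expression below is only a sketch under stated assumptions: the weights \(\alpha\) and \(\beta\), the parameter symbols \(\phi, \theta, \gamma, \psi\), and the exact conditioning of the encoder and of the learned prior are not specified in this section, and the precise formulation is given in ref. ¹⁴.

```latex
\[
\mathcal{L}_t =
  - \mathbb{E}_{q_\phi(z_{1:T} \mid x_{1:t},\, m_{1:t},\, s)}
    \Big[ \log p_\theta(x_{1:t} \mid z_{1:T})
        + \alpha \, \log p_\gamma(g_{t:T} \mid z_{1:T}) \Big]
  + \beta \, \mathrm{KL}\Big( q_\phi(z_{1:T} \mid x_{1:t},\, m_{1:t},\, s)
        \,\Big\|\, p_\psi(z_{1:T}) \Big)
\]
```

Under the stated distributional assumptions, the Gaussian log-likelihood with constant variance reduces to the MSE up to constants, and the Bernoulli and categorical log-likelihoods reduce to the cross-entropy, which recovers the losses named in the text.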

Importantly, when computing the loss, we only include the observed (non-missing) variables. This ensures that the model is not trained to reconstruct imputed data, reducing potential bias. Furthermore, to enhance the model’s ability to handle missing data, we randomly mask 20% of the available clinical measurements in each batch during each training epoch. We use the Adam²⁹ algorithm with mini-batch processing to optimize the objective function.
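Restricting the loss to observed entries can be sketched as follows. The function names are hypothetical, and averaging over the number of observed entries is an assumed normalization.

```python
import numpy as np


def masked_mse(pred, target, observed):
    """Mean squared error over observed entries only (continuous variables)."""
    n_obs = observed.sum()
    if n_obs == 0:
        return 0.0
    # Unobserved targets may be NaN; np.where discards them before summing.
    sq_err = np.where(observed, (pred - target) ** 2, 0.0)
    return float(sq_err.sum() / n_obs)


def masked_bce(prob, target, observed, eps=1e-7):
    """Binary cross-entropy over observed entries only (binary variables)."""
    n_obs = observed.sum()
    if n_obs == 0:
        return 0.0
    prob = np.clip(prob, eps, 1.0 - eps)  # avoid log(0)
    ce = -(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))
    return float(np.where(observed, ce, 0.0).sum() / n_obs)


# Toy example: the third entry is missing and must not affect the loss.
pred = np.array([1.0, 2.0, 3.0])
target = np.array([1.0, 0.0, np.nan])
observed = np.array([True, True, False])
mse = masked_mse(pred, target, observed)
```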

Trajectory clustering

For clustering, we used k-means with dynamic time-warping (DTW) distance³⁰ on the learned latent patient trajectories. DTW allows us to align patient trajectories of varying length. After model training, the k-means centroids were learned only on the embeddings from the training data. The 15% hold-out cohort was subsequently projected into the same latent space and assigned to the nearest centroids. Reporting cluster characteristics on this unseen test set therefore provides a strictly out-of-sample evaluation of our subtyping approach. To determine the optimal number of clusters, we varied k from 2 to 15 and evaluated the clustering performance by computing the inertia, which measures cluster compactness (Supplementary Fig. 15a), prompting us to set k = 5. Then, for k ∈ {2, 3, 4, 5}, we assigned the test embeddings to the nearest cluster centers. We observed a natural hierarchy in the clustering process: as k increased, new clusters were almost perfectly nested within the existing ones (Supplementary Fig. 14). For instance, when k = 2, let \({c}_{1}^{2}\) and \({c}_{2}^{2}\) be the identified clusters, where the superscript denotes k and the subscript the cluster index. As k increased to 5, \({c}_{1}^{2}\) split into two clusters (\({c}_{1}^{5}\) and \({c}_{2}^{5}\)), while \({c}_{2}^{2}\) divided into three clusters (\({c}_{3}^{5}\), \({c}_{4}^{5}\), and \({c}_{5}^{5}\)). This inherent hierarchy led us to adopt a strict hierarchical clustering approach for the final cluster assignment, resulting in more interpretable and clinically meaningful groupings. Following this procedure, we obtained k = 5 main clusters organized in this hierarchy.
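The out-of-sample assignment step can be illustrated with a plain NumPy implementation of DTW. This is a sketch, not the authors' code: it assumes that the centroids have already been fitted on the training embeddings, stores trajectories time-major with shape (T, L), and uses squared Euclidean local costs, all of which are choices made here for illustration.

```python
import numpy as np


def dtw_distance(a, b):
    """DTW distance between trajectories of shape (T_a, L) and (T_b, L).

    Uses squared Euclidean local costs and returns the square root of the
    accumulated cost along the optimal warping path.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.sum((a[i - 1] - b[j - 1]) ** 2)
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i, j] = local + min(
                cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1]
            )
    return float(np.sqrt(cost[n, m]))


def assign_to_centroids(trajectories, centroids):
    """Label each trajectory with the index of its nearest centroid under DTW."""
    labels = np.empty(len(trajectories), dtype=int)
    for idx, traj in enumerate(trajectories):
        dists = [dtw_distance(traj, c) for c in centroids]
        labels[idx] = int(np.argmin(dists))
    return labels


# Toy example with one latent dimension and trajectories of different length.
short = np.array([[0.0], [1.0], [2.0]])
longer = np.array([[0.0], [0.0], [1.0], [2.0]])
centroids = [short, short + 10.0]
labels = assign_to_centroids([longer, longer + 10.0], centroids)
```

Because DTW may repeat time points when aligning, `longer` matches `short` with zero cost despite the extra visit, which is why the method accommodates patients with different numbers of visits.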

Similarly, we used a k-Nearest Neighbors method to identify similar patients (here k=3), retrieving each test patient’s closest trajectories from the training data (based on the DTW distance).
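Given a precomputed matrix of DTW distances between test and training trajectories, the retrieval of similar patients reduces to a row-wise sort. The function name and the toy distance values below are hypothetical.

```python
import numpy as np


def k_nearest_training_patients(dist_test_to_train, k=3):
    """Indices of the k closest training trajectories for each test patient.

    dist_test_to_train: array of shape (n_test, n_train) holding precomputed
    DTW distances. Returns an (n_test, k) array ordered from nearest to farthest.
    """
    order = np.argsort(dist_test_to_train, axis=1, kind="stable")
    return order[:, :k]


# Toy distance matrix: 2 test patients, 4 training patients.
dist = np.array([
    [0.5, 0.1, 0.9, 0.3],
    [2.0, 1.0, 0.0, 3.0],
])
neighbors = k_nearest_training_patients(dist, k=3)
```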

Data availability

The raw dataset is owned by the EUSTAR group, and may be obtained by request after approval and permission from the EUSTAR board.

Code availability

The code is available at https://github.com/uzh-dqbm-cmi/eustar_npj.

References

Denton, C. P. & Khanna, D. Systemic sclerosis. Lancet 390, 1685–1699 (2017).
Del Galdo, F. et al. Eular recommendations for the treatment of systemic sclerosis: 2023 update. Ann. Rheum. Dis. 84, 29–40 (2025).
Jaeger, V. K. et al. Incidences and risk factors of organ manifestations in the early course of systemic sclerosis: a longitudinal eustar study. PloS one 11, e0163894 (2016).
Hoffmann-Vold, A.-M. et al. Setting the international standard for longitudinal follow-up of patients with systemic sclerosis: a delphi-based expert consensus on core clinical features. RMD open 5, e000826 (2019).
Elhai, M. et al. Stratification in systemic sclerosis according to autoantibody status versus skin involvement: a study of the prospective eustar cohort. Lancet Rheumatol. 4, e785–e794 (2022).
Nihtyanova, S. I. et al. Using autoantibodies and cutaneous subset to develop outcome-based disease classification in systemic sclerosis. Arthritis Rheumatol. 72, 465–476 (2020).
Fretheim, H. et al. Multidimensional tracking of phenotypes and organ involvement in a complete nationwide systemic sclerosis cohort. Rheumatology 59, 2920–2929 (2020).
Petelytska, L. et al. Heterogeneity of determining disease severity, clinical course and outcomes in systemic sclerosis-associated interstitial lung disease: a systematic literature review. RMD open 9, e003426 (2023).
Allam, A., Feuerriegel, S., Rebhan, M. & Krauthammer, M. Analyzing patient trajectories with artificial intelligence. J. Med. internet Res. 23, e29812 (2021).
Lee, C. & Van Der Schaar, M. Temporal phenotyping using deep predictive clustering of disease progression. In International conference on machine learning, 5767–5777 (PMLR, 2020).
Chen, I. Y., Joshi, S., Ghassemi, M. & Ranganath, R. Probabilistic machine learning for healthcare. Annu. Rev. Biomed. data Sci. 4, 393–415 (2021).
Kingma, D. P. & Welling, M. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 (2013).
Locatello, F. et al. A sober look at the unsupervised learning of disentangled representations and their evaluation. J. Mach. Learn. Res. 21, 1–62 (2020).
Trottet, C. et al. Semi-Supervised Generative Models for Disease Trajectories: A Case Study on Systemic Sclerosis. Machine Learning for Healthcare Conference. PMLR, 2024.
Allam, A. et al. Predicting interstitial lung disease progression in patients with systemic sclerosis using attentive neural processes-a eustar study. medRxiv 2024–04 (2024).
Bonomi, F. et al. The use and utility of machine learning in achieving precision medicine in systemic sclerosis: a narrative review. J. Personalized Med. 12, 1198 (2022).
Meier, F. M. et al. Update on the profile of the eustar cohort: an analysis of the eular scleroderma trials and research group database. Ann. Rheum. Dis. 71, 1355–1360 (2012).
Trottet, C., Schürch, M., Mollaysa, A., Allam, A. & Krauthammer, M. Generative time series models with interpretable latent processes for complex disease trajectories. In Deep Generative Models for Health Workshop NeurIPS 2023 (2023).
Hoffmann-Vold, A. et al. Pos0203 evidence-based expert consensus definition of organ involvement in systemic sclerosis–a eustar study (2024).
McInnes, L., Healy, J. & Melville, J. Umap: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426 (2018).
Varga, J. Systemic Sclerosis (Scleroderma) and Related Disorders (McGraw-Hill Education, New York, NY, 2018). accessmedicine.mhmedical.com/content.aspx?aid=1179365261.
Herrick, A. L., Assassi, S. & Denton, C. P. Skin involvement in early diffuse cutaneous systemic sclerosis: an unmet clinical need. Nat. Rev. Rheumatol. 18, 276–285 (2022).
Steen, V. D. & Medsger Jr, T. A. Severe organ involvement in systemic sclerosis with diffuse scleroderma. Arthritis Rheumatism: Off. J. Am. Coll. Rheumatol. 43, 2437–2444 (2000).
Adigun, R., Goyal, A. & Hariz, A. Systemic Sclerosis (Scleroderma) (StatPearls Publishing, Treasure Island (FL), 2025), updated 2024 Apr 5 edn. Available from: https://www.ncbi.nlm.nih.gov/books/NBK430875/.
Campochiaro, C. & Matucci-Cerinic, M. Interstitial lung disease in limited cutaneous systemic sclerosis patients: never let your guard down (2024).
Zanatta, E. et al. Phenotype of limited cutaneous systemic sclerosis patients with positive anti-topoisomerase i antibodies: data from the eustar cohort. Rheumatology 61, 4786–4796 (2022).
Hoffmann-Vold, A.-M. et al. Progressive interstitial lung disease in patients with systemic sclerosis-associated interstitial lung disease in the eustar database. Ann. Rheum. Dis. 80, 219–227 (2021).
Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In International Conference on Learning Representations (ICLR) (San Diego, California, 2015).
Müller, M. Dynamic time warping. Information retrieval for music and motion 69–84 (2007).


Acknowledgements

The authors thank the patients and caregivers who made the study possible, as well as all involved EUSTAR clinicians who collected the data. A list of contributing centers can be found at https://eustar.org/centers/. C.T., M.S., A.A., and M.K. received funding from the Swiss National Science Foundation (grant number 201184) for this work.

Author information

Authors and Affiliations

Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland
Cécile Trottet, Ahmed Allam & Michael Krauthammer
ETH AI Center, Zurich, Switzerland
Cécile Trottet & Michael Krauthammer
Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Manuel Schürch
Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
Manuel Schürch
Department of Rheumatology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
Liubov Petelytska, Oliver Distler & Anna-Maria Hoffmann-Vold
Department of Internal Medicine #3, Bogomolets National Medical University, Kyiv, Ukraine
Liubov Petelytska
Department of Rheumatology, Hospital de la Santa Creu i Sant Pau, Barcelona, Spain
Ivan Castellví
Institute of Rheumatology, Department of Rheumatology, 1st Medical School, Charles University, Prague, Czech Republic
Radim Bečvář
Leiden University Medical Center, Department of Rheumatology, Leiden, The Netherlands
Jeska de Vries-Bouwstra
Rheumatology DiMePReJ, University of Bari, School of Medicine, Bari, Italy
Florenzo Iannone
Hospital Universitario 12 de Octubre, Rheumatology Department, Madrid, Spain
Patricia Carreira
CHU de Bordeaux, Rheumatology Department, Bordeaux, France
Marie-Elise Truchetet
Università della Campania, UOC Medicina Interna, Napoli, Italy
Giovanna Cuomo
“Grigore T Popa” University of Medicine and Pharmacy, Rehabilitation Hospital, Department of Rheumatology, Iasi, Romania
Elena Rezus
University of Foggia, Department of Medical and Surgical Sciences, Rheumatology Unit, Foggia, Italy
Francesco Paolo Cantatore
Hospital Universitario Vall d’Hebron, Department of Internal Medicine, Systemic Autoimmune Diseases Unit, Barcelona, Spain
Carmen Pilar Simeón-Aznar
Colentina Clinical Hospital, Rheumatology Department, Bucharest, Romania
Magda Parvu
Bogomolets National Medical University, Kyiv, Ukraine
Marta Dzhus
Department of Rheumatology, Oslo University Hospital, Oslo, Norway
Anna-Maria Hoffmann-Vold
University of Florence, Azienda Ospedaliera Universitaria Careggi, Dept. of Experimental and Clinical Medicine, Division of Rheumatology, Florence, Italy
Silvia Bellando-Randone
Universitätsspital Basel, Dept. of Rheumatology, Basel, Switzerland
Ulrich Andreas Walker
San Martino Hospital, Laboratory of Experimental Rheumatology and Division of Rheumatology DIMI Dept. Internal Medicine, University of Genova, School of Medicine IRCCS, Genova, Italy
Maurizio Cutolo
University of Medicine and Pharmacy Iuliu Hatieganu Cluj, Clinica Reumatologie, Cluj-Napoca, Romania
Simona Rednic
Université Paris Cité, Cochin Hospital, Rheumatology Department, Paris, France
Yannick Allanore
Università di Pavia e IRCCS Fondazione Policlinico S. Matteo, Pavia, Italy
Carlomaurizio Montecucco
CHC Rijeka, Department of Rheumatology and Clinical Immunology, Rijeka, Croatia
Srdjan Novak
University of Pécs, Department Of Rheumatology And Immunology, Medical Centre, Pecs, Hungary
Gábor Kumánovics
Medical University of Silesia, Voivodeship Hospital No. 5 Sosnowiec, Department of Internal Medicine, Rheumatology and Clinical Immunology, Katowice, Poland
Przemyslaw Kotyla
Padova University Hospital, Rheumatology Unit, Padova, Italy
Elisabetta Zanatta
University Medical Center Ljubljana, Division of Internal Medicine, Department of Rheumatology, Ljubljana, Slovenia
Katja Perdan Pirkmajer
Marche University Hospital, Clinica Medica, Department of Internal Medicine, Ancona, Italy
Gianluca Moroncini
ASST Spedali Civili of Brescia, University of Brescia, Rheumatology and Clinical Immunology Unit, Brescia, Italy
Paolo Airó
University of Split, Division of Rheumatology and Clinical Immunology, Department of Internal Medicine, School of Medicine, University Hospital Center, Split, Croatia
Mislav Radic
Rambam Health Care Campus, Rheumatology Institute, Haifa, Israel
Alexandra Balbir-Gurman
Universitätshautklinik Köln, Köln, Germany
Nico Hunzelmann
University of Verona, UoC Rheumatology, Verona, Italy
Luca Idolazzi
Dubrava University Hospital, Division of Clinical Immunology, Allergology and Rheumatology, Department of Internal Medicine, Zagreb, Croatia
Josko Mitrovic
Royal Free London and University College London Medical School, Centre for Rheumatology, London, UK
Christopher Denton
Radboudumc, Department of Rheumatology, Nijmegen, The Netherlands
Madelon Vonk
Institute of Rheumatology Belgrade, Belgrade, Serbia
Jelena Colic
Medizinische Universitätsklinik, Abt. II (Onkologie, Hämatologie, Rheumatologie, Immunologie, Pulmonologie), Tübingen, Germany
Joerg Henes
Hamburg Centre for Pediatric and Adolescence Rheumatology, Hamburg, Germany
Ivan Foeldvari
Struttura Complessa di Reumatologia - Dipartimento Specialistiche - Azienda Ospedaliera Arcispedale S. Maria Nuova, Reggio Emilia, Italy
Gianluigi Bajocchi
Centro Hospitalar e Universitário de Coimbra, Rheumatology Department, Coimbra, Portugal
Tânia Santiago
Institute for Treatment and Rehabilitation Niska Banja, Nis, Rheumatology Clinic, Niska Banja, Serbia
Bojana Stamenkovic
IRCCS Humanitas Research Hospital, Rozzano - Milan, Italy
Maria De Santis
Chris Hani Baragwanath Academic Hospital and University of the Witwatersrand Center, Rheumatology Unit, Department of Medicine, Johannesburg, South Africa
Claudia Ickinger
V.A. Nasonova Research Institute of Rheumatology, Moscow, Russia
Lidia P. Ananieva
Aarhus University Hospital, Department of Rheumatology, Aarhus, Denmark
Klaus Sondergaard
University of Debrecen, Faculty of Medicine, Department of Rheumatology, Debrecen, Hungary
Gabriella Szucs
Hôpital Huriez, CHU Lille, Lille University, Lille, France
David Launay
Sapienza University of Rome, Rheumatology Clinic, Rome, Italy
Valeria Riccieri
St. Maria Hospital, Carol Davila, University of Medicine and Pharmacy, Department of Rheumatology, Bucharest, Romania
Andra Balanescu
Cantacuzino Hospital, Carol Davila University of Medicine and Pharmacy, Ion Cantacuzino Hospital, Bucharest, Romania
Ana Maria Gheorghiu
University Hospital Erlangen, Department Internal Medicine 3, Erlangen, Germany
Christina Bergmann
Hôpital Cochin, Department of Internal Medicine, Paris, France
Luc Mouthon
University of Ghent, Department of Rheumatology, Gent, Belgium
Vanessa Smith
University Hospital of Copenhagen, Department of Dermatology D-40, HS-Bispebjerg Hospital, Copenhagen, Denmark
Mette Mogensen
Université Catholique de Louvain, Cliniques Universitaires Saint-Luc, Brussels, Belgium
Marie Vanthuyne
Hospital Universitario Dr Peset, Valencia, Spain
Juan Jose Alegre Sancho
Hôpital Nord de Marseille, Service de Médecine Interne, Marseille, France
Brigitte Granel
Hospital de Clinicas da Universidade Federal do Parana, Curitiba, Brazil
Carolina de Souza Müller
Republican Center of Systemic Sclerosis of Nicolae Testemitanu State University of Medicine and Pharmacy, Chisinau, Republic of Moldova
Svetlana Agachi
Rheumatology Unit, AOU and University of Cagliari, Department of Medical Sciences and Public Health, Monserrato - Cagliari, Italy
Alberto Cauli
Waikato University Hospital, Rheumatology Unit, Hamilton, New Zealand
Kamal Solanki
Rheumatology and Clinical Immunology Unit, Alexandria Faculty of Medicine, Alexandria, Egypt
Eiman Soliman
Sapienza University of Rome, Department of Translational and Precision Medicine Azienda Ospedaliero-Universitaria Policlinico Umberto 1-Centro di riferimento regionale per la sclerosi sistemica, Rome, Italy
Edoardo Rosato
Centre Catania, UO Reumatologia San Marco Hospital, Catania, Italy
Rosario Foti
Insel Gruppe AG, Universitätsklinik für Rheumatologie und Immunologie, Bern, Switzerland
Britta Maurer
National Institute of Geriatrics, Rheumatology and Rehabilitation, Warsaw, Poland
Marzena Olesinska
Assiut University Hospital, Assiut University, Rheumatology Department, Assiut, Egypt
Nihal Awad
Grenoble University Hospital, Grenoble Vascular Medicine Department, Grenoble, France
Sophie Blaise
Hospital Tenon, Department of Dermatology, Paris, France
Patricia Senet
CHU de Hautepierre, Service de Rhumatologie, Centre National de Référence des Maladies auto-immunes et systémiques rares, Strasbourg, France
Emmanuel Chatelus
Centre Tel-Aviv Sourasky, Rheumatology institute, Tel-Aviv, Israel
Ira Litinsky
Leeds Raynaud’s and Scleroderma Program, NIHR Biomedical Research Centre, Leeds Institute of Rheumatic and Musculoskeletal Medicine, Leeds, UK
Francesco Del Galdo
Ramos Meja Hospital, Buenos Aires, Argentina
Eduardo Kerzberg
Clinical Hospital Center Osijek, Department of Clinical Immunology and Allergology, Osijek, Croatia
Jasminka Milas-Ahic
Asst Papa Giovanni XXIII, Bergamo, Italy
Massimiliano Limonta
Centro di Riferimento Interdisciplinare per la Sclerosi Sistemica (CRIIS), Roma, Italy
Antonella Marcoccia
Nouvel Hopital Civil, Clinical Immunology Internal Medicine, National Referral Center for Systemic Autoimmune Diseases, Strasbourg, France
Thierry Martin
Medical University Of Gdansk, University Clinical Centre, Department Of Internal Medicine, Connective Tissue Diseases, and Geriatrics, Gdansk, Poland
Anna Wojteczek
Universitätsklinikum Schleswig-Holstein, Klinik für Rheumatologie und klinische Immunologie, Lübeck, Germany
Gabriela Riemekasten
Centro Hospitalar e Universitário de Coimbra, Consulta de Doenças, Autoimunes Sistémicas Serviço de Medicina Interna, Coimbra, Portugal
Lélita da Conceição Santos
Meir Medical Center, Kfar Saba, Israel
Yair Levy
Universidade Federal De Pelotas, Pelotas, Brazil
Daniel Brito de Araujo
Pomeranian Medical University, Ul., Department of Internal Medicine, Rheumatology, Diabetology, Geriatrics and Clinical Immunology, Szczecin, Poland
Marek Brzosko
ASST Grande Ospedale Metropolitano Niguarda, S.C. Reumatologia, Milan, Italy
Oscar Massimiliano Epis
Athens University Medical School, First Propaedeutic and Internal Medicine, Rheumatology Unit, Athens, Greece
Petros Sfikakis
Regional Autoinflammatory, Autoimmune and Rare Diseases Centre (CRBAAR), Spitalul Clinic Judetean de Urgenta “Sf Apostol Andrei” Hospital, Constanta, Romania
Ana-Maria Ramazan
Hôpital Sud, Service de Médecine Interne & Immunologie Clinique, Rennes, France
Alain Lescoat
Vita-Salute San Raffaele University, San Raffaele Hospital, Unit of Immunology, Rheumatology, Allergy and Rare Diseases, Milan, Italy
Marco Matucci Cerinic
University Medical Center Utrecht, Utrecht, The Netherlands
Julia Spierings
University of Messina, Rheumatology Unit, Messina, Italy
Fabiola Atzeni
Nippon Medical School Hospital, Tokyo, Japan
Masataka Kuwana
Hospital Saint-Antoine, Internal Medicine Department, Paris, France
Arsene Mekinian
Poitiers University Hospital, Department of Internal Medicine, Poitiers, France
Mickaël Martin
Local de Saúde Santa Maria, Centro Académico de Medicina de Lisboa, Rheumatology Department, Lisbon, Portugal
Gonçalo Boleto
Ospedale G. Pini, UOC Day Hospital Reumatologia, Scleroderma Clinic, Milan, Italy
Nicoletta Del Papa
Azienda Ospedaliera Universitaria Senese (AOUS), UOC Reumatologia, Siena, Italy
Enrico Selvi
Azienda Ospedaliero-Universitaria Pisana, Pisa, Italy
Marta Mosca
Reha Rheinfelden, Rheinfelden, Switzerland
Ulrich Gerth
Kocaeli University, Department of Rheumatology, Kocaeli, Turkey
Duygu Temiz Karadag
Medical University of Plovdiv, University Hospital Kaspela Plovdiv, Clinic of Rheumatology, Plovdiv, Bulgaria
Anastas Batalov
“Heratsi” University Hospital, Yerevan, Armenia
Knarik Ginosyan
Mikaelyan Institute Of Surgery, Department of Rheumatology, Yerevan, Armenia
Nune Manukyan
Galilee Medical Center, Nahariya, Israel
Mohammad Naffaa
Sahlgrenska University Hospital, Clinical Rheumatology Research Center, Gothenburg, Sweden
Cristina Maglio
Complejo Asistencial Universitario de León, León, Spain
Miriam Retuerto
St. Luke’s International Hospital, Immuno-Rheumatology Center, Tokyo, Japan
Futoshi Iwata
Yale Scleroderma Program, North Haven, CT, USA
Monique Hinchcliff
Fondazione Policlinico Universitario Campus BioMedico, Rome, Italy
Roberto Giacomelli
Ospedale San Bortolo di Vicenza, Medicina Generale, Vicenza, Italy
Francesco Benvenuti
Instituto Português de Reumatologia, Lisboa, Portugal
Helena Santos Carneiro
Hospital Universitario de La Princesa, IIS-Princesa, Madrid, Spain
Esther Vicente Rabaneda
University Hospital Düsseldorf, Clinic for Rheumatology and Hiller Research Centre, Düsseldorf, Germany
Andrea-Hermina Györfi
Hospital Universitario Son llátzer, Palma de Mallorca, Spain
Lilian Maria Lopez Nunez
Polytechnic University of Marche, “Carlo Urbani” Hospital, Rheumatology Clinic, Ancona, Italy
Rossella De Angelis
Hospital del Mar, Barcelona, Spain
Irene Carrión-Barberà
Fundación Sanatorio Güemes, Buenos Aires, Argentina
Alejandro Brigante
Egyptian Society for Microcirculation in Rheumatic Diseases, Cairo, Egypt
Yasser El Miedany
Peking University Third Hospital, Department of Rheumatology, Beijing, China
Rong Mu
Centro Hospitalar de Leiria, Leiria, Portugal
Alexandra Daniel
University of Naples Federico II, Department of Translational Medical Sciences, Naples, Italy
Amato de Paulis
University of Pennsylvania, Division of Rheumatology, Philadelphia, PA, USA
Chris Derk
The University of Hong Kong-Shenzhen Hospital, Department of Rheumatology, Shenzhen, China
Lijun Zhang
Department of Rheumatology and Immunology, Specialist Hospital. J. Dietla, Cracow, Poland
Bogdan Batko
Hospital Universitari Germans Trias i Pujol, Barcelona, Spain
Ivette Casafont Sole
Rheumatology, Immunology and Internal Medicine Clinic, Medical University of Lodz, Lodz, Poland
Anna Lewandowska-Polak
RenJi Hospital, Shanghai Jiao Tong University, School of Medicine, Shanghai, China
Qingran Yan
Marmara University School of Medicine, PMR Department Rheumatology Division, Istanbul, Turkey
Tuncay Duruöz
Gulhane Training and Research Hospital, Ankara, Turkey
Seda Colak
Hospital Nacional Dos de Mayo, Lima, Peru
Janeth Villegas Guzmán
Hospital Nacional Edgardo Rebagliati Martins-EsSalud, Lima, Peru
Claudia Mora-Trujillo
Università degli Studi di Roma Tor Vergata, Fondazione PTV Policlinico Tor Vergata, U.O.C. Reumatologia, Rome, Italy
Maria Sole Chimenti
Ain Shams University, Internal Medicine Department, Rheumatology Division, Cairo, Egypt
Samah A. El-Bakry
Marmara University, Department of Internal Medicine, Division of Rheumatology, Istanbul, Turkey
Fatma Alibaz-Oner

Authors

Cécile Trottet
Manuel Schürch
Ahmed Allam
Liubov Petelytska
Ivan Castellví
Radim Bečvář
Jeska de Vries-Bouwstra
Florenzo Iannone
Patricia Carreira
Marie-Elise Truchetet
Giovanna Cuomo
Elena Rezus
Francesco Paolo Cantatore
Carmen Pilar Simeón-Aznar
Magda Parvu
Marta Dzhus
Oliver Distler
Anna-Maria Hoffmann-Vold
Michael Krauthammer

Consortia

EUSTAR Collaborators

Ivan Castellví, Radim Bečvář, Jeska de Vries-Bouwstra, Florenzo Iannone, Patricia Carreira, Marie-Elise Truchetet, Giovanna Cuomo, Elena Rezus, Francesco Paolo Cantatore, Carmen Pilar Simeón-Aznar, Magda Parvu, Marta Dzhus, Oliver Distler, Anna-Maria Hoffmann-Vold, Silvia Bellando-Randone, Ulrich Andreas Walker, Maurizio Cutolo, Simona Rednic, Yannick Allanore, Carlomaurizio Montecucco, Srdjan Novak, Gábor Kumánovics, Przemyslaw Kotyla, Elisabetta Zanatta, Katja Perdan Pirkmajer, Gianluca Moroncini, Paolo Airó, Mislav Radic, Alexandra Balbir-Gurman, Nico Hunzelmann, Luca Idolazzi, Josko Mitrovic, Christopher Denton, Madelon Vonk, Jelena Colic, Joerg Henes, Ivan Foeldvari, Gianluigi Bajocchi, Tânia Santiago, Bojana Stamenkovic, Maria De Santis, Claudia Ickinger, Lidia P. Ananieva, Klaus Sondergaard, Gabriella Szucs, David Launay, Valeria Riccieri, Andra Balanescu, Ana Maria Gheorghiu, Christina Bergmann, Luc Mouthon, Vanessa Smith, Mette Mogensen, Marie Vanthuyne, Juan Jose Alegre Sancho, Brigitte Granel, Carolina de Souza Müller, Svetlana Agachi, Alberto Cauli, Kamal Solanki, Eiman Soliman, Edoardo Rosato, Rosario Foti, Britta Maurer, Marzena Olesinska, Nihal Awad, Sophie Blaise, Patricia Senet, Emmanuel Chatelus, Ira Litinsky, Francesco Del Galdo, Eduardo Kerzberg, Jasminka Milas-Ahic, Massimiliano Limonta, Antonella Marcoccia, Thierry Martin, Anna Wojteczek, Gabriela Riemekasten, Lélita da Conceição Santos, Yair Levy, Daniel Brito de Araujo, Marek Brzosko, Oscar Massimiliano Epis, Petros Sfikakis, Ana-Maria Ramazan, Alain Lescoat, Marco Matucci Cerinic, Julia Spierings, Fabiola Atzeni, Masataka Kuwana, Arsene Mekinian, Mickaël Martin, Gonçalo Boleto, Nicoletta Del Papa, Enrico Selvi, Marta Mosca, Ulrich Gerth, Duygu Temiz Karadag, Anastas Batalov, Knarik Ginosyan, Nune Manukyan, Mohammad Naffaa, Cristina Maglio, Miriam Retuerto, Futoshi Iwata, Monique Hinchcliff, Roberto Giacomelli, Francesco Benvenuti, Helena Santos Carneiro, Esther Vicente Rabaneda, Andrea-Hermina Györfi, Lilian Maria Lopez Nunez, Rossella De Angelis, Irene Carrión-Barberà, Alejandro Brigante, Yasser El Miedany, Rong Mu, Alexandra Daniel, Amato de Paulis, Chris Derk, Lijun Zhang, Bogdan Batko, Ivette Casafont Sole, Anna Lewandowska-Polak, Qingran Yan, Tuncay Duruöz, Seda Colak, Janeth Villegas Guzmán, Claudia Mora-Trujillo, Maria Sole Chimenti, Samah A. El-Bakry & Fatma Alibaz-Oner

Contributions

A.H. and M.K. devised the study. C.T. and M.S. curated and analyzed the data and implemented the algorithms. C.T., M.S., A.A., L.P., O.D., A.H., and M.K. analyzed the results. M.K., O.D., and A.H. supervised the project. C.T. wrote the original manuscript draft and prepared the figures. All authors critically reviewed, edited, and approved the final manuscript.

Corresponding author

Correspondence to Michael Krauthammer.

Ethics declarations

Competing interests

A.H. has/had a consultancy relationship with, and/or has received research funding from, or has served as a speaker for the following companies in the area of potential treatments for systemic sclerosis and its complications in the last 36 months: Abbvie, Avalyn, CallunaPharma, BMS, Boehringer Ingelheim, Genentech, Janssen, Merck Sharp & Dohme, Medscape, Novartis, Pliant Therapeutics, Roche and Werfen. A.H. is a CTD-ILD ERS/EULAR convenor and a EULAR study group leader on the lung in rheumatic and musculoskeletal diseases. O.D. has/had a consultancy relationship with, and/or has received research funding from, or has served as a speaker for the following companies in the area of potential treatments for systemic sclerosis and its complications in the last two years: 4P-Pharma, Abbvie, Acepodia, Aera, AnaMar, Anaveon AG, Argenx, Boehringer Ingelheim, BMS, Calluna, Cantargia AB, Citus AG, CSL Behring, Galderma, Galapagos, Hemetron AG, Innovaderm, Lilly, MSD Merck, Mitsubishi Tanabe, Nkarta Inc., Orion, Pilan, Quell, Scleroderma Research Foundation, EMD Serono, Topadur and UCB. Patent issued “mir-29 for the treatment of systemic sclerosis” (US8247389, EP2331143). O.D. is a co-founder of CITUS AG. All other authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Trottet, C., Schürch, M., Allam, A. et al. Deep hierarchical subtyping of multi-organ systemic sclerosis trajectories - a EUSTAR study. npj Digit. Med. 8, 563 (2025). https://doi.org/10.1038/s41746-025-01962-y

Received: 23 April 2025
Accepted: 17 August 2025
Published: 01 September 2025
DOI: https://doi.org/10.1038/s41746-025-01962-y