Introduction

Patient-reported outcomes emphasize the importance of capturing the patient’s own perspective on their symptoms, well-being, and quality of life, providing direct insight into their lived experience1. This approach places the individual at the center of their care, empowering them to set their own goals and make informed choices about their treatment and life direction. Key aspects of patient-reported rehabilitation outcomes include self-perceived well-being, developing self-esteem, resilience, and a sense of purpose; building healthy relationships and social connections; gaining independence; and understanding one’s strengths and limitations.

Patient-reported rehabilitation outcomes provide a unique perspective that differs from standard clinical practice. The latter typically involves collecting socio-demographic and basic clinical information to better understand patient symptoms, formulate a diagnosis and prognosis, and guide treatment decisions with the aim of symptom reduction and restoration of pre-illness functioning. This reliance on clinical data is especially pronounced in disorders like schizophrenia, where patient empowerment and involvement in care decisions are often limited.

Although standard clinical practice and rehabilitation goals differ, some research suggests that they are nonetheless interconnected. In schizophrenia, studies have demonstrated that intensive, continuous rehabilitation services can enhance both functional outcomes (e.g., employment) and clinical outcomes (e.g., reduced hospitalization, symptom stability)2. Other research has found a small to moderate negative correlation between symptom severity and rehabilitation outcomes3,4, with patients experiencing more severe positive and negative symptoms often facing greater challenges in achieving functional improvements5.

Socio-demographic factors have also demonstrated predictive value for rehabilitation outcomes among individuals with schizophrenia and other mental illnesses. For example, several studies have found that female patients tend to experience higher rates of recovery and better quality of life compared to males6,7. The impact of age is less consistent: while some research indicates that older age is associated with lower quality of life7, other studies suggest that older individuals may achieve better rehabilitation outcomes6,8. Higher educational attainment is also linked to improved quality of life and greater recovery ability6,9, likely due to enhanced cognitive and social resources that support the recovery process. Similarly, being employed has been associated with more favorable rehabilitation outcomes6,7,8,9. Lastly, strong social support – from romantic partners, family, friends, or community networks – has been consistently identified as a predictor of positive rehabilitation outcomes in schizophrenia and other mental disorders7,8,9,10.

Somewhat contradicting the latter evidence, recent research from our group suggests that well-being and stages of recovery are not meaningfully predicted by traditional psychiatric markers like socio-demographic variables (age, sex, education, employment) or basic clinical factors (diagnosis severity, medication)11,12. Instead, these outcomes were strongly linked to psychological constructs that are themselves rehabilitation outcomes, e.g. self-esteem, self-stigma (internalized negative beliefs about mental health), resilience and autonomy. This creates an intriguing circular relationship where these psychological factors both define and predict rehabilitation success, effectively acting as their own ecosystem.

These findings contrast with the studies mentioned above, underscoring the need for further investigation. If socio-demographic and clinical factors are indeed found to significantly influence rehabilitation outcomes, it will be important to identify and understand them in order to tailor assessment and treatment similarly to standard clinical practice. Conversely, if these factors prove to be less influential, assessment and treatment strategies should be adapted to address the unique needs of rehabilitation more specifically.

While the relationship between socio-demographic or clinical factors and rehabilitation outcomes has been explored previously, to our knowledge, earlier studies have not employed optimal modeling strategies. Specifically, no prior research has utilized state-of-the-art machine learning techniques capable of capturing complex, non-linear relationships and interactions among a comprehensive set of predictors and diverse rehabilitation outcomes. Moreover, previous studies have often lacked rigorous evaluation of model performance on independent data, leading to overly optimistic results and limited generalizability. This is an important matter: models failing to account for sufficient variability in outcomes would be unsuitable for prediction and in turn any associations they identify may not be deemed meaningful. In the present study, we addressed these limitations by applying advanced machine learning methods and robust validation procedures to assess whether socio-demographic and clinical factors reliably predict patient-reported rehabilitation outcomes in individuals with schizophrenia.

Methods

Data

Description of the database

We used data from the French multicentric psychosocial rehabilitation database REHABase13, specifically background socio-demographic data, basic clinical data, and rehabilitation outcome data from patients diagnosed with schizophrenia. The database obtained the authorizations required under French legislation (French National Advisory Committee for the Treatment of Information in Health Research, 16.060bis; French National Computing and Freedom Committee, DR-2017-268).

The initial dataset comprised information from twenty-four clinical centers. To ensure statistical robustness, we consolidated centers with fewer than 50 observations into a single group, resulting in a final set of ten analytical units, comprising nine individual centers and one grouped category.
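The consolidation rule described above can be illustrated with a minimal Python sketch. The function name, the label "pooled", and the toy counts are assumptions made for illustration only; the threshold of 50 observations is the one stated in the text.

```python
from collections import Counter

def consolidate_centers(centers, min_obs=50, pooled_label="pooled"):
    """Relabel every center with fewer than `min_obs` observations as `pooled_label`."""
    counts = Counter(centers)
    return [c if counts[c] >= min_obs else pooled_label for c in centers]

# Hypothetical toy data: two large centers and three small ones.
centers = ["A"] * 120 + ["B"] * 60 + ["C"] * 49 + ["D"] * 10 + ["E"] * 3
grouped = consolidate_centers(centers)
group_sizes = Counter(grouped)
print(sorted(group_sizes.items()))
```

In this toy example the three small centers (49, 10 and 3 observations) are merged into a single category of 62 observations, while the two large centers are kept as they are.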

Dependent variables (patient-reported rehabilitation outcomes)

We examined 19 distinct patient-reported measures, emanating from five different scales. Three dimensions of insight were considered using the Birchwood Insight Scale – BIS14: awareness of disorder (DIS), relabelling of symptoms as part of the illness (SYM), recognition of a need for treatment (TRT). All dimensions are reported on a scale from 0 to 4, with higher scores associated with better insight.

Five dimensions of self-stigmatization were considered using the Internalized Stigma of Mental Illness scale – ISMI15: alienation (ALN), stereotype endorsement (SE), discrimination experience (DE), social withdrawal (SW), stigma resistance (RES). All dimensions are reported on the same scale, ranging from 1 to 4. Higher scores indicate greater internalized stigma, except for the stigma resistance subscale, which is reverse-coded so that higher scores indicate greater resistance to stigma.

We also considered medication adherence as the total score of the Medication Adherence Rating Scale – MARS16. Scores range from 0 to 10, with higher scores indicating greater medication adherence.

Two dimensions of self-esteem were considered using the Self-Esteem Rating Scale – SERS17: negative and positive, each scored on a scale from 10 to 70. For positive self-esteem, higher scores indicate better self-esteem. For negative self-esteem, higher scores indicate worse self-esteem.

Finally, eight dimensions of quality of life were considered using the Schizophrenia Quality of Life questionnaire – SQoL1818: psychological well-being (PSY), physical well-being (PHY), self-esteem (SEL), autonomy (AUT), resilience (RES), romantic relationships (ROM), family relationships (FAM), friendship relationships (FRI). Each dimension is scored on a scale from 0 to 100, with higher scores indicating better quality of life.

All scales are validated self-report questionnaires commonly used in studies of schizophrenia. For each outcome measure, observations with missing values were removed from the analysis. We then analyzed each of the 19 patient-reported outcomes separately, performing 19 analyses on 19 datasets.
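As a rough illustration of why each outcome ends up with its own dataset and sample size, the sketch below builds one complete-case dataset per outcome. The record layout, the function name and the toy values are hypothetical; None stands for a missing questionnaire score.

```python
def complete_case_datasets(rows, outcomes):
    """Build one dataset per outcome, dropping rows where that outcome is missing (None)."""
    return {o: [r for r in rows if r.get(o) is not None] for o in outcomes}

# Hypothetical toy records; None marks a missing questionnaire score.
rows = [
    {"id": 1, "MARS": 7, "BIS_TRT": 3},
    {"id": 2, "MARS": None, "BIS_TRT": 4},
    {"id": 3, "MARS": 5, "BIS_TRT": None},
    {"id": 4, "MARS": 9, "BIS_TRT": 2},
]
datasets = complete_case_datasets(rows, ["MARS", "BIS_TRT"])
sizes = {name: len(data) for name, data in datasets.items()}
print(sizes)
```

A participant who is missing one outcome is dropped only from the dataset for that outcome, so the datasets overlap without being identical.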

Predictors (socio-demographic and clinical data)

Predictors of patient-reported rehabilitation outcomes included the following background clinical and socio-demographic factors: center (ten categories); age (continuous); sex (male vs. female); education (no high school diploma; high school diploma; Bachelor’s degree; Master’s degree); marital status (single; divorced/widowed; in a relationship); being a parent (yes vs. no); housing status (homeless; group home; family home; personal residence); employment (employed (regular); employed (specialized); unemployed); being a disabled worker beneficiary (yes vs. no); being of no fixed abode (currently; in the past; no); duration of illness (less than two years; two to five years; five to 10 years; more than 10 years); number of psychiatric admissions (nil; one; two; three; four; five to 10; more than 10); total time spent in hospital (nil; three months or less; three to six months; six to 12 months; more than 12 months); psychiatric comorbidities (yes vs. no); physical comorbidities (yes vs. no); addiction comorbidities (nil; behavioral only; substance only; both substance and behavioral); history of suicide attempts (no previous attempt; one; two; three; four or more); forensic history (yes vs. no); antipsychotic medication (nil; first-generation antipsychotics; second-generation antipsychotics; both); somatic medication (yes vs. no); referrer (clinician from the public healthcare system; clinician from the private healthcare system; social worker; self-referral; other); severity score at the Clinical Global Impression (CGI, continuous); score at the Global Assessment of Functioning (GAF, continuous).

Analysis

Strategy for handling missing data in the predictors set

Although we removed cases with missing patient-reported outcomes, some missing values remained in the other variables within each of the 19 datasets. Overall, the percentage of missing data ranged from 4.4% (BIS dataset) to 5.2% (SERS dataset). For individual variables, missing rates varied from 0% (for center, age, and sex) up to 15% for CGI and GAF, and 16–18% for total time spent in hospital.
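The two kinds of missingness figures reported above (per variable and overall) could be computed along the following lines. This is a sketch under the assumption that the overall percentage is the share of missing cells in the predictor matrix; the variable names and toy values are hypothetical.

```python
def missing_rates(rows, variables):
    """Return the per-variable missing rate and the overall share of missing cells."""
    n = len(rows)
    per_variable = {v: sum(r.get(v) is None for r in rows) / n for v in variables}
    # Every variable has the same number of rows, so the cell-level rate
    # equals the average of the per-variable rates.
    overall = sum(per_variable.values()) / len(variables)
    return per_variable, overall

# Hypothetical toy predictor table; None marks a missing value.
rows = [
    {"age": 31, "cgi": 4, "gaf": 55},
    {"age": 28, "cgi": None, "gaf": None},
    {"age": 45, "cgi": 5, "gaf": None},
    {"age": 39, "cgi": 3, "gaf": 60},
]
per_variable, overall = missing_rates(rows, ["age", "cgi", "gaf"])
print(per_variable, overall)
```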

Discarding incomplete data and performing a complete case analysis risked biasing our sample towards those participants without missing information. Therefore, our general strategy aimed to retain as many observations and variables as possible. However, our pre-tests showed that imputing missing data on number of psychiatric admissions and total duration of admission greatly impacted parameter estimation in general linear modeling19; we therefore excluded these two variables from our analysis.

We generated m = 20 imputed datasets. Imputation was performed prior to predictive modeling and conducted separately on the training and hold-out testing sets to prevent data leakage (see below).
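The text does not specify the imputation algorithm, so the sketch below swaps in a deliberately simple random hot-deck imputation purely to illustrate the principle of imputing each partition separately. All names and toy values are hypothetical, and this should not be read as the procedure actually used.

```python
import random

def hot_deck_impute(rows, variables, m=20, seed=0):
    """Return m completed copies of `rows`, restricted to `variables`.

    Each missing value (None) is replaced by a random draw from the values
    observed for the same variable within `rows` only. The sketch assumes
    every variable has at least one observed value.
    """
    rng = random.Random(seed)
    donors = {v: [r[v] for r in rows if r[v] is not None] for v in variables}
    completed = []
    for _ in range(m):
        completed.append([
            {v: (r[v] if r[v] is not None else rng.choice(donors[v])) for v in variables}
            for r in rows
        ])
    return completed

# Hypothetical partitions: donors for the test set come from the test set only.
train = [{"cgi": 3}, {"cgi": 4}, {"cgi": None}, {"cgi": 5}]
test = [{"cgi": 6}, {"cgi": None}, {"cgi": 7}]
train_imputed = hot_deck_impute(train, ["cgi"], m=20, seed=1)
test_imputed = hot_deck_impute(test, ["cgi"], m=20, seed=2)
```

Because the function only ever sees one partition, values imputed in the testing set cannot be informed by the training set, and vice versa.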

Predictive modeling

We employed a SuperLearner ensemble model using the SuperLearner R package to optimize predictions of each patient-reported outcome20. Base learners included general linear modeling with main terms only, regularized regression, random forest, and extreme gradient boosting. In addition to training each base learner on the full set of predictors, variable pre-selection was performed using random forest variable importance (to select the top 10 predictors by mean decrease in impurity; implemented with 1000 trees; mtry = 4; nodesize = 5). We also included a general linear model with first-order interactions, using variables pre-selected by the above random forest algorithm. For each base learner, we used an “adaptive” hyperparameter tuning strategy21 aiming to minimize logarithmic loss.
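The general idea of a super learner, weighting base learners by their cross-validated performance, can be sketched in a few lines of Python. This toy version is an assumption-laden simplification and not the SuperLearner R package: it uses only two base learners (a mean-only model and a one-predictor linear regression), squared-error loss, and a grid search over a single convex weight. All function names and the simulated data are hypothetical.

```python
import random

def fit_mean(xs, ys):
    """Base learner 1: always predict the training mean."""
    mu = sum(ys) / len(ys)
    return lambda x: mu

def fit_line(xs, ys):
    """Base learner 2: ordinary least squares with a single predictor."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def cv_predictions(fit, xs, ys, k):
    """Out-of-fold predictions: each point is predicted by a model that never saw it."""
    n = len(xs)
    preds = [None] * n
    for fold in range(k):
        held_out = [i for i in range(n) if i % k == fold]
        kept = [i for i in range(n) if i % k != fold]
        model = fit([xs[i] for i in kept], [ys[i] for i in kept])
        for i in held_out:
            preds[i] = model(xs[i])
    return preds

def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

def fit_super_learner(xs, ys, k=8, grid=101):
    """Pick the convex weight on the linear learner that minimizes cross-validated MSE."""
    p_line = cv_predictions(fit_line, xs, ys, k)
    p_mean = cv_predictions(fit_mean, xs, ys, k)
    candidates = [j / (grid - 1) for j in range(grid)]
    w = min(candidates,
            key=lambda c: mse([c * a + (1 - c) * b for a, b in zip(p_line, p_mean)], ys))
    # Refit both base learners on all training data, then combine them.
    line, mean = fit_line(xs, ys), fit_mean(xs, ys)
    return (lambda x: w * line(x) + (1 - w) * mean(x)), w

# Simulated data with a strong linear signal (hypothetical, for illustration only).
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(80)]
ys = [2 * x + rng.gauss(0, 0.5) for x in xs]
predict, weight_line = fit_super_learner(xs, ys)
print(weight_line, predict(5.0))
```

Because the weights are chosen on out-of-fold predictions, a learner that merely memorizes its training data gains no advantage in the ensemble, which is the property the 8-fold cross-validation in this study relies on.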

For each outcome, data were split into training (70%) and hold-out testing (30%) sets. Base learners and the ensemble model were trained on the training set using 8-fold cross-validation. Model performance was assessed on both the training set and the hold-out testing set for each of the 20 imputed datasets using the coefficient of determination (R²).
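The split and the performance metric can be made concrete with a short sketch. The function names and the seed are hypothetical; R² is computed with its usual definition, one minus the ratio of the residual sum of squares to the total sum of squares.

```python
import random

def train_test_split(n, train_fraction=0.7, seed=0):
    """Shuffle the indices 0..n-1 and split them into training and testing parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(train_fraction * n)
    return idx[:cut], idx[cut:]

def r_squared(y_true, y_pred):
    """Coefficient of determination; assumes y_true is not constant."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot

train_idx, test_idx = train_test_split(100)
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])    # predictions match exactly
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])  # always predicting the mean
worse = r_squared([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])      # worse than predicting the mean
print(len(train_idx), len(test_idx), perfect, mean_only, worse)
```

An R² of 0 corresponds to a model that does no better than predicting the mean outcome, and the value can become negative on hold-out data when predictions are worse than that baseline.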

Finally, we aimed to compute SHAP (SHapley Additive exPlanations)22 values for every feature and training observation across all 20 imputed datasets using the fastshap R package23. For a given observation, SHAP values quantify the magnitude and direction (positive/negative) of each feature’s additive contribution to the model’s predicted patient-reported outcome. The magnitude reflects the relative importance of each feature’s contribution. By design, the sum of all SHAP values for an observation, combined with the model’s base value (i.e., the average prediction), equals the model’s output for that observation. We considered the results of our SHAP analysis meaningful only if the R² measured on the testing set reached a threshold of at least 0.15, as proposed previously24.
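The additivity property described above can be checked on a toy model. The sketch below computes exact Shapley values by brute-force enumeration of feature subsets, which is only feasible for a handful of features and is not the fastshap implementation. The model, the background rows and the observation are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, background):
    """Exact Shapley values of `model` at `x`, relative to `background` rows.

    Features outside a coalition are filled in from each background row and
    the predictions are averaged. Returns (values, base_value), where
    base_value is the mean prediction over the background rows.
    """
    n = len(x)

    def value(subset):
        total = 0.0
        for b in background:
            mixed = [x[j] if j in subset else b[j] for j in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contrib = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                contrib += weight * (value(s | {i}) - value(s))
        phi.append(contrib)
    return phi, value(set())

# Hypothetical model with one additive term and one interaction term.
def toy_model(z):
    return 2 * z[0] + z[1] * z[2]

background = [[0.0, 1.0, 0.0], [2.0, 0.0, 1.0], [4.0, 1.0, 2.0]]
x = [1.0, 2.0, 3.0]
phi, base_value = shapley_values(toy_model, x, background)
print(phi, base_value, base_value + sum(phi), toy_model(x))
```

In this example the base value plus the sum of the three Shapley values reproduces the model output for the observation, and the purely additive first feature receives a contribution equal to its coefficient times its deviation from the background mean.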

Results

Description of the population

The number of observations ranged from 1,332 (MARS dataset) to 1,563 (BIS dataset). Table 1 summarizes the socio-demographic and clinical characteristics of the study population. Approximately three-quarters of the participants were male, with a mean age of 33 years (SD = 10). Only 15% of participants had children. The mean CGI score was 4.2 (SD = 1.1), indicating moderate illness severity. About one-third of the sample were prescribed first-generation antipsychotics (FGA), and more than 40% had experienced an illness duration exceeding 10 years. Half of the participants reported an addiction to a substance, while one quarter had a physical disability. Unemployment was highly prevalent, affecting approximately 90% of the sample, and only 10% were in a relationship. Regarding other aspects of social and economic status, 40% had disabled worker beneficiary status, 45% had no diploma, and 45% were living in personal housing.

Table 1 Clinical and socio-demographic characteristics of the participants.

Mean scores and standard deviations for each patient-reported rehabilitation outcome are presented in Table 2. Quality of life ratings were lowest for romantic relationships and highest for family relationships, autonomy, and resilience. Self-esteem scores indicated slightly more positive than negative self-esteem overall. Scores indicated moderate medication adherence, and patients on average demonstrated relatively good insight into their need for treatment. Most dimensions of internalized stigma reflected mild levels of internalized stigma, and stigma resistance was also mild across the sample.

Table 2 Characteristics of the patient-reported rehabilitation outcomes.

Predictive performance of the models

Table 3 presents performance metrics across the 20 imputed datasets for each patient-reported rehabilitation outcome. The SuperLearner ensemble achieved a mean R² of 0.048 (range: 0.01–0.08) on the hold-out testing set, indicating very low predictive performance. Performance on the training set varied across base learners, with the random forest algorithm demonstrating consistently higher R² (mean R² range: 0.37–0.73). However, the stark discrepancy between training and testing performances suggests significant overfitting in some of the base learners (essentially the random forest and extreme gradient boosting models; Table 3). Simpler models, such as GLMs, also showed higher performance on the training sets compared to the ensemble on the testing sets, but this difference was much less pronounced than that observed with decision tree-based approaches.

Table 3 Performance of each basis learner and of the ensemble algorithm.

Table 4 details the contribution of each base learner to the SuperLearner ensemble across imputed datasets for the patient-reported rehabilitation outcomes. Substantial variability was observed in the influence of individual algorithms on the ensemble’s predictions. Specifically, the SuperLearner ensemble did not assign the highest weights to the random forest models, reflecting their tendency to overfit the training data.

Table 4 Contribution of each basis learner to the superlearner ensemble.

Given the overall lack of meaningful predictive performance across all patient-reported rehabilitation outcomes, we reported the mean absolute SHAP values for each predictor for reference only (Supplementary Figs. 1–5).

Discussion

In the present study, we investigated whether basic socio-demographic and clinical factors could predict key patient-reported rehabilitation outcomes relevant to recovery in schizophrenia, namely: self-esteem, quality of life, treatment adherence, insight, and self-stigma. To this end, we applied state-of-the-art machine learning models using 21 socio-demographic and clinical variables collected from our cohort of patients with schizophrenia undergoing psychosocial rehabilitation. Predictive performance on the hold-out testing sets was very low, with R² values ranging from 0.01 to 0.08 (mean R²: 0.048). In psychosocial and clinical research, especially when predicting complex, multifactorial outcomes such as self-esteem or self-stigma, such low R² values are not uncommon. This likely reflects the fact that these outcomes are shaped by a wide array of biological, psychological, social, and environmental influences, many of which are not fully captured by the available variables or any single modeling approach. However, since our models left over 95% of the overall variance unexplained, their use for individual prediction or clinical decision-making should be precluded.

Comparison with other studies is challenging because, to our knowledge, no previous research has employed a similar methodology to investigate whether socio-demographic and clinical factors can predict patient-reported rehabilitation outcomes in schizophrenia. Most existing studies have relied on standard general linear modeling, which restricts the analysis to linear predictions and overlooks potential interactions and non-linear effects. More importantly, prior studies have not assessed their models using hold-out data that were not involved in model training. Evaluating predictive models, including GLMs, on separate training and testing datasets is essential to accurately gauge true predictive power and generalizability, rather than the model’s ability to memorize familiar data. Indeed, even deterministic models like GLMs can inadvertently fit patterns unique to the training data, which may be considered as noise or idiosyncrasies. In turn, evaluating a model on its training data often produces overly optimistic performance estimates. In our study, there was notable variation between GLM performance on the training set and ensemble performance on the testing set, though we did not observe overfitting as substantial as with other, more flexible algorithms.

That being said, we acknowledge that there may be some degree of association between socio-demographic or clinical factors and psychosocial rehabilitation outcomes. For example, variables such as age, education, marital status, and clinical measures, including severity scores on the Clinical Global Impression, Global Assessment of Functioning, and duration of illness, showed some influence on certain patient-reported rehabilitation outcomes (see Supplementary Figures). These findings are generally consistent with previous reports in the literature using standard modeling methodology6,7,8,9,10,25. However, when these predictors were tested on an independent dataset, their ability to predict outcomes was minimal. Overall, while there may be weak associations between some socio-demographic or clinical factors and patient-reported rehabilitation outcomes, our results suggest that these relationships are likely not meaningful beyond statistical significance.

The finding that patient-reported rehabilitation outcomes, such as quality of life, self-esteem, and self-stigma, are only minimally associated with socio-demographic and clinical characteristics has important implications for public mental health. Specifically, these rehabilitation outcomes should be considered as distinct entities, separate from standard psychiatric measures like illness duration or clinical severity. This distinction supports the need for targeted assessment and treatment of patient-reported rehabilitation outcomes, as well as appropriate service provision. This also suggests that psychosocial rehabilitation should be distinct from standard psychiatric care and may be most effective when delivered in specialized settings by clinicians with specific training in this area. Finally, the underlying factors influencing these outcomes may include other variables used to measure psychological processes, cognitive dysfunctions, and motivational and personality traits. Future research should investigate whether routinely collecting data on such variables can improve the predictive performance of models for patient-reported rehabilitation outcomes and enhance the effectiveness of psychosocial rehabilitation care.

Strengths and limitations

Strengths of our study include the large sample size and the large number of predictors and rehabilitation outcomes. Additionally, we employed a machine learning approach using various base learners and an ensemble model, along with data partitioning for training and testing our models. A machine learning approach is particularly beneficial in situations with complex (non-linear) relationships between variables, high-dimensional predictor matrices, and datasets featuring non-Gaussian distributions or substantial outliers. Conversely, a standard linear regression approach might have been criticized for neglecting to capture these intricate relationships, potentially leading to suboptimal results.

However, our study is not without limitations. First, we used patient-reported measures to evaluate rehabilitation outcomes. Predictive performance of our models might have depended more on socio-demographic and clinical factors if clinician reports had been used instead. Second, as mentioned above, we lacked predictors related to motivation and personality traits. Third, clinical symptoms were assessed using the severity score on the Clinical Global Impression and the Global Assessment of Functioning, which may lack specificity. Rehabilitation outcome predictors often encompass other factors, including those related to personal and clinical recovery, as perceived by patients, their families, and healthcare providers. The extent and quality of mental health clinician training may also yield valuable prognostic insights, which in turn could enhance predictions of rehabilitation outcomes – yet such information is not recorded in REHABase. Fourth, due to the significant impact of missing data on parameter estimation during our pre-tests, we had to exclude two clinical predictors: number and duration of admissions. Fifth, our findings may only be generalizable to patients with schizophrenia in rehabilitation settings within healthcare systems similar to the French system. In particular, outpatients followed by local community mental health teams may have different clinical and psychological profiles which may impact rehabilitation outcomes differently. Sixth, we chose to remove observations with missing rehabilitation outcomes. Selecting only patients with complete outcome data may introduce selection bias if their profiles differ from those with missing values. However, we did not remove observations based on missing covariate data, instead using multiple imputation to address this issue.

Conclusion

In conclusion, our study demonstrated that, even with a large dataset, a wide range of predictors, and optimal modeling strategies, socio-demographic and basic clinical factors had very limited ability to predict patient-reported psychosocial rehabilitation outcomes. These negative results have potential public mental health implications. They support the need for specialized training, tailored service provision, targeted assessment, and specific treatment programs to improve psychosocial rehabilitation care. Based on these accounts, we propose two future directions: first, to identify better predictors of rehabilitation outcomes beyond basic socio-demographic and clinical variables in order to enhance assessment and treatment; and second, to evaluate whether rehabilitation care is more effective when delivered in dedicated rehabilitation services compared to standard psychiatric services.