Abstract
Artificial intelligence (AI) is increasingly used in mental health, yet its rehabilitation-oriented applications in schizophrenia have not been systematically mapped. We conducted a systematic scoping review of PubMed, Web of Science, IEEE Xplore, and the ACM Digital Library (January 1, 2012–October 31, 2025; two search rounds), applying operationalized rehabilitation boundaries and excluding diagnostics-only case–control studies. We extracted data on data sources, feature engineering, model families, validation, calibration, interpretability, application domains, outcomes and implementation readiness. Eighty-three studies met inclusion criteria (median sample size 160; 55% longitudinal). Applications focused on symptom monitoring (48/83), medication management (19/83) and risk management (16/83), whereas functional training (1/83) and psychosocial support (3/83) were rarely targeted. Supervised learning predominated (53/83, 63.9%) over representation learning (20/83, 24.1%), most commonly using speech/text, electronic health records and smartphone sensing. Across classification tasks, the median AUC was 0.79 (IQR 0.71–0.86); relapse early-warning models showed a median sensitivity of 31.5% at 88.0% specificity. Only four studies reported external validation and three described closed-loop deployment, including one randomized trial that improved adherence. Proxy endpoints were more common than clinical endpoints, and reporting of calibration/uncertainty and fairness auditing was sparse. Overall, AI shows promise for monitoring, adherence support and relapse risk stratification, but routine-care deployment will require externally validated and calibrated human-in-the-loop decision support, privacy-preserving multimodal pipelines and pragmatic trials targeting functional outcomes and participation.
Introduction
Schizophrenia is a severe mental disorder characterized by disturbances across multiple domains, such as thinking, perception, self-experience, cognition, volition, affect, and behavior, and is frequently associated with significant social and occupational impairments [1, 2]. Globally, schizophrenia affects approximately 23 million people (around 1 in 345), with higher age-specific prevalence among adults (around 1 in 233) [3]. The illness follows a chronic, relapsing course with substantial functional impairment and premature mortality. Meta-analytic estimates indicate 13–15 years of potential life lost, with the pooled expected age at death being approximately 60 years in men and 68 years in women [4]. Relapse remains common even with treatment; in prospective first-episode cohorts, the five-year cumulative relapse rate reaches approximately 82% [5]. The early phases of the illness carry elevated risks of self-harm and suicide, with lifetime suicide mortality of approximately 5% [6, 7] and a suicidal ideation rate of approximately 35% [8]. These patterns require care that extends beyond acute symptom control.
Psychiatric (psychosocial) rehabilitation, defined by the World Health Organization as a process facilitating opportunities for individuals with mental disorders to achieve optimal independent functioning by strengthening personal competencies and addressing environmental barriers, has emerged as essential for improving functioning and quality of life [9, 10]. Reflecting the needs of patients with mental health disorders, international frameworks emphasize comprehensive, integrated, community-based mental health and social care with longitudinal, measurement-based assessment and adaptation [11,12,13,14]. Best-practice guidelines commonly organize these principles into three core components: (i) comprehensive assessment and individualized care planning, which should be delivered in recovery-oriented, community-based services to support autonomy and participation [12, 13], with ongoing measurement-based monitoring to inform treatment adjustments [14, 15]; (ii) medication management and adherence support, including consideration of long-acting injectable antipsychotics when appropriate [15]; and (iii) evidence-based psychosocial interventions, such as cognitive remediation [16], psychoeducation (e.g., family-based models) [17], and social skills training [18]. In routine clinical workflows, these principles are operationalized through regular follow-up visits [13], family psychoeducation [17], social skills training [18], medication management and adherence support, and measurement-based relapse-prevention planning [15]. Randomized evidence from low-resource settings shows that community-based rehabilitation improves schizophrenia outcomes [19].
Despite the standardization of psychiatric rehabilitation, its implementation remains uneven worldwide [20]. Substantial challenges include resource constraints such as low mental health budgets, workforce shortages, hospital-centric spending [21], and medication-related adverse effects that undermine treatment adherence and tolerability [22, 23]. Moreover, deterioration detection and timely interventions are hampered by the absence of routine, measurement-based assessments [15] and relapse-prevention planning (e.g., early warning sign monitoring) [24]. Coverage data clearly show such implementation gaps: only approximately 29% of people with psychosis receive specialist mental health care globally [3], and approximately one-third of adults with serious mental illnesses in the United States of America have received no mental health treatment in the past year [25]. Such deficiencies in accessibility, quality, and continuity of care, alongside the aforementioned high relapse burden, motivate the search for scalable, remotely deliverable complements to routine rehabilitation [20].
The rapid maturation of digital mental health technologies has opened new pathways for addressing these gaps [26]. Mobile health applications [27], telepsychiatry [28], therapist-guided Internet-delivered cognitive behavioral therapy [29], videoconference-delivered cognitive behavioral therapy [30], automated virtual reality-delivered psychological therapy (some marketed as DTx) [31], and wearable-enabled monitoring [32] now offer practical complements to community care. These technological applications support real-time self-monitoring [27], scalable skills-based interventions [29, 30], remote access [28], and continuous physiological/behavioral monitoring [32]. They capture data through active patient inputs (e.g., ePRO/EMA on smartphones) [33] and passive sensing within a digital phenotyping framework (device logs and onboard sensors) [34].
Artificial intelligence (AI), referring to probabilistic computational methods that learn from data to support prediction/decision-making under uncertainty, has been increasingly applied to analyze these datasets [35]. Related AI approaches encompass several major paradigms, such as supervised learning for outcome prediction from labeled data, unsupervised learning for structure discovery in data without labels, reinforcement learning for sequential decision-making from interactions, and self-supervised learning that derives supervisory signals directly from raw data [35, 36]. Model families range from interpretable approaches (e.g., logistic regression and decision trees) to deep neural networks, with the latter encompassing convolutional architectures for images, recurrent and transformer architectures for sequential data, and graph neural networks for relational structures [35, 37]. Large language models (LLMs; i.e., transformer-based foundation models pretrained on massive text corpora) exhibit strong capabilities in language understanding, generation, and emerging reasoning, enabling applications to process clinical narratives and patient–provider communication [38]. Advanced training strategies for LLMs include multimodal learning to integrate heterogeneous sources, transfer learning to adapt models across domains, and federated learning to enable collaborative training while preserving data locality and privacy [39, 40]. Rigorous LLM deployment requires attention to robustness under distribution shift, principled uncertainty quantification, predictive performance, and governance that advances transparency, fairness, privacy, and security [41, 42].
Notwithstanding the rehabilitation-oriented capabilities of AI and digital therapeutics and the increasing related research, the literature on AI in schizophrenia remains preponderantly concentrated on pathophysiology and diagnosis. For example, supervised models trained on routine electronic health records (EHRs) forecast diagnostic progression for schizophrenia or bipolar disorder [43]. A recurrent neural-network model trained on multi-system EHR data identified individuals at risk of first-episode psychosis up to 12 months before the index event [44]. In neuroimaging, large multisite analyses show that machine learning pipelines can extract reproducible image-derived markers [45]; deep learning graph-neural networks that fuse structural and functional MRI (fMRI) further automate feature discovery, achieving 83% cross-validated accuracy while highlighting circuit-level biomarkers [46]; hypothesis-driven fMRI biomarkers also quantify disease-relevant physiology, such as a cross-validated striatal-dysfunction index that could discriminate schizophrenia from controls and show its relation to antipsychotic response [47]. Multimodal fusion with genomic/transcriptomic data both improves discrimination and helps localize disease-relevant circuits [48], while imaging–transcriptomic maps link MRI phenotypes and fMRI signal amplitude to the cortical expression of interneuron markers [49] and to spatial patterns of schizophrenia risk-gene expression [50].
Meanwhile, rehabilitation-targeted AI applications (e.g., focusing on functional assessment, longitudinal symptom/risk monitoring, medication management, psychosocial skills training, and community reintegration) have received comparatively less attention than diagnostic/prognostic AI applications and remain under-synthesized [51]. A comprehensive synthesis of the AI-based rehabilitation field is particularly critical because psychiatric rehabilitation poses challenges beyond algorithmic performance, requiring context-aware deployment, integration with care pathways, and attention to implementation barriers [52]. Patients commonly raise concerns about data privacy [53] and the possibility that intensive passive monitoring could exacerbate anxiety or paranoia [54]. Clinicians likewise warn that overly intrusive sensing can strain the therapeutic alliance and that recommendations must be sensitive to the clinical context to be actionable [55]. These concerns intersect with technical demands for explainable systems [56] and for high-quality, reliable data, especially in consideration of issues such as label scarcity in psychiatry [57], the limited ecological validity of many functional outcomes [58], device and platform heterogeneity in smartphone/wearable data collection [59], and performance degradation from distribution shifts [35].
Some existing reviews do synthesize data on AI applications, including schizophrenia-focused scoping reviews. The research gap we highlight is that the concentration of past reviews on diagnosis and acute-phase management [60,61,62] has left rehabilitation processes underexplored. This gap is compounded by broader, cross-diagnostic overviews of digital/AI approaches rarely providing analyses aligned with schizophrenia rehabilitation targets [63, 64], particularly negative symptoms [65] and community/social participation [66], which require tailored intervention strategies. This systematic scoping review aimed to address these research gaps by examining AI applications in schizophrenia rehabilitation management. We analyzed the technical and practical applications of AI models across core rehabilitation domains, including symptom monitoring, medication management, risk management, functional training, and psychosocial support.
Methods
This was a systematic scoping review. We chose this design owing to the significant heterogeneity in objectives, technologies, and evaluation metrics across the included studies, as it enables synthesis of research results, evaluation of AI implementation in schizophrenia rehabilitation, and identification of key values and challenges. This study was reported following the PRISMA-ScR guidelines [67].
Search strategy
Search sources
We conducted two database searches (Round 1, January 15–31, 2025; Round 2, October 1–15, 2025, following reviewer feedback) across four databases: PubMed (clinical and rehabilitation literature), Web of Science, IEEE Xplore, and the ACM Digital Library (AI-focused computing and engineering venues).
Eligible records spanned January 1, 2012, through October 31, 2025. The 2012 start date reflects the emergence of modern deep learning (e.g., AlexNet) [68], the subsequent acceleration of AI’s development toward natural language processing and computer vision [69], and the sparsity of AI-related mental health literature before this period [64, 70]. We conducted backward and forward citation chasing to improve completeness.
Search terms
We developed search terms under the guidance of two mental health rehabilitation experts, covering target population, AI technologies, and rehabilitation contexts: (“artificial intelligence” OR “AI” OR “machine learning” OR “deep learning” OR “neural networks” OR “natural language processing” OR “computer vision” OR “computational intelligence” OR “data mining” OR “predictive modeling” OR “reinforcement learning”) AND (“schizophrenia” OR “schizophrenic” OR “schizoaffective disorder” OR “psychosis” OR “psychotic disorder” OR “severe mental illness”) AND (“rehabilitation” OR “recovery” OR “management” OR “care” OR “medication adherence” OR “drug compliance” OR “pharmacological management” OR “medication tracking” OR “medication optimization” OR “relapse prevention” OR “risk assessment” OR “risk prediction” OR “violence prediction” OR “crisis management” OR “cognitive training” OR “social skills training” OR “life skills development” OR “functional recovery” OR “skill-building interventions” OR “symptom tracking” OR “symptom monitoring” OR “behavioral monitoring” OR “therapeutic intervention” OR “emotional support” OR “therapy engagement” OR “psychological well-being”).
Study eligibility criteria
Operational boundary of “Rehabilitation” for schizophrenia
In schizophrenia, psychiatric rehabilitation denotes a recovery-oriented, person-centered, longitudinal framework that enables the development of the skills and securing of the environmental supports required to live, learn, and work in the community with the least professional assistance [71]. Currently, this framework integrates evidence-based pharmacological and psychosocial care (e.g., structured symptom monitoring, medication management, proactive risk management, skills-based functional training, and psychosocial support) to maintain stability, prevent relapse, and promote community participation [14]. By contrast, diagnosis is a categorical, operational process that establishes case identification using syndromic criteria and duration thresholds, designed primarily for reliability and clinical utility rather than prescribing specific treatment pathways [72]. Field studies of the ICD-11 diagnostic guidelines similarly emphasize the clinical utility of diagnosis for communication and decision-making rather than uniform intervention protocols [73]. Accordingly, this review focuses on rehabilitation and management approaches that prioritize community functioning and quality-of-life outcomes and go beyond symptom remission.
Rehabilitation application domains
To systematically categorize AI applications within this schizophrenia rehabilitation framework, we operationalized seven core domains reflecting contemporary rehabilitation clinical practice [14]. Each included study was mapped to one or more domains shown in the following list.
a. Symptom monitoring: continuous, structured assessment of positive, negative, affective, and related functional symptoms in real-world settings via clinician ratings, patient-reported outcomes, ecological momentary assessment [74], and passive sensing [75]. This structured assessment aims to detect fluctuations and early warning signs of relapse [76] and guide timely interventions.
b. Medication management: a systematic, long-term process aimed at optimizing antipsychotic therapy [14], preventing relapse, and minimizing harm, including drug selection/titration, adherence assessment and support, adverse effect monitoring/management [77], long-acting injectable scheduling [78], and shared decision-making.
c. Risk management: ongoing assessment, treatment management formulation, and collaborative management targeting high-impact adverse outcomes (suicide/self-harm [79], violence/victimization [80], and relapse/crisis/hospitalization), integrating early warning monitoring [81], safety planning, and stepped, cross-setting responses.
d. Functional training: training-based, skills-focused interventions that build the enduring capacities needed for community functioning (e.g., neurocognition, social cognition, activities of daily living, instrumental activities of daily living, and vocational skills) through repeated practice and coached learning (e.g., cognitive remediation [82], social-cognition training [83], and individual placement and supported employment [84]).
e. Psychosocial support: structured educational, therapeutic, and social network interventions that enhance coping, family and peer involvement, service engagement, and community integration (e.g., family psychoeducation [85], cognitive behavioral therapy for psychosis [86], and peer support [87]).
f. Physical health/lifestyle management (pre-specified, zero-hit in this review): structured, multicomponent interventions addressing cardiometabolic risks to help close the mortality gap [88] and improve functioning and quality of life, including those combining physical activity and diet/weight management [89], smoking cessation [90], and routine metabolic screening [88].
g. Service organization/care coordination (pre-specified, zero-hit in this review): team- and pathway-level models that orchestrate medication, psychosocial intervention, and vocational/educational support to deliver integrated, continuous rehabilitation in routine services. Examples include coordinated specialty care for first-episode psychosis [91], assertive community treatment [92], and intensive/structured case management [93].
None of the included studies mapped to domains f or g. Therefore, although these domains were retained for completeness, they were omitted from our domain-level quantitative synthesis.
Inclusion criteria
To ensure relevance to rehabilitation and methodological rigor, studies were included if they met all the criteria below.
a. Population: adults or adolescents with clinician-confirmed schizophrenia-spectrum disorders (DSM-5/DSM-5-TR or ICD-10/ICD-11). Studies could include broader serious mental illness diagnoses (e.g., schizoaffective disorder or bipolar disorder with psychotic features), provided that schizophrenia-spectrum disorders constituted a primary analytic group or clearly defined subgroup. Studies conducted in hospitals were eligible only if the AI function targeted post-discharge management or community reintegration outcomes.
b. Intervention/AI function: an AI system (as per Organisation for Economic Co-operation and Development/International Organization for Standardization definitions) that infers from inputs to produce predictions/recommendations/decisions/content in service of a rehabilitation task in any of the seven core domains (see Section 2.2.2); eligible paradigms included supervised/unsupervised/semi-supervised learning, deep learning/foundation models/LLMs, reinforcement learning, probabilistic models, and knowledge-based/expert systems [6,7,8, 59].
c. Outcomes: rehabilitation-relevant endpoints (e.g., relapse/hospitalization, treatment adherence, functioning/participation, and social/role outcomes) or model performance explicitly tied to a rehabilitation management task (e.g., treatment adherence prediction that triggers case management).
d. Designs: randomized controlled trials/quasi-experimental, prospective/retrospective observational, and model development/validation studies. Qualitative or mixed-methods implementation studies were eligible when AI functionality operated within a rehabilitation workflow; diagnostics-only designs were not eligible.
e. Setting: community, home-based, supported accommodation, inpatient-to-community transition, or inpatient and digital health settings aligned with sustained rehabilitation care (e.g., inpatient data used to support post-discharge management or longitudinal relapse prevention).
Exclusion criteria
To focus specifically on rehabilitation, we excluded studies that met any of the following criteria:
a. focused on diagnostics (e.g., screening, case finding, and differential diagnosis) or cross-sectional case–control classifiers (e.g., schizophrenia vs. healthy controls) without linkage to rehabilitation;
b. addressed pathophysiology/biomarkers (e.g., discovery neuroimaging) or theoretical simulations without rehabilitative implications;
c. evaluated acute-phase treatment only (e.g., pharmacologic or symptom-focused psychotherapy) without functional/community outcomes or explicit rehabilitation goals;
d. were limited to custodial/forensic settings with no stated pathway to community living;
e. relied exclusively on modalities infeasible for continuous community monitoring or at-home/routine deployment (e.g., fMRI-only protocols and lab-grade electroencephalogram-only); and
f. were editorials, reviews, proposals, posters, conference abstracts, non-original research, or non-English.
Operationalization for cross-sectional and classification studies
Given the prevalence of cross-sectional case–control designs (e.g., schizophrenia vs. healthy controls) in the AI literature, we established explicit operationalization criteria to assess whether such studies qualify as rehabilitation-oriented. This helped us distinguish diagnostic research from rehabilitation-applicable studies by addressing the inherent ambiguity of binary classification paradigms. All baseline eligibility requirements below had to be met.
a. Confirmed diagnosis: used real-world data from individuals with clinician-confirmed schizophrenia spectrum disorders (per ICD/DSM or equivalent diagnostic criteria), excluding samples based solely on self-reported diagnoses or clinical high-risk populations.
b. Community applicability: data collection methods were feasible for sustained use in community, home, or outpatient settings (e.g., smartphone sensors, wearables, speech/text, and EHR data), and thus did not rely exclusively on research-grade neuroimaging (e.g., fMRI) or laboratory-only modalities (e.g., research-grade electroencephalogram) without a plausible pathway to routine deployment.
c. Beyond pure diagnostics: explicitly discussed or proposed rehabilitation management applications beyond solely reporting classification accuracy for “schizophrenia vs. healthy controls” discrimination.
At least one of the following rehabilitation-orientation signals needed to be present:
a. Rehabilitation-anchored constructs: the model or features were explicitly linked to rehabilitation-relevant dimensions, enabling translation to management priorities. Examples include symptom scales (Brief Negative Symptom Scale/Clinical Assessment Interview for Negative Symptoms), social cognition measures, sleep/circadian patterns, functional/participation assessments (Personal and Social Performance/UCSD Performance-based Skills Assessment/Quality of Life Scale/WHO Disability Assessment Schedule), medication adherence or side effects, and/or safety/risk indicators.
b. Change sensitivity or re-test evidence: presented evidence (even if preliminary) of response to intervention, pharmacological challenges, or repeated measurement, indicating potential utility for longitudinal monitoring or treatment-response tracking.
c. Actionability and interpretability: the features or outputs had interpretable clinical meaning and could plausibly inform rehabilitation care actions (e.g., “elevated negative symptom indices → prompt follow-up, social work engagement, or behavioral activation”), even if decision thresholds were not yet quantified.
If at least one of the following was found in the re-review, the study was excluded:
a. Diagnostics-only orientation: focused exclusively on diagnostic discrimination without establishing any rehabilitation-related linkage or management application.
b. Insufficient real-world utility: external validity or applicability was prohibitively low (e.g., excessive false-positive rates and clearly non-deployable workflows), precluding feasible use in rehabilitation management.
c. Non-compliant population or modality: primarily enrolled unconfirmed/self-disclosed cases or clinical high-risk-only samples or relied on data collection methods lacking community-setting feasibility.
Study selection
Zotero automatically filtered and removed duplicates from search results. Two independent reviewers (first and second authors) conducted title/abstract screening, followed by a full-text review of potentially eligible records. Disagreements were resolved through discussion, and unresolved cases were adjudicated by a third expert. Following the initial screening phases, all preliminarily eligible studies underwent a secondary operationalization review to ensure consistent application of the rehabilitation-oriented inclusion criteria, with cross-sectional or case–control designs subjected to stricter criteria (see Section 2.2.5). This secondary review was conducted in November 2025 in response to reviewer feedback emphasizing clearer rehabilitation boundaries. Interrater agreement for study selection was substantial (Cohen’s κ = 0.78 for title/abstract screening; κ = 0.82 for full-text review; κ = 0.70 for the operationalization review; Fig. 1).
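For transparency, the chance-corrected agreement statistic reported above follows the standard Cohen’s κ formula, (p_o − p_e)/(1 − p_e). The minimal Python sketch below illustrates the calculation on hypothetical paired include/exclude decisions, not our actual screening records.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Observed proportion of exact agreement
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement from each rater's marginal category frequencies
    p_expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical screening decisions (I = include, E = exclude)
decisions_a = ["I", "I", "I", "E"]
decisions_b = ["I", "I", "E", "E"]
print(cohens_kappa(decisions_a, decisions_b))  # → 0.5
```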
Two database search rounds (Round 1: January 2012–January 2025; Round 2: January–October 2025) yielded 627 unique records after deduplication. At title/abstract screening, 520 records were excluded because they were non-original or non-empirical publications, out-of-scope in terms of population or AI use, or did not address rehabilitation-oriented management. The remaining 107 reports underwent full-text eligibility assessment and a secondary operationalization review focusing on rehabilitation-oriented criteria (see Section 2.2.5), leading to the exclusion of 24 reports and a final cohort of 83 studies.
Data extraction
Two reviewers (first and second authors) independently extracted data using a standardized Microsoft Excel template. A pilot extraction of 20 articles refined the procedure and resolved discrepancies. The extracted information comprised the following:
(1) bibliographic details (first author, year, country/region, and World Bank income level);
(2) population and study design (target condition and phase, center structure [single-center, multicenter, or nationwide/healthcare system], setting, sample size and composition, and observation window or follow-up);
(3) task specification (concise task phrase and task family of classification/regression/sequence/time-to-event) and rehabilitation domains using the task–domains framework (domain labels were drawn from the seven rehabilitation domains in Section 2.2.2);
(4) technology paradigm (feature engineering-driven supervised learning; sequence and event-time modeling; representation learning and multimodal deep learning; prescriptive policy learning), recording the model used for primary inferences when multiple were compared;
(5) data sources (modalities and whether passively or actively collected) and engagement pattern (passive sensing, nudge, conversational, or none);
(6) outcome definition (proxy vs. clinical/functional endpoints) and time horizon (a concrete duration or an explicit window such as same-visit, short, mid, mid-to-long, and long term);
(7) performance and outcomes captured in a task-aware manner, including classification metrics (area under the receiver operating characteristic curve [AUC], accuracy, sensitivity/specificity, and, where reported, precision/recall), regression metrics (mean absolute error and root mean squared error), time-to-event metrics (concordance indices, also known as C-index, or time-dependent AUC), early warning metrics (e.g., sensitivity and specificity at pre-specified prediction horizons), and task-appropriate metrics for prescriptive/just-in-time adaptive interventions, reinforcement-learning systems, or LLM-guided interventions, which were summarized narratively owing to heterogeneous definitions;
(8) validation, interpretability, and implementation signals, including validation level (cross-validation, hold-out, and external), calibration and/or uncertainty reporting (yes/no), interpretability class (feature-level, local-explanation, rule-based, or none), closed-loop action (yes/no) with an action-delivery label distinguishing recognition-only systems from those that directly triggered patient- or clinician-facing support or training, safety guardrails for deployment or LLM/reinforcement-learning use (yes/no), and supplementary quality indicators where available (e.g., randomized controlled evaluations, clinician benchmarking, patient user testing, or fairness and algorithmic-bias assessments);
(9) sufficient data pre-processing and feature engineering summary for reproducibility and interpretation (e.g., aggregation windows, selection procedures such as mRMR or embedded regularization, top-k important features, human-readable rules, or learned policy tables); and
(10) for cross-sectional or baseline proof-of-concept studies, an explicit justification of rehabilitation relevance aligned with the operationalization criteria (see Section 2.2.5).
For each study and task family, when multiple models, thresholds, time points, or subscales were reported, we abstracted all available performance metrics but designated a single prespecified “primary” estimate for cross-study descriptive summaries, prioritizing held-out or external test performance on the primary endpoint. Metrics were summarized in a task-aware fashion (i.e., classification, regression, sequence/time-to-event, early warning, and prescriptive tasks were not pooled across task families), and medians and interquartile ranges were computed for homogeneous metric families (e.g., AUC, accuracy, sensitivity/specificity, mean absolute error, root mean squared error, and R²). Metrics expressed on different scales (e.g., percentage root mean squared error on bounded ecological momentary assessment scales) were reported narratively, but were not included in pooled medians; for early warning models, sensitivity/specificity summaries were restricted to studies that reported both.
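As a concrete illustration of these task-aware summaries, the Python sketch below computes the median and interquartile range for one homogeneous metric family. The function name and AUC values are illustrative, not drawn from the included studies.

```python
import statistics

def summarize_metric_family(values):
    """Median and interquartile range for one homogeneous metric family
    (e.g., the primary AUC estimates across classification studies)."""
    # 'inclusive' quartiles treat the data as the whole population of
    # primary estimates rather than a sample from a larger distribution.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"median": statistics.median(values), "iqr": (q1, q3)}

# Hypothetical primary AUC estimates, one per study
aucs = [0.71, 0.75, 0.79, 0.83, 0.86]
print(summarize_metric_family(aucs))  # median 0.79, IQR (0.75, 0.83)
```

Per the rule above, such a summary is computed only within a single task family; metrics on different scales are described narratively rather than pooled.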
To ensure comparability, we grouped methods into four technology paradigms. First, feature engineering-driven supervised learning (typically static classification/regression), such as handcrafted or statistical features with logistic regression, support vector machines, random forests, and tree-based models. Second, sequence and event-time modeling, that is, models that make explicit use of temporal order or survival time such as hidden Markov models, recurrent neural networks, temporal convolutional networks, time-series transformers (also known as TS-Transformers), and Cox proportional hazards models, random survival forests, or deep survival models. Third, representation learning and multimodal deep learning, including self-supervised/contrastive pretraining and multimodal fusion across speech/text/sensing/electronic medical records. Fourth, prescriptive policy learning, ranging from prediction to action and including contextual bandits, reinforcement learning, and dynamic treatment regimes with offline counterfactual evaluation (e.g., inverse propensity scoring, doubly robust estimation, or fitted Q-evaluation). All extracted data were systematically organized according to the AI model type and rehabilitation domain (Table 1). Discrepancies were resolved through discussion and unresolved cases were adjudicated by a third expert.
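For the fourth paradigm, the simplest of the offline counterfactual estimators named above, inverse propensity scoring, can be sketched in a few lines. The logged tuples and the always-act policy below are hypothetical illustrations, not a reconstruction of any included study.

```python
def ips_value(logged, target_prob):
    """Inverse propensity scoring (IPS) estimate of a target policy's value.

    logged: iterable of (context, action, reward, behavior_prob) tuples,
            where behavior_prob is the logging policy's probability of
            the action it actually took.
    target_prob: function (context, action) -> probability of that action
                 under the policy being evaluated.
    """
    total = 0.0
    for context, action, reward, behavior_prob in logged:
        weight = target_prob(context, action) / behavior_prob  # importance weight
        total += weight * reward
    return total / len(logged)

# Hypothetical logs from a uniform-random logging policy over two actions
logs = [(0, 1, 1.0, 0.5), (0, 0, 0.0, 0.5), (1, 1, 1.0, 0.5), (1, 0, 0.5, 0.5)]
always_act_1 = lambda context, action: 1.0 if action == 1 else 0.0
print(ips_value(logs, always_act_1))  # → 1.0
```

Doubly robust estimation and fitted Q-evaluation extend this idea by adding a learned reward model, which reduces the variance that pure IPS incurs when importance weights are large.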
Results
Study selection
The systematic search and selection process is illustrated in Fig. 1. Two search rounds yielded a combined total of 627 records after deduplication (Round 1, 561 records from January 2012 to January 2025; Round 2, 66 records from January to October 2025). In Round 1, following title/abstract screening and full-text review of 561 records, 89 studies met the eligibility criteria. The 472 excluded records were ineligible for the following reasons: (i) diagnostics-only focus without rehabilitation linkage (e.g., case–control classifiers discriminating schizophrenia from healthy controls; 198 studies, 41.9%); (ii) acute-phase treatment without functional/community outcomes (96 studies, 20.3%); (iii) pathophysiology/biomarker discovery without rehabilitation management implications (74 studies, 15.7%); (iv) non-original research (i.e., reviews, editorials, protocols, and conference abstracts; 58 studies, 12.3%); (v) exclusive reliance on non-deployable modalities (e.g., fMRI-only or research-grade electroencephalogram data; 31 studies, 6.6%); and (vi) custodial/forensic settings without community pathways (15 studies, 3.2%).
In Round 2, 66 additional records were identified. Following the same screening process as in Round 1, 18 studies met the eligibility criteria. The 48 excluded records showed a distribution similar to that in Round 1, as shown herein: diagnostics-only focus (19 studies, 39.6%), acute-phase treatment only (10 studies, 20.8%), pathophysiology/biomarkers (8 studies, 16.7%), non-original research (6 studies, 12.5%), non-deployable modalities (3 studies, 6.3%), and custodial/forensic settings (2 studies, 4.2%).
All 107 studies (89 from Round 1 and 18 from Round 2) underwent an additional operationalization review. Among these, 42 studies employed cross-sectional or case–control designs requiring stricter operationalization criteria, and 24 were excluded because they (i) lacked confirmed clinical diagnoses or community-deployable methods (9 studies, 37.5%) or (ii) demonstrated no rehabilitation-orientation signals despite initial inclusion (15 studies, 62.5%). This secondary review, conducted in November 2025, resulted in a final cohort of 83 studies spanning January 2012 to October 2025.
Study characteristics
The final cohort comprised 83 studies published between 2012 and October 2025 (Fig. 2). Publication trends demonstrate a marked acceleration in recent years, as early studies from 2012 to 2019 represented only 9.6% (8/83) of the corpus, whereas 2020 to 2023 accounted for 49.4% (41/83), and 2024 to October 2025 contributed 41.0% (34/83).
Annual counts of included studies (N = 83) from 2012 to October 2025. Bars show annual counts with numeric labels; the superimposed line depicts the temporal trend. Data for 2025 include publications through October 31, 2025.
Studies originated predominantly from high-income countries (Fig. 3). The United States of America contributed the largest share (31/83 studies, 37.3%), followed by China (including Taiwan and Hong Kong; 10/83, 12.0%), and the United Kingdom (5/83, 6.0%). Italy also accounted for 5/83 studies (6.0%). Additional contributions were from South Korea (4/83, 4.8%), France (4/83, 4.8%), Canada (3/83, 3.6%), Spain (3/83, 3.6%), the Netherlands (3/83, 3.6%), Germany (2/83, 2.4%), Poland (2/83, 2.4%), Greece (2/83, 2.4%), and Singapore (2/83, 2.4%). Single-country contributions were observed for Japan, Turkey, India, and Denmark (one study each), and three multicountry or regional trials were coded as International or European consortia.
Bubble map of included studies by primary country (N = 83). Circle size and the numeric label denote the number of studies. The China group includes mainland China (n = 4), Taiwan (n = 5), and Hong Kong SAR (n = 1). Multinational or regional consortia (International/Europe, n = 3) are not assigned to a single country on the map.
Most studies were conducted in community or outpatient settings (57 studies, 68.7%), with smaller proportions in inpatient settings (10 studies, 12.0%), mixed settings leveraging nationwide claims or health system data (4 studies, 4.8%), or settings not clearly reported (12 studies, 14.5%). Approximately half of the studies employed multicenter designs (42 studies, 50.6%), with single-center studies accounting for 47.0% (39 studies) and nationwide or health-system analyses for 2.4% (2 studies).
Population and sample size
All 83 studies provided sample size information (range, 5–87 182 participants; median, 160 participants). Regarding sample size distribution, 4.8% (4 studies) enrolled fewer than 20 participants, 33.7% (28 studies) enrolled 20–100 participants, 38.6% (32 studies) enrolled 101–500 participants, 7.2% (6 studies) enrolled 501–1000 participants, and 15.7% (13 studies) enrolled more than 1000 participants (Fig. 4). Studies with smaller sample sizes (<100 participants) accounted for 38.6% (32 studies) of the corpus, whereas those with 100 or more participants represented 61.4% (51 studies).
Sample sizes ranged from 5 to 87 182 (median 160). Overall, 32/83 (38.6%) studies enrolled <100 participants and 51/83 (61.4%) enrolled ≥100, with darker shades indicating larger sample-size categories.
Most studies focused on patients with schizophrenia in a clinically stable or chronic phase (45 studies, 54.2%), followed by mixed populations or other diagnostic categories (13 studies, 15.7%), acute inpatients or hospitalized patients (10 studies, 12.0%), patients with first-episode or early psychosis (9 studies, 10.8%), recently discharged patients (5 studies, 6.0%), and treatment-resistant schizophrenia (1 study, 1.2%). Twenty studies (24.1%) included healthy or matched control groups, whereas the remaining 63 (75.9%) exclusively examined patient populations. Several studies incorporated individuals with schizoaffective disorder, bipolar disorder with psychotic features, or broader serious mental illness categories alongside patients with schizophrenia (Fig. 5).
Categories (mutually exclusive) are: stable/chronic schizophrenia (45, 54.2%), mixed or other diagnoses (13, 15.7%), acute inpatients/hospitalized (10, 12.0%), first‑episode/early psychosis (9, 10.8%), recently discharged (5, 6.0%), and treatment‑resistant schizophrenia (1, 1.2%). Abbreviation: TRS, treatment‑resistant schizophrenia.
Among the 83 studies, 46 (55.4%) were longitudinal studies with specified follow-up or repeated monitoring, while 37 (44.6%) employed cross-sectional or single time-point assessments. For longitudinal studies that explicitly reported follow-up durations (42 studies), follow-up lengths varied considerably as follows: 1 study (2.4%) had a follow-up period of less than 1 week, 4 (9.5%) ranged from 1 week to 1 month, 7 (16.7%) ranged from 1 to 3 months, 7 (16.7%) ranged from 3 to 6 months, 16 (38.1%) ranged from 6 to 12 months, and 7 (16.7%) extended beyond one year (up to 12–17 years).
Data sources and user engagement patterns
The 83 included studies used diverse data-collection methodologies. Active data collection was predominant (56.6%; 47 studies), acquiring data through structured clinical interviews, standardized symptom scales, cognitive task assessments, or patient self-reports. Passive collection comprised 38.6% (32 studies), leveraging sensor-based devices, EHR systems, or social media platforms to capture patient behavioral data. Moreover, 4.8% (4 studies) employed combined approaches integrating both active and passive collection methods to achieve data complementarity.
Regarding user engagement, most studies adopted no-engagement designs (68.7%; 57 studies), wherein data were collected without providing real-time feedback or interventions to patients. Passive sensing constituted 21.7% (18 studies), continuously monitoring patients (e.g., physiological indicators, activity patterns, and behavioral characteristics) via smartphones/wearables. Conversational engagement (e.g., natural language-processing-driven virtual assistants or therapeutic dialogue systems) and nudge-based engagement (e.g., medication reminders or symptom self-assessment prompts through mobile applications) each accounted for 4.8% (4 studies).
Regarding data modality, speech and text data were the most prevalent (22.9%; 19 studies) and included clinical interview transcripts and voice recordings, typically analyzed with natural language-processing techniques. EHRs served as data sources in 14 studies (16.9%), encompassing structured diagnostic codes, prescription information, and unstructured clinical narrative notes. Smartphone-based multimodal sensing ranked next, with 12 studies (14.5%) capturing patients’ mobility trajectories, social interactions, sleep patterns, and screen use behaviors. Wearable device data were relatively scarce, adopted by four studies (4.8%), including wrist-worn accelerometers, smartwatches, or heart-rate monitoring devices.
Outcome measures and temporal horizons
The included studies demonstrated marked heterogeneity in outcome selection and temporal horizons. Proxy endpoints predominated across the literature (e.g., diagnostic classification accuracy, symptom scale scores, medication adherence rates, treatment-response indicators, and social functioning assessments), appearing in 67 studies (80.7%), whereas clinical endpoints (e.g., relapse events, hospital readmissions, symptomatic remission, functional remission, and long-term mortality) were evaluated in 21 studies (25.3%). More specifically, 62 studies (74.7%) exclusively employed proxy endpoints, 16 studies (19.3%) focused solely on clinical endpoints, and 5 studies (6.0%) incorporated both types.
Regarding temporal horizons, concurrent models utilizing data from a single assessment time point represented the most common approach, accounting for 34 studies (41.0%). Short-term investigations of up to three months were employed in 24 studies (28.9%), typically targeting symptom fluctuations, early relapse detection, or medication adherence monitoring. Medium-term investigations spanning 3–12 months represented 16 studies (19.3%), focusing on treatment-response trajectories, functional outcomes, and sustained adherence patterns. Long-term investigations extending beyond 12 months comprised 9 studies (10.8%), addressing outcomes such as multi-year relapse risk, treatment-resistance development, mortality prediction, and chronic disease incidence.
Application domains and task landscape
Across the five rehabilitation management domains that appeared in the included studies, symptom monitoring emerged as the predominant application area (Fig. 6), encompassing 48 studies (57.8%). Symptom monitoring tasks clustered into seven distinct task categories, as follows: diagnostic classification (9 studies) leveraged speech, language, or multimodal features to distinguish patients with schizophrenia from healthy controls [94,95,96,97,98,99,100,101,102]; symptom scale prediction (14 studies) employed machine learning to estimate Positive and Negative Syndrome Scale, Brief Psychiatric Rating Scale, or ecological momentary assessment scores [102,103,104,105,106,107,108,109,110,111,112,113,114,115]; negative symptom quantification (4 studies) automated the assessment of blunted affect, alogia, anhedonia, avolition, and asociality using wearable sensors or speech analysis [116,117,118,119]; cognitive function evaluation (5 studies) detected formal thought disorder or predicted memory performance [120,121,122,123,124]; social functioning assessment (4 studies) utilized smartphone GPS, passive sensing, or facial affect recognition to estimate social isolation, loneliness, and interpersonal competence [125,126,127,128]; quality of life prediction (2 studies) estimated subjective well-being or functional outcomes [129, 130]; clinical phenotyping (7 studies) was used to delineate prognostic subgroups or disease stages, including subtype classification [99, 124, 131,132,133,134,135]. Task categories are not mutually exclusive, and thus the counts may sum to more than the number of studies per domain.
(a) Distribution of the 83 included studies across rehabilitation management domains: symptom monitoring (48 studies), medication management (19), risk management (16), functional training (1), and psychosocial support (3). (b) Symptom‑monitoring task categories among the 48 studies in this domain: diagnostic classification (9 studies), symptom scale prediction (14), negative symptom quantification (4), cognitive function evaluation (5), social functioning assessment (4), quality‑of‑life prediction (2), and clinical phenotyping (7). (c) Medication‑management task categories among 19 studies: adherence monitoring and prediction (7 studies), treatment response and resistance stratification (8), dosage optimization and toxicity prediction (2), pharmacovigilance for non‑psychiatric adverse events (2), and individualized drug selection (1). (d) Risk‑management task categories among 16 studies: relapse prediction (9 studies), hospitalization risk assessment (3), violence‑related classification (3), comorbidity risk prediction (1), and mortality prediction (1). Bars represent the number of studies per domain or task category; domains and task categories are not mutually exclusive, and individual studies can contribute to more than one category.
Medication management constituted the second-largest domain with 19 studies (22.9%), spanning five core tasks: adherence monitoring and prediction (7 studies) used smartphone-based visual verification, pharmacokinetic modeling, or claims data to forecast treatment continuation [136,137,138,139,140,141,142]; treatment response and resistance stratification (8 studies) predicted symptomatic remission, treatment-resistant schizophrenia status, or clozapine responsiveness [143,144,145,146,147,148,149,150]; dosage optimization and toxicity prediction (2 studies) recommended therapeutic dose ranges or forecasted adverse metabolic effects [151, 152]; pharmacovigilance for non-psychiatric adverse events (2 studies) monitored prolactin elevation and medication-sequence–linked hospitalization risks [152, 153]; and individualized drug selection (1 study) generated personalized treatment rules based on baseline characteristics [154].
Risk management applications appeared in 16 studies (19.3%), comprising five task categories: relapse prediction (9 studies) developed early warning systems for psychotic exacerbation with prediction windows ranging from one week to two years using digital phenotyping, Internet search behavior, or smartphone passive sensing [155,156,157,158,159,160,161,162,163]; hospitalization risk assessment (3 studies) forecasted readmissions or prolonged inpatient stays [156, 164, 165]; violence-related classification (3 studies) covered aggression-risk prediction or victimization event detection [166,167,168]; comorbidity risk prediction (1 study) estimated type 2 diabetes onset [169]; and mortality prediction (1 study) modeled all-cause death using EHR data [170].
For functional training, only one study (1.2%) identified response trajectories to social cognition training and predicted individualized treatment benefits [171]. Psychosocial support interventions comprised three studies (3.6%): one analyzed therapeutic dialogue patterns in virtual-reality avatar therapy [172], one predicted optimal referral pathways to cognitive behavioral therapy or vocational training [173], and one provided policy recommendation prototypes using offline reinforcement learning [174].
Technological approaches and model architectures
The included studies employed four primary technological paradigms: feature engineering-driven supervised learning (53 studies, 63.9%), representation learning-driven modeling (20 studies, 24.1%), sequence and event-time modeling (7 studies, 8.4%), and prescriptive policy learning (3 studies, 3.6%).
For feature engineering-driven supervised learning, random forest was the most frequently adopted algorithm (24 studies), often used for intrinsic feature-importance profiling and ensemble-based generalization [95, 98,99,100,101, 103, 117, 123, 126, 127, 129, 130, 140, 143, 146, 149, 150, 159, 161, 164, 166, 168, 171, 173]. Gradient boosting variants (e.g., XGBoost and gradient boosting machines; 18 studies) were commonly applied to structured tabular data and high-dimensional feature spaces [103, 106, 108, 117, 122, 123, 126, 130, 139, 140, 143, 145, 149, 150, 152, 168, 169, 173]. Support vector machines (18 studies) were usually applied in small-sample settings and frequently used for speech-acoustic classification, but also appeared in higher-dimensional risk-prediction pipelines [94, 96, 104, 105, 109, 110, 112, 123, 127, 134, 136, 146, 147, 159, 163, 164, 166, 168]. Logistic regression (15 studies) was often used for clinical nomogram construction or as a baseline comparator [96, 123, 130, 139,140,141, 146, 148,149,150, 162, 166, 168, 169, 171]. Regularization techniques, such as least absolute shrinkage and selection operator/elastic net (12 studies), were implemented to select high-dimensional predictors and mitigate overfitting [109, 112, 116, 117, 128, 141,142,143, 150, 166, 168, 169].
Among representation learning methods, transformer architectures (4 studies; e.g., BERT/BioBERT and Whisper) were used to process clinical narrative text, therapy-dialogue content, and automatic speech recognition outputs [120, 165, 167, 175]. Convolutional neural networks (4 studies) were applied to model visual inputs for medication adherence verification, painting-based symptom assessment, and accelerometry-based human-activity recognition [102, 107, 118, 137]. Recurrent architectures (3 studies; e.g., long short-term memory/gated recurrent unit/vanilla recurrent neural networks) were used to capture temporal dependencies in smartphone sensor streams, ecological momentary assessment trajectories, and multimodal relapse predictions [107, 113, 157]. Autoencoder frameworks (3 studies) were used for unsupervised anomaly detection in relapse early warning systems and for dimensionality reduction in mortality risk modeling [155, 157, 170]. Two studies reported LLM-augmented pipelines for zero-shot symptom severity scoring or feature extraction from unstructured EHRs [119, 165].
In sequence and event-time modeling, hidden Markov models (1 study) were used to identify latent symptom state transitions from ecological momentary assessment sequences [132]. Cox proportional-hazards regression and random survival forests (1 study) were applied to model time-to-relapse following medication discontinuation [158]. AutoRegressive Integrated Moving Average (ARIMA) models and Gaussian-process anomaly detection (1 study) were implemented to model irregular temporal patterns in relapse prediction systems [156]. Trajectory clustering with fuzzy methods (1 study) was used to stratify first-episode psychosis patients into prognostic phenotypes [133]. Recurrent networks with long short-term memory or gated recurrent unit cells (3 studies) were used to forecast multi-day mental state fluctuations from digital phenotyping data [114, 115, 153].
For prescriptive policy learning, one study applied targeted minimum loss-based individualized treatment rules to recommend optimal antipsychotic selection using baseline clinical features [154]. Two studies deployed offline reinforcement learning (i.e., batch-constrained Q-learning and deep deterministic policy gradient algorithms) for psychotherapy strategy recommendations and simulated inner speech training policies in cognitive remediation contexts [124, 174].
Model performance and predictive efficacy
Model performance metrics varied substantially across task categories. To avoid inappropriate cross-domain comparisons, metrics are reported separately for classification, regression, event-time, and early warning task applications. For classification tasks, 38 studies reported AUC metrics [94,95,96,97,98,99,100,101, 104, 107, 112,113,114, 117, 120, 123, 127, 130, 139,140,141, 143,144,145,146,147,148,149, 153, 155, 159, 164,165,166, 168,169,170, 173], the median of which was 0.79 (interquartile range [IQR]: 0.71–0.86) with a range of 0.59–1.00. The median accuracy was 79.0% (IQR: 66.2–86.9%), ranging from 31.4% to 99.0%. Four symptom monitoring studies achieved AUC ≥ 0.90, including schizophrenia vs. healthy control discrimination (AUC = 0.99) [94], negative symptom severity classification (AUC = 1.00) [104], diagnostic classification using symptom subtyping (AUC = 0.92) [99], and schizophrenia classification using temporal features (AUC = 0.95) [101]. These models typically drew on feature engineering from speech acoustics or multimodal behavioral markers; in risk-management applications, deep neural architectures with self-attention also achieved AUC ≈ 0.90 (e.g., long-stay hospitalization prediction [165]).
Regarding regression tasks, studies predicted continuous clinical scale scores, symptom trajectories, or functional outcomes using diverse error metrics. Among studies reporting mean absolute error [103, 106, 108, 126, 152, 157], the median was 2.17 (range, 0.05–7.79) when considering different measurement scales, including Brief Psychiatric Rating Scale subscales, social functioning dimensions, and prolactin concentrations. Across five studies that reported absolute root mean squared error values for clinical scales [102, 110, 129, 151, 152], the root mean squared error exhibited a median of 13.30 (range, 0.06–85.23) when considering quality of life indices, Positive and Negative Syndrome Scale total scores, and pharmacokinetic predictions; an additional study reported a relative root mean squared error of 12% on 0–3 ecological momentary assessment symptom scales [109]. The median R² was 0.63 (range, 0.14–0.92) [107, 122, 128, 151], reaching 0.92 in clozapine pharmacokinetic dose concentration modeling [151] and 0.74 in symptom severity prediction from multimodal wearable data streams [107]. Pearson correlation coefficients for symptom scale predictions were generally moderate to high, often in the range of approximately 0.4–0.9 [95, 96, 103, 105, 106, 121, 122, 126, 176], with some Positive and Negative Syndrome Scale reconstruction models achieving very high correlations (up to r ≈ 0.99) [111]. These metrics span heterogeneous scales, and the counts reflect only studies that reported each metric; they should therefore be interpreted with caution.
For event-time modeling, two studies reported C-indices ranging from 0.71 to 0.78, covering post-discontinuation relapse [158] and all-cause mortality prediction [170]. In both studies, event-time models outperformed baseline-only comparators; for instance, in Brandt et al. [158], the C-index improved from 0.60 for baseline-only covariates to 0.70–0.71 for regularized Cox and random survival forest models.
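For readers less familiar with the C-index, the quantity reported by these event-time studies can be sketched as the fraction of comparable patient pairs that the model's risk scores order correctly. The follow-up times and risk scores below are hypothetical.

```python
def c_index(times, events, risks):
    """Harrell's concordance index: fraction of comparable pairs where the
    subject with the earlier observed event also has the higher risk score."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable if subject i had an observed event strictly
            # before subject j's event or censoring time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties count half
    return concordant / comparable

times = [2, 5, 7, 10]         # months to relapse or censoring (hypothetical)
events = [1, 1, 0, 1]         # 1 = relapse observed, 0 = censored
risks = [0.9, 0.6, 0.4, 0.2]  # model risk scores
print(c_index(times, events, risks))  # perfectly concordant: 1.0
```

A C-index of 0.5 corresponds to chance-level ordering, so the 0.71–0.78 values above indicate moderate discrimination.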
Among early warning systems, six studies implemented relapse early warning models (mostly evaluated offline/retrospectively) with prediction horizons ranging from 1 week to 30 days [155, 157, 159, 161,162,163]. The median sensitivity was 31.5% (range, 0.6–66.2%), and the median specificity was 88.0% (range, 71.0–99.7%). One system achieved 66.2% recall at 6.3% precision using balanced random forests on smartphone-sensor clusters [161]. Another attained 99.7% specificity with 0.6% sensitivity via one-class support vector machines [162]. Anomaly-rate increases of approximately 108% [157] and 112% (×2.12) [162] were observed in pre-relapse windows. Three- to four-week prediction windows were most common (overall range, 1–30 days).
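The sensitivity/specificity figures above can be illustrated by tabulating per-window alarm decisions against relapse labels, as in this hypothetical sketch (not any included study's evaluation code).

```python
def sens_spec(labels, alarms):
    """Sensitivity and specificity from per-window labels and alarm flags."""
    tp = sum(y == 1 and a == 1 for y, a in zip(labels, alarms))
    fn = sum(y == 1 and a == 0 for y, a in zip(labels, alarms))
    tn = sum(y == 0 and a == 0 for y, a in zip(labels, alarms))
    fp = sum(y == 0 and a == 1 for y, a in zip(labels, alarms))
    return tp / (tp + fn), tn / (tn + fp)

labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # 1 = window preceded a relapse
alarms = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]  # 1 = model raised an alarm
sensitivity, specificity = sens_spec(labels, alarms)
print(round(sensitivity, 3), round(specificity, 3))  # 0.333 0.857
```

This toy example reproduces the low-sensitivity/high-specificity profile typical of the reviewed models: most pre-relapse windows are missed, but false alarms are rare.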
Validation rigor, interpretability, and implementation readiness
Regarding validation protocols, most studies relied on cross-validation (e.g., k-fold, leave-one-subject-out, and Monte Carlo), and a subset used hold-out splits. Four studies reported external or cross-dataset evaluations, including independent cohort or cross-trial datasets and leave-one-site-out or temporal holdout designs [112, 114, 131, 150]. One study achieved 68.0% balanced accuracy on external-validation data spanning three independent trials [150]. For calibration and uncertainty quantification, five studies reported some form of probability calibration or predictive uncertainty handling using Monte Carlo dropout [113], fuzzy-logic confidence stratification over uncertainty-aware decisions [114], and Brier scores and/or calibration plots, sometimes combined with bootstrap internal validation [140, 141, 144]. Most other studies provided no such information.
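Two of the calibration quantities mentioned here, the Brier score and calibration-in-the-large (mean predicted risk versus observed event rate), can be sketched as follows; the predicted probabilities and outcomes are hypothetical.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and outcomes
    (lower is better; 0 is perfect)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def calibration_in_the_large(probs, outcomes):
    """Compare the average predicted risk with the observed event rate."""
    mean_pred = sum(probs) / len(probs)
    event_rate = sum(outcomes) / len(outcomes)
    return mean_pred, event_rate

probs = [0.9, 0.8, 0.3, 0.2, 0.1]  # hypothetical predicted relapse risks
outcomes = [1, 1, 0, 0, 0]         # observed events
print(round(brier_score(probs, outcomes), 3))  # 0.038
```

Here the mean predicted risk (0.46) slightly exceeds the event rate (0.40), i.e., the hypothetical model is mildly miscalibrated in the large even though its Brier score is low.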
Interpretability mechanisms were relatively common, with feature-level approaches (e.g., random forest importance, Shapley additive explanations, permutation importance, and least absolute shrinkage and selection operator coefficients) being the most frequently reported. Local or case-level explanation methods (5 studies) provided instance-specific rationales using Shapley additive explanations, counterfactuals, policy-trajectory visualizations, or LLM-generated justifications [119, 145, 153, 170, 174]. Rule-based interpretability (3 studies) employed decision trees or fuzzy-logic rule sets [97, 114, 160]. A subset of studies (14/83, 16.9%), often applying deep or computer-vision models, reported no explicit interpretability mechanisms [113, 115, 118, 120, 125, 138, 139, 151, 162, 164, 167, 169, 175, 176].
For closed-loop implementation, three studies documented implementations in which AI predictions directly triggered clinical actions: weekly symptom forecasts that automatically triggered clinical outreach [106]; a randomized controlled trial where AI-based adherence verification with real-time alerts improved adherence rates (94.7 vs. 64.4%; p < 0.001) and symptom outcomes [138]; and a partial closed loop with computer-vision-flagged medication behaviors prompting counselor-mediated interventions [137]. Most studies operated in recognition-only mode, generating predictions without automated action pathways.
Regarding safety guardrails and quality signals, none of the studies employing reinforcement learning or LLMs reported safety constraints [119, 174]. Supplementary quality signals appeared in one randomized controlled trial [138], one clinician benchmark (n = 24 raters) [113], one user-testing study (n = 7) [163], and one algorithmic-bias probe across demographic subgroups [165].
Discussion
This systematic scoping review adopted a rehabilitation- rather than diagnosis-centered approach, focusing on the actionable value chain (i.e., from monitoring to decision support, intervention, follow-up, and audit) of AI in community and long-term schizophrenia rehabilitation management settings. This value chain framework reflects established measurement-based care principles [14, 15] and implementation science models for digital mental health [177, 178], wherein continuous monitoring informs clinical decisions, triggers timely interventions, enables systematic follow-up, and supports quality auditing cycles. Notably, the publication volume in this area has increased steeply in recent years, underscoring both the timeliness of this evidence base and the immaturity of its implementation layer. We also explicitly delineated the boundary between “rehabilitation” and “pure monitoring/prediction” in our methods, including only studies in which AI functions demonstrated a clear pathway to rehabilitation goals (e.g., functional improvement, relapse prevention, medication management, or social participation).
Based on the 83 included studies published between 2012 and October 2025, the AI literature on schizophrenia rehabilitation management is developing rapidly (more than 90% of the studies were published from 2020 onwards), yet its implementation layer remains immature. Most studies engaged in symptom monitoring (57.8%), medication management (22.9%), and risk management (19.3%), while there was a notable scarcity of studies focused on functional training and psychosocial support (i.e., the areas most proximal to rehabilitation outcomes; 1.2 and 3.6%, respectively). The evidence structure likewise skewed toward “identification and characterization,” as surrogate endpoints dominated (67/83, 80.7%), external validation was rare (4/83, 4.8%), calibration and uncertainty reporting were insufficient (5/83 studies, 6.0%), and closed-loop implementation was uncommon (3/83, 3.6%). For methods, active data collection predominated, yet 68.7% of systems adopted a “no-engagement” design without real-time feedback/intervention. Conversational and nudge-based systems together accounted for <10% of the corpus, and speech/text, EHR, and smartphone sensing were the dominant data modalities, with wearable-only systems remaining uncommon. This indicates that most systems can, at present, only discriminate; a critical transition toward executable, auditable, and sustainable schizophrenia rehabilitation closed loops is still required.
These application gaps reflect the bottleneck effect of the rehabilitation value chain. Functional training and psychosocial support studies require long-term, repeated, and contextualized measurement of behavioral change with actionable labels [84, 179], because work in these domains relies on high-quality process data and granular task decomposition. Given these implementation complexities, it is perhaps unsurprising that both research categories are markedly underrepresented in the current ecosystem. In the mental health literature, cross-diagnostic digital interventions and just-in-time adaptive interventions provide methodological inspiration for “moving from identification to action” [180,181,182]. Based on our findings, we suggest that translating the current evidence into stable benefits within schizophrenia contexts will require reconstructing the data and intervention units around rehabilitation goals. This will help ensure that the algorithmic outputs correspond one-to-one with executable action scripts [183, 184].
At the “meta-analytic” performance level, without conflating tasks, classification tasks yielded an overall median AUC of 0.79 and accuracy of 79%, with a minority of symptom monitoring studies (i.e., predominantly relying on acoustic voice features, multimodal behavioral markers, or self-attention architectures) achieving AUC ≥ 0.90. Relapse prediction models exhibited the typical profile of low sensitivity–high specificity (median sensitivity 31.5%; specificity 88%), suggesting that they are better suited as upstream triage signals rather than standalone decision gates. Two studies showed approximately doubled anomaly rates within the prediction windows [157, 162], although overall capture rates remained limited. For schizophrenia rehabilitation clinical practice, the significance of performance metrics hinges on whether they can deliver quantifiable data to promote early engagement, reduce relapse, and enhance participation [185]. Therefore, subsequent research should link surrogate endpoints with clinical endpoints (e.g., relapse, rehospitalization, functioning, and quality of life) and employ decision curve analysis to bind prediction thresholds to specific actions and resource allocation [186, 187]. These research efforts may help translate model optimization into real-world outcome improvements.
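The net-benefit quantity at the core of decision curve analysis can be sketched as follows; the predictions, outcomes, and threshold are hypothetical, and `net_benefit` is an illustrative helper rather than a published implementation.

```python
def net_benefit(probs, outcomes, threshold):
    """Net benefit of acting on all predictions at or above `threshold`:
    true positives gained minus false positives, weighted by the odds of
    the threshold (the standard decision curve analysis formula)."""
    n = len(probs)
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    return tp / n - fp / n * (threshold / (1 - threshold))

probs = [0.9, 0.8, 0.6, 0.3, 0.2, 0.1]  # hypothetical predicted risks
outcomes = [1, 1, 0, 1, 0, 0]           # observed relapse events
print(round(net_benefit(probs, outcomes, 0.5), 3))  # 0.167
```

Sweeping the threshold over the clinically plausible range and comparing against "treat all" and "treat none" strategies is what binds a prediction threshold to a specific action and resource allocation.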
Regarding methodological maturity, most studies employed cross-validation or internal holdout, whereas few provided external or cross-dataset validation, reported calibration and uncertainty, conducted user studies, or implemented closed loops. For risk communication and the setting of action thresholds, discrimination is merely the starting point: calibration determines the credibility of communicated risks, and uncertainty presentation pinpoints when to trigger human review [188, 189]. Importantly, distributional drift and subgroup disparities may rapidly erode effectiveness across disease stages and service contexts [190, 191]. Therefore, external validation, calibration curves/Brier scores, confidence intervals, and subgroup robustness should be routinely reported in future studies. At the design level, systems should embed “abstain/requires review” mechanisms and online drift monitoring strategies [191,192,193], as this would enable an automatic downgrade to a human–AI collaboration mode when uncertainty escalates.
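A minimal sketch of the “abstain/requires review” routing advocated here follows; the thresholds and the uncertainty input are hypothetical placeholders (in practice, the uncertainty estimate might come from, e.g., Monte Carlo dropout variance).

```python
def triage(prob, uncertainty, decision_threshold=0.5, review_threshold=0.15):
    """Route a prediction to human review when its uncertainty is too high;
    otherwise act on the calibrated probability. All thresholds are
    hypothetical placeholders, not validated clinical values."""
    if uncertainty > review_threshold:
        return "requires_review"  # downgrade to human-AI collaboration mode
    return "alert" if prob >= decision_threshold else "no_action"

print(triage(0.7, 0.05))  # alert
print(triage(0.7, 0.30))  # requires_review
print(triage(0.2, 0.05))  # no_action
```

The same hook is a natural place to attach online drift monitoring: if the input distribution shifts, the uncertainty estimate rises and more cases are routed to review automatically.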
Furthermore, a pronounced mismatch exists between interpretability and safety requirements. Feature engineering-driven models generally provide global or local explanations, whereas deep learning, LLM, and reinforcement learning applications largely lack transparency and instance-level explanations, and only a few studies using such opaque methods offered counterfactual or Shapley additive explanation (SHAP)-based case-level evidence. In rehabilitation settings, accountability for specific actions demands an auditable "prediction–explanation–action" chain [56, 194, 195]: the model must be able to explain why follow-up was triggered at a given moment, which factors drove medication adjustments, and how thresholds self-adapted for the same patient across different stages. In reinforcement learning and LLM applications in particular, safety constraints and alignment mechanisms remain unestablished [196,197,198], with governance lagging behind algorithmic complexity.
Clinical integration, reimbursement, and literacy constitute the true thresholds for AI's scaled deployment in schizophrenia rehabilitation management, yet only three studies achieved closed loops in which predictions directly triggered actions, while most systems remained in identification mode. In community contexts, it is essential to clarify "who sees what signal when, follows which script to take what action, and who is responsible for tracking and auditing" [177, 178]; the absence of corresponding reimbursement mechanisms and workload accounting can render proactive outreach unsustainable [199, 200]; and patient and team digital literacy directly affects adherence and interpretation quality [201]. These implementation-layer complexities (role ambiguity, reimbursement gaps, and literacy barriers) reveal that algorithmic performance metrics (e.g., AUC and accuracy) measure what a system can achieve under controlled conditions but remain silent on whether it will be adopted, integrated, and sustained in routine care workflows. Previous studies have predominantly evaluated AI effectiveness through technical benchmarks, leaving questions of reach, feasibility, and service-level impact largely unaddressed. Rather than relying solely on technical metrics to assess AI deployment, we recommend employing implementation science frameworks such as RE-AIM to assess reach, adoption, and maintenance, and conducting "AI-in-the-loop" pragmatic trials to evaluate service key performance indicators (e.g., follow-up completion rates, relapse intervals, and functional improvement) [202,203,204].
Equity and generalizability issues are also concerning. The evidence base is concentrated in high-income countries, with minimal representation from low- and middle-income countries (only one study from India), and only one included study explicitly probed algorithmic bias across demographic subgroups [165]. This entails not only out-of-domain mismatches in device/data ecosystems and care models but also potential cultural biases in goals and measurement. For instance, functional recovery is operationalized differently across cultures (e.g., independent employment and living in Western settings versus family role restoration and caregiver burden reduction in Eastern settings), yet mainstream functional metrics exhibit limited sensitivity to the latter [205, 206]. Medication management models are likewise highly context-dependent, as divergences in drug availability, follow-up frequency, and metabolic monitoring resources directly affect the validity of adherence prediction and risk assessment [24]. Local recalibration, preregistered subgroup reporting, and quantification of performance degradation in cross-domain deployment should therefore become standard components of transfer protocols (e.g., following TRIPOD+AI and PROBAST+AI guidance on external validation and reporting) [207, 208], combined with evidence on effectiveness erosion from distributional drift and model underspecification [193].
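Local recalibration of a transported model can be as lightweight as Platt scaling: refitting a slope and intercept on the source model's log-odds using a small local sample. The sketch below is illustrative only; the data are synthetic and the `platt_recalibrate` helper is a hypothetical name written for this example.

```python
import numpy as np

def platt_recalibrate(p_source, y_target, iters=2000, lr=0.1):
    """Fit a, b in sigmoid(a * logit(p) + b) by gradient descent on
    logistic loss, mapping source-model probabilities to the new site."""
    x = np.log(p_source / (1 - p_source))  # log-odds of source model
    a, b = 1.0, 0.0                        # start at the identity mapping
    for _ in range(iters):
        q = 1 / (1 + np.exp(-(a * x + b)))
        a -= lr * np.mean((q - y_target) * x)  # logistic-loss gradients
        b -= lr * np.mean(q - y_target)
    return a, b

# Synthetic example: the source model is systematically overconfident
# (probabilities shifted upward) at the new deployment site.
rng = np.random.default_rng(1)
y = rng.binomial(1, 0.3, 500)
p = np.clip(0.3 + 0.5 * (y - 0.3) + rng.normal(0, 0.1, 500), 0.02, 0.98)
p_over = np.clip(p ** 0.5, 0.02, 0.98)  # miscalibrated transported scores
a, b = platt_recalibrate(p_over, y)
```

A transfer protocol would fit such a mapping on a preregistered local sample, then report calibration before and after, overall and by subgroup.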
Regarding ethics and governance, passive sensing and high-frequency monitoring may exacerbate feelings of constant surveillance and feed paranoid content [48, 209]. Risk stratification outputs, if not contextualized through communication, can readily produce labeling effects and therapeutic pessimism [210, 211]. Involuntary treatment and forensic contexts further require explicit delineation of algorithmic signal boundaries and procedural safeguards [212, 213]. Current research predominantly remains at the minimal threshold of obtaining informed consent; we instead recommend operationalizing governance into four actionable standards: dynamic consent with minimum necessary data collection, purpose limitation with withdrawal/portability rights, subgroup fairness reporting with bias monitoring, and intervention safety switches in closed-loop scenarios. For high-autonomy systems such as reinforcement learning and LLMs, we suggest integrating red-teaming, adversarial examples, and privilege-escalation interception across the training-to-deployment pipeline, with human–AI decision logs recorded for post-hoc auditing, consistent with the previously cited LLM clinical evaluation and mitigation recommendations [197].
Regarding actionable recommendations for practice and development: clinically, algorithmic outputs should be embedded into a "measurement–feedback–intensification" closed loop (measurement-based care), with preset thresholds and action scripts (e.g., "alert → phone follow-up within 48 h → escalate to in-person visit or medication adjustment if necessary") [214], human review triggered in scenarios of elevated uncertainty or complex comorbidity, and thresholds and scripts dynamically calibrated through case audits and outcome feedback, forming a "learning rehabilitation system" [215]. For development and operations, external validation and calibration, uncertainty quantification with abstention, cross-domain transfer with recalibration toolkits, and edge/low-bandwidth operation under energy constraints should be designated as minimum viable configurations [215,216,217]. Service key performance indicators should serve as primary evaluation dimensions, ensuring that technology aligns with the rehabilitation goals of "fewer relapses, better engagement, improved quality of life," and real-world service efficiency and workload accounting should become routine evaluation metrics [218, 219].
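A preset action script of this kind is straightforward to encode and audit. The sketch below is a hypothetical illustration: the risk cutoffs and timings are placeholders to be calibrated locally through case audits, and the `Action`/`action_script` names are ours, not from any included system.

```python
from dataclasses import dataclass

@dataclass
class Action:
    step: str        # what the care team does
    due_hours: int   # deadline for completing the step

def action_script(risk: float, uncertain: bool) -> Action:
    """Map a model's relapse-risk output to a preset, auditable action.
    Elevated uncertainty always downgrades to human review first."""
    if uncertain:
        return Action("human review before any outreach", 24)
    if risk >= 0.8:
        return Action("escalate to in-person visit / medication review", 48)
    if risk >= 0.5:
        return Action("phone follow-up", 48)
    return Action("continue routine monitoring", 0)

print(action_script(0.85, uncertain=False))
```

Because every (risk, uncertainty) → action mapping is explicit, each triggered step can be logged against its threshold, supporting the "prediction–explanation–action" audit chain discussed above.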
This scoping review has several limitations. First, the search and inclusion scope (the selected databases and the restriction to English-language literature) may have resulted in omissions, with disciplinary intersections and non-standard terminology potentially widening search blind spots. Second, the included studies exhibited substantial heterogeneity in methodology, data sources, participant populations, and outcome specifications, precluding direct comparison and quantitative synthesis. Third, the existing evidence predominantly features surrogate endpoints and short-to-medium-term follow-up: nearly 40% of studies enrolled fewer than 100 participants, only seven reported follow-up beyond one year, and external validation, calibration/uncertainty reporting, and real-world implementation documentation were limited, all of which weaken inferential strength and generalizability. Fourth, research was geographically concentrated and drew on a narrow set of device/platform ecosystems, leaving cross-context transferability and local adaptability unvalidated. Fifth, our operationalized criteria for determining "readiness for application," while enhancing relevance for rehabilitation, may have introduced selection bias.
Overall, AI has demonstrated feasibility across several key components of schizophrenia rehabilitation management, although current evidence is insufficient to support conclusions regarding unified effect sizes. The primary contribution of this review lies in providing an application landscape and evaluative criteria centered on rehabilitation goals, distinguishing technologies with mere identification capabilities from tools that can be integrated into service pathways. Future research should adopt patient-centered outcomes and service performance as primary endpoints; conduct prospective, multi-center, and cross-context validation and recalibration; standardize the reporting of calibration, confidence intervals, and subgroup performance; and advance executable and auditable clinical integration within interoperability and governance frameworks. Only through rigorous translation from signal generation to service-level execution can AI substantively reduce relapse risk, enhance engagement, and improve quality of life in schizophrenia contexts.
References
World Health Organization Clinical Descriptions and Diagnostic Requirements for ICD-11 Mental, Behavioural and Neurodevelopmental Disorders (CDDR). Geneva: World Health Organization; 2024.
McCutcheon RA, Reis Marques TR, Howes OD. Schizophrenia—an overview. JAMA Psychiatry. 2020;77:201–10.
World Health Organization Schizophrenia—Fact Sheet. Geneva: World Health Organization; 2025.
Hjorthøj C, Stürup AE, McGrath JJ, Nordentoft M. Years of potential life lost and life expectancy in schizophrenia: a systematic review and meta-analysis. Lancet Psychiatry. 2017;4:295–301.
Robinson D, Woerner MG, Alvir JMJ, Bilder R, Goldman R, Geisler S, et al. Predictors of relapse following response from a first episode of schizophrenia or schizoaffective disorder. Arch Gen Psychiatry. 1999;56:241–7.
Lu L, Dong M, Zhang L, Zhu XM, Ungvari GS, Ng CH, et al. Prevalence of suicide attempts in individuals with schizophrenia: a meta-analysis of observational studies. Epidemiol Psychiatr Sci. 2019;29:e39.
Palmer BA, Pankratz VS, Bostwick JM. The lifetime risk of suicide in schizophrenia: a reexamination. Arch Gen Psychiatry. 2005;62:247–53.
Bai W, Liu ZH, Jiang YY, Zhang QE, Rao WW, Cheung T, et al. Worldwide prevalence of suicidal ideation and suicide plan among people with schizophrenia: a meta-analysis and systematic review of epidemiological surveys. Transl Psychiatry. 2021;11:552.
World Health Organization Psychosocial Rehabilitation—a Consensus Statement (WHO/MNH/MND/96.2). Geneva: World Health Organization; 1996.
Thornicroft G, Deb T, Henderson C. Community mental health care worldwide: current status and further developments. World Psychiatry. 2016;15:276–86.
World Health Organization Comprehensive Mental Health Action Plan 2013-30 (updated). Geneva: World Health Organization; 2021.
World Health Organization Guidance on Community Mental Health Services: Promoting Person-Centred and Rights-Based Approaches. Geneva: World Health Organization; 2021.
Patel V, Saxena S, Lund C, Thornicroft G, Baingana F, Bolton P, et al. The Lancet Commission on global mental health and sustainable development. Lancet. 2018;392:1553–98.
American Psychiatric Association The American Psychiatric Association Practice Guideline for the Treatment of Patients with Schizophrenia. 3rd edn. Washington, DC: APA Publishing; 2020.
Lewis CC, Boyd M, Puspitasari A, Navarro E, Howard J, Kassab H, et al. Implementing measurement-based care in behavioral health: a review. JAMA Psychiatry. 2019;76:324–35.
Wykes T, Huddy V, Cellard C, McGurk SR, Czobor P. A meta-analysis of cognitive remediation for schizophrenia: methodology and effect sizes. Am J Psychiatry. 2011;168:472–85.
Rodolico A, Bighelli I, Avanzato C, Concerto C, Cutrufelli P, Mineo L, et al. Family interventions for relapse prevention in schizophrenia: a systematic review and network meta-analysis. Lancet Psychiatry. 2022;9:211–21.
Turner DT, McGlanaghy E, Cuijpers P, Van Der Gaag M, Karyotaki E, MacBeth A. A meta-analysis of social skills training and related interventions for psychosis. Schizophr Bull. 2018;44:475–91.
Asher L, Hanlon C, Birhane R, Habtamu A, Eaton J, Weiss HA, et al. Community-based rehabilitation intervention for people with schizophrenia in Ethiopia (RISE): a 12 month mixed methods pilot study. BMC Psychiatry. 2018;18:250.
World Health Organization World Mental Health Report: Transforming Mental Health for All. Geneva: World Health Organization; 2022.
World Health Organization Mental Health Atlas. Geneva: World Health Organization; 2020.
Lieberman JA, Stroup TS, McEvoy JP, Swartz MS, Rosenheck RA, Perkins DO, et al. Effectiveness of antipsychotic drugs in patients with chronic schizophrenia. N Engl J Med. 2005;353:1209–23.
Haddad PM, Brain C, Scott J. Nonadherence with antipsychotic medication in schizophrenia: challenges and management strategies. Patient Relat Outcome Meas. 2014;5:43–62.
National Institute for Health and Care Excellence Psychosis and Schizophrenia in Adults: Prevention and Management. London: NICE; 2014. vol. CG178.
Substance Abuse and Mental Health Services Administration Results from the 2023 National Survey on Drug Use and Health: Annual National Report. Rockville, MD: Substance Abuse and Mental Health Services Administration; 2024.
Torous J, Linardon J, Goldberg SB, Sun S, Bell I, Nicholas J, et al. The evolving field of digital mental health: current evidence and implementation issues for smartphone apps, generative artificial intelligence, and virtual reality. World Psychiatry. 2025;24:156–74.
Linardon J, Torous J, Firth J, Cuijpers P, Messer M, Fuller-Tyszkiewicz M. Current evidence on the efficacy of mental health smartphone apps for symptoms of depression and anxiety. A meta-analysis of 176 randomized controlled trials. World Psychiatry. 2024;23:139–49.
Hagi K, Kurokawa S, Takamiya A, Fujikawa M, Kinoshita S, Iizuka M, et al. Telepsychiatry versus face-to-face treatment: systematic review and meta-analysis of randomised controlled trials. Br J Psychiatry. 2023;223:407–14.
Karyotaki E, Efthimiou O, Miguel C, Bermpohl FMG, Furukawa TA, Cuijpers P, et al. Internet-based cognitive behavioral therapy for depression: a systematic review and individual patient data network meta-analysis. JAMA Psychiatry. 2021;78:361–71.
Matsumoto K, Hamatani S, Shimizu E. Effectiveness of videoconference-delivered cognitive behavioral therapy for adults with psychiatric disorders: systematic and meta-analytic review. J Med Internet Res. 2021;23:e31293.
Freeman D, Lambe S, Kabir T, Petit A, Rosebrock L, Yu LM, et al. Automated virtual reality therapy to treat agoraphobic avoidance and distress in patients with psychosis (gameChange): a multicentre, parallel-group, single-blind, randomised, controlled trial in England with mediation and moderation analyses. Lancet Psychiatry. 2022;9:375–88.
Fonseka LN, Woo BKP. Wearables in schizophrenia: update on current and future clinical applications. JMIR Mhealth Uhealth. 2022;10:e35600.
Stone AA, Schneider S, Smyth JM. Evaluation of pressing issues in ecological momentary assessment. Annu Rev Clin Psychol. 2023;19:107–31.
Onnela JP, Rauch SL. Harnessing smartphone-based digital phenotyping to enhance behavioral and mental health. Neuropsychopharmacology. 2016;41:1691–6.
Murphy KP Probabilistic Machine Learning: An Introduction. Cambridge, MA: MIT Press; 2022.
Huang SC, Pareek A, Jensen M, Lungren MP, Yeung S, Chaudhari AS. Self-supervised learning for medical image classification: a systematic review and implementation guidelines. NPJ Digit Med. 2023;6:74.
Wu Z, Pan S, Chen F, Long G, Zhang C, Yu PS. A comprehensive survey on graph neural networks. IEEE Trans Neural Netw Learn Syst. 2021;32:4–24.
Thirunavukarasu AJ, Ting DSJ, Elangovan K, Gutierrez L, Tan TF, Ting DSW. Large language models in medicine. Nat Med. 2023;29:1930–40.
Baltrušaitis T, Ahuja C, Morency LP. Multimodal machine learning: a survey and taxonomy. IEEE Trans Pattern Anal Mach Intell. 2019;41:423–43.
Rieke N, Hancox J, Li W, Milletarì F, Roth HR, Albarqouni S, et al. The future of digital health with federated learning. NPJ Digit Med. 2020;3:119.
Feng J, Phillips RV, Malenica I, Bishara A, Hubbard AE, Celi LA, et al. Clinical artificial intelligence quality improvement: towards continual monitoring and updating of AI algorithms in healthcare. NPJ Digit Med. 2022;5:66.
World Health Organization Ethics and Governance of Artificial Intelligence for Health: WHO Guidance. World Health Organization; 2021. https://www.who.int/publications/i/item/9789240029200.
Hansen L, Bernstorff M, Enevoldsen K, Kolding S, Damgaard JG, Perfalk E, et al. Predicting diagnostic progression to schizophrenia or bipolar disorder via machine learning. JAMA Psychiatry. 2025;82:459–69.
Raket LL, Jaskolowski J, Kinon BJ, Brasen JC, Jönsson L, Wehnert A, et al. Dynamic ElecTronic hEalth reCord deTection (DETECT) of individuals at risk of a first episode of psychosis: a case-control development and validation study. Lancet Digit Health. 2020;2:e229–e239.
Zhu Y, Maikusa N, Radua J, Sämann PG, Fusar-Poli P, Agartz I, et al. Using brain structural neuroimaging measures to predict psychosis onset for individuals at clinical high-risk. Mol Psychiatry. 2024;29:1465–77.
Gao J, Qian M, Wang Z, Li Y, Luo N, Xie S, et al. Exploring schizophrenia classification through multimodal MRI and deep graph neural networks: unveiling brain region-specific weight discrepancies and their association with cell-type specific transcriptomic features. Schizophr Bull. 2024;51:217–35.
Li A, Zalesky A, Yue W, Howes O, Yan H, Liu Y, et al. A neuroimaging biomarker for striatal dysfunction in schizophrenia. Nat Med. 2020;26:558–65.
Qi S, Sui J, Pearlson G, Bustillo J, Perrone-Bizzozero NI, Kochunov P, et al. Derivation and utility of schizophrenia polygenic risk associated multimodal MRI frontotemporal network. Nat Commun. 2022;13:4929.
Anderson KM, Collins MA, Chin R, Ge T, Rosenberg MD, Holmes AJ. Transcriptional and imaging-genetic association of cortical interneurons, brain function, and schizophrenia risk. Nat Commun. 2020;11:2889.
Morgan SE, Seidlitz J, Whitaker KJ, Romero-Garcia R, Clifton NE, Scarpazza C, et al. Cortical patterning of abnormal morphometric similarity in psychosis is associated with brain expression of schizophrenia-related genes. Proc Natl Acad Sci USA. 2019;116:9604–9.
Torous J, Bucci S, Bell IH, Kessing LV, Faurholt-Jepsen M, Whelan P, et al. The growing field of digital psychiatry: current evidence and the future of apps, social media, chatbots, and virtual reality. World Psychiatry. 2021;20:318–35.
Koutsouleris N, Hauser TU, Skvortsova V, De Choudhury M. From promise to practice: towards the realisation of AI-informed mental health care. Lancet Digit Health. 2022;4:e829–e840.
Yoo DW, Woo H, Nguyen VC, Birnbaum ML, Kruzan KP, Kim JG et al. Patient perspectives on AI-driven predictions of schizophrenia relapses: understanding concerns and opportunities for self-care and treatment. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI Conference. New York: Association for Computing Machinery; 2024.
Eisner E, Ball H, Ainsworth J, Cella M, Chalmers N, Clifford S. et al. Using passive sensing to predict psychosis relapse: an in-depth qualitative study exploring perspectives of people with psychosis. Schizophr Bull. 2025:sbaf126. https://doi.org/10.1093/schbul/sbaf126
Rogan J, Firth J, Bucci S. Healthcare professionals’ views on the use of passive sensing and machine learning approaches in secondary mental healthcare: a qualitative study. Health Expect. 2024;27:e70116.
Ghassemi M, Oakden-Rayner L, Beam AL. The false hope of current approaches to explainable artificial intelligence in health care. Lancet Digit Health. 2021;3:e745–e750.
Koppe G, Meyer-Lindenberg A, Durstewitz D. Deep learning for small and big data in psychiatry. Neuropsychopharmacology. 2021;46:176–90.
Lewandowski KE. Ecological validity in cognitive assessment and treatment. Schizophr Res Cogn. 2025;40:100341.
Challis S, Nielssen O, Harris A, Large M. Systematic meta-analysis of the risk factors for deliberate self-harm before and after treatment for first-episode psychosis. Acta Psychiatr Scand. 2013;127:442–54.
Shatte ABR, Hutchinson DM, Teague SJ. Machine learning in mental health: a scoping review of methods and applications. Psychol Med. 2019;49:1426–48.
Foteinopoulou NM, Patras I. Machine learning approaches for fine-grained symptom estimation in schizophrenia: a comprehensive review. Artif Intell Med. 2025;165:103129.
Kambeitz J, Kambeitz-Ilankovic L, Leucht S, Wood S, Davatzikos C, Malchow B, et al. Detecting neuroimaging biomarkers for schizophrenia: a meta-analysis of multivariate pattern recognition studies. Neuropsychopharmacology. 2015;40:1742–51.
Ni Y, Jia F. A scoping review of AI-Driven digital interventions in mental health care: mapping applications across screening, support, monitoring, prevention, and clinical education. Healthcare (Basel). 2025;13:1205.
Cruz-Gonzalez P, He AWJ, Lam EP, Ng IMC, Li MW, Hou R, et al. Artificial intelligence in mental health care: a systematic review of diagnosis, monitoring, and intervention applications. Psychol Med. 2025;55:e18.
Galderisi S, Mucci A, Buchanan RW, Arango C. Negative symptoms of schizophrenia: new developments and unanswered research questions. Lancet Psychiatry. 2018;5:664–77.
Handest R, Molstrom IM, Gram Henriksen M, Hjorthøj C, Nordgaard J. A systematic review and meta-analysis of the association between psychopathology and social functioning in schizophrenia. Schizophr Bull. 2023;49:1470–85.
Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-SCR): checklist and explanation. Ann Intern Med. 2018;169:467–73.
Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM. 2017;60:84–90.
LeCun Y, Bengio Y, Hinton GE. Deep learning. Nature. 2015;521:436–44.
Chivilgina O, Elger BS, Jotterand F. Digital technologies for schizophrenia management: a descriptive review. Sci Eng Ethics. 2021;27:25.
Rössler W. Psychiatric rehabilitation today: an overview. World Psychiatry. 2006;5:151–7.
Owen MJ, Sawa A, Mortensen PB. Schizophrenia. Lancet. 2016;388:86–97.
Reed GM, Keeley JW, Rebello TJ, First MB, Gureje O, Ayuso-Mateos JL, et al. Clinical utility of ICD-11 diagnostic guidelines for high-burden mental disorders: results from mental health settings in 13 countries. World Psychiatry. 2018;17:306–15.
Bell IH, Eisner E, Allan S, Cartner S, Torous J, Bucci S, et al. Methodological characteristics and feasibility of ecological momentary assessment studies in psychosis: a systematic review and meta-analysis. Schizophr Bull. 2024;50:238–65.
Benoit J, Onyeaka H, Keshavan M, Torous J. Systematic review of digital phenotyping and machine learning in psychosis spectrum illnesses. Harv Rev Psychiatry. 2020;28:296–304.
Gumley AI, Bradstreet S, Ainsworth J, Allan S, Alvarez-Jimenez M, Aucott L, et al. The EMPOWER blended digital intervention for relapse prevention in schizophrenia: a feasibility cluster randomised controlled trial in Scotland and Australia. Lancet Psychiatry. 2022;9:477–86.
Stroup TS, Gray N. Management of common adverse effects of antipsychotic medications. World Psychiatry. 2018;17:341–56.
Kishimoto T, Hagi K, Kurokawa S, Kane JM, Correll CU. Long-acting injectable versus oral antipsychotics for the maintenance treatment of schizophrenia: a systematic review and comparative meta-analysis of randomised, cohort, and pre–post studies. Lancet Psychiatry. 2021;8:387–404.
Hawton K, Lascelles K, Pitman A, Gilbert S, Silverman M. Assessment of suicide risk in mental health practice: shifting from prediction to therapeutic assessment, formulation, and risk management. Lancet Psychiatry. 2022;9:922–8.
Whiting D, Gulati G, Geddes JR, Fazel S. Association of schizophrenia spectrum disorders and violence perpetration in adults and adolescents from 15 countries: a systematic review and meta-analysis. JAMA Psychiatry. 2022;79:120–32.
Gleeson JF, McGuckian TB, Fernandez DK, Fraser MI, Pepe A, Taskis R, et al. Systematic review of early warning signs of relapse and behavioural antecedents of symptom worsening in people living with schizophrenia spectrum disorders. Clin Psychol Rev. 2024;107:102357.
Vita A, Barlati S, Ceraso A, Nibbio G, Ariu C, Deste G, et al. Effectiveness, core elements, and moderators of response of cognitive remediation for schizophrenia: a systematic review and meta-analysis of randomized clinical trials. JAMA Psychiatry. 2021;78:848–58.
Nijman SA, Veling W, van der Stouwe ECD, Pijnenborg GHM. Social cognition training for people with a psychotic disorder: a network meta-analysis. Schizophr Bull. 2020;46:1086–103.
Bond GR, Al-Abdulmunem M, Marbacher J, Christensen TN, Sveinsdottir V, Drake RE. A systematic review and meta-analysis of IPS supported employment for young adults with mental health conditions. Adm Policy Ment Health. 2023;50:160–72.
Bighelli I, Rodolico A, García-Mieres H, Pitschel-Walz G, Hansen WP, Schneider-Thoma J, et al. Psychosocial and psychological interventions for relapse prevention in schizophrenia: a systematic review and network meta-analysis. Lancet Psychiatry. 2021;8:969–80.
Jauhar S, McKenna PJ, Radua J, Fung E, Salvador R, Laws KR. Cognitive–behavioural therapy for the symptoms of schizophrenia: systematic review and meta-analysis with examination of potential bias. Br J Psychiatry. 2014;204:20–29.
Smit D, Miguel C, Vrijsen JN, Groeneweg B, Spijker J, Cuijpers P. The effectiveness of peer support for individuals with mental illness: systematic review and meta-analysis. Psychol Med. 2023;53:5332–41.
Firth J, Siddiqi N, Koyanagi A, Siskind D, Rosenbaum S, Galletly C, et al. The Lancet Psychiatry Commission: a blueprint for protecting physical health in people with mental illness. Lancet Psychiatry. 2019;6:675–712.
Daumit GL, Dickerson FB, Wang NY, Dalcin A, Jerome GJ, Anderson CAM, et al. A behavioral weight-loss intervention in persons with serious mental illness. N Engl J Med. 2013;368:1594–602.
Tsoi DTY, Porwal M, Webster AC. Efficacy and safety of bupropion for smoking cessation and reduction in schizophrenia: systematic review and meta-analysis. Br J Psychiatry. 2010;196:346–53.
Kane JM, Robinson DG, Schooler NR, Mueser KT, Penn DL, Rosenheck RA, et al. Comprehensive versus usual community care for first-episode psychosis: 2-year outcomes from the NIMH RAISE early treatment program. Am J Psychiatry. 2016;173:362–72.
Bond GR, Drake RE. The critical ingredients of assertive community treatment. World Psychiatry. 2015;14:240–2.
Dieterich M, Irving CB, Bergman H, Khokhar MA, Park B, Marshall M. Intensive case management for severe mental illness. Cochrane Database Syst Rev. 2017;1:CD007906.
Osipov M, Behzadi Y, Kane JM, Petrides G, Clifford GD. Objective identification and analysis of physiological and behavioral signs of schizophrenia. J Ment Health. 2015;24:276–82.
Arslan B, Kizilay E, Verim B, Demirlek C, Dokuyan Y, Turan YE, et al. Automated linguistic analysis in speech samples of Turkish-speaking patients with schizophrenia-spectrum disorders. Schizophr Res. 2024;267:65–71.
Chan CC, Norel R, Agurto C, Lysaker PH, Myers EJ, Hazlett EA, et al. Emergence of language related to self-experience and agency in autobiographical narratives of individuals with schizophrenia. Schizophr Bull. 2023;49:444–53.
Parola A, Gabbatore I, Berardinelli L, Salvini R, Bosco FM. Multimodal assessment of communicative-pragmatic features in schizophrenia: a machine learning approach. NPJ Schizophr. 2021;7:28.
Ciampelli S, Voppel AE, De Boer JN, Koops S, Sommer IEC. Combining automatic speech recognition with semantic natural language processing in schizophrenia. Psychiatry Res. 2023;325:115252.
De Boer JN, Voppel AE, Brederoo SG, Schnack HG, Truong KP, Wijnen FNK, et al. Acoustic speech markers for schizophrenia-spectrum disorders: a diagnostic and symptom-recognition tool. Psychol Med. 2023;53:1302–12.
Richter V, Neumann M, Kothare H, Roesler O, Liscombe J, Suendermann-Oeft D et al. Towards multimodal dialog-based speech & facial biomarkers of schizophrenia. In: International Conference on Multimodal Interaction. New York, NY: ACM; 2022.
Kalinich M, Ebrahim S, Hays R, Melcher J, Vaidyam A, Torous J. Applying machine learning to smartphone based cognitive and sleep assessments in schizophrenia. Schizophr Res Cogn. 2022;27:100216.
Shen H, Wang SH, Zhang Y, Wang H, Li F, Lucas MV, et al. Color painting predicts clinical symptoms in chronic schizophrenia patients via deep learning. BMC Psychiatry. 2021;21:522.
Wang R, Aung MSH, Abdullah S, Brian R, Campbell AT, Choudhury T et al. CrossCheck: toward passive sensing and detection of mental health changes in people with schizophrenia. In: Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing. New York, NY: ACM; 2016. pp 886–97.
Jeong L, Lee M, Eyre B, Balagopalan A, Rudzicz F, Gabilondo C. Exploring the use of natural language processing for objective assessment of disorganized speech in schizophrenia. Psychiatr Res Clin Pract. 2023;5:84–92.
Arevian AC, Bone D, Malandrakis N, Martinez VR, Wells KB, Miklowitz DJ, et al. Clinical state tracking in serious mental illness through computational analysis of speech. PLoS ONE. 2020;15:e0225695.
Wang R, Wang W, Aung MSH, Ben-Zeev D, Brian R, Campbell AT, et al. Predicting symptom trajectories of schizophrenia using mobile sensing. Proc ACM Interact Mob Wearable Ubiquitous Technol. 2017;1:1–24.
Hong M, Kang RR, Yang JH, Rhee SJ, Lee H, Kim YG, et al. Comprehensive symptom prediction in inpatients with acute psychiatric disorders using wearable-based deep learning models: development and validation study. J Med Internet Res. 2024;26:e65994.
Adler DA, Wang F, Mohr DC, Choudhury T. Machine learning for passive mental health symptom prediction: generalization across different longitudinal mobile sensing studies. PLoS ONE. 2022;17:e0266516.
Tseng VWS, Sano A, Ben-Zeev D, Brian R, Campbell AT, Hauser M, et al. Using behavioral rhythms and multi-task learning to predict fine-grained symptoms of schizophrenia. Sci Rep. 2020;10:15100.
Lin E, Lin CH, Lane HY. Applying a bagging ensemble machine learning approach to predict functional outcome of schizophrenia with clinical symptoms and cognitive functions. Sci Rep. 2021;11:6922.
Lin GH, Liu JH, Lee SC, Wu BJ, Li SQ, Chiu HJ, et al. Developing a machine learning-based short form of the Positive and Negative Syndrome Scale. Asian J Psychiatry. 2024;94:103965.
Soldatos RF, Cearns M, Nielsen MØ, Kollias C, Xenaki LA, Stefanatou P, et al. Prediction of early symptom remission in two independent samples of first-episode psychosis patients using machine learning. Schizophr Bull. 2022;48:122–33.
van Dee V, Kia SM, Fregosi C, Swildens WE, Alkema A, Batalla A, et al. Prognostic predictions in psychosis: exploring the complementary role of machine learning models. BMJ Ment Health. 2025;28:e301594.
van Opstal DPJ, Kia SM, Jakob L, Somers M, Sommer IEC, Winter-van Rossum I, et al. Psychosis prognosis predictor: a continuous and uncertainty-aware prediction of treatment outcome in first-episode psychosis. Acta Psychiatr Scand. 2025;151:280–92.
Jean T, Guay Hottin R, Orban P. Forecasting mental states in schizophrenia using digital phenotyping data. PLOS Digit Health. 2025;4:e0000734.
Cohen AS, Cox CR, Le TP, Cowan T, Masucci MD, Strauss GP, et al. Using machine learning of computerized vocal expression to measure blunted vocal affect and alogia. NPJ Schizophr. 2020;6:26.
Narkhede SM, Luther L, Raugh IM, Knippenberg AR, Esfahlani FZ, Sayama H, et al. Machine learning identifies digital phenotyping measures most relevant to negative symptoms in psychotic disorders: implications for clinical trials. Schizophr Bull. 2022;48:425–36.
Umbricht D, Cheng WY, Lipsmeier F, Bamdadian A, Lindemann M. Deep learning-based human activity recognition for continuous activity and gesture monitoring for schizophrenia patients with negative symptoms. Front Psychiatry. 2020;11:574375.
Liu CM, Chan YH, Ho MY, Liu CC, Lu MH, Liao YA, et al. Analyzing generative AI and machine learning in auto-assessing schizophrenia's negative symptoms. Schizophr Bull. 2025:sbaf102. https://doi.org/10.1093/schbul/sbaf102.
Bradley ER, Portanova J, Woolley JD, Buck B, Painter IS, Hankin M, et al. Quantifying abnormal emotion processing: a novel computational assessment method and application in schizophrenia. Psychiatry Res. 2024;336:115893.
Holmlund TB, Chandler C, Foltz PW, Cohen AS, Cheng J, Bernstein JC, et al. Applying speech technologies to assess verbal memory in patients with serious mental illness. NPJ Digit Med. 2020;3:33.
McCutcheon RA, Keefe RSE, McGuire PM, Marquand A. Deconstructing cognitive impairment in psychosis with a machine learning approach. JAMA Psychiatry. 2025;82:57–65.
Zakowicz PT, Brzezicki MA, Levidiotis C, Kim S, Wejkuć O, Wisniewska Z, et al. Detection of formal thought disorders in child and adolescent psychosis using machine learning and neuropsychometric data. Front Psychiatry. 2025;16:1550571.
Granato G, Costanzo R, Borghi A, Mattera A, Carruthers S, Rossell S, et al. An experimental and computational investigation of executive functions and inner speech in schizophrenia spectrum disorders. Sci Rep. 2025;15:5185.
Difrancesco S, Fraccaro P, Van Der Veer SN, Alshoumr B, Ainsworth J, Bellazzi R, et al. Out-of-home activity recognition from GPS data in schizophrenic patients. In: 29th International Symposium on Computer-Based Medical Systems (CBMS). New York, NY: IEEE; 2016.
Wang W, Mirjafari S, Harari G, Ben-Zeev D, Brian R, Choudhury T, et al. Social sensing: assessing social functioning of patients living with schizophrenia using mobile phone sensing. In: CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. New York, NY: ACM; 2020.
Badal VD, Depp CA, Hitchcock PF, Penn DL, Harvey PD, Pinkham AE. Computational methods for integrative evaluation of confidence, accuracy, and reaction time in facial affect recognition in schizophrenia. Schizophr Res Cogn. 2021;25:100196.
Abplanalp SJ, Green MF, Wynn JK, Eisenberger NI, Horan WP, Lee J, et al. Using machine learning to understand social isolation and loneliness in schizophrenia, bipolar disorder, and the community. Schizophrenia (Heidelb). 2024;10:88.
Shibata Y, Victorino JN, Natsuyama T, Okamoto N, Yoshimura R, Shibata T. Estimation of subjective quality of life in schizophrenic patients using speech features. Front Rehabil Sci. 2023;4:1121034.
Jeong JH, Kim J, Kang N, Ahn YM, Kim YS, Lee D, et al. Modeling the determinants of subjective well-being in schizophrenia. Schizophr Bull. 2025;51:1118–33.
Miley K, Meyer-Kalos P, Ma S, Bond DJ, Kummerfeld E, Vinogradov S. Causal pathways to social and occupational functioning in the first episode of schizophrenia: uncovering unmet treatment needs. Psychol Med. 2023;53:2041–9.
Hulme WJ, Stockton-Powdrell C, Lewis S, Martin GP, Bucci S, Parsia B, et al. Cluster hidden Markov models: an application to ecological momentary assessment of schizophrenia. In: 32nd International Symposium on Computer-Based Medical Systems (CBMS). New York, NY: IEEE; 2019.
Amoretti S, Verdolini N, Mezquida G, Rabelo-da-Ponte FD, Cuesta MJ, Pina-Camacho L, et al. Identifying clinical clusters with distinct trajectories in first-episode psychosis through an unsupervised machine learning technique. Eur Neuropsychopharmacol. 2021;47:112–29.
Martínez-Cao C, Sánchez-Lasheras F, García-Fernández A, González-Blanco L, Zurrón-Madera P, Sáiz PA, et al. PsiOvi staging model for schizophrenia (PsiOvi SMS): a new internet tool for staging patients with schizophrenia. Eur Psychiatry. 2024;67:e36.
Martinelli A, Leone S, Baronio CM, Archetti D, Redolfi A, Adorni A, et al. Sex differences in schizophrenia spectrum disorders: insights from the DiAPAson study using a data-driven approach. Soc Psychiatry Psychiatr Epidemiol. 2025;60:1983–97.
Howes C, Purver M, McCabe R, Healey P, Lavelle M. Predicting adherence to treatment for schizophrenia from dialogue transcripts. In: Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue. New York: ACM; 2012.
Bain EE, Shafner L, Walling DP, Othman AA, Chuang-Stein C, Hinkle J, et al. Use of a novel artificial intelligence platform on mobile devices to assess dosing compliance in a phase 2 clinical trial in subjects with schizophrenia. JMIR Mhealth Uhealth. 2017;5:e18.
Chen HH, Hsu HT, Lin PC, Chen CY, Hsieh HF, Ko CH. Efficacy of a smartphone app in enhancing medication adherence and accuracy in individuals with schizophrenia during the COVID-19 pandemic: randomized controlled trial. JMIR Ment Health. 2023;10:e50806.
Zhu Z, Roy D, Feng S, Vogler B. AI-based medication adherence prediction in patients with schizophrenia and attenuated psychotic disorders. Schizophr Res. 2025;275:42–51.
Jeon SM, Cho J, Lee DY, Kwon JW. Comparison of prediction methods for treatment continuation of antipsychotics in children and adolescents with schizophrenia. Evid Based Ment Health. 2022;25:e26–e33.
Pei X, Du X, Liu D, Li X, Wu Y. Nomogram model for predicting medication adherence in patients with various mental disorders based on the Dryad database. BMJ Open. 2024;14:e087312.
Dickson MC, Nguyen MM, Patel C, Grabich SC, Benson C, Cothran T, et al. Adherence, persistence, readmissions, and costs in Medicaid members with schizophrenia or schizoaffective disorder initiating paliperidone palmitate versus switching oral antipsychotics: a real-world retrospective investigation. Adv Ther. 2023;40:349–66.
Kim EY, Kim J, Jeong JH, Jang J, Kang N, Seo J, et al. Machine learning prediction model of the treatment response in schizophrenia reveals the importance of metabolic and subjective characteristics. Schizophr Res. 2025;275:146–55.
Wong TY, Luo H, Tang J, Moore TM, Gur RC, Suen YN, et al. Development of an individualized risk calculator of treatment resistance in patients with first-episode psychosis (TRipCal) using automated machine learning: a 12-year follow-up study with clozapine prescription as a proxy indicator. Transl Psychiatry. 2024;14:50.
Barruel D, Hilbey J, Charlet J, Chaumette B, Krebs MO, Dauriac-Le Masson V. Predicting treatment resistance in schizophrenia patients: machine learning highlights the role of early pathophysiologic features. Schizophr Res. 2024;270:1–10.
Podichetty JT, Silvola RM, Rodriguez-Romero V, Bergstrom RF, Vakilynejad M, Bies RR, et al. Application of machine learning to predict reduction in total PANSS score and enrich enrollment in schizophrenia clinical trials. Clin Transl Sci. 2021;14:1864–74.
Yee JY, Phua SX, See YM, Andiappan AK, Goh WWB, Lee J. Predicting antipsychotic responsiveness using a machine learning classifier trained on plasma levels of inflammatory markers in schizophrenia. Transl Psychiatry. 2025;15:51.
Vellucci L, Barone A, Buonaguro EF, Ciccarelli M, De Simone G, Iannotta F, et al. Severity of autism-related symptoms in treatment-resistant schizophrenia: associations with cognitive performance, psychosocial functioning, and neurological soft signs—clinical evidence and ROC analysis. J Psychiatr Res. 2025;185:119–29.
Mishra A, Maiti R, Jena M, Srinivasan A. Evaluating machine learning algorithms for prediction of treatment response for sleep disturbances in patients with schizophrenia: a post-hoc analysis from a randomized controlled trial. Psychiatr Danub. 2025;37:46–54.
Hieronymus F, Hieronymus M, Sjöstedt A, Nilsson S, Näslund J, Lisinski A, et al. Predicting remission in schizophrenia using machine learning—assessing the impact of sample size and predictor overinclusion. Acta Psychiatr Scand. 2025;152:441–50.
Wysokiński A, Dreczka J. Clozapine toxicity predictor: deep neural network model predicting clozapine toxicity and its therapeutic dose range. Psychiatry Res. 2024;342:116256.
Zhu X, Hu J, Xiao T, Huang S, Shang D, Wen Y. Integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients. Front Endocrinol. 2022;13:1011492.
Vidal N, Sedki M, Younès N, Bottemanne H, Roux P, Brunet-Gouet E. Neural network analysis of the contribution of psychotropic prescription sequences to the risk of non-psychiatric adverse events in bipolar and schizophrenia spectrum disorders. Front Digit Health. 2025;7:1633220.
Wu CS, Luedtke AR, Sadikova E, Tsai HJ, Liao SC, Liu CC, et al. Development and validation of a machine learning individualized treatment rule in first-episode schizophrenia. JAMA Netw Open. 2020;3:e1921660.
Zlatintsi A, Filntisis PP, Garoufis C, Efthymiou N, Maragos P, Menychtas A, et al. E-prevention: advanced support system for monitoring and relapse prevention in patients with psychotic disorders analyzing long-term multimodal data from wearables and video captures. Sensors (Basel). 2022;22:7544.
Wang X, Vouk N, Heaukulani C, Buddhika T, Martanto W, Lee J, et al. HOPES: an integrative digital phenotyping platform for data collection, monitoring, and machine learning. J Med Internet Res. 2021;23:e23984.
Adler DA, Ben-Zeev D, Tseng VWS, Kane JM, Brian R, Campbell AT, et al. Predicting early warning signs of psychotic relapse from passive sensing data: an approach using encoder-decoder neural networks. JMIR Mhealth Uhealth. 2020;8:e19962.
Brandt L, Ritter K, Schneider-Thoma J, Siafis S, Montag C, Ayrilmaz H, et al. Predicting psychotic relapse following randomised discontinuation of paliperidone in individuals with schizophrenia or schizoaffective disorder: an individual participant data analysis. Lancet Psychiatry. 2023;10:184–96.
Birnbaum ML, Kulkarni PP, Van Meter A, Chen V, Rizvi AF, Arenare E, et al. Utilizing machine learning on internet search activity to support the diagnostic process and relapse detection in young individuals with early psychosis: feasibility study. JMIR Ment Health. 2020;7:e19348.
Fond G, Bulzacka E, Boucekine M, Schürhoff F, Berna F, Godin O, et al. Machine learning for predicting psychotic relapse at 2 years in schizophrenia in the national FACE-SZ cohort. Prog Neuropsychopharmacol Biol Psychiatry. 2019;92:8–18.
Zhou J, Lamichhane B, Ben-Zeev D, Campbell A, Sano A. Predicting psychotic relapse in schizophrenia with mobile sensor data: routine cluster analysis. JMIR Mhealth Uhealth. 2022;10:e31006.
Cohen A, Naslund JA, Chang S, Nagendra S, Bhan A, Rozatkar A, et al. Relapse prediction in schizophrenia with smartphone digital phenotyping during COVID-19: a prospective, three-site, two-country, longitudinal study. Schizophrenia (Heidelb). 2023;9:6.
Yoo DW, Woo H, Nguyen VC, Birnbaum ML, Kruzan KP, Kim JG, et al. Patient perspectives on AI-driven predictions of schizophrenia relapses: understanding concerns and opportunities for self-care and treatment. In: Proc SIGCHI Conf Hum Factor Comput Syst. 2024;2024:702.
Góngora Alonso S, Herrera Montano I, de La Torre Díez I, Franco-Martín M, Amoon M, Román-Gallego JA, et al. Predictive modeling of hospital readmission of schizophrenic patients in a Spanish region combining particle swarm optimization and machine learning algorithms. Biomimetics (Basel). 2024;9:752.
Bao Y, Wang W, Liu Z, Wang W, Zhao X, Yu S, et al. Leveraging deep neural network and language models for predicting long-term hospitalization risk in schizophrenia. Schizophrenia (Heidelb). 2025;11:35.
Yu T, Zhang X, Liu X, Xu C, Deng C. The prediction and influential factors of violence in male schizophrenia patients with machine learning algorithms. Front Psychiatry. 2022;13:799899.
Mason AJC, Bhavsar V, Botelle R, Chandran D, Li L, Mascio A, et al. Applying neural network algorithms to ascertain reported experiences of violence in routine mental healthcare records and distributions of reports by diagnosis. Front Psychiatry. 2024;15:1181739.
Wang KZ, Bani-Fatemi A, Adanty C, Harripaul R, Griffiths J, Kolla N, et al. Prediction of physical violence in schizophrenia with machine learning algorithms. Psychiatry Res. 2020;289:112960.
Bernstorff M, Hansen L, Enevoldsen K, Damgaard J, Hæstrup F, Perfalk E, et al. Development and validation of a machine learning model for prediction of type 2 diabetes in patients with mental illness. Acta Psychiatr Scand. 2025;151:245–58.
Banerjee S, Lio P, Jones PB, Cardinal RN. A class-contrastive human-interpretable machine learning approach to predict mortality in severe mental illness. NPJ Schizophr. 2021;7:60.
Miley K, Bronstein MV, Ma S, Lee H, Green MF, Ventura J, et al. Trajectories and predictors of response to social cognition training in people with schizophrenia: a proof-of-concept machine learning study. Schizophr Res. 2024;266:92–99.
Hudon A, Beaudoin M, Phraxayavong K, Potvin S, Dumais A. Unsupervised machine learning driven analysis of verbatims of treatment-resistant schizophrenia patients having followed avatar therapy. J Pers Med. 2023;13:801.
Barbalat G, Plasse J, Chéreau-Boudet I, Gouache B, Legros-Lafarge E, Massoubre C, et al. Contribution of socio-demographic and clinical characteristics to predict initial referrals to psychosocial interventions in patients with serious mental illness. Epidemiol Psychiatr Sci. 2024;33:e2.
Lin B, Cecchi G, Bouneffouf D. Psychotherapy AI companion with reinforcement learning recommendations and interpretable policy dynamics. In: WWW '23 Companion: Companion Proceedings of the ACM Web Conference. New York, NY: ACM; 2023.
Just SA, Elvevåg B, Pandey S, Nenchev I, Bröcker AL, Montag C, et al. Moving beyond word error rate to evaluate automatic speech recognition in clinical samples: lessons from research into schizophrenia-spectrum disorders. Psychiatry Res. 2025;352:116690.
Just SA, Bröcker AL, Ryazanskaya G, Nenchev I, Schneider M, Bermpohl F, et al. Validation of natural language processing methods capturing semantic incoherence in the speech of patients with non-affective psychosis. Front Psychiatry. 2023;14:1208856.
May CR, Mair F, Finch T, MacFarlane A, Dowrick C, Treweek S, et al. Development of a theory of implementation and integration: normalization Process Theory. Implement Sci. 2009;4:29.
Greenhalgh T, Wherton J, Papoutsi C, Lynch J, Hughes G, Hinder S, et al. Beyond adoption: a new framework for theorizing and evaluating nonadoption, abandonment, and challenges to the scale-up, spread, and sustainability of health and care technologies. J Med Internet Res. 2017;19:e8775.
Wykes T, Bowie CR, Cella M. Thinking about the future of cognitive remediation therapy revisited: what is left to solve before patients have access? Schizophr Bull. 2024;50:993–1005.
Benjet C, Zainal NH, Albor Y, Alvis-Barranco L, Carrasco-Tapias N, Contreras-Ibáñez CC, et al. A precision treatment model for internet-delivered cognitive behavioral therapy for anxiety and depression among university students: a secondary analysis of a randomized clinical trial. JAMA Psychiatry. 2023;80:768–77.
Furukawa TA, Noma H, Tajika A, Toyomoto R, Sakata M, Luo Y, et al. Personalised & optimised therapy (POT) algorithm using five cognitive and behavioural skills for subthreshold depression. NPJ Digit Med. 2025;8:531.
Kollins SH, DeLoss DJ, Cañadas E, Lutz J, Findling RL, Keefe RSE, et al. A novel digital intervention for actively reducing severity of paediatric ADHD (STARS-ADHD): a randomised controlled trial. Lancet Digit Health. 2020;2:e168–e178.
Michie S, Van Stralen MM, West R. The behaviour change wheel: a new method for characterising and designing behaviour change interventions. Implement Sci. 2011;6:42.
Michie S, Richardson M, Johnston M, Abraham C, Francis J, Hardeman W, et al. The behavior change technique taxonomy (v1) of 93 hierarchically clustered techniques: building an international consensus for the reporting of behavior change interventions. Ann Behav Med. 2013;46:81–95.
Kappen TH, van Klei WA, van Wolfswinkel L, Kalkman CJ, Vergouwe Y, Moons KGM. Evaluating the impact of prediction models: lessons learned, challenges, and recommendations. Diagn Progn Res. 2018;2:11.
Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making. 2006;26:565–74.
Vickers AJ, Van Calster B, Steyerberg EW. Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests. BMJ. 2016;352:i6.
van Calster B, McLernon DJ, van Smeden M, Wynants L, Steyerberg EW; Topic Group 'Evaluating diagnostic tests and prediction models' of the STRATOS initiative. Calibration: the Achilles heel of predictive analytics. BMC Med. 2019;17:230.
Kompa B, Snoek J, Beam AL. Second opinion needed: communicating uncertainty in medical machine learning. NPJ Digit Med. 2021;4:4.
D’Amour A, Heller K, Moldovan D, Adlam B, Alipanahi B, Beutel A, et al. Underspecification presents challenges for credibility in modern machine learning. J Mach Learn Res. 2022;23:1–61.
Koch LM, Baumgartner CF, Berens P. Distribution shift detection for the postmarket surveillance of medical AI algorithms: a retrospective simulation study. NPJ Digit Med. 2024;7:120.
Angelopoulos AN, Bates S. Conformal prediction: a gentle introduction. Found Trends Mach Learn. 2023;16:494–591.
Swaminathan A, Lopez I, Wang W, Srivastava U, Tran E, Bhargava-Shah A, et al. Selective prediction for extracting unstructured clinical data. J Am Med Inform Assoc. 2023;31:188–97.
Tonekaboni S, Joshi S, McCradden MD, Goldenberg A. What clinicians want: contextualizing explainable machine learning for clinical end use. In: Proceedings of Machine Learning Research. Rochester, MN: Machine Learning for Healthcare; 2019.
Reddy S. Explainability and artificial intelligence in medicine. Lancet Digit Health. 2022;4:e214–e215.
Gottesman O, Johansson F, Komorowski M, Faisal A, Sontag D, Doshi-Velez F, et al. Guidelines for reinforcement learning in healthcare. Nat Med. 2019;25:16–18.
Hager P, Jungmann F, Holland R, Bhagat K, Hubrecht I, Knauer M, et al. Evaluation and mitigation of the limitations of large language models in clinical decision-making. Nat Med. 2024;30:2613–22.
Moor M, Banerjee O, Abad ZSH, Krumholz HM, Leskovec J, Topol EJ, et al. Foundation models for generalist medical artificial intelligence. Nature. 2023;616:259–65.
Adler-Milstein J, Mehrotra A. Paying for digital health care—problems with the fee-for-service system. N Engl J Med. 2021;385:871–3.
Lozano E, Meza SF, Alexander A, Bonilla P, Jaramillo W. Remote patient monitoring (RPM). In: Davis M, Kirwan M, Maclay W, Pappas H, editors. Closing the Care Gap with Wearable Devices: Innovating Healthcare with Wearable Patient Monitoring. New York, NY: Productivity Press; 2022.
Rodriguez JA, Shachar C, Bates DW. Digital inclusion as health care—supporting health care equity with digital-infrastructure initiatives. N Engl J Med. 2022;386:1101–3.
Glasgow RE, Harden SM, Gaglio B, Rabin B, Smith ML, Porter GC, et al. RE-AIM planning and evaluation framework: adapting to new science and practice with a 20-year review. Front Public Health. 2019;7:64.
Loudon K, Treweek S, Sullivan F, Donnan P, Thorpe KE, Zwarenstein M. The PRECIS-2 tool: designing trials that are fit for purpose. BMJ. 2015;350:h2147.
Ford I, Norrie J. Pragmatic trials. N Engl J Med. 2016;375:454–63.
Slade M, Leamy M, Bacon F, Janosik M, Le Boutillier C, Williams J, et al. International differences in understanding recovery: systematic review. Epidemiol Psychiatr Sci. 2012;21:353–64.
Murwasuminar B, Munro I, Recoche K. Mental health recovery for people with schizophrenia in Southeast Asia: a systematic review. J Psychiatr Ment Health Nurs. 2023;30:620–36. https://doi.org/10.1111/jpm.12902.
Collins GS, Dhiman P, Andaur Navarro CLA, Ma J, Hooft L, Reitsma JB, et al. Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence. BMJ Open. 2021;11:e048008.
Moons KGM, Damen JAA, Kaul T, Hooft L, Andaur Navarro C, Dhiman P, et al. PROBAST+AI: an updated quality, risk of bias, and applicability assessment tool for prediction models using regression or artificial intelligence methods. BMJ. 2025;388:e082505.
Eisner E, Berry N, Bucci S. Digital tools to support mental health: a survey study in psychosis. BMC Psychiatry. 2023;23:726.
Yang LH, Anglin DM, Wonpat-Borja AJ, Opler MG, Greenspoon M, Corcoran CM. Public stigma associated with psychosis risk syndrome in a college population: implications for peer intervention. Psychiatr Serv. 2013;64:284–8.
Corrigan PW, Druss BG, Perlick DA. The impact of mental illness stigma on seeking and participating in mental health care. Psychol Sci Public Interest. 2014;15:37–70.
Burns T, Rugkåsa J, Molodynski A, Dawson J, Yeeles K, Vazquez-Montes M, et al. Community treatment orders for patients with psychosis (OCTET): a randomised controlled trial. Lancet. 2013;381:1627–33.
National Institute for Health and Care Excellence. Transition between Inpatient Mental Health Settings and Community or Care Home Settings. NICE Guideline NG53; 2016.
Smith M, Saunders R, Stuckhardt L, McGinnis JM, editors; Committee on the Learning Health Care System in America, Institute of Medicine. Best Care at Lower Cost: The Path to Continuously Learning Health Care in America. Washington, DC: National Academies Press; 2013.
Pereira CVF, de Oliveira EM, de Souza AD. Machine learning applied to edge computing and wearable devices for healthcare: systematic mapping of the literature. Sensors. 2024;24:6322.
Fiske A, Radhuber IM, Willem T, Buyx A, Celi LA, McLennan S. Climate change and health: the next challenge of ethical AI. Lancet Glob Health. 2025;13:e1314–e1320.
Lokmic-Tomkins Z, Davies S, Block LJ, Cochrane L, Dorin A, Von Gerich H, et al. Assessing the carbon footprint of digital health interventions: a scoping review. J Am Med Inform Assoc. 2022;29:2128–39.
National Institute for Health and Care Excellence. Evidence Standards Framework for Digital Health Technologies. London, UK: National Institute for Health and Care Excellence; 2019.
Wenderott K, Krups J, Zaruchas F, Weigl M. Effects of artificial intelligence implementation on efficiency in medical imaging—a systematic literature review and meta-analysis. NPJ Digit Med. 2024;7:265.
Acknowledgements
We thank the clinical experts from the Rehabilitation Department of Shanghai Mental Health Center and the nursing staff from several mental rehabilitation management communities in Shanghai for their invaluable guidance and support throughout this project. This work was supported by the 2024 Shanghai Jiao Tong University Key Program for Interdisciplinary Research in Medicine and Engineering (Project No. YG2024ZD24); the 2024 Shanghai “Science and Technology Innovation Action Plan” Medical Innovation Research Special Project (Key Project Sub-Project) (Project No. 24Y22800502); and the Shanghai Jiao Tong University Interdisciplinary Program for Medicine and Engineering Youth Program (Project No. YG2025QNA11). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Author information
Contributions
HY: Conceptualization, Data curation, Investigation, Formal analysis, Visualization, Writing–original draft, Writing–review and editing. ZL: Investigation, Data curation, Project administration, and Funding acquisition. FM and FC: Validation, Writing–review and editing. WZ and JC: Funding acquisition, Methodology, Supervision, and Validation.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Yang, H., Chang, F., Muroi, F. et al. Application of artificial intelligence in schizophrenia rehabilitation management: a systematic scoping review. Transl Psychiatry 16, 180 (2026). https://doi.org/10.1038/s41398-026-03872-3