Abstract
Symptoms of schizophrenia are often reflected in patients’ speech. Natural language processing (NLP) approaches enable quantitative assessment of language-related symptoms in schizophrenia. Previous applications have primarily focused on acute psychopathology or on predicting the onset or relapse of psychosis rather than on treatment-related improvements. Although electronic health records (EHRs) contain rich longitudinal data, their unstructured notes hinder structured quantification. We applied recent large language models (LLMs) to evaluate symptoms based on speech content recorded in EHRs. We analyzed 5,275 clinical notes from 30 patients with treatment-resistant schizophrenia undergoing clozapine treatment. Three state-of-the-art LLMs rated the notes according to the Brief Psychiatric Rating Scale (BPRS). Complementary analyses included part-of-speech (POS), bag-of-words (BoW), bigram, and Linguistic Inquiry and Word Count (LIWC) analyses. LLM-based BPRS ratings revealed significant decreases in Anxiety, Conceptual Disorganization, Suspiciousness, Unusual Thought Content, Hallucinatory Behavior, and Depressive Mood during clozapine treatment. POS analysis indicated an increased use of adjectives per sentence, while LIWC analysis revealed more positive emotional expressions during the later phase of treatment. These findings demonstrate that LLMs can extract clinically meaningful symptom information from unstructured clinical text and capture treatment-related changes in psychosis. This approach promises a low-burden method for supporting clinical judgment using routinely collected EHR data.
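As an illustration of the bag-of-words and bigram analyses named in the abstract, the following is a minimal sketch, not the authors' pipeline. It assumes notes have already been split into tokens (Japanese text would first need a morphological analyzer, which is omitted here). The function names, the early/late split, and the example notes are all hypothetical and are not study data.

```python
from collections import Counter
from itertools import chain


def bag_of_words(notes):
    """Count token frequencies over a list of tokenized notes."""
    return Counter(chain.from_iterable(notes))


def bigrams(notes):
    """Count adjacent token pairs within each note (pairs never span notes)."""
    counts = Counter()
    for tokens in notes:
        counts.update(zip(tokens, tokens[1:]))
    return counts


def relative_frequency(counts):
    """Normalize raw counts to proportions so phases of different size compare fairly."""
    total = sum(counts.values())
    return {item: n / total for item, n in counts.items()} if total else {}


# Hypothetical, already-tokenized notes (illustrative only, not study data).
early_notes = [["voices", "are", "loud"], ["i", "feel", "afraid"]]
late_notes = [["i", "feel", "calm"], ["i", "feel", "good", "today"]]

early_bow = bag_of_words(early_notes)
late_bow = bag_of_words(late_notes)
late_bigrams = bigrams(late_notes)
late_rel = relative_frequency(late_bow)
```

Normalizing to relative frequencies is a design choice of this sketch: treatment phases may contain different numbers of notes, so raw counts alone could reflect note volume rather than a change in word use.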
Data availability
EHR data are available only after approval by the local ethics committee, owing to privacy protection under the Act on the Protection of Personal Information in Japan. To request access to the data, please contact the corresponding author ([yosuke.morishima@unibe.ch](mailto:yosuke.morishima@unibe.ch)). Upon publication, the code used for the LLM-based BPRS rating will be made available at: [https://github.com/ymorishi/bprs\_ja](https://github.com/ymorishi/bprs_ja).
Acknowledgements
We thank the Hospital Medical Information Systems Section for extracting EHR data.
Funding
This work is supported by JSPS KAKENHI Grant Numbers 22K07589 (TK) and 25K19067 (KT), a research grant from the SENSHIN Medical Research Foundation (KN), and the Swiss National Science Foundation (YM, 32003B_192623).
Author information
Authors and Affiliations
Contributions
Misa Matsumura: Investigation, Formal analysis, Writing - Original Draft; Keiichiro Nishida: Data curation, Writing - Original Draft; Katsunori Toyoda: Investigation, Data curation; Kaori Kadoyama: Conceptualization, Data curation; Ryoichi Yano: Conceptualization, Data curation; Tetsufumi Kanazawa: Investigation, Supervision; Toshiaki Nakamura: Conceptualization, Data curation, Supervision, Writing - Review & Editing; Yosuke Morishima: Methodology, Software, Formal analysis, Writing - Original Draft, Writing - Review & Editing, Supervision.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Matsumura, M., Nishida, K., Toyoda, K. et al. Quantifying improvement of psychotic symptoms in clozapine-treated schizophrenia: clinical note analysis with large language models. Sci Rep (2026). https://doi.org/10.1038/s41598-026-39676-0
DOI: https://doi.org/10.1038/s41598-026-39676-0


