npj Digital Medicine
Uncertainty modeling in multimodal speech analysis across the psychosis spectrum
  • Article
  • Open access
  • Published: 23 January 2026

  • Morteza Rohanian1,2,
  • Roya Hüppi3,
  • Farhad Nooralahzadeh1,
  • Noemi Dannecker3,
  • Yves Pauli3,
  • Werner Surbeck3,
  • Iris Sommer4,
  • Wolfram Hinzen5,6,
  • Nicolas Langer7,
  • Michael Krauthammer1,2 &
  • Philipp Homan3,8 

npj Digital Medicine (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Medical research
  • Neuroscience
  • Psychology

Abstract

Speech provides a rich behavioral signal of psychosis, yet its diagnostic use remains limited because speech patterns vary widely across individuals and contexts. We model this variability as uncertainty, capturing how consistently speech features indicate symptom expression. We introduce a multimodal model that integrates acoustic and linguistic information to predict symptom severity and psychosis-related traits across the spectrum, from high schizotypy to clinical psychosis. By estimating uncertainty for each modality, the model learns when to rely on specific signals, adapting to speech quality and task context to improve accuracy and interpretability. Using speech from 114 participants (32 with early psychosis and 82 with low or high schizotypy), recorded in German across structured and narrative tasks, the model achieved an F1-score of 83% with an expected calibration error (ECE) of 0.045, demonstrating robust and well-calibrated performance. Uncertainty estimation further revealed which speech markers most reliably indicated symptoms, including pitch variability, fluency disruptions, and spectral instability.
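The abstract reports calibration as an expected calibration error (ECE). As a rough illustration of what that number measures, the sketch below computes the standard equal-width binned ECE: predictions are grouped by confidence, and the gap between accuracy and mean confidence in each bin is averaged, weighted by bin size. The function name, the choice of 10 bins, and the toy data are assumptions made for illustration; the abstract does not state how the authors computed ECE.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumed bin count; not taken from the article
) -> float:
    """Binned ECE: size-weighted mean of |accuracy - mean confidence| per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Per-bin running sums of confidence, correct predictions, and counts.
    conf_sum = [0.0] * n_bins
    hit_sum = [0.0] * n_bins
    count = [0] * n_bins
    for c, ok in zip(confidences, correct):
        # Bin i covers [i/n_bins, (i+1)/n_bins); a confidence of 1.0 goes in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hit_sum[idx] += 1.0 if ok else 0.0
        count[idx] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        accuracy = hit_sum[b] / count[b]
        mean_conf = conf_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - mean_conf)
    return ece


# Toy example (hypothetical data, not from the study).
toy_confidences = [0.95, 0.85, 0.75, 0.65]
toy_correct = [True, True, False, True]
toy_ece = expected_calibration_error(toy_confidences, toy_correct)
```

A value near 0 means predicted confidence tracks observed accuracy closely, which is the sense in which a low ECE is described as well calibrated.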

Data availability

Data used to support these findings are available from the corresponding author upon reasonable request.

Acknowledgements

We are grateful to all participants for their contributions. We also thank Anna Steiner, Linus Hany, and Ueli Stocker for their help with data collection. This work was supported by the European Union (GA 101080251—TRUSTING) and by the Swiss National Science Foundation (POZHP1_191938/1).

Author information

Authors and Affiliations

  1. Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland

    Morteza Rohanian, Farhad Nooralahzadeh & Michael Krauthammer

  2. ETH AI Center, Zurich, Switzerland

    Morteza Rohanian & Michael Krauthammer

  3. Department of Adult Psychiatry and Psychotherapy, University of Zurich, Zurich, Switzerland

    Roya Hüppi, Noemi Dannecker, Yves Pauli, Werner Surbeck & Philipp Homan

  4. Department of Neuroscience, University Medical Center Groningen, Groningen, Netherlands

    Iris Sommer

  5. Department of Translation and Language Sciences, Universitat Pompeu Fabra, Barcelona, Spain

    Wolfram Hinzen

  6. Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain

    Wolfram Hinzen

  7. Department of Psychology, University of Zurich, Zurich, Switzerland

    Nicolas Langer

  8. Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland

    Philipp Homan

Contributions

M.R., R.H., and P.H. had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. M.R., R.H., F.N., N.D., Y.P., W.S., I.S., W.H., N.L., M.K., and P.H. made substantial contributions to the conception, design, and analysis of the work, as well as to the drafting and final approval of the manuscript.

Corresponding author

Correspondence to Morteza Rohanian.

Ethics declarations

Competing interests

P.H. has received grants and honoraria from Novartis, Lundbeck, Takeda, Mepha, Janssen, Boehringer Ingelheim, Neurolite and OM Pharma outside of this work. No other conflicts of interest were reported.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Rohanian, M., Hüppi, R., Nooralahzadeh, F. et al. Uncertainty modeling in multimodal speech analysis across the psychosis spectrum. npj Digit. Med. (2026). https://doi.org/10.1038/s41746-025-02309-3

  • Received: 18 September 2025

  • Accepted: 21 December 2025

  • Published: 23 January 2026

  • DOI: https://doi.org/10.1038/s41746-025-02309-3
