Introduction

Physical Restraint (PR) is a coercive and emergency approach that involves reducing a person’s physical movement to ensure safety and maintain necessary treatment in case of life-threatening risks, directed either at the self or at others, according to the clinician’s judgment1,2. At the European level, the prevalence of PR use on admitted patients in mental health contexts is around 3.8 to 20%3; in Italy, it ranges from 4 to 22% in general hospitals and from 2 to 83% in nursing homes, while in general hospital psychiatric wards (Servizi Psichiatrici di Diagnosi e Cura, SPDC) the reported frequency is 14.1%4 (11.6% in the Lombardy region5). Resorting to PR is always an extraordinary measure taken only when all other strategies have proven ineffective. The application of PR is strictly regulated by national and international protocols requiring the constant monitoring and reassessment of the person who is restrained, as well as the termination of PR as soon as the imminent danger for the patient and others ceases to exist3,4,5. Nonetheless, PR is associated with several adverse events6 and raises ethical, legal, and medical concerns. Consequently, PR has been the subject of extensive research, addressing its clinical and legal aspects and analyzing the subjective experiences of patients and the involved clinical staff7,8,9.

Studies on PR experiences in mental health settings are mainly based on qualitative approaches such as content and narrative analysis of semi-structured interviews10,11,12. Overall, findings point to PR as a negative experience, generating emotions like fear, powerlessness, and helplessness13, together with anger, humiliation, and distress14, resulting in posttraumatic stress disorder (PTSD)-like symptoms after discharge8, traumatic memories for both patients and the nurses carrying out the procedure15, as well as re-enactment of past traumas11. However, these methods are time-consuming, prone to bias, and somewhat limited in disclosing the multitude of facets characterizing the experience of PR. For instance, patients might avoid sharing some aspects of the experience explicitly as they learn to avoid verbal behaviors that might lead to consequences perceived as punitive16, preventing us from fully grasping important aspects of the PR experience.

A fruitful approach to describing individual subjective experiences is based on the analysis of language produced by the patients in narrating critical events. Language analysis can rely on a range of different techniques. These include Natural Language Processing (NLP), a branch of Artificial Intelligence technology that comprises tools and procedures to automatically or semi-automatically extract linguistic information from texts17,18, from the grammatical to the lexical/semantic level. NLP has proven to be particularly useful in overcoming the limits of conventional qualitative analysis, given its ability to quickly process large amounts of data and to capture subtle elements that might go unnoticed by human analysts19,20. Another useful technique is sentiment analysis, a subset of NLP that focuses on extracting affective features and emotional valence through distributional21 or lexicon-based approaches22. Other, non-automated approaches to language might help outline a more articulated picture of the subjective experience, and these include metaphor analysis23,24, a finer-grained method focused on identifying and classifying non-literal expressions, primarily based on manual annotation. All these methods have proved effective in describing different facets of the subjective experience in clinical and psychiatric contexts but have never before been applied to study PR in the psychiatric setting.

More specifically, previous NLP studies on the narrative production of people describing their experiences in clinical settings have proven such tools’ utility in highlighting the discourse’s structural and lexical/semantic characteristics and linking them to symptoms. For instance, among people living with schizophrenia, those with poor functional and clinical outcomes tend to display relatively simple syntactic structures and reduced fluency25,26, as well as lower lexical richness throughout all stages of illness27. Specific word classes might show abnormal patterns of use in association with specific conditions, such as pronouns in schizophrenia28 and depression29, as well as negation in the narratives describing psychological trauma30 and physical or psychosocial pain31. At the semantic level, studies have investigated the use of words related to cognitive, affective, and social domains, reporting, for instance, a consistent increase of some of these categories (i.e., cognitive words) in relation to better outcomes in PTSD32,33. Finally, studies investigating semantic features of words such as valence, arousal, and dominance34 have been useful in highlighting the unpleasant and controlling nature of symptoms such as hallucinations35 and the distressing content of PTSD memories36.

Beyond the focus on structural and lexical aspects, NLP via sentiment analysis has proved useful in defining the emotional stance (the overall positive or negative disposition toward some object) and the specific emotions (e.g., joy, fear, anger, etc.) of patients’ reported experiences. Sentiment analysis has been widely adopted to analyze a vast range of personal experiences in different pathological conditions37,38. For instance, the analysis of reports about past events generated by people with psychiatric disorders found a higher use of negative words compared to narratives about present episodes and that these latter were characterized by less intense negative emotions39, reflecting the potential of language analysis for capturing the temporal evolution of the participant’s personal narratives. Finally, sentiment analysis can be profitably applied to online patients’ reviews of physicians which, despite the self-selection bias, might inform on user satisfaction in response to treatment40.

Metaphor analysis represents a complementary source of information besides standard questionnaires and surveys, as it enables the collection and analysis of narratives where participants are free to express their stances. Across different clinical conditions and experiences, it was found that metaphors used to talk about illnesses are more frequently employed to narrate negative experiences24. Patients suffering from various diseases, including cancer23 and genetic syndromes41, use metaphors when describing their experience of living with a disease, with the most common source domains being the journey (e.g., navigating the illness road) and war (e.g., fighting against the illness)23. Previous metaphor analysis studies on narratives generated by patients suggest that the production of metaphors seems to be a common feature of people who go through emotionally intense experiences such as early pregnancy termination24, posttraumatic stress disorder42, and depression43. It was shown that identifying the metaphors used by patients and re-structuring and contextualizing them in a structured discussion about medical treatment can improve communication24.

The overarching aim of the present study was to deepen our understanding of the subjective experience of mental health users with respect to PR and to ascertain whether a finer-grained analysis of language might disclose new facets and emotional aspects with possible clinical implications, overcoming the limitations of qualitative analysis. We assumed, indeed, that the structural characteristics of the language employed by patients who experienced PR and the features of their discourse in terms of lexical-semantic choices and expressed stance could help clinicians get insights into the experience from the patient’s point of view, with the ultimate aim of supporting health care decisions.

To do so, we analyzed a corpus of 99 written narratives (Sect. 4.2) about the experience of PR collected across seven mental health services in northern Italy with a combination of NLP (Sect. 4.3 and 4.4) and manual methods (Sect. 4.5) to ensure a multilayered analysis. We extracted linguistic measures (related to fluency and parts of speech, as well as to the semantic characteristics of words) and applied sentiment and emotions analysis by automatically assessing dominant emotions in each sentence. Furthermore, we performed a metaphor analysis, identifying and classifying metaphorical uses. Each level of analysis was compared to a reference corpus of online reviews written by people with psychiatric disorders regarding their non-PR hospital experience (Sect. 4.2). Finally, we correlated the linguistic, sentiment, and emotion measures derived from the narratives of the individuals who experienced PR with the cumulative length of the PR events as well as the scores of the Secluded/Restrained Patients’ Perceptions of their Treatment (SR-PPT) questionnaire44 administered to the same cohort. This questionnaire is a standardized measure intended to capture the patient’s evaluation of the PR treatment and was used here to ground language variables into a more structured subjective evaluation of the PR treatment.

Although the novelty of the study prevents us from drawing clear predictions, we sketched some general hypotheses. First, we hypothesized that inpatients undergoing PR would present overall more impoverished language than outpatients in the reference group, since it is clinically expected that inpatients undergoing PR would present a worse psychopathological profile than highly functioning, possibly fully remitted, outpatients25,45. Specifically, we expected that narratives in the PR corpus would show reduced grammatical complexity and lexical density compared to the reference corpus of online reviews. Second, given the traumatic nature of the PR experience emerging from previous qualitative research8,11,13,14, we expected a more emotionally loaded lexicon in PR narratives. We also expected the overall sentiment expressed in PR narratives to be significantly more negative8,11,13,14 than in the reference corpus. However, based on research highlighting high interindividual variability in terms of acceptance of the PR approach8,46, we hypothesized that positive emotions might also emerge when analyzing narratives with more sophisticated instruments than narrative approaches. Finally, given that metaphor use is known to be specifically linked with the expression of negative emotions24, we expected the use of non-literal language to be higher in the PR corpus compared to the reference corpus.

Results

Sample

The study sample included 99 participants (37 F, mean age 37 ± 15 years) with different diagnoses classified based on the current diagnostic reference system for the Italian psychiatric services (International Classification of Diseases, Ninth Revision, ICD-9): primary diagnoses included schizophrenia and other psychosis (N = 35), affective disorders (N = 22), personality disorders (N = 18), substance abuse disorders (N = 12), and others (N = 12, including intellectual disabilities, dementia, neurotic syndrome, and neurocognitive disorder). Co-morbidities with other psychiatric disorders were not excluded. Thirty-four (34%) participants had at least one prior experience of PR, while 65 (66%) participants were at their first PR experience. During the hospital stay, the mean number of PR events was 1.63 ± 2.75 per participant, and their mean cumulative length was 12 h:50 min ± 10 h:12 min. In some cases (N = 20), the length of PR includes multiple restraint events (range 1–27, median 1), which were the result of repeated attempts to interrupt the restraint, followed by a re-escalation of the violent behavior.

Linguistic analysis

Results of the linguistic analysis for the level of fluency and Part-Of-Speech (POS) tags are reported in Table 1. Patients who experienced PR produced significantly shorter narratives, with shorter sentences, fewer sentences per narrative, and lower lexical density than the reference corpus. No differences were observed in the log frequency of words.
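As a rough illustration of how fluency measures of this kind can be computed, the following Python sketch counts words and sentences in a single text and derives the mean sentence length and a type-token ratio (TTR). The regular-expression tokenizer, the naive sentence splitter, the function name `fluency_measures`, and the sample text are assumptions made for illustration only; they do not reproduce the NLP pipeline used in the study (Sect. 4.3), and TTR is used here as one common proxy for lexical variety.

```python
import re


def fluency_measures(text):
    """Return simple fluency measures for one narrative (illustrative only)."""
    # Naive sentence split on runs of sentence-final punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased word tokens; a real pipeline would use a proper tokenizer.
    tokens = re.findall(r"\w+", text.lower())
    n_words = len(tokens)
    n_sentences = len(sentences)
    return {
        "n_words": n_words,
        "n_sentences": n_sentences,
        "words_per_sentence": n_words / n_sentences if n_sentences else 0.0,
        # Type-token ratio: distinct word forms divided by total words.
        "ttr": len(set(tokens)) / n_words if n_words else 0.0,
    }


# Invented sample text, not taken from either corpus.
sample = "I was tired. I slept. I was tired again."
print(fluency_measures(sample))
```

Because TTR tends to decrease as texts get longer, comparisons between corpora of very different lengths would normally call for a length-aware variant; the sketch does not attempt this.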

The frequency of each POS tag was significantly different across the two corpora. Specifically, PR narratives were characterized by a lower frequency of adjectives, adpositions, coordinating conjunctions, determiners, nouns, proper nouns, and punctuation compared to the reference corpus. In contrast, adverbs, auxiliaries, pronouns, subordinating conjunctions, and verbs were more frequent in PR narratives than in the reference corpus. Further analysis of pronouns through LIWC revealed that patients experiencing PR tended to use first-person pronouns more frequently than the patients in the reference corpus (5.12 vs. 2.80%), a finding also reflected in the greater use of verbs in the first-person singular (5.97 vs. 3.49%). Negation operators represented 2% of the overall tokens in the PR narratives corpus, and 31.30% of all verbs occurred in negation contexts (15% when considering the 20 most frequently used verbs). In the reference corpus, negation operators represented 7% of the whole corpus, and 17% of verbs occurred in negated contexts (dropping to 0% when considering the 20 most frequent verbs).

Table 1 Fluency measures and POS tags in the two corpora, with statistical comparison.

A summary of the semantic features of the two corpora is reported in Table 2. Narratives of PR patients were characterized by a higher use of words describing Affective and Cognitive processes (but a lower use of words indicating Social processes), a greater use of Informal language, and words reflecting Orientation in space and time. No differences were found in the use of words denoting Biological processes. Furthermore, words in PR narratives were more concrete and imaginable and expressed significantly lower Valence and Dominance but higher Arousal than the reference corpus.

Table 2 Semantic features of the two corpora, with statistical comparison.

Sentiment and emotion analysis

The complete results of basic and fine-grained emotion analysis are reported in Table 3 and depicted in Fig. 1.

The overall Sentiment expressed by the two corpora was significantly different, with the PR corpus being characterized by significantly lower Positive Sentiment and significantly higher Negative Sentiment than the reference corpus.

Concerning the analysis of basic emotions between the two corpora, PR narratives were characterized by significantly higher Sadness, higher Anger, and significantly lower Joy, compared to the reference corpus. No significant differences between the two corpora emerged for Fear.

Within each corpus, emotions were distributed differently: in the PR corpus, the emotion of Sadness was significantly different compared to Anger (U(99) = 2,609.5, p < .001). However, Anger appeared to be no differently represented compared to Fear (U(99) = 1,700.5, p = .313). At the same time, the latter was significantly different from Joy (U(99) = 1,736.5, p = .018). In contrast, in the reference corpus, the dominant emotion was Joy, which was significantly different compared to Sadness (U(148) = 6,301.0, p < .001), with the latter significantly different from Fear (U(148) = 5,119.5, p < .001), which in turn was significantly different from Anger (U(148) = 3,593.5, p = .046).

The fine-grained analysis of emotion confirmed the results of the basic emotion analysis, with a higher representation of negatively oriented rather than positively oriented emotions. PR narratives were characterized by significantly higher levels of Fear, Sadness, Disgust, Anger, Confusion, Surprise, and Disappointment. Conversely, the reference corpus was marked by higher use of utterances to express Admiration, Approval, Gratitude, Joy, Caring, and Neutral attitude.

Table 3 Outcome of the sentiment and emotion analysis in the two corpora, with statistical comparison.
Fig. 1

Visual representation of the results of basic and fine-grained emotion analysis in PR narratives and reference corpus. On top, Plutchik’s Wheels47 for basic emotions derived from FEEL-IT are presented for PR narratives (panel a) and reference corpus (panel b), respectively. In the wheel, each emotion is represented by a petal, and the length of the petal represents the percentage of utterances per narrative where each emotion was dominant. On the bottom, polar histograms of fine-grained emotions derived from EmoRoBERTa are presented for PR narratives (panel c) and in the reference corpus (panel d), respectively. In the histogram, each fine-grained emotion is represented by a bin, whereby height represents the percentage of utterances per narrative where each emotion was dominant. The histograms exclude Neutral values (see Table 3).
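The quantity plotted in Fig. 1 (the percentage of utterances in a narrative for which each emotion is dominant) can be sketched in a few lines of Python. The sketch assumes that a classifier has already assigned one dominant label to each utterance; the classification step itself (FEEL-IT or EmoRoBERTa) is not reproduced, and the function name and the example labels are hypothetical.

```python
from collections import Counter


def dominant_emotion_percentages(utterance_labels):
    """Percentage of utterances in one narrative per dominant emotion label."""
    total = len(utterance_labels)
    if total == 0:
        return {}
    counts = Counter(utterance_labels)
    return {emotion: 100.0 * n / total for emotion, n in counts.items()}


# Hypothetical labels for a four-utterance narrative (not real data).
labels = ["sadness", "sadness", "anger", "fear"]
print(dominant_emotion_percentages(labels))
```

Emotions that never occur as the dominant label are simply absent from the returned dictionary, so a caller comparing narratives would need to treat missing keys as 0%.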

Metaphor analysis

In the PR corpus, we identified fifty-four (54) metaphors produced by 31 out of 99 participants, with nearly one-third (31%) of participants producing at least one metaphor; among these participants, the number of metaphors ranged from 1 to 6 (M = 1.74, SD = 1.24, median = 1). Conversely, in the reference corpus, we identified forty-two (42) metaphors produced by 28 out of 148 participants, with approximately one-fifth (19%) of participants producing at least one metaphor; among these participants, the number of metaphors ranged from 1 to 4 (M = 1.50, SD = 0.73, median = 1). A Wilcoxon rank-sum test confirmed that participants in the PR group produced more metaphors than participants in the reference group (W = 8,391, p < .01).
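For readers unfamiliar with the rank-sum comparison reported above, the following Python sketch computes the Mann-Whitney U statistic (equivalent to the Wilcoxon rank-sum statistic) with midranks for ties and a two-sided p-value from the normal approximation with tie correction. It is a generic textbook formulation written under the assumption that no continuity correction is applied; it is not the software used in the study, and the per-participant counts in the example are invented.

```python
import math


def rank_sum_test(x, y):
    """Mann-Whitney U for sample x and a two-sided normal-approximation p-value."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t  # accumulates the tie correction
        i = j
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = (u1 - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u1, p


# Invented metaphor counts per participant (not the study data).
group_a = [0, 0, 1, 2]
group_b = [0, 0, 0, 1]
print(rank_sum_test(group_a, group_b))
```

Count data of this kind contain many ties (most participants produce zero metaphors), which is why the sketch uses midranks and the tie-corrected variance rather than the simpler untied formula.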

Table 4 presents the descriptive outcome of the metaphor analysis in the two corpora, with examples from the PR narratives. Most metaphors in PR narratives were negatively connoted, and a slight majority were creative. Conversely, in reference narratives, the positive and negative sentiments of the metaphors were equally distributed, and most were non-creative. In PR narratives, the most common (above 15%) source domains were related to Animal Kingdom, Religion/History, Objects, and War/Prison. Conversely, in the reference group, the most common source domains were related to Objects and Religion/History.

Table 4 Outcome of the metaphor analysis in the two corpora, with examples of figurative expressions.

Length of PR, SR-PPT questionnaire, and correlations with language variables

Survey scores are shown in Table 5. Patients’ opinions on the “Cooperation with staff” subscale varied widely, with scores ranging from 0 to 10 for each item. Items 8 and 1 scored the highest, whereas items 6 and 5 scored the lowest. On the “Perception of seclusion and restraint” subscale, patients’ ratings also ranged from 0 to 10 on each item, with item 10 scoring the highest both within the subscale and across the whole questionnaire.

Table 5 Results of the Secluded/Restrained patients’ perceptions of their treatment (SR-PPT) questionnaire.

Correlations between SR-PPT scores, Length of PR, and language variables are presented in Fig. 2.

Concerning the total SR-PPT score, we found a weak positive correlation with the TTR and a weak negative correlation with fluency variables (number of words and the SPN) and number of verbs, indicating that higher lexical density and lower verbosity (with fewer verbs) were related to an overall better PR experience. The SR-PPT score was also negatively correlated with the use of informal language and words related to the social domain and the expression of Anger. This means that the more positive the evaluation of the PR experience, the lower the expression of anger, the use of informal language, and the references to the social domain in the PR narratives.

The Cooperation score exhibited the same pattern as the total score, except for the correlation with words from the social domain. The pattern indicates that higher lexical density, lower verbosity (with fewer verbs), reduced expression of Anger, and use of informal language were associated with better evaluation of the cooperation between the patient and the staff.

The Perception score exhibited the same pattern as the total score, except for the correlation with TTR. The pattern indicates that lower verbosity and reduced use of verbs, as well as reduced expression of Anger, fewer slang words, and less reference to the social domain, were related to a greater acknowledgment of the necessity and benefits of the treatment.

As for the Length of PR, we found a weak positive correlation with fluency variables (number of words and number of sentences), indicating that the longer the PR experience, the more fluent the PR narratives.
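The correlations described above are Spearman rank correlations (see the caption of Fig. 2). As a minimal sketch of that computation, the Python code below converts each variable to midranks and then takes the Pearson correlation of the ranks. The function names and the example values are illustrative assumptions rather than the study's code or data, and no significance test is included.

```python
import math


def midranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        shared = (i + 1 + j) / 2.0
        for k in range(i, j):
            ranks[order[k]] = shared
        i = j
    return ranks


def pearson(a, b):
    """Pearson correlation; returns NaN if either variable is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman's rho as the Pearson correlation of the midranks."""
    return pearson(midranks(a), midranks(b))


# Invented questionnaire totals and word counts (not the study data).
scores = [12, 30, 45, 51, 70]
n_words = [210, 150, 160, 90, 60]
print(round(spearman(scores, n_words), 3))
```

Working on ranks makes the coefficient insensitive to the scale and distribution of each variable, which suits bounded questionnaire scores and skewed count variables such as narrative length.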

Fig. 2

Correlogram depicting the associations between the Secluded/Restrained Patients’ Perceptions of their Treatment (SR-PPT) questionnaire, Length of PR, and language variables. The correlation matrix displays Spearman correlation coefficients for the relationships involving the Total SR-PPT score (first row), Cooperation score (second row), Perception score (third row), and Length of PR (fourth row), along with fluency, selected POS classes, semantic, sentiment, and basic emotions variables. Significant correlations are denoted by asterisks, where ** represents p < 0.01 and *** p < 0.001.

Discussion

In this study, we performed a multilayered language analysis of the narratives produced by patients with mental illness who experienced the reduction of physical movement under urgent circumstances, to overcome the limitations of previous qualitative approaches and provide a novel characterization of the PR experience from the subjective point of view of the patient. Linguistic, sentiment, and metaphor features of narratives about PR were evaluated with respect to a reference corpus of narratives about non-PR mental health treatments. Overall, the results revealed differences spanning all levels of analysis, highlighting very distinctive language uses when narrating PR.

Starting from the linguistic analysis at the fluency level, results point towards a poorly fluent production in PR narratives, characterized by shorter grammatical constructions and lexically impoverished texts compared to the reference corpus. These findings could be accounted for by the presence, in the PR sample, of severe psychiatric disorders (the most represented being schizophrenia, affective disorders, and personality disorders), which are known for their complex pathological manifestations, cognitive heterogeneity48 and low functional outcomes, as well as for a poor linguistic profile both structurally and lexically25,49.

The analysis of word categories (POS tagging) revealed other relevant features of PR narratives, which seem to be related to the traumatic content of the narrated experience. Focusing on the noun-verb distinction, PR narratives overall exhibited fewer nouns but a greater number of verbs compared to the reference corpus. A greater proportion of verbs with respect to nouns was observed in studies that collected narratives about living with a psychiatric disorder over time39 and was interpreted in terms of a greater focus on changes in illness and on action and processes rather than on objects. Here, however, it must be pointed out that approximately 30% of verbal forms occurred in a negated context. The use of negation has been widely investigated via psycholinguistics and neurolinguistics methods50,51,52, which contributed to highlighting that linguistic negation becomes meaningful for brain activity especially in action-related (compared to abstract) sentences52, by attenuating motor response51. In this context, the increased use of negated verbs might reflect a focus on the impossibility of action, stressing more broadly the coercive nature of the PR procedure. PR narratives also exhibited a greater proportion of pronouns compared to the reference corpus. While the greater use of verbs might have triggered the greater use of pronouns, it is interesting per se, as pronouns are indicative of the individual attentional focus53. In particular, the higher frequency of first-person singular pronouns seems to indicate that the attentional focus is on the self as the victim of the event53 and might be linked to negative emotions, especially PTSD-like and depressive symptoms29,32.

The semantic analysis further highlighted the traumatic nature of PR. Patients undergoing PR voiced a high level of displeasure (lower valence) and a great intensity of the emotion evoked (higher arousal), paired with a lower degree of control (lower dominance) over the object of the narrative (the PR experience), compared to patients talking about other psychiatric treatments. High levels of displeasure seem to be typical in verbalized biographical memories in PTSD36 with frequent use of words belonging to the semantic field of violence (e.g., assault, blood, death, etc.). Interestingly, the semantic analysis also revealed aspects that go beyond the narration of traumatic events, possibly pointing to the adoption of adaptive behaviors in response to traumatic content. In particular, patients in the PR group used more cognitive and affective words and this might be interpreted as an initial attempt to process the content of the narration, in line with evidence of increased cognitive lexicon in relation to better outcome in PTSD32. Patients in the PR group also reported a higher level of concreteness, which, on the one hand, might reflect altered semantic representations at the interface with symptoms54, but, on the other hand, might point to elaboration of the traumatic content towards a better outcome55.

As expected, the sentiment analysis revealed a remarkably negative stance expressed by PR narratives, aligning with previous qualitative studies on PR experience11,12,13,14. Perhaps the most interesting findings come from the detection of the specific emotions expressed by patients. PR narratives were especially marked by sadness (remarkably higher than in the reference corpus), while anger and fear were represented to a lower and not distinguishable degree, and the expression of joy was negligible. This finding, also confirmed by the fine-grained analysis of emotions, is novel, as previous studies reported a cluster of negative emotions but did not specifically highlight sadness56. Multiple explanations can be evoked to make sense of this result. First, the overarching emotion of sadness might be connected to the retrospective perspective of the narratives. A previous study considering the response to symptoms and treatments in psychiatric contexts revealed that participants frequently expressed feeling angry about their psychiatric treatment; however, when asked how they responded emotionally after the treatment, they used a wider range of emotions, including a high proportion of sadness and relief about what happened57. Our study suggests that, for coercive treatment, the retrospective view may trigger a distinctive switch from anger to sadness. Although positive emotions (joy or relief) did not emerge in our analysis, sadness can be seen as related to a number of psychologically positive and adaptive processes58. It has been argued that sadness can promote relatively unbiased information processing59, help cope with loss60, encourage the analysis of the causes61, and eventually facilitate the mobilization of others in assisting the affected person and attainment of future well-being60. All in all, the overarching emotion of sadness seems to indicate not just the negative aspects of the PR experience but also the adoption of strategies to ease cognitive reappraisal and emotional reframing of traumatic experiences. This interpretation of sadness aligns with the one offered for the use of words expressing cognitive processes and extends the array of linguistic mechanisms reflecting an adaptive response to PR.

A very informative outlook is offered by the metaphor analysis. First, we observed that the narratives about the PR experience feature a higher percentage of metaphorical expressions than the reference corpus, with 1/3 of the patients producing at least one metaphor (vs. 1/5). These findings confirm the pervasiveness of metaphorical language to express emotions, and in particular negative emotions, aligning with previous findings on the use of figurative expressions to talk about illness and traumatic experiences23,24,42. The distinctive source domains used in the metaphors in the PR corpus (and not in the reference one) allow us to examine how patients frame the topic and which aspects are highlighted23. Specifically, while metaphors related to religion and history and objectification were common in the two corpora and might be related to the psychopathological dimension, metaphors related to war/prison point to the aggressive nature of the event (see previous work on war metaphors of cancer23) and those related to the animal kingdom might refer to the lack of agency24 and the dehumanizing nature of the PR experience (see the use of animal metaphors as insults or terms of abuse62). Second, the extensive use of metaphors might be seen as another linguistic mechanism reflecting attempts to elaborate on the experience of PR. The high occurrence of metaphors is noteworthy per se, as it seems to point to a preserved expressive ability63, in contrast with the largely documented difficulties in figurative language understanding, especially in schizophrenia64,65. Studies in the psychoanalytic framework have argued that metaphor represents an advanced mode of holding traumatic memories in a way that allows for transformation and healing, converting the traumatic lacuna into a creative force66. In this view, patients might use metaphors not just to highlight some negative aspects of PR, but also to help transform such negative experience and distance themselves from it.

Our analysis also included a focus on the PR group, looking at correlations between language variables and the formal assessment of the quality of the PR experience via the SR-PPT questionnaire, as well as the cumulative length of PR. As a first consideration, the scale results align with those reported in previous studies, highlighting a high interindividual variability in the patient perception of PR44,46. Second, the significant associations that we observed indicate that the semiautomated linguistic analysis can capture clinically relevant variables. Fluency seems to be a particularly sensitive linguistic measure: those who judged the PR experience better talked less, while those who had a longer PR experience talked more, in line with the evidence that those who experienced greater traumas are more verbose67,68. The shorter narratives of those with better PR experience included fewer verbs (hence, a reduced focus on the coercive nature of the event, as argued above) and conversely a greater lexical variety, with less slang and fewer words focused on social processes, as well as reduced anger, a pattern that seems to indicate a more advanced stage in the adaptive response and reframing of PR.

While useful to reveal different aspects of the subjective experience of PR in an unobtrusive way, this study has some limitations that must be addressed. Firstly, the choice of a corpus of online reviews, while useful to provide a benchmark to interpret data, might be biased by a self-selection criterion and a possible overrepresentation of positive emotions40, even though similar considerations apply also to the patients who received PR, thus making both groups equally shifted towards a positive bias. More generally, it must be pointed out that the two corpora are quantitatively small compared to large-scale initiatives for documenting speech in psychiatric populations (e.g., TalkBank, https://www.talkbank.org/). For this reason, results are tentative and any conclusion based on the comparison between the two must be taken with caution. In addition, it must be noted that the different diagnostic groups were not equally represented in the two samples. Future studies should try to replicate this analysis by comparing the narratives of patients treated with PR against narratives by patients with equally severe psychopathology but who received different treatments in similarly urgent circumstances. In addition, the limited clinical information available on the two samples does not allow us to examine cognitive or psychopathological factors that might influence the linguistic profiles. Finally, the absence of longitudinal data prevents us from testing the sensitivity of language measures in predicting long-term clinical outcomes in response to psychiatric treatments.

Despite these limitations, we believe that the PR corpus here collected represents a valuable resource, given the difficulty of representing this type of narrative. Most importantly, we believe that the study was successful in achieving a more fine-grained characterization of the patient’s subjective experience of PR, going beyond the state-of-the-art literature analyzing the perception of restraint measures. The multilayered linguistic analysis confirmed the traumatic nature of PR (as revealed by the emotional values of words, the negative sentiment, as well as the use of negated verbs and 1st person singular pronouns) but also disclosed elements previously overlooked, the main ones being sadness as the dominant emotional connotation and other linguistic mechanisms indicating an adaptive process of reframing the traumatic content (e.g., high frequency of cognitive words and metaphors). This latter point aligns with the idea that there may be acceptance of the PR treatment8, as well as possible healing from its traumatic impact.

The study also discloses relevant clinical implications. First, the association between greater use of linguistic mechanisms related to the elaboration of trauma and greater length of the PR experience underscores the importance of reducing the duration of the restraint to a minimum. While limiting the time of containment is already included in guidelines and practices, our findings serve to further highlight the importance of using multiple strategies to decrease its duration to prevent trauma-like sequelae69. Further insights come from the analysis of metaphorical expressions. Given the power of metaphor to allow for the elaboration of experiences, especially traumatic ones such as PR8,11,12, an ad hoc debriefing protocol could be shaped and administered post-PR and possibly at follow-up, aimed at eliciting metaphorical expressions from the patient, possibly complemented with exercises on metaphor comprehension70. These could be used not only to gain insights on the subjective point of view but also to help orient a psychotherapeutic intervention on PR-related trauma71, in addition to providing individuals with a flexible tool to elaborate and heal from traumatic content. Overall, we offer these findings as an unobtrusive window into the subjective experience of PR, which, in the context of the highly debated legal and ethical aspects of PR4,5,6, might help bridge the gap between the patient’s point of view and mental health care practices, understand the psychological factors associated with PR, and evaluate its clinical application.

Methods

Sample and assessment

Ninety-nine (N = 99) individuals were recruited from seven Italian mental health services (Servizi Psichiatrici di Diagnosi e Cura, SPDC) during or at the end of their admission/stay. The sample size was originally determined based on good practice72 and according to the recruitment capacity of the participating sites. For this reason, we ran a post hoc power analysis73 through the software G*Power 3.1.9.674 and found that with 99 participants and α = 0.05, we had enough power (1 − β = 1.00) to detect the effect size (d = 0.65) found in previous studies that assessed the use of different lexical markers in psychiatric conditions26. Inclusion criteria were being admitted to an SPDC facility, being between 18 and 65 years of age, having been subjected to at least one PR event during the current admission, and consenting to participate in the study. Physical (mechanical) restraint in our sample was defined as belt fixation to the patient’s bed partly or fully reducing mobility. Exclusion criteria included unsuitable clinical conditions (also at discharge), lack of consent to participate, and lack of collaboration from the treating clinicians. The study was approved by the ethics committee Comitato Etico Territoriale Lombardia 6 as a multicentric mixed-methods observational study (Protocol N. 20200060576). The study was performed in accordance with the Declaration of Helsinki. Informed consent was obtained from all participants. Data were collected between August 2020 and March 2022.
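As a rough plausibility check of the reported power, the sketch below uses a normal approximation for a two-sided comparison of two independent groups. The test family selected in G*Power is not stated here, so the two-group design, the use of the two corpus sizes (99 and 148), and the function name are assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_groups(d, n1, n2, alpha=0.05):
    """Normal-approximation power for a two-sided, two-group comparison.

    Assumption: this is NOT the G*Power computation used in the study,
    only a back-of-the-envelope approximation of it.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Approximate noncentrality for a standardized mean difference d.
    ncp = d * sqrt(n1 * n2 / (n1 + n2))
    # The negligible contribution of the opposite tail is ignored.
    return z.cdf(ncp - z_crit)


power = approx_power_two_groups(d=0.65, n1=99, n2=148)
print(round(power, 2))
```

Under these assumptions the approximate power is above 0.99, which is compatible with the value reported above after rounding.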

Data concerning the age, sex, diagnosis, number and length of each PR event during the current admission were collected through the patient’s chart. The cumulative length of all PR events was computed as the sum of the length in minutes of each PR event during the current hospital stay for each participant.

The assessment included eliciting written narratives describing the experience of PR and the administration of the SR-PPT questionnaire. Participants were asked the following open question to elicit narratives: “How would you describe the PR event you experienced during this hospitalization?”. They were offered a private area (private room with a laptop) to answer the questions in writing. Data were collected anonymously through an online survey module. Participants were informed that their answers would not be shared with the treating clinicians, so as to avoid social desirability biases in the narratives and to protect their privacy.

The SR-PPT questionnaire was administered after narrative elicitation to avoid any possible bias due to the sequence of themes proposed in the survey. The SR-PPT46 is a standardized self-reporting questionnaire made up of 11 questions. Based on the factor analysis in the original study44, the items fell into two reliable clusters (further referred to as subscales): “Cooperation with staff” (9 items, α = 0.93) and “Perceptions of PR” (2 items, α = 0.89). The SR-PPT items are listed in Table 2. For a detailed description of the questionnaire and its adaptation for the purpose of the current study, see Supplementary Information.

Corpora and pre-processing

The PR narrative corpus comprised 99 narrative excerpts produced by 99 participants for a total of 8,397 words. Given that the focus was on the subjective experience expressed via the narrative rather than on the modality in which the narrative was reported, we decided not to exclude from the analysis seven excerpts that were collected orally following the explicit desire of the participants (i.e., recorded by the clinicians and later transcribed). Among these, one was collected from a non-native speaker, and five reported the conversational turns between the patient and the clinician. In these latter cases, we removed all the clinicians’ turns and analyzed only the patients’ production.

As a reference corpus, we collected online reviews from the QSalute website (www.qsalute.it). In this Italian online forum, users can express opinions about their experience with specific hospitals and hospital units (https://www.qsalute.it/recensioni/health-reviews/). We web-scraped the website to collect reviews of users with a psychiatric disorder (specifically, only bipolar disorder was attested) with no restrictions on the clinical site. To exclude the possibility that online reviews reported events of PR, we scanned the online corpus for keywords such as “contenere” and/or “legare” (to restrain and/or to tie) and no mentions of PR as a first-person experience were reported. This corpus consisted of 148 reviews (after having removed opinions from relatives and other persons not directly involved) by 148 users of Italian psychiatric services (“Reference corpus”), for a total of 13,581 words. Reviews were dated from 2011 to 2023.

The PR narratives and the reference corpus were pre-processed before further analysis. Both corpora were converted to lowercase, tokenized, and lemmatized using the spaCy package75. Analysis at the utterance level (see further) was carried out on any portion of texts between punctuation marks (full stops, commas, colons, semicolons, and dashes).
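The utterance segmentation described above can be sketched with the standard library alone. This is a minimal illustration, not the pipeline used in the study: lemmatization with spaCy is not reproduced, the function name is hypothetical, and treating only whitespace-delimited hyphens (plus en and em dashes) as dashes is an assumption introduced to avoid splitting hyphenated words.

```python
import re

# Boundaries: full stops, commas, colons, semicolons, and dashes.
# Assumption: a hyphen counts as a dash only when surrounded by whitespace.
_BOUNDARY = re.compile(r"[.,:;\u2013\u2014]+|\s-+\s")


def split_utterances(text):
    """Lowercase a narrative and split it into utterances at punctuation."""
    parts = _BOUNDARY.split(text.lower())
    return [part.strip() for part in parts if part.strip()]


example = "Mi hanno legato, avevo paura. Poi - dopo un po' - mi sono calmato"
utterances = split_utterances(example)
print(utterances)
```

Question marks and exclamation marks are deliberately left out of the boundary set, since they are not among the punctuation marks listed above.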

Linguistic analysis

The linguistic analysis included the extraction of a series of fluency variables computed at the word and sentence level, the POS tag analysis, and the analysis of semantic variables at the word level.

Concerning fluency, we measured the number of Words per Narrative (WPN), a measure of productivity, calculated by counting the number of words on the raw transcript (before tokenization); the number of Words per Sentence (WPS), indicating the mean length of sentences in each narrative; and the number of Sentences per Narrative (SPN), reflecting the number of sentences in each narrative. We also measured the mean log word frequency, retrieved from Subtlex-it76, indicating the mean frequency of words used in the narrative, logarithmically transformed, and the Type/Token Ratio (TTR), a measure of lexical density, calculated as the ratio between the number of unique tokens and the total number of tokens produced by the participant.
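To make the fluency definitions concrete, the following sketch computes WPN, SPN, WPS, TTR, and the mean log word frequency for a single narrative. It is an approximation under stated assumptions: words are extracted with a simple regular expression rather than with spaCy, sentences are split on full stops, question marks, and exclamation marks, and `freq_per_million` is a hypothetical dictionary standing in for the Subtlex-it norms (words missing from it are skipped).

```python
import re
from math import log10


def fluency_measures(text, freq_per_million=None):
    """Return WPN, SPN, WPS, TTR and mean log frequency for one narrative."""
    words = re.findall(r"\w+", text.lower())  # crude word tokens
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    wpn, spn = len(words), len(sentences)
    measures = {
        "WPN": wpn,                                     # words per narrative
        "SPN": spn,                                     # sentences per narrative
        "WPS": wpn / spn if spn else 0.0,               # mean sentence length
        "TTR": len(set(words)) / wpn if wpn else 0.0,   # unique / total tokens
    }
    if freq_per_million:
        # Assumption: words absent from the frequency list are ignored.
        logs = [log10(freq_per_million[w]) for w in words if w in freq_per_million]
        measures["mean_log_freq"] = sum(logs) / len(logs) if logs else None
    return measures


toy_frequencies = {"avevo": 100.0, "paura": 10.0}  # invented values
m = fluency_measures("Avevo paura. Avevo tanta paura!", toy_frequencies)
print(m)
```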

Part-of-speech (POS) tagging to compute the percentage of POS tags was carried out using the spaCy75 package on the raw (unprocessed) texts, and percentages were computed as the ratio between the total occurrences of the POS tag and the total number of words in each narrative. Some POS were underrepresented in the two corpora (namely interjections, numerals, and particles), and for this reason, no further analyses were carried out on them.
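The percentage computation can be illustrated on tokens that have already been tagged. In this sketch the input is a hypothetical list of (token, POS) pairs, such as a tagger would produce, and excluding punctuation and whitespace tokens from the word count is an assumption.

```python
from collections import Counter


def pos_percentages(tagged_tokens):
    """Percentage of each POS tag over the words of one narrative.

    tagged_tokens: list of (token, pos) pairs from any POS tagger.
    Assumption: punctuation and whitespace tokens do not count as words.
    """
    words = [(tok, pos) for tok, pos in tagged_tokens
             if pos not in {"PUNCT", "SPACE"}]
    total = len(words)
    if total == 0:
        return {}
    counts = Counter(pos for _, pos in words)
    return {pos: 100 * n / total for pos, n in counts.items()}


tagged = [("mi", "PRON"), ("hanno", "AUX"), ("legato", "VERB"), (".", "PUNCT")]
percentages = pos_percentages(tagged)
print(percentages)
```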

Further descriptive explorations were conducted on pronouns (through Linguistic Inquiry and Word Count - LIWC77) and negation (through the spaCy package). To explore the role of negation operators, we extracted the occurrences of “no”, “non”, “nessun*” (“no”, “not”, “no/any-body”) from the two corpora. Moreover, to understand the role of negation in the context of verb use, we examined the frequency of negation operators in association with lemmatized verbs (limited to the preceding 5-word context). The same analysis was also done for the 20 most frequently used verbs.
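The search for negation operators in the 5-word context preceding each verb can be sketched as follows. The input format, a hypothetical list of (token, POS, lemma) triples, and the decision to count only tokens tagged VERB (not auxiliaries) are assumptions for illustration.

```python
import re
from collections import Counter

# "no", "non", and any form starting with "nessun" (nessuno, nessuna, ...).
NEGATION = re.compile(r"^(no|non|nessun\w*)$")


def negated_verbs(tagged_tokens, window=5):
    """Count verb lemmas preceded by a negation within `window` tokens.

    tagged_tokens: list of (token, pos, lemma) triples from any tagger.
    """
    hits = Counter()
    for i, (_, pos, lemma) in enumerate(tagged_tokens):
        if pos != "VERB":
            continue
        context = tagged_tokens[max(0, i - window):i]
        if any(NEGATION.match(tok.lower()) for tok, _, _ in context):
            hits[lemma] += 1
    return hits


tagged = [
    ("Non", "ADV", "non"), ("potevo", "AUX", "potere"),
    ("muovermi", "VERB", "muovere"), (".", "PUNCT", "."),
    ("Mi", "PRON", "mi"), ("hanno", "AUX", "avere"),
    ("legato", "VERB", "legare"),
]
counts = negated_verbs(tagged)
print(counts)
```

In this toy input only "muovere" is counted, because the negation lies outside the 5-token window of "legare". The window is not reset at sentence boundaries here, which is a simplification.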

For semantics, we derived measures of Affective, Biological, Cognitive, and Social processes, use of Informal language, and use of language to orient in Space and Time based on Linguistic Inquiry and Word Count (LIWC); see Supplementary Information for further details. Measures of Concreteness and Imageability of words were derived from simulated multilanguage data78, while Valence, Arousal, and Dominance (VAD) were identified using the Italian MEmoLon – The Multilingual Emotion Lexicon79. In the present context, Valence is to be interpreted as the level of pleasure/displeasure attributed to the object of the narrative (the PR experience), Dominance as the degree of control over the PR experience, and Arousal as the intensity of the emotion evoked by the PR experience.

Figure 3 summarizes the pipeline of the multilevel linguistic analysis.

Fig. 3

Pipeline of the multilayered linguistic analysis. The picture summarizes the process of analyzing PR narratives and the reference corpus. Following an initial pre-processing phase involving the removal of clinician turns and converting text to lowercase (yellow box), the pipeline included first the linguistic analysis (green box). During this phase, measures of fluency, parts of speech, and semantic variables (including those derived from LIWC), and Concreteness, Imageability, and Valence, Arousal, Dominance (VAD) were extracted. Subsequently, texts were segmented at the utterance level, and sentiment and emotion analysis were carried out to identify positive and negative sentences and basic and fine-grained emotions (blue box). The last step involved the metaphor analysis to identify metaphorical expressions and their source domains (grey box). LIWC: Linguistic Inquiry and Word Count; POS: Part of Speech; PR: Physical Restraint; SPN: Sentences per Narrative; TTR: Type/Token Ratio; VAD: Valence, Arousal, Dominance; WPN: Words per Narrative.

Sentiment and emotion analysis

We performed sentiment analysis first to determine evaluation (positive and negative) and then basic emotions (anger, fear, joy, sadness) using a neural model trained on a corpus of Italian Twitter posts annotated with four basic emotions (the UmBERTo model fine-tuned on the FEEL-IT corpus80). The analysis was carried out at the utterance level, where each utterance was identified as any portion of text between punctuation marks (full stops, commas, colons, semicolons, and dashes).

To better understand the full range of emotions experienced by patients, we additionally analyzed both corpora, adopting a finer-grained instrument for emotion classification as identified by EmoRoBERTa81. Given that this resource was developed for the English language, we analyzed the English version of the narratives, automatically translated with the deeptranslator package82 in Python.

Metaphor analysis

Metaphors were identified80 according to the modified MIP procedure83 proposed by Fuoli and colleagues84, namely by: (i) establishing a general understanding of the text; (ii) identifying meaning units at the level of the phrases; (iii) establishing, for each meaning unit, its meaning within the given context; (iv) determining if the meaning unit has a more basic (i.e., more concrete, related to bodily action, more precise) meaning in other contexts than the one in the text; (v) if so, by checking whether the basic meaning contrasts with the meaning in the context. If all the above criteria were fulfilled, then the meaning unit was considered a metaphor. In line with previous works85,86, we focused on metaphorical comparisons and not strictly on the linguistic forms, which resulted in the inclusion of similes and metonyms87. Once identified, source domains were categorized88,89, capitalizing partly on classic repertoires (i.e., body and object90) and partly on work specifically focusing on traumatic experiences (i.e., war and journey23, religion91, mental state92 and death93). Metaphors were then classified according to evaluation (positive vs. negative) and creativity (creative vs. conventional). Metaphors were identified as “creative” when they introduced a new metaphorical mapping (e.g., “maternal guilt is like off the Richter Scale”94) or extended and elaborated upon existing mappings (e.g., “It would be kind of on a raft somewhere in an ocean. You can see a bit of land in the distance and you’re trying to - you know it’s there, you’ll get there but you’ve just got to cross this huge wide vast bit of open emptiness[…]”94). Corpora were manually annotated for metaphors independently by two Authors (BS and CBdSP). Inter-rater agreement on metaphor identification was almost perfect (Cohen’s k = 0.97), while inter-rater agreement on metaphors’ themes definition was strong (Cohen’s k = 0.87). Discrepancies were resolved through a consensus-based approach95.
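For readers unfamiliar with the agreement statistic reported above, Cohen's kappa corrects the observed agreement between two raters for the agreement expected by chance. The sketch below implements the standard formula; the labels are invented and do not reproduce the annotations of the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty item set.")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the marginal label frequencies of each rater.
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / n ** 2
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented labels: "M" = metaphor, "N" = not a metaphor.
a = ["M", "M", "N", "N", "M", "N", "M", "N", "M", "M"]
b = ["M", "M", "N", "N", "M", "N", "M", "N", "N", "M"]
kappa = cohens_kappa(a, b)
print(round(kappa, 2))
```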

Statistical analyses

Language variables derived by NLP methods were compared between the two corpora. Following previous studies96, and given the non-normality of all the variables under investigation (as tested via the Shapiro-Wilk test), we ran a series of Wilcoxon rank sum tests for independent groups to analyze the differences in the mean value of fluency variables (WPN, SPN, WPS, TTR, and the mean logarithmic frequency of words), POS tags, and semantic variables (words describing Affective, Biological, Cognitive, and Social processes, Informal language, Orientation in space and time, as well as the level of Concreteness, Imageability, Valence, Arousal, and Dominance expressed by words), computed independently for each narrative. The same approach was adopted to identify differences in sentiment (Positive and Negative), basic emotions (Sadness, Anger, Fear, and Joy), and the number of metaphors produced. Given the underrepresentation of some fine-grained emotions, no further analysis was carried out on the variables derived from the EmoRoBERTa tool. False Discovery Rate (FDR) correction for multiple comparisons was adopted.
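The FDR procedure is not named above. A common choice is the Benjamini-Hochberg step-up method, which is assumed in the sketch below; the function name and the p-values are illustrative and are not results of the study.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: BH is used here as one common FDR procedure; the text
    does not specify which FDR method was applied.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw_p = [0.01, 0.04, 0.03, 0.20]  # invented p-values
adjusted_p = benjamini_hochberg(raw_p)
print([round(p, 4) for p in adjusted_p])
```

A comparison would then be considered significant when its adjusted p-value falls below the chosen threshold (for example 0.05).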

The comparison between basic emotions within each corpus was carried out with a Wilcoxon signed-rank test for dependent groups, and Bonferroni correction was adopted.

The associations between the SR-PPT scores (Total Score, Cooperation, Perception), length of PR (in minutes), and language variables were tested via Spearman’s correlations. Language variables included number of words, SPN, TTR, WPS, word frequency, selected POS classes (verbs, pronouns, 1st person singular), Concreteness and Imageability, VAD, Affective, Biological, Cognitive, and Social processes, Informal language, and Orientation in space and time, as well as Positive and Negative sentiment and basic emotions (Joy, Fear, Anger, and Sadness). Given the number of correlations computed, the significance level was set at p < .01.
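Spearman's coefficient is the Pearson correlation computed on ranks, with tied observations receiving the average of the ranks they span. The sketch below is a standard-library illustration of that definition; the analyses reported here were run in R, and the data in the example are invented.

```python
from math import sqrt


def midranks(values):
    """Ranks starting at 1, with ties assigned their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the midranks.

    Assumes both variables have at least two distinct values.
    """
    rx, ry = midranks(x), midranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Invented data: cumulative PR length (minutes) and words per narrative.
pr_minutes = [60, 120, 180, 240, 300]
words = [50, 60, 70, 80, 70]
rho = spearman_rho(pr_minutes, words)
print(round(rho, 3))
```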

Data analysis was carried out using the R statistical program97.

Descriptive statistics of the language characteristics across diagnoses are reported in Supplementary Information. Given the differences in group sizes, no further statistical analysis was carried out.