Abstract
Disruptions in language processing observed in individuals diagnosed with schizophrenia (ISZ) are likely to impair turn-taking fluency and social functioning. While turn-taking research in ISZ is limited and mostly interview-based, this study examines fluency differences between ISZ and controls in free conversations and their links to social outcomes and symptoms. We recruited 20 ISZ, 20 healthy interacting partners (IP), and 20 matched controls (MAT). Each IP, unaware of the ISZ diagnosis, had a 6-min conversation with an ISZ and a MAT, and then rated their willingness to interact again. Voice recordings were analyzed for pauses, gaps, and overlaps. Results revealed that conversations with ISZ featured fewer overlaps, more and longer gaps, and extended pauses. Additionally, the gap duration influenced participants’ willingness to engage in future interactions. ISZ symptoms disrupted their speech and were linked to longer gaps and pauses in their partner’s speech. This study extends fluency research in ISZ by shedding light on natural conversational dynamics.
Introduction
Individuals diagnosed with schizophrenia (ISZ) often experience significant impairments in social functioning, the ability to engage in and sustain social interactions1. These difficulties stem from core symptoms of the disorder, including positive symptoms (e.g., delusions, hallucinations), negative symptoms (e.g., reduced motivation, diminished expression), and formal thought disorder2. The latter involves poverty of speech and disorganized discourse leading to communication difficulties3,4. Previous studies found that ISZ experience difficulties across various language production and perception levels. Production difficulties span both higher-level cognitive processes related to semantics and syntactic processes, as well as more fundamental aspects of speech like acoustic production (i.e., slower speech rates, longer pauses, and lower speaking proportion)5,6,7,8. Concerning speech perception deficits, studies have found impairment in central auditory processing, along with reduced social cognitive abilities such as the perception, interpretation, and processing of social information9,10,11,12. Such perception and production language deficits could make it challenging for ISZ to follow the implicit conversation rules7.
Conversation is a dynamic process wherein individuals alternate between speaking and listening, forming the basis of turn-taking13. In this universal social system, speakers tend to minimize overlap (when a listener speaks before the speaker has finished their turn) and long gaps (the silence between different speakers’ turns)14,15. The result is a coordinated sequence of turns with inter-turn intervals typically around 200 ms14,16. Since spoken word conceptualization and production normally takes around 600 ms, the listener must anticipate when the current speaker will conclude their turn17,18,19. Consequently, managing the precise timing of a conversation requires cognitive processes. The listener must understand the current turn and anticipate the next one, formulate a response, and deliver it at the expected endpoint18,19,20. The brief gaps between turns, used as a proxy for speech latency, reflect the respondent’s temporal behavior and indicate how efficiently interlocutors manage dialog coordination21,22. Maintaining well-timed latencies supports smooth transitions and overall dialog fluency.
Fluency includes the speaker’s ability to produce flowing speech with a high production rate, without pauses, hesitations, fillers, and corrections23,24. Along this line, studies on perceived fluency in nonnative speech have found that two of the strongest predictors are the mean number and duration of silent pauses, where ‘silent pause’ refers specifically to the silence between two continuous stretches of speech produced by the same speaker25,26,27. On the dyadic level, fluency lies in the capacity to maintain flow across turn boundaries, with short gaps and overlaps, and is a key indicator of social outcomes, mirroring both the context and quality of conversation24,28,29. For example, turn-taking can vary with the number of participants and whether the setting is informal or institutional (e.g., with a doctor or researcher), where roles can be more asymmetrical14,30. Moreover, research on neurotypical participants shows that fast responses foster connection29, while long gaps are linked to negative outcomes like less compliance, weaker affiliation, and lower shared cognition28,31,32,33.
Because turn-taking mediates fluency and social connection, disruptions in language processing, such as those seen in ISZ, are likely to impair the smooth coordination of conversational turns and social functioning5. However, despite the relevance of speech timing as a marker of communicative difficulties, research on turn-taking in ISZ remains limited and inconsistent. A study analyzed speech from psychiatric interviews and found that ISZ speech was more fragmented, presenting abnormally long gaps caused by turn-taking delay34. In triadic moral dilemma discussions, another study showed that ISZ presence altered interlocutors’ behavior, leading to longer gaps and more pauses, even when the diagnosis was undisclosed35,36. Finally, another study found more overlaps and mutual silences in ISZ interactions during semi-structured interviews, though gap rates did not differ. Moreover, psychiatrists participated more with ISZs compared to dialog with controls, and negative symptoms were linked to turn-taking patterns, especially mutual silence37.
These findings confirm disfluencies in ISZ interactions, though the unnatural and institutional context may have influenced them. While triadic conversation requires more cognitive effort, psychiatrist interviews impose a more structured or oriented interaction. To date, no study has specifically examined the turn-taking dynamics of free conversations between ISZ and control participants, leaving a significant gap in our understanding of how these interactions unfold in real-life contexts. Moreover, given ISZ’s social functioning deficits and the link between fluency and social bonding among neurotypical individuals, exploring how turn-taking fluency affects social outcomes in clinical populations would be valuable. Importantly, because the timing of turn-taking emerges as a co-constructed process between two interacting partners, one must consider both the dyadic outcome and the individual contributions that shape it.
This study aims to investigate turn-taking fluency in free conversations between ISZ and controls. We recruited 20 ISZ, 20 interacting partners (IP), and 20 matched participants (MAT). Each IP, unaware of the ISZ diagnosis, engaged in a 6-min free conversation once with an ISZ and once with a MAT. At the end of the interaction, participants answered questions about their willingness to engage in future interactions as a social outcome. We recorded their voices and extracted metrics (gaps, overlaps, and pauses) used to assess individual and dyadic fluency. These metrics were analyzed at both the dyad level to assess overall fluency and the individual level to understand each participant’s contribution, considering the influence of individual differences on dyadic interactions. Moreover, we seek to explore the relationship between turn-taking fluency and willingness to engage in future interactions as our social outcome. Finally, we explored whether disfluencies found in ISZ could be related to psychopathological dimensions37,38. Based on the previous findings of the literature:
(H1) We expect to find a decrease in dialog fluency in ISZ interactions driven by overlaps, gaps, and pauses:
a. Overlaps: We expect a higher number of overlaps in ISZ interactions37 and will further examine their duration, along with how each participant contributes to them.
b. Gaps: We expect no significant differences in the proportion of gaps between ISZ and control interactions37 but hypothesize longer gap durations in ISZ interactions34, with both ISZ and control contributing to this increase34,36.
c. Pauses: We expect ISZ conversations to have more and longer pauses, with ISZ participants and their partners contributing to the increased duration7,35.
(H2) As speech fluency decreases in ISZ interactions, we expect this to negatively affect the willingness of participants to continue the conversation.
(H3) We anticipate that the reduced fluency in ISZ interactions will correlate with certain psychopathological dimensions of schizophrenia, particularly negative symptoms37,38,39,40.
Materials & methods
Study design
The current study is part of the Enhancer project (ANR-22-CE17-0036), which investigates the dynamics of speech and gesture in social interactions involving individuals diagnosed with schizophrenia. The study was approved by a French ethics committee, the Comité de la Protection des Personnes (2024-A00553-44), as was the compensation of 50 euros for the healthy participants. After receiving a written information letter, all participants provided informed consent before the experiment.
This study employed a specific design in which each healthy IP interacted separately with two distinct partners: an ISZ and a healthy MAT. This resulted in triplets, each consisting of one IP interacting once with an ISZ (IPs_ISZ dyad) and once with a MAT (IP_MAT dyad). For clarity, “IPs” denotes the interacting partner when paired with the ISZ. ISZ and MAT were matched on age (U = 160.5, p = 0.29) and sex, so that the two groups could be compared on diagnostic status alone41.
Participants
We recruited 60 participants forming 20 triads to meet the required sample size: 20 ISZ (age: M = 32.3, SD = 9.4, 6 women, 14 men), 20 MAT (age: M = 31.1, SD = 11.9, 6 women, 14 men), and 20 IP (age: M = 26.3, SD = 6.5, 6 women, 14 men). All MAT and IP participants were recruited in Montpellier (France), presented no history of psychosis or neurological or psychiatric disorders, and were not taking any medication known to impact cognition. ISZ participants were recruited from the University Department of Adult Psychiatry in Montpellier. All participants spoke French. For ISZ, inclusion criteria required a DSM-5 diagnosis of schizophrenia, clinical stability, and ongoing antipsychotic treatment at the time of participation. Symptomatology was assessed using the Positive and Negative Syndrome Scale (PANSS)42, and medication doses were converted to chlorpromazine (CPZ) equivalents to allow comparison across different antipsychotic treatments. Moreover, we used the Montreal Cognitive Assessment (MoCA) to control for cognitive differences43, the Beck Depression Inventory II (BDI-II) for depression44, the French National Adult Reading Test (fNART) for premorbid intelligence45, and the Positive and Negative Affect Schedule (PANAS) for emotions46 (Table 1). No differences were found for these measures (p > 0.05), except for MoCA (p = 0.02).
Task
Each dyad participated in a 6-min free conversation. To avoid long silences while deciding on a topic, we provided each participant with a document containing various themes, such as movies, holidays, etc. Participants were instructed to select two topics to discuss, but could switch anytime. The instructor left the room to encourage spontaneous conversation. Participants were seated approximately 1.5 meters apart and wore a microphone to record their speech. The first four triads wore lapel microphones (Hollyland LARK 150), but due to poor sound quality and diarization issues, the remaining triads used headsets for clearer audio. Despite increased volume, crosstalk persisted. Microphones were connected to a Zoom H6 recorder, and speech was recorded at 44,000 Hz and saved as .wav files. The lip-to-microphone distance and angle were not standardized during recordings, but were adjusted to ensure the comfort of each participant.
Procedure
Upon arrival, participants (ISZ or MAT) met their partner in a room. Each IP interacted twice, once with the ISZ and once with the MAT, with the interaction order counterbalanced. Before the experiment, participants were given an information letter explaining the study; they subsequently provided their informed consent and completed the First Impression Scale47, BDI-II, and PANAS. Before beginning the main task, participants took part in a four-minute icebreaker conversation intended to help them feel more comfortable interacting with each other48. They were then invited to engage in the free conversation task described above. After the interaction, participants completed the Willingness to Interact Scale49, a five-point Likert scale assessing willingness to engage in future social situations. The scale originally consisted of 6 items (e.g., “take advice,” “take the bus”); we added 3 items (e.g., “get to know better,” “liking,” “perceived similarity”) adapted from previous research on interpersonal attraction and social evaluation41,50,51,52. The added items used the same Likert scale, and the final 9-item scale demonstrated excellent internal consistency (α = 0.92; 95% CI: [0.89, 0.94]).
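As an illustration of the internal-consistency figure reported above, the following is a minimal Python sketch of Cronbach’s alpha computed from item ratings. It is not the authors’ code; the function name `cronbach_alpha` and the toy ratings are hypothetical, and sample variances are assumed.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: list of respondents, each a list of k item ratings."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([resp[i] for resp in item_scores]) for i in range(k)]
    # Sample variance of the respondents' summed scores.
    total_var = variance([sum(resp) for resp in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings: 5 respondents x 3 items on a 1-5 Likert scale.
ratings = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 5],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

Alpha approaches 1 when the items rise and fall together across respondents, which is what a value of 0.92 indicates for the 9-item scale.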
Extraction of turn-taking metrics
Each .wav file was processed in WavePad Audio Editor to reduce room reverberation and amplify the voice when the microphone was distant. The files were then exported to ELAN, where an automatic Voice Activity Detection using Silence Recognizer was applied to identify and mark silent segments. The minimal silence duration was set to 200 ms27. Due to crosstalk in our audio files, we manually corrected speech segments to assign them to the correct speaker. Audible in-breaths were removed, and filled pauses and laughter were classified as speech activity30,53. The resulting speech segments are defined as Inter-Pausal Units (IPUs), maximal sequences of words surrounded by any silence exceeding 200 ms27. For each IPU, we manually determined whether it was a backchannel (BC), namely, a short expression (e.g., “ok”) used by the listener to express attention or interest54,55.
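The IPU definition can be sketched in a few lines of Python. This is only an illustration of the 200 ms rule, not the ELAN Silence Recognizer or the authors’ pipeline. The function name `merge_into_ipus` is hypothetical, and treating a silence of exactly 200 ms as a boundary is an assumption.

```python
MIN_SILENCE_S = 0.200  # minimal silence duration reported in the text

def merge_into_ipus(segments, min_silence=MIN_SILENCE_S):
    """Merge one speaker's (start, end) speech segments, in seconds, when
    the silence separating them is shorter than min_silence."""
    ipus = []
    for start, end in sorted(segments):
        if ipus and start - ipus[-1][1] < min_silence:
            # Silence too short to count as a pause: extend the current IPU.
            ipus[-1] = (ipus[-1][0], max(ipus[-1][1], end))
        else:
            ipus.append((start, end))
    return ipus

# Hypothetical voice-activity segments for one speaker (seconds).
segments = [(0.00, 1.20), (1.30, 2.00), (2.50, 3.10)]
ipus = merge_into_ipus(segments)
print(ipus)  # [(0.0, 2.0), (2.5, 3.1)]
```

The 100 ms silence is absorbed into a single IPU, while the 500 ms silence separates two IPUs.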
The binary speech files were processed in MATLAB R2021b to derive turn-taking metrics based on methods from Heldner and Edlund and Levinson and Torreira, incorporating backchannels19,56. The algorithm identifies (1) turns as sequences of uninterrupted IPUs from one speaker (except for BCs, which do not constitute a claim for a turn)27,55; (2) gaps as periods of silence between speakers; (3) overlaps when one speaker starts before the other finishes; and (4) pauses as silence within a speaker’s turn (Fig. 1).
Fig. 1. The metrics represented correspond to Inter-Pausal Unit (IPU), Backchannel (BC), Pause, Gap, and Between-overlaps (simply referred to as overlaps). The metrics are represented at the individual level (reflecting the specific involvement of each participant) and at the dyadic level (representing the interaction between the two participants). For example, when speaker 1 finishes his turn, speaker 2 takes 300 ms to respond. This gap is recorded for the dyad and is also attributed to Participant 2, as he initiated the delay.
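The gap, overlap, and pause definitions above can be sketched as a simple pass over time-ordered IPUs. This Python sketch is an assumption-laden simplification, not the authors’ MATLAB algorithm. The function name `classify_transitions` is hypothetical, backchannels are assumed to have been removed beforehand, and IPUs falling entirely inside the other speaker’s IPU are skipped. Each event is attributed to the speaker who starts talking, following the 300 ms example.

```python
def classify_transitions(ipus):
    """ipus: list of (speaker, start, end) in seconds, backchannels removed.
    Returns a list of (kind, owner, duration_s) events."""
    events = []
    ordered = sorted(ipus, key=lambda u: u[1])
    prev_speaker, _, prev_end = ordered[0]
    for speaker, start, end in ordered[1:]:
        if end <= prev_end:
            # Entirely inside the previous IPU: not a between-speaker
            # transition, so this sketch ignores it.
            continue
        if speaker == prev_speaker:
            if start > prev_end:
                events.append(("pause", speaker, start - prev_end))
        elif start >= prev_end:
            events.append(("gap", speaker, start - prev_end))
        else:
            events.append(("overlap", speaker, prev_end - start))
        prev_speaker, prev_end = speaker, end
    return events

# Hypothetical dyad: P2 answers 300 ms after P1, pauses, then P1 overlaps.
ipus = [("P1", 0.0, 2.0), ("P2", 2.3, 4.0), ("P2", 4.5, 5.0), ("P1", 4.8, 6.0)]
events = classify_transitions(ipus)
for kind, owner, duration in events:
    print(kind, owner, round(duration, 3))
```

Every event counts toward the dyadic metrics, and the `owner` field gives the individual-level attribution.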
Variables of interest
Gaps, overlaps, and pauses were extracted as our interest variables. For each variable, we extracted both the structural properties (the total number relative to the total number of possible turn transitions, expressed in %) and temporal properties (the median duration expressed in ms, and the total duration relative to the total speaking duration, expressed in %). All metrics were extracted both at the dyadic level and the individual level (Fig. 1). Data from binary speech files and the analytic code of statistical analyses can be found on the Open Science Framework (https://osf.io/g97hn/).
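To make the three properties concrete, here is a hedged Python sketch that summarizes a list of event durations. The function name `summarize` and the input values are hypothetical. How the number of possible turn transitions and the total speaking duration are counted is not detailed here, so both are passed in as parameters, and durations are in seconds.

```python
from statistics import median

def summarize(durations, n_transitions, speaking_time):
    """durations: event durations in seconds for one metric (e.g., gaps)."""
    return {
        # Structural property: share of possible turn transitions (%).
        "total_number_pct": 100 * len(durations) / n_transitions,
        # Temporal properties: median duration and share of speaking time (%).
        "median_duration_s": median(durations),
        "total_duration_pct": 100 * sum(durations) / speaking_time,
    }

# Hypothetical values: 3 gaps out of 6 transitions, 100 s of speech.
gap_durations = [0.2, 0.3, 0.5]
summary = summarize(gap_durations, n_transitions=6, speaking_time=100.0)
print(summary)
```

The same function can be applied to the events of a whole dyad or to the subset attributed to one participant.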
Statistical analysis
We first tested normality with the Shapiro–Wilk test and homogeneity of variance with Levene’s test. Only the Shapiro–Wilk test indicated a violation, namely non-normality in some variables. We applied a natural logarithmic transformation to all non-normally distributed variables, as is commonly done in psychosocial research57,58,59.
To test our first hypothesis, we fitted linear mixed-effects models (LMMs) using the lme4 package in R60, with triad as a random intercept. We ran two separate sets of models: one testing the effect of the dyad (IP_MAT vs. IPs_ISZ; reference = IP_MAT), and another testing participant role (IP vs. IPs and MAT vs. ISZ; reference = IP), to assess whether IPs adapted their behavior depending on their partner:
lmer(dependent_variable ~ dyad + (1 | triad))
lmer(dependent_variable ~ participant + (1 | triad))
To test whether dyad or participant factors improved model fit, each model was compared to a null model (random intercept only) using AIC, BIC, and likelihood ratio tests. We also assessed whether adding order (i.e., which dyad occurred first) and its interaction with dyad or participant further improved fit. The dyad factor improved model fit for all variables except Overlaps median duration, Overlaps total duration, and Pause total number. A dyad × order interaction best explained the Gaps’ total and median duration. In participant models, the participant factor improved fit for all variables except for Overlaps median duration, Overlaps total duration, and Gaps median duration. For Gaps total duration, a participant × order interaction improved fit, and for Pause median duration, additive effects of dyad and order provided the best fit (see Supplementary Tables 1–2 for model comparison details). Analyses were restricted to these validated models. P-values were computed using the lmerTest package61. When multiple tests were run on the same data, we corrected for multiple comparisons using the False Discovery Rate (FDR)62.
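The multiple-comparison step can be illustrated with a short Python sketch of the Benjamini–Hochberg procedure. This is an assumption: the text specifies only “FDR”, and the analyses themselves were run in R. The function name `fdr_bh` and the p-values are hypothetical.

```python
def fdr_bh(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is the
    # minimum over all equal-or-larger ranks (keeps the result monotone).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values from four tests on the same data.
raw = [0.01, 0.04, 0.03, 0.20]
adj = fdr_bh(raw)
print([round(p, 4) for p in adj])  # [0.04, 0.0533, 0.0533, 0.2]
```

In this toy example, the raw values 0.03 and 0.04 rise above 0.05 after adjustment, which is how an effect can be significant before correction but not after.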
Results
Overlaps
Dyad level
The Linear Mixed Model results indicated a significant dyad effect on the total number of overlaps (IP_MAT vs. IPs_ISZ: b = −8.34, p = 0.049, d = 0.71, 95% CI [0.03, 1.39]), with significantly more overlaps occurring in the IP_MAT dyads (M = 41.9%) compared to the IPs_ISZ dyads (M = 33.5%) (Fig. 2).
Fig. 2. The dyadic level is illustrated on the top row and the individual level on the bottom row. The first column displays the median duration of overlaps (in seconds), the second column shows the total overlap duration (in %), and the third column represents the total number of overlaps (in %). Red boxplots correspond to dyads where the interacting participants were IP and MAT (IP_MAT), while blue boxplots correspond to dyads involving IPs and ISZ (IPs_ISZ). Individual-level comparisons are made between IP and IPs and between MAT and ISZ. * denotes p < 0.05, ** p < 0.01, and *** p < 0.001.
Individual level
The Linear Mixed Model results indicated a significant effect of the participant on the total number of overlaps for IP vs. IPs (IP vs. IPs: b = 8.27, p = .018, d = 1.11, 95% CI [0.45, 1.76]), with significantly more overlaps produced by the IP interacting with the MAT (M = 22.6%) compared to the IPs interacting with the ISZ (M = 14.3%) (Fig. 2).
Gaps
Dyad level
The Linear Mixed Model for the total number of gaps indicated a significant effect of the dyad (IP_MAT vs IPs_ISZ: b = 10.26, p = .043, d = −0.89, 95% CI [−1.59, −0.20]). Significant dyad × order interactions were found for gap total and median durations. In the IPs_ISZ dyad, gaps were longer when the conversation started with ISZ versus MAT (total: p = .049, d = 1.32, 95% CI [0.07, 2.57]; median: p = .043, d = 1.34, 95% CI [0.31, 2.36]). No order effect appeared in the IP_MAT dyad. Overall, the IP_MAT dyads produced shorter median gaps (M = 0.24 s vs. M = 0.30 s), shorter total gap duration (M = 1.26% vs. M = 1.74%), and fewer gaps (M = 52.56% vs. M = 62.82%) compared to the IPs_ISZ dyads (Fig. 3).
Fig. 3. The dyadic level is illustrated on the top row and the individual level on the bottom row. The first column displays the median duration of gaps (in seconds), the second column shows the total gap duration (in %), and the third column represents the total number of gaps (in %). Red boxplots correspond to dyads where the interacting participants were IP and MAT (IP_MAT), while blue boxplots correspond to dyads involving IPs and ISZ (IPs_ISZ). Individual-level comparisons are made between IP and IPs and between MAT and ISZ. * denotes p < 0.05, ** p < 0.01, and *** p < 0.001. Order effects reported in the text are not displayed in the figure.
Individual level
The Linear Mixed Model indicated a significant difference between IP and IPs for the total number of gaps (IP vs IPs: b = −9.16, p = 0.009, d = −1.23, 95% CI [−1.90, −0.57]), with significantly more gaps produced by the IPs compared to the IP (M = 33.8% vs. M = 24.7%). For gaps total duration, a main effect of participant showed that IPs spent more time in gaps than IP (IP vs IPs: b = 0.30, p = 0.008, d = –1.44, 95% CI [–2.13, –0.76]). A significant interaction with order for ISZ (b = 0.43, p = 0.021) indicated longer gaps when IPs_ISZ occurred before IP_MAT, though this effect did not survive correction (p = 0.11) (Fig. 3).
Pause
Dyad level
The Linear Mixed Model results indicated a significant effect of the dyad on pauses median duration (IP_MAT vs IPs_ISZ: b = 0.042, p = 0.049, d = –0.76, 95% CI [–1.45, –0.08]), and pauses total duration (IP_MAT vs IPs_ISZ: b = 0.152, p = 0.049, d = –0.73, 95% CI [–1.43, –0.05]). Overall, the IPs_ISZ dyads had significantly longer median pauses (M = 0.45 s vs. M = 0.26 s) and a greater total duration of pauses (M = 2.77% vs. M = 2.61%) compared to the IP_MAT dyads (Fig. 4).
Fig. 4. The dyadic level is illustrated on the top row and the individual level on the bottom row. The first column displays the median duration of pauses (in seconds), the second column shows the total pause duration (in %), and the third column represents the total number of pauses (in %). Red boxplots correspond to dyads where the interacting participants were IP and MAT (IP_MAT), while blue boxplots correspond to dyads involving IPs and ISZ (IPs_ISZ). Individual-level comparisons are made between IP and IPs and between MAT and ISZ. * denotes p < 0.05, ** p < 0.01, and *** p < 0.001.
Individual level
The Linear Mixed Model results indicated a significant effect of the participant on the pause median duration for ISZ vs MAT (ISZ vs MAT: b = 0.079, p = .04, d = 1.13, 95% CI [0.47, 1.78]), with the ISZ producing significantly longer median pauses compared to the MAT participant (M = 0.48 s vs. M = 0.40 s) (Fig. 4). A significant order effect showed longer pauses when the interaction started with ISZ (p = 0.042), though it was not significant after correction (p = 0.14).
Correlation with willingness for future interactions
At the end of the interaction, the IP rated their willingness to engage in future interactions with MAT or ISZ. Correlations were conducted on variables that showed significant differences at the dyad level, as these variations indicate meaningful distinctions in interaction dynamics. Both dyadic outcomes (overall conversation synergy) and individual outcomes (partner fluency) were analyzed, as the IP’s willingness may be influenced by the flow of their partner’s speech.
IPs and IP willingness (using dyad outcome)
Among the 9 items of the willingness questionnaire, we found only one significant correlation, between the IPs’ “willingness to know better” and the dyad gaps’ median duration (p = 0.019; R² = 0.314). While only 31% of the variability of “willingness to know better” is explained by the duration of gaps, the significant correlation suggests that longer gaps in the conversation negatively affect the IPs’ perception and their willingness to get to know the ISZ better (Fig. 5a).
Concerning the IP’s willingness to interact with the MAT, we found a significant correlation between the IP’s total willingness and the dyad gaps’ median duration (p = 0.024; R² = 0.282). Longer gaps seem to be associated with a higher IP’s willingness to engage in future interaction with the MAT (Fig. 5b).
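For readers who want to relate the reported R² values to a correlation coefficient, the following Python sketch computes Pearson’s r and its square. The type of correlation used is not stated here, so a simple linear (Pearson) correlation is an assumption. Under that assumption, R² = 0.314 corresponds to |r| ≈ 0.56 and R² = 0.282 to |r| ≈ 0.53. The function name `pearson_r` and the data are hypothetical and are not taken from the study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical dyads: median gap duration (s) and a 1-5 willingness rating.
gap_median_s = [0.20, 0.25, 0.30, 0.35, 0.40]
willingness = [5, 4, 4, 2, 3]

r = pearson_r(gap_median_s, willingness)
r_squared = r ** 2  # share of variance in one variable explained by the other
print(round(r, 3), round(r_squared, 3))
```

The sign of r carries the direction of the association, which R² alone does not show. The two findings above differ in direction: negative for the IPs_ISZ dyads and positive for the IP_MAT dyads.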
IPs and IP willingness (using ISZ and MAT outcome, respectively)
There was no significant correlation between IPs’ willingness and the ISZs’ outcome, as well as between IP’s willingness and the MATs’ outcome.
Correlation with psychopathological variables (PANSS), medication, and duration of the disease
Because three PANSS assessments were missing, correlations were computed on 17 dyadic outcomes and 17 individual conversation outcomes for the ISZ and for their interacting partners (IPs). The partners were included because the ISZ’s psychopathological symptoms, medication, and duration of the disease may have shaped not only their own speech flow but also their partner’s.
PANSS scores and dyad outcome
Several significant correlations were found between the IPs_ISZ dyad outcome and the PANSS scores. In particular, “blunted affect”, “lack of spontaneity and flow of conversation”, and “motor retardation” were associated with a higher number of gaps, longer pause durations, and a lower total number of overlaps. Moreover, the “depression” item was correlated with an increase in gap median duration and pause total duration (Table 2) (see Supplementary Fig. 1 for the full correlation heatmap). No correlation was found between turn-taking metrics and either medication or the duration of the disease.
PANSS scores and individual ISZ, IPs' flow of speech
We found significant links between ISZ’s psychopathological symptoms and speech flow. Blunted affect was associated with fewer overlaps, motor retardation with more gaps and fewer overlaps, and depression with longer gaps, longer pauses, and fewer overlaps (Table 3). ISZ symptoms also affected their partner’s speech: increased gap duration in the IPs’ speech correlated with ISZ’s blunted affect and persecution, while longer pauses in the IPs’ speech were linked to ISZ’s motor retardation, lack of spontaneity, and blunted affect (Table 4) (see Supplementary Figs. 2 and 3 for the full correlation heatmaps). No correlation was found between turn-taking metrics and either medication or disease duration.
Discussion
The current study aimed to assess the turn-taking fluency between individuals with schizophrenia (ISZ) and controls and its relationship with social outcomes and psychotic symptomatology. We recruited 20 ISZ, 20 interacting partners (IP), and 20 matched participants (MAT). Blind to the ISZ diagnosis, each IP interacted once with an ISZ and once with a MAT. Consistent with our first hypothesis, we found significant differences between the IP_MAT and the IPs_ISZ dyads in the overlap, gap, and pause dimensions.
Concerning the overlap metric, our results indicated no difference in duration but a significant decrease in the total number of overlaps for the IPs_ISZ dyads compared to the IP_MAT dyads. This finding contradicts our hypothesis and challenges Lucarini et al., who found more overlaps in conversations with individuals with schizophrenia compared to controls37. At first glance, their results indicate that the control group minimized simultaneous speech to a greater extent, supporting the rules of Sacks et al.’s turn-taking model in which overlaps indicate a violation of turn-taking norms15,63. However, simultaneous talk is widespread in conversations63, and certain cooperative overlaps are more reflective of relationship expression dynamics than indicators of disrupted turn-taking27,63,64. This is the case of terminal overlaps that occur when a listener begins speaking as the other finishes, either by anticipating turn completion through verbal cues (lexico-syntactic, semantic, content) or by reacting to verbal or nonverbal turn-yielding signals19,53,65. Our results highlight that the IPs, rather than the ISZs, showed fewer overlaps, suggesting that IPs may have struggled to detect turn completions, possibly due to a lack of clear verbal and nonverbal cues from their partners. This interpretation aligns with previous studies that found reduced nonverbal behaviors, such as fewer hand gestures66,67, decreased rate of head movement68, increased flight behavior69, and impaired prosodic expression of emotion70 in ISZs.
We propose that if ISZs’ partners struggle to detect turn-taking cues, they may delay their response until they are certain the turn is over, leading to longer gaps, consistent with findings that reacting to silence lengthens transitions56. It also aligns with our findings of more and longer gaps in conversations with ISZs, especially when they initiated the session. However, although we anticipated longer gaps, we did not expect an increased number of gaps, because Lucarini et al. found no difference in gap rates37. We believe this discrepancy stems from contextual differences. Our study involved free conversation between IPs and ISZs, while Lucarini et al. analyzed semi-structured interviews led by psychiatrists aware of the diagnosis37. The higher overlap rates in their study may reflect the psychiatrist’s more directive role, often redirecting the conversation and interrupting the ISZs71, thus reducing gaps. This highlights that turn-taking rules are socially constructed and depend on the context and roles of participants72. Moreover, we found that only ISZs’ partners significantly increased their number of gaps compared to their conversations with MAT participants. Like Howes and Lavelle36, who found longer gaps only in ISZs’ partners, our results show that IPs produced more and longer gaps with ISZs than with MAT. ISZs also tended to produce longer gaps, especially when IPs_ISZ occurred first, suggesting coordination difficulties extend beyond the individual and are amplified when both partners lack task experience.
Differences in fluency in the IPs_ISZ dyads are also confirmed by the longer pauses found in those dyads compared to IP_MAT, aligning with our hypothesis on the pause duration metric. Specifically, we found ISZ participants to drive this dyadic increase, producing significantly longer pauses than MAT participants. Our findings align with previous research on speech fluency in ISZ during monologue tasks73,74 as well as studies employing dialog-based tasks, such as interviews75,76,77. However, since the latter often do not distinguish between pauses within speech and pauses between speakers, direct comparisons with our results remain challenging. Nevertheless, we can suggest that the increased pauses in ISZs’ speech may be linked to difficulties in structuring and planning their discourse, retrieving and selecting appropriate words, and organizing their thoughts, particularly at moments requiring sentences or ideas to be connected or transitioned78,79.
Overall, we hypothesized that these differences in overlaps, gaps, and pauses should be associated with poorer social outcomes in the IPs_ISZ dyad, such as a reduced willingness for future interaction. Our results indicate that, among the willingness items, only the IPs’ “willingness to know better” their ISZ partner correlated with gap duration: as gap length increased, their interest in getting to know their partner declined, consistent with research showing reduced connectedness in conversations with longer gaps59. In contrast, longer gaps in IP_MAT dyads were linked to a greater willingness to engage in future interactions, an unexpected finding. As Templeton et al. found in friends’ conversations, such pauses may support reflection. With ISZs, gaps might feel more awkward, similar to those in conversations between strangers59. This discrepancy suggests that the social impact of conversational gaps depends on the speaker and context; some perceive pauses as awkward, others as thoughtful. Additionally, the meaning attributed to these latencies can be shaped by contextual factors such as the topic of discussion, the nature of the relationship between participants, social norms, and the content exchanged80. These elements may also influence how willing a person is to continue the interaction. Moreover, it is worth noting that these correlations were observed using dyadic outcomes, emphasizing that individual participants may base their decision to continue the interaction on their overall perception of the conversation’s synergy, rather than solely on their partner’s speech flow.
Finally, our last hypothesis predicted that the differences in overlaps, gaps, and pauses observed in IPs_ISZ would be associated with the psychological dimensions of ISZ symptomatology. As expected, ISZ’s psychopathological symptoms affected their speech flow and were correlated with gaps and pauses in their partner’s dialog, influencing both participants and the overall conversation outcome. We found that higher numbers of gaps, longer pauses, and fewer overlaps were linked to negative symptoms of schizophrenia, such as blunted affect, motor retardation, and depression. These results support our hypothesis of reduced speech flow in ISZs, likely due to diminished verbal and nonverbal cues. Significant correlations with blunted affect and motor retardation on the PANSS scale, marked by reduced emotional responsiveness and slower motor activity and speech, further align with prior research linking negative symptoms to turn-taking patterns37,81.
While our results provide new insight into the flow of turn-taking in conversations with individuals diagnosed with schizophrenia, there are also limitations to consider. First, the small number of dyads may limit the generalizability of our findings, as it can increase data variability and reduce statistical power. This is particularly important when interpreting correlations between symptoms and conversational dynamics in the schizophrenia group, as reduced power may obscure meaningful associations. Second, French ethical regulations prevented collecting racial or ethnic data, limiting interpretation of willingness to interact, since cultural or dialectal factors may have influenced responses. Third, assigning a different partner to each dyad likely added individual variance, influencing dynamics and limiting generalizability. Fourth, the first four dyads used lapel microphones, while the rest used headsets. This inconsistency, along with lower lapel sound quality, may have introduced variability in participant behavior. Fifth, participants’ education levels were not recorded, despite potential effects on conversational behavior. Furthermore, although MoCA and BDI-II scores were obtained, the absence of medical comorbidity data limits full interpretation of variability in communication. Sixth, we found no correlation between disorganized symptoms and turn-taking measures, likely due to the use of a non-specific tool (the disorganization sub-scale of the PANSS82). Future studies should examine disorganization/turn-taking relationships using specific tools such as the Thought, Language, and Communication scale83 or the Thought and Language Disorder scale84. Seventh, although we tried to control the content of the interactions by directing participants to discuss topics they both enjoy, some participants may have encountered more difficulty finding common ground and topics of conversation that motivated them.
The content of the conversation was not analyzed in this study, although it could have played a role in the form and dynamics of turn-taking28. Eighth, turn-taking is multimodal and is shaped by both verbal and nonverbal cues85. Therefore, while this study was motivated by looking at the flow of the conversation only through specific conversational patterns (overlaps, gaps, pauses), other verbal and nonverbal signals could have shaped our findings. Future research should explore whether the reduced overlap proportion and increased gap and pause duration found in the IPs_ISZ dyads could be linked to lower gesture frequency in ISZ participants.
Conclusion
To conclude, our study supports the idea that turn-taking fluency is impaired in free dialog involving individuals diagnosed with schizophrenia because of an increased number and length of gaps and pauses. On the individual level, some outcomes, like overlap number, were mainly driven by one participant, highlighting the value of analyzing both individual and dyadic levels. Moreover, the overall conversational flow at the dyadic level influenced participants’ willingness to engage in future interactions. These dyadic outcomes were also shaped by ISZ symptoms, highlighting their impact on conversational quality. These insights offer new directions to improve therapeutic communication with individuals diagnosed with schizophrenia, including turn-taking techniques like signaling when it’s their turn to speak, pausing to give them time to respond, and prompting them to take their turn, all to support social engagement.
Data availability
The annotation data supporting the findings of this study are available in the Open Science Framework at https://osf.io/g97hn/, while the original audio recordings are available upon request from the corresponding author, T.F. The audio data are not publicly available as they may compromise participant consent and confidentiality.
Code availability
The code used to extract turn-taking metrics from the audio and to perform statistical analyses is available on the Open Science Framework at https://osf.io/g97hn/.
References
Dodell-Feder, D., Tully, L. M. & Hooker, C. I. Social impairment in schizophrenia: new approaches for treating a persistent problem. Curr. Opin. Psychiatry 28, 236–242 (2015).
Kuperberg, G. R. Language in schizophrenia Part 1: an introduction. Lang. Linguist. Compass 4, 576–589 (2010).
Bowie, C. R. & Harvey, P. D. Communication abnormalities predict functional outcomes in chronic schizophrenia: differential associations with social and adaptive functions. Schizophr. Res. 103, 240–247 (2008).
Marggraf, M. P., Lysaker, P. H., Salyers, M. P. & Minor, K. S. The link between formal thought disorder and social functioning in schizophrenia: a meta-analysis. Eur. Psychiatry 63, e34 (2020).
Brown, M. & Kuperberg, G. R. A hierarchical generative framework of language processing: linking language perception, interpretation, and production abnormalities in schizophrenia. Front. Hum. Neurosci. 9, 643 (2015).
Meyer, L., Lakatos, P. & He, Y. Language dysfunction in schizophrenia: assessing neural tracking to characterize the underlying disorder(s)? Front. Neurosci. 15, 640502 (2021).
Parola, A., Simonsen, A., Bliksted, V. & Fusaroli, R. Voice patterns in schizophrenia: a systematic review and Bayesian meta-analysis. Schizophr. Res. 216, 24–40 (2020).
Voppel, A. E., de Boer, J. N., Brederoo, S. G., Schnack, H. G. & Sommer, I. E. C. Semantic and acoustic markers in schizophrenia-spectrum disorders: a combinatory machine learning approach. Schizophr. Bull. 49, S163–S171 (2023).
Addington, J., Girard, T. A., Christensen, B. K. & Addington, D. Social cognition mediates illness-related and cognitive influences on social function in patients with schizophrenia-spectrum disorders. J. Psychiatry Neurosci. JPN 35, 49–54 (2010).
Cavieres, Á. & López-Silva, P. Social perception deficit as a factor of vulnerability to psychosis: a brief proposal for a definition. Front. Psychol. 13, 805795 (2022).
Charernboon, T. & Patumanond, J. Social cognition in schizophrenia. Ment. Illn. 9, 7054 (2017).
Molina, J. L. et al. Central auditory processing deficits in schizophrenia: effects of auditory-based cognitive training. Schizophr. Res. 236, 135–141 (2021).
Wilson, M. & Wilson, T. P. An oscillator model of the timing of turn-taking. Psychon. Bull. Rev. 12, 957–968 (2005).
Kendrick, K. H. & Holler, J. Conversation. Open Encycl. Cogn. Sci. https://doi.org/10.21428/e2759450.3c00b537 (2024).
Sacks, H., Schegloff, E. & Jefferson, G. A simplest systematics for the organization of turn-taking for conversation. Language 50, 696–735 (1974).
Stivers, T. et al. Universals and cultural variation in turn-taking in conversation. Proc. Natl. Acad. Sci. USA 106, 10587–10592 (2009).
Indefrey, P. & Levelt, W. J. M. The spatial and temporal signatures of word production components. Cognition 92, 101–144 (2004).
Holler, J. & Levinson, S. C. Multimodal language processing in human communication. Trends Cogn. Sci. 23, 639–652 (2019).
Levinson, S. C. & Torreira, F. Timing in turn-taking and its implications for processing models of language. Front. Psychol. 6, 731 (2015).
Tian, Y., Liu, S. & Wang, J. A corpus study on the difference of turn-taking in online audio, online video, and face-to-face conversation. Lang. Speech. https://doi.org/10.1177/00238309231176768 (2023).
Siegel, J. S. et al. Enrichment using speech latencies improves treatment effect size in a clinical trial of bipolar depression. Psychiatry Res. 340, 116105 (2024).
Boltz, M. G. Temporal dimensions of conversational interaction: the role of response latencies and pauses in social impression formation. J. Lang. Soc. Psychol. 24, 103–138 (2005).
Lennon, P. Investigating fluency in EFL: a quantitative approach. Lang. Learn. 40, 387–417 (1990).
van Os, M., de Jong, N. H. & Bosker, H. R. Fluency in dialogue: turn-taking behavior shapes perceived fluency in native and nonnative speech. Lang. Learn. 70, 1183–1217 (2020).
Bosker, H., Quené, H., Sanders, T. & de Jong, N. The perception of fluency in native and nonnative speech. Lang. Learn. 64, 579–614 (2014).
Bosker, H. R., Pinget, A.-F., Quené, H., Sanders, T. & de Jong, N. H. What makes speech sound fluent? The contributions of pauses, speed and repairs. Lang. Test. 30, 159–175 (2013).
Skantze, G. Turn-taking in conversational systems and human-robot interaction: a review. Comput. Speech Lang. 67, 101178 (2021).
Koudenburg, N., Postmes, T. & Gordijn, E. H. Beyond content of conversation: the role of conversational form in the emergence and regulation of social structure. Personal. Soc. Psychol. Rev. 21, 50–71 (2017).
Templeton, E. M., Chang, L. J., Reynolds, E. A., Cone LeBeaumont, M. D. & Wheatley, T. Fast response times signal social connection in conversation. Proc. Natl. Acad. Sci. USA 119, e2116915119 (2022).
Cangemi, F. et al. Content-free speech activity records: interviews with people with schizophrenia. Lang. Resour. Eval. 58, 925–949 (2023).
Roberts, F., Francis, A. L. & Morgan, M. The interaction of inter-turn silence with prosodic cues in listener perceptions of “trouble” in conversation. Speech Commun. 48, 1079–1093 (2006).
Roberts, F., Margutti, P. & Takano, S. Judgments concerning the valence of inter-turn silence across speakers of American English, Italian, and Japanese. Discourse Process. 48, 331–354 (2011).
van Leeuwen, A. R. Right on Time: Synchronization, Overlap, and Affiliation in Conversation (LOT, 2017).
Saccone, V., Trillocco, S. & Moneglia, M. Markers of schizophrenia at the prosody/pragmatics interface. Evidence from corpora of spontaneous speech interactions. Front. Psychol. 14, 1233176 (2023).
Howes, C., Lavelle, M., Healey, P. G., Hough, J. & McCabe, R. Disfluencies in dialogues with patients with schizophrenia. Proc. 39th Annu. Meet. Cogn. Sci. Soc. 565–570 (Cognitive Science Society, 2017).
Howes, C. & Lavelle, M. Quirky conversations: how people with a diagnosis of schizophrenia do dialogue differently. Philos. Trans. R. Soc. B Biol. Sci. 378, 20210480 (2023).
Lucarini, V. et al. Language in interaction: turn-taking patterns in conversations involving individuals with schizophrenia. Psychiatry Res. https://doi.org/10.1016/j.psychres.2024.116102 (2024).
Tahir, Y. et al. Non-verbal speech analysis of interviews with schizophrenic patients. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 5810–5814 (2016).
Parola, A. et al. Speech disturbances in schizophrenia: assessing cross-linguistic generalizability of NLP automated measures of coherence. Schizophr. Res. 259, 59–70 (2023).
Alpert, M., Kotsaftis, A. & Pouget, E. R. Speech fluency and schizophrenic negative signs. Schizophr. Bull. 23, 171–177 (1997).
Riehle, M. & Lincoln, T. M. Investigating the social costs of schizophrenia: facial expressions in dyadic interactions of people with and without schizophrenia. J. Abnorm. Psychol. 127, 202–215 (2018).
Kay, S. R., Fiszbein, A. & Opler, L. A. The Positive and Negative Syndrome Scale (PANSS) for schizophrenia. Schizophr. Bull. 13, 261–276 (1987).
Nasreddine, Z. S. et al. The Montreal Cognitive Assessment, MoCA: a brief screening tool for mild cognitive impairment. J. Am. Geriatr. Soc. 53, 695–699 (2005).
Beck, A. T., Steer, R. A. & Brown, G. Manual for the Beck Depression Inventory–II (Psychological Corporation, 1996).
Mackinnon, A. & Mulligan, R. Estimation de l’intelligence prémorbide chez les francophones. [The estimation of premorbid intelligence levels in French speakers]. Encéphale Rev. Psychiatr. Clin. Biol. Thér. 31, 31–43 (2005).
Watson, D., Clark, L. A. & Tellegen, A. Development and validation of brief measures of positive and negative affect: the PANAS scales. J. Pers. Soc. Psychol. 54, 1063–1070 (1988).
Sasson, N. J. et al. Neurotypical peers are less willing to interact with those with autism based on thin slice judgments. Sci. Rep. 7, 40700 (2017).
Schmidt, R. & Fitzpatrick, P. Embodied synchronization and complexity in a verbal interaction. Nonlinear Dyn. Psychol. Life Sci. 23, 199–228 (2019).
Coyne, J. C. Depression and the response of others. J. Abnorm. Psychol. 85, 186–193 (1976).
Byrne, D. Attitudes and attraction. In Advances in Experimental Social Psychology (ed. Berkowitz, L.) Vol. 4 35–89 (Academic Press, 1969).
Montoya, R., Horton, R. & Kirchner, J. Is actual similarity necessary for attraction? A meta-analysis of actual and perceived similarity. J. Soc. Pers. Relatsh. 25, 889–922 (2008).
Boothby, E. J., Cooney, G., Sandstrom, G. M. & Clark, M. S. The liking gap in conversations: do people like us more than we think? Psychol. Sci. 29, 1742–1756 (2018).
De Ruiter, J., Mitterer, H. & Enfield, N. Projecting the end of a speaker’s turn: a cognitive cornerstone of conversation. Language 82, 515–535 (2006).
Bertrand, A. M., Mercier, C., Bourbonnais, D., Desrosiers, J. & Gravel, D. Reliability of maximal static strength measurements of the arms in subjects with hemiparesis. Clin. Rehabil. 21, 248–257 (2007).
Duncan, S. Some signals and rules for taking speaking turns in conversations. J. Pers. Soc. Psychol. 23, 283–292 (1972).
Heldner, M. & Edlund, J. Pauses, gaps and overlaps in conversations. J. Phon. 38, 555–568 (2010).
Hammouri, H. M., Sabo, R. T., Alsaadawi, R. & Kheirallah, K. A. Handling skewed data: a comparison of two popular methods. Appl. Sci. 10, 6247 (2020).
Ochi, K. et al. Quantification of speech and synchrony in the conversation of adults with autism spectrum disorder. PLoS ONE 14, e0225377 (2019).
Templeton, E. M., Chang, L. J., Reynolds, E. A., Cone LeBeaumont, M. D. & Wheatley, T. Long gaps between turns are awkward for strangers but not for friends. Philos. Trans. R. Soc. B Biol. Sci. 378, 20210471 (2023).
Bates, D., Mächler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48 (2015).
Kuznetsova, A., Brockhoff, P. & Christensen, R. LmerTest: tests in linear mixed effects models. J. Stat. Softw. 82, 1–26 (2017).
Ter Bekke, M., Drijvers, L. & Holler, J. The predictive potential of hand gestures during conversation: an investigation of the timing of gestures in relation to speech. Proc. 7th GESPIN – Gesture and Speech in Interaction Conf. (KTH Royal Institute of Technology, 2020).
Lee, K. Why do we overlap each other?: Collaborative overlapping talk in English as a Lingua Franca (ELF) communication. Korean J. Engl. Lang. Linguist. 20, 613–641 (2020).
Truong, K. P. Classification of cooperative and competitive overlaps in speech using cues from the context, overlapper, and overlappee. Proc. Interspeech 2013 1404–1408 (International Speech Communication Association, 2013).
Riest, C., Jorschick, A. B. & de Ruiter, J. P. Anticipation in turn-taking: mechanisms and information sources. Front. Psychol. 6, 89 (2015).
Annen, S., Roser, P. & Brüne, M. Nonverbal behavior during clinical interviews: similarities and dissimilarities among schizophrenia, mania, and depression. J. Nerv. Ment. Dis. 200, 26–32 (2012).
Del-Monte, J. et al. Nonverbal expressive behaviour in schizophrenia and social phobia. Psychiatry Res. 210, 29–35 (2013).
Abbas, A. et al. Computer vision-based assessment of motor functioning in schizophrenia: use of smartphones for remote measurement of schizophrenia symptomatology. Digit. Biomark. 5, 29–36 (2021).
Dimic, S. et al. Non-verbal behaviour of patients with schizophrenia in medical consultations – a comparison with depressed patients and association with symptom levels. Psychopathology 43, 216–222 (2010).
Ben Moshe, T., Ziv, I., Dershowitz, N. & Bar, K. The contribution of prosody to machine classification of schizophrenia. Schizophrenia 10, 1–9 (2024).
Nguyen, A., Guydish, A. & Fox Tree, J. Backchannels in the lab and in the wild. Interact. Stud. Soc. Behav. Commun. Biol. Artif. Syst. 25, 70–99 (2024).
Edwards, D. Gaps and overlaps in conversation: analyses of differentiating factors. PhD thesis, University of Texas at Arlington (2024).
Cannizzaro, M. S., Cohen, H., Rappard, F. & Snyder, P. J. Bradyphrenia and bradykinesia both contribute to altered speech in schizophrenia: a quantitative acoustic study. Cogn. Behav. Neurol. J. Soc. Behav. Cogn. Neurol. 18, 206–210 (2005).
Rapcan, V. et al. Acoustic and temporal analysis of speech: a potential biomarker for schizophrenia. Med. Eng. Phys. 32, 1074–1079 (2010).
de Boer, J. N., Brederoo, S. G., Voppel, A. E. & Sommer, I. E. C. Anomalies in language as a biomarker for schizophrenia. Curr. Opin. Psychiatry 33, 212 (2020).
Oomen, P. P. et al. Characterizing speech heterogeneity in schizophrenia-spectrum disorders. J. Psychopathol. Clin. Sci. 131, 172–181 (2022).
Stanislawski, E. R. et al. Negative symptoms and speech pauses in youths at clinical high risk for psychosis. Npj Schizophr. 7, 1–3 (2021).
Cokal, D. et al. Disturbing the rhythm of thought: speech pausing patterns in schizophrenia, with and without formal thought disorder. PLoS ONE 14, e0217404 (2019).
Matsumoto, Y., Takahashi, H., Murai, T. & Takahashi, H. Visual processing and social cognition in schizophrenia: relationships among eye movements, biological motion perception, and empathy. Neurosci. Res. 90, 95–100 (2015).
McLeod, S. How to Conduct Conversational Analysis | Guide & Examples. https://doi.org/10.13140/RG.2.2.35524.23688 (2024).
Tahir, Y. et al. Non-verbal speech cues as objective measures for negative symptoms in patients with schizophrenia. PLoS ONE 14, e0214314 (2019).
van der Gaag, M. et al. The five-factor model of the Positive and Negative Syndrome Scale II: a ten-fold cross-validation of a revised model. Schizophr. Res. 85, 280–287 (2006).
Andreasen, N. C. Thought, language, and communication disorders. I. Clinical assessment, definition of terms, and evaluation of their reliability. Arch. Gen. Psychiatry 36, 1315–1321 (1979).
Kircher, T. et al. A rating scale for the assessment of objective and subjective formal Thought and Language Disorder (TALD). Schizophr. Res. 160, 216–221 (2014).
Alviar, C., Kello, C. T. & Dale, R. Multimodal coordination and pragmatic modes in conversation. Lang. Sci. 97, 101524 (2023).
Acknowledgements
This work was funded by the Agence Nationale de la Recherche (ANR) project ENHANCER #ANR-22-CE17-0036-03. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Author information
Contributions
T.F.: Conceptualization, Data Curation, Formal Analysis, Investigation, Methodology, Software, Visualization, Writing – Original Draft; G.M.: Conceptualization, Methodology, Project Administration, Supervision, Validation, Funding Acquisition, Resources, Writing – Review & Editing; R.C.S.: Conceptualization, Methodology, Project Administration, Supervision, Validation, Funding Acquisition, Resources, Writing – Review & Editing; M.P.: Data Curation, Investigation, Methodology, Writing – Review & Editing; V.V.: Data Curation, Investigation, Methodology, Writing – Review & Editing; D.M.: Data Curation, Investigation, Methodology, Writing – Review & Editing; D.C.: Project Administration, Supervision, Validation, Funding Acquisition, Resources, Writing – Review & Editing; S.R.: Project Administration, Supervision, Validation, Resources, Funding Acquisition, Writing – Review & Editing; L.M.: Project Administration, Supervision, Validation, Resources, Funding Acquisition, Writing – Review & Editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Fauviaux, T., Mostafaoui, G., Schmidt, R.C. et al. Turn-taking fluency in free conversations with individuals diagnosed with schizophrenia. Schizophr 11, 130 (2025). https://doi.org/10.1038/s41537-025-00678-y