Abstract
Success in public speaking hinges on engaging an audience, which involves a high-stakes social interaction process that remains a significant source of anxiety and stress for many. Using a virtual-reality (VR)-based experimental paradigm, we tested how speakers delivering scientific talks perceive and respond to supportive vs. unsupportive audiences. We collected behavioral (gaze, paralinguistics, motion expressiveness/openness), physiological (heart rate, electroencephalography, breathing rate, pupil dilation), and self-report measures to assess audience effects. The unsupportive audience elicited greater negative affect, arousal, and anxiety, and higher perceived cognitive and social effort. Physiologically and behaviorally, speaking to the unsupportive audience decreased the speaking rate. Acoustic analyses further indicated greater emotional arousal and vocal dominance in the unsupportive condition. These findings highlight VR combined with physiological measurement as a powerful approach for investigating audience effects and social-communication processes, with clear implications for augmenting social intelligence and communication skills.
Similar content being viewed by others
Introduction
Whether delivering a scientific talk, a business pitch, or a university lecture, public speaking is a key competency that 21st century knowledge workers need to master to succeed, but also one of the most highly ranked sources of anxiety and apprehension. Public speaking is a hotbed of social communication processes1,2, governed by a dynamic feedback loop where a speaker’s efforts to engage an audience are met with audience reactions that can powerfully shape their performance and trigger profound anxiety. Although the term public speaking emphasizes verbal communication, the uniquely social nature of public speaking is perhaps even more evident in nonverbal and paralinguistic channels: Speakers, for instance, connect with their audiences through an array of non- and paraverbal behaviors, including eye gaze, body language, or prosody.
Often overlooked, however, is the fact that audiences are also communicating with the speaker. Audience feedback occurs predominantly via social-evaluative nonverbal signals – for example, unengaged audiences make no eye contact and exhibit little positive backchannel activity (e.g., head nodding)3. Such social feedback signals are perceived by the speakers, which can set off a cascade of negative self-attributions, feelings and symptoms of embarrassment (e.g., blushing, dry mouth), and psychological disturbances (e.g., working memory interference, speech disfluencies)4,5,6,7,8,9. These effects of the audience on the speaker are the topic of the current study.
This study examines the effects of the audience’s behavior on the speaker, using virtual reality as an experimental petri dish to examine social communication processes. Theoretically, our work is informed by the biopsychological (BPS) model of challenge and threat10, which connects the social-evaluative nature of public speaking to people’s biopsychological responses. Methodologically, we leverage immersive virtual reality (VR) as well as physiological, behavioral, and subjective measures to comprehensively assess a broad spectrum of behavioral, viscero-motor, and social-cognitive variables under realistic public speaking conditions. The results augment existing efforts in the scientific community to capture the multifaceted aspects of social processes and build better interventions targeting public speaking anxiety.
In the following, we first summarize related works and point out the research gap, which consists of causal manipulation of social factors (i.e., the audience behaviors that form the stimulus triggering social-evaluative interpretations in speakers) and the measurement of the complex dynamics underlying public speaking in the context of science communication. We then explain how the current VR-based approach offers a unique solution. Finally, we present the current study in which participants gave scientific presentations in front of two types of virtual audiences (supportive vs. unsupportive).
Challenge and threat in public speaking processes
The behavior of an audience is a potent form of social-evaluative feedback that can act as a significant psychological stressor, comparable to other social challenges like exclusion11 or stress interviews12. Influential frameworks, such as the Biopsychological Model of Challenge and Threat, describe how individuals respond to such stressful performance situations10. The model posits that individuals assess their resources and demands of a situation, leading to either a challenge or a threat response. A challenge response occurs when individuals perceive they have sufficient resources to cope with demands, resulting in adaptive physiological responses, such as increased heart rate. Conversely, a threat response arises when individuals are overwhelmed and lacking in resources, triggering negative physiological reactions, such as increased blood pressure and anxiety.
Although the challenge/threat framework was originally developed in social psychology with a focus on coping and resilience under stress, it can be applied to the public speaking situation. At the core of the challenge/threat framework lies how interpretations of challenging/threatening social situations influence physiology, with a focus on cardiovascular stress responses. While we do not directly test this model, it informs and motivates our multi-modal approach. Since public speaking is a complex biopsychosocial phenomenon, studying it requires a strategy that can capture its different facets simultaneously. Established frameworks in emotion and stress research12,13 suggest that relying on a single measurement modality has limitations in this regard. Self-reports provide insight into a speaker’s subjective experience but are retrospective and cannot capture real-time dynamics. Behavioral measures quantify overt performance but are ambiguous regarding the speaker’s internal state. Finally, physiological measures offer an objective window into bodily arousal but cannot, on their own, distinguish between positive states like challenge and negative states like threat. Therefore, the strength of our approach lies in triangulating these three channels – subjective, behavioral, and physiological. This integration allows us to capture a holistic picture of the speaker’s response to the public speaking challenge and their coping behaviors13,14.
This multi-modal perspective reveals the limitations of traditional approaches to study public speaking. Foundational research, particularly in work about communication apprehension, public speaking anxiety, and social stress has largely relied on observation and self-reported evaluations15,16. For instance, MacIntyre and colleagues provided participants with scenarios describing audience characteristics (e.g., responsiveness, pleasantness) and had them answer a questionnaire about their imagined audiences17,18. Hsu had participants give impromptu speeches in front of confederate audience members and complete a questionnaire after the task19. However, studies that solely use self-reports do not capture the hidden neurophysiological and overt behavioral phenomena that are ongoing as the public speaking situation unfolds, such as how speakers respond to audience reactions during their live performance. Other researchers have examined speakers’ biological responses, such as cardiovascular reactivity and changes in heart rate, as they spoke in front of various human audiences20,21,22. However, these studies occurred in the lab with confederates as audiences, leading to low ecological validity and cost and time spent on the confederates.
Current study: VR-based experimentation for examining public speaking processes
Within this context, immersive VR technology holds great potential for experimental research in public speaking. Immersive VR enables researchers to precisely manipulate experimental variables while simulating real-life environments and situations, enhancing ecological validity of lab-based studies23. This capability applies to social situations and cues as well. For instance, researchers can vary audience size, engagement levels, and positive and negative audience reactions and study the effects on the speaker24,25,26. When integrated with physiological and behavioral measurements such as eye-tracking and heart rate tracking, VR-based paradigms can illustrate the cause-effect mechanisms in real-time27. These features have led to recent research that leveraged immersive VR to study the effects of virtual audiences on speakers7,8,28,29.
Building upon this body of literature, our study examines how speakers respond to socio-evaluative threat by manipulating virtual audience behavior in immersive VR (See Fig. 1). Specifically, participants were asked to give a research-based presentation to supportive as well as unsupportive computer-generated audiences (within-subject design) while immersed in a virtual environment that resembled a typical academic conference venue. We focus on scientific presentation because of its multifaceted nature: Effective scientific presentation requires sustained audience engagement and breaking down complex ideas for the audience in addition to interesting content itself30. Also, scientific presentations are often connected to important outcomes such as getting a job or persuading the audience toward a specific action, making feedback signals from the audience more salient to the speaker.
Overview of the study setup and design, main measures, and predictions. Top Left Panel: Participants were asked to come to the lab with two 8-12-minute scientific presentations prepared. During the study, participants wore the VR headset and gave two presentations in front of a large audience. Top Right Panel: Presentations were given to a supportive audience vs. an unsupportive audience (counterbalanced across participants). Bottom Panel: To comprehensively assess audience effects, we examined participants’ subjective, physiological, and behavioral responses (see the “Methods” section for details). We tested the differences between the supportive vs. unsupportive audience manipulation (bar graphs are conceptual illustrations of our predictions and do not show empirical data).
In terms of measurement, we combined multiple physiological (e.g., electroencephalography, heart rate), behavioral (e.g., expressive motion, valence in the voice), and subjective measures (e.g., self-reported anxiety symptoms experienced during the presentation), aiming to capture not only speakers’ self-reported experience after the performance, but also their neurophysiological activity and behavior during the presentations. Our main hypotheses centered around the effect of the audience manipulation on these measures. First, we made the following predictions about audience effects on people’s subjective experiences:
Hypothesis 1 (H1): Individuals will perceive greater pressure to exert (a) cognitive, (b) physical, and (c) social effort when presenting in front of the unsupportive audience compared to the supportive audience.
H2: Individuals will report greater anxiety-related symptoms, manifested through (a) language and behavior, (b) thoughts, and (c) physiology when presenting in front of the unsupportive audience compared to the supportive audience.
H3: Individuals will report having experienced (a) greater negative affect, (b) heightened arousal, and (c) lower sense of dominance when presenting in front of the unsupportive audience compared to the supportive audience.
We also expected that audience behavior will have similar effects on the following corresponding physiological and behavioral measures:
H4: The unsupportive audience will elicit (a) greater negative affect, (b) heightened arousal, and (c) lower sense of dominance in the voice when presenting in front of the unsupportive audience compared to the supportive audience.
H5: The unsupportive audience will induce elevated levels of (a) breathing rate, (b) heart rate, and (c) pupil dilation compared to the supportive audience.
Furthermore, we posed two research questions about audience effects on neural activity and speaking and nonverbal behaviors because those measures often represent a combination of complex processes:
Research Question 1 (RQ1): Will audience behavior influence individuals’ neural activity?
RQ2: Will audience behavior influence individuals’ (a) speaking rate, (b) openness in motion, (c) expressiveness in motion, (d) eye contact with the audience, and (e) gaze dispersion while presenting?
In addition to audience effects, we examined the effectiveness of the VR-based public speaking intervention in general:
RQ3: Does completing the VR-based public speaking tasks decrease anxiety levels?
RQ4: After individuals complete the VR-based public speaking tasks, how likely are they to recommend the intervention to others?
Results
We first present the results from the within-subject comparisons of participants’ subjective experience (H1-H3; see Fig. 2; Table 1). As predicted by H1, participants perceived that the unsupportive audience required more cognitive, physical, and social energy than the supportive audience (cognitive: F(1,76) = 11.12, p = .001; physical: F(1,76) = 5.75, p = .019; social: F(1,76) = 5.28, p = .024). In addition, participants reported having experienced more communication anxiety symptoms when presenting in front of the unsupportive audience compared to the supportive audience. They felt the symptoms especially manifested in their language and behavior (F(1,76) = 6.90, p = .010) and thinking (F(1,76) = 16.83, p < .001), supporting H2a and H2b. Finally, participants reported greater levels of negative affect (F(1,76) = 8.09, p = .006) and emotional arousal (F(1,76) = 4.78, p = .032) after presenting to the unsupportive audience compared to the supportive audience, which aligned with H3a and H3b. H2c and H3c were not supported. However, the findings overall demonstrated that audience behavior affected peoples’ social-cognitive thought patterns, extending even to the level of perceptions of visceral and motor functions.
llustration of results: audience effects on speakers’ behavior, physiology, and subjective feelings. Post-study survey data confirmed that speakers were perceiving expected differences between the supportive and unsupportive audiences, in line with the intended manipulations (Manipulation Checks Panel). Results also revealed reductions in anxiety levels, and stronger reductions on the basis of public speaking anxiety (Public Speaking Intervention Panel). We assessed the effects of the manipulated audience on behavior, physiological and subjective experiences. The corresponding panels are examples of variables in each cluster. See text and tables for full results of additional variables.
Next, we present audience effects on participants’ physiology and behavior (H4-H5, RQ1-RQ2; see Table 2). Speaking in front of the unsupportive audience elicited higher levels of arousal (F(1,71) = 12.05, p < .001) in the voice, supporting H4b. Interestingly, individuals expressed greater dominance in the voice (F(1,71) = 11.30, p = .001), and vocal valence did not differ by audience type while presenting in front of the unsupportive audience. Thus, H4a and H4c were not supported. Contrary to our predictions in H5, individuals’ breathing rate, heart rate, and pupil dilation did not vary by audience type. Regarding RQ1, we found that audience behavior significantly influenced individuals’ neural activity (measured via alpha-to-beta ratio; F(1,72) = 4.14, p = .046). For RQ2, speaking to the unsupportive audience decreased the speaking rate (F(1,71) = 8.11, p = .006). Other behavioral measurements did not differ by condition; however, participants who reported greater speaker anxiety prior to the intervention were less open and expressive nonverbally when presenting in front of the unsupportive vs. supportive audience (rexpressiveness = 0.28, pexpressiveness = 0.020; ropenness = 0.24, popenness = 0.047). These results highlight how the (simulated) social audience behavior affects biological and behavioral and how a speaker’s temperament or personality shapes behavioral performance under stress.
We also assessed the effects of the VR-based public speaking tasks as a whole (RQ3-4). Completing the VR public speaking intervention lowered state anxiety overall (Mpost = 33.4, SDpost = 9.34; Mpre = 38.1, SDpre = 10.6; t = 5.67, p < .001), with effects more pronounced for those who reported higher levels of speaker anxiety prior to the virtual presentation (F(1,220) = 19.78, p < .001). In addition, participants were likely to recommend this VR-based public speaking simulation to others (M = 5.45 out of 7; SD = 0.64). Thus, this VR-based public speaking simulation shows promise as a platform for communication training, with the added benefit that it enables experiments on how social-cognitive processes influence physiology, and behavior.
Discussion
Here we examined how participants responded to two social-communicative challenges in VR. We found that the virtual audience behavior strongly affected participants, demonstrating the value of VR as a social-cognitive experimentation tool, underscoring its value for training challenging public-speaking situations, and providing insights into the dynamics of speaker behavior during social-evaluative communication challenges. While the task was a scientific presentation, it serves as a robust model for a wide range of high-stakes speaking situations where audience feedback is critical. Our findings on this communication feedback loop between speakers and audiences are therefore relevant across a wide range of professional and academic contexts - from business pitches to job talks.
First, people’s subjective experiences generally supported our predictions. Participants reported exerting greater effort and experiencing more cognitive and behavioral anxiety symptoms when presenting in front of the supportive audience. They also felt more negative affect and emotional arousal for the unsupportive vs. supportive audience. These results demonstrate that the experimentally manipulated virtual audience had a strong influence on participants’ subjective experience and that - despite the virtual and ultimately artificial nature of this mediated communication challenge - it produced real-world effects31.
In addition, the physiological and behavioral measures offer insights into the speaker’s response to the virtual audience and show potential evidence of compensatory strategies. The combination of increased vocal arousal and dominance alongside a higher alpha-to-beta ratio might be interpreted as a compensatory vocal effort, perhaps as an attempt to re-engage the audience, coupled with cognitive disengagement due to withdrawing or re-allocating mental resources from the presentation content itself.
Though generally harder to interpret, the null findings are also informative. For instance, the lack of a significant difference in heart rate, despite the increase in vocal arousal, suggests a difference between the communicative compensation strategies and the systemic stress response. Similarly, the absence of an effect on gaze behavior may reflect the high cognitive load of the task: the need to manage slides and notes could have created a floor effect that might have masked subtler gaze aversion and apprehension patterns9. In sum, these results underscore that arousal and stress are not monolithic constructs, and that behaviors can vary by task demands. Importantly, we also found that individual differences moderated these behavioral responses; participants with higher levels of speaking anxiety became less open and expressive when facing the unsupportive audience, suggesting that pre-existing dispositions can amplify negative behavioral reactions to social threat.
By triangulating multiple measures – behavioral, physiological, and experiential – the current study provides a holistic picture of public speaking, one of the most important social communication skills across business, educational, and civic contexts. Methodologically, by simulating real-life communication environments and manipulating theoretical variables like audience supportiveness, we demonstrate the utility of VR-based simulations to studying complex social communication processes. This goes beyond existing work that has used VR to study people’s self-reported ratings of virtual audiences. Going forward, integrating VR-based training systems with real-time physiological and behavioral monitoring could lead to significant practical implications. For instance, future systems might monitor participants’ responses in real-time and provide adaptive feedback (e.g., virtual audience members smiling if the speaker is effectively using humor), personalized training (e.g., try “X” instead of “Y” here), or social support or specific symptom-provocations geared towards communication apprehension25,32.
However, as with all research, this study is not without limitations. For example, in the post-experimental interview, some speakers mentioned that despite the simulation being realistic and immersive, they were aware the audiences were not real. Some speakers also mentioned that it would have been helpful if the audience behavior changed based on their performance as we suggested above future systems might.
Another limitation stems from our within-subject design, in which presentations were delivered successively. Although participants removed the headset to fill out survey questions, this raises the risk that the emotional state induced by the first condition could have influenced the speaker’s response to the second. While our counterbalancing and statistical controls for presentation order mitigate this risk, future studies aiming to validate VR platforms specifically as training tools should consider between-subjects designs to fully isolate the effects of audience condition.
Another methodological consideration relates to our sampling approach: Our study did not employ clinical exclusion criteria. This was a deliberate choice aimed at capturing the full spectrum of public speaking response in our university-affiliated sample – our target for generalizability of the science communication task. However, we acknowledge that this means our results describe dynamics of social-evaluative threat in a non-clinical sample. While the demanding nature of the task likely deterred individuals with debilitating anxiety from participating, the findings should not be generalized to clinical populations.
Lastly, most of the physiological and behavioral measures were aggregated across participants. Future work could examine individualized responding and trace dynamic changes over the duration of the presentation.
Going forward, given the rapid developments towards high-realism avatars and LLM-based agents, often combined with VR33,34, we see a lot of potential to create more flexible and realistic audiences. For example, it would be important and feasible, to not only simulate the public speech itself, but also the Q&A session afterwards, as this session is a more bi-directional communication situation and one in which speakers are on the spot and may receive completely unanticipated questions.
Furthermore, while our multi-modal measurement approach is currently still resource-intensive, we believe it can serve as a blueprint for future communication training technologies. Specifically, the rapid integration of biosensors, eye-tracking, and body-tracking into consumer VR/AR headsets will soon make a comprehensive, real-time data collection much easier and likely commonplace. With this in mind, our work demonstrates how integrated physiological and behavioral data can be used to understand a speaker’s state, paving the way for adaptive, next-generation training and feedback systems.
Conclusion
The current study examined how audience behavior in immersive VR impacts the speaker. The findings generally supported our predictions about audience effects on subjective experiences and shed insights into the speakers’ physiological and behavioral responses. The results, including the null findings (e.g., no significant differences in the heart rate or gaze), underscored the complexity of the public speaking process and that physiological and behavioral responses can vary by task demands. In addition to demonstrating how we can use immersive VR as an experimental tool to decipher social-cognitive mechanisms, this study also highlights how VR-based experimental paradigms can be integrated with subjective, physiological, and behavioral measures to create synergy between stimulus delivery, online social cognition, and communication outcomes, and between experimental control and ecological validity. Finally, understanding social-cognitive mechanisms of public speaking holds immense practical significance. Public speaking anxiety affects many individuals and public speaking skills are a top-ranked 21st-century job skill. Therefore, leveraging VR to enhance communication skills in speakers – through training, feedback, and intervention – is a burgeoning area where insights from communication science and biology can yield tangible benefits.
Methods
The study was approved by the Institutional Review Board at Michigan State University [STUDY00007966] and conducted in accordance with relevant guidelines and regulations. We obtained informed consent and collected data from 80 participants (Mage: 32.28, SDage: 10.60, rangeage: 20–75; 33 Male, 43 Female, 4 Nonbinary). We recruited participants who have presented or will need to present science or research-based work, which included graduate students, postdoctoral researchers, and faculty. All participants received monetary reimbursement for their participation. Sample size was determined a-priori based on a power analysis for a paired-sample t-test (α = 0.05, 1-β = 0.8). We powered the study to detect a small-to-medium-sized effect (d = 0.3), which was chosen as a conservative estimate and as the smallest effect of practical interest; while literature on in-person social-evaluative stressors often reports larger effects, we anticipated that responses might be attenuated in VR. This analysis suggested a sample size of 71 participants.
Software and equipment
See Fig. 3 for an overview of the software and measurement equipment used. We used the Virtual-Orator public speaking software35 as the VR-based experimental platform. This software allowed us to create the computer-generated audiences and extract general gaze behavior while completing the public speaking tasks. For the head mounted display, we used the HP Reverb G2 Omnicept. The headset included pupil dilation and heart rate measurement capabilities. In addition, we used the GTec Unicorn mobile cap to collect electroencephalography (EEG) data and Vernier Respiration Belt to measure the breathing rate. Furthermore, the Optitrack motion capture suit recorded participants’ nonverbal behavior.
Overview of mthodology. Participants gave two scientific talks while wearing the HP Omnicept VR headset. Their physiology and behavior were recorded via the respective measurement gear, and they reported their subjective experiences through the post-presentation questionnaires. The independent variable was the type of audience (manipulated in the Virtual-Orator platform), and the dependent variables were the physiological, behavioral, and subjective responses measured, enabling a comprehensive assessment of the complex public speaking processes.
Experimental conditions
We used Virtual Orator software to generate the supportive and unsupportive audiences. Specifically, we adjusted the audience parameters, ‘audience friendliness’, ‘audience interest’, ‘audience concentration’, and ‘general distraction’ for each type of audience. For the supportive audience, these parameters were all set to the maximum levels, resulting in audience behaviors that signaled attentiveness and friendliness to the speaker. These settings generated the attentive audience, who looked at the speaker or the projector with the slides, had an upright body posture, and neutral and friendly expressions, and did not make any distracting noise. The unsupportive audience, on the other hand, was generated by setting the respective software controls to the lowest levels, resulting in an audience that appeared distracted and unfriendly. In this condition, audience members turned away from the speaker (as if talking to the person behind them), frequently checked their cell phones, slouched, and fell asleep. Beyond audience behavior, the unsupportive condition also included loud environmental noise upon entering the room and cell phone alert and talking sounds during the presentation.
As manipulation check, we asked participants a series of audience evaluation questions in the post-presentation survey after each presentation. Specifically, participants were asked to rate the audience on seven adjectives (e.g., attentive, friendly, cold, ranging from 1 - not at all to 7 - very much). The paired t-tests showed that participants perceived the supportive and unsupportive audiences accordingly (see Table 3), demonstrating that our manipulations worked as intended.
Procedures
This study employed a within-subject design in which all participants gave two presentations, one to a supportive vs. unsupportive audience. This design was chosen for its statistical power to control for the large individual differences inherent in public speaking competency and its physiological correlates. By having each participant act as their own control, we could more sensitively isolate the effects of the audience manipulation. The order of these presentations was counterbalanced across participants to mitigate potential carry-over effects (e.g., habituation or contrast effects) so that half of the sample started with an unsupportive audience, the other half with the supportive audience.
Participants filled out a pre-survey and were instructed to prepare two different 8–12 min scientific presentations before coming into the lab. They choose topics with which they were familiar, particularly their own research. This also ensured that the presentation content was novel and different for each condition. After participants arrived at the lab, provided consent to the study, and watched a brief explanation of the study via a video clip, we equipped the participants with the ambulatory measurements (motion capture suit, EEG, breathing belt) and the HP Omnicept VR headset and calibrated the eye-tracking system within the headset. Then, participants entered the Large-Hotel-Meetingroom-With-Chairs-environment of the Virtual-Orator public speaking platform35. For each 8-12-minute presentation, we loaded the participants’ slides into the Virtual-Orator system; these slides were displayed on a virtual laptop in front of them and on the two projectors on their left and right sides. Once participants finished their presentation, they completed a post-presentation interview and survey.
Main measures
Subjective measures and anxiety instruments
We included three subjective measures. First, we adopted and modified the cognitive, exertional, and social subdimensions of a video game demand scale36, which asked participants to rate items on a 7-point Likert scale (1 - strongly disagree to 7 - strongly agree). The cognitive demand subscale included four statements about how much thinking was required by the public speaking task (e.g., “Presenting in VR was cognitively demanding”). The exertional subdimension comprised the following two items related to physical energy required by the tasks (e.g., “I felt physically exhausted after delivering the presentation”). The social demand subscale included three items about how much the public speaking task made the participants react to the audiences (e.g., “Being with the audience in the room had an impact on how I gave the presentation”).
To measure individuals’ subjective affective responses, we adopted the self-assessment manikin (SAM37. Three SAM categories asked participants to select the picture of the manikin that best represented their feelings of valence (1 - most positive to 5 - most negative), arousal (1 - most aroused to 5 - least aroused), and dominance (1 - least dominant to 5 - most dominant) during the presentation. These three items represented the Subjective Valence, Subjective Arousal, and Subjective Dominance measures.
For the anxiety-related symptoms, we asked participants to report to what extent they experienced language-behavioral, thought-related, and physiological symptoms while presenting, on a scale from 1 (not at all) to 7 (a lot). The language-behavioral subdimension included nine symptoms such as “I misspoke or stuttered” and “I made displacement activities (scratching head, touching face, tapping, etc.).” Thought-related symptoms comprised 10 items including “Unwanted thoughts came to my mind during the presentation” and “I made negative evaluations of myself.” Perception of physiological alterations was measured by one item, “I experienced physiological symptoms or thoughts that interfered with my performance.”
In addition to the subjective measures, we included two additional scales about state anxiety and speaker confidence levels. To measure individuals’ state anxiety levels before and after the VR-based intervention, we used 20 items from the State Trait Anxiety Inventory38. These statements asked participants to indicate how much they feel specific emotions from 1 (Not at all) to 4 (Very much so). For baseline speaker confidence level, we adopted the 12-item Personal Report of Confidence as a Speaker (PRCS) scale39, which asked participants to rate each statement from a scale of 1 (not at all characteristic of me) to 5 (extremely characteristic of me).
Physiological and behavioral measures
Over-time measures of participants’ neurophysiological and visceromotor responses included heart rate and pupil dilation (measured via the HP Omnicept Recorder), EEG (measured via a GTec Unicorn mobile EEG system), and breathing rate (measured via a Vernier respiration belt). Measures of participants’ behavior as they deliver the speech and respond to the audience included motion (measured via the motion-capture system), gaze behavior (generated by the Virtual-Orator platform), and their speech patterns (analyzed from presentation recordings).
Data processing and analysis
The processed data and scripts are available online (github.com/nomcomm/speaker_responses_ audience). An overview of the main derived metrics for biological and behavioral data is shown in Fig. 2. In brief, all data streams were related to the common onset point - the moment participants entered the conference room. They then immediately started their speech, and the moment was marked in all separate data streams. We then extracted relevant metrics and averaged them over the first 3 min of each speech (i.e., one average heart rate for the presentation in front of a supportive audience and one average HR for the presentation in front of an unsupportive audience). The only exceptions to the 3-minute aggregate were gaze dispersion and screen-to-audience gaze ratio measures. These metrics were computed for the entire duration by the Virtual-Orator system. We decided on the 3-minute cut-off after examining the metrics over-time in 30 s increments. We noticed that by the 180 s mark, the metrics were generally consistent, excluding certain moments of peaks and valleys. We will analyze these patterns in more detail in a follow-up paper.
Heart rate and pupil dilation metrics were directly computed by the HP Omnicept recorder software. EEG data were processed in MATLAB, using the unicorn python package to read in the data, and then using EEGLAB’s filtering, automated artifact correction, and power spectrum algorithms to compute global alpha and beta power spectrum values for each speech40. Then, for each data point, the alpha-to-beta ratio was calculated (alpha/beta). We note that while the alpha/beta ratio provides only a coarse and indirect measure of relaxed vs. active cortical states and should be interpreted with caution41,42, the design of our study makes it feasible to track changes within participants (i.e., supportive vs. unsupportive audience).
Breathing rate was determined via the Vernier Graphical breathing belt analysis kit. Eye gaze was quantified via Virtual Orator’s integrated audience-zone analysis tool, which measured time spent looking at each of six audience sections (front left, front middle, front right, back left etc.). We computed the standard deviation of those metrics as a measure of equality of gaze allocation to all zones, which contained the same number of audience members. Furthermore, we used the motion-capture system to track people’s nonverbal movements throughout the speech and used python code to calculate expressiveness and openness indices. Lastly, we analyzed valence, arousal, and dominance in people’s voices using the wav2vec model43.
For statistical analysis of the main measures, we fitted a linear mixed effects model for each metric, with audience type as the main effect variable. We note that our hypotheses and analysis plan were specified a priori, though not formally pre-registered. Intercepts varied by participant to consider the repeated-measure design, and we controlled for potential confounding effect of the speech order (i.e., which audience participants encountered first). In the spirit of transparency and reproducibility, anonymized data and analysis scripts are available online (see Data and Code Availability). We cleaned the data before fitting the models based on the following procedure: For each metric, if the data for one speech was missing, we excluded the participant all together. This ensured adequate within-subject comparisons.
Data availability
The anonymized dataset and data analysis scripts are available via GitHub at github.com/nomcomm/speaker_responses_audience.
References
Vangelisti, A. L., Daly, J. A. & Friedrich, G. W. Teaching Communication: Theory, Research, and Methods (Routledge, 2013).
Lucas, S. E. & Stob, P. The Art of Public Speaking (McGraw-Hill, 2020).
Rosenfeld, H. M. & Hancks, M. The nonverbal context of verbal listener responses. In The Relationship of Verbal and Nonverbal Communication (ed. Key, M. R.) 193–206 (Mouton, 1980).
Taylor, S. E. et al. Effects of a supportive or an unsupportive audience on biological and psychological responses to stress. J. Personal. Soc. Psychol. 98 (1), 47 (2010).
Kleinlogel, E. P., Renier, L., Mast, M. S. & Toma, C. I think that I made a good impression! Social Psychol. https://doi.org/10.1027/1864-9335/a000421 (2020).
Dickerson, S. S. Social-evaluative threat. In The Wiley Encyclopedia of Health Sychology (eds. Sweeny, K. & Robbins, M. L.) 0641–647 (2020).
Kroczek, L. O. & Mühlberger, A. Public speaking training in front of a supportive audience in virtual reality improves performance in real-life. Sci. Rep. 13 (1), 13968 (2023).
Girondini, M. et al. Decoupling the role of verbal and non-verbal audience behavior on public speaking anxiety in virtual reality using behavioral and psychological measures. Front. Virtual Real. 5, 1347102 (2024).
Rubin, M. et al. Avoidance of social threat: evidence from eye movements during a public speaking challenge using 360-video. Behav. Res. Ther. 134, 103706 (2020).
Blascovich, J. Challenge and threat appraisal. In Handbook of Approach and Avoidance Motivation (ed. Elliot, M.) 431–445 (Psychology, 2013).
Schmälzle, R. et al. Brain connectivity dynamics during social interaction reflect social network structure. Proc. Natl. Acad. Sci. 114 (20), 5153–5158 (2017).
Kirschbaum, C., Pirke, K. M. & Hellhammer, D. H. The ‘Trier social stress Test’—a tool for investigating Psychobiological stress responses in a laboratory setting. Neuropsychobiology. 28 (1–2), 76–81 (1993).
Bradley, M. M. & Lang, P. J. Measuring emotion: behavior, feeling, and physiology. In Cognitive Neuroscience of Emotion (eds. Lane, R. D. & Nadel, L.) 242–276 (Oxford University Press, 2000).
Sapolsky, R. M. Why Zebras Don’t Get Ulcers 3rd edn (Henry Holt, 2004).
McCroskey, J. C. An Introduction To Rhetorical Communication (Allyn & Bacon, 2001).
O’Brien, J., Fryer, S., Parker, J. & Moore, L. The effect of ego depletion on challenge and threat evaluations during a potentially stressful public speaking task. Anxiety Stress Coping. 34 (3), 266–278 (2021).
MacIntyre, P. D. & Thivierge, K. A. The effects of audience pleasantness, audience familiarity, and speaking contexts on public speaking anxiety and willingness to speak. Commun. Q. 43 (4), 456–466 (1995).
MacIntyre, P. D., Thivierge, K. A. & MacDonald, J. R. The effects of audience interest, responsiveness, and evaluation on public speaking anxiety and related variables. Commun. Res. Rep. 14 (2), 157–168 (1997).
Hsu, C. F. The relationships of trait anxiety, audience nonverbal feedback, and attributions to public speaking state anxiety. Commun. Res. Rep. 26 (3), 237–246 (2009).
Baldwin, S. F. & Clevenger, T. Jr Effect of speakers’ sex and size of audience on heart-rate changes during short impromptu speeches. Psychol. Rep. 46 (1), 123–130 (1980).
Hilmert, C. J., Christenfeld, N. & Kulik, J. A. Audience status moderates the effects of social support and self-efficacy on cardiovascular reactivity during public speaking. Ann. Behav. Med. 24 (2), 122–131 (2002).
McKinney, M. E., Gatchel, R. J. & Paulus, P. B. The effects of audience size on high and low speech-anxious subjects during an actual speaking task. Basic Appl. Soc. Psychol. 4 (1), 73–87 (1983).
Loomis, J. M., Blascovich, J. J. & Beall, A. C. Immersive virtual environment technology as a basic research tool in psychology. Behav. Res. Methods Instrum. Computers. 31 (4), 557–564 (1999).
Takac, M. et al. Public speaking anxiety decreases within repeated virtual reality training sessions. PloS One. 14 (5), e0216288 (2019).
Ye, T., Elliott, R., McFarquhar, M. & Mansell, W. The impact of audience dynamics on public speaking anxiety in virtual scenarios: an online survey. J. Affect. Disord. 363, 420–429 (2024).
Chollet, M. & Scherer, S. Perception of virtual audiences. IEEE Comput. Graph. Appl. 37 (4), 50–59 (2017).
Schmälzle, R., Lim, S., Cho, H. J., Wu, J. & Bente, G. Examining the exposure-reception-retention link in realistic communication environments via VR and eye-tracking: the VR billboard paradigm. PloS One 18(11), e0291924. (2023).
Artemi, T. F., Konstantinou, T., Naziri, S. & Panayiotou, G. Virtual Reality-assisted, single-=session exposure for public speaking anxiety: improved self-reports and heart rate but no significant change in heart rate variability. Virtual Worlds. 4 (2), 27 (2025).
Pertaub, D. P., Slater, M. & Barker, C. An experiment on public speaking anxiety in response to three different types of virtual audience. Presence 11 (1), 68–78 (2002).
Hey, B. Scientific talks—effective communication that assists with the conveyance of research results. In Mastering Scientific Presentations: Unlocking your Communication Skills 1–8 (Springer Fachmedien, 2024).
Reeves, B. & Nass, C. The Media Equation: How People Treat computers, television, and New Media Like Real People (Cambridge University Press, 1996).
Saufnay, S., Etienne, E. & Schyns, M. Improvement of public speaking skills using virtual reality: development of a training system. In 12th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW) 122–124 (IEEE, 2024).
Lim, S., Schmälzle, R. & Bente, G. Artificial social influence via human-embodied AI agent interaction in immersive virtual reality (VR): effects of similarity-matching during health conversations. Comput. Hum. Behavior: Artif. Humans 2025, 100172 (2025).
Pan, M., Kitson, A., Wan, H. & Prpa, M. Ellma-t: an embodied LLM-agent for supporting english language learning in social VR. In Proceedings of the 2025 ACM Designing Interactive Systems Conference 576–594 (2025).
Blom, K. J. Virtual Orator [Computer Software] (Publicspeaking.tech, 2018).
Bowman, N. D., Wasserman, J. & Banks, J. Development of the video game demand scale. In Video Games 208–233 (Routledge, 2018).
Bradley, M. M. & Lang, P. J. Measuring emotion: the self-assessment manikin and the semantic differential. J. Behav. Ther. Exp. Psychiatry. 25 (1), 49–59 (1994).
Spielberger, C. D., Gonzalez-Reigosa, F., Martinez-Urrutia, A., Natalicio, L. F. & Natalicio, D. S. The state-trait anxiety inventory. Revi. Interamericana Psicol./Interamerican J. Psychol. 5, 4 (1971).
Hook, J. N., Smith, C. A. & Valentiner, D. P. A short-form of the personal report of confidence as a speaker. Pers. Indiv. Differ. 44 (6), 1306–1313 (2008).
Delorme, A. & Makeig, S. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J. Neurosci. Methods. 134 (1), 9–21 (2004).
Niedermeyer, E. & da Silva, F. L. Electroencephalography: Basic Principles, Clinical Applications, and Related Fields (Lippincott Williams & Wilkins, 2005).
Poldrack, R. A. Can cognitive processes be inferred from neuroimaging data? Trends Cogn. Sci. 10 (2), 59–63 (2006).
Wagner, J. et al. Dawn of the transformer era in speech emotion recognition: closing the valence gap. IEEE Trans. Pattern Anal. Mach. Intell. 45 (9), 10745–10759 (2023).
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This research was supported by NSF grant 2302608.
Author information
Authors and Affiliations
Contributions
S.L. and R.S. wrote the main manuscript text. R.S. prepared the figures while S. L. prepared the data tables. All authors reviewed and provided feedback on the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Lim, S., Schmälzle, R. & Bente, G. Examining speakers’ subjective and bio-behavioral responses to audience-induced social-evaluative threat via immersive VR. Sci Rep 16, 7633 (2026). https://doi.org/10.1038/s41598-026-38915-8
Received:
Accepted:
Published:
Version of record:
DOI: https://doi.org/10.1038/s41598-026-38915-8





