Introduction

It is common for surgeons to experience stress while they are performing surgery1. High levels of physiological and/or psychological stress have the potential to interfere with the expert execution of technical and non-technical skills (e.g., communication, teamwork and rapid decision-making) in a stressful environment2,3. The flow-on effects of intra-operative stress may impair surgical performance and thereby worsen patient outcomes2,3. Furthermore, an acute stress phenotype may contribute to a burnout or chronic stress phenotype4. Therefore, the application of stress mitigation strategies within the operating room (OR) may be important to ensure the ongoing delivery of excellent surgical care through the execution of technical and non-technical skills1,2,3.

Music is regularly played in ORs throughout the world and is a common way in which surgeons can alter their operating environment5,6,7,8,9. Many surgical clinicians find music to be a generally favorable part of the theatre environment7,8,9,10,11,12, with music being seen as improving calmness5, reducing stress13,14, and improving mood8 and surgeon and overall team performance7. However, concerns remain over its distracting properties and its risk of disrupting communication7,11,13,15,16. In order to investigate the effect of environment on a surgeon’s experience of stress and task performance, a surgical simulation task was used, as this permits superior control of confounding variables and more extensive physiological monitoring than in the real world17. For example, Oomens et al. observed that music improved task performance and instrument handling efficiency, and reduced SURG-TLX parameters, during a simple surgical simulation task (i.e., laparoscopic peg transfer) in medical students18. However, it remains unknown whether beneficial effects would be observed for tasks with a higher level of complexity and in more experienced operators.

The objective of the present study was to conduct a randomized controlled interventional trial using a complex simulated surgical task (i.e., carotid patch-angioplasty) to test the primary hypothesis that music attenuates an operator’s psychological and physiological stress response to a simulated surgical task. We also included a direct comparison of the effect of music on inexperienced (i.e., undergraduate medical students) versus experienced (i.e., trained surgeons) operators to extend and permit comparisons with previous research identifying that preferred music improves performance and reduces mental workload in inexperienced operators during the learning phase of a new and challenging surgical task18,19,20. The inclusion of such groups enhances the ecological validity of these novel investigations, while potentially informing the design of future studies examining psychophysiological stress and its mitigation in simulated surgery.

Methods

Ethics

The Effect of background music on STress Responses Amongst Undergraduates and Surgeons performing Simulated Surgical tasks: A randomised cross-over interventional trial (The STRAUSS study) was a non-blinded controlled crossover trial with balanced allocation carried out at the Faculty of Health and Medical Sciences, University of Auckland between April and October 2023. The STRAUSS study was designed in accordance with the CONSORT guidelines and their extension to randomised crossover trials21,22. This interventional trial was registered with the World Health Organisation International Clinical Trials Registry Platform (Universal Trial Number: U1111 1288-7255, 20/02/2023) and the Australia and New Zealand Clinical Trials Registry (ANZCTR reference: ACTRN12623000306617, 20/3/2023), was approved by the New Zealand Health and Disability Ethics Committee (HDEC 20/CEN/57), and conformed to the standards in the Declaration of Helsinki (2013). All participants provided explicit written informed consent prior to commencement.

Participants

Eligible participants were undergraduate medical students (MS) with little surgical experience, and senior and experienced vascular trainee surgeons or vascular consultant surgeons (VS). MS from all year groups were included (pre-clinical, clinical, or final year students). Participants were excluded if they had a significant physical impairment that inhibited dextrous task completion, current pregnancy, a significant cardiac history, or a hearing impairment in which the auditory perception of music was not possible. Participants were asked to refrain from caffeine, cardioactive medications and strenuous exercise 12 h prior to the study. Each participant had anthropometric and demographic information gathered.

Experiment protocol and randomisation

The experiment protocol is shown in Fig. 1 as a flow diagram. Participants were in the seated position with a table in front of them. After at least 5 min at rest, baseline physiological (5 min) and psychological measures were recorded. Using a Vascular Micro-Surgery Head Simulator (More Than Simulators™, Barcelona, Spain), participants were asked to perform a patch-angioplasty anastomosis using a double armed 5–0 prolene suture and a bovine pericardial patch (Xenosure, Le Maitre) to a 1.5 cm long arterial defect in a 6 mm bio-synthetic tubegraft (Omniflow, Le Maitre) that served as a carotid artery (Fig. 2). MS participants each had a period of initial training in which they were shown a five-minute video of a patch-angioplasty anastomosis and then were trained to perform the task on the model by the researcher (A.N.). VS participants were allowed to perform the task once to familiarise themselves with the surgical model during the training phase. All participants were then instructed to perform the surgical task a total of four times, randomised (by coin toss) to music or a no-music control using blocks of size two. This ensured that each participant performed the task twice under each condition, once each in the first half and second half of the protocol. Participants were instructed to perform the task as neatly and quickly as possible, and a researcher observed their work while taking notes on a clipboard. A camera was set up in the corner of the room, focused on the participants’ hands. After the training period and after each task was completed, there was a rest period of approximately five minutes (Fig. 2). Participants started from a resting position, and the task was considered complete when they had trimmed the final knot. Blinding was not possible with this study design, so allocation concealment was utilised. Participants were informed of the allocation at the time of commencement of the task and were kept blinded to the randomisation method.

The intervention was participant-selected music, with a choice of playlist, artist, or genre radio from Spotify™. Music selection was subsequently classified by Spotify™ genre allocation, or by investigator judgement when the genre was not clear. The no-music control condition was a recorded sound clip of ambient theatre noise23. Following the prior methodology of Oomens et al.18, in order to minimise observer bias, both conditions were played through noise cancelling headphones at a medium volume acceptable to the participant.
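
The block randomisation described above can be illustrated with a short sketch. It assumes two blocks of size two, each containing both conditions once, with a simulated coin toss deciding which condition comes first in each block. The function name `block_randomise` and the condition labels are hypothetical and are not taken from the study materials.

```python
import random

# Illustrative condition labels (not taken from the study materials).
CONDITIONS = ("music", "control")


def block_randomise(n_blocks=2, rng=None):
    """Return a trial order built from blocks of size two.

    Each block contains both conditions exactly once; a simulated coin
    toss decides which condition is performed first within the block.
    """
    rng = rng or random.Random()
    order = []
    for _ in range(n_blocks):
        heads = rng.random() < 0.5  # simulated coin toss
        block = CONDITIONS if heads else CONDITIONS[::-1]
        order.extend(block)
    return order


# Example: one participant's order for the four trials.
trial_order = block_randomise(rng=random.Random(2023))
```

Because every block holds one trial of each condition, any order produced this way has each condition once in the first half and once in the second half of the protocol.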

Fig. 1

Experiment Protocol (Participant Flow). Psychological outcome measures were the Six-Item State Trait Anxiety Inventory (STAI-6) and a Surgical Taskload Index (SURG-TLX). Physiological outcome measures were heart rate (HR), blood pressure (BP), mean arterial pressure (MAP), respiratory frequency, partial pressure of end-tidal carbon dioxide (PETCO2) and middle cerebral artery mean blood velocity (MCA Vm). M, music; NM, non-music.

Fig. 2

Experimental set-up. Panel (A), the monitoring of blood pressure (BP; automated sphygmomanometer), middle cerebral artery flow velocity (MCA Vm; transcranial Doppler ultrasound), heart rate (HR; electrocardiography), respiratory frequency and partial pressure of end-tidal carbon dioxide (Rf and PETCO2; nasal cannula). Panel (B), the surgical model with (from top left) suture scissors, suture holder, fine Potts scissors, Castro needle holder, Gerald forceps, head and neck model with self-retainer within the wound, green drapes to cover the handles, and (from bottom left) 5–0 prolene suture packets, artery clip forceps, basic forceps. Panel (C), an example of the completed carotid patch-angioplasty procedure using a 5–0 prolene suture to join a bovine pericardial patch (Xenosure) to the model artery (Omniflow).

Experimental measures

The primary performance outcome measures were time to task completion, number of errors, self-rated performance on the Global Rating Scale24 and two anastomotic quality scales: the 10-point Microsurgical Anastomosis Rating Scale (MARS10)25 and the End-Product Rating Scale (EPRS)26.

Primary psychological outcome measures were the Six-Item State Trait Anxiety Inventory (STAI-6) and a Surgical Taskload Index (SURG-TLX) completed at baseline and immediately on task completion after each of the four trials. The STAI-6 has strong reported internal reliability, is highly correlated with the full STAI, and has been used in many healthcare settings27,28,29. Each item of the STAI-6 is scored between 1 and 4, giving a total score that ranges from 6 to 24, as previously described30. The SURG-TLX is a multidimensional, surgery-specific workload measure that was developed from the NASA Task-load Index and validated specifically for use in surgery31. Each of its 6 domains is scored on a Likert scale between 0 and 20, giving a total SURG-TLX score that ranges from 0 to 120, as originally described31. A non-weighted raw score was generated, with a higher score representing higher task-load.
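
The score ranges above follow directly from summing the item scores, as the minimal sketch below shows. The function names are hypothetical, and the sketch assumes that any reverse-keyed STAI-6 items have already been recoded so that a higher value means more anxiety; the source text does not describe the recoding step.

```python
def stai6_total(items):
    """Sum six STAI-6 item scores (each 1 to 4); the total ranges from 6 to 24.

    Assumption: positively worded items were reverse-scored beforehand.
    """
    if len(items) != 6 or any(not 1 <= x <= 4 for x in items):
        raise ValueError("STAI-6 needs six item scores between 1 and 4")
    return sum(items)


def surg_tlx_raw(domains):
    """Sum six SURG-TLX domain scores (each 0 to 20) into an unweighted raw total (0 to 120)."""
    if len(domains) != 6 or any(not 0 <= x <= 20 for x in domains):
        raise ValueError("SURG-TLX needs six domain scores between 0 and 20")
    return sum(domains)
```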

Participants were instrumented for continuous monitoring of heart rate (HR) by a lead II electrocardiogram (BioAmp, FE231, ADInstruments, Bella Vista, NSW, Australia) and intermittent monitoring of blood pressure by an automated digital sphygmomanometer fixed to the left upper arm (HEM-7121, Omron Healthcare, Kyoto, Japan) (Fig. 2). Mean arterial pressure (MAP) was calculated post-hoc. Respiratory frequency (Rf) and partial pressure of end-tidal carbon dioxide (PETCO2) were monitored continuously via a single-nostril nasal cannula using a gas analyzer (Respiratory Gas Analyzer, ML206, ADInstruments) calibrated using standard gases. Middle cerebral artery mean blood velocity (MCA Vm) was measured by transcranial Doppler ultrasonography (ST3, Spencer Technologies, Redmond, WA, USA). Blood velocity in the M1 segment of the middle cerebral artery was measured using a 2 MHz probe fixed over the temporal window with an adjustable headband. The M1 segment was identified using search techniques described elsewhere32,52.
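
The text does not state how MAP was derived from the cuff readings. As an illustration only, the sketch below uses the common resting approximation of diastolic pressure plus one third of pulse pressure; this formula is an assumption rather than the authors' stated method, and the function name is hypothetical.

```python
def mean_arterial_pressure(systolic_mmhg, diastolic_mmhg):
    """Approximate MAP (mmHg) as diastolic pressure plus one third of pulse pressure.

    Assumption: the standard resting approximation; the study does not
    report which formula was used.
    """
    if systolic_mmhg < diastolic_mmhg:
        raise ValueError("systolic pressure must not be below diastolic pressure")
    return diastolic_mmhg + (systolic_mmhg - diastolic_mmhg) / 3.0
```

For a hypothetical reading of 120/80 mmHg this gives a MAP of roughly 93 mmHg.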

Data analysis

Raw signals underwent analog-to-digital conversion at 1 kHz (Powerlab and LabChart v8; ADInstruments) and were stored for offline analysis. From the ECG, HR was calculated on a beat-to-beat basis. Breath-by-breath Rf and PETCO2 were obtained, and artefacts were removed. Raw physiological responses were averaged over a 1-min period at four timepoints: rest (3 min prior to task commencement), the start of the task (immediately after commencement), the isometric midpoint of the task, and the end of the task (immediately prior to completion). R-R intervals were extracted from ECG data over a 5-min period at the overall resting baseline and the isometric midpoint of the task and subsequently analysed in line with the European Society of Cardiology (ESC) and North American Society of Pacing and Electrophysiology standards of measurement of heart rate variability33. The ECG data were independently assessed and verified to exclude any artefacts, pauses and ectopic beats, and analysis was performed using HRV software (Kubios HRV Standard v3.5.0, Finland) with medium beat correction34. The following measures were reported34,35: in the time domain, RMSSD (square root of the mean of the squared differences between adjacent NN intervals); in the frequency domain, the LF:HF ratio (HF, high frequency [0.15–0.4 Hz]; LF, low frequency [0.04–0.15 Hz]); and the composite parasympathetic nervous system index (PNS index), sympathetic nervous system index (SNS index) and Baevsky’s stress index. Poincaré plot (non-linear) analysis is reported as the standard deviation of the cloud of points in the direction transverse to the line-of-identity (SD1, ‘short-term’), and the standard deviation of the cloud of points in the direction of the line-of-identity (SD2, ‘long-term’)36.
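
To make the RMSSD and Poincaré definitions above concrete, the following sketch computes them from a list of NN intervals in milliseconds. It is a plain implementation of the textbook definitions, not a reproduction of the Kubios software, which applies its own beat correction and preprocessing. The function names and the example interval values are hypothetical.

```python
import math
import statistics


def rmssd(nn_ms):
    """Square root of the mean squared difference between adjacent NN intervals."""
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    if not diffs:
        raise ValueError("need at least two NN intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def poincare_sd(nn_ms):
    """Return (SD1, SD2) for the Poincaré plot of NN[n+1] against NN[n].

    Each point is projected onto the axis transverse to the line of
    identity (SD1) and onto the line of identity itself (SD2).
    """
    pairs = list(zip(nn_ms, nn_ms[1:]))
    if len(pairs) < 2:
        raise ValueError("need at least three NN intervals")
    across = [(b - a) / math.sqrt(2) for a, b in pairs]
    along = [(b + a) / math.sqrt(2) for a, b in pairs]
    return statistics.stdev(across), statistics.stdev(along)


# Hypothetical NN intervals in milliseconds.
example_nn = [800, 810, 790, 805, 795]
example_rmssd = rmssd(example_nn)
example_sd1, example_sd2 = poincare_sd(example_nn)
```

Projecting onto the two rotated axes keeps the code close to the geometric wording of the definitions; SD1 computed this way equals the standard deviation of successive differences divided by the square root of two.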

Statistical analysis

Investigators were blinded to the allocation of each trial during data extraction. Based on our previous work showing that mean operative SURG-TLX amongst surgeons was 48 ± 22, it was estimated that a total of 20 participants would be sufficient to detect a 30% difference between music and control conditions37. Performance outcomes were summarised and compared between trial arms with a two-tailed paired t-test. Psychological and HRV outcomes were compared with a two-way repeated measures ANOVA with group (MS, VS) and condition (baseline, music, control) as the main effects. Physiological outcomes were compared using three-way repeated measures ANOVA with group (MS, VS), condition (music, control), and time (baseline, start, midpoint, end) as main effects. Normality assumptions were checked by visually assessing residual QQ plots, which showed points on or near the reference line without systematic deviations that required model adjustments or transformations. The Geisser-Greenhouse epsilon was used to correct for departures from sphericity, and Tukey’s test was used for post-hoc analysis of significant main effects and interactions. Statistical significance was accepted at p < 0.05. Pairwise deletion was used for missing data. Data are presented as mean ± standard deviation unless otherwise stated. Statistical analyses were performed using Prism 9 (GraphPad Software Inc., San Diego, California, USA)38 and RStudio (R Core Team 2019)39. Most data sets were analyzed using a linear mixed-effects model conducted using the R package lmerTest40 via RStudio.
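
The sample size statement above can be explored with a rough normal-approximation sketch for a paired comparison. Several inputs are assumptions that the text does not report: 80% power, a two-sided alpha of 0.05, and a standard deviation of the paired differences equal to the quoted between-subject SD of 22. The function name is hypothetical, and this is not necessarily the calculation the authors performed.

```python
import math
from statistics import NormalDist


def paired_sample_size(delta, sd_diff, alpha=0.05, power=0.80):
    """Normal-approximation sample size for detecting a mean paired difference.

    delta   : smallest difference to detect
    sd_diff : assumed standard deviation of the paired differences
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) * sd_diff / delta) ** 2)


baseline_mean = 48.0           # mean operative SURG-TLX quoted in the text
delta = 0.30 * baseline_mean   # a 30% difference, about 14.4 points
n_required = paired_sample_size(delta, sd_diff=22.0)  # sd_diff is an assumption
```

Under these assumed inputs the approximation returns 19 participants, which is of the same order as the 20 quoted; a t-distribution correction or a different assumed SD of the differences would change the figure.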

Results

Protocol adherence

The protocol was well tolerated. All MS participants were able to learn and perform the surgical tasks, and all participants completed the protocol. The music intervention and control conditions did not need to be turned off for any participant.

Participant characteristics

As shown in Table 1, the MS group (n = 15) were aged 23 ± 2.7 years (mean ± SD), with a height and weight of 170 ± 10 cm and 67 ± 21 kg, respectively. Ten (67%) were female. Five (33%) were students in the first three university-based years of the undergraduate degree, and the remainder were in their final three hospital-based years. The pop genre was the most popular choice of music intervention (n = 4), although choices also included classical, rhythm and blues, atmospheric instrumental, jazz and metal. The VS group (n = 12) were older (42 ± 10 years; p < 0.0001 vs. MS), taller (179 ± 8 cm; p = 0.01), heavier (81 ± 12 kg; p = 0.04) and more likely to be male (p < 0.01). One VS participant was female and eight were consultant surgeons. The most popular choices in the VS group were pop, easy listening and 80’s music, chosen equally often.

Table 1 Participant characteristics.

Surgical performance

Figure 3 illustrates the mean performance measures for the surgical task. Significant group effects were observed for time to completion (F(1,25) = 43.4, p < 0.0001), errors (F(1,25) = 28.9, p < 0.0001), and global rating scale (F(1,25) = 16.4, p = 0.0004). The VS cohort (mean difference of main effect, 95% CI) were faster to complete the task (− 10 min [− 13.2 to − 6.8]), made fewer errors (− 4 [− 7 to − 3]) and had higher self-rated scores (6 [2–9]). Music did not affect time to completion, number of errors or global rating scale (i.e., no condition effect was observed).

Fig. 3

Simulated surgical task performance measures in the music and control (non-music) conditions for the medical student (MS) and vascular surgeon (VS) groups. Symbols represent individual values for MS (blue, n = 15) and VS (red, n = 12) with horizontal bars representing means with standard deviations. P values show results of two-way repeated measures analysis of variance.

Psychological responses

SURG-TLX total score and STAI-6 total score for baseline, music, and control (condition main effect) for MS and VS (group main effect) are shown in Fig. 4 and Table 2. There was an interaction between condition and group for SURG-TLX total score (F(2,50) = 12.36, p < 0.0001). MS exhibited pronounced increases from baseline by Δ40 [26–55, p < 0.0001] with music, and by Δ44 [31–57, p < 0.0001] with control, that tended to be greater than those of VS, who exhibited increases from baseline by Δ17 [7–28, p < 0.01] with music and by Δ17 [8–27, p < 0.01] with control. SURG-TLX total score was not different between music and control in either group (MS, p = 0.392; VS, p = 1.000). A significant main effect of condition was noted for STAI-6 total scores (F(1.9,48.3) = 18.7, p < 0.0001), with an increase from baseline to control (Δ3 [1.7–4.3], p < 0.0001) and baseline to music (Δ3 [1.4–4.3], p = 0.0001), but no difference observed between music and control (− 0.15 [− 1.4–1.1], p = 0.955).

Fig. 4

Psychological measures at baseline and during the simulated surgical task under music and control (non-music) conditions in the medical student (MS) and vascular surgeon (VS) groups. SURG-TLX Total, Surgical Taskload Index; STAI-6, Total Six-Item State-Trait Anxiety Inventory Score. Symbols represent individual values for MS (blue, n = 15) and VS (red, n = 12) with horizontal bars representing means with standard deviations. P values show results of two-way repeated measures analysis of variance. Results of post-hoc analysis with Tukey HSD shown as ** = p < 0.01 and **** = p < 0.0001 versus Baseline.

Table 2 Psychological parameters at baseline and during the simulated surgical task under music and control (non-music) conditions in the medical student (MS) and vascular surgeon (VS) groups.

Physiological responses

Figure 5 shows the HR, MCA Vm, PETCO2, and respiratory rate (RR) responses to the surgical task at four timepoints (rest, start, midpoint and end; time main effect), in MS and VS (group main effect), for music and no music (condition main effect). HR increased from baseline at all timepoints (time main effect; F(3,75) = 14.2, p < 0.0001) and reached a peak at the end of the task (Δ5.1 bpm [3.0–7.1] vs. baseline, p < 0.0001). HR was lower in VS (group main effect; F(1,25) = 14.8, p = 0.0007) but was not affected by music (condition main effect; p = 0.204). Similarly, MCA Vm was increased from baseline at all timepoints during the task (time main effect; F(3,75) = 36.8, p < 0.0001) and was lower in VS (group main effect; F(1,25) = 9.2, p = 0.006), but was not affected by music (condition main effect; F(1,25) = 3.0, p = 0.094). PETCO2 increased during the task (time main effect; F(3,75) = 15.8, p < 0.0001), reaching a peak at the midpoint (+ 1.5 [0.95–2.1] mmHg, p < 0.0001), and overall was lower with music (condition main effect; F(1,25) = 5.0, p = 0.034). RR increased only at the end of the task (time main effect; F(3,75) = 2.9, p = 0.040; + 1.7 [0.3–3.0] bpm vs baseline, p < 0.01). MAP was higher in VS than MS (group main effect; F(1,25) = 6.0, p = 0.022), although not different between conditions (condition main effect; F(1,25) = 0.05, p = 0.822).

Fig. 5

Physiological responses to the simulated surgical task under music and control (non-music) conditions in the medical student (MS) and vascular surgeon (VS) groups. Panels show (A) heart rate (HR); (B) middle cerebral artery flow velocity (MCA Vm); (C) partial pressure of end-tidal carbon dioxide (PETCO2, mmHg); (D) respiratory rate (RR); and (E) mean arterial pressure (MAP, mmHg). Values are expressed as mean with standard deviations for MS (green, n = 15) and VS (purple, n = 12). P values show the results of three-way repeated measures analysis of variance. Results of post-hoc analysis with Tukey HSD shown as * = p < 0.05.

HRV responses

HRV parameters for the baseline, music, and control conditions, in the MS and VS groups, are shown in Table 3 and Fig. 6. For PNS Index, there was a main effect of condition (F(1.1,27.8) = 17.5, p = 0.0002), with PNS Index being lower than baseline in both control (− 0.45 [− 0.73 to − 0.18], p < 0.0009) and music (− 0.57 [− 0.91 to − 0.23], p < 0.0009), and lower in music than control (− 0.11 [− 0.22 to − 0.008], p = 0.032). There was also a main effect of group (F(1,25) = 5.9, p = 0.023), such that MS had a lower PNS index (− 0.8 [− 1.5 to − 0.12]) than VS. There was a main effect of condition (F(1.1,28.3) = 12.6, p = 0.001) for SNS Index, with values being higher than baseline during both control (0.54 [0.13–0.96], p < 0.009) and music (0.68 [0.22–1.14], p < 0.003), and higher during music than control (0.14 [0.004–0.27], p = 0.042). SNS index was higher in MS than VS (group main effect F(1,25) = 6.2, p = 0.019; Δ + 1.0 [0.17–1.8]).

Table 3 Heart rate variability parameters at baseline and during the simulated surgical task under music and control (non-music) conditions in the medical student (MS) and vascular surgeon (VS) groups.

Fig. 6

Heart rate variability parameters at baseline and during the simulated surgical task under music and control (non-music) conditions in the medical student (MS) and vascular surgeon (VS) groups. Panels show (A) parasympathetic nervous system index (PNS Index) and (B) sympathetic nervous system index (SNS Index). Symbols represent individual values and horizontal bars show mean with standard deviations for MS (blue, n = 15) and VS (red, n = 12). P values show the results of two-way repeated measures analysis of variance. Results of post-hoc analysis with Tukey HSD shown as * = p < 0.05 versus Baseline, # = p < 0.05 versus Control.

There was a main effect of condition for RMSSD (F(1.2, 30.2) = 5.6, p = 0.020), with a tendency for a decrease from baseline during music (− 5.98 [− 12.4–0.48], p = 0.073) and control (− 3.72 [− 8.9–1.4], p = 0.192), and lower values during music than control (− 2.3 [− 4.7–0.24], p = 0.081). Total power also exhibited a main effect of condition (F(1.3,32.3) = 3.9, p = 0.046), with lower values in music than control (− 322 [− 624 to − 20], p = 0.035). There was a main effect of condition for HF parameters (F(1.3,32.2) = 4.2, p = 0.038), though no significant results on post-hoc analysis. There were no significant main effects or interactions noted for LF power and LF:HF ratio, while there was an interaction between condition and group for Baevsky’s stress index (F(2,50) = 3.6, p = 0.035), although no significant post hoc differences were observed.

Discussion

This study examined the effect of background music on task performance, and the psychological and physiological responses to a complex simulated surgical task under stressful circumstances. The five major novel findings were: (1) music did not affect either the speed or accuracy of the simulated surgical stress task performance; (2) the task increased subjective ratings of anxiety and task load but these were unaffected by music; (3) HR, MCA Vm, PETCO2, RR, and SNS index were increased, while PNS index decreased during the task, indicative of a physiological stress response; (4) PNS index was reduced and SNS index was increased in the presence of music, suggestive of an arousal response; and finally (5) the more experienced group (VS) performed faster, with fewer errors and greater self-rating than the inexperienced (MS) group, and had overall lower HR and MCA Vm and higher MAP, but there were no psychological or physiological differences between VS and MS in their responses to music.

Task performance

A beneficial effect of music on simulated surgical task performance was not demonstrated in the present study in either inexperienced or experienced operators. This is out of keeping with El Boghdady’s systematic review, which concluded that classical music at a low to medium volume improves accuracy and speed, and that these benefits outweigh the negative effects of distraction, on the basis of n = 18 studies that were predominantly simulation studies and surveys41. The discrepancy may in part be due to the novel use in this field of research of a complex, non-laparoscopic surgical model of carotid patch-angioplasty, a task that required the performance of multiple surgical steps (i.e. knot tying, suturing, patch trimming), each of which may take a variable length of time. To investigate this variable, future research may consider utilising a study design that compares a complex multi-step task with a simple task. Music has also been shown to hasten the learning process of surgical tasks on the Da Vinci robot19. In the present study, there was no music or sound during the training phase of the protocol, and future research could investigate the effect of music on the learning trajectory of complex surgical tasks and speed of skill acquisition.

Psychological stress

Despite the simulated carotid patch-angioplasty task inducing a clear psychological task load and anxiety response from operators, this was not modified by music in either the VS or MS group. The unweighted version of the SURG-TLX was chosen as the added time of the longer weighted questionnaire would have significantly increased the length of time the protocol would take, given its repeated nature. While previous studies have shown high correlation between the weighted and unweighted scores42, there is a potential reduction in the generalisability or comparability of our findings to similar studies that have utilised the weighted version of the SURG-TLX. Elevated SURG-TLX scores may, in the real world setting, be associated with longer length on bypass in cardiac surgery, and with particular surgical team roles43,44.

Background music decreased workload scores in two previous studies of inexperienced operators performing simulated laparoscopic surgery17,18. The possible reasons for the conflicting findings of the present study include the nature of the control condition (silence vs. ambient OR noise) and the nature of the simulated task. SURG-TLX measured reproducible changes from baseline in the participants in this study. Other authors have preliminarily recommended the SURG-TLX as a standard measure to assess operators’ experience in studies of surgical innovation, as it has been found to be most relevant, comprehensive, and comprehensible45. STAI-6 scores increased during task performance (around 3 points in both groups), though the level of experience did not affect this increase. The STAI-6 is a frequently used, validated scale for stress measurement and has been demonstrated to correlate with LF:HF, though a direct comparison between STAI-6 and SURG-TLX has yet to be published46.

Physiological stress

The data presented here represent the most detailed physiological phenotype of a surgeon performing a simulated task to date10,47. In other studies performed during surgery, mean HR has been the most consistent biomarker, elevated by 6 to 22 bpm from baseline, compared to 5 bpm in this study47. In our study, there was no difference in HR between music and control conditions. Analysis of HRV (i.e., SDNN, RMSSD, LF, HF, HF:LF ratio) has been shown to be a good objective assessment of mental stress in the surgical setting48. Notably, we observed that music reduced the PNS index and increased the SNS index, suggestive of an arousal response, which is contrary to the widespread perception of OR music and to previous interventional studies7,8,10,18. Further evidence for an arousal response to music is the lowering of PETCO2, which would fit with an increase in minute ventilation under the music condition. Allen et al. suggested that self-selected music could reduce skin conductance, pulse and blood pressure and improve speed and accuracy of arithmetic tasks when compared to control music or the absence of music49. One study reported an increase in MAP between control and self-selected music50, though four other studies demonstrated no change in HR or MAP under music conditions compared to control10. Finally, relaxing music (slow-paced classical pieces), when compared to control, showed a greater return to baseline in high frequency HRV after a stressing event, suggesting a role for music in the recovery of surgeons after stressful operations51.

Transcranial Doppler ultrasound was first used by Aaslid et al.52 to measure cerebral blood velocity in dynamic cerebral autoregulation research, and its application to surgeons is novel. Cerebral perfusion (i.e., MCA Vm) was significantly increased during task performance, with maximal increase during the midpoint of the operation, although a difference between music and control conditions was not observed. The cerebrovascular changes during task performance reported in this study are not surprising, in that MCA Vm has been shown to increase bilaterally in mathematical cognitive effort53, and during simulated robot assisted surgery, the pre-frontal cortex (supplied by the MCA and anterior cerebral artery) differentially activates54,55. The functional hyperaemia we observed may be linked to neuronal activation during the surgical task (i.e., neurovascular coupling), but the coincident elevation in PETCO2 (a proxy for arterial CO2 concentration, a powerful cerebrovascular vasodilator) during task performance is also likely to be a contributory factor56. The nature of the task meant that non-invasive beat-to-beat blood pressure could not be measured using finger photoplethysmography, hence cerebral perfusion pressure estimates cannot be incorporated into the interpretation of the MCA Vm response57. It should be noted that transcranial Doppler assesses arterial blood velocity, rather than blood flow, and whether this is representative of cerebral blood flow rests on the assumption that vessel diameter remains constant.
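
The constant-diameter caveat can be made concrete with a small worked sketch. Volumetric flow through a cylindrical vessel is velocity multiplied by cross-sectional area, so flow scales with the square of the diameter. The function name and the velocity and diameter values below are hypothetical, and for simplicity the sketch treats the input as a cross-sectional mean velocity, which is not what transcranial Doppler reports directly.

```python
import math


def volumetric_flow_ml_min(mean_velocity_cm_s, diameter_mm):
    """Flow (mL/min) through a cylindrical vessel: velocity times cross-sectional area."""
    radius_cm = diameter_mm / 10.0 / 2.0
    area_cm2 = math.pi * radius_cm ** 2
    return mean_velocity_cm_s * area_cm2 * 60.0  # cm^3/s converted to mL/min


# Hypothetical values: identical velocity, vessel diameter 5% larger.
flow_reference = volumetric_flow_ml_min(60.0, 3.0)
flow_dilated = volumetric_flow_ml_min(60.0, 3.0 * 1.05)
relative_change = flow_dilated / flow_reference - 1.0  # about 0.1025
```

In this illustration a 5% increase in diameter raises flow by about 10% at an unchanged velocity, which is why velocity alone tracks flow only if the insonated vessel diameter is stable.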

The impact of experience

The VS cohort were older, taller and heavier, and contained more males, which may account for the baseline group main effects (i.e. lower HR, lower MCA Vm, higher MAP). Unsurprisingly, greater experience was associated with improved speed and quality of task performance58,59. A time x group interaction in HR is suggestive of attenuated cardiovascular reactivity to the task in the VS group, for whom the surgical task would have been more familiar. This is in line with previous work that has shown senior surgeons to display a better stress management capacity and to be less prone to cognitive distraction compared to junior surgeons60. Future work should therefore consider that if music were to attenuate a psychophysiological response to surgical tasks, this effect may be less detectable in an experienced population where the stress response to tasks is smaller.

The type of music

The question often arises as to whether the type of music is important. Self-selected music is a commonly researched music choice in the surgical setting, in part due to Allen et al.’s seminal paper, and was therefore used in this study10,49. Music considered to be unpleasant may generate poorer surgical performance, an effect which may be measurable through application of electro-encephalography61. Dissonance in music has been shown to induce unpleasant feelings and physiological responses, which may lend support to the application of consonant music62. Furthermore, faster music may reduce OR preparation time63 and increase the rate of task performance (while also increasing the number of mistakes)64, while greater sound intensity can increase perceived arousal and high rhythmic tension may more strongly affect HRV65. In the present study, ambient theatre noise (rather than silence) was chosen as the control condition for ecological validity.

Strengths and limitations

Several experimental factors should be considered when interpreting the results of the present investigation. The crossover design was counterbalanced to minimise possible learning effects, and all participants undertook a supervised training and practice period prior to being assessed to further mitigate against learning effects. To increase the experience of stress during the surgical task, several experimental manipulations were employed. The Trier Social Stress Test is an ecologically valid stressor induced by public speaking, and in the present study social evaluation was incorporated through the presence of an unresponsive yet engaged assessor and a camera setup (without filming)66. Perceived time pressure refers to the sense of time pressure experienced in the absence of genuine time limits; it has been shown to be an empirical stressor and was also incorporated into the study design67. Finally, participants were asked to complete the task to the highest quality possible to maintain a level of sustained mental effort investment during the protocol, with the aim of increasing mental workload, stress and fatigue during the task68. The results indicate that the task and the applied manipulations were successful in generating a stress response. It is rarely silent in a working operating room. Previous interventional studies are heterogeneous in the choice of control condition; in this study, ambient theatre noise was selected as the control condition, rather than silence, for ecological validity10,69. As such, it is possible that previous work showing that music improved simulated laparoscopy compared to a no-music (silence) control may be due to the presence of auditory stimulation (or the absence of silence), rather than the music itself18. HRV analyses were used to provide an indirect estimate of cardiac autonomic behaviour. However, there remains controversy surrounding the physiological interpretation of these metrics70. Furthermore, we acknowledge that the composite HRV scores reported (i.e., SNS and PNS indices) are currently unconventional33,34, but we believe that they are helpful in summarising related metrics, and we have also included the component metrics in Table 3.

Conclusion

We observed that the performance of a simulated carotid patch-angioplasty under stressing conditions successfully increased markers of psychological and physiological stress in both inexperienced and experienced operators. Despite previous research identifying generally positive surgeon perceptions, background music neither improved surgical task performance nor attenuated subjective ratings of task load and anxiety, and in fact evoked additional physiological arousal, evidenced by a decreased PNS index and an increased SNS index. As expected, the more experienced group performed faster and more accurately than the inexperienced group, but there were no psychological or physiological differences in their responses to music.