Introduction

Gaze stability is vital for engaging in daily activities, during which our heads and bodies move relative to the visual world. This function is supported by the vestibular system in the inner ear, where head motion is detected and encoded by the vestibular organs and transmitted to neurons in the vestibular nuclei, which are part of the vestibulo-ocular reflex (VOR) pathway. The VOR generates compensatory eye movements that are equal in magnitude and opposite in direction to head rotation, stabilizing gaze relative to space1,2. Patients with a compromised vestibular system have impaired VOR function and thus experience blurry vision, poor balance and postural control, and an increased risk of falls3,4,5.

Under current clinical guidelines, patients with impaired VOR are often prescribed a series of gaze-stabilization exercises, whose efficacy has been demonstrated by moderate to strong evidence in previous randomized and controlled studies6,7,8,9,10. Recently, validation of gaze-stabilization exercises has expanded to patients with motion sickness11, multiple sclerosis12, and fear of falling13. Participants are instructed to fixate on a target while moving their head about one of two rotational axes: yaw (rotation around the Earth’s vertical axis) or pitch (rotation around the interaural axis). This is an advanced adaptation of the original Cawthorne-Cooksey exercises, which did not explicitly require fixation on stable or moving targets during head rotations.

Gaze-stabilization exercises are broadly categorized into two classes: continuous (e.g., smooth side-to-side movement) and transient (e.g., head motion is paused after an eccentric rotation). Each type offers an opportunity to study the deficits caused by vestibular loss and patients’ resultant compensatory strategies for maintaining gaze stability during head movements. Recent studies have shown that individuals with unilateral peripheral vestibular loss demonstrate greater changes in head-motion kinematics during continuous versus transient gaze-stability exercises14,15. However, the analysis of head kinematics does not determine whether vestibular-loss patients differ in the trial-to-trial regularity of the head movements they generate. Indeed, no prior study has quantified the dynamic patterning of head movements during gaze-stabilization exercises or assessed whether this patterning is altered by vestibular loss.

Accordingly, here we investigated whether unilateral vestibular loss alters the patterning of head movements made by subjects during gaze-stabilization exercises. To do so, we first examined how individuals integrate real-time vestibular feedback to adjust future movements and then used sample entropy to assess the temporal regularity of head motion. We found that vestibular-loss patients produce head motions that are more stereotyped and less flexible compared to controls. Our results provide the first experimental evidence that the patterning of head movements is a valuable metric for identifying individuals with impaired vestibular function, offering novel insights that could serve to inform patient rehabilitation.

Methods

Subjects

Eighteen patients with unilateral vestibular schwannoma were recruited; nine of those patients completed the study protocol before and six weeks after surgical removal of the tumor. Nine patients were lost to follow-up, either because surgical complexity prevented them from completing the study or because they resided out of state. Ten healthy controls with no history of otologic or neurologic pathology also completed the study protocol. The data of 8 patients (8 males, mean age = 54 ± 15.7, range 23–72 years) and 10 healthy controls (9 males, 1 female, mean age = 52 ± 16.8 years old, range 24–76 years) were analyzed. One patient’s data were not used due to data corruption. Video Head Impulse Test (vHIT; ICS Otometrics, Natus Medical Incorporated, Denmark) was used to measure the gain of the vestibulo-ocular reflex (VOR) during passive head rotations16,17. VOR gains in healthy controls were 0.99 ± 0.09 and 0.93 ± 0.04 for rightward and leftward head rotations, respectively. VOR gains before versus after surgery were 0.79 ± 0.28 versus 0.34 ± 0.19 for ipsilesional head rotations and 0.94 ± 0.11 versus 0.79 ± 0.16 for contralesional head rotations. Independent samples t-tests revealed a significant difference between the ipsilesional VOR gains after surgery and those of healthy controls (p < 0.001), as well as significant differences before versus after surgery for both ipsilesional (p = 0.006) and contralesional (p = 0.047) VOR gains.

This study was approved by the Johns Hopkins University Institutional Review Board and performed according to the institution’s guidelines for safe and ethical research in human subjects. Written informed consent was obtained from each participant prior to data collection.

Kinematic measures of gaze-stability exercises

Angular velocities were collected via inertial measurement units (IMUs) that were comfortably attached to participants with elastic bands while they performed 12 gaze-stabilization exercises, six of which were continuous and six transient. Each exercise lasted approximately 30 s.

In the continuous gaze-stabilization exercises, subjects were instructed to continuously move the head from left to right at a comfortable range of motion, which was self-determined for both yaw and pitch head rotation, while keeping their eyes focused on a fixed “X” shaped target. “VOR × 1” is a term used by clinicians to describe this gaze-stabilization exercise where visual fixation on a target is prescribed, requiring an equal (1:1) ratio of eye to head velocity, regardless of what the actual gain of the VOR may be9. There were three continuous exercise conditions, which differed in the distance between target and subject, and location of the target, specifically: 1) 1 m on the wall (Table 1, exercise 1–2, Fig. 1a, left panel), 2) approximately 1 m handheld by the subjects (Table 1, exercise 3–4, Fig. 1a, left panel), and 3) 2 m on the wall (Table 1, exercise 5–6, Fig. 1a, middle panel). All three conditions for continuous exercises required head movements in either yaw or pitch directions, in separate sessions, resulting in six continuous gaze-stabilization exercises in total (Table 1, exercises 1–6).

Table 1 A list of gaze-stabilization exercises used in the current study.
Fig. 1

Schematics of gaze-stabilization exercises in the yaw direction. Yaw and pitch are the directions of interest and only head movements in the yaw direction are shown for a clear demonstration. (a) Three continuous gaze-stabilization exercises, in which subjects were asked to continuously move the head from left to right at a comfortable range of motion, which was self-determined. Subjects were asked to fixate on an “X” shaped object either hand-held at ~ 1 m away from their eyes (‘VOR × 1 near’ exercise, left panel) or taped on a vertical wall 1 m (‘VOR × 1 near’, left panel) or 2 m away at eye level (‘VOR × 1 far’ exercises, middle panel). An example head motion consisting of two continuous head-rotation repetitions is shown in the right panel. (b) Three transient gaze-stabilization exercises, in which subjects were instructed to make self-initiated and self-determined impulsive head movement from the center to an eccentric location, followed by a slower resetting movement back to the center. Subjects were asked to either imagine or fixate on an ‘X’ shaped object placed 1 m from them on the wall at eye level (‘Imaginary target’ exercise and ‘Visual target’ exercise, left panel), or to alternate gaze between two targets ‘X’ fixed on the wall 60 cm apart (‘Gaze shift’ exercise, middle panel). An example head motion consisting of two transient head-rotation repetitions is shown in the right panel.

In the transient gaze-stabilization exercises, subjects were instructed to make self-generated impulsive head movements with self-determined amplitude from the center to an eccentric location, at which point they made a complete stop. These transient movements were interspersed with slower resetting movements back to the center; only the impulsive outward movements were analyzed. Notably, in contrast to the continuous gaze-stabilization exercises, subjects were required to pause at the eccentric location when performing the transient exercises. There were three transient exercise conditions (Table 1, exercises 7–12). In the first two exercise conditions, subjects were asked to either fixate on the visual target 1 m from them on the wall at eye level (Table 1, exercises 7–8, Fig. 1b, left panel) or fixate on an imagined object placed at the same distance (Table 1, exercises 9–10, Fig. 1b, left panel). The imaginary-target head-impulse exercises were the same as those with a visible target, except that the subjects imagined the target and kept their eyes closed during the movements. In a third exercise condition, subjects were asked to alternate their gaze between two identical targets that were placed 60 cm apart on a fixed vertical wall 1 m away (Table 1, exercises 11–12, Fig. 1b, middle panel). Subjects began by standing in front of the midpoint between the two targets with the head facing one of the targets. They were then instructed to first shift gaze to the other target without moving the head. Then, they were asked to turn their head toward that target while maintaining fixation, thereby realigning the head and gaze directions. Additionally, subjects performed each of the three exercise conditions described above by making head movements in either yaw or pitch directions, in separate sessions. Thus, overall, the experimental design resulted in six transient gaze-stabilization exercises in total (Table 1, exercises 7–12).

During each exercise, a small (51 mm × 34 mm × 14 mm) MEMS sensor (Shimmer3 IMU, Shimmer Research, Dublin, Ireland) was attached to the back of the subject’s head using an elastic headband. The sensor transduced head motion in six degrees of freedom; angular velocities in pitch and yaw are the focus of this study. The data were sampled at 1000 Hz and recorded on a built-in microSD card. We analyzed the kinematic measurements based on the head angular velocities that were aligned with the instructed head-motion directions during the gaze-stabilization exercises (e.g., yaw in horizontal gaze-stabilization exercises and pitch in vertical gaze-stabilization exercises). Each exercise was about 30 s long, yielding about 30,000 data points for each subject. Given the periodicity of the head movement accelerations, the data were parameterized by each movement repetition (i.e., cycle). In the continuous gaze-stabilization exercises, each head-movement cycle was defined to be a head movement from self-determined left end to right end, and back to the left starting point. In the transient gaze-stabilization exercises, each head-movement cycle was defined to be a fast head rotation initiating from the midline and terminating at a self-determined eccentric location.

Cycle frequency and peak velocity

Two variables of interest, cycle frequency and peak velocity, were parameterized from temporally ordered movement cycles. Peak velocity is the highest angular velocity within each head rotation cycle; cycle frequency is the number of head movement cycles per unit of time (1 s). Head movement was measured within a fixed duration (~ 30 s). Since cycle frequency differed across subjects, the number of head movement cycles each subject completed varied; the mean range of cycles for preoperative, postoperative, and control subjects in each task is in Table S1.
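The parameterization above can be illustrated with a minimal Python sketch (the study itself used MATLAB). It rests on assumptions that are not stated in the text: a cycle is taken to start at each negative-to-positive zero crossing of the angular velocity, the per-cycle frequency is taken as the inverse of the cycle duration, and peak velocity is taken as the largest absolute velocity in the cycle. The function names are hypothetical.

```python
import math

def segment_cycles(velocity):
    """Return (start, end) sample indices of cycles, end exclusive.

    Assumption: a cycle begins at each negative-to-positive zero crossing,
    so one cycle spans a full back-and-forth head rotation.
    """
    starts = [i for i in range(1, len(velocity))
              if velocity[i - 1] < 0 <= velocity[i]]
    return list(zip(starts[:-1], starts[1:]))

def cycle_metrics(velocity, fs):
    """Return one (cycle_frequency_hz, peak_velocity) tuple per cycle."""
    metrics = []
    for start, end in segment_cycles(velocity):
        duration_s = (end - start) / fs
        peak = max(abs(v) for v in velocity[start:end])
        metrics.append((1.0 / duration_s, peak))
    return metrics

# Usage on a synthetic trace: 1 Hz sinusoid, 100 deg/s amplitude, 1000 Hz, 5 s.
fs = 1000
trace = [100.0 * math.sin(2 * math.pi * (n + 0.5) / fs) for n in range(5 * fs)]
metrics = cycle_metrics(trace, fs)
```

Real head-velocity traces are noisier than this synthetic example, so a practical segmentation would likely need smoothing or a minimum cycle duration to avoid spurious zero crossings.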

Inter-trial correlation analysis

Inter-trial variability analysis was performed by computing the autocorrelation function of cycle frequency and peak velocity between the current trial and subsequent trials (up to the twentieth trial) for all six continuous gaze-stabilization exercises. In all exercise conditions, the correlations were computed for each subject in three subject groups (preoperative, postoperative, control). We also computed the group-averaged correlation function to compare the patterns between groups, along with the standard error of the mean (SEM).
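A minimal Python sketch of this step is given below, assuming the standard sample autocorrelation estimator applied to a per-cycle series (peak velocity or cycle frequency) and a lag-by-lag group average with SEM. The function names and the toy input series are illustrative only and are not taken from the study's code.

```python
import math
import statistics

def autocorrelation(values, max_lag=20):
    """Sample autocorrelation of a per-cycle series for lags 0..max_lag."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    max_lag = min(max_lag, n - 1)
    return [sum(centered[t] * centered[t + lag] for t in range(n - lag)) / denom
            for lag in range(max_lag + 1)]

def group_mean_and_sem(acfs):
    """Average several subjects' autocorrelation functions lag by lag."""
    n_lags = min(len(a) for a in acfs)
    means, sems = [], []
    for lag in range(n_lags):
        column = [a[lag] for a in acfs]
        means.append(statistics.mean(column))
        sems.append(statistics.stdev(column) / math.sqrt(len(column)))
    return means, sems

# Usage with two toy "subjects" (values are made up for illustration).
acf_a = autocorrelation([1.0, -1.0] * 20, max_lag=20)
acf_b = autocorrelation([float(i % 4) for i in range(40)], max_lag=20)
means, sems = group_mean_and_sem([acf_a, acf_b])
```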

Entropy

In addition to trial-based variability analysis, we introduced sample entropy (SampEn) as a nonlinear temporal variability measurement to assess the regularity or order in head-movement data. Unlike the event-based analysis that breaks head kinematics into individual movement cycles, SampEn was applied to the entire session (down-sampled to 40 Hz) and consequently preserves the temporal variations within time-series head kinematics.
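The text does not state how the down-sampling from 1000 Hz to 40 Hz was performed. The Python sketch below assumes simple block averaging over non-overlapping windows of 25 samples, which is only one of several reasonable choices; the function name is hypothetical.

```python
def downsample(signal, fs_in=1000, fs_out=40):
    """Down-sample by averaging non-overlapping blocks (assumed method)."""
    if fs_in % fs_out != 0:
        raise ValueError("fs_in must be an integer multiple of fs_out")
    factor = fs_in // fs_out  # 25 samples per output point for 1000 -> 40 Hz
    n_blocks = len(signal) // factor
    return [sum(signal[b * factor:(b + 1) * factor]) / factor
            for b in range(n_blocks)]

# Usage: a 30 s session at 1000 Hz becomes 1200 points at 40 Hz.
session = [float(n % 100) for n in range(30000)]
reduced = downsample(session)
```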

Entropy measures the order or regularity of time-series data and was adapted for more feasible analysis of experimental data by Pincus (1991)18. The idea is that time-series data with repetitive elements arise from a more ordered system, and thus should be characterized by a lower value of entropy. Traditional variability measurements—such as coefficient of variation (CV) and standard deviation (SD), whose outcomes are affected by how much on average each individual data point deviates from the mean—mask potential time-sensitive properties in time-series data19; temporal regularities of the data are then lost.

Sample entropy

Pincus introduced “approximate entropy” (ApEn) to quantify entropy in a manner that is computationally practical with real data18. It is approximately the negative natural logarithm of the conditional probability (CP) that a short epoch of data is repeated in the time series. However, this algorithm was found to generate potential bias in some cases when there are no similar data templates or when there are few matches20.

To address the bias problem, Richman et al. developed a similar measure of time series regularity—sample entropy (SampEn)—as a less-biased alternative21. SampEn depends on three variables: \(m\), \(r\), and \(N\). The window length \(m\) is the length of the data template; the rest of the time series is examined for near-repetitions of this template. The value \(r\) sets the tolerance for two data segments (one being the template) to be considered a match (a “repetition”); this tolerance is (\(r * std\)). \(N\) represents the data length.

If we acquire \(N\) data points in total, \(u(j): 1\le j\le N\), we form \(N-m+1\) vectors of length \(m\): \(x_{m}(i)=\{u(i+k): 0\le k\le m-1\}\) for \(1\le i\le N-m+1\). The vector \(x_{m}(i)\) is called the template. The distance between two such vectors, denoted by \(d[x_{m}(i), x_{m}(k)]\), is calculated as the maximum absolute difference between their corresponding scalar components: \(d[x_{m}(i), x_{m}(k)] = \max\{|u(i+j)-u(k+j)|: 0\le j\le m-1\}\). If the distance is within the tolerance (\(r * std\)), we count this specific pair as a “match”. Let \(B\) denote the number of matches for \(m\) points and \(A\) the number of matches for \(m+1\) points; we then compute the conditional probability \(\text{CP} = \frac{A}{B}\). Finally, SampEn is the negative natural logarithm of CP:

$$SampEn(m,r,N) =-\text{ln}\left(\frac{A}{B}\right)$$
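The definition above can be sketched in a few lines of Python (the study itself used MATLAB). This is an illustrative implementation under stated assumptions: the tolerance uses the sample standard deviation, self-matches are excluded, and both A and B are counted over the first N − m templates, following the common Richman and Moorman convention so that the two counts share the same starting points. The function name and the default r are hypothetical.

```python
import math
import random
import statistics

def sample_entropy(series, m=2, r=0.2):
    """SampEn(m, r, N) = -ln(A / B) with tolerance r * std(series)."""
    n = len(series)
    tol = r * statistics.stdev(series)

    def count_matches(length):
        # Count template pairs (i < k) whose Chebyshev distance is within tol.
        count = 0
        for i in range(n - m):
            for k in range(i + 1, n - m):
                if max(abs(series[i + j] - series[k + j])
                       for j in range(length)) <= tol:
                    count += 1
        return count

    b = count_matches(m)      # matches for m points
    a = count_matches(m + 1)  # matches for m + 1 points
    if a == 0 or b == 0:
        return float("inf")   # undefined when no matches are found
    return -math.log(a / b)

# Usage: a regular sinusoid should score lower than white noise.
rng = random.Random(0)
regular = [math.sin(2 * math.pi * n / 20) for n in range(200)]
noisy = [rng.gauss(0.0, 1.0) for _ in range(200)]
se_regular = sample_entropy(regular)
se_noisy = sample_entropy(noisy)
```

The double loop is quadratic in the series length, which is acceptable for sessions of roughly a thousand points but would need a faster formulation for much longer recordings.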

Optimization of sample entropy parameters

Some studies have shown that the variables m, r, and N greatly influence the magnitude of SampEn22,23. Therefore, the parameter choices are not universal and should be optimized for each length of dataset to reduce the possible error of SampEn. Because some of the template pairs formed from the vectors \({x}_{m}\) overlap, Richman and Moorman defined an estimate of the variance of the CP estimate as:

$${\sigma }_{CP}^{2}=\frac{CP\left(1-CP\right)}{B}+\frac{1}{{B}^{2}}\left[{K}_{A}-{K}_{B}{\left(CP\right)}^{2}\right]$$

where \({K}_{A}\) denotes the number of overlapping pairs of (m + 1)-point templates, and \({K}_{B}\) denotes the number of overlapping pairs of m-point templates24. The standard error (SE) of SampEn can be estimated by \(\frac{{\sigma }_{CP}}{CP}\), which is also the relative error of the \(CP\) estimate, and the relative error of SampEn is \(\frac{{\sigma }_{CP}}{-log(CP)CP}\). We therefore aimed to find the m and r that minimize the larger of these two relative errors:

$$\max \left( {\frac{{\sigma_{CP} }}{CP}, \frac{{\sigma_{CP} }}{{ - \log \left( {CP} \right)CP}}} \right)$$

By minimizing this quantity, we favor estimates with low variance. There are two subsequent steps for optimization of the computational parameters.

Optimizing the range of m

Since m is the window length used to find matches in the rest of the time series, one should choose m based on knowledge of the time scale of the underlying process. To capture this time scale, we fitted an autoregressive (AR) model to each time series by solving the Yule-Walker equations and chose m to be the optimal order p of the model AR(p)20,21. A window length m greater than p would produce a template larger than necessary to capture the dynamics, while a value smaller than p might not fully capture the dynamics.

Given that the partial autocorrelation (PACF) of an AR(p) process is zero at lag p + 1 and greater, one looks for the point on the plot where the partial autocorrelations for all higher lags are essentially zero. To systematically find the optimal order, we determine p as the lag after which the PACF coefficients fall to within the 95% confidence interval. Most PACF values diminished to near-zero (within confidence intervals) after a lag of 2 for continuous-gaze exercises (Figs. S1, S2) and 3 for transient-exercises (Figs S1, S3). Hence, we set 2 to be the lower limit of m.
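The order-selection rule can be sketched in Python as follows. This is an illustration, not the study's MATLAB code: it assumes the PACF is obtained from the Yule-Walker equations via the Durbin-Levinson recursion and that the 95% confidence bound is approximated by ±1.96/√N. The function names and the simulated AR(2) series are hypothetical.

```python
import math
import random

def autocorr(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    c = [v - mean for v in series]
    denom = sum(v * v for v in c)
    return [sum(c[t] * c[t + k] for t in range(n - k)) / denom
            for k in range(max_lag + 1)]

def pacf(series, max_lag):
    """Partial autocorrelations for lags 1..max_lag (Durbin-Levinson)."""
    rho = autocorr(series, max_lag)
    phi_prev, result = [], []
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result

def select_ar_order(series, max_lag=10):
    """Largest lag whose PACF lies outside the approximate 95% bound."""
    bound = 1.96 / math.sqrt(len(series))
    significant = [lag for lag, v in enumerate(pacf(series, max_lag), start=1)
                   if abs(v) > bound]
    return max(significant) if significant else 0

# Usage on a simulated AR(2) process: x_t = 1.5 x_{t-1} - 0.75 x_{t-2} + noise.
rng = random.Random(0)
x = [0.0, 0.0]
for _ in range(2000):
    x.append(1.5 * x[-1] - 0.75 * x[-2] + rng.gauss(0.0, 1.0))
pacf_values = pacf(x, 10)
order = select_ar_order(x, 10)
```

Because each lag has roughly a 5% chance of exceeding the bound by chance, an automatic rule of this kind can overestimate the order, which is one reason to inspect the PACF plots as well.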

Optimizing the m-r pair

To determine m and r, we calculated the maximum relative error for each subject in each group (8 preoperative patients, 8 postoperative patients, and 10 healthy controls) and for each gaze-stabilization exercise, which generated 312 heatmaps ((8 + 8 + 10) × 12 exercises = 312 trials). We chose the optimal value of r as the value that minimizes the maximum relative error (Fig. S4).
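The selection criterion can be illustrated with a short Python sketch. It assumes that CP and its standard deviation σ_CP have already been estimated for each candidate (m, r) pair; the candidate values below are made up for illustration and the function names are hypothetical.

```python
import math

def max_relative_error(cp, sigma_cp):
    """Larger of the relative errors of the CP estimate and of SampEn."""
    if not 0.0 < cp < 1.0:
        return float("inf")  # SampEn is zero or undefined at the extremes
    rel_cp = sigma_cp / cp
    rel_sampen = sigma_cp / (-math.log(cp) * cp)
    return max(rel_cp, rel_sampen)

def best_parameters(candidates):
    """candidates maps (m, r) to (cp, sigma_cp); return the best (m, r)."""
    return min(candidates, key=lambda mr: max_relative_error(*candidates[mr]))

# Usage with invented numbers, one entry per cell of a hypothetical heatmap.
candidates = {
    (2, 0.1): (0.5, 0.05),
    (2, 0.2): (0.6, 0.03),
    (3, 0.2): (0.9, 0.05),
}
best = best_parameters(candidates)
```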

Each group has a different optimum choice of \(r\): (a) control group, \(r = 0.2\), (b) preoperative patients, \(r = 0.08\), and (c) postoperative patients, \(r = 0.12\). Regarding the \(m\) choice, however, all three groups have \(m = 2\). In addition to the heatmap guidance, another consideration must be addressed regarding \(r\). Since \(r\) determines the tolerance range for finding template matches in the time series, using different tolerance levels for different subject groups may obscure the differences between them. Given that our goal is to see if SampEn could effectively distinguish different subject groups, we decided to use a consistent \(r\) across subject groups by taking the average of each group’s optimal \(r\) values. According to the heatmap, we ultimately chose \(m = 2\) and the average \(r = 0.133\) for the analysis of continuous head movements. Similarly, we chose \(m = 2\) and the average \(r = 0.04\) for analysis of transient head movements.
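As a worked check of the averaging step for the continuous exercises, using the three group optima listed above (control, preoperative, postoperative):

$$\bar{r}=\frac{0.2+0.08+0.12}{3}=\frac{0.4}{3}\approx 0.133$$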

Surrogate data testing

To provide an intuitive understanding of the temporal regularity assessed by SampEn, we first tested SampEn on surrogate data. Instead of using a purely computer-generated dataset that might not capture relevant features of the time-series data, we generated a template representing one cycle of head rotation for each subject group. Specifically, we randomly chose one of the twelve gaze-stabilization exercises, one example subject, and one example patient (same for both preoperative and postoperative data). By averaging their head angular-velocity trajectories across cycles, we produced a template head-movement cycle for each subject group. Since the surrogate-data testing was performed in order to obtain an intuitive understanding of SampEn, rather than to capture different subject groups’ movement characteristics, an average across all control subjects or patients was not performed.

Based on the template movement cycle, and given that the duration of each exercise was about 30 s, we repeated the template to create an entire simulated session of head-movement data for each subject group. These simulated sessions were the first set of surrogate data. Five additional surrogate data sets were generated by adding samples of white, blue, brown, pink, and purple noise to the first surrogate. Only Gaussian white and blue noise are shown in the supplementary figure (Fig. S5), as the other colored noises generated SampEn results similar to those of the blue noise. All noises were scalar and generated by the MATLAB built-in function (dsp.ColoredNoise) with a set random seed for reproducibility. In addition to noise, we also manipulated the amplitude and duration (or frequency) of the simulated head-movement data by horizontally and vertically expanding and squeezing the data, which produced another four sets of surrogate data. Only the horizontal and vertical expansions are shown in the supplementary figure (Fig. S5) to demonstrate their influences on SampEn results.
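The surrogate construction can be sketched in Python as follows. This is a simplified illustration rather than the MATLAB procedure used in the study: the template here is a synthetic sinusoid instead of an averaged head-movement cycle, only Gaussian white noise is generated (colored noise is not covered), and horizontal expansion is approximated by linear interpolation. All function names are hypothetical.

```python
import math
import random

def build_surrogate(template, n_repeats, noise_sd=0.0, seed=0):
    """Repeat one template cycle and optionally add seeded white noise."""
    rng = random.Random(seed)
    session = list(template) * n_repeats
    if noise_sd > 0:
        session = [v + rng.gauss(0.0, noise_sd) for v in session]
    return session

def stretch_in_time(session, factor):
    """Resample by linear interpolation so the session is `factor` x longer."""
    n_out = int(round(len(session) * factor))
    out = []
    for i in range(n_out):
        pos = i / factor
        lo = min(int(pos), len(session) - 1)
        hi = min(lo + 1, len(session) - 1)
        frac = pos - lo
        out.append(session[lo] * (1 - frac) + session[hi] * frac)
    return out

# Usage: a 1 s template at 40 Hz repeated 30 times gives a 30 s session.
template = [100.0 * math.sin(2 * math.pi * n / 40) for n in range(40)]
clean = build_surrogate(template, 30)
noisy = build_surrogate(template, 30, noise_sd=5.0, seed=1)
taller = [2.0 * v for v in clean]      # vertical expansion
longer = stretch_in_time(clean, 2.0)   # horizontal expansion
```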

Statistics

To analyze time-series data, we segmented movement data into individual cycles and parameterized (1) cycle frequency (number of head movement cycles per second) and (2) peak velocity. For the trial-based analysis, we performed an autocorrelation analysis on these two parameters. For the sample-entropy analysis, we down-sampled the data from 1000 to 40 Hz and applied sample-entropy calculations on the entire duration of head movement. To compare the sample entropy between subject groups, we ran the Student t-test between (a) preoperative patients versus postoperative patients, (b) preoperative patients versus healthy controls, and (c) postoperative patients versus healthy controls. To investigate and validate trends, we tested for a significance level of p < 0.05, whether the correlations were consistently positive or negative, and computed standard deviations and coefficients of variation. All data analysis was performed with MATLAB (The MathWorks, Inc., Natick, Massachusetts, United States).

Results

Subjects exhibited high temporal variability in their head kinematics during gaze-stabilization exercises

We studied two classes of gaze-stabilization exercises, continuous and transient, each of which was performed in three separate conditions (see Methods). In each condition, subjects were asked to perform movements about two rotational axes, pitch and yaw, resulting in 12 exercises in total (Table 1). Figures 2, 3 illustrate the head-movement kinematics of healthy controls and vestibular-schwannoma patients during the two phases of testing (pre-surgery and six weeks post-surgery). Specifically, the left panels display the head-velocity traces throughout the first half of a session for the example subjects. The middle panels illustrate the same head-velocity traces divided into individual head-rotation repetitions (i.e., cycles, see Methods), which are superimposed on their means and standard deviations (thick black, red, and blue traces and associated shaded regions). Figure 2 shows the example subjects during the continuous gaze-stabilization exercise, who were moving their heads in the yaw direction while viewing a handheld target approximately 1 m away (Fig. 1a). Figure 3 shows the same example subjects during the transient gaze-stabilization exercise, who were moving their heads in the yaw direction while fixating on an imaginary target 1 m away (Fig. 1b).

Fig. 2

Example continuous head rotations from (a) a healthy control, (b) preoperative, and (c) postoperative testing from the same patient in the gaze-stabilization exercise with head rotation in the horizontal plane (yaw direction). The target was hand-held approximately 1 m away (‘VOR × 1 near’ exercise). The left panel shows the individual head-velocity traces throughout the first half-session of each trial. The middle panel shows the head-velocity traces from individual head-rotation repetitions superimposed with the mean and standard deviation of the range of motion. The three insets on the right include histograms in the upper panel, which characterize the distribution of the cycle frequency and peak velocity parameterized from individual head-movement cycles of each example subject. The scatterplots in the lower panel visualize the cycle-frequency and peak-velocity changes from trial to trial. (d) The summary histograms superimpose the group average cycle-frequency and peak-velocity distributions. Asterisks indicate differences at three significance levels (* for p < 0.05, ** for p < 0.01, *** for p < 0.001, ns for insignificant difference).

Fig. 3

Example transient head rotations from (a) a healthy control, (b) preoperative, and (c) postoperative testing from the same patient in the gaze-stabilization exercise in the horizontal plane (yaw direction). The healthy control and the patient are the same as in the continuous head-movement examples shown in Fig. 2. Subjects were asked to move their heads in impulses while fixating on an imaginary target 1 m away (‘Imaginary target’ exercise), which required them to turn quickly from midline to one self-determined eccentric location (left and right sides of midline, interleaved), followed by a slow resetting head movement back to the midline, repeated. Compared with the continuous gaze-stabilization exercises, subjects needed to pause at eccentric locations and only the impulses were analyzed. The left panel shows the individual head-velocity traces throughout the first half-session of each trial. The middle panel shows head-velocity traces from individual head-rotation repetitions superimposed with the mean and standard deviation of the range of motion. The three insets on the right include histograms in the upper panel, which characterize the distribution of the cycle frequency and peak velocity parameterized from individual head-movement cycles of each example subject. The scatterplots in the lower panel visualize the cycle-frequency and peak-velocity changes from trial to trial. (d) The summary histograms superimpose the group average cycle-frequency and peak-velocity distributions. Asterisks indicate differences at three significance levels (* for p < 0.05, ** for p < 0.01, *** for p < 0.001, ns for insignificant difference).

During continuous gaze-stabilization exercises, both the individual head-velocity traces and superimposed cycles illustrated in Fig. 2 demonstrate that control and patient subjects exhibit high variability in cycle amplitudes and frequencies within a single session. To quantify this variability, we analyzed the individual head-movement cycles of each subject and computed the cycle frequency (number of head-movement repetitions per unit of time) and peak velocities. Values for an example control and example patient subject are shown in the histograms (Fig. 2a,b,c, upper panel of rounded insets). The summary histogram at the bottom superimposes the group average cycle-frequency and peak-velocity distributions (Fig. 2d). In both the example subjects and the group-average histograms, controls had a significantly higher cycle frequency than preoperative patients (\({p}_{example}\) = 1.05 × 10^−14 < 0.001, \({p}_{group}\) = 0.00295 < 0.01) and postoperative patients (\({p}_{example}\) = 1.73 × 10^−21 < 0.001, \({p}_{group}\) = 1.17 × 10^−29 < 0.001), and preoperative patients had a higher cycle frequency than postoperative patients (\({p}_{example}\) = 2.19 × 10^−18 < 0.001, \({p}_{group}\) = 1.04 × 10^−16 < 0.001). Preoperative patients had significantly higher peak velocities than both controls (\({p}_{example}\) = 3.16 × 10^−10 < 0.001, \({p}_{group}\) = 2.40 × 10^−80 < 0.001) and postoperative patients (\({p}_{example}\) = 5.55 × 10^−21 < 0.001, \({p}_{group}\) = 6.38 × 10^−57 < 0.001). While the example postoperative patient showed significantly higher peak velocities than the example control (\({p}_{example}\) = 0.00287 < 0.01), this difference was not significant for the group average (\({p}_{group}\) = 0.417 > 0.05). Nevertheless, in terms of standard deviation (std), controls and preoperative patients demonstrated a wider distribution than postoperative patients in both parameters.
To further demonstrate this cycle-based variability, we next plotted the example subjects’ cycle frequencies and peak-velocity distributions in trial-ordered scatterplots (Fig. 2a,b,c, lower panel of rounded insets). The scatterplots show that control subjects had higher and more variable cycle frequencies, and that all subjects exhibited highly variable peak velocities throughout the continuous exercises. While the standard deviation (std) roughly captures this variability difference (cycle frequency: \({std}_{control}\) (0.284) > \({std}_{pre}\) (0.0814) > \({std}_{pos}\) (0.0417); peak velocity: \({std}_{pre}\) (33.6) > \({std}_{pos}\) (30.6) > \({std}_{control}\) (27.3)), it does not explain the temporal elements embedded in this variability. Accordingly, this observation motivated us to explore additional methods to quantify variability. In summary, in the continuous exercises, control and patient groups exhibited significantly different and highly variable head kinematics, specifically in terms of head-movement cycle frequency and peak velocity.

We next performed the same analyses on head movements made during the transient gaze-stabilization exercises (Fig. 3a,b,c). In contrast to our findings for the continuous exercises, our analysis of head movements made during transient exercises did not reveal many intergroup differences in terms of cycle frequency and peak velocity (Fig. 3d). Preoperative patients showed significantly higher peak velocities than healthy controls (p = 0.0166 < 0.05), and postoperative patients had significantly higher cycle frequencies than preoperative patients (p = 0.0308 < 0.05). However, all groups exhibited a similar distribution in terms of standard deviation in cycle frequency: \({std}_{control}\) = 0.981, \({std}_{pre}\) = 1.10, \({std}_{pos}\) = 1.16. Although there were some intergroup differences in terms of std in peak velocity (\({std}_{control}\) = 167, \({std}_{pre}\) = 174, \({std}_{pos}\) = 141), no evident trial-to-trial temporal variability differences were found. Therefore, transient exercises do not reveal the highly variable head movement shown in the continuous exercises, although peak velocities differed significantly between the preoperative and control groups, and cycle frequencies differed significantly between the preoperative and postoperative groups.

Trial-to-trial variability showed the temporal structure in control subjects but not in vestibular schwannoma patients

To better understand the trial-to-trial variability found in continuous gaze-stabilization exercises, we next computed the autocorrelation function of cycle frequency and peak velocity, parametrized from chronologically ordered head-movement cycles. Comparison of the temporal structure of autocorrelation functions between groups (autocorrelation coefficient over past trials) revealed highly consistent periodic patterns in peak velocities from control subjects, but not from patients (Fig. 4). Specifically, we computed the group-averaged correlation coefficients from the autocorrelations between the current cycle and subsequent cycles (up to the twentieth trial) for all six continuous gaze-stabilization exercises. While the coefficients of all subject groups showed an overall declining trend, a consistent periodic pattern of the autocorrelation function was observed only in control subjects across all exercises, and the patterns in pitch, where subjects were asked to rotate their heads around the interaural axis (vertically), were more robust than those in yaw (Fig. 4a). Autocorrelation functions of individual subjects can be found in the supplementary material (Fig. S6). This periodic trend was exhibited by most control subjects and none of the patients. This pattern was not present when the order of the cycles was randomized, indicating that it is due to the temporal sequence of the trials. Large correlations indicate a high reliance on previous information in programming the current movement (or movement cycle), and the rate of decay of the correlations indicates how rapidly this inter-trial information is lost. The periodicity seen in the autocorrelation function (Fig. 4a) arises because this analysis used data that contain interleaved head movements in two opposite directions (for example, leftward and rightward in the yaw exercises), and movement programming is presumably slightly different in these two directions.

Fig. 4

Trial-to-trial variability of peak velocity parameterized from chronologically ordered continuous head-movement cycles. (a) Group-averaged correlation coefficients of peak velocity between the current cycle and subsequent cycles (up to the twentieth) from six continuous gaze-stabilization exercises. The six exercises comprised three conditions (as shown in three columns) and two head-movement directions (pitch in the upper panel and yaw in the lower panel). Error bars represent the standard error of the mean (SEM). (b) The group-averaged intertrial correlation difference (ICD) for each exercise (x-axis). Error bars represent the SEM. ICDs represent the absolute difference of coefficients between each trial and the subsequent trial in the autocorrelation function.

To quantify the fluctuations in autocorrelation values, we defined a parameter termed the intertrial correlation difference (ICD), which is the absolute difference of coefficients between each trial and its subsequent trial. Comparison of the group-averaged ICDs for each exercise among the three subject groups revealed that control subjects had significantly higher ICDs in pitch exercises than preoperative subjects (p = 7.59 × 10^−10, 3.14 × 10^−6, and 7.50 × 10^−5, all < 0.001, for ‘VOR × 1 far’, ‘VOR × 1 near (on the wall)’ and ‘VOR × 1 near (handheld)’, respectively) and postoperative patients (p = 1.87 × 10^−11, 1.18 × 10^−4, and 6.48 × 10^−6, all < 0.001, for ‘VOR × 1 far’, ‘VOR × 1 near (on the wall)’ and ‘VOR × 1 near (handheld)’, respectively). Furthermore, averaged ICDs in all yaw exercises were also higher in controls than in either preoperative or postoperative subjects, but the differences were not significant (Fig. 4b). Finally, within subject groups, controls had higher ICDs in pitch exercises than in yaw exercises (p = 1.77 × 10^−17 < 0.001), whereas patient groups showed no significant differences between pitch and yaw exercises (p > 0.05). The ICDs of individual subjects can be found in the supplementary material (Fig. S7). By comparing the ICDs between subject groups across exercises, we found that controls but not patients showed periodic patterns in trial-ordered peak velocity, mostly when controls moved their heads in the pitch direction. Notably, when we ran the same autocorrelation analysis on the cycle-frequency data, no evident patterns were found, and no intergroup differences were present (Fig. S8).
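As an illustration of the ICD definition, the following Python sketch takes the absolute difference between autocorrelation coefficients at successive lags. Averaging these differences into a single value per session is an assumption made here for compactness, and the two coefficient sequences are invented examples rather than measured data.

```python
import numpy as np

def intertrial_correlation_difference(acf):
    """Mean absolute difference between coefficients at successive lags.

    Assumption: the per-lag absolute differences are averaged into one
    number; the text defines only the per-lag difference itself.
    """
    acf = np.asarray(acf, dtype=float)
    return float(np.mean(np.abs(np.diff(acf))))

# Hypothetical autocorrelation functions with the same overall decay
oscillating = np.array([-0.8, 0.7, -0.6, 0.5, -0.4, 0.3])   # periodic pattern
smooth_decay = np.array([0.8, 0.7, 0.6, 0.5, 0.4, 0.3])     # no periodicity

icd_oscillating = intertrial_correlation_difference(oscillating)
icd_smooth = intertrial_correlation_difference(smooth_decay)
```

A sequence that flips sign from lag to lag yields a much larger ICD than a smoothly decaying one, which is why a higher ICD serves as a marker of the periodic pattern.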

Sample entropy as a measure of temporal variability on surrogate data

In addition to investigating variability based on discrete movement cycles, we also wanted to understand the temporal variations of head movement on a continuous time scale. We first computed the standard deviation (std), a traditional variability measure, on one complete session of head-movement cycles and compared that between subject groups. There were no significant differences between subject groups (Table S2). Then we used sample entropy (SampEn) as a measure of the regularity or order of time-series data, as it applies to one complete session (multiple contiguous cycles) of movement and preserves the temporal structure of the head motion.
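For readers unfamiliar with SampEn, a minimal Python sketch of the standard definition follows: the negative logarithm of the ratio between the number of template matches of length m + 1 and of length m, with matches judged by the maximum (Chebyshev) distance. It is not the authors' implementation. The default values of `m` and `r`, and the choice to express the tolerance as `r` times the standard deviation of the series, are assumptions made for illustration.

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """Sample entropy of a 1-D series (sketch; O(n^2) time).

    Assumptions: tolerance = r * std(x), Chebyshev distance, and
    self-matches excluded, following the common definition.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    tol = r * np.std(x)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths
        templates = np.array([x[i:i + length] for i in range(n - m)])
        count = 0
        for i in range(len(templates) - 1):
            # Distance from template i to every later template
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= tol))
        return count

    b = count_matches(m)        # matches of length m
    a = count_matches(m + 1)    # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")     # undefined when no matches are found
    return float(-np.log(a / b))

# Hypothetical signals: a regular sinusoid versus white noise
rng = np.random.default_rng(0)
sine = np.sin(2 * np.pi * np.arange(500) / 50)
noise = rng.normal(0.0, 1.0, 500)

se_sine = sample_entropy(sine)
se_noise = sample_entropy(noise)
se_scaled = sample_entropy(2.0 * sine)
```

The regular sinusoid gives a lower value than the noise. Because the tolerance in this sketch scales with the standard deviation, multiplying the signal by a constant leaves the result unchanged.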

We first applied the SampEn analysis to surrogate data sets to gain an intuitive understanding of the temporal regularity assessed by SampEn (see Methods). In the first surrogate data set, we generated a template head movement (representing one movement cycle) for each subject group; this template was repeated to create an entire simulated session of head movements of comparable duration to an actual session. Two additional surrogate data sets were generated by adding samples of white noise and colored noise to the initial (perfectly repetitive) surrogate. We also manipulated the movement duration and amplitude by expanding and squeezing the template head movement horizontally and vertically, respectively, for each subject group. Together these surrogate data allowed us to assess how cycle frequency, different types of noise, and amplitude can impact SampEn, and therefore help interpret SampEn results when applied to head-kinematics data.
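The surrogate construction can be sketched in Python as follows. This is a hedged illustration only: the sinusoidal template, the sampling rate, the noise level, and the helper names `make_surrogate` and `stretch_time` are hypothetical, and only the white-noise case is shown (colored noise would need a spectrally shaped generator).

```python
import numpy as np

def make_surrogate(template, n_cycles, noise_sd=0.0, rng=None):
    """Repeat one movement cycle to form a session, optionally adding white noise."""
    session = np.tile(template, n_cycles)
    if noise_sd > 0:
        rng = np.random.default_rng() if rng is None else rng
        session = session + rng.normal(0.0, noise_sd, session.size)
    return session

def stretch_time(template, factor):
    """Resample one cycle to `factor` times its duration (linear interpolation)."""
    n_out = int(round(len(template) * factor))
    old_t = np.linspace(0.0, 1.0, len(template))
    new_t = np.linspace(0.0, 1.0, n_out)
    return np.interp(new_t, old_t, template)

# Hypothetical template: one 1-s sinusoidal cycle of head velocity (deg/s)
fs = 100                                   # assumed sampling rate (Hz)
t = np.arange(fs) / fs
template = 80.0 * np.sin(2 * np.pi * t)

clean = make_surrogate(template, 30)                          # perfectly repetitive
noisy = make_surrogate(template, 30, noise_sd=5.0,
                       rng=np.random.default_rng(1))          # added white noise
slower = make_surrogate(stretch_time(template, 1.5), 20)      # lower cycle frequency
larger = make_surrogate(2.0 * template, 30)                   # doubled amplitude
```

Each manipulation changes one property of the session at a time, so any resulting change in a regularity measure can be attributed to that property.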

Overall, we found that cycle frequency and noise both impact the surrogate data such that SampEn increases disproportionately with added noise and changes non-monotonically with decreased cycle frequency. While only white noise and blue noise manipulations are shown, three other colored noises (brown, pink, and purple) were also tested and produced the same results. The original template movement of each subject group showed the pattern PRE < POS < Control. After adding the same amount of noise (white or colored) to the template movement, SampEn increased in all subject groups, but the pattern became POS < PRE < Control (Fig. S5a,b,c, bar plot). This disproportionate increase in SampEn due to the same amount of noise implies that the regularity measured by SampEn is not directly related to conventional measures of variability such as standard deviation. SampEn considers the sequential order of data points. Linear variability measures such as standard deviation and root mean square, however, reflect the overall deviations from the mean of head velocities, without consideration of temporal order. This fundamental difference may explain why nonlinear algorithms like SampEn often reveal subtle time-series properties not detected previously using traditional approaches.

Second, after expanding the template data horizontally (increased cycle duration and thus decreased cycle frequency), we found that the SampEn in each subject group became: POS < PRE < Control (Fig. S5e, bar plot), and specifically, that SampEn increased in the PRE template and decreased in the POS and Control templates. While only the templates with decreased cycle frequency are shown, the increased cycle frequency was also tested and revealed the same non-monotonic changes in SampEn. These non-monotonic changes brought by the frequency manipulations imply that SampEn is not only reflecting the frequency of the periodic elements in the time series data. Finally, the amplitude manipulation had no impact on SampEn (Fig. S5a,d, bar plot).

Hence, based on our visualization of continuous head-kinematics data, healthy controls completed their head movements faster (i.e., had higher cycle frequencies), and both healthy controls and preoperative patients generally displayed more variation compared to postoperative patients in the cycle frequencies and peak amplitudes of their head movements. Thus, based on this analysis, we can expect that for continuous motions healthy controls would have a higher SampEn score than patients whose intertrial-correlation differences (ICD) would be smaller. Within the patient group, the preoperative patients would have a higher SampEn score due to their larger variances in both cycle frequency and peak velocities. Regarding transient motions, on the other hand, the inter-group differences might be trivial as the peak amplitudes and frequency distributions are comparable among all three subject groups. Hence, we expect their SampEn to be similar and reveal no significant differences.

Sample entropy showed that healthy controls exhibited higher temporal movement variability

Accordingly, we next applied SampEn to the continuous and transient head-movement data, and the results were consistent with our expectations. Table 2 shows inter-group comparisons of SampEn results, in which asterisks indicate the differences at three significance levels (* for p < 0.05, ** for p < 0.01, *** for p < 0.001, ns for no significant difference), and exact p values are shown with ± signs indicating which subject group has a higher mean SampEn value. The continuous-data comparison is listed in the top half of the table, and the transient-data comparison is listed in the bottom half. To further illustrate the SampEn distributions, Fig. 5 shows the superimposed SampEn distributions from (a) preoperative patients and control subjects, (b) postoperative patients and control subjects, and (c) preoperative and postoperative patients, each for two continuous exercises, ‘VOR × 1 near (handheld)’ and ‘VOR × 1 near (on the wall)’ (Fig. 5a,b,c, upper and lower panels). The two exercises are shown as examples revealing the most significant SampEn differences between postoperative patients and healthy controls.

Table 2 Sample entropy comparison between preoperative patients, postoperative patients and healthy controls.
Fig. 5

Sample entropy results in two continuous exercises. Comparisons between (a) preoperative patients and healthy controls, (b) postoperative patients and healthy controls, and (c) preoperative patients and postoperative patients are shown. The sample-entropy measurements were generated with parameters m = 2 and r = 0.133 and are shown as histograms overlaid with fitted Gaussian curves. The y axis shows the number of values. Asterisks indicate differences at three significance levels (* for p < 0.05, ** for p < 0.01, *** for p < 0.001, ns for no significant difference).

For the continuous head movements, SampEn results exhibited significant intergroup differences. As predicted by our surrogate data analysis, controls had significantly higher SampEn than both patient groups, and preoperative patients had slightly (but not significantly) higher SampEn than postoperative patients. Specifically, there was a significant difference between the preoperative patients and control subjects in ‘VOR × 1 near (handheld)’ in the yaw direction of head movement (p = 0.00948 < 0.01, Fig. 5a, top panel), in ‘VOR × 1 near (on the wall)’ in both yaw (Fig. 5a, bottom panel) and pitch directions, and in ‘VOR × 1 far’ in the pitch direction (Table 2a, top half chart). Postoperative patients were also significantly different from healthy controls in ‘VOR × 1 near (handheld)’ in the yaw direction (Fig. 5b, top panel), and in ‘VOR × 1 near (on the wall)’ in both yaw (Fig. 5b, bottom panel) and pitch directions (Table 2b, top half chart). There were no significant intergroup differences in any other continuous-exercise condition (Table 2, top half chart, Fig. 5c, top and bottom panels). Therefore, healthy subjects exhibited significantly higher variability in continuous head movements than patients, and ‘VOR × 1 near (on the wall)’ best distinguished patients from healthy controls, implying that ‘VOR × 1 near (on the wall)’ might be a more challenging task compared to the other continuous-gaze exercises. There were no significant variability differences between preoperative and postoperative patients. For the transient head movements, as expected from our surrogate analysis, there were no significant differences between the three subject groups in terms of SampEn (Table 2a,b,c, bottom half chart), except for ‘Visual target (1 m distance)’ in the pitch direction (Table 2b, bottom half chart), which asked subjects to move their heads in discrete impulses while a target object was visible 1 m in front of them.

In summary, our findings show that the use of SampEn can reveal significant differences in regularity in head-movement patterns between healthy controls and patients, especially in the postoperative phase during continuous gaze-stabilization exercises. Across all the significant differences observed, healthy controls consistently showed higher SampEn whereas pre- and postoperative patients had comparable SampEn. These two observations indicate that healthy controls exhibit a higher degree of irregularity than patients whose motor patterns are already less variable before the surgical vestibular nerve deafferentation and did not change much after the surgery.

Discussion

To our knowledge, this is the first study to employ trial-based autocorrelation and sample entropy analysis to study the temporal variations in head kinematics during active gaze shifts as well as VOR training exercises (i.e., VOR × 1). Furthermore, this is the first study to compare these measures in control subjects with those of individuals with unilateral vestibular peripheral loss. We first found that the loss of vestibular feedback altered the trial-based autocorrelation function structure. Specifically, control subjects showed a strong periodic pattern in the autocorrelation function of peak head movement velocity that was more marked across pitch versus yaw movements. However, following unilateral vestibular loss there was no longer significant periodicity in either the pitch or yaw exercises. Further, our analysis of sample entropy showed that patients demonstrated significantly lower movement entropy than healthy controls before and after surgery, and that this difference became more marked postoperatively. Taken together, our results suggest that the analysis of trial-based autocorrelation and sample-entropy measures of the temporal structure and variability of head movements during gaze-stabilization exercises may provide novel insights into gaze-shift strategies and also solid supplementary metrics for assessing patient kinematics in rehabilitation settings. An overarching goal for this work was to establish methods of precision and objectivity that can augment current healthcare practices that involve expertise and subjectivity.

Inter-trial correlation depends on the temporal ordering of trials, and the structure of these correlations reveals storage of the performance of previous trials and how this influences the programming of subsequent movements25,26. This measure of variability is distinct from the gross system variability that depends on dispersions about the central mean and eliminates the temporal order of the data. By employing trial-based autocorrelation analysis on head-movement kinematics, we found that healthy controls but not patients exhibit a consistent periodic pattern in their trial-parametrized peak velocities, as manifest in the structure of the autocorrelation function. We propose that this pattern reflects the ability of normal individuals to effectively utilize feedback from previous trials to rapidly adjust the movement in the next trial; this is supported by the strong correlations between trials. Since patients do not exhibit this periodic pattern, we suggest that a reduction in the reliability of the vestibular input caused by the unilateral deafferentation27 compromised patients’ ability to use feedback control to adjust movements in response to previous errors. In addition, since there is no significant difference between patient head-movement patterns before and six weeks after the vestibular deafferentation, patients before the surgery might already have compromised vestibular input to inform their movements.

In addition, we speculate that not only is intact vestibular input crucial for head-movement feedback control, but that the otolith organs specifically play a significant role on a trial-to-trial basis in the pitch plane. Although the otolith organs respond to linear accelerations, including changes in the magnitude and direction of gravitational force28, their contribution to pitch VOR and head-eye coordination remains unclear. For example, prior studies that focused on the rotational kinematics of the human VOR reported no large contribution from the otoliths during active or passive pitching of the head29,30, whereas others have concluded that the interaction of the otolith organs and semicircular canal is essential to yielding accurate phase for VORs in the vertical plane in rabbits31,32, cats33, rats34, and monkeys35. In the present study, we found that the periodic pattern observed in our autocorrelation functions was most pronounced in pitch exercises, which required subjects to rotate their heads about an inter-aural axis (“up and down”) while fixating on a target. Notably, this pattern was diminished in yaw exercises, where subjects instead rotated their heads horizontally (“left and right”). A key difference in these two types of head movement is that in the former case, head orientation changes relative to gravity, whereas this is not the case during yaw rotations. As a result, pitch but not yaw motions induce a modulation of otolith-afferent firing rate due to changes in orientation with respect to gravity. Therefore, we speculate that, in healthy control subjects, the otoliths provide information that is stored from trial to trial and used to modulate head motion in pitch but not yaw exercises, thus contributing to the periodic patterns only in the pitch exercises. The fact that this comparison between pitch and yaw is diminished in patients suggests that patients store less trial-to-trial information than healthy controls.

Another novel aspect of our study is the introduction of sample entropy (SampEn) as a nonlinear temporal-variability measurement for head movements. As noted above, conventional variability methods measure the dispersion about the central mean. In contrast, SampEn is a powerful nonlinear dynamical approach that can quantify the structure or organization of the variations present in a time series24. The development of entropy measures such as approximate entropy (ApEn) and SampEn was motivated by the data-length constraints commonly encountered in physiological data such as heart rate, EEG, and endocrine hormone secretion36, and the initial studies used ApEn and SampEn on fetal and neonatal heart rate variability and electrocardiograms to provide information regarding pathology in cardiology18,20,37.

Several prior studies have employed entropy analysis in the context of postural control and balance. Notably, these studies computed ApEn and SampEn measures from gait kinematics and center of pressure (COP, a measure of postural sway). The majority focused on how different task demands, such as different walking speeds, eyes open or closed, stable or sway support, and presence or absence of a simultaneous cognitive task, might influence COP variability in healthy young adults. These studies found that subjects exhibited higher postural variability, as manifest in higher ApEn or SampEn values, in the dual (postural plus cognitive) task than in the postural task alone38,39,40, and lower postural variability when the task was more difficult, such that subjects were deprived of visual inputs40 or subjected to external vibrations41. These prior studies further showed that increasing cognitive involvement in postural control by increasing task difficulty decreased COP variability, whereas withdrawing attention from postural control by creating an external focus on a cognitive dual task increased COP variability. A few prior studies have further focused on how natural aging influences COP during standing balance. They reported that fallers (with fall history within 12 months) showed higher entropy values (both SampEn and ApEn) in the anterior–posterior direction and lower entropy values in the medial–lateral direction than the non-fallers22. Older adults also showed higher entropy values in the anterior–posterior direction and lower entropy values in the medial–lateral direction than young adults22,42. Thus, overall, both increased cognitive involvement and fall risk are related to changes in balance control, which can be effectively identified by entropy measures.

Surprisingly, while these prior studies have established that entropy measurement is a reliable non-linear analysis tool in studies of gait and posture, no study to date has directly addressed how impaired vestibular input might influence sample entropy during these vestibular-dependent behaviors. Consequently, our study is the first to directly test sample entropy on head-movement data from patients with peripheral vestibular loss. Additionally, although ApEn and SampEn algorithms are very sensitive to input parameters, such as the data length and the tolerance range, which are case-specific22,23,43, most of the previous studies used a common set of SampEn parameters. In the current study, to avoid the confounding influences of arbitrary parameter choices, we carefully explored the parameter space for the computation of SampEn to obtain optimal parameters according to a method proposed by Lake et al. (2002)20. We tested SampEn on parameters (m = 3, 4 and r = 0.2) used by other studies18,22,40, but the results did not reach the significance observed for optimized parameter pairs (Tables S3 and S4, for m = 3 and m = 4, respectively), indicating that different properties of the data (duration, modality, dimension, etc.) make parameter optimization necessary. We also tested SampEn on surrogate datasets to ensure the validity of the non-linear analysis and to get an intuitive understanding of SampEn results.

Overall, we found that vestibular schwannoma patients have significantly lower SampEn compared to the control group, both before and after sectioning of the involved vestibular nerve (vestibular neurectomy). The difference was most pronounced in the continuous-gaze exercises. Specifically, both ‘VOR × 1 near’ conditions (on the wall and handheld) showed more significant differences between patients and healthy controls compared to the ‘VOR × 1 far’ condition, suggesting that target distance potentially plays a role in patients’ gaze stability. Notably, the ‘on the wall’ condition of ‘VOR × 1 near’ revealed significant differences in both yaw and pitch, while the ‘handheld’ condition showed significance only in yaw. This discrepancy may arise from the instability of the target location in the handheld task, where hand movements induced by head motion can affect target positioning. In contrast, the ‘on the wall’ task may be more demanding, as subjects must maintain a stable gaze on a fixed target throughout the exercises. For the transient gaze exercises, only the ‘Visual target’ task revealed a significant difference in pitch between postoperative patients and controls. The absence of significant findings in the imaginary-target conditions suggests that the visual presence of a target is important for assessing gaze stability. Overall, the SampEn results highlight that continuous exercises, particularly ‘VOR × 1 near (on the wall),’ best distinguish vestibular schwannoma patients from healthy controls, potentially offering clinically relevant insights for exercise prescription.

The reduction in variability in patients quantified by SampEn during self-generated head turns may be a reflection of the complex control network associated with healthy vestibular circuits during such voluntary movements. Converging studies across patient groups show that temporal variations in physiological function reflect a healthy biological system that represents the underlying capability to make flexible adaptations to immediate stresses and perturbations. This view is supported by multiple experimental studies investigating motor learning44,45,46,47, which in general suggest that healthy individuals employ variability to adapt toward optimal movement policies. If we regard the higher SampEn of healthy controls’ head angular velocities as a reflection of a healthy and flexible biological system, the increased regularity of patients both before and after surgery may be a consequence of compensating for the impaired vestibular feedback, which makes their motor control more rigid. Moreover, in the broader context of neurological and motor disorders, several studies have computed such entropy-based measures to investigate sensorimotor performance in patients, including individuals who sustained concussion48,49,50, stroke43, anterior cruciate ligament rupture51, chronic ankle instability52, multiple sclerosis53, and infants with motor pathologies49,54,55. In general, patients exhibited lower ApEn or SampEn values compared with age-matched healthy controls, and a case study showed that rehabilitation for cerebral palsy increased infant postural variability49.

Limitations

Our results are limited by a relatively small sample size, as vestibular schwannoma is a rare type of tumor (https://www.cancer.gov). The data we analyze, however, involve many repeated trials, which offsets the small sample. (In fact, it is the repetition of trials within an individual, not many trials across individuals, that is necessary for the analyses that rely on temporal ordering, such as autocorrelation and entropy.) Further, while the existing literature provides evidence that hearing has a critical role in improving balance in conditions where other sensory input is altered or absent56, we did not consider the unique contribution hearing may have on gaze stability training. Finally, although our results show that SampEn measures can distinguish an impaired vestibular system from a healthy one, our current study does not address whether SampEn may in fact change over time in patients with impaired vestibular function—which may be very helpful for the clinician. We propose that SampEn brings novel insights into time-series data that may be employable as additional measurement criteria for clinically evaluating patients’ kinematics and their rehabilitation progress. Future steps should involve associating SampEn in head movements with other commonly used metrics such as VOR gain and detrended fluctuation analysis (DFA) in gait.

Conclusion

Trial-based variability analysis and sample entropy provide novel insights into the different control dynamics of patients with unilateral vestibular loss versus healthy individuals in the generation of head movements during gaze-stabilization exercises. From the trial-based analysis, we see that healthy controls can effectively regulate their head movements based on intact real-time feedback whereas patients lack this ability. In sample-entropy analysis, on the other hand, controls showed highly variable movement speed and patients were much more stereotyped. Hence, we conclude that healthy individuals not only show a general feedback control on their head-movement patterns but also exhibit a greater level of variability in their movement generation, which allows for more flexible adjustment to the immediate demands of the stimulus environment. Vestibular schwannoma patients, however, do not seem to have this feedback control and recruit a more rigid motor plan, probably as a compensatory strategy that is less flexible and so is meant to provide adequate performance without rapid adjustments. Trial-based analysis and SampEn are innovative measures that examine temporal aspects of head movements and provide novel insights into neural control strategies, providing a richer understanding as compared with conventional variance-based metrics. Our results also provide evidence to support variability as a hallmark of healthy biological systems, and that the dynamical processes contributing to this variability make sensorimotor systems more adaptive and resilient to internal and external perturbations. We suggest that the use of temporal variability as a criterion to evaluate patients’ kinematics may serve as a beneficial monitor of their motor-rehabilitation progress.