Background & Summary

Humans use coordinated head and eye movements to effectively survey, gather information from, and interact with their environment. Information about this head-eye coordination provides key insights into how a person interacts with the environment. It can also be used for designing controllers for neck exoskeletons1,2, realistic robotic actors3,4,5, or virtual characters6,7,8. The dataset presented in this paper provides information on eye-head coordination that can, for example, be analysed for key behavioral insights or used to supply the data needed to train machine learning models.

In general, head and eye movements are tightly coupled in gaze behaviors through neural mechanisms such as the vestibulo-ocular and optokinetic reflexes9,10,11,12. When responding to a visual stimulus that requires a large gaze shift (saccade), the head compensates for eye movement to recenter the gaze13,14,15. When tracking a moving target (pursuit), the head moves with the stimulus to keep the target centered within the field of vision11,12,16,17,18. However, due to technological limitations, much prior work studied gaze behaviors only in settings where the head was fixed9,11,19,20,21. Recently, with the help of head-mounted eye-tracking technology and portable motion sensors, studies have increasingly emphasized the role of head movements in human gaze behaviors. For example, Einhauser et al. underscored the dynamic nature of retinal inputs when developing sensory coding models by concurrently recording eye and head movements during natural exploration22. Kothari et al. measured head and eye movements during real-world tasks to develop algorithms that categorize gaze events12.

Despite this emphasis, the eye trackers and inertial measurement units used in these studies were highly specialized, which limits data reproducibility. These systems also suffer from technical limitations such as sensor drift and unwanted sensor-body motion (e.g., slippage)12. Additionally, the use of physical environments and tasks requires significant setup and modification costs, so data collection is limited to participants with access to the physical infrastructure. Hence, there is a need for an open dataset of paired eye and head movements during various visual-motor tasks, collected through accurate, easy-to-access, and standardized procedures.

To address this need, we present an open dataset of paired head-eye movements during tasks in virtual environments. The use of virtual environments allows for standard and accurate equipment, as well as rapid modification of the environments at relatively low cost. In this dataset, paired head and eye movement information was collected from 20 healthy young adults. Both tracking (following objects) and searching (looking for objects) tasks were included. The data collection procedure is easy to carry out, as only a head-mounted display with eye-tracking capability is required. The virtual environment can be easily shared, enabling highly scalable data collection from anyone with access to the same virtual reality system. The virtual reality system also allows fast reconfiguration of the virtual environment to study different head-eye behaviors as needed. This open head-eye coordination dataset provides a new resource for scientists to study head and eye movement behaviors and the underlying neural mechanisms (e.g., anticipatory gaze behaviors). It may also assist engineers in developing bio-inspired machines relevant to robotics and medical research, such as neck exoskeletons to restore head-neck motion in patients with neurological impairments.

Methods

Twenty-five participants with varying levels of virtual reality experience were enrolled in this data collection (Table 1). No participant had a known history of head-neck injuries. Six participants wore glasses, five of whom wore them during the experiment. The participant who opted not to wear glasses was able to perform the tasks as well as all other participants. All participants were able to understand and follow the task instructions. The protocol was approved by the University of Utah Institutional Review Board (#00145893). All participants provided written informed consent prior to the data collection.

Table 1 Characteristics of participants.

A virtual reality headset with eye-tracking capability (VIVE Pro Eye, HTC Corporation) was used in the data collection, and the virtual environments and tasks were developed in Unity 3D. Each participant was seated on a stationary chair throughout data collection. After fitting the headset, the eye tracking system was calibrated using a built-in procedure (see Technical Validation section). Each participant engaged with four interactive visual tracking and searching virtual reality environments, three times each, in increasing order of difficulty, where difficulty is defined by the number of targets, the complexity of the target trajectories, and the presence of distracting elements (Fig. 1). The experiment could be paused at any time, and the system could be recalibrated using the same built-in procedure when resumed.

Fig. 1

Data collection procedure. The participant put on the VIVE Pro Eye headset, which was then calibrated for eye and head measurement. To calibrate, a physical adjustment of the headset position and interpupillary distance is performed, followed by a software eye-tracking calibration. The experiment could be stopped at any time and the system recalibrated when resumed. Four tasks of increasing difficulty are completed three times each. The time, eye direction vectors relative to the head, and head direction vectors are all recorded.

The total duration of the data collection was approximately 18 minutes, resulting in 1080 seconds of paired head and eye movement per experiment. During these trials, participants were instructed to track moving objects in each virtual environment, with their eyes and head free to move. The headset recorded the paired head and eye directions at an average (standard deviation) rate of 120.66 (8.75) Hz. The head and eye directions are expressed as unit vectors: the left eye direction relative to the head frame, the right eye direction relative to the head frame, and the head direction relative to the virtual environment ground frame. The orientations of the frames used by the headset are shown in Fig. 2.

Fig. 2

(Left) Example target trajectories for each of the four tasks. (a) The target in the Linear Smooth Pursuit task moves in straight lines of random direction and distance. (b) The target in the Arc Smooth Pursuit task moves along circular arcs of random direction, distance, and curvature. Both pursuit tasks feature only a single target. (c) Each of the three targets in the Rapid Visual Search task starts some distance from the participant and moves towards the participant. (d) In the Rapid Visual Search Avoidance task there are three targets to fixate on (blue) and three blocking targets to avoid fixating on (yellow). Each target and blocking cube starts from a plane some distance from the participant and moves towards the participant. The dashed line represents the horizon. (Right) Orientations of the frames used by the VIVE Pro Eye headset to define head and eye directions. The head vector is defined with respect to the world frame, and the eye vectors are defined with respect to the head frame, which is attached to and follows the VR headset.

The first two tasks, Linear Smooth Pursuit and Arc Smooth Pursuit, were designed to measure the movement behavior of eyes and head tracking a single, slow-moving target at a fixed distance of 10 meters. From the participant’s perspective, in the Linear Smooth Pursuit task, the target moves on a linear trajectory, whereas the target moves on a circular trajectory in the Arc Smooth Pursuit task.

The latter two tasks, Rapid Visual Search and Rapid Visual Search Avoidance, were designed to measure the head and eye movement behavior when participants were asked to search for discrete targets and gaze at them for a short period of time. The difference between these two tasks is that in the Rapid Visual Search Avoidance task, additional objects are presented to distract the participant.

Overall, we collected data for tracking and pursuit tasks with disturbances caused by random direction changes, as well as searching and fixating tasks with distractors. For all tasks, targets (blocks) were constrained within a 140° by 140° conic range. When the gaze was fixed on a target, its color switched from blue to green. Each trial was 90 seconds long, and randomization was uniquely seeded for each participant. At the end of each trial, the participant's tracking performance was measured using a numerical score (explained below). Images of the VR environment are shown in Fig. 3.

Fig. 3

The VR environment used for capturing the data. (Left) The single target in a linear or arc pursuit task. The cube is green because the gaze is fixated on it. (Right) A blue gaze target with two yellow distracting targets in the searching avoidance task.

Task 1: Linear Smooth Pursuit

Participants were asked to follow a target that moved between uniformly random positions at a fixed speed of 5 meters per second (Fig. 2a). Throughout a trial, the participant was asked to visually track this target as it moved, with their head free to move. After each trial, the participant received a score equal to the number of frames in which the measured gaze was on the target.

Task 2: Arc Smooth Pursuit

Participants were asked to follow a target moving along circular trajectories at a fixed angular speed of one radian per second, as illustrated in Fig. 2b. The trajectories were randomly generated and updated similarly to the Linear Smooth Pursuit task. With the head free to move, the participant was asked to visually track the moving target during a trial. At the end of each trial, the participant received a score equal to the number of frames in which the measured gaze was on the target.

Task 3: Rapid Visual Search

Three targets were generated at uniformly random locations on a plane a fixed distance from the participant and moved towards the participant at a fixed speed (Fig. 2c). The participant was asked to eliminate these targets before they reached the participant by fixing their gaze on a target for 0.3 seconds. After a target was eliminated, a new target was generated at a random location on the same plane, not necessarily inside the field of vision, which required the participant to search for the new target. The participant received a score after each trial equal to the number of targets eliminated.

Task 4: Rapid Visual Search Avoidance

This task is an extension of the Rapid Visual Search task in which three distracting objects (yellow blocks, Fig. 2d) were added to the trial. The participant was instructed to avoid gazing at them. When a target was eliminated or reached the participant, it respawned at a random starting point on the same plane. The distracting objects turned red when the eyes fixated on them and did not respawn until they reached the participant. The participant received a score after each trial equal to the number of targets eliminated, with no penalty for gazing at the distracting objects beyond the time lost not eliminating targets.

In the first two tasks, randomly changing the target trajectories forces the participant to correct their gaze, thus capturing reactive in addition to anticipatory behavior. Randomizing the starting positions of targets in the last two tasks ensures the participant performs true searching behavior. The four tasks are presented in order of increasing difficulty; however, this order may easily be modified in the program.
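The per-trial scores were computed by the Unity program at runtime. For users who wish to reconstruct a comparable measure offline (e.g., from the target trajectory files available for participants 21–25), a minimal sketch is shown below; it assumes the gaze and target directions are unit vectors expressed in the same frame, and the angular threshold is a hypothetical stand-in for the targets' angular size rather than a value taken from the experiment code.

```python
import numpy as np

def frames_on_target(gaze_dirs, target_dirs, threshold_deg=2.0):
    """Count frames in which the gaze falls within a small angle of the target.

    Both inputs are (N x 3) unit direction vectors assumed to be expressed in
    the same frame; threshold_deg is a hypothetical angular tolerance.
    """
    g = np.asarray(gaze_dirs, dtype=float)
    t = np.asarray(target_dirs, dtype=float)
    cos_angle = np.clip(np.sum(g * t, axis=1), -1.0, 1.0)  # per-frame dot product
    angle_deg = np.degrees(np.arccos(cos_angle))
    return int(np.sum(angle_deg < threshold_deg))
```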

Data Records

All data as well as trial scores and demographics are made available on Figshare23. Each participant is identified by a unique numeric code from 1 to 25 (Table 1). The file structure consists of one directory per participant (labeled User[Participant ID]), in which that participant's trials are stored. Thus, there are 25 participant folders in the root directory and 12 trial data files in each participant folder. Since each task was performed three times by each participant, each of the 12 trial file names has the structure User[Participant ID]_[Trial Type]_[Trial Number].csv. The [Trial Type] is one of the four tasks used in this experiment (i.e., Linear Smooth Pursuit, Arc Smooth Pursuit, Rapid Visual Search, and Rapid Visual Search Avoidance). The [Trial Number] is the occurrence of a trial within each task (indexed from 0, i.e., 0, 1, and 2). For example, the second trial of the Arc Smooth Pursuit task for the sixth participant has the file name User6_ArcSmoothPursuit_1.csv. The folders for users 21 through 25 also include the trajectories of the targets, stored in files that follow the same Participant ID, Trial Type, and Trial Number naming pattern as the eye and head data (e.g., Object21_ArcSmoothPursuit_1.csv).
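For illustration, the sketch below builds trial file paths from this naming convention. The local root folder name and the exact [Trial Type] strings other than ArcSmoothPursuit (which appears in the example above) are assumptions; adjust them to match the downloaded Figshare archive.

```python
from pathlib import Path

# Assumed local root of the downloaded Figshare archive.
DATA_ROOT = Path("head_eye_dataset")

# Trial type strings assumed to mirror the task names without spaces,
# following the User6_ArcSmoothPursuit_1.csv example.
TRIAL_TYPES = ["LinearSmoothPursuit", "ArcSmoothPursuit",
               "RapidVisualSearch", "RapidVisualSearchAvoidance"]

def trial_path(participant_id: int, trial_type: str, trial_number: int) -> Path:
    """Path of one eye/head trial file, e.g. User6_ArcSmoothPursuit_1.csv."""
    return (DATA_ROOT / f"User{participant_id}"
            / f"User{participant_id}_{trial_type}_{trial_number}.csv")

# All 25 participants x 4 tasks x 3 repetitions = 300 trial files.
all_trials = [trial_path(pid, task, rep)
              for pid in range(1, 26)
              for task in TRIAL_TYPES
              for rep in range(3)]
```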

The trial files are in .csv format, with a single header line followed by comma-delimited data (Fig. 4). The first column is the timestamp. At each time step, the second through fourth columns contain the direction vector, in the order of x, y, and z components, of the left eye in the head frame, following the coordinate frames defined in Fig. 2. Similarly, the direction vectors of the right eye in the head frame and of the head in the world frame are in columns 5–7 and 8–10, respectively. The tracker records data at an average (standard deviation) sampling rate of 120.66 (8.75) Hz, and each trial was recorded for 90 seconds.
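A minimal loading sketch, assuming the column order described above; labels are assigned by position rather than relying on the exact header names in the files.

```python
import pandas as pd

# Column labels assigned by position (see Data Records); the header row in
# the file is skipped and replaced by these names.
COLUMNS = ["time",
           "left_eye_x", "left_eye_y", "left_eye_z",      # left eye, head frame
           "right_eye_x", "right_eye_y", "right_eye_z",   # right eye, head frame
           "head_x", "head_y", "head_z"]                  # head, world frame

def load_trial(path):
    return pd.read_csv(path, header=0, names=COLUMNS)

trial = load_trial("User6/User6_ArcSmoothPursuit_1.csv")
print(trial.head())
```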

Fig. 4

The data retrieval structure. The dataset is a folder with a separate directory for each user. Each user directory contains 12 .csv files, one per trial. Each trial file contains the time, the eye-in-head direction vectors, and the head direction vectors. The vectors are recorded as x, y, and z components.

To test the correctness of the data, the distributions of horizontal and vertical head and eye angles across all participants are plotted as histograms for each task (Fig. 5). Horizontal angles were calculated as the deviation from the sagittal plane, and vertical angles as the deviation from the transverse plane.
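As an illustration of this angle convention, the sketch below converts unit direction vectors into horizontal and vertical angles. The axis assignment (x rightward, y upward, z forward) is an assumption based on Unity's default convention and should be checked against the frames shown in Fig. 2.

```python
import numpy as np

def direction_to_angles(vectors):
    """Convert (N x 3) unit direction vectors to horizontal/vertical angles in degrees.

    Assumes x points rightward, y upward, and z forward (Unity convention);
    verify against Fig. 2 before use.
    """
    v = np.asarray(vectors, dtype=float)
    x, y, z = v[:, 0], v[:, 1], v[:, 2]
    horizontal = np.degrees(np.arctan2(x, z))                # deviation from the sagittal plane
    vertical = np.degrees(np.arcsin(np.clip(y, -1.0, 1.0)))  # deviation from the transverse plane
    return horizontal, vertical

# Example: one vector pointing straight ahead, one pitched 30 degrees upward.
demo = np.array([[0.0, 0.0, 1.0],
                 [0.0, 0.5, np.sqrt(3) / 2]])
print(direction_to_angles(demo))  # approximately ([0, 0], [0, 30])
```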

Fig. 5

Normalized histograms of horizontal and vertical left eye angles and head angles for each task, across all participants and trials. The eye angles are with respect to the head frame. The red curves correspond to the left eye angles and the blue curves to the head angles. Only left eye angles are included because the right eye angles were similar. After calculating each participant's histogram for each task, the mean histograms (solid lines) and first and third quartile histograms (filled bands) are produced.

The majority of eye angle data are centered at zero, while the head angles have a wider distribution than the eye angles. This is most likely due to the head rotating to keep the eyes centered on the target, which aligns with the behavior observed in the literature14,15. It has been suggested that such behavior reduces cognitive load13. For the searching tasks, there are two symmetric non-zero peaks in the head angles, an expected outcome given that the participants needed to search their environment for targets. Due to the symmetry of the task environments and the approximately uniform spread of target positions, we do not observe any skewness or concentration of head angles. To further illustrate the data, a five-second window of eye and head angles for a random subject is shown in Fig. 6. In the searching tasks (Tasks 3 and 4), the eye and head angles show more activity, whereas they are milder in the pursuit tasks. In the pursuit tasks, the eye angles remain nearly constant at zero, showing that the head is centering the gaze.

Fig. 6

A five-second window of left eye angles and head angles for each task, from a random trial of a random subject. The eye angles are with respect to the head frame. The red curves correspond to the left eye angles and the blue curves to the head angles. Only left eye angles are included because the right eye angles were similar.

Technical Validation

In addition to the headset, base stations need to be installed to best capture the headset motions. For this data collection, two base stations were installed at opposite corners of the capture volume, following the onscreen instructions of the virtual reality software and the manual. The capture volume was roughly 8 × 8 × 8 feet. The headset must be adjusted and calibrated for each participant. This was performed by following the software calibration procedure of the headset: the participant first fits and tightens the headset on their head, then uses a dial on the headset to adjust the interpupillary distance following the onscreen instructions, and lastly follows a target at five different positions with the head fixed to calibrate the eye tracking measurement. This process can be found in the headset manual. The literature suggests that, after proper calibration, this headset achieves a mean accuracy of 1.08° within a 60° by 60° conic window and 4.16° within a 100° by 100° conic window24,25. Instructions on how to run and modify the experiment with the given code (see Code Availability section) can be found in a README.txt file located in the root directory of the code. For example, to change the fixation time required to respawn targets in the search tasks, the code file EyeTrackingTest/Assets/Scripts/HighlightAtGaze.cs can be modified. To add a new task to the experiment pipeline, the code file EyeTrackingTest/Assets/Scripts/ModelSim.cs can be modified.

Blinks occur during data collection. In our data record, samples recorded during blinks are labelled as zeros for both eyes. Blink durations are short, on average (standard deviation) 0.101 (0.086) seconds. This creates discontinuities in the time trajectories of the eyes. However, these can be addressed in post-processing with suitable interpolation techniques (e.g., linear interpolation), as sketched below.
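A minimal post-processing sketch, assuming the positional column labels used in the loading example above: blink samples (all-zero eye vectors) are marked as missing and filled by linear interpolation over the frame index.

```python
import numpy as np
import pandas as pd

def interpolate_blinks(trial: pd.DataFrame, eye_cols) -> pd.DataFrame:
    """Replace all-zero blink samples in the given eye columns and interpolate."""
    out = trial.copy()
    blink = (out[eye_cols] == 0).all(axis=1)   # frames where the eye vector is all zeros
    out.loc[blink, eye_cols] = np.nan
    out[eye_cols] = out[eye_cols].interpolate(method="linear", limit_direction="both")
    return out

# Hypothetical column labels matching the loading sketch in Data Records.
trial = pd.read_csv("User6/User6_ArcSmoothPursuit_1.csv", header=0,
                    names=["time",
                           "left_eye_x", "left_eye_y", "left_eye_z",
                           "right_eye_x", "right_eye_y", "right_eye_z",
                           "head_x", "head_y", "head_z"])
trial_clean = interpolate_blinks(trial, ["left_eye_x", "left_eye_y", "left_eye_z"])
```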

A limitation of our dataset is that we did not include the 3D trajectories of the targets for most of our participants. This data, however, can be included in future data collections by modifying the code found in EyeTrackingTest/Assets/Scripts/GazeCollection2.cs, as we have done for the data collected from participants 21–25. The virtual reality system does not allow us to capture velocity data directly, but velocities may be approximated from the position and time data if desired; alternatively, an external IMU may be attached to the headset, at the cost of increased system complexity. Appropriate filtering should be applied when numerically computing velocities (see the sketch below). Another limitation of the dataset is the fixed environmental conditions (e.g., lighting) in which the data were collected. These conditions, however, can be modified by future users through the code, which we have made publicly available.
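As a hedged example of such numerical differentiation, the sketch below estimates angular velocity from sampled angles by smoothing with a Savitzky-Golay filter before taking finite differences; the window length and polynomial order are placeholder values to be tuned for the roughly 120 Hz sampling rate of the dataset.

```python
import numpy as np
from scipy.signal import savgol_filter

def angular_velocity(time_s, angle_deg, window_length=15, polyorder=3):
    """Estimate angular velocity (deg/s) from a sampled angle trace.

    The signal is smoothed with a Savitzky-Golay filter before numerical
    differentiation; window_length and polyorder are placeholder values.
    """
    smoothed = savgol_filter(np.asarray(angle_deg, dtype=float),
                             window_length=window_length, polyorder=polyorder)
    return np.gradient(smoothed, np.asarray(time_s, dtype=float))
```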