Abstract
This paper presents a comprehensive multi-modal dataset capturing concurrent haptic, audio, and visual signals recorded from ten participants as they interacted with ten different textured surfaces using their bare fingers. The dataset includes stereoscopic images of the textures, and fingertip position, speed, applied load, emitted sound, and friction-induced vibrations, providing an unprecedented insight into the complex dynamics underlying human tactile perception. Our approach utilizes a human finger (while most previous studies relied on rigid sensorized probes), enabling the naturalistic acquisition of haptic data and addressing a significant gap in resources for studies of human tactile exploration, perceptual mechanisms, and artificial tactile perception. Additionally, fifteen participants completed a questionnaire to evaluate their subjective perception of the surfaces. Through carefully designed data collection protocols, encompassing both controlled and free exploration scenarios, this dataset offers a rich resource for studying human multi-sensory integration and supports the development of algorithms for texture recognition based on multi-modal inputs. A preliminary analysis demonstrates the dataset’s potential, as classifiers trained on different combinations of data modalities show promising accuracy in surface identification, highlighting its value for advancing research in multi-sensory perception and the development of human-machine interfaces.
Background & Summary
Human perception is inherently multi-sensory, involving the integration of information from various modalities to enhance the robustness and accuracy of our understanding of environmental characteristics. This integration is crucial for resolving ambiguities, ensuring perceptual stability, and improving the perception of object properties under uncertain conditions1,2,3,4. Touch, audition and vision are the primary senses that contribute to perceiving the physical properties of surfaces. Stereoscopic visual data, for example, provides depth information essential for understanding a surface’s topography and texture5. Auditory cues, particularly those produced by mechanical interactions such as sliding contacts, serve as vital sources of perceptual information6,7. Additionally, touch is a complex sense that integrates various sub-modalities, including force, vibration, and proprioception8,9. The interdependence of these modalities is evident in phenomena such as the pseudo-haptic effect, where visual stimuli create an illusion of haptic feedback10, and the influence of auditory cues on the perception of hand dryness11. Therefore, datasets encompassing these sensory modalities are essential for advancing perceptual studies and developing artificial sensory feedback systems that fully exploit the multi-sensory nature of human perception.
Previous research has primarily focused on capturing images of textures and associated haptic signals, such as acceleration or load, using rigid probes12,13,14,15,16,17. Some studies have also investigated the auditory aspects of sliding interactions18,19. These efforts have provided valuable insights into the role of individual sensory modalities in texture perception, facilitating the development of haptic rendering systems integrated into simple styluses or robotic fingers20. However, most of these studies relied on rigid, sensorized probes and did not capture the direct interaction of a bare human finger with surfaces. Other research has documented the load and vibrations on a substrate from a bare finger sliding on a textured surface21,22 or the skin vibrations during sliding23,24.
Finally, a critical feature for a comprehensive dataset is the presence of a subjective evaluation of the surfaces. This evaluation is essential to understand how the physical properties of the surfaces are perceived by humans. Recent work such as Sens325 presents a multi-sensory dataset (position, audio, vibration and temperature) recorded on two participants and 50 textures, coupled with subjective evaluations. Sens3 employs an ADXL356 accelerometer with a bandwidth of 2.6 kHz, which is sufficient to capture vibrations above the threshold of human tactile perception26. However, as the dataset is recorded at 10 kHz, a substantial portion of the acquired signal falls outside the sensor’s effective range, limiting its usefulness for analysing the mechanical propagation of high-frequency vibrations through the skin27. Additionally, the dataset lacks precise sensor synchronization, thereby limiting its applicability for high-frequency signal analysis. Existing datasets and their key features are summarized in Table 1.
To our knowledge, no comprehensive dataset exists that integrates synchronized visual, audio, and haptic signals collected from finger exploration of material surfaces, representing a significant gap in resources for perceptual studies and machine learning. Such a dataset would enable the study of the complex interplay between different sensory modalities in human tactile perception and could serve as a foundation for inducing haptic feedback directly on the human finger28, as well as for training machine learning algorithms to recognize textures based on multi-modal inputs.
This gap motivated us to create a multi-modal dataset by collecting signals measured when a bare finger interacts with various textured surfaces. The dataset includes stereoscopic images of the surface, fingertip position and speed in image coordinates, the applied load by the finger, emitted sound, and friction-induced vibrations propagating through the finger. While the use of an artificial finger was considered29,30,31, we opted for bare fingers to ensure natural data collection.
The potential of this dataset is demonstrated by exploring the accuracy of classifiers trained on different combinations of these multi-modal data to identify surfaces. The following sections detail the data acquisition apparatus and protocol, the surfaces used (Section Methods), the dataset’s structure (Section Data Records), and the technical validation of the dataset (Section Technical Validation).
This dataset addresses a critical gap in multi-sensory perception research and opens new avenues for advancements in perception studies, offering unprecedented insights into the nuanced interplay of touch, vision, and sound in human perception.
Methods
Experimental setup
The dedicated setup shown in Fig. 1, installed in a sound-proofed room to minimize external noise interference, was designed to record high-resolution visual, audio, and haptic data as a participant’s finger slides against differently textured surfaces. It consists of the following components:
Experimental setup used to collect multi-modal data from human fingertip interactions with various textured surfaces. (A,B) 4K cameras 1 and 2 above a textured surface. (C,D) Left and right directional microphones with adjustable supports. (E) Light source. (F) Texture holder. (G) Screen displaying the GUI. (H) Texture. (I) Force/torque sensor. (J) Accelerometer-nail glued to the fingernail and (K) accelerometer-phalanx strapped to the index finger by a (L) silicone ring. (M) Two tracking markers for recording fingertip motion and direction.
Visual data acquisition
To study and understand the role of visual cues in haptic perception, we recorded high-resolution stereoscopic images of the textured surfaces (Fig. 2). Similar to human vision, stereoscopic images provide depth information, which is crucial for understanding the surface’s topography and texture:
- Vision. The visual data acquisition setup consisted of two 4K cameras (Fig. 1A,B; model QXC-700, Thustar, Dongguan, China) equipped with close-range lenses to record the textured surfaces. The cameras were positioned 5 cm apart, capturing the surfaces from slightly different angles. Each surface was photographed at a resolution of 3840 × 2160 pixels.
Representative textures used in the experiments, selected from the larger set of a previous study33.
Tactile data acquisition
To capture the complexity and multi-dimensionality of tactile interactions, we used several types of sensors to record the force, vibrations, and position of the finger during the interaction:
- Force/torque. The load applied by the finger on the surface impacts the physical interaction and the perception of the texture. This load was recorded using a six-axis force/torque (F/T) sensor (Fig. 1I; Model Nano17-SI-25-0.25, ATI Industrial Automation, Apex, NC, USA) with a resolution of 1/160 N and 1/32 N·mm, mounted on a 3D-printed holder attached to the aluminium profile structure. The sensor was positioned 5 cm below the texture surface.
- Vibrations. When the fingertip slides on a textured surface, vibrations are generated and propagate through the finger. We recorded these vibrations using two accelerometers (Fig. 1J,K; Model 8778A500, Kistler Instrumente AG, Winterthur, Switzerland) with a sensitivity of 10 mV/g and a frequency range of 2 to 9000 Hz. The position and fixation of the accelerometers are critical for capturing these vibrations. The first accelerometer was glued to the fingernail to ensure a rigid fixation and prevent damping effects. The second accelerometer was attached to the first phalanx of the finger to evaluate the vibrations propagating through the finger. Because of the soft nature of the finger skin, this second accelerometer could not be rigidly fixed; instead, a silicone ring was used to ensure a comfortable fixation. The data recorded from these accelerometers are referred to as ‘Vibration-nail’ and ‘Vibration-phalanx’, respectively.
- Fingertip position and speed. To understand the impact of sliding speed and direction on the gathered data, we recorded the position and speed of the fingertip. Two tracking markers (Fig. 1M) on the fingernail were used to compute the centroid of the fingertip on the surface; a kinematics sketch based on these markers is given below.
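As an illustration (not part of the released processing scripts), the following Python sketch derives the fingertip centroid, sliding speed, and direction from the two marker trajectories; the array names are hypothetical:

```python
import numpy as np

def fingertip_kinematics(t, marker1_xy, marker2_xy):
    """Estimate fingertip centroid, speed, and sliding direction from two nail markers.

    t           : (N,) timestamps in seconds (e.g., time_pos)
    marker1_xy  : (N, 2) coordinates of the first marker in the camera frame
    marker2_xy  : (N, 2) coordinates of the second marker in the camera frame
    """
    centroid = 0.5 * (marker1_xy + marker2_xy)               # midpoint of the two markers
    velocity = np.diff(centroid, axis=0) / np.diff(t)[:, None]
    speed = np.linalg.norm(velocity, axis=1)                 # scalar sliding speed
    direction = np.arctan2(velocity[:, 1], velocity[:, 0])   # heading angle in radians
    return centroid, speed, direction
```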
Audio data acquisition
Sound produced by the interaction between the finger and each textured surface was also recorded to capture any auditory cues generated. The sound produced by the finger-surface interaction has a low amplitude relative to the background noise (e.g., the sound of the participant’s breathing):
- Audio. To focus on the sound generated by the interaction, we used two directional microphones (Fig. 1C,D; Model 130A23, PCB Piezotronics Inc., Depew, NY, USA) with a sensitivity of 14 mV/Pa. The microphones were placed pointing towards the finger and the surface from two opposite directions to facilitate audio post-processing and noise reduction.
Signal conditioning and synchronization
The quality of the recorded data is highly dependent on the signal conditioning and synchronization of the different data sources. The microphones and accelerometers were connected to a laboratory amplifier and acquisition device (Model 5165A, Kistler Instrumente AG, Winterthur, Switzerland) that recorded the audio and acceleration signals at a 12.5 kHz sampling rate with 32-bit resolution. These signals were synchronized at the hardware level by the signal conditioning unit. Additionally, to synchronize every data source, the computer’s kernel clock was used to timestamp each received data chunk. The data were streamed using a Lab Streaming Layer architecture32. Concurrently, a multi-threaded process on the computer collected all incoming data and directed it into a PostgreSQL (PSQL) database. Given the high volume of data produced by audio and acceleration streaming at 12.5 kHz, the TimescaleDB PostgreSQL extension, designed specifically for efficient time-series storage, was employed to increase the database’s write speed. Fig. 3 illustrates the recorded and processed signals. The dataset is provided with the timestamp of each data point and each sensor’s original sampling rate. For further processing and analysis, the data can be aligned through down- or up-sampling depending on the desired rate, as sketched below.
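As an example of such alignment, the sketch below linearly interpolates a timestamped stream onto a reference time base; it is a minimal illustration using the dataset field names defined in Section Data Records, not the authors’ processing pipeline:

```python
import numpy as np

def align_to_reference(t_ref, t_src, x_src):
    """Linearly interpolate a timestamped multi-channel signal onto a reference time base.

    t_ref : (M,) target timestamps, e.g., time_kistler (12.5 kHz stream)
    t_src : (N,) source timestamps, e.g., time_ft_sensor
    x_src : (N,) or (N, C) source samples
    """
    x_src = np.asarray(x_src)
    if x_src.ndim == 1:
        x_src = x_src[:, None]
    return np.column_stack(
        [np.interp(t_ref, t_src, x_src[:, c]) for c in range(x_src.shape[1])]
    )

# Example: resample the force data onto the accelerometer/microphone time base
# ft_aligned = align_to_reference(time_kistler, time_ft_sensor, ft_sensor_data)
```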
Visualisation of some of the physical signals recorded during lateral sliding (phase 2.c of a trial). The figure shows the following signals from top to bottom: position, speed and velocity direction of the first marker; force measured below the center of the surface; vibration (acceleration) measured on the nail and phalanx; audio signal from the left microphone.
Synchronization strategy
To ensure the synchronization of the data streams, every device was first synchronized to the main computer clock using the NTP protocol, then each chunk of data was timestamped using the computer clock. The dataset is provided with the timestamp of each data point.
Texture surfaces
Ten texture surfaces were selected from a larger set33 to represent a diverse range of materials and surface topographies. These surfaces are displayed in Fig. 2 and are labeled by the same code as in the original study33. To select textures, we explored the samples by sliding our fingers over their surfaces, selecting those expected to generate distinct acoustic and vibratory signatures. Textures that impeded smooth sliding, such as flat metals and glass surfaces, were excluded due to their propensity to induce irregular motion during interaction. Additionally, textures with visual patterns that did not contribute meaningfully to haptic perception (e.g., printed images) were excluded.
Visual feedback mechanism
During each trial, participants were seated in front of a screen providing visual feedback for fulfilling the trial constraints (see Section Experimental protocol). The screen displayed a filled circle to indicate the relative position of the fingertip on the texture (Fig. 4). An empty square was also shown to display the boundaries of the texture within which the participant was required to stay.
Graphical User Interface (GUI). The interface shows a filled circle (A) indicating fingertip position and an empty square (D) representing the texture’s boundaries. The user instructions are displayed on top of the GUI (C). Depending on the trial type, the GUI can display a moving target (B) and/or a load gauge (E).
Experimental protocol
We collected and analyzed multi-modal sensory data from human fingertip interactions with various textured surfaces. The data collection was approved by the Research Ethics Committee of Imperial College London (approval number: 22IC7699). Ten participants aged between 18 and 50 years (5 males, 5 females) were recruited and provided written informed consent before starting the experiment. All participants were right-handed and had no known sensory or motor impairments. The recordings were conducted between August and September 2023. The data collection encompassed 50 trials per participant, with five trials for each of the ten selected textures, and lasted around 1 h.
The textures were examined in a randomized order across trials to prevent order effects. At the start of each trial, participants positioned their finger above the texture, allowing for load calibration without making contact. The trials were divided into three types: Standard, Constant Speed, and Constant Load, each with specific constraints. The order of the trial types was kept constant within each texture. The trials were further divided into five phases, each designed to capture different aspects of tactile interaction (Fig. 5).
Trials types and associated GUI. Each participant performed 50 trials, divided into three types: Standard, Constant Speed, and Constant Load. Each trial was further divided into five phases: unrestricted sliding, clockwise circular sliding, anti-clockwise circular sliding, lateral back-and-forth sliding, and proximal/distal back-and-forth sliding.
Trial phases
The trials, each lasting one minute, were segmented into the following phases, designed to obtain rich data (as shown in the columns of Fig. 5):
0. Before the start of a trial, participants place their finger above the texture without contact. The experimenter then starts the recording; the GUI displays a “Wait” message for 1 s, during which the texture’s weight is automatically calibrated. The task instructions then appear on the display, prompting the participant to begin interacting with the texture.
1. During the first 15 seconds, participants freely move their finger along the surface while varying the load and adjusting their scanning speed.
2. The next 40 seconds are divided into four 10-second segments, each with a specific constraint:
   (a) clockwise circular sliding;
   (b) anti-clockwise circular sliding;
   (c) lateral back-and-forth sliding;
   (d) proximal/distal back-and-forth sliding.
3. The trial ends automatically. The participant then removes the finger from the texture, allowing the experimenter to exchange the texture for the next trial.
Each segment included an initial 1-second transition period, ensuring a full 10-second period of effective recording for each task. The resulting nominal timing can be used to segment the trials, as sketched below.
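For reference, the nominal timing described above can be turned into segmentation windows, as in the following sketch; the phase names are illustrative, and the exact boundaries should be checked against the recorded timestamps:

```python
def nominal_phase_windows(t0=0.0):
    """Nominal phase windows (in seconds, relative to recording onset) for one trial.

    Assumes the timing described above: a 1 s calibration wait, 15 s of free
    exploration, then four constrained segments of 10 s each, every segment
    preceded by a 1 s transition period (60 s in total).
    """
    windows = {}
    t = t0 + 1.0                                   # phase 0: "Wait" / load calibration
    windows["free_exploration"] = (t, t + 15.0)    # phase 1
    t += 15.0
    for name in ("circular_cw", "circular_ccw", "lateral", "proximal_distal"):
        t += 1.0                                   # 1 s transition
        windows[name] = (t, t + 10.0)              # 10 s of effective recording
        t += 10.0
    return windows
```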
Trial types
The five trials for each texture were divided into three types of trials (corresponding to the rows in Fig. 5), each designed to assess different aspects of tactile interaction:
During the first three trials (Standard), the speed and load were not constrained. The GUI displayed a static shape (circular for phases 2a and 2b, rectangular for phases 2c and 2d) positioned at the texture’s centre. Participants were required to move their finger periodically within this shape while maintaining contact with the texture. During the fourth trial (Constant Load), participants were asked to maintain a constant load of 4 ± 1 N. The GUI displayed a color gauge indicating the normal load exerted by the participant; the goal was to keep this load within the displayed limits. Finally, during the fifth trial (Constant Speed), participants were asked to track a moving target. The GUI displayed a circle moving at a constant speed of 0.1 m/s within the texture’s boundaries (Fig. 4B); the goal was to maintain contact with the moving target. The target’s motion followed a predefined trajectory in accordance with the trial’s phase.
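As an example, compliance with the Constant Load constraint can be checked from the recorded normal force; this is a minimal sketch in which the sign convention of the Fz channel is an assumption:

```python
import numpy as np

def load_compliance(fz, target=4.0, tol=1.0):
    """Fraction of samples whose normal load lies within target ± tol (in N).

    fz : (N,) normal-force samples (Fz channel of ft_sensor_data); taking the
         absolute value is an assumption about the sensor's sign convention.
    """
    load = np.abs(np.asarray(fz))
    return float(np.mean((load >= target - tol) & (load <= target + tol)))
```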
This protocol was designed to capture a comprehensive range of tactile interactions in terms of contact force, sliding speed and movement directions, providing rich data for subsequent analysis in perceptual, artificial perception, and machine learning studies.
Texture images
To complement the dynamic physical data, ten high-resolution images were captured from each camera per texture. Each image was taken with a different lighting configuration to capture the texture’s appearance under various conditions. These images are provided in the dataset to enable the study of the relationship between the physical properties of the surfaces and their visual appearance.
Psychophysical evaluation
We conducted a psychophysical evaluation to assess the subjective perception of the surfaces, complementing the physical description of the interactions. This evaluation was designed to capture the participants’ perception of the surfaces in terms of tactile, visual, and auditory aspects. It can be used to study the relationship between the physical properties of the surfaces and the participants’ subjective perceptions.
Fifteen participants, different from those of the first experiment, were recruited (7 males and 8 females, aged between 18 and 50 years, with no known sensory or motor impairments); they provided written informed consent before starting the experiment. The evaluation was conducted in a quiet room, and the participants were seated in front of a screen displaying the evaluation questions. The evaluation data were collected between October and November 2023.
Based on previous analyses8,34,35, 12 features were selected to evaluate the subjective perception of the surfaces, covering their tactile, visual, and auditory aspects. The features, described and listed in Table 2, were evaluated on a 1–10 scale.
The evaluation was divided into two parts: In the Individual Evaluation, participants were given a single texture at a time and were requested to slide along it with their index finger. They were then sequentially prompted with questions to rate the 12 features on a 1–10 scale (see GUI 1 in Fig. 6). In the Ranking Evaluation, participants were given all the textures at once and asked to rank them on the same scale for each question (see GUI 2 in Fig. 6). The order of the questions was randomized for each participant.
Psychophysical evaluation procedure. The evaluation was divided into two parts: Individual evaluation (Protocol 1 and GUI 1) and ranking evaluation (Protocol 2 and GUI 2). In the individual evaluation, participants were asked to rate the 12 features for each texture. In the ranking evaluation, participants ranked the textures for each feature.
Data Records
The data associated with this study is organized and archived in a public repository on figshare36, facilitating accessibility and reproducibility of the research findings. Each data record is detailed below, and the repository is cited accordingly.
Data Accessibility
The dataset is stored in HDF5 (.h5) format, suitable for a wide range of computational environments. Additional metadata is provided in JSON format. The specific structure is detailed below. The dataset is organized into structured directories, each corresponding to a participant and subdivided by textures, themselves subdivided by trial.
Physical data
The structure of the physical data recorded during the experiment is described in this section.
Naming convention
The following naming convention is used:
-
subject_[ID]: ID of the participant (subject_0 to subject_9).
-
texture_code: Code of the texture used during the trial (e.g., 7t).
-
trial_type: Type of trial (std_0, std_1, std_2, cte_force_3, cte_speed_4).
-
N: Number of data points recorded during the trial.
-
f64: 64-bit floating-point number.
-
int64: 64-bit integer.
Data structure
Each trial is stored in a separate file subject_[ID]_[texture_code]_[trial_type].h5 in HDF5 format, containing the following data:
-
ft_sensor_data: Array of Nft × 6f64 corresponding to the force and torque measured by the force/torque sensor: [Fx, Fy, Fz, Tx, Ty, Tz]. The forces are expressed in Newtons and the torques in Newton-meters.
-
time_ft_sensor: Array of Nft f64 timestamps (in second) of the data points.
-
pos_data: Array of N × 8 f64 corresponding to the position of the two markers (t1 and t2) coordinates in the reference frame of the left camera (cam1): [\({x}_{t1\in ca{m}_{1}}\), \({y}_{t1\in ca{m}_{1}}\), \({x}_{t2\in ca{m}_{1}}\), \({y}_{t2\in ca{m}_{1}}\)]. The position is expressed in decimeters.
-
time_pos: Array of Nposf64 timestamps (in s) of the data points.
-
kistler_data: Array of Nkistler × 4 f64 corresponding to the vibrations measured by the two accelerometers and the two microphones: [accelerometernail, accelerometerphalanx, microphoneleft, microphoneright]. The accelerations are expressed in g and the sound in kPa.
-
time_kistler: Array of Nkistler f64 timestamps of the data points.
Example
Here is an example of the structure of the data for the five trials of participant 0 sliding on texture 7t. The directory subject_0/7t/ contains the following files (a minimal loading sketch follows the listing):
- subject_0_7t_std_0.h5
- subject_0_7t_std_1.h5
- subject_0_7t_std_2.h5
- subject_0_7t_cte_force_3.h5
- subject_0_7t_cte_speed_4.h5
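Below is a minimal Python sketch (ours, not from the repository’s processing scripts) for reading one trial with h5py, assuming the datasets are stored at the root of each file under the names listed above:

```python
import h5py

# Minimal loading sketch for one trial file; dataset names follow the data
# structure documented above, and the root-level layout is an assumption.
path = "subject_0/7t/subject_0_7t_std_0.h5"
with h5py.File(path, "r") as f:
    ft = f["ft_sensor_data"][:]      # (N_ft, 6)      [Fx, Fy, Fz, Tx, Ty, Tz]
    t_ft = f["time_ft_sensor"][:]    # (N_ft,)        timestamps in seconds
    pos = f["pos_data"][:]           # (N_pos, 8)     marker coordinates
    t_pos = f["time_pos"][:]         # (N_pos,)
    kistler = f["kistler_data"][:]   # (N_kistler, 4) [acc_nail, acc_phalanx, mic_left, mic_right]
    t_k = f["time_kistler"][:]       # (N_kistler,)

print(ft.shape, pos.shape, kistler.shape)
```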
Texture pictures
The texture images are stored in the directory called img. There is a folder for each texture containing the ten shots taken by each of the two cameras (twenty images per texture). The images are stored in JPEG format with a resolution of 3264 × 3264 pixels. Each image is named [texture_code]_[camera_code]_[shot_number].jpg, with camera_code being either l or r for the left and right camera, respectively.
Questionnaire data
The questionnaire data is stored in a single JSON file with the following structure (an access sketch follows the description):
- subject_[ID]: One entry per subject.
  - texture_eval: Individual evaluation.
    - texture_code: One entry per texture.
      - question_code: One entry per question, containing the integer value given by the participant.
  - questions_rank: Ranking evaluation.
    - question_code: One entry per question, containing an array of 10 integers corresponding to the rank given by the participant for each texture.
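The following Python sketch (ours) illustrates how the file can be read, following the hierarchy above; the question code used for indexing is hypothetical (the actual codes are listed in Table 2):

```python
import json

# Access sketch for the questionnaire file, following the key hierarchy described above.
with open("psychophysical_evaluation.json") as f:
    evaluation = json.load(f)

subject = evaluation["subject_0"]
# Individual evaluation: one integer rating per (texture, question)
rating = subject["texture_eval"]["7t"]["roughness"]   # "roughness" is an illustrative code
# Ranking evaluation: one array of 10 integers (one rank per texture) per question
ranks = subject["questions_rank"]["roughness"]
print(rating, ranks)
```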
Data table
Physical data directory
The following Table 3 provides an overview of the physical data records:
Additionally, the psychophysical questionnaire data is stored in a single JSON file named psychophysical_evaluation.json.
Texture Images Directory
The images of each texture, independent of the participants, are stored in a separate directory called img. Each texture has its own folder containing the images taken from the two cameras.
The images are named as [texture_code]_[camera_code]_[shot_number].jpg.
Technical Validation
The dataset encompasses a wide range of tactile interactions, including friction-induced vibrations, load data and position data. It is further enriched with audio data, providing a comprehensive multimodal dataset for tactile analysis. To demonstrate the effectiveness of the dataset for capturing the complexities of tactile interactions, we have evaluated several combinations of modalities and features for surface classification. This provides a benchmark for future studies and demonstrates the importance of multimodal data for tactile analysis.
Data preprocessing
To carry out efficient classification of the various surfaces, we extracted several features from the raw data, which could then be used to train machine learning models. Concerning audio data, it was shown in a previous study37 that Mel-frequency cepstral coefficients, spectral roll-off, chromagram, pitch, spectral centroid, spectral bandwidth, and RMS energy work well for discriminating between different classes.
Compared to the more established fields of audio and image processing, the domain of tactile data analysis has fewer well-defined features, especially for analyzing friction-induced vibratory signals. Building upon our previous work38, we address this challenge by using audio-inspired features such as Mel-frequency cepstral coefficients (MFCCs), root mean square (RMS) energy, pitch, spectral centroid, spectral bandwidth, zero crossing rate (ZCR), and spectral roll-off. We hypothesize that these audio signal processing features can also effectively capture the characteristics of tactile signals. These features are described in Table 4.
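As an illustration of such audio-inspired features, the sketch below computes a summary feature vector for a single audio or vibration channel using librosa; the choice of library, the number of MFCCs, and the pitch range are our assumptions, not specifications from the paper:

```python
import numpy as np
import librosa

def summary_features(signal, sr=12500, n_mfcc=13):
    """Audio-inspired summary features for one audio or vibration channel.

    Each frame-wise feature is summarised by its mean over the segment. The use
    of librosa and the choice of 13 MFCCs are assumptions made for this sketch.
    """
    y = np.asarray(signal, dtype=np.float32)
    framewise = [
        librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc),
        librosa.feature.rms(y=y),
        librosa.feature.spectral_centroid(y=y, sr=sr),
        librosa.feature.spectral_bandwidth(y=y, sr=sr),
        librosa.feature.spectral_rolloff(y=y, sr=sr),
        librosa.feature.zero_crossing_rate(y),
    ]
    pitch = librosa.yin(y, fmin=50, fmax=2000, sr=sr)   # fundamental-frequency track
    features = [f.mean(axis=1) for f in framewise] + [np.array([np.nanmean(pitch)])]
    return np.concatenate(features)
```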
Set of modalities
We focus our evaluation on the following set of modalities: Audio, using only the audio data captured by the microphones; Acceleration, using only friction-induced vibration data measured by the accelerometers; and Tactile, which includes a combination of load, position and friction-induced vibration data.
Classifier
We classified the data with a Random Forest (RF), a popular ensemble learning method that combines multiple decision trees to improve classification performance. RF is well suited to this large, multi-modal dataset: it handles high-dimensional data, is robust to overfitting, and copes with imbalanced datasets, a common issue in tactile data classification. The RF classifier was trained on the features extracted from the tactile, audio, and acceleration data to classify the surfaces.
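For reference, a minimal sketch of such a classifier using scikit-learn is given below; the hyperparameters and cross-validation scheme are illustrative assumptions rather than the settings used for the reported results:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def texture_classification_scores(X, y, n_trees=200, seed=0):
    """Cross-validated texture classification from pre-computed features.

    X : (n_segments, n_features) feature matrix (e.g., the summary features above,
        concatenated over the selected modalities)
    y : (n_segments,) texture labels
    The number of trees and the 5-fold split are choices made for this sketch.
    """
    clf = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    acc = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
    f1 = cross_val_score(clf, X, y, cv=5, scoring="f1_macro")
    return acc.mean(), f1.mean()
```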
Results
The results of the classification are presented in Fig. 7 and Table 5. The models incorporating tactile data exhibited superior classification accuracy compared to those relying solely on audio or acceleration data. The combined tactile model achieved the highest accuracy and F1 score, highlighting the effectiveness of combining multiple tactile data sources.
Classification Performance Comparison on Main Phase Data between Uni-modal and Multi-modal Models with Random Forest Classifier. The standard deviations are listed in Table 5.
While the audio modality achieved lower accuracy compared to the other modalities, it still resulted in more than 65% classification accuracy (much greater than the 10% chance level). The audio data from the right microphone performed slightly better than the left microphone, which could be due to slight differences in the microphone placement and the characteristics of external noise. Combining the two audio sources did not significantly improve the classification performance, indicating that the two audio sources do not provide complementary information.
The tactile modality outperformed the audio modality, achieving an average classification accuracy of 75.66% and an F1 score of 75.50%. The tactile data from the nail accelerometer performed significantly better than the phalanx accelerometer, indicating that the nail accelerometer captures more relevant information about the tactile interactions. This difference could be explained by the more rigid fixation of the nail accelerometer compared to the phalanx accelerometer. Nevertheless, the combination of the two tactile data sources significantly improved the classification performance, indicating that the two tactile data sources provide complementary information.
A key advantage of our dataset is the measurement of the fingertip position and speed during the tactile interactions: The classification performance using only the acceleration and load data was significantly lower (73.53% accuracy and 73.48% F1 score) than the full tactile data, indicating the importance of the position and speed information for tactile data classification.
The combination of audio and tactile data achieved the highest classification performance, with an average accuracy of 83.89% and an F1 score of 83.87%. This indicates that the audio and tactile data provide complementary information about the tactile interactions. Again, to show the importance of the position and speed information, we also evaluated the performance of the audio and tactile data without the position and speed information. The classification performance using only the acceleration and load data was significantly lower (76.34% accuracy and 76.27% F1 score) compared to the full audio and tactile data, indicating once more the importance of the position and speed information in tactile data classification.
Questionnaire-based evaluation of textured surface perception
Individual texture evaluation
In this first phase of the questionnaire, the textures were evaluated individually, requiring participants to use their own internal reference to rate the textures. This subjective framing led to large variability in the rating scores. For instance, texture “83t” (ribbed aluminium) received consistent scores for “hardness” (mean rating of 10, standard deviation < 0.1, with only two participants differing), while more subjective criteria such as “change” exhibited larger variability, reflecting differences in individual perception (Fig. 8).
Differential rating of texture characteristics
The differential evaluation of the textures for each question enables us to evaluate the proximity of the textures in terms of their perceived characteristics, and to identify the characteristics most relevant for differentiating the textures. To assess variations in texture ratings across questions, the ratings v = (v1, …, v12) for the 12 questions were normalized across all textures {t} and participants {p} with

$${\overline{v}}_{i}(p,t)=\frac{{v}_{i}(p,t)-\mathop{\min }\limits_{\{p\},\{t\}}{v}_{i}}{\mathop{\max }\limits_{\{p\},\{t\}}{v}_{i}-\mathop{\min }\limits_{\{p\},\{t\}}{v}_{i}}.$$

If \(\min {v}_{i}=\max {v}_{i}\), we set \({\overline{v}}_{i}(p,t)\equiv 0.5\) to denote a neutral ranking. This avoids bias due to the range of ratings used in response to a specific question. The normalized ratings were then used to create a vector representation for each texture, where each element of the vector corresponds to the normalized rating for one question. This vector representation can then be used to compare the textures in terms of their perceived characteristics.
Principal Component Analysis (PCA) was then applied to the normalized ratings to identify the principal components that capture the most variance in the dataset.
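A sketch of this normalization and projection, assuming the ratings are arranged with one row per (participant, texture) pair and one column per question:

```python
import numpy as np
from sklearn.decomposition import PCA

def normalize_and_project(ratings, n_components=4):
    """Min-max normalize each question across all (participant, texture) samples, then PCA.

    ratings : (n_samples, 12) array with one row per (participant, texture) pair and
              one column per question. Questions with a constant rating are set to 0.5,
              as described above.
    """
    lo = ratings.min(axis=0)
    span = ratings.max(axis=0) - lo
    safe_span = np.where(span > 0, span, 1.0)
    normalized = np.where(span > 0, (ratings - lo) / safe_span, 0.5)
    pca = PCA(n_components=n_components)
    scores = pca.fit_transform(normalized)   # sample coordinates in PC space
    return scores, pca.explained_variance_ratio_, pca.components_
```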
The scatter plot in Fig. 9a visualizes the distribution of textures in the reduced two-dimensional space, where each texture is represented by a distinct color. The proximity of the textures in the reduced space gives insight into the perceived similarities and differences among them. When analyzing the principal components, the weights of each rating criterion indicate its contribution to the variance captured. In Fig. 9c we see that the first principal component is strongly influenced by the ‘hardness’ rating, while the second principal component is mainly influenced by the ‘orderliness’, ‘3D’, and ‘roughness’ ratings. This analysis provides insight into the factors that most influence subjective texture differentiation.
Principal Component Analysis of the normalized rankings of the textures. (a) Scatter plot of the textures in the reduced two-dimensional space defined by the first two principal components. To provide insight into the variability of the subjective evaluations within each texture, the mean and standard deviation of the samples in each texture group are visualized as a star marker and a circle, respectively (b) Percentage of variance explained by the first seven principal components. (c) Weights and modality of each question in the first four principal components.
However, the explained variance of the first two principal components is only 24% and 20% respectively (see Fig. 9b), indicating that the first two principal components capture only a small portion of the total variance in the dataset. This suggests that the subjective evaluations of the textures are influenced by a wide range of characteristics, and that the first two principal components may not capture the complexity of the texture evaluations well.
Code availability
The dataset described in the section “Data Records” and the accompanying processing scripts are available in an online repository36. The repository includes a README.md file with detailed instructions on how to access and use the dataset.
References
Durrant-Whyte, H. F. Sensor models and multisensor integration. The International Journal of Robotics Research 7, 97–113 (1988).
Ernst, M. O. & Bülthoff, H. H. Merging the senses into a robust percept. Trends in Cognitive Sciences 8, 162–169, https://doi.org/10.1016/j.tics.2004.02.002 (2004).
Bertelson, P. & De Gelder, B. The psychology of multimodal perception. Crossmodal Space and Crossmodal Attention 141–177 (2004).
Munoz, N. E. & Blumstein, D. T. Multisensory perception in uncertain environments. Behavioral Ecology 23, 457–462 (2012).
Anderson, B. L. Stereoscopic surface perception. Neuron 24, 919–928 (1999).
Akay, A. Acoustics of friction. The Journal of the Acoustical Society of America 111, 1525–1548, https://doi.org/10.1121/1.1456514 (2002).
Kassuba, T., Menz, M. M., Röder, B. & Siebner, H. R. Multisensory interactions between auditory and haptic object recognition. Cerebral Cortex 23, 1097–1107 (2013).
Klatzky, R. L., Lederman, S. J. & Matula, D. E. Imagined haptic exploration in judgments of object properties. Journal of Experimental Psychology: Learning, Memory, and Cognition 17, 314–322, https://doi.org/10.1037/0278-7393.17.2.314 (1991).
Lederman, S. J. & Klatzky, R. L. Haptic perception: A tutorial. Attention, Perception, & Psychophysics 71, 1439–1459, https://doi.org/10.3758/APP.71.7.1439 (2009).
Lécuyer, A., Burkhardt, J.-M. & Etienne, L. Feeling bumps and holes without a haptic interface: the perception of pseudo-haptic textures. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI, 239–246, https://doi.org/10.1145/985692.985723 (2004).
Guest, S., Catmur, C., Lloyd, D. & Spence, C. Audiotactile interactions in roughness perception. Experimental Brain Research 146, 161–171, https://doi.org/10.1007/s00221-002-1164-z (2002).
Culbertson, H., Unwin, J., Goodman, B. E. & Kuchenbecker, K. J. Generating haptic texture models from unconstrained tool-surface interactions. In Proc IEEE World Haptics Conference (WHC), 295–300, https://doi.org/10.1109/WHC.2013.6548424 (2013).
Culbertson, H., López Delgado, J. J. & Kuchenbecker, K. J. One hundred data-driven haptic texture models and open-source methods for rendering on 3D objects. In Proc IEEE Haptics Symposium (HAPTICS), 319–325, https://doi.org/10.1109/HAPTICS.2014.6775475 (2014).
Strese, M. et al. A haptic texture database for tool-mediated texture recognition and classification. In Proc IEEE International Symposium on Haptic, Audio and Visual Environments and Games (HAVE), 118–123, https://doi.org/10.1109/HAVE.2014.6954342 (2014).
Zheng, H. et al. Deep learning for surface material classification using haptic and visual information. IEEE Transactions on Multimedia 18, 2407–2416, https://doi.org/10.1109/TMM.2016.2598140 (2016).
Strese, M., Boeck, Y. & Steinbach, E. Content-based surface material retrieval. In Proc IEEE World Haptics Conference (WHC), 352–357, https://doi.org/10.1109/WHC.2017.7989927 (2017).
Strese, M., Brudermueller, L., Kirsch, J. & Steinbach, E. Haptic Material Analysis and Classification Inspired by Human Exploratory Procedures. IEEE Transactions on Haptics 13, 404–424, https://doi.org/10.1109/TOH.2019.2952118 (2020).
Burka, A. et al. Proton: A visuo-haptic data acquisition system for robotic learning of surface properties. In Proc. IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), 58–65, https://doi.org/10.1109/MFI.2016.7849467 (2016).
Toscani, M. & Metzger, A. A database of vibratory signals from free haptic exploration of natural material textures and perceptual judgments (viper): Analysis of spectral statistics. In Seifi, H. et al. (eds.) Haptics: Science, Technology, Applications, 319–327 (Springer, Cham, 2022).
Lu, S., Zheng, M., Fontaine, M. C., Nikolaidis, S. & Culbertson, H. Preference-Driven Texture Modeling Through Interactive Generation and Search. IEEE Transactions on Haptics 15, 508–520, https://doi.org/10.1109/TOH.2022.3173935 (2022).
Wiertlewski, M., Hudin, C. & Hayward, V. On the 1/f noise and non-integer harmonic decay of the interaction of a finger sliding on flat and sinusoidal surfaces. In Proc IEEE World Haptics Conference (WHC), 25–30 (2011).
Platkiewicz, J., Mansutti, A., Bordegoni, M. & Hayward, V. Recording device for natural haptic textures felt with the bare fingertip. In Proc. EuroHaptics Conference, 521–528 (Springer, 2014).
Tanaka, Y., Horita, Y. & Sano, A. Finger-mounted skin vibration sensor for active touch. In Proc EuroHaptics Conference, 169–174 (Springer, 2012).
Kirsch, L. P., Job, X. E., Auvray, M. & Hayward, V. Harnessing tactile waves to measure skin-to-skin interactions. Behavior Research Methods 53, 1469–1477, https://doi.org/10.3758/s13428-020-01492-3 (2021).
Balasubramanian, J. K., Kodak, B. L. & Vardar, Y. Sens3: Multisensory database of finger-surface interactions and corresponding sensations. In Kajimoto, H. et al. (eds.) Haptics: Understanding Touch; Technology and Systems; Applications and Interaction, 262–277 (Springer, 2025).
Weber, A. I. et al. Spatial and temporal codes mediate the tactile perception of natural textures. Proc. Natl. Acad. Sci. 110, 17107–17112, https://doi.org/10.1073/pnas.1305509110 (2013).
Dandu, B., Shao, Y. & Visell, Y. Rendering spatiotemporal haptic effects via the physics of waves in the skin. IEEE Transactions on Haptics 14, 347–358, https://doi.org/10.1109/TOH.2020.3029768 (2021).
Kawazoe, A., Reardon, G., Woo, E., Luca, M. D. & Visell, Y. Tactile echoes: Multisensory augmented reality for the hand. IEEE Transactions on Haptics 14, 835–848, https://doi.org/10.1109/TOH.2021.3084117 (2021).
Xu, D., Loeb, G. E. & Fishel, J. A. Tactile identification of objects using Bayesian exploration. In Proc IEEE International Conference on Robotics and Automation (ICRA), 3056–3061, https://doi.org/10.1109/ICRA.2013.6631001 (2013).
Friesen, R. F., Wiertlewski, M., Peshkin, M. A. & Colgate, J. E. Bioinspired artificial fingertips that exhibit friction reduction when subjected to transverse ultrasonic vibrations. In Proc IEEE World Haptics Conference (WHC), 208–213 (2015).
Camillieri, B. & Bueno, M.-A. Artificial finger design for investigating the tactile friction of textile surfaces. Tribology International 109, 274–284 (2017).
Kothe, C., Medine, D., Boulay, C., Grivich, M. & Stenner, T. Lab streaming layer. https://github.com/sccn/labstreaminglayer (2014).
Bergmann Tiest, W. M. & Kappers, A. M. L. Analysis of haptic perception of materials by multidimensional scaling and physical measurements of roughness and compressibility. Acta Psychologica 121, 1–20, https://doi.org/10.1016/j.actpsy.2005.04.005 (2006).
Baumgartner, E., Wiebel, C. B. & Gegenfurtner, K. R. Visual and haptic representations of material properties. Multisensory Research 26, 429–455, https://doi.org/10.1163/22134808-00002429 (2013).
Ujitoko, Y. & Ban, Y. Toward designing haptic displays for desired touch targets: A study of user expectation for haptic properties via crowdsourcing. IEEE Transactions on Haptics 16, 726–735, https://doi.org/10.1109/TOH.2023.3310662 (2023).
Devillard, A., Ramasamy, A., Cheng, X., Faux, D. & Burdet, E. Tactile, Audio, and Visual Dataset During Bare Finger Interaction with Textured Surfaces, https://doi.org/10.6084/m9.figshare.26965267 (2024).
Chaudhari, R., Çizmeci, B., Kuchenbecker, K. J., Choi, S. & Steinbach, E. Low bitrate source-filter model based compression of vibrotactile texture signals in haptic teleoperation. In Proc ACM International Conference on Multimedia, (MM), 409–418, https://doi.org/10.1145/2393347.2393407 (2012).
Devillard, A., Ramasamy, A., Faux, D., Hayward, V. & Burdet, E. Concurrent haptic, audio, and visual data set during bare finger interaction with textured surfaces. In Proc IEEE World Haptics Conference (WHC), 101–106, https://doi.org/10.1109/WHC56415.2023.10224372 (2023).
Jiao, J., Zhang, Y., Wang, D., Guo, X. & Sun, X. HapTex: A Database of Fabric Textures for Surface Tactile Display. In Proc IEEE World Haptics Conference (WHC), 331–336, https://doi.org/10.1109/WHC.2019.8816167 (2019).
Vardar, Y., Wallraven, C. & Kuchenbecker, K. J. Fingertip Interaction Metrics Correlate with Visual and Haptic Perception of Real Surfaces. In Proc IEEE World Haptics Conference (WHC), 395–400, https://doi.org/10.1109/WHC.2019.8816095 (2019).
Acknowledgements
This work was initiated with Vincent Hayward, was motivated by discussions with him, and he contributed to the development of the experimental setup and protocol. It was in part funded by the EU H2020 grants PH-CODING (FETOPEN 829186) and INTUITIVE (ITN PEOPLE 861166).
Author information
Authors and Affiliations
Contributions
Conceived and designed the experiments: A.D., A.R., X.C., D.F., E.B. Designed the experimental setup: A.D. Conducted the experiments: A.D. Analyzed the data: A.D., A.R. Wrote the paper: A.D., A.R., X.C., D.F., E.B. All authors have reviewed the manuscript and agree with its content.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.