Abstract
Mobile robots are used in various fields, from deliveries to search and rescue applications. Different types of sensors are mounted on the robot to provide accurate navigation and, thus, allow successful completion of its task. In real-world scenarios, due to environmental constraints, the robot frequently relies only on its inertial sensors. Therefore, due to noise and other error terms associated with the inertial readings, the navigation solution drifts in time. To mitigate the inertial solution drift, we propose the MoRPINet framework, consisting of a neural network to regress the robot’s travelled distance. To this end, we require the mobile robot to maneuver in a snake-like slithering motion to encourage nonlinear behavior. MoRPINet was evaluated using a dataset of 290 minutes of inertial recordings during field experiments and showed an improvement of 33% in the positioning error over other state-of-the-art methods for pure inertial navigation.
Introduction
A number of key factors contribute to the growth of mobile robots, including increased efficiency, productivity, and versatility. Mobile robots have many applications, including fruit picking and monitoring in agriculture, indoor and outdoor deliveries, inspection and transportation on construction sites, data collection for multiple purposes, and operation in hazardous environments.
There are several kinds of mobile robots, including legged robots, tracked robots (slip locomotion), and wheel-based mobile robots1. The latter are easier to design, cheaper to build, and less complex to control. One of the most crucial aspects of mobile robot design is its navigation capability, which is responsible for determining the robot’s position and orientation. To that end, the robot relies on different sensors. Commonly, positioning is performed using sensors such as a camera2, LiDAR3, sonar4, global navigation satellite system (GNSS)5, inertial sensors6, or an odometer7. A nonlinear filter, such as the extended Kalman filter8, can also be used to combine these sensors with inertial sensors. However, in real-world scenarios, situations commonly exist in which only inertial sensor readings are available for positioning9,10. For instance, vision-based systems are susceptible to degradation in low-light conditions, while GNSS signals become inaccessible in indoor environments. Similarly, radio-based positioning solutions such as UWB or WiFi require the installation of dedicated hardware, such as transmitters, within the buildings in which the robot operates, limiting their applicability. Also, such solutions may experience limitations due to blind spots or signal obstruction caused by surrounding materials along the robot’s trajectory.
The inertial sensors are grouped in an inertial measurement unit (IMU), which includes three perpendicular accelerometers and three perpendicular gyroscopes11. Inertial sensors are commonly used because of their simplicity of installation, small size, low cost, and high sampling rate. Yet, when integrating the inertial sensor readings, the navigation solution drifts over time because of the noise and other error terms.
Recently, machine and deep learning approaches have demonstrated significant advancements over model-based methods in inertial sensing across various platforms12,13,14, including autonomous underwater vehicles15,16, quadrotors17,18, and pedestrians19,20. In mobile robot navigation, deep learning is primarily utilized with vision systems, building on extensive research in neural networks for image processing. The two main methods for vision-based navigation involve comparing landmarks to a predefined map21,22 or creating a real-time map for simultaneous localization and mapping23,24.
Inspired by the locomotion of snakes25,26, a novel method has been proposed that incorporates the serpentine movement of a mobile robot, along with a corresponding algorithm, to enhance positioning accuracy. Snakes use serpentine slithering to compensate for their lack of legs, enabling them to move efficiently, conserve energy, and maintain excellent maneuverability even in rough terrain. By adopting this type of locomotion in mobile robots, maneuverability can be preserved, albeit with a potential increase in power consumption. Additionally, this mode of locomotion enriches the sensor measurements, leading to a high signal-to-noise ratio and high positional accuracy. Initially, the concept was explored with quadrotors, where Shurin et al.27 developed a model-based method to estimate quadrotor positions and later17 improved this solution by employing a deep learning network to estimate step length and altitude. Further research demonstrated that using multiple IMUs on the quadrotor yields better results28. Recently, we proposed the Mobile Robot Pure Inertial Navigation (MoRPI) framework29, which is based on periodic movement and employs an empirical formula to determine step length, similar to the approach used with quadrotors. However, mobile robot motion exhibits low amplitudes, slow dynamics, minimal gravity changes, and frequent periods; hence, it requires unique considerations that cannot be directly transferred from quadrotor and pedestrian solutions, which makes MoRPI an excellent alternative to pure inertial navigation solutions.
In MoRPI, the method requires an additional calibration phase for gain calibration. The gain is sensitive to motion parameters and is a primary factor contributing to position error. Furthermore, the solution is inherently limited to peak-to-peak segments, which reduces the update rate of the positioning.
Therefore, in this paper, we present a deep-learning-based algorithm incorporating serpentine dynamics in a wheeled robot. The contributions of this paper are:
1.
Inspired by snake-like slithering motion, we present an approach for estimating distance increments for mobile robots in a situation of pure inertial navigation by using deep learning and inertial sensors. Emulating snake-like slithering motion increases the signal-to-noise ratio of the inertial sensor data and enables accurate navigation.
2.
Our dataset contains 290 minutes of inertial recordings from five IMUs sampled at 120 Hz (58 minutes each), with ground-truth trajectories from a GNSS-RTK sensor sampled at 10 Hz with a standard deviation of 0.1 m. The dataset and the code are publicly available and can be found here: https://github.com/ansfl/MoRPINet.
Our approach achieves better resolution by using small time windows and enhances robustness by training on a diverse dataset. As a result, it shows a 99% improvement compared to traditional INS solutions and a 33% improvement over MoRPI, with an update rate of 5 Hz, about 20 times faster than MoRPI. The proposed neural network is also lightweight and can be implemented on edge devices. The small amplitude suggested in this manuscript enables the robot to plan its path effectively using various sensors, such as cameras and LiDAR, scan and classify the environment, and transport objects, without limiting most typical mobile robot applications. We demonstrate the effectiveness of our method through field experiments using a mobile robot equipped with RTK-GNSS and an IMU.
The rest of the paper is organized as follows: Section 2 presents the INS equations and the MoRPI method. Section 3 describes the proposed approach. Section 4 explains the experiments and gives the results, and the final section gives the conclusions of this paper.
Model-based approaches
This section provides a brief overview of two model-based solutions used later in comparison to our proposed approach.
Inertial navigation system
An inertial navigation system (INS) provides a complete navigation solution, consisting of the position vector, velocity vector, and orientation. The INS equations of motion are commonly expressed in the navigation frame with north-east-down coordinates30. As mobile robot navigation is addressed, a local coordinate frame (l-frame) is adopted. It is located at the initial position of the robot and its axes point in the north-east-down directions. The position vector rate of change is
$$\begin{aligned} \dot{\varvec{p}}^{l} = \varvec{v}^{l} \end{aligned}$$(1)
where \(\varvec{p}^l\) is the position vector expressed in the l-frame and \(\varvec{v}^{l}\) is the velocity vector expressed in the l-frame.
The velocity rate of change is:
$$\begin{aligned} \dot{\varvec{v}}^{l} = \textbf{R}^{l}_{b} \varvec{f}^{b}_{ib} + \varvec{g}^{l} \end{aligned}$$(2)
where \(\varvec{g}^{l}\) is the gravity vector expressed in the l-frame, \(\textbf{R}^{l}_{b}\) is the transformation matrix from the body frame to the l-frame and \(\varvec{f}^{b}_{ib}\) is the specific force vector measured by the accelerometer and expressed in body frame.
The rate of change of the transformation matrix is given by:
$$\begin{aligned} \dot{\textbf{R}}^{l}_{b} = \textbf{R}^{l}_{b} \mathbf {\Omega }^{b}_{ib} \end{aligned}$$(3)
where \(\mathbf {\Omega }^{b}_{ib}\) is the skew-symmetric form of the angular rate measured by the gyroscope, expressed in the body frame.
Notice that as our scenarios include low-cost inertial sensors and short time periods, the earth turn rate and the transport rate are neglected in (2)-(3).
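The mechanization (1)-(3) can be sketched numerically. The following single Euler-integration step is a minimal illustration under the stated assumptions (l-frame with NED coordinates, neglected Earth turn rate and transport rate); the function and variable names are illustrative, not a reference implementation:

```python
# Minimal sketch of one strapdown INS step per (1)-(3); illustrative only.
import numpy as np

def skew(w):
    """Skew-symmetric matrix of a 3-vector (Omega^b_ib in (3))."""
    wx, wy, wz = w
    return np.array([[0.0, -wz, wy],
                     [wz, 0.0, -wx],
                     [-wy, wx, 0.0]])

def ins_step(p, v, R, f_b, w_b, dt, g=np.array([0.0, 0.0, 9.81])):
    """One Euler step of (1)-(3): p, v in the l-frame (NED, down positive),
    R the body-to-l-frame matrix, f_b the specific force, w_b the gyro rates."""
    R_new = R @ (np.eye(3) + skew(w_b) * dt)   # (3): R_dot = R * Omega
    v_new = v + (R @ f_b + g) * dt             # (2): v_dot = R f + g
    p_new = p + v * dt                         # (1): p_dot = v
    return p_new, v_new, R_new
```

For a stationary unit with the accelerometer sensing only the gravity reaction, the velocity and position remain unchanged, which is a quick sanity check of the sign conventions.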
MoRPI
The MoRPI approach is based on periodic movement and employs an empirical formula to determine the peak-to-peak distance29. It uses accelerometer or gyroscope readings for peak-to-peak event detection. Then, using Weinberg’s step-length estimation approach, the peak-to-peak distance is estimated by:
$$\begin{aligned} s = G \cdot \sqrt[4]{\max \left( \varvec{f}^{b}\right) - \min \left( \varvec{f}^{b}\right) } \end{aligned}$$(4)
where s is the peak-to-peak distance, G is the approach’s gain, and \(\varvec{f}^b\) is the sequence of accelerometer readings between two successive peak detections.
Two MoRPI approaches are available:
-
MoRPI-A: Uses the accelerometer readings for peak detection. The most sensitive axis in the snake-like slithering motion is the one perpendicular to the direction of motion progress. Assume the accelerometer sensitive axes are: x points towards the path segment end point, y is the axis perpendicular to x in the ground plane, and z is oriented to complete the right-hand rule with the x and y axes. Then, the specific force readings along the y axis are plugged into (4).
-
MoRPI-G: Uses the gyroscope readings for peak detection. In limited-dynamics cases, for example when driving in narrow passages, the signal-to-noise ratio in the accelerometer readings is low. Hence, an approach that relies only on the gyroscope was suggested. Here, the z-axis angular rate readings, \(\varvec{\omega }_z\), are used in (4) instead of the \(\varvec{f}_{y}^b\) axis of the accelerometer.
The MoRPI gain, G, is estimated by moving the robot a known distance before the method can be used. Each approach requires a different gain value. The peak-to-peak distance estimation, together with the gyro-based heading (3) and initial conditions, is used to propagate the mobile robot’s two-dimensional position by
$$\begin{aligned} x_{k+1} = x_{k} + s_{k} \cos \left( \psi _{k}\right) \end{aligned}$$(5)
$$\begin{aligned} y_{k+1} = y_{k} + s_{k} \sin \left( \psi _{k}\right) \end{aligned}$$(6)
where x and y are the robot’s position coordinates, \(\psi\) is the heading, and k is the peak index.
The MoRPI model assumes that the trajectory is composed of connected straight lines due to the position propagation model in (5)-(6), which is based on the peak-to-peak distance.
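A minimal sketch may clarify the MoRPI procedure. The Weinberg form used here, \(s = G \cdot \sqrt[4]{\max(\varvec{f}) - \min(\varvec{f})}\), and the function names are our reading of the description above; peak detection itself is omitted:

```python
# Illustrative sketch of the MoRPI peak-to-peak distance and propagation;
# G is assumed to be pre-calibrated on a known distance.
import numpy as np

def weinberg_distance(signal, G):
    """Peak-to-peak distance from readings between two detected peaks."""
    return G * (np.max(signal) - np.min(signal)) ** 0.25

def morpi_propagate(x, y, s, psi):
    """One peak-to-peak position propagation step, heading psi in radians."""
    return x + s * np.cos(psi), y + s * np.sin(psi)
```

For MoRPI-A, `signal` would hold the y-axis specific force between two peaks; for MoRPI-G, the z-axis angular rate, each with its own calibrated gain.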
MoRPINet framework
Mobile robots equipped with inertial sensors often generate noisy readings with a low signal-to-noise ratio (SNR), particularly when traveling at a nearly constant velocity. This issue is especially pronounced in straight-line trajectories, where the lack of dynamic motion further reduces the SNR, leading to significant drift and positioning errors in both neural-network-based and model-based INS (1)-(3) approaches. To mitigate these challenges, a serpentine locomotion pattern is adopted. By its nature, serpentine locomotion is a dynamic motion with angular velocity and linear acceleration. As a consequence, this motion is expected to enrich the inertial signals and increase the SNR for both accelerometers and gyroscopes, providing more features in the data and enabling neural networks to extract relevant positioning information. To this end, we propose MoRPINet, a pure inertial positioning approach. MoRPINet requires the mobile robot to move with serpentine dynamics and splits the trajectory reconstruction into two parts:
1.
Distance Estimation: D-Net, a neural network architecture for distance regression, is proposed. It is a simple yet efficient structure consisting of a one-dimensional convolutional (1DCNN) layer and a fully connected (FC) head for distance estimation, based only on the inertial readings.
2.
Heading Estimation: The well-established Madgwick filter31 is adopted for heading estimation based on the inertial readings.
Our proposed approach, MoRPINet, is illustrated in Figure 1. In the following subsections we elaborate on each part of MoRPINet.
D-Net: distance estimation network
The network architecture of the proposed approach is based on 1DCNN and FC layers. The network receives a time window with n samples of data from the three-axis gyroscope and three-axis accelerometer. Initially, the input data is processed through a convolutional layer comprising seven filters, each with a size of 2x1. Subsequently, the extracted features are flattened, and dropout is applied to prevent overfitting. The raw input is also flattened and concatenated with the output of the dropout layer. This data is fed through two FC layers with 512 and 32 neurons, each followed by dropout and layer normalization. Both the convolutional and fully connected layers utilize ReLU activation functions32 to introduce nonlinearities into the model. An illustration of the architecture is given in Figure 2. The output of the network is a regressed value for the travelled distance in the given time window.
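The structure described above can be sketched in PyTorch. The layer ordering and feature sizes below follow our reading of the text and Figure 2, so treat this as an illustrative sketch rather than the authors' released code (linked in the introduction):

```python
# Sketch of D-Net: 1DCNN features concatenated with the flattened raw
# input, followed by an FC regression head; sizes are our assumptions.
import torch
import torch.nn as nn

class DNet(nn.Module):
    def __init__(self, channels=6, window=24):
        super().__init__()
        self.conv = nn.Conv1d(channels, 7, kernel_size=2)  # 7 filters, size 2
        self.drop_in = nn.Dropout(0.1)
        feat = 7 * (window - 1) + channels * window        # conv out + raw input
        self.fc1 = nn.Linear(feat, 512)
        self.norm1 = nn.LayerNorm(512)
        self.fc2 = nn.Linear(512, 32)
        self.norm2 = nn.LayerNorm(32)
        self.drop = nn.Dropout(0.5)
        self.out = nn.Linear(32, 1)

    def forward(self, x):                              # x: (batch, 6, 24)
        f = self.drop_in(torch.relu(self.conv(x)).flatten(1))
        f = torch.cat([x.flatten(1), f], dim=1)        # concat raw input
        f = self.norm1(self.drop(torch.relu(self.fc1(f))))
        f = self.norm2(self.drop(torch.relu(self.fc2(f))))
        return self.out(f).squeeze(-1)                 # distance per window
```

With a 6 x 24 window, the concatenated feature vector has 7 x 23 + 6 x 24 = 305 elements, which keeps the model small enough for edge deployment.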
MoRPINet training process
In the process of training a deep learning network, the main objective is to determine the weights and biases that solve the given problem. Assuming an \(m_{1} \times m_{2}\) filter (or kernel), the output of the convolutional layer can be written as follows33:
$$\begin{aligned} \varvec{z}_{\dot{\imath }\dot{\jmath }}^{(r)} = \sum _{\alpha =1}^{m_{1}} \sum _{\beta =1}^{m_{2}} \varvec{\omega }_{\alpha \beta }^{(r)} \varvec{a}_{(\dot{\imath }+\alpha )(\dot{\jmath }+\beta )}^{(\ell -1)} + \varvec{b}^{(r)} \end{aligned}$$(7)
where \(\varvec{\omega }_{\alpha \beta }^{(r)}\) is the weight in the \((\alpha ,\beta )\) position of the \(r^{th}\) convolutional layer, \(\varvec{b}^{(r)}\) represents the bias of the \(r^{th}\) convolutional layer, and \(\varvec{a}_{\dot{\imath }\dot{\jmath }}^{(\ell -1)}\) is the output of the preceding layer.
The fully connected layers are composed of neurons. The following equation expresses the output of each neuron:
$$\begin{aligned} \varvec{z}_{\dot{\imath }}^{(\ell )} = \sum _{\dot{\jmath }=1}^{n_{\ell -1}} \varvec{\omega }_{\dot{\imath }\dot{\jmath }}^{(\ell )} \varvec{a}_{\dot{\jmath }}^{(\ell -1)} + \varvec{b}_{\dot{\imath }}^{(\ell )} \end{aligned}$$(8)
where \(\varvec{\omega }_{\dot{\imath }\dot{\jmath }}^{(\ell )}\) is the weight of the \(\dot{\imath }^{th}\) neuron in the \(\ell ^{th}\) layer associated with the output of the \(\dot{\jmath }^{th}\) neuron in the \((\ell -1)^{th}\) layer, \(\varvec{a}_{\dot{\jmath }}^{(\ell -1)}\), \(\varvec{b}_{\dot{\imath }}^{(\ell )}\) represents the bias in layer \(\ell\) of the \(\dot{\imath }^{th}\) neuron, and \(n_{\ell -1}\) represents the number of neurons in the \(\ell -1\) layer.
Equation (8) represents a linear process. Therefore, for the network to cope with nonlinear problems, the neuron’s output \(\varvec{z}_{\dot{\imath }}^{(\ell )}\) has to go through a nonlinear activation function, \(h(\cdot )\), which results in34:
$$\begin{aligned} \varvec{a}_{\dot{\imath }}^{(\ell )} = h\left( \varvec{z}_{\dot{\imath }}^{(\ell )}\right) \end{aligned}$$(9)
Specifically, we employ, for all the layers in the model, the rectified linear unit (ReLU)35 as our nonlinear activation function \(h(\cdot )\), which has a strong mathematical and biological basis. The ReLU activation function is defined by
$$\begin{aligned} ReLU(x) = \max \left( 0, x\right) \end{aligned}$$(10)
The mean absolute error (MAE) loss function is used for the regression task:
$$\begin{aligned} MAE = \frac{1}{n} \sum _{\dot{\imath }=1}^{n} \left| \varvec{y}_{\dot{\imath }} - \hat{\varvec{y}}_{\dot{\imath }} \right| \end{aligned}$$(11)
where \(\varvec{y}_{\dot{\imath }}\) is the ground truth (GT) distance, \(\hat{\varvec{y}}_{\dot{\imath }}\) is the predicted value of the distance, and n is the number of items in the batch. To generate the prediction, the input has to go through (7)-(11), in a process called forward propagation36. The learning process is performed by a stochastic gradient descent method, where the weights and biases are updated by
$$\begin{aligned} \varvec{\theta } \leftarrow \varvec{\theta } - \eta \nabla _{\theta } J\left( \varvec{\theta }\right) \end{aligned}$$(12)
Here, \(J(\varvec{\theta })\) denotes the loss function, emphasizing that the parameter is now the vector \(\varvec{\theta }\) of weights and biases, \(\eta\) is the learning rate, and \(\nabla _{\theta }\) is the gradient operator.
In the suggested approach, the adaptive moment estimation (Adam) optimization algorithm37 was employed for better convergence of D-Net. In addition, we used learning rate reduction on plateau as scheduler with a factor of 0.5.
The number of epochs used in the training was 300 with a batch size of 2048. The initial learning rate was set to 0.0025. The dropout probability for the first dropout layer (applied after flattening) was set to 0.1, while for the two subsequent layers following the FC layers, it was 0.5.
For the training dataset, we used a window size of \(W=24\) accelerometer and gyroscope samples (\(6 \times 24\)) with an overlap of 12 samples between two successive time windows.
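The window segmentation above can be sketched as follows, assuming a (6, T) array of stacked accelerometer and gyroscope channels; the function name and array layout are illustrative:

```python
# Sketch of the training-window segmentation: 6 x 24 IMU windows with a
# 12-sample overlap (stride 12) over a (6, T) sensor stream.
import numpy as np

def make_windows(imu, window=24, stride=12):
    """Slice a (6, T) IMU stream into overlapping (6, window) segments."""
    n = (imu.shape[1] - window) // stride + 1
    return np.stack([imu[:, i*stride : i*stride + window] for i in range(n)])
```

Each resulting window is paired with the GT distance between the RTK fixes at its start and end, as described in the dataset section.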
Madgwick filter
We employ the Madgwick filter31 for heading determination due to its low computational load, high-rate solution, and the ability to generalise the filter to include magnetometer measurements in future work. The Madgwick filter has two main parts:
1.
Orientation from angular rate: The orientation is computed by numerically integrating the angular rates measurements from the three-axis gyroscope. First, a quaternion containing the measurement is defined as:
$$\begin{aligned} \varvec{\omega }_q = \begin{bmatrix} 0&\omega _x&\omega _y&\omega _z \end{bmatrix} \end{aligned}$$(13)where \(\varvec{\omega }_q\) is a quaternion and \(\omega _i, i=x,y,z\) are the angular rate components, as measured by the gyroscope. The quaternion rate of change is
$$\begin{aligned} \dot{\varvec{q}}_{w,t} = \frac{1}{2} \hat{\varvec{q}}_{t-1} \cdot \varvec{\omega }_q \end{aligned}$$(14)where \(\varvec{\hat{q}}_{t-1}\) is the estimated quaternion at time \(t-1\) and \(\varvec{\dot{q}}_{w,t}\) is the quaternion rate of change. Finally, the updated quaternion is
$$\begin{aligned} \varvec{q}_{w,t} = \hat{\varvec{q}}_{t-1} + \dot{\varvec{q}}_{w,t} \Delta {t} \end{aligned}$$(15)where \(\varvec{q}_{w,t}\) is the quaternion integrated orientation solution and \(\Delta t\) is the sampling period.
2.
Orientation from vector observations: By using the gradient descent algorithm the error between the rotated gravitational field and the accelerometer measurements can be minimized, and later used to update the gyroscope-based quaternion (15). The normalized gravitational field in the initial local coordinate frame is
$$\begin{aligned} \varvec{g}_e = \begin{bmatrix} 0&0&0&1 \end{bmatrix} \end{aligned}$$(16)The accelerometer measurement is normalized so that the sum of the absolute values of its components equals one. The normalized reading expressed in the body frame is
$$\begin{aligned} \varvec{f}_b = \begin{bmatrix} 0&f_x&f_y&f_z \end{bmatrix} \end{aligned}$$(17)where \(f_i, i=x,y,z\) are the normalized specific force components. Ideally, we want to find the orientation such that the rotated gravitational field is as close as possible to the accelerometer measurements. Therefore, the objective function is
$$\begin{aligned} \varvec{f}\left( \varvec{\hat{q}},\varvec{g}_e, \varvec{f}_b \right) = \varvec{\hat{q}}^* \cdot \varvec{g}_e \cdot \varvec{\hat{q}} - \varvec{f}_b \end{aligned}$$(18)where \(\hat{q}\) is the orientation quaternion and \(\hat{q}^*\) is its conjugate. Then, the gradient of the objective function evaluated at the last estimated quaternion is used to update the quaternion recursively:
$$\begin{aligned} \varvec{q}_{\nabla f,t} = \hat{\varvec{q}}_{t-1} - \mu \frac{\nabla \varvec{f}}{\Vert \nabla \varvec{f}\Vert } \end{aligned}$$(19)where \(\varvec{q}_{\nabla f, t}\) is the estimated orientation quaternion calculated by the gradient of the objective function at time t and \(\mu\) is the convergence rate of \(\varvec{q}_{\nabla f, t}\).
The final solution of the filter is obtained by a weighted fusion between (15) and (19), resulting in:
$$\begin{aligned} \varvec{q}_{t} = \gamma \varvec{q}_{\nabla f,t} + \left( 1-\gamma \right) \varvec{q}_{w,t} \end{aligned}$$(20)
with \(0\le \gamma \le 1\).
The conversion to the heading angle is11:
$$\begin{aligned} \psi _{AHRS} = \arctan \left( \frac{2\left( q_{1}q_{4} + q_{2}q_{3}\right) }{1 - 2\left( q_{3}^{2} + q_{4}^{2}\right) }\right) \end{aligned}$$(21)
where \(q \triangleq \begin{bmatrix} q_1&q_2&q_3&q_4 \end{bmatrix}\) is the quaternion solution from (20) and \(\psi _{AHRS}\) is the yaw angle extracted from it.
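The gyroscope branch (13)-(15) and the yaw extraction can be sketched as below, using a [w, x, y, z] quaternion convention; the gradient-descent branch and the fusion weight are omitted for brevity, so this is an illustrative sketch rather than the full Madgwick filter:

```python
# Sketch of quaternion integration (13)-(15) and yaw extraction,
# with convention q = [w, x, y, z]; illustrative only.
import numpy as np

def quat_mul(a, b):
    """Hamilton product of two [w, x, y, z] quaternions."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def gyro_update(q, w_b, dt):
    """(14)-(15): integrate the angular rate quaternion into q, renormalized."""
    q_dot = 0.5 * quat_mul(q, np.concatenate(([0.0], w_b)))
    q = q + q_dot * dt
    return q / np.linalg.norm(q)

def yaw(q):
    """Heading (yaw) angle extracted from a [w, x, y, z] quaternion."""
    w, x, y, z = q
    return np.arctan2(2.0 * (w*z + x*y), 1.0 - 2.0 * (y*y + z*z))
```

Integrating a pure z-axis rotation and reading back the yaw is a useful consistency check of the convention.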
Summary
Once the distance and heading are estimated, the dead-reckoning position update equations are
$$\begin{aligned} x_{k+1} = x_{k} + s_{Dnet,k} \cos \left( \bar{\psi }_{k}\right) \end{aligned}$$(22)
$$\begin{aligned} y_{k+1} = y_{k} + s_{Dnet,k} \sin \left( \bar{\psi }_{k}\right) \end{aligned}$$(23)
where \(s_{Dnet,k}\) is the D-Net estimated distance at time k and \(\bar{\psi }_{k}\) is the average heading angle. The heading angle is estimated at the inertial sensors’ sampling rate, which is faster than the distance estimation rate (window size). Also, as the window size is short (a fraction of a second), the robot dynamics change slowly during that period. As a result, we use the average heading angle in the dead-reckoning equations (22)-(23). The average heading angle is defined by:
$$\begin{aligned} \bar{\psi }_{k} = \frac{1}{W} \sum _{i=1}^{W} \psi _{AHRS}\left( t_{k,i}\right) \end{aligned}$$(24)
Here, W denotes the number of IMU samples in window k.
where \(\psi _{AHRS} \left( t_{k,i} \right)\) is the extracted yaw angle from the Madgwick filter (21) at window k and sample i.
In contrast with the MoRPI position propagation (5)-(6), the MoRPINet position propagation depends on a fixed time window and not on the unknown, varying peak-to-peak time. This is also one of the benefits of MoRPINet as, generally, a shorter window size reduces the dead-reckoning propagation error.
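The dead-reckoning update above can be sketched as a single step per window; the names are illustrative:

```python
# Sketch of the per-window MoRPINet update (22)-(24): advance the
# position by the D-Net distance along the window's average heading.
import numpy as np

def morpinet_step(x, y, s_dnet, psi_window):
    """One window update: mean Madgwick heading plus D-Net distance."""
    psi_bar = np.mean(psi_window)   # (24): average heading over the window
    return x + s_dnet * np.cos(psi_bar), y + s_dnet * np.sin(psi_bar)
```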
To summarize, MoRPINet framework has three phases:
1.
Distance Estimation: A shallow, yet efficient, network to estimate the distance over the required time window (Section 3.2).
2.
Heading determination: Madgwick filter is employed for heading estimation (21) in the required window size (Section 3.3).
3.
Position Update: A dead-reckoning position update (22)-(23) is applied based on the distance and heading from previous phases.
The MoRPINet algorithm is presented in Algorithm 1.
Experiment setup and dataset
Experiment setup
A remote control (RC) car was used to conduct the experiments. The car model, STORM Electric 4 WD Climbing Car, has dimensions of \(385 \times 260 \times 205\) mm, with a wheelbase of 253 mm and a tire diameter of 110 mm. The RC car was equipped with a Javad SIGMA-3 N RTK sensor, which provides positioning measurements with an accuracy of 10 cm at a sample rate of 10 Hz, serving as the GT38. Additionally, five IMUs were mounted on a rigid surface at the front of the RC car. The experimental setup is presented in Figure 3. We worked with the Movella DOT IMUs, capable of operating at 120 Hz39. The DOT software allows synchronization between the IMUs. The associated noise and bias values of the accelerometer and gyroscope are presented in Table 1.
Dataset
Thirteen distinct trajectories were recorded during field experiments, with a total of 58 minutes for a single IMU and 290 minutes for the entire dataset. Each trajectory includes GT data obtained from the GNSS-RTK and inertial measurements recorded simultaneously by the five IMUs mounted on the RC car.
Each recording session began with a one-minute static period, which was utilized for stationary calibration and for synchronizing the timing between the IMU and GNSS-RTK measurements. Synchronization between the two sensors was achieved during post-processing. The biases of the IMU’s accelerometers and gyroscopes were determined by averaging the IMU measurements over a few seconds of the stationary period of each recording. They were subsequently subtracted from the entire dataset as needed. To this end, we assumed that the IMU is parallel to the ground with almost zero roll and pitch angles, allowing for accelerometer calibration. Figure 4 shows GNSS-RTK position measurements of four different snake-like slithering motion trajectories.
The recorded trajectories were divided into train and test datasets:
1.
Train Dataset: Seven trajectories are used for the neural network training dataset. All seven include snake-like slithering motion with variable frequency and amplitude and a slowly changing heading direction (no sharp turns). The duration of the recordings varies from 4 to 18 minutes per trajectory, resulting in a total of 55 minutes for a single IMU and 275 minutes for the whole training dataset. The IMU data was divided into time windows and corresponding target values (GT). The target is based on pairs of GNSS-RTK measurements (E and N position components) that were processed into distances. The corresponding equation for the GT distance between two successive position measurements is:
$$\begin{aligned}D_{\dot{\imath }}=\sqrt{(E_{\dot{\imath }+1}-E_{\dot{\imath }})^{2}+(N_{\dot{\imath }+1}-N_{\dot{\imath }})^{2}} \end{aligned}$$(25)where \(D_{\dot{\imath }}\) is the target distance at epoch i and \(E_{\dot{\imath }}\) and \(N_{\dot{\imath }}\) are the east and north coordinates, respectively. The IMU data was segmented based on the time between two RTK measurements. Since the IMU operates at a frequency of 120 Hz and the RTK at 10 Hz, one RTK sample corresponds to twelve IMU samples. We used a window size of 24 IMU samples. This means that to calculate the target distance, we took every second RTK measurement, at the start (\(t_k\)) and at the end (\(t_{k+1}\)) of the time window. Figure 5 shows a schematic description of the window size. Using overlap between time windows allowed us to utilize all the available RTK measurements and to enlarge the training dataset.

For the MoRPI-A and MoRPI-G methods, the role of the training dataset is to acquire the approach gain. According to (4), the gain depends on the amplitude of the motion. Consequently, the gradual change in the trajectory’s direction, reflected in varying amplitudes, affects the gain calculation. Thus, using the entire MoRPINet training dataset results in performance degradation. To address this, sub-trajectories composed of straight segments from the MoRPINet training dataset are used to estimate the required MoRPI gain G, leading to optimal results given this training dataset. These sub-trajectories are consistent with MoRPI, as MoRPI is specifically designed to operate on paths composed of straight segments. The MoRPI gain training dataset contains five minutes of recordings. In addition, the MoRPI training dataset needs to be similar to the test dataset. However, the variance in step length across these trajectories is high, with a standard deviation of 1.49 m and an overall mean of 2.45 m, compared to 2.02 m in the testing set. Consequently, MoRPI’s performance is not ideal.
The reason is that the dataset was recorded to demonstrate the robustness of the proposed approach and was not specifically tailored to the MoRPI method. Yet, such trajectories reflect real-world scenarios. Therefore, taking the above steps allows the model-based MoRPI approach to perform better.
2.
Test Dataset: The test dataset includes four trajectories of driving between two fixed points at a distance of about 25 meters. These trajectories were recorded while the robot was moving in a snake-like slithering motion. The test dataset is the same for all baseline methods (INS and MoRPI) and the proposed MoRPINet approach. During assessment, we tested MoRPI using the gain obtained from the corresponding training group, and for MoRPINet we used D-Net with the optimal weights from the training process. The test group contains 3 minutes of recordings for a single IMU and 15 minutes for the whole test dataset.
Additionally, two straight-line motion trajectories with a distance of about 25 meters were recorded. We used the two straight-line motion trajectories to evaluate only the INS method as the common baseline setup. The total time of these trajectories is approximately 4 minutes.
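The target-generation step in (25) can be sketched as follows, assuming east and north coordinate arrays sampled at the RTK rate and a spacing of two RTK epochs per 24-sample window; names are illustrative:

```python
# Sketch of GT distance targets per (25): planar distance between RTK
# fixes spaced `step` epochs apart (step=2 matches a 24-sample window).
import numpy as np

def gt_distances(E, N, step=2):
    """Planar distances between RTK fixes `step` epochs apart."""
    dE = E[step:] - E[:-step]
    dN = N[step:] - N[:-step]
    return np.sqrt(dE**2 + dN**2)
```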
In all the trajectories, the initial heading angle was obtained by:
$$\begin{aligned} \psi _{\dot{\imath }} = \arctan \left( \frac{E_{\dot{\imath }+1}-E_{\dot{\imath }}}{N_{\dot{\imath }+1}-N_{\dot{\imath }}}\right) \end{aligned}$$(26)
where \(\psi _{\dot{\imath }}\) is the true heading at epoch i. An illustration of the position, distance, and heading is provided in Figure 6.
Analysis and results
In this section, we detail the evaluation metrics employed to assess the performance of our proposed method and present the results of the experiments conducted. The metrics were selected for their relevance to the specific objectives: the position error and the D-Net accuracy. The results highlight the comparative performance of our approach against baseline and alternative methods, emphasizing improvements and including visualizations to facilitate interpretation.
Evaluation metrics
To evaluate the performance of our approach, four metrics were used. Two for the position vector accuracy:
1.
Position root mean squared error (PRMSE):
$$\begin{aligned} PRMSE(\varvec{x}_{\dot{\imath }},\hat{\varvec{x}}_{\dot{\imath }})=\sqrt{\frac{\sum _{\dot{\imath }=1}^{N}|\varvec{x}_{\dot{\imath }}-\hat{\varvec{x}}_{\dot{\imath }}|^{2}}{N}} \end{aligned}$$(27)
2.
Position mean absolute error (PMAE):
$$\begin{aligned}PMAE(\varvec{x}_{\dot{\imath }},\hat{\varvec{x}}_{\dot{\imath }})=\frac{\sum _{\dot{\imath }=1}^{N}|\varvec{x}_{\dot{\imath }}-\hat{\varvec{x}}_{\dot{\imath }}|}{N} \end{aligned}$$(28)
In (27)-(28) \(\varvec{x}_i\) is the measured position vector, \(\hat{\varvec{x}}_i\) is the expected position vector, and N is the number of samples.
Two more metrics are used for the D-Net assessment, estimating the error of the step distances as follows:
3.
Distance root mean squared error (DRMSE):
$$\begin{aligned}DRMSE(d_{\dot{\imath }},\hat{d}_{\dot{\imath }})=\sqrt{\frac{\sum _{\dot{\imath }=1}^{N}\left( d_{\dot{\imath }}-\hat{d}_{\dot{\imath }}\right) ^{2}}{N}} \end{aligned}$$(29)
4.
Distance mean absolute error (DMAE):
$$\begin{aligned}DMAE(d_{\dot{\imath }},\hat{d}_{\dot{\imath }})=\frac{\sum _{\dot{\imath }=1}^{N}\left| d_{\dot{\imath }}-\hat{d}_{\dot{\imath }}\right| }{N} \end{aligned}$$(30)
where, in (29)-(30), \(d_i\) is the estimated distance from D-Net, \(\hat{d}_i\) is the expected distance, and N is the number of steps.
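The four metrics can be sketched directly from (27)-(30); note that we compute DMAE with an absolute value, as its name implies, which is our reading of (30):

```python
# Sketch of the four evaluation metrics (27)-(30); x arrays are (N, 2)
# position vectors, d arrays are per-window distances.
import numpy as np

def prmse(x, x_hat):
    """Position RMSE (27) over N position vectors."""
    return np.sqrt(np.mean(np.sum((x - x_hat)**2, axis=1)))

def pmae(x, x_hat):
    """Position MAE (28): mean Euclidean position error."""
    return np.mean(np.linalg.norm(x - x_hat, axis=1))

def drmse(d, d_hat):
    """Distance RMSE (29) over N step distances."""
    return np.sqrt(np.mean((d - d_hat)**2))

def dmae(d, d_hat):
    """Distance MAE (30), taken with an absolute value (our assumption)."""
    return np.mean(np.abs(d - d_hat))
```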
Field experiment results
The MoRPINet, MoRPI, and INS methods were evaluated for each test trajectory. First, the experiments and errors of the INS method are presented using the PRMSE. Then, a summary of the D-Net performance is given. Finally, we compare the MoRPI and MoRPINet methods and analyze the results using the evaluation metrics (27)-(28). In the subsections below, we calculated the estimated position for each IMU recording. The presented results are the average of the five recordings’ position errors corresponding to the same trajectory (recorded simultaneously by the five mounted IMUs).
Inertial navigation system
The INS solution was applied to the straight-line trajectories for a fair comparison, as this is the most common baseline navigation solution. Additionally, an analysis was conducted assuming ground-planar movement, i.e., 2D motion for the mobile robot. In (2) we presume \(f_z=0\), and in (3) \(\omega _x=\omega _y=0\). These assumptions were made to minimize system noise and achieve optimal results for this method29.
We used three seconds with stationary conditions for each recording for calibration, as mentioned in subsection 4.2. During this period, biases were extracted. This calibration technique was consistently applied across all reviewed methods when calibration was conducted. Specifically, calibration was performed for both accelerometers and gyroscopes in the INS solution.
As expected, there is a significant error when using the INS equations. The PRMSE for the trajectories is 4502 m using the 3D INS. Under the planar assumption, the 2D INS error is 295 m. The average distance of the trajectories is 24.4 m, and the mean travel time along the trajectories is 41 s. The errors for each trajectory are presented in Table 2.
Additionally, we evaluated the INS solution for trajectories involving periodic movement from the test group. This type of movement results in a minor improvement compared to the straight-line movement.
The average error of those trajectories for the 3D INS is 3528 m, and for the 2D INS it is 262 m. The results are summarized in Table 3.
D-Net
The IMU samples used as input to the neural network were not calibrated; however, calibration was part of the process in the AHRS filter. The suggested network was trained with the hyperparameters presented in Table 4 as they obtained the highest performance.
Our D-Net achieved accuracies of 84% and 87% in terms of DRMSE and DMAE, respectively, relative to the average GT distance on the test set, as presented in Table 5.
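These accuracy figures are relative errors of the per-segment distance estimates. Given hypothetical arrays of predicted and GT segment distances, they can be computed along these lines (a sketch consistent with the text, not the paper's exact metric code):

```python
import numpy as np

def distance_metrics(d_hat, d_gt):
    """DRMSE / DMAE over per-segment distance estimates, and the
    corresponding accuracies relative to the mean GT distance."""
    err = np.asarray(d_hat, dtype=float) - np.asarray(d_gt, dtype=float)
    drmse = float(np.sqrt(np.mean(err ** 2)))
    dmae = float(np.mean(np.abs(err)))
    mean_gt = float(np.mean(d_gt))
    return drmse, dmae, 1.0 - drmse / mean_gt, 1.0 - dmae / mean_gt
```

Since the RMSE penalizes large errors more than the MAE, the DRMSE-based accuracy is always the lower of the two, matching the 84% versus 87% ordering reported above.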
MoRPI and MoRPINet
Using MoRPI requires obtaining the gain prior to position evaluation. We calculated the gain using straight segments cropped from the training group, as described in Section 4.2. The training set was constructed to ensure robust results from the neural network and therefore contains trajectories with varying amplitudes and step sizes for the periodic movement. However, this characteristic conflicts with MoRPI's requirement that the training set closely match the test set in amplitude and step size.
The properties of the dataset are as follows: the average step size in the training set is 2.45 m with a standard deviation of 1.49 m, while in the test set the average step size and standard deviation are 2.02 m and 0.76 m, respectively. Consequently, this mismatch degraded the results, particularly for the MoRPI-G method.
Despite this difficulty, MoRPI's results are significantly better than those of the INS method and competitive with the proposed approach. On the test set, we observed an average error of 2.75 m using MoRPI-A with gyro calibration. For MoRPI-G, also with gyro calibration, the average error was 6.28 m.
The average number of position updates for the test trajectories, with an average length of 24.6 m, was 12.25. The average time between two updates was 3.9 seconds, and the average travel time of a trajectory was 39 seconds.
MoRPINet scored a PRMSE of 1.92 m averaged over the test dataset, an improvement of about \(30\%\) over MoRPI. Its PMAE, averaged across the test trajectories, was 1.59 m. MoRPI, in comparison, scored 2.75 m and 2.36 m on the PRMSE and PMAE metrics, respectively. The results are summarized in Table 6.
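For reference, the position metrics can be sketched as norms of the horizontal position errors at the update times; the exact definitions are those given in (27)-(28):

```python
import numpy as np

def position_metrics(p_hat, p_gt):
    """PRMSE / PMAE over a trajectory: norms of the 2D position errors
    between the estimated and ground-truth positions at matching times."""
    err = np.linalg.norm(np.asarray(p_hat, float) - np.asarray(p_gt, float), axis=1)
    return float(np.sqrt(np.mean(err ** 2))), float(np.mean(err))
```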
Table 7 summarizes the results and presents a comparison between the different methods in terms of the position evaluation metrics and update frequency. Furthermore, Figure 7 provides a visual comparison of the estimated trajectories obtained from each approach alongside the corresponding ground truth (GT) trajectory. The results clearly demonstrate that the proposed MoRPINet method achieves the highest positioning accuracy. Although subfigure 7c exhibits reduced performance due to increased noise in the samples, which makes the prediction task more challenging, our approach still outperforms both MoRPI-A and MoRPI-G. MoRPI-G demonstrates greater robustness in heading estimation due to its different sampling points but yields lower accuracy in step-size estimation; in contrast, MoRPI-A is more affected by inaccuracies in heading estimation.
Conclusions
In real-world scenarios, a mobile robot commonly relies only on its inertial sensors for positioning. Yet, the error terms associated with the inertial readings cause the navigation solution to drift over time. To cope with this drift, we proposed the MoRPINet framework, an inertial data-driven approach that estimates the mobile robot's position. The MoRPINet framework consists of D-Net, a neural network architecture for distance regression, and a Madgwick filter for heading estimation. To increase the inertial signal-to-noise ratio, inspired by snake movement, the mobile robot was maneuvered in serpentine locomotion, allowing D-Net to accurately estimate the traveled distance.
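The hybrid idea, chaining per-segment distance estimates with heading estimates into position updates, can be sketched as follows (illustrative variable names; a simplified view rather than the exact MoRPINet pipeline):

```python
import numpy as np

def dead_reckon(distances, headings, p0=(0.0, 0.0)):
    """Chain per-segment distance estimates (e.g. from a distance-regression
    network) with per-segment heading estimates (e.g. from an AHRS filter)
    into a sequence of 2D position updates."""
    p = np.array(p0, dtype=float)
    track = [p.copy()]
    for d, psi in zip(distances, headings):
        p = p + d * np.array([np.cos(psi), np.sin(psi)])  # advance d along psi
        track.append(p.copy())
    return np.array(track)
```

Because each update only needs a distance and a heading, the drift of the full double integration of accelerometer errors is avoided; the position error instead grows with the per-segment distance and heading errors.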
To evaluate our MoRPINet approach, five low-cost IMUs and an RTK-GNSS sensor were mounted on a mobile robot. The RTK position recordings, with 10 cm accuracy, were used as GT. A dataset of 290 minutes (58 minutes from each IMU) of inertial recordings was used to train and test the model. We compared MoRPINet to two model-based approaches: 1) the commonly used INS solution and 2) the MoRPI approach, which requires a gain calibration prior to its application.
MoRPINet provides a position accuracy of 1.59 meters, in terms of PMAE, for a 40-second, 24-meter-long trajectory. Compared to the MoRPI solution, the PMAE and PRMSE along the RC car's route were reduced by 33% and approximately 30%, respectively. The INS solution rapidly drifts due to the accumulation of IMU errors and is practically irrelevant. The suggested D-Net achieved 87% accuracy in terms of DMAE relative to the average GT step distance.
The suggested approach presents three main advantages over the MoRPI approach. First, it improves the pure inertial positioning accuracy by at least 33%. Furthermore, while MoRPI is a peak-to-peak-based method, the suggested approach is not so limited and can be applied at a higher rate (5 Hz in our experiments, 20 times faster than MoRPI), thus providing better resolution. Finally, the MoRPI approach is highly sensitive to changes in the frequency of the periodic movement, while the suggested approach is more robust. It is also important to note that the D-Net neural network is small yet effective and can be implemented on edge devices.
In addition, all recorded data and code used for our evaluations are publicly available at https://github.com/ansfl/MoRPINet.
In summary, MoRPINet offers an accurate, robust solution for mobile robot positioning in scenarios where only inertial readings are available. In future work, we aim to enhance this framework to accommodate more challenging maneuvers, such as sharp turns and variations in elevation, while also collecting additional data, including rough terrain conditions. These efforts will further broaden the application potential of the framework and allow a more comprehensive assessment of its robustness in complex environments.
Data availability
The data supporting the findings of this study is publicly available at: https://github.com/ansfl/MoRPINet.
References
Rubio, F., Valero, F. & Llopis-Albert, C. A review of mobile robots: Concepts, methods, theoretical framework, and applications. International Journal of Advanced Robotic Systems 16, 1729881419839596. https://doi.org/10.1177/1729881419839596 (2019).
Desouza, G. N. & Kak, A. C. Vision for mobile robot navigation: A survey. IEEE transactions on pattern analysis and machine intelligence 24, 237–267 (2002).
Cheng, Y. & Wang, G. Y. Mobile robot navigation based on lidar. In 2018 Chinese Control And Decision Conference (CCDC), 1243–1246 (2018).
Leonard, J. J. & Durrant-Whyte, H. F. Directed Sonar Sensing for Mobile Robot Navigation, vol. 175 (Springer Science & Business Media, 2012).
Farrell, J. Aided navigation: GPS with high rate sensors (McGraw-Hill, Inc., 2008).
Jiménez, A. R., Seco, F., Prieto, J. C. & Guevara, J. Indoor pedestrian navigation using an INS/EKF framework for yaw drift reduction and a foot-mounted IMU. In 2010 7th workshop on positioning, navigation and communication, 135–143 (IEEE, 2010).
Nemec, D., Hrubos, M., Janota, A., Pirnik, R. & Gregor, M. Estimation of the speed from the odometer readings using optimized curve-fitting filter. IEEE Sensors Journal 21, 15687–15695. https://doi.org/10.1109/JSEN.2020.3023503 (2021).
Kalman, R. E. A New Approach to Linear Filtering and Prediction Problems. Journal of Basic Engineering 82, 35–45 (1960).
Guosheng, W. et al. UWB and IMU system fusion for indoor navigation. In 2018 37th Chinese Control Conference (CCC), 4946–4950 (IEEE, 2018).
Lin, X., Gan, J., Jiang, C., Xue, S. & Liang, Y. Wi-Fi-based indoor localization and navigation: A robot-aided hybrid deep learning approach. Sensors 23, 6320 (2023).
Titterton, D. H. & Weston, J. L. Strapdown Inertial Navigation Technology. Progress in Astronautics and Aeronautics, vol. 207 (AIAA, Reston, VA, 2004), 2nd edn.
Klein, I. Data-driven meets navigation: Concepts, models, and experimental validation. In 2022 DGON Inertial Sensors and Systems (ISS), 1–21, https://doi.org/10.1109/ISS55898.2022.9926294 (2022).
Cohen, N. & Klein, I. Inertial navigation meets deep learning: A survey of current trends and future directions. Results in Engineering 103565 (2024).
Chen, C. & Pan, X. Deep learning for inertial positioning: A survey. IEEE Transactions on Intelligent Transportation Systems 25 (2024).
Yona, M. & Klein, I. Compensating for partial Doppler velocity log outages by using deep-learning approaches. In 2021 IEEE International Symposium on Robotic and Sensors Environments (ROSE), 1–5 (IEEE, 2021).
Saksvik, I. B., Alcocer, A. & Hassani, V. A deep learning approach to dead-reckoning navigation for autonomous underwater vehicles with limited sensor payloads. In OCEANS 2021: San Diego–Porto, 1–9 (IEEE, 2021).
Shurin, A. & Klein, I. QuadNet: A hybrid framework for quadrotor dead reckoning. Sensors 22, 1426 (2022).
Zhang, K. et al. Dido: Deep inertial quadrotor dynamical odometry. IEEE Robotics and Automation Letters 7, 9083–9090 (2022).
Asraf, O., Shama, F. & Klein, I. PDRNet: A deep-learning pedestrian dead reckoning framework. IEEE Sensors Journal 22, 4932–4939 (2021).
Chen, C. et al. Deep-learning-based pedestrian inertial navigation: Methods, data set, and on-device inference. IEEE Internet of Things Journal 7, 4431–4441 (2020).
Dourado, C. M. et al. A new approach for mobile robot localization based on an online IoT system. Future Generation Computer Systems 100, 859–881. https://doi.org/10.1016/j.future.2019.05.074 (2019).
Kim, H., Lee, D., Oh, T., Choi, H.-T. & Myung, H. A probabilistic feature map-based localization system using a monocular camera. Sensors 15, 21636–21659 (2015).
Wang, X., Wang, X. & Wilkes, D. M. Machine Learning-based Natural Scene Recognition for Mobile Robot Localization in an Unknown Environment (Springer, 2019).
Burschka, D. & Hager, G. D. V-GPS (SLAM): Vision-based inertial system for mobile robots. In Proceedings of the IEEE International Conference on Robotics and Automation, 409–415. https://doi.org/10.1109/robot.2004.1307184 (2004).
Jayne, B. C. Kinematics of terrestrial snake locomotion. Copeia 915–927 (1986).
Hu, D. L., Nirody, J., Scott, T. & Shelley, M. J. The mechanics of slithering locomotion. Proceedings of the National Academy of Sciences 106, 10081–10085 (2009).
Shurin, A. & Klein, I. QDR: A quadrotor dead reckoning framework. IEEE Access 8, 204433–204440. https://doi.org/10.1109/ACCESS.2020.3037468 (2020).
Hurwitz, D. & Klein, I. Quadrotor dead reckoning with multiple inertial sensors. In 2023 DGON Inertial Sensors and Systems (ISS), 1–18, https://doi.org/10.1109/ISS58390.2023.10361917 (2023).
Etzion, A. & Klein, I. MoRPI: Mobile robot pure inertial navigation. IEEE Journal of Indoor and Seamless Positioning and Navigation 1, 141–150 (2023).
Groves, P. D. Principles of GNSS, Inertial and Multisensor Integrated Navigation Systems (Artech House, 2013), second edn.
Madgwick, S. et al. An efficient orientation filter for inertial and inertial/magnetic sensor arrays. Report x-io and University of Bristol (UK) 25, 113–118 (2010).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Pereira, F., Burges, C., Bottou, L. & Weinberger, K. (eds.) Advances in Neural Information Processing Systems, vol. 25 (Curran Associates, Inc., 2012).
Bengio, Y., Goodfellow, I. & Courville, A. Deep learning Vol. 1 (MIT press Cambridge, 2017).
Gonzalez, R. C. Deep convolutional neural networks [lecture notes]. IEEE Signal Processing Magazine 35, 79–87 (2018).
Agarap, A. F. Deep learning using rectified linear units (ReLU). arXiv preprint arXiv:1803.08375 (2018).
Zhao, B., Lu, H., Chen, S., Liu, J. & Wu, D. Convolutional neural networks for time series classification. Journal of Systems Engineering and Electronics 28, 162–169 (2017).
Bock, S. & Weiß, M. A proof of local convergence for the Adam optimizer. In 2019 International Joint Conference on Neural Networks (IJCNN), 1–8 (IEEE, 2019).
Javad. Javad SIGMA-3N. Available: https://www.javad.com/jgnss/products/receivers/sigma.html. Accessed: 2022-10-01.
Xsens. Xsens DOT. Available: https://www.xsens.com/xsens-dot. Accessed: 2022-10-01.
Acknowledgements
A.E. and N.C. were supported by the Maurice Hatter Foundation.
Author information
Contributions
A.E. designed and performed the experiments, derived the models and analysed the data. A.E. carried out the implementation and wrote the manuscript with support from N.C. and O.L.. Z.Y. suggested and checked various ideas during the research process. I.K. conceived of the presented idea and supervised the findings of this work. All authors carried out the experiments. A.E. and I.K wrote the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Etzion, A., Cohen, N., Levi, O. et al. Snake-inspired mobile robot positioning with hybrid learning. Sci Rep 15, 15602 (2025). https://doi.org/10.1038/s41598-025-97656-2