Introduction

Unmanned aerial vehicles (UAVs) play a crucial role in fields such as military reconnaissance, geographical mapping, agricultural monitoring, and disaster relief, significantly improving operational efficiency and safety. They also demonstrate vast potential in civilian applications, including film production and courier logistics1,2,3,4,5. Quadrotor UAVs (QUAVs), distinguished from fixed-wing UAVs, offer the distinct capabilities of vertical take-off and landing (VTOL) and hovering, and have therefore attracted significant global scholarly interest. These attributes make QUAVs particularly versatile across applications. Like all aircraft and dynamic systems, however, QUAVs are subject to inevitable external disturbances6,7. UAV flight is often affected by external disturbances including, but not limited to, ground effect, rotor blade oscillations, and wind gusts. These factors can significantly diminish the stability and precision of UAV flight and may even threaten the overall stability of the vehicle8,9. Consequently, research into trajectory tracking control for UAVs in the face of such disturbances has become increasingly important, and developing control strategies that mitigate these disturbances and enhance flight performance is of paramount importance. The need for accurate and stable UAV controller design likewise makes robust control an important research field. Numerous linear and nonlinear methods are currently available for UAV trajectory tracking, including proportional-integral-derivative (PID) control10,11, adaptive control12,13,14, active disturbance rejection control (ADRC)15,16,17, sliding mode control18,19,20, dynamic surface control (DSC)21,22,23,24, and neural-network-based and other intelligent control methods25,26,27,28,29.

In Ref. 11, a six-degree-of-freedom QUAV system is introduced, utilizing six nonlinear controllers. These controllers focus on system stabilization and precise adherence to a set trajectory, with an emphasis on reducing energy usage and minimizing tracking discrepancies; however, the influence of external disturbances on trajectory tracking is given little consideration in that work. In Ref. 30, a PID control strategy is proposed to tackle stability challenges in UAV flight under disturbances; its efficacy is confirmed through simulations showing stable control of roll, pitch, and yaw angles during autonomous flight. Reference 31 details a robust adaptive control strategy for QUAVs that assumes bounded disturbances and employs Lyapunov stability analysis to ensure the stability and boundedness of all closed-loop signals. Differing from most research efforts, which concentrate on leader-follower problems, Ref. 32 focuses on situations in which leading UAVs encounter dynamic uncertainties and unanticipated external disturbances, and proposes an adaptive consensus-based control methodology for the coordinated flight of UAV formations; numerical simulations demonstrate greater robustness than conventional approaches. To address the intrinsic nonlinearity of small UAV models and their vulnerability to external disturbances during flight, Ref. 33 proposes a controller utilizing piecewise constant adaptive laws. This approach is advantageous over earlier methods, such as those discussed in Refs. 34 and 35, due to its minimalistic controller design and elimination of the need for parameter tuning, and simulations confirm its substantial robustness.
In Ref. 13, sliding mode control is used to tackle pose control of QUAVs amidst parameter uncertainties and external disturbances. While effective, sliding mode control has limitations, such as chattering caused by high-frequency switching of the control signal, which can increase wear on system components. In Ref. 18, a distributed sliding mode controller achieves consensus on altitude and heading angles among a swarm of UAVs, while sliding mode control in the autopilot manages the non-consensus states; again, chattering can lead to mechanical wear and increased energy consumption in UAV systems. In Ref. 36, a disturbance rejection strategy based on dynamic surface control (DSC) is introduced for precise UAV trajectory tracking. It differs from traditional backstepping-style inverse control by offering a more streamlined dynamic controller design: a first-order filter derives the derivative of the virtual control, addressing the explosion of complexity commonly encountered during repeated differentiation, and numerical comparisons and simulations validate the superiority of that strategy. In Ref. 37, a neural-network-based trajectory tracking strategy for UAVs optimizes control inputs by solving the Hamilton-Jacobi-Bellman equation, with stability verified via Lyapunov functions and robustness confirmed through simulations. Meanwhile, Ref. 38 reports that a feedforward neural network enhances MPC accuracy for quadcopters, reducing trajectory tracking errors by \(40\%\) in simulations and real-world tests compared to PID controllers.
However, the complexity of neural network models can lead to challenges in real-time implementation and require significant computational resources.

To date, the main methods for predictive analysis of existing data have been time-series forecasting techniques39, such as exponential smoothing40 and grey prediction models41. Various forms of regression, including linear, logistic, and nonlinear regression, are also used to predict numerical outcomes by fitting mathematical models. Machine-learning-based predictive methods42, such as decision trees, random forests, and neural networks, leverage learned patterns and correlations in the data for prediction. Other predictive methods include Markov prediction and ROC-curve analysis. In Ref. 43, the challenge of achieving accurate and sufficient long-term wind speed prediction is addressed using a combination of artificial neural networks (ANNs), recurrent neural networks (RNNs), and long short-term memory (LSTM) networks to reduce errors and improve prediction accuracy. The results, measured via the root mean square error (RMSE), indicate that the LSTM model demonstrates superior predictive capability with lower errors than the other models. Particle swarm optimization (PSO) has been employed to optimize the parameters or hyperparameters of LSTM networks44,45; the goal of this integration, known as PSO-LSTM, is to optimize LSTM performance in time-series prediction tasks.

Drawing inspiration from the cited literature, the novelty of this study lies in its approach to the trajectory tracking control challenges faced by UAVs under external disturbances. Unlike traditional methods, this paper proposes a dual-layer robust sliding mode control strategy that effectively integrates a position outer loop with an attitude inner loop using robust sliding mode techniques. Additionally, this study employs a data-driven deep learning strategy, constructing an LSTM network enhanced with a bio-inspired swarm intelligence optimization algorithm to predictively analyze position tracking error data. The key contributions of this research include:

  1.

    In the position subsystem, a virtual sliding mode control input is implemented, augmented by adaptive estimation techniques to adjust for variations in mass loads and external disturbances. Meanwhile, in the attitude subsystem, a sliding mode controller is designed to maintain stability and ensure precise tracking of reference signals, effectively operating without requiring detailed knowledge of the inertia matrix model.

  2.

    In this study, a differential operator with an integral chain structure is employed to effectively dampen noise. The dynamic performance of the inner loop, particularly the initial attitude angle tracking errors, plays a crucial role in the overall stability of the outer loop in the combined control systems. To improve the convergence rate of the inner loop’s sliding mode control, this algorithm strategically modifies the gain coefficients in the inner loop control.

  3.

    The integration of PSO with conventional LSTM networks, known as PSO-LSTM, provides a more efficient exploration of parameter space, which improves the global search capability and robustness of the training phase, thus demonstrating clear benefits in time series prediction tasks.

  4.

    MATLAB/Simulink is used to perform numerical simulations, which reaffirm the effectiveness and robustness of the control strategy described in this paper. These simulations serve as further validation, reinforcing the efficiency and dependability of the proposed control method.

This paper is structured as follows: “Dynamic model of a QUAV and algorithms introduction” section introduces the QUAV model and the LSTM algorithm, and outlines the control objectives. “Design of trajectory tracking controller for quadrotor UAVs” section details the development of control strategies for trajectory tracking, including the design of position and attitude angle tracking controllers for the quadrotor UAV using a dual-loop control approach. Section 4 presents numerical comparative simulations conducted in MATLAB/Simulink, illustrating the robustness and disturbance resistance of the proposed control strategy. Finally, Section 5 summarizes the key findings and contributions of the research.

Dynamic model of a QUAV and algorithms introduction

Mathematical description

A QUAV, or Quadrotor Unmanned Aerial Vehicle, typically features four rotors, each powered by an electric motor that generates lift and controls the aircraft’s flight. The key components of a QUAV include its rotors, electric motors, a flight controller, and various sensors. The core flight mechanics of a QUAV involve manipulating the speeds of these rotors to create lift and direct its movement. The flight controller, which receives data from sensors and inputs from a remote control, precisely adjusts the speeds of the electric motors. This adjustment allows the QUAV to perform fundamental maneuvers such as ascending, descending, moving forward and backward, rotating, and tilting. These functionalities are discussed in detail in References 26, 46, and 47. A schematic representation of a QUAV is provided in Fig. 1.

Figure 1. The structure diagram of a QUAV.

The dynamics model of the quadrotor aircraft presented in Reference 48 does not account for the coupling relationships between attitude angles, whereas the dynamic model in Reference 49 does consider these couplings, contributing to a more comprehensive understanding of the quadrotor’s performance and control. Following the principles of Euler-Lagrange modeling, the dynamic model of the QUAV is established as:

$$\begin{aligned} m{{\ddot{P}}_{pos}}= & {} {U_1}R{e_3} - mg{e_3} + {d_F} \end{aligned}$$
(1)
$$\begin{aligned} J\ddot{\Theta }= & {} {U_2} - C\dot{\Theta }+ {d_\Gamma } \end{aligned}$$
(2)

where Eqs. (1) and (2) represent the dynamic models of the position and attitude subsystems of a QUAV, respectively. The symbol \(P_{pos}=\left[ \begin{array}{lll}x&y&z\end{array}\right] ^T\) denotes the position of the center of mass of the QUAV in the inertial coordinate system, m denotes the total mass of the vehicle (including payload), g represents the gravitational acceleration, \(\Theta =\left[ \begin{array}{lll}\phi&\theta&\psi \end{array}\right] ^T\) denotes the attitude angles, and \(e_3=\left[ \begin{array}{lll}0&0&1\end{array}\right] ^T\) represents the unit vector in the vertical direction. \(U_1\) represents the total lift, and \(U_2\) collects the rotational moments of the attitude subsystem. The disturbance \(d_F\) corresponds to the disturbance force acting on the QUAV, and \(d_{\Gamma }\) represents the disturbance moments. J denotes the inertia tensor, and R represents the transformation from the body coordinate system to the inertial coordinate system (see Reference 49).

$$\begin{aligned} R = \left[ {\begin{array}{*{20}{c}} {{C_\theta }{C_\psi }}&{}{{S_\theta }{S_\phi }{C_\psi } - {C_\phi }{S_\psi }}&{}{{C_\phi }{S_\theta }{C_\psi } + {S_\phi }{S_\psi }}\\ {{C_\theta }{S_\psi }}&{}{{S_\phi }{S_\theta }{S_\psi } + {C_\phi }{C_\psi }}&{}{{C_\phi }{S_\theta }{S_\psi } - {S_\phi }{C_\psi }}\\ { - {S_\theta }}&{}{{S_\phi }{C_\theta }}&{}{{C_\phi }{C_\theta }} \end{array}} \right] \end{aligned}$$
(3)

where \(C_{(\cdot )}\) and \(S_{(\cdot )}\) are shorthand for the cosine and sine of the indicated angle (e.g., \(C_\theta =\cos \theta \)). The matrix J represents the rigid body inertia tensor, expressed in terms of the principal moments of inertia \(I_{xx}\), \(I_{yy}\), and \(I_{zz}\). Reference 49 provides the representation of this tensor in the inertial coordinate system:

$$\begin{aligned} {J}=\left[ \begin{array}{ccc} I_{x x} &{} 0 &{} -I_{x x} S_\theta \\ 0 &{} I_{y y} C_\phi ^2+I_{z z} S_\phi ^2 &{} \left( I_{y y}-I_{z z}\right) S_\phi C_\phi C_\theta \\ -I_{x x} S_\theta &{} \left( I_{y y}-I_{z z}\right) S_\phi C_\phi C_\theta &{} I_{x x} S_\theta ^2+I_{y y} S_\phi ^2 C_\theta ^2+I_{z z} C_\phi ^2 C_\theta ^2 \end{array}\right] \end{aligned}$$
(4)

The term C collects the Coriolis and centrifugal force components, and it can be calculated using the following equation (see Reference 50):

$$\begin{aligned} {C}=\dot{{J}}-\frac{1}{2} \frac{\partial }{\partial {\Theta }}\left( \dot{{\Theta }}^{\textrm{T}} {J}\right) \end{aligned}$$
(5)

By combining Eqs. (4) and (5), it can be obtained

$$\begin{aligned} {C}=\left[ \begin{array}{lll} c_{11} &{} c_{12} &{} c_{13} \\ c_{21} &{} c_{22} &{} c_{23} \\ c_{31} &{} c_{32} &{} c_{33} \end{array}\right] \end{aligned}$$
(6)

where

$$\begin{aligned} \left\{ \begin{array}{l} c_{11}=0 \\ c_{12}=-I_{x x} \dot{\psi } C_\theta +\left( I_{y y}-I_{z z}\right) \left( \dot{\theta } S_\phi C_\phi +\dot{\psi } S_\phi ^2 C_\theta -\dot{\psi } C_\phi ^2 C_\theta \right) \\ c_{13}=\left( I_{z z}-I_{y y}\right) \dot{\psi } S_\phi C_\phi C_\theta ^2 \\ c_{21}=-c_{12} \\ c_{22}=\left( I_{z z}-I_{y y}\right) \dot{\phi } S_\phi C_\phi \\ c_{23}=-I_{x x} \dot{\psi } S_\theta C_\theta +I_{y y} \dot{\psi } S_\phi ^2 S_\theta C_\theta +I_{z z} \dot{\psi } C_\phi ^2 S_\theta C_\theta \\ c_{31}=-I_{x x} \dot{\theta } C_\theta +\left( I_{y y}-I_{z z}\right) \dot{\psi } C_\theta ^2 S_\phi C_\phi \\ c_{32}=\left( I_{z z}-I_{y y}\right) \left( \dot{\theta } S_\phi C_\phi S_\theta +\dot{\phi } S_\phi ^2 C_\theta -\dot{\phi } C_\phi ^2 C_\theta \right) +I_{x x} \dot{\psi } S_\theta C_\theta -I_{y y} \dot{\psi } S_\phi ^2 S_\theta C_\theta -I_{z z} \dot{\psi } C_\phi ^2 S_\theta C_\theta \\ c_{33}=I_{x x} \dot{\theta } S_\theta C_\theta +I_{y y}\left( \dot{\phi } C_\phi S_\phi C_\theta ^2-\dot{\theta } S_\phi ^2 C_\theta S_\theta \right) -I_{z z}\left( \dot{\phi } C_\phi S_\phi C_\theta ^2+\dot{\theta } C_\phi ^2 C_\theta S_\theta \right) \end{array}\right. \end{aligned}$$
(7)

In the model, \({d}_F\) represents the vector of disturbance forces, detailed as \({d}_F=\left[ d_x, d_y, d_z\right] ^T\); correspondingly, \(d_{\Gamma }\) represents the vector of disturbance moments, expressed as \(d_{\Gamma } = \left[ d_\phi , d_\theta , d_\psi \right] ^T\). These disturbances primarily arise from the airflow effects experienced by the QUAV. Within Eqs. (1) and (2), the terms J and C capture the interdependencies among the attitude angles.
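For concreteness, the rotation matrix of Eq. (3) and the position dynamics of Eq. (1) can be sketched numerically. This is a minimal illustration, not code from the paper; the function names and sample values below are ours, and the rotation matrix is the standard Z-Y-X Euler-angle form.

```python
import math

def rotation_matrix(phi, theta, psi):
    """Body-to-inertial rotation matrix R (Z-Y-X Euler angles), cf. Eq. (3)."""
    c, s = math.cos, math.sin
    return [
        [c(theta)*c(psi), s(phi)*s(theta)*c(psi) - c(phi)*s(psi), c(phi)*s(theta)*c(psi) + s(phi)*s(psi)],
        [c(theta)*s(psi), s(phi)*s(theta)*s(psi) + c(phi)*c(psi), c(phi)*s(theta)*s(psi) - s(phi)*c(psi)],
        [-s(theta),       s(phi)*c(theta),                        c(phi)*c(theta)],
    ]

def translational_accel(U1, angles, m, d_F=(0.0, 0.0, 0.0), g=9.81):
    """Position-subsystem dynamics, cf. Eq. (1): m * P_ddot = U1*R*e3 - m*g*e3 + d_F."""
    R = rotation_matrix(*angles)
    thrust_dir = [row[2] for row in R]  # R @ e3 is the third column of R
    return [(U1 * thrust_dir[i] - m * g * (1.0 if i == 2 else 0.0) + d_F[i]) / m
            for i in range(3)]
```

A quick check: with level attitude and thrust equal to the weight (U1 = m g), the acceleration vanishes, i.e. the vehicle hovers.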

Remark 1

It is important to highlight that the disturbance \({d}_F\) enters the dynamics in a non-linear relationship with the control input \(U_1\); stability therefore cannot be ensured through a sliding-mode switching robust term alone, and an adaptive control law must be designed specifically for \({d}_F\). Moreover, since the attitude subsystem does not involve the position variables, the control law design can be simplified by appropriately decomposing the system structure, which significantly enhances the efficiency of the control process.

Direct current motor model

The relationship between the applied voltage u, representing the actual control input signal, and the rotational speed \(\omega \) of the propellers on the QUAVs can be approximated by the following first-order differential equation:

$$\begin{aligned} J_r \dot{\omega }=-\frac{k_r^2}{\rho } \omega -\tau _d+\frac{k_r}{\rho } u \end{aligned}$$
(8)

where \(J_r\) represents the motor’s inertia constant, \(\rho \) the motor impedance, \(k_r\) the torque constant, and \(\tau _d\) the motor load torque. The relationship between the rotational speed \(\omega _i\) of propeller i and the thrust \(F_i\) it generates is quadratic:

$$\begin{aligned} F_i=b \omega _i^2, \quad (i=1,2,3,4) \end{aligned}$$
(9)

The total thrust generated by all four propellers is expressed by:

$$\begin{aligned} U_1=\sum _{i=1}^4 F_i=\sum _{i=1}^4 b \omega _i^2 \end{aligned}$$
(10)

Using the reverse torques \(W_i\) of the rotors, the yaw moment \(U_\psi \) can be obtained. The control inputs for the attitude control of the QUAV are then represented by the vector:

$$\begin{aligned} U_2=\left[ \begin{array}{c} U_\phi \\ U_\theta \\ U_\psi \end{array}\right] =\left[ \begin{array}{cccc} 0 &{} b l &{} 0 &{} -b l \\ -b l &{} 0 &{} b l &{} 0 \\ -c &{} c &{} -c &{} c \end{array}\right] \left[ \begin{array}{c} \omega _1^2 \\ \omega _2^2 \\ \omega _3^2 \\ \omega _4^2 \end{array}\right] \end{aligned}$$
(11)

where the variables b and c denote aerodynamic constants, and l represents the distance from each rotor to the center of the QUAV. In the design of control laws, the lift \(U_1\) and rotational moments \(U_2\) can be used as control inputs and transformed into control voltages using the relationships described above. These control voltages can then be applied in the actual flight control system.
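The thrust and moment mixing of Eqs. (10) and (11) amounts to a linear map on the squared rotor speeds, which can be sketched as follows. The parameter values used in the test are purely illustrative, not those of any specific airframe.

```python
def mix_rotor_speeds(omegas, b, c, l):
    """Map the four squared rotor speeds to total thrust U1 (Eq. 10)
    and the attitude moments U2 = (U_phi, U_theta, U_psi) (Eq. 11)."""
    w2 = [w * w for w in omegas]
    U1 = b * sum(w2)                               # Eq. (10)
    U_phi = b * l * w2[1] - b * l * w2[3]          # row 1 of the mixing matrix
    U_theta = -b * l * w2[0] + b * l * w2[2]       # row 2
    U_psi = -c * w2[0] + c * w2[1] - c * w2[2] + c * w2[3]  # row 3
    return U1, (U_phi, U_theta, U_psi)
```

With equal rotor speeds the moments cancel and only collective thrust remains, which is the hover condition implied by the mixing matrix.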

Figure 2. The schematic diagram of the RNN structure.

Long short-term memory network, LSTM

Remark 2

The RNN is a specific type of neural network designed for processing sequential data. It distinguishes itself from traditional feedforward neural networks by incorporating cyclic connections, which allow it to maintain a memory state while processing sequences (refer to Fig. 2). Within the RNN family, the LSTM network is a deep learning variant whose primary purpose is to handle sequential data; it is particularly effective at capturing and retaining long-term dependencies, addressing the gradient vanishing issue encountered in conventional RNNs.

Figure 3. The structure diagram of LSTM.

LSTM networks feature a memory cell, an essential element designed to store and retrieve information over different time intervals. This memory cell regulates information flow through a series of gating mechanisms: the input gate, the forget gate, and the output gate. These gates control the entry and exit of information, thus preserving the memory within the cell. The structure of an LSTM is illustrated in the schematic diagram referred to as Fig. 3.

In an LSTM network, the decision regarding how much information from the previous cell state should be forgotten is crucial. This decision is determined by processing the current input together with the hidden state from the prior timestep. These data points are fed into a fully connected layer, where they undergo transformation by a sigmoid activation function, resulting in the output for the forget gate. The output value of the forget gate ranges between 0 and 1, where a value of 0 indicates that all previous cell state information is forgotten, and a value of 1 indicates that all previous cell state information is retained. The equation governing the forget gate is typically given by:

$$\begin{aligned} {f}_{{t}}=\sigma \left( {W}_{{f}} \cdot \left[ {h}_{{t}-1}, {x}_{{t}}\right] +{b}_{{f}}\right) \end{aligned}$$
(12)

where \(W_f\) is the weight matrix, \(b_f\) is the bias term, \(h_{t-1}\) is the previous time step’s hidden state, \(x_t\) is the current input, and \(\sigma \) is the sigmoid function.

In an LSTM network, determining which new information to store in the memory cell is pivotal. This decision-making process involves two key components: the input gate, which decides which portions of the memory cell are updated, and a hyperbolic tangent (tanh) layer, which creates a vector of new candidate values that may be added to the memory cell. The values for both the input gate and the candidate vector are computed based on the current input and the hidden state from the preceding time step. The equations for the input gate and candidate vector are as follows:

$$\begin{aligned} {i}_{{t}}= & {} \sigma \left( {W}_{{i}} \cdot \left[ {h}_{{t}-1}, {x}_{{t}}\right] +{b}_{{i}}\right) \end{aligned}$$
(13)
$$\begin{aligned} \tilde{{c}}_{{t}}= & {} \tanh \left( {W}_{{c}} \cdot \left[ {h}_{{t}-1}, {x}_{{t}}\right] +{b}_{{c}}\right) \end{aligned}$$
(14)

where \(W_c\) and \(W_i\) represent weight matrices, while \(b_c\) and \(b_i\) are bias terms. \(h_{t-1}\) denotes the hidden state from the previous time step, and \(x_t\) corresponds to the current input. The symbol \(\sigma \) represents the sigmoid function, and \(\tanh \) stands for the hyperbolic tangent function.

The memory cell in an LSTM network is updated based on the outputs of the forget gate and the input gate. The process involves multiplying the current cell state by the forget gate’s output to discard parts of the existing state information deemed irrelevant. Subsequently, the product of the input gate’s output and the candidate value is added to the cell state, indicating the incorporation of new relevant information into the state. The formula for updating the cell state can be expressed as:

$$\begin{aligned} {c}_{{t}}={f}_{{t}} \times {c}_{{t}-1}+{i}_{{t}} \times \tilde{{c}}_{{t}} \end{aligned}$$
(15)

where \(f_t\) represents the output of the forget gate, \(c_{t-1}\) the previous time step’s cell state, \(i_t\) the input gate value, and \(\tilde{c}_t\) the candidate value.

The output gate determines what to output based on the cell state. It takes the current input and the previous time step’s hidden state, passes them through a fully connected layer, and applies a sigmoid function to obtain the output gate’s value. This value ranges from 0 to 1, where 0 means no output and 1 means full output. Next, the cell state is passed through a hyperbolic tangent (tanh) function to map it into the range \(-1\) to 1; this transformed value is then multiplied by the output gate’s value to compute the final hidden state. The formulas for the output gate and hidden state are:

$$\begin{aligned} {o}_{{t}}= & {} \sigma \left( {W}_{{o}} \cdot \left[ {h}_{{t}-1}, {x}_{{t}}\right] +{b}_{{o}}\right) \end{aligned}$$
(16)
$$\begin{aligned} {h}_{{t}}= & {} {o}_{{t}} \times \tanh \left( {c}_{{t}}\right) \end{aligned}$$
(17)

where \(W_o\) represents the weight matrix, \(b_o\) the bias term, and \(h_{t-1}\) the hidden state from the previous time step.
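The gate equations (12)-(17) can be condensed into a single forward step. The following sketch implements a one-unit LSTM cell with scalar weights purely for illustration; the weight values in the usage are arbitrary (untrained), and the parameter dictionary layout is our own convention.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x_t, h_prev, c_prev, p):
    """One time step of a single-unit LSTM cell, following Eqs. (12)-(17).
    p holds scalar weight pairs W_* (applied to [h_{t-1}, x_t]) and biases b_*."""
    z = lambda W: W[0] * h_prev + W[1] * x_t       # W . [h_{t-1}, x_t]
    f_t = sigmoid(z(p['W_f']) + p['b_f'])          # forget gate,  Eq. (12)
    i_t = sigmoid(z(p['W_i']) + p['b_i'])          # input gate,   Eq. (13)
    c_tilde = math.tanh(z(p['W_c']) + p['b_c'])    # candidate,    Eq. (14)
    c_t = f_t * c_prev + i_t * c_tilde             # cell update,  Eq. (15)
    o_t = sigmoid(z(p['W_o']) + p['b_o'])          # output gate,  Eq. (16)
    h_t = o_t * math.tanh(c_t)                     # hidden state, Eq. (17)
    return h_t, c_t
```

Because \(h_t = o_t \tanh (c_t)\) with \(o_t \in (0,1)\), the hidden state is always bounded in \((-1, 1)\); a strongly negative forget bias drives \(f_t \approx 0\) and wipes the previous cell state, matching the gate descriptions above.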

Particle Swarm Optimization, PSO

The Particle Swarm Optimization (PSO) algorithm models the dynamics of bird flocking using massless particles, each characterized by two key attributes: a velocity V and a position X. The position represents a candidate solution, while the velocity determines the direction and rate of the particle’s movement through the search space. Each particle autonomously explores the search space and records the best solution it has found as its personal best, denoted \(P_{\text {best}}\). These personal bests are shared among the particles across the entire swarm, and the best among them is identified as the global best, \(G_{\text {best}}\). Based on these benchmarks, all particles recalibrate their velocities and positions in relation to both their own \(P_{\text {best}}\) and the collectively determined \(G_{\text {best}}\). A flowchart of the PSO process is shown in Fig. 4.

Figure 4. The flow chart of the PSO algorithm.

The PSO algorithm is relatively straightforward and can be summarized into the following steps:

  1.

    Initialize the particle swarm.

  2.

    Evaluate particles by computing their fitness values.

  3.

    Search for individual best (\(P_{\text {best}}\)).

  4.

    Search for the global best (\(G_{\text {best}}\)).

  5.

    Modify particle velocities and positions.

The core idea behind PSO is the collaborative movement of particles towards better solutions. Each particle adjusts its position based on its own experience (personal best) and the collective experience of the entire swarm (global best). Through iterations, PSO aims to converge towards the optimal solution by updating particle positions and velocities according to certain rules.

The velocity and position of each particle are updated according to the following rules.

Velocity update:

$$\begin{aligned} v_i(t+1)=w \cdot v_i(t)+c_1 \cdot r_1 \cdot \left( \text{Pbest}_i-x_i(t)\right) +c_2 \cdot r_2 \cdot \left( \text{Gbest}-x_i(t)\right) \end{aligned}$$
(18)

Location update:

$$\begin{aligned} x_i(t+1)=x_i(t)+v_i(t+1) \end{aligned}$$
(19)

where \(v_i(t)\) represents the velocity of particle i at time t, and \(x_i(t)\) denotes its position. The symbol w corresponds to the inertia weight, \(c_1\) and \(c_2\) are learning factors, and \(r_1\) and \(r_2\) are random numbers uniformly distributed in [0, 1].
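The five steps listed above, together with the update rules of Eqs. (18) and (19), can be sketched as a minimal PSO loop. The gain values (w = 0.7, c1 = c2 = 1.5) are common defaults, not values prescribed by the text, and the sphere function in the usage is only a toy objective.

```python
import random

def pso(objective, dim, n_particles=20, iters=100,
        w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimal PSO (minimization) following Eqs. (18)-(19)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # step 1: initialize the swarm
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    # steps 2-4: evaluate fitness, record personal and global bests
    P_best = [x[:] for x in X]
    P_val = [objective(x) for x in X]
    g = min(range(n_particles), key=lambda i: P_val[i])
    G_best, G_val = P_best[g][:], P_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            # step 5: modify velocities and positions
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                V[i][d] = (w * V[i][d] + c1 * r1 * (P_best[i][d] - X[i][d])
                           + c2 * r2 * (G_best[d] - X[i][d]))   # Eq. (18)
                X[i][d] += V[i][d]                              # Eq. (19)
            val = objective(X[i])
            if val < P_val[i]:
                P_best[i], P_val[i] = X[i][:], val
                if val < G_val:
                    G_best, G_val = X[i][:], val
    return G_best, G_val
```

On a smooth convex objective such as the 2-D sphere function, this loop drives the global best close to the minimizer within a few dozen iterations.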

Remark 3

PSO-LSTM (Particle Swarm Optimization-based Long Short-Term Memory) leverages the PSO algorithm to optimize LSTM network parameters for time-series prediction tasks44,51. Unlike traditional LSTMs, which typically rely on optimization algorithms like gradient descent for parameter updates52, PSO-LSTM utilizes the global search capabilities of the PSO algorithm to find optimal parameters. This approach helps avoid the local optima pitfalls associated with gradient descent, allowing for more efficient parameter space exploration.

Control objectives

Given the underactuated nature of QUAVs, it is impractical to track and control all states directly. Consequently, in this study, we segment the entire QUAV system into inner and outer loops, adopting a dual-loop control strategy for designing the control laws. The position subsystem serves as the outer loop, while the attitude subsystem functions as the inner loop. The outer loop generates two intermediate command signals, \(\theta _{d}\) and \(\phi _{d}\), which are then relayed to the inner loop system. The inner loop is responsible for tracking these command signals using a dedicated sliding mode control law tailored for the inner loop. Figure 5 depicts the schematic representation of the proposed closed-loop control system, demonstrating the interaction between the inner and outer loops.

Figure 5. The response diagram of the closed-loop system of a QUAV.

Design of trajectory tracking controller for quadrotor UAVs

Position tracking controller design

Remark 4

A quadcopter, also known as a four-rotor UAV, exhibits six degrees of freedom (DOF). However, it is classified as an underactuated system because it relies on only four independent control inputs to govern its motion. This limitation prevents the comprehensive control of all state variables. In this section, we will employ a cascaded control approach, dividing the entire trajectory tracking closed-loop system into an inner-loop attitude control system and an outer-loop position subsystem. The outer loop produces two intermediate command signals, denoted as \(\theta _{\textrm{d}}\) and \(\phi _{\textrm{d}}\). These signals are then conveyed to the inner-loop system. The inner loop, in turn, tracks these intermediate command signals by employing an inner-loop sliding mode control law, thereby effectively mitigating errors introduced by the outer loop control. This cascaded control strategy allows for precise control of the quadcopter’s attitude and position, overcoming the inherent underactuation challenge.

Assuming the desired reference position is denoted as \({P}_{posd}\), we define the tracking error as

$$\begin{aligned} {e_p} = {P_{pos}} - {P_{posd}} \end{aligned}$$
(20)

Using Eq. (1), we can derive the error equation for the position subsystem as follows:

$$\begin{aligned} {\ddot{e}_p} = \frac{1}{m}\left( {{U_p} + {d_F}} \right) - g{e_3} - {\ddot{P}_{posd}} \end{aligned}$$
(21)

where \(U_{P}=U_1 {R}{e}_3\) represents the virtual control input to be designed.

We define the sliding mode function as:

$$\begin{aligned} {\sigma _1} = {\dot{e}_p} + {\lambda _1}{e_p}, \quad {\lambda _1} > 0 \end{aligned}$$
(22)

To achieve precise tracking of the desired reference position signal, we design the virtual control law \({U}_{P}\) for the position subsystem as follows:

$$\begin{aligned} {U_p} = {\hat{m}}{{\bar{U}}_p} - {{\hat{d}}_F} \end{aligned}$$
(23)

where

$$\begin{aligned} {{\bar{U}}_p} = g{e_3} + {\ddot{P}_{posd}} - {c_1}{\sigma _1} - {\lambda _1}{\dot{e}_p} \end{aligned}$$
(24)

where \(\hat{m}\) represents the estimated mass, \(\hat{d}_F\) the estimated external disturbance force, and \(c_1>0\).

Then

$$\begin{aligned} \begin{aligned} \dot{\sigma }_1&=\ddot{e}_p+\lambda _1 \dot{e}_p \\&=\frac{1}{m}\left( U_p+d_F\right) -g e_3-\ddot{P}_{ {posd }}+\lambda _1 \dot{e}_p \\&=\frac{1}{m}\left( \hat{m} \bar{U}_p-\hat{d}_F+d_F\right) -g e_3-\ddot{P}_{ {posd }}+\lambda _1 \dot{e}_p \end{aligned} \end{aligned}$$
(25)

The adaptive law for designing external disturbance and quality is:

$$\begin{aligned} \left\{ \begin{array}{l} \dot{\hat{d}}_F=\gamma _1 \sigma _1 \\ \dot{\hat{m}}=-\gamma _2 \sigma _1^T \bar{U}_p \end{array}\right. \end{aligned}$$
(26)

Defining disturbance estimation error as

$$\begin{aligned} \tilde{{d}}_{{F}}={d}_{{F}}-\hat{{d}}_{{F}} \end{aligned}$$
(27)

and mass estimation error as

$$\begin{aligned} \tilde{m}=m-\hat{m} \end{aligned}$$
(28)

For the stability analysis of the system, we select the following Lyapunov function:

$$\begin{aligned} V_1=\frac{1}{2} {\sigma }_1^{{T}} {\sigma }_1+\frac{1}{2 m \gamma _1} \tilde{{d}}_{{F}}^{{T}} \tilde{{d}}_{{F}}+\frac{1}{2 m \gamma _2} \tilde{m}^2 \end{aligned}$$
(29)

As can be seen from Eq. (24):

$$\begin{aligned} -g e_3-\ddot{P}_{ {posd }}+\lambda _1 \dot{e}_{{p}}+\bar{U}_{{p}}+c_1 \sigma _1=0 \end{aligned}$$
(30)

Then, the derivative of \(\sigma _1\) can be obtained as follows:

$$\begin{aligned} \dot{\sigma }_1=\frac{\hat{m} \bar{U}_{\textrm{p}}-\hat{d}_{\textrm{F}}+d_{\textrm{F}}}{m}-\bar{U}_{\textrm{p}}-c_1 \sigma _1 \end{aligned}$$
(31)

Therefore

$$\begin{aligned} \dot{V}_1={\sigma }_1^{T}\left( \frac{\hat{m} \overline{{U}}_{P}-\hat{{d}}_{\textrm{F}}+{d}_{F}}{m}-\bar{U}_{P}-c_1 \sigma _1\right) +\frac{1}{m \gamma _1} \tilde{{d}}_{F}^{T} \dot{\tilde{{d}}}_{F}+\frac{1}{m \gamma _2} \tilde{m} \dot{\tilde{m}} \end{aligned}$$
(32)

Substituting the adaptive law from Eq. (26) into Eq. (32), it can be obtained:

$$\begin{aligned} \begin{aligned} \dot{V}_1&=-c_1 \sigma _1^{T} \sigma _1-\frac{\tilde{{d}}_{F}^{T}}{m \gamma _1}\left( \dot{\hat{d}}_F-\gamma _1 \sigma _1\right) -\frac{\tilde{m}}{m \gamma _2}\left( \dot{\hat{m}}+\gamma _2 \sigma _1^{T} \overline{{U}}_{P}\right) \\&=-c_1 \sigma _1^{T} \sigma _1 \leqslant 0 \end{aligned} \end{aligned}$$
(33)

Based on the analysis provided above, when \(\sigma _1 \ne 0\), \(\dot{V}_1 < 0\), so \(V_1\) decreases monotonically; consequently, \(\sigma _1\), \(\tilde{{d}}_{F}\), and \(\tilde{m}\) all remain bounded. Only when \(\sigma _1 = 0\) does \(\dot{V}_1 = 0\) hold. Therefore, the sliding mode function \(\sigma _1\) asymptotically converges to zero as \(t \rightarrow \infty \). Whenever \(\sigma _1\) is non-zero, the system moves towards the desired state, gradually reducing the tracking error along the sliding surface, and the system reaches its desired equilibrium only when \(\sigma _1 = 0\).

Remark 5

When \(\sigma _1 = 0\), \(\dot{V}_1\) becomes zero, indicating that \(V_1\) is stable. However, both \(\tilde{d}_{F}\) and \(\tilde{m}\) remain bounded but do not necessarily converge to zero as they continuously fluctuate within defined limits53,54. To manage these fluctuations and prevent excessive values of \(\hat{m}\), potentially leading to overly high thrust inputs, we propose modifying the adaptive law (Eq. (26)) and implementing discontinuous projection mapping to constrain parameter estimates within safe operational bounds.

In accordance with the adaptive law in Eq. (26), the update of \(\hat{m}\) is therefore implemented with a discontinuous projection mapping

$$\begin{aligned} \dot{\hat{m}}={\text {Proj}}_{\hat{m}}\left( -\gamma _2 \sigma _1^{T} \overline{{U}}_{p}\right) \end{aligned}$$
(34)

where

$$\begin{aligned} {\text {Proj}}_{\hat{m}}(\cdot )= {\left\{ \begin{array}{ll}0, &{} \text{ if } \hat{m} \geqslant m_{\max } \text{ and } \cdot >0 \\ 0, &{} \text{ if } \hat{m} \leqslant -m_{\max } \text{ and } \cdot <0 \\ \cdot , &{} \text{ otherwise } \end{array}\right. } \end{aligned}$$
(35)
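A minimal sketch of the projection mapping of Eqs. (34)-(35); the helper name and its use as a standalone function are assumptions for illustration.

```python
def proj_m_hat(m_hat, arg, m_max):
    """Discontinuous projection mapping of Eq. (35): the mass-estimate
    update `arg` is zeroed whenever it would push m_hat past the bound."""
    if m_hat >= m_max and arg > 0:
        return 0.0
    if m_hat <= -m_max and arg < 0:
        return 0.0
    return arg

# Eq. (34): the projected adaptive law for the mass estimate would read
#   m_hat_dot = proj_m_hat(m_hat, -gamma2 * sigma1 @ U_bar, m_max)
```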

After obtaining the virtual control input \({U}_{P}\) from Eq. (23), it is necessary to compute the actual thrust \(U_1\) and the intermediate command signals for the attitude subsystem, denoted as \({\Theta }_{d}\). We express the virtual control input from Eq. (23) in vector form as \({U}_{P} = \left[ U_x, U_y, U_z\right] ^{T}\), and define the intermediate command signals for the attitude subsystem as \({\Theta }_{d} = \left[ \begin{array}{lll}\phi _{d}&\theta _{d}&\psi _{d}\end{array}\right] ^{T}\).

From the provided equations, we can deduce the relationship between the virtual control input vector \({U}_{P}\), the actual thrust \(U_1\), and the unit vector \({e}_3\):

$$\begin{aligned} \left[ \begin{array}{l} U_x \\ U_y \\ U_z \end{array}\right] =U_1\left[ \begin{array}{ccc} C_\theta C_\psi &{} S_\theta S_\phi C_\psi -C_\phi S_\psi &{} C_\phi S_\theta C_\psi +S_\phi S_\psi \\ C_\theta S_\psi &{} S_\phi S_\theta S_\psi +C_\phi C_\psi &{} C_\phi S_\theta S_\psi -S_\phi C_\psi \\ -S_\theta &{} S_\phi C_\theta &{} C_\phi C_\theta \end{array}\right] \left[ \begin{array}{l} 0 \\ 0 \\ 1 \end{array}\right] \end{aligned}$$
(36)

where

$$\begin{aligned} U_x= & {} U_1\left( C_\phi S_\theta C_\psi +S_\phi S_\psi \right) \end{aligned}$$
(37)
$$\begin{aligned} U_y= & {} U_1\left( C_\phi S_\theta S_\psi -S_\phi C_\psi \right) \end{aligned}$$
(38)
$$\begin{aligned} U_z= & {} U_1 C_\phi C_\theta \end{aligned}$$
(39)

Next, by substituting \(U_1 = \frac{U_z}{C_\phi C_\theta }\), we can derive the following relationship

$$\begin{aligned} \begin{aligned} U_x&=\frac{U_z}{C_\phi C_\theta }\left( C_\phi S_\theta C_\psi +S_\phi S_\psi \right) =U_z\left( C_\psi \tan \theta +S_\psi \tan \phi \sec \theta \right) \\&=U_z\left[ \begin{array}{ll} C_\psi&S_\psi \end{array}\right] \left[ \begin{array}{c} \tan \theta \\ \tan \phi \sec \theta \end{array}\right] \end{aligned} \end{aligned}$$
(40)
$$\begin{aligned} \begin{aligned} U_y&=\frac{U_z}{C_\phi C_\theta }\left( C_\phi S_\theta S_\psi -S_\phi C_\psi \right) =U_z\left( S_\psi \tan \theta -C_\psi \tan \phi \sec \theta \right) \\&=U_z\left[ \begin{array}{ll} S_\psi&-C_\psi \end{array}\right] \left[ \begin{array}{c} \tan \theta \\ \tan \phi \sec \theta \end{array}\right] \end{aligned} \end{aligned}$$
(41)

Then

$$\begin{aligned} \left[ \begin{array}{l} U_x \\ U_y \end{array}\right] =U_z\left[ \begin{array}{cc} C_\psi &{} S_\psi \\ S_\psi &{} -C_\psi \end{array}\right] \left[ \begin{array}{c} \tan \theta \\ {\tan } {\phi } {\sec } {\theta } \end{array}\right] \end{aligned}$$
(42)

Replacing the attitude angles in Eq. (42) with their desired values, i.e., rewriting \(\Theta \) as the intermediate command signal \(\Theta _d\), the virtual control input can be expressed as:

$$\begin{aligned} \left[ \begin{array}{c} U_x \\ U_y \end{array}\right] =U_z\left[ \begin{array}{cc} C_{\psi d} &{} S_{\psi d} \\ S_{\psi d} &{} -C_{\psi d} \end{array}\right] \left[ \begin{array}{c} \tan \left( \theta _d\right) \\ \tan \left( \phi _d\right) \sec \left( \theta _d\right) \end{array}\right] \end{aligned}$$
(43)

In this research, it is assumed that \(\theta _{d}\) and \(\phi _{d}\) lie within \((-\pi / 2, \pi / 2)\), which allows the arctangent function to be used to compute them. Multiplying both sides of Eq. (43) by the row vector \(\left[ \begin{array}{ll}C_{\psi _{d}}&S_{\psi _{d}}\end{array}\right] \) gives \(U_x C_{\psi _{d}}+U_y S_{\psi _{d}}=U_z \tan \left( \theta _{d}\right) \), which yields the pitch angle command as

$$\begin{aligned} \theta _{d}=\arctan \left( \frac{U_x C_{\psi _{d}}+U_y S_{\psi _{d}}}{U_z}\right) \end{aligned}$$
(44)

Similarly, multiplying both sides of Eq. (43) by the matrix \(\left[ {\begin{array}{*{20}{c}} {{S_{_{{\psi _{d}}}}}}&{{{ - }}{C_{{\psi _{d}}}}} \end{array}} \right] \) yields the roll angle command signal

$$\begin{aligned} {\phi _{d}} = \arctan \left( {{C_{{\theta _{d}}}}\frac{{{U_{x}}{S_{{\psi _{d}}}} - {U_y}{C_{{\psi _{d}}}}}}{{{U_{z}}}}} \right) \end{aligned}$$
(45)

Then, an actual position controller is designed as

$$\begin{aligned} U_1=\frac{U_{z}}{C_{\phi _{d}} C_{\theta _{d}}} \end{aligned}$$
(46)
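The extraction of the attitude commands and thrust from the virtual control input, Eqs. (44)-(46), amounts to a few arctangent evaluations. The following sketch (function name assumed) recovers \(\theta_d\), \(\phi_d\), and \(U_1\), under the stated assumption \(\theta_d, \phi_d \in (-\pi/2, \pi/2)\) and assuming \(U_z \ne 0\):

```python
import math

def attitude_commands(U_x, U_y, U_z, psi_d):
    """Recover the attitude commands and thrust from the virtual control
    input via Eqs. (44)-(46); valid for theta_d, phi_d in (-pi/2, pi/2)."""
    c_psi, s_psi = math.cos(psi_d), math.sin(psi_d)
    theta_d = math.atan((U_x * c_psi + U_y * s_psi) / U_z)                    # Eq. (44)
    phi_d = math.atan(math.cos(theta_d) * (U_x * s_psi - U_y * c_psi) / U_z)  # Eq. (45)
    U_1 = U_z / (math.cos(phi_d) * math.cos(theta_d))                         # Eq. (46)
    return theta_d, phi_d, U_1
```

A round-trip check against Eqs. (37)-(39) confirms that these commands reproduce the Euler angles that generated a given \([U_x, U_y, U_z]^T\).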

Attitude tracking controller design

The attitude control subsystem, functioning as the inner loop, is designed to manage attitude stabilization via the internal control law. It is tasked with following the angular commands \(\theta _{d}\) and \(\phi _{d}\), which are produced by the outer loop. The dynamics of the attitude subsystem are governed by Eq. (2). To accurately track the intermediary command signals \({\Theta }_{d}\), the design of the control input torque vector \({U_2}\) is crucial. Additionally, the equation must incorporate the complexities of model uncertainties and external non-structured disturbance torques, which results in Eq. (2) being reformulated as follows:

$$\begin{aligned} J_0 \ddot{\Theta }+C_0 \dot{\Theta }+J_{\Delta } \ddot{\Theta }+C_{\Delta } \dot{\Theta }-U_2-d_{\Gamma }=0 \end{aligned}$$
(47)

where

$$\begin{aligned} J=J_0+J_{\Delta }, \quad {C}={C}_0+{C}_{\Delta } \end{aligned}$$

The lumped disturbance is accordingly defined as

$$\begin{aligned} d_1=d_{\Gamma }-J_{\Delta } \ddot{\Theta }-C_{\Delta } \dot{\Theta } \end{aligned}$$
(48)

where \(\left\| {{d_1}} \right\| \le {D_1}\) for some constant \(D_1>0\).

The dynamic model of attitude subsystem can be further written as

$$\begin{aligned} J_0 \ddot{\Theta }+C_0 \dot{\Theta }=U_2+d_1 \end{aligned}$$
(49)

The tracking error of the attitude subsystem is defined as follows

$$\begin{aligned} \Theta _e= & {} \Theta -\Theta _d \end{aligned}$$
(50)
$$\begin{aligned} \dot{\Theta }_r= & {} \dot{\Theta }_d-\lambda _2 \Theta _e \end{aligned}$$
(51)

where, \(\Theta =\left[ \begin{array}{lll}\phi&\theta&\psi \end{array}\right] ^T, \quad \Theta _d=\left[ \begin{array}{lll}\phi _d&\theta _d&\psi _d\end{array}\right] ^T \).

Then, a sliding mode function is introduced as

$$\begin{aligned} \sigma _2=\dot{\Theta }-\dot{\Theta }_r=\dot{\Theta }_e+\lambda _2 \Theta _e, \lambda _2>0 \end{aligned}$$
(52)

Therefore, the controller for the attitude subsystem is designed as

$$\begin{aligned} U_2=J_0 \ddot{\Theta }_r+C_0 \dot{\Theta }-c_2 \sigma _2-\eta _2 {\text {sgn}}\left( \sigma _2\right) \end{aligned}$$
(53)

where \(c_2>0\) and the switching gain satisfies \(\eta _2 \geqslant D_1\). Under the attitude control law given in Eq. (53), tracking of the angle commands \(\theta _d\) and \(\phi _d\) generated by the outer loop can be achieved.
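A brief sketch of the attitude control torque of Eq. (53). As in the simulations reported later, the sign function is replaced here by a saturation function with boundary-layer thickness 0.30 to attenuate chattering; the function signature is an assumption, and the default gains follow the simulation values \(c_2=15\), \(\eta_2=0.25\).

```python
import numpy as np

def attitude_control(theta_ddot_r, theta_dot, sigma2, J0, C0,
                     c2=15.0, eta2=0.25, boundary=0.30):
    """Attitude control torque of Eq. (53), with sgn replaced by a
    saturation function (boundary layer 0.30) to attenuate chattering."""
    sat = np.clip(sigma2 / boundary, -1.0, 1.0)
    return J0 @ theta_ddot_r + C0 @ theta_dot - c2 * sigma2 - eta2 * sat
```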

Proof

Select the following Lyapunov function

$$\begin{aligned} V_2=\frac{1}{2} {\sigma }_2^{\textrm{T}} {J}_0 {\sigma }_2 \end{aligned}$$
(54)

If we take the derivative of Eq. (54) with respect to time, we get:

$$\begin{aligned} \dot{V}_2=\sigma _2^{T} J_0 \dot{\sigma }_2=\sigma _2^{T}\left( U_2+{d}_1-{C}_0 \dot{{\Theta }}-{J}_0 \ddot{{\Theta }}_{r}\right) \end{aligned}$$
(55)

Substituting Eq. (53) into Eq. (55), it can be obtained

$$\begin{aligned} \dot{V}_2=-c_2 {\sigma }_2^{T} {\sigma }_{2}-\eta _{2}\left\| {\sigma }_{2}\right\| +{\sigma }_2^{T} {d}_1 \leqslant -c_{2} {\sigma }_2^{T} {\sigma }_{2} \leqslant -\frac{2 c_2}{\lambda _{\max }\left( J_0\right) } V_{2} \end{aligned}$$
(56)

This implies that the attitude error subsystem is exponentially stable, meaning that \({\Theta }_{e}\) converges to zero exponentially.

When considering the complete closed-loop system, which encompasses both the attitude subsystem and the position subsystem, we select the Lyapunov function for the entire closed-loop system to be

$$\begin{aligned} V_{total}=V_1+V_2 \end{aligned}$$
(57)

Then, defining \(c_2^{\prime }=2 c_2 / \lambda _{\max }\left( J_0\right) >0\),

$$\begin{aligned} \dot{V_{total}} \leqslant -c_1 \sigma _1^{\textrm{T}} \sigma _1-c_2^{\prime } V_2 \leqslant 0 \end{aligned}$$
(58)

Proof complete. \(\square \)

Remark 6

In the formulation of the control law Eq. (53), it is crucial to compute the first and second derivatives of the two intermediate command signals, \(\theta _{\textrm{d}}\) and \(\phi _{\textrm{d}}\). To accomplish this, one can utilize a third-order finite-time sliding-mode differentiator. This method ensures accurate and robust differentiation even in the presence of system noise and disturbances, as detailed in Reference55 and Reference56.

$$\begin{aligned} \left\{ \begin{array}{l} \dot{x}_1=x_2 \\ \dot{x}_2=x_3 \\ \varepsilon ^3 \dot{x}_3=-2^{3 / 5} \cdot 4\left( x_1-v(t)+\left( \varepsilon x_2\right) ^{9 / 7}\right) ^{1 / 3}-4\left( \varepsilon ^2 x_3\right) ^{3 / 5} \\ y_1=x_2, \quad y_2=x_3 \end{array}\right. \end{aligned}$$
(59)

where \(v(t)\) is the input signal to be differentiated and \(\varepsilon = 0.04\). The state \(x_1\) tracks the signal, \(x_2\) estimates its first derivative, and \(x_3\) estimates its second derivative, with the initial values of the differentiator set as \(x_1(0) = 0\), \(x_2(0) = 0\), and \(x_3(0) = 0\). Since this differentiator employs an integral chain structure, when differentiating noisy signals in practice the noise enters only the final layer of the differentiator, which allows more effective noise suppression through the integral action on the first derivative of the signal.
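One integration step of the differentiator in Eq. (59) can be sketched with a forward-Euler scheme. This is an illustrative discretization, not the authors' implementation: the signed fractional power \(|a|^p\,\mathrm{sgn}(a)\) is assumed for the odd-root terms, and the step size is left to the caller.

```python
def differentiator_step(x, v_t, dt, eps=0.04):
    """One forward-Euler step of the third-order sliding-mode
    differentiator of Eq. (59). x = [x1, x2, x3]; x2 estimates the first
    derivative of v(t) and x3 the second."""
    x1, x2, x3 = x
    # signed fractional power |a|^p * sgn(a), used for the odd-root terms
    spow = lambda a, p: abs(a) ** p * (1.0 if a >= 0 else -1.0)
    x3_dot = (-2 ** 0.6 * 4.0 * spow(x1 - v_t + spow(eps * x2, 9 / 7), 1 / 3)
              - 4.0 * spow(eps ** 2 * x3, 0.6)) / eps ** 3
    return [x1 + x2 * dt, x2 + x3 * dt, x3 + x3_dot * dt]
```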

Simulation examples

A comparison of two different disturbance situations

This manuscript addresses the trajectory tracking control issue for an underactuated QUAV by introducing a robust dual-loop sliding mode tracking control method. To demonstrate the effectiveness and capability of the proposed control method to reject disturbances, this section provides a comparative analysis across different types of disturbances, including constant and time-varying disturbances. Additionally, the performance of the open-loop system and the robustness of the proposed control strategy are validated through MATLAB/Simulink simulations. The complete diagram of the QUAV closed-loop simulation utilizing MATLAB’s S-function is illustrated in Fig. 6.

Figure 6. The diagram of closed-loop simulation of QUAV based on MATLAB S-function.

Time-varying disturbance situation

In this section, MATLAB/Simulink is utilized to perform numerical simulations to verify the efficacy of our control approach under conditions of time-varying disturbances. Throughout the simulation, the QUAV is instructed to perform complex spiral ascent maneuvers. We set the target reference position for tracking as

$$\begin{aligned} {P}_{d}=\left[ 0.5 \cos \left( \frac{t}{2}\right) , 0.5 \sin \left( \frac{t}{2}\right) , 2+\frac{t}{10}\right] \end{aligned}$$
(60)

During the simulation, the reference yaw angle is maintained constant at \(\psi _{d} = \frac{\pi }{3}\). The actual inertia matrix is specified as \({I} = \text {diag}(0.0038, 0.0038, 0.008)\), and the distance between rotors is set to \(l= 0.5\,\mathrm{m}\). The simulation is conducted over a period of 60 s, with the mass m being adjusted every 20 s. The dynamic response of the mass over time is detailed as follows:

$$\begin{aligned} m= {\left\{ \begin{array}{ll}3.5 \mathrm {~kg}, &{} 0 \le t<20 \mathrm {~s} \\ 1.8 \mathrm {~kg}, &{} 20 \mathrm {~s} \le t<40 \mathrm {~s} \\ 0.9 \mathrm {~kg}, &{} 40 \mathrm {~s} \le t \le 60 \mathrm {~s}\end{array}\right. } \end{aligned}$$
(61)
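For reference, the spiral-ascent trajectory of Eq. (60) and the mass schedule of Eq. (61) can be encoded directly (the function names are assumptions):

```python
import math

def reference_position(t):
    """Spiral-ascent reference trajectory of Eq. (60)."""
    return [0.5 * math.cos(t / 2), 0.5 * math.sin(t / 2), 2.0 + t / 10.0]

def mass(t):
    """Piecewise-constant mass schedule of Eq. (61), in kg."""
    if t < 20.0:
        return 3.5
    if t < 40.0:
        return 1.8
    return 0.9
```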

During the flight of the QUAV, it encounters slowly varying external aerodynamic disturbance forces and time-varying disturbance torques. These disturbances are characterized as follows:

$$\begin{aligned} {d}_{F}= & {} [0.15 \sin (0.15 \pi t), 0.15 \cos (0.15 \pi t), 0.15 \cos (0.15 \pi t)] \end{aligned}$$
(62)
$$\begin{aligned} {d}_{\Gamma }= & {} [0.25 \sin (0.15 \pi t)+0.15, 0.35 \cos (0.25 \pi t)+0.255, 0.55 \sin (0.15 \pi t)+0.2] \end{aligned}$$
(63)

The tracking controller parameters are set as \(c_1=4\), \(\lambda _1=\left[ \begin{array}{lll}3 &{} 0 &{} 0 \\ 0 &{} 3 &{} 0 \\ 0 &{} 0 &{} 3\end{array}\right] \), \(\gamma _1=0.55\), \(\gamma _2=0.15\). The attitude subsystem controller parameters are defined as \(c_2=15\), \(\eta _2=0.25\), \(\lambda _2=\left[ \begin{array}{lll}30 &{} 0 &{} 0 \\ 0 &{} 30 &{} 0 \\ 0 &{} 0 &{} 30\end{array}\right] \). Besides, in the switching control, the sign function \({\text {sgn}}(\varvec{\sigma }_2)\) is replaced by a saturation function \({\text {sat}}(\sigma _2)\) with a boundary layer thickness of 0.30. The dynamic model in Eq. (1) reveals a direct relationship between the mass, m, and the dynamic changes in position. Since the mass m changes every 20 seconds, the position tracking error exhibits a discontinuity every 20 seconds during the simulation. In the presence of time-varying disturbance, Figs. 7, 8, 9, 10, 11, 12 display the three-dimensional position tracking and its tracking error, attitude angle tracking, control input, mass m, and its adaptive estimation.

PSO-LSTM experiment and result analysis

Simulation test and training data

In this paper, we employ a network model based on the PSO-LSTM framework. The model utilizes data derived from the position error of a VTOL system controlled using the method described in this paper, where the error data are computed as the square root of the squared difference between the ideal and actual positions, i.e., the magnitude of the position error. The initial seventy percent of this error dataset is allocated for training the model, while the remaining thirty percent serves as the test sample for predictive comparison and model validation.

Predictive performance evaluation

The mean absolute error (MAE), also called mean absolute deviation, averages the absolute differences between the predicted and actual values. The corresponding formula of MAE is as follows

$$\begin{aligned} M A E=\frac{1}{n} \sum _{i=1}^n\left| \hat{y}_i-y_i\right| \end{aligned}$$
(64)

The mean absolute percentage error (MAPE) is an improvement on MAE that avoids the influence of the data range by expressing the error between the actual value and the prediction as a percentage. The MAPE calculation formula is

$$\begin{aligned} M A P E=\frac{100 \%}{n} \sum _{i=1}^n\left| \frac{\hat{y}_i-y_i}{y_i}\right| \end{aligned}$$
(65)

The root mean square error (RMSE) is the square root of the mean squared deviation between the predicted and actual values over the n observations. The RMSE calculation formula is

$$\begin{aligned} R M S E=\sqrt{\frac{1}{n} \sum _{i=1}^n\left( \hat{y}_i-y_i\right) ^2} \end{aligned}$$
(66)

where \(y_i\) and \(\hat{y}_i\) represent the actual and predicted values of the VTOL position error, respectively.
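The three metrics of Eqs. (64)-(66) can be computed directly from the actual and predicted error sequences. A minimal sketch follows (function names are assumptions, and MAPE presumes no zero actual values):

```python
import math

def mae(y, y_hat):
    """Mean absolute error, Eq. (64)."""
    return sum(abs(p - a) for a, p in zip(y, y_hat)) / len(y)

def mape(y, y_hat):
    """Mean absolute percentage error, Eq. (65); assumes no zero actuals."""
    return 100.0 / len(y) * sum(abs((p - a) / a) for a, p in zip(y, y_hat))

def rmse(y, y_hat):
    """Root mean square error, Eq. (66)."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(y, y_hat)) / len(y))
```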

Figure 7. The three-dimensional effect diagram of position tracking under time-varying disturbance.

Figure 8. The response diagram of position tracking error under time-varying disturbance.

Figure 9. The response diagram of attitude angle tracking under time-varying disturbance.

Figure 10. The response of control input under time-varying disturbance.

Figure 11. The response of mass m and its adaptive estimation under time-varying disturbance.

Figure 12. The response of disturbance torque and its adaptive estimation under time-varying disturbance.

Constant disturbance situation

In this section, we modify the disturbance scenario by assuming that the disturbance remains constant. Under this assumption, and while keeping other simulation conditions the same, we further validate the anti-disturbance capabilities of the proposed method. The specific constant disturbance applied in this section is detailed as follows:

$$\begin{aligned} {d}_{F}=[0.35, 0.35, 0.45], {d}_{\Gamma }=[0.25, 0.35, 0.15] \end{aligned}$$

Under constant disturbance, position and attitude trajectory tracking and its tracking error, mass m and its adaptive estimation are shown in Figs. 13, 14, 15, 16.

Figure 13. The three-dimensional effect diagram of position tracking under constant disturbance.

Figure 14. The response diagram of position tracking error under constant disturbance.

Figure 15. The response diagram of attitude angle tracking under constant disturbance.

Figure 16. The response of mass m and its adaptive estimation under constant disturbance.

Figure 17. A fitting forecast display chart of all samples.

Figure 18. The diagram of training set fitting effect.

Figure 19. The prediction of training set of position error data and its error.

Figure 20. The diagram of testing set fitting effect.

Figure 21. The prediction of testing set of position error data and its error.

The three-dimensional position tracking of the QUAV is shown in Fig. 7, where the red solid line represents the expected trajectory and the blue dashed line the actual position. Figure 8 shows the corresponding position tracking error under time-varying disturbance. From Figs. 7 and 8, it is evident that, even in the presence of time-varying disturbances, the control method described in this paper achieves precise tracking of the position trajectory. Attitude angle tracking of the QUAV under time-varying disturbance is shown in Fig. 9, from which it can be seen that the attitude angles accurately track the ideal signals. The response of the control input under time-varying disturbance is shown in Fig. 10, which demonstrates that the control input remains stable under the proposed method. The response of the mass m and its adaptive estimation under time-varying disturbance is shown in Fig. 11; as the dynamic model in Eq. (1) indicates, the mass m is directly related to the dynamic change of position, and because m changes every 20 seconds, the position tracking error jumps every 20 seconds in the simulation. Furthermore, the response of the disturbance torque and its adaptive estimation under time-varying disturbance is depicted in Fig. 12. From Figs. 11 and 12, it is evident that \(\sigma _1\) converges to zero, but \(\tilde{d}_F\) and \(\tilde{m}\) do not converge to zero, which aligns with the analysis provided earlier.

As in Figs. 7, 8, 9 and 11, the three-dimensional position tracking and its error under a constant disturbance are depicted in Figs. 13 and 14. From these two figures, it is evident that, under the control method described in this paper, the QUAV’s position can be accurately tracked, irrespective of whether the disturbance is time-varying or constant. Figure 15 depicts the tracking of the attitude angle in the presence of a constant disturbance, and Fig. 16 shows the corresponding adaptive estimation of the mass m. Observing these figures, it is evident that the control method proposed in this paper effectively tracks the attitude angle even in the presence of a constant disturbance. In addition, the mass m can be estimated adaptively under the step jump every 20 seconds.

Figure 17 shows a sample fitting graph for the overall QUAV position error data. The root mean square error (RMSE) for the linear predictive fitting of the entire dataset is 0.72696, as observed in Fig. 17. Figures 18 and 19 depict the data fitting graphs for two different network models (LSTM and PSO-LSTM), showcasing the training set error prediction and corresponding error graphs. Figure 18 highlights that the RMSE for the PSO-LSTM model is significantly lower, indicating superior predictive performance compared to the traditional LSTM model. Additionally, Fig. 19 demonstrates that although both models are capable of performing the prediction task, the PSO-LSTM model exhibits markedly better predictive performance and effectively addresses the issue of data overfitting. Furthermore, Figs. 20 and 21 illustrate the fitting effect of the testing set. Figure 20 shows that the RMSE of the test set data using the PSO-LSTM model is lower than that obtained with the LSTM model, indicating better performance of the PSO-LSTM model on the test set. Figure 21 presents the prediction display for the QUAV position error test set under both network models (PSO-LSTM and LSTM). It is evident from Fig. 21 that the PSO-LSTM-based network model demonstrates better prediction performance and greater robustness compared to the traditional LSTM model, making it more effective for this prediction task.

Conclusion

This paper addresses the high-precision tracking control problem of QUAVs under external disturbances using a dual-layer robust sliding mode control strategy. Additionally, a data-driven approach based on deep learning is employed to construct a PSO-LSTM neural network model for the analysis and prediction of VTOL position error data. The incorporation of PSO enables global optimization in the search space, facilitating the acceleration of the network training convergence process. This capability aids in better tuning network parameters, enhancing the overall performance of the model. Furthermore, the PSO-LSTM model may exhibit superior generalization capabilities, improving its predictive performance on new data. The numerical simulation based on MATLAB further verifies the effectiveness and robustness of the proposed control method and algorithm model.

Future research endeavors should continue building upon this foundation to further enhance the capabilities, versatility, and safety of QUAV systems across various applications, fostering continuous innovation in the field.