Introduction

Fractional calculus (FC) was introduced over 300 years ago. FC has been widely used in control science1,2, biology3, optics4, mathematics5 and other fields. Compared with integer order calculus, fractional order models established using FC can more precisely describe dynamic change processes in real physical systems6,7, such as the relaxation and creep of viscoelastic materials8, temperature diffusion9, the spread of viruses10, the consumption of battery energy11, and heat conduction in the human head12.

Fractional order system identification uses observed input and output data to estimate the unknown parameters and the fractional order of a system, enabling the establishment of precise mathematical models. This step is essential for controller design. Therefore, scholars have conducted substantial research on fractional order system identification methods. Dai et al.13 used modulation functions and numerical approximation methods to process fractional-order differentials and combined them with the recursive least-squares method to estimate system parameters. Aguilar et al.14 extended the integer order neural network to a fractional order neural network and successfully applied it to fractional order system identification. Victor et al.15 proposed a long memory recursive prediction error method that simultaneously identified model parameters and fractional order. Tian et al.16 estimated the parameters of a system through the least-squares method, combined with the gradient descent algorithm to identify the fractional order. Li et al.17 used Haar wavelets to describe input and output signals and converted the fractional order system into an integral equation. They introduced an optimization method to solve the integral equation and obtained the system parameters and orders. Gao18 studied a reduced-order Kalman filter to mitigate the impact of measurement noise on system parameter estimation accuracy. Galvao et al.19 converted a fractional-order system into a cubic equation using exponentially modulated signals and determined the fractional orders and system parameters by solving the equation. Wang et al.20 extended the frequency domain subspace parameter estimation method to fractional order systems. Djouambi et al.21 used recursive least-squares and recursive auxiliary variable methods for system identification.
Wang et al.22 developed a wavelet integration operational matrix method, using wavelet decomposition and reconstruction to improve the estimation accuracy of system parameters and order coefficients. Li et al.23 proposed a gradient descent method based on a forgetting factor to estimate system parameters, but this method could not identify fractional orders. Marzougui et al.24 combined the recursive least-squares and Levenberg–Marquardt algorithms to estimate the parameters and fractional order of a system. Zhang et al.25 used block pulse functions and the Gauss–Newton method to identify system parameters. Zhang et al.26 converted a system identification problem into a nonlinear least-squares optimization problem and iteratively solved the optimization equation based on the sensitivity function. Moghaddam27 combined an evolutionary method and the least-squares algorithm to estimate the parameters and fractional order of a system. Additionally, several intelligent optimization algorithms have been used in fractional order system identification, such as the genetic algorithm28, improved differential evolution algorithm29, and improved quantum bacterial foraging algorithm30.

As the above studies show, the gradient descent method has been widely used in fractional-order system identification because of its broad application range and easy engineering implementation. However, gradient descent algorithms usually need to be combined with other algorithms to identify the fractional order and the system parameters separately. Moreover, integer order gradient methods suffer from low convergence speed and limited accuracy. Therefore, this study proposes a joint multi-innovation fractional gradient descent identification algorithm.

The main contributions of this study can be summarized as follows:

  • The proposed algorithm uses two fractional gradients to iterate interactively, avoiding the complexity of combining multiple different algorithms for fractional order system identification.

  • The proposed algorithm combines multi-innovation theory and the flexibility of FC to improve the system identification accuracy and identification speed.

  • The effectiveness of the proposed algorithm in engineering applications is verified through an experiment involving the identification of a flexible linkage system.

  • The proposed algorithm can be extended to the identification of fractional order nonlinear systems or fractional order time-delay systems.

The remainder of this paper is organized as follows: Mathematical background and system description are presented in Sect. 2. The joint multi-innovation fractional gradient descent identification algorithm is proposed in Sect. 3. The convergence of the algorithm is analyzed in Sect. 4. A simulation example and an experiment are provided in Sect. 5. Finally, the conclusions are presented in Sect. 6.

Mathematical background and system description

Fractional calculus

FC extends integration and differentiation from integer to arbitrary order31, and its definitions also differ from those of integer calculus. Three definitions of FC are most widely used, each representing fractional operators differently. This paper mainly uses the fractional order operator defined by Grünwald-Letnikov to describe fractional order systems32

$$\Delta^{{\overline{\alpha }}} x(kh) = \frac{1}{{h^{{\overline{\alpha }}} }}\sum\limits_{j = 0}^{k} {( - 1)^{j} } \binom{\overline{\alpha }}{j}x((k - j)h),$$
(1)

where \(\Delta\) denotes the discrete fractional derivation operator, \(\overline{\alpha }\) is the fractional order of the operator, \(x(kh)\) is the value of the function at \(t = kh\), k is the number of sampling instants, h denotes the sampling interval, and the binomial coefficient \(\binom{\overline{\alpha }}{j}\) can be expressed as follows

$$\binom{\overline{\alpha }}{j} = \begin{cases} 1, & \text{for } j = 0, \\ \dfrac{\overline{\alpha }(\overline{\alpha } - 1) \cdots (\overline{\alpha } - j + 1)}{j!}, & \text{for } j > 0. \end{cases}$$
(2)

Define \(\beta (j) = ( - 1)^{j} \binom{\overline{\alpha }}{j}\). Then, Eq. (1) can be rewritten as

$$\Delta^{{\overline{\alpha }}} x(kh) = \frac{1}{{h^{{\overline{\alpha }}} }}\sum\limits_{j = 0}^{k} {\beta (j)} x((k - j)h).$$
(3)

Under the assumption of h = 1, Eq. (3) can be expressed as follows

$$\Delta^{{\overline{\alpha }}} x(k) = \sum\limits_{j = 0}^{k} {\beta (j)} x(k - j).$$
(4)
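As a numerical illustration of Eqs. (2)-(4), the coefficients \(\beta(j)\) can be generated without factorials through the recursion \(\beta(0) = 1\), \(\beta(j) = \beta(j-1)\bigl(1 - (\overline{\alpha}+1)/j\bigr)\). The following sketch (assuming NumPy and h = 1; the function names are our own) computes the Grünwald-Letnikov difference of Eq. (4):

```python
import numpy as np

def gl_coefficients(alpha, k):
    """beta(j) = (-1)^j * C(alpha, j), via the recursion
    beta(0) = 1,  beta(j) = beta(j-1) * (1 - (alpha + 1) / j)."""
    beta = np.empty(k + 1)
    beta[0] = 1.0
    for j in range(1, k + 1):
        beta[j] = beta[j - 1] * (1.0 - (alpha + 1.0) / j)
    return beta

def gl_difference(x, alpha):
    """Delta^alpha x(k) = sum_{j=0}^{k} beta(j) * x(k - j)  (Eq. (4), h = 1)."""
    k = len(x) - 1
    beta = gl_coefficients(alpha, k)
    return sum(beta[j] * x[k - j] for j in range(k + 1))
```

For \(\overline{\alpha} = 1\) the coefficients reduce to \([1, -1, 0, \ldots]\), and the routine returns the ordinary first difference \(x(k) - x(k-1)\).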

Fractional order linear systems

The fractional order linear discrete system is expressed as follows

$$A(z)y(k) = B(z)u(k) + v(k),$$
(5)

where \(y(k)\) and \(u(k)\) are the system output and input, respectively; \(v\left( k \right)\) represents white noise with zero mean and finite variance \(\sigma_{v}^{2}\); and \(k = 1,2, \cdots ,N\) denotes the data length. The polynomials \(A(z)\) and \(B(z)\) of the system can be expressed as

$$A(z) = 1 + a_{1} z^{{ - \alpha_{1} }} + a_{2} z^{{ - \alpha_{2} }} + \cdots + a_{i} z^{{ - \alpha_{i} }} ,$$
(6)
$$B(z) = b_{0} + b_{1} z^{{ - \gamma_{1} }} + b_{2} z^{{ - \gamma_{2} }} + \cdots + b_{j} z^{{ - \gamma_{j} }} ,$$
(7)

where \(z^{ - 1}\) is the fractional backshift operator, \(a_{i}\) and \(b_{j}\) are the polynomial coefficients, and \(\alpha_{i}\) and \(\gamma_{j}\) are the fractional orders of the polynomials.

When the fractional order of the polynomials in Eq. (6) and Eq. (7) becomes \(\alpha_{i} = i - \overline{\alpha },\gamma_{j} = j - \overline{\alpha }\), the system is referred to as a same-dimensional fractional order system; that is

$$y(k) + a_{1} z^{{ - 1 + \overline{\alpha }}} y(k) + \cdots + a_{i} z^{{ - i + \overline{\alpha }}} y(k) = b_{0} u(k) + b_{1} z^{{ - 1 + \overline{\alpha }}} u(k) + \cdots + b_{j} z^{{ - j + \overline{\alpha }}} u(k) + v(k).$$
(8)

According to \(z^{{ - i + \overline{\alpha }}} x(k) = \Delta^{{\overline{\alpha }}} x(k - i)\), Eq. (8) can be rewritten as

$$y(k) + a_{1} \Delta^{{\overline{\alpha }}} y(k - 1) + \cdots + a_{i} \Delta^{{\overline{\alpha }}} y(k - i) = b_{0} u(k) + b_{1} \Delta^{{\overline{\alpha }}} u(k - 1) + \cdots + b_{j} \Delta^{{\overline{\alpha }}} u(k - j) + v(k).$$
(9)

Equation (9) can be expressed as

$$y(k) = \varphi^{{\text{T}}} (k,\overline{\alpha })\theta + v(k),$$
(10)

where \(\varphi^{{\text{T}}} (k,\overline{\alpha })\) and \(\theta\) are the information vector and the parameter vector, respectively, which can be expressed as

$$\varphi (k,\overline{\alpha }) = [ - \Delta^{{\overline{\alpha }}} y\left( {k - 1} \right), - \Delta^{{\overline{\alpha }}} y\left( {k - 2} \right), \cdots , - \Delta^{{\overline{\alpha }}} y\left( {k - i} \right),u\left( k \right),\Delta^{{\overline{\alpha }}} u\left( {k - 1} \right), \cdots ,\Delta^{{\overline{\alpha }}} u\left( {k - j} \right)]^{{\text{T}}} ,$$
(11)
$$\theta = \left[ {a_{1} ,a_{2} , \cdots ,a_{i} ,b_{0} ,b_{1} , \cdots ,b_{j} } \right]^{{\text{T}}} .$$
(12)

This paper considers the same-dimensional fractional-order system in Eq. (10). In the following sections, identification methods are proposed to estimate the unknown fractional order \(\overline{\alpha }\) in Eq. (11) and the unknown parameter vectors in Eq. (12).
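As an illustrative sketch of Eqs. (10)-(12), the information vector \(\varphi(k, \overline{\alpha})\) can be assembled from Grünwald-Letnikov differences of the recorded input and output. The naming below is our own, and lags falling before the start of the record are taken as zero:

```python
import numpy as np

def gl_diff_series(x, alpha):
    """Delta^alpha x(k) for every k (h = 1), using the beta(j) recursion."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    beta = np.empty(n)
    beta[0] = 1.0
    for j in range(1, n):
        beta[j] = beta[j - 1] * (1.0 - (alpha + 1.0) / j)
    return np.array([beta[:k + 1] @ x[k::-1] for k in range(n)])

def info_vector(u, y, k, alpha, i, j):
    """phi(k, alpha) of Eq. (11) for model orders i (output) and j (input);
    lags that fall before the start of the record are taken as zero."""
    dy = gl_diff_series(y, alpha)
    du = gl_diff_series(u, alpha)
    past = lambda s, m: s[k - m] if k - m >= 0 else 0.0
    return np.concatenate((
        [-past(dy, m) for m in range(1, i + 1)],  # -Delta^a y(k-1), ..., -Delta^a y(k-i)
        [u[k]],                                   # b0 term
        [past(du, m) for m in range(1, j + 1)],   # Delta^a u(k-1), ..., Delta^a u(k-j)
    ))
```

The resulting vector has length \(i + j + 1\), matching the parameter vector \(\theta\) of Eq. (12).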

Joint multi-innovation fractional gradient descent identification algorithm

Compared with the integer order system, the fractional order system (10) introduces an unknown fractional order \(\overline{\alpha }\), and a strong coupling exists between the fractional order and each parameter. Inaccurate identification of the fractional order \(\overline{\alpha }\) or parameters will affect the dynamic performance of the system, which increases the complexity of system modelling. Therefore, this paper proposes the joint multi-innovation fractional gradient descent identification algorithm. The algorithm’s identification process is illustrated in Fig. 1.

Fig. 1 Identification process of the joint multi-innovation fractional gradient descent identification algorithm.

The algorithm is divided into two stages: parameter identification and fractional order identification. In the first stage, the fractional gradient of the unknown parameters is established, and the gradient function is continuously iterated according to the length of the observation data to obtain the final parameter identification results. The parameter identification results of the first stage are used as the initial conditions for fractional order identification. In the second stage, the fractional gradient of the system order is established, and the unknown fractional order is also iterated according to the length of the observation data. The fractional order identification results in the second stage are used as the update conditions for parameter estimation. The identification results of the two stages are combined to achieve accurate identification of the fractional order and system parameters through joint iteration.

Identification of fractional order system parameters

This paper leverages the high flexibility of FC to extend the integer gradient into a fractional gradient to enhance algorithm performance. In addition, the use of multi-innovation theory enables the efficient utilization of the observation data information to further improve the identification speed and accuracy of the algorithm.

The output error of the system (10) is defined as

$$e(k,\hat{\alpha }) = y(k) - \varphi^{{\text{T}}} (k,\hat{\alpha })\hat{\theta }(k - 1),$$
(13)

where \(\hat{\alpha }\) and \(\hat{\theta }\) are the estimated values of the fractional order \(\overline{\alpha }\) and system parameters \(\theta\), respectively.

Multi-innovation theory combines current and past data to form a multi-innovation matrix, which is then used to estimate the current unknown information33,34. In multi-innovation theory, \(y(k),\varphi^{{\text{T}}} (k,\hat{\alpha }),e(k,\hat{\alpha })\) and \(v(k)\) are referred to as single innovations. We expand each single innovation into a p-dimensional multi-innovation vector or matrix.

$$Y(p,k) = \left[ {y(k),y(k - 1), \cdots ,y(k - p + 1)} \right]^{{\text{T}}} ,$$
(14)
$$\Phi (p,k,\hat{\alpha }) = \left[ {\varphi (k,\hat{\alpha }),\varphi (k - 1,\hat{\alpha }), \cdots ,\varphi (k - p + 1,\hat{\alpha })} \right],$$
(15)
$$V(p,k) = \left[ {v(k),v(k - 1), \cdots ,v(k - p + 1)} \right]^{{\text{T}}} ,$$
(16)
$$E(p,k,\hat{\alpha }) = \left[ {e(k,\hat{\alpha }),e(k - 1,\hat{\alpha }), \cdots ,e(k - p + 1,\hat{\alpha })} \right]^{{\text{T}}} ,$$
(17)

where p denotes the multi-innovation length. The larger the value of p, the larger the dimension of the matrices formed from past and current data, and the higher the data utilization rate. According to Eqs. (13)-(15), Eq. (17) can be rewritten as

$$\begin{aligned} E(p,k,\hat{\alpha }) & = \left[ {\begin{array}{*{20}c} {y(k) - \varphi^{T} (k,\hat{\alpha })\hat{\theta }(k - 1)} \\ {y(k - 1) - \varphi^{T} (k - 1,\hat{\alpha })\hat{\theta }(k - 1)} \\ \vdots \\ {y(k - p + 1) - \varphi^{T} (k - p + 1,\hat{\alpha })\hat{\theta }(k - 1)} \\ \end{array} } \right] \\ & = Y(p,k) - \Phi^{T} (p,k,\hat{\alpha })\hat{\theta }(k - 1). \\ \end{aligned}$$
(18)
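The multi-innovation quantities of Eqs. (14), (15) and (18) can be assembled as in the following sketch, which assumes the outputs and information vectors have been precomputed (the names are our own):

```python
import numpy as np

def multi_innovation_error(y_all, phi_rows, theta_hat, k, p):
    """E(p,k) = Y(p,k) - Phi^T(p,k) * theta_hat  (Eqs. (14), (15), (18)).

    y_all    : 1-D array of outputs y(0), ..., y(N-1)
    phi_rows : 2-D array whose row m is phi^T(m, alpha_hat)
    """
    idx = list(range(k, k - p, -1))   # k, k-1, ..., k-p+1
    Y = y_all[idx]                    # stacked outputs, Eq. (14)
    Phi_T = phi_rows[idx]             # stacked phi^T rows, Eq. (15)
    return Y - Phi_T @ theta_hat      # innovation vector, Eq. (18)
```

With p = 1 this reduces to the single innovation of Eq. (13).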

According to the obtained multi-innovation matrix, the criterion function of the unknown parameters in Eq. (10) is defined as follows

$$J_{1} (\theta ) = \frac{1}{2}\left\| {Y(p,k) - \Phi^{{\text{T}}} (p,k,\hat{\alpha })\theta } \right\|^{2} .$$
(19)

In Eq. (19), \(\Phi^{{\text{T}}} (p,k,\hat{\alpha })\) contains the unknown fractional order \(\hat{\alpha }\). To minimize \(J_{1} (\theta )\) and obtain the parameter estimate \(\hat{\theta }\), we provide an initial value of the fractional order and iterate using the fractional order gradient

$$\hat{\theta }(k) = \hat{\theta }(k - 1) - \nabla^{\alpha } J_{1} (\hat{\theta }(k - 1)),$$
(20)

where \(\nabla^{\alpha }\) is the fractional order gradient. According to35, the fractional order gradient \(\nabla^{\alpha } f(x)\) of any function \(f(x)\) can be expressed as

$$\nabla^{\alpha } f(x) = \mu \frac{{f^{(1)} (x)}}{\Gamma (2 - \alpha )}(\left| {x - c} \right| + \varepsilon )^{1 - \alpha } ,$$
(21)

where \(0 < \alpha < 2\) is the fractional order of the gradient, \(\varepsilon\) is a small non-negative number, c is the lower integral terminal, \(\mu\) is the step size, and \(\Gamma ( \cdot )\) is the gamma function. According to Eq. (21), Eq. (20) can be rewritten as

$$\begin{aligned} \hat{\theta }(k) & = \hat{\theta }(k - 1) - \nabla^{\alpha } J_{1} (\hat{\theta }(k - 1)) \\ & = \hat{\theta }(k - 1) - \mu \frac{{\nabla J_{1} (\hat{\theta }(k - 1))}}{{\Gamma (2 - \alpha )}}(\left| {\hat{\theta }(k - 1) - \hat{\theta }(k - 2)} \right| + \varepsilon )^{1 - \alpha } \\ & = \hat{\theta }(k - 1) + \mu \frac{{\Phi (p,k,\hat{\alpha })[Y(p,k) - \Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(k - 1)]\Xi (\hat{\theta },\alpha ,k)}}{{\Gamma (2 - \alpha )}}, \end{aligned}$$
(22)

where \(\Xi (\hat{\theta },\alpha ,k) = {\text{diag}}\left\{ {\left[ {\left| {\hat{\theta }_{1} (k - 1) - \hat{\theta }_{1} (k - 2)} \right| + \varepsilon } \right]^{1 - \alpha } ,\left[ {\left| {\hat{\theta }_{2} (k - 1) - \hat{\theta }_{2} (k - 2)} \right| + \varepsilon } \right]^{1 - \alpha } , \cdots ,} \right.\)\(\left. {\left[ {\left| {\hat{\theta }_{l} (k - 1) - \hat{\theta }_{l} (k - 2)} \right| + \varepsilon } \right]^{1 - \alpha } } \right\}\), and l is the number of identification parameters. The step size \(\mu\) can be expressed as

$$\begin{gathered} \mu = 1/r(k),\quad r(k) = \overline{r}(k - 1) + \left\| {\Xi (\hat{\theta },\alpha ,k)\Phi (p,k,\hat{\alpha })} \right\|^{2} , \hfill \\ \overline{r}(k) = \overline{r}(k - 1) + \left\| {\Phi (p,k,\hat{\alpha })} \right\|^{2} ,\quad \overline{r}(0) = 1, \hfill \\ \end{gathered}$$
(23)

where \(\overline{r}(k)\) is the iteration factor of the step size, and \(\left\| \cdot \right\|\) is the L2-norm. Combining Eqs. (18), (22), and (23), we can obtain the joint multi-innovation fractional gradient descent algorithm.

$$\begin{aligned} \hat{\theta }(k) & = \hat{\theta }(k - 1) + \mu \frac{{\Phi (p,k,\hat{\alpha })[Y(p,k) - \Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(k - 1)]\Xi (\hat{\theta },\alpha ,k)}}{{\Gamma (2 - \alpha )}} \\ & = \hat{\theta }(k - 1) + \mu \frac{{\Phi (p,k,\hat{\alpha })E(p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)}}{{\Gamma (2 - \alpha )}} \\ & = \hat{\theta }(k - 1) + \frac{{\Phi (p,k,\hat{\alpha })E(p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)}}{{r(k)\Gamma (2 - \alpha )}}. \end{aligned}$$
(24)

From Eq. (24), the unknown parameters can be identified through repeated iteration over the observation data, and the identification result \(\hat{\theta }(N)\) can be used as the initial state for identifying the fractional order.
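One iteration of the parameter stage, combining Eqs. (18), (23), and (24), might be sketched as follows. Here \(\Phi\) is stored as the \(l \times p\) matrix \([\varphi(k), \ldots, \varphi(k-p+1)]\), the diagonal matrix \(\Xi\) applies the fractional factor elementwise, and the function and variable names are our own assumptions:

```python
import numpy as np
from math import gamma

def fgd_parameter_step(theta_prev, theta_prev2, Phi, Y, alpha, eps, r_bar):
    """One multi-innovation fractional gradient update, Eqs. (18), (23), (24).

    Phi   : l x p matrix [phi(k), ..., phi(k-p+1)]
    Y     : p-vector of stacked outputs
    r_bar : accumulated iteration factor r_bar(k-1) of Eq. (23)
    """
    E = Y - Phi.T @ theta_prev                        # innovation vector, Eq. (18)
    Xi = np.diag((np.abs(theta_prev - theta_prev2) + eps) ** (1.0 - alpha))
    r = r_bar + np.linalg.norm(Xi @ Phi) ** 2         # step size 1/r(k), Eq. (23)
    r_bar_new = r_bar + np.linalg.norm(Phi) ** 2      # iteration factor r_bar(k)
    theta = theta_prev + (Xi @ Phi @ E) / (r * gamma(2.0 - alpha))
    return theta, r_bar_new
```

For \(\alpha = 1\), the diagonal factor \(\Xi\) becomes the identity and \(\Gamma(1) = 1\), so the step reduces to a normalized integer order gradient update.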

Identification of the system fractional order

At present, some studies assume that the fractional order of the system is known and focus solely on estimating the parameters of the system. Others combine two algorithms, with one estimating the system parameters and the other identifying the fractional order. These combined methods increase the identification complexity of the system. Therefore, this study proposes a joint gradient identification algorithm: based on the parameter identification result \(\hat{\theta }(N)\), a fractional gradient with respect to the fractional order is constructed and used to estimate the fractional order of the system. Through the joint iteration of the two gradients, the unknown parameters and the fractional order are identified simultaneously.

The fractional order identification process uses the system parameters identified in the first stage, combines them with the fractional gradient, and iteratively updates the fractional order.

The criterion function of fractional order is defined as

$$J_{2} (\hat{\alpha }) = \frac{1}{2}\left\| {Y(p,k) - \Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(N)} \right\|^{2} .$$
(25)

The fractional gradient of \(J_{2} (\hat{\alpha })\) is expressed as

$$\begin{aligned} \nabla^{\alpha } J_{2} (\hat{\alpha }) & = \mu_{1} \frac{{J_{2}^{(1)} (\hat{\alpha })}}{{\Gamma (2 - \alpha )}}(\left| {\hat{\alpha } - c} \right| + \varepsilon )^{1 - \alpha } \\ & = - \mu_{1} \frac{{\frac{{\partial (\Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(N))}}{{\partial \hat{\alpha }}}[Y(p,k) - \Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(N)]}}{{\Gamma (2 - \alpha )}}(\left| {\hat{\alpha }(k) - \hat{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } , \end{aligned}$$
(26)

where \(\mu_{1}\) is the fractional order step size. According to Eq. (15), \(\Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(N)\) in Eq. (26) can be expanded as

$$\Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(N) = \left[ {\varphi^{{\text{T}}} (k,\hat{\alpha })\hat{\theta }(N),\varphi^{{\text{T}}} (k - 1,\hat{\alpha })\hat{\theta }(N), \cdots ,\varphi^{{\text{T}}} (k - p + 1,\hat{\alpha })\hat{\theta }(N)} \right].$$
(27)

The partial derivative of each inner term \(\varphi^{{\text{T}}} (k,\hat{\alpha })\hat{\theta }(N)\) of Eq. (27) can be approximated as

$$\begin{aligned} \frac{{\partial (\varphi^{{\text{T}}} (k,\hat{\alpha })\hat{\theta }(N))}}{{\partial \hat{\alpha }}} & \approx \frac{{\varphi^{{\text{T}}} (k,\hat{\alpha }(k - 1) + \kappa \hat{\alpha }(k - 1))\hat{\theta }(N) - \varphi^{{\text{T}}} (k,\hat{\alpha }(k - 1))\hat{\theta }(N)}}{{\kappa \hat{\alpha }(k - 1)}} \\ & = \tilde{\varphi }^{{\text{T}}} (k,\hat{\alpha }), \end{aligned}$$
(28)

where \(\kappa \hat{\alpha }(k - 1)\) is a small variation of \(\hat{\alpha }(k - 1)\).
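The relative-perturbation finite difference of Eq. (28) can be sketched as follows, assuming a callable `phi` that returns the information vector for a given order (the names are hypothetical):

```python
import numpy as np

def phi_alpha_sensitivity(phi, k, alpha_prev, theta_hat, kappa=1e-6):
    """Forward-difference estimate of d(phi^T(k, alpha) theta)/d alpha
    at alpha_prev, Eq. (28); the perturbation is kappa * alpha_prev."""
    d = kappa * alpha_prev
    f1 = phi(k, alpha_prev + d) @ theta_hat
    f0 = phi(k, alpha_prev) @ theta_hat
    return (f1 - f0) / d
```

Scaling the perturbation by \(\hat{\alpha}(k-1)\) keeps the variation small relative to the current order estimate.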

The multi-innovation matrix \(\Phi_{{\hat{\alpha }}} (p,k,\hat{\alpha })\) is defined as

$$\Phi_{{\hat{\alpha }}} (p,k,\hat{\alpha }) = [\tilde{\varphi }(k,\hat{\alpha }),\tilde{\varphi }(k - 1,\hat{\alpha }), \cdots ,\tilde{\varphi }(k - p + 1,\hat{\alpha })].$$
(29)

From Eqs. (28) and (29), Eq. (26) can be rewritten as

$$\begin{aligned} \nabla^{\alpha } J_{2} (\hat{\alpha }) & = - \mu_{1} \frac{{\frac{{\partial (\Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(N))}}{{\partial \hat{\alpha }}}[Y(p,k) - \Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(N)]}}{{\Gamma (2 - \alpha )}}(\left| {\hat{\alpha }(k) - \hat{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } \\ & = - \mu_{1} \frac{{\Phi_{{\hat{\alpha }}} (p,k,\hat{\alpha })[Y(p,k) - \Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(N)]}}{{\Gamma (2 - \alpha )}}(\left| {\hat{\alpha }(k) - \hat{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } . \end{aligned}$$
(30)

The fractional order gradient iteration equation is expressed as

$$\begin{aligned} \hat{\alpha }(k) & = \hat{\alpha }(k - 1) - \nabla^{\alpha } J_{2} (\hat{\alpha }(k - 1)) \\ & = \hat{\alpha }(k - 1) + \mu_{1} \frac{{\Phi_{{\hat{\alpha }}} (p,k,\hat{\alpha })[Y(p,k) - \Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(N)]}}{{\Gamma (2 - \alpha )}}(\left| {\hat{\alpha }(k) - \hat{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } . \end{aligned}$$
(31)

The fractional order step size \(\mu_{1}\) is taken as follows

$$\mu_{1} = 1/r_{1} (k),\quad r_{1} (k) = \overline{r}(k - 1) + \left\| {\Phi_{{\hat{\alpha }}} (p,k,\hat{\alpha })(\left| {\hat{\alpha }(k) - \hat{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } } \right\|^{2} .$$
(32)

According to Eqs. (31) and (32), we can derive the joint multi-innovation fractional gradient for identifying the fractional order.

$$\begin{aligned} \hat{\alpha }(k) & = \hat{\alpha }(k - 1) + \mu_{1} \frac{{\Phi_{{\hat{\alpha }}} (p,k,\hat{\alpha })[Y(p,k) - \Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(N)]}}{{\Gamma (2 - \alpha )}}(\left| {\hat{\alpha }(k) - \hat{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } \\ & = \hat{\alpha }(k - 1) + \frac{{\Phi_{{\hat{\alpha }}} (p,k,\hat{\alpha })[Y(p,k) - \Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(N)]}}{{r_{1} (k)\Gamma (2 - \alpha )}}(\left| {\hat{\alpha }(k) - \hat{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } . \end{aligned}$$
(33)

The fractional order can be identified using Eq. (33). The identification result \(\hat{\alpha }(N)\) is used as the update condition for parameter estimation. Through interactive iteration, joint identification of the parameters and the fractional order is achieved. The steps of the joint multi-innovation fractional gradient descent identification algorithm are summarized below.
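To illustrate only the two-stage interaction, the toy sketch below alternates a fractional gradient step in the parameter with a finite-difference gradient step in the order on the deliberately simple scalar model \(y(k) = \theta k^{\alpha}\); it is not the full system of Eq. (10), and all names, values, and the choice of gradient order \(a_g = 1\) (the integer case) are our own assumptions:

```python
import numpy as np
from math import gamma

# Toy alternating identification on y(k) = theta * k**alpha (a stand-in
# model, not Eq. (10)): stage 1 takes a fractional gradient step in theta
# at fixed alpha; stage 2 takes a finite-difference gradient step in alpha
# at fixed theta; the two stages iterate jointly.
true_theta, true_alpha = 2.0, 0.7
k_grid = np.arange(1.0, 51.0)
y = true_theta * k_grid ** true_alpha

theta, alpha = 1.0, 1.0            # initial guesses
theta_old, alpha_old = 0.9, 0.9    # previous iterates for the |.|^(1-a) factor
a_g, eps, kappa = 1.0, 1e-3, 1e-6  # gradient order (a_g = 1: integer case)

for _ in range(500):
    # Stage 1: normalized fractional gradient step in theta (cf. Eq. (24))
    phi = k_grid ** alpha
    e = y - theta * phi
    xi = (abs(theta - theta_old) + eps) ** (1.0 - a_g)
    r = 1.0 + xi ** 2 * (phi @ phi)
    theta_old, theta = theta, theta + xi * (phi @ e) / (r * gamma(2.0 - a_g))
    # Stage 2: order step using the finite difference of Eq. (28)
    d = kappa * alpha
    phi_a = theta * (k_grid ** (alpha + d) - k_grid ** alpha) / d
    e = y - theta * k_grid ** alpha
    xi = (abs(alpha - alpha_old) + eps) ** (1.0 - a_g)
    r1 = 1.0 + xi ** 2 * (phi_a @ phi_a)
    alpha_old, alpha = alpha, alpha + xi * (phi_a @ e) / (r1 * gamma(2.0 - a_g))
```

With the normalized step sizes, the parameter stage essentially re-fits \(\theta\) at the current order, while the order stage takes small damped steps toward the true order, mirroring the interactive iteration described above.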

figure a

Convergence analysis

Convergence analysis is an important basis for algorithm stability and reliability. Therefore, the convergence of the joint multi-innovation fractional gradient descent identification algorithm is analyzed in this section. Convergence analysis mainly includes two parts, namely parameter convergence and fractional order convergence. To prove the convergence of the algorithm, some lemmas are required.

Lemma 1

36 For the fractional order system (10) and the joint multi-innovation fractional gradient descent identification algorithm, there exist constants \(0 < \chi \le \rho < \infty\) such that the following strong persistent excitation condition holds.

$$\chi I_{n} \le \frac{1}{N}\sum\limits_{i = 0}^{N - 1} {\varphi (k + i,\hat{\alpha })} \varphi^{{\text{T}}} (k + i,\hat{\alpha }) \le \rho I_{n} ,$$
(34)

Then, \(\overline{r}(k)\) in Eq. (23) satisfies the inequality

$$n\chi (k - N + 1) \le \overline{r}(k) \le n\rho (k + N - 1) + 1.$$
(35)

Lemma 2

36 Under the assumption that the noise signal \(\left\{ {v(k)} \right\}\) is an independent random signal with \({\mathbb{E}}\left[ {v(k)} \right] = 0\) and \({\mathbb{E}}\left[ {v(k)^{2} } \right] = \sigma_{v}^{2}\), the information vector \(\varphi (k,\hat{\alpha })\) of the system satisfies the persistent excitation condition for any fractional order \(\hat{\alpha }\).

Lemma 3

37 Under the assumption that the non-negative sequences \(\left\{ {x(t)} \right\},\left\{ {a_{t} } \right\}\) and \(\left\{ {\beta_{t} } \right\}\) satisfy \(x(t + 1) \le (1 - a_{t} )x(t) + \beta_{t}\) with \(a_{t} \in [0,1)\), \(\sum\nolimits_{t = 1}^{\infty } {a_{t} } = \infty\) and \(x(0) < \infty\), then \(\limsup_{t \to \infty } x(t) \le \limsup_{t \to \infty } \frac{{\beta_{t} }}{{a_{t} }}\).

Lemma 4

38 Let the non-negative random variables \(T(n),\beta (n)\) and \(\alpha (n)\) be measurable with respect to a non-decreasing sequence of \(\sigma\)-algebras \(F(n - 1)\), with \(\sum\nolimits_{n = 1}^{\infty } {\alpha (n)} < \infty\). Suppose the following inequality is satisfied.

$${\mathbb{E}}\left[ {T(n)\left| {F(n - 1)} \right.} \right] \le T(n - 1) + \alpha (n) - \beta (n),$$
(36)

Then, \(\sum\nolimits_{n = 1}^{\infty } {\beta (n)} < \infty\), and \(T(n)\) converges almost surely to a finite non-negative random variable \(T\).

Convergence of identification parameters

The parameter estimation error vector is defined as follows

$$\tilde{\theta }(k) = \hat{\theta }(k) - \theta .$$
(37)

Combining Eq. (10) with Eqs. (14)-(16), the multi-innovation form of the fractional-order system is obtained.

$$Y(p,k) = \Phi^{{\text{T}}} (p,k,\hat{\alpha })\theta + V(p,k).$$
(38)

According to Eq. (38), by subtracting \(\theta\) from both sides of Eq. (24), we can obtain

$$\begin{aligned} \tilde{\theta }(k) & = \tilde{\theta }(k - 1) + \frac{{\Phi (p,k,\hat{\alpha })E(p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)}}{{r(k)\Gamma (2 - \alpha )}} \\ & = \tilde{\theta }(k - 1) + \frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)}}{{r(k)\Gamma (2 - \alpha )}}[Y(p,k) - \Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(k - 1)] \\ & = \tilde{\theta }(k - 1) + \frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)}}{{r(k)\Gamma (2 - \alpha )}}[\Phi^{{\text{T}}} (p,k,\hat{\alpha })\theta + V(p,k) - \Phi^{{\text{T}}} (p,k,\hat{\alpha })\hat{\theta }(k - 1)] \\ & = \tilde{\theta }(k - 1) + \frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)}}{{r(k)\Gamma (2 - \alpha )}}[ - \Phi^{{\text{T}}} (p,k,\hat{\alpha })\tilde{\theta }(k - 1) + V(p,k)] \\ & = \left[ {I_{n} - \frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)\Phi^{{\text{T}}} (p,k,\hat{\alpha })}}{{r(k)\Gamma (2 - \alpha )}}} \right]\tilde{\theta }(k - 1) + \frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)V(p,k)}}{{r(k)\Gamma (2 - \alpha )}}. \end{aligned}$$
(39)

Taking the squared norm of both sides of Eq. (39) gives

$$\begin{aligned} \left\| {\tilde{\theta }(k)} \right\|^{2} & = \left\| {\left[ {I_{n} - \frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)\Phi^{{\text{T}}} (p,k,\hat{\alpha })}}{{r(k)\Gamma (2 - \alpha )}}} \right]\tilde{\theta }(k - 1)} \right\|^{2} \\ & \quad + 2\tilde{\theta }^{{\text{T}}} (k - 1)\left[ {I_{n} - \frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)\Phi^{{\text{T}}} (p,k,\hat{\alpha })}}{{r(k)\Gamma (2 - \alpha )}}} \right]\frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)V(p,k)}}{{r(k)\Gamma (2 - \alpha )}} \\ & \quad + \left\| {\frac{{\Xi (\hat{\theta },\alpha ,k)\Phi (p,k,\hat{\alpha })V(p,k)}}{{r(k)\Gamma (2 - \alpha )}}} \right\|^{2} \\ & \le \lambda_{\max } \left[ {I_{n} - \frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)\Phi^{{\text{T}}} (p,k,\hat{\alpha })}}{{r(k)\Gamma (2 - \alpha )}}} \right]\left\| {\tilde{\theta }(k - 1)} \right\|^{2} \\ & \quad + 2\tilde{\theta }^{{\text{T}}} (k - 1)\left[ {I_{n} - \frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)\Phi^{{\text{T}}} (p,k,\hat{\alpha })}}{{r(k)\Gamma (2 - \alpha )}}} \right]\frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)V(p,k)}}{{r(k)\Gamma (2 - \alpha )}} \\ & \quad + \left\| {\frac{{\Xi (\hat{\theta },\alpha ,k)\Phi (p,k,\hat{\alpha })V(p,k)}}{{r(k)\Gamma (2 - \alpha )}}} \right\|^{2} . \end{aligned}$$
(40)

According to the discussion in Ref. 39 and the definition of \(\Xi (\hat{\theta },\alpha ,k)\) in Eq. (22), the inequality \(0 < \hat{\theta }_{i} (k - 1) - \hat{\theta }_{i} (k - 2) < 1\), \(i = 1,2, \cdots ,l\), holds, and therefore

$$\left\{ \begin{gathered} \varepsilon^{{\frac{1 - \alpha }{2}}} < \Xi^{\frac{1}{2}} (\hat{\theta },\alpha ,k) \le (1 + \varepsilon )^{{\frac{1 - \alpha }{2}}} ,\quad 0 < \alpha \le 1, \hfill \\ (1 + \varepsilon )^{{\frac{1 - \alpha }{2}}} < \Xi^{\frac{1}{2}} (\hat{\theta },\alpha ,k) \le \varepsilon^{{\frac{1 - \alpha }{2}}} ,\quad 1 < \alpha \le 2. \hfill \\ \end{gathered} \right.$$
(41)
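The case split in Eq. (41) simply says that \(\Xi\) is bounded by whichever of \(\varepsilon^{1 - \alpha}\) and \((1 + \varepsilon)^{1 - \alpha}\) is larger, and the larger one flips at \(\alpha = 1\) because the sign of the exponent \(1 - \alpha\) flips. A small Python check of this upper bound (an illustration only; the function name is ours):

```python
def xi_upper_bound(alpha, eps):
    """Upper bound max(eps**(1 - alpha), (1 + eps)**(1 - alpha)) for Xi, 0 < alpha < 2."""
    return max(eps ** (1 - alpha), (1 + eps) ** (1 - alpha))

# For alpha <= 1 the exponent 1 - alpha is non-negative, so (1 + eps)^(1-alpha) dominates;
# for alpha > 1 the exponent is negative, so eps^(1-alpha) dominates (assuming eps < 1).
print(xi_upper_bound(0.5, 0.1), xi_upper_bound(1.5, 0.1))
```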

Then, \(\Xi (\hat{\theta },\alpha ,k) = \max \{ \varepsilon^{1 - \alpha } ,\left( {1 + \varepsilon } \right)^{1 - \alpha } \}\) with \(0 < \alpha < 2\). Combining Eq. (41), Lemma 1, and Lemma 2 and considering that p = N, we can obtain

$$\begin{aligned} I_{n} - \frac{{\Xi (\hat{\theta },\alpha ,k)\Phi (p,k,\hat{\alpha })\Phi^{{\text{T}}} (p,k,\hat{\alpha })}}{{r(k)\Gamma (2 - \alpha )}} & = I_{n} - \frac{{\varepsilon^{1 - \alpha } }}{{r(k)\Gamma (2 - \alpha )}}\sum\limits_{i = 0}^{N - 1} {\varphi (k - i,\hat{\alpha })\varphi^{{\text{T}}} (k - i,\hat{\alpha })} \\ & \le \left[ {1 - \frac{{\varepsilon^{1 - \alpha } N\chi }}{{\left[ {\overline{r}(k - 1) + \left\| {\Xi (\hat{\theta },\alpha ,k)\Phi (p,k,\hat{\alpha })} \right\|^{2} } \right]\Gamma (2 - \alpha )}}} \right]I_{n} \\ & \le \left[ {1 - \frac{{\varepsilon^{1 - \alpha } N\chi }}{{\left[ {n\rho (k - N) + 1 + (1 + \varepsilon )^{1 - \alpha } N\rho } \right]\Gamma (2 - \alpha )}}} \right]I_{n} , \end{aligned}$$
(42)
$$\begin{aligned} {\mathbb{E}}\left[ {\left\| {\Xi (\hat{\theta },\alpha ,k)\Phi (p,k,\hat{\alpha })V(p,k)} \right\|^{2} } \right] & \le {\mathbb{E}}\left\{ {\lambda_{\max } \left[ {\Phi (p,k,\hat{\alpha })\Phi^{{\text{T}}} (p,k,\hat{\alpha })} \right]\left\| {\Xi (\hat{\theta },\alpha ,k)} \right\|^{2} \left\| {V(p,k)} \right\|^{2} } \right\} \\ & \le p\rho \,{\mathbb{E}}\left[ {\left\| {\Xi (\hat{\theta },\alpha ,k)} \right\|^{2} \left\| {V(p,k)} \right\|^{2} } \right] = p^{2} \rho (1 + \varepsilon )^{2(1 - \alpha )} \sigma_{v}^{2} \\ & \le N^{2} \rho (1 + \varepsilon )^{2(1 - \alpha )} \sigma_{v}^{2} . \end{aligned}$$
(43)

Applying Eq. (42), Eq. (43), and Lemma 1 and taking the expectation of Eq. (40): given that \(r(k) = \overline{r}(k - 1) + \left\| {\Xi (\hat{\theta },\alpha ,k)\Phi (p,k,\hat{\alpha })} \right\|^{2}\), we have \(r(k) > \overline{r}(k - 1)\). According to Lemma 2, and because \(V(p,k)\) is uncorrelated with \(\tilde{\theta }(k - 1)\), \(\Phi (p,k,\hat{\alpha })\), and \(I_{n} - \frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)\Phi^{{\text{T}}} (p,k,\hat{\alpha })}}{r(k)\Gamma (2 - \alpha )}\), we obtain

$$\begin{aligned} {\mathbb{E}}\left[ {\left\| {\tilde{\theta }(k)} \right\|^{2} } \right] & \le \left[ {1 - \frac{{\varepsilon^{1 - \alpha } N\chi }}{{\left[ {n\rho (k + N) + 1 + (1 + \varepsilon )^{1 - \alpha } N\rho } \right]\Gamma (2 - \alpha )}}} \right]{\mathbb{E}}\left[ {\left\| {\tilde{\theta }(k - 1)} \right\|^{2} } \right] \\ & \quad + 2{\mathbb{E}}\left\{ {\tilde{\theta }^{{\text{T}}} (k - 1)\left[ {I_{n} - \frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)\Phi^{{\text{T}}} (p,k,\hat{\alpha })}}{{r(k)\Gamma (2 - \alpha )}}} \right]\frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)V(p,k)}}{{r(k)\Gamma (2 - \alpha )}}} \right\} \\ & \quad + \left\| {\frac{{\Phi (p,k,\hat{\alpha })\Xi (\hat{\theta },\alpha ,k)V(p,k)}}{{\overline{r}(k - 1)\Gamma (2 - \alpha )}}} \right\|^{2} \\ & \le \left[ {1 - \frac{{\varepsilon^{1 - \alpha } N\chi }}{{\left[ {n\rho (k - N) + 1 + (1 + \varepsilon )^{1 - \alpha } N\rho } \right]\Gamma (2 - \alpha )}}} \right]{\mathbb{E}}\left[ {\left\| {\tilde{\theta }(k - 1)} \right\|^{2} } \right] + \frac{{N^{2} \rho (1 + \varepsilon )^{2(1 - \alpha )} \sigma_{v}^{2} }}{{[n\chi (k - N)]^{2} \Gamma^{2} (2 - \alpha )}}. \end{aligned}$$
(44)

Applying Lemma 3 to Eq. (44), the limit is obtained as

$$\begin{aligned} \mathop {\lim }\limits_{k \to \infty } {\mathbb{E}}\left[ {\left\| {\tilde{\theta }(k)} \right\|^{2} } \right] & \le \mathop {\lim }\limits_{k \to \infty } \frac{{N^{2} \rho (1 + \varepsilon )^{2(1 - \alpha )} \sigma_{v}^{2} }}{{[n\chi (k - N)]^{2} \Gamma^{2} (2 - \alpha )}} \cdot \frac{{\left[ {n\rho (k - N) + 1 + (1 + \varepsilon )^{1 - \alpha } N\rho } \right]\Gamma (2 - \alpha )}}{{\varepsilon^{1 - \alpha } N\chi }} \\ & \le \mathop {\lim }\limits_{k \to \infty } \frac{{N\rho (1 + \varepsilon )^{2(1 - \alpha )} \sigma_{v}^{2} \left[ {n\rho (k - N) + 1 + (1 + \varepsilon )^{1 - \alpha } N\rho } \right]}}{{[n\chi (k - N)]^{2} \Gamma (2 - \alpha )\varepsilon^{1 - \alpha } \chi }} \\ & \sim \frac{{N^{2} \rho^{3} (1 + \varepsilon )^{2(1 - \alpha )} \sigma_{v}^{2} }}{{n\chi^{3} \Gamma (2 - \alpha )}} \cdot \frac{1}{k}. \end{aligned}$$
(45)

When \(k \to \infty\), \({\mathbb{E}}\left[ {\left\| {\tilde{\theta }(k)} \right\|^{2} } \right] \to 0\), so the convergence of the identification parameters is proven. Subsequently, we prove the convergence of the identified fractional order.
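The \(1/k\)-type decay in Eq. (45) can also be observed empirically. The following Python sketch is an illustration only: it runs a plain normalized stochastic-gradient update (not the full multi-innovation fractional algorithm of the paper) on a static linear regression whose true parameters are the same values used in the later numerical example, and records the squared estimation error:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.array([8.0, 5.0, 3.0])   # true parameters (same values as the numerical example)
theta_hat = np.zeros(3)             # initial estimate
r = 1.0                             # cumulative normalizing gain, analogous to r(k)
errs = []
for k in range(1, 20001):
    phi = rng.standard_normal(3)                    # persistently exciting regressor (assumption)
    y = phi @ theta + 0.5 * rng.standard_normal()   # output with sigma_v = 0.5 noise
    r += phi @ phi                                  # gain grows roughly linearly in k
    theta_hat += phi * (y - phi @ theta_hat) / r    # normalized stochastic-gradient correction
    errs.append(float(np.linalg.norm(theta_hat - theta) ** 2))
print(errs[0], errs[-1])
```

The squared error shrinks steadily as data accumulate, consistent with a bound that vanishes as \(k \to \infty\).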

Convergence of the identification fractional order

The fractional order estimation error is defined as follows

$$\tilde{\alpha }(k) = \hat{\alpha }(k) - \overline{\alpha }.$$
(46)

According to Eqs. (38) and (46), subtracting \(\overline{\alpha }\) from both ends of Eq. (33) results in

$$\begin{aligned} \tilde{\alpha }(k) & = \tilde{\alpha }(k - 1) + \frac{{\Phi_{{\tilde{\alpha }}} (p,k,\tilde{\alpha })[Y(p,k) - \Phi^{{\text{T}}} (p,k,\tilde{\alpha })\hat{\theta }(N)]}}{{r_{1} (k)\Gamma (2 - \alpha )}}(\left| {\tilde{\alpha }(k) - \tilde{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } \\ & = \tilde{\alpha }(k - 1) + \frac{{\Phi_{{\tilde{\alpha }}} (p,k,\tilde{\alpha })[\Phi^{{\text{T}}} (p,k,\tilde{\alpha })\theta + V(p,k) - \Phi^{{\text{T}}} (p,k,\tilde{\alpha })\hat{\theta }(N)]}}{{r_{1} (k)\Gamma (2 - \alpha )}}(\left| {\tilde{\alpha }(k) - \tilde{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } \\ & = \tilde{\alpha }(k - 1) - \frac{{\Phi_{{\hat{\alpha }}} (p,k,\hat{\alpha })[\Phi^{{\text{T}}} (p,k,\tilde{\alpha })\tilde{\theta } + V(p,k)]}}{{r_{1} (k)\Gamma (2 - \alpha )}}(\left| {\tilde{\alpha }(k) - \tilde{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } . \end{aligned}$$
(47)

Taking the squared norm of both sides of Eq. (47) gives

$$\begin{aligned} \left\| {\tilde{\alpha }(k)} \right\|^{2} & = \left\| {\tilde{\alpha }(k - 1)} \right\|^{2} - \left( {\frac{{2\Phi_{{\tilde{\alpha }}} (p,k,\tilde{\alpha })[\Phi^{{\text{T}}} (p,k,\tilde{\alpha })\tilde{\theta } + V(p,k)](\left| {\tilde{\alpha }(k) - \tilde{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } }}{{r_{1} (k)\Gamma (2 - \alpha )}}\tilde{\alpha }(k - 1)} \right) \\ & \quad + \left\| {\frac{{\Phi_{{\tilde{\alpha }}} (p,k,\tilde{\alpha })[\Phi^{{\text{T}}} (p,k,\tilde{\alpha })\tilde{\theta } + V(p,k)](\left| {\tilde{\alpha }(k) - \tilde{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } }}{{r_{1} (k)\Gamma (2 - \alpha )}}} \right\|^{2} . \end{aligned}$$
(48)

Taking the expectation of Eq. (48)

$$\begin{aligned} {\mathbb{E}}\left\| {\tilde{\alpha }(k)} \right\|^{2} & = {\mathbb{E}}\left\| {\tilde{\alpha }(k - 1)} \right\|^{2} - {\mathbb{E}}\left( {\frac{{2\Phi_{{\tilde{\alpha }}} (p,k,\tilde{\alpha })[\Phi^{{\text{T}}} (p,k,\tilde{\alpha })\tilde{\theta } + V(p,k)](\left| {\tilde{\alpha }(k) - \tilde{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } }}{{r_{1} (k)\Gamma (2 - \alpha )}}\tilde{\alpha }(k - 1)} \right) \\ & \quad + {\mathbb{E}}\left\| {\frac{{\Phi_{{\tilde{\alpha }}} (p,k,\tilde{\alpha })[\Phi^{{\text{T}}} (p,k,\tilde{\alpha })\tilde{\theta } + V(p,k)](\left| {\tilde{\alpha }(k) - \tilde{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } }}{{r_{1} (k)\Gamma (2 - \alpha )}}} \right\|^{2} . \end{aligned}$$
(49)

Equation (49) can be expressed in the same form as Eq. (36) given that \(r_{1} (k) = \overline{r}(k - 1) + \left\| {\Phi_{{\hat{\alpha }}} (p,k,\hat{\alpha })(\left| {\overline{\alpha }(k) - \overline{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } } \right\|^{2}\); therefore, \(r_{1} (k) > \overline{r}(k - 1)\). According to Lemma 1 and Lemma 2, we have

$$\begin{gathered} {\mathbb{E}}\left( {\frac{{2\Phi_{{\tilde{\alpha }}} (p,k,\tilde{\alpha })[\Phi^{{\text{T}}} (p,k,\tilde{\alpha })\tilde{\theta } + V(p,k)](\left| {\tilde{\alpha }(k) - \tilde{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } }}{{r_{1} (k)\Gamma (2 - \alpha )}}} \right) \hfill \\ \le \frac{{2\Phi_{{\tilde{\alpha }}} (p,k,\tilde{\alpha })[\Phi^{{\text{T}}} (p,k,\tilde{\alpha })\tilde{\theta }](1 + \varepsilon )^{1 - \alpha } }}{{\overline{r}(k - 1)\Gamma (2 - \alpha )}} \hfill \\ \le \frac{{2\Phi_{{\tilde{\alpha }}} (p,k,\tilde{\alpha })[\Phi^{{\text{T}}} (p,k,\tilde{\alpha })\tilde{\theta }](1 + \varepsilon )^{1 - \alpha } }}{[n\chi (k - N)]\Gamma (2 - \alpha )} < \infty , \hfill \\ \end{gathered}$$
(50)

where \(\tilde{\theta }\) is a fixed value and \(\Phi_{{\tilde{\alpha }}} (p,k,\tilde{\alpha })\) and \(\Phi^{{\text{T}}} (p,k,\tilde{\alpha })\) are finite. Therefore, Eq. (50) is convergent as \(k \to \infty\). Similarly, we have

$$\begin{gathered} {\mathbb{E}}\left\| {\frac{{\Phi_{{\tilde{\alpha }}} (p,k,\tilde{\alpha })[\Phi^{{\text{T}}} (p,k,\tilde{\alpha })\tilde{\theta } + V(p,k)](\left| {\tilde{\alpha }(k) - \tilde{\alpha }(k - 1)} \right| + \varepsilon )^{1 - \alpha } }}{{r_{1} (k)\Gamma (2 - \alpha )}}} \right\|^{2} \hfill \\ \le \frac{{(\Phi_{{\hat{\alpha }}} (p,k,\hat{\alpha }))^{2} \left\{ {[\Phi^{{\text{T}}} (p,k,\tilde{\alpha })\tilde{\theta }]^{2} + \sigma_{v}^{2} } \right\}(1 + \varepsilon )^{2(1 - \alpha )} }}{{\overline{r}(k - 1)^{2} \Gamma^{2} (2 - \alpha )}} \hfill \\ \le \frac{{(\Phi_{{\hat{\alpha }}} (p,k,\hat{\alpha }))^{2} \left\{ {[\Phi^{{\text{T}}} (p,k,\tilde{\alpha })\tilde{\theta }]^{2} + \sigma_{v}^{2} } \right\}(1 + \varepsilon )^{2(1 - \alpha )} }}{{[n\chi (k - N)]^{2} \Gamma^{2} (2 - \alpha )}} < \infty . \hfill \\ \end{gathered}$$
(51)

When \(k \to \infty\), Eq. (51) is convergent. Using Lemma 4, we can deduce that \({\mathbb{E}}\left[ {\left\| {\tilde{\alpha }(k)} \right\|} \right]\) converges to a finite random variable. Therefore, the convergence of the identified fractional order is proven.

The comprehensive analysis of Eqs. (45), (50), and (51) confirms the convergence of the joint multi-innovation fractional gradient descent identification algorithm: as k increases, the parameter and fractional order estimates converge, which theoretically establishes the stability and reliability of the algorithm.

Simulation and experiment

This paper presents a numerical example and an experiment on identifying a flexible linkage fractional order system. To verify the superiority of the proposed method, it is compared with the stochastic gradient (SG) descent algorithm and the fractional order stochastic gradient (FOSG) descent algorithm. The effects of the multi-innovation length and the fractional gradient order on the convergence speed and accuracy of the algorithm are analyzed. All simulations and experiments are performed in MATLAB (R2020a) on a computer equipped with an Intel Core i7-12700H CPU and 16 GB of RAM.

The system identification accuracy evaluation index is as follows

$$\delta = \frac{{\left\| {\hat{\theta }(k,\hat{\alpha }) - \theta } \right\|}}{{\left\| \theta \right\|}}.$$
(52)
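In other words, Eq. (52) is the relative 2-norm error of the parameter estimate. A minimal Python helper (the function name is ours):

```python
import numpy as np

def evaluation_index(theta_hat, theta):
    """delta = ||theta_hat - theta|| / ||theta||, the relative error index of Eq. (52)."""
    theta_hat, theta = np.asarray(theta_hat, float), np.asarray(theta, float)
    return float(np.linalg.norm(theta_hat - theta) / np.linalg.norm(theta))

# Example with the true parameters of the numerical example below:
print(evaluation_index([7.9, 5.1, 2.95], [8.0, 5.0, 3.0]))
```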

Numerical simulation

Consider the following fractional order system

$$y(k) = - a_{1} \Delta^{{\overline{\alpha }}} y(k - 1) - a_{2} \Delta^{{\overline{\alpha }}} y(k - 2) + b_{0} u(k) + v\left( k \right).$$
(53)

The parameters to be identified are \(\theta = \left[ {a_{1} ,a_{2} ,b_{0} } \right] = \left[ {8,5,3} \right]\) and \(\overline{\alpha } = 1.5\), and the initial value of \(\overline{\alpha }\) is set to 0.5. The system input \(u(k)\) is a persistent excitation signal, and \(v(k)\) is a white noise sequence with zero mean and variance \(\sigma_{v}^{2} = 0.5^{2}\). The proposed algorithm is compared with the SG and FOSG algorithms. In the proposed algorithm and the FOSG algorithm, the fractional order is set to \(\alpha = 1.2\), and the multi-innovation length is set to p = 5. The identification results are presented in Fig. 2 and Table 1.
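For readers who want to generate data of the form of Eq. (53), the sketch below uses the truncated Grünwald-Letnikov difference \(\Delta^{\alpha} y(k) \approx \sum_{j}(-1)^{j}\binom{\alpha}{j}y(k-j)\) over all available history. This is an illustration under our own assumptions (unit sampling step, full-history truncation, and the discretization details of the paper are not reproduced); the demo run uses small hypothetical coefficients for a numerically stable trajectory rather than the paper's \(\theta = [8, 5, 3]\):

```python
import numpy as np

def gl_weights(alpha, n):
    """First n Grunwald-Letnikov weights w_j = (-1)**j * binom(alpha, j), by recurrence."""
    w = np.empty(n)
    w[0] = 1.0
    for j in range(1, n):
        w[j] = w[j - 1] * (1.0 - (alpha + 1.0) / j)
    return w

def simulate_fo_system(theta, alpha, u, sigma_v=0.1, seed=0):
    """Generate y(k) = -a1*D^alpha y(k-1) - a2*D^alpha y(k-2) + b0*u(k) + v(k),
    with D^alpha approximated by the truncated GL sum over past outputs."""
    a1, a2, b0 = theta
    rng = np.random.default_rng(seed)
    n = len(u)
    w = gl_weights(alpha, n)
    y = np.zeros(n)
    for k in range(n):
        d1 = w[:k] @ y[k - 1::-1] if k >= 1 else 0.0      # GL sum for D^alpha y(k-1)
        d2 = w[:k - 1] @ y[k - 2::-1] if k >= 2 else 0.0  # GL sum for D^alpha y(k-2)
        y[k] = -a1 * d1 - a2 * d2 + b0 * u[k] + sigma_v * rng.standard_normal()
    return y

# Illustrative run with small hypothetical coefficients (not the paper's [8, 5, 3]):
y = simulate_fo_system([0.3, 0.2, 1.0], 0.5, np.ones(200))
```

The input here is a constant step for simplicity; a persistently exciting signal such as a pseudo-random binary sequence would be used for actual identification.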

Fig. 2

Comparison of evaluation indices for different identification algorithms: (a) first joint iteration, (b) second joint iteration, (c) third joint iteration.

Table 1 Comparison of identification results of different algorithms.

Figure 2 shows that compared with the SG and FOSG algorithms, the joint multi-innovation fractional gradient descent identification algorithm exhibits higher convergence speed and accuracy. As the number of joint iterations increases, the identification error gradually decreases. By the third joint iteration, the evaluation index decreases from 0.364 to 0.0025. Table 1 shows that the proposed algorithm has higher identification accuracy, and the identification results are closer to the true values. This improvement is due to two main factors: First, FC has high flexibility, which can accelerate the convergence speed and improve the convergence accuracy of the algorithm. Second, the proposed method combines the multi-innovation theory and applies more data in the system identification process; therefore, the identification accuracy can be further improved. Thus, the superiority of the proposed algorithm is verified.

The multi-innovation length p of the algorithm is a coefficient that can be set flexibly; different values of p correspond to different amounts of observation data used for identification. To analyze the impact of p on the performance of the algorithm, p is set to 1, 3, and 5; the algorithm order is set to \(\alpha = 1.2\); and the identification results are recorded in Figs. 3, 4, 5 and Table 2.
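
The role of p can be sketched as stacking the p most recent innovations and regressors in each update. The following is a minimal Python sketch of one multi-innovation stochastic-gradient style step; the step size and the exact normalisation used in the paper are assumptions, and the function name is illustrative.

```python
import numpy as np

def mi_sg_step(Phi, y, theta_hat, p, k, mu):
    # One multi-innovation update: stack the p most recent regressor rows
    # and innovations e(k - i) = y(k - i) - phi(k - i)^T theta_hat,
    # then move theta_hat along the stacked gradient direction.
    # Phi: (N, d) array of regressor rows; y: (N,) array of outputs.
    rows = np.array([Phi[k - i] for i in range(p)])                 # stacked regressors
    E = np.array([y[k - i] - Phi[k - i] @ theta_hat for i in range(p)])  # stacked innovations
    return theta_hat + mu * rows.T @ E
```

With p = 1 this collapses to a single-innovation stochastic-gradient update, which is why larger p uses more observation data per step.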

Fig. 3

Evaluation index of joint iteration number.

Fig. 4

Comparison of parameter and fractional order identification errors.

Fig. 5

Comparison of the joint iteration process under different multi-innovation lengths: (a) system parameter \(a_{1}\), (b) system parameter \(a_{2}\), (c) system parameter \(b_{0}\), (d) fractional order \(\overline{\alpha }\).

Table 2 Comparison of identification results under different multi-innovation lengths p.

Figure 3 and Table 2 show that as the multi-innovation length increases, the joint iteration error gradually decreases and the system identification accuracy gradually improves. Notably, moving from single innovation (p = 1) to multi-innovation (p = 3) increases the amount of information and its utilization rate, reducing the evaluation index from 0.0083 to 0.0026. As shown in Fig. 4, the system identification error is smallest at p = 5. Figure 5 shows that the joint iteration of the two multi-innovation fractional order gradients accurately identifies the unknown parameters and fractional order of the system. In addition, as the number of joint iterations increases, the identification errors of the system parameters and fractional order gradually decrease, and the estimates approach the true values. Moreover, as shown in Fig. 5d, the multi-innovation (p = 3 or p = 5) algorithm converges faster than the single-innovation (p = 1) algorithm, verifying the effectiveness of multi-innovation theory in the proposed method.

The fractional order \(\alpha\) of the algorithm is also a coefficient that can be set flexibly. The impact of different fractional orders on the performance of the algorithm is examined below: p is set to 5, and the fractional order is set to \(\alpha = 0.5,0.8,1.2,1.5\). The identification results are presented in Fig. 6 and Table 3.

Fig. 6

Comparison of evaluation indices for different fractional orders \(\alpha\): (a) first joint iteration, (b) second joint iteration, (c) third joint iteration.

Table 3 Comparison of identification results under different fractional order \(\alpha\).

Figure 6a shows that in the first joint iteration, the larger the fractional order, the faster the algorithm converges and the smaller the evaluation index. However, in the second (Fig. 6b) and third (Fig. 6c) joint iterations, the algorithm still converges at \(\alpha = 1.5\), but its convergence accuracy is lower than at the other fractional orders. As shown in Table 3, the identification errors of the system parameters and fractional order are largest at \(\alpha = 1.5\). The algorithm exhibits the highest convergence accuracy at \(\alpha = 1.2\), and the accuracy at \(\alpha = 0.8\) remains high. Therefore, the optimal fractional order range of the proposed algorithm is [0.8, 1.2]. Although an excessively large fractional order reduces the convergence accuracy, the algorithm can still identify the system parameters and the fractional order. Therefore, the effectiveness of the proposed algorithm is verified.
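
The sensitivity to \(\alpha\) arises from the fractional gradient step itself. One commonly used Caputo-type form scales the ordinary gradient elementwise by \(\left| \theta_{k} - \theta_{k-1} \right|^{1-\alpha} / \Gamma(2-\alpha)\); a minimal Python sketch under this assumption (the paper's exact update law may differ):

```python
import math
import numpy as np

def frac_gradient_step(theta, theta_prev, grad, mu, alpha, eps=1e-8):
    # One Caputo-type fractional gradient descent step of order alpha.
    # The gradient is scaled elementwise by |theta - theta_prev|^(1-alpha) / Gamma(2-alpha);
    # eps avoids a zero base when consecutive estimates coincide.
    # For alpha = 1 the scaling equals 1, recovering the standard gradient step.
    scale = (np.abs(theta - theta_prev) + eps) ** (1.0 - alpha) / math.gamma(2.0 - alpha)
    return theta - mu * scale * grad
```

Under this form, an order above 1 enlarges the step while the estimate is still moving, which is consistent with the faster early convergence and the reduced final accuracy observed at \(\alpha = 1.5\).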

Experiment

A flexible linkage system, shown in Fig. 7, is utilised for the experiment. The system consists of a flexible linkage, a data acquisition card, a servo motor, an angle sensor, and Quarc real-time control software. The computer generates the voltage signal and applies it to the servo motor through the real-time data acquisition card, driving the flexible linkage to rotate. The deflection angle of the flexible linkage is measured by the angle sensor. Precise control of the servo motor drive voltage enables rapid, accurate, and stable deflection of the flexible linkage.

Fig. 7

Flexible linkage system experimental device.

The flexible linkage system can be expressed as follows

$$y(k) = - a_{1} \Delta^{{\overline{\alpha }}} y(k - 1) - a_{2} \Delta^{{\overline{\alpha }}} y(k - 2) + b_{0} u(k).$$
(54)

In Matlab/Simulink, the system input and output measurement units are established using the Quarc database module. The input is a 1 V voltage signal, and the output is the deflection angle of the flexible linkage. The sampling step is 0.004 s, and the collection period is 5 s. Figure 8 shows the experimental data and the identified system. The identification result is as follows

$$y(k) = 0.4314\Delta^{1.19} y(k - 1) + 0.7621\Delta^{1.19} y(k - 2) - 0.2547u(k).$$
(55)
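
The identified model (55) can be replayed against any recorded input with the same Grünwald–Letnikov recursion used for simulation. A minimal Python sketch, with (55) rewritten in the form of (54) (so \(a_1 = -0.4314\), \(a_2 = -0.7621\), \(b_0 = -0.2547\)); the function names are illustrative:

```python
import numpy as np

def gl_weights(alpha, n):
    # Grünwald–Letnikov weights w_0 = 1, w_j = w_{j-1} * (1 - (alpha + 1) / j).
    w = np.ones(n)
    for j in range(1, n):
        w[j] = w[j - 1] * (1 - (alpha + 1) / j)
    return w

def identified_response(u, alpha=1.19):
    # Response of the identified model (55), written in the form of (54):
    # y(k) = -a1*D^a y(k-1) - a2*D^a y(k-2) + b0*u(k).
    a1, a2, b0 = -0.4314, -0.7621, -0.2547
    N = len(u)
    w = gl_weights(alpha, N)
    y = np.zeros(N)
    for k in range(N):
        d1 = w[:k] @ y[k - 1::-1] if k >= 1 else 0.0
        d2 = w[:k - 1] @ y[k - 2::-1] if k >= 2 else 0.0
        y[k] = -a1 * d1 - a2 * d2 + b0 * u[k]
    return y
```

Feeding in the recorded 1 V step input sampled at 0.004 s would allow a direct comparison of this response with the measured deflection angle.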
Fig. 8

Experimental data and identified system.

Figure 8 demonstrates that the proposed algorithm accurately identifies the fractional order and unknown parameters of the flexible linkage system. The identified fractional order model accurately characterises the dynamic process of the deflection angle of the flexible linkage. Therefore, the flexibility of the joint multi-innovation fractional gradient-descent identification algorithm in practical applications is verified.

Conclusion

This paper proposed a joint multi-innovation fractional gradient descent identification algorithm for fractional order systems. The algorithm identified parameters and fractional orders through the joint iteration of two fractional order gradients. Additionally, multi-innovation theory was applied to extend the joint fractional gradient to a joint multi-innovation fractional gradient. The effectiveness of the algorithm was verified through a simulation example and an experiment. The results indicated that the identification error gradually decreased with each joint iteration. Furthermore, the identification accuracy of the algorithm increased as the multi-innovation length increased. The optimal range for the order value of the joint multi-innovation fractional gradient was [0.8, 1.2], within which the algorithm’s performance was optimal. Finally, the algorithm’s effectiveness and flexibility in practical engineering were confirmed through the identification of a real flexible linkage fractional order system. Overall, the proposed algorithm can accurately and synchronously identify system parameters and fractional order, and it can be extended to the identification of fractional order nonlinear systems or fractional order time-delay systems.