Introduction

System identification aims to establish mathematical models, or certain quantitative relations, that reflect the characteristics of practical plants from collected data1,2. Conventional algorithms, such as the least squares algorithms3,4,5, the gradient descent algorithms6,7,8 and the maximum likelihood algorithms9,10,11,12, generally assume that the system inputs can be measured accurately and that only the output data are corrupted by noise13,14,15. However, this assumption may lead to model errors because it ignores the input noise caused by sensor failures and environmental interference16,17.

For decades, the identification of errors-in-variables (EIV) systems has been an active research topic18,19,20. EIV systems are systems in which the input noise is taken into account as well as the output noise21,22,23. Zheng presented a bias-eliminated recursive least squares algorithm for linear EIV systems24. Söderström surveyed identification approaches for EIV systems and discussed their identifiability under weak assumptions25. Fan et al. proposed an instrumental variable method for EIV systems based on correlation analysis26. In these works, the output disturbances are assumed to be white Gaussian processes.

The external noise can be regarded as a summation of random kinetic energy from multiple sources, and its statistical characteristics have an important influence on parameter estimation27,28,29. The hypothesis of Gaussian noise may be inappropriate in many engineering applications because of sensor malfunctions, data transmission failures and sudden cyberattacks30,31,32. In fact, the injected attack signal of a deception attack can be regarded as non-Gaussian noise33. Bi et al.34 proposed a novel adaptive fuzzy control approach for a class of uncertain nonlinear cyber-physical systems to mitigate the effects of cyberattacks. When non-Gaussian noise or outliers are encountered, the traditional least squares algorithm suffers severe performance deterioration and is no longer optimal35,36. An intuitive remedy is to detect the outliers and run the estimation algorithms on the data set with the detected outliers removed, but it is difficult to guarantee that all outliers are found, especially for nonlinear systems37,38,39,40.

To cope with the identification problem in the presence of outliers, Stojanovic and Nedic constructed a cost function based on a least favorable function and presented a robust recursive estimation algorithm for output error models with time-varying parameters, which was successfully applied to model pneumatic servo drives41. Guo and Zhi developed a generalized hyperbolic secant adaptive filter, in which the disturbance is modeled by alpha-stable noise42. Yang et al. modeled the distribution of outliers by the generalized hyperbolic skew Student’s t distribution and presented an expectation-maximization algorithm for nonlinear state-space systems43.

Although many identification methods exist for linear and nonlinear systems with impulsive noise, to the best of our knowledge the identification of EIV nonlinear systems with impulsive noise has not been fully investigated. Therefore, this paper proposes robust recursive identification schemes for EIV nonlinear systems contaminated by impulsive noise. Since the input nonlinearity contains the unknown input noise, a bias correction scheme is presented to obtain unbiased estimates of the monomial basis functions of the input nonlinearity. Furthermore, a differentiable continuous logarithmic mixed p-norm cost function is formulated and robust recursive estimation algorithms are derived. The main contributions of this paper are as follows.

  • Construct a differentiable continuous logarithmic mixed p-norm cost function for the EIV nonlinear system and optimize this criterion function to resist the influence of impulsive noise and improve the robustness of the parameter estimation.

  • Introduce the bias correction principle to estimate the unknown nonlinear input functions and propose a continuous logarithmic mixed p-norm based robust recursive estimation (CLMpN-RRE) algorithm for the EIV nonlinear system with impulsive noise.

  • Divide the identification model into sub-models and develop a CLMpN-based robust hierarchical estimation (CLMpN-RHE) algorithm to improve the computational efficiency.

The rest of the paper is organized as follows. Section “Problem description” discusses the system model. Section “The continuous logarithmic mixed p-norm based robust recursive estimation algorithm” derives the CLMpN-RRE algorithm based on the continuous logarithmic mixed p-norm criterion. To improve the computational efficiency, Section “The continuous logarithmic mixed p-norm based robust hierarchical estimation algorithm” presents the CLMpN-RHE algorithm by means of model decomposition. Section “Simulation studies” provides two simulation examples to verify the effectiveness of the proposed algorithms. Section “Conclusions” offers some conclusions.

Problem description

Consider the following errors-in-variables nonlinear system:

$$\begin{aligned} A(z)y(k)= & B(z){\bar{\xi }}(k)+v(k), \end{aligned}$$
(1)
$$\begin{aligned} u(k)= & \xi (k)+w(k), \end{aligned}$$
(2)

where u(k) and y(k) are the measured input and output, w(k) and v(k) are stochastic noises with zero means and variances \(\sigma ^2\) and \(\varsigma ^2\), respectively, and \({\bar{\xi }}(k):=f(\xi (k))\) is a nonlinear function of the unmeasurable true input \(\xi (k)\). Various models can be applied to describe the nonlinearity \(f(\xi (k))\). A feasible scheme is to express the nonlinearity as a combination of monomial functions:

$$\begin{aligned} {\bar{\xi }}(k)=f(\xi (k))=\sum _{j=1}^mc_j\xi ^j(k). \end{aligned}$$
(3)

The polynomial functions A(z) and B(z) are defined as:

$$\begin{aligned} A(z):= & 1+a_1z^{-1}+a_2z^{-2}+\cdots +a_{n_a}z^{-n_a},\\ B(z):= & b_0+b_1z^{-1}+b_2z^{-2}+\cdots +b_{n_b}z^{-n_b}, \end{aligned}$$

where \(z^{-1}\xi (k)=\xi (k-1)\). Due to abrupt disturbances and human errors in real systems, the collected data contain some outliers that deviate greatly from the regular measurements. To describe the characteristics of these outliers, w(k) and v(k) are assumed to be impulsive noises.

Let \({\mathbb {R}}^{n}\) be the real vector space composed of the n-dimensional vectors and \({\mathbb {R}}^{m\times n}\) be the real matrix space composed of the matrices with m rows and n columns. Define

$$\begin{aligned} {\varvec{\vartheta }}:= & \left[ \begin{array}{c} {\varvec{\theta }} \\ \varvec{c} \end{array} \right] \in {\mathbb {R}}^{n_0},\quad {\varvec{\varphi }}(k):=\left[ \begin{array}{c} {\varvec{\phi }}(k) \\ {\varvec{\psi }}(k) \end{array} \right] \in {\mathbb {R}}^{n_0},\\ {\varvec{\theta }}:= & \left[ a_1,a_2,\ldots ,a_{n_a},b_1,b_2,\ldots ,b_{n_b}\right] ^{\textrm{T}}\in {\mathbb {R}}^{n_1},\ n_0:=n_a+n_b+m,\ n_1:=n_a+n_b,\\ \varvec{c}:= & [c_1,c_2,\ldots ,c_m]^{\textrm{T}}\in {\mathbb {R}}^{m},\\ {\varvec{\phi }}(k):= & \left[ -y(k-1),-y(k-2),\ldots ,-y(k-n_a), {\bar{\xi }}(k-1),{\bar{\xi }}(k-2),\ldots ,{\bar{\xi }}(k-n_b)\right] ^{\textrm{T}}\in {\mathbb {R}}^{n_1},\\ {\varvec{\psi }}(k):= & [\xi (k),\xi ^2(k),\ldots ,\xi ^m(k)]^{\textrm{T}}\in {\mathbb {R}}^{m}. \end{aligned}$$

Let \(b_0=1\). Inserting (3) into (1) gives

$$\begin{aligned} y(k)= & -\sum _{i=1}^{n_a}a_iy(k-i)+{\bar{\xi }}(k)+\sum _{i=1}^{n_b}b_i{\bar{\xi }}(k-i)+v(k)\nonumber \\= & -\sum _{i=1}^{n_a}a_iy(k-i)+\sum _{i=1}^{n_b}b_i{\bar{\xi }}(k-i)+\sum _{i=1}^{m}c_i\xi ^i(k)+v(k)\nonumber \\= & {\varvec{\phi }}^{\textrm{T}}(k){\varvec{\theta }}+{\varvec{\psi }}^{\textrm{T}}(k)\varvec{c}+v(k) \end{aligned}$$
(4)
$$\begin{aligned}= & {\varvec{\varphi }}^{\textrm{T}}(k){\varvec{\vartheta }}+v(k). \end{aligned}$$
(5)

Based on the sum of squared-error criterion (SSEC), the identification model in (5) can be readily identified. However, SSEC-based algorithms are sensitive to outliers, since squaring the error amplifies large residuals and leads to inaccurate estimation44,45. This paper aims to develop robust recursive algorithms to estimate \({\varvec{\vartheta }}\) from measurements contaminated by impulsive noise.

The continuous logarithmic mixed p-norm based robust recursive estimation algorithm

To exploit the ability of the p-norm to resist outliers, Zayyani jointly considered all p-norms with \(1\leqslant p\leqslant 2\) and proposed the continuous mixed p-norm (CMpN)46, which can be defined as

$$\begin{aligned} J_1({\varvec{\vartheta }}):=\int _{1}^2\lambda _k(p)\textrm{E}\{|e(k)|^p\}\textrm{d}p, \end{aligned}$$

where \(\lambda _k(p)\) is a weight function satisfying \(\int _{1}^2\lambda _k(p)\textrm{d}p=1\), \(e(k):=y(k)-{\varvec{\varphi }}^{\textrm{T}}(k){\varvec{\vartheta }}\in {\mathbb {R}}\) denotes the error and \(\textrm{E}(\cdot )\) denotes the expectation operator. However, the cost function \(J_1({\varvec{\vartheta }})\) is not differentiable at \(e(k)=0\). Moreover, the CMpN-based identification algorithm suffers from stability problems in impulsive noise environments. It has been found that a logarithmic transformation can reduce the effect of impulsive noise and improve the robustness of the parameter estimation47. Thus the term \(\textrm{E}\{|e(k)|^p\}\) in \(J_1({\varvec{\vartheta }})\) is replaced by \(\textrm{E}\{\ln [1+|e(k)|^p]\}\approx \frac{1}{k}\sum _{j=1}^k\ln [1+|e(j)|^p]\), which leads to the continuous logarithmic mixed p-norm criterion, defined by:

$$\begin{aligned} J_2({\varvec{\vartheta }}):=\sum _{j=1}^k\begin{matrix}\displaystyle \int _{1}^2\lambda _j(p)\ln \left[ 1 +\left( \sqrt{\tau _0+e^2(j)}\right) ^p\right] \textrm{d}p\end{matrix}, \end{aligned}$$

where \(e(j):=y(j)-{\varvec{\varphi }}^{\textrm{T}}(j){\varvec{\vartheta }}\) and \(\tau _0>0\) is a small positive number that ensures the differentiability of \(J_2({\varvec{\vartheta }})\). The factor \(\frac{1}{k}\) is dropped in \(J_2({\varvec{\vartheta }})\) because multiplying the objective function by a constant does not change its minimizer. Thus we have

$$\begin{aligned} \frac{\partial J_2({\varvec{\vartheta }})}{\partial {\varvec{\vartheta }}}= & -\sum _{j=1}^k\int _{1}^{2}p\lambda _j(p)\frac{\left( \sqrt{\tau _0+e^2(j)}\right) ^{p- 2}}{1+\left( \sqrt{\tau _0+e^2(j)}\right) ^p}\varvec{\varphi }(j) e(j)\textrm{d}p\nonumber \\= & -\sum _{j=1}^k\zeta (j){\varvec{\varphi }}(j)\left[ y(j)-{\varvec{\varphi }}^{\textrm{T}}(j){\varvec{\vartheta }}\right] =\textbf{0}, \end{aligned}$$
(6)

where

$$\begin{aligned} \zeta (j):=\int _{1}^{2}p\lambda _j(p)\frac{\left( \sqrt{\tau _0+e^2(j)}\right) ^{p-2}}{1+\left( \sqrt{\tau _0+e^2(j)}\right) ^p}\textrm{d}p. \end{aligned}$$

When \(\lambda _k(p)=\frac{1}{p\ln 2}\), the equality constraint \(\int _{1}^2\lambda _k(p)\textrm{d}p=1\) is met and \(\zeta (k)\) can be computed by

$$\begin{aligned} \zeta (k)= & \frac{1}{\ln 2}\int _{1}^{2}\frac{(\tau (k))^{p-2}}{1+(\tau (k))^p}\textrm{d}p\\= & \frac{\ln (1+\tau ^2(k))-\ln (1+\tau (k))}{\ln 2\cdot \ln \tau (k)\cdot \tau ^2(k)}, \end{aligned}$$

where \(\tau (k):=\sqrt{\tau _0+e^2(k)}=\sqrt{\tau _0+(y(k)-{\varvec{\varphi }}^{\textrm{T}}(k){\varvec{\vartheta }})^2}\). Assume that \({\varvec{\varphi }}(k)\) is persistently exciting. From (6), we have

$$\begin{aligned} {\hat{{\varvec{\vartheta }}}}(k)= & \left[ \sum _{j=1}^k\zeta (j){\varvec{\varphi }}(j){\varvec{\varphi }}^{\textrm{T}}(j)\right] ^{-1}\sum _{j=1}^k\zeta (j){\varvec{\varphi }}(j)y(j). \end{aligned}$$
(7)
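The closed form of \(\zeta (k)\) above can be checked against its defining integral numerically. A minimal sketch (the midpoint-rule quadrature and the sample values of \(\tau\) are illustrative choices, not from the paper):

```python
import math

def zeta_closed_form(tau):
    # Closed form from the text: [ln(1+tau^2) - ln(1+tau)] / (ln 2 * ln tau * tau^2)
    return (math.log(1 + tau**2) - math.log(1 + tau)) / (math.log(2) * math.log(tau) * tau**2)

def zeta_numeric(tau, n=20000):
    # Midpoint-rule evaluation of (1/ln 2) * int_1^2 tau^(p-2) / (1 + tau^p) dp
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        p = 1.0 + (i + 0.5) * h
        total += tau**(p - 2) / (1.0 + tau**p)
    return total * h / math.log(2)

# agreement over a range of tau values (tau = 1 is a removable singularity and is skipped)
for tau in (0.5, 1.5, 3.0, 10.0):
    assert abs(zeta_closed_form(tau) - zeta_numeric(tau)) < 1e-6
```

Note that the closed form has a removable singularity at \(\tau (k)=1\), which an implementation should guard against.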

Define

$$\begin{aligned} \varvec{P}^{-1}(k):= & \sum _{j=1}^k\zeta (j){\varvec{\varphi }}(j){\varvec{\varphi }}^{\textrm{T}}(j)=\varvec{P}^{-1}(k-1)+\zeta (k){\varvec{\varphi }}(k){\varvec{\varphi }}^{\textrm{T}}(k),\nonumber \\ {\varvec{\alpha }}(k):= & \sum _{j=1}^k\zeta (j){\varvec{\varphi }}(j)y(j)={\varvec{\alpha }}(k-1)+\zeta (k){\varvec{\varphi }}(k)y(k). \end{aligned}$$
(8)

Applying the matrix inversion formula \((\varvec{A}+\varvec{B}\varvec{C})^{-1}=\varvec{A}^{-1}-\varvec{A}^{-1}\varvec{B}(\varvec{I}+\varvec{C}\varvec{A}^{-1}\varvec{B})^{-1}\varvec{C}\varvec{A}^{-1}\) to (8), we have

$$\begin{aligned} \varvec{P}(k)= & \varvec{P}(k-1)-\frac{\zeta (k)\varvec{P}(k-1){\varvec{\varphi }}(k){\varvec{\varphi }}^{\textrm{T}}(k)\varvec{P}(k-1)}{1+\zeta (k){\varvec{\varphi }}^{\textrm{T}}(k)\varvec{P}(k-1) {\varvec{\varphi }}(k)}\\= & \varvec{P}(k-1)-\varvec{G}(k){\varvec{\varphi }}^{\textrm{T}}(k)\varvec{P}(k-1), \end{aligned}$$

where

$$\begin{aligned} \varvec{G}(k):= & \frac{\zeta (k)\varvec{P}(k-1){\varvec{\varphi }}(k)}{1+\zeta (k){\varvec{\varphi }}^{\textrm{T}} (k)\varvec{P}(k-1){\varvec{\varphi }}(k)}=\zeta (k)\varvec{P}(k){\varvec{\varphi }}(k). \end{aligned}$$

Equation (7) can be written as

$$\begin{aligned} {\hat{{\varvec{\vartheta }}}}(k)= & \varvec{P}(k)[{\varvec{\alpha }}(k-1)+\zeta (k){\varvec{\varphi }}(k)y(k)]\\= & \varvec{P}(k)\left[ \varvec{P}^{-1}(k-1){\hat{{\varvec{\vartheta }}}}(k-1)+\zeta (k){\varvec{\varphi }}(k)y(k)\right] \\= & \varvec{P}(k)\left[ \varvec{P}^{-1}(k)-\zeta (k){\varvec{\varphi }}(k){\varvec{\varphi }}^{\textrm{T}}(k)\right] {\hat{{\varvec{\vartheta }}}}(k-1)+\zeta (k)\varvec{P}(k){\varvec{\varphi }}(k)y(k)\\= & {\hat{{\varvec{\vartheta }}}}(k-1)+\zeta (k)\varvec{P}(k){\varvec{\varphi }}(k)\left[ y(k)-{\varvec{\varphi }}^{\textrm{T}}(k){\hat{{\varvec{\vartheta }}}}(k-1)\right] \\= & {\hat{{\varvec{\vartheta }}}}(k-1)+\varvec{G}(k)[y(k)-{\varvec{\varphi }}^{\textrm{T}}(k){\hat{{\varvec{\vartheta }}}}(k-1)]. \end{aligned}$$

Thus we have the following recursive form:

$$\begin{aligned} {\hat{{\varvec{\vartheta }}}}(k)= & {\hat{{\varvec{\vartheta }}}}(k-1)+\varvec{G}(k)\left[ y(k)-{\varvec{\varphi }}^{\textrm{T}}(k){\hat{{\varvec{\vartheta }}}}(k-1)\right] , \end{aligned}$$
(9)
$$\begin{aligned} \varvec{G}(k)= & \varvec{P}(k-1){\varvec{\varphi }}(k)\left[ \zeta ^{-1}(k)+{\varvec{\varphi }}^{\textrm{T}}(k)\varvec{P}(k-1){\varvec{\varphi }}(k)\right] ^{-1}, \end{aligned}$$
(10)
$$\begin{aligned} \varvec{P}(k)= & \left[ \varvec{I}-\varvec{G}(k){\varvec{\varphi }}^{\textrm{T}}(k)\right] \varvec{P}(k-1). \end{aligned}$$
(11)
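As a concrete illustration, the recursion (9)–(11) can be sketched in a few lines of numpy, with \(\zeta (k)\) computed from the closed form derived above. This is a minimal sketch on a hypothetical two-parameter linear regression (the regressor statistics, noise level and \(\tau _0\) are illustrative assumptions, not the paper's experiment):

```python
import numpy as np

LN2 = np.log(2.0)

def zeta(e, tau0=1e-6):
    """Weight zeta(k) via the closed form above, with tau(k) = sqrt(tau0 + e^2)."""
    tau = np.sqrt(tau0 + e * e)
    if abs(tau - 1.0) < 1e-9:
        tau += 1e-6                               # removable singularity at tau = 1
    return (np.log(1 + tau**2) - np.log(1 + tau)) / (LN2 * np.log(tau) * tau**2)

def clmpn_step(theta, P, phi, y, tau0=1e-6):
    """One pass of the recursion (9)-(11); returns the updated (theta, P)."""
    e = y - phi @ theta                           # innovation
    z = zeta(e, tau0)
    G = P @ phi / (1.0 / z + phi @ P @ phi)       # gain (10)
    theta = theta + G * e                         # parameter update (9)
    P = P - np.outer(G, phi @ P)                  # covariance update (11)
    return theta, P

# toy check on a hypothetical regression y(k) = phi^T(k) theta + small noise
rng = np.random.default_rng(0)
true_theta = np.array([0.5, -0.3])
theta, P = np.zeros(2), 1e6 * np.eye(2)
for _ in range(2000):
    phi = rng.standard_normal(2)
    y = phi @ true_theta + 0.01 * rng.standard_normal()
    theta, P = clmpn_step(theta, P, phi, y)
```

For well-behaved data the update behaves like a weighted recursive least squares in which \(\zeta (k)\) rescales the gain.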

The information vector \({\varvec{\varphi }}(k)\) contains the unknown \(\xi ^i(k)\) and \({\bar{\xi }}(k)\). To implement the algorithm in (9)–(11), their estimates must be computed. From (2), the unmeasurable true input \(\xi (k)\) is related to the collected input u(k). Based on the bias compensation idea, one can assume that the monomial \(\xi ^i(k)\) satisfies

$$\begin{aligned} \xi ^i(k)=\textrm{E}\left[ u^i(k)+r_i(k)\right] ,\ i=1,2,\ldots ,m, \end{aligned}$$
(12)

where \(r_i(k)\) is a correction term between the true input monomial \(\xi ^i(k)\) and the measured input monomial \(u^i(k)\). Inserting (2) into (12) gives

$$\begin{aligned} \xi ^i(k)= & \textrm{E}\left\{ [\xi (k)+w(k)]^i+r_i(k)\right\} \\= & \textrm{E}\left[ \xi ^i(k)+C_i^1\xi ^{i-1}(k)w(k)+C_i^2\xi ^{i-2}(k)w^2(k)+C_i^3\xi ^{i-3}(k)w^3(k) +\cdots +C_i^iw^i(k)+r_i(k)\right] \\= & \xi ^i(k)+C_i^1\textrm{E}\left[ \xi ^{i-1}(k)\right] \textrm{E}[w(k)]+C_i^2\textrm{E}\left[ \xi ^{i-2}(k)\right] \textrm{E}\left[ w^2(k)\right] \\ & +C_i^3\textrm{E}\left[ \xi ^{i-3}(k)\right] \textrm{E}[w^3(k)]+\cdots +C_i^i\textrm{E}[w^i(k)]+\textrm{E}[r_i(k)], \end{aligned}$$

where \(C_i^j\) (\(1\leqslant j\leqslant i\)) denotes the combinatorial number. Thus we have

$$\begin{aligned} \textrm{E}[r_i(k)]= & -C_i^1\textrm{E}\left[ \xi ^{i-1}(k)\right] \textrm{E}[w(k)]-C_i^2\textrm{E}\left[ \xi ^{i-2}(k)\right] \textrm{E}[w^2(k)]\nonumber \\ & \quad -C_i^3\textrm{E}\left[ \xi ^{i-3}(k)\right] \textrm{E}[w^3(k)]-\cdots -C_i^i\textrm{E}\left[ w^i(k)\right] . \end{aligned}$$
(13)

Inserting (13) into (12) yields

$$\begin{aligned} \xi ^i(k)= & \textrm{E}[u^i(k)]-C_i^1\textrm{E}[\xi ^{i-1}(k)]\textrm{E}[w(k)]-C_i^2\textrm{E}[\xi ^{i-2}(k)]\textrm{E}[w^2(k)]-\cdots -C_i^i\textrm{E}[w^i(k)],\ i=1,2,\ldots ,m. \end{aligned}$$
(14)

Note that for a zero-mean Gaussian noise w(k) with variance \(\sigma ^2\), the moments satisfy

$$\begin{aligned} \textrm{E}[w^i(k)]= & \left\{ \begin{array}{ll} 0, & i=2l-1,\\ (i-1)!!\sigma ^i, & i=2l,\ l=1,2,\ldots \end{array} \right. \end{aligned}$$

Taking i in (14) one by one, the monomials \(\xi (k),\xi ^2(k),\cdots ,\xi ^m(k)\) can be estimated recursively by \(u^1(k)+r_1(k),u^2(k)+r_2(k),\cdots ,u^m(k)+r_m(k)\), respectively, e.g.,

$$\begin{aligned} {\hat{\xi }}(k)= & u(k), \end{aligned}$$
(15)
$$\begin{aligned} {\hat{\xi }}^2(k)= & u^2(k)-\sigma ^2, \end{aligned}$$
(16)
$$\begin{aligned} {\hat{\xi }}^3(k)= & u^3(k)-3u(k)\sigma ^2, \end{aligned}$$
(17)
$$\begin{aligned} {\hat{\xi }}^4(k)= & u^4(k)-6u^2(k)\sigma ^2+3\sigma ^4. \end{aligned}$$
(18)
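The corrections (15)–(18) can be verified empirically: averaging the corrected monomials of the noisy input over many realizations should recover the true monomials. A small Monte Carlo sketch with Gaussian input noise (the values of \(\xi\) and \(\sigma\) are arbitrary illustrations):

```python
import numpy as np

def monomial_estimates(u, sigma2):
    """Bias-corrected estimates (15)-(18) of xi^i(k) from the noisy input u(k)."""
    return {
        1: u,
        2: u**2 - sigma2,
        3: u**3 - 3 * u * sigma2,
        4: u**4 - 6 * u**2 * sigma2 + 3 * sigma2**2,
    }

# Monte Carlo check of unbiasedness: u = xi + w, w ~ N(0, sigma^2)
rng = np.random.default_rng(42)
xi, sigma = 0.8, 0.5
u = xi + sigma * rng.standard_normal(500_000)
est = monomial_estimates(u, sigma**2)
means = {i: est[i].mean() for i in (1, 2, 3, 4)}   # should be close to xi**i
```

Without the correction terms, e.g. averaging \(u^2(k)\) directly would overestimate \(\xi ^2(k)\) by \(\sigma ^2\).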

Substituting the obtained \(i-1\) equations into (14), one can obtain the estimate \({\hat{\xi }}^i(k)\) of \(\xi ^i(k)\). From (3), the estimate of the nonlinear input \({\bar{\xi }}(k)\) can be calculated by

$$\begin{aligned} \hat{{\bar{\xi }}}(k)={\hat{c}}_1(k){\hat{\xi }}(k)+{\hat{c}}_2(k){\hat{\xi }}^2(k)+\cdots +{\hat{c}}_m(k){\hat{\xi }}^m(k), \end{aligned}$$

where the estimates \({\hat{\xi }}^i(k)\) and \({\hat{c}}_i(k)\) take the place of the true values \(\xi ^i(k)\) and \(c_i\) in (3). Thus the following recursive relations can be acquired:

$$\begin{aligned} {\hat{{\varvec{\vartheta }}}}(k)= & {\hat{{\varvec{\vartheta }}}}(k-1)+\varvec{G}(k)\left[ y(k)-{\hat{{\varvec{\varphi }}}}^{\textrm{T}}(k){\hat{{\varvec{\vartheta }}}}(k-1)\right] , \end{aligned}$$
(19)
$$\begin{aligned} \varvec{G}(k)= & \varvec{P}(k-1){\hat{{\varvec{\varphi }}}}(k)\left[ {\hat{\zeta }}^{-1}(k)+{\hat{{\varvec{\varphi }}}}^{\textrm{T}}(k)\varvec{P}(k-1){\hat{{\varvec{\varphi }}}}(k)\right] ^{-1}, \end{aligned}$$
(20)
$$\begin{aligned} \varvec{P}(k)= & [\varvec{I}-\varvec{G}(k){\hat{{\varvec{\varphi }}}}^{\textrm{T}}(k)]\varvec{P}(k-1), \end{aligned}$$
(21)
$$\begin{aligned} {\hat{{\varvec{\varphi }}}}(k)= & \big [-y(k-1),-y(k-2),\cdots ,-y(k-n_a),\hat{{\bar{\xi }}}(k-1), \hat{{\bar{\xi }}}(k-2),\cdots ,\hat{{\bar{\xi }}}(k-n_b),\nonumber \\ & {\hat{\xi }}(k),{\hat{\xi }}^2(k),\cdots ,{\hat{\xi }}^m(k)\big ]^{\textrm{T}}, \end{aligned}$$
(22)
$$\begin{aligned} \hat{{\bar{\xi }}}(k)= & \sum _{j=1}^m{\hat{c}}_j(k){\hat{\xi }}^j(k), \end{aligned}$$
(23)
$$\begin{aligned} {\hat{\zeta }}(k)= & \frac{\ln (1+{\hat{\tau }}^2(k))-\ln (1+{\hat{\tau }}(k))}{\ln 2\cdot \ln {\hat{\tau }}(k)\cdot {\hat{\tau }}^2(k)}, \end{aligned}$$
(24)
$$\begin{aligned} {\hat{\tau }}(k)= & \sqrt{\tau _0+[y(k)-{\hat{{\varvec{\varphi }}}}^{\textrm{T}}(k){\hat{{\varvec{\vartheta }}}}(k-1)]^2}. \end{aligned}$$
(25)

Equations (14)–(25) form the continuous logarithmic mixed p-norm based robust recursive estimation (CLMpN-RRE) algorithm for EIV nonlinear system with impulsive noise in (1) and (2).

Remark 1

The gain \(\varvec{G}(k)\) in (20) depends on \({\hat{\zeta }}(k)\), which in turn depends on \({\hat{\tau }}^2(k)\) in the denominator of (24). When an outlier is encountered, the sudden increase of the error \(\epsilon (k):=y(k)-{\hat{{\varvec{\varphi }}}}^{\textrm{T}}(k){\hat{{\varvec{\vartheta }}}}(k-1)\in {\mathbb {R}}\) causes \({\hat{\zeta }}(k)\) to decay rapidly, so the resulting change of \(\varvec{G}(k)\) is negligible. Thus the CLMpN-RRE algorithm is able to resist impulsive noise.

Table 1 The computational efficiency of the CLMpN-RRE algorithm.

The continuous logarithmic mixed p-norm based robust hierarchical estimation algorithm

The CLMpN-RRE algorithm in (14)–(25) can efficiently identify the EIV nonlinear system with impulsive noise. To enhance the computational efficiency of the CLMpN-RRE algorithm, the following derives a CLMpN-based robust hierarchical estimation algorithm.

Let \(y_1(k):=y(k)-{\varvec{\psi }}^{\textrm{T}}(k)\varvec{c}\) and \(y_2(k):=y(k)-{\varvec{\phi }}^{\textrm{T}}(k){\varvec{\theta }}\). From (4), we have

$$\begin{aligned} y_1(k)= & {\varvec{\phi }}^{\textrm{T}}(k){\varvec{\theta }}+v(k), \end{aligned}$$
(26)
$$\begin{aligned} y_2(k)= & {\varvec{\psi }}^{\textrm{T}}(k)\varvec{c}+v(k). \end{aligned}$$
(27)

Equations (26) and (27) are two fictitious identification sub-models. Define the cost functions

$$\begin{aligned} J_3({\varvec{\theta }})= & \sum _{j=1}^k\int _{1}^2\lambda _j(p)\ln \left[ 1+\left( \sqrt{\tau _0+(y_1(j) -{\varvec{\phi }}^{\textrm{T}}(j){\varvec{\theta }})^2}\right) ^p\right] \textrm{d}p,\\ J_4(\varvec{c})= & \sum _{j=1}^k\int _{1}^2\lambda _j(p)\ln \left[ 1+\left( \sqrt{\tau _0+(y_2(j)-{\varvec{\psi }}^{\textrm{T}}(j)\varvec{c})^2}\right) ^p\right] \textrm{d}p. \end{aligned}$$

Similar to the derivation of the CLMpN-RRE algorithm, setting the gradients of the cost functions \(J_3({\varvec{\theta }})\) and \(J_4(\varvec{c})\) with respect to \({\varvec{\theta }}\) and \(\varvec{c}\) to zero vectors, respectively, and using the estimates \({\hat{\xi }}^i(k)\), \(\hat{{\bar{\xi }}}(k)\) and \({\hat{c}}_i(k)\) in place of the unknown \(\xi ^i(k)\), \({\bar{\xi }}(k)\) and \(c_i\) give

$$\begin{aligned} {\hat{{\varvec{\theta }}}}(k)= & {\hat{{\varvec{\theta }}}}(k-1)+\varvec{G}_1(k) \left[ {\hat{y}}_1(k)-{\hat{{\varvec{\phi }}}}^{\textrm{T}}(k){\hat{{\varvec{\theta }}}}(k-1)\right] \nonumber \\= & {\hat{{\varvec{\theta }}}}(k-1)+\varvec{G}_1(k) \left[ y(k)-{\hat{{\varvec{\phi }}}}^{\textrm{T}}(k){\hat{{\varvec{\theta }}}}(k-1) -{\hat{{\varvec{\psi }}}}^{\textrm{T}}(k){\hat{\varvec{c}}}(k-1)\right] \nonumber \\= & {\hat{{\varvec{\theta }}}}(k-1)+\varvec{G}_1(k) \left[ y(k)-{\hat{{\varvec{\varphi }}}}^{\textrm{T}}(k){\hat{{\varvec{\vartheta }}}}(k-1)\right] , \end{aligned}$$
(28)
$$\begin{aligned} \varvec{G}_1(k)= & \varvec{P}_1(k-1){\hat{{\varvec{\phi }}}}(k) \left[ {\hat{\zeta }}^{-1}(k)+{\hat{{\varvec{\phi }}}}^{\textrm{T}}(k)\varvec{P}_1(k-1){\hat{{\varvec{\phi }}}}(k)\right] ^{-1}, \end{aligned}$$
(29)
$$\begin{aligned} \varvec{P}_1(k)= & \left[ \varvec{I}-\varvec{G}_1(k){\hat{{\varvec{\phi }}}}^{\textrm{T}}(k)\right] \varvec{P}_1(k-1), \end{aligned}$$
(30)
$$\begin{aligned} {\hat{\varvec{c}}}(k)= & {\hat{\varvec{c}}}(k-1)+\varvec{G}_2(k)\left[ {\hat{y}}_2(k)-{\hat{{\varvec{\psi }}}}^{\textrm{T}}(k){\hat{\varvec{c}}}(k-1)\right] \nonumber \\= & {\hat{\varvec{c}}}(k-1)+\varvec{G}_2(k)\left[ y(k)-{\hat{{\varvec{\phi }}}}^{\textrm{T}}(k){\hat{{\varvec{\theta }}}}(k-1)-{\hat{{\varvec{\psi }}}}^{\textrm{T}}(k){\hat{\varvec{c}}}(k-1)\right] \nonumber \\= & {\hat{\varvec{c}}}(k-1)+\varvec{G}_2(k)\left[ y(k)-{\hat{{\varvec{\varphi }}}}^{\textrm{T}}(k){\hat{{\varvec{\vartheta }}}}(k-1)\right] , \end{aligned}$$
(31)
$$\begin{aligned} \varvec{G}_2(k)= & \varvec{P}_2(k-1){\hat{{\varvec{\psi }}}}(k)\left[ {\hat{\zeta }}^{-1}(k)+{\hat{{\varvec{\psi }}}}^{\textrm{T}}(k)\varvec{P}_2(k-1){\hat{{\varvec{\psi }}}}(k)\right] ^{-1}, \end{aligned}$$
(32)
$$\begin{aligned} \varvec{P}_2(k)= & \left[ \varvec{I}-\varvec{G}_2(k){\hat{{\varvec{\psi }}}}^{\textrm{T}}(k)\right] \varvec{P}_2(k-1), \end{aligned}$$
(33)
$$\begin{aligned} {\hat{{\varvec{\varphi }}}}(k)= & \left[ \begin{array}{c} {\hat{{\varvec{\phi }}}}(k) \\ {\hat{{\varvec{\psi }}}}(k) \end{array} \right] ,\quad {\hat{{\varvec{\vartheta }}}}(k)=\left[ \begin{array}{c} {\hat{{\varvec{\theta }}}}(k) \\ {\hat{\varvec{c}}}(k) \end{array} \right] , \end{aligned}$$
(34)
$$\begin{aligned} {\hat{{\varvec{\phi }}}}(k)= & \left[ -y(k-1),-y(k-2),\cdots ,-y(k-n_a),\hat{{\bar{\xi }}}(k-1),\hat{{\bar{\xi }}} (k-2),\cdots ,\hat{{\bar{\xi }}}(k-n_b)\right] ^{\textrm{T}}, \end{aligned}$$
(35)
$$\begin{aligned} {\hat{{\varvec{\psi }}}}(k)= & \left[ {\hat{\xi }}(k),{\hat{\xi }}^2(k),\cdots ,{\hat{\xi }}^m(k)\right] ^{\textrm{T}}, \end{aligned}$$
(36)
$$\begin{aligned} \hat{{\bar{\xi }}}(k)= & \sum _{j=1}^m{\hat{c}}_j(k){\hat{\xi }}^j(k), \end{aligned}$$
(37)
$$\begin{aligned} {\hat{\zeta }}(k)= & \frac{\ln (1+{\hat{\tau }}^2(k))-\ln (1+{\hat{\tau }}(k))}{\ln 2\cdot \ln {\hat{\tau }}(k)\cdot {\hat{\tau }}^2(k)}, \end{aligned}$$
(38)
$$\begin{aligned} {\hat{\tau }}(k)= & \sqrt{\tau _0+[y(k)-{\hat{{\varvec{\varphi }}}}^{\textrm{T}}(k){\hat{{\varvec{\vartheta }}}}(k-1)]^2}. \end{aligned}$$
(39)

Equations (14)–(18) and (28)–(39) form the continuous logarithmic mixed p-norm based robust hierarchical estimation (CLMpN-RHE) algorithm for the EIV system with impulsive noise in (1) and  (2).

Table 2 The computational efficiency of the CLMpN-RHE algorithm.

Remark 2

To demonstrate the computational advantage of the CLMpN-RHE algorithm, Tables 1 and 2 list the computational loads of the CLMpN-RRE and CLMpN-RHE algorithms in terms of floating point operations.

When m is an odd number, the difference of the computational cost between two algorithms at each step is

$$\begin{aligned} N_1-N_2:= & \left[ 4n_0^2+6n_0+15+\frac{3}{2}m(m-1)+\frac{m^2-1}{4}\right] -\left[ 4n_1^2+6n_1+16 +\frac{m}{2}(11m+9)+\frac{m^2-1}{4}\right] \\= & 4 \left( n_0^2-n_1^2 \right) +6(n_0-n_1)-1-4m^2-6m\\= & 8n_1m-1>0. \end{aligned}$$

When m is an even number, the difference is

$$\begin{aligned} N_1-N_2:= & \left[ 4n_0^2+6n_0+15+\frac{3}{2}m(m-1)+\frac{m^2}{4}\right] -\left[ 4n_1^2+6n_1 +16+\frac{m}{2}(11m+9)+\frac{m^2}{4}\right] \\= & 8n_1m-1>0. \end{aligned}$$

This indicates that the CLMpN-RHE algorithm has higher computational efficiency than the CLMpN-RRE algorithm. In fact, the computational cost of the CLMpN-RRE algorithm is dominated by the calculations of \(\varvec{G}(k)\) and \(\varvec{P}(k)\), which require approximately \(4n_0^2=4(n_1+m)^2\) operations. Similarly, the corresponding two terms in the CLMpN-RHE algorithm require approximately \(4(n_1^2+m^2)\) operations. Note that \(n_1^2+m^2<(n_1+m)^2\). This explains why the CLMpN-RHE algorithm has a lower computational cost than the CLMpN-RRE algorithm.
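The parity-independent difference \(N_1-N_2=8n_1m-1\) can be double-checked numerically from the per-step cost expressions quoted above from Tables 1 and 2:

```python
def flops_rre(n1, m):
    # Per-step flops of the CLMpN-RRE algorithm (Table 1 expression from the text)
    n0 = n1 + m
    corr = (m * m - 1) / 4 if m % 2 else m * m / 4
    return 4 * n0**2 + 6 * n0 + 15 + 1.5 * m * (m - 1) + corr

def flops_rhe(n1, m):
    # Per-step flops of the CLMpN-RHE algorithm (Table 2 expression from the text)
    corr = (m * m - 1) / 4 if m % 2 else m * m / 4
    return 4 * n1**2 + 6 * n1 + 16 + 0.5 * m * (11 * m + 9) + corr

# the difference is 8*n1*m - 1 for every order pair, odd or even m
for n1 in range(1, 8):
    for m in range(1, 8):
        assert flops_rre(n1, m) - flops_rhe(n1, m) == 8 * n1 * m - 1
```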

In recursive algorithms, the initial values of the parameter vectors to be estimated are usually taken as small real vectors, e.g., \({\hat{{\varvec{\theta }}}}(0)=\textbf{1}_{n_1}/p_0\) and \({\hat{\varvec{c}}}(0)=\textbf{1}_m/p_0\), where \(\textbf{1}_m\) denotes an m-dimensional column vector whose elements are all 1 (i.e., \(\textbf{1}_m=[1,1,\cdots ,1]^{\textrm{T}}\in {\mathbb {R}}^m\)) and \(p_0\) is a large number, e.g., \(p_0=10^6\). The initial values \(\varvec{P}_1^{-1}(0)\) and \(\varvec{P}_2^{-1}(0)\) should be taken as small positive definite matrices, e.g., \(\varvec{P}_1^{-1}(0)=\varvec{I}_{n_1}/p_0\) and \(\varvec{P}_2^{-1}(0)=\varvec{I}_m/p_0\), where \(\varvec{I}_m\) denotes the \(m\times m\) identity matrix; in other words, \(\varvec{P}_1(0)=p_0\varvec{I}_{n_1}\) and \(\varvec{P}_2(0)=p_0\varvec{I}_m\). The unknown variables are initialized to zero for \(k\leqslant 0\), i.e., \(\hat{{\bar{\xi }}}(k)=0\) (\(k\leqslant 0\)). These initial values may affect the early parameter estimates, but their overall effect is negligible.

The steps of implementing the CLMpN-RHE algorithm (14)–(18) and (28)–(39) are as follows.

  1. Let \(k=1\), set \({\hat{{\varvec{\theta }}}}(0)=\textbf{1}_{n_1}/p_0\), \({\hat{\varvec{c}}}(0)=\textbf{1}_m/p_0\), \(\hat{{\bar{\xi }}}(k)=0\) (\(k\leqslant 0\)), \(\tau _0=p_0\), \(\varvec{P}_1(0)=p_0\varvec{I}_{n_1}\), \(\varvec{P}_2(0)=p_0\varvec{I}_m\), where \(p_0=10^6\). Give the data length L.

  2. Record the measurements u(k) and y(k).

  3. Compute \({\hat{\xi }}(k),{\hat{\xi }}^2(k),{\hat{\xi }}^3(k),\cdots ,{\hat{\xi }}^m(k)\) using (14)–(18). Compute \(\hat{{\bar{\xi }}}(k)\) using (37).

  4. Construct \({\hat{{\varvec{\varphi }}}}(k)\), \({\hat{{\varvec{\phi }}}}(k)\) and \({\hat{{\varvec{\psi }}}}(k)\) using (34)–(36). Compute \({\hat{\zeta }}(k)\) and \({\hat{\tau }}(k)\) using (38) and (39).

  5. Compute \(\varvec{G}_1(k)\) using (29) and \(\varvec{P}_1(k)\) using (30). Update \({\hat{{\varvec{\theta }}}}(k)\) using (28).

  6. Compute \(\varvec{G}_2(k)\) using (32) and \(\varvec{P}_2(k)\) using (33). Update \({\hat{\varvec{c}}}(k)\) using (31).

  7. Construct \({\hat{{\varvec{\vartheta }}}}(k)\) using (34).

  8. If \(k<L\), increase k by 1 and go to Step 2; otherwise, stop and obtain the parameter estimation vector \({\hat{{\varvec{\vartheta }}}}(L)\).
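The steps above can be sketched end to end in numpy. The system below is a hypothetical first-order example with a quadratic input nonlinearity (\(m=2\)), run with Gaussian noise for repeatability; it is an illustration of the recursion under assumed parameter values, not a reproduction of the paper's experiments:

```python
import numpy as np

LN2 = np.log(2.0)

def zeta_hat(e, tau0=1e-4):
    # Weight (38) computed from the innovation e(k)
    tau = np.sqrt(tau0 + e * e)
    if abs(tau - 1.0) < 1e-9:
        tau += 1e-6                                 # removable singularity at tau = 1
    return (np.log(1 + tau**2) - np.log(1 + tau)) / (LN2 * np.log(tau) * tau**2)

def clmpn_rhe(u, y, na, nb, sigma2, p0=1e6, tau0=1e-4):
    """Sketch of the CLMpN-RHE recursion (28)-(39) for m = 2 monomials."""
    n1, m = na + nb, 2
    theta, c = np.ones(n1) / p0, np.ones(m) / p0    # small initial vectors
    P1, P2 = p0 * np.eye(n1), p0 * np.eye(m)
    L = len(y)
    xibar = np.zeros(L)                             # \hat{\bar\xi}(k), zero for k <= 0
    for k in range(L):
        psi = np.array([u[k], u[k]**2 - sigma2])    # bias-corrected monomials (15)-(16)
        phi = np.array([-y[k - i] if k >= i else 0.0 for i in range(1, na + 1)]
                       + [xibar[k - i] if k >= i else 0.0 for i in range(1, nb + 1)])
        e = y[k] - phi @ theta - psi @ c            # common innovation of (28) and (31)
        z = zeta_hat(e, tau0)
        G1 = P1 @ phi / (1.0 / z + phi @ P1 @ phi)  # (29)
        theta = theta + G1 * e                      # (28)
        P1 = P1 - np.outer(G1, phi @ P1)            # (30)
        G2 = P2 @ psi / (1.0 / z + psi @ P2 @ psi)  # (32)
        c = c + G2 * e                              # (31)
        P2 = P2 - np.outer(G2, psi @ P2)            # (33)
        xibar[k] = c @ psi                          # (37)
    return theta, c

# Hypothetical system: A(z) = 1 - 0.5 z^-1, B(z) = 1 + 0.3 z^-1, f(xi) = 0.8 xi + 0.2 xi^2
rng = np.random.default_rng(7)
N, sigma_w = 8000, 0.1
xi = rng.standard_normal(N)
fxi = 0.8 * xi + 0.2 * xi**2
u = xi + sigma_w * rng.standard_normal(N)           # noisy measured input (2)
y = np.zeros(N)
for k in range(N):
    y[k] = (0.5 * y[k - 1] if k >= 1 else 0.0) + fxi[k] \
           + (0.3 * fxi[k - 1] if k >= 1 else 0.0) + 0.05 * rng.standard_normal()
theta_hat, c_hat = clmpn_rhe(u, y, na=1, nb=1, sigma2=sigma_w**2)
```

Here \(\sigma ^2\) is assumed known, as in the paper; with SαS noise in place of the Gaussian v(k), the same recursion applies and the \({\hat{\zeta }}(k)\) weight suppresses the outliers.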

The parameter estimation algorithm presented in this paper can be combined with other estimation strategies48,49,50,51,52,53,54,55 to model dynamic systems, and can be applied to industrial process systems56,57,58,59,60,61,62,63,64,65 and manufacturing systems. The flowchart of identifying the errors-in-variables nonlinear systems with impulsive noise using the CLMpN-RHE algorithm is shown in Fig. 1.

Fig. 1

The flowchart of the CLMpN-RHE algorithm for errors-in-variables nonlinear systems with impulsive noise.

Simulation studies

Example 1

Consider the following EIV nonlinear system with impulsive noise,

$$\begin{aligned} & \left( 1-1.60z^{-1}+1.00z^{-2}\right) y(k)=\left( 1+2.10z^{-1}+2.40z^{-2}\right) {\bar{\xi }}(k)+v(k),\\ & \quad {\bar{\xi }}(k)=0.45\xi (k)+0.10\xi ^2(k),\\ & \quad u(k)=\xi (k)+w(k),\\ & \quad {\varvec{\vartheta }}=[a_1,a_2,b_1,b_2,c_1,c_2]^{\textrm{T}}=[-1.60,1.00,2.10,2.40,0.45,0.10]^{\textrm{T}}, \end{aligned}$$

where the nonlinearity \({\bar{\xi }}(k)\) is a quadratic polynomial function. In the simulation, the true input \(\xi (k)\) is taken as a realization of a persistently exciting signal, and w(k) and v(k) are chosen as symmetric \(\alpha\)-stable (S\(\alpha\)S) impulsive noises with the characteristic function:

$$\begin{aligned} g_{\tau }(t)=\exp \{-\tau |t|^{\alpha }\}, \end{aligned}$$

where the shape coefficient satisfies \(0<\alpha \leqslant 2\) and the dispersion coefficient satisfies \(0<\tau \leqslant 1\). The S\(\alpha\)S noise becomes heavier-tailed, with larger-amplitude impulses, as \(\alpha\) decreases. When \(\alpha =2\), it reduces to the Gaussian distribution.
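The paper does not specify how the S\(\alpha\)S sequences are generated; one standard choice is the Chambers–Mallows–Stuck transform, sketched below (the mapping of the dispersion \(\tau\) to the scale \(\gamma =\tau ^{1/\alpha }\) follows the characteristic function above):

```python
import numpy as np

def sas_noise(alpha, tau, size, rng):
    """Symmetric alpha-stable samples with characteristic function exp(-tau |t|^alpha)."""
    V = rng.uniform(-np.pi / 2, np.pi / 2, size)    # uniform angle
    W = rng.exponential(1.0, size)                  # unit exponential
    if abs(alpha - 1.0) < 1e-12:
        X = np.tan(V)                               # alpha = 1: standard Cauchy
    else:
        X = (np.sin(alpha * V) / np.cos(V)**(1.0 / alpha)
             * (np.cos((1.0 - alpha) * V) / W)**((1.0 - alpha) / alpha))
    return tau**(1.0 / alpha) * X                   # scale by gamma = tau^(1/alpha)

rng = np.random.default_rng(1)
x = sas_noise(1.6, 1.0, 100_000, rng)   # heavy-tailed impulsive sequence, as in Fig. 2
g = sas_noise(2.0, 1.0, 100_000, rng)   # alpha = 2 reduces to N(0, 2*tau)
```

For \(\alpha =2\) the transform collapses to \(2\sin V\sqrt{W}\), a Gaussian with variance \(2\tau\), which matches the limiting case noted above.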

Figure 2 depicts the S\(\alpha\)S impulsive noise process for \(\alpha =1.6\). To assess the performance of the CLMpN-RRE algorithm, Fig. 3 shows the CLMpN-RRE estimates and errors, where the parameter estimation error is defined by the root mean square relative error \(\delta :=\Vert {\hat{{\varvec{\vartheta }}}}(k)-{\varvec{\vartheta }}\Vert /\Vert {\varvec{\vartheta }}\Vert \times 100\%\). It can be seen that the estimation error of the CLMpN-RRE algorithm decreases monotonically as k increases, yielding satisfactory estimates.

Fig. 2

The impulsive noise versus k for Example 1.

Fig. 3

The CLMpN-RRE estimation errors versus k for Example 1.

To test the influence of the shape parameter \(\alpha\) on the CLMpN-RRE algorithm, Fig. 4 shows the CLMpN-RRE estimates and errors for \(\alpha\) equal to 1.2, 1.6 and 2.0. The error curves decline in all three cases, and the CLMpN-RRE algorithm converges fastest under Gaussian noise (\(\alpha =2\)), which shows that it is robust to both impulsive and Gaussian disturbances.

Fig. 4

The CLMpN-RRE estimation errors \(\delta\) versus k under different \(\alpha\) for Example 1 (\(\sigma ^2=0.30^2\)).

Example 2

Consider the following EIV nonlinear system with impulsive noise,

$$\begin{aligned} & \left( 1-1.30z^{-1}+0.90z^{-2}\right) y(k)=\left( 1+0.75z^{-1}+0.40z^{-2}+1.80z^{-3}\right) {\bar{\xi }}(k)+v(k),\\ & \quad {\bar{\xi }}(k)=0.90\xi (k)+0.20\xi ^2(k)+0.10\xi ^3(k),\\ & \quad u(k)=\xi (k)+w(k),\\ & \quad {\varvec{\vartheta }}=[a_1,a_2,b_1,b_2,b_3,c_1,c_2,c_3]^{\textrm{T}}=[-1.30,0.90,0.75,0.40,1.80,0.90,0.20,0.10]^{\textrm{T}}, \end{aligned}$$

where the nonlinearity \({\bar{\xi }}(k)\) is a cubic polynomial function. Under simulation conditions similar to those of Example 1, Fig. 5 compares the CLMpN-RRE and CLMpN-RHE estimates. The CLMpN-RRE algorithm converges slightly faster than the CLMpN-RHE algorithm, but the CLMpN-RHE algorithm has higher computational efficiency, as shown in Tables 1 and 2.

Fig. 5

The CLMpN-RHE and CLMpN-RRE errors versus k for Example 2.

To examine the sensitivity of the estimation results to variations in the parameter values, each parameter is randomly perturbed with an amplitude of 0.1 around its true value, and Monte Carlo simulations of the CLMpN-RHE algorithm with 50 different realizations are conducted. The mean values and variances of the CLMpN-RHE estimates and errors are shown in Table 3. As can be seen, when the parameter values change, the CLMpN-RHE estimates remain close to their true values, which indicates that the estimates are not sensitive to parameter variations.

Table 3 The mean values and variances of the CLMpN-RHE estimates and errors for Example 2.

To validate the superiority of the proposed algorithm, Fig. 6 compares the estimation errors of the CLMpN-RHE algorithm and the bias-compensated recursive least squares (BC-RLS) algorithm18. As can be seen, the CLMpN-RHE algorithm achieves higher parameter estimation accuracy than the BC-RLS algorithm in the non-Gaussian noise environment.

Fig. 6

The comparison of the estimation errors of the CLMpN-RHE algorithm and the BC-RLS algorithm for Example 2.

Figure 7 illustrates the performance of the CLMpN-RHE algorithm by comparing the predicted outputs with the actual measurements from \(k=4001\) to \(k=4200\). The model outputs are consistent with the actual ones, showing that the CLMpN-RHE algorithm gives accurate predictions.

Fig. 7

The CLMpN-RHE prediction and errors for Example 2.

Conclusions

This paper addresses the identification problem of errors-in-variables nonlinear systems under the realistic hypothesis of impulsive noise disturbance. The identification algorithm uses the continuous logarithmic mixed p-norm as the criterion function, and the correction term enables unbiased estimation of the nonlinear monomials of the noisy input. To reduce the computational load, a continuous logarithmic mixed p-norm based robust hierarchical estimation algorithm is presented. The proposed algorithms not only show good robustness against impulsive noise but also guarantee the accuracy of the recursive estimation.

Although the continuous logarithmic mixed p-norm based robust hierarchical estimation algorithm can be used for recursive identification of errors-in-variables nonlinear systems with impulsive noise, it is derived under the premise that the basis functions of the input nonlinearity are monomials. When the basis functions are not monomials, the input nonlinearity can be approximated by monomials or polynomials via Taylor expansion, and the proposed algorithm can still be applied to identify the errors-in-variables nonlinear system. As is known, when the nonlinearity is weak, a truncated Taylor expansion can represent the nonlinear function accurately. However, when the nonlinearity is strong, the truncated Taylor expansion may cause large errors and low estimation accuracy. Can a high-precision estimation algorithm be developed for errors-in-variables nonlinear systems with impulsive noise when the input nonlinearity is strong? This remains an open problem for future work.