Introduction

Water quality monitoring is essential in modern aquaculture, as precise management decisions rely on accurate acquisition of water quality parameters to optimize the yield and quality of aquaculture products. Wireless Sensor Networks (WSNs) are commonly used for data acquisition in aquaculture. However, malfunctions in sensor nodes can result in deviations in water quality parameters, leading to incorrect decisions and economic losses. Therefore, the detection and diagnosis of faults in sensor nodes within Wireless Sensor Networks are of significant importance1,2,3,4.

Various fault diagnosis algorithms for Wireless Sensor Networks (WSNs) have been proposed5,6,7,8,9. Gui et al.10 introduced a fault diagnosis method based on a Fireworks Algorithm optimized Convolutional Neural Network (CNN), which uses the self-regulating mechanism of the Fireworks Algorithm’s local and global optimization abilities to adjust the weights and biases of the CNN, achieving a higher fault diagnosis rate. However, because the explosion radius of individuals in the traditional Fireworks Algorithm is small, the diversity among the generated sparks is limited, leaving room for algorithm optimization. Cao et al.11 proposed an Artificial Bee Colony algorithm improved by Cauchy mutation to optimize the regularization coefficient and kernel parameters of a kernel-based Extreme Learning Machine (ELM), which improved the accuracy of hardware fault diagnosis of sensor nodes. However, using a single kernel function may lead to local optima; thus, the selection of kernel functions needs enhancement. Chen Honghong12 first utilized Lévy flight to enhance the search capability of the Firefly Algorithm and then used the optimized Firefly Algorithm to improve the ELM, enhancing diagnostic accuracy through an optimized fault diagnosis objective function. However, the ELM requires optimization of weight and threshold parameters and converges more slowly than the kernel ELM, thus requiring further improvement. Gnanavel et al.13 evaluated six classifiers, namely Random Forest, Support Vector Machine, Multilayer Perceptron, CNN, Stochastic Gradient Descent, and Probabilistic Neural Network, for diagnosing six different types of sensor node hardware faults, concluding that the Random Forest classifier had the best overall fault diagnosis rate. Li Yang14 presented a fault diagnosis algorithm based on a Rough Set optimized Probabilistic Neural Network, which reduced the impact of redundant attributes and noisy data on fault diagnosis and improved the diagnostic rate by reconstructing the fault samples with optimal features. However, traditional Rough Sets in fault diagnosis may overlook the fact that optimal attribute reduction is an NP-hard problem, as well as the probabilistic characteristics of fault diagnosis; hence, there is room for improvement. Wang Rui15 improved the accuracy of wind power prediction by using a Sparrow Search Algorithm to optimize a hybrid kernel ELM, outperforming the hybrid kernel ELM alone. Wang Chunyang16 enhanced diagnostic accuracy by optimizing the parameters of a Probabilistic Neural Network with Particle Swarm Optimization, achieving higher overall accuracy than both Random Forest and Maximum Likelihood Estimation. However, since the inertia weight and learning factors in traditional Particle Swarm Optimization are constants, particles are prone to getting trapped in local optima, indicating that further optimization is needed. Finally, Balasubramanian et al.17 used Particle Swarm Optimization to fine-tune the hyperparameters of a CNN, achieving higher diagnostic accuracy than results optimized by Genetic Algorithms and the Grey Wolf Optimizer.

This study introduces a fault diagnosis method based on a hybrid kernel Extreme Learning Machine optimized by an Updated Particle Swarm Optimization algorithm (UPPSO-HKELM). Initially, a hybrid kernel is incorporated into the kernel-based Extreme Learning Machine. Subsequently, improvements are made to the inertia weight and learning factors of the Particle Swarm Optimization algorithm. Finally, the enhanced Particle Swarm Optimization is used to optimize the parameters of the hybrid kernel Extreme Learning Machine, with the goal of improving fault diagnosis accuracy.

Material and methods

Types of sensor node faults

Environmental conditions play a significant role in affecting sensor nodes. Random interference in the power supply or ground wire, as well as surges, may lead to shock faults. Exposure to substantial magnetic field interference may cause bias faults, while solid pollutants from the external environment may lead to short-circuit faults18,19,20. In environments with high temperatures or other disturbances, drift faults become significantly more likely. At any given moment, the output value of the sensor node is represented by Equation (1).

$$\begin{aligned} f(t)=\alpha +k\gamma (t)+\varepsilon \end{aligned}$$
(1)

Where \(\alpha ,k,t\) represent the offset, impact constant, and time, respectively, and \(\gamma (t)\) and \(\varepsilon\) denote the expected sensor output and the measurement error caused by noise, respectively.

This study identifies four types of sensor faults and provides corresponding mathematical models based on the fault data characteristics outlined in the literature21, as depicted in Equations (2) to (5); a code sketch of these models follows the list.

  1. Drift fault: the node’s measurement value drifts at a specific rate, given by:

    $$\begin{aligned} f(t) = \gamma (t)+0.01n \end{aligned}$$
    (2)

    where n indexes the nth sample after the onset of the drift fault, so the measurement drifts by 0.01 per sample.

  2. Bias fault: a constant is added to the data collected by the node based on the real value \(\gamma (t)\), given by:

    $$\begin{aligned} f(t) = 0.7562+\gamma (t) \end{aligned}$$
    (3)
  3. Short-circuit fault: the output value of the node is close to 0, given by:

    $$\begin{aligned} f(t) = \varepsilon \end{aligned}$$
    (4)
  4. Shock fault: the node’s measurement value changes abruptly, given by:

    $$\begin{aligned} f(t) = 10\gamma (t) . \end{aligned}$$
    (5)
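For concreteness, the four fault models above can be injected into a clean signal as in the following Python sketch (the experiments themselves used MATLAB). The noise level eps_std used for the short-circuit model is an assumption, since Equation (4) only specifies \(f(t)=\varepsilon\).

```python
import numpy as np

def inject_fault(gamma, fault, eps_std=0.01):
    """Apply one of the fault models of Eqs. (2)-(5) to a clean signal.

    gamma   : 1-D array of true readings gamma(t)
    fault   : 'drift', 'bias', 'short', or 'shock'
    eps_std : assumed noise level for the short-circuit model (Eq. 4)
    """
    gamma = np.asarray(gamma, dtype=float)
    n = np.arange(1, gamma.size + 1)        # nth sample after fault onset
    if fault == 'drift':                    # Eq. (2): f(t) = gamma(t) + 0.01n
        return gamma + 0.01 * n
    if fault == 'bias':                     # Eq. (3): f(t) = 0.7562 + gamma(t)
        return 0.7562 + gamma
    if fault == 'short':                    # Eq. (4): f(t) = eps, output near 0
        return np.random.normal(0.0, eps_std, gamma.size)
    if fault == 'shock':                    # Eq. (5): f(t) = 10 * gamma(t)
        return 10.0 * gamma
    raise ValueError(f"unknown fault type: {fault}")
```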

Four types of fault characteristics are depicted in Figures 1(a) to (d), which were generated using the MATLAB platform.

Fig. 1

Fault characteristics: (a) Bias fault; (b) Drift fault; (c) Shock fault; (d) Short circuit fault.

The authors categorize the five states of sensor nodes using different labels, as illustrated in Table 1, for the purpose of labeling the sensor fault categories later on. The simulated data encompasses the fault modes commonly encountered across various scenarios, namely drift, bias, shock, and short-circuit faults, which makes it representative of typical sensor faults. By adjusting the fault model parameters (e.g., n and \(\gamma (t)\)), the simulated data can produce fault signals with varying intensities and characteristics, effectively capturing the diversity of sensor faults encountered in real-world applications.

Table 1 Labels for the five sensor states.

Data acquisition description

The data acquisition system comprises three modules: the data acquisition module, the data transmission module, and the data processing module, as depicted in Figure 2.

Fig. 2

Intelligent monitoring system for intensive aquaculture.

  1. Data acquisition module: the WSN collects real-time data on water quality parameters such as pH, water temperature (WT), dissolved oxygen (DO), electrical conductivity (EC), oxidation-reduction potential (ORP), and ammonia nitrogen. The specific sensor models and parameters can be found in Table 2, physical illustrations of the sensors are provided in Figures 3(a) to (f), and the physical appearance of a sensor node is shown in Figure 4.

  2. Data transmission module: responsible for uploading data to a cloud service center using communication methods such as GPRS, Wi-Fi, or Ethernet.

  3. Data processing module: the module offers various internet applications, data processing, and remote control. Users can monitor water quality information on the cloud platform using a computer or mobile app and control devices as required. This significantly streamlines the management of all aquaculture ponds, including personnel and equipment.

Fig. 3

Sensors of acquisition node: (a) Sensor for pH; (b) Sensor for WT; (c) Sensor for DO; (d) Sensor for EC; (e) Sensor for ORP; (f) Sensor for ammonia nitrogen.

Fig. 4

Physical appearance of sensor node.

Table 2 Features of the sensors.

Data preparation

One of the sensor nodes was chosen as the focus of research, with data collected from July 24, 2023, 00:00 to July 30, 2023, 22:40, totaling 10,000 data points. Figure 5 illustrates the interface of the IoT platform, presenting real-time data. This interface provides users with convenient access to real-time data on water quality parameters.

Fig. 5

Graphic user interface of real-time monitoring system.

To improve the accuracy of classification, it is essential to normalize the raw dataset. In this study, we employ min-max scaling, as depicted in Equation (6).

$$\begin{aligned} Y_{i}=\frac{X_{i} -X_{min}}{X_{max}-X_{min}} \end{aligned}$$
(6)

In Equation (6), \(Y_{i}\) represents the data after normalization, \(X_{i}\) represents the raw data before normalization, and \(X_{max}\) and \(X_{min}\) represent the maximum and minimum values within the sequence, respectively.
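As a minimal sketch, Equation (6) can be applied column-wise, assuming each column of X holds one water quality parameter:

```python
import numpy as np

def min_max_scale(X):
    """Equation (6): scale each column (one water quality parameter) to [0, 1]."""
    X = np.asarray(X, dtype=float)
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return (X - x_min) / (x_max - x_min)
```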

In this study, a new fault dataset was generated by sampling \(5\%\) of the 10,000 sets from the original dataset, resulting in 500 sets. The new dataset consists of four types of fault data, namely drift, bias, short-circuit, and shock faults, with each type containing 125 sets. Table 3 presents a sample of data from the dataset with a \(5\%\) fault rate.

Table 3 Partial data in a \(5\%\) fault set.
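The construction of a fault dataset at a given fault rate can be sketched as follows, reusing inject_fault from above. The label convention (0 for normal, 1 to 4 for the fault types) and the choice of which channel to corrupt are assumptions of this sketch; Table 1 defines the actual labels.

```python
import numpy as np

def build_fault_dataset(data, fault_rate=0.05, seed=0):
    """Corrupt a fraction of the rows, split evenly over the four fault types."""
    rng = np.random.default_rng(seed)
    X = np.array(data, dtype=float)
    y = np.zeros(len(X), dtype=int)                  # assumed label 0 = normal
    n_fault = int(fault_rate * len(X))               # e.g. 500 of 10,000 at 5%
    idx = rng.choice(len(X), n_fault, replace=False)
    faults = ['drift', 'bias', 'short', 'shock']
    for k, chunk in enumerate(np.array_split(idx, 4)):  # 125 rows per type at 5%
        X[chunk, 0] = inject_fault(X[chunk, 0], faults[k])  # corrupt one channel
        y[chunk] = k + 1                             # assumed labels 1..4
    return X, y
```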

Descriptions of the methods

In this study, the authors developed a Hybrid Kernel Extreme Learning Machine (HKELM) model to classify sensor faults, aiming to enhance the accuracy of fault classification. Furthermore, the authors incorporated Updated Particle Swarm Optimization (UPPSO) into the model training process to prevent the parameters from becoming trapped in local optima.

Descriptions of HKELM

The KELM algorithm combines a kernel function with the Extreme Learning Machine (ELM)22,23,24,25. The integration of the kernel function in KELM reduces the necessity for parameter tuning, speeds up convergence, and improves generalization performance and robustness compared to the conventional ELM. The structure of the HKELM is depicted in Figure 6.

Fig. 6

Structure of HKELM.

Assuming the model input of the training set is \(\left( x_{i},m_{i} \right)\), where i takes values \(1, 2, \ldots , n\), the model can be represented as the following equation.

$$\begin{aligned} \sum _{j=1}^{l}\beta _{ji}g(\omega _{ji}\times x_{i}+b_{ji} )=m_{i} \end{aligned}$$
(7)

In Equation (7), \(x_{i}\) is the input sample vector, \(\beta\) is the weight vector from the hidden layer to the output layer, \(\omega\) is the input weight, \(l\) is the number of hidden layer nodes, \(m_{i}\) is the target output matrix, g(x) represents the activation function, and b denotes the bias.

Equation (7) can also be represented as Equation (8).

$$\begin{aligned} H\beta =M \end{aligned}$$
(8)

Where \(\beta =(\beta _{1i},\beta _{2i},\cdots ,\beta _{li} )^{T}\) is the output weight vector, \(M=(m_{1},m_{2},\dots ,m_{n} )^{T}\) represents the target output matrix, and H represents the hidden layer output matrix. Using the least squares method, \(\beta\) can be obtained as shown in Equation (9).

$$\begin{aligned} \beta =H^{+}M \end{aligned}$$
(9)

Where \(H^{+}\) is the pseudoinverse matrix of H. To enhance the stability and generalization performance of the model, we introduce the penalty coefficient C and the identity matrix I, adding \(I/C\) to the main diagonal of \(HH^{T}\), which ensures that the characteristic roots are non-zero. \(\beta\) is then obtained using the least squares method, as shown in Equation (10).

$$\begin{aligned} \beta =H^{T}(HH^{T}+\frac{I}{C})^{-1}M \end{aligned}$$
(10)

By introducing the kernel function into the ELM, we obtain the output expression for the KELM as shown in Equation (11).

$$\begin{aligned} f(x)=h(x)\beta =H\beta \end{aligned}$$
(11)

Where H is the feature mapping matrix generated when the kernel function maps the input sample, h(x) is the hidden layer output function, and f(x) is the output vector of the Kernel Extreme Learning Machine model.

The kernel matrix of the KELM is shown in Equation (12).

$$\begin{aligned} \Omega _{i,j}=HH^{T}=h(x_{i} ) h(x_{j} ) =K(x_{i} ,x_{j} ) \end{aligned}$$
(12)

Where \(\Omega _{i,j}\) is the feature matrix produced when the input data is mapped to a high-dimensional space through the kernel function, and \(K(x_{i} ,x_{j} )\) represents the kernel function.

By introducing the kernel function, Equation (10) becomes Equation (13).

$$\begin{aligned} \beta =(\Omega _{i,j}+\frac{I}{C} )^{-1}M \end{aligned}$$
(13)

By substituting Equations (12) and (13) into Equation (11), the output vector of the KELM algorithm is as shown in Equation (14).

$$\begin{aligned} f(x)=h(x)H^{T}(HH^{T}+I/C )^{-1}M =\begin{Bmatrix}K(x,x_{1} ) \\ \vdots \\ K(x,x_{N})\end{Bmatrix} (\Omega _{i,j}+I/C )^{-1}M \end{aligned}$$
(14)

In the KELM algorithm, since the kernel function uses the inner product for calculations, there’s no need to set the number of nodes in the hidden layer, thereby avoiding performance instability caused by random model parameter settings.
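Equations (12) to (14) translate directly into a compact training and prediction routine. The following is a minimal sketch, not the authors' exact implementation; kernel is any function returning the Gram matrix \(K(x_{i},x_{j})\), and M is assumed to be a one-hot target matrix.

```python
import numpy as np

class KELM:
    """Minimal kernel ELM: closed-form fit (Eq. 13) and prediction (Eq. 14)."""

    def __init__(self, kernel, C=10.0):
        self.kernel, self.C = kernel, C       # C is the penalty coefficient

    def fit(self, X, M):
        self.X = X
        omega = self.kernel(X, X)                              # Eq. (12)
        reg = omega + np.eye(len(X)) / self.C                  # Omega + I/C
        self.beta = np.linalg.solve(reg, M)                    # Eq. (13)
        return self

    def predict(self, X_new):
        return self.kernel(X_new, self.X) @ self.beta          # Eq. (14)
```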

The classification performance of the KELM largely depends on the kernel function, and a single kernel function cannot provide both strong fitting and strong generalization performance. Because the Gaussian kernel function has strong local search capability and the polynomial kernel function has strong global search capability, a hybrid kernel function is introduced, as shown in Equation (15).

$$\begin{aligned} K(x,x_{i} )=\lambda ^{2}\cdot \exp (-\left\| x-x_{i} \right\| ^{2} /\delta ^{2} )+(1-\lambda ^{2} )\cdot \left[ (x\cdot x_{i} )+n \right] ^{d} \end{aligned}$$
(15)

Where \(\delta ,n,d,\lambda\) are the parameters of the hybrid kernel function.
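A sketch of the hybrid kernel of Equation (15), assuming the conventional negative exponent and squared norm in the Gaussian term; the default parameter values are placeholders, not the optimized settings:

```python
import numpy as np

def hybrid_kernel(A, B, delta=1.0, n=1.0, d=2.0, lam=0.5):
    """Eq. (15): weighted mix of a Gaussian kernel (local search capability)
    and a polynomial kernel (global search capability)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)   # ||a - b||^2
    gauss = np.exp(-sq / delta ** 2)
    poly = (A @ B.T + n) ** d
    return lam ** 2 * gauss + (1.0 - lam ** 2) * poly
```

Passing this kernel to the KELM sketch above, for example KELM(lambda A, B: hybrid_kernel(A, B, 2.0, 1.0, 2.0, 0.6), C=10.0), yields the HKELM; the parameter values shown are arbitrary placeholders.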

The constructed fitness function is shown in Equation (16).

$$\begin{aligned} e_{x}=\frac{1}{N}\sum _{x=1}^{N}(f^{1}(x)-f(x) )^{2} \end{aligned}$$
(16)

Where \(e_{x}\) represents the mean squared error, N is the total sample size, x represents the xth sample, \(f^{1}(x)\) represents the actual value, and f(x) represents the predicted output value.

By replacing the single kernel function of the KELM with the hybrid kernel function in Equation (15) and optimizing the initially random parameters \(\delta ,n,d,\lambda\) together with the penalty coefficient C, we construct a hybrid kernel extreme learning machine (HKELM) to improve the accuracy of sensor node fault diagnosis.

Descriptions of UPPSO for HKELM

PSO is a swarm intelligence algorithm that brings advantages in parameter optimization26,27,28. To mitigate the issue of the original PSO prematurely converging to local optima, the authors propose the UPPSO algorithm as a modification aimed at enhancing PSO’s performance.

In the original PSO algorithm, the speed and position of particles play a crucial role in the optimization process. The velocity controls the direction and distance of the particle’s movement during iteration, while the position represents a candidate solution. The update formulas for the particle’s velocity and position are shown in Equations (17) and (18), respectively.

$$\begin{aligned} v_{ij}(t+1)&=\omega v_{ij}(t)+c_{1} r_{1}(p_{best_{ij}}(t)-X_{ij}(t) )+c_{2} r_{2}(g_{best_{ij}}(t)-X_{ij}(t) ) \end{aligned}$$
(17)
$$\begin{aligned} X_{ij}(t+1)&=X_{ij}(t)+v_{ij} (t+1) \end{aligned}$$
(18)

In Equations (17) and (18), v represents the velocity of the particle, X denotes the position of the particle, \(\omega\) is the inertia weight value, i represents the ith particle, j signifies the dimension of the particle, t indicates the current iteration number, \(P_{best}\) is the particle’s historical best value, \(g_{best}\) is the global best value, c is the learning factor, and \(r_{1} ,r_{2}\) are independent random numbers within the [0,1] range. In the traditional PSO algorithm, the inertia weight and learning factors are constants, easily trapping the algorithm in local optima. Therefore, this paper improves the inertia weight and learning factors.

The inertia weight is chosen to decrease in an S-shaped manner: it provides strong global search capabilities in the early stages and strong local search capabilities in the later stages. To address the issue of the algorithm easily getting trapped in local optima, a step is added to the inertia weight in the middle of the iteration process. This enhances its global search capabilities and reduces the chances of getting stuck in local optima. The optimized formula for the inertia weight \(\omega\) is shown in Equation (19).

$$\begin{aligned} \omega (t)=\omega _{min}+(\omega _{max}-\omega _{min})\times \left[ \frac{1}{1+e^{6\times (\frac{2t}{T}-1 )} } +\frac{0.5}{1+e^{-|t-\frac{T}{2} |} } \right] \end{aligned}$$
(19)

In Equation (19), \(\omega _{max}\) and \(\omega _{min}\) represent the maximum and minimum values of \(\omega\), which are set to 0.8 and 0.1, respectively. t represents the current iteration number, and T represents the maximum number of iterations. When the maximum number of iterations is 100, the variation curve of \(\omega\) is shown in Figure 7.

Fig. 7

Change curve of inertia weight.

The learning factors are designed so that individual learning capability is strong in the early stages and group learning capability is strong in the later stages, resulting in better outcomes. Equations (20) and (21) show the optimized formulas for the learning factors.

$$\begin{aligned} c_{1}&=c_{max}-(c_{max}-c_{min})\times \frac{1}{1+e^{-8\times (\frac{2t}{T}-1 )}} \end{aligned}$$
(20)
$$\begin{aligned} c_{2}&=c_{min}+(c_{max}-c_{min})\times \frac{1}{1+e^{-8\times (\frac{2t}{T}-1 )}} \end{aligned}$$
(21)

In Equations (20) and (21), \(c_{1}\) and \(c_{2}\) represent the individual learning factor and the group learning factor, respectively. \(c_{max}\) and \(c_{min}\) represent the maximum and minimum values of the learning factor, taken as 2.4 and 1.4, respectively.
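The three schedules of Equations (19) to (21) can be written as one helper; a minimal sketch with the stated bounds \(\omega \in [0.1, 0.8]\) and \(c \in [1.4, 2.4]\):

```python
import numpy as np

def uppso_coefficients(t, T, w_min=0.1, w_max=0.8, c_min=1.4, c_max=2.4):
    """Iteration-dependent inertia weight (Eq. 19) and learning factors
    (Eqs. 20-21) for iteration t of T."""
    s = 1.0 / (1.0 + np.exp(-8.0 * (2.0 * t / T - 1.0)))   # shared S-curve
    w = w_min + (w_max - w_min) * (
        1.0 / (1.0 + np.exp(6.0 * (2.0 * t / T - 1.0)))    # S-shaped decrease
        + 0.5 / (1.0 + np.exp(-abs(t - T / 2.0))))         # mid-iteration step
    c1 = c_max - (c_max - c_min) * s    # individual learning factor decays
    c2 = c_min + (c_max - c_min) * s    # group learning factor grows
    return w, c1, c2
```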

The authors present an UPPSO-based HKELM fault diagnosis model flowchart in Figure 8.

Fig. 8

Flow chart of UPPSO-based HKELM.

The specific steps of the algorithm are as follows:

Step 1: Divide the four fault datasets into training and testing sets, which will serve as inputs for the model.

Step 2: Determine the network structure of the hybrid-kernel ELM fault diagnosis model and initialize the model’s initial parameters.

Step 3: Identify the variables \(\delta , n,d,\lambda\) in the hybrid-kernel function that need to be optimized, along with the penalty coefficient C.

Step 4: Use the mean squared error \(e_{x}\) between predicted and actual classifications to represent the fitness function.

Step 5: Initialize the parameters for the PSO algorithm, including the population size (P), spatial dimension (D), maximum iteration count (T), and the upper and lower bounds (ub, lb) for particle positions (X) and velocities.

Step 6: Improve the inertial weight and learning factors in the PSO algorithm and use the modified particle swarm to optimize the parameters of the hybrid-kernel function and penalty coefficient. Compare fitness values to update particle velocities and positions, ultimately obtaining the global optimum position, which yields the optimal values for \(\delta , n,d,\lambda ,C\).

Step 7: Validate the obtained fault diagnosis model using test set data and analyze the model.
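Putting Steps 1 to 7 together, the optimization loop can be sketched as below, reusing the hybrid_kernel, KELM, and uppso_coefficients sketches above. The mapping of the experimental bounds ub/lb to the parameter order (\(\delta\), n, d, C, \(\lambda\)), the position clipping, and the validation split used for the fitness of Equation (16) are assumptions of this sketch.

```python
import numpy as np

def uppso_hkelm(X_tr, M_tr, X_val, M_val, P=40, T=200, seed=0):
    """UPPSO search for the HKELM parameters (delta, n, d, C, lam)."""
    rng = np.random.default_rng(seed)
    lb = np.array([1.0, 1e-3, 1e-3, 1.0, 0.1])   # assumed order: delta,n,d,C,lam
    ub = np.array([20.0, 1e3, 1e3, 10.0, 1.0])
    D = len(lb)

    def fitness(p):                               # Eq. (16): validation MSE
        delta, n, d, C, lam = p
        model = KELM(lambda A, B: hybrid_kernel(A, B, delta, n, d, lam), C)
        return np.mean((model.fit(X_tr, M_tr).predict(X_val) - M_val) ** 2)

    X = lb + rng.random((P, D)) * (ub - lb)       # initial positions
    V = np.zeros((P, D))                          # initial velocities
    fit = np.apply_along_axis(fitness, 1, X)
    pbest, pfit = X.copy(), fit.copy()
    g = pbest[pfit.argmin()].copy()               # global best position
    for t in range(1, T + 1):
        w, c1, c2 = uppso_coefficients(t, T)      # Eqs. (19)-(21)
        r1, r2 = rng.random((P, D)), rng.random((P, D))
        V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (g - X)   # Eq. (17)
        X = np.clip(X + V, lb, ub)                # Eq. (18), kept inside bounds
        fit = np.apply_along_axis(fitness, 1, X)
        better = fit < pfit
        pbest[better], pfit[better] = X[better], fit[better]
        g = pbest[pfit.argmin()].copy()
    return g                                      # optimal (delta, n, d, C, lam)
```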

Overall process of the algorithm and complexity analysis

Algorithm complexity is an essential factor for assessing the quality of an algorithm, so it is necessary to analyze the complexity of the algorithm shown in Figure 8. In PSO-HKELM, assuming a population size of P and a spatial dimension of D, the time required for initialization is denoted as \(x_{1}\), the time for generating a uniform distribution is \(x_{2}\), and the time for fitness evaluation is f(D). Therefore, the algorithm’s time complexity in its initial stage is represented by Equation (22).

$$\begin{aligned} O(x_{1}+P(Dx_{2}+f(D) ))=O(D+f(D)) \end{aligned}$$
(22)

Assuming that the execution time for updating each dimension of an individual during iterations is the same and denoted as \(x_{3}\), and the time required for selecting the best after iterations is \(x_{4}\), then the time complexity for this stage is represented by Equation (23).

$$\begin{aligned} O(P(Dx_{3}+f(D) )+x_{4} )=O(D+f(D)) \end{aligned}$$
(23)

Therefore, the total time complexity of the PSO-HKELM algorithm is given by Equation (24).

$$\begin{aligned} T(D)=O(D+f(D))+O(D+f(D))=O(D+f(D)) \end{aligned}$$
(24)

In UPPSO-HKELM, the time required for the algorithm’s initialization stage is similar to that of PSO-HKELM. Within the algorithm’s loop, assuming the calculation time for the inertia weight is \(z_{1}\), the calculation time for the learning factor is \(z_{2}\), and the calculation time for comparing and selecting individuals relative to the initial individual is \(z_{3}\), the time complexity of the loop portion is represented by Equation (25).

$$\begin{aligned} O(P(Dx_{3}+f(D))+x_{4}+P(z_{2}+z_{3} )+z_{1} )=O(D+f(D)) \end{aligned}$$
(25)

Therefore, the final time complexity of UPPSO-HKELM is given by Equation (26).

$$\begin{aligned} T(D)=O(D+f(D))+O(D+f(D))=O(D+f(D)) \end{aligned}$$
(26)

In summary, UPPSO-HKELM does not increase the algorithm’s time complexity compared to PSO-HKELM.

Results and discussions

The experiments were conducted on a system running Windows 10 (64-bit) with MATLAB 2020 as the software platform. The hardware configuration included an Intel(R) Core(TM) i5-8300H processor and 8GB of RAM. The experimental data comprised four datasets, each containing varying fault levels. These datasets were divided into training and testing sets at an 8:2 ratio.

To validate the fault diagnostic performance of the UPPSO-HKELM, it was compared with the original PSO-KELM, PSO-HKELM, SSA-HKELM from Wang Rui15, PSO-PNN from Wang Chunyang16, and PSO-CNN from Balasubramanian et al.17, using detection accuracy as the experimental metric. In all algorithms, the maximum iteration count for the particle swarm was set to T = 200, the population size P was set to 40, the learning factors \(c_{1}\) and \(c_{2}\) were both set to 1.5, and the inertia weight \(\omega\) was set to 0.8; the upper and lower boundary parameters (ub, lb) were defined as [20, \(10^{3}\), \(10^{3}\), 10, 1] and [1, \(10^{-3}\), \(10^{-3}\), 1, 0.1], respectively. For the Sparrow Search Algorithm (SSA), the maximum iteration count for the sparrow population was 200, and the population size was 40. ST, PD, and SD were set to 0.7, 0.4, and 0.2, respectively.

The HKELM algorithm provides the advantage of rapid training, completing within 5 seconds. In contrast, the proposed UPPSO-HKELM algorithm entails a considerably longer optimization process for determining optimal parameters; its duration depends on both the number of iterations of the particle swarm algorithm and the population size. In this study, the population size is 40 and the number of iterations is 200, resulting in an estimated optimization time of approximately 6000 seconds. One hundred data samples were selected from the dataset for model validation and input into the fault diagnostic model. The diagnostic performance of the UPPSO-HKELM algorithm is illustrated in Figure 9, which depicts the diagnostic results on the test sets, while Table 4 provides detailed accuracy rates. Analysis of Table 4 reveals an overall accuracy of approximately 99\(\%\). It is noted that as the proportion of faults increases, the detection accuracy of the test set initially remains constant but later decreases.

Fig. 9

Testing results of UPPSO-HKELM: (a) Test set at 5\(\%\) fault rate; (b) Test set at 10\(\%\) fault rate; (c) Test set at 15\(\%\) fault rate; (d) Test set at 20\(\%\) fault rate.

Table 4 UPPSO-HKELM diagnostic accuracy.

Utilizing the 8:2 split training and test sets as input, the algorithms PSO-KELM, PSO-HKELM, SSA-HKELM from Wang Rui15, PSO-PNN from Wang Chunyang16, PSO-CNN from Balasubramanian et al.17, and UPPSO-HKELM were each run ten times. This process provided the average diagnostic accuracy for the four types of faults at different fault rates, as depicted in Figure 10.

Fig. 10

Accuracy comparisons of four sensor node faults: (a) Shock fault detection accuracy; (b) Short-circuit fault detection accuracy; (c) Drift fault detection accuracy; (d) Bias fault detection accuracy.

Table 5 Average values after ten runs on test sets with different fault proportions.

Figure 10 demonstrates that the diagnostic accuracy decreases as fault rates increase for all four types of faults. This decline is attributed to the algorithms encountering mostly normal data during model training, so accuracy drops as the proportion of anomalous data increases. Notably, Figures 10(a) and 10(b) show consistent diagnostic accuracy across algorithms for shock and short-circuit faults due to their distinct characteristics. In contrast, Figure 10(c) highlights that UPPSO-HKELM achieves superior diagnostic accuracy for drift faults, where some data points closely resemble normal data; this advantage lifts its overall accuracy. Figure 10(d) indicates that bias faults have the lowest diagnostic accuracy among all algorithms: being closer to valid values than drift faults, they present greater challenges for accurate diagnosis. By averaging the results of ten runs, the overall diagnostic accuracy for each algorithm under varying fault rates can be calculated, as summarized in Table 5. Figure 11 offers a comparative overview of the overall fault diagnostic accuracy for wireless sensor networks across the different algorithms.

Table 5 illustrates that the proposed algorithm attains the highest average fault diagnostic accuracy, highlighting its effectiveness. On average, its accuracy surpasses that of PSO-KELM by 3.56\(\%\), PSO-HKELM by 2.03\(\%\), SSA-HKELM by 1.75\(\%\), PSO-PNN by 1.12\(\%\), and PSO-CNN by 0.52\(\%\).

Fig. 11

Different diagnosis precision comparison chart.

Figure 11 illustrates a clear trend: increasing fault rates lead to a decrease in the overall diagnostic accuracy of all algorithms, with the UPPSO-HKELM algorithm consistently outperforming the others. A comparison between specific approaches reveals that PSO-HKELM exhibits higher accuracy than PSO-KELM, attributed to its utilization of a hybrid kernel function, which enhances optimization capability and diagnostic accuracy. SSA-HKELM achieves even higher accuracy than PSO-HKELM due to the Sparrow Search Algorithm’s proficiency in group-based searching. PSO-PNN outperforms PSO-KELM by simplifying the supervised learning process in neural networks, thereby enhancing fault tolerance. PSO-CNN surpasses PSO-KELM by leveraging convolutional neural network capabilities, learning implicitly from data and improving diagnostic accuracy through structural reconfiguration and weight reduction within a multi-layer perceptron. UPPSO-HKELM attains the highest accuracy among all algorithms by enhancing the kernel function of the ELM and optimizing the particle swarm algorithm’s inertia weight and learning factors. These combined improvements to both the model and the optimization parameters culminate in an optimal fault diagnostic model with significantly increased diagnostic accuracy.

The accuracy of the detection system typically exceeds 98%; even when the UPPSO-HKELM model encounters a data failure rate of 20%, its diagnostic accuracy remains at 98%. This finding indicates that the model’s performance meets the requirements for actual aquaculture monitoring, effectively minimizing both false positives and false negatives within the acceptable error tolerance range. Furthermore, a 20% failure rate is an extreme scenario that is relatively uncommon in practical applications, which limits the impact of this decline in accuracy. The UPPSO-HKELM model has demonstrated stable diagnostic capability across varying failure rates, confirming its robustness and suitability for most scenarios.

Conclusion

This study proposes a fault diagnostic method called UPPSO-HKELM to improve the fault diagnostic accuracy of sensor nodes in wireless sensor networks. HKELM effectively addresses the challenges posed by small sample sizes, high dimensionality, and strongly nonlinear data in aquaculture, thereby fulfilling model performance requirements. UPPSO enhances optimization speed and accuracy through an improved search mechanism, rendering it more suitable for hyperparameter optimization in real-time water quality monitoring systems. This combination exhibits low computational resource demands, which facilitates its deployment in practical water quality monitoring applications.

This study primarily focuses on optimizing the HKELM and comparing various optimization models. Future work will include a comparative analysis of these models’ characteristics alongside other deep learning architectures, such as transformers. The key findings of this study are summarized as follows:

  1. A hybrid kernel function is introduced to address the limitations of a single kernel function in the ELM, balancing fitting and generalization performance. Additionally, the particle swarm algorithm is enhanced by adjusting the inertia weight (\(\omega\)) and learning factors (c) to create an improved particle swarm, UPPSO. This enhanced particle swarm is then used to optimize the hybrid kernel ELM, yielding an improved fault diagnostic model.

  2. A water quality monitoring system is presented to collect a dataset of water quality factors, which is intentionally injected with four types of sensor faults at specific proportions using the mathematical models above. This approach creates datasets with varying fault ratios to overcome the challenge of obtaining fault data. Simulation results demonstrate that the UPPSO-HKELM model achieves a fault diagnostic accuracy of 99\(\%\) across datasets with different fault rates, showcasing robust generalization performance.

In future work, we plan to include additional performance metrics such as precision, recall, F1-score, and various error-based criteria (e.g., RMSE, MAE) to enable a more comprehensive evaluation of diagnostic performance, particularly in less ideal or imbalanced data scenarios. Furthermore, we will consider introducing adaptive convergence strategies, such as error threshold-based or stagnation-aware stopping conditions, to further enhance the efficiency and stability of the optimization process.

Additionally, we have identified the issue of adaptability between the proposed model and real-world scenarios. We will collect fault data from real scenarios to train the model, thereby enhancing its ability to adapt to actual application contexts and improving its overall adaptability.