Abstract
Effective sensor denoising is crucial for accurate, real-time agricultural decision-support systems. This study explores the application of Unscented Kalman Filter (UKF) extensions on resource-constrained devices to improve sensor denoising and enhance the reliability of Internet of Things (IoT) based agricultural soil monitoring. The study was conducted in Ruhango district, Rwanda, utilizing a wireless sensor node equipped with a Raspberry Pi 5 (ARM v8) and an integrated seven-in-one soil sensor measuring temperature, humidity, electrical conductivity, pH, nitrogen, phosphorus, and potassium. The sensor was placed at a depth of 20 cm in ten cassava farms, collecting data every 30 min for eight months. Four real-time sensor denoising models were implemented: UKF, Cubature Kalman Filter (CKF), UKF with Artificial Neural Network (UKF_ANN), and UKF with Fuzzy Logic (UKF_FL). Model performance was evaluated using boxplots, the coefficient of determination (R2), mean absolute error (MAE), root mean square error (RMSE), computation memory (CM), and computation time (CT). Data analysis was performed using Python 3.12 on ARM v8. Results demonstrated that CKF outperformed the other models, reducing RMSE by up to 32% and lowering CM and CT by 75% relative to the UKF. CKF and UKF_ANN maintained the integrity of the sensor data while effectively removing Gaussian, uniform, and salt-and-pepper noise, making them suitable for IoT-based soil monitoring systems.
Introduction
The success of smart farming and precision agriculture in improving crop yields depends on the accuracy of sensor data. Real-time sensor node denoising is a significant challenge, as valuable data and noise are captured simultaneously1. Sensor noise refers to measurement errors, irregularities, or imperfections significantly affecting part or all of a system2,3. With the rapid adoption of low-cost Internet of Things (IoT) devices in heterogeneous agricultural environments, real-time sensor denoising requires immediate attention4. Furthermore, sensor noise is a significant issue, causing many IoT solutions to fail in real-world applications due to their inability to handle diverse noise characteristics (Fig. 1) from the deployed environment4,5.
Noise types in sensor data. Adapted from5.
Successful implementation of real-time agricultural sensor denoising can enhance crop production and promote the sustainability of agricultural land use6. It reduces the time, costs, and environmentally destructive practices associated with traditional methods. Currently, obtaining accurate data requires routine recalibration or industrial-grade sensors, which are often unaffordable for small-scale farmers3.
Sensor denoising models have been derived from polynomial regression equations generated from observed and simulated data7,8 to mitigate the need for frequent recalibration and expensive sensors. However, polynomial models require large datasets and show limited adaptability to unseen conditions9,10.
Despite the success of sensor denoising techniques like band filters11, particle filters12, moving horizon estimation13, and artificial neural networks14 in image processing, they have failed to perform optimally in IoT scenarios due to resource constraints.
The linear Kalman filter (KF) is widely recognized as the best real-time sensor denoising technique for linear systems15. However, agricultural systems are complex and highly nonlinear, necessitating higher-order extensions of the KF to handle this nonlinearity16. These high-order KF variants are particularly suited for heterogeneous agricultural environments17,18.
The superiority of the Unscented Kalman filter (UKF) over its extensions (UKFs) and the Cubature Kalman filter (CKF) remains an open question despite their effectiveness in handling first and third-order nonlinear systems, respectively, without requiring Jacobian derivation19,20. A summary of the considered studies is presented in Table 1.
From Table 1, it is evident that few studies have addressed the deployment of high-order UKF and their extensions on IoT devices for real-time agricultural soil data denoising. This study addresses this gap by pioneering the enhancement of real-time sensor denoising through the integration of UKF with Artificial Neural Networks or Fuzzy Logic, as well as CKF models, on a Raspberry Pi 5.
This paper seeks to answer the research question: Can extensions of the Unscented Kalman Filter (UKF) or the Cubature Kalman Filter (CKF) improve real-time sensor denoising for agricultural soil parameters on resource-constrained devices? The key contributions of this paper are: (1) it introduces hybrid methods that integrate the UKF with Artificial Neural Networks (UKF_ANN) and Fuzzy Logic (UKF_FL) to tackle challenges specific to resource-constrained IoT devices, with the evaluated models reducing computational demands by up to 75% while achieving up to 99% accuracy in real-time, open-field agricultural soil monitoring; (2) it optimizes these denoising models for low-power devices like the Raspberry Pi 5, contributing to the deployment of IoT systems in smallholder farming contexts, such as those in Rwanda; and (3) it shows that UKF_ANN and CKF have the potential to enhance crop production and sustainability through better data accuracy.
For inferences, the paper evaluates the performance of IoT-based UKF extensions for enhancing real-time soil sensor denoising based on (i) root mean square error (RMSE) to measure the model’s prediction error, (ii) mean absolute error (MAE) to measure the average prediction error, (iii) the coefficient of determination (R2) to explain the variance contained in the data, (iv) computation memory (CM) to measure the memory used to compute each model, and (v) computation time (CT) to quantify the time used to execute each model. It aims to provide practical solutions for improving the reliability of IoT-based soil monitoring systems. The remainder of this paper is organized as follows: Materials and methods, Results, Discussions, and Conclusion.
Materials and methods
Description of the study area
The sensor node was deployed in the Kinazi sector of Ruhango district, Rwanda (Fig. 2), at a longitude of 30.03333 E and a latitude of 2.016667 S, with altitudes ranging from 1500 to 2000 m. The region has a tropical savanna wet (Aw) climate, clay-silt soil, an average annual temperature of 31.29 ± 2.14 °C, and an average yearly precipitation of 1000 mm. During the experiment, rainfall varied from 110 mm in April to 190 mm in October, temperatures ranged from 23.2 °C in February to 34.57 °C in September, and humidity fluctuated from 51.29% in November to 65.22% in March. This district has the most potential for cassava farming due to its edaphoclimatic conditions.
Area of Study. Source: QGIS 3.38, available at: https://qgis.org/download/
Soil sensor node design
The soil sensor node consisted of hardware, middleware, and software (Fig. 3). The hardware comprised the sensor node’s physical components (Fig. 3b). Meanwhile, the middleware layer utilized ThingSpeak, which communicated via MQTT protocols. The software layer was built under Python 3.12 through the Plotly library.
The hardware layer comprised a Raspberry Pi 5 (Arm Cortex A76), a JXBS-3001-NPK-RS, integrated soil sensors (temperature, humidity, electrical conductivity, pH, nitrogen, phosphorus, and potassium) from https://www.jxctiot.com/product1/product195.html, and a waterproof air temperature and relative humidity sensor (STH30). Table 2 presents a detailed description of the hardware components.
After the physical system was developed, four real-time sensor denoising models were implemented on the Raspberry Pi 5 (Fig. 4), and the system was then deployed (Fig. 3b–c).
Data collection
The data was collected in a 10-hectare model cassava farm planted with the NAROCASS cassava variety, which has an 8-month cycle.
We implemented real-time Unscented Kalman Filter (UKF), Unscented Kalman Filter with Fuzzy Logic (FL), Unscented Kalman Filter with Artificial Neural Network (ANN), and Cubature Kalman Filter (CKF) sensor noise filters on Raspberry Pi 5. After that, we collected soil temperature, humidity, electrical conductivity, pH, nitrogen, phosphorus, and potassium every 30 min from September 2023 to April 2024 (the 2023–2024 season). We developed a web-based dashboard using the Plotly library for real-time data visualization.
Models’ implementation
The implementation of each model (UKF, UKF_ANN, UKF_FL, and CKF) comprised two states (state prediction and state estimation), as described in Fig. 4. The state prediction step predicted the next soil parameter state from the previous state and the system dynamics, relying only on the internal system model. The state estimation step then corrected this prediction by adjusting the predicted state and covariance with the incoming sensor data, improving the estimate of the system’s current state.
The UKF and CKF were executed individually. Meanwhile, the UKF_ANN and UKF_FL are extensions of the UKF that adopt specific rules to select the generated sigma points used to compute the Kalman gain. Integrating the ANN and FL extensions into the UKF aimed to overcome the intrinsic instability reported in the UKF32. The steps of each model are presented with their phases in Fig. 4.
Unscented Kalman filter
The two states of the UKF were executed in six steps, as in Eqs. 1 to 6. The state prediction steps consisted of (i) generating sigma points and propagating them through the state function, (ii) computing the predicted state, (iii) computing the predicted mean state, and (iv) calculating the predicted covariance. Meanwhile, the state update steps consisted of (i) predicting the measurement mean, (ii) calculating the innovation covariance and cross-covariance, (iii) calculating the Kalman gain, and (iv) updating the state and covariance. Detailed information is described in32,33. The state and observation models are described in Eq. (1).
where \({x}_{k}\) is the state model, \({Z}_{k}\) is the observation model, \({u}_{k-1}\) is the input vector from sensors, \(k\) is the time stage, \(k-1\) is the prior time stage, \({Q}_{k}\) is the state covariance matrix with \({Q}_{k}\cong \mathcal{N}\left(0,{\Sigma }_{{w}_{k}}\right)\), \({R}_{k}\) is the observation covariance matrix with \({R}_{k}\cong \mathcal{N}\left(0,{\Sigma }_{{v}_{k}}\right)\), and \(f\) and \(g\) are the nonlinear process and measurement functions.
Then, the computation of the sigma points \({x}_{i, k-1}\) of n-dimensional vectors \({{x}_{i, k-1}}_{i=0}^{i=2n}\) is represented in Eq. (2).
The scalar \(\lambda\) is a semi-positive parameter determining the spread of the sigma points around the estimated state vector \({\widehat{x}}_{k-1}\). The term \(i\) refers to the ith column of the square root of the matrix \(P\), written \({\left(\sqrt{P}\right)}_{i}\) and obtained through Cholesky factorization. After the new points were propagated through the transition function \(f\), the mean \({\overline{x} }_{k}\), covariance error \({P}_{k}\), sigma point weights \({w}_{i}\), and updated sampling points \({X}_{i,k}\) were computed, as in Eq. (3).
The sigma points weights were computed from Eq. (4).
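The sigma point generation and weighting steps can be illustrated with a minimal sketch. It assumes the basic symmetric scheme with a single weight set; the exact forms of Eqs. (2) and (4) are not reproduced here and may include additional scaling parameters. The function name `sigma_points` and the example state values are hypothetical.

```python
import numpy as np

def sigma_points(x_hat, P, lam):
    """Return 2n+1 sigma points (one per row) around x_hat and their weights.

    Assumes the basic symmetric scheme: x_hat +/- columns of chol((n + lam) P).
    """
    n = x_hat.size
    # Lower-triangular Cholesky factor; its columns set the spread directions.
    S = np.linalg.cholesky((n + lam) * P)
    points = np.empty((2 * n + 1, n))
    points[0] = x_hat
    for i in range(n):
        points[i + 1] = x_hat + S[:, i]
        points[n + i + 1] = x_hat - S[:, i]
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    weights[0] = lam / (n + lam)
    return points, weights

# Hypothetical two-dimensional state: soil temperature (°C) and humidity (%).
x_hat = np.array([25.0, 40.0])
P = np.array([[1.0, 0.2], [0.2, 2.0]])
points, weights = sigma_points(x_hat, P, lam=1.0)
```

With these weights, the weighted mean and covariance of the points recover `x_hat` and `P`, which is the property the unscented transform relies on.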
The sigma point measurements, predicted weights, and measurement updates were computed through the observation matrix \(H\), as in Eq. (5).
Finally, the Kalman gain, state model, and observation update were as in Eq. (6).
In summary, the UKF mitigates Gaussian noise by propagating sigma points through the nonlinear system and iteratively updating the Kalman gain. Uniform noise is reduced by averaging variations in sigma points, while salt-and-pepper noise is addressed during measurement updates that reject outliers. This process ensures accurate state estimation while preserving the underlying trends in soil parameters. Appendix A presents the pseudo-code for the UKF implementation.
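The prediction and update cycle summarized above can be sketched as follows. This is a generic textbook-style UKF step under stated assumptions, not the implementation of Appendix A: it uses one weight set for mean and covariance, additive noise, and identity process and measurement functions in the example. The names `ukf_step` and `_sigma_points`, the noise levels, and the simulated reading of 25 are hypothetical.

```python
import numpy as np

def _sigma_points(x, P, lam):
    n = x.size
    S = np.linalg.cholesky((n + lam) * P)
    pts = np.vstack([x, x + S.T, x - S.T])  # rows: x, x + col_i, x - col_i
    w = np.full(2 * n + 1, 0.5 / (n + lam))
    w[0] = lam / (n + lam)
    return pts, w

def ukf_step(x, P, z, f, h, Q, R, lam=1.0):
    """One UKF cycle: predict with f, then correct with measurement z via h."""
    # State prediction: propagate sigma points through the process function.
    pts, w = _sigma_points(x, P, lam)
    fx = np.array([f(p) for p in pts])
    x_pred = w @ fx
    dx = fx - x_pred
    P_pred = (w[:, None] * dx).T @ dx + Q
    # State update: redraw points around the prediction and map to measurements.
    pts, w = _sigma_points(x_pred, P_pred, lam)
    hz = np.array([h(p) for p in pts])
    z_pred = w @ hz
    dz = hz - z_pred
    dx = pts - x_pred
    S = (w[:, None] * dz).T @ dz + R      # innovation covariance
    C = (w[:, None] * dx).T @ dz          # state-measurement cross-covariance
    K = C @ np.linalg.inv(S)              # Kalman gain
    x_new = x_pred + K @ (z - z_pred)
    P_new = P_pred - K @ S @ K.T
    return x_new, P_new, K

# Hypothetical example: a slowly varying soil temperature read with Gaussian noise.
rng = np.random.default_rng(0)
identity = lambda v: v
x_est, P_est = np.array([20.0]), np.array([[4.0]])
Q, R = np.array([[1e-4]]), np.array([[0.25]])
for _ in range(200):
    z = np.array([25.0 + rng.normal(0.0, 0.5)])
    x_est, P_est, K = ukf_step(x_est, P_est, z, identity, identity, Q, R)
```

In this linear example the step reduces to a standard Kalman filter; nonlinear `f` and `h` can be passed in without changing the function.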
Unscented Kalman filter with fuzzy logic
The UKF_FL incorporates fuzzy logic into the Kalman filtering process to adaptively adjust the Kalman gain based on noise characteristics. Gaussian noise is filtered through minor, incremental gain adjustments, ensuring smooth state updates. Uniform noise is mitigated by scaling corrections to prevent overcompensation for significant measurement variations. In the case of salt-and-pepper noise, fuzzy rules prioritize the rejection of extreme deviations during the measurement update phase, ensuring that substantial outliers do not affect the state estimation. This adaptive mechanism guarantees effective noise removal while preserving the critical trends and integrity of the soil parameter data.
This method’s innovation involves incorporating Eqs. 7 and 8 into the UKF for specific sigma points through Fuzzy Logic rules. It also addresses the intrinsic limitations of the UKF by normalizing the non-differentiable and differentiable matrix elements (\({\overline{\varphi } }_{j}\), \({\overline{\phi }}_{j}\)) of the intuitionistic fuzzy set and calculating the hesitation margin index \({\pi }_{j}\), as in Eq. (7).
The Fuzzy Logic rules \(y\) were computed as in Eq. (8).
The polynomial parameters \(s, {s}_{i}\) were determined using least squares regression. We compared the possibility matrix to the Gaussian probability of \(\left( {Q_{k} \,\gamma \,R_{k} } \right)\), with \(\delta_{ij} = \left( {Q\,\gamma \,R} \right)\). The detailed implementation is described in19,20. Equations 7 and 8 replaced Eqs. 3, 4, and 5 of the UKF, and the Kalman gain, state model, and observation were then updated by executing Eq. 6. The pseudocode for the UKF_FL is presented in Appendix B.
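The idea of adapting the Kalman gain to the noise characteristics can be illustrated with a deliberately simplified sketch. It does not reproduce the intuitionistic fuzzy set of Eqs. 7 and 8 or Appendix B: the three membership functions, their breakpoints (1, 3, and 5 standard deviations), the output scales, and the function names are all hypothetical choices made for illustration only.

```python
import numpy as np

def fuzzy_gain_scale(innovation, sigma):
    """Map the normalized innovation to a gain scale in [0, 1].

    Hypothetical rules: small deviation -> keep the gain, medium -> halve it,
    large (impulsive) deviation -> reject the measurement.
    """
    r = abs(innovation) / sigma
    small = np.interp(r, [0.0, 1.0, 3.0], [1.0, 1.0, 0.0])
    medium = np.interp(r, [1.0, 3.0, 5.0], [0.0, 1.0, 0.0])
    large = np.interp(r, [3.0, 5.0], [0.0, 1.0])
    memberships = np.array([small, medium, large])
    scales = np.array([1.0, 0.5, 0.0])
    # Weighted-average defuzzification of the rule outputs.
    return float(memberships @ scales / memberships.sum())

def fuzzy_update(x_pred, z, gain, sigma):
    """Scalar measurement update with a fuzzy-scaled Kalman gain."""
    innovation = z - x_pred
    return x_pred + fuzzy_gain_scale(innovation, sigma) * gain * innovation

# A reading close to the prediction is used fully; a spike is ignored.
near = fuzzy_update(25.0, 25.4, gain=0.5, sigma=1.0)
spike = fuzzy_update(25.0, 60.0, gain=0.5, sigma=1.0)
```

The design choice here is that small innovations pass through unchanged, which suits Gaussian noise, while extreme innovations are down-weighted to zero, which mimics the outlier rejection described for salt-and-pepper noise.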
Unscented Kalman filter with artificial neural network
The ANN initially learns the nonlinear relationships present in the sensor data, producing a preliminary denoised signal. This denoised output is then fed into the UKF for further refinement, reducing residual Gaussian noise and minor deviations. The ANN’s mappings effectively suppress uniform noise while ignoring impulsive salt-and-pepper noise during training, ensuring smooth and consistent outputs and Kalman gain convergence. This hybrid approach preserves the underlying trends in the data while providing robust noise reduction across various noise types that can affect soil sensors.
The state prediction was computed by introducing a look-back function to create sequences (batches) of the 10 previous sensor readings, which were used for training (80%) and testing (20%) to compute the Kalman gain. The LSTM was then defined by configuring the forget gate (\({f}_{k}\)), input gate (\({i}_{k}\)), cell state update (\({C}_{k}\)), and output gate (\({O}_{k}\)) to iteratively update the cell state, as in Eqs. (9–12).
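The look-back windowing and the 80/20 split can be sketched as below. The helper names are hypothetical, and the split is assumed to be chronological (no shuffling), which the text does not state explicitly.

```python
import numpy as np

def make_sequences(series, look_back=10):
    """Build (window, next value) pairs from a 1-D series of sensor readings."""
    X = np.array([series[i:i + look_back] for i in range(len(series) - look_back)])
    y = series[look_back:]
    return X, y

def chronological_split(X, y, train_fraction=0.8):
    """Split without shuffling so that test windows come after training windows."""
    cut = int(len(X) * train_fraction)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])

# Hypothetical series of 100 readings taken every 30 min.
series = np.arange(100, dtype=float)
X, y = make_sequences(series, look_back=10)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
```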
The designed LSTM model consisted of two layers, each with 64 units. The first layer was configured to return sequences to the second layer, followed by a 20% dropout to prevent overfitting. The output and hidden layers were aligned with the number of predicted features to generate the final prediction.
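For readers unfamiliar with the gate updates, one LSTM cell step can be sketched in plain NumPy. This assumes the standard LSTM recurrence, which may differ in notation from Eqs. (9–12); the weights are random and untrained, and the weight layout, function names, and the seven-feature input are assumptions for illustration.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_cell_step(x_k, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell.

    W: (4*units, n_in), U: (4*units, units), b: (4*units,),
    stacked in the order [forget, input, candidate, output].
    """
    units = h_prev.size
    a = W @ x_k + U @ h_prev + b
    f_k = sigmoid(a[:units])                  # forget gate
    i_k = sigmoid(a[units:2 * units])         # input gate
    g_k = np.tanh(a[2 * units:3 * units])     # candidate cell state
    o_k = sigmoid(a[3 * units:])              # output gate
    c_k = f_k * c_prev + i_k * g_k            # cell state update
    h_k = o_k * np.tanh(c_k)                  # hidden state
    return h_k, c_k

# Run one window of 10 readings with 7 features through a 64-unit cell.
rng = np.random.default_rng(0)
units, n_in = 64, 7
W = rng.normal(0.0, 0.1, (4 * units, n_in))
U = rng.normal(0.0, 0.1, (4 * units, units))
b = np.zeros(4 * units)
h, c = np.zeros(units), np.zeros(units)
for x_k in rng.normal(size=(10, n_in)):
    h, c = lstm_cell_step(x_k, h, c, W, U, b)
```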
To complement the LSTM, we ran a parallel ANN model to approximate the Kalman gain, using \({Q}_{k},{R}_{k}\) as the input layer, a 32-unit dense layer with Rectified Linear Unit (\(ReLU\)) activation, and a single-unit output layer that predicted the Kalman gain. The input layer of the ANN comprised two nodes, \(\widehat{P}\) and \(R\), while the hidden layer’s ReLU activation function was as in Eq. (13).
The output layer of the ANN (the Kalman gain) was computed from the linear regression of the predicted stage, weights \(w\), and bias \(b\), as in Eq. (14).
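The forward pass of such a small gain network can be sketched as follows, assuming a two-node input, one 32-unit ReLU layer, and a linear single-unit output. The weights shown are placeholders chosen so that the result is easy to check by hand; they are not trained values, and the function names are hypothetical.

```python
import numpy as np

def relu(a):
    """Rectified Linear Unit: max(0, a), applied element-wise."""
    return np.maximum(a, 0.0)

def gain_network(p_hat, r, W1, b1, w2, b2):
    """Forward pass: 2 inputs -> 32 ReLU units -> 1 linear output (the gain)."""
    inputs = np.array([p_hat, r])
    hidden = relu(W1 @ inputs + b1)    # shape (32,)
    return float(w2 @ hidden + b2)     # linear output layer

# Placeholder weights: every hidden unit outputs p_hat, and the output averages them.
W1 = np.tile(np.array([1.0, 0.0]), (32, 1))
b1 = np.zeros(32)
w2 = np.full(32, 1.0 / 32.0)
b2 = 0.0
k_approx = gain_network(0.4, 1.0, W1, b1, w2, b2)
```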
In the update state, we recursively updated the Kalman gain (\({K}_{k}\)) by first computing the predicted step (\({\widehat{P}}_{k}\)) from the initial \({P}_{0}\) and adding the measurement covariance \({Q}_{k}\), as in Eq. (15).
This approach dynamically adjusted the \({K}_{k}\) iteratively, refining the predictions in real-time. This method’s novelty was the introduction of a mechanism to monitor the Kalman gain convergence \({K}_{kC}\) over multiple data points, thereby identifying the stabilization point, which signifies the algorithm’s effective calibration and reliability (Fig. 5).
After \({K}_{kC}\) was reached, we updated the state model and covariance, as in Eq. (15).
Equations 10–14 replaced Eqs. 1–5 of the UKF, and the state model and observation were updated by executing Eq. 6, similarly to Eq. 15. Appendix C describes the pseudocode for the UKF_ANN implementation.
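The Kalman gain convergence monitoring described in this section can be illustrated with a scalar sketch. It assumes the standard scalar gain recursion rather than the ANN-approximated gain, and the stopping rule (change below `tol` for `patience` consecutive steps) is a hypothetical criterion, since the text does not specify how stabilization was detected.

```python
def gain_convergence_step(p0, q, r, tol=1e-4, patience=5, max_steps=1000):
    """Iterate the scalar gain recursion and report when the gain stabilizes.

    Returns (step index or None, list of gains).
    """
    p, prev_gain, stable = p0, None, 0
    gains = []
    for k in range(max_steps):
        p_pred = p + q                    # predicted covariance
        gain = p_pred / (p_pred + r)      # Kalman gain
        p = (1.0 - gain) * p_pred         # updated covariance
        gains.append(gain)
        if prev_gain is not None and abs(gain - prev_gain) < tol:
            stable += 1
            if stable >= patience:
                return k, gains
        else:
            stable = 0
        prev_gain = gain
    return None, gains

# Hypothetical noise levels for one soil parameter.
step, gains = gain_convergence_step(p0=1.0, q=0.01, r=1.0)
```

The gain starts high, when the initial uncertainty dominates, and settles to a steady value determined by the ratio of the two covariances; the returned step index marks the stabilization point.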
Cubature Kalman filter implementation
The CKF employs symmetric cubature points to manage nonlinear dynamics and decrease Gaussian noise. Uniform noise is minimized through covariance updates, and extreme outliers from salt-and-pepper noise are partially filtered out. The algorithm preserves soil data trends while balancing state estimation and noise suppression. For the CKF implementation, we replaced the sigma point generation of Eqs. 3 to 5 in the UKF with the spherical-radial cubature rule for \({x}_{k}\), as in Eq. (16).
where \(m\) is the number of cubature points, \({\omega }_{i}=\frac{1}{m}\) (\(i=1, 2, \dots , m\); \(m=2n\)) are the positive weights, and \(n\) is the dimension of the state vector. Detailed information about the CKF is described in16,18. The pseudocode of the CKF is shown in Appendix D. We implemented parameter auto-tuning to minimize the covariance error, preventing overfitting. The initial conditions were set as in Table 3.
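The cubature point set with \(m=2n\) equally weighted points can be sketched as below, assuming the standard third-degree spherical-radial rule. The function name and the example state are hypothetical; Appendix D and Eq. (16) remain the reference for the actual implementation.

```python
import numpy as np

def cubature_points(x_hat, P):
    """Return the 2n cubature points (one per row) and their equal weights 1/m."""
    n = x_hat.size
    S = np.linalg.cholesky(P)
    # Unit directions +/- e_i scaled by sqrt(n), as in the third-degree rule.
    xi = np.sqrt(n) * np.hstack([np.eye(n), -np.eye(n)])   # shape (n, 2n)
    points = (x_hat[:, None] + S @ xi).T                   # shape (2n, n)
    weights = np.full(2 * n, 1.0 / (2 * n))
    return points, weights

# Hypothetical three-dimensional state: soil temperature, humidity, and pH.
x_hat = np.array([25.0, 40.0, 6.5])
P = np.diag([1.0, 4.0, 0.04])
points, weights = cubature_points(x_hat, P)
```

Unlike the UKF scheme, there is no center point and no tunable spread parameter, which is one reason the CKF needs fewer operations per step.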
Data analysis
The primary assumption for the collected parameters is that they have piecewise nonstationary characteristics, with no abrupt changes and only smooth variations between the a priori and posterior sampling times (Fig. 1)3,7. To this end, four algorithms (UKF, UKF_FL, UKF_ANN, and CKF) were tested for their performance in sensor noise removal for IoT applications. Inferences were based on root mean square error (RMSE), mean absolute error (MAE), the coefficient of determination (R2), computation memory (CM), and computation time (CT). In addition to the boxplots (Fig. 6), we plotted a graph (Fig. 7) of each model’s behavior to evaluate the effectiveness of the filters in eliminating different types of sensor noise (uniform, Gaussian, and salt-and-pepper) for each soil parameter.
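The evaluation metrics can be computed as in the following sketch. The RMSE, MAE, and R2 definitions are the standard ones; how CM and CT were actually measured is not stated, so the use of `tracemalloc` (which tracks Python-level allocations only) and `time.perf_counter` is an assumption, and the `profile` helper and sample values are hypothetical.

```python
import time
import tracemalloc
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def profile(fn, *args):
    """Return (result, elapsed seconds, peak traced bytes) for one call of fn."""
    tracemalloc.start()
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak

# Hypothetical reference and filtered values.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.0, 2.0, 3.0, 5.0])
score, ct, cm = profile(rmse, y_true, y_pred)
```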
Furthermore, we plotted the Pearson correlation matrix of the normalized sensor data after performing outlier detection using Bayesian decision theory and hypothesis testing with theoretical guarantees, as proposed by34. Concurrently, we calculated the Pearson correlation matrix of CKF, identified as the best-performing algorithm, to compare the inferences drawn before and after removing data noise. The analysis was performed in Python 3.12 using a feed-forward model on a Raspberry Pi 5 with a 64-bit quad-core Cortex-A76 processor running at 2.4 GHz and 4 GB of RAM.
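The sensitivity of the Pearson coefficient to impulsive outliers, which motivates comparing the matrices before and after denoising, can be illustrated on synthetic data. The series below are hypothetical and are not drawn from the study dataset; the spike values and their positions are arbitrary.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

# A perfectly linear pair of variables.
x = np.linspace(0.0, 1.0, 200)
y_clean = 2.0 * x + 1.0

# The same pair with ten salt-and-pepper style spikes injected into y.
y_noisy = y_clean.copy()
y_noisy[::40] = 50.0
y_noisy[20::40] = -50.0

r_clean = pearson(x, y_clean)
r_noisy = pearson(x, y_noisy)
```

Only 5% of the samples are corrupted, yet the coefficient drops from a perfect linear association to a much weaker one, which is why undetected outliers can hide real relationships between soil parameters.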
Results
Boxplots
The boxplots (Fig. 6) of soil temperature, electrical conductivity, and phosphorus showed that CKF matched the sensor data patterns, followed by FL and ANN. Additionally, ANN demonstrated potential for removing outliers despite slight variations in data pattern structure (mean values) and amplitude. In contrast, the UKF failed to reproduce the real data structure, affecting the mean value and distribution within the quartiles. Although the UKF failed to preserve the variability, it reduced the interquartile range (IQR) compared to the sensor data, indicating salt-and-pepper noise suppression (Fig. 6a) for soil moisture and soil phosphorus.
The ANN removed the abrupt data variations (associated with Gaussian and salt-and-pepper noise) suspected to be present in the soil EC and pH sensor data. The soil potassium and nitrogen parameters (Fig. 6e–g) were prone to several abrupt data changes associated with uniform and Gaussian noise, which only FL handled effectively.
CKF was generally ideal for IoT-based soil monitoring sensor data denoising, with ANN and UKF_FL as alternative solutions for specific cases based on noise types.
Models’ performance
To strengthen our inferences, we developed a dashboard to visualize the real-time data of each parameter, allowing for a comparison of model performance (Fig. 7). The CKF (\({R}^{2}\cong 0.96\)) and ANN (\({R}^{2}\cong 0.96\)) effectively filtered the Gaussian, Uniform, and Salt-and-Pepper noise across all variables. However, the FL filter failed (\({R}^{2}\cong 0.88\)) to remove the Gaussian noise for soil moisture, temperature, and electrical conductivity.
Additionally, the CKF quickly converged (12 input entries) to the sensor data, followed by the UKF_FL (Appendix E). The delayed convergence of the ANN is due to the time required to compute the 9 batches of 10 inputs (Fig. 5) for the Kalman gain to converge across each variable. This is noticeable in the model’s initial phase, where the ANN produces inconsistent results until data splitting, training, and testing are completed and the Kalman gain converges, enabling accurate state estimation (see the red line fluctuation at the start of the process in Fig. 7).
Additionally, CKF and ANN effectively sharpened the data and reduced the abrupt changes observed in the sensor data (Fig. 7b).
The Cubature Kalman Filter (CKF) demonstrated potential for managing soil temperature sensor Gaussian noise (RMSE ± 0.35), highlighting significant differences in convergence and stability. Meanwhile, the UKF integrated with an artificial neural network (UKF_ANN) effectively suppressed the evaluated sensor noise types, but tended to be over-smooth, which could compromise data accuracy (RMSE ± 0.18). In contrast, the UKF integrated with fuzzy logic (UKF_FL) exhibited exceptional adaptability, efficiently suppressing various types of noise while preserving data integrity, making it particularly well-suited for practical applications in noisy environments.
The CKF and UKF_FL provided stable and accurate results; however, the UKF_FL showed superior adaptability to uniform, Gaussian, and salt-and-pepper sensor noise. While the UKF struggled with Gaussian noise, it stabilized over time. A zoom-in (Appendix E) of the UKF_ANN demonstrated its potential in showing a gradual state change, reflecting the true field scenario rather than replicating the abrupt state variations caused by sensor noise. This makes the filter a promising approach for expert systems in real-time sensing and actuation in precision agriculture.
Reducing sensor noise in precision agriculture enhances resource efficiency. Uniform sensor noise can obstruct the delineation of management zones, which is particularly critical when farmers use variable-rate fertilizer applications. This process requires accurate real-time mapping of soil parameter variability for the correct functioning of the fertilizer application system. If noise accumulates, whether Gaussian or salt-and-pepper, it can cause the entire system to overlook minor variations, leading to increased costs. While uniform and Gaussian noise can significantly affect management zones (MZs), Salt-and-pepper noise within the MZs (e.g., due to lakes, rocks, channels, and drift) necessitates targeted optimization management.
The CKF, ANN, and FL real-time models demonstrated stable and responsive performance regarding soil nitrogen (N), phosphorus (P), and potassium (K), showcasing their effectiveness in managing MZs while minimizing soil acidification associated with biased application rates of N, P, and K due to noisy data. However, FL underestimated the sensor data and failed to effectively smooth the local state transitions caused by Gaussian and uniform sensor noise (Fig. 7e–g), which risks applying higher rates and potentially contaminating water and the environment. The CKF and UKF_ANN variants were the most accurate (R2 = 0.96) and reliable models for precision agriculture applications.
All applied filters removed the abrupt state changes related to sensor node path loss, delays, and network noise. These issues are associated with salt-and-pepper noise and lead to anomalous data.
Evaluation metrics
The numerical evaluation of the models’ performance (Table 4) showed that the CKF reduced RMSE by 32% compared to the UKF, while the ANN and FL models each showed a 2% reduction. Additionally, the CKF and ANN exhibited comparable performance regarding R2 and MAE, though ANN outperformed CKF in predicting soil phosphorus.
Moreover, the CKF reduced the UKF’s computation memory and time by 75%, while the ANN increased them by 50% and 450%, respectively. Despite this, the ANN and FL performed similarly regarding RMSE, R2, and MAE. The ANN outperformed the CKF in predicting soil phosphorus, even with uniform noise (Fig. 7f–g). These findings suggest that combining graphical visualization with numerical evaluation can enhance model selection.
Pearson correlation
The Pearson correlation analysis (Fig. 8) showed a significant (p < 0.05) association between the variables, supporting the alternative hypothesis. A strong linear association (Fig. 8a) was observed between soil moisture and soil pH, soil moisture and soil electrical conductivity, soil temperature and soil electrical conductivity, and soil pH and soil temperature.
Additionally, the relationships between soil moisture and soil pH and between soil pH and soil temperature showed negative correlations, with strengths ranging from moderate to strong, based on the scale in35,36. In contrast, the soil moisture and electrical conductivity pair and the soil temperature and electrical conductivity pair demonstrated positive correlations, with strengths varying from strong to very strong.
Additionally, undetected outliers in the scatter plots for the nitrogen-phosphorus, nitrogen-soil temperature, and nitrogen-potassium pairs (Fig. 8a) impacted the performance of the Pearson correlation. The CKF (Fig. 8b) effectively removed these outliers, revealing more precise data patterns among the variables. Furthermore, the CKF enhanced the associations between variables, which were previously obscured by noise in the sensor data.
Moreover, the CKF improved the data structure of the association between potassium and conductivity, as well as phosphorus and potassium (Fig. 8b). Additionally, the CKF provided better data smoothing, accurately capturing the state transitions of the soil parameters. Using CKF-processed data to evaluate the relationships between variables enhanced the inferences about soil agricultural parameters, underscoring the importance of denoising soil sensor data for more informed decision-making.
Discussions
The correlation analysis demonstrated that soil moisture enhances the accuracy of the data on soil pH (r = 0.96) and electrical conductivity (r = 0.89) measurements29. While clear linear associations were not evident, strong evidence suggests causal relationships between nitrogen and phosphorus and between nitrogen and temperature.
Soil pH directly influences the biological processes regulating nitrogen availability and alters the chemical forms of phosphorus, affecting its availability to plants37. Additionally, a nonlinear relationship between soil phosphorus and temperature was observed. Despite a negligible Pearson correlation (r = −0.10), the data pattern suggests a sigmoidal trend (Fig. 8).
Moreover, the strong correlation observed between soil temperature, moisture, and electrical conductivity (Fig. 8) aligns with previous findings38 that reported a causal association between electrical conductivity and soil moisture linked to temporal patterns of these parameters. Analyzing the interrelationships among these parameters using Pearson correlation provides clear insights into how managing one can affect another. Moreover, inferences drawn from one variable can be applied to others based on their relationships. For instance, reduced soil moisture can lead to inaccurate results from NPK and EC, which may cause improper fertilizer application and poor irrigation management, ultimately affecting crop yield.
The increased convergence time in the ANN model, which has a delay of 11 s, is unlikely to affect inferences for soil parameters with less abrupt variations (pH, N, P, K, EC) despite the additional computational resources required for training and testing39.
The soil pH and moisture showed predictability, even with low-order nonlinearity models (n < 3). The lower variability of the soil pH can be related to its dependence on soil nutrients, soil origin, and soil management40. In a related study (Table 4), Chana et al.41 applied the random forest algorithm to 7-in-1 integrated soil sensor data to predict crop yield and reported limited generalization and extensive data requirements, leading to increased computation time.
In Singha et al.29, Support Vector Machines and Partial Least Squares Regression were used to predict soil parameters through Vis–NIR reflectance spectroscopy for proximal sensing. This method depended on cross-validation of sensor data and laboratory analysis for inferences, as real-time online data denoising was not utilized. While Vis–NIR reflectance spectroscopy can accurately predict soil parameters, the overlapping spectrum frequencies of various soil parameters necessitate careful separation of frequencies. Additionally, post-processing Vis–NIR spectroscopy data restricts its feasibility for on-field, real-time applications. These limitations underscore the need for implementing real-time nonlinear sensor denoising filters to improve sensor data before any inference processing.
Despite the quick convergence shown by the FL model (Appendix E), it underestimated the states of soil nitrogen, phosphorus, potassium, and pH while effectively maintaining the global data trends and removing Gaussian, uniform, and salt-and-pepper noise (Fig. 7b). The underestimation issues in FL can be attributed to the inability of the fuzzy logic rules to represent the entire data set, suggesting truncation during the computation of the possibility matrix \({\delta }_{ij}\). Additionally, the failure of FL to compute \({\delta }_{ij}\) accurately may result in a wide truth zone that can be incapable of detecting uniform noise (Fig. 7a – c) within that zone.
Optimizing the Kalman gain using ANN by splitting the data into batches of 10 inputs for training and testing (Fig. 5) enabled copying of the local and global variations, overcoming the limitations observed in FL (Fig. 7a–c). This highlights the hypotheses of implementing truncation to enhance the capture of local variation.
The gradual state transition of the CKF and ANN, which eliminates uniform, Gaussian, and salt-and-pepper noise from sensor data related to soil parameters (Appendix E), can be linked to the radial approach for selecting sigma points and the complexity of specific parameters with delayed convergence (soil temperature, moisture, and electrical conductivity). These may require highly robust filters to replicate the behavior of complex time series parameters, as observed by20,42,43,44 when dealing with high (\(n\ge 3\)) nonlinearity problems.
The multifactorial characteristics of soil temperature, humidity, and electrical conductivity that lead to delayed convergence and rapid data fluctuations can be traced back to the heterogeneous environment (sensor deployed in an open field without mulching), resulting in swift variations in these parameters, as observed by40. Therefore, these results suggest implementing CKF as an integrated real-time operating system (ROS) to minimize sensor noise in heterogeneous environments. The numerical results of the related studies are presented in Table 5.
Compared to the results obtained in29,41, the strength of our results lies in the real-time removal of sensor noise using only the current and past states to estimate future states, reducing complexity. This contrasts with post-processing approaches that require large datasets for improved results and are prone to issues like vanishing gradients. Similarly, in27, a hybrid Kalman filter enhanced soil sensor data in a controlled environment. However, the transferability of those results to uncontrolled environments was limited due to several uncontrolled parameters in natural conditions, which led to model failures or required higher-order models. Additionally, the models in27 were implemented on resource-constrained devices without reporting resource utilization, limiting the ability for numerical comparison. In contrast, our study comprehensively evaluates advanced Kalman filters to improve real-time sensor denoising for agricultural soil parameters on IoT resource-constrained devices, highlighting the CKF and UKF_ANN as practical approaches for real-time sensor management in heterogeneous, open-field environments.
Conclusion
This paper evaluated the performance of Unscented Kalman Filter (UKF) extensions on resource-constrained IoT devices for real-time sensor denoising of agricultural soil parameters, with the aim of providing practical solutions for improving the reliability of IoT-based soil monitoring systems.
The cubature Kalman filter (CKF) and the unscented Kalman filter combined with an artificial neural network (UKF_ANN) kept the censored data structure intact while removing Gaussian, uniform, and salt-and-pepper sensor noise.
The CKF converged quickly, after only ten input samples, and reduced computation memory and computation time by 75%.
CKF (\(R^2\) = 0.99), ANN (\(R^2\) = 0.99), and UKF (\(R^2\) = 0.89) accurately predicted soil pH, although the ANN required a longer computation time (11 s). A delay of this length is small relative to the 30 min sampling interval and therefore does not interfere with inference on soil parameters, while the improved convergence of the Kalman gain allows the filter to better handle high-order (\(n>3\)) nonlinearity.
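The accuracy and resource figures summarized above rest on standard quantities, and the sketch below shows one way they could be computed in Python. It assumes \(R^2\) is the coefficient of determination, computation time is wall-clock time, and computation memory is peak traced allocation. The exact measurement procedure is not stated in this section, so the `profile` helper and the `exponential_smoothing` placeholder denoiser are hypothetical stand-ins rather than the tooling used in the study.

```python
import math
import time
import tracemalloc


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def exponential_smoothing(values, alpha):
    """Placeholder denoiser (hypothetical stand-in for a real filter)."""
    smoothed = [values[0]]
    for v in values[1:]:
        smoothed.append(alpha * v + (1.0 - alpha) * smoothed[-1])
    return smoothed


def profile(fn, *args):
    """Run fn(*args); return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


if __name__ == "__main__":
    reference = [6.5, 6.5, 6.6, 6.6, 6.7]  # hypothetical clean pH values
    noisy = [6.4, 6.7, 6.5, 6.8, 6.6]      # hypothetical noisy readings
    filtered, ct, cm = profile(exponential_smoothing, noisy, 0.5)
    print("MAE :", mae(reference, filtered))
    print("RMSE:", rmse(reference, filtered))
    print("R2  :", r2_score(reference, filtered))
    print("CT (s):", ct, "CM (bytes):", cm)
```

Note that `tracemalloc` only tracks memory allocated by the Python interpreter, so the reported peak would understate the footprint of a model that allocates memory in native libraries.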
In future work, we will extend our study by predicting soil nutrient mobility to provide real-time optimal fertilization dates and amounts.
Data availability
The data is available at: https://drive.google.com/drive/folders/1U0if8tsoLtQ-r_9QiAMCpG7kxrpSYjed?usp=drive_link.
References
Biju, V. G., Schmitt, A.-M. & Engelmann, B. Assessing the Influence of Sensor-Induced Noise on Machine-Learning-Based Changeover Detection in CNC Machines. Sensors 24(2), 330. https://doi.org/10.3390/s24020330 (2024).
Jerath, K., Brennan, S. & Lagoa, C. Bridging the gap between sensor noise modeling and sensor characterization. Measurement 116, 350–366. https://doi.org/10.1016/j.measurement.2017.09.012 (2018).
Sharma, A. B., Golubchik, L. & Govindan, R. Sensor faults: Detection methods and prevalence in real-world datasets. ACM Trans. Sens. Netw. 6(3), 1–39. https://doi.org/10.1145/1754414.1754419 (2010).
Barriga, J. A., Blanco-Cipollone, F., Trigo-Córdoba, E., García-Tejero, I. & Clemente, P. J. Crop-water assessment in Citrus (Citrus sinensis L.) based on continuous measurements of leaf-turgor pressure using machine learning and IoT. Expert Syst. Appl. 209, 118255. https://doi.org/10.1016/j.eswa.2022.118255 (2022).
Ibarguengoytia, P. H., Sucar, L. E. & Vadera, S. Real time intelligent sensor validation. IEEE Trans. Power Syst. 16(4), 770–775. https://doi.org/10.1109/59.962425 (2001).
Kim, H. N. & Park, J. H. Monitoring of soil EC for the prediction of soil nutrient regime under different soil water and organic matter contents. Appl. Biol. Chem. 67(1), 1. https://doi.org/10.1186/s13765-023-00849-4 (2024).
Martín, J., Sáez, J. A. & Corchado, E. Tackling the problem of noisy IoT sensor data in smart agriculture: Regression noise filters for enhanced evapotranspiration prediction. Expert Syst. Appl. 237, 121608. https://doi.org/10.1016/j.eswa.2023.121608 (2024).
Teh, H. Y., Kempa-Liehr, A. W. & Wang, K.I.-K. Sensor data quality: a systematic review. J. Big Data 7(1), 11. https://doi.org/10.1186/s40537-020-0285-1 (2020).
Bordoni, F. & D’Amico, A. Noise in sensors. Sens. Actuators Phys. 21(1–3), 17–24. https://doi.org/10.1016/0924-4247(90)85003-M (1990).
Patel, A. K., Ghosh, J. K. & Sayyad, S. U. Fractional abundances study of macronutrients in soil using hyperspectral remote sensing. Geocarto Int. 37(2), 474–493. https://doi.org/10.1080/10106049.2020.1720315 (2022).
Bertolla, A. B. & Cruvinel, P. E. Band-Pass Filtering for Non-Stationary Noise in Agricultural Images to Pest Control Based On Adaptive Semantic Modeling. In 2021 IEEE 15th International Conference on Semantic Computing (ICSC), Laguna Hills, CA, USA (IEEE). https://doi.org/10.1109/ICSC50631.2021.00073 (2021).
Crisan, D. & Míguez, J. Nested particle filters for online parameter estimation in discrete-time state-space Markov models. Bernoulli https://doi.org/10.3150/17-BEJ954 (2018).
He, J., Khedher, A. & Spreij, P. A Kalman particle filter for online parameter estimation with applications to affine models. Stat. Inference Stoch. Process. 24(2), 353–403. https://doi.org/10.1007/s11203-021-09239-3 (2021).
Song, X., Huang, J. & Song, D. Air Quality Prediction based on LSTM-Kalman Model. In 2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), Chongqing, China (IEEE). https://doi.org/10.1109/ITAIC.2019.8785751 (2019).
Kalman, R. E. A New Approach to Linear Filtering and Prediction Problems. J. Basic Eng. 82(1), 35–45. https://doi.org/10.1115/1.3662552 (1960).
Arasaratnam, I. & Haykin, S. Cubature Kalman Filters. IEEE Trans. Autom. Control 54(6), 1254–1269. https://doi.org/10.1109/TAC.2009.2019800 (2009).
Fu, X. et al. Soil moisture estimation by assimilating in-situ and SMAP surface soil moisture using unscented weighted ensemble Kalman filter. Water Resour. Res. e2023WR034506. https://doi.org/10.1029/2023WR034506 (2023).
Gao, B., Hu, G., Zhong, Y. & Zhu, X. Cubature rule-based distributed optimal fusion with identification and prediction of kinematic model error for integrated UAV navigation. Aerosp. Sci. Technol. https://doi.org/10.1016/j.ast.2020.106447 (2021).
Madhukar, P. S. & Prasad, L. B. State Estimation using Extended Kalman Filter and Unscented Kalman Filter. In 2020 International Conference on Emerging Trends in Communication, Control and Computing (ICONC3), Lakshmangarh, India (IEEE). https://doi.org/10.1109/ICONC345789.2020.9117536 (2020).
Sun, Y. et al. Improving Treatment of Noise Specification of Kalman Filtering for State Updating of Hydrological Models: Combining the Strengths of the Interacting Multiple Model Method and Cubature Kalman Filter. Water Resour. Res. https://doi.org/10.1029/2022WR033635 (2023).
Choudhury, S. et al. Agriculture Field Automation and Digitization Using Internet of Things and Machine Learning. J. Sens. 2022, 1–17. https://doi.org/10.1155/2022/9042382 (2022).
Farooq, M. S., Riaz, S., Abid, A., Umer, T. & Zikria, Y. B. Role of IoT Technology in Agriculture: A Systematic Literature Review. Electronics 9(2), 319. https://doi.org/10.3390/electronics9020319 (2020).
Dhanaraju, M., Chenniappan, P., Ramalingam, K., Pazhanivelan, S. & Kaliaperumal, R. Smart Farming: Internet of Things (IoT)-Based Sustainable Agriculture. Agriculture 12(10), 1745. https://doi.org/10.3390/agriculture12101745 (2022).
Rajak, P., Ganguly, A., Adhikary, S. & Bhattacharya, S. Internet of Things and smart sensors in agriculture: Scopes and challenges. J. Agric. Food Res. https://doi.org/10.1016/j.jafr.2023.100776 (2023).
Kadrolli, V. & Kalnoor, G. IoT and Smart Sensors for Remote Sensing Healthcare and Agriculture Applications. Remote Sens. Earth Syst. Sci. 7(4), 364–378. https://doi.org/10.1007/s41976-024-00129-9 (2024).
Armando, E., Hanyurwimfura, D., Gatera, O. & Nduwumuremyi, A. Kalman Filter and Artificial Neural Network for Real time Sensor Denoising. In 2023 International Conference on Information Technology and Computing (ICITCOM), Yogyakarta, Indonesia (IEEE). https://doi.org/10.1109/ICITCOM60176.2023.10441838 (2023).
Jihani, N., Kabbaj, M. N. & Benbrahim, M. Kalman filter based sensor fault detection in wireless sensor network for smart irrigation. Results Eng. https://doi.org/10.1016/j.rineng.2023.101395 (2023).
Ludeña-Choez, J., Choquehuanca-Zevallos, J. J. & Mayhua-López, E. Sensor nodes fault detection for agricultural wireless sensor networks based on NMF. Comput. Electron. Agric. 161, 214–224. https://doi.org/10.1016/j.compag.2018.06.033 (2019).
Singha, C., Swain, K. C., Sahoo, S. & Govind, A. Prediction of soil nutrients through PLSR and SVMR models by VIs-NIR reflectance spectroscopy. Egypt. J. Remote Sens. Space Sci. 26(4), 901–918. https://doi.org/10.1016/j.ejrs.2023.10.005 (2023).
Klein, L., Dvorský, J., Seidl, D. & Prokop, L. Novel lossy compression method of noisy time series data with anomalies: Application to partial discharge monitoring in overhead power lines. Eng. Appl. Artif. Intell. https://doi.org/10.1016/j.engappai.2024.108267 (2024).
Dev, D. R. et al. Uncertainty determination and reduction through novel approach for industrial IOT. Meas. Sens. https://doi.org/10.1016/j.measen.2023.100995 (2024).
Julier, S. J. & Uhlmann, J. K. Unscented Filtering and Nonlinear Estimation. Proc. IEEE 92(3), 401–422. https://doi.org/10.1109/JPROC.2003.823141 (2004).
Wang, X., Zhang, H., Gao, X. & Zhao, R. The Tobit-Unscented-Kalman-Filter-Based Attitude Estimation Algorithm Using the Star Sensor and Inertial Gyro Combination. Micromachines 14(6), 1243. https://doi.org/10.3390/mi14061243 (2023).
Gupta, M., Gao, J., Aggarwal, C. C. & Han, J. Outlier Detection for Temporal Data: A Survey. IEEE Trans. Knowl. Data Eng. 26(9), 2250–2267. https://doi.org/10.1109/TKDE.2013.184 (2014).
Schober, P., Boer, C. & Schwarte, L. A. Correlation Coefficients: Appropriate Use and Interpretation. Anesth. Analg. 126(5), 1763–1768. https://doi.org/10.1213/ANE.0000000000002864 (2018).
Okwonu, F. Z., Asaju, B. L. & Arunaye, F. I. Breakdown Analysis of Pearson Correlation Coefficient and Robust Correlation Methods. IOP Conf. Ser. Mater. Sci. Eng. https://doi.org/10.1088/1757-899X/917/1/012065 (2020).
Placidi, P. et al. Monitoring Soil and Ambient Parameters in the IoT Precision Agriculture Scenario: An Original Modeling Approach Dedicated to Low-Cost Soil Water Content Sensors. Sensors 21(15), 5110. https://doi.org/10.3390/s21155110 (2021).
Cao, Z., Mu, S., Xu, L., Shao, M. & Qu, H. Causal Research on Soil Temperature and Moisture Content at Different Depths. IEEE Access 9, 39077–39088. https://doi.org/10.1109/ACCESS.2021.3064264 (2021).
Bogena, H., Huisman, J., Schilling, B., Weuthen, A. & Vereecken, H. Effective Calibration of Low-Cost Soil Water Content Sensors. Sensors 17(1), 208. https://doi.org/10.3390/s17010208 (2017).
Fu, X. et al. Effects of Soil Hydraulic Properties on Soil Moisture Estimation. J. Meteorol. Res. 37(1), 58–74. https://doi.org/10.1007/s13351-023-2049-2 (2023).
Chana, A. M., Batchakui, B. & Nges, B. B. Real-Time Crop Prediction Based on Soil Fertility and Weather Forecast Using IoT and a Machine Learning Algorithm. Agric. Sci. 14(05), 645–664. https://doi.org/10.4236/as.2023.145044 (2023).
Armando, E. J., Hanyurwimfura, D., Gatera, O. & Nduwumuremyi, A. Enhancing Agricultural Internet of Things Data Accuracy: Evaluating Kalman Filter-Based Sensor Denoising Techniques. J. Biosyst. Eng. 50(1), 47–58. https://doi.org/10.1007/s42853-025-00252-5 (2025).
Kulikova, M. V. & Kulikov, G. Yu. MATLAB-based general approach for square-root extended-unscented and fifth-degree cubature Kalman filtering methods. Eur. J. Control 59, 1–12. https://doi.org/10.1016/j.ejcon.2021.01.003 (2021).
Lv, Y.-W. & Yang, G.-H. An adaptive cubature Kalman filter for nonlinear systems against randomly occurring injection attacks. Appl. Math. Comput. https://doi.org/10.1016/j.amc.2021.126834 (2022).
Acknowledgements
The authors thank the International Centre of Insect Physiology and Ecology (ICIPE) for awarding a scholarship to the first author and the Center of Excellence in the Internet of Things for assisting with laboratory equipment. The authors also thank the KINAZI cassava Plant for the system deployment.
Funding
This work was supported in part by the National Council for Science and Technology of Rwanda (NCST), under grant No.: NCST-NRIF/RIC-R&D-PHASE I/08/04/2022, and in part under the Regional Scholarship Investigation Fund (RSIF/PASET).
Contributions
E.J.A. and O.G. carried out the conceptualization, literature review, and formal analysis; D.H. performed the review, algorithm design, validation, and data curation; A.N. and K.S.K. performed the data analysis and formatting. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Armando, E.J., Hanyurwimfura, D., Gatera, O. et al. Evaluation of advanced Kalman filter on real-time agricultural soil parameters through an IoT resources-constrained device. Sci Rep 15, 20829 (2025). https://doi.org/10.1038/s41598-025-05427-w
DOI: https://doi.org/10.1038/s41598-025-05427-w