Introduction

Off-road agricultural traction is generated at the soil-tire contact during field operations to develop drawbar pull, tillage draft, and tractor-implement locomotion. However, generating traction in field conditions is a power-demanding process that consumes ~50% of the total energy required in mechanized agricultural systems1,2. Although 90% of the tractor engine power is transmitted to tractive devices via the axle torque3, the generated tractive force (FTr) is largely affected by heterogeneous and dynamically changing factors at the soil-tire and soil-implement interface, where 20–55% of tillage energy can be lost4,5. The FTr is dynamically influenced by trafficability conditions and tractor-implement settings, locomotion configurations, and causal-response effects arising from wheel load (WLoad), tire inflation pressure (PTire), rut depth (Rdepth), and the operating tillage depth (Tdepth). Multivariate causal-response effects, such as vertical soil reactions, dynamic weight transfer, rolling resistance force (FRr), tire deflections, and wheel slip (SWheel), as well as fuel-torque throttling, all influence the resultant tractive thrust force6. Therefore, the net FTr generated is a non-linear function of multivariate and dynamically complex soil-machine variables, as well as tractor-implement configuration and settings. Accordingly, accurately generating and utilizing the desired FTr through conventional gear-up and throttle-down adjustments to correct power/load mismatch, while maximizing energy-use efficiency under the intricate soil-machine conditions in situ, becomes challenging. Approaches that respond precisely to dynamically variable soil conditions by quickly adjusting the multiple machinery parameters and settings that influence FTr are not documented in the scientific literature. Thus, optimized generation and utilization of FTr for improved energy-use efficiency in tillage operations relies solely on the traditional “gear-up and throttle-back” methods adopted by conventional “proficient operators”. However, maximum FTr does not necessarily correspond to maximum engine power, WLoad, tillage speed, or minimum or maximum fuel consumption rate (ØFuel) and SWheel. As a result, optimizing soil-tool-wheel and tractor-implement parameters for accurate generation and utilization of FTr from such a complex system of dynamic variables in tillage is challenging.

Recent developments in intelligent autonomous tractors offer promise for reducing energy and fuel use in mechanized tillage. This is an important consideration for improving energy-use efficiency, as human-operated tractors often make inaccurate traction optimization judgments, which can consume up to 30% of production energy costs7,8. Therefore, accurate decisions in real time are needed to leverage the dynamic and causal-response effects of multivariate soil-machine variables for optimal rate generation and effective utilization of FTr during tillage. Well-developed models can assist in optimizing FTr by considering the numerous soil-tool, soil-tire, and tractor-implement variables and adjustments required to respond to highly heterogeneous and dynamically variable (in both time and space) soil characteristics in situ. However, the numerical and semi-empirical models currently available rely on a limited number of parameters (e.g., wheel load, angle of internal friction (ϕ), and soil cohesion (c)) to develop traction prediction models, suggesting suboptimal generation and utilization of FTr during tillage9.
Similarly, several studies have previously adopted deficient theoretical approaches or classical soil mechanics methods and combined a limited number of variables (e.g., soil cone index (CI), forward speed, soil bulk density (γSoil), c, ϕ, and rake angle) with semi-empirical terrain parameters to quantify tillage and tractive thrust forces10,11. Such approaches neglect dynamic point-specific variations of soil parameters, and they neither provide exhaustive fundamental insights into the compounded direct and indirect influence of field soil-machine variables nor account for cumulative multi-pass and dynamic load transfer effects on FTr in situ. Efforts spent on integrating numerical and classical soil mechanics approaches have enabled the prediction of horizontal and vertical forces influencing traction, but with relatively high average errors (up to ± 33% and ± 50%, respectively)12. Furthermore, previous research has developed numerical and empirical traction models based on studies conducted under controlled laboratory conditions using indoor soil bins with uniform soil conditions13. Although soil bin studies have been instrumental in developing a fundamental understanding of the engineering principles governing soil-machine interactions, the resultant models had limitations inherent to the artificial conditions under which they were developed, most importantly, a lack of soil heterogeneity. Furthermore, soil bin models do not account for the dynamic causal-response effects of arable soil loading in situ, as commonly encountered during tractive locomotion in tillage.

The laboratory traction rigs and dynamometric traction test benches used for single-wheel testing14,15, which are employed to predict traction, do not account for the off-road tractive dynamics of the entire tractor-implement drive train. Furthermore, previous soil-bin and traction rig models relied heavily on accuracy metrics; however, their robustness, reliability, and generalized adaptability to other soil conditions and unseen datasets were never fully demonstrated. Approved tractor testing authorities provide comparisons of tractive performance and test data from standardized experiments conducted on concrete tracks16, which do not accurately represent the heterogeneous nature of agricultural soils. Although such approaches establish a basis for verifying the generated FTr, machine learning soft computing approaches can provide more accurate and robust forecasting of FTr using multivariate datasets obtained from the soil-machine interface in situ. Multivariate dynamic systems are best evaluated using large training datasets in a soft computing environment such as machine learning17,18. The reliability and robustness of machine learning techniques can be assessed in terms of generalization compared to numerical and regression methods. Soft computing machine learning approaches utilize artificial intelligence (AI) algorithms that learn and analyze complex dataset patterns and their intricate dependencies to reveal underlying multicollinearities and to provide accurate predictions18,19. Deep neural network (DNN) and artificial neural network (ANN) algorithms can replicate the intelligent thinking of human brain neurons to simulate complex soil-machine non-linearities and associated causal-response effects for accurate, real-time, and evidence-based neurocognitive forecasting of FTr in situ. Furthermore, DNN and ANN algorithms enable smart agricultural technologies to utilize excellent and robust modeling domains with adaptability to diverse datasets from dynamic environments, such as arable soils during tillage20. Dynamic soil-machine variables at the soil-tire and soil-tool interfaces are highly multivariate, complex, and nonlinear, rendering soil processing in tillage a prime target for intelligent modeling, automation, and robotization using AI18,21. Certain neurocognitive algorithms perform best with specific neuro-activation functions and learn various datasets at different levels of accuracy, computational time, and neuro-perceptron epoch sizes. However, the adoption of dynamic soil-machine variables for in-situ modeling of FTr using ANN and DNN, as well as the resultant systems of neurocognitive equations, is missing from the scientific and engineering literature. Previous machine learning studies connected to tillage have not established the AI equations associated with the developed ML and neurocomputing models, thus limiting the scope of their adoption, generalization, and utilization.

Gap identification in available studies

The literature review demonstrates that researchers have utilized several soft computing approaches with different features and datasets to predict FTr. It has also been observed that no researchers have utilized the entire set of soil-machine features, viz. wheel rut depth (Rdepth), implement draft (FD), fuel consumption rate (ØFuel), four levels of wheeling load (Wload), five levels of tire inflation pressure (Ptire), four tillage depths (Tdepth), soil cone index (CIsoil), shear strength (τShear), water content (θSoil), soil bulk density (γSoil), plasticity index (IPSoil), theoretical and actual wheel slippage (SWheel), rolling resistance force (FRr), and soil-tire contact patch area (AStc), as input features in predicting FTr. Interestingly, previous researchers utilized artificial neural network models but did not compare the backpropagation algorithms. Further, recent hybrid optimization algorithms have not been adopted in traction prediction. In addition, it has been found that DNN models have not been implemented with the Levenberg-Marquardt (trainlm), Scaled conjugate gradient (trainscg), Quasi-Newton (trainbfg), Powell-Beale conjugate gradient (traincgb), One-step secant (trainoss), Gradient descent momentum (traingdm), Fletcher-Reeves conjugate gradient (traincgf), Gradient descent (traingd), Polak-Ribiére conjugate gradient (traincgp), Bayesian regularization (trainbr), Learning rate gradient descent (traingdx), and Resilient backpropagation (trainrp) algorithms and compared in predicting FTr.

Novelty of the present investigation

Considering the gap identified in the literature, the present investigation has the following novelty:

  • This study employs artificial neural network and deep neural network models configured with the Levenberg-Marquardt (trainlm), Scaled conjugate gradient (trainscg), Quasi-Newton (trainbfg), Powell-Beale conjugate gradient (traincgb), One-step secant (trainoss), Gradient descent momentum (traingdm), Fletcher-Reeves conjugate gradient (traincgf), Gradient descent (traingd), Polak-Ribiére conjugate gradient (traincgp), Bayesian regularization (trainbr), Learning rate gradient descent (traingdx), and Resilient backpropagation (trainrp) backpropagation algorithms, and compares their capabilities in predicting FTr for the first time. Further, the ANN and DNN models have been optimized with the new Spider Wasp Optimization (SWO), Puma Optimizer (PO), and Walrus Optimization (WO) algorithms to predict FTr in situ for the first time.

  • This investigation uses Rdepth, FD, ØFuel, Wload, Ptire, Tdepth, CIsoil, τShear, θSoil, γSoil, IPSoil, SWheel, FRr, and AStc as features for predicting FTr in situ for the first time. In addition, the cosine amplitude method reveals the sensitivity of each feature in predicting FTr.

Research methodology

Determining the required FTr before tillage operations would guide the optimization of tractor-implement forces and tillage energy utilization efficiency. However, the literature indicates a lack of accurate in-situ traction prediction models for wheeled vehicular applications in tillage, due to the multivariate and complex nonlinearities at the soil-tool and soil-tire interfaces. Dynamic soil-machine variables at the soil-tire and soil-tool interfaces in situ necessitate the adoption of advanced machine-learning algorithms for accurate, robust, and reliable predictions of FTr. Furthermore, operating under diverse and heterogeneous field conditions demands site-specific tractor-implement configurations, rendering the tillage process a prime target for intelligent automation and robotization using AI. This research utilizes artificial neurocomputing algorithms and neuro-activation functions to develop neurocognitive machine learning models for predicting FTr using in-situ soil-machine variables during tillage. While certain neurocognitive algorithms may learn from specific datasets and perform best with certain neuro-activation functions, the adoption of dynamic soil-machine variables for in situ prognostication of FTr using neurocomputing is currently lacking in the literature. In this study, the neurocognitive intelligence of human brain neurons is simulated to develop DNN and ANN models to accurately determine the required FTr using the dynamic parameters of the soil-tire and soil-tool interface in-situ. The developed models are useful for intelligent control and optimized traction utilization. The models will be practically utilized by machinery managers to properly match soil conditions with tractor-implement configurations for optimal rate generation and efficient utilization of FTr from wheeled agricultural tractors, thereby conserving tillage energy, reducing fuel wastage, and minimizing CO2 emissions at reduced operational costs. Further, the models can be implemented in programmable logic controllers of wheeled autonomous robots for accurate decision-making and in-field operational adjustments of tillage robots. Figure 1 presents the research flow for assessing the FTr in this investigation, utilizing deep learning and neural networks.

Fig. 1
figure 1

Illustration of the flow of the present investigation.

Data insights and analysis

Soil and tractor-implement data acquisition in-situ

Tillage experiments were conducted in Ferralsols22 of the maize-growing region in North Rift, Kenya (0°34’16.50” N, 35°18’31.70” E, elevation: 2150 m above sea level). Eighty randomized and triplicated sites were each sampled at five profile pits for soil sampling and testing to establish the average field soil water content (θSoil, %), soil cone index (CI), soil bulk density (γSoil), plasticity index (IPSoil), angle of internal friction (ϕ), cohesion (c), and shear strength (τShear) in situ, at four tillage depth (Tdepth) intervals, namely 0–100, 100–200, 200–300, and 300–400 mm. The 240 completely randomized experimental sites were delineated for triplicated tractive locomotion of the research tractor (CASE IH JXM 90 HP), dynamometrically coupled to the auxiliary tractor (John Deere 5503) and remotely telemetered to an MSI 8000-paired laptop device to sequentially transmit tractive parameters at five levels of tire inflation pressure (PTire: 110.4, 151.8, 193.2, 234.6, and 275.8 kPa) and four levels of wheeling load (Wload: 11.3, 11.8, 12.3, and 12.8 kN) at the four levels of Tdepth. Before engagement, the wheel rut depth (Rdepth) was measured at the centerline of the tire path, and the soil-tire contact patch area (AStc) of the research tractor was obtained at the five levels of PTire and four levels of Wload using a geometric-image-pixel-colour correlation and segmentation tool in MATLAB. Further, aided by the auxiliary tractor, the rolling resistance force (FRr) and the theoretical and actual wheel slippage (SWheel) of the research tractor were measured at all Wload, PTire, and Tdepth levels. Thereafter, the draft dynamometer was connected to the drawbar of the research tractor and hitched to the front-end tow of the auxiliary John Deere 5503, which carried a 3-point-hitch strip-till subsoiler. The tractive locomotion was sequentially engaged in triplicate at all five levels of PTire and the four levels of Wload and Tdepth. The instantaneous FTr generated by the research tractor at the various Tdepth, Wload, and PTire settings was recorded by the digital dynamometer and transmitted remotely via an MSI 8000 datalogger, telemetric with a laptop device (Fig. 2). At the same time, the fuel consumption rate (ØFuel) of the research tractor (at all Tdepth, Wload, and PTire levels) was remotely relayed from the fuel tank, digitally instrumented with Teltonika FMB920 smartphone telemetry. The dynamometer was then decoupled, and the implement draft force (FD) was obtained by deducting the auxiliary FRr from the total FTr. The theoretical and actual forward speeds were then used to determine the corresponding wheel slippage (SWheel) during tillage at each tractor-implement setting and Tdepth under the prevailing conditions in situ. All 14 soil-machine variables (Rdepth, FD, ØFuel, Wload, Ptire, Tdepth, CIsoil, τShear, θSoil, γSoil, IPSoil, SWheel, FRr, AStc) were utilized in the neurocognitive modeling and prediction of FTr.
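To make the derivation of the dynamic variables explicit, the following minimal MATLAB sketch (with hypothetical example values) shows how wheel slippage and implement draft were obtained from the theoretical and actual forward speeds and the dynamometer readings described above.

```matlab
% Minimal sketch with hypothetical values: deriving SWheel and FD from the
% telemetered measurements (speeds in m/s, forces in kN).
v_theoretical = 1.95;   % theoretical forward speed (no-slip wheel speed)
v_actual      = 1.62;   % actual forward speed measured over the run
F_total       = 18.4;   % total tractive force recorded by the dynamometer
F_rr_aux      = 2.1;    % auxiliary rolling resistance force

S_wheel = (v_theoretical - v_actual) / v_theoretical * 100;  % wheel slip, %
F_draft = F_total - F_rr_aux;        % implement draft: FD = FTr - FRr

fprintf('Wheel slip = %.1f %%, draft = %.1f kN\n', S_wheel, F_draft);
```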

Fig. 2
figure 2

Illustration of tractive locomotion and digital telemetry setup.

Data analysis

To analyze the database, descriptive statistics, frequency distribution, and the Pearson product-moment correlation coefficient method were utilized in this investigation. The descriptive summary statistics of the experimental database are presented in Table 1, which indicates the range of statistical parameters for all 14 experimental variables in the database. The highest and lowest FTr values were 24.3 kN and 4.01 kN, respectively, which lie within the range of tractive force generally developed by agricultural tractors during tillage. The frequency distribution of the experimental FTr and all the database variables is shown in Fig. 3(a-o). All variables exhibited approximately Gaussian distributions, as evidenced by the bell-shaped curves of their frequency distributions, indicating a well-conditioned traction modeling database composed of all the variables.

Table 1 Descriptive statistics of the database.
Fig. 3
figure 3

Illustration of frequency distribution curves for (a) FTr and its neurocognitive variables, (b) Rdepth, (c) FD, (d) ØFuel, (e) Wload, (f) PTire, (g) Tdeth, (h) FRr, (i) τShear, (j) θSoil, (k) γSoil, (l) IPSoil, (m) SWheel, (n) CIsoil, (o) AStc.

Pearson correlation coefficients mapped the strength of dependence among the soil-machine variables, their bivariate relationships, and multicollinearity among the FTr modeling variables (Fig. 4). The correlation analysis indicated that all variables were correlated with FTr. Coefficients range from −1 to 1; their absolute values indicate the presence and strength of a linear relationship, while 0 indicates a lack of linearity between the variables. FD (0.9971), IPSoil (0.9426), and Tdepth (0.9426) exhibited the highest positive correlations with FTr. In contrast, Rdepth (0.1206) showed the weakest correlation with FTr but was highly correlated with PTire (0.8062). Among the other variables, γSoil and θSoil (−0.997), θSoil and τShear (−0.9476), and θSoil and CIsoil (−0.9189) were the most negatively correlated pairs. At the same time, FD (0.9971), IPSoil (0.9912), and IPSoil (0.9694) were the most positively correlated with FTr, Tdepth, and ØFuel, respectively. Variables with absolute correlation values greater than 0.6 exhibit a strong linear relationship at the 5% significance level23. While a correlation index of ±0.00 represents the absence of a relationship between variables, positively correlated variables increase or decrease together, whereas negatively correlated variables show one variable increasing as the other decreases, and vice versa, as shown in Fig. 4. All variables in the database exhibited either positive or negative multicollinearity with FTr at different levels. Correlation indices of ±0.01 to ±0.20, ±0.21 to ±0.40, ±0.41 to ±0.60, ±0.61 to ±0.80, and ±0.81 to ±1.00 indicate very weak, weak, moderate, strong, and very strong relationships between variables, respectively, as reported by Khatti et al.24,25.
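As an illustration of how such a correlation matrix can be produced, the following minimal MATLAB sketch computes the pairwise Pearson coefficients; the table name dataTbl and the assumption that FTr occupies the last column are hypothetical.

```matlab
% Minimal sketch (assumed layout): Pearson correlation among the 14 features
% and FTr, with dataTbl a table whose last column is FTr.
X = table2array(dataTbl);                 % rows = observations, columns = variables
R = corrcoef(X);                          % pairwise Pearson coefficients
heatmap(dataTbl.Properties.VariableNames, ...
        dataTbl.Properties.VariableNames, round(R, 2));  % matrix view as in Fig. 4
rFTr = R(1:end-1, end);                   % correlation of each feature with FTr
```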

Fig. 4
figure 4

Illustration of the correlation matrix for features and labels.

Furthermore, the dataset was normalized before neurocognitive modeling using Eq. 1 to equalize and balance the scale and range of input features, thereby reducing bias and improving computational speed, accuracy, and neurocognitive generalization.

$$X_{n_i}=\frac{X_{r_{vi}}-X_{v_i(\min)}}{X_{v_i(\max)}-X_{v_i(\min)}}\left(X_{h2}-X_{h1}\right)+X_{h1},\qquad 0<X_{n_i}<1$$
(1)

where Xni, Xrvi, Xvi(min), and Xvi(max) are the normalized, raw, lowest, and highest values of input variables, while Xh1 and Xh2 are set to 0 and 1, respectively26,27.
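A minimal MATLAB sketch of the scaling in Eq. 1 is shown below; X is assumed to be an n-by-14 matrix of raw feature values, and the built-in mapminmax call is an equivalent alternative.

```matlab
% Minimal sketch of Eq. (1): min-max normalization of each feature to (0, 1).
Xmin = min(X, [], 1);                        % per-feature minimum
Xmax = max(X, [], 1);                        % per-feature maximum
Xh1  = 0;  Xh2 = 1;                          % target range limits
Xn   = (X - Xmin) ./ (Xmax - Xmin) * (Xh2 - Xh1) + Xh1;

% Equivalent built-in (mapminmax scales row-wise, hence the transposes):
Xn2 = mapminmax(X', 0, 1)';
```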

Cosine amplitude sensitivity analysis

The nonlinear cosine amplitude sensitivity indexing and analysis approach was adopted to assess the relative influence of the nonlinear soil-machine variables on the dynamic traction responses for the most accurate ANN and DNN models. The cosine amplitude nonlinear sensitivity indices were established using Eq. 228.

$$SA_{ij}=\frac{\sum_{k=1}^{m} a_{ik}\, a_{jk}}{\sqrt{\sum_{k=1}^{m} a_{ik}^{2}\;\sum_{k=1}^{m} a_{jk}^{2}}}$$
(2)

where SAij is the parametric sensitivity strength, aik is the model input variable, and ajk is the predicted output. The nonlinear cosine amplitude sensitivity (SAij) is illustrated in Fig. 5(a-f), and the analysis is summarized in Table 2. Results showed that all variables had a strong and explicative influence on FTr (SAij ≥ 0.85) when considering the entire database for the best ANN trainbr and DNN trainlm models (Fig. 5e-f). However, FTr was most significantly influenced by draft force (FD) (SAij = 0.9960 training, 0.9968 testing), followed by Tdepth (SAij = 0.9840 training, 0.9850 testing) and ØFuel (SAij = 0.9838 training, 0.9796 testing), while PTire had the lowest effect (SAij = 0.8540 training, 0.8160 testing) for the best ANN trainbr (14-72-1) model (Fig. 5a-b). A similar trend was observed (Fig. 5c-d) for the multi-layered DNN trainlm (14-7-5-1) model: FD (SAij = 0.9961 training, 0.9970 testing) followed by Tdepth (SAij = 0.9853 training, 0.9837 testing), while PTire had the least effect (SAij = 0.8713 training, 0.7798 testing). A similar trend was also observed for the entire database in both the ANN trainbr (0.99610, 0.98417, 0.84835) and DNN trainlm (0.99611, 0.98419, 0.84836) models, as shown in Fig. 5(e) and (f), respectively. The SAij values range from 0 to 1, indicating the extent to which variables influenced the predicted FTr; values approaching 1 demonstrate the highest strength, as reported in previous studies23,29.
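The sensitivity indices in Table 2 can be reproduced with a few lines of MATLAB; the sketch below assumes X holds the normalized input features (one column per variable) and y the predicted FTr, both non-negative.

```matlab
% Minimal sketch of Eq. (2): cosine amplitude sensitivity of each feature.
nFeat = size(X, 2);
SA = zeros(nFeat, 1);
for i = 1:nFeat
    SA(i) = sum(X(:, i) .* y) / sqrt(sum(X(:, i).^2) * sum(y.^2));
end
[~, order] = sort(SA, 'descend');   % rank features by sensitivity strength
```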

Fig. 5
figure 5

Illustration of the feature importance for ANN (a, b, e for training, testing, and overall databases) and DNN (c, d, f for training, testing, and overall databases) models.

Table 2 Summary of sensitivity strength of the best ANN and DNN models.

Development of computational approaches

Dataset variables were imported into MATLAB (R2024a), running on macOS (version 14.7.4) with an Intel Core i7 processor at 2.5 GHz and 32 GB of RAM. All neurocognitive algorithms for both the DNN and ANN models were executed using custom MATLAB code commanding the neurocomputing architectures to predict FTr. The 14 experimental variables were sequentially subjected to all 72 DNN and ANN architecture topologies, learning on the logsig, tansig, and purelin neuro-activation functions for all 12 neurocognitive algorithms to predict FTr, with the results logged for each configuration. The algorithms included trainlm, trainscg, trainbfg, traincgb, trainoss, traingdm, traincgf, traingd, traincgp, trainbr, traingdx, and trainrp. For each case, the number of hidden layers and neurons was meta-heuristically tuned while targeting 50,000 epochs. The algorithms were left to optimize the remaining hyperparameters, including the optimal epoch size, training time, convergence rate, learning rate, gradient, weights, biases, and the Marquardt weight update parameter (µ). During tuning, neuron weights and biases were updated according to each algorithm's criterion, learning sequentially on the logsig, tansig, and purelin neuro-activation functions to achieve neurocognitive convergence. For every architecture, the number of hidden layers and neurons that predicted the output FTr with the lowest mean square error of convergence and epoch optimality was identified for further evaluation. The optimal DNN and ANN neurocognitive models were then established using a broad set of evaluation metrics covering accuracy, reliability, and robustness.
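A minimal sketch of this architecture search, assuming the normalized feature matrix Xn (n-by-14) and target vector FTr (n-by-1), is given below; the candidate neuron counts and the use of fitnet with a single hidden layer are illustrative rather than the exact code used in the study.

```matlab
% Minimal sketch (assumed setup): train a feed-forward network for each
% backpropagation algorithm and hidden-layer size, keeping the lowest-MSE model.
algs = {'trainlm','trainbr','trainscg','trainbfg','traincgb','trainoss', ...
        'traingdm','traincgf','traingd','traincgp','traingdx','trainrp'};
bestMSE = inf;
for a = 1:numel(algs)
    for h = 8:8:72                              % illustrative hidden-neuron counts
        net = fitnet(h, algs{a});               % one hidden layer
        net.layers{1}.transferFcn = 'tansig';   % hidden activation
        net.layers{2}.transferFcn = 'purelin';  % linear output
        net.trainParam.epochs = 50000;          % epoch target used in the study
        net.trainParam.showWindow = false;
        net = train(net, Xn', FTr');            % columns = samples
        mseAll = perform(net, FTr', net(Xn'));  % MSE over the whole dataset
        if mseAll < bestMSE
            bestMSE = mseAll;  bestNet = net;  bestCfg = {algs{a}, h};
        end
    end
end
```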

Neurocomputing intelligence of deep neural networks and artificial neural network algorithms

Deep neural network and ANN models utilize computational layers of structured units called neurons to learn complex nonlinearities and establish weighted relationships among large datasets. Neurocomputing algorithms iteratively adjust the weighting of each hidden neuron, and the output of the previous layer serves as input to the succeeding neuron layers, thereby predicting the new output30. In DNN models, the number of hidden layers and neurons is incremented heuristically until the desired output is predicted at the optimal epoch size (number of training cycles) with the lowest convergence error. Whereas DNN models have multiple hidden neuron layers through which the data is propagated, ANN models transform the data through one input layer, one hidden layer, and an output layer. Further, DNN models integrate more sophisticated ANNs with complex architectures to achieve higher levels of backpropagation inference and abstraction within datasets. The neurocognitive architecture of DNN models contains at least two hidden layers, whereas an ANN has at most three layers (input, one hidden, and output), as depicted in Fig. 6. In this study, tillage datasets obtained from 80 triplicated (240) experimental sites were used to develop DNN and ANN models for predicting FTr in situ using the 14 soil-machine variables.

Fig. 6
figure 6

Illustration of the neurocognitive architecture of (a) deep neural network and (b) artificial neural network models.

In recent years, AI algorithms have been deployed to train DNN and ANN models to solve complex, nonlinear agricultural problems. These algorithms learn from the available data using neuro-activation functions and predict the targeted output through iterative feed-forward backpropagation inference23. Some of the most common artificial neurocognitive algorithms include Levenberg-Marquardt (trainlm), Scaled conjugate gradient (trainscg), Quasi-Newton (trainbfg), Powell-Beale conjugate gradient (traincgb), One-step secant (trainoss), Gradient descent momentum (traingdm), Fletcher-Reeves conjugate gradient (traincgf), Gradient descent (traingd), Polak-Ribiére conjugate gradient (traincgp), Bayesian regularization (trainbr), Learning rate gradient descent (traingdx), and Resilient backpropagation (trainrp). These algorithms iteratively adjust the weighted neuron connections to minimize the differences between predictions and their targeted outputs in each neuron layer. For instance, traingd updates the weights and bias values in the direction of the negative gradient according to gradient descent learning, calculating the change dX in the weights and biases from the performance derivative dperf using the following equation31:

$$dX = lr \times \frac{dperf}{dX}$$
(3)

Where lr is the learning rate and dperf is the performance derivative. traingdx combines adaptive learning rates with momentum training32 and the previous change (dXprev) in weight or bias33, using Eq. 4:

$$dX = mc \times dX_{prev} + lr \times mc \times \frac{dperf}{dX}$$
(4)

Where mc is the momentum constant. In contrast, traingdm updates the weights and biases (Eq. 5) according to gradient descent with momentum34:

$$dX = mc \times dX_{prev} + lr\,(1 - mc) \times \frac{dperf}{dX}$$
(5)

Further, traincgf iteratively searches along the steepest descent (negative gradient) direction and determines the step size that minimizes the function along the conjugate search direction35:

$$X_{w+1} = X_{w} + lr \times d_{k}$$
(6)

Where Xw+1 and Xw are the updated and current neuron weight vectors, respectively, and dk is the current search direction. A new search direction, dk+1, conjugate to the previous one, is then determined by combining the new steepest descent direction with the previous search direction:

$$d_{k+1} = -dX_{k} + \beta_{k}\, d_{k-1}$$
(7)

Where dXk is the kth gradient, and βk is the Fletcher-Reeves update constant of traincgf, obtained as35:

$$\beta_{k} = \frac{dX_{k}^{T}\, dX_{k}}{dX_{k-1}^{T}\, dX_{k-1}}$$
(8)

Where βk is the ratio of the current squared gradient to the previous squared gradient. Each variable is then updated by the traincgf function33:

$$X = X + a \times d_{k}$$
(9)

Where the parameter a is selected to control performance along dk. The traincgp algorithm obtains the constant βk by dividing the inner product of the previous gradient change and the current gradient by the square of the previous gradient36:

$$\beta_{k} = \frac{\Delta dX_{k-1}^{T}\, dX_{k}}{dX_{k-1}^{T}\, dX_{k-1}}$$
(10)

traincgb periodically resets the search direction to the negative gradient whenever the number of iterations equals the number of network parameters, and restarts training if the current and previous gradients lack orthogonality34. This condition improves the training efficiency of conjugate gradient algorithms and is tested using the inequality in Eq. 11, which, if satisfied, resets the search direction to the negative gradient37:

$$\left|d_{k-1}^{T}\, d_{k}\right| \ge 0.2\, \left\| d_{k} \right\|^{2}$$
(11)

trainscg was developed to avoid the computationally expensive and time-consuming line searches performed at every iteration of conventional conjugate gradient training. It combines the model-trust-region and conjugate gradient approaches, thereby establishing a quadratic approximation Eqw(y) of the error E within the vicinity of a point w, together with its critical points38:

$$E_{qw}(y) = E(w) + E'(w)^{T} y + \frac{1}{2}\, y^{T} E''(w)\, y$$
(12)

Compared with trainscg, trainbfg provides less time-consuming optimization and faster convergence using the quasi-Newton weight update method39:

$$X_{w+1} = X_{w} - A_{k}^{-1}\, dX_{k}$$
(13)

Where Ak is the Hessian matrix of the performance index at the current values of the weights and biases. Meanwhile, trainoss updates the neuron weights and biases according to the one-step secant method, which does not store the complete Hessian matrix and attempts to bridge the gap between the conjugate gradient and quasi-Newton algorithms32. It assumes the previous Hessian at every iteration to be the identity matrix and calculates the new search direction without computing the matrix inverse. trainoss computes the weight and bias change (dM) using the gradient (gM), the previous iteration step (Mstep), the change in gM over the prior iteration (dgM), and their respective scalar products Ac and Bc33:

$$d_{M} = -gM + Ac \times M_{step} + Bc \times dgM$$
(14)

Although trainlm is the fastest backpropagation algorithm, it requires high computational memory40 to compute the Jacobian matrix J, which contains the first derivatives of the network errors with respect to the weights and biases and is less complex to compute than the Hessian matrix. The gradient is obtained using Eq. 15:

$$dX = J^{T} \times e$$
(15)

Where e is the vector of network errors. trainlm also uses an approximation to the Hessian matrix, with the identity matrix I and damping factor µ, to update the neuron weights41:

$$X_{w+1} = X_{w} - \left(J^{T} J + \mu I\right)^{-1} J^{T} e$$
(16)

trainbr updates the neuron weights and biases through Bayesian regularization, which determines and minimizes the optimal combination of squared errors and squared weights to produce a well-generalizing network. It computes the Jacobian jX of the performance with respect to the weight and bias variables, with the regularization parameters treated as random variables with assumed prior distributions35:

$$jj = jX \cdot jX;\qquad je = jX \cdot E;\qquad dX = -\left(jj + I \cdot mu\right)^{-1} je$$
(17)

Where E is the total error. trainrp updates the neuron weights and biases through resilient backpropagation, which eliminates the harmful effect of the magnitude of the partial derivatives, so that the sign of the derivative determines the direction of the weight update (dwjk) for each connection42:

$$dw_{jk}(m) = a \times X_{j}(m) \times \delta_{k}(m)$$
(18)

Where a is the learning rate, Xj(m) is the backpropagated input at the jth neuron at time step m, and δk(m) is the error gradient. These updates remain unchanged if the derivatives converge to zero. The DNN and ANN training algorithms reviewed in the literature and adopted in the present study are summarized in Table 3.
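For reference, the hyperparameters appearing in Eqs. 3–17 map directly onto fields of the MATLAB network object; the sketch below is illustrative, with arbitrary example values rather than the tuned settings of this study.

```matlab
% Minimal sketch: setting the hyperparameters of Eqs. (3)-(17) before training
% (values are illustrative, not those tuned in the study).
net = feedforwardnet([7 5], 'trainlm');   % two hidden layers, Levenberg-Marquardt
net.trainParam.mu     = 1e-3;             % initial damping factor mu in Eq. (16)
net.trainParam.mu_dec = 0.1;              % mu decrease factor
net.trainParam.mu_inc = 10;               % mu increase factor

net2 = feedforwardnet(72, 'traingdx');    % gradient descent with momentum and
net2.trainParam.lr = 0.01;                % adaptive learning rate lr in Eq. (4)
net2.trainParam.mc = 0.9;                 % momentum constant mc in Eq. (4)
```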

Table 3 Comparison and advantages of backpropagation algorithms used in neurocognitive modeling.

Neuro-transfer functions of deep learning and artificial neural networks

Artificial Intelligence (AI) algorithms train DNN and ANN models on empirical data supplied to the input layer to generate prediction outputs. Neurons in the first and last layers represent the inputs and outputs, respectively, and are interconnected through one or more hidden layers of neurons (nodes). A neuron output (zt) is defined by the relationship between inputs and outputs through an activation function18. Considering an activation function φ(t) in Eq. 19, this relationship can be expressed using Eq. 2043.

$$\:{\varvec{z}}_{\varvec{t}}=\varvec{\phi\:}\left({\varvec{t}}_{\varvec{x}}\right)$$
(19)
$$\:{\varvec{t}}_{\varvec{x}}={\sum\:}_{\varvec{k}=1}^{\varvec{n}}{\varvec{w}}_{\varvec{x}\varvec{k}}{\varvec{y}}_{\varvec{k}}+{\varvec{b}}_{\varvec{x}}$$
(20)

Where n is the number of inputs, w is the weighted connection between neurons x and k, y is the input from neuron node k, and bx is the bias, respectively. This summation is processed through the neuron transfer function, φ(tx), to generate the output43:

$$\:\varvec{\phi\:}\left({\varvec{t}}_{\varvec{x}}\right)=\varvec{\phi\:}\left[\left({\sum\:}_{\varvec{k}=1}^{\varvec{n}}{\varvec{w}}_{\varvec{x}\varvec{k}}{\varvec{y}}_{\varvec{k}}\right)+{\varvec{b}}_{\varvec{x}}\right]$$
(21)

Activation functions, expressed by φ(t), define the output of a neuron in terms of the induced local field. The input data is processed by the neuron activation functions associated with the weighted connections, which adjust iteratively to reduce the differences between predicted and target values by optimizing the weights. The feed-forward weight adjustments proceed until the maximum number of predefined epochs is reached or the specified error limits are met. In this study, a combination of the logsig (log-sigmoid), purelin (linear), and tansig (hyperbolic tangent sigmoid) neuron activation functions was deployed (Eqs. 22–24):

$$\phi(t)=\frac{1}{1+e^{-t}}\qquad \text{for } 0 \le z_x \le 1$$
(22)
$$\phi(t)=t\qquad \text{for } -\infty \le z_x \le +\infty$$
(23)
$$\phi(t)=\frac{2}{1+e^{-2t}}-1\qquad \text{for } -1 \le z_x \le +1$$
(24)
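The three neuro-activation functions of Eqs. 22–24 can be written as simple anonymous functions that reproduce the toolbox implementations, as in the following minimal sketch.

```matlab
% Minimal sketch of the activation functions in Eqs. (22)-(24).
logsig_f  = @(t) 1 ./ (1 + exp(-t));        % Eq. (22): output in (0, 1)
purelin_f = @(t) t;                         % Eq. (23): unbounded linear output
tansig_f  = @(t) 2 ./ (1 + exp(-2*t)) - 1;  % Eq. (24): output in (-1, 1)

t = linspace(-5, 5, 11);
max(abs(tansig_f(t) - tansig(t)))           % agrees with the built-in tansig
```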

Feed-forward backpropagation in deep learning and artificial neural networks

Artificial neurocognitive algorithms based on feed-forward backpropagation (FFBP) perform computations through the network with an error backpropagation function44:

$$P_{Error}=\frac{1}{x}\sum_{x}\sum_{p}\left(n_{xk}-z_{pk}\right)$$
(25)

Where PError is the propagated error, x is the indexed training set, p is the index of various neuron outputs, nxk is the kth element of the desired xth model, and zpk is the kth element of the predicted neuron outputs. Upon determining errors, backpropagation algorithms adjust the neuron weights iteratively using an expression that minimizes the total error to the lowest acceptable levels. Each neuron-weighted factor changes throughout the FFBP training process until the error function reaches a minimum. Considering the ith iteration, the weights can be adjusted44:

$$w_{ij}(t+1)=w_{ij}(t)+\mu\,\Delta w-\eta\left(\frac{\partial E}{\partial w_{ij}}\right)\qquad \text{for } 0<\mu<1 \;\&\; 0<\eta<1$$
(26)

Where µ, Δw, and η are the momentum, previous layer weight change, and learning rate, respectively. Considering the layer number and output vector, the weight adjustment can be expressed45:

$$\:{\varvec{w}}_{{\varvec{J}}_{(\varvec{L}-1)}}{\varvec{h}}_{\varvec{L}}(\varvec{t}+1)={\varvec{w}}_{{\varvec{j}}_{(\varvec{L}-1)}}{\varvec{h}}_{\varvec{L}}\left(\varvec{t}\right)+\varvec{\mu\:}\left[{\varvec{w}}_{{\varvec{j}}_{(\varvec{L}-1)}}{\varvec{h}}_{\varvec{L}}\left(\varvec{t}\right)-{\varvec{w}}_{{\varvec{j}}_{(\varvec{L}-1)}}{\varvec{h}}_{\varvec{L}}(\varvec{t}-1)\right]+\varvec{\eta\:}{\varvec{\delta\:}}_{{\varvec{h}}_{\varvec{L}}}^{\varvec{k}}{\varvec{x}}_{{\varvec{j}}_{(\varvec{L}-1)}}^{\varvec{k}}$$
(27)

Where L and xk are the layer number and output vector, respectively. During training, the algorithms adjust the biases and neuron weights to minimize the errors between the actual input data and the neuron prediction output. Once the neuron architecture is trained and the neuron weights are specified, it can be validated on new datasets. Although DNN tends to provide more accurate results with large datasets, certain neurocognitive algorithms may learn better from specific datasets and perform better with certain neuro-activation functions in ANN than in DNN, and vice versa. A comparison of neurocognitive models and the corresponding algorithms utilized in previous agricultural operations is presented in Table 4.
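A minimal sketch of the momentum-based weight update in Eq. 26 for a single weight matrix is shown below; the weight matrix W, its error gradient dEdW, and the chosen learning rate and momentum values are hypothetical.

```matlab
% Minimal sketch of Eq. (26): one FFBP weight update with momentum.
W    = randn(5, 3);                       % hypothetical weight matrix
dEdW = randn(5, 3);                       % hypothetical error gradient dE/dW
eta  = 0.01;  mu = 0.9;                   % learning rate and momentum (illustrative)

dW_prev = zeros(size(W));                 % previous weight change
dW = mu * dW_prev - eta * dEdW;           % momentum term plus negative-gradient step
W  = W + dW;                              % updated weights for this iteration
dW_prev = dW;                             % stored for the next iteration
```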

Table 4 Summary of tillage studies associated with deep learning and artificial neural network modeling.

ANN model

The MATLAB simulation of the neurocognitive architecture for the most accurate and optimal ANN model is presented in Fig. 7. Metaheuristic evaluation of the ANN models indicated that the optimal ANN model comprised a single-layered architecture with 72 neurons learning on the tansig transfer function in the hidden layer and purelin in the output layer, as shown in Fig. 7. The hyperparameter details of the neurocomputing architecture of the ANN model are presented in Table 5, and the neurocognitive model equation takes the form of Eq. 28.

$$\:{\varvec{F}}_{\varvec{T}\varvec{r}}=\varvec{p}\varvec{u}\varvec{r}\varvec{e}\varvec{l}\varvec{i}\varvec{n}\left\{\varvec{t}\varvec{a}\varvec{n}\varvec{s}\varvec{i}\varvec{g}\left(\varvec{W}\bullet\:\varvec{X}+\varvec{b}\right)\right\}$$
(28)

Where W is the neuron weight, while X and b represent the input and bias, respectively.
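Once trained, Eq. 28 can be evaluated directly from the weights and biases stored in the network object, as in the minimal sketch below; it assumes a trained 14-72-1 network net (e.g., from the earlier search sketch) and an input x already scaled to the range used during training, since the toolbox otherwise applies its own pre- and post-processing inside net(x).

```matlab
% Minimal sketch of Eq. (28): manual forward pass of the 14-72-1 ANN.
x  = rand(14, 1);                     % hypothetical normalized input vector
IW = net.IW{1,1};  b1 = net.b{1};     % 72-by-14 input weights, hidden biases
LW = net.LW{2,1};  b2 = net.b{2};     % 1-by-72 layer weights, output bias

h    = tansig(IW * x + b1);           % hidden-layer activations
FTrP = purelin(LW * h + b2);          % predicted tractive force per Eq. (28)
```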

Fig. 7
figure 7

Illustration of artificial neural network model: (a) MATLAB simulator and (b) traction prediction neurocognitive architecture.

Table 5 Summary of hyperparameter configuration of the ANN model.

The neurocognitive model equation of the single-layered ANN model (14-72-1) takes into account all the input variables, neuron weights, hidden neurons, layers, and biases to predict the FTr output, as shown in Eq. 29. The characteristic values of the neuron weights and biases associated with the ANN model equation are shown in Table 6.

$$F_{Tr}=\sum_{i=1}^{72} v_{i}\,\tanh\big(w_{i1}R_{depth}+w_{i2}{\mathcal{O}}_{Fuel}+w_{i3}F_{D}+w_{i4}W_{Load}+w_{i5}P_{Tire}+w_{i6}T_{depth}+w_{i7}CI_{Soil}+w_{i8}\tau_{Shear}+w_{i9}\theta_{Soil}+w_{i10}\gamma_{Soil}+w_{i11}I_{PSoil}+w_{i12}S_{Wheel}+w_{i13}F_{Rr}+w_{i14}A_{Stc}+b_{i}\big)+k$$
(29)
Table 6 Neurocognitive bias and neuron weights associated with the ANN model.

DNN model

The optimal deep learning model utilized the tansig and logsig neuro-activation functions in its multiple hidden layers, while the output layer was trained on the purelin function. Figure 8 shows the MATLAB simulation of the best DNN model architecture. The DNN model utilized the neurocomputing form of Eq. 30 within the neurocognitive architecture to predict FTr.

$$\:{\varvec{F}}_{\varvec{T}\varvec{r}}=\varvec{p}\varvec{u}\varvec{r}\varvec{e}\varvec{l}\varvec{i}\varvec{n}\left[\varvec{l}\varvec{o}\varvec{g}\varvec{s}\varvec{i}\varvec{g}\left\{\varvec{t}\varvec{a}\varvec{n}\varvec{s}\varvec{i}\varvec{g}\left({\varvec{W}}_{1}\bullet\:\varvec{X}+{\varvec{b}}_{1}\right)\bullet\:{\varvec{W}}_{2}+{\varvec{b}}_{2}\right\}\right]$$
(30)

Where W1 and W2 are the neuron weights of the first and second hidden layers, respectively, while b1 and b2 are the corresponding neuron biases. The model architecture utilized four layer interconnections with two hidden layers, comprising seven neurons in the first hidden layer and five in the second, to predict the output layer (FTr). The DNN model was meta-heuristically configured (Fig. 8), and the hyperparameters of the model architecture are shown in Table 7.
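A minimal sketch of the corresponding forward pass through the 14-7-5-1 architecture is given below; the weight matrices, biases, and input are filled with random stand-ins here purely for illustration, in place of the trained values reported in Table 8.

```matlab
% Minimal sketch of Eq. (30): forward pass of the 14-7-5-1 DNN.
x  = rand(14, 1);                     % hypothetical normalized input vector
W1 = randn(7, 14);  b1 = randn(7, 1); % stand-ins for the first hidden layer
W2 = randn(5, 7);   b2 = randn(5, 1); % stand-ins for the second hidden layer
W3 = randn(1, 5);   b3 = randn;       % stand-ins for the output layer

h1   = tansig(W1 * x + b1);           % first hidden layer, tansig activation
h2   = logsig(W2 * h1 + b2);          % second hidden layer, logsig activation
FTrP = purelin(W3 * h2 + b3);         % linear output layer: predicted FTr
```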

Fig. 8
figure 8

Illustration of deep learning model: (a) MATLAB simulator and (b) traction prediction neurocognitive architecture.

Table 7 Summary of hyperparameter configuration of deep neural network model.

The neurocognitive equation of the DNN trainlm (14-7-5-1) model predicts the FTr output by propagating the respective tillage variable inputs through their corresponding neurons, adjustable neuron weights, hidden layers, and biases, as shown in Eq. 31. The corresponding values of the neuron weights and biases for each layer are shown in Table 8.

$$F_{Tr}=\sum_{j=1}^{7}\sum_{k=1}^{5}\sum_{L=1}^{1} w_{kL}\left\{\frac{2}{1+e^{-2\left(w_{i1}R_{depth}+w_{i2}{\mathcal{O}}_{Fuel}+w_{i3}F_{D}+w_{i4}W_{Load}+w_{i5}P_{Tire}+w_{i6}T_{depth}+w_{i7}CI_{Soil}+w_{i8}\tau_{Shear}+w_{i9}\theta_{Soil}+w_{i10}\gamma_{Soil}+w_{i11}I_{PSoil}+w_{i12}S_{Wheel}+w_{i13}F_{Rr}+w_{i14}A_{Stc}+b_{j}\right)}}+\left(w_{jk}x_{2}+b_{k}\right)\right\}-1+b_{L}$$
(31)
Table 8 Neurocognitive biases and neuron weights associated with the deep neural network model.

Modeling status and neurocognitive error zeroing

The neurocognitive modeling states of ANN trainbr 14-72-1 and DNN trainlm 14-7-5-1 provide insights into their operational dynamics, as shown in Fig. 9(a-d). First, neurocognitive training of ANN trainbr 14-72-1 converged at a lower perceptron optimality (51 epochs) and mean square error of convergence (2.933e-11), but a higher damping factor (mu) of 500 (Fig. 9a, c), than DNN trainlm (14-7-5-1), as shown in Fig. 9(b, d). The model demonstrated smooth neuron transition states and convergence, likely due to its ability to adjust weight updates effectively through Bayesian priors, ensuring stability in the training process. This characteristic has been validated in studies by Keshun et al.58 and Zhang et al.59, emphasizing the importance of robust priors in neural network optimization for system reliability. On the other hand, the DNN trainlm (14-7-5-1) model exhibited a convergence gradient of 7.849e-8 and a much lower mu factor (1.0e-9) than the ANN trainbr 14-72-1 (2.933e-11 and 500, respectively), indicating precise weight updates and neuron transitions (Fig. 9b and d). These values demonstrate the efficacy of the second-order optimization techniques of the trainlm algorithm in achieving a balanced neurocognitive exploration-exploitation trade-off during training.

Fig. 9
figure 9

Illustration of performance plots for (a, c) ANN trainbr 14-72-1 and (b, d) DNN trainlm 14-7-5-1.

Conversely, the error-zeroing histograms demonstrate distinctive performance strengths of both the ANN trainbr 14-72-1 and DNN trainlm (14-7-5-1) models, as shown in Fig. 10. The ANN 14-72-1 architecture, utilizing Bayesian regularization, converged all instances in a more uniformly distributed manner and stabilized very close to the zero-error line, between −2.1e5 and 1.28e5, showcasing its efficiency in minimizing overfitting. Bayesian approaches, as highlighted by Baumgartner et al.60, emphasize the role of Bayesian regularization in enhancing model robustness, particularly in noisy or complex datasets. The DNN (14-7-5-1) model, based on the trainlm optimization technique, distributed the errors between −0.00094 and 0.000869. Despite its higher convergence error, trainlm excels in handling deep multilayered architectures such as the 14-7-5-1, which require intricate optimization strategies, underscoring the adaptability of gradient-based optimization techniques for DNN applications. Furthermore, unlike the widely distributed error instances in ANN trainbr 14-72-1 and its higher training time (29 s), as reported earlier, almost all the modeling instances in DNN trainlm 14-7-5-1 converged at a single error value (1.44e-05), close to the zero-error line, in a short neurocomputing time (2 s). These findings align with the comprehensive overview by Mienye and Swart61. Furthermore, we hypothesize that models utilizing the trainbr algorithm require more robust and adequate training time to yield accurate results. In contrast, quick results can be obtained from a DNN trained on trainlm, albeit with compromised error limits unless denoising and overfitting costs are incurred.

Fig. 10
figure 10

Illustration of error histogram plots for (a) ANN trainbr 14-72-1 and (b) DNN trainlm 14-7-5-1.

Results and discussion

The neurocognitive performance of the models in predicting FTr was evaluated using broad statistical criteria comprising accuracy and reliability metrics. In addition to the accuracy and reliability metrics, this study employed Taylor analysis, Monte Carlo uncertainty, and the Anderson-Darling test (AD-test) to establish the neurocognitive robustness required for the generalized adoption of the DNN and ANN models. Moreover, nonlinear cosine amplitude sensitivity indexing was employed to determine the relative influence of each soil-machine variable on FTr for the most accurate DNN and ANN models during training and testing, as well as for the entire database. Accuracy metrics were used to evaluate the prediction performance of the DNN and ANN models and were implemented in the source coding environment and console execution interface of the statistical software R, version 4.4.2. These metrics included the Mean Squared Error (MSE), coefficient of determination (R2), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Sum Square Error (SSE), prediction scatter (Tscatter), Coefficient of Variation (CV), and Prediction Accuracy (PA)23:

$$\:\varvec{R}=\frac{{\sum\:}_{\varvec{i}=1}^{\varvec{n}}\left({\varvec{F}}_{\varvec{T}\varvec{r}\varvec{P}}-{\overline{\varvec{F}}}_{\varvec{T}\varvec{r}\varvec{P}}\right)\left({\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}-{\overline{\varvec{F}}}_{\varvec{T}\varvec{r}\varvec{A}}\right)}{\sqrt{{\sum\:}_{\varvec{i}=1}^{\varvec{n}}{\left({\varvec{F}}_{\varvec{T}\varvec{r}\varvec{P}}-{\overline{\varvec{F}}}_{\varvec{T}\varvec{r}\varvec{P}}\right)}^{2}\varvec{x}{\sum\:}_{\varvec{i}=1}^{\varvec{n}}{\left({\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}-{\overline{\varvec{F}}}_{\varvec{T}\varvec{r}\varvec{A}}\right)}^{2}}}$$
(32)
$$\:\varvec{M}\varvec{S}\varvec{E}={\sum\:}_{\varvec{i}=1}^{\varvec{n}}({\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}-{\varvec{F}}_{\varvec{T}\varvec{r}\varvec{P}}{)}^{2}$$
(33)
$$\:{\varvec{R}}^{2}=\frac{{\sum\:}_{\varvec{i}=1}^{\varvec{n}}({\varvec{F}}_{\varvec{T}\varvec{r}\varvec{p}}-{\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}{)}^{2}}{{\sum\:}_{\varvec{i}=1}^{\varvec{n}}({\varvec{F}}_{\varvec{T}\varvec{r}\varvec{P}}-{\overline{\varvec{F}}}_{\varvec{T}\varvec{r}}{)}^{2}}$$
(34)
$$\:\varvec{R}\varvec{M}\varvec{S}\varvec{E}=\sqrt{\frac{1}{\varvec{n}}{{\sum\:}_{\varvec{i}=1}^{\varvec{n}}\left({\varvec{F}}_{\varvec{T}\varvec{r}\varvec{P}}-{\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}\right)}^{2}}$$
(35)
$$\:\varvec{S}\varvec{S}\varvec{E}={{\sum\:}_{\varvec{i}=1}^{\varvec{n}}\left({\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}-{\varvec{F}}_{\varvec{T}\varvec{r}\varvec{P}}\right)}^{2}$$
(36)
$$\:\varvec{T}\varvec{S}\varvec{S}\varvec{E}={\sum\:}_{\varvec{i}=1}^{\varvec{n}}\left({\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}-{\overline{\varvec{F}}}_{\varvec{T}\varvec{r}}\right)$$
(37)
$$\:{\varvec{T}}_{\varvec{S}\varvec{c}\varvec{a}\varvec{t}\varvec{t}\varvec{e}\varvec{r}}=1-\frac{{\sum\:}_{\varvec{i}=1}^{\varvec{n}}({\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}-{\overline{\varvec{F}}}_{\varvec{T}\varvec{r}\varvec{P}}{)}^{2}}{{\sum\:}_{\varvec{i}=1}^{\varvec{n}}({\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}-{\overline{\varvec{F}}}_{\varvec{T}\varvec{r}}{)}^{2}}$$
(38)
$$\:\varvec{M}\varvec{A}\varvec{E}=\frac{1}{\varvec{n}}{\sum\:}_{\varvec{i}=1}^{\varvec{n}}\left|{\varvec{F}}_{\varvec{T}\varvec{r}\varvec{P}}-{\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}\right|$$
(39)
$$\:\varvec{M}\varvec{A}\varvec{P}\varvec{E}=\frac{\left({\sum\:}_{\varvec{i}=1}^{\varvec{n}}\frac{\left|{\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}-{\varvec{F}}_{\varvec{T}\varvec{r}\varvec{P}}\right|}{{\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}}\right)}{\varvec{n}}\varvec{x}100$$
(40)
$$\:\varvec{C}\varvec{V}=\frac{\sqrt{{\sum\:}_{\varvec{i}=1}^{\varvec{n}}\left(\frac{{\left({\varvec{F}}_{\varvec{T}\varvec{r}\varvec{p}}-{\overline{\varvec{F}}}_{\varvec{T}\varvec{r}}\right)}^{2}}{\varvec{n}-1}\right)}}{\left(\frac{{\sum\:}_{\varvec{i}=1}^{\varvec{n}}{\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}}{\varvec{n}}\right)}$$
(41)
$$\:\varvec{P}\varvec{A}=\left[1-\left(\frac{1}{\varvec{n}}{\sum\:}_{\varvec{i}=1}^{\varvec{n}}\frac{\left|{\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}-{\varvec{F}}_{\varvec{T}\varvec{r}\varvec{P}}\right|}{{\varvec{F}}_{\varvec{T}\varvec{r}\varvec{A}}}\right)\right]\varvec{x}100$$
(42)

Where n is the number of data points, FTrP and \(\:{\stackrel{-}{F}}_{Tr}\) are the predicted FTr and its corresponding mean, FTrA and \(\:{\stackrel{-}{F}}_{TrA}\) represent the actual experimental FTr and its mean, and FTri is the ith FTr, respectively. Moreover, the neurocognitive reliability of the model predictions was assessed using the a20-index (a20), Willmott's index of agreement (IOA), index of scatter (IOS), variance accounted for (VAF), and performance index (PI)29,62:

$$\:\varvec{a}20-\varvec{i}\varvec{n}\varvec{d}\varvec{e}\varvec{x}=\frac{\varvec{m}20}{\varvec{M}}$$
(43)
$$\:\varvec{I}\varvec{O}\varvec{S}=\frac{\varvec{R}\varvec{M}\varvec{S}\varvec{E}}{{{\stackrel{-}{\varvec{F}}}_{\varvec{T}\varvec{r}}}_{\varvec{A}}}$$
(44)
$$\:\varvec{I}\varvec{O}\varvec{A}=1-\left[\frac{{\sum\:}_{\varvec{i}=1}^{\varvec{n}}{\left({{\stackrel{-}{\varvec{F}}}_{\varvec{T}\varvec{r}}}_{\varvec{P}}-{{\varvec{F}}_{\varvec{T}\varvec{r}}}_{\varvec{A}}\right)}^{2}}{\sum\:_{\varvec{i}=1}^{\varvec{n}}{\left\{\left|\left({{\varvec{F}}_{\varvec{T}\varvec{r}}}_{\varvec{P}}-{{\stackrel{-}{\varvec{F}}}_{\varvec{T}\varvec{r}}}_{\varvec{A}}\right)\right|+\left|\left({{\varvec{F}}_{\varvec{T}\varvec{r}}}_{\varvec{A}}-{{\stackrel{-}{\varvec{F}}}_{\varvec{T}\varvec{r}}}_{\varvec{A}}\right)\right|\right\}}^{2}}\right]$$
(45)
$$\:\varvec{V}\varvec{A}\varvec{F}=\left[1-\frac{\varvec{v}\varvec{a}\varvec{r}\left({{\stackrel{-}{\varvec{F}}}_{\varvec{T}\varvec{r}}}_{\varvec{A}}-{{\stackrel{-}{\varvec{F}}}_{\varvec{T}\varvec{r}}}_{\varvec{P}}\right)}{\varvec{v}\varvec{a}\varvec{r}\left({{\stackrel{-}{\varvec{F}}}_{\varvec{T}\varvec{r}}}_{\varvec{A}}\right)}\right]\varvec{x}100$$
(46)
$$\:\varvec{P}\varvec{I}={\varvec{R}}^{2}+0.01\varvec{x}\varvec{V}\varvec{A}\varvec{F}-\varvec{R}\varvec{M}\varvec{S}\varvec{E}$$
(47)

Where m20 is the number of samples whose actual-to-predicted ratio lies between 0.8 and 1.2, M is the total number of samples, and n and p indicate the total number of data samples and inputs, respectively. Further, Wilcoxon rank-sum indexing was adopted to establish the effective reliability of the DNN and ANN models by comparing a nonparametric Wilcoxon rank-sum test score index, obtained by ranking the accuracy and reliability metrics of each ANN and DNN model in order of increasing value and assigning a rank number (ϑscore) to every metric. The sum of ranks for each model was calculated to establish the respective total Wilcoxon rank-sum test score (ξtotal) for comparison. The individual ϑscore ranks for the accuracy and reliability metrics were obtained for both the training and testing phases, and ξtotal was then established by summing the metric ranks over the two modeling phases. The model with the highest accuracy and reliability metrics received a greater ξtotal, yielding a larger Wilcoxon rank-sum statistic index (Eq. 48). Consequently, the highest value of ξtotal represents the optimal ANN and DNN model architecture.

$$\:{\varvec{\xi\:}}_{\varvec{t}\varvec{o}\varvec{t}\varvec{a}\varvec{l}}=\left[\sum\:_{\varvec{i}=1}^{\varvec{m}}{\varvec{\xi\:}}_{\varvec{i}}+\sum\:_{\varvec{j}=1}^{\varvec{n}}{\varvec{\xi\:}}_{\varvec{j}}\right]$$
(48)

Where ξi and ξj are the metric rank scores during training and testing, respectively, while m and n are the corresponding numbers of ϑscore values in the respective modeling phases. Table 9 presents the ideal values of the performance metrics.
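The rank-sum scoring of Eq. 48 can be sketched in MATLAB as follows; metricTable is a hypothetical models-by-metrics matrix in which larger entries are assumed to indicate better performance (error-type metrics would first be negated or inverted).

```matlab
% Minimal sketch of Eq. (48): Wilcoxon rank-sum scoring of the candidate models.
metricTable = rand(12, 9);             % hypothetical 12 models x 9 metrics
[nModels, nMetrics] = size(metricTable);
ranks = zeros(nModels, nMetrics);
for j = 1:nMetrics
    [~, order] = sort(metricTable(:, j), 'ascend');  % lowest value gets rank 1
    ranks(order, j) = (1:nModels)';
end
xiTotal = sum(ranks, 2);               % total rank-sum score per model
[~, best] = max(xiTotal);              % the highest score indicates the optimal model
```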

Table 9 Ideal values of the performance metrics.

The selection of multiple statistical and error-based performance metrics, such as Mean Squared Error (MSE), Coefficient of Determination (R²), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Sum Square Error (SSE), Prediction Scatter (Tscatter), Coefficient of Variation (CV), Prediction Accuracy (PA), a20-index (a20), Willmott’s Index of Agreement (IOA), Index of Scatter (IOS), Variance Accounted For (VAF), and Performance Index (PI), is crucial for comprehensive model evaluation and validation. Each metric captures different aspects of model behavior: MSE, RMSE, MAE, and MAPE quantify the magnitude and type of prediction errors; R² and VAF assess how well the model explains the variance in observed data; SSE measures the total deviation from actual values; CV standardizes the error relative to the mean; PA and a20-index evaluate classification and proximity-based accuracy; IOA and IOS focus on agreement and dispersion between predicted and observed values; while PI combines multiple error components into a single composite score.
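For completeness, a minimal MATLAB sketch of several of these metrics is given below, using their conventional definitions (note that Eq. 33 in the text defines MSE without the 1/n factor, i.e., as the SSE); FTrA and FTrP are the actual and predicted tractive force vectors.

```matlab
% Minimal sketch of selected accuracy metrics (conventional forms).
err  = FTrA - FTrP;
MSE  = mean(err.^2);
RMSE = sqrt(MSE);
MAE  = mean(abs(err));
MAPE = mean(abs(err) ./ FTrA) * 100;
SSE  = sum(err.^2);
R2   = 1 - SSE / sum((FTrA - mean(FTrA)).^2);
PA   = 100 - MAPE;                    % prediction accuracy, cf. Eq. (42)
```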

Simulation of results

Tables 10 and 11 present the accuracy metrics for all 72 DNN and ANN models, including their training and testing performance and a comparison between them. The most accurate models possessed the lowest MSE, RMSE, SSE, TSSE, MAE, MAPE, and CV, together with the highest R, R², Tscatter, and PA, as reported by Jierula et al.63. Considering the overall accuracy metrics, the ANN trainbr (14-72-1) and DNN trainlm (14-7-5-1) models were the most accurate and were therefore adopted for further evaluation. However, DNN trainlm (14-7-5-1) required more epochs to reach optimality (55) but less neurocomputing time (2 s) than ANN trainbr (14-72-1), which converged in 51 epochs and 29 s. Although the optimal number of epochs did not scale with training time, the number of hidden layers scaled with epoch size in DNN but not in ANN. The number of hidden layers was also unrelated to prediction accuracy in ANN but scaled with it in DNN, whereas the number of hidden neurons scaled with accuracy in ANN but not in DNN across the hidden layers. Increasing the number of hidden layers and neurons reduced neurocomputing accuracy in ANN but improved it in DNN. As such, the trainlm and trainbr algorithms were the most accurate for DNN and ANN modeling, respectively.

Table 10 Summary of accuracy metrics of the ANN model in predicting FTr.
Table 11 Summary of accuracy metrics of deep learning models in predicting FTr.

Neurocognitive prediction performance during training, testing, and validation, and with the entire database, is shown in Fig. 11(a-f). The regressed correlation was strong for both models, with ANN trainbr 14-72-1 achieving a marginally stronger correlation than DNN trainlm 14-7-5-1, likely owing to its regularization capabilities, which enhance generalization. Predictive regression in neurocomputing networks is crucial for accuracy and neurocognitive reliability60. Despite a slightly higher MSE, the DNN trainlm 14-7-5-1 model maintained adequate input-output regression (> 0.999) owing to its precise neuron transition dynamics. This observation aligns with previous studies in which second-order methods have been shown to excel in predictive tasks that require complex data relationships, as reported by Mienye and Swart61, Huo et al.64, and Bassiouni et al.65. The complementary data-fitting strengths of the trainbr and trainlm algorithms highlight their potential for complex predictive modeling in diverse agricultural tasks using ANN and DNN models, respectively. The rapid convergence and robust generalization of trainbr make it ideal for scenarios with limited or less noisy data, whereas trainlm suits deeper, more complex architectures that deliver quick results, subject to denoising and a higher backpropagation memory cost. Nonetheless, both the ANN trainbr 14-72-1 and DNN trainlm 14-7-5-1 models exhibited input-output prediction correlations greater than 99.9% (Fig. 11), reaffirming their neurocognitive accuracy in forecasting FTr during tillage.

Fig. 11
figure 11

Illustration of regression of Artificial Neural Network (ANN) trainbr 14-72-1 and Deep Neural Network (DNN) trainlm 14-7-5-1 model predictions during (a, b) training, (c, d) testing, and (e, f) with entire database modeling phases, respectively.

Accuracy metrics approached ideal unity for the most accurate ANN and DNN models, as shown in Fig. 12, indicating that trainbr performed best in the ANN (14-72-1) model, while trainlm was best in the DNN (14-7-5-1) model. During the training and testing phases and for the entire database, all the ANN trainbr models achieved the highest overall unity performance for R, R², and Tscatter (Fig. 12a-c). However, the corresponding metrics of the trainbr algorithm in DNN modeling were less than unity and varied substantially. The best-performing trainbr model (14-72-1) achieved R, R², and Tscatter values of unity in all modeling phases in ANN (Fig. 12a-c). In contrast, all the trainlm neurocomputing models outperformed their trainbr counterparts in DNN for all modeling phases and for the entire database (Fig. 12d-f). In DNN, only the trainlm 14-7-5-1 architecture maintained values close to unity for the tripartite parameters R, R², and Tscatter across all modeling phases, making it the most error-tolerant and superior DNN model (Fig. 12d-f). As such, trainlm algorithms are best suited for training multi-layered DNN models, while trainbr performs best in ANN models. Other studies have demonstrated that ideal neurocomputing models have R, R², and Tscatter values equal to unity66.

Fig. 12
figure 12

Illustration of comparison of ideal accuracy unity for ANN (a, c, e for training, testing, and overall) and DNN (b, d, f for training, testing, and overall) models.

Accuracy metrics for the DNN and ANN models that approach the ideal value of zero are shown in Fig. 13, which depicts the MSE, RMSE, MAE, and MAPE values of the best models trained with the most accurate algorithms (trainbr and trainlm) across all modeling phases. The ANN and DNN models of trainbr and trainlm exhibited the lowest MSE, RMSE, MAE, and MAPE values, which lay close to the zero-error line for all modeling phases. The best neurocomputing models have MSE, RMSE, MAE, and MAPE values closest to zero63,67. Considering all modeling phases and datasets (training, testing, and the entire database) in tandem, trainbr was the most accurate algorithm for ANN modeling, while trainlm was the most precise for DNN models. The ANN trainbr 14-72-1 and DNN trainlm 14-7-5-1 architectures were thus the most accurate in forecasting FTr in situ.

Fig. 13
figure 13

Illustration of convergence error zeroing for ANN (a, c, e for training, testing, and overall) and DNN (b, d, f for training, testing, and overall) models.

The prediction accuracy (PA) of the best ANN and DNN models is shown in Fig. 14. The analysis indicated that three ANN trainbr models (14-72-1, 14-39-1, and 14-13-1) and three DNN trainlm models (14-7-5-1, 14-9-5-1, and 14-9-7-1) achieved PA values close to unity (> 0.95) in all modeling phases (Fig. 14). The PA of the single-layered ANN trainbr models (14-13-1, 14-39-1, 14-72-1) lagged only marginally from unity, whereas the multi-layered DNN trainbr counterparts (14-7-5-1, 14-9-5-1, and 14-9-7-1) diverged widely during training, testing, and for the entire database. The multi-layered DNN trainlm models (14-7-5-1, 14-9-5-1, and 14-9-7-1) performed best in all modeling phases compared with the corresponding DNN trainbr models, which diverged widely from unity. Likewise, the PAs of the single-layered ANN trainbr models lagged only marginally from unity compared with their ANN trainlm counterparts, which diverged widely during testing and for the entire database. The PA of trainlm was thus higher in multi-layered DNN architectures than in single-layered ones, while trainbr performed best in single-layered ANN architectures. Nevertheless, the best single-layered trainbr (14-72-1) model was marginally superior to the best multi-layered trainlm (14-7-5-1) model during training, testing, and with the entire database. These findings are consistent with connectome signalling, which occurs most accurately via the shortest path lengths of highly clustered neurons in the human brain, free from extraneous artifacts such as noise or aliasing68,69.

Fig. 14
figure 14

Illustration of comparison of PA in terms of Radar chart for DNN and ANN modeling phases using trainlm and trainbr algorithms.

The maximum error differences between predictions and actual values for the range of datasets are shown in Fig. 15; Table 12. The residual error characteristic curve generalizes the data points falling within the zero-error tolerance during prediction, with error residuals on the y-axis and the data point error sources on the x-axis. The ANN trainbr model 14-39-1 achieved the smallest magnitude of error residuals during training, testing, and for the entire database (Fig. 15; Table 12). The maximum error limits of ANN trainbr (14-72-1) were closest to the actual zero error line in predicting FTr compared to the best DNN trainlm (14-7-5-1) model.

Fig. 15
figure 15

Illustration of residuals for ANN (a, c, e for training, testing, and overall) and DNN (b, d, f for training, testing, and overall) models in predicting FTr.

Table 12 Maximum error residuals for the best neurocognitive models.

A comparative assessment of ANN trainbr and DNN trainlm model learning and prediction errors during training, testing, and with the entire database is illustrated in Fig. 16. The ANN trainbr 14-72-1 model reported the lowest errors (0.0054, 0.0117, and 0.006), while the DNN trainlm 14-7-5-1 model achieved the lowest DNN errors (0.0065, 0.2226, and 0.065) during training, testing, and with the entire database, respectively.

Fig. 16
figure 16

Illustration of prediction errors for (a) trainbr and (b) trainlm in Artificial Neural Network (ANN) and Deep Learning (DNN) neurocognitive architectures.

Reliability analysis

The reliability metrics of the ANN and DNN models are shown in Fig. 17; Table 13. All ANN trainbr models achieved equal and optimal values of a20 (100%), VAF (100%), PI (2.0), and IOA (1.0). However, ANN trainbr 14-72-1 achieved the lowest IOS values, indicating its superior reliability. Although all the DNN models had equal a20 (100%) and VAF (100%), the DNN trainlm 14-7-5-1 model achieved the highest and ideal PI (2.0) and IOA (1.0) at the lowest IOS (0.0017), making it the most reliable. Thus, the most reliable models were ANN trainbr 14-72-1 and DNN trainlm 14-7-5-1. The overall reliability metrics indicated that the ANN models achieved the highest reliability with trainbr, whereas the DNN models proved most reliable with trainlm (Fig. 17). All corresponding reliability indices were close or equal to the ideal values, confirming the reliability of both DNN and ANN models in forecasting agricultural traction.

Table 13 Reliability metrics of ANN and DNN models.
Fig. 17
figure 17

Illustration of comparison of reliability indices for ANN and DNN in trainbr (a, c) and trainlm (b, d).

Wilcoxon rank analysis

The ξtotal computed for the ANN and DNN models is shown in Fig. 18(a), indicating that the ANN models achieved a higher ξtotal with single-layered neuron architectures, whereas the DNN models achieved the highest ξtotal with multi-layered architectures. The DNN models scored higher with trainlm than with trainbr, where single-layered ANN models were superior (Fig. 18b and c; Table 14). Nonetheless, the best single-layered ANN trainbr model (14-72-1) achieved an overall score of 240, outperforming the best DNN trainlm model, which scored 196 (Table 14). Figure 18b and c illustrate that ANN trainbr 14-72-1 achieved the highest score ranking during training and testing, while trainlm 14-7-5-1 achieved the highest scores in DNN. Hence, considering the two best-performing algorithms and neurocognitive architectures, the single-layered trainbr 14-72-1 and multi-layered trainlm 14-7-5-1 models proved the most reliable for predicting FTr in tillage using ANN and DNN strategies, respectively.

Table 14 Score rank indices for the best-performing models.
Fig. 18
figure 18

Illustration of score analysis in terms of a radar plot for (a) combined ANN and DNN and ξtotal of (b) trainbr and (c) trainlm models.

Visual interpretation of model capabilities

Taylor plot

The Taylor method was employed to simultaneously assess multiple statistical metrics and quantify the extent to which the ANN and DNN neurocomputing predictions matched the experimental datasets. The statistical software R (version 4.4.2) was used to generate the Taylor diagrams owing to its cross-platform flexibility and support for visualizing multiple model performance metrics. Taylor plots were constructed by integrating the goodness-of-fit metrics R and RMSE with the standard deviation (σ) of the best ANN and DNN models during training, testing, and for the entire database. During the Taylor analysis, RTaylor values, which describe the degree of correspondence between the model simulations (s) and the reference data (r), were evaluated using Eq. 49 (ref. 70).

$$R_{Taylor}=\frac{\frac{1}{N-1}\sum_{n=1}^{N}\left(s_{n}-\bar{s}\right)\left(r_{n}-\bar{r}\right)}{\sigma_{s}\sigma_{r}}$$
(49)

where \(\bar{s}\) and σs are the mean and standard deviation of the simulated data s, and \(\bar{r}\) and σr are those of the reference data r, with sn and rn denoting the nth simulated and reference values. The corresponding RMSETaylor was defined by Eq. 50.

$$RMSE_{Taylor}={\left[\frac{1}{N}\sum_{n=1}^{N}\left(s_{n}-r_{n}\right)^{2}\right]}^{0.5}$$
(50)

Further, the relationship between RMSETaylor and the standard deviations of s and r was used to formulate the centered root-mean-square error (cRMSE), as indicated in Eq. 51 (ref. 71); expanding it yields the relationship between cRMSE, σs, σr, and RTaylor for the simulated and reference data (Eq. 52). Eq. 52 was therefore adopted graphically to evaluate the DNN and ANN models, relating standard variability, correlation, and cRMSE between predictions and reference datasets in a Taylor diagram72,73.

$$cRMSE={\left[\frac{1}{N}\sum_{n=1}^{N}{\left[\left(s_{n}-\bar{s}\right)-\left(r_{n}-\bar{r}\right)\right]}^{2}\right]}^{0.5}$$
(51)
$$cRMSE^{2}=\sigma_{s}^{2}+\sigma_{r}^{2}-2\,\sigma_{s}\,\sigma_{r}\,R_{Taylor}$$
(52)

Taylor’s analysis combined three statistical metrics of the neurocomputing models (R, σ, and cRMSE), thereby correcting for any underlying offsets or biases in the predictions and providing a more robust representation of prediction errors that better reflects model robustness.
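A minimal NumPy sketch of the quantities plotted in a Taylor diagram (Eqs. 49–52) is shown below; it is not the R plotting code used in this study, and it adopts population (1/N) normalization throughout so that the Eq. 52 identity holds exactly.

```python
import numpy as np

def taylor_stats(s, r):
    """Statistics underlying a Taylor diagram (Eqs. 49-52).
    s: simulated (predicted) values, r: reference (observed) values."""
    s, r = np.asarray(s, float), np.asarray(r, float)
    sigma_s, sigma_r = s.std(), r.std()
    R = np.mean((s - s.mean()) * (r - r.mean())) / (sigma_s * sigma_r)   # Eq. 49 (1/N form)
    crmse = np.sqrt(np.mean(((s - s.mean()) - (r - r.mean())) ** 2))     # Eq. 51
    # Eq. 52 identity: cRMSE^2 = sigma_s^2 + sigma_r^2 - 2*sigma_s*sigma_r*R
    assert np.isclose(crmse ** 2, sigma_s ** 2 + sigma_r ** 2 - 2 * sigma_s * sigma_r * R)
    return R, sigma_s, sigma_r, crmse
```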

A comprehensive visual-metric assessment of the ANN and DNN models’ performance was obtained by portraying the extent to which their predictions differed from the reference dataset while considering the corresponding σ, as shown in the Taylor diagram (Fig. 19). Both the ANN and DNN architectures demonstrated satisfactory relationships between R, σ, and RMSE during the modeling phases and for the entire database. The ANN trainbr 14-72-1 and DNN trainlm 14-7-5-1 models demonstrated superior prediction compared with their respective counterparts, with their R, σ, and RMSE values positioned at the corresponding reference data points (Fig. 19). The ANN trainbr (14-72-1) clustered closest to the reference data points, albeit with slightly higher values, although its margin over the DNN trainlm 14-7-5-1 model was not significant (Fig. 19). The compact, intuitive Taylor diagram summarized the multi-statistical aspects of the models, depicting the degree to which the experimental datasets and predicted values were similar or dissimilar, with the best models exhibiting the closest congruence. These findings align with similar observations in previous research66,70,74.

Fig. 19
figure 19

Illustration of the Taylor plots for ANN (a, b, c for training, testing, and overall) and DNN (d, e, f for training, testing, and overall) models.

Monte Carlo uncertainty simulation

A Monte Carlo simulation was performed to quantify the uncertainties associated with the predictions of the best ANN and DNN models. During the analysis, randomly resampled datasets were used to retrain the DNN and ANN models over 1,000 cycles, generating a corresponding number of outputs at constant train-validation-test ratios without replacement. Monte Carlo-based cumulative distribution functions were then constructed to determine the proportion of true data bracketed by the 95% prediction uncertainty (PPU95%) interval, using the degree of neurocognitive uncertainty \(\overline{dU}_x\) evaluated at the 2.5th (XL) and 97.5th (XU) percentiles, as depicted in Eq. 53.

$$\overline{dU}_{x}=\frac{1}{n}\sum_{i=1}^{n}\left(X_{U}-X_{L}\right)$$
(53)

where n is the number of experimental observations. An ideal model yields \(\overline{dU}_x\) of zero with 100% of the observations bracketed by PPU95%, although this is rarely achieved in practice owing to modeling uncertainty75,76. A normalized equivalent of \(\overline{dU}_x\) was therefore expressed as the dfactor and computed using Eq. 54.

$$d_{factor}=\frac{\overline{dU}_{x}}{\sigma_{x}}$$
(54)

where σx is the standard deviation of the observed output variable. Larger dfactor values indicate greater uncertainty, and vice versa; values less than unity are desirable, provided that a high proportion of the true data remains bracketed by PPU95%, as computed in Eq. 55 (ref. 46).

$$PPU_{95\%}=\frac{1}{n}\,\mathrm{count}\left(X\mid X_{L}\le X\le X_{U}\right)\times 100$$
(55)
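A minimal NumPy sketch of these uncertainty indices (Eqs. 53–55) is given below; the array shapes are assumptions (one row of predictions per Monte Carlo retraining cycle), and the 1,000 retraining cycles themselves are not shown.

```python
import numpy as np

def uncertainty_indices(mc_predictions, observed):
    """Monte Carlo uncertainty indices of Eqs. 53-55.
    mc_predictions: array of shape (n_cycles, n_obs) from repeated retraining.
    observed: array of shape (n_obs,)."""
    mc_predictions = np.asarray(mc_predictions, float)
    observed = np.asarray(observed, float)
    x_low = np.percentile(mc_predictions, 2.5, axis=0)    # X_L per observation
    x_up = np.percentile(mc_predictions, 97.5, axis=0)    # X_U per observation
    d_ux = np.mean(x_up - x_low)                          # Eq. 53
    d_factor = d_ux / observed.std()                      # Eq. 54
    inside = (observed >= x_low) & (observed <= x_up)
    ppu95 = 100.0 * np.count_nonzero(inside) / observed.size   # Eq. 55
    return d_ux, d_factor, ppu95
```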

The Monte Carlo uncertainty simulation generated 95% confidence intervals (CI), shown in the uncertainty plots of both ANN trainbr 14-72-1 and DNN trainlm 14-7-5-1, which encapsulate the range within which the predicted outcomes fall relative to the observed data (Fig. 20). A plot with relatively narrower uncertainty bands reflects higher confidence in the predictions of the best model77,78,79. Compared with trainlm, the CI band of the ANN trainbr model (Fig. 20a) suggests that the ANN would outperform the DNN trainlm model, mainly when the dataset includes patterns that can be efficiently learned through regularization. This is typical of ANN 14-72-1, where the regularization process adjusts the network’s complexity by penalizing excessive weights, allowing a closer fit to the observed data while avoiding overfitting. This observation indicates that the trainbr algorithm excelled at producing reliable and robust predictions in ANN (14-72-1), as apparent from the close alignment between observed and predicted data (Fig. 20). Nevertheless, both the DNN and ANN models exhibited high adaptability to the observed data while maintaining a balance between accuracy and generalization. Specifically, ANN 14-72-1 effectively reduces overfitting by regularizing the weights, thereby enhancing the robustness of its predictions. This improvement is achieved by iteratively and efficiently adjusting the network parameters, providing a better response to the underlying complex patterns in the dataset (Fig. 20a). The trainbr ANN 14-72-1 model penalizes large weights, allowing an effective trade-off between model complexity and PA while balancing bias and variance. An improved generalization with minimal overfitting is thus achieved, which is crucial when dealing with complex and noisy datasets or where prediction uncertainty is critical (Fig. 20c); such improvements are not attainable with the DNN trainlm model.

In contrast, the DNN trainlm neuro-optimization technique yields a different pattern of prediction and uncertainty intervals, with a smoother and more generalized mean prediction that does not follow the observed data points as closely (Fig. 20b). This can be attributed to the nature of the trainlm algorithm, which focuses on optimizing speed and accuracy through a combination of gradient descent and Gauss-Newton methods, as reported in the literature33,80. While this method is generally faster and converges effectively, it may struggle to align predictions as closely with the data when significant variability or noise is present in the dataset (Fig. 20b); the uncertainty interval therefore widens (Fig. 20d) compared with the trainbr model (Fig. 20c). In some instances, this trade-off may be advantageous when a fast, computationally efficient solution is needed. However, where data variability is crucial, trainlm may require additional tuning, such as regularization, as reported by Ying81. Table 15 summarizes the Monte Carlo uncertainty indices corresponding to the PPU95% intervals obtained from the ANN and DNN output predictions. The ideal model delivers \(\overline{dU}_x\) approaching zero with 100% of observations bracketed by PPU95%, as indicated in Badgujar et al.46 and Noori et al.75. Most ANN and DNN models had 100% of their predictions bracketed at PPU95%. However, the trainbr 14-72-1 ANN had the lowest \(\overline{dU}_x\) and dfactor values, closest to zero, and was a marginally better model, while trainlm 14-7-5-1 was best among the DNNs (Table 15). These results agree with the earlier findings, in which the trainbr algorithm performed best in ANN architectures and trainlm in DNN architectures.

Fig. 20
figure 20

Illustration of the Monte Carlo uncertainty analysis for ANN (a, c) and DNN (b, d) models.

Table 15 Uncertainty evaluation of the best ANN and DNN models.

Anderson Darling (AD) test

The AD-test was used to determine whether a sampled prediction was drawn from the hypothesized normal distribution of the observed data. This was achieved by evaluating the AD-test statistic based on the cumulative distribution function F(x; θ) and the empirical distribution function Fn(x) for n ordered observations x1 ≤ x2 ≤ … ≤ xn of a particular sample. If x1 ≤ x2 ≤ … ≤ xn follow the distribution F(x; θ), then H0 holds; otherwise, the alternative H1 is true. The AD statistic (A²), defined by Eq. 56 and computed using Eq. 57, was compared with its corresponding critical values (α)82,83.

$$A^{2}=n\int_{-\infty}^{\infty}\frac{{\left[F_{n}\left(x\right)-F\left(x;\theta\right)\right]}^{2}}{F\left(x;\theta\right)\left[1-F\left(x;\theta\right)\right]}\,dF\left(x;\theta\right)$$
(56)
$$A^{2}=-N-\frac{1}{N}\sum_{j=1}^{N}\left(2j-1\right)\left[\ln u_{j}+\ln\left(1-u_{N-j+1}\right)\right]$$
(57)

where N is the total number of sample data points, uj equals F(xj), and xj is the jth ordered sample value. The Anderson-Darling test results are shown in Table 16, together with the AD-test at a 95% confidence interval in Fig. 21. The P-value (< 0.0012) of the actual database was lower than the significance level (P < 0.05), implying rejection of the stated null hypothesis and acceptance of a normally distributed database (Table 16). Compared with the other models, the AD-test statistics of both ANN trainbr (14-72-1) and DNN trainlm (14-7-5-1), i.e., 1.3913 and 1.3922, respectively, were the closest to that of the actual database (1.3912), as shown in Table 16. This indicates a normal distribution in the neurocognitive predictions of ANN trainbr (14-72-1) and DNN trainlm (14-7-5-1), as shown by the closeness of the AD-test statistic values (i.e., ADmodel ≡ ADactual). However, the AD-test statistics of ANN trainbr (14-72-1) were closer to the actual values than those of DNN trainlm (14-7-5-1), possibly owing to the ingress of extraneous artifacts, such as noise and long backpropagation memory, from a more complex neurocognitive architecture. Moreover, the P-values of both the ANN and DNN predictions were less than the ideal P-value (i.e., Pmodel < Pideal), further justifying rejection of the null hypothesis and acceptance of normality in the FTr predictions of both models (Fig. 21). Hence, their neurocognitive robustness can be generalized. The AD-test statistics of neurocognitive predictions must be close to the value of the actual database for normally distributed predictions to be accepted, and this condition must be met whenever the Pmodel values are less than the significance level (P < 0.05) as cogent evidence of model robustness; these findings agree with the literature84,85.
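For illustration, the sketch below computes A² from the computational form in Eq. 57 with the normal CDF fitted to the sample and cross-checks it against SciPy’s anderson routine; it is a simplified stand-in for the statistical software used in this study, and the sample shown is synthetic.

```python
import numpy as np
from scipy.stats import norm, anderson

def ad_statistic(x):
    """Anderson-Darling A^2 for normality via Eq. 57, with u_j = F(x_j; theta)
    taken from a normal CDF whose parameters are estimated from the sample."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    u = norm.cdf(x, loc=x.mean(), scale=x.std(ddof=1))   # u_j = F(x_j; theta)
    j = np.arange(1, n + 1)
    return -n - np.mean((2 * j - 1) * (np.log(u) + np.log(1 - u[::-1])))

# Cross-check against SciPy's built-in test, which also reports critical values
sample = np.random.default_rng(0).normal(loc=0.0, scale=1.0, size=200)
print(ad_statistic(sample), anderson(sample, dist='norm').statistic)
```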

Fig. 21
figure 21

Illustration of the Anderson-Darling test for (a) actual data, and predicted FTr using (b) DNN trainlm 14-7-5-1 and (c) ANN trainbr 14-72-1 models.

Table 16 Anderson-Darling test results for the ANN and DNN model predictions.

Discussion on results

The present investigation shows that the ANN model configured with Bayesian Regularization (trainbr) outperformed the ANN models configured with Levenberg-Marquardt (trainlm), Scaled Conjugate Gradient (trainscg), Quasi-Newton (trainbfg), Powell-Beale Conjugate Gradient (traincgb), One-Step Secant (trainoss), Gradient Descent with Momentum (traingdm), Fletcher-Reeves Conjugate Gradient (traincgf), Gradient Descent (traingd), Polak-Ribière Conjugate Gradient (traincgp), Adaptive Learning Rate Gradient Descent (traingdx), and Resilient Backpropagation (trainrp) owing to its exceptional generalization capabilities. Unlike traditional methods such as trainlm or trainscg, trainbr automatically incorporates regularization, preventing overfitting, especially on noisy or small datasets. While algorithms such as trainlm and trainbfg may converge faster, they risk memorizing the training data rather than learning the underlying patterns. In contrast, trainbr balances data fitting with model complexity, adjusting weights to ensure smoother outputs, which makes it highly reliable for function approximation and regression tasks. Unlike gradient-based methods such as traingd, traingdx, or traingdm, trainbr is less sensitive to learning-rate tuning and local minima. It also outperforms the conjugate gradient variants (traincgf, traincgp, traincgb) and resilient backpropagation (trainrp) in producing stable models across varied input distributions. Furthermore, the probabilistic framework of trainbr enhances robustness against outliers. By contrast, the DNN model trained using Levenberg-Marquardt (trainlm) often outperforms other backpropagation algorithms owing to its exceptionally fast and accurate convergence, especially on moderate-sized datasets. As a hybrid of gradient descent and Gauss-Newton methods, trainlm effectively handles nonlinear error surfaces with high precision. Compared with Bayesian Regularization (trainbr), trainlm generally trains faster and requires fewer epochs, making it ideal for tasks demanding quick optimization. It also outpaces conjugate gradient methods such as traincgb, traincgf, and traincgp in both speed and solution quality, especially when the network is well initialized. Unlike the basic gradient descent variants (traingd, traingdx, traingdm), trainlm is far more resilient to poor learning-rate settings. While trainbr adds robustness through regularization, it is computationally intensive and slower for deeper networks. Algorithms such as trainoss and trainbfg approximate second-order information but lack the adaptive damping feature of trainlm. Additionally, resilient backpropagation (trainrp) performs well in shallow networks but struggles with deeper architectures. Finally, the robustness of the DNN and ANN models was compared with that of other machine learning models used to predict agricultural traction (Table 17). As highlighted earlier, some of these models were developed under controlled experimental conditions in laboratory soil bins, relying on a limited number of input parameters and accuracy evaluation metrics. By contrast, the ANN trainbr 14-72-1 and DNN trainlm 14-7-5-1 models in our study were developed under in situ conditions. Consequently, neurocognitive learning and modeling were subjected to complex and dynamic soil and environmental conditions involving a large number of input variables, which enhanced their robustness.
The neurocognitive models presented in our study can simulate traction force under real-world field conditions with high accuracy, reliability, and robustness, facilitating their generalized adoption (Table 17).

Table 17 Performance comparison of published and present study models.

The literature demonstrates that optimization algorithms enhance the performance of soft-computing models. Therefore, the BR_ANN (trainbr; 14-72-1) and LM_DNN (trainlm; 14-7-5-1) models were optimized using three metaheuristic algorithms, i.e., Spider Wasp Optimization (SWO), the Puma Optimizer (PO), and the Walrus Optimizer (WO). The reasons for selecting these algorithms are as follows: (a) SWO effectively balances exploration and exploitation by mimicking the hunting and paralyzing strategies of spider wasps, enabling it to avoid premature convergence91; it is particularly efficient in handling high-dimensional search spaces and provides robust global search capability. (b) PO, inspired by the cooperative and predatory behavior of pumas, emphasizes adaptive hunting strategies that enhance convergence speed while maintaining solution diversity92; its flexibility makes it well suited to both continuous and discrete optimization tasks. (c) WO, modeled after the social and survival behaviors of walruses, incorporates herd-based communication and leadership mechanisms to improve local exploitation93; it shows strong stability and resilience against local optima owing to its collective decision-making process. Together, these algorithms provide superior accuracy, scalability, and adaptability across engineering design, machine learning, and real-world optimization applications. The SWO algorithm was configured with a population size of 30, 500 iterations, a crossover rate of 0.9, and a mutation random factor of 0.5. The PO algorithm was tuned with a population size of 30, 500 iterations, a phase weight of 1.3, a mega exploration and exploitation ratio of 0.99, and a phase-switch threshold of 0.5. Similarly, the WO algorithm was configured with a population size of 30, 500 iterations, a leader (alpha) fraction of 0.20, a step size of 0.5, a decay rate of 0.99 per iteration, a social communication probability of 0.70, and a random perturbation of 0.1. Thus, six hybrid models, i.e., SWO_ANN, PO_ANN, WO_ANN, SWO_DNN, PO_DNN, and WO_DNN, were developed with the same hyperparameter configurations. Figure 22 compares the conventional (BR_ANN; trainbr, 14-72-1) and hybrid ANN models in estimating FTr. The optimization algorithms enhanced the prediction capability of the conventional ANN model, and the optimized ANN models outperformed it, with the SWO_ANN model attaining the highest performance in both phases. The SWO_ANN model estimated FTr with the least residuals (RMSE = 1.38E-11 in training and 8.38E-03 in testing; MAE = 7.05E-12 in training and 6.57E-03 in testing) and the highest accuracy (R = 1 in training and 0.9965 in testing), followed by the WO_ANN (R = 1 in training and 0.9948 in testing) and PO_ANN (R = 1 in training and 0.9943 in testing) models.
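For orientation only, the sketch below shows how a population-based metaheuristic of this kind can tune a trained network’s weight vector by minimizing a prediction-error fitness function with the reported population size (30) and iteration budget (500); the update rule is a generic placeholder and does not reproduce the actual SWO, PO, or WO equations, and the network-evaluation (fitness) function is assumed to be supplied by the caller.

```python
import numpy as np

def metaheuristic_tune(fitness, dim, pop_size=30, iterations=500, seed=0):
    """Generic population-based search loop standing in for SWO/PO/WO:
    each candidate is a full vector of network weights and biases, and
    fitness(weights) returns the prediction error (e.g., RMSE) to minimize."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, dim))
    best = min(pop, key=fitness).copy()
    for _ in range(iterations):
        for i in range(pop_size):
            # Move each candidate toward the current best with a small random
            # perturbation; the real optimizers apply their own bio-inspired rules here.
            trial = pop[i] + rng.uniform() * (best - pop[i]) + 0.1 * rng.normal(size=dim)
            if fitness(trial) < fitness(pop[i]):
                pop[i] = trial
        best = min(pop, key=fitness).copy()
    return best
```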

Fig. 22
figure 22

Illustration of comparison of conventional (BR_ANN) and hybrid models (SWO_ANN, PO_ANN, and WO_ANN).

Conversely, Fig. 23 compares the conventional (LM_DNN; tuned by trainlm, 14-7-5-1) and hybrid deep neural network (DNN) models in predicting FTr. Figure 23 shows that the SWO, PO, and WO algorithms improved the performance of the traditional DNN (trainlm; 14-7-5-1) model. The SWO_DNN model outperformed the conventional LM_DNN model in both the training (RMSE = 9.31E-12, MAE = 4.93E-12, R = 1.0000) and testing (RMSE = 2.57E-03, MAE = 2.56E-03, R = 1.0000) phases with the least residuals and highest performance, followed by the PO_DNN (RMSE = 1.38E-11, MAE = 7.18E-12, R = 1.0000 in the training phase; RMSE = 7.87E-02, MAE = 7.25E-03, R = 0.9984 in the testing phase) and WO_DNN (RMSE = 1.41E-11, MAE = 7.63E-12, R = 1.0000 in the training phase; RMSE = 1.21E-02, MAE = 1.16E-02, R = 0.9983 in the testing phase) models. Interestingly, both the ANN and DNN models attained their highest performance with the SWO algorithm. The fundamental study of SWO reveals that it outperforms PO and WO in training ANN and DNN models owing to its strong balance between exploration and exploitation. Its multi-phase strategy, comprising searching, escaping, paralyzing, and mating, prevents premature convergence and improves convergence speed. SWO adapts better to the high-dimensional weight spaces common in deep networks, and it maintains diversity through crossover and random motion, reducing the risk of overfitting.

Fig. 23
figure 23

Illustration of comparison of conventional (LM_DNN) and hybrid models (SWO_DNN, PO_DNN, and WO_DNN).

Summary and conclusions

This study simulated the neurocognitive intelligence of human brain neurons to develop Deep Neural Network (DNN) and Artificial Neural Network (ANN) models for predicting the tractive force (FTr) of farm vehicles from dynamic soil-machine variables in situ. The model development process relied on 12 artificial neurocomputing algorithms and three activation functions to sequentially train 72 DNN and ANN neurocognitive architectures using 14 input neurons. The prediction performance for FTr was subsequently evaluated using various metrics, including training time, epoch size, architecture complexity, accuracy, robustness, and reliability, from which the optimal DNN and ANN models were identified. The novel set of DNN and ANN neurocognitive equations developed in this study will enable intelligent prediction of FTr for applications in wheeled tractors and autonomous systems used for tillage operations. The main conclusions derived from this research are summarized below:

  • The neurocognitive accuracy, reliability, and robustness of the DNN and ANN models depend on the training algorithm, network size, activation function, convergence time, and epoch size. The number of layers, hidden-layer neurons, convergence time, and epoch optimality do not vary proportionally with neurocomputing accuracy in either ANN or DNN models. The trainbr algorithm performs best in single-layered ANN architectures but consumes more convergence time at lower epoch optimality. In contrast, trainlm provides better predictions in multi-layered DNN models with less convergence time but at the expense of higher epoch optimality. Increasing the number of neurons in the hidden layer of an ANN trained with trainbr improves prediction performance, whereas increasing the number of layers in a trainlm DNN does not necessarily improve accuracy without modifying the algorithm.

  • Although the optimal number of epochs did not scale with training time in ANN, convergence time and the number of hidden layers scaled with epoch size in DNN but not in ANN. The number of hidden layers scaled with modeling accuracy in trainlm DNN but not in trainbr ANN, while the number of neurons scaled with neurocomputing accuracy in ANN but not with the accuracy metrics in DNN models. In ANN, accuracy declined as hidden layers were added, whereas additional neurons improved neurocomputing accuracy; in DNN, increasing the number of hidden layers and neurons improved neurocognitive accuracy, while reducing them decreased it.

  • The performance of DNN models can be optimized by increasing the number of layers and neurons, although this can be limited by early stopping due to overfitting and complex neurocognitive artifacts, such as noise and long-term memory. The trainbr ANN model optimized neurocomputing performance with 72 hidden neurons in a single layer (14-72-1), while the DNN optimized prediction with 7 and 5 neurons in the first and second hidden layers (14-7-5-1). Draft force and tillage depth had the highest explanatory importance for the predicted FTr among all the soil-machine variables used in neurocognitive modeling, while tire inflation pressure was the least sensitive parameter.

  • The SWO, PO, and WO algorithms enhanced the performance of the ANN (trainbr; 14-72-1) and DNN (trainlm; 14-7-5-1) models in both phases. The SWO_ANN and SWO_DNN models outperformed the BR_ANN, LM_DNN, PO_ANN, WO_ANN, PO_DNN, and WO_DNN models with higher accuracy and the least residuals.

Based on the modeling results, future work should focus on designing and developing intelligent, programmable logic control platforms to accurately implement the models in decision-making for in-field operational adjustments, as well as on optimizing wheeled autonomous tractors and tillage robots for sustainable smart farming. The present investigation may be extended by optimizing the DNN (trainlm) and ANN (trainbr) models with the Ant Lion Optimizer (ALO), Information Acquisition Optimizer (INFO), Enhanced Remora Optimization Algorithm (EROA), Enhanced Runge Kutta Optimizer (ERUN), and Improved Randomized Firefly Optimization (IMRFO) algorithms to understand the impact of optimization techniques on model accuracy. In addition, the real-world application of the best-performing models, particularly the DNN (trainlm) and ANN (trainbr), holds significant potential for enhancing autonomous tractor operations in precision agriculture. These models can be integrated into real-time control systems to facilitate intelligent decision-making for tasks such as traction control, path planning, and terrain adaptation. However, successful deployment requires careful consideration of several practical factors. First, reliable and continuous data collection from onboard sensors (e.g., GPS, IMU, torque sensors) is critical, but such data are often affected by noise, drift, and environmental variability; preprocessing techniques such as filtering, normalization, and sensor fusion must therefore be applied to ensure model robustness. Second, computational constraints on embedded systems demand lightweight, optimized versions of these models to maintain real-time responsiveness without compromising accuracy; model pruning, quantization, or edge computing solutions may be required. Lastly, the system should adapt to dynamic field conditions by incorporating online learning or periodic retraining strategies using updated field data to maintain consistent performance over time. The developed models also need to be packaged into onboard commercial software to optimize the generation and utilization of tractive forces in wheeled robots under diverse soil and field conditions.