Introduction

Agriculture is one of the world’s oldest and most important industries1. With the rapid growth of the world’s population, the demand for food and employment is also rising1. Because the conventional methods used by farmers are no longer sufficient to meet this demand, new automated mechanisms are being developed1. Technologies such as artificial intelligence (AI) and machine learning (ML) are making their way into virtually every industry. Research is in progress to improve the quality and quantity of agricultural products by making them connected and intelligent through smart farming.

A Multi-Layer Perceptron (MLP) was used in1 to analyze the effect of temperature, pesticides, and other factors on sustainable agriculture and on farm-level economic conditions in Saudi Arabia. The same model was used to predict future crop yields in Saudi Arabia. An AI technique was used to evaluate the influence of environmental features and agro-technical criteria on crop yield prediction. Moreover, using Artificial Neural Networks (ANNs), an efficient MLP model was constructed to forecast crop yield precisely from environmental data while reducing the training and testing error1.

An attention-based random forest built with meta-learning, called MetaRF, was designed in2. With this design, new reagents were predicted quickly. In addition, to enhance learning performance, a sampling method based on dimensionality reduction was introduced to determine valuable samples to be learned with minimal error.

For crop yield prediction, a hybrid Convolutional Neural Network (CNN) and Deep Neural Network (DNN) model (Hybrid CNN-DNN) was proposed in3. XGBoost was used as an estimator to select essential features quickly and effectively. The CNN captured data dependencies and extracted pertinent information. Finally, the DNN performed feed-forward propagation to make accurate and timely predictions.

As noted in4, the evolution of technology has brought a considerable shift in many industries worldwide. AI has begun to play a significant part in everyday life, extending our perception and our ability to improve the environment around us. Numerous applications of AI in agriculture, specifically spraying, yield prediction, and weeding, were investigated.

An outline of recent research in AI-enabled agriculture, identifying the most prominent applications of artificial intelligence, was given in5. A comprehensive review of AI-enabled ML for forecasting crop yield, with particular emphasis on palm oil yield prediction, was presented in6. Researchers are working toward novel IoT techniques that help farmers use AI technology for crop protection, seed development, and fertilizers. A holistic survey of AI applications in the agricultural sector, such as machine learning and computer vision, was presented in7.

As described in8, the Indian economy depends chiefly on agriculture, which is the main source of income for the vast majority of Indian farmers. Agriculture remains a significant factor in the country’s financial development. Nevertheless, farmers are often unable to select crops for cultivation, predict market prices, and improve productivity. Many new agricultural technologies, such as AI, are being deployed to help farmers grow crops more efficiently and profitably. A review of how vegetation indices and environmental variables influence agricultural output, revealing research gaps through deep learning, was also carried out.

In9, another hybrid method was presented that targets the accuracy of crop yield prediction by taking environmental factors and management practices into account, using AI-enabled Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). However, precision was not covered. To address precision, AI and a family of ML algorithms were presented in10, together with a recommendation system to achieve precise results.

As noted in11, accurate crop yield prediction, aided by sophisticated and area-specific insight, is required to improve agricultural breeding and to safeguard against varied climatic conditions. An LSTM-based RNN was proposed to measure weekly weather parameters precisely. Nevertheless, these data have both spatial and temporal structure, which to a large extent degrades management performance.

In12, spatio-temporal semantic data management for improved interoperability was presented, and training and validation were analyzed using neural networks. However, resource efficiency was not analyzed. With the advent of AI, which has reorganized conventional agricultural mechanisms, improved crop productivity and quality were both ensured in13 by employing a distance vector hop positioning algorithm. Despite improvements in accuracy and precision, the prediction error rate was not addressed. A regression-based predictive model for corn and soybean fields was presented in14. This regression model not only reduced error but also improved accuracy considerably by selecting features both spatially and temporally.

According to15, innovations in agriculture help to increase farmland yield and relieve the market economy. Predicting the yield of a specific crop from selected pertinent features would increase food production. AI-enabled ML algorithms based on random forest were employed and, with the aid of forward feature selection, improved the accuracy score, laying the foundation for farmers to improve both economic and social stability.

Another method with a detailed performance analysis for selecting relevant features using AI-enabled ML techniques was designed in16. With this design, accuracy improved and error fell significantly. Climatic conditions play a major role in predicting crop yield. To analyze climatic conditions, eleven combinations of climate and geography were discussed in17. Although yield prediction was accurate, certain deficiencies remained; for example, the mapping between raw data and crop yield depends heavily on the features extracted. In18, a Deep Recurrent Q-Network method was presented for forecasting crop yield under varied climatic conditions. An ensemble of deep learning methods was proposed in19, and feature-selection-enabled methods using AI-enabled ML were investigated in20. Advanced ensemble machine learning methods were presented in21 to enhance predictive accuracy. However, specificity was not considered.

Objectives of this paper

The objectives of the AI-enabled Barilai–Blinder–Oaxaca–Bernoulli Deep Classifier (BBO-BDC) for paddy and rice crop yield prediction are discussed below.

  • To provide accurate and precise detection of rice and paddy crop yield, a novel AI-enabled Barilai–Blinder–Oaxaca–Bernoulli Deep Classifier (BBO-BDC) for crop yield prediction is proposed.

  • To improve the convergence speed, Barilai–Borwein Gradient Min–max Normalization-based preprocessing is applied.

  • To select relevant and pertinent features with fewer false positives and false negatives, Blinder–Oaxaca Statistical Decomposition-based feature selection is employed.

  • To classify crop yield with higher accuracy, an AI-enabled Bernoulli Deep Belief Network has been developed.

The novelty of this paper

The novelty of the proposed BBO-BDC for crop yield prediction is given below,

  • The proposed BBO-BDC is designed through preprocessing, feature selection, and crop yield forecast process to obtain precise crop yield prediction.

  • Barilai–Borwein Gradient Min–max Normalization is developed to perform preprocessing. The data in the four distinct input matrices are normalized using the min–max normalization function, and the Barilai–Borwein gradient function is used to enhance the convergence speed. In this way, missing values are eliminated.

  • Feature selection is carried out with the novel Blinder–Oaxaca Statistical Decomposition. It is employed to choose the most significant features with fewer false positives and false negatives.

  • An AI-enabled Bernoulli Deep Belief Network Classifier is used to perform classification with several layers. The Xavier Initialization function is used to perform weight initialization. The Bernoulli distribution function is employed to allocate neurons between the hidden and visible layers. Principal components are used to determine the total number of hidden-layer nodes. With this, rice and paddy crop yields are categorized with maximum accuracy.

Structure of manuscript

The manuscript is organized as follows. Crop yield prediction methods employing AI-enabled ML and DL are discussed in the section "Related works". The system model of BBO-BDC is provided in the section "Materials and methods", where the phases of BBO-BDC are defined in detail using figures and pseudo-code. The experimental setup and the comparison of BBO-BDC with similar prediction methods are provided in the section "Experimentation, results and analysis", followed by a quantitative, graphical analysis in the section "Discussion". Finally, the manuscript is summarized in the section "Conclusion".

Related works

AI methods for crop yield prediction

AI is an ingenious tool that simulates human intelligence and capabilities by machines, specifically digital equipment22. Applications of AI include analog-to-digital conversion, speech recognition, and expert systems that mimic perception, to name a few. The viability of agriculture is essential to ensuring food security for an ever-increasing population. The significance of AI and ML for the agriculture sector was investigated. Beyond selecting the correct crop, the chief means of increasing crop yield is an in-depth analysis of the soil that takes several meteorological factors into account. However, insufficient expertise in soil fertility remains the major reason for moderate crop production.

In23, taking factors such as slope, temperature, rainfall, soil moisture, and humidity into consideration, a method producing a list of crops to help farmers make efficient decisions was presented. In24, spatial and temporal methods employing ANN for crop yield forecasting were compared. DMA techniques were designed in25 to analyze and validate both present and future patterns for predicting crop yield. An AI-enabled ML system for crop monitoring, employing a random forest algorithm and focusing on optimization, was examined in26. A review of AI for crop yield forecasting employing ML was presented in27.

Learning methods for crop yield prediction

Deep Learning (DL) and Machine Learning (ML) are closely related. In28, a precise mechanism using an ML algorithm for crop selection toward maximal yield was presented. At present, researchers agree on using AI to improve the match between land and crop types in order to raise crop yield. However, several issues remain, such as limited crop phenotypic information and poor performance of AI techniques. In29, with maize as an example, both environmental climate and crop phenotypic features were considered using a graph neural network to assess crop suitability. With this approach, significant improvements in precision were observed.

A deep learning-improved remote sensing approach was designed in30 to predict rice yield. The approach handles the difficulties associated with processing large target datasets. Another statistical and machine learning technique focusing on climate and rice yield was presented in31. A comprehensive literature review of ML for crop yield prediction by extracting significant features was carried out in32.

As noted in33, the increasing global need for food owing to unprecedented population growth has resulted in food insecurity in certain populated areas such as Africa. Another major factor is climate change and its variability. An ML-based method for predicting the yield of six crops, such as rice and seed cotton, throughout the year was designed. Factors such as weather information, yields, and chemical and climatic information were merged to help both farmers and decision-makers predict crop yields annually. For this purpose, three different mechanisms were employed.

A gradient boosting method focusing on annual rice production in Bangladesh was proposed in34. Despite the demonstrated efficiency of ML techniques as a replacement for traditional statistical methods, their application to price forecasting remains contested. Different statistical and ML techniques were proposed in35 with the objective of obtaining accurate forecasts for pricing policies. A hybrid CNN and RNN method was designed in36 for wheat yield prediction with minimum error. Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) were developed in37 for generating data. However, large datasets were not considered. Machine learning and deep learning techniques for evaluating crop yield were reviewed in38. A comparison of existing methods and their drawbacks is given in Table 1.

Table 1 Comparison of existing methods.

Motivation

Crop yield prediction data can be used to predict the yields of different crops at different time periods accurately. For rice and paddy crop yield, AI-enabled deep neural network methods can be used. This is particularly important for predicting yield from factors such as rainfall, temperature, and pesticides. Although AI supports different application types, as discussed in the previous sections, it also suffers from overhead and convergence-speed issues, which in turn compromise the overall sensitivity and specificity of the prediction. Existing methods are deficient in overhead and convergence speed and are therefore also susceptible to storage issues. In addition, they lack fine-tuned feature selection. Hence, it is crucial to incorporate feature analysis in an AI-enabled environment in a way that overcomes the limitations of existing methods. This motivates us to design a new AI-enabled Barilai–Blinder–Oaxaca–Bernoulli Deep Classifier (BBO-BDC) for crop yield forecasting.

Materials and methods

Materials

Agriculture plays a notable part in the global economy. The science of training machines to learn and to generate models for early forecasts has been widely used over the past few decades. In addition, with the rapid growth of the human population, crop yield remains the key to smart agriculture. As a result, crop yield forecasting has been studied as a paramount agricultural issue over the past few decades39. Agricultural yield is governed by numerous factors, such as pesticides, rain, and temperature. The Crop Yield Prediction Dataset used in our work, obtained from https://www.kaggle.com/datasets/patelris/crop-yield-prediction-dataset, is divided into four distinct files, namely pesticide, rainfall, temperature, and yield, which are finally combined into a single yield data file (yield_df.csv). The pesticide file comprises the features domain, area, element, item, year, unit, and value, whereas the rainfall file consists of area, year, and average rainfall. The details of the Crop Yield Prediction Dataset are given in Table 2.

Table 2 Crop Yield Prediction Dataset description.

As given in Table 2, seven features are present in the pesticide CSV file, three features each in the temperature and rainfall CSV files, and 12 features in the yield CSV file.
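The combination of the separate files into a single yield data file can be sketched as a join on a shared key. The following is a minimal illustration only, assuming that records are matched on the (area, year) pair; the function name `merge_on_area_year`, the dictionary keys, and the toy records are hypothetical and are not taken from the dataset itself.

```python
def merge_on_area_year(yield_rows, rain_rows, temp_rows, pest_rows):
    """Inner-join four lists of records on the (area, year) key (assumed key)."""
    def index(rows, value_key):
        # Map each (area, year) pair to the single value of interest.
        return {(r["area"], r["year"]): r[value_key] for r in rows}

    rain = index(rain_rows, "average_rainfall")
    temp = index(temp_rows, "average_temperature")
    pest = index(pest_rows, "value")

    merged = []
    for row in yield_rows:
        key = (row["area"], row["year"])
        # Keep a record only when every source has an entry for this key.
        if key in rain and key in temp and key in pest:
            merged.append({
                "area": row["area"],
                "item": row["item"],
                "year": row["year"],
                "yield": row["value"],
                "average_rainfall": rain[key],
                "average_temperature": temp[key],
                "pesticides": pest[key],
            })
    return merged


# Toy records (hypothetical values, for illustration only).
yield_rows = [
    {"area": "A", "item": "Rice, paddy", "year": 1990, "value": 30000},
    {"area": "B", "item": "Rice, paddy", "year": 1990, "value": 25000},
]
rain_rows = [{"area": "A", "year": 1990, "average_rainfall": 1100.0}]
temp_rows = [{"area": "A", "year": 1990, "average_temperature": 24.5}]
pest_rows = [{"area": "A", "year": 1990, "value": 120.0}]

merged = merge_on_area_year(yield_rows, rain_rows, temp_rows, pest_rows)
```

In this sketch, area "B" is dropped because it has no rainfall, temperature, or pesticide entry; whether incomplete records are dropped or imputed in the actual pipeline is not stated in the source.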

Proposed methodology: AI-enabled Barilai–Blinder–Oaxaca–Bernoulli deep classifier (BBO-BDC)

Crop yield forecasting calls for a variety of decision criteria and techniques. Some techniques are used to identify the most predictive features for crop yield40, whereas others are used to produce the predictions themselves41. This section presents an AI-enabled deep-learning method for precise and early crop yield prediction. The proposed BBO-BDC is designed with Barilai–Borwein Gradient Min–max Normalization, Blinder–Oaxaca Statistical Decomposition, and an AI-enabled Bernoulli Deep Belief Network. The Barilai–Borwein Gradient Min–max Normalization-based preprocessing algorithm is employed to eliminate missing values. The Blinder–Oaxaca Statistical Decomposition-based feature selection algorithm is designed to select the pertinent features. The AI-enabled Bernoulli Deep Belief Network classification algorithm is used to provide accurate crop yield prediction. Figure 1 shows the structure of the AI-enabled Barilai–Blinder–Oaxaca–Bernoulli Deep Classifier (BBO-BDC) method.

Fig. 1 Structure of BBO-BDC method.

As shown in Fig. 1, the input dataset contains all crop-related details acquired from the Crop Yield Prediction Dataset. Preprocessing is first performed to eliminate the missing values in the input dataset using the Barilai–Borwein Gradient Min–max Normalization-based preprocessing algorithm. Pertinent features are then selected using the Blinder–Oaxaca Statistical Decomposition-based feature selection algorithm. Feature selection helps to achieve precise and accurate results by obtaining differences in the means of a dependent variable (i.e., dependent feature) and independent variable (i.e., independent feature) between groups (i.e., the different input vector matrices acquired from pesticides, rainfall, and temperature). Finally, the AI-enabled Bernoulli Deep Belief Network classification algorithm is employed for accurate and precise crop yield prediction.

System model

In this section, the system model for AI-enabled DL-based crop yield forecasting is introduced. The system model represents a process-oriented flow between four input matrices, i.e., pesticide, rainfall, temperature, and yield. In our work, these four CSV files are stored separately as matrices, as given below.

$$P=\left[\begin{array}{cccc}{P}_{11}& {P}_{12}& \dots & {P}_{1p}\\ {P}_{21}& {P}_{22}& \dots & {P}_{2p}\\ \dots & \dots & \dots & \dots \\ {P}_{i1}& {P}_{i2}& \dots & {P}_{ip}\end{array}\right], where\, i=7$$
(1)
$$R=\left[\begin{array}{cccc}{R}_{11}& {R}_{12}& \dots & {R}_{1r}\\ {R}_{21}& {R}_{22}& \dots & {R}_{2r}\\ \dots & \dots & \dots & \dots \\ {R}_{j1}& {R}_{j2}& \dots & {R}_{jr}\end{array}\right], where\, j=3$$
(2)
$$T=\left[\begin{array}{cccc}{T}_{11}& {T}_{12}& \dots & {T}_{1t}\\ {T}_{21}& {T}_{22}& \dots & {T}_{2t}\\ \dots & \dots & \dots & \dots \\ {T}_{k1}& {T}_{k2}& \dots & {T}_{kt}\end{array}\right], where\, k=3$$
(3)
$$Y=\left[\begin{array}{cccc}{Y}_{11}& {Y}_{12}& \dots & {Y}_{1y}\\ {Y}_{21}& {Y}_{22}& \dots & {Y}_{2y}\\ \dots & \dots & \dots & \dots \\ {Y}_{l1}& {Y}_{l2}& \dots & {Y}_{ly}\end{array}\right], where\, l=12$$
(4)

The above four input matrices, obtained separately for pesticide ‘\(P\)’, rainfall ‘\(R\)’, temperature ‘\(T\)’, and yield ‘\(Y\)’, form the input to the proposed crop yield prediction.

Barilai–Borwein gradient min–max normalization-based preprocessing

The input Crop Yield Prediction Dataset, with its four distinct input matrices, contains missing values, which must be eliminated at the preliminary stage. Preprocessing is used to eliminate those missing values42 from the input dataset. The proposed technique combines normalization with a multi-variable function in a method called Barilai–Borwein Gradient Min–max Normalization-based preprocessing.

As illustrated in Fig. 2, the crop yield prediction dataset is given as input, and preprocessed data are produced in a convergence speed-effective manner. Because the range of the raw data varies widely across the four distinct input matrices, crop yield prediction will not work properly without normalization43. The range of all features is therefore normalized so that each feature in the four input matrices contributes approximately proportionately to the final distance. Min–max normalization rescales each feature to the range ‘\(\left[0,1\right]\)’. The mathematical formulation is given in Eq. (5).

Fig. 2 Structure of Barilai–Borwein Gradient Min–max Normalization-based preprocessing model.

$${S}^{\prime}=\frac{S-Min\left(S\right)}{Max\left(S\right)-Min\left(S\right)}$$
(5)
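Equation (5) can be illustrated with a short sketch. The function name `min_max_normalize` and the sample rainfall values are hypothetical, and the handling of a constant feature (where the denominator of Eq. (5) is zero) is an assumption, since the source does not discuss that case.

```python
def min_max_normalize(column):
    """Rescale a list of numbers to [0, 1] following Eq. (5)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant feature: Eq. (5) would divide by zero.
        # Mapping every value to 0.0 is an assumption of this sketch.
        return [0.0 for _ in column]
    return [(s - lo) / (hi - lo) for s in column]


# Hypothetical rainfall values, for illustration only.
rainfall = [500.0, 1000.0, 1500.0]
scaled = min_max_normalize(rainfall)
print(scaled)  # [0.0, 0.5, 1.0]
```

Applying the same function to each feature column of the four input matrices puts all features on a common scale.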

Based on Eq. (5), the data in the four distinct input matrices are normalized so that every feature carries similar weight. Here ‘\(Min\left(S\right)\)’ and ‘\(Max \left(S\right)\)’ denote the minimum and maximum values, respectively. Because four distinct input matrices are used, a multi-variable function, the Barilai–Borwein gradient function, is applied to achieve good convergence speed. The Barilai–Borwein gradient function is mathematically stated as given below.

$$\gamma =\frac{{\left({S}_{N}^{\prime}-{S}_{N-1}^{\prime}\right)}^{T}\left[\nabla fun \left({S}_{N}^{\prime}\right)-\nabla fun \left({S}_{N-1}^{\prime}\right)\right]}{{\left\Vert \nabla fun \left({S}_{N}^{\prime}\right)-\nabla fun \left({S}_{N-1}^{\prime}\right)\right\Vert }^{2}}$$
(6)

From Eq. (6), using the Barilai–Borwein gradient step ‘\(\gamma\)’, convergence to a local minimum is assured. When the function ‘\(fun\)’ of the min–max normalized samples is convex, all local minima44,45 are global minima, and in this case the Barilai–Borwein gradient function converges to the global solution. With this, all missing values are eliminated from further processing. The pseudo-code representation of Barilai–Borwein Gradient Min–max Normalization-based preprocessing is given below.

Algorithm 1 Barilai–Borwein Gradient Min–max Normalization-based preprocessing

As given in Algorithm 1, the four distinct input vector matrices form the samples from the crop yield prediction dataset, and each input vector matrix has a different number of features. The min–max normalization function is first applied to these input vector matrices to return normalized values. The Barilai–Borwein gradient function is then applied to the normalized values to improve the convergence speed. This improves not only the convergence speed but also the true positive and true negative rates significantly.
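The role of the gradient step can be illustrated with a small sketch of gradient descent that uses the standard Barzilai–Borwein step size (written Barilai–Borwein in this paper), i.e., the inner product of the change in iterates and the change in gradients, divided by the squared norm of the change in gradients. The objective ‘\(fun\)’ is not specified in the source, so a simple convex quadratic is assumed here; the names `grad`, `bb_step`, and `minimize`, the first step size, and the tolerance are all hypothetical.

```python
def grad(x):
    """Gradient of the assumed convex test function
    f(x) = (x0 - 1)^2 + 5 * (x1 + 2)^2, whose minimum is at (1, -2)."""
    return [2.0 * (x[0] - 1.0), 10.0 * (x[1] + 2.0)]


def bb_step(x_prev, x_curr, g_prev, g_curr):
    """Barzilai-Borwein step size: (dx . dg) / (dg . dg)."""
    dx = [a - b for a, b in zip(x_curr, x_prev)]
    dg = [a - b for a, b in zip(g_curr, g_prev)]
    denom = sum(d * d for d in dg)
    if denom == 0.0:
        return 0.0  # gradients did not change; take no further step
    return sum(a * b for a, b in zip(dx, dg)) / denom


def minimize(x0, first_step=0.1, iterations=50, tol=1e-12):
    """Gradient descent whose step size is recomputed by bb_step each round."""
    x_prev, g_prev = x0, grad(x0)
    # The first iterate needs a fixed step because no history exists yet.
    x_curr = [xi - first_step * gi for xi, gi in zip(x_prev, g_prev)]
    for _ in range(iterations):
        g_curr = grad(x_curr)
        if sum(g * g for g in g_curr) < tol:
            break  # gradient is numerically zero: converged
        gamma = bb_step(x_prev, x_curr, g_prev, g_curr)
        x_next = [xi - gamma * gi for xi, gi in zip(x_curr, g_curr)]
        x_prev, g_prev, x_curr = x_curr, g_curr, x_next
    return x_curr


solution = minimize([0.0, 0.0])
```

On this quadratic the step size adapts to the curvature from the second iteration onward, which is the property that motivates its use for faster convergence.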

Blinder–Oaxaca Statistical Decomposition-based feature selection

Feature selection techniques help to identify the features that are pertinent46 to crop yield forecasting algorithms. The well-defined feature subsets chosen are then used for crop yield prediction. Compared with the full feature set47, feature subsets provide fine-grained results with less computational time. Moreover, selecting a fine-grained subset improves accuracy by reducing overfitting. Feature selection algorithms48,49 serve to identify the essential features that are closely tied to crop yield prediction. In this section, Blinder–Oaxaca Statistical Decomposition-based feature selection (Fig. 3) is designed to obtain fine-grained results with less computational time.

Fig. 3 Structure of Blinder–Oaxaca Statistical Decomposition-based feature selection.

In Fig. 3, with the preprocessed data as input, the Blinder–Oaxaca decomposition function is employed to measure how closely two features are associated with one another. Among the distinct correlation measures, it is the one most frequently employed to identify dependent features or variables. We determine the Blinder–Oaxaca decomposition function between the most significant feature and crop yields to explore the influence of climatic changes on crop yields. The Blinder–Oaxaca decomposition, a statistical function, describes the difference in the means of a dependent variable (i.e., dependent feature) between two groups (i.e., two input vector matrices).

The function decomposes this gap into two parts: the part due to differences in the mean values of the independent variable (i.e., independent feature) within the input matrices, and the part due to group differences in the effects of the independent variable. Using Eqs. (7) and (8), the association between two input vector matrices, average temperature ‘\(AT\)’ and crop yields ‘\(CY\)’, is given below.

$$In \left({FS}_{{AT}_{i}}\right)={X}_{{AT}_{i}}{\beta }_{AT}+{\mu }_{{AT}_{i}}$$
(7)
$$In \left({FS}_{{CY}_{i}}\right)={X}_{{CY}_{i}}{\beta }_{CY}+{\mu }_{{CY}_{i}}$$
(8)

In Eqs. (7) and (8), ‘\({X}_{{AT}_{i}}\)’ represents the vector of explanatory variables such as year and country, and ‘\({X}_{{CY}_{i}}\)’ denotes the vector of explanatory variables such as domain code, domain, area code, area, element code, element, item code, item, year code, year, unit, and value. Similarly, ‘\({\beta }_{AT}\)’ and ‘\({\beta }_{CY}\)’ represent the vectors of coefficients, and ‘\({\mu }_{{AT}_{i}}\)’ and ‘\({\mu }_{{CY}_{i}}\)’ denote the error terms, respectively. Let ‘\({b}_{AT}\)’ and ‘\({b}_{CY}\)’ denote the regression estimates of ‘\({\beta }_{AT}\)’ and ‘\({\beta }_{CY}\)’; then the average value of residuals (i.e., features selected with respect to average temperature and crop yield) is mathematically formulated as given below.

$${FS}_{ACY}=mean\left(In\left({FS}_{{AT}_{i}}\right)\right)-mean\left(In\left({FS}_{{CY}_{i}}\right)\right)={b}_{AT}\left(mean\left({FS}_{AT}\right)-mean\left({FS}_{CY}\right)\right)+mean\left({FS}_{CY}\right)\left({b}_{AT}-{b}_{CY}\right)$$
(9)
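The two-part structure of Eq. (9) can be checked numerically. The sketch below is a minimal illustration of a standard two-fold Blinder–Oaxaca decomposition with a single explanatory variable per group; it fits each group by ordinary least squares with an explicit intercept, which is an assumption (in Eq. (9) the intercept can be regarded as absorbed into the vector of explanatory variables). The function names and the toy data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def ols(x, y):
    """Fit y = a + b * x by ordinary least squares; return (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def blinder_oaxaca(x_a, y_a, x_b, y_b):
    """Two-fold decomposition of the gap mean(y_a) - mean(y_b)."""
    a_a, b_a = ols(x_a, y_a)
    a_b, b_b = ols(x_b, y_b)
    # Explained part: difference in mean characteristics,
    # weighted by group A's slope.
    explained = b_a * (mean(x_a) - mean(x_b))
    # Unexplained part: difference in coefficients (intercept and slope),
    # evaluated at group B's mean characteristics.
    unexplained = (a_a - a_b) + mean(x_b) * (b_a - b_b)
    return explained, unexplained


# Toy groups (hypothetical values, for illustration only).
x_a, y_a = [1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8]
x_b, y_b = [2.0, 3.0, 4.0, 5.0], [1.0, 1.6, 1.9, 2.5]

explained, unexplained = blinder_oaxaca(x_a, y_a, x_b, y_b)
gap = mean(y_a) - mean(y_b)
```

Because a least-squares line with an intercept passes through the group means, the two parts sum to the raw gap in means, mirroring the two terms on the right-hand side of Eq. (9).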

Next, we calculate the association between two input vector matrices, pesticides in tonnes ‘\(PT\)’ and crop yields ‘\(CY\)’, as given below.

$$In \left({FS}_{{PT}_{i}}\right)={X}_{{PT}_{i}}{\beta }_{PT}+{\mu }_{{PT}_{i}}$$
(10)
$$In \left({FS}_{{CY}_{i}}\right)={X}_{{CY}_{i}}{\beta }_{CY}+{\mu }_{{CY}_{i}}$$
(11)

In Eqs. (10) and (11), ‘\({X}_{{PT}_{i}}\)’ represents the vector of explanatory variables such as domain, area, element, item, year, unit, and value, and ‘\({X}_{{CY}_{i}}\)’ denotes the vector of explanatory variables for crop yield prediction. Here ‘\({\beta }_{PT}\)’ and ‘\({\beta }_{CY}\)’ represent the vectors of coefficients, and ‘\({\mu }_{{PT}_{i}}\)’ and ‘\({\mu }_{{CY}_{i}}\)’ denote the error terms, respectively. Let ‘\({b}_{PT}\)’ and ‘\({b}_{CY}\)’ denote the regression estimates of ‘\({\beta }_{PT}\)’ and ‘\({\beta }_{CY}\)’; then the average value of residuals (i.e., features selected with respect to pesticides and crop yield) is mathematically formulated as given below.

$${FS}_{PCY}=mean\left(In\left({FS}_{{PT}_{i}}\right)\right)-mean\left(In\left({FS}_{{CY}_{i}}\right)\right)={b}_{PT}\left(mean\left({FS}_{PT}\right)-mean\left({FS}_{CY}\right)\right)+mean\left({FS}_{CY}\right)\left({b}_{PT}-{b}_{CY}\right)$$
(12)

Finally, the association between two input vector matrices, average rainfall ‘\(AR\)’ and crop yield ‘\(CY\)’ is mathematically represented as given below.

$$In \left({FS}_{{AR}_{i}}\right)={X}_{{AR}_{i}}{\beta }_{AR}+{\mu }_{{AR}_{i}}$$
(13)
$$In \left({FS}_{{CY}_{i}}\right)={X}_{{CY}_{i}}{\beta }_{CY}+{\mu }_{{CY}_{i}}$$
(14)

In Eqs. (13) and (14), ‘\({X}_{{AR}_{i}}\)’ represents the vector of explanatory variables such as area, year and average rainfall, and ‘\({X}_{{CY}_{i}}\)’ denotes the vector of explanatory variables for crop yield prediction. Here ‘\({\beta }_{AR}\)’ and ‘\({\beta }_{CY}\)’ represent the vectors of coefficients, and ‘\({\mu }_{{AR}_{i}}\)’ and ‘\({\mu }_{{CY}_{i}}\)’ denote the error terms, respectively. Let ‘\({b}_{AR}\)’ and ‘\({b}_{CY}\)’ denote the regression estimates of ‘\({\beta }_{AR}\)’ and ‘\({\beta }_{CY}\)’; then the average value of residuals (i.e., features selected with respect to rainfall and crop yield) is mathematically formulated as given below.

$${FS}_{ARCY}=mean\left(In\left({FS}_{{AR}_{i}}\right)\right)-mean\left(In\left({FS}_{{CY}_{i}}\right)\right)={b}_{AR}\left(mean\left({FS}_{AR}\right)-mean\left({FS}_{CY}\right)\right)+mean\left({FS}_{CY}\right)\left({b}_{AR}-{b}_{CY}\right)$$
(15)
$$FS={FS}_{ACY} \cup { FS}_{PCY} \cup {FS}_{ARCY}$$
(16)

Finally, Eq. (16) gives the resultant set of features selected based on the crop yield differential across three different groups, i.e., pesticides in tonnes, average rainfall, and average temperature. The pseudo-code representation of Blinder–Oaxaca Statistical Decomposition-based feature selection is given below.

Algorithm 2 Blinder–Oaxaca Statistical Decomposition-based feature selection

In Algorithm 2, with the objective of reducing false positives and false negatives, relevant and pertinent features are retained whereas irrelevant features are discarded from further processing. To this end, the Blinder–Oaxaca Statistical Decomposition function is applied to the preprocessed data. The function identifies and quantifies the separate contributions of group differences (i.e., pesticide, rainfall and temperature with respect to crop yield) in quantifiable features such as domain, area, element, item, year, unit, and value. This in turn helps to minimize false positives and false negatives considerably. The features selected by the Blinder–Oaxaca Statistical Decomposition-based feature selection algorithm are listed in Table 3.

Table 3 Relevant features selected.

With the above relevant features selected (i.e., area, item, year, unit, value, average_rainfall, average_temperature) crop yield prediction is discussed in the next section.

AI-enabled Bernoulli Deep Belief Network classifier for crop yield prediction

Farmers who employ AI-powered systems obtain accurate and precise guidance on the optimal management of water and nutrients, the optimal harvesting of crops, and so on. AI has the potential to drive an agricultural revolution at a time when the world needs to produce more food with minimal resources. However, accurate and precise crop yield prediction remains a concern. In this work, an AI-enabled Bernoulli Deep Belief Network Classifier for crop yield prediction (i.e., rice and paddy for years between 1960 and 1980) is designed. Figure 4 shows the structure of the AI-enabled Bernoulli Deep Belief Network Classifier.

Fig. 4 Structure of AI-enabled Bernoulli Deep Belief Network Classifier.

As illustrated in Fig. 4, the preprocessed, feature-selected samples form the input to the input layer. Following this, three hidden layers are employed in our work, with which the actual output of crop yield for rice and paddy for the years between 1960 and 1980 is obtained. The AI-enabled Bernoulli Deep Belief Network represents an arrangement of unsupervised networks, where each hidden layer serves as the visible layer for the next layer. Each network is modeled with a visible input layer and a hidden layer, with connections between the layers but not within a layer. With ‘\(H\)’ and ‘\(V\)’ representing the hidden and visible units respectively, the AI-enabled Bernoulli Deep Belief Network is mathematically formulated as given below.

$$\theta =\left\{W, VE, HE\right\}, \quad \text{where } VE=\left\{{p}_{i}\in {R}^{m}\right\} \text{ and } HE=\left\{{q}_{j}\in {R}^{n}\right\}$$
(17)

From the above Eq. (17), ‘\(W\)’, ‘\(VE\)’ and ‘\(HE\)’ represent the weights initialized using the Xavier Initialization function, the visible elements and the hidden elements respectively. In addition, the threshold of the ‘\(i\)-th’ visible unit (i.e., for rice and paddy) is governed by ‘\({p}_{i}\)’ and the threshold of the ‘\(j\)-th’ hidden unit (i.e., year between 1960 and 1980) is governed by ‘\({q}_{j}\)’. A total of ‘\(5000\)’ neurons are assigned to each layer, with a learning rate of ‘\(0.03\%\)’.

The objective of using the Xavier Initialization function is to initialize the weights [50, 51] in such a manner that the deviation of the activations stays the same across every layer. This constant deviation helps to prevent the gradient from vanishing. Weight initialization is performed using the Xavier Initialization function as given below.

$${W}_{ij}=UD\left(S\left[FS\right]\right)\left[-\frac{\sqrt{6}}{\sqrt{{Size}_{PL}+{Size}_{CL}}},\frac{\sqrt{6}}{\sqrt{{Size}_{PL}+{Size}_{CL}}}\right]$$
(18)
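A minimal sketch of this initialization is given below, assuming the standard Xavier (Glorot) uniform form in which the weights are drawn from a symmetric interval with bound \(\sqrt{6}/\sqrt{{Size}_{PL}+{Size}_{CL}}\). The function name and the layer sizes in the example are hypothetical (7 inputs for the selected features and 16 hidden units, chosen only to keep the example small).

```python
import numpy as np

def xavier_uniform(size_prev, size_curr, rng=None):
    """Draw a (size_prev x size_curr) weight matrix from
    U(-limit, +limit) with limit = sqrt(6) / sqrt(size_prev + size_curr)."""
    rng = np.random.default_rng() if rng is None else rng
    limit = np.sqrt(6.0) / np.sqrt(size_prev + size_curr)
    return rng.uniform(-limit, limit, size=(size_prev, size_curr))

# Hypothetical sizes: 7 visible units, 16 hidden units.
W = xavier_uniform(7, 16, np.random.default_rng(0))
```

Because the bound shrinks as the two layer sizes grow, the spread of the activations stays roughly comparable from layer to layer, which is the property the text attributes to this initializer.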

From the above Eq. (18), with a uniform distribution over the feature-selected samples given as input ‘\(UD\left(S\left[FS\right]\right)\)’, the weights at every layer are initialized using the size of the preceding layer ‘\({Size}_{PL}\)’ and the size of the current layer ‘\({Size}_{CL}\)’. Following this, the neurons are distributed between the hidden and visible layers based on the Bernoulli distribution function, and the energy equation is mathematically formulated as given below.

$$Energy\left(V,H\left[\theta \right]\right)=-\sum_{i=1}^{m}{p}_{i}{V}_{i}-\sum_{j=1}^{n}{q}_{j}{H}_{j}-\sum_{i=1}^{m}\sum_{j=1}^{n}{V}_{i}{W}_{ij}{H}_{j}$$
(19)
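The energy of a single visible/hidden configuration can be evaluated directly, as in the sketch below. It assumes the standard restricted Boltzmann machine form, pairing each visible threshold \({p}_{i}\) with its unit \({V}_{i}\) and each hidden threshold \({q}_{j}\) with its unit \({H}_{j}\). The function name and the small numeric values are hypothetical.

```python
import numpy as np

def rbm_energy(v, h, W, p, q):
    """Energy = -sum_i p_i v_i - sum_j q_j h_j - sum_ij v_i W_ij h_j."""
    return -(p @ v) - (q @ h) - (v @ W @ h)

# Hypothetical configuration: 3 visible units, 2 hidden units.
v = np.array([1.0, 0.0, 1.0])
h = np.array([1.0, 1.0])
W = np.array([[0.5, -0.2],
              [0.1,  0.3],
              [-0.4, 0.2]])
p = np.array([0.1, 0.2, 0.3])   # visible thresholds
q = np.array([-0.1, 0.4])       # hidden thresholds

E = rbm_energy(v, h, W, p, q)   # -(0.4) - (0.3) - (0.1) = -0.8
```

Lower energy corresponds to a more probable joint configuration of visible and hidden units.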

The energy function in the above Eq. (19) comprises the energy of the visible nodes ‘\({p}_{i}{V}_{i}\)’, the hidden nodes ‘\({q}_{j}{H}_{j}\)’ and the weights connecting the hidden and visible nodes ‘\({V}_{i}{W}_{ij}{H}_{j}\)’. Following this, the total hidden layer employing Principal Components (i.e., features selected) is mathematically stated as given below.

$$\sum Var\left({PC}_{i}\right)=\sum Complexity\left({H}_{i}\right)$$
(20)

From the above Eq. (20), ‘\({PC}_{i}\)’ denotes the principal components (i.e., the ‘\(7\)’ features selected as input in our work correspond to principal components) and ‘\({H}_{i}\)’ denotes the hidden layer. With only three principal components, a total of three hidden layers were employed in our work. Then, the probability of a neuron being active in either the visible or the hidden layer is mathematically formulated as given below.

$$Prob\left({H}_{j}=1|V\right)=\sigma \left({q}_{j}+\sum_{i}{V}_{i}{W}_{ij}\right)$$
(21)
$$Prob\left({V}_{i}=1|H\right)=\sigma \left({p}_{i}+\sum_{j}{H}_{j}{W}_{ij}\right)$$
(22)
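The two conditional probabilities can be sketched as follows, assuming \(\sigma\) is the logistic sigmoid and that binary unit states are drawn from a Bernoulli distribution with these probabilities. The function names and the numeric values are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def prob_hidden_given_visible(v, W, q):
    """P(H_j = 1 | V) = sigmoid(q_j + sum_i V_i W_ij)."""
    return sigmoid(q + v @ W)

def prob_visible_given_hidden(h, W, p):
    """P(V_i = 1 | H) = sigmoid(p_i + sum_j H_j W_ij)."""
    return sigmoid(p + W @ h)

def sample_bernoulli(prob, rng):
    """Draw binary states, each unit being 1 with its own probability."""
    return (rng.random(prob.shape) < prob).astype(float)

# Hypothetical configuration: 3 visible units, 2 hidden units.
rng = np.random.default_rng(0)
v = np.array([1.0, 0.0, 1.0])
W = np.array([[0.5, -0.2],
              [0.1,  0.3],
              [-0.4, 0.2]])
p = np.array([0.1, 0.2, 0.3])
q = np.array([-0.1, 0.4])

ph = prob_hidden_given_visible(v, W, q)   # sigmoid([0.0, 0.4])
h_sample = sample_bernoulli(ph, rng)
pv = prob_visible_given_hidden(h_sample, W, p)
```

Note that the sum in the hidden-unit probability runs over the visible index \(i\), while the sum in the visible-unit probability runs over the hidden index \(j\).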

With the results of the above Eqs. (21) and (22), the rice and paddy crop yield forecast for the years between 1960 and 1980 is retrieved. The pseudo-code representation of the AI-enabled Bernoulli Deep Belief Network Classifier is given below.

Algorithm 3 AI-enabled Bernoulli Deep Belief Network Classifier.

As given in the above algorithm, an AI-enabled Bernoulli Deep Belief Network is designed with the objective of generating accurate and precise results as output. First, the preprocessed, feature-selected samples are provided as input to the visible layer. Following this, the Xavier Initialization function is used to initialize the weights so that the variation of the activations stays uniform, therefore ensuring optimal distribution. Next, the Bernoulli distribution function is applied with the purpose of distributing neurons equivalently. Finally, the total hidden layer nodes are determined using Principal Components (i.e., features selected), therefore providing an accurate and precise means of crop yield prediction.
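The stacking described above, in which each hidden layer acts as the visible layer of the next, can be sketched as a simple upward pass through three hidden layers. This is only an illustration under stated assumptions: the layer sizes are hypothetical (the text assigns 5000 neurons per layer; 16 is used here to keep the example light), the hidden thresholds start at zero, and training and the final prediction layer are omitted because the text does not specify them.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def xavier_uniform(n_in, n_out, rng):
    limit = np.sqrt(6.0) / np.sqrt(n_in + n_out)
    return rng.uniform(-limit, limit, size=(n_in, n_out))

def build_dbn(layer_sizes, rng):
    """One (weights, hidden thresholds) pair per pair of adjacent layers."""
    return [(xavier_uniform(a, b, rng), np.zeros(b))
            for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]

def propagate_up(v, layers):
    """Feed activations upward; each hidden layer's output becomes
    the visible input of the next layer."""
    activation = v
    for W, q in layers:
        activation = sigmoid(q + activation @ W)
    return activation

rng = np.random.default_rng(0)
layer_sizes = [7, 16, 16, 16]      # 7 selected features, 3 hidden layers (sizes assumed)
layers = build_dbn(layer_sizes, rng)
batch = rng.random((4, 7))         # 4 hypothetical normalized samples
top = propagate_up(batch, layers)  # activation probabilities of the last hidden layer
```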

Experimentation, results and analysis

Experimental evaluation

This section describes the simulation results used to validate the proposed method, called the AI-enabled Barilai–Blinder–Oaxaca–Bernoulli Deep Classifier (BBO-BDC), for crop yield prediction. The experiment uses the Crop Yield forecast dataset. The dataset consists of test files, training files, and sample submission files, each with a different number of samples. The proposed BBO-BDC method has been implemented in Python. The simulation results of the proposed BBO-BDC method and the existing methods, Multi-Layer Perceptron (MLP) [1], attention-based random forest with meta-learning (MetaRF) [2], and Hybrid CNN-DNN [3], are detailed below in terms of different performance parameters: sensitivity, specificity, accuracy, convergence speed, and overhead. To ensure a fair comparison between the proposed BBO-BDC and the existing methods [1], [2] and [3], similar samples from the Crop Yield Prediction dataset are applied and the results are averaged over 10 different simulation runs. Afghanistan is the country considered for crop yield investigation, and the rice and paddy crops are forecasted in this work.

Experimental parameters

To measure sensitivity, specificity and accuracy, four performance factors are involved, namely true positive ‘\(TP\)’, true negative ‘\(TN\)’, false positive ‘\(FP\)’ and false negative ‘\(FN\)’. True positive refers to a crop that attains a certain yield percentage and for which the test gives the same result. True negative refers to a crop that does not attain a certain yield percentage and for which the test gives a negative result. False positive refers to a crop that does not attain a certain yield but for which the test gives a positive result. False negative refers to a crop that attains a certain yield but for which the test gives a negative result. Sensitivity [52] measures the ability of a test (i.e., crop yield prediction) to correctly identify crops with a specific yield.

$$Sen=\frac{TP}{TP+FN}$$
(23)

Specificity measures the ability of a test (i.e., crop yield prediction) to correctly identify crops without a significant yield.

$$Spe=\frac{TN}{TN+FP}$$
(24)

Accuracy [53] is a statistical measure of how well a classification test correctly identifies or excludes a given crop yield condition. It is the ratio of correct forecasts (i.e., both true positives and true negatives) to the total number of sample cases considered for simulation. Accuracy is mathematically formulated as given below.

$$Acc=\frac{TP+TN}{TP+TN+FP+FN}$$
(25)
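The three measures can be computed from the four confusion-matrix counts as in the sketch below. Specificity is taken here as the true-negative rate \(TN/(TN+FP)\), which matches the verbal definition given above. The function names and the counts in the example are hypothetical and are not taken from the experiments.

```python
def sensitivity(tp, fn):
    """Share of positive cases that the test identifies correctly."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Share of negative cases that the test identifies correctly."""
    return tn / (tn + fp)

def accuracy(tp, tn, fp, fn):
    """Share of all cases that the test classifies correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts for a 200-sample test set.
tp, tn, fp, fn = 90, 80, 20, 10
sen = sensitivity(tp, fn)        # 90 / 100
spe = specificity(tn, fp)        # 80 / 100
acc = accuracy(tp, tn, fp, fn)   # 170 / 200
```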

Convergence speed refers to the time taken to carry out the entire process, i.e., the crop yield prediction. The convergence speed is mathematically stated as given below.

$$CS=\sum_{i=1}^{n}{S}_{i}*Time \left(Prediction\right)$$
(26)

From the above Eq. (26), convergence speed ‘\(CS\)’ is evaluated employing the sample instances ‘\({S}_{i}\)’ and the actual time consumed in prediction ‘\(Time \left(Prediction\right)\)’ (i.e., involving preprocessing, feature selection and classification). It is measured in milliseconds (ms). Finally, overhead measures the memory consumed in the prediction process and is mathematically given as below.

$$Overhead=\sum_{i=1}^{n}{S}_{i}*Mem \left(Prediction\right)$$
(27)

From the above Eq. (27), the overhead ‘\(Overhead\)’ is measured based on the samples ‘\({S}_{i}\)’ involved in the simulation and the memory utilized in the actual prediction process ‘\(Mem \left(Prediction\right)\)’. It is measured in kilobytes (KB).
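One way to obtain such measurements in Python is sketched below, using the standard-library `time` and `tracemalloc` modules. This is an assumption about tooling, since the text does not say how time and memory were recorded. Peak traced memory is used as a stand-in for the memory term, and `predict` is a hypothetical placeholder for the full pipeline (preprocessing, feature selection and classification).

```python
import time
import tracemalloc

def measure_prediction(predict, samples):
    """Return (outputs, elapsed time in ms, peak traced memory in KB)."""
    # Timed pass, run without memory tracing so the timing is not inflated.
    start = time.perf_counter()
    outputs = [predict(s) for s in samples]
    elapsed_ms = (time.perf_counter() - start) * 1000.0

    # Separate traced pass to record peak Python memory allocation.
    tracemalloc.start()
    for s in samples:
        predict(s)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return outputs, elapsed_ms, peak_bytes / 1024.0

# Hypothetical stand-in predictor and samples.
def predict(sample):
    return sum(sample) > 1.0

samples = [(0.2, 0.3), (0.9, 0.4), (0.6, 0.6)]
outputs, elapsed_ms, peak_kb = measure_prediction(predict, samples)
```

Averaging these values over repeated runs, as done in the experiments with 10 simulation runs, reduces the influence of timing noise.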

Results

Performance analysis of sensitivity, specificity, and accuracy with and without feature selection

In this section, the performance analysis of sensitivity, specificity, and accuracy with and without feature selection is discussed in detail. Table 4 shows the comparison between the proposed AI-enabled Barilai–Blinder–Oaxaca–Bernoulli Deep Classifier (BBO-BDC) and the existing methods, Multi-Layer Perceptron (MLP) [1], attention-based random forest with meta-learning (MetaRF) [2] and Hybrid CNN-DNN [3]. The outcomes show that, with the relevant features selected, the BBO-BDC method achieves better accuracy, sensitivity and specificity than the existing methods [1], [2], and [3].

Table 4 Comparison of accuracy, sensitivity, specificity of different methods for crop yield prediction both with and without feature selection.

Table 4 lists the accuracy, sensitivity, and specificity of the different methods. To guarantee a fair comparison among the proposed BBO-BDC and the existing methods, MLP [1], MetaRF [2], and Hybrid CNN-DNN [3], similar samples from the Crop Yield Prediction dataset are employed, the results are averaged over 10 different simulation runs, and the same metrics are used for all methods.

Figure 5 depicts graphical representations of accuracy, sensitivity and specificity using the proposed BBO-BDC and the existing methods, Multi-Layer Perceptron (MLP) [1], attention-based random forest with meta-learning (MetaRF) [2], and Hybrid CNN-DNN [3]. To ensure a fair comparison, 10,000 samples were employed for each method. With feature selection, the number of true positives using the proposed technique was found to be 1865, whereas using the existing [1], [2] and [3] it was observed to be 1800, 1750 and 1725 respectively. In a similar manner, with feature selection, the number of false negatives using the proposed method was 180, whereas it was 200, 222 and 250 using [1], [2] and [3] respectively.

Fig. 5 Graphical representations of sensitivity, specificity and accuracy.

As a result, the overall sensitivity with feature selection was found to be 91.19%, 90%, 88.74% and 87.34% for the BBO-BDC method, [1], [2] and [3] respectively. Without feature selection, the true positive and false negative counts were 1845 and 195 using the proposed BBO-BDC method, 1750 and 215 using [1], 1720 and 237 using [2], and 1700 and 265 using [3]; the overall sensitivity was therefore 90.44%, 89.50%, 87.88% and 86.51% respectively.

In a similar manner, the specificity with feature selection was found to be 87.50% for the BBO-BDC method, 81.25% for [1], 75% for [2] and 72.04% for [3], whereas without feature selection it was 85%, 77.5%, 72.5% and 71.83% respectively.

With feature selection, the accuracy was found to be 88.25% using the proposed method, compared with 83% for [1], 77.7% for [2] and 75.14% for [3]. Without feature selection, the accuracy was found to be 85.82% using the proposed method, compared with 79.77% for [1], 75.52% for [2] and 74.22% for [3].

With feature selection, the sensitivity of the BBO-BDC method is improved by 2%, 3% and 4% compared with [1], [2] and [3] respectively; without feature selection, it is improved by 2%, 3% and 5%. The specificity of the BBO-BDC method is enhanced by 8%, 17% and 21% with feature selection and by 10%, 17% and 18% without feature selection compared with [1], [2] and [3]. The accuracy of the BBO-BDC method is enhanced by 6%, 14% and 17% with feature selection and by 8%, 14% and 16% without feature selection compared with [1], [2] and [3].
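The text does not state how these improvement percentages were derived. One reading, offered here as an assumption, is a relative gain over each baseline, \((proposed - baseline)/baseline \times 100\). The sketch below applies that formula to the specificity values reported above with feature selection; the function name is hypothetical.

```python
def relative_gain_percent(proposed, baseline):
    """Relative improvement of `proposed` over `baseline`, in percent."""
    return (proposed - baseline) / baseline * 100.0

# Specificity with feature selection, as reported in the text (%).
proposed_specificity = 87.50
baseline_specificity = {
    "MLP [1]": 81.25,
    "MetaRF [2]": 75.00,
    "Hybrid CNN-DNN [3]": 72.04,
}

gains = {name: relative_gain_percent(proposed_specificity, value)
         for name, value in baseline_specificity.items()}
rounded = {name: round(gain) for name, gain in gains.items()}
```

Under this reading, the rounded gains for specificity come out as 8, 17 and 21, in line with the figures quoted for [1], [2] and [3].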

From these results, accuracy, sensitivity and specificity were found to be comparatively better using the BBO-BDC method than [1], [2] and [3]. The enhancement was owing to the application of the AI-enabled Bernoulli Deep Belief Network Classifier for crop yield prediction. Here, the preprocessed, feature-selected samples were given as input to the visible layer. Next, weight initialization was performed using the randomized Xavier Initialization function instead of a threshold, which ensured optimal distributions between the input and hidden layers. Also, neurons were distributed equivalently by means of the Bernoulli distribution function, which in turn helped to minimize the false positive and false negative samples. This in turn improved the overall sensitivity, specificity and accuracy significantly.

Performance analysis on convergence speed and overhead with and without feature selection

In this section, the convergence speed and overhead values obtained using the proposed BBO-BDC and the three existing methods, Multi Layer Perceptron (MLP) [1], attention-based random forest with meta-learning (MetaRF) [2] and Hybrid CNN-DNN [3], with and without feature selection are presented. Table 5 tabulates the validation results of the proposed BBO-BDC method and the compared methods [1], [2] and [3] for all ten distinct samples.

Table 5 Comparison of convergence speed and overhead using different methods for crop yield prediction both with and without feature selection.

Figure 6 illustrates the convergence speed and overhead values obtained by the proposed BBO-BDC method and the compared methods, Multi Layer Perceptron (MLP) [1], attention-based random forest with meta-learning (MetaRF) [2] and Hybrid CNN-DNN [3], on differing samples over 10 distinct simulation runs, both with and without feature selection. From the figure it is inferred that the BBO-BDC method gave better results in terms of both convergence speed and overhead with feature selection than without feature selection. However, the other compared methods [1], [2] and [3], in view of their lower true positive and true negative counts, may not work well with large numbers of samples. The improvement in terms of convergence speed and overhead was owing to the application of Blinder–Oaxaca Statistical Decomposition. First, convergence speed was addressed by the Barilai–Borwein Gradient Min–max Normalization-based preprocessing model. Here, min–max normalization along with the Barilai–Borwein gradient function was applied to address missing data, even in the case of a multi-variable function involving different input vector matrices. This reduced the convergence speed (i.e., the time consumed) of the BBO-BDC method by 27% compared with [1] and [2], and by 13% compared with [3]. In addition, by using the Blinder–Oaxaca Statistical Decomposition function, a fine-grained feature subset was selected. This in turn reduced the overhead incurred in crop yield prediction using the BBO-BDC method by 42%, 21% and 12% compared with [1], [2] and [3] respectively.

Fig. 6 Graphical representations of convergence speed and overhead.

Discussion

In this study, the proposed BBO-BDC method and the existing MLP [1], MetaRF [2], and Hybrid CNN-DNN [3] methods are evaluated on the Crop Yield forecast dataset with several parameters, namely sensitivity, specificity, accuracy, convergence speed, and overhead. In Table 5, the BBO-BDC method was evaluated against the existing MLP [1], MetaRF [2], and Hybrid CNN-DNN [3]. Convergence speed is improved and overhead is reduced by using the Barilai–Borwein Gradient Min–max Normalization and the Blinder–Oaxaca Statistical Decomposition function. Also, missing values are eliminated and relevant features are chosen. The results in Table 4 show that the sensitivity, specificity, and accuracy of the BBO-BDC method are higher than those of the existing MLP [1], MetaRF [2] and Hybrid CNN-DNN [3]. The reason for the higher sensitivity, specificity, and accuracy is the application of the AI-enabled Bernoulli Deep Belief Network Classifier. In addition, the Xavier Initialization function is used for weight initialization, and the optimal distributions are determined to achieve crop yield prediction. A limitation of this method is that it does not address the analysis of diverse geographies involved in predicting crop yield.

Conclusion

In this paper, an efficient method called the AI-enabled Barilai–Blinder–Oaxaca–Bernoulli Deep Classifier (BBO-BDC) for crop yield prediction has been proposed. It employs Barilai–Borwein Gradient Min–max Normalization for preprocessing, Blinder–Oaxaca Statistical Decomposition for feature selection, and the AI-enabled Bernoulli Deep Belief Network for the actual crop yield prediction. Three distinct processes were performed. First, the samples were normalized using the Barilai–Borwein Gradient Min–max Normalization-based preprocessing algorithm. Second, computationally efficient features were selected using the Blinder–Oaxaca Statistical Decomposition function. Third, the AI-enabled Bernoulli Deep Belief Network Classifier was used to classify the preprocessed, feature-selected samples for appropriate crop yield (i.e., rice and paddy) in an accurate and precise manner. Experiments were conducted on the Crop Yield Prediction dataset to check the performance of the proposed method. The experimental outcomes demonstrate that the BBO-BDC method achieved high accuracy and sensitivity with minimum convergence speed and overhead in comparison with the state-of-the-art methods.

In future work, the proposed method will be extended to provide precise, real-time suggestions tailored to specific agricultural factors such as weather, soil type, and time by using a novel deep learning network [54]. An attention mechanism will be used to choose the most significant features for precise crop recommendations. Also, XAI methods such as LIME and SHAP offer transparency and reliability in modern agriculture [55]. Future work will therefore investigate how XAI can be incorporated into crop recommendation systems to address the “black box” nature of AI models. In addition, XAI techniques will be used to create AI-driven recommendations for farmers.