Introduction

Conventional agricultural methods are the traditional techniques that have been in use for centuries, relying heavily on manual labor, natural weather patterns, and outdated tools and equipment. These methods face several challenges, including vulnerability to climate change, environmental degradation, low efficiency, and reduced productivity1. In contrast, smart agriculture, also known as precision agriculture, leverages modern technologies such as the Internet of Things (IoT), Artificial Intelligence (AI), drones, and automation to enhance farming productivity and efficiency. This approach uses data-driven insights to optimize farming practices and support more informed decisions2.

The world population is expected to exceed 9 billion by 2050, by which time food demand will increase by 60%. Smart agriculture can boost crop yields through data-driven decisions and help ensure a stable food supply. Technologies such as soil moisture sensors and smart irrigation systems have been developed to reduce water wastage in irrigation, which saves resources and minimizes environmental harm. Smart agriculture also helps farmers quickly identify weather changes, diseases, and pests, and its tools help them cope with unpredictable climate changes.

In smart agriculture, the efficient handling of inputs and machinery helps to reduce greenhouse gas emissions. The collected data support crop rotation and a healthier use of chemical fertilizers. Automation and analytics thus improve productivity and reduce input costs and labor expenses. Farmers can access digital platforms to connect with buyers, financial services, and supply chains. Because smart agriculture involves new technologies such as data analysis and technical support, it attracts the younger generation to innovation and entrepreneurship. It also enables better international cooperation on food security and sustainability goals.

Smart agriculture increases efficiency and productivity by precisely managing resources. It can reduce water usage by accurately determining the optimal amount of water needed for crops, minimize chemical use, conserve energy, and maintain soil health3,4. Additionally, it helps to reduce pollution and environmental degradation. By analyzing the data collected from farm fields, AI and machine learning models can predict potential risks, such as crop diseases, water-related issues, and pest infestations. Compared to the traditional methods, smart agriculture offers significant advantages in decision-making, technology integration, weather dependency, resource optimization, sustainability, and overall efficiency5.

However, the implementation of smart agriculture requires substantial initial investment in sensors, software, and infrastructure. Farmers also need to be trained in the technical aspects of operating and maintaining smart farming systems. Moreover, since smart agriculture relies on internet connectivity, remote areas with poor network access may face challenges. Another concern is the privacy of farm data, particularly when data are shared with third parties such as private agencies6,7.

The scope of the present research work is the design, development, and implementation of an intelligent precision agriculture system that uses a lightweight AI model. This work aims to improve water management and promote sustainable farming practices. Sustainable agriculture focuses on the efficient use of water in irrigation, which is enhanced by predicting optimal watering schedules that reduce the environmental impact and support climate-resilient farming. The deployment of computationally efficient machine learning models such as the random forest algorithm emphasizes real-time decision-making without depending on cloud computing.

The implementation starts with the deployment of sensor devices such as the ESP32 in the farmland to collect and process data. Sensors are integrated with these devices to measure key attributes such as soil moisture, temperature, and light intensity. The gathered real-time data are used to train the model for irrigation scheduling. Data-driven methods are then used to enable automatic irrigation by sending commands to the actuators. These methods improve crop yield and soil health by maintaining an optimal moisture level.

The performance and effectiveness of the proposed method are estimated using prediction accuracy as the key parameter. The goal of this parameter is to assess how accurately the AI model predicts the irrigation needs from the sensor data. From the predictions, the accuracy, mean absolute error, and root mean square error can be calculated. These values are then compared with those of other AI-based algorithms, and recommendations are shared. The recommendations cover water usage efficiency, crop health and yield impact, device responses, and cost effectiveness.

As agriculture faces challenges such as poor crop management, pest control issues, environmental degradation, and inefficient resource use, sustainable agricultural practices are essential to ensure food security while minimizing environmental impact. Consequently, there is a growing need for integrated platforms that use consumer electronics to monitor and control practices such as smart farming, smart irrigation, climate monitoring, and pest control. These platforms should also incorporate data-driven insights to help farmers make better decisions8. The key advantages of such a smart irrigation system are real-time monitoring, data collection, integration, predictive analysis, mobile interfaces, and automation. These systems can be designed and implemented by combining consumer devices such as smartphones with advanced technologies such as IoT and AI. This approach will greatly benefit farmers and promote a transition towards more sustainable agricultural practices.

Key contributions

A smart agriculture system leverages sensor data to assess crop health, identify disease types, determine irrigation needs, and evaluate soil quality. By incorporating both the real-time and the historical sensor data, the system enables more informed and robust decision-making. The key contributions of this study are as follows:

  i. A comprehensive analysis of the proposed approach is presented for ensuring accurate yield distribution predictions across various locations.

  ii. The proposed model uses a random forest classifier, which is effective in handling non-linear data and minimizing overfitting.

  iii. The smart agriculture system is integrated with consumer devices, such as mobile phones, allowing stakeholders to monitor and control activities remotely.

  iv. Accuracy and yield predictions are evaluated, paving the way for analyzing the overall performance of the proposed model.

Organization of the paper

The rest of the paper is organized as follows. Section “Literature review” explains the related works on various smart agriculture methods and their performance metrics. Section “Materials and methods” presents the materials, methods, and working procedure of the proposed model. Section “Proposed methodology” explains the proposed work along with the system model and the architecture. Section “Experimentation, results and analysis” presents the experimental setup, experimental evaluation, and performance analysis. The conclusion and future work are discussed in Section “Conclusion and future works”.

Literature review

The related works on smart agriculture demonstrate a growing integration of AI and IoT to enhance agricultural productivity and sustainability. Various studies have focused on incorporating IoT-based systems for farm monitoring, precision farming, weather tracking, and smart irrigation. AI techniques, such as machine learning and deep learning models, have been applied to improve decision-making processes, optimize resource usage, and predict agricultural outcomes like crop health, yield, and pest risks. Additionally, the use of low-cost sensors, adaptive AI, and cloud computing has enabled cost-effective and energy-efficient solutions, though challenges remain in terms of data security, privacy, and initial infrastructure investment. Notable contributions include advancements in sensor networks, smart farming applications, and decentralized models aimed at improving efficiency and reducing environmental impact, with several studies reporting significant accuracy improvements in yield prediction and resource management.

To address the challenges in smart agriculture, Ali and Galyna9 have presented a system based on AI and IoT. This work has initially focused on the use of IoT technologies in agriculture, including drones for farm monitoring, precision farming, weather monitoring, and smart greenhouses. With the advancement of AI technologies, AI-based methods are incorporated to further boost the agricultural output. It is predicted that the agricultural market will reach $4 billion by 2026. Kumar et al.10 have explored smart sensing in agriculture from both qualitative and quantitative perspectives and proposed a method using low-cost, low-energy sensors to reduce the need for manual labor and promote automation. The system integrates various consumer electronic devices by considering factors such as temperature and soil health.

Vijendrakumar et al.11 have reviewed the use of various IoT techniques in smart and sustainable agriculture, discussed the importance of smart agriculture, and outlined methods to enhance productivity. Their system has employed sensors in the farm fields, with data analysis powered by machine learning techniques. The decision support system for smart farming is classified into four phases: data collection, supervision and management, feature selection, and agricultural data analysis. In addition to IoT and machine learning, the roles of data analytics and cloud storage are emphasized in supporting sustainable agriculture.

Sengupta et al.12 have developed a novel smart agriculture model called FarmFox, which uses a Quad sensor and follows a wireless sensor network architecture based on IoT. The system consists of three layers: the sensing layer, remote processing layer, and application layer. FarmFox includes integrated sensor nodes, IoT gateways, and remote servers, aiming to implement precision agriculture aligned with Agriculture 5.0. The model has achieved an accuracy of 87.3% using IoT-based smart agriculture. Wongchai et al.13 have proposed an AI-enabled IoT soft sensor and deep learning architecture, using a weight-optimized neural network to maximize the likelihood based on features. This sensor-based system for rice, wheat, and paddy crops has achieved an accuracy of 89%, computational time reduction of 56%, precision of 85.5%, recall of 89.9%, and an F1-score of 86%.

Singh et al.14 have introduced a model using the Long-range Radio (LoRa) protocol for monitoring soil and weather conditions in precision agriculture within smart cities. LoRa, integrated into a wide-area network, supports smart irrigation systems. Various communication technologies are analyzed for their range and data rate and machine learning algorithms are implemented for agricultural activities such as crop, livestock, soil, and water management. This approach has achieved an accuracy of 87.2% in yield prediction. Mandal et al.15 have developed a smart farming application by combining IoT and AI with the aim of addressing key agricultural challenges like crop production and management. The model focuses on four primary aspects of Agriculture 4.0: monitoring, control, prediction, and logistics. The model has achieved 82% precision.

To address the safety concerns in smart agriculture, Saba et al.16 have designed a decentralized trust-based model to reduce the costs and delays while improving crop productivity. This system integrates network registration with an IoT-based machine learning system and includes a data security model. As a result, IoT device energy consumption is reduced by 16% and the packet delivery ratio is increased by 17%. Banke17 has discussed the use of adaptive AI in precision agriculture by employing a self-learning algorithm based on real-time farm data. By utilizing an enhanced multiple linear regression model, a precision accuracy of 88% has been achieved.

Chouaib El et al.18 have developed a weather data management system based on AI and data analytics to support precision agriculture in the European ReAnalysis V5 (ERA-5) framework. This system includes an exploratory data analysis model, using a feed-forward neural network and LSTM, and it has achieved an RMSE of 0.04 with the deep learning model, and 0.07 for evapotranspiration using XGBoost models. Abdennabi et al.19 have presented an advanced model to enhance food security through IoT and cloud computing. Their smart irrigation system, integrated with IoT devices and cloud computing for scheduling, utilizes the ESP32 architecture for remote access. The system has demonstrated reduced response time even under high temperature and humidity, proving the efficacy of the V-model for smart irrigation.

Fan et al.19 have explained the random forest classifier in detail. Their model has been implemented for short-term load forecasting on various power resources. The trees are created using the standard procedure and, based on the trees, a mean generation function model has been constructed. This model evaluates the performance of the subtrees. The RMSE, MAPE, and MAD are measured to evaluate the performance. The data have been collected from the eastern zone of the USA and split 80:20 for training and testing. The best performance metric values are achieved with the least error rate.

Shoar et al.20 have shown that the random forest algorithm can be used in various applications, including construction. They have used the RF algorithm to estimate the cost of project design and of the supervision of construction operations, both of which are labelled as engineering services. The RF algorithm is used to predict the project cost for both project-related and organization-related factors. The dataset has been collected using the nominal group technique, yielding a rich database of 95 high-rise residential building projects with 12 input variables to train and test the RF model, and 91% accuracy is achieved using this model.

Aria et al.21 have analysed the random forest method in various other applications. A framework is created to understand the characteristics of the dataset. According to them, the lack of interpretability may limit its use in specific applications such as health and economics. Their analysis shows that the RF model has high predictive performance and can be used even by less experienced users. Zhan et al.22 have verified the performance of RF in the field of healthcare and presented the broad chains of COVID-19 during the initial days. A dataset covering 184 countries and 1241 areas has been created. The RF model is constructed based on a broad learning system which studies the spread of COVID-19. This model contains three major steps, namely feature selection, establishing the sub-training dataset, and building the model. After the performance estimation, the RMSE value is calculated as 95.23%.

Josso et al.23 have used the RF model for predicting minerals in the ocean. Ferromanganese is one of the most important minerals in the ocean and it can be used in the manufacture of various products. The dataset has been collected from GeoERA, which is a large set of data, and 600 subtrees are created for the RF model. 10,244 samples are identified correctly as non-deposit minerals and 185 samples are identified correctly as Fe–Mn crust, which corresponds to an accuracy of 98.2%.

Prabha et al.24 have presented a survey of AI-enabled precision agriculture and illustrated the perceptions of AI in terms of its historical development and broad spectrum of applications. The roles of machine learning, deep learning, convolutional neural networks, and recurrent neural networks are also emphasized, because these methods are very supportive for yield estimation, plant phenotyping, and agricultural image analysis. Moreover, deployment strategies such as edge computing, cloud-based services, and hybrid infrastructure are investigated along with AI techniques. By connecting theoretical frameworks with applied innovations, the authors have given a complete perspective of how AI is shaping the future of sustainable and data-driven agricultural practices.

Dakhia et al.25 have used AI enabled IoT for food computing and applied computational techniques like AI, machine learning, data science and natural language processing to analyse and understand the food related data. It connects numerous fields like healthcare, cooking, nutrition, agriculture and consumer behaviour. Data have been collected from UECFOOD-256 which comprises food related data along with nutrition information. These data are collected using crowdsourcing, sensors and IoT methods and machine learning algorithm is used for quality control and technological advancements.

Qazi et al.26 have reviewed the current and future trends of precision agriculture and discussed in detail the need for IoT in smart agriculture, including the use of various sensors, the operation of sensors with respect to data collection, and the related protocols. Drip irrigation, hydroponics, and aeroponics are also demonstrated using sensors. Fuzzy logic algorithms are used for smart IoT-based control. The usage and support of UAVs in smart agriculture are also discussed.

Nawaz and Babar27 have presented a review of smart agriculture with IoT and AI in a resource-constrained environment. The technical, financial, and social constraints which may affect smart agriculture have been considered. A framework with both hardware and software elements, which enhances technology adoption, has been proposed. The proposed methodology is low-cost, open source, and effective for high productivity in agriculture. The AI applications in agriculture have been briefly analyzed, including soil nutrition, irrigation management, crop health, disease detection, yield estimation, and greenhouse management. Rathore et al.28 have also reviewed smart agriculture with IoT and AI using a vertical farming method. Vertical farming is a modern form of agriculture in which crops are grown in vertically stacked layers. The challenges and future trends in vertical farming have been briefly analyzed, along with a discussion of how machine learning and deep learning are used in vertical farming. An architecture for vertical farming has been proposed using AI and IoT, and its efficiency for disease detection, yield prediction, nutrition, and irrigation control automation has been estimated.

Shahid et al.29 have presented an ensemble deep learning method for cotton crop classification in AI-enabled smart agriculture. The agriculture process incorporates many domains such as crop cultivation, water management, and pest control. Among these, plant diseases and pests are the most significant threats because they may reduce productivity. Early detection and timely intervention have been ensured using the ensemble model, which has achieved 97.6% accuracy.

Research gap

Upon reviewing various existing methods of smart agriculture, certain deficiencies have been identified and are as follows.

  • Consumer devices typically provide raw data or alerts that are not actionable, highlighting the need for an integrated AI-based decision support system to effectively monitor and process the data.

  • Additionally, there is a challenge in seamlessly integrating low-cost sensors with smartphones for real-time monitoring.

  • Furthermore, current systems lack transparent data policies for local storage and user-controlled data sharing mechanisms.

  • Another limitation is that the existing solutions often fail to account for field-specific variations, local microclimates, and individual farm histories, which are crucial for personalized and accurate decision-making.

Based on these research gaps, it has been determined that the proposed model should be designed with low-cost electronic devices such as sensors and microcontrollers. Further, these devices should be controlled through the user's consumer devices, such as smartphones. The smart irrigation should be monitored using a lightweight AI algorithm.

Materials and methods

The materials and methods have been carefully chosen, since the proposed work must be implemented properly to meet the target objectives. This section explains the process of data collection from the fields, methods involved for data preprocessing and the proposed random forest algorithm as well as its significance.

Material

The study area is characterized by factors such as climate changes and soil type, whereas the target crop and the fields are characterized by the type of fertilizer and the product used. These are chosen based on economic factors and data availability.

Dataset description

The dataset contains information such as environmental readings, humidity, soil-related data, and water-related data, which are collected through sensors. These data are collected with an ESP32 Wi-Fi module for real-time monitoring. The data fields contain indicators according to the World Bank and the Economist Intelligence Unit, and they can be used for the systematic review of articles on smart irrigation systems in agriculture. The dataset contains plant indicators and environmental indicators which are gathered through an automated IoT system30,31. The dataset is intended for economic modelling to assess climate changes and to identify environmental sustainability factors. It includes the risks related to temperature, water shortages, land degradation, and natural disasters. The main labels of the selected attributes are illustrated in Table 1.

Table 1 Attribute selected for testing.

Data preprocessing

The dataset used in this work contains basic environmental factors such as location, rainfall (R), temperature (temp), soil moisture (Smoi), humidity (H), season (S), and area (A). It also contains non-essential features such as crop type and yield, which can be removed to improve efficiency. The data fields are divided into two groups, numerical and categorical. During preprocessing, the initial steps of data acquisition, handling of missing values, data cleaning, type conversion, and outlier detection are carried out on the dataset32. Once these mandatory activities are complete, feature engineering is applied. Here, the trends in temperature and the changes in soil moisture are extracted as temporal features, along with statistical features such as the mean, median, mode, and standard deviation of the data.

Normalization is also applied to the data, as machine learning algorithms are used. The input data are normalized using the min–max scaling method, and the dataset is then split into training and testing sets for the machine learning algorithm33.
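The normalization and splitting described here can be sketched as follows. This is a minimal illustration, not the authors' code: the sample readings, the helper names, and the 70:30 ratio applied at this stage are assumptions, and fitting the scaler on the training rows only is a design choice made here to keep test data out of the scaling statistics.

```python
import numpy as np

def train_test_split_indices(n_samples, test_fraction=0.3, seed=0):
    """Shuffle row indices and split them into train and test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]

def min_max_fit(X):
    """Return the per-column minimum and range (a zero range is replaced by 1)."""
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0
    return col_min, col_range

def min_max_transform(X, col_min, col_range):
    """Scale each column with x' = (x - min) / (max - min)."""
    return (X - col_min) / col_range

# Hypothetical readings: temperature (deg C), soil moisture (%), humidity (%)
X = np.array([
    [24.0, 31.0, 60.0],
    [30.5, 18.0, 45.0],
    [27.0, 42.0, 72.0],
    [35.0, 12.0, 38.0],
    [22.5, 55.0, 80.0],
    [29.0, 26.0, 52.0],
    [33.0, 15.0, 41.0],
    [25.5, 47.0, 68.0],
    [31.5, 20.0, 49.0],
    [26.0, 38.0, 63.0],
])

train_idx, test_idx = train_test_split_indices(len(X), test_fraction=0.3)
col_min, col_range = min_max_fit(X[train_idx])          # statistics from training rows only
X_train = min_max_transform(X[train_idx], col_min, col_range)
X_test = min_max_transform(X[test_idx], col_min, col_range)
```

After scaling, every training column lies in [0, 1]; test values can fall slightly outside that interval when a test reading exceeds the range seen during training.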

Methods

The random forest classifier has been selected for its robustness in handling non-linear data and reducing overfitting34. It is a powerful machine learning algorithm for both classification and regression. It is an ensemble learning method that constructs multiple decision trees in parallel and combines their results for better performance. The major steps of random forest classification are bootstrapping, building multiple decision trees, and voting. In the bootstrapping step, the dataset is sampled randomly with replacement to create multiple subsets. On each subset, an individual decision tree is constructed and trained. In classification, the class chosen by the majority of the trees is the final class. In regression, the average prediction of the trees is used. The primary advantages of the random forest classifier are high accuracy, scalability, and feature importance estimation35. Figure 1 depicts the flow of the proposed random forest algorithm.

Fig. 1 Proposed architecture.

As the random forest algorithm is an ensemble technique, it combines multiple decision trees, reduces overfitting, and increases prediction accuracy. In smart agriculture, this method is used for crop yield prediction, soil moisture estimation, weather forecasting, and disease detection36,37.
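The bootstrapping and voting steps of a random forest can be illustrated with a small sketch. The function names and the example votes below are hypothetical and only show the mechanics; they are not taken from the proposed system.

```python
import numpy as np
from collections import Counter

def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (one bootstrap subset)."""
    return rng.integers(0, n_samples, size=n_samples)

def majority_vote(labels):
    """Classification: return the most frequent label among the tree outputs."""
    return Counter(labels).most_common(1)[0][0]

def average_prediction(values):
    """Regression: return the mean of the tree outputs."""
    return float(np.mean(values))

rng = np.random.default_rng(42)
n_samples = 1000
subset = bootstrap_indices(n_samples, rng)

# Rows never drawn into the subset are "out of bag" for that tree;
# for large n this fraction is close to 1/e (about 37%).
oob_fraction = 1.0 - len(np.unique(subset)) / n_samples

# Hypothetical outputs of five trees for one sensor reading
tree_votes = ["drip", "spray", "drip", "basin", "drip"]
final_class = majority_vote(tree_votes)
```

Because each tree sees a different bootstrap subset, the individual trees make partly independent errors, which is why combining them tends to reduce overfitting.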

The key hyperparameters tuned through cross-validation are the number of trees, the maximum depth of the tree, and the minimum number of samples per leaf. These parameters are decided based on the soil data, temperature, and humidity values. The feature importance is derived from the trained random forest model to identify the key variables which influence crop yield and classification.
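A hedged sketch of such a cross-validated search with scikit-learn is shown below. The synthetic features, the labelling rule, and the grid values are assumptions made purely for illustration; they are not the dataset or the settings of this study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
n = 300
# Synthetic stand-in features (NOT the paper's data)
temperature = rng.uniform(15, 40, n)
soil_moisture = rng.uniform(5, 60, n)
humidity = rng.uniform(20, 90, n)
X = np.column_stack([temperature, soil_moisture, humidity])
# Hypothetical label rule: irrigation is needed when the soil is dry
y = (soil_moisture < 25).astype(int)

# The three hyperparameters named in the text; the candidate values are assumed
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, None],
    "min_samples_leaf": [1, 5],
}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,                 # 5-fold cross-validation
    scoring="accuracy",
)
search.fit(X, y)

# Feature importance from the best model found by the search
best_model = search.best_estimator_
feature_names = ["temperature", "soil_moisture", "humidity"]
importances = dict(zip(feature_names, best_model.feature_importances_))
```

In this toy setting the label depends only on soil moisture, so that feature receives by far the largest importance; on real field data the ranking would reflect the actual sensor relationships.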

Proposed methodology

The proposed smart agriculture system integrates agricultural consumer electronics, machine learning models and the sensor data processing to optimize crop yield for sustainable agriculture. The data flow of the proposed model is shown in Fig. 2.

Fig. 2 Flow diagram.

The stages involved in the proposed technique to achieve sustainable agriculture using consumer electronics are as follows.

Step 1 The initial data are processed for rainfall (R), temperature (T), soil moisture (S_moi), soil type (S_t), humidity (H), soil (S), and yield (y) in the irrigation process, together with location (L), sustainable agriculture (SA), and consumer electronics (CEd), where 1 < y < SA.

Step 2 The data preprocessing layer normalizes the numeric values in T, encodes the categorical values in S_moi, and finally ensures T = S_moi.L where y/S_t (CEd). The primary activities on the dataset, such as removal of missing values, handling of inconsistent values, and feature selection, are done in this layer.

Step 3 The model is trained on historical environmental data and irrigation methods using the random forest (RF) machine learning classifier to rank the features for 1 < y < SA.

Step 4 Intelligent decision-making with RF ensures optimal water utilization and reduces waste, with entropy towards y/S_t (CEd).

Step 5 The trained model is deployed on agricultural consumer electronics such as smart devices related to y = L. A dashboard visualization is prepared to check the performance of the system, such as irrigation status and other environmental impacts, through CEd.

Step 6 Finally, the yield distribution and sustainable agriculture are achieved with the deployment and monitoring through consumer electronics with y/S_t (CEd).
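The preprocessing, training, and prediction stages above can be sketched as a single scikit-learn pipeline. The records, column names, and the new sensor reading below are hypothetical stand-ins, not the study's dataset, and the dashboard and actuator stages are outside the scope of this sketch.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Hypothetical records (NOT the paper's dataset)
data = pd.DataFrame({
    "rainfall": [12.0, 0.0, 3.5, 20.0, 1.0, 0.5, 15.0, 2.0],
    "temperature": [24.0, 35.0, 30.0, 22.0, 33.0, 36.0, 23.0, 31.0],
    "soil_moisture": [45.0, 12.0, 25.0, 55.0, 15.0, 10.0, 50.0, 22.0],
    "humidity": [70.0, 35.0, 50.0, 80.0, 40.0, 30.0, 75.0, 48.0],
    "soil_type": ["clay", "sandy", "loam", "clay", "sandy", "sandy", "clay", "loam"],
    "season": ["kharif", "rabi", "rabi", "kharif", "rabi", "rabi", "kharif", "rabi"],
    "irrigation": ["basin", "drip", "spray", "basin", "drip", "drip", "basin", "spray"],
})
data = data.dropna()  # Step 2: remove rows with missing values

numeric_cols = ["rainfall", "temperature", "soil_moisture", "humidity"]
categorical_cols = ["soil_type", "season"]

# Step 2: normalize numeric fields and encode categorical fields
preprocess = ColumnTransformer([
    ("num", MinMaxScaler(), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

# Step 3: train the random forest on the preprocessed features
model = Pipeline([
    ("preprocess", preprocess),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
])
X = data.drop(columns="irrigation")
y = data["irrigation"]
model.fit(X, y)

# Steps 4-5: predict the irrigation class for a new (hypothetical) reading
new_reading = pd.DataFrame([{
    "rainfall": 0.2, "temperature": 34.0, "soil_moisture": 11.0,
    "humidity": 33.0, "soil_type": "sandy", "season": "rabi",
}])
command = model.predict(new_reading)[0]
```

Keeping the scaler and encoder inside the pipeline means the same transformations learned during training are applied to each new reading, which is convenient when the model is later exported to a consumer device.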

System model

Let X be the feature set and y the target irrigation class. The dataset is then expressed as follows (1).

$$X = \left\{ {R, T_{emp} , S_{moi} , S_{t} , H, S, A} \right\}, \quad y = I$$
(1)

where R is the rainfall (precipitation intensity), \(T_{emp}\) is the daily recorded temperature used for analysing crop-based farming, \(S_{moi}\) is the soil moisture used to analyse the water content of the farmland, \(S_{t}\) is the soil type, \(H\) is the atmospheric humidity, \(S\) is the crop season for plantation (such as rabi or kharif), \(A\) is the cultivated area (total plot size), and I is the irrigation type, with the classes drip, basin, and spray.

The linear operator which includes the dataset is defined as follows (2).

$$R_{i} = \mathop \sum \limits_{i} \left( R \right)\left( {x_{i} } \right)$$
(2)

where \(R_{i}\) is the average rainfall index used as a measure of water availability, \(x_{i}\) is the weight factor for normalization, and \(R\) is the aggregator operator.

Since the dataset collection is an analogue process, the continuous setting integral operator kernel transformation can be defined as (3).

$$R_{f} = \mathop \smallint \limits_{0}^{m} k_{a} \left( {p,q} \right) R_{i}$$
(3)

where \(k_{a} \left( {p,q} \right)\) is the Gaussian kernel, \(m\) is the window size in days, and \(p,q\) are the spatial coordinates (latitude and longitude).

As the dataset is divided into subsets, a condition is imposed on the measures. All measurable subsets involve \(T_{emp}\), \(S_{moi}\), \(H\), and \(A\), as in Eq. (4). This can also be regarded as a group-invariant measure that yields the desired output. As this function follows the covariance, Eq. (4) can be expanded as in Eqs. (5) and (6).

$$\mu_{m} \left( {T_{emp} .H} \right) = \mu_{m} \left( {S_{moi} .A} \right)$$
(4)
$$\mu_{m} \left( {T_{emp} .H} \right) = \mu \left( {T_{emp} } \right). \mu \left( H \right)$$
(5)
$$\mu_{m} \left( {S_{moi} .A} \right) = \mu \left( {S_{moi} } \right). \mu \left( A \right)$$
(6)

where \(\mu_{m}\) is the covariance feature for the probability measure, \(T_{emp}\) is the daily recorded temperature, \(S_{moi}\) is the soil moisture, \(H\) is the atmospheric humidity, \(A\) is the cultivated area (total plot size), \(\mu \left( H \right)\) is the humidity distribution, and \(\mu \left( A \right)\) is the area distribution.

As the dataset contains both categorical and numerical data fields, equivalent linear operators are required to collect the majority of the decisions from various sub trees (7).

$$H\left( {g.f} \right) = g.Hf$$
(7)

where H is the Decision tree class hierarchy, \(g\) is the feature subspace projection and \(f\) is the predictor function.

The decisions of the subtrees are based on influencing parameters such as temperature, soil moisture, humidity, and location38. It is assumed that the temperature of the location affects the soil moisture, and hence the equivalent linear operator is expressed as follows (8).

$$X_{m} \left( {T_{emp} .S_{moi} } \right) = T_{emp} \left( {p,q} \right) . S_{moi} \left( {p,q} \right)$$
(8)

where \(X_{m}\) is the critical threshold for identifying irrigation demand, which captures the non-linear effect of temperature.

The model is trained with 100 estimators, and its performance is evaluated based on accuracy, feature importance, and confusion matrix analysis. The random forest function is given as (9).

$$f \left( X \right) = \frac{1 }{n}\mathop \sum \limits_{i = 1}^{n} h_{i} \left( X \right)$$
(9)

where n is the number of decision trees, optimized with respect to the out-of-bag error, and \(h_{i}\) is an individual tree of a given depth.

Each subtree chooses its splits using the Gini impurity39. It is the criterion under which the decision trees use a splitting threshold of less than 0.2 (10).

$$Gini \left( X \right) = 1 - \mathop \sum \limits_{i = 1}^{c} p_{i}^{2}$$
(10)

where \(p_{i}\) is the probability of class i in the dataset X, and c represents the number of classes.
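As a worked illustration of Eq. (10), the Gini impurity can be computed directly from the class labels at a node. The helper names and sample labels are hypothetical, and treating 0.2 as a stopping threshold on node impurity is only one possible reading of the threshold mentioned in the text.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini(X) = 1 - sum_i p_i^2, with p_i the fraction of class i."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())

def weighted_gini(left, right):
    """Impurity of a split: size-weighted average of the two child impurities."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)

pure_node = ["drip", "drip", "drip", "drip"]
mixed_node = ["drip", "drip", "spray", "basin"]

g_pure = gini_impurity(pure_node)     # 1 - 1^2 = 0
g_mixed = gini_impurity(mixed_node)   # 1 - (0.5^2 + 0.25^2 + 0.25^2) = 0.625

# Assumed reading of the 0.2 threshold: a node below it is treated as pure enough
GINI_THRESHOLD = 0.2
stop_splitting = g_pure < GINI_THRESHOLD
```

A node containing a single class has impurity 0, while the mixed node above has impurity 0.625, so a tree would prefer splits that move the children towards the first case.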

Like the Gini index, the entropy is also calculated to understand the non-homogeneity of the data40. It measures the impurity of the dataset: it is zero for a perfectly homogeneous set and grows as the classes become more mixed. The entropy is expressed as follows (11).

$$Entropy \left( S \right) = - \left( {P\log_{2} P + N\log_{2} N} \right)$$
(11)

where P is the proportion of positive (correct) samples and N is the proportion of negative (wrong) samples.

From the entropy value, the information gain is calculated as follows (12).

$$Gain = Entropy \left( S \right) - \mathop \sum \limits_{values} \frac{{|S_{v} |}}{\left| S \right|} Entropy \left( {S_{v} } \right)$$
(12)

where \(S_{v}\) is the subset of S for which the attribute takes the value v, and a minimum gain of 0.01 bits is used.
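Equations (11) and (12) can be checked on a small example. The labels and attribute values below are made up for illustration, and applying the 0.01-bit minimum gain as a condition for accepting a split is an assumed interpretation.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy in bits: -sum_i p_i * log2(p_i) over the classes present."""
    n = len(labels)
    if n == 0:
        return 0.0
    total = 0.0
    for count in Counter(labels).values():
        p = count / n
        total -= p * math.log2(p)
    return total

def information_gain(labels, attribute_values):
    """Gain = Entropy(S) - sum_v |S_v|/|S| * Entropy(S_v)."""
    n = len(labels)
    subsets = {}
    for label, value in zip(labels, attribute_values):
        subsets.setdefault(value, []).append(label)
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

# Hypothetical samples: irrigation decision with two candidate attributes
labels = ["irrigate", "irrigate", "skip", "skip"]
soil_state = ["dry", "dry", "wet", "wet"]        # separates the classes perfectly
season = ["rabi", "kharif", "rabi", "kharif"]    # carries no information here

gain_soil = information_gain(labels, soil_state)
gain_season = information_gain(labels, season)

MIN_GAIN_BITS = 0.01  # assumed use of the minimum gain from the text
split_on_soil = gain_soil >= MIN_GAIN_BITS
split_on_season = gain_season >= MIN_GAIN_BITS
```

With two equally frequent classes the parent entropy is 1 bit. Splitting on the soil state removes all of it (gain of 1 bit), while splitting on the season removes none (gain of 0 bits), so only the first split passes the minimum-gain check.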

Finally, the accuracy score and the confusion matrix are used to assess the performance of the model, and the feature importance is plotted to determine the most influential factors in irrigation prediction. The accuracy is defined as follows (13).

$$Accuracy = \frac{TP + TN}{{TP + TN + FP + FN}}$$
(13)

where TP denotes true positives, TN true negatives, FP false positives and FN false negatives.
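A short sketch of Eq. (13) follows. For the binary case the diagonal of the confusion matrix holds TN and TP; for more than two classes the same ratio of diagonal entries to all entries applies. Both matrices below are hypothetical and are not the values of the paper's confusion matrix.

```python
def accuracy_from_confusion(matrix):
    """Accuracy = correct predictions (diagonal) / all predictions."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Binary case of Eq. (13): rows are actual, columns are predicted.
# Hypothetical counts: TN = 50, FP = 5, FN = 10, TP = 35.
binary = [[50, 5], [10, 35]]

# Hypothetical 3-class matrix (basin, drip, spray); not the paper's data.
multi = [[18, 1, 1], [2, 16, 2], [0, 1, 9]]
```

Here `accuracy_from_confusion(binary)` evaluates (35 + 50) / 100 and `accuracy_from_confusion(multi)` evaluates 43 / 50.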

Algorithm for the proposed model

The following algorithm depicts the proposed random forest classifier model.

Algorithm 1 Intelligent irrigation prediction.

Class imbalance handling

The dataset used in this research contains basic environmental factors such as location, rainfall (R), temperature (temp), soil moisture (Smoi), humidity (H), season (S) and area (A). Class imbalance arises across the irrigation types to be identified, namely drip, spray, and basin, and three methods are used to handle it. First, stratified sampling splits the dataset 70:30 into training and testing sets while preserving the class distribution. Second, a class-weighted RF assigns higher weights to the minority classes, with weights inversely proportional to the irrigation class frequencies. Third, synthetic minority oversampling is applied to the drip, spray, and basin samples in a manner that prevents data leakage. Together, these methods improve the irrigation class accuracy score by handling the class imbalance in the agricultural dataset, including for the minority crop yield distribution classes. The inverse frequency weighting of the proposed model improves irrigation detection by reducing false negatives, which matters for precision water management in high-temperature areas.
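The first two steps can be sketched with the standard library alone. The split below keeps the class proportions in both partitions, and the weights follow the common inverse-frequency heuristic n_samples / (n_classes × class_count), which is the same formula scikit-learn applies for `class_weight="balanced"`. The label counts, the seed and the function names are assumptions; the synthetic oversampling step is omitted, and it is assumed here that it would be applied to the training partition only, after the split.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, train_fraction=0.7, seed=42):
    """Return (train_idx, test_idx) preserving class proportions per split."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


def inverse_frequency_weights(labels):
    """Weight per class = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


# Hypothetical imbalanced label set: 60 drip, 30 spray, 10 basin.
labels = ["drip"] * 60 + ["spray"] * 30 + ["basin"] * 10
train_idx, test_idx = stratified_split(labels)
weights = inverse_frequency_weights([labels[i] for i in train_idx])
```

With these counts the rarest class (basin) receives the largest weight, so misclassifying a basin sample costs the model more than misclassifying a drip sample.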

Experimentation, results and analysis

Experimental setup

The principal objective of this study is to apply a machine learning model to enhance agricultural performance. This involves identifying a suitable crop for the region, predicting yields, classifying plant diseases and automating irrigation41. Crop productivity depends on soil quality, weather conditions and crop type. The experiments were run on an Intel Core i5 processor with integrated graphics and 16 GB of RAM, using Google Colab with a T4 GPU; the software tools used are TensorFlow and PyCharm. The features extracted for the evaluation are listed in Table 2.

Table 2 Features for evaluation.

Results

The correlation heatmap plays a significant role in understanding and analysing the dataset, since it visually shows how strongly the variables are related. For example, if temperature and humidity are negatively correlated, this may influence irrigation planning.

Figure 3 illustrates the correlation heatmap of the dataset. The key attributes that most influence the smart agriculture and irrigation system are area, soil moisture, yield, humidity and price. These correlations help to detect issues at an early stage and to prioritize interventions, such as focusing on fertilizer for crop growth. Heatmaps make complex data relationships easy to interpret42.

Fig. 3 Correlation heatmap.

A multiclass confusion matrix is used because the model classifies the data into three prediction categories. It summarizes the performance of the proposed model across all classes by comparing the predicted and actual outcomes. Figure 4 shows the multiclass confusion matrix of the proposed model43.

Fig. 4 Confusion matrix.

The confusion matrix in Fig. 4 is plotted for the irrigation classes basin, drip and spray. According to the controller input, the irrigation system receives a water distribution alert. Out of 632 total predictions, 570 are correct. The matrix makes it clearly visible which classes the model confuses and which parameters influence the performance. As a result, it helps to improve the performance of the model while keeping the error rate low44,45.
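Substituting the reported counts into Eq. (13) gives the overall accuracy and error rate implied by Fig. 4, which agrees with the accuracy of about 90% quoted for the proposed model:

$$Accuracy = \frac{570}{632} \approx 0.902, \qquad Error\;rate = \frac{632 - 570}{632} = \frac{62}{632} \approx 0.098$$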

Proposed model validation and overfit handling

After the data correlations are mapped, the next step is model validation. The model is first validated with stratified cross-validation, which maintains the irrigation class distribution in every fold, allows each fold to assess performance, and facilitates early stopping based on the Out-Of-Bag (OOB) error. The OOB validation avoids unnecessary computation and improves structural convergence, together with pruning parameters such as max_depth and min_sample that limit the complexity of the decision trees. The OOB error plays an important role in balancing the model’s generalizability and capacity, as shown by the learning curve, which reaches 91.2% accuracy for training and 90% for testing (Fig. 5, validation accuracy). The training and validation curves converge in parallel as the sample size increases, which supports the validity of the proposed model. The model remains suitable for validation and deployment, with less than 2% error.

Fig. 5 Validation accuracy.

The OOB error correction stabilizes the samples, prevents overfitting and ensures model stability across the bootstrap samples. This stability is crucial for reliable and accurate irrigation predictions under variable soil conditions.
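The idea behind the OOB estimate can be sketched as follows: each tree is trained on a bootstrap sample drawn with replacement, and the samples that were never drawn act as a held-out set for that tree. The dataset size and seed below are hypothetical, and the sketch covers only the sampling step, not the training or scoring of trees.

```python
import random


def bootstrap_with_oob(n_samples, rng):
    """Draw a bootstrap sample and return (in_bag_indices, oob_indices)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    oob = [i for i in range(n_samples) if i not in chosen]
    return in_bag, oob


rng = random.Random(0)          # fixed seed so the sketch is repeatable
n_samples = 1000                # hypothetical dataset size
in_bag, oob = bootstrap_with_oob(n_samples, rng)
oob_fraction = len(oob) / n_samples   # expected to be near 1/e (about 0.37)
```

Because roughly a third of the data is out of bag for any given tree, the forest obtains a validation signal without setting aside a separate hold-out set.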

Data integrity and sensor reliability

The proposed work ensures real-time data collection on farmland using an ESP32-based, IP67-rated sensor node. The sensor operates effectively in various conditions, including extreme temperatures from −10 to 60 degrees Celsius. Data transmission is protected by a cyclic redundancy check (CRC) checksum, and environmental noise is smoothed automatically over the temporal resolution of the readings. Agricultural data, such as field reports of soil, temperature and moisture, are collected effectively even if a failure occurs at one node. Triple modular redundancy voting is used to eliminate single-point failures, and the remaining reports are corrected through automatic calibration. The working conditions of real-time field calibration are given in Table 3.
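The two integrity mechanisms mentioned above can be sketched briefly. The text does not specify the CRC variant, the frame layout or the voting rule, so the CRC-32 checksum, the 8-byte frame and the median voter below are all assumptions chosen for illustration.

```python
import statistics
import struct
import zlib


def pack_reading(value):
    """Serialise a float reading and append its CRC-32 (big-endian)."""
    payload = struct.pack(">f", value)
    return payload + struct.pack(">I", zlib.crc32(payload))


def unpack_reading(frame):
    """Return the reading, or None when the CRC check fails."""
    payload = frame[:4]
    (received_crc,) = struct.unpack(">I", frame[4:8])
    if zlib.crc32(payload) != received_crc:
        return None
    return struct.unpack(">f", payload)[0]


def tmr_vote(a, b, c):
    """Triple modular redundancy: the median masks one faulty reading."""
    return statistics.median([a, b, c])


frame = pack_reading(23.5)
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]   # flip one payload bit
voted = tmr_vote(23.4, 23.6, 99.9)                  # third node is faulty
```

The corrupted frame is rejected by the checksum, while the voter returns a plausible value even though one of the three nodes reports an outlier.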

Table 3 Sensor performance under variable field conditions.

The proposed framework reduces single-point failures by 90% compared with other IoT solutions. The sensor remains accurate across farming conditions with frequent climatic variations, and the proposed methodology addresses these challenges effectively. For data integrity, the model uses AES-128 for data transmission and SHA-256 to protect mobile firmware updates against injection attacks. Access is controlled through role-based permissions, which give ordinary users limited access and give the farmland owner more extensive access, including to critical updates. The model uses Hyperledger to prevent unauthorized data modification, which, with the help of sensor simulations, strengthens the security of the proposed model. Figure 6 shows the sensor performance under varied conditions.
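A firmware integrity check based on SHA-256 can be sketched as below. The firmware bytes and function names are hypothetical, and the AES-128 transport encryption is left out because it requires a third-party cryptography library.

```python
import hashlib
import hmac


def firmware_digest(image):
    """Return the SHA-256 digest of a firmware image as a hex string."""
    return hashlib.sha256(image).hexdigest()


def verify_firmware(image, expected_hex):
    """Accept the image only when its digest matches the published one."""
    return hmac.compare_digest(firmware_digest(image), expected_hex)


# Hypothetical firmware bytes; a real image would be read from a file.
original = b"irrigation-controller-firmware-v1"
published_digest = firmware_digest(original)
tampered = original + b"\x00injected"
```

A digest comparison of this kind only detects tampering when the published digest itself reaches the device through a trusted channel, for example inside a signed manifest.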

Fig. 6 Extreme condition sensor performance.

Feature importance

In smart agriculture, it is important to understand which environmental factors drive outcomes such as crop health, irrigation need, disease prediction and yield distribution46,47. Feature importance helps to identify the most influential factors and supports decision making48. Figure 7 shows the feature importance of the given dataset.

Fig. 7 Feature selection.

The proposed model employs several feature importance techniques, with Gini importance as the primary method; it identifies soil moisture and temperature as key predictors and supports the 90% result. The most critical predictor under the Gini metric accounts for 32% of decisions, followed by temperature (28%) and humidity (18%). Statistical validation is performed over 100 iterations, with p < 0.05 used as the significance level for feature mapping, and shows a strong correlation with the Gini metrics. This indicates that the proposed random forest is not affected by inherent biases in feature selection, as it produces high accuracy in irrigation type prediction for specific farmland in tropical regions and vice versa49,50.
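The text does not name the statistical test used. One common choice that fits the description of 100 iterations and a p < 0.05 threshold is a permutation test, sketched here as an assumption: the labels are shuffled repeatedly, and the p-value is the share of shuffles in which the feature separates the classes at least as well as in the real data. The readings, labels and the mean-gap statistic are hypothetical.

```python
import random


def mean_gap(values, labels):
    """Absolute difference between the feature means of the two classes."""
    ones = [v for v, y in zip(values, labels) if y == 1]
    zeros = [v for v, y in zip(values, labels) if y == 0]
    return abs(sum(ones) / len(ones) - sum(zeros) / len(zeros))


def permutation_p_value(values, labels, n_iter=100, seed=0):
    """Share of label shuffles whose gap is at least the observed gap."""
    rng = random.Random(seed)
    observed = mean_gap(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)
        if mean_gap(values, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_iter + 1)


# Hypothetical soil-moisture readings: dry plots (label 1) need irrigation.
moisture = [12, 14, 15, 13, 11, 16, 10, 14, 41, 44, 39, 42, 45, 40, 43, 38]
needs_water = [1] * 8 + [0] * 8
p_value = permutation_p_value(moisture, needs_water)
significant = p_value < 0.05
```

With 100 iterations the smallest attainable p-value is 1/101, so the resolution of the test is limited; more iterations would be needed to distinguish very small p-values.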

Comparing the actual and predicted water irrigation further refines precision agriculture. It helps farmers to identify whether water is being overused or underused, and it builds confidence in AI-driven recommendations51. Figure 8 depicts the actual and predicted water irrigation distributions.

Fig. 8 Predicted irrigation distribution.

The irrigation level and the irrigation frequency are plotted in Fig. 8. The graph shows that the actual and predicted values agree closely when the random forest classifier is used, which demonstrates improved water efficiency and reduced operational costs.

Adaptive learning and prediction

The proposed model uses a dynamic methodology to predict the yield distribution in farmland while maintaining high accuracy under varying climatic conditions. The model refreshes its data with a 3-month sliding window over recent sensor data, which preserves seasonal patterns and omits obsolete relationships. The temporal accuracy plot shows three phases: the first month for initial improvement, the second month for seasonal adaptation, and the third month for stability, which validates that the proposed model achieves optimal performance in all climatic conditions, as shown in Fig. 9. The model uses the Adam optimizer with a learning rate of 0.001 and a frequent feature distribution. The farmland owner can track corrections on a mobile device by making monthly adjustments to the hectare data, which results in a cumulative accuracy improvement over a year following 3 months of deployment. The final gain is assessed during monsoon transitions, and the sliding window reduces the computational overhead by 37.5% compared with traditional machine learning models. This positions the proposed model as an emerging edge-based AI solution for agriculture with maximum yield.
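The sliding-window refresh can be sketched as a simple timestamp filter. Approximating the 3-month window as 90 days is an assumption, and the record fields, dates and values are hypothetical.

```python
from datetime import datetime, timedelta


def sliding_window(records, now, window_days=90):
    """Keep only records whose timestamp lies within the last window_days."""
    cutoff = now - timedelta(days=window_days)
    return [r for r in records if cutoff <= r["timestamp"] <= now]


# Hypothetical sensor log; field names and values are illustrative only.
now = datetime(2024, 9, 30)
records = [
    {"timestamp": datetime(2024, 3, 15), "soil_moisture": 0.31},
    {"timestamp": datetime(2024, 7, 10), "soil_moisture": 0.22},
    {"timestamp": datetime(2024, 8, 20), "soil_moisture": 0.27},
    {"timestamp": datetime(2024, 9, 29), "soil_moisture": 0.35},
]
training_window = sliding_window(records, now)
```

Retraining on `training_window` instead of the full history drops the March record here, which is how older seasonal relationships fall out of the model and how the training set stays small on an edge device.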

Fig. 9 Learning curve.

Sustainable agriculture output on the smart devices

This section describes how the data and outcomes from precision agriculture are captured, processed and displayed on consumer electronic devices such as tablets, smart watches, phones, dashboards and voice assistants. Figure 10 depicts the real-time sensor data collected using the Arduino microcontroller and various sensors.

Fig. 10 Real time sensor data.

Figure 11 illustrates the mobile view of feature importance in the smart irrigation system. It gives priority to soil health monitoring, water usage, crop health and pesticide alerts.

Fig. 11 Display smart irrigation.

Figure 12 depicts the yield distribution at various frequency levels, in the range of 90% yield prediction for 100,000 distributions. Smart agriculture breaks the yield distribution into micro zones instead of treating the field as one uniform area. Hence, resources such as water, pesticides and fertilizers can be used effectively.

Fig. 12 Yield distribution.

Figure 13 illustrates how the output can be viewed on a mobile device. It is plotted between the frequencies and dataset parameters such as temperature. The diagram shows that rainfall is the most significant feature influencing the yield distribution.

Fig. 13 Smart display output.

Comparative result analysis

Table 4 presents the performance comparison of the proposed model, which achieves a high accuracy of 90%. In contrast, the traditional models XGBoost (88%), LoRa (87.2%), and ANN (89%) show lower accuracies. Notably, the energy consumption of the traditional models is around 18.7 Wh/day, whereas the proposed model draws 0.5 W, which is reported as 12% greater energy efficiency. The model is compatible with edge devices, achieving less than 50 ms latency, and utilizes an ESP sensor that is 3 times faster than other models. These results make the proposed model more efficient in resource-constrained environments, where irrigation type prediction accuracy and battery life are critical in agricultural deployments.

Table 4 Performance comparison.

Table 5 shows that the proposed random forest classification algorithm offers 90.1% accuracy in precision agriculture, which is high compared with existing works. These quantitative results confirm the strong performance of the proposed intelligent irrigation system using the random forest classifier. The combination of high prediction accuracy, significant feature insights and energy efficiency establishes that the model can be a sustainable and scalable solution for modern agriculture.

Table 5 Decision making towards sustainable architecture.

Conclusion and future works

This research explores the potential of integrating machine learning with agricultural consumer electronics to enable sustainable irrigation management. It also demonstrates the effectiveness of the random forest algorithm for smart precision technology. The proposed random forest classification algorithm offers effective and accurate irrigation prediction, at 90.1%, with a consumer electronics monitoring system that minimizes water waste and optimizes agricultural productivity. This approach not only improves crop yield and quality, as demonstrated in the result analysis, but also conserves essential resources such as water and energy. Smart agriculture bridges the gap between technology and farming by fostering more sustainable, efficient, and profitable farming systems. By integrating various agronomic parameters with an AI-driven model, the proposed model has accurately predicted the irrigation requirement and related outcomes. Deploying this study on AI-enabled consumer devices such as smartphones, irrigation controllers and IoT sensors has enhanced the accessibility and usability of precision agriculture tools.

In future, this approach could be enhanced by integrating AI with edge computing, which reduces reliance on the cloud by processing data locally in real time. Hybrid approaches that combine random forest with deep learning models could improve the prediction accuracy under complex weather dynamics. Additionally, fully autonomous robots could be developed for tasks such as planting, harvesting, weeding, and spraying. The inclusion of climate-adaptive models would further enhance the system’s ability to tailor farming practices to changing environmental conditions. The focus could also shift towards low-cost smart devices and local language interfaces to improve accessibility and ensure broader adoption across diverse farming communities. User-friendly mobile application features, such as push notifications and voice support, would increase farmer engagement and adoption. Economic and environmental impact analyses can be considered in future work, as they would quantify the benefits through long-term field trials. Collaborations with agricultural extension agencies and policy makers can support the inclusion of AI-driven irrigation tools in national precision agriculture frameworks.