Table 1 Machine Learning Algorithms with their Applications and Benefits.

From: Enhancing precision agriculture through cloud based transformative crop recommendation model

| Ref | Machine learning algorithm | Description | Applications | Advantages | Limitations |
| --- | --- | --- | --- | --- | --- |
| 13 | Decision tree | A tree-like model applied to regression and classification problems. It splits data into branches based on decision criteria | Customer churn prediction; fraud detection; medical diagnosis; feature selection | Interpretability; handles both categorical and numerical data; minimal data pre-processing | Overemphasis on certain ML models like Random Forest, with less focus on ensemble approaches; narrow dataset coverage, focusing primarily on Indian crops |
| 14 | Random forest | An ensemble learning method that combines multiple decision trees to improve predictive accuracy and reduce overfitting | Image classification; anomaly detection; recommender systems; credit scoring | High accuracy; robust to outliers; reduces overfitting | Limited coverage of real-world implementation of IoT solutions for smart farming; lack of evaluation metrics for energy and resource savings claimed in the framework |
| 15 | Support vector machine (SVM) | A binary classification algorithm that finds the hyperplane that best separates the data points of the two classes | Text classification; image recognition; bioinformatics; financial forecasting | Works well in high-dimensional spaces; uses kernel functions to handle non-linear data; margin-based classification | High computational complexity due to the use of advanced algorithms like KELM and RF; absence of a comparison with simpler and faster baseline models for practical usability |
| 16 | K-nearest neighbors (KNN) | A straightforward classification technique that labels a data point according to the majority class among its k nearest neighbors in the feature space | Handwriting recognition; collaborative filtering; anomaly detection; pattern recognition | Easy to implement; no model training required; non-parametric approach | Overfitting in gradient boosting and random forest algorithms; absence of comprehensive testing across diverse farm machinery types and environments |
| 17 | Naive Bayes | A probabilistic method based on Bayes' theorem that predicts class labels by calculating conditional probabilities | Spam email detection; sentiment analysis; document classification; medical diagnosis | Works well with text data; effective with high-dimensional data; needs less training data | Insufficient validation of the system in different geographical and climatic conditions; limited exploration of real-time decision-making mechanisms for farmers |
| 18 | Gradient boosting | An ensemble learning method that combines weak learners (often decision trees) sequentially to build a powerful predictive model | Search engine ranking; click-through rate prediction; anomaly detection; object detection | High predictive accuracy; handles imbalanced datasets; reduces bias and variance | Lack of integration with real-time environmental sensors for continuous monitoring; limited exploration of economic factors in crop recommendation |
| 19 | Neural networks (deep learning) | A complex brain-inspired model made up of interconnected nodes (neurons) arranged in layers. Deep learning models have multiple hidden layers | Image recognition; natural language processing; speech recognition; autonomous vehicles | Performs well on complex tasks; learns hierarchical representations; scales to big data | Dependency on high-quality and diverse datasets for achieving accuracy in ML models; limited application in smaller farming setups compared to large-scale operations |
| 20 | Extreme learning machine (ELM) | A learning method for single-hidden-layer feedforward neural networks (SLFNs) that provides fast training and effective performance. Hidden-layer weights are assigned randomly and output weights are computed analytically, without iterative tuning | Crop selection; yield prediction; soil fertility assessment; irrigation management; disease detection; price forecasting; smart automation; climate impact analysis | Precision farming; data-driven decisions; scalability; sustainability; adaptability; economic benefits | Limited adoption by farmers; data dependency challenges; overfitting in models; complex algorithm design; high computational cost; insufficient real-time data; accuracy vs. explainability trade-off; scalability issues in implementation; dependence on environmental variables; skill gap for technology utilization |
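
To make one row of the table concrete, the K-nearest neighbors description (label a point by the majority class among its k nearest neighbors) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the implementation used in the cited works. The function name `knn_predict`, the two features (soil pH and rainfall), and the toy data values are all hypothetical and chosen only for demonstration.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` with the majority class among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_points)),
                   key=lambda i: dist(train_points[i], query))
    # Count the class labels of the k closest points.
    votes = Counter(train_labels[i] for i in order[:k])
    # Return the label that received the most votes.
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (soil pH, rainfall in hundreds of mm) -> crop label.
points = [(5.5, 12.0), (5.8, 11.0), (6.0, 13.0),
          (7.2, 4.0), (7.5, 5.0), (7.0, 3.5)]
labels = ["rice", "rice", "rice", "millet", "millet", "millet"]

prediction = knn_predict(points, labels, query=(6.1, 10.5), k=3)
print(prediction)  # rice
```

The sketch also shows why the table lists "no model training required" as an advantage: there is no fitting step, and all the work happens at prediction time by scanning the stored examples. In practice, features measured on different scales would usually be normalized before computing distances.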