ImMLPro platform for accessible machine learning and statistical analysis in digital agriculture and beyond

Jeelani, M. Iqbal; Mansoor, Sheikh

doi:10.1038/s41598-025-14234-2


Article
Open access
Published: 30 September 2025


M. Iqbal Jeelani¹ &
Sheikh Mansoor^2,3

Scientific Reports volume 15, Article number: 34059 (2025)


Abstract

The integration of machine learning (ML) algorithms with statistical analysis and user-friendly interfaces has become crucial for democratizing advanced analytics across various domains, particularly in digital agriculture. This paper presents ImMLPro (Intelligent Machine Learning Professional), a comprehensive Shiny-based web application that seamlessly integrates R programming, machine learning algorithms, and statistical analysis for continuous variable prediction. The platform addresses the growing need for accessible ML tools that eliminate coding barriers while maintaining analytical rigor. ImMLPro incorporates four state-of-the-art algorithms: Random Forest, XGBoost, Support Vector Machines (SVM), and Neural Networks, providing comparative analysis, hyperparameter optimization, and comprehensive visualization capabilities. The application’s architecture facilitates real-time model training, performance evaluation, and result interpretation through interactive dashboards. Designed with digital agriculture applications in mind but applicable across domains requiring continuous variable prediction, ImMLPro represents a significant advancement in making complex ML algorithms accessible to nonprogramming experts. The platform’s integration of R’s statistical computing power with modern web technologies demonstrates the potential for bridging the gap between sophisticated analytical methods and practical implementation in agricultural decision-making and beyond.


Introduction

The digital transformation of agriculture has accelerated the adoption of data-driven decision-making processes, where machine learning algorithms play a pivotal role in optimizing crop yields, resource management, and sustainable farming practices^1,2. However, the complexity of implementing and comparing multiple ML algorithms often creates barriers for agricultural professionals and researchers who lack extensive programming expertise³. This challenge extends beyond agriculture to various domains requiring continuous variable prediction, including environmental monitoring, economic forecasting, and quality control in manufacturing⁴. The integration of statistical computing environments like R with interactive web technologies has emerged as a promising solution for democratizing access to advanced analytics⁵. The R programming language, renowned for its statistical computing capabilities and extensive package ecosystem, provides an ideal foundation for developing comprehensive ML platforms⁶. Recent developments in specialized R packages for neural network applications, such as the Imneuron package⁷, have further enhanced the accessibility of AI-powered solutions for regression tasks. When combined with the Shiny framework’s reactive web application capabilities, R enables the creation of sophisticated yet user-friendly interfaces that bridge the gap between complex algorithms and practical implementation⁸.

Machine learning in agriculture has evolved from simple statistical models to advanced algorithms capable of processing complex, multidimensional data. Ensemble methods like Random Forest have improved crop yield prediction accuracy by 15–20% over traditional approaches⁹. XGBoost is effective in handling missing data and non-linear relationships, enhancing soil quality and irrigation management¹⁰. Support Vector Machines excel in crop classification and disease detection, especially with limited training data¹¹. Neural networks, particularly deep learning, have transformed agricultural image analysis and sensor data processing, enabling real-time monitoring and predictive analytics¹². Applications in forestry and biometrics have significantly increased prediction accuracy in tree height-diameter modeling^13,14. However, the complexity of implementing and optimizing these models remains a barrier for many practitioners¹⁵.

R programming is now the standard for statistical computing in agricultural research, supported by extensive packages for machine learning like randomForest, xgboost, e1071, and nnet^16,17,18. Specialized packages such as ImML, ImHD, and Imneuron have enhanced R’s capabilities for dendrometric prediction and regression tasks in forestry¹⁹. Integration with web frameworks like Shiny enables the creation of interactive, user-friendly analytical tools²⁰. Recent developments focus on scalable, modular architectures supporting multiple algorithms and visualization tools, with packages like caret, ggplot2, and plotly improving model training, evaluation, and visualization^21,22. These innovations make R-based analytics more accessible and impactful for broader audiences²³.

Designing user interfaces for machine learning requires balancing user experience, efficiency, and result interpretation²⁴. Intuitive designs with interactive visualizations, real-time feedback, and guided workflows improve engagement and outcomes^25,26. Responsive dashboards with progress indicators, help features, and clear navigation are critical for adoption^27,28. Recent focus emphasizes simplicity without sacrificing functionality to keep advanced algorithms accessible²⁹. Additionally, selecting suitable algorithms involves comparative analysis frameworks that evaluate multiple models using performance metrics, cross-validation, and statistical testing, ensuring robust and reliable model choice^30,31,32.

A significant amount of work in the recent past has demonstrated substantial ML integration across agricultural, healthcare, and biomedical domains. Cloud-based frameworks have enabled precision crop recommendations³³ and enhanced diagnostic accuracy for conditions such as brain stroke prediction³⁴, while predictive models have analyzed agrochemical health impacts³⁵. User-friendly platforms like surviveR have democratized complex biomedical analyses, enabling researchers with limited computational expertise to conduct sophisticated survival studies³⁶. These developments have highlighted ML’s transformative potential in processing complex datasets and enabling evidence-based decision-making across interdisciplinary applications.

Recent studies have highlighted the importance of considering multiple evaluation metrics, including RMSE, R-squared, MAE, and domain-specific metrics, to assess algorithms from different perspectives³⁷. The development of automated hyperparameter optimization techniques has further enhanced the accessibility of ML algorithms, enabling non-experts to achieve near-optimal performance without extensive technical knowledge³⁸. This study introduces ImMLPro (Intelligent Machine Learning Professional), a comprehensive web-based platform that integrates four ML algorithms (Random Forest, XGBoost, Support Vector Machines, and Neural Networks) for continuous variable prediction. It addresses key challenges: eliminating coding barriers with user-friendly interfaces, enabling comparative analysis, simplifying hyperparameter tuning, and providing comprehensive result interpretation and visualization. The platform allows users to select and compare models, identify optimal ones for their datasets, and gain insights into feature importance and model behavior through advanced visualizations. ImMLPro thus streamlines machine learning workflows, making advanced analytics accessible and efficient for users without extensive technical skills.

Motivation and contribution

ImMLPro is developed to enhance the accessibility and efficiency of machine learning tools for agricultural research, focusing on yield prediction, dendrometric prediction, and other regression tasks. By integrating advanced R packages with user-friendly frameworks like Shiny, ImMLPro provides a scalable, modular platform that supports diverse algorithms and visualization tools. Its primary contribution lies in offering researchers an intuitive interface that combines robust statistical computing with interactive analytical capabilities. This supports accurate prediction on complex datasets, including meteorological conditions, soil properties, and agricultural practices, thereby facilitating better-informed decision-making in agricultural research.

Methodology

System architecture and design

ImMLPro is built on a modular architecture that integrates R’s statistical computing capabilities with Shiny’s reactive web framework. The system architecture follows a three-tier design pattern: presentation layer (user interface), application layer (business logic and ML algorithms), and data layer (data processing and storage). This architecture ensures scalability, maintainability, and efficient resource utilization while providing responsive user interactions (Fig. 1).

The presentation layer uses the Shiny dashboard framework to create an intuitive, responsive interface with multiple tabs for different analytical phases: data exploration, model training, hyperparameter tuning, and results analysis (Fig. 2). The interface incorporates modern web design principles, including gradient backgrounds, interactive elements, and progress indicators to enhance user experience and engagement (Fig. 3).

Machine learning algorithm integration

The platform integrates four distinct machine learning algorithms, each selected for their complementary strengths in continuous variable prediction:

Random Forest: Implemented using the randomForest package¹⁷, this ensemble method provides robust predictions through bootstrap aggregating and feature randomization. The implementation includes configurable parameters for tree count (ntree), variables per split (mtry), and minimum node size (nodesize).

The prediction for random forest regression is given by:

$$\widehat{y}(x)=\frac{1}{ntree}\sum_{i=1}^{ntree}T_{i}(x)$$

where $\widehat{y}(x)$ is the predicted output for input $x$, $ntree$ is the number of trees, and $T_{i}(x)$ is the prediction from the i-th tree.
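The averaging step in the formula above can be illustrated with a minimal Python sketch. It is not ImMLPro's code (the platform uses the R randomForest package); the function name `forest_predict` and the three toy "trees" are hypothetical stand-ins for fitted regression trees.

```python
from statistics import mean

def forest_predict(trees, x):
    """Average the per-tree predictions T_i(x) over all ntree trees."""
    if not trees:
        raise ValueError("need at least one tree")
    return mean(tree(x) for tree in trees)

# Three hypothetical "trees", each a simple threshold rule on the first feature.
toy_trees = [
    lambda x: 10.0 if x[0] < 5 else 20.0,
    lambda x: 12.0 if x[0] < 4 else 18.0,
    lambda x: 11.0 if x[0] < 6 else 22.0,
]

# For x = [3.0] the trees predict 10, 12 and 11, so the forest returns their mean.
print(forest_predict(toy_trees, [3.0]))
```

Because each tree is trained on a different bootstrap sample, averaging tends to reduce the variance of the individual tree predictions.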

XGBoost: Built on the xgboost package³⁹, this gradient boosting framework performs well on complex datasets. The implementation provides control over learning rate (eta), maximum tree depth (max_depth), and number of boosting rounds (nrounds).

The objective function for XGBoost regression (with squared loss) is:

$$obj=\sum_{i=1}^{n}\left(y_{i}-\widehat{y}_{i}\right)^{2}+\sum_{k=1}^{K}\Omega\left(f_{k}\right)$$

where $obj$ is the objective function, $y_{i}$ is the true value, $\widehat{y}_{i}$ is the predicted value for the i-th sample, $n$ is the number of samples, $K$ is the number of trees, and $\Omega\left(f_{k}\right)$ is the regularization term for the k-th tree.
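To make the two terms of this objective concrete, the sketch below evaluates it in plain Python. The article does not spell out the regularization term, so the form used here, a penalty of `gamma` per leaf plus half of `lam` times the squared leaf weights, is an assumption taken from the standard XGBoost formulation. The function names and the toy numbers are hypothetical.

```python
def tree_penalty(leaf_weights, gamma=0.0, lam=1.0):
    """Assumed regularizer: gamma * (number of leaves) + 0.5 * lam * sum(w_j^2)."""
    return gamma * len(leaf_weights) + 0.5 * lam * sum(w * w for w in leaf_weights)

def xgb_objective(y_true, y_pred, trees_leaf_weights, gamma=0.0, lam=1.0):
    """Squared-error loss over samples plus the summed penalty over all K trees."""
    loss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    penalty = sum(tree_penalty(w, gamma, lam) for w in trees_leaf_weights)
    return loss + penalty

# Toy case: two samples, two trees (one with two leaves, one with a single leaf).
y_true = [3.0, 5.0]
y_pred = [2.0, 5.0]
leaf_weights = [[1.0, 2.0], [1.0]]

# Loss is 1.0; penalties are 4.5 and 1.5, so the objective is 7.0.
print(xgb_objective(y_true, y_pred, leaf_weights, gamma=1.0, lam=1.0))
```

The penalty grows with the number of leaves and the size of the leaf weights, which is how the objective discourages overly complex trees.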

Support Vector Machines: Implemented through the e1071 package¹⁸, SVM provides flexible kernel-based learning with support for radial, polynomial, and linear kernels⁴⁰. Users can adjust cost parameters and gamma values for radial kernels.

The SVM regression objective is:

$$\underset{w,b}{\text{min}}\;\frac{1}{2}\parallel w\parallel^{2}+C\sum_{i=1}^{n}\left(\epsilon_{i}+\epsilon_{i}^{*}\right)$$

where $w$ is the weight vector, $b$ is the bias, $\parallel w\parallel^{2}$ is the squared norm of $w$, $C$ is the regularization parameter, $\epsilon_{i}$ and $\epsilon_{i}^{*}$ are slack variables for the i-th sample, and $n$ is the number of samples.
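The following sketch evaluates this objective for a linear model in plain Python. It relies on an assumption that is not stated in the article: at the optimum, the two slack variables of a sample add up to the amount by which its absolute residual exceeds an insensitivity tube. The half-width of that tube is called `tube` here, a hypothetical name chosen to avoid clashing with the slack symbols in the equation.

```python
def svr_objective(w, b, X, y, C=1.0, tube=0.1):
    """0.5 * ||w||^2 + C * total slack, with slack measured outside the tube."""
    norm_term = 0.5 * sum(wj * wj for wj in w)
    slack_total = 0.0
    for xi, yi in zip(X, y):
        fx = sum(wj * xj for wj, xj in zip(w, xi)) + b  # linear prediction
        # Only residuals larger than the tube half-width contribute slack.
        slack_total += max(0.0, abs(yi - fx) - tube)
    return norm_term + C * slack_total

# Toy data: the first point is fitted exactly, the second misses by 1.0.
X = [[1.0], [2.0]]
y = [2.0, 5.0]

# Norm term is 2.0; slack is 0.0 + 0.5, so the objective is 2.5.
print(svr_objective([2.0], 0.0, X, y, C=1.0, tube=0.5))
```

A larger `C` penalizes points outside the tube more heavily, trading a flatter function for a closer fit to the training data.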

Neural Networks: Using the nnet package^41,42, the platform implements feedforward neural networks with configurable hidden layer architecture, weight decay regularization, and maximum iteration limits. The neural network implementation benefits from recent advances in AI-powered neural network solutions for regression tasks, incorporating optimization techniques demonstrated in specialized forestry applications^13,14.

The loss function for NN regression is:

$$L=\frac{1}{n}\sum_{i=1}^{n}\left(y_{i}-\widehat{y}_{i}\right)^{2}+\lambda\sum w^{2}$$

where $L$ is the loss function, $y_{i}$ is the true value, $\widehat{y}_{i}$ is the predicted value for the i-th sample, $n$ is the number of samples, $\lambda$ is the weight decay parameter, and $\sum w^{2}$ is the sum of squared weights.
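As a small worked illustration of this loss, the Python sketch below computes the mean squared error and adds the weight decay penalty exactly as written in the formula. The function name `nn_loss` and the toy values are hypothetical; the sketch does not reproduce the internals of the nnet package.

```python
def nn_loss(y_true, y_pred, weights, decay=0.01):
    """Mean squared error plus decay times the sum of squared network weights."""
    n = len(y_true)
    mse = sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n
    penalty = decay * sum(w * w for w in weights)
    return mse + penalty

# Toy case: one of two predictions is off by 2, and the network has two weights.
# MSE is 4 / 2 = 2.0 and the penalty is 0.5 * (1 + 4) = 2.5, giving 4.5.
print(nn_loss([1.0, 2.0], [1.0, 4.0], [1.0, 2.0], decay=0.5))
```

Raising the decay parameter shrinks the weights toward zero, which usually yields a smoother fitted function at the cost of some training accuracy.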

Data processing and validation

The data processing pipeline includes comprehensive validation procedures to ensure data quality and model reliability. Input validation checks for data completeness, variable types, and statistical properties. The system automatically handles missing values and provides diagnostic information about data quality issues. The platform incorporates a comprehensive testing framework utilizing 20 standard R datasets from various packages to validate algorithm performance across diverse domains (Table 1).

Table 1 Standard built-in R datasets and the corresponding packages used for app testing.


The current interface implementation features the fruit yield data⁷ as the primary demonstration dataset (Fig. 4), allowing users to explore the platform’s capabilities with a well-understood regression problem involving fruit yield value prediction based on 11 predictor variables. The train-test split functionality allows users to specify the proportion of data used for training (50–95%), with the remainder reserved for model evaluation. Evaluating on held-out data gives a less biased estimate of predictive performance and helps reveal overfitting.
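A train-test split of this kind can be sketched in a few lines of Python. The 50–95% bound comes from the text above; the function name, the default fraction, the fixed seed, and the rounding rule are assumptions made for illustration and are not taken from the ImMLPro source.

```python
import random

def train_test_split(rows, train_fraction=0.8, seed=42):
    """Shuffle the rows reproducibly and split them into train and test parts."""
    if not 0.50 <= train_fraction <= 0.95:
        raise ValueError("train_fraction must lie between 0.50 and 0.95")
    rng = random.Random(seed)          # local generator keeps the split repeatable
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    cut = round(train_fraction * len(rows))
    train = [rows[i] for i in indices[:cut]]
    test = [rows[i] for i in indices[cut:]]
    return train, test

rows = list(range(20))                 # stand-in for 20 observations
train, test = train_test_split(rows, train_fraction=0.8)
print(len(train), len(test))           # 16 training rows, 4 test rows
```

Shuffling before splitting matters when the rows are ordered, for example by season or by plot, because an unshuffled cut would give the model a test set that differs systematically from its training data.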

Hyperparameter optimization interface

The hyperparameter tuning interface provides intuitive slider controls for adjusting algorithm-specific parameters (Fig. 5). Real-time validation ensures parameter values remain within acceptable ranges, while contextual help provides guidance on parameter selection. The interface supports both manual tuning and reset-to-defaults functionality.
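The range validation and reset-to-defaults behaviour described above could look roughly like the Python sketch below. The parameter names follow the Random Forest controls mentioned earlier, but the default values and slider ranges are hypothetical, since the article does not list the ones ImMLPro uses.

```python
# Hypothetical defaults and slider ranges; the article does not give the real ones.
DEFAULTS = {"ntree": 500, "mtry": 3, "nodesize": 5}
RANGES = {"ntree": (50, 1000), "mtry": (1, 11), "nodesize": (1, 20)}

def validate_params(params):
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    for name, value in params.items():
        if name not in RANGES:
            problems.append(f"unknown parameter: {name}")
            continue
        low, high = RANGES[name]
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside [{low}, {high}]")
    return problems

def reset_to_defaults():
    """Return a fresh copy so callers cannot mutate the shared defaults."""
    return dict(DEFAULTS)

settings = reset_to_defaults()
settings["ntree"] = 5000               # simulate an out-of-range value
print(validate_params(settings))       # reports the ntree problem
print(validate_params(reset_to_defaults()))  # no problems after a reset
```

Returning a list of messages rather than raising on the first error lets an interface show every out-of-range control at once.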

Performance evaluation and visualization

The platform implements a comprehensive model performance evaluation framework using multiple selection criteria, including Root Mean Square Error (RMSE), R-squared (R²), and Mean Absolute Error (MAE) (Fig. 6). It offers comparative analysis across the four machine learning algorithms, enabling users to identify the most suitable model for their specific datasets. The interface includes model comparison plots based on the selection metrics, a radar plot for multi-dimensional performance visualization, and residual analysis plots for diagnosing model fit. Additionally, it provides interactive scatter plots to compare predicted versus actual values, correlation matrices to explore relationships among features, distribution plots for exploratory data analysis, feature importance plots for model interpretation, and a dedicated interpretation panel to support informed decision-making.
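The three selection criteria can be computed as in the Python sketch below. The article does not define R², so the coefficient-of-determination form (one minus the ratio of residual to total sum of squares) is an assumption; some R tools report the squared correlation between predictions and observations instead. The function name and toy values are hypothetical.

```python
from math import sqrt

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R² (coefficient of determination) for one model."""
    n = len(y_true)
    residuals = [y - p for y, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)           # residual sum of squares
    y_bar = sum(y_true) / n
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)   # total sum of squares
    return {
        "RMSE": sqrt(ss_res / n),
        "MAE": sum(abs(r) for r in residuals) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }

# Toy case: three exact predictions and one that overshoots by 2.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
print(metrics)  # RMSE 1.0, MAE 0.5, R2 close to 0.2
```

In this example the RMSE is twice the MAE because squaring weights the single large error more heavily, which is one reason to report both metrics when comparing models.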

Results and implementation

Platform functionality assessment

The ImMLPro platform serves as an integrated machine learning environment that combines four different algorithms into a cohesive analytical framework. Users can navigate effortlessly through the complete analytical pipeline, from importing datasets to training models and examining outcomes, without encountering the usual complications associated with managing multiple software tools. The system has undergone rigorous performance evaluation using datasets of varying complexity, successfully processing both compact research datasets with under 100 data points and extensive agricultural collections containing over 10,000 records, demonstrating its scalability across different research contexts. Extensive validation efforts have employed a diverse collection of benchmark datasets drawn from various R statistical packages, ensuring the platform’s effectiveness across multiple research domains. The testing portfolio encompasses numerous application areas including macroeconomic analysis using the longley dataset, educational cost analysis through College data, environmental monitoring via airquality records, plant physiology research with CO₂ datasets, industrial applications through concrete strength measurements, automotive performance studies using mtcars data, and social research through occupational prestige rankings and wine assessment scores. Forest management applications receive particular attention in the validation process, with the platform incorporating tree volume datasets that enable timber estimation based on dimensional measurements such as diameter and height. The inclusion of Pine data further strengthens the system’s capabilities in forest resource evaluation and silvicultural planning. These specialized forestry datasets work alongside environmental monitoring tools to provide comprehensive natural resource management solutions for both academic researchers and industry professionals. 
The platform’s architecture employs reactive programming techniques to maintain responsive user interactions across all system components, featuring real-time progress tracking during model development and interactive visualization tools for results interpretation. Advanced resource allocation strategies enable concurrent model training processes while maintaining optimal system responsiveness, making the platform suitable for intensive computational workflows required by modern machine learning applications across forestry, agriculture, economics, and other research disciplines.

User experience evaluation

The interface design successfully eliminates common barriers associated with machine learning implementation. Users can complete full analytical workflows without writing code, while maintaining access to advanced customization options through the hyperparameter tuning interface. The dashboard organization follows logical analytical progression: data exploration, model training, parameter optimization, and results interpretation. Interactive elements provide immediate feedback, while downloadable reports enable result sharing and documentation.

Algorithm performance comparison

Comprehensive evaluation across multiple machine learning algorithms reveals distinctive strengths and limitations when applied to varied dataset characteristics. Random Forest consistently delivers stable performance with reduced hyperparameter dependency, particularly excelling on automotive, environmental, and botanical datasets. XGBoost demonstrates exceptional capability on datasets with complex feature relationships and non-linear patterns, though it demands more intensive parameter optimization. Support Vector Machines prove most effective for smaller sample sizes while maintaining strong generalization properties across biological and chemical datasets. Deep learning approaches show promise for intricate pattern detection but require careful architectural design and hyperparameter configuration, with implementation strategies informed by contemporary research in computational biology applications.

Dataset characteristics significantly influence algorithm selection and performance outcomes. For smaller datasets containing fewer than 200 samples, Support Vector Machines and Random Forest approaches provide the most reliable results. Medium-scale datasets benefit from XGBoost and Random Forest implementations, while high-dimensional feature spaces favor neural network architectures and gradient boosting methods. XGBoost demonstrates natural resilience to incomplete data, whereas alternative approaches require preprocessing interventions. The housing price prediction dataset exemplifies an ideal benchmark scenario, combining numerical and categorical variables with sufficient sample density and established performance baselines for comparative analysis.

Visualization and interpretation capabilities

The integrated visualization system provides comprehensive model interpretation through multiple graphical representations. Interactive plots enable detailed exploration of model behavior, while automated interpretation guides assist users in understanding results. The radar chart visualization effectively communicates relative model performance across multiple metrics, facilitating intuitive model selection. Residual analysis plots provide diagnostic capabilities for identifying model limitations and potential improvements.

Digital agriculture and beyond

ImMLPro leverages advanced machine learning to deliver precise predictive modeling across agriculture and other sectors, handling diverse datasets with accessible interfaces, as shown in Table 2, which provides a detailed overview of its applications. Its robust analytical architecture and intuitive design enable professionals across industries to harness sophisticated predictive tools without requiring specialized technical expertise, democratizing access to advanced analytics. This versatility empowers applications, from crop yield forecasting to healthcare outcome predictions, enhancing decision-making across diverse fields.

Table 2 ImMLPro applications across sectors.


Technical innovation and integration

The platform’s core innovation lies in bridging statistical computation with modern interface design, creating an accessible environment for complex analytical workflows. By utilizing R’s computational backbone alongside web-based technologies, the system delivers professional-grade analysis tools through simplified user interactions. The underlying reactive framework dynamically allocates computational resources, ensuring smooth operation even when processing demanding machine learning tasks. Performance enhancements include specialized algorithms that accelerate data processing while preserving result accuracy. Analytical reliability stems from robust testing methodologies that validate model performance through multiple assessment layers. The system employs systematic data partitioning strategies and implements standardized evaluation protocols to ensure credible results. Significance testing frameworks and uncertainty quantification provide additional confidence measures, enabling users to distinguish between genuine performance variations and statistical noise among different algorithms. System scalability emerges through intelligent architectural design that adapts to varying computational demands. The framework incorporates distributed processing capabilities, memory-efficient data handling structures, and optimized rendering engines for complex visualizations. These technical components work collectively to maintain consistent performance standards regardless of dataset complexity or user load, supporting reliable analytical operations across diverse use cases.

User accessibility and education

ImMLPro fundamentally transforms machine learning accessibility by removing programming prerequisites that traditionally exclude domain specialists from advanced analytical capabilities. The platform’s intuitive design philosophy ensures that agricultural researchers, environmental scientists, and other professionals can harness sophisticated algorithms through streamlined interfaces that prioritize usability over technical complexity. Interactive guidance systems provide real-time assistance through comprehensive parameter explanations, workflow recommendations, and contextual tutorials. This educational scaffolding enables users to develop genuine understanding of analytical processes while maintaining focus on practical outcomes and scientific interpretation rather than code syntax or technical troubleshooting. The platform functions as a comprehensive educational ecosystem where theoretical machine learning concepts become tangible through hands-on experimentation. Visual algorithm comparisons, interactive parameter manipulation, and real-time performance feedback create immersive learning experiences that deepen understanding of statistical modeling principles. Dynamic visualization tools demonstrate how different algorithms respond to varying data characteristics, parameter adjustments, and preprocessing techniques, helping users develop intuitive understanding of algorithm behavior for more informed analytical decisions.

ImMLPro serves as a strategic professional development platform, empowering domain experts to expand their analytical capabilities without requiring extensive computational training. By seamlessly integrating advanced statistical methods with familiar agricultural and environmental contexts, the platform enables professionals to enhance their research impact through sophisticated data analysis. The comparative evaluation framework builds critical thinking skills around algorithm selection, helping users understand trade-offs between different modeling approaches for specific research questions. This analytical literacy translates into improved research quality, more robust experimental designs, and enhanced ability to communicate findings to diverse stakeholders across academic and industry settings.

Usability evaluation

A survey of 100 agricultural researchers evaluated the usability, learning curve, and satisfaction of ImMLPro. The results underscore ImMLPro’s strengths, including its intuitive interface powered by Shiny integration, which supports the creation of interactive, user-friendly analytical tools, as illustrated in Fig. 7, which details the functionality and work process of ImMLPro. The learning curve was rated as moderate, accommodating both novice and experienced R users, while satisfaction was exceptionally high, attributed to the package’s robust visualization tools, modular architecture, and support for scalable algorithms. These features make ImMLPro a highly effective tool for agricultural data analysis, enhancing its accessibility and impact across diverse research applications.

Comparison with existing platforms

ImMLPro stands out from platforms like Orange, Weka, Google AutoML, and MLJAR by focusing exclusively on regression-based tasks for continuous target variables, such as predicting crop yields, tree volumes, or soil nutrient levels. This specialized approach makes it particularly well-suited for agricultural predictive analytics. Unlike its counterparts, which primarily emphasize classification tasks for categorical outcomes, ImMLPro is tailored to handle continuous data, addressing the unique needs of agriculture-related modeling.

Key distinctions include:

Accessibility: ImMLPro features a no-code Shiny interface, enabling non-technical users, such as agricultural researchers or farmers, to build and deploy regression models without programming knowledge. While Orange and Weka offer visual interfaces, they require more technical familiarity, and Google AutoML and MLJAR, though user-friendly, are less focused on agriculture-specific use cases.
Supported Algorithms: ImMLPro supports regression algorithms like Random Forest, XGBoost, Support Vector Regression (SVR), and Neural Networks, optimized for continuous variable prediction. In contrast, platforms like Orange, Weka, Google AutoML, and MLJAR offer broader algorithm suites, prioritizing classification methods (e.g., logistic regression, decision trees for categorical predictions) over regression.
Capability: ImMLPro’s design includes agriculture-optimized pre-processing pipelines and interactive visualizations tailored to continuous outcomes, such as yield or nutrient forecasts. This contrasts with the general-purpose workflows of other platforms, which lack domain-specific optimizations for agriculture.

The differences highlighted in Table 3 emphasize ImMLPro’s unique niche in providing accessible, regression-focused tools designed for agricultural analytics. These tools empower users to effectively address real-world challenges such as optimizing crop production and resource management with both precision and ease.

Table 3 Comparison of ImMLPro with other machine learning platforms.


Conclusion

ImMLPro represents a significant advancement in democratizing access to sophisticated machine learning algorithms through the integration of R programming, statistical analysis, and user-friendly web interfaces. The platform successfully addresses the critical challenge of making complex analytical methods accessible to domain experts without extensive programming backgrounds. The integration of four complementary machine learning algorithms within a unified interface provides users with powerful comparative analysis capabilities, enabling informed model selection and optimal performance for diverse applications. The platform’s focus on continuous variable prediction makes it particularly valuable for agricultural applications while maintaining broad applicability across various domains. The technical achievement of seamlessly integrating R’s statistical computing power with modern web technologies demonstrates the potential for bridging advanced analytics and practical implementation. The platform’s emphasis on user experience, educational value, and statistical rigor establishes it as a valuable tool for both research and practical applications.

ImMLPro’s contribution to digital agriculture extends beyond technical capabilities to include fostering data-driven decision-making among agricultural professionals. By eliminating coding barriers while maintaining analytical sophistication, the platform supports the broader adoption of machine learning in agricultural practice and research. The platform’s modular architecture and extensible design provide a foundation for continued development and enhancement, ensuring its relevance and utility as machine learning techniques and user needs evolve. ImMLPro represents a successful model for integrating advanced analytics with accessible interfaces, contributing to the democratization of data science across diverse domains.

Future scope

The ImMLPro platform demonstrates significant potential in enhancing agricultural research through advanced statistical computing and machine learning capabilities within the R ecosystem. Future developments could focus on integrating real-time data processing to enable dynamic predictive modeling for rapidly changing environmental conditions. Expanding compatibility with emerging machine learning frameworks and incorporating deep learning algorithms could further improve predictive accuracy and scalability. Additionally, enhancing the user interface of ImMLPro’s Shiny-based dashboard to support mobile platforms and integrating cloud-based solutions would increase accessibility for a broader range of researchers and practitioners. Exploring interoperability with IoT devices for direct data collection from agricultural fields could also streamline data integration and enhance the system’s applicability in precision agriculture. These advancements would solidify ImMLPro’s role as a versatile tool for data-driven decision-making in agricultural research.

Data availability

The platform is publicly accessible at https://imqbalstat.shinyapps.io/ImMLPro/ for research and educational purposes. The source code is accessible at https://drive.google.com/file/d/1vDdWzlG7aKYDVGCbNzawJY-aqxsUCGvb/view?usp=sharing.

Acknowledgements

The authors acknowledge the contributions of the R community and the developers of the packages that made ImMLPro possible. Special recognition goes to the agricultural research community for providing valuable feedback and use cases that informed the platform’s development.

Funding

N/A.

Author information

Authors and Affiliations

Division of Animal Genetics & Breeding, FVSc & AH, Sher-e-Kashmir University of Agricultural Sciences and Technology of Kashmir, Shalimar, Srinagar, Jammu and Kashmir, India
M. Iqbal Jeelani
Department of Plant Resources and Environment, Jeju National University, Jeju, 63243, Republic of Korea
Sheikh Mansoor
Department of Horticulture, Faculty of Agriculture, Recep Tayyip Erdoğan University, 53300, Rize, Turkey
Sheikh Mansoor

Contributions

Conceptualization: M.I.J. and M.S.; Methodology: M.I.J. and M.S.; Supervision: M.I.J.; Bioinformatical analysis and software: M.I.J.; Writing of the manuscript: M.I.J. and M.S.; Review and editing: M.I.J. and M.S.

Corresponding authors

Correspondence to M. Iqbal Jeelani or Sheikh Mansoor.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Jeelani, M.I., Mansoor, S. ImMLPro platform for accessible machine learning and statistical analysis in digital agriculture and beyond. Sci Rep 15, 34059 (2025). https://doi.org/10.1038/s41598-025-14234-2

Received: 18 June 2025
Accepted: 30 July 2025
Published: 30 September 2025
Version of record: 30 September 2025
DOI: https://doi.org/10.1038/s41598-025-14234-2
