Scientific Reports
Air quality index AQI classification based on hybrid particle swarm and grey wolf optimization with ensemble machine learning model
  • Article
  • Open access
  • Published: 05 January 2026

  • Emad Elabd1,2,
  • Hany Mohamed Hamouda1,
  • M. A. Mohamed Ali3,
  • A. S. Hamid4 &
  • Yasser Fouad5 

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Engineering
  • Mathematics and computing

Abstract

Accurate Air Quality Index (AQI) classification is essential for environmental surveillance and public health decision-making. Using a publicly available daily U.S. county-level dataset with six AQI categories (Good, Moderate, Unhealthy for Sensitive Groups, Unhealthy, Very Unhealthy, Hazardous), we conducted a comprehensive benchmarking study. Data preprocessing included missing-value imputation and class balancing via Synthetic Minority Over-sampling Technique (SMOTE). We trained and evaluated classical and deep models (Random Forest (RF), Extra Trees (ET), K-Nearest Neighbors (KNN), Naive Bayes (NB), Logistic Regression (LR), and a Multi-Layer Perceptron (MLP)) and assessed performance using cross-validation accuracy, test accuracy, macro-averaged recall, F1-score, and ROC-AUC. Ensemble methods (RF, ET) and the MLP consistently outperformed traditional baselines. RF achieved 99.3% test accuracy with perfect recall, F1-score, and ROC-AUC; MLP achieved 99.0% test accuracy. A stacking ensemble, optimized with a hybrid Particle Swarm–Grey Wolf Optimizer (PSO–GWO), delivered 99.99% test accuracy, 99.99% macro-averaged recall, and 1.0000 ROC-AUC. These findings demonstrate that combining ensemble learning with metaheuristic optimization can substantially enhance multi-class AQI classification performance and offer a practical path toward reliable, real-time air-quality assessment.
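The abstract names a hybrid PSO–GWO optimizer but does not give its update rule, so the following is only a minimal sketch of one plausible way to combine the two metaheuristics, not the authors' method. It assumes that the three best personal-best positions act as the grey wolf leaders, that their averaged "encircling" estimate replaces the global-best attractor in an otherwise standard particle swarm velocity update, and that the objective is a toy function (`sphere`) standing in for something like a validation error of the stacking ensemble. The function name `hybrid_pso_gwo` and every parameter value (`w`, `c1`, `c2`, swarm size, iteration count) are hypothetical choices for illustration.

```python
import random


def sphere(x):
    """Toy objective with minimum 0 at the origin (stand-in for a validation error)."""
    return sum(v * v for v in x)


def hybrid_pso_gwo(objective, dim, bounds, n_agents=20, n_iter=200,
                   w=0.5, c1=1.5, c2=1.5, seed=0):
    """Illustrative PSO-GWO hybrid (an assumption, not the paper's formulation).

    Returns (best_position, best_value, best_value_per_iteration).
    """
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    vel = [[0.0] * dim for _ in range(n_agents)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    history = []

    for t in range(n_iter):
        # GWO part: the three best personal bests serve as alpha, beta, delta.
        order = sorted(range(n_agents), key=lambda i: pbest_val[i])
        leaders = [pbest[i][:] for i in order[:3]]
        a = 2.0 * (1.0 - t / n_iter)  # shrinks linearly from 2 toward 0

        for i in range(n_agents):
            for d in range(dim):
                # Average the three leader-guided estimates (GWO encircling step).
                guided = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    dist = abs(C * leader[d] - pos[i][d])
                    guided += leader[d] - A * dist
                guided /= 3.0

                # PSO part: inertia, pull toward own best, pull toward GWO estimate.
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (guided - pos[i][d]))
                # Keep the agent inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))

            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val

        history.append(min(pbest_val))

    best = min(range(n_agents), key=lambda i: pbest_val[i])
    return pbest[best][:], pbest_val[best], history


if __name__ == "__main__":
    best_pos, best_val, _ = hybrid_pso_gwo(sphere, dim=2, bounds=(-5.0, 5.0))
    print(best_pos, best_val)
```

To tune a classifier with a routine like this, each position vector would be decoded into hyperparameters and `objective` would return an error estimate such as one minus cross-validation accuracy. Which quantities the study actually optimizes is not stated in the abstract.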

Data availability

The data that support the findings of this study are available at https://www.epa.gov/outdoor-air-quality-data/air-quality-index-report.

Code availability

The code used in this study is available from the corresponding author upon reasonable request.

Acknowledgements

The authors thank the Deanship of Graduate Studies and Scientific Research at Qassim University for financial support (QU-APC-2026).

Author information

Authors and Affiliations

  1. Department of Management Information Systems, College of Business and Economics, Qassim University, Buraidah, Qassim, 51452, Saudi Arabia

    Emad Elabd & Hany Mohamed Hamouda

  2. Department of Information Systems, Faculty of Computers and Information, Menoufia University, Shebin El Kom, Egypt

    Emad Elabd

  3. Department of Mathematics, College of Science, Qassim University, Buraidah, Qassim, 51452, Saudi Arabia

    M. A. Mohamed Ali

  4. Department of Physics, College of Science, Qassim University, Buraidah, Qassim, 51452, Saudi Arabia

    A. S. Hamid

  5. Department of Computer Science, Faculty of Computers and Information, Suez University, P.O. Box 43221, Suez, Egypt

    Yasser Fouad

Contributions

Yasser Fouad conceived and designed the study; developed the methodology; implemented the software; performed validation and formal analysis; conducted the investigation; curated the data; wrote the original draft; contributed to review and editing; and prepared the visualizations. Emad Elabd conceived and designed the study, contributed to the methodology, conducted the investigation, and reviewed and edited the manuscript. M. A. Mohamed Ali contributed to the methodology, performed validation and formal analysis, and participated in review and editing. Hany Mohamed Hamouda provided resources, curated the data, contributed to visualization, and participated in review and editing. A. S. Hamid participated in review and editing. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Emad Elabd or Yasser Fouad.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Elabd, E., Hamouda, H.M., Ali, M.A.M. et al. Air quality index AQI classification based on hybrid particle swarm and grey wolf optimization with ensemble machine learning model. Sci Rep (2026). https://doi.org/10.1038/s41598-025-34278-8

  • Received: 06 September 2025

  • Accepted: 26 December 2025

  • Published: 05 January 2026

  • DOI: https://doi.org/10.1038/s41598-025-34278-8

Keywords

  • Air quality index
  • Environmental monitoring
  • Air pollution
  • Machine learning
  • Air quality classification
  • Ensemble machine learning
  • Particle swarm and grey wolf optimization
  • Metaheuristic optimization

This article is cited by

  • Hybrid deep learning model for air quality prediction and its impact on healthcare

    • Tanisha Madan
    • Shrddha Sagar
    • Arvind Panwar

    Scientific Reports (2026)

Scientific Reports (Sci Rep)

ISSN 2045-2322 (online)

© 2026 Springer Nature Limited
