Abstract
Intrusion detection systems (IDS) play a vital role in protecting computer networks from malicious activity. Dimensionality reduction techniques are commonly employed to improve the effectiveness and accuracy of machine-learning-based IDS. In this study, we propose a dimensionality reduction technique for intrusion detection systems called the feature importance-based autoencoder (FI-AE). The approach has three key components. First, we introduce a novel feature importance method, one-versus-all feature importance (OVA), which is built on the random forest algorithm. Next, we train an autoencoder with a weighted loss function that takes into account the feature importance values obtained through the OVA method. Finally, we use the trained autoencoder to reduce the number of features in the benchmark datasets and then apply a random forest classifier to the reduced datasets. We tested the proposed model on three well-known datasets: NSL-KDD, UNSW-NB15, and CIC-IDS2017. In these experiments, the random forest classifier combined with our proposed model outperformed previous dimensionality reduction techniques in terms of accuracy and F1-score.
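The abstract does not spell out how the OVA importances are aggregated or how they enter the loss, so the following is only a minimal sketch of one plausible reading. It assumes that a random forest is fitted for each "one class versus all others" task, that the per-class importances are averaged and rescaled to sum to 1, and that the weighted loss is a mean squared reconstruction error with each feature's squared error multiplied by its importance. The function names `ova_feature_importance` and `weighted_mse` are hypothetical; `RandomForestClassifier` and its `feature_importances_` attribute come from scikit-learn. The autoencoder itself is omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def ova_feature_importance(X, y, n_estimators=50, random_state=0):
    """Average random-forest importances over one-versus-all binary tasks.

    Assumption: per-class importances are combined by a plain mean and
    rescaled to sum to 1; the paper may combine them differently.
    """
    per_class = []
    for c in np.unique(y):
        # Binary task: class c versus every other class.
        forest = RandomForestClassifier(
            n_estimators=n_estimators, random_state=random_state
        )
        forest.fit(X, (y == c).astype(int))
        per_class.append(forest.feature_importances_)
    importance = np.mean(per_class, axis=0)
    return importance / importance.sum()


def weighted_mse(x, x_hat, weights):
    """Reconstruction error with each feature's squared error scaled by its weight.

    Assumption: this stands in for the weighted autoencoder loss; x and x_hat
    have shape (n_samples, n_features) and weights has shape (n_features,).
    """
    per_sample = np.sum(weights * (x - x_hat) ** 2, axis=1)
    return float(np.mean(per_sample))
```

In an autoencoder training loop, the vector returned by `ova_feature_importance` would be passed as `weights`, so that reconstruction errors on features the forests consider informative are penalized more heavily than errors on uninformative ones.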
Data availability
The datasets used in this research are available online: NSL-KDD, https://www.unb.ca/cic/datasets/nsl; UNSW-NB15, https://research.unsw.edu.au/projects/unsw-nb15-dataset; CIC-IDS2017, https://www.unb.ca/cic/datasets/ids-2017.
Funding
This research was funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2026R234), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Author information
Contributions
M.A. and A.S. conceived the study. M.A. and S.A. performed the experiments and data analysis. M.S., A.M. and T.S. contributed to methodology and resources. M.S. and A.B. drafted the manuscript, and all authors reviewed and approved the final version.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Abdel-Rahman, M.A., Alluhaidan, A.S., El-Rahman, S.A. et al. Feature importance guided autoencoder for dimensionality reduction in intrusion detection systems. Sci Rep (2026). https://doi.org/10.1038/s41598-026-36695-9
DOI: https://doi.org/10.1038/s41598-026-36695-9