Intrusion Detection Method Based on Preprocessing of Highly Correlated and Imbalanced Data

Abstract

This paper examines traditional machine learning algorithms, neural networks, and the benefits of ensemble models. It considers data preprocessing methods for improving the quality of classification models. To balance the classes, undersampling, oversampling, and their combination (over- plus undersampling) are explored, and a procedure for reducing feature correlation is proposed. Classification models based on individual classifiers (SVM, KNN, Naive Bayes, Perceptron) and on meta-algorithms (Bagging, Random Forest, AdaBoost, Gradient Boosting) are investigated, and the settings of the base classifiers and the meta-algorithm parameters are optimized. The best result was obtained with an ensemble classifier based on the Random Forest algorithm. The outcome is an intrusion detection method based on the preprocessing of highly correlated and imbalanced data. The scientific novelty of the results lies in the integrated use of the developed procedure for reducing feature correlation, the SMOTEENN data balancing method, the selection of an appropriate classifier, and the fine-tuning of its parameters. Integrating these procedures and methods yielded a higher F1 score, shorter training time, and faster recognition. On this basis, the method is recommended for practical use to improve the quality of network intrusion detection.
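
The abstract does not spell out the feature-correlation procedure, so the following is only an illustrative sketch of one common approach, not the authors' method: a greedy filter that keeps a feature only if its absolute Pearson correlation with every already-kept feature stays at or below a threshold. The threshold of 0.9, the function names, and the toy feature names are all assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treated as uncorrelated here
    return cov / sqrt(vx * vy)


def drop_correlated(columns, threshold=0.9):
    """Greedy filter (assumed procedure, not taken from the paper).

    columns: dict mapping feature name -> list of values.
    Returns the names of the features that are kept, in input order.
    """
    kept = []
    for name, values in columns.items():
        # Keep the feature only if it is not strongly correlated
        # with any feature that has already been kept.
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data with hypothetical feature names.
features = {
    "duration": [1, 2, 3, 4, 5],
    "bytes_sent": [2, 4, 6, 8, 10],  # exact multiple of "duration"
    "error_rate": [5, 1, 4, 2, 3],
}
selected = drop_correlated(features, threshold=0.9)
print(selected)
```

In a fuller pipeline of the kind the abstract describes, the retained features would then be rebalanced (for example with `SMOTEENN` from the imbalanced-learn package) and passed to a classifier such as scikit-learn's `RandomForestClassifier`; those steps are omitted here to keep the sketch dependency-free.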

Citation

Intrusion Detection Method Based on Preprocessing of Highly Correlated and Imbalanced Data [Electronic resource] / Serhii Semenov [et al.] // Applied Sciences. – Electronic text data. – 2025. – Vol. 15. – P. 1-15. – Access mode: https://www.mdpi.com/2076-3417/15/8/4243, free (date of access 10.11.2025).
