Table 2 Data preprocessing steps.

From: Securing IoT networks: a machine learning approach for detecting unusual traffic patterns

| Step | Description | Techniques used | Tools/Software |
| --- | --- | --- | --- |
| Data Cleaning | Removing irrelevant data, correcting errors, and handling missing values | Filtering, Imputation | Python (Pandas, NumPy) |
| Data Integration | Combining data from multiple sources to create a consistent dataset | Merging, Joining | Python (Pandas) |
| Data Transformation | Normalizing and scaling data to a uniform format | Min-Max Scaling, Z-score Normalization | Python (Scikit-learn) |
| Data Reduction | Reducing data volume but producing the same or similar analytical results | Dimensionality Reduction (PCA), Feature Selection | Python (Scikit-learn) |
| Data Discretization | Converting continuous features into discrete bins | Binning, Histogram Analysis | Python (Pandas, Matplotlib) |
| Feature Engineering | Creating new features to improve model performance | Feature Extraction, Feature Construction | Python (Feature-engine) |
| Label Encoding | Transforming categorical labels into a numerical format | Label Encoding, One-Hot Encoding | Python (Scikit-learn) |
| Dataset Balancing | Addressing the imbalance in the dataset to prevent model bias | Oversampling (SMOTE), Under-sampling | Python (Imbalanced-learn) |
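
The preprocessing steps in Table 2 can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy data, the column names (`flow_id`, `bytes`, `duration`, `packets`, `label`) and the helper functions `make_toy_traffic` and `preprocess` are hypothetical, chosen only to show one technique per step. Two substitutions are made to keep the example limited to Pandas, NumPy and Scikit-learn: feature construction is done directly in Pandas rather than with Feature-engine, and plain random oversampling stands in for SMOTE.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from sklearn.utils import resample


def make_toy_traffic(n=200, seed=0):
    """Build two hypothetical per-flow tables that share a flow_id key."""
    rng = np.random.default_rng(seed)
    flows = pd.DataFrame({
        "flow_id": np.arange(n),
        "bytes": rng.integers(40, 1500, size=n).astype(float),
        "duration": rng.exponential(1.0, size=n),
        # Every tenth flow is labelled as an attack (imbalanced on purpose).
        "label": np.where(np.arange(n) % 10 == 0, "attack", "normal"),
    })
    flows.loc[::25, "bytes"] = np.nan  # inject missing values to clean later
    devices = pd.DataFrame({
        "flow_id": np.arange(n),
        "packets": rng.integers(1, 100, size=n).astype(float),
    })
    return flows, devices


def preprocess(flows, devices, n_bins=4, n_components=2, seed=0):
    # Data cleaning: drop duplicate flows, impute missing values with the median.
    flows = flows.drop_duplicates(subset="flow_id").copy()
    flows["bytes"] = flows["bytes"].fillna(flows["bytes"].median())

    # Data integration: join the two sources on the shared key.
    df = flows.merge(devices, on="flow_id", how="inner")

    # Feature engineering: construct a ratio feature from existing columns.
    df["bytes_per_packet"] = df["bytes"] / df["packets"]

    # Data discretization: equal-width binning of a continuous feature.
    df["duration_bin"] = pd.cut(df["duration"], bins=n_bins, labels=False)

    # Label encoding: map class names to integers.
    encoder = LabelEncoder()
    df["y"] = encoder.fit_transform(df["label"])

    # Data transformation: min-max scale numeric features to [0, 1].
    numeric = ["bytes", "duration", "packets", "bytes_per_packet"]
    scaled = MinMaxScaler().fit_transform(df[numeric])

    # Data reduction: project the scaled features onto principal components.
    reduced = PCA(n_components=n_components).fit_transform(scaled)
    out = pd.DataFrame(reduced, columns=[f"pc{i + 1}" for i in range(n_components)])
    out["duration_bin"] = df["duration_bin"].to_numpy()
    out["y"] = df["y"].to_numpy()

    # Dataset balancing: random oversampling of the minority class
    # (a simple stand-in for SMOTE, which needs imbalanced-learn).
    counts = out["y"].value_counts()
    majority, minority = counts.idxmax(), counts.idxmin()
    upsampled = resample(
        out[out["y"] == minority],
        replace=True,
        n_samples=int(counts.max()),
        random_state=seed,
    )
    balanced = pd.concat([out[out["y"] == majority], upsampled], ignore_index=True)
    return balanced, encoder
```

Calling `preprocess(*make_toy_traffic())` returns a class-balanced frame and the fitted label encoder. In a real workflow the scaler and PCA would typically be fitted on the training split only, and oversampling applied only to that split, so that no information leaks into the evaluation data.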