Table 2 Data preprocessing steps.
From: Securing IoT networks: a machine learning approach for detecting unusual traffic patterns
Step | Description | Techniques used | Tools/Software |
|---|---|---|---|
Data Cleaning | Removing irrelevant data, correcting errors, and handling missing values | Filtering, Imputation | Python (Pandas, NumPy) |
Data Integration | Combining data from multiple sources to create a consistent dataset | Merging, Joining | Python (Pandas) |
Data Transformation | Normalizing and scaling features to a common range or distribution | Min-Max Scaling, Z-score Normalization | Python (Scikit-learn) |
Data Reduction | Reducing data volume while preserving the same or similar analytical results | Dimensionality Reduction (PCA), Feature Selection | Python (Scikit-learn) |
Data Discretization | Converting continuous features into discrete bins | Binning, Histogram Analysis | Python (Pandas, Matplotlib) |
Feature Engineering | Creating new features to improve model performance | Feature Extraction, Feature Construction | Python (Feature-engine) |
Label Encoding | Transforming categorical labels into a numerical format | Label Encoding, One-Hot Encoding | Python (Scikit-learn) |
Dataset Balancing | Addressing the imbalance in the dataset to prevent model bias | Oversampling (SMOTE), Under-sampling | Python (Imbalanced-learn) |
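
The steps in Table 2 can be chained roughly as in the following minimal sketch. It is an illustration under stated assumptions, not the authors' pipeline: the toy data, the column names (`flow_id`, `bytes`, `duration`, `protocol`, `label`) and the `preprocess` helper are hypothetical, the step order is one plausible choice, and plain random oversampling stands in for SMOTE so that only Pandas, NumPy and Scikit-learn are needed.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

# Hypothetical toy data standing in for two traffic sources (not from the paper).
flows = pd.DataFrame({
    "flow_id": [1, 2, 3, 4, 5, 6],
    "bytes": [120.0, np.nan, 300.0, 4500.0, 80.0, 5200.0],
    "duration": [0.5, 1.2, 0.9, 10.0, 0.3, 12.5],
})
labels = pd.DataFrame({
    "flow_id": [1, 2, 3, 4, 5, 6],
    "protocol": ["tcp", "udp", "tcp", "tcp", "udp", "tcp"],
    "label": ["normal", "normal", "normal", "attack", "normal", "attack"],
})


def preprocess(flows, labels):
    """Illustrative pass over the Table 2 steps; returns (balanced frame, PCA components)."""
    # Data cleaning: impute missing numeric values with the column median.
    cleaned = flows.copy()
    numeric_cols = ["bytes", "duration"]
    cleaned[numeric_cols] = cleaned[numeric_cols].fillna(cleaned[numeric_cols].median())

    # Data integration: join the two sources on their shared key.
    df = cleaned.merge(labels, on="flow_id", how="inner")

    # Feature engineering: construct a throughput feature from existing columns.
    df["bytes_per_sec"] = df["bytes"] / df["duration"]

    # Data transformation: min-max scale the numeric features to [0, 1].
    feature_cols = ["bytes", "duration", "bytes_per_sec"]
    df[feature_cols] = MinMaxScaler().fit_transform(df[feature_cols])

    # Data discretization: split the scaled duration into three equal-width bins.
    df["duration_bin"] = pd.cut(df["duration"], bins=3, labels=["short", "medium", "long"])

    # Data reduction: project the numeric features onto two principal components.
    components = PCA(n_components=2).fit_transform(df[feature_cols])

    # Encoding: one-hot encode the protocol, integer-encode the class label.
    df = pd.get_dummies(df, columns=["protocol"])
    df["label_id"] = LabelEncoder().fit_transform(df["label"])

    # Dataset balancing: random oversampling of minority classes
    # (a simple stand-in for SMOTE, which needs imbalanced-learn).
    target = df["label_id"].value_counts().max()
    parts = []
    for _, group in df.groupby("label_id"):
        parts.append(group)
        deficit = target - len(group)
        if deficit > 0:
            parts.append(group.sample(n=deficit, replace=True, random_state=0))
    balanced = pd.concat(parts, ignore_index=True)
    return balanced, components


balanced, components = preprocess(flows, labels)
print(balanced["label"].value_counts())
print(components.shape)
```

In a real experiment the scaler, PCA and any oversampling would normally be fitted on the training split only, so that information from the test data does not leak into the model.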