Table 1 Strengths and weaknesses of existing resampling techniques.
From: An approach for handling imbalanced datasets using borderline shifting
Technique | Strengths | Weaknesses |
|---|---|---|
Random Undersampling (RUS) | Simple and fast; effectively reduces training time | Discards potentially useful majority class instances, leading to information loss and reduced generalization |
NearMiss | Retains informative majority instances close to minority samples | Highly sensitive to noise; may under-represent the majority class boundary |
Random Oversampling (ROS) | Easy to implement; no data is removed | Prone to overfitting due to repeated duplication of minority samples |
SMOTE | Creates synthetic samples to improve class balance and recall | May introduce noise or overlapping samples, distorting class boundaries |
Borderline-SMOTE | Focuses on generating synthetic samples near the decision boundary | Cannot control majority overlap; may introduce ambiguity near class borders |
SMOTE-Tomek (Hybrid) | Reduces class overlap and noise through Tomek link removal | Tomek removal may discard informative borderline instances |
SMOTEENN (Hybrid) | Improves noise removal and boundary clarity using Edited Nearest Neighbor | May be too aggressive; risks removing valuable samples and increasing complexity |
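
The trade-offs in the first four rows can be made concrete with a small sketch. The code below is illustrative only, written with the Python standard library; the function names (`random_undersample`, `random_oversample`, `smote_like`), the toy data, and the parameter `k` are assumptions introduced here, not the paper's implementation or any library's API. The SMOTE-style function is a simplified interpolation between a minority point and one of its nearest minority neighbours, not the full published algorithm.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """RUS sketch: keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    n_min = min(counts.values())
    keep = []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        keep.extend(rng.sample(idx, n_min))  # discarded rows are the information loss
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


def random_oversample(X, y, seed=0):
    """ROS sketch: duplicate random samples until every class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(y)
    n_max = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        idx = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(n_max - n):
            X_out.append(X[rng.choice(idx)])  # exact copies, hence the overfitting risk
            y_out.append(label)
    return X_out, y_out


def smote_like(X, y, minority, k=2, seed=0):
    """SMOTE-style sketch: interpolate toward one of the k nearest minority neighbours."""
    rng = random.Random(seed)
    counts = Counter(y)
    n_needed = max(counts.values()) - counts[minority]
    pts = [x for x, lab in zip(X, y) if lab == minority]
    X_out, y_out = list(X), list(y)
    for _ in range(n_needed):
        p = rng.choice(pts)
        # Squared Euclidean distance to every other minority point.
        others = sorted(
            (q for q in pts if q is not p),
            key=lambda q: sum((a - b) ** 2 for a, b in zip(p, q)),
        )
        q = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from p to q
        X_out.append(tuple(a + gap * (b - a) for a, b in zip(p, q)))
        y_out.append(minority)
    return X_out, y_out


# Hypothetical toy data: six majority points (label 0), two minority points (label 1).
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3),
     (0.2, 0.4), (0.4, 0.2), (1.0, 1.0), (1.1, 0.9)]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_rus, y_rus = random_undersample(X, y)
X_ros, y_ros = random_oversample(X, y)
X_smote, y_smote = smote_like(X, y, minority=1, k=1)
print(Counter(y_rus), Counter(y_ros), Counter(y_smote))
```

On this toy set, undersampling throws away four of the six majority points, oversampling adds only exact copies of the two minority points, and the SMOTE-style function adds new points that lie on the segment between them. The hybrid rows of the table (Tomek link removal, Edited Nearest Neighbor) add a cleaning pass after oversampling and are not sketched here.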