Abstract
Accurate detection and classification of Power Quality Disturbances (PQDs) are critical for ensuring reliability in modern distribution networks with high renewable energy penetration. This study evaluates four signal decomposition methods, namely Empirical Mode Decomposition (EMD), Ensemble EMD (EEMD), Complete Ensemble EMD with Adaptive Noise (CEEMDAN), and Variational Mode Decomposition (VMD), in conjunction with a Random Forest Classifier (RFC) for PQD classification. The IEEE-1159 synthetic benchmark dataset comprising fifteen single and multiple PQD classes was used for training and testing, with 5-fold cross-validation employed to ensure robustness. Hyperparameter tuning of the RFC was performed using grid search to optimize the number of trees, tree depth, and feature selection strategy. Among the methods, VMD + RFC consistently outperformed the EMD-family techniques, achieving a confusion matrix–based accuracy of 99.16% and a cross-validation accuracy of 94.6% ± 1.42 (95% CI). Paired t-tests confirmed that the accuracy improvements of VMD over the other decomposition methods were statistically significant (p < 0.05). Beyond synthetic benchmarks, the proposed approach was validated on a field dataset from a university campus in India, collected at the point of common coupling (PCC) where a 500 kWp photovoltaic (PV) system is integrated. Here, the VMD + RFC predictions closely align with the reference power quality analyzer (PQA) logs, demonstrating strong generalization capability. The results establish VMD + RFC as a robust and computationally efficient approach for PQD classification, combining superior accuracy, statistical significance, and real-world applicability. This contributes both methodological rigor and practical validation, distinguishing the work from prior studies limited to smaller PQD sets or lacking external verification.
Introduction
Ensuring stable and high-integrity power quality is essential for the efficient operation of power systems and the reliable performance of electrical equipment. Issues such as voltage fluctuations, frequency variations, excessive harmonic distortion, and current imbalances can significantly affect the overall power quality1. Current asymmetry and harmonic content can cause equipment damage and lower system energy efficiency2. Voltage stability is critical for the proper functioning of electrical devices, whereas frequency stability is vital for the reliable operation of synchronous equipment. Nonlinear loads, the integration of renewable energy resources, and device failures are the main factors that cause power quality issues3. Common power quality issues include harmonics and current asymmetry resulting from nonlinear loads, whereas the intermittent nature of renewable energy sources further complicates the maintenance of stable power quality. Voltage dips, voltage surges, and a variety of other problems can result from equipment malfunctions, which further impact the stability of power systems4. Ensuring high power quality requires the combined use of both proven and cutting-edge technologies aimed at overcoming power quality problems. Technologies such as power electronic devices, static reactive power compensators, and intelligent dispatch and control systems play vital roles in maintaining the quality of power. Comprehensive analysis and continuous monitoring enable the rapid detection and resolution of power quality issues, thereby supporting the stable operation of power systems, enhancing energy efficiency, prolonging equipment lifespan, and ensuring the safe and reliable performance of end-user devices. Power quality (PQ) refers to a set of standards that define the control and reliability of the electrical signal delivered to users. 
These requirements are outlined in various regulatory standards, including IEEE 1159–2009, which provides guidelines for monitoring electric power quality5.
Harmonics and interharmonics must be identified to perform harmonic analysis and manage them effectively. Harmonic detection has advanced considerably through the work of researchers across the globe. However, interharmonic detection remains problematic and a key area of focus for power quality studies in the power grid because of the high algorithmic accuracy it demands6. The Fourier transform is a traditional method that decomposes a continuous-time signal of infinite duration into its frequency components, which makes it suited to the frequency analysis of stationary signals. The Fast Fourier Transform (FFT)7, Discrete Fourier Transform (DFT)8, and Short-Time Fourier Transform (STFT)9 are a few popular Fourier transform techniques. The DFT and FFT are excellent for analyzing periodic signals and harmonics, but they have trouble with transient and nonstationary disturbances. The STFT addresses this by examining signal segments, but its temporal resolution is constrained by its constant window size. The Wavelet Transform (WT), which encompasses both the continuous and discrete wavelet transforms (CWT and DWT, respectively), is more appropriate for studying nonstationary and transient signals, but it is susceptible to noise, may have difficulties with complicated signals, and necessitates careful parameter optimization. The Fourier Transform (FT) and the WT are thus two traditional PQD analysis techniques that have shortcomings when dealing with nonlinear and nonstationary signals.
Recent studies in related fields have also highlighted the value of advanced time–frequency signal processing and deep learning for handling non-stationary disturbances. For example, Ullah et al.10 developed a deep learning framework for centrifugal pump fault diagnosis using wavelet coherence analysis and S-transform scalograms, enabling convolutional networks to learn discriminative time–frequency features. Similarly, Ullah et al.11 applied acoustic emission signal processing with machine learning algorithms to detect pipeline leakage, achieving superior accuracy through statistical and frequency-domain feature extraction. More recently, Zaman et al.12 proposed a dual-input CNN architecture that integrates acoustic emission and vibration signals for multimodal fault classification, demonstrating the effectiveness of hybrid signal representations. These works emphasize that combining advanced decomposition with machine learning or deep learning significantly improves classification robustness, thereby supporting the methodological choice of applying VMD-based feature extraction and Random Forest classification in the present study.
Among signal processing techniques, EMD, EEMD, CEEMDAN, and VMD have gained prominence as ways to overcome these difficulties. As a feature extraction technique, the EMD method decomposes the signal according to its intrinsic time scales, making it appropriate for the study of nonlinear and nonstationary signals13. EMD was widely utilized to identify PQDs shortly after its introduction. In14,15, the authors introduced an EMD-based PQD classification approach for detecting and classifying nine different PQDs under high noise contamination at 20 dB, in which EMD is applied directly to the PQD signal to produce intrinsic mode functions (IMFs). In16, the instantaneous spectrum is then obtained by applying the Hilbert transform (HT) to each IMF, from which the physically meaningful frequency is determined. In reference17, the authors compared EMD and the S-transform for detecting voltage spikes and gaps in PQDs, without addressing issues such as mode mixing and noise sensitivity, and the findings suggested that EMD outperforms the S-transform. According to reference18, the empirical wavelet transform (EWT) may offer a more consistent decomposition than EMD does. Moreover, in reference19, the authors applied EMD to real-time monitoring of power system variations and reported that its resolution for detecting PQ events was lower than that of the Discrete Wavelet Transform (DWT).
In20, the Teager energy operator was employed to obtain the instantaneous amplitude and frequency, and EEMD with approximate entropy was recommended to resolve the mode-mixing problem associated with EMD. In reference21, white noise was introduced to aid the EEMD algorithm. In22, a denoising and feature extraction approach based on EMD correlation successfully suppressed noise and lowered noise sensitivity. The orthogonal empirical mode decomposition method was introduced in23, together with its field-programmable gate array (FPGA) implementation. To overcome the sampling-rate limitation of EMD, an iterative downsampling stage was integrated into EMD in24, whereas Salameh et al.25 developed a novel fast sliding-window EMD. These two strategies decrease the computational cost of EMD and also improve IMF extraction, so that harmonics may be derived in real time from the nonstationary signal.
CEEMDAN is a noise-assisted EMD approach that improves EEMD by introducing paired positive and negative noise to reduce the residual noise. While it still requires careful selection of the noise amplitude, it is more efficient than EEMD (fewer ensemble trials are required) and minimizes spurious IMF components. The word “complete” presumably refers to the completeness of the decomposition, including the introduced disturbance (noise). To extract IMFs with clear frequency ranges, VMD employs constrained optimization. Although it needs parameter adjustments (such as the number of modes and bandwidth), VMD is more resistant to noise and suppresses mode mixing better than EMD-based methods.
The second step in identifying power quality disturbance signals is pattern recognition of the extracted feature information. Most pattern recognition methods rely on artificial intelligence techniques for classification and decision-making. Among these, machine learning, the fastest-growing field within artificial intelligence, has become the most commonly used approach for pattern recognition. Machine learning algorithms enable computers to analyze vast amounts of data, identify patterns, and intelligently recognize and predict new samples. The evolution of machine learning has seen two key phases: traditional machine learning and deep learning (DL).
Popular machine learning recognition techniques include decision trees (DTs)26, artificial neural networks (ANNs)27,28,29,30,31,32, support vector machines (SVMs)33,34,35,36,37,38,39, and random forests (RFs)40. The mathematical algorithm known as ANN replicates the characteristics of human neural networks for parallel distributed processing. It may modify the weights and biases between different neurons to carry out tasks such as function approximation and pattern identification. Owing to its simplicity, ANN is extensively used in power quality disturbance identification and excels at handling nonlinear data. In Reference41, a composite power quality disturbance identification technique is introduced, which uses dual neural networks in which an adaptive linear network is responsible for extracting feature vectors to detect and classify disturbances such as surges, harmonics, and outages. Reference42 presented a method that integrates artificial neural networks (ANNs) with the Hilbert transform (HT), applying empirical mode decomposition (EMD) to decompose signals into intrinsic mode functions (IMFs) and subsequently extracting instantaneous amplitude and spectral characteristics for the classification of transient disturbances. Despite their classification capabilities, ANNs often require significant training durations, which can hinder their practical application in real-time power quality monitoring scenarios. Additionally, their performance on campus datasets typically necessitates a large volume of training samples. In contrast, decision trees (DTs) simulate human logical reasoning processes to generate rule-based classification models, offering a simple structural design and ease of implementation. Reference43 proposed a rule-based DT classification strategy, which uses features extracted from the t FFT, a dynamic measurement algorithm, and the S-transform. 
Similarly, Reference44 extracts classification features on the basis of curve variations obtained through HT transformation and adaptively establishes threshold values for these features within a DT-based classification model. However, DTs have drawbacks, including a sluggish rate of convergence, high sensitivity to noise, and strong reliance on training data. In recent years, RFs have emerged as a well-liked learning method that builds on DTs. The RF classifier is a large ensemble classifier that integrates the bagging algorithm with the random subspace algorithm45 and is composed of numerous tree-structured DT classifiers. The idea behind this is to increase the precision of prediction and categorization by combining several DTs. In the field of power quality disturbance signal identification, Reference46 demonstrated high classification accuracy by integrating the RF algorithm with an enhanced S-transform. However, the primary limitation of RF lies in the substantial computational time and memory requirements when managing many decision trees, which restricts its scalability in power quality applications. The support vector machine (SVM), as noted in Reference47, is a supervised learning method that is well suited for nonlinear system modeling and pattern recognition. It addresses several limitations of artificial neural networks (ANNs), such as slow convergence and susceptibility to local minima, by offering advantages, including strong global optimization capability, high generalization performance, and relatively fast training, making it particularly suitable for classification tasks involving small sample sizes. Reference48 developed an integrated model that combines an improved wavelet transform with an SVM for power quality disturbance classification. Additionally, Reference49 employs a hybrid approach by combining a genetic algorithm with an SVM to construct a disturbance identification model, reporting promising experimental results. 
The authors of50 combined an artificial neural network (ANN) with an EMD-based feature extraction technique to classify nine low-SNR PQDs, reporting a 100% accuracy rate. The authors of51 proposed a hybrid approach combining the Hilbert transform (HT), the CEEMDAN technique, and an extreme learning machine (ELM) that can successfully handle nonlinear and nonstationary PQDs under intense noise interference, increasing the identification accuracy above 95% compared with that of conventional methods such as CNNs and SVMs. The authors of reference52 presented a classification approach for PQDs based on the VMD algorithm and an enhanced support vector machine (SVM) model, with a 98% accuracy rate.
Despite considerable progress in PQD detection and classification, existing studies still face several challenges. Traditional decomposition techniques such as EMD and its variants often suffer from mode mixing and bandwidth leakage, which limit their effectiveness in extracting reliable features from nonlinear and nonstationary signals. Furthermore, many prior works have been restricted to a narrow set of disturbance types (typically 5–7 classes), thus limiting their generalizability to real-world operating conditions. Another limitation is the reliance on diverse, non-standard datasets and the use of conventional classifiers such as SVM, KNN, or DT, which may not ensure robust performance across different scenarios.
To address these gaps, this study introduces a novel framework that integrates Variational Mode Decomposition with a Random Forest Classifier for the classification of power quality disturbances. The key novelties of this work are threefold:
1. Advanced signal decomposition: Unlike empirical methods, VMD formulates decomposition as a constrained variational optimization problem in the frequency domain, producing narrowband and non-overlapping modes. This property enables effective isolation of transients and harmonics, thereby overcoming issues of mode mixing and bandwidth leakage.
2. Robust classification through ensemble learning: The use of RFC leverages ensemble-based bagging to combine multiple decision trees, improving classification robustness, reducing overfitting, and enhancing generalization compared to single-model classifiers.
3. Comprehensive evaluation on benchmark and real-world data: While most prior works rely solely on synthetic or limited datasets, this study evaluates the proposed approach on both the IEEE 1159 synthetic dataset (covering 15 classes of single and multiple disturbances) and real-world campus data from a 500 kWp PV-integrated distribution system. The validation against practical PQA logs demonstrates that VMD + RFC achieves both high accuracy and strong generalization, making it suitable for real-world deployment.
Taken together, these contributions establish the novelty of this study as the first statistically validated PQD classification framework that combines VMD-based feature extraction with RFC, evaluated on both standardized and real-world datasets. This dual validation demonstrates superior accuracy, robustness, and practical feasibility compared to traditional methods.
The paper is organized as follows. The Methodology section describes the generation of synthetic and campus data signals and the advanced signal processing techniques used to extract meaningful features for classifying PQDs. The Evaluation of the decomposition methods section discusses the effectiveness of VMD in extracting features from the synthetic and campus data signals. The Evaluation of PQDs classification section discusses the effectiveness of the machine learning method in classifying the PQDs. The Conclusions section presents the conclusions drawn and the future scope.
Methodology
This section describes the step-by-step approach used for PQD detection and classification. The process consists of five key stages: signal generation, signal processing, feature extraction, training a machine learning model, and classification of PQDs. The structural diagram of the proposed system is shown in Fig. 1.
Signal generation
Synthetic power quality signals were generated to simulate fifteen types of disturbances commonly observed in electrical networks, following the definitions outlined in IEEE 1159 specifications52. The synthetic dataset was designed with dimensions of 7500 × 1000 samples, where the signal amplitudes were normalized to unity to maintain consistency in feature scaling. Each disturbance sample was labeled according to the IEEE-1159 standard categories of PQDs. To capture the electrical waveform accurately, 100 samples per cycle were acquired at a sampling frequency of 5 kHz for a 50 Hz system.
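As a minimal sketch of how one such waveform could be synthesized, the following Python snippet generates a single "harmonics with sag" signal at the stated 5 kHz sampling rate with 1000 samples (ten cycles of the 50 Hz fundamental). The function name, sag depth, sag interval, and harmonic amplitudes are illustrative assumptions and are not the parameters used to build the dataset in this study.

```python
import numpy as np

FS = 5000         # sampling frequency in Hz (from the text)
F0 = 50           # fundamental frequency in Hz (from the text)
N_SAMPLES = 1000  # samples per signal (from the 7500 x 1000 dataset shape)

def sag_with_harmonics(depth=0.5, t_start=0.06, t_end=0.14, h3=0.15, h5=0.10):
    """Return (t, x): a sine with a voltage sag plus 3rd and 5th harmonics.

    depth, t_start, t_end, h3 and h5 are hypothetical illustrative values.
    """
    t = np.arange(N_SAMPLES) / FS
    # Envelope drops to (1 - depth) inside the sag window and is 1 elsewhere.
    envelope = 1.0 - depth * ((t >= t_start) & (t < t_end))
    x = envelope * (np.sin(2 * np.pi * F0 * t)
                    + h3 * np.sin(2 * np.pi * 3 * F0 * t)
                    + h5 * np.sin(2 * np.pi * 5 * F0 * t))
    # Normalize the peak amplitude to unity, as described for the dataset.
    return t, x / np.max(np.abs(x))

t, x = sag_with_harmonics()
samples_per_cycle = FS // F0  # 5000 / 50 samples in one fundamental cycle
```

Repeating such a generator over randomized parameters for each of the fifteen classes would yield a labeled matrix of the kind described above.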
In parallel, real-world power quality data was obtained from a university campus distribution system to complement the synthetic dataset. The campus under study has a contracted maximum demand of 500 kVA and operates a 500 kWp captive solar power plant. Due to the intermittent nature of solar resources and the continuous switching of nonlinear loads, the occurrence of PQDs in the distribution system is frequent. To capture these events, datasets were logged using a power quality analyzer (PQA). This represents the first attempt to systematically estimate and validate the power quality of a university campus using both synthetic and real-time datasets. The campus dataset therefore forms a practical foundation for evaluating PQD detection and classification methods in this research.
Signal processing
Signal processing is crucial in detecting, analyzing, and classifying PQDs to ensure a stable and efficient power system. In this study, a power quality disturbance signal f(t) is decomposed into a set of intrinsic mode functions (IMFs) via four advanced decomposition techniques. In VMD, the decomposition is formulated as a constrained variational optimization problem, as shown in Eq. (1), aiming to minimize the bandwidth of each mode while preserving the integrity of the original signal by summing all the IMFs. The objective function is to minimize the total bandwidth of all decomposed modes, and the constraint ensures signal reconstruction, i.e., without any information loss. This mathematical technique ensures that each mode is concentrated around a center frequency (a narrow band) and that all modes together describe the original signal accurately.
Subject to: \(\sum_{k=1}^{K} u_k(t) = f(t)\)
- \(u_k(t)\): the kth mode (a narrowband signal component).
- \(\omega_k\): the estimated center frequency of \(u_k\).
- \(u_k(t)\,e^{-j\omega_k t}\): shifts the mode to baseband so that its bandwidth can be measured.
- \(\partial_t\left(u_k(t)\,e^{-j\omega_k t}\right)\): the gradient (time derivative) used to measure the bandwidth in the frequency domain.
- \(\|\cdot\|_2^2\): squared \(L^2\) norm, which quantifies the total bandwidth.
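For reference, the objective and constraint of Eq. (1) can be reconstructed from the symbol definitions listed here as shown below. This is a sketch under the notation of this section, not necessarily the authors' exact expression; in particular, the standard VMD formulation applies the derivative to the analytic signal of each mode, \(\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\), which the simplified notation here omits.

```latex
\begin{equation}
\min_{\{u_k\},\,\{\omega_k\}} \; \sum_{k=1}^{K}
  \left\| \partial_t \left[ u_k(t)\, e^{-j\omega_k t} \right] \right\|_2^2
\quad \text{subject to} \quad
\sum_{k=1}^{K} u_k(t) = f(t)
\tag{1}
\end{equation}
```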
The variational optimization problem is addressed via the method of Lagrange multipliers, specifically the augmented Lagrangian technique, which transforms the constrained optimization problem into an iterative, solvable form. First, the Lagrangian function in Eq. (2) combines the objective and the constraint via a Lagrange multiplier λ(t), where:
λ(t): Lagrange multiplier that enforces the reconstruction constraint.
α: Balancing parameter for bandwidth minimization.
To improve convergence, a quadratic penalty term, which enforces the constraint softly by strongly penalizing its violations, is added to Eq. (2) to form the augmented Lagrangian given in Eq. (3), where:
µ: Penalty parameter for constraint violation.
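Under the definitions given in this section (α weighting the bandwidth term, λ(t) enforcing reconstruction, and µ penalizing constraint violation), Eqs. (2) and (3) plausibly take the following form. This is a hedged reconstruction from the surrounding text, and the exact scaling of each term in the authors' formulation is an assumption.

```latex
\begin{align}
\mathcal{L}\big(\{u_k\},\{\omega_k\},\lambda\big)
  &= \alpha \sum_{k=1}^{K}
     \left\| \partial_t \left[ u_k(t)\, e^{-j\omega_k t} \right] \right\|_2^2
   + \left\langle \lambda(t),\; f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle
   \tag{2} \\
\mathcal{L}_{\mu}\big(\{u_k\},\{\omega_k\},\lambda\big)
  &= \mathcal{L}\big(\{u_k\},\{\omega_k\},\lambda\big)
   + \frac{\mu}{2} \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2
   \tag{3}
\end{align}
```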
The augmented Lagrangian is then minimized iteratively with the alternating direction method of multipliers (ADMM). This decomposes the signal into narrowband modes while enforcing that the sum of the modes equals the original signal. The method provides adaptive, efficient, and accurate mode decomposition, which makes it well suited to analyzing complex, nonstationary signals such as those in power quality studies. Table 1 shows the parameters used for extracting the features of the PQD signals via the VMD technique.
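The ADMM iteration can be illustrated with a compact frequency-domain sketch. The code below follows the widely used update rules of the original VMD algorithm (a Wiener-filter-like mode update, a power-weighted center-frequency update, and dual ascent on the multiplier), but it is a simplified illustration: it omits the mirror extension of the signal, works on the one-sided spectrum, and uses hypothetical default values for `alpha`, `tau`, and the stopping tolerance rather than the settings in Table 1.

```python
import numpy as np

def vmd_sketch(f, K=4, alpha=2000.0, tau=0.0, n_iter=500, tol=1e-9):
    """Simplified VMD: returns (modes, center_freqs) sorted by frequency.

    center_freqs are normalized (cycles per sample, between 0 and 0.5).
    """
    f = np.asarray(f, dtype=float)
    n = f.size
    f_hat = np.fft.rfft(f)                   # one-sided spectrum of the input
    freqs = np.fft.rfftfreq(n)               # normalized frequency axis
    u_hat = np.zeros((K, freqs.size), dtype=complex)
    omega = 0.5 * (np.arange(K) + 0.5) / K   # uniformly spread initial centers
    lam = np.zeros(freqs.size, dtype=complex)
    eps = 1e-12

    for _ in range(n_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            # Residual seen by mode k: the input minus all other modes.
            residual = f_hat - u_hat.sum(axis=0) + u_hat[k]
            # Wiener-filter-like update centered on omega[k].
            u_hat[k] = (residual + lam / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # New center frequency: power-weighted mean frequency of the mode.
            power = np.abs(u_hat[k]) ** 2
            omega[k] = np.dot(freqs, power) / (power.sum() + eps)
        # Dual ascent on the reconstruction constraint (inactive when tau = 0).
        lam = lam + tau * (f_hat - u_hat.sum(axis=0))
        change = np.sum(np.abs(u_hat - u_prev) ** 2) / (np.sum(np.abs(u_prev) ** 2) + eps)
        if change < tol:
            break

    order = np.argsort(omega)
    modes = np.fft.irfft(u_hat[order], n=n, axis=1)
    return modes, omega[order]

# Toy usage: a 50 Hz fundamental plus a 1 kHz oscillation, sampled at 5 kHz.
fs = 5000
t = np.arange(1000) / fs
signal = np.sin(2 * np.pi * 50 * t) + 0.3 * np.sin(2 * np.pi * 1000 * t)
modes, centers = vmd_sketch(signal, K=2)
centers_hz = centers * fs
```

In this toy case the two estimated center frequencies should settle near the two tones, which illustrates how each mode is concentrated around its own narrow band.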
Feature extraction
After decomposing a signal into multiple intrinsic mode functions (IMFs) using different signal decomposition techniques, representative features are extracted from the resulting components to facilitate power quality disturbance (PQD) classification. Feature extraction is typically performed in three domains: time, frequency, and time–frequency. In the time domain, statistical descriptors such as mean, standard deviation, root mean square (RMS), peak value, skewness, kurtosis, energy (sum of squares of the IMF signal), and entropy (Shannon or log-energy) are widely adopted, as they are particularly effective for capturing transient disturbances such as sags, swells, and flickers. In the frequency domain, features including dominant frequency, spectral bandwidth, and power distribution across frequency bands, usually computed via FFT, are instrumental in detecting harmonic distortions, notches, and other steady-state frequency-related PQDs. In the time–frequency domain, features derived from Hilbert–Huang Transform (HHT) analysis or related approaches, such as Hilbert spectrum energy and instantaneous frequency, provide insights into the temporal evolution of spectral content and are valuable for identifying non-stationary and transient PQDs.
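To make the feature definitions concrete, the sketch below computes a subset of the time- and frequency-domain descriptors for a single IMF using NumPy only. The function name, the dictionary keys, and the choice of the spectral standard deviation as "spectral bandwidth" are illustrative assumptions; the entropy and Hilbert-based features are omitted for brevity.

```python
import numpy as np

def imf_features(imf, fs=5000):
    """Compute a subset of time- and frequency-domain features of one IMF."""
    imf = np.asarray(imf, dtype=float)
    mu, sigma = imf.mean(), imf.std()
    centered = imf - mu
    eps = 1e-12                                    # guards against division by zero

    spectrum = np.abs(np.fft.rfft(imf)) ** 2       # power spectrum
    freqs = np.fft.rfftfreq(imf.size, d=1 / fs)    # frequency axis in Hz
    p = spectrum / (spectrum.sum() + eps)          # normalized spectral distribution
    centroid = np.sum(freqs * p)

    return {
        "mean": mu,
        "std": sigma,
        "rms": np.sqrt(np.mean(imf ** 2)),
        "peak": np.max(np.abs(imf)),
        "skewness": np.mean(centered ** 3) / (sigma ** 3 + eps),
        "kurtosis": np.mean(centered ** 4) / (sigma ** 4 + eps) - 3.0,  # excess kurtosis
        "energy": np.sum(imf ** 2),
        "dominant_freq": freqs[np.argmax(spectrum)],
        "spectral_bandwidth": np.sqrt(np.sum(((freqs - centroid) ** 2) * p)),
    }

# Toy usage on a pure 50 Hz sine sampled at 5 kHz.
t = np.arange(1000) / 5000
feats = imf_features(np.sin(2 * np.pi * 50 * t))
```

Applying such a function to every IMF of every signal and concatenating the results gives one feature vector per signal for the classifier.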
Although thirteen features were initially extracted from each IMF across these domains, not all features contribute equally to classification performance. To assess their discriminative power, a Random Forest–based feature importance analysis was conducted. The results (Fig. 2) revealed that features such as mean, RMS, and dominant frequency consistently provided the highest contribution to PQD classification, while features such as kurtosis and peak value played a relatively minor role. This analysis not only validates the relevance of the selected feature set but also underscores the potential for dimensionality reduction in future work. By emphasizing the most discriminative features, the classification pipeline can be optimized for computational efficiency without compromising accuracy.
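A Random Forest importance ranking of this kind can be sketched with scikit-learn as follows. The data here are synthetic toy values constructed so that only the first three columns carry class information; they do not reproduce Fig. 2, and the feature names are attached purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
feature_names = ["mean", "rms", "dominant_freq", "kurtosis", "peak"]

# Hypothetical toy data: three classes, 600 samples, five features.
y = rng.integers(0, 3, size=600)
X = rng.normal(size=(600, len(feature_names)))
X[:, :3] += y[:, None]   # only the first three features depend on the class

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; sort them from most to least important.
ranking = sorted(zip(feature_names, rf.feature_importances_),
                 key=lambda item: item[1], reverse=True)
```

Features at the bottom of such a ranking are natural candidates for removal when reducing dimensionality.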
Random forest classifier for PQD classification: hyperparameter optimization and performance evaluation
This research adopts a robust RFC based framework for the classification of PQDs, where decomposition-driven feature extraction is integrated with ensemble learning. Advanced signal decomposition techniques (EMD, EEMD, CEEMDAN, and VMD) were employed to analyze nonlinear and nonstationary power signals, enabling the extraction of representative features such as mode energy, entropy, standard deviation, and statistical moments. The RFC leverages an ensemble of decision trees, each trained on bootstrapped samples and randomized feature subsets, to assign final labels through majority voting. This ensemble approach enhances generalization, mitigates overfitting, and ensures robustness against noise and feature variability.
To ensure a fair and reliable evaluation across decomposition methods, the RFC was carefully tuned prior to classification. Three critical hyperparameters were optimized: the number of trees (n_estimators), the maximum tree depth (max_depth), and the minimum number of samples required to split an internal node (min_samples_split). A grid search strategy combined with 5-fold cross-validation was employed, exploring parameter ranges of n_estimators (100–500, step = 100), max_depth (5–30, step = 5), and min_samples_split (2–10). For each configuration, the mean cross-validation accuracy was computed, and the optimal setup was identified as:
\((\text{n\_estimators},\ \text{max\_depth},\ \text{min\_samples\_split}) = (N^{*}, D^{*}, M^{*})\)
where N*, D*, and M* denote the empirically selected values. The tuning results, shown in Fig. 3, confirm that the chosen configuration balanced predictive accuracy and computational efficiency while avoiding overfitting.
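The tuning procedure can be sketched with scikit-learn's `GridSearchCV`. The grid below uses the ranges stated in the text; the unit step for `min_samples_split` is an assumption, since the text gives only the range 2–10. Under that assumption the full grid holds 5 × 6 × 9 = 270 configurations, each evaluated over 5 folds. The helper name `tune_rfc` and the toy data are hypothetical, and a reduced grid is used in the usage lines so that the example runs quickly.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Search ranges as stated in the text (min_samples_split step of 1 is assumed).
FULL_GRID = {
    "n_estimators": list(range(100, 501, 100)),
    "max_depth": list(range(5, 31, 5)),
    "min_samples_split": list(range(2, 11)),
}

def tune_rfc(X, y, param_grid=FULL_GRID, cv=5, seed=0):
    """Grid search with k-fold CV; returns (best_params, best_mean_accuracy)."""
    search = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        param_grid=param_grid,
        scoring="accuracy",
        cv=cv,
    )
    search.fit(X, y)
    return search.best_params_, search.best_score_

# Toy usage with hypothetical data and a reduced grid.
rng = np.random.default_rng(0)
y_demo = np.repeat(np.arange(3), 40)
X_demo = rng.normal(size=(120, 4)) + y_demo[:, None]
small_grid = {"n_estimators": [100], "max_depth": [5, 10], "min_samples_split": [2, 10]}
best_params, best_acc = tune_rfc(X_demo, y_demo, param_grid=small_grid)
```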
Following hyperparameter optimization, the tuned RFC was employed for PQD classification across both synthetic and campus datasets. Performance was assessed using accuracy, precision, recall, and F1-score, ensuring a comprehensive evaluation of classification effectiveness. Confusion matrix analysis further illustrated class-specific prediction reliability, while cross-validation with 95% confidence intervals (CIs) quantified statistical stability. Comparative evaluation across decomposition methods revealed that VMD combined with RFC consistently outperformed EMD, EEMD, and CEEMDAN, with improvements confirmed to be statistically significant at the 5% level (p < 0.05).
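The statistical checks described here can be illustrated with SciPy. The sketch computes a two-sided 95% t-interval over per-fold accuracies and a paired t-test between two methods evaluated on the same folds. The per-fold accuracy values are hypothetical placeholders, not results from this study, and the assumption that the CI is a Student t-interval over the five folds is ours.

```python
import numpy as np
from scipy import stats

def mean_ci95(scores):
    """Return the mean and the half-width of the two-sided 95% t-interval."""
    scores = np.asarray(scores, dtype=float)
    sem = scores.std(ddof=1) / np.sqrt(scores.size)       # standard error of the mean
    half_width = stats.t.ppf(0.975, df=scores.size - 1) * sem
    return scores.mean(), half_width

# Hypothetical per-fold accuracies for two methods on the same 5 folds.
acc_vmd = np.array([0.96, 0.93, 0.97, 0.94, 0.95])
acc_emd = np.array([0.90, 0.89, 0.92, 0.90, 0.91])

mean_vmd, hw_vmd = mean_ci95(acc_vmd)

# Paired test: folds are matched, so the fold-wise differences are tested.
t_stat, p_value = stats.ttest_rel(acc_vmd, acc_emd)
significant = p_value < 0.05
```

A paired test is the natural choice when both methods are scored on identical folds, because it removes fold-to-fold variability from the comparison.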
Taken together, the integration of decomposition-based feature extraction with a tuned RFC provides a scalable, interpretable, and statistically validated framework for PQD identification, demonstrating strong potential for real-world smart grid monitoring applications.
Evaluation of the decomposition methods
This section evaluates the effectiveness of different signal decomposition methods—EMD, EEMD, CEEMDAN, and VMD—used in conjunction with an RF classifier for PQD classification for the datasets considered in this work.
Evaluation of variational mode decomposition (VMD) for PQD analysis
To evaluate the effectiveness of variational mode decomposition (VMD) in PQD analysis, several synthetic and campus PQD signals were tested, covering both standard disturbance types and complex disturbances. A comparison of the four IMFs in the time domain for the different decomposition techniques is presented in Figs. 4 and 5 for Class 10 (Harmonics with Sag) and the campus data (Notch). The results highlight that, with VMD, Mode 1 effectively captures the clean fundamental frequency component of the signal, Mode 2 isolates transient disturbances, Mode 3 predominantly extracts harmonic components, and Mode 4 captures very high frequency oscillations. By contrast, the IMFs derived from EMD suffer from mode mixing (e.g., transients blended with harmonics, or leakage between fundamental and higher-order components), which reduces interpretability and diagnostic value.
The superiority of VMD arises from its formulation as a well-posed constrained variational optimization problem, rather than an empirical sifting process. Using the alternating direction method of multipliers (ADMM), VMD adaptively estimates the center frequency and bandwidth of each mode directly in the frequency domain, ensuring spectral separation and narrowband characteristics. This makes the extracted modes more resilient to noise and intermittency, producing a stable and interpretable time–frequency representation. Such decomposition quality is critical for extracting meaningful features from nonstationary PQDs.
To further validate the methodological choice of the number of modes, an ablation analysis was conducted by varying K ∈ {2, 3, 4, 5, 6}. The results indicate that classification accuracy with the RFC remained saturated at 100% across all values of K (Fig. 6a), confirming robustness to decomposition depth. However, reconstruction fidelity (measured by the mean Normalized Root Mean Square Error) consistently improved with larger K (Fig. 6b), while residual whitening (spectral flatness) reached its minimum at K = 4 before slightly increasing (Fig. 6c). Taken together, these findings suggest that K = 4 provides the best trade-off: it achieves low reconstruction error and effectively extracts deterministic components into distinct modes, without overfitting or capturing noise at higher K.
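The two decomposition-quality metrics used in this ablation can be sketched as follows. The normalization of the RMSE by the range of the original signal and the definition of spectral flatness as the ratio of the geometric to the arithmetic mean of the power spectrum are common conventions, assumed here because the text does not state the exact definitions used.

```python
import numpy as np

def nrmse(original, reconstructed):
    """RMSE between two signals, normalized by the range of the original."""
    original = np.asarray(original, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    rmse = np.sqrt(np.mean((original - reconstructed) ** 2))
    return rmse / (original.max() - original.min())

def spectral_flatness(x):
    """Geometric mean over arithmetic mean of the power spectrum (0 to 1)."""
    power = np.abs(np.fft.rfft(np.asarray(x, dtype=float))) ** 2 + 1e-12
    return np.exp(np.mean(np.log(power))) / np.mean(power)

# Toy usage: white noise has a much flatter spectrum than a pure sine.
rng = np.random.default_rng(0)
t = np.arange(1000) / 5000
sine = np.sin(2 * np.pi * 50 * t)
noise = rng.normal(size=1000)
flat_sine = spectral_flatness(sine)
flat_noise = spectral_flatness(noise)
```

In the ablation, `nrmse` would be applied to the signal and the sum of its K modes, and `spectral_flatness` to the residual left after subtracting that sum.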
Comparative analysis of synthetic PQD signals and campus signals using FFT analysis
Figure 7 illustrates the FFT analysis of the decomposed campus power quality signal, while Fig. 8 presents the FFT spectrum of the IEEE 1159 Class 10 synthetic dataset (Harmonics with Sag). A clear distinction can be observed between the two cases. In the campus signal, the dominant frequency components are widely spread, with noticeable peaks below 500 Hz and additional localized harmonic contributions extending toward higher frequency ranges (up to ~ 2000 Hz). This indicates a mixture of harmonic distortion, oscillatory disturbances, and transient activity, which are typical of real-world operating conditions where multiple loads and renewable sources interact.
In contrast, the synthetic Class 10 dataset shows a much cleaner frequency-domain signature. The majority of the spectral energy is concentrated in the lower-frequency region, with sharp and well-defined peaks corresponding to the fundamental frequency and its harmonics (notably around 150 Hz and 250 Hz). Unlike the campus data, the synthetic dataset does not exhibit strong transient content beyond 500 Hz, which highlights the controlled nature of IEEE test cases.
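The spectral comparison above rests on a single-sided FFT amplitude spectrum. The sketch below computes one and returns the largest spectral bins; the function name and the toy harmonic amplitudes are hypothetical. Note that it selects the largest bins rather than true local peaks, so with spectral leakage adjacent bins of one strong component could be returned.

```python
import numpy as np

def dominant_frequencies(x, fs=5000, n_peaks=3):
    """Return the n_peaks frequencies (Hz) with the largest spectral amplitude."""
    x = np.asarray(x, dtype=float)
    amplitude = 2 * np.abs(np.fft.rfft(x)) / x.size   # single-sided amplitude
    freqs = np.fft.rfftfreq(x.size, d=1 / fs)
    top = np.argsort(amplitude)[-n_peaks:][::-1]      # largest bins first
    return freqs[top], amplitude[top]

# Toy usage: fundamental plus 3rd and 5th harmonics (150 Hz and 250 Hz).
t = np.arange(1000) / 5000
x = (np.sin(2 * np.pi * 50 * t)
     + 0.2 * np.sin(2 * np.pi * 150 * t)
     + 0.1 * np.sin(2 * np.pi * 250 * t))
peak_freqs, peak_amps = dominant_frequencies(x)
```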
From a machine learning perspective, the integration of both datasets plays a crucial role in building robust classifiers. Synthetic data ensures that the models are trained on standardized and clearly defined disturbance categories as per IEEE 1159, minimizing ambiguity in class boundaries. Meanwhile, real-world campus data introduces natural noise, overlapping disturbances, and operational variability, which enhances the generalization capability of the trained models. By leveraging FFT-based frequency features from both domains, the classification framework can effectively learn to discriminate between idealized PQDs and complex field disturbances, thereby reducing overfitting to synthetic signals and improving accuracy when deployed in practical monitoring systems.
Evaluating the effectiveness of signal denoising
The effectiveness of signal denoising achieved through the various signal decomposition techniques can be evaluated using the signal-to-noise ratio (SNR), defined in Eq. (4), as a key performance metric. The average SNRs for synthetic and campus data are calculated and compared for all the decomposition methods considered in this work. The results shown in Fig. 9 clearly indicate that the VMD method consistently achieves a significantly higher SNR in the decomposed modes than the EMD-family methods. This superior performance can be attributed to the inherent design of VMD, which decomposes a signal into a predefined number of modes by minimizing the bandwidth of each mode around its adaptive center frequency. Unlike EEMD and CEEMDAN, which rely on the addition of external white or adaptive noise to suppress mode mixing, VMD inherently suppresses mode mixing and spectral leakage through its constrained variational optimization process. As a result, VMD produces clean, nonoverlapping modes with higher SNRs than those of the EMD family for both the synthetic and campus datasets.
In Eq. (4), \(s(k)\) is the original known component and \(\widehat{s}(k)\) is the corresponding IMF.
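Eq. (4) itself is not reproduced in this text. As a minimal sketch, assuming the common energy-ratio definition SNR = 10 log10(Σ s(k)² / Σ (s(k) − ŝ(k))²) in dB, the metric could be computed as follows. The function name `snr_db`, the 3200 Hz sampling rate, and the test signals are illustrative assumptions and are not taken from the study.

```python
import numpy as np

def snr_db(s, s_hat):
    """SNR in dB between a known component s and its estimate s_hat.

    Assumed definition: 10*log10(sum(s^2) / sum((s - s_hat)^2)).
    """
    s = np.asarray(s, dtype=float)
    s_hat = np.asarray(s_hat, dtype=float)
    signal_power = np.sum(s ** 2)
    noise_power = np.sum((s - s_hat) ** 2)
    if noise_power == 0:
        return float("inf")  # perfect reconstruction
    return float(10.0 * np.log10(signal_power / noise_power))

# Illustrative 50 Hz component sampled at an assumed 3200 Hz.
fs = 3200
t = np.arange(0, 0.2, 1 / fs)
clean = np.sin(2 * np.pi * 50 * t)

rng = np.random.default_rng(0)
fine_estimate = clean + 0.01 * rng.standard_normal(t.size)    # small residual
coarse_estimate = clean + 0.10 * rng.standard_normal(t.size)  # larger residual

snr_fine = snr_db(clean, fine_estimate)
snr_coarse = snr_db(clean, coarse_estimate)
print(f"fine: {snr_fine:.1f} dB, coarse: {snr_coarse:.1f} dB")
```

Under this definition, a mode that tracks the known component more closely leaves a smaller residual and therefore yields a higher SNR, which is the sense in which the methods are ranked in Fig. 9.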
Figures 10 and 11 illustrate the comparative visual representations of power quality signal decomposition and reconstruction via EMD, EEMD, CEEMDAN, and VMD. The original synthetic PQD signal, containing transients, harmonics, and sag events, is shown in Fig. 10a, and its reconstruction after processing through each decomposition method is shown in Fig. 10b. Figure 10c shows a zoomed-in view of the reconstructed signals from 0.7 s to 0.9 s, revealing that VMD decomposes and reconstructs the PQD signal more faithfully than EMD, EEMD, and CEEMDAN. The original campus signal, its reconstruction, and the zoomed view are shown in Fig. 11a, b, and c, respectively.
Evaluating the effectiveness on the basis of the execution time
Figure 12 illustrates the trade-off between execution time and classification efficiency for the four decomposition methods when combined with the RFC. The left panel compares the total execution time (in minutes) required for feature extraction across EMD, EEMD, CEEMDAN, and VMD. CEEMDAN exhibits the highest computational cost (180 min), while EMD and VMD are significantly faster (20–25 min).
The right panel presents the performance-to-time efficiency, expressed as classification accuracy per minute of execution. EMD achieves the highest efficiency due to its low computational time, while VMD demonstrates a favorable balance between high classification accuracy and moderate execution time. In contrast, CEEMDAN provides the least efficient trade-off, with high computation time and comparatively lower accuracy gains.
This analysis highlights that, although EMD is computationally efficient, VMD achieves the best overall compromise between computational effort and predictive performance, making it the most suitable method for real-time PQD monitoring and classification.
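The performance-to-time efficiency described above is a simple ratio, and a short worked example makes the ranking concrete. The sketch below assumes 20 min for EMD and 25 min for VMD (two points inside the reported 20–25 min range) and uses the held-out test accuracies reported later in this paper; the exact values behind Fig. 12 may differ. EEMD is omitted because its execution time is not stated in the text.

```python
# Accuracy (%) and feature-extraction time (min) per decomposition method.
# The EMD and VMD times are assumed values within the reported 20-25 min range.
methods = {
    "EMD": {"accuracy_pct": 95.0, "minutes": 20.0},
    "VMD": {"accuracy_pct": 99.16, "minutes": 25.0},
    "CEEMDAN": {"accuracy_pct": 87.0, "minutes": 180.0},
}

def efficiency(accuracy_pct, minutes):
    """Accuracy points obtained per minute of execution."""
    return accuracy_pct / minutes

scores = {name: efficiency(**m) for name, m in methods.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.2f} accuracy points per minute")
```

With these assumed inputs, EMD ranks first on efficiency, VMD is close behind with the higher accuracy, and CEEMDAN is an order of magnitude lower, which matches the qualitative ordering described for the right panel.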
Evaluation of PQD classification
In this section, the performance of the RFC is evaluated using features extracted from two families of signal decomposition methods: the EMD family (EMD, EEMD, CEEMDAN) and Variational Mode Decomposition (VMD). The extracted features serve as inputs to the machine learning model for the classification of PQDs.
For the synthetic dataset constructed according to IEEE-1159 standards, model performance is assessed using multiple metrics, including the confusion matrix, accuracy, precision, recall, and F1-score. In addition, to validate the generalization of the approach, campus PQA log data are employed as test inputs. To visualize the discriminative power of the extracted features, a two-dimensional representation is generated using t-distributed stochastic neighbor embedding (t-SNE), illustrating how well the features cluster different PQD classes into separable groups.
Comparative evaluation of PQD classification on synthetic and campus datasets
The classification results are first analyzed using the confusion matrix, which provides insight into the model’s ability to distinguish among PQD classes. A total of 500 synthetic samples per class were generated, and the dataset was partitioned using a 70:30 training-to-testing split, corresponding to 150 test samples per class across 15 PQD categories.
The results demonstrate that VMD combined with RFC achieves the highest prediction accuracy, consistently outperforming EMD, EEMD, and CEEMDAN. As illustrated in Fig. 13 (Panel A), VMD maintains high classification counts (close to 150 correct predictions per class) across nearly all PQD categories, while EEMD and CEEMDAN exhibit substantial misclassifications for certain classes (e.g., Classes 2, 5, and 9). This indicates that VMD is more effective in capturing the intrinsic, class-specific characteristics of PQDs, resulting in better class separability and lower misclassification rates compared to the EMD family of methods.
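The confusion matrix–based metrics used here can be sketched in a few lines. The example below is a generic illustration on hypothetical toy labels with three classes, not the study's evaluation code; the final lines only check the test-set size implied by the 70:30 split and what a 99.16% accuracy would correspond to in sample counts, which is an inference from the reported figures.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def accuracy(cm):
    correct = sum(cm[c][c] for c in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

def per_class_metrics(cm):
    """Precision, recall, and F1-score for each class."""
    n = len(cm)
    metrics = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        fn = sum(cm[c]) - tp                       # true c, predicted otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics

# Hypothetical toy labels for three classes (not data from the study).
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 0, 2]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
toy_accuracy = accuracy(cm)
toy_metrics = per_class_metrics(cm)

# Size check for the reported setup: 15 classes x 150 test samples each.
n_test = 15 * 150
implied_correct = round(0.9916 * n_test)
print(n_test, implied_correct, round(100 * implied_correct / n_test, 2))
```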
To further assess real-world applicability, the proposed approach was tested on the PQA campus log data (99 × 200 samples) under the same decomposition strategies. The classification outcomes differ substantially across methods:
- EMD misclassifies the majority of events as harmonics.
- EEMD predicts most events as sag combined with oscillatory transients.
- CEEMDAN distributes predictions across notch, oscillatory transients, harmonics, and harmonics combined with sag.
- VMD, in contrast, provides more balanced predictions, identifying notch, harmonics combined with sag, and sag combined with oscillatory transients as the dominant classes.
As shown in Fig. 13 (Panel B), VMD predictions closely align with the PQA reference log distributions, particularly for Classes 9, 10, and 13. This alignment underscores VMD’s robustness and adaptability, making it a superior feature extraction method not only for synthetic benchmark datasets but also for practical campus-level PQD monitoring.
Comparative performance and statistical validation of decomposition methods
The performance of the four signal decomposition methods (EMD, EEMD, CEEMDAN, and VMD) combined with the Random Forest classifier was assessed using two complementary evaluation strategies: (i) confusion matrix–based accuracies on the held-out test set, and (ii) 5-fold cross-validation (CV) accuracies with 95% confidence intervals (CIs). Figure 14 compares the confusion matrix–based accuracies with the 5-fold cross-validation accuracies and their 95% confidence intervals across the decomposition methods.
The results demonstrate that VMD consistently delivers superior classification performance compared to the other decomposition techniques. Specifically, VMD + RFC achieved a confusion matrix accuracy of 99.16%, outperforming EMD (95%), EEMD (71%), and CEEMDAN (87%). Cross-validation further confirmed this trend, with mean ± 95% CI accuracies of VMD: 94.6% ± 1.42, EEMD: 91.4% ± 1.42, EMD: 89.0% ± 1.96, and CEEMDAN: 88.0% ± 1.96.
To examine whether the observed performance improvements of VMD were statistically significant, paired t-tests were conducted between VMD + RFC and each of the alternative methods across the CV folds. The results indicate that the performance gains of VMD are statistically significant at the 5% level (p < 0.05) in all pairwise comparisons. This confirms that the improvements are not attributable to random variation but represent a consistent and robust advantage of VMD in extracting discriminative features for PQD classification.
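The two statistical steps above, a 95% confidence interval over the CV folds and a paired t-test between two methods, can be sketched with the standard library alone. The per-fold accuracies below are hypothetical values chosen for illustration, since the paper does not list its fold-level results, and the use of a Student's t interval with 4 degrees of freedom is an assumption about how the intervals were formed.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 4 degrees of freedom (5 folds).
T_CRIT_DF4 = 2.776

def mean_ci(fold_scores, t_crit=T_CRIT_DF4):
    """Mean accuracy and half-width of its t-based confidence interval."""
    n = len(fold_scores)
    mean = statistics.mean(fold_scores)
    half_width = t_crit * statistics.stdev(fold_scores) / math.sqrt(n)
    return mean, half_width

def paired_t_statistic(scores_a, scores_b):
    """t statistic of the fold-wise differences a - b (paired t-test)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all fold differences are identical; t is undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))

# Hypothetical fold accuracies (not the study's actual per-fold values).
vmd_folds = [0.95, 0.94, 0.96, 0.93, 0.95]
emd_folds = [0.89, 0.90, 0.88, 0.87, 0.91]

vmd_mean, vmd_half = mean_ci(vmd_folds)
t_stat = paired_t_statistic(vmd_folds, emd_folds)
significant = abs(t_stat) > T_CRIT_DF4  # equivalent to p < 0.05, two-sided

print(f"VMD: {100 * vmd_mean:.1f}% ± {100 * vmd_half:.2f}")
print(f"paired t = {t_stat:.2f}, significant at 5%: {significant}")
```

Because the same folds are used for both methods, the test operates on the fold-wise differences rather than on the two sets of accuracies independently, which is what makes it a paired comparison.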
Visualization of the feature space distribution for campus data
The effectiveness of each decomposition method was further analyzed via t-distributed stochastic neighbor embedding (t-SNE) for dimensionality reduction and visualization. As shown in Fig. 15, the campus data features extracted via VMD formed well-separated and compact clusters, indicating strong discriminative power. In contrast, features derived from the EMD family exhibited significant overlap and dispersion, suggesting reduced effectiveness in capturing class-specific characteristics. The superior clustering performance of VMD corroborates the higher classification accuracy achieved by VMD with the RFC, reinforcing its suitability for power quality disturbance classification tasks.
Conclusions
This study demonstrates the feasibility and effectiveness of classifying power quality disturbances (PQDs) using both synthetic datasets and field data collected from a campus distribution network in India. Fifteen PQD classes, defined according to the IEEE-1159 standard, were analyzed using four advanced signal decomposition methods: EMD, EEMD, CEEMDAN, and VMD. Among these, the VMD-based approach consistently outperformed the EMD-family methods by preserving discriminative features, avoiding mode mixing, enhancing the signal-to-noise ratio, and isolating dominant frequency components. When integrated with a Random Forest Classifier (RFC), the VMD framework achieved a confusion matrix–based accuracy of 99.16% on the held-out test set and a 5-fold cross-validation accuracy of 94.6% ± 1.42 (95% CI), surpassing the competing methods, with statistical validation (paired t-tests, p < 0.05) confirming the robustness of these improvements.
For feature extraction, the VMD technique was applied with mode number K = 4, balancing decomposition quality, reconstruction error, and residual whitening. From the Intrinsic Mode Functions (IMFs), a comprehensive set of time-domain and frequency-domain features was derived, including RMS, mean, standard deviation, skewness, kurtosis, dominant frequency, spectral entropy, and total harmonic distortion (THD). Feature importance analysis revealed that RMS, mean, and dominant frequency were the most discriminative, suggesting the potential for dimensionality reduction in future implementations.
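As an illustration of the per-mode features listed above, the following minimal sketch computes the statistical and spectral descriptors for a single mode. It is not the authors' implementation: the VMD step is replaced by a stand-in 50 Hz sine, the 3200 Hz sampling rate and the function name `mode_features` are assumptions, kurtosis is taken as the non-excess fourth standardized moment, and THD is omitted because it needs the harmonic amplitudes of the full signal rather than of one mode.

```python
import numpy as np

def mode_features(x, fs):
    """Time- and frequency-domain features of one decomposed mode."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    centered = x - mean
    m2 = np.mean(centered ** 2)  # variance (population form)
    m3 = np.mean(centered ** 3)
    m4 = np.mean(centered ** 4)
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0

    # One-sided power spectrum of the mean-removed mode.
    power = np.abs(np.fft.rfft(centered)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    dominant = freqs[np.argmax(power)]

    # Spectral entropy normalized to [0, 1] by the number of frequency bins.
    total = power.sum()
    if total > 0:
        p = power / total
        p = p[p > 0]
        entropy = -np.sum(p * np.log(p)) / np.log(power.size)
    else:
        entropy = 0.0

    return {
        "rms": float(np.sqrt(np.mean(x ** 2))),
        "mean": float(mean),
        "std": float(np.sqrt(m2)),
        "skewness": float(skewness),
        "kurtosis": float(kurtosis),
        "dominant_frequency_hz": float(dominant),
        "spectral_entropy": float(entropy),
    }

# Stand-in for one VMD mode: ten cycles of a 50 Hz sine at an assumed 3200 Hz.
fs = 3200
t = np.arange(640) / fs
features = mode_features(np.sin(2 * np.pi * 50 * t), fs)
print(features)
```

In a pipeline with K = 4 modes, calling such a function on each mode and concatenating the results would give one feature vector per signal, which is the kind of input a Random Forest classifier expects.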
To ensure fair and optimized classification performance, the Random Forest Classifier was tuned using a grid search with 5-fold cross-validation to achieve stable generalization without overfitting. Beyond comparative evaluation, a key contribution of this work lies in the external validation of the VMD + RFC model using power quality analyzer (PQA) logs from a 500 kWp photovoltaic-integrated distribution system at the point of common coupling (PCC). The close alignment between model predictions and PQA reference logs confirms the strong generalization capability of the proposed framework under real-world operating conditions.
In conclusion, the VMD + RFC model emerges as a scalable, interpretable, and computationally efficient solution for PQD classification in renewable-integrated grids. Future work will extend this study by (i) exploring hybrid decomposition–deep learning pipelines for end-to-end feature learning, (ii) validating with expert-labeled field data across diverse grid conditions, and (iii) implementing real-time deployment on embedded or edge-computing platforms. These advancements will further strengthen the practical impact of the proposed methodology for smart grid monitoring and decision support.
Data availability
The datasets used in this study are available upon reasonable request to the corresponding author.
References
Chen, J. et al. Analysis and control of cascaded energy storage system for energy efficiency and power quality improvement in electrified railways. IEEE Trans. Transp. Electrif. 10, 1299–1313 (2024).
Zhou, J. et al. Detection of PQDs based on improved wavelet threshold function and CEEMD. J. Electron. Meas. Instrum. 33, 141–148 (2023).
Zu, L., Liu, Z. & Sheng, N. Research on denoising method of wavelet transform under nonstationary acoustic signals. Mod. Electron. Technol. 45, 35–40 (2022).
Zhang, Z., Liu, M., Yu, B., Zhang, Z. & Wang, Q. Research on denoising of improved wavelet threshold new algorithm in power grid disturbance. Mod. Electron. Technol. 44, 53–57 (2024).
IEEE Std. 1159–2009 (Revision of IEEE Std 1159–1995), IEEE Recommended Practice for Monitoring Electric Power Quality (IEEE Power & Energy Society, 2009).
Monteiro, V. et al. A novel three-phase multi-objective unified power quality conditioner. IEEE Trans. Ind. Electron. 71, 59–70 (2024).
Tang, F. Spectral Analysis of Inter harmonics Based on Spatial Spectrum Estimation Algorithm. Master’s Thesis, Lanzhou University of Technology, Lanzhou, China, (2022).
Grandke, T. Interpolation algorithms for discrete Fourier transforms of weighted signals. IEEE Trans. Instrum. Meas. 32, 350–355 (1983).
Harris, F. J. On the use of windows for harmonic analysis with the DFT. Proc. IEEE 66, 51–83 (1978).
Ullah, N. et al. An intelligent framework for fault diagnosis of centrifugal pump leveraging wavelet coherence analysis and deep learning. Sensors 23 (21), 8850. https://doi.org/10.3390/s23218850 (2023).
Ullah, N., Ahmed, Z. & Kim, J. M. Pipeline leakage detection using acoustic emission and machine learning algorithms. Sensors 23 (6), 3226. https://doi.org/10.3390/s23063226 (2023).
Zaman, F. et al. A new dual-input CNN for multimodal fault classification using acoustic emission and vibration signals. Mech. Syst. Signal. Process. 221, 110216. https://doi.org/10.1016/j.engfailanal.2025.109787 (2025).
Geng, H., Wang, Z., Yi, X., Alsaadi, F. E. & Cheng, Y. Tobit Kalman filtering for fractional-order systems with stochastic nonlinearities under round-robin protocol. Int. J. Robust Nonlinear Control. 31 (6), 2348–2370. https://doi.org/10.1002/rnc.5396 (2021).
Lopez-Ramirez, M. et al. EMD-based feature extraction for power quality disturbance classification using moments. Energies 9 (7), 565. https://doi.org/10.3390/en9070565 (2016).
Huang, N. E. et al. The empirical mode decomposition and the Hilbert spectrum for nonlinear and nonstationary time series analysis. Proc. R. Soc. Lond. A 454 (1971), 903–995 (1998).
Geng, H., Liang, Y. & Xu, L. Fault detection for multirate sensor fusion under multiple uncertainties. IET Control Theory Appl. 9 (11), 1709–1716. https://doi.org/10.1049/iet-cta.2014.1134 (2015).
Shukla, S., Mishra, S. & Singh, B. Empirical-mode decomposition with Hilbert transforms for power-quality assessment. IEEE Trans. Power Delivery. 24 (4), 2159–2165. https://doi.org/10.1109/TPWRD.2009.2028792 (2009).
Jena, M. K., Panigrahi, B. K. & Samantaray, S. R. A new approach to power system disturbance assessment using wide-area postdisturbance records. IEEE Trans. Industr. Inf. 14 (3), 1253–1261. https://doi.org/10.1109/TII.2017.2772081 (2018).
Geng, H., Liang, Y. & Zhang, X. Fast-rate residual generator based on multiple slow-rate sensors. IET Signal Proc. 8 (8), 878–884. https://doi.org/10.1049/iet-spr.2013.0296 (2014).
Geng, H., Liang, Y. & Zhang, X. Linear-minimum-mean-square-error observer for multirate sensor fusion with missing measurements. IET Control Theory Appl. 8 (14), 1375–1383. https://doi.org/10.1049/iet-cta.2013.0972 (2014).
Wang, S. et al. Identification of PQDs based on EEMD and TEO. 2nd International Conference on Mechatronics and Control Engineering (ICMCE 2013), Beijing, China, pp. 469–476 (2013).
Yang, D. et al. Vibration condition monitoring system for wind turbine bearings based on noise suppression with multipoint data fusion. Renew. Energy. 92 (9), 104–116. https://doi.org/10.1016/j.renene.2016.01.099 (2016).
Singh, R. H., Mohanty, S. R., Kishor, N. & Thakur, A. K. Real-time implementation of signal processing techniques for disturbances detection. IEEE Trans. Industr. Electron. 66 (5), 3550–3560. https://doi.org/10.1109/TIE.2018.2851968 (2019).
Camarena-Martinez, D. et al. Novel down sampling empirical mode decomposition approach for power quality analysis. IEEE Trans. Industr. Electron. 63 (4), 2369–2378. https://doi.org/10.1109/TIE.2015.2506619 (2016).
Salameh, J. P., Cauet, S., Etien, E., Sakout, A. & Rambault, L. A new modified sliding window empirical mode decomposition technique for signal carrier and harmonic separation in nonstationary signals: application to wind turbines. ISA Trans. 89 (11), 20–30. https://doi.org/10.1016/j.isatra.2018.12.019 (2019).
Kumar, R., Singh, B., Shahani, D. T., Chandra, A. & Al-Haddad, K. Recognition of power-quality disturbances using S-Transform-Based ANN classifier and rule-based decision tree. IEEE Trans. Ind. Appl. 51, 1249–1258 (2015).
Valtierra-Rodriguez, M., de Jesus Romero-Troncoso, R., Osornio-Rios, R. A. & Garcia-Perez, A. Detection and classification of single and combined PQDs using neural networks. IEEE Trans. Ind. Electron. 61 (5), 2473–2482 (2014).
Gaing, Z. L. Wavelet-based neural network for power disturbance recognition and classification. IEEE Trans. Power Deliv. 19 (4), 1560–1568 (2004).
Biswal, B., Biswal, M., Mishra, S. & Jalaja, R. Automatic classification of power quality events using balanced neural tree. IEEE Trans. Ind. Electron. 61 (1), 521–530 (2014).
Negnevitsky, M. et al. Discussion of power quality disturbance waveform recognition using wavelet-based neural classifier-Part 1: theoretical foundation [Closure to discussion]. IEEE Trans. Power Deliv. 15 (4), 1347–1348 (2000).
Mazumdar, J. & Harley, R. G. Recurrent neural networks trained with backpropagation through time algorithm to estimate nonlinear load harmonic currents. IEEE Trans. Ind. Electron. 55 (9), 3484–3491 (2008).
Jain, S. K. & Singh, S. N. Low-order dominant harmonic estimation using adaptive wavelet neural network. IEEE Trans. Ind. Electron. 61 (1), 428–435 (2014).
Janik, P. & Lobos, T. Automated classification of power-quality disturbances using SVM and RBF networks. IEEE Trans. Power Deliv. 21 (3), 1663–1669. https://doi.org/10.1109/TPWRD.2006.874114 (2006).
Naderian, S. & Salemnia, A. Method for classification of PQ events based on discrete Gabor transform with FIR window and T2FK-based SVM and its experimental verification. IET Gener. Transm. Distrib. 11 (1), 133–141 (2017).
Lin, W., Wu, C., Lin, C. & Cheng, F. Detection and classification of multiple power-quality disturbances with wavelet multiclass SVM. IEEE Trans. Power Deliv. 23 (4), 2575–2582 (2008).
Liu, Z. et al. A classification method for complex PQDs using EEMD and rank wavelet SVM. IEEE Trans. Smart Grid 6 (4), 1678–1685 (2015); Tunable-Q wavelet transform and dual multiclass SVM for online automatic detection of PQDs. IEEE Trans. Smart Grid 9 (4), 3018–3028 (2018).
Kocaman, Ç., Usta, H., Özdemir, M. & Eminoğlu, İ. Classification of two common PQDs using wavelet based SVM, in: Melecon 2010–2010 15th IEEE Mediterranean Electrotechnical Conference, Valletta, pp. 587–591 (2010).
Thirumala, K., Umarikar, A. C. & Jain, T. A new classification model based on SVM for single and combined PQDs, in: National Power Systems Conference (NPSC), Bhubaneswar, pp. 1–6 (2016).
Fang, Y., Pang, H. & Chen, Y. Detection Method of Voltage Sag Disturbance Based on Improved HHT. In IOP Conference Series: Earth and Environmental Science, Volume 645, Proceedings of the International Conference on Smart Grid and Energy Engineering 13–15 November 2020, Guilin, China; IOP Publishing Ltd.: Bristol, UK, 2021; p. 012069. (2020).
Xu, J. & Guo, T. Simulation analysis of power quality disturbance detection based on improved HHT combined with improved S-Transform. Autom. Technol. Appl. 37, 98–102 (2024).
Sun, S., Wang, Q., Yan, H., Lin, X. & Du, T. Application of EEMD in power harmonic detection. Proc. CSU-EPSA. 28, 25–31 (2024).
Xu, Y. C., Gao, Y. K., Li, Z. X. & Xi, L. Application of improved LMD algorithm in power quality disturbance signal detection of microgrid. Power Syst. Technol. 43, 332–341 (2019).
Dragomiretskiy, K. & Zosso, D. Variational mode decomposition. IEEE Trans. Signal. Process. 62, 531–544 (2014).
Xu, Y., Gao, Y., Li, Z. & Lu, M. Detection and classification of PQDs in distribution network based on VMD and DFA. CSEE J. Power Energy Syst. 6, 122–130 (2023).
Huang, C. & Zhou, T. A. New method for power quality disturbance detection based on variational mode decomposition. Electr. Power Autom. Equip. 38, 116–123 (2022).
Yu, K., Jia, L., Chen, Y. & Xu, W. The Past, present and future of deep learning. J. Comput. Res. Dev. 50, 1799–1804 (2013).
Valtierra-Rodriguez, M., de Jesus Romero-Troncoso, R., Osornio-Rios, R. A. & Garcia-Perez, A. Detection and classification of single and combined PQDs using neural networks. IEEE Trans. Ind. Electron. 61, 2473–2482 (2014).
Dong, G. et al. Power quality disturbance classification method based on feature fusion parallel optimization model. Proc. CSEE 43, 1017–1027 (2023).
Ruan, J. et al. GA-SVM predictor for big data in agricultural cyber-physical systems. IEEE Trans. Ind. Inf. 15, 6510–6521 (2019).
Lopez-Ramirez, M. et al. EMD-Based feature extraction for power quality disturbance classification using moments. Energies 9, 565. https://doi.org/10.3390/en9070565 (2016).
Liu, K. et al. Power quality disturbance identification method based on improved CEEMDAN-HT-ELM model. Processes 13, 137. https://doi.org/10.3390/pr13010137 (2025).
Uvesh Sipai, R., Jadeja, N., Kothari, T. & Trivedi Synthetic PQDs dataset of single and combined disturbances generated in accordance with IEEE 1159 specifications. IEEE Dataport Febr. 23 https://doi.org/10.21227/035e-rx20 (2024).
Funding
Open access funding provided by B.M.S. College of Engineering. The authors did not receive any funding for carrying out this research work.
Author information
Contributions
Veera Vasantha Rao Battula: Conceptualization, Methodology, Software, Formal Analysis, Data Curation, and Writing – Original Draft. K Padmavathi: Supervision, Validation, Resources, Project Administration, Writing – Review & Editing. K Sravanthi: Investigation, Data Acquisition, Visualization, Statistical Analysis, Writing – Proofreading.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Battula, V.V.R., K., P. & K., S. Comparison of advanced signal decomposition techniques for the classification of PQDs by machine learning algorithms. Sci Rep 15, 38164 (2025). https://doi.org/10.1038/s41598-025-22128-6
DOI: https://doi.org/10.1038/s41598-025-22128-6