Introduction

Background

Transformers are crucial components for power network distribution and transmission, ensuring that consumers receive high-quality, smooth, and reliable power. These devices are key components of interconnected power networks, making them among the most important assets. They are vulnerable to various external and internal faults. A significant portion of power transformer failures is caused by internal winding defects, the most detrimental of which are axial displacement (AD), radial deformation (RD), and short-circuit (SC) faults1. These faults may result in catastrophic transformer failures, imposing significant costs on electricity providers due to power network outages, high repair costs, potential fires, and even casualties. Therefore, it is crucial to identify internal faults in transformers at an early stage to prevent unexpected outages or costly secondary failures. Accumulated faults should be assessed before they lead to transformer failure. Thus, early-stage winding fault detection using sensitive methods is essential2.

Literature survey

In recent years, numerous techniques for detecting winding deformations have been proposed. The primary techniques include the voltage-current locus diagram (VCL)3,4, low-voltage impulse (LVI)5, short-circuit impedance (SCI)6, ultra-wideband (UWB) antenna7, and the frequency response analysis (FRA) method8-9. Among these, FRA is the most commonly utilized due to its reliance on an output-to-input transfer function (TF) analysis, which has proven effective in accurately and sensitively detecting electrical and mechanical faults in power transformers10. Furthermore, FRA has become the most widely used fault diagnosis technique among commercial methods due to its straightforward, non-destructive, economical, and rapid procedure11. As a result, many studies have concentrated on the challenges in implementing, interpreting, and reproducing FRA responses. Although the theory and technique of FRA measurement are well-standardized and developed, the practical interpretation of frequency responses remains challenging and calls for expert-level knowledge12. Due to the unique structure and corresponding frequency response of each transformer, it is not feasible to develop a universal method for interpreting FRA results13. As a result, expert-based visual interpretation of FRA traces remains the predominant practice.

As shown in Fig. 1, existing literature on FRA interpretation is classified into four main categories: knowledge-based methods, mathematical models, pattern/template-based methods, and graphical models. Methods that use expert classifiers to identify faults belong to the first category. In these techniques, necessary frequency response features are extracted and then fed to classifiers. Reference14 enhanced the classification performance by combining SVM, PSO, and GA algorithms. This approach was successfully tested on a physical transformer model. However, these techniques require extensive data. Another important algorithm in unsupervised learning is the fuzzy inference system, which also falls under this category.

Fig. 1 Classification of system representation methods for FRA interpretation.

The second category consists of analytical models, detailed models, and adaptive models. In the detailed model approach, each winding section is represented by a different circuit element in the circuit model15. To apply this model, changes in the circuit elements must first be mapped to the corresponding alterations in the transformer structure. The circuit model is then modified with the element variations to examine the changes in the FRA trace16. Further studies have used the finite element method to simulate the geometrical dimensions of transformers to better approximate real transformer operation conditions17. However, the conversion process introduces additional errors to the model. The main limitation of the circuit model is that incorporating certain faults is difficult.

The third category is pattern/template-based. A substantial number of fault instances are required to effectively train the neural networks in wavelet- and neural network-based techniques proposed by various researchers18-19. For instance, the neural network described in20 is not well-suited for the limited data available on transformer winding defects, and may converge to local optima rather than the global solution. When the amount of data is below a certain threshold, classifiers may not be adequately trained, leading to erroneous results. These methods have been combined with other approaches, including numerical index-based21-22 and algorithmic estimation methods20,23. In23, M. Bigdeli employed the frequency-amplitude and phase characteristics of the TF for classifying winding faults using an SVM algorithm.

The last category comprises graphical models, including fault trees, bond graphs, and diagrams. The logical structure of decision trees performs multiple binary classifications at each level by comparing two classes, thereby enhancing system performance. Reference24 developed a diagnostic decision tree that can both isolate faults and identify failure modes based on detailed model data. Bond graph techniques are used to analyze signatures/signals with significant random components and to assess the similarity between two signals.

FRA signatures exhibit many uncertain properties in addition to changes in minima and maxima caused by resonant frequencies. These characteristics reflect alterations in both the type and condition of the winding. Researchers recommend using statistical methods as a more robust approach to differentiate between healthy and faulty FRA signatures25,26. The most significant numerical indices are presented in27. These indicators, which offer improved fault diagnosis capability, can be derived by computing them for various fault conditions and comparing them to the healthy state. Using statistical criteria is more straightforward than estimating model parameters, which is prone to errors and challenges. Several statistical parameters, including maximum absolute difference, spectrum deviation, and correlation coefficient (CC), have been proposed to quantify differences between FRA measurements28,29. Recent studies30-31 demonstrate increased sensitivity when using mathematical indices within appropriate frequency ranges for fault diagnosis. Thus, distinguishing between disturbances and actual failures is essential when using indices. However, numerical indices require extensive data to establish threshold levels for each fault type and severity.

The main goal of this research is to enhance the interpretation of FRA results by employing new statistical methods. To address the aforementioned shortcomings, fuzzy clustering analysis (FCA), factor analysis (FA), and principal component analysis (PCA) are employed to detect transformer winding faults using FRA. These methods are used to first determine the probability of a fault occurring based on variations in the FRA trace; subsequently, the specific type of fault is categorized graphically using this approach. Identifying the fault type enables proper assessment and appropriate corrective action to be taken.

PCA and FA methods use the correlation matrix to transform a set of features into lower-dimensional sets called principal components and factors, respectively. Similarly, FCA utilizes the degree of similarity between features to group them into lower-dimensional clusters. The proposed method offers the advantages of low computational complexity and minimal effort required to determine classifier characteristics.

Research innovations

The key innovations of this work are summarized below:

  • This study employs three independent methods—Factor Analysis (FA), Fuzzy Clustering Analysis (FCA), and Principal Component Analysis (PCA)—to interpret Frequency Response Analysis (FRA) results. Each method demonstrates exceptional performance owing to its unique characteristics in diagnosing transformer winding faults.

  • FA uncovers complex faults by identifying hidden factors and latent patterns in the data. This method is particularly effective for detecting faults that conventional approaches might miss.

  • FCA is adept at managing uncertainty and detecting combined fault conditions, particularly in scenarios where the boundaries between different fault states are ambiguous. This method proves invaluable for diagnosing concurrent or overlapping faults.

  • PCA enhances interpretability by reducing data dimensionality and filtering out noise while retaining key information. This enables clearer and more efficient identification of fault patterns.

  • The developed two-stage identification model first distinguishes between healthy and faulty conditions, then classifies the specific fault type in the second stage. This approach enhances fault diagnosis accuracy.

  • Advanced data visualization techniques are employed to independently present the results of each method in a visual format. This significantly simplifies result interpretation for end-users.

  • Implementation of the proposed techniques substantially reduces the requirement for expert knowledge in fault diagnosis and classification. This enables engineers and operators to interpret results more efficiently.

  • The reliability of each method has been validated using actual transformer data and practical experiments, confirming their effectiveness in real-world applications.

FRA concept and experimental setup

The frequency response is represented by the transfer function (TF) plot as a function of the input excitation frequency. Any physical alteration to the active components of a power transformer modifies the characteristics of its equivalent electrical circuit, consequently altering the frequency response. This principle forms the basis of FRA for transformers. In practice, impedance or admittance functions, or voltage ratios between specified terminals (e.g., end-to-end measurements), are commonly used to assess winding frequency responses. Transformer FRA can be performed using either the Sweep FRA (SFRA) or Low Voltage Impulse (LVI) method32. Both approaches involve applying excitation signals to the winding and monitoring the response signals to obtain the transformer’s frequency response characteristics. SFRA employs a sinusoidal sweep signal, whereas Impulse FRA (IFRA) uses an impulse signal; the outcomes are nevertheless equivalent.

Currently, precise interpretation of FRA data remains challenging. Visual inspection23 remains the predominant method for FRA interpretation. This is because specific frequency ranges correlate with particular transformer faults; that is, different fault types manifest in distinct frequency ranges of the FRA trace. Following Chinese power industry standards33, FRA signatures are divided into three frequency bands: low (1–100 kHz), mid (100–600 kHz), and high (600–1000 kHz), with analyses conducted for each range. However, this method requires skilled experts and comprehensive knowledge of how various winding defects affect each frequency range, as false positives and false negatives may occur.

Evaluating the effectiveness of intelligent classifiers requires establishing a database of transformers in both good and faulty condition (with varying fault intensities). For this study, mechanical and electrical faults were artificially induced and simulated in different locations of windings at various severity levels in a high-voltage laboratory. To identify short-circuit (SC), axial displacement (AD), and radial deformation (RD) faults, FRA measurements were performed on a 1200 kVA transformer. The transformer features round-shaped windings and a round core. Its high-voltage (HV) and low-voltage (LV) windings consist of 70 disks (each with 80 turns) and a continuous layer (with 112 turns), respectively. The transformer’s insulating system comprises Kraft paper and mineral oil.

To create SC, AD, and RD faults, the leads were removed from the transformer to allow easy access to the internal turns while enabling external fault simulation. All measurements were conducted using an Omicron FRANEO 800 analyzer (Bode 100) with a precision of approximately 90 dB and a maximum amplification factor of 40 dB. The network analyzer’s tracking generator provided the measurement system’s reference signal: a 5-volt alternating voltage. SFRA measurements were obtained at 1,280 frequency points ranging from 100 Hz to 1 MHz.

Theory of suggested approaches for FRA interpretation

This research employs three statistical methods—PCA, FA, and FCA—to identify power transformer faults. The theoretical foundations of these methods and their respective implementation algorithms for fault detection are elaborated below.

Principal component analysis

PCA is a widely used multivariate technique that transforms a set of correlated features into a set of linearly uncorrelated features called principal components. The most significant features are captured by the first few components in this transformation34. As a dimensionality reduction technique, PCA projects high-dimensional data onto a lower-dimensional space by retaining only the first few principal components, thereby reducing the data size. According to Kaiser’s criterion, the number of eigenvalues of the correlation matrix that exceed one determines the number of significant principal components.

Suppose that the goal is to study p random variables \(\:{X}_{1},\dots\:,{X}_{p}\). We consider the vector \(\:\varvec{X}\) as follows:

$$\:\varvec{X}={\left({X}_{1},\dots\:,{X}_{p}\right)}^{T}$$
(1)

We define the mean vector and the variance matrix of the vector X as follows

$$\:\varvec{\mu\:}=E\left(\varvec{X}\right)={\left({\mu\:}_{1},\dots\:,{\mu\:}_{p}\right)}^{T}$$
(2)

and

$$\:{\Sigma\:}=Var\left(\varvec{X}\right)=\left[\begin{array}{ccc}{\sigma\:}_{11}&\:\dots\:&\:{\sigma\:}_{1p}\\\:\dots\:&\:\dots\:&\:...\\\:{\sigma\:}_{p1}&\:\dots\:&\:{\sigma\:}_{pp}\end{array}\right]$$
(3)

where \({\mu}_{i}\) is the mean of the random variable \({X}_{i}\) and \({\sigma}_{ij}\) is the covariance between the random variables \({X}_{i}\) and \({X}_{j}\). Assume \({\lambda}_{1}\ge{\lambda}_{2}\ge\dots\ge{\lambda}_{p}\) are the eigenvalues and \({\varvec{e}}_{1},\dots,{\varvec{e}}_{p}\) are the corresponding eigenvectors of the matrix \({\Sigma}\). Then the ith principal component (\({Y}_{i}\)) is computed by:

$$\:{Y}_{i}={\varvec{e}}_{\varvec{i}}^{\mathbf{{\prime\:}}}\varvec{X}={e}_{i1}{X}_{1}+\dots\:+{e}_{ip}{X}_{p}\:\:$$
(4)
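As a minimal sketch of Eqs. (1)–(4) combined with Kaiser's criterion, the following Python snippet decomposes the correlation matrix and keeps the components whose eigenvalues exceed one. The function name `pca_kaiser`, the use of standardized variables, and the toy data are illustrative assumptions; they are not the implementation or data used in this study.

```python
import numpy as np

def pca_kaiser(data):
    """PCA on the correlation matrix; keep components with eigenvalue > 1."""
    # Standardize each column (assumption: PCA is applied to standardized
    # features, so the covariance of z equals the correlation matrix).
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(data, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]          # sort so lambda_1 >= ... >= lambda_p
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    n_keep = int(np.sum(eigvals > 1.0))        # Kaiser's criterion
    scores = z @ eigvecs[:, :n_keep]           # Y_i = e_i' X, as in Eq. (4)
    return eigvals, eigvecs, n_keep, scores

# Hypothetical toy data: 200 observations of 4 features forming two correlated pairs.
rng = np.random.default_rng(0)
a = rng.normal(size=(200, 1))
b = rng.normal(size=(200, 1))
data = np.hstack([a, a + 0.1 * rng.normal(size=(200, 1)),
                  b, b + 0.1 * rng.normal(size=(200, 1))])
eigvals, eigvecs, n_keep, scores = pca_kaiser(data)
```

Because the toy features form two strongly correlated pairs, two eigenvalues dominate and the retained scores are mutually uncorrelated, which is the property PCA is designed to deliver.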

Factor analysis

Similar to PCA, factor analysis (FA) is a widely used multivariate approach that transforms multiple correlated features into a smaller set of features known as factors. The initial factors in this transformation capture the most significant information from the original dataset34. In contrast to PCA, FA focuses on exploring correlations between features, where features within the same factor have higher correlations while those across different factors have lower correlations. As with PCA, FA reduces high-dimensional data to fewer dimensions by retaining only the most significant factors, thereby achieving data compression. The number of factors is selected using the same method as PCA. An FA structure with m factors (where m ≤ p) can be expressed as:

$$\:\varvec{X}-\varvec{\mu\:}=L\varvec{F}+\varvec{\epsilon\:}\:\:$$
(5)

such that

$$\:\text{L}=\left[\begin{array}{ccc}{l}_{11}&\:\dots\:&\:{l}_{1m}\\\:\dots\:&\:\dots\:&\:...\\\:{l}_{p1}&\:\dots\:&\:{l}_{pm}\end{array}\right]$$
(6)
$$\:\varvec{F}={\left({F}_{1},\dots\:,{F}_{m}\right)}^{T}$$
(7)

and

$$\:\varvec{\epsilon\:}={\left({\epsilon\:}_{1},\dots\:,{\epsilon\:}_{p}\right)}^{T}$$
(8)

where \(\:\varvec{F}\) is the factors vector, \(\:\text{L}\) is the loading matrix and \(\:\varvec{\epsilon\:}\) is the error vector.

The FA structure can be represented as

$$\:{X}_{i}-{\mu\:}_{i}=\sum\:_{j=1}^{m}{l}_{ij}{F}_{j}+{{\upepsilon\:}}_{i},\:\:\:\:\:\:i=1,\dots\:.,p$$
(9)

where \({l}_{ij}\) is called the loading of \({X}_{i}\) on the factor \({F}_{j}\).

For orthogonal FA, it can be shown that

$$\:Cov\left(\varvec{X},\varvec{F}\right)=L\:\:\:\:\:\:\:$$
(10)

and

$$\:{\Sigma\:}=\text{L}{\text{L}}^{T}+{\Psi\:}\:\:\:\:\:\:$$
(11)

such that

$$\:{\Psi\:}=\:Var\left(\varvec{\epsilon\:}\right)$$
(12)

Therefore,

$$\:Var\left({X}_{i}\right)=\sum\:_{j=1}^{m}{l}_{ij}^{2}+{{\Psi\:}}_{i}$$
(13)

and

$$Cov\left({X}_{i},{X}_{j}\right)=\sum_{k=1}^{m}{l}_{ik}{l}_{jk},\quad i\ne j$$
(14)

The primary goal of FA is to determine the loading values. Different techniques, such as maximum likelihood (ML) and the principal component technique, can be used to compute the matrices L and \({\Psi}\). To determine the matrix L, the principal component technique decomposes the matrix \({\Sigma}\) using eigenvalues and eigenvectors. The maximum likelihood method computes and optimizes the likelihood to discover the matrices L and \({\Psi}\). Once the loading values have been estimated, loading plots can be examined. Loading plots have several applications:

  • Investigating the correlations between features.

  • Classifying and categorizing features.

  • Determining the number of factors m.

The correlation (\(r\)) of two features is indicated by the angle (\(\theta\)) between their loading vectors (Fig. 2). \(\theta={90}^{\circ}\) suggests that two features are uncorrelated (\(r=0\)). The case \(\theta={0}^{\circ}\) corresponds to an exact positive linear relationship and the case \(\theta={180}^{\circ}\) corresponds to an exact negative linear relationship.

Fig. 2 Interpretation of the loading plot: F1 and F2 are the main factors and X1 and X2 are two features.
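The principal component technique for estimating the loadings and the angle-based reading of a loading plot can be sketched as follows. This is only an illustration under stated assumptions: the helper names `pc_loadings` and `loading_angle_deg` and the 3×3 correlation matrix are hypothetical, and the angle reflects the correlation as reproduced by the retained factors, not necessarily the raw sample correlation.

```python
import numpy as np

def pc_loadings(corr, m):
    """Principal-component estimate of the loading matrix L and of Psi."""
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:m]      # indices of the m largest eigenvalues
    # l_ij = sqrt(lambda_j) * e_ij for the m leading eigenpairs
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    # Psi_i = Var(X_i) - sum_j l_ij^2, rearranged from Eq. (13)
    psi = np.diag(corr) - np.sum(loadings ** 2, axis=1)
    return loadings, psi

def loading_angle_deg(loadings, i, j):
    """Angle (degrees) between the loading vectors of features i and j."""
    u, v = loadings[i], loadings[j]
    cos_theta = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Hypothetical correlation matrix: features 0 and 1 are strongly correlated,
# feature 2 is uncorrelated with both.
corr = np.array([[1.0, 0.9, 0.0],
                 [0.9, 1.0, 0.0],
                 [0.0, 0.0, 1.0]])
loadings, psi = pc_loadings(corr, m=2)
angle_01 = loading_angle_deg(loadings, 0, 1)   # near 0 degrees
angle_02 = loading_angle_deg(loadings, 0, 2)   # near 90 degrees
```

In this toy case the two correlated features point in the same direction in the loading plane, while the uncorrelated feature is orthogonal to them, matching the interpretation given for Fig. 2.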

Fuzzy clustering analysis

Clustering35 is a powerful data analysis tool in data mining. Among clustering techniques, soft clustering algorithms36,37 have recently gained popularity, as studies have shown that these methods outperform conventional hard clustering algorithms38,39. Unlike hard clustering, soft clustering allows every point to belong to multiple clusters with varying membership degrees (probabilities). Among these, Fuzzy C-means (FCM) clustering40 is the most widely used soft clustering technique. Consider n observations from a p-dimensional vector X = (X1, …, Xp)ᵀ, represented as:

$$\:{\varvec{X}}_{i}={\left({X}_{i1},\dots\:,{X}_{ip}\right)}^{T},\:i=1,\dots\:,\:n\:\:\:\:\:\:$$
(15)

The aim is to partition the dataset into lower-dimensional clusters \({C}_{1},\dots,{C}_{k}\). Let \({\varvec{c}}_{1},\dots,{\varvec{c}}_{k}\) denote the centroids of the clusters \({C}_{1},\dots,{C}_{k}\). Suppose that \({u}_{ij}\) is the membership degree (probability of membership) of \({\varvec{X}}_{i}\) in the cluster \({C}_{j}\), and that \(m>1\) is the fuzzification degree (not to be confused with the number of factors in FA). In FCM, we minimize

$$\sum_{i=1}^{n}\sum_{j=1}^{k}{u}_{ij}^{m}{\parallel{\varvec{X}}_{i}-{\varvec{c}}_{j}\parallel}^{2}$$
(16)

where \(\parallel\cdot\parallel\) is any norm used to measure the similarity between \({\varvec{X}}_{i}\) and \({\varvec{c}}_{j}\). At each iteration, the membership degrees and the centroids are updated as follows:

$${u}_{ij}=\frac{1}{\sum_{l=1}^{k}{\left(\frac{\parallel{\varvec{X}}_{i}-{\varvec{c}}_{j}\parallel}{\parallel{\varvec{X}}_{i}-{\varvec{c}}_{l}\parallel}\right)}^{\frac{2}{m-1}}}$$
(17)

and

$${\varvec{c}}_{j}=\frac{\sum_{i=1}^{n}{u}_{ij}^{m}{\varvec{X}}_{i}}{\sum_{i=1}^{n}{u}_{ij}^{m}}$$
(18)

The procedure stops when

$$\parallel{U}^{(s+1)}-{U}^{\left(s\right)}\parallel<\delta$$
(19)

where \(0<\delta<1\) is the termination threshold and \({U}^{\left(s\right)}\) is the matrix of membership degrees at iteration s.

It should be noted that the number of clusters is determined using the silhouette index.
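The FCM iteration can be sketched in a few lines of NumPy, assuming the standard formulation in which memberships are raised to the fuzzification degree m in the objective and in the centroid update. The function `fuzzy_c_means`, its default parameters, and the two-blob toy data are illustrative assumptions and not the R or Minitab routines used in this study.

```python
import numpy as np

def fuzzy_c_means(x, k, m=2.0, delta=1e-5, max_iter=300, seed=0):
    """Minimal fuzzy C-means with Euclidean norm; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    u = rng.random((n, k))
    u /= u.sum(axis=1, keepdims=True)          # each row of U sums to 1
    for _ in range(max_iter):
        um = u ** m
        # Centroid update: weighted mean of the observations, Eq. (18)
        centers = (um.T @ x) / um.sum(axis=0)[:, None]
        # dist[i, j] = ||X_i - c_j||, floored to avoid division by zero
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)
        # ratio[i, j, l] = (d_ij / d_il)^(2/(m-1)); summing over l gives Eq. (17)
        ratio = (dist[:, :, None] / dist[:, None, :]) ** (2.0 / (m - 1.0))
        u_new = 1.0 / ratio.sum(axis=2)
        converged = np.linalg.norm(u_new - u) < delta   # stopping rule, Eq. (19)
        u = u_new
        if converged:
            break
    return centers, u

# Hypothetical toy data: two well-separated 2-D blobs of 30 points each.
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(0.0, 0.3, size=(30, 2)),
               rng.normal(5.0, 0.3, size=(30, 2))])
centers, u = fuzzy_c_means(x, k=2)
labels = u.argmax(axis=1)                      # hard labels from soft memberships
```

The soft memberships in `u` are what distinguish FCM from hard clustering; taking the arg-max per row is only one way to report a single cluster per observation.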

A comparative summary across PCA, FA, and FCA is as follows:

  • PCA, FA, and FCA are three independent multivariate techniques that can be employed in the fault detection process.

  • The PCA method, using covariance matrix decomposition, enables the extraction of principal components with the highest variance. By calculating eigenvalues and eigenvectors, this method represents data in a lower-dimensional space. Its computational simplicity and high execution speed are prominent features that make it suitable for preliminary data analysis.

  • The FA method, by modeling latent factors, allows for the examination of complex relationships between variables. This technique, by considering measurement error and using a factor loading matrix, can identify hidden structures in data. Its robustness against noisy data and ability to work with highly correlated variables are among its key advantages.

  • The FCA method, leveraging fuzzy set concepts, enables data classification under uncertainty. By assigning membership degrees and optimizing the objective function, this method demonstrates high flexibility in analyzing complex patterns. Although its computational complexity is higher, its ability to handle ambiguous and borderline data gives it a distinct advantage.

Each of these methods has unique characteristics that make them suitable for different analytical scenarios. PCA, with its focus on dimensionality reduction and computational simplicity; FA, with its ability to uncover hidden relationships and robustness against noise; and FCA, with its capability to operate under uncertainty, provide researchers with a comprehensive set of analytical tools.

Implementation of diagnostic procedure

Figure 3 illustrates the proposed clustering-based framework for Frequency Response Analysis (FRA). The systematic implementation methodology encompasses the following key phases:

Fig. 3 The FRA analysis process based on the suggested clustering methods.

Phase 1: data acquisition and preparation

  • Perform FRA scans (100 Hz–1000 kHz) using the FRANEO 800 analyzer.

  • Record data for healthy and faulty transformers.

  • Import datasets and validate signal integrity.

  • Generate comparative plots for: Low-band (100 Hz–100 kHz), Mid-band (100–600 kHz), High-band (600–1000 kHz).
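The band split used for the comparative plots can be sketched as a simple masking step. The band edges follow the text; the function name `split_bands`, the half-open interval convention at the 100 kHz and 600 kHz edges, and the example arrays are assumptions made for illustration.

```python
import numpy as np

# Band edges in Hz, as stated in the text (100 Hz-100 kHz, 100-600 kHz, 600-1000 kHz).
BANDS_HZ = [("low", 100.0, 100e3), ("mid", 100e3, 600e3), ("high", 600e3, 1000e3)]

def split_bands(freq_hz, magnitude_db):
    """Split an FRA trace into low, mid, and high frequency bands."""
    bands = {}
    for idx, (name, lo, hi) in enumerate(BANDS_HZ):
        last = idx == len(BANDS_HZ) - 1
        # Half-open intervals [lo, hi) so no point is counted twice;
        # the final band also includes its upper edge (assumed convention).
        upper = freq_hz <= hi if last else freq_hz < hi
        mask = (freq_hz >= lo) & upper
        bands[name] = (freq_hz[mask], magnitude_db[mask])
    return bands

# Hypothetical six-point trace; a real sweep would hold 1,280 points.
freq = np.array([100.0, 5e4, 1e5, 3e5, 6e5, 1e6])
mag = np.array([-10.0, -20.0, -30.0, -40.0, -50.0, -60.0])
bands = split_bands(freq, mag)
```

Each band can then be passed separately to PCA, FA, or FCM, which mirrors the per-band analysis reported in the results.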

Phase 2: feature extraction

The system employs three parallel methods for extracting key features:

  1. PCA Method:

  • Calculation of principal components through covariance matrix decomposition.

  • Selection of components with eigenvalues greater than 1 (Kaiser criterion).

  • Retention of components covering at least 95% of data variance.

  2. FA Method:

  • Modeling of latent factors by examining factor loadings.

  • Validation using Kaiser-Meyer-Olkin (KMO) test (value > 0.6).

  • Selection of factors with loadings greater than 0.7.

  3. FCM Method:

  • Implementation of fuzzy clustering with fuzzification degree m = 2.

  • Calculation of membership degrees and cluster center updates.

  • Process termination upon convergence (change in the membership matrix below δ, with 0 < δ < 1).

Phase 3: fault detection

Each method employs specific fault detection criteria:

  • PCA: Deviations exceeding 3σ in principal components and significant variance changes.

  • FA: Changes exceeding 25% in factor loadings and residual increases.

  • FCM: Cluster center displacements and membership distribution anomalies.
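As one possible reading of the 3σ criterion for PCA, the sketch below flags a measurement when any of its principal component scores deviates from the healthy baseline by more than three standard deviations. The text does not specify how σ is estimated, so estimating it from a set of baseline scores is an assumption, as are the function name `three_sigma_flags` and the example numbers.

```python
import numpy as np

def three_sigma_flags(baseline_scores, new_scores, n_sigma=3.0):
    """Flag rows of new_scores deviating > n_sigma from the baseline scores."""
    # Assumption: mean and sigma are estimated per component from healthy data.
    mu = baseline_scores.mean(axis=0)
    sigma = baseline_scores.std(axis=0, ddof=1)
    z = np.abs(new_scores - mu) / sigma
    return (z > n_sigma).any(axis=1)

# Hypothetical scores on two principal components.
baseline = np.array([[0.0, 1.0], [0.2, 0.8], [-0.2, 1.2],
                     [0.1, 0.9], [-0.1, 1.1]])
candidates = np.array([[0.05, 1.0],    # close to the healthy baseline
                       [2.00, 1.0]])   # far along the first component
flags = three_sigma_flags(baseline, candidates)
```

Here only the second candidate is flagged; the first stays within the baseline spread on both components.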

Phase 4: decision making

  • Final output specifying fault type (SC/AD/RD).

  • Graphical results presentation.

Case study

Power transformer data set

The study utilizes a three-phase 1200 kVA power transformer with a voltage rating of 20/0.4 kV (D/Yg connection), representing a typical distribution transformer configuration commonly employed in power systems. The transformer features distinct winding designs for high and low voltage sides to facilitate comprehensive fault analysis. The high-voltage winding comprises 70 interleaved disks with 80 turns per disk (totaling 5600 turns), utilizing round conductors with an inner diameter of 987 mm and outer diameter of 1086 mm. In contrast, the low-voltage winding employs a continuous layer design with 112 turns, featuring round conductors of 823 mm and 891 mm inner and outer diameters respectively. The core structure measures 2033 mm in height and 3785 mm in length, with winding heights of 1154 mm (HV) and 1249 mm (LV). The transformer’s electrical characteristics include a 2.431% impedance at 50 Hz frequency, with Kraft paper and mineral oil serving as the primary insulation materials. To systematically evaluate fault detection capabilities, ten distinct severity levels of three fundamental fault types (axial displacement, radial deformation, and short circuit) were artificially induced at strategic locations across all three phases (A, B, and C). The complete parametric details of these fault configurations are comprehensively documented in Tables 1, 2 and 3 to ensure reproducibility and facilitate comparative analysis.

a) Short Circuit Fault Simulation: This study implemented ten distinct levels of inter-disk short circuit faults at specified disk pairs in the HV winding. The investigated disk pairs included 11–12, 13–15, 18–20, 21–24, 25–26, 27–29, and 32–35, with additional analysis of combined faults at disks 11–12/25–26 and 18–20/27–29. Table 1 shows these simulations:

Table 1 Different levels of SCs at various locations.

b) Axial Displacement: Ten progressive levels of axial displacement faults were systematically simulated by displacing the HV winding relative to the LV winding in precise 6.25 mm increments, corresponding to 1–5% of the total winding height (1154 mm). This resulted in displacement magnitudes ranging from 12.5 mm (1.08%) to 62.5 mm (5.41%), covering both minor misalignments and severe deformations observed in practical scenarios. Table 2 shows how these faults were created.

Table 2 Various axial displacement levels.

c) Radial Deformation: This study systematically investigated radial deformation (RD) by introducing ten distinct fault levels through controlled mechanical deformation of the disk winding. The simulation encompassed various deformation patterns, including single-axis (Fig. 4a), dual-axis opposed (Fig. 4b), three-axis (Fig. 4c), and four-axis symmetric (Fig. 4d) configurations, with the angular position fixed at θ = 45° for standardization. The deformation severity was quantified using the ratio d/R (Eq. 20), where d represents the radial bending magnitude (d = R - R₁) and R denotes the original average radius. The complete parameter sets for all test cases, including detailed geometric specifications and deformation patterns, are documented in Table 3, while Fig. 4a–d illustrates the various deformation modes.

$$\%\,\text{RD fault level}=\frac{R-{R}_{1}}{R}\times 100\%=\frac{d}{R}\times 100\%$$
(20)

The power transformer’s dimensions, specifications, and capacity are given in Table 4.

Fig. 4 Deformation of the winding along (a) one axis, (b) two axes, (c) three axes, and (d) four axes.

Table 3 Different radial deformation levels.
Table 4 Transformer’s specifications under investigation.

FRA simulation results

The impacts of RD, AD, and SC faults on the transformer’s FRA waveforms are illustrated in Figs. 5(a)–(c) for ten levels of each fault. This study employs an OMICRON analyzer to conduct the FRA measurements. Although the FRA changes are visible in Fig. 5, their analysis is highly challenging. Additionally, low-level fault recognition poses a challenge for conventional FRA. However, the proposed method automates the interpretation process and can be readily applied to FRA, as described below.

Fig. 5 Variation of FRA measurements due to AD, RD, and SC defects at different fault levels.

Simulations and results of the three proposed methods

This section reports the results of PCA, FA, and FCA for detecting transformer winding faults. The analysis was performed using R software version 3.6.1 and Minitab version 18. Subsection C.1 presents the PCA results, while Subsections C.2 and C.3 provide the FA and FCA results, respectively.

C.1. Results of PCA to diagnose winding faults

This section presents the results of using PCA to diagnose winding faults in the transformer. The eigenvalues of the correlation matrix for variables at low frequency are shown on the left side of Fig. 6a. As shown, only the first two values are greater than 1. The right side of Fig. 6a illustrates that these variables can be classified into two categories: (1) healthy, AD, and RD systems, and (2) SC systems. Consequently, at low frequencies, the RD and AD data are similar to those of a healthy system. However, PCA does not confirm this similarity for SC data. Similarly, the eigenvalues of the mid-frequency correlation matrix are presented on the left side of Fig. 6b, where again only the first two values exceed 1.

Fig. 6 Eigenvalue plot for the correlation matrix and the first two loading components for PCA. (a) low-frequency, (b) middle-frequency, (c) high-frequency.

The right side of Fig. 6b shows that these variables can be classified into two categories: (1) healthy, SC, and RD systems, and (2) AD systems. Consequently, at mid-frequency, the SC and RD data show similarity with a healthy system. However, PCA does not confirm this similarity for AD data. The left side of Fig. 6c displays the eigenvalues of the high-frequency correlation matrix, where only the first two values exceed 1. The right side of Fig. 6c demonstrates that these variables can again be classified into two categories: (1) healthy, SC, and AD systems, and (2) RD systems. At high frequencies, the SC and AD data are similar to those of a healthy system, while PCA fails to establish this similarity for RD data.

C.2. Results of FA to indicate winding faults

The results of FA to detect transformer winding faults are reported in this section. Figures 7a-c show that these variables can be categorized into two groups across all frequency ranges. At low frequencies (Fig. 7a), the variables separate into: (1) healthy, AD, and RD systems, and (2) SC systems. Consequently, the AD and RD data show similarity with healthy system data, while FA does not confirm this similarity for SC cases.

Fig. 7 The loading factors for the first two factors of FA. (a) low-frequency, (b) middle-frequency, (c) high-frequency.

In the mid-frequency range (Fig. 7b), the variables divide into: (1) healthy, SC, and RD systems, and (2) AD systems. Here, the SC and RD data match healthy system data, whereas FA fails to establish this correspondence for AD cases. At high frequencies (Fig. 7c), the classification yields: (1) healthy, SC, and AD systems, and (2) RD systems. While the SC and AD data correlate with healthy system data, FA does not demonstrate this correlation for RD cases.

C.3. Results of FCA to indicate winding faults

The findings of FCA to detect transformer winding defects are provided in this section. As illustrated in Figs. 8a-c, the variables can be classified into two groups according to their frequency characteristics. At low frequencies (Fig. 8a), the classification reveals: (1) healthy, AD, and RD systems, and (2) SC systems. Accordingly, the AD and RD values closely match those of healthy systems, while FCA does not demonstrate this correspondence for SC cases. In the medium frequency range (Fig. 8b), the groups comprise: (1) healthy, SC, and RD systems, and (2) AD systems. Here, the SC and RD values align with healthy system values, whereas FCA fails to establish this relationship for AD data. At high frequencies (Fig. 8c), the classification yields: (1) healthy, SC, and AD systems, and (2) RD systems. While the SC and AD measurements correspond to healthy system values, FCA does not confirm this similarity for RD data.

Fig. 8 Fuzzy clustering plot. (a) low-frequency, (b) middle-frequency, (c) high-frequency.

Comparative analysis

In this section, we provide a comparative analysis of our proposed methods against several alternative methods, including random forest (RF)41, artificial neural network (ANN)42, gradient boosting (GB)43, and decision tree (DT)44. The evaluation is based on four key performance metrics: precision, recall, F1-score, and accuracy by the following formulas:

$$\text{precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}$$
$$\text{recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}$$
$$\text{F1-score} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$$

and

$$\text{accuracy} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{TN} + \text{FP} + \text{FN}},$$

where TP, FP, FN, and TN denote true positives, false positives, false negatives, and true negatives, respectively. As shown in Table 5, our data visualization approaches outperformed the comparative methods on most performance metrics. The FCA approach achieves the highest accuracy, 98.9%.
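The four metrics can be computed directly from the confusion counts. The following is a minimal sketch for a single class treated as positive against all others; the function name `classification_metrics` and the counts are hypothetical and are not taken from Table 5.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1-score, and accuracy from confusion counts.

    Ratios with a zero denominator are reported as 0.0 (a common convention,
    assumed here for illustration).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for one fault class, for illustration only.
metrics = classification_metrics(tp=45, fp=5, fn=3, tn=47)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a multi-class problem such as healthy, SC, AD, and RD, these per-class values would typically be averaged across classes; the averaging scheme is not specified here.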

Table 5 Classification performance of different approaches in fault detection.

FA, PCA, and RF perform similarly, with accuracies of 96.6%, 96.6%, and 96.4%, respectively. Although RF performs comparably to PCA and FA, we recommend PCA and FA over RF because they are visual approaches. The results of the comparative analysis indicate that our proposed visualization approaches outperform the alternative methods across all evaluated metrics. This consistent advantage highlights their potential as more reliable and effective solutions for transformer winding fault detection.

Discussion

FRA is a cost-effective, accurate, and non-destructive technique for rapid assessment of transformers’ mechanical integrity. However, interpreting FRA results is not yet automated. This study proposes a novel SFRA-based methodology to automate fault detection and interpretation. The proposed approach was tested on a three-phase 50 Hz, 1.2 MVA, 20/0.4 kV transformer. Various electrical faults (SCs) and mechanical faults (AD and RD) were artificially simulated at multiple levels in transformer windings for FRA testing. The FRA signatures were divided into three frequency bands: low (100 Hz–100 kHz), mid (100–600 kHz), and high (600–1000 kHz). To overcome the interpretation challenge, the automatic detection module simultaneously employs three graphical methods (FA, FCA, and PCA) to analyze FRA results for detecting and classifying RD, AD, and SC defects. These multivariate techniques reduce high-dimensional data complexity:

  1. PCA and FA transform multiple features into principal components and factors, respectively.

  2. FCA, unlike PCA and FA, focuses on feature similarities rather than correlations.

  3. Within-cluster feature values are highly similar, while between-cluster values show significant divergence.

The model demonstrates three key capabilities: simulating diverse fault types for FRA interpretation, facilitating easy detection of various faults, and eliminating expert-dependent interpretation of frequency responses, thereby reducing subjective judgments.
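The division of an FRA signature into the low, mid, and high bands given above can be sketched as follows. The band edges come from the text; the half-open boundary convention (a shared edge such as 100 kHz is assigned to the higher band), the function name `split_into_bands`, and the sample sweep values are assumptions made for illustration.

```python
# Band edges in Hz, as stated in the text.
BANDS_HZ = {
    "low": (100.0, 100e3),
    "mid": (100e3, 600e3),
    "high": (600e3, 1000e3),
}


def split_into_bands(freqs_hz, magnitudes_db):
    """Group (frequency, magnitude) samples by frequency band.

    Bands are treated as half-open intervals [lo, hi), except that the top
    edge of the highest band is included. Samples outside 100 Hz to 1000 kHz
    are dropped.
    """
    bands = {name: [] for name in BANDS_HZ}
    for f, mag in zip(freqs_hz, magnitudes_db):
        for name, (lo, hi) in BANDS_HZ.items():
            if lo <= f < hi or (name == "high" and f == hi):
                bands[name].append((f, mag))
                break
    return bands


# Hypothetical sweep points (Hz) and magnitudes (dB), for illustration only.
freqs = [50.0, 100.0, 50e3, 100e3, 300e3, 600e3, 1e6, 2e6]
mags = [-10.0, -12.0, -30.0, -35.0, -42.0, -50.0, -55.0, -60.0]

bands = split_into_bands(freqs, mags)
for name, samples in bands.items():
    print(name, [f for f, _ in samples])
```

Each band's samples can then be passed separately to PCA, FA, or FCA, matching the per-band analysis described in this study.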

Conclusion

Interpreting transformer FRA results is challenging and has traditionally relied on error-prone human experts. This study proposes, for the first time, three automatic graphical clustering methodologies for interpreting FRA measurements across different frequency ranges. To validate these methods, a series of tests was conducted on an actual transformer. The required data were obtained through FRA measurements performed with an FRANEO 800 analyzer on both healthy and faulty transformers. The assessed faults included SC, RD, and AD defects. The measured FRA characteristics were analyzed across three sub-frequency bands. The proposed clustering techniques were validated by experimental FRA measurements obtained from artificial fault simulations. Key findings include:

  1. The clustering results match the original FRA label distribution, demonstrating the method’s applicability for processing FRA data.

  2. Different winding fault types form distinct clusters with clear boundaries, effectively separating the three types of winding deformation faults.

  3. The optimal frequency bands for diagnosing RD, AD, and SC faults are high, medium, and low frequencies, respectively.

  4. The proposed methods accurately assess fault severity.

Future research directions include:

  1. Creating models to forecast faults based on FRA data and statistical methods could enhance preventive maintenance.

  2. Exploring integration with IoT and smart systems could enable automated, intelligent fault detection.

  3. Investigating these techniques in various transformer types would help generalize the findings.

  4. Studying temperature, humidity, and pollution effects on FRA results could improve detection accuracy.

  5. Additional experiments under diverse real-world conditions would strengthen the results.