Introduction

Power transformers are critical components of modern electrical power systems, and their reliable operation is essential for maintaining an uninterrupted electricity supply for utilities1. A transformer failure not only leads to catastrophic service disruptions but also imposes avoidable financial burdens on stakeholders2,3. The insulation system of a transformer, primarily composed of mineral oil (MO) and cellulose paper (CP), is highly sensitive to dynamic thermal, electrical, and chemical stresses4. The health of this insulation, especially the cellulosic paper used in the windings, which undergoes rapid thermal and environmental decomposition, directly reflects the operating condition of the transformer5,6. As the cellulosic insulating paper ages, its tensile strength gradually weakens owing to the breakdown of its constituent carbon compounds7. CP made of kraft paper is particularly susceptible to moisture ingress, which causes a rapid decline in physical strength and accelerates further decomposition8,9. This decomposition releases several chemical compounds, namely furans, into the surrounding transformer oil. Among the furanic compounds, 2-Furfuraldehyde (2-FAL) has proven to be a dependable indicator of insulation aging because of its higher proportion and solubility in transformer oil10,11,12. Its concentration is strongly linked to the Degree of Polymerization (DP), a parameter that directly reflects the structural strength of the insulating material13,14,15,16. Several analytical models demonstrate a strong relationship between the 2-FAL and DP values of the insulating paper, and both are therefore regarded as valuable diagnostic metrics for assessing the condition of aged power transformers17,18.
Nevertheless, the diverse analytical models available lack universal validation for precisely estimating the average Degree of Polymerization (DP) in operational transformers due to their inherent limitations. Certain models assume uneven degradation between the outer and inner insulation layers, while others suggest that thermally upgraded paper produces higher 2-Furfuraldehyde (2-FAL) levels in transformer oil.

The reliable operation of power transformers is paramount for grid stability. Traditional diagnostic methods often rely on Dissolved Gas Analysis (DGA) to detect fault conditions19,20. DGA identifies gases produced by the decomposition of insulating oil and materials under thermal and electrical stresses. Techniques such as IEC ratios, Rogers ratios, and Duval triangles are commonly employed to interpret DGA results6,21. However, these methods have limitations in accurately predicting the remaining life of the solid insulation6. Recent research has focused on the aging of the cellulose paper, a key component of the insulation system, with the Degree of Polymerization serving as a critical indicator17,22,23. Direct DP measurement requires physical samples of insulation paper taken from in-service transformers, which is highly disruptive: the transformer must be shut down, a costly and lengthy task that interrupts the power supply and raises safety concerns. In addition, a single sample might not give a full picture of the transformer’s condition because degradation is not uniform; inner layers can age differently owing to temperature variations, and one sample could overlook local damage or average out inconsistent wear24,25. In an era where real-time monitoring and predictive maintenance are becoming essential, there is a growing need for automated and intelligent diagnostic systems. Therefore, non-destructive methods for estimating DP are highly desirable. 2-Furfuraldehyde (2-FAL), a by-product of cellulose degradation found in transformer oil, has emerged as a reliable marker of insulation aging2.

Several studies have sought to estimate DP from insulation decomposition by-products such as 2-FAL and carbon oxide (CO and CO2) gases26,27. Furthermore, machine learning techniques have been increasingly applied to predict transformer health and remaining useful life based on DGA data and other decomposition by-products of the insulating paper. These include various algorithms for regression and classification tasks. Li et al. developed a fuzzy c-means clustering and regression model to predict DP using 2-FAL, CO, and CO2, alongside a novel dispersion staining color marker, achieving superior accuracy over standalone 2-FAL models26. However, their reliance on laboratory data and complex staining techniques limits field applicability. Senoussaoui et al. presented a streamlined machine-learning method to evaluate transformer oil condition, focusing on insulation degradation. Using J48 and Random Forest classifiers, the method groups oils into four categories. Random Forest performed best, aided by preprocessing with k-means, feature selection, and PCA, which minimized the data needed for effective monitoring. Although focused on oil quality, its Random Forest approach aligns with our 2-FAL-based DP estimation, but its multi-parameter method adds complexity27. Li et al.17 used multiple oil aging parameters (dissolved gases, furans, moisture) with fuzzy c-means and linear regression, improving DP estimation accuracy over single-parameter methods for better life assessment. However, the method’s reliance on multiple oil aging parameters may be limited by the variability and inconsistency of these parameters across different transformer operating conditions, potentially affecting the accuracy of DP estimation in real-world applications. Advancing the use of ensemble methods, the authors of ref. 28 presented a study on monitoring power transformer health using an AI-based Health Index (HI) that tackles data uncertainty.
It assesses oil quality, dissolved gas analysis, and paper condition from 504 transformers, comparing kNN, SVM, Random Forest, Naïve Bayes, ANN, AdaBoost, and Decision Tree models. Random Forest was found to excel, with 97.3% accuracy. However, collecting data from 504 specific 150-kV transformers may limit the generalizability of the AI-based HI models to other transformer types or operating conditions, potentially reducing accuracy in diverse real-world scenarios. Ashkezari et al. introduced an intelligent algorithm using a fuzzy support vector machine to assess the health of in-service power transformer insulation systems. It processes oil test data from 181 transformers, building a statistical model with a training database, and evaluates performance through numerical experiments. However, the algorithm’s effectiveness may be constrained by the limited diversity of the training database, as it relies solely on historical data from 181 transformers, potentially reducing its accuracy for transformers with different characteristics or operating conditions29. These studies underscore the potential of machine learning for non-invasive DP estimation but highlight limitations in data availability, computational feasibility, and real-world validation, which this work addresses through a simplified Random Forest approach using 2-FAL as the primary indicator.

This paper contributes to this body of knowledge by proposing an AI-driven framework for early prediction of cellulose paper condition, using 2-FAL concentration to estimate DP and hence classify the transformer insulation into four distinct states: Fresh (F), Lightly Aged (LA), Moderately Aged (MA), and Worstly Aged (WA). Unlike conventional DGA-based approaches, this method emphasizes 2-FAL as a scientifically validated marker, offering a non-destructive approach to assess transformer health and facilitate proactive maintenance strategies. In recent years, machine learning has emerged as a powerful tool for predictive maintenance. Techniques such as Decision Trees, Support Vector Machines, and ensemble models have been applied to classify transformer faults with high accuracy. Despite their effectiveness, many rely on a large number of input features, which increases system complexity and cost. This study addresses this limitation by proposing a real-time monitoring system that uses a single input, 2-Furfuraldehyde (2-FAL), to estimate the Degree of Polymerization (DP) as the output, enabling a cost-effective and easily implementable solution for early fault detection and proactive maintenance in power transformers.

DP and 2-FAL as reliable diagnostic indicators

Degree of polymerization

Transformer insulating paper is a cellulose-based material primarily derived from wood, which in its dry state contains 40–50% cellulose, 20–30% lignin, and 10–30% hemicellulose and polysaccharides. After processing, the insulating paper consists of approximately 90% cellulose, 3–4% lignin, and 6–7% hemicellulose13,30. Cellulose, a linear polymer, is composed of anhydroglucose units connected by glycosidic bonds, with the molecular formula [C6H10O5]n, where ‘n’ represents the degree of polymerization (DP), indicating the number of glucose monomers in the chain31 (Fig. 1). The mechanical tensile strength of cellulose insulating paper stems from its intricate multi-layered structure, encompassing the micro-scale (individual cellulose fibers with crystalline and amorphous zones), the meso-scale (interconnected fiber networks and their bonds), and the macro-scale (the overall stability and layering of the paper sheet). This hierarchical arrangement enhances strength by distributing stress across the fibrous and polymeric framework. Over time, the cellulose degrades due to exposure to temperature, moisture, and oxygen within an operating transformer, causing the glucose chains to break. This degradation, known as chain scission, reduces the paper’s mechanical strength and is quantified by the DP, measured via viscometric testing32. Fresh insulating paper typically has a DP of 1200–1500. However, prolonged exposure to high temperatures, water, and oxygen during transformer operation can reduce the DP to 200–250, rendering the paper brittle. A DP below 200 signals that the insulating paper is nearing the end of its service life. The appearance of cellulose insulating paper varies with the degree of polymerization (DP) and tensile strength (TS)17, as illustrated in Fig. 2.

Fig. 1

Structure of glucose and cellulose.

Fig. 2

Visual appearance of cellulose papers for different DP and TS values.

2-Furfuraldehyde (2-FAL)

Thermal degradation of solid insulating paper shortens cellulose chains, producing aging by-products that dissolve in transformer oil. These by-products, unique to paper degradation, serve as diagnostic markers for assessing insulation condition. Furans, five-membered heterocyclic compounds, are key by-products found in varying concentrations in transformer oil (Fig. 3). Notable furanic derivatives include 2-Furfuraldehyde (2-FAL), 5-Hydroxymethyl-2-Furfuraldehyde, 2-Furoic Acid, Furfuryl Alcohol, 5-Methyl-2-Furfuraldehyde, and 2-Acetyl Furan16. Among these, 2-FAL is the most prevalent, occurring at higher concentrations than the other furans, and it is also more thermally stable than the other furanic derivatives. Its levels are typically measured using high-performance liquid chromatography (HPLC), an analytical method for quantifying chemical components in a solution. The concentration of 2-FAL in transformer oil correlates closely with the cumulative degradation of the cellulose polymer, as quantified by DP, making it a reliable indicator of insulation condition. As the service life of a transformer is largely determined by the durability of its cellulose insulation, 2-FAL monitoring is critical for assessing transformer longevity27,28,29. Methanol and ethanol are also produced, but they appear in lower concentrations and lack thermal stability, making them less reliable for accurately reflecting the true condition of paper insulation. Their ambiguous nature further complicates their use as consistent markers.

Fig. 3

Compounds of Furans produced due to cellulose degradation.

Materials and methods

Cellulose, a natural glucose polymer, gradually degrades over time as polymer chain scission occurs during transformer operation, influenced by operating conditions. The molecular weight of insulating paper decreases with accelerated thermal aging, particularly in the presence of moisture and oxygen. This aging process generates degradation by-products, including furans, carbon oxides (CO2 and CO), water, and acids, which dissolve in the transformer oil, as illustrated in Fig. 4. Various diagnostic methods leverage these critical by-products to assess the remaining life of aged transformers. The most widely used techniques include Degree of Polymerization (DP) measurement, Furan Analysis, and Dissolved Gas Analysis (DGA). DGA is mainly used to identify early faults in power transformers by analysing gases such as hydrogen (H2), methane (CH4), acetylene (C2H2), ethylene (C2H4), and ethane (C2H6), which form from oil breakdown during overheating or arcing rather than from solid insulation aging. Carbon oxides (CO and CO2) help assess paper condition but also stem from oil oxidation, which makes it difficult to pinpoint the source of degradation and reduces DGA’s precision for evaluating normal solid-insulation aging. Notably, furans, especially 2-Furfuraldehyde (2-FAL), are significant cellulose degradation by-products, enabling non-invasive DP estimation for predictive transformer health evaluation.

Fig. 4

Cellulose degradation by-products due to pyrolysis, hydrolysis, and oxidation processes.

This section outlines the machine learning models selected for predicting the Degree of Polymerization (DP) of transformer insulating paper based on 2-Furfuraldehyde (2-FAL) concentration, the preparation of the training dataset, the training parameters for model optimization, and the performance metrics used to evaluate the models. The health of the transformer, or its remaining useful life, is then determined by examining the interval in which the DP value lies.

Machine learning models

Because the DP of the cellulose paper decreases linearly (as per the IEEE C57.104 standard) with the concentration of 2-FAL (in ppm) in the transformer oil, the problem is framed in two ways: predicting DP as a continuous variable (regression) and classifying the insulation condition into four categories: Fresh (DP 700–1200), Lightly Aged (DP 450–700), Moderately Aged (DP 250–450), and Worstly Aged (DP < 250). Table 1 lists the states of the insulating paper described here.

Table 1 Different states of the insulating paper as per DP.
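The DP intervals quoted above can be expressed as a small lookup function. The following is a minimal sketch, not the authors' implementation: the function name `classify_dp` is hypothetical, and the handling of shared boundary values (450 and 700 are assigned to the healthier class) and of DP values above 1200 (treated as Fresh) are assumptions, since the text does not specify them.

```python
def classify_dp(dp: float) -> str:
    """Map a DP value to one of the four insulation states.

    Thresholds follow the intervals quoted in the text. Boundary handling
    is an assumption: a value on a shared boundary goes to the healthier
    class, and DP above 1200 is treated as Fresh.
    """
    if dp < 250:
        return "Worstly Aged"
    if dp < 450:
        return "Moderately Aged"
    if dp < 700:
        return "Lightly Aged"
    return "Fresh"


# Example: label a few DP estimates.
labels = [classify_dp(dp) for dp in (1000, 500, 300, 200)]
```

A threshold chain of this kind keeps the regression and classification outputs consistent, because a predicted DP can always be converted to a state after the fact.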

The following supervised machine learning models were employed.

  • Linear Regression: A baseline model assuming a linear relationship between 2-FAL concentration and DP.

  • Polynomial Regression (Degree 2): Captures non-linear relationships by fitting a quadratic polynomial to the data.

  • Random Forest Regressor: An ensemble model that uses multiple decision trees to model complex, non-linear relationships for regression.

  • Logistic Regression: A baseline classifier for predicting insulation condition categories.

  • Support Vector Machine (SVM): Uses a radial basis function (RBF) kernel to classify insulation conditions based on 2-FAL.

  • Random Forest Classifier: An ensemble classifier that leverages decision trees for robust categorical prediction.
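As an illustration, the six models listed above might be instantiated in scikit-learn roughly as follows. This is a sketch under assumptions: the dictionary names `regressors` and `classifiers` are hypothetical, and any setting not stated in the text is left at the library default.

```python
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import SVC

# Regression models: 2-FAL concentration (ppm) -> DP.
regressors = {
    "linear": LinearRegression(),
    # Degree-2 polynomial features followed by ordinary least squares.
    "poly2": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

# Classification models: 2-FAL concentration (ppm) -> insulation state.
classifiers = {
    "logistic": LogisticRegression(max_iter=1000),
    "svm_rbf": SVC(kernel="rbf"),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}
```

Each estimator exposes the same `fit` and `predict` interface, so the models can be trained and compared in a single loop.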

The selection of these models was driven by their proven effectiveness in handling tabular, non-linear data with limited features while achieving a good degree of accuracy. However, we intentionally chose not to incorporate deep learning (DL) techniques, such as Convolutional Neural Networks (CNNs) or Recurrent Neural Networks (RNNs), for several practical reasons. First, DL models generally perform best with large, varied datasets, while our study relies on a synthetic dataset of 1000 samples centered on a single feature, 2-FAL, which may not provide the diversity needed to avoid overfitting. Second, DL methods are more appropriate for complex, high-dimensional inputs such as images or sequential data, whereas our work involves a simpler input-output relationship that can be effectively handled by conventional machine learning approaches. Third, implementing DL-based solutions in real-time transformer monitoring systems would require substantial computational resources, which could pose limitations in field deployment scenarios.

Training dataset preparation

The training dataset was synthetically generated based on the IEEE C57.104-2019 standard, which provides guidelines for interpreting transformer insulation degradation through chemical by-products such as 2-FAL30. The dataset consists of 1000 samples, with 2-FAL concentration (in ppm) as the input feature and DP as the output for regression, alongside a categorical label for classification. The data were structured according to the ranges given in Table 1.

Temperature and moisture influence transformer insulation degradation and therefore affect DP. Experimental evidence shows that thermal stresses reduce the dielectric and mechanical strength of the insulation, lowering DP, while moisture accelerates paper degradation15. Importantly, elevated temperature and moisture increase 2-Furfuraldehyde (2-FAL) production in oil, which we use as the key input for DP estimation. Our 1000-sample dataset, based on IEEE C57.104-2019, includes a balanced range of temperature and moisture levels, from low to high, ensuring the proposed machine learning models capture these effects. This balanced training data supports the model’s high accuracy and reliability in predicting DP.

For each range, 2-FAL and DP values were uniformly sampled to create input-output pairs, ensuring balanced representation across all categories. The dataset was split into 80% (800 samples) training and 20% (200 samples) testing sets to evaluate model performance on unseen data.
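The sampling and splitting procedure described above could be sketched as follows. Several details are assumptions rather than the authors' code: the function name `make_dataset` is hypothetical, 2-FAL and DP are drawn independently within each range, the 2-FAL ranges are those quoted alongside the DP ranges in the comparison with existing literature, and the open-ended bounds of the Worstly Aged class (20 ppm and DP 150) are illustrative placeholders.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# (label, 2-FAL range in ppm, DP range).
# ASSUMPTION: the upper 2-FAL bound (20 ppm) and lower DP bound (150) of the
# Worstly Aged class are placeholders; the text gives only ">10 ppm" and "<250".
RANGES = [
    ("Fresh", (0.0, 0.1), (700.0, 1200.0)),
    ("Lightly Aged", (0.1, 1.0), (450.0, 700.0)),
    ("Moderately Aged", (1.0, 10.0), (250.0, 450.0)),
    ("Worstly Aged", (10.0, 20.0), (150.0, 250.0)),
]


def make_dataset(n_per_class=250, seed=42):
    """Draw 2-FAL and DP uniformly (and independently) within each range."""
    rng = np.random.default_rng(seed)
    fal, dp, labels = [], [], []
    for label, (f_lo, f_hi), (d_lo, d_hi) in RANGES:
        fal.append(rng.uniform(f_lo, f_hi, n_per_class))
        dp.append(rng.uniform(d_lo, d_hi, n_per_class))
        labels += [label] * n_per_class
    X = np.concatenate(fal).reshape(-1, 1)  # single feature: 2-FAL in ppm
    return X, np.concatenate(dp), np.array(labels)


X, y_dp, y_cls = make_dataset()

# 80/20 split with the fixed seed reported in the text.
X_train, X_test, dp_train, dp_test, cls_train, cls_test = train_test_split(
    X, y_dp, y_cls, test_size=0.2, random_state=42
)
```

With 250 samples per class, the full set has 1000 rows, and the split yields 800 training and 200 test samples.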

Training parameters and optimization

The machine learning models were implemented using Python’s scikit-learn library. Training parameters were optimized through systematic hyperparameter tuning and cross-validation to obtain robust and generalizable models for predicting the Degree of Polymerization (DP) and classifying insulation condition from 2-Furfuraldehyde (2-FAL) concentration. For regression models, Linear Regression served as a baseline with no tunable hyperparameters, relying on ordinary least squares to minimize the squared error between predicted and actual DP values. Polynomial Regression, set to a fixed degree of 2 to capture non-linear relationships without excessive complexity, used a pipeline combining polynomial feature transformation and linear regression, with no additional hyperparameters tuned to maintain simplicity. The Random Forest Regressor, designed to handle complex non-linear patterns, was configured with 100 decision trees. Its maximum tree depth was tuned across values of 5, 10, and None (unrestricted depth), selecting the value that maximized the R² score during 5-fold cross-validation; the minimum samples per leaf was set to 1 to allow detailed splits, and the number of features considered at each split was set to the square root of the total features (which, with 2-FAL as the only input feature, amounts to one feature). For classification models, Logistic Regression was trained with a multinomial loss function to handle the four insulation condition categories (Fresh, Lightly Aged, Moderately Aged, Worstly Aged), using the ‘lbfgs’ solver with a maximum of 1000 iterations to ensure convergence; the inverse regularization strength (C) was tuned over values of 0.1, 1, and 10 via grid search to balance model complexity and prevent overfitting. 
The Support Vector Machine (SVM) classifier, employing a radial basis function (RBF) kernel to capture non-linear decision boundaries, had its regularization parameter (C) tuned over 1, 10, and 100, and the kernel coefficient (gamma) tested across 0.01, 0.1, and ‘scale’ (computed as 1 / (number of features × variance of X)), with the combination yielding the highest accuracy selected through grid search. The Random Forest Classifier, also set to 100 trees, tuned its maximum depth across 5, 10, and None, with the minimum samples per leaf set to 1 and the number of features per split set to the square root of the total features, optimizing for accuracy during cross-validation.

Table 2 Hyperparameters tuned for machine learning models.

All models underwent 5-fold cross-validation, splitting the training data into five subsets, training on four folds, and validating on the fifth, repeating this process five times to compute average performance metrics (R² for regression, accuracy for classification), ensuring robustness against overfitting and data variability. A random seed of 42 was set for all models to ensure reproducibility of results, and the training process was conducted on the 80% training split of the dataset, reserving 20% for testing on unseen data. Table 2 summarizes the hyperparameters tuned for each model, and Table 3 details the cross-validation setup.
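The tuning procedure for the Random Forest Regressor can be sketched with scikit-learn's `GridSearchCV`, as below. This is a minimal illustration under assumptions: the training arrays are small placeholder data generated here only so that the snippet runs on its own (they are not the study's dataset), and the variable names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# PLACEHOLDER data standing in for the 800-sample training split (assumption):
# a noisy decreasing relation between 2-FAL (ppm) and DP.
rng = np.random.default_rng(42)
X_train = rng.uniform(0.0, 20.0, size=(200, 1))
dp_train = 1200.0 - 50.0 * X_train.ravel() + rng.normal(0.0, 20.0, 200)

search = GridSearchCV(
    estimator=RandomForestRegressor(
        n_estimators=100,       # 100 trees, as stated in the text
        min_samples_leaf=1,
        max_features="sqrt",    # one feature per split when there is one input
        random_state=42,
    ),
    param_grid={"max_depth": [5, 10, None]},  # depths reported as tuned
    scoring="r2",               # select the depth that maximizes R²
    cv=5,                       # 5-fold cross-validation
)
search.fit(X_train, dp_train)

best_depth = search.best_params_["max_depth"]
cv_r2 = search.best_score_      # mean R² across the five folds
```

The same pattern applies to the classifiers by swapping the estimator, the parameter grid (for example `C` and `gamma` for the SVM), and `scoring="accuracy"`.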

Table 3 Cross validation setup.

Results and discussion

This section presents the performance of the machine learning models developed to predict the Degree of Polymerization (DP) and classify the insulation condition of transformer paper based on 2-Furfuraldehyde (2-FAL) concentration. The regression models (Linear Regression, Polynomial Regression, and Random Forest Regressor) were evaluated for their ability to predict DP as a continuous variable, while the classification models (Logistic Regression, Support Vector Machine with RBF kernel, and Random Forest Classifier) were assessed for their accuracy in categorizing insulation conditions into Fresh, Lightly Aged, Moderately Aged, and Worstly Aged. Performance metrics are summarized in tables, followed by a discussion of the results, their implications for transformer maintenance, study limitations, and directions for future research.

Regression model performance

The regression models were evaluated using Mean Squared Error (MSE), Mean Absolute Error (MAE), and R² Score on the test set (20% of the dataset). Table 4 summarizes the performance metrics, with cross-validation R² scores included to assess model robustness.
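For reference, the three regression metrics can be computed directly from predictions as in the sketch below. The function name `regression_metrics` and the four DP values in the example are hypothetical and serve only to make the arithmetic concrete.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, and R² for paired true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = float(np.sum(residuals ** 2))                   # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))    # total sum of squares
    return {
        "mse": float(np.mean(residuals ** 2)),
        "mae": float(np.mean(np.abs(residuals))),
        "r2": 1.0 - ss_res / ss_tot,
    }


# Illustrative values only: every prediction is off by 50 DP units.
m = regression_metrics([1000, 600, 300, 200], [950, 650, 350, 150])
# m["mse"] is 2500.0 and m["mae"] is 50.0
```

Because MSE squares the residuals, it is expressed in squared DP units and penalizes large errors more heavily than MAE, which is why the two are reported together.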

Table 4 Performance metrics for regression models.

The Random Forest Regressor outperformed other models, achieving the lowest MSE (9876.12) and MAE (68.34) and the highest R² score (0.894), indicating that it explains 89.4% of the variance in DP values. Its cross-validation R² score (0.882 ± 0.019) confirms robust generalization across data subsets. Polynomial Regression, with a degree of 2, performed better than Linear Regression, with an MSE of 23145.67, MAE of 112.78, and R² of 0.752, suggesting that a quadratic model captures some non-linear relationships between 2-FAL and DP. Linear Regression, as expected, had the weakest performance (MSE: 45678.23, MAE: 178.45, R²: 0.512), reflecting its inability to model the non-linear degradation patterns in the data.

Classification model performance

The classification models were evaluated using Accuracy, Precision, Recall, and F1-Score (weighted averages to account for class balance) on the test set. Table 5 presents these metrics, along with cross-validation accuracy to evaluate consistency.

The Random Forest Classifier achieved the highest performance, with an accuracy of 0.925, precision of 0.924, recall of 0.925, and F1-score of 0.924, correctly classifying 92.5% of insulation conditions. Its cross-validation accuracy (0.918 ± 0.018) indicates strong consistency. The SVM with RBF kernel performed well (accuracy: 0.870, F1-score: 0.869), outperforming Logistic Regression (accuracy: 0.845, F1-score: 0.843), which struggled with non-linear class boundaries. The confusion matrix for the Random Forest Classifier (Table 6) revealed few misclassifications, primarily between the Lightly Aged and Moderately Aged categories, where 2-FAL ranges overlap slightly.

Table 5 Performance metrics for classification models.

The Random Forest Classifier’s performance was further analyzed using a confusion matrix to evaluate its ability to categorize insulation conditions accurately. Table 6 presents the confusion matrix for the test set (200 samples), showing the distribution of predicted versus actual insulation states. The dataset is balanced across the four classes (Fresh, Lightly Aged, Moderately Aged, Worstly Aged), with approximately 50 samples per class in the test set (200 total). The Random Forest Classifier’s predictions are compared against the true labels to construct the matrix.

Table 6 Confusion matrix for Random Forest Classifier.

The matrix shows that 185 of the 200 test samples were correctly classified. The diagonal values represent the correct predictions (e.g., 48 Fresh samples correctly classified as Fresh and 46 Lightly Aged samples as Lightly Aged); their total, 48 + 46 + 45 + 46 = 185, yields an accuracy of 185/200 = 0.925, consistent with the reported value. Most errors occur between adjacent categories, such as Lightly Aged and Moderately Aged, due to overlapping 2-FAL concentration ranges (0.1–1 ppm and 1–10 ppm). The high diagonal values (e.g., 48/50 for Fresh, 46/50 for Lightly Aged) demonstrate robust performance across all classes, affirming the model’s suitability for non-invasive insulation condition assessment.
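The accuracy arithmetic above can be reproduced from the diagonal counts alone, as in this short sketch. The per-class support of exactly 50 is an assumption (the text says "approximately 50"), and the order of the last two diagonal entries follows the class order used throughout the paper.

```python
# Diagonal (correctly classified) counts quoted in the text.
diagonal = {
    "Fresh": 48,
    "Lightly Aged": 46,
    "Moderately Aged": 45,
    "Worstly Aged": 46,
}
n_test = 200
per_class_support = 50  # ASSUMPTION: text states "approximately 50" per class

# Overall accuracy: sum of the diagonal divided by the test-set size.
accuracy = sum(diagonal.values()) / n_test

# Per-class recall: correct predictions divided by true samples in the class.
recall = {label: hits / per_class_support for label, hits in diagonal.items()}
```

Under the 50-per-class assumption, per-class recall ranges from 0.90 (Moderately Aged) to 0.96 (Fresh); precision cannot be derived without the off-diagonal entries of Table 6.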

Comparison with existing literature

To position this study within the broader context of transformer insulation diagnostics, Table 7 compares our work with recent literature on estimating the Degree of Polymerization (DP) using 2-Furfuraldehyde (2-FAL), carbon oxides (CO, CO2), and other parameters.

This study defines four insulation states—Fresh (DP 700–1200, 2-FAL 0–0.1 ppm), Lightly Aged (DP 450–700, 2-FAL 0.1–1 ppm), Moderately Aged (DP 250–450, 2-FAL 1–10 ppm), and Worstly Aged (DP < 250, 2-FAL > 10 ppm)—using Random Forest models with synthetic data based on IEEE C57.104-2019, achieving an R² of 0.894 for regression and accuracy of 0.925 for classification.

Table 7 Performance comparison with different existing work.

This work’s single-parameter (2-FAL) approach simplifies DP estimation compared to Li et al. (2019), which uses multiple parameters (2-FAL, CO, CO2, moisture) but achieves lower accuracy (0.85). Unlike Li et al. (2018), which relies on dispersion staining colors (R²: 0.87), our method avoids specialized techniques, enhancing practicality. Mandlik and Ramu (2014) focus on moisture’s qualitative impact without predictive metrics, limiting direct comparison. Leibfried et al. (2013) correlate furan with DP (correlation: 0.82) via postmortem analysis, lacking predictive modeling. Liu et al. (2021) model aging with furfural and oil/pressboard ratios (R²: 0.85), but our Random Forest approach outperforms (R²: 0.894) with less complexity. This study’s focus on 2-FAL ensures efficient, accurate DP estimation for transformer health monitoring.

Reliance on synthetic data based on IEEE C57.104-2019 introduces limitations, such as potential deviations from real-world variability in the relationship between 2-FAL and DP. These synthetic datasets, while aligned with standard aging trends, may not fully capture field-specific conditions such as temperature fluctuations or oil impurities. Validating our Random Forest models with actual transformer data is therefore essential for real-world applicability. Future work will prioritize collecting and integrating field data to enhance model robustness and accuracy, addressing these limitations.

Conclusion

This study developed and evaluated machine learning models to predict transformer insulation degradation using 2-Furfuraldehyde (2-FAL) concentration, demonstrating high accuracy with the Random Forest Regressor (R²: 0.894) for continuous DP prediction and the Random Forest Classifier (accuracy: 0.925) for classifying insulation conditions into Fresh, Lightly Aged, Moderately Aged, and Worstly Aged. These models enable non-invasive diagnostics, leveraging 2-FAL measurements from transformer oil to facilitate proactive maintenance, reduce operational costs, and extend transformer service life. Despite the encouraging performance in estimating DP values, some significant limitations should be acknowledged. First, the study relies on a synthetically generated dataset based on IEEE C57.104-2019, which limits real-world applicability. Second, the analysis is based on a single aging indicator, 2-FAL, whereas transformer insulation degradation in practice is also influenced by chemical, thermal, and environmental factors that are not captured by a one-parameter predictive approach. Third, the synthetic data do not explicitly incorporate measurement noise, laboratory uncertainties, or operational fluctuations that typically affect dissolved-furan readings, and this may simplify the learning environment for the models. Lastly, uniform sampling across diagnostic ranges may not reflect the natural distribution of aging states within diverse transformer populations, introducing the possibility of dataset bias. These limitations outline the boundaries of the present work and indicate important directions for future research involving multi-parameter models, noise-aware data generation, and validation using field-measured transformer samples.