Introduction

Power Line Communication (PLC) is a communication technology that uses existing power cables for data transmission. PLC is therefore an attractive and cost-effective way to exchange data with any device connected to a power outlet, such as sensors and actuators, because it reuses the power line and avoids deploying a separate communication infrastructure1,2,3. Based on data rate, PLC is divided into two categories: narrowband PLC and broadband PLC4,5. Narrowband PLC is widely used in the smart grid, by electricity utilities, and in home networks for smart home applications. PLC is also used in in-vehicle and vehicle-to-infrastructure systems and in next-generation battery management systems6,7. Broadband PLC, on the other hand, is used for multimedia communications. Such applications are often characterized by many connected nodes, whose number keeps growing with the expansion of the Internet of Things (IoT).

The shared-medium nature of PLC raises many challenges for the communication process, such as variable channel characteristics. One issue concerns impedance matching at both the transmitter (TX) and the receiver (RX) in the PLC front end, since the matching affects the self-interference and the signal-to-self-interference-plus-noise ratio (SSINR). Typical PLC modems use a low-impedance TX path and a higher-impedance RX path in the analogue front end for efficient, low-harmonic-distortion operation5,7,8,9. Considerable effort has been devoted to impedance matching for PLC10; however, matching the power line impedance remains challenging because of its time-varying load conditions.

The performance of contemporary PLC networks deteriorates as the number of connected nodes grows. Similarly, coexistence with neighbouring DSL networks degrades link quality. The European Telecommunications Standards Institute (ETSI) therefore recommends a dynamic spectral adaptation approach11, in which broadband PLC modems estimate the DSL-to-PLC channel interference and adapt the PLC transmit power spectral density accordingly. Considerable effort has also been made at the physical layer to deal with the time-varying behaviour of loads in electric power systems: the dynamics and diversity of loads produce time- and frequency-varying behaviour, and the signal attenuates as frequency and/or distance increase. Different impedance matching techniques are reviewed in10. Furthermore, high-power impulsive noise, impedance mismatch, the widespread use of unshielded power cables, and coupling losses all impact link quality1,4,6,11,12. Impulsive noise generated by connecting and disconnecting loads, equipment, and alternating current/direct current (AC/DC) converters, together with electromagnetic interference caused by unshielded power lines and coupling problems, degrades the communication medium dynamically over time.

Research in the PLC domain is still ongoing to address these issues. Alliances such as PRIME and G3 are developing advanced tools, techniques, methods, and approaches, including different implementations of the MAC and PHY layers that deal with these challenges5,9,10,11. Field studies have also discussed these issues9,13. Another approach is to use a second communication medium, such as RF, in regions where PLC is unstable. For example, combining PLC (G3-based) with an RF technology such as 6LoWPAN14 or LoRa has shown better performance13,15,16. However, adding another technology undermines the main advantage of PLC, namely using the existing infrastructure without added cost. Further effort has been directed at improving communication performance with artificial intelligence (AI), which is used to assess link quality and the quality of the communication medium. AI has mainly been applied to RF-based technologies such as 4G/5G, optical networks, and smart-city data analysis17,18,19,20,21. This work therefore focuses on using AI to predict link quality in a PLC-based network and to determine the optimum time slot for communicating with a node over the PLC network. The data used were collected from a field deployment configured as a PRIME-based PLC network.

This work uses the PRIME standard to build a PLC network in the field. The nodes are implemented with the PL360 PLC transceiver from Microchip Technology. The network consists of 500 PLC nodes, with the data concentrator unit (DCU) located at the transformer site and a PLC sniffer placed one node after the DCU. The collected dataset consists of 1000 instances indicating the times at which a PLC node had optimum readings of the Signal-to-Noise Ratio (SNR), Received Signal Strength Indicator (RSSI), and Carrier-to-Interference-plus-Noise Ratio (CINR). The dataset was used to train six models representing statistical, vector-based, regression, decision, and predictive algorithms. The trained statistical algorithm is adaptive boosting; the vector-based algorithms are the Support Vector Machine (SVM) with a linear kernel and the SVM with a non-linear kernel; the decision algorithm is the random forest; the regression algorithm is the Multi-Layer Perceptron; and the predictive algorithm is K-Nearest Neighbours.

In the rest of the paper, AI in communications is discussed in Sect. 2, and the algorithms and the dataset details are discussed in Sect. 3. Then, the behaviours of the trained models are shown in Sect. 4, with a discussion of the results in Sect. 5, and the paper is concluded in Sect. 6.

AI in communications

AI is the field that enables computers to perform tasks that previously only humans could do. It has developed rapidly in recent years and is used in many applications. For example, AI is used to predict future events from historical data, which saves time22.

PLC has recently seen increasing use. It transmits data over the Power Line Network (PLN). The difficulty is that the PLN was not designed for data transmission, so PLC has to cope with severe noise23,24.

The work in25 has used machine learning to cluster the multi-conductor noise in PLC and to determine whether automatic clustering is helpful for this problem. The authors used the MIMO NB noise database and preprocessed it to create a feature library, a table of time segments from 5 to 500 \(\mu s\) with two types of features: the first type was extracted from the signal itself, and the other captured the relation between the two multi-conductor signal traces. The features were evaluated to determine which ones are worth keeping, using principal component analysis (PCA) and box plots. PCA reduces the dataset dimensions while keeping most of the information. A box plot displays the data on a standardized graph based on six quantities: median, 25th percentile, 75th percentile, outliers, minimum, and maximum. The PCA shows that features 5 (samples skewness), 7 (samples Pearson correlation), and 9 (distance correlation) are the most informative, and the box plots also show that features 5 and 7 have a visible data separation25,26. Three clustering methods were used: hierarchical clustering, the self-organizing map (SOM), and clustering using representatives (CURE). Hierarchical clustering starts by assigning each point to its own cluster, calculates the distances between clusters, merges the two nearest clusters into one, and repeats the process until all clusters are combined into a single cluster, forming a dendrogram (a tree of clusters)27. In CURE, a subset of the data representing C clusters is selected; for each cluster, several well-scattered points are chosen and shrunk by 20% towards the cluster centroid, then the algorithm merges every pair of clusters whose representative points are close, and finally assigns all data points to clusters28. Finally, a SOM is a network of mapped units in which each unit corresponds to a cluster; the larger the number of units, the finer the separation of the data29. The clusters were labelled according to their probability density functions (PDFs), which led to 35% of the data being normal, 23% Middleton Class A, 27% Alpha Stable, 13% Generalized Extreme Value, and 2% of unknown classes. It is worth mentioning that more than five conventional noise classes were needed to represent the nature of the noise, especially in a noisy network such as a PLC environment29.
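As a hedged illustration of the workflow described in25 (not the authors' implementation), the following Python sketch applies PCA for feature reduction and agglomerative hierarchical clustering to a placeholder feature library; the array shapes, parameter values, and cluster count are assumptions.

```python
# Illustrative sketch only: PCA followed by hierarchical clustering on a
# placeholder feature library; shapes and parameters are assumptions.
import numpy as np
from sklearn.decomposition import PCA
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 9))     # e.g. 200 time segments x 9 candidate features

# PCA reduces the dataset dimensions while keeping most of the information
pca = PCA(n_components=3)
reduced = pca.fit_transform(features)
print(pca.explained_variance_ratio_)     # how informative each component is

# Hierarchical clustering: repeatedly merge the two nearest clusters,
# building a dendrogram, then cut the tree into a chosen number of clusters
tree = linkage(reduced, method="ward")
labels = fcluster(tree, t=5, criterion="maxclust")
```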

The noise affects the PLC node and thus the reliability of data transfer. AI can be used to detect whether a node is usable at a specific time. This can be done by training an AI model on the node's past readings, which allows prediction of the time intervals in which the readings of a PLC node are not optimal. This prediction enables the early selection of other nodes for transmission instead of testing each node to determine which nodes are functioning30.

Methodology

In this section, the trained machine learning algorithms, namely the Multi-Layer Perceptron, K-Nearest Neighbour, Support Vector Machine, Random Forest, and Adaptive Boosting, are discussed along with the key information about the collected data.

Machine learning algorithms

Multi-layer perceptron (MLP)

The Multi-Layer Perceptron (MLP) is a neural network trained with a supervised learning technique. The MLP used here consists of six layers: the input layer, four hidden layers, and the output layer, as depicted in Fig. 1. All non-input nodes are neurons that use a nonlinear activation function.
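A minimal scikit-learn sketch of such an MLP is shown below; the widths of the four hidden layers are assumptions for illustration, since only the number of layers is specified.

```python
# Sketch of the six-layer MLP (input, four hidden, output); hidden-layer widths are assumed.
from sklearn.neural_network import MLPClassifier

mlp = MLPClassifier(
    hidden_layer_sizes=(64, 32, 16, 8),  # four hidden layers
    activation="relu",                   # nonlinear activation in all non-input neurons
    max_iter=1000,
    random_state=42,
)
# mlp.fit(X_train, y_train)
# y_pred = mlp.predict(X_test)
```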

Figure 1

Relation between different layers of the MLP.

K-nearest neighbour (KNN)

K-Nearest Neighbour (KNN) is an algorithm that predicts the class of an input based on a vote among the training instances most similar to it. It takes the majority class of the K nearest neighbours without an explicit learning process. As shown in Fig. 2, the green circle next to the question mark is the unlabelled input. The two red triangles and the blue square lie next to the input circle because their features are similar to the input's features. In this example, K is chosen to be three, so the black circle contains the three instances nearest to the input. Once the voting participants are known, the majority class becomes the class of the input, so the predicted class of the input is the red triangle class.
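The voting rule can be sketched as follows; this is an illustrative implementation that assumes Euclidean distance as the similarity measure, which is not specified in the text.

```python
# Illustrative KNN voting rule (Euclidean distance assumed as the similarity measure).
import numpy as np
from collections import Counter

def knn_predict(x, X_train, y_train, k=3):
    """Return the majority class among the k training instances nearest to x."""
    distances = np.linalg.norm(X_train - x, axis=1)  # distance from x to every training point
    nearest = np.argsort(distances)[:k]              # indices of the k nearest neighbours
    return Counter(y_train[nearest]).most_common(1)[0][0]
```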

Figure 2

K-Nearest Neighbors, which shows the selection of the most similar k points to the input point.

where the blue square is class 1, the red triangle is class 0, and the green circle is the input.

Support vector machine (SVM)

SVM separates the data points by mapping them, through a kernel, into another space in which they are readily separable. For example, as shown in Fig. 3, there are two features, x1 and x2, and two classes, black and white dots. To identify which combination of feature values corresponds to each class, the feature values of each instance are plotted, and using a non-linear kernel in part (a) and a linear kernel in part (b), the regions of the plot belonging to each class can be identified. The hyperplane is the plane that separates the classes in the n-dimensional space; the farther it lies from the data points, the more accurate the classification31,32,33.
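The two cases in Fig. 3 can be reproduced conceptually with scikit-learn as sketched below; choosing the RBF kernel as the non-linear kernel is an assumption for illustration.

```python
# Sketch of linear vs. non-linear SVM kernels (RBF chosen here as the non-linear kernel).
from sklearn.svm import SVC

svm_linear = SVC(kernel="linear")  # separates the classes with a flat hyperplane, as in Fig. 3(b)
svm_nonlinear = SVC(kernel="rbf")  # maps the data so that a curved boundary becomes possible, as in Fig. 3(a)
# svm_linear.fit(X_train, y_train)
# svm_nonlinear.fit(X_train, y_train)
```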

Figure 3

Diagram for the Support Vector Machine which shows the classification using the non-linear and linear kernels34.

Random forest

The Random Forest algorithm is an ensemble of decision trees, as shown in Fig. 4. Each decision tree is trained on a subset of the dataset, and these portions are equally distributed. When an input is given to the random forest, each tree, based on its training, produces a classification for this input, and the class with the majority of the predictions becomes the predicted class of the input35,36.
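A minimal random forest sketch is given below; the number of trees follows the value reported later in the Results section (34), and the remaining settings are assumptions.

```python
# Sketch of a random forest: each tree sees a sampled subset of the data and
# the forest returns the majority vote of the individual tree predictions.
from sklearn.ensemble import RandomForestClassifier

forest = RandomForestClassifier(
    n_estimators=34,   # number of trees (the value found best in the Results section)
    bootstrap=True,    # each tree is trained on a sampled portion of the dataset
    random_state=42,
)
# forest.fit(X_train, y_train)
# y_pred = forest.predict(X_test)   # majority vote across the trees
```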

Figure 4

Random forest35,37.

Adaptive boosting

Adaptive Boosting (AdaBoost) is an ensemble learning algorithm that iteratively adjusts the weights associated with a weak classifier to enhance performance and build a more robust classifier. As shown in Fig. 5, the algorithm starts by fitting the model on the dataset and evaluating the results, then adjusts some weights of the weak classifier and tests the model again; as long as the classifier remains weak, its weights keep being adjusted until it becomes a more robust classifier.
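The procedure can be sketched with scikit-learn as follows; the number of estimators matches the value reported in the Results section (12), while the default decision-stump weak learner is an assumption.

```python
# Sketch of AdaBoost: each boosting round re-weights misclassified samples so the
# next weak learner (a decision stump by default) focuses on them.
from sklearn.ensemble import AdaBoostClassifier

ada = AdaBoostClassifier(n_estimators=12, random_state=42)
# ada.fit(X_train, y_train)
# y_pred = ada.predict(X_test)   # weighted vote of the boosted weak classifiers
```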

Figure 5

Ada Boost38.

Figure 6

Nodes activity over time.

Figure 6 shows the number of registered nodes at each time instance. As parts (a) and (b) show, as the variation in the number of registered nodes increases, the variation in the number of switch nodes at a time instance increases as well.

Dataset

In this work, data are collected from a test field that consists of 400 PLC modems. The PLC nodes are based on the PL360 chipset from Microchip, and the communication protocol is the PRIME standard. The data are collected with a PLC sniffer from Microchip, placed one node after the Data Concentrator Unit (DCU). The data were first analysed and filtered so that the two classes are evenly balanced. Then, the parameters representing the channel quality were chosen based on the literature5,16.

Figure 7

Histogram distribution of the most dominant dataset parameters.

The dataset consists of 1000 readings of the most dominant parameters, namely the Signal-to-Noise Ratio (SNR), Received Signal Strength Indicator (RSSI), and Carrier-to-Interference-plus-Noise Ratio (CINR); their distributions are shown in Fig. 7. Table 1 shows a sample of the dataset. Half of the readings carry label 0, indicating that the channel is not working for those values, while readings labelled 1 indicate that the communication channel is working and suitable for data exchange. The readings made it possible to determine at which timestamps the node was working, and hence the probability that a node is working at a given time. The dataset was divided into 90% for training and 10% for testing. Samples of the data for a specific node are displayed in the following figures. For example, Fig. 8 shows the SNR values over time, while Fig. 9 depicts the RSSI of the channel for a specific node in the network. The diagrams show that the parameters change randomly over the sniffing period. Moreover, the CINR is illustrated in Fig. 10. The captured data show that the signal quality varies over time.
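A hedged sketch of the dataset handling is shown below: the file name and column names are assumptions, while the three features, the binary label, and the 90/10 split follow the description above.

```python
# Sketch of loading the link-quality dataset and applying the 90/10 split
# (file and column names are assumptions).
import pandas as pd
from sklearn.model_selection import train_test_split

data = pd.read_csv("plc_link_quality.csv")   # 1000 sniffer readings
X = data[["SNR", "RSSI", "CINR"]]            # dominant channel-quality parameters
y = data["label"]                            # 1 = channel usable, 0 = channel not working

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=42)  # 90% training, 10% testing
```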

Table 1 Dataset Samples.
Figure 8

SNR values over time.

Figure 9

RSSI values over time.

Figure 10

CINR values over time.

Data analysis

The data were analysed to identify the relations between the features and how these parameters can affect node performance. For simplicity, a subset of the data was taken to visualize all features over time, producing Fig. 11. This plot clearly shows the high correlation between the features; for example, when the SNR increases at a specific time, the Bersoft decreases, so a strong negative correlation appears between these two features.

Figure 11

All quality features over time.

Another way to examine the correlation is a correlation matrix based on the Pearson correlation coefficient, shown in Fig. 12. As shown, a very high correlation appears between all quality features. In addition, the up & down column is highly correlated with Pdutype and level. This can be explained by the fact that when a node tends to be at a high level, lower quality appears, resulting in degraded performance.
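The Pearson correlation matrix in Fig. 12 can be reproduced as sketched below; the column names are assumptions matching the feature names used in the text, and the file is the hypothetical one from the dataset sketch.

```python
# Sketch of the Pearson correlation matrix of the quality features (column names assumed).
import pandas as pd

data = pd.read_csv("plc_link_quality.csv")   # same hypothetical file as in the dataset sketch
quality = data[["SNR", "RSSI", "CINR", "Bersoft", "Bersoft max", "Pdutype", "level", "up & down"]]
corr = quality.corr(method="pearson")        # pairwise Pearson correlation coefficients
print(corr.round(2))
```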

Figure 12

Correlation matrix of all quality features.

A histogram of all features is shown in Fig. 13. It shows that the most abundant SNR value is 4, indicating bad quality in almost 75% of the data. Moreover, Bersoft and Bersoft max show the same distribution, which confirms their correlation of 1 in the correlation matrix.

Figure 13

A histogram of all quality features.

Results

In this section, six AI models are applied to the collected data in order to predict the channel behaviour. The results of these models and their impact on network performance are discussed, and a comparison between the results is conducted. Four metrics are used to evaluate the models: accuracy, F1-score, precision, and recall. Furthermore, a confusion matrix has been plotted for each model. The confusion matrix is a two-dimensional matrix whose rows refer to the true labels and whose columns refer to the predicted labels; it shows how many instances are predicted for each class, indicating the model performance39.

The metrics’ equations are given in Eq. (1). The accuracy, evaluated using (1a), is the ratio of all correct predictions, i.e. the truly predicted positive class (TP) and the truly predicted negative class (TN), to all predictions, i.e. TP, TN, the falsely predicted positive class (FP), and the falsely predicted negative class (FN). As (1b) shows, the precision is the fraction of truly predicted positives (TP) among all predicted positives, whether truly (TP) or falsely (FP) predicted. The recall is the fraction of truly predicted positives among all positive instances in the testing dataset, whether truly predicted (TP) or falsely predicted as negative (FN), as shown in (1c). The F1-score is the ratio of twice the product of precision and recall to their sum, as shown in (1d)40. Figure 14 shows the confusion matrices of the proposed models. The confusion matrix compares the predictions of each model with the actual classes: the rows represent the actual class, the columns represent the predicted class, and the diagonal shows the correctly classified instances.

$$Accuracy = \dfrac{TP+TN}{TP+FP+TN+FN}$$
(1a)
$$Precision = \dfrac{TP}{TP+FP}$$
(1b)
$$Recall = \dfrac{TP}{TP+FN}$$
(1c)
$$F1\text{-}score = 2\,\dfrac{precision \cdot recall}{precision + recall}$$
(1d)
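Assuming a fitted model and its predictions y_pred on the held-out test set, these metrics and the confusion matrix can be computed with scikit-learn as sketched below; the macro averaging of the per-class scores is an assumption about how the values in this section were obtained.

```python
# Sketch of evaluating a fitted model with the metrics of Eq. (1); macro averaging is assumed.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

acc = accuracy_score(y_test, y_pred)                     # (TP + TN) / (TP + FP + TN + FN)
prec = precision_score(y_test, y_pred, average="macro")  # TP / (TP + FP)
rec = recall_score(y_test, y_pred, average="macro")      # TP / (TP + FN)
f1 = f1_score(y_test, y_pred, average="macro")           # 2 * precision * recall / (precision + recall)
cm = confusion_matrix(y_test, y_pred)                    # rows: true labels, columns: predicted labels
```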

Multi-layer perceptron (MLP)

MLP has been tested on 100 instances and achieved an accuracy of 84%: as Fig. 14 - a shows, 47 of 52 instances were correctly classified as class ’1’, and 37 of 48 were correctly classified as class ’0’. The model achieved a precision of 0.8456, a recall of 0.8373, and an F1-score of 0.8384.

K-nearest neighbour (KNN)

The KNN prediction accuracy varies with K, as explained in Sect. 3. Therefore, the model has been tested on the testing set with different values of K to determine the value with the best results; the best results were obtained at K = 15. Fig. 14 - b shows the KNN model’s predictions when tested on 100 instances. The model correctly predicts 35 out of 48 instances of class ’0’ and 32 out of 52 of class ’1’, giving an accuracy of 67%, an F1-score of 0.6697, a precision of 0.6737, and a recall of 0.6723.
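The K sweep can be sketched as follows; the search range is an assumption, while K = 15 is the value found best in this work.

```python
# Sketch of sweeping K to find the best-performing KNN model (search range assumed).
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

best_k, best_acc = None, 0.0
for k in range(1, 31):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    acc = accuracy_score(y_test, knn.predict(X_test))
    if acc > best_acc:
        best_k, best_acc = k, acc
print(best_k, best_acc)   # K = 15 gave the best results in this work
```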

SVM-linear kernel

The model correctly predicted 49 out of 52 instances of class ’1’ and 36 out of 48 of class ’0’, as shown in Fig. 14 - c, with an accuracy of 85%, an F1-score of 0.8465, a precision of 0.8698, and a recall of 0.8454.

SVM-non-linear kernel

The model’s accuracy is 86%: it correctly classified 50 out of 52 instances of class ’1’ and 36 out of 48 of class ’0’, as shown in Fig. 14 - d, with an F1-score of 0.8572, a precision of 0.8769, and a recall of 0.8558.

Random forest

The Random Forest model has been tested with the number of estimators ranging from 1 to 100; the number of estimators is the number of trees in the random forest. This was done to find the number of estimators that gives the best results, which turned out to be 34. Fig. 14 - e shows that the model correctly predicted 50 out of 52 instances of class 1 and 35 out of 48 instances of class 0. The model has an accuracy of 85%, an F1-score of 0.8465, a precision of 0.8698, and a recall of 0.8454.

ADA boost

The model has been tested with several numbers of estimators; from 12 estimators onwards, the results are the best. Fig. 14 - f shows that the model correctly predicted 52 out of 52 instances of class 1 and 35 out of 48 instances of class 0. The model has an accuracy of 87%, an F1-score of 0.8661, a precision of 0.9, and a recall of 0.8646. As shown in Table 2, AdaBoost was the best algorithm with respect to accuracy, F1-score, precision, and recall for predicting the times at which a PLC node is optimum, reaching an accuracy of 87%, while the least accurate model was KNN with 67% accuracy.

Figure 14

Confusion matrices for trained models.

Discussion

Recently, PLC has been used in different IoT applications. However, the PLC environment is vulnerable to noise sources, which negatively impact network quality. Considerable effort has been made over the past decade to improve network and link quality10,12. For example, some researchers targeted different MAC and PHY layer implementations to improve network reliability, and further effort has been made at the level of electronic circuit implementations10. Despite this previous work, PLC-based networks still lack stability because of conditions that vary over time. Hence, this work uses AI to predict network stability and link quality. Six AI techniques have been used to predict the network quality and the optimum time slot for communication. Table 2 compares the different techniques. AdaBoost gave an accuracy of 87% in hitting the optimum communication slot, while KNN gave the worst accuracy of 67%. However, KNN has the fastest training time (0.0039 s) using 21 threads, whereas AdaBoost takes 0.02 s using 25 threads during training. This means that KNN requires fewer CPU resources than AdaBoost during the training process, which supports training in resource-limited environments. Furthermore, the SVM with a non-linear kernel gave an accuracy of 86% with a training time of 0.043 s using 23 threads, while the SVM with a linear kernel achieved an accuracy of 85% with a training time of 0.024 s. The significant advantage of selecting the optimum time slot for communication is increasing the efficiency of the communication link, which in turn increases the number of nodes the same DCU device can serve. Moreover, increasing the link efficiency minimizes the number of attempts needed to obtain a reading from a PLC node. Hence, the system can increase the number of nodes served by the same DCU.

Table 2 Results comparison.


Conclusion

Predicting the availability of a PLC node ahead of time enhances network performance. MLP, KNN, SVM with linear and non-linear kernels, Random Forest, and AdaBoost algorithms were trained and tested to predict whether a PLC node is available at a particular time. They represent statistical, vector-based, regression, decision, and predictive algorithms. Signal-to-Noise Ratio (SNR), Received Signal Strength Indicator (RSSI), and Carrier-to-Interference-plus-Noise Ratio (CINR) readings were used to determine whether a PLC node is optimum to use at a specific time, based on a dataset of 1000 instances of which 90% was used for training. The best results were achieved by the AdaBoost algorithm, with an accuracy of 87%, an F1-score of 0.8661, a precision of 0.9, and a recall of 0.8646, exceeding the other algorithms.