Introduction

Artificial intelligence (AI), and especially artificial neural networks (ANNs), have been used to solve various problems in smart grids, e.g., control-based applications1,2, protection-based applications3,4,5, detection and mitigation of cyber-attacks6,7,8,9,10,11,12,13, a sensorless voltage estimation-based strategy for calculating the total harmonic distortion14, wind and solar power prediction15, cybersecurity monitoring of cyber-physical power electronic converters16, etc. One of the issues in smart grids that can be solved by ANNs is load prediction. Electrical loads are subject to uncertainty, and load forecasting is one of the important challenges and duties for industry, because it plays an important role in power systems and can affect their operation17,18,19. For reliable operation of a power system, short-term prediction of loads is a necessary task20. Several previous studies have supported the application of load forecasting in power systems. For example, phase space reconstruction and stacking ensemble learning have been implemented for load prediction in21. A strategy using transfer learning and deep residual neural networks for residential load prediction has been presented in22. An approach for the short-term prediction of electrical loads using temporal feature selection and based on long short-term memory (LSTM) has been introduced by23. As another example, an interpretable memristive LSTM-based strategy has been implemented by24 for probabilistic prediction of residential loads. In addition, a federated learning-based methodology based on LSTM has been deployed in25 for short-term prediction of residential loads. Furthermore, an approach using an improved temporal convolutional network and a densely connected convolutional network has been used for short-term prediction of loads in26.

Previous works have shown effective results. However, they are mainly based on classical approaches, e.g., classical AI. The advantages of quantum computing can motivate replacing classical computing with quantum-based strategies that exploit quantum mechanics, e.g., speeding up calculations using superposition and entanglement, and communication using entanglement27,28,29,30. Therefore, it is necessary to open a window for the deployment of quantum computing to solve the challenges of smart grids, e.g., load forecasting. Some attempts have already been made to use quantum computing in power and energy applications. For example,31,32,33,34 have studied quantum computing, communication, and cybersecurity of microgrids. Also, a quantum computing-based methodology has been developed by35 for unit commitment. As another example,36 has deployed quantum computing to propose quantum power flow. Also,37,38 have tried to use quantum computing for the electromagnetic transients program. Besides, quantum computing has been used to provide a methodology for the stability assessment of power systems in39. As further examples,40 and41 have suggested hybrid strategies for load forecasting based on support vector regression using quantum tabu search and a chaotic quantum genetic algorithm, respectively. For further investigation of power applications and quantum computing, see28,42,43,44,45.

Although the above-mentioned studies have implemented quantum computing to solve different issues in power systems (e.g., cybersecurity, unit commitment, and power flow), there is still a gap in addressing forecasting-based challenges (e.g., load prediction) in power systems. In addition, the few mentioned works related to load prediction have applied quantum computing only to the optimization part, i.e., finding the optimized values of the parameters of the model that is used for the forecasting. In other words, the mentioned works did not use quantum computing to model a quantum layer as a part of the prediction model. In addition, the previously mentioned studies did not deploy quantum computing for ANN-based applications, which are a powerful tool for solving prediction-based challenges.

Therefore, this paper tries to fill this gap by deploying quantum artificial intelligence for short-term prediction of loads. This paper uses a hybrid quantum/classical strategy, i.e., a hybrid quantum/classical ANN (referred to as Q/C-ANN). The number of available quantum computers is still very limited. In addition, the number of accessible qubits is very limited, and it is not possible to implement a fully quantum computing-based strategy for a large-scale system. Furthermore, increasing the number of quantum gates and qubits can increase the sensitivity of the quantum circuit to environmental noise (e.g., thermal and magnetic noise). Therefore, at the moment, the deployment of a fully quantum circuit cannot guarantee the reliability of the system. So, due to the current limitations of fully quantum computing-based applications, this paper implements a hybrid quantum/classical approach as a first step toward a quantum computing-based strategy for a prediction-based challenge (i.e., load forecasting) in a power system.

In the rest of this paper, Section II presents the basics of quantum computing. Section III discusses Q/C-ANNs. Section IV introduces the proposed strategy to use Q/C-ANNs for load forecasting. Section V provides further mathematical discussion (i.e., updating the states of the qubits, quantum measurement and calculation of the output of the quantum layer, the encoding technique to map classical data into quantum states, and data pre-processing). The proposed strategy is examined on two different residential loads in Section VI, which also shows the results. In Section VII, other classical AI-based methods are examined on the same dataset. Section VIII discusses this study. Finally, Section IX and Section X conclude the paper and describe the suggested future works.

Introduction to quantum computing

Currently, there are various types of computation-based strategies, e.g., swarm intelligence, AI, and quantum computing. However, the physical concepts behind them differ. For example, classical intelligence-based approaches use classical bits as the basis of the computation. Among the mentioned computing strategies, quantum computing is the most distinct, because it uses qubits as the basis of the computation and for carrying information. For clarification, a bit can be 0 or 1, whereas the state of a qubit can be as follows46:

$$\begin{aligned} |\Gamma \rangle =a_1 | 0 \rangle + a_2 | 1 \rangle , \end{aligned}$$
(1)

where, \(a_1\) and \(a_2\) are complex numbers, which should satisfy the following equality constraint46:

$$\begin{aligned} \left| a_1 \right| ^2 + \left| a_2 \right| ^2 =1. \end{aligned}$$
(2)

When the state of a qubit is measured, the outcome is either \(| 0 \rangle\) or \(| 1 \rangle\), with probabilities \(\left| a_1 \right| ^2\) and \(\left| a_2 \right| ^2\), respectively46. Also, (1) can be written as follows46:

$$\begin{aligned} |\Gamma \rangle =cos\begin{pmatrix} \frac{\sigma }{2} \end{pmatrix} | 0 \rangle + e^{i\lambda } sin\begin{pmatrix} \frac{\sigma }{2} \end{pmatrix} | 1 \rangle . \end{aligned}$$
(3)

Where, \(\sigma\) and \(\lambda\) are real numbers (\(\sigma , \lambda \in {\mathbb {R}}\)). For a visual representation of the state of a qubit, the Bloch sphere can be used46. Fig. 1 shows the state of a qubit on the Bloch sphere.

Fig. 1

The Bloch sphere representation of (3); for more details, see46.
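As a minimal numerical sketch of (1)-(3), the following Python snippet builds the amplitude vector of a qubit from the two angles and evaluates the measurement probabilities. The function names `qubit_state` and `measurement_probabilities`, as well as the chosen angle values, are illustrative assumptions and are not part of the original study.

```python
import numpy as np

def qubit_state(sigma: float, lam: float) -> np.ndarray:
    """Amplitudes [a_1, a_2] of the state in (3) for real angles sigma and lambda."""
    a1 = np.cos(sigma / 2)
    a2 = np.exp(1j * lam) * np.sin(sigma / 2)
    return np.array([a1, a2], dtype=complex)

def measurement_probabilities(state: np.ndarray) -> np.ndarray:
    """Probabilities |a_1|^2 and |a_2|^2 of measuring |0> and |1>."""
    return np.abs(state) ** 2

# Illustrative angles (assumed values, chosen only for the example).
state = qubit_state(sigma=np.pi / 3, lam=np.pi / 4)
probs = measurement_probabilities(state)
print(probs, probs.sum())
```

For any real angles, the two probabilities sum to one, which is the constraint in (2); the phase \(\lambda\) does not change the probabilities.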

Typically, as shown in Fig. 2, a quantum circuit contains three main parts. The first part includes the initialized qubit(s), which are the basis of the calculations. The second part contains quantum operators, including gates, which form the main part of the algorithm. The last part consists of the measurement(s) of the final state of the qubit(s).

Fig. 2

The general architecture of a quantum circuit to be used for quantum computing.

The state of a qubit can be represented using a vector. Therefore, \(|\Gamma \rangle\) can be represented as follows46:

$$\begin{aligned} |\Gamma \rangle =\begin{bmatrix} a_1\\ a_2 \end{bmatrix}. \end{aligned}$$
(4)

Also, a quantum operator has a matrix representation, and using linear algebra, the final state of a single-qubit or multi-qubit system can be obtained. For clarification, the final state of a single-qubit circuit (based on Fig. 3(a)) can be obtained as follows:

$$\begin{aligned} |\Gamma _{out}\rangle = U|\Gamma _{in}\rangle . \end{aligned}$$
(5)

In addition, for a single-qubit system with a series of operators (\(U_{i,1}\), \(U_{i,2}\),...,\(U_{i,i_k}\)) applied in that order, the equivalent operator G (based on Fig. 3(b)) of the system can be calculated as follows:

$$\begin{aligned} G= U_{i,i_k}\times U_{i,i_{k-1}}\times ... \times U_{i,2} \times U_{i,1} {.} \end{aligned}$$
(6)

Further, for the case of a multi qubit circuit with \(j_k\) qubits and one operator for each qubit, the equivalent operator (F) of the system can be calculated (based on Fig. 3(c)) as follows:

$$\begin{aligned} F= U_{1,j}\otimes U_{2,j}\otimes ... \otimes U_{j_k-1,j} \otimes U_{j_k,j} {.} \end{aligned}$$
(7)
Fig. 3

The representation of: (a) a single qubit circuit with one operator, (b) a single qubit circuit with more than one operator, and (c) a multi qubit circuit with one operator for each qubit.

To calculate the equivalent operator of a multi-qubit system, the equivalent operator of each qubit is first calculated based on (6), so the system is converted to a multi-qubit circuit with one equivalent operator for each qubit. Then, the equivalent operator of the whole system can be obtained using (7). It is important to note that, for quantum circuits including entangled qubits, the calculations can be more complex. The next part shows how a quantum circuit can be used in an AI-based strategy. Therefore, the structure of a Q/C-ANN, including quantum operators and classical parts, will be discussed.
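The two-step procedure based on (6) and (7) can be sketched numerically as follows. This is only an illustration under stated assumptions: the helper names `series_operator` and `parallel_operator` are hypothetical, and the two-qubit circuit (a Hadamard gate followed by a Pauli-X gate on the first qubit, and a Pauli-X gate on the second qubit) is an arbitrary example, not a circuit used in this study.

```python
from functools import reduce
import numpy as np

# Standard single-qubit gates used for the example.
X = np.array([[0, 1], [1, 0]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def series_operator(ops):
    """Equation (6): ops are listed in application order [U_1, U_2, ..., U_k],
    so the equivalent operator is G = U_k x ... x U_2 x U_1 (matrix products)."""
    return reduce(lambda acc, u: u @ acc, ops, np.eye(2, dtype=complex))

def parallel_operator(ops):
    """Equation (7): Kronecker (tensor) product of one operator per qubit,
    with the operator of the first qubit as the leftmost factor."""
    return reduce(np.kron, ops)

# Step 1: one equivalent operator per qubit.
G1 = series_operator([H, X])   # H is applied first, then X
G2 = series_operator([X])

# Step 2: equivalent operator of the whole two-qubit circuit.
F = parallel_operator([G1, G2])

# Final state when both qubits start in |0>.
ket00 = np.array([1, 0, 0, 0], dtype=complex)
out = F @ ket00
print(np.round(out, 3))
```

Note that the order of the factors matters in (6): the operator applied first appears as the rightmost factor of the product.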

Basics of Q/C-ANNs

In this part, the architecture of a quantum computing-based AI is discussed. For this purpose, hybrid quantum/classical ANNs are considered, which have classical parts and also quantum computing-based operators. Generally, a Q/C-ANN includes three main parts, which can be named Part 1, Part 2, and Part 3. For clarification, Fig. 4 shows the architecture of a Q/C-ANN.

Fig. 4

The structure of a Q/C-ANN.

The main task of the first part is to receive the input signals of the hybrid network. In addition, Part 1 can contain classical layers to receive the inputs and produce the output signals of Part 1. Each classical layer can be made using artificial neurons, which receive input signals from the previous layer and produce the output signals considering weighting coefficients, the bias factor, and the activation function. Therefore, in Part 1, artificial neurons play the most important role. For more clarification, the output of a neuron can be updated as follows:

$$\begin{aligned} \mu _{out}=f\left( \chi \right) . \end{aligned}$$
(8)

Where, f is an activation function, and \(\chi\) is as follows:

$$\begin{aligned} \chi = \left( \omega _1 \times \kappa _1 + \omega _2 \times \kappa _2 + \omega _3 \times \kappa _3 + \cdots + \omega _z \times \kappa _z\right) + \varrho . \end{aligned}$$
(9)

In (9), z is the number of neurons of the previous layer that are connected to the desired neuron. In addition, \(\varrho\) is the bias factor, and \(\omega _j\) is the weighting coefficient of the \(j^{th}\) input signal of the desired neuron. Also, \(\kappa _j\) is the \(j^{th}\) input signal of the desired neuron, which can be produced by the \(j^{th}\) neuron of the previous layer.

Also, Part 2 of Fig. 4 is a quantum computing-based layer, which has three sub-parts, i.e., initialized qubits, quantum circuit, and measurements. The qubits can be initialized in different ways. For example, all the implemented qubits can be in state \(| 0 \rangle\). Therefore, if Part 2 is considered an n-qubit system with state \(| 0 \rangle\) for all qubits, the initial state of the system (\(S_{1,0}\)) can be obtained as follows:

$$\begin{aligned} | S_{1,0} \rangle = \overbrace{| 0 \rangle \otimes | 0 \rangle \otimes \cdots \otimes | 0 \rangle }^{n}. \end{aligned}$$
(10)

In other words:

$$\begin{aligned} | S_{1,0} \rangle = \overbrace{| 00 \cdots 0 \rangle }^{n}. \end{aligned}$$
(11)

And as a result, written as a column vector with \(2^n\) elements:

$$\begin{aligned} | S_{1,0} \rangle = 2^n\left\{ \begin{array}{l} \begin{bmatrix} 1\\ 0\\ 0\\ \vdots \\ 0 \end{bmatrix} \end{array}\right. . \end{aligned}$$
(12)

Also, if all the initialized qubits of the system are in state \(| 1 \rangle\), the initial state of the system (\(S_{1,1}\)) is as follows:

$$\begin{aligned} | S_{1,1} \rangle =\overbrace{| 1 \rangle \otimes | 1 \rangle \otimes \cdots \otimes | 1 \rangle }^{n}. \end{aligned}$$
(13)

Therefore,

$$\begin{aligned} | S_{1,1} \rangle = \overbrace{| 11 \cdots 1 \rangle }^{n}. \end{aligned}$$
(14)

Then,

$$\begin{aligned} | S_{1,1} \rangle = 2^n\left\{ \begin{array}{l} \begin{bmatrix} 0\\ 0\\ 0\\ \vdots \\ 1 \end{bmatrix} \end{array}\right. . \end{aligned}$$
(15)

Furthermore, if each qubit is in an equal superposition, the overall initial state of the system (\(| S_{1,e} \rangle\)) can be as follows:

$$\begin{aligned} | S_{1,e} \rangle = \overbrace{\frac{\begin{pmatrix} | 0 \rangle + | 1 \rangle \end{pmatrix}}{\sqrt{2}}\otimes \cdots \otimes \frac{\begin{pmatrix} | 0 \rangle + | 1 \rangle \end{pmatrix}}{\sqrt{2}}}^{n}. \end{aligned}$$
(16)

Also, (16) can be written as follows:

$$\begin{aligned} | S_{1,e} \rangle = 2^n\left\{ \begin{array}{l} \begin{bmatrix} \frac{1}{\sqrt{2^n}}\\ \frac{1}{\sqrt{2^n}}\\ \frac{1}{\sqrt{2^n}}\\ \vdots \\ \frac{1}{\sqrt{2^n}} \end{bmatrix} \end{array}\right. . \end{aligned}$$
(17)
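The three initializations in (10)-(17) can be checked numerically with a short sketch. The helper name `product_state` and the choice of n = 3 are assumptions made only for this illustration.

```python
from functools import reduce
import numpy as np

KET0 = np.array([1.0, 0.0])               # |0>
KET1 = np.array([0.0, 1.0])               # |1>
KET_PLUS = (KET0 + KET1) / np.sqrt(2)     # equal superposition of |0> and |1>

def product_state(single: np.ndarray, n: int) -> np.ndarray:
    """Tensor product of n copies of a single-qubit state (length 2**n)."""
    return reduce(np.kron, [single] * n)

n = 3  # assumed number of qubits for the example
s_zero = product_state(KET0, n)       # (10)-(12): 1 in the first entry
s_one = product_state(KET1, n)        # (13)-(15): 1 in the last entry
s_equal = product_state(KET_PLUS, n)  # (16)-(17): every entry 1/sqrt(2**n)

print(s_zero, s_one, s_equal, sep="\n")
```

In each case, the resulting vector has \(2^n\) elements and unit norm.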

In addition, in Part 2, the quantum circuit can have quantum gates or operators, which receive the initialized qubits. There are different quantum gates, which can be represented by matrices. For example, a Pauli-X gate (\({\mathbb {U}}_X\)), a Hadamard gate (\({\mathbb {U}}_H\)), and a rotation gate about \({\hat{y}}\) axis (\({\mathbb {U}}_{Rot_y}(\varpi )\)) can be represented by matrices as follows46:

$$\begin{aligned} {\mathbb {U}}_X=\begin{bmatrix} 0 & 1\\ 1 & 0 \end{bmatrix}, \end{aligned}$$
(18)
$$\begin{aligned} {\mathbb {U}}_H=\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1\\ 1 & -1 \end{bmatrix}, \end{aligned}$$
(19)

and

$$\begin{aligned} {\mathbb {U}}_{Rot_y}(\varpi )=\begin{bmatrix} cos(\frac{\varpi }{2}) & -sin(\frac{\varpi }{2})\\ sin(\frac{\varpi }{2}) & cos(\frac{\varpi }{2}) \end{bmatrix}. \end{aligned}$$
(20)

It is important to note that there are other quantum operators. In the rest of this part, the matrix representations of some important quantum operators are presented. The matrix representations of the Pauli-X gate, the Hadamard gate, and the rotation gate about the \({\hat{y}}\) axis have been given above. However, there are other quantum gates, e.g., the Pauli-Y gate, Pauli-Z gate, swap gate, controlled-swap gate, controlled-not gate, Toffoli gate, as well as rotation gates about the \({\hat{x}}\) and \({\hat{z}}\) axes. Some gates operate on a single qubit, such as the Pauli-Y gate (\({\mathbb {U}}_Y\)) and the Pauli-Z gate (\({\mathbb {U}}_Z\)), whose matrix representations are as follows46:

$$\begin{aligned} {\mathbb {U}}_Y=\begin{bmatrix} 0 & -\sqrt{-1}\\ \sqrt{-1} & 0 \end{bmatrix}, \end{aligned}$$
(21)

and

$$\begin{aligned} {\mathbb {U}}_Z=\begin{bmatrix} 1 & 0\\ 0 & -1 \end{bmatrix}. \end{aligned}$$
(22)

In addition, rotation gates about \({\hat{x}}\) and \({\hat{z}}\) axes (\({\mathbb {U}}_{Rot_x}(\varpi )\) and \({\mathbb {U}}_{Rot_z}(\varpi )\), respectively) can be considered as other examples of quantum operators for single qubits, where they can be represented mathematically as follows46:

$$\begin{aligned} {\mathbb {U}}_{Rot_x}(\varpi )=\begin{bmatrix} cos(\frac{\varpi }{2}) & -\sqrt{-1}sin(\frac{\varpi }{2})\\ -\sqrt{-1}sin(\frac{\varpi }{2}) & cos(\frac{\varpi }{2}) \end{bmatrix}, \end{aligned}$$
(23)

and

$$\begin{aligned} {\mathbb {U}}_{Rot_z}(\varpi )=\begin{bmatrix} e^{\frac{-\sqrt{-1}\varpi }{2}} & 0\\ 0 & e^{\frac{\sqrt{-1}\varpi }{2}} \end{bmatrix}. \end{aligned}$$
(24)

Furthermore, swap (\({\mathbb {U}}_{swap}\)) and controlled-not (\({\mathbb {U}}_{C-N}\)) gates operate on two qubits, and their matrix representations are as follows46:

$$\begin{aligned} {\mathbb {U}}_{swap}=\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \end{aligned}$$
(25)

and

$$\begin{aligned} {\mathbb {U}}_{C-N}=\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}{.} \end{aligned}$$
(26)

Besides, the controlled-swap gate (\({\mathbb {U}}_{C-swap}\)) and the Toffoli gate (\({\mathbb {U}}_{Toffoli}\)) operate on three qubits, and their matrix representations are as follows46:

$$\begin{aligned} {\mathbb {U}}_{C-swap}=\begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \end{aligned}$$
(27)

and

$$\begin{aligned} {\mathbb {U}}_{Toffoli}=\begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}. \end{aligned}$$
(28)

In this part, the matrix representations of different quantum gates have been discussed. Some of the mentioned quantum gates, such as the Pauli-Y, Pauli-Z, and rotation gates, operate on a single qubit. The swap gate and controlled-not gate operate on two qubits. In addition, the controlled-swap gate and Toffoli gate operate on three qubits. For more information about the mentioned gates and other quantum gates, see46.
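As an illustrative check of the matrices above, the following sketch builds the controlled gates in (26)-(28) from their block structure (an identity block followed by the controlled operator) and verifies that the gates are unitary. The helper names `controlled`, `rot_y`, and `is_unitary` are hypothetical and are introduced only for this example.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)          # Pauli-X, (18)
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)         # swap, (25)

def rot_y(angle: float) -> np.ndarray:
    """Rotation about the y axis, as in (20)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def controlled(u: np.ndarray) -> np.ndarray:
    """Controlled version of u with the added first qubit as control:
    block-diagonal matrix diag(I, u)."""
    d = u.shape[0]
    out = np.eye(2 * d, dtype=complex)
    out[d:, d:] = u
    return out

def is_unitary(u: np.ndarray) -> bool:
    """True when u^dagger u equals the identity."""
    return np.allclose(u.conj().T @ u, np.eye(u.shape[0]))

CNOT = controlled(X)         # reproduces (26)
TOFFOLI = controlled(CNOT)   # reproduces (28)
CSWAP = controlled(SWAP)     # reproduces (27)

for name, gate in [("CNOT", CNOT), ("Toffoli", TOFFOLI),
                   ("C-swap", CSWAP), ("Rot_y(0.7)", rot_y(0.7))]:
    print(name, gate.shape, is_unitary(gate))
```

The rotation gate is the only gate in this sketch with a tunable parameter, which is the property exploited by the hybrid network.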

It is worth mentioning that quantum gates with tunable parameters can be considered. By modifying the parameters of these gates, they can support the hybrid network in achieving its goal (e.g., a regression application). Finally, Part 3 of the hybrid network is a classical ANN. This part has classical layers, and it receives the measurements of the quantum computing-based layer. The output of this part is the output of the hybrid network.

Time-series-based load forecasting using Q/C-ANNs

In this study, Q/C-ANNs are used for ultra short-term forecasting (one-step-ahead prediction) of loads in the form of a time series. Therefore, if \(P_L (t)\) is the value of the load at time t, the set that includes the inputs of the hybrid network is as follows:

$$\begin{aligned} X=\begin{Bmatrix} P_L(t-\Delta t),&P_L(t-2\Delta t),&\cdots&, P_L(t-\beta \Delta t) \end{Bmatrix}. \end{aligned}$$
(29)

Where, in (29), \(\Delta t\) and \(\beta\) are the sampling time and the input delay (i.e., the number of delayed samples used as inputs), respectively. Also, for Part 1 and Part 3, classical layers of artificial neurons are implemented in the form of a feedforward neural network. Besides, for Part 2, the qubits are initialized based on (10). In addition, in the quantum circuit, a rotation operator about the \({\hat{y}}\) axis is considered for each qubit. In Fig. 5, the structure of an n-qubit system including rotation gates is depicted. Based on (20), each rotation operator of Fig. 5 can be represented in the form of a matrix as follows:

$$\begin{aligned} R_y(\xi )={\mathbb {U}}_{Rot_y}(\xi ) = \begin{bmatrix} cos(\frac{\xi }{2}) & -sin(\frac{\xi }{2})\\ sin(\frac{\xi }{2}) & cos(\frac{\xi }{2}) \end{bmatrix}. \end{aligned}$$
(30)

So, in the case of an n-qubit system, the equivalent operator (R) of the quantum circuit can be as follows:

$$\begin{aligned} R = \overbrace{R_y(\xi _1 )}^{{\mathbb {U}}_{Rot_y}(\xi _1)} \otimes \overbrace{R_y(\xi _2 )}^{{\mathbb {U}}_{Rot_y}(\xi _2)} \otimes \cdots \otimes \overbrace{R_y(\xi _n )}^{{\mathbb {U}}_{Rot_y}(\xi _n)}. \end{aligned}$$
(31)

Where, the rotation parameter \(\xi _i\) is related to the \(i^{th}\) qubit of the system.

Fig. 5

The quantum circuit of an n-qubit system, including a rotation gate (\(R_y\) = \({\mathbb {U}}_{Rot_y}\)) for each qubit.

Therefore, using (30) and considering that each qubit is in initial state \(| 0 \rangle\), the final state of each single qubit for \(1 \le i \le n\) is as follows:

$$\begin{aligned} |\phi _{i}\rangle = \begin{bmatrix} cos(\frac{\xi _i}{2}) & -sin(\frac{\xi _i}{2})\\ sin(\frac{\xi _i}{2}) & cos(\frac{\xi _i}{2}) \end{bmatrix} \times \begin{bmatrix} 1\\ 0 \end{bmatrix}, \end{aligned}$$
(32)

and in other words,

$$\begin{aligned} | \phi _{i} \rangle =\begin{bmatrix} cos(\frac{\xi _i}{2})\\ sin(\frac{\xi _i}{2}) \end{bmatrix}. \end{aligned}$$
(33)

Therefore, the final state of qubit i can be written as follows:

$$\begin{aligned} | \phi _{i} \rangle = \begin{pmatrix} cos(\frac{\xi _i}{2})| 0 \rangle + sin(\frac{\xi _i}{2}) | 1 \rangle \end{pmatrix}. \end{aligned}$$
(34)

Also, the vector corresponding to the final state of the system \(| S_{2} \rangle\) can be calculated as follows:

$$\begin{aligned} |S_{2}\rangle =|\phi _{1} \rangle \otimes |\phi _{2} \rangle \otimes \cdots \otimes |\phi _{n-1} \rangle \otimes |\phi _{n} \rangle . \end{aligned}$$
(35)

And considering (33) and (35), \(| S_{2} \rangle\) can be obtained as follows:

$$\begin{aligned} |S_{2}\rangle =\begin{bmatrix} cos(\frac{\xi _1}{2})\\ sin(\frac{\xi _1}{2}) \end{bmatrix} \otimes \begin{bmatrix} cos(\frac{\xi _2}{2})\\ sin(\frac{\xi _2}{2}) \end{bmatrix} \otimes \cdots \otimes \begin{bmatrix} cos(\frac{\xi _{n-1}}{2})\\ sin(\frac{\xi _{n-1}}{2}) \end{bmatrix} \otimes \begin{bmatrix} cos(\frac{\xi _n}{2})\\ sin(\frac{\xi _n}{2}) \end{bmatrix}. \end{aligned}$$
(36)

The output of the quantum computing-based layer depends on \(\xi _1, \xi _2, \cdots , \xi _n\). The measured outputs of the qubits can be used as the inputs of Part 3 in order to connect the quantum computing-based layer to Part 3 of the Q/C-ANN.
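The derivation in (31)-(36) can be verified numerically. The sketch below computes the final state in two ways: directly from the single-qubit states as in (36), and by applying the equivalent operator R of (31) to the initial state of (10). The function names and the three rotation angles are assumptions chosen only for the example; the helper `marginal_p0` additionally evaluates the probability of measuring \(| 0 \rangle\) on one qubit, which for this circuit equals \(cos^2(\frac{\xi _i}{2})\).

```python
from functools import reduce
import numpy as np

def rot_y(xi: float) -> np.ndarray:
    """Rotation gate about the y axis, as in (30)."""
    c, s = np.cos(xi / 2), np.sin(xi / 2)
    return np.array([[c, -s], [s, c]])

def layer_state_direct(xis) -> np.ndarray:
    """Equation (36): tensor product of the single-qubit states in (33)."""
    singles = [np.array([np.cos(x / 2), np.sin(x / 2)]) for x in xis]
    return reduce(np.kron, singles)

def layer_state_via_operator(xis) -> np.ndarray:
    """Equation (31) applied to the all-zero initial state of (10)."""
    R = reduce(np.kron, [rot_y(x) for x in xis])
    s0 = np.zeros(2 ** len(xis))
    s0[0] = 1.0
    return R @ s0

def marginal_p0(state: np.ndarray, i: int) -> float:
    """Probability of measuring |0> on qubit i (0-based, first qubit leftmost)."""
    n = int(np.log2(state.size))
    probs = (np.abs(state) ** 2).reshape((2,) * n)
    other_axes = tuple(k for k in range(n) if k != i)
    return float(probs.sum(axis=other_axes)[0])

xis = [0.3, 1.1, 2.0]  # assumed rotation parameters for the example
s2_direct = layer_state_direct(xis)
s2_operator = layer_state_via_operator(xis)
print(np.allclose(s2_direct, s2_operator), marginal_p0(s2_direct, 0))
```

Because the circuit has no entangling gates, the state of each qubit depends only on its own rotation parameter, so the \(2^n\)-element vector never needs to be formed in order to obtain per-qubit quantities.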

For clarification, the rest of this part shows how a Q/C-ANN can predict the future value of a load. Consider a Q/C-ANN that has \(N_{in}\) inputs (\(\beta =N_{in}\)). Also, for this hybrid network (Q/C-ANN), each classical part (Part 1 and Part 3) has one classical layer. In addition, the numbers of classical neurons for Part 1 and Part 3 are \(N_{n,1}\) and \(N_{n,3}\), respectively. Then, based on (29), the vector related to the input dataset of this hybrid network is modified as follows:

$$\begin{aligned} X=\begin{bmatrix} P_L(t)&P_L(t-\Delta t)&\cdots&P_L(t-(N_{in}-1)\Delta t) \end{bmatrix}. \end{aligned}$$
(37)

Also, the output of this hybrid network is as follows:

$$\begin{aligned} Y=\begin{bmatrix} P_L(t+\Delta t) \end{bmatrix}. \end{aligned}$$
(38)
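The input/output pairs defined by (37) and (38) can be built from a recorded load series with a sliding window. The following sketch assumes a uniformly sampled series stored as a one-dimensional array; the function name `make_windows` and the toy values are hypothetical and serve only as an illustration.

```python
import numpy as np

def make_windows(load, n_in: int):
    """Build (X, Y) pairs: each row of X is
    [P_L(t), P_L(t - dt), ..., P_L(t - (n_in - 1) dt)] as in (37),
    and the matching entry of Y is P_L(t + dt) as in (38)."""
    load = np.asarray(load, dtype=float)
    rows, targets = [], []
    for t in range(n_in - 1, len(load) - 1):
        window = load[t - n_in + 1 : t + 1]
        rows.append(window[::-1])     # most recent sample first
        targets.append(load[t + 1])   # one-step-ahead value
    return np.array(rows), np.array(targets)

series = [1.0, 2.0, 3.0, 4.0, 5.0]    # toy load values (assumed)
X, Y = make_windows(series, n_in=3)
print(X)
print(Y)
```

Each row of `X` and the corresponding element of `Y` form one sample for the hybrid network.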

Considering the mentioned parameters and the definition of the inputs and the output of the Q/C-ANN, the relation between the inputs and the output of the system can be explained. Therefore, for the rest of this part, the calculations related to Part 1, Part 2, and Part 3 will be discussed to show how the hybrid network can be used to predict the desired data. To achieve this goal, Subsection 4.1, Subsection 4.2, and Subsection 4.3 will discuss Part 1, Part 2, and Part 3 of the Q/C-ANN, respectively. The next part will show how the output signals of Part 1 can be calculated.

Calculation of the outputs of Part 1

Part 1 is a classical layer, which includes neurons. In addition, the input vector of Part 1 (\(I_1\)) is fed by the inputs of the hybrid network. In other words:

$$\begin{aligned} I_1=X^T. \end{aligned}$$
(39)

Also, each neuron of Part 1 is connected to the inputs of the hybrid network. In addition, each neuron has weighting coefficients and a bias factor. The matrix representation of weighting coefficients of Part 1 is as follows:

$$\begin{aligned} W_1=\begin{bmatrix} w_{1,1,1} & w_{1,1,2} & w_{1,1,3} & \cdots & w_{1,1,N_{in}} \\ w_{1,2,1} & w_{1,2,2} & w_{1,2,3} & \cdots & w_{1,2,N_{in}} \\ w_{1,3,1} & w_{1,3,2} & w_{1,3,3} & \cdots & w_{1,3,N_{in}} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ w_{1,N_{n,1},1} & w_{1,N_{n,1},2} & w_{1,N_{n,1},3} & \cdots & w_{1,N_{n,1},N_{in}} \end{bmatrix}. \end{aligned}$$
(40)

Where, \(w_{1,j,k}\) is the weighting coefficient between the \(j^{th}\) neuron of the classical layer of Part 1 and the \(k^{th}\) input of the hybrid network (\(P_L(t-(k-1)\Delta t)\)). Also, the vector corresponding to the bias factors of Part 1 can be represented as follows:

$$\begin{aligned} B_1=\begin{bmatrix} b_{1,1}\\ b_{1,2}\\ \vdots \\ b_{1,N_{n,1}-1} \\ b_{1,N_{n,1}} \end{bmatrix}. \end{aligned}$$
(41)

In (41), \(b_{1,j}\) is the bias factor of the \(j^{th}\) neuron of the classical layer of Part 1. Therefore, by the implementation of (39), (40), and (41), the output of the classical layer without the implementation of the activation function of Part 1 is as follows:

$$\begin{aligned} P_1=W_1 \times I_1 + B_1 \end{aligned}$$
(42)

Also, \(P_1\) can be represented in a form of a vector as follows:

$$\begin{aligned} P_1=\begin{bmatrix} p_{1,1}\\ p_{1,2}\\ \vdots \\ p_{1,N_{n,1}-1} \\ p_{1,N_{n,1}} \end{bmatrix}. \end{aligned}$$
(43)

Using an activation function \(f_1\), the \(i^{th}\) output of Part 1 is \(f_1(p_{1,i})\). Then, the vector corresponding to the outputs of Part 1 is as follows:

$$\begin{aligned} O_1=\begin{bmatrix} f_1(p_{1,1})\\ f_1(p_{1,2})\\ \vdots \\ f_1(p_{1,N_{n,1}-1}) \\ f_1(p_{1,N_{n,1}}) \end{bmatrix}. \end{aligned}$$
(44)

The output signals of Part 1 will be used as the input signals of Part 2. In other words, the input signals of Part 2 will be used to update the state of the qubits using rotation gates, which are deployed to structure the quantum circuit in Part 2. The next part will discuss how the output signals of Part 1 can be used in Part 2 to update the state of the qubits and how it can produce the output signals of Part 2.
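The calculation in (39)-(44) is a standard dense layer and can be sketched as follows. The activation function \(f_1\) is not fixed by the equations above, so the hyperbolic tangent used here is an assumption, as are the function name `part1_forward` and the numerical values of the weights, biases, and inputs.

```python
import numpy as np

def part1_forward(x, W1: np.ndarray, B1: np.ndarray, f1=np.tanh) -> np.ndarray:
    """Outputs of Part 1 for one input sample.
    x: the N_in inputs of (37); W1: (N_n1, N_in) matrix of (40);
    B1: (N_n1, 1) bias vector of (41); f1: activation (tanh is an assumption)."""
    I1 = np.asarray(x, dtype=float).reshape(-1, 1)  # (39): column vector X^T
    P1 = W1 @ I1 + B1                               # (42)
    return f1(P1)                                   # (44)

# Toy parameters for N_in = 2 inputs and N_n1 = 3 neurons (assumed values).
W1 = np.array([[0.5, -0.5],
               [1.0,  1.0],
               [0.0,  2.0]])
B1 = np.array([[0.0], [-1.0], [0.5]])

O1 = part1_forward([1.0, 1.0], W1, B1)
print(O1.ravel())
```

With a bounded activation such as the hyperbolic tangent, each element of `O1` lies in (-1, 1); whether the outputs should be rescaled before being used as rotation parameters is a design choice that this sketch does not address.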

Calculation of the outputs of Part 2

As discussed above, the outputs of Part 1 modify the rotation parameters of Part 2. Therefore, the input vector of Part 2 (\(I_2\)) is as follows:

$$\begin{aligned} I_2=O_1. \end{aligned}$$
(45)

In other words, the \(i^{th}\) rotation parameter is as follows:

$$\begin{aligned} \xi _i = f_1(p_{1,i}), \end{aligned}$$
(46)

and the vector representation related to the rotation parameters of Part 2 is as follows:

$$\begin{aligned} \Xi =\begin{bmatrix} \xi _1\\ \xi _2\\ \vdots \\ \xi _{N_{n,1}-1} \\ \xi _{N_{n,1}} \end{bmatrix}. \end{aligned}$$
(47)

Where, using (46) and (47), \(\Xi\) can be obtained. Also, if the initial state of the \(i^{th}\) qubit is \(| 0 \rangle\), based on (34) and (46), the final state of the \(i^{th}\) qubit is as follows:

$$\begin{aligned} | \phi _{i} \rangle = \begin{pmatrix} cos(\frac{f_1(p_{1,i})}{2})| 0 \rangle + sin(\frac{f_1(p_{1,i})}{2}) | 1 \rangle \end{pmatrix}. \end{aligned}$$
(48)

In other words and based on (33),

$$\begin{aligned} |\phi _{i}\rangle =\begin{bmatrix} cos(\frac{f_1(p_{1,i})}{2})\\ sin(\frac{f_1(p_{1,i})}{2}) \end{bmatrix}. \end{aligned}$$
(49)

Therefore, (49) can be encoded to be used as the input for Part 3. So, the following dataset (matrix \(O_2\)) can be considered to create the outputs of Part 2.

$$\begin{aligned} O_2=\begin{bmatrix} cos(\frac{f_1(p_{1,1})}{2}) & sin(\frac{f_1(p_{1,1})}{2}) \\ cos(\frac{f_1(p_{1,2})}{2}) & sin(\frac{f_1(p_{1,2})}{2}) \\ \vdots & \vdots \\ cos(\frac{f_1(p_{1,N_{n,1}-1})}{2}) & sin(\frac{f_1(p_{1,N_{n,1}-1})}{2}) \\ cos(\frac{f_1(p_{1,N_{n,1}})}{2}) & sin(\frac{f_1(p_{1,N_{n,1}})}{2}) \end{bmatrix}. \end{aligned}$$
(50)

The next part discusses how the outputs of Part 2 can be encoded to provide the inputs of Part 3. Then, it shows how the encoded data can be used to obtain the output of Part 3 and, as a result, the output of the Q/C-ANN.

Calculation of the output of Part 3

The output of Part 2 is based on (49) and (50). As mentioned above, the input of Part 3 is the encoded version of (49) and (50). In this work, the elements related to state \(| 0 \rangle\) (\(cos(\frac{f_1(p_{1,i})}{2})\)) are used as the input of Part 3; as a result, the first column of (50) is used as the input of Part 3. So, the vector representation of the inputs for Part 3 is as follows:

$$\begin{aligned} I_3=\begin{bmatrix} cos(\frac{f_1(p_{1,1})}{2})\\ cos(\frac{f_1(p_{1,2})}{2})\\ \vdots \\ cos(\frac{f_1(p_{1,N_{n,1}-1})}{2}) \\ cos(\frac{f_1(p_{1,N_{n,1}})}{2}) \end{bmatrix}. \end{aligned}$$
(51)

In addition, the matrix representation of the weighting coefficients (\(W_3\)) and the vector representation of the bias factors of the classical layer of Part 3 are as follows:

$$\begin{aligned} & W_3=\begin{bmatrix} w_{3,1,1} & w_{3,1,2} & w_{3,1,3} & \cdots & w_{3,1,N_{n,1}} \\ w_{3,2,1} & w_{3,2,2} & w_{3,2,3} & \cdots & w_{3,2,N_{n,1}} \\ w_{3,3,1} & w_{3,3,2} & w_{3,3,3} & \cdots & w_{3,3,N_{n,1}} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ w_{3,N_{n,3},1} & w_{3,N_{n,3},2} & w_{3,N_{n,3},3} & \cdots & w_{3,N_{n,3},N_{n,1}} \end{bmatrix}, \end{aligned}$$
(52)
$$\begin{aligned} & B_3=\begin{bmatrix} b_{3,1}\\ b_{3,2}\\ \vdots \\ b_{3,N_{n,3}-1} \\ b_{3,N_{n,3}} \end{bmatrix}. \end{aligned}$$
(53)

Where, in (52) and (53), \(w_{3,j,k}\) and \(b_{3,j}\) are the weighting coefficient connecting the \(k^{th}\) input of Part 3 to the \(j^{th}\) neuron of the classical layer of Part 3, and the bias factor of that neuron, respectively. So, based on (51), (52), and (53), the output of the classical layer of Part 3 without the deployment of activation functions can be calculated as follows:

$$\begin{aligned} P_3=W_3 \times I_3 + B_3. \end{aligned}$$
(54)

Where, the vector representation of \(P_3\) can be written as follows:

$$\begin{aligned} P_3=\begin{bmatrix} p_{3,1}\\ p_{3,2}\\ \vdots \\ p_{3,N_{n,3}-1} \\ p_{3,N_{n,3}} \end{bmatrix}, \end{aligned}$$
(55)

and using an activation function \(f_3\) for this layer, the output vector of this layer (\(O_3\)) is as follows:

$$\begin{aligned} O_3=\begin{bmatrix} f_3(p_{3,1})\\ f_3(p_{3,2})\\ \vdots \\ f_3(p_{3,N_{n,3}-1}) \\ f_3(p_{3,N_{n,3}}) \end{bmatrix}. \end{aligned}$$
(56)

Also, the main task of Part 3 is to produce the output of the hybrid network and, as a result, the prediction of the value of the load. So, in this part, in addition to the previous classical layer, there is a classical output layer. The output of this layer is the prediction of \(P_L\) at time \(t + \Delta t\). Therefore, the output layer of Part 3 has only one neuron and, as a result, one bias factor (\(b_o\)) for that neuron. In addition, the weighting coefficients of this layer can be represented by a vector as follows:

$$\begin{aligned} W_o=\begin{bmatrix} w_{o,1}&w_{o,2}&\cdots&w_{o,N_{n,3}-1}&w_{o,N_{n,3}} \end{bmatrix}. \end{aligned}$$
(57)

Finally, using a linear activation function, the output of the hybrid network can be calculated as follows:

$$\begin{aligned} P_o=W_o \times O_3 + b_o. \end{aligned}$$
(58)
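The calculation in (54), (56), and (58) can be sketched in a few lines of Python with NumPy. This is only an illustration and not the authors' Matlab implementation: the function name `part3_forward`, the layer sizes, the random weights, and the choice of `tanh` for \(f_3\) are assumptions made for the example.

```python
import numpy as np

def part3_forward(I3, W3, B3, Wo, bo, f3=np.tanh):
    """Hidden classical layer of Part 3 followed by the linear output neuron."""
    P3 = W3 @ I3 + B3          # (54): pre-activation vector, one entry per neuron
    O3 = f3(P3)                # (56): element-wise activation (tanh is an assumption)
    Po = float(Wo @ O3 + bo)   # (58): linear output neuron, a scalar
    return P3, O3, Po

# Illustrative sizes only: 4 inputs coming from Part 2 and 3 hidden neurons.
rng = np.random.default_rng(0)
I3 = rng.uniform(0.0, 1.0, size=4)
W3 = rng.normal(size=(3, 4))   # one row per neuron, one column per input, as in (52)
B3 = rng.normal(size=3)        # (53)
Wo = rng.normal(size=3)        # (57)
bo = 0.1
P3, O3, Po = part3_forward(I3, W3, B3, Wo, bo)
print(P3.shape, O3.shape, Po)
```

The shapes make the bookkeeping explicit: \(W_3\) has one row per neuron of the classical layer and one column per input of Part 3, so \(P_3\) and \(O_3\) have \(N_{n,3}\) entries and \(P_o\) is a scalar.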

This part discussed how the output of Q/C-ANN can be calculated. The next part briefly discusses load forecasting using Q/C-ANN and shows how the three parts of Q/C-ANN can be combined in a unified algorithm (i.e., Algorithm 1) to produce the output of the hybrid network.

The Implemented Strategy in a Nutshell

Briefly, the output of the hybrid network is \(P_o\). The hybrid network receives X in (37). Then, using (39), (42), (43), and (44), the outputs of Part 1 are calculated. The outputs of Part 1 are used as the inputs of Part 2 to set the rotation parameters of the quantum circuit. Therefore, based on (46), (47), (48), (49), and (50), the outputs of Part 2 are determined. After that, the outputs of Part 2 are encoded as the inputs of Part 3 using (51). Then, using (54), (55), and (56), the output signals of the neurons of the classical layer of Part 3 can be obtained. Finally, using (58), the output of the hybrid network, i.e., the predicted value of the load at time \(t+\Delta t\) (\({\overline{P}}_L(t + \Delta t)\)), is calculated.

Algorithm 1
figure a

Load Forecasting based on Q/C-ANN.

In addition to the inputs of the hybrid network, there are other parameters that should be known before the output of the hybrid network can be calculated. These parameters are the weighting coefficients and the bias factors, given by (40), (41), (52), (53), (57), and \(b_o\). They can be tuned using a training process. For more clarification about the implementation of Q/C-ANN, Algorithm 1 shows the steps to estimate the future value of the load.
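As a rough companion to the steps summarized above, the following Python sketch chains the three parts into one forward pass. It is a sketch under stated assumptions, not the authors' code: Part 1 is assumed to be an affine layer followed by \(f_1\), the hypothetical `scaled_sigmoid` is used for \(f_1\) so that the rotation angles stay in \((0,\pi )\), `tanh` is assumed for \(f_3\), the qubits are assumed to start in \(|0\rangle\), and the weights are random rather than trained.

```python
import numpy as np

def scaled_sigmoid(p):
    """Hypothetical f_1: maps any real value into the open interval (0, pi)."""
    return np.pi / (1.0 + np.exp(-p))

def qc_ann_forward(X, W1, B1, W3, B3, Wo, bo, f1=scaled_sigmoid, f3=np.tanh):
    """One forward pass through Parts 1-3; returns the scalar prediction P_o."""
    angles = f1(W1 @ X + B1)          # Part 1: assumed affine layer, then f_1
    r0 = np.cos(angles / 2.0) ** 2    # Part 2: probability of |0> after R_y on |0>
    I3 = np.sqrt(r0)                  # classical inputs handed to Part 3
    O3 = f3(W3 @ I3 + B3)             # Part 3: classical layer, (54)-(56)
    return float(Wo @ O3 + bo)        # Part 3: linear output neuron, (58)

# Layer sizes quoted in the Results section: 5 inputs, 10 neurons/qubits, 3 neurons, 1 output.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=5)     # made-up normalized load samples
W1, B1 = rng.normal(size=(10, 5)), rng.normal(size=10)
W3, B3 = rng.normal(size=(3, 10)), rng.normal(size=3)
Wo, bo = rng.normal(size=3), 0.0
P_o = qc_ann_forward(X, W1, B1, W3, B3, Wo, bo)
print(P_o)
```

Because the single-qubit rotations act independently, the quantum layer is evaluated here in closed form; a circuit simulator would be needed only if entangling gates were added.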

More math-based discussions

In this section, the following topics are discussed mathematically and in more detail: the use of the classical data to update the states of the qubits of the quantum layer, quantum measurement and its relation to the output of Part 2 of Q/C-ANN, different encoding techniques to map classical data into quantum states, and data pre-processing.

Updating the states of the qubits

In this part, the implementation of the classical data (the produced classical data by Part 1 of Q/C-ANN) to update the initialized qubits (with general states) of the quantum circuit will be clarified mathematically. In a general form, for the quantum circuit of Q/C-ANN with n qubits, the initial state of the \(i^{th}\) qubit (for \(1\le i \le n\)) can be considered as follows:

$$\begin{aligned} |\gamma _{in,i}\rangle =\alpha _{1,i} | 0 \rangle + \alpha _{2,i} | 1 \rangle . \end{aligned}$$
(59)

Where, based on (2),

$$\begin{aligned} \left| \alpha _{1,i} \right| ^2 + \left| \alpha _{2,i} \right| ^2 =1. \end{aligned}$$
(60)

Also, as mentioned before, the quantum circuit is made using rotation operators. Then, the final state of each single qubit can be calculated as follows:

$$\begin{aligned} |\gamma _{out,i}\rangle =R_y(f_1(p_{1,i})) \times |\gamma _{in,i}\rangle . \end{aligned}$$
(61)

In other words,

$$\begin{aligned} |\gamma _{out,i}\rangle =\begin{bmatrix} cos(\frac{f_1(p_{1,i})}{2}) & -sin(\frac{f_1(p_{1,i})}{2})\\ sin(\frac{f_1(p_{1,i})}{2}) & cos(\frac{f_1(p_{1,i})}{2}) \end{bmatrix} \times \begin{bmatrix} \alpha _{1,i}\\ \alpha _{2,i} \end{bmatrix}. \end{aligned}$$
(62)

So, if

$$\begin{aligned} |\gamma _{out,i}\rangle =\begin{bmatrix} \beta _{1,i}\\ \beta _{2,i} \end{bmatrix}, \end{aligned}$$
(63)

then,

$$\begin{aligned} \beta _{1,i} = \alpha _{1,i} \times cos(\frac{f_1(p_{1,i})}{2}) - \alpha _{2,i} \times sin(\frac{f_1(p_{1,i})}{2}), \end{aligned}$$
(64)

and

$$\begin{aligned} \beta _{2,i} = \alpha _{1,i} \times sin(\frac{f_1(p_{1,i})}{2}) + \alpha _{2,i} \times cos(\frac{f_1(p_{1,i})}{2}). \end{aligned}$$
(65)

Therefore, the final state of each single qubit (for example qubit i) can be obtained using (64) and (65). In addition, using (5) and (31), if \(U=R\), the final state of the n-qubit system can be calculated as follows:

$$\begin{aligned} | \Gamma _{out} \rangle = R \times | \Gamma _{in} \rangle . \end{aligned}$$
(66)

Also, the initial and the final states (i.e., \(|\Gamma _{in}\rangle\) and \(|\Gamma _{out}\rangle\), respectively) of the quantum circuit of Q/C-ANN can be written for an n-qubit system in a general form as follows:

$$\begin{aligned} | \Gamma _{in} \rangle = \overbrace{| \gamma _{in,1} \rangle \otimes | \gamma _{in,2} \rangle \otimes \cdots \otimes | \gamma _{in,n} \rangle }^{n}, \end{aligned}$$
(67)

and

$$\begin{aligned} | \Gamma _{out} \rangle = \overbrace{| \gamma _{out,1} \rangle \otimes | \gamma _{out,2} \rangle \otimes \cdots \otimes | \gamma _{out,n} \rangle }^{n}. \end{aligned}$$
(68)

Then,

$$\begin{aligned} | \Gamma _{in} \rangle = \overbrace{\begin{bmatrix} \alpha _{1,1}\\ \alpha _{2,1} \end{bmatrix} \otimes \begin{bmatrix} \alpha _{1,2}\\ \alpha _{2,2} \end{bmatrix} \otimes \cdots \otimes \begin{bmatrix} \alpha _{1,n}\\ \alpha _{2,n} \end{bmatrix}}^{n}, \end{aligned}$$
(69)

and

$$\begin{aligned} | \Gamma _{out} \rangle = \overbrace{\begin{bmatrix} \beta _{1,1}\\ \beta _{2,1} \end{bmatrix} \otimes \begin{bmatrix} \beta _{1,2}\\ \beta _{2,2} \end{bmatrix} \otimes \cdots \otimes \begin{bmatrix} \beta _{1,n}\\ \beta _{2,n} \end{bmatrix}}^{n}. \end{aligned}$$
(70)

In this part, the mathematical representation of how the output signals of Part 1 of Q/C-ANN (which are classical data) are used as the angles of the rotation gates to update the states of the initialized qubits has been explained. For generality, the states of the initialized qubits have been considered in a general form.
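The relation between the qubit-by-qubit update in (61) to (65) and the n-qubit form in (66) to (70) can be checked numerically. The sketch below is illustrative only: the angles and initial states are made-up values, and \(R\) is assumed here to be the Kronecker product of the single-qubit \(R_y\) matrices, since (31) is not restated in this section.

```python
import numpy as np
from functools import reduce

def ry(theta):
    """Single-qubit R_y(theta) matrix in the form written in (62)."""
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([[c, -s], [s, c]])

def kron_all(factors):
    """Kronecker product of a list of vectors or matrices, left to right."""
    return reduce(np.kron, factors)

# Hypothetical values: three qubits with arbitrary normalized initial states.
thetas = [0.3, 1.1, 2.0]  # stand-ins for f_1(p_{1,i})
gamma_in = [
    np.array([0.6, 0.8]),
    np.array([1.0, 0.0]),
    np.array([1.0, 1.0]) / np.sqrt(2.0),
]

gamma_out = [ry(t) @ g for t, g in zip(thetas, gamma_in)]  # (61)-(65), qubit by qubit
Gamma_in = kron_all(gamma_in)                              # (67) and (69)
Gamma_out = kron_all(gamma_out)                            # (68) and (70)
R = kron_all([ry(t) for t in thetas])                      # assumed form of R

print(np.allclose(R @ Gamma_in, Gamma_out))                # consistency with (66)
print(np.isclose(np.linalg.norm(Gamma_out), 1.0))          # state stays normalized
```

Both checks rely on the mixed-product property of the Kronecker product, which is why rotating each qubit separately and rotating the joint state give the same result when no entangling gates are present.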

Quantum measurement and the outputs of the quantum layer

Measurements play a key role in a quantum circuit: a desired basis is chosen and the result is used for different purposes, e.g., coding/decoding, interpretation, or transfer to other units. A general description of a measurement can be given as follows46,47:

$$\begin{aligned} |\Gamma _M \rangle = \frac{M_O|\Gamma \rangle }{\sqrt{ \langle \Gamma |M_O^{\dagger }M_O|\Gamma \rangle }}, \end{aligned}$$
(71)

and

$$\begin{aligned} Prob(O)=\langle \Gamma |M_O^{\dagger }M_O|\Gamma \rangle . \end{aligned}$$
(72)

Where, \(M_O\) is the measurement operator associated with the possible result O, \(|\Gamma _M\rangle\) is the new state if the measurement result is O, and Prob(O) is the probability that the measurement yields result O. Considering the case of projective measurements, the projection operator corresponding to measurement result \(|j\rangle\) is \(Prj_j\), where \(Prj_j^{\dagger }=Prj_j\) and \(Prj_j^2=Prj_j\), and as a result46,47:

$$\begin{aligned} |\Gamma _j \rangle = \frac{Prj_j|\Gamma \rangle }{\sqrt{\langle \Gamma |Prj_j|\Gamma \rangle }}, \end{aligned}$$
(73)

and

$$\begin{aligned} Prob(j)=\langle \Gamma |Prj_j|\Gamma \rangle . \end{aligned}$$
(74)

A projective measurement can be characterized by a complete set of orthogonal projectors, and if the measurement is done with respect to z axis, the complete set of the orthogonal projectors is \(\left\{ Prj_0,Prj_1\right\}\)46,47:

$$\begin{aligned} Prj_0=|0 \rangle \langle 0|, \end{aligned}$$
(75)

and

$$\begin{aligned} Prj_1=|1 \rangle \langle 1|. \end{aligned}$$
(76)

Considering the measurement of the \(i^{th}\) qubit with respect to the z axis, and if Prob(j) is called \(r_{ij}\) (for \(j\in \left\{ 0,1\right\}\)), (74) can be written as follows:

$$\begin{aligned} Prob(0)=r_{i0}=\langle \gamma _{out,i}|Prj_0|\gamma _{out,i} \rangle , \end{aligned}$$
(77)

and

$$\begin{aligned} Prob(1)=r_{i1}=\langle \gamma _{out,i}|Prj_1|\gamma _{out,i} \rangle . \end{aligned}$$
(78)

Then considering that \(| \gamma _{out,i} \rangle = \beta _{1,i}| 0 \rangle + \beta _{2,i}| 1 \rangle\), and using (75) and (76), (77) and (78) can be updated as follows:

$$\begin{aligned} r_{i0}=(\overline{\beta _{1,i}}\langle 0| + \overline{\beta _{2,i}}\langle 1|)(|0 \rangle \langle 0|)(\beta _{1,i}| 0 \rangle + \beta _{2,i}| 1 \rangle ), \end{aligned}$$
(79)

and

$$\begin{aligned} r_{i1}=(\overline{\beta _{1,i}}\langle 0| + \overline{\beta _{2,i}}\langle 1|)(|1 \rangle \langle 1|)(\beta _{1,i}| 0 \rangle + \beta _{2,i}| 1 \rangle ). \end{aligned}$$
(80)

In other words:

$$\begin{aligned} r_{i0}=(\overline{\beta _{1,i}}\langle 0|0 \rangle \langle 0| + \overline{\beta _{2,i}}\langle 1|0 \rangle \langle 0|)(\beta _{1,i}| 0 \rangle + \beta _{2,i}| 1 \rangle ), \end{aligned}$$
(81)

and

$$\begin{aligned} r_{i1}=(\overline{\beta _{1,i}}\langle 0|1 \rangle \langle 1| + \overline{\beta _{2,i}}\langle 1|1 \rangle \langle 1|)(\beta _{1,i}| 0 \rangle + \beta _{2,i}| 1 \rangle ). \end{aligned}$$
(82)

Where, \(\overline{\beta _{1,i}}\), \(\overline{\beta _{2,i}}\), and more generally \({\overline{\iota }}\) denote the complex conjugate of \(\beta _{1,i}\), \(\beta _{2,i}\), and \(\iota\), respectively. In addition, \(|0 \rangle\) and \(|1 \rangle\) form an orthonormal basis and as a result, \(\langle 0|0 \rangle = \langle 1|1 \rangle = 1\) and \(\langle 0|1 \rangle = \langle 1|0 \rangle = 0\)46,47. Therefore, using (64) and (65), (81) and (82) can be rewritten as follows:

$$\begin{aligned} \begin{aligned} r_{i0}=(\overline{(\alpha _{1,i} \times cos(\frac{f_1(p_{1,i})}{2}) - \alpha _{2,i} \times sin(\frac{f_1(p_{1,i})}{2}))}\langle 0|) \times ... \\ \quad \quad \quad \quad ((\alpha _{1,i} \times cos(\frac{f_1(p_{1,i})}{2}) - \alpha _{2,i} \times sin(\frac{f_1(p_{1,i})}{2}))| 0 \rangle + ... \\ (\alpha _{1,i} \times sin(\frac{f_1(p_{1,i})}{2}) + \alpha _{2,i} \times cos(\frac{f_1(p_{1,i})}{2}))| 1 \rangle ), \end{aligned} \end{aligned}$$
(83)

and

$$\begin{aligned} \begin{aligned} r_{i1}=(\overline{(\alpha _{1,i} \times sin(\frac{f_1(p_{1,i})}{2}) + \alpha _{2,i} \times cos(\frac{f_1(p_{1,i})}{2}))}\langle 1|) \times ... \\ \quad \quad \quad \quad ((\alpha _{1,i} \times cos(\frac{f_1(p_{1,i})}{2}) - \alpha _{2,i} \times sin(\frac{f_1(p_{1,i})}{2}))| 0 \rangle + ... \\ (\alpha _{1,i} \times sin(\frac{f_1(p_{1,i})}{2}) + \alpha _{2,i} \times cos(\frac{f_1(p_{1,i})}{2}))| 1 \rangle ). \end{aligned} \end{aligned}$$
(84)

Again, considering that \(| 0 \rangle\) and \(| 1 \rangle\) form an orthonormal basis, (83) and (84) reduce to the following:

$$\begin{aligned} \begin{aligned} r_{i0}=\overline{(\alpha _{1,i} \times cos(\frac{f_1(p_{1,i})}{2}) - \alpha _{2,i} \times sin(\frac{f_1(p_{1,i})}{2}))} \times ... \\ \quad \quad (\alpha _{1,i} \times cos(\frac{f_1(p_{1,i})}{2}) - \alpha _{2,i} \times sin(\frac{f_1(p_{1,i})}{2})), \end{aligned} \end{aligned}$$
(85)

and

$$\begin{aligned} \begin{aligned} r_{i1}=\overline{(\alpha _{1,i} \times sin(\frac{f_1(p_{1,i})}{2}) + \alpha _{2,i} \times cos(\frac{f_1(p_{1,i})}{2}))} \times ... \\ \quad \quad (\alpha _{1,i} \times sin(\frac{f_1(p_{1,i})}{2}) + \alpha _{2,i} \times cos(\frac{f_1(p_{1,i})}{2})). \end{aligned} \end{aligned}$$
(86)

Additionally, considering that \({\overline{\Lambda }}\Lambda =\left| \Lambda \right| ^2\)48, by defining \(\Lambda _0\) as

$$\begin{aligned} \Lambda _0 = \alpha _{1,i} \times cos(\frac{f_1(p_{1,i})}{2}) - \alpha _{2,i} \times sin(\frac{f_1(p_{1,i})}{2}), \end{aligned}$$
(87)

and defining \(\Lambda _1\) as

$$\begin{aligned} \Lambda _1 = \alpha _{1,i} \times sin(\frac{f_1(p_{1,i})}{2}) + \alpha _{2,i} \times cos(\frac{f_1(p_{1,i})}{2}), \end{aligned}$$
(88)

(85) and (86) can be modified as follows:

$$\begin{aligned} r_{i0}=\overline{\Lambda _0}\Lambda _0=\left| \Lambda _0 \right| ^2, \end{aligned}$$
(89)

and

$$\begin{aligned} r_{i1}=\overline{\Lambda _1}\Lambda _1=\left| \Lambda _1 \right| ^2. \end{aligned}$$
(90)

In other words:

$$\begin{aligned} r_{i0} = \left| \alpha _{1,i} \times cos(\frac{f_1(p_{1,i})}{2}) - \alpha _{2,i} \times sin(\frac{f_1(p_{1,i})}{2}) \right| ^2, \end{aligned}$$
(91)

and

$$\begin{aligned} r_{i1} = \left| \alpha _{1,i} \times sin(\frac{f_1(p_{1,i})}{2}) + \alpha _{2,i} \times cos(\frac{f_1(p_{1,i})}{2}) \right| ^2. \end{aligned}$$
(92)

Besides, if the initial state of each qubit is the state \(| 0 \rangle\) (i.e., \(\alpha _{1,i}=1\) and \(\alpha _{2,i}=0\)):

$$\begin{aligned} r_{i0} = \left| cos(\frac{f_1(p_{1,i})}{2}) \right| ^2, \end{aligned}$$
(93)

and

$$\begin{aligned} r_{i1} = \left| sin(\frac{f_1(p_{1,i})}{2}) \right| ^2. \end{aligned}$$
(94)

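Therefore, provided that \(0 \le f_1(p_{1,i}) \le \pi\) so that both the cosine and the sine of the half angle are non-negative,

$$\begin{aligned} cos(\frac{f_1(p_{1,i})}{2}) = \sqrt{r_{i0}}, \end{aligned}$$
(95)

and

$$\begin{aligned} sin(\frac{f_1(p_{1,i})}{2}) = \sqrt{r_{i1}}. \end{aligned}$$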
(96)

Finally, \(\sqrt{r_{i0}}\) for \(1 \le i \le N_{n,1}\), can be deployed to construct the input vector (i.e., \(I_3\) in (51)) of the next layer (i.e., Part 3). More mathematically, (51) can be updated as follows:

$$\begin{aligned} I_3=\begin{bmatrix} cos(\frac{f_1(p_{1,1})}{2})\\ cos(\frac{f_1(p_{1,2})}{2})\\ \vdots \\ cos(\frac{f_1(p_{1,N_{n,1}})}{2}) \end{bmatrix} = \begin{bmatrix} \sqrt{r_{10}}\\ \sqrt{r_{20}}\\ \vdots \\ \sqrt{r_{N_{n,1}0}} \end{bmatrix} \end{aligned}$$
(97)
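The chain from the projectors in (75) and (76) to the closed forms in (91) to (94) can be verified numerically. The following Python sketch is a minimal check under assumptions: the angle `theta` stands in for \(f_1(p_{1,i})\), and the complex initial state is an arbitrary normalized example.

```python
import numpy as np

PRJ0 = np.array([[1, 0], [0, 0]], dtype=complex)  # |0><0|, (75)
PRJ1 = np.array([[0, 0], [0, 1]], dtype=complex)  # |1><1|, (76)

def ry(theta):
    """Single-qubit R_y(theta) matrix in the form written in (62)."""
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([[c, -s], [s, c]])

def z_probabilities(state):
    """Return (r_i0, r_i1) as the expectation values in (77) and (78)."""
    r0 = np.vdot(state, PRJ0 @ state).real  # vdot conjugates its first argument
    r1 = np.vdot(state, PRJ1 @ state).real
    return r0, r1

theta = 0.9                          # stand-in for f_1(p_{1,i})
alpha = np.array([0.6, 0.8j])        # arbitrary normalized initial state
beta = ry(theta) @ alpha             # (62)
r0, r1 = z_probabilities(beta)

# Closed forms (87)-(92), computed directly from alpha and theta.
lam0 = alpha[0] * np.cos(theta / 2.0) - alpha[1] * np.sin(theta / 2.0)
lam1 = alpha[0] * np.sin(theta / 2.0) + alpha[1] * np.cos(theta / 2.0)
print(np.isclose(r0, abs(lam0) ** 2), np.isclose(r1, abs(lam1) ** 2))

# Special case (93): qubit initialized in |0>, giving one entry of I_3 in (97).
r0_zero, r1_zero = z_probabilities(ry(theta) @ np.array([1, 0], dtype=complex))
i3_entry = np.sqrt(r0_zero)
print(np.isclose(i3_entry, np.cos(theta / 2.0)))
```

On hardware, \(r_{i0}\) would be estimated from repeated shots rather than computed exactly; the sketch uses the exact expectation value.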

Briefly, in this part, the measurement approach has been explained, and it has been discussed in detail how the measurement results can be mapped into classical data to be used by the next layer. The next part discusses how classical data are mapped into a quantum state. Besides, other encoding techniques to map classical data into quantum states are explained in detail. In addition, for more clarification, some examples are solved to give a better understanding of encoding classical data into quantum states.

Classical data mapping into quantum states

In this study, an angle encoding technique is deployed to map the classical output data of Part 1 of Q/C-ANN into the quantum states in Part 2 of Q/C-ANN. Generally, there are different encoding approaches that can be implemented to encode classical data into quantum states. As a simple example, basis encoding can be mentioned. In this approach, a binary string is mapped into a quantum state in the orthonormal basis \(|0\rangle\) and \(|1\rangle\) by mapping 0 to \(|0\rangle\) and 1 to \(|1\rangle\)49,50. For instance, 123, 1237, and 111111 in the binary number system are equivalent to \(1111011_2\), \(10011010101_2\), and \(11011001000000111_2\), respectively. Therefore, they can be encoded to the respective quantum states as follows:

$$\begin{aligned} & \overbrace{123}^{\text {Classical Data}} \rightarrow \overbrace{|1111011\rangle }^{\text {Quantum State}}, \end{aligned}$$
(98)
$$\begin{aligned} & \overbrace{1237}^{\text {Classical Data}} \rightarrow \overbrace{|10011010101\rangle }^{\text {Quantum State}}, \end{aligned}$$
(99)
$$\begin{aligned} & \overbrace{111111}^{\text {Classical Data}} \rightarrow \overbrace{|11011001000000111\rangle }^{\text {Quantum State}}. \end{aligned}$$
(100)
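The three basis-encoding examples can be reproduced with Python's built-in binary formatting. The helper name `basis_encode` is hypothetical and only returns the ket label; it does not build a state vector.

```python
def basis_encode(value):
    """Return the ket label of a non-negative integer under basis encoding."""
    if value < 0:
        raise ValueError("basis encoding here assumes a non-negative integer")
    return "|" + format(value, "b") + ">"

for v in (123, 1237, 111111):
    print(v, "->", basis_encode(v))
# 123 -> |1111011>
# 1237 -> |10011010101>
# 111111 -> |11011001000000111>
```

Each bit needs one qubit, so the three values use 7, 11, and 17 qubits, respectively.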

As another encoding technique, the quantum associative memory encoding method can be mentioned. In this approach, a vector of classical data is mapped into an equal superposition of the states corresponding to the individual data values51,52. In other words, each data value is converted to its binary representation. Then, an equal superposition of the states related to each binary string represents the encoded quantum state. As an example, each element of a vector of classical data \(V_{data}=[4004 \quad 4080 \quad 4081 \quad 4090]\) can be written in the binary system, i.e., \(4004=111110100100_2\), \(4080=111111110000_2\), \(4081=111111110001_2\), and \(4090=111111111010_2\). Then, the quantum encoded version of the classical dataset \(V_{data}\) is as follows:

$$\begin{aligned} \begin{aligned} |QV_{data}\rangle = \frac{|111110100100\rangle }{2} + \frac{|111111110000\rangle }{2} \\ ... + \frac{|111111110001\rangle }{2} + \frac{|111111111010\rangle }{2}. \end{aligned} \end{aligned}$$
(101)

Therefore, the vector of the classical data \(V_{data}\) is mapped into a single quantum state that is an equal superposition of the four basis states, as follows:

$$\begin{aligned} \overbrace{V_{data}}^{\text {Classical Dataset}} \rightarrow \overbrace{|QV_{data}\rangle }^{\text {Quantum State}}. \end{aligned}$$
(102)
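The amplitudes in (101) follow from requiring equal weights and unit norm: for four distinct values, each amplitude is \(1/\sqrt{4}=1/2\). A minimal sketch of this bookkeeping is given below; the function name `equal_superposition` and the dictionary representation are assumptions made for illustration.

```python
import math

def equal_superposition(values):
    """Map distinct non-negative integers to {bitstring: amplitude} with equal amplitudes."""
    unique = sorted(set(values))
    width = max(1, max(v.bit_length() for v in unique))  # common register width
    amp = 1.0 / math.sqrt(len(unique))                   # equal amplitude, unit norm
    return {format(v, "0{}b".format(width)): amp for v in unique}

state = equal_superposition([4004, 4080, 4081, 4090])
for bits, amp in state.items():
    print(bits, amp)
```

Only the non-zero amplitudes are stored, which is convenient here because the full 12-qubit state vector would have 4096 entries.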

In this study, the quantum circuit is built from rotation gates and, as a result, an angle encoding approach is automatically implemented: the state of each initialized qubit is updated by a rotation operator whose angle is the classical output data from Part 1. In this method, a classical value such as \(\kappa\) provides the angle of a rotation gate53, and the rotation gate then maps the classical value into a quantum state. In other words, if the initial state of a qubit is a known state \(| Q_{In} \rangle\), then by applying a rotation gate about the \({\hat{y}}\) axis with angle \(\kappa\), the new state is as follows:

$$\begin{aligned} | Q_{In} \rangle \xrightarrow {R_y(\kappa )} | Q_S \rangle . \end{aligned}$$
(103)

In other words:

$$\begin{aligned} \overbrace{\begin{bmatrix} q_{S0} \\ q_{S1} \end{bmatrix}}^{| Q_S \rangle } = \overbrace{\begin{bmatrix} cos(\frac{\kappa }{2}) & -sin(\frac{\kappa }{2})\\ sin(\frac{\kappa }{2}) & cos(\frac{\kappa }{2}) \end{bmatrix}}^{R_y(\kappa )} \times \overbrace{\begin{bmatrix} q_{In0} \\ q_{In1} \end{bmatrix}}^{| Q_{In} \rangle }, \end{aligned}$$
(104)

So, the classical data \(\kappa\) can be encoded into a quantum state as follows:

$$\begin{aligned} \overbrace{\kappa }^{\text {Classical Data}} \xrightarrow {\overbrace{R_y(\kappa ), | Q_{In} \rangle }^{\text {Rotation Gate, and Initialized Qubit}}} \overbrace{| Q_{S} \rangle }^{\text {Quantum State}}. \end{aligned}$$
(105)

For more clarification regarding the angle encoding, two examples with different initial states are provided as follows:

Example 1: In this example, the initial state of the qubit is \(| 0 \rangle\). Therefore, \(q_{In0} = 1\) and \(q_{In1} = 0\). So, to map \(\kappa _1=\frac{8\pi }{17}\), \(\kappa _2=\frac{3\pi }{7}\), \(\kappa _3=\frac{\pi }{12}\), and \(\kappa _4=\frac{3\pi }{10}\) to the corresponding quantum states, \(R_y(\frac{8\pi }{17})\), \(R_y(\frac{3\pi }{7})\), \(R_y(\frac{\pi }{12})\), and \(R_y(\frac{3\pi }{10})\) can be applied as follows:

$$\begin{aligned} & | Q_{S1} \rangle = \begin{bmatrix} q_{S01} \\ q_{S11} \end{bmatrix} = \begin{bmatrix} cos(\frac{8\pi }{34}) & -sin(\frac{8\pi }{34})\\ sin(\frac{8\pi }{34}) & cos(\frac{8\pi }{34}) \end{bmatrix} \times \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \end{aligned}$$
(106)
$$\begin{aligned} & | Q_{S2} \rangle = \begin{bmatrix} q_{S02} \\ q_{S12} \end{bmatrix} = \begin{bmatrix} cos(\frac{3\pi }{14}) & -sin(\frac{3\pi }{14})\\ sin(\frac{3\pi }{14}) & cos(\frac{3\pi }{14}) \end{bmatrix} \times \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \end{aligned}$$
(107)
$$\begin{aligned} & | Q_{S3} \rangle = \begin{bmatrix} q_{S03} \\ q_{S13} \end{bmatrix} = \begin{bmatrix} cos(\frac{\pi }{24}) & -sin(\frac{\pi }{24})\\ sin(\frac{\pi }{24}) & cos(\frac{\pi }{24}) \end{bmatrix} \times \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \end{aligned}$$
(108)

and

$$\begin{aligned} | Q_{S4} \rangle = \begin{bmatrix} q_{S04} \\ q_{S14} \end{bmatrix} = \begin{bmatrix} cos(\frac{3\pi }{20}) & -sin(\frac{3\pi }{20})\\ sin(\frac{3\pi }{20}) & cos(\frac{3\pi }{20}) \end{bmatrix} \times \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \end{aligned}$$
(109)

So,

$$\begin{aligned} & | Q_{S1} \rangle = cos(\frac{8\pi }{34}) | 0 \rangle + sin(\frac{8\pi }{34}) | 1 \rangle \approx \begin{bmatrix} 0.7390 \\ 0.6737 \end{bmatrix} \end{aligned}$$
(110)
$$\begin{aligned} & | Q_{S2} \rangle = cos(\frac{3\pi }{14}) | 0 \rangle + sin(\frac{3\pi }{14}) | 1 \rangle \approx \begin{bmatrix} 0.7818 \\ 0.6235 \end{bmatrix} \end{aligned}$$
(111)
$$\begin{aligned} & | Q_{S3} \rangle = cos(\frac{\pi }{24}) | 0 \rangle + sin(\frac{\pi }{24}) | 1 \rangle \approx \begin{bmatrix} 0.9914 \\ 0.1305 \end{bmatrix} \end{aligned}$$
(112)

and

$$\begin{aligned} | Q_{S4} \rangle = cos(\frac{3\pi }{20}) | 0 \rangle + sin(\frac{3\pi }{20}) | 1 \rangle \approx \begin{bmatrix} 0.8910 \\ 0.4540 \end{bmatrix} \end{aligned}$$
(113)

Therefore, \(K = \begin{bmatrix} \kappa _1 \\ \kappa _2 \\ \kappa _3 \\ \kappa _4\ \end{bmatrix}\) is encoded to a quantum state as follows:

$$\begin{aligned} K = \overbrace{\begin{bmatrix} \kappa _1 \\ \kappa _2 \\ \kappa _3 \\ \kappa _4\ \end{bmatrix}}^{\text {Classical Dataset}} \rightarrow \overbrace{| Q_{S1} \rangle \otimes | Q_{S2} \rangle \otimes | Q_{S3} \rangle \otimes | Q_{S4} \rangle }^{\text {Quantum State}}. \end{aligned}$$
(114)

In other words:

$$\begin{aligned} \overbrace{\begin{bmatrix} \kappa _1 = \frac{8\pi }{17} \\ \kappa _2 = \frac{3\pi }{7} \\ \kappa _3 = \frac{\pi }{12} \\ \kappa _4 = \frac{3\pi }{10} \end{bmatrix}}^{\text {Classical Dataset}} \rightarrow \overbrace{\overbrace{\begin{bmatrix} 0.7390 \\ 0.6737 \end{bmatrix}}^{| Q_{S1} \rangle } \otimes \overbrace{\begin{bmatrix} 0.7818 \\ 0.6235 \end{bmatrix}}^{| Q_{S2} \rangle } \otimes \overbrace{\begin{bmatrix} 0.9914 \\ 0.1305 \end{bmatrix}}^{| Q_{S3} \rangle } \otimes \overbrace{\begin{bmatrix} 0.8910 \\ 0.4540 \end{bmatrix}}^{| Q_{S4} \rangle }}^{\text {Quantum State}} \end{aligned}$$
(115)
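The numbers in Example 1 can be reproduced with a short NumPy sketch of (104). The function name `angle_encode` is hypothetical; the angles and the initial state \(|0\rangle\) are those of the example.

```python
import numpy as np

def angle_encode(kappa, q_in=(1.0, 0.0)):
    """Apply R_y(kappa) from (104) to the initial amplitudes q_in."""
    c, s = np.cos(kappa / 2.0), np.sin(kappa / 2.0)
    ry = np.array([[c, -s], [s, c]])
    return ry @ np.asarray(q_in, dtype=float)

kappas = [8 * np.pi / 17, 3 * np.pi / 7, np.pi / 12, 3 * np.pi / 10]
states = [angle_encode(k) for k in kappas]
for k, s in zip(kappas, states):
    print(np.round(s, 4))
# [0.739  0.6737]
# [0.7818 0.6235]
# [0.9914 0.1305]
# [0.891 0.454]
```

NumPy drops trailing zeros when printing, so 0.739, 0.891, and 0.454 correspond to 0.7390, 0.8910, and 0.4540 in (110) and (113).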

Example 2: In this example, the initial state of the qubit is \(\frac{15}{113}| 0 \rangle + \frac{112}{113}| 1 \rangle\). Therefore, \(q_{In0} = \frac{15}{113}\) and \(q_{In1} = \frac{112}{113}\). Additionally, the state of the initialized qubit satisfies \(q_{In0}^2 + q_{In1}^2 = 1\). As in the previous example, \(\kappa _1=\frac{8\pi }{17}\), \(\kappa _2=\frac{3\pi }{7}\), \(\kappa _3=\frac{\pi }{12}\), and \(\kappa _4=\frac{3\pi }{10}\). Then, by applying the corresponding rotation gate to each initialized qubit, the new state of that qubit containing the encoded classical data can be obtained as follows:

$$\begin{aligned} & | Q_{S1} \rangle = \begin{bmatrix} q_{S01} \\ q_{S11} \end{bmatrix} = \begin{bmatrix} cos(\frac{8\pi }{34}) & -sin(\frac{8\pi }{34})\\ sin(\frac{8\pi }{34}) & cos(\frac{8\pi }{34}) \end{bmatrix} \times \begin{bmatrix} \frac{15}{113} \\ \frac{112}{113} \end{bmatrix}, \end{aligned}$$
(116)
$$\begin{aligned} & | Q_{S2} \rangle = \begin{bmatrix} q_{S02} \\ q_{S12} \end{bmatrix} = \begin{bmatrix} cos(\frac{3\pi }{14}) & -sin(\frac{3\pi }{14})\\ sin(\frac{3\pi }{14}) & cos(\frac{3\pi }{14}) \end{bmatrix} \times \begin{bmatrix} \frac{15}{113} \\ \frac{112}{113} \end{bmatrix}, \end{aligned}$$
(117)
$$\begin{aligned} & | Q_{S3} \rangle = \begin{bmatrix} q_{S03} \\ q_{S13} \end{bmatrix} = \begin{bmatrix} cos(\frac{\pi }{24}) & -sin(\frac{\pi }{24})\\ sin(\frac{\pi }{24}) & cos(\frac{\pi }{24}) \end{bmatrix} \times \begin{bmatrix} \frac{15}{113} \\ \frac{112}{113} \end{bmatrix}, \end{aligned}$$
(118)

and

$$\begin{aligned} | Q_{S4} \rangle = \begin{bmatrix} q_{S04} \\ q_{S14} \end{bmatrix} = \begin{bmatrix} cos(\frac{3\pi }{20}) & -sin(\frac{3\pi }{20})\\ sin(\frac{3\pi }{20}) & cos(\frac{3\pi }{20}) \end{bmatrix} \times \begin{bmatrix} \frac{15}{113} \\ \frac{112}{113} \end{bmatrix}, \end{aligned}$$
(119)

Therefore,

$$\begin{aligned} & \begin{aligned} | Q_{S1} \rangle = (\frac{15}{113} \times cos(\frac{8\pi }{34}) - \frac{112}{113} \times sin(\frac{8\pi }{34})) | 0 \rangle \\ ... + (\frac{15}{113} \times sin(\frac{8\pi }{34}) + \frac{112}{113} \times cos(\frac{8\pi }{34})) | 1 \rangle \\ \approx \begin{bmatrix} -0.5696 \\ 0.8219 \end{bmatrix}, \end{aligned} \end{aligned}$$
(120)
$$\begin{aligned} & \begin{aligned} | Q_{S2} \rangle = (\frac{15}{113} \times cos(\frac{3\pi }{14}) - \frac{112}{113} \times sin(\frac{3\pi }{14})) | 0 \rangle \\ ... + (\frac{15}{113} \times sin(\frac{3\pi }{14}) + \frac{112}{113} \times cos(\frac{3\pi }{14})) | 1 \rangle \\ \approx \begin{bmatrix} -0.5142 \\ 0.8577 \end{bmatrix}, \end{aligned} \end{aligned}$$
(121)
$$\begin{aligned} & \begin{aligned} | Q_{S3} \rangle = (\frac{15}{113} \times cos(\frac{\pi }{24}) - \frac{112}{113} \times sin(\frac{\pi }{24})) | 0 \rangle \\ ... + (\frac{15}{113} \times sin(\frac{\pi }{24}) + \frac{112}{113} \times cos(\frac{\pi }{24})) | 1 \rangle \\ \approx \begin{bmatrix} 0.0022 \\ 0.9999 \end{bmatrix}, \end{aligned} \end{aligned}$$
(122)

and,

$$\begin{aligned} \begin{aligned} | Q_{S4} \rangle = (\frac{15}{113} \times cos(\frac{3\pi }{20}) - \frac{112}{113} \times sin(\frac{3\pi }{20})) | 0 \rangle \\ ... + (\frac{15}{113} \times sin(\frac{3\pi }{20}) + \frac{112}{113} \times cos(\frac{3\pi }{20})) | 1 \rangle \\ \approx \begin{bmatrix} -0.3317 \\ 0.9434 \end{bmatrix}, \end{aligned} \end{aligned}$$
(123)

So, in this example, \(K = \begin{bmatrix} \kappa _1 \\ \kappa _2 \\ \kappa _3 \\ \kappa _4\ \end{bmatrix}\) is mapped into a quantum state as follows:

$$\begin{aligned} \begin{aligned} \overbrace{\begin{bmatrix} \kappa _1 = \frac{8\pi }{17} \\ \kappa _2 = \frac{3\pi }{7} \\ \kappa _3 = \frac{\pi }{12} \\ \kappa _4 = \frac{3\pi }{10} \end{bmatrix}}^{\text {Classical Dataset}} \rightarrow \overbrace{| Q_{S1} \rangle \otimes | Q_{S2} \rangle \otimes | Q_{S3} \rangle \otimes | Q_{S4} \rangle } ^{\text {Quantum State}} = ... \\ = \overbrace{\overbrace{\begin{bmatrix} -0.5696 \\ 0.8219 \end{bmatrix}}^{| Q_{S1} \rangle } \otimes \overbrace{\begin{bmatrix} -0.5142 \\ 0.8577 \end{bmatrix}}^{| Q_{S2} \rangle } \otimes \overbrace{\begin{bmatrix} 0.0022 \\ 0.9999 \end{bmatrix}}^{| Q_{S3} \rangle } \otimes \overbrace{\begin{bmatrix} -0.3317 \\ 0.9434 \end{bmatrix}}^{| Q_{S4} \rangle }}^{\text {Quantum State}} \end{aligned} \end{aligned}$$
(124)
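For Example 2, the four single-qubit states and their tensor product in (124) can be formed explicitly. The sketch below is illustrative: it builds the 16-amplitude joint state with `numpy.kron` and also reports the z-basis probability of \(|0\rangle\) for each qubit, which is not part of the example itself.

```python
import numpy as np
from functools import reduce

def ry(theta):
    """Single-qubit R_y(theta) matrix as in (104)."""
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([[c, -s], [s, c]])

q_in = np.array([15.0 / 113.0, 112.0 / 113.0])  # initial state of Example 2
kappas = [8 * np.pi / 17, 3 * np.pi / 7, np.pi / 12, 3 * np.pi / 10]

states = [ry(k) @ q_in for k in kappas]         # (116)-(123)
joint = reduce(np.kron, states)                 # 4-qubit product state in (124)
probs0 = [float(s[0] ** 2) for s in states]     # probability of measuring |0> per qubit

print(np.round(states[0], 4), np.round(states[3], 4))
print(joint.shape, np.linalg.norm(joint))
print(np.round(probs0, 4))
```

The negative first amplitudes in this example show that, for a general initial state, the amplitude of \(|0\rangle\) is not simply \(cos(\frac{\kappa }{2})\); only its squared magnitude is observable in a z-basis measurement.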

Data pre-processing

In this part, two data pre-processing techniques are discussed mathematically: normalization, and filling missing data and outliers.

To map data into the range \(\begin{bmatrix} \varrho _{min}, \varrho _{max} \end{bmatrix}\), the following min-max normalization can be used54:

$$\begin{aligned} \lambda _{Nr}=\frac{\left( \lambda -MIN_{\lambda } \right) \times \left( \varrho _{max}-\varrho _{min} \right) }{MAX_{\lambda }-MIN_{\lambda }}+\varrho _{min}. \end{aligned}$$
(125)

Where, \(\lambda\) and \(\lambda _{Nr}\) are the desired data and its normalized value, respectively. In addition, \(MAX_{\lambda }\) and \(MIN_{\lambda }\) are the maximum and the minimum values of \(\lambda\), respectively. So, based on (125), for normalizing \(P_L\) into \(\begin{bmatrix} 0, 1 \end{bmatrix}\) and \(\begin{bmatrix} -1, 1 \end{bmatrix}\), (126) and (127) can be implemented, respectively:

$$\begin{aligned} P_{Nr,(0,1)}(j)=\frac{ P_L(j)-MIN_{P} }{MAX_{P}-MIN_{P}}, \end{aligned}$$
(126)

and,

$$\begin{aligned} P_{Nr,(-1,1)}(j)=\frac{ 2P_L(j)-2MIN_{P} }{MAX_{P}-MIN_{P}}-1. \end{aligned}$$
(127)

Where, \(P_{Nr,(0,1)}(j)\) and \(P_{Nr,(-1,1)}(j)\) are the normalized value of \(P_L\) at \(t=j\) into range of \(\begin{bmatrix} 0, 1 \end{bmatrix}\) and \(\begin{bmatrix} -1, 1 \end{bmatrix}\), respectively.
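A minimal sketch of (125), with (126) and (127) as special cases, is given below. The function name `min_max_normalize` and the power samples are made up for illustration; the guard against constant data is an addition, since (125) is undefined when \(MAX_{\lambda }=MIN_{\lambda }\).

```python
import numpy as np

def min_max_normalize(values, rho_min=0.0, rho_max=1.0):
    """Map values linearly into [rho_min, rho_max] following (125)."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        raise ValueError("min-max normalization is undefined for constant data")
    return (values - lo) * (rho_max - rho_min) / (hi - lo) + rho_min

p_l = np.array([120.0, 80.0, 95.0, 140.0])   # made-up power samples
p_01 = min_max_normalize(p_l)                # (126): range [0, 1]
p_pm1 = min_max_normalize(p_l, -1.0, 1.0)    # (127): range [-1, 1]
print(p_01)
print(p_pm1)
```

In a forecasting setting, the minimum and maximum would normally be taken from the training portion only and reused for the test portion; the source does not state which convention was followed.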

Besides, to fill missing data, different strategies can be implemented. One of the common methods to fill missing data is based on linear interpolation as follows55:

$$\begin{aligned} \frac{\Xi (t)-\Xi (t_1)}{\Xi (t_2)-\Xi (t_1)}=\frac{t-t_1}{t_2-t_1}. \end{aligned}$$
(128)

Where, by substituting \(\Xi\), t, \(t_1\), and \(t_2\) with \(P_L\), m, j, and \(j+k\), (128) can be updated as follows:

$$\begin{aligned} \frac{P_L (m)-P_L (j)}{P_L (j+k)-P_L (j)}=\frac{m-j}{(j+k)-j}, \end{aligned}$$
(129)

In other words:

$$\begin{aligned} P_L (m)= \frac{(P_L (j+k)-P_L (j)) \times (m-j)}{k} + P_L (j). \end{aligned}$$
(130)
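The interpolation formula in (130) can be applied directly to a gap between two known samples. In the sketch below, the function name `fill_gap` and the short series are hypothetical; the missing samples are marked with NaN, and the known samples are at indices \(j\) and \(j+k\).

```python
import numpy as np

def fill_gap(p_l, j, k):
    """Fill samples j+1 .. j+k-1 by (130), using the known samples at j and j+k."""
    filled = np.array(p_l, dtype=float)
    for m in range(j + 1, j + k):
        filled[m] = (filled[j + k] - filled[j]) * (m - j) / k + filled[j]
    return filled

series = [10.0, np.nan, np.nan, 16.0, 18.0]
filled = fill_gap(series, j=0, k=3)
print(filled)  # [10. 12. 14. 16. 18.]
```

The same result can be obtained with `numpy.interp` by passing the indices of the known samples as the interpolation nodes.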

Results

In this part, a dataset obtained from a real experimental setup is used to test the effectiveness of the proposed strategy. The dataset includes data of a refrigerator and a work station. Therefore, two scenarios are shown, one for the refrigerator and one for the work station. In addition, before the dataset is used, a data pre-processing strategy is applied, including data cleaning, filling missing data, removing and replacing outliers, and normalization. For normalization, the min-max normalization technique is implemented, and for filling missing data and replacing outliers, a linear interpolation approach is used. It is important to note that, in this study, Matlab is used to simulate the quantum-based strategy.

For this part, \(N_{in}\) is 5, so Q/C-ANN has 5 inputs. Also, the sampling time \(\Delta t\) is 2 s. In addition, Part 1 of the hybrid network has a classical layer with 10 neurons, and Part 2 is made using 10 qubits and 10 rotation gates. Besides, Part 3 includes two classical layers. The first classical layer of Part 3 contains 3 neurons and the last classical layer of Part 3 includes one neuron, which produces the output of the hybrid network. Also, for both case studies, the size of the dataset is about 86400 samples, and \(70 \%\) of the samples are used to train the hybrid network.
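To make the data layout concrete, the sketch below builds input/target pairs with \(N_{in}=5\) and splits them 70/30. Two points are assumptions rather than statements from the source: the five inputs are taken to be the five most recent load samples, and the split is taken to be chronological (first 70% for training). The function names are hypothetical and the series is a toy ramp.

```python
import numpy as np

def make_windows(series, n_in=5):
    """Pair each run of n_in consecutive samples with the sample that follows it."""
    series = np.asarray(series, dtype=float)
    n = len(series) - n_in
    X = np.stack([series[i:i + n_in] for i in range(n)])  # shape (n, n_in)
    y = series[n_in:]                                     # target one step ahead
    return X, y

def chronological_split(X, y, train_fraction=0.7):
    """Use the first part of the record for training and the rest for testing."""
    cut = int(train_fraction * len(X))
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])

series = np.arange(20.0)                                  # toy ramp, not real data
X, y = make_windows(series)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
print(X.shape, y.shape, len(train_X), len(test_X))
```

With a sampling time of 2 s, each window under this assumption covers 10 s of history and the target lies 2 s ahead.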

Case study 1: a refrigerator

In this scenario, a refrigerator is considered as the tested load. The goal of this scenario is to test the proposed strategy for predicting the power consumption of a refrigerator. In Fig. 6, the power consumption of the refrigerator (\(P_L\)) and its predicted value (\({\bar{P}}_L\)) for the training phase are depicted. Based on Fig. 6, the hybrid network is trained properly. Further, the regression plot related to the training is shown in Fig. 7. In addition, Fig. 8 shows the test of the trained hybrid network and Fig. 9 shows the corresponding regression plot. Based on Fig. 8 and Fig. 9, the trained hybrid network works effectively.

Fig. 6
figure 6

Case Study 1 (Training Phase): The value of the power consumption of the refrigerator (\(P_L\)) and the predicted value of that (\({\bar{P}}_L\)) for the training phase.

Fig. 7
figure 7

Case Study 1 (Training Phase): The regression plot for the power consumption of the refrigerator \((P_{L})\) and the predicted value of that (\({\bar{P}}_{L}\)) related to the training phase.

Fig. 8
figure 8

Case Study 1 (Testing Phase): The value of the power consumption of the refrigerator (\(P_L\)) and the predicted value of that (\({\bar{P}}_L\)) for the testing phase.

Fig. 9
figure 9

Case Study 1 (Testing Phase): The regression plot for the power consumption of the refrigerator (\(P_{L}\)) and the predicted value of that (\({\bar{P}}_{L}\)) related to the testing phase.

Case study 2: a work station

Here, the goal is to predict the power consumption of a work station. The power consumption and its predicted value for the training phase are depicted in Fig. 10. Based on Fig. 10, the hybrid network is trained successfully. Besides, Fig. 11 depicts the regression plot of the training phase. Furthermore, Fig. 12 shows the power consumption and its predicted value for the test dataset, and Fig. 13 shows the corresponding regression plot. Based on Fig. 12 and Fig. 13, the trained hybrid network can properly predict the power consumption of the work station.

Fig. 10
figure 10

Case Study 2 (Training Phase): The value of the power consumption of the work station (\(P_L\)) and the predicted value of that (\({\bar{P}}_L\)) for the training phase.

Fig. 11
figure 11

Case Study 2 (Training Phase): The regression plot for the power consumption of the work station (\(P_{L}\)) and the predicted value of that (\({\bar{P}}_{L}\)) related to the training phase.

Fig. 12
figure 12

Case Study 2 (Testing Phase): The value of the power consumption of the work station (\(P_L\)) and the predicted value of that (\({\bar{P}}_L\)) for the testing phase.

Fig. 13

Case Study 2 (Testing Phase): The regression plot for the power consumption of the work station (\(P_{L}\)) and its predicted value (\({\bar{P}}_{L}\)) for the testing phase.

Fig. 14

The results of linear regression-based load forecasting.

Fig. 15

The results of support vector regression-based load forecasting.

Fig. 16

The results of decision tree regression-based load forecasting.

Other AI-based approaches: applications of machine learning

In this part, to give a more in-depth view of the ability of AI to forecast the desired quantities, three classical machine learning models for load prediction are discussed: linear regression, support vector machine (SVM) regression, and decision tree regression. These are three well-known models that have been widely used to predict loads. For all the case studies of this part, the same dataset as in Section VI is used for the refrigerator and the work station. In addition, the inputs and the output of the machine learning-based models are the same as those discussed in Section IV.

Linear regression-based strategies have been used before for load forecasting, e.g., in refs. 56 and 57. Figure 14 shows the results of linear regression-based load forecasting. Fig. 14(a), 14(b), 14(c), and 14(d) depict the training fit plot, training regression plot, testing fit plot, and testing regression plot of case study 1, respectively, and Fig. 14(e), 14(f), 14(g), and 14(h) show the same four plots for case study 2. Support vector machine regression-based approaches have also been used to predict loads, e.g., in refs. 58,59,60,61,62,63. The results of the support vector regression-based method are depicted in Fig. 15: Fig. 15(a), 15(b), 15(c), and 15(d) illustrate the fit and regression plots of case study 1, and Fig. 15(e), 15(f), 15(g), and 15(h) relate to case study 2. In addition, different decision tree-based strategies have been deployed to forecast loads, e.g., in refs. 64 and 65. Fig. 16 shows the results of the decision tree regression for case study 1 and case study 2.
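To make the classical baselines more concrete, the following is a minimal sketch of the simplest of the three, linear regression, applied to a load series in time-series form. It assumes that the inputs are a fixed number of past load values and the output is the next value. The window length, the helper names (`make_windows`, `solve`, `fit_linear`, `predict`), and the load samples are hypothetical, and the sketch is not the configuration used to produce Fig. 14.

```python
def make_windows(series, n_lags):
    """Split a load series into (past n_lags values, next value) pairs."""
    n = len(series) - n_lags
    X = [series[i:i + n_lags] for i in range(n)]
    y = [series[i + n_lags] for i in range(n)]
    return X, y

def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * p for x, p in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_linear(X, y):
    """Ordinary least squares with a bias term, via the normal equations."""
    Xb = [list(row) + [1.0] for row in X]
    k = len(Xb[0])
    A = [[sum(r[i] * r[j] for r in Xb) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * t for r, t in zip(Xb, y)) for i in range(k)]
    return solve(A, b)  # weights for each lag, followed by the bias

def predict(w, x):
    """Predict the next load value from one window of past values."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]

# Hypothetical on/off load pattern in watts (not data from the paper).
load = [120.0, 125.0, 118.0, 5.0, 4.0, 6.0, 122.0, 127.0, 119.0,
        5.0, 6.0, 4.0, 121.0, 124.0, 120.0, 6.0, 5.0, 5.0]
X, y = make_windows(load, n_lags=3)
w = fit_linear(X[:12], y[:12])                  # training windows
test_pred = [predict(w, x) for x in X[12:]]     # held-out windows
mae = sum(abs(p - t) for p, t in zip(test_pred, y[12:])) / len(test_pred)
print(f"test mean absolute error: {mae:.2f} W")
```

The same windowed inputs could be passed to a support vector regressor or a decision tree regressor from a machine learning library; only the fitting step would change.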

Discussion

Briefly, building a mathematical model of a system that captures all of its details can be a challenging task. A system can involve hidden or undiscovered details, so it is neither easy nor practical to extract all information about the system from an existing mathematical model. An alternative way to obtain as much information about a system as possible, without deploying a mathematical model of it, is to use measured data where measurements are available. A Q/C-ANN works with data alone and does not require a mathematical model of the system. Therefore, a Q/C-ANN can be considered a suitable data-driven computational tool for extracting information about the system (e.g., the value of the electrical loads).

In addition, a Q/C-ANN includes quantum gates, and hence a quantum circuit built from those gates, which provides the opportunity to exploit the advantages of quantum computing, e.g., entangling the deployed qubits to obtain a secure data transmission strategy for transferring system information between different units in a power system. Furthermore, performing a quantum algorithm typically requires three main parts: Part 1 encodes classical data to create the state of the input qubits, Part 2 applies a quantum circuit made of quantum gates to the encoded qubits, and Part 3 decodes the measurements of the states of the output qubits into classical data. A Q/C-ANN naturally has these three parts. Part 1 and Part 3 of a Q/C-ANN are made of classical hidden layers: the former encodes classical data to modify the state of the initialized qubits, and the latter decodes the measured state of the output qubits into classical data. Part 2 of a Q/C-ANN is a quantum circuit built from quantum gates.
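The three-part structure can be illustrated with a deliberately small sketch: a single simulated qubit with no entanglement, placed between a classical encoding layer and a classical decoding layer. This is an assumption made for illustration and not the circuit or the layer sizes used in this study. All names and parameter values below are hypothetical.

```python
import math

def classical_encoder(x, w_in, b_in):
    """Part 1: a classical layer maps the input features to a rotation angle."""
    pre_activation = sum(w * xi for w, xi in zip(w_in, x)) + b_in
    return math.pi * math.tanh(pre_activation)   # angle in (-pi, pi)

def quantum_circuit(angle, theta):
    """Part 2: simulate RY(angle) followed by a trainable RY(theta) on |0>.

    Two successive RY rotations add up, so the final state is
    cos(h)|0> + sin(h)|1> with h = (angle + theta) / 2.
    Returns the expectation value <Z> = P(0) - P(1) = cos(angle + theta).
    """
    half = (angle + theta) / 2.0
    amp0, amp1 = math.cos(half), math.sin(half)
    return amp0 ** 2 - amp1 ** 2

def classical_decoder(z, w_out, b_out):
    """Part 3: a classical layer maps the measurement result to a load value."""
    return w_out * z + b_out

def hybrid_forward(x, params):
    """Forward pass: classical encoding, quantum circuit, classical decoding."""
    angle = classical_encoder(x, params["w_in"], params["b_in"])
    z = quantum_circuit(angle, params["theta"])
    return classical_decoder(z, params["w_out"], params["b_out"])

# Hypothetical, untrained parameters and normalized past load values.
params = {"w_in": [0.4, -0.2, 0.1], "b_in": 0.05,
          "theta": 0.3, "w_out": 60.0, "b_out": 65.0}
past_load = [0.9, 0.1, 0.8]
print(hybrid_forward(past_load, params))
```

In a trained network, the classical weights and the gate angle `theta` would be adjusted together to minimize the prediction error; a multi-qubit circuit with entangling gates would replace `quantum_circuit` while the other two parts keep the same role.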

Besides, the challenges of deploying a fully quantum-based strategy are limited access to a large number of qubits and the sensitivity of a fully quantum-based methodology to noise. Because of these challenges, implementing quantum-based approaches for a large-scale system (e.g., a power system) can be a difficult task. One way to overcome these difficulties is to implement Q/C-ANNs, which can require fewer quantum gates because part of the computation is carried out by the classical parts.

Therefore, based on the above-mentioned advantages, a Q/C-ANN can offer an effective solution for forecasting-based applications in power systems, such as load prediction.

Conclusion

Q/C-ANNs have quantum and classical parts. The quantum part is a quantum circuit containing quantum operators, and the classical parts are classical ANNs built from classical neurons. A Q/C-ANN can be implemented to solve different challenges. In this work, Q/C-ANNs have been applied to load forecasting in smart grids. This study implemented the hybrid network in a time-series form, which requires only the historical and the current values of the load to predict its future value. In addition, two different types of loads (i.e., a refrigerator and a work station) have been predicted in a short-term mode. The implemented strategy was able to predict the loads effectively.

Future works

As a future direction, the hybrid network can be used to solve other problems, e.g., load classification and fault detection. In addition, quantum annealing has been studied before, e.g., in refs. 66,67,68, and its implementation for load prediction, as well as its comparison with various other methods, can be considered for further investigation.