Introduction

High-speed trains are an important indicator of the world’s technological advancement, and the arc generated during train operation has long been a research hotspot. The elevated temperature generated by arcing between the pantograph and catenary accelerates the degradation and erosion of the pantograph sliding plate and contact wire. Additionally, the high-frequency harmonics severely interfere with surrounding communication and considerably reduce current quality. The accurate and rapid detection of arc faults therefore directly determines the safe, stable, and reliable operation of the entire railway system.

Mathematical models of the arc between the pantograph and catenary are fundamental to investigating arc characteristics, and they support the development of effective arc detection techniques. Because the strong arc-blowing airflow present during high-speed train operation cannot be ignored when modeling and identifying pantograph–catenary arcs, studying the impact of high-speed airflow on arcs has important practical significance.

In 1939 and 1943, Cassie1 and Mayr2 proposed the Cassie and Mayr arc models, respectively. These models are derived under the assumptions of a constant time constant, constant dissipated power, and constant voltage gradient. Chen, Zhou and Qiao et al.3,4,5 improved the dissipated power term in the Mayr model by introducing airflow velocity. Li et al.6 investigated voltage gradient at different fluorocarbon ratios and derived an expression for voltage gradient under these conditions. Additionally, Midya and Terzija et al.7,8 explored the effect of each electrical parameter on the pantograph–catenary arc mathematical model. While they provided a general expression for the relationship between voltage gradient and current, they did not consider the effect of airflow velocity on the pantograph–catenary arc. J. Sousa et al.9 noted that different expressions of voltage gradient were employed in the 100 and 1000 A current regions; however, they did not consider the effect of high-speed airflow on voltage gradients. Wang et al.10,11,12 established the relationship between arc dissipation power, arc voltage gradient, and airflow velocity through indirect data fitting. However, because voltage gradient also depends on the current flowing through the arc, this expression has certain limitations. Zeng et al.13 studied the influence of strong airflow on the motion characteristics of the pantograph–catenary arc at different airflow speeds by establishing a magnetohydrodynamic (MHD) model of arc motion under high-speed airflow during train operation. Their findings provided a foundation for analyzing the arc characteristics in this paper. Zhou et al.14 investigated the relationship between voltage gradient and dissipated power with offline distance based on the Habedank arc equation when the train speed increases. They developed a new arc model to analyze the characteristic waveform of the arc.
Additionally, Liu et al.15 incorporated the dynamic separation process of the pantograph and introduced a dynamic PC detachment trajectory model into the arc model based on the Habedank arc model, thereby establishing a dynamic arc model. However, these models did not consider the influence of high-speed airflow. Yang et al.16 examined the effects of crosswind and input current on arc temperature and voltage and explained the crosswind effect based on the Karman vortex street of fluid theory. Reference17 introduces a model based on physical parameters of the arc, including temperature, enthalpy, pressure, and arc geometry, and investigates the effect of different arc diameters on dissipated power through high-speed imaging. This model makes it easier to describe the motion pattern of the arc.

A suitable and correct arc mathematical model supports various theoretical analyses of arc faults, enables new fault identification methods, and reduces the workload and costs associated with arc fault diagnosis. Currently, most arc recognition methods are based on current or voltage signals, and arc signals are highly concealed within these signals, making them challenging to identify. Consequently, high-dimensional feature extraction or complex algorithms are necessary to complete fault diagnosis. Li et al.18 transformed arc current signals into a two-dimensional image and used convolutional neural networks to recognize the image, which increases the recognition effort. Similarly, Liu et al.19 proposed a semantic segmentation model, which incorporates a novel hybrid multiscale feature fusion model for pantograph–catenary arc detection. This model identifies arcs based on the principle that the arc is the brightest area of the image. Wang et al.20 proposed a differential detection method to extract arc voltage and identify arc faults by analyzing arc voltage. However, accessing this signal is not straightforward. Based on the loop current and feature extraction techniques, Wang and Li21,22 identified arc faults within pantograph–catenary systems. Wang et al.23,24,25 identified arc faults by detecting high-frequency components using multi-feature fusion and modal decomposition. All these methods involve feature processing of the arc current, and the recognition process involves complex data processing. Jiang et al.26 proposed the RLC high-frequency arc model, which categorizes the current signal into different types of oscillation signals. They subsequently employed convolutional neural networks to identify faults. Additionally, Zhang et al.27 proposed an adaptive arc fault diagnosis model to better facilitate the extraction of effective recognition features.
Based on the maximum mutual information coefficient and its significance, Zuo et al.28 proposed a feature change mining method to extract highly recognizable features. Considering the concealment of electric arcs, Zhao et al.29 proposed the fusion of current fluctuation and zero-current features. Li et al.30 selected multiple physical values as the characteristic values of pantograph current. Further, they calculated the contribution rate of each characteristic value and used the characteristic values with higher contribution rates as training samples for extracting effective features. Jing et al.31 considered various vibration conditions and loads, decomposed the measured current signal, and selected the optimal intrinsic mode function based on the correlation coefficient and kurtosis index calculated from the decomposed data; furthermore, they reconstructed the decomposed signal to establish a 3D feature quantity for identifying the arc. Jiang et al.32 decomposed the spectra of fault and normal signals, and further decomposed and reconstructed the time series based on spectral characteristics and expected margins. This approach exhibits good generalization performance and recognition accuracy. In summary, arc signals are highly obscured within voltage or current information. Consequently, traditional methods require the extraction of high-dimensional features or the use of optimization algorithms to classify faults accurately, which further increases the difficulty of fault identification.

Addressing the identified shortcomings of the arc model and fault diagnosis methods, this article introduces two influencing factors simultaneously: high-speed airflow and arc current. It derives an arc voltage gradient expression that better reflects the actual situation of the pantograph system, instead of assuming a constant voltage gradient in the mathematical model of the arc. Recognizing the necessity of complex feature extraction and optimization algorithms to attain high recognition accuracy when using voltage or current, we propose the concept of port impedance. Furthermore, we analyze the feasibility of identifying arc faults through port impedance based on the established mathematical model.

Arc modeling

The Cassie and Mayr arc models represent classical mathematical frameworks, each tailored to specific current ranges. The Mayr model effectively characterizes small current regions, whereas the Cassie model is suitable for describing large current regions. Consequently, scholars have proposed utilizing the current amplitude to regulate the degree of model resistance involvement33. Figure 1 shows the circuit structure.

Fig. 1 Circuit structure diagram of Cassie-Mayr arc model.

\({R}_{m}\) is the equivalent impedance of the Mayr model, and \({R}_{c}\) is the equivalent impedance of the Cassie model. \(\varphi (i)\) in the figure is the transition function, expressed as:

$$ \varphi \left( i \right) = \frac{1}{{1 + e^{{ - \alpha \left( {i^{2} - i_{0}^{2} } \right)}} }} $$
(1)

where \(i\) is the arc current (A), the coefficient \(\alpha\) determines the steepness of the curve (1/A\(^{2}\)), and \(i_{0}\) determines the distortion point of the curve (A).

We choose a steepness of \(\alpha = 0.025\) and a distortion point of \(i_{0} = 15\) as the transition function parameter values. The equivalent resistance of the resulting arc structure is:

$$ \frac{1}{g} = \varphi \left( i \right)\frac{1}{{g_{c} }} + \left[ {1 - \varphi \left( i \right)} \right]\frac{1}{{g_{m} }} $$
(2)
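As a minimal sketch of Eqs. (1) and (2), the transition function and the blended resistance could be evaluated as follows. The function names are illustrative, and the conductance values used to exercise them are hypothetical; only \(\alpha = 0.025\) and \(i_{0} = 15\) come from the text.

```python
import math

ALPHA = 0.025  # steepness of the transition (1/A^2), value from the text
I0 = 15.0      # distortion point (A), value from the text


def transition(i, alpha=ALPHA, i0=I0):
    """Transition function phi(i) of Eq. (1)."""
    return 1.0 / (1.0 + math.exp(-alpha * (i * i - i0 * i0)))


def combined_conductance(i, g_c, g_m):
    """Eq. (2): weight the Cassie and Mayr resistances, return conductance g."""
    phi = transition(i)
    resistance = phi / g_c + (1.0 - phi) / g_m
    return 1.0 / resistance


# Hypothetical conductances (S), only to show how the weighting shifts.
low_current_g = combined_conductance(2.0, g_c=5.0, g_m=0.1)     # close to g_m
high_current_g = combined_conductance(150.0, g_c=5.0, g_m=0.1)  # close to g_c
```

At \(|i| = i_{0}\) both models contribute equally; well below it the Mayr branch dominates, and well above it the Cassie branch dominates.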

The expression of the Cassie–Mayr coupling model based on the transition function is derived as:

$$ \left\{ \begin{gathered} \frac{{dg_{c} }}{dt} = \frac{1}{{\theta_{c} }}\left( {\frac{{u^{2} g^{2} }}{{u_{c}^{2} g_{c} }} - g_{c} } \right) \hfill \\ \frac{{dg_{m} }}{dt} = \frac{1}{{\theta_{m} }}\left( {\frac{{u^{2} g^{2} }}{{p_{loss} }} - g_{m} } \right) \hfill \\ \frac{1}{g} = \varphi \left( i \right)\frac{1}{{g_{c} }} + \left[ {1 - \varphi \left( i \right)} \right]\frac{1}{{g_{m} }} \hfill \\ \end{gathered} \right. $$
(3)

where \(g_{c}\) is arc conductance in the high current region (S), \(\theta_{c}\) is arc time constant in the high current region; \(u_{c}\) is arc voltage gradient (V/cm), \(g_{m}\) is arc conductance in the low current region (S); \(\theta_{m}\) is arc time constant in the low current region, \(p_{loss}\) is dissipated power constant (W), \(u\) is arc instantaneous voltage (V), and \(g\) is arc instantaneous conductance (S).

The Cassie–Mayr model assumes that voltage gradient and dissipated power are constants. However, various external environments can lead to different parameter variation patterns. Therefore, this paper explores the effect of airflow speed on voltage gradient and dissipated power, based on the high-speed airflow that is unique to high-speed train operation, thereby improving the arc model.

Relationship between voltage gradient and airflow velocity

High-speed airflow has a lateral arc extinguishing effect on the pantograph–catenary arc, as shown in Fig. 2, where \({V}_{Airflow}\) represents the velocity of the high-speed airflow and \({V}_{train}\) represents the speed of the train. The primary focus of this paper is to incorporate high-speed airflow into the arc model and to resolve the associated functional relationship.

Fig. 2 Schematic of pantograph-catenary arc.

Reference21 presents the following equation, derived from an energy balance perspective:

$$ E^{2 + m} I_{a}^{2 - m} = K^{\prime} $$
(4)

where \(E\) represents the arc voltage gradient (V/cm), \(I_{a}\) denotes the arc current, \(m\) is the coefficient, which is related to the cooling method, and \(K^{\prime}\) signifies the constant. Based on the aforementioned fundamental theory, the equation influenced by high-speed airflow is derived next.

Throughout the current period, the voltage gradient maintains symmetry between the positive and negative half periods. By converting the negative half-period to a positive half-period, we obtain the expression for the arc voltage gradient as follows:

$$ E = K\left| {I_{a} } \right|^{{\frac{m - 2}{{m + 2}}}} = K\left| {I_{a} } \right|^{ - \delta } $$
(5)

where K represents a constant. The parameter \(\delta\) in (5) depends on the arc’s cooling method. Experimental findings suggest that with lateral cooling, \(m = 1\) and \(\delta = 1/3\), whereas during free burning, \(m = 0\) and \(\delta = 1\). As train speed progressively increases, \(\delta\) transitions from the free-burning value to the lateral-cooling value, so \(\delta \in (1/3,1)\) is considered.
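Comparing the two exponents in (5) gives \(\delta = (2 - m)/(2 + m)\), which reproduces the two limiting cases quoted above. A short sketch, with illustrative function names and a hypothetical value of K, is:

```python
def cooling_exponent(m):
    """delta = (2 - m) / (2 + m), read off from the two exponents in Eq. (5)."""
    return (2.0 - m) / (2.0 + m)


def voltage_gradient(i_a, k, m):
    """Eq. (5): E = K * |I_a|**(-delta). Not defined at i_a = 0."""
    return k * abs(i_a) ** (-cooling_exponent(m))


delta_lateral = cooling_exponent(1.0)  # lateral cooling
delta_free = cooling_exponent(0.0)     # free burning

# K = 1000 is a hypothetical constant chosen only for this illustration.
e_free = voltage_gradient(-100.0, k=1000.0, m=0.0)
```

Because the absolute value of the current is used, the negative half-period gives the same gradient as the positive one.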

The relationship between the parameter \(\delta\) and train speed spans several disciplines, such as thermodynamics and fluid dynamics, which makes it exceedingly complex to compute directly. Hence, this paper improves (5) by integrating the maximum offline distance–train speed curve of the pantograph–catenary arc.

Optimized arc model

Among existing arc models, relatively few link airflow velocity with voltage gradient. According to Ref.12, the arc voltage gradient in a stable burning state depends solely on the arc length and is directly proportional to it. With the empirical value \(Z = 15\) V/cm\(^{2}\), this gives:

$$ u_{c} = 15 \times L_{Arc} $$
(6)

where \(L_{Arc}\) represents the length of the arc column. Let the maximum offline distance of the pantograph–catenary at different speeds be \(d_{\max } (v)\), which is obtained by fitting the field experimental data and the results from Ref.3:

$$ d_{\max } (v) = 4.571 \times 10^{ - 6} v^{2} + 0.0238v - 0.1411 $$
(7)

Assuming a constant arc resistance per unit length, arc resistance increases as speed and offline distance increase. This observation lays the theoretical groundwork for introducing port impedance in the following discussions. By examining the extreme offline scenario during operation, we consider \(L_{{{\text{Arc}}}} = d_{\max }\) and derive the expression for the voltage gradient as

$$ u_{c} = 6.857 \times 10^{ - 5} v^{2} + 0.357v - 2.117 $$
(8)
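The coefficients of (8) follow from multiplying the fitted polynomial (7) by the empirical factor of (6). The following sketch, with illustrative names, checks this arithmetic:

```python
D_MAX_COEFFS = (4.571e-6, 0.0238, -0.1411)  # quadratic fit of Eq. (7)
Z = 15.0                                    # empirical factor of Eq. (6)


def d_max(v):
    """Maximum offline distance at speed v (km/h), Eq. (7)."""
    a, b, c = D_MAX_COEFFS
    return a * v * v + b * v + c


def u_c(v):
    """Voltage gradient for the extreme offline case L_Arc = d_max, Eq. (8)."""
    return Z * d_max(v)


# Coefficients of Eq. (8) before rounding: each one is 15 times Eq. (7).
scaled = [Z * c for c in D_MAX_COEFFS]
```

The scaled coefficients agree with (8) to the number of digits printed there.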

To incorporate airflow velocity into (5), we integrate the fitted offline distance-train speed curve into the blown arc model, substituting the blown arc parameters \(\delta\). This leads to the mathematical model of arc voltage gradient variation with airflow speed and arc current.

To control variables, when calculating the relationship between voltage gradient and velocity, we set the current equal to the effective value of 141.4 A of the equivalent circuit, resulting in:

$$ E_{v} = K \times 141.4^{ - \delta } = K \times 141.4^{f(v)} $$
(9)

The coefficient \(\delta\) is linked to airflow velocity through \(f\left( v \right) = - \delta\). Given the value range of \(\delta\), it follows that \(f(v) \in ( - 1, - 1/3)\).

In the same pantograph–catenary system, the variation of voltage gradient with speed should remain consistent, yielding:

$$ E_{v} = u_{c} $$
(10)

where \(E_{v}\) represents the voltage gradient expression based on the energy balance theory with current as the effective value,\(u_{c}\) is the voltage gradient expression indirectly derived based on the maximum offline distance. The calculated values of these two voltage gradient expressions should be equal, so:

$$ \begin{aligned} u_{c} & = K \times 141.4^{f(v)} \\ & = 6.857 \times 10^{ - 5} v^{2} + 0.357v - 2.117 \\ \end{aligned} $$
(11)

Simplification yields:

$$ f(v) = \log_{141.4} \frac{{6.857 \times 10^{ - 5} v^{2} + 0.357v - 2.117}}{K} $$
(12)

where v represents the airflow velocity. Taking the maximum train speed as 500 km/h and using \(f(v) \in ( - 1, - 1/3)\), the calculation yields \(K = 1007\), giving the final expression for the improved voltage gradient:

$$ \left\{ \begin{gathered} E = 1007\left| {I_{a} } \right|^{{\frac{m - 2}{{m + 2}}}} = 1007\left| {I_{a} } \right|^{ - \delta } \hfill \\ - \delta = \log_{141.4} \frac{{6.857 \times 10^{ - 5} v^{2} + 0.357v - 2.117}}{1007} \hfill \\ \end{gathered} \right. $$
(13)
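The text does not spell out how K is computed. One reading, which is an assumption here, is that the exponent reaches the lateral-cooling limit \(f(v) = -1/3\) at the maximum speed of 500 km/h. Under that assumption, (12) can be solved for K as sketched below; the function names are illustrative.

```python
import math

I_RMS = 141.4  # effective current used in Eq. (9), in A
V_MAX = 500.0  # maximum train speed from the text, in km/h


def u_c(v):
    """Voltage gradient from the offline distance fit, Eq. (8)."""
    return 6.857e-5 * v * v + 0.357 * v - 2.117


def solve_k(v_max=V_MAX, f_at_v_max=-1.0 / 3.0):
    """Solve Eq. (11) for K under the ASSUMED condition f(v_max) = -1/3."""
    return u_c(v_max) / I_RMS ** f_at_v_max


def f(v, k):
    """Eq. (12): f(v) = log base 141.4 of u_c(v) / K."""
    return math.log(u_c(v) / k, I_RMS)


k_estimate = solve_k()
f_at_300 = f(300.0, 1007.0)  # exponent at 300 km/h with the paper's K
```

With these inputs the estimate lands within about one unit of the reported \(K = 1007\); the small gap may come from rounding, or the authors may have used a slightly different condition.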

Thus far, we have established how the voltage gradient depends on airflow speed and arc current.

Building on the energy balance theory, the relationship between dissipated power and airflow velocity is obtained as follows3:

$$ P_{loss} = d_{\max } (224.55v_{c} - \frac{1319.88}{{v_{c} + 51}} + 11482.58) $$
(14)

where \({P}_{loss}\) represents the dissipated power. During operation, the pantograph–catenary offline distance changes with speed and consequently affects the dissipated power. The improved arc model is therefore:

$$ \left\{ \begin{gathered} \frac{{dg_{c} }}{dt} = \frac{1}{{\theta_{c} }}\left( {\frac{{u^{2} g^{2} }}{{\left( {1007\left| {I_{a} } \right|^{f\left( v \right)} } \right)^{2} g_{c} }} - g_{c} } \right) \hfill \\ f\left( v \right) = \log_{141.4} \frac{{6.857 \times 10^{ - 5} v^{2} + 0.357v - 2.117}}{1007} \hfill \\ \frac{{dg_{m} }}{dt} = \frac{1}{{\theta_{m} }}\left( {\frac{{u^{2} g^{2} }}{{d_{\max } \left( {224.55v_{c} - \frac{1319.88}{{v_{c} + 51}} + 11482.58} \right)}} - g_{m} } \right) \hfill \\ d_{\max } = 4.571 \times 10^{ - 6} v^{2} + 0.0238v - 0.1411 \hfill \\ \frac{1}{g} = \varphi \left( i \right)\frac{1}{{g_{c} }} + \left[ {1 - \varphi \left( i \right)} \right]\frac{1}{{g_{m} }} \hfill \\ \end{gathered} \right. $$
(15)
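To make the structure of (15) concrete, the right-hand sides of the two differential equations can be written as a single function, as in the sketch below. This is an illustration under stated assumptions, not the PSCAD implementation: the text does not define \(v_{c}\), so it is kept as a separate argument, and the time constants and operating point used to exercise the function are hypothetical.

```python
import math

K = 1007.0     # constant of Eq. (13)
I_RMS = 141.4  # logarithm base used in Eq. (12)


def d_max(v):
    """Maximum offline distance, Eq. (7)."""
    return 4.571e-6 * v * v + 0.0238 * v - 0.1411


def f(v):
    """Speed-dependent exponent, Eq. (12) with K = 1007."""
    return math.log((6.857e-5 * v * v + 0.357 * v - 2.117) / K, I_RMS)


def p_loss(v, v_c):
    """Dissipated power, Eq. (14). v_c is not defined in the text."""
    return d_max(v) * (224.55 * v_c - 1319.88 / (v_c + 51.0) + 11482.58)


def phi(i, alpha=0.025, i0=15.0):
    """Transition function, Eq. (1)."""
    return 1.0 / (1.0 + math.exp(-alpha * (i * i - i0 * i0)))


def derivatives(u, i_a, g_c, g_m, v, v_c, theta_c, theta_m):
    """Return (dg_c/dt, dg_m/dt) from Eq. (15). Requires i_a != 0."""
    weight = phi(i_a)
    g = 1.0 / (weight / g_c + (1.0 - weight) / g_m)
    e = K * abs(i_a) ** f(v)  # improved voltage gradient, Eq. (13)
    dg_c = (u * u * g * g / (e * e * g_c) - g_c) / theta_c
    dg_m = (u * u * g * g / p_loss(v, v_c) - g_m) / theta_m
    return dg_c, dg_m


# Hypothetical operating point: 300 km/h, 100 A, equal branch conductances,
# v_c set equal to v, and 0.1 ms time constants.
e_300 = K * 100.0 ** f(300.0)
rates = derivatives(e_300, 100.0, 0.5, 0.5, 300.0, 300.0, 1e-4, 1e-4)
```

When the arc voltage equals the gradient term and both branches share the same conductance, the Cassie branch is at equilibrium, which gives a quick sanity check of the implementation.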

In summary, we considered the unique high-speed airflow environment of pantograph–catenary systems and derived a specific mathematical model for the pantograph–catenary arc.

Simulation arguments

PSCAD/EMTDC is a general-purpose power system transient simulation platform with a rich library of components capable of simulating various operating states within a power system. Consequently, we use the PSCAD/EMTDC software to simulate and demonstrate the proposed arc model.

Simulation implementation

In the high-speed railway power supply system, we equate the transformer to a powered two-port output network, consider the contact wire conductors as a \(\pi\)-type circuit, and regard the motive force as an unpowered two-port network36. Figure 3 illustrates the equivalent circuit.

Fig. 3 Equivalent circuit of pantograph-catenary system.

where \(U_{s}\) represents the traction power supply, the effective value of which is 27.5 kV; \(R_{T}\) and \(L_{T}\) are the equivalent resistance and inductance of the transformer, respectively; \(R_{L}\), \(L_{L}\), and \(C_{L}\) are the equivalent resistance, inductance, and capacitance between the ground and contact wire, respectively; \(R_{Arc}\) is the arc resistance; and \(R_{M}\) and \(L_{M}\) are the equivalent resistance and inductance during the operation of the locomotive motor.

As shown in Table 1, taking the Wuhan–Guangzhou high-speed railway section as an example37, values for each electrical parameter were calculated.

Table 1 Equivalent circuit parameters of pantograph–catenary system.

When calculating the arc voltage gradient, \(\left| {I_{a} } \right|^{f(v)}\) must be evaluated, where both the exponent and the base are functions of time. However, PSCAD/EMTDC lacks calculation units with variables in both the exponent and the base, though it does provide functions of the type exp(x). Therefore, the original function \(E = K|I_{a} |^{f(v)}\) must be converted to the form of (16):

$$ E = e^{{\ln K + f(v)\ln |I_{a} |}} $$
(16)
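The conversion in (16) relies only on the identity \(a^{b} = e^{b\ln a}\). A brief numerical check, written in Python rather than as a PSCAD block and using illustrative function names, is sketched here:

```python
import math


def gradient_direct(k, i_a, f_v):
    """E = K * |I_a|**f(v), evaluated with a variable base and exponent."""
    return k * abs(i_a) ** f_v


def gradient_via_exp(k, i_a, f_v):
    """Eq. (16): the same quantity using only exp() and ln().

    Requires i_a != 0, since ln|I_a| diverges at the current zero crossing.
    """
    return math.exp(math.log(k) + f_v * math.log(abs(i_a)))


# Example exponent inside the range (-1, -1/3); the value -0.45 is arbitrary.
direct = gradient_direct(1007.0, -141.4, -0.45)
via_exp = gradient_via_exp(1007.0, -141.4, -0.45)
```

Both routes give the same value to floating-point precision. How the simulation handles the instant of exact current zero is not described in the text, so any guard for that case would be an additional assumption.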

Using the above mathematical model equations and equivalent circuits, a simulation platform is built in PSCAD, and the partial program diagram is shown in Fig. 4:

Fig. 4 Transition function program diagram.

In PSCAD, the airflow velocity is set to increase at a rate of 166 km/h per second, and a fault occurs at 0.2 s. Therefore, when the fault occurs at 0.2 s, the Cassie model and the Mayr model capture the loop voltage and current at that moment. Based on these values, the current resistances of the Mayr and Cassie models are calculated. Then, through a transition function, the proportion of the resistances of the two models is determined, allowing the calculation of the final model resistance. This calculated resistance is then used in the equivalent circuit to determine the relevant electrical quantities for the next moment, until the simulation ends.

Analysis of simulation results

According to the Mayr assumption, voltage gradient should be proportional to the square root of arc resistance25, as expressed by:

$$ E = A\sqrt {R_{Arc} } $$
(17)

where A is the proportionality constant and \(R_{Arc}\) represents arc resistance (Ω).

In Fig. 5, curves 1 and 3 satisfy the Mayr assumption of positive proportionality. Furthermore, in curves 1 and 2 in Fig. 5, the voltage gradient is defined as the voltage drop per unit arc length, which should exhibit certain proportional similarity to the arc voltage. Consequently, we can infer that the voltage gradient solution presented in this paper is correct.

Fig. 5 Contrast diagram of voltage gradient, voltage absolute value and resistance square.

Figure 6 shows voltage gradient as a function of current and speed. The X-axis represents current, calculated from the equivalent circuit, with values ranging from -200 A to 200 A, and the Y-axis represents airflow speed.

Fig. 6 Function relation of current and train speed to voltage gradient.

First, the relationship between the voltage gradient and the arc current is analyzed. At a constant speed, the current fluctuates between -200 A and +200 A within one cycle, and the voltage gradient shows a spike waveform, with the spike occurring near the current zero-crossing point. When the current crosses zero, the arc tends to extinguish, and the arc column becomes thin or even breaks. At this point, the resistance of the arc column is high, causing a larger voltage drop across the entire circuit. Therefore, a spike in the voltage gradient occurs at the zero-crossing point. As the current gradually increases from the zero-crossing point, the arc reignites, and the energy input to the arc increases, leading to more intense combustion. The arc column expands, the arc resistance decreases, and the voltage drop across the arc reduces, resulting in a decrease in the voltage gradient. This is consistent with the spike curve shown in Fig. 6.

Next, the relationship between the voltage gradient and the airflow velocity at a constant current is analyzed. The arc consists of charged particles generated when the air is ionized and therefore exhibits fluid-like properties. When the train speed increases, the middle part of the arc column is stretched and tilted, and the arc column becomes thinner. As a result, the arc resistance increases, leading to an increase in the voltage drop. Additionally, as the arc column elongates and becomes thinner, heat dissipation also increases. At this point, an increase in terminal voltage is required to maintain the energy balance of the arc and prevent it from extinguishing28. The above theoretical analysis of the mathematical model is consistent with the characteristics of the pantograph–catenary arc system.

Figure 7 depicts the simulated waveforms of arc voltage, current, and resistance based on the previously demonstrated arc voltage gradient. By observing the resistance variation, it is evident that significant changes in the arc's characteristic parameters occur when the current approaches zero.

Fig. 7 Simulated waveforms of the arc's voltage, current, and resistance.

Experimental verification

Experimental setup

To further validate the mathematical model proposed in this paper and to obtain experimental data under the influence of airflow velocity for the fault diagnosis that follows, we conduct experiments on pantograph–catenary arcs using laboratory equipment. Figures 8 and 9 depict the schematic of the arc experimental machine and its physical diagram, respectively.

Fig. 8 Electrical schematic diagram of the arc generating platform.

Fig. 9 Experimental platform diagram.

The experimental power supply operates at AC 220 V, 50 Hz. We use a circuit breaker to connect the circuit. The regulator and high current generator work together to simulate the high current in the pantograph–catenary system. We connect the output side of the high current generator to the load cabinet. To better simulate the actual situation of the pantograph–catenary arc, we adjust the power factor of the load cabinet to match the power factor of the equivalent circuit.

Figure 10 illustrates the structure of the experimental platform. In the mechanical part, a three-phase AC motor drives the turntable to rotate, and the relative motion of the carbon slide plate and the metal wire embedded in the turntable simulates the relative sliding of the pantograph and catenary in the pantograph–catenary system. The relative speed of the metal wire and carbon slide plate is changed by adjusting the speed of the motor. A tone ring motor positioned at the lower left side of the turntable is connected to the carbon slide plate to adjust the contact pressure between the carbon slide plate and turntable, thereby simulating the pressure between the pantograph and catenary, which reflects the offline probability. Additionally, a mechanical structure driven by a single-phase motor is mounted on one side of the carbon slide plate. It converts the rotational motion of the motor into the reciprocating motion of the overall carbon slide plate structure, which is used to simulate the "Z" sliding between the pantograph and catenary.

Fig. 10 Structural diagram of arc-generation platform.

We improve the original experimental equipment by adding a 5500 W high-power blower to simulate the high-speed airflow experienced during train operation. Firstly, the 220 V power supply passes through an autotransformer, a high-current generator, and a load cabinet to simulate the current-carrying capacity of the traction system. Next, the control system adjusts the parameters of the host computer to regulate contact pressure, circuit current, turntable speed, and the reciprocating speed of the sliding table. Then, a handheld blower is used to generate a high-speed airflow field, which is applied laterally to the arc column to simulate the lateral arc-blowing process. Finally, voltage and current transformers sense voltage and current signals, which are converted into digital values by a data acquisition card and transmitted to the host computer via Modbus communication. The host computer displays and records the data.

For current detection, the experimental system uses an LMZJ1 current transformer, and a DLPT202D voltage transformer detects the port voltage. These two signals pass through their respective signal conditioning circuits before being sent to the data acquisition card. The NI PCI-6251 data acquisition card collects real-time experimental data such as port voltage and current and uploads them to the data acquisition and processing system of the upper computer for display, analysis, and storage. The sampling frequency of the data acquisition card is set to 10 kHz.

In the control section of the experiment, operators can manually adjust the relative sliding speed, airflow speed, operating current, and contact pressure of the carbon sliding plate on the upper computer.

Analysis of experimental results

Figure 11 presents the simulated and measured arc voltage waveforms at an airflow speed of 225.7 km/h. Upon comparing voltage waveforms, a high degree of similarity is observed between the measured and simulated waveforms of the pantograph–catenary arc. Additionally, Fig. 12 displays arc voltages measured under different high-speed airflow levels.

Fig. 11 Comparison between the simulated and measured waveforms.

Fig. 12 Half-period arc voltage at different airflow velocities.

An increase in the airflow speed increases the arc voltage amplitude, aligning with the conclusion drawn from the arc model presented in this paper. To further scrutinize the resemblance between the actual and simulated waveforms at 225.7 km/h in Fig. 11, the Spearman correlation coefficient is introduced in this study. This coefficient enables us to observe the similarity between these two waveforms. During the Spearman correlation analysis, the voltage waveforms from the experiment and simulation in Fig. 11 were replaced with x and y, respectively, for correlation calculation. The expression for calculating the Spearman correlation coefficient is:

$$ \rho = \frac{{\sum\limits_{i = 1}^{N} {(x_{i} - \overline{x})(y_{i} - \overline{y})} }}{{\sqrt {\sum\limits_{i = 1}^{N} {(x_{i} - \overline{x})^{2} } \sum\limits_{i = 1}^{N} {(y_{i} - \overline{y})^{2} } } }} $$
(18)

where \(\rho\) represents the Spearman correlation coefficient, x and y are the two variables, and \(\overline{x}\) and \(\overline{y}\) represent their mean values. The Spearman correlation coefficients and the corresponding degrees of correlation are illustrated in Fig. 13.
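One point worth noting when reproducing (18): applied directly to the sample values, the expression is the Pearson product-moment form, and the Spearman coefficient is obtained by applying the same expression to the ranks of the samples. The sketch below shows both. It does not claim which variant produced the reported value, and the simple ranking helper (which breaks ties by position instead of averaging them) is a simplification introduced here.

```python
import math


def correlation(x, y):
    """Eq. (18) applied to two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def ranks(values):
    """Ranks 1..N in ascending order; ties are broken by position."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result


def spearman(x, y):
    """Spearman coefficient: Eq. (18) evaluated on ranks."""
    return correlation(ranks(x), ranks(y))


# Synthetic monotonic but nonlinear data, only to contrast the two variants.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [v ** 3 for v in x_demo]
rho_values = correlation(x_demo, y_demo)
rho_ranks = spearman(x_demo, y_demo)
```

For monotonic data the rank-based coefficient equals one, while the value-based coefficient is smaller whenever the relationship is nonlinear.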

Fig. 13 Spearman correlation coefficient and correlation degree.

We found that the Spearman correlation coefficient between the simulated and measured waveforms was 0.816, indicating a high level of correlation. The remaining deviations between the simulated and experimental waveforms stem from the difference in amplitude between them, as well as from the fact that the simulation data change rapidly and are not affected by environmental factors.

In summary, the experimental data and correlation analysis presented above demonstrate the reliability of the mathematical model established in this paper for theoretical analysis of pantograph–catenary arc. Next, we will employ this mathematical model to analyze and verify the feasibility of identifying arc faults through port impedance.

Pantograph–catenary arc fault diagnosis

Arcs exhibit strong concealment within various signals, including voltage and current. Achieving accurate classification and recognition necessitates complex high-dimensional feature extraction and the search for optimal parameters through optimization algorithms. The difficulty of this task is further heightened when the arc burning phenomenon is not readily apparent. To address this challenge, this paper introduces the concept of port impedance.

Feasibility analysis of port impedance

If the impedance per unit length of the arc during stable arcing, \({Z}_{u}\), is considered in the modeling process in Sect. 1.1, the total impedance of the entire circuit can be expressed as:

$$ \begin{aligned} \frac{{U_{S} }}{{I_{S} }} & \approx Z_{load} + d_{\max } (v) \times Z_{u} \\ & \approx Z_{load} + (4.571 \times 10^{ - 6} v^{2} + 0.0238v - 0.1411) \times Z_{u} \\ \end{aligned} $$
(19)

where \(U_{S}\) and \(I_{S}\) are the port voltage and current when the arc occurs, \(Z_{load}\) is the equivalent impedance of the load, and \(d_{\max } (v)\) is the maximum offline distance of the pantograph–catenary, which is taken as equivalent to the arc length.

According to (19), after an arc occurs, the impedance of the entire circuit increases with increasing airflow speed. Loop impedance cannot be measured directly; its value equals the effective value of the power supply port voltage divided by the effective value of the power supply port current, which is referred to as the port impedance below. When calculating port impedance, voltage and current characteristics are superimposed to amplify the significance of the arc. The feasibility of using port impedance to identify arcs is analyzed below.

Using PSCAD simulation, the instantaneous voltage and current of the power supply port before and after arc occurrence are obtained. Then, the effective value is calculated through a sliding window whose length is one period of N samples, and the effective value over the interval \(\left[ {i - N + 1,i} \right]\) is taken as the value at point \(i\). The expression for impedance calculation is shown in (20):

$$ \left\{ \begin{gathered} rms = \left[ {\frac{1}{N}\sum\limits_{i = 1}^{N} {x_{i}^{2} } } \right]^{{\frac{1}{2}}} \hfill \\ Z_{rms} = \frac{{V_{rms} }}{{I_{rms} }} \hfill \\ \end{gathered} \right. $$
(20)

where N represents the number of discrete acquisition points in a cycle and \(V_{rms}\), \(I_{rms}\) and \(Z_{rms}\) represent the effective values of voltage, current, and port impedance, respectively. As illustrated in Fig. 14, the simulated data were processed according to (20) to obtain port impedance and arc resistance waveforms with changes in airflow velocity.
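The sliding-window calculation in (20) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, and the test signals are synthetic sinusoids with arbitrary amplitudes rather than PSCAD or measured data.

```python
import math


def sliding_rms(x, n):
    """Effective value over the window [i - n + 1, i] for every i >= n - 1."""
    if n <= 0 or n > len(x):
        raise ValueError("window length must be between 1 and len(x)")
    return [
        math.sqrt(sum(s * s for s in x[i - n + 1 : i + 1]) / n)
        for i in range(n - 1, len(x))
    ]


def port_impedance(voltage, current, n):
    """Z_rms = V_rms / I_rms, evaluated window by window.

    Assumes the current is not identically zero inside any window.
    """
    v_rms = sliding_rms(voltage, n)
    i_rms = sliding_rms(current, n)
    return [v / i for v, i in zip(v_rms, i_rms)]


# Synthetic example: three cycles, N samples per cycle, arbitrary amplitudes.
N = 200
voltage = [300.0 * math.sin(2 * math.pi * k / N) for k in range(3 * N)]
current = [100.0 * math.sin(2 * math.pi * k / N) for k in range(3 * N)]
z = port_impedance(voltage, current, N)
print(len(z), round(z[0], 6), round(z[-1], 6))
```

For these in-phase sinusoids the ratio of the amplitudes is 3, so every window yields a port impedance of 3 Ω; an arc would appear as a deviation from such a constant level.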

Fig. 14

Diagram of arc resistance and port impedance changing with speed under simulation.

We set the arc fault to occur at 0.2 s and increase the airflow speed by 166 km/h per second. Under this condition, the changes in arc resistance and port impedance with increasing airflow speed are shown in Fig. 14. Before the arc occurs, the arc resistance is zero and the calculated port impedance is 2.5 Ω. When the arc occurs, the port impedance fluctuates and then stabilizes at 2.8 Ω. As the airflow speed increases, the peak values of arc resistance and port impedance also increase. At 90 km/h, the port impedance increases to 4.75 Ω. Therefore, diagnosing arc faults using port impedance is theoretically feasible.

Fig. 15

Port voltage and current under normal and fault conditions.

We establish an experimental platform with a contact pressure of 70 N, a current effective value of 141 A, and a speed level of 135.8 km/h. The resulting data incorporate the influence of airflow on the arc, making them closer to actual pantograph–catenary arc data. We collect port voltage and current data under normal and fault conditions, normalize them, and present the results in Fig. 15.

Figure 15 depicts port voltage and current in the different states. In the fault state, a zero-rest phenomenon emerges near the current zero crossing, accompanied by high-frequency harmonics in the port voltage.

Port impedance is calculated according to (20) for 30 normal and 30 fault cycle samples, and the results are shown in Fig. 16.

Fig. 16

Comparison of port impedance under normal and fault conditions.

Figure 16 shows that under normal conditions, the port impedance is stable, with its amplitude remaining around 3.02 Ω. After the occurrence of an arc, the port impedance fluctuates significantly, with an average value of approximately 3.1964 Ω. A clear boundary exists between the port impedance waveforms of the two cases. This indicates that arc fault identification based on port impedance can be formulated as a linearly separable classification problem.

When based on current or voltage alone, arc recognition is a nonlinear problem that requires additional data processing and feature extraction for accurate identification. With the port impedance proposed in this paper, however, arc recognition becomes a linear problem, which indicates that port impedance offers remarkable advantages in arc recognition.

SVM identifies arc faults

Support Vector Machine (SVM) is a supervised learning algorithm widely used for classification and regression problems. It is known for its efficiency, high accuracy, robustness, good interpretability, and adaptability. Its core idea is to map the input data into a high-dimensional feature space and then find an optimal hyperplane in that space that separates samples from different classes while maximizing the distance between the hyperplane and the nearest data points. Given SVM's excellent classification performance, it is adopted in this paper as the identification model for arc fault detection, with port impedance used as the input for classification and recognition.

Because the identification of the normal and fault states under port impedance is a linear classification problem, there is no need to extract features, map them into a high-dimensional space, and construct a hyperplane there. This paper labels the port impedance under normal and fault conditions as − 1 and 1, respectively. The port impedance under both conditions is used as the input, while the accuracy of the algorithm on the training and test sets is used as the output.
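To illustrate why a linearly separable problem needs no feature mapping, the sketch below treats the simplest case of a single feature. It is not the MATLAB model used in this paper: it assumes one-dimensional, strictly separable data, for which the hard-margin SVM boundary reduces to the midpoint between the closest samples of the two classes. The sample values and function names are hypothetical.

```python
def max_margin_threshold(samples, labels):
    """Hard-margin boundary for 1-D separable data labeled -1 / +1.

    Assumes every -1 sample lies strictly below every +1 sample.
    Returns (threshold, margin).
    """
    low = max(s for s, y in zip(samples, labels) if y == -1)
    high = min(s for s, y in zip(samples, labels) if y == 1)
    if low >= high:
        raise ValueError("classes are not separable by a single threshold")
    return (low + high) / 2.0, (high - low) / 2.0


def predict(z, threshold):
    """Return 1 (fault) above the threshold, otherwise -1 (normal)."""
    return 1 if z > threshold else -1


# Hypothetical port impedance values in ohms, invented for illustration.
normal = [3.01, 3.02, 3.03, 3.02]
fault = [3.15, 3.22, 3.19, 3.25]
samples = normal + fault
labels = [-1] * len(normal) + [1] * len(fault)

threshold, margin = max_margin_threshold(samples, labels)
predictions = [predict(z, threshold) for z in samples]
print(round(threshold, 4), round(margin, 4), predictions == labels)
```

In this one-dimensional setting, the support vectors are simply the largest normal sample and the smallest fault sample, and training amounts to locating them.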

Each point in Fig. 17 represents the impedance value within one cycle. A total of 500 normal and 500 fault sample values were randomly selected and divided into a training set and a test set at a ratio of 8:2. Figure 16 clearly demonstrates that the fault and normal states can be directly separated using a straight-line boundary. Therefore, there is no need to extract high-dimensional features or utilize complex optimization algorithms. This is the biggest advantage of using port impedance to identify arc faults. In this study, the simplest SVM binary classification model in MATLAB is used directly for training, without the need to set the relevant parameters of the algorithm model. The intercept and slope of the decision boundary can be obtained through simple training. After training, the distribution of the training sample points and the classifier boundary line is shown in Fig. 17.

Fig. 17

Distribution of training sample points and classifier boundary.

The distribution of fault sample points and normal sample points on the plane is shown in Fig. 17. By training the binary classification model with the above sample points, the boundary line function of the classifier is obtained as (21).

$$ Y = wx + b = 0.00000251x + 3.1064 $$
(21)

From the solved boundary line, whose slope is nearly zero, it can be seen that fault and normal states can be distinguished by setting a horizontal threshold line. Such a threshold is easy to implement on embedded devices, which facilitates lightweight productization of arc fault diagnosis.
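Applying the boundary in (21) requires only one multiplication, one addition, and one comparison per cycle. The sketch below uses the coefficients of (21). The function names are illustrative, and the assumption that the fault side lies above the line is inferred from the averages reported for Fig. 16 (about 3.02 Ω normal, about 3.1964 Ω fault).

```python
W = 0.00000251  # slope from Eq. (21)
B = 3.1064      # intercept from Eq. (21), in ohms


def boundary(x):
    """Boundary line Y = w * x + b evaluated at sample index x."""
    return W * x + B


def is_fault(z_port, x=0):
    """True when the port impedance lies above the boundary line.

    Assumes fault samples sit above the line and normal samples below it.
    """
    return z_port > boundary(x)


print(is_fault(3.02))          # average normal level from Fig. 16
print(is_fault(3.1964))        # average fault level from Fig. 16
print(boundary(1000) - B)      # drift of the line over 1000 samples
```

Over 1000 samples the line rises by about 0.0025 Ω, which is small compared with the gap between the two averages, so a constant threshold near 3.1064 Ω behaves almost identically.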

The remaining 200 randomly selected test samples are then fed into the model for prediction. The distribution of the test set sample points and the model prediction are shown in Fig. 18.

From Fig. 18, we observe that the classifier trained on the previous 800 sample points accurately classifies the 188 sample points that did not participate in the training. This indicates that a simple binary SVM model can exhibit good robustness when port impedance is used.

Fig. 18

Test set samples and classifier boundaries.

We assume that the label of a fault sample in the test set is − 1 and the label of a normal sample is 1. The comparison between the predicted classification of the model and the actual classification is shown in Fig. 19.

Fig. 19

Test set predictive classification and actual classification.

From Fig. 19, it is evident that with the simple, unmodified linear binary SVM classifier, the testing accuracy can reach 99%.

Therefore, we conclude that port impedance demonstrates superior discrimination performance in arc recognition. Compared with arc current or voltage, arcs are less concealed in port impedance and are therefore comparatively easy to identify.

Comparison with other recognition methods

The advantages of port impedance in identifying arc faults have been analyzed above. Next, we will compare the port impedance with several common arc fault identification methods from multiple aspects.

Identifying arc faults through arc current, arc voltage, or arc images is common practice. This article compares the advantages and disadvantages of port impedance with those of these three methods, and each method is comprehensively evaluated from six aspects. Note that this article does not evaluate the effectiveness of a method on recognition accuracy alone, because differences in accuracy can arise from using different feature categories, algorithm models, and data, as well as from varying requirements for processor performance and recognition time.

Therefore, this article comprehensively examines six aspects and assesses the advantages and disadvantages of recognition methods through score values.

Taking the method of identifying faults through arc current as an example, the difficulty value of the method is evaluated in each of six aspects, and the total difficulty value is then calculated. The evaluation process is shown in Fig. 20. The six aspects are the difficulty of distinguishing between fault signals and normal signals, the difficulty of signal acquisition, the algorithm requirements for achieving an accuracy of over 90%, the requirements for feature value selection with over 90% accuracy, the number of required original signal types, and the performance requirements for processors. The difficulty level of each aspect is divided into three intervals, high, medium, and low, with corresponding scores of 3, 2, and 1. The higher the score, the more difficult it is to achieve fault diagnosis.
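The scoring rule described above can be written as a short calculation. The following sketch is illustrative only: the ratings in `example` are hypothetical placeholders and are not the values reported in Table 2, and the identifier names are assumptions.

```python
SCORES = {"low": 1, "medium": 2, "high": 3}

ASPECTS = (
    "distinguishing fault from normal signals",
    "signal acquisition",
    "algorithm requirements for over 90% accuracy",
    "feature value selection for over 90% accuracy",
    "number of required original signal types",
    "processor performance requirements",
)


def total_difficulty(ratings):
    """Sum the per-aspect scores (high = 3, medium = 2, low = 1)."""
    missing = [a for a in ASPECTS if a not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    return sum(SCORES[ratings[a]] for a in ASPECTS)


# Hypothetical ratings for one method, invented for illustration.
example = dict(
    zip(ASPECTS, ["high", "low", "medium", "high", "low", "medium"])
)
total = total_difficulty(example)
print(total)  # 3 + 1 + 2 + 3 + 1 + 2
```

With six aspects, the total difficulty value ranges from 6 (all low) to 18 (all high), which gives a common scale for comparing methods.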

Fig. 20

Calculation process of difficulty value for identifying faults through arc current.

This paper compares three conventional arc identification methods—arc current, arc voltage, and arc image—with the proposed port impedance recognition method. The recognition processes of each method are shown in Fig. 21.

Fig. 21

The recognition processes of the arc identification methods.

Table 2 displays the scores of the recognition methods in each aspect. From a vertical perspective, image recognition has the highest difficulty score, indicating that identifying arc faults through images is relatively challenging. Conversely, port impedance has the lowest difficulty score, suggesting that identifying arc faults through port impedance is the easiest. To provide a clearer visualization of the distribution of difficulty values, the table is presented graphically in Fig. 22.

Table 2 Distribution of difficulty values for various methods from various aspects.

By observing the distribution of difficulty values in Fig. 22, we see that when arc current is used as the original signal to diagnose arc faults, the fault and normal signals are highly similar. In this context, the selection of feature value types becomes crucial and directly determines the accuracy of recognition. When arc voltage is used as the original signal, the selection of characteristic values remains crucial. However, because the location of the arc is uncertain, obtaining accurate arc voltage data is challenging. This is also why most arc recognition methods use current signals.

Fig. 22

Distribution of difficulty values for various methods from various aspects.

Moreover, when arc images are employed as raw signals for arc fault diagnosis, obtaining these images proves challenging. Furthermore, image recognition requires more complex algorithms, resulting in high requirements for processor performance.

When port impedance is used to identify arcs, the method requires both voltage and current signals, but the difficulty values for the other five aspects are relatively low, especially because no feature values need to be extracted. This method can directly and linearly separate fault and normal states, which makes it largely superior to the other three arc identification methods.

Conclusion

Considering the unique high-speed airflow environment of the pantograph–catenary system, this paper establishes a mathematical model of the arc suited to this environment. To reduce the difficulty of arc identification, a method for identifying arc faults using port impedance is proposed, and its feasibility is analyzed using the mathematical model developed in the article. The following conclusions are drawn from this research.

  • The proposed mathematical model integrates the high-speed airflow effect of the pantograph–catenary arc, and the derivation results show that the arc voltage gradient is a composite function with the arc current as the base and a function of airflow velocity as the exponent. A more realistic arc model is obtained by incorporating the voltage gradient and dissipated power, both including airflow velocity, into the Cassie–Mayr simultaneous model and adjusting the contribution rate with a transition function.

  • The influence of high-speed airflow on the pantograph–catenary arc is simulated using a pantograph–catenary arc experimental machine at different speed levels. The correlation analysis between the measured and simulation data demonstrates the effectiveness of the mathematical model in simulating the airflow-affected pantograph–catenary arc. The model can therefore be used for characteristic analysis and diagnosis of the pantograph–catenary arc.

  • This paper proposes the concept of port impedance and conducts its feasibility analysis using theoretical and experimental data. The findings demonstrate that port impedance makes arc characteristics more prominent and transforms the traditional nonlinear recognition problem into a linear one. This method eliminates the need for complex feature extraction and algorithm parameter optimization. The proposed port impedance approach achieves an accuracy rate close to 99% in recognizing arc faults.

This paper considers the influence of high-speed airflow on the pantograph-catenary arc and proposes a model that more closely resembles the arc behavior during actual train operation, providing a theoretical basis for understanding and controlling the current collection state of trains in real-world scenarios. Furthermore, future work can build upon the model established in this study by incorporating additional environmental factors, such as electromagnetic fields, heat conduction, fluid dynamics, and various disturbances. These factors can be combined using mathematical methods to describe the behavior of the arc more accurately.