Abstract
Open-circuit faults (OCFs) in three-level neutral-point-clamped (NPC) inverters can severely degrade power quality and compromise system reliability. However, existing diagnostic methods often exhibit performance degradation under mixed operating conditions and strong noise, and they remain highly sensitive to hyperparameter settings. To address these issues, this paper proposes an integrated model–optimization framework that couples a lightweight diagnostic network, DR-SE-NPCNet, with an improved honey badger algorithm (IHBA) for global hyperparameter tuning. DR-SE-NPCNet preserves full temporal resolution through a temporal resolution preserving–temporal dilation (TRP-TD) backbone and enhances discriminative representations using a residual and squeeze-and-excitation–calibrated fusion (ReSE-CF) module. IHBA further stabilizes and improves the model by enabling efficient and robust hyperparameter optimization. Experiments on a hardware NPC inverter platform demonstrate that the proposed method achieves 92.83–96.94% accuracy under 10 dB noise and mixed variations in load level, modulation index, DC-bus voltage, and output frequency, outperforming conventional CNN-based approaches. With IHBA optimization, diagnostic accuracy increases by a further 2–3%. These results confirm that the integrated DR-SE-NPCNet and IHBA framework provides a robust and high-accuracy solution for OCF diagnosis under severe noise and mixed operating conditions.
Introduction
Three-level NPC inverters, owing to their superior output performance, high energy efficiency, and excellent electromagnetic compatibility, have been widely deployed in medium- and high-voltage applications such as renewable energy integration, industrial drives, electric vehicles, and railway traction [1]. Compared with two-level topologies, the NPC structure is more complex and involves more switching devices that are susceptible to failures under long-term thermal stress, electrical stress, temperature cycling, and restricted heat dissipation [2,3]. Short-circuit faults in switching devices can be detected and isolated within milliseconds using mature hardware techniques and are therefore not discussed. In contrast, OCFs allow the inverter to continue operating temporarily, but if not located and identified rapidly and accurately, they cause current distortion, voltage imbalance and DC-link drift, leading to load shocks, power quality degradation, and even equipment shutdown [4]. Consequently, accurate and timely OCF diagnosis in three-level NPC inverters is critical for ensuring system safety and reliability and also holds significant economic value [5,6,7,8].
Diagnostic methods for OCFs are generally divided into model-based, signal-processing-based, and data-driven approaches [9]. Model-based methods rely on accurate mathematical models for state estimation and fault detection. Several sliding-mode observers, including interval, fractional-order preset-time, and super-twisting designs, improve robustness and detection speed but remain sensitive to parameter tuning and can become unstable under noise [10,11,12]. Abbas et al. applied an extended Kalman filter bank capable of detecting single and multiple faults, but the approach is heavily dependent on model accuracy and is computationally demanding [13]. Overall, model-based methods are accurate, but their applicability is constrained by parameter dependence, modeling precision, and computational complexity.
Signal-processing-based methods diagnose faults by extracting features from current or voltage signals. Zhou et al. proposed a voltage-signal trend decomposition method to improve accuracy [14], but its reliability declines under strong noise. Sun et al. enhanced sensitivity using residual voltage features [15]. Zhang et al. [16] proposed a current-vector-phase-based detection method that improves real-time performance but still experiences performance degradation under complex conditions. Other studies also exploit critical current features to enhance sensitivity but remain limited in adaptability [17]. Overall, signal-processing approaches improve feature utilization and detection accuracy, but current-based methods respond slowly and are highly susceptible to load variation, whereas voltage-based methods require extra sensors, thereby increasing the overall cost and system complexity.
Data-driven fault diagnosis methods rely on data mining and pattern recognition to map data to fault states, offering greater flexibility than model- or signal-based approaches. These methods can generally be divided into two categories. The first uses conventional machine learning techniques such as support vector machines (SVM), decision trees, k-nearest neighbors (KNN), and random forests (RF). They depend on manually extracted time-, frequency-, or time–frequency-domain features for classification. Studies combining wavelet energy features with RF [18] or weighted RF with feature selection [19] improved discriminability but suffered from limited noise robustness and became increasingly computationally expensive for large datasets. Other SVM- and RF-based methods [20,21,22] improved accuracy but continued to face challenges with handling multiple faults and maintaining interpretability. Overall, conventional data-driven methods show good accuracy and stability but require substantial labeled datasets and are highly sensitive to hyperparameter settings.
The second category comprises deep-learning-based diagnostic methods. These approaches automatically extract hierarchical features directly from raw data, avoiding handcrafted features, and enable high-accuracy fault classification and localization through end-to-end learning [23,24]. Convolutional neural networks (CNNs), due to their local perception and parameter-sharing, are particularly effective for multi-scale feature extraction and have been widely applied in power electronics fault diagnosis. For multi-dimensional time-series signals, CNNs can effectively capture transient and steady-state patterns, thereby maintaining high accuracy and robustness [25,26].
In recent years, researchers have proposed several CNN-based improvements for inverter OCF diagnosis. Shen et al. [27] introduced CNNs into fault detection for hybrid active NPC inverters, constructing current matrices to improve accuracy. Yuan et al. [28] proposed an enhanced 1D-CNN with the IAdamod optimizer, improving convergence but introducing a large number of parameters. Rajabi et al. developed a separable Conv2D CNN approach that reduces parameter count and enhances real-time performance for edge deployment [29]. However, the overall network structures remain relatively complex, and their engineering applicability is still limited [30,31].
To improve adaptability across varying load and operating conditions, Zhang et al. [32] proposed a wavelet kernel CNN. While their method extended condition coverage, it still lacked robustness under noisy and cross-condition environments. Recent studies further demonstrate that CNN-based models often suffer from notable performance degradation in such scenarios: Wang et al. [33] observed accuracy dropping below 80% at 10 dB SNR, and Chai et al. [34] reported losses exceeding 12% in most methods under similar noise levels.
These findings highlight a critical limitation in the noise resilience and cross-condition generalization of existing approaches, particularly under practical deployment scenarios. To address this gap, this study proposes a lightweight CNN architecture that is specifically designed to enhance diagnostic robustness under both noisy environments and mixed operating conditions.
In existing research, CNN performance is still influenced by both network architecture and hyperparameter settings. Because deep models contain numerous highly coupled hyperparameters, metaheuristic algorithms are widely applied for CNN optimization to enhance diagnostic performance. Pavithra et al. employed the Orca Optimization Algorithm (ORCA) to optimize CNN depth and learning rate, improving classification but remaining sensitive to initial parameters [35]. Almuflih et al. [36] applied Binary Snake Optimization (BSO) for 1D-CNN and LSTM, enhancing detection but exhibiting low efficiency in high-dimensional search. Rajadurai et al. [37] used Particle Swarm Optimization (PSO) with dynamic weight updating, improving precision but incurring high computational cost. Jothi et al. [38] combined Whale Optimization Algorithm (WOA) with DCNN for faster convergence but observed reduced accuracy under non-stationary conditions. Alabbas et al. [39] applied the Salp Swarm Algorithm (SSA) for better global search than PSO, though requiring long training times. Vishwanath and Manjula [40] applied Orca Predation Algorithm (OPA) for hyperparameter tuning but reported convergence instability. Vetrithangam et al. [41] enhanced OPA with adversarial domain adaptation, improving cross-domain transfer but significantly increasing complexity. Nguyen et al. [42] introduced adaptive PSO to optimize LSTM–CNN, accelerating convergence and improving accuracy but still exhibiting oscillatory behavior and limited global search capability under complex conditions.
From these studies, two major issues can be identified:
(1) Diagnostic accuracy significantly degrades under condition switching and noise interference, as mixed scenarios and noise tend to destabilize decision boundaries.
(2) Deep models contain many strongly coupled hyperparameters; existing tuning methods are inefficient and susceptible to local optima, leading to unstable convergence and poor reproducibility.
To overcome these challenges, this paper proposes an integrated model–optimization diagnostic framework. It introduces DR-SE-NPCNet to enhance robustness and generalization and integrates an IHBA for efficient automatic tuning of critical hyperparameters, thereby significantly improving diagnostic accuracy and stability under complex operating conditions.
The main contributions of this paper are summarized as follows:
(1) The DR-SE-NPCNet architecture is presented, integrating TRP-TD and ReSE-CF modules to achieve multi-scale feature extraction and stable residual fusion, thereby improving robustness under complex conditions.
(2) The IHBA integrates the osprey attack strategy (OAS), the secretary bird aerial search strategy (SBAS), and MEMDMS for synergistic optimization, automatically tuning key hyperparameters such as the learning rate and batch size to enhance global search efficiency and ensure stable and generalizable convergence.
(3) A comprehensive dataset of NPC inverter OCFs is constructed, and the proposed model is systematically evaluated under multiple noise levels and mixed operating conditions. It achieves up to 98.32% accuracy in noise-free scenarios and 92.83–96.94% under 10 dB noise and mixed conditions, significantly outperforming other methods. With IHBA, performance is further improved by 2–3% under challenging conditions, confirming the reliability and practical value of the overall framework.
The remainder of this paper is organized as follows: “Open-Circuit Fault Analysis in NPC Three-Level Inverters” section introduces the NPC three-level inverter topology and OCF mechanisms, defining the diagnostic scope. “Diagnostic Framework Based on DR-SE-NPCNet and IHBA” section presents the DR-SE-NPCNet framework with TRP-TD, ReSE-CF, and IHBA-based hyperparameter optimization. “Experimental Validations” section details the experimental platform, dataset, and evaluation under diverse conditions with IHBA benchmark verification. “Conclusion” section concludes with main findings and contributions.
Open-circuit fault analysis in NPC three-level inverters
The main circuit of an NPC three-level inverter consists of twelve insulated gate bipolar transistors (IGBTs), with each phase leg containing four switching devices (Sa1–Sa4, Sb1–Sb4, Sc1–Sc4), clamping diodes, and a split DC-link formed by two capacitors, C1 and C2, as shown in Fig. 1. This structure enables three output voltage levels: +Udc/2, 0, and −Udc/2. During operation, the DC-bus voltage and three-phase currents are regulated through a dual-loop PI control scheme, comprising an outer voltage PI loop and an inner current PI loop, while the switching states are generated by a standard SVPWM strategy. These control loops ensure stable system operation and accurate reference tracking. After passing through the AC filter, the inverter outputs high-quality sinusoidal voltages to the load or grid. Importantly, the proposed DR-SE-NPCNet relies solely on the measured three-phase output currents for fault diagnosis, and its feature extraction process is fully independent of the specific control strategy used in the inverter.
Fig. 1. Control schematic of the NPC three-level inverter.
Each phase alternates between the positive DC bus (P), neutral point (O), and negative DC bus (N). Taking phase A as an example, the conduction path of each state depends on the instantaneous direction of the phase current, as illustrated in Fig. 2. In the P state, Sa1 and Sa2 conduct, and the phase terminal is connected to +Udc/2. When the phase A current is positive, current flows from the upper bridge arm to the load. When the phase current is negative, current returns to the positive bus through the antiparallel diodes of the upper bridge devices. In the O state, Sa2 and Sa3 conduct. In the N state, Sa3 and Sa4 conduct. Switching among the P, O, and N states synthesizes a low-distortion three-level output while evenly distributing thermal stress among devices.
Fig. 2. Current loop of the normal operating state. (a) P state. (b) O state. (c) N state.
The current waveform under normal conditions without open-circuit faults is nearly symmetrical. An open-circuit fault in any IGBT interrupts its corresponding conduction path, leading to half-cycle loss, extended freewheeling intervals, or distinct unbalanced distortions in the observed waveforms. Consequently, fault localization can be achieved solely by monitoring the three-phase currents, without requiring additional voltage sensors. Because the probability of more than two devices opening simultaneously is very low in practice, this paper focuses on diagnosing single-switch and double-switch open-circuit faults.
For single-switch open-circuit faults, take phase A as an example. As illustrated in Fig. 3(a), when Sa2 is open-circuited, both the P and O clamping states fail, while the N state remains functional. The positive half-cycle does not completely disappear due to the inductive nature of the load; instead, a small transient current spike appears as the stored magnetic energy is released through the freewheeling diodes, resulting in partial conduction and pronounced current distortion. Similarly, when Sa4 is open-circuited, the N clamping state is lost, while the O and P states remain intact. The positive half-cycle remains nearly normal, whereas the negative half-cycle exhibits distinct diode freewheeling distortion.
Fig. 3. Current waveforms in different fault situations. (a) Switch Sa2 fault. (b) Switches Sa1 and Sa2 fault. (c) Switches Sa1 and Sa4 fault. (d) Switches Sa2 and Sa3 fault. (e) Switches Sa1 and Sb1 fault. (f) Switches Sa2 and Sb1 fault.
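The single-switch cases described above can be checked with a small lookup. The sketch below is only an illustration built on the phase-A conduction rules stated in the text (P uses Sa1 and Sa2, O uses Sa2 and Sa3, N uses Sa3 and Sa4). It assumes a state is lost whenever any of its gated switches is open, and it ignores diode freewheeling and current direction; the names `STATE_SWITCHES` and `lost_states` are hypothetical.

```python
# Gated switches needed by each phase-A output state, as listed in the text.
# Simplifying assumption: a state is lost if any of its switches is open.
STATE_SWITCHES = {
    "P": {"Sa1", "Sa2"},
    "O": {"Sa2", "Sa3"},
    "N": {"Sa3", "Sa4"},
}


def lost_states(open_switches):
    """Return the output states that depend on at least one open switch."""
    faulty = set(open_switches)
    return [state for state, needed in STATE_SWITCHES.items() if needed & faulty]


for fault in (["Sa2"], ["Sa4"]):
    print(fault, "-> lost states:", lost_states(fault))
```

Under this simplified rule, an open Sa2 removes both the P and O states while an open Sa4 removes only the N state, which matches the qualitative description of the two single-switch examples.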
Double-switch open-circuit faults can be classified into three categories. The first involves two devices in the same half-bridge of a single phase opening simultaneously. The second occurs when two devices in the upper and lower half-bridges of the same phase both open. The third occurs when one device in each of two different phases opens simultaneously. Together with the normal state and single-switch faults, these constitute five categories of open-circuit faults, comprising 73 distinct fault modes in total, as summarized in Table 1. In the first category, inner devices or both switches of the same half-bridge fail simultaneously, producing waveforms similar to single-switch faults and therefore making them difficult to distinguish. For example, when Sa1 and Sa2 are open-circuited simultaneously, the positive half-cycle current in phase A is largely lost, leaving only a small spike caused by inductive freewheeling, as shown in Fig. 3(b), and the waveform is nearly indistinguishable from the Sa2 single open-circuit case.
The second category involves cases where one switch in the upper half-bridge and one in the lower half-bridge of the same phase both open. For example, when Sa1 and Sa4 or Sa2 and Sa3 of phase A open simultaneously, both P and N states are lost, leaving only the O state functional. As shown in Fig. 3(c) and (d), due to the inductive nature of the load, the current does not immediately drop to zero but instead exhibits a decaying freewheeling process, resulting in clear gaps where the P or N states should occur and a noticeable reduction in the fundamental amplitude, thereby weakening current transfer capability and potentially leading to neutral-point voltage imbalance.
The third category corresponds to cases where devices in different phases fail simultaneously. For example, when Sa1 or Sa2 of phase A and Sb1 of phase B are open-circuited, the affected phase currents exhibit their respective missing clamping characteristics, as shown in Fig. 3(e) and (f). This imbalance shifts the neutral-point potential and distorts the healthy phase currents, leading to amplitude fluctuation, even-order harmonics, and interphase oscillations.
Diagnostic framework based on DR-SE-NPCNet and IHBA
OCFs in three-level NPC inverters typically manifest as multi-scale and non-stationary features, such as fundamental waveform distortion, mid- and high-frequency ripples, and slow drift of the DC-bus potential. However, as illustrated in Fig. 4, conventional convolutional neural network (CNN)-based methods generally rely on pooling layers and strided convolutions to enlarge the receptive field and reduce computational complexity. Although this approach expands the coverage range, it inevitably sacrifices temporal resolution, causing weak and short-lived transient fault features in the current to be attenuated or lost. Moreover, due to the limited receptive field and downsampling operations, conventional one-dimensional CNNs are insensitive to long-term dependencies across cycles. In particular, under varying operating conditions, critical fault segments are weakened or distorted, leading to blurred inter-class boundaries and ultimately restricting diagnostic accuracy under complex operating scenarios.
Fig. 4. Infrastructure of the CNN diagnosis model.
As shown in Fig. 5, the DR-SE-NPCNet first employs the TRP-TD backbone to extract multi-scale temporal features without any pooling, and subsequently refines these features through the ReSE-CF module using residual enhancement and channel-wise attention. This forms a coherent feature-processing pipeline in which all pooling and strided-downsampling operations are removed from the temporal extraction path, thereby preventing the loss of transient fault information.
Fig. 5. Structure of the proposed DR-SE-NPCNet.
TRP-TD backbone
The TRP-TD backbone is designed to address the limitations of CNN structures that rely on pooling or strided convolution, both of which tend to suppress fast transient information. Since the input is one-dimensional sequential data, dilation is applied only along the temporal axis. By incorporating a dilation factor into the convolution kernel, the model aggregates temporal information at intervals of d without increasing the number of parameters, thereby enlarging the effective receptive field while preserving sample-level temporal resolution. This enables the extraction of both short-term dynamic variations and slower cross-cycle trends in the current signals. Although the subsequent ReSE-CF module uses global average pooling (GAP) to compute channel weights, this operation is confined to the attention branch and does not alter the temporal resolution, thereby preserving the no-pooling and no-strided-downsampling principle of the TRP-TD backbone.
In CNNs, the receptive field determines the temporal context that the model is able to perceive. The one-dimensional dilated convolution employed in the TRP-TD backbone is defined in Eq. (1):
where y[t] denotes the output sequence, x[t] is the input sequence, w[k] represents the convolution kernel, and d is the dilation factor. When d = 1, the formulation reduces to a standard convolution. When d > 1, dilation introduces gaps along the temporal dimension and expands the effective kernel length, which is expressed in Eq. (2):
where f denotes the actual kernel size. With stride fixed at 1 and no pooling, the effective receptive field of TRP-TD expands layer-by-layer in proportion to the effective kernel length feff, thereby enabling long-range temporal coverage at full resolution without introducing additional parameters.
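For reference, a common way to write a one-dimensional dilated convolution and its effective kernel length with the symbols defined above is sketched here. This is a standard textbook form and is only assumed to agree with Eqs. (1) and (2); the index range and the sign inside x[·] are conventions that may differ from the paper's notation (deep learning frameworks typically use t + d·k). The layer index l and the receptive-field symbol R_L are introduced for illustration.

```latex
% Dilated 1-D convolution with kernel size f and dilation factor d (assumed form)
y[t] = \sum_{k=0}^{f-1} w[k]\, x[t - d\,k]

% Effective kernel length; d = 1 recovers f_eff = f
f_{\mathrm{eff}} = d\,(f - 1) + 1

% Receptive field after L stacked stride-1 layers without pooling
R_L = 1 + \sum_{l=1}^{L} \bigl( f_{\mathrm{eff}}^{(l)} - 1 \bigr)
```

As a worked instance with a hypothetical kernel size f = 3 and d = 2, the effective kernel length is 2·(3 − 1) + 1 = 5 samples, while the number of weights stays at 3.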
As shown in Fig. 6, although dilated convolutions can substantially enlarge the receptive field without introducing additional parameters, improper design may introduce two risks. The first is the occurrence of gridding artifacts, where an excessively large dilation factor creates gaps between adjacent receptive fields, resulting in local information loss and weakened feature representation. The second is temporal phase alignment deviation. When input current waveforms undergo temporal stretching due to variations in fundamental frequency, modulation index, or load conditions, fixed strides or pooling operations can lead to misalignment between the feature map and the original signal. Both risks hinder the effective capture of short-term non-stationary features. Therefore, appropriate structural constraints and complementary strategies are required to ensure temporal consistency and feature integrity while maintaining full resolution.
Fig. 6. Receptive field expansion with (a) standard and (b) dilated convolution.
To mitigate these issues, the TRP-TD backbone adopts several lightweight yet effective refinements. The dilation factor is fixed at d = 2 to balance long-term dependency modeling and local feature coverage; each dilated convolution is followed by a non-dilated convolution to smooth potential gridding effects; stride = 1 with consistent padding is applied across the network to maintain pointwise alignment; and full-length features are preserved so that the ReSE-CF module can compute channel weights over the complete temporal domain. A 1 × 1 residual mapping further ensures phase alignment during cross-layer fusion. Together, these refinements enhance temporal consistency and stability while preserving full resolution.
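The effect of these refinements can be illustrated numerically. The following is a minimal NumPy sketch, not the authors' implementation: it applies a dilated layer with d = 2 followed by a non-dilated layer, both with stride 1 and zero "same" padding, to a unit impulse. The kernel size of 3, the all-ones weights, and the function names are assumptions chosen for illustration, and the indexing follows the cross-correlation convention used by most deep learning frameworks.

```python
import numpy as np


def dilated_conv1d_same(x, w, d=1):
    """Stride-1 dilated 1-D convolution with zero 'same' padding.

    The output has the same length as the input, so every output sample
    stays aligned with the input sample at the same index.
    """
    f = len(w)
    f_eff = d * (f - 1) + 1            # effective kernel length
    left = (f_eff - 1) // 2
    right = f_eff - 1 - left
    xp = np.pad(x, (left, right))      # zero padding on both sides
    return np.array(
        [sum(w[k] * xp[t + d * k] for k in range(f)) for t in range(len(x))]
    )


def receptive_field(layers):
    """Receptive field of stacked stride-1 layers given (kernel_size, dilation)."""
    return 1 + sum(d * (f - 1) for f, d in layers)


x = np.zeros(64)
x[32] = 1.0                            # unit impulse standing in for a short transient
w = np.ones(3)                         # hypothetical kernel of size 3

h = dilated_conv1d_same(x, w, d=2)     # dilated layer: response has gaps
y = dilated_conv1d_same(h, w, d=1)     # non-dilated layer: gaps are filled

print("length preserved:", len(y) == len(x))
print("after dilated layer:", np.flatnonzero(h))
print("after smoothing layer:", np.flatnonzero(y))
print("receptive field:", receptive_field([(3, 2), (3, 1)]))
```

In this toy setting the dilated layer alone responds only at every second sample around the impulse, which is the gridding pattern described above, and the following non-dilated layer turns that into a contiguous response without changing the sequence length.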
Consequently, the proposed TRP-TD backbone differs fundamentally from conventional dilated CNNs. It removes all pooling and strided convolutions along the temporal axis and restricts dilation strictly to the time dimension, thereby enabling full-resolution temporal modeling. This design enlarges the receptive field while preserving sample-level temporal alignment, effectively avoiding the loss of temporal resolution and the irregular transient sampling commonly introduced by existing dilated-CNN structures. Thus, TRP-TD is not a reuse of standard dilated CNNs but rather a dedicated full-resolution temporal backbone.
ReSE-CF backbone
In practical fault diagnosis, conventional CNNs often attempt to enhance representational capacity by deepening the network or expanding convolutional kernels; however, this approach frequently leads to training instability, gradient vanishing, and overfitting, particularly when available data are limited. Moreover, assigning equal weights to all channels ignores the inherent differences in feature sensitivity, which may dilute discriminative information and hinder effective cross-layer propagation, thereby further constraining the model’s fault recognition capability.
To address these limitations, this paper proposes the ReSE-CF backbone, which integrates residual learning with a squeeze-and-excitation (SE) mechanism. The residual path employs a 1 × 1 convolution, batch normalization (BN), and identity shortcuts to stabilize gradient flow, while the SE path performs global average pooling (GAP) and nonlinear channel recalibration to emphasize fault-related channels and suppress those dominated by noise.
The core principle of ReSE-CF is calibrated residual coupling. As illustrated in Fig. 5, the main-branch features are first recalibrated by the SE module and then added to the unweighted residual path, forming a stable and adaptive fusion as expressed in Eq. (3):
Here xskip denotes the identity shortcut that ensures stable gradient propagation, F(x) represents the transformed main-branch feature, and s is the channel-wise weight vector. Importantly, weighting is applied only to the main branch, whereas the residual path remains an identity mapping, providing a stable reference and preventing excessive suppression. This design ensures that the residual shortcut preserves a reliable gradient pathway while the SE mechanism selectively modulates only the main branch.
For stability analysis, we focus on the Lipschitz constant of the mapping in Eq. (3). Under mild assumptions, the Lipschitz constant of y with respect to x satisfies:
where LF denotes the Lipschitz constant of F and || s || ≤ 1. This inequality indicates that the SE-gated coupling constrains the Lipschitz constant of the mapping, preventing gradient explosion while maintaining the stability of the residual transmission.
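One way to see where such a bound comes from is sketched below. It is a reconstruction under stated assumptions rather than a restatement of Eqs. (3) and (4): the shortcut is taken as an exact identity (xskip = x), the gate s is treated as a fixed vector with entries in [0, 1] (its dependence on x is ignored), and ⊙ denotes channel-wise multiplication.

```latex
% Fusion with the gate applied to the main branch only (assumed form of Eq. (3))
y(x) = x_{\mathrm{skip}} + s \odot F(x), \qquad x_{\mathrm{skip}} = x

% Triangle inequality plus the Lipschitz property of F
\| y(x_1) - y(x_2) \|
  \le \| x_1 - x_2 \| + \| s \|_{\infty}\, \| F(x_1) - F(x_2) \|
  \le \bigl( 1 + \| s \|_{\infty} L_F \bigr)\, \| x_1 - x_2 \|

% With \| s \|_{\infty} \le 1 the gated block is no less stable than the ungated one
L_y \le 1 + \| s \|_{\infty} L_F \le 1 + L_F
```

Under these assumptions the gate can only shrink the contribution of the main branch to the overall Lipschitz constant, while the identity term keeps a unit-gain path for gradients.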
The Lipschitz upper bound above primarily characterizes the non-divergence of the first-order gradient. To further elucidate its role in suppressing noise and redundant features, a more detailed analysis from the perspective of output variance is required. ReSE-CF achieves a balance between trainability and noise robustness by applying SE weighting exclusively to the main branch while preserving the residual path as a pure identity mapping, thereby ensuring stable gradient propagation. Mathematically, if the channel weights are defined as shown in Eq. (5):
then the approximate variance of the output can be expressed as shown in Eq. (6):
Here, the unweighted xskip provides a stable residual baseline, while the fine-grained modulation introduced by s effectively suppresses the variance contributions from noise-dominated or redundant features.
Under complex operating conditions, this combination of an identity residual path and SE-based channel gating helps maintain clear decision boundaries while attenuating irrelevant information, thereby offering a concise yet rigorous explanation of the noise-robust behavior of ReSE-CF.
To address potential scale or phase inconsistencies in residual networks, the residual branch is formulated as shown in Eq. (5). In practice, this branch is implemented using a 1 × 1 convolution followed by batch normalization and ReLU activation, which together ensure proper scale and phase alignment before fusion with the SE-calibrated main-branch features. This lightweight alignment step preserves the identity-mapping property and maintains stable gradient propagation.
Meanwhile, the SE weights of the main branch are generated according to Eq. (7):
where GAP extracts global statistical information to suppress local noise fluctuations, and the two fully connected layers W1 and W2 compress and restore dimensionality to generate channel-adaptive scaling factors. Through this mechanism, the SE module emphasizes fault-related informative channels while attenuating those dominated by noise, thereby enabling dynamic attention with minimal computational cost.
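The gating and fusion steps described in this section can be sketched in a few lines of NumPy. This is a minimal illustration under assumptions, not the paper's implementation: the hidden nonlinearity is taken to be ReLU and the output gate a sigmoid (the usual SE choice), biases are omitted, and the channel count, sequence length, and reduction ratio are hypothetical values.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def se_weights(feat, W1, W2):
    """Channel weights s for a feature map of shape (channels, time)."""
    z = feat.mean(axis=1)               # GAP over the time axis -> (channels,)
    hidden = np.maximum(W1 @ z, 0.0)    # compress, then ReLU (assumed nonlinearity)
    return sigmoid(W2 @ hidden)         # restore, then squash into (0, 1)


def rese_cf_fuse(x_skip, feat, W1, W2):
    """Gate only the main branch, then add the unweighted shortcut."""
    s = se_weights(feat, W1, W2)
    return x_skip + s[:, None] * feat, s


rng = np.random.default_rng(0)
C, T, r = 8, 128, 4                     # hypothetical channels, length, reduction ratio
x_skip = rng.standard_normal((C, T))    # shortcut features
feat = rng.standard_normal((C, T))      # main-branch features F(x)
W1 = 0.1 * rng.standard_normal((C // r, C))
W2 = 0.1 * rng.standard_normal((C, C // r))

y, s = rese_cf_fuse(x_skip, feat, W1, W2)
print(y.shape, float(s.min()), float(s.max()))
```

Because GAP is used only to compute s, the fused output keeps the full temporal length of the input, and the shortcut term passes through without any scaling.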
This pre-fusion coupling strategy allows ReSE-CF to preserve the stability of the identity shortcut while selectively enhancing discriminative features on the main branch. Under diverse operating conditions, it strengthens channel-level selectivity, maintains cross-layer consistency, and ensures that adaptive channel recalibration does not introduce phase or scale distortions. Collectively, these properties enhance noise robustness and representation quality, thereby supporting the overall diagnostic performance of the proposed DR-SE-NPCNet.
The ReSE-CF module differs fundamentally from conventional SE-ResNet blocks. Unlike standard designs, it applies SE weighting exclusively to the main branch while preserving the residual shortcut as a pure identity mapping. Moreover, the SE operation is positioned before feature fusion, ensuring that the shortcut provides a stable, unaltered reference signal. This calibrated residual coupling produces a controlled gain on the main branch, maintains a clean gradient pathway, and effectively suppresses noise-dominated channels. As a result, ReSE-CF delivers enhanced stability and robustness compared with traditional SE-ResNet structures, particularly in noisy or non-stationary operating environments.
Improved honey badger algorithm
In complex operating conditions, hyperparameters such as the learning rate and batch size exert a significant influence on the training stability and generalization performance of deep learning-based diagnostic models. However, the optimal hyperparameter configuration is highly sensitive to the coupling between data distribution and variations in operating conditions. Manual tuning or grid search is often inefficient and prone to non-convergence. Therefore, the introduction of intelligent optimization algorithms for automated hyperparameter search becomes essential.
The Honey Badger Algorithm (HBA) is a recently proposed metaheuristic swarm-intelligence optimization method inspired by the highly flexible and globally explorative foraging behavior of honey badgers. HBA employs a “digging” process to perform nonlinear, jump-like exploration and a “honey-seeking” process for local exploitation around the current best solution, enabling rapid early-stage expansion of the search space and facilitating escape from local optima. However, the traditional HBA still faces several challenges. In high-dimensional and non-convex search spaces, the digging phase tends to overexploit the current best solution, resulting in a rapid loss of population diversity and premature convergence. In later iterations, the honey-seeking phase shrinks the search radius excessively, leading to insufficient global exploration and eventual search stagnation. Moreover, its adaptability and stability under varying operating conditions remain limited.
To address these issues, this study proposes an IHBA, which enhances convergence efficiency and stability in complex hyperparameter spaces, thereby providing stronger support for the training and deployment of DR-SE-NPCNet under non-stationary operating conditions.
Incorporation of the osprey attack strategy
In the standard HBA, the digging-phase update equation typically employs a cosine function to model nonlinear oscillations, thereby enhancing the stochasticity and global exploration capability of the search process. However, in high-dimensional and non-convex spaces, this mechanism often encounters two limitations. First, during the early iterations, excessive reliance on the current best solution reduces population diversity and leads to premature convergence. Second, in the later iterations, the oscillation amplitude of the cosine function becomes too small to sustain global exploration, causing solutions to cluster and further exacerbating stagnation. Moreover, when population updates lack balance, individuals can easily become trapped in local attraction basins, further restricting global search efficiency.
To address these limitations and improve global exploration, an osprey attack strategy (OAS) is incorporated into the digging phase. OAS promotes broad exploration during early iterations and gradually guides the population toward convergence in the later stage, thereby maintaining diversity through dynamic balancing. Specifically, OAS introduces random perturbations to prevent unidirectional fixation of individuals and integrates a mean-shift mechanism to generate new search directions across the solution space, thus improving the ability to escape local optima. With this modification, HBA not only maintains fast convergence but also achieves better stability and distribution balance in high-dimensional hyperparameter optimization. To ensure reproducibility, the hyperparameter settings used in this study are summarized in Table 2. The dual-mode update strategy is formulated as shown in Eq. (8):
where $X_i^j(t+1)$ represents the updated position of the i-th individual in the j-th dimension at iteration t + 1. The index i identifies the individual in the population and takes values in {1, 2, …, N}, where N is the population size. The term $X_{prey}^j(t)$ denotes the global best solution in the j-th dimension at iteration t, while $F \in \{1, -1\}$ is a stochastic direction control factor. The variable t denotes the current iteration index, and T denotes the maximum number of iterations. The time-decay term $(1 - t/T)$ decreases linearly from 1 to 0 as the iteration proceeds. The density factor $\alpha(t) = C e^{-t/T}$ decreases exponentially with iteration t. The parameter β is a fixed scaling factor, and $\mu^j(t)$ is the population mean in the j-th dimension. The term Δx is a differential perturbation used to enhance search diversity. The variables $r_k \in [0, 1]$ are independent uniformly distributed random variables used in OAS to introduce oscillatory randomness and perturbation.
The variable rand, uniformly distributed in [0, 1], acts as the switch between the two update modes: when rand > 0.5, the original local exploitation mode is applied; when rand ≤ 0.5, the OAS global-exploration mode is applied. By incorporating the difference between $\mu^j(t)$ and $X_{prey}^j(t)$, the update rule maintains broad global exploration during early iterations while progressively strengthening local exploitation as iterations proceed, thereby achieving a balanced trade-off between exploration and exploitation.
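The scheduling terms and the mode switch described above can be sketched in a few lines. This is only an illustration of the quantities named in the text (the time-decay term, the density factor, and the rand threshold), not a reproduction of Eq. (8); the function names are hypothetical.

```python
import math


def time_decay(t: int, T: int) -> float:
    """Linear time-decay term (1 - t/T): equals 1 at t = 0 and 0 at t = T."""
    return 1.0 - t / T


def density_factor(t: int, T: int, C: float) -> float:
    """Density factor alpha(t) = C * exp(-t / T), decreasing exponentially in t."""
    return C * math.exp(-t / T)


def select_mode(rand: float) -> str:
    """Mode switch from the text: rand > 0.5 selects local exploitation,
    rand <= 0.5 selects the OAS global-exploration mode."""
    return "exploit" if rand > 0.5 else "oas"
```

Under this schedule, both the time-decay term and the density factor are largest at the first iteration, so any perturbation they scale is largest early in the run and shrinks as t approaches T.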
Incorporation of the secretary bird aerial search strategy
In the original HBA, the honey-searching phase updates each individual based on the best solution $X_{prey}^j(t)$ and the control term $F \cdot r_5 \cdot \alpha(t)$. While this strategy enables rapid convergence toward local optima, its strong dependence on $X_{prey}^j(t)$ causes individuals to cluster tightly around the current best solution, thereby reducing diversity and resulting in premature convergence. As iterations progress, the density attenuation factor $\alpha(t)$ continuously shrinks the search radius, further weakening global exploration. Although the incorporation of OAS enhances exploration to some extent, the solutions may still become excessively concentrated near the best solution in later iterations, which further restricts diversity.
To alleviate this limitation and enhance population diversity, an SBAS based on random differential search is incorporated into the honey-searching phase. Instead of relying solely on the best solution, SBAS constructs a differential vector by randomly selecting two distinct individuals from the population and uses it to update positions. This stochastic differential mechanism increases distributional diversity while still preserving convergence capability. The corresponding update strategy is formulated as shown in Eq. (9):
where $X_i^j(t)$ denotes the position of the i-th individual in the j-th dimension at iteration t. This term serves as the baseline state in the SBAS update mechanism, ensuring that each individual retains its current search information and does not drift away from its inherent trajectory during the update process. The indices $s_k$ represent mutually distinct individuals randomly selected from the population, where $s_k \in \{1, 2, \ldots, N\}$. The term $X_{s_1}^j(t) - X_{s_2}^j(t)$ denotes the differential perturbation constructed from the positions of the two randomly selected individuals in the j-th dimension at iteration t. This perturbation injects cross-individual information interaction into the population and acts as the core driver of the SBAS differential update, thereby enhancing search diversity and strengthening global exploration capability.
The mechanism achieves a dynamic balance between global exploration and local exploitation: the differential update broadens the global search range, whereas density attenuation and stochastic control maintain effective local refinement. Consequently, the modified algorithm significantly alleviates the loss of diversity and premature convergence that frequently arise in the mid-to-late stages of the traditional HBA.
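The random differential update can be illustrated with a minimal sketch. It assumes a single scalar step coefficient `scale` standing in for whatever control term multiplies the differential vector in Eq. (9); that coefficient, the function name, and the choice not to exclude individual i from the draw are assumptions made for illustration only.

```python
import random


def sbas_update(population, i, scale, rng):
    """Return a candidate position for individual i built from a random differential vector.

    population: list of positions, each a list of floats
    scale: hypothetical step coefficient (assumption, not taken from Eq. (9))
    rng: random.Random instance, so that runs are reproducible
    """
    # Draw two mutually distinct individuals s1 != s2 from the population.
    s1, s2 = rng.sample(range(len(population)), 2)
    current = population[i]
    # Baseline state plus the scaled difference of the two selected individuals.
    return [
        x + scale * (a - b)
        for x, a, b in zip(current, population[s1], population[s2])
    ]
```

Because the step direction comes from two arbitrary population members rather than from the best solution, the spread of candidates tracks the spread of the population itself, which is the diversity-preserving effect the text describes.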
Mirror-enhanced multi-differential mutation strategy
Although OAS and SBAS improve global exploration to some degree, they still exhibit shortcomings. OAS, due to its reliance on mean shifts, tends to over-concentrate individuals around the current best solution, limiting exploration of distant regions in the solution space. SBAS, while introducing strong randomness through differential perturbations, cannot continuously preserve population diversity, thus being prone to local optima in later exploration.
To address these issues, this study proposes the MEMDMS. After each iteration, additional perturbations are first applied to prevent overly rapid convergence; subsequently, random differential mutations are introduced to enhance diversity and broaden search coverage; finally, opposite solutions are generated through opposite learning to explore mirrored regions of the solution space, so that potential optima are not missed because of directional bias.
The multi-differential mutation–based position update formula is given in Eq. (10):
where $X_i^{new}(t+1)$ denotes the final updated state of the i-th individual at iteration t + 1, generated through the coordinated effect of multi-differential mutation and mirror-opposite learning. The temporary state $X_i^{temp}(t+1)$, obtained from the OAS or SBAS update, ensures that the secondary mutation is conducted on an already optimized trajectory. The differential vectors inject cross-individual information flow and diversify the search direction. The coefficient G, drawn from a uniform distribution over (0, 1), regulates the scale of the differential mutation. When rand > 0.5, the dual differential-mutation operator is activated; otherwise, the temporary state is retained, ensuring stable convergence without loss of exploration capability.
The mirror-opposite selection and update process is defined by Eqs. (11) and (12):
where $X_i^{opposite}(t+1)$ denotes the mirrored opposite position of the i-th individual at iteration t + 1, computed from $X_i^{new}(t+1)$. The vectors ub and lb denote the upper-bound and lower-bound constraints of the decision variables, defining the permissible search space for the optimization problem.
This strategy integrates dual-differential mutation and opposite learning to counter premature convergence in HBA. Dual-differential mutation introduces stronger differential perturbations, expanding the search range and enhancing global exploration, while opposite learning probes mirrored regions to uncover potential high-quality solutions, thereby improving the utilization of the solution space. Both operations consist of simple vector arithmetic, so the strategy maintains low computational complexity.
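A hedged sketch of the two MEMDMS steps follows. It assumes the common opposition-based form lb + ub − x for the mirrored point, a single coefficient G shared by both differential vectors, and a greedy comparison for the selection step; these are illustrative assumptions, since Eqs. (10) to (12) are not reproduced here, and all function names are hypothetical.

```python
import random


def multi_differential_mutation(x_temp, population, rng):
    """When rand > 0.5, add two random differential vectors scaled by G ~ U(0, 1);
    otherwise keep the temporary state unchanged. Needs at least 4 individuals."""
    if rng.random() <= 0.5:
        return list(x_temp)
    g = rng.random()  # coefficient G (assumed shared by both differential vectors)
    r1, r2, r3, r4 = rng.sample(range(len(population)), 4)
    return [
        x
        + g * (population[r1][j] - population[r2][j])
        + g * (population[r3][j] - population[r4][j])
        for j, x in enumerate(x_temp)
    ]


def mirror_opposite(x, lb, ub):
    """Mirrored point lb + ub - x in every dimension (assumed opposition form)."""
    return [lo + hi - v for v, lo, hi in zip(x, lb, ub)]


def greedy_select(x_new, x_opp, objective):
    """Keep whichever candidate has the lower objective value (assumed selection rule)."""
    return x_opp if objective(x_opp) < objective(x_new) else x_new
```

With bounds lb = [-5, 0] and ub = [5, 10], the point [1, 2] mirrors to [-1, 8]; applying the mirror twice returns the original point, so the operator only ever proposes one extra candidate per individual.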
In summary, IHBA integrates OAS, SBAS, and MEMDMS in a coordinated and phase-separated manner. OAS enhances early global exploration, SBAS preserves mid-iteration diversity through differential updates, and MEMDMS refines solutions via opposition-based multi-differential mutation in the late stage. By assigning each strategy to a distinct phase rather than stacking multiple operators within a single update rule, IHBA stabilizes convergence, mitigates population stagnation, and reduces premature convergence, thereby providing an efficient and robust hyperparameter optimization engine for DR-SE-NPCNet.
Based on the above analysis, the main innovations of the proposed method are summarized as follows:
(1) TRP-TD backbone: By eliminating pooling and strided downsampling while introducing dilated convolution along the time axis, the TRP-TD backbone expands the receptive field without reducing temporal resolution, thereby capturing both short-term transient features and long-term cross-cycle variations.
(2) ReSE-CF backbone: The ReSE-CF backbone adopts a calibrated residual coupling structure. While preserving the stability of the identity shortcut, it incorporates an SE module to recalibrate channel responses, enabling adaptive fusion and selection of multi-scale features.
(3) IHBA optimization: An improved HBA algorithm is introduced for automatic optimization of learning rate and batch size, thereby improving global search efficiency and ensuring stable and generalizable convergence.
Diagnostic procedure
The diagnostic procedure is illustrated in Fig. 7, and comprises signal acquisition, data preprocessing and dataset construction, model development, hyperparameter search, training and evaluation, followed by classification inference and result output.
Proposed diagnostic flowchart for OCF detection in NPC three-level inverters.
(1) Signal acquisition: Three-phase current signals of the NPC three-level inverter are collected as the raw input for modeling and diagnosis.
(2) Dataset construction and preprocessing: The signals are segmented into fixed-length windows, normalized using Min–Max scaling, and reorganized to preserve temporal order and inter-phase correlation. Samples are labeled and split into training and testing sets.
(3) Model construction: The dual-backbone DR-SE-NPCNet is built, with the TRP-TD branch preserving temporal resolution and enlarging the receptive field, and the ReSE-CF branch performing channel adaptation and residual calibration.
(4) IHBA-based hyperparameter optimization: IHBA is employed to search for the optimal learning rate and mini-batch size by minimizing validation error.
(5) Model training: The model is trained with the optimized hyperparameters until convergence using forward/backward propagation and early stopping to ensure stable learning.
(6) Model evaluation: Diagnostic accuracy and the confusion matrix are computed on the test set, and t-SNE visualization is applied to assess feature separability and generalization.
(7) Classification output: The specific OCF category is determined, and the diagnostic result is output.
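Step (4) can be pictured as mapping each optimizer position to a (learning rate, batch size) pair and scoring it by validation error. The sketch below is one plausible encoding, not the authors' implementation: the search ranges, the log-scale decoding of the learning rate, and the names `decode`, `fitness`, and `train_and_validate` are all assumptions introduced for illustration.

```python
import math

LR_BOUNDS = (1e-4, 1e-2)   # hypothetical search range, not taken from the paper
BATCH_BOUNDS = (16, 128)   # hypothetical search range, not taken from the paper


def decode(position):
    """Map a 2-D position in [0, 1]^2 to (learning_rate, batch_size)."""
    # Clip each coordinate to [0, 1] so out-of-range positions stay feasible.
    u_lr = min(max(position[0], 0.0), 1.0)
    u_bs = min(max(position[1], 0.0), 1.0)
    # Learning rate is decoded on a log10 scale between the two bounds.
    lo, hi = math.log10(LR_BOUNDS[0]), math.log10(LR_BOUNDS[1])
    learning_rate = 10.0 ** (lo + u_lr * (hi - lo))
    # Batch size is decoded linearly and rounded to an integer.
    span = BATCH_BOUNDS[1] - BATCH_BOUNDS[0]
    batch_size = int(round(BATCH_BOUNDS[0] + u_bs * span))
    return learning_rate, batch_size


def fitness(position, train_and_validate):
    """Validation error to minimize. `train_and_validate` is a user-supplied
    callable (hypothetical) returning validation accuracy in [0, 1]."""
    learning_rate, batch_size = decode(position)
    return 1.0 - train_and_validate(learning_rate, batch_size)
```

Decoding the learning rate on a log scale is a common design choice because useful values typically span orders of magnitude; a linear encoding would spend most of the search range on the largest values.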
Experimental validations
To comprehensively verify the effectiveness and robustness of the proposed DR-SE-NPCNet and the IHBA, a series of experiments was conducted. A hardware NPC three-level inverter platform was built to collect fault data and evaluate diagnostic accuracy. Subsequently, noise interference tests and ablation experiments were performed to evaluate the model’s noise suppression capability and analyze the independent and synergistic effects of the TRP-TD and ReSE-CF modules. Furthermore, robustness and generalization were examined under diverse and mixed operating conditions, including variations in load level, modulation index, DC-bus voltage, and output frequency. The global optimization performance of IHBA was assessed using CEC2020 benchmark functions and further tested through its application to hyperparameter optimization of DR-SE-NPCNet. Finally, t-SNE visualization was conducted to reveal layer-wise feature separability.
Experimental platform setup and data processing
An NPC three-level inverter experimental platform was designed and constructed as shown in Fig. 8 to validate the diagnostic performance of the proposed DR-SE-NPCNet. The platform consisted of the inverter main circuit, control and driver units, signal acquisition subsystem, and fault diagnosis network. Following standard NPC fault-diagnosis practice, the experiments were conducted without introducing additional abnormalities. The main circuit was composed of the DC bus and three-phase bridge arms, each arm containing four IGBT devices, and key experimental parameters are listed in Table 3. The DC power source was stabilized by capacitors and the output was filtered by an LCL filter to suppress harmonics. The control unit employed a TMS320F28335 DSP to drive the IGBTs via fiber-optic drivers and integrated protection functions. The three-phase currents measured by sensors were used both for closed-loop control and as diagnostic features. Open-circuit faults were emulated by disabling the gate signal of the selected IGBT, providing a safe, controllable, and repeatable means to reproduce open-circuit behavior. This gate-signal–based method is widely adopted in multilevel inverter studies and accurately reflects real hardware conditions, as the loss of conduction path produces current distortion identical to that in actual device failures.
Experimental platform of the NPC three-level inverter.
The proposed fault diagnosis model was implemented in MATLAB R2021b. The experimental environment consisted of an Intel i7-8750H CPU, 32 GB RAM, and an NVIDIA RTX 2070 GPU, which together provided sufficient computational capability for large-scale data processing and deep learning tasks. Model training employed the Adam optimizer, and the learning rate and batch size were optimized using the IHBA.
The acquired three-phase current signals were segmented into fixed-length windows and normalized to [−1, 1] to suppress amplitude variations and accelerate convergence. Since variations in the input window length had only minimal impact on model performance, a 2000-sample window was adopted as the default setting, as it provides stable feature representations and aligns with common practice in multilevel inverter OCF diagnosis. Each phase produced 150 valid samples per class, so the three phases together yielded 450 columns and the 19 classes formed 19 matrices of size 2000 × 450, which were split into training and testing subsets using an 8:2 ratio.
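The preprocessing steps above can be sketched as follows. The sketch assumes non-overlapping windows, per-window Min–Max scaling, and a seeded shuffle before the 8:2 split; none of these details is stated in the text, and the function names are hypothetical.

```python
import random


def segment(signal, window):
    """Split a 1-D sequence into non-overlapping fixed-length windows (tail dropped)."""
    return [signal[k:k + window] for k in range(0, len(signal) - window + 1, window)]


def scale_to_unit_range(window):
    """Min-Max scale one window to [-1, 1]; a constant window maps to zeros."""
    lo, hi = min(window), max(window)
    if hi == lo:
        return [0.0 for _ in window]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in window]


def train_test_split(samples, train_ratio=0.8, seed=0):
    """Shuffle indices with a fixed seed, then split by train_ratio (8:2 by default)."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_ratio * len(idx)))
    return [samples[k] for k in idx[:cut]], [samples[k] for k in idx[cut:]]
```

For example, 150 samples split at 8:2 give 120 training and 30 testing samples.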
To ensure comprehensive evaluation, a systematic analysis of open-circuit faults in the NPC three-level inverter was performed. Although 73 theoretical fault modes existed across five categories as shown in Table 1, many dual-device and cross-phase combinations exhibited redundant or physically equivalent current signatures. To avoid class proliferation and imbalance, a representative sampling strategy was adopted: all modes in the first three categories were retained, while only waveform-distinct cases from the remaining two categories were selected. Together with the normal state, six single-device faults, and six same-phase half-bridge combinations, a total of 19 non-overlapping representative classes were constructed. This reduced set preserves all unique transient and steady-state current characteristics inherent in the full 73-mode space while eliminating redundant variants, thereby maintaining fault diversity and ensuring the generalization capability of the diagnostic model.
The proposed DR-SE-NPCNet diagnostic network takes a current sequence of length 2000 as its input. The convolutional layers use kernel sizes of 8, 16, 32, and 32, with output channels of 32, 64, 128, and 256, respectively. The third layer incorporates a dilation factor of 2 to enlarge the receptive field without increasing parameters, thereby enabling cross-cycle feature extraction. Each convolutional block is followed by an SE-based channel recalibration module, which generates channel weights via global average pooling while preserving temporal resolution. The extracted features are mapped to 19 fault classes through fully connected layers and a Softmax classifier, with a dropout rate of 0.1 applied before the FC layer to enhance generalization. Overall, the structure achieves multi-scale feature extraction and stable classification while maintaining full temporal resolution.
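The effect of the dilation factor on the receptive field can be checked with a short calculation. The sketch assumes the four convolutional layers are stacked sequentially with stride 1 (consistent with the absence of pooling and strided downsampling, but the exact wiring is an assumption), in which case the receptive field is 1 + Σ (k − 1)·d over the layers.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of stacked stride-1 1-D convolutions: 1 + sum((k - 1) * d)."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf


kernels = [8, 16, 32, 32]
print(receptive_field(kernels, [1, 1, 2, 1]))  # 116 samples with dilation 2 in layer 3
print(receptive_field(kernels, [1, 1, 1, 1]))  # 85 samples without dilation
```

Under these assumptions, the single dilated layer widens the receptive field from 85 to 116 samples without adding any weights, since dilation only changes which input positions each kernel tap reads.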
To confirm that removing temporal pooling does not compromise real-time feasibility, a concise computational-cost assessment was performed. For a 2000-sample input, the proposed DR-SE-NPCNet requires approximately 1.05 × 10⁸ MACs (≈ 0.10 GMAC or 0.21 GFLOPs) and contains 1.27 million parameters, placing it within the sub-GFLOP and million-parameter range typical of lightweight 1D CNNs. Most of the computational load originates from the temporal convolution layers in the TRP-TD backbone, whereas the SE calibration branch contributes less than 2% of the total cost. The measured inference latency is 0.35 ms on an RTX 2070 GPU and 4.8 ms on an Intel i7 CPU, both substantially lower than the temporal duration represented by the 2000-sample window. These results confirm that the model, even without temporal pooling, fully satisfies the real-time requirements for open-circuit fault diagnosis.
Overall performance evaluation and comparative analysis
Performance evaluation metrics
The performance of the diagnostic model is evaluated using commonly adopted classification metrics, including accuracy and the confusion matrix. Accuracy is defined as follows:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where TP denotes true positive, TN denotes true negative, FP denotes false positive, and FN denotes false negative. The confusion matrix is a C×C matrix that summarizes the numbers of correct and incorrect predictions across all C fault categories, providing a comprehensive overview of model behavior and class-wise discriminability.
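For the multi-class case, overall accuracy is the sum of the diagonal of the C×C confusion matrix divided by the total number of samples. A minimal sketch, assuming integer class labels in 0…C−1 and the convention that rows index the true class and columns the predicted class (the text does not state the orientation):

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Build a C x C matrix with cm[true][pred] counting each (true, predicted) pair."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def accuracy(cm):
    """Fraction of correct predictions: diagonal sum over total count."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total
```

Off-diagonal entries show which fault classes are confused with each other, which a single accuracy figure cannot reveal.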
During training, both the loss function and diagnostic accuracy were tracked to evaluate convergence. As shown in Fig. 9, the loss function gradually decreases and stabilizes as the number of training epochs increases, while the accuracy rises rapidly and converges at a high level, indicating effective feature capture and stable learning. The coordinated evolution of loss and accuracy demonstrates the absence of severe overfitting or underfitting, thereby confirming good convergence. Figure 10 presents the confusion matrix for the 19 fault types, where most categories were accurately distinguished with low misclassification rates, and the model maintained balanced classification performance across categories, highlighting its robustness and reliability.
Variation trends of diagnostic accuracy and loss function during DR-SE-NPCNet training. (a) Diagnostic accuracy. (b) Loss function.
Confusion matrix of diagnostic results for 19 types of faults.
Noise suppression capability and ablation studies
To validate the reliability of the proposed method under realistic operating conditions, noise interference tests were first conducted. In power electronic systems, sampling and commutation processes are often affected by electromagnetic and measurement noise, and evaluations performed solely on ideal datasets are likely to be overestimated. To address this issue, Gaussian white noise was superimposed on the three-phase current signals, and four test scenarios were established: noise-free, signal-to-noise ratios (SNRs) of 30 dB, 20 dB, and 10 dB. Except for the noise level, all other training conditions were kept identical to ensure that diagnostic accuracy could be compared across different SNRs, thereby systematically evaluating the model’s noise suppression capability.
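Superimposing Gaussian white noise at a prescribed SNR amounts to setting the noise variance to the signal power divided by 10^(SNR/10). The sketch below assumes the signal power is the mean square of the clean window (no mean removal) and uses hypothetical function names.

```python
import math
import random


def add_awgn(signal, snr_db, rng):
    """Add zero-mean Gaussian white noise so that the expected SNR equals snr_db."""
    power = sum(v * v for v in signal) / len(signal)
    noise_power = power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB from a clean signal and its noisy counterpart."""
    p_signal = sum(v * v for v in clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(p_signal / p_noise)
```

At 10 dB the noise power is one tenth of the signal power, so for a unit-amplitude sinusoid (power 0.5) the noise standard deviation is about 0.22.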
In this study, only a Boosted Trees classifier is included as the representative traditional baseline, since feature-engineered methods are not compatible with the end-to-end waveform learning framework adopted here. Likewise, computationally heavy architectures such as Transformer–CNN hybrids are not considered, as they do not align with the lightweight design objectives aimed at future online deployment. Therefore, the baseline comparison focuses on one-dimensional CNN models that operate directly on raw current waveforms.
Five comparative models are incorporated to analyze the role of each network component: (1) a standard CNN consisting of convolution and pooling layers; (2) ResCNN, which introduces residual connections to improve training stability; (3) TRP-TD-CNN, which enlarges the receptive field without pooling or downsampling; (4) TRP-TD-ResCNN, which adds residual connections but excludes channel attention; and (5) the proposed DR-SE-NPCNet, which integrates TRP-TD with the complete ReSE-CF module. This stepwise design enables a systematic assessment of the independent contributions of TRP-TD and ReSE-CF.
Compared with these 1D-CNN variants, DR-SE-NPCNet achieves two major enhancements. First, the TRP-TD backbone performs full-resolution temporal modeling, allowing the network to capture short-duration OCF transients without sacrificing sampling density. Second, the ReSE-CF module implements calibrated residual coupling to amplify informative channels while suppressing those dominated by noise. Together, these mechanisms provide superior robustness under load variations and mixed-noise conditions.
As shown in Fig. 11, DR-SE-NPCNet achieved the best performance across all four noise conditions, with accuracies of 98.32%, 97.10%, 95.67%, and 92.83% under noise-free, 30 dB, 20 dB, and 10 dB environments, respectively. Compared with the second-best model, TRP-TD-ResCNN, the improvements were 3.12, 6.29, 8.13, and 12.82 percentage points, respectively. In the most challenging 10 dB environment, DR-SE-NPCNet outperformed Boosted Trees, LSTM, and a standard CNN by 29.86, 24.06, and 20.24 percentage points, respectively. These results demonstrate that Boosted Trees and LSTM experience substantial degradation under strong noise, while the standard CNN also exhibits a significant performance drop. Although ResCNN and TRP-TD-CNN performed reasonably well under moderate noise levels, their accuracies dropped to only 77.20% and 75.98% at 10 dB, respectively. By contrast, DR-SE-NPCNet maintained 92.83% diagnostic accuracy under this harsh condition, indicating a superior ability to suppress noise-dominated channels and preserve short-term texture features, thereby achieving enhanced robustness and discriminability.
Diagnostic accuracy comparison of different models under various noise.
To further assess the contributions of individual structures, progressive ablations were performed on CNN, TRP-TD-CNN, ResCNN, TRP-TD-ResCNN, and DR-SE-NPCNet, with ResCNN serving as a parallel control. Under noise-free conditions, the accuracies of CNN, TRP-TD-CNN, TRP-TD-ResCNN, and DR-SE-NPCNet were 87.43%, 89.11%, 95.20%, and 98.32%, respectively. The accuracy of ResCNN was 92.11%, lower than the 95.20% of TRP-TD-ResCNN, indicating that residual connections alone are less effective than the synergy between TRP-TD and residual structures. A similar trend was observed under 30 dB and 20 dB noise. Specifically, TRP-TD expanded the receptive field without downsampling, raising accuracy from 82.03% to 86.25% under 30 dB. Adding residual connections further improved it to 90.81%. When the SNR dropped to 10 dB, TRP-TD-ResCNN accuracy fell to 80.01%, whereas integrating SE-based channel recalibration in DR-SE-NPCNet increased accuracy to 92.83%, an improvement of 12.82 percentage points. This demonstrates that TRP-TD effectively models long-term dependencies, while ReSE-CF suppresses noise and enhances discriminability through channel selection. Their combined effect ultimately yields a more robust architecture under severe noise conditions.
Robustness and generalization under multiple operating conditions
To verify the robustness of the proposed model in real operating scenarios, its performance was evaluated under multiple conditions. In practical operation, variations in load level, modulation index, DC-bus voltage, and output frequency occur due to environmental changes and control strategies, causing significant drifts in the amplitude, harmonic distribution, and transient characteristics of three-phase currents. If the model is trained and tested only under a single condition, it may perform well for that specific distribution but is prone to misclassification when conditions change or overlap. Therefore, it is essential to assess the anti-interference capability and cross-distribution generalization under multidimensional and multilevel scenarios.
To this end, four types of mixed datasets were constructed on the basis of conventional model comparisons to simulate complex operating conditions. For example, the mixed-load dataset MixLoad includes four load levels (25%, 50%, 75%, and 100%). From each subset, 25% of the samples were randomly selected using a fixed seed to ensure reproducibility. The four subsets were then concatenated and shuffled to form MixLoad. Similarly, datasets representing variations in modulation index, DC-bus voltage, and output frequency were constructed, denoted as MixMI, MixUdc, and MixFreq, respectively. This design not only ensures balanced sampling and reproducibility but also closely reflects real-world scenarios of load switching, modulation variation, voltage fluctuation, and frequency changes, thereby providing a unified and reliable benchmark for subsequent analysis.
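The construction of a mixed dataset such as MixLoad can be sketched as below. The dictionary layout, the function name, and the seed value are assumptions for illustration; the text specifies only that 25% of each subset is drawn with a fixed seed and that the four draws are concatenated and shuffled.

```python
import random


def build_mixed_dataset(subsets, fraction=0.25, seed=42):
    """Draw `fraction` of each subset without replacement, concatenate, and shuffle.

    subsets: dict mapping a condition name (e.g. a load level) to a list of samples
    seed: fixed seed so the same mixed dataset is produced on every run
    """
    rng = random.Random(seed)
    mixed = []
    for samples in subsets.values():
        k = int(len(samples) * fraction)
        mixed.extend(rng.sample(samples, k))
    rng.shuffle(mixed)
    return mixed
```

Drawing the same fraction from every condition keeps the mixed set balanced across operating points, and its size equals that of a single-condition subset when four equally sized subsets are sampled at 25%.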
Experimental results, as shown in Figs. 12, 13, 14 and 15, indicate that under single-condition tests (variations in load, modulation index, voltage, and frequency) the diagnostic accuracies of all models fluctuated only slightly. Boosted Trees and LSTM maintained accuracies of 81–83% and 84–86%, respectively. CNN achieved 86–88%, ResCNN maintained 91–93%, and TRP-TD-ResCNN reached 94–96%. By contrast, the proposed DR-SE-NPCNet consistently exceeded 97%, approaching 99% in most cases. These results demonstrate that, under normal operating conditions, amplitude drift and harmonic variation have limited effects, while the preprocessing steps of normalization and fixed-length segmentation further mitigate amplitude differences, ensuring stable feature distributions across different scenarios.
Diagnostic performance of different models under varying and mixed load conditions.
Diagnostic performance of different models under varying and mixed modulation index conditions.
Diagnostic performance of different models under varying and mixed DC-bus voltage conditions.
Diagnostic performance of different models under varying and mixed output frequency conditions.
In mixed-condition tests, however, performance differences became pronounced. For instance, in MixLoad, the accuracies of Boosted Trees, LSTM, CNN, ResCNN, and TRP-TD-ResCNN dropped to 71.60%, 74.20%, 78.10%, 83.93%, and 86.84%, respectively, as shown in Fig. 12. Similar degradations of 6–15% were observed in MixMI, MixUdc, and MixFreq, reflecting blurred decision boundaries and heightened sensitivity to distribution shifts, as shown in Figs. 13, 14 and 15. In contrast, DR-SE-NPCNet maintained accuracies between 95% and 97% across all four mixed conditions, with less than a 2% reduction compared with single-condition tests, significantly outperforming all other models.
Further analysis of the latter three conditions reveals that modulation index variation mainly affects the fundamental amplitude and switching harmonic distribution; DC-bus voltage fluctuations introduce DC bias and low-frequency envelopes, which disturb the signal distribution; while frequency changes scale cross-cycle dependencies and induce phase misalignment. The proposed DR-SE-NPCNet addresses these challenges through its structural innovations. Specifically, the TRP-TD module employs temporal dilated convolution to preserve high-resolution features without relying on pooling and effectively capture cross-cycle dependencies, thus preventing feature loss caused by amplitude drift. Meanwhile, the ReSE-CF module recalibrates channels by suppressing noise-dominated subbands and emphasizing discriminative features. Together, these mechanisms enable DR-SE-NPCNet to achieve minimal performance fluctuations and the highest diagnostic accuracy under multi-condition disturbances, demonstrating superior robustness and cross-scenario generalization.
IHBA performance evaluation
To rigorously validate the effectiveness and superiority of the proposed IHBA, this study employed the widely used CEC2020 benchmark test functions for systematic comparative experiments. It is worth noting that these benchmark tests objectively evaluate the optimization performance and robustness of IHBA without dependence on specific task characteristics, thereby ensuring the algorithm’s general applicability and theoretical reliability. On this basis, IHBA was further applied to the hyperparameter tuning of the proposed DR-SE-NPCNet, fully leveraging its advantages in global search efficiency and accuracy.
Benchmarking against existing optimization methods
The CEC2020 test suite consists of ten highly challenging optimization problems: one unimodal function (F1), three rotated and shifted multimodal functions (F2–F4), three hybrid functions (F5–F7), and three composition functions (F8–F10). All functions are defined in multidimensional spaces. Considering space limitations, one representative function was selected from each category: F1 for unimodal, F2 for multimodal, F5 for hybrid, and F10 for composition. The search dimension was set to 10, and iterative curves were plotted with the number of iterations as the horizontal axis and fitness values as the vertical axis, as illustrated in Fig. 16.
Performance comparison of different optimization algorithms on CEC2020 benchmark functions. (a) F1 unimodal function. (b) F2 multimodal function. (c) F5 hybrid function. (d) F10 composition function.
The unimodal test function features a single global optimum without local optima and is commonly used to assess convergence speed and accuracy. Due to its simple structure, no local minima interfere with the search process, allowing algorithms to directly reflect the speed of approaching the global optimum and the precision of final convergence. As shown in Fig. 16(a), on the unimodal F1 function, IHBA and the original HBA achieved similar convergence speeds, but IHBA exhibited significantly higher optimization accuracy, clearly surpassing the other comparative algorithms. This indicates that IHBA improves the ability to approach the global optimum while maintaining fast convergence.
The multimodal test functions contain numerous local optima alongside a global optimum, and are primarily designed to evaluate an algorithm’s global search capability and its ability to escape local minima. In practical optimization problems, such multimodal landscapes often cause algorithms to converge prematurely. As shown in Fig. 16(b), on the F2 multimodal function, HBA performed worse than several comparative algorithms and exhibited premature convergence. By contrast, IHBA, enhanced with multiple strategies, significantly improved its search ability, effectively avoiding entrapment in local optima, thereby confirming the validity and feasibility of the proposed improvements.
Hybrid and composition functions combine multiple types of structures to create test scenarios with complex scales, dimensional heterogeneity, and diverse search-space characteristics. They are typically employed to evaluate an optimizer’s ability to adapt to challenging landscapes and to balance global and local search behaviors. As shown in Fig. 16(c) and (d), for the hybrid function F5 and the composition function F10, IHBA consistently achieves fast convergence and stable optimization performance, clearly outperforming the comparative algorithms. These results demonstrate that IHBA exhibits strong robustness and adaptability across different categories of optimization problems.
To further provide an overall assessment of the full set of ten CEC2020 test functions, Fig. 17 illustrates the distribution of fitness values across multiple independent runs. The box plots summarize the median, quartiles, and dispersion of each optimizer; the noticeably lower variance and tighter spread of IHBA indicate superior stability and convergence consistency relative to the other algorithms. Table 4 additionally reports the mean objective values for F1–F10 together with the corresponding Friedman average ranks, where IHBA achieves lower mean values on most functions and attains the best average rank of 1.1, outperforming HBA with 2.0 and SABO with 3.7. These results collectively confirm the comprehensive superiority of IHBA in both convergence accuracy and robustness.
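The Friedman average rank reported alongside the mean objective values is obtained by ranking the algorithms on each function (rank 1 for the lowest mean) and averaging those ranks over all functions. A minimal sketch, assuming tied values share the average of the ranks they span; the input numbers in the usage note are made up for illustration and are not taken from Table 4.

```python
def average_ranks(scores):
    """scores[f][a] is the mean objective of algorithm a on function f (lower is better).
    Returns the mean rank of each algorithm; ties share the average rank."""
    n_alg = len(scores[0])
    totals = [0.0] * n_alg
    for row in scores:
        order = sorted(range(n_alg), key=lambda a: row[a])
        pos = 0
        while pos < n_alg:
            # Extend `end` over the block of algorithms tied with order[pos].
            end = pos
            while end + 1 < n_alg and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2.0 + 1.0  # average of 1-based ranks pos+1 .. end+1
            for a in order[pos:end + 1]:
                totals[a] += shared
            pos = end + 1
    return [t / len(scores) for t in totals]
```

For instance, with three algorithms scoring [1, 2, 3], [1, 3, 2], and [2, 2, 5] on three functions, the average ranks are roughly 1.17, 2.17, and 2.67. An average rank close to 1 therefore means an algorithm was best, or tied for best, on nearly every function.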
Box plots of fitness values of optimization algorithms on CEC2020 benchmark functions.
Sensitivity analysis of IHBA hyperparameters
To evaluate the robustness of IHBA with respect to its internal hyperparameters, a systematic sensitivity analysis was performed on three key factors: the density factor C (which determines the initial value of α through $\alpha(t) = C e^{-t/T}$), the hunting capability coefficient β, and the population size N. In each experiment, only one parameter was perturbed while all others were fixed at their nominal values. To ensure strict comparability across configurations, all trials were initialized from the same population matrix by fixing the random seed.
For each hyperparameter value, IHBA was executed for 20 iterations and repeated across five independent runs. The mean objective values over iterations are illustrated in Fig. 18. As shown in the figure, all convergence trajectories exhibit highly similar behaviors during the early stages and eventually approach nearly identical minima. The differences among the curves remain very small throughout the optimization process, regardless of whether C, β, or N is perturbed.
Sensitivity analysis of IHBA hyperparameters. (a) Sensitivity to the density factor C. (b) Sensitivity to the hunting capability coefficient β. (c) Sensitivity to the population size N.
These observations indicate that IHBA shows only weak sensitivity to moderate variations in its hyperparameters within the practical operating range considered in this study. The algorithm consistently maintains stable convergence and reliable optimization performance, demonstrating strong robustness for hyperparameter tuning of DR-SE-NPCNet. Therefore, the nominal hyperparameter configuration adopted in this paper is representative and does not depend on delicate parameter tuning.
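The one-at-a-time protocol described above can be sketched as follows. This is an illustration under stated assumptions only: `toy_search` is a deliberately simple stand-in optimizer, not IHBA; the sphere objective, the nominal values in `NOMINAL`, and the grids in `GRID` are hypothetical placeholders. Only the schedule α = C·exp(−t/T), the 20 iterations, the five runs, and the reuse of fixed seeds follow the text.

```python
import math
import random


def density_factor(C, t, T):
    """alpha = C * exp(-t / T), the decay schedule quoted in the text."""
    return C * math.exp(-t / T)


def sphere(x):
    """Toy objective (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def toy_search(C, beta, N, T=20, dim=5, seed=0):
    """Stand-in optimizer (NOT IHBA): shrinking perturbations around the best point."""
    rng = random.Random(seed)  # fixed seed -> reproducible initial population
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(N)]
    best = min(pop, key=sphere)
    history = []
    for t in range(1, T + 1):
        alpha = density_factor(C, t, T)
        for i in range(N):
            cand = [b + alpha * rng.gauss(0, 1) / beta for b in best]
            if sphere(cand) < sphere(pop[i]):
                pop[i] = cand
        best = min(pop + [best], key=sphere)
        history.append(sphere(best))
    return history


NOMINAL = {"C": 2.0, "beta": 6.0, "N": 30}  # placeholder nominal values
GRID = {"C": [1.0, 2.0, 3.0], "beta": [4.0, 6.0, 8.0], "N": [20, 30, 40]}


def sweep(runs=5):
    """Perturb one hyperparameter at a time; average the curves over the runs."""
    results = {}
    for name, values in GRID.items():
        for value in values:
            params = dict(NOMINAL)
            params[name] = value  # all other parameters stay nominal
            # The same seeds 0..runs-1 are reused for every configuration.
            curves = [toy_search(seed=r, **params) for r in range(runs)]
            results[(name, value)] = [
                sum(c[t] for c in curves) / runs for t in range(len(curves[0]))
            ]
    return results


results = sweep()
for key, curve in results.items():
    print(key, round(curve[-1], 6))
```

Reusing the same seeds across configurations means that differences between mean curves reflect the perturbed parameter rather than different random initializations.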
Ablation study and computational complexity analysis of IHBA
To quantify the incremental contribution of each enhancement component in IHBA, an ablation study was conducted, as shown in Table 5. Starting from the baseline HBA, the OAS module was first added to form HBA-OAS, followed by the inclusion of the SBAS component to obtain HBA-OAS-SBAS, and finally the MEMDMS mechanism was incorporated to produce the full IHBA. The results demonstrate a clear and consistent performance improvement at each enhancement stage, confirming that OAS, SBAS, and MEMDMS all contribute positively and cumulatively to the final optimization capability of IHBA.
To complement the above ablation analysis and provide a more comprehensive understanding of the algorithm, we also examine the computational complexity of IHBA. The computational effort of HBA and its variants mainly arises from evaluating and updating N individuals in a d-dimensional search space over T iterations, resulting in a baseline complexity of O(TNd). The incorporation of OAS and SBAS introduces only minor auxiliary position evaluations without increasing the asymptotic order, whereas MEMDMS adds dual-differential mutation and opposite learning operations, which may require one additional objective evaluation for certain individuals. Consequently, the overall cost of IHBA increases only slightly, from O(TNd) to O(TN(d + 1)), which remains of the same asymptotic order and keeps the algorithm computationally efficient and tractable.
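The size of this overhead can be made explicit with a short worked ratio. It assumes, as one reading of the paragraph above, that the extra evaluation counts as one unit of work per individual per iteration:

```latex
\[
T N (d+1) \;=\; T N d \left(1 + \frac{1}{d}\right)
\quad\Longrightarrow\quad
\frac{T N (d+1)}{T N d} \;=\; 1 + \frac{1}{d} \;\le\; 2
\quad \text{for } d \ge 1 .
\]
```

Under this assumption the relative overhead is at most 1/d and shrinks as the dimension of the search space grows.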
Following validation on benchmark functions, IHBA was further utilized for hyperparameter optimization of DR-SE-NPCNet and evaluated under multiple operating conditions. As shown in Fig. 19, the IHBA-tuned model achieved performance improvements across all scenarios analyzed in the previous sections. In conventional scenarios, such as the normal condition and the 30 dB noise case, accuracies improved from 98.32% to 99.12% and from 97.10% to 98.20%, respectively, representing relatively modest gains. In challenging scenarios, however, the advantages of IHBA became more pronounced. For example, under the 10 dB strong-noise condition, accuracy improved markedly from 92.83% to 95.73%, an increase of nearly 3 percentage points; in the MixVdc scenario, the accuracy rose to 99.35%, a gain of almost 2 percentage points; and in the mixed-condition datasets MixLoad, MixMI, and MixFreq, accuracies also increased to 98.11%, 98.10%, and 98.17%, respectively.
Performance comparison of DR-SE-NPCNet with and without IHBA optimization under different noise levels and mixed operating conditions.
Therefore, IHBA provides DR-SE-NPCNet with superior hyperparameter configurations, enabling stronger robustness and generalization under complex operating conditions. The significant improvements achieved under the 10 dB noise condition and mixed scenarios further demonstrate that IHBA not only enhances performance in conventional cases but also delivers robust global search capability in environments characterized by severe noise or complex distribution shifts.
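The gains reported for the scenarios whose before and after accuracies are both stated can be checked with a few lines of arithmetic. The sketch below uses only those three pairs. The helper name `gains` and the additional "relative error reduction" column are illustrative assumptions, derived from the same numbers rather than reported in the paper.

```python
# (accuracy without IHBA, accuracy with IHBA), in percent, as quoted in the text.
PAIRS = {
    "normal": (98.32, 99.12),
    "30 dB noise": (97.10, 98.20),
    "10 dB noise": (92.83, 95.73),
}


def gains(pairs):
    """Return {scenario: (gain in percentage points, relative error reduction in %)}."""
    out = {}
    for name, (before, after) in pairs.items():
        points = after - before                       # absolute gain
        error_cut = 100.0 * points / (100.0 - before)  # share of errors removed
        out[name] = (round(points, 2), round(error_cut, 1))
    return out


summary = gains(PAIRS)
for name, (points, error_cut) in summary.items():
    print(f"{name}: +{points:.2f} points, {error_cut:.1f}% fewer errors")
```

The absolute gains come out at 0.80, 1.10, and 2.90 percentage points, which matches the "nearly 3 percentage points" quoted for the 10 dB case.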
Visualization
In fault diagnosis tasks, deep learning models often suffer from a lack of interpretability. To enhance interpretability and facilitate the analysis of feature distributions, this study employs the t-distributed stochastic neighbor embedding (t-SNE) algorithm for dimensionality reduction and visualization. This method projects high-dimensional features into a two-dimensional space, thereby providing an intuitive representation of the clustering patterns and boundary distributions among different fault categories.
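A minimal sketch of such a projection is given below, assuming scikit-learn and NumPy are available. The helper `embed_features`, the perplexity value, and the synthetic three-class features are illustrative assumptions; they do not reproduce the settings or data behind Fig. 20.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed_features(features, perplexity=30.0, seed=0):
    """Project per-sample activations to 2-D with t-SNE.

    `features` has shape (n_samples, ...); any trailing axes (for example
    channels x time from an intermediate layer) are flattened per sample.
    """
    features = np.asarray(features, dtype=np.float64)
    flat = features.reshape(len(features), -1)
    tsne = TSNE(
        n_components=2,
        perplexity=perplexity,  # must be smaller than the number of samples
        init="pca",
        random_state=seed,      # fixed seed for a reproducible layout
    )
    return tsne.fit_transform(flat)


# Synthetic stand-in for layer activations: 3 classes x 40 samples, 16 features.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 40)
feats = rng.normal(size=(120, 16)) + 4.0 * labels[:, None]

emb = embed_features(feats)
print(emb.shape)
```

Applying the same helper to the activations of each layer and coloring the points by fault label would give one scatter panel per layer. Plotting is omitted here to keep the sketch short.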
As shown in Fig. 20, t-SNE was applied to the feature vectors extracted from different layers of DR-SE-NPCNet, and the resulting two-dimensional embeddings are plotted. The horizontal and vertical axes correspond to the two t-SNE embedding dimensions, while different colors represent the 19 categories of open-circuit faults. Overall, the clustering structure of the samples becomes progressively clearer as the network depth increases, indicating gradually enhanced feature separability and more discriminative representations at deeper layers.
The t-SNE visualization of DR-SE-NPCNet. (a) Input layer. (b) TRP-TD module. (c) ReSE-CF module. (d) Output layer.
At the input layer, the fault categories exhibit substantial overlap, with blurred boundaries and insufficient discriminability. After passing through the TRP-TD module, the dilated convolution expands the receptive field, enabling partial separation of some categories, though the inter-class margins remain limited. With further processing by the ReSE-CF module, channel recalibration amplifies the differences among fault categories, leading to sharper inter-class boundaries. Finally, at the output layer, features are accurately mapped to the label space, resulting in distinct and well-formed groupings of the 19 fault types. This process provides a direct illustration of the progressive, layer-wise enhancement of discriminative capability and interpretability achieved by DR-SE-NPCNet.
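The receptive-field expansion attributed to dilated convolution can be quantified with the standard formula for a stack of stride-1 convolutions, RF = 1 + Σ (k_i − 1)·d_i, where k_i is the kernel size and d_i the dilation of layer i. The kernel sizes and dilation rates below are hypothetical and are not the TRP-TD configuration.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field (in samples) of stacked stride-1 1-D convolutions."""
    if len(kernel_sizes) != len(dilations):
        raise ValueError("need one dilation rate per layer")
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d  # each layer widens the field by (k - 1) * d samples
    return rf


# Hypothetical four-layer stacks with kernel size 3.
plain = receptive_field([3, 3, 3, 3], [1, 1, 1, 1])
dilated = receptive_field([3, 3, 3, 3], [1, 2, 4, 8])
print(plain, dilated)  # 9 31
```

Because stride stays at 1 and no pooling is involved, the wider field comes with no loss of temporal resolution, which is consistent with the full-resolution design described for the backbone.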
Conclusion
This study proposes an integrated model–optimization framework for open-circuit fault diagnosis in three-level NPC inverters. DR-SE-NPCNet serves as the core diagnostic backbone, while IHBA automatically tunes its learning rate and batch size. By expanding the temporal receptive field and performing channel recalibration, DR-SE-NPCNet effectively captures both short-duration transients and slow cross-cycle variations at full resolution, thereby enhancing robustness against operating-condition disturbances and measurement noise. IHBA integrates OAS, SBAS, and MEMDMS to jointly improve global exploration, convergence stability, and population diversity. Hardware experiments demonstrate that the proposed method achieves 98.32% accuracy under noise-free conditions and 92.83%–96.94% under 10 dB noise and mixed operating scenarios. With IHBA, the diagnostic accuracy further increases by 2–3 percentage points under the most challenging conditions, thereby confirming the practical value, enhanced robustness, and strong generalization capability of the overall approach.
Despite these strengths, several limitations remain. IHBA relies on predefined parameter bounds, which increases the tuning burden and may constrain its scalability when extended to more complex optimization spaces. In addition, practical deployment challenges persist, including the need to verify real-time inference performance on embedded hardware and to ensure compatibility with commonly used industrial hardware platforms.
Future work will focus on integrating the proposed approach into a complete online monitoring system, incorporating DC-link capacitor voltage and neutral-point dynamics for improved reliability under non-ideal conditions, and extending the diagnostic framework to T-type, hybrid, and higher-level multilevel converters to assess cross-topology generalization.
Data availability
The datasets generated and/or analyzed during the current study are not publicly available but are available from the corresponding author on reasonable request.
References
Aljafari, B., Satpathy, P. R., Thanikanti, S. B. & Nwulu, N. Supervised classification and fault detection in grid-connected PV systems using 1D-CNN: simulation and real-time validation. Energy Rep. 12, 2156–2178 (2024).
Si, Y. et al. Fault diagnosis based on attention collaborative LSTM networks for NPC Three-Level inverters. IEEE Trans. Instrum. Meas. 71, 1–16 (2022).
Jung, J., Apsari, D. P. & Lee, D. C. Robust Open-Switch fault diagnosis of Three-Level NPC inverters based on data augmentation with white noise injection. IEEE Trans. Power Electron. 40, 3553–3565 (2025).
Kumar, B., Peddapati, S. & Alhosaini, W. A. Novel Fault-Tolerant Single-Phase multilevel inverter for reliable UPS applications. IEEE Open J. Ind. Appl. 6, 647–662 (2025).
Yan, Y. et al. An Open-Circuit fault diagnosis method for Three-Level neutral point clamped inverters based on Multi-Scale shuffled convolutional neural network. Sensors 24, 1745 (2024).
Bhadra, A. B. et al. Dual graph attention network for robust fault diagnosis in photovoltaic inverters. Sci. Rep. 15, 31330 (2025).
Song, L., Liao, M., Wang, R. & Guo, X. Open-circuit diagnosis method based on dual-mode voltage residual model for T-type three-level inverter. Sci. Rep. 15, 14059 (2025).
Patthi, S. et al. An ameliorated single-phase five-level multi-switch fault-tolerant inverter with reduced number of switches. Electr. Eng. 107, 13081–13098 (2025).
Karthik, K. & Ponnambalam, P. Design and implementation of time-based fault tolerance technique for solar PV system reliability improvement in different applications. Sci. Rep. 15, 7377 (2025).
Xu, S. et al. Multiple Open-Switch fault diagnosis for Three-Phase Four-Leg inverter under unbalanced loads via interval sliding mode observer. IEEE Trans. Power Electron. 39, 7607–7619 (2024).
Li, D. & Jiang, D. Fault diagnosis of grid-connected inverters using a fractional-order predefined-time sliding mode observer. Int. J. Electr. Power Energy Syst. 171, 110988 (2025).
Hashemi, M., Stolz, M. & Watzenig, D. Super-twisting algorithm-based sliding mode observer for open-circuit fault diagnosis in PWM voltage source inverter in an in-wheel motor drive system. In IEEE International Conference on Mechatronics (ICM) 1–6 https://doi.org/10.1109/ICM54990.2023.10102000 (2023).
Abbas, M., Chafouk, H. & Ardjoun, S. A. E. M. Fault diagnosis in wind turbine current sensors: detecting single and multiple faults with the extended Kalman filter bank approach. Sensors 24, 728 (2024).
Zhou, Y. et al. A Seasonal–Trend-Decomposition-Based Voltage-Source-Inverter Open-Circuit fault diagnosis method. IEEE Trans. Power Electron. 37, 15517–15527 (2022).
Wang, B., Feng, X., Sun, T., Wang, Z. & Cheng, M. Relative β-Axis residual voltage signal based fault detection for inverter switch Open-Circuit failure. IEEE Trans. Power Electron. 38, 11315–11326 (2023).
Zhang, Y., Liu, Y., Kang, S. & Wang, P. Current vector phase based weak Open-Circuit fault diagnosis of Voltage-Source inverters. IEEE Trans. Power Electron. 1–11. https://doi.org/10.1109/TPEL.2024.3478766 (2024).
Liu, B., Shi, T., Zhang, G., Yan, Y. & Xia, C. Open-Circuit fault diagnosis method for inverters based on common characteristics in critical region. IEEE Trans. Power Electron. 40, 8540–8552 (2025).
Khan, F. A. et al. Open-Circuit fault detection in a multilevel inverter using Sub-Band wavelet energy. Electronics 11, 123 (2022).
Qiu, G., Wu, F., Chen, K. & Wang, L. A robust accuracy weighted random forests algorithm for IGBTs fault diagnosis in PWM converters without additional sensors. Appl. Sci. 12, 2121 (2022).
Boutaleb, D. N., Laribi, S. & Bendiabdellah, A. Advanced detection and localization of open circuit faults in two-level three-phase IGBT-based inverters using machine learning approaches and discrete wavelet transform. Int. J. Inf. Technol. 16, 4713–4720 (2024).
Shan, R., Yang, J. & Huang, S. Open-circuit fault diagnosis of three-phase PWM rectifier circuits based on transient characteristics and random forest classification. J. Power Electron. 24, 130–139 (2024).
Al-kaf, H. A. G., Lee, J. W. & Lee, K. B. Fault detection of NPC inverter based on ensemble machine learning methods. J. Electr. Eng. Technol. 19, 285–295 (2024).
Mahmoud, M. S., Salem, A., Huynh, V. K. & Robbersmyr, K. G. Robust Self-Augmented Open-Circuit fault diagnosis of Three-Level inverters for EV powertrains. IEEE Trans. Transp. Electrification. 11, 10250–10261 (2025).
Xie, J., Qian, X. & Huang, Q. Fault diagnosis and fault tolerant control algorithm based on neural network in modular multi-level converter (MMC). In International Conference on Electronics and Renewable Systems (ICEARS) 247–253 https://doi.org/10.1109/ICEARS64219.2025.10940921 (2025).
Godhade, S., Singh, P. & Kumar, J. Design of an improved model for real-time fault detection and localization using hybrid transformer-CNN and graph neural networks. In 2025 Third International Conference on Microwave, Antenna and Communication (MAC) 1–6 https://doi.org/10.1109/MAC64480.2025.11139887 (2025).
Zhai, Z., Wang, N., Lu, S., Zhou, B. & Guo, L. A novel open circuit fault diagnosis for a modular multilevel converter with modal Time-Frequency diagram and FFT-CNN-BIGRU attention. Machines 13, 533 (2025).
Shen, H., Tang, X., Luo, Y., Xie, F. & Shi, Z. Online Open-Circuit fault diagnosis for neutral point clamped inverter based on an improved convolutional neural network and sample amplification method under varying operating conditions. IEEE Trans. Instrum. Meas. 73, 1–12 (2024).
Yuan, W. et al. Open-Circuit fault diagnosis of NPC inverter based on improved 1-D CNN network. IEEE Trans. Instrum. Meas. 71, 1–11 (2022).
Rajabi, N., Kalhor, A. & Iman-Eini, H. A method for detecting and localizing open-circuit switch faults in MMCs using separable Conv2D neural networks. IEEE Trans. Ind. Electron. 1–11. https://doi.org/10.1109/TIE.2025.3557998 (2025).
Mali, B. & Lee, D. C. Multi-switch fault analysis of six-phase inverters using CNN and data augmentation with limited training dataset utilization. In 2025 IEEE Energy Conversion Congress & Exposition Asia (ECCE-Asia) 1–5 https://doi.org/10.1109/ECCE-Asia63110.2025.11112197 (2025).
Ma, G. et al. Real-Time diagnosis of multiple Open-Circuit faults in ANPC inverters based on lightweight deployment of edge 2D-CNN. IEEE Trans. Ind. Electron. 1–12. https://doi.org/10.1109/TIE.2025.3549086 (2025).
Zhang, G., Li, M., Gu, X. & Chen, W. Fault diagnosis method for open-circuit faults in NPC three-level inverter based on WKCNN. CES Trans. Electr. Mach. Syst. 1–12. https://doi.org/10.30941/CESTEMS.2025.00012 (2025).
Wang, H., Zhang, C., Zhang, N., Chen, Y. & Chen, Y. Fault diagnosis for IGBTs open-circuit faults in high-speed trains based on convolutional neural network. In 2019 Prognostics and System Health Management Conference (PHM-Qingdao) 1–8 https://doi.org/10.1109/PHM-Qingdao46334.2019.8943008 (2019).
Chai, Q., Li, H., Wang, W. & Yan, Q. Transfer learning based open-circuit fault diagnosis method for three-phase inverters. J. Power Electron. 25, 1030–1040 (2025).
Pavithra, P., Kumar, R. K. & S, S. M, A. Securing the cloud: Leveraging deep learning and ORCA optimization for enhanced IDS. In 2025 8th International Conference on Computing Methodologies and Communication (ICCMC) 1229–1234 https://doi.org/10.1109/ICCMC65190.2025.11140715 (2025).
Almuflih, A. S. et al. Securing IoT devices with zero day intrusion detection system using binary snake optimization and attention based bidirectional gated recurrent classifier. Sci. Rep. 14, 29238 (2024).
Thirumalaisamy, S. et al. Breast cancer classification using synthesized deep learning model with metaheuristic optimization algorithm. Diagnostics 13, 2925 (2023).
Jothi, K. R. & Vaithiyanathan, B. Developing a hybrid approach with Whale optimization and deep convolutional neural networks for enhancing security in smart home environments’ sustainability through IoT devices. Sustainability 16, 11040 (2024).
Abdulsaed, E., Alabbas, M. & Khudeyer, R. Hyperparameter optimization for convolutional neural networks using the salp swarm algorithm. Informatica 47, (2023).
Revati, P., Wbaid, V. M. M. B. & Jena, S. K. S. Plant disease classification using fusion cauchy reverse learning strategy based orca predation algorithm with DenseNet-121. In 2025 3rd International Conference on Integrated Circuits and Communication Systems (ICICACS) 1–6 https://doi.org/10.1109/ICICACS65178.2025.10968937 (2025).
Vetrithangam, D., Anguraj, D. K., Arunachalam, K. P. & Rangarajan, N. Enhanced multimodal breast cancer diagnosis using random graph diffusion-based two-branch attention adversarial domain adaptation with improved orca predation optimization. Biomed. Mater. Devices https://doi.org/10.1007/s44174-025-00471-6 (2025).
Nguyen, T. P. An incorporation of metaheuristic algorithm and two-stage deep learnings for fault classified framework for diesel generator maintenance. Eng. Appl. Artif. Intell. 151, 110688 (2025).
Funding
This work was supported by the National Key Research and Development Program of China (2022YFB2404800).
Author information
Contributions
Q.L. (Qisheng Liu) conceptualized the research idea, wrote the main manuscript, and conceived the experiments. C.C. (Changxi Chen) and W.L. (Weixiang Lei) conducted the literature review and performed the experiments. H.O. (Honglin Ouyang) and M.X. (Muxuan Xiao) provided guidance and assisted with the final revision of the manuscript. All authors reviewed and approved the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Liu, Q., Chen, C., Ouyang, H. et al. IHBA-optimized DR-SE-NPCNet for robust open-circuit fault diagnosis in three-level NPC inverters under mixed and noisy conditions. Sci Rep 16, 3826 (2026). https://doi.org/10.1038/s41598-025-34025-z
DOI: https://doi.org/10.1038/s41598-025-34025-z