Introduction

As key components of rotating machinery, rolling bearings directly affect the stability and safety of equipment operation, which makes their fault diagnosis crucial in practical applications. Fault diagnosis is achieved by extracting the fault feature information embedded in the vibration signals collected during bearing operation. However, the collected fault signals are often strongly nonlinear and non-stationary, making it difficult for traditional methods to extract fault features effectively1,2. Numerous scholars have studied this problem.

Currently, fault diagnosis methods primarily include signal processing, machine learning, and deep learning. Common signal processing methods include Wavelet Transform (WT)3,4, Empirical Mode Decomposition (EMD)5,6, and Variational Mode Decomposition (VMD)7,8. Machine learning primarily involves Support Vector Machines (SVM)9,10 and Extreme Learning Machines (ELM)11,12; deep learning includes Convolutional Neural Networks (CNN)13,14 and Long Short-Term Memory (LSTM) networks15,16. Wavelet Transform is a non-adaptive signal analysis technique that analyzes the original signal with wavelet basis functions. Its preset parameters, such as the wavelet basis function, the number of decomposition levels, and the thresholds, strongly affect the results. In addition, the high-frequency components are not decomposed further, so high-frequency information is lost. To analyze the high-frequency information further, M.V. and R.R.17 proposed the Wavelet Packet Transform, which improves time-frequency resolution. Luan Xiaochi18 employed wavelet packet decomposition to achieve high-frequency filtering and fusion of high- and low-frequency signals, demonstrating the method’s effectiveness in fault signal processing. Empirical Mode Decomposition (EMD) adaptively decomposes fault signals into multiple Intrinsic Mode Functions (IMFs), eliminating the need for manually selected parameters. Zhu Quanjie et al.19 denoised vibration signals by combining singular values with EMD to select IMF components for reconstruction. However, EMD is prone to mode mixing and end effects. To mitigate these problems, some scholars proposed Ensemble Empirical Mode Decomposition (EEMD)20 and Complete Ensemble Empirical Mode Decomposition (CEEMD)21. Athisayam Andrews22 applied CEEMD for denoising vibration signals and used the Artificial Bee Colony (ABC) algorithm to select bearing fault features, achieving coupling fault identification through a neural network.
To better address the problems associated with EMD, Dragomiretskiy proposed Variational Mode Decomposition (VMD), which decomposes signals by incorporating variational constraints and significantly mitigates mode mixing and end effects. However, VMD’s performance is highly sensitive to the choice of the number of decomposition modes and the penalty factor. Zheng Yi et al.23 proposed optimizing VMD using the Grasshopper Optimization Algorithm (GOA) to identify the optimal parameter combination, enabling adaptive selection of VMD parameters. Gao Yanfeng24, Zhao Hailong25, et al. proposed a wave propagation fault localization algorithm based on VMD and the Teager energy operator, which enhanced the localization accuracy. However, the accuracy gradually decreases as the noise level increases. Besides signal processing, machine learning is widely applied in the field of fault diagnosis26. Cai Sainan27 proposed a Whale Optimization Algorithm (WOA)-based Least Squares Support Vector Machine (LS-SVM) and introduced a Neumann topology, enhancing the global search capability of WOA and improving fault diagnosis accuracy. G.V. et al.28,29 conducted an in-depth study on the application of sensor technology and machine learning in fault diagnosis. They analyzed the breakthroughs in sensor technology and the advantages of machine learning, and discussed future development directions.

To address the parameter selection problem for the number of decomposition modes and penalty factors in VMD, this paper proposes an optimization method for VMD decomposition based on the RIME algorithm. The RIME algorithm offers superior global search capability and faster search speed. Permutation entropy is employed as the fitness function for globally optimizing the parameters. The feasibility and effectiveness of the RIME-VMD method were validated using the CWRU bearing fault dataset.

Basic theory

Variational mode decomposition

Variational Mode Decomposition (VMD) is an adaptive signal processing method built on iterative Wiener filtering. Given the number of modes and the penalty factor, VMD decomposes the signal into several Intrinsic Mode Functions (IMFs), each with its own central frequency. The decomposition is posed as a constrained variational problem in which the IMFs must sum to the original signal, as described by Eq. 1.

$$\left\{ \begin{gathered} \min_{\{u_k\},\{\omega_k\}} \left\{ \sum\limits_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t)+\frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \hfill \\ \text{s.t.}\ \sum\limits_{k=1}^{K} u_k = f \hfill \\ \end{gathered} \right.$$
(1)

In the equation: \(\delta(t)\) denotes the Dirac delta function; * represents the convolution operation; \(\partial_t\) denotes the partial derivative with respect to time; K is the number of modes to be decomposed; \(u_k\) is the k-th Intrinsic Mode Function (IMF) component after decomposition; \(\omega_k\) denotes the central frequency of that component; \(f\) represents the original signal; \(\|\cdot\|_2^2\) denotes the squared \(L^2\) norm.

To solve this constrained problem, a quadratic penalty term and a Lagrange multiplier are introduced, giving the augmented Lagrangian function in Eq. 2.

$$\begin{gathered} L\left( \{u_k\},\{\omega_k\},\lambda \right)=\alpha \sum\limits_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t)+\frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \hfill \\ +\left\| f(t) - \sum\limits_{k=1}^{K} u_k(t) \right\|_2^2+\left\langle \lambda(t),\, f(t) - \sum\limits_{k=1}^{K} u_k(t) \right\rangle \hfill \\ \end{gathered}$$
(2)

In the equation: \(\lambda\) represents the Lagrange multiplier; \(\alpha\) denotes the quadratic penalty factor.

The Alternating Direction Method of Multipliers (ADMM) algorithm is used to obtain several mode components after multiple iterations, with VMD decomposition concluding when the Wiener filtering residual satisfies the constraint conditions.
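For reference, the ADMM iteration in the standard VMD formulation alternates three frequency-domain updates. The source does not write these out, so the following is a sketch under stated assumptions: the hat denotes the Fourier transform, n is the iteration index, and the dual step size \(\tau\) and tolerance \(\varepsilon\) are symbols introduced here for illustration.

$$\hat{u}_k^{n+1}(\omega)=\frac{\hat{f}(\omega)-\sum\limits_{i \ne k} \hat{u}_i(\omega)+\frac{\hat{\lambda}^{n}(\omega)}{2}}{1+2\alpha \left( \omega-\omega_k^{n} \right)^2}$$

$$\omega_k^{n+1}=\frac{\int_0^\infty \omega \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}{\int_0^\infty \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}$$

$$\hat{\lambda}^{n+1}(\omega)=\hat{\lambda}^{n}(\omega)+\tau \left( \hat{f}(\omega)-\sum\limits_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right)$$

The first update is a Wiener filter centred on \(\omega_k\), which is why a larger \(\alpha\) yields narrower modes. Iteration typically stops once \(\sum_{k} \| \hat{u}_k^{n+1}-\hat{u}_k^{n} \|_2^2 / \| \hat{u}_k^{n} \|_2^2<\varepsilon\).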

In VMD decomposition, the number of modes K and the penalty factor α must be set empirically. An improper choice of the number of modes K can result in incomplete decomposition of fault features, leading to the loss of fault information, reduced mode stability, and increased computational complexity. Therefore, incorporating optimization algorithms for the global search of VMD parameters is highly significant for enhancing the accuracy and efficiency of VMD decomposition.

RIME optimization algorithm

RIME is an optimization algorithm that simulates the frost formation process observed in nature. Frost formation primarily occurs in two types: soft frost and hard frost. The algorithm employs a forward greedy mechanism to iteratively search for the optimal solution, achieving global optimization.

In RIME, each frost body is treated as an individual search particle within the algorithm, and the entire frost body population is considered as the algorithm’s population. First, the entire frost body population R is initialized to establish the initial mathematical model, as given in Eq. 3.

$$R=\left[ {\begin{array}{*{20}{c}} {{x_{11}}}&{{x_{12}}}& \cdots &{{x_{1j}}} \\ {{x_{21}}}&{{x_{22}}}& \cdots &{{x_{2j}}} \\ \vdots & \vdots & \ddots & \vdots \\ {{x_{i1}}}&{{x_{i2}}}& \cdots &{{x_{ij}}} \end{array}} \right]$$
(3)

In the equation: R denotes the initial frost body population, and \(x_{ij}\) represents the j-th frost particle in the i-th frost body. The fitness function \(F(S_i)\) of the frost bodies is provided in Eq. 4.

$$F\left( S_i \right)=\left[ \begin{array}{c} f\left( \left[ \begin{array}{cccc} x_{11} & x_{12} & \cdots & x_{1j} \end{array} \right] \right) \\ f\left( \left[ \begin{array}{cccc} x_{21} & x_{22} & \cdots & x_{2j} \end{array} \right] \right) \\ \vdots \\ f\left( \left[ \begin{array}{cccc} x_{i1} & x_{i2} & \cdots & x_{ij} \end{array} \right] \right) \end{array} \right]$$
(4)

In the equation: \(f\) denotes the fitness function applied to each frost body.

When each frost particle condenses into soft frost, it moves according to a specific pattern, with its efficiency affected by environmental factors. If a free frost particle approaches the soft frost, it will condense with the particles in the soft frost, thereby changing its properties. If the particle exceeds the escape radius, condensation will not occur.

$$R_{{ij}}^{{new}}={R_{best,j}}+{r_1}\cdot \cos \theta \cdot \beta \cdot \left( {h\cdot \left( {U{b_{ij}} - L{b_{ij}}} \right)+L{b_{ij}}} \right),{r_2}<E$$
(5)

In the equation: \(R_{ij}^{new}\) denotes the updated position of the particle; \(R_{best,j}\) represents the j-th particle of the best frost body in the frost population; \(r_1\) is a random number within the range (-1, 1); \(\theta\) indicates the direction of particle movement, which changes with each iteration, as shown in Eq. 6; \(\beta\) is the environmental factor that simulates changes in the external environment during iterations to ensure the convergence of the algorithm, as given in Eq. 7; h controls the central distance between two particles and is a random number within the range (0, 1); \(Ub_{ij}\) and \(Lb_{ij}\) represent the upper and lower bounds of the escape interval, limiting the range of particle movement; \(r_2\) is a random number within the range (0, 1) and, together with the attachment coefficient E, regulates the update of the particle’s position; E increases with the number of iterations, as expressed in Eq. 8.

$$\theta =\pi \cdot \frac{t}{{10\cdot T}}$$
(6)
$$\beta =1 - \frac{{\left[ {\frac{{W\cdot t}}{T}} \right]}}{W}$$
(7)
$$E=\sqrt {\left( {\frac{t}{T}} \right)}$$
(8)

In the equations: t denotes the current iteration count; T represents the maximum number of iterations; [ ] indicates rounding to the nearest integer; W is the number of segments controlling the step function.
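To make the three schedules in Eqs. 6 to 8 concrete, the following minimal sketch evaluates them at a few iterations. T = 20 matches the setting used later in the experiments; W = 5 is an illustrative assumption, since the text does not state a value, and ties in the rounding are resolved upward here, which is also an assumption.

```python
import math


def rime_schedules(t, T, W=5):
    """Return (theta, beta, E) at iteration t of T, following Eqs. 6-8.

    W = 5 is an assumed value; the source does not specify it.
    """
    theta = math.pi * t / (10 * T)                 # Eq. 6: movement direction
    # Eq. 7: step-shaped environmental factor; [.] taken as round-half-up
    beta = 1 - math.floor(W * t / T + 0.5) / W
    E = math.sqrt(t / T)                           # Eq. 8: attachment coefficient
    return theta, beta, E


for t in (1, 5, 10, 15, 20):
    theta, beta, E = rime_schedules(t, T=20)
    print(f"t={t:2d}  theta={theta:.4f}  beta={beta:.2f}  E={E:.4f}")
```

The printed values show the intended behaviour: \(\beta\) falls in W discrete steps, so the soft-frost step size shrinks over time, while E grows, so more particles are updated in later iterations.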

As the area of soft frost increases, it exhibits strong randomness and broad coverage, which facilitates the faster identification of optimal decomposition parameters. To prevent getting trapped in local optima during the optimization process, a hard frost penetration mechanism is proposed for updating the algorithm between agents. This mechanism enables the exchange of particles across different local regions, thereby improving algorithm convergence and preventing local optima, as illustrated in Eq. 9.

$$R_{ij}^{new}=R_{best,j},\quad r_3<F^{normr}\left( S_i \right)$$
(9)

In the equation: \(F^{normr}(S_i)\) denotes the current normalized fitness value, which indicates the probability of the i-th frost body being exchanged; \(r_3\) is a random number within the range (-1, 1).

Fault signal diagnosis process

RIME-optimized variational mode decomposition

RIME-VMD leverages the efficient search capability of RIME to globally optimize the VMD decomposition parameters, identifying the optimal values for the number of modes K and the penalty factor α. Permutation entropy is used as the fitness function for RIME, facilitating the rapid determination of the best parameter combination across the entire search range. The flowchart is illustrated in Fig. 1, and the detailed steps are outlined as follows:

  1. Initialize the frost population R, and set the population parameters, the maximum number of iterations T, and the upper and lower bounds \(Ub_{ij}\) and \(Lb_{ij}\).
  2. Input the initial vibration signals for the four distinct states.
  3. Calculate the attachment coefficient E for the particles after each iteration using Eq. 8. If \(r_2<E\), update the positions of the particles according to Eq. 5.
  4. Control the updates of particle positions based on \(F^{normr}(S_i)\). If \(r_3<F^{normr}(S_i)\), replace the particle with the current best solution.
  5. Check whether the maximum number of iterations T has been reached. If not, increment the iteration count to t + 1 and continue the loop. If T has been reached, terminate the loop and output the current best solution.
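The steps above can be sketched as a short optimization loop. This is an illustration under several assumptions, not the authors' implementation: the population size, W = 5, the L2 normalization used for \(F^{normr}\), the clipping to the bounds, and the greedy replacement rule are choices made here, and `toy_fitness` is a hypothetical stand-in for the real objective (permutation entropy of the VMD output for a given K and α), which is not reproduced.

```python
import math
import random


def rime_minimize(fitness, lb, ub, pop_size=10, T=20, W=5, seed=0):
    """Minimal RIME sketch (Eqs. 3-9). Returns (best_x, best_f, history)."""
    rng = random.Random(seed)
    dim = len(lb)
    # Eq. 3: population R, one row per frost body
    R = [[rng.uniform(lb[j], ub[j]) for j in range(dim)] for _ in range(pop_size)]
    F = [fitness(x) for x in R]                        # Eq. 4
    b = min(range(pop_size), key=F.__getitem__)
    best_x, best_f = R[b][:], F[b]
    history = [best_f]
    for t in range(1, T + 1):
        theta = math.pi * t / (10 * T)                 # Eq. 6
        beta = 1 - math.floor(W * t / T + 0.5) / W     # Eq. 7
        E = math.sqrt(t / T)                           # Eq. 8
        norm = math.sqrt(sum(f * f for f in F)) or 1.0
        F_norm = [f / norm for f in F]                 # assumed L2 normalization
        for i in range(pop_size):
            new_x = R[i][:]
            for j in range(dim):
                if rng.random() < E:                   # soft-frost search, Eq. 5
                    r1 = rng.uniform(-1, 1)
                    h = rng.random()
                    step = h * (ub[j] - lb[j]) + lb[j]
                    new_x[j] = best_x[j] + r1 * math.cos(theta) * beta * step
                if rng.uniform(-1, 1) < F_norm[i]:     # hard-frost puncture, Eq. 9
                    new_x[j] = best_x[j]
                new_x[j] = min(max(new_x[j], lb[j]), ub[j])  # stay inside bounds
            new_f = fitness(new_x)
            if new_f < F[i]:                           # greedy selection
                R[i], F[i] = new_x, new_f
                if new_f < best_f:
                    best_x, best_f = new_x[:], new_f
        history.append(best_f)
    return best_x, best_f, history


def toy_fitness(x):
    """Hypothetical objective with a minimum near K = 5, alpha = 1200."""
    K, alpha = round(x[0]), x[1]
    return (K - 5) ** 2 + ((alpha - 1200.0) / 1000.0) ** 2


# Search ranges follow the experiment section: K in [2, 10], alpha in [100, 2000]
best_x, best_f, history = rime_minimize(toy_fitness, lb=[2, 100], ub=[10, 2000])
print(round(best_x[0]), round(best_x[1], 1), best_f)
```

Because a candidate only replaces a frost body when its fitness improves, the best fitness recorded in `history` never increases from one iteration to the next.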

Fig. 1 Fault recognition flowchart.

Fault diagnosis procedure

Rolling bearing fault signals are characterized by nonlinearity and non-stationarity. To extract meaningful fault information, the initial signal is first decomposed using RIME-VMD. The decomposed IMF components are then filtered based on their kurtosis values, with those showing higher kurtosis being selected for signal reconstruction. Finally, the sample entropy of the reconstructed signal is computed as the fault feature and fed into a support vector machine for rapid fault diagnosis of rolling bearings.
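The kurtosis-based screening can be illustrated with a short sketch. It assumes ordinary time-domain kurtosis (the fourth standardized moment) and keeps the two highest-ranked components; the three input series are synthetic stand-ins for IMFs, not data from the paper, and the function names are hypothetical.

```python
import math


def kurtosis(x):
    """Fourth standardized moment (about 3 for Gaussian data)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2


def reconstruct_by_kurtosis(imfs, n_keep=2):
    """Sum the n_keep IMFs with the highest kurtosis; return (indices, signal)."""
    ranked = sorted(range(len(imfs)), key=lambda k: kurtosis(imfs[k]), reverse=True)
    keep = sorted(ranked[:n_keep])
    length = len(imfs[0])
    signal = [sum(imfs[k][i] for k in keep) for i in range(length)]
    return keep, signal


# Synthetic stand-ins: a square wave, a sine, and an impulsive component
square = [1.0 if i % 2 == 0 else -1.0 for i in range(200)]
sine = [math.sin(2 * math.pi * i / 50) for i in range(200)]
spikes = [0.0] * 200
spikes[50], spikes[150] = 5.0, -5.0

keep, signal = reconstruct_by_kurtosis([square, sine, spikes])
print("kept IMF indices:", keep)
```

Impulsive components such as `spikes` score far higher than smooth oscillations, which is why kurtosis is a natural indicator of the impacts produced by bearing defects.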

Experimental analyses

Experiment data

To validate the effectiveness of the proposed method, the publicly available rolling bearing fault dataset from Case Western Reserve University was utilized, and experiments were conducted under no-load, light-load, and heavy-load conditions. The experimental equipment is shown in Fig. 2.

Fig. 2 Rolling bearing signal acquisition test bench.

The four states are listed in Table 1 as follows:

Table 1 Numbering of four different bearing states.

Vibration signal analysis

As an example for the no-load condition, the time-domain and frequency-domain representations of the four different state signals are presented in Figs. 3 and 4.

Fig. 3 Time-domain plots of rolling bearing signals in four different states.

Fig. 4 Frequency-domain plots of rolling bearing signals in four different states.

As shown in Fig. 3, the time-domain signal distributions for the four different bearing conditions are quite similar, making it difficult to differentiate the fault types. As shown in the frequency-domain plots in Fig. 4, while there are slight differences between the four conditions, it remains challenging to distinguish them.

Signal decomposition and reconstruction using RIME-VMD

Before applying VMD to decompose the vibration signal, it is crucial to determine the number of modes K and the penalty factor α, as these parameters directly affect the VMD decomposition results. The VMD decomposition parameters are optimized globally using RIME, ensuring that the fault information in the signal is fully decomposed. The maximum number of iterations T is set to 20, the number of modes K is restricted to integers in the range [2, 10], and the penalty factor α is initialized as a random value within [100, 2000]. Permutation entropy is used as the fitness function. Taking the outer race fault as an example, a set of 1000 data points is selected as the decomposition sample, with decomposition performed using different optimization algorithms. The decomposition parameters are listed in Table 2, and the results are presented in Fig. 5.
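As a reference for the fitness function, the sketch below computes the normalized permutation entropy of a single series. The embedding order 3 and delay 1 are assumptions, and the text does not say how the entropy is aggregated over the IMFs of one decomposition, so only the per-series calculation is shown.

```python
import math
import random
from collections import Counter


def permutation_entropy(x, order=3, delay=1, normalize=True):
    """Permutation entropy of x from the frequencies of ordinal patterns."""
    n = len(x) - (order - 1) * delay
    if n <= 0:
        raise ValueError("series too short for the chosen order and delay")
    patterns = Counter()
    for i in range(n):
        window = [x[i + k * delay] for k in range(order)]
        # The ordinal pattern is the index order that sorts the window
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1
    H = -sum((c / n) * math.log(c / n) for c in patterns.values())
    return H / math.log(math.factorial(order)) if normalize else H


rng = random.Random(0)
ramp = [float(i) for i in range(1000)]             # fully regular series
noise = [rng.random() for _ in range(1000)]        # irregular series
print(permutation_entropy(ramp), permutation_entropy(noise))
```

A regular series yields a value near 0 and white noise a value near 1, so minimizing permutation entropy favours parameter combinations that produce more regular, less noisy modes.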

Table 2 WOA and RIME optimized VMD parameters.
Fig. 5 Decomposition results of the outer ring fault signal.

Figure 5(b), (c), and (d) show that although the WOA-VMD decomposition is fairly thorough, over-decomposition occurs between IMF3 and IMF4 and between IMF5 and IMF6, suggesting that WOA became trapped in a local optimum during the global search. Figure 5(a) shows that after RIME-VMD decomposition there is no mode mixing among the IMFs, and the decomposition is complete without over-decomposition, indicating that RIME reaches the global optimum. The iteration curves of the four optimization algorithms are shown in Fig. 6. As can be seen from Fig. 6 and Table 3, RIME converges in fewer iterations and less computation time than the other algorithms. Although the PSO algorithm performs well in the initial stages by focusing on rapid local optimization, its global search capability is slightly inferior to that of RIME. During the iterative process, RIME is the first to locate the position with the smallest permutation entropy.

Fig. 6 Iteration curves of VMD optimization with different algorithms.

Table 3 Convergence iterations and computation time for different algorithms.

Figure 6 also shows that RIME converges faster than WOA and requires less time for optimization. The time consumed by each optimization is detailed in Table 3.

The results of decomposing the remaining three signal states using RIME-VMD are shown in Fig. 7.

Fig. 7 RIME-VMD decomposition results for three signal states.

Figure 7 shows that the vibration signals for the remaining three states have been fully decomposed without mode mixing, indicating that the parameters found by the RIME global search are suitable. To ensure that the reconstructed signal fully captures the fault characteristic information, the IMF components must be screened. G.V. and S.C.30 proposed combining spectral kurtosis with the flow direction algorithm, optimizing the filtering process to better isolate the fault information from signals collected under complex operating conditions. The spectral kurtosis and correlation coefficient effectively reflect whether an IMF component contains sufficient fault characteristic information. In this paper, the two IMF components with the highest spectral kurtosis and correlation coefficient are selected for reconstruction, achieving signal denoising. Table 4 presents the spectral kurtosis and correlation coefficient of the IMF components for the four signal states. Figure 8 shows the time-domain plots of the vibration signals before and after reconstruction.

Table 4 Kurtosis values and correlation coefficients of IMFs for four signal states.
Fig. 8 Time-domain plots before and after reconstruction for four signal states.

Feature extraction and fault diagnosis

After decomposing and reconstructing the four states’ signals using RIME-VMD and WOA-VMD, the sample entropy at multiple scales of the reconstructed signals was computed. These high-dimensional features were then mapped to two dimensions using t-SNE, which aims to preserve the neighborhood structure of the high-dimensional space, as shown in Fig. 9.

Fig. 9 Visualization of multiscale sample entropy.

From Fig. 9(a), it can be observed that the reconstructed signals from WOA-VMD exhibit some overlap between State 1 and State 3, leading to potential identification errors in fault diagnosis. In contrast, Fig. 9(b) shows that the signals reconstructed using RIME-VMD are fully decomposed, with a higher degree of feature separation between different states and no significant overlap.

Signals for each of the four distinct states were organized into groups of 1000 samples each. For each state, 600 sample groups were selected, of which 480 were randomly allocated for training and the remaining 120 for testing. Subsequently, the feature data from these states were fed into a support vector machine, with the classification results illustrated in Fig. 10.

Fig. 10 Recognition results of WOA-VMD and RIME-VMD.

Figure 10 shows that the fault identification accuracy of WOA-VMD is 98.33%, with minor confusion between Fault 3 and Fault 2 and few errors in the remaining cases. The fault identification accuracy of RIME-VMD is 99.79%, with only one instance of Fault 3 in the test set misclassified, an improvement of 1.46 percentage points. The recognition results indicate that RIME-VMD, through its forward greedy mechanism, performs global optimization and largely avoids local optima. The resulting improvement in the robustness of VMD decomposition allows fault information to be extracted accurately from the signal and improves bearing fault identification accuracy, validating the effectiveness of the method. Compared with other fault diagnosis models31, the proposed algorithm still achieves high diagnostic accuracy, as shown in Table 5.
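The reported accuracies can be checked against the test-set size. With 120 test groups for each of the four states, the test set contains 480 groups, so a single misclassification corresponds to 479/480. The count of eight errors for WOA-VMD in the sketch below is inferred from the 98.33% figure and is not stated in the text.

```python
def accuracy_percent(n_test, n_errors):
    """Classification accuracy in percent for n_errors mistakes out of n_test."""
    return 100.0 * (n_test - n_errors) / n_test


n_test = 4 * 120                      # four states, 120 test groups each
rime_acc = accuracy_percent(n_test, 1)   # one misclassified group (from the text)
woa_acc = accuracy_percent(n_test, 8)    # inferred count matching 98.33%
print(f"RIME-VMD: {rime_acc:.4f}%  WOA-VMD: {woa_acc:.4f}%")
```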

Table 5 Comparison of various fault diagnosis models.
Fig. 11 Recognition results under different operating conditions.

To verify the recognition performance of the method across different operating conditions, experiments were also conducted under light and heavy loads. As shown in Fig. 11, RIME-VMD achieves accuracies of 99.58% and 99.37% under light and heavy load conditions, respectively, validating the effectiveness of the method across different operating conditions.

Conclusions

  (1) To address the challenges of parameter selection in VMD for rolling bearing fault diagnosis, we propose an optimized RIME-VMD algorithm. This algorithm leverages RIME’s forward greedy mechanism and efficient search capabilities, using permutation entropy as the fitness function, to globally optimize the parameter combination and thereby improve the robustness of VMD decomposition.
  (2) In comparison with WOA-VMD, RIME-VMD calculates the attachment coefficient and fitness value after each iteration, adaptively adjusts particle updates, reduces the number of iterations needed to converge, and consequently enhances both decomposition efficiency and accuracy.
  (3) By using both kurtosis and correlation coefficient for selecting and reconstructing bearing vibration signals, the accuracy of fault state identification is improved. After calculating the multi-scale sample entropy of the reconstructed signals and applying them to a support vector machine, the RIME-VMD approach achieved an identification accuracy of 99.79%, thus validating the method’s superiority and effectiveness.