Introduction

Rolling bearings are key components in mechanical systems, and their operation is directly related to overall performance. A large body of data shows that nearly one-third of rotating equipment failures are attributable to the failure of rolling bearings, leading to high equipment downtime costs and serious economic losses1. Consequently, accurate bearing health assessment is critical. Vibration signal analysis is widely used in bearing state monitoring.

In the initial phase of vibration signal analysis, the raw signal is first decomposed to reveal underlying characteristics. This step has been facilitated by various signal processing techniques. Among these methods, time–frequency domain statistical analysis provides detailed information about signal behavior over time and frequency. Wavelet packet decomposition (WPD) is also effective in capturing both temporal and frequency details of non-stationary signals2,3,4,5. Empirical mode decomposition (EMD)6,7 and ensemble empirical mode decomposition (EEMD)8,9,10 are widely used for decomposing complex signals into intrinsic mode functions, enabling more detailed analysis. Local mean decomposition (LMD)11,12 is an alternative approach that extracts product functions (PFs) capturing the local characteristics of the signal, providing a different perspective on signal decomposition.

Root mean square (RMS) and kurtosis, two statistical measures in the time–frequency domain, are among the most common indices for evaluating rolling bearing performance degradation13. Tse et al.14 constructed a performance degradation assessment (PDA) model for bearings using RMS to monitor bearing health as vibration energy changes. Shen et al.15 used an effective amplitude-based approach to extract bearing degradation features by eliminating frequency bands without critical information while retaining those containing degradation signals. In the field of bearing PDA, statistical measures such as Shannon entropy (SHE), approximate entropy (AE), and permutation entropy (PE) are effective analytical tools16. Ma et al.17 used AE to model bearing degradation and found that as degradation increased, the signal’s frequency components changed, resulting in decreased signal regularity and increased AE values. Compared with AE, PE depends only on the relative order of signal values and shows strong adaptability to nonlinear signal changes, offering an advantage in processing efficiency18. Jiang et al.19 proposed an improved multiscale PE technique to address the limitations of the coarse-graining process. Zheng et al.20 introduced reverse PE as a nonlinear dynamic parameter that integrates distance information into time series of different lengths.

In mechanical signal processing, vibration signals from rolling bearings typically exhibit nonlinear behavior, presenting a challenge for effective adaptive signal decomposition using time–frequency analysis tools such as WPD. WPD implementation depends on the selected wavelet basis and decomposition level, which limits its adaptability. By contrast, EMD21 can decompose the signal into several intrinsic mode functions (IMFs) and a residual, demonstrating adaptive decomposition. EEMD22 is an optimized version of EMD that improves the adaptive decomposition of complex signals by accounting for their local temporal characteristics. LMD23 is an adaptive time–frequency analysis method that extracts instantaneous amplitude and frequency directly from the signal without using the Hilbert transform (HT). It relies on smoothing the local mean and amplitude, avoiding the errors caused by cubic spline interpolation in EMD and ensuring accurate instantaneous frequency and amplitude measurement24. Researchers often combine LMD with diverse signal processing techniques to capture unique fault signals across different fault modes. For example, Huang et al.25 developed an analysis framework integrating LMD with advanced noise suppression techniques to enhance the identification of subtle fault signals. Zheng et al.26 proposed a method for analyzing gear faults by combining LMD with the morpho-fractal dimension and evaluating the mutual information entropy between the PFs and the original signals to select signal components with the richest fault features. Liu et al.27 combined the generative wavelet transform with LMD to propose a fault identification scheme for rotating machinery. These studies highlight LMD’s ability to capture intrinsic features of vibration signals and demonstrate its key role in the early-stage detection of rolling bearing degradation.

After processing the original signal with LMD, the next key step is constructing the PDA model. Cluster analysis, a widely used technical method in PDA, enables comprehensive health assessment of the operation status of mechanical equipment. Various clustering techniques are effective for identifying and classifying failure modes in fault diagnosis scenarios without explicit labels. Pan et al.28 proposed a strategy combining WPD with the Fuzzy C-Means (FCM) algorithm to support bearing PDA. Although these new degradation metrics can monitor bearing degradation near the end of service life, their sensitivity to detect early signs of degradation remains limited. Wang et al.29 adopted an analytical approach incorporating fractal dimension theory and FCM clustering to explore the degradation patterns of equipment performance.

Although FCM clustering is widely used for its simplicity, it is mainly limited to datasets with uniform properties, as it relies on Euclidean distance to measure similarity between samples30. To address this limitation, Pimentel et al.31 introduced an improved FCM variant that increases adaptability to data distribution by incorporating an adaptive distance metric and a covariance matrix. However, the FCM and Gustafson–Kessel (GK) algorithms remain limited in handling non-spherical datasets, which are common in practical engineering applications. To address this, Gath–Geva (GG) clustering was introduced. This method uses fuzzy maximum likelihood estimation to measure distances between sample points and shows adaptability to various data forms32,33. Building on this, Li et al.34 developed a GG-based strategy specifically for rolling bearing PDA to improve the accuracy and reliability of fault diagnosis.

While existing clustering techniques play an important role in bearing condition analysis, the clustering models discussed above present several limitations:

(1) They often rely on expert knowledge to determine the number of clusters and their centers.

(2) The number of clusters is typically preset to three categories (normal, slight, and severe) based on operating conditions. This manual judgment can misinterpret degradation trends, especially when a bearing exhibits only two states (normal and severe).

(3) Few studies have examined the use of clustering by fast search (CFS) for bearing PDA.

To address the reliance on manual experience for determining the number of cluster centers, Rodriguez et al.35 proposed a CFS model based on local density and sample spacing. This model automatically identifies optimal clustering centers and reduces the interference of manual intervention. In reference36, after decomposing the rolling bearing vibration signal using EEMD, CFS was applied to identify bearing faults. Xu et al.37,38,39,40,41 proposed a fault feature extraction method based on a denoising autoencoder and CFS for rolling bearing fault diagnosis. Their approach also included an automatic fault diagnosis framework that enables fault identification through data labeling.

This paper presents a new method for rolling bearing PDA based on LMD, singular value decomposition (SVD), and CFS. First, the rolling bearing vibration signal is preprocessed using LMD, and all obtained PFs are analyzed using SVD. Second, the top two singular values (SVs) are selected based on their correlation with PFs. These SVs serve as the input to the CFS algorithm to automatically determine the number of clusters and their centers. Finally, a confidence value (CV) is used to perform the bearing PDA.

The main contributions of this paper are as follows:

(1) Because few studies have applied CFS to bearing PDA, this paper lays the foundation for its initial use in this field.

(2) To reduce dependence on manual experience in determining the number of cluster centers, this paper uses CFS for bearing PDA to automatically identify cluster centers.

(3) To demonstrate the LMD–SVD–CFS algorithm’s superiority in bearing performance degradation assessment, it was compared with several widely used time-domain feature indices and clustering algorithms, including RMS, kurtosis, SHE, AE, PE, K-means, K-medoids, FCM, GK, and GG.

The remainder of this paper is organized as follows. Section 2 presents the basic theory of LMD, SVD, and CFS. Section 3 describes the experimental data sources and the methodology steps. In Sect. 4, the experimental results are presented, and the proposed method is compared with existing techniques. Finally, Sect. 5 summarizes the main findings of the study.

Basic theory of LMD, SVD, and CFS

Theoretical framework of LMD

LMD identifies the local extrema of the signal and applies smoothing techniques to the raw data sequence to separate the pure frequency-modulated component from the signal envelope. Each PF is then reconstructed by multiplying the pure frequency-modulated signal by its corresponding envelope. This process is repeated to extract all PF components from the signal. The detailed computational procedure is as follows:

(1) For a given original vibration signal series \(X(i)\), \(1\le i\le N\), where \(N\) represents the length of the sample, the initial LMD task is to identify all local extremum points of the sequence, labeled \(n_{i}\). From each pair of successive extrema, the local mean \(m_{i}\) and the envelope estimate \(a_{i}\) are then calculated:

$$m_{i}=\frac{n_{i}+n_{i+1}}{2}$$
(1)
$$a_{i}=\frac{\left|n_{i}-n_{i+1}\right|}{2}$$
(2)

The local means \(m_{i}\) obtained in this way are connected by straight lines, and a moving-average algorithm is applied to smooth the data. This yields the local mean function of the sequence, \(m_{\text{local}}(i)\), and the corresponding envelope estimation function, \(a_{\text{env}}(i)\).

(2) In the second stage, \(m_{\text{local}}(i)\) is subtracted from the original time series \(X(i)\) to obtain the residual component \(C(i)\):

$$C(i)=X(i)-m_{\text{local}}(i)$$
(3)

(3) The residual \(C(i)\) obtained in step (2) is demodulated by dividing it by the envelope estimation function, giving \(s(i)\):

$$s(i)=C(i)/a_{\text{env}}(i)$$
(4)

(4) The above steps are repeated on \(s(i)\) to obtain the next envelope estimation function. If this function equals 1, \(s(i)\) is recognized as a pure frequency-modulated signal. If not, the iteration continues until the envelope estimation function converges to 1, at which point \(\left|s(i)\right|\le 1\).

(5) The PFs are calculated as follows:

$$\:PF\left(i\right)=\:{a}_{i}s\left(i\right)$$
(5)

where \(a_{i}\), the product of all envelope estimation functions obtained during the iteration, denotes the instantaneous amplitude function of \(PF(i)\).

(6) After obtaining the first product function \(PF_{1}(i)\), it is subtracted from the original signal to give a new data sequence \(C_{1}(i)\). Steps (1) to (5) are then applied to \(C_{1}(i)\), and the procedure is repeated until the residual \(C_{k}(i)\) becomes a monotonically increasing or decreasing function:

$$\begin{cases}C_{1}(i)=X(i)-PF_{1}(i)\\C_{2}(i)=C_{1}(i)-PF_{2}(i)\\\quad\vdots\\C_{k}(i)=C_{k-1}(i)-PF_{k}(i)\end{cases}$$
(6)

After completing these steps, the original sequence \(X(i)\) is decomposed into the sum of \(k\) product functions and a monotonic residual \(C_{k}(i)\):

$$X(i)=\sum_{p=1}^{k}PF_{p}(i)+C_{k}(i)$$
(7)
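To make the sifting procedure concrete, a minimal Python sketch is given below. It is a simplified reading of steps (1)–(6), not the implementation used in this paper: the extrema handling, the fixed moving-average window `w`, the iteration caps, and the clipping of \(s(i)\) are all illustrative assumptions.

```python
import numpy as np
from scipy.signal import argrelextrema

def moving_average(x, w):
    """Moving average used to smooth the piecewise local mean/envelope."""
    return np.convolve(x, np.ones(w) / w, mode="same")

def local_mean_and_envelope(x, w):
    """Step (1): local means and envelope estimates from successive extrema."""
    idx = np.sort(np.concatenate([argrelextrema(x, np.greater)[0],
                                  argrelextrema(x, np.less)[0]]))
    if idx.size < 3:                              # too few extrema: treat as monotonic
        return None, None
    n = x[idx]
    m_seg = (n[:-1] + n[1:]) / 2                  # Eq. (1)
    a_seg = np.abs(n[:-1] - n[1:]) / 2            # Eq. (2)
    # Spread the segment values over the whole record, then smooth them.
    m = np.interp(np.arange(x.size), idx[:-1], m_seg)
    a = np.interp(np.arange(x.size), idx[:-1], a_seg)
    return moving_average(m, w), moving_average(a, w)

def lmd(x, max_pfs=6, max_sift=10, w=21, tol=1e-3):
    """Decompose x into product functions (PFs) and a residual (sketch only)."""
    x = np.asarray(x, dtype=float)
    pfs, resid = [], x.copy()
    for _ in range(max_pfs):
        s, amp = resid.copy(), np.ones_like(resid)
        for _ in range(max_sift):                 # steps (1)-(4): sifting
            m, a = local_mean_and_envelope(s, w)
            if m is None:
                break
            s = (s - m) / np.maximum(a, 1e-12)    # Eqs. (3)-(4)
            amp *= a                              # accumulate instantaneous amplitude
            if np.max(np.abs(a - 1)) < tol:       # envelope close to 1: pure FM signal
                break
        pf = amp * np.clip(s, -1, 1)              # Eq. (5)
        pfs.append(pf)
        resid = resid - pf                        # Eq. (6)
        if local_mean_and_envelope(resid, w)[0] is None:
            break                                 # residual is (nearly) monotonic
    return np.array(pfs), resid
```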

Theoretical framework of SVD

As previously mentioned, SVD is used to calculate and extract the SVs from the PFs obtained in Sect. 2.1. The SVD process is described as follows:

Each selected PF matrix \(\:T\) is decomposed as:

$$T=UQV^{T}$$
(8)

where \(U\) and \(V\) are orthogonal matrices satisfying \(UU^{T}=I\) and \(VV^{T}=I\), and \(Q\) is the singular value matrix, whose diagonal elements (the SVs) are non-negative. The SVs are robust, insensitive to rotation, and respond consistently to changes in scale. These characteristics make SVD a suitable tool for capturing fault features in bearing vibration signals.
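As an illustration, the following sketch stacks the two selected PFs of one sample row-wise into the matrix T and extracts its singular values with NumPy; the row-wise construction of T is an assumption made here for concreteness, since the paper does not fix the matrix layout.

```python
import numpy as np

def singular_values_of_pfs(pf1, pf2):
    """Stack the two selected PFs into a matrix T and return its singular values.

    np.linalg.svd returns them sorted in descending order, so sv[0] is SV1
    and sv[1] is SV2 for this sample.
    """
    T = np.vstack([pf1, pf2])                    # shape (2, sample_length)
    sv = np.linalg.svd(T, compute_uv=False)      # only the singular values are needed
    return sv

# Example: one feature vector [SV1, SV2] per vibration sample
# features = np.array([singular_values_of_pfs(p1, p2) for p1, p2 in selected_pfs])
```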

Theoretical framework of CFS

CFS rests on two key assumptions. First, a cluster center is surrounded by neighbors with lower local density, so no neighboring point has a higher density than the center within its local region. Second, each center lies relatively far from any point with a higher local density, ensuring clear separation and uniqueness of the clusters. The procedure is as follows:

(1) Set up a dataset \(X=\{x_{1},x_{2},\ldots,x_{N}\}\), where \(X\) is the matrix of all SV vectors obtained in Sect. 2.2.

For this dataset, the Euclidean distance \(d_{ij}\) between any two samples \(x_{i}\) and \(x_{j}\) is calculated as:

$$d_{ij}=\left\|x_{i}-x_{j}\right\|_{2}$$
(9)

(2) Calculate the local density \(\:{\rho\:}_{i}\:\)of the ith sample \(\:{x}_{i}\) using:

$$\rho_{i}=\sum_{j\ne i}e^{-\left(\frac{d_{ij}}{d_{c}}\right)^{2}}$$
(10)

Here, \(\rho_{i}\) is a Gaussian-kernel measure of how many samples lie within the cut-off distance \(d_{c}\) of \(x_{i}\), i.e., how many distances satisfy \(d_{ij}\le d_{c}\); the data point \(x_{i}\) itself is not counted.

(3) Calculate the distance \(\delta\). Let \(\{q_{i}\}_{i=1}^{N}\) denote the ordering of the samples by descending local density \(\{\rho_{i}\}_{i=1}^{N}\), such that:

$$\rho_{q_{1}}\geqslant\rho_{q_{2}}\geqslant\ldots\geqslant\rho_{q_{N}}$$

Then, calculate \(\:{\delta\:}_{{q}_{i}}\) as:

$$\delta_{q_{i}}=\begin{cases}\min\limits_{j<i}\left\{d_{q_{i}q_{j}}\right\}, & i\ge 2\\ \max\limits_{j\ge 2}\left\{\delta_{q_{j}}\right\}, & i=1\end{cases}$$
(11)

The variable \(\:{\delta\:}_{{q}_{i}}\:\)is defined as the distance to the furthest point in the dataset if \(\:{x}_{i}\:\)has the highest local density. If\(\:{\:x}_{i}\) does not have this property, \(\:{\delta\:}_{{q}_{i}}\:\)is the distance to the nearest point with higher density.

(4) Following the calculations in steps (2) and (3), assign each point in the dataset a decision value \(\gamma_{i}=\rho_{i}\delta_{i}\), and then sort the \(\gamma\) values in descending order.

(5) Determine the cluster centers based on the \(\gamma\) value of each data point. According to reference23, a data point \(x_{i}\) is more likely to be a cluster center if it has a higher \(\gamma\) value. Accordingly, the \(\gamma\) values are sorted in descending order, and the points with the highest values are prioritized as potential cluster centers. A significant jump in the sorted \(\gamma\) values becomes the key indicator for distinguishing cluster centers from non-centers; based on this jump feature, the data points above the sudden change are selected as clustering centers.

(6) Assign each remaining data point to a cluster according to whether its distance to a cluster center is within the threshold \(d_{c}\); this assignment is guided by the local density \(\rho_{i}\). Based on this criterion, the data are classified into different clusters.
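A compact Python sketch of steps (1)–(6) is given below, under stated assumptions: the Gaussian-kernel density of Eq. (10), the δ recursion of Eq. (11), γ = ρδ as the decision value, a simple largest-gap heuristic for the γ "jump" in step (5), and assignment of the remaining points to the cluster of their nearest higher-density neighbor as one common reading of step (6).

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def cfs(X, dc, n_centers=None):
    """Clustering by fast search (CFS) on a feature matrix X of shape (N, d)."""
    N = X.shape[0]
    d = squareform(pdist(X))                       # Eq. (9): pairwise Euclidean distances
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1   # Eq. (10); subtract the self term
    order = np.argsort(-rho)                       # indices sorted by descending density
    delta = np.zeros(N)
    nearest_higher = np.full(N, -1)
    delta[order[0]] = d[order[0]].max()            # highest-density point: farthest distance
    for k in range(1, N):                          # Eq. (11)
        i = order[k]
        j = order[:k][np.argmin(d[i, order[:k]])]  # nearest point with higher density
        delta[i] = d[i, j]
        nearest_higher[i] = j
    gamma = rho * delta                            # decision value, step (4)
    if n_centers is None:                          # step (5): cut at the largest gamma gap
        g = np.sort(gamma)[::-1]
        top = min(10, N - 1)                       # heuristic: inspect only the leading values
        n_centers = int(np.argmax(g[:top] - g[1:top + 1])) + 1
    centers = np.argsort(-gamma)[:n_centers]
    labels = np.full(N, -1)
    labels[centers] = np.arange(n_centers)
    for i in order:                                # step (6): propagate labels by density order
        if labels[i] == -1:
            j = nearest_higher[i]
            labels[i] = labels[j] if j >= 0 else labels[centers[0]]
    return centers, labels, rho, delta, gamma
```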

Dataset source and procedure for the proposed method

Experimental platform and data source

The experimental data for the rolling bearing study originate from the PRONOSTIA platform at the FEMTO-ST Institute, University of Franche-Comté, France. A detailed description of the dataset is provided in reference42. The PRONOSTIA platform is shown in Fig. 1(a). The experimental platform consists of three parts: the rotating part, the load part, and the testing part.

Fig. 1

(a) The bearing data acquisition platform; (b) the load component of the experimental platform; (c) the testing component of the experimental platform.

(1) Rotating part: The motor has a power of 250 W and a maximum speed of 2830 rpm, ensuring that the second shaft rotates at 2000 rpm.

(2) Load part: This section consists of a pneumatic jack that applies a dynamic load of 4000 N to the bearing, as shown in Fig. 1(b).

(3) Testing part: The bearing degradation data fall into two main categories: vibration data and temperature data. The vibration sensor consists of two miniature accelerometers positioned at 90° to each other, as shown in Fig. 1(c). One accelerometer is mounted along the vertical axis and the other along the horizontal axis. Both are radially mounted on the outer ring of the bearing, with a sampling frequency of 25.6 kHz. The temperature sensor is a resistance temperature detector (RTD) installed in a hole near the outer bearing ring, with a sampling frequency of 0.1 Hz.

The experimental data under a 4000 N load and a rotational speed of 1800 rpm include seven rolling bearings, among which Bear11 to Bear14 are used in this study. A detailed description of the data is provided in Table 1, where 2560n denotes the total data length, n being the number of samples and 2560 the number of points per sample. The total number of data samples for Bear11, Bear12, Bear13, and Bear14 is 2803, 871, 911, and 1139, respectively. The platform operates at a speed of 1800 rpm, and the sampling frequency is 25.6 kHz.

Table 1 Experimental data for rolling bearings under different conditions.
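For readers reproducing the experiment, one plausible way to load the vibration records of a bearing is sketched below. The acc_*.csv naming, the folder name, and the column index of the horizontal acceleration channel are assumptions about the public dataset described in reference42 and should be verified against the downloaded copy.

```python
import glob
import numpy as np
import pandas as pd

def load_bearing(folder, channel=4):
    """Load all acc_*.csv records of one bearing into an (n, 2560) array.

    Assumptions: each record holds 2560 rows, the files are comma-separated,
    and the acceleration channels are the last two columns (horizontal,
    vertical). Check these against the downloaded dataset before use.
    """
    samples = []
    for path in sorted(glob.glob(f"{folder}/acc_*.csv")):
        rec = pd.read_csv(path, header=None)
        samples.append(rec.iloc[:, channel].to_numpy())
    return np.vstack(samples)

# e.g. signals = load_bearing("Bearing1_1")   # hypothetical folder name for Bear11
```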

Procedure for the proposed method

The procedure for the proposed method consists of three parts: (1) PF extraction and selection; (2) SVD decomposition and clustering; (3) CV extraction for bearing PDA.

(1) PF extraction and selection: LMD is used to decompose the original vibration signal for different bearings. The correlation coefficient is then used to calculate the similarity between each PF and the original vibration signal. The top two PFs, based on their correlation coefficients, are selected for the subsequent SVD decomposition.

(2) SVD decomposition and clustering: SVD is applied to calculate the SVs. The first two SVs (SV1 and SV2) are used as input for various clustering models (CFS, FCM, GK, GG, K-means, and K-medoids) to identify cluster centers representing different operating conditions: normal state, incipient degradation, and severe degradation.

(3) CV extraction for bearing PDA: The CV is introduced as a new indicator to assess bearing health status. It quantifies the difference between the feature vector [SV1, SV2] of each sample and the predefined cluster centers, and is calculated as:

$$CV=e^{-D/S}$$
(12)

where \(D\) is the distance between the feature vector of the \(i\)th sample \(x_{i}\) and the cluster center of the normal state, and \(S\) is a scaling factor associated with the cluster centroids. CV therefore represents the similarity between the \(i\)th sample \(x_{i}\) and the normal-state cluster center. A CV closer to 1 indicates that the sample is in a “normal” state, while a CV closer to 0 indicates an “abnormal” state (e.g., slight or severe degradation).
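A minimal sketch of the CV computation is shown below. Since the paper does not spell out how S is obtained, the default used here (the mean distance of an assumed-healthy leading segment to the normal-state center) is only one illustrative choice of scaling factor.

```python
import numpy as np

def confidence_values(features, normal_center, scale=None):
    """Compute CV = exp(-D / S) for each feature vector [SV1, SV2].

    D is the Euclidean distance to the normal-state cluster center. S defaults
    to the mean distance of the first 50 samples (assumed healthy) to that
    center, which is one plausible scaling factor, not the paper's definition.
    """
    D = np.linalg.norm(features - normal_center, axis=1)
    if scale is None:
        scale = D[:50].mean() + 1e-12     # assumption: leading samples are healthy
    return np.exp(-D / scale)             # Eq. (12)
```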

Finally, to demonstrate the superiority of the proposed method, a series of time–frequency domain characterization tools, including RMS, SHE, AE, PE, and kurtosis, were adopted. In addition, various cluster analysis techniques such as FCM, GK, GG, K-means, and K-medoids were applied to provide a comprehensive assessment of the performance status of the rolling bearings. This procedure is illustrated in Fig. 2.

Fig. 2

Flowchart of the proposed method.

Experiment and comparison analysis

Original vibration signal

This section presents visualizations of the raw vibration signals listed in Table 1. As revealed in Fig. 3, Bear11, Bear13, and Bear14 experience three different operating states: normal, slight, and severe. By contrast, Bear12 exhibits only two states—normal and severe degradation—highlighted by the red dashed box in the figure.

Fig. 3

Original vibration signals of different bearings.

In Fig. 3, the vibration amplitudes of Bear11 and Bear13 increase gradually over time. However, Bear12 and Bear14 show sudden changes in amplitude, with the change in Bear12 being especially abrupt. The amplitude pattern of Bear14 is more complex; the sudden change can be divided into two phases: a slightly degraded phase, indicated by the red solid box, and a severely degraded phase, indicated by the red dashed box. Most cluster analysis methods, including FCM, GK, GG, K-means, and K-medoids, require the number of clusters to be predetermined based on expert knowledge or subjective judgment.

In the subsequent examples, CFS is employed to construct the PDA model for rolling bearings without specifying the number of clusters.

LMD decomposition and SV extraction

As mentioned in Sect. 4.1, all original vibration signal samples are first decomposed using LMD. The resulting PFs obtained from Bear11 to Bear14 are shown in Fig. 4, where C denotes the residual component. As shown in the figure, the vibration amplitudes of the top two PFs are the largest and second largest, suggesting that these components contain valuable information.

Fig. 4

PFs obtained from LMD for different bearings.

To verify the potential of the top two PFs as effective decomposition components, correlation analysis was used to quantify the strength of the relationship between the original signal and each individual PF. A correlation value closer to 1 indicates a stronger correlation. As illustrated in Fig. 5, PF1 and PF2 show the highest correlation values, highlighting their significance as key decomposition components.
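This selection step amounts to a few lines of NumPy; the sketch below assumes the PFs of one sample are stored row-wise and returns the indices of the two PFs most correlated with the original signal.

```python
import numpy as np

def top_two_pfs(signal, pfs):
    """Return indices of the two PFs with the highest correlation to the signal."""
    corr = np.array([abs(np.corrcoef(signal, pf)[0, 1]) for pf in pfs])
    return np.argsort(-corr)[:2], corr
```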

Fig. 5

Correlation coefficients between each PF and the corresponding original signal for different rolling bearings.

To visualize the data, the singular values corresponding to PF1 and PF2 (SV1 and SV2) were extracted using SVD and used as the input parameters for the CFS algorithm to construct the rolling bearing PDA model. As shown in Fig. 6, SV1 and SV2 clearly reflect the degradation paths of bearing performance. In particular, SV1 shows high sensitivity in tracking the degradation trend. However, in the normal state, an oscillation is observed (especially for Bear12), which can be attributed to noise present in the vibration signal under normal conditions, as seen in Fig. 3. Nevertheless, the following experimental results demonstrate that the proposed method offers better stationarity when using CFS and the CV than when using only the SVs obtained via SVD.

Fig. 6

SV features extracted for different bearings using SVD.

CFS clustering and PDA using CV

To accurately identify the core elements of clustering, SV1 and SV2 were used as input features for the CFS algorithm.

According to reference27, the cut-off distance \(d_{c}\) should be selected such that the number of neighboring points of each data point within \(d_{c}\) is 1–2% of the total number of data points. This helps ensure accuracy and efficiency in the clustering process. In dataset X, any sample \(x_{i}\) has a distance to each of the remaining N − 1 data points, giving N × (N − 1) distance measures in total. Because distances are symmetric, this count includes each pair twice, so the number of unique, independent distance values is M = N(N − 1)/2.

Sorting the distances \(d_{ij}\) (i < j) from Eq. (9) in ascending order gives the sequence \(d_{1}\le d_{2}\le\cdots\le d_{M}\). If \(d_{c}\) is chosen as \(d_{k}\), where k ∈ {1, 2, …, M}, then approximately a fraction k/M of all N × (N − 1) distances are smaller than \(d_{c}\), i.e., about (k/M) × N × (N − 1) distances. Averaged over all data points, each point then has approximately (k/M) × (N − 1) ≈ (k/M) × N neighbors, so once the ratio k/M is specified, the value of \(d_{c}\) is determined. If \(d_{c}\) is set too large, the local density \(\rho_{i}\) of every data point becomes excessively large, reducing the discrimination between points; if it is set too small, a single cluster may be mistakenly split into multiple subclusters. In this study, \(d_{c}\) is set using a k/M ratio of 1.5%.
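In code, the 1.5% rule reduces to taking a low percentile of the sorted pairwise distances; the sketch below uses SciPy's condensed distance vector, which already contains the M = N(N − 1)/2 unique distances.

```python
import numpy as np
from scipy.spatial.distance import pdist

def cutoff_distance(X, ratio=0.015):
    """Pick d_c so that roughly ratio * M of the unique pairwise distances fall below it."""
    d_sorted = np.sort(pdist(X))               # M = N(N-1)/2 unique distances, ascending
    k = max(1, int(round(ratio * d_sorted.size)))
    return d_sorted[k - 1]                      # d_c = d_k
```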

Within the CFS framework, Bear11 and Bear12 were analyzed based on the local density ρ and distance δ values computed with Eqs. (10) and (11), and the results are presented in Fig. 7. In Fig. 7(a), three sample points are clearly separated from the rest in terms of local density and distance; their ρ and δ values are significantly higher than those of the remaining samples, showing a clear “jumping” characteristic. Based on this observation, these three samples were selected as cluster centers representing the different health states of the bearing, covering the full degradation process from normal to slight to severe. Figure 7(b) shows that two sample points exhibit the same “jump”. These observations confirm the effectiveness of the CFS algorithm in automatically identifying cluster centroids.

Fig. 7

Results of the local density ρ and distance δ for Bear11 and Bear12.

The two-dimensional clustering results for Bear11 to Bear14 are shown in Fig. 8, where the red square points represent the cluster centers. All samples are separated well using CFS. Subsequently, a comprehensive assessment of the performance status of the rolling bearings was conducted based on the CVs calculated using Eq. (12). Figure 9 shows the distribution of CVs obtained using the different cluster centroids on the datasets of Bear11 to Bear14.

Fig. 8

Two-dimensional clustering figures for different bearings using CFS.

Fig. 9

CVs of different bearings using CFS; Normal, Slight, and Severe denote different health states.

As shown in Fig. 9(a), the CV of Bear11 remains stable during the “normal” operating phase. However, as the bearing condition progresses into the “slight” and “severe” states, the CV shows a decreasing trend. By contrast, the CVs of Bear12 display an almost continuous straight-line pattern during the initial stages of the observation period, as shown in Fig. 9(b), but a clear downward trend appears at the end, caused by the sharp increase in the original vibration signal of Bear12 at the end of this stage. Compared with the SVs of Bear12 shown in Fig. 6, the CVs remain stable in the “normal” state, whereas the SVs exhibit noise. The CV curves of Bear13 and Bear14 in Fig. 9(c) and Fig. 9(d) also show good overall stability. An extended view of Fig. 9 is provided in Fig. 10. Based on this visualization, the performance degradation of Bear11 can be traced: its health first drops to the “slightly degraded” level at data point 1491, with an amplitude of 0.9724, then rises to 0.9899 at data point 1492. Further degradation occurs at data point 2748, when the bearing reaches the “severe” state. Similarly, the health status of Bear12 worsens sharply at data point 828, rapidly transitioning from “normal” to “severe”. These observations highlight the efficiency of CFS in achieving PDA for rolling bearings. A key advantage of the method is that it does not require a predetermined number of clusters.

Fig. 10

CVs for different bearings using CFS with extended figures.

Comparison analysis

Comparison with FCM, GK, and GG

To demonstrate the superiority of the proposed method, several common techniques, namely FCM, GK, and GG, are used to build PDA models and calculate the CV. For the FCM, GK, and GG models, the number of cluster centers c for Bear11, Bear13, and Bear14 is set to 3, as these bearings exhibit three health states. For Bear12, c is set to 2, as it only includes two states: normal and severe. The value of the termination tolerance is also specified for these models. The results of the two-dimensional clustering are shown in Figs. 11, 12, and 13, and the corresponding CV distributions, calculated using the respective cluster centers according to Eq. (12), are presented in Figs. 14, 15, and 16.
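For the baseline models, the cluster count must be supplied by the analyst; a hedged K-means sketch with scikit-learn is shown below (the fuzzy baselines FCM, GK, and GG are configured analogously, with a preset c and termination tolerance, using a fuzzy-clustering toolbox).

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_centers(features, n_clusters):
    """Baseline clustering of the [SV1, SV2] features.

    The cluster count must be chosen by the analyst (3 for Bear11/13/14,
    2 for Bear12), unlike CFS, which infers it from the gamma jump.
    """
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(features)
    return km.cluster_centers_, km.labels_
```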

Fig. 11

Two-dimensional clustering figures for different bearings using FCM.

Fig. 12

Two-dimensional clustering figures for different bearings using GK.

Fig. 13

Two-dimensional clustering figures for different bearings using GG.

Fig. 14

CVs of different bearings using FCM; Normal, Slight, and Severe denote different health states.

Fig. 15

CVs for different bearings using GK; Normal, Slight, and Severe denote different health states.

Fig. 16

CVs for different bearings using GG; Normal, Slight, and Severe denote different health states.

(1) As shown in Fig. 12(a), several samples in different states overlap for Bear11 when using GK, whereas in Fig. 8(a), all samples are separated well using CFS. In Fig. 13, many samples for Bear11, Bear12, and Bear13 overlap when GG is applied. This is especially evident for Bear12, where only two states are present and should be clearly divided; nevertheless, some severe samples overlap with normal samples. By contrast, Fig. 8(b) shows that all samples are separated well. For Bear11, all CVs obtained using the different clustering centers (“normal”, “slight”, and “severe”) appear similar in Fig. 16(a). However, Fig. 10(a) shows a clear increasing trend when the “slight” and “severe” cluster centers are used, making it easier to distinguish between the health states.

(2) For Bear12, the number of clusters must be manually adjusted before building the PDA model, which requires expert judgment. Typically, bearing health is divided into three states: normal, slight, and severe. However, Bear12 contains only two states: normal and severe. This mismatch can lead to errors when determining the health state using FCM, GK, or GG. The results show that CFS can accurately and automatically identify the health condition of the rolling bearing.

(3) These results demonstrate that the performance of CFS is superior to that of FCM, GK, and GG.

Comparison with K-means and K-medoids

As with the FCM, GK, and GG models, certain parameters for K-means and K-medoids must be pre-configured before calculation. The number of clusters c for K-means and K-medoids is the same as that used for FCM, GK, and GG. The results of the two-dimensional clustering using K-means and K-medoids are shown in Figs. 17 and 18, respectively, and the corresponding CVs are presented in Figs. 19 and 20.

Fig. 17

Two-dimensional clustering figures for different bearings using K-means.

Fig. 18

Two-dimensional clustering figures for different bearings using K-medoids.

Fig. 19

CVs for different bearings using K-means; Normal, Slight, and Severe denote different health states.

Fig. 20

CVs for different bearings using K-medoids; Normal, Slight, and Severe denote different health states.

The CVs for Bear11 using K-means and K-medoids are shown in Figs. 19(a) and 20(a), respectively. In Fig. 20(a), under the “slight” condition, the CVs increase gradually, whereas with CFS, all CVs under the “slight” condition increase sharply, with an obvious jump, as shown in Fig. 9(a). This indicates that the “normal” and “slight” states are easier to distinguish for Bear11 when using CFS. A similar pattern is observed in the “severe” state in Fig. 9(a). In addition, the CVs under the “normal” and “slight” conditions of Bear11 in Fig. 19(a) appear similar at first glance, while in Fig. 9(a) they differ clearly. Moreover, for K-means and K-medoids the number of clusters must be pre-set before building the PDA model, which requires expert knowledge, whereas CFS determines the cluster number automatically.

Comparison with RMS and kurtosis

In this section, RMS and kurtosis are employed as metrics to assess the health state of all rolling bearings. The RMS and kurtosis values are shown in Figs. 21 and 22.

Fig. 21

RMS values of different bearings.

Fig. 22

Kurtosis values of different bearings.

(1) Compared with RMS and kurtosis, the CV curves in Figs. 9 and 10 are very smooth and stable within the red dashed rectangle, without obvious noise. By contrast, the curves in Figs. 21 and 22 show significant noise under normal conditions, especially for Bear12. In Fig. 22, the noise in the kurtosis curve is particularly pronounced.

(2) Engineers may easily misjudge the degradation state of the bearing due to noise points that appear to transition from the normal to the slight or severe state, such as the red data points within the red dashed rectangle in Figs. 21 and 22. For example, Bear12 remains in the normal state until data point 828, yet many noise points within the red dashed rectangle have RMS values greater than that at point 828. However, in Fig. 10(b), no data points appear in the red dashed rectangle area after data point 828. Therefore, the noise in the RMS values before point 828 may lead to misjudgment of the bearing’s degradation state. The kurtosis values in Fig. 22 show similar behavior.

(3) Compared with RMS and kurtosis, CFS provides superior performance in PDA. These findings indicate that the CFS algorithm effectively captures and assesses the performance status of rolling bearings.

Comparison with SHE, AE, and PE

This section provides a comprehensive assessment of the performance state of rolling bearings using three metrics: SHE, AE, and PE. Before calculating these entropy values, several key parameters must be defined.

For the calculation of AE, two key parameters must be predefined: the embedding dimension and the tolerance. Increasing the embedding dimension helps capture richer data features but also increases the computational load; therefore, an embedding dimension of 2 was selected. For the tolerance, the recommended range is 0.1 to 0.25 times the standard deviation of the original dataset17.

For PE, two key parameters must be determined before analysis: the embedding dimension and the time lag. Researchers generally recommend an embedding dimension between 3 and 7; beyond this range, the computational efficiency may decrease, particularly during phase-space reconstruction, and subtle features of the vibration signals may be lost. In addition, if the time lag is set above 5, the algorithm may fail to capture small fluctuations in the vibration signal, affecting the accuracy of the analysis. Based on these considerations, an embedding dimension of 6 and a time lag of 3 were selected to ensure both accurate and efficient analysis. The degradation curves obtained using SHE, AE, and PE are presented in Figs. 23, 24 and 25.
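With these parameter choices (embedding dimension 6, time lag 3), PE can be computed as in the sketch below; this is a plain implementation of the standard permutation-entropy definition, not code from the cited works.

```python
import math
import numpy as np

def permutation_entropy(x, m=6, tau=3, normalize=True):
    """Permutation entropy of a 1-D signal with embedding dimension m and time lag tau."""
    x = np.asarray(x)
    n = x.size - (m - 1) * tau                   # number of embedded vectors
    counts = {}
    for i in range(n):
        window = x[i:i + (m - 1) * tau + 1:tau]  # m samples spaced by tau
        key = tuple(np.argsort(window))          # ordinal pattern of the window
        counts[key] = counts.get(key, 0) + 1
    p = np.array(list(counts.values()), dtype=float) / n
    pe = -np.sum(p * np.log(p))
    if normalize:
        pe /= math.log(math.factorial(m))        # scale to [0, 1]
    return pe
```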

Fig. 23

SHE values of different bearings.

Fig. 24

AE values of different bearings.

Fig. 25

PE values of different bearings.

(1) Compared with Figs. 23, 24 and 25, the CV curves in Figs. 9 and 10 are smooth and stable within the red dashed rectangle, without noticeable noise. By contrast, many noisy data points appear in the red dashed rectangle area in Fig. 23(c), Fig. 24, and Fig. 25.

(2) The AE curve in Fig. 24(b) shows some noise, especially for Bear12. It is difficult to distinguish the normal and severe conditions because of the noisy data points in the red dashed rectangle. The same phenomenon is observed in Fig. 25(d) when using the PE model. Therefore, engineers may misjudge the degradation state of the bearing because these noise points appear to transition from the normal to the slight or severe state when using SHE, AE, and PE.

(3) Compared with SHE, AE, and PE, CFS provides superior performance in PDA. These findings demonstrate that the CFS algorithm is effective in capturing and assessing the performance status of rolling bearings.

Conclusion

To address the problem of predetermining the number of clusters, this paper introduces the CFS algorithm for bearing PDA. Very few studies have focused on applying CFS in this context. The advantages and limitations of this study are as follows:

Advantages:

(1) The CFS algorithm uses the local density of data points and a distance metric to dynamically determine the optimal number of clusters, eliminating reliance on manual experience.

(2) This paper lays the groundwork for the initial application of CFS in the field of bearing PDA.

Disadvantages: The selection of the optimal cut-off distance parameter \(\:{d}_{c}\) in CFS remains a key challenge in this study. Particle swarm optimization, genetic algorithms, and other computational methods have been widely used in other fields. We plan to explore these approaches in future work to improve the selection of \(\:{d}_{c}\).