Introduction

Cluster analysis is a key method in data mining, categorizing datasets based on similarity. According to different clustering rules, common clustering methods include hierarchical clustering1,2,3, K-means clustering4,5,6,7, fuzzy C-means clustering8,9,10,11,12, spectral clustering13,14,15, and multi-view clustering16,17,18,19, among others.

Spectral clustering, based on graph theory, constructs undirected graphs whose edge weights represent sample similarity. It involves two main steps: embedding the samples into a low-dimensional space derived from the similarity matrix, and applying a partition-based algorithm (e.g., the K-means clustering method) to the embedding.

Traditional spectral clustering relies on K-means as a post-processing step. To address this, article14 proposes a local adaptive clustering framework that simultaneously learns an efficient similarity matrix and the dataset's clustering structure. This framework improves traditional spectral clustering by eliminating the need for K-means, allowing the learnt similarity matrix to be used directly for clustering. Experimental results demonstrate that this framework enhances clustering performance while simplifying the spectral clustering process.

Spectral clustering, a type of hard clustering20, excels with linear or structurally clear datasets. However, it struggles with complex datasets, such as high-dimensional or nonlinear ones, where the constructed graph fails to accurately reflect sample similarity, weakening clustering performance.

Fuzzy clustering methods21,22,23,24 have shown better performance on complex datasets by using soft partitioning. Experimental results indicate that introducing fuzzy operators enhances clustering performance on complex datasets.

To reduce the impact of the similarity matrix on spectral clustering performance, this study introduces a local adaptive fuzzy spectral clustering (FSC) approach. FSC integrates a fuzzy index into the similarity matrix, mitigating its influence and improving clustering results. Additionally, FSC simplifies the spectral clustering process by allowing direct use of the similarity matrix for clustering. The main contributions of this study are summarized as follows:

  1. By introducing the fuzzy index, the clustering performance of FSC has low sensitivity to the similarity matrix of the dataset; that is, FSC depends only weakly on the input similarity matrix, which gives it good adaptability. Moreover, the fuzzy parameter allows FSC to be flexibly tuned to different datasets, further improving this adaptability.

  2. Owing to the idea of "soft partition", FSC can effectively handle datasets with more complex structures. Compared to other spectral clustering methods, FSC improves clustering performance on high-dimensional or nonlinear datasets, which improves its practicability.

  3. FSC adopts a new spectral clustering framework that adaptively optimizes the similarity matrix and the clustering structure at the same time. Hence the similarity matrix can be used directly for clustering, which simplifies the spectral clustering process.

  4. Experimental results show that FSC outperforms other methods in most cases on both linear and complex datasets.

Related works

Normalized cut

Normalized cut (NC)13,25 is a classical spectral clustering method that focuses on global features of the dataset rather than local features and their consistency. Assume that \(G=\{{\varvec{V}},{\varvec{E}}\}\) denotes a graph, where \({\varvec{V}}\) is the set of nodes and \({\varvec{E}}\) is the set of edges. Suppose \(G\) can be divided into two disjoint sets \({\varvec{A}}\) and \({\varvec{B}}\). The cost of this division is denoted as \(cut({\varvec{A}},{\varvec{B}})={\sum }_{{\varvec{u}}\in {\varvec{A}},{\varvec{v}}\in {\varvec{B}}}w({\varvec{u}},{\varvec{v}})\), where \(w({\varvec{u}},{\varvec{v}})\) represents the weight of the edge between node \({\varvec{u}}\) and node \({\varvec{v}}\). The objective function of NC is defined as follows:

$$Ncut\left({\varvec{A}},{\varvec{B}}\right)=\frac{cut\left({\varvec{A}},{\varvec{B}}\right)}{assoc\left({\varvec{A}},{\varvec{V}}\right)}+\frac{cut\left({\varvec{A}},{\varvec{B}}\right)}{assoc\left({\varvec{B}},{\varvec{V}}\right)}$$
(1)

where \(assoc({\varvec{A}},{\varvec{V}})={\sum }_{u\in {\varvec{A}},t\in {\varvec{V}}}w(u,t)\) represents the total weight of edges connecting the nodes of \({\varvec{A}}\) to all nodes in the graph, and \(assoc({\varvec{B}},{\varvec{V}})\) is defined analogously. To indicate cluster membership, let \({\varvec{z}}=\left[{z}_{1},{z}_{2},...,{z}_{N}\right]\in {R}^{1\times N}\) denote an indicator vector whose element \({z}_{i}=1\) if the i-th node belongs to the set \({\mathbf{A}}\) and \({z}_{i}=-1\) otherwise. The weight of each edge is defined by the similarity between the corresponding samples, i.e.,

$${w}_{ij}=\mathit{exp}\left(-\frac{{\Vert {{\varvec{x}}}_{i}-{{\varvec{x}}}_{j}\Vert }^{2}}{2{\sigma }^{2}}\right),\sigma >0$$
(2)
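
As an illustration, a minimal NumPy sketch of this Gaussian similarity (Eq. (2)) is given below; the function name and the default value of σ are ours, not part of the original method.

```python
import numpy as np

def gaussian_similarity(X, sigma=1.0):
    """Pairwise Gaussian similarities w_ij = exp(-||x_i - x_j||^2 / (2*sigma^2))."""
    # Squared Euclidean distances between all pairs of rows of X
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    np.maximum(sq_dists, 0.0, out=sq_dists)  # guard against tiny negative values
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)                 # no self-loops
    return W
```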

Let \({\varvec{D}}=diag({d}_{1},{d}_{2},...,{d}_{N})\) be the degree matrix, whose i-th diagonal element \({d}_{i}\) is the sum of the similarities between the i-th node and all other nodes. Then Eq. (1) can be transformed into:

$$Ncut\left({\varvec{A}},{\varvec{B}}\right)=\frac{cut\left({\varvec{A}},{\varvec{B}}\right)}{assoc\left({\varvec{A}},{\varvec{V}}\right)}+\frac{cut\left({\varvec{A}},{\varvec{B}}\right)}{assoc\left({\varvec{B}},{\varvec{V}}\right)}=\frac{{\sum }_{{z}_{i}>0,{z}_{j}<0}-{w}_{ij}{z}_{i}{z}_{j}}{{\sum }_{{z}_{i}>0}{d}_{i}}+\frac{{\sum }_{{z}_{i}<0,{z}_{j}>0}-{w}_{ij}{z}_{i}{z}_{j}}{{\sum }_{{z}_{i}<0}{d}_{i}}$$
(3)

Thus, optimizing Eq. (1) is equivalent to solving Eq. (4):

$$\underset{{\varvec{z}}}{min}\hspace{0.33em}Ncut\left({\varvec{z}}\right)$$
(4)

According to13, Eq. (4) is equivalent to:

$$\underset{{\varvec{y}}}{min}\hspace{0.33em}\frac{{{\varvec{y}}}^{{\varvec{T}}}\left({\varvec{D}}-{\varvec{S}}\right){\varvec{y}}}{{{\varvec{y}}}^{{\varvec{T}}}{\varvec{D}}{\varvec{y}}}\quad s.t.\hspace{0.33em}{{\varvec{y}}}^{{\varvec{T}}}{\varvec{D}}\mathbf{1}=0,\quad {y}_{i}\in \left\{1,\frac{-{\sum }_{{z}_{i}>0}{d}_{i}}{{\sum }_{{z}_{i}<0}{d}_{i}}\right\}$$
(5)

In this equation, the numerator \({{\varvec{y}}}^{T}({\varvec{D}}-{\varvec{S}}){\varvec{y}}\) represents the cut of the graph, where \({\varvec{S}}\) is the similarity matrix and \({\varvec{D}}\) is the degree matrix. The denominator \({{\varvec{y}}}^{T}{\varvec{D}}{\varvec{y}}\) is a normalization factor that prevents trivial solutions. The constraint \({{\varvec{y}}}^{T}{\varvec{D}}\mathbf{1}=0\) requires the solution vector \({\varvec{y}}\) to be orthogonal to the \({\varvec{D}}\)-weighted all-ones vector, which prevents solutions biased towards a particular cluster. Each entry \({y}_{i}\) is restricted to one of two values determined by the degrees \({d}_{i}\) and the indicator vector \({\varvec{z}}\), so that \({\varvec{y}}\) still encodes a hard two-way partition of the nodes.

Then Eq. (5) can be further transformed into the eigenvalue problem

$${{\varvec{L}}}{\prime}{\varvec{H}}=\lambda {\varvec{H}}$$
(6)

where \({{\varvec{L}}}{\prime}={{\varvec{D}}}^{-\frac{1}{2}}({\varvec{D}}-{\varvec{S}}){{\varvec{D}}}^{-\frac{1}{2}}\) and \({\varvec{H}}={{\varvec{D}}}^{-\frac{1}{2}}{\varvec{y}}\).

According to13, the optimal solution of Eq. (6) is \({\varvec{U}}\in {{\varvec{R}}}^{N\times k}\), which is formed by the eigenvectors corresponding to the \(k\) smallest eigenvalues of \({{\varvec{L}}}{\prime}\), with each row of \({\varvec{U}}\) normalized to unit length. Finally, the K-means clustering method is applied to partition the rows of \({\varvec{U}}\) into \(k\) clusters.

From the above description, we can see that the only input of NC is the similarity matrix of the dataset. After applying NC, an optimal matrix (i.e., \({{\varvec{L}}}{\prime}\)) is obtained and used to generate the clustering structure (i.e., \({\varvec{U}}\)); finally, K-means is applied to the clustering structure to obtain the clustering results.
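
As a hedged illustration of this pipeline (not the authors' implementation), the NC procedure could be sketched as follows, where W is a precomputed similarity matrix such as the one produced by the gaussian_similarity helper above:

```python
import numpy as np
from sklearn.cluster import KMeans

def normalized_cut(W, k):
    """Spectral clustering via the normalized Laplacian L' = D^(-1/2) (D - W) D^(-1/2)."""
    d = W.sum(axis=1)                                  # node degrees
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L_norm = (d_inv_sqrt[:, None] * (np.diag(d) - W)) * d_inv_sqrt[None, :]
    # Eigenvectors of the k smallest eigenvalues form the spectral embedding U
    eigvals, eigvecs = np.linalg.eigh(L_norm)          # eigenvalues in ascending order
    U = eigvecs[:, :k]
    # Normalize each row of U to unit length, then run K-means on the rows
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)
```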

However, we can also observe that the clustering performance of NC depends heavily on the input similarity matrix of the dataset.

Fuzzy C-means clustering method

Fuzzy C-means clustering (FCM)8 is a frequently utilized fuzzy clustering technique. It improves upon K-means by incorporating a fuzzy index, which enhances its ability to handle uncertainty. The objective function of FCM is defined as follows:

$${J}_{FCM}={\sum }_{i=1}^{n}{\sum }_{j=1}^{c}{u}_{ij}^{m}{\Vert {{\varvec{x}}}_{i}-{{\varvec{c}}}_{j}\Vert }^{2}\quad s.t.\hspace{0.33em}\forall i,{\sum }_{j=1}^{c}{u}_{ij}=1,\hspace{0.33em}0\le {u}_{ij}\le 1,\hspace{0.33em}m>0,m\ne 1$$
(7)

FCM optimizes \({u}_{ij}\) and \({{\varvec{c}}}_{j}\) by iteratively updating, and the updated rules are:

$${u}_{ij}=\frac{1}{{\sum }_{k=1}^{c}{\left(\frac{{\Vert {{\varvec{x}}}_{i}-{{\varvec{c}}}_{j}\Vert }^{2}}{{\Vert {{\varvec{x}}}_{i}-{{\varvec{c}}}_{k}\Vert }^{2}}\right)}^{\frac{1}{m-1}}}$$
(8)
$${{\varvec{c}}}_{j}=\frac{{\sum }_{i=1}^{n}{u}_{ij}^{m}{{\varvec{x}}}_{i}}{{\sum }_{i=1}^{n}{u}_{ij}^{m}}$$
(9)

where \({u}_{ij}\) represents the degree to which sample \({{\varvec{x}}}_{i}\) belongs to the j-th cluster; the larger \({u}_{ij}\) is, the more likely \({{\varvec{x}}}_{i}\) belongs to the j-th cluster. \({{\varvec{c}}}_{j}\) represents the j-th cluster center, and \(m\) is the fuzzy index: by adjusting the value of \(m\), the influence of \({u}_{ij}\) on the clustering result can be controlled. Experimental results indicate that introducing the fuzzy index enhances FCM's clustering performance. In the following, we explain how to introduce the fuzzy index into spectral clustering while simultaneously simplifying the spectral clustering process.
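
A compact sketch of the FCM iteration, alternating Eqs. (8) and (9), is shown below; the random initialization, stopping tolerance, and function signature are illustrative assumptions rather than details fixed by the original method.

```python
import numpy as np

def fcm(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Fuzzy C-means: alternately update memberships u_ij and centers c_j."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)                            # each row sums to 1
    for _ in range(max_iter):
        centers = (U ** m).T @ X / (U ** m).sum(axis=0)[:, None]  # Eq. (9)
        sq_dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        sq_dists = np.maximum(sq_dists, 1e-12)
        U_new = 1.0 / (sq_dists ** (1.0 / (m - 1)))                # Eq. (8), up to row normalization
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return U, centers
```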

A local adaptive fuzzy spectral clustering method

Objective function

Assume that \({\varvec{X}}\in {{\varvec{R}}}^{n\times d}\) represents the training sample set, where \(n\) is the total number of training samples, \(d\) is the number of dimensions, and \(c\) is the number of clusters. Let \({\varvec{S}}\in {{\varvec{R}}}^{n\times n}\) denote the similarity matrix, with each element \({s}_{ij}(i,j=\text{1,2},...,n)\) representing the similarity between the i-th and j-th samples. Similarly, \({\varvec{D}}\in {{\varvec{R}}}^{n\times n}\) is the distance matrix of \({\varvec{X}}\), with \({d}_{ij}\) the Euclidean distance between the i-th and j-th samples, i.e., \({d}_{ij}={\Vert {{\varvec{x}}}_{i}-{{\varvec{x}}}_{j}\Vert }_{2}\). By incorporating the fuzzy index into the local adaptive spectral clustering framework and referencing the objective function of FCM, we define the objective function of FSC as follows:

$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i=1}^{n}{\sum }_{j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}\quad s.t.\hspace{0.33em}\forall i,{\sum }_{j=1}^{n}{s}_{ij}=1,\hspace{0.33em}0<{s}_{ij}<1,\hspace{0.33em}m>0,m\ne 1$$
(10)

Obviously, compared to the classical spectral clustering method, the optimized similarity matrix \({\varvec{S}}\) becomes sparser, so it can be used directly for partitioning without the K-means clustering method. Besides, the fuzzy index reduces the impact of the similarity matrix \({\varvec{S}}\) on the clustering results and improves the adaptability of spectral clustering.

Although the sparsity of the similarity matrix \({\varvec{S}}\) has been improved by introducing the fuzzy index, this alone cannot guarantee that the dataset is partitioned into the correct number of clusters, because the number of connected components of the similarity matrix \({\varvec{S}}\) is not necessarily equal to \(c\). To ensure that the optimal similarity matrix \({\varvec{S}}\) can be divided into exactly \(c\) clusters, following14, we add a rank constraint \(rank({{\varvec{L}}}_{{\varvec{S}}})=n-c\) on the Laplacian matrix of \({\varvec{S}}\). The new objective function is as follows:

$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i=1}^{n}{\sum }_{j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}\quad s.t.\hspace{0.33em}\forall i,{\sum }_{j=1}^{n}{s}_{ij}=1,\hspace{0.33em}0<{s}_{ij}<1,\hspace{0.33em}m>0,m\ne 1,\hspace{0.33em}rank\left({{\varvec{L}}}_{{\varvec{S}}}\right)=n-c$$
(11)

According to14, by adding the rank constraint on the Laplacian matrix of \({\varvec{S}}\), the optimized \({\varvec{S}}\) has exactly \(c\) connected components, which equals the number of clusters. Compared to the objective function in14, introducing the fuzzy index further improves the sparsity of the similarity matrix while reducing the sensitivity of spectral clustering to the input similarity, thereby improving the adaptability of FSC.
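
To make the role of the rank constraint concrete: a graph Laplacian has exactly \(c\) zero eigenvalues if and only if the graph has exactly \(c\) connected components, which is what \(rank({{\varvec{L}}}_{{\varvec{S}}})=n-c\) enforces. The following sketch (the tolerance value is an illustrative assumption) counts the connected components of a learnt similarity matrix:

```python
import numpy as np

def count_connected_components(S, tol=1e-8):
    """Number of connected components = number of (near-)zero eigenvalues of L_S."""
    W = (S + S.T) / 2.0                       # symmetrize the learnt similarity
    L_S = np.diag(W.sum(axis=1)) - W          # unnormalized Laplacian of S
    eigvals = np.linalg.eigvalsh(L_S)
    return int(np.sum(eigvals < tol))
```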

Optimizing objective function

In order to optimize Eq. (11), we first need to transform it. Suppose \({\sigma }_{i}({{\varvec{L}}}_{{\varvec{S}}})\) denotes the i-th smallest eigenvalue of \({{\varvec{L}}}_{{\varvec{S}}}\); since \({{\varvec{L}}}_{{\varvec{S}}}\) is a positive semi-definite matrix, we have \({\sigma }_{i}({{\varvec{L}}}_{{\varvec{S}}})\ge 0\). Considering that \(\beta\) is sufficiently large, Eq. (11) can be transformed as follows:

$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i=1}^{n}{\sum }_{j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}+2\beta {\sum }_{i=1}^{c}{\sigma }_{i}\left({{\varvec{L}}}_{{\varvec{S}}}\right)\quad s.t.\hspace{0.33em}\forall i,{\sum }_{j=1}^{n}{s}_{ij}=1,\hspace{0.33em}0<{s}_{ij}<1,\hspace{0.33em}m>0,m\ne 1$$
(12)

When \(\beta\) is sufficiently large, the second term of Eq. (12) tends to zero during the optimization, so the constraint \(rank({{\varvec{L}}}_{{\varvec{S}}})=n-c\) is satisfied. According to Ky Fan's theorem26, we have

$${\sum }_{i=1}^{c}{\sigma }_{i}\left({{\varvec{L}}}_{{\varvec{S}}}\right)=\underset{{\varvec{F}}\in {{\varvec{R}}}^{n\times c},{{\varvec{F}}}^{T}{\varvec{F}}={\varvec{I}}}{min}Tr\left({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}}\right)$$
(13)

where \({\varvec{F}}\) is formed by the eigenvectors corresponding to the \(c\) smallest eigenvalues of \({{\varvec{L}}}_{{\varvec{S}}}\). Then Eq. (12) can be further transformed into:

$$\underset{{\varvec{S}},{\varvec{F}}}{min}\hspace{0.33em}J={\sum }_{i,j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}+2\beta Tr\left({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}}\right)\quad s.t.\hspace{0.33em}\forall i,{\sum }_{j=1}^{n}{s}_{ij}=1,\hspace{0.33em}0\le {s}_{ij}\le 1,\hspace{0.33em}m>0,m\ne 1,\hspace{0.33em}{\varvec{F}}\in {{\varvec{R}}}^{n\times c},{{\varvec{F}}}^{T}{\varvec{F}}={\varvec{I}}$$
(14)

Obviously, compared to Eq. (12), Eq. (14) is easier to optimize. We therefore adopt an alternating optimization method to solve Eq. (14); the main steps are summarized as follows:

When \({\varvec{S}}\) is fixed, optimize \({\varvec{F}}\). Equation (14) becomes

$$\underset{{\varvec{F}}\in {{\varvec{R}}}^{n\times c},{{\varvec{F}}}^{T}{\varvec{F}}={\varvec{I}}}{min}J=Tr\left({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}}\right)$$
(15)

The optimal solution of Eq. (15) is formed by the eigenvectors corresponding to the \(c\) smallest eigenvalues of \({{\varvec{L}}}_{{\varvec{S}}}\).
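
In code, the \({\varvec{F}}\)-update of Eq. (15) reduces to an eigendecomposition of the current Laplacian \({{\varvec{L}}}_{{\varvec{S}}}\); a minimal sketch follows, with the symmetrization step added as a numerical safeguard of our own:

```python
import numpy as np

def update_F(S, c):
    """Eq. (15): F is formed by the eigenvectors of the c smallest eigenvalues of L_S."""
    W = (S + S.T) / 2.0
    L_S = np.diag(W.sum(axis=1)) - W
    _, eigvecs = np.linalg.eigh(L_S)          # ascending eigenvalues
    return eigvecs[:, :c]
```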

When \({\varvec{F}}\) is fixed, optimize \({\varvec{S}}\). Equation (14) becomes

$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i,j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}+2\beta Tr\left({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}}\right)\quad s.t.\hspace{0.33em}\forall i,{\sum }_{j=1}^{n}{s}_{ij}=1,\hspace{0.33em}0\le {s}_{ij}\le 1,\hspace{0.33em}m>0,m\ne 1,\hspace{0.33em}{\varvec{F}}\in {{\varvec{R}}}^{n\times c},{{\varvec{F}}}^{T}{\varvec{F}}={\varvec{I}}$$
(16)

Considering that \({{\sum }_{i,j=1}^{n}\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}{s}_{ij}=2Tr({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}})\), where \({{\varvec{f}}}_{i}\) denotes the i-th row of \({\varvec{F}}\in {{\varvec{R}}}^{n\times c}\), Eq. (16) becomes:

$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i,j=1}^{n}\left({s}_{ij}^{m}{d}_{ij}^{2}+\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}{s}_{ij}\right)\quad s.t.\hspace{0.33em}\forall i,{\sum }_{j=1}^{n}{s}_{ij}=1,\hspace{0.33em}0\le {s}_{ij}\le 1,\hspace{0.33em}m>0,m\ne 1$$
(17)

For the optimization of Eq. (17), we adopt the Lagrange multiplier method to solve the problem, and the Lagrange function is as follows:

$$J={\sum }_{i,j=1}^{n}\left({s}_{ij}^{m}{d}_{ij}^{2}+\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}{s}_{ij}\right)+{\sum }_{i=1}^{n}{\alpha }_{i}\left(1-{\sum }_{j=1}^{n}{s}_{ij}\right)$$
(18)

where \({\alpha }_{i}\) is the Lagrange multiplier. Setting \(\frac{\partial J}{\partial {s}_{ij}}=0\), we have

$$\frac{\partial J}{\partial {s}_{ij}}=m{s}_{ij}^{m-1}{d}_{ij}^{2}+\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}-{\alpha }_{i}=0$$
(19)

Then we have:

$${s}_{ij}={\left(\frac{{\alpha }_{i}-\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}}{m{d}_{ij}^{2}}\right)}^{\frac{1}{m-1}}$$
(20)

Substituting Eq. (20) into \({\sum }_{j=1}^{n}{s}_{ij}=1\), we can obtain \({\alpha }_{i}\); substituting it back into Eq. (20), we have:

$${s}_{ij}={\left(\frac{{\left({\sum }_{k=1,k\ne i}^{n}\left(m{d}_{ik}^{2}+\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{k}\Vert }_{2}^{2}\right)\right)}^{\frac{1}{m-1}}-\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}}{m{d}_{ij}^{2}}\right)}^{\frac{1}{m-1}}$$
(21)
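
Putting the two steps together, the alternating optimization can be sketched as follows. The S-update implements the closed-form expression of Eq. (21) row by row and reuses the update_F helper sketched above; the initial similarity matrix, the clamping of small values, and the final row renormalization are defensive assumptions of this sketch rather than steps specified by the paper.

```python
import numpy as np

def update_S(D2, F, m, beta):
    """Eq. (21): row-wise closed-form update of the similarity matrix S."""
    n = D2.shape[0]
    S = np.zeros((n, n))
    F_dist2 = ((F[:, None, :] - F[None, :, :]) ** 2).sum(axis=2)   # ||f_i - f_j||^2
    for i in range(n):
        cost = m * D2[i] + beta * F_dist2[i]                       # m*d_ik^2 + beta*||f_i - f_k||^2
        alpha_term = np.sum(np.delete(cost, i)) ** (1.0 / (m - 1.0))
        num = np.maximum(alpha_term - beta * F_dist2[i], 1e-12)    # clamp to keep s_ij positive
        S[i] = (num / np.maximum(m * D2[i], 1e-12)) ** (1.0 / (m - 1.0))
        S[i, i] = 0.0
        S[i] /= S[i].sum()                                         # defensive renormalization: row sum = 1
    return S

def fsc(X, c, m=2.0, beta=50.0, max_iter=30):
    """Alternating optimization of Eq. (14): update F (Eq. 15), then S (Eq. 21)."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)        # squared Euclidean distances
    S = np.exp(-D2)                                                # simple initial similarity (assumption)
    S /= S.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        F = update_F(S, c)
        S = update_S(D2, F, m, beta)
    return S, F
```

In practice, cluster labels can then be read directly from the connected components of the learnt S (for example with scipy.sparse.csgraph.connected_components), without any K-means step.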

Time complexity

The main time complexity of the proposed FSC method is concentrated in Step 1, Step 2.1, and Step 2.2 of Algorithm 1. Step 1 initializes the similarity matrix \({\mathbf{S}}\) and requires \({\rm O}(n^{2} dk)\). Step 2.1 computes the eigendecomposition of \({\mathbf{L}}_{{\mathbf{S}}}\) and requires \({\rm O}(n^{2} d + n^{3} )\). The time complexity of Eq. (21) is \({\rm O}(mn^{2} dk)\), so the time complexity of Step 2.2 is \({\rm O}(mn^{2} dk)\). Therefore, the total time complexity of Algorithm 1 is \({\rm O}(n^{2} dk + t(n^{2} d + n^{3} + mn^{2} dk))\), where \(t\) is the number of iterations. From the above, we can observe that the time complexity of FSC is related to \(n\), \(d\), and \(k\).

Experiments and results

Experimental setting

To evaluate the clustering performance of FSC, three experiments were conducted on different datasets. The first experiment used four simple 2D simulation datasets, the second involved 10 commonly used UCI datasets, and the third focused on high-dimensional or large-sample datasets with complex structures. By comparing FSC with K-means, FCM, NC, and clustering with adaptive neighbors (CAN)14, the study comprehensively assessed FSC's effectiveness and applicability across various datasets.

To evaluate FSC’s performance on 2D datasets, we generated four simulation datasets with varied structures: clear structures (SD1, SD4), uncertainty (SD2, SD4), and large scales (SD1, SD2). Detailed information is provided in Fig. 1.

Fig. 1 Simulation datasets (SD1–SD4).

To assess FSC’s practicality, we selected 10 complex datasets from the UCI repository and five high-dimensional datasets, including images and fonts. These datasets, with their higher dimensions and complex structures, provide a robust test for evaluating FSC’s effectiveness.

For the K-means method, the parameter k represents the number of clusters. In the FCM (fuzzy C-means) method, the parameter m is set to 2. For the NC method, the parameter σ is set to 1. In the CAN method, the parameter γ is set to the average distance between all samples. In the FSC method, the parameter m is selected from {1.5, 2, 2.5, 3, 3.5} and the parameter β from {50, 60, 70, 80, 100}.

Evaluation criterion

In our experiments, two commonly used clustering indexes are selected as evaluation indexes, which are clustering accuracy (ACC)27 and normalized mutual information (NMI)28. The calculation formulas are as follows:

$$\text{ACC} = \frac{{\sum }_{i=1}^{n}\delta \left({l}_{i},{m}_{i}\right)}{n}$$
(22)
$${\text{NMI}}\left(\mathbf{X},\mathbf{Z}\right)=\frac{I\left({\varvec{X}},{\varvec{Z}}\right)}{\sqrt{H\left({\varvec{X}}\right)H\left({\varvec{Z}}\right)}}$$
(23)

where \(n\) is the total number of training samples, \({l}_{i}\) and \({m}_{i}\) represent the true label and the predicted label of the i-th sample respectively, and \(\delta\) is the comparison function, which equals 1 when the two labels match and 0 otherwise. \({\varvec{X}}\) represents the true labels of the dataset and \({\varvec{Z}}\) represents the predicted labels, \(I({\varvec{X}},{\varvec{Z}})\) represents the mutual information between \({\varvec{X}}\) and \({\varvec{Z}}\), and \(H({\varvec{X}})\) and \(H({\varvec{Z}})\) represent the entropy of \({\varvec{X}}\) and \({\varvec{Z}}\) respectively.
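
For reference, both criteria can be computed as sketched below; NMI is taken from scikit-learn with geometric-mean normalization to match Eq. (23), while ACC uses the usual Hungarian matching between predicted clusters and true classes (the helper names are ours).

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score

def clustering_accuracy(true_labels, pred_labels):
    """ACC: best one-to-one mapping of predicted clusters to true classes (Eq. (22))."""
    true_labels = np.asarray(true_labels)
    pred_labels = np.asarray(pred_labels)
    classes = np.unique(true_labels)
    clusters = np.unique(pred_labels)
    # Contingency table between predicted clusters and true classes
    counts = np.zeros((clusters.size, classes.size), dtype=int)
    for i, cl in enumerate(clusters):
        for j, cs in enumerate(classes):
            counts[i, j] = np.sum((pred_labels == cl) & (true_labels == cs))
    row, col = linear_sum_assignment(-counts)          # maximize matched samples
    return counts[row, col].sum() / true_labels.size

def nmi(true_labels, pred_labels):
    """NMI as in Eq. (23), using the geometric mean of the entropies."""
    return normalized_mutual_info_score(true_labels, pred_labels,
                                        average_method="geometric")
```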

Experimental results and conclusion

On simulation datasets

To verify FSC’s effectiveness on simulation datasets, SD1–SD4 were used. K-means, FCM, and NC were run 20 times to report average results, while CAN and FSC were run once. For FSC, parameter values were optimized based on the highest NMI value. Results are shown in Fig. 2.

Fig. 2 Experimental results of all the methods on simulation datasets.

Figure 2 shows that FSC performs well in clustering. For clearly shaped datasets, CAN and FSC outperform the other methods. In datasets with uncertainty, FSC's fuzzy index mitigates the impact of the similarity matrix, leading to better performance than CAN. Overall, FSC demonstrates strong clustering performance on both clear and uncertain datasets.

On the real datasets and high dimensional datasets

To verify FSC’s practicality on real datasets, we selected 10 datasets from the UCI repository. Furthermore, in order to further verify its applicability to high-dimensional datasets, five high-dimensional datasets were added in this experiment. Experimental results are presented in Table 1.

Table 1 ACC and NMI of all methods on real datasets and high dimensional datasets.

Table 1 shows that FSC outperforms other methods, including K-means, FCM, and NC, on real datasets due to its ability to effectively capture local information. FSC, by incorporating fuzzy ideas, reduces the impact of the similarity matrix and enhances performance, particularly on high-dimensional datasets, compared to CAN.

Running speed analysis

In order to evaluate the running speed of all methods, we compare the once running time of each method on the real datasets (UCI datasets and high-dimensional datasets) under the optimal parameters. The experimental results are shown in Table 2.

Table 2 Running times of all methods on real datasets and high dimensional datasets.

For large-scale datasets (e.g., Coil, Mnist, MSRA, Palm, USPS), spectral clustering takes more time due to the similarity matrix calculations. CAN and FSC need the most running time because they use an alternating optimization procedure. FSC is slower than CAN because the introduction of the fuzzy index adds computation.

Statistical analysis

In this experiment, we use the Friedman test with the Holm post-hoc test29 to evaluate FSC’s clustering performance compared to the other methods in terms of ACC on the real and high-dimensional datasets. The Friedman test assesses overall statistical significance, while the Holm post-hoc test compares FSC with each of the other methods. Results are reported in Tables 3 and 4.

Table 3 The ranking of all the adopted methods based on the Friedman test.
Table 4 The post-Holm test results of the comparative methods.

Table 3 shows the ranking of the methods based on the Friedman test, with lower values indicating better performance. FSC has the best clustering performance, as it attains the lowest ranking value. The null hypothesis, which assumes no significant difference among the methods, is rejected since the p-value (0.000001) is below 0.005, confirming significant differences among all methods.

Table 4 presents p-values and statistical magnitudes comparing FSC with other methods. The null hypothesis of no significant difference is rejected if the p-value is below 0.005. The results in Table 4 show statistically significant differences between FSC and all the comparative methods.

Conclusions

To address the high sensitivity of clustering results to the input similarity matrix, this study introduces a local adaptive fuzzy spectral clustering method (FSC). FSC reduces the impact of the similarity matrix by incorporating a fuzzy index and simplifies the spectral clustering process by directly using the learnt similarity matrix. The fuzzy index also enhances FSC’s ability to handle uncertainties, improving clustering performance on complex datasets. Experimental results demonstrate that FSC performs better on both simulation and real datasets, including high-dimensional datasets.