Introduction

With the development of the modern global market, product quality has become one of the key factors determining the competitiveness of enterprises. In the formation of product quality, process quality is one of the most basic elements, because the quality of every process directly or indirectly influences the quality of the final product; process quality control is therefore the essence of quality management in manufacturing.

For univariate process quality, monitoring is traditionally achieved with Shewhart's control chart, a core tool of statistical process control (SPC)1,2,3,4. In modern manufacturing, however, many processes involve more than one quality component. Because the quality components are correlated, all components and their correlation must be monitored simultaneously5,6. The theory of monitoring the correlation shift of all quality components with T2 control charts was originally proposed by Hotelling7. For a p-dimensional process quality y = (y1, y2,…, yp)T, the T2 statistic is defined as:

$$T^{2} = ({\mathbf{y}} - {{\varvec{\upmu}}})^{T} {{\varvec{\Sigma}}}^{ - 1} ({\mathbf{y}} - {{\varvec{\upmu}}})$$
(1)

where μ is the mean vector, and Σ is the covariance matrix of y. When T2 > 0, it signifies that all the quality components in y are correlated.

The distribution of the T2 statistic can take different forms in general8,9,10. In particular, when y follows the normal distribution N(μ, Σ), the T2 statistic follows the χ2 distribution with p degrees of freedom; the proof is given in the Supplementary Information. If α is the false alarm probability, the upper control limit (UCL) of the T2 statistic is \(\chi_{\alpha }^{2} (p)\) and the lower control limit (LCL) is 0, so a T2 control chart can be established to monitor the correlation shift of y. The T2 control chart fully accounts for the correlation between components and gives an accurate false alarm probability when the components are correlated; however, it cannot pinpoint the cause(s) of the correlation shift when the process is out of control. On the basis of the T2 statistic, scholars have therefore carried out extensive research on methods for diagnosing abnormal correlation shifts between quality components, successively proposing diagnostic methods based on component combinations, principal component analysis, orthogonal decomposition of the T2 statistic, and intelligent diagnosis.
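As a minimal illustration of Eq. (1) and the χ2-based control limit, the following Python sketch computes the T2 statistic of a new observation and compares it with the UCL \(\chi_{\alpha}^{2}(p)\); the mean vector, covariance matrix, observation and α used here are hypothetical placeholders, not values from this paper.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical in-control parameters of a 3-dimensional process quality
mu = np.array([10.0, 5.0, 2.0])                 # mean vector
Sigma = np.array([[1.0, 0.6, 0.3],
                  [0.6, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])             # covariance matrix
alpha = 0.0027                                  # false alarm probability

def t2_statistic(y, mu, Sigma):
    """T^2 = (y - mu)^T Sigma^{-1} (y - mu), Eq. (1)."""
    d = y - mu
    return float(d @ np.linalg.solve(Sigma, d))

ucl = chi2.ppf(1 - alpha, df=len(mu))           # UCL = chi^2_alpha(p)

y_new = np.array([10.8, 4.1, 2.9])              # a new observation (hypothetical)
t2 = t2_statistic(y_new, mu, Sigma)
print(f"T2 = {t2:.3f}, UCL = {ucl:.3f}, out of control: {t2 > ucl}")
```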

Diagnosis method based on component combinations

For a p-dimensional process quality y = (y1, y2,…, yp)T, when the T2 control chart K, which monitors the correlation shift of all the quality components, shows an abnormality, the correlation of one or more combinations of quality components must be abnormal. A straightforward way to identify the specific combinations responsible is the exhaustive approach: list all possible component combinations and create a T2 control chart for each. When chart K signals, the combinations that caused the signal can be determined by examining the T2 control charts of all component combinations one by one11,12,13; this approach is referred to as the component combinations based diagnostic (CCBD) method in this paper. Although theoretically sound and appealing, it has an inherent deficiency. For a p-dimensional process quality, the number of T2 control charts required is \(N = C_{p}^{2} + C_{p}^{3} + \cdots + C_{p}^{p} = 2^{p} - p - 1\), an exponential function of p, so the space complexity is O(2^p). When p is small the approach is feasible, but as p increases N grows sharply and the diagnostic system expands dramatically, which makes the method difficult to apply in practice.
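To make this growth concrete, the short sketch below simply tabulates N = 2^p − p − 1 for a few values of p (an illustration only, not part of the CCBD method itself).

```python
# Number of T2 control charts required by the CCBD method: N = 2^p - p - 1
for p in (3, 5, 8, 10, 15):
    n_charts = 2**p - p - 1
    print(f"p = {p:2d}  ->  N = {n_charts}")
# p =  3  ->  N = 4
# p =  5  ->  N = 26
# p =  8  ->  N = 247
# p = 10  ->  N = 1013
# p = 15  ->  N = 32752
```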

In addition, the CCBD method cannot avoid redundancy in its diagnostic results. For example, in a 4-dimensional process quality y = (y1, y2, y3, y4)T, suppose an abnormal correlation shift between y1 and y2 is the only cause that drives the correlation of y out of control. With the CCBD method, not only is the T2 control chart monitoring the correlation shift of (y1, y2) out of control, but the T2 control charts of the other combinations containing y1 and y2, namely (y1, y2, y3) and (y1, y2, y4), are out of control as well. This phenomenon, in which the correlation shift of one component combination drives the charts of all combinations containing it out of control, is called the redundancy of diagnostic messages. Such redundancy is a disturbance for process quality adjustment.

Diagnosis method based on principal component analysis

When the number of quality components to be monitored in a manufacturing process is large, analyzing the process quality data directly leads to a significant increase in the computational effort of diagnosis. Reducing the complexity of the process quality data by appropriate means is therefore an effective way to improve diagnostic efficiency. Because principal component analysis (PCA) is a useful tool for handling high-dimensional data, scholars have proposed PCA-based methods14,15,16,17: the original process quality y = (y1, y2,…, yp)T is transformed into p independent principal components sorted in decreasing order of variance, denoted z = (z1, z2, … , zp)T. First, p Shewhart control charts are constructed to monitor the normality of each zi; second, the first n (n < p) principal components, whose cumulative variance exceeds a specified critical value, are grouped into component pairs, and T2 control charts are constructed to monitor the normality of (zi, zj) (i, j ≤ n, i ≠ j); finally, the remaining principal components (zn+1, zn+2,…, zp) are monitored as a group by a single T2 control chart.

Compared with the CCBD method, the number of control charts in the PCA-based method is \(N = p + C_{n}^{2} + 1 = p + n(n - 1)/2 + 1\), the space complexity is approximately O(p^2), and the diagnostic efficiency is improved. However, n still increases rapidly as p grows, so the diagnostic system remains large. Furthermore, because zi generally has no engineering meaning after the transformation, the cause(s) of an out-of-control correlation shift in y can only be identified through a comprehensive analysis of all control chart results together with the mapping between y and z; this increases the diagnostic computation and affects the accuracy of the results. The redundancy of diagnostic messages also cannot be avoided.
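The following sketch outlines the PCA-based chart layout described above on randomly generated placeholder data; the 85% cumulative-variance threshold is an assumption chosen only for illustration.

```python
import numpy as np

# Placeholder process data: 200 observations of a 5-dimensional quality vector
rng = np.random.default_rng(0)
mix = rng.normal(size=(5, 5))
Y = rng.normal(size=(200, 5)) @ mix.T            # correlated quality components

# Transform to principal components sorted by decreasing variance
S = np.cov(Y, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
Z = (Y - Y.mean(axis=0)) @ eigvecs               # scores z monitored by the p Shewhart charts

# Keep the first n components whose cumulative variance exceeds an assumed 85% threshold
n = int(np.searchsorted(np.cumsum(eigvals) / eigvals.sum(), 0.85) + 1)

p = Y.shape[1]
n_charts = p + n * (n - 1) // 2 + 1              # p Shewhart charts + C(n,2) pair charts + 1 residual chart
print(f"n = {n}, number of control charts = {n_charts}")
```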

Diagnosis method based on correlation orthogonal decomposition

In 1995, Mason, Young and Tracy18,19,20 showed, using regression analysis, that the T2 statistic can be decomposed into conditional and unconditional terms that carry equal weight in the decomposition and are mutually orthogonal. Control limits are then established from the statistical distributions of the conditional and unconditional terms to diagnose the specific cause(s) when the manufacturing process is abnormal. Compared with diagnostic methods based on principal component analysis, the conditional and unconditional terms obtained by the MYT orthogonal decomposition correspond directly to quality components or component combinations, which improves the accuracy of the diagnostic results.

As an example, in bivariate process quality y = (y1, y2)T, the basic idea of the MYT orthogonal decomposition method21,22,23,24 is to decompose the T2 statistic into the following form:

$$T^{2} = T_{1}^{2} + T_{2 \cdot 1}^{2}$$
(2)

where \(T_{1}^{2}\), called the unconditional term, is related only to the quality component y1 and measures the contribution of a shift in y1 to the T2 statistic; and \(T_{2 \bullet 1}^{2}\), called the conditional term, is related to the conditional probability P(y2|y1) and measures the contribution of the correlation between y1 and y2 to the T2 statistic.

Similar to Eq. (2), the T2 statistic can also be decomposed into another form:

$$T^{2} = T_{2}^{2} + T_{1 \cdot 2}^{2}$$
(3)

where the unconditional term \(T_{2}^{2}\) is related only to the quality component y2 and measures the contribution of a shift in y2 to the T2 statistic; the conditional term \(T_{1 \bullet 2}^{2}\) depends on the conditional probability P(y1|y2) and measures the contribution of the correlation between y2 and y1 to the T2 statistic.

When y1 and y2 are correlated, the conditional probabilities satisfy P(y2|y1) ≠ P(y1|y2), and hence the conditional terms \(T_{2 \cdot 1}^{2} \ne T_{1 \cdot 2}^{2}\); Eqs. (2) and (3) are therefore two distinct decompositions of the T2 statistic. In general, for a p-dimensional process quality y = (y1, y2,…, yp)T, the decomposition has p × (p − 1) × ⋯ × 2 × 1 = p! distinct forms, so the space complexity is O(p!). As the number of quality components increases, analyzing every possible form of the decomposition leads to a significant increase in computation and a serious reduction in diagnostic efficiency. Moreover, the accuracy of the diagnostic results obtained with this method is affected when there are strong correlations between different quality components.
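A small numerical sketch of the bivariate decomposition in Eqs. (2) and (3), using hypothetical in-control parameters: the unconditional terms are computed directly from the marginal variances, and each conditional term is obtained as the remainder of the full T2 statistic.

```python
import numpy as np

# Hypothetical in-control parameters of a bivariate process quality y = (y1, y2)
mu = np.array([2.0, 5.0])
Sigma = np.array([[1.0, 0.8],
                  [0.8, 2.0]])

y = np.array([3.1, 5.4])                      # a new observation (hypothetical)

d = y - mu
T2 = float(d @ np.linalg.solve(Sigma, d))     # full T2 statistic, Eq. (1)

T1_sq = d[0]**2 / Sigma[0, 0]                 # unconditional term T_1^2
T21_sq = T2 - T1_sq                           # conditional term T_{2.1}^2, from Eq. (2)

T2_sq = d[1]**2 / Sigma[1, 1]                 # unconditional term T_2^2
T12_sq = T2 - T2_sq                           # conditional term T_{1.2}^2, from Eq. (3)

print(f"T2 = {T2:.4f}")
print(f"T1^2 = {T1_sq:.4f}, T2.1^2 = {T21_sq:.4f}")
print(f"T2^2 = {T2_sq:.4f}, T1.2^2 = {T12_sq:.4f}")
```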

Intelligent diagnosis methods

In addition to the traditional diagnostic methods based on mathematical analysis, intelligent diagnostic methods have in recent years, with the development of artificial intelligence technology, been applied to multivariate process quality diagnosis; methods based on artificial neural networks (ANN)25,26,27,28, Bayesian networks29,30,31,32, support vector machines (SVM)33,34,35, etc., have been widely used. Intelligent diagnostic methods can effectively reduce the scale of the diagnostic system and improve diagnostic efficiency; however, they generally require a large amount of data to train the model parameters, and the resulting networks are usually tailored to specific applications, so their generality is greatly restricted. Establishing a general and efficient method for multivariate process quality correlation diagnosis therefore remains a major problem in the field of quality management.

Sketch of the algorithm

In this paper, a new correlation diagnosis method based on quality component grouping is proposed. For the multivariate process quality y = (y1, y2,…, yp)T, three theorems describing the properties of the covariance matrix are first established from the statistical viewpoint of product quality in manufacturing processes. Then, drawing on the idea of decomposing the T2 statistic in the MYT orthogonal decomposition method, the correlation decomposition theorem is proved; it decomposes the correlation of all the quality components into the correlations of all the component pairs and reduces the space complexity of the diagnostic system to O(p^2). Next, following the grouping idea of principal component analysis, the quality components are grouped according to their correlations, so that the correlations between components within the same group are as large as possible and the correlations between components of different groups are as small as possible. Finally, drawing on the principle of the component combination diagnosis method and ignoring the correlations between different groups, the quality components within each group are combined into component pairs and the corresponding T2 control charts are established, which constitutes the multivariate process quality correlation diagnostic system; the space complexity of the diagnostic system is thereby reduced to approximately O(p), improving diagnostic efficiency.

Covariance matrix properties of multivariate process quality

In the manufacturing process, the factors affecting product quality can be attributed to five aspects: man, machine, material, method and environment (4M1E). On this basis, ISO 9000 adds another three factors: manufacturing software, auxiliary materials and utilities. A change in any one of these factors affects the final quality of the product, so product quality fluctuates during manufacturing. Tolerance theory is a direct proof of this fluctuation.

For the multivariate process quality y = (y1, y2,…, yp)T, the covariance matrix is an important parameter to describe its correlation. Combined with the fluctuation of the product's quality in the manufacturing process, this paper firstly establishes 3 theorems describing the characteristics of the covariance matrix of multivariate process quality.

Theorem 1

In the covariance matrix Σ of the multivariate process quality y = (y1, y2,…, yp)T, none of the elements is 0.

Suppose the mean vector of y is μ = (μ1, μ2,…, μp)T. According to the definition of the covariance matrix, it is known that:

$$\begin{aligned} {{\varvec{\Sigma}}} & = \left[ {\begin{array}{*{20}c} {E[(y_{1} - \mu_{1} )(y_{1} - \mu_{1} )]} & {E[(y_{1} - \mu_{1} )(y_{2} - \mu_{2} )]} & \cdots & {E[(y_{1} - \mu_{1} )(y_{p} - \mu_{p} )]} \\ {E[(y_{2} - \mu_{2} )(y_{1} - \mu_{1} )]} & {E[(y_{2} - \mu_{2} )(y_{2} - \mu_{2} )]} & \cdots & {E[(y_{2} - \mu_{2} )(y_{p} - \mu_{p} )]} \\ \vdots & \vdots & \ddots & \vdots \\ {E[(y_{p} - \mu_{p} )(y_{1} - \mu_{1} )]} & {E[(y_{p} - \mu_{p} )(y_{2} - \mu_{2} )]} & \cdots & {E[(y_{p} - \mu_{p} )(y_{p} - \mu_{p} )]} \\ \end{array} } \right] \\ & = \left[ {\begin{array}{*{20}c} {\sigma_{11} } & {\sigma_{12} } & \cdots & {\sigma_{1p} } \\ {\sigma_{21} } & {\sigma_{22} } & \cdots & {\sigma_{2p} } \\ \vdots & \vdots & \ddots & \vdots \\ {\sigma_{p1} } & {\sigma_{p2} } & \cdots & {\sigma_{pp} } \\ \end{array} } \right] \\ \end{aligned}$$
(4)

For any element \(\sigma_{ij} = E[(y_{i} - \mu_{i} )(y_{j} - \mu_{j} )]\) in Σ, the sufficient and necessary condition for it to be 0 is:

$$y_{i} = \mu_{i} \quad {\text{or}} \quad y_{j} = \mu_{j}$$
(5)

According to the properties of mathematical expectation, Eq. (5) implies that the quality component yi or yj is constant in the manufacturing process. This clearly conflicts with the viewpoint that product quality fluctuates; therefore, Eq. (5) does not hold, i.e., none of the elements in Σ is 0.

Theorem 2

The covariance matrix Σ of the multivariate process quality y = (y1, y2, … , yp)T is a real symmetric positive definite matrix.

According to the definition of the covariance matrix in Eq. (4):

$$\begin{aligned} & \sigma_{ij} = E[(y_{i} - \mu_{i} )(y_{j} - \mu_{j} )] \\ & \sigma_{ji} = E[(y_{j} - \mu_{j} )(y_{i} - \mu_{i} )] \\ \end{aligned}$$

From the properties of mathematical expectation it follows that:

$$\sigma_{ij} = \sigma_{ji}$$

That is, Σ is a symmetric matrix.

Let p-dimensional vector c = (c1, c2,…, cp)T ≠ 0.

$${\mathbf{c}}^{T} {\mathbf{\Sigma c}} = (c_{1} ,c_{2} , \cdots c_{p} ){{\varvec{\Sigma}}}(c_{1} ,c_{2} , \cdots c_{p} )^{T}$$
(6)

Substituting Eq. (4) into Eq. (6) and simplifying, we get

$${\mathbf{c}}^{T} {\mathbf{\Sigma c}} = E\left[ {\left( {\sum\limits_{i = 1}^{p} {c_{i} (y_{i} - \mu_{i} )} } \right)\left( {\sum\limits_{k = 1}^{p} {(y_{k} - \mu_{k} )c_{k} } } \right)} \right]$$
(7)

Let the random variable \(z = \sum\nolimits_{i = 1}^{p} {c_{i} (y_{i} - \mu_{i} )}\); substituting this into Eq. (7) gives

$${\mathbf{c}}^{T} {\mathbf{\Sigma c}} = E(z^{2} ) \ge 0$$

From the proof of Theorem 1, it is clear that according to the viewpoint of the fluctuation of the product's quality, z ≠ 0, i.e.

$${\mathbf{c}}^{T} {\mathbf{\Sigma c}} = E(z^{2} ) > 0$$

Therefore, the covariance matrix Σ of the multivariate process quality y = (y1, y2,…, yp)T is a real symmetric positive definite matrix.

Theorem 3

The inverse matrix Σ−1 of the covariance matrix Σ of the multivariate process quality y = (y1, y2,…, yp)T is a real symmetric positive definite matrix.

First prove the symmetry of Σ−1. It follows from the symmetry of Σ:

$${{\varvec{\Sigma}}} = {{\varvec{\Sigma}}}^{T}$$

Inverting both ends of the above equation:

$${{\varvec{\Sigma}}}_{{}}^{ - 1} = ({{\varvec{\Sigma}}}_{{}}^{T} )^{ - 1} = ({{\varvec{\Sigma}}}_{{}}^{ - 1} )^{T}$$

The above equation shows that Σ−1 is a symmetric matrix.

Let the eigenvalues of Σ be λ1, λ2,…, λp. By the positive definiteness of Σ, λi > 0 (i = 1, 2, …, p). According to the properties of the inverse matrix, the eigenvalues of Σ−1 are 1/λ1, 1/λ2,…, 1/λp, i.e., the eigenvalues of Σ−1 are all greater than 0, so Σ−1 is a positive definite matrix.
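A quick numerical check of Theorems 2 and 3 on a hypothetical covariance matrix: its eigenvalues are positive, and the eigenvalues of its inverse are their reciprocals.

```python
import numpy as np

# Hypothetical covariance matrix of a 3-dimensional process quality
Sigma = np.array([[2.0, 0.8, 0.3],
                  [0.8, 1.5, 0.5],
                  [0.3, 0.5, 1.0]])

eig_S = np.linalg.eigvalsh(Sigma)                   # eigenvalues of Sigma
eig_Sinv = np.linalg.eigvalsh(np.linalg.inv(Sigma)) # eigenvalues of Sigma^{-1}

print(eig_S)                                        # all positive: Sigma is positive definite (Theorem 2)
print(np.allclose(np.sort(eig_Sinv), np.sort(1.0 / eig_S)))  # True: eigenvalues of Sigma^{-1} are 1/lambda_i (Theorem 3)
```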

Theoretical basis for correlation grouping diagnosis

The exponential relationship between N and p is the main reason why the CCBD method is difficult to apply. If the growth of N with p can be slowed by appropriate means, the drastic expansion of the diagnostic system as p increases can be avoided to a certain extent, and the approach becomes applicable to multivariate process quality management.

Correlation decomposition

Theorem 4

For the multivariate process quality y = (y1, y2,…, yp)T, the correlation of all the components exists if and only if any two components yi and yj are correlated.

Firstly, the sufficiency of Theorem 4 is proved. That any two components yi and yj in y are correlated means that σij ≠ 0. From Theorems 2 and 3, the covariance matrix Σ and its inverse Σ−1 are real symmetric positive definite matrices. From the definition of the T2 statistic in Eq. (1), for any sample data the T2 statistic is greater than 0, i.e., the correlation of all the components exists.

The necessity of Theorem 4 is proved by reductio ad absurdum. The existence of the correlation of all the components in y implies that for any sample data the T2 statistic is greater than 0. From the definition of the T2 statistic in Eq. (1), the covariance matrix Σ of y has an inverse, and the rank of Σ is p:

$$R({{\varvec{\Sigma}}}) = p$$
(8)

Assume yk and yj in y are uncorrelated, i.e., σkj = 0. By the definition of covariance, there is:

$$\sigma_{kj} = E[(y_{k} - \mu_{k} )(y_{j} - \mu_{j} )] = 0$$
(9)

The sufficient and necessary condition for Eq. (9) to hold is yk = μk or yj = μj. Without loss of generality, set yk = μk. From the definition of covariance, for any component yi (i = 1, 2, … , p):

$$\sigma_{ki} = E[(y_{k} - \mu_{k} )(y_{i} - \mu_{i} )] = 0$$
(10)

Equation (10) shows that the kth row and the kth column of the covariance matrix Σ of y are all zero, so R(Σ) ≤ p − 1. This contradicts Eq. (8), so the assumption is invalid, and the necessity of Theorem 4 is proved.

Theorem 4 means that the correlation of all the quality components can be represented by the correlations of component pairs, so the correlation diagnostic system only needs to monitor the correlation shifts of all the component pairs. Adding the T2 control chart that monitors the correlation shift of all the components, the number of T2 control charts is \(N = C_{p}^{2} + 1 = p(p - 1)/2 + 1\); N is now a quadratic polynomial in p, and the space complexity of the diagnostic system is lowered to O(p^2). Compared with the CCBD method, the growth of N with p decreases significantly. Moreover, because a component pair is the smallest combination of components, information redundancy in the diagnostic results is effectively avoided.

Grouping principle

Although correlation decomposition lowers the relationship between N and p to a polynomial one, N still increases rapidly as p grows, so further measures are needed to reduce the scale of the diagnostic system. For this reason, this paper proposes the following grouping principle.

Theorem 5

Let p = p1 + p2 + … + pm, where p and pi (i = 1, 2, … , m) are integers greater than 0, m > 1. In this case there is the following inequality:

$$C_{p}^{2} > \sum\limits_{i = 1}^{m} {C_{{p_{i} }}^{2} }$$
(11)

The proof of Theorem 5 proceeds as follows:

$$\begin{aligned} \sum\limits_{i = 1}^{m} {C_{{p_{i} }}^{2} } & = \sum\limits_{i = 1}^{m} {\frac{{p_{i} (p_{i} - 1)}}{2}} \\ & = \frac{1}{2}\left(\sum\limits_{i = 1}^{m} {p_{i}^{2} } - p\right) \\ & < \frac{1}{2}\left(\sum\limits_{i = 1}^{m} {p_{i}^{2} } + 2\sum\limits_{k < j} {p_{k} p_{j} } - p\right) \\ & = \frac{1}{2}\left(\left(\sum\limits_{i = 1}^{m} {p_{i} } \right)^{2} - p\right) \\ & = \frac{1}{2}(p^{2} - p) \\ & = C_{p}^{2} \\ \end{aligned}$$

Theorem 5 shows that, for the multivariate process quality y = (y1, y2,…, yp)T, if the quality components are grouped according to the degree of correlation, so that the correlations of components within the same group are as large as possible and the correlations of components in different groups are as small as possible, then the number of T2 control charts in the diagnostic model can be further reduced by ignoring the correlations of components located in different groups. The reduction in the number of T2 control charts is \(\sum\nolimits_{k < j} {p_{k} p_{j} }\), where m is the number of groups and pk and pj denote the numbers of quality components contained in the kth and jth groups. In this case, the space complexity of the grouping-based multivariate process quality correlation diagnostic model is approximately O(p).
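A quick numerical illustration of Theorem 5 under a hypothetical grouping: splitting p = 10 components into groups of sizes 4, 3 and 3 reduces the number of pair charts from C(10, 2) = 45 to 12.

```python
from math import comb

group_sizes = [4, 3, 3]                               # hypothetical grouping of p = 10 components
p = sum(group_sizes)

pairs_ungrouped = comb(p, 2)                          # C(p, 2) pair charts without grouping
pairs_grouped = sum(comb(k, 2) for k in group_sizes)  # pair charts within groups only
reduction = pairs_ungrouped - pairs_grouped           # equals the sum of p_k * p_j over distinct groups

print(pairs_ungrouped, pairs_grouped, reduction)      # 45 12 33
```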

Methodology for grouping quality components

Grouping can significantly reduce the number of T2 control charts required in the correlation diagnostic system. Quality components could be grouped by practical experience, but this does not give an accurate estimate of the error introduced by grouping. To analyze the error quantitatively, a grouping method based on analysis of the covariance matrix of the quality components is used here.

Before grouping, the multivariate process quality y = (y1, y2,…, yp)T needs to be standardized so that differences in measurement scales do not affect the grouping results:

$$y_{i}^{*} = \frac{{y_{i} - \mu_{i} }}{{\sigma_{i} }}$$
(12)

where μi and σi are the mean and standard deviation of yi. In the standardized result \({\mathbf{y}}^{*} = (y_{1}^{*} ,y_{2}^{*} , \cdots ,y_{p}^{*} )^{T}\), each component has mean 0 and variance 1.

Factor analysis

Factor analysis is a method of grouping components based on the degree of correlations between different components, using the covariance matrix of a random vector as a reference. The basic model of factor analysis is as follows36,37:

(1) The standardized multivariate process quality y* is an observable random vector with mean vector E(y*) = 0 and covariance matrix D(y*) = Σ*;

(2) The common factor vector F = (F1, F2, …, Fm)T (m < p) is an unobservable random vector with mean vector E(F) = 0 and covariance matrix D(F) = I, where I is the identity matrix (ones on the main diagonal and zeros elsewhere), i.e., the components in F are independent of each other;

(3) The error vector ε = (ε1, ε2, …, εp)T is independent of the common factor vector F, with E(ε) = 0 and diagonal covariance matrix:

$$D({{\varvec{\upvarepsilon}}}) = \left( {\begin{array}{*{20}c} {\sigma_{{\varepsilon_{1} }}^{2} } & {} & {} & {} \\ {} & {\sigma_{{\varepsilon_{2} }}^{2} } & {} & {} \\ {} & {} & \ddots & {} \\ {} & {} & {} & {\sigma_{{\varepsilon_{p} }}^{2} } \\ \end{array} } \right)$$

Under the above conditions, the factor analysis model can be expressed as the following equations:

$$\left\{ {\begin{array}{*{20}l} {y_{1}^{*} = a_{11} F_{1} + a_{12} F_{2} + \cdots + a_{1m} F_{m} + \varepsilon_{1} } \hfill \\ {y_{2}^{*} = a_{21} F_{1} + a_{22} F_{2} + \cdots + a_{2m} F_{m} + \varepsilon_{2} } \hfill \\ \quad \vdots \hfill \\ {y_{p}^{*} = a_{p1} F_{1} + a_{p2} F_{2} + \cdots + a_{pm} F_{m} + \varepsilon_{p} } \hfill \\ \end{array} } \right.$$
(13)

Expressing the above system of equations in matrix form:

$${\mathbf{y}}^{*} = {\mathbf{AF}} + {{\varvec{\upvarepsilon}}}$$
(14)

where aij in the matrix A = (aij)p×m is called the factor loading, and its absolute value indicates the degree of dependence between the quality component \(y_{i}^{*}\) and the common factor Fj. The matrix A formed by all the factor loadings is called the factor loading matrix.

From Eq. (14), calculate the covariance matrix of y*:

$${{\varvec{\Sigma}}}^{*} = D({\mathbf{y}}^{*} ) = D({\mathbf{AF}}) + D({{\varvec{\upvarepsilon}}}) = {\mathbf{A}}D({\mathbf{F}}){\mathbf{A}}^{T} + D({{\varvec{\upvarepsilon}}}) = {\mathbf{AA}}^{T} + D({{\varvec{\upvarepsilon}}})$$
(15)

On the other hand, by Theorem 2, Σ* is a real symmetric positive definite matrix, and its spectral (eigenvalue) decomposition yields the factorization:

$${{\varvec{\Sigma}}}^{*} = {\mathbf{GG}}^{T}$$
(16)

where \({\mathbf{G}} = (\sqrt {\lambda_{1} } {\mathbf{e}}_{1} ,\sqrt {\lambda_{2} } {\mathbf{e}}_{2} , \cdots ,\sqrt {\lambda_{p} } {\mathbf{e}}_{p} )\), λi(i = 1, 2, …, p) are the eigenvalues of the covariance matrix Σ* with λ1 > λ2 > … > λp, ei is the eigenvector corresponding to λi.

Comparing Eqs. (15) and (16), it can be seen that if A = G, the error vector ε = 0 in Eq. (14) and the factor analysis model is exact; however, this means that after standardization all the quality components in y* would be placed in p separate groups, i.e., the exact factor analysis model is obtained only when the correlations between the quality components in y* are completely ignored. Therefore, in the general case it is necessary to retain most of the correlations between the quality components, and an approximation of the factor loading matrix A is constructed from the first m (m < p) columns of G, i.e.:

$${\mathbf{A}} \approx (\sqrt {\lambda_{1} } {\mathbf{e}}_{1} ,\sqrt {\lambda_{2} } {\mathbf{e}}_{2} , \cdots ,\sqrt {\lambda_{m} } {\mathbf{e}}_{m} )$$
(17)
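A minimal numpy sketch of Eqs. (16) and (17): compute the eigenvalues and eigenvectors of a standardized covariance matrix and build the approximate factor loading matrix from the first m eigenpairs. The matrix Σ* and the value m = 2 used here are hypothetical.

```python
import numpy as np

# Hypothetical standardized covariance (correlation) matrix Sigma*
Sigma_star = np.array([[1.0, 0.9, 0.8, 0.1],
                       [0.9, 1.0, 0.7, 0.0],
                       [0.8, 0.7, 1.0, 0.2],
                       [0.1, 0.0, 0.2, 1.0]])

eigvals, eigvecs = np.linalg.eigh(Sigma_star)   # eigenpairs, ascending order
order = np.argsort(eigvals)[::-1]               # re-sort so lambda_1 > ... > lambda_p
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

G = eigvecs * np.sqrt(eigvals)                  # G = (sqrt(l1) e1, ..., sqrt(lp) ep), Eq. (16)
assert np.allclose(G @ G.T, Sigma_star)         # Sigma* = G G^T

m = 2                                           # assumed number of common factors
A = G[:, :m]                                    # approximate factor loading matrix, Eq. (17)
print(A)
```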

Error analysis

The error vector ε ≠ 0 when the factor analysis model is built from the factor loading matrix in Eq. (17); this implies that some information is necessarily lost when the quality components in y* are grouped according to Eq. (14).

In statistics, the total amount of information contained in a random variable is generally measured by its variance. In Eq. (14), let A = G, which gives the sum of the variances of the components in y* under the exact decomposition condition:

$$\sum\limits_{i = 1}^{p} {D(y_{i}^{*} } ) = \sum\limits_{i = 1}^{p} {\lambda_{i} }$$
(18)

Equation (18) shows that under the condition of exact decomposition, the sum of the information contained in all the quality components in y* is equal to the cumulative sum of all the eigenvalues of the covariance matrix Σ* of y*.

When the factor loading matrix A is instead constructed according to Eq. (17), Eq. (14) gives:

$$\sum\limits_{i = 1}^{p} {D(y_{i}^{*} )} = \sum\limits_{i = 1}^{m} {\lambda_{i} } + \sum\limits_{i = 1}^{p} {D(\varepsilon_{i} )}$$
(19)

Comparing Eq. (18) with Eq. (19) shows that when the quality components in y* are grouped with Eq. (14) and the correlations between components in different groups are ignored, the total information loss is \(\sum\nolimits_{i = m + 1}^{p} {\lambda_{i} }\). Therefore, for a specified error β, the number of groups m can be determined from the following inequality:

$$\eta = \frac{{\sum\nolimits_{i = 1}^{m} {\lambda_{i} } }}{{\sum\nolimits_{i = 1}^{p} {\lambda_{i} } }} \ge 1 - \beta$$
(20)

where η is the cumulative variance contribution rate of the first m eigenvalues. Empirically, when η exceeds roughly 80–85%, the number of groups m determined by inequality (20) is acceptable. The value of η can be adjusted for specific applications, but the basic principle is that the adjustment should favor a reasonable interpretation of the factor analysis model.
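A short sketch of the selection rule in inequality (20): given the eigenvalues of Σ* in descending order, choose the smallest m whose cumulative variance contribution rate η reaches 1 − β (the eigenvalues and β = 0.15 below are assumed values).

```python
import numpy as np

eigvals = np.array([2.9, 1.0, 0.07, 0.03])       # assumed eigenvalues of Sigma*, in descending order
beta = 0.15                                       # assumed specified error

eta = np.cumsum(eigvals) / eigvals.sum()          # cumulative variance contribution rates
m = int(np.argmax(eta >= 1 - beta) + 1)           # smallest m with eta >= 1 - beta

print(eta.round(4))                               # cumulative rates: 0.725, 0.975, 0.9925, 1.0
print(m)                                          # m = 2
```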

Correlation diagnostic algorithm based on grouping theory

The above analysis assumes that the mean vector μ and covariance matrix Σ of the manufacturing process are given. In many applications, however, these parameters are unknown. In this case, unbiased estimators of the process parameters can be calculated from a set of sample data yi = (yi1, yi2, … , yip)T (i = 1, 2,…, n) collected while the process is in a stable state.

$$\mu_{j} = \frac{1}{n}\sum\limits_{i = 1}^{n} {y_{ij} } \quad \left( {j = 1,2, \ldots ,p} \right)$$
(21)
$$\sigma_{j}^{2} = \frac{1}{n - 1}\sum\limits_{i = 1}^{n} {(y_{ij} - \mu_{j} )^{2} } \quad \left( {j = 1,2, \ldots ,p} \right)$$
(22)

Then the sample data can be standardized as \({\mathbf{y}}_{i}^{*} = (y_{i1}^{*} ,y_{i2}^{*} , \cdots ,y_{ip}^{*} )^{T}\), (i = 1, 2,…, n), where

$$y_{ij}^{*} = \frac{{y_{ij} - \mu_{j} }}{{\sigma_{j} }}\quad \left( {j = 1,2, \ldots ,p} \right)$$
(23)

The covariance matrix Σ* can then be calculated from the standardized sample data:

$${{\varvec{\Sigma}}}^{*} = \frac{1}{n - 1}\sum\limits_{i = 1}^{n} {{\mathbf{y}}_{i}^{*} {\mathbf{y}}_{i}^{*T} }$$
(24)
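A compact numpy sketch of Eqs. (21)–(24), estimating the process parameters from stable-state samples and computing the standardized covariance matrix; the data are randomly generated placeholders standing in for Table 1.

```python
import numpy as np

# Placeholder stable-state sample data: n observations of p quality components
rng = np.random.default_rng(1)
Y = rng.normal(size=(15, 4))

mu = Y.mean(axis=0)                              # Eq. (21): sample means
var = Y.var(axis=0, ddof=1)                      # Eq. (22): unbiased sample variances
Y_star = (Y - mu) / np.sqrt(var)                 # Eq. (23): standardized data

n = Y.shape[0]
Sigma_star = (Y_star.T @ Y_star) / (n - 1)       # Eq. (24): standardized covariance matrix
print(Sigma_star.round(4))
```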

Based on the above analysis, after the quality components of the standardized multivariate process quality y* are grouped by factor analysis, the correlations between components in different groups are ignored, the components within each group are combined into component pairs, and the corresponding binary T2 control charts are established; together these form the multivariate process quality correlation diagnostic model. The space complexity of this diagnostic model is approximately a linear function of the number of quality components p, which significantly improves diagnostic efficiency.

The multivariate process quality correlation diagnostic model based on grouping technique can be constructed as follows:

(1) Collect sufficient quality data yi (i = 1, 2,…, n) while the manufacturing process is in a stable state;

(2) Calculate the process parameters according to Eqs. (21)–(24);

(3) Calculate the eigenvalues of the covariance matrix Σ* and arrange them in descending order as \(\lambda_{1} > \lambda_{2} > \cdots > \lambda_{p}\);

(4) Calculate the eigenvector ei corresponding to each eigenvalue λi (i = 1, 2, …, p);

(5) For the given error β, determine the number m of eigenvectors used to construct the factor loading matrix according to inequality (20), and then construct the factor loading matrix A from the first m eigenvectors according to Eq. (17);

(6) Group all the quality components according to Eq. (13), recording the grouping results as G1, G2, …, Gm;

(7) For each pair of components \((y_{s}^{*} ,y_{t}^{*} )\) (s ≠ t) in group Gk (k = 1, 2, …, m), build the corresponding T2 control chart Kst;

(8) Establish the T2 control chart K to monitor the correlation shift of all the quality components.

In the manufacturing process, if the T2 statistic of a new sample exceeds the control limit of chart K, the correlation of all the quality components has shifted abnormally, and the cause(s) can be identified by examining the binary T2 control charts in the diagnostic model.
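The following Python sketch strings steps (1)–(8) together on placeholder data. The grouping rule in step (6), assigning each component to the factor with the largest absolute loading, follows the row-wise maximum-loading rule described later in the Discussion, and the false alarm probabilities are assumed values.

```python
import numpy as np
from itertools import combinations
from scipy.stats import chi2

def build_grouping_diagnostic_model(Y, beta=0.15, alpha=0.0027):
    """Build the grouping-based correlation diagnostic model from stable-state data Y (n x p)."""
    n, p = Y.shape

    # Steps (1)-(2): estimate process parameters and standardize, Eqs. (21)-(24)
    mu = Y.mean(axis=0)
    sd = Y.std(axis=0, ddof=1)
    Y_star = (Y - mu) / sd
    Sigma_star = (Y_star.T @ Y_star) / (n - 1)

    # Steps (3)-(5): eigen-decomposition, choose m via inequality (20), build A via Eq. (17)
    eigvals, eigvecs = np.linalg.eigh(Sigma_star)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    eta = np.cumsum(eigvals) / eigvals.sum()
    m = int(np.argmax(eta >= 1 - beta) + 1)
    A = eigvecs[:, :m] * np.sqrt(eigvals[:m])

    # Step (6): assign each component to the factor with the largest absolute loading
    labels = np.argmax(np.abs(A), axis=1)
    groups = [np.where(labels == k)[0] for k in range(m)]

    # Steps (7)-(8): pair charts within groups, plus the overall chart K
    pair_charts = [(int(s), int(t)) for g in groups for s, t in combinations(g, 2)]
    ucl_pair = chi2.ppf(1 - alpha, df=2)
    ucl_all = chi2.ppf(1 - alpha, df=p)
    return mu, sd, Sigma_star, pair_charts, ucl_pair, ucl_all

def t2(d, Sigma):
    """T2 statistic of a (standardized) deviation vector d."""
    return float(d @ np.linalg.solve(Sigma, d))

# Usage on placeholder data standing in for stable-state samples
rng = np.random.default_rng(2)
Y = rng.normal(size=(30, 4))
mu, sd, Sigma_star, pair_charts, ucl_pair, ucl_all = build_grouping_diagnostic_model(Y)

y_new = rng.normal(size=4)                        # a new observation (placeholder)
y_star = (y_new - mu) / sd
if t2(y_star, Sigma_star) > ucl_all:              # chart K signals an abnormal correlation shift
    for s, t in pair_charts:                      # examine the binary charts K_st
        sub = Sigma_star[np.ix_([s, t], [s, t])]
        if t2(y_star[[s, t]], sub) > ucl_pair:
            print(f"abnormal correlation between components {s + 1} and {t + 1}")
```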

Case study

Blades are important parts of steam turbines and aviation engines, and their machining quality directly affects the life and performance of the equipment. The contour method is a commonly used blade inspection technique. Its basic principle is to measure a number of cross-section contour lines of the blade along the height direction (Z-axis), as shown in Fig. 1, and then match the measured contour lines at different heights with their respective theoretical contour lines by translational and rotational transformations, as shown in Fig. 2, so as to decompose the blade profiling error into 4 quality components: blade contouring error before matching, blade contouring error after matching, blade positional error, and blade torsion error.

Figure 1 Contour method of blade inspection.

Figure 2 Theoretical and actual contour lines of the blade.

The machining process shows that the 4 quality components are correlated, so it is necessary to monitor the correlation shift during the manufacturing process and to diagnose the causes of the abnormal correlation. Here, a T2 control chart is used to monitor the correlation shift of the 4 quality components, and the method proposed in this paper is used to diagnose the causes of the abnormal correlation.

Parameter estimation

The above 4 quality components are expressed in vector form as y = (y1, y2, y3, y4)T. To estimate the mean vector and covariance matrix of the manufacturing process, 15 samples were collected at the cross-section height Z = 25 mm, as shown in Table 1.

Table 1 Sample data used for process parameter estimation.

Experience shows that the 4 quality components to be monitored generally follow a normal distribution. To check the normality of the sample data, the confidence level is set to αt = 0.95 (significance level 0.05), and the Shapiro–Wilk test is applied to the data of the 4 quality components in Table 1; the results are shown in Table 2. The W statistics of the four components are all greater than the critical value W(15, 0.05) = 0.881, indicating that the sample data in Table 1 follow a normal distribution.

Table 2 Results of normality test for sample data.
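For reference, the Shapiro–Wilk check described above can be reproduced with scipy; the data below are placeholders, since Table 1 itself is not reproduced here.

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(3)
samples = rng.normal(size=(15, 4))                # placeholder standing in for the Table 1 columns

for j in range(samples.shape[1]):
    w, p_value = shapiro(samples[:, j])
    # Compare W with the critical value W(15, 0.05) = 0.881 quoted above
    print(f"y{j + 1}: W = {w:.4f}, p = {p_value:.4f}")
```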

The sample data used to estimate the process parameters must be collected while the manufacturing process is in a stable state. Therefore, for the sample data in Table 1, the false alarm probability of each quality component is set to α = 0.0027, with reference to the 3σ principle of the Shewhart control chart. According to the Bonferroni inequality and the χ2 distribution, the false alarm probability of the correlation shift is taken as αy = 0.025, and Shewhart control charts for the 4 quality components and a T2 control chart monitoring the correlation shift are established, as shown in Figs. 3, 4, 5, 6 and 7.
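A small sketch of the control limits implied by these settings: simple 3σ Shewhart limits for each component and the χ2-based UCL of the T2 chart with αy = 0.025 and p = 4. The mean and standard deviation values are taken from the estimates reported later in this section.

```python
import numpy as np
from scipy.stats import chi2

mu = np.array([0.0817, 0.0413, 0.0961, 2.2039])   # estimated means (reported below)
sd = np.array([0.0029, 0.0054, 0.0054, 0.1029])   # estimated standard deviations (reported below)

ucl_shewhart = mu + 3 * sd                        # 3-sigma Shewhart limits per component
lcl_shewhart = mu - 3 * sd
ucl_t2 = chi2.ppf(1 - 0.025, df=4)                # UCL of the T2 chart K, alpha_y = 0.025, p = 4

print(ucl_shewhart.round(4))
print(lcl_shewhart.round(4))
print(round(ucl_t2, 3))                           # approximately 11.14
```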

Figure 3 Shewhart control chart for sample data component y1.

Figure 4 Shewhart control chart for sample data component y2.

Figure 5 Shewhart control chart for sample data component y3.

Figure 6 Shewhart control chart for sample data component y4.

Figure 7 T2 control chart for sample data.

Figures 3, 4, 5, 6 and 7 show that all 5 control charts are in control, indicating that the sample data in Table 1 were obtained while the blade manufacturing process was in a stable state and can be used for process parameter estimation. The calculated mean vector and standard deviation vector are:

$$\begin{aligned} & {{\varvec{\upmu}}} = (0.0817,0.0413,0.0961,2.2039)^{T} \\ & {{\varvec{\upsigma}}} = (0.0029,0.0054,0.0054,0.1029)^{T} \\ \end{aligned}$$

The sample data in Table 1 are standardized and the results are shown in Table 3.

Table 3 Sample data after standardization.

Calculating the covariance matrix from the data in Table 3, we get:

$${{\varvec{\Sigma}}}^{*} = \left( {\begin{array}{*{20}c} 1 & {0.9814} & {0.9747} & { - 0.1590} \\ {0.9814} & 1 & {0.9510} & { - 0.0490} \\ {0.9747} & {0.9510} & 1 & { - 0.2730} \\ { - 0.1590} & { - 0.0490} & { - 0.2730} & 1 \\ \end{array} } \right)$$

Establishment of diagnostic model

Calculate the eigenvalues and eigenvectors of the covariance matrix Σ* and sort them in descending order of the eigenvalues, as shown in the second and third columns of Table 4. On this basis, calculate the cumulative variance contribution rates of the first 1 to 4 eigenvalues, as shown in the fourth column of Table 4.

Table 4 Eigenvalues, eigenvectors of the covariance matrix and cumulative contribution of variance.

As can be seen from Table 4, the cumulative variance contribution rate of the first two eigenvalues is 99.11%, already much higher than the empirical threshold of 80–85%, so m = 2 is chosen and an approximation of the factor loading matrix A is constructed from the first two eigenvectors.

$$\begin{aligned} & {\mathbf{A}} = \left( {\begin{array}{*{20}c} { - 0.9923} & {0.0829} \\ { - 0.9754} & {0.1941} \\ { - 0.9916} & { - 0.0382} \\ {0.2409} & {0.9702} \\ \end{array} } \right) \\ & \quad \left\{ {\begin{array}{*{20}l} {y_{1}^{*} = - 0.9923F_{1} + 0.0829F_{2} } \hfill \\ {y_{2}^{*} = - 0.9754F_{1} + 0.1941F_{2} } \hfill \\ {y_{3}^{*} = - 0.9916F_{1} - 0.0382F_{2} } \hfill \\ {y_{4}^{*} = 0.2409F_{1} + 0.9702F_{2} } \hfill \\ \end{array} } \right. \\ \end{aligned}$$

It can be seen that the components \(y_{1}^{*}\), \(y_{2}^{*}\) and \(y_{3}^{*}\) depend strongly on factor F1 and only weakly on factor F2, so these 3 quality components are grouped together; \(y_{4}^{*}\) is strongly correlated only with factor F2 and is therefore placed in a group of its own. The final grouping result is G1 = {\(y_{1}^{*}\), \(y_{2}^{*}\), \(y_{3}^{*}\)}, G2 = {\(y_{4}^{*}\)}.

For group G1, T2 control charts K12, K13 and K23 are built to monitor the binary correlation shifts of the component pairs (\(y_{1}^{*}\), \(y_{2}^{*}\)), (\(y_{1}^{*}\), \(y_{3}^{*}\)) and (\(y_{2}^{*}\), \(y_{3}^{*}\)). Since G2 contains only one quality component, no T2 control chart is needed for it. Finally, the T2 control chart K that monitors the correlation shift of all the quality components is established, and these 4 control charts together form the diagnostic model for the correlation of the 4 quality components in blade machining.

Manufacturing process diagnosis

In subsequent manufacturing, 5 quality data points were collected at different moments, as shown in Table 5; the standardized results are shown in Table 6. The T2 statistics of the 5 data points were calculated and plotted in the T2 control chart K, as shown in Fig. 8. It can be seen that for the last three samples the correlation of all the quality components is abnormal.

Table 5 Test data collected in subsequent manufacturing.
Table 6 Test data after standardization.
Figure 8 T2 control chart K for test samples.

To diagnose the cause(s) of the abnormality in chart K, the 3 control charts monitoring the binary correlation shifts of the 3 component pairs, shown in Figs. 9, 10 and 11, were analyzed; the diagnostic results are shown in Table 7.

Figure 9 T2 control chart K12 monitoring the correlation between y1* and y2*.

Figure 10 T2 control chart K13 monitoring the correlation between y1* and y3*.

Figure 11 T2 control chart K23 monitoring the correlation between y2* and y3*.

Table 7 Diagnostic results.

Validity analysis of diagnostic conclusions

In order to judge the accuracy of the diagnostic results in Table 7, another diagnostic model using the CCBD method is built, which contains a total of \(C_{4}^{2} + C_{4}^{3} = 10\) T2 control charts, as shown in Figs. 12, 13, 14, 15, 16, 17, 18, 19, 20 and 21.

Figure 12 Diagnostic model using CCBD method: T2 control chart KC12 for (y*1, y*2).

Figure 13 Diagnostic model using CCBD method: T2 control chart KC13 for (y*1, y*3).

Figure 14 Diagnostic model using CCBD method: T2 control chart KC14 for (y*1, y*4).

Figure 15 Diagnostic model using CCBD method: T2 control chart KC23 for (y*2, y*3).

Figure 16 Diagnostic model using CCBD method: T2 control chart KC24 for (y*2, y*4).

Figure 17 Diagnostic model using CCBD method: T2 control chart KC34 for (y*3, y*4).

Figure 18 Diagnostic model using CCBD method: T2 control chart KC123 for (y*1, y*2, y*3).

Figure 19 Diagnostic model using CCBD method: T2 control chart KC124 for (y*1, y*2, y*4).

Figure 20 Diagnostic model using CCBD method: T2 control chart KC134 for (y*1, y*3, y*4).

Figure 21 Diagnostic model using CCBD method: T2 control chart KC234 for (y*2, y*3, y*4).

To compare the diagnostic conclusions derived from the two models, they are listed together in Table 8. There are differences in the conclusions for the last 3 points. Taking point 3 as an example, further analysis of the CCBD results shows that, because the correlations of the component pairs (y*1, y*3) and (y*2, y*3) are anomalous, the correlations of the other component combinations containing (y*1, y*3) or (y*2, y*3) are bound to be anomalous as well; the reported correlation abnormalities of the combinations (y*1, y*2, y*3), (y*1, y*3, y*4) and (y*2, y*3, y*4) are therefore redundant diagnostic information. After removing this redundant information, the two models give identical diagnostic results for the causes of the anomaly at point 3. A similar analysis of the diagnostic results for points 4 and 5 leads to the same conclusion, as shown in Table 9. The accuracy of the grouping-based multivariate process quality correlation diagnostic method is therefore assured.

Table 8 Comparison of the diagnostic results of the two diagnostic models.
Table 9 Comparison of the two diagnostic systems after redundant diagnostic results are removed.

Discussion and conclusion

For the problem of correlation diagnosis in multivariate process quality management, this paper has proposed a grouping-technique-based correlation diagnosis method. Compared with existing diagnostic methods, the proposed method has the following advantages:

1. The diagnosis is more efficient

The space complexity of the grouping-based multivariate process quality correlation diagnostic method is approximately O(p), whereas the space complexities of the diagnostic algorithms based on the CCBD method, principal component analysis, and orthogonal decomposition of the T2 statistic are O(2^p), O(p^2) and O(p!), respectively. The proposed method therefore has higher diagnostic efficiency.

2. The diagnostic results are more accurate

The grouping-based multivariate process quality correlation diagnosis method takes the correlation of component pairs as the diagnostic unit. Because component pairs are the smallest combinations of quality components, the redundant diagnostic information that arises in diagnostic algorithms based on the CCBD method, principal component analysis, and orthogonal decomposition of the T2 statistic is avoided, providing more accurate diagnostic results for manufacturing processes.

3. Better generality

Compared with diagnostic methods based on artificial intelligence technology, the method proposed in this paper rests on rigorous mathematical analysis and avoids the drawback of intelligent diagnostic methods, whose network structures and parameters are oriented to specific applications. The proposed method can therefore serve as a general theoretical model for multivariate process quality correlation diagnosis.

Two issues in the application of the grouping-based multivariate process quality diagnostic model deserve further discussion.

  • (1) Judgment of the degree of difference in correlations between quality components

The degree of difference in the correlations between quality components can be judged from the covariance matrix Σ* obtained after standardizing the quality data collected in the stable state. In general, if there is at least one row of Σ* in which the ratio of the maximum to the minimum off-diagonal element is not less than 2, it can be tentatively concluded that there are large differences in the correlations between the quality components.

  • (2) Basis for grouping quality components

The maximum element of each row of the factor loading matrix A can be used as the basis for grouping the quality components: quality component yi* is assigned to the group Gk represented by the common factor Fk if aik has the largest absolute value in the ith row of A. Experience shows that the grouping is most reliable when |aik| > 0.7. When the absolute values of the elements in a row of A differ little, the corresponding quality component depends approximately equally on all the common factors; in this case, its group can be determined reasonably in conjunction with the practical interpretation of the factor analysis model. If the absolute values of the elements in every row of A are approximately equal, all the quality components depend approximately equally on all the common factors; then all the quality components fall into a single group, and the diagnostic model degenerates to the diagnostic method based on correlation decomposition alone. We will study this issue in depth in future work.
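A minimal sketch of this grouping rule applied to the factor loading matrix A from the case study (values copied from the matrix above), with the |aik| > 0.7 check included as a simple reliability flag.

```python
import numpy as np

# Factor loading matrix A from the case study (4 components, 2 common factors)
A = np.array([[-0.9923,  0.0829],
              [-0.9754,  0.1941],
              [-0.9916, -0.0382],
              [ 0.2409,  0.9702]])

labels = np.argmax(np.abs(A), axis=1)            # factor with the largest absolute loading per row
groups = {k: [int(i) + 1 for i in np.where(labels == k)[0]] for k in range(A.shape[1])}
reliable = np.abs(A[np.arange(A.shape[0]), labels]) > 0.7   # |a_ik| > 0.7 rule of thumb

print(groups)      # {0: [1, 2, 3], 1: [4]}  ->  G1 = {y1*, y2*, y3*}, G2 = {y4*}
print(reliable)    # [ True  True  True  True]
```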

Ethics declarations

The authors declare that no human or animal subjects, samples or databases were used in this manuscript.