Introduction

Gears are key components of transmission systems, widely used in high-end manufacturing fields such as aerospace and precision instruments. Given the high geometric precision and surface quality requirements for gear transmission systems in these fields, gear manufacturing processes face increasingly stringent challenges. Gear grinding is a complex nonlinear system in which the relationship between process parameters and surface roughness lacks a precise mechanistic model and typically relies on empirical functions fitted to experimental data. However, identical surface roughness values can result from different combinations of process parameters, as demonstrated in Fig. 1(a). Moreover, the model parameters obtained by fitting different experimental datasets exhibit significant discrepancies. These model parameters are not mutually independent but correlated, which in turn affects the accuracy of inversely solving for process parameters, as illustrated in Fig. 1(b). Appropriately accounting for correlations among model parameters in uncertainty analysis, so that process parameters can be accurately inverted from surface roughness, is therefore a problem of significant theoretical value and great practical engineering importance.

Fig. 1

The inverse solution of process parameters using empirical functions, and empirical functions derived from different experimental datasets. (a) Function fitting using a single set of experimental data, showing that identical surface roughness values (Ra = 0.68 μm, 0.72 μm, 0.76 μm, and 0.80 μm) correspond to multiple different combinations of process parameters, where f is the feed amount, Vw is the workpiece feed speed, and Vs is the grinding wheel linear velocity. (b) Empirical functions obtained from fitting different experimental datasets. Fitted empirical function 1: \(Ra = 4.86 \times 10^{6} \, Vs^{-5.53} Vw^{5.37} f^{-5.50}\); fitted empirical function 2: \(Ra = 110 \, Vs^{-1.97} Vw^{1.82} f^{-2.05}\); fitted empirical function 3: \(Ra = 3.83 \times 10^{7} \, Vs^{-6.257} Vw^{5.97} f^{-6.14}\); fitted empirical function 4: \(Ra = 0.97 \, Vs^{-0.33} Vw^{0.21} f^{-0.42}\); fitted empirical function 5: \(Ra = 2.19 \times 10^{8} \, Vs^{-6.86} Vw^{6.57} f^{-6.74}\)

Process parameters are key factors affecting part manufacturing quality, and research on inverse solution methods for process parameters is fundamental to intelligent manufacturing. In mechanical manufacturing, process parameter research primarily draws on mathematical models, numerical simulations, expert systems, machine learning, and neural networks. Tan et al.1 developed a mathematical model for process parameter selection using multi-pass turning as an example. Ermer et al.2 created a multi-pass turning mathematical model combining geometric and linear programming. Juan et al.3 determined optimal cutting process parameters by combining numerical simulation with multi-objective particle swarm optimization algorithms. Dureja4 reviewed various empirical modeling techniques and optimization methods for process parameter optimization in challenging turning problems. Iqbal et al.5 investigated fuzzy modeling of the trade-offs between energy consumption, tool life, and productivity in metal cutting processes. Cao et al.6 studied multi-objective decision methods for high-speed dry gear hobbing process parameters. K. Shelesh7 developed an AI system for obtaining process parameters in plastic injection molding. Deng et al.8 proposed a data-driven process parameter design method for extrusion casting. Chen et al.9 combined dung beetle optimization algorithms and long short-term memory neural networks to predict the surface roughness of bearing outer rings under various grinding conditions. However, mathematical models, numerical simulations, and similar methods often require simplifying the real system and cannot accurately capture the relationship between process parameters and surface roughness, while machine learning and neural network methods, though more accurate, typically require large amounts of experimental data.

Researchers have recently combined physical knowledge with neural networks to propose physics-informed neural networks (PINN)10,11,12,13,14. The purpose is to incorporate physical laws into the loss function, guiding the learning process toward solutions that better conform to those laws15,16,17,18. Building on this foundation, researchers have conducted in-depth studies of network architecture, loss function design, optimization strategies, and other aspects19,20,21,22,23,24,25,26,27,28,29. PINN has been applied with good results in practical engineering fields such as mechanical manufacturing, construction, and power systems. A physics-guided neural network model was proposed for tool wear prediction in mechanical manufacturing30. An AI-driven partial differential equation solver combining finite element methods with physics-informed neural networks was developed for additive manufacturing process parameter selection31. In the construction field, a model based on physics-informed neural networks (PhysCon) was proposed that combines the interpretability of physical laws with the expressive power of neural networks for control-oriented demand response in grid-integrated buildings32. In the power systems domain, physics-informed neural networks were used to build high-precision, strongly generalizable power output evaluation models, calculating the peak-shaving tolerance range of target thermal power plants under different heating demands33. However, PINN cannot quantify uncertainty and provides no confidence interval assessment for the model, which limits the model's accuracy and reliability.

In practical engineering problems, prediction results are easily degraded by multi-source uncertainties in complex systems. Therefore, the Bayesian physics-informed neural network (BPINN)34 was introduced, which combines PINN with a Bayesian framework to provide robust uncertainty quantification35,36,37. Longze L.38 applied physics-based Bayesian neural networks to material property prediction, incorporating physical knowledge to guide the BPINN solution. S. Stock39 introduced the application of Bayesian physics-informed neural networks in power systems. Aditya40 proposed a variant of BPINN, developing a reduced-order machine learning model that accurately and efficiently predicted charge density in corrosive environments. These studies demonstrate that BPINN has advantages in predictive performance and uncertainty quantification. However, research on uncertainty analysis that accounts for correlations among model parameters in process parameter inversion problems within complex manufacturing systems remains insufficient.

Due to the complex working conditions in gear manufacturing systems and the correlation and uncertainty of model parameters, new challenges have emerged for the inverse solution of process parameters. The current challenges are summarized as follows:

(1)

    Uncertainty analysis of correlations between model parameters. Model parameters during process parameter inversion are not mutually independent; consideration of correlations among model parameters is necessary when conducting uncertainty analysis to improve model accuracy. How to analyze multi-level model parameter uncertainty is one of the bottleneck problems constraining the inverse solution of process parameters.

(2)

    Model interpretability and consistency of model parameters. Neural networks are black-box models, making it particularly important to incorporate physical formulas to enhance model interpretability. Additionally, ensuring consistency between the posterior and prior distributions of model parameters is essential when quantifying model parameter uncertainty.

To address the aforementioned challenges, a hierarchical Bayesian physics-informed neural network (HBPINN) that accounts for correlations among model parameters during uncertainty analysis is proposed. The overall architecture of the model is illustrated in Fig. 2. The key contributions of this work are summarized as follows:

(1)

    A multi-level hierarchical structure of model parameters consisting of global-level, group-level, and individual-level components has been constructed. The complex correlations among model parameters are investigated through inter-group effects, while parameter uncertainties are analyzed within a hierarchical Bayesian framework. This approach enhances the model’s uncertainty quantification capability and accuracy.

(2)

A physical loss function relating process parameters to surface roughness has been established using a multivariate regression function, and regularization is implemented through the Kullback-Leibler (KL) divergence of the multi-level model parameters, improving model interpretability and model parameter consistency.

(3)

Additionally, experimental data augmentation is achieved through Gaussian process regression (GPR). HBPINN is trained using this augmented dataset and validated with a subset of experimental data, simultaneously reducing experimental costs and enhancing model accuracy.

Fig. 2

Overall architecture of the model.

The organization of this paper is as follows: In section “HBPINN”, HBPINN for the inverse solution of process parameters is proposed, covering data augmentation methods, hierarchical Bayesian framework, and loss function formulation. In section “Experiment”, the gear grinding process and surface roughness detection experiments are described. In section “Results and discussion”, the performance of different models in the inverse identification of gear grinding process parameters is evaluated. In section “Conclusions”, the main conclusions are presented.

HBPINN

This section proposes HBPINN, which accounts for correlations among model parameters during uncertainty analysis for the inverse solution of gear grinding process parameters. First, GPR is employed to augment the experimental data and form the dataset. Second, a hierarchical Bayesian framework establishes global-level, group-level, and individual-level model parameter structures and constructs correlation matrices to analyze the interaction patterns between model parameters and, subsequently, their uncertainties. Third, a multivariate regression function establishes the relationship between process parameters and roughness as a physics-based loss function, which is combined with a data-driven loss function and regularized through the KL divergence of the multi-level model parameters to construct the composite loss function. An adaptive mechanism automatically optimizes the hyperparameters in the loss function, and the model is solved by minimizing this loss function.

Data augmentation based on GPR

Obtaining large amounts of experimental data for gear grinding is expensive and time-consuming. Data augmentation41,42 can expand the dataset size without incurring additional costs. Enhancing experimental data through GPR43,44 can improve the model’s generalization ability, making it a key technology for enhancing the predictive performance of neural networks.

For data augmentation in this paper, the original data is first standardized (see Eqs. 1 and 2). The input data x is the surface roughness Ra, while the output data y consists of the workpiece feed speed Vw, feed amount f, and grinding wheel linear velocity Vs.

$$x_{norm} = \left( x - \mu_x \right) / \sigma_x$$
(1)
$$y_{norm} = \left( y - \mu_y \right) / \sigma_y$$
(2)

Where \(\mu_x\), \(\sigma_x\), \(\mu_y\), and \(\sigma_y\) are the means and standard deviations of the input data x and output data y, which yield the standardized input \(x_{norm}\) and standardized output \(y_{norm}\), respectively.

Second, the GPR training can be represented by Eq. (3):

$$f\left( x \right) \sim GP\left( m\left( x \right), K\left( x, x_* \right) \right)$$
(3)
$$K\left( x, x_* \right) = \sigma^2 \cdot \exp\left( - \left\| x - x_* \right\|^2 / \left( 2 l^2 \right) \right)$$
(4)

Where \(x_*\) is another original data point, \(m\left( x \right)\) is the mean function, and \(K\left( x, x_* \right)\) is the kernel function, as shown in Eq. (4). \(\sigma^2\) is the signal variance and \(l\) is the length-scale parameter. For a predicted input \(x^{\prime}\), the corresponding predicted output \(f^{\prime}\) follows a multivariate Gaussian distribution, as represented by Eq. (5):

$$p\left( f^{\prime} \mid x^{\prime}, x, y \right) = N\left( f^{\prime} \mid \mu^{\prime}, \Sigma^{\prime} \right)$$
(5)
$$\mu^{\prime} = m(x^{\prime}) + K\left( x^{\prime}, x \right)^{T} K^{-1} \left( y - m(x) \right)$$
(6)
$$\Sigma^{\prime} = K(x^{\prime}, x^{\prime}) - K\left( x^{\prime}, x \right)^{T} K^{-1} K\left( x^{\prime}, x \right)$$
(7)

Where \(\mu^{\prime}\) and \(\Sigma^{\prime}\) are the mean and covariance matrix of the predicted data, as shown in Eqs. (6) and (7). K represents the covariance matrix between original data points, \(K\left( x^{\prime}, x \right)\) represents the covariance matrix between the predicted data and the original data, and \(K(x^{\prime}, x^{\prime})\) represents the covariance matrix among the predicted data points.

Finally, Latin Hypercube Sampling ensures uniform distribution of the augmented data. The augmented data is obtained through the trained Gaussian Process Regression model with adaptive Gaussian noise introduction, as shown in Eqs. (8) and (9).

$$X = x_{new} + \varepsilon_{x_{new}}, \quad \varepsilon_{x_{new}} \sim N\left( 0, \sigma_{x_{new}}^{2} \right)$$
(8)
$$Y = y_{new} + \varepsilon_{y_{new}}, \quad \varepsilon_{y_{new}} \sim N\left( 0, \sigma_{y_{new}}^{2} \right)$$
(9)

Where \(x_{new}\) and \(\sigma_{x_{new}}^{2}\) are the input of the augmented data and its corresponding noise variance, and \(y_{new}\) and \(\sigma_{y_{new}}^{2}\) are the output of the augmented data and its corresponding noise variance. X and Y represent the final augmented data.
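For illustration, the full augmentation pipeline of Eqs. (1)–(9) can be sketched in a few lines of Python. The sketch below assumes scikit-learn's GaussianProcessRegressor with an RBF kernel and SciPy's Latin Hypercube sampler; the array names, placeholder values, and noise scales are illustrative rather than the values used in this study.

```python
# Minimal sketch of the GPR-based augmentation of Eqs. (1)-(9).
# ra_exp (inputs, Ra) and y_exp (outputs: Vw, f, Vs) are placeholders.
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

ra_exp = np.array([[0.62], [0.68], [0.72], [0.76], [0.80], [0.85]])  # Ra, illustrative
y_exp = np.random.default_rng(0).random((6, 3))                      # (Vw, f, Vs) placeholder

# Standardization, Eqs. (1)-(2)
mu_x, sigma_x = ra_exp.mean(0), ra_exp.std(0)
mu_y, sigma_y = y_exp.mean(0), y_exp.std(0)
x_norm, y_norm = (ra_exp - mu_x) / sigma_x, (y_exp - mu_y) / sigma_y

# GP with squared-exponential kernel, Eqs. (3)-(4)
gpr = GaussianProcessRegressor(ConstantKernel(1.0) * RBF(1.0), alpha=1e-6)
gpr.fit(x_norm, y_norm)

# Latin Hypercube Sampling of new standardized Ra inputs over the observed range
x_new = qmc.scale(qmc.LatinHypercube(d=1, seed=0).random(n=200),
                  x_norm.min(), x_norm.max())

# Posterior predictive mean/std, Eqs. (5)-(7), then adaptive noise, Eqs. (8)-(9)
y_new, y_std = gpr.predict(x_new, return_std=True)
rng = np.random.default_rng(0)
X = x_new + rng.normal(0.0, 0.05, size=x_new.shape)                  # Eq. (8), illustrative scale
Y = y_new + rng.normal(size=y_new.shape) * y_std.reshape(len(y_new), -1)  # Eq. (9)

# De-standardize back to physical units
X_phys, Y_phys = X * sigma_x + mu_x, Y * sigma_y + mu_y
```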

Hierarchical Bayesian framework

The surface roughness of gears is influenced by numerous uncertainty factors, including thermal errors, machine tool errors, and installation errors. Moreover, when performing parametric uncertainty analyses, model parameters are not mutually independent but exhibit complex correlations. To address these problems, a hierarchical Bayesian framework is employed to account for correlations among model parameters during uncertainty analysis, thereby improving model accuracy while enhancing uncertainty assessment capability.

This section establishes a global-level, group-level, and individual-level hierarchical structure for model parameters. The correlations between model parameters are investigated through inter-group effects, and correlation matrices are constructed to analyze the interactions between model parameters. The final posterior distribution is obtained by combining prior distributions with observed data, and uncertainty quantification of multi-level model parameters is achieved through a hierarchical Bayesian framework45,46,47,48. The specific process is shown in Fig. 3.

Fig. 3

Hierarchical Bayesian framework.

Assume that the prior distributions of the global parameter \(\theta_{gl}\), group parameter \(\theta_{gr}\), and individual parameter \(\theta_{in}\) are Gaussian. \(\theta_{gl}\), \(\theta_{gr}\), and \(\theta_{in}\) then follow:

$$\theta_{gl} \sim \mathcal{N}(\mu_{gl}, \sigma_{gl}^{2})$$
(10)
$$\theta_{gr} \sim \mathcal{N}(\mu_{gr}, \sigma_{gr}^{2})$$
(11)
$$\theta_{in} \sim \mathcal{N}(\mu_{in}, \sigma_{in}^{2})$$
(12)
$$\gamma_{gl} = \mu_{gl} + \sigma_{gl} \cdot \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, 1)$$
(13)
$$\gamma_{gr} = \mu_{gr} + \sigma_{gr} \cdot R$$
(14)
$$\theta_{in} = \mu_{in} + \gamma_{gl} + \gamma_{gr} + \sigma_{in} \cdot \eta_{\theta_{in}}$$
(15)

Where \(\mu_{gl}\) and \(\sigma_{gl}^{2}\) are the mean and variance of \(\theta_{gl}\), \(\mu_{gr}\) and \(\sigma_{gr}^{2}\) are the mean and variance of \(\theta_{gr}\), and \(\mu_{in}\) and \(\sigma_{in}^{2}\) are the mean and variance of \(\theta_{in}\). R is the correlation matrix between individual parameters, and \(\eta_{\theta_{in}}\) is the correlation term in R corresponding to \(\theta_{in}\). The global effect \(\gamma_{gl}\) provides a common baseline for all \(\theta_{in}\) to improve generalization ability, as shown in Eq. (13). The differences between individual parameters are further refined through the group effect \(\gamma_{gr}\), which accounts for the complex correlations among model parameters, as shown in Eq. (14). The model parameters \(\theta_{in}\) are then obtained under the joint constraints of \(\gamma_{gl}\) and \(\gamma_{gr}\), as shown in Eq. (15).
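A minimal PyTorch sketch of the reparameterized sampling in Eqs. (13)–(15) is given below, under one common reading in which the correlated noise \(\eta_{\theta_{in}}\) enters through a Cholesky factor of R (constructed in Eqs. (16)–(20) below). The variational means and log-variances are illustrative trainable tensors, not the fitted values.

```python
# Sketch of the global/group/individual reparameterization, Eqs. (13)-(15).
# The four individual parameters correspond to A, B, C, D.
import torch

n_params = 4  # A, B, C, D
mu_gl = torch.zeros(1, requires_grad=True)
log_sigma_gl = torch.zeros(1, requires_grad=True)
mu_gr = torch.zeros(n_params, requires_grad=True)
log_sigma_gr = torch.zeros(n_params, requires_grad=True)
mu_in = torch.zeros(n_params, requires_grad=True)
log_sigma_in = torch.zeros(n_params, requires_grad=True)

def sample_theta_in(R):
    """One reparameterized draw of theta_in = (A, B, C, D)."""
    eps = torch.randn(1)
    gamma_gl = mu_gl + log_sigma_gl.exp() * eps              # Eq. (13): global baseline
    eta = torch.linalg.cholesky(R) @ torch.randn(n_params)   # correlated noise from R
    gamma_gr = mu_gr + log_sigma_gr.exp() * eta              # Eq. (14): group effect
    theta_in = mu_in + gamma_gl + gamma_gr \
               + log_sigma_in.exp() * torch.randn(n_params)  # Eq. (15)
    return theta_in
```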

The correlation matrix R is established through the following steps to further account for correlations between model parameters and improve model accuracy. First, g, which represents the correlations among different model parameters, is initialized to zero, and the matrix G is constructed according to Eq. (16). Second, the lower triangular matrix L is extracted, as shown in Eq. (17). Third, the initial correlation matrix \({\mathbf{R^{\prime}}}\) is computed, with the identity matrix I added to ensure positive definiteness and improve numerical stability, see Eq. (18). Fourth, the square root of the i-th diagonal element is computed, as shown in Eq. (19). Finally, normalization yields the final correlation matrix \({\mathbf{R}}\), as shown in Eq. (20).

$$\mathbf{G} = \begin{bmatrix} g_{AA} & g_{AB} & g_{AC} & g_{AD} \\ g_{BA} & g_{BB} & g_{BC} & g_{BD} \\ g_{CA} & g_{CB} & g_{CC} & g_{CD} \\ g_{DA} & g_{DB} & g_{DC} & g_{DD} \end{bmatrix}$$
(16)
$$\mathbf{L} = \begin{bmatrix} g_{AA} & 0 & 0 & 0 \\ g_{BA} & g_{BB} & 0 & 0 \\ g_{CA} & g_{CB} & g_{CC} & 0 \\ g_{DA} & g_{DB} & g_{DC} & g_{DD} \end{bmatrix}$$
(17)
$$\mathbf{R}^{\prime} = \mathbf{L}\mathbf{L}^{T} + \mathbf{I}$$
(18)
$$d_i = \sqrt{\mathbf{R}^{\prime}_{ii}}$$
(19)
$$\mathbf{R}_{ij} = \frac{\mathbf{R}^{\prime}_{ij}}{d_i \cdot d_j}$$
(20)
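Eqs. (16)–(20) translate directly into code. The following PyTorch sketch constructs R from the trainable coupling matrix g; with g initialized to zero, as in the text, R starts as the identity matrix, i.e., uncorrelated parameters.

```python
# Sketch of the correlation-matrix construction, Eqs. (16)-(20).
import torch

g = torch.zeros(4, 4, requires_grad=True)    # raw couplings among A, B, C, D, Eq. (16)

def correlation_matrix(g):
    L = torch.tril(g)                        # Eq. (17): keep the lower triangle
    R_prime = L @ L.T + torch.eye(4)         # Eq. (18): symmetric positive definite
    d = torch.sqrt(torch.diagonal(R_prime))  # Eq. (19): scaling factors d_i
    return R_prime / torch.outer(d, d)       # Eq. (20): unit-diagonal correlation matrix

R = correlation_matrix(g)                    # with g = 0 this yields the identity
```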

Given the observed data \(y=\{ {y_1},{y_2}, \ldots ,{y_N}\}\), the posterior distribution \(p\left( {\theta |y} \right)\) in the hierarchical Bayesian framework can be expressed as:

$$p\left( {\theta |y} \right)=p({\theta _{gl}},{\theta _{gr}},{\theta _{in}}|y)=\frac{{p(y|{\theta _{in}})p({\theta _{in}}|{\theta _{gr}})p({\theta _{gr}}|{\theta _{gl}})p({\theta _{gl}})}}{{p(y)}}$$
(21)

Where \(\theta\) represents the parameters at all levels; \(p(y|{\theta _{in}})\) is the likelihood function of the individual parameters; \(p({\theta _{in}}|{\theta _{gr}})\) is the conditional prior of the individual parameters; \(p({\theta _{gr}}|{\theta _{gl}})\) is the conditional prior of the group parameters; \(p({\theta _{gl}})\) is the prior distribution of the global parameters; and \(p(y)\) is the marginal likelihood. In Bayesian inference, the posterior distribution is typically complex and difficult to solve directly. A parameterized approximate distribution \(q_\phi(\theta)\) is therefore introduced through variational inference to approximate the true posterior, where \(\phi\) denotes the trainable variational parameters. The core objective is to minimize the KL divergence between \(q_\phi(\theta)\) and \(p\left( {\theta |y} \right)\):

$$KL\left( q_\phi(\theta) \parallel p(\theta \mid y) \right) = \int q_\phi(\theta) \log\left( \frac{q_\phi(\theta)}{p(\theta \mid y)} \right) d\theta$$
(22)

Since the actual posterior distribution is typically challenging to compute directly, we instead optimize the Evidence Lower Bound (ELBO):

$$ELBO = E_{q_\phi(\theta)}\left[ \log p(y \mid \theta) \right] - KL\left( q_\phi(\theta) \parallel p(\theta) \right)$$
(23)

Where \(E_{q_\phi(\theta)}[\log p(y \mid \theta)]\) is the expected log-likelihood of \(p(y \mid \theta)\), and \(KL(q_\phi(\theta) \parallel p(\theta))\) is the KL divergence between the approximate distribution \(q_\phi(\theta)\) and the prior distribution \(p(\theta)\).
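As a sketch, the negative ELBO of Eq. (23) can be estimated by Monte Carlo sampling from \(q_\phi\); the Gaussian likelihood, sample count, and noise variance below are assumptions made for illustration, not values specified in this work.

```python
# Monte Carlo estimate of -ELBO, Eq. (23), assuming a Gaussian likelihood.
import torch

def neg_elbo(model, x, y, w_kl, n_samples=8, noise_var=1e-2):
    """model(x) must resample theta ~ q_phi internally on every call;
    w_kl is the KL term, e.g. W_KL of Eq. (29)."""
    log_lik = 0.0
    for _ in range(n_samples):               # E_{q_phi}[log p(y | theta)]
        y_hat = model(x)
        log_lik = log_lik - 0.5 * ((y_hat - y) ** 2).sum() / noise_var
    return -(log_lik / n_samples - w_kl)     # minimize -ELBO by gradient descent
```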

Total loss function

The total loss function integrates the empirical loss with the physical relationship between process parameters and roughness in the actual manufacturing process, while regularizing the difference between the posterior and prior distributions of the model parameters. This approach aims to enhance model interpretability and accuracy while improving generalization capability and preventing overfitting.

To minimize the empirical loss of model predictions on the training set while maintaining low model complexity, an empirical loss function49 is established:

$$\arg\min \; Loss_{MSE} + \lambda T$$
(24)

The mean squared error (MSE) serves as the empirical loss, with L1 and L2 norms of W functioning as regularization terms in Eq. 24, as shown below:

$$Loss_{MSE} = \frac{1}{N}\sum\limits_{i=1}^{N} \left( \hat{y}_i - y_i \right)^{2}$$
(25)
$$\lambda T = \lambda_1 \left\| \mathbf{W} \right\|_1 + \lambda_2 \left\| \mathbf{W} \right\|_2$$
(26)

where \({\hat {y}_i}\) represents the model’s predicted values, \({y_i}\) denotes the true values, T measures the complexity of the model, and λ, λ1, λ2 are trade-off hyperparameters.

According to previous research50,51, the primary technological parameters affecting tooth surface roughness during gear grinding are Vw, f, and Vs. A multivariate regression function was established to model the relationship between these process parameters and surface roughness (see Eq. 27). To simplify the formula structure and improve model convergence, a logarithmic transformation was applied to both sides of Eq. (27), constructing the physics loss function (see Eq. 28).

$$Ra = g(Vs,Vw,f) = k_{5} Vs^{m} Vw^{n} f^{o}$$
(27)
$$Loss_{\text{physical}} = \frac{1}{M}\sum\limits_{j=1}^{M} \left| \ln(k_5 Vs^{m} Vw^{n} f^{o}) - \ln(Ra) \right| = \frac{1}{M}\sum\limits_{j=1}^{M} \left| A + B\ln Vs + C\ln Vw + D\ln f - \ln(Ra) \right|$$
(28)

Where M represents the sample size, A corresponds to the logarithm of the model parameter k5, and B, C, and D correspond to the model parameters m, n, and o from Eq. (27), respectively. These model parameters were derived through experimental data fitting, as shown in Table 1. The notations \(Vs^*\), \(Vw^*\), and \(f^*\) denote \(\ln Vs\), \(\ln Vw\), and \(\ln f\), respectively.

Table 1 Model parameters.
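Under these definitions, the physics loss of Eq. (28) reduces to a few lines of PyTorch. In the sketch below, A, B, C, and D are the sampled model parameters (e.g., from the hierarchical sampler above) and all inputs are assumed positive; the function name is illustrative.

```python
# Physics loss of Eq. (28), averaged over a batch of M samples.
import torch

def physics_loss(A, B, C, D, Vs, Vw, f, Ra):
    residual = (A + B * torch.log(Vs) + C * torch.log(Vw)
                + D * torch.log(f) - torch.log(Ra))
    return residual.abs().mean()
```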

Regularization was implemented through the KL divergence of hierarchical model parameters to ensure that the posterior distribution of model parameters does not excessively deviate from the prior distribution.

$$W_{KL} = KL_{gl} + KL_{gr} + KL_{in} + KL_{co}$$
(29)
$$KL_{gl} = \frac{1}{2}\left( -\log(\sigma_{gl}^{2}) + \mu_{gl}^{2} + \sigma_{gl}^{2} - 1 \right)$$
(30)
$$KL_{gr} = \frac{1}{2}\sum\limits_{u \in \{A,B,C,D\}} \left( -\log(\sigma_{gr,u}^{2}) + \mu_{gr,u}^{2} + \sigma_{gr,u}^{2} - 1 \right)$$
(31)
$$KL_{in} = \frac{1}{2}\sum\limits_{v \in \{A,B,C,D\}} \left( -\log(\sigma_{in,v}^{2}) + \mu_{in,v}^{2} + \sigma_{in,v}^{2} - 1 \right)$$
(32)
$$KL_{co} = \frac{1}{2}\left( \mathrm{tr}(\mathbf{R}) - d - \log\det(\mathbf{R}) \right)$$
(33)

where \(K{L_{gl}}\), \(K{L_{gr}}\), \(K{L_{in}}\), and \(K{L_{co}}\) represent the KL divergence corresponding to \({\theta _{gl}}\), \({\theta _{gr}}\), \({\theta _{in}}\), and R, respectively. d denotes the dimension of the correlation matrix R.
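The KL terms of Eqs. (29)–(33) can be sketched as follows, assuming standard-normal priors at each level; the function names and log-variance parameterization are illustrative.

```python
# KL regularization terms of Eqs. (29)-(33).
import torch

def kl_gaussian(mu, log_var):
    """Eqs. (30)-(32) pattern: KL of N(mu, sigma^2) from N(0, 1), summed."""
    return 0.5 * (-log_var + mu ** 2 + log_var.exp() - 1.0).sum()

def kl_correlation(R):
    """Eq. (33): penalty on the correlation matrix; zero when R = I."""
    d = R.shape[0]
    return 0.5 * (torch.trace(R) - d - torch.logdet(R))

def w_kl(mu_gl, lv_gl, mu_gr, lv_gr, mu_in, lv_in, R):
    """Eq. (29): total hierarchical KL regularizer W_KL."""
    return (kl_gaussian(mu_gl, lv_gl) + kl_gaussian(mu_gr, lv_gr)
            + kl_gaussian(mu_in, lv_in) + kl_correlation(R))
```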

The complete loss function is:

$$Loss = Loss_{MSE} + \lambda_1 \left\| \mathbf{W} \right\|_1 + \lambda_2 \left\| \mathbf{W} \right\|_2 + \lambda_{\text{Phy}} Loss_{\text{Phy}} + \lambda_3 W_{KL}$$
(34)

Where \(\lambda_{\text{Phy}}\) is the hyperparameter corresponding to the physical loss function and \(\lambda_3\) is the hyperparameter associated with the KL divergence regularization term. An adaptive mechanism52,53,54 was employed to optimize these weights automatically, and the composite loss function was minimized via gradient descent.

Finally, HBPINN was solved on a hardware configuration consisting of an Intel(R) Core(TM) i7-10700 CPU @ 2.90 GHz and an AMD Radeon RX 6500 XT GPU. The model was implemented in PyTorch, comprising three hidden layers with 256, 512, and 256 neurons, respectively. A multi-head attention mechanism was introduced for the multiple outputs, and the AdamW optimizer was used with a learning rate of 6.3e-3.
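The stated configuration pins down the layer widths, optimizer, and learning rate; a minimal sketch of such a backbone is shown below. The attention placement, head count, and activation are assumptions, since they are not specified above.

```python
# Backbone consistent with the stated architecture: hidden layers of
# 256/512/256 neurons, multi-head attention over the three outputs
# (Vw*, f*, Vs*), and AdamW at lr = 6.3e-3. Head count is an assumption.
import torch
import torch.nn as nn

class HBPINNBackbone(nn.Module):
    def __init__(self, n_out=3, n_heads=4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, 256), nn.Tanh(),
            nn.Linear(256, 512), nn.Tanh(),
            nn.Linear(512, 256), nn.Tanh(),
        )
        self.attn = nn.MultiheadAttention(embed_dim=256, num_heads=n_heads,
                                          batch_first=True)
        self.head = nn.Linear(256, n_out)   # predicts (Vw*, f*, Vs*)

    def forward(self, ra):                  # ra: (batch, 1) surface roughness
        h = self.mlp(ra).unsqueeze(1)       # (batch, 1, 256), a length-1 sequence
        h, _ = self.attn(h, h, h)
        return self.head(h.squeeze(1))

model = HBPINNBackbone()
optimizer = torch.optim.AdamW(model.parameters(), lr=6.3e-3)
```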

Experiment

In this section, gear grinding processes and surface roughness detection experiments were conducted with various technological parameter combinations to enhance the dataset and validate model results. Gear grinding was performed on a PG50150 gear grinding machine, as shown in Fig. 4. Gear surface roughness detection was conducted using a Taylor Hobson Talysurf CLI2000 profilometer, as illustrated in Fig. 5. Nine experiments were carried out using the orthogonal experimental design method. The processing experimental parameters are presented in Table 2. The grinding wheel diameter was 350 mm, with a grinding depth of 0.05 mm and a unit grinding width of 1 mm. The results of gear surface roughness detection are shown in Fig. 6. Two-thirds of the experimental data (No. 1–6) were augmented to obtain training sets of different sizes (100/200/500/1000/2000), while the remaining experimental data (No. 7–9) served as the validation set.

Fig. 4

Gears ground with different technological parameters on the PG50150 gear grinding machine.

Fig. 5

Gear surface roughness detection.

Table 2 Experimental parameters for gear processing.
Fig. 6

Gear roughness detection (showing surface roughness Ra detection results for experiments 1–9).

Results and discussion

In this section, we compare and evaluate HBPINN against BPINN, VI-BPINN, and PINN across various aspects, demonstrating the superior performance of HBPINN. Additionally, we assess the robustness of the models under data noise levels of 1%, 3%, 5%, 10%, 15%, and 20%.

Impact of different training set sizes

The data augmentation method described in section “Data augmentation based on GPR” generated five training sets with different sample sizes, as illustrated in Fig. 7. The non-uniform color transitions in the parameter space and the absence of distinct patterns among the differently colored surface roughness data points indicate strongly non-linear relationships in the manufacturing process, underscoring the importance of an accurate inverse solution of process parameters.

To validate the rationality of the augmented dataset generated by the GPR algorithm, we analyzed the distributions of experimental data and augmented data. We conducted t-tests and Kolmogorov-Smirnov (K-S) tests. As shown in Fig. 8, the augmented data exhibit consistency with the experimental data regarding distribution shape and peak values. The p-values of t-tests and K-S tests for Vw, f, Vs, and Ra are significantly greater than 0.05, indicating no significant differences between the augmented data and experimental data regarding mean values and distributions.

The impact of different training set sizes on the accuracy of HBPINN, BPINN, VI-BPINN, and PINN is shown in Fig. 9. Across the various training sets, HBPINN consistently achieves higher R² values for Vw*, f*, Vs*, and their means compared to BPINN, VI-BPINN, and PINN. Notably, on smaller training sets (size 100), BPINN, VI-BPINN, and PINN exhibit higher R² values for f* but lower R² values for Vs* and Vw*, indicating poor predictive performance. In contrast, HBPINN maintains high R² values across all process parameters, demonstrating superior accuracy even under limited training data.

Fig. 7

Data augmentation results. (a) Training set size of 100. (b) Training set size of 200. (c) Training set size of 500. (d) Training set size of 1000. (e) Training set size of 2000.

Fig. 8

Statistical characteristics analysis of experimental data and enhanced data (size 200). (a) Analysis of enhanced data and experimental data of Vw. (b) Analysis of enhanced data and experimental data of f. (c) Analysis of enhanced data and experimental data of Vs. (d) Analysis of enhanced data and experimental data of Ra.

Fig. 9

R² of prediction results of different models with different training sets.

Analysis of accuracy and computational cost for different models

As shown in Fig. 10, taking a training dataset of 200 samples as an example, all four models began to converge after 10–20 iterations and eventually reached stability. Figures 11, 12 and 13 demonstrate that the 90% confidence intervals were narrowest for Vs* in HBPINN, Vw* in BPINN, and f* in VI-BPINN, indicating the highest prediction reliability for these respective parameters in their corresponding models. Figures 11, 12, 13 and 14 show that the predicted values of Vw*, f*, and Vs* for all four models clustered closely around the 45-degree line, indicating good predictive accuracy. Among these models, HBPINN achieved the best performance, with the highest R² and lowest RMSE values, demonstrating the superiority of the proposed model.

To further investigate the computational advantages of HBPINN, we employed identical experimental configurations and systematically adjusted training dataset sizes and hidden layer architectures to compare the accuracy and efficiency of the four models. All timing measurements exclusively encompass the model training phase, excluding data augmentation procedures. As shown in Table 3, HBPINN maintained comparable accuracy while achieving substantially improved computational efficiency with a simpler hidden layer structure. Compared with BPINN and VI-BPINN, the computational time was reduced by more than 4-fold, while compared with PINN, it was reduced by nearly 10-fold. This improvement can be attributed to the multi-scale parameter structure of HBPINN, which reduces independent parameter sampling and eliminates redundant optimization processes. These results demonstrate that HBPINN achieves higher accuracy with smaller training datasets and requires significantly lower computational resources than BPINN, VI-BPINN, and PINN. This finding highlights the considerable potential of the HBPINN model for real-time control applications in manufacturing processes.

Fig. 10

Comparison of the iterative process of different models for a training set size of 200. (a) HBPINN. (b) BPINN. (c) PINN.

Fig. 11

Results of HBPINN with a training set size of 200. (a) Comparison between model-predicted and actual values of Vw*. (b) Comparison between model-predicted and actual values of f *. (c) Comparison between model-predicted and actual values of Vs*.

Fig. 12

Results of the BPINN with a training set size of 200. (a) Comparison between model-predicted and actual values of Vw*. (b) Comparison between model-predicted and actual values of f *. (c) Comparison between model-predicted and actual values of Vs*.

Fig. 13

Results of the VI-BPINN with a training set size of 200. (a) Comparison between model-predicted and actual values of Vw*. (b) Comparison between model-predicted and actual values of f *. (c) Comparison between model-predicted and actual values of Vs*.

Fig. 14

Results of the PINN with a training set size of 200. (a) Comparison between model-predicted and actual values of Vw*. (b) Comparison between model-predicted and actual values of f *. (c) Comparison between model-predicted and actual values of Vs*.

Table 3 Comparison of computational costs of different models.

Uncertainty assessment of model parameters

Since PINN inherently lacks uncertainty quantification capabilities, this section focuses on the parameter uncertainty quantification performance of HBPINN, BPINN, and VI-BPINN. The uncertainty assessment of model parameters for HBPINN with a training set size of 200 is illustrated in Fig. 15; the corresponding assessments for BPINN and VI-BPINN under the same training conditions are shown in Figs. 16 and 17, respectively. Model parameters A, B, C, and D exhibit approximately normal distributions across the three models. The mean value of model parameter A in HBPINN deviates more from the initial value than in BPINN and VI-BPINN, and the confidence intervals of all four model parameters in HBPINN are wider than those of the other two models. This can be attributed to the fact that BPINN assumes independence among model parameters and VI-BPINN adopts a mean-field assumption for modeling parameter correlations, whereas HBPINN's hierarchical parameter structure provides a richer space for uncertainty modeling. These results suggest that, when dealing with complex physical systems with high uncertainty, HBPINN offers more robust uncertainty quantification by accounting for correlations among model parameters, thereby enabling more reliable parameter estimation.

The group effect analysis captures the complex correlations between model parameters and evaluates their degree of influence on the model results. As shown in Fig. 18, with a training dataset of 200 samples, the group effect values for model parameters A, B, C, and D were 0.59, 0.65, 0.86, and 0.64, respectively. Model parameter C exhibited the strongest inter-group effect, indicating the most significant influence on model outcomes. Model parameters A, B, and D demonstrated relatively lower group effects, suggesting comparatively minor influence and greater stability within the model. Additionally, with a training dataset of 200 samples, the global effect variance was minimal, converging to 0.9074, which indicates that the global parameters, serving as the baseline for the model parameters, have negligible impact on model performance.

Fig. 15

Uncertainty assessment of HBPINN parameters with a training dataset of 200 samples. (a) Confidence interval for model parameter A. (b) Confidence interval for model parameter B. (c) Confidence interval for model parameter C. (d) Confidence interval for model parameter D.

Fig. 16

Uncertainty assessment of BPINN parameters with a training dataset of 200 samples. (a) Confidence interval for model parameter A. (b) Confidence interval for model parameter B. (c) Confidence interval for model parameter C. (d) Confidence interval for model parameter D.

Fig. 17

Uncertainty assessment of VI-BPINN parameters with a training dataset of 200 samples. (a) Confidence interval for model parameter A. (b) Confidence interval for model parameter B. (c) Confidence interval for model parameter C. (d) Confidence interval for model parameter D.

Fig. 18

Group effect analysis of HBPINN with a training dataset of 200 samples.

Robustness analysis

To investigate the robustness of the models under various noise conditions, Gaussian noise was added to the training dataset at levels of 1%, 3%, 5%, 10%, 15%, and 20%. As shown in Fig. 19(a), with a training dataset of 200 samples, the R² values for Vw*, f*, and Vs* in HBPINN remained remarkably stable at approximately 0.9 as the Gaussian noise increased from 1 to 20%. Figure 19(b) demonstrates that BPINN exhibited a decline in R² values for Vw*, f*, and Vs* at higher noise levels, with Vw* showing the most pronounced decrease. Figure 19(c) shows that under higher noise levels the R² value of Vs* for VI-BPINN decreases significantly, while the performance for Vw* and f* remains relatively stable. Figure 19(d) reveals that for PINN, the R² values for Vw*, f*, and Vs* decreased dramatically once the noise level exceeded 10%. These results demonstrate that HBPINN maintains superior robustness compared to BPINN, VI-BPINN, and PINN, even under high noise conditions.

Fig. 19

Model robustness analysis under different levels of data noise with a dataset size of 200. (a) HBPINN, (b) BPINN, (c) VI-BPINN, (d) PINN.

Conclusions

This study presents a novel HBPINN that accounts for correlations among model parameters during uncertainty analysis, wherein the integration of a hierarchical Bayesian framework with a physics-informed neural network enables effective inverse solution of process parameters in gear grinding. The main contributions and conclusions are summarized as follows:

(1)

    A global-group-individual multilevel structure was established to investigate model parameter correlations through group effects and analyze model parameter uncertainties via the hierarchical Bayesian framework. Results demonstrate that HBPINN achieved superior R² and RMSE values compared to BPINN, VI-BPINN, and PINN across datasets of different scales. Additionally, HBPINN required significantly lower computational costs, achieving a 10-fold speed improvement over PINN with smaller datasets and simpler network architectures. Compared to BPINN and VI-BPINN, HBPINN obtained wider confidence intervals for model parameters, reducing dependence on initial values and enhancing model reliability.

(2)

    The total loss function was constructed by integrating empirical loss, physics-based loss, and KL divergence regularization terms, improving model interpretability and consistency of model parameters.

(3)

HBPINN demonstrated superior robustness, maintaining excellent stability compared to BPINN, VI-BPINN, and PINN even at data noise levels as high as 20%.