Abstract
Accurate inverse solution of process parameters by surface roughness is crucial for precision gear grinding processes. When inversely solving process parameters, model parameters are typically obtained by fitting experimental data. However, model parameters exhibit complex correlations and uncertainties, posing significant challenges to the inverse solution of process parameters. To address these challenges, the study proposes a hierarchical Bayesian physics-informed neural network (HBPINN) for the inverse solution of gear-grinding process parameters. An innovative global-group-individual level hierarchical structure is constructed for model parameters. Correlation analysis among model parameters is conducted through group effects within a hierarchical Bayesian framework, followed by uncertainty analysis. Then, multivariate regression functions describing the relationship between process parameters and surface roughness are constructed to form the physics loss function. The regularization incorporates the Kullback-Leibler (KL) divergence of model parameters, integrating with the empirical loss function. Furthermore, datasets of different scales were established through Gaussian process regression (GPR) algorithms. Compared with Bayesian physics-informed neural network (BPINN), variational inference Bayesian physics-informed neural network (VI-BPINN), and physics-informed neural network (PINN), HBPINN demonstrates superior performance in terms of both efficiency and accuracy. With a training set size of 200, HBPINN reduced prediction time by 4–10 times and achieved an average R² of 0.9629. The model demonstrates excellent uncertainty quantification capabilities and robustness.
Introduction
Gears are key components of transmission systems, widely used in high-end manufacturing fields such as aerospace and precision instruments. Given the high geometric precision and surface quality requirements for gear transmission systems in these fields, gear manufacturing processes face increasingly stringent challenges. Gear grinding represents a complex nonlinear system where the relationship between process parameters and surface roughness lacks a precise mechanistic model, typically relying on empirical functions derived from experimental data fitting. However, identical surface roughness values can result from different combinations of process parameters, as demonstrated in Fig. 1(a). The model parameters derived from fitting different experimental datasets exhibit significant discrepancies. These model parameters are not mutually independent but demonstrate correlations, which subsequently affect the accuracy of reverse engineering process parameters, as illustrated in Fig. 1(b). Appropriately accounting for correlations among model parameters in uncertainty analysis to enable accurate inverse solution of process parameters based on surface roughness represents an issue of significant theoretical value and great importance in practical engineering.
The inverse solution of process parameters using empirical functions and empirical functions derived from different experimental datasets. (a) Function fitting using a single set of experimental data, showing that identical surface roughness values (Ra = 0.68 μm, 0.72 μm, 0.76 μm, and 0.80 μm) correspond to multiple different combinations of process parameters, where f represents feed rate, Vw represents workpiece feed speed, and Vs represents grinding wheel linear velocity. (b) Empirical functions obtained from fitting different experimental datasets. Fitting empirical functions 1: \(Ra=4.86{{\text{e}}^6} \times {V_s}^{{ - 5.53}}{V_w}^{{5.37}}{f^{ - 5.50}}\); Fitting empirical functions 2: \(Ra=110 \times {V_s}^{{ - 1.97}}{V_w}^{{1.82}}{f^{ - 2.05}}\); Fitting empirical functions 3: \(Ra=3.83{{\text{e}}^7} \times {V_s}^{{ - 6.257}}{V_w}^{{5.97}}{f^{ - 6.14}}\); Fitting empirical functions 4: \(Ra=0.97 \times {V_s}^{{ - 0.33}}{V_w}^{{0.21}}{f^{ - 0.42}}\); Fitting empirical functions 5: \(Ra=2.19{e^8} \times {V_s}^{{ - 6.86}}{V_w}^{{6.57}}{f^{ - 6.74}}\)
Process parameters are key factors affecting part manufacturing quality. Research on inverse solution methods for process parameters is fundamental to intelligent manufacturing. In mechanical manufacturing, process parameter research primarily includes mathematical models, numerical simulations, expert systems, machine learning, and neural networks. Tan et al.1 developed a mathematical model for process parameter selection using multi-pass turning as an example. Ermer et al.2 created a multi-pass turning mathematical model combining geometric and linear programming. Juan et al.3 determined optimal cutting process parameters by combining numerical simulation with multi-objective particle swarm optimization algorithms. Dureja4 reviewed various empirical modeling techniques and optimization methods for process parameter optimization in challenging turning problems. Iqbal A. et al.5 investigated fuzzy modeling in the trade-offs between energy consumption, tool life, and productivity in metal cutting processes. Cao et al.6 studied multi-objective decision methods for high-speed dry gear hobbing process parameters. K. Shelesh7 developed an AI system for obtaining process parameters in plastic injection molding. Jianxin Deng et al.8 proposed a data-driven process parameter design method for extrusion casting. Chen et al.9 combined dung beetle optimization algorithms and long short-term memory neural networks to predict the surface roughness of bearing outer rings under various grinding conditions. However, mathematical models, numerical simulations and other methods often require simplifying real systems and cannot accurately analyze the relationship between process parameters and surface roughness. Machine learning, neural networks and other methods, while improving accuracy, typically require large amounts of experimental data.
Researchers have recently combined physical knowledge with neural networks to propose PINN10,11,12,13,14. The purpose is to incorporate physical laws into the loss function, guiding the learning process to produce solutions that better conform to physical laws15,16,17,18. Building on this foundation, researchers have conducted in-depth studies on network architecture, loss function design, optimization strategies, and other aspects19,20,21,22,23,24,25,26,27,28,29. PINN has been applied to mechanical manufacturing, construction, power systems, and other domains with good results in practical engineering fields. A new physics-guided neural network model was proposed for tool wear prediction in mechanical manufacturing30. An AI-driven partial differential equation solver combining finite element methods with physics-informed neural networks was developed for additive manufacturing process parameter selection31. A new model based on physics-informed neural networks (PhysCon) was proposed in the construction field. This model combines the interpretability of physical laws with the expressive power of neural networks for control-oriented demand response in grid-integrated buildings32. In the power systems domain, physics-informed neural networks were used to build high-precision, strongly generalizable power output evaluation models, calculating the peak-shaving tolerance range of target thermal power plants under different heating demands33. However, PINN cannot achieve uncertainty quantification and lacks confidence interval assessment for the model, subsequently affecting the accuracy and reliability of the model.
In practical engineering problems, prediction results can easily be inaccurate due to the influence of multi-source uncertainties in complex systems. Therefore, BPINN34 was introduced, which combines the PINN and Bayesian frameworks for uncertainty quantification and is robust35,36,37. Longze L38 applied physics-based Bayesian neural networks to material property prediction, incorporating physical knowledge to guide BPINN solutions. S. Stock39 introduced the application of Bayesian physics-informed neural networks in power systems. Aditya40 proposed a variant of BPINN, developing a reduced-order machine learning model that accurately and effectively predicted charge density in corrosive environments. The above research demonstrates that BPINN has advantages in predictive performance and uncertainty quantification. However, there is insufficient research on uncertainty analysis that accounts for correlations among model parameters in process parameter inversion problems within complex manufacturing systems.
Due to the complex working conditions in gear manufacturing systems and the correlation and uncertainty of model parameters, new challenges have emerged for the inverse solution of process parameters. The current challenges are summarized as follows:
(1) Uncertainty analysis of correlations between model parameters. Model parameters during process parameter inversion are not mutually independent; consideration of correlations among model parameters is necessary when conducting uncertainty analysis to improve model accuracy. How to analyze multi-level model parameter uncertainty is one of the bottleneck problems constraining the inverse solution of process parameters.
(2) Model interpretability and consistency of model parameters. Neural networks are black-box models, making it particularly important to incorporate physical formulas to enhance model interpretability. Additionally, ensuring consistency between the posterior and prior distributions of model parameters is essential when quantifying model parameter uncertainty.
To address the aforementioned challenges, HBPINN that accounts for correlations among model parameters during uncertainty analysis is proposed. The overall architecture of the model is illustrated in Fig. 2. The key contributions of this work are summarized as follows:
(1) A multi-level hierarchical structure of model parameters consisting of global-level, group-level, and individual-level components has been constructed. The complex correlations among model parameters are investigated through inter-group effects, while parameter uncertainties are analyzed within a hierarchical Bayesian framework. This approach enhances the model’s uncertainty quantification capability and accuracy.
(2) A physical loss function relating process parameters to surface roughness has been established using a multivariate regression function, and regularization is implemented through KL divergence across multi-level model parameters, improving model interpretability and model parameter consistency.
(3) Additionally, experimental data augmentation is achieved through GPR algorithms. HBPINN is trained using this augmented dataset and validated with a subset of experimental data, simultaneously reducing experimental costs while enhancing model accuracy.
The organization of this paper is as follows: In section “HBPINN”, HBPINN for the inverse solution of process parameters is proposed, covering data augmentation methods, hierarchical Bayesian framework, and loss function formulation. In section “Experiment”, the gear grinding process and surface roughness detection experiments are described. In section “Results and discussion”, the performance of different models in the inverse identification of gear grinding process parameters is evaluated. In section “Conclusions”, the main conclusions are presented.
HBPINN
This section proposes HBPINN, which accounts for correlations among model parameters during uncertainty analysis for the inverse solution of gear grinding process parameters. First, GPR is employed to augment experimental data to form the dataset. Second, a hierarchical Bayesian framework establishes global-level, group-level, and individual-level model parameter structures, constructs correlation matrices to analyze interaction patterns between model parameters and subsequently examines parameter uncertainties. Third, a multivariate regression function establishes the relationship between process parameters and roughness as a physics-based loss function, which is combined with a data-driven loss function and regularized through KL divergence of multi-level model parameters to construct the composite loss function. An adaptive mechanism automatically optimizes the hyperparameters in the loss function, and the model is solved by minimizing this loss function.
Data enhancement based on GPR
Obtaining large amounts of experimental data for gear grinding is expensive and time-consuming. Data augmentation41,42 can expand the dataset size without incurring additional costs. Enhancing experimental data through GPR43,44 can improve the model’s generalization ability, making it a key technology for enhancing the predictive performance of neural networks.
For data augmentation in this paper, the original data is first standardized (see Eqs. 1 and 2). The input data is surface roughness Ra, while the output data x consists of workpiece feed rate Vw, feed amount f, and grinding wheel linear velocity Vs.
Where \({\mu _x}\) and \({\sigma _x}\) are the mean and standard deviation of the original input data, \({\mu _y}\) and \({\sigma _y}\) are the mean and standard deviation of the original output data, and \({x_{norm}}\) and \({y_{norm}}\) are the standardized input and output data, respectively.
Second, the GPR training can be represented by Eq. (3):
Where \({x_*}\) is the next original data point, \(m\left( x \right)\) is the mean function, and \(K\left( {x,{x_*}} \right)\) is the kernel function, as shown in Eq. (4), in which \(\sigma _f^2\) is the signal variance and \(l\) is the length scale parameter. For the predicted input value \(x^{\prime}\), the corresponding predicted output value \(y^{\prime}\) follows a multivariate Gaussian distribution as represented by Eq. (5):
Where \(\mu ^{\prime}\) and \(\Sigma ^{\prime}\) are the mean and covariance matrix of the predicted data, as shown in Eqs. (6) and (7), K represents the covariance matrix between original data points. \(K\left( {x^{\prime},x} \right)\) represents the covariance matrix between predicted data and original data. \(K(x^{\prime},x^{\prime})\) represents the covariance matrix among predicted data points.
Finally, Latin Hypercube Sampling ensures uniform distribution of the augmented data. The augmented data is obtained through the trained Gaussian Process Regression model with adaptive Gaussian noise introduction, as shown in Eqs. (8) and (9).
Where \({x_{new}}\) and \(\sigma _{{xnew}} ^{2}\) are the input of the augmented data and its corresponding noise variance, \({y_{new}}\) and \(\sigma _{{ynew}} ^{2}\) are the output of the augmented data and its corresponding noise variance. X and Y represent the final augmented data.
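The augmentation steps above can be sketched in a few lines of Python. The following is a minimal illustration only, assuming a one-dimensional input, a zero-mean GPR with a squared-exponential kernel whose hyperparameters are fixed rather than fitted, a simple stratified (Latin hypercube style) sampler, and noise scaled by the predictive standard deviation. The toy values and all function names are hypothetical and are not the data or settings used in this study.

```python
import math
import random

def standardize(values):
    """Return standardized values with the mean and standard deviation used."""
    mu = sum(values) / len(values)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    return [(v - mu) / sigma for v in values], mu, sigma

def rbf(a, b, signal_var=1.0, length=0.5):
    """Squared-exponential kernel (hyperparameters fixed here, not fitted)."""
    return signal_var * math.exp(-0.5 * (a - b) ** 2 / length ** 2)

def cholesky(A):
    """Lower-triangular factor L with A = L L^T."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward and backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gpr_predict(x_train, y_train, x_new, noise_var=1e-6):
    """Posterior mean and variance of a zero-mean GP at each point of x_new."""
    n = len(x_train)
    K = [[rbf(x_train[i], x_train[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = chol_solve(L, y_train)
    means, variances = [], []
    for xs in x_new:
        k_star = [rbf(xs, a) for a in x_train]
        means.append(sum(k * a for k, a in zip(k_star, alpha)))
        v = chol_solve(L, k_star)
        var = rbf(xs, xs) - sum(k * w for k, w in zip(k_star, v))
        variances.append(max(var, 0.0))
    return means, variances

def latin_hypercube_1d(n, low, high, rng):
    """One uniform draw per equal-width stratum, returned in shuffled order."""
    width = (high - low) / n
    points = [low + (i + rng.random()) * width for i in range(n)]
    rng.shuffle(points)
    return points

def augment(x_raw, y_raw, n_new, seed=0):
    """Generate n_new synthetic (x, y) pairs on the original scale."""
    rng = random.Random(seed)
    x_norm, mu_x, sd_x = standardize(x_raw)
    y_norm, mu_y, sd_y = standardize(y_raw)
    x_new = latin_hypercube_1d(n_new, min(x_norm), max(x_norm), rng)
    mean, var = gpr_predict(x_norm, y_norm, x_new)
    # Noise scaled by the predictive standard deviation (an assumed form).
    y_new = [m + rng.gauss(0.0, math.sqrt(v)) for m, v in zip(mean, var)]
    return ([x * sd_x + mu_x for x in x_new],
            [y * sd_y + mu_y for y in y_new])

# Toy data only, not measurements from this study.
x_toy = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_toy = [0.80, 0.76, 0.74, 0.72, 0.70, 0.68]
x_aug, y_aug = augment(x_toy, y_toy, n_new=20)
```

In this sketch the predictive variance is close to zero near the original points and grows away from them, so the injected noise is largest where the regression is least certain.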
Hierarchical Bayesian framework
The surface roughness of gears is influenced by numerous sources of uncertainty, including thermal errors, machine errors, and installation errors. In parametric uncertainty analysis, model parameters are not mutually independent but exhibit complex correlations. To address these problems, a hierarchical Bayesian framework is employed to account for correlations among model parameters during uncertainty analysis, thereby improving model accuracy while enhancing uncertainty assessment capabilities.
This section establishes a global-level, group-level, and individual-level hierarchical structure for model parameters. The correlations between model parameters are investigated through inter-group effects, and correlation matrices are constructed to analyze the interactions between model parameters. The final posterior distribution is obtained by combining prior distributions with observed data, and uncertainty quantification of multi-level model parameters is achieved through a hierarchical Bayesian framework45,46,47,48. The specific process is shown in Fig. 3.
The prior distributions of the global parameter \({\theta _{gl}}\), group parameter \({\theta _{gr}}\), and individual parameter \({\theta _{in}}\) in the model are assumed to be Gaussian:
Where \({\mu _{gl}}\) and \(\sigma _{gl}^{2}\) are the mean and variance of \({\theta _{gl}}\), \({\mu _{gr}}\) and \(\sigma _{gr}^{2}\) are the mean and variance of \({\theta _{gr}}\), and \({\mu _{in}}\) and \(\sigma _{in}^{2}\) are the mean and variance of \({\theta _{in}}\). R is the correlation matrix between individual parameters. \({\eta _{{\theta _{in}}}}\) is the correlation parameter in R corresponding to \({\theta _{in}}\). Through global effects \({\gamma _{gl}}\), a common baseline is provided for all \({\theta _{in}}\) to improve generalization ability, as shown in Eq. (13). The differences between individual parameters are further refined through group effects \({\gamma _{gr}}\) to account for complex correlations among model parameters, as shown in Eq. (14). \({\theta _{in}}\), i.e., model parameters, are obtained under the joint constraints of \({\gamma _{gl}}\) and \({\gamma _{gr}}\), as shown in Eq. (15).
The correlation matrix R is established through the following steps to consider further correlations between model parameters for improving model accuracy. First, g represents correlations among different model parameters and is initialized to zero. Matrix G is constructed according to Eq. (16). Second, the lower triangular matrix L is extracted, as shown in Eq. (17). Third, the initial correlation matrix \({\mathbf{R^{\prime}}}\) is computed, with the identity matrix I added to ensure matrix symmetry, thereby improving numerical computation stability, see Eq. (18). Fourth, the square root of the i-th diagonal element is computed as shown in Eq. (19). Finally, normalization yields the final correlation matrix \({\mathbf{R}}\), as shown in Eq. (20).
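The construction steps above can be illustrated with a short Python sketch. Because Eqs. (16) to (20) are not reproduced here, the sketch assumes that G is a square matrix of free parameters g, that L is its lower-triangular part including the diagonal, and that Eq. (18) has the form R′ = L Lᵀ + I. These forms and the function name are assumptions made for illustration. Under this assumed form, the identity term also keeps every diagonal entry of R′ at or above 1, so the normalization never divides by zero.

```python
import math

def correlation_matrix(G):
    """Build a correlation matrix from a square matrix G of free parameters."""
    d = len(G)
    # Assumed Eq. (17): keep the lower triangle of G, including the diagonal.
    L = [[G[i][j] if j <= i else 0.0 for j in range(d)] for i in range(d)]
    # Assumed Eq. (18): R' = L L^T + I.
    R_prime = [[sum(L[i][k] * L[j][k] for k in range(d)) + (1.0 if i == j else 0.0)
                for j in range(d)] for i in range(d)]
    # Eq. (19): square roots of the diagonal elements.
    s = [math.sqrt(R_prime[i][i]) for i in range(d)]
    # Eq. (20): normalize so that every diagonal entry equals 1.
    return [[R_prime[i][j] / (s[i] * s[j]) for j in range(d)] for i in range(d)]

# With g initialized to zero, the parameters start out uncorrelated.
R0 = correlation_matrix([[0.0] * 4 for _ in range(4)])

# A small nonzero example (hypothetical values).
R1 = correlation_matrix([[1.0, 0.0], [1.0, 1.0]])
```

Starting from zero means the model begins with the independence assumption and moves away from it only as the data support correlations.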
Given the observed data \(y=\{ {y_1},{y_2}, \ldots ,{y_N}\}\), the posterior distribution \(p\left( {\theta |y} \right)\) in the hierarchical Bayesian framework can be expressed as:
Where \(\theta\) represents the parameters at all levels; \(p(y|{\theta _{in}})\) is the likelihood function given the individual parameters; \(p({\theta _{in}}|{\theta _{gr}})\) is the conditional prior of the individual parameters; \(p({\theta _{gr}}|{\theta _{gl}})\) is the conditional prior of the group parameters; \(p({\theta _{gl}})\) is the prior distribution of the global parameters; and \(p(y)\) is the marginal likelihood. In Bayesian inference, the posterior distribution is typically complex and difficult to solve directly. A parameterizable approximate distribution \({q_\phi }(\theta )\) is introduced through variational inference to approximate the true posterior distribution, where \(\phi\) is a trainable parameter. The core objective is to minimize the KL divergence between \({q_\phi }(\theta )\) and \(p\left( {\theta |y} \right)\):
Since the actual posterior distribution is typically challenging to compute directly, we instead optimize the Evidence Lower Bound (ELBO):
Where \({E_{{q_\phi }(\theta )}}[\log p(y|\theta )]\) is the expected log-likelihood of \(p(y|\theta )\), and \(KL({q_\phi }(\theta )\parallel p(\theta ))\) is the KL divergence between the approximate distribution \({q_\phi }(\theta )\) and the prior distribution \(p(\theta )\).
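To make the two ELBO terms concrete, the following Python sketch evaluates them in closed form for a deliberately simple conjugate model: observations \(y_i \sim N(\theta, 1)\) with prior \(\theta \sim N(0, 1)\) and a Gaussian approximate posterior. This toy model, its values, and the function names are assumptions chosen so that the exact posterior is known; it is not the hierarchical model of this study.

```python
import math

def kl_gaussian(mu_q, var_q, mu_p, var_p):
    """KL( N(mu_q, var_q) || N(mu_p, var_p) ) for univariate Gaussians."""
    return 0.5 * (math.log(var_p / var_q)
                  + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)

def elbo(y, mu_q, var_q, mu_p=0.0, var_p=1.0, noise_var=1.0):
    """ELBO for y_i ~ N(theta, noise_var) with prior theta ~ N(mu_p, var_p)."""
    # Expected log-likelihood under q, available in closed form for this model.
    expected_loglik = sum(
        -0.5 * math.log(2.0 * math.pi * noise_var)
        - ((yi - mu_q) ** 2 + var_q) / (2.0 * noise_var)
        for yi in y
    )
    return expected_loglik - kl_gaussian(mu_q, var_q, mu_p, var_p)

# Toy observations (hypothetical). The exact posterior here is N(1.5, 0.25).
y_toy = [1.0, 2.0, 3.0]
elbo_at_posterior = elbo(y_toy, mu_q=1.5, var_q=0.25)
elbo_shifted_mean = elbo(y_toy, mu_q=1.0, var_q=0.25)
elbo_wider_var = elbo(y_toy, mu_q=1.5, var_q=0.50)
```

Because the ELBO equals the log marginal likelihood minus the KL divergence to the true posterior, it is largest when the approximate distribution matches that posterior, which is why the first value exceeds the other two.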
Total loss function
The total loss function integrates the empirical loss with the physical relationship between process parameters and roughness in actual manufacturing processes, and it is regularized by the difference between the posterior and prior distributions of the model parameters. This approach aims to enhance model interpretability and accuracy while improving generalization capability and preventing overfitting.
To minimize the empirical loss of model predictions on the training set while maintaining low model complexity, an empirical loss function49 is established:
The mean squared error (MSE) serves as the empirical loss, with L1 and L2 norms of W functioning as regularization terms in Eq. 24, as shown below:
where \({\hat {y}_i}\) represents the model’s predicted values, \({y_i}\) denotes the true values, T measures the complexity of the model, and λ, λ1, λ2 are trade-off hyperparameters.
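As a minimal sketch of this empirical loss, the Python function below adds L1 and L2 penalties on the weights to the MSE. It assumes that the L2 term is the squared norm and that the overall trade-off factor λ is folded into λ1 and λ2; the function name and default values are hypothetical.

```python
def empirical_loss(y_pred, y_true, weights, lam1=1e-4, lam2=1e-4):
    """MSE plus L1 and squared-L2 penalties on the network weights."""
    n = len(y_true)
    mse = sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / n
    l1 = sum(abs(w) for w in weights)
    l2 = sum(w * w for w in weights)
    return mse + lam1 * l1 + lam2 * l2

# With perfect predictions only the penalty terms remain (toy values).
penalty_only = empirical_loss([1.0, 2.0], [1.0, 2.0], [0.5, -0.5],
                              lam1=0.1, lam2=0.1)
```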
According to previous research50,51, the primary process parameters affecting tooth surface roughness during gear grinding include Vw, f, and Vs. A multivariate regression function was established to model the relationship between these process parameters and surface roughness (see Eq. 26). To simplify the formula structure and improve model convergence, logarithmic transformation was applied to both sides of Eq. (26), constructing a physics loss function (see Eq. 25).
Where M represents the sample size, while A, B, C, and D correspond to the logarithmic transformations of the model parameters k5, m, n, and o from Eq. (26), respectively. These model parameters were derived through experimental data fitting, as shown in Table 1. The parameters \(V_s^*\), \(V_w^*\), and \({f^*}\) represent \(\ln {V_s}\), \(\ln {V_w}\), and \(\ln f\), respectively.
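The log-linear form of the physics loss can be sketched as follows. The code assumes that the regression function is the power law \(Ra = k_5 V_s^{m} V_w^{n} f^{o}\), matching the fitted functions listed for Fig. 1(b), so that \(\ln Ra = A + B V_s^* + C V_w^* + D f^*\) with \(A = \ln k_5\), \(B = m\), \(C = n\), and \(D = o\). The pairing of exponents with variables, the process values, and the function name are assumptions for illustration.

```python
import math

def physics_loss(ra, vs, vw, f, A, B, C, D):
    """Mean squared residual of ln Ra = A + B ln Vs + C ln Vw + D ln f."""
    residuals = [
        math.log(r) - (A + B * math.log(s) + C * math.log(w) + D * math.log(x))
        for r, s, w, x in zip(ra, vs, vw, f)
    ]
    return sum(e * e for e in residuals) / len(residuals)

# Coefficients of fitting function 2 in Fig. 1(b).
k5, m, n, o = 110.0, -1.97, 1.82, -2.05
A, B, C, D = math.log(k5), m, n, o

# Hypothetical process parameter values, used only to exercise the function.
vs = [30.0, 35.0, 40.0]
vw = [3.0, 4.0, 5.0]
f = [1.0, 1.5, 2.0]
ra = [k5 * s ** m * w ** n * x ** o for s, w, x in zip(vs, vw, f)]

loss_consistent = physics_loss(ra, vs, vw, f, A, B, C, D)
loss_perturbed = physics_loss(ra, vs, vw, f, A, B + 0.1, C, D)
```

Data generated from the power law give a loss at the level of floating-point rounding, while perturbing one exponent produces a clearly positive loss, which is the signal used to pull predictions toward the empirical relationship.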
Regularization was implemented through the KL divergence of hierarchical model parameters to ensure that the posterior distribution of model parameters does not excessively deviate from the prior distribution.
where \(K{L_{gl}}\), \(K{L_{gr}}\), \(K{L_{in}}\), and \(K{L_{co}}\) represent the KL divergence corresponding to \({\theta _{gl}}\), \({\theta _{gr}}\), \({\theta _{in}}\), and R, respectively. d denotes the dimension of the correlation matrix R.
The complete loss function is:
Where \({\lambda _{P{\text{hy}}}}\) is the hyperparameter corresponding to the physical loss function, \({\lambda _3}\) is the hyperparameter associated with the KL divergence regularization term. An adaptive mechanism52,53,54 was employed to optimize the weights automatically, and the hybrid loss function was minimized based on gradient descent.
Finally, HBPINN was solved on hardware consisting of an Intel(R) Core(TM) i7-10700 CPU @ 2.90 GHz and an AMD Radeon RX 6500 XT GPU. The model was implemented in PyTorch and comprised three hidden layers with 256, 512, and 256 neurons, respectively. A multi-head attention mechanism was introduced for multiple outputs, and the AdamW optimizer was used with a learning rate of 6.3e-3.
Experiment
In this section, gear grinding processes and surface roughness detection experiments were conducted with various technological parameter combinations to enhance the dataset and validate model results. Gear grinding was performed on a PG50150 gear grinding machine, as shown in Fig. 4. Gear surface roughness detection was conducted using a Taylor Hobson Talysurf CLI2000 profilometer, as illustrated in Fig. 5. Nine experiments were carried out using the orthogonal experimental design method. The processing experimental parameters are presented in Table 2. The grinding wheel diameter was 350 mm, with a grinding depth of 0.05 mm and a unit grinding width of 1 mm. The results of gear surface roughness detection are shown in Fig. 6. Two-thirds of the experimental data (No. 1–6) were augmented to obtain training sets of different sizes (100/200/500/1000/2000), while the remaining experimental data (No. 7–9) served as the validation set.
Results and discussion
In this section, we compare and evaluate HBPINN against BPINN, VI-BPINN, and PINN across various aspects, demonstrating the superior performance of HBPINN. Additionally, we assess the robustness of the models under data noise levels of 1%, 3%, 5%, 10%, 15%, and 20%.
Impact of different training set sizes
The data augmentation method described in section “Data enhancement based on GPR” generated five training sets with different sample sizes, as illustrated in Fig. 7. The non-uniform color transitions in the parameter space and the absence of distinct patterns in the surface roughness data points with different colors indicate strong non-linear relationships in the manufacturing process. This underscores the importance of an accurate inverse solution of process parameters.
To validate the rationality of the augmented dataset generated by the GPR algorithm, we analyzed the distributions of experimental data and augmented data. We conducted t-tests and Kolmogorov-Smirnov (K-S) tests. As shown in Fig. 8, the augmented data exhibit consistency with the experimental data regarding distribution shape and peak values. The p-values of t-tests and K-S tests for Vw, f, Vs, and Ra are significantly greater than 0.05, indicating no significant differences between the augmented data and experimental data regarding mean values and distributions.
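The two test statistics used in this comparison can be sketched with the Python standard library alone. The sketch assumes Welch's form of the t statistic, since the variant used in this study is not stated, and it computes only the statistics; the corresponding p-values would normally be obtained from a statistics package. The sample values and function names are hypothetical.

```python
import math
from bisect import bisect_right
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    return (mean(a) - mean(b)) / math.sqrt(
        variance(a) / len(a) + variance(b) / len(b))

def ks_statistic(a, b):
    """Two-sample K-S statistic: largest gap between the empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

# Toy samples (hypothetical): identical samples versus fully separated ones.
same_t = welch_t([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
same_ks = ks_statistic([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
apart_t = welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
apart_ks = ks_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

Statistics near zero, as for the identical samples, correspond to large p-values and therefore to no detectable difference in means or distributions.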
The impact of different training set sizes on the accuracy of HBPINN, BPINN, VI-BPINN, and PINN is shown in Fig. 9. Across various training sets, HBPINN consistently achieves higher R² values for Vw*, f*, Vs*, and their means compared to BPINN, VI-BPINN, and PINN. Notably, in smaller training sets (size of 100), BPINN, VI-BPINN, and PINN exhibit higher R² values for f*, but lower R² values for Vs* and Vw*, indicating poor predictive performance. In contrast, HBPINN maintains high R² values across all process parameters, demonstrating superior accuracy even under limited training data conditions.
Statistical characteristics analysis of experimental data and enhanced data (size 200). (a) Analysis of enhanced data and experimental data of Vw. (b) Analysis of enhanced data and experimental data of f. (c) Analysis of enhanced data and experimental data of Vs. (d) Analysis of enhanced data and experimental data of Ra.
Analysis of accuracy and computational cost for different models
As shown in Fig. 10, taking a training dataset of 200 samples as an example, all four models began to converge after 10–20 iterations and eventually reached stability. Figures 11, 12 and 13 demonstrate that the 90% confidence intervals were narrowest for Vs* in HBPINN, Vw* in BPINN, and f* in VI-BPINN, indicating the highest prediction reliability for these respective parameters in their corresponding models. Figures 11, 12, 13 and 14 show that the predicted values of Vw*, f*, and Vs* for all four models clustered closely around the 45-degree line, suggesting good predictive accuracy. Among these models, HBPINN achieved the best performance, with the highest R² and lowest RMSE values, demonstrating the superiority of the proposed model.
To further investigate the computational advantages of HBPINN, we employed identical experimental configurations and systematically adjusted training dataset sizes and hidden layer architectures to compare the accuracy and efficiency of the four models. All timing measurements exclusively encompass the model training phase, excluding data augmentation procedures. As shown in Table 3, HBPINN maintained comparable accuracy while achieving substantially improved computational efficiency with a simpler hidden layer structure. Compared with BPINN and VI-BPINN, the computational time was reduced by more than 4-fold, while compared with PINN, it was reduced by nearly 10-fold. This improvement can be attributed to the multi-scale parameter structure of HBPINN, which reduces independent parameter sampling and eliminates redundant optimization processes. These results demonstrate that HBPINN achieves higher accuracy with smaller training datasets and requires significantly lower computational resources than BPINN, VI-BPINN, and PINN. This finding highlights the considerable potential of the HBPINN model for real-time control applications in manufacturing processes.
Uncertainty assessment of model parameters
Since PINN inherently lacks uncertainty quantification capabilities, this section focuses on analyzing the parameter uncertainty quantification performance of HBPINN, BPINN, and VI-BPINN. The uncertainty assessment of model parameters for HBPINN with a training set size of 200 is illustrated in Fig. 15. The model parameter uncertainty assessments of BPINN and VI-BPINN under the same training conditions are shown in Figs. 16 and 17, respectively. Model parameters A, B, C, and D exhibit approximately normal distributions across the three models. The mean value of model parameter A in HBPINN deviates more from the initial value compared to those in BPINN and VI-BPINN, with the confidence intervals for all four model parameters in HBPINN being wider than those in the other two models. This can be attributed to the fact that BPINN assumes independence among model parameters; VI-BPINN adopts a mean-field assumption for modeling parameter correlations; whereas HBPINN’s hierarchical parameter structure provides a richer space for uncertainty modeling. These results suggest that when dealing with complex physical systems with high uncertainty, HBPINN offers more robust uncertainty quantification by accounting for correlations of model parameters, thereby facilitating more reliable model parameter estimation.
The group effect analysis captures the complex correlations between model parameters and evaluates their degree of influence on the model results. As shown in Fig. 18, with a training dataset of 200 samples, the group effect values for model parameters A, B, C, and D were 0.59, 0.65, 0.86, and 0.64, respectively. Model parameter C exhibited the largest inter-group effect, indicating that it has the greatest influence on model outcomes. Model parameters A, B, and D demonstrated relatively lower group effects, suggesting their comparatively minor influence and greater stability within the model. Additionally, with a training dataset of 200 samples, the global effect variance was minimal, converging to 0.9074. This indicates that the global parameters, serving as the baseline for model parameters, have negligible impact on the model performance.
Robustness analysis
To investigate the robustness of the four models under various noise conditions, Gaussian noise was added to the training dataset at levels of 1%, 3%, 5%, 10%, 15%, and 20%. As shown in Fig. 19(a), with a training dataset of 200 samples, the R² values for Vw*, f*, and Vs* in HBPINN remained stable at approximately 0.9 as the Gaussian noise increased from 1% to 20%. Figure 19(b) demonstrates that BPINN exhibited a decline in R² values for Vw*, f*, and Vs* at higher noise levels, with Vw* showing the most pronounced decrease. Figure 19(c) shows that under higher noise levels, the R² value of Vs* in VI-BPINN decreases significantly, while the performance for Vw* and f* remains relatively stable. Figure 19(d) reveals that for PINN, the R² values for Vw*, f*, and Vs* decreased dramatically when the noise level exceeded 10%. These results demonstrate that HBPINN maintains superior robustness compared to BPINN, VI-BPINN, and PINN, even under high noise conditions.
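A robustness check of this kind can be sketched as below. The sketch assumes that a noise level of p% means zero-mean Gaussian noise whose standard deviation is p% of the magnitude of each value, which is one common convention and may differ from the definition used in this study. The function names and toy values are hypothetical.

```python
import random

def add_relative_noise(values, level, seed=0):
    """Add zero-mean Gaussian noise with standard deviation level * |value|."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, level * abs(v)) for v in values]

def r_squared(y_true, y_pred):
    """Coefficient of determination R² of predictions against true values."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Noise levels from the robustness study, applied to toy values.
levels = [0.01, 0.03, 0.05, 0.10, 0.15, 0.20]
clean = [1.0, 2.0, 3.0, 4.0]
noisy_sets = {level: add_relative_noise(clean, level) for level in levels}
```

In a full experiment each noisy training set would be used to retrain the models, and R² would then be evaluated on the unchanged validation data.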
Conclusions
This study presents a novel HBPINN that accounts for correlations among model parameters during uncertainty analysis, wherein the integration of a hierarchical Bayesian framework with physics-informed neural network enables effective inverse solution of process parameters in gear grinding. The main contributions and conclusions are summarized as follows:
(1) A global-group-individual multilevel structure was established to investigate model parameter correlations through group effects and analyze model parameter uncertainties via the hierarchical Bayesian framework. Results demonstrate that HBPINN achieved superior R² and RMSE values compared to BPINN, VI-BPINN, and PINN across datasets of different scales. Additionally, HBPINN required significantly lower computational costs, achieving a 10-fold speed improvement over PINN with smaller datasets and simpler network architectures. Compared to BPINN and VI-BPINN, HBPINN obtained wider confidence intervals for model parameters, reducing dependence on initial values and enhancing model reliability.
(2) The total loss function was constructed by integrating empirical loss, physics-based loss, and KL divergence regularization terms, improving model interpretability and consistency of model parameters.
-
(3)
HBPINN demonstrated superior robustness, maintaining excellent stability compared to BPINN, VI-BPINN, and PINN, even under 20% high data noise conditions.
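The composition of the total loss summarized in contribution (2) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the empirical and physics terms are taken as mean squared errors, the posterior and prior of each model parameter are assumed to be scalar Gaussians so that the KL divergence has a closed form, and the weights `w_physics` and `w_kl` are hypothetical. The physics residuals are passed in directly, because the form of the regression function is not restated here.

```python
import math

def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL( N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2) ) for scalar Gaussians."""
    return (math.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
            - 0.5)

def mean_squared(values):
    """Mean of squared entries."""
    return sum(v * v for v in values) / len(values)

def total_loss(pred, target, physics_residuals, posterior, prior,
               w_physics=1.0, w_kl=1.0):
    """Empirical MSE + weighted physics MSE + weighted KL regularizer.

    `posterior` and `prior` are lists of (mean, std) pairs, one pair per
    model parameter (assumed Gaussian; the weights are hypothetical).
    """
    empirical = mean_squared([p - t for p, t in zip(pred, target)])
    physics = mean_squared(physics_residuals)
    kl = sum(gaussian_kl(mq, sq, mp, sp)
             for (mq, sq), (mp, sp) in zip(posterior, prior))
    return empirical + w_physics * physics + w_kl * kl

# Example call with made-up numbers: two predictions, two physics
# residuals, and one model parameter with posterior N(0.2, 0.5^2)
# against a standard normal prior.
loss = total_loss(pred=[0.9, 1.1], target=[1.0, 1.0],
                  physics_residuals=[0.05, -0.02],
                  posterior=[(0.2, 0.5)], prior=[(0.0, 1.0)])
print(f"total loss = {loss:.4f}")
```

The KL term penalizes posteriors that drift far from the prior, which is the sense in which it acts as a regularizer on the model parameters.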
Data availability
No datasets were generated or analysed during the current study.
Funding
This research was financially supported by the National Natural Science Foundation of China (grant no. 52375008).
Author information
Contributions
Qi Zhang: Writing – original draft, Methodology. Qiang Zhang: Investigation, Conceptualization. Yongsheng Zhao: Methodology, Data curation. Yanming Liu: Validation, Investigation. Zhi Wang: Writing – review & editing, Funding acquisition. Yali Ma: Writing – review & editing, Methodology.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhang, Q., Zhang, Q., Zhao, Y. et al. Inverse solution of process parameters in gear grinding using hierarchical bayesian physics informed neural network (HBPINN). Sci Rep 15, 35397 (2025). https://doi.org/10.1038/s41598-025-18005-x