Introduction

In actual working conditions, gears are typically subjected to complex cyclic loads such as bending, contact, and torsion1,2,3. To enhance the overall performance of gears, carburizing and quenching are commonly employed to increase surface hardness and improve the internal structure of the material4,5. Atmosphere carburizing is widely used because of its easily adjustable carbon potential, simple process formulation, and low production cost6. In this process, activated carbon atoms decomposed from the carburizing medium infiltrate the surface layer of the material, producing a high carbon content at the surface while the core retains its original composition. Carburized parts therefore exhibit strong wear resistance, surface contact fatigue resistance, and bending fatigue resistance7. The carburizing process thus has significant practical value and economic benefit and has been widely applied to key mechanical parts such as crankshafts, turbine blades, gears and forgings used in ships, machine tools and aircraft4,8.

Currently, optimization of the carburizing process relies primarily on trial and error, in which the relationship between the carburizing process parameters and the gradient of carbon content is identified through experiments9. However, this method requires substantial manpower, materials and financial resources, resulting in high experimental costs and a long research cycle. To overcome these problems, researchers have employed numerical simulation10 to calculate the carbon concentration distribution under given carburizing process parameters. This approach can effectively determine the optimal carburizing conditions, thereby reducing experimental costs and shortening the research cycle. Although numerical simulation has the potential to optimize the carburizing process, operators with a weak theoretical background may find it difficult to use in actual production. Numerical simulation also poses challenges in the production process, including slow calculation speed and large consumption of time and human resources11. To address these issues, some scholars have used machine learning13,14 to predict the cutting force and surface finish in titanium alloy milling12, and others have adopted data-driven prediction methods based on simulation or experimental data15,16. Furthermore, some researchers have used a BPNN model to establish the nonlinear relationship between the surface carbon concentration, the effective carburized layer depth, and the process parameters after vacuum carburizing in order to find the optimal vacuum carburizing conditions17. However, the carbon concentration gradient may differ slightly even when the carburized layer depth is the same, which makes it inaccurate to predict the carbon concentration gradient based solely on the depth of the carburized layer. A 2D carburization cloud map provides more comprehensive information than a 1D carburization distribution, covering the carbon concentration at different locations and regions and thus highlighting spatial changes. Therefore, this study aims to predict the carbon concentration of three 2D shapes, a square, a circle, and a trapezoid, under various process parameters.

In materials prediction tasks, SVMs have demonstrated high accuracy under small-sample conditions, particularly in predicting friction material properties18, but tend to suffer from reduced precision and larger errors on large datasets19. In contrast, BPNNs have been applied to material process quality prediction because of their strong nonlinear modeling capability. To improve their convergence and accuracy, genetic algorithms (GA) have been used to optimize the initial weights and thresholds. GA-BPNN models have shown higher accuracy and faster convergence than standard BPNNs, making them effective tools for predicting process deviations and enhancing quality control20. However, neural networks often train slowly on large datasets, which can hinder their practical application.

To predict the relationship between the carbon concentration distribution and the process parameters after atmosphere carburizing heat treatment, this paper explores a method to address the slow training of neural networks caused by the large amount of carbon concentration data in the 2D carburizing model. To deal with slow training on large datasets, some scholars have proposed combining clustering analysis with neural networks21. In the field of ultrasonic testing, a prediction method combining mean shift clustering (MSC) and BPNN has been applied successfully22. MSC partitions the original dataset into a smaller set of cluster centers based on the features and trends of the data; these cluster centers are then used as training data for the BPNN, thereby accelerating training.

However, that method did not optimize the parameters of the BPNN, which can easily lead to local optima and slow convergence. A hybrid of heuristic algorithms23 and BPNN can avoid insufficient exploration and premature stagnation during later iterations24. This study addresses the aforementioned issues and makes two contributions. First, it proposes a novel prediction method for the BPNN to improve the accuracy, stability, and computational efficiency of the predictions. Second, the algorithm is tested on a large-scale dataset of different carburized 2D models and demonstrates excellent generalization performance.

The slime mould algorithm (SMA) is a metaheuristic algorithm based on the behavior of slime mould found in nature25. It uses the collective behavior exhibited by slime mould when searching for food to explore optimal solutions, and its uniqueness lies in simulating the movement, communication, and fitness-updating processes of slime mould. Compared with traditional algorithms such as the dragonfly algorithm (DA)26, the grey wolf optimizer (GWO)27 and particle swarm optimization (PSO)28, SMA has demonstrated superior optimization performance. In this study, SMA is employed to optimize the parameters of the BPNN and improve its convergence speed. A parameter-adaptive BPNN algorithm based on mean shift clustering (MSC) and SMA (MSMABP) is proposed. MSC clusters the data to form cluster centers that serve as the training set for the neural network, and SMA then optimizes the initial weights and thresholds of the BPNN to improve the stability and prediction accuracy of the network. Compared with highly similar algorithms in terms of prediction accuracy, stability, and training time, MSMABP ranks first in stability and second in prediction accuracy, which demonstrates the feasibility and effectiveness of the method.

Carbon concentration prediction in carburized steel based on MSMABP

The main idea of MSMABP is as follows: (1) divide the large data sample into a training set and a test set; (2) use MSC to process the training set and obtain the cluster centers; (3) set the BPNN parameters (weights and thresholds): multiple parameter groups are generated randomly to form the BPNN parameter population, and each parameter group is a search individual; (4) use the BPNN to predict the training-set samples and obtain the prediction accuracy for each parameter group based on the cluster centers; (5) use SMA to update the BPNN parameter groups according to the prediction accuracy, and loop steps (4) and (5) until the cut-off condition is met (generally the maximum number of iterations), yielding the optimal BPNN parameter group whose weights and thresholds are then assigned to the BPNN for training; (6) input the test set into the trained BPNN to obtain the carbon concentration values. MSMABP is therefore a combinatorial method: MSC reduces the size of the large training set to form cluster centers, SMA optimizes the BPNN parameters based on these cluster centers, and the optimized initial weights and thresholds are finally assigned to the BPNN for training.

The BPNN parameter population is denoted X = {X1, X2, …, Xn}, composed of search individuals Xi, where n is the number of search individuals. The attributes of Xi are the weights and thresholds of each layer of the neural network. Before applying the heuristic algorithm, the initial parameters of the neural network are set randomly; the heuristic algorithm then iteratively optimizes these initial values, using the prediction accuracy as the fitness value. The flowchart of MSMABP is shown in Fig. 1.
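As an illustration of this encoding, the following sketch (a minimal, hypothetical Python example, not the authors' implementation) flattens the weights and thresholds of a BPNN with the layer sizes adopted later in this paper into a single search-individual vector and evaluates its fitness as the RMSE on the cluster centers; the helper names and the tanh/linear activations are assumptions made for illustration.

```python
import numpy as np

# Hypothetical helpers: a search individual X_i is a flat vector holding all weights
# and thresholds of the BPNN; its fitness is the prediction error on the cluster centers.
LAYERS = [10, 20, 20, 20, 20, 20, 1]   # layer sizes of the network adopted later in this paper

def n_params(layers=LAYERS):
    """Length of one search individual: all weights plus all thresholds (biases)."""
    return sum(layers[i] * layers[i + 1] + layers[i + 1] for i in range(len(layers) - 1))

def unpack(individual, layers=LAYERS):
    """Split a flat parameter vector into per-layer weight matrices and threshold vectors."""
    weights, biases, pos = [], [], 0
    for i in range(len(layers) - 1):
        size_w = layers[i] * layers[i + 1]
        weights.append(individual[pos:pos + size_w].reshape(layers[i], layers[i + 1]))
        pos += size_w
        biases.append(individual[pos:pos + layers[i + 1]])
        pos += layers[i + 1]
    return weights, biases

def forward(individual, X):
    """Forward pass with tanh hidden layers and a linear output (assumed tansig/purelin counterparts)."""
    weights, biases = unpack(individual)
    a = X
    for w, b in zip(weights[:-1], biases[:-1]):
        a = np.tanh(a @ w + b)
    return a @ weights[-1] + biases[-1]

def fitness(individual, Xc, yc):
    """Fitness used by the SMA loop: RMSE of the network on the cluster centers (Xc, yc)."""
    return float(np.sqrt(np.mean((forward(individual, Xc).ravel() - yc) ** 2)))

# Usage: each SMA search individual is a vector of length n_params(); lower fitness is better.
```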

Fig. 1 MSMABP prediction process.

Mean shift clustering

After obtaining the normalized training set, it is necessary to perform mean shift clustering analysis to obtain the cluster centers of the training set. Mean shift clustering is a non-parametric clustering method used to discover dense regions from unlabeled data. This algorithm estimates the density of data points and determines the cluster centers in the data space by following the direction of maximum density gradient. The specific steps are as follows:

Step 1 Set the iterative error (error1), merging error (error2) and the search bandwidth (BW);

Step 2 Randomly select a point from the unmarked data points as the cluster center (center);

Step 3 Search for data points in the search dataset that fall within the bandwidth (BW) range of the current center point (center) and label them as set M.

Step 4 For each data point in the set M, calculate the vector from the current center point (center) to the data point, and add these vectors to obtain a total offset vector (shift);

Step 5 Add the current center point (center) and the offset vector (shift) to get the new center point position;

Step 6 Check whether the offset vector (shift) is smaller than the iteration error error1. If yes, the center point is stable enough, go to step 7 for the next step. If not, the center point is still moving, then return to step 3 and continue to iterate this process;

Step 7 Check whether the distance between the current center point (center) and the existing cluster center is smaller than the set merge error error2. If yes, the distance between the two clusters is very close, and they are merged into one cluster. This can reduce the number of clusters and improve the conciseness of the clustering results;

Step 8 If all points have been visited (marked), stop searching and output the centers of all clusters as CCs; otherwise, return to step 2.
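A minimal Python sketch of steps 1 to 8 is given below, assuming Euclidean distance and a flat kernel; the parameter values are placeholders, and scikit-learn's MeanShift class offers an equivalent off-the-shelf implementation.

```python
import numpy as np

def mean_shift(data, bw=0.5, error1=1e-3, error2=0.5):
    """Flat-kernel mean shift following Steps 1-8; returns the cluster centers (CCs).
    bw, error1 and error2 are placeholder values."""
    centers = []
    visited = np.zeros(len(data), dtype=bool)
    while not visited.all():
        # Step 2: start from a random unvisited (unmarked) data point.
        center = data[np.random.choice(np.where(~visited)[0])]
        while True:
            # Step 3: points within bandwidth BW of the current center form the set M.
            in_bw = np.linalg.norm(data - center, axis=1) < bw
            if not in_bw.any():
                break
            visited |= in_bw                      # mark the points covered by this search
            # Steps 4-5: the total offset vector moves the center to the local mean.
            new_center = data[in_bw].mean(axis=0)
            shift = np.linalg.norm(new_center - center)
            center = new_center
            # Step 6: stop once the center has stabilised.
            if shift < error1:
                break
        # Step 7: merge with an existing cluster center if they are closer than error2.
        for i, c in enumerate(centers):
            if np.linalg.norm(c - center) < error2:
                centers[i] = (c + center) / 2
                break
        else:
            centers.append(center)
    return np.array(centers)                      # Step 8: output all cluster centers

# Example: cluster 2D points; scikit-learn's MeanShift(bandwidth=bw) gives a comparable result.
# centers = mean_shift(np.random.rand(1000, 2), bw=0.2)
```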

BP neural network

The neural network model for optimizing the heat treatment of carburized steel utilizes the BPNN algorithm29,30. The model takes carburizing and diffusion temperature, time, and carbon potential as input parameters, along with shape and x-y coordinates, with carbon concentration as the output layer parameter. Among these, the input shape is represented numerically: 1 for square, 2 for circle and 3 for trapezoid. These three selected shapes have broad applications in actual production processes. Specifically, the square shape resembles the cross-section of a turning tool, the circular shape resembles the cross-section of a shaft, and the trapezoid shape resembles the cross-section of a gear.

A total of 243 sets of simulation data with different process parameters are generated in ABAQUS; each set contains the carbon concentration value at every coordinate of every shape. In total, 530,226 data points are used as samples, of which 371,158 are used to train the model and the remaining 159,068 are used to test its generalization and stability.

Before establishing the BPNN model, the maximum number of iterations is set to 200, the learning rate to 0.01, and the momentum factor to 0.001. The trainlm function is selected as the training function, and the backpropagation rule is used to train the neural network. The activation function of the output layer is set to purelin, and that of all hidden layers to the S-shaped tangent function tansig. Next, the number of hidden layers and neurons is determined. Compared with a single hidden layer, a multi-hidden-layer neural network model performs better overall, although training takes longer. This article considers 4, 5, and 6 hidden layers. The number of neurons is chosen in the range of 5 to 14 according to the empirical formula31, and in the range of 15 to 24 outside the formula range. The formula is as follows:

$$\:l=\sqrt{n+m}+a$$
(1)

where l is the number of nodes in a single hidden layer, a is a constant (1 < a < 10), and n and m are the numbers of input and output nodes respectively.
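For the network considered here, Eq. (1) gives a concrete range: with n = 10 input nodes and m = 1 output node, \(\:\sqrt{n+m}\approx\:3.3\), so 1 < a < 10 yields roughly 5 to 13 neurons per hidden layer, which corresponds to the 5 to 14 range examined first; the 15 to 24 range deliberately steps outside the formula.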

As can be seen from Fig. 2, as the number of neuron nodes and hidden layers increases, the root mean square error of the BPNN decreases while the running time gradually increases. As illustrated by the red box in Fig. 2b, when a model with this structure is selected, the root mean square error on the test set is less than 4 × 10−5, while the calculation time of the same structures in Fig. 2a exceeds 3000 s, often by a wide margin (except for the model with 5 hidden layers and 20 neurons per layer). Considering both prediction accuracy and running time, this paper adopts a 10 × 20 × 20 × 20 × 20 × 20 × 1 BPNN structure, as shown in Fig. 3.
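For orientation, the sketch below builds an analogous 10 × 20 × 20 × 20 × 20 × 20 × 1 regressor with scikit-learn; it is a stand-in rather than the setup used in the paper, since MLPRegressor does not provide the trainlm (Levenberg-Marquardt) training function, and tanh hidden layers with a linear output are used here as counterparts of tansig and purelin.

```python
from sklearn.neural_network import MLPRegressor

# Analogous 10 x 20 x 20 x 20 x 20 x 20 x 1 regressor: five hidden layers of 20 neurons each.
# The paper trains with MATLAB's trainlm; scikit-learn has no Levenberg-Marquardt solver,
# so 'sgd' with the paper's learning rate and momentum factor is used here as a stand-in.
model = MLPRegressor(
    hidden_layer_sizes=(20, 20, 20, 20, 20),
    activation="tanh",        # counterpart of tansig
    solver="sgd",
    learning_rate_init=0.01,  # learning rate
    momentum=0.001,           # momentum factor
    max_iter=200,             # maximum number of iterations
    random_state=0,
)
# model.fit(X_train, y_train)  # the output layer of MLPRegressor is linear, matching purelin
```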

Fig. 2 Comparison of model performance with different number of neuron nodes under different hidden layers.

Fig. 3 Schematic diagram of the BP neural network structure.

Slime mould optimization algorithm

The slime mould algorithm (SMA) is an optimization algorithm that simulates the spreading and foraging behavior of slime mould25. It is inspired by the vegetative stage of slime mould, including its feeding behavior and morphological changes. SMA uses an adaptive weight factor to simulate the positive and negative feedback behavior of slime mould, forming three contraction modes of foraging behavior. The foraging behavior of slime mould comprises three steps: looking for food, approaching food, and secreting enzymes to digest food. The process of SMA is shown in Fig. 4, and Table 1 provides the pseudo code of the algorithm.

Fig. 4 The steps of SMA.

Approaching food

Slime mould judges the concentration of food based on the smell in the air and then approaches the food. To simulate this behavior, its approach behavior is formulated using the following equation to simulate the contraction mode25:

$$\:\overrightarrow{S\left(t+1\right)}=\left\{\begin{array}{c}\overrightarrow{{S}_{b}\left(t\right)}+\overrightarrow{vb}\bullet\:\left(\overrightarrow{W}\bullet\:\overrightarrow{{S}_{A}\left(t\right)}-\overrightarrow{{S}_{B}\left(t\right)}\right),\:x<q\\\:\overrightarrow{vc}\bullet\:\overrightarrow{S\left(t\right)},\:x\ge\:q\end{array}\right.$$
(2)

where \(\:\overrightarrow{vb}\) and \(\:\overrightarrow{vc}\) are control parameters, t is the current iteration number, \(\:\overrightarrow{{S}_{b}}\) denotes the individual location with the highest odor concentration found so far, \(\:\overrightarrow{{S}_{A}}\) and \(\:\overrightarrow{{S}_{B}}\) are two random individuals, and \(\:\overrightarrow{W}\) represents the weight of the slime mould.

$$\:q=\text{tanh}\left|P\left(i\right)-TF\right|$$
(3)

where \(\:i\in\:\text{1,2},\dots\:,n\), \(\:P\left(i\right)\) is the fitness value of \(\:\overrightarrow{S}\), and \(\:TF\) is the best fitness value obtained over all iterations.

$$\:\overrightarrow{W\left(SmellIndex\right(i\left)\right)}=\left\{\begin{array}{c}1+x \bullet\:{log}\left(\frac{bF-P\left(i\right)}{bF-wF}+1\right),half\\\:1-x \bullet\:{log}\left(\frac{bF-P\left(i\right)}{bF-wF}+1\right),\:others \end{array}\right.$$
(4)
$$\:SmellIndex=sort\left(P\right)$$
(5)


where half refers to the individuals whose P(i) ranks in the top half of the population and others to the remaining individuals; x is a random number in \(\:\left[0,\:1\right]\); \(\:bF\) and \(\:wF\) are the best and worst fitness values of the current iteration; and \(\:SmellIndex\) is the sorted sequence of fitness values.

Wrapping food

The formula for updating the slime mould position is as follows:

$$\:\overrightarrow{{S}^{\text{*}}}=\left\{\begin{array}{c}rand\bullet\:\left(UB-LB\right)+LB,\:rand<z\\\:\overrightarrow{{S}_{b}\left(t\right)}+\overrightarrow{vb}\bullet\:\left(\overrightarrow{W}\bullet\:\overrightarrow{{S}_{A}\left(t\right)}-\overrightarrow{{S}_{B}\left(t\right)}\right),\:x<q\\\:\overrightarrow{vc}\bullet\:\overrightarrow{S\left(t\right)},\:x\ge\:q\end{array}\right.$$
(6)

where LB and UB are the lower and upper boundaries of the search range respectively, rand is a random value in \(\:\left[0,\:1\right]\), and z is a small probability that controls random re-initialization of individuals.

Access to food

The control parameter \(\:\overrightarrow{vb}\) is chosen randomly within [-d, d]. The control parameter \(\:\overrightarrow{vc}\) oscillates randomly within \(\:\left[-1,\:1\right]\) and gradually converges to zero.

$$\:\overrightarrow{vb}=[-d,\:d]$$
(7)
$$\:d=\text{a}\text{r}\text{c}\text{t}\text{a}\text{n}\text{h}\left(-\left(\frac{t}{\text{m}\text{a}\text{x}\_t}\right)+1\right)$$
(8)
Table 1 SMA Pseudo code.
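The sketch below condenses Eqs. (2)-(8) into a single SMA position update in Python; it is an illustrative reading of the pseudo code in Table 1 rather than a verbatim reproduction, and details such as taking TF from the current population and the value of z are assumptions.

```python
import numpy as np

def sma_step(S, fit, t, max_t, lb, ub, z=0.03):
    """One SMA position update implementing Eqs. (2)-(8).
    S: (n, dim) population; fit: (n,) fitness values (lower is better); t runs from 1 to max_t."""
    n, dim = S.shape
    order = np.argsort(fit)                       # SmellIndex, Eq. (5) (ascending: best first)
    bF, wF = fit[order[0]], fit[order[-1]]
    Sb = S[order[0]]                              # location with the best fitness found so far

    # Weight W, Eq. (4): the better half gets 1 + x*log(...), the rest 1 - x*log(...).
    W = np.empty((n, dim))
    ratio = np.log((bF - fit[order]) / (bF - wF - 1e-12) + 1.0)
    x = np.random.rand(n, dim)
    W[order[:n // 2]] = 1 + x[:n // 2] * ratio[:n // 2, None]
    W[order[n // 2:]] = 1 - x[n // 2:] * ratio[n // 2:, None]

    q = np.tanh(np.abs(fit - bF))                 # Eq. (3); TF approximated by the best fitness so far
    d = np.arctanh(1.0 - t / max_t)               # Eq. (8); t starts from 1 so the argument stays below 1
    vc_range = 1.0 - t / max_t                    # vc oscillates in a range that shrinks toward zero

    new_S = np.empty_like(S)
    for i in range(n):
        if np.random.rand() < z:                  # Eq. (6), first branch: random relocation
            new_S[i] = lb + np.random.rand(dim) * (ub - lb)
        elif np.random.rand() < q[i]:             # Eq. (2), upper branch: approach food
            A, B = np.random.randint(n, size=2)   # two random individuals S_A, S_B
            vb = np.random.uniform(-d, d, dim)    # Eq. (7)
            new_S[i] = Sb + vb * (W[i] * S[A] - S[B])
        else:                                     # Eq. (2), lower branch
            vc = np.random.uniform(-vc_range, vc_range, dim)
            new_S[i] = vc * S[i]
    return np.clip(new_S, lb, ub)
```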

Case study

Experiment description

The atmosphere carburizing experiment in this paper is carried out in the wx-1000 atmosphere carburizing multifunctional furnace fabricated by Jiangsu Yixin Gear Manufacturing Co., Ltd. This furnace meets the production requirements for carburized layer depths of 1.5~3.5 mm on small and medium-sized parts, with a relatively high production rate and efficiency. It features a relatively high working temperature, a stable furnace atmosphere, and a high degree of automation, and is especially suitable for processing large quantities of carburized heat-treated parts such as gears, bearings, and knives. The experimental material in this study is the gear steel 20Cr2Ni4A, prepared as a cylindrical sample of \(\:\varnothing\:\)20 mm \(\:\times\:\) 48 mm. Table 2 shows the chemical composition of 20Cr2Ni4A gear steel.

Table 2 Chemical compositions of 20Cr2Ni4A steel (wt%).

For the carburizing production process of different carburized layers, the cylindrical sample is placed into the furnace for carburizing. The carburizing process mainly involves two stages: carburizing and diffusion, followed by subsequent cooling to the appropriate temperature for quenching. The process curve is shown in Fig. 5.

The first stage is preheating, with the temperature raised to 800 °C and held isothermally for 30 min to ensure uniform heating. The second stage is carburizing, with the carbon potential set to cp1%, the temperature raised to T1 °C, and the duration set to t1 minutes. The third stage is diffusion, with the carbon potential set to cp2%, the temperature maintained at T2 °C, and the duration set to t2 minutes. In the fourth stage, the temperature is lowered to the quenching temperature T3 °C and held for t3 minutes. The fifth stage is quenching for 40 min, with the quenching oil temperature set at 70 °C and the actual temperature rising to about 120 °C. The sixth stage is air cooling until the sample reaches room temperature. The sample placement within the furnace is shown in Fig. 6.

Fig. 5 Carburizing and quenching process curve.

According to the determined carburizing process for spiral bevel gears, cylindrical samples were carburized in the furnace to verify the accuracy of the simulation model. As shown in Fig. 6, three cylindrical specimens were installed on racks under three different sets of process parameters, with each gear rack weighing about 1 ton. The carburizing and diffusion temperature of processes 1 and 2 is 910 °C, that of process 3 is 920 °C, and the quenching temperature of all three processes is 820 °C. The carburizing carbon potential of the three processes is 1.2%, the diffusion carbon potential of processes 1 and 3 is 0.8%, and that of process 2 is 0.9%. Carburizing time is given in hours; process 3 has the longest set time, while process 2 has the shortest. The specific process parameters are shown in Table 3.

Table 3 Carburizing and quenching process parameters.
Fig. 6 Pictures of furnace samples of different processes.

Carburized layer carbon concentration gradient test

The carbon concentration is measured at three different positions on the cylindrical sample after carburizing, as shown in Fig. 7b. The test equipment is a Bruker Q2 ION direct reading spectrometer (Germany), shown in Fig. 7a. First, the carbon concentration of the surface is measured. Second, the marks left by the spectrometer test are ground away with a grinding machine. The thickness loss is then measured with a micrometer, followed by measurement of the carbon concentration at the same location. The same procedure is repeated until the substrate of the sample is reached.

Fig. 7 Test equipment and test locations: (a) direct reading spectrometer; (b) carbon concentration test location.

Numerical simulation

Atmospheric carburizing can be divided into three processes. First, the carburizing medium undergoes a decomposition reaction at high temperature. Subsequently, carbon atoms are continuously transferred to the surface of the steel part, causing the carbon content on the surface of the steel part to increase. Finally, a concentration gradient is formed between the surface and the core of the steel part, providing a driving force for carbon atoms to move into the steel part17.

The governing equation of the carburizing diffusion model is Fick's second law, described by the following partial differential equation32:

$$\:\frac{\partial\:C}{\partial\:t}=D\frac{{\partial\:}^{2}C}{\partial\:{x}_{i}^{2}}$$
(9)

Here, C is the carbon concentration, D is the carbon diffusion coefficient, which is a function related to temperature and carbon concentration. For alloy steel, the carbon diffusion coefficient can be expressed as:

$$\:D=(0.0047{e}^{-1.6C}\times\:{e}^{\left[\frac{6600C-37000}{RT}\right]})\times\:S$$
(10)

Where R is the gas constant, T is the Kelvin temperature, and S is a constant related to the alloy element content.

$$\:D\frac{\partial\:C}{\partial\:{x}_{i}}=\beta\:(C-{C}_{g})$$
(11)

where Cg is the environmental carbon potential, C is the carbon content on the workpiece surface, and β is the carbon transfer coefficient between the gas phase and the solid steel part.
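To make the model concrete, the following sketch solves Eq. (9) on a 1D domain with the diffusion coefficient of Eq. (10) and the surface flux condition of Eq. (11) using an explicit finite-difference scheme; the geometry, β, S and time step are illustrative assumptions and are not taken from the paper, which performs the actual simulations in ABAQUS.

```python
import numpy as np

R = 8.314  # universal gas constant, J/(mol K)

def diff_coeff(C, T, S=1.0):
    """Diffusion coefficient of Eq. (10); C in wt%, T in K. S = 1 and mm^2/s units are assumptions."""
    return 0.0047 * np.exp(-1.6 * C) * np.exp((6600.0 * C - 37000.0) / (R * T)) * S

def carburize_1d(T=910 + 273.15, Cg=1.2, beta=1.25e-4, C0=0.2,
                 depth=4.0, nx=201, t_total=4 * 3600.0, dt=1.0):
    """Explicit finite-difference solution of Eq. (9) with the surface flux condition of Eq. (11).
    Depth in mm, time in s; beta and the time step are illustrative, not the paper's values."""
    dx = depth / (nx - 1)
    C = np.full(nx, C0)
    for _ in range(int(t_total / dt)):
        D = diff_coeff(C, T)
        Cn = C.copy()
        # Interior nodes, Eq. (9): dC/dt = D * d2C/dx2 (dt must satisfy dt < dx^2/(2D) for stability).
        Cn[1:-1] = C[1:-1] + dt * D[1:-1] * (C[2:] - 2.0 * C[1:-1] + C[:-2]) / dx**2
        # Surface node, Eq. (11): carbon flux from the atmosphere, D*dC/dx = beta*(C - Cg).
        Cn[0] = C[0] + dt * 2.0 / dx * (beta * (Cg - C[0]) - D[0] * (C[0] - C[1]) / dx)
        # Core end: zero-flux (symmetry) condition.
        Cn[-1] = C[-1] + dt * 2.0 * D[-1] * (C[-2] - C[-1]) / dx**2
        C = Cn
    return np.linspace(0.0, depth, nx), C

x_mm, carbon = carburize_1d()
print(f"surface carbon content ~ {carbon[0]:.3f} wt%")
```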

As shown in Fig. 8a, the simulated object is a small cylindrical 20Cr2Ni4A sample with a height of 48 mm and a diameter of 40 mm. To simplify the calculation of the 1D diffusion process, half of the cylindrical sample is taken as the simulation object and the finite element model is established, as shown in Fig. 8b. AB is the axis of symmetry, and temperature and carburizing carbon potential boundary conditions are imposed on the remaining three sides. For the carburizing heat treatment model, the mesh within 4 mm of the surface is refined to accurately capture the carbon gradient, giving a denser surface mesh and a sparser core mesh. The elements are quadrilateral, with 4600 elements and 4747 nodes, as shown in Fig. 8c. The initial condition is a uniform concentration field inside the sample, with the carbon concentration at each node set to C0 = 0.2% and the density set to 7.98 g/cm3. The boundary conditions are set as surface flux conditions.

The testing points on the top face of the finite element model of the cylindrical sample are shown in Fig. 8d. Testing points are spaced 0.13 mm apart from the surface inward, and a total of 31 testing points are taken to measure the carbon content.

Fig. 8 Cylindrical sample and model: (a) schematic diagram of a cylindrical sample; (b) finite element model; (c) finite element model meshing; (d) carbon concentration distribution cloud chart.

Comparison between simulation results and experiments

The carbon concentration was measured on cylindrical samples produced with the three different sets of process parameters. The effective carbon concentration depths corresponding to a carbon content of 0.35% were determined: 1.372 mm, 0.858 mm, and 1.819 mm for process 1, process 2, and process 3 respectively. The comparison between the simulated and experimental carbon content gradients is shown in Fig. 9. Both the simulated and experimental values decrease with increasing depth, and the decreasing trends agree.

The highest carbon content among the three processes lies between 0.81% and 0.91%, with the carbon concentration evenly distributed on the surface and gradually decreasing from the surface toward the center. The surface carbon content in process 2 is higher than in the other two processes because the relatively short diffusion time causes the absorption rate of carbon atoms on the workpiece surface to exceed their diffusion rate into the interior; carbon atoms therefore accumulate at the surface, raising the surface carbon concentration. In contrast, the carburized layer in process 3 is the deepest, as both its strong carburizing stage and its diffusion stage are the longest, which facilitates the continuous inward diffusion of carbon atoms.

Fig. 9 Carbon concentration gradient in three different processes.

To verify the accuracy of the carburizing numerical model, linear interpolation was used to calculate the simulated carbon concentration values at the three experimental test locations, corresponding to the positions of the simulation coordinates. The standard deviation and average of the carbon concentration values at the three positions were then calculated. As shown in Fig. 10, the percentage error curves between the experimental and simulated values for the three processes reveal that the maximum percentage error for process 1 is 7.79%, occurring at a distance of 1.86 mm from the surface. For all three processes the maximum percentage error lies within 0~0.2 mm of the surface. This discrepancy is attributed to the error of the test equipment and to the use of the 0.02% substrate material as the standard sample; in areas with higher carbon concentrations there may therefore be some measurement error.
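A brief sketch of this verification step, with hypothetical measurement values, is shown below: the simulated profile is linearly interpolated onto the experimental depths with np.interp, after which the average, standard deviation and percentage error are computed.

```python
import numpy as np

# Simulated profile: carbon concentration vs. depth at the model's testing points (illustrative values).
sim_depth = np.linspace(0.0, 4.0, 31)           # 31 testing points, ~0.13 mm apart
sim_carbon = 0.2 + 0.7 * np.exp(-sim_depth / 0.8)

# Experimental depths measured with the micrometer at the three test positions (hypothetical data).
exp_depth = np.array([0.05, 0.40, 0.90, 1.40, 1.86])
exp_carbon = np.array([[0.88, 0.74, 0.55, 0.42, 0.33],   # position 1
                       [0.90, 0.75, 0.56, 0.43, 0.34],   # position 2
                       [0.87, 0.73, 0.54, 0.41, 0.32]])  # position 3

# Linear interpolation of the simulated profile onto the experimental depths.
sim_at_exp = np.interp(exp_depth, sim_depth, sim_carbon)

# Average and standard deviation over the three positions, then percentage error vs. simulation.
exp_avg = exp_carbon.mean(axis=0)
exp_std = exp_carbon.std(axis=0)
pct_error = np.abs(exp_avg - sim_at_exp) / exp_avg * 100
print(np.round(pct_error, 2), np.round(exp_std, 4))
```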

Fig. 10 Percent error between experimental value and simulation value.

This article chooses a cylindrical end face for carbon concentration gradient testing. Under the same process parameters, the end-face results are the same as the 1D carbon concentration gradients of the square and trapezoid in the direction normal to the surface. For 2D shapes, the finite element software accounts for the effect of differently shaped surfaces on carbon concentration diffusion. Using an accurate carbon diffusion model of the carburizing process, three shapes (circle, square, and trapezoid) were modeled, as shown in Fig. 11. The numbers of grid points for the three shapes are 729, 728, and 725 respectively. An orthogonal combination of process parameters is designed for large-scale calculation according to the carburizing process parameter range: for carburizing and diffusion, the temperature is fixed at three levels, the time settings are the same, and the carbon potential is set at three levels respectively, giving a total of 3\(\:\times\:\)3\(\:\times\:\)3\(\:\times\:\)3\(\:\times\:\)3 = 243 groups of different carburizing process parameters, as shown in Table 4. The carbon concentration distributions calculated under these process parameters serve as training samples for the neural network.

The carbon concentration distributions of the three shapes under the 243 sets of process parameters were generated with the simulation software, giving a total of 243\(\:\times\:\)(729 + 728 + 725) = 530,226 data points as samples. Finally, a prediction model based on the BPNN was established, with the carburizing process parameters, coordinates and shape value as inputs and the carbon concentration value as output.

Table 4 Simulation scheme for generating the training set.
Fig. 11 Meshing with three different shapes.

Feasibility verification experiment

To better display the distribution cloud map of carburizing carbon concentration prediction results, we randomly selected input data of three different shapes under different process parameters (as shown in Table 5) from the test set and input them into the trained network. The carbon concentration distribution cloud diagrams of the three different shapes were obtained, as shown in Fig. 12.

Table 5 Process parameters for three different shape selections.
Fig. 12 Three shape simulation results and prediction result cloud charts.

The comparison of the percentage errors between simulation values and predicted values is shown in Fig. 13. The area with larger errors for the circular shape is within a ring 1.5–4.5 mm from the surface. The areas with larger errors for the square and trapezoidal shape are all in the four corner areas. The maximum percentage errors of the square, circle, and trapezoid shapes are 11%, 5.9%, and 10.45% respectively. The reason the circular shape has smaller errors than the other two shapes is that the surface of the circular shape is relatively uniform. Square and trapezoidal surfaces have corners, and the carbon concentration distribution in these areas is more complex.

Fig. 13 Comparison of simulation and prediction percentage errors.

To better observe the gradient change of the carburized carbon content, the carbon concentration of each shape is sampled every 0.35 mm along the vertical direction, starting from the coordinate position (7.5, 15) and ending at the substrate. The percentage errors between the prediction and simulation results for the square, circle, and trapezoid are 0.071%-2.399%, 0.33%-3.39%, and 0.013%-3.33% respectively, as shown in Fig. 14, indicating that the proposed algorithm generalizes well in carbon concentration prediction. The effective carburized layer depths of the square, circular and trapezoidal shapes are 1.35 mm, 2.32 mm and 1.72 mm respectively. The effective carburized layer depth depends on the carburizing process parameters: generally, the higher the temperature, the longer the time, and the higher the carbon potential, the deeper the carburized layer15. The effective carburized layer of the circular shape is deeper than that of the other two shapes because its carburizing temperature is higher, which accelerates the diffusion of carbon atoms; in addition, the arc shape of its surface also accelerates carbon diffusion. The percentage error curves of the three shapes show a larger error in the range of 2\(\:\sim\)3.5 mm from the surface, because there are fewer grid points in this range, so the corresponding training samples from the simulation model are insufficient and the error increases.
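The effective case depth quoted here can be extracted from any predicted profile by locating the 0.35% C crossing with linear interpolation, as in the short sketch below; the sampled profile is hypothetical.

```python
import numpy as np

def effective_case_depth(depth_mm, carbon_wt, threshold=0.35):
    """Depth at which the carbon profile first falls below the 0.35 wt% threshold,
    found by linear interpolation; assumes the profile decreases and actually crosses the threshold."""
    idx = int(np.argmax(carbon_wt < threshold))     # first sampling point below the threshold
    d0, d1 = depth_mm[idx - 1], depth_mm[idx]
    c0, c1 = carbon_wt[idx - 1], carbon_wt[idx]
    return d0 + (c0 - threshold) * (d1 - d0) / (c0 - c1)

# Example: sample a (hypothetical) predicted profile every 0.35 mm from the surface toward the substrate.
depth = np.arange(0.0, 4.0, 0.35)
carbon = 0.2 + 0.65 * np.exp(-depth / 1.1)
print(f"effective case depth ~ {effective_case_depth(depth, carbon):.2f} mm")
```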

Fig. 14 Carbon concentration gradient error comparison.

To enhance the interpretability of the carbon concentration prediction model, we applied SHAP (SHapley Additive Explanations) analysis to quantify the contribution of each process parameter. As shown in Fig. 15, the feature importance derived from SHAP values provides a clear and consistent explanation of how each input feature affects the model’s output.

Among all input features, the X and Y coordinates exhibit the highest importance scores. This finding is consistent with the physical mechanism of the carburizing process: the carbon concentration tends to be higher closer to the surface, particularly within the 0\(\:\sim\)2 mm depth range from the part's surface. In addition, the diffusion carbon potential and quenching carbon potential also demonstrate significant influence, because the quenching stage is the final phase of the carburizing process and directly determines the surface carbon concentration after treatment.
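A sketch of such a SHAP analysis is given below; it uses a small synthetic stand-in model and an assumed ordering of the ten input features, since the trained network and exact feature ordering are not reproduced here, and KernelSHAP is used as a model-agnostic explainer.

```python
import numpy as np
import shap
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Assumed ordering of the ten input features, for illustration only.
feature_names = ["carb. temperature", "carb. time", "carb. potential", "diff. temperature",
                 "diff. time", "diff. potential", "quench temperature", "shape", "x", "y"]

# Stand-in for the trained carbon-concentration model: a small MLP fitted on synthetic data.
X = rng.random((500, 10))
y = 0.2 + 0.7 * np.exp(-3.0 * X[:, 9]) + 0.05 * X[:, 5]   # toy response, not real carburizing data
model = MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=500, random_state=0).fit(X, y)

# KernelSHAP: model-agnostic explanation against a small background sample to keep the cost manageable.
background = shap.sample(X, 50)
explainer = shap.KernelExplainer(model.predict, background)
X_explain = X[:100]
shap_values = explainer.shap_values(X_explain, nsamples=100)

# Mean |SHAP| value per feature gives the importance ranking visualised in Fig. 15.
shap.summary_plot(shap_values, X_explain, feature_names=feature_names, plot_type="bar")
```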

Fig. 15 Analysis of the contribution of process parameter characteristics to carbon concentration prediction.

Comparative experiment

Some initial parameters involved in this method are shown in Table 6. Figure 16 shows the BPNN model training performance curves for each hybrid method. After 200 rounds of learning and iteration, each method reduces the error to 2.14\(\:\times\:\)10−5, demonstrating that each network model trains very quickly and achieves good training results. The training performance curves show that in roughly the first 15 rounds the mean square error drops sharply and the networks converge rapidly. After 50 rounds the curves become smooth with no obvious fluctuations, reflecting that the networks have captured the correspondence between the 10 input layer parameters and the output layer parameter. Therefore, the above combined methods can be used for carburizing process optimization.

Table 6 Parameter settings.

The partially enlarged view in Fig. 16 shows that the training performance curve of the MSC-BP neural network, whose initial weights and thresholds are not optimized, only flattens after 80 iterations, which is clearly slower than the other methods. Optimizing the initial values of the BPNN with a heuristic algorithm thus significantly accelerates network convergence. Comparing the training performance curves of the methods that use heuristic algorithms to optimize the initial values of the BPNN, the loss function of the MSMABP algorithm decreases fastest during the first 40 iterations.

Fig. 16 BPNN model training performance curves of each hybrid method.

To verify the prediction performance of the proposed method (MSMABP), it is compared with MGWOBP, MPSOBP, and MDABP. With a training-to-test allocation ratio of 7:3, each method was run 5 times. Table 7 lists the average (Avg) and standard deviation (Std) of the prediction error and running time; Std is mainly used to evaluate the stability of each method. As can be seen from Fig. 17, the violin contour of the MSMABP method is widest around 1.29 × 10−4, indicating that its results are concentrated near this value, while the MPSOBP method has the largest fluctuation range and the least stable prediction accuracy. In terms of training time, the MSMABP method fluctuates around 650 s and is more stable than the other methods, as shown in Fig. 18. The proposed method performs best in the stability of both prediction accuracy and running time, ranking first in overall stability. In prediction accuracy it is second only to the MGWOBP method, but it is slightly inferior to the other methods in neural network training time.

Compared with other machine learning methods, the performance of the MSMASVR method is significantly worse. As shown in Table 7, its RMSE reaches 2.46 × 10−3, one order of magnitude higher than the other methods, indicating poor prediction accuracy on large-scale datasets. Additionally, its average running time exceeds 5800 s, far longer than the neural-network-based approaches. This is primarily because SVR involves solving a quadratic programming problem with a computational complexity of O(n3), making it highly sensitive to the training sample size. Furthermore, SVR typically requires computing kernel functions against all support vectors, which becomes costly as the data grows.

Mean shift clustering drastically reduces the number of training samples and thus the running time. However, this introduces a new problem: the loss of original data reduces accuracy. Therefore, the hybrid of MSC and BP (MBP) is formed, and the comparison results are shown in Table 7. A total of 522,955 training samples can be clustered into 77,321 cluster centers with mean shift clustering. Using the cluster centers as the training set reduces the running time of the BPNN by a factor of nearly 5.7. In terms of prediction accuracy, the RMSE (root mean square error) of MBP is still relatively small, about 4.5 times that of the BPNN. The average percentage error of the BPNN is 0.48%, and that of MBP is 2.18%. To some extent, this loss is acceptable.
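The statistics reported in Table 7 can be reproduced from repeated runs with a few lines of code; the sketch below (with hypothetical run results) shows the RMSE, mean percentage error, and Avg/Std calculations used for the comparison.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mean_pct_error(y_true, y_pred):
    """Average percentage error, as quoted for BPNN (0.48%) and MBP (2.18%)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(np.abs(y_true - y_pred) / np.abs(y_true)) * 100)

# Hypothetical RMSE results of 5 repeated runs of one method, summarised as Avg and Std (Table 7 style).
runs_rmse = np.array([1.28e-4, 1.30e-4, 1.29e-4, 1.31e-4, 1.27e-4])
print(f"Avg = {runs_rmse.mean():.2e}, Std = {runs_rmse.std(ddof=1):.2e}")
```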

Table 7 Comparison of MSMABP with other algorithms.
Fig. 17 Violin and scatter plots of the RMSE results of each method.

Fig. 18 Violin and scatter plots of the running time of each method.

Conclusion

To solve the problem of efficiently predicting carburizing carbon concentration under large-sample conditions, a parameter-adaptive BPNN prediction method combining MSC and SMA is proposed in this paper. MSC accelerates the whole method, and SMA improves its prediction accuracy and stability. By training on a large amount of historical data, the model learns the complex nonlinear relationship between the atmosphere carburizing carbon concentration and the corresponding process parameters, showing high prediction accuracy. The average percentage error of the trained model on the test set is 2.18%, the correlation coefficient between the target values and the predicted values reaches 0.9998, and the root mean square error is 1.29 × 10−4, which verifies the superiority of the model. In addition, the prediction accuracy and stability of MSMABP, MBP, MGWOBP, MPSOBP and MDABP were compared; the method proposed in this paper shows a strong competitive advantage and excellent stability of results, and can be applied directly to practical processing. Future research is expected to apply this method to predict three-dimensional carburized carbon concentration fields and to consider the influence of material factors and more complex shapes on the carbon concentration after carburizing. By reducing trial production costs, this approach is also of significance for reducing carbon emissions and protecting the environment.