Introduction

The state-of-the-art development of technologies such as high-performance computing1, next-generation chips2, and renewable energies3 generates substantial heat densities. Thermal contact resistance (TCR) plays a crucial role in applications requiring precise thermal management4. Even minimal TCR at interfaces can strongly degrade computational performance or destabilize qubits5. Addressing TCR not only resolves immediate thermal bottlenecks but also facilitates breakthroughs in sustainability, miniaturization, and technologies for extreme environments, thereby establishing TCR as a fundamental aspect of modern thermal engineering6.

Surface characteristics have long been recognized as critical factors influencing the TCR6. Extensive experimental research has demonstrated that surface roughness markedly affects TCR7,8,9,10,11. However, Cui et al.12 highlighted that even when surfaces have identical roughness values, variations in processing methods can result in substantial differences in TCR. Statistical parameters provide an alternative method for characterizing surfaces. Several studies have employed surface fractal models to analyze contact phenomena. Ma et al. conducted experimental measurements of surface fractal parameters, considering elastic, plastic, and elastoplastic deformations, and developed a theoretical prediction model for TCR13. Similarly, Sun et al. extended this approach to cylindrical contact surfaces14. Chen et al. further enhanced the analysis by incorporating thermal stress and asperity interactions15. In summary, these studies characterize surface properties either by using surface roughness metrics, such as arithmetic average roughness (Ra) and root mean square roughness (Rq), or by deriving fractal parameters, including fractal dimension and scaling factor, from fractal theory. However, theoretical models incorporating complete surface topography have not yet been explored.

An alternative research approach involves directly constructing surface morphology using the finite element method (FEM). Dai et al. reconstructed the contact surface and performed FEM simulations with ABAQUS to investigate the TCR of Ti-6Al-4V16. They further incorporated thermal expansion effects and conducted thermo-mechanical coupling simulations17. Wang et al. studied the TCR under non-uniform loading conditions18, while Dong et al. examined the influence of gap conduction19. Although FEM-based numerical simulations can accurately represent complete surface morphology, they have considerable drawbacks, including high computational resource demands, long computation times, and limited reusability. These limitations hinder the development of simple and user-friendly general models.

In recent years, neural network-based predictive models have gained increasing attention in research20. Ren et al. developed a predictive model for TCR utilizing artificial neural networks (ANN), enabling accurate determination of TCR values directly from temperature distributions and contact pressure at multiple points, thus eliminating the need for complex experimental testing procedures21. Moreover, Feng et al. employed ANN to forecast the TCR of copper blocks subjected to load cycling22. However, current investigations on TCR using neural networks primarily rely on simple ANN architectures, overlooking the potential benefits of more sophisticated network designs and their intrinsic properties.

In this study, we develop and validate a deep learning model based on convolutional neural networks (CNNs) that directly utilizes complete surface topography data to predict TCR. The model exhibits robust predictive performance for both TCR and real contact area when evaluated on a comprehensive synthetic dataset derived from a theoretical framework. Subsequently, the model is validated using experimental measurements from ground and turned steel surfaces. By employing interpretability and visualization techniques, we provide direct evidence identifying the specific surface characteristics that influence heat transfer. Our findings reveal that both contact points and, notably, non-contact regions of interacting surfaces strongly affect prediction outcomes. This approach offers a methodology for advancing the fundamental understanding of TCR.

Results

Model performance

As illustrated in Fig. 1a, the training process was conducted over 80 epochs, with the optimal model identified at the 76th epoch, achieving an average validation mean squared error loss of 0.01 during cross-validation. The model’s predictive performance was evaluated on both the training and test datasets. The results indicate that the predicted actual contact area achieved determination coefficients \(({R}^{2})\) of 0.993 on the training set and 0.893 on the test set (Fig. 1c). Similarly, the TCR yielded \({R}^{2}\) values of 0.989 and 0.978 on the training and test sets, respectively (Fig. 1d). Relative errors were also calculated, as shown in Fig. 1b. For both prediction targets, relative errors were predominantly below 25% on the training set and below 50% on the test set, with maximum relative errors not exceeding 50% and 125%, respectively, indicating robust predictive accuracy. The regression lines for the actual contact area on both datasets lay below the x = y line, suggesting a tendency of the model to underestimate the actual contact area. In contrast, the regression lines for the TCR were positioned above the x = y line, implying an overestimation of the TCR.
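The two metrics quoted above can be computed as in the following minimal sketch. The text does not spell out its definitions, so the usual coefficient of determination and a pointwise relative error normalized by the true value are assumed here; the function names are illustrative only.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

def relative_error(y_true, y_pred):
    """Pointwise relative error |pred - true| / |true| (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.abs(y_pred - y_true) / np.abs(y_true)
```

A regression line lying below x = y then corresponds to predictions that are, on average, smaller than the true values.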

Fig. 1: Training and evaluation details of deep learning model.

a Changes in the mean squared error (MSE) loss during training and validation across the five-fold cross-validation process. The shaded area represents the range of the loss distribution for each epoch, whereas the solid line indicates the corresponding average loss. b Distribution of relative errors of the prediction model on both the training and test sets. c Performance of the prediction model in predicting the actual contact area. d Performance of the prediction model in predicting thermal contact resistance (TCR).

Experimental validation

After validating the model’s performance, further verification was conducted using experimentally measured data. The test specimens were divided into two groups: the first group was fabricated by grinding (Fig. 2a), while the second group was produced by turning (Fig. 2b). Both groups exhibited an Ra of approximately 0.8 μm. Based on processing characteristics, a 5 × 5 grid of local areas was selected for surface topography testing in the first group, whereas three local areas were tested from the center outward in the second group. TCR was measured under pressures ranging from 1 to 4.55 MPa. Surface topography data and contact pressure were used as model inputs to predict TCR. Each test area of the upper specimen was paired with each test area of the lower specimen, and the predicted data were employed to assess TCR at specific pressures. Consequently, the first group yielded 25 × 25 = 625 data sets per pressure, while the second group produced 3 × 3 = 9 data sets.
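The data-set counts can be reproduced with a short sketch, assuming that every measured local area on the upper specimen is combined with every measured local area on the lower specimen, which is what the counts 625 = 25 × 25 and 9 = 3 × 3 suggest. The helper name `surface_pairs` is hypothetical.

```python
from itertools import product

def surface_pairs(n_upper, n_lower):
    """All (upper, lower) index pairs of measured local areas (assumed pairing)."""
    return list(product(range(n_upper), range(n_lower)))

ground_pairs = surface_pairs(25, 25)  # 5 x 5 grid of local areas per ground specimen
turned_pairs = surface_pairs(3, 3)    # three local areas per turned specimen
```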

Fig. 2: Comparisons between prediction results, experimental data, and other prediction models.

a, b The predicted actual contact area for a ground and b turned surfaces. c, d The predicted thermal contact resistance (TCR) compared with measured values for c ground and d turned surfaces. e, f Predicted TCR and thermal contact conductance (TCC) compared with the Cooper-Mikic-Yovanovich (CMY) model, machine learning model, and fractal model for e ground and f turned surfaces. Specimens fabricated using grinding and turning are shown in (a) and (b), respectively.

The experimental data and prediction results are presented in Fig. 2a–d. The predicted actual contact area exhibited a nearly linear increase with pressure, rising from approximately 0.02% to 0.25% and 0.4% as pressure increased from 1 to 10 MPa. Although Ra was controlled to be similar, turned surfaces tended to have sharper peaks that deform differently under pressure compared to ground surfaces, which had more rounded asperities. At low pressures, both surfaces primarily underwent plastic deformation, resulting in similarly small contact areas proportional to material hardness. However, as pressure increased, elastic deformation became dominant, causing different responses. The taller asperities on the turned surface deformed elastically over a larger area, resulting in a greater actual contact area overall. This phenomenon is discussed in detail in Supplementary Discussion 3. Regarding TCR, model predictions were slightly lower for pressures below 1 MPa and slightly higher for pressures above 3 MPa. Nevertheless, all predictions remained within the acceptable error range. Supplementary Figs. 1–2 present specific surface prediction results. Pronounced differences in local morphological characteristics between surface pairs led to clearly divergent predictive outcomes.

Comparison with existing models

As shown in Fig. 2e, f, comparative analyses were performed between the predictions of this study and those from the corrected Cooper-Mikic-Yovanovich (CMY) model23, the fractal model, and the machine learning (ML) model24. For the first set of specimens, experimental measurements aligned more closely with the fractal model predictions within the 2–4.55 MPa range. The ML model exhibited substantial deviations across all pressure levels. Notably, the predictions from this study were consistently lower than those of the other models for this specimen group. For the second set of specimens, the predictions showed closer agreement with the fractal model in the 1–3 MPa range and better alignment with ML predictions at 2–5 MPa. The CMY model failed to predict accurately when the contact pressure exceeded 2 MPa. It is important to note that neither the CMY model nor the ML model utilizes complete surface information, resulting in identical predictions for both specimen groups. In contrast, the fractal model and our proposed model effectively incorporated comprehensive surface characteristics, thereby demonstrating prediction specificity for different surfaces. Detailed descriptions of the CMY and ML models are provided in Supplementary Method 4, along with the extraction methodology of surface characterization parameters for the fractal model in Supplementary Fig. 5 and Supplementary Methods 3.

Interpretability and visualization

To enhance the interpretability of the model, we employ guided backpropagation (GBP)25 and Class Activation Mapping (CAM)26 to visualize the surfaces influencing the final outcomes. Figure 3a, b, and Fig. 3g, h present GBP results for two sets of surfaces. For improved visualization clarity, the surfaces are manually rotated by 90°, 180°, and 270°, preserving the integrity of the input data. The gradients obtained from GBP are further processed to produce output gradients matching the dimensions of the input surfaces. These gradients are divided into grad1 and grad2 along the first dimension, corresponding to gradients of different surfaces. Positive and negative influences are defined as the absolute values of gradients greater than or less than zero, respectively. The subscripts 1 and 2 denote their respective gradients. For the first set of specimens, rotation substantially influences the actual contact area (Fig. 3c) and the corresponding TCR (Fig. 3d) at a contact pressure of 1 MPa. Specifically, an increase in the actual contact area results in a reduction in TCR. However, at contact pressures of 2 MPa or higher, the effect of rotation becomes negligible. For the second set of specimens, rotation has negligible effects on both the TCR (Fig. 3j) and the actual contact area (Fig. 3i), likely due to the manufacturing process. The rotational symmetry of the texture formed by turning along the axis rendered rotation inconsequential. To further analyze the gradients, we summed all their elements, revealing a positive correlation between the sum of grad elements and the actual contact area, as shown in Fig. 3e, and a negative correlation with the TCR in Fig. 3f. In contrast, the second set of specimens exhibited no discernible trend (Fig. 3k, l), potentially due to minimal variations in the actual contact area and TCR. 
Examination of the gradient activation maps revealed that they predominantly corresponded to regions of higher or lower surface contact areas (i.e., surf1 + surf2), consistent with TCR theory6. This consistency confirms that our model accurately captures the relevant features of the real surfaces.

Fig. 3: Guided Backpropagation activation maps of the tested surfaces.

a and g show the activation maps of the original ground and turned surfaces, respectively. b and h depict the activation maps of the ground and turned surfaces rotated by 90°. The predicted actual contact area varies with the rotation angle at a contact pressure (P) of 1 MPa for the ground (c) and turned (i) surfaces. Similarly, the predicted thermal contact resistance (TCR) changes with the rotation angle at 1 MPa for the ground (d) and turned (j) surfaces. The predicted actual contact area fluctuates with the sum of gradients for the ground (e) and turned (k) surfaces at 1 MPa. Finally, the predicted TCR varies with the sum of gradients for the ground (f) and turned (l) surfaces at 1 MPa. Here, Surf1 and Surf2 denote height profiles of the two contact surfaces, while Surf1 + Surf2 represents the direct summation of the height tensors of these surfaces. Similarly, Grad1 and Grad2, as well as Grad1 + Grad2, correspond to the dimensions of the Guided Backpropagation gradient matrices associated with the respective input surfaces.

The CAM method demonstrates distinct characteristics in Fig. 4. Most notably, the CAM visualization results emphasize the lower regions of the contact surfaces. Specifically, the activation maps corresponding to the lower regions of surf1 and surf2 exhibit higher values and exert a greater influence on the outcomes. These lower regions can be interpreted as areas where contact is less likely to occur. Moreover, non-contact areas clearly impact the actual contact area: the greater the extent of non-contact regions, the smaller the actual contact area, resulting in an increased TCR. For all CAM methods, activation maps of shallow layers are more interpretable to humans, whereas those from deeper layers highlight regions that are more difficult to interpret, often showing a stronger emphasis on certain edge regions. We propose two explanations for this phenomenon. First, features extracted by deeper neural networks are inherently more complex and less intuitive. Second, due to the CAM generation process, deeper layer features typically have smaller spatial dimensions but more channels, which can lead to information loss or the introduction of irrelevant information when these features are upsampled to match the input size. Consequently, interpreting activation maps from deeper neural network layers remains a challenging area warranting further investigation.

Fig. 4: The class activation maps (CAMs) for ground and turned surfaces.

a CAMs for ground surfaces. b CAMs for turned surfaces. Here, “surf1 + surf2” represents the direct summation of the height tensors of the two input surfaces. The names of specific layers in the deep learning model are listed at the top, while the CAM methods are listed on the left. Each block, corresponding to a specific layer and CAM method, represents the activation extracted from that layer using the respective method.

Discussion

In summary, this study effectively leverages the robust feature extraction capabilities of CNNs to develop a TCR prediction model based on complete surface morphology data. The model also provides a reference for the actual contact area between two surfaces. As shown in Supplementary Table 1, our model achieves a four-order-of-magnitude acceleration in prediction speed compared to conventional FEM approaches under similar computational configurations. This substantial performance improvement remains evident even when accounting for training overhead, highlighting a notable computational advantage. Moreover, the visualization techniques employed effectively illustrate the surface features influencing TCR.

This study has several important limitations. First, the fractal theory-generated surface patterns in the training dataset are overly homogeneous, which may cause the neural network to overfit these specific surfaces. This can lead to suboptimal feature extraction, with the model either overemphasizing some surface traits or neglecting others. Second, the feature set is limited to pressure parameters and surface topography metrics, restricting the model’s predictive ability to a single material type. Extending the framework to other materials or conditions would require new datasets and retraining. Additionally, the model processes pressure inputs and surface features separately, which may reduce its ability to capture interactions between these factors. Experimentally, validation was limited to TCR measurements, without direct verification of predicted contact areas. Thus, contact area predictions should be viewed as approximate rather than exact. Finally, the study does not explain the methodology behind the CAM visualizations, limiting insight into the model’s decision-making process.

Future research can address several key areas: dataset acquisition, feature selection, deep learning architecture development, and model visualization. For dataset acquisition, more diverse methods should be explored to enrich training data. Including gas conduction and radiation effects can improve data accuracy and comprehensiveness. Although resource-intensive, large-scale experimental measurements would provide the most robust datasets. Optimizing feature selection is also crucial for improving model versatility. Expanding features to cover additional material properties and environmental variables will help the model capture broader interactions and enhance generalizability. Advancing deep learning architectures beyond CNNs is another priority. Hybrid models incorporating Transformers, diffusion models, or even large language models for semantic feature encoding could be considered, though further theoretical support is needed. Additionally, feature fusion mechanisms should be redesigned to integrate material, environmental, and surface characteristics before feature extraction, better reflecting real-world conditions where external factors influence surface behavior. Finally, improving model interpretability and visualization is essential. Employing advanced visualization techniques combined with physical theories can clarify the model’s decision-making process and build trust in its reliability.

This study aims to offer an alternative approach to research on TCR and to assist researchers in understanding the impact of surface morphology characteristics from a novel perspective.

Methods

Fractal surface modeling

For a specified three-dimensional fractal dimension \({D}_{3}\) and fractal roughness G, a rough surface can be generated using the Weierstrass-Mandelbrot function27:

$$\begin{array}{c}z\left(x,y\right)=L{\left(\frac{G}{L}\right)}^{{D}_{3}-2}{\left(\frac{{{\mathrm{ln}}}\gamma }{M}\right)}^{\frac{1}{2}}{\sum }_{m=1}^{M}{\sum }_{n={n}_{\min }}^{{n}_{\max }}{\gamma }^{\left({D}_{3}-3\right)n}\\ \times \left\{\cos {\phi }_{m,n}-\cos \left[\frac{2{{{\rm{\pi }}}}{\gamma }^{n}{\left({x}^{2}+{y}^{2}\right)}^{1/2}}{L}\cos \left(\arctan \left(\frac{y}{x}\right)-\frac{{{{\rm{\pi }}}}m}{M}\right)+{\phi }_{m,n}\right]\right\},\end{array}$$
(1)

where \(z(x,y)\) represents the height of the rough surface, and x and y denote the lateral position coordinates. L is the measured length of the sample, and M indicates the number of superimposed ridges. The parameter γ denotes the frequency density of the surface height, commonly taken as 1.5. Additionally, n represents the asperity frequency index, which corresponds to the asperity size, and \({\phi }_{m,n}\) is a random phase ranging from 0 to \(2{{{\rm{\pi }}}}\). The spatial frequency index n has an upper limit defined by the truncation length \({L}_{{{{\rm{s}}}}}\), approximately equal to the interatomic bond length on the order of angstroms (Å).
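As an illustration, the Weierstrass-Mandelbrot surface can be sampled on a regular grid as in the sketch below. It is not the authors' generator: the random phase is placed inside the cosine argument, as in the standard form of the function, and several choices are assumptions made here, namely the lower index `n_min = 0`, the upper index `n_max = floor(ln(L / L_s) / ln γ)`, the grid over the first quadrant, and all default parameter values.

```python
import numpy as np

def wm_surface(D3, G, L, n_points=64, M=10, gamma=1.5, L_s=1e-9, seed=0):
    """Sample a Weierstrass-Mandelbrot surface on an n_points x n_points grid."""
    rng = np.random.default_rng(seed)
    n_min = 0                                                # assumption
    n_max = int(np.floor(np.log(L / L_s) / np.log(gamma)))   # assumption
    x = np.linspace(0.0, L, n_points, endpoint=False)
    X, Y = np.meshgrid(x, x, indexing="ij")
    r = np.hypot(X, Y)             # (x^2 + y^2)^(1/2)
    theta = np.arctan2(Y, X)       # equals arctan(y/x) for x, y >= 0
    # One random phase in [0, 2*pi) per (m, n) pair.
    phi = rng.uniform(0.0, 2.0 * np.pi, size=(M, n_max - n_min + 1))
    z = np.zeros_like(X)
    for m in range(1, M + 1):
        for j, n in enumerate(range(n_min, n_max + 1)):
            p = phi[m - 1, j]
            arg = (2.0 * np.pi * gamma**n * r / L) * np.cos(theta - np.pi * m / M) + p
            z += gamma ** ((D3 - 3.0) * n) * (np.cos(p) - np.cos(arg))
    prefactor = L * (G / L) ** (D3 - 2.0) * np.sqrt(np.log(gamma) / M)
    return prefactor * z
```

For example, `wm_surface(D3=2.5, G=1e-9, L=1e-3)` returns a 64 × 64 height array; the height at the origin is zero by construction because both cosine terms coincide there.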

Theoretical calculation

For a single contact spot, the tip is approximated as a sphere with an apparent contact area a (Fig. 5a). Upon deformation, the contact area changes, as shown in Fig. 5b, c. As the apparent contact area increases, plastic, elastoplastic, and elastic deformation occur sequentially at the contact interface. The critical areas governing these transitions are as follows28:

$$\left\{\begin{array}{c}{a}_{{{{\rm{c}}}}1}=\frac{{G}^{2}}{{\left(\kappa H/2{E}_{{{{\rm{s}}}}}\right)}^{2\left(D-1\right)}}\\ {a}_{{{{\rm{c}}}}2}=\frac{{G}^{2}}{{\left(3H/2{E}_{{{{\rm{s}}}}}\right)}^{2\left(D-1\right)}}\end{array}\right.,$$
(2)

where H denotes the microhardness of the softer of the two contacting materials, \({E}_{{{{\rm{s}}}}}\) represents the equivalent elastic modulus of the two materials, κ is the contact pressure coefficient29, and D is the two-dimensional fractal dimension30. These quantities can be calculated using the following formulas:

$$\frac{1}{{E}_{{{{\rm{s}}}}}}=\frac{1-{\nu }_{1}^{2}}{{E}_{1}}+\frac{1-{\nu }_{2}^{2}}{{E}_{2}},$$
(3)
$$\kappa =0.454+0.41\nu ,$$
(4)
$$D={D}_{3}-1,$$
(5)

where ν denotes Poisson’s ratio.
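The relations above can be transcribed directly into code. The sketch below follows Eqs. (2)–(5) as printed; the function names are hypothetical, and the numerical values in the usage lines are illustrative placeholders rather than the material data of this study.

```python
def equivalent_modulus(E1, nu1, E2, nu2):
    """Eq. (3): equivalent elastic modulus E_s of the contact pair."""
    return 1.0 / ((1.0 - nu1**2) / E1 + (1.0 - nu2**2) / E2)

def contact_pressure_coefficient(nu):
    """Eq. (4): contact pressure coefficient kappa."""
    return 0.454 + 0.41 * nu

def profile_dimension(D3):
    """Eq. (5): two-dimensional fractal dimension D from the surface value D3."""
    return D3 - 1.0

def critical_areas(G, D, H, E_s, kappa):
    """Eq. (2) as printed: critical areas (a_c1, a_c2)."""
    exponent = 2.0 * (D - 1.0)
    a_c1 = G**2 / (kappa * H / (2.0 * E_s)) ** exponent
    a_c2 = G**2 / (3.0 * H / (2.0 * E_s)) ** exponent
    return a_c1, a_c2

# Illustrative placeholder inputs (not taken from the paper).
E_s = equivalent_modulus(200e9, 0.3, 200e9, 0.3)
kappa = contact_pressure_coefficient(0.3)
a_c1, a_c2 = critical_areas(G=1e-9, D=profile_dimension(2.5), H=2e9, E_s=E_s, kappa=kappa)
```

Because κ < 3 and both ratios are below one for these inputs, the sketch gives `a_c1 > a_c2`, matching the ordering of the deformation regimes.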

Fig. 5: Schematic diagrams of the multi-spot model.

a A single spot. b Deformation of a single spot undergoing plastic deformation. c Deformation of a single spot undergoing elastic deformation. d Schematic illustration of deformation types of asperities at various positions within the multi-asperity model. Elastic deformation occurs in areas not circled in the figure.

When the apparent contact area a exceeds the critical value \({a}_{{{{\rm{c}}}}1}\), deformation is fully elastic. When a is less than the critical value \({a}_{{{{\rm{c}}}}2}\), deformation is fully plastic. For values of a between these two critical thresholds, deformation is elastoplastic. For a single asperity, fully plastic deformation results in a real contact area \({A}_{{{{\rm{p}}}}}=a\), whereas fully elastic deformation yields \({A}_{{{{\rm{e}}}}}=a/2\). In the elastoplastic regime, the real contact area is interpolated between these two limits as \({A}_{{{{\rm{ep}}}}}=a\left[1+f\left(a\right)\right]\), where the interpolation function is defined as follows:

$$f\left(a\right)=\frac{48}{441}{\left[\frac{{{{\rm{\pi }}}}}{2}{\left(\frac{a}{{a}_{{{{\rm{c}}}}1}}\right)}^{\left(2-D\right)/2}-1\right]}^{2}-\frac{32}{441}{\left[\frac{{{{\rm{\pi }}}}}{2}{\left(\frac{a}{{a}_{{{{\rm{c}}}}1}}\right)}^{\left(2-D\right)/2}-1\right]}^{3}.$$
(6)

Correspondingly, the forces (\(f\)) generated by a single contact spot under different contact states are31,32:

$${f}_{{{{\rm{p}}}}}=H{A}_{{{{\rm{p}}}}},$$
(7)
$${f}_{{{{\rm{ep}}}}}={A}_{{{{\rm{ep}}}}}\left\{\left[1-f\left(a\right)\right]\kappa H\frac{2}{{{{\rm{\pi }}}}}{\left(\frac{a}{{a}_{{{{\rm{c}}}}1}}\right)}^{\left(2-D\right)/4}+{Hf}\left(a\right)\right\},$$
(8)
$${f}_{{{{\rm{e}}}}}=\frac{4}{3}{{{\rm{\pi }}}}{E}_{{{{\rm{s}}}}}{G}^{2\left(D-1\right)}{a}^{\left(1-D\right)}{A}_{{{{\rm{e}}}}}.$$
(9)

Here p, ep, and e denote the fully plastic, elastoplastic, and fully elastic deformation stages, respectively. For a pair of contacting surfaces with numerous asperities, the number of asperities, N, having apparent areas between a and \(a+{{{\rm{d}}}}a\) is given by ref. 33:

$$N\left(a\right)=\frac{D}{2}\frac{{a}_{{{{\rm{L}}}}}^{D/2}}{{a}^{\left(D/2+1\right)}}{{{\rm{d}}}}a.$$
(10)
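The single-asperity relations of Eqs. (6)–(9) and the size distribution of Eq. (10) can be sketched as follows. The code transcribes the printed formulas; the function names are hypothetical, the regime boundaries follow the text (plastic below a_c2, elastic above a_c1), and Eq. (10) is treated as a density, i.e. without the trailing da.

```python
import math

def interpolation(a, a_c1, D):
    """Eq. (6): interpolation function f(a) for the elastoplastic regime."""
    t = (math.pi / 2.0) * (a / a_c1) ** ((2.0 - D) / 2.0) - 1.0
    return (48.0 / 441.0) * t**2 - (32.0 / 441.0) * t**3

def asperity_response(a, a_c1, a_c2, D, G, H, E_s, kappa):
    """Return (regime, real contact area, force) for one asperity of apparent area a."""
    if a < a_c2:                                   # fully plastic, Eq. (7)
        area = a
        return "p", area, H * area
    if a > a_c1:                                   # fully elastic, Eq. (9)
        area = a / 2.0
        force = (4.0 / 3.0) * math.pi * E_s * G ** (2.0 * (D - 1.0)) * a ** (1.0 - D) * area
        return "e", area, force
    f = interpolation(a, a_c1, D)                  # elastoplastic, Eq. (8)
    area = a * (1.0 + f)
    force = area * ((1.0 - f) * kappa * H * (2.0 / math.pi)
                    * (a / a_c1) ** ((2.0 - D) / 4.0) + H * f)
    return "ep", area, force

def size_density(a, a_L, D):
    """Eq. (10) as a density: asperities per unit apparent area at size a."""
    return 0.5 * D * a_L ** (D / 2.0) / a ** (D / 2.0 + 1.0)
```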

Among all asperities, the largest apparent area is denoted as \({a}_{{{{\rm{L}}}}}\). When \({a}_{{{{\rm{L}}}}} < {a}_{{{{\rm{c}}}}2}\), all asperities undergo fully plastic deformation. When \({a}_{{{{\rm{c}}}}2}\le {a}_{{{{\rm{L}}}}} < {a}_{{{{\rm{c}}}}1}\), both plastic and elastoplastic deformation occur. For \({a}_{{{{\rm{L}}}}}\ge {a}_{{{{\rm{c}}}}1}\), all three kinds of deformation are involved. These processes are illustrated in Fig. 5d. Based on the above descriptions, the real contact area \({A}_{{{{\rm{r}}}}}\) and the resultant force F at the contact interface under different conditions can be calculated as follows34:

$${A}_{{{{\rm{r}}}}}=\left\{\begin{array}{c}{\int }_{0}^{{a}_{{{{\rm{L}}}}}}{A}_{{{{\rm{p}}}}}N\left(a\right){{{\rm{d}}}}a,\quad {a}_{{{{\rm{L}}}}} < {a}_{{{{\rm{c}}}}2} \hfill \\ {\int }_{0}^{{a}_{{{{\rm{c}}}}2}}{A}_{{{{\rm{p}}}}}N\left(a\right){{{\rm{d}}}}a+{\int }_{{a}_{{{{\rm{c}}}}2}}^{{a}_{{{{\rm{L}}}}}}{A}_{{{{\rm{ep}}}}}N\left(a\right){{{\rm{d}}}}a,\quad {a}_{{{{\rm{c}}}}2}\le {a}_{{{{\rm{L}}}}} < {a}_{{{{\rm{c}}}}1} \hfill \\ {\int }_{0}^{{a}_{{{{\rm{c}}}}2}}{A}_{{{{\rm{p}}}}}N\left(a\right){{{\rm{d}}}}a+{\int }_{{a}_{{{{\rm{c}}}}2}}^{{a}_{{{{\rm{c}}}}1}}{A}_{{{{\rm{ep}}}}}N\left(a\right){{{\rm{d}}}}a+{\int }_{{a}_{{{{\rm{c}}}}1}}^{{a}_{{{{\rm{L}}}}}}{A}_{{{{\rm{e}}}}}N\left(a\right){{{\rm{d}}}}a,\quad {a}_{{{{\rm{L}}}}}\ge {a}_{{{{\rm{c}}}}1}\end{array}\right.,$$
(11)
$$F=\left\{\begin{array}{c}{\int }_{0}^{{a}_{{{{\rm{L}}}}}}{f}_{{{{\rm{p}}}}}N\left(a\right){{{\rm{d}}}}a,\quad {a}_{{{{\rm{L}}}}} < {a}_{{{{\rm{c}}}}2} \hfill \\ {\int }_{0}^{{a}_{{{{\rm{c}}}}2}}{f}_{{{{\rm{p}}}}}N\left(a\right){{{\rm{d}}}}a+{\int }_{{a}_{{{{\rm{c}}}}2}}^{{a}_{{{{\rm{L}}}}}}{f}_{{{{\rm{ep}}}}}N\left(a\right){{{\rm{d}}}}a,\quad {a}_{{{{\rm{c}}}}2}\le {a}_{{{{\rm{L}}}}} < {a}_{{{{\rm{c}}}}1} \hfill \\ {\int }_{0}^{{a}_{{{{\rm{c}}}}2}}{f}_{{{{\rm{p}}}}}N\left(a\right){{{\rm{d}}}}a+{\int }_{{a}_{{{{\rm{c}}}}2}}^{{a}_{{{{\rm{c}}}}1}}{f}_{{{{\rm{ep}}}}}N\left(a\right){{{\rm{d}}}}a+{\int }_{{a}_{{{{\rm{c}}}}1}}^{{a}_{{{{\rm{L}}}}}}{f}_{{{{\rm{e}}}}}N\left(a\right){{{\rm{d}}}}a,\quad {a}_{{{{\rm{L}}}}}\ge {a}_{{{{\rm{c}}}}1}\end{array}\right..$$
(12)
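As a worked check of the first branch, the fully plastic case \({a}_{{{{\rm{L}}}}} < {a}_{{{{\rm{c}}}}2}\) can be integrated in closed form. The lines below are a sketch that treats \(N(a)\) as the size-distribution density of Eq. (10), that is, without the trailing \({{{\rm{d}}}}a\), and assumes \(1 < D < 2\) so that the integral converges at the lower limit:

$${A}_{{{{\rm{r}}}}}={\int }_{0}^{{a}_{{{{\rm{L}}}}}}a\,\frac{D}{2}\frac{{a}_{{{{\rm{L}}}}}^{D/2}}{{a}^{\left(D/2+1\right)}}{{{\rm{d}}}}a=\frac{D}{2}{a}_{{{{\rm{L}}}}}^{D/2}{\int }_{0}^{{a}_{{{{\rm{L}}}}}}{a}^{-D/2}{{{\rm{d}}}}a=\frac{D}{2}{a}_{{{{\rm{L}}}}}^{D/2}\frac{{a}_{{{{\rm{L}}}}}^{1-D/2}}{1-D/2}=\frac{D}{2-D}{a}_{{{{\rm{L}}}}},$$

$$F={\int }_{0}^{{a}_{{{{\rm{L}}}}}}H{A}_{{{{\rm{p}}}}}N\left(a\right){{{\rm{d}}}}a=H{A}_{{{{\rm{r}}}}}=\frac{HD}{2-D}{a}_{{{{\rm{L}}}}}.$$

In this regime both the real contact area and the force scale linearly with \({a}_{{{{\rm{L}}}}}\), so the real contact area is proportional to the applied load.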

The thermal contact conductance at each contact spot can be derived from heat flow tube theory35:

$${h}_{{{{\rm{stage}}}}}=\frac{2{k}_{{{{\rm{s}}}}}{a}_{{{{\rm{stage}}}}}^{1/2}}{\sqrt{{{{\rm{\pi }}}}}{\left[1-{\left({A}_{{{{\rm{r}}}}}^{* }\right)}^{1/2}\right]}^{3/2}},{{{\rm{stage}}}}={{{\rm{p}}}},{{{\rm{ep}}}},{{{\rm{e}}}},$$
(13)

Here, ‘stage’ refers to the mechanical contact state of the asperity. \({A}_{{{{\rm{r}}}}}^{* }\) represents the ratio of the real contact area to the nominal contact area \({L}^{2}\), and \({k}_{{{{\rm{s}}}}}\) is the equivalent thermal conductivity. These quantities are calculated as follows:

$${A}_{{{{\rm{r}}}}}^{* }={A}_{{{{\rm{r}}}}}/{L}^{2},$$
(14)
$$\frac{2}{{k}_{{{{\rm{s}}}}}}=\frac{1}{{k}_{1}}+\frac{1}{{k}_{2}},$$
(15)

where \({k}_{1}\) and \({k}_{2}\) denote the thermal conductivities of the two materials in the contact pair. By again considering the range of \({a}_{{{{\rm{L}}}}}\), the thermal contact conductance for the three scenarios can be determined as follows34:

$${H}_{{{{\rm{c}}}}}=\left\{\begin{array}{c}{\int }_{0}^{{a}_{{{{\rm{L}}}}}}{h}_{{{{\rm{p}}}}}N\left(a\right){{{\rm{d}}}}a,\quad {a}_{{{{\rm{L}}}}} < {a}_{{{{\rm{c}}}}2} \hfill \\ {\int }_{0}^{{a}_{{{{\rm{c}}}}2}}{h}_{{{{\rm{p}}}}}N\left(a\right){{{\rm{d}}}}a+{\int }_{{a}_{{{{\rm{c}}}}2}}^{{a}_{{{{\rm{L}}}}}}{h}_{{{{\rm{ep}}}}}N\left(a\right){{{\rm{d}}}}a,\quad {a}_{{{{\rm{c}}}}2}\le {a}_{{{{\rm{L}}}}} < {a}_{{{{\rm{c}}}}1} \hfill \\ {\int }_{0}^{{a}_{{{{\rm{c}}}}2}}{h}_{{{{\rm{p}}}}}N\left(a\right){{{\rm{d}}}}a+{\int }_{{a}_{{{{\rm{c}}}}2}}^{{a}_{{{{\rm{c}}}}1}}{h}_{{{{\rm{ep}}}}}N\left(a\right){{{\rm{d}}}}a+{\int }_{{a}_{{{{\rm{c}}}}1}}^{{a}_{{{{\rm{L}}}}}}{h}_{{{{\rm{e}}}}}N\left(a\right){{{\rm{d}}}}a,\quad {a}_{{{{\rm{L}}}}}\ge {a}_{{{{\rm{c}}}}1}\end{array}\right..$$
(16)

The measurement is assumed to be conducted under vacuum conditions, thereby eliminating the need to consider heat conduction through air gaps. Furthermore, since the temperature range is near room temperature, radiation effects are negligible. Consequently, thermal conduction constitutes the sole source of TCR in this context. Thus, the TCR is:

$${R}_{{{{\rm{c}}}}}=1/{H}_{{{{\rm{c}}}}}.$$
(17)

Based on the aforementioned theory, given model inputs such as material properties, surface fractal parameters, and contact pressure, the following procedure is performed. First, a maximum apparent area \({a}_{{{{\rm{L}}}}}\) is initialized. Using this \({a}_{{{{\rm{L}}}}}\), the force F generated by the contact between the two surfaces is calculated. This calculated force is then compared with the externally applied force \({F}_{{{{\rm{ext}}}}}=P{L}^{2}\). If the calculated force exceeds the external force, \({a}_{{{{\rm{L}}}}}\) is decreased; otherwise, \({a}_{{{{\rm{L}}}}}\) is increased, and F is recalculated. The adjustment of \({a}_{{{{\rm{L}}}}}\) is achieved by multiplying it by a coefficient λ to accelerate convergence. When the relative difference between these two forces falls within a specified tolerance \(e=1\times {10}^{-4}\), the current maximum apparent area is considered the desired value. Subsequently, using this \({a}_{{{{\rm{L}}}}}\), the real contact area and TCR can be calculated. The specific calculation workflow is illustrated in Fig. 6a.
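The force-matching iteration described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the rule that shrinks λ toward 1 whenever the iterate overshoots is an assumption added here so that a multiplicative update converges, and the usage lines employ the fully plastic closed form F = H D a_L / (2 − D) (obtained by integrating f_p = H a against the size distribution of Eq. (10)) with placeholder values as a stand-in for the full force model.

```python
def solve_max_area(force_fn, F_ext, a_init, lam=2.0, tol=1e-4, max_iter=200):
    """Adjust a_L multiplicatively until force_fn(a_L) matches F_ext within tol."""
    a_L = a_init
    prev_sign = 0
    for _ in range(max_iter):
        rel = (force_fn(a_L) - F_ext) / F_ext   # relative force mismatch
        if abs(rel) <= tol:
            return a_L
        sign = 1 if rel > 0 else -1
        if prev_sign and sign != prev_sign:
            lam = lam ** 0.5                    # overshoot: shrink the step (assumption)
        # Too much force -> decrease a_L; too little -> increase a_L.
        a_L = a_L / lam if sign > 0 else a_L * lam
        prev_sign = sign
    raise RuntimeError("force balance did not converge")

# Placeholder inputs: hardness, profile fractal dimension, pressure, sample length.
H, D, P, L = 2e9, 1.5, 1e6, 1e-3
plastic_force = lambda a_L: H * D / (2.0 - D) * a_L   # stand-in force model
a_L = solve_max_area(plastic_force, F_ext=P * L**2, a_init=1e-12)
```

Halving the step in log space at every sign change makes the scheme behave like a bisection on ln a_L, so it needs only that the force grows monotonically with a_L.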

Fig. 6: Schematics of workflows in this paper.

a Flowchart illustrating the theoretical model calculation and dataset generation. b Flowchart depicting the processes of model training, testing, and interpretation.

Datasets

The data presented in this paper are derived from the theoretical model described above. Initially, fractal parameters D and G were determined, and two random surfaces characterized by \({D}_{1},{G}_{1},{D}_{2},{G}_{2}\) (calculated from D and G) with dimensions of 1024 × 1024 were generated from the fractal surface model. For theoretical models based on fractal theory, both in this study and in prior research, a common practice is to approximate two rough surfaces as one equivalent rough surface and one perfectly smooth surface. The generation method for the equivalent surface is detailed in Supplementary Method 3. These fractal parameters serve as inputs to the theoretical model. Using material properties including elastic modulus, Poisson’s ratio, micro-hardness, thermal conductivity, and the contact pressure between the two surfaces, the actual contact area \(({A}_{{{{\rm{r}}}}}^{* })\) and TCR (\({R}_{{{{\rm{c}}}}}\)) were calculated. The generated random surfaces, along with the input and output parameters, constitute the dataset for this study. The training set consists of 10,000 cases, while the testing set includes 2000 cases. To validate the model, experimental tests were conducted on two sets of 316 stainless steel specimens with an Ra of approximately 0.8 μm. The material properties of 316 stainless steel are summarized in Table 1. TCR measurements were performed using the steady-state method36, as detailed in Supplementary Method 1 and Supplementary Fig. 4. Specific test principles and uncertainty analyses are provided in Supplementary Method 2. The surface topography was analyzed using a surface profilometer, as described in Supplementary Fig. 3. These experimental data were also utilized for model validation.

Table 1 Material properties of 316 stainless steel at room temperature used in this work

Selection of descriptor set

Numerous factors influence TCR; however, not all are addressed in this study. This research primarily focuses on surface topography, rendering the topographic data of the two surfaces the most critical descriptors. Since CNNs require input data normalization, Z-score normalization is applied to the surface height data, transforming it to have a mean of 0 and a variance of 1 as follows:

$${Z}_{i}=\frac{{z}_{i}-\mu }{\sigma },\quad \mu =\frac{1}{n}{\sum }_{i=1}^{n}{z}_{i},\quad \sigma =\sqrt{\frac{1}{n}{\sum }_{i=1}^{n}{\left({z}_{i}-\mu \right)}^{2}},$$
(18)

where n represents the total number of data points in the surface height profile and \({z}_{i}\) denotes the height at each point. Thus, μ and σ correspond to the mean and standard deviation of the height, respectively. Notably, σ equals the root mean square roughness of the surface. Because normalization removes the height scale, the standard deviations of the two surfaces are included as input parameters to preserve the integrity of the surface topography information. Consequently, a comprehensive set of descriptors for surface information includes Surf1, Surf2, Std1, and Std2. To simplify the model, all surfaces are assumed to be composed of the same material and tested under identical environmental conditions. Therefore, material properties, ambient temperature, and ambient pressure are excluded, and contact pressure is the only additional parameter considered to affect TCR. According to theory, the actual contact area between two surfaces ultimately governs the TCR6. Furthermore, the actual contact area plays a critical role in various fields, including friction, lubrication37, and coating38, which makes it a more fundamental physical quantity than TCR. The actual contact area should therefore be regarded as an additional variable of interest. However, because the actual contact area is difficult to measure experimentally, using it as the sole target variable would complicate model validation. Both the actual contact area and TCR were therefore designated as target variables. Accurate predictions of TCR are expected to yield correspondingly reliable estimates of the actual contact area, providing additional validation. In summary, the complete set of descriptors in this study includes Surf1, Surf2, Std1, Std2, P, Area, and Rc, as shown in Fig. 7a, with their specific definitions detailed in Table 2.
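As an illustration of Eq. (18), the following minimal sketch normalizes one height map and keeps its standard deviation as a separate descriptor. The function name `zscore_surface` and the random stand-in surface are assumptions made for this example; they are not taken from the authors' code, and the standard deviation is computed as the usual root-mean-square deviation from the mean.

```python
import numpy as np

def zscore_surface(z):
    """Return the Z-score-normalized height map plus the mean and std used."""
    z = np.asarray(z, dtype=float)
    mu = z.mean()      # mean height
    sigma = z.std()    # population standard deviation (divides by n)
    if sigma == 0:
        raise ValueError("flat surface: standard deviation is zero")
    return (z - mu) / sigma, mu, sigma

# Hypothetical stand-in for a measured height map (heights in micrometres).
rng = np.random.default_rng(0)
surf1 = rng.normal(loc=0.2, scale=0.8, size=(64, 64))

surf1_norm, mu1, std1 = zscore_surface(surf1)
# surf1_norm now has zero mean and unit variance; std1 carries the height
# scale that normalization removed, so it is passed on as its own descriptor.
```

Applying the same function to the second surface would give Surf2 and Std2 in the descriptor set.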

Fig. 7: Details of the descriptor set.

a The physical significance of each descriptor. b The relationships among different features. The diagonal elements display the distribution of each feature, the upper triangular section presents the correlation coefficients along with a heatmap, and the lower triangular section depicts the joint distributions of paired features. The units of the parameters are as listed in Table 2.

Table 2 The description set of thermal contact resistance prediction in our deep learning model

Figure 7b shows the data distribution and correlation coefficients for each descriptor in the dataset. The standard deviation of surface height ranges from 0 to 3 μm, indicating that the surface roughness values are of comparable magnitude. Pressure values span from 1 to 10 MPa, the actual contact area varies between 0 and 0.8%, and the TCR remains below 2000 mm² K W⁻¹. The actual contact area shows strong positive correlation coefficients with Std1, Std2, and P. In contrast, the TCR exhibits a stronger positive correlation with Std1 and Std2 but no discernible correlation with pressure, which appears to contradict theoretical predictions. Analysis of the distribution of P relative to Std1 and Std2, as well as the distribution of P alone, reveals that P is uniformly distributed across its range. This uniformity leads to a similarly uniform distribution of TCR with respect to P, thereby obscuring any observable correlation.

Deep learning algorithm

Our neural network model consists of three components: an adapter, a pre-trained CNN, and a regression network. The input surface is processed sequentially by the adapter and the pre-trained CNN. The regression network then receives the output of the pre-trained CNN along with the additional parameters and regresses the target parameters, as depicted in Fig. 8a. The pre-trained CNN can be any of several architectures, such as ResNet39, DenseNet40, and VGG41, which have demonstrated strong performance in image processing tasks. In this study, we used the pre-trained DenseNet12140 architecture, selected without any special consideration, to accelerate model training. Because standard image inputs have three channels whereas only two contact surfaces are available, the number of input channels of the first convolutional layer is reduced from three to two. Furthermore, these CNN architectures are used exclusively for surface feature extraction. Since they typically include a classification module, we replace that module with an identity layer, which outputs its input unchanged, ensuring that only the extracted features are forwarded to subsequent modules. The structure of DenseNet121 is shown in Fig. 8c–f.

Fig. 8: Schematics of the architecture of our surface regression deep learning prediction model.

a Overall architecture of the model. b Basic architecture of the Adapter Net. c Basic architecture of the modified DenseNet121. d Structure of the Dense Block. e Structure of the Dense Layer. f Structure of the Transition block. g Schematic illustration of data flow in the neural network. h The principle of guided back-propagation. i The principle of Class Activation Mapping.

For image input, the standard required size is 224 × 224; however, the measured surface data size is 1024 × 1024. To address this discrepancy, we introduce an adapter network preceding the pre-trained network, as illustrated in Fig. 8b. The adapter uses simple convolutional operations to resize the input from 1024 × 1024 to 256 × 256. We avoid resizing to 224 × 224 because doing so would require excessively large convolutional kernels, which could cause substantial loss of surface information. For the pre-trained network, it is sufficient that the input size be a multiple of 32, which allows as much input information as possible to be preserved. Although directly using 1024 × 1024 surfaces as input is feasible, it would substantially increase the training time. Consequently, we adopted the current architecture.
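The size arithmetic behind this choice can be checked with the standard convolution output-size formula. The sketch below is only an illustration: the two stride-2, 3 × 3 convolutions are a hypothetical adapter configuration (the actual layers are those of Fig. 8b), and the single stride-4 convolution is a hypothetical way of reaching 224 directly.

```python
def conv_out_size(n, kernel, stride, padding):
    """Output length of a convolution along one axis (dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1

# Hypothetical adapter: two 3x3 convolutions with stride 2 and padding 1.
size = 1024
for _ in range(2):
    size = conv_out_size(size, kernel=3, stride=2, padding=1)
adapter_out = size  # 1024 -> 512 -> 256

# Reaching 224 directly with one unpadded stride-4 convolution would need
# kernel = 1024 - 4 * (224 - 1), i.e. a kernel more than 100 points wide.
direct_kernel = 1024 - 4 * (224 - 1)

# DenseNet121 downsamples by a total factor of 32, so the adapter output
# should be divisible by 32 to avoid discarding border rows and columns.
final_feature_map = adapter_out // 32
```

With these assumed layers the adapter output is 256 × 256, which DenseNet121 reduces to an 8 × 8 feature map before global pooling.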

The regression network processes the extracted surface feature tensor and the input parameters. The surface feature tensor typically contains over a thousand elements, whereas the input parameters consist of only three elements. Directly concatenating these tensors would reduce the relative influence of the input parameters on the outputs. To address this, we first apply one linear layer to halve the length of the surface feature tensor and another linear layer to expand the input parameters to a length comparable to that of the reduced feature tensor, so that the two inputs have consistent dimensions. The outputs of these linear layers are then concatenated and passed through two additional linear layers to regress the final target parameters: the actual contact area and TCR.
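A minimal NumPy sketch of this regression head is given below, with untrained random weights, to show the tensor shapes involved. The feature length of 1024 matches the DenseNet121 feature vector; the hidden width of 256, the ReLU activations, and the helper names `linear` and `apply` are assumptions for illustration and are not specified in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Randomly initialized (weights, bias) of an untrained linear layer."""
    return rng.normal(0.0, 0.01, (in_dim, out_dim)), np.zeros(out_dim)

def apply(layer, x):
    w, b = layer
    return x @ w + b

FEAT, PARAMS, HIDDEN = 1024, 3, 256    # HIDDEN is an assumed size
feat_fc = linear(FEAT, FEAT // 2)      # halve the surface feature vector
param_fc = linear(PARAMS, FEAT // 2)   # expand (Std1, Std2, P) to match
hidden_fc = linear(FEAT, HIDDEN)       # first layer after concatenation
out_fc = linear(HIDDEN, 2)             # two targets: (Area, Rc)

def regression_head(features, params):
    f = np.maximum(apply(feat_fc, features), 0.0)  # ReLU is an assumption
    p = np.maximum(apply(param_fc, params), 0.0)
    joint = np.concatenate([f, p], axis=-1)        # both parts have equal length
    h = np.maximum(apply(hidden_fc, joint), 0.0)
    return apply(out_fc, h)

batch = 4
y = regression_head(rng.normal(size=(batch, FEAT)),
                    rng.uniform(size=(batch, PARAMS)))
# y has one row per sample and two columns (Area, Rc).
```

Projecting both inputs to the same length before concatenation means the three scalar parameters occupy half of the joint vector instead of 3 out of roughly 1000 entries.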

The model was trained on the training set, and its performance was evaluated on a separate test set. During training, the Adam optimizer was employed with a learning rate of 0.001, and the mean squared error loss was utilized. A 5-fold cross-validation strategy was implemented: the training set was divided into five equal subsets, with four subsets used for training and one remaining subset for validation in each iteration. This process was repeated five times, ensuring that each subset served as the validation set once. The average validation loss across all five iterations was used to assess model performance. The overall process is shown in Fig. 6b.
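The 5-fold splitting described above can be sketched as follows. The function `train_and_validate` is a hypothetical placeholder standing in for training with Adam and evaluating the mean squared error on the held-out fold; only the index bookkeeping is meant to be illustrative.

```python
import numpy as np

def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each fold is the validation set once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

def train_and_validate(train_idx, val_idx):
    """Hypothetical placeholder: train on train_idx, return validation MSE."""
    return 0.0

splits = list(kfold_indices(10_000, k=5))   # 10,000 training cases
val_losses = [train_and_validate(tr, va) for tr, va in splits]
mean_val_loss = float(np.mean(val_losses))  # average over the five folds
```

With 10,000 training cases, each iteration trains on 8000 cases and validates on the remaining 2000.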

Guided back-propagation

Figure 8g schematically illustrates the data flow during neural network training. Backpropagation, the standard algorithm for training neural networks, computes the gradient of the loss function with respect to the network’s weights by propagating errors backward from the output layer to the input layer. Guided back-propagation (GBP), an extension of standard backpropagation, was introduced to address its limitations in generating interpretable visualizations. Its core principle is to modify the backward pass so that only positive gradients are retained, thereby emphasizing the features most relevant to the activation of a specific neuron or layer. During the forward pass, the input data are processed through the network, and the activations of each layer are computed; only positive activations pass through ReLU layers (or similar non-linearities). During the backward pass, GBP additionally permits only positive gradients to propagate through these layers. Negative gradients are set to zero, as shown in Fig. 8h:

$${{{\bf{G}}}}={{{\rm{ReLU}}}}\left({{\bf{G}}}\right).$$
(19)

Input regions contributing greatly to the result are highlighted by the computed gradient.
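The difference between the standard and guided backward rules at a single ReLU layer can be sketched in a few lines. The function names and the example values are chosen for illustration only.

```python
import numpy as np

def relu_forward(x):
    return np.maximum(x, 0.0)

def relu_backward_standard(x, grad_out):
    # Standard rule: pass the gradient wherever the forward input was positive.
    return grad_out * (x > 0)

def relu_backward_guided(x, grad_out):
    # Guided rule: additionally zero out negative incoming gradients,
    # i.e. apply ReLU to the gradient as in Eq. (19).
    return grad_out * (x > 0) * (grad_out > 0)

x = np.array([-1.0, 2.0, 3.0, -4.0])          # inputs to the ReLU layer
grad_out = np.array([0.5, -0.7, 0.9, -0.2])   # gradient arriving from above

standard = relu_backward_standard(x, grad_out)
guided = relu_backward_guided(x, grad_out)
# Standard backpropagation keeps the entries at positions 1 and 2;
# guided back-propagation keeps only position 2, where both the forward
# input and the incoming gradient are positive.
```

Applying the guided rule at every ReLU layer down to the input yields the gradient map used for visualization.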

Class activation mapping

Class activation mapping (CAM) is a widely used technique in deep learning that highlights the regions of an input image that are most influential in a CNN’s classification decision. By leveraging the weights of the fully connected layer immediately preceding the output, CAM generates a class-specific activation map for classification tasks. In this work, we extend CAM’s applicability to regression problems. The activation map is mathematically defined as the weighted sum of the feature maps from the last convolutional layer, where the weights correspond to the output-specific weights from the final layer:

$${{{\bf{A}}}}={{{{\bf{W}}}}}^{T}\cdot {{{\bf{F}}}}.$$
(20)

Here, W denotes the weight vector for the specific result, and F represents the feature maps. In this context, F is typically a function of the feature tensor during forward propagation, whereas W is generally a function of the gradient tensor during backward propagation. The specific functional form is determined by the particular CAM algorithm employed, as shown in Fig. 8i. Details of the different CAMs are provided in Supplementary Method 5. By visually highlighting the surface regions that contribute most substantially to the predictions, this approach elucidates the model’s decision-making process, thereby enhancing CNN interpretability in real-world applications.
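The weighted sum in Eq. (20) can be sketched as below. The weights here follow the Grad-CAM convention (the spatial mean of the gradient of the output with respect to each feature map), which is one common choice and not necessarily the variant used in this work; the array sizes and random stand-in tensors are hypothetical.

```python
import numpy as np

def activation_map(weights, feature_maps):
    """Eq. (20): weighted sum over K feature maps, each of shape (H, W)."""
    return np.tensordot(weights, feature_maps, axes=([0], [0]))

def gradcam_weights(gradients):
    """Grad-CAM-style weights: spatial mean of d(output)/d(feature map)."""
    return gradients.mean(axis=(1, 2))

rng = np.random.default_rng(0)
n_maps, height, width = 8, 8, 8                 # hypothetical sizes
feats = rng.random((n_maps, height, width))     # last-layer feature maps
grads = rng.normal(size=(n_maps, height, width))  # stand-in for gradients

weights = gradcam_weights(grads)
cam = activation_map(weights, feats)
# cam has shape (height, width); it is usually upsampled to the input
# resolution before being overlaid on the surface.
```

For a regression model, the gradients would be taken with respect to the chosen scalar output (the actual contact area or TCR) instead of a class score.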