Introduction

Recently, with the advancement of nanotechnology and its applications, understanding nanomaterial dynamics has become critical. The structuring of nanomaterials to control thermal transport in solid materials has garnered significant interest. Efforts to optimize thermoelectric energy utilization have led to research on manipulating thermal conductivity via grain boundaries1,2, defects3,4, impurities5, and nanoparticles6. As 2D nanomaterials, van der Waals (vdW) layered materials typically exhibit high anisotropy, sparking interest in their electrical7,8,9, mechanical10, thermal11,12,13, and tribological properties14,15. The layer rotation angle in vdW materials significantly affects thermal transport and the stability of 2D transistors16,17, providing a versatile approach to modifying their characteristics. This twist angle can also lead to moiré patterns, giving rise to phenomena such as superconductivity in bilayer graphene8,9.

Research has highlighted the impact of twist angles on thermal transport in 2D stacked materials. Ouyang et al.18 explored phonon-phonon coupling effects in twisted graphene and hexagonal boron nitride (h-BN) heterojunctions, with similar findings at graphite/h-BN interfaces19. The inherent anisotropy of 2D materials resulted in high in-plane thermal conductivity with low cross-plane conductivity20,21,22. Recently, Kim et al.23 achieved record thermal conductivity anisotropy in irregularly twisted MoS2, with significant reductions in cross-plane conductivity while preserving in-plane conductivity24. A key feature of 2D stacked materials is the atomically smooth layer interfaces, which suppress diffusive phonon scattering and maintain phonon coherence through specular scattering. This enables control over thermal transport by modulating phonon wave properties. In studies of superlattices and graphene/MoS2 heterostructures, conductance of coherent phonons has been shown to depend on the aperiodic distribution of layer thickness25, due to phonon localization.

It is expected that the stacking order of twist angles should significantly influence coherent phonon transport. However, finding the minimal thermal conductivity among myriad disordered stacked structures is impractical. Recently, machine learning (ML) techniques have emerged as powerful tools for screening extensive design spaces, predicting new material properties26,27,28,29,30,31, and accelerating optimization. For instance, ML has been used to predict the thermal conductivity of nanomaterials32,33 and design specific composite structures34. ML techniques are also applied in optimizing LED designs, lithium battery electrolytes, and composite materials.

In this work, we used non-equilibrium molecular dynamics (NEMD)35 to simulate the thermal transport properties of two-angle disorderly twisted multilayer graphene at room temperature. Leveraging convolutional neural networks and Bayesian optimization, we effectively determined the optimal stacking sequence from tens of thousands of configurations. Our findings revealed that the optimized structures greatly reduced thermal conductivity compared to pristine structures. To understand the underlying mechanism, we examined the phonon transport spectra and phonon histograms for these structures, confirming phonon localization and providing insights into modifying thermal transport via twist angle engineering in 2D materials.

Results

Model construction

In our simulations, each graphene layer was rotated by one of two discrete angles, 0° or 2.54°, and the layers were stacked in random order. The twist angle was chosen based on Ref. 36 (Fig. 4), which shows that the thermal conductivity of ordered twisted graphene systems exhibits a pronounced local minimum around 2.5°, making this angle particularly representative for studying coherent phonon scattering effects. A twist angle of 2.54° was ultimately chosen because it induces only negligible lattice strain (~1%) in the stacked structure. Our computational models employed periodic boundary conditions with in-plane dimensions of 49.2 × 49.2 Å and fixed boundary conditions along the out-of-plane direction. The models were built using the Materials Studio (MS) software. Different stacking angles produce moiré lattices, in which variations in the local stacking create diverse atomic vibrational modes and stress distributions; these act as phonon scattering sites and reduce the thermal conductivity (TC). Several distinct stacking patterns emerge, as illustrated in Fig. 1a, where AA and AB represent the two fundamental stacking configurations.

Fig. 1: A twist angle graphene model.

a Top: the moiré lattice formed in twisted graphene. Bottom: the relative positions AA and AB denote the two stacking modes; red and green atoms represent the upper and lower layers, respectively. b Setup for calculating the thermal conductivity of disorderly twisted multilayer graphene. The two blue boxes are fixed during the simulation, and Langevin thermostats are applied to the layers marked by the red (heat source) and green (heat sink) rectangles to create a thermal bias. The arrow indicates the direction of the vertical heat flux, while periodic boundary conditions are applied in the in-plane directions.

To calculate the heat transport properties of disorderly twisted multilayer graphene, we employed non-equilibrium molecular dynamics (NEMD)37 simulations using the Graphics Processing Unit Molecular Dynamics (GPUMD) software38. For this twisted (interlayer-rotated) structural model, the neuroevolution potential (NEP) trained by Fan et al.36 can be used to describe both the weak interlayer vdW forces and the intralayer C-C interactions. The MD simulation setup is shown in Fig. 1b. The blue rectangular boxes at both ends were fixed during the simulation, and Langevin thermostats were applied to the regions marked by the red (heat source, Th) and green (heat sink, Tl) rectangular boxes, generating a non-equilibrium steady-state heat flow through thermal excitation at both ends of the system.

In the present simulations, we stacked 14 graphene layers, with the angle of each layer randomly set to 0° or 2.54°, giving a total of \({2}^{14}=16384\) possible structures. With the in-plane graphene size fixed, we focused on the relationship between the distribution of twist angles across the layers and heat transport. Identifying the structure with the lowest TC among so many configurations is a significant challenge. Because an exhaustive search of this large solution space is infeasible, machine learning (ML) is well suited to the problem: it is a promising tool for screening large design spaces and has primarily been used to predict the properties of new materials. In this study, we combined two algorithms, Bayesian optimization as implemented in the open-source materials informatics library COMBO39 and convolutional neural networks (CNNs), with NEMD simulations to calculate and predict the TCs of different structures. The technical details of these algorithms and of the NEMD simulations are presented in the Methods section.
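The size of this design space can be illustrated with a minimal sketch that enumerates every binary stacking sequence. The names `all_sequences`, `to_angles` and `ANGLES` are hypothetical helpers introduced only for illustration; they are not part of GPUMD, COMBO or the authors' workflow.

```python
from itertools import product

N_LAYERS = 14
# Hypothetical mapping from the binary label to the layer angle in degrees.
ANGLES = {0: 0.0, 1: 2.54}

def all_sequences(n_layers=N_LAYERS):
    """Return every possible 0/1 stacking sequence for n_layers layers."""
    return list(product((0, 1), repeat=n_layers))

def to_angles(seq):
    """Translate a binary sequence into the per-layer twist angles."""
    return [ANGLES[bit] for bit in seq]

sequences = all_sequences()
print(len(sequences))  # 2**14 candidate structures
```

Each tuple in `sequences` is one candidate structure, and a tuple such as `(1, 1, 0, ...)` can be passed to `to_angles` to recover the layer angles.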

Machine learning-guided thermal conductivity minimization

For convenience of description and calculation, binary symbols are used in both the CNN and Bayesian calculations: 0 for graphene at 0° and 1 for graphene at 2.54°. We first discuss the results of the Bayesian optimization algorithm. Figure 2a shows how the TC evolved over the optimization rounds. Green rectangles represent the TC values of individual structures, while blue triangles indicate the minimum TC achieved after each round. The TC values of 10 structures were calculated in each round. The minimum TC of the disorderly twisted multilayer graphene was found by the 7th round. The optimized structure, with the angle sequence 1 1 0 1 1 0 1 0 1 1 0 1 1 1, exhibited a very small TC of \(0.095\,{\rm{W}}\,{{\rm{m}}}^{-1}\,{{\rm{K}}}^{-1}\), much lower than the TC of pristine well-aligned graphene (\(0.512\,{\rm{W}}\,{{\rm{m}}}^{-1}\,{{\rm{K}}}^{-1}\)) and that of the periodic superlattice structure (PS) with the two angles stacked in order (\(0.230\,{\rm{W}}\,{{\rm{m}}}^{-1}\,{{\rm{K}}}^{-1}\)) (Fig. 2b). The dramatic reduction of about 80% in the TC of the optimized structure demonstrates that controlling the twist angle is indeed an effective way to modulate thermal transport across multiply stacked 2D layered structures.
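As a quick check of the quoted reduction, the relative decrease follows directly from the TC values reported above; the same arithmetic applied to the periodic structure is included for comparison:

$$1-\frac{0.095}{0.512}\approx 0.81,\qquad 1-\frac{0.095}{0.230}\approx 0.59$$

That is, the optimized structure lowers the TC by roughly 80% relative to pristine graphene and by roughly 60% relative to the periodic stack.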

Fig. 2: ML optimization results based on Bayesian optimization.

a The green rectangles represent the TC of the different structures, and the blue triangles represent the minimum TC after each round of optimization. b Comparison of thermal conductivity between the pristine 0° stacked graphene, the periodic 0°/2.54° graphene stack and the optimum structure, all with the same number of layers.

The low TC arises because (1) the interfaces of the stacked 2D layered structures are atomically smooth, which allows coherent phonon transport (non-dephasing transmission or reflection) at the interfaces; and (2) the disordered multiple-interface structure causes random and complex multiple scattering of the phonon waves, and the resulting interference localizes many coherent phonon modes. While the complete suppression of wave conduction, known as Anderson localization, has been demonstrated for light40, ultrasound41 and ultra-cold atoms42, the localization effect observed here results from strong partial localization of phonon modes, since heat conduction is a broadband phonon contribution and complete localization of all phonon modes is hard to achieve. In the literature, semiconductor superlattices (SLs) have been adopted as a good platform for studying phonon localization43. Phonon localization, especially Anderson-type localization, has been found to occur typically in SLs with relatively long period thickness, which indicates that Anderson localization can potentially be maintained for part of the coherent phonons in SLs.

To further reveal the physical mechanism underlying the reduced TC, we calculated the spectral phonon transmission of pristine graphene, the periodically stacked 0°/2.54° graphene, and the optimized structure37,44. The solid curve in Fig. 3a shows that thermal transport in well-aligned pristine graphene is mainly contributed by phonons in the range of 0–15 THz. As shown by the red dashed line in Fig. 3a, the phonon transmission of the optimal multilayer twisted graphene is mainly distributed in the 0–5 THz range and, compared with pristine graphene, is significantly suppressed over the entire frequency range, which explains the dramatic decrease of TC in the optimized structure. To test whether the observed suppression of phonon transport results from partial mode localization, we analyzed the phonon transmission data \(({\rm{\xi }})\) of the 200 structures and calculated the transmission histograms at four frequencies (Fig. 3b). Previous studies have shown that the shape of the phonon transmission histogram indicates whether phonon transport is localized or diffusive45. For diffusive transport, the histogram of the transmission p(ξ) is well described by a Gaussian distribution. For localized transport, the histogram of the raw transmission p(ξ) becomes asymmetric, whereas the histogram of the logarithm of the transmission p(ln (ξ)) follows a symmetric Gaussian (normal) distribution. This analysis method has also been widely used for photon and electron transport. We therefore took four typical phonon frequencies and systematically compared p(ξ) and p(ln (ξ)) at each frequency. Our results in Fig. 3b demonstrate that the histogram of the logarithmic transmission p(ln (ξ)) is fitted significantly better by a Gaussian distribution than the histogram of the raw transmission p(ξ), which provides further evidence for the existence of partially localized phonon modes.
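The logic of this histogram test can be sketched on synthetic data. The example below is an illustration only: the transmissions are drawn from a log-normal distribution by construction (they are not simulation results), and sample skewness is used as a simple proxy for how symmetric, and hence how Gaussian-like, a histogram is. It is not the fitting procedure used for Fig. 3b.

```python
import math
import random
import statistics

def skewness(values):
    """Sample skewness (third standardized moment); near 0 for symmetric data."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)

rng = random.Random(0)
# Synthetic transmissions for 200 hypothetical structures (assumed log-normal).
xi = [rng.lognormvariate(-3.0, 1.0) for _ in range(200)]
ln_xi = [math.log(v) for v in xi]

skew_raw = skewness(xi)     # strongly positive: long tail toward high transmission
skew_log = skewness(ln_xi)  # close to zero: ln(xi) is nearly symmetric
print(f"skew p(xi) = {skew_raw:.2f}, skew p(ln xi) = {skew_log:.2f}")
```

For transmissions in a localized regime, the raw values are expected to be strongly skewed while their logarithms are nearly symmetric, which is the qualitative signature described above.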

Fig. 3: Phonon transmission, phonon histograms analysis.

a Phonon transmission spectra of three stacking structures. b Histograms of logarithmic phonon transmission for several frequencies. The solid curves are the fitting results using normal Gaussian distribution functions.

The optimal multilayer twisted graphene structure described above was found by the Bayesian optimization algorithm. To verify the accuracy of the algorithm and compare its efficiency with other methods, we also optimized the multilayer two-angle twisted graphene structure using a one-dimensional convolutional neural network, built from 2 convolutional layers, 1 pooling layer, and 1 fully connected layer containing 100 neural units. During model tuning, the learning rate was set to 0.0002 and the number of training epochs was set to 3000–8000.

We used the feature-extraction ability of the CNN to identify spatial features that help reduce heat transport. Although most of the training set consists of structures with medium TC, the features that lead to reduced heat transport are gradually recognized as training proceeds. The merit of the CNN is that it can capture characteristic information from the structures, map the complex relation between the input structures (two-angle randomly twisted graphene) and the output (TC in our case), and rapidly predict the TC of any new input structure. The accuracy and speed of CNNs have been demonstrated in previous work46,47. First, we randomly selected 200 structures with their corresponding thermal conductivities as the training data set. The trained model was then used to predict the TC of all structures in the sample space, and the 50 structures with the lowest predicted TC were selected. These steps were repeated for 5 rounds; the average TC of the 50 selected structures in each iteration is shown in Fig. 4a. The search was stopped at the fifth iteration because the average TC was essentially unchanged from the previous iteration and, more importantly, the lowest TC obtained in the last two iterations corresponded to the same structure. To evaluate the performance of the CNN model, we compared the real and predicted TC values after training the CNN on 400 data points and monitored the convergence of the loss function, as shown in Fig. 4b. The mean absolute percentage error (MAPE) of the model on the test set is 4.0%. In addition, we used the coefficient of determination (formula (1)) to measure the quality of the regression model, where \({\hat{y}}_{p}\) is the average of the training set, and yp and yt are the real TC (calculated by MD) and the predicted TC, respectively. Formula (1) yields R2 = 0.98. Finally, the loss function approaches zero after 2000 epochs, which indicates that the CNN model can capture the spatial characteristics of low-TC structures and further supports the accuracy of our model.

$${R}^{2}=1-\frac{\mathop{\sum }\limits_{i=1}^{{n}_{t}}{\left({y}_{p}^{i}-{y}_{t}^{i}\right)}^{2}}{\mathop{\sum }\limits_{i=1}^{{n}_{t}}{\left({y}_{t}^{i}-{\hat{y}}_{p}^{i}\right)}^{2}}$$
(1)
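The two accuracy metrics quoted above can be computed as in the following minimal sketch. It uses the textbook form of the coefficient of determination, with the mean taken over the reference (MD) values; this is an assumption and may differ in detail from the averaging used in formula (1). The function names and the sample numbers are hypothetical.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    errors = [abs((p - t) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(errors) / len(errors)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical TC values (W m^-1 K^-1), not taken from the simulations.
tc_md = [0.10, 0.20, 0.30]
tc_cnn = [0.11, 0.19, 0.30]
print(mape(tc_md, tc_cnn), r_squared(tc_md, tc_cnn))
```

A MAPE near zero together with an R2 near one indicates that the predictions track the MD values closely across the whole test set, not only on average.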
Fig. 4: Thermal conductivity search results based on CNN machine learning.

a The black data represents the average TC of the 50 structures with the lowest values in each iteration of the CNN model, while the red one indicates the lowest TC in each iteration. b Left: The comparison between the CNN-model-predicted TC and the MD ground truth values, Right: The convergence behavior of the loss function.

According to our CNN model, the optimal twisted graphene structure is 1 1 0 1 0 0 1 1 0 1 1 1 1 1, with a TC of 0.0943 \({\rm{W}}\,{{\rm{m}}}^{-1}\,{{\rm{K}}}^{-1}\), almost the same value as that found by Bayesian optimization but with a different structure. This result indicates that the two machine learning algorithms identified two distinct local optima. Both algorithms can swiftly and accurately identify structures with lower TC, overcoming the limitations of manual random sampling. Notably, the CNN model excels when properly tuned, as it can rapidly compute the TCs of all structures, facilitating the selection of the desired ones. Compared to the Bayesian optimization model, the CNN is better suited to the optimization requirements of our project.

Phonon transport suppression in optimized structures

Here we define a block as a run of consecutive graphene layers stacked at the same angle. In studying the effect of twist-angle-induced randomness on phonon transport, we found that the number of blocks plays an important role. Given that the total length of the system is fixed, if the number of blocks is relatively large, i.e., if the average block thickness is very small, the multilayer random twisted structure is very close to the superlattice structure, with only a few individual blocks differing slightly in thickness. In the limiting case where the average block thickness is reduced to that of a single graphene layer, the multilayer structure becomes exactly the superlattice structure. Therefore, as the average block thickness decreases, random stacking becomes progressively less effective at localizing phonons, because not enough randomness can be introduced when the average block thickness is small. On the other hand, when the average block thickness increases (the number of blocks decreases), there is again not enough randomness to generate localization. Since both increasing and decreasing the average block thickness gradually destroy phonon localization, there should be an intermediate number of blocks that maximizes the randomness and thereby enhances phonon localization. This trend can be seen when the simulated structures with different block numbers are plotted in the same graph (Fig. 5a): structures with a moderate number of blocks, 7 to 9, tend to possess the lowest TCs. This optimal performance arises from the coexistence of alternating thin and thick blocks, which creates favorable conditions for coherent phonon localization.
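The block definition can be made concrete with a short sketch that counts the blocks in a binary stacking sequence. The function name `count_blocks` is a hypothetical helper; the two sequences are the optimized structures reported above for the Bayesian and CNN searches.

```python
from itertools import groupby

def count_blocks(seq):
    """Count runs of consecutive identical angle labels in a stacking sequence."""
    return sum(1 for _ in groupby(seq))

def block_thicknesses(seq):
    """Return the number of layers in each block, in stacking order."""
    return [len(list(run)) for _, run in groupby(seq)]

bayes_opt = [1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1]
cnn_opt = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1]

print(count_blocks(bayes_opt), block_thicknesses(bayes_opt))
print(count_blocks(cnn_opt), block_thicknesses(cnn_opt))
```

Both optimized sequences fall in the 7 to 9 block range, and both mix single-layer blocks with thicker ones, in line with the trend described for Fig. 5a.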

Fig. 5: Thermal Conductivity and Phonon participation ratio (PPR) analysis.

a Thermal conductivity distribution of stacked structures with random twist angles with different average block numbers. b Phonon participation ratio across different system sizes.

For disordered twist structures with identical numbers of constituent blocks, the layer thickness and spatial distribution of these blocks may still vary. Since the interface densities in these disordered twists remain the same, the observed variations in thermal conductivity for specific average block sizes can be attributed to differences in the degree of localization of coherent phonon modes within the disordered structures. Our findings on the influence of block distribution on thermal conductivity align with the results of Roy Chowdhury et al.48 regarding the role of randomization.

Discussions

Herein, we evaluated the degree of phonon localization in molecular dynamics simulations by calculating the phonon participation ratio (PPR) for the relevant phonon modes. To systematically investigate the size dependence of localization, we probed disordered twist-angle structures with different thicknesses. Based on the U-shaped relationship between thermal conductivity and the number of blocks (Fig. 5a), i.e., the smallest average TC appears when the number of blocks is about half the number of layers, we can roughly determine the appropriate number of blocks for each thickness. Thus, we constructed randomly disordered structures of five thicknesses (7, 14, 21, 28 and 35 layers) and computed their thermal transport properties. For each system size, we generated multiple disordered structures, calculated their respective PPR, and then averaged the results. As shown in Fig. 5b, a comprehensive comparison of the averaged PPR across different system sizes reveals that the participation ratio converges to a stable value as the system size increases, consistent with localization. As discussed in Ref. 49, when the participation ratio remains invariant under system size expansion, the corresponding phonon states can be classified as localized.
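The text does not spell out the PPR formula. A commonly used definition, assumed here for illustration, is \(P={\left({\sum }_{i}{|{e}_{i}|}^{2}\right)}^{2}/\left(N{\sum }_{i}{|{e}_{i}|}^{4}\right)\), where \({e}_{i}\) is the eigenvector component on atom i and N is the number of atoms. The sketch below evaluates it for two idealized modes; the function name and the toy eigenvectors are hypothetical.

```python
def participation_ratio(eigvec):
    """Participation ratio of one mode.

    eigvec: list of per-atom displacement vectors, e.g. [(ex, ey, ez), ...].
    Returns about 1 for a fully extended mode and about 1/N for a mode
    confined to a single atom.
    """
    amp2 = [sum(c * c for c in atom) for atom in eigvec]
    n_atoms = len(amp2)
    return sum(amp2) ** 2 / (n_atoms * sum(a * a for a in amp2))

# Toy modes on 8 atoms (illustrative, not from a lattice-dynamics calculation).
extended = [(1.0, 0.0, 0.0)] * 8
localized = [(1.0, 0.0, 0.0)] + [(0.0, 0.0, 0.0)] * 7

print(participation_ratio(extended), participation_ratio(localized))
```

Under this definition a low participation ratio means that only a small fraction of the atoms takes part in the vibration.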

Moreover, we calculated the harmonic dynamical structure factors (DSF)50,51 for two 14-layer systems: pristine graphene and an algorithm-generated disordered twist-stacked structure. The DSF results are presented in Fig. 6. In pristine graphene (Fig. 6a), distinct dispersion branches (e.g., linear or parabolic curves) are observed, where frequency evolves continuously with momentum, indicative of transport dominated by propagating phonons. In contrast, the DSF of disordered twisted graphene (Fig. 6b) exhibits broadened peaks and degraded dispersion branches. The disordered stacking enhances phonon scattering and reduces phonon lifetimes, leading to peak broadening in \({\rm{S}}\left(\left|{\rm{q}}\right|,{\rm{\omega }}\right)\), which is a signature of phonon localization. In the "blurred" dispersion region, a given vibration frequency ω corresponds to a wide range of q, i.e., the uncertainty in momentum ∆q is large, which implies a small spatial uncertainty ∆r for the vibration and hence localization of the phonon modes. In other words, these localized modes, characterized by vibrational energy concentrated at specific frequencies independent of the momentum distribution, signify spatial confinement of vibrational states within disordered regions. This phenomenon directly correlates with the observed drastic reduction in thermal conductivity, confirming the critical role of phonon localization in suppressing heat transport.

Fig. 6: Dynamic structure factor (DSF) and thermal conductivity analysis.

(a, b) DSF for original and disordered stacking structures. (c, d) Thermal conductivity for different system sizes.

To systematically investigate the size effect, we compared the thermal conductivities of the constructed systems with different numbers of layers. As shown in Fig. 6c, for structures with several tens of layers, the thermal conductivity of graphene layers without layer rotation increases almost linearly with increasing thickness, manifesting the well-known ballistic size effect. In contrast, the TC of the disordered systems increases only slightly with thickness, a behavior distinct from the linear size effect of the ballistic regime. In addition, the difference in thermal conductivity between the disordered and pristine systems gradually increases with thickness, as shown in Fig. 6d by the decreasing ratio TCdisorder/TCpristine. This observation indicates that the random rotations can indeed cause phonon localization by providing random scattering interfaces for the coherent phonon waves. As the thickness increases, more and more phonon modes sense and are affected by the disordered stacking, and thus change, to a certain extent, from ballistic delocalized modes to localized ones. If the phonon wavelets still propagated ballistically from end to end in the rotationally disordered system, the thermal conductivity would show a strong size dependence, as in the pristine case. The much weaker size dependence therefore indicates that frequent scattering actually takes place in the disordered system. The randomly scattered phonon wavelets can interfere with each other, and an increasing fraction of them becomes localized as the thickness increases, provided that the randomness is large enough. To further validate the enhanced phonon localization in extended systems, we scaled the 14-layer structure to 28 and 42 layers by repeating the optimized 14-layer sequence twice and three times. Notably, these repeated algorithm-optimized structures exhibit further reduced thermal conductivity and weaker thickness dependence than their manually constructed counterparts of identical thickness. These results further indicate the superior phonon localization in algorithm-optimized structures.

In summary, the spatial distribution of twist angles plays a very important role in thermal transport in multilayer twisted graphene and similar 2D materials. By optimizing the stacking sequence of two-angle twisted multilayer graphene, the TC can be reduced by about 80% with respect to the well-aligned structure. Using two machine learning algorithms, Bayesian optimization and convolutional neural networks (CNN), combined with non-equilibrium molecular dynamics (NEMD), we successfully identified the multilayer random twisted graphene structures with the lowest thermal conductivity. These algorithms are highly efficient, requiring only 200–400 structure evaluations within a sample space of tens of thousands to find the optimal configuration. Through combined analysis of log-normal phonon transmission distributions, system-size-dependent participation ratios, and dynamic structure factor visualization, we established a direct correlation between effective phonon localization and suppressed thermal conductivity. The optimal block number achieves maximum randomness, enhancing the phonon localization. Ultimately, our work reinforces that machine learning can uncover physically accurate structures without initial physical insights, thus providing a novel direction for the research and applications of stacked 2D materials.

Methods

Computing thermal conductivity via molecular dynamics

The thermal transport characteristics of disordered twisted multilayer graphene were computed using non-equilibrium molecular dynamics simulations, implemented in the GPUMD38 (Graphics Processing Unit Molecular Dynamics) package. The time step for propagating the equations of motion was set to 1.0 fs. We used the Berendsen method in the isothermal-isobaric (NPT) ensemble to thermalize the entire system at an ambient temperature of 300 K for 300 ps to reach thermal equilibrium. After that, we established a temperature gradient by setting a heat source and a heat sink at the two ends of the structure. The system was then run for 6 ns under the microcanonical (NVE) ensemble to reach a steady state. We set Th = 325 K and Tl = 275 K to calculate the TC at 300 K. The TC calculation is based directly on the temperature gradient and heat flow of the system. When the temperature gradient has been maintained for a sufficient time and the system has reached a steady state, an effective thermal conductivity can be calculated by assuming that Fourier's law remains valid for the whole system. If the temperature difference between the heat baths is ∆T and the heat flux is Q/S, then the effective thermal conductivity is \(\frac{Q/S}{\varDelta T/L}\), where L is the distance between the heat source and sink. Notably, the temperature gradient should be measured as ΔT/L, rather than from the linear portion of the temperature profile away from the local thermostats, as emphasized in Ref. 37. We analyzed the temperature distribution and the energies accumulated in the thermostats for both pristine graphene and the optimized structure (see Fig. S1 in Supporting Information).
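The effective thermal conductivity defined above reduces to a one-line calculation once the steady-state heat flux is known. The sketch below is a minimal illustration in SI units; the function name is hypothetical, and the heat flux and source-sink distance are placeholder numbers rather than values from the simulations. Only the thermostat temperatures (325 K and 275 K) come from the text.

```python
def effective_tc(heat_flux, t_hot, t_cold, length):
    """Effective thermal conductivity from Fourier's law.

    heat_flux: steady-state heat flux Q/S in W m^-2
    t_hot, t_cold: thermostat temperatures in K
    length: distance L between heat source and sink in m
    Uses the full gradient (t_hot - t_cold) / length, not a fit to the
    linear part of the temperature profile.
    """
    gradient = (t_hot - t_cold) / length
    return heat_flux / gradient

# Placeholder inputs: 1.0e9 W m^-2 flux across a 5 nm source-sink distance.
kappa = effective_tc(heat_flux=1.0e9, t_hot=325.0, t_cold=275.0, length=5.0e-9)
print(f"{kappa:.3f} W m^-1 K^-1")
```

In practice the heat flux would be obtained from the rate of energy exchanged with the thermostats divided by the cross-sectional area.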

Machine learning algorithm 1: Bayesian optimization

COMBO, an efficient open-source Bayesian optimization library, can identify the optimal structure by evaluating only a small percentage of all candidates. Detailed information on Bayesian optimization methods can be found in the literature52,53,54. The flowchart of the optimization process is shown in Fig. 7. Binary notation was employed for computational efficiency, with 0 and 1 corresponding to graphene orientations of 0° and 2.54°, respectively. This scheme was used in both the CNN and Bayesian frameworks. In the Bayesian optimization study, 10 structures were randomly selected from the complete sample space, and their binary representations and TCs formed the initial training dataset. These results were fed into the Bayesian algorithm to propose 10 new candidate structures, whose TCs were calculated via MD simulation to assess their suitability. If an ideal structure was discovered, the information was retained and the calculation stopped. If not, the proposed structures and their TCs were incorporated into the training set, and the operations were repeated until the optimal (minimum-TC) structure was identified. We considered a structure to be optimal when the lowest TC remained unchanged over multiple iterations.

Fig. 7

Molecular Dynamics (MD) and machine learning (ML) calculation diagram based on Bayesian optimization.

Machine learning algorithm 2: CNN

Besides Bayesian optimization algorithms, CNNs have recently been employed to predict material properties, including TC, from structural information. CNNs excel at feature extraction and offer translational invariance. We applied a CNN to predict the TC of the disorderly twisted multilayer graphene by detecting and recognizing pertinent spatial features in the input twisted structures. Once trained, the model can output the TC of each twisted graphene structure very quickly, significantly reducing computation time and enabling the enumeration of TCs across the sample space.

Our CNN architecture combined with MD is shown in Fig. 8. In the CNN model, the convolutional layers were composed of 45–50 filters, with a filter size of 5 × 1, a convolution stride of 1, and no padding. Each convolutional layer was followed by a rectified linear unit (ReLU) activation, and a max-pooling layer of size 2 × 1 with a stride of 2 was added after the last convolutional layer. The final part of the model was a fully connected layer (FCL). Because this is a regression problem, no ReLU function is applied in the last layer. In total, the CNN model comprised 2 convolutional layers, 1 pooling layer, and a fully connected layer with 100 neural units. Training used the Adam (Adaptive Moment Estimation) optimizer to minimize the mean square error (MSE) between prediction and target.
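The layer sizes implied by this architecture can be traced with the standard output-length formula for unpadded convolution and pooling. The sketch below is a shape calculation only, not the network itself; it assumes a 14-element binary input, that both convolutional layers use the 5 × 1 filter, and 50 filters in the last convolutional layer (the upper end of the stated 45–50 range). The helper names are hypothetical.

```python
def conv1d_out(length, kernel, stride=1, padding=0):
    """Output length of a 1D convolution."""
    return (length + 2 * padding - kernel) // stride + 1

def pool1d_out(length, size, stride):
    """Output length of a 1D max-pooling layer without padding."""
    return (length - size) // stride + 1

def feature_lengths(n_layers=14):
    """Sequence length after conv 1, conv 2 and the max-pooling layer."""
    after_conv1 = conv1d_out(n_layers, kernel=5)
    after_conv2 = conv1d_out(after_conv1, kernel=5)
    after_pool = pool1d_out(after_conv2, size=2, stride=2)
    return after_conv1, after_conv2, after_pool

lengths = feature_lengths(14)
n_filters = 50  # assumed value within the stated 45-50 range
flattened = lengths[-1] * n_filters  # input width of the fully connected layer
print(lengths, flattened)
```

Under these assumptions the 14-layer input shrinks to a length of 3 per filter before the fully connected layer, so the receptive field of the last convolutional features already spans most of the stack.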

$${MSE}=\mathop{\sum }\limits_{i=1}^{{n}_{t}}\frac{{({y}_{p}^{i}-{y}_{t}^{i})}^{2}}{{n}_{t}}$$
(2)

In Formula 2, nt represents the number of training data samples, yp and yt are respectively the real TC calculated by MD and the prediction TC simulated by the CNN model.

Fig. 8

Schematic of the search algorithm based on the CNN model.

In the initial iteration, we randomly selected 200 structures from the entire sample space and calculated their TCs using NEMD. The initial model, trained on these 200 structures, then predicted the TC of the remaining structures in the sample space. Of these predictions, the 50 structures with the lowest predicted TC were examined by NEMD simulation to assess the model accuracy. If the condition for the optimal structure was not met, these predicted structures, with their TCs and structural descriptors, were added to the training set, and the CNN model was updated and refitted. This procedure was repeated until the optimal structure was found. It should be noted that the initial training data set is relatively small and lacks representative features, so early CNN predictions are potentially inaccurate. To tackle this challenge, we leverage the capability of CNNs to identify spatial features that can effectively reduce thermal transport. As the iterations continue, the training data set becomes more comprehensive, the prediction accuracy gradually improves, and lower-TC structures contribute increasingly to the training data.
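The iterative loop described above (train, predict, verify the lowest predictions, enlarge the training set, repeat) can be sketched in a few lines. This is an illustration under strong assumptions: `toy_tc` is a placeholder objective standing in for an NEMD run, `predict` is a nearest-neighbour stand-in rather than a CNN, and the small default sizes are chosen so the sketch runs in well under a second. The paper's settings are 14 layers, 200 initial structures and 50 verified structures per round.

```python
import random
from itertools import groupby, product

def toy_tc(seq):
    """Placeholder objective (NOT an NEMD result): lowest near n/2 blocks."""
    blocks = sum(1 for _ in groupby(seq))
    return 0.1 + 0.05 * abs(blocks - len(seq) / 2) + 0.001 * sum(seq)

def predict(seq, train):
    """Stand-in surrogate (NOT a CNN): TC of the nearest labelled sequence."""
    nearest = min(train, key=lambda s: sum(a != b for a, b in zip(s, seq)))
    return train[nearest]

def iterative_search(n_layers=8, n_init=20, batch=5, max_rounds=10, seed=0):
    """Return (best sequence, its TC, number of structures evaluated)."""
    rng = random.Random(seed)
    space = list(product((0, 1), repeat=n_layers))
    # Initial training set: random structures with "computed" TCs.
    train = {s: toy_tc(s) for s in rng.sample(space, n_init)}
    best = min(train.values())
    for _ in range(max_rounds):
        unlabelled = [s for s in space if s not in train]
        # Rank unseen structures by predicted TC and verify the lowest ones.
        ranked = sorted(unlabelled, key=lambda s: predict(s, train))
        for s in ranked[:batch]:
            train[s] = toy_tc(s)
        new_best = min(train.values())
        if new_best == best:  # lowest TC unchanged between rounds: stop
            break
        best = new_best
    best_seq = min(train, key=train.get)
    return best_seq, train[best_seq], len(train)

best_seq, best_tc, n_evaluated = iterative_search()
print(best_seq, best_tc, n_evaluated)
```

The stopping rule mirrors the criterion in the text, where the search ends once the lowest TC no longer changes between consecutive rounds; the number of evaluated structures stays a small fraction of the full space.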