Introduction

The tristructural isotropic (TRISO) particle fuel is a core component in high-temperature gas-cooled reactors (HTGRs) due to its excellent high-temperature stability and fission product retention capability1,2. A typical TRISO fuel particle consists of a uranium-containing kernel and four successive coating layers, namely a buffer pyrocarbon layer, an inner dense pyrocarbon layer, a silicon carbide (SiC) layer, and an outer dense pyrocarbon layer3. The SiC layer serves as the primary barrier to fission product release, so the diffusion behavior of fission products within it is directly related to the safety and service life of the fuel. Among the numerous fission products, cesium (Cs) isotopes have been a research focus because of their high yield and the long half-life of Cs-137. Therefore, an in-depth exploration of the diffusion mechanism of Cs in SiC is essential for evaluating fuel performance and optimizing fuel design.

Several studies have explored the diffusion behavior of Cs in SiC and found that Cs atoms mainly diffuse through bulk and grain boundaries (GBs)4. Bulk diffusion requires relatively high energy barriers, becoming significant only at very high temperatures5. In contrast, GBs contain a large number of defects, which provide low-energy diffusion paths for Cs, leading to much faster diffusion and making them the primary transport channels6. There are various types of GBs in SiC, which can significantly influence Cs diffusion behaviors7,8. Identifying which type of GBs plays the most critical role in Cs transport is essential to fully understand its diffusion mechanism in SiC.

Previous researchers have constructed different types of SiC GB models and calculated the Cs diffusivities in them5,9,10. These studies demonstrate that high-energy grain boundaries (HEGBs) with highly disordered structures11,12 provide more diffusion channels for Cs atoms, leading to significantly higher diffusivities than other types of GBs. More importantly, the calculated diffusivities of Cs in HEGBs are closer to experimental measurements than those in crystalline SiC or other GBs5,10, which indicates that HEGBs may dominate Cs transport in the SiC layer. However, the atomic-scale diffusion mechanism of Cs in HEGBs remains unclear. Because Cs has a very low melting temperature and high chemical reactivity, experimental research is difficult to conduct. Under such constraints, simulation methods have become the key approach to unravel this mechanism.

Currently, simulation methods for studying diffusion behavior mainly include density functional theory (DFT) based ab initio calculations and molecular dynamics (MD) simulations13,14,15,16. Ab initio calculations can accurately describe interatomic interactions but are computationally expensive, making them impractical for modeling complex grain boundary structures. Although classical MD simulation is computationally efficient, its accuracy depends strongly on the interatomic potential. To the authors’ knowledge, only one empirical interatomic potential (EIP) specifically developed for the SiC-Cs system has been reported5. However, this EIP does not account for Cs-Cs interactions, which may influence diffusion behavior during actual migration processes. Moreover, EIPs often struggle to accurately represent the highly disordered atomic environments in HEGBs17. Therefore, the lack of interatomic potentials that are both accurate and computationally efficient enough for long time-scale diffusion simulations has become a major obstacle to the further application of MD simulations in Cs diffusion research. Machine learning interatomic potentials (MLIPs) have become a promising solution to this issue. MLIPs do not rely on pre-constructed EIPs; instead, they can be constructed for a given system in an on-the-fly manner using active learning18. This enables MD simulations on the time scales needed to study diffusion without an EIP, while still offering accuracy comparable to DFT at low computational cost. In addition, universal or broadly transferable MLIPs trained on large and diverse materials datasets have demonstrated excellent accuracy in simulating a wide range of atomistic processes19,20. However, despite their strong generality, their computational cost remains significantly higher than that of system-specific MLIPs, making them impractical for the long time-scale simulations needed to study diffusion.

MLIPs have proven effective for studying complex amorphous materials systems and investigating diffusion behavior21,22,23. Among various MLIPs, the moment tensor potential (MTP) performs well in both accuracy and efficiency24. Therefore, we used MTP to study the diffusion behavior of Cs in HEGBs. The theoretical framework of MTP construction is summarized in Supplementary Information (SI) Note 1, while a detailed explanation can be found in the original publication25.

Amorphous SiC (aSiC) structures have been used to represent the disordered features of HEGBs in diffusion simulations10,12,14. Accordingly, we constructed several aSiC structures to study Cs diffusion behaviors. Furthermore, we constructed models of polycrystalline SiC with HEGBs to better reflect the actual microstructure of the SiC layer in TRISO particles. However, applying MLIPs to such complex GB models remains challenging, because constructing small periodic configurations suitable for ab initio calculations is difficult for complex GB structures.

To address this challenge, various strategies have been proposed. One strategy involves extracting non-periodic spherical clusters from large structures and adding thick vacuum layers to form periodic configurations26,27. However, this method introduces artificial free surfaces that change the atomic environments and introduce errors28,29. Other vacuum-free periodic modeling strategies have been developed30,31,32, yet they struggle to handle complex structures such as HEGBs. The Periodic Configuration Construction via Grand Canonical Monte Carlo simulations (PCC-GCMC) method proposed by Shuang et al.29 performs well in modeling complex structures, but it relies on EIPs to perform the GCMC simulations. Since there are no suitable EIPs for the SiC-Cs system in public databases, this method is hard to apply. A more straightforward solution is to cut a rectangular region containing the cluster with the unknown atomic environment from the large model and apply periodic boundary conditions for DFT calculations. This approach has been implemented in several studies28,33,34,35, with the key difference lying in the treatment of atoms outside the cluster. For instance, Lysogorskiy et al.28 directly relaxed the outside atoms during DFT calculations, while Erhard et al.34 relaxed the outside atoms via melting and annealing.

In this study, we propose a workflow for developing MTP to resolve these challenges. We first adopt the configuration-based active learning (AL) strategy (based on the D-optimality criterion18) to develop an accurate MTP that is applicable to the aSiC-Cs system. This establishes a solid foundation for capturing the atomic interactions within the SiC-Cs system. To extend the MLIP’s applicability to complex HEGB structures, we integrate the neighborhood-based AL strategy into the overall workflow. In this strategy, we identify the unknown local environments in the large HEGB structures, and then directly cut cubes containing these environments to construct periodic configurations, thereby preserving the original atomic environments. Through this integrated workflow, we ultimately constructed a set of MTPs that meet the precision requirements for the aSiC-Cs system and are also reliably applicable to the polycrystalline SiC-Cs system. These MLIPs enable us to reveal the atomic-scale diffusion mechanism of Cs in aSiC and HEGBs, including diffusion barriers, diffusion paths, and interactions between Cs and SiC. Consequently, this work is expected to fill the gap in the atomic-scale understanding of the diffusion mechanism of Cs in SiC, and also to provide a workflow that can be used for other local MLIPs.

Results

Workflow for developing MTP

The four-stage progressive workflow to construct MTP for the SiC-Cs system is illustrated in Fig. 1. It consists of (i) initialization of MTP by passive learning, (ii) active training of MTP on amorphous configurations, (iii) extension of the training set with additional crystalline and GB configurations, and (iv) refinement of MTP by the neighborhood-based AL strategy. Through this workflow, we gradually expanded the scope of applicability of the MTP within the SiC-Cs system, and eventually achieved accurate descriptions of both aSiC and polycrystalline SiC with HEGBs.

Fig. 1: Workflow to develop MTP for studying Cs diffusion in amorphous SiC (aSiC) and polycrystalline SiC.

a Passive learning stage. b Apply active learning (AL) based on configuration strategy, and obtain MTP-a specialized for the amorphous SiC-Cs system. c Add additional structures in the training set obtained after the configuration-based AL stage. d Perform AL based on neighborhood strategy to get MTP-GB for studying Cs diffusion in polycrystalline SiC.

Considering that HEGBs are disordered structures, the MTP development started with passive learning on structures of Cs in aSiC to initialize MTP, as shown in Fig. 1. We conducted AIMD simulations on structures of Cs in aSiC with different densities. Subsequently, we selected representative structures based on the MaxVol algorithm18 to construct the initial training set, which was then used to train and obtain the initial MTP. In the second stage, we applied the AL based on the configuration strategy to the aSiC-Cs system, which effectively sampled new atomic environments that were not covered in the initial training set. Eventually, a robust MTP that supports stable long-term MD simulations was developed. It should be noted that both passive learning and the AL based on configuration strategy have been validated for effectiveness in several studies36. After these two stages, we obtained an MTP specialized in accurately describing the aSiC-Cs system, which is denoted as MTP-a.
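The selection criterion used in the passive and active learning stages can be illustrated with a small sketch. It assumes the form of the D-optimality criterion commonly used with MTPs, in which the extrapolation grade of a candidate feature vector g with respect to a square active-set matrix A is the largest absolute coefficient of c, where c A = g. The function names `solve_linear` and `extrapolation_grade` and the toy matrices are hypothetical and are not taken from this work.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("active-set matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def extrapolation_grade(active_set, g):
    """Return max_j |c_j| with c A = g (assumed D-optimality grade).

    active_set: square matrix A whose rows are the selected feature vectors.
    g: feature vector of the candidate configuration or neighborhood.
    """
    n = len(active_set)
    # c A = g is equivalent to A^T c = g.
    a_t = [[active_set[j][i] for j in range(n)] for i in range(n)]
    c = solve_linear(a_t, g)
    return max(abs(cj) for cj in c)
```

Under this assumed definition, a row already in the active set has grade 1, and a grade above 1 indicates that the candidate would enlarge the determinant of A if swapped in, which is the idea behind MaxVol-type selection.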

In addition, we applied the above two-stage strategy to the Cs and crystalline SiC system and developed a corresponding MTP. We calculated the structural parameters and mechanical constants (Table S1) of crystalline 3C-SiC and compared them with literature data. We also investigated the diffusion behavior of Cs in perfect 3C-SiC (Fig. S2). Detailed information can be found in SI Note 2. These results verified the feasibility and accuracy of using MTP to study the diffusion behavior of Cs in SiC.

Driven by our primary goal of developing the MTP applicable to polycrystalline SiC that contains both amorphous and crystalline structures, we conceptually designed the third stage to expand the training set. Specifically, we incorporated additional crystalline SiC and grain boundary structures that were not covered in the training set from the second stage, as shown in Fig. 1c. Due to the high complexity of polycrystalline SiC with HEGBs and the large number of atoms in these structures, the configuration-based AL strategy is no longer applicable. To address this issue, we conducted AIMD simulations on perfect 3C-SiC, defective 3C-SiC, and grain boundary structures. We then selected representative structures from these AIMD trajectories and added them to the training set. We then retrained the MTP using this updated training set. This stage significantly improves the adaptability of MTP to crystalline phases, defects, and grain boundary regions.

In the fourth stage, we implemented the neighborhood-based AL strategy to refine the MTP. We applied this strategy to polycrystalline SiC models containing multiple Cs atoms, which further enhanced the ability of MTP to describe complex HEGB structures. The major difference between this stage and the configuration-based AL strategy lies in the addition of a periodic structure construction process, as shown in Fig. 1d. The specific steps are as follows. First, we evaluated the extrapolation grade of each local atomic neighborhood (LAN_grade) in the large polycrystalline model. Second, we selected the atom with the highest LAN_grade as the center and cut out a non-periodic cluster with a radius of 5.5 Å. The potential issues that arise if this isolated cluster is used directly with vacuum layers for DFT calculations are shown in Fig. S3. We also extracted a cube with a size of 6 Å, centered on the same atom as the cluster, from the original configuration and used this cube to construct a periodic structure. In this periodic structure, atoms inside the cluster were fixed, while atoms outside the cluster remained unfixed during the subsequent relaxation. Third, we relaxed the lattice parameters of the constructed periodic structure and the positions of the unfixed atoms to ensure that the structure was reasonable. Additional details on the fourth stage can be found in Section “Dataset for MTP-GB”. Finally, we obtained an MTP that can be used to study the diffusion behavior of Cs in complex polycrystalline SiC with HEGBs, denoted as MTP-GB.
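The cube-cutting step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the large model is an orthorhombic periodic box, the per-atom grades are already available, and the function name `cut_periodic_cube`, its arguments, and the returned tuple layout are hypothetical. The subsequent DFT relaxation of the cell and of the unfixed atoms is not shown.

```python
import math


def cut_periodic_cube(positions, grades, box, edge, r_cluster):
    """Cut a cube around the atom with the highest grade.

    positions: list of (x, y, z) in the large periodic model.
    grades:    per-atom extrapolation grades (same order as positions).
    box:       (Lx, Ly, Lz) of the large orthorhombic model.
    edge:      edge length of the cube used as the new periodic cell.
    r_cluster: atoms within this distance of the centre are marked fixed.

    Returns (centre_index, atoms, cell), where atoms is a list of
    (original_index, position_in_new_cell, is_fixed).
    """
    centre_idx = max(range(len(grades)), key=grades.__getitem__)
    centre = positions[centre_idx]
    half = edge / 2.0
    atoms = []
    for i, pos in enumerate(positions):
        # Minimum-image displacement from the centre atom.
        d = []
        for k in range(3):
            dk = pos[k] - centre[k]
            dk -= box[k] * round(dk / box[k])
            d.append(dk)
        if all(-half <= dk < half for dk in d):
            dist = math.sqrt(sum(dk * dk for dk in d))
            shifted = tuple(dk + half for dk in d)  # centre atom sits mid-cell
            atoms.append((i, shifted, dist <= r_cluster))
    return centre_idx, atoms, (edge, edge, edge)
```

Because the cube is cut directly from the large model, the neighborhood of the central atom is preserved, while the mismatch created at the new periodic boundaries is confined to the unfixed atoms that are relaxed afterwards.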

The construction of the training set in these four stages can be found in Table S2. We employed the Smooth Overlap of Atomic Positions (SOAP) descriptor37,38 with parameters rcut = 5.5 Å, nmax = 12, and lmax = 10 to describe the atomic environments in the training set. These atomic environments were visualized using t-distributed stochastic neighbor embedding (t-SNE), and the results are shown in Fig. 2. The training set covers a broad range of atomic environments, and the atomic environments generated at the four stages exhibit distinct characteristics without significant overlap. Furthermore, the importance of the third and fourth stages and the differences between them are illustrated in Fig. S4. Therefore, each stage in our proposed workflow is indispensable.

Fig. 2: The t-distributed stochastic neighbor embedding (t-SNE) visualizations of the training set.

a Colored by element types. b Colored by stages.

MTP performances

The accuracy of MTP-a was first analyzed, and the results are shown in Fig. 3. The root mean square errors (RMSEs) of energy for the training and validation sets were 9.7 meV/atom and 10.4 meV/atom, respectively. Although no MLIPs for the SiC-Cs system have been reported in the literature to date, our results show that the energy accuracy of MTP-a is much higher than that of the neural network potential (NNP) developed by Lim et al.39 for the aSiC system. The RMSEs of force for the training and validation sets were 435.7 meV/Å and 424.8 meV/Å, respectively. Considering the inherently complex atomic environments in amorphous systems, these force errors compare favorably with those of other MLIPs developed for amorphous systems22,39,40. To investigate the sources of the force errors, we calculated the distance to the nearest neighbor for each atom, and defined the force error as the magnitude of the force vector difference between DFT and MTP-a. The analysis of carbon-carbon (C-C) pairs is shown in Fig. 3c, and other types of atomic pairs are shown in Fig. S5. The force RMSE obtained after excluding the abnormal atoms that are extremely close to their neighbors is presented in Fig. S6, from which it can be concluded that the large force errors are mainly caused by a small number of such atoms. The training configurations for MTP-a were generated from high-temperature (over 1673 K) simulations, causing atoms to undergo intense thermal vibrations. This resulted in a small number of atomic pairs becoming abnormally close due to random collisions or strong fluctuations, causing them to deviate significantly from equilibrium distances. In these regions with small atomic distances, strong repulsive forces dominate, leading to steep gradients on the potential energy surface (PES). Accurately capturing these steep local PES regions during MLIP fitting is a significant challenge. Consequently, even if MLIPs can fit the PES well overall, minor deviations in these extreme local environments are amplified, resulting in relatively large force errors. However, the histograms in Fig. 3c and Fig. S5 show that only a small number of atomic pairs have small distances, and the majority of atoms exhibit good force accuracy. This demonstrates that MTP-a achieves high accuracy for most of the atomic environments, confirming its reliability for use in the aSiC-Cs system.
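The per-atom quantities used in this error analysis can be sketched as follows. This is a minimal illustration, assuming an orthorhombic periodic cell and a component-wise convention for the force RMSE (the convention used in this work is not stated here). All function names are hypothetical.

```python
import math


def force_error_magnitudes(f_ref, f_pred):
    """Per-atom |F_ref - F_pred|, the force error as defined in the text."""
    return [math.dist(a, b) for a, b in zip(f_ref, f_pred)]


def force_rmse(f_ref, f_pred):
    """RMSE over all Cartesian force components (assumed convention)."""
    sq = [(a - b) ** 2
          for fa, fb in zip(f_ref, f_pred)
          for a, b in zip(fa, fb)]
    return math.sqrt(sum(sq) / len(sq))


def min_neighbor_distances(positions, box):
    """Distance from each atom to its nearest neighbor (minimum image)."""
    result = []
    for i, pi in enumerate(positions):
        best = float("inf")
        for j, pj in enumerate(positions):
            if i == j:
                continue
            d2 = 0.0
            for k in range(3):
                dk = pj[k] - pi[k]
                dk -= box[k] * round(dk / box[k])
                d2 += dk * dk
            best = min(best, math.sqrt(d2))
        result.append(best)
    return result
```

Plotting the output of `force_error_magnitudes` against `min_neighbor_distances` for each pair type gives the kind of scatter plot described for Fig. 3c. The double loop is quadratic in the number of atoms, which is acceptable for small DFT-sized cells.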

Fig. 3: Accuracy analysis of the trained MTP-a.

Parity plots of a the energies and b the forces for the training (purple) and validation (yellow) sets of the MTP-a potential. c Scatter plot with histograms of the minimum interatomic distance for C-C pairs and the force error.

Subsequently, the accuracy of MTP-GB was evaluated, with the results presented in Fig. S7 and Table S3. Two versions of MTP-GB, of level 10 and level 16, were used to clarify the relationship between the number of MTP parameters, prediction accuracy, and computational efficiency. The level of an MTP determines the number of parameters used to construct the potential, with a higher level corresponding to a larger number of parameters (see SI Note 1 for a detailed explanation of the term level). For MTP-GB of level 10 (MTP-GB-L10), the RMSEs of energy and force are 26.4 meV/atom and 629.3 meV/Å, respectively. For MTP-GB of level 16 (MTP-GB-L16), the RMSEs of energy and force are 16.3 meV/atom and 462.3 meV/Å, respectively. As expected, MTP-GB-L16 is more accurate than MTP-GB-L10 because it has more parameters. However, a critical trade-off between accuracy and computational efficiency is also evident. MD simulations with MTP-GB-L10 run more than three times faster than those with MTP-GB-L16, which is a critical advantage for long-timescale diffusion simulations. In addition, even with slightly higher errors than MTP-GB-L16, the accuracy of MTP-GB-L10 is still close to DFT when considering the complexity of the training set.

Furthermore, to comprehensively assess MTP-GB performance, we compared it with other advanced MLIPs, namely Atomic Cluster Expansion (ACE)41 and MACE42,43. By construction, ACE is very similar to MTP, while MACE includes message passing, which, in principle, should lead to higher accuracy. Using the same training set employed for MTP-GB, we fitted an ACE potential and tested MACE-OMAT-0 (both with fine-tuning and out of the box). The benchmark results are shown in Table S3. Overall, the results of MTP-GB and ACE are almost identical, while the RMSEs obtained with the fine-tuned MACE model are lower. However, performing MD simulations with this model is computationally too demanding. To further verify the reliability of MTP-GB-L10 in simulating the diffusion behavior of Cs atoms, we calculated the diffusivity of the Cs atoms in the target system (6 Å HEGB model) using MTP-GB-L10, MTP-GB-L16, and the fitted ACE potential. The results are shown in Fig. S8 and Table S3. The diffusivities obtained from these fitted potentials are nearly identical, indicating that the accuracy differences between the fitted potentials have a negligible impact on the simulated diffusion behavior. We also validated the consistency between MTP-GB-L10 and DFT in simulating Cs diffusion by NEB calculations (Fig. S9), and the results showed good agreement. Additionally, we recorded the evolution of the maximum extrapolation grade during MD simulations of HEGB models using MTP-GB-L10 (Fig. S10), and the results indicate that MTP-GB-L10 is highly robust, enabling stable long-timescale MD simulations. Given this consistency in diffusivity prediction, combined with its computational efficiency, MTP-GB-L10 was selected for the subsequent simulations to balance precision and feasibility.

Cs diffusivity in SiC

We performed MD simulations to investigate the diffusion behavior of Cs atoms in aSiC with different densities and in HEGBs at 1673 K, 1773 K, 1873 K and 1973 K. The time dependent mean square displacement (MSD) of Cs atoms was recorded, and the results are presented in Fig. S11. The diffusivity of Cs was calculated according to the Einstein relation:

$$D=\frac{{\rm{MSD}}(t)}{6t}$$
(1)

where D is the diffusivity, and t is the time. The diffusivity was determined from the slope of the linear region in the plot of MSD versus t. By fitting the linear regions of MSD evolutions and applying them to Eq. (1), the diffusivity of Cs atom at the specific temperature was calculated, with the results shown in Fig. 4. It is found that the diffusivities of Cs follow the Arrhenius equation:

$$D={D}_{0}\exp \left(-\frac{{E}_{a}}{kT}\right)$$
(2)

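where D0 is the diffusion constant (pre-exponential factor), Ea is the diffusion barrier, k is the Boltzmann constant, and T is the temperature. The values of D0 and Ea were obtained by fitting the diffusivities to Eq. (2), and the results are summarized in Table 1.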
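The two fitting steps, Eq. (1) followed by Eq. (2), can be sketched with an ordinary least-squares line fit. This is a minimal illustration rather than the analysis script used in this work: the function names are hypothetical, the MSD series is assumed to be already restricted to its linear region, and the Boltzmann constant is taken in eV/K so that Ea is returned in eV.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def linear_fit(x, y):
    """Least-squares slope and intercept of y versus x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def diffusivity_from_msd(times, msd):
    """Eq. (1): D is the slope of the linear MSD(t) region divided by 6."""
    slope, _ = linear_fit(times, msd)
    return slope / 6.0


def arrhenius_fit(temperatures, diffusivities):
    """Eq. (2): fit ln D = ln D0 - Ea / (k T); returns (D0, Ea in eV)."""
    inv_t = [1.0 / t for t in temperatures]
    ln_d = [math.log(d) for d in diffusivities]
    slope, intercept = linear_fit(inv_t, ln_d)
    return math.exp(intercept), -slope * K_B
```

The units of D follow those of the MSD and time inputs (for example Å² and ps give Å²/ps), so a unit conversion is needed before comparing with values reported in m²/s.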

Fig. 4: Cs diffusivities in aSiC and high-energy grain boundary (HEGB) models.

Arrhenius plot for the diffusivities of Cs atoms in aSiC with different densities and 6 Å HEGB model.

Table 1 Diffusion barriers (Ea) and diffusion constants (D0) for Cs diffusion in aSiC with different densities and 6 Å HEGB model

Comparison with other diffusivity data

Based on the results summarized by Simon et al.5 and Ko et al.10, we selected representative diffusivity data of Cs in SiC obtained by experimental or simulation methods and compared them with our results (Fig. 5). We also compared our results with selected literature diffusivity values at 1873 K (regarded as the typical accident temperature of HTGRs), and the results are presented in Fig. 6. The Cs diffusivity data reported in the current literature show significant scatter, and there is a large deviation between the simulation results and the experimental results. Additionally, we compared the diffusion barriers of Cs in SiC between this work and previous literature, as summarized in Table S4. Notably, the energy barriers for Cs diffusion in aSiC and HEGBs calculated in this work are significantly lower than those reported for crystalline bulk SiC (5.14 eV and 5.54 eV from simulations) and align more closely with values from studies on disordered structures or grain boundaries (0.697 eV and 2.41 eV for HEGBs). Our results also show good agreement with selected experimental results (ranging from 1.83 eV to 2.45 eV) while being slightly lower. Such agreement in diffusion barriers indicates that our model properly captures the key physical mechanisms underlying Cs diffusion, particularly the role of structural disorder in reducing diffusion barriers.

Fig. 5: Diffusivity data in the literature5,9,10,44,45,46,47,48,49,52,84 and this work.

GB stands for grain boundary, and HEGB stands for high-energy grain boundary.

Fig. 6: Comparison of diffusivity data at 1873 K among this work, selected literature5, and experiment.

Deff stands for effective diffusivity.

Notably, our calculations include two kinds of scenarios, namely Cs diffusion in pure aSiC and polycrystalline SiC containing HEGBs. For pure aSiC, the calculated Cs diffusivities are considerably higher than the experimental values, which can be largely attributed to structural differences between the simulated system and real SiC layers. As discussed in Section “Local environment for Cs in SiC”, the interaction between the Cs atom and surrounding cage atoms is dominated by weak interactions, leading to lower diffusion barriers in amorphous SiC compared to those in crystalline bulk or coherent grain boundaries (such as Σ5). Moreover, the highly disordered atomic structure and relatively loose network structure of amorphous SiC facilitate the formation of abundant diffusion channels, further enhancing Cs diffusivity. For polycrystalline SiC with HEGBs, the calculated Cs diffusivity is slightly lower than that in pure aSiC but still higher than experimental values. Although HEGBs exhibit structural disorder similar to amorphous SiC, they have finite thickness and additional structural constraints that moderately restrict Cs migration. This explains why the diffusivity in HEGBs is slightly lower than that in pure aSiC. Additionally, we further discuss the influence of HEGB thickness on Cs diffusivity in polycrystalline SiC, and find no significant effect. This is because as the thickness of HEGBs decreases, the density of HEGBs decreases (Fig. S12), and these two factors offset each other in restricting Cs migration.

In contrast, macroscopic diffusion data obtained by experimental methods such as integral release fitting44,45,46,47,48,49 typically reflect the coupling of multiple diffusion mechanisms, including slow bulk diffusion and grain boundary diffusion. Thus, the measured diffusivities are lower than those of the fast diffusion mechanism discussed in this study. The effective diffusivity of Cs atoms in polycrystalline SiC can be estimated using the Hart equation50:

$${D}_{{\rm{eff}}}={f}\,{D}_{{\rm{GB}}}+(1-f){D}_{{\rm{bulk}}}$$
(3)

where f denotes the volume fraction of HEGBs in the polycrystalline SiC models, and DGB and Dbulk are the diffusivities of Cs in the HEGBs and in the bulk, respectively. Given that DGB is much larger than Dbulk for Cs, Deff can be approximated as fDGB. The HEGB volume fraction f was calculated using the alpha-shape method51, and the results are shown in Fig. S12. When our calculated DGB (for HEGBs) is inserted into this equation, the resulting Deff remains higher than the experimental values, further suggesting that structural differences between the simulated models and real SiC layers drive the observed deviation.
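Eq. (3) and its grain-boundary-dominated limit can be checked with a short sketch. The function name and the numerical values below are purely illustrative and are not results of this work.

```python
def hart_effective_diffusivity(f, d_gb, d_bulk):
    """Eq. (3): D_eff = f * D_GB + (1 - f) * D_bulk.

    f is the grain-boundary volume fraction (0 <= f <= 1); d_gb and
    d_bulk must share the same units.
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("volume fraction f must lie in [0, 1]")
    return f * d_gb + (1.0 - f) * d_bulk


# Illustrative numbers only: when D_GB >> D_bulk, D_eff is close to f * D_GB.
d_eff = hart_effective_diffusivity(0.1, 1.0e-12, 1.0e-20)
approx = 0.1 * 1.0e-12
```

With these toy inputs the bulk term changes Deff by less than one part in a million, which is why the approximation Deff ≈ fDGB is reasonable whenever the grain boundary diffusivity exceeds the bulk value by many orders of magnitude.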

Previous studies used the DFT method to calculate the diffusivity of Cs in bulk 3C-SiC and the Σ5 grain boundary9,52, and the calculated values are much lower than the experimental data. In contrast, simulations of Cs diffusion along HEGBs (usually treated as highly disordered structures)5,10 produced results closer to experiments, supporting the validity of the aSiC-based modeling approach in this work. A key innovation of this study lies in the development of MLIPs (MTP-a and MTP-GB) suitable for the aSiC-Cs system and the polycrystalline SiC-Cs system. Our potentials describe highly disordered atomic environments accurately while enabling long-timescale diffusion simulations. Our results further support the view that in the SiC layer of TRISO particles, structural features such as locally amorphous regions, irradiation-induced defect clusters, and HEGBs may act as fast diffusion pathways for fission products10,14,53,54,55.

It should be pointed out that this study focuses only on the diffusion of Cs in aSiC and in polycrystalline SiC containing HEGBs, while real SiC layers exhibit diverse types of grain boundary structures. Unlike HEGBs, coherent grain boundaries (such as Σ3 and Σ5) and low-angle grain boundaries feature more ordered atomic arrangements, and thus may significantly decrease Cs diffusivities. This omission of diverse grain boundary types also explains why the calculated Deff values in this work are still larger than the experimental values, since real SiC layers contain a lower fraction of HEGBs. Future work could construct more complex polycrystalline SiC models that incorporate diverse grain boundary types and study the influence of grain boundary type on Cs diffusivity. Furthermore, from an experimental perspective, characterization techniques such as electron backscatter diffraction could be used to accurately determine the fraction of HEGBs in the SiC layers of TRISO particles.

Local environment for Cs in SiC

To investigate the diffusion mechanism of Cs, we systematically analyzed the local environments of Cs. The partial radial distribution functions (RDFs) of Cs were calculated and shown in Fig. 7. The RDFs of amorphous SiC were also computed (Fig. S13), and the results are consistent with previous studies17,56,57. Results in Fig. 7a, b reveal that for aSiC system with the same density, temperature variation has no significant effect on the RDFs of Cs-C and Cs-Si. This may be attributed to the high stability of the atomic structure within the studied temperature range, where atomic thermal vibrations do not significantly change the short-range ordered structure. To assess the influence of the density of aSiC on the local environment of Cs, we selected the simulation results at 1773 K as an example and plotted the Cs-C and Cs-Si RDFs in Fig. 7c, d. The results show that as density decreases, the positions of the first peaks in both Cs-C and Cs-Si RDFs shift to longer distances, and the peak shapes become broader and less sharp. This indicates an increase in the average distance between Cs and its neighbor atoms and a higher local disorder. Such changes directly arise from the decrease in system density, which increases the average interatomic spacing and broadens the spatial distribution of the nearest neighbors. Consequently, the amplitude of atomic fluctuations around their equilibrium positions increases, causing the broadening and rightward shift of the first peak in the RDFs.

Fig. 7: The partial radial distribution functions (RDFs) of Cs.

a Cs-C pair and b Cs-Si pair of 2.8 g/cm3 aSiC at different temperatures; c Cs-C pair and d Cs-Si pair of aSiC with different densities at 1773 K.

To analyze the above observations, we performed Voronoi analysis on Cs atoms. The volume of the Voronoi cell centered at the Cs atom and the cavity size in each frame of the MD simulation were calculated using Ovito, and their statistical histograms and representative parameters are shown in Fig. 8b, c. The results reveal that as the density decreases, the volume of the Voronoi cell and the size of the cavity increase, which is consistent with the RDF analysis. We also calculated the Voronoi cell volume and cavity size of Cs atoms in the 6 Å HEGB model, and found that both parameters were the smallest among the studied systems. This structural feature explains why the diffusivity of Cs in the HEGB is even lower than that in 3.0 g/cm3 aSiC. Representative snapshots from the MD trajectories were selected to visualize the time evolution of the Cs atom and its surrounding Voronoi cell, as shown in Fig. 8a. The Voronoi cells form cage-like structures, with green markers indicating the cage wall atoms from the initial frame and a purple marker indicating the central Cs atom, whose trajectory is also shown. The Cs atom initially oscillates within the cage, with the cage wall atoms remaining relatively unchanged. Eventually, the Cs atom escapes the cage through a local hopping event, after which it becomes enclosed by a new set of surrounding atoms, forming a new cage. To further explore the relationship between atomic dynamics and the Cs diffusion mechanism, we calculated the MSD of C, Si, and Cs atoms during the diffusion simulations, as shown in Fig. S14. The results demonstrate that C and Si atoms undergo self-diffusion, with diffusivities comparable to those of Cs atoms. This phenomenon suggests that the C-Si framework in the amorphous structure is not rigid at high temperatures but exhibits structural flexibility, providing a foundation for the local rearrangement of the cages.

Fig. 8: Schematic diagram of the cage diffusion mechanism and analysis of cage geometric structures.

a A schematic diagram of the cage diffusion process. b Histogram of the cage volume for aSiC with different densities and the 6 Å HEGB model. c Histogram of the cage radius for aSiC with different densities and the 6 Å HEGB model. The red and blue dashed lines represent the mean and median values, respectively.

To investigate the electronic structure of the Cs atom within the cage, a representative cage structure was extracted. This configuration was optimized using DFT calculations with the D3 dispersion correction58, followed by the computation of the charge density difference (CDD), electron localization function (ELF)59, and Bader charge60 (Fig. 9). Both the CDD plot and the Bader charge show that the Cs atom loses about 0.62e, with the transferred electrons uniformly distributed around the cage-wall atoms rather than accumulating on specific atoms. The ELF result further confirms that no covalent bonds or strong ionic bonds are formed between Cs and the surrounding atoms. These results suggest that the interaction between the Cs atom and the surrounding atoms is mainly a non-localized electrostatic effect, with weak interactions dominating.

Fig. 9: Electronic structure analysis of Cs in the cage structure.

a Charge density difference (CDD) plot of Cs in the cage structure. The isosurface value is 0.0015 e/Å3. b Electron localization function (ELF) plot of Cs in the cage structure. The value on the Cs atom indicates its Bader charge.

We now attempt to explain why the diffusivity increases significantly with decreasing density. According to transition state theory61, the diffusivity is related to the attempt frequency of atomic jumps. As the density of aSiC decreases, the RDF and Voronoi analyses reveal that the cage volume and the free space around the Cs atom increase, which allows the Cs atom a larger vibration amplitude and a significantly higher attempt frequency for breaking cage boundaries, thereby increasing the diffusivity. This explains the enhanced Cs diffusivity in low-density amorphous SiC. The same mechanism also applies to the HEGB model, which has the smallest cage volume and therefore the lowest diffusivity among the studied models.
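As a rough illustration of the role of the attempt frequency, the sketch below uses the textbook random-walk form of transition state theory, D = (1/6) a^2 ν exp(−E_a / k_B T), and shows that at a fixed barrier and temperature the diffusivity scales linearly with ν. The function name and all numerical inputs (jump length, barrier, frequencies) are placeholders chosen for illustration and are not values from this work.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def hop_diffusivity(attempt_freq_hz, jump_length_m, barrier_ev, temperature_k):
    """Uncorrelated 3D random-walk estimate: D = a^2 * nu * exp(-Ea / (kB T)) / 6."""
    hop_rate = attempt_freq_hz * math.exp(-barrier_ev / (K_B_EV * temperature_k))
    return jump_length_m ** 2 * hop_rate / 6.0


# Placeholder inputs (assumptions for illustration, not data from the paper).
jump_length = 3.0e-10  # m
barrier = 1.0          # eV
temperature = 1773.0   # K

d_low = hop_diffusivity(1.0e12, jump_length, barrier, temperature)
d_high = hop_diffusivity(2.0e12, jump_length, barrier, temperature)

print(f"D(nu = 1e12 Hz) = {d_low:.3e} m^2/s")
print(f"D(nu = 2e12 Hz) = {d_high:.3e} m^2/s")
print(f"ratio = {d_high / d_low:.2f}")
```

In this simplified picture, doubling the attempt frequency doubles the diffusivity; any density dependence of the barrier itself would enter through the exponential term and is not modeled here.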

The diffusion mechanism of Cs in aSiC and HEGBs can be described as a cage-breaking diffusion mechanism regulated by the local atomic environment. The diffusion process primarily consists of localized hopping events, during which the Cs atom escapes from one cage formed by the disordered network and is temporarily constrained by another cage. This mechanism reflects the interplay between local structural constraints and dynamic structural rearrangements, which is consistent with studies on the diffusion behavior of atoms or ions in amorphous or glassy systems62,63,64,65,66. Christensen et al.65 found in their study on amorphous NaFePO4 that low-constraint regions formed by larger ring structures could provide fast diffusion pathways for Na ions. This aligns with our finding that Cs diffusion relies on the local reorganization of the flexible C-Si framework to reduce constraints. Additionally, Sørensen et al.66 observed that in the Na2S-SiS2 glass system, some slowly diffusing Na ions are trapped in cages formed by the rigid Si-S network and undergo cage-rattling behavior. Both this behavior and the oscillation of Cs before hopping observed in our work reveal the constraining effect of the local atomic environment on atomic migration. Notably, in the present study, C and Si atoms already exhibit self-diffusion during the Cs oscillation stage. This provides the structural basis for the subsequent local rearrangement of the cage-wall atoms.

Discussion

In summary, we proposed a workflow for training MTPs that integrates the AL based on configuration strategy and the AL based on neighborhood strategy, enabling efficient extraction of unknown regions from complex HEGB structures to directly construct periodic structures without relying on EIPs. Through this workflow, we successfully developed MTPs for the aSiC-Cs system (MTP-a) and the polycrystalline SiC-Cs system (MTP-GB) to study diffusion behaviors of Cs atoms. The developed MTPs accurately described the complex atomic environments, with energy and force errors comparable to those of state-of-the-art MLIPs for amorphous systems. Diffusion simulations revealed that Cs diffusivity follows the Arrhenius relationship and Cs atoms diffuse via a cage-breaking mechanism, facilitated by weak interactions with surrounding atoms, as evidenced by RDFs, Voronoi analysis, and electronic structure calculations. These results clarified the microscopic mechanisms governing Cs mobility in SiC, and also enhanced understanding of Cs diffusion through HEGBs of SiC in TRISO particles. Comparisons with literature data confirmed that Cs diffusion in aSiC and HEGBs corresponds to the fast diffusion channels in SiC, whereas experimental data reflecting coupled bulk and grain boundary diffusion were slower. This further supports the validity of our modeling approach and the dominance of disordered regions and HEGBs in Cs transport within SiC layers.

This study advances atomic-scale insights into Cs diffusion in aSiC and HEGBs, and provides a reliable MLIP training workflow for future studies. This workflow can serve as a generalizable solution for developing MLIPs for other complex systems involving grain boundaries or large-scale defects. However, the current models are limited to idealized amorphous structures or HEGBs, and do not account for the full diversity of grain boundary types (such as Σ3 and Σ5) in real polycrystalline SiC. Future work should extend the proposed workflow to develop MLIPs for polycrystalline SiC with mixed grain boundary types, investigate the impact of stoichiometric variations on Cs diffusion, and apply the framework to other fission products, ultimately enabling more accurate predictions of fission product behavior in nuclear materials.

Methods

Machine learning interatomic potentials

We used the MTP implemented in the MLIP-2 package36 to describe interatomic interactions. The MTP employs moment tensor descriptors to represent the local atomic environments and applies linear regression to correlate energies with basis functions25. The MTP has been successfully applied to various types of materials, and it exhibits excellent accuracy and computational efficiency among the developed MLIPs24. We utilized the AL based on configuration strategy18 implemented in the MLIP-2 package to efficiently sample new local atomic environments for training the MTP for the aSiC-Cs system. The AL based on neighborhood strategy was realized using the MLIP-3 package26. Within the framework of these two strategies, we performed parallel MD simulations using the trained MTP and calculated the extrapolation grade γ for each configuration during the simulations. Configurations with 2 < γ ≤ 10 were considered dangerous extrapolations and saved to a file. A simulation was stopped immediately if a configuration with γ > 10 appeared. Detailed explanations of the extrapolation grade can be found in refs. 36,26. Next, we used the MaxVol algorithm18 to select configurations from the previously saved file, performed periodic structure construction (only in the AL based on neighborhood strategy) and ab initio calculations on them, and then added them to the training set. The MTP was then retrained using this updated training set. This process was repeated until no new configurations were selected, resulting in a final MTP robust enough to ensure long-term stable MD simulations. For the cutoff length of the MTP, we conducted a locality test (see SI Note 2 for details) and chose a value of 5.5 Å.
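The thresholding rule on the extrapolation grade can be sketched as follows. This is a minimal illustration of the stated criteria only (γ ≤ 2: continue; 2 < γ ≤ 10: save; γ > 10: stop); it is not the MLIP-2 or MLIP-3 implementation, and the names `Action`, `classify_grade`, and `scan_trajectory` are hypothetical. Whether the configuration that triggers the stop is itself stored is an assumption made here (it is not stored).

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"  # gamma <= 2: reliable, keep running
    SAVE = "save"          # 2 < gamma <= 10: dangerous extrapolation, store it
    STOP = "stop"          # gamma > 10: terminate the MD run


def classify_grade(gamma, save_threshold=2.0, stop_threshold=10.0):
    """Map an extrapolation grade to the action described in the text."""
    if gamma > stop_threshold:
        return Action.STOP
    if gamma > save_threshold:
        return Action.SAVE
    return Action.CONTINUE


def scan_trajectory(grades):
    """Return (indices of saved configurations, index where the run stopped or None)."""
    saved = []
    for step, gamma in enumerate(grades):
        action = classify_grade(gamma)
        if action is Action.STOP:
            # Assumption: the triggering configuration is not added to the saved list.
            return saved, step
        if action is Action.SAVE:
            saved.append(step)
    return saved, None


# Made-up grades for illustration only.
example_grades = [0.5, 1.9, 2.0, 2.1, 7.3, 10.0, 10.5, 3.0]
saved_steps, stop_step = scan_trajectory(example_grades)
print("saved:", saved_steps, "stopped at:", stop_step)
```

The saved configurations would then be passed to the selection step (MaxVol in the workflow above), which is not reproduced here.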

Ab initio calculations

All first-principles calculations in this study were performed using the Vienna Ab initio Simulation Package (VASP)67 with the projector augmented wave method68. The generalized gradient approximation in the Perdew-Burke-Ernzerhof (PBE)69 form was used to describe the exchange-correlation functional. The energy convergence threshold was set to 10⁻⁶ eV for electronic self-consistency calculations. The plane-wave cutoff energy was set to 650 eV. The first Brillouin zone was sampled using the Monkhorst-Pack scheme70 with a 3 × 3 × 3 k-point grid for structural optimization and electronic structure analysis. A single Γ point was used in AIMD calculations and in single-point calculations during the on-the-fly active learning process of MTP training. The initial training sets were generated using AIMD simulations with the Langevin thermostat71,72. Given that the system investigated in this study is amorphous, we checked the difference between PBE and PBE+D3 in the ab initio calculations and found that its influence on the subsequently trained MTP is negligible (see SI Note 4 for details).

Computational models

Amorphous SiC structures were generated through the melt-quench method using classical MD simulations with the Tersoff potential73. The initial structure was a cube with a specific number of C and Si atoms in a body-centered cubic lattice at the specified density. For locality tests (see SI Note 3), the initial structure contained 150 atoms (75 C atoms and 75 Si atoms) to ensure the cell size exceeded 12 Å at the selected density, while for subsequent MTP fitting and MD simulations, the initial structure contained 128 atoms (64 C atoms and 64 Si atoms) to ensure the cell size exceeded 11 Å at the selected density. According to the literature39,74,75,76, the density of aSiC ranges from approximately 2.7 g/cm3 to 3.0 g/cm3, so we set the target densities to 2.7 g/cm3, 2.8 g/cm3, 2.9 g/cm3, and 3.0 g/cm3. The initial structure was first rapidly heated to 5000 K and equilibrated for 200 ps to form a liquid structure. The liquid structure was then cooled to 300 K at a rate of 1 × 10¹² K/s, followed by aging for 300 ps to obtain the final amorphous structure. We tested the influence of the quench rate on the diffusion of Cs in aSiC (Fig. S17), which showed that 1 × 10¹² K/s is a suitable quench rate. The canonical (NVT) ensemble with the Nosé-Hoover thermostat77,78 was employed in the MD simulations to ensure the density of the initial structure remained constant. One or two Cs atoms were added into the amorphous SiC structure using Packmol79, with the distance between Cs atoms and other atoms larger than 2 Å.
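The relationship between atom count, target density, and cubic cell size can be checked with a short calculation. The sketch below assumes standard atomic masses for C and Si and a cubic cell; the helper names `cubic_cell_edge` and `quench_time_ps` are hypothetical. It also evaluates the duration implied by cooling from 5000 K to 300 K at 1 × 10¹² K/s.

```python
AMU_IN_G = 1.66053906660e-24  # grams per atomic mass unit
MASS_AMU = {"C": 12.011, "Si": 28.0855}  # standard atomic masses (assumed)


def cubic_cell_edge(n_c, n_si, density_g_cm3):
    """Edge length in angstrom of a cubic cell holding the atoms at the given density."""
    mass_g = (n_c * MASS_AMU["C"] + n_si * MASS_AMU["Si"]) * AMU_IN_G
    volume_a3 = mass_g / density_g_cm3 * 1.0e24  # 1 cm^3 = 1e24 angstrom^3
    return volume_a3 ** (1.0 / 3.0)


def quench_time_ps(t_start_k, t_end_k, rate_k_per_s):
    """Duration in ps of a linear cooling ramp."""
    return (t_start_k - t_end_k) / rate_k_per_s * 1.0e12


for density in (2.7, 2.8, 2.9, 3.0):
    edge = cubic_cell_edge(64, 64, density)
    print(f"128 atoms at {density} g/cm3: edge = {edge:.2f} angstrom")

print(f"150 atoms at 2.8 g/cm3: edge = {cubic_cell_edge(75, 75, 2.8):.2f} angstrom")
print(f"quench 5000 K -> 300 K at 1e12 K/s: {quench_time_ps(5000.0, 300.0, 1.0e12):.0f} ps")
```

Under these assumptions the 128-atom cell stays above 11 Å across the 2.7 to 3.0 g/cm3 range, and the cooling ramp spans 4700 ps.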

Polycrystalline 3C-SiC models containing HEGBs were constructed as follows. First, Atomsk was used to construct the initial polycrystalline 3C-SiC model with 5 grains and dimensions of 40 Å × 40 Å × 40 Å. We used the Identify Diamond Structure function in OVITO to identify grain boundary atoms. Then, the Expand Selection function with a specific cutoff radius was used to extend the selection to neighboring atoms, so that a region near the grain boundary with a thickness of twice the cutoff radius was selected, thereby defining the scope of the amorphization treatment. Next, only the atoms in the selected grain boundary region were subjected to the same melt-quench procedure used for preparing amorphous SiC. Meanwhile, the NVT ensemble was applied to the unselected atoms inside the grains to maintain a temperature of 300 K and preserve their crystalline structure. Finally, the entire system was relaxed for 100 ps under the NPT ensemble at 300 K and 0 atm to eliminate internal stress, resulting in a polycrystalline SiC model containing HEGBs with a specific thickness. According to Felix et al.80, the thickness of the amorphous layer between grains in the SiC layer of TRISO particles is approximately 0.6 nm. Therefore, we constructed a polycrystalline SiC model containing HEGBs with a thickness of 0.6 nm, denoted as 6 Å HEGB. To investigate the influence of HEGB thickness on the diffusion behavior of Cs, a model containing HEGBs with a thickness of 0.3 nm was also established, denoted as 3 Å HEGB. In addition, the initial polycrystalline SiC model constructed by Atomsk was relaxed directly with the Tersoff potential under the NPT ensemble at 300 K for 100 ps, without the expansion and selection operations, yielding a polycrystalline SiC model containing ordinary grain boundaries, denoted as 0 Å HEGB. In each HEGB model, twelve Cs atoms were inserted into the grain boundary region using Packmol, with the distance between Cs atoms and other atoms larger than 2 Å. Fig. S18 shows these three HEGB models.

Dataset for MTP-a

The initial dataset to train the MTP for the amorphous SiC-Cs system was generated by performing an AIMD simulation on the initial structure of a 2.8 g/cm3 amorphous SiC containing one Cs atom. The AIMD simulation was conducted in the NPT ensemble using the Langevin thermostat at 1873 K for 2 ps. Twenty configurations were selected from the AIMD trajectory with a 100 fs interval to pretrain the MTP. Subsequently, configurations were selected from all configurations in the AIMD trajectory based on the MaxVol algorithm and added to the initial training set to retrain the MTP, obtaining the initial MTP. After that, the initial MTP was actively trained in eight parallel MD simulations. These MD simulations used four structures: amorphous SiC with densities of 2.8 g/cm3 and 3.0 g/cm3, each containing one or two Cs atoms. MD simulations were performed using these four structures in the NPT ensemble with Nosé-Hoover thermostat at 1773 K and 1973 K for 200 ps, generating 8 MD trajectories in total. Herein, we found that if only aSiC models containing a single Cs atom are used to generate the training set, nonphysical aggregation of Cs atoms occurs in subsequent MD simulations, which is specifically characterized by excessively close atomic distances, as shown in Fig. S19. Therefore, to enable the MTP to fully learn the Cs-Cs atomic interactions, we additionally introduced aSiC models containing two Cs atoms during the training of the MTP, thereby improving the structural representativeness of the training set. This AL strategy ensured that the MTP could fully and efficiently learn the interactions within aSiC, between Cs and aSiC, and between Cs atoms. Finally, the MTP used for the amorphous SiC-Cs system was parametrized, denoted as MTP-a.

MD simulations were conducted on aSiC-Cs structures with four different densities (2.7 g/cm3, 2.8 g/cm3, 2.9 g/cm3, and 3.0 g/cm3) for 100 ps using MTP-a under the NVT ensemble at temperatures of 1673 K, 1773 K, 1873 K, and 1973 K, resulting in a total of 16 trajectories. Configurations were then uniformly selected from these trajectories to form the validation set, where the number of configurations in the validation set was approximately 25% of that in the training set.
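The construction of the validation set can be sketched as an enumeration of the 4 × 4 density-temperature grid followed by evenly spaced frame selection. This is only an illustration: the helper `uniform_indices`, the trajectory length, and the training-set size used below are hypothetical placeholders, and the exact selection procedure used in this work may differ.

```python
from itertools import product

densities = [2.7, 2.8, 2.9, 3.0]         # g/cm3
temperatures = [1673, 1773, 1873, 1973]  # K
runs = list(product(densities, temperatures))
print(f"{len(runs)} validation trajectories")


def uniform_indices(n_frames, n_select):
    """Evenly spaced, strictly increasing frame indices covering 0 .. n_frames - 1."""
    if not 0 < n_select <= n_frames:
        raise ValueError("n_select must be between 1 and n_frames")
    if n_select == 1:
        return [0]
    return [i * (n_frames - 1) // (n_select - 1) for i in range(n_select)]


# Placeholder sizes (assumptions, not values from the paper).
n_training = 640
n_frames_per_run = 1000

n_validation = round(0.25 * n_training)
per_run = n_validation // len(runs)
selection = {run: uniform_indices(n_frames_per_run, per_run) for run in runs}
print(f"{per_run} frames per trajectory, {per_run * len(runs)} in total")
```

Integer arithmetic is used for the indices so that the selected frames are always distinct when the number of requested frames does not exceed the trajectory length.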

Dataset for MTP-GB

The initial dataset for training MTP-GB was built on the training set obtained after the configuration-based AL stage (originally used for MTP-a training) and expanded with simulation data from additional structures. Nine structures were designed and generated: six crystalline 3C-SiC structures and three grain boundary structures. The crystalline structures comprise a 3 × 3 × 3 3C-SiC supercell, two single-vacancy structures (C vacancy, Si vacancy), and three divacancy structures (CC divacancy, SiSi divacancy, CSi divacancy). The grain boundary structures, each with a size of 12 Å × 12 Å × 12 Å, were randomly extracted from the grain boundary region of the 6 Å HEGB model; three structures with distinct grain boundary characteristics were obtained in this way, as shown in Fig. S20. All nine additional structures were subjected to 2 ps AIMD simulations under the NPT ensemble at 1873 K. All configurations from the simulation trajectories collectively constitute the structural pool of the dataset. Subsequently, the MaxVol algorithm was applied to select representative structures from the structural pool. These selected structures were then merged with the original MTP-a training set, forming the initial training set for MTP-GB. Four parallel MD simulations were employed for the AL based on neighborhood strategy, using the 6 Å HEGB and 3 Å HEGB models. Each MD simulation was performed for 50 ps under the NPT ensemble at 1773 K or 1973 K. Finally, after the AL based on neighborhood strategy was completed, a potential suitable for polycrystalline SiC with HEGBs was obtained, denoted as MTP-GB.

Molecular dynamics simulations and analysis

The Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) package81 was used for all MD simulations in this study. To explore the diffusion behavior of Cs in aSiC, a 3 × 3 × 3 supercell of the aSiC structure (containing 1728 C atoms, 1728 Si atoms, and 27 Cs atoms) was used for MTP-MD simulations. All MTP-MD simulations followed the same procedure in the NVT ensemble. The structures (i.e., aSiC models and HEGB models) were first relaxed for 100 ps, and then MSD data were recorded for 1000 ps. To ensure the reliability and statistical significance of the simulation results, MD simulations for each type of model were repeated ten times with different velocity seeds. The final MSD results were obtained by averaging the MSD data from these ten independent simulations.
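The averaging over independent runs can be sketched as below. Converting the averaged MSD into a diffusivity through the three-dimensional Einstein relation, D = slope / 6, is a standard approach that is assumed here for illustration. The function names and the synthetic MSD curves are hypothetical and do not represent simulation data.

```python
def average_msd(msd_runs):
    """Point-by-point mean of several MSD curves sampled at the same times."""
    n_runs = len(msd_runs)
    return [sum(values) / n_runs for values in zip(*msd_runs)]


def linear_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def diffusivity_m2_per_s(times_ps, msd_a2):
    """Einstein relation in 3D: D = slope / 6, converted from angstrom^2/ps to m^2/s."""
    slope = linear_slope(times_ps, msd_a2)  # angstrom^2 / ps
    return slope / 6.0 * 1.0e-8             # 1 angstrom^2/ps = 1e-8 m^2/s


# Synthetic curves for ten runs (illustrative only): same slope, different offsets.
times = [100.0 * i for i in range(11)]  # 0 .. 1000 ps
runs_msd = [[0.6 * t + 0.01 * k for t in times] for k in range(10)]

mean_msd = average_msd(runs_msd)
d_cs = diffusivity_m2_per_s(times, mean_msd)
print(f"D = {d_cs:.3e} m^2/s")
```

In practice the fit is usually restricted to the linear (diffusive) part of the MSD curve, excluding the initial ballistic and cage-rattling regime.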

The OVITO82 package was used to visualize and post-process the results, including the calculation of RDFs and the Voronoi analysis. The RDFs were obtained by time-averaging over hundreds of picoseconds. Additionally, the Atomic Simulation Environment (ASE) Python library83 was used to calculate the distances between atoms in the simulation cell.