Abstract
Si and its oxides have been extensively explored in theoretical research due to their technological importance. Simultaneously describing interatomic interactions within both Si and SiO2 without the use of ab initio methods is considered challenging, given the charge transfers involved. Herein, this challenge is overcome by developing a unified machine learning interatomic potential describing the Si/SiO2/O system, based on the moment tensor potential (MTP) framework. This MTP is trained using a comprehensive database generated using density functional theory (DFT) simulations, encompassing diverse crystal structures, point defects, extended defects, and disordered structures. Extensive testing of the MTP is performed, indicating it can describe static and dynamic features of very diverse Si, O, and SiO2 atomic structures with a degree of fidelity approaching that of DFT.
Introduction
Si/SiO2 interfaces are ubiquitous in semiconductor manufacturing, including metal-oxide-semiconductor field-effect transistors1 and nanowire and nanodot transistors. The formation of SiO2 layers involves charge transfer during the oxidation of Si substrates2. Additionally, siliceous materials, including clay minerals and cement, which comprise Si, O, and SiO2 components, are governed by interactions that entail similar charge transfer. Abundant past theoretical research has focused on understanding Si, its oxides, the formation of SiO2 multilayered structures, early oxidation rates, and amorphization of oxide layers3,4,5,6,7,8. These studies predominantly relied on electronic density functional theory (DFT)9,10,11 and nonflexible classical potentials12,13. However, simulating large systems with multiple components, charge transfer, and hetero-interfaces poses challenges within these frameworks. An ideal modeling approach should explicitly or implicitly capture charge transfer without compromising accuracy or incurring prohibitively large computational costs. Existing charge equilibration potentials like ReaxFF14,15 and COMB16,17,18, while capable of describing chemical interactions during molecular dynamics (MD) simulations, tend to have a limited ability to describe the mechanical properties of materials15,17,19 unless special reparametrization is applied.
Recent advances, such as linear-scaling DFT11,20 and machine learning (ML) force fields, e.g., the Gaussian approximation potentials (GAP)21 and artificial neural networks22,23, lift the limitations of traditional methods. ML force fields have demonstrated high accuracy in modeling Si24,25,26,27 and many other elements28,29,30,31,32. Similar progress has been made in improving interatomic interaction descriptions in Si oxides33,34,35,36 and metal oxides37,38,39,40. However, jointly describing compounds and their constituents using ML force fields presents challenges due to the disjointed configurational space of multi-phase forms and the need to handle charge transfer. ML force fields41,42,43,44,45 combine a descriptor with a regression procedure to encode geometry and ab initio properties, usually omitting explicit electronic structure. A previous study focusing on modeling SiO2 using the moment tensor potential (MTP) suggests that incorporating additional reference data is preferable to adding explicit charge equilibration for long-range interactions33.
The novelty of this article is an MTP46,47 that jointly describes interatomic interactions in SiO2 and its constituents (Si and O), enabling the representation of multiple charge states. The developed MTP for Si/O/SiO2 systems is parameterized using an ab initio database containing diverse crystal structures, point defects, extended defects, and disordered structures. This MTP is then used in molecular statics (MS) and molecular dynamics (MD) simulations to investigate crystalline, interfacial, amorphous, and liquid states of Si and SiO2. These test simulations indicate that the MTP can provide a unified description of these disjoint systems.
Results
Our analysis encompasses both MS and MD simulations. The MS results include cohesive energies, lattice constants, elastic constants, defect formation energies, interface relaxation, as well as linear and planar defects. The MD results cover vacancy diffusion coefficients, the melting point, interfaces, as well as the liquid and amorphous structures of Si and SiO2. The outcomes of these runs are compared against the reference method (DFT) and against those derived from semi-classical potentials.
Cohesive energy, elastic constant, point defects and extended defects
First, the equations of state of various Si and SiO2 polymorphs are presented as cohesive energy vs lattice parameter. Calculation details are given in the methods section. Additionally, the cohesive energy values for molecular oxygen, both O2 and O3, are provided. As shown in Fig. 1, the MTP replicates the cohesive energies of the reference states with remarkable accuracy. Table 1 compares lattice parameters predicted by the MTP, COMB, and ReaxFF models with benchmark and experimental data. The MTP predictions show excellent agreement with both the benchmark and experimental data.
a Bond and angle energy of dioxygen and ozone as calculated using MTP (lines) and DFT (dots). b–d Energy-volume relationships in crystalline SiO2 and Si polymorphs calculated using the MTP (lines) and DFT (dots). 3D Si polymorphs (b), 3D (c), and 2D (d) SiO2 structures were considered. Details pertaining to these crystal structures are available in the Supplementary Information. Agreement between MTP and DFT is excellent. Insets: illustration of selected polymorph crystal structures.
The second-order elastic constants and bulk modulus are determined using finite differences, as detailed in the methods section. Table 2 provides the relative root mean square error (RRMSE) of the elastic constants with respect to the DFT benchmark and experimental data. The MTP model shows a lower RRMSE than the ReaxFF and COMB potentials and competes closely with the van Beest-Kramer-van Santen (BKS)48 potential. Moreover, other semi-empirical models reported in ref. 24 show higher errors than the MTP model. The bulk modulus values for various silicon and silica polymorphs can be found in Table 3. The MTP predictions closely align with the reference methods and experimental values, even though the training set did not encompass the deformation of certain polymorphs. Our potential accurately predicts the elastic constants of amorphous silica, even though amorphous configurations were not included in the training set. In our testing of the MTP, we have also considered point defects such as vacancies, divacancies, and self-interstitials. The RRMSE values for these defects are reported in Table 4. Once again, the MTP leads to smaller relative errors than the ReaxFF potential. Since the majority of potentials were not specifically parameterized for the oxygen system alone, our comparison was limited to ReaxFF. We also examined a specific case involving the I4 compact cluster49 within the Si crystal, which was not included in our training set. All atoms within the cluster are four-fold coordinated, and the cluster contains five-, six-, and seven-membered atomic rings. The bond lengths and bond angles of this cluster were calculated based on relaxed structures obtained from DFT, MTP, Stillinger-Weber (SW), ReaxFF and COMB calculations, as illustrated in Fig. 2. 
No dangling bonds were observed with any potential except COMB, which failed to reproduce the I4 structure. For the formation energy of the cluster, the MTP prediction lies within 14% of the reference value, while the SW and ReaxFF potentials display errors of up to 27% and 47%, respectively. Among the models assessed, the MTP shows the best agreement with DFT calculations for both bond lengths and bond angles. As this is a perfectly coordinated tetra-interstitial, we also tested 3- and 5-fold coordinated interstitials, namely, di-interstitial, tri-interstitial, and tetra-interstitial, as shown in Fig. 2. While these defects are not included in the training set, the MTP exhibits better agreement with the benchmark than ReaxFF, as detailed in Table 4. The MTP also outperforms ReaxFF in describing the vacancy formation energy in silica polymorphs, as demonstrated in Table 4. Again, no SiO2 point-defect configurations were incorporated into our training set.
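The RRMSE values quoted for Tables 2 and 4 can be illustrated with a short sketch. The exact definition used for those tables is not restated in this section, so the function below assumes one common convention (the root mean square of the per-entry relative errors), and the numerical values are hypothetical placeholders rather than data from the tables.

```python
import numpy as np

def rrmse(predicted, reference):
    """Relative RMSE, assumed here to be the RMS of per-entry relative errors."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    relative_errors = (predicted - reference) / reference
    return float(np.sqrt(np.mean(relative_errors ** 2)))

# Hypothetical elastic constants in GPa, for illustration only.
c_reference = [160.0, 60.0, 80.0]
c_model = [152.0, 63.0, 76.0]

print(f"RRMSE = {100.0 * rrmse(c_model, c_reference):.1f}%")
```

Other conventions, such as normalizing the absolute RMSE by the mean reference value, would give different numbers, so the definition should be checked before comparing with published tables.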
A relaxed, perfectly coordinated four-interstitial cluster in crystalline silicon. a, c The MTP leads to bond lengths and angles (as labeled in b) in better agreement with DFT than the SW, ReaxFF and COMB models. Two other interstitial clusters with coordination defects are also presented, namely a four-interstitial cluster (d) and a di-interstitial (e). All the interstitial atoms are colored orange. In d and e, the interstitial bonds are also colored orange. Their formation energies are indicated in Table 4, where configurations b, d, and e correspond to Si-I4C2, Si-I4C1, and Si-I2, respectively. The notations I1, I2, I3, and I4, as well as α, β, γ, δ, and ϵ, represent bond lengths and bond angles within the cluster, respectively. These interstitial configurations were not included in the training set.
The static migration barrier energy of the vacancy was determined using the nudged elastic band (NEB) method50. The migration barrier profiles, including the DFT-based profiles as well as the MTP- and SW-based profiles, are depicted in Fig. 3b. The MTP migration barrier profile shows excellent agreement with the DFT reference profile. In contrast, the SW potential does not capture the reference profiles with similar accuracy. The relative error in the vacancy migration barrier height is 15.0% for MTP and 73.1% for SW. These results demonstrate that the MTP model can better mimic the reference method when studying point defects within larger systems, as demonstrated in references28,51,52. The activation barriers for mono-vacancy hopping were further investigated using MD simulations, considering temperatures ranging from 1000 K to 1650 K. The simulation details are provided in the methods section. The mean square displacements are provided in Supplementary Fig. S4. As reported in Fig. 3a, the activation energy for mono-vacancy hopping is 0.31 eV for MTP and 0.41 eV for SW. These values are close to the static activation barriers computed at 0 K; the MTP model behaves in a physically plausible fashion. As observed in Fig. 3a, b, the barrier obtained from MD simulation is 0.32 eV, whereas the static migration barrier stands at 0.31 eV. It is noteworthy that NEB configurations and ab initio molecular dynamics (AIMD) configurations containing vacancies were not included in the training set. We also investigated extended defects such as generalized stacking faults and dislocations in the Si crystal. Generalized stacking faults are planar defects closely linked to slip. In turn, the behavior of dislocations and their core properties are particularly important for understanding plasticity. 
The methods section provides a detailed description of our models for generalized stacking faults and dislocations, as well as the calculations involved. In Fig. 4a, b, the excess energy per unit area, also known as a γ-line, is presented. The γ-line was calculated using the benchmark method, the MTP model, and the SW53 and Tersoff (TS)54,55 semi-classical potentials. As shown in Fig. 4, the MTP is in good agreement with the benchmark results, whereas SW and TS agree less well. In Fig. 5, the dislocation core structures are presented. The core structures predicted by the MTP model are in nearly perfect agreement with those predicted by DFT, as reported in ref. 56. A distinctive feature of our potential is the direct relaxation of the C1 core structure to the C2 structure, which is commonly referred to as the double-period reconstruction of the C1 core. In past studies, the C2 core structure was often manually reconstructed from the relaxed C1 core56,57. Our potential eliminates the need for manual reconstruction by obtaining the C2 structure directly. To obtain the relaxed structure and energy of the C1 configuration, a snapshot is selected from the relaxation steps that ultimately lead to the C2 structure. Our investigation also revealed that the C2 core is the most stable configuration, a result consistent with previous reports56,57. In addition to Fig. 5, the core structures are also depicted in Supplementary Fig. S10.
a Temperature dependence of vacancy diffusion coefficients simulated using the MTP and SW. b NEB-based mono-vacancy jump barrier. DFT, MTP and SW are compared. AIMD configurations containing vacancies, along with NEB configurations, were excluded from the training set. Insets: illustration of vacancy position before and after the jump within the silicon crystal.
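The activation energy extracted from the temperature dependence of the vacancy diffusion coefficient can be sketched as a linear fit of ln D against 1/T, assuming Arrhenius behavior D = D0 exp(−Ea / kB T). The fitting procedure actually used for Fig. 3a is not described in this section, so the snippet below is only an illustration; the prefactor and the synthetic data points are hypothetical.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def fit_arrhenius(temperatures, diffusivities):
    """Fit ln D = ln D0 - Ea / (kB T); return (Ea in eV, D0 in the units of D)."""
    inverse_t = 1.0 / np.asarray(temperatures, dtype=float)
    log_d = np.log(np.asarray(diffusivities, dtype=float))
    slope, intercept = np.polyfit(inverse_t, log_d, 1)
    return -slope * K_B, float(np.exp(intercept))

# Synthetic data generated with a hypothetical prefactor and Ea = 0.31 eV.
temps = np.linspace(1000.0, 1650.0, 8)
d0_assumed, ea_assumed = 1.0e-3, 0.31
diff = d0_assumed * np.exp(-ea_assumed / (K_B * temps))

ea_fit, d0_fit = fit_arrhenius(temps, diff)
print(f"Ea = {ea_fit:.3f} eV, D0 = {d0_fit:.3e}")
```

With real MD data, the diffusion coefficients would first be obtained from the slope of the mean square displacement, and the scatter of the points around the fitted line gives a sense of the uncertainty in the activation energy.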
γ-lines on the (111) plane as predicted by DFT, MTP, TS and SW. The shuffle (S) and glide (G) cuts are illustrated in the inset. a The MTP provides a good description of the γ-line associated with the shuffle cut and b a near-perfect description of the γ-line associated with the glide cut. Insets (i), (ii), and (iii) represent the bulk structure, the relaxed shuffle (S) structure, and the relaxed glide (G) structure, respectively.
Positions and relaxed structures of [110] screw dislocation cores in Si obtained with the MTP. The system size was set to 14400 atoms and oriented such that the x, y, and z directions coincide with [11-2], [111], and [110], respectively. The dislocation core structures are in good agreement with DFT core structures reported in the literature56. The core types are labeled A, B, C1, and C2. The red mark indicates the position of the dislocation core. Dislocation configurations were excluded from the training dataset.
Coexistence simulation
We determined the solid-liquid coexistence temperature of silicon using the solid-liquid interface method described in ref. 58. Our MTP predicts the silicon melting point to be 1485 ± 5 K, which is ~0.5% lower than the benchmark DFT-GGA value of 1492 K59. Note that both the MTP and the benchmark values are ~12% lower than the experimental melting point of 1687 K, as reported in ref. 60. Our database initially lacked solid-liquid interface data, so we integrated a few AIMD configurations gathered around the experimental melting point into our unified training set. It is also important to acknowledge the influence of the exchange-correlation (XC) functional on melting behavior, as discussed in refs. 59,61. The capability of the MTP to replicate the melting point of the GGA XC functional reflects the quality of its description of the liquid structure, as elaborated in the following section. Details of our coexistence simulation approach are given in Supplementary Fig. S9.
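The percentages quoted above follow directly from the three melting temperatures. As a quick arithmetic check, using only the values stated in the text:

```python
def relative_deviation(value, reference):
    """Signed relative deviation of value with respect to reference."""
    return (value - reference) / reference

t_mtp, t_dft, t_exp = 1485.0, 1492.0, 1687.0  # melting points in K, from the text

print(f"MTP vs DFT-GGA:    {100.0 * relative_deviation(t_mtp, t_dft):+.2f}%")
print(f"MTP vs experiment: {100.0 * relative_deviation(t_mtp, t_exp):+.2f}%")
print(f"DFT vs experiment: {100.0 * relative_deviation(t_dft, t_exp):+.2f}%")
```

The ±5 K uncertainty on the MTP value is comparable to the 7 K difference between the MTP and DFT-GGA melting points, so the two values differ by only slightly more than the stated uncertainty.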
Silicon slab energy
We compare the surface energies predicted by the MTP against experimental data and DFT references; slab data were not included in the training set. Experimental slab energy values for Si(100), Si(110), and Si(111) are reported to be 2.1, 1.5, and 1.2 J/m² (ref. 62), respectively, while our DFT values are 2.1, 1.8, and 1.6 J/m². The MTP predicts these surface energies to be 2.0, 1.3, and 1.2 J/m², respectively. While a large discrepancy between MTP and DFT is observed, especially for Si(110) with relative errors of up to 25%, the MTP-predicted surface energies closely match the experimental slab energies.
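For readers unfamiliar with slab calculations, the sketch below shows the standard textbook definition of the surface energy of a symmetric slab, γ = (E_slab − N·E_bulk) / (2A), together with the conversion from eV/Å² to J/m². Whether the values above were obtained in exactly this way is an assumption, and the input numbers are hypothetical.

```python
EV_PER_A2_TO_J_PER_M2 = 16.02176634  # 1 eV/Å² expressed in J/m²

def surface_energy(e_slab, n_atoms, e_bulk_per_atom, area):
    """Surface energy in J/m² of a slab with two equivalent surfaces.

    e_slab and e_bulk_per_atom are in eV, area (one face) is in Å².
    """
    excess = e_slab - n_atoms * e_bulk_per_atom   # eV stored in both surfaces
    return excess / (2.0 * area) * EV_PER_A2_TO_J_PER_M2

# Hypothetical slab: 20 atoms, 2.0 eV excess energy, 16 Å² per face.
gamma = surface_energy(e_slab=-100.0, n_atoms=20, e_bulk_per_atom=-5.1, area=16.0)
print(f"gamma = {gamma:.3f} J/m^2")
```

The factor of two in the denominator accounts for the two free surfaces of the slab; for slabs with inequivalent terminations, this simple expression only yields an average of the two surface energies.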
Disordered structures
We also considered an extensive set of disordered Si and SiO2 structures. We compared structures generated using ab initio MD, MTP-MD, and semi-empirical MD. All three cases were subjected to identical MD simulation conditions, except for amorphous silica, where the MTP simulation time was shorter than for the other potentials. For comparison, we have chosen the Vashishta (VA)63, Munetoh (TS)64, BKS48, and Sundararaman (SHK1 and SHK2)65 models. The details regarding the ab initio and classical MD simulations are provided in the methods section. To analyze the disordered structures, we used both the pair correlation functions and the bond angle distribution functions.
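As an illustration of the structural analysis, the pair correlation function g(r) of a single configuration in a cubic periodic box can be estimated by histogramming minimum-image pair distances and normalizing by the ideal-gas expectation. The following is a minimal single-species sketch written for this purpose, not the analysis code used in this work; the function name and the random test configuration are hypothetical.

```python
import numpy as np

def radial_distribution(positions, box_length, r_max, n_bins):
    """g(r) for one species in a cubic periodic box (requires r_max <= box_length / 2)."""
    positions = np.asarray(positions, dtype=float)
    n_atoms = len(positions)
    # All pair separation vectors, wrapped with the minimum-image convention.
    delta = positions[:, None, :] - positions[None, :, :]
    delta -= box_length * np.round(delta / box_length)
    distances = np.linalg.norm(delta, axis=-1)[np.triu_indices(n_atoms, k=1)]
    counts, edges = np.histogram(distances, bins=n_bins, range=(0.0, r_max))
    shell_volumes = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    density = n_atoms / box_length ** 3
    # Each pair was counted once, hence the factor 2 for a per-atom average.
    g = 2.0 * counts / (n_atoms * density * shell_volumes)
    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, g

# Uncorrelated random points should give g(r) close to 1 at all distances.
rng = np.random.default_rng(0)
box = 10.0
r, g = radial_distribution(rng.uniform(0.0, box, size=(500, 3)), box, 5.0, 25)
print(f"mean g(r) for r > 2: {g[r > 2.0].mean():.2f}")
```

In practice, g(r) is averaged over many MD snapshots, and partial functions such as Si–O are obtained by restricting the pair list to the relevant species.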
Liquid and amorphous silicon
As observed from the radial distribution function shown in Fig. 6a, the MTP describes the structure of liquid Si with high precision. In contrast, the other potentials lead to a shifted position of the first-neighbor peak compared to the reference (DFT) results. Additionally, the peak height is overestimated, primarily by the environment-dependent interatomic potential (EDIP) and the Tersoff potential. The angular distribution function (ADF, Fig. 6b) also demonstrates excellent agreement between the MTP model and DFT data. For amorphous Si, the MTP model describes the structural features in agreement with experimental data. Conversely, semi-classical potentials like SW and Tersoff fail to replicate the experimental radial distribution profile. As shown in Fig. 6c, experimental g(r) and MTP-MD lead to nearly identical first-neighbor peaks (2.36 Å) and second-neighbor peaks (3.89 Å). Additionally, the MTP model exhibits better agreement with the experimental bond angle distribution centered around 108.6°66. Although the bond angle distributions of semi-empirical models like SW and EDIP are closer to the experimental values, they include angles below 90°, as shown in Fig. 6d. Note that the ab initio cooling and equilibration trajectory was not included in the training set of the MTP model, which suggests the MTP is fairly general. This can be attributed to the fact that the training dataset encompasses a variety of configurations within the disordered Si systems. Overall, the MTP model demonstrates a level of accuracy comparable to that of DFT and experiment when describing the structural features and bonding characteristics of disordered Si.
Disordered structures of Si simulated using DFT, MTP, and semi-empirical models. a Radial and b angular distribution functions of liquid Si (3370 K, 64 atoms); c Radial and d angular distribution functions of amorphous Si (300 K, 1000 atoms). The amorphous distribution functions are compared against experimental data from Exp A112,113 and Exp B66.
Liquid and amorphous silica
Figure 7 illustrates results pertaining to liquid SiO2, including pair distribution functions (PDF) and ADFs. The MTP is in better agreement with DFT than the other potentials. We notice both quantitative and qualitative differences between the semi-empirical potentials and DFT, except for the Si–O pair correlation function. In this case, the semi-empirical potentials only overestimate the height of the first peak, which is located around 1.62 Å67,68,69. This value is consistent with the experimental Si–O bond length observed in liquid SiO2, indicating a strong chemical interaction between the Si–O pairs. Both the O-O and Si-Si pair correlation functions given by the other models exhibit a shift in the first peak, while there is a strong quantitative and qualitative match between the MTP-based and the DFT-based structures. Furthermore, the BKS-based Si–O–Si ADF, along with those of other semi-empirical models, does not match the DFT-based ADF. Conversely, the MTP-based ADF demonstrates a good match. At 3600 K, using our 96-atom simulation box model, the average Si–O–Si angle between two SiO4 tetrahedra is determined as follows: DFT, 134.5°; MTP, 131.4°; BKS, 146.0°; TS, 143.6°; VA, 141.8°; SHK1, 142.6°; and SHK2, 140.4°. Our DFT value aligns closely with literature values of approximately 136.0°70 and 135.0°71. The MTP closely resembles the reference method, whereas the other models align with experimental values for the Si–O–Si angle in amorphous silica, ranging between 140° and 152°72,73,74. This likely stems from the semi-empirical models having been fitted with consideration of experimental properties; most of them were optimized based on mixed ab initio-experimental data. We then varied the temperature of liquid silica from 2500 K to 3500 K using a 648-atom box and recorded the Si–O–Si angle, as depicted in Fig. 7d. 
We have found that the Si–O–Si angle in liquid silica changes with temperature, consistent with the findings reported in ref. 75. As the temperature decreases, the angle between tetrahedral interconnections increases, contributing to network relaxation. It is likely that the relaxation of the network at room temperature upon cooling is primarily attributed to variations in bond lengths and angles, given that the network structure of silica liquid does not qualitatively change between 3500 K and 300 K.
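The Si–O–Si angle discussed above is the angle at a bridging oxygen between its two Si neighbors. A minimal sketch of how such an angle can be evaluated from Cartesian coordinates is given below; it assumes unwrapped coordinates (no periodic-image handling), and the function name is hypothetical. The tetrahedral angle serves as a sanity check.

```python
import numpy as np

def bond_angle(center, neighbor_a, neighbor_b):
    """Angle in degrees at `center` between the bonds to the two neighbors."""
    va = np.asarray(neighbor_a, dtype=float) - np.asarray(center, dtype=float)
    vb = np.asarray(neighbor_b, dtype=float) - np.asarray(center, dtype=float)
    cosine = np.dot(va, vb) / (np.linalg.norm(va) * np.linalg.norm(vb))
    # Clip guards against round-off pushing the cosine slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0))))

# Sanity check: two bonds of an ideal tetrahedron enclose arccos(-1/3).
tetrahedral = bond_angle([0, 0, 0], [1, 1, 1], [1, -1, -1])
print(f"tetrahedral angle = {tetrahedral:.2f} degrees")
```

An average Si–O–Si angle would then be obtained by looping over all two-fold coordinated O atoms in each snapshot, with neighbors identified through a Si–O distance cutoff and separation vectors wrapped by the minimum-image convention.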
a Si-Si, b O-O and c Si–O correlation functions g(r) and f Si-O-Si, e O-Si-O, g Si-Si-Si, and h O-O-O partial bond angle distribution function for liquid SiO2 at 3600 K simulated using DFT, MTP, BKS, TS, VA, SHK1 and SHK2 with a 96-atom simulation box. d Distribution of the Si–O–Si angle against temperature using the MTP potential.
Furthermore, ab initio MD, MTP-MD, BKS-MD and the other models lead to a within-tetrahedron O-Si–O bond angle distribution centered around 109°, which is nearly equal to the experimental bond angle67,68,69. However, the BKS and VA potentials overestimate the average probability at 109°, while the TS potential underestimates it. The SHK1, SHK2, and MTP potentials match the benchmark. DFT, the MTP, and all other models except TS lead to very similar O-O-O ADFs. For the Si-Si-Si angle, however, qualitative and quantitative discrepancies are observed between BKS-generated and DFT-generated structures, while MTP-generated structures match the benchmark. There is also a notable discrepancy between the structures predicted by the other models and that of the benchmark.
The partial PDFs of vitreous SiO2 are illustrated in Fig. 8a–c. The comparison is made against the PDF computed from experimental data using the reverse Monte Carlo (RMC) method76. The MTP exhibits qualitative agreement with the RMC data, as our PDF profiles match those obtained from RMC. For example, only the MTP reproduces the RMC profile for Si-Si, as the second peak around 5 Å is not reproduced by the other models, including the BKS model. Additionally, while the other models fail to reproduce the height of the Si–O RMC PDF, the MTP shows a good match. To further analyze the structure, we also computed the most important ADFs, as well as the Si–O bond length distribution function. For O-Si–O (refer to Fig. 8c), all the profiles are similar but vary in height, and are centered around the experimental value of 109.0°. For the Si–O–Si angle (Fig. 8d), both the MTP and BKS show similar profiles centered between the experimental values of 144° and 152°. The average Si–O–Si angle is 150° for BKS and 145.5° for the MTP. As the average value for the same angle predicted by the MTP in liquid silica at 3600 K is ~131.4°, this confirms that the Si–O–Si angle varies in liquid silica. For all potentials, the Si–O bond length distributions are centered between 1.60 and 1.66 Å. The average bond length for the MTP is 1.63 Å, which is close to the experimental value of 1.62 Å. Note that these structural properties may vary slightly depending on the employed cooling rate. Despite the high cooling rate, no major coordination defects were observed. This indicates that the configuration has been well equilibrated at 3000 K, resulting in the establishment of a strong network. 
Note that the ab initio MD trajectory, which encompasses both the cooling and equilibration stages of the amorphous structure preparation, was not included in the MTP training set, which is an indicator of the MTP’s generalization capability. Overall, the MTP potential demonstrates a remarkable improvement in accurately describing the structure of disordered SiO2 compared to well-established potentials such as the BKS potential and others. The MTP model captures the essential structural features of the disordered systems with greater precision, resulting in a better agreement with experimental observations67,68,69,76.
Partial radial distribution functions (a–c) in vitreous silica were obtained using various potentials including MTP, BKS, TS, VA, SHK1, and SHK2. The vitreous systems, consisting of 648 atoms, were equilibrated at 300 K. These partial radial distribution functions are compared with experimental data obtained using reverse Monte Carlo (RMC)76. Additionally, the angular distribution functions (d, e) and bond length distribution function (f) were analyzed and compared with experimental data from multiple sources: Exp A67, Exp B73, Exp C114, and Exp D115.
Phonon dispersion
We calculated the phonon dispersion of c-Si and α-quartz, as illustrated in Fig. 9a, b, respectively. The MTP model exhibits very good agreement with the reference method (DFT), except at the higher frequencies for α-quartz. Once again, the corresponding frozen phonon configurations were not explicitly included in the training set.
Si–SiO2 interface
The Si–SiO2 interface, a cornerstone in semiconductor physics and materials science, plays a fundamental role in device fabrication and significantly impacts device performance. Exploring heterostructures involving Si–SiO2 interfaces opens avenues for novel functionalities and applications in microelectronics and beyond. Given its importance, capturing the structure and dynamics of the Si–SiO2 interface is paramount for a potential model of the Si–O system. To achieve this, we employed various models, encompassing crystalline Si slabs with different orientations, such as Si(100), Si(110), and Si(111). We used α-quartz, β-cristobalite, and other polymorphs in constructing the Si–SiO2 interfaces. For a detailed explanation of our construction scheme, please refer to the Methods section. To assess the suitability of the potential for modeling the Si/SiO2 interface, we examined interfaces using both O-terminated and Si-terminated SiO2 slabs. For each Si/SiO2 crystalline interface configuration, we perform force and energy minimization, allowing the positions of atoms and the simulation box size to change simultaneously. Following geometry optimization, MD simulations are conducted for 50 ps at 300 K in the NPT ensemble. Our potential stabilized the majority of the interfaces in both static and dynamic runs, with the extent of stabilization depending on the orientation of the Si slab and the termination of the quartz slab. For Si(110) in contact with either a Si- or O-terminated quartz slab, our potential successfully stabilizes and describes the dynamics of the resulting interfaces, whether symmetric or non-symmetric. However, for Si(100) and Si(111), our potential only successfully describes symmetric interfaces, which are built from quartz slabs terminated by Si. 
These combinations of termination and orientation were not considered in the training set, showcasing the remarkable generalization ability of our interaction potential. It is worth noting that defects, such as silicon dangling bonds or over-coordinated oxygen atoms, are observed at the interface following both minimization and dynamic runs, as depicted in Fig. 10. As noted in ref. 77, the presence of the dangling bonds is a natural occurrence and constitutes a typical aspect of interface defects. Such anomalies are commonplace and anticipated in these interfaces, owing to the inherent lattice mismatch between the involved materials. Usually, interfacial defects are passivated or special construction schemes are adopted to eliminate them. However, our study does not aim to create defect-free interfaces. Instead, our goal is to evaluate the potential’s capability to manage complex heterostructures with varying bonding types not encountered during the training process. To further validate our potential for silicon and silica interfaces, we compare the interfacial energies of small models computed using MTP and DFT. Detailed descriptions of these small models are provided in the Methods section. Our results (shown in Fig. 11) exhibit good correlation between DFT and MTP models. Importantly, configurations generated by the MTP potential through relaxation and MD simulations converged easily, typically requiring fewer than 50 iterations of single-point energy calculations of DFT. This highlights the reliability of interfacial configurations generated by the MTP potential. These findings demonstrate the capability of MTP potentials to effectively investigate heterosystems containing Si–SiO2 interfaces. The relaxed small models of the Si–SiO2 interfaces are presented in the supporting information (Figs. S11 and S12).
Discussion
In this work, we have parametrized an ML potential that can implicitly capture and describe different charge states. Additionally, this ML potential is able to describe disjoint zones and hetero-zones of the configurational space of SiO2 and its constituent elements, Si and oxygen. The potential's description of various phenomena, including point defects, diffusion in Si crystals, extended defects, the liquid phase of Si and SiO2, and the amorphous phase of Si and SiO2, either rivals or outperforms existing potentials. The potential exhibits very good agreement with experimental data in challenging configurational zones, such as the amorphous state, even though these configurations were not included in the training data. In many scenarios, such as disordered phases (liquid, amorphous), the potential achieved a near-perfect match with the reference method, using exactly the same (very short) simulation time and conditions. Even with longer simulation times and larger systems, a semi-empirical model does not reach a level of accuracy similar to the reference method. Furthermore, the potential displays an intriguing capability regarding dislocation behavior: it spontaneously relaxed the C1 structure to the C2 structure without the need for manual reconstruction. While there is a growing consensus that ML potentials can effectively serve as surrogates for DFT in terms of accuracy and speed, generalization, including charge state modeling, remains a challenging task. This study provides evidence that reaction coordinates are sufficient to implicitly capture charge transfer or charge states involved in chemical reactions. Aside from potentials that explicitly consider charge transfer, separate ML potentials are usually developed for each individual chemical element or compound for which a database can be constructed by DFT. 
Our approach suggests this is not necessary; compounds and their elemental constituents can be trained jointly. Indeed, joint parametrization, where parameters are derived simultaneously for silicon, silica and oxygen, offers several advantages. By employing a single potential to describe the interactions between atoms in both silicon and silica, the computational model becomes more streamlined and easier to manage. The unified potential can save time and resources by avoiding the need to recalibrate the parameters of the model for each of the materials involved. The unification idea is also important for some areas of application, such as interfacial modeling for electronic devices, energy storage and conversion, and surface coatings and tribology. Here, the reference data for the ML potential must include both the individual materials in contact and the boundary region. In addition, as chemical reactions can occur in MD simulations, good modeling of a multi-component material or complex system under certain conditions requires joint parametrization of the considered system and its constituents. For instance, oxygen aggregation in high-temperature MD simulations was reported in ref. 34 (supporting information), which led the authors of that work to include oxygen molecules in their training set. Our preliminary study pertained to a semiconductor and its oxide. Whether this approach can be generalized to other elements and mixtures, including multi-component alloys and compounds, remains to be seen. To ensure a well-implemented unified interaction potential, several other aspects need to be explored in the future. 
Given that the compound and its constituents are not located within the same zone of the configurational space, achieving an accurate, efficient, low-cost, and general unified potential for both the material and its oxides with limited data may require adjustments to the underlying mathematical model, the fitting procedure, and the database sampling methods (including active learning). These adjustments could help attain the same level of accuracy at a more affordable computational cost, resembling a feature of potentials parameterized for a single compound. Likewise, while this study achieved a joint description of Si and SiO2 using the MTP framework, it is likely that other currently developed ML potential frameworks would have led to a similar result. In conjunction with these questions, our aim in the future is to extend this work by incorporating the element of hydrogen to model silica gels.
Methods
Ab initio calculations
The database was constructed using DFT, as implemented in the Quantum ESPRESSO78 package. The exchange-correlation potential was treated using the generalized gradient approximation of Perdew-Burke-Ernzerhof (GGA-PBE)79. Projector augmented waves (PAW)80 were employed. Kinetic energy cutoffs of 884 eV for Si and 1224 eV for both SiO2 and oxygen were chosen. In all calculations, the Brillouin zone was sampled using the Monkhorst-Pack scheme81. Different k-point grids were used for each polymorph, including an 8 × 8 × 8 grid for the ordinary phases of Si and an 11 × 11 × 11 grid for SiO2. Only the gamma point was used for oxygen molecules.
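Quantum ESPRESSO input files specify plane-wave cutoffs in Rydberg rather than eV, so the values quoted above need converting before they can be reused. The short sketch below performs that conversion, assuming the quoted values are wavefunction cutoffs (the `ecutwfc` input variable). The helper name `ev_to_ry` and the dictionary labels are illustrative and are not taken from the scripts used in this work.

```python
# Convert the plane-wave cutoffs quoted in the text from eV to Rydberg.
# Assumption: the quoted values are wavefunction cutoffs (`ecutwfc`).
RYDBERG_IN_EV = 13.605693  # 1 Ry expressed in eV (rounded)


def ev_to_ry(energy_ev: float) -> float:
    """Return an energy given in eV expressed in Rydberg."""
    return energy_ev / RYDBERG_IN_EV


# Cutoffs stated in the text (eV); the keys are hypothetical labels.
cutoffs_ev = {"Si": 884.0, "SiO2": 1224.0, "O": 1224.0}
cutoffs_ry = {name: ev_to_ry(value) for name, value in cutoffs_ev.items()}

for name, value in cutoffs_ry.items():
    print(f"{name}: {value:.1f} Ry")
```

The stated cutoffs correspond to roughly 65 Ry for Si and 90 Ry for SiO2 and oxygen.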
ML model: the MTP
In this work, the MTP46 was chosen as the ML model. The MTP is a multi-component potential. In a previous comparative study82, it demonstrated a favorable trade-off between accuracy and computational speed across a range of modeling problems. The model derives its name from its use of a tensorial representation of atomic coordinates, and it uses linear regression to determine the local atomic energy. The total energy of a given atomic configuration is then obtained as the sum of these local atomic energy contributions.
The argument ζi is a tuple ζi = (rij, τi, τj) containing the relative coordinates rij and the atomic types τi, τj. Here, Vlocal is computed approximately, within a sphere of cutoff radius Rc = 5.7 Å, beyond which the central atom no longer feels any interaction. In practice, in the MTP framework, the expansion of the atomic energy Vlocal into basis functions Bβ serves as the foundation for linear regression.
Since the potential energy function Vlocal is smooth, the force acting on an atom k at position rk in a given configuration xq can be calculated by taking the gradient of the total energy.
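To make the energy decomposition and the force definition concrete, the following minimal Python sketch sums per-atom local energies inside the 5.7 Å cutoff and obtains forces as the negative gradient of the total energy, here estimated by central finite differences. The pair term is a toy stand-in and not the MTP basis expansion; all function names and the four-atom cluster are illustrative assumptions.

```python
import math

R_CUT = 5.7  # cutoff radius in angstrom, as stated in the text


def pair_term(r: float) -> float:
    """Toy stand-in for the MTP expansion: smooth and zero at R_CUT."""
    if r >= R_CUT:
        return 0.0
    return 0.5 * (R_CUT - r) ** 2 * (1.0 / r**2 - 0.1)


def local_energy(i: int, positions: list) -> float:
    """Energy assigned to atom i from all other atoms (zero beyond R_CUT)."""
    return sum(
        pair_term(math.dist(positions[i], positions[j]))
        for j in range(len(positions))
        if j != i
    )


def total_energy(positions: list) -> float:
    """Total energy as the sum of per-atom local energies."""
    return sum(local_energy(i, positions) for i in range(len(positions)))


def force_on_atom(k: int, positions: list, h: float = 1e-5) -> list:
    """F_k = -dE/dr_k, estimated by central finite differences."""
    force = []
    for axis in range(3):
        plus = [list(p) for p in positions]
        minus = [list(p) for p in positions]
        plus[k][axis] += h
        minus[k][axis] -= h
        force.append(-(total_energy(plus) - total_energy(minus)) / (2 * h))
    return force


# Hypothetical non-periodic cluster; the last atom lies outside every cutoff.
atoms = [[0.0, 0.0, 0.0], [2.3, 0.1, 0.0], [0.4, 2.5, 0.3], [9.0, 9.0, 9.0]]
forces = [force_on_atom(k, atoms) for k in range(len(atoms))]
```

Because the toy energy depends only on interatomic distances, the forces sum to zero over the cluster, and the atom placed beyond the cutoff of all others experiences no force. In practice, analytical gradients are used instead of finite differences.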
The virial stress within an atomic configuration xq of volume Ω can be expressed as follows.
The functions Bβ in equation 2 are obtained through the contraction of the descriptors. In the MTP model, the descriptors are formed by tensors of atomic coordinates weighted by radial functions. These descriptors consider both the radial distribution and the angular distribution of the neighborhood surrounding each atom. By incorporating information from both the radial and angular aspects, the descriptors capture the local atomic environment in a more comprehensive manner, enabling a more accurate representation of the atomic energy within the MTP framework.
The radial function fμ is further expanded using radial basis functions Q(α), with the fitting parameters \({c}_{\mu ,{\tau }_{i},{\tau }_{j}}^{(\alpha )}\) as expansion coefficients. This expansion allows for a more flexible and accurate representation of the radial dependence of the atomic interactions.
The model parameters \(\Theta =({c}_{\beta },{c}_{\mu ,{\tau }_{i},{\tau }_{j}})\) are determined during the minimization of the cost function as given by Equation (7).
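For reference, the energy, force, descriptor, and radial expansions described in this subsection can be written compactly in the standard MTP notation of refs. 46 and 47. The block below is a sketch under that assumption and is not a verbatim copy of the numbered equations; in particular, the symbol \(M_{\mu ,\nu }\) for the moment tensor descriptors and the unspecified summation ranges are taken from the general MTP literature rather than from this text.

```latex
\begin{align*}
E^{\mathrm{mtp}}(x_q) &= \sum_{i} V^{\mathrm{local}}(\zeta_i), \\
V^{\mathrm{local}}(\zeta_i) &= \sum_{\beta} c_{\beta}\, B_{\beta}(\zeta_i), \\
\mathbf{F}_k(x_q) &= -\nabla_{\mathbf{r}_k} E^{\mathrm{mtp}}(x_q), \\
M_{\mu,\nu}(\zeta_i) &= \sum_{j} f_{\mu}\!\left(|\mathbf{r}_{ij}|, \tau_i, \tau_j\right)\,
  \underbrace{\mathbf{r}_{ij} \otimes \cdots \otimes \mathbf{r}_{ij}}_{\nu\ \text{times}}, \\
f_{\mu}\!\left(|\mathbf{r}_{ij}|, \tau_i, \tau_j\right) &= \sum_{\alpha}
  c^{(\alpha)}_{\mu,\tau_i,\tau_j}\, Q^{(\alpha)}\!\left(|\mathbf{r}_{ij}|\right).
\end{align*}
```

In this notation, the basis functions \(B_{\beta }\) are scalar contractions of the tensors \(M_{\mu ,\nu }\), and the coefficients \(c_{\beta }\) and \(c^{(\alpha )}_{\mu ,\tau_i,\tau_j}\) together form the parameter set Θ.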
Data curation and optimization
We acquired ab initio data using established methods and databases from prior research. The database construction involved two methodologies, namely manual processing24,82,83 and active learning84, as explained in detail in the supporting information. Specifically, we referred to the dataset created for the GAP for silicon24, the comparative study82, and the active learning techniques detailed in ref. 84. For liquid silica, we adopted the temperature range (1000 K–5000 K) of previous databases designed for neural network interatomic potentials (NNIPs), which extends beyond the boiling point of silica, as reported in ref. 36. While NNIPs involve a very large number of adjustable parameters (typically tens of thousands), allowing them to jointly describe a large number of off-equilibrium configurations, accommodating such deviations from equilibrium is challenging within the MTP framework because of its limited number of parameters. Effective MTP training, therefore, relies on carefully selecting the training set. Our final training dataset, herein referred to as the unified training set, was constructed through a two-step process: curation and subsequent optimization.
To enhance the quality of our training set and to properly assess the error on the test set, down-selection was applied. To begin, we sorted our full database into smaller subsets, as elaborated in Tabs. S3 through S6 in the supporting information. Within each subset, we then used a filtering strategy referred to herein as the “train-remove-train” approach. We first train while monitoring for significant reductions in the energy and force errors associated with each configuration as we incrementally raise the MTP level by one unit (we focus on levels 08 to 14 for deformations and defects, while levels 16 to 18 are used for disordered structures). Next, we analyze the error reduction between two or three consecutive levels. If a significant decrease is not observed, we then eliminate configurations based on factors such as:
(1) Total energy: configurations with total energies similar to those of others in the training set yet differing atomic coordinates, as well as those with fluctuating total energies but atomic positions nearly identical to others in the training set, are removed. These configurations originated from the relaxation of molecules, single-point calculations of unrelaxed defects, manually constructed unrelaxed jump paths, strained configurations where lattice parameters or vectors were strained without corresponding adjustments to atomic positions, and embedded dimers.
(2) Contributions from smearing: these can be primarily attributed to the extensive use of high-temperature AIMD simulations. Silica configurations generated from AIMD simulations with more than 36 atoms are excluded if they exhibit smearing contributions to the total energy greater than zero.
(3) Minimum interatomic distances (this criterion complements the total energy criterion): if multiple configurations from the same batch exhibit similar minimum distances, some are removed. This technique was mostly used for oxygen molecules. For example, we check the interatomic distances in the batch of relaxed O3 molecules. We also discard embedded dimer configurations by comparing their minimum interatomic distances with those of the AIMD configurations. If the minimum interatomic distances are equal, we choose the AIMD configuration over the embedded dimer configuration.
(4) Polymorphism: these configurations are derived from deformations following thermal expansion. Most of these configurations have been accurately computed and were included in the training database. However, some polymorphs arising from displacive phase transformations and polymorphs sharing the same lattice system with identical coordination numbers were excluded. In the case of a displacive phase transformation, we retain the parent crystal and exclude the child crystal. When dealing with two polymorphs that share the same lattice system and coordination number, we typically choose one of them.
After removing these undesired configurations, we reinitiate the training process incrementally, following a pattern akin to the first stage. This “train-remove-train” process is iterated until we attain a high level of confidence in the cleanliness of the subset, i.e., all configurations included in the subset are associated with training errors that decrease as the MTP level increases. In the subsequent curation phase, we examine the possibility of extracting an even smaller subset from each previously cleaned subset84. To achieve this, we use the “select-add” command embedded in the MTP code, as described in ref. 47. We applied the “select-add” command to every cleaned subset.
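The monitoring step of the “train-remove-train” loop can be sketched as follows. This is only an illustration of the idea of flagging configurations whose error does not decrease with the MTP level. The function name `non_improving`, the relative threshold `min_drop`, and the error values are hypothetical, and the flagged configurations would still need to be examined against criteria (1) to (4) before removal.

```python
def non_improving(errors_by_level: dict, min_drop: float = 0.05) -> list:
    """Return ids of configurations whose error does not drop enough.

    errors_by_level maps an MTP level to {config_id: error}. A configuration
    is flagged when its error at the highest level is not at least
    `min_drop` (relative) below its error at the lowest level.
    """
    levels = sorted(errors_by_level)
    first = errors_by_level[levels[0]]
    last = errors_by_level[levels[-1]]
    flagged = []
    for config_id, start_error in first.items():
        if start_error == 0.0:
            continue  # already reproduced exactly; nothing to improve
        relative_drop = (start_error - last[config_id]) / start_error
        if relative_drop < min_drop:
            flagged.append(config_id)
    return flagged


# Hypothetical per-configuration force errors (eV/angstrom) at three levels.
errors = {
    8: {"vacancy_01": 0.30, "dimer_07": 0.42, "strain_03": 0.25},
    10: {"vacancy_01": 0.21, "dimer_07": 0.43, "strain_03": 0.18},
    12: {"vacancy_01": 0.15, "dimer_07": 0.41, "strain_03": 0.12},
}
candidates = non_improving(errors)
```

In this made-up example only `dimer_07` is flagged, since its error stays essentially flat while the others roughly halve.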
Training set
In implementing the unified potential, we selected a range of configurations from the optimized and curated database, as outlined in the Data curation and optimization section. The test set was chosen from the same optimized database, without any overlap between the two sets. While the test set encompasses all properties or types of configurations represented in the extensive database, the training set only includes certain configuration types. This strategic approach aims to ensure the portability of the interaction potential. Note that the database contains configurations of substantial size, with up to 1000 atoms. However, we constrained the maximum box size in the training set to 36 atoms, except for the interfaces set, where a few configurations contain 80 atoms. Consequently, the number of atoms in the training set’s boxes ranges from 1 to 80. Conversely, the test set contains configurations of the maximum size found in the database. The specific types of configurations represented in the training set are detailed in Table 5. By restricting the training set to specific configuration types, we can assess the generalizability of our potential by simulating properties not present in the training set. Within the current implementation of the MTP, we do not employ a validation set in the typical manner used to estimate overfitting or underfitting during the training of neural network models. Instead, we opt to utilize an offline test set for this purpose. To gauge the potential’s portability more comprehensively, we perform MD simulations for properties that were not included in the training set.
Training and validation
The cost function, as described by Equation (7), was minimized using the Broyden-Fletcher-Goldfarb-Shanno algorithm, a quasi-Newton optimization method implemented in the MTP framework. Fundamentally, training the MTP model with atomic configurations entails finding the parameter set {Θ} by solving the minimization problem presented in Equation (7). We trained on the refined unified training sets detailed in Tab. S7 (supporting information) and Table 5. First, MTPs with 300 to 1600 parameters, corresponding to levels 18 to 26, were trained on set 1 (Tab. S7 and Fig. S1a) as part of preliminary work, including optimization, training mode, and testing. The preliminary work is also presented in Figs. S5–S8 of the supporting information. At this stage, two training modes were employed: vibration mode and structures weighting mode. Specifically, the potential resulting from the vibration mode was used as input for the structures weighting mode training. The training process was iterated until a desirable level of accuracy was achieved. As outlined in the supporting information, we assessed the validation error using the resulting potentials. We employed two independent validation sets, denoted as Validation 1 and Validation 2, whose atom distributions are shown in the supplementary Fig. S2 and Fig. S3, respectively. Validation 1 was randomly selected concurrently with the training set from the curated database. On the other hand, Validation 2 consisted of the AIMD cooling and equilibration trajectories at 300 K. These validation sets were utilized as part of the preliminary work. For the final implementation, we used level 28. Based on the preliminary work, we opted for the structures weighting mode. The two-step training mode, as applied to the large-size configurations in set 1 (Tab. S7 and Fig. S1a), was deemed unnecessary.
Given that our final training set (Table 5) comprises small cell configurations, we exclusively employed the structures weighting mode. We utilized the resulting potential to conduct both static calculations and MD simulations, with the outcomes presented in the main text.
Here, Eqm, Fqm, and σqm denote the values of energy, force, and stress computed by the quantum mechanical approach (DFT), while Emtp, Fmtp, and σmtp represent the corresponding values obtained from the MTP model. The relative weights we, wf, and wσ indicate the importance of the energy, force, and stress in the optimization procedure.
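A weighted least-squares cost of this kind can be sketched in a few lines of Python. The sketch assumes the common form in which squared energy, force, and stress residuals are summed over configurations with weights we, wf, and wσ; any per-atom normalization or per-structure weighting used in the actual training modes is omitted. The function names, the dictionary keys, and the default weight values are illustrative placeholders.

```python
def squared_norm(a, b) -> float:
    """Sum of squared component-wise differences between two sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mtp_cost(configs, w_e: float = 1.0, w_f: float = 0.01, w_s: float = 0.001) -> float:
    """Weighted sum of squared energy, force, and stress residuals."""
    total = 0.0
    for c in configs:
        total += w_e * (c["e_mtp"] - c["e_qm"]) ** 2
        total += w_f * sum(
            squared_norm(f_mtp, f_qm) for f_mtp, f_qm in zip(c["f_mtp"], c["f_qm"])
        )
        total += w_s * squared_norm(c["s_mtp"], c["s_qm"])
    return total


# One hypothetical single-atom configuration (energy in eV, force in
# eV/angstrom, stress as six Voigt components).
configs = [
    {
        "e_mtp": -10.0, "e_qm": -10.1,
        "f_mtp": [[0.1, 0.0, 0.0]], "f_qm": [[0.0, 0.0, 0.0]],
        "s_mtp": [0.0] * 6, "s_qm": [0.0] * 6,
    }
]
cost = mtp_cost(configs)
```

Minimizing such a function with respect to Θ, for example with a quasi-Newton method such as BFGS, yields the fitted potential.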
Static calculations
This section summarizes the mathematical procedure used to determine static properties presented in the article.
For a chemical compound with the formula XlYmZn, we calculate the cohesive energy as follows:
where \({E}_{{X}_{l}{Y}_{m}{Z}_{n}}\) represents the energy of the supercell of the compound, while EX, EY, and EZ correspond to the energies of the isolated atoms. The subscripts l, m, and n indicate the number of X, Y, and Z atoms present in the building block \({X}_{l}{Y}_{m}{Z}_{n}\) of the material. Because the number of atoms within the primitive cell of each polymorph differs from that of the standard structural configuration, we normalize the cohesive energy by dividing it by the number of atoms present in the regular phase.
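As an illustration, the cohesive energy can be evaluated with the short Python helper below. It assumes the usual convention in which the isolated-atom energies are subtracted from the supercell energy (so bound systems give negative values) and normalizes by the total number of atoms in the supercell; the sign convention and normalization actually used may differ. The function name and all numerical values are hypothetical.

```python
def cohesive_energy_per_atom(e_supercell: float, composition: dict, e_isolated: dict) -> float:
    """Cohesive energy per atom of a supercell.

    composition maps each species to its atom count in the supercell;
    e_isolated maps each species to the energy of one isolated atom.
    """
    n_atoms = sum(composition.values())
    e_atoms = sum(count * e_isolated[species] for species, count in composition.items())
    return (e_supercell - e_atoms) / n_atoms


# Made-up energies (eV) for a Si4O8 cell, for illustration only.
e_coh = cohesive_energy_per_atom(
    e_supercell=-28.0,
    composition={"Si": 4, "O": 8},
    e_isolated={"Si": -1.0, "O": -1.5},
)
```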
Point-defect formation properties, such as the vacancy formation energy (\({E}_{v}^{f}\)), were calculated using the following equation:
For the interstitial formation energy \({E}_{i}^{f}\), we used:
In equations (9) and (10), N0 and \({E}_{{N}_{0}}\) correspond to the number of atoms and total energy of a perfect supercell.
In particular, oxygen vacancies in SiO2 polymorphs were treated in the neutral charge state. Thus, the vacancy formation energy in SiO2 polymorphs was calculated using:
In this equation, Evac and Ebulk represent the energy of the supercell containing the oxygen vacancy and the energy of the bulk supercell, respectively. The chemical potential is defined as half of the energy of a dioxygen molecule (\({\mu }_{O}=1/2\ast {{\rm{E}}}_{{O}_{2}}\)).
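The point-defect formation energies described above can be sketched as follows. The sketch assumes the standard expressions: for an elemental crystal, the energy of the defective supercell minus the perfect-supercell energy rescaled to the same number of atoms; for the neutral oxygen vacancy, Evac − Ebulk + μO with μO taken as half the O2 energy. Function names and the numerical values are illustrative and are not results of this work.

```python
def elemental_defect_formation_energy(
    e_defect: float, n_defect: int, e_perfect: float, n_perfect: int
) -> float:
    """Formation energy of a vacancy (n_defect = N0 - 1) or a
    self-interstitial (n_defect = N0 + 1) in an elemental crystal."""
    return e_defect - (n_defect / n_perfect) * e_perfect


def oxygen_vacancy_formation_energy(e_vac: float, e_bulk: float, e_o2: float) -> float:
    """Neutral oxygen vacancy: E_vac - E_bulk + mu_O, with mu_O = E_O2 / 2."""
    mu_o = 0.5 * e_o2
    return e_vac - e_bulk + mu_o


# Hypothetical 64-atom Si supercell with a bulk energy of -5.4 eV per atom.
e_perfect = -345.6
e_f_vacancy = elemental_defect_formation_energy(-336.6, 63, e_perfect, 64)
e_f_interstitial = elemental_defect_formation_energy(-347.4, 65, e_perfect, 64)

# Hypothetical silica supercell energies (eV).
e_f_oxygen_vacancy = oxygen_vacancy_formation_energy(e_vac=-90.0, e_bulk=-100.0, e_o2=-9.8)
```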
The equilibrium bulk modulus, which corresponds to the curvature of the energy-volume curve at its minimum, was derived from the second-order elastic constants85. We calculate the elastic stiffness constants Cij using a central finite-difference formula:
where \({P}_{i}^{(+{\varepsilon }_{j})}\) is the ith component of the stress tensor when the configuration is strained only by the jth component (εj) of the strain vector (\(\overrightarrow{\varepsilon }\)). After applying a directional or isotropic deformation, the atomic positions are relaxed while the overall box size remains fixed. We compute the generalized stacking fault energy (γ(u)) by incrementally shifting the upper crystal half along the slip direction and assessing the energy difference per unit area (A) of the fault plane:
where Eo represents the energy of the perfect crystal, while E(u) denotes the energy of the supercell with the fault vector u, which is directly proportional to the Burgers vector (b). The surface energy is also calculated using the following expression:
In this context, A refers to the area of the slab, N represents the number of atoms in the slab, while Eslab and Ebulk denote the total energy of the slab and the bulk energy per atom, respectively.
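The three static quantities above reduce to simple finite differences and energy differences per unit area, as the Python sketch below illustrates. It assumes a central difference of the stress for Cij (the overall sign depends on the stress convention of the code used) and a factor of two in the surface energy because a slab exposes two free surfaces. Function names and input values are hypothetical.

```python
EV_PER_A2_TO_J_PER_M2 = 16.02176634  # 1 eV/angstrom^2 expressed in J/m^2


def elastic_constant(stress_plus: float, stress_minus: float, strain: float) -> float:
    """C_ij from the i-th stress component at strains +eps_j and -eps_j."""
    return (stress_plus - stress_minus) / (2.0 * strain)


def gsf_energy(e_faulted: float, e_perfect: float, area: float) -> float:
    """Generalized stacking fault energy: (E(u) - E_0) / A."""
    return (e_faulted - e_perfect) / area


def surface_energy(e_slab: float, n_atoms: int, e_bulk_per_atom: float, area: float) -> float:
    """Surface energy of a slab with two free surfaces of area A each."""
    return (e_slab - n_atoms * e_bulk_per_atom) / (2.0 * area)


# Toy linear stress response with a stiffness of 160 GPa.
eps = 0.005
c11 = elastic_constant(160.0 * eps, 160.0 * (-eps), eps)

# Made-up energies (eV) and areas (angstrom^2).
gamma_gsf = gsf_energy(e_faulted=-100.0, e_perfect=-100.5, area=12.5)
gamma_surf = surface_energy(e_slab=-200.0, n_atoms=40, e_bulk_per_atom=-5.1, area=20.0)
gamma_surf_si = gamma_surf * EV_PER_A2_TO_J_PER_M2  # converted to J/m^2
```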
Si–SiO2 interface construction
Previous reports indicate that defects are commonly observed at the Si–SiO2 interface due to the imperfect matching of the two materials. To avoid considerable lattice mismatch, we proceed as follows. First, we rotate the alpha-quartz structure to achieve a tetragonal configuration. Next, we duplicate both the silicon crystal and the alpha-quartz structure, ensuring that the lattice dimensions perpendicular to the interface direction closely match. This approach enables us to apply only a small strain (<2%) to the lattice vectors before forming the interface. Technically, the lattice mismatch α can be defined as the relative difference in lattice parameters between two crystalline materials, often expressed as a percentage or in terms of the absolute difference in lattice constants along specific crystallographic directions:
Here, the lattice duplication factors are represented by the integers n and m, while L1 and L2 denote the lattice parameters along a given direction. Both symmetric and asymmetric interfaces were constructed for oxygen-terminated and silicon-terminated quartz slabs, incorporating Si (100), Si (110), and Si (111) slab orientations. Our objective is not solely to construct a flawless representation of a naturally occurring or real-world interface, but rather to explore the versatility of the potential. We then estimate the interface energy using:
where A represents the area of the interface, and \({{\rm{n}}}_{Si{O}_{2}}\) and mSi represent the number of formula units of SiO2 and Si in the interface system. ES is the energy of the supercell containing the interface. The terms \({{\rm{E}}}_{Si{O}_{2}}\) and ESi correspond to the energy of silica and silicon per formula unit, respectively. Because the duplicated models are too large for energy computation via DFT, smaller superlattices were also constructed, involving Si (100) interfacing with α-quartz, β-cristobalite, α-cristobalite, β-tridymite, and amorphous silica. This facilitated comparison between results obtained using MTP potentials and those from DFT. Each simulation box contains two distinct interfaces. Initially, these small models were relaxed at 0 K using MTP potentials. Subsequently, MTP-driven MD simulations were performed at temperatures of 300 K, 500 K, 800 K, and 1200 K for 100 ps, and configurations were selected from the trajectories. The energies of these selected configurations, as well as of the relaxed configurations, were then computed using DFT-based single-point energy calculations.
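The search for duplication factors and the interface-energy bookkeeping described above can be sketched in Python as follows. The sketch assumes the mismatch is measured relative to the duplicated first lattice, |n·L1 − m·L2| / (n·L1), and that the excess energy is divided by twice the area because each box contains two interfaces; the exact normalizations used here may differ. The lattice parameters are approximate experimental values used only for illustration, and all function names and energies are hypothetical.

```python
def lattice_mismatch(n: int, l1: float, m: int, l2: float) -> float:
    """Relative mismatch between n repeats of L1 and m repeats of L2."""
    return abs(n * l1 - m * l2) / (n * l1)


def best_duplication(l1: float, l2: float, max_factor: int = 10, tolerance: float = 0.02):
    """Return (n, m, mismatch) with the smallest n * m below the tolerance,
    or None if no pair of factors up to max_factor qualifies."""
    best = None
    for n in range(1, max_factor + 1):
        for m in range(1, max_factor + 1):
            alpha = lattice_mismatch(n, l1, m, l2)
            if alpha < tolerance and (best is None or n * m < best[0] * best[1]):
                best = (n, m, alpha)
    return best


def interface_energy(
    e_supercell: float, n_sio2: int, e_sio2: float, m_si: int, e_si: float,
    area: float, n_interfaces: int = 2,
) -> float:
    """Excess energy of the interface supercell per unit interface area."""
    excess = e_supercell - n_sio2 * e_sio2 - m_si * e_si
    return excess / (n_interfaces * area)


# Approximate lattice parameters in angstrom (Si and alpha-quartz a-axis).
match = best_duplication(l1=5.431, l2=4.913)

# Made-up energies (eV) and area (angstrom^2).
e_int = interface_energy(
    e_supercell=-990.0, n_sio2=20, e_sio2=-24.0, m_si=96, e_si=-5.4, area=60.0
)
```

Choosing the pair with the smallest product n × m keeps the simulation cell as small as possible while staying under the 2% strain limit mentioned in the text.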
MD simulation
AIMD simulations were carried out with Quantum ESPRESSO using the parameters described in the section “Ab initio calculations”. The integration timestep was set to 1 fs for Si and 2 fs for SiO2. The ionic temperature during simulations was controlled using velocity rescaling.
Force-field MD simulations were performed using the large-scale atomic/molecular massively parallel simulator (LAMMPS) software package86. While it is impractical to perfectly replicate the Quantum ESPRESSO MD settings within LAMMPS, we aimed to make them as close as possible. To generate disordered structures, we employed the velocity rescaling thermostat to control the temperature during the simulations. The time integration used the same timesteps as in the AIMD simulations. For studying point-defect diffusion in Si and self-diffusion in SiO2, the Nosé–Hoover thermostat87 was employed. The latter simulations were performed using a timestep of 1 fs, and the damping parameter was set to 100 fs.
Data availability
The Si–O–SiO2 database and the unified potentials can be found at https://gitlab.com/Kazongogit/MTPu.
Code availability
The main codes used for this work are Quantum ESPRESSO (version 6.8), LAMMPS (version 2022), PHONOPY, and MLIP-2 (version 2). They are available at https://www.quantum-espresso.org, https://lammps.sandia.gov, https://phonopy.github.io/phonopy, and https://mlip.skoltech.ru, respectively. Further details can be found in the GitLab repository https://gitlab.com/Kazongogit/MTPu. Additionally, Python scripts were written for data manipulation and processing; most are available on the GitLab repository.
References
Arns, R. G. The other transistor: early history of the metal-oxide semiconductor field-effect transistor. Eng. Sci. Educ. J. 7, 233–240 (1998).
Bauza, D. Thermal oxidation of silicon and si–sio2 interface morphology, structure, and localized states. Handb. Surf. interfaces Mater. 1, 115–216 (2001).
Pasquarello, A., Hybertsen, M. S. & Car, R. Interface structure between silicon and its oxide by first-principles molecular dynamics. Nature 396, 58–60 (1998).
Ganster, P., Tréglia, G., Lancon, F. & Pochet, P. Molecular dynamics simulation of silicon oxidization. Thin Solid Films 518, 2422–2426 (2010).
Ganster, P., Béland, L. K. & Mousseau, N. First stages of silicon oxidation with the activation relaxation technique. Phys. Rev. B 86, 075408 (2012).
Cvitkovich, L. et al. Dynamic modeling of si (100) thermal oxidation: oxidation mechanisms and realistic amorphous interface generation. Appl. Surf. Sci. 610, 155378 (2023).
Salles, N., Richard, N., Mousseau, N. & Hémeryck, A. Strain-driven diffusion process during silicon oxidation investigated by coupling density functional theory and activation relaxation technique. J. Chem. Phys. 147, 054701 (2017).
Takahashi, N., Yamasaki, T. & Kaneta, C. Molecular dynamics simulations on the oxidation of si (100)/sio2 interface: emissions and incorporations of si-related species into the sio2 and substrate. Phys. Status Solidi (b) 251, 2169–2178 (2014).
Kohn, W. & Sham, L. J. Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133 (1965).
Sholl, D. S. & Steckel, J. A. Density functional theory: a practical introduction (John Wiley & Sons, 2022).
Goedecker, S. Linear scaling electronic structure methods. Rev. Mod. Phys. 71, 1085 (1999).
Torrens, I. Interatomic Potentials (Elsevier, 2012).
Pizzagalli, L. Classical atomistic simulations in materials sciences: an introduction. Mater. Sci. 10, 125 (2004).
Van Duin, A. C., Dasgupta, S., Lorant, F. & Goddard, W. A. Reaxff: a reactive force field for hydrocarbons. J. Phys. Chem. A 105, 9396–9409 (2001).
Van Duin, A. C. et al. Reaxffsio reactive force field for silicon and silicon oxide systems. J. Phys. Chem. A 107, 3803–3811 (2003).
Phillpot, S. R. et al. Charge optimized many body (comb) potentials for simulation of nuclear fuel and clad. Comput. Mater. Sci. 148, 231–241 (2018).
Yu, J., Sinnott, S. B. & Phillpot, S. R. Charge optimized many-body potential for the si/sio2 system. Phys. Rev. B 75, 085311 (2007).
Liang, T., Devine, B., Phillpot, S. R. & Sinnott, S. B. Variable charge reactive potential for hydrocarbons to simulate organic-copper interactions. J. Phys. Chem. A 116, 7976–7991 (2012).
Shan, T.-R. et al. Second-generation charge-optimized many-body potential for si/sio2 and amorphous silica. Phys. Rev. B 82, 235302 (2010).
Hine, N. D., Haynes, P. D., Mostofi, A. A., Skylaris, C.-K. & Payne, M. C. Linear-scaling density-functional theory with tens of thousands of atoms: Expanding the scope and scale of calculations with onetep. Comput. Phys. Commun. 180, 1041–1053 (2009).
Bartok, A. P. The Gaussian Approximation Potential: an interatomic potential derived from first principles quantum mechanics (Springer Science & Business Media, 2010).
Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
Behler, J. Four generations of high-dimensional neural network potentials. Chem. Rev. 121, 10037–10072 (2021).
Bartók, A. P., Kermode, J., Bernstein, N. & Csányi, G. Machine learning a general-purpose interatomic potential for silicon. Phys. Rev. X 8, 041048 (2018).
Li, R., Lee, E. & Luo, T. A unified deep neural network potential capable of predicting thermal conductivity of silicon in different phases. Mater. Today Phys. 12, 100181 (2020).
Hu, L., Su, R., Huang, B. & Liu, F. An accurate and transferable machine-learning interatomic potential for silicon. arXiv https://arxiv.org/abs/1901.01638 (2019).
Babaei, H., Guo, R., Hashemi, A. & Lee, S. Machine-learning-based interatomic potential for phonon transport in perfect crystalline si and crystalline si with vacancies. Phys. Rev. Mater. 3, 074603 (2019).
Dragoni, D., Daff, T. D., Csányi, G. & Marzari, N. Achieving dft accuracy with a machine-learning interatomic potential: Thermomechanics and defects in bcc ferromagnetic iron. Phys. Rev. Mater. 2, 013808 (2018).
Szlachta, W. J., Bartók, A. P. & Csányi, G. Accuracy and transferability of gaussian approximation potential models for tungsten. Phys. Rev. B 90, 104108 (2014).
Rowe, P., Deringer, V. L., Gasparotto, P., Csányi, G. & Michaelides, A. An accurate and transferable machine learning potential for carbon. J. Chem. Phys. 153, 034702 (2020).
Smith, J. S. et al. Automated discovery of a robust interatomic potential for aluminum. Nat. Commun. 12, 1257 (2021).
Deringer, V. L., Caro, M. A. & Csányi, G. A general-purpose machine-learning force field for bulk and nanostructured phosphorus. Nat. Commun. 11, 5461 (2020).
Novikov, I. S. & Shapeev, A. V. Improving accuracy of interatomic potentials: more physics or more data? a case study of silica. Mater. Today Commun. 18, 74–80 (2019).
Erhard, L. C., Rohrer, J., Albe, K. & Deringer, V. L. A machine-learned interatomic potential for silica and its relation to empirical models. npj Comput. Mater. 8, 90 (2022).
Li, W. & Ando, Y. Comparison of different machine learning models for the prediction of forces in copper and silicon dioxide. Phys. Chem. Chem. Phys. 20, 30006–30020 (2018).
Balyakin, I., Rempel, S., Ryltsev, R. & Rempel, A. Deep machine learning interatomic potential for liquid silica. Phys. Rev. E 102, 052125 (2020).
Artrith, N., Morawietz, T. & Behler, J. High-dimensional neural-network potentials for multicomponent systems: applications to zinc oxide. Phys. Rev. B 83, 153101 (2011).
Artrith, N. & Urban, A. An implementation of artificial neural-network potentials for atomistic materials simulations: performance for tio2. Comput. Mater. Sci. 114, 135–150 (2016).
Kandy, A. K. A., Rossi, K., Raulin-Foissac, A., Laurens, G. & Lam, J. Comparing transferability in neural network approaches and linear models for machine-learning interaction potentials. Phys. Rev. B 107, 174106 (2023).
Kobayashi, K. et al. Machine learning molecular dynamics simulations toward exploration of high-temperature properties of nuclear fuel materials: case study of thorium dioxide. Sci. Rep. 12, 9808 (2022).
Behler, J. Perspective: Machine learning potentials for atomistic simulations. J. Chem. Phys. 145, 170901 (2016).
Das Sarma, S., Deng, D.-L. & Duan, L.-M. Machine learning meets quantum physics. Phys. Today 72, 48–54 (2019).
Mishin, Y. Machine-learning interatomic potentials for materials science. Acta Mater. 214, 116980 (2021).
Onat, B., Ortner, C. & Kermode, J. R. Sensitivity and dimensionality of atomic environment representations used for machine learning interatomic potentials. J. Chem. Phys. 153, 144106 (2020).
Friederich, P., Häse, F., Proppe, J. & Aspuru-Guzik, A. Machine-learned potentials for next-generation matter simulations. Nat. Mater. 20, 750–761 (2021).
Shapeev, A. V. Moment tensor potentials: a class of systematically improvable interatomic potentials. Multiscale Model. Simul. 14, 1153–1173 (2016).
Novikov, I. S., Gubaev, K., Podryabinkin, E. V. & Shapeev, A. V. The mlip package: moment tensor potentials with mpi and active learning. Mach. Learn Sci. Technol. 2, 025002 (2020).
Van Beest, B., Kramer, G. J. & Van Santen, R. Force fields for silicas and aluminophosphates based on ab initio calculations. Phys. Rev. Lett. 64, 1955 (1990).
Arai, N., Takeda, S. & Kohyama, M. Self-interstitial clustering in crystalline silicon. Phys. Rev. Lett. 78, 4265 (1997).
Henkelman, G., Uberuaga, B. P. & Jónsson, H. A climbing image nudged elastic band method for finding saddle points and minimum energy paths. J. Chem. Phys. 113, 9901–9904 (2000).
Ohbitsu, M. et al. Atomic structures and stability of finite-size extended interstitial defects in silicon: large-scale molecular simulations with a neural-network potential. Scr. Mater. 214, 114650 (2022).
Cheng, Y. et al. Vacancy formation energy and its connection with bonding environment in solid: a high-throughput calculation and machine learning study. Comput. Mater. Sci. 183, 109803 (2020).
Stillinger, F. H. & Weber, T. A. Computer simulation of local order in condensed phases of silicon. Phys. Rev. B 31, 5262 (1985).
Tersoff, J. New empirical model for the structural properties of silicon. Phys. Rev. Lett. 56, 632 (1986).
Tersoff, J. Empirical interatomic potential for silicon with improved elastic properties. Phys. Rev. B 38, 9902 (1988).
Guénolé, J., Godet, J. & Pizzagalli, L. Determination of activation parameters for the core transformation of the screw dislocation in silicon. Model. Simul. Mater. Sci. Eng. 18, 065001 (2010).
Huang, X., Hu, Y.-J. & An, Q. Locking of screw dislocations in silicon due to core structure transformation. J. Phys. Chem. C. 125, 24710–24718 (2021).
Morris, J. R., Wang, C., Ho, K. & Chan, C. T. Melting line of aluminum from simulations of coexisting phases. Phys. Rev. B 49, 3109 (1994).
Alfè, D. & Gillan, M. Exchange-correlation energy and the phase diagram of si. Phys. Rev. B 68, 205212 (2003).
Lide, D. R. CRC Handbook of Chemistry and Physics, vol. 85 (CRC Press, 2004).
Dorner, F., Sukurma, Z., Dellago, C. & Kresse, G. Melting si: beyond density functional theory. Phys. Rev. Lett. 121, 195701 (2018).
Jaccodine, R. Surface energy of germanium and silicon. J. Electrochem. Soc. 110, 524 (1963).
Vashishta, P., Kalia, R. K., Rino, J. P. & Ebbsjö, I. Interaction potential for sio2: a molecular-dynamics study of structural correlations. Phys. Rev. B 41, 12197 (1990).
Munetoh, S., Motooka, T., Moriguchi, K. & Shintani, A. Interatomic potential for si–o systems using tersoff parameterization. Comput. Mater. Sci. 39, 334–339 (2007).
Sundararaman, S., Huang, L., Ispas, S. & Kob, W. New optimization scheme to obtain interaction potentials for oxide glasses. J. Chem. Phys. 148, 194504 (2018).
Fortner, J. & Lannin, J. Radial distribution functions of amorphous silicon. Phys. Rev. B 39, 5527 (1989).
Mozzi, R. & Warren, B. The structure of vitreous silica. J. Appl. Crystallogr. 2, 164–172 (1969).
Grimley, D. I., Wright, A. C. & Sinclair, R. N. Neutron scattering from vitreous silica iv. time-of-flight diffraction. J. Non-Cryst. Solids 119, 49–64 (1990).
Mei, Q., Benmore, C. & Weber, J. Structure of liquid sio2: a measurement by high-energy x-ray diffraction. Phys. Rev. Lett. 98, 057802 (2007).
Carré, A., Ispas, S., Horbach, J. & Kob, W. Developing empirical potentials from ab initio simulations: the case of amorphous silica. Comput. Mater. Sci. 124, 323–334 (2016).
Liu, H., Fu, Z., Li, Y., Sabri, N. F. A. & Bauchy, M. Balance between accuracy and simplicity in empirical forcefields for glass modeling: insights from machine learning. J. Non-Cryst. Solids 515, 133–142 (2019).
Farnan, I. et al. Quantification of the disorder in network-modified silicate glasses. Nature 358, 31–35 (1992).
Da Silva, J., Pinatti, D., Anderson, C. & Rudee, M. A refinement of the structure of vitreous silica. Philos. Mag. J. Theor. Exp. Appl. Phys. 31, 713–717 (1975).
Coombs, P. et al. The nature of the si-o-si bond angle distribution in vitreous silica. Philos. Mag. B 51, L39–L42 (1985).
Kobayashi, K., Nagai, Y., Itakura, M. & Shiga, M. Self-learning hybrid monte carlo method for isothermal–isobaric ensemble: application to liquid silica. J. Chem. Phys. 155, 034106 (2021).
Tucker, M., Keen, D., Dove, M. & Trachenko, K. Refinement of the si–o–si bond angle distribution in vitreous silica. J. Phys. Condens. Matter 17, S67 (2005).
Himpsel, F., McFeely, F., Taleb-Ibrahimi, A., Yarmoff, J. & Hollinger, G. Microscopic structure of the sio2/si interface. Phys. Rev. B 38, 6084 (1988).
Giannozzi, P. et al. Quantum espresso: a modular and open-source software project for quantum simulations of materials. J. Phys. Condens. Matter 21, 395502 (2009).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865 (1996).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953 (1994).
Monkhorst, H. J. & Pack, J. D. Special points for brillouin-zone integrations. Phys. Rev. B 13, 5188 (1976).
Zuo, Y. et al. Performance and cost assessment of machine learning interatomic potentials. J. Phys. Chem. A 124, 731–745 (2020).
Zongo, K., Béland, L. & Ouellet-Plamondon, C. First-principles database for fitting a machine-learning silicon interatomic force field. MRS Adv. 7, 39–47 (2022).
Podryabinkin, E. V. & Shapeev, A. V. Active learning of linearly parametrized interatomic potentials. Comput. Mater. Sci. 140, 171–180 (2017).
Pandit, A. & Bongiorno, A. A first-principles method to calculate fourth-order elastic constants of solid materials. Comput. Phys. Commun. 288, 108751 (2023).
Thompson, A. P. et al. Lammps-a flexible simulation tool for particle-based materials modeling at the atomic, meso, and continuum scales. Comput. Phys. Commun. 271, 108171 (2022).
Evans, D. J. & Holian, B. L. The nose–hoover thermostat. J. Chem. Phys. 83, 4069–4074 (1985).
Brandes, E. A. & Brook, G. Smithells metals reference book (Elsevier, 2013).
McMahon, M., Nelmes, R., Wright, N. & Allan, D. Pressure dependence of the imma phase of silicon. Phys. Rev. B 50, 739 (1994).
Kim, D. Y., Stefanoski, S., Kurakevych, O. O. & Strobel, T. A. Synthesis of an open-framework allotrope of silicon. Nat. Mater. 14, 169–173 (2015).
Adams, G. B., O’Keeffe, M., Demkov, A. A., Sankey, O. F. & Huang, Y.-M. Wide-band-gap Si in open fourfold-coordinated clathrate structures. Phys. Rev. B 49, 8048 (1994).
Levien, L., Prewitt, C. T. & Weidner, D. J. Structure and elastic properties of quartz at pressure. Am. Mineral. 65, 920–930 (1980).
Wright, A. & Lehmann, M. The structure of quartz at 25 and 590 °C determined by neutron diffraction. J. Solid State Chem. 36, 371–380 (1981).
Downs, R. & Palmer, D. The pressure behavior of α cristobalite. Am. Mineral. 79, 9–14 (1994).
Barth, T. Cristobalite structures; ii, low-cristobalite. Am. J. Sci. 24, 97–110 (1932).
Cellai, D., Carpenter, M., Kirkpatrick, R., Salje, E. & Zhang, M. Thermally induced phase transitions in tridymite: an infrared spectroscopy study. Phys. Chem. Miner. 22, 50–60 (1995).
Villars, P. Pearson’s handbook: crystallographic data for intermetallic phases. c1991; 2nd edn. (1985).
Levien, L. & Prewitt, C. T. High-pressure crystal structure and compressibility of coesite. Am. Mineral. 66, 324–333 (1981).
Grocholski, B., Shim, S.-H. & Prakapenka, V. Stability, metastability, and elastic properties of a dense silica polymorph, seifertite. J. Geophys. Res.: Solid Earth 118, 4745–4757 (2013).
Ross, N. L., Shu, J. & Hazen, R. M. High-pressure crystal chemistry of stishovite. Am. Mineral. 75, 739–747 (1990).
Shropshire, J., Keat, P. P. & Vaughan, P. A. The crystal structure of keatite, a new form of silica. Z. Kristallogr. Cryst. Mater. 112, 409–413 (1959).
Miehe, G. et al. Crystal structure of moganite: a new structure type for silica. Eur. J. Mineral. 4, 693–706 (1992).
Díaz-Cabañas, M.-J. & Barrett, P. A. Synthesis and structure of pure SiO2 chabazite: the SiO2 polymorph with the lowest framework density. Chem. Commun. 1881–1882 (1998).
Plévert, J., Kubota, Y., Honda, T., Okubo, T. & Sugi, Y. Gus-1: a mordenite-like molecular sieve with the 12-ring channel of zsm-12. Chem. Commun. 2363–2364 (2000).
Artioli, G., Lamberti, C. & Marra, G. Neutron powder diffraction study of orthorhombic and monoclinic defective silicalite. Acta Crystallogr. Sect. B: Struct. Sci. 56, 2–10 (2000).
McSkimin, H., Andreatch Jr, P. & Thurston, R. Elastic moduli of quartz versus hydrostatic pressure at 25 and −195.8 °C. J. Appl. Phys. 36, 1624–1632 (1965).
Léger, J.-M., Haines, J. & Chateau, C. The high-pressure behaviour of the “moganite” polymorph of sio2. Eur. J. Mineral. 13, 351–359 (2001).
Pabst, W. & Gregorová, E. Elastic properties of silica polymorphs–a review. Ceram.-Silik. 57, 167–184 (2013).
Leardini, L., Quartieri, S., Vezzalini, G., Martucci, A. & Dmitriev, V. Elastic behavior and high pressure-induced phase transition in chabazite: New data from a natural sample from nova scotia. Micropor. Mesopor. Mater. 170, 52–61 (2013).
Durandurdu, M. & Drabold, D. Ab initio simulation of first-order amorphous-to-amorphous phase transition of silicon. Phys. Rev. B 64, 014101 (2001).
Guerette, M. & Huang, L. A simple and convenient set-up for high-temperature brillouin light scattering. J. Phys. D: Appl. Phys. 45, 275302 (2012).
Laaziri, K. et al. High resolution radial distribution function of pure amorphous silicon. Phys. Rev. Lett. 82, 3460 (1999).
Meidanshahi, R. V., Bowden, S. & Goodnick, S. M. Electronic structure and localized states in amorphous si and hydrogenated amorphous si. Phys. Chem. Chem. Phys. 21, 13248–13257 (2019).
Vukcevich, M. A new interpretation of the anomalous properties of vitreous silica. J. Non-Cryst. Solids 11, 25–63 (1972).
Khouchaf, L. et al. Study of the microstructure of amorphous silica nanostructures using high-resolution electron microscopy, electron energy loss spectroscopy, x-ray powder diffraction, and electron pair distribution function. Materials 13, 4393 (2020).
Acknowledgements
We thank the Digital Research Alliance of Canada for its generous allocation of computing resources. Financial support was provided by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Nuclear Waste Management Organization (NWMO), the Association canadienne-française pour l’avancement des sciences (ACFAS), and the Canada Research Chair on Sustainable Multifunctional Construction Materials (CRC-2019-00074).
Author information
Contributions
L.K.B. initiated and coordinated the work, overseeing the research project. K.Z. was responsible for developing the reference database and fitting the unified potential. K.Z. performed all static calculations and MD simulations using DFT, MTP, and semi-classical potentials. All the authors, K.Z., H.S., L.K.B., and C.O.P., contributed to the writing and reviewing of the paper. C.O.P. and L.K.B. supervised K.Z. C.O.P. and L.K.B. provided computing resources.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zongo, K., Sun, H., Ouellet-Plamondon, C. et al. A unified moment tensor potential for silicon, oxygen, and silica. npj Comput Mater 10, 218 (2024). https://doi.org/10.1038/s41524-024-01390-8