A unified moment tensor potential for silicon, oxygen, and silica

Zongo, Karim; Sun, Hao; Ouellet-Plamondon, Claudiane; Béland, Laurent Karim

doi:10.1038/s41524-024-01390-8

Article
Open access
Published: 13 September 2024

npj Computational Materials volume 10, Article number: 218 (2024)

Abstract

Si and its oxides have been extensively explored in theoretical research due to their technological importance. Simultaneously describing interatomic interactions within both Si and SiO₂ without the use of ab initio methods is considered challenging, given the charge transfers involved. Herein, this challenge is overcome by developing a unified machine learning interatomic potential describing the Si/SiO₂/O system, based on the moment tensor potential (MTP) framework. This MTP is trained using a comprehensive database generated using density functional theory (DFT) simulations, encompassing diverse crystal structures, point defects, extended defects, and disordered structures. Extensive testing of the MTP is performed, indicating it can describe static and dynamic features of very diverse Si, O, and SiO₂ atomic structures with a degree of fidelity approaching that of DFT.

Introduction

Si/SiO₂ interfaces are ubiquitous in semiconductor manufacturing, which includes metal-oxide-semiconductor field-effect transistors¹ as well as nanowire and nanodot transistors. The formation of SiO₂ layers involves charge transfer during the oxidation of Si substrates². Additionally, siliceous materials, including clay minerals and cement, which comprise Si, O, and SiO₂ components, are governed by interactions that entail similar charge transfer. Abundant past theoretical research has focused on understanding Si, its oxides, the formation of SiO₂ multilayered structures, early oxidation rates, and amorphization of oxide layers^3,4,5,6,7,8. These studies predominantly relied on electronic density functional theory (DFT)^9,10,11 and nonflexible classical potentials^12,13. However, simulating large systems with multiple components, charge transfer, and hetero-interfaces poses challenges within these frameworks. An ideal modeling approach should explicitly or implicitly capture charge transfer without compromising accuracy or incurring prohibitively large computational costs. Existing charge equilibration potentials like ReaxFF^14,15 and COMB^16,17,18, while capable of describing chemical interactions during MD simulations, tend to have a limited ability to describe mechanical properties of materials^15,17,19 unless special reparametrization is applied.

Recent advances, such as linear-scaling DFT^11,20 and machine learning (ML) force fields, e.g., Gaussian approximation potentials (GAP)²¹ and artificial neural networks^22,23, lift the limitations of traditional methods. ML force fields have demonstrated high accuracy in modeling Si^24,25,26,27 and many other elements^28,29,30,31,32. Similar progress has been made in improving interatomic interaction descriptions in Si oxides^33,34,35,36 and metal oxides^37,38,39,40. However, jointly describing compounds and their constituents using ML force fields presents challenges due to the disjointed configurational space of multi-phase forms and the need to handle charge transfer. ML force fields^41,42,43,44,45 combine a descriptor with a regression procedure to encode geometry and ab initio properties, usually omitting explicit electronic structure. A previous study focusing on modeling SiO₂ using the moment tensor potential (MTP) suggests that incorporating additional reference data is preferable to adding explicit charge equilibration for long-range interactions³³.

The novelty of this article is an MTP^46,47 that jointly describes interatomic interactions in SiO₂ and its constituents (Si and O), enabling the representation of multiple charge states. The developed MTP for Si/O/SiO₂ systems is parameterized using an ab initio database containing diverse crystal structures, point defects, extended defects, and disordered structures. This MTP is then utilized for molecular statics (MS) and molecular dynamics (MD) simulations to investigate crystalline, interfacial, amorphous, and liquid states of Si and SiO₂. These test simulations indicate that the MTP can provide a unified description of these disjoint systems.

Results

Our analysis encompasses both MS and MD simulations. MS results include cohesive energies, lattice constants, elastic constants, defect formation energies, interface relaxation, as well as linear and planar defects. The MD results cover vacancy diffusion coefficients, the melting point, interfaces, as well as the liquid and amorphous structures of Si and SiO₂. The outcomes of these runs are compared against the reference method (DFT) and against those derived from semi-classical potentials.

Cohesive energy, elastic constant, point defects and extended defects

First, the equations of state of various Si and SiO₂ polymorphs are presented as cohesive energy vs lattice parameter. Please see the Methods section for calculation details. Additionally, the cohesive energy values for molecular oxygen, both O₂ and O₃, are provided. As shown in Fig. 1, the MTP replicated the cohesive energies of the reference states with remarkable accuracy. Table 1 compares lattice parameters predicted by the MTP, COMB, and ReaxFF models with benchmark and experimental data. The MTP predictions show excellent agreement with both the benchmark and experimental data.

**Fig. 1: Bond and angle energy of oxygen molecules, and equation of state of silicon and silica polymorphs.**

Table 1 Comparison of lattice constants predicted by MTP, ReaxFF, and COMB models against experimental data and DFT calculations

The second-order elastic constants and bulk modulus are determined using finite differences, as detailed in the Methods section. Table 2 provides the relative root mean square error (RRMSE) on the elastic constants with respect to the DFT benchmark and experimental data. The MTP model demonstrates a lower RRMSE than the ReaxFF and COMB potentials, and it competes closely with the van Beest, Kramer, and van Santen (BKS)⁴⁸ potential. Moreover, other semi-empirical models reported in ref. ²⁴ demonstrate higher errors compared to the predictions made by the MTP model. The bulk modulus values for various silicon and silica polymorphs can also be found in Table 3. The MTP predictions closely align with the reference methods and experimental values, although it is worth noting that the training set did not encompass the deformation of certain polymorphs. Our potential accurately predicts the elastic constants of amorphous silica, even though amorphous configurations were not included in the training set. In our testing of the MTP potential, we have also considered point defects like vacancies, divacancies, and self-interstitials. The RRMSE values for these defects are reported in Table 4. Once again, using the MTP leads to smaller relative errors in comparison to the ReaxFF potential. Since the majority of potentials were not specifically parameterized for the oxygen system alone, our comparison was limited solely to ReaxFF. In our study, we examined a specific case involving the I4 compact cluster⁴⁹ within the Si crystal, which was not included in our training set. All atoms within the cluster are four-fold coordinated, and the cluster contains five-, six-, and seven-membered atomic rings. The bond lengths and bond angles of this cluster were calculated based on relaxed structures obtained from DFT, MTP, SW, ReaxFF and COMB calculations, as illustrated in Fig. 2.
No dangling bonds were observed with any potential except COMB, which failed to reproduce the I4 structure. When analyzing the formation energy of the cluster, the MTP model exhibited a prediction within 14% of the reference value, while the SW and ReaxFF potentials displayed errors reaching up to 27% and 47%, respectively. Among the models assessed, the MTP model exhibited superior agreement with DFT calculations for both bond lengths and bond angles. As this is a perfectly coordinated tetra-interstitial, we also tested 3- and 5-fold coordinated interstitials, namely, di-interstitial, tri-interstitial, and tetra-interstitial, as shown in Fig. 2. While these defects are not included in the training set, the MTP exhibits better agreement with the benchmark than ReaxFF, as detailed in Table 4. The MTP also outperforms ReaxFF in describing the vacancy formation energy in silica polymorphs, as demonstrated in Table 4. Again, no SiO₂ point-defect configurations were incorporated into our training set.
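
The RRMSE metric used in Tables 2 and 4 can be illustrated with a short sketch. The normalization below (root of the summed squared errors divided by the root of the summed squared reference values) is one common convention and is an assumption here, since the exact definition is not spelled out in this section. The numbers are placeholders, not values from the tables.

```python
import math

def rrmse(predicted, reference):
    """Relative RMSE: sqrt(sum((p - r)^2) / sum(r^2)). Assumed convention."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    num = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    den = sum(r ** 2 for r in reference)
    return math.sqrt(num / den)

# Placeholder elastic constants in GPa (hypothetical, for illustration only).
reference = [100.0, 50.0, 80.0]
predicted = [110.0, 45.0, 80.0]
print(f"RRMSE = {100 * rrmse(predicted, reference):.1f}%")
```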

Table 2 Comparison of force-field predictions of elastic constants for silicon crystal and α-quartz

Table 3 Exploring bulk modulus in silicon and silica polymorphs: calculations using experimental lattice parameters input in LAMMPS code

Table 4 Comparative analysis of defect formation energies in silicon crystals and silica polymorphs using DFT, MTP, and ReaxFF results

**Fig. 2: Point defects in silicon crystal.**

The static migration barrier energy of the vacancy was determined using the nudged elastic band (NEB) method⁵⁰. The migration barrier profiles, including the DFT-based profiles as well as the MTP- and SW-based profiles, are depicted in Fig. 3b. The MTP migration barrier profile shows excellent agreement with the DFT reference profile. In contrast, the SW potential does not capture the reference profiles with similar accuracy. Examining the barrier for vacancy migration reveals a relative error in barrier height of 15.0% for MTP, while for SW, it is 73.1%. These results demonstrate that the MTP model can better mimic the reference method when studying point defects within larger systems, as demonstrated in refs. ^28,51,52. The activation barriers for mono-vacancy hopping were further investigated using MD simulations, considering temperatures ranging from 1000 K to 1650 K. The simulation details are provided in the Methods section. The mean square displacements are provided in Supplementary Fig. S4. As reported in Fig. 3a, the activation energy for mono-vacancy hopping is 0.31 eV for MTP and 0.41 eV for SW. These values are close to the static activation barriers computed at 0 K, which indicates that the MTP model behaves in a physically plausible fashion. As observed in Fig. 3a, b, the barrier obtained from MD simulation is 0.32 eV, whereas the static migration barrier stands at 0.31 eV. It is noteworthy that NEB configurations, including ab initio molecular dynamics (AIMD) configurations containing vacancies, were not included in the training set. We investigated extended defects such as generalized stacking faults and dislocations in the Si crystal. Generalized stacking faults are planar defects closely linked to slip. In turn, the behavior of dislocations and their core properties are particularly important for understanding plasticity.
The Methods section provides a detailed description of our model for generalized stacking faults and dislocations, as well as the calculations involved. In Fig. 4a, b, the excess energy per unit area, also known as a γ-line, is presented. The γ-line was calculated using the benchmark method, the MTP model, and the SW⁵³ and Tersoff (TS)^54,55 semi-classical potentials. As shown in Fig. 4, the MTP is in good agreement with the benchmark results. In contrast, SW and TS demonstrate lesser agreement with the benchmark. In Fig. 5, the dislocation core structures are presented. The core structures predicted by the MTP model exhibit a nearly perfect agreement with those predicted by DFT, as reported in ref. ⁵⁶. An important and distinctive feature of our potential is the direct relaxation of the C₁ core structure to the C₂ structure, which is commonly referred to as the double-period reconstruction of the C₁ core. In past studies, the C₂ core structure was often manually reconstructed from the relaxed C₁ core^56,57. Our potential eliminates the need for manual reconstruction by obtaining the C₂ structure directly. To obtain the relaxed structure and energy of the C₁ configuration, a snapshot is selected from the relaxation steps that ultimately lead to the C₂ structure. Our investigation also revealed that the C₂ core is the most stable configuration, a result consistent with previous reports^56,57. In addition to Fig. 5, the core structures are also depicted in Supplementary Fig. S10.
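
As an illustration of how an activation energy for vacancy hopping can be extracted from MD diffusion coefficients, the sketch below fits an Arrhenius law, ln D = ln D₀ − Eₐ/(k_B T), by linear regression. The function name, the synthetic temperatures, and the diffusivities are hypothetical; the actual fitting procedure used for Fig. 3a is not described in this section.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def arrhenius_fit(temperatures, diffusivities):
    """Fit ln D = ln D0 - Ea / (kB T); return (Ea in eV, D0)."""
    inv_t = 1.0 / np.asarray(temperatures, dtype=float)
    ln_d = np.log(np.asarray(diffusivities, dtype=float))
    slope, intercept = np.polyfit(inv_t, ln_d, 1)
    return float(-slope * K_B), float(np.exp(intercept))

# Synthetic data generated from a known barrier (hypothetical numbers).
temps = np.array([1000.0, 1200.0, 1400.0, 1650.0])
true_ea, true_d0 = 0.30, 1.0e-3
d = true_d0 * np.exp(-true_ea / (K_B * temps))
ea, d0 = arrhenius_fit(temps, d)
print(f"Ea = {ea:.3f} eV, D0 = {d0:.2e}")
```

With real MD data, the diffusivities would first be obtained from the slope of the mean square displacement at each temperature.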

**Fig. 3: Diffusion of point defects in silicon crystal.**

**Fig. 4: Planar defect in silicon crystal.**

**Fig. 5: Line defect in silicon crystal.**

Coexistence simulation

We determined the solid-liquid coexistence temperature of silicon using the solid-liquid interface method described in ref. ⁵⁸. Our MTP potential predicts the silicon melting point to be 1485 ± 5 K, which is ~0.5% lower than the benchmark DFT-GGA value of 1492 K⁵⁹. Note that both the MTP potential and the benchmark value are ~12% lower than the experimental melting point of 1687 K, as reported in ref. ⁶⁰. Our database initially lacked solid-liquid interface data, so we integrated a few AIMD configurations gathered around the experimental melting point into our unified training set. Additionally, it is important to acknowledge the influence of the exchange-correlation (XC) functional on melting behavior, as discussed in refs. ^59,61. The capability of the MTP to accurately replicate the melting point of the GGA XC functional reflects its high-quality description of the liquid structure, as elaborated in the following section. For details on our coexistence simulation approach, please refer to Supplementary Fig. S9.

Silicon slab energy

We compare the surface energy predicted by MTP against experimental data and DFT references, as slab data were not included in the training set. Experimental slab energy values for Si(100), Si(110), and Si(111) are reported to be 2.1, 1.5, and 1.2 J/m², respectively⁶², while our DFT values are 2.1, 1.8, and 1.6 J/m². The MTP predicts these surface energies to be 2.0, 1.3, and 1.2 J/m², respectively. While a large discrepancy between MTP and DFT is observed, especially for Si(110) with relative errors of up to 25%, the MTP-predicted surface energy closely matches the experimental slab energy.
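
For reference, a surface energy in J/m² is commonly obtained from a slab calculation as the excess energy over bulk divided by the exposed area. The sketch below assumes the standard expression for a symmetric slab with two free surfaces, γ = (E_slab − N·E_bulk)/(2A); this section does not state the formula the authors used, and the slab numbers are hypothetical.

```python
EV_PER_A2_TO_J_PER_M2 = 16.02176634  # 1 eV/Å² expressed in J/m²

def surface_energy(e_slab, n_atoms, e_bulk_per_atom, area):
    """Surface energy in J/m² of a symmetric slab with two free surfaces.

    e_slab and e_bulk_per_atom are in eV; area (one face) is in Å².
    """
    gamma_ev_a2 = (e_slab - n_atoms * e_bulk_per_atom) / (2.0 * area)
    return gamma_ev_a2 * EV_PER_A2_TO_J_PER_M2

# Hypothetical slab: 16 atoms, 29.5 Å² per face, about 4.6 eV of excess energy.
print(f"{surface_energy(-82.2, 16, -5.425, 29.5):.2f} J/m^2")
```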

Disordered structures

We also considered an extensive set of disordered Si and SiO₂ structures. We compared structures generated using ab initio MD, MTP-MD, and semi-empirical MD. All three cases were subjected to identical MD simulation conditions, except for amorphous silica, where the MTP simulation time was shorter compared to the other potentials. For comparison, we have chosen the Vashishta (VA)⁶³, Munetoh (TS)⁶⁴, BKS⁴⁸, and Sundararaman (SHK1 and SHK2)⁶⁵ models. The details regarding the ab initio and classical MD simulations are provided in the Methods section. To analyze the disordered structures, we utilized both the pair correlation functions and the bond angle distribution functions.
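
A pair correlation function of the kind used in this analysis can be computed from a single snapshot as sketched below. This is a minimal single-species version for a cubic periodic box, written for illustration; the function name and the simple-cubic test lattice are assumptions and do not correspond to any structure from the article. In practice g(r) would be averaged over many MD frames and resolved by species pair.

```python
import numpy as np

def radial_distribution(positions, box_length, r_max, n_bins):
    """g(r) for a single-species cubic periodic box (minimum-image convention)."""
    pos = np.asarray(positions, dtype=float)
    n = len(pos)
    if r_max > box_length / 2:
        raise ValueError("r_max must not exceed half the box length")
    # All pair separation vectors, wrapped to the nearest periodic image.
    delta = pos[:, None, :] - pos[None, :, :]
    delta -= box_length * np.round(delta / box_length)
    dist = np.linalg.norm(delta, axis=-1)[np.triu_indices(n, k=1)]
    counts, edges = np.histogram(dist, bins=n_bins, range=(0.0, r_max))
    # Normalize by the pair count expected for an ideal gas at the same density.
    shell_vol = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    ideal = 0.5 * n * (n / box_length ** 3) * shell_vol
    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, counts / ideal

# Sanity check on a simple cubic lattice (not a Si structure): 4x4x4 sites, spacing 1.
grid = np.arange(4, dtype=float)
lattice = np.array([[x, y, z] for x in grid for y in grid for z in grid])
r, g = radial_distribution(lattice, box_length=4.0, r_max=1.6, n_bins=20)
first_peak = r[np.nonzero(g)[0][0]]
print("first-neighbor peak near r =", first_peak)
```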

Liquid and amorphous silicon

As observed from the radial distribution function shown in Fig. 6a, the MTP describes the structure of liquid Si with high precision. In contrast, the other potentials lead to a shifted position of the first-neighbor peak compared to the results obtained from the reference (DFT). Additionally, there is an overestimation of the peak height, primarily observed with the EDIP and Tersoff potentials. The angular distribution function (ADF) in Fig. 6b also demonstrates excellent agreement between the MTP model and DFT data. When considering amorphous Si, the MTP model accurately describes the structural features in agreement with experimental data. Conversely, semi-classical potentials like SW and Tersoff fail to replicate the experimental radial distribution profile. As shown in Fig. 6c, experimental g(r) and MTP-MD lead to nearly identical first-neighbor peaks (2.36 Å) and second-neighbor peaks (3.89 Å). Additionally, the MTP model exhibits better agreement with the experimental bond angle distribution centered around 108.6°⁶⁶. Although the bond angle distributions of semi-empirical models like SW and EDIP are closer to the experimental values, they exhibit angle distributions below 90°, as shown in Fig. 6d. Note that the ab initio cooling and equilibration trajectory was not included in the training set of the MTP model. This suggests the MTP is fairly general, which can be attributed to the fact that the training dataset encompasses a variety of configurations within the disordered Si systems. Overall, the MTP model demonstrates a level of accuracy comparable to that of DFT and experiment when describing the structural features and bonding characteristics of disordered Si.

Liquid and amorphous silica

Figure 7 illustrates results pertaining to liquid SiO₂. It includes pair distribution functions (PDF) and ADF. The MTP is in better agreement with DFT than the other potentials. We notice both quantitative and qualitative differences between the semi-empirical potentials and DFT, except for the Si–O pair correlation function. In this case, the semi-empirical potentials only overestimate the height of the first peak, which is located around 1.62 Å^67,68,69. This value is consistent with the experimental Si–O bond length observed in liquid SiO₂, indicating a strong chemical interaction between the Si–O pairs. Both the O-O and Si-Si pair correlation functions exhibit a shift in the first peak as given by the other models, while there is a strong quantitative and qualitative match between the MTP-based and the DFT-based structures. Furthermore, the BKS-based Si–O–Si ADF, along with those of other semi-empirical models, does not match the DFT-based ADF. Conversely, the MTP-based ADF demonstrates a good match. At 3600 K, using our 96-atom simulation box model, the average Si–O–Si angle between two SiO₄ tetrahedra is determined as follows: DFT, 134.5°; MTP, 131.4°; BKS, 146.0°; TS, 143.6°; VA, 141.8°; SHK1, 142.6°; and SHK2, 140.4°. Our DFT value aligns closely with literature values of approximately 136.0°⁷⁰ and 135.0°⁷¹. It is evident that the MTP closely resembles the reference method, whereas other models align with experimental values for the Si–O–Si angle in amorphous silica, ranging between 140° and 152°^72,73,74. This likely stems from the semi-empirical models being fitted with consideration of experimental properties. Most of them were optimized based on mixed ab initio-experimental data. We then varied the temperature of liquid silica from 2500 K to 3500 K using a 648-atom box and recorded the Si–O–Si angle, as depicted in Fig. 7d.
We have found that the Si–O–Si angle in liquid silica changes with temperature, consistent with the findings reported in ref. ⁷⁵. As the temperature decreases, the angle between tetrahedral interconnections increases, contributing to network relaxation. It is likely that the relaxation of the network at room temperature upon cooling is primarily attributed to variations in bond lengths and angles, given that the network structure of silica liquid does not qualitatively change between 3500 K and 300 K.

Furthermore, ab initio MD, MTP-MD, BKS-MD and other models lead to a within-tetrahedra O-Si–O bond angle distribution centered around 109°, which is nearly equal to the experimental bond angle^67,68,69. However, the BKS and VA potentials overestimate the average probability at 109°, while the TS potential underestimates this probability. The SHK1, SHK2, and MTP potentials match the benchmark. The DFT, MTP, and all other models except TS lead to very similar O-O-O ADFs. However, when it comes to the Si-Si-Si angle, qualitative and quantitative discrepancies are observed between BKS-generated structures and DFT-generated structures, while MTP-generated structures match the benchmark. There is also a notable discrepancy between the structure predicted by other models and that of the benchmark.
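
The angles entering these distributions are computed from triplets of atomic positions. The sketch below is a minimal illustration (the function name is hypothetical and periodic images are not handled): for an O-Si–O angle the central atom is Si, and for a Si–O–Si angle it is O. The example evaluates the ideal tetrahedral angle, arccos(−1/3) ≈ 109.47°, which is the geometric origin of the ~109° peak.

```python
import numpy as np

def bond_angle(center, neighbor_a, neighbor_b):
    """Angle a-center-b in degrees from three Cartesian positions."""
    va = np.asarray(neighbor_a, float) - np.asarray(center, float)
    vb = np.asarray(neighbor_b, float) - np.asarray(center, float)
    cos_theta = np.dot(va, vb) / (np.linalg.norm(va) * np.linalg.norm(vb))
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

# Ideal tetrahedron: a central atom with neighbors along (1,1,1) and (1,-1,-1).
print(round(bond_angle([0, 0, 0], [1, 1, 1], [1, -1, -1]), 2))  # 109.47
```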

The partial PDFs of vitreous SiO₂ are illustrated in Fig. 8a–c. The comparison is made against the PDF computed from experimental data using the reverse Monte Carlo (RMC) method⁷⁶. The MTP potential exhibits qualitative agreement with RMC data, as our PDF profiles match those obtained from RMC. For example, only the MTP accurately reproduces the RMC profiles for Si-Si interactions, as the second peak around 5 Å is not reproduced by the other models, including the BKS model. Additionally, while the other models fail to reproduce the height of the Si–O RMC PDF, the MTP potential shows a good match. To further analyze the structure, we also computed the most important ADFs, as well as the Si–O bond length distribution function. For O-Si–O (refer to Fig. 8c), all the profiles are similar but vary in height, centered around the experimental value of 109.0°. When it comes to the Si–O–Si angle (Fig. 8d), both the MTP and BKS show similar profiles centered between the experimental values of 144° and 152°. The average Si–O–Si angle is 150° for BKS and 145.5° for the MTP. As the average value for the same angle predicted by MTP in liquid silica at 3600 K is ~131.4°, this confirms that the Si–O–Si angle varies in liquid silica. Across all the potentials, the Si–O bond length distributions are centered between 1.60 and 1.66 Å. The average bond length for the MTP potential is 1.63 Å, which is close to the experimental value of 1.62 Å. Note that these structural properties may vary slightly depending on the employed cooling rate. Despite the high cooling rate, no major coordination defects were observed. This indicates that the configuration has been well equilibrated at 3000 K, resulting in the establishment of a strong network.
Note that the ab initio MD trajectory, which encompasses both the cooling and equilibration stages of the amorphous structure preparation, was not included in the MTP training set, which is an indicator of the MTP’s generalization capability. Overall, the MTP potential demonstrates a remarkable improvement in accurately describing the structure of disordered SiO₂ compared to well-established potentials such as the BKS potential and others. The MTP model captures the essential structural features of the disordered systems with greater precision, resulting in a better agreement with experimental observations^67,68,69,76.

Phonon dispersion

We calculated the phonon dispersion of c-Si and α-quartz, as illustrated in Fig. 9a, b, respectively. The MTP model exhibits very good agreement with the reference method (DFT), except at the higher frequencies for α-quartz. Once again, the corresponding frozen phonon configurations were not explicitly included in the training set.

Si–SiO₂ interface

The Si–SiO₂ interface, a cornerstone in semiconductor physics and material science, plays a fundamental role in device fabrication and significantly impacts device performance. Exploring heterostructures involving Si–SiO₂ interfaces opens avenues for novel functionalities and applications in microelectronics and beyond. Given its importance, capturing the structure and dynamics of the Si–SiO₂ interface is paramount for a potential model of the Si–O system. To achieve this, we employed various models, encompassing crystalline Si slabs with different orientations, such as Si(100), Si(110), and Si(111). Our approach involved utilizing α-quartz, β-cristobalite, and other polymorphs in constructing the Si–SiO₂ interface. For a detailed explanation of our construction scheme, please refer to the Methods section. To assess the suitability of the potential for modeling the Si/SiO₂ interface, we examined interfaces using both O-terminated SiO₂ slabs and Si-terminated slabs. For each Si/SiO₂ crystalline interface configuration, we perform force and energy minimization, allowing the positions of atoms and the simulation box size to change simultaneously. Following geometry optimization, MD simulations are conducted for 50 ps at 300 K in the NPT ensemble. First, our potential stabilized the majority of the interfaces in both static and dynamic runs, with the extent of stabilization depending on the orientation of the Si slab and the termination of the quartz slab. For Si(110) in contact with either a Si- or O-terminated quartz slab, our potential successfully stabilizes and describes the dynamics of the resulting interfaces, whether symmetric or non-symmetric. However, our potential only successfully describes symmetric interfaces for Si(100) and Si(111), which are built from quartz slabs terminated by Si.
These combinations of termination and orientation were not considered in the training set, showcasing the remarkable generalization ability of our interaction potential. It is worth noting that defects, such as silicon dangling bonds or over-coordinated oxygen atoms, are observed at the interface following both minimization and dynamic runs, as depicted in Fig. 10. As noted in ref. ⁷⁷, the presence of the dangling bonds is a natural occurrence and constitutes a typical aspect of interface defects. Such anomalies are commonplace and anticipated in these interfaces, owing to the inherent lattice mismatch between the involved materials. Usually, interfacial defects are passivated or special construction schemes are adopted to eliminate them. However, our study does not aim to create defect-free interfaces. Instead, our goal is to evaluate the potential’s capability to manage complex heterostructures with varying bonding types not encountered during the training process. To further validate our potential for silicon and silica interfaces, we compare the interfacial energies of small models computed using MTP and DFT. Detailed descriptions of these small models are provided in the Methods section. Our results (shown in Fig. 11) exhibit good correlation between DFT and MTP models. Importantly, configurations generated by the MTP potential through relaxation and MD simulations converged easily, typically requiring fewer than 50 iterations of single-point energy calculations of DFT. This highlights the reliability of interfacial configurations generated by the MTP potential. These findings demonstrate the capability of MTP potentials to effectively investigate heterosystems containing Si–SiO₂ interfaces. The relaxed small models of the Si–SiO₂ interfaces are presented in the supporting information (Figs. S11 and S12).

**Fig. 10: Interface structure of Si–SiO₂.**

**Fig. 11: Interface energy of Si–SiO₂.**

Discussion

In this work, we have successfully parametrized an ML potential that can implicitly capture and describe different charge states. Additionally, this ML potential has the remarkable ability to describe disjoint zones and hetero-zones of the configurational space of SiO₂ and its constituent elements, Si and oxygen. The potential's description of various phenomena, including point defects, diffusion in Si crystals, extended defects, the liquid phase of Si and SiO₂, and the amorphous phase of Si and SiO₂, either rivals or outperforms existing potentials. The potential exhibits very good agreement with experimental data in challenging configurational zones, such as the amorphous state, even though these configurations were not included in the training data. In many scenarios, such as disordered phases (liquid, amorphous), the potential achieved a near-perfect match with the reference method in terms of accuracy, using exactly the same (very short) simulation time and conditions. Even with longer simulation times and larger systems, the semi-empirical models do not reach a level of accuracy similar to the reference method. Furthermore, the potential displays an intriguing capability regarding dislocation behavior. It autonomously transitioned the C₁ structure to the C₂ structure without the need for manual reconstruction. While there is a growing consensus that ML potentials can effectively serve as surrogates for DFT in terms of accuracy and speed, generalization including charge state modeling remains a challenging task. This study provides evidence that reaction coordinates are sufficient to implicitly capture charge transfer or charge states involved in chemical reactions. Aside from potentials that explicitly consider charge transfer, separate ML potentials are typically developed for each individual chemical element or compound whose database can be constructed by DFT.
Our approach suggests this is not necessary; compounds and their elemental constituents can be trained jointly. Indeed, joint parametrization, where parameters are derived simultaneously for silicon, silica and oxygen, offers several advantages. By employing a single potential to describe the interactions between atoms in both silicon and silica, the computational model becomes more streamlined and easier to manage. The unified potential can save time and resources by avoiding the need to recalibrate parameters of the model for each of the materials involved. The unification idea is also important for some areas of application, such as interfacial modeling for electronic devices, energy storage and conversion, and surface coatings and tribology. Here, the reference data for the ML potential must include both the individual materials in contact as well as the boundary region. In addition, as chemical reactions can occur in MD simulation, good modeling of a multi-component material or complex system under certain conditions requires joint parametrization of the considered system and its constituents. For instance, oxygen aggregation in high-temperature MD simulations was observed in ref. ³⁴ (supporting information), which led those researchers to include oxygen molecules in the training set. Our preliminary study pertained to a semiconductor and its oxide. However, whether this approach can be generalized to other elements and mixtures, including multi-component alloys and compounds, remains to be seen. To ensure a well-implemented unified interaction potential, several other aspects need to be explored in the future.
Given that the compound and its constituents are not located within the same zone of the configurational space, achieving an accurate, efficient, low-cost, and general unified potential for both the material and its oxides with limited data may require adjustments to the underlying mathematical model, the fitting procedure, and the database sampling methods (including active learning). These adjustments could help attain the same level of accuracy at a more affordable computational cost, resembling a feature of potentials parameterized for a single compound. Likewise, while this study achieved a joint description of Si and SiO₂ using the MTP framework, it is likely that other currently developed ML potential frameworks would have led to a similar result. In conjunction with these questions, our aim in the future is to extend this work by incorporating the element of hydrogen to model silica gels.

Methods

Ab initio calculations

The database was constructed using DFT, as implemented in the Quantum ESPRESSO⁷⁸ package. The exchange-correlation potential was treated using the generalized gradient approximation of Perdew-Burke-Ernzerhof (GGA-PBE)⁷⁹. Projector augmented waves (PAW)⁸⁰ were employed. Kinetic energy cutoffs of 884 eV for Si and 1224 eV for both SiO₂ and oxygen were chosen. In all calculations, the Brillouin zone was sampled using the Monkhorst-Pack grid⁸¹ scheme. Different k-points were used for each polymorph, including an 8 × 8 × 8 for the ordinary phases of Si and an 11 × 11 × 11 for SiO₂. The gamma point was used for oxygen molecules.
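
Quantum ESPRESSO input files take plane-wave cutoffs in Rydberg rather than eV. The short conversion below shows that the quoted cutoffs correspond to roughly 65 Ry and 90 Ry; that the authors specified exactly these round values is an inference from the numbers, not something stated in the text.

```python
RYDBERG_EV = 13.605693123  # 1 Ry in eV (CODATA 2018)

def ev_to_ry(energy_ev):
    """Convert an energy from eV to Rydberg."""
    return energy_ev / RYDBERG_EV

for label, cutoff_ev in [("Si", 884.0), ("SiO2 and O", 1224.0)]:
    print(f"{label}: {cutoff_ev:.0f} eV = {ev_to_ry(cutoff_ev):.1f} Ry")
```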

ML model: the MTP

In this work, the MTP⁴⁶ was chosen as the ML model. The MTP is a multi-component potential. In a previous comparative study⁸², it demonstrated a favorable trade-off between accuracy and computational speed across a range of modeling problems. The model derives its name from its tensorial representation of atomic coordinates, and it uses linear regression to determine the local atomic energies. The total energy of a specific atomic configuration is then the sum of these individual atomic energy contributions.

$${E}_{Total}=\mathop{\sum }\limits_{i=1}^{n}{E}_{i}=\mathop{\sum }\limits_{i=1}^{n}{V}_{local}({\zeta }_{i})$$

(1)

The argument ζ_i is a tuple ζ_i = (r_ij, τ_i, τ_j) containing the relative coordinates r_ij and the atomic types τ_i, τ_j. Here, V_local is evaluated within a sphere of cutoff radius R_c = 5.7 Å, beyond which the central atom no longer feels any interaction. In practice, the MTP framework expands the atomic energy V_local in basis functions B_β, and this expansion serves as the foundation for linear regression.

$${V}_{local}=\sum _{\beta }{c}_{\beta }{B}_{\beta }$$

(2)
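Equations (1) and (2) can be illustrated with a minimal numerical sketch. It assumes the basis functions B_β have already been evaluated for every atom; the array values and the function name `total_energy` are hypothetical and serve only to show the linear structure.

```python
import numpy as np


def total_energy(basis_values: np.ndarray, coeffs: np.ndarray) -> float:
    """E_total = sum_i sum_beta c_beta * B_beta(zeta_i).

    basis_values has shape (n_atoms, n_basis); row i holds B_beta for atom i.
    """
    local_energies = basis_values @ coeffs  # V_local of every atom, Eq. (2)
    return float(local_energies.sum())      # sum over atoms, Eq. (1)


# Hypothetical toy numbers: 3 atoms, 2 basis functions.
B = np.array([[1.0, 0.5],
              [0.8, 0.2],
              [1.2, 0.1]])
c = np.array([-2.0, 0.4])
E = total_energy(B, c)
print(E)
```

Because the energy is linear in the coefficients c_β, fitting them for fixed basis values reduces to a linear least-squares problem.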

Since the potential energy function V_local is smooth, the force acting on an atom k at position r_k in a given configuration x_q can be calculated as the negative gradient of the total energy with respect to r_k.

$${F}_{k}({x}_{q})=-{\nabla }_{{r}_{k}}{E}_{Total}({x}_{q})=-\sum _{i}\frac{\partial {V}_{local}({\zeta }_{i})}{\partial {r}_{k}({x}_{q})}$$

(3)
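The relation between energy and force in Equation (3) can be checked numerically. The sketch below does not use an MTP; it substitutes a hypothetical harmonic pair energy as a stand-in for the total energy and compares its analytic forces with a central finite difference of the energy.

```python
import numpy as np


def toy_energy(pos, k=1.0, r0=1.0):
    """Hypothetical stand-in energy: harmonic springs between all pairs."""
    energy = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            d = np.linalg.norm(pos[j] - pos[i])
            energy += 0.5 * k * (d - r0) ** 2
    return energy


def analytic_forces(pos, k=1.0, r0=1.0):
    """F_i = -dE/dr_i for the toy energy above."""
    forces = np.zeros_like(pos)
    for i in range(len(pos)):
        for j in range(len(pos)):
            if i == j:
                continue
            rij = pos[j] - pos[i]
            d = np.linalg.norm(rij)
            forces[i] += k * (d - r0) * rij / d
    return forces


def numerical_forces(energy_fn, pos, h=1e-5):
    """Central finite difference of the energy with respect to each coordinate."""
    forces = np.zeros_like(pos)
    for atom in range(pos.shape[0]):
        for axis in range(pos.shape[1]):
            plus, minus = pos.copy(), pos.copy()
            plus[atom, axis] += h
            minus[atom, axis] -= h
            forces[atom, axis] = -(energy_fn(plus) - energy_fn(minus)) / (2 * h)
    return forces


positions = np.array([[0.0, 0.0, 0.0],
                      [1.3, 0.0, 0.0],
                      [0.2, 0.9, 0.4]])
f_exact = analytic_forces(positions)
f_num = numerical_forces(toy_energy, positions)
print(np.max(np.abs(f_exact - f_num)))
```

The same finite-difference check can be applied to any differentiable potential to verify that its forces are consistent with its energies.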

The virial stress within an atomic configuration x_q of volume Ω can be expressed as follows.

$${\sigma }_{ij}({x}_{q})=\frac{1}{2{{\Omega }}}\sum _{k\in {{\Omega }}}\sum _{l\in {{\Omega }}}({x}_{i}^{(l)}-{x}_{i}^{(k)}){F}_{j}^{(kl)}$$

(4)
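As an illustration of Equation (4), the following sketch evaluates the double sum for a finite set of atoms with given pair forces. It is a simplified example: periodic images are ignored, the array layout `pair_forces[k, l]` for F^(kl) is an assumption, and the numbers are hypothetical.

```python
import numpy as np


def virial_stress(positions, pair_forces, volume):
    """sigma_ij = 1/(2*Omega) * sum_k sum_l (x_i^(l) - x_i^(k)) * F_j^(kl).

    positions:   (n, 3) array of atomic coordinates
    pair_forces: (n, n, 3) array, pair_forces[k, l] holds F^(kl)
    """
    # separation[k, l] = x^(l) - x^(k)
    separation = positions[None, :, :] - positions[:, None, :]
    return np.einsum("kli,klj->ij", separation, pair_forces) / (2.0 * volume)


# Hypothetical two-atom example: separation 2 along x, pair force 0.5 along x.
pos = np.array([[0.0, 0.0, 0.0],
                [2.0, 0.0, 0.0]])
forces = np.zeros((2, 2, 3))
forces[0, 1] = [0.5, 0.0, 0.0]
forces[1, 0] = [-0.5, 0.0, 0.0]  # antisymmetric: F^(lk) = -F^(kl)
sigma = virial_stress(pos, forces, volume=10.0)
print(sigma)
```

Each pair is visited twice in the double sum, which is why the prefactor contains the factor 1/2.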

The functions B_β in Equation (2) are obtained by contracting the descriptors. In the MTP model, the descriptors are tensors of atomic coordinates weighted by radial functions. Because they encode both the radial and the angular distribution of the neighborhood surrounding each atom, they capture the local atomic environment comprehensively, enabling an accurate representation of the atomic energy within the MTP framework.

$${M}_{\mu ,\nu }({r}_{ij},{\tau }_{i},{\tau }_{j})=\sum _{j}{f}_{\mu }(| {r}_{ij}| ,{\tau }_{i},{\tau }_{j})\,\underbrace{{r}_{ij}\otimes \cdots \otimes {r}_{ij}}_{\nu \,{\rm{times}}}$$

(5)
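To make Equation (5) concrete, the sketch below builds the descriptors for ν = 0, 1, 2 from a list of neighbor vectors. It is a simplified illustration for a single atomic species: the radial function is passed in as an arbitrary callable, and the neighbor vectors and the choice f(r) = 1/r are hypothetical.

```python
import numpy as np


def moment_tensor(neighbor_vectors, radial_fn, nu):
    """M_nu = sum_j f(|r_ij|) * (r_ij x ... x r_ij), with nu tensor factors."""
    result = np.zeros((3,) * nu)
    for r in neighbor_vectors:
        outer = np.array(1.0)
        for _ in range(nu):
            outer = np.multiply.outer(outer, r)  # add one tensor index
        result = result + radial_fn(np.linalg.norm(r)) * outer
    return result


# Hypothetical neighborhood and radial function.
neighbors = np.array([[1.0, 0.0, 0.0],
                      [0.0, 2.0, 0.0]])
f = lambda d: 1.0 / d

M0 = moment_tensor(neighbors, f, 0)  # scalar: purely radial information
M1 = moment_tensor(neighbors, f, 1)  # vector
M2 = moment_tensor(neighbors, f, 2)  # 3x3 matrix: angular information

# Contracting all indices yields rotation-invariant scalars.
invariant_1 = float(M1 @ M1)
invariant_2 = float(np.sum(M2 * M2))
print(float(M0), invariant_1, invariant_2)
```

Full contractions such as `M1 @ M1` do not change when the whole neighborhood is rotated, which is the property the basis functions B_β rely on.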

The radial function f_μ is further expanded using radial basis functions Q^(α), with the fitting parameters ${c}_{\mu ,{\tau }_{i},{\tau }_{j}}^{(\alpha )}$ as expansion coefficients. This expansion allows for a more flexible and accurate representation of the radial dependence of the atomic interactions.

$${f}_{\mu }(| {r}_{ij}| ,{\tau }_{i},{\tau }_{j})=\sum _{\alpha }{c}_{\mu ,{\tau }_{i},{\tau }_{j}}^{(\alpha )}{Q}^{(\alpha )}(| {r}_{ij}| )$$

(6)
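The expansion in Equation (6) can be sketched as follows. The text above does not specify the form of Q^(α); this example assumes Chebyshev polynomials multiplied by a smooth cutoff factor (R_c − r)², a choice commonly described in the MTP literature. The value of `R_MIN` and the coefficients are hypothetical.

```python
import numpy as np
from numpy.polynomial import chebyshev

R_CUT = 5.7  # cutoff radius in Angstrom, as stated in the text
R_MIN = 1.0  # hypothetical lower bound of the radial interval


def radial_basis(r, n_basis):
    """Assumed form: Q^(alpha)(r) = T_alpha(xi) * (R_CUT - r)^2 for r < R_CUT."""
    r = np.atleast_1d(np.asarray(r, dtype=float))
    # Map [R_MIN, R_CUT] onto [-1, 1], the natural Chebyshev interval.
    xi = (2.0 * r - (R_MIN + R_CUT)) / (R_CUT - R_MIN)
    smooth = np.where(r < R_CUT, (R_CUT - r) ** 2, 0.0)
    cheb = chebyshev.chebvander(xi, n_basis - 1)  # shape (len(r), n_basis)
    return cheb * smooth[:, None]


def radial_function(r, coeffs):
    """f(r) = sum_alpha c^(alpha) * Q^(alpha)(r) for one (mu, tau_i, tau_j)."""
    coeffs = np.asarray(coeffs, dtype=float)
    return radial_basis(r, len(coeffs)) @ coeffs


# With only the alpha = 0 term, f(r) reduces to (R_CUT - r)^2.
f_values = radial_function([4.7, 5.7, 6.0], [1.0, 0.0, 0.0])
print(f_values)
```

The cutoff factor makes every radial function vanish smoothly at R_c, consistent with the statement that atoms beyond the cutoff do not interact with the central atom.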

The model parameters $\Theta =({c}_{\beta },{c}_{\mu ,{\tau }_{i},{\tau }_{j}})$ are determined during the minimization of the cost function as given by Equation (7).

Data curation and optimization

We acquired ab initio data using established methods and databases from prior research. The database construction involved two methodologies, namely manual processing^24,82,83 and active learning⁸⁴, as explained in detail in the supporting information. Specifically, we referred to the dataset created for the GAP for silicon²⁴, the comparative study⁸², and the active learning techniques detailed in ref. ⁸⁴. For liquid silica, we adopted the temperature range (1000 K–5000 K) of previous databases designed for neural network interatomic potentials (NNIP), which extends beyond the boiling point of silica, as referenced in ref. ³⁶. NNIPs involve a very large number of adjustable parameters (typically tens of thousands), which allows them to jointly describe a large number of off-equilibrium configurations; accommodating such deviations from equilibrium is more challenging within the MTP framework because of its limited number of parameters. Effective MTP training therefore relies on carefully selecting the training set. Our final training dataset, herein referred to as the unified training set, was constructed through a two-step process: curation and subsequent optimization.

To enhance the quality of our training set and to properly assess the error on the test set, down-selection was applied. To begin, we sorted our full database into smaller subsets, as elaborated in Tabs. S3 through S6 in the supporting information. Within each subset, we then used a filtering strategy referred to herein as the “train-remove-train” approach. We first train while monitoring for significant reductions in the energy and force errors associated with each configuration as we incrementally raise the MTP level by one unit (we focus on levels 08 to 14 for deformation and defects, while levels 16 to 18 are used for disordered structures). Next, we analyze the error reduction between two or three consecutive levels. If a significant decrease is not observed, we then eliminate configurations based on factors such as:

(1) Total energy: configurations with similar total energies yet differing atomic coordinates to others in the training set, as well as those with fluctuating total energies but nearly identical atomic positions to others in the training set, are removed. These configurations originated from relaxation of molecules, single-point calculations of unrelaxed defects, manually constructed unrelaxed jump paths, strained configurations where lattice parameters or vectors were strained without corresponding adjustments to atomic positions, and embedded dimers.
(2) Contributions from smearing: these can be primarily attributed to the extensive use of high-temperature AIMD simulations. Silica configurations generated from AIMD simulations that contain more than 36 atoms are excluded if they exhibit smearing contributions to the total energy greater than zero.
(3) Minimum interatomic distances (this criterion complements the total energy criterion): if multiple configurations from the same batch exhibit similar minimum distances, some are removed. This technique was mostly used for oxygen molecules. For example, we check the interatomic distance in the batch of relaxed O₃ molecules. We also discard embedded dimer configurations by comparing their minimum interatomic distances with those of the AIMD configurations. If the minimum interatomic distances are equal, we choose the AIMD configuration over the embedded dimer configuration.
(4) Polymorphism: these configurations are derived from deformations following thermal expansion. Most of these configurations have been accurately computed and were included in the training database. However, some polymorphs arising from displacive phase transformations and polymorphs sharing the same lattice system with identical coordination numbers were excluded. In the case of a displacive phase transformation, we retain the parent crystal and exclude the child crystal. When dealing with two polymorphs that share the same lattice system and coordination number, we typically choose one of them.
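The minimum-distance criterion in item (3) can be sketched as a simple filter. The example below is an illustration only: it treats configurations as non-periodic clusters, keeps the first configuration found for each distinct minimum distance, and uses a hypothetical tolerance `tol`; the selection actually applied to the database may differ.

```python
import numpy as np


def min_interatomic_distance(positions):
    """Smallest pair distance in a non-periodic configuration."""
    positions = np.asarray(positions, dtype=float)
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # ignore self-distances
    return float(dist.min())


def drop_near_duplicates(batch, tol=1e-3):
    """Keep one configuration per distinct minimum distance (within tol)."""
    kept, seen = [], []
    for positions in batch:
        d = min_interatomic_distance(positions)
        if all(abs(d - s) > tol for s in seen):
            kept.append(positions)
            seen.append(d)
    return kept


# Hypothetical batch of three dimers with bond lengths in Angstrom.
batch = [
    [[0.0, 0.0, 0.0], [1.2100, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [1.2105, 0.0, 0.0]],  # nearly identical to the first
    [[0.0, 0.0, 0.0], [1.3000, 0.0, 0.0]],
]
kept = drop_near_duplicates(batch)
print(len(kept))
```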

After removing these undesired configurations, we restart the training process incrementally, following the same pattern as in the first stage. This “train-remove-train” process is iterated until we attain a high level of confidence in the cleanliness of the subset, i.e., until all configurations included in the subset are associated with training errors that decrease as the MTP level increases. In the subsequent curation phase, we examine the possibility of extracting an even smaller subset from each previously cleaned subset⁸⁴. To achieve this, we use the “select-add” command embedded in the MTP code, as described in ref. ⁴⁷. We applied the “select-add” command to every cleaned subset.
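The monitoring step of the “train-remove-train” approach can be sketched as a check on how the per-configuration error evolves with the MTP level. The function below is a hypothetical illustration: the threshold `min_drop`, the data layout, and the error values are assumptions, and flagged configurations are only candidates to be examined against the criteria listed above.

```python
def flag_stagnant(errors_by_level, min_drop=0.05):
    """Return names of configurations whose error does not decrease by at
    least `min_drop` (relative) between the first and last monitored level."""
    flagged = []
    for name, errors in errors_by_level.items():
        first, last = errors[0], errors[-1]
        if first <= 0 or (first - last) / first < min_drop:
            flagged.append(name)
    return flagged


# Hypothetical force errors at three consecutive MTP levels.
errors = {
    "config_a": [10.0, 8.0, 6.0],   # clear improvement with level
    "config_b": [5.0, 5.1, 4.95],   # essentially flat
}
candidates = flag_stagnant(errors)
print(candidates)
```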

Training set

In implementing the unified potential, we selected a range of configurations from the optimized and curated database, as outlined in the Data curation and optimization section. Similarly, the test set was chosen from the same optimized database in such a way that the two sets do not overlap. While the test set encompasses all properties or types of configurations represented in the extensive database, the training set only includes certain configuration types. This strategic approach aims to ensure the portability of the interaction potential. Note that the database contains configurations of substantial size, with up to 1000 atoms. However, we constrained the maximum box size in the training set to 36 atoms, except for the interfaces set, where a few configurations contain 80 atoms. Consequently, the number of atoms in the training set’s boxes ranges from 1 to 80. Conversely, the test set contains configurations of the maximum size found in the database. The specific types of configurations represented in the training set are detailed in Table 5. By restricting the training set to specific configuration types, we can assess the generalizability of our potential by simulating properties not present in the training set. Within the current implementation of the MTP potential, we do not employ a validation set in the typical manner used to estimate overfitting or underfitting during the training of neural network models. Instead, we opt to use an offline test set for this purpose. To gauge the potential’s portability more comprehensively, we perform MD simulations for properties that were not included in the training set.

Table 5 Final refined and optimized training dataset derived from extensive uncurated database

Training and validation

The cost function, as described by Equation (7), was minimized using the Broyden-Fletcher-Goldfarb-Shanno algorithm, a quasi-Newton optimization method implemented in the MTP framework. Fundamentally, training the MTP model with atomic configurations entails finding the parameter set Θ by solving the minimization problem presented in Equation (7). We trained on the refined unified training sets detailed in Tab. S7 (supporting information) and Table 5. First, MTP potentials with 300 to 1600 parameters, corresponding to levels 18 to 26, were trained on set 1 (Tab. S7 and Fig. S1a) as part of preliminary work, including optimization, training mode selection, and testing. This preliminary work is also presented in the supporting information (Figs. S5–S8). At this stage, two training modes were employed: vibration mode and structures weighting mode. Specifically, the potential resulting from the vibration mode was used as input for the structures weighting mode training. The training process was iterated until a desirable level of accuracy was achieved. As outlined in the supporting information, we assessed the validation error using the resulting potentials. We employed two independent validation sets, denoted Validation 1 and Validation 2, whose atom distributions are shown in supplementary Fig. S2 and Fig. S3, respectively. Validation 1 was randomly selected concurrently with the training set from the curated database, whereas Validation 2 consisted of the AIMD cooling and equilibration trajectories at 300 K. These validation sets were used as part of the preliminary work. For the final implementation, we used level 28. Based on the preliminary work, we opted for the structures weighting mode. The two-step training mode, as applied to the large configurations in set 1 (Tab. S7 and Fig. S1a), was deemed unnecessary: given that our final training set (Table 5) comprises small-cell configurations, we exclusively employed the structures weighting mode. We used the resulting potential to conduct both static calculations and MD simulations, with the outcomes presented in the main text.

$$\begin{array}{ll}\mathop{\sum }\limits_{i=1}^{n}&\left[{w}_{e}{({E}^{mtp}({x}^{(i)},\Theta )-{E}^{qm}({x}^{(i)}))}^{2}+\right.\\ &{w}_{f}\mathop{\sum }\limits_{j=1}^{{N}_{a}({x}^{(i)})}| {F}_{j}^{mtp}({x}^{(i)},\Theta )-{F}_{j}^{qm}({x}^{(i)}){| }^{2}\\ &\left.+{w}_{\sigma }| {\sigma }^{mtp}({x}^{(i)},\Theta )-{\sigma }^{qm}({x}^{(i)}){| }^{2}\right]\,\to \,min\end{array}$$

(7)

Here, E^qm, F^qm, and σ^qm denote the values of the energy, forces, and stress computed by the quantum mechanical approach (DFT), while E^mtp, F^mtp, and σ^mtp represent the corresponding values obtained from the MTP model. The relative weights w_e, w_f, and w_σ indicate the importance of the energy, the forces, and the stress in the optimization procedure.
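A minimal sketch of the cost function in Equation (7) is given below. The dictionary layout, the weight values, and the toy numbers are hypothetical, and the stress term is assumed to be the sum of squared tensor components; only the structure of the sum follows the equation.

```python
import numpy as np


def mtp_loss(configs, w_e, w_f, w_s):
    """Weighted sum of squared energy, force, and stress deviations, Eq. (7)."""
    total = 0.0
    for cfg in configs:
        total += w_e * (cfg["E_mtp"] - cfg["E_qm"]) ** 2
        # Sum over atoms of |F_j^mtp - F_j^qm|^2.
        total += w_f * np.sum((cfg["F_mtp"] - cfg["F_qm"]) ** 2)
        total += w_s * np.sum((cfg["S_mtp"] - cfg["S_qm"]) ** 2)
    return float(total)


# Hypothetical single configuration with two atoms.
config = {
    "E_mtp": -10.1, "E_qm": -10.0,
    "F_mtp": np.array([[0.1, 0.0, 0.0], [0.0, 0.2, 0.0]]),
    "F_qm": np.zeros((2, 3)),
    "S_mtp": np.zeros((3, 3)),
    "S_qm": np.zeros((3, 3)),
}
loss = mtp_loss([config], w_e=1.0, w_f=0.01, w_s=0.001)
print(loss)
```

Changing the relative weights shifts the balance between reproducing energies, forces, and stresses during the fit.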

Static calculations

This section summarizes the mathematical procedures used to determine the static properties presented in the article.

For a compound with the formula X_lY_mZ_n, we calculate the cohesive energy as follows:

$${E}_{coh}={E}_{{X}_{l}{Y}_{m}{Z}_{n}}-(l{E}_{X}+m{E}_{Y}+n{E}_{Z}).$$

(8)

where ${E}_{{X}_{l}{Y}_{m}{Z}_{n}}$ represents the energy of the supercell of the compound, while E_X, E_Y, and E_Z correspond to the energies of the isolated atoms. The subscripts l, m, and n indicate the number of X, Y, and Z atoms present in the building block X_lY_mZ_n of the material. Due to variations in the number of atoms within the primitive cell of each polymorph compared to the standard structural configuration, we normalize the cohesive energy by dividing it by the number of atoms present in the regular phase.
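Equation (8) translates directly into a short function. The energies below are hypothetical placeholders rather than values from this work, and the function name `cohesive_energy` is an illustrative choice.

```python
def cohesive_energy(e_compound, counts, atom_energies):
    """E_coh = E_compound - sum over species of n_s * E_s, Eq. (8)."""
    return e_compound - sum(n * atom_energies[s] for s, n in counts.items())


# Hypothetical energies (eV) for one SiO2 building block.
isolated = {"Si": -1.0, "O": -2.0}
e_coh = cohesive_energy(-30.0, {"Si": 1, "O": 2}, isolated)
print(e_coh)
```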

Point-defect formation properties, such as the vacancy formation energy (${E}_{v}^{f}$), were calculated using the following equation:

$${E}_{v}^{f}={E}_{{N}_{0}-1}-\frac{{N}_{0}-1}{{N}_{0}}* {E}_{{N}_{0}}.$$

(9)

For interstitial formation energy ${E}_{i}^{f}$, we used:

$${E}_{i}^{f}={E}_{{N}_{0}+1}-\frac{{N}_{0}+1}{{N}_{0}}* {E}_{{N}_{0}}.$$

(10)

In equations (9) and (10), N₀ and ${E}_{{N}_{0}}$ correspond to the number of atoms and total energy of a perfect supercell.
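Equations (9) and (10) differ only in the sign of the change in atom number, so both can be sketched with a single function. The supercell size and energies below are hypothetical and only illustrate the arithmetic.

```python
def defect_formation_energy(e_defect, e_perfect, n_perfect, delta_n):
    """Eqs. (9) and (10): delta_n = -1 for a vacancy, +1 for an interstitial."""
    n_defect = n_perfect + delta_n
    return e_defect - (n_defect / n_perfect) * e_perfect


# Hypothetical 64-atom supercell with -5 eV per atom.
N0, E_N0 = 64, -320.0
e_vacancy = defect_formation_energy(-311.5, E_N0, N0, delta_n=-1)
e_interstitial = defect_formation_energy(-321.5, E_N0, N0, delta_n=+1)
print(e_vacancy, e_interstitial)
```

The factor (N₀ ± 1)/N₀ rescales the perfect-supercell energy so that both terms refer to the same number of atoms.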

Vacancies in SiO₂ polymorphs were considered in the neutral charge state. Thus, the oxygen vacancy formation energy in SiO₂ polymorphs was calculated using:

$${E}_{f}={E}_{vac}-{E}_{bulk}+{\mu }_{O}.$$

(11)

In this equation, E_vac and E_bulk represent the energy of the supercell containing the oxygen vacancy and the energy of the bulk supercell, respectively. The oxygen chemical potential μ_O is defined as half of the energy of a dioxygen molecule (${\mu }_{O}=1/2\ast {{\rm{E}}}_{{O}_{2}}$).

The equilibrium bulk modulus, which corresponds to the curvature of the energy-volume curve at its minimum, was derived from the second-order elastic constants⁸⁵. We calculate the elastic stiffness constants C_ij using a central finite-difference formula.

$${C}_{ij}=\frac{{P}_{i}^{(+{\varepsilon }_{j})}-{P}_{i}^{(-{\varepsilon }_{j})}}{2* {\varepsilon }_{j}}.$$

(12)

where ${P}_{i}^{(+{\varepsilon }_{j})}$ is the ith component of the stress tensor when the configuration is strained only by the jth component (ε_j) of the strain vector ($\overrightarrow{\varepsilon }$), and ${P}_{i}^{(-{\varepsilon }_{j})}$ is the same component under the opposite strain. After applying directional or isotropic deformation, the atomic positions undergo relaxation while the overall box size remains fixed. We compute the generalized stacking fault energy (γ(u)) by incrementally shifting the upper crystal half along the slip direction and assessing energy differences per unit area (A) of the fault plane.

$$\gamma (u)=\frac{E(u)-{E}_{o}}{A}.$$

(13)

where E_o represents the energy of the perfect crystal, while E(u) denotes the energy of the supercell with the fault vector u, which is directly proportional to the Burgers vector (b). The surface energy is calculated using the following expression:

$$\gamma =\frac{{E}_{slab}-N{E}_{bulk}}{2A}.$$

(14)

In this context, A refers to the area of the slab, N represents the number of atoms in the slab, while E_slab and E_bulk denote the total energy of the slab and the bulk energy per atom, respectively.
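Equations (12) to (14) can be sketched together as three small functions. All input numbers below are hypothetical and only illustrate the arithmetic; the unit conversion assumes energies in eV and areas in Å².

```python
EV_PER_A2_TO_J_PER_M2 = 16.0217663  # 1 eV/Angstrom^2 expressed in J/m^2


def elastic_constant(p_plus, p_minus, strain):
    """Central finite difference of the stress with respect to strain, Eq. (12)."""
    return (p_plus - p_minus) / (2.0 * strain)


def stacking_fault_energy(e_fault, e_perfect, area):
    """Energy difference per unit fault area, Eq. (13)."""
    return (e_fault - e_perfect) / area


def surface_energy(e_slab, n_atoms, e_bulk_per_atom, area):
    """Excess slab energy divided by its two free surfaces, Eq. (14)."""
    return (e_slab - n_atoms * e_bulk_per_atom) / (2.0 * area)


# Hypothetical inputs: stresses in GPa, energies in eV, areas in Angstrom^2.
c11 = elastic_constant(p_plus=0.5, p_minus=-0.5, strain=0.005)
gamma_sf = stacking_fault_energy(e_fault=-318.0, e_perfect=-320.0, area=50.0)
gamma_surf = surface_energy(e_slab=-95.0, n_atoms=20, e_bulk_per_atom=-5.0, area=25.0)
print(c11, gamma_sf, gamma_surf * EV_PER_A2_TO_J_PER_M2)
```

The factor 2A in the surface energy accounts for the two free surfaces of the slab.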

Si–SiO₂ interface construction

Previous reports indicate that defects are commonly observed at the Si–SiO₂ interface due to the imperfect matching of the two materials. To avoid considerable lattice mismatch, we use specific techniques. First, we rotate the alpha-quartz structure to achieve a tetragonal configuration. Next, we duplicate both the silicon crystal and the alpha-quartz structure, ensuring that the lattice dimensions perpendicular to the interface direction closely match. This approach enables us to apply a small strain (<2%) to the lattice vectors before forming the interface. Technically, the lattice mismatch α can be defined as the relative difference in lattice parameters between two crystalline materials, often expressed as a percentage or in terms of the absolute difference in lattice constants along specific crystallographic directions:

$$\alpha =\frac{n* {L}_{1}-m* {L}_{2}}{n* {L}_{1}+m* {L}_{2}}.$$

(15)

The lattice duplication factors are represented by the integers n and m; L₁ and L₂ denote the lattice parameters of the two materials along a given direction. In both cases, symmetric and asymmetric interfaces were constructed for both oxygen-terminated and silicon-terminated quartz slabs, incorporating Si (100), Si (110), and Si (111) slab orientations. Our objective is not solely to construct a flawless representation of a naturally occurring or real-world interface, but rather to explore the versatility of the potential. We then estimate the interface energy using:

$$\gamma =\frac{{E}_{S}-({n}_{Si{O}_{2}}* {E}_{Si{O}_{2}}+{m}_{Si}* {E}_{Si})}{A}.$$

(16)

where A represents the area of the interface, and ${{\rm{n}}}_{Si{O}_{2}}$ and m_Si represent the number of formula units of SiO₂ and Si in the interface system. E_S is the energy of the supercell containing the interface. The terms ${{\rm{E}}}_{Si{O}_{2}}$ and E_Si correspond to the energy of silica and silicon per formula unit, respectively. Because the duplicated models are impractically large for energy computation via DFT, smaller superlattices were also constructed involving Si (100) interfacing with α-quartz, β-cristobalite, α-cristobalite, β-tridymite, and amorphous silica. This facilitated comparison between results obtained using MTP potentials and those from DFT. Each simulation box contains two distinct interfaces. Initially, these small models were relaxed at 0 K using MTP potentials. Subsequently, MTP-driven MD simulations were performed at temperatures of 300 K, 500 K, 800 K, and 1200 K for 100 ps, and configurations were selected from the trajectories. The energies of these selected configurations, as well as the relaxed configurations, were then computed using DFT-based single-point energy calculations.
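The search for duplication factors in Equation (15) and the interface energy in Equation (16) can be sketched as follows. The lattice parameters used here are approximate experimental values for Si and α-quartz chosen for illustration, not the DFT values of this work; `max_repeat` and all energies are hypothetical.

```python
def lattice_mismatch(n, l1, m, l2):
    """alpha = (n*L1 - m*L2) / (n*L1 + m*L2), Eq. (15)."""
    return (n * l1 - m * l2) / (n * l1 + m * l2)


def best_duplication(l1, l2, max_repeat=10):
    """Return (|alpha|, n, m) with the smallest |alpha| for 1 <= n, m <= max_repeat."""
    candidates = (
        (abs(lattice_mismatch(n, l1, m, l2)), n, m)
        for n in range(1, max_repeat + 1)
        for m in range(1, max_repeat + 1)
    )
    return min(candidates)


def interface_energy(e_supercell, n_sio2, e_sio2, m_si, e_si, area):
    """Excess energy of the interface supercell per unit area, Eq. (16)."""
    return (e_supercell - (n_sio2 * e_sio2 + m_si * e_si)) / area


# Illustrative lattice parameters in Angstrom.
alpha, n, m = best_duplication(5.43, 4.91)
print(n, m, alpha)

# Hypothetical energies (eV) and area (Angstrom^2).
gamma = interface_energy(-100.0, n_sio2=4, e_sio2=-20.0, m_si=8, e_si=-3.0, area=40.0)
print(gamma)
```

With these illustrative inputs, nine repeats of the first lattice against ten of the second give a mismatch well below the 2% strain mentioned above.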

MD simulation

AIMD simulations were carried out with Quantum ESPRESSO using the parameters described in the section “Ab initio calculations.” The integration timestep was set to 1 fs for Si and 2 fs for SiO₂. The ionic temperature during simulations was controlled using velocity rescaling.

Force-field MD simulations were performed using the large-scale atomic/molecular massively parallel simulator (LAMMPS) software package⁸⁶. While it is impractical to perfectly replicate the Quantum ESPRESSO MD settings within LAMMPS, we aimed to make them as close as possible. To generate disordered structures, we employed the velocity rescaling thermostat to control the temperature during the simulations. The time integration used the same timesteps as in the AIMD simulations. For studying point-defect diffusion in Si and self-diffusion in SiO₂, the Nosé-Hoover thermostat⁸⁷ was employed. The latter simulations were performed using a timestep of 1 fs, and the damping parameter was set to 100 fs.
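The idea behind velocity rescaling can be sketched in a few lines. This is the simplest form of the thermostat and not the implementation found in Quantum ESPRESSO or LAMMPS, which add details such as tolerance windows and rescaling intervals. Masses and velocities are assumed to be in units for which the kinetic energy comes out in eV, and all numbers are hypothetical.

```python
import numpy as np

K_B = 8.617333e-5  # Boltzmann constant in eV/K


def instantaneous_temperature(masses, velocities):
    """T = 2*KE / (3*N*k_B), ignoring any removed degrees of freedom."""
    kinetic = 0.5 * np.sum(masses[:, None] * velocities ** 2)
    return 2.0 * kinetic / (3.0 * len(masses) * K_B)


def rescale_velocities(masses, velocities, target_temperature):
    """Scale all velocities so the instantaneous temperature hits the target."""
    current = instantaneous_temperature(masses, velocities)
    return velocities * np.sqrt(target_temperature / current)


# Hypothetical system of eight atoms with random velocities.
rng = np.random.default_rng(0)
masses = np.ones(8)
velocities = rng.normal(size=(8, 3))
rescaled = rescale_velocities(masses, velocities, target_temperature=300.0)
t_after = instantaneous_temperature(masses, rescaled)
print(t_after)
```

Because every velocity is multiplied by the same factor, the directions of motion are unchanged and only the kinetic energy is adjusted.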

Data availability

The Si–O–SiO₂ database and the unified potentials can be found at https://gitlab.com/Kazongogit/MTPu.

Code availability

The main codes used for this work are Quantum ESPRESSO (version 6.8), LAMMPS (version 2022), PHONOPY, and MLIP-2 (version 2). They are available at https://www.quantum-espresso.org, https://lammps.sandia.gov, https://phonopy.github.io/phonopy, and https://mlip.skoltech.ru, respectively. Further details can be found in the GitLab repository https://gitlab.com/Kazongogit/MTPu. Additionally, Python scripts were written for data manipulation and processing; most are available in the GitLab repository.

References

Arns, R. G. The other transistor: early history of the metal-oxide semiconductor field-effect transistor. Eng. Sci. Educ. J. 7, 233–240 (1998).
Bauza, D. Thermal oxidation of silicon and si–sio2 interface morphology, structure, and localized states. Handb. Surf. interfaces Mater. 1, 115–216 (2001).
Pasquarello, A., Hybertsen, M. S. & Car, R. Interface structure between silicon and its oxide by first-principles molecular dynamics. Nature 396, 58–60 (1998).
Ganster, P., Tréglia, G., Lancon, F. & Pochet, P. Molecular dynamics simulation of silicon oxidization. Thin Solid Films 518, 2422–2426 (2010).
Ganster, P., Béland, L. K. & Mousseau, N. First stages of silicon oxidation with the activation relaxation technique. Phys. Rev. B 86, 075408 (2012).
Cvitkovich, L. et al. Dynamic modeling of si (100) thermal oxidation: oxidation mechanisms and realistic amorphous interface generation. Appl. Surf. Sci. 610, 155378 (2023).
Salles, N., Richard, N., Mousseau, N. & Hémeryck, A. Strain-driven diffusion process during silicon oxidation investigated by coupling density functional theory and activation relaxation technique. J. Chem. Phys. 147, 054701 (2017).
Takahashi, N., Yamasaki, T. & Kaneta, C. Molecular dynamics simulations on the oxidation of si (100)/sio2 interface: emissions and incorporations of si-related species into the sio2 and substrate. Phys. Status Solidi (b) 251, 2169–2178 (2014).
Kohn, W. & Sham, L. J. Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133 (1965).
Sholl, D. S. & Steckel, J. A. Density functional theory: a practical introduction (John Wiley & Sons, 2022).
Goedecker, S. Linear scaling electronic structure methods. Rev. Mod. Phys. 71, 1085 (1999).
Torrens, I. Interatomic Potentials (Elsevier, 2012).
Pizzagalli, L. Classical atomistic simulations in materials sciences: an introduction. Mater. Sci. 10, 125 (2004).
Van Duin, A. C., Dasgupta, S., Lorant, F. & Goddard, W. A. Reaxff: a reactive force field for hydrocarbons. J. Phys. Chem. A 105, 9396–9409 (2001).
Van Duin, A. C. et al. Reaxffsio reactive force field for silicon and silicon oxide systems. J. Phys. Chem. A 107, 3803–3811 (2003).
Phillpot, S. R. et al. Charge optimized many body (comb) potentials for simulation of nuclear fuel and clad. Comput. Mater. Sci. 148, 231–241 (2018).
Yu, J., Sinnott, S. B. & Phillpot, S. R. Charge optimized many-body potential for the si/sio 2 system. Phys. Rev. B 75, 085311 (2007).
Liang, T., Devine, B., Phillpot, S. R. & Sinnott, S. B. Variable charge reactive potential for hydrocarbons to simulate organic-copper interactions. J. Phys. Chem. A 116, 7976–7991 (2012).
Shan, T.-R. et al. Second-generation charge-optimized many-body potential for si/sio 2 and amorphous silica. Phys. Rev. B 82, 235302 (2010).
Hine, N. D., Haynes, P. D., Mostofi, A. A., Skylaris, C.-K. & Payne, M. C. Linear-scaling density-functional theory with tens of thousands of atoms: Expanding the scope and scale of calculations with onetep. Comput. Phys. Commun. 180, 1041–1053 (2009).
Bartok, A. P. The Gaussian Approximation Potential: an interatomic potential derived from first principles quantum mechanics (Springer Science & Business Media, 2010).
Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
Behler, J. Four generations of high-dimensional neural network potentials. Chem. Rev. 121, 10037–10072 (2021).
Bartók, A. P., Kermode, J., Bernstein, N. & Csányi, G. Machine learning a general-purpose interatomic potential for silicon. Phys. Rev. X 8, 041048 (2018).
Li, R., Lee, E. & Luo, T. A unified deep neural network potential capable of predicting thermal conductivity of silicon in different phases. Mater. Today Phys. 12, 100181 (2020).
Hu, L., Su, R., Huang, B. & Liu, F. An accurate and transferable machine-learning interatomic potential for silicon. arXiv https://arxiv.org/abs/1901.01638 (2019).
Babaei, H., Guo, R., Hashemi, A. & Lee, S. Machine-learning-based interatomic potential for phonon transport in perfect crystalline si and crystalline si with vacancies. Phys. Rev. Mater. 3, 074603 (2019).
Dragoni, D., Daff, T. D., Csányi, G. & Marzari, N. Achieving dft accuracy with a machine-learning interatomic potential: Thermomechanics and defects in bcc ferromagnetic iron. Phys. Rev. Mater. 2, 013808 (2018).
Szlachta, W. J., Bartók, A. P. & Csányi, G. Accuracy and transferability of gaussian approximation potential models for tungsten. Phys. Rev. B 90, 104108 (2014).
Rowe, P., Deringer, V. L., Gasparotto, P., Csányi, G. & Michaelides, A. An accurate and transferable machine learning potential for carbon. J. Chem. Phys. 153, 034702 (2020).
Smith, J. S. et al. Automated discovery of a robust interatomic potential for aluminum. Nat. Commun. 12, 1257 (2021).
Deringer, V. L., Caro, M. A. & Csányi, G. A general-purpose machine-learning force field for bulk and nanostructured phosphorus. Nat. Commun. 11, 5461 (2020).
Novikov, I. S. & Shapeev, A. V. Improving accuracy of interatomic potentials: more physics or more data? a case study of silica. Mater. Today Commun. 18, 74–80 (2019).
Erhard, L. C., Rohrer, J., Albe, K. & Deringer, V. L. A machine-learned interatomic potential for silica and its relation to empirical models. npj Comput. Mater. 8, 90 (2022).
Li, W. & Ando, Y. Comparison of different machine learning models for the prediction of forces in copper and silicon dioxide. Phys. Chem. Chem. Phys. 20, 30006–30020 (2018).
Balyakin, I., Rempel, S., Ryltsev, R. & Rempel, A. Deep machine learning interatomic potential for liquid silica. Phys. Rev. E 102, 052125 (2020).
Artrith, N., Morawietz, T. & Behler, J. High-dimensional neural-network potentials for multicomponent systems: applications to zinc oxide. Phys. Rev. B 83, 153101 (2011).
Artrith, N. & Urban, A. An implementation of artificial neural-network potentials for atomistic materials simulations: performance for tio2. Comput. Mater. Sci. 114, 135–150 (2016).
Kandy, A. K. A., Rossi, K., Raulin-Foissac, A., Laurens, G. & Lam, J. Comparing transferability in neural network approaches and linear models for machine-learning interaction potentials. Phys. Rev. B 107, 174106 (2023).
Kobayashi, K. et al. Machine learning molecular dynamics simulations toward exploration of high-temperature properties of nuclear fuel materials: case study of thorium dioxide. Sci. Rep. 12, 9808 (2022).
Behler, J. Perspective: Machine learning potentials for atomistic simulations. J. Chem. Phys. 145, 170901 (2016).
Das Sarma, S., Deng, D.-L. & Duan, L.-M. Machine learning meets quantum physics. Phys. Today 72, 48–54 (2019).
Mishin, Y. Machine-learning interatomic potentials for materials science. Acta Mater. 214, 116980 (2021).
Onat, B., Ortner, C. & Kermode, J. R. Sensitivity and dimensionality of atomic environment representations used for machine learning interatomic potentials. J. Chem. Phys. 153, 144106 (2020).
Friederich, P., Häse, F., Proppe, J. & Aspuru-Guzik, A. Machine-learned potentials for next-generation matter simulations. Nat. Mater. 20, 750–761 (2021).
Shapeev, A. V. Moment tensor potentials: a class of systematically improvable interatomic potentials. Multiscale Model. Simul. 14, 1153–1173 (2016).
Novikov, I. S., Gubaev, K., Podryabinkin, E. V. & Shapeev, A. V. The mlip package: moment tensor potentials with mpi and active learning. Mach. Learn Sci. Technol. 2, 025002 (2020).
Van Beest, B., Kramer, G. J. & Van Santen, R. Force fields for silicas and aluminophosphates based on ab initio calculations. Phys. Rev. Lett. 64, 1955 (1990).
Arai, N., Takeda, S. & Kohyama, M. Self-interstitial clustering in crystalline silicon. Phys. Rev. Lett. 78, 4265 (1997).
Henkelman, G., Uberuaga, B. P. & Jónsson, H. A climbing image nudged elastic band method for finding saddle points and minimum energy paths. J. Chem. Phys. 113, 9901–9904 (2000).
Ohbitsu, M. et al. Atomic structures and stability of finite-size extended interstitial defects in silicon: large-scale molecular simulations with a neural-network potential. Scr. Mater. 214, 114650 (2022).
Cheng, Y. et al. Vacancy formation energy and its connection with bonding environment in solid: a high-throughput calculation and machine learning study. Comput. Mater. Sci. 183, 109803 (2020).
Stillinger, F. H. & Weber, T. A. Computer simulation of local order in condensed phases of silicon. Phys. Rev. B 31, 5262 (1985).
Tersoff, J. New empirical model for the structural properties of silicon. Phys. Rev. Lett. 56, 632 (1986).
Tersoff, J. Empirical interatomic potential for silicon with improved elastic properties. Phys. Rev. B 38, 9902 (1988).
Guénolé, J., Godet, J. & Pizzagalli, L. Determination of activation parameters for the core transformation of the screw dislocation in silicon. Model. Simul. Mater. Sci. Eng. 18, 065001 (2010).
Huang, X., Hu, Y.-J. & An, Q. Locking of screw dislocations in silicon due to core structure transformation. J. Phys. Chem. C. 125, 24710–24718 (2021).
Morris, J. R., Wang, C., Ho, K. & Chan, C. T. Melting line of aluminum from simulations of coexisting phases. Phys. Rev. B 49, 3109 (1994).
Alfè, D. & Gillan, M. Exchange-correlation energy and the phase diagram of si. Phys. Rev. B 68, 205212 (2003).
Lide, D. R. CRC Handbook of Chemistry and Physics, vol. 85 (CRC Press, 2004).
Dorner, F., Sukurma, Z., Dellago, C. & Kresse, G. Melting si: beyond density functional theory. Phys. Rev. Lett. 121, 195701 (2018).
Jaccodine, R. Surface energy of germanium and silicon. J. Electrochem. Soc. 110, 524 (1963).
Vashishta, P., Kalia, R. K., Rino, J. P. & Ebbsjö, I. Interaction potential for sio 2: a molecular-dynamics study of structural correlations. Phys. Rev. B 41, 12197 (1990).
Munetoh, S., Motooka, T., Moriguchi, K. & Shintani, A. Interatomic potential for si–o systems using tersoff parameterization. Comput. Mater. Sci. 39, 334–339 (2007).
Sundararaman, S., Huang, L., Ispas, S. & Kob, W. New optimization scheme to obtain interaction potentials for oxide glasses. J. Chem. Phys. 148, 194504 (2018).
Fortner, J. & Lannin, J. Radial distribution functions of amorphous silicon. Phys. Rev. B 39, 5527 (1989).
Mozzi, R. & Warren, B. The structure of vitreous silica. J. Appl. Crystallogr. 2, 164–172 (1969).
Grimley, D. I., Wright, A. C. & Sinclair, R. N. Neutron scattering from vitreous silica IV. Time-of-flight diffraction. J. Non-Cryst. Solids 119, 49–64 (1990).
Mei, Q., Benmore, C. & Weber, J. Structure of liquid SiO₂: a measurement by high-energy X-ray diffraction. Phys. Rev. Lett. 98, 057802 (2007).
Carré, A., Ispas, S., Horbach, J. & Kob, W. Developing empirical potentials from ab initio simulations: the case of amorphous silica. Comput. Mater. Sci. 124, 323–334 (2016).
Liu, H., Fu, Z., Li, Y., Sabri, N. F. A. & Bauchy, M. Balance between accuracy and simplicity in empirical forcefields for glass modeling: insights from machine learning. J. Non-Cryst. Solids 515, 133–142 (2019).
Farnan, I. et al. Quantification of the disorder in network-modified silicate glasses. Nature 358, 31–35 (1992).
Da Silva, J., Pinatti, D., Anderson, C. & Rudee, M. A refinement of the structure of vitreous silica. Philos. Mag. J. Theor. Exp. Appl. Phys. 31, 713–717 (1975).
Coombs, P. et al. The nature of the Si–O–Si bond angle distribution in vitreous silica. Philos. Mag. B 51, L39–L42 (1985).
Kobayashi, K., Nagai, Y., Itakura, M. & Shiga, M. Self-learning hybrid Monte Carlo method for isothermal–isobaric ensemble: application to liquid silica. J. Chem. Phys. 155, 034106 (2021).
Tucker, M., Keen, D., Dove, M. & Trachenko, K. Refinement of the Si–O–Si bond angle distribution in vitreous silica. J. Phys. Condens. Matter 17, S67 (2005).
Himpsel, F., McFeely, F., Taleb-Ibrahimi, A., Yarmoff, J. & Hollinger, G. Microscopic structure of the SiO₂/Si interface. Phys. Rev. B 38, 6084 (1988).
Giannozzi, P. et al. QUANTUM ESPRESSO: a modular and open-source software project for quantum simulations of materials. J. Phys. Condens. Matter 21, 395502 (2009).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865 (1996).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953 (1994).
Monkhorst, H. J. & Pack, J. D. Special points for Brillouin-zone integrations. Phys. Rev. B 13, 5188 (1976).
Zuo, Y. et al. Performance and cost assessment of machine learning interatomic potentials. J. Phys. Chem. A 124, 731–745 (2020).
Zongo, K., Béland, L. & Ouellet-Plamondon, C. First-principles database for fitting a machine-learning silicon interatomic force field. MRS Adv. 7, 39–47 (2022).
Podryabinkin, E. V. & Shapeev, A. V. Active learning of linearly parametrized interatomic potentials. Comput. Mater. Sci. 140, 171–180 (2017).
Pandit, A. & Bongiorno, A. A first-principles method to calculate fourth-order elastic constants of solid materials. Comput. Phys. Commun. 288, 108751 (2023).
Thompson, A. P. et al. LAMMPS - a flexible simulation tool for particle-based materials modeling at the atomic, meso, and continuum scales. Comput. Phys. Commun. 271, 108171 (2022).
Evans, D. J. & Holian, B. L. The Nose–Hoover thermostat. J. Chem. Phys. 83, 4069–4074 (1985).
Brandes, E. A. & Brook, G. Smithells Metals Reference Book (Elsevier, 2013).
McMahon, M., Nelmes, R., Wright, N. & Allan, D. Pressure dependence of the Imma phase of silicon. Phys. Rev. B 50, 739 (1994).
Kim, D. Y., Stefanoski, S., Kurakevych, O. O. & Strobel, T. A. Synthesis of an open-framework allotrope of silicon. Nat. Mater. 14, 169–173 (2015).
Adams, G. B., O’Keeffe, M., Demkov, A. A., Sankey, O. F. & Huang, Y.-M. Wide-band-gap Si in open fourfold-coordinated clathrate structures. Phys. Rev. B 49, 8048 (1994).
Levien, L., Prewitt, C. T. & Weidner, D. J. Structure and elastic properties of quartz at pressure. Am. Mineral. 65, 920–930 (1980).
Wright, A. & Lehmann, M. The structure of quartz at 25 and 590 °C determined by neutron diffraction. J. Solid State Chem. 36, 371–380 (1981).
Downs, R. & Palmer, D. The pressure behavior of α cristobalite. Am. Mineral. 79, 9–14 (1994).
Barth, T. Cristobalite structures; ii, low-cristobalite. Am. J. Sci. 24, 97–110 (1932).
Cellai, D., Carpenter, M., Kirkpatrick, R., Salje, E. & Zhang, M. Thermally induced phase transitions in tridymite: an infrared spectroscopy study. Phys. Chem. Miner. 22, 50–60 (1995).
Villars, P. Pearson’s handbook: crystallographic data for intermetallic phases. c1991; 2nd edn. (1985).
Levien, L. & Prewitt, C. T. High-pressure crystal structure and compressibility of coesite. Am. Mineral. 66, 324–333 (1981).
Grocholski, B., Shim, S.-H. & Prakapenka, V. Stability, metastability, and elastic properties of a dense silica polymorph, seifertite. J. Geophys. Res.: Solid Earth 118, 4745–4757 (2013).
Ross, N. L., Shu, J. & Hazen, R. M. High-pressure crystal chemistry of stishovite. Am. Mineral. 75, 739–747 (1990).
Shropshire, J., Keat, P. P. & Vaughan, P. A. The crystal structure of keatite, a new form of silica. Z. Kristallogr. Cryst. Mater. 112, 409–413 (1959).
Miehe, G. et al. Crystal structure of moganite: a new structure type for silica. Eur. J. Mineral. 4, 693–706 (1992).
Díaz-Cabañas, M.-J. & Barrett, P. A. Synthesis and structure of pure SiO₂ chabazite: the SiO₂ polymorph with the lowest framework density. Chem. Commun. 1881–1882 (1998).
Plévert, J., Kubota, Y., Honda, T., Okubo, T. & Sugi, Y. GUS-1: a mordenite-like molecular sieve with the 12-ring channel of ZSM-12. Chem. Commun. 2363–2364 (2000).
Artioli, G., Lamberti, C. & Marra, G. Neutron powder diffraction study of orthorhombic and monoclinic defective silicalite. Acta Crystallogr. Sect. B: Struct. Sci. 56, 2–10 (2000).
McSkimin, H., Andreatch Jr, P. & Thurston, R. Elastic moduli of quartz versus hydrostatic pressure at 25 and −195.8 °C. J. Appl. Phys. 36, 1624–1632 (1965).
Léger, J.-M., Haines, J. & Chateau, C. The high-pressure behaviour of the “moganite” polymorph of SiO₂. Eur. J. Mineral. 13, 351–359 (2001).
Pabst, W. & Gregorová, E. Elastic properties of silica polymorphs–a review. Ceram.-Silik. 57, 167–184 (2013).
Leardini, L., Quartieri, S., Vezzalini, G., Martucci, A. & Dmitriev, V. Elastic behavior and high pressure-induced phase transition in chabazite: new data from a natural sample from Nova Scotia. Micropor. Mesopor. Mater. 170, 52–61 (2013).
Durandurdu, M. & Drabold, D. Ab initio simulation of first-order amorphous-to-amorphous phase transition of silicon. Phys. Rev. B 64, 014101 (2001).
Guerette, M. & Huang, L. A simple and convenient set-up for high-temperature Brillouin light scattering. J. Phys. D: Appl. Phys. 45, 275302 (2012).
Laaziri, K. et al. High resolution radial distribution function of pure amorphous silicon. Phys. Rev. Lett. 82, 3460 (1999).
Meidanshahi, R. V., Bowden, S. & Goodnick, S. M. Electronic structure and localized states in amorphous Si and hydrogenated amorphous Si. Phys. Chem. Chem. Phys. 21, 13248–13257 (2019).
Vukcevich, M. A new interpretation of the anomalous properties of vitreous silica. J. Non-Cryst. Solids 11, 25–63 (1972).
Khouchaf, L. et al. Study of the microstructure of amorphous silica nanostructures using high-resolution electron microscopy, electron energy loss spectroscopy, X-ray powder diffraction, and electron pair distribution function. Materials 13, 4393 (2020).

Acknowledgements

We thank the Digital Research Alliance of Canada for generous allocation of compute resources. Financial support was provided by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Nuclear Waste Management Organization (NWMO), the Association canadienne-française pour l’avancement des sciences (ACFAS), and the Canada Research Chair on Sustainable Multifunctional Construction Materials (CRC-2019-00074).

Author information

Authors and Affiliations

Département de génie de la construction, École de technologie supérieure, Université du Québec, Montréal, QC, Canada
Karim Zongo & Claudiane Ouellet-Plamondon
Department of Mechanical and Materials Engineering, Queen’s University, Kingston, ON, Canada
Hao Sun & Laurent Karim Béland

Contributions

L.K.B. initiated and coordinated the work, overseeing the research project. K.Z. was responsible for developing the reference database and fitting the unified potential. K.Z. performed all static calculations and MD simulations using DFT, MTP, and semi-classical potentials. All the authors, K.Z., H.S., L.K.B., and C.O.P., contributed to the writing and reviewing of the paper. C.O.P. and L.K.B. supervised K.Z. C.O.P. and L.K.B. provided computing resources.

Corresponding authors

Correspondence to Karim Zongo or Laurent Karim Béland.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information for: a unified moment tensor potential for silicon, oxygen, and silica

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zongo, K., Sun, H., Ouellet-Plamondon, C. et al. A unified moment tensor potential for silicon, oxygen, and silica. npj Comput Mater 10, 218 (2024). https://doi.org/10.1038/s41524-024-01390-8

Received: 19 November 2023
Accepted: 17 August 2024
Published: 13 September 2024
DOI: https://doi.org/10.1038/s41524-024-01390-8
