Introduction

In recent years, two-dimensional twisted moiré structures have captured significant interest due to the diverse physical phenomena they exhibit. By varying the interlayer twist angle, researchers can tune the band structure of these materials, enabling the experimental observation of novel phenomena. For instance, in twisted graphene, when the twist angle reaches the so-called “magic angle”, the valence band flattens, prompting electrons to transition from a weakly correlated to a strongly correlated state. This shift gives rise to a host of intriguing behaviors, including unconventional superconductivity, Mott insulating states, and the quantum anomalous Hall effect1,2,3,4,5,6,7,8,9,10,11,12. Similar phenomena have also been observed in moiré bilayers of transition metal dichalcogenides (TMDs)13,14,15,16,17,18,19.

In twisted structures, the moiré potential narrows the bandwidth as the periodicity of the structure increases. For instance, the bandwidth of bilayer twisted graphene at a twist angle of 1.08° is only a few meV4,20, while the bandwidth of bilayer twisted MoTe2 at 3.89° is just over 10 meV14. Such narrow bands are highly susceptible to the effects of lattice relaxation, which significantly influences their electronic properties. Theoretical calculations reveal that the electronic band structures of rigid twisted graphene differ markedly from those of relaxed systems20. Additionally, experimental studies using scanning tunneling microscopy have also documented the relaxation patterns in TMDs resulting from lattice reconstruction21,22.

To accurately model the electronic properties of moiré structures, density functional theory (DFT) is often employed, particularly for structures with large twist angles, where it is considered essential for reliable structural relaxation14,15,19. However, despite its high accuracy, the computational cost of DFT scales cubically with the number of atoms. The number of atoms in a moiré cell increases dramatically as the twist angle decreases (Table 1), rendering DFT calculations impractical for small-angle structures.

Table 1 Number of atoms in moiré cell of twisted bilayer TMDs
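
As a rough illustration of this growth, the sketch below counts the atoms in a commensurate moiré cell of a twisted homobilayer. It assumes the standard one-parameter family of commensurate angles, cos θ = (3m² + 3m + 1/2)/(3m² + 3m + 1), with 3m² + 3m + 1 unit cells per layer, and three atoms per MX2 unit cell. The function names are illustrative and are not part of DPmoire, and the relative cost column only reflects the nominal cubic scaling of DFT.

```python
import math

def commensurate_angle_deg(m: int) -> float:
    """Twist angle in degrees of the m-th commensurate structure (assumed family)."""
    n_cells = 3 * m * m + 3 * m + 1
    return math.degrees(math.acos((n_cells - 0.5) / n_cells))

def atoms_in_moire_cell(m: int, atoms_per_unit_cell: int = 3, n_layers: int = 2) -> int:
    """Atom count of the moiré cell; 3 atoms per unit cell corresponds to MX2."""
    n_cells = 3 * m * m + 3 * m + 1  # unit cells per layer in the moiré cell
    return n_layers * atoms_per_unit_cell * n_cells

reference = atoms_in_moire_cell(4)
for m in (4, 6, 8, 15):
    n = atoms_in_moire_cell(m)
    relative_cost = (n / reference) ** 3  # nominal O(N^3) scaling only
    print(f"m={m:2d}  theta={commensurate_angle_deg(m):5.2f} deg  "
          f"atoms={n:5d}  relative cost={relative_cost:8.1f}")
```

Under these assumptions, halving the twist angle roughly quadruples the atom count, so the nominal DFT cost grows by a factor of about 64.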

To address this computational challenge, researchers have developed parameterized continuum models that are better suited for structures with small twist angles23,24,25,26,27,28,29,30. While these models provide a computationally feasible alternative, they typically do not reach the accuracy of DFT relaxation. For materials such as graphene31,32,33 and TMDs34, empirical force fields have been effectively utilized for structural relaxation12,35,36,37,38. However, for other systems, robust and extensively validated empirical potentials remain scarce, limiting the scope of studies that can be conducted.

Machine learning force fields (MLFFs) offer a promising solution to the computational challenges posed by moiré structures39,40,41,42,43,44,45,46,47,48,49,50. Recent advancements in universal MLFFs have shown great promise in terms of versatility, efficiency, and accuracy for materials discovery and high-throughput calculations51,52,53,54,55,56. Universal MLFFs typically achieve energy errors of several tens of millielectron volts (meV) per atom. For example, the mean absolute energy errors of CHGNet52 and ALIGNN-FF54 are 33 meV per atom and 86 meV per atom, respectively. However, in the context of moiré systems, the energy scales of electronic bands are often on the order of meV, a range comparable to the accuracy limits of these universal MLFFs. This indicates that while universal MLFFs provide broad applicability, their precision may be insufficient for structural relaxation tasks in moiré systems, necessitating the development of MLFFs specifically tailored to individual material systems. Algorithms such as NequIP41 and Allegro40 can achieve errors of a fraction of a meV per atom when trained on a specific material, which is accurate enough for moiré systems.

Previous efforts have successfully constructed MLFFs for twisted structures, achieving encouraging outcomes. Some studies have developed MLFFs for large twist angles and then applied these models to smaller angles57,58, while others have trained MLFFs on non-twisted structures before using them to relax twisted configurations14,59. Additionally, a few approaches have combined initial training on non-twisted structures with subsequent transfer learning on large twist-angle structures to efficiently relax twisted configurations60,61. This multifaceted strategy highlights the adaptability of MLFFs in addressing the specific challenges posed by the diverse configurations encountered in moiré systems.

While these innovative approaches have shown promise, their validation has often been limited to specific materials, and a comprehensive tool for constructing MLFFs tailored to twisted structures is still lacking. Moiré systems offer a unique platform for exploring novel phenomena such as strong correlations and topological states, with numerous experimental and theoretical advances highlighting their potential. Given the rapid development in this field, there is a pressing need for a universal tool that can conveniently and efficiently construct MLFFs for such complex systems. To bridge this gap, we propose a new methodology and introduce an open-source software, DPmoire, designed specifically for moiré systems. DPmoire leverages non-twisted structures to construct training datasets, facilitating the automated generation of MLFFs tailored to the unique challenges of moiré systems. This tool aims to streamline the MLFF construction process, enabling researchers to more effectively study and model the intricate behaviors exhibited by twisted materials.

Results

MLFF for moiré systems

To develop an MLFF for moiré superlattice structures, we initially constructed 2 × 2 supercells of non-twisted bilayers and introduced in-plane shifts to generate various stacking configurations. Subsequently, structural relaxations were performed for each configuration, ensuring that the x and y coordinates of a reference atom from each layer remained fixed to prevent structural drift toward energetically favorable stackings. The lattice constants were also held constant throughout the simulations. The relaxation data were compiled into a training dataset.
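
The generation of shifted stackings can be sketched as follows. This is a minimal illustration under stated assumptions, not the DPmoire implementation: the grid density `n_grid`, the helper names, the unit lattice constant, and the two-atom toy bilayer are all hypothetical choices made for the example.

```python
import numpy as np

def stacking_shifts(a1, a2, n_grid):
    """Return n_grid * n_grid in-plane shift vectors covering one unit cell."""
    fractions = np.arange(n_grid) / n_grid
    return np.array([u * a1 + v * a2 for u in fractions for v in fractions])

def shift_top_layer(positions, top_mask, shift):
    """Rigidly translate the top-layer atoms in the xy plane; z is untouched."""
    shifted = positions.copy()
    shifted[top_mask, :2] += shift[:2]
    return shifted

# Hexagonal lattice vectors in units of the lattice constant (illustrative).
a1 = np.array([1.0, 0.0, 0.0])
a2 = np.array([-0.5, 0.5 * np.sqrt(3.0), 0.0])
shifts = stacking_shifts(a1, a2, n_grid=6)

# Toy bilayer with one atom per layer; a real input would hold the full supercell.
positions = np.array([[0.0, 0.0, 0.0],
                      [0.0, 0.0, 2.0]])
top_mask = np.array([False, True])
configs = [shift_top_layer(positions, top_mask, s) for s in shifts]
print(len(configs), "stacking configurations")
```

Each configuration would then be relaxed with the in-plane coordinates of one reference atom per layer held fixed, as described above, so that the imposed shift survives the relaxation.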

Following the relaxation phase, molecular dynamics (MD) simulations were conducted under the same constraints to augment the training data pool. For these simulations, we employed the VASP MLFF module to explore a wide range of atomic configurations. The VASP MLFF module is an on-the-fly MLFF algorithm, which is described in detail in section “Machine learning force fields”. From these runs, we incorporated only the data from steps that were computed with DFT. Because MD simulations started with an untrained VASP MLFF can be unstable, we first established a baseline MLFF using single-layer structures before proceeding with the full simulations. To ensure the MLFF’s applicability to moiré systems and to mitigate overfitting to non-twisted structures, we constructed the test set from large-angle moiré structures. These were subjected to ab initio relaxations, and the resulting data serve as the test set.

Finally, the compilation of the aforementioned datasets facilitated the training of a robust and accurate MLFF. While we utilized the Allegro framework for MLFF training in this study, other MLFF algorithms, such as DeepMD42, could also be effectively trained on these datasets to potentially enhance predictive accuracy and transferability across similar complex structures (Fig. 1).

Fig. 1: Schematic overview of the process for constructing the MLFF.

Initially, an MLFF is generated for monolayer structures to stabilize subsequent molecular dynamics (MD) simulations for bilayer systems. We then create non-twisted bilayer structures with various stacking configurations, relax these structures, and run MD simulations using the VASP MLFF module to construct the training dataset. The coordinates (x and y) of a selected atom from each layer are maintained constant during relaxation to preserve the integrity of the stacking order. Subsequently, the twisted structures are relaxed using density functional theory (DFT) to generate the test dataset. The MLFF is ultimately trained on these collected datasets, ensuring it can accurately predict the physical behaviors of moiré systems.

The procedure described above is implemented in DPmoire. As shown in Fig. 2, DPmoire is structured into four functional modules: DPmoire.preprocess, DPmoire.dft, DPmoire.data, and DPmoire.train. First, given the unit cell structures of each layer, the DPmoire.preprocess module automatically combines the two layers and generates shifted structures in a 2 × 2 supercell. It also prepares the twisted structures used to build the test set and writes the VASP input files according to the provided templates. Next, the DPmoire.dft module submits the VASP calculation jobs through the Slurm system. Once all calculations have finished, the DPmoire.data module collects the DFT-calculated data (energies, forces, and stresses) and merges them into a training set and a test set. The DPmoire.train module then modifies the system-dependent settings in the configuration file, according to the given template for training an Allegro or NequIP MLFF, and submits the training job. After training, the MLFF can be used in ASE62 or LAMMPS63 to perform structural relaxation.

Fig. 2: Overview of the DPmoire package workflow.

Initially, the preprocess module utilizes the provided structural files for each layer along with an input template to generate the necessary input files for subsequent VASP DFT calculations. The dft module then orchestrates these calculations using the Slurm management system. Upon completion, the data module collects the results and compiles them into datasets. Subsequently, the train module begins training a machine learning force field using these datasets, adhering to the parameters specified in the MLFF configuration template file. Once trained, the MLFF can be integrated with software packages such as LAMMPS63 or ASE62 to facilitate structural relaxation.

Performance of generated MLFF

The accuracy of MLFF is critically dependent on the precision of underlying density functional theory (DFT) calculations. Particularly in layered materials, the van der Waals (vdW) interactions play a crucial role in determining the DFT-calculated interlayer distances, making their inclusion indispensable. Over the years, a plethora of vdW correction methodologies have been developed64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80. Despite these developments, the predicted interlayer distances using different vdW corrections can vary by a few tenths of an Ångstrom.

Given this variation, it is crucial to identify the most appropriate vdW correction for each material prior to the training of MLFFs. To this end, we evaluated the lattice constants obtained under various vdW corrections, comparing them against experimental measurements to ascertain the optimal vdW correction for each material. The details of this comparative analysis and the optimal vdW corrections are documented in Section I of Supplementary Information, providing a rigorous foundation for the subsequent MLFF training. These tailored corrections are crucial for enhancing the accuracy of DFT calculations, thereby improving the robustness of the developed MLFFs for different TMD materials.

Then, the MLFF is constructed utilizing the previously determined optimal vdW corrections for both AA and AB stacking configurations of MX2 (M = Mo, W; X = S, Se, Te) materials, as thoroughly discussed in Section I of Supplementary Information. Settings used to train the MLFFs are shown in Table 2. We specifically examined AA WSe2 and AA MoS2 as representative examples. The efficacy of the MLFF is demonstrated through a comparison of predicted and DFT-calculated forces within the test set, as illustrated in Fig. 3. The comparison shows a strong alignment between the MLFF predictions and the DFT calculations, with root mean square errors of 0.007 eV/Å and 0.014 eV/Å for WSe2 and MoS2, respectively, underscoring the accuracy of the MLFF in capturing the essential physical interactions in these materials.
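
For reference, a force RMSE of this kind can be computed as in the sketch below. It assumes the error is taken over all Cartesian force components of all atoms in the test set, which is one common convention; the text does not state which convention was used, and the arrays here are synthetic placeholders rather than data from this work.

```python
import numpy as np

def force_rmse(f_pred, f_ref):
    """Root mean square error over all force components (units of the inputs)."""
    diff = np.asarray(f_pred, dtype=float) - np.asarray(f_ref, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic stand-ins for DFT and MLFF forces with shape (n_atoms, 3).
rng = np.random.default_rng(0)
f_dft = rng.normal(size=(100, 3))
f_mlff = f_dft + 0.01  # constant offset so the expected error is easy to reason about
rmse = force_rmse(f_mlff, f_dft)
print(f"force RMSE = {rmse:.4f}")
```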

Table 2 Settings in training MLFFs

Fig. 3: Force error of generated MLFF.

a MLFF-predicted versus DFT-calculated forces for AA WSe2 in test set of 7.34° twist. b Similar comparison for AA MoS2 in test set including 9.34°, 7.34°, and 6.08° twists.

We further evaluated the performance of the trained MLFFs by relaxing a structure with a 7.34° twist angle and comparing the result with a DFT relaxation of the same structure. As depicted in Fig. 4, the relaxation outcomes from the MLFF are nearly indistinguishable from those obtained via DFT, with no significant deviations observed. The maximum differences in atomic positions were found to be 0.039 Å in WSe2 and 0.003 Å in MoS2. In the relaxed structures, regions with MX and XM stacking exhibit smaller interlayer distances than the AA regions. During relaxation, atoms near the AA regions tend to rotate counterclockwise, which increases the local twist angle. Conversely, atoms near the MX and XM regions rotate clockwise. This differential rotation enlarges the MX and XM regions while shrinking the AA regions. These findings align well with previous theoretical studies23.
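
One way to quantify such a comparison is sketched below. It assumes both relaxed structures share the same cell and atom ordering, and it wraps displacements through fractional coordinates, which is adequate when the displacements are small compared with the cell. The function name and the toy cell are illustrative only.

```python
import numpy as np

def max_position_difference(pos_a, pos_b, cell):
    """Largest per-atom distance between two structures with identical ordering.

    `cell` holds the lattice vectors as rows. Differences are wrapped so that
    atoms that crossed a periodic boundary are not counted as large moves.
    """
    delta = np.asarray(pos_a, dtype=float) - np.asarray(pos_b, dtype=float)
    frac = delta @ np.linalg.inv(cell)   # Cartesian -> fractional
    frac -= np.round(frac)               # wrap into [-0.5, 0.5)
    wrapped = frac @ cell                # fractional -> Cartesian
    return float(np.linalg.norm(wrapped, axis=1).max())

# Toy hexagonal cell and two nearly identical structures (illustrative values).
cell = np.array([[10.0, 0.0, 0.0],
                 [-5.0, 8.66, 0.0],
                 [0.0, 0.0, 20.0]])
pos_mlff = np.array([[0.1, 0.0, 0.0],
                     [1.0, 1.0, 1.0]])
pos_dft = pos_mlff.copy()
pos_dft[0] += cell[0] + np.array([0.0, 0.0, 0.03])  # periodic image plus a small shift
pos_dft[1] += np.array([0.01, 0.0, 0.0])
print(max_position_difference(pos_mlff, pos_dft, cell))
```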

Fig. 4: Relaxation pattern of 7.34° AA WSe2 and MoS2.

a, c correspond to the interlayer distance and intralayer displacement in MLFF-relaxed WSe2, respectively. b, d correspond to the interlayer distance and intralayer displacement in DFT-relaxed WSe2. e, g correspond to the interlayer distance and intralayer displacement in MLFF-relaxed MoS2. f, h correspond to the interlayer distance and intralayer displacement in DFT-relaxed MoS2.

To further investigate how different relaxation approaches affect the computed band structures, we also performed band structure calculations on both MLFF-relaxed and DFT-relaxed structures for AA WSe2 and AA MoS2, as shown in Fig. 5. The band structures of the two methods are nearly identical, with only minor differences, demonstrating that the MLFF is sufficiently accurate to capture the essential physical phenomena in moiré structures without the need for additional DFT relaxation. As detailed in Section II of Supplementary Information, MLFFs for other materials also exhibited robust performance. For MoS2, WS2, AB MoTe2, and WTe2, the structures relaxed by MLFF and DFT methods were nearly identical, and their corresponding band structures closely matched. However, for materials like MoSe2 and AA-stacked MoTe2, slight variations in interlayer distances led to minor differences in their band structures. We further analyzed the 5.09° twist angle in AA and AB stacked MoSe2 (Supplementary Fig. 13), where the discrepancies between DFT-relaxed and MLFF-relaxed structures were reduced, suggesting that the observed suboptimal performance in these materials may be due to the larger twist angles. In large-angle structures, the lattice mismatch between layers is not negligible, and such atomic configurations rarely appear in the training dataset.

Fig. 5: Band structure comparison of 7.34° AA WSe2 and MoS2.

a Comparison of electronic band structure between MLFF-relaxed structure and DFT-relaxed structure in 7.34° AA WSe2. b Comparison of electronic band structure between MLFF-relaxed structure and DFT-relaxed structure in 7.34° AA MoS2.

Moreover, we evaluated the transferability of our MLFF with 7.34° AB MoS2 as an example. The root mean squared errors (RMSE) under different temperatures and stresses are shown in Fig. 6. For the temperature tests, we conducted 1-picosecond MLFF-MD simulations at each temperature and sampled 10 structures evenly from the trajectories to compute the force errors against DFT references. For the stress tests, we performed structural relaxations starting from a rigid structure and similarly sampled 10 structures from each relaxation trajectory. Stresses in the z-direction are applied by imposing forces on the top and bottom sulfur atoms. The results demonstrate excellent transferability of the MLFF across varying temperatures and stresses. This indicates that our MLFF is not only suitable for structural relaxation but also robust for MD simulations under diverse conditions.

Fig. 6: RMSE of forces of 7.34° AB MoS2 in different temperatures and stresses.

a RMSE under different temperatures. For each data point, we sampled 10 structures evenly from 1-ps MLFF-MD simulation to calculate the error. b RMSE under different stresses. For each data point, we sampled 10 structures evenly from the relaxation trajectories. Stresses in the z-direction are applied by imposing forces on the top and bottom sulfur atoms.

Discussion

In this work, we introduced a universal methodology and developed an open-source tool, DPmoire, for constructing MLFFs tailored to moiré structures. Utilizing the VASP MLFF module, DPmoire generates training sets from non-twisted structures and constructs test sets from large-twist-angle configurations. We trained accurate MLFFs for MX2 (M = Mo, W; X = S, Se, Te) systems, which closely reproduce both the relaxation patterns and electronic band structures obtained from DFT relaxations, at a significantly reduced computational cost.

This innovative tool enables the effective relaxation of moiré systems across a broader range of smaller angles and varied materials. Additionally, it facilitates phonon calculations within these complex systems. We anticipate that DPmoire will significantly enhance the understanding of physical phenomena influenced by relaxation effects and spur the discovery of novel moiré materials.

Moreover, we found that for moiré systems, carefully constructing the training set can significantly improve the accuracy of the MLFF. We believe that for other systems, designing the training set around the characteristics shared by the target structures may likewise be a promising route to an accurate model.

Methods

Moiré structures

Moiré structures can be constructed either by applying a twist angle between two layers of a layered material or by stacking two materials with a slight lattice constant mismatch. Generally, the smaller the twist angle, the larger the resulting moiré supercell. Different regions of a moiré structure exhibit different stacking arrangements. Taking twisted AA WSe2 as an example (Fig. 7), in the AA region, the W/Se atoms in the top layer are aligned with the corresponding W/Se atoms in the bottom layer. In the MX region, the W atoms in the top layer align with the Se atoms in the bottom layer, while in the XM region, the Se atoms in the top layer align with the W atoms in the bottom layer. In non-twisted structures, the various stacking configurations have different energies, as illustrated in Fig. 7(b). When the interlayer twist angle is small, the lattice vectors of the two layers nearly coincide, making the local atomic configurations in the moiré structure similar to those in non-twisted structures. By modeling the potential energy surfaces of these non-twisted configurations, we can effectively reconstruct the potential energy landscape of twisted structures.
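
The link between twist angle, supercell size, and local stacking can be made explicit with the standard geometric relations for two identical hexagonal lattices. The symbols a (lattice constant), θ (twist angle), L_m (moiré period), N (unit cells per layer in the moiré cell), and d(r) (local interlayer shift) are introduced here for illustration only, and the layers are assumed rigid and rotated by ±θ/2 about a common axis:

```latex
L_m = \frac{a}{2\sin(\theta/2)} \approx \frac{a}{\theta} \quad (\theta \ll 1),
\qquad
N = \left(\frac{L_m}{a}\right)^{2} = \frac{1}{4\sin^{2}(\theta/2)},
\qquad
\mathbf{d}(\mathbf{r}) = R_{\theta/2}\,\mathbf{r} - R_{-\theta/2}\,\mathbf{r}
                       = 2\sin(\theta/2)\,\hat{\mathbf{z}} \times \mathbf{r}.
```

In this rigid picture, the neighborhood of a point r resembles a non-twisted bilayer shifted by d(r), and d(r) sweeps through a full unit cell over one moiré period. This is the sense in which a dataset of shifted non-twisted bilayers samples the local environments of a small-angle structure.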

Fig. 7: Characteristic of moiré structures.

a Moiré crystal structure of WSe2 with a 2.13° AA stacking twist, resembling the atomic layout of non-twisted bilayer WSe2. b Energy profile of non-twisted bilayer WSe2 based on relative in-plane shifts between layers, where X and Y axes represent shift vectors, and color indicates unit cell energy. Energy at MX and XM stackings is zeroed. Interlayer distance is 6.8 Å.

Machine learning force fields

MLFFs39,40,41,42,43,44,45,46,47,48,49,50 are machine learning models that predict the energies and forces of atomic structures. Training an MLFF typically requires a dataset of structures together with their corresponding energies and forces. Once training is complete, the MLFF can rapidly predict the energies and forces of similar structures. The computational cost of MLFF prediction scales linearly with the number of atoms, making the cost of relaxation manageable even for very large structures.

However, constructing a comprehensive dataset can be a time-consuming endeavor. Directly using ab-initio MD simulations to build datasets is a relatively inefficient approach, as structures that are close in time within an MD trajectory are very similar. This similarity results in a redundancy that offers little added value to the training dataset, posing a challenge for efficient MLFF deployment.

On-the-fly MLFF approaches like DP-GEN81 and the MLFF module of the Vienna Ab initio Simulation Package (VASP)39,82 provide effective solutions for managing computational costs in MD simulations. This article focuses on the MLFF module within VASP. This module automates data collection, MLFF training, and the immediate application of the MLFF to accelerate MD simulations within a continuous loop. The MLFF module is based on Bayesian linear regression, which allows it to directly estimate the error in its predictions without comparing them against ab initio results. During an MD simulation, if the module estimates a small error, it applies the MLFF-predicted results directly. Conversely, if a large error is estimated, it discards these results and performs a DFT step to obtain accurate data. These ab initio data are then added to the training dataset to refine the MLFF. The error threshold is also updated during the run according to the average predicted error; in our calculations, its initial value is set to 2 meV. This iterative process repeats throughout the MD simulation, allowing for extensive sampling from MD trajectories, which can involve tens of thousands of steps, while requiring DFT calculations for only several hundred steps. In our test cases, around two hundred DFT steps were computed during a 10,000-step MD simulation. As a result, a high-quality dataset can be constructed at modest computational expense.
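
The control flow of such an on-the-fly loop can be illustrated with a deliberately simplified toy. The sketch below is not the VASP algorithm: it uses a one-dimensional coordinate, a hypothetical polynomial descriptor, a cheap analytic function in place of DFT, and a fixed threshold in arbitrary units rather than an evolving one. It only shows how a Bayesian linear model's own uncertainty can decide when a reference calculation is needed.

```python
import numpy as np

def features(x):
    # Hypothetical descriptor: a small polynomial basis of a 1D coordinate.
    return np.array([1.0, x, x**2, x**3])

def reference_energy(x):
    # Stand-in for an expensive DFT call (toy function, not real physics).
    return 0.5 * x**2 - 0.1 * x**3

class BayesianLinearModel:
    def __init__(self, n_features, prior_precision=1e-2, noise_precision=1e4):
        self.alpha = prior_precision
        self.beta = noise_precision
        self.X = np.empty((0, n_features))
        self.y = np.empty(0)
        self.cov = np.eye(n_features) / self.alpha  # prior covariance
        self.mean = np.zeros(n_features)

    def add(self, phi, target):
        """Add one reference data point and refit the posterior."""
        self.X = np.vstack([self.X, phi])
        self.y = np.append(self.y, target)
        n = self.X.shape[1]
        precision = self.alpha * np.eye(n) + self.beta * self.X.T @ self.X
        self.cov = np.linalg.inv(precision)
        self.mean = self.beta * self.cov @ self.X.T @ self.y

    def predict(self, phi):
        """Posterior mean and standard deviation of the predicted energy."""
        variance = max(float(phi @ self.cov @ phi), 0.0)
        return float(phi @ self.mean), float(np.sqrt(variance))

def run_on_the_fly(trajectory, threshold=0.002):
    model = BayesianLinearModel(n_features=4)
    dft_steps = []
    for step, x in enumerate(trajectory):
        phi = features(x)
        _, sigma = model.predict(phi)
        if sigma > threshold:
            # Estimated error too large: call the reference and retrain.
            model.add(phi, reference_energy(x))
            dft_steps.append(step)
        # Otherwise the model prediction would be used to propagate the MD step.
    return model, dft_steps

trajectory = np.sin(np.linspace(0.0, 20.0, 2000))  # toy "MD" coordinate
model, dft_steps = run_on_the_fly(trajectory)
print(len(dft_steps), "reference calls out of", len(trajectory), "steps")
```

The list `dft_steps` plays the role of the DFT-computed frames that are later harvested as training data, while the frames predicted by the model alone are discarded.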

The MLFF algorithm in VASP is designed to be relatively lightweight, which significantly reduces the training time required during the simulation loop. However, this streamlined approach means that the accuracy of the VASP MLFF may not rival that of more complex neural network-based MLFF algorithms. Consequently, we first utilize VASP MLFF and collect only the DFT data generated in this iteration, subsequently employing a more accurate neural network-based MLFF to fit the collected DFT dataset.

One such advanced approach is NequIP, a machine learning force field based on an E(3)-equivariant graph neural network41. This method enforces equivariance of the inputs, outputs, and hidden layers, leading to enhanced data efficiency and model accuracy. Another notable E(3)-equivariant algorithm is Allegro, which is particularly well-suited for large structures and optimized for parallel computing40. While this article primarily focuses on the application of Allegro, the dataset generated using our approach is versatile and can be employed to train other MLFF models as well. This flexibility facilitates the exploration and application of various advanced MLFF techniques in computational materials science.
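
The property that equivariant networks build in by construction can be checked numerically on any rotation-invariant energy. The sketch below uses a toy pairwise potential, not NequIP or Allegro, and tests only a rotation about the z axis (E(3) also includes translations and reflections): rotating the structure and then computing forces gives the same result as computing forces and then rotating them.

```python
import numpy as np

def pair_forces(positions):
    """Forces from a toy pairwise energy E = sum_{i<j} (r_ij - 1)^2."""
    forces = np.zeros_like(positions)
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            rij = positions[i] - positions[j]
            r = np.linalg.norm(rij)
            # Force on atom i is the negative gradient of (r - 1)^2 w.r.t. r_i.
            f = -2.0 * (r - 1.0) * rij / r
            forces[i] += f
            forces[j] -= f
    return forces

def rotation_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 3))      # random toy structure
R = rotation_z(0.3)

lhs = pair_forces(x @ R.T)       # forces of the rotated structure
rhs = pair_forces(x) @ R.T       # rotated forces of the original structure
print("equivariant under rotation:", np.allclose(lhs, rhs))
```

A model without this built-in symmetry would have to learn it from rotated copies of the data, which is one reason equivariant architectures tend to be more data-efficient.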