Transfer learning relaxation, electronic structure and continuum model for twisted bilayer MoTe₂

Mao, Ning; Xu, Cheng; Li, Jiangxu; Bao, Ting; Liu, Peitao; Xu, Yong; Felser, Claudia; Fu, Liang; Zhang, Yang

doi:10.1038/s42005-024-01754-y

Published: 03 August 2024

Communications Physics volume 7, Article number: 262 (2024)

Abstract

Large-scale moiré systems are extraordinarily sensitive, with even minute atomic shifts leading to significant changes in electronic structures. Here, we investigate the lattice relaxation effect on moiré band structures in twisted bilayer MoTe₂ with two approaches: (a) large-scale plane-wave basis first-principles calculations down to 2.88°, and (b) transfer learning structure relaxation combined with local-basis first-principles calculations down to 1.1°. We use two types of van der Waals corrections, the D2 method of Grimme and the density-dependent energy correction, and find that the density-dependent energy correction yields a continuous evolution of bandwidth with twist angle. Based on these results, we develop a complete continuum model with a single set of parameters for a wide range of twist angles, and perform many-body simulations at ν = −1, −2/3, −1/3.

Introduction

Recent experiments on twisted bilayer MoTe₂ (tMoTe₂) reported the observation of the fractional quantum anomalous Hall (FQAH) effect at fractional fillings $\nu =-\frac{2}{3}$ and $-\frac{2}{5}$ of the moiré band^1,2,3,4. The realization of the FQAH effect in twisted transition metal dichalcogenide bilayers (tTMDs) was theoretically proposed^5,6,7 as a consequence of band topology⁸ and electron interaction. Specifically, spontaneous ferromagnetism and electron correlation in a spin-valley polarized Chern band lead to the emergence of fractional Chern insulators (FCI) that exhibit the FQAH effect at zero magnetic field. The observed FQAH effect in tMoTe₂ is remarkably robust, existing over an unexpectedly wide range of twist angles and persisting up to 2 K³. The experimental realization of the long-sought fractional quantum Hall effect at zero magnetic field^{9,10,11,12,13,14,15,16,17,18} not only expands the realm of fractionalized topological phases, but also holds promise for anyon-based topological quantum computations^19,20.

While theoretical studies have provided important insights into the FQAH effect in tTMDs, the underlying moiré band structures of tMoTe₂ over the experimentally accessible twist angle range have not been systematically studied. A number of first-principles studies report different bandwidths at the commensurate angle θ = 3.89°, ranging from 9 to 18 meV for the lowest moiré band^21,22,23. Importantly, lattice relaxation at the moiré length scale can significantly impact the band structure and even the band topology. While the effect of the out-of-plane corrugation has been considered, the in-plane lattice relaxation and the effect of the resulting strain field have not been incorporated into the continuum model. The strain-induced pseudomagnetic field, as well as higher-harmonic moiré potentials, strongly affect higher moiré bands, and are therefore crucial for studying band-mixing FCI states^23,24,25 and interaction-induced phases at higher filling factors. Finally, first-principles electronic structure calculations for twist angles below θ = 3.89° are entirely lacking, so the accuracy of the continuum model at small twist angles remains to be assessed.

In this work, we perform extensive first-principles simulations to study moiré lattice relaxation and electronic structures. Our calculations encompass a wide range of twist angles, reaching as small as 2.88° using a plane-wave basis and 1.1° using a transfer learning technique and a local basis. In addition to interlayer corrugation, we observe significant in-plane displacement^{26,27,28,29,30} reaching around 0.5 Å at small angles. To capture the significant effect of lattice relaxation, we extend the continuum model to include second-harmonic moiré potentials and a pseudomagnetic field up to 250 T from in-plane strain²⁷. Remarkably, all four topmost moiré valence bands over the entire range of experimentally relevant twist angles (θ = 2.6°–5°^1,2,3,4) are accurately reproduced by our continuum model with a single set of parameters. With the transfer-learning model, we further calculate the topological edge states and Wilson loop around 2°, revealing a series of C = 1 Chern bands. These findings serve as a starting point for studying even-denominator non-Abelian states.

Results

Two types of van der Waals corrections

For accurate lattice relaxations in two-dimensional (2D) multi-layer systems, it is essential to incorporate van der Waals (vdWs) dispersion corrections into the total energy, potential, interatomic forces, and stress tensor calculations. The choice of vdWs corrections, therefore, influences the lattice parameters of unit cells and the interlayer distances. Typically, vdWs corrections fall into two categories: (a) charge-density independent methods such as DFT-D2/D3 and (b) charge-density-dependent methods. The latter category accounts for charge-density variations in vdWs contributions of atoms influenced by their local chemical environments.

The DFT-D2 method³¹ adds an empirical single-shot dispersion correction to the conventional density functional theory (DFT) calculations. The correction term for the dispersion energy E_disp is given by

$$E_{\mathrm{disp}}=-s_{6}\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\frac{C_{6,ij}}{R_{ij}^{6}}\cdot f_{\mathrm{damp}}(R_{ij}).$$

(1)

Here, s₆ denotes a global scaling factor that depends only on the density functional used, N is the number of atoms in the system, C_6,ij are the dispersion coefficients for the atom pair (i, j), and R_ij is the distance between atoms i and j. The damping function f_damp is used to avoid the divergence of the dispersion term at short interatomic distances. C_6,ij and f_damp are determined solely by the local geometry and are therefore independent of the self-consistent iteration.
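The pairwise sum in Eq. (1) can be sketched in a few lines. This is a minimal illustration for a finite cluster, assuming the Fermi-type damping f_damp = 1/(1 + exp(−d(R_ij/R_r − 1))) and the geometric-mean rule C_6,ij = sqrt(C_6,i C_6,j) of Grimme's D2 scheme, with s₆ = 0.75 (the PBE value) and d = 20. The function name is hypothetical, any C₆ and radius values passed in are placeholders rather than the tabulated Mo/Te parameters, and the lattice sum needed for a periodic system is not included.

```python
import math

def d2_dispersion_energy(positions, c6, r0, s6=0.75, d=20.0):
    """D2-style dispersion energy of a finite set of atoms, following Eq. (1).

    positions : list of (x, y, z) Cartesian coordinates
    c6, r0    : per-atom C6 coefficients and van der Waals radii (placeholders)
    s6, d     : global scaling factor and damping steepness (assumed D2 defaults)
    """
    energy = 0.0
    n = len(positions)
    for i in range(n - 1):
        for j in range(i + 1, n):
            r_ij = math.dist(positions[i], positions[j])
            c6_ij = math.sqrt(c6[i] * c6[j])      # geometric-mean combination rule
            r_r = r0[i] + r0[j]                   # sum of van der Waals radii
            f_damp = 1.0 / (1.0 + math.exp(-d * (r_ij / r_r - 1.0)))
            energy -= s6 * c6_ij / r_ij**6 * f_damp
    return energy
```

Nothing in this expression refers to the charge density, which is why the D2 correction can be evaluated once per geometry, outside the self-consistent loop.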

Unlike the D2 method, the density-dependent screened Coulomb (dDsC) method^32,33 involves a density-dependent screening function to modulate the Coulomb interaction, which allows for a more realistic representation of vdWs interactions as a function of the local chemical environment. The correction term for dDsC can be expressed by

$$E_{\mathrm{disp}}=-\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\frac{C_{6,ij}}{R_{ij}^{6}}\cdot f_{\mathrm{damp}}(bR_{ij}).$$

(2)

The key difference between dDsC and DFT-D2 lies in the damping function f_damp, which for dDsC depends on the damping factor b. This damping factor is determined by the local electron density, the gradient of the electron density, and other environment-specific parameters. The method is therefore particularly useful for systems (e.g., the strongly correlated moiré systems studied here) where vdWs interactions are sensitive to the local electronic environment.
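To make the role of b concrete, the sketch below evaluates a damping function of the Tang–Toennies type, f₆(x) = 1 − e^(−x) Σ_{k=0}^{6} x^k/k! with x = bR_ij. That dDsC uses this functional form is our assumption here, and the construction of b from the density is not reproduced; the function name is hypothetical.

```python
import math

def tang_toennies_damping(x, order=6):
    """Tang-Toennies damping f_order(x) with x = b * R_ij (assumed dDsC form)."""
    partial_sum = sum(x**k / math.factorial(k) for k in range(order + 1))
    return 1.0 - math.exp(-x) * partial_sum
```

The function rises from 0 at x = 0 to 1 at large x, so a larger b (set by the local density in dDsC) switches the −C₆/R⁶ attraction on at shorter distances.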

For untwisted bulk structures, these two vdWs correction methods often give similar results. As shown in Supplementary Note 1, for bulk MoTe₂, the lattice constants and the vertical layer distances predicted by both the DFT-D2 and dDsC methods agree well with the experimental results (a = 3.52 Å and d = 6.99 Å)^34,35. However, for the moiré superlattice system, the dDsC method yields more reliable structure relaxation, since the rich local chemical environments appearing in the moiré superlattice, such as position-dependent electric dipoles, are better described by dDsC.

Large-scale DFT and lattice relaxation effect

Making use of the initial moiré structure generated by deep potential molecular dynamics (DeePMD)³⁶, large-scale structural relaxations can be achieved at a significantly reduced computational cost. Remarkably, the relaxation of the θ = 2.88° twisted structure comprising 2382 atoms was completed in just 5 h with 17 DFT ionic steps, starting from the DeePMD-generated structure, on four NVIDIA H100 GPUs. The self-consistent calculation and band diagonalization of this 2382-atom system (IBAND = 17,160 and plane-wave number 11,469,590) can be done within 80 min on 20 NVIDIA H100 GPUs, showcasing the massive speedup of the GPU platform for large-scale first-principles simulations.
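The system sizes quoted above follow from simple geometry. The sketch below assumes the usual (i, i+1) family of commensurate twisted hexagonal bilayers, in which each layer contributes 3i² + 3i + 1 monolayer cells per moiré cell and a_M = a/(2 sin(θ/2)); the function name and the index i are illustrative and not taken from the paper.

```python
import math

A_MONO = 3.52  # Å, experimental in-plane lattice constant quoted in the text

def commensurate_cell(i, a=A_MONO, atoms_per_monolayer_cell=3):
    """Twist angle (deg), moiré period (Å) and atom count for index i."""
    n_cells = 3 * i * i + 3 * i + 1          # monolayer cells per layer
    theta = 2.0 * math.asin(1.0 / (2.0 * math.sqrt(n_cells)))
    a_moire = a * math.sqrt(n_cells)         # equals a / (2 sin(theta / 2))
    n_atoms = 2 * atoms_per_monolayer_cell * n_cells   # two MoTe2 layers
    return math.degrees(theta), a_moire, n_atoms

for i in (8, 11):
    theta, a_moire, n_atoms = commensurate_cell(i)
    print(f"i={i}: theta={theta:.2f} deg, a_M={a_moire:.1f} A, atoms={n_atoms}")
```

With i = 11 this gives θ ≈ 2.88° and 2382 atoms, matching the cell discussed above, and i = 8 gives θ ≈ 3.89°.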

To demonstrate the relaxation effect in tMoTe₂, we compare the relaxed moiré structures with twist angles 3.89° and 2.88°. First, there is a large variation in the interlayer spacing (ILS) (Fig. 1), indicating a large structural transformation. For tMoTe₂ with a twist angle of 3.89° (Fig. 1a), the maximum ILS observed is 7.8 Å. This occurs in the MM region, where the Te/Mo atoms of the top layer are aligned directly above those in the bottom layer, resulting in an energy increase in this area due to the strong repulsion. The minimum ILS is 7.0 Å, which is observed in the MX region, where the Mo atoms of the top layer stack over the Te atoms of the bottom layer. Figure 1b shows the ILS for 2.88° tMoTe₂, exhibiting a clear domain wall connecting MM regions, which becomes more pronounced at lower twist angles, as shown in Supplementary Figs. 2–5.

**Fig. 1: Lattice relaxation of interlayer and intralayer distances.**

Concerning the intralayer strain, both structures exhibit similar behaviors. As depicted in Fig. 1c, d, the in-plane displacement pattern displays a helical chirality, with the amplitude increasing as the twist angle decreases. We observe a large displacement of up to 0.5 Å for θ = 2.88°, which generates a pseudomagnetic field up to 200 T (see Supplementary Note 3).

Symmetry analysis of moiré band structures

The space group of the relaxed structures is P321 (No. 150), whose point group is generated by a twofold rotational symmetry along the y axis (C_2y) and a threefold rotational symmetry along the z axis (C_3z). In crystal momentum space, the C_2y symmetry only protects twofold degeneracies at the invariant lines or points within the Brillouin zone, as defined by the relation C_2y k → k. Within this invariant space, the Hamiltonian commutes with the symmetry operation, allowing it to be block-diagonalized into two distinct sectors, each characterized by unique eigenvalues ± π. Due to the constraints imposed by the symmetry, a band represented by e^iπ is inherently degenerate with another band represented by e^−iπ, forming a doubly degenerate band structure. Consequently, the only lines that are invariant under the C_2y symmetry within the two-dimensional Brillouin zone are the ΓK lines (satisfying 2k₁ + k₂ = 0). When the C_3z rotational symmetry is also considered, the lines that meet the conditions k₁ + 2k₂ = 0 and k₁ − k₂ = 0 emerge as symmetry-invariant lines as well. As a result, bands along the ΓK and MK lines are always doubly degenerate, while a clear splitting is observed along the ΓM line, as shown in Fig. 2.

**Fig. 2: Fitting results from the continuum model.**

In Fig. 3, we plot the angle-dependent bandwidth and direct gap using the two types of vdW corrections. At twist angle θ = 3.89°, the D2 type of vdW correction gives a narrow bandwidth of 12 meV, which is close to a previous calculation using the local-basis SIESTA package³⁷ and the D2 correction²¹. Under the dDsC type of vdW correction, by contrast, we obtain a bandwidth of 18 meV, and the overall trend of the angle-dependent bandwidth follows the parabolic behavior of a continuum model with a single set of parameters, as we will discuss later. At the smallest calculated twist angle θ = 2.88°, the width of the top moiré valence band is reduced to 6 meV.

**Fig. 3: Variation of bandwidth and direct band gap under different van der Waals corrections.**

Complete continuum model

We now introduce a more comprehensive continuum model to depict the moiré band structure. The key low-energy states originate from the hole bands in the K and $K'$ valleys of the two MoTe₂ layers. Considering that these valleys are connected through time-reversal symmetry ($\mathcal{T}$), analyzing one valley is sufficient to infer the band structure. For tTMD systems with rotational (C_3z) and layer-exchange symmetry ($C_{2y}\mathcal{T}$), we derive the following form:

$$\hat{H}=\left[\begin{array}{cc}-\frac{(\hat{\boldsymbol{k}}-\boldsymbol{K}_{t}+e\boldsymbol{A})^{2}}{2m^{*}}+\Delta_{t}(\boldsymbol{r}) & \Delta_{T}(\boldsymbol{r})\\ \Delta_{T}^{\dagger}(\boldsymbol{r}) & -\frac{(\hat{\boldsymbol{k}}-\boldsymbol{K}_{b}-e\boldsymbol{A})^{2}}{2m^{*}}+\Delta_{b}(\boldsymbol{r})\end{array}\right]$$

(3)

with:

$$\begin{aligned}\Delta_{t}(\boldsymbol{r}) &= 2V_{1}\sum_{i=1,3,5}\cos(\boldsymbol{g}_{i}^{1}\cdot\boldsymbol{r}+l\phi_{1})+2V_{2}\sum_{i=1,3,5}\cos(\boldsymbol{g}_{i}^{2}\cdot\boldsymbol{r})\\ \Delta_{T} &= w_{1}\sum_{i=1,2,3}e^{-i\boldsymbol{q}_{i}^{1}\cdot\boldsymbol{r}}+w_{2}\sum_{i=1,2,3}e^{-i\boldsymbol{q}_{i}^{2}\cdot\boldsymbol{r}}\\ \boldsymbol{A}(\boldsymbol{r}) &= A\left(\boldsymbol{a}_{2}\sin(\boldsymbol{G}_{1}\cdot\boldsymbol{r})-\boldsymbol{a}_{1}\sin(\boldsymbol{G}_{3}\cdot\boldsymbol{r})-\boldsymbol{a}_{3}\sin(\boldsymbol{G}_{5}\cdot\boldsymbol{r})\right)\end{aligned}$$

(4)

where $\hat{\boldsymbol{k}}$ is the momentum measured from the Γ point of single-layer MoTe₂, K_t (K_b) is the high-symmetry momentum K of the top (bottom) layer, Δ_t(r) (Δ_b(r)) is the layer-dependent moiré potential, Δ_T(r) is the interlayer tunneling, the G_i are moiré reciprocal vectors, and A(r) is the strain-induced gauge field, which gives a periodic pseudomagnetic field^27,38: $\boldsymbol{B}(\boldsymbol{r})=B\sum_{i=1,3,5}\cos(\boldsymbol{G}_{i}\cdot\boldsymbol{r})$. $\boldsymbol{g}_{i}^{1}$ and $\boldsymbol{g}_{i}^{2}$ represent the momentum differences between the nearest and second-nearest plane-wave bases within the same layer. Similarly, $\boldsymbol{q}_{i}^{1}$ and $\boldsymbol{q}_{i}^{2}$ denote the momentum differences between the nearest and second-nearest plane-wave bases across different layers. a₁, a₂, a₃ are the moiré lattice vectors. The relations between the different wave vectors are given by

$$\begin{aligned}\boldsymbol{G}_{1}&=\frac{4\pi}{\sqrt{3}a_{M}}\left(\frac{1}{2},-\frac{\sqrt{3}}{2}\right)^{T},\quad \boldsymbol{G}_{3}=\frac{4\pi}{\sqrt{3}a_{M}}\left(\frac{1}{2},\frac{\sqrt{3}}{2}\right)^{T},\quad \boldsymbol{G}_{5}=\frac{4\pi}{\sqrt{3}a_{M}}(-1,0)^{T},\\ \boldsymbol{q}_{1}^{1}&=\frac{4\pi}{3a}\,2\sin\left(\frac{\theta}{2}\right)(0,1)^{T},\quad \boldsymbol{q}_{2}^{1}=C_{3}\boldsymbol{q}_{1}^{1},\quad \boldsymbol{q}_{3}^{1}=C_{3}^{2}\boldsymbol{q}_{1}^{1},\\ \boldsymbol{q}_{1}^{2}&=\boldsymbol{q}_{1}^{1}-\boldsymbol{G}_{5},\quad \boldsymbol{q}_{2}^{2}=C_{3}\boldsymbol{q}_{1}^{2},\quad \boldsymbol{q}_{3}^{2}=C_{3}^{2}\boldsymbol{q}_{1}^{2}.\end{aligned}$$
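The wave vectors listed above can be generated and sanity-checked numerically. The sketch below assumes a_M = a/(2 sin(θ/2)) with a = 3.52 Å and takes C₃ to be a counterclockwise rotation by 120°; the helper names are hypothetical.

```python
import math

def rotate(v, angle):
    """Rotate a 2D vector counterclockwise by `angle` (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def moire_wave_vectors(theta_deg, a=3.52):
    """Return a_M (Å) and the G_i, q_i^1, q_i^2 vectors (1/Å) at twist angle theta."""
    theta = math.radians(theta_deg)
    a_m = a / (2.0 * math.sin(theta / 2.0))        # assumed moiré period
    g = 4.0 * math.pi / (math.sqrt(3.0) * a_m)
    G1 = (0.5 * g, -math.sqrt(3.0) / 2.0 * g)
    G3 = (0.5 * g, math.sqrt(3.0) / 2.0 * g)
    G5 = (-g, 0.0)
    c3 = 2.0 * math.pi / 3.0
    q1_first = (0.0, 4.0 * math.pi / (3.0 * a) * 2.0 * math.sin(theta / 2.0))
    q1 = [rotate(q1_first, k * c3) for k in range(3)]
    q2_first = (q1_first[0] - G5[0], q1_first[1] - G5[1])
    q2 = [rotate(q2_first, k * c3) for k in range(3)]
    return {"a_M": a_m, "G": [G1, G3, G5], "q1": q1, "q2": q2}
```

Two quick consistency checks follow from these definitions: G₁ + G₃ + G₅ = 0, and the second-shell tunneling vectors are exactly twice as long as the first-shell ones, |q²| = 2|q¹|.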

To obtain accurate parameters for the continuum model, we perform large-scale calculations with dDsC vdWs corrections (IVDW = 4), then fit the DFT moiré band structure at 3.15° to obtain the following continuum parameters: m^* = 0.62m_e, V₁ = 10.3 meV, V₂ = 2.9 meV, w₁ = −7.8 meV, w₂ = 6.9 meV, ϕ₁ = −75°, Φ/Φ₀ = 0.737 (Φ₀ is the flux quantum and Φ is the flux in the moiré unit cell). The continuum model parameters with D2 vdW corrections (IVDW = 10) are presented in Supplementary Note 4. In our subsequent analysis of the continuum model, we use the parameters obtained with IVDW = 4, as it provides the more reliable structure relaxation discussed previously. With these parameters, we can solve for the moiré band structures at various small twist angles.
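As an illustration of how the fitted parameters enter the model, the sketch below evaluates the intralayer moiré potential of Eq. (4) with the dDsC values of V₁, V₂ and ϕ₁ quoted above. Several choices are assumptions made for this example only: the layer index l is taken as +1 (top) or −1 (bottom), the first-shell vectors g¹ are identified with G₁, G₃, G₅, and the second-shell vectors g² are taken as their pairwise differences. The function names are hypothetical.

```python
import math

# Fitted dDsC parameters quoted in the text (meV and degrees)
V1, V2, PHI1_DEG = 10.3, 2.9, -75.0

def harmonic_shells(a_m):
    """First- and (assumed) second-shell moiré reciprocal vectors for period a_m."""
    g = 4.0 * math.pi / (math.sqrt(3.0) * a_m)
    G1 = (0.5 * g, -math.sqrt(3.0) / 2.0 * g)
    G3 = (0.5 * g, math.sqrt(3.0) / 2.0 * g)
    G5 = (-g, 0.0)
    first = [G1, G3, G5]
    # Assumption: second shell = pairwise differences, of length sqrt(3) * |G|
    second = [(G1[0] - G3[0], G1[1] - G3[1]),
              (G3[0] - G5[0], G3[1] - G5[1]),
              (G5[0] - G1[0], G5[1] - G1[1])]
    return first, second

def intralayer_potential(r, a_m, layer=+1):
    """Moiré potential (meV) at position r = (x, y) for layer index +1 or -1."""
    first, second = harmonic_shells(a_m)
    phi = math.radians(PHI1_DEG)
    dot = lambda g: g[0] * r[0] + g[1] * r[1]
    return (2.0 * V1 * sum(math.cos(dot(g) + layer * phi) for g in first)
            + 2.0 * V2 * sum(math.cos(dot(g)) for g in second))
```

At r = 0 the result reduces to 6V₁ cos ϕ₁ + 6V₂ regardless of the vector conventions, which gives a simple check on any implementation.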

Next, we examine the topology of these moiré bands from 1.6° to 5°. At twist angles below 2.5°, the Chern numbers for the top three bands, as calculated using the continuum model, are 1, −1, 0, as shown in Fig. 4c (see Supplementary Note 2). For greater twist angles, these Chern numbers change to 1, 1, −2. We emphasize that the arrangement of Chern numbers for θ > 2.83° is in agreement with experimental data. So far, in all experiments where the twist angle θ ranges between 3.5° and 3.9°, both the Hall conductance and the reflective magnetic circular dichroism^1,2 increase once the doping exceeds ν = −1. In addition, a double quantum spin Hall effect has been observed at ν = −4. These results suggest that the second band shares the same Chern number as the first band.
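One common way to obtain band Chern numbers on a discretized Brillouin zone is the gauge-invariant plaquette (link-variable) method of Fukui, Hatsugai and Suzuki. The sketch below applies it to a simple two-band lattice model, H(k) = sin k_x σ_x + sin k_y σ_y + (m + cos k_x + cos k_y) σ_z, used here purely as a stand-in; it is not the continuum model of Eq. (3), and we do not claim it is the procedure used for Fig. 4c.

```python
import cmath
import math

def lower_band_state(kx, ky, m):
    """Normalized lower-band eigenvector of the two-band stand-in model."""
    dx, dy = math.sin(kx), math.sin(ky)
    dz = m + math.cos(kx) + math.cos(ky)
    d = math.sqrt(dx * dx + dy * dy + dz * dz)
    # Two equivalent eigenvectors for eigenvalue -|d|; pick the one that cannot vanish
    if dz >= 0.0:
        v = (complex(-dx, dy), complex(dz + d, 0.0))
    else:
        v = (complex(dz - d, 0.0), complex(dx, dy))
    norm = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
    return (v[0] / norm, v[1] / norm)

def overlap(u, v):
    return u[0].conjugate() * v[0] + u[1].conjugate() * v[1]

def chern_number(m, n=24):
    """Sum the Berry phases of all plaquettes on an n x n grid of the Brillouin zone."""
    ks = [2.0 * math.pi * i / n for i in range(n)]
    u = [[lower_band_state(kx, ky, m) for ky in ks] for kx in ks]
    total = 0.0
    for i in range(n):
        for j in range(n):
            ip, jp = (i + 1) % n, (j + 1) % n
            loop = (overlap(u[i][j], u[ip][j]) * overlap(u[ip][j], u[ip][jp])
                    * overlap(u[ip][jp], u[i][jp]) * overlap(u[i][jp], u[i][j]))
            total += cmath.phase(loop)   # gauge-invariant plaquette phase
    return total / (2.0 * math.pi)
```

Because each plaquette phase is gauge invariant, no smooth gauge has to be fixed across the zone, and the sum is an integer up to rounding error. The same loop structure carries over to a continuum-model band once its Bloch states are sampled on a grid.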

**Fig. 4: Band topology of the continuum model.**

We additionally verify the trace condition for the uppermost moiré band. The band’s geometry is encapsulated in the quantum geometry tensor:

$$\eta^{uv}:=A_{BZ}\langle\partial^{u}u_{\boldsymbol{k}}|\left(1-|u_{\boldsymbol{k}}\rangle\langle u_{\boldsymbol{k}}|\right)|\partial^{v}u_{\boldsymbol{k}}\rangle$$

(5)

where A_BZ is the area of the Brillouin zone. The antisymmetric (imaginary) and symmetric (real) parts of the quantum geometry tensor give the Berry curvature ($\Omega(\boldsymbol{k})=-2\,\mathrm{Im}(\eta^{xy})$) and the quantum metric ($g^{uv}(\boldsymbol{k})=\mathrm{Re}(\eta^{uv})$), respectively. To quantify the geometric properties, one can calculate two figures of merit^39,40,41:

$$\begin{aligned}\sigma_{F} &:= \left[\frac{1}{A_{BZ}}\int d^{2}\boldsymbol{k}\left(\frac{\Omega(\boldsymbol{k})}{2\pi}-1\right)^{2}\right]^{\frac{1}{2}}\\ T &:= \frac{1}{A_{BZ}}\int d^{2}\boldsymbol{k}\left[\mathrm{tr}\,g(\boldsymbol{k})-\left|\Omega(\boldsymbol{k})\right|\right],\end{aligned}$$

(6)

where σ_F describes the fluctuations of the Berry curvature and T quantifies the violation of the trace condition. When both σ_F and T tend towards 0, it becomes possible to map the Chern band exactly onto a Landau-level problem, allowing for an intuitive understanding of the fractional state. We calculate both quantities as a function of twist angle, as depicted in Fig. 4d.
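On a uniform k-grid, the Brillouin-zone averages in Eq. (6) reduce to plain means over the grid points. The sketch below assumes the Berry curvature and the trace of the quantum metric have already been sampled with the normalization of Eq. (5), and takes the trace-condition integrand to be tr g − |Ω| (the usual form, assumed here). The function name is hypothetical.

```python
import math

def geometry_figures_of_merit(berry, trace_g):
    """Return (sigma_F, T) from samples on a uniform grid covering the BZ once.

    berry   : Berry curvature values Omega(k)
    trace_g : trace of the quantum metric tr g(k), same ordering as `berry`
    """
    n = len(berry)
    # (1 / A_BZ) * integral over the BZ becomes an average over grid points
    sigma_f = math.sqrt(sum((o / (2.0 * math.pi) - 1.0) ** 2 for o in berry) / n)
    t = sum(g - abs(o) for o, g in zip(berry, trace_g)) / n
    return sigma_f, t
```

A band with perfectly uniform curvature Ω = 2π that saturates tr g = |Ω| everywhere returns (0, 0), the Landau-level-like limit described above.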

Transfer learning structure relaxation

To perform the structural relaxation of large moiré cells, we adopt the ab initio DeePMD method, which combines first-principles accuracy with empirical-potential efficiency for large-scale systems³⁶. We begin with 3 × 3 × 1 MM, MX, and XM configurations, along with 28 distinct intermediate transition states, all of which have been relaxed at fixed volume. For each of the 31 configurations, we introduce random perturbations to generate 200 distinct structures. The random perturbations are applied both to the atomic coordinates, drawing values from a uniform distribution spanning [−0.01 Å, 0.01 Å], and to the lattice constants, guided by a deformation matrix that is constructed from an identity matrix distorted by values in [−0.03, 0.03]. In addition, we run 20 fs of ab initio molecular dynamics to gather VASP-calculated energies, forces, and virial tensors, which together constitute the initial training set.
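The perturbation step can be sketched as follows. How exactly the deformation matrix is applied is not specified in the text, so the choices below are assumptions: uniform noise is added to every element of the identity matrix, the cell vectors (rows) and Cartesian positions are multiplied by the resulting matrix so that fractional coordinates are preserved, and the atomic displacements are added afterwards. The function name is hypothetical.

```python
import random

def perturb(cell, positions, rng, atom_amp=0.01, cell_amp=0.03):
    """Return a randomly perturbed copy of (cell, positions).

    cell      : 3x3 list, rows are lattice vectors in Å
    positions : list of Cartesian coordinates in Å
    atom_amp  : half-width of the uniform atomic displacement (Å)
    cell_amp  : half-width of the uniform noise added to the identity matrix
    """
    deform = [[(1.0 if r == c else 0.0) + rng.uniform(-cell_amp, cell_amp)
               for c in range(3)] for r in range(3)]
    new_cell = [[sum(cell[r][k] * deform[k][c] for k in range(3))
                 for c in range(3)] for r in range(3)]
    new_positions = []
    for p in positions:
        strained = [sum(p[k] * deform[k][c] for k in range(3)) for c in range(3)]
        new_positions.append([x + rng.uniform(-atom_amp, atom_amp) for x in strained])
    return new_cell, new_positions
```

Calling this 200 times for each of the 31 relaxed configurations would yield the 6200 perturbed structures described above.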

Next, we train the initial neural network model on the initial training set, and run molecular dynamics simulations at different pressures (−100 to 10000 bar) and temperatures (10 to 500 K). Numerous trajectories are generated in this process, and we label their configurations as failure, candidate, or accurate according to the model deviation $\sigma_{f}^{\max}=\max_{i}\sqrt{\langle\vert\mathrm{F}_{i}-\langle\mathrm{F}_{i}\rangle\vert^{2}\rangle}$, where the maximum runs over atoms i and the average ⟨·⟩ is taken over an ensemble of trained models. In each iteration, 3 to 200 candidate configurations are selected for self-consistent DFT calculations, and the resulting data are added to the training set for the next iteration.
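The model deviation used for labeling can be computed directly from the forces predicted by an ensemble of models. The sketch below assumes the average in the formula runs over the models and the maximum over atoms; the two thresholds that separate accurate, candidate, and failure configurations are not given in the text, so `lo` and `hi` are hypothetical arguments.

```python
import math

def max_force_deviation(forces):
    """forces[m][i] = (Fx, Fy, Fz) predicted by model m for atom i."""
    n_models, n_atoms = len(forces), len(forces[0])
    worst = 0.0
    for i in range(n_atoms):
        mean = [sum(forces[m][i][c] for m in range(n_models)) / n_models
                for c in range(3)]
        variance = sum(sum((forces[m][i][c] - mean[c]) ** 2 for c in range(3))
                       for m in range(n_models)) / n_models
        worst = max(worst, math.sqrt(variance))
    return worst

def label_configuration(sigma, lo, hi):
    """Classify a configuration by its model deviation (thresholds are hypothetical)."""
    if sigma < lo:
        return "accurate"
    if sigma < hi:
        return "candidate"
    return "failure"
```

Only the candidate configurations, where the models disagree moderately, are worth the cost of a new DFT calculation; accurate ones add little information and failures are likely unphysical.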

Although the neural network model performs well for the IVDW = 10 correction, it does not yield successful results for the IVDW = 4 correction, which we largely attribute to the complex dependence on the charge density. To address this issue, we augment our training datasets with data from twisted structures at 2.88°, comprising 118 sets of forces, energies, and virial information, as illustrated in Fig. 5. Following the principles of transfer learning, we freeze the parameters of the embedding layers and train only the hidden and output layers. This approach significantly improves the performance of the pre-trained model, enabling it to adapt more effectively to the complexities of IVDW = 4.
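In practice, freezing the embedding layers amounts to switching off training for the descriptor while leaving the fitting network trainable. The sketch below builds such a configuration as a Python dictionary, using the network sizes and cutoffs quoted in the Method section. The key names (`se_e2_a`, `rcut`, `rcut_smth`, `sel`, `neuron`, `trainable`) follow our understanding of the DeePMD-kit JSON input and are an assumption to be checked against the installed version; this is not the authors' input file.

```python
import copy
import json

# Settings taken from the text; the key names are assumed DeePMD-kit conventions
base_config = {
    "model": {
        "type_map": ["Mo", "Te"],
        "descriptor": {
            "type": "se_e2_a",           # two-body embedding DeepPot-SE descriptor
            "rcut": 8.0,                 # cutoff radius (Å)
            "rcut_smth": 2.0,            # smoothing radius (Å)
            "sel": [100, 100],           # max Mo and Te neighbors
            "neuron": [25, 50, 100],     # embedding layers
        },
        "fitting_net": {"neuron": [240, 240, 240]},   # hidden layers
    }
}

def freeze_embedding(config):
    """Return a copy in which only the fitting network remains trainable."""
    cfg = copy.deepcopy(config)
    cfg["model"]["descriptor"]["trainable"] = False
    cfg["model"]["fitting_net"]["trainable"] = True
    return cfg

transfer_config = freeze_embedding(base_config)
transfer_json = json.dumps(transfer_config, indent=2)
```

Keeping the embedding fixed preserves the description of local atomic environments learned from the larger initial dataset, so the 118 additional twisted-structure frames only need to adjust the mapping from descriptors to energies.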

**Fig. 5: Scheme of transfer learning.**

Conclusion

In this paper, we study the lattice relaxation and single-particle electronic structure of the tMoTe₂ system. We present a comprehensive exploration of the moiré band structure under two types of vdW corrections, combining large-scale DFT calculations with transfer learning and GPU acceleration. Building on the angle-dependent moiré band structures, we construct a more complete continuum model that includes higher-harmonic potentials and a strain-induced gauge field. Our calculations reveal that, at experimentally relevant twist angles, the intralayer displacement induces a sizeable gauge field, and the top two moiré bands consistently display nontrivial Chern numbers.

The continuum model parameters have a strong impact on interaction-induced phases in tMoTe₂, as shown by previous numerical studies^{21,22,23,24,25,42,43,44,45,46}. With the continuum model fitted from the D2 type of vdW correction²¹, the integer quantum anomalous Hall effect appears only at large dielectric constant ϵ > 15^24,25, and $\nu =-\frac{2}{3}$ and $\nu =-\frac{1}{3}$ are both found to be FCIs^21,43. With the continuum model fitted from the dDsC type of vdW correction^22,23, the integer quantum anomalous Hall effect has been shown to occur at experimentally studied twist angles^24,25, while $\nu =-\frac{2}{3}$ and $\nu =-\frac{1}{3}$ are FCI and charge-density wave states, respectively^22,25.

Note: Upon the completion of this work, a related work appeared⁴⁷, which overlaps with some of our calculations with IVDW = 10.

Method

Plane-wave basis first-principles calculations

The large-scale plane-wave basis first-principles calculations are carried out with the Perdew–Burke–Ernzerhof (PBE) functional using the Vienna Ab initio Simulation Package (VASP)^48,49,50. We choose the projector augmented wave potentials, incorporating six electrons for each of the Mo and Te atoms. During the structural relaxation, we set the plane-wave cutoff energy and the energy convergence criterion to 250 eV and 1 × 10⁻⁶ eV, respectively. Larger energy cutoffs of 350 and 500 eV have been tested for θ = 3.89°, and lead to less than 1 meV change in the bandwidth of the topmost valence band. The structure is considered fully relaxed when the maximum force on each atom is less than 10 meV/Å.

Local-basis first-principles calculations

Apart from the calculations utilizing the plane-wave basis, our DFT study of tMoTe₂ has also been performed with a pseudo atomic orbital (PAO) basis. Using the relaxed atomic structures from VASP and DeePMD, we use the OpenMX package^51,52 with PAOs chosen to be Mo7.0-s3p2d1 and Te7.0-s3p2d2 to conduct the self-consistent calculation and obtain the band structure. Here, 7.0 means that the cutoff radius is 7.0 Bohr, and s3p2d1 means three sets of s-orbitals, two sets of p-orbitals and one set of d-orbitals, which sum to 3 × 1 + 2 × 3 + 1 × 5 = 14 atomic orbitals for each Mo atom. The PBE exchange-correlation functional and the norm-conserving pseudopotential⁵³ are employed in the calculation with single Γ k-sampling and convergence criterion no lower than 6 × 10⁻⁵ Hartree.
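The orbital counting above generalizes to any basis label of this form. The short sketch below parses the contraction string and applies the 2l + 1 multiplicity of each angular-momentum shell (spin is not counted); the function name is hypothetical.

```python
import re

ORBITALS_PER_SHELL = {"s": 1, "p": 3, "d": 5, "f": 7}   # 2l + 1

def pao_count(spec):
    """Number of atomic orbitals for a label such as 's3p2d1'."""
    return sum(int(n) * ORBITALS_PER_SHELL[shell]
               for shell, n in re.findall(r"([spdf])(\d+)", spec))

mo_orbitals = pao_count("s3p2d1")   # basis chosen for Mo
te_orbitals = pao_count("s3p2d2")   # basis chosen for Te
```

This gives 14 orbitals per Mo atom and 19 per Te atom. As a size estimate, a 2382-atom cell (794 Mo and 1588 Te) would then carry 794 × 14 + 1588 × 19 = 41,288 basis functions per spin channel.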

Machine learning workflow

We use the DeePMD-kit code to train the neural network³⁶. We adopt the two-body embedding smooth edition of the DeepPot-SE descriptor, which is constructed from both angular and radial atomic configurations. The cutoff and smoothing radii for neighbor searching are set to 8.0 and 2.0 Å, with at most 100 Mo and 100 Te neighbors included. We then construct a neural network that maps the descriptors to atomic energies through three embedding layers and three hidden layers of size (25, 50, 100) and (240, 240, 240), respectively. To measure the quality of the neural network, we construct a loss function as a sum of different root-mean-square errors (RMSE):

$$L\left(p_{\epsilon},p_{f},p_{\xi}\right)=\frac{p_{\epsilon}}{N}\Delta E^{2}+\frac{p_{f}}{3N}\sum_{i}\left\vert\Delta\boldsymbol{F}_{i}\right\vert^{2}+\frac{p_{\xi}}{9N}\parallel\Delta\Xi\parallel^{2},$$

(7)

where ΔE, ΔF_i, and ΔΞ refer to the RMSE of energy, force, and virial, respectively. During the training process, the prefactor p_f decreases from 1000 to 1, while p_ϵ and p_ξ increase from 0.02 to 1. To improve the efficiency of network training, we adopt an exponentially decaying learning rate to minimize the loss function. After 1,700,000 training steps, the learning rate has decreased from 1 × 10⁻³ to 3.6 × 10⁻⁸.
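The loss of Eq. (7) and the schedules described above can be sketched as follows. The start and end values are those quoted in the text. Two points are assumptions: the staircase interval `decay_steps` is a hypothetical value, and the prefactors are interpolated linearly in the learning rate, which is how we understand DeePMD-kit to schedule them.

```python
START_LR, STOP_LR, N_STEPS = 1e-3, 3.6e-8, 1_700_000

def learning_rate(step, decay_steps=5000):
    """Exponentially decaying learning rate (decay_steps is hypothetical)."""
    n_decays = N_STEPS / decay_steps
    decay_rate = (STOP_LR / START_LR) ** (1.0 / n_decays)
    return START_LR * decay_rate ** (step // decay_steps)

def prefactor(step, start, limit):
    """Prefactor moving from `start` to `limit` as the learning rate decays."""
    fraction = learning_rate(step) / START_LR
    return limit + (start - limit) * fraction

def loss(p_e, p_f, p_v, d_energy, d_forces, d_virial, n_atoms):
    """Eq. (7): weighted sum of squared energy, force and virial errors."""
    force_term = sum(fx * fx + fy * fy + fz * fz for fx, fy, fz in d_forces)
    virial_term = sum(x * x for row in d_virial for x in row)
    return (p_e / n_atoms * d_energy ** 2
            + p_f / (3 * n_atoms) * force_term
            + p_v / (9 * n_atoms) * virial_term)
```

Under this schedule, `prefactor(step, 1000, 1)` gives the force weight and `prefactor(step, 0.02, 1)` the energy and virial weights, so early training is dominated by forces and the three terms are weighted almost equally at the end.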

Data availability

The data that support the findings of this study are available from the corresponding authors upon reasonable request.

Code availability

The code that supports the findings of this study is available from the corresponding authors upon reasonable request.

References

Cai, J. et al. Signatures of fractional quantum anomalous hall states in twisted MoTe₂. Nature 622, 63–68 (2023).
Zeng, Y. et al. Thermodynamic evidence of fractional chern insulator in moiré MoTe₂. Nature 622, 69–73 (2023).
Park, H. et al. Observation of fractionally quantized anomalous hall effect. Nature 622, 74–79 (2023).
Xu, F. et al. Observation of integer and fractional quantum anomalous hall effects in twisted bilayer MoTe₂. Phys. Rev. X 13, 031037 (2023).
Devakul, T., Crépel, V., Zhang, Y. & Fu, L. Magic in twisted transition metal dichalcogenide bilayers. Nat. Commun. 12, 1–9 (2021).
Li, H., Kumar, U., Sun, K. & Lin, S.-Z. Spontaneous fractional Chern insulators in transition metal dichalcogenide moiré superlattices. Phys. Rev. Res. 3, L032070 (2021).
Crépel, V. & Fu, L. Anomalous hall metal and fractional chern insulator in twisted transition metal dichalcogenides. Phys. Rev. B 107, L201109 (2023).
Wu, F., Lovorn, T., Tutuc, E., Martin, I. & MacDonald, A. Topological insulators in twisted transition metal dichalcogenide homobilayers. Phys. Rev. Lett. 122, 086402 (2019).
Tang, E., Mei, J.-W. & Wen, X.-G. High-temperature fractional quantum Hall states. Phys. Rev. Lett. 106, 236802 (2011).
Sheng, D., Gu, Z.-C., Sun, K. & Sheng, L. Fractional quantum Hall effect in the absence of Landau levels. Nat. Commun. 2, 389 (2011).
Regnault, N. & Bernevig, B. A. Fractional chern insulator. Phys. Rev. X 1, 021014 (2011).
Sun, K., Gu, Z., Katsura, H. & Sarma, S. D. Nearly flatbands with nontrivial topology. Phys. Rev. Lett. 106, 236803 (2011).
Neupert, T., Santos, L., Chamon, C. & Mudry, C. Fractional quantum hall states at zero magnetic field. Phys. Rev. Lett. 106, 236804 (2011).
Xiao, D., Zhu, W., Ran, Y., Nagaosa, N. & Okamoto, S. Interface engineering of quantum Hall effects in digital transition metal oxide heterostructures. Nat. Commun. 2, 596 (2011).
Venderbos, J. W., Kourtis, S., van den Brink, J. & Daghofer, M. Fractional quantum-hall liquid spontaneously generated by strongly correlated t_2g electrons. Phys. Rev. Lett. 108, 126405 (2012).
Bergholtz, E. J. & Liu, Z. Topological flat band models and fractional Chern insulators. Int. J. Mod. Phys. B 27, 1330017 (2013).
Neupert, T., Chamon, C., Iadecola, T., Santos, L. H. & Mudry, C. Fractional (Chern and topological) insulators. Phys. Scr. 2015, 014005 (2015).
Liu, Z. & Bergholtz, E. J. In Reference Module in Materials Science and Materials Engineering (Elsevier, 2023).
Parameswaran, S. A., Roy, R. & Sondhi, S. L. Fractional quantum Hall physics in topological flat bands. C. R. Phys. 14, 816–839 (2013).
Nayak, C., Simon, S. H., Stern, A., Freedman, M. & Das Sarma, S. Non-abelian anyons and topological quantum computation. Rev. Mod. Phys. 80, 1083–1159 (2008).
Wang, C. et al. Fractional chern insulator in twisted bilayer MoTe₂. Phys. Rev. Lett. 132, 036501 (2024).
Reddy, A. P., Alsallom, F., Zhang, Y., Devakul, T. & Fu, L. Fractional quantum anomalous hall states in twisted bilayer MoTe₂ and WSe₂. Phys. Rev. B 108, 085117 (2023).
Xu, C., Li, J., Xu, Y., Bi, Z. & Zhang, Y. Maximally localized wannier functions, interaction models, and fractional quantum anomalous hall effect in twisted bilayer MoTe₂. Proc. Natl Acad. Sci. USA 121, e2316749121 (2024).
Yu, J. et al. Fractional chern insulators versus nonmagnetic states in twisted bilayer MoTe₂. Phys. Rev. B 109, 045147 (2024).
Abouelkomsan, A., Reddy, A. P., Fu, L. & Bergholtz, E. J. Band mixing in the quantum anomalous hall regime of twisted semiconductor bilayers. Phys. Rev. B 109, L121107 (2024).
Naik, M. H. & Jain, M. Ultraflatbands and shear solitons in moiré patterns of twisted bilayer transition metal dichalcogenides. Phys. Rev. Lett. 121, 266401 (2018).
Yu, H., Chen, M. & Yao, W. Giant magnetic field from moiré induced berry phase in homobilayer semiconductors. Natl Sci. Rev. 7, 12–20 (2020).
Xian, L. et al. Realization of nearly dispersionless bands with strong orbital anisotropy from destructive interference in twisted bilayer MoS₂. Nat. Commun. 12, 5644 (2021).
Zhang, Y., Liu, T. & Fu, L. Electronic structures, charge transfer, and charge order in twisted transition metal dichalcogenide bilayers. Phys. Rev. B 103, 155142 (2021).
Angeli, M. & MacDonald, A. H. Γ valley transition metal dichalcogenide moiré bands. Proc. Natl Acad. Sci. USA 118, e2021826118 (2021).
Grimme, S. Semiempirical gga-type density functional constructed with a long-range dispersion correction. J. Comput. Chem. 27, 1787–1799 (2006).
Steinmann, S. N. & Corminboeuf, C. A generalized-gradient approximation exchange hole model for dispersion coefficients. J. Chem. Phys. 134, 044117 (2011).
Steinmann, S. N. & Corminboeuf, C. Comprehensive benchmarking of a density-dependent dispersion correction. J. Chem. Theory Comput. 7, 3567–3577 (2011).
Wilson, J. A. & Yoffe, A. The transition metal dichalcogenides discussion and interpretation of the observed optical, electrical and structural properties. Adv. Phys. 18, 193–335 (1969).
Reshak, A. H. & Auluck, S. Band structure and optical response of 2H − MoX₂ compounds (X = S, Se, and Te). Phys. Rev. B 71, 155114 (2005).
Zhang, L., Han, J., Wang, H., Car, R. & Weinan, E. Deep potential molecular dynamics: a scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett. 120, 143001 (2018).
Article ADS Google Scholar
García, A. et al. Siesta: recent developments and applications. J. Chem. Phys. 152, 204108 (2020).
Xie, Y.-M., Zhang, C.-P., Hu, J.-X., Mak, K. F. & Law, K. T. Valley-polarized quantum anomalous Hall state in moiré MoTe₂/WSe₂ heterobilayers. Phys. Rev. Lett. 128, 026402 (2022).
Roy, R. Band geometry of fractional topological insulators. Phys. Rev. B 90, 165139 (2014).
Parameswaran, S., Roy, R. & Sondhi, S. L. Fractional Chern insulators and the W∞ algebra. Phys. Rev. B 85, 241308 (2012).
Claassen, M., Lee, C. H., Thomale, R., Qi, X.-L. & Devereaux, T. P. Position-momentum duality and fractional quantum Hall effect in Chern insulators. Phys. Rev. Lett. 114, 236802 (2015).
Goldman, H., Reddy, A. P., Paul, N. & Fu, L. Zero-field composite Fermi liquid in twisted semiconductor bilayers. Phys. Rev. Lett. 131, 136501 (2023).
Dong, J., Wang, J., Ledwith, P. J., Vishwanath, A. & Parker, D. E. Composite Fermi liquid at zero magnetic field in twisted MoTe₂. Phys. Rev. Lett. 131, 136502 (2023).
Reddy, A. P. & Fu, L. Toward a global phase diagram of the fractional quantum anomalous Hall effect. Phys. Rev. B 108, 245159 (2023).
Qiu, W.-X., Li, B., Luo, X.-J. & Wu, F. Interaction-driven topological phase diagram of twisted bilayer MoTe₂. Phys. Rev. X 13, 041026 (2023).
Wang, T., Devakul, T., Zaletel, M. P. & Fu, L. Topological magnets and magnons in twisted bilayer MoTe₂ and WSe₂. Preprint at arXiv:2306.02501 (2023).
Jia, Y. et al. Moiré fractional Chern insulators. I. First-principles calculations and continuum models of twisted bilayer MoTe₂. Phys. Rev. B 109, 205121 (2024).
Kresse, G. & Hafner, J. Ab initio molecular dynamics for liquid metals. Phys. Rev. B 47, 558–561 (1993).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953–17979 (1994).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
Ozaki, T. & Kino, H. Numerical atomic basis orbitals from H to Kr. Phys. Rev. B 69, 195113 (2004).
Ozaki, T. Variationally optimized atomic orbitals for large-scale electronic structures. Phys. Rev. B 67, 155108 (2003).
Morrison, I., Bylander, D. M. & Kleinman, L. Nonlocal Hermitian norm-conserving Vanderbilt pseudopotential. Phys. Rev. B 47, 6728–6731 (1993).
Onishi, Y. & Fu, L. Fundamental bound on topological gap. Phys. Rev. X 14, 011052 (2024).

Acknowledgements

We are grateful to Tingxin Li, Taige Wang, Trithep Devakul, Fengcheng Wu, and Allan MacDonald for their helpful discussions. Y.Z. thanks Quansheng Wu and Jianpeng Liu for the cross-check on DFT parameters. L.F. and C.F. are partly supported by the Catalyst Fund of the Canadian Institute for Advanced Research. Y.Z. is supported by the start-up fund and the seed grant from the AI Tennessee Initiative at the University of Tennessee Knoxville. The research by J.L. was primarily supported by the National Science Foundation Materials Research Science and Engineering Center program through the UT Knoxville Center for Advanced Materials and Manufacturing (DMR-2309083). The machine learning simulations and large matrix diagonalizations were performed on H100 nodes provided by the AI Tennessee Initiative.

Author information

These authors contributed equally: Ning Mao, Cheng Xu.

Authors and Affiliations

Max Planck Institute for Chemical Physics of Solids, 01187, Dresden, Germany
Ning Mao & Claudia Felser
Department of Physics and Astronomy, University of Tennessee, Knoxville, TN, 37996, USA
Cheng Xu, Jiangxu Li & Yang Zhang
Department of Physics, Tsinghua University, Beijing, 100084, China
Cheng Xu, Ting Bao & Yong Xu
Institute of Metal Research, Chinese Academy of Sciences, 110016, Shenyang, China
Peitao Liu
Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Liang Fu
Min H. Kao Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN, 37996, USA
Yang Zhang

Contributions

Y.Z. initiated this project. N.M. performed the first-principles calculations, transfer learning structure relaxation, and continuum model calculations, with the help of C.X., J.X.L., and T.B. P.T.L., Y.X., C.F., and L.F. contributed to data analysis. Y.Z., N.M., C.X., and J.X.L. wrote the manuscript with input from all the authors.

Corresponding author

Correspondence to Yang Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

This manuscript has been previously reviewed at another Nature Portfolio journal. The manuscript was considered suitable for publication without further review at Communications Physics.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Materials

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Mao, N., Xu, C., Li, J. et al. Transfer learning relaxation, electronic structure and continuum model for twisted bilayer MoTe₂. Commun Phys 7, 262 (2024). https://doi.org/10.1038/s42005-024-01754-y

Received: 24 June 2024
Accepted: 18 July 2024
Published: 03 August 2024
Version of record: 03 August 2024
DOI: https://doi.org/10.1038/s42005-024-01754-y
