Abstract
We propose a non-collinear spin-constrained method that generates training data for deep-learning-based magnetic models, which provides a powerful tool for studying complex magnetic phenomena that require large-scale simulations at the atomic level. First, we propose a basis-independent projection method for calculating atomic magnetic moments by applying a radial truncation to numerical atomic orbitals. A double-loop Lagrange multiplier method is utilized to satisfy the constraint conditions while yielding accurate magnetic torques. The method is implemented in ABACUS with both plane-wave and numerical atomic orbital basis sets. We benchmark iron (Fe) systems and analyze the differences between plane-wave and numerical atomic orbital calculations in describing magnetic energy barriers. Based on an automated workflow composed of first-principles calculations, a magnetic model, active learning, and dynamics simulation, more than 30,000 first-principles data points with magnetic torque information are generated to train a deep-learning-based magnetic model, DeePSPIN, for the Fe system. By utilizing the model in large-scale molecular dynamics simulations, we successfully predict Curie temperatures of α-Fe close to the experimental value.
Introduction
The study of magnetic materials has been a cornerstone of condensed matter physics, with implications ranging from fundamental science to technological applications. Density Functional Theory (DFT)1,2 has emerged as a powerful tool for understanding the electronic and magnetic properties of these materials3. However, accurate description of excited magnetic states within DFT has remained a formidable challenge due to the complex interplay between electron and lattice behaviors.
The development of constrained Density Functional Theory (cDFT) has marked a significant advancement in this field4,5. In the pioneering work of Dederichs et al. in 19844, DFT was extended to arbitrary constraints through the introduction of the Lagrange multiplier method. Since then, several cDFT methods have been used to study excited states of charge distribution4,6 or magnetization7,8,9,10,11,12,13. Penalty functions are commonly introduced to implement arbitrary constraints10,11. Wu and Van Voorhis (2005)6 introduced an efficient algorithm that uses a double-loop method to find the effective Lagrange multiplier satisfying the constraints. In addition, cDFT methods have been implemented with different basis sets, such as the real-space grid10, the full-potential linearized augmented plane-wave (FLAPW) method8, the projector augmented-wave (PAW) method11,12,13, and numerical atomic orbitals (NAOs)9. These efforts provide powerful tools for understanding charge and magnetization fluctuations in solids, predicting spin-dependent phenomena, and characterizing electron transfer reactions in molecules5,6,7,8,9,11,12,13.
In recent years, the integration of deep learning models with DFT has offered unprecedented opportunities for transferring first-principles accuracy to larger scales14,15,16,17,18,19,20,21. The available databases22,23 (https://www.atomly.net/)24 contain a large amount of DFT data that can be used to train machine learning models for different purposes25,26,27. Recently, some magnetic models28,29 have emerged that incorporate the spin degrees of freedom into existing neural networks and can perform large-scale magnetic dynamics simulations. However, these models suffer from a lack of data, since most existing DFT data in these datasets are non-magnetic or only include ground-state magnetic results. Ground-state data cannot meet the sampling requirements of magnetic models, which carry spin as an additional degree of freedom. From this perspective, spin-cDFT methods not only act as powerful first-principles tools for studying magnetic excited states at the atomic scale but also provide training data for machine-learning-based magnetic models.
To address the substantial data requirements for AI-driven magnetic modeling, a cDFT methodology must simultaneously satisfy multiple critical requirements: (1) robust convergence and high accuracy for reliable large-scale sampling, (2) consistent precision across datasets to eliminate untraceable errors, (3) minimal dependence on manually tuned parameters for automated active learning workflows, (4) adaptable constraint conditions accommodating diverse magnetic systems, and (5) optimized computational efficiency to address the inherent cost premium of magnetic versus non-magnetic calculations. To establish a robust data generation framework meeting these criteria, we have developed and implemented a Lagrange multiplier-based cDFT approach within the open-source ABACUS software30,31,32, which provides energies, atomic forces, stresses, and magnetic torques for any magnetic excited state, making it a suitable data engine for magnetic models. Additionally, we propose a basis-independent local orbital projection method for calculating magnetic moment magnitudes. The Mulliken population method9 maintains the sum rule but may introduce contributions from non-local orbitals. In contrast, our method benefits from the locality of NAOs and obtains magnetic moments through projection onto atomic orbitals. By appropriately truncating the local atomic orbitals and smoothing them near the truncation, we find that the modulated atomic orbitals provide satisfactory stability and convergence during cDFT calculations. We present a unified implementation of cDFT that works efficiently with both plane-wave (PW) and numerical atomic orbital (NAO) basis sets. The spin-cDFT is fully optimized using the double-loop approach6. The optimized torque holds significant value for magnetic dynamics simulations and configuration space exploration. While conventional penalty methods11 yield approximate torques λ, their fixed λ induces non-negligible errors in energy derivatives. The adaptive scheme circumvents this by enforcing ΔM → 0. Moreover, the penalty methods encounter convergence difficulties when λ is too large33, which requires careful handling of the iterative process. More critically, while the penalty function method can achieve reliable calculations for case studies through careful adjustment of λ, this system-dependent tuning process becomes impractical for AI-driven large-scale sampling. A predetermined λ exhibits varying accuracy levels across different configurations, introducing untraceable errors that compromise subsequent model training. The limited accuracy, unstable convergence, and untraceable errors in large datasets render the penalty function method inadequate for meeting requirements (1), (2), and (3). In contrast, the Lagrange multiplier method effectively addresses these challenges through its dual-optimization scheme6,8,12,13.
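The contrast between a fixed-weight penalty and an adaptively optimized Lagrange multiplier can be illustrated with a minimal one-dimensional sketch. Everything in it is an assumption made for illustration: the toy energy `E(m)`, the constants `K`, `C`, `M0`, the bisection routine standing in for the self-consistent field cycle, and the secant update of the multiplier. It is not the ABACUS algorithm, and the nesting order of the two loops in a production code may differ.

```python
# Toy model only: a convex 1D "energy" of a single moment m (all constants are made up).
K, C, M0 = 2.0, 0.5, 2.2  # stiffness terms and unconstrained ground-state moment


def dE(m):
    """Derivative of the toy energy E(m) = K/2 (m-M0)^2 + C/4 (m-M0)^4."""
    x = m - M0
    return K * x + C * x**3


def minimize(grad, lo=-10.0, hi=10.0, tol=1e-12):
    """Stand-in for an SCF cycle: bisection on a monotonically increasing gradient."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if grad(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def penalty_solution(m_target, weight):
    """Minimize E(m) + weight * (m - m_target)^2 with a fixed, hand-chosen weight."""
    return minimize(lambda m: dE(m) + 2.0 * weight * (m - m_target))


def lagrange_solution(m_target, tol=1e-10, max_iter=100):
    """Nested loops: adjust lam until the minimizer of E(m) - lam*(m - m_target) hits the target."""
    lam_prev, lam = 0.0, 0.1
    r_prev = minimize(lambda m_: dE(m_) - lam_prev) - m_target
    m = m_target
    for _ in range(max_iter):
        m = minimize(lambda m_: dE(m_) - lam)  # energy minimization at fixed lam
        r = m - m_target                       # constraint residual, plays the role of Delta M
        if abs(r) < tol:
            break
        # Secant update of the multiplier (right-hand side uses the old values).
        lam_prev, lam, r_prev = lam, lam - r * (lam - lam_prev) / (r - r_prev), r
    return m, lam


if __name__ == "__main__":
    print("penalty :", penalty_solution(2.6, weight=5.0))
    print("lagrange:", lagrange_solution(2.6))
```

In this toy setting the penalty solution misses the target moment by an amount that depends on the chosen weight, whereas the multiplier iteration drives the residual to zero and the converged `lam` coincides with the energy derivative at the target moment.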
Our implementation supports two optional constraint modes: the first constrains both the magnitude and direction of magnetic moments, while the second constrains only the moment direction. Despite Gyorffy et al.’s assertion that longitudinal and transverse spin fluctuations exhibit temporal separation34, recent studies indicate that longitudinal spin fluctuations play a significant role in processes such as ferromagnetic-paramagnetic transitions35 and phonon-magnon interactions36. In this work, we primarily focus on the application of the full-constraint algorithm, enabling the construction of magnetic potential energy surfaces through integration with recently developed magnetic models29. In coupled spin-lattice dynamics37, the contribution of longitudinal spin fluctuations manifests in the thermodynamic statistical properties of equilibrium states across different temperatures within the energy domain. The real-time evolution of longitudinal fluctuations in the temporal domain falls outside the scope of the present study.
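As a rough illustration of the two constraint modes, the sketch below writes the quantity that must vanish in each case. The function name `constraint_residual` and the use of the perpendicular component for the direction-only mode are assumptions made here; the exact constraint expressions used in the implementation are given in the Methods, not in this sketch.

```python
import numpy as np


def constraint_residual(m, m_target, mode="full"):
    """Return the vector that should vanish when the constraint is satisfied.

    mode="full":      magnitude and direction are both fixed, residual = m - m_target.
    mode="direction": only the direction is fixed, residual = part of m perpendicular
                      to the target direction (assumed form; it also vanishes for an
                      antiparallel moment, which a real code must rule out separately).
    """
    m = np.asarray(m, dtype=float)
    m_target = np.asarray(m_target, dtype=float)
    if mode == "full":
        return m - m_target
    if mode == "direction":
        e = m_target / np.linalg.norm(m_target)
        return m - np.dot(m, e) * e
    raise ValueError(f"unknown mode: {mode}")
```

For example, a moment of 1.5 μB along z against a target of 2.3 μB along z violates the full constraint by 0.8 μB but satisfies the direction-only constraint.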
Through the synergistic integration of cDFT calculations, a magnetic model, active learning algorithms, and coupled spin-lattice dynamics37, we have developed a comprehensive end-to-end framework for autonomous magnetic model development. The workflow achieves complete automation throughout the entire training process, eliminating the need for manual intervention. Starting from scratch, we successfully trained two distinct models using the PW and NAO basis sets, respectively. Both models yielded quantitatively consistent ferromagnetic-paramagnetic phase transitions. This robust agreement across independent models provides validation of our methodology’s accuracy and transferability. This research paradigm offers valuable insights for AI-driven magnetic materials simulation. Furthermore, our work provides the research community with a reliable and accessible tool for related investigations.
This paper is organized as follows. Section “Method” details the theoretical foundations, including non-collinear spin, projection methods, and spin-constrained DFT. Section “Projection Methods” discusses the modulated NAOs, as well as the behavior of atomic magnetic moments based on orbital projections. Section “Finite Difference Tests for Spin-cDFT Method” verifies the correctness of the implementation through finite difference tests. Section “Magnetic Constraints and Energy Surface of Iron Phases” introduces the properties of various magnetic excited states of pure Fe calculated using spin-cDFT. Section “DeePSPIN Model” employs iron (Fe) as a prototype system to demonstrate our automated workflow that bridges first-principles cDFT calculations with an AI magnetic model, and provides an accurate Curie temperature through molecular dynamics simulations. Finally, Section “Discussion” summarizes the work.
Results
Projection methods
We have developed an innovative algorithm designed to estimate and control atomic magnetic moments using localized orbital projection techniques (see methods in Section “Projection Methods”). The key innovation of our method lies in the localized modulation of the numerical atomic orbital basis. We have benchmarked this algorithm on ferromagnetic (FM) and antiferromagnetic (AFM) iron (Fe) bulk.
Our first step involved comparing the modulated radial functions (RFs) of orbitals obtained through our algorithm with the original NAO RFs. It is important to distinguish the two quantities discussed here: the cutoff radius rc and the modulation radius rm of the orbitals. The original NAO is zero beyond the cutoff radius, while the modulation radius refers to the range of the original NAO modulated by Eq. (22) for magnetic moment projection. In Fig. 1, we modify the shape of an NAO using Eq. (22) with rm ranging from 1 to 5 Bohr, where the rc of the original NAO is 6.0 Bohr. As shown in Fig. 1(a), the modulated p orbitals align closely with the original orbital shape if rm ≥ 2 Bohr, while the modulated d orbitals in Fig. 1(b) remain consistent with the original shape for rm ≥ 3 Bohr. These modulated orbitals are smooth at rm, with the first derivative approaching zero. For smaller rm, the modulated orbitals retain the peak radius of the RFs, but their peak heights are significantly increased due to the normalization constraint. An excessively large rm may introduce non-localized electron wavefunctions, or portions of wavefunctions from localized orbitals of other atoms, into the projection results, thereby preventing the projection from accurately representing atomic localized orbitals. On the other hand, an excessively small rm can lead to significant distortion, as it causes a large deviation from the original atomic orbitals.
a The original and modulated RFs for p orbitals, where the original numerical atomic orbitals have a cutoff radius rc of 6.0 Bohr, together with the modulated orbitals with modulation radii rm of 1.0, 2.0, and 4.0 Bohr. “p-1.0” represents the p orbital modulated by Eq. (22) with rm = 1.0 Bohr. b The original and modulated RFs for d orbitals.
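The qualitative effect of truncation and renormalization described above can be reproduced with a small numerical sketch. Note that it does not use Eq. (22): the smooth cutoff factor `(1 - (r/rm)^2)^2`, the analytic stand-in for a d-like radial function, and all function names are assumptions chosen only so that the value and the first derivative vanish at rm.

```python
import numpy as np


def integrate(y, x):
    """Trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def modulated_rf(r, rf, rm):
    """Truncate a radial function at rm with a smooth factor and renormalize it.

    The factor g(r) = (1 - (r/rm)^2)^2 is a hypothetical stand-in for Eq. (22);
    both g and dg/dr are zero at r = rm. Normalization: integral of R(r)^2 r^2 dr = 1.
    """
    g = np.where(r < rm, (1.0 - (r / rm) ** 2) ** 2, 0.0)
    out = rf * g
    return out / np.sqrt(integrate(out**2 * r**2, r))


# Hypothetical d-like radial shape on a grid up to rc = 6.0 Bohr (not a real NAO).
r = np.linspace(0.0, 6.0, 6001)
rf_d = r**2 * np.exp(-1.5 * r)

for rm in (1.5, 3.0, 4.0):
    peak = modulated_rf(r, rf_d, rm).max()
    print(f"rm = {rm:.1f} Bohr, peak height = {peak:.3f}")
```

With this stand-in, shrinking rm squeezes the same unit norm into a smaller radial range, so the peak height grows, in line with the trend described for the modulated orbitals.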
Next, we discuss the criteria for selecting an appropriate rm. To this end, we calculated the ground state of BCC-phase iron (BCC-Fe) for both ferromagnetic (FM) and antiferromagnetic (AFM) configurations using PW and NAO basis sets. The atomic magnetic moments of the Fe atoms were then estimated using the modulated orbital local projection algorithm with various rm. As a reference, we define “TMAG” by directly summing the magnetization density within the cell, \(\sum_{{\bf{r}}}m({\bf{r}})/N\), for the FM configuration, and “AMAG” as the sum of the modulus of the magnetization density, \(\sum_{{\bf{r}}}| m({\bf{r}})| /N\), for the AFM configuration, where N is the number of Fe atoms. For the NAO calculations, we use the high-precision TZDP-10 Bohr results as an AFM reference.
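The two reference quantities can be written down directly for a collinear magnetization density sampled on a uniform real-space grid. The sketch below is an assumption-laden illustration: the function name `tmag_amag`, the explicit volume element `dv` (which the sums in the text leave implicit), and the cosine test density are not taken from the source.

```python
import numpy as np


def tmag_amag(m_grid, cell_volume, n_atoms):
    """Total (TMAG) and absolute (AMAG) magnetization per atom from a grid density.

    m_grid      : collinear magnetization density m(r) on a uniform grid
    cell_volume : volume of the simulation cell (same length units as the density)
    n_atoms     : number of magnetic atoms N in the cell
    """
    dv = cell_volume / m_grid.size              # volume element per grid point
    tmag = m_grid.sum() * dv / n_atoms          # signed sum, useful for FM states
    amag = np.abs(m_grid).sum() * dv / n_atoms  # modulus sum, useful for AFM states
    return tmag, amag


# Made-up AFM-like density: positive on one half of the cell, negative on the other.
x = np.linspace(0.0, 1.0, 16, endpoint=False)
m_afm = np.cos(2.0 * np.pi * x)[:, None, None] * np.ones((16, 16, 16))
print(tmag_amag(m_afm, cell_volume=2.0, n_atoms=2))
```

For the antiferromagnetic toy density the signed sum cancels while the modulus sum does not, which is why AMAG rather than TMAG serves as the AFM reference.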
As shown in Fig. 2(a) and (c), the projected magnetic moments obtained from the different modulated orbitals agree well with each other if rm < 4 Bohr in all PW calculations. This indicates that the projection results depend solely on the selected rm, rather than on rc. The results from the NAO basis (Fig. 2(b) and (d)) show differences in the projection results between various basis sets, with these discrepancies arising solely from the NAO basis set itself. Furthermore, the comparison between the left and right panels of Fig. 2 shows that the higher-precision TZDP results align well with the PW ones. Fig. S8 further compares the projected magnetic moments versus modulation radius rm across different basis sets (PW, SZ, DZP, TZDP). For BCC-Fe ferromagnetic state calculations, the maximum errors are 0.13 μB (SZ), 0.03 μB (DZP), and 0.002 μB (TZDP).
The modulated orbital projection algorithm is employed with DZP and TZDP basis sets with cutoff radii rc ranging from 6.0 to 10.0 Bohr, to demonstrate the impact of different NAO basis sets on atomic magnetic moments. a PW basis set in the FM configuration, with the total magnetization per atom (TMAG) as the reference value. b NAO basis set in the FM configuration, with TMAG as the reference value. c PW basis set in the AFM configuration, with the absolute magnetization per atom (AMAG) as the reference value. d NAO basis set in the AFM configuration, with the atomic magnetization from Mulliken charge as the reference value.
The choice of an appropriate modulation radius rm significantly affects the estimated values of atomic magnetic moments. We discuss two possible strategies for determining rm. A direct approach is to calibrate the projected orbital radius by matching it to reference values, such as experimental data or literature values. For BCC-Fe, the modulation radius chosen using the first approach is 2.0 Bohr, with the FM and AFM atomic magnetic moments being 2.21 and 1.48 μB, respectively. These values are in good agreement with those calculated by VASP11, which are 2.22 and 1.52 μB. However, this can be challenging for complex compounds or systems without available experimental data, as it may be difficult to identify suitable reference values.
One application of our implementation is to generate a large-scale dataset for subsequent model training. The dataset required for the model includes the system energy E, atomic magnetic moments M, and the magnetic torque λ. It should be clarified that the Lagrange multipliers in this work differ fundamentally from those in other implementations11,12, as their physical interpretation is intrinsically determined by the specific constraint conditions (see Sec. 4.3 for the definition of λ in this work). The influence of the modulation radius on all these physical quantities should be taken into consideration. Here we present the calculated energy and magnetic torque for BCC Fe at different modulation radii in Fig. 3. It can be observed that within a certain range (1–5 Bohr), different rm have a small quantitative impact on the energy and do not lead to qualitative changes, such as alterations in the energy ordering of different states. Compared to the energy, the influence of rm on the magnetic torque is more pronounced. When the modulated orbital covers only an extremely narrow region near the nucleus (1–2 Bohr), truncation distorts the wavefunction characteristics of the original NAOs. To achieve the same Mtarget within a reduced integration volume, a stronger constraint field must be applied, causing ∣λ∣ to rise sharply. In contrast, when rm is sufficiently large to encompass the main valence characteristic peak of the NAOs, the projector exhibits high spatial overlap with the true orbitals. Here, λ only needs to compensate for minor deviations between the self-consistent electron distribution and the target magnetic moment, and its value stabilizes. Figure 3b shows that ∣λ∣ remains essentially constant for rm ≥ 3 Bohr, indicating that the physical quantity has become independent of the projection parameter.
We wish to emphasize that for an individual calculation, these physical quantities are self-consistent, as guaranteed by the finite difference tests presented below. However, considering high-throughput calculations and the further exploration of sample space at high temperatures, one can anticipate the appearance of a large number of structurally complex configurations. We are concerned that in highly non-uniform configurations, even minor disturbances may lead to significant changes in physical quantities, potentially causing difficulties in model training. Based on these considerations, we aimed to find an rm where the physical quantities exhibit relatively smooth behavior with respect to changes in real-space coordinates, that is, \(\frac{\partial E}{\partial {r}_{m}}\approx 0,\frac{\partial {\bf{M}}}{\partial {r}_{m}}\approx 0,\frac{\partial {\boldsymbol{\lambda }}}{\partial {r}_{m}}\approx 0\). Based on Figs. 2 and 3, we selected 3.0 Bohr as the setting for producing the dataset in this work. With this rm, the estimated atomic magnetic moments are 2.36 μB for FM and 1.6 μB for AFM in BCC Fe. By comparing with the direction of the total magnetic moment, this projection scheme is also demonstrated to accurately maintain alignment between the target magnetic moment and the self-consistent spin directions (see Table S5). We wish to clarify that the different magnetic torques generated by different rm do not introduce systematic bias into the magnetic dynamics simulations for which the model is ultimately intended. This point is demonstrated in Fig. S10 of the Supplementary Information.
a The total energy of BCC-Fe calculated at different modulation radii using the DZP basis (rc = 7.0 Bohr). The labels represent the angle between the magnetic moments of two nearest-neighbor Fe atoms in BCC-Fe, while the magnetic moment magnitudes are set to the corresponding FM ground-state values for each rm, as indicated by the black pentagrams in (b). b The corresponding magnetic torque ∣λ∣ of BCC-Fe calculated at different modulation radii.
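The plateau criterion stated above can be expressed as a simple numerical check on any one of the monitored curves. The sketch below is only a schematic: the helper name `smallest_plateau_radius`, the tolerance, and in particular the synthetic curve are invented for illustration and are not the data of Figs. 2 and 3.

```python
import numpy as np


def smallest_plateau_radius(rm, values, tol):
    """Return the smallest rm beyond which |d(values)/d(rm)| stays below tol.

    rm and values are 1D arrays of equal length; the derivative is estimated by
    finite differences. Returns None if no such radius exists on the grid.
    """
    slope = np.abs(np.gradient(values, rm))
    for i in range(len(rm)):
        if np.all(slope[i:] < tol):
            return float(rm[i])
    return None


# Synthetic, made-up curve that decays quickly and then flattens (not data from the paper).
rm_grid = np.arange(1.0, 5.01, 0.5)
toy_torque = 0.2 + np.exp(-3.0 * (rm_grid - 1.0))
print(smallest_plateau_radius(rm_grid, toy_torque, tol=0.02))
```

In practice the same check would be applied to the energy, the projected moment, and the torque, and the radius would be chosen where all three are flat.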
Finite difference tests for spin-cDFT method
Based on the theory defined in Section 4, we implemented the spin-cDFT method in the open-source software ABACUS. The method is available with either plane-wave or numerical atomic orbital basis sets. In the spin-cDFT framework, we introduced corrections for the energy, atomic forces, and lattice stresses under full magnetic constraints and incorporated calculations of the magnetic torque λ. These magnetic torques are optimized in the inner loop of the self-consistent field iterations. To validate our implementation, we compared analytical results for atomic forces, lattice stresses, and magnetic torques against numerical ones obtained through the finite difference method. As shown in Fig. 4, these tests were conducted on elemental body-centered cubic (BCC) iron (Fe), the FePt binary alloy, and the NiMnTi ternary alloy. For a 16-atom BCC-Fe supercell, we perturbed the position of one Fe atom; the maximum discrepancy between numerical and analytical forces across all perturbations does not exceed 6 meV/Å. Magnetic torques are the partial derivatives of the energy with respect to magnetic moments, rather than atomic positions. Selecting one Fe atom in the FePt alloy, we perturbed its magnetic moment. The maximum error in the finite-difference values does not exceed 0.006 eV/μB, where the analytical result is obtained from the inner optimization. For the stress tests, we applied varying magnitudes of lattice strain to NiMnTi and obtained numerical stresses through finite-difference calculations. All components of these numerical stress tensors showed discrepancies within 0.25 kbar when compared with analytical values, demonstrating good consistency. For these three tests, the cutoff energy was set to 100 Ry, and the Brillouin zone was uniformly sampled by a 5 × 5 × 5 Monkhorst-Pack grid. The raw data for all finite-difference tests are provided in Tables S1, S2, and S3 of the Supplementary Information.
The successful finite difference tests corroborate the proper functionality of the cDFT features for energy, atomic forces, magnetic torques, and lattice stress in ABACUS.
a Finite difference tests for the atomic force in BCC Fe16 with a perturbation step of 0.01 Bohr along the z-direction. b Finite difference tests for the magnetic torque in the binary alloy FePt with a perturbation step of 0.1 μB. c Finite difference tests for the cell stress in the ternary alloy NiMnTi with a perturbation step of 0.0001. “11” refers to the component σ11 of the stress matrix. In (a–c), “Analytic” represents the value calculated from the formula, while “Numerical” is the value determined from the finite difference method.
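The logic of such a test, comparing an analytic derivative with a central finite difference of the energy, can be shown on a toy energy function. The Landau-plus-exchange form of `energy`, its coefficients, and the function names below are assumptions for illustration; in the actual tests the energies and analytic derivatives come from spin-cDFT calculations.

```python
import numpy as np

# Made-up coefficients of a toy energy E(m, theta) for two moments of magnitude m at angle theta.
A, B, J = 0.01, 0.10, 0.02


def energy(m, theta):
    """Toy energy: quartic Landau term plus a cos(theta) exchange-like term."""
    return A * m**4 - B * m**2 - J * m**2 * np.cos(theta)


def torque_analytic(m, theta):
    """Analytic derivative dE/dm of the toy energy at fixed theta."""
    return 4.0 * A * m**3 - 2.0 * B * m - 2.0 * J * m * np.cos(theta)


def central_difference(f, x, h):
    """Second-order central finite difference of f at x with step h."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


m0, theta0, step = 2.0, 0.3, 0.01
numerical = central_difference(lambda m: energy(m, theta0), m0, step)
print(torque_analytic(m0, theta0), numerical)
```

The same pattern applies to forces (derivative with respect to an atomic coordinate) and stresses (derivative with respect to strain), with the step size chosen small enough for the truncation error to be negligible but large enough to stay above numerical noise.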
Magnetic constraints and energy surface of iron phases
The spin-cDFT method presented in this paper enables calculations on arbitrary magnetic configurations and supports the analysis of complex magnetic structures. We take bulk iron as an example to demonstrate the reliability of the magnetic constraint method in ABACUS.
The BCC and FCC phases are two prevalent iron crystal structures. The BCC structure is the stable phase of iron at room temperature and is known for its high strength and low ductility. BCC Fe exhibits a ferromagnetic (FM) state with an atomic magnetic moment of around 2.2 μB per atom38, and its Curie temperature was found experimentally to be 1043 K39. On the other hand, the FCC structure has a higher density of atoms than BCC. Experimentally, iron undergoes a structural phase transition from BCC to FCC around 1185 K40, and the FCC structure is the stable phase until 1667 K40. Although FCC Fe exhibits paramagnetic behavior in experiments, it is theoretically predicted that the energy of either the antiferromagnetic (AFM) or the double-layer antiferromagnetic (DAFM) state is lower than that of the ferromagnetic state at 0 K28. FCC Fe has a more unimodal density of states at the Fermi level, which results in lower magnetic moments compared to BCC Fe.
First, we consider ferromagnetic BCC-Fe. In the lower panel of Fig. 5(a), the blue solid line illustrates the dependence of the total energy on the BCC-Fe cell volume, calculated using the PW basis without spin constraint. The equilibrium volume of BCC-Fe, determined as the volume at which the total energy is minimized, is 11.2 Å³. The black line represents results computed with the v2.1 NAO basis, specifically the DZP basis with a cutoff radius of 8 au, referred to as “DZP-8au-v2.1”. The NAO basis is optimized against the results from the PW basis set, and the energy difference between them reflects the quality of the NAOs. We observe that the energy difference remains nearly constant under tensile conditions but decreases significantly under compression, indicating discrepancies in orbital precision at varying interatomic distances. The grey histograms in the figure depict the atomic magnetic moments at different volumes. We present only the NAO results since the magnetic moments calculated using the PW and NAO basis sets are similar. The atomic magnetic moments increase monotonically with volume, with the projected atomic magnetic moment reaching 2.36 μB at the energy minimum.
a, b plot the magnetic force λ = ∣λ∣ as a function of the cell volume per atom for BCC-Fe (a) and FCC-Fe (b). c, d The solid lines show the total energy per atom as a function of the cell volume per atom for BCC-Fe (c) and FCC-Fe (d). Here, “DZP-8au-v2.1” represents the v2.1 NAOs calculations based on the DZP orbitals with the 8 Bohr cutoff. The gray histograms represent the atomic magnetic moment m after fully unconstrained self-consistent calculations. The dotted lines show the cDFT energy with certain constraints.
We now impose a constraint on the magnetic configuration, fixing the magnetic moments at 2.36 μB. The corresponding energy from the spin-cDFT calculations is shown as dotted lines. Due to this constraint, the magnetic moment of 2.36 μB represents an excited state for all volume points except 11.2 Å³, resulting in energies higher than the unconstrained ground-state energy. The spin-cDFT energy intersects the ground-state energy at only one point, where the ground-state magnetic moment is 2.36 μB. The upper panel of Fig. 5(a) displays the magnetic torques λ optimized in the cDFT calculations. The magnetic torques are zero when the cDFT states are the same as the ground state, indicating that no penalty is required to maintain the specified magnetic moment configuration. However, the excited magnetic states lead to finite magnetic torque, which can be interpreted as an additional effective magnetic field necessary to constrain the magnetic moment.
Similarly, in Fig. 5(b), we present the results for ferromagnetic FCC-Fe. Compared to BCC-Fe, the FCC phase near the ground state exhibits two local minima in the cell volume per atom, located around 10.5 Å³ and 12.2 Å³, respectively. The atomic magnetic moment increases monotonically with the cell volume. A key difference arises at the boundary between the two minima, where a sudden change in the magnitude of the atomic magnetic moment is observed. The atomic magnetic moments corresponding to the two minima are 1.16 μB and 2.72 μB, respectively. We constrained the magnitude of the atomic magnetic moments to these two values, maintaining their direction in the ferromagnetic state. The results show that, regardless of whether the calculations are performed using the PW or NAO basis, the total energy under the constrained conditions only touches the ground-state energy at the corresponding ground-state magnetic moments, with the associated magnetic force λ being zero. When the constrained magnetic moments deviate from the ground-state values, the total energy exceeds the ground-state energy, and the magnetic force progressively increases.
To systematically present the potential energy surfaces of Fe, we modeled BCC-Fe and FCC-Fe with atomic volumes ranging from 8 to 16 Å³ and calculated their total energies at various magnetic moments using spin-cDFT. The magnetic moments considered ranged from 0.2 to 3.8 μB. Figure 6 illustrates the potential energy surfaces, E(V, M), for BCC-Fe and FCC-Fe, respectively. The results reveal that the BCC phase exhibits a single energy minimum, whereas the FCC phase features two distinct energy minima, consistent with the findings presented in Fig. 5. In addition, Fig. S5 displays the potential energy surface E(θ, ∣M∣) of BCC-Fe, encompassing both transverse and longitudinal excitations. The results demonstrate that as the angle between the magnetic moments of two neighboring iron atoms increases, the total energy rises progressively, while the magnetic moment amplitude corresponding to the energy-favored configuration gradually diminishes.
Magnetic calculations have high accuracy requirements. Unlike the PW basis, where accuracy can be systematically improved by increasing the number of plane waves, the accuracy of the NAO basis depends on the quality of the orbitals themselves and lacks a systematic way to enhance precision. In the following, we quantitatively examine the precision of spin-cDFT results using different basis sets. In the Heisenberg model \(H={\sum }_{ij}{J}_{ij}{{\bf{S}}}_{i}\cdot {{\bf{S}}}_{j}\), the magnetic exchange strength J determines the resistance of the material’s magnetic order to external perturbations such as temperature and magnetic fields. This strength typically falls within the range of a few to tens of meV3. The value of J can generally be derived from the energy differences between various magnetic configurations, making the magnetic barrier energy an excellent quantitative metric for evaluating accuracy.
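One common way to turn such energy differences into an exchange scale is to fit constrained energies at several canting angles to a cosine. The sketch below assumes a two-sublattice cell with fixed moment magnitudes, for which a bilinear Heisenberg Hamiltonian gives E(θ) = a + b cos θ; the coefficient b then collects all inter-sublattice couplings with a prefactor that depends on the counting convention, which is deliberately left unspecified here. The function name and the synthetic numbers are assumptions.

```python
import numpy as np


def fit_cosine(theta_deg, energy):
    """Least-squares fit of E(theta) = a + b*cos(theta); returns (a, b)."""
    theta = np.radians(np.asarray(theta_deg, dtype=float))
    design = np.column_stack([np.ones_like(theta), np.cos(theta)])
    (a, b), *_ = np.linalg.lstsq(design, np.asarray(energy, dtype=float), rcond=None)
    return float(a), float(b)


# Synthetic energies (eV) generated from known coefficients, not DFT data.
angles = np.array([0.0, 30.0, 60.0, 90.0, 120.0, 150.0, 180.0])
toy_energy = -1.0 - 0.2 * np.cos(np.radians(angles))
a_fit, b_fit = fit_cosine(angles, toy_energy)
print(a_fit, b_fit, "E(180) - E(0) =", -2.0 * b_fit)
```

Deviations of real cDFT energies from a pure cosine signal contributions beyond the bilinear model, for instance from the change in moment magnitude discussed in the text.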
To this end, we chose a BCC-Fe unit cell containing two Fe atoms, each with the same magnetic moment M and an angle θ between them. When θ = 0°, the system represents a ferromagnetic state, and when θ = 180°, it represents an antiferromagnetic state. To evaluate the quality of the orbitals, we selected different NAO basis sets with various cutoff radii rc and calculated E(M, θ) for these configurations. Additionally, we used results from a PW basis as a reference. All E(M, θ) values for these magnetic configurations were computed using the spin-cDFT method. It is important to note that the total energies calculated using the NAO basis cannot be directly compared to those of the PW basis. Since the magnetic exchange J reflects the response to changes in magnetic moments, we must first subtract the ground-state energy. Specifically, we define \(\widehat{E}(M,\theta )=E(M,\theta )-E({M}_{0},0)\), where M0 is the atomic magnetic moment of the ground state. We then compare the error in \(\widehat{E}(M,\theta )\) between the NAO and PW basis sets.
Figure 7 presents the results for various NAO basis sets, showing the energy error compared to the PW results as \(\Delta \widehat{E}(M,\theta )=\widehat{E}(M,\theta ){| }_{NAOs}-\widehat{E}(M,\theta ){| }_{PW}\). According to the figure, the error is fairly uniform except in regions where the magnetic moment decreases and the angle is relatively large, where the energy error is significantly higher. The results indicate that the error primarily depends on the type of basis rather than the cutoff radius. In practical scenarios, such as the gradual transition from a ferromagnetic to a paramagnetic state with increasing temperature in BCC-Fe, the magnetic moments change significantly in angle rather than in amplitude. When the angle changes from 0° to 180°, the error for the DZP basis set ranges from 2 to 5 meV, whereas the error for the TZDP basis set is significantly reduced to 0.5 to 1 meV.
a The BCC-Fe cell with 2 Fe atoms, labeled as {M0 − ΔM, θ}, where the two atoms have the same atomic magnetic moment M0 − ΔM. M0 is the ground-state atomic magnetic moment, ΔM ranges from −0.4 μB to 0.4 μB, and θ is the angle between the two moments. b, c show the magnetic energy barrier difference between the NAO basis and the PW basis, namely \(\Delta \widehat{E}(M,\theta )=\widehat{E}(M,\theta ){| }_{NAOs}-\widehat{E}(M,\theta ){| }_{PW}\). Here, the magnetic energy barrier is defined as the energy above the ground state, \(\widehat{E}(M,\theta )=E(M,\theta )-E({M}_{0},0)\). d-f show the ratio of the magnetic energy barrier difference \(\Delta \widehat{E}(M,\theta )\) to the PW results.
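The two definitions above amount to a few array operations once E(M, θ) is tabulated on a common grid for both basis sets. In the sketch below, the grid layout (rows for M, columns for θ), the index arguments locating the ground state, and the numbers are all hypothetical; it merely shows that a constant offset between the two basis sets drops out of the barrier error.

```python
import numpy as np


def barrier(e_grid, i0, j0):
    """Magnetic energy barrier: E(M, theta) - E(M0, 0), with (i0, j0) the ground-state index."""
    return e_grid - e_grid[i0, j0]


def barrier_error(e_nao, e_pw, i0, j0):
    """Difference of the barriers obtained with the NAO and PW basis sets."""
    return barrier(e_nao, i0, j0) - barrier(e_pw, i0, j0)


# Made-up energies (eV): rows are moment magnitudes, columns are angles.
e_pw = np.array([[0.00, 0.10, 0.35],
                 [0.02, 0.11, 0.33]])
offset = 5.0                                # constant total-energy shift between basis sets
wiggle = np.array([[0.000, 0.002, 0.004],
                   [0.001, 0.002, 0.005]])  # basis-dependent error of the barrier itself
e_nao = e_pw + offset + wiggle
print(barrier_error(e_nao, e_pw, 0, 0))
```

Only the configuration-dependent part of the discrepancy survives the subtraction, which is the quantity that matters for exchange parameters.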
To further validate the accuracy within the framework of density functional theory, we analytically calculated the magnetic exchange energy using the magnetic force theorem41,42. Specifically, we computed the exchange parameters with ABACUS and TB2J43, employing a 9 × 9 × 9 supercell to determine the Jij values up to the fourth-nearest neighbors. The results obtained using different NAO basis sets are summarized in Table 1. Since ABACUS currently does not support an interface between the PW basis and TB2J, we used PW results reported in the literature as reference values44,45. Overall, the dominant ferromagnetic JNN calculated with the TZDP basis shows good agreement with the results reported in ref. 45.
DeePSPIN model
The non-collinear spin-cDFT implementation in ABACUS provides a powerful tool for studying complex magnetic configurations at the atomic scale. However, conventional DFT methods are time-consuming, which makes it challenging to simulate magnetic dynamic processes, such as transitions in magnetic ordering, that require large-scale supercells. DeePSPIN29 is a deep learning approach for magnetic materials that treats spins as so-called “pseudo-atoms” and integrates with the descriptor framework of DeepPot-SE46, preserving translational, rotational, and permutation symmetries. In detail, the DeePSPIN model requires high-precision first-principles data for magnetic materials, including energies, forces, magnetic torques, stresses (optional), and atomic configurations, as training data. Additionally, the design of appropriate loss functions and active learning methods47 significantly reduces the required number of samples. A well-trained DeePSPIN model can accurately predict physical properties such as energies, forces, and magnetic torques. By combining it with methods like molecular dynamics (MD)48 and the Landau-Lifshitz-Gilbert equations49, it can simulate spin evolution in large-scale systems and accurately describe spin-lattice interactions.
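For readers unfamiliar with the Landau-Lifshitz-Gilbert equation, a minimal single-spin integrator is sketched below. It uses the textbook Landau-Lifshitz form for a unit spin in a static effective field, with a Heun step and renormalization; the sign convention, the dimensionless units, and all function names are assumptions, and it is not the spin-lattice dynamics scheme used with DeePSPIN, where the effective field would come from the model's magnetic torques.

```python
import numpy as np


def llg_rhs(s, b, gamma=1.0, alpha=0.1):
    """dS/dt = -gamma/(1+alpha^2) * [S x B + alpha * S x (S x B)] for a unit spin S."""
    sxb = np.cross(s, b)
    return -gamma / (1.0 + alpha**2) * (sxb + alpha * np.cross(s, sxb))


def integrate_llg(s0, b, dt=0.01, steps=5000, gamma=1.0, alpha=0.1):
    """Heun (predictor-corrector) integration with renormalization of |S| after each step."""
    s = np.asarray(s0, dtype=float)
    s = s / np.linalg.norm(s)
    b = np.asarray(b, dtype=float)
    for _ in range(steps):
        k1 = llg_rhs(s, b, gamma, alpha)
        k2 = llg_rhs(s + dt * k1, b, gamma, alpha)
        s = s + 0.5 * dt * (k1 + k2)
        s = s / np.linalg.norm(s)  # keep the spin on the unit sphere
    return s


# A spin initially along x precesses around a field along z and relaxes toward it.
s_final = integrate_llg([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
print(s_final)
```

The precession term conserves the energy while the damping term, controlled by `alpha`, aligns the spin with the field; in a coupled spin-lattice simulation a thermostat would add a stochastic field on top of this deterministic part.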
To assess the capability of ABACUS+DeePSPIN in representing significant changes in the lattice and magnetic moments, we generated over 10,000 collinear magnetic configurations for the BCC/FCC phase of iron, including three magnetic states: ferromagnetic (FM), antiferromagnetic (AFM), and double-layer ferrimagnetic (DAFM). The cell volume perturbations ranged from −30% to +30%, and the magnitude of the atomic magnetic moments varied from 0 to 4.0 μB. These data points were obtained through collinear spin-cDFT calculations using the DZP-8au-v2.1 basis set. We then trained a DeePSPIN model based on these perturbed configurations. Figure 8 compares the total energies predicted by DeePSPIN and DFT for collinear magnetic configurations such as FM, AFM, and DAFM. We find that the results from the two methods align well with each other, demonstrating the good accuracy of the DeePSPIN model. Figure 9 illustrates the dependence of the total energy on the magnitude of the magnetic moments, with the volume being changed isotropically. The solid line represents the ground-state energy from unconstrained DFT calculations at each volume, while the data points from the DeePSPIN model correspond to excited states, which have energies higher than those of the DFT results. These observations are in good agreement with recent studies28, demonstrating the reliability of the ABACUS spin-cDFT data and the expressive capability of the DeePSPIN model.
To explore more complex magnetic phenomena, such as the transition of magnetic ordering, it is crucial for the model to accurately capture information about non-collinear magnetic moments. To this end, we trained the DeePSPIN model following an automated workflow comprising four key components: (1) initial configuration construction, (2) first-principles data labeling, (3) model training, and (4) sampling exploration. The workflow begins with the construction of an initial dataset. Based on two fundamental structures (BCC and FCC) and three magnetic configurations (FM, AFM, and DAFM), we obtained six initial configurations (2 × 2 × 2 supercells containing 16 Fe atoms). Subsequently, random perturbations were applied to each configuration’s atomic positions, lattice strain, magnetic moment orientations, and magnetic moment magnitudes, generating 100 perturbed configurations per initial structure to form the initial dataset. The maximum perturbation amplitude for the lattice cell was 1%, while atomic positions were perturbed by up to 0.03 Å. Magnetic moments were randomly rotated by up to 90 degrees, and their magnitudes were perturbed by up to 0.5 μB. The initially sampled configurations were subjected to first-principles calculations using ABACUS to obtain the corresponding physical quantities such as energy (E), atomic forces (F), stress (σ), and magnetic torque (λ). These data, combined with different random seeds, were used to train four different initial DeePSPIN magnetic models29. Starting from these models, we performed active learning, which consisted of four components: configuration exploration, configuration analysis, first-principles sampling, and model updating.
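The perturbation step described above can be illustrated with a minimal sketch. It assumes that the 1% bound applies to each entry of a symmetric strain tensor, that the 0.03 Å bound applies to the length of each displacement vector, and that each moment is tilted about a random axis perpendicular to itself. The function name `perturb_configuration` and the lattice constant are hypothetical and are not taken from the actual workflow.

```python
import numpy as np

def perturb_configuration(cell, positions, moments, rng,
                          max_strain=0.01, max_disp=0.03,
                          max_angle_deg=90.0, max_dmag=0.5):
    """Return one randomly perturbed copy of (cell, positions, moments).

    cell: (3, 3) lattice vectors in Angstrom (rows).
    positions: (N, 3) Cartesian coordinates in Angstrom.
    moments: (N, 3) non-zero atomic magnetic moments in Bohr magnetons.
    """
    n_atoms = len(positions)

    # Symmetric strain tensor whose entries are bounded by max_strain.
    eps = rng.uniform(-max_strain, max_strain, size=(3, 3))
    eps = 0.5 * (eps + eps.T)
    deform = np.eye(3) + eps
    new_cell = cell @ deform

    # Strain the coordinates, then displace each atom by at most max_disp.
    directions = rng.normal(size=(n_atoms, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    lengths = rng.uniform(0.0, max_disp, size=(n_atoms, 1))
    new_positions = positions @ deform + directions * lengths

    new_moments = np.empty_like(moments)
    for i, m in enumerate(moments):
        mag = np.linalg.norm(m)
        unit = m / mag
        # Rotation axis perpendicular to the moment, so the Rodrigues
        # rotation tilts the moment by exactly `angle`.
        axis = rng.normal(size=3)
        axis -= axis.dot(unit) * unit
        axis /= np.linalg.norm(axis)
        angle = np.radians(rng.uniform(0.0, max_angle_deg))
        rotated = unit * np.cos(angle) + np.cross(axis, unit) * np.sin(angle)
        new_mag = max(mag + rng.uniform(-max_dmag, max_dmag), 0.0)
        new_moments[i] = new_mag * rotated
    return new_cell, new_positions, new_moments

# Example: a 2 x 2 x 2 BCC supercell (16 atoms) in an FM state.
rng = np.random.default_rng(0)
a = 2.87  # hypothetical lattice constant in Angstrom
cell = 2 * a * np.eye(3)
base = np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]) * a
shifts = np.array([[i, j, k] for i in range(2)
                   for j in range(2) for k in range(2)]) * a
positions = (base[None, :, :] + shifts[:, None, :]).reshape(-1, 3)
moments = np.tile([0.0, 0.0, 2.2], (16, 1))
dataset = [perturb_configuration(cell, positions, moments, rng)
           for _ in range(100)]
```

Each entry of `dataset` would then be labeled by a first-principles calculation; the labeling itself is outside the scope of this sketch.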
The exploration was conducted based on an active learning strategy47, employing molecular dynamics37. The molecular dynamics simulations utilize a modified version of LAMMPS, explicitly adapted to include atomic magnetic moment dynamics and their coupling with lattice motions37. Specifically, we selected one of the models to perform molecular dynamics simulations and sample configurations, while using all four initial models to predict material properties for the new configurations. The active learning strategy selects new configurations based on the models’ prediction uncertainty, specifically identifying “high-value” configurations as those exhibiting large prediction uncertainties. Configurations exhibiting smaller errors are categorized as “well-learned”, while those with anomalously large errors are labeled as “unphysical”. Both types of configurations are subsequently removed from the exploration space. High-value configurations were then subjected to first-principles calculations. The newly acquired data were incorporated into the existing training set to update the models. The exploration process started at low temperatures, and after model convergence within each temperature interval, the MD simulation temperature was incrementally increased. The sampling temperature ranged from 50 to 1600 K, with each temperature range including MD configurations with virtual magnetic masses of 0.01, 0.05, 0.1, 0.5, and 1.0. The virtual magnetic mass serves as an auxiliary parameter introduced to characterize spin dynamics in the simulations. While this parameter influences the numerical stability of the calculations, our tests demonstrate that it does not introduce any significant bias in the magnetic evolution (see Fig. S10). This result equivalently demonstrates that varying magnitudes of magnetic torque do not affect the thermodynamic statistical outcomes. We employed both the NAOs basis (DZP-7au-v2.0) and the PW basis to generate two separate data sets.
All computations were performed using non-collinear spin-cDFT, incorporating spin-orbit coupling. The PW training strategy followed the same approach as for the NAOs, with the key difference being that NAOs BCC sampling involved 400 configurations per round, whereas PW BCC sampling involved 200 configurations per round. Ultimately, the NAOs basis set generated a total of 34,703 DPGEN samples, while the PW basis set generated 20,931 DPGEN samples. Additionally, for each basis set, we supplemented the data with 175 perturbed configurations for FCC-FM and 109 configurations for the FCC 4 × 2 × 2 supercells.
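The selection step of the active learning loop can be sketched as follows. This assumes the ensemble-deviation measure commonly used in DP-GEN-style workflows, namely the largest per-atom standard deviation of a predicted vector quantity (force or magnetic torque) across the models. The thresholds `lo` and `hi` are hypothetical values chosen for illustration, not those used in this work.

```python
import numpy as np

def max_deviation(predictions):
    """Largest per-atom ensemble standard deviation.

    predictions: (n_models, n_atoms, 3) array of forces or magnetic torques
    predicted by each model for the same configuration.
    """
    mean = predictions.mean(axis=0)
    variance = ((predictions - mean) ** 2).sum(axis=2).mean(axis=0)
    return float(np.sqrt(variance).max())

def classify(deviation, lo, hi):
    """Sort a configuration into one of the three categories."""
    if deviation < lo:
        return "well-learned"
    if deviation > hi:
        return "unphysical"
    return "high-value"

# Example with four models and 16 atoms; only one model disagrees on atom 3.
preds = np.zeros((4, 16, 3))
preds[0, 3, 0] = 0.4
dev = max_deviation(preds)
label = classify(dev, lo=0.05, hi=0.3)  # hypothetical thresholds
```

Only configurations labeled "high-value" would be passed on to first-principles labeling; the other two categories are discarded from the exploration space.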
We trained two models, DeePSPIN-DZP and DeePSPIN-PW, based on first-principles data, including energy, force, and magnetic torque. These models allow us to perform large-scale simulations to observe the evolution of magnetism with temperature. The training errors and test performance of the DZP model are presented in the Supplementary Information (see Figs. S1 and S2). While the DZP basis set introduces a systematic energy error of approximately 1 meV/atom, this magnitude of error is well within the acceptable tolerance range for both model training and prediction. The basis set error does not constitute the primary source of the model’s training error. The model demonstrates consistency in prediction errors across different system sizes, as depicted in Fig. S3. We conducted NVT ensemble simulations at different temperatures using the trained models, employing an 8 × 8 × 8 supercell (consisting of 1024 atoms), a virtual magnetic mass of 0.01, a timestep of 0.1 fs, and a total simulation time of 3 ps. In Fig. S11, we present the time-dependent magnetic moment M(t) at several temperatures. After the MD equilibration, we selected the trajectory from 1 to 3 ps to calculate the average total magnetization of the system. In Fig. 10, we present the total magnetization of BCC-Fe as a function of temperature. It can be observed that BCC-Fe exhibits ferromagnetic behavior at low temperatures, with the total magnetization exceeding 2 μB. The magnetization decreases gradually up to 800 K, before undergoing a sudden drop around 1000 K, at which point it approaches zero, indicating a ferromagnetic-to-paramagnetic transition (see Fig. S12). This transition temperature is in close agreement with the experimentally observed Curie temperature of 1043 K.
Furthermore, a systematic analysis of the convergence of the dynamical simulations with respect to system size demonstrates that, while the model overestimates total magnetic moments in small systems at elevated temperatures (due to challenges in modeling paramagnetic states), its predictions converge progressively as the system dimensions increase, as evidenced in Fig. S13. Surprisingly, although the DZP-7au-v2.0 basis set introduces non-negligible errors when calculating the magnetic exchange strength J, the magnetization as a function of temperature and the Curie temperature are in very close agreement with the results from the PW basis.
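The averaging over the 1 to 3 ps window can be sketched as below. This assumes that the reported magnetization is the magnitude of the vector-averaged atomic moment per atom, evaluated frame by frame and then averaged over time. The function name and the synthetic trajectories are hypothetical and serve only to show the bookkeeping.

```python
import numpy as np

def mean_magnetization(times_ps, moments, t_start=1.0, t_end=3.0):
    """Time-averaged magnetization per atom and its standard deviation.

    times_ps: (n_frames,) simulation times in ps.
    moments: (n_frames, n_atoms, 3) atomic magnetic moments in Bohr magnetons.
    """
    mask = (times_ps >= t_start) & (times_ps <= t_end)
    # Magnitude of the mean moment vector in each retained frame.
    per_frame = np.linalg.norm(moments[mask].mean(axis=1), axis=1)
    return per_frame.mean(), per_frame.std()

# Synthetic examples for 1024 atoms: fully ordered vs. randomly oriented.
rng = np.random.default_rng(1)
times = np.linspace(0.0, 3.0, 31)
ordered = np.tile([0.0, 0.0, 2.2], (31, 1024, 1))
random_dirs = rng.normal(size=(31, 1024, 3))
random_dirs /= np.linalg.norm(random_dirs, axis=2, keepdims=True)
disordered = 2.2 * random_dirs

m_fm, err_fm = mean_magnetization(times, ordered)
m_pm, err_pm = mean_magnetization(times, disordered)
```

For the ordered trajectory the result equals the atomic moment, while for uncorrelated random orientations it is small and shrinks with the number of atoms, which is consistent with the finite-size overestimate of the paramagnetic magnetization discussed above.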
Magnetization of BCC-Fe as a function of temperature. Based on the DeePSPIN model, MD simulations at different temperatures are performed, with the stable magnetic moment counted as the magnetization at that temperature. Line width refers to the error bar. The number of atoms in the supercell is 1024, and the virtual magnetic mass M=0.01.
In summary, we have implemented the spin-cDFT method in ABACUS and combined the resulting first-principles data with AI-assisted magnetic models to extend first-principles accuracy to larger scales, enabling molecular dynamics simulations of lattice and magnetic moment behavior at large spatial scales and long time scales. We demonstrate that the deep learning magnetic model accurately predicts the Curie temperature of iron. Furthermore, the model can be used to study complex dynamic phenomena such as the BCC-FCC structural phase transition of iron. A more detailed discussion of this will be provided in a forthcoming paper.
Discussion
In this study, we have implemented a non-collinear spin-constrained method within the open-source software ABACUS, utilizing both plane wave and numerical atomic orbital basis sets. This implementation allows for the precise control and calculation of arbitrary magnetic configurations. We introduce a smooth modulation orbital method for calculating atomic magnetization using the NAOs projection. Systematic investigations on bulk iron have demonstrated its reliability. The precision of the numerical atomic orbitals used in magnetic calculations has been quantitatively discussed, with the TZDP basis set yielding results very close to the reference plane-wave basis.
An automated workflow is utilized to train a fundamental magnetic DeePSPIN model for elemental Fe. Combined with molecular dynamics simulations, the DeePSPIN models based on the PW and NAO basis sets both successfully reproduce the ferromagnetic-paramagnetic transition near the experimental Curie temperature, demonstrating the robustness and effectiveness of this workflow. The datasets and models presented in this paper will be made openly available to support subsequent fine-tuning and applications. The dataset in this study was constructed from the outset based on fully constrained cDFT. This conservative approach may lead to redundancy in the dataset, because the distribution of magnetic moment magnitudes remains relatively concentrated in the low-temperature region. A feasible solution is to initially employ direction-only constrained cDFT for sampling in the low-temperature region, while gradually increasing the proportion of fully constrained cDFT calculations for magnetic moment magnitude variations as the temperature rises. This hybrid strategy is expected to significantly reduce the required number of samples while maintaining model accuracy. We anticipate that future work will address this limitation.
In conclusion, the noncollinear spin-constrained method implemented in ABACUS provides a powerful tool for studying complex magnetic phenomena at the atomic scale, and a data engine for deep-learning magnetic models. By integrating first-principles calculations, dynamical simulations, active learning, and magnetic modeling methods, we transfer first-principles accuracy to large-scale magnetic simulations in a maximally automated way. This integration enables large-scale magnetic simulations, providing new possibilities for the study of complex magnetic phenomena.
Methods
Non-collinear spin
For systems with non-collinear spin configurations, where the spin is not uniformly aligned along a single axis, the electronic wavefunction is expressed as a spinor ψ = {ψ↑, ψ↓}. In the context of non-collinear magnetism within density functional theory, the Kohn-Sham equations for a two-component spinor wave function ψ can be written as2
where Veff is the effective potential, including the Hartree term VH, the external potential term Vext and the exchange-correlation (XC) term Vxc. In addition, the generalized charge density can be defined by introducing two-component spin space and the formula is
where
is charge density and the spin density vector m = (mx, my, mz) is
Here ψi is the two-component spinor wavefunction of the i-th band with occupation number fi. The index p takes values {1, 2, 3}, representing the components of the spin density m along the x, y and z directions, and correspondingly labels the p-th Pauli matrix σp.
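As a numerical illustration of these definitions, the sketch below evaluates ρ and m from spinor values on a grid and checks them against the 2 × 2 density matrix. It assumes the common convention in which the generalized density equals ½(ρ I + m ⋅ σ), so that its local eigenvalues are (ρ ± |m|)/2. The array layout and the function name `densities` are hypothetical.

```python
import numpy as np

# Pauli matrices sigma_x, sigma_y, sigma_z.
PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def densities(psi, occ):
    """Charge density rho (n_grid,) and spin density m (3, n_grid).

    psi: (n_bands, 2, n_grid) spinor values on the grid.
    occ: (n_bands,) occupation numbers f_i.
    """
    rho = np.einsum("i,isr,isr->r", occ, psi.conj(), psi).real
    m = np.einsum("i,isr,pst,itr->pr", occ, psi.conj(), PAULI, psi).real
    return rho, m

# Random spinors for three bands on five grid points.
rng = np.random.default_rng(2)
psi = rng.normal(size=(3, 2, 5)) + 1j * rng.normal(size=(3, 2, 5))
occ = np.array([1.0, 1.0, 0.5])
rho, m = densities(psi, occ)

# 2 x 2 density matrix at each grid point, built directly from the spinors.
n_mat = np.einsum("i,isr,itr->rst", occ, psi, psi.conj())
rebuilt = 0.5 * (rho[:, None, None] * np.eye(2)
                 + np.einsum("pr,pst->rst", m, PAULI))
eigenvalues = np.linalg.eigvalsh(n_mat)  # ascending order per grid point
m_abs = np.linalg.norm(m, axis=0)
```

Here `rebuilt` coincides with `n_mat`, and the two columns of `eigenvalues` coincide with (ρ − |m|)/2 and (ρ + |m|)/2, which are the quantities a locally collinear treatment of the XC functional works with.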
The magnetic effects in the Hamiltonian can be categorized into two parts. First, the electron-electron interaction, which is encompassed within the exchange-correlation term, directly influences the self-consistent solution process and necessitates the careful selection of appropriate functionals. Second, the spin-orbit coupling (SOC) effect, arising from relativistic effects, is introduced through fully relativistic pseudopotentials and is computed within the non-local components of the pseudopotential.
In the computation of the XC functional, the treatment of non-collinear spins can be simplified by using the local density approximation (LDA), which reduces the problem to calculating collinear spins at each grid point. For local reference systems used to calculate the XC functional, the local charge density can be transformed into the diagonal matrix form of the generalized charge density as
where ρ+ and ρ− are defined as
Taking into account the non-collinear electron spins within the framework of the aforementioned local spin density approximation (LSDA)50, the expression for the exchange-correlation energy functional takes the form of
and the corresponding potential is
The computation of the XC magnetic field b(r) can be achieved by employing the chain rule to differentiate the modulus of the spin density, ensuring that b(r) is collinear with the magnetization vector m(r). The formula is
The generalized gradient approximation (GGA) functionals51, which incorporate the charge density, its gradient, and the spin density gradient, are widely used because of their excellent balance between accuracy and efficiency. For non-collinear spin calculations, the gradient of the spin density vector can be calculated in different ways52,53,54,55, but none of them simultaneously achieves high efficiency and avoids the well-known numerical instabilities54,55 of non-collinear-spin GGA functionals. The method implemented in ABACUS, proposed by Kübler et al.52,53, calculates ∇ ρ±(r) without considering the different rotation matrices that would reduce non-collinear spins to collinear spins at various grid points, treating ρ±(r) as a scalar function.
The spin-orbit coupling and relativistic effects are incorporated using norm-conserving pseudopotentials56,57 within the Kleinman-Bylander (KB) form58,59,60,61. Additionally, spin-orbit terms are directly included in both plane-wave basis sets62 and numerical atomic orbitals basis sets63 for self-consistent field calculations.
Projection methods
As the core physical quantity in spin-cDFT, the atomic magnetic moment is crucial for accurate algorithmic implementation. Typically, one can define atomic magnetic moments in three ways: by partitioning the spin density64,65,66, through wavefunction analysis67,68, and via subspace projection12.
A straightforward definition of the first kind is the “spherical definition method”, in which the spin density is integrated over a spherical volume around the nucleus that is defined by the window function Θ(r; τI). For example
where τI denotes the position of the nucleus I, rcut is the cutoff radius, and Θ is the Heaviside step function or another smoothed window function.
A more sophisticated strategy requires conservation of the quantities to be partitioned,
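A minimal sketch of this spherical definition on a real-space grid is given below. It assumes an isolated atom in a cubic box (no periodic images), a uniform grid, and, for the smoothed case, a Fermi-like window whose functional form and `width` parameter are hypothetical choices made for illustration.

```python
import numpy as np

def atomic_moment(m_grid, coords, tau, r_cut, dv, width=0.0):
    """Integrate the spin density inside a sphere around the nucleus.

    m_grid: (3, n) spin density at the grid points.
    coords: (n, 3) grid point coordinates; tau: (3,) nucleus position.
    dv: volume element of the grid; width: smearing of the window edge.
    """
    r = np.linalg.norm(coords - tau, axis=1)
    if width > 0.0:
        # Smoothed window (hypothetical Fermi-like form).
        window = 1.0 / (1.0 + np.exp((r - r_cut) / width))
    else:
        # Heaviside step function.
        window = (r <= r_cut).astype(float)
    return (m_grid * window).sum(axis=1) * dv

# Synthetic spin density: a Gaussian along z carrying a total moment of 2.
box, n, sigma, total = 10.0, 48, 0.6, 2.0
axis = np.arange(n) * box / n
coords = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"),
                  axis=-1).reshape(-1, 3)
tau = np.array([5.0, 5.0, 5.0])
r2 = ((coords - tau) ** 2).sum(axis=1)
amplitude = total / ((2.0 * np.pi) ** 1.5 * sigma ** 3)
m_grid = np.zeros((3, coords.shape[0]))
m_grid[2] = amplitude * np.exp(-r2 / (2.0 * sigma ** 2))
dv = (box / n) ** 3

m_sharp = atomic_moment(m_grid, coords, tau, r_cut=2.4, dv=dv)
m_smooth = atomic_moment(m_grid, coords, tau, r_cut=2.4, dv=dv, width=0.1)
```

With the cutoff at four standard deviations, both windows recover almost the full moment. For a realistic spin density, the result depends on rcut, which is the main limitation of this definition.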
in which
wI(r) is the weighting function, and the partition function PI(r) approaches 1 near the I-th atom and 0 otherwise. Well-known examples of such partitioning strategies are the Atoms-in-Molecules (AIM) approach, also known as Bader charge analysis64, which defines atomic regions by zero-flux surfaces in the gradient of the electron density, and Voronoi tessellation65, which partitions space into polyhedral cells around each atom based on proximity. The Bader charge analysis provides a mathematically rigorous definition of atomic boundaries based on electron density, yielding charges that are invariant to the choice of basis set and more reflective of the true electronic structure. However, this method fails when zero or more than one atom appears in one Voronoi cell, and it lacks an orbital decomposition.
Mulliken67 and Löwdin68 charge analysis methods are well-established for partitioning electron density within a molecule to assign atomic charges. The formula for calculating the atomic magnetization from Mulliken charge of I-th atom is given by
Similarly, the formula for the Löwdin charge is:
where σ is the Pauli matrix used to decompose the reciprocal-space density matrix ρμν(k), and Sμν(k) is the overlap integral matrix between basis functions μ and ν. These two charge analysis methods are powerful tools for approximating the electron distribution by assigning shared electrons between atoms based on orbital overlaps, thereby offering a rudimentary yet quick insight into molecular charge states. However, their reliance on the choice of basis sets and the arbitrary nature of electron partitioning, which overlooks electronegativity differences, often results in charges that lack physical accuracy and can vary significantly with computational parameters.
The subspace projection method requires a rational construction of projection operators, in which the projection functions should be physically meaningful. For example, in the projector augmented-wave (PAW) formalism69, the PAW projectors \(\{| {p}_{I\mu }\rangle \}\), which are orthonormal to the pseudo partial waves connected with the all-electron partial waves \(\{| {\phi }_{\mu }\rangle \}\), are a natural choice11,12. The atomic magnetic moment is defined as
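The two population schemes can be compared in a small sketch. For brevity it assumes a collinear, Γ-point-only case with real matrices, so that only the z component (spin-up minus spin-down) appears; the two-orbital overlap matrix and coefficient vectors are hypothetical numbers.

```python
import numpy as np

def mulliken_moments(p_up, p_dn, s, atom_of_orbital, n_atoms):
    """Mulliken moments: M_I = sum over mu on atom I of [(P_up - P_dn) S]_mu,mu."""
    gross = np.diag((p_up - p_dn) @ s)
    return np.bincount(atom_of_orbital, weights=gross, minlength=n_atoms)

def lowdin_moments(p_up, p_dn, s, atom_of_orbital, n_atoms):
    """Loewdin moments from the symmetrically orthogonalized density matrix."""
    w, v = np.linalg.eigh(s)
    s_half = (v * np.sqrt(w)) @ v.T  # S^(1/2)
    gross = np.diag(s_half @ (p_up - p_dn) @ s_half)
    return np.bincount(atom_of_orbital, weights=gross, minlength=n_atoms)

def density_matrix(c, s):
    """Density matrix of one electron in orbital c, normalized so c^T S c = 1."""
    c = c / np.sqrt(c @ s @ c)
    return np.outer(c, c)

# Two atoms with one orbital each and an overlap of 0.3.
s = np.array([[1.0, 0.3], [0.3, 1.0]])
atom_of_orbital = np.array([0, 1])
p_up = density_matrix(np.array([1.0, 0.5]), s)  # spin-up electron, mostly on atom 0
p_dn = density_matrix(np.array([0.2, 1.0]), s)  # spin-down electron, mostly on atom 1

m_mulliken = mulliken_moments(p_up, p_dn, s, atom_of_orbital, 2)
m_lowdin = lowdin_moments(p_up, p_dn, s, atom_of_orbital, 2)
```

Both schemes give the same total moment, since Tr[PS] = Tr[S^(1/2) P S^(1/2)], but they distribute it differently between the atoms, which illustrates the basis-set and partitioning dependence discussed above.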
in which the scalar σ and \({\sigma }^{{\prime} }\) refer to the spin index (up and down) and the σp is the p-th Pauli matrix. \({D}_{I\mu \nu }^{\sigma {\sigma }^{{\prime} }}\) is the one-center density matrix representation of pseudo partial wave centered at the I-th atom,
Ωμν is the all-electron partial wave matrix elements of the cutoff-sphere integral
This algorithm effectively utilizes the PAW projection orbitals’ ability to accurately describe different electronic angular momentum orbitals, while also leveraging spherical truncation to preserve locality, making it a highly efficient method for defining atomic magnetic moments.
In this work, we propose a projection method within the pseudopotential formalism for calculating atomic magnetization. We employ valence-electron orbital local density projection operators
to estimate atomic magnetic moments MI, where αIlm is the projection orbital distinguished by the angular momentum quantum number l and the magnetic quantum number m. Within this framework, the sum of occupancy of the valence electrons \({\sum }_{m}{n}_{Ilm}^{\sigma {\sigma }^{{\prime} }}\) with identical angular momentum l and spin index \(\sigma ,{\sigma }^{{\prime} }\) corresponds to the trace of the projected density matrix \(Tr({n}_{Ilm{m}^{{\prime} }}^{\sigma {\sigma }^{{\prime} }})\), where
The atomic magnetic moment along the direction p is calculated by summing the diagonal elements of the occupation matrix for each angular momentum
where p refers to x, y, z components.
The accuracy and locality of the projection depend upon the orbitals \(| {\alpha }_{Ilm}\rangle\) used to construct the projection operators \({\widehat{P}}_{Ilm{m}^{{\prime} }}\), which are composed of radial distribution functions and spherical harmonics
where the radial functions α(r) are constructed to meet three criteria: (1) maximized efficiency for extracting the valence electron information of interest, (2) normalization, and (3) smoothness at the boundary to avoid numerical error. In practice, we construct smooth modulation orbitals (SMOs) by truncating those ζ functions of the ABACUS numerical atomic orbitals (NAOs)30,70 that capture the majority of the valence electron information over various reference systems, and then smoothing them with a function centered at the boundary
where r is the distance to the atom and rm is the artificially chosen modulation radius. The spreading parameter σ is determined iteratively by minimizing the gradient term of the spillage70 between the SMOs \(| \alpha (\sigma )\rangle\) and the NAOs \(| \chi \rangle\) under the constraint that the SMOs themselves remain normalized
where
Using SMOs as projection orbitals allows for the representation of charge distributions across different angular momentum orbitals within an atom’s localized environment. Since the projection orbitals are designed to be strongly localized, inter-atomic overlap can be neglected. Furthermore, the orthogonality of projection orbitals with different angular momenta is preserved during modulation, resulting in a simplified form for the projection operator.
The onsite occupation matrix, which quantifies the localized atomic charge occupancy, can be obtained through the projection of wave functions onto SMOs. In NAO formulation for periodic systems, the onsite occupation matrix can be written as
where
represents the eigenvector of n-th band at the first Brillouin zone sampling point k, and the spin index is σ. The parameter fnk depicts occupation numbers of electrons. \({\phi }_{\mu }^{{\bf{0}}}={\phi }_{\mu }({\bf{r}}-{{\boldsymbol{\tau }}}_{\mu })\) and \({\phi }_{\nu }^{{\bf{R}}}={\phi }_{\nu }({\bf{r}}-{\bf{R}}-{{\boldsymbol{\tau }}}_{\nu })\) are numerical orbital functions. \({\rho }_{\mu \nu }^{\sigma {\sigma }^{{\prime} }}({\bf{R}})\) is the real space density matrix that takes the form of
For plane wave basis, the onsite occupation matrix \({n}_{Ilm{m}^{{\prime} }}^{\sigma {\sigma }^{{\prime} }}\) can be written as
where
is the plane wave basis representation of the overlap between \(| {\psi }_{n{\bf{k}}}^{\sigma }\rangle\) and SMOs, \({c}_{n{\bf{k}}}^{\sigma }({\bf{G}})\) and αIlm(G) are coefficients of plane wave expansion of \(| {\psi }_{n{\bf{k}}}^{\sigma }\rangle\) and SMOs, respectively. One can refer to ref. 71 for the details of SMOs expansion.
Spin-constrained DFT
Utilizing the established Lagrange formalism4,13, the problem of constraining atomic magnetic moments is reformulated as the search for the stationary points of a Lagrange function. To this end, a Lagrange multiplier is introduced into the energy functional of the system:
where EKS is the Kohn-Sham energy with charge density ρ(r) and spin density m(r), and λI is the Lagrange multiplier, which can be treated as the magnetic torque under this constraint. MI and MI,target are the atomic spin moment and the target atomic spin moment for atom I, respectively. The stationary point problem within the Lagrange formalism can be solved iteratively by optimizing Ec with respect to λI, ρ(r) and m(r)6.
This method can be extended in a straightforward manner to constrain only the direction of the atomic spin moments. The target atomic spin moment in Eq. (30) is rewritten as \({{\bf{M}}}_{I,target}={M}_{I,target}{{\bf{e}}}_{I,target}\), and the magnitude of the target spin moment is updated iteratively as \({M}_{I,target}={{\bf{M}}}_{I}({\bf{m}}({\bf{r}}))\cdot {{\bf{e}}}_{I,target}\) during the inner-loop optimization of the Lagrange multipliers \({{\boldsymbol{\lambda }}}_{I}{| }_{{{\boldsymbol{\lambda }}}_{I}\cdot {{\bf{e}}}_{I,target}=0}\). Upon convergence of the inner loop, the following conditions are satisfied: \({{\bf{M}}}_{I}({\bf{m}}({\bf{r}}))\parallel {{\bf{e}}}_{I,target}\) and ∑IλI ⋅ MI = 0. A typical test case is the rotation of a single atomic spin from the ferromagnetic to the antiferromagnetic orientation, which examines whether fixing the magnetic moment magnitude changes the energy and the torque. The comparison between the full-constraint and direction-constraint methods is shown in Fig. S6. It is worth noting that constraining only the spin direction, rather than both magnitude and direction, reduces the number of optimization degrees of freedom, typically cutting the required convergence steps in the inner loop by approximately half. Another noteworthy aspect is that the direction-only constraint approach circumvents errors arising from discrepancies in atomic magnetic moment definitions across different software packages, thereby enabling meaningful cross-software comparisons (see Fig. S14).
Another extension can be achieved by replacing all vector quantities in Eq. (30) with scalars, enabling the method to support spin-magnitude-constrained calculations under the LSDA. It is important to clarify that the two aforementioned extensions cannot be employed simultaneously, as the spin orientation in LSDA is restricted to two discrete directions (up and down), which cannot be successfully constrained through iterative gradient-based optimization.
First, we introduce the implementation within the numerical atomic orbital basis. The penalty Hamiltonian term in the NAOs basis in real space can be derived from the Lagrange function as
Here
where R is the lattice vector between basis functions μ and ν. The last term in the chain derivative on the right-hand side is the Pauli matrix; the remaining term can be expanded using Eq. (20) as
The penalty term of Hamiltonian has the form of
where the 2 × 2 parameter matrix \(f(I,\sigma {\sigma }^{{\prime} })\) from Pauli matrix is
This modified Hamiltonian ensures that the results satisfy the constraints and contribute accordingly to the total energy. The energy contribution from the penalty term is as follows:
where \({\widehat{h}}^{{\boldsymbol{\lambda }}}\) denotes the penalty operator.
The correction terms of atomic force and lattice stress can be calculated by the derivative of the energy term in Eq. (36). The expression of penalty force in the direction along p = x, y, z is
where \({\tau }_{I}^{p}\) is the p-component of the coordinate τ of the I-th atom. The penalty stress term is
where α, β are indices of the 3 × 3 stress tensor.
For plane wave basis, directly constructing the full Hamiltonian matrix is typically infeasible due to the large number of basis functions involved. Instead, iterative diagonalization methods are employed to solve the Kohn-Sham equations, which require only the computation of the action of the Hamiltonian operator on the wave function. As a result, within the plane-wave basis set, the formula for the Hamiltonian correction term at a specified k-point is as follows
where
is the overlap between the projection orbital and the wave function.
The correction terms for the atomic force and lattice stress can also be derived by differentiating the energy term in Eq. (36). The expression for atomic forces is
and for stress is
where Ω is the cell volume, εαβ is the lattice strain tensor, and the details of the last derivative term are similar to those of the nonlocal pseudopotential term in ref. 72.
We present a unified flowchart of the specific implementation of spin-cDFT in Fig. 11 for both the PW and NAO basis sets. As depicted in Fig. 11, this procedure requires an additional iteration within the self-consistent field (SCF) cycle of DFT to find the fixed-point solution satisfying the spin constraints. The process involves solving the Kohn-Sham equations with a given set of Lagrange multipliers, which are then updated to minimize the deviation between the calculated and target magnetic moments. This minimization is typically achieved through a conjugate-gradient scheme or a similar optimization technique, ensuring that the magnetic moments converge to the targets within a predefined tolerance. The iterative loop continues, alternating between the solution of the Kohn-Sham equations and the optimization of the Lagrange multipliers, until the system reaches a self-consistent solution where the constraints are satisfied to a desired level of accuracy. Through comparative testing, we found that the energies obtained with the adaptive algorithm implemented in this work agree with those of the penalty function methods employed in other software packages (see Fig. S14). Compared to the penalty function method, which exhibits nonlinear errors dependent on λ across different excited states (Fig. S14), the new method demonstrates consistent accuracy and is suitable for large-scale sampling.
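The inner loop can be illustrated on a toy problem. The sketch below is not the ABACUS implementation: it assumes a fixed (non-self-consistent) two-site Hamiltonian with one electron, a penalty term of the form λ ⋅ σ projected on the constrained site, and a plain fixed-step update of λ in place of a conjugate-gradient scheme. All model parameters and function names are hypothetical.

```python
import numpy as np

PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

# Moment operators of site 0: (site projector) x (Pauli matrix),
# in the basis ordered as (site, spin).
SITE0 = np.diag([1.0, 0.0])
M_OPS = np.array([np.kron(SITE0, sigma) for sigma in PAULI])

def toy_hamiltonian(t=1.0, b=0.5):
    """Spin-independent hopping t plus a uniform Zeeman field b along z."""
    hop = -t * np.kron(np.array([[0.0, 1.0], [1.0, 0.0]]), np.eye(2))
    zeeman = -b * np.kron(np.eye(2), PAULI[2])
    return hop + zeeman

def site_moment(h0, lam):
    """Moment on site 0 in the ground state of h0 + lam . (P0 sigma)."""
    h = h0 + np.einsum("p,pij->ij", lam, M_OPS)
    _, vecs = np.linalg.eigh(h)
    psi = vecs[:, 0]  # one electron in the lowest eigenstate
    return np.einsum("i,pij,j->p", psi.conj(), M_OPS, psi).real

def solve_multiplier(h0, m_target, step=0.5, tol=1e-8, max_iter=20000):
    """Update lam until the site moment matches the target.

    The ground-state energy is concave in lam and its gradient is the site
    moment, so lam is moved along the residual M(lam) - M_target.
    """
    lam = np.zeros(3)
    for iteration in range(max_iter):
        residual = site_moment(h0, lam) - m_target
        if np.max(np.abs(residual)) < tol:
            return lam, iteration
        lam = lam + step * residual
    raise RuntimeError("inner loop did not converge")

h0 = toy_hamiltonian()
theta = np.radians(30.0)
m_target = 0.4 * np.array([np.sin(theta), 0.0, np.cos(theta)])
lam, n_iter = solve_multiplier(h0, m_target)
```

In a full calculation this inner loop is nested inside the SCF cycle, so the Hamiltonian `h0` changes between outer iterations. The converged `lam` plays the role of the magnetic torque that holds the moment at its target.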
We observed that in calculations of the Fe system, the optimization of λ typically requires on the order of 5 inner-loop iterations. Since the inner loop also requires solving eigenvalue problems, this leads to an approximately fivefold increase in total computational time when using numerical atomic orbital basis sets (Fig. S15). For the plane-wave basis set, due to its high dimensionality, the inner loop employs a single-step subspace solution of the Davidson diagonalization algorithm when the magnetic moment error is large, rather than the full Davidson solver used in the outer loop. This significantly improves the efficiency of the magnetic moment optimization in the inner loop while guaranteeing the correctness of the final magnetic moment convergence. In tests, each step of the inner loop with PW basis sets consumes significantly less time than the outer loop, resulting in only a modest increase (~1/3) in total computational time (see Fig. S7). Tests on systems of varying sizes (see Fig. S15) demonstrate that for both LCAO and PW basis implementations, the proportion of additional computational cost introduced by cDFT remains nearly constant as the system size increases. This indicates that the current cDFT method does not alter the scaling of the original SCF calculation; the extra cost only increases the prefactor of the total computational time, ensuring its scalability. Furthermore, the present method, due to its strict enforcement of magnetic moment constraints at each electronic step, reduces the degrees of freedom of the electronic state during the SCF process, leading to superior convergence speed compared to the penalty method (see Table S6).
First-principles calculations
All calculations in this work were performed with ABACUS, utilizing plane waves or numerical atomic orbitals31 to expand wavefunctions. The pseudopotentials were sourced from the Pseudo-Dojo pseudopotential library (http://www.pseudo-dojo.org/). The numerical atomic orbitals of ABACUS are generated by optimizing the spillage function averaged over structures; this strategy was originally proposed in the works of Chen et al.30 (denoted as v1.0) and improved by Lin et al.70 by including the kinetic terms (denoted as v2.0). In this work, a numerically further improved version is also used (denoted as v2.1).
These v2.1 orbitals have not yet been officially released on the official website but the generation code is online available (https://github.com/kirk0830/ABACUS-ORBGEN). This paper performs NAOs calculations using v2.0 and v2.1 orbitals, including Single-ζ (SZ), Double-ζ plus polarization functions (DZP), and Triple-ζ plus double polarization functions (TZDP) with different cutoffs. The convergence test for the plane-wave cutoff energy is presented in the Supplementary Information. By adopting a criterion of energy error per atom below 1 meV and stress error below 0.1 kbar, the cutoff energy was set to 100 Ry for all calculations (see Fig. S4). The Brillouin zone was uniformly sampled at 0.14 Bohr−1 intervals. Specifically for the BCC-Fe supercell with 16 iron atoms, this k-spacing corresponds to a 5 × 5 × 5 Monkhorst-Pack grid. The projected orbitals used to calculate magnetic moments were modified numerical atomic orbitals, with a modulation radius of 3 Bohr. Each inner loop of λ optimization converges when the maximum atomic magnetism variation is less than δM < 10−7μB, while the SCF convergence until density difference δρ < 10−6. All noncollinear calculations incorporated the spin-orbit coupling (SOC) effect. We would like to clarify that while SOC effects in pure Fe are relatively weak–smaller than the intrinsic errors of our model–we deliberately retained SOC calculations to rigorously test the stability of our implementation under combined noncollinear magnetism, SOC, and cDFT conditions, as well as to validate the robustness of the entire model workflow. 
The cDFT method implemented in ABACUS, which can precisely obtain first-principles results for arbitrary magnetic moment configurations, also serves as an effective tool for studying systems in which SOC plays a decisive role in determining the magnetic ground state or emergent phenomena73, such as heavy-element magnets, topological spin textures, or systems driven by the Dzyaloshinskii-Moriya interaction (DMI). In Table S4 of the Supplementary Information, we include a short example that uses the four-states method74 to calculate the DMI in the Fe-Pt system, demonstrating this capability.
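As a worked check of the k-point sampling quoted above, the following sketch converts a k-spacing into a Monkhorst-Pack grid. It is an illustration under stated assumptions, not the ABACUS source: the rounding rule (integer part plus one), the lattice constant of 2.87 Å, and the helper names `reciprocal_lengths` and `kgrid_from_spacing` are all assumed here.

```python
import math

BOHR_PER_ANGSTROM = 1.0 / 0.529177210903  # Bohr radius expressed in Angstrom


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def reciprocal_lengths(cell):
    """Lengths |b_i| of the reciprocal lattice vectors (2*pi convention)."""
    a1, a2, a3 = cell
    volume = abs(dot(a1, cross(a2, a3)))
    unnormalized = (cross(a2, a3), cross(a3, a1), cross(a1, a2))
    return [2.0 * math.pi * math.sqrt(dot(b, b)) / volume for b in unnormalized]


def kgrid_from_spacing(cell, kspacing):
    """Assumed rule: n_i = max(1, int(|b_i| / kspacing) + 1)."""
    return tuple(max(1, int(length / kspacing) + 1)
                 for length in reciprocal_lengths(cell))


# 16-atom BCC-Fe supercell = 2 x 2 x 2 conventional cells (2 atoms each).
a_bcc = 2.87 * BOHR_PER_ANGSTROM  # assumed lattice constant, in Bohr
edge = 2.0 * a_bcc
supercell = ((edge, 0.0, 0.0), (0.0, edge, 0.0), (0.0, 0.0, edge))

grid = kgrid_from_spacing(supercell, kspacing=0.14)  # spacing in 1/Bohr
print(grid)
```

With these assumptions each reciprocal vector has length of about 0.58 Bohr⁻¹, which is roughly 4.1 times the spacing, so the rule yields five points per direction, in line with the 5 × 5 × 5 grid stated in the text.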
Molecular dynamics simulation
The molecular dynamics simulations use a modified version of LAMMPS with the TSPIN method, which integrates the lattice and spin degrees of freedom simultaneously within a unified Nosé-Hoover chain (NHC) thermostat framework37. The dynamics are propagated using symplectic integration schemes derived from an extended Hamiltonian formulation, which ensures accurate sampling of the canonical (NVT) and isothermal-isobaric (NPT) ensembles.
The Lagrangian that governs the system’s evolution can be expressed as:
where \({\bf{p}}_{s_i}\) and \({\bf{p}}_{i}\) are the generalized momenta associated with the spins and the lattice positions, respectively, and \(\mu_{i}\) denotes the virtual magnetic mass. The energy function U({R}, {M}) accounts for the interactions between the lattice, spins, and spin-lattice couplings. The Euler-Lagrange equations derived from this Lagrangian provide the following equations of motion:
The above equations describe the evolution of the lattice (\({\bf{R}}_{i}\)) and spin (\({\bf{M}}_{i}\)) degrees of freedom: the lattice evolves according to the atomic forces (\({\bf{F}}_{i}\)), while the spins evolve under the magnetic torque (\({\boldsymbol{\lambda}}_{i}\)). For the NVT and NPT ensembles, we use the equations of motion with thermostat variables for temperature control given in ref. 37.
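To make the coupled propagation concrete, the sketch below integrates one lattice coordinate and one moment component with a velocity-Verlet scheme. It is a toy illustration rather than the TSPIN implementation: it omits the NHC thermostat (so it samples constant energy only), it assumes equations of motion of the form dp/dt = F = −∂U/∂R and dp_s/dt = λ = −∂U/∂M with dR/dt = p/m and dM/dt = p_s/μ, and the energy function `toy_energy_force_torque` together with all parameter values is hypothetical.

```python
def toy_energy_force_torque(x, m, k=1.0, a=2.0, m0=2.2, c=0.3):
    """Hypothetical energy U(x, m) with a bilinear spin-lattice coupling.

    Returns U, the force -dU/dx, and the torque -dU/dm (sign assumed).
    """
    dm = m - m0
    energy = 0.5 * k * x * x + 0.5 * a * dm * dm + c * x * dm
    force = -(k * x + c * dm)
    torque = -(a * dm + c * x)
    return energy, force, torque


def integrate(x, m, p, ps, mass, mu, dt, n_steps):
    """Velocity Verlet for lattice (x, p) and spin (m, ps) variables.

    mass is the atomic mass, mu the virtual magnetic mass.
    """
    _, force, torque = toy_energy_force_torque(x, m)
    total_energies = []
    for _ in range(n_steps):
        # Half kick for both momenta.
        p += 0.5 * dt * force
        ps += 0.5 * dt * torque
        # Full drift for position and magnetic moment.
        x += dt * p / mass
        m += dt * ps / mu
        # Recompute force and torque, then apply the second half kick.
        energy, force, torque = toy_energy_force_torque(x, m)
        p += 0.5 * dt * force
        ps += 0.5 * dt * torque
        total_energies.append(p * p / (2 * mass) + ps * ps / (2 * mu) + energy)
    return x, m, total_energies


x_final, m_final, total_energies = integrate(
    x=0.5, m=2.0, p=0.0, ps=0.0, mass=1.0, mu=0.5, dt=0.01, n_steps=2000
)
energy_drift = max(total_energies) - min(total_energies)
print(f"energy spread over the run: {energy_drift:.2e}")
```

The spread of the total energy over the run is a simple diagnostic of a symplectic scheme: it should stay bounded and shrink as the time step is reduced. The virtual magnetic mass plays the same role for the moments as the atomic mass does for the positions.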
Finite difference tests
For the force test, we randomly selected one iron atom from a BCC-Fe16 configuration and perturbed its z-coordinate with a step size of Δ = 0.01 Bohr, calculating the energies of 11 perturbed configurations E(z0 + iΔ). Using the finite difference formula:
we computed numerical solutions for 9 points (excluding the edge points).
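The counting above (11 energies, 9 derivatives) follows from the central difference stencil, which needs one neighbor on each side. The sketch below illustrates this with a hypothetical analytic energy curve standing in for the DFT total energies; the index range i = −5, …, 5 and the functions `toy_energy` and `toy_force` are assumptions made only for the example.

```python
def central_difference(values, step):
    """dE/dx at the interior points of a uniformly spaced series."""
    return [(values[i + 1] - values[i - 1]) / (2.0 * step)
            for i in range(1, len(values) - 1)]


def toy_energy(z):
    """Hypothetical smooth energy curve (stand-in for DFT energies)."""
    return 0.4 * z * z + 0.05 * z ** 3


def toy_force(z):
    """Analytic force -dE/dz of the toy curve."""
    return -(0.8 * z + 0.15 * z * z)


step = 0.01  # displacement step, as in the force test (Bohr)
offsets = [i * step for i in range(-5, 6)]       # 11 perturbed configurations
energies = [toy_energy(z) for z in offsets]

# The force is minus the energy derivative; the two edge points are dropped.
numerical_forces = [-d for d in central_difference(energies, step)]
analytic_forces = [toy_force(z) for z in offsets[1:-1]]

max_error = max(abs(n - a) for n, a in zip(numerical_forces, analytic_forces))
print(len(numerical_forces), f"{max_error:.1e}")
```

In a real test, `numerical_forces` would be compared against the forces reported by the first-principles code at the same nine configurations. The same stencil applies to the magnetic torque test when the displaced coordinate is replaced by a moment component.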
For the magnetic torque test, we randomly selected a FePt configuration and perturbed the z-component of the magnetic moment of one Fe atom with a step size of Δ = 0.1 μB. We calculated the total energies of 11 perturbed configurations and determined the numerical derivatives \(\frac{\partial E}{\partial {{\bf{m}}}_{z}}\) using the central difference method.
For the stress test, we randomly selected a NiMnTi configuration. The stress can be obtained through strain perturbation:
where Ω represents the volume, and the stress tensor σ has six independent components: σ11, σ12, σ13, σ22, σ23, σ33. For each component, we constructed corresponding strained configurations. Taking σ11 as an example, we generated 11 configurations with ε11 = iΔ (step size Δ = 0.0001). According to the following formula, the strain induces the corresponding unit cell modifications:
The raw data are given in Tables S1, S2, and S3 of the Supplementary Information.
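The stress test can be sketched in the same spirit. The example below assumes the common conventions σ = (1/Ω) ∂E/∂ε and h′ = h(I + ε) for a cell matrix h whose rows are lattice vectors; sign and ordering conventions differ between codes, so both are assumptions here. The energy function `toy_energy`, which depends only on the cell volume, and the cubic test cell are hypothetical.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def strained_cell(cell, a, b, value):
    """Apply a symmetric strain component eps_ab = value: h' = h (I + eps)."""
    deformation = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    deformation[a][b] += 0.5 * value
    deformation[b][a] += 0.5 * value  # for a == b the two halves add up
    return matmul3(cell, deformation)


def toy_energy(cell, bulk=1.0, v0=1000.0):
    """Hypothetical energy depending only on the cell volume."""
    volume = abs(det3(cell))
    return 0.5 * bulk * (volume - v0) ** 2 / v0


cell = [[10.2, 0.0, 0.0], [0.0, 10.2, 0.0], [0.0, 0.0, 10.2]]
omega = abs(det3(cell))

step = 1e-4
strains = [i * step for i in range(-5, 6)]        # 11 strained configurations
energies = [toy_energy(strained_cell(cell, 0, 0, eps)) for eps in strains]

# Central difference at zero strain, divided by the unstrained volume.
mid = len(strains) // 2
sigma_11 = (energies[mid + 1] - energies[mid - 1]) / (2.0 * step * omega)
expected = (omega - 1000.0) / 1000.0              # analytic value for the toy model
print(f"{sigma_11:.6f} {expected:.6f}")
```

For off-diagonal components the same function is called with a ≠ b, which splits the strain symmetrically between the two off-diagonal entries so that no rigid rotation is introduced.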
Data availability
The data and models used in this study have been made open-source and are hosted on AIS Square (https://www.aissquare.com/), where they can be accessed online (https://www.aissquare.com/datasets/detail?pageType=datasets&name=Fe-DeepSpin&id=386).
Code availability
The codes used to produce the results are available from the corresponding author upon reasonable request. The relevant implementation is scheduled to be open-sourced in the next major release of ABACUS and may be accessed at: https://github.com/deepmodeling/abacus-develop.
References
Hohenberg, P. & Kohn, W. Inhomogeneous electron gas. Phys. Rev. 136, 864 (1964).
Kohn, W. & Sham, L. J. Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, 1133 (1965).
Szilva, A. et al. Quantitative theory of magnetic interactions in solids. Rev. Mod. Phys. 95, 035004 (2023).
Dederichs, P., Blügel, S., Zeller, R. & Akai, H. Ground states of constrained systems: application to cerium impurities. Phys. Rev. Lett. 53, 2512 (1984).
Kaduk, B., Kowalczyk, T. & Van Voorhis, T. Constrained density functional theory. Chem. Rev. 112, 321–370 (2012).
Wu, Q. & Van Voorhis, T. Direct optimization method to study constrained systems within density-functional theory. Phys. Rev. A 72, 024502 (2005).
Stocks, G. et al. Towards a constrained local moment model for first principles spin dynamics. Philos. Mag. 78, 665–673 (1998).
Kurz, P., Förster, F., Nordström, L., Bihlmayer, G. & Blügel, S. Ab initio treatment of noncollinear magnets with the full-potential linearized augmented plane wave method. Phys. Rev. B 69, 024415 (2004).
Cuadrado, R., Pruneda, M., Garcia, A. & Ordejon, P. Implementation of non-collinear spin-constrained DFT calculations in siesta with a fully relativistic Hamiltonian. J. Phys. Mater. 1, 015010 (2018).
Gebauer, R. & Baroni, S. Magnons in real materials from density-functional theory. Phys. Rev. B 61, 6459 (2000).
Ma, P.-W. & Dudarev, S. Constrained density functional for noncollinear magnetism. Phys. Rev. B 91, 054420 (2015).
Hegde, O. et al. Atomic relaxation around defects in magnetically disordered materials computed by atomic spin constraints within an efficient Lagrange formalism. Phys. Rev. B 102, 144101 (2020).
Cai, Z., Wang, K., Xu, Y., Wei, S.-H. & Xu, B. A self-adaptive first-principles approach for magnetic excited states. Quantum Front. 2, 21 (2023).
Zhang, D. et al. Pretraining of attention-based deep learning potential model for molecular simulation. npj Comput. Mater. 10, 94 (2024).
Zhang, D. et al. DPA-2: a large atomic model as a multi-task learner. npj Comput. Mater. 10, 293 (2024). https://doi.org/10.1038/s41524-024-01493-2
Qi, J., Ko, T. W., Wood, B. C., Pham, T. A. & Ong, S. P. Robust training of machine learning interatomic potentials with dimensionality reduction and stratified sampling. npj Comput. Mater. 10, 43 (2024).
Chen, C. & Ong, S. P. A universal graph deep learning interatomic potential for the periodic table. Nat. Comput. Sci. 2, 718–728 (2022).
Deng, B. et al. CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling. Nat. Mach. Intell. 5, 1031–1041 (2023).
Batatia, I. et al. A foundation model for atomistic materials chemistry. J. Chem. Phys. 163, 184110 (2025).
Choudhary, K. et al. Unified graph neural network force-field for the periodic table: solid state applications. Digital Discov. 2, 346–355 (2023).
Choudhary, K. & DeCost, B. Atomistic line graph neural network for improved materials property predictions. npj Comput. Mater. 7, 185 (2021).
Jain, A. et al. The materials project: a materials genome approach to accelerating materials innovation. APL Mater. 1(1) (2013).
Talirz, L. et al. Materials Cloud, a platform for open computational science. Sci. Data 7, 299 (2020).
Barroso-Luque, L. et al. Open materials 2024 (omat24) inorganic materials dataset and models. arXiv preprint arXiv:2410.12771 (2024).
Liu, J. et al. Machine-learning-based interatomic potentials for group IIB to VIA semiconductors: toward a universal model. J. Chem. Theory Comput. 20, 5717–5731 (2024).
Wu, J. et al. Universal interatomic potential for perovskite oxides. Phys. Rev. B 108, 180104 (2023).
Wu, J., Yang, J., Ma, L., Zhang, L. & Liu, S. Modular development of deep potential for complex solid solutions. Phys. Rev. B 107, 144102 (2023).
Rinaldi, M., Mrovec, M., Bochkarev, A., Lysogorskiy, Y. & Drautz, R. Non-collinear magnetic atomic cluster expansion for iron. npj Comput. Mater. 10, 12 (2024).
Yang, T. et al. Deep learning illuminates spin and lattice interaction in magnetic materials. Phys. Rev. B 110, 064427 (2024).
Chen, M., Guo, G.-C. & He, L. Systematically improvable optimized atomic basis sets for ab initio calculations. J. Phys. Condens. Matter 22, 445501 (2010).
Li, P. et al. Large-scale ab initio simulations based on systematically improvable atomic basis. Comput. Mater. Sci. 112, 503–517 (2016).
Zhou, W. et al. ABACUS: An electronic structure analysis package for the AI era. J. Chem. Phys. 163, 192501 (2025).
Streib, S. et al. Equation of motion and the constraining field in ab initio spin dynamics. Phys. Rev. B 102, 214407 (2020).
Gyorffy, B., Pindor, A., Staunton, J., Stocks, G. & Winter, H. A first-principles theory of ferromagnetic phase transitions in metals. J. Phys. F: Met. Phys. 15, 1337 (1985).
Ruban, A. V., Khmelevskyi, S., Mohn, P. & Johansson, B. Temperature-induced longitudinal spin fluctuations in Fe and Ni. Phys. Rev. B 75, 054402 (2007).
Wu, X., Liu, Z. & Luo, T. Magnon and phonon dispersion, lifetime, and thermal conductivity of iron from spin-lattice dynamics simulations. J. Appl. Phys. 123(8) (2018).
Huang, Z. & Xu, B. Symplectic spin-lattice dynamics with machine-learning potentials. arXiv preprint arXiv:2506.12877 (2025).
Acet, M., Zähres, H., Wassermann, E. & Pepperhoff, W. High-temperature moment-volume instability and anti-Invar of γ-Fe. Phys. Rev. B 49, 6012 (1994).
Chen, Q. & Sundman, B. Modeling of thermodynamic properties for bcc, fcc, liquid, and amorphous iron. J. Phase Equilib. 22, 631–644 (2001).
Okamoto, H., Schlesinger, M. E. & Mueller, E. M. Alloy phase diagrams. (ASM International, 2016).
Liechtenstein, A., Katsnelson, M. & Gubanov, V. Exchange interactions and spin-wave stiffness in ferromagnetic metals. J. Phys. F: Met. Phys. 14, 125 (1984).
Liechtenstein, A. I., Katsnelson, M., Antropov, V. & Gubanov, V. Local spin density functional approach to the theory of exchange interactions in ferromagnetic metals and alloys. J. Magn. Magn. Mater. 67, 65–74 (1987).
He, X., Helbig, N., Verstraete, M. J. & Bousquet, E. TB2J: A Python package for computing magnetic interaction parameters. Comput. Phys. Commun. 264, 107938 (2021).
Wang, H., Ma, P.-W. & Woo, C. H. Exchange interaction function for spin-lattice coupling in bcc iron. Phys. Rev. B 82, 144304 (2010).
Pajda, M., Kudrnovský, J., Turek, I., Drchal, V. & Bruno, P. Ab initio calculations of exchange interactions, spin-wave stiffness constants, and Curie temperatures of Fe, Co, and Ni. Phys. Rev. B 64, 174402 (2001).
Zhang, L. et al. End-to-end symmetry preserving inter-atomic potential energy model for finite and extended systems. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. NIPS’18, pp. 4441–4451. (Curran Associates Inc., 2018).
Zhang, Y. et al. DP-GEN: A concurrent learning platform for the generation of reliable deep learning based potential energy models. Comput. Phys. Commun. 253, 107206 (2020).
Thompson, A. P. et al. LAMMPS - a flexible simulation tool for particle-based materials modeling at the atomic, meso, and continuum scales. Comput. Phys. Commun. 271, 108171 (2022).
Gilbert, T. L. A phenomenological theory of damping in ferromagnetic materials. IEEE Trans. Magn. 40, 3443–3449 (2004).
Miehlich, B., Savin, A., Stoll, H. & Preuss, H. Results obtained with the correlation energy density functionals of Becke and Lee, Yang and Parr. Chem. Phys. Lett. 157, 200–206 (1989).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865 (1996).
Kubler, J., Hock, K.-H., Sticht, J. & Williams, A. Density functional theory of non-collinear magnetism. J. Phys. F: Met. Phys. 18, 469 (1988).
Sjöstedt, E. & Nordström, L. Noncollinear full-potential studies of γ-Fe. Phys. Rev. B 66, 014447 (2002).
Scalmani, G. & Frisch, M. J. A new approach to noncollinear spin density functional theory beyond the local density approximation. J. Chem. Theory Comput. 8, 2193–2196 (2012).
Peralta, J. E., Scuseria, G. E. & Frisch, M. J. Noncollinear magnetism in density functional calculations. Phys. Rev. B 75, 125119 (2007).
Hamann, D., Schlüter, M. & Chiang, C. Norm-conserving pseudopotentials. Phys. Rev. Lett. 43, 1494 (1979).
Troullier, N. & Martins, J. L. Efficient pseudopotentials for plane-wave calculations. Phys. Rev. B 43, 1993 (1991).
Hemstreet, L., Fong, C. & Nelson, J. First-principles calculations of spin-orbit splittings in solids using nonlocal separable pseudopotentials. Phys. Rev. B 47, 4238 (1993).
Theurich, G. & Hill, N. A. Self-consistent treatment of spin-orbit coupling in solids using relativistic fully separable ab initio pseudopotentials. Phys. Rev. B 64, 073106 (2001).
Kleinman, L. Relativistic norm-conserving pseudopotential. Phys. Rev. B 21, 2630 (1980).
Bachelet, G. B. & Schlüter, M. Relativistic norm-conserving pseudopotentials. Phys. Rev. B 25, 2103 (1982).
Corso, A. D. & Conte, A. M. Spin-orbit coupling with ultrasoft pseudopotentials: application to Au and Pt. Phys. Rev. B 71, 115106 (2005).
Cuadrado, R. & Cerdá, J. Fully relativistic pseudopotential formalism under an atomic orbital basis: spin–orbit splittings and magnetic anisotropies. J. Phys. Condens. Matter 24, 086005 (2012).
Bader, R. F. W. Atoms in Molecules: A Quantum Theory. (Clarendon, 1990).
Fonseca Guerra, C., Handgraaf, J.-W., Baerends, E. J. & Bickelhaupt, F. M. Voronoi deformation density (VDD) charges: assessment of the Mulliken, Bader, Hirshfeld, Weinhold, and VDD methods for charge analysis. J. Comput. Chem. 25, 189–210 (2004).
Hirshfeld, F. L. Bonded-atom fragments for describing molecular charge densities. Theoretica Chim. acta 44, 129–138 (1977).
Mulliken, R. S. Electronic population analysis on LCAO–MO molecular wave functions. i. J. Chem. Phys. 23, 1833–1840 (1955).
Löwdin, P.-O. On the nonorthogonality problem. In: Advances in Quantum Chemistry vol. 5, pp. 185–199. (Elsevier, 1970).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953–17979 (1994).
Lin, P., Ren, X. & He, L. Strategy for constructing compact numerical atomic orbital basis sets by incorporating the gradients of reference wavefunctions. Phys. Rev. B 103, 235131 (2021).
King-Smith, R., Payne, M. & Lin, J. Real-space implementation of nonlocal pseudopotentials for first-principles total-energy calculations. Phys. Rev. B 44, 13063 (1991).
Nielsen, O. H. & Martin, R. M. Stresses in semiconductors: ab initio calculations on Si, Ge, and GaAs. Phys. Rev. B 32, 3792–3805 (1985).
Yang, H., Liang, J. & Cui, Q. First-principles calculations for Dzyaloshinskii–Moriya interaction. Nat. Rev. Phys. 5, 43–61 (2023).
Xiang, H., Kan, E., Wei, S.-H., Whangbo, M.-H. & Gong, X. Predicting the spin-lattice order of frustrated systems from first principles. Phys. Rev. B 84, 224429 (2011).
Acknowledgements
We thank Zuxin Jin and Han Wang for many helpful discussions. We gratefully acknowledge the AI for Science Institute, Beijing (AISI). W.Z. gratefully acknowledges support from the Hongyi postdoctoral fellowship of Wuhan University and funding support from the National Natural Science Foundation of China (Grant No. 12547165). The work of M.C. is supported by the NSFC Excellence Research Group Program for multiscale problems in nonlinear mechanics (Grant No. 12588201) and the National Key R&D Program of China (Grant No. 2025YFB3003603). M.C. gratefully acknowledges funding support from the National Natural Science Foundation of China (Grant Nos. 12122401, 12074007, and 12135002).
Author information
Contributions
D.Z. and W.Z. designed and implemented the spin-constrained method in ABACUS. Z.C. participated in the discussion, design, and benchmarking of the implementation. Y.H. and L.Z. provided guidance on the projection method. X.P. and Y.W. constructed the first-principles database and trained the iron model. D.Z. and Z.H. implemented the code for magnetic model training and molecular dynamics simulations. W.Z. and D.Z. wrote the main manuscript. W.Z., B.X. and M.C. conceived and directed the research and supervised the analysis and interpretation. All authors contributed to the discussions and the final editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zheng, D., Peng, X., Huang, Y. et al. Integrating deep-learning-based magnetic model and non-collinear spin-constrained method: methodology, implementation and application. npj Comput Mater 12, 52 (2026). https://doi.org/10.1038/s41524-025-01923-9
DOI: https://doi.org/10.1038/s41524-025-01923-9