Interpolation and differentiation of alchemical degrees of freedom in machine learning interatomic potentials

Nam, Juno; Peng, Jiayu; Gómez-Bombarelli, Rafael

doi:10.1038/s41467-025-59543-2

Published: 10 May 2025

Nature Communications volume 16, Article number: 4350 (2025)

Abstract

Machine learning interatomic potentials (MLIPs) have become a workhorse of modern atomistic simulations, and recently published universal MLIPs, pre-trained on large datasets, have demonstrated remarkable accuracy and generalizability. However, the computational cost of MLIPs limits their applicability to chemically disordered systems requiring large simulation cells or to sample-intensive statistical methods. Here, we report the use of continuous and differentiable alchemical degrees of freedom in atomistic materials simulations, exploiting the fact that graph neural network MLIPs represent discrete elements as real-valued tensors. The proposed method introduces alchemical atoms with corresponding weights into the input graph, alongside modifications to the message-passing and readout mechanisms of MLIPs, and allows smooth interpolation between the compositional states of materials. The end-to-end differentiability of MLIPs enables efficient calculation of the gradient of energy with respect to the compositional weights. With this modification, we propose methodologies for optimizing the composition of solid solutions towards target macroscopic properties, characterizing order and disorder in multicomponent oxides, and conducting alchemical free energy simulations to quantify the free energy of vacancy formation and composition changes.

Introduction

Atomistic simulations are a cornerstone of computational modeling of the dynamic behavior of materials. Achieving predictive and efficient simulations necessitates a balance between the quality and cost of the description of interatomic interactions and exhaustive sampling to achieve converged thermodynamic averages. Density functional theory (DFT) calculations are typically taken as a gold standard for accuracy in materials simulations. Ab initio molecular dynamics (AIMD) simulations¹ propagate dynamics using these high-quality DFT forces, but their high computational cost limits scalability. Machine learning interatomic potentials (MLIPs)^2,3, trained on electronic structure calculation results, offer a low-cost alternative to DFT energies and forces in MD. Beginning from the seminal works of the Behler–Parrinello network⁴ and GAP⁵, various architectures of MLIP have been proposed to offer a selection within a trade-off between accuracy and speed, such as SchNet⁶, PaiNN⁷, NequIP⁸, Allegro⁹, MACE^10,11, and CACE¹². Recently, universal MLIPs, such as M3GNet¹³, CHGNet¹⁴, and MACE-MP-0¹⁵, have emerged, providing atomistic modeling capabilities across a substantial portion of elements in the periodic table and their combinations. All these models are trained on DFT energies and gradients extracted from a large-scale materials database such as the Materials Project¹⁶. The benchmark results^17,18 demonstrate that they offer high-fidelity modeling of atomic interactions and phonon dispersion, thereby serving as reliable foundation models in the context of downstream atomistic simulation applications.

While interatomic potentials are primarily intended to operate on atomic positions with fixed elemental identities, it is intriguing to consider their alchemical degrees of freedom, wherein the elemental identities can be altered continuously. In the realm of electronic structure methods, von Lilienfeld and colleagues have pioneered the molecular grand-canonical ensemble DFT and have advanced subsequent lines of research on alchemical transformations, which enable the alteration and optimization of chemical compositions^{19,20,21,22,23}. From the standpoint of MLIPs, Ceriotti and colleagues introduced an alchemical compression scheme based on an atom-centered density framework and applied the approach to model high-entropy alloys^24,25,26. They demonstrated that compressing the representation of physical elements onto low-dimensional subspaces of pseudoelements enables efficient modeling of compositionally complex systems and interpolation to elements not encountered during training. Additionally, Chen et al.²⁷ demonstrated that pre-trained materials property predictors can be applied to disordered crystals by using linear interpolation of low-dimensional elemental embeddings. While continuous representations of elements correspond to atomic embeddings in graph-based MLIPs, most universal MLIPs typically use much higher-dimensional atomic embeddings to ensure that the model is sufficiently expressive. Since models are only trained with discrete atom identities, it is challenging to identify meaningful submanifolds of elemental embeddings to interpolate elements or project gradients, as seen in the context of molecule design with pre-trained MLIPs²⁸. On the other hand, simple linear interpolation of embeddings for modeling compositions may lead to unphysical outputs.

Alchemical changes are of particular importance in free energy simulations^29,30. Free energy simulations are widely used to characterize the finite-temperature stabilities of solid phases^31,32, and automatic protocols have been developed accordingly³³. However, while alchemical free energy calculations are widely used to study protein–small molecule interactions³⁴, their applications in materials systems are limited. This is likely due in large part to the challenge of parametrizing interatomic potentials for systems with three or more elements. Notably, Jinnouchi et al.³⁵ introduced a thermodynamic integration (TI) method to compute the chemical potentials of liquid Si and LiF in H₂O by smoothly turning on or off interactions between atoms in kernel-based MLIPs through alchemical switching.

With the advent of universal MLIPs, the challenge of fitting potentials for systems containing multiple types of elements has been alleviated, and they provide reasonable accuracy for dynamics around equilibrium geometries. Thus, it is timely to consider the application of universal MLIPs to facilitate free energy simulations along alchemical pathways. In this work, building upon the prototypical construction of graph-based MLIPs, we access the hitherto hidden alchemical degrees of freedom inherent in MLIPs. Rather than altering the continuous embeddings of individual atoms, we augment the input graph structure by introducing alchemical atoms, each associated with its respective compositional weight. Through subsequent modifications to the message passing scheme and energy readout, our scheme provides smooth interpolation between different compositional states of materials. Moreover, given the end-to-end differentiability with respect to the alchemical weights λ, it facilitates the calculation of the alchemical gradient of the energy ∂H/∂λ and subsequently the calculation of the free energy of the alchemical transformation. In addition, we explore the application of alchemical intermediate states with mixed compositions in creating a computationally efficient description of solid solutions.

Results

Alchemical graph and message passing

Prototypical MLIP construction

Our objective here is to introduce modifications to the non-learnable parts of the MLIPs so that we can model the alchemical compositions of materials without further fine-tuning the models. First, we start by introducing the prototypical construction of graph-based MLIPs. An atomic system is represented as a graph ${{{\mathcal{G}}}}=({{{\mathcal{V}}}},{{{\mathcal{E}}}})$ with an atom as a node $i\in {{{\mathcal{V}}}}$ and an atom pair within a defined cutoff distance as an edge $(i,j)\in {{{\mathcal{E}}}}$^36,37. Each element Z_i is embedded into a continuous vector z_i, which is then used to initialize node features ${{{{\boldsymbol{h}}}}}_{i}^{(0)}$. Edge features e_ij are derived from the relative displacements r_ij. The input is then passed through the layers of the graph neural network with a message-passing mechanism^38,39,40. In layer t, a message ${{{{\boldsymbol{m}}}}}_{i}^{(t)}$ is constructed by pooling the message contributions over the neighboring nodes ${{{\mathcal{N}}}}(i)$ as

$${{{{\boldsymbol{m}}}}}_{i}^{(t)}={\sum}_{j\in {{{\mathcal{N}}}}(i)}{M}_{t}\left({{{{\boldsymbol{h}}}}}_{i}^{(t)},{{{{\boldsymbol{h}}}}}_{j}^{(t)},{{{{\boldsymbol{e}}}}}_{ij}\right),$$

(1)

where each contribution is computed from the hidden node features and the edge feature by a message function M_t. The messages are then used to update the node features:

$${{{{\boldsymbol{h}}}}}_{i}^{(t+1)}={U}_{t}\left({{{{\boldsymbol{h}}}}}_{i}^{(t)},{{{{\boldsymbol{m}}}}}_{i}^{(t)}\right),$$

(2)

where U_t is an update function. Finally, a readout function R transforms the final node features ${{{{\boldsymbol{h}}}}}_{i}^{(T)}$ into the node energies, which are summed over the entire node list to give an estimate of the potential energy as

$$E={\sum}_{i\in {{{\mathcal{V}}}}}R\left({{{{\boldsymbol{h}}}}}_{i}^{(T)}\right).$$

(3)

This is a minimal prototype of MLIPs; state-of-the-art models use various additional mechanisms to enhance expressivity and improve the fit to the training data. Although the alchemical modifications introduced in this work are based on this prototype, they can easily be integrated with such additional mechanisms, as further detailed in Section “Methods”.
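To make Eqs. (1)–(3) concrete, the following is a minimal sketch of the prototype with scalar node features. The functions `message`, `update`, and `readout` are hand-written stand-ins for M_t, U_t, and R, and the `EMBEDDING` table holds made-up numbers; none of these correspond to the learned functions of an actual MLIP.

```python
import math

# Hypothetical scalar "embeddings" (illustrative values, not from any model).
EMBEDDING = {"Na": 0.3, "Cl": -0.5}

def message(h_i, h_j, r_ij):
    # Stand-in for M_t: a contribution that decays with distance.
    return math.tanh(h_i + h_j) * math.exp(-r_ij)

def update(h_i, m_i):
    # Stand-in for U_t: a residual update of the node feature.
    return h_i + 0.5 * m_i

def readout(h_i):
    # Stand-in for R: maps the final node feature to a node energy.
    return h_i ** 2

def energy(elements, edges, n_layers=2):
    """elements: list of symbols; edges: {(i, j): r_ij}, a message from j to i."""
    h = [EMBEDDING[z] for z in elements]
    for _ in range(n_layers):
        m = [0.0] * len(h)
        for (i, j), r_ij in edges.items():
            m[i] += message(h[i], h[j], r_ij)               # Eq. (1)
        h = [update(h_i, m_i) for h_i, m_i in zip(h, m)]    # Eq. (2)
    return sum(readout(h_i) for h_i in h)                   # Eq. (3)

# A two-atom toy system with one bidirectional edge.
edges = {(0, 1): 2.8, (1, 0): 2.8}
e_pair = energy(["Na", "Cl"], edges)
```

Because the energy is a sum over nodes of a function of pooled messages, relabeling the atoms leaves the result unchanged, which is the permutation invariance expected of an interatomic potential.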

Alchemical modification

We now introduce the modifications to the input graph and the architecture of the MLIP model to allow the modeling of compositionally mixed structures with partial occupancies of atoms. The main idea is to augment the original graph with alchemical parts, creating an extra group of atoms or nodes for each compositional state to be modeled, and to modify the message passing scheme to keep it consistent with the baseline MLIP. First, we define the alchemical weights ${{{\boldsymbol{\lambda }}}}={\{{\lambda }_{\alpha }\}}_{\alpha=1}^{k}$ to assign the weights to each compositional state. For example, if we are modeling the mixed structure of LiCl, NaCl, and KCl with 20%, 30%, and 50% weights, respectively, the weights would be λ = [0.2, 0.3, 0.5].

Now, we define an alchemical graph $\tilde{{{{\mathcal{G}}}}}=(\tilde{{{{\mathcal{V}}}}},\tilde{{{{\mathcal{E}}}}})$ as an extension of an original graph ${{{\mathcal{G}}}}=({{{\mathcal{V}}}},{{{\mathcal{E}}}})$. For the previous example, we assume that we have an original graph representing the NaCl crystal structure. The construction is independent of the original elemental identities of the alchemical atoms, and only the atomic positions are inherited. Each node in an alchemical graph is identified by a pair of indices, the original atom index i and the alchemical index α, and denoted by $(i,\alpha )\in \tilde{{{{\mathcal{V}}}}}$. All non-alchemical atoms (e.g., Cl), for which the element remains the same for all compositional states, are assigned α = 0 and the corresponding weight λ₀ = 1. Alchemical atoms are split into multiple nodes according to their compositional states (Fig. 1a). For example, the Na atom i in the original graph is split into three nodes (i, 1), (i, 2), and (i, 3), with elements (Z_(i, 1), Z_(i, 2), Z_(i, 3)) = (Li, Na, K). As such, the node features for alchemical atoms are initialized with the respective elemental embeddings. Then, we assign an alchemical weight λ_α to node (i, α). All other features, such as the positions of the atoms, are inherited from the original graph, e.g., r_(i, α) = r_i.

**Fig. 1: Alchemical modification scheme for machine learning interatomic potentials.**

Edges are connected between the alchemical graph nodes as in the original graph when either of the two endpoint nodes is non-alchemical (with weight index 0), or both nodes are in the same alchemical state (have the same weight index), i.e.,

$$\tilde{\mathcal{E}}=\left\{((i,\alpha ),(j,\beta ))\,\middle|\,(i,\alpha ),(j,\beta )\in \tilde{\mathcal{V}}\wedge (i,j)\in \mathcal{E}\wedge (\alpha=0\vee \beta=0\vee \alpha=\beta )\right\}.$$

(4)
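The node splitting and the edge rule of Eq. (4) can be sketched as follows. The function name `build_alchemical_graph` and its argument layout are hypothetical choices made for illustration; the three-atom input graph is a toy and not a real crystal.

```python
def build_alchemical_graph(elements, edges, alchemical_sites, states):
    """
    elements: element symbol of each original atom.
    edges: set of directed pairs (i, j) in the original graph.
    alchemical_sites: indices of atoms to be split into alchemical nodes.
    states: element symbol for each alchemical index alpha = 1..k.
    """
    nodes = {}
    for i, z in enumerate(elements):
        if i in alchemical_sites:
            # One node per compositional state; positions would be inherited.
            for alpha, z_alpha in enumerate(states, start=1):
                nodes[(i, alpha)] = z_alpha
        else:
            nodes[(i, 0)] = z  # non-alchemical atom, alpha = 0
    alch_edges = set()
    for (i, a) in nodes:
        for (j, b) in nodes:
            # Eq. (4): keep the original connectivity, but only between a
            # non-alchemical endpoint and anything, or within the same state.
            if (i, j) in edges and (a == 0 or b == 0 or a == b):
                alch_edges.add(((i, a), (j, b)))
    return nodes, alch_edges

# Toy graph: two cation sites (0 and 2) and one Cl (1), fully connected.
elements = ["Na", "Cl", "Na"]
edges = {(i, j) for i in range(3) for j in range(3) if i != j}
nodes, alch_edges = build_alchemical_graph(elements, edges, {0, 2}, ["Li", "Na", "K"])
```

In this toy case the three original atoms become seven nodes. The cation–cation pair contributes edges only between nodes carrying the same alchemical index, so a Li node never exchanges messages with a Na or K node.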

This is in line with the dual topology paradigm widely utilized in the alchemical free energy literature^41,42,43, in which the atoms in the different alchemical states geometrically coexist but do not interact directly with each other. To model the scaled interaction between atoms in the alchemical graph, we introduce edge weights to scale the message contributions. Aldeghi and Coley⁴⁴ have proposed a similar idea in which they model the different topological assemblies of polymers by weighted (stochastic) edges between linkage atoms in monomers. Here, we use an asymmetrical weighting scheme given as

$$\begin{array}{r}{\omega }_{\alpha \beta }=\left\{\begin{array}{ll}{\lambda }_{\beta }\quad &\,{\mbox{if}}\,\,\alpha=0\wedge \beta \,\ne \,0\\ 1\quad &\,{\mbox{otherwise}}\,,\end{array}\right.\end{array}$$

(5)

i.e., only the message contributions from alchemical atoms to non-alchemical atoms are weighted by the alchemical weight of the source atom. This choice is based on the observation depicted in Fig. 1b. Since we are extending the original MLIP for alchemical compositions without modifying the learnable functions, we should ensure that the message passing is consistent with original graphs where all edge weights are implicitly 1. According to the expansion of alchemical atoms and the edge connection scheme, only the message passing from an alchemical atom to a non-alchemical atom is split into multiple pathways with respective alchemical node weights. Therefore, we utilize the alchemical node weights as the edge weights in this case, and the message aggregation scheme is modified from Eq. (1) as the weighted sum of the message contributions:

$${{{{\boldsymbol{m}}}}}_{(i,\alpha )}^{(t)}={\sum}_{(j,\beta )\in \tilde{{{{\mathcal{N}}}}}((i,\alpha ))}{\omega }_{\alpha \beta }{M}_{t}\left({{{{\boldsymbol{h}}}}}_{(i,\alpha )}^{(t)},{{{{\boldsymbol{h}}}}}_{(j,\beta )}^{(t)},{{{{\boldsymbol{e}}}}}_{ij}\right).$$

(6)

Finally, the readout for energy prediction (Eq. (3)) is modified as a weighted pooling of alchemical node contributions (Fig. 1c):

$$E={\sum}_{(i,\alpha )\in \tilde{{{{\mathcal{V}}}}}}{\lambda }_{\alpha }R\left({{{{\boldsymbol{h}}}}}_{(i,\alpha )}^{(T)}\right).$$

(7)

Note that the same M_t and R functions as in Eqs. (1) and (3) are used, i.e., no trainable weights are modified. This modification scheme ensures two essential consistencies with the original MLIP scheme. First, when all of the alchemical elements are the same (Z_(i, α) = Z_i) for each original atom and the alchemical weights sum up to 1 (${\sum }_{\alpha=1}^{k}{\lambda }_{\alpha }=1$), the predicted potential energy is the same as that of the original graph. Second, when only one of the alchemical weights is 1 (λ_α = 1), and the others are zero, the predicted potential energy is also the same as in the original graph with an elemental composition corresponding to Z_α. These two consistencies in the limiting cases ensure the correct interpolation between compositional states, and although the argument here is based on the prototypical MLIP, the consistencies still hold when adapted to other architectures, as detailed in Section “Architecture-specific modifications” and Supplementary Information. We additionally explore alternative interpolation methods, including embedding interpolation, and compare their ability to interpolate the MLIP energy output in Supplementary Information.
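The two limiting-case consistencies can be checked numerically on a toy model. The sketch below implements Eqs. (4)–(7) with scalar features; `message`, `update`, `readout`, the `EMBEDDING` values, and the three-atom geometry in `DIST` are all illustrative assumptions rather than parts of any trained MLIP.

```python
import math

EMBEDDING = {"Li": 0.1, "Na": 0.3, "K": 0.6, "Cl": -0.5}  # hypothetical values

def message(h_i, h_j, r_ij):
    return math.tanh(h_i + h_j) * math.exp(-r_ij)   # stand-in for M_t

def update(h_i, m_i):
    return h_i + 0.5 * m_i                          # stand-in for U_t

def readout(h_i):
    return h_i ** 2                                 # stand-in for R

def alchemical_energy(nodes, distances, lam, n_layers=2):
    """nodes: {(i, alpha): element}; distances: {(i, j): r_ij}; lam: {alpha: weight}."""
    h = {n: EMBEDDING[z] for n, z in nodes.items()}
    for _ in range(n_layers):
        m = dict.fromkeys(nodes, 0.0)
        for (i, a) in nodes:
            for (j, b) in nodes:
                if (i, j) not in distances:
                    continue
                if not (a == 0 or b == 0 or a == b):          # Eq. (4)
                    continue
                w = lam[b] if (a == 0 and b != 0) else 1.0    # Eq. (5)
                m[(i, a)] += w * message(h[(i, a)], h[(j, b)], distances[(i, j)])  # Eq. (6)
        h = {n: update(h[n], m[n]) for n in nodes}
    return sum(lam[a] * readout(h[(i, a)]) for (i, a) in nodes)  # Eq. (7)

# Toy geometry: cation sites 0 and 2, Cl at site 1.
DIST = {(0, 1): 2.8, (1, 0): 2.8, (1, 2): 2.8, (2, 1): 2.8, (0, 2): 4.0, (2, 0): 4.0}

def pure(z):
    # Ordinary graph: every node is non-alchemical (alpha = 0).
    return alchemical_energy({(0, 0): z, (1, 0): "Cl", (2, 0): z}, DIST, {0: 1.0})

def mixed(states, weights):
    nodes, lam = {(1, 0): "Cl"}, {0: 1.0}
    for alpha, (z, w) in enumerate(zip(states, weights), start=1):
        nodes[(0, alpha)] = z
        nodes[(2, alpha)] = z
        lam[alpha] = w
    return alchemical_energy(nodes, DIST, lam)

e_same = mixed(["Na", "Na", "Na"], [0.2, 0.3, 0.5])    # first consistency
e_onehot = mixed(["Li", "Na", "K"], [0.0, 0.0, 1.0])   # second consistency
e_mix = mixed(["Li", "Na", "K"], [0.2, 0.3, 0.5])      # interpolated state
```

Here `e_same` reproduces `pure("Na")` and `e_onehot` reproduces `pure("K")` up to floating-point rounding, while `e_mix` is an intermediate state with no counterpart in the unmodified model.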

Representation of solid solution

Lattice parameters

First, we investigate whether our representation of a mixture of compositional states can be used to model solid solutions and to optimize their properties with respect to composition. Although many crystal properties can be tuned by the design choice of solid solutions⁴⁵, here we will use lattice parameters to probe the modeling ability. Empirically, the lattice parameters of solid solutions can be approximated by linear interpolation of those of constituent pure crystals with the corresponding compositional weights, as stated by Vegard’s law^46,47. Nevertheless, there are systems that exhibit significant positive or negative deviation from this idealized linear behavior, and we assess whether the proposed method is able to predict such trends.
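As a reference point for what "deviation" means here, Vegard's law and the signed deviation from it can be written in a few lines. The function names and the numbers in the example are hypothetical and chosen only to illustrate the arithmetic.

```python
def vegard(end_member_params, weights):
    """Linear interpolation of end-member lattice parameters (Vegard's law)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("compositional weights must sum to 1")
    return sum(w * a for w, a in zip(weights, end_member_params))

def vegard_deviation(a_observed, end_member_params, weights):
    """Positive when the observed parameter lies above the linear estimate."""
    return a_observed - vegard(end_member_params, weights)

# Hypothetical end members with a = 5.0 and 6.0 (arbitrary units), x = 0.25.
a_linear = vegard([5.0, 6.0], [0.75, 0.25])
dev = vegard_deviation(5.30, [5.0, 6.0], [0.75, 0.25])
```

A solid solution that follows Vegard's law gives a deviation near zero at every composition; the systems discussed next are of interest because they do not.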

First, the cell parameter for cubic Ce_1−xM_xO₂ solid solution varies linearly with x when M = Zr, but shows a positive deviation with a kink for M = Sn⁴⁸. We modeled this solid solution starting from the CeO₂ structure (Fig. 2a), splitting the Ce atoms into two alchemical states, Ce with weight 1 − x and Zr or Sn with weight x, and optimizing the zero-temperature cell parameters by relaxing the unit cell. The alchemical scheme adapted for the universal MACE-MP-0 model¹⁵ gives the correct linear behavior for M = Zr, and successfully identifies the positive deviation for M = Sn (Fig. 2b), although it fails to predict the kink. Further, we also model orthorhombic BiSX_1−xY_x (X, Y = Cl, Br, I) solid solutions, for which the lattice parameters a (positive) and c (negative with a local minimum) exhibit deviations from linearity⁴⁹. We start from the BiSBr structure (Fig. 2a) and split the Br atoms into two alchemical atoms of X and Y. The cell parameters are optimized with respective alchemical weights. For example, the BiSCl_1−xI_x structure will have alchemical atoms Cl and I with alchemical weights of λ = (1 − x, x). The alchemical scheme with the MACE model correctly identifies the positive and negative deviations for a and c, respectively, for X = Cl and Y = I (Fig. 2c). In particular, while the parameter c is much larger than the experimental values (due to the inherent error in the original MLIP, itself likely arising from the underbinding nature of the PBE functional used to create the training data), the composition for the local minimum (x ≈ 0.2) is accurately predicted. Although there is no direct correspondence between the alchemical weights and the stoichiometry of the solid solution, these results indicate that the representation developed here offers greater predictive accuracy compared to the naive estimate from Vegard’s law. It is important to note that the current method assumes infinite disorder and thus neglects the effect of ionic ordering. In addition, because all the alchemical atoms are co-located in the position of the parent atom, the potential discrepancies among the fractional coordinates of substituent alchemical atoms are not taken into account.

**Fig. 2: Lattice parameters for solid solutions.**

Compositional optimization

Most MLIPs are designed to be end-to-end differentiable in order to obtain atomic forces and stress as gradients of the potential energy with respect to the positions r_i and the strain tensor ϵ, i.e., F_i = − ∂E/∂r_i and σ = V⁻¹∂E/∂ϵ where V is the volume of the system. Gradient calculations are performed efficiently through the backward pass generated by automatic differentiation⁵⁰. With our additional continuous representation of compositional states, the alchemical weights λ, we can also compute the gradients of the energy with respect to the composition ∂E/∂λ. Since the potential energy is defined only up to a constant, physically meaningful optimization targets are, in general, given by the energy difference or the gradient of the energy with respect to some system variables.

First, we consider a simple model: a solid solution of three alkali metal chlorides, LiCl, NaCl, and KCl. We fix the fractional coordinates of each atom and consider the cubic lattice constant as a function of alchemical (or compositional) weights of Li, Na, and K. To find a composition that matches a target lattice constant, we can enumerate a grid of compositions and relax the cell dimensions at each fixed composition to probe lattice constants over the compositional space (Fig. 3a, left). However, instead of this direct method, we can use the fact that the stress vanishes for the optimized structure and composition. Since our scheme is end-to-end differentiable, we can calculate the gradient of absolute hydrostatic stress $| {{{\rm{tr}}}}\,{{{\boldsymbol{\sigma }}}}| /3$ with respect to the composition where the lattice constant is fixed to the target value (Fig. 3a, right). Then, the optimal composition can be found by performing gradient descent in the compositional space, offering a different approach to the design problem. This is more efficient because only a single gradient-based compositional optimization is required, rather than a cell relaxation at every grid point. In this case, since the size of Na is between those of Li and K, multiple optimal compositions exist in the compositional space.
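The idea of descending on composition at a fixed target lattice constant can be illustrated with a deliberately simple surrogate. In the sketch below, the energy is a weighted sum of harmonic wells, E(a, λ) = Σ_α λ_α · ½k(a − a_α)², so the "stress" is just ∂E/∂a. The lattice constants in `A_PURE` are rough illustrative values, the softmax parametrization of λ and the squared-stress objective are assumptions made for a smooth toy problem, and the gradient is written by hand where an MLIP would use automatic differentiation.

```python
import math

A_PURE = {"Li": 5.14, "Na": 5.64, "K": 6.29}  # rough values in angstrom, illustrative only
K_SPRING = 1.0                                # hypothetical stiffness

def softmax(theta):
    m = max(theta)
    e = [math.exp(t - m) for t in theta]
    s = sum(e)
    return [x / s for x in e]

def stress_proxy(a, lam):
    # dE/da for E = sum_alpha lam_alpha * 0.5 * k * (a - a_alpha)^2
    return K_SPRING * sum(l * (a - a0) for l, a0 in zip(lam, A_PURE.values()))

def optimize_composition(a_target, theta, lr=2.0, n_steps=500):
    a_vals = list(A_PURE.values())
    for _ in range(n_steps):
        lam = softmax(theta)
        s = stress_proxy(a_target, lam)
        a_mean = sum(l * a for l, a in zip(lam, a_vals))
        # Gradient of s^2 w.r.t. theta_b; ds/dtheta_b = -k * lam_b * (a_b - a_mean).
        grad = [2.0 * s * (-K_SPRING) * l * (a - a_mean) for l, a in zip(lam, a_vals)]
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return softmax(theta)

# Two different starting compositions for the same target lattice constant.
lam_a = optimize_composition(5.8, [0.0, 0.0, 0.0])
lam_b = optimize_composition(5.8, [2.0, 0.0, 0.0])
```

Both runs drive the stress proxy to zero but end at different (Li, Na, K) weights, which mirrors the remark that the lattice-matching condition defines a family of optimal compositions rather than a single point.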

Now, we apply this to a more realistic example, where we want to find the lattice-matching composition for solid solutions Al_1−xSc_xN and Al_1−xY_xN with GaN. The lattice-matched composition would facilitate the epitaxial growth of such solid solutions on the GaN substrate⁵¹. The objective is to determine a composition x for each solid solution such that the cell parameter a of the lattice matches the value for the GaN structure. Although GaN and AlN possess a hexagonal lattice (space group P6₃mc), pure YN and ScN have a cubic lattice (space group $Fm\overline{3}m$), which means that one cannot simply interpolate between the cell parameters of the constituent compounds to infer those of solid solutions. Here, we fix the cell parameter a for the hexagonal lattice, and we optimize the relevant stress components with respect to the cell parameter c as well as the Al/Sc or Al/Y composition (see Section “Representation of solid solution”) because the doped AlN would have a different c/a ratio. Results in Fig. 3b show that the optimized compositions are x ≈ 0.1 (Y) and x ≈ 0.2 (Sc), and are in good agreement with the forward scan result, where the relaxed cell parameters are measured while scanning over various x values. Furthermore, we created a 4 × 4 × 4 supercell of AlN and randomly switched some Al atoms to Sc or Y atoms to match the target composition and measured the unit cell parameters. These results match well with the scan results over alchemical unit cell compositions, which indicates that the methodology in the current work can also be regarded as a computationally efficient compact representation of the supercell with compositional disorder.

Disorder energetics

The high computational efficiency and accuracy of alchemically modified MLIPs for modeling disordered solid solutions are further validated by examining a dataset of ${{{{\rm{A}}}}}_{2}{{{{\rm{B}}}}}^{{\prime} }{{{{\rm{B}}}}}^{{\prime}{\prime} }{{{{\rm{O}}}}}_{6}$ multicomponent perovskites in our recent high-throughput studies^52,53. Notably, the thermodynamic preference of an ${{{{\rm{A}}}}}_{2}{{{{\rm{B}}}}}^{{\prime} }{{{{\rm{B}}}}}^{{\prime}{\prime} }{{{{\rm{O}}}}}_{6}$ perovskite to adopt either cation-ordered or cation-disordered structures depends on the difference between formation energetics of various cation-ordered configurations and those of cation-disordered solid solutions⁵². For ordered structures, the formation energetics across all possible symmetrically inequivalent cation arrangements can serve as physics-informed descriptors to predict the thermodynamic tendency towards experimental cation disorder⁵³. While DFT is computationally prohibitive for evaluating formation energetics of various enumerated cation-ordered atomic arrangements, we have shown that symmetry-aware equivariant graph neural networks, including equivariant MLIPs, provide efficient and accurate surrogates for assessing ordering-dependent thermodynamic stability in multicomponent perovskite oxides⁵².

Here, we extend our previous analysis to directly examine the formation energetics of fully cation-disordered ${{{{\rm{A}}}}}_{2}{{{{\rm{B}}}}}^{{\prime} }{{{{\rm{B}}}}}^{{\prime}{\prime} }{{{{\rm{O}}}}}_{6}$ solid solutions with partial B site occupancies of 0.5 B′ and 0.5 B″. Traditionally, special quasirandom structures (SQS)^54,55, which optimize elemental placements within a supercell to mimic the cluster vectors of random alloys, have been widely used to study disordered solid solutions. While the SQS approach provides a systematic approach to model disordered structures, it requires large supercells to avoid correlations across periodic boundaries and relies on optimization routines such as Monte Carlo simulations⁵⁶, limiting its feasibility for high-throughput studies. Given the efficiency of alchemically modified MLIPs in representing disorder through partial elemental occupancies, we compare alchemical unit cell modeling of perovskite solid solutions to SQS cells using baseline MLIPs for disorder modeling.

Starting from the base ordered perovskite ABO₃ structure, we split the B atom into two alchemical species, B′ and B″, each assigned an alchemical weight of 0.5. We then generate N × N × N (N = 1, 2, 4, 6) alchemical supercells and 4 × 4 × 4 SQS supercells (Fig. 4a), optimizing each cell using alchemically modified MACE-MP-0 and baseline MACE-MP-0 models. For alchemical supercells, the relaxed cell energy per atom plateaus at the 2 × 2 × 2 supercell, while the unit cell (1 × 1 × 1) exhibits notably higher energy compared to larger supercells (Fig. 4b). The structural differences between unrelaxed and relaxed disordered cells, shown in Fig. 4c, quantified using cosine distances of local structural fingerprints^53,57, reveal that the alchemical unit cell relaxes only slightly, whereas larger alchemical supercells and SQS cells show more significant differences between their corresponding unrelaxed and MLIP-relaxed structures. As noted in previous works^52,53, crystallographic sites undergo substantial distortion during relaxation, such as octahedral tilting and Jahn–Teller distortions⁵⁸, which are typically beyond the periodicity of a perovskite unit cell and thus can hardly be captured by modeling a single unit cell. Given that the 2 × 2 × 2 supercell yields results similar to those of larger alchemical supercells, we proceed with further analysis using the 40-atom 2 × 2 × 2 alchemical supercell.

**Fig. 4: Disordered energetics in multicomponent perovskite oxides.**

As shown in Fig. 4d, the optimized single-point energies from the alchemical 2 × 2 × 2 supercell align well with those from the 4 × 4 × 4 SQS supercell, with a mean absolute error (MAE) of 0.032 eV/atom. Since the preference for cation-ordered and cation-disordered configurations depends on the relative formation energetics of each, we further compare energy values with all symmetrically inequivalent cation-ordered configurations in the 2 × 2 × 2 supercell, obtained by enumerating four B′ and four B″ cations occupying eight B sites⁵³. The results in Fig. 4e show the relative energies of all considered structures, aligned with the ground-state (lowest-energy) cation-ordered structure energy as the reference. As previously discussed in refs. ^52,53, we observe that experimentally observed ordered compositions exhibit a significant difference between the ground-state ordered configuration energy and other configurations, whereas experimentally cation-disordered compositions show similar energies among different configurations. The relative energy of the disordered SQS supercell provides a useful metric for characterizing experimental order/disorder, as seen by the separation of ordered and disordered compositions when sorting the oxide compositions by the SQS energy. Although the 2 × 2 × 2 alchemical supercell energies show more stochasticity, they follow the same trend as the relative energies of the SQS. This is further supported by the receiver operating characteristic (ROC) curves for experimental order/disorder classification based on relative energy values (Fig. 4f), where 4 × 4 × 4 SQS cell energies provide excellent classification with an area under the curve (AUC) of 0.95, while the alchemical 2 × 2 × 2 cell energies achieve reasonably good experimental order/disorder classification with an AUC of 0.80. The likely source of this difference is that for ions of very different sizes, local structural distortions are related to local chemical ordering, but the use of an average structure imposed by the alchemical method fails to produce local distortions that SQS captures well.
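For readers less familiar with the two metrics quoted above, the sketch below shows how an MAE and a ROC AUC are computed from per-composition values. The function names are hypothetical, the AUC uses the pairwise-ranking definition rather than any particular library routine, and the inputs are synthetic numbers, not data from this study.

```python
def mean_absolute_error(values_a, values_b):
    """MAE between two equally long lists, e.g. alchemical vs. SQS energies."""
    return sum(abs(x - y) for x, y in zip(values_a, values_b)) / len(values_a)

def roc_auc(scores, labels):
    """
    Probability that a randomly chosen positive (label 1) receives a higher
    score than a randomly chosen negative (label 0); ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic example: five compositions, label 1 for one class and 0 for the other.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
toy_mae = mean_absolute_error([1.0, 2.0], [1.5, 1.0])
```

An AUC of 0.5 corresponds to a score with no discriminating power and 1.0 to perfect separation, which is the scale on which the 0.95 and 0.80 values above should be read.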

Hence, based on these results, we conclude that the alchemical modification of MLIPs offers a scalable approach for disorder modeling, as demonstrated with this multicomponent perovskite oxide dataset. The alchemical 2 × 2 × 2 supercells provide reasonable accuracy for experimental disorder classification, while using only 1/8 of the atoms in the 4 × 4 × 4 SQS supercells. Unlike SQS, these alchemical supercells can be obtained without the need for additional annealing steps for configuration generation. The results were achieved by modifying off-the-shelf pre-trained MLIPs and could be further fine-tuned to improve energy prediction and order-disorder classification. They may also be adapted for other material systems, including compositionally complex alloys and ceramics.

We also note that our approach shares similarities with the Virtual Crystal Approximation (VCA)^59,60,61, a traditional approach in modeling solid solutions with partial elemental site occupancy. VCA relies on two assumptions: (1) geometry: the solid solution is represented by an averaged structure where crystallographic sites are randomly occupied by different elements, disregarding local ordering; and (2) interaction: the random occupancy is approximated by a compositionally weighted average of atomic pseudopotentials. Our method adopts the first assumption, making it subject to the same geometric limitations, namely that the elements should be of similar size and occupy comparable positions, and that local disorder effects should be minimal. However, the practical limitations of VCA mainly arise from what could be described as pseudopotential alchemy, where accuracy depends heavily on carefully tuning pseudopotential parameters like radial cutoffs and electronic configurations (core/valence). In contrast, our method sidesteps these challenges: MLIPs replace electronic structure calculations with iterative message-passing between node and edge features. Built-in regularization from the training scheme and model architecture helps ensure that results remain within a physically reasonable range, reducing the need for extensive manual parameter adjustments.

Free energy calculations

Here, we utilize the nonequilibrium switching method, where the Hamiltonian depends on a progression parameter λ ∈ [0, 1] so that it interpolates between the initial Hamiltonian H_i = H(λ = 0) and the final Hamiltonian H_f = H(λ = 1). Assuming the NVT ensemble, the reversible work is given via the TI equation⁶²:

$$\Delta F={W}_{{{\rm{i}}}\to {{\rm{f}}}}^{{{\rm{rev}}}}=\int_{0}^{1}{{{\rm{d}}}}\lambda {\left\langle \frac{\partial H}{\partial \lambda }\right\rangle }_{\lambda }.$$

(8)
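Eq. (8) can be verified on a system where everything is known analytically. The sketch below assumes a classical one-dimensional harmonic oscillator whose spring constant is interpolated linearly, k(λ) = (1 − λ)k_i + λk_f, so that ⟨∂H/∂λ⟩_λ = ½(k_f − k_i)·k_BT/k(λ) by equipartition and ΔF = ½k_BT ln(k_f/k_i). This toy system and the trapezoidal quadrature are illustrative choices and are not part of the workflow described here.

```python
import math

def mean_dh_dlambda(lam, k_i, k_f, kT):
    # <dH/dlambda> for H = 0.5 * k(lam) * x^2, using <x^2> = kT / k(lam).
    k_lam = (1.0 - lam) * k_i + lam * k_f
    return 0.5 * (k_f - k_i) * kT / k_lam

def thermodynamic_integration(k_i, k_f, kT, n_points=1001):
    # Trapezoidal rule over lambda in [0, 1], Eq. (8).
    h = 1.0 / (n_points - 1)
    vals = [mean_dh_dlambda(n * h, k_i, k_f, kT) for n in range(n_points)]
    return h * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])

delta_f_ti = thermodynamic_integration(k_i=1.0, k_f=4.0, kT=1.0)
delta_f_exact = 0.5 * 1.0 * math.log(4.0 / 1.0)
```

In a real simulation the ensemble average at each λ would come from sampling rather than a closed-form expression, which is what makes equilibrium TI expensive and motivates the finite-time switching described next.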

We now consider a finite-time process in which λ is switched from 0 at time t_i to 1 at time t_f. The irreversible work done by switching the Hamiltonian is

$${W}_{{{\rm{i}}}\to {{\rm{f}}}}^{{{{\rm{irrev}}}}}=\int_{{t}_{{{{\rm{i}}}}}}^{{t}_{{{{\rm{f}}}}}}{{{\rm{d}}}}t\frac{{{{\rm{d}}}}\lambda }{{{{\rm{d}}}}t}\frac{\partial H}{\partial \lambda }={W}_{{{\rm{i}}}\to {{\rm{f}}}}^{{{{\rm{rev}}}}}+{E}_{{{\rm{i}}}\to {{\rm{f}}}}^{{{\rm{diss}}}},$$

(9)

where ${E}_{{{\rm{i}}}\to {{\rm{f}}}}^{{{\rm{diss}}}}$ is the dissipated energy. In a linear-response regime, it can be shown^63,64 that the dissipated energy for the forward and backward paths is the same when averaged over the transition path ensemble, i.e.,

$$\overline{{E}^{{{{\rm{diss}}}}}}=\overline{{E}_{{{\rm{i}}}\to {{\rm{f}}}}^{{{\rm{diss}}}}}=\overline{{E}_{{{\rm{f}}}\to {{\rm{i}}}}^{{{\rm{diss}}}}}=\frac{1}{2}\left(\overline{{W}_{{{\rm{i}}}\to {{\rm{f}}}}^{{{\rm{irrev}}}}}+\overline{{W}_{{{\rm{f}}}\to {{\rm{i}}}}^{{{\rm{irrev}}}}}\right).$$

(10)

Then, the free energy difference can be computed as

$$\Delta F=\frac{1}{2}\left(\overline{{W}_{{{\rm{i}}}\to {{\rm{f}}}}^{{{\rm{irrev}}}}}-\overline{{W}_{{{\rm{f}}}\to {{\rm{i}}}}^{{{\rm{irrev}}}}}\right).$$

(11)
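As a minimal sketch of how Eqs. (10) and (11) are applied in practice, the following Python snippet averages the irreversible work over repeated forward and backward switching runs and returns both the free energy difference and the mean dissipated energy. The function name `bidirectional_estimate` and the work values are illustrative assumptions, not results or code from this work.

```python
from statistics import mean


def bidirectional_estimate(w_forward, w_backward):
    """Combine irreversible work samples from i->f and f->i switching runs.

    Returns (delta_f, e_diss) following Eq. (11) and Eq. (10), respectively.
    """
    wf = mean(w_forward)    # average forward irreversible work
    wb = mean(w_backward)   # average backward irreversible work
    delta_f = 0.5 * (wf - wb)   # Eq. (11)
    e_diss = 0.5 * (wf + wb)    # Eq. (10)
    return delta_f, e_diss


# Illustrative work values in eV (made up for this example).
w_fwd = [1.30, 1.20, 1.25]
w_bwd = [-0.70, -0.80, -0.75]
delta_f, e_diss = bidirectional_estimate(w_fwd, w_bwd)
```

The estimate relies on the linear-response assumption stated above; a large dissipated energy relative to the free energy difference would suggest that a longer switching time is needed.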

Often, the Hamiltonian is parametrized by the linear interpolation of the two endpoints, i.e., H(λ) = (1 − λ)H_i + λH_f, to simplify the calculation of the gradient term ∂H/∂λ in Eqs. (8) and (9): ∂H/∂λ = H_f − H_i. However, we note that in our case, the system Hamiltonian can be parametrized by the alchemical weights, and ∂H/∂λ can be calculated straightforwardly using automatic differentiation⁵⁰ on the MLIP. This method proves to be more efficient than linear interpolation as it obviates the need to repeat calculations for non-changing atoms. We compare computational efficiencies and the resultant free energy calculations in Supplementary Information.
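To illustrate how the work integral in Eq. (9) is accumulated from ∂H/∂λ when the Hamiltonian is not a linear interpolation of its endpoints, the following Python sketch uses a hypothetical one-dimensional toy energy that is nonlinear in λ, standing in for an alchemically weighted MLIP. A central finite difference replaces automatic differentiation here purely to keep the example free of dependencies; the names `toy_energy`, `dH_dlam`, and `switching_work` are assumptions made for illustration and are not part of any MLIP package.

```python
def toy_energy(lam, x):
    """Hypothetical energy, nonlinear in lam (spring constant goes from 1 to 3)."""
    k = 1.0 + 2.0 * lam ** 2
    return 0.5 * k * x * x


def dH_dlam(energy, lam, x, h=1e-5):
    """Central finite difference in lam; a stand-in for automatic differentiation."""
    return (energy(lam + h, x) - energy(lam - h, x)) / (2.0 * h)


def switching_work(energy, xs, n_steps):
    """Midpoint discretization of Eq. (9): W = sum over steps of dlam * dH/dlam."""
    dlam = 1.0 / n_steps
    work = 0.0
    for n in range(n_steps):
        lam_mid = (n + 0.5) * dlam
        work += dlam * dH_dlam(energy, lam_mid, xs[n])
    return work


n_steps = 1000
xs = [0.5] * n_steps  # frozen coordinate; in a real run these come from MD
work = switching_work(toy_energy, xs, n_steps)
```

With the coordinate frozen, the accumulated work reduces to the endpoint energy difference H_f − H_i at that configuration, which gives a simple consistency check; in a real switching simulation the coordinates evolve under MD while λ is switched, and the excess over the reversible work is the dissipated energy.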

Free energy of vacancy formation

Accurate evaluation of the free energy of a point defect is important for characterizing its thermodynamic stability⁶⁵. Here, we calculate the Gibbs free energy of vacancy formation, defined as

$${G}_{{{{\rm{v}}}}}={G}_{{{{\rm{defect}}}}}-\frac{N-1}{N}{G}_{{{{\rm{perfect}}}}},$$

(12)

where G_defect and G_perfect are the Gibbs free energies of the crystal with and without a point defect, respectively, and N is the number of atoms in the perfect crystal. Because the vacancy diffuses at high temperatures, it is common to first evaluate the Gibbs free energies at low temperatures, at which the vacancy is fixed at one site⁶⁶, and then extend the calculation by considering the temperature dependence of the Gibbs free energy⁶⁷. Hence, we focus on determining the Gibbs free energy of vacancy in BCC iron at low temperatures and compare the result with Gibbs free energies determined using the Frenkel–Ladd path⁶⁸, which is commonly used in nonequilibrium calculations^33,64. In the Frenkel–Ladd path, the crystal structure is switched from and to a system of independent harmonic oscillators with the same equilibrium positions (the Einstein crystal), for which we can calculate the exact free energy. See Section “Free energy calculations” for more details on the reference calculation.
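For concreteness, Eq. (12) can be evaluated as in the short Python sketch below. The function name `vacancy_free_energy` and the numerical values are illustrative assumptions only; they are not taken from the BCC iron calculations reported here.

```python
def vacancy_free_energy(g_defect, g_perfect, n_atoms):
    """Eq. (12): G_v = G_defect - (N - 1) / N * G_perfect.

    g_defect  -- Gibbs free energy of the cell containing one vacancy (N - 1 atoms)
    g_perfect -- Gibbs free energy of the perfect cell (N atoms)
    n_atoms   -- number of atoms N in the perfect cell
    """
    if n_atoms < 2:
        raise ValueError("the perfect crystal needs at least two atoms")
    return g_defect - (n_atoms - 1) / n_atoms * g_perfect


# Made-up values in eV for a 128-atom supercell.
g_v = vacancy_free_energy(g_defect=-1014.0, g_perfect=-1024.0, n_atoms=128)
```

The factor (N − 1)/N rescales the perfect-crystal free energy to the number of atoms remaining in the defective cell, so that the two terms refer to the same amount of material.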

We introduce a new alchemical path for determining the free energy of vacancy, as depicted in Fig. 5a. While the previous examples of our method were restricted to cases where ${\sum}_{\alpha }{\lambda }_{\alpha }=1$, we can lift this restriction to create or annihilate atoms in a system alchemically. In this case, we assign the alchemical weight λ₁ = 1 − λ to the atom at the vacancy site and switch the weight from 1 to 0 (λ from 0 to 1) over the simulation to make it continuously disappear from the system. At the same time, we add a harmonic oscillator term for the atom position with weight λ, so that the alchemical conversion from λ = 0 to 1 transforms the perfect crystal into a crystal with a defect plus a harmonic oscillator (Fig. 5b). Through nonequilibrium switching simulations, we can obtain the alchemical free energy difference ΔG^AL (Eq. (23)). We now compare the free energy of vacancy (Eq. (12)) obtained from Frenkel–Ladd calculations (${G}_{{{\rm{defect}}}}^{{{\rm{FL}}}}$ and ${G}_{{{\rm{perfect}}}}^{{{\rm{FL}}}}$) and from alchemical free energy calculations (ΔG^AL and ${G}_{{{\rm{perfect}}}}^{{{\rm{FL}}}}$).

**Fig. 5: Free energy of vacancy formation in BCC iron.**

The results in Fig. 5c show that G_v calculated by the proposed alchemical free energy method is comparable to that from the reference Frenkel–Ladd calculations, while offering more consistent results with much smaller standard deviations when using the same switching time steps. We further investigate the statistical efficiency of the switching paths at 100 K by evaluating the convergence of ΔG, taking the longest switching time result as its reference, as well as the dissipated energy E_diss (Eq. (10)) in Fig. 5d. The alchemical pathway offers much faster convergence, with minimal average energy deviations (< 0.02 meV/atom) from the reference value, even at a very short switching time of 2 ps (1000 MD steps).

Alchemical free energy calculations

Now, we examine the effectiveness of the proposed alchemical scheme in the calculation of alchemical free energy difference associated with the change in the elemental identities of the atoms. We use halide perovskites CsPbI₃ and CsSnI₃ as our model system, which have been studied using MLIPs (e.g.,⁶⁹) and classical force fields (e.g.,^70,71). Both CsPbI₃ and CsSnI₃ exhibit three photoactive perovskite phases, α (cubic, $Pm\overline{3}m$), β (tetragonal, P4/mbm), and γ (orthorhombic, Pnma), in decreasing order of temperature window of stability. However, they also possess a photoinactive non-perovskite polymorph, δ (orthorhombic, Pnma), which is the most stable phase at room temperature⁷². Here, we analyze the difference in the relative stabilities of perovskite (P) and non-perovskite (N) phases as shown in the thermodynamic cycle in Fig. 6a. The direct computation of the free energy of phase transformation, ΔG_Pb,P→N and ΔG_Sn,P→N, may require enhanced sampling simulations with tailored collective variables or nonequilibrium simulations (the Frenkel–Ladd paths) with longer simulation time until convergence. The alchemical path enables the calculation of ΔG_P,Pb→Sn and ΔG_N,Pb→Sn. Since the two types of free energy differences are linked by

$$\begin{aligned}\Delta \Delta G&=\Delta {G}_{{{\rm{N}}},{{\rm{Pb}}}\to {{\rm{Sn}}}}-\Delta {G}_{{{\rm{P}}},{{\rm{Pb}}}\to {{\rm{Sn}}}}\\&=\Delta {G}_{{{\rm{Sn}}},{{\rm{P}}}\to {{\rm{N}}}}-\Delta {G}_{{{\rm{Pb}}},{{\rm{P}}}\to {{\rm{N}}}},\end{aligned}$$

(13)

we can compute the difference in the relative stability of phases upon compositional changes, or we can calculate either of the free energies of phase transformation if another is already known.
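The bookkeeping implied by the thermodynamic cycle in Eq. (13) can be sketched in a few lines of Python. The function names `delta_delta_g` and `close_cycle` and all numerical values are hypothetical and serve only to show how one phase-transformation free energy follows from the other once the two alchemical legs are known.

```python
def delta_delta_g(dg_n_pb_to_sn, dg_p_pb_to_sn):
    """First line of Eq. (13): difference between the two alchemical legs."""
    return dg_n_pb_to_sn - dg_p_pb_to_sn


def close_cycle(ddg, dg_pb_p_to_n):
    """Second line of Eq. (13), solved for dG_{Sn,P->N} given dG_{Pb,P->N}."""
    return ddg + dg_pb_p_to_n


# Made-up values in eV per formula unit.
dg_n = 0.30    # non-perovskite phase, Pb -> Sn
dg_p = 0.25    # perovskite phase, Pb -> Sn
dg_pb = -0.10  # CsPbI3, perovskite -> non-perovskite (assumed known)

ddg = delta_delta_g(dg_n, dg_p)
dg_sn = close_cycle(ddg, dg_pb)

# Going around the closed cycle must give zero net free energy change.
cycle_residual = dg_pb + dg_n - dg_sn - dg_p
```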

**Fig. 6: Alchemical free energy simulations.**

For the alchemical free energy simulation, starting from the CsPbI₃ structure, the Cs and I atoms remain as non-alchemical atoms, and the Pb atoms are divided into alchemical atoms, Pb and Sn, with alchemical weights λ₁ = 1 − λ and λ₂ = λ. Then, switching λ from 0 to 1 continuously transforms the CsPbI₃ structure into the CsSnI₃ structure. Refer to Section “Free energy calculations” for more details on the alchemical free energy calculation settings and result analysis required to obtain the Gibbs free energies.
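The assignment of alchemical atoms and weights described above can be sketched as a simple data structure in Python. The class `AlchemicalAtom` and the helper `alchemical_pairs` are hypothetical names introduced for this illustration and should not be read as the interface of the released code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlchemicalAtom:
    site: int      # index of the crystallographic site
    element: str   # chemical symbol occupying the site
    weight: float  # alchemical weight of this occupant


def alchemical_pairs(symbols, source, target, lam):
    """Split every `source` site into (source, 1 - lam) and (target, lam).

    All other atoms are kept as non-alchemical atoms with weight 1.
    """
    atoms = []
    for i, symbol in enumerate(symbols):
        if symbol == source:
            atoms.append(AlchemicalAtom(i, source, 1.0 - lam))
            atoms.append(AlchemicalAtom(i, target, lam))
        else:
            atoms.append(AlchemicalAtom(i, symbol, 1.0))
    return atoms


# One formula unit of CsPbI3, a quarter of the way along the switching path.
symbols = ["Cs", "Pb", "I", "I", "I"]
atoms = alchemical_pairs(symbols, "Pb", "Sn", lam=0.25)
```

At λ = 0 the Sn occupants carry zero weight and at λ = 1 the Pb occupants do, so the two endpoints of the switching path correspond to the CsPbI₃ and CsSnI₃ compositions, respectively.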

First, we compare the Gibbs free energy of compositional change from two methods: $\Delta {G}_{{{\rm{P}}}/{{\rm{N}}},{{\rm{Pb}}}\to {{\rm{Sn}}}}^{{{\rm{AL}}}}$ from the alchemical path and $\Delta {G}_{{{\rm{P}}}/{{\rm{N}}},{{\rm{Pb}}}\to {{\rm{Sn}}}}^{{{{\rm{FL}}}}}={G}_{{{\rm{P}}}/{{\rm{N}}},{{\rm{Sn}}}}^{{{{\rm{FL}}}}}-{G}_{{{\rm{P}}}/{{\rm{N}}},{{\rm{Pb}}}}^{{{\rm{FL}}}}$ from the Frenkel–Ladd path for each composition. The results in Fig. 6b indicate that the two calculation results agree well except for a slight deviation in the perovskite phase at temperatures lower than 400 K. The deviation may arise from the phase transformation between perovskite phases of CsPbI₃ (i.e., α → β). The Frenkel–Ladd path is simulated in the fixed cell (NVT) of the respective α phase, whereas the alchemical path is simulated in the NPT ensemble, in which phase transformations can occur. Given that the β phase is more stable than the α phase for CsPbI₃ at low temperatures, $\Delta {G}_{{{\rm{Pb}}}\to {{\rm{Sn}}}}^{{{\rm{AL}}}}$ is expected to be larger than $\Delta {G}_{{{\rm{Pb}}}\to {{\rm{Sn}}}}^{{{\rm{FL}}}}$, as in Fig. 6b. See Supplementary Information for further discussion. The calculation of ΔΔG (Eq. (13)) also shows that the two results are well matched at higher temperatures, while the alchemical path provides smaller standard deviations from multiple runs.

As in the previous example, we analyzed the convergence of the Gibbs free energy and the energy dissipation for the alchemical path for the perovskite phase at 400 K by changing the switching time for nonequilibrium simulations. Fig. 6c shows that, similar to the previous result, the alchemical path provides much faster convergence than the Frenkel–Ladd path. This result confirms that the phase space overlap between two structures of the same phase with different compositions is much more significant than that between the atomic structures and the Einstein crystals, which enables much more efficient free energy simulations.

Discussion

The alchemical modification of MLIP introduced in this work allows a smooth interpolation between structures with two or more different compositions. Building upon a prototypical construction of MLIP, we modified the input graph, message passing scheme, and readout layers to alchemically weight the different compositional states. Although this modification can be generalized to various classes of MLIPs, it is particularly efficient when integrated with MACE because of its construction of many-body features from two-body messages (see Section “Architecture-specific modifications”).

We first applied the scheme to the modeling of solid solutions. Although there is no theoretical relationship between the stoichiometry and the alchemical weights, the results showed that it could model the nonlinear deviations of cell parameters in some solid solutions. The end-to-end differentiability of the model with respect to the alchemical weights enabled the optimization of composition to match the desired cell parameters. The alchemical modification also provides a scalable, efficient method for characterizing order and disorder, as demonstrated in multicomponent perovskite oxides, achieving accuracy comparable to SQS with fewer atoms and no optimization needed for structure generation. Furthermore, the alchemical weights allow smooth creation or annihilation of atoms, or the change in atom types, enabling the calculation of free energy differences between two compositional states. We demonstrated that the free energy of vacancy in BCC iron and the relative phase stabilities of the perovskite and non-perovskite phases of CsPbI₃ and CsSnI₃ could be calculated much more efficiently than with the widely utilized Frenkel–Ladd path. It is worth noting that, unlike the modeling of solid solutions, the alchemical free energy calculations conducted here are theoretically exact when reaching convergence.

Overall, the proposed method enables efficient modeling of composition-related properties with sufficient consistency within the underlying MLIP. Beyond the aforementioned lack of a theoretical basis for the connection between alchemical weights and stoichiometric coefficients, and the convergence questions that are universal to thermodynamic integration methods, inaccuracies emerge primarily from the MLIP. In particular, there are two sources of error: (1) the discrepancies between the MLIP and the DFT calculations and (2) the inaccuracy of the underlying DFT calculations. Since most universal MLIPs are trained on the energies and derivatives from the relaxation trajectory, the relative error around the energy minima would be small. This implies that the former error would also be small when performing free energy calculations for systems with a sufficient number of similar structures in the materials database. Fine-tuning the MLIP using the DFT data from relevant compositional space would alleviate the former error. One can also utilize free energy perturbation methods⁷³ to calculate the free energy from a more accurate Hamiltonian to reduce both types of errors. We also note that differentiable simulations^74,75 could be used to fine-tune the MLIP to match either the cell parameters resulting from the relaxation trajectory or the free energy differences from the MD simulations to their desired values, to mitigate both sources of error.

While we devised the alchemical scheme with fixed elemental identities and λ representing the occupancies of different alchemical atoms to align with our goal of leveraging pre-trained MLIPs, we note that interpolating the elemental identities of atoms, i.e., coupling λ to atomic numbers, is also a promising direction that connects with the quantum alchemy literature. Although pre-trained embeddings may not be ideal for this, they could be fine-tuned by alchemical force matching with ∂E/∂λ derived from quantum alchemy^22,76,77, possibly using analytical gradients^78,79,80, given that the baseline MLIP is end-to-end differentiable with respect to embeddings. Learning an MLIP-based representation consistent with quantum alchemy might offer a well-regularized approximation of the physical state for the TI calculation. While this alternative scheme could improve the physical relevance of alchemical degrees of freedom, it is incompatible with the current approach and remains a prospect for future work. Beyond the applications demonstrated in this work, we expect that the gradient of the physical observables with respect to the composition or elemental identities would hold particular importance to the generative modeling of molecules and materials systems. We envisage that further works, integrated with the discrete sampling literature^81,82, will utilize the alchemical degrees of freedom in MLIPs for such modeling applications.

Methods

Architecture-specific modifications

In MACE, the atomic basis ${{{{\boldsymbol{A}}}}}_{i}^{(t)}$ is constructed by pooling the two-body features over the neighbors as in Eq. (14) (Eq. (8) in the original paper¹⁰). The modification to message passing in Eq. (6) is implemented by multiplying the summands of the message aggregation by the edge weights ω_αβ (Eq. (5)), as in Eq. (15):

$${A}_{i,k{l}_{3}{m}_{3}}^{(t)}={\sum}_{{l}_{1}{m}_{1},{l}_{2}{m}_{2}}{C}_{{l}_{1}{m}_{1},{l}_{2}{m}_{2}}^{{l}_{3}{m}_{3}}{\sum}_{j\in {{{\mathcal{N}}}}(i)}{R}_{k{l}_{1}{l}_{2}{l}_{3}}^{(t)}({r}_{ji}){Y}_{{l}_{1}}^{{m}_{1}}({\hat{{{{\boldsymbol{r}}}}}}_{ji}){\sum}_{\tilde{k}}{W}_{k\tilde{k}{l}_{2}}^{(t)}{h}_{j,\tilde{k}{l}_{2}{m}_{2}}^{(t)},$$

(14)

$${A}_{(i,\alpha ),k{l}_{3}{m}_{3}}^{(t)}={\sum}_{{l}_{1}{m}_{1},{l}_{2}{m}_{2}}{C}_{{l}_{1}{m}_{1},{l}_{2}{m}_{2}}^{{l}_{3}{m}_{3}}{\sum}_{(j,\beta )\in \tilde{{{{\mathcal{N}}}}}((i,\alpha ))}{\omega }_{\alpha \beta }{R}_{k{l}_{1}{l}_{2}{l}_{3}}^{(t)}({r}_{ji}){Y}_{{l}_{1}}^{{m}_{1}}({\hat{{{{\boldsymbol{r}}}}}}_{ji}){\sum}_{\tilde{k}}{W}_{k\tilde{k}{l}_{2}}^{(t)}{h}_{(j,\beta ),\tilde{k}{l}_{2}{m}_{2}}^{(t)}.$$

(15)

The original readout mechanism is the sum of site energies over all the outputs of readout layers ${{{{\mathcal{R}}}}}_{t}(\cdot )$ (Eq. (16)). We implement the alchemical readout in Eq. (7) as the weighted sum of alchemical site energies as in Eq. (17):

$$E={\sum}_{i\in {{{\mathcal{V}}}}}{E}_{i}={\sum}_{i\in {{{\mathcal{V}}}}}{\sum}_{t=0}^{T}{{{{\mathcal{R}}}}}_{t}\left({{{{\boldsymbol{h}}}}}_{i}^{(t)}\right),$$

(16)

$$E={\sum}_{(i,\alpha )\in \tilde{{{{\mathcal{V}}}}}}{\lambda }_{\alpha }{E}_{(i,\alpha )}={\sum}_{(i,\alpha )\in \tilde{{{{\mathcal{V}}}}}}{\lambda }_{\alpha }{\sum}_{t=0}^{T}{{{{\mathcal{R}}}}}_{t}\left({{{{\boldsymbol{h}}}}}_{(i,\alpha )}^{(t)}\right).$$

(17)
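The difference between the two readouts in Eqs. (16) and (17) can be sketched in plain Python as follows. The function names and the site energies are illustrative assumptions; in the actual model the site energies come from the readout layers and themselves depend on the alchemical weights through the modified message passing of Eq. (15), which this sketch does not attempt to reproduce.

```python
def plain_readout(site_energies):
    """Eq. (16): total energy as the plain sum of site energies."""
    return sum(site_energies)


def alchemical_readout(site_energies, weights):
    """Eq. (17): total energy as the weight-scaled sum over alchemical atoms."""
    if len(site_energies) != len(weights):
        raise ValueError("one weight is required per alchemical atom")
    return sum(w * e for w, e in zip(weights, site_energies))


# Made-up site energies (eV) for the entries Cs, Pb, Sn, I, I, where Pb and Sn
# share one crystallographic site with weights 1 - lam and lam (lam = 0.25).
site_energies = [-1.0, -3.0, -2.0, -1.5, -1.5]
weights = [1.0, 0.75, 0.25, 1.0, 1.0]
energy = alchemical_readout(site_energies, weights)

# With all weights equal to one, Eq. (17) reduces to Eq. (16).
unweighted = alchemical_readout(site_energies, [1.0] * len(site_energies))
```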