Abstract
Metal–organic frameworks (MOFs) are highly porous and versatile materials studied extensively for applications such as carbon capture and water harvesting. However, computing phonon-mediated properties in MOFs, like thermal expansion and mechanical stability, remains challenging due to the large number of atoms per unit cell, making traditional Density Functional Theory (DFT) methods impractical for high-throughput screening. Recent advances in machine learning potentials have led to foundation atomistic models, such as MACE-MP-0, that accurately predict equilibrium structures but struggle with phonon properties of MOFs. In this work, we developed a workflow for computing phonons in MOFs within the quasi-harmonic approximation with a fine-tuned MACE model, MACE-MP-MOF0. The model was trained on a curated dataset of 127 representative and diverse MOFs. The fine-tuned MACE-MP-MOF0 improves the accuracy of phonon density of states and corrects the imaginary phonon modes of MACE-MP-0, enabling high-throughput phonon calculations with state-of-the-art precision. The model successfully predicts thermal expansion and bulk moduli in agreement with DFT and experimental data for several well-known MOFs. These results highlight the potential of MACE-MP-MOF0 in guiding MOF design for applications in energy storage and thermoelectrics.
Introduction
Metal–organic frameworks (MOFs) are composed of organic molecules, called linkers, connected to inorganic ions or clusters, called nodes1,2,3. MOFs are nanoporous materials that, due to their large modular nature and high porosity, have become strong candidates for many potential applications ranging from water harvesting4,5 and catalysis to biosensing. Some important physical properties that make MOFs suitable for such potential applications include mechanical stability, thermal expansion, heat conduction and superconductivity which are influenced by phonon-mediated lattice dynamics6,7,8,9. Phonons describe the collective vibrations of atoms in a crystal and interact with electronic and thermal excitations that affect these physical characteristics of MOFs10. However, the current understanding of these properties in MOFs is limited due to the computational complexity of predicting phonons in these large structures. Currently, one of the most reliable methods to study the electronic structure of materials and their properties is Density Functional Theory (DFT)11,12,13. However, for materials like MOFs that can have several hundreds or even thousands of atoms in their unit cell, performing supercell calculations necessary for obtaining accurate phonons becomes too computationally expensive for screening applications.
Semi-empirical quantum mechanical methods, such as the Density Functional Tight Binding (DFTB) method, have been used to compute phonons in a few highly symmetric isoreticular MOFs14. Several DFTB schemes have been developed, with DFTB315 being the most widely used. While DFTB is 2-3 orders of magnitude faster than DFT, its pairwise parametrization strategy and lack of available metal atom parameters limit its use in screening applications, particularly pertaining to phonons in MOFs. A notable variant of DFTB3, GFN1-xTB16, follows an element-specific parametrization strategy that extends its applicability to nearly the entire periodic table and to systems of up to thousands of atoms. In17, the GFN1-xTB method was reported to produce cell parameters within a 5% deviation (relative to experiments) for 75% of the MOFs in the CoRE database18,19, showing promise for adsorption applications and obtaining binding energies. However, GFN1-xTB still scales quadratically with the number of electrons, which may present challenges for larger MOFs, and it has not been used for the analysis of vibrational properties in MOFs.
Traditional force fields like UFF20 and CHARMM21 are widely used due to their scalability and significant speed-up in calculations compared to tight-binding DFT. UFF4MOF22, the UFF parametrization for MOFs, can be used for rapid structure prediction and screening. However, this transferability comes with limitations in accurately predicting dynamical properties such as phonons22,23. Other force fields, like the MOF-FF model24, accurately predicted the lattice parameters of some well-known MOFs like MOF-5 and UiO-66 within 0.5% deviation relative to DFT. Force fields derived with a focus on vibrational properties for MOFs, like VMOF25, have also been reported. While the VMOF model reproduced optimized lattice parameters and phonon density of states in good agreement with DFT, important phonon-derived properties like the bulk modulus obtained with VMOF underestimated DFT predictions by more than 50% even for standard MOFs like UiO-66 and MOF-525. These limitations of classical force fields arise from the vast combinations of possible frameworks in the MOF chemical space, which make parameter optimization and the selection of a functional form extremely challenging.
Recently, several neural network-based machine learning potentials (MLPs)26,27 and on-the-fly MLPs, such as kernel-based potentials in the Vienna Ab initio Simulation Package28,29,30,31 (VASP MLPs) and Moment Tensor Potentials (MTP)32 reported in23, have emerged that produce vibrational properties with sufficient accuracy relative to DFT for MOFs. However, such on-the-fly MLPs are restricted to the specific MOF included in the training set, and the need for continuous regeneration and retraining on DFT data to incorporate new configurations renders them impractical for high-throughput screening of MOFs. Therefore, the need for a ready-to-use transferable model for screening dynamical properties in MOFs motivates the development of new MLPs.
MLPs like the MACE foundation model (MACE-MP-0)33, which utilizes the MACE architecture34 of an equivariant message-passing graph tensor network with many-body information of atomic features encoded in each layer, have been tested on MOFs. Like several MLPs and force fields that are trained on MOF building blocks22,25,26 rather than on whole MOFs because of their large chemical space and unit cell size, the MACE-MP-0 model was trained on the MPtrj dataset of 150k inorganic crystals35. MACE-MP-0 demonstrated high accuracy in predicting the potential energy surface, with a root-mean-squared error of 33 meV/atom in energies relative to DFT for the 20k MOFs in the QMOF database36,37. The foundation model thus demonstrates transferability by capturing some of the complex interactions in MOFs, which motivates investigating and further improving its accuracy for phonons and derived properties in MOFs.
In this work, we introduce MACE-MP-MOF0, a highly accurate model derived from MACE-MP-0b (medium model), fine-tuned on a high-quality dataset of 127 representative MOFs. The MACE-MP-0b model is a slightly modified version of the original MACE-MP-0 (which was released in April 2024) that addresses a few shortcomings of the original model, such as short-distance collapse, by adding a ZBL potential (at the time of writing, a new version, MACE-MP-0c, was released). The model is evaluated for key properties derived from phonons in well-known and representative MOFs, such as bulk moduli and thermal expansions, within the quasi-harmonic approximation, allowing for a comprehensive comparison with previous models and experimental data reported in the literature. MACE-MP-MOF0 accurately reproduces experimentally observable phenomena, such as negative thermal expansion in MOFs, demonstrating its applicability beyond computational predictions. In addition to analyzing the performance of the model on MOFs in the curated dataset, we test its transferability on other well-known MOFs not seen by the model during training and find excellent agreement in the predicted bulk modulus with DFT and experiments. The presented model is therefore ready to use for high-throughput phonon calculations of MOFs, and future generations of the model aim to incorporate more chemically diverse MOFs.
Results
Dataset curation
To date, tens of thousands of MOFs have been synthesized38, and countless more can be designed39 due to the vast diversity of their chemical building blocks, making the selection of an appropriate training set particularly challenging. In this work, in addition to studying 19 prototypical and experimentally widely studied MOFs for gas storage, catalysis and defect engineering (see Supplementary Tables 1 and 2), we expand our dataset by curating 108 more structures from the 20,375 diverse MOFs in the QMOF database36,37.
For sampling these 108 structures, we considered predominantly non-spin-polarized MOFs with pore limiting diameters (PLDs) greater than 3.6 Å, using nitrogen as a probe molecule. MACE34 descriptors hold a large amount of information regarding the atomic features of MOFs and hence were used to sample 100 diverse MOFs. These MOFs were selected to span all 7 crystal systems and diverse bonding interactions. As shown in Fig. 1c, d, the resulting dataset consists of 127 representative MOFs spread over a wide range of 24 elements in the inorganic clusters and organic ligands. Figure 1a, b shows the MACE descriptor space diversity for the curated 127 MOFs (refer to Supplementary Fig. 1 for the main features of the sampled MOFs).
Diversity in the MACE descriptors of the (a) linkers and (b) nodes of the curated 127 MOFs, estimated using the Gaussian kernel density estimation method. (c) The symmetry distribution of the dataset. (d) The heat map of elemental counts for the atoms included in the dataset. Blank elements in the heat map are not included in the dataset.
On this curated dataset of 127 MOFs, we used three strategies to generate DFT data points for fine-tuning: i) molecular dynamics simulations (an NPT ensemble using MACE-MP-0b+D3), from which frames were selected with a farthest point sampling (FPS) approach to maximize the spread in descriptor space (refer to the SI), ii) strained configurations generated by expanding and compressing unit cells using an equation of state approach, and iii) geometry optimization trajectories of various structures, from which a relatively small number of frames (up to 10, if available) were retained using the FPS approach (refer to the “Methods” section and Supplementary Note 1 for further details). These calculations result in a total of 4764 DFT data points, split into 85% for training and 7.5% each for testing and validation. We have fine-tuned two models, MACE-MP-MOF0 and MACE-MP-MOF0-v2, which differ only in the way the data was split into train, test and validation sets: the former uses a random split and the latter an FPS-based split. The goal of comparing the two versions of the fine-tuned model is to show that their performance is comparable regardless of which reference configurations from the curated dataset end up in the training set.
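The farthest point sampling step can be illustrated with a minimal sketch. It assumes Euclidean distances between descriptor vectors and uses a toy two-dimensional descriptor set; the function name, the starting index and the data are hypothetical and are not taken from the workflow used in this work.

```python
import math

def farthest_point_sampling(points, k, start=0):
    """Greedy FPS: repeatedly pick the point farthest from those already chosen."""
    selected = [start]
    # Distance from every point to its nearest selected point so far.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < min(k, len(points)):
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected

# Toy 2D "descriptors" (hypothetical): a tight cluster plus two outliers.
descriptors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (-4.0, 3.0)]
chosen = farthest_point_sampling(descriptors, k=3)
print(chosen)
```

With this toy input the two outliers are selected before any second member of the tight cluster, which is the behaviour that makes FPS useful for spreading a small number of frames across descriptor space.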
Phonon workflow
As illustrated in Fig. 2, the phonon workflow for MACE-MP-MOF0 begins with a full cell relaxation, unconstrained by the symmetry of the input configuration. This step is crucial for ensuring the model’s applicability in screening scenarios, particularly when working with MOFs whose stable symmetry configurations are unknown. The full cell relaxation is performed with ASE’s40 L-BFGS optimizer in combination with the FrechetCellFilter until the max force component is ≤ \(10^{-6}\) eV/Å. The search for the equilibrium structure is stopped once any remaining negative (imaginary) phonon frequencies have magnitudes ≤ \(10^{-4}\) THz. The process for eliminating spurious imaginary modes that are larger than this threshold is discussed in the Methods section. The symmetry of the equilibrium structure is determined using Pymatgen’s41 space group analyzer with a symmetry search tolerance of \(10^{-5}\) Å, which determines the number of symmetry-inequivalent displacements.
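The stopping rule on imaginary modes can be written as a short check. This is a sketch of one reading of the criterion, assuming that imaginary modes are reported as negative frequencies in THz (as phonon codes commonly do); the function name and the example frequencies are hypothetical.

```python
def is_dynamically_converged(frequencies_thz, tol_thz=1e-4):
    """True when no negative (imaginary) frequency exceeds the tolerance in magnitude."""
    return all(f >= -tol_thz for f in frequencies_thz)

# Hypothetical frequency lists in THz, for illustration only.
noise_only = [-5e-5, 0.0, 0.0, 1.2, 3.4]   # numerical noise in the acoustic modes
unstable = [-0.3, 0.0, 0.0, 1.2, 3.4]      # a genuine soft mode above the threshold

print(is_dynamically_converged(noise_only))
print(is_dynamically_converged(unstable))
```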
This work aims to quantitatively compare MACE-MP-MOF0 with models based on various levels of theory reported in the literature. To ensure consistency across all levels of theory, the Finite Difference (FD) approach is chosen for phonon calculations. Displacements of 0.001 Å are applied to the atoms in the 2 × 2 × 2 supercell to obtain the force constants with the FD approach.
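The finite-difference idea can be illustrated on a toy system. The sketch below assumes a periodic one-dimensional chain of identical atoms joined by harmonic springs, which is not a MOF and is chosen only because its force constants and dispersion are known analytically. The spring constant, chain length and unit mass are hypothetical; only the displacement amplitude mirrors the value quoted above.

```python
import math

K = 2.0        # spring constant (arbitrary units), hypothetical
N = 8          # number of atoms in the periodic chain, hypothetical
DELTA = 1e-3   # displacement amplitude, mirroring the 0.001 Å used in the text

def forces(u):
    """Forces on a periodic nearest-neighbour harmonic chain for displacements u."""
    n = len(u)
    return [K * (u[(i + 1) % n] - 2.0 * u[i] + u[(i - 1) % n]) for i in range(n)]

def force_constants(delta=DELTA):
    """phi[i][j] = -dF_i/du_j, estimated with central finite differences."""
    phi = [[0.0] * N for _ in range(N)]
    for j in range(N):
        plus = [0.0] * N
        minus = [0.0] * N
        plus[j] = delta
        minus[j] = -delta
        f_plus, f_minus = forces(plus), forces(minus)
        for i in range(N):
            phi[i][j] = -(f_plus[i] - f_minus[i]) / (2.0 * delta)
    return phi

def frequency(phi, n, mass=1.0):
    """Angular frequency at wavevector index n from the Fourier-transformed force constants."""
    d = sum(phi[0][j] * math.cos(2.0 * math.pi * n * j / N) for j in range(N)) / mass
    return math.sqrt(max(d, 0.0))

phi = force_constants()
print(phi[0][0], phi[0][1], frequency(phi, N // 2))
```

For this chain the on-site force constant is 2K, the nearest-neighbour one is −K, and the zone-boundary frequency is 2√(K/m), so the finite-difference result can be checked directly against the analytic values.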
The Harmonic Approximation (HA) for computing phonons fails to capture essential physical properties of MOFs such as thermal expansion, phase transitions and elastic moduli42. The Quasi-Harmonic Approximation (QHA), on the other hand, is an extension of the HA that aims to capture anharmonicity by introducing a volume dependence of the frequencies to account for temperature and pressure effects. In this work, we use the Phonopy package43 version 2.29.0 to study thermal and mechanical properties of MOFs by studying the lattice dynamics under the QHA. An 11 × 11 × 11 k-point grid, beyond the Γ point, is consistently used for all phonon calculations, which is essential to accurately capture heat transport properties.
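For reference, the standard textbook form of the quasi-harmonic free energy at zero pressure is given below. The notation is an assumption introduced here and does not appear in the original text: \(E_0(V)\) is the static lattice energy at volume \(V\), \(\omega_{\mathbf{q}\nu}(V)\) are the volume-dependent phonon frequencies, and \(N_q\) is the number of sampled wavevectors.

```latex
F(V,T) = E_0(V)
  + \frac{1}{N_q}\sum_{\mathbf{q},\nu}
    \left[ \frac{\hbar\omega_{\mathbf{q}\nu}(V)}{2}
    + k_B T \ln\!\left(1 - e^{-\hbar\omega_{\mathbf{q}\nu}(V)/k_B T}\right) \right],
\qquad
V(T) = \arg\min_{V} F(V,T)
```

The equilibrium volume at each temperature follows from minimizing \(F\) over the sampled volumes, which is how a temperature-dependent volume, and hence thermal expansion, emerges from harmonic calculations at several volumes.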
Overview of the investigated systems and potentials for benchmarking
In this work, we benchmark the MACE-MP-MOF0 and MACE-MP-MOF0-v2 models against DFT, extended tight-binding DFT, MLPs and experiments. Among MLPs, we cover a wide range of models, from MACE-MP-0b (with D3 dispersion) to MTP and VASP MLP for MOFs23. For extended tight-binding DFT, we use the GFN1-xTB model17. The GFN1-xTB model was built with a focus on the computation of molecular geometries, vibrational frequencies, and non-covalent interaction energies, which are all important for this study of MOFs (Table 1).
The MOFs chosen for benchmarking the phonons calculated with the MACE-MP-MOF0 models are widely studied, which enables a thorough comparison of performance across the aforementioned models and experiments reported in the literature. The investigated systems of MOF-5 (Zn), UiO-66 (Zr), MOF-74 (Zn) and MIL-53 (Al) are diverse in their nodes, linkers and topology (refer to Table 2). In this work, we study the large-pore configuration of the flexible framework MIL-53 (Al). The number of reference configurations of these MOFs in the training sets of the two versions of the MACE-MP-MOF0 model helps analyze the transferability of the model and the dependence of its performance on the choice and size of the training set. In addition to the primary benchmarking of phonon density of states and band structures for these four MOFs, the phonon-derived bulk modulus is benchmarked for several other well-known MOFs from the literature that are not part of the curated dataset, to demonstrate the transferability of the MACE-MP-MOF0 model.
Benchmarking prediction of unit cell parameters of equilibrium structures
The first crucial step of benchmarking phonons is to accurately capture the equilibrium structure of the MOFs. Predictions of both MACE-MP-MOF0 and MACE-MP-MOF0-v2 show excellent agreement with DFT with a largest deviation of 1.02% as shown in Fig. 3. The MACE-MP-0b deviations are much larger, with the largest deviation of 10%.
Comparison of percentage deviations in force field predictions of average equilibrium unit cell lengths \(L_{\mathrm{FF}}\) relative to DFT \(L_{\mathrm{DFT}}\), where FF stands for the different models represented by bars of their respective colors, for the investigated systems: (a) full range, (b) zoomed for better visualization of small deviations.
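The deviation metric in this comparison can be sketched as follows, assuming the signed convention 100 × (L_FF − L_DFT) / L_DFT; the sign convention and the numerical values are assumptions for illustration and are not taken from this work.

```python
def percent_deviation(l_ff, l_dft):
    """Signed percentage deviation of a model cell length from the DFT reference."""
    return 100.0 * (l_ff - l_dft) / l_dft

# Hypothetical cell lengths in Å, for illustration only.
l_dft = 26.00
l_ff = 26.13
print(f"{percent_deviation(l_ff, l_dft):+.2f}%")
```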
As shown in Table 2, MACE-MP-MOF0 and MACE-MP-MOF0-v2 also predict the right space group for all MOFs while GFN1-xTB and MACE-MP-0b transform the unit cell to different space groups after a full cell relaxation. Even though GFN1-xTB produces deviations ≤ 2% which are in good agreement with DFT, the minor distortions in unit cell lengths and angles shown in Table 2, cause the identification of a different symmetry. As a result, obtaining phonons with GFN1-xTB for a highly symmetric structure like MOF-5 incurs a 17-fold increase in computational cost compared to the MACE-MP-MOF0 models. Symmetrizing the optimized unit cell by loosening the symmetry search tolerance does not eliminate the distortions and does not achieve the true symmetry with GFN1-xTB (see SI). Such symmetrizing procedures are not needed with the MACE-MP-MOF0 models, which enables a full cell relaxation. Hence, in this work, the equilibrium structure is obtained by allowing the relaxation of only atomic positions with GFN1-xTB for phonon calculations.
Benchmarking predictions of forces, energies and stress
As phonon dispersion curves and density of states are obtained from force constants, an accurate description of interatomic forces in MOFs is necessary. Table 3 shows the performance of both MACE-MP-MOF0 and MACE-MP-MOF0-v2 in obtaining the forces, energies and stress on their respective test sets. A linear fit with an \(R^2\) score ≥ 0.999 is obtained for forces, energies and stress with MACE-MP-MOF0 and MACE-MP-MOF0-v2 relative to DFT for their respective training, test and validation sets (refer to Supplementary Fig. 2).
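The error metrics used in this kind of benchmark can be sketched in a few lines. The sketch assumes the coefficient of determination is evaluated directly between predictions and references, which may differ from the exact fitting procedure behind the reported scores; the data are hypothetical.

```python
import math

def mae(pred, ref):
    """Mean absolute error."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)

def rmse(pred, ref):
    """Root-mean-squared error."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))

def r2_score(pred, ref):
    """Coefficient of determination of the predictions against the references."""
    mean_ref = sum(ref) / len(ref)
    ss_res = sum((r - p) ** 2 for p, r in zip(pred, ref))
    ss_tot = sum((r - mean_ref) ** 2 for r in ref)
    return 1.0 - ss_res / ss_tot

# Hypothetical force components in eV/Å, for illustration only.
f_dft = [-1.0, 0.0, 1.0, 2.0]
f_mlp = [-0.9, 0.1, 0.9, 2.1]
print(mae(f_mlp, f_dft), rmse(f_mlp, f_dft), r2_score(f_mlp, f_dft))
```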
The analysis of element-wise mean absolute errors (MAE) in energies relative to DFT for the MOFs in the QMOF database containing the 24 elements in the curated dataset reveals that the MACE-MP-0b MAE is 80% larger than the MACE-MP-MOF0-v2 MAE of 0.016 eV/atom (see SI). The MACE-MP-MOF0-v2 model shows a significant improvement of 50% in the energy MAE for MOFs with metal nodes like Zn, Al and Mg (which constitute a large set in the QMOF database) and for Hf- and Zr-based MOFs (which are widely studied due to their importance in defect engineering of MOFs44). In this subset of the QMOF database, more than 75% of MOFs show improved predictions, with the remaining 25% primarily coming from the Cu-MOFs. The energy MAEs for integral elements of organic linkers like C, H and O, which are common to all MOFs, are also lowered by approximately 36% with MACE-MP-MOF0-v2. Hence, overall we achieve a significant improvement in the MAE for the majority of the 24 elements in the curated MOFs, which cover 60% of the QMOF database, by including only a small number of their reference configurations in the training dataset.
In addition to benchmarking the MACE-MP-MOF0 models on the curated dataset, we evaluate their performance on forces and energies for a set of 70 more diverse MOFs unseen by the models during training (see Supplementary Table 4 for structure details). As shown in Table 4, the MACE-MP-MOF0 models exhibit energy RMSDs relative to DFT that are five times lower than those of MACE-MP-0b, while also delivering 30% more accurate forces and stresses for these out-of-sample MOFs. We observe the deviations on out-of-sample MOFs to be 10 times the RMSDs of the MACE-MP-MOF0 models on the curated dataset in Table 3, which demonstrates the significant improvement that can be achieved by adding a few reference configurations of MOFs to the training DFT data.
Benchmarking vibrational properties
After benchmarking forces, energies and stresses, we quantitatively analyze the derived phonon density of states and band structures. In Fig. 4, phonon density of states (DOS) data is shown up to 20 THz (see Supplementary Fig. 4 for the full range of predicted frequencies). The MACE-MP-MOF0 and MACE-MP-MOF0-v2 models are in very good agreement with the DFT DOS, even well above the low-frequency range (≤ 6 THz). We were able to eliminate all spurious imaginary modes with the MACE-MP-MOF0 models to obtain physically meaningful frequencies. While GFN1-xTB and MACE-MP-0b capture certain aspects of the DOS curves accurately, they fail to accurately capture the vibrational properties of even the most prototypical MOFs, such as MOF-5 and UiO-66.
Density of states (DOS) in the frequency range (≤ 20 THz) for the investigated MOFs, (a) MOF-5, (b) UiO-66, (c) MOF-74 and (d) MIL-53, obtained with the linear tetrahedron method with a pitch of 0.01 THz, as predicted by the respective methods and overlaid on DFT data. DFT DOS were calculated from force constants available in23,62 on an 11 × 11 × 11 mesh.
Table 5 shows the performance of MACE-MP-MOF0 and MACE-MP-MOF0-v2 in accurately obtaining the phonon frequencies for the full frequency range over the entire Brillouin zone relative to DFT. The low errors over the full mesh ensure an accurate prediction of mechanical stability and heat transport properties, as they depend heavily on off-Γ phonons45. For direct comparison with the phonon frequencies reported at Γ only with MTP and VASP MLP in23, we also report the Γ-only predictions with our models in Table 5. A Γ-only comparison reveals that the MACE-MP-MOF0 models achieve better or comparable performance for MOF-5, UiO-66 and MIL-53 than MTP and VASP MLP, while the errors are larger for MOF-74. It is important to emphasize that achieving this level of accuracy with such on-the-fly MLPs requires generating extensive DFT datasets tailored to specific MOFs. In contrast, the MACE-MP-MOF0 training set had 10 to 100 times fewer reference configurations for the respective MOFs, as indicated in Table 1, than the training set for these on-the-fly MLPs23. The DFT data used here for benchmarking phonons, as well as the training data for the MTP and VASP MLP, were generated in23 with extremely tight convergence criteria of ENCUT = 900 eV and EDIFF = \(1 \times 10^{-8}\) eV, with a max force component ≤ \(1 \times 10^{-3}\) eV/Å; these settings are necessary to obtain accurate phonons but render the DFT calculations very expensive. By comparison, the training DFT data generated in this work for the MACE-MP-MOF0 models used significantly cheaper convergence criteria of ENCUT = 520 eV, EDIFF = \(1 \times 10^{-5}\) eV and EDIFFG = −0.02 eV/Å (refer to the “Methods” section for details). This demonstrates that the fine-tuned transferable MLP, MACE-MP-MOF0, can achieve the same high-quality ab initio description as other accurate machine-learned force field potentials for MOFs23,27,46, but with significantly reduced model training and DFT data generation costs.
The similar errors obtained with MACE-MP-MOF0 and MACE-MP-MOF0-v2 in Table 5 show that the different reference configurations of the investigated systems in their respective training sets do not significantly affect the performance of the models. Since the two models exhibit similar frequency errors, we use the slightly better performing MACE-MP-MOF0-v2 model for these MOFs to show the close overlap between the band structures predicted with DFT and MACE-MP-MOF0-v2 in Fig. 5. Due to the large number of bands in MOFs, the band structures in Fig. 5 are restricted to the low-frequency region.
Benchmarking of mechanical properties
The bulk modulus serves as a sensitive and computationally efficient metric for assessing phonon accuracy, as both phonon frequencies and the bulk modulus arise from second derivatives of the energy: phonons with respect to atomic displacements, and the bulk modulus with respect to volume. Since both properties share similar force accuracy requirements, an accurate bulk modulus indicates that the elastic response and interatomic potential curvature are well captured, suggesting reasonable phonon predictions. Hence, in this section, we analyze mechanical properties, such as the bulk modulus, for additional widely studied MOFs beyond the four systems investigated, comparing them to experimental and DFT data reported in the literature. The bulk modulus is obtained by fitting the Birch-Murnaghan equation of state to energy-volume data obtained by straining the cell by ±2%. From Fig. 6, we observe that MACE-MP-MOF0 and MACE-MP-MOF0-v2 are able to qualitatively capture bulk moduli trends, compared to DFT and experimental data, as well as quantitatively reproduce the values with minor deviations.
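The link between the energy-volume curve and the bulk modulus can be illustrated with the third-order Birch-Murnaghan expression. The sketch below does not perform a least-squares fit; as a simpler stand-in, it evaluates the relation B = V d²E/dV² at the equilibrium volume by central differences over a ±2% volumetric strain (reading the strain as volumetric is an assumption). All parameter values are hypothetical.

```python
EV_PER_A3_TO_GPA = 160.21766208  # 1 eV/Å^3 expressed in GPa

def birch_murnaghan_energy(v, e0, v0, b0, b0_prime):
    """Third-order Birch-Murnaghan E(V); b0 in eV/Å^3, volumes in Å^3."""
    eta = (v0 / v) ** (2.0 / 3.0)
    return e0 + 9.0 * v0 * b0 / 16.0 * (
        (eta - 1.0) ** 3 * b0_prime + (eta - 1.0) ** 2 * (6.0 - 4.0 * eta)
    )

def bulk_modulus_fd(energy, v0, strain=0.02):
    """B = V * d2E/dV2 at v0, estimated by central differences over +/- strain."""
    dv = strain * v0
    d2e = (energy(v0 + dv) - 2.0 * energy(v0) + energy(v0 - dv)) / dv ** 2
    return v0 * d2e

# Hypothetical equation-of-state parameters, for illustration only.
E0, V0, B0_GPA, B0_PRIME = -500.0, 1000.0, 15.0, 4.0
b0 = B0_GPA / EV_PER_A3_TO_GPA

def energy(v):
    return birch_murnaghan_energy(v, E0, V0, b0, B0_PRIME)

b_est_gpa = bulk_modulus_fd(energy, V0) * EV_PER_A3_TO_GPA
print(f"recovered bulk modulus: {b_est_gpa:.2f} GPa")
```

A full fit of the equation of state to several energy-volume points is more robust against noise than this three-point estimate; the sketch only shows why a narrow strain window around equilibrium is sufficient to probe the curvature.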
The deviations in bulk modulus with the MACE-MP-MOF0 versions relative to DFT are the largest for MIL-125 in Fig. 6b, which, notably, is the only Ti-based MOF in the curated dataset. We hypothesize that the deviation is due to insufficient training data for Ti-based MOFs. Serendipitously, the MACE-MP-MOF0 and MACE-MP-MOF0-v2 predictions of bulk moduli actually perform better than the explicit DFT data, relative to experiments. The DFT data include vdW corrections, which tend to overestimate the long-range interactions in MOFs and therefore produce higher bulk moduli than experiments47. Although the MACE-MP-MOF0 model was trained on DFT data with vdW corrections included, these long-range interactions are slightly underestimated in the model relative to DFT, which moderates the bulk moduli predictions. Hence, the small RMSDs obtained for phonon frequencies with MACE-MP-MOF0-v2 and MACE-MP-MOF0 translate to low deviations in phonon-derived properties such as the bulk modulus.
The agreement between experiments, DFT and the MACE-MP-MOF0 models in the bulk modulus of several diverse out-of-sample, well-known MOFs like ZIF-4, ZIF-8, UiO-66-Ce and UiO-66-Hf, and of promising MOFs for direct air capture like XEDPON, SUSZOW and XEXMEU48, demonstrates the transferability and accuracy of the model in obtaining the bulk modulus of MOFs unseen by the model during training.
Negative thermal expansion in MOFs
Finally, we evaluated our model on negative thermal expansion (NTE). Experimental data on NTE for MOFs is limited, making it challenging to benchmark against a wide range of models and structures. In addition, obtaining NTE with DFT is computationally expensive due to the quasi-harmonic phonon calculations required for MOFs. To address this challenge, in this work, we focused our NTE analysis on MOF-5 and UiO-66 as they are widely studied in experiments, and their high symmetry and isotropic nature reduce the cost of such expensive computations.
Comparing DFT and MACE-MP-MOF0-v2 predictions to experimental data is challenging, as factors such as pressure, cooling or heating rates, as well as defect concentration, can vary significantly in experiments, influencing the recorded NTE49,50. For example, in50, the authors observed that, while the equilibrium UiO-66 intrinsically shows a coefficient of thermal expansion (CTE) of \(-35 \times 10^{-6}\,\mathrm{K}^{-1}\), a wide range of CTE from \(+45\) to \(-80 \times 10^{-6}\,\mathrm{K}^{-1}\) is recorded depending on the rate of thermal treatment in UiO-66, suggesting a re-evaluation of previous experimental reports of NTE in MOFs.
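A coefficient of thermal expansion can be extracted from a computed volume-temperature curve as sketched below. The sketch assumes the volumetric definition α_V = (1/V) dV/dT evaluated by central differences; whether reported values are volumetric or linear depends on the source. The temperature grid, reference volume and expansion coefficient are hypothetical.

```python
import math

def volumetric_cte(temperatures, volumes):
    """alpha_V(T_i) = (1/V_i) * dV/dT via central differences at interior points."""
    alphas = []
    for i in range(1, len(temperatures) - 1):
        dv_dt = (volumes[i + 1] - volumes[i - 1]) / (temperatures[i + 1] - temperatures[i - 1])
        alphas.append(dv_dt / volumes[i])
    return alphas

# Hypothetical V(T) curve with a constant negative expansion coefficient.
ALPHA_TRUE = -30e-6                     # K^-1, for illustration only
temps = [100.0 + 50.0 * i for i in range(7)]
vols = [17000.0 * math.exp(ALPHA_TRUE * t) for t in temps]

alphas = volumetric_cte(temps, vols)
print([f"{a:.2e}" for a in alphas])
```

A negative value at every interior temperature corresponds to a cell that contracts on heating, which is the signature of NTE.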
Based on the available data, we find that MACE-MP-MOF0-v2 predictions are in very good agreement with DFT and other computational models and qualitatively capture experimental NTE trends, such as the larger NTE observed for UiO-66 compared to MOF-5 (see Table 6).
Discussion
Benchmarking the vibrational properties of the investigated MOFs highlights the limitations of MACE-MP-0b and GFN1-xTB, while demonstrating the excellent agreement of the MACE-MP-MOF0 models with DFT in accurately capturing the full phonon vibrational spectra. We further rationalize the performance of these models by analyzing their ability to accurately capture the atomic positions and interactions in the equilibrium structure, shown in Table 7 for MOF-74. An accurate description of atomic positions in the MOF provides a better description of forces, which ultimately correlates with agreement with the partial phonon density of states (PDOS) obtained with DFT in Fig. 7.
Figure 4 for MOF-74 shows that the GFN1-xTB method is in good agreement with DFT for frequencies lower than 6 THz, above which the errors significantly increase. The overlap is better in the lower frequency range because this region is dominated by the vibrations of the heavy Zn atoms, which have low RMSDs in Table 7. The GFN1-xTB and MACE-MP-0b errors in the atomic positions of C and O, which dominate the vibrations beyond 6 THz, are ten to a hundred times larger than the MACE-MP-MOF0 errors. The greater flexibility and higher rotational degrees of freedom of these atoms in the organic linkers are more difficult to capture than the surrounding metal nodes17. Therefore, the GFN1-xTB method accurately captures the vibrations of the metal nodes, but it inadequately represents the interactions between atoms in the organic linkers for phonon calculations. As the MACE-MP-MOF0 models consistently achieve ten to a thousand times lower errors than GFN1-xTB and MACE-MP-0b for all the elements present in the MOFs, we obtain high agreement between DFT and MACE-MP-MOF0 model predictions throughout the phonon spectra (see Supplementary Table 5 for the analysis of other MOFs). MACE-MP-MOF0 demonstrates a significant improvement in capturing the covalent interactions in MOFs, addressing the limitations of MACE-MP-0b and GFN1-xTB, which primarily capture non-covalent interactions. This enhancement positions MACE-MP-MOF0 as a promising tool for systems beyond MOFs where covalent interactions dominate, such as covalent organic frameworks51.
In summary, in this work, we present a fine-tuned MLP for MOFs, MACE-MP-MOF0, that can be used to produce ab initio quality phonons in a high-throughput way. The model was trained on a representative dataset of 127 MOFs selected by efficiently sampling the phase space based on MACE descriptors which significantly reduced the computational efforts in generating DFT reference data. The MACE-MP-MOF0 model predicts phonon-derived bulk modulus in excellent agreement with experiments and DFT on MOFs unseen by the model, demonstrating its transferability.
The MACE-MP-MOF0 model presented here is also 50% faster than the MACE-MP-0b foundation model (with dispersion corrections included) and 10 times more accurate in obtaining optimized geometries, forces, energies and stresses, which are crucial for obtaining accurate vibrational properties. Additionally, the MACE-MP-MOF0 model is 90% faster than GFN1-xTB for large MOFs (500 atoms per unit cell) (see SI). The computational efficiency and accuracy of the model make it an excellent candidate for screening phonon-derived properties, such as the bulk modulus, in MOFs, providing a quantitative description that aligns closely with experimental and DFT results. The model can calculate phonons within the quasi-harmonic approximation to obtain ab initio-level thermal expansions, which would otherwise be computationally expensive to obtain using DFT for MOFs.
Through the preliminary analysis of thermal expansion in MOF-5 and UiO-66 with MACE-MP-MOF0 and DFT, we conclude that the quasi-harmonic approximation does not provide a sufficient description of anharmonic effects in MOFs and is an area with scope for improvement in the computational study of vibrational properties of MOFs. While MACE-MP-MOF0 accurately predicts phonons in the majority of the frequency range (0–60 THz), which dominates most of the physical properties of interest, further improvement is possible in the high-frequency range (90–100 THz, which corresponds to the vibrations of H atoms in the linkers), as shown in Fig. 8. We would also like to highlight that while the current model covers 60% of MOFs in the QMOF database that span the same chemical space of metal nodes as the 127 MOFs, the sampled MOFs mostly contain closed-shell metal ions, which avoids electronic spin degrees of freedom. In MOFs with magnetic elements, having training data of atomic configurations with different spin states in the same MOF can lead to poor training of the MLP. HKUST-1, a well-known Cu-based MOF, is the only MOF in the curated dataset with a magnetic element, which contributes to the higher element-wise MAE obtained with MACE-MP-MOF0 for Cu (refer to the SI). In addition to being ready to use for a wide range of metallic nodes, organic linkers, and topologies, the model presented here is also significantly easier to re-parameterize for new species than traditional force fields, simply by including a few reference configurations of the MOF in the DFT training set.
In conclusion, the presented model provides a platform for performing high-throughput calculations in an efficient manner to guide the design and synthesis of MOFs for complex dynamical properties. The reported data motivates further development of MLPs that are easy-to-train and transferable to replace extremely expensive ab initio methods for the analysis of lattice dynamics in MOFs.
Methods
DFT calculations
The DFT computations for generating the dataset were performed using VASP28,29,30,31 and the atomate2 software52. Since MACE-MP-0b was trained on the MPtrj dataset35, we ensured consistency by using the same parameters and pseudopotentials to produce our DFT training dataset. The Perdew-Burke-Ernzerhof (PBE) form of the generalized gradient approximation (GGA) exchange-correlation functional53 was used along with the Projector Augmented Wave (PAW) method (version 54)54. A D3(BJ) van der Waals correction55,56 was included for all MOFs, which eliminated some spurious phonon instabilities. It has been shown in ref. 57 that different functionals perform similarly in structure prediction when applied to chemically diverse MOFs for screening purposes. Hence, the chosen PBE-D3(BJ) functional offers computational consistency and convenience for the diverse curated dataset. Including the dispersion correction in the DFT training data, instead of adding dispersion on top of the MACE-MP-0b model, increased the speed of the model by a factor of 1.5 (see SI).
Static DFT data were produced with an energy convergence criterion of 10⁻⁵ eV. The cutoff energy for the plane-wave basis set was set to 520 eV. Geometry optimization was performed with DFT on deformed configurations of the curated MOFs to produce trajectories, with a force convergence criterion of EDIFFG = −0.02 eV/Å. DFT data for the equation of state (EOS) were generated by computing the energies of six isotropic deformations, with linear deformations ranging from −10% to 10%, using the Birch-Murnaghan EOS.
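The deformation grid and the EOS named above can be illustrated with a minimal sketch. It assumes that the six linear strains are evenly spaced between −10% and 10% (the text does not state the spacing) and uses the standard third-order Birch-Murnaghan energy expression; the function names and the numerical values of E0, V0, B0 and B0' are hypothetical placeholders, not values from this work.

```python
def linear_strains(n=6, lo=-0.10, hi=0.10):
    """Evenly spaced linear strains (assumption: the spacing is not given in the text)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def deformed_volume(v0, strain):
    """Volume after an isotropic deformation: each lattice vector is scaled by (1 + strain)."""
    return v0 * (1.0 + strain) ** 3


def birch_murnaghan_energy(v, e0, v0, b0, b0_prime):
    """Third-order Birch-Murnaghan E(V); b0 must be in energy/volume units (e.g. eV/A^3)."""
    eta = (v0 / v) ** (2.0 / 3.0)
    return e0 + 9.0 * v0 * b0 / 16.0 * (
        (eta - 1.0) ** 3 * b0_prime + (eta - 1.0) ** 2 * (6.0 - 4.0 * eta)
    )


# Hypothetical parameters, chosen only to exercise the functions.
E0, V0, B0, B0_PRIME = -100.0, 1000.0, 0.1, 4.0

strains = linear_strains()
volumes = [deformed_volume(V0, s) for s in strains]
energies = [birch_murnaghan_energy(v, E0, V0, B0, B0_PRIME) for v in volumes]

for s, v, e in zip(strains, volumes, energies):
    print(f"strain {s:+.2f}  volume {v:9.2f}  energy {e:10.4f}")
```

In practice the energies would come from DFT or the MLP, and E0, V0, B0 and B0' would be obtained by fitting this expression to the computed energy-volume points.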
Improving accuracy of MLP via fine-tuning
Fine-tuning was carried out on our curated dataset using the multi-head approach implemented in the MACE code from version 0.3.7 onwards. We used MACE-MP-0b as one head and our PBE+D3(BJ)53,54,55,56 calculations as the other head. The ratio of the loss-function weights for energy, forces and stresses was 1:10:100, and the MLP was trained for 2500 epochs. It is important to note that during the training process, MACE employs a farthest point sampling (FPS) approach to select configurations from the original MPtrj training set35 and adds them to our data. Table 3 shows the final convergence results of the fine-tuning for the two models trained, MACE-MP-MOF0 and MACE-MP-MOF0-v2. Due to the relatively large number of atoms per MOF in our training set, training was carried out on 16 NVIDIA A100 GPUs with 40 GiB of memory each. The model hyperparameters are inherited from the original MACE-MP-0b model (L=1, r_max = 6.0, hidden irreps 128 × 0e + 128 × 1o, 2 layers, each with correlation order 3 (body order 4) and spherical harmonics up to l=3; see ref. 33 for the full list). The hyperparameters specific to fine-tuning are: learning rate 0.005, EMA decay 0.995, batch size 8.
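To illustrate the idea behind FPS-based selection, the following is a generic greedy farthest point sampling sketch. It is not MACE's implementation: the descriptor vectors, the Euclidean metric and the function name are assumptions made for illustration only, and the text does not specify which descriptors MACE uses when selecting configurations from MPtrj.

```python
import math


def farthest_point_sampling(points, n_select, start=0):
    """Greedy FPS: repeatedly pick the point farthest from the already selected set."""
    if not 0 < n_select <= len(points):
        raise ValueError("n_select must be between 1 and len(points)")
    selected = [start]
    # Distance from every point to its nearest selected point so far.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < n_select:
        # Index of the point with the largest distance to the selected set.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Hypothetical 2D descriptors standing in for per-configuration feature vectors.
descriptors = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
picked = farthest_point_sampling(descriptors, 3)
print(picked)
```

The greedy choice favors configurations that are far apart in descriptor space, so a small selection still spans the diversity of the original set rather than oversampling near-duplicates.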
Dealing with phonon imaginary modes
Imaginary phonon modes are commonly observed for MOFs when the geometry optimization leads to a configuration that does not correspond to a minimum of the potential energy surface. Among the curated dataset, NOTT-300 is a prototypical example of this phenomenon. We tested three methods to eliminate such spurious imaginary modes: i) Mode mapping: the process begins by identifying modes with negative frequencies and generating a structure by displacing the atoms along the imaginary mode. The structure is then geometry-optimized, followed by standard phonon calculations. Sufficiently large displacements are needed to escape the local minimum. This method is convenient for structures with only a few imaginary modes. ii) Structure rattling: this is by far the most brute-force method, but it is surprisingly effective. Here, we used the ASE40,58 rattle implementation, which perturbs atomic positions with random displacements drawn from a normal distribution with a standard deviation of 0.01. It is possible to rattle the full structure or only the atoms involved in the imaginary modes, such as the H atoms that hydroxylate the inorganic center in NOTT-300. Once rattled, the structure is geometry-optimized as usual. iii) Molecular dynamics: in this method, we perform a low-temperature NVE ensemble simulation at T = 7.5 K or 10 K for as few as 40 steps, using either the MACE-MP-0b potential or the fine-tuned MACE-MP-MOF0 potential, and then perform geometry optimization on the last frame of the resulting simulation. Importantly, specific atoms may be fixed in position during the molecular dynamics simulation if needed. All three methods successfully eliminate imaginary modes, with methods ii) and iii) being preferred due to their ease of application and better suitability for structures with a large number of negative frequencies. The process of eliminating imaginary modes is stopped when the remaining negative frequencies become negligible (≤ 10⁻⁴ THz).
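The perturbation steps of methods i) and ii) and the stopping criterion can be sketched in plain Python. This is a standalone illustration, not the ASE rattle routine used in this work: the function names are hypothetical, positions are plain coordinate tuples, and the stopping test reflects our reading that a negative frequency is negligible when its magnitude is at most 10⁻⁴ THz. The geometry optimization and phonon calculation that follow each perturbation are not shown.

```python
import random


def rattle(positions, stdev=0.01, indices=None, seed=0):
    """Return a copy of positions with Gaussian noise added to the selected atoms."""
    rng = random.Random(seed)
    chosen = set(range(len(positions))) if indices is None else set(indices)
    rattled = []
    for i, xyz in enumerate(positions):
        if i in chosen:
            rattled.append(tuple(c + rng.gauss(0.0, stdev) for c in xyz))
        else:
            rattled.append(tuple(xyz))  # atoms outside the selection stay fixed
    return rattled


def displace_along_mode(positions, eigenvector, amplitude):
    """Mode mapping step: shift every atom along its component of the imaginary mode."""
    return [
        tuple(c + amplitude * e for c, e in zip(xyz, vec))
        for xyz, vec in zip(positions, eigenvector)
    ]


def imaginary_modes_negligible(frequencies_thz, tol=1e-4):
    """True when every negative (imaginary) frequency has magnitude <= tol (assumed reading)."""
    return all(f >= -tol for f in frequencies_thz)


# Hypothetical three-atom fragment; only the last atom (e.g. an H atom) is rattled.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
perturbed = rattle(positions, stdev=0.01, indices=[2])
shifted = displace_along_mode([(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], 0.5)

print(perturbed)
print(imaginary_modes_negligible([-5e-5, 0.2, 3.1]))
print(imaginary_modes_negligible([-0.3, 0.2, 3.1]))
```

In a workflow of this kind, the perturbed structure would be re-optimized and the phonons recomputed, repeating until the stopping test passes.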
Data availability
All datasets that were used and/or generated in this work are publicly available in this repository: https://github.com/ddmms/data/tree/main/mace-mof-0.
Code availability
The scripts used to generate the DFT training dataset and the MACE-MP-MOF0 models are provided here: https://github.com/ddmms/data/tree/main/mace-mof-0.
References
Furukawa, H., Cordova, K. E., O’Keeffe, M. & Yaghi, O. M. The chemistry and applications of metal-organic frameworks. Science 341, 1230444 (2013).
Yaghi, O. M. et al. Reticular synthesis and the design of new materials. Nature 423, 705–714 (2003).
Li, H., Eddaoudi, M., O’Keeffe, M. & Yaghi, O. M. Design and synthesis of an exceptionally stable and highly porous metal-organic framework. Nature 402, 276–279 (1999).
Hanikel, N. et al. Evolution of water structures in metal-organic frameworks for improved atmospheric water harvesting. Science 374, 454–459 (2021).
Zheng, Z. et al. High-yield, green and scalable methods for producing MOF-303 for water harvesting from desert air. Nat. Protoc. 18, 136–156 (2023).
Hoffman, A. E. J. et al. The role of phonons in switchable MOFs: a model material perspective. J. Mater. Chem. A 11, 15286–15300 (2023).
Wieser, S. et al. Identifying the bottleneck for heat transport in metal-organic frameworks. Adv. Theory Simul. 4, 2000211 (2021).
Formalik, F., Fischer, M. & Kuchta, B. Correlating phonons and deformations: a method for structural phase transformation analysis in metal–organic frameworks. Cryst. Growth Des. 23, 8962–8971 (2023).
Takenaka, T. et al. Strongly correlated superconductivity in a copper-based metal-organic framework with a perfect kagome lattice. Sci. Adv. 7, eabf3996 (2021).
Kuchta, B., Formalik, F., Rogacka, J., Neimark, A. V. & Firlej, L. Phonons in deformable microporous crystalline solids. Z. für Kristallographie - Crystalline Mater. 234, 513–527 (2019).
Hohenberg, P. & Kohn, W. Inhomogeneous electron gas. Phys. Rev. 136, B864–B871 (1964).
Kohn, W., Becke, A. D. & Parr, R. G. Density functional theory of electronic structure. J. Phys. Chem. 100, 12974–12980 (1996).
Dreizler, R. M. & Gross, E. K. U. Density Functional Theory: An Approach to the Quantum Many-Body Problem. 1st edn, Springer Book Archive (Springer-Verlag Berlin Heidelberg, Berlin, Heidelberg, 1990).
Kamencek, T., Bedoya-Martínez, N. & Zojer, E. Understanding phonon properties in isoreticular metal-organic frameworks from first principles. Phys. Rev. Mater. 3, 116003 (2019).
Gaus, M., Cui, Q. & Elstner, M. DFTB3: Extension of the self-consistent-charge density-functional tight-binding method (SCC-DFTB). J. Chem. Theory Comput. 7, 931–948 (2011).
Grimme, S., Bannwarth, C. & Shushkov, P. A robust and accurate tight-binding quantum chemical method for structures, vibrational frequencies, and noncovalent interactions of large molecular systems parametrized for all spd-block elements (Z = 1–86). J. Chem. Theory Comput. 13, 1989–2009 (2017).
Nurhuda, M., Perry, C. C. & Addicoat, M. A. Performance of GFN1-xTB for periodic optimization of metal organic frameworks. Phys. Chem. Chem. Phys. 24, 10906–10914 (2022).
Chung, Y. G. et al. Advances, updates, and analytics for the computation-ready, experimental metal–organic framework database: CoRE MOF 2019. J. Chem. Eng. Data 64, 5985–5998 (2019).
Chung, Y. G. et al. Computation-ready, experimental metal–organic frameworks: a tool to enable high-throughput screening of nanoporous crystals. Chem. Mater. 26, 6185–6192 (2014).
Rappe, A. K., Casewit, C. J., Colwell, K. S., Goddard, W. A. I. & Skiff, W. M. UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations. J. Am. Chem. Soc. 114, 10024–10035 (1992).
Brooks, B. R. et al. CHARMM: the biomolecular simulation program. J. Computational Chem. 30, 1545–1614 (2009).
Coupry, D. E., Addicoat, M. A. & Heine, T. Extension of the universal force field for metal–organic frameworks. J. Chem. Theory Comput. 12, 5215–5225 (2016).
Wieser, S. & Zojer, E. Machine learned force-fields for an ab-initio quality description of metal-organic frameworks. npj Computational Mater. 10, 18 (2024).
Bureekaew, S. et al. MOF-FF - a flexible first-principles derived force field for metal-organic frameworks. Phys. Status Solidi (B) 250, 1128–1141 (2013).
Bristow, J. K., Skelton, J. M., Svane, K. L., Walsh, A. & Gale, J. D. A general forcefield for accurate phonon properties of metal-organic frameworks. Phys. Chem. Chem. Phys. 18, 29316–29329 (2016).
Eckhoff, M. & Behler, J. From molecular fragments to the bulk: development of a neural network potential for MOF-5. J. Chem. Theory Comput. 15, 3793–3809 (2019).
Vandenhaute, S., Cools-Ceuppens, M., DeKeyser, S., Verstraelen, T. & Van Speybroeck, V. Machine learning potentials for metal-organic frameworks using an incremental learning approach. npj Computational Mater. 9, 19 (2023).
Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169–11186 (1996).
Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B 59, 1758–1775 (1999).
Kresse, G. & Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Computational Mater. Sci. 6, 15–50 (1996).
Kresse, G. & Hafner, J. Ab initio molecular dynamics for liquid metals. Phys. Rev. B 47, 558–561 (1993).
Shapeev, A. V. Moment tensor potentials: a class of systematically improvable interatomic potentials. Multiscale Modeling Simul. 14, 1153–1173 (2016).
Batatia, I. et al. A foundation model for atomistic materials chemistry, arXiv preprint, arXiv:2401.00096, (2024).
Batatia, I., Kovacs, D. P., Simm, G., Ortner, C. & Csányi, G. MACE: Higher order equivariant message passing neural networks for fast and accurate force fields. Adv Neural Inform Process Sys. 35, 11423–11436 (2022).
Deng, B. et al. CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling. Nat. Mach. Intell. 5, 1031–1041 (2023).
Rosen, A. S. et al. High-throughput predictions of metal-organic framework electronic properties: theoretical challenges, graph neural networks, and data exploration. npj Comput Mater 8, 112 (2022).
Rosen, A. S. et al. Machine learning the quantum-chemical properties of metal–organic frameworks for accelerated materials discovery. Matter 4, 1578–1597 (2021).
Moghadam, P. Z. et al. Development of a Cambridge structural database subset: a collection of metal–organic frameworks for past, present, and future. Chem. Mater. 29, 2618–2625 (2017).
Chong, S., Lee, S., Kim, B. & Kim, J. Applications of machine learning in metal-organic frameworks. Coord. Chem. Rev. 423, 213487 (2020).
Larsen, A. H. et al. The atomic simulation environment-a Python library for working with atoms. J. Phys.: Condens. Matter 29, 273002 (2017).
Ong, S. P. et al. Python materials genomics (pymatgen): a robust, open-source Python library for materials analysis. Computational Mater. Sci. 68, 314–319 (2013).
Agne, M. T., Anand, S. & Snyder, G. J. Inherent anharmonicity of harmonic solids. Research 2022, 9786705 (2022).
Togo, A. & Tanaka, I. First principles phonon calculations in materials science. Scr. Materialia 108, 1–5 (2015).
Hu, Z., Wang, Y. & Zhao, D. The chemistry and applications of hafnium and cerium(iv) metal-organic frameworks. Chem. Soc. Rev. 50, 4629–4683 (2021).
Ziman, J. M. et al. Electrons and Phonons: The Theory of Transport Phenomena in Solids, 540 (Oxford University Press, 2001).
Ying, P. et al. Sub-micrometer phonon mean free paths in metal–organic frameworks revealed by machine learning molecular dynamics simulations. ACS Appl. Mater. Interfaces 15, 36412–36422 (2023).
Formalik, F., Neimark, A. V., Rogacka, J., Firlej, L. & Kuchta, B. Pore opening and breathing transitions in metal-organic frameworks: coupling adsorption and deformation. J. Colloid Interface Sci. 578, 77–88 (2020).
Sriram, A. et al. The Open DAC 2023 dataset and challenges for sorbent discovery in direct air capture. ACS Cent. Sci. 10, 923–941 (2024).
Cliffe, M. J., Hill, J. A., Murray, C. A., Coudert, F.-X. & Goodwin, A. L. Defect-dependent colossal negative thermal expansion in UiO-66(Hf) metal-organic framework. Phys. Chem. Chem. Phys. 17, 11586–11592 (2015).
Vornholt, S. M., Chen, Z., Hofmann, J. & Chapman, K. W. Node distortions in UiO-66 inform negative thermal expansion mechanisms: Kinetic effects, frustration, and lattice hysteresis. J. Am. Chem. Soc. 146, 16977–16981 (2024).
Lyle, S. J., Waller, P. J. & Yaghi, O. M. Covalent organic frameworks: Organic chemistry extended into two and three dimensions. Trends Chem. 1, 172–184 (2019).
Ganose, A., et al. Atomate2: Modular workflows for materials science. ChemRxiv. https://doi.org/10.26434/chemrxiv-2025-tcr5h (2025).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953–17979 (1994).
Grimme, S., Ehrlich, S. & Goerigk, L. Effect of the damping function in dispersion corrected density functional theory. J. Computational Chem. 32, 1456–1465 (2011).
Grimme, S., Antony, J., Ehrlich, S. & Krieg, H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys. 132, 154104 (2010).
Nazarian, D., Ganesh, P. & Sholl, D. S. Benchmarking density functional theory predictions of framework structures and properties in a chemically diverse test set of metal-organic frameworks. J. Mater. Chem. A 3, 22432–22440 (2015).
Bahn, S. R. & Jacobsen, K. W. An object-oriented scripting interface to a legacy electronic structure code. Comput. Sci. Eng. 4, 56–66 (2002).
Wang, L. et al. Large negative thermal expansion provided by metal organic framework MOF-5: a first-principles study. Mater. Chem. Phys. 175, 138–145 (2016).
Han, S. S. & Goddard, W. A. Metal organic frameworks provide large negative thermal expansion behavior. J. Phys. Chem. C. 111, 15185–15191 (2007).
Lock, N. et al. Elucidating negative thermal expansion in MOF-5. J. Phys. Chem. C. 114, 16181–16186 (2010).
Kamencek, T., Schrode, B., Resel, R., Ricco, R. & Zojer, E. Understanding the origin of the particularly small and anisotropic thermal expansion of MOF-74. Adv. Theor. Simul. 5, 2200031 (2022).
Shi, Z., Weng, K. & Li, N. The atomic structure and mechanical properties of ZIF-4 under high pressure: Ab initio calculations. Molecules (Basel, Switz.) 28, 22 (2022).
Yang, L.-M., Ravindran, P., Vajeeston, P. & Tilset, M. Properties of IRMOF-14 and its analogues M-IRMOF-14 (M = Cd, alkaline earth metals): electronic structure, structural stability, chemical bonding, and optical properties. Phys. Chem. Chem. Phys. 14, 4713–4723 (2012).
Hoffman, A. E., Wieme, J., Rogge, S. M., Vanduyfhuys, L. & Speybroeck, V. V. The impact of lattice vibrations on the macroscopic breathing behavior of MIL-53(Al). Z. für Kristallographie - Crystalline Mater. 234, 529–545 (2019).
Dissegna, S. et al. Tuning the mechanical response of metal–organic frameworks by defect engineering. J. Am. Chem. Soc. 140, 11581–11584 (2018).
Yot, P. G. et al. Exploration of the mechanical behavior of metal organic frameworks UiO-66(Zr) and MIL-125(Ti) and their NH2 functionalized versions. Dalton Trans. 45, 4283–4288 (2016).
Yot, P. G. et al. Metal-organic frameworks as potential shock absorbers: the case of the highly flexible MIL-53(Al). Chem. Commun. 50, 9462–9464 (2014).
Redfern, L. R. et al. Isolating the role of the node-linker bond in the compression of UiO-66 metal–organic frameworks. Chem. Mater. 32, 5864–5871 (2020).
Redfern, L. R. et al. Porosity dependence of compression and lattice rigidity in metal–organic framework series. J. Am. Chem. Soc. 141, 4365–4371 (2019).
Acknowledgements
This research used resources of the National Energy Research Scientific Computing Center (NERSC), a Department of Energy Office of Science User Facility, using NERSC award ‘GenAI@NERSC’. P.D.K. acknowledges financial support from the U.S. National Science Foundation’s “The Quantum Sensing Challenges for Transformational Advances in Quantum Systems (QuSeC-TAQS)” program. A.M.E. and F.Z.’s work was also supported by Ada Lovelace Center at STFC (https://adalovelacecentre.ac.uk/), Physical Sciences Databases Infrastructure (https://psdi.ac.uk) under EPSRC grant no. EP/X032663/1 and EPSRC under grant no. EP/V028537/1. A.S.R. acknowledges support via a Miller Research Fellowship from the Miller Institute for Basic Research in Science, University of California, Berkeley. T.J.I. acknowledges support via the Bakar Institute of Digital Materials for the Planet (BIDMaP).
Contributions
All authors contributed to the conceptualization of the model. A.S.R. and F.Z. provided insights for the dataset curation and initial model training. P.D.K. generated the DFT dataset for training and benchmarking. A.M.E. fine-tuned the foundation model and generated the benchmarking data for the model. P.D.K. and A.S.R. developed the DFT workflow and optimized the parameters for DFT dataset generation. T.J.I. performed the element-wise mean absolute energies analysis with the models. P.D.K. and A.M.E. drafted the manuscript. K.A.P. supervised the research. All authors contributed to the discussions and provided feedback on the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Elena, A.M., Kamath, P.D., Jaffrelot Inizan, T. et al. Machine learned potential for high-throughput phonon calculations of metal—organic frameworks. npj Comput Mater 11, 125 (2025). https://doi.org/10.1038/s41524-025-01611-8