Unified physio-thermodynamic descriptors via learned CO2 adsorption properties in metal-organic frameworks

Lin, Emily; Zhong, Yang; Chen, Gang; Deng, Sili

doi:10.1038/s41524-025-01700-8

Article
Open access
Published: 12 July 2025

npj Computational Materials volume 11, Article number: 225 (2025)

Abstract

The large design space of metal-organic frameworks (MOFs) has prompted the utilization of deep learning to drive material design. Nonetheless, the prediction of key thermodynamic properties, such as heat of adsorption ($\Delta {H}_{{\rm{ads}}}$), remains largely unexplored for CO₂ adsorption in MOFs. Herein, we present IsothermNet, a high-throughput graph neural network designed to estimate uptake and $\Delta {H}_{{\rm{ads}}}$ over 0–50 bars, enabling high-quality full isotherm reconstruction (PCC: 0.73–0.95 [uptake], 0.76–0.88 [$\Delta {H}_{{\rm{ads}}}$]). We further bridged these adsorption properties to uptake behaviors (i.e., isotherm shapes/types) and structural information by performing detailed ablation studies to investigate the relative importance of local and global features in relation to predictive performance. This comparative analysis facilitated the discovery of a (1) physically-interpretable and (2) analytically-derived universal descriptor set capable of illustrating interdependencies between easily-computed, accessible textural information and extrinsic adsorption properties. When used cooperatively with IsothermNet, these descriptors enable efficient material screening, accelerating high-performance MOF discovery for CO₂ capture.

Introduction

The increase in anthropogenic CO₂ has prompted a fundamental need for carbon capture/utilization/storage (CCUS) technologies, such as physisorption. While many candidates, like zeolites, have been proposed for CO₂ adsorption, metal-organic frameworks (MOFs) have become an emergent material due to their potential to achieve stable cyclability, high surface area, versatile functionalization, and high CO₂ affinity, thereby enabling superior carbon capture performance^1,2. The high tunability of metal cluster-ligand pairs, however, has opened a large optimization space of MOFs. Recently accelerated by the rise of machine learning, adsorption property estimations have started to provide insights into the mechanistic connection between MOF structures and adsorption capabilities for guiding material design.

Prior works have leveraged multilayer perceptrons (MLPs)³, graph transformers⁴, convolutional neural networks (CNNs)⁵, and light gradient boosting machine (LGBM)⁶ for uptake and adsorbate diffusivity/selectivity⁷ prediction of various adsorbates (e.g., CO₂, N₂, CH₄) in MOFs. Separate studies focused on thermodynamic property (adsorption enthalpy^8,9, Henry’s constant^8,10, and free energy¹¹) estimation at different baric/thermic ranges using support vector machine^8,9, random forest/decision trees^9,10, and eXtreme Gradient Boosting (XGBoost)¹¹. Nonetheless, their scope and scale are often limited to singular or few pressures that are low relative to the adsorbate’s saturation pressure (computation time of molecular simulations scales with pressure). While researchers have explored uptake estimations at high pressure (e.g., 46 bars¹² and 500 bars⁴), these studies are scarce and often exhibit inadequate prediction quality.

Despite its fundamental role in adsorption processes, $\Delta {H}_{{\rm{ads}}}$ prediction in CO₂-MOF systems remains absent from the literature, even at low-pressure regimes. $\Delta {H}_{{\rm{ads}}}$ quantifies the total heat release during the adsorption process due to van der Waals and coulombic force contributions between adsorbates and the MOF framework, serving as a key metric for identifying materials with favorable regeneration properties and species selectivity. Currently, $\Delta {H}_{{\rm{ads}}}$ measurement is challenging due to its need for (1) specialized equipment (e.g., calorimeter), or (2) multiple measurements at different temperatures, which can be time-consuming. Although some studies have derived geometric¹³, energy-based^8,11,14, and chemical motif^6,15 descriptors to facilitate adsorption property predictions, no unified descriptor exists that simultaneously links uptake, $\Delta {H}_{{\rm{ads}}}$, and structural information. This is a crucial gap that hinders the efficient selection of high-performance MOFs.

A further challenge lies in understanding the relationship between adsorption properties and isotherm behavior/shape¹⁶, which characterizes gas uptake as a function of pressure. This can be attributed to the typical requirement of complete isotherms for precise type classification. Computational methods such as Grand Canonical Monte Carlo (GCMC) and molecular dynamics (MD)^17,18 simulations can calculate uptakes and $\Delta {H}_{{\rm{ads}}}$ across a wide range of pressures. However, they can be computationally prohibitive for large-scale MOF screening. The ability to rapidly classify isotherm shapes based on adsorption properties would provide valuable physical insights into adsorbate-adsorbent interactions and surface characteristics, enabling more efficient material selection. However, current models do not establish a direct link between $\Delta {H}_{{\rm{ads}}}$, uptake, and isotherm behavior, thus limiting their practical application in MOF design.

To fill these knowledge gaps, we developed IsothermNet, a machine learning model capable of directly predicting uptake and $\Delta {H}_{{\rm{ads}}}$ across a broad pressure range (0-50 bar). Unlike some prior methods that rely on flat structural representations (e.g., SMILES/SELFIES strings^19,20,21 and molecular fingerprints^22,23), IsothermNet employs graph-based^{24,25,26,27,28} learning to capture spatial atomic interdependencies intrinsic to 3D MOF structures. IsothermNet uses a combined crystal graph convolutional neural network (CGCNN)²⁹ and graph attention (GAT)³⁰, allowing high-resolution adsorption property predictions across 15 different pressures. To reveal the structural-enthalpic interplay in MOFs, an extensive feature importance study was performed to quantify the relative significance of various local structural attributes and global textural properties on the prediction quality of uptake and $\Delta {H}_{{\rm{ads}}}$. We then formulated physical and analytical universal descriptors capable of bridging readily accessible (and computationally inexpensive) textural information to physics-based properties (uptake, isotherm shape, $\Delta {H}_{{\rm{ads}}}$). The realization of these descriptor sets can enable high-throughput identification of material operation regime for ideal MOF selection and efficient prediction of $\Delta {H}_{{\rm{ads}}}$ provided a singular uptake obtained via GCMC (computationally expensive and requires heavy preprocessing) or quick predictive learning algorithms (like IsothermNet) at a specific pressure. These descriptors, when used synergistically with IsothermNet, can therefore provide a powerful framework for high-throughput MOF selection, accelerating the discovery of materials optimized for CO₂ capture.

Results

Crystal representation and properties

Like undirected graphs, molecular structures are composed of nodes (atoms) and edges (bonds). As shown in Fig. 1a, we can therefore deconstruct the crystal by extracting atomic-level properties (element, aromaticity, metallicity, electronegativity, coordination number, position, Lennard-Jones parameters [van der Waals radius ($\sigma$), well depth ($\varepsilon$), and inverse separation distance ($1/r$)]) and bond information (edge indices of the starting/ending nodes and bond distance) for each MOF structure obtained from the Quantum MOF (QMOF) database^31,32. Of the 20,375 MOFs, only 5,394 are CO₂ adsorption-capable based on the kinetic diameter of a CO₂ molecule; data distribution and input details are described in Supplementary Fig. S1 and Note S1, respectively. In addition to these local features, global textural properties (pore limiting diameter [PLD], largest cavity diameter [LCD], gravimetric surface area [${A}_{s}$], void fraction [$\phi$], channel volume [$V$], number of channels [$n$], and density [$\rho$], all determined via Zeo++) can also be used to describe a MOF. Together, these properties constitute the inputs to the model.
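As a minimal illustration of the node/edge decomposition described above, the sketch below builds an undirected edge list with bond distances from Cartesian coordinates. The `build_edges` helper, the 2.0 Å cutoff, and the toy coordinates are illustrative assumptions; periodic boundary conditions and the bond-perception scheme actually applied to the QMOF structures are not reproduced here.

```python
import math

def build_edges(positions, cutoff=2.0):
    """Return (i, j, distance) for every atom pair within `cutoff`.

    `cutoff` is a hypothetical value in angstrom, not a setting from the paper.
    """
    edges = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            d = math.dist(positions[i], positions[j])
            if d <= cutoff:
                edges.append((i, j, d))
    return edges

# Toy three-atom fragment (made-up coordinates, not taken from QMOF).
positions = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (4.0, 0.0, 0.0)]
edges = build_edges(positions)
```

Each tuple carries the start/end node indices and the bond distance, which mirrors the bond features listed in the text.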

**Fig. 1: The graph neural network (GNN) design and its outputs.**

The target/output adsorption properties (uptake, $\Delta {H}_{{\rm{ads}}}$) are similarly important for characterizing crystal thermodynamics. These properties were computed using GCMC simulations at 15 different pressures between 0.5 and 50 bars (more details in Methods) for neutral, rigid frameworks. GCMC equilibrates the chemical potential between the bulk adsorbate reservoir and the adsorbed molecules via particle insertion/deletion moves at fixed chemical potential, volume, and temperature, so the number of adsorbed molecules fluctuates. Thus, GCMC is advantageous for sampling many different adsorption configurations, which is particularly useful at low-pressure regimes, where adsorption probabilities are lower. With MD, by contrast, convergence may take significantly longer because of the small accessible time and length scales.
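To make the insertion/deletion moves concrete, the sketch below writes out the standard textbook Metropolis acceptance ratios for the grand canonical ensemble in terms of the reservoir fugacity. It is a generic illustration in reduced units, not the simulation setup used in this work; the function names and the numerical values are assumptions.

```python
import math
import random

def insertion_ratio(n, volume, beta, fugacity, delta_u):
    """Unclamped acceptance ratio for inserting one molecule into n molecules.

    delta_u is the energy change U(n + 1) - U(n).
    """
    return beta * fugacity * volume / (n + 1) * math.exp(-beta * delta_u)

def deletion_ratio(n, volume, beta, fugacity, delta_u):
    """Unclamped acceptance ratio for deleting one molecule from n molecules.

    delta_u is the energy change U(n - 1) - U(n).
    """
    return n / (beta * fugacity * volume) * math.exp(-beta * delta_u)

def accept(ratio, rng):
    """Metropolis criterion: accept the move with probability min(1, ratio)."""
    return rng.random() < min(1.0, ratio)

# Illustrative (made-up) state: 10 molecules in a box of volume 1000.
rng = random.Random(0)
r_ins = insertion_ratio(10, 1000.0, 1.0, 0.02, -0.5)
inserted = accept(r_ins, rng)
```

An insertion and the reverse deletion have reciprocal ratios, which is what lets the molecule count fluctuate around its equilibrium value at the imposed chemical potential.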

IsothermNet: architecture and performance

IsothermNet (Fig. 1a) utilizes attention mechanisms and graph convolutions to guide the prediction of uptake and heat of adsorption at individual pressures. First, local structural (node and bond) features serve as inputs into a two-layer (size: 128 $\to$ 64) graph attention network (GAT), wherein the computed attention scores of neighbouring nodes, with respect to each target node, are normalized and used to update the central node representation. Afterwards, global textural features and intermolecular information (Lennard-Jones parameters) are fed into multilayer perceptron networks (MLPs) to increase the expressivity of the model and improve upon learned representations at the MOF surface. The GAT outputs, concatenated with global features, are the inputs to a two-layer (size: 64 $\to$ 32) crystal graph convolutional neural network (CGCNN) that performs message passing via the iterative aggregation of weighted information between each target node and its neighbours. Each GAT and CGCNN layer is followed by batch normalization and a rectified linear unit (ReLU) activation. To condense feature dimensions through further downsampling, a global pooling operation is then performed, after which the output is passed through four fully connected linear layers (size: 16 $\to$ 8 $\to$ 4 $\to$ 1) to produce the target property. The hyperparameters (learning rate, weight decay, number of attention heads, dropout ratio) were tuned using a tree-structured Parzen estimator (TPE) sampler, which yielded optimal values of 1.67e−2, 1.03e−4, 12, and 0.3148, respectively. The layer sizes and number of layers for the CGCNN and GAT were tuned manually after fixing the optimizer and low-level network parameters. We trained the model for 500 epochs (using a batch size of 64) with an early-stopping patience of 50 epochs to prevent overfitting. Additional details regarding the multi-objective hyperparameter optimization and the network architecture are reported in Supplementary Note S2 and S3, respectively.
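The 500-epoch budget with a patience of 50 epochs can be sketched as a simple stopping rule. This is a minimal illustration that assumes the monitored quantity is a per-epoch validation loss (the monitored metric is not stated in this section); `train_with_patience` is a hypothetical helper that replays a recorded loss sequence rather than training a model.

```python
def train_with_patience(val_losses, max_epochs=500, patience=50):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Stops once `patience` epochs have passed without a new best loss.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses[:max_epochs]):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return min(len(val_losses), max_epochs) - 1, best_epoch

# Made-up loss curve: improves for two epochs, then stagnates.
losses = [1.0, 0.5] + [0.6] * 100
stop_epoch, best_epoch = train_with_patience(losses)
```

In practice the weights from `best_epoch` would be restored before evaluating on the test set.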

$${h}_{i}^{\left(l+1\right)}=\sigma \left(\sum _{j\in {N}_{i}}\alpha \left({h}_{i}^{\left(l\right)},{h}_{j}^{\left(l\right)}\right)W{h}_{j}^{\left(l\right)}\right)$$

(1)

$${h}^{(l+1)}=\sigma \left({\tilde{D}}^{-\frac{1}{2}}\tilde{A}{\tilde{D}}^{-\frac{1}{2}}{h}^{(l)}W+b\right),\quad {\rm{where}}\;\tilde{A}=A+{I}_{n}\;{\rm{and}}\;{\tilde{D}}_{ii}=\sum _{j}{\tilde{A}}_{ij}$$

(2)

We used a GAT network (update function shown in Eq. 1, where ${h}_{i}^{(l)}$ is the embedding of node $i$ at layer $l$, ${N}_{i}$ is the set of neighbours of node $i$, $\sigma$ is the activation function, $\alpha$ is a learned attention function, and $W$ is the weight matrix) to capture long-range dependencies and to let the model assign importance weights from the first layers. Filtering out extraneous information at this stage gives subsequent layers more relevant learned representations as inputs, which can lead to improved results. The attention mechanism is also advantageous for more local (nodal) interpretations. CGCNNs (message-passing function shown in Eq. 2, where $\tilde{D}$ is a diagonal matrix, $A$ is the adjacency matrix, $b$ is the bias, and ${I}_{n}$ is an $n\times n$ identity matrix), however, are better suited to representing both local and global (textural) features and can aggregate such information to form a more comprehensive feature set. Additionally, because the dataset is more heterogeneous after the addition of global features, and since CGCNNs can effectively process dissimilar data types (due to aggregation during the convolution operation), CGCNNs can generate more robust results. The framework was ultimately finalized based on the observation that using GAT-CGCNN, as opposed to CGCNN-GAT, CGCNN-only, or GAT-only, yields superior results (details in Supplementary Note S4).
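The two update rules in Eq. 1 and Eq. 2 can be sketched in plain Python on a toy graph. This is an illustration under stated assumptions, not the IsothermNet implementation: the attention function $\alpha$ is replaced by softmax-normalized dot-product scores, ReLU is used for $\sigma$, every node is assumed to have at least one neighbour, and all names and numbers are hypothetical.

```python
import math

def relu(x):
    return max(0.0, x)

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gat_update(h, neighbors, W):
    """Eq. 1 with dot-product scores softmax-normalized per neighbourhood."""
    out = []
    for i, h_i in enumerate(h):
        scores = [sum(a * b for a, b in zip(h_i, h[j])) for j in neighbors[i]]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        z = sum(weights)
        agg = [0.0] * len(W)
        for w, j in zip(weights, neighbors[i]):
            msg = matvec(W, h[j])
            agg = [a + (w / z) * m for a, m in zip(agg, msg)]
        out.append([relu(a) for a in agg])
    return out

def gcn_update(h, A, W, b):
    """Eq. 2: relu(D^-1/2 (A + I) D^-1/2 h W + b), D_ii = row sums of A + I."""
    n = len(A)
    A_t = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    d = [sum(row) for row in A_t]
    norm = [[A_t[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]
    out = []
    for i in range(n):
        mixed = [sum(norm[i][j] * h[j][k] for j in range(n)) for k in range(len(h[0]))]
        row = [sum(mixed[k] * W[k][c] for k in range(len(mixed))) + b[c]
               for c in range(len(b))]
        out.append([relu(x) for x in row])
    return out

# Toy two-node graph with one scalar feature per node.
h0 = [[1.0], [3.0]]
gat_out = gat_update(h0, neighbors=[[1], [0]], W=[[2.0]])
gcn_out = gcn_update(h0, A=[[0.0, 1.0], [1.0, 0.0]], W=[[1.0]], b=[0.0])
```

On this toy graph the attention step simply passes each node its single neighbour's transformed feature, while the normalized convolution averages each node with its neighbour.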

The predictive capacity of IsothermNet is quantified in the parity plots (Fig. 1b), which compare model-predicted values to ground-truth values obtained from GCMC simulations. The MAE (mean absolute error), MAPE (mean absolute percentage error), and PCC (Pearson correlation coefficient) of the test set, after training on an 8:1:1 split of 5,394 MOFs, are summarized in Table 1. The MAE ranges from 15.95 g/kg (0.5 bars) to 112.43 g/kg (50 bars) for uptake and from 1.95 kJ/mol (50 bars) to 2.66 kJ/mol (0.5 bars) for $\Delta {H}_{{\rm{ads}}}$. Additional point density parity and convergence plots for the full range of pressure and textural-based parity plots (0.5, 50 bars) are reported in Supplementary Fig. S2 and Fig. S3, respectively. Several notable observations can be made. First, the predictive performance for uptake, as measured by PCC, increases with pressure (PCC scores are 0.730, 0.794, and 0.954 for 0.5, 5, and 50 bars, respectively). Second, unlike uptake prediction, the predictive performance for heat of adsorption is relatively constant across pressures and declines slightly at the highest pressure (PCC scores are 0.850, 0.881, and 0.761 for 0.5, 5, and 50 bars, respectively). This opposite trend is potentially due to the non-monotonic and highly variable nature of $\Delta {H}_{{\rm{ads}}}$ curves, especially at intermediate/high pressures.
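For reference, the three reported metrics can be computed as follows. This is a minimal sketch using their standard definitions; the sample values are made up and are not taken from Table 1.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error; assumes no ground-truth value is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def pcc(y_true, y_pred):
    """Pearson correlation coefficient."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    st = math.sqrt(sum((t - mt) ** 2 for t in y_true))
    sp = math.sqrt(sum((p - mp) ** 2 for p in y_pred))
    return cov / (st * sp)

# Made-up uptakes in g/kg (ground truth vs. prediction).
y_true = [10.0, 20.0, 40.0]
y_pred = [12.0, 18.0, 44.0]
scores = (mae(y_true, y_pred), mape(y_true, y_pred), pcc(y_true, y_pred))
```

Because MAE is an absolute quantity, it naturally grows with the magnitude of the target, which is why MAE and PCC can move in opposite directions as pressure (and uptake) increases.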

Table 1 The MAE, MAPE, and PCC of the test set after training on 8:1:1 split of 5,394 samples from the QMOF database

Additionally, we benchmarked the uptake PCC scores from IsothermNet against MOFNet⁴ (and other high-performing models reported in the same work). We note that the other models were trained on 6,997 samples from the Cambridge Structural Database for CO₂ adsorption in MOFs (CSD MOF Subset)³³. However, per Materials Project³⁴, the QMOF database is composed entirely of MOFs from the CSD MOF Subset, so the data distribution should be extremely similar, making this benchmarking task possible. As shown in Table 1, IsothermNet performs significantly better than the other models at the reported pressures (0.5, 5, 10, 15 bars). While IsothermNet is not trained at 500 bars, two observations allow a comparison. First, since the saturation pressure of CO₂ is 64.5 bars, the additional uptake between 50 and 500 bars is mostly small. Second, the present study shows that prediction is easier at higher pressure. Thus, we conclude that IsothermNet performs significantly better at higher pressure compared to the other models (IsothermNet PCC $=$ 0.954 at 50 bars vs. MOFNet PCC $=$ 0.854 at 500 bars). This improvement can be rationalized by IsothermNet’s cascading message-passing mechanism for encoding/aggregating local and global information to ultimately formulate a comprehensive atomistic representation. No prior studies have been performed for predicting $\Delta {H}_{{\rm{ads}}}$ of CO₂-MOF systems, so we report the MAE, MAPE, and PCC for future benchmarking.

The parity plots for the full pressure range (as a function of point density and textural properties) are detailed in Supplementary Fig. S4 and Fig. S5. They show that textural-based trends become clearer with pressure for uptake prediction but remain unchanged for enthalpy prediction, which indicates that $\partial (\Delta {H}_{{\rm{ads}}})/\partial P$ is generally small. Moreover, the lack of these trends at low pressure for uptake prediction implies that features other than textural ones are required for adequate uptake representation. To better understand these outcomes, we turn to a feature importance study.

Ablation study

A comprehensive ablation study (Fig. 2a, extended version in Supplementary Fig. S6) was then performed using permutation importance methods to quantify the relative effects of various local and global features (Fig. 2b) on the predictive capacities of uptake and heat of adsorption at low (0.5 bars), high (10 bars), and saturation (50 bars) pressures. Interestingly, for uptake prediction, $V$, $\rho$, and LCD become more dominant with pressure, while PLD, ${A}_{s}$, and $\phi$ assume greater significance at lower pressures, where local structure (e.g., space group/symmetry, lattice dimensions, position), electrostatics-driving properties (e.g., bond distance), and global features are more balanced in effect. At elevated pressures, textural features largely drive uptake. In the case of heat of adsorption prediction, $V$, $\rho$, and LCD dominate at both low- and high-pressure regimes, while PLD, ${A}_{s}$, and $\phi$ are more prominent at high pressure. In this regime, the ability of channel volumes and available surface area to accommodate the high adsorbate concentration becomes more important (until pore-blocking occurs, which is governed by the PLD)³⁵. This phenomenon is highly correlated with the amount of CO₂ adsorbed (uptake), which directly relates to enthalpy. Additionally, the importance of the number of channels, $V$, $\rho$, and LCD exceeds that of all other variables. This is expected, since a MOF is a highly hierarchical (and highly structured) network of pores, which can be described volumetrically. Finally, the high relevance exhibited by bond distance can be rationalized by its ability to affect surface and adsorbent site geometries and the electronic structure of the MOF, aligning with the fact that pore structure dictates uptake/enthalpy³⁶. We observe that the Lennard-Jones parameters (for each atom in the MOF) do not impact the heat of adsorption prediction significantly, which can be attributed to the non-polar nature of CO₂ (van der Waals forces are weak). Overall, the strong imbalance in the importance score distribution for enthalpy prediction, compared to uptake prediction, implies lower representation capabilities for heat of adsorption, thus supporting the results in Fig. 1b.
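Permutation importance, the method named above, can be sketched generically: shuffle one feature column at a time and record how much the error grows. The scoring metric (MAE here), the number of repeats, and the toy model are assumptions for illustration; the settings used for Fig. 2 are not given in this section.

```python
import random

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in MAE after shuffling each feature column of X.

    `predict` maps one feature row (a list) to a scalar prediction.
    """
    rng = random.Random(seed)

    def mae(rows):
        return sum(abs(predict(r) - t) for r, t in zip(rows, y)) / len(y)

    baseline = mae(X)
    scores = []
    for col in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in X]
            rng.shuffle(shuffled)
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(X, shuffled)]
            increases.append(mae(permuted) - baseline)
        scores.append(sum(increases) / n_repeats)
    return scores

# Toy model that depends only on the first feature (made-up data).
X = [[float(i), float(i % 3)] for i in range(20)]
y = [2.0 * row[0] for row in X]
importances = permutation_importance(lambda row: 2.0 * row[0], X, y)
```

A feature the model ignores receives a score of zero, while shuffling a feature the model relies on raises the error, which is the basis for ranking features in an ablation of this kind.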

To gain a broader understanding of feature importance, we performed a principal component analysis (PCA) study using textural features to study textural surface property representations of adsorption characteristics (Fig. 2c). In this analysis, principal components PC 1 and PC 2 can encapsulate 82.8% and 7.4% of the information, respectively, leading to 9.8% unexplained. The PCA trends further validate that at high pressure, textural features can characterize, and therefore drive, uptake due to there being evident boundaries, unlike at low pressure, where local structural representations are also needed. Similarly, because the enthalpy PCA trends at low and high pressure are comparable with vaguely-defined clusters, this explains the higher sufficiency of textural properties for heat of adsorption prediction at any given pressure. This also justifies the minor, pressure-independent variations in importance scores for more dominant textural properties. The t-distributed stochastic neighbor embedding (t-SNE) plots are depicted in Supplementary Fig. S7. Overall, by identifying the features with the highest impact in these studies (most notably, textural properties), we can construct a more physics-aligned and informed universal descriptor that can encapsulate the most important information while minimizing the number of parameters required.

Isotherm and heat of adsorption curve prediction

Having determined the principal features affecting the quantifiable adsorption properties of interest, we now consider uptake behavior (often defined by isotherm shape). To classify the shapes, we quantify the isotherm’s ability to fit a Langmuir (only monolayer effect on a homogeneous surface) or Sips (accounts for multilayer effects and heterogeneous surface) function and threshold based on curvature (details in Methods). During the classification of valid QMOFs, we identified five distinct isotherm shapes: Type I³⁷, Type I(b)³⁸, Type III³⁹, Type V⁴⁰, and Weak Type V. We observed that the shape of the last of these cannot be well-categorized by conventional IUPAC classification standards due to nanoconfinement effects of physisorption in MOFs. Nevertheless, we noticed highly correlated adsorption trends of Weak Type V isotherms in relation to other types. We also note that the isotherm classes are stratified according to the 8:1:1 split during training to ensure proper representation in the training, validation, and test sets. These isotherms can be described as follows (additional details on isotherm shape in Supplementary Fig. S8). Type I (Langmuir) isotherms (depicted in Fig. 3a) start with a pronounced DC-I (downward concave increase), followed by saturation as $P\to \infty$, unlike Type I(b) isotherms, which show a mild DC-I at low pressures and no asymptotic behavior at high pressure. Distinct trends (Fig. 3b) in the heat of adsorption profiles for different isotherm types are also apparent. The profiles for Type I isotherms, for example, are generally DC-I with a waning slope at $\ge$10 bars. The initial uptake slope steepness also seemingly dictates the general curvature of the enthalpy curve (qmof-7af11ed vs. qmof-9433b10). The Type I(b) profile starts UC-D (upward concave decrease) but subsequently increases. Interestingly, the degree of saturation in the uptake plot indicates the stationary point in the $\Delta {H}_{{\rm{ads}}}$ plot (Supplementary Fig. S9): a larger ${\left.\frac{{\rm{d}}q}{{\rm{d}}P}\right|}_{P=50\,{\rm{bars}}}$ shifts the stationary point to higher pressure (qmof-db02aaa). The steep initial uptake increase of Type I isotherms indicates microporous structures that translate to many available binding sites with smaller pore size variance, leading to strong adsorbent-adsorbate interactions, which are reflected in the high $\Delta {H}_{{\rm{ads}}}$ of Type I isotherms relative to those of other types. Upon pore filling of the finite adsorption sites, the isotherm and $\Delta {H}_{{\rm{ads}}}$ plateau at higher pressure regimes. Type I(b) materials exhibit a slower saturation due to a larger pore size distribution, composed of both micropores and mesopores. This contributes to a more gradual pore filling that can potentially be achieved only at higher pressures, thus leading to a slow uptake rise at low pressures.
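The DC-I and UC-I labels used above refer to the sign of the isotherm curvature, which can be checked numerically. The sketch below defines a Langmuir function and one common parameterization of the Sips function (an assumption; the exact form and the curvature thresholds are described in Methods and are not reproduced here), then estimates the second derivative on a non-uniform pressure grid. All parameter values are made up.

```python
def langmuir(p, q_max, k):
    """Langmuir isotherm: monolayer adsorption on a homogeneous surface."""
    return q_max * k * p / (1.0 + k * p)

def sips(p, q_max, k, s):
    """One common Sips form; s = 1 recovers the Langmuir isotherm."""
    x = (k * p) ** s
    return q_max * x / (1.0 + x)

def second_differences(pressures, uptakes):
    """Discrete second derivative on a non-uniform grid.

    Negative values indicate a downward-concave (DC) segment,
    positive values an upward-concave (UC) segment.
    """
    out = []
    for i in range(1, len(pressures) - 1):
        h1 = pressures[i] - pressures[i - 1]
        h2 = pressures[i + 1] - pressures[i]
        left = (uptakes[i] - uptakes[i - 1]) / h1
        right = (uptakes[i + 1] - uptakes[i]) / h2
        out.append(2.0 * (right - left) / (h1 + h2))
    return out

# Type I-like curve: concave-down everywhere (made-up parameters).
grid = [0.5, 1.0, 5.0, 10.0, 50.0]
type_i = second_differences(grid, [langmuir(p, 300.0, 0.4) for p in grid])

# Low-pressure part of an S-shaped curve: concave-up before the inflection.
low = [0.1, 0.2, 0.3]
s_shaped = second_differences(low, [sips(p, 300.0, 1.0, 2.0) for p in low])
```

A sign change in these second differences along the pressure axis marks an inflection point, the feature that separates the Type V-like shapes from Type I and Type III.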

**Fig. 3: The uptake and heat of adsorption curves in relation to textural properties.**

Type III isotherms exhibit monotonic UC-I (upward concave increase), which parallels the UC-I behavior of a non-saturating Type V. This suggests weak adsorbent-adsorbate interactions, especially at low pressures, as commonly exhibited by macroporous structures. The lack of strong binding sites encourages slow saturation due to gradual surface coverage growth. At high pressure regimes, however, the host-guest interaction strengthens, which can facilitate further adsorption at the surface, thus rationalizing the monotonically-increasing $\Delta {H}_{{\rm{ads}}}$ curve.

Type V isotherms exhibit a weak UC-I to DC-I evolution, while Weak Type V isotherms demonstrate comparable trends that approach saturation at higher pressures, except with delayed transition. More notably, the $\Delta {H}_{{\rm{ads}}}$ profile for Type I(b) MOFs that reach near-saturation (qmof-91c9eea, qmof-0eaf3ad) bears similar curves to those of Type V (starts UC-D and transitions to UC-I, followed by DC-I). Weak Type V has a similar shape but exhibits a shorter UC-I and longer DC-I segment that mirrors the corresponding initial transient of the isotherm. This behavior is often observed in mesoporous materials, which signifies weak adsorbent-adsorbate interactions at low pressure. At high pressures, however, the uptake increases and saturates due to more significant host-guest interactions via cooperative adsorption, as indicated by the monotonically-increasing $\Delta {H}_{{\rm{ads}}}$ for $P > {P}_{{\rm{inflection}}}$ (inflection point pressure).

These trends and errors depicted in Fig. 3a are representative of the typical isotherm predicted using IsothermNet. It is also interesting to note that the uptake capacity generally increases with each isotherm type (Type I to Type III). In terms of textural features (Fig. 3c), uptake increases with ${A}_{s}$, $V$, $\phi$, PLD/LCD (and inversely with $\rho$) since these properties increase the active area of physisorption. Inversely, the heat of adsorption typically decreases with each isotherm type (Type I to Type III), as evident from Fig. 3d and Supplementary Fig. S10.

We can similarly comment on the $\Delta {H}_{{\rm{ads}}}$ prediction performance in the context of uptake. For Type I isotherms, prediction quality decreased for MOFs with low uptake capacity, especially at higher pressure. Type V-esque isotherms boasted superior prediction accuracies, despite marginal overestimations for Type V high-uptake MOFs at low pressure ($\le$ 1 bar) and Weak Type V low-uptake MOFs at all pressures. Overall, predictions for Type I(b) and Type III isotherms were particularly challenging due to data deficiency (97 and 112 MOFs, respectively). There are several mitigation methods for the observed class imbalance. First, we can increase the number of MOF samples for the stated isotherm types. However, this is limited by both data availability and the high computational cost of GCMC simulations. Active learning, which is well-suited for data-limited cases and has been explored in the context of gas adsorption/separation^41,42, can therefore be an attractive solution. Other methods include data augmentation (e.g., oversampling underrepresented classes with interpolated adsorption values and features) and transfer learning/domain adaptation (i.e., pretraining on abundant isotherm types and finetuning on the rarer isotherm types). The per-class MAEs are provided in Supplementary Table S1-2.
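The interpolation-based oversampling mentioned above can be sketched as follows. This is one simple way to realize the idea, not a procedure evaluated in this work: each synthetic sample is a random linear blend of two real samples from the same under-represented class, with features and adsorption targets stored in one flat list. The helper name and the toy data are hypothetical.

```python
import random

def interpolate_oversample(samples, n_new, seed=0):
    """Create n_new synthetic samples by blending random pairs of one class.

    Each sample is a flat list of features followed by target values.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(samples, 2)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic

# Toy minority class: [feature, target] pairs (made-up numbers).
minority = [[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]]
augmented = interpolate_oversample(minority, n_new=4)
```

Whether such interpolated points are physically meaningful for MOFs would need to be checked, since blended textural features need not correspond to a realizable structure.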

Interestingly, $\Delta {H}_{{\rm{ads}}}$ prediction is more sensitive to class imbalance compared to uptake prediction. IsothermNet shows a tendency to underestimate $\Delta {H}_{{\rm{ads}}}$ at low pressure for Type I(b) isotherms, irrespective of uptake, while demonstrating lower prediction accuracy for high-uptake MOFs at high pressure. Likewise, in the case of Type III, there is a consistent trend of underestimation, particularly pronounced for MOFs exhibiting greater uptake capacity at elevated pressures. Overall, IsothermNet can offer high-resolution (15 points) estimations of complete isotherms and heat of adsorption curves (more effective for Type I, Weak Type V, and Type V) between 0 and 50 bars.

Universal descriptor formulation

So far, we have explored the effects of structural properties on adsorption properties independently. While discretized trends can be extrapolated from such isolated instances, there is an evident need for a generalized correlation relating adsorption characteristics to properties intrinsic to the MOF, thus motivating the formulation of universal descriptors that utilize readily available, easily calculated properties. From the ablation study, we pinpoint textural properties as the dominant parameters. Thus, they are used as the basis for determining two descriptor sets: the first is more physically interpretable and the second follows a more analytical/mathematical form that excels at differentiating the five isotherm classes. In both cases, the descriptor parameters are optimized using ML-predicted $q$ and $\Delta {H}_{{\rm{ads}}}$.

The physical descriptors are motivated primarily by known adsorption isotherm functions, specifically the Sips equation for the high-pressure regime (P1) and the Langmuir functional form for the low-pressure regime (P2); more details on their derivations are provided in Supplementary Note S5. In addition to the textural features, we therefore introduce a pre-exponential factor closely associated with Henry’s constant (${K}_{0}$), a heterogeneity constant ($m$), and a pressure-dependent correction factor (${P}^{n}$). For P1, we observe that higher pressures exhibit a lower $m$, which indicates a higher deviation from the Langmuir model’s monolayer assumption, as shown in Table 2. Similarly, for P2, $m < 0$ indicates surface heterogeneity, with more pronounced effects at higher pressure.

$$q={\left({A}_{s}^{{a}_{1}}{\rho }^{{a}_{2}}{V}^{{a}_{3}}\left(1-\phi \right)\frac{{K}_{0}\exp \left(-\frac{\Delta {H}_{\mathrm{ads}}}{{RT}}\right)\cdot P\cdot {\left(\frac{\mathrm{PLD}}{\mathrm{LCD}}\right)}^{{a}_{4}}}{1+{\left({K}_{0}\exp \left(-\frac{\Delta {H}_{\mathrm{ads}}}{{RT}}\right)\cdot P\cdot {\left(\frac{\mathrm{PLD}}{\mathrm{LCD}}\right)}^{{a}_{4}}\right)}^{m}}\right)}^{1/m}+{P}^{n}$$

(P1)

$$q={A}_{s}^{{a}_{1}}{\rho }^{{a}_{2}}{V}^{{a}_{3}}{\left(1-\phi \right)}^{{a}_{4}}{\left(\frac{\mathrm{PLD}}{\mathrm{LCD}}\right)}^{{a}_{5}}{\left(\frac{{K}_{0}\exp \left(-\frac{\Delta {H}_{\mathrm{ads}}}{{RT}}\right)\cdot \left(1+{b}_{1}{\left(\sum _{i}\frac{{\varepsilon }_{\mathrm{mix}}{d}_{{\rm{C}}{{\rm{O}}}_{2}}}{{\sigma }_{\mathrm{mix}}+{d}_{{\rm{C}}{{\rm{O}}}_{2}}}\right)}^{{b}_{2}}\right)\cdot P}{1+{K}_{0}\exp \left(-\frac{\Delta {H}_{\mathrm{ads}}}{{RT}}\right)\cdot P}\right)}^{m}+{P}^{n}$$

(P2)

Table 2 The parameters for the physical descriptors modeled for high-pressure regime (P1) and low-pressure regime (P2)

A closer look at the P1-calculated vs. ML-predicted uptake PCCs reveals that P1 is indeed better at higher pressures ($P\ge$ 15 bars) but suffers at low pressures. Thus, in formulating P2, we added an interaction term composed of the effective ${\varepsilon }_{{\rm{mix}}}$ and ${\sigma }_{{\rm{mix}}}$ between each framework atom $i$ and the CO₂ adsorbate, and the kinetic diameter of CO₂ (${d}_{{\rm{C}}{{\rm{O}}}_{2}}$), which increased the PCC significantly to 0.612 at 5 bars. However, in comparing the P2-calculated uptake with the ground-truth GCMC uptake, we note that IsothermNet still produced better predictions at lower pressure (${{\rm{PCC}}}_{{\rm{P2}}}=0.66$ vs. ${{\rm{PCC}}}_{{\rm{IsothermNet}}}=0.73$ at 0.5 bars). This holds up to 10 bars, at which point the physical descriptors become slightly more favorable (${{\rm{PCC}}}_{{\rm{P2}}}=0.86$ vs. ${{\rm{PCC}}}_{{\rm{IsothermNet}}}=0.82$ at 10 bars; ${{\rm{PCC}}}_{{\rm{P1}}}=0.99$ vs. ${{\rm{PCC}}}_{{\rm{IsothermNet}}}=0.95$ at 50 bars). In-depth comparisons of the physical descriptors and IsothermNet predictions are summarized in Supplementary Table S3-4.
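Descriptors P1 and P2 can be transcribed directly into code, which makes their structure easier to inspect. The sketch below follows the two equations term by term; it assumes $\Delta {H}_{{\rm{ads}}}$ is supplied in kJ/mol, takes the summed interaction term of P2 as a precomputed input, and uses placeholder parameter values because the fitted values from Table 2 are not reproduced in this text.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)

def affinity(k0, dh_ads, temperature):
    """K0 * exp(-dH_ads / (R T)), with dH_ads in kJ/mol (assumed units)."""
    return k0 * math.exp(-dh_ads / (R * temperature))

def uptake_p1(p, a_s, rho, v, phi, pld, lcd, dh_ads, temperature, k0, m, n, a):
    """Descriptor P1 (Sips-like); `a` holds the exponents (a1, a2, a3, a4)."""
    kp = affinity(k0, dh_ads, temperature) * p * (pld / lcd) ** a[3]
    prefactor = a_s ** a[0] * rho ** a[1] * v ** a[2] * (1.0 - phi)
    return (prefactor * kp / (1.0 + kp ** m)) ** (1.0 / m) + p ** n

def uptake_p2(p, a_s, rho, v, phi, pld, lcd, dh_ads, temperature,
              k0, m, n, a, b, interaction_sum):
    """Descriptor P2 (Langmuir-like); `a` holds (a1..a5), `b` holds (b1, b2).

    `interaction_sum` is the sum over framework atoms of
    eps_mix * d_CO2 / (sigma_mix + d_CO2), computed elsewhere.
    """
    k = affinity(k0, dh_ads, temperature)
    prefactor = (a_s ** a[0] * rho ** a[1] * v ** a[2]
                 * (1.0 - phi) ** a[3] * (pld / lcd) ** a[4])
    ratio = k * (1.0 + b[0] * interaction_sum ** b[1]) * p / (1.0 + k * p)
    return prefactor * ratio ** m + p ** n

# Placeholder inputs chosen so both forms reduce to K*P / (1 + K*P) + P.
q1 = uptake_p1(1.0, 2.0, 3.0, 4.0, 0.0, 1.0, 2.0, 0.0, 298.0,
               1.0, 1.0, 1.0, (0, 0, 0, 0))
q2 = uptake_p2(1.0, 2.0, 3.0, 4.0, 0.0, 1.0, 2.0, 0.0, 298.0,
               1.0, 1.0, 1.0, (0, 0, 0, 0, 0), (0.0, 1.0), 1.0)
```

With all textural exponents set to zero and $m = 1$, both expressions collapse to a plain Langmuir term plus the ${P}^{n}$ correction, which is a convenient sanity check on any reimplementation.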

The analytical descriptor sets, A1 and A2 (Table 3), were determined by optimizing the parameters (${a}_{i},{b}_{i}$ for $i\in \{1,\ldots ,6\}$) of the variable-transformed textural properties (of the form ${a}_{i}{z}_{i}^{{b}_{i}}$, where ${z}_{i}$ is the textural property) with a basin hopping algorithm, such that the centroid distance between the isotherm type clusters is maximized (more details can be found in Supplementary Note S6). Thus, unlike the physical descriptors, these analytical forms are better suited to clear isotherm shape determination.
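The optimization idea can be sketched with SciPy's `basinhopping`. The objective actually used is described in Supplementary Note S6 and is not reproduced here, so the snippet rests on stated assumptions: it uses an A1-like descriptor with every ${a}_{i}$ fixed to 1, it scores a parameter set by the smallest centroid gap divided by the mean within-class spread (a normalization added here so the score cannot grow just by scaling the exponents), and all function names are hypothetical.

```python
import numpy as np
from scipy.optimize import basinhopping


def descriptor(b, log_z, log_q):
    """A1-like descriptor with all a_i fixed to 1: x = log(q * prod_i z_i**b_i)."""
    return log_q + log_z @ b


def neg_separation(b, log_z, log_q, labels):
    """Negative of (smallest gap between class centroids) / (mean within-class spread)."""
    x = descriptor(b, log_z, log_q)
    classes = np.unique(labels)  # needs at least two isotherm classes
    centroids = np.array([x[labels == c].mean() for c in classes])
    spread = np.mean([x[labels == c].std() for c in classes]) + 1e-9
    gaps = np.abs(centroids[:, None] - centroids[None, :])
    min_gap = gaps[np.triu_indices(len(classes), k=1)].min()
    return -min_gap / spread


def fit_exponents(log_z, log_q, labels, b0, n_iter=25, bound=5.0):
    """Basin hopping over the exponents b_i, each confined to [-bound, bound]."""
    minimizer_kwargs = {
        "method": "L-BFGS-B",
        "args": (log_z, log_q, labels),
        "bounds": [(-bound, bound)] * len(b0),
    }
    return basinhopping(neg_separation, b0, niter=n_iter,
                        minimizer_kwargs=minimizer_kwargs)
```

Here `log_z` is an (n_MOFs × n_features) array of log-transformed textural properties and `labels` holds one isotherm type per MOF. Basin hopping is a reasonable fit for this kind of objective because the minimum-gap term makes the landscape non-smooth, with many local optima.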

$$x=\log \left(q\cdot \left({a}_{1}{\left(\log {A}_{s}\right)}^{{b}_{1}}\right)\left({a}_{2}{\phi }^{{b}_{2}}\right)\left({a}_{3}{\left(\log {\rm{PLD}}\right)}^{{b}_{3}}\right)\left({a}_{4}{\left(\log {\rm{LCD}}\right)}^{{b}_{4}}\right)\left({a}_{5}{\rho }^{{b}_{5}}\right)\left({a}_{6}{V}^{{b}_{6}}\right)\right)$$

(A1)

$$x=\frac{\log \left(\left({a}_{1}{\left(\log {A}_{s}\right)}^{{b}_{1}}\right)\left({a}_{2}{\phi }^{{b}_{2}}\right)\left({a}_{3}{\left(\log {\rm{PLD}}\right)}^{{b}_{3}}\right)\left({a}_{4}{\left(\log {\rm{LCD}}\right)}^{{b}_{4}}\right)\left({a}_{5}{\rho }^{{b}_{5}}\right)\left({a}_{6}{V}^{{b}_{6}}\right)\right)}{\log q}$$

(A2)

Table 3 The parameters and functions that define descriptors A1 and A2, and the expressions for fitted curves that describe heat of adsorption ($\Delta {H}_{{\rm{ads}}}$ [kJ/mol]) as a function of textural properties (${A}_{s}$ [m²/g], $\phi$ [-], PLD [Å], LCD [Å], $\rho$ [g/cm³], $V$ [Å³]) and uptake ($q$ [g/kg]) (at 3, 25, and 50 bars)


After plotting both curve fits (Fig. 4a-b), we observed that A2 yields more accurate predictions at low pressures ($\lesssim$ 10 bars), whereas A1 performs more favorably at higher pressures. Generally, most predictions fall within 10% deviation at high pressure for A1 and within 25% deviation at low pressure for A2. This difference in fitting quality can be related back to our feature importance study. We previously noted that at low pressure, there should be a relatively balanced contribution from most textural features (i.e., uptake should not be governed solely by volume-defining properties like $V$, $\rho$, LCD). This is reflected in A2 (low ${b}_{4}$, ${b}_{5}$). Conversely, at high pressure, $V$, $\rho$, and LCD should dominate over PLD, $\phi$, and ${A}_{s}$, which is reflected in A1 (high ${b}_{4}$, ${b}_{6}$; low ${b}_{1}$). The full set of fitting curves at pressures between 0 and 50 bars for both analytical descriptors is tabulated in Supplementary Table S5. The advantage of a defined descriptor is evident when examining the raw uptake vs. $\Delta {H}_{{\rm{ads}}}$ plot (Fig. 4c, Supplementary Fig. S11), which displays high $x$-$y$ variance and ambiguous trends that fluctuate with pressure. The outlined descriptors, conversely, demonstrate high $x$-$y$ correlation across all pressure regimes. Although minimal, some pressure dependence remains, as highlighted by the regime map and its corresponding regression fits for A1 (Fig. 4d). This regime map can serve as a valuable tool for quick isotherm shape classification.
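For concreteness, Eqs. A1 and A2 can be evaluated as in the minimal sketch below. It assumes natural logarithms (the base is not stated in the text), takes the textural properties as a plain dictionary with hypothetical keys, and expects the caller to supply the Table 3 parameters, which are not reproduced here.

```python
import math


def textural_product(tex, a, b):
    """Product over i of a_i * z_i**b_i, with z_i in the order used by Eqs. A1/A2."""
    z = [
        math.log(tex["A_s"]),  # log of gravimetric surface area
        tex["phi"],            # void fraction
        math.log(tex["PLD"]),  # log of pore limiting diameter
        math.log(tex["LCD"]),  # log of largest cavity diameter
        tex["rho"],            # density
        tex["V"],              # channel volume
    ]
    product = 1.0
    for a_i, b_i, z_i in zip(a, b, z):
        product *= a_i * z_i ** b_i
    return product


def descriptor_a1(q, tex, a, b):
    """Eq. A1: x = log(q * product)."""
    return math.log(q * textural_product(tex, a, b))


def descriptor_a2(q, tex, a, b):
    """Eq. A2: x = log(product) / log(q); undefined when q equals 1."""
    return math.log(textural_product(tex, a, b)) / math.log(q)
```

Because A1 multiplies by $q$ inside the logarithm while A2 divides by $\log q$, the same textural product produces descriptors that scale differently with uptake.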

**Fig. 4: Two proposed universal descriptors to capture interdependencies amongst uptake, textural properties, heat of adsorption, and isotherm type/shape.**

To validate both descriptor frameworks, we compared the heat of adsorption obtained from published experimental data ($\Delta {H}_{{\rm{ads}}}=$ 22.9–24.3 kJ/mol for 1–0.1 bar⁴³) with descriptor-based fittings for UiO-66 (Fig. 4a-b, first row). The textural properties for UiO-66 were calculated using Zeo++ (20-second compute time; specific values reported in Supplementary Table S6) and the uptake was calculated with the Langmuir fitting function (using the experimentally-reported ${Q}_{m}$ and $b$ inputs). Using A1 and A2, the computed $\Delta {H}_{{\rm{ads}}}$ values at 1 bar are 19.856 (−13.29% error) and 23.538 (+2.78% error) kJ/mol, respectively. These errors align with the prior observation that A2 has better predictive ability at low pressures compared to A1. Moreover, the CO₂ adsorption isotherm for UiO-66 is type I (as it can be sufficiently described by the Langmuir adsorption model), which is correctly classified by both descriptors. This validation process was repeated for IRMOF-1/MOF-5⁴⁴, HKUST-1⁴⁵, and MIL-101 (Cr)⁴⁵. Overall, we observed that prediction quality decreases for MOFs exhibiting very large channel volumes (e.g., MIL-101 (Cr)). This is likely due to the larger ${a}_{6},{b}_{6}$ parameters describing volume, which would impact the sensitivity of the functional form. Nevertheless, the isotherm type classifications for all trialed MOFs are correct when using A2.
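The quoted percent errors can be checked with a few lines of arithmetic. The sketch below assumes the 1 bar experimental value of 22.9 kJ/mol is the reference, which reproduces the reported figures to within rounding. The Langmuir helper shows the form of the uptake calculation only; the experimental ${Q}_{m}$ and $b$ values are not given in the text, and the function names are illustrative.

```python
def langmuir_uptake(P, Q_m, b):
    """Langmuir uptake q = Q_m * b * P / (1 + b * P)."""
    return Q_m * b * P / (1.0 + b * P)


def percent_error(predicted, reference):
    """Signed percent deviation of a prediction from a reference value."""
    return 100.0 * (predicted - reference) / reference


reference_dH = 22.9  # kJ/mol at 1 bar, from the experimental range quoted above
error_a1 = percent_error(19.856, reference_dH)  # about -13.29 %
error_a2 = percent_error(23.538, reference_dH)  # about +2.79 %
```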

We further validated the two analytical descriptors on a larger unseen dataset from the CoRE MOF 2019 database⁴⁶. Repeating the textural and adsorption property determination described in previous sections, we determined the uptake and $\Delta {H}_{{\rm{ads}}}$ for 322, 1426, 1416, and 616 MOFs at 5, 10, 30, and 50 bars, respectively (Fig. 5). We note that A2 performs better at lower pressures (MSE₁ = 23.11 kJ/mol vs. MSE₂ = 5.65 kJ/mol for 5 bars; MSE₁ = 26.79 kJ/mol vs. MSE₂ = 9.57 kJ/mol for 10 bars), where the subscripts 1 and 2 denote A1 and A2. At high pressure (30 bars), the error gap decreases significantly (MSE₁ = 16.57 kJ/mol vs. MSE₂ = 12.92 kJ/mol). Finally, at near-saturation pressure (50 bars), the MSEs are comparable (MSE₁ = 9.57 kJ/mol vs. MSE₂ = 8.71 kJ/mol). The detailed data distribution based on percent deviation can be found in Supplementary Table S7. Since the results on the holdout data are consistent with those projected from the descriptors’ performance on the QMOF database, the generalizability of both descriptors is substantiated.

**Fig. 5: Descriptor validation with randomly-sampled MOFs from the CoRE MOF 2019 database⁴⁶.**

Discussion

To bridge the gap in understanding physio-thermodynamic interactions in MOF-based CO₂ adsorption, we developed IsothermNet, a high-throughput machine learning model capable of predicting uptake and heat of adsorption across a broad pressure range (0–50 bars). This enables the reconstruction of full isotherm and enthalpy curves, providing a more comprehensive view of adsorption behavior. Since IsothermNet is trained on only neutral, rigid frameworks and CO₂ adsorbates, there remains an open research question for charged, multi-component, or flexible MOFs that can be explored in future studies. Beyond property prediction, we performed an in-depth feature importance study to investigate structure-uptake and, most importantly, structure-enthalpy relationships, which remain elusive in literature. Next, we addressed the overlooked connection between adsorption properties and isotherm shape classification, allowing us to extract physical insights into how structural attributes influence enthalpic behavior and uptake dynamics.

To unify these findings, we introduced a set of physical and analytical universal descriptors that effectively link structure, uptake, and $\Delta {H}_{{\rm{ads}}}$. These descriptors not only generalize well to unseen datasets but also enable rapid and accessible $\Delta {H}_{{\rm{ads}}}$ (and $q$) prediction at any pressure when used together with IsothermNet. They are especially valuable for streamlining the $\Delta {H}_{{\rm{ads}}}$ characterization of newly synthesized materials without measuring uptakes at multiple temperatures, as required by the Clausius-Clapeyron equation⁴⁷. Such repeated measurements scale with sample size and may require inaccessible or specialized equipment; in addition, the lack of a structure file (e.g., CIF) inhibits the immediate use of molecular simulations. Thus, the universal descriptors add a layer of speed that is favorable for quick material selection, design, and characterization (example scenarios in Supplementary Note S7). While these descriptors are useful for quickly calculating adsorption properties and determining isotherm shapes for theoretical MOFs, we emphasize that IsothermNet is still needed as a starting point for finding either $q$ or $\Delta {H}_{{\rm{ads}}}$, since the goal is to bypass time-consuming GCMC simulations. For context, each MOF averages 19.85 hours (using 8 CPU cores) for a full, 19-point, high-resolution isotherm computation, so both IsothermNet and the descriptors can expedite adsorption property retrieval significantly with almost instantaneous results, especially for large (>10⁶ MOFs) databases. Since IsothermNet can predict both adsorption properties, the user has considerable freedom in choosing how to combine it with the descriptors. This approach can therefore serve as a versatile framework for efficient and accelerated MOF design for CO₂ capture.

Methods

Grand Canonical Monte Carlo (GCMC) Computations

We computed the uptake and heat of adsorption of each QMOF with Grand Canonical Monte Carlo (GCMC) simulations using the RASPA software⁴⁸. We used a van der Waals cut-off of 12 Å. The simulations were run for 4000 initialization cycles and 6000 post-initialization cycles at pressures of 1000, 5000, 10000, 50000, 100000, 200000, 300000, 400000, 500000, 700000, 1000000, 1500000, 2000000, 2500000, 3000000, 3500000, 4000000, 4500000, and 5000000 Pa at 298 K. The adsorbate was CO₂ (properties and partial charges are given in the molecule and pseudo-atom definition files, with a forcefield based on TraPPE). The MOF framework atoms were described using the Universal Force Field (UFF)⁴⁹, and their partial charges are specified in the CIF files. Lorentz–Berthelot mixing rules were used for van der Waals interactions between unlike atoms. Ewald summation was used to capture electrostatic interactions. The helium void fraction of each MOF was also computed using RASPA and subsequently used as an input to the GCMC studies. All computations were carried out serially on the PSC Bridges-2 (RM – regular memory) clusters. All relevant files for RASPA and the code for high-throughput, automated job submissions can be found in the GitHub repository.
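Two details of this setup are easy to make concrete: the 19-point pressure grid and the Lorentz–Berthelot combining rules. The sketch below lists the pressures from the text and implements the standard rules (arithmetic mean for $\sigma$, geometric mean for $\varepsilon$). The function name is illustrative, and the snippet does not touch RASPA input files.

```python
import math

# The 19 simulated pressures listed above, in Pa.
PRESSURES_PA = [
    1000, 5000, 10000, 50000, 100000, 200000, 300000, 400000, 500000, 700000,
    1000000, 1500000, 2000000, 2500000, 3000000, 3500000, 4000000, 4500000, 5000000,
]
PRESSURES_BAR = [p / 1e5 for p in PRESSURES_PA]  # spans 0.01 to 50 bar


def lorentz_berthelot(sigma_i, eps_i, sigma_j, eps_j):
    """Lennard-Jones parameters for an unlike pair of atoms.

    sigma_ij is the arithmetic mean (Lorentz rule) and eps_ij the
    geometric mean (Berthelot rule) of the like-pair parameters.
    """
    return 0.5 * (sigma_i + sigma_j), math.sqrt(eps_i * eps_j)
```

The grid is denser at low pressure, where uptake changes fastest, which is consistent with the 19-point isotherm resolution mentioned in the Discussion.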

Textural properties calculations

The textural properties (pore limiting diameter [PLD], largest cavity diameter [LCD], gravimetric surface area [${A}_{s}$], void fraction [$\phi$], channel volume [$V$], number of channels [$n$], density [$\rho$]) were calculated using Zeo++⁵⁰. The assumed probe diameter is the kinetic diameter of a CO₂ molecule (3.3 Å). The input files (.cssr) were converted from .cif files. The simulations were run with 50,000 samples and with the high accuracy flag (-ha). All relevant files for Zeo++ and the code for automated job submissions can be found in the GitHub repository.
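A batch driver for such calculations might assemble the Zeo++ command lines as sketched below. This is an assumption-laden illustration rather than the authors' script: the exact invocations they used are in the GitHub repository and are not quoted in the text. The `-res`, `-sa`, `-vol`, and `-chan` options follow common usage of the Zeo++ `network` binary, which takes a probe radius (half of the 3.3 Å diameter), and the helper name and dictionary keys are hypothetical.

```python
def zeopp_commands(cssr_path, probe_diameter=3.3, n_samples=50000, binary="network"):
    """Build (but do not run) Zeo++ command lines for one structure file."""
    radius = str(probe_diameter / 2.0)  # Zeo++ expects a probe radius in Angstrom
    base = [binary, "-ha"]              # high accuracy flag, as stated in the text
    return {
        "pore_diameters": base + ["-res", cssr_path],
        "surface_area": base + ["-sa", radius, radius, str(n_samples), cssr_path],
        "volume": base + ["-vol", radius, radius, str(n_samples), cssr_path],
        "channels": base + ["-chan", radius, cssr_path],
    }
```

Each list can be passed to `subprocess.run` on a machine where the binary is installed. Keeping command construction separate from execution makes the job generator easy to check before submitting thousands of structures.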

Isotherm classification

Due to the large number of MOFs studied, a generalized isotherm classification method was required. First, we note that all QMOF isotherms observed can be sufficiently represented by the Sips (Langmuir-Freundlich) function (Eq. 3), but only the Type I isotherm can be accurately represented by the Langmuir isotherm (Eq. 4). Thus, we can first differentiate Type I and Type I(b) isotherms.

$$q={q}_{\max }{\left(\frac{{KP}}{1+{\left({KP}\right)}^{n}}\right)}^{\frac{1}{n}}$$

(3)

$$q={q}_{\max }\frac{{KP}}{{KP}+1}$$

(4)

The summed mean squared errors obtained after fitting the Langmuir and Sips isotherms were compared, such that if $\frac{1}{c}\frac{{{\rm{MSE}}}_{{\rm{Langmuir}}}}{{{\rm{MSE}}}_{{\rm{Sips}}}} > 1$ (where $c$ is the fitted ${q}_{\max }$ parameter in the Langmuir isotherm), the isotherm is classified as either Type I or Type I(b). To distinguish between the two, we note that Type I isotherms have a steeper increase at low pressures and a smaller slope at high pressures. Thus, the first derivative of the curve near the saturation pressure enables adequate filtering: if it is less than a threshold of 0.005, the isotherm is classified as Type I; otherwise, it is classified as Type I(b).
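The first stage of this classification can be sketched with SciPy's `curve_fit`. The snippet below transcribes Eqs. 3 and 4 and the two tests described above. It is an illustration under assumptions: the initial guesses, the parameter bounds, the small `eps` guard against division by zero, and the two-point slope estimate at the highest pressures are choices made here, not taken from the paper.

```python
import numpy as np
from scipy.optimize import curve_fit


def langmuir(P, q_max, K):
    """Eq. 4."""
    return q_max * K * P / (1.0 + K * P)


def sips(P, q_max, K, n):
    """Eq. 3, written as it appears in the text."""
    KP = K * P
    return q_max * (KP / (1.0 + KP ** n)) ** (1.0 / n)


def fit_mse(model, P, q, p0):
    """Bounded least-squares fit (all parameters positive); returns (params, MSE)."""
    lower = [1e-6] * len(p0)
    popt, _ = curve_fit(model, P, q, p0=p0, bounds=(lower, np.inf))
    return popt, float(np.mean((model(P, *popt) - q) ** 2))


def langmuir_criterion(mse_langmuir, mse_sips, q_max_langmuir, eps=1e-12):
    """(1/c) * MSE_Langmuir / MSE_Sips, with c the fitted Langmuir q_max."""
    return mse_langmuir / (q_max_langmuir * (mse_sips + eps))


def classify_type_i(P, q, slope_threshold=0.005):
    """Return 'Type I', 'Type I(b)', or None when the criterion is not met."""
    (q_max_l, _), mse_l = fit_mse(langmuir, P, q, p0=[q.max(), 1.0])
    _, mse_s = fit_mse(sips, P, q, p0=[q.max(), 1.0, 1.0])
    if langmuir_criterion(mse_l, mse_s, q_max_l) <= 1.0:
        return None  # handled by the Type III / Type V branch
    slope = (q[-1] - q[-2]) / (P[-1] - P[-2])  # first derivative near saturation
    return "Type I" if slope < slope_threshold else "Type I(b)"
```

With $n=1$ the Sips form reduces to the Langmuir form, so the two fits coincide on ideal Langmuir data. The slope threshold depends on the units of $q$ and $P$, which the caller must keep consistent with the 0.005 value.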

If the isotherm does not meet the condition $\frac{1}{c}\frac{{{\rm{MSE}}}_{{\rm{Langmuir}}}}{{{\rm{MSE}}}_{{\rm{Sips}}}} > 1$, it is either Weak Type V, Type V, or Type III. The isotherm is classified as Type III if $\frac{{{\rm{MSE}}}_{\exp }}{{{\rm{MSE}}}_{{\rm{Langmuir}}}} < 0.2$, since the Langmuir isotherm fits a Type III isotherm (which has an exponential trend) poorly. Additionally, Weak Type V and Type V isotherms should have an inflection point, unlike Type III isotherms. Thus, to distinguish these isotherms, we can utilize the second derivative of the curves. To discern Weak Type V from Type V isotherms, we note that the former exhibits a larger slope at low pressures. The specifications and results of the thresholding/tolerance parameter tuning are reported in Table 4. After using this high-throughput classification method to achieve the largest number of true positives, we visually confirmed that the classification is correct for each isotherm.
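The second stage can be sketched in the same spirit. The snippet below detects an inflection point from a sign change in the numerical second derivative, then applies the ratio and slope tests described above. The tuned tolerance values are reported in Table 4 and are not reproduced here, so `low_p_slope_tol` and `tol` are placeholders. The function names and the "unclassified" fallback are assumptions made for illustration.

```python
import numpy as np


def has_inflection(P, q, tol=1e-8):
    """True when the numerical second derivative d2q/dP2 changes sign."""
    d2 = np.gradient(np.gradient(q, P), P)
    signs = np.sign(d2[np.abs(d2) > tol])  # ignore numerically flat regions
    return bool(np.any(signs[1:] != signs[:-1]))


def classify_non_langmuir(P, q, mse_exp, mse_langmuir, low_p_slope_tol):
    """Separate Type III, Weak Type V, and Type V isotherms."""
    if mse_exp / mse_langmuir < 0.2:
        return "Type III"      # an exponential fits far better than Langmuir
    if not has_inflection(P, q):
        return "unclassified"  # fallback added for this sketch
    low_p_slope = (q[1] - q[0]) / (P[1] - P[0])
    return "Weak Type V" if low_p_slope > low_p_slope_tol else "Type V"
```

On noisy isotherms the second derivative amplifies noise, so smoothing the curve or differentiating a fitted function before this test would likely be needed in practice.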

Table 4 The recall, number of true positives, and accuracy for each isotherm type (relative to the total number of samples) based on the threshold (to distinguish Type I isotherms from the other classes) and the tolerance (that separates Weak Type V isotherms from the other classes)


Data availability

The crystallography/structural files for the MOFs can be found in the Quantum MOF (QMOF) database. The training/testing/validation data can be found in https://github.com/emilylin3/IsothermNet and https://zenodo.org/records/15555513.

Code availability

All relevant codes and scripts can be found in the GitHub repository: https://github.com/emilylin3/IsothermNet.

References

Chen, Z., Kirlikovali, K. O., Li, P. & Farha, O. K. Reticular Chemistry for Highly Porous Metal–Organic Frameworks: The Chemistry and Applications. Acc. Chem. Res. 55, 579–591 (2022).
Farha, O. K. et al. Metal–Organic Framework Materials with Ultrahigh Surface Areas: Is the Sky the Limit?. J. Am. Chem. Soc. 134, 15016–15021 (2012).
Anderson, R., Biong, A. & Gómez-Gualdrón, D. A. Adsorption Isotherm Predictions for Multiple Molecules in MOFs Using the Same Deep Learning Model. J. Chem. Theory Comput. 16, 1271–1283 (2020).
Chen, P., Jiao, R., Liu, J., Liu, Y. & Lu, Y. Interpretable Graph Transformer Network for Predicting Adsorption Isotherms of Metal–Organic Frameworks. J. Chem. Inf. Model. 62, 5446–5456 (2022).
Wang, S., Li, Y., Dai, S. & Jiang, D. Prediction by Convolutional Neural Networks of CO2/N2 Selectivity in Porous Carbons from N2 Adsorption Isotherm at 77 K. Angew. Chem. Int. Ed. 59, 19645–19648 (2020).
Guo, S. et al. Interpretable Machine-Learning and Big Data Mining to Predict Gas Diffusivity in Metal-Organic Frameworks. Adv. Sci. 10, 2301461 (2023).
Borboudakis, G. et al. Chemically intuited, large-scale screening of MOFs by machine learning techniques. npj Comput Mater. 3, 1–7 (2017).
Yu, X., Choi, S., Tang, D., Medford, A. J. & Sholl, D. S. Efficient Models for Predicting Temperature-Dependent Henry’s Constants and Adsorption Selectivities for Diverse Collections of Molecules in Metal–Organic Frameworks. J. Phys. Chem. C. 125, 18046–18057 (2021).
Kim, S.-Y., Kim, S.-I. & Bae, Y.-S. Machine-Learning-Based Prediction of Methane Adsorption Isotherms at Varied Temperatures for Experimental Adsorbents. J. Phys. Chem. C. 124, 19538–19547 (2020).
Wang, J. et al. Machine Learning for Accurate Prediction of Henry’s Law Constant in CO2–Ionic Liquid Systems. ACS Sustain. Chem. Eng. 13, 1582–1591 (2025).
Ren, E. & Coudert, F.-X. Enhancing Gas Separation Selectivity Prediction through Geometrical and Chemical Descriptors. Chem. Mater. 35, 6771–6781 (2023).
Li, X. et al. Applied machine learning to analyze and predict CO2 adsorption behavior of metal-organic frameworks. Carbon Capture Sci. Technol. 9, 100146 (2023).
Altintas, C., Altundal, O. F., Keskin, S. & Yildirim, R. Machine Learning Meets with Metal Organic Frameworks for Gas Storage and Separation. J. Chem. Inf. Model. 61, 2131–2146 (2021).
Orhan, I. B., Le, T. C., Babarao, R. & Thornton, A. W. Accelerating the prediction of CO2 capture at low partial pressures in metal-organic frameworks using new machine learning descriptors. Commun. Chem. 6, 1–12 (2023).
Fernandez, M., Trefiak, N. R. & Woo, T. K. Atomic Property Weighted Radial Distribution Functions Descriptors of Metal–Organic Frameworks for the Prediction of Gas Uptake Capacity. J. Phys. Chem. C. 117, 14095–14105 (2013).
Al-Ghouti, M. A. & Da’ana, D. A. Guidelines for the use and interpretation of adsorption isotherm models: A review. J. Hazard. Mater. 393, 122383 (2020).
Ramsahye, N. A. et al. Adsorption of CO2 in metal organic frameworks of different metal centres: Grand Canonical Monte Carlo simulations compared to experiments. Adsorption 13, 461–467 (2007).
Salles, F. et al. Molecular Dynamics Simulations of Breathing MOFs: Structural Transformations of MIL-53(Cr) upon Thermal Activation and CO2 Adsorption. Angew. Chem. Int. Ed. https://doi.org/10.1002/anie.200803067 (2008).
Hirohara, M., Saito, Y., Koda, Y., Sato, K. & Sakakibara, Y. Convolutional neural network based on SMILES representation of compounds for detecting chemical motif. BMC Bioinforma. 19, 526 (2018).
Krenn, M. et al. SELFIES and the future of molecular string representations. Patterns 3, 100588 (2022).
Li, C., Feng, J., Liu, S. & Yao, J. A Novel Molecular Representation Learning for Molecular Property Prediction with a Multiple SMILES-Based Augmentation. Computational Intell. Neurosci. 2022, e8464452 (2022).
Rogers, D. & Hahn, M. Extended-Connectivity Fingerprints. J. Chem. Inf. Model. 50, 742–754 (2010).
Wen, N. et al. A fingerprints based molecular property prediction method using the BERT model. J. Cheminformatics 14, 71 (2022).
Edwards, M. & Xie, X. Graph Based Convolutional Neural Network. Preprint at https://doi.org/10.48550/arXiv.1609.08965 (2016).
Zhou, J. et al. Graph neural networks: A review of methods and applications. AI Open 1, 57–81 (2020).
Reiser, P. et al. Graph neural networks for materials science and chemistry. Commun. Mater. 3, 1–18 (2022).
Lin, X. et al. 3D-structure-attention graph neural network for crystals and materials. Mol. Phys. 120, e2077258 (2022).
Fung, V., Zhang, J., Juarez, E. & Sumpter, B. G. Benchmarking graph neural networks for materials chemistry. npj Comput Mater. 7, 1–8 (2021).
Lee, J. & Asahi, R. Transfer learning for materials informatics using crystal graph convolutional neural network. Computational Mater. Sci. 190, 110314 (2021).
Veličković, P. et al. Graph Attention Networks. Preprint at https://doi.org/10.48550/arXiv.1710.10903 (2018).
Rosen, A. S. et al. High-throughput predictions of metal–organic framework electronic properties: theoretical challenges, graph neural networks, and data exploration. npj Comput Mater. 8, 1–10 (2022).
Rosen, A. S. et al. Machine learning the quantum-chemical properties of metal–organic frameworks for accelerated materials discovery. Matter 4, 1578–1597 (2021).
Moghadam, P. Z. et al. Development of a Cambridge Structural Database Subset: A Collection of Metal–Organic Frameworks for Past, Present, and Future. Chem. Mater. 29, 2618–2625 (2017).
Structure Sources | Materials Project Documentation. https://docs.materialsproject.org/apps/explorer-apps/mof-explorer/structure-details/structure-sources (2021).
Reddy, M. S. B., Ponnamma, D., Kumar Sadasivuni, K., Kumar, B. & Abdullah, A. M. Carbon dioxide adsorption based on porous materials. RSC Adv. 11, 12658–12681 (2021).
Chiang, Y.-C., Lee, S.-T., Leo, Y.-J. & Tseng, T.-L. Importance of Pore Structure and Surface Chemistry in Carbon Dioxide Adsorption on Electrospun Carbon Nanofibers. Sens. Mater. 32, 2277 (2020).
Garnier, C. H. et al. Selection of coals of different maturities for CO2 Storage by modelling of CH4 and CO2 adsorption isotherms. Int. J. Coal Geol. 87, 80–86 (2011).
Thommes, M. et al. Physisorption of gases, with special reference to the evaluation of surface area and pore size distribution (IUPAC Technical Report). Pure Appl. Chem. 87, 1051–1069 (2015).
Burhan, M., Shahzad, M. W. & Ng, K. C. Energy distribution function based universal adsorption isotherm model for all types of isotherm. Int. J. Low.-Carbon Technol. 13, 292–297 (2018).
Kim, K. C., Yoon, T.-U. & Bae, Y.-S. Applicability of using CO2 adsorption isotherms to determine BET surface areas of microporous materials. Microporous Mesoporous Mater. 224, 294–301 (2016).
Mukherjee, K., Osaro, E. & Colón, Y. J. Active learning for efficient navigation of multi-component gas adsorption landscapes in a MOF. Digital Discov. 2, 1506–1521 (2023).
Osaro, E., LaCapra, M. & Colón, Y. J. Harmonizing Adsorption and Diffusion in Active Learning Campaigns of Gas Separations in a MOF. J. Phys. Chem. C https://doi.org/10.1021/acs.jpcc.5c00922 (2025).
Cao, Y. et al. UiO-66-NH2/GO Composite: Synthesis, Characterization and CO2 Adsorption Performance. Materials 11, 589 (2018).
Nandi, S., Maity, R., Chakraborty, D., Ballav, H. & Vaidhyanathan, R. Preferential Adsorption of CO2 in an Ultramicroporous MOF with Cavities Lined by Basic Groups and Open-Metal Sites. Inorg. Chem. 57, 5267–5272 (2018).
Teo, H. W. B., Chakraborty, A. & Kayal, S. Evaluation of CH4 and CO2 adsorption on HKUST-1 and MIL-101(Cr) MOFs employing Monte Carlo simulation and comparison with experimental data. Appl. Therm. Eng. 110, 891–900 (2017).
Chung, Y. G. et al. Advances, Updates, and Analytics for the Computation-Ready, Experimental Metal–Organic Framework Database: CoRE MOF 2019. J. Chem. Eng. Data 64, 5985–5998 (2019).
Zhong, Y. et al. Bridging materials innovations to sorption-based atmospheric water harvesting devices. Nat. Rev. Mater. 1–18 https://doi.org/10.1038/s41578-024-00665-2 (2024).
Dubbeldam, D., Calero, S., Ellis, D. E. & Snurr, R. Q. RASPA: molecular simulation software for adsorption and diffusion in flexible nanoporous materials. Mol. Simul. 42, 81–101 (2016).
Rappe, A. K., Casewit, C. J., Colwell, K. S., Goddard, W. A. & Skiff, W. M. UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations. J. Am. Chem. Soc. 114, 10024–10035 (1992).
Willems, T. F., Rycroft, C. H., Kazi, M., Meza, J. C. & Haranczyk, M. Algorithms and tools for high-throughput geometry-based analysis of crystalline porous materials. Microporous Mesoporous Mater. 149, 134–141 (2012).
Gasteiger, J., Giri, S., Margraf, J. T. & Günnemann, S. Fast and Uncertainty-Aware Directional Message Passing for Non-Equilibrium Molecules. Preprint at https://doi.org/10.48550/arXiv.2011.14115 (2022).
Satorras, V. G., Hoogeboom, E. & Welling, M. E(n) Equivariant Graph Neural Networks. Preprint at https://doi.org/10.48550/arXiv.2102.09844 (2022).


Acknowledgements

This work used the Engaging OnDemand clusters at MIT Office of Research Computing and Data (ORCD). This work additionally used Bridges-2 at Pittsburgh Supercomputing Center (PSC) through allocation MCH230021 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, which is supported by National Science Foundation Grants No. 2138259, 2138286, 2138307, 2137603, and 2138296. EL is supported by the National Science Foundation Graduate Research Fellowship under Grant No. 2141064. YZ is supported by the MIT Evergreen Graduate Innovation Fellowship.

Author information

Authors and Affiliations

Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA, 02139, USA
Emily Lin, Yang Zhong, Gang Chen & Sili Deng

Authors

Emily Lin
Yang Zhong
Gang Chen
Sili Deng

Contributions

E.L.: designed the research, curated and analyzed the data, designed and implemented machine learning framework, performed results post-processing and analysis, wrote the main manuscript text and prepared the figures, edited the paper; Y.Z.: designed the research, edited the manuscript; G.C.: edited the manuscript; S.D.: edited the manuscript. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Emily Lin or Sili Deng.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

IsothermNet Code

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article

Cite this article

Lin, E., Zhong, Y., Chen, G. et al. Unified physio-thermodynamic descriptors via learned CO₂ adsorption properties in metal-organic frameworks. npj Comput Mater 11, 225 (2025). https://doi.org/10.1038/s41524-025-01700-8


Received: 14 February 2025
Accepted: 13 June 2025
Published: 12 July 2025
Version of record: 12 July 2025
DOI: https://doi.org/10.1038/s41524-025-01700-8
