Table 1 Descriptor groups in the dataset.

From: Accelerating the prediction of CO2 capture at low partial pressures in metal-organic frameworks using new machine learning descriptors

| Group | Descriptor |
| --- | --- |
| Dataframe skeleton | MOF name |
| | Target variable: CO2 uptake (mmol g−1) |
| | Pressure |
| Atom type (A) | Number of H atoms per unit volume |
| | Number of C atoms per unit volume |
| | Number of N atoms per unit volume |
| | Number of F atoms per unit volume |
| | Number of Cl atoms per unit volume |
| | Number of Br atoms per unit volume |
| | Number of V atoms per unit volume |
| | Number of Cu atoms per unit volume |
| | Number of Zn atoms per unit volume |
| | Number of Zr atoms per unit volume |
| Geometric (B) | Accessible surface area |
| | Non-accessible surface area |
| | Accessible volume |
| | Non-accessible volume |
| | Accessible probe-occupiable volume |
| | Non-accessible probe-occupiable volume |
| | Pore limiting diameter |
| | Largest cavity diameter |
| | Largest free path diameter |
| | Density |
| | Volume |
| Chemical (C) | Total degree of unsaturation |
| | Metallic percentage |
| | Oxygen to metal ratio |
| | Electronegative to total ratio |
| | Weighted electronegativity per atom |
| | Nitrogen to oxygen ratio |
| Effective point charge (D) | Charge-based uptake at 40 Pa |
| | Charge-based uptake at 1 kPa |
| | Charge-based uptake at 4 kPa |
| | Charge-based uptake at 40 Pa averaged per atom |
| | Charge-based uptake at 1 kPa averaged per atom |
| | Charge-based uptake at 4 kPa averaged per atom |
| | Charge-based uptake at 40 Pa per unit volume |
| | Charge-based uptake at 1 kPa per unit volume |
| | Charge-based uptake at 4 kPa per unit volume |
| Energy descriptor (E) | Henry coefficient |

Descriptor groups in the dataset. Features are grouped by similarity into atom type (A), geometric (B), chemical (C), effective point charge (D), and energy (E) descriptors. Each record in the dataset is indexed by MOF name and the pressure at which the simulation was performed, and the simulated CO2 uptake is recorded for each MOF and pressure combination.
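As a rough illustration of how the grouping in Table 1 might be carried into code, the sketch below maps each group letter to a list of feature columns and assembles the column layout for a chosen combination of groups. All column identifiers (for example `n_H_per_volume` or `charge_uptake_40Pa_per_atom`) and the helper names are hypothetical: the table gives descriptor labels only, not the names used in the authors' dataset.

```python
from typing import Dict, List

# Skeleton columns: each record is identified by MOF name and pressure,
# and carries the simulated CO2 uptake as the target variable.
# Identifiers are hypothetical; Table 1 lists labels, not column names.
SKELETON: List[str] = ["mof_name", "pressure", "co2_uptake_mmol_per_g"]

# Descriptor groups A-E as listed in Table 1 (hypothetical identifiers).
DESCRIPTOR_GROUPS: Dict[str, List[str]] = {
    # (A) atom type: atoms of each element per unit volume
    "A": [f"n_{el}_per_volume"
          for el in ("H", "C", "N", "F", "Cl", "Br", "V", "Cu", "Zn", "Zr")],
    # (B) geometric
    "B": ["accessible_surface_area", "non_accessible_surface_area",
          "accessible_volume", "non_accessible_volume",
          "accessible_probe_occupiable_volume",
          "non_accessible_probe_occupiable_volume",
          "pore_limiting_diameter", "largest_cavity_diameter",
          "largest_free_path_diameter", "density", "volume"],
    # (C) chemical
    "C": ["total_degree_of_unsaturation", "metallic_percentage",
          "oxygen_to_metal_ratio", "electronegative_to_total_ratio",
          "weighted_electronegativity_per_atom", "nitrogen_to_oxygen_ratio"],
    # (D) effective point charge: three pressures x three normalisations
    "D": [f"charge_uptake_{p}{suffix}"
          for suffix in ("", "_per_atom", "_per_volume")
          for p in ("40Pa", "1kPa", "4kPa")],
    # (E) energy descriptor
    "E": ["henry_coefficient"],
}


def feature_columns(groups: str) -> List[str]:
    """Return the feature columns for a string of group letters, e.g. "AB"."""
    columns: List[str] = []
    for letter in groups:
        columns.extend(DESCRIPTOR_GROUPS[letter])
    return columns


def table_columns(groups: str) -> List[str]:
    """Skeleton columns followed by the features of the selected groups."""
    return SKELETON + feature_columns(groups)


if __name__ == "__main__":
    for letter, names in DESCRIPTOR_GROUPS.items():
        print(letter, len(names))
    print(table_columns("E"))
```

Keeping the groups in a single mapping makes it straightforward to build feature sets from any combination of groups, for instance `feature_columns("BD")` for the geometric and effective point charge descriptors together.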