Main

Catalytic processes rely on the control of complex competing multistep chemical reaction networks (CRNs)1,2. The core of a CRN consists of a set of species S, linked by a set of reactions R defined by properties P, usually including, but not limited to, their reaction and activation energies3,4. For chemical processes in which the list of elementary steps can be manually generated, density functional theory (DFT) enables the evaluation of reaction profiles. However, this approach reaches its limits as network complexity grows5, at which point it becomes necessary to leverage the CRN’s abstract mathematical foundations6,7,8,9,10,11.

CRN algorithms fall into two categories: CRN potential energy surface generators and rule-based generators. The former require on-the-fly ab initio energy evaluation of the species involved along the reaction paths12,13,14,15,16,17. As such, these algorithms are constrained to chemical spaces (CSs) of the order of 10^2 intermediates. Alternatively, rule-based CRN generators use templates to define the attainable CS via a list of possible transformations18,19, decoupling network generation from energy evaluation. Therefore, they can deal with larger networks than the potential energy surface generators, including multiple paths that could contribute to the catalytic activity and rare events responsible for catalyst deactivation, thereby getting closer to achieving the completeness of the CS. However, the combinatorial explosion of the CS requires strategies to reduce the CRN complexity and/or substitute DFT with fast energy estimators.

The evaluation of the thermodynamic properties of the species in the network can be accelerated by using group additivity (GA) methods20,21,22, whereas linear scaling relationships23,24 and Brønsted–Evans–Polanyi (BEP) equations25,26,27 can be used for characterizing the reaction properties. An example of this approach is the reaction mechanism generator28, which was applied to the CO2 methanation and hydrocarbon (CH4 and C2Hx) oxidation on Pt(111) and Ni(111) surfaces29. During iterative species generation with graph-based approaches, in which species are evaluated with the aforementioned estimators (GA–linear scaling relationship–BEP), a filtering step is applied after every network expansion to reduce the mechanism size. Although the reaction mechanism generator provides general reactivity trends, the accuracy of the energy predictions is low compared with DFT, with a mean absolute error (MAE) of around 0.5 eV (ref. 30), limiting its applicability to mechanisms not involving selectivity issues.

The competition and correlation between the completeness of the CS and the associated computational burden arising from the characterization of CRNs are key to the development of the field. Expanding the boundaries of the investigated CSs requires more robust, accurate and efficient energy regressors, which can be obtained via ML acceleration strategies31,32,33,34. Ulissi et al.19 built a surrogate model with GA fingerprints fed to a Gaussian process and combined it with BEP relationships to obtain the rate-limiting steps for the conversion of syngas to C2 products on Rh catalysts. This model was used within an active learning strategy that tracks the uncertainty to predict the most relevant steps to be refined with DFT.

An alternative approach involves ML interatomic potentials (MLIPs) from the Open Catalyst Project32,35, trained on small CHNO fragments on metals and alloys, for producing initial guesses of the adsorption and transition-state (TS) geometries subsequently relaxed with DFT30,36, reaching an MAE of 0.1 eV. Other strategies target CHO species on metals including TSs via BEP and ML-based regression techniques, achieving MAEs of 0.2 eV (refs. 37,38). These models speed up DFT simulations, but are still too computationally demanding for high-throughput in silico exploration. MLIPs39,40,41,42,43, graph-based approaches33,44 and diffusion models45 reduce the prediction error, but have yet to be merged within CRN algorithms.

Here we present the Catalytic Automated Reaction Evaluator (CARE), an end-to-end framework tailored to investigate performance in heterogeneous catalysis that consists of (1) a catalyst-agnostic rule-based CRN generator for processes involving CHO species, (2) a module interfacing to state-of-the-art ML models for evaluating the thermodynamic and kinetic reaction parameters for thermal and electrochemical processes, mainly powered by GAME-Net-UQ, and (3) a microkinetic solver to predict the reactivity. CARE is benchmarked against experimental activity in methanol decomposition, used to study selectivity in electrochemistry and to generate large networks such as those describing the Fischer–Tropsch synthesis.

Results

CARE

CARE is an end-to-end framework for the generation and manipulation of CRNs in heterogeneous and electrocatalytic processes involving CHO species. The framework consists of three independent modules: a template-based network blueprint generator, an interface to state-of-the-art data-driven estimators for thermodynamic and kinetic reaction parameters, and a microkinetic solver (Fig. 1).

Fig. 1: CARE workflow from network generation to reactivity analysis.

CARE starts from a CS defined by the maximum number of C and O atoms and a molecular template defining how to build the carbon backbone. The bond-breaking template returns the eCS and all potential reactions. Species in the eCS are automatically placed on the surface, and their energy and those of the linking TSs are evaluated with the ML model of choice. The reactivity of the network can be characterized using MKM under realistic reaction conditions. EIS, initial state energy; T, temperature; P, pressure.

Network blueprint generation

The first step for generating the CRN blueprint in CARE is the definition of the CS. The size of the CS is determined by the backbone of the largest molecule, obtained by defining the maximum number of C (NC) and O (NO) atoms for the species, namely, the carbon and oxygen cut-offs. To set up the CS, a pool of alkanes with up to NC carbon atoms is generated. Through a set of molecular templates, alkanes are then oxygenated and saturated molecules with up to NO oxygen atoms are obtained (for example, alcohols, ethers and epoxides). As an alternative to the NC and NO cut-offs, custom CSs can be defined by directly providing a list of SMILES.

Next, a bond-breaking template is applied to the molecules in the CS, simultaneously generating a set of reactions, open-shell fragments and unsaturated closed-shell molecules (alkenes, ketones and acids). This template is based on SMILES manipulations implemented with RDKit. The obtained species plus the CS define the extended CS (eCS).
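
The bond-breaking idea can be illustrated with a minimal sketch in plain Python. It does not reproduce the SMILES- and RDKit-based template used in CARE: the molecule is encoded here as a hypothetical atom list plus bond list, and the function names are illustrative only. Each bond is removed in turn and the resulting fragments are collected as connected components.

```python
from collections import defaultdict

def fragments_after_break(n_atoms, bonds, broken):
    """Connected components (tuples of atom indices) after removing one bond."""
    adj = defaultdict(set)
    for a, b in bonds:
        if (a, b) != broken:
            adj[a].add(b)
            adj[b].add(a)
    seen, components = set(), []
    for start in range(n_atoms):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:  # depth-first traversal of one fragment
            node = stack.pop()
            comp.append(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(tuple(sorted(comp)))
    return components

def enumerate_bond_breaking(atoms, bonds):
    """One candidate reaction per bond: (broken bond, fragment element strings)."""
    reactions = []
    for bond in bonds:
        comps = fragments_after_break(len(atoms), bonds, bond)
        # Element strings ignore connectivity, so constitutional isomers collide.
        formulas = ["".join(sorted(atoms[i] for i in c)) for c in comps]
        reactions.append((bond, formulas))
    return reactions

# Methanol: atom 0 = C, 1 = O, 2-4 = H bound to C, 5 = H bound to O
METHANOL_ATOMS = ["C", "O", "H", "H", "H", "H"]
METHANOL_BONDS = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5)]

for bond, frags in enumerate_bond_breaking(METHANOL_ATOMS, METHANOL_BONDS):
    print(bond, frags)
```

Applying the same enumeration recursively to every new fragment would grow the set of open-shell species; a canonical representation such as canonical SMILES is then needed to merge duplicates and to tell isomers apart, which is what a cheminformatics toolkit provides.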

In contrast to homogeneous catalysis, the number of possible reaction templates in heterogeneous catalysis is relatively small, but the number of events can be very large, as in polymerization/depolymerization processes. Therefore, the main reaction templates in CARE describe X–Y (X, Y = C, H, O) bond-breaking/bond-forming reactions and the adsorption/desorption of closed-shell molecules. Additional available templates account for intramolecular rearrangements, such as the [1,2]-H shift, and for proton-coupled electron transfer (PCET), enabling the modeling of reaction networks under electrochemical conditions. The modularity of the network generator facilitates the inclusion of more reaction classes, for instance, reactivity in solution at the interfaces of electrodes.

The ensemble of the eCS and the set of reactions constitute the reaction network blueprint. It is catalyst agnostic and needs to be generated only once to model the same network on different catalysts. Additional details on the CRN blueprint generation are provided in Supplementary Note 1.

Estimating thermodynamic and kinetic parameters

To estimate the thermodynamic and kinetic properties of the reactions in the network, CARE provides access to many state-of-the-art ML energy estimators, but is mainly powered by GAME-Net-UQ (UQ, uncertainty quantification), a graph neural network targeting the DFT energy of adsorbed species and TSs. Originally, GAME-Net33 was designed as a direct initial-structure-to-relaxed-energy model for closed-shell molecules on transition metals, starting from the simplest possible graph representation, without any spatial coordinate encoded in the nodes or edges.

GAME-Net-UQ includes three crucial upgrades: (1) the extension to open-shell species, (2) the estimation of TS energies and (3) UQ. Including these features keeps the model lightweight (558,000 parameters), accurate and robust, enabling extensive direct energy evaluations at a minimal computational cost.

The graph dataset used to develop GAME-Net-UQ includes 12,303 structures (Supplementary Note 3). These represent the DFT-optimized adsorption of 184 closed-shell molecules from the GAME-Net training dataset, as well as intermediates and TSs included in the C2H6O2 decomposition network (see the ‘DFT’ section). The final graph dataset comprises 80% intermediates and 20% TSs of the surface-bond-breaking events. During training, to distinguish TS graphs from intermediate graphs, a Boolean feature is added to the graph edges to label the bond involved in dissociation (Fig. 2a). Consequently, the graph neural network includes a topology-adaptive convolutional layer46 utilizing the encoded edge attribute.

Fig. 2: GAME-Net-UQ performance.

a, Graph representation of intermediates and TSs, and energy prediction as normal distribution. Etot, energy of the total system; Esurf, energy of the surface slab. b, Parity plot of the predicted versus DFT adsorption energy for the test intermediates (brown) and TS (orange) data. c, MAE of the predictions and associated mean uncertainties σ grouped by the metal and adsorbate size. Error bars represent the 95% confidence interval and ‘(g)’ refers to the gas-phase graphs. RMSE, root mean squared error; MAD, median absolute deviation; Sha, sharpness; cv, coefficient of variation (dimensionless). Statistical metrics refer to the whole test set (n = 2,460), which has been randomly sampled from the original data distribution.


The representation of the surface has been expanded to enable the study of structure-sensitive processes. Surface effects are taken into account by including (1) species adsorbed on the second- and third-most-stable surface facets of the metals in the dataset (Supplementary Fig. 5), (2) the generalized coordination number24 of surface atoms as the node feature and (3) the second-order surface neighbors to distinguish hollow adsorption sites.

As GAME-Net-UQ has been redefined as a mean variance estimator47, it returns predictions as normal distributions \(E\approx {\mathcal{N}}(\mu ,{\sigma }^{2})\), with the standard deviation σ representing the uncertainty (Fig. 2a). Details about graph representation, UQ, model architecture and training are provided in Supplementary Notes 4–6.
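
Mean variance estimators are commonly trained by minimizing the Gaussian negative log-likelihood of the target under the predicted distribution. Whether GAME-Net-UQ uses exactly this loss is an assumption here (the training details are in the Supplementary Notes); the sketch below only shows how predicting both μ and σ lets the loss penalize overconfident errors.

```python
import math

def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood of y under N(mu, sigma^2); energies in eV."""
    var = sigma ** 2
    return 0.5 * (math.log(2.0 * math.pi * var) + (y - mu) ** 2 / var)

def interval_95(mu, sigma):
    """Central 95% interval of the predicted normal distribution."""
    return mu - 1.96 * sigma, mu + 1.96 * sigma

# Same 1 eV error, two claimed uncertainties (illustrative numbers only):
print(gaussian_nll(y=1.0, mu=0.0, sigma=0.1))  # overconfident prediction, large loss
print(gaussian_nll(y=1.0, mu=0.0, sigma=1.0))  # honest uncertainty, small loss
```

With this kind of objective, a large error is cheap only if the model also reports a large σ, which is the behavior summarized by the calibration metrics of the model.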

GAME-Net-UQ yields an MAE of 0.21 eV on a test set of 2,460 data points randomly sampled from the initial dataset, 0.21 eV for intermediates and 0.22 eV for TSs. The outliers lie in the [–4; +4] eV range (Fig. 2b), which is the range most relevant for practical catalytic reactions. These outliers are small fragments adsorbed on Cd and Zn, typically not used as catalysts in their metallic form. The performed ablation study (Supplementary Note 7) shows that labeling the bond involved in the TS reduces the MAE by 0.23 eV, whereas encoding the generalized coordination number in the graph reduces the MAE by 0.04 eV. Predictions with small (big) errors are reflected in small (big) uncertainties, resulting in a model with a 1.3% miscalibration area (Supplementary Fig. 11). Small adsorbates on Cd, Zn and Fe show higher errors, but the uncertainty associated with these systems is positively correlated with the error (Fig. 2c). The sharpness of the model, representing the magnitude of its uncertainty in the test set, is 0.23 eV, whereas the coefficient of variation, which quantifies the dispersion of the uncertainty estimates, is 0.34.
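
The summary metrics quoted above can be computed in a few lines. The definitions used in this sketch are assumptions based on common UQ practice (sharpness as the root mean square of the predicted σ, coefficient of variation as the standard deviation of σ divided by its mean); the exact conventions used for GAME-Net-UQ may differ, and the numbers are made up for illustration.

```python
import math
from statistics import mean, pstdev

def uq_summary(y_true, mu, sigma):
    """Return (MAE, sharpness, coefficient of variation) for Gaussian predictions."""
    mae = mean([abs(t - m) for t, m in zip(y_true, mu)])
    sharpness = math.sqrt(mean([s ** 2 for s in sigma]))  # typical size of sigma
    cv = pstdev(sigma) / mean(sigma)                      # spread of sigma values
    return mae, sharpness, cv

# Hypothetical energies in eV
y_dft = [-1.20, -0.35, 0.80, -2.10]
mu_ml = [-1.05, -0.50, 0.95, -1.70]
sd_ml = [0.15, 0.20, 0.18, 0.45]
print(uq_summary(y_dft, mu_ml, sd_ml))
```

A coefficient of variation close to zero would mean the model reports nearly the same σ for every structure, which makes the uncertainty useless for ranking predictions by reliability.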

During inference, a streamlined version of DockOnSurf48 is used for the generation of adsorption configurations before evaluation. Adsorbate atoms with coordination lower than their valence are defined as anchoring points, whereas surface sites are identified with ACAT49 (Supplementary Note 2). As multiple configurations of the same intermediate are screened within CARE, UQ can be leveraged to select the configurations with the lowest associated uncertainty, or to inform active learning strategies for reevaluating key species at higher accuracy. This is particularly meaningful for intermediates participating in the highest number of elementary reactions11.
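
The two uses of the uncertainty mentioned above can be sketched as follows. The data structure, site labels, energies and threshold are hypothetical and do not correspond to the CARE API.

```python
from typing import NamedTuple

class ConfigPrediction(NamedTuple):
    site: str     # hypothetical adsorption-site label
    mu: float     # predicted energy, eV
    sigma: float  # predicted standard deviation, eV

def most_confident(predictions):
    """Configuration whose energy prediction carries the lowest uncertainty."""
    return min(predictions, key=lambda p: p.sigma)

def needs_refinement(predictions, sigma_max):
    """Configurations uncertain enough to be re-evaluated at higher accuracy."""
    return [p for p in predictions if p.sigma > sigma_max]

configs = [
    ConfigPrediction("top", -0.42, 0.31),
    ConfigPrediction("bridge", -0.55, 0.12),
    ConfigPrediction("hollow", -0.61, 0.27),
]
print(most_confident(configs).site)
print([p.site for p in needs_refinement(configs, sigma_max=0.25)])
```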

GAME-Net-UQ propagates the uncertainty to reaction energy ΔE and activation barrier Eact (Supplementary Note 8). Surface reactions are evaluated in the bond-breaking direction A* + * → B* + C*. During inference, the TS graph is constructed from the reactant graph A*. The broken bond involved in the reaction is tagged to denote the TS (Fig. 2a and Supplementary Note 9). The products B* and C* in the final state (FS) of the reaction are evaluated separately, because the training data contain a single adsorbate on the surface and therefore lack lateral interactions. This might result in GAME-Net-UQ predicting non-physical TS energies ETS < EFS. In this situation, CARE assumes exothermic steps to be barrierless (Eact = 0 eV) and endothermic ones to have Eact = ΔE.
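
A minimal sketch of this logic is given below. It assumes that the predicted energies are independent Gaussians referenced to the clean slab, so that variances add in quadrature and the empty site * cancels out; the actual propagation scheme is described in Supplementary Note 8 and may differ. The class and function names, and the uncertainty assigned in the fallback cases, are illustrative choices.

```python
import math
from typing import NamedTuple

class Gaussian(NamedTuple):
    mu: float     # mean energy, eV
    sigma: float  # standard deviation, eV

def combine(terms):
    """Linear combination sum(c_i * X_i) of independent Gaussian variables."""
    mu = sum(c * g.mu for c, g in terms)
    sigma = math.sqrt(sum((c * g.sigma) ** 2 for c, g in terms))
    return Gaussian(mu, sigma)

def reaction_energetics(e_a, e_b, e_c, e_ts):
    """Return (delta_E, E_act) for A* + * -> B* + C*."""
    delta_e = combine([(1, e_b), (1, e_c), (-1, e_a)])
    e_act = combine([(1, e_ts), (-1, e_a)])
    e_fs = e_b.mu + e_c.mu
    if e_ts.mu < e_fs:  # non-physical: TS predicted below the final state
        if delta_e.mu <= 0.0:
            e_act = Gaussian(0.0, 0.0)  # exothermic step treated as barrierless
        else:
            e_act = delta_e             # endothermic step: E_act = delta_E
    return delta_e, e_act

# Hypothetical predictions (eV)
a, b, c = Gaussian(-1.0, 0.1), Gaussian(-0.6, 0.1), Gaussian(-0.2, 0.1)
print(reaction_energetics(a, b, c, Gaussian(-0.3, 0.2)))  # regular case
print(reaction_energetics(a, b, c, Gaussian(-0.9, 0.2)))  # fallback, E_act = delta_E
```

Because ΔE combines three predictions and Eact only two, the propagated uncertainty of ΔE tends to be the larger of the two when the individual uncertainties are similar.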

Although the initial development of CARE focused on GAME-Net-UQ, interfaces to MLIPs such as FAIRChem potentials32, MACE42, Orb43, PET-MAD50 and SevenNet51 are available. These allow both relaxation and TS search with the nudged elastic band (NEB) method, providing a seamless workflow to fully characterize the CRN energy landscape with any MLIP, surface reaction and catalyst. As of January 2026, users can access 91 MLIPs (Supplementary Table 9). Having the option to evaluate the same network with different evaluators within the same framework is a pivotal feature of CARE, providing flexibility to quantitatively compare different ML strategies in terms of accuracy with respect to DFT and the derived macroscale properties with respect to experimental activity.

CRN analysis via microkinetic modeling

Predicting the catalytic activity from CRNs with hundreds of reactions in CARE is not straightforward, since these networks (1) have undefined directionality at the global level, as this depends on the definition of the reactants, (2) are unconstrained to specific source and target species and (3) show a catalytic activity that depends on the applied operating conditions. Therefore, mean-field microkinetic modeling (MKM) functionalities52 are present in CARE.

MKM relies on the numerical solution of the system of ordinary differential equations (ODEs) defining the mass balance for the species in the CRN. The default model in CARE is a zero-conversion differential plug-flow reactor, which enables the extraction of macroscopic steady-state reaction rates and selectivity directly comparable with experiments, surface coverages and rate-determining steps. The kinetic coefficients of the elementary reactions are evaluated via TS theory for all surface steps except adsorption, for which the Hertz–Knudsen equation53 is used. The coefficients of the reverse reactions are obtained by imposing thermodynamic consistency (Supplementary Note 8). The ODE solvers in CARE automatically terminate the integration when the chemical steady state is reached. This occurs when all elemental balances are respected and the surface coverages sum to unity.
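
The three ingredients of the kinetic coefficients can be sketched with textbook expressions. This is a simplified illustration under stated assumptions, not the CARE implementation: electronic energies are used in place of free energies, the Eyring prefactor kBT/h is used without partition-function ratios, and the site area and sticking coefficient are hypothetical parameters.

```python
import math

KB_EV = 8.617333262e-5   # Boltzmann constant, eV/K
KB_J = 1.380649e-23      # Boltzmann constant, J/K
H_EVS = 4.135667696e-15  # Planck constant, eV s
AMU = 1.66053906660e-27  # atomic mass unit, kg

def k_tst(e_act, temperature):
    """Transition-state-theory rate constant (1/s) for a surface step."""
    return KB_EV * temperature / H_EVS * math.exp(-e_act / (KB_EV * temperature))

def k_reverse(k_forward, delta_e, temperature):
    """Reverse rate constant from thermodynamic consistency, K_eq = k_f / k_r."""
    k_eq = math.exp(-delta_e / (KB_EV * temperature))
    return k_forward / k_eq

def k_hertz_knudsen(pressure_pa, site_area_m2, mass_amu, temperature, sticking=1.0):
    """Adsorption rate constant (1/s) from the Hertz-Knudsen impingement flux."""
    flux_norm = math.sqrt(2.0 * math.pi * mass_amu * AMU * KB_J * temperature)
    return sticking * pressure_pa * site_area_m2 / flux_norm

T = 473.0  # K, temperature of the methanol case study
k_f = k_tst(e_act=0.8, temperature=T)
k_r = k_reverse(k_f, delta_e=-0.3, temperature=T)
k_ads = k_hertz_knudsen(1.0e5, 1.0e-19, 28.0, T)  # CO-like molecule, assumed site area
print(k_f, k_r, k_ads)
```

Deriving the reverse coefficient from the equilibrium constant, rather than from an independently predicted reverse barrier, guarantees that each elementary step satisfies detailed balance regardless of the error in the individual energy predictions.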

Elementary reactions possess characteristic times spanning multiple orders of magnitude, resulting in stiff ODEs requiring specialized solvers. Popular Python libraries such as SciPy54 lead to a prohibitive computational cost for kinetic simulations on huge CRNs and/or stiff ODEs. Therefore, CARE additionally implements a solver based on the DifferentialEquations.jl Julia package55, outperforming SciPy solvers in terms of convergence and simulation time required. Details about code implementation are provided in the ‘MKM’ section.
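
Why stiffness calls for implicit solvers can be seen on a deliberately tiny, hypothetical model: a single coverage θ fed by fast adsorption and depleted by slower desorption and reaction, dθ/dt = k_ads(1 − θ) − (k_des + k_rxn)θ. The sketch below is neither the SciPy nor the DifferentialEquations.jl solver used in CARE; it is a hand-written implicit Euler scheme with a steady-state stopping criterion, written in pure Python.

```python
def integrate_to_steady_state(k_ads, k_des, k_rxn, dt=1.0, rtol=1e-6, max_steps=1000):
    """Implicit Euler for d(theta)/dt = k_ads*(1 - theta) - (k_des + k_rxn)*theta.

    Returns (theta, steps) once gain and loss of coverage balance within rtol.
    """
    k_out = k_des + k_rxn
    theta = 0.0
    for step in range(1, max_steps + 1):
        # Solve theta_new = theta + dt * f(theta_new); linear here, so closed form.
        theta = (theta + dt * k_ads) / (1.0 + dt * (k_ads + k_out))
        gain = k_ads * (1.0 - theta)
        loss = k_out * theta
        if abs(gain - loss) <= rtol * max(gain, loss):  # steady-state check
            return theta, step
    raise RuntimeError("steady state not reached within max_steps")

# Rate constants spanning ten orders of magnitude (1/s), illustrative values
theta_ss, n_steps = integrate_to_steady_state(k_ads=1e8, k_des=1e3, k_rxn=1e-2)
print(theta_ss, n_steps)
```

With these rate constants an explicit Euler step is stable only for dt below roughly 2 × 10^-8 s, whereas the implicit update remains stable with dt = 1 s. Production solvers for stiff systems generalize this idea with adaptive step sizes and Jacobian-based nonlinear solves.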

Application to industrially relevant problems

To highlight the capabilities of CARE, we have selected three industrially relevant applications of increasing complexity: methanol decomposition, propanol and propylene (C3H6) production in electrochemical conditions, and the Fischer–Tropsch synthesis. First, we benchmark the activity against an experimental dataset56. Systematic activity measurements are severely lacking, but CatTestHub presents an excellent example, providing activity data for different catalysts and operating conditions. The second case study illustrates a reaction network in electrochemical environments. Selectivity is key to these processes and typically depends on two competing steps with small energy differences, which is challenging in terms of accuracy, especially for data-driven energy estimators. The validity of CARE in predicting selectivity patterns in electrochemical applications is assessed by analyzing the steps toward C3 products on Cu(100) (ref. 57) in the C3O2 (NC = 3, NO = 2) CRN. Last, the capacity of CARE to tackle large networks has been assessed by analyzing the Fischer–Tropsch synthesis, a reaction that cannot be addressed with DFT given the high number of potential steps involved. For this case study, we generated the C6O1 CRNs on Co, Fe, Ni and Ru, resulting in networks of about 40,000 intermediates and 370,000 elementary reactions, and ran microkinetic simulations on their subnetworks leading to linear products (about 37,000 reactions) to analyze the predicted product distribution.

Methanol decomposition

A benchmark of our framework against experimental activity is established by studying methanol decomposition to CO on different metal catalysts. Methanol is on the rise in sustainable energy applications as a hydrogen vector58,59, and developing efficient catalysts for its controlled decomposition is, therefore, crucial.

This example focuses on a small CS (NC = 1, NO = 2) but covers a high-throughput investigation spanning multiple catalysts and operating conditions. The public experimental database CatTestHub56 contains 96 kinetic experiments for this reaction, covering nine transition metal catalysts as well as multiple supports and operating conditions. As benchmark data, we selected a subset operating with the same nanoparticle size, inert carrier and operating conditions (Supplementary Table 11). We first built the C1O2 CRN including 28 surface species, 10 gas-phase molecules and 62 elementary reactions (Fig. 3a). As metal nanoparticles expose multiple surfaces, we rely on the Crystalium database60 to obtain the nature and fraction of the exposed surfaces obtained through the Wulff construction. The corresponding surface slabs of the eight face-centered cubic metals (Au, Ag, Cu, Ir, Ni, Pt, Pd and Rh) and hexagonal close-packed Ru (Supplementary Table 13) have been generated, and the corresponding networks have been evaluated with GAME-Net-UQ, EquiformerV2-31M35 and MACE-MP-042. Last, kinetic simulations have been run at 473 K, 102.73 kPa and MeOH/N2 = 10/90 (Supplementary Note 10).

Fig. 3: Methanol decomposition.

a, Network visualization. Species with the same label represent constitutional isomers. b, Reaction (ΔE) and activation (Eact) energies obtained with GAME-Net-UQ versus data from ref. 61 obtained using DFT. The highlighted band represents ±0.2 eV around the parity line. n = 176 (22 surface reactions × 4 metals × 2 properties). c, CO formation rates from kinetic simulations on metal surfaces as a function of the predicted CO adsorption energy. d, Parity plot between predicted and experimental values from ref. 56 of the normalized CO rates. Predictions are obtained by aggregating the rates from each hkl surface weighted by its fractional area from ref. 60. Ag is not shown in this panel since it is an outlier. T = 473 K, P = 1 bar, yMeOH = 10%.


Figure 3a provides an overview of a C1O2 CRN. CARE represents CRNs as bipartite graphs with nodes representing species and reactions, with edges e(s, r) embedding the consumption/formation rate of species s due to reaction r. Sequentially evaluating the 63 CRNs with GAME-Net-UQ took 47 min on a GPU. DFT simulation of all species in the networks without TS evaluation would require 286,168 CPU hours and 2,392 kWh of energy consumption (Supplementary Note 12).
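
A bipartite CRN representation of this kind can be sketched with plain dictionaries. The class below is a hypothetical illustration rather than the CARE data structure: each edge (s, r) stores a signed stoichiometric coefficient, and the consumption/formation rate attached to the edge is recovered by multiplying it with the rate of reaction r.

```python
from collections import defaultdict

class BipartiteCRN:
    """Species nodes and reaction nodes linked by weighted edges (s, r)."""

    def __init__(self):
        self.species = set()
        self.reactions = {}  # reaction name -> (reactants, products)
        self.edges = {}      # (species, reaction) -> signed stoichiometric coefficient

    def add_reaction(self, name, reactants, products):
        self.reactions[name] = (tuple(reactants), tuple(products))
        for sign, group in ((-1, reactants), (+1, products)):
            for s in group:
                self.species.add(s)
                self.edges[(s, name)] = self.edges.get((s, name), 0) + sign

    def net_rates(self, reaction_rates):
        """Net formation rate of every species from the per-reaction rates."""
        net = defaultdict(float)
        for (s, r), nu in self.edges.items():
            net[s] += nu * reaction_rates.get(r, 0.0)
        return dict(net)

# Two O-H/C-H scission steps of methanol; the rates are made-up numbers
crn = BipartiteCRN()
crn.add_reaction("r1", ["CH3OH*", "*"], ["CH3O*", "H*"])
crn.add_reaction("r2", ["CH3O*", "*"], ["CH2O*", "H*"])
print(crn.net_rates({"r1": 2.0, "r2": 0.5}))
```

Keeping reactions as nodes rather than as edges between species makes multi-reactant and multi-product steps straightforward to encode, and the summation over edges reproduces the mass balance solved by the microkinetic model.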

Comparing GAME-Net-UQ predictions with the available DFT data61 on a subset of the studied surfaces (Fig. 3b) shows an accuracy in terms of MAE of 0.47 eV for the reaction energy ΔE, and 0.33 eV for the activation energy Eact. The higher prediction error for ΔE arises from error propagation across the three involved species (A* → B* + C*), whereas Eact depends only on the energies of A* and TS. The outliers are associated with C–O-bond-breaking reactions typically occurring at steps62. Compared with EquiformerV2-31M and MACE-MP-0 (Supplementary Fig. 18), GAME-Net-UQ shows the highest error for ΔE but appears to be the most robust when predicting Eact. Although EquiformerV2-31M shows a remarkable accuracy when predicting ΔE (MAE = 0.40 eV), together with MACE-MP-0, it struggles with TSs, returning MAEs above 2 eV for Eact. Even though the CatTSunami framework36 reports excellent performance on predicting Eact for similar reactions with EquiformerV2-31M (MAE = 0.06 eV), this metric has been obtained by using the DFT-relaxed initial and final states for the NEB ML-assisted simulation, plus a single-point DFT run to get the final TS energy. Here GAME-Net-UQ and MLIPs have been used in a fully ML-based scenario. The higher errors of the considered MLIPs appear because their training data are mainly near-equilibrium structures63,64. The automated TS search methodology in CARE (Supplementary Note 9) introduces further error, necessitating future refinements to reduce the divergence of NEB simulations.

DFT studies typically focus on modeling reactive processes on the most stable surface. However, the observed activity in metal nanoparticles is attributed to contributions from multiple exposed surfaces. CARE enables targeting multiple surfaces and predicting their associated activity in a systematic way. GAME-Net-UQ is able to capture a clear volcano trend on the CO formation rate, providing a ranking of the different metal surfaces (Fig. 3c). The predictions highlight Ru as the most active metal, in agreement with experimental evidence56. Rh(310) appears as an outlier, mainly due to the stepped nature of the surface, which is reflected in Eads,CO 1 eV higher than on its other facets (Supplementary Table 13), where CO should adsorb on the step. This less favorable interaction leads to a higher CO formation rate.

Defining the overall CO formation rate as a weighted sum of the rates of the individual surface facets60 (Supplementary Fig. 16) leads to a general agreement with the experimental trend (Fig. 3d). The high deviation for Rh is due to the predicted rate on the (310) surface, which governs the predicted activity of the metal although it contributes only 1% of the surface exposed by the nanoparticle. The offset between the predicted and experimental rates, which primarily originates from the ML error added to that intrinsic to DFT, is a situation often encountered in mean-field kinetics based on atomistic data65,66. Other sources of discrepancy are the lack of lateral interactions and thermal contributions67.
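
The facet-weighted aggregation amounts to a single weighted sum. The sketch below uses made-up area fractions and rates (they are not the values from the Crystalium database or from the simulations) to show how a facet covering 1% of the particle can still dominate the aggregated rate.

```python
def nanoparticle_rate(facet_rates, facet_fractions):
    """Area-weighted sum of per-facet rates; fractions must sum to 1."""
    total = sum(facet_fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"area fractions sum to {total}, expected 1")
    return sum(facet_fractions[hkl] * facet_rates[hkl] for hkl in facet_fractions)

# Hypothetical Wulff-shape fractions and CO formation rates (1/s)
fractions = {"111": 0.70, "100": 0.29, "310": 0.01}
rates = {"111": 1.0e-3, "100": 2.0e-3, "310": 5.0}

overall = nanoparticle_rate(rates, fractions)
share_310 = fractions["310"] * rates["310"] / overall
print(overall, share_310)
```

Since rates depend exponentially on the predicted barriers, an energy error on one minority facet can shift the aggregated rate by orders of magnitude, which is consistent with the sensitivity discussed for Rh.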

Evaluating the CRNs with EquiformerV2-31M and MACE-MP-0 including TS search with NEB on GPU requires 100 times the resources of GAME-Net-UQ (Supplementary Fig. 17). The presence of a few reactions with very high barriers (Supplementary Table 10) and the general overestimation of the CO adsorption energy lead to predicted formation rates many orders of magnitude below the experimental trends, and the experimental activity ranking is not correctly reproduced (Supplementary Figs. 19 and 20).

This case study highlights three important concepts. First, there is room for energy-targeted ML approaches not involving structural relaxation. Second, MLIPs for heterogeneous catalysis require data-sampling strategies enabling a better description of TSs63. Last, additional standardized experimental data are needed to define sound benchmarks for the computational heterogeneous catalysis community.

Electrochemical reduction to C3 products

This example focuses on the electrochemical reduction of CO2 on Cu toward C3 products. This process has been studied using CARE by evaluating a C3O2 CRN on Cu(100) with GAME-Net-UQ. Electrochemical reactions require the consideration of potential and pH contributions, which CARE implements via the computational hydrogen electrode approach68. This defines a set of specific electrochemical reactions between the adsorbed species, charged particles (proton and electron) and the solvent. The CRN has been modeled in neutral (pH 7) and alkaline (pH 13) conditions, at applied potentials of –0.4 and –1.0 VSHE, and at ambient conditions (298 K, 1 bar). The network contains 893 surface species, 93 gas-phase molecules and 7,709 elementary reactions, classified in seven types: PCET; C–O-, C–C-, O–H-, and C–H-bond-breaking/bond-forming reactions; adsorption/desorption and [1,2]-H shifts (Fig. 4a). The CRN evaluation took 5 min on a 24-core CPU. Further characterization details of the C3O2 network are provided in Supplementary Fig. 21.
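
In the computational hydrogen electrode picture, the energy of a reduction step A* + H+ + e− → AH* is shifted linearly by the applied potential and by the pH. The sketch below uses the standard textbook expression; the function name and the reference reaction energy are hypothetical, and the CARE implementation may include further terms.

```python
import math

KB_EV = 8.617333262e-5  # Boltzmann constant, eV/K

def pcet_reaction_energy(delta_e0, u_she, ph, temperature=298.0, n=1):
    """Reaction energy (eV) of a reduction step transferring n (H+ + e-) pairs.

    delta_e0: reaction energy at 0 V vs RHE, with (H+ + e-) referenced to 1/2 H2(g)
    u_she:    applied potential in V vs SHE (numerically eV per electron)
    """
    ph_shift = KB_EV * temperature * math.log(10.0) * ph  # about 0.059 eV per pH unit
    return delta_e0 + n * (u_she + ph_shift)

# Hypothetical step with delta_e0 = 0.5 eV at the conditions used for the network
for ph in (7.0, 13.0):
    for u in (-0.4, -1.0):
        print(ph, u, round(pcet_reaction_energy(0.5, u, ph), 3))
```

On the SHE scale, each additional pH unit raises the energy of a protonation step by about 0.059 eV at 298 K, whereas a more negative potential lowers it by the same amount in eV as the shift in V.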

Fig. 4: Electrochemical C3O2 reaction network on Cu(100).

a, CRN visualization with intermediates (left) and elementary reactions (right) colored by phase and reaction type, respectively. b,c, Reaction paths for the electrocatalytic processes from propionaldehyde to 1-PrOH (b) and allyl alcohol to C3H6 and 1-PrOH (c). The dashed lines represent the energy profiles obtained from ref. 57 with DFT, whereas solid lines show the profiles obtained using GAME-Net-UQ.


The global analysis of the CRN (Supplementary Fig. 22) shows that PCET steps cluster into two distinct groups according to their uncertainty. The cluster with a higher uncertainty corresponds to –OH protonation, leading to water formation and elimination. Shifting toward a more alkaline pH (from pH 7 to pH 13) leads to an increase in the reaction energy for all PCET reactions. This behavior is expected since increasing the pH implies a lower concentration of protons, thermodynamically disfavoring the protonation of the intermediates. As expected, lowering the applied potential (from –0.4 to –1.0 VSHE) leads to a favorable thermodynamic shift of the energies for all PCET reactions. A key problem in CO2 reduction on Cu is fine-tuning its selectivity toward long-chain products, such as 1-propanol (1-PrOH) versus C3H6 production57. Experimentally, for C2 products, ethylene and ethanol are observed, but in the C3 fraction, only 1-PrOH is found and the route to C3H6 is blocked. Following previous experimental and DFT investigations57, the selectivity of the network toward 1-PrOH and C3H6 was analyzed with CARE starting from two key intermediates: allyl alcohol and propionaldehyde. In particular, the obtained trends are in agreement with those obtained from the DFT study, despite the absence of implicit solvation and Bader charge corrections in the energy evaluation workflow, which are instead included in the DFT treatment. This demonstrates the capacity of GAME-Net-UQ to capture essential reactivity trends without accounting for solvent and charge effects. At alkaline pH (Fig. 4b), thermodynamically favorable pathways toward the formation of 1-PrOH are identified. These findings align with DFT57 and highlight the preferential formation of 1-PrOH over C3H6. Some discrepancies appear in the formation of CH3CH2CHOH* species. Although DFT simulations suggest the oxygen hydrogenation from propionaldehyde to CH3CH2CHOH* to be slightly endothermic (ΔE = +0.04 eV), GAME-Net-UQ yields a ΔE value of –0.1 eV.

Starting from allyl alcohol and under reductive potentials U = –1.0 VRHE and pH 7 (Fig. 4c), GAME-Net-UQ still reproduces the DFT trends including solvation and charge corrections. Therefore, CARE can deal with electrochemical systems by integrating a specific template and evaluating the energies taking into account pH and applied potential contributions. In this way, selectivity trends comparable with DFT and experimental observations can be extracted.

Fischer–Tropsch synthesis

CARE enables the generation of huge CRNs such as those representing the Fischer–Tropsch process69, which converts syngas into long-chain hydrocarbons. Such CRNs cannot be evaluated using DFT in reasonable times, nor has the MKM of networks with thousands of reactions ever been performed before in heterogeneous catalysis11. Here we achieve these goals with CARE, using GAME-Net-UQ as the energy evaluator. To this end, a C6O1 (NC = 6, NO = 1) CRN has been generated and evaluated for Co(0001), Fe(110), Ru(0001) and Ni(111). The CRNs include 38,913 surface species, 985 gas-phase molecules and 369,365 elementary reactions. GAME-Net-UQ, trained on fewer than 200 TS structures per metal, is used here to evaluate the TS energy of the surface-bond-forming steps, 99.98% of which are completely new to the model (Supplementary Table 19). Considering that simulating an intermediate adsorbed on ferromagnetic surfaces with DFT required on average 307 CPU hours and about 2.5 kWh on supercomputing facilities (Supplementary Note 12), evaluating the species of this CRN with DFT would have taken 10^6 kWh and 1,360 CPU years, assuming one sampled configuration per adsorbate–surface pair. With CARE, the CRN blueprint generation plus the energy evaluation of its intermediates and TSs took 4 h on a 24-core CPU, resulting in a speed-up of 10^6. Running microkinetic simulations on such large, fully energetically characterized reaction networks (that is, with both ΔE and Eact evaluated for all elementary steps) has, to the best of our knowledge, never been reported in the literature of heterogeneous catalysis. We have succeeded in running kinetic simulations up to the chemical steady state on a subset of the original CRNs comprising the linear products (4,203 species and 36,843 reactions) at 523.15 K, 10 bar and H2/CO = 2. These MKM simulations took between 6 and 50 min to converge, depending on the CRN stiffness. This achievement unlocks the possibility to quantitatively assess the power of ML models when deriving macroscale catalyst properties.
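
The CPU-time estimate for the DFT route follows from the figures quoted above, as the short calculation below shows. It assumes one configuration per surface species and a 365-day year, and covers only the intermediates, not the TSs.

```python
N_SURFACE_SPECIES = 38_913    # surface species in the C6O1 CRN
CPU_HOURS_PER_SPECIES = 307   # average DFT cost per adsorbed intermediate
HOURS_PER_YEAR = 24 * 365

def dft_cpu_years(n_species, cpu_hours_each):
    """CPU years needed to relax every species once with DFT."""
    return n_species * cpu_hours_each / HOURS_PER_YEAR

print(round(dft_cpu_years(N_SURFACE_SPECIES, CPU_HOURS_PER_SPECIES)))
```

The result, about 1,360 CPU years, matches the estimate in the text and would grow further if several adsorption configurations per species or the TSs were included.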

At this stage, our analysis is semiquantitative and aims to reproduce the major trends observed experimentally. The energy trends obtained with GAME-Net-UQ, as well as the distribution of the derived formation rates of the products included in the CRN (Fig. 5), point to a general agreement with experimental evidence69.

Fig. 5: Energy and kinetic trends obtained from the C6O1 reaction network for the Fischer–Tropsch synthesis.

a, Mean reaction energy ΔE obtained using GAME-Net-UQ for the surface reactions grouped by the bond-forming type and metal surface. Error bars correspond to the 95% confidence interval. nC−C = 164,602, nC−H = 143,445, nC−O = 49,177. b, Steady-state formation rates obtained from the kinetic simulations on the C6O1 CRNs containing linear products on Co, Fe, Ni and Ru. c, Associated Anderson–Schulz–Flory distribution of the predicted products (Wn, mass formation rate of Cn species). Operating conditions: T = 523.15 K, P = 10 bar, H2/CO = 2.


Co and Fe are the industrially used catalysts. Our results (Fig. 5a) show that Co prefers the formation of C–C bonds over C–H and C–O, in agreement with its known selectivity for the formation of long-chain products. Fe, on the other hand, tends to produce more CO2 through the undesired water–gas shift reaction, a phenomenon that has been effectively captured by the kinetic results (Fig. 5b). The product distribution shows that Co and Fe are the catalysts producing the largest amounts of long-chain hydrocarbons.

Ni is a catalyst known for its high hydrogenation activity. This behavior is not well captured by GAME-Net-UQ: although C–H-bond-forming reactions appear as the most thermodynamically favorable, the kinetic results (Fig. 5b) show that methane is not among the main products. Ru shows a similar trend; however, its high cost hinders its industrial applicability. The Anderson–Schulz–Flory distributions (Fig. 5c) derived from the kinetic simulations are limited by the fact that the experimental linear trend is observed in the C4–C20 range. The absence of C6+ products creates a boundary condition that leads to an accumulation of C6 species, distorting the expected slope associated with the chain-growth probability.
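The link between the slope and the chain-growth probability α can be illustrated with the standard Anderson–Schulz–Flory expression, W_n = n(1 − α)² α^(n−1), for which ln(W_n/n) is linear in n with slope ln α. The following sketch is not part of CARE. The function names, the value α = 0.8 and the factor used to mimic accumulation at C6 are all assumptions chosen for illustration.

```python
import math


def asf_mass_fractions(alpha: float, n_max: int) -> list[float]:
    """Ideal ASF mass fractions W_n = n (1 - alpha)^2 alpha^(n - 1), n = 1..n_max."""
    return [n * (1 - alpha) ** 2 * alpha ** (n - 1) for n in range(1, n_max + 1)]


def fit_alpha(w: list[float]) -> float:
    """Estimate alpha from the least-squares slope of ln(W_n / n) versus n."""
    ns = list(range(1, len(w) + 1))
    ys = [math.log(wn / n) for n, wn in zip(ns, w)]
    n_mean = sum(ns) / len(ns)
    y_mean = sum(ys) / len(ys)
    slope = sum((n - n_mean) * (y - y_mean) for n, y in zip(ns, ys)) / sum(
        (n - n_mean) ** 2 for n in ns
    )
    return math.exp(slope)


# Synthetic data (assumed alpha = 0.8), truncated at C6 like the network.
ideal = asf_mass_fractions(0.8, 6)
alpha_ideal = fit_alpha(ideal)

# Mimic the boundary effect: products that cannot grow further pile up at C6.
# The factor 3 is arbitrary and only serves to show the direction of the bias.
distorted = ideal[:-1] + [3.0 * ideal[-1]]
alpha_distorted = fit_alpha(distorted)
```

With the ideal data the fit recovers the input α, whereas inflating the C6 point raises the apparent α, which is the kind of slope distortion described above.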

Kinetic simulations enable the ranking of products by their formation rates (Supplementary Table 16). Co is the only metal showing C6 species in the top ten, whereas Fe presents the highest CO2 formation rate. The kinetic simulations enable the investigation of the main steps of the chain-growth mechanism (Supplementary Tables 17 and 18). Although the initial steps of the chain growth (C2 → C3/C4) are mainly governed by the addition of C1 intermediates, the formation of C5−6 species is primarily due to the combination of C2 + C3, C2 + C4 and C3 + C3 fragments. Although these pathways are in contrast with the proposed mechanisms for the Fischer–Tropsch synthesis70, CARE represents a promising tool that enables researchers to unveil the key factors governing this critical process.

In terms of uncertainty, the reaction barriers obtained with GAME-Net-UQ show an associated uncertainty between 0.3 and 0.8 eV (Supplementary Fig. 23). The higher uncertainty in Eact, compared with the values reported for species and TSs, reflects the additive error of individual species and TS energies when deriving the reaction properties. The lowest uncertainty is obtained for C–C couplings with values of around 0.3 eV, whereas hydrogenation steps show an uncertainty distribution of around 0.6 eV for CxHy species, and a second peak at around 0.8 eV for small open-shell and oxygenated intermediates. When evaluating the TS energy ETS of the surface reactions, GAME-Net-UQ predicts valid values (EIS < ETS > EFS) 83% of the time (Supplementary Table 14). The invalid evaluations mainly correspond to endothermic C–H steps, where the encountered situation is ETS < EFS, pointing to an overestimation of the FS energy due to the absence of stabilizing lateral interactions in the treatment.
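The validity criterion EIS < ETS > EFS can be expressed as a small check. The sketch below is a hypothetical helper, not the CARE implementation. The labels and the example energies (in eV) are invented for illustration.

```python
def classify_ts(e_is: float, e_ts: float, e_fs: float) -> str:
    """Classify a predicted TS energy against the initial- and final-state energies."""
    above_is = e_ts > e_is
    above_fs = e_ts > e_fs
    if above_is and above_fs:
        return "valid"
    if above_is:
        return "below_fs"  # E_TS <= E_FS, the case reported for endothermic C-H steps
    if above_fs:
        return "below_is"  # E_TS <= E_IS
    return "below_both"


def valid_fraction(steps: list[tuple[float, float, float]]) -> float:
    """Fraction of elementary steps whose predicted TS energy is valid."""
    labels = [classify_ts(*step) for step in steps]
    return labels.count("valid") / len(labels)


# Invented (E_IS, E_TS, E_FS) triples in eV, for illustration only.
example_steps = [
    (0.0, 0.9, 0.3),    # valid
    (0.0, 0.5, 0.7),    # endothermic step with E_TS below E_FS
    (0.0, 1.2, -0.4),   # valid
    (0.0, -0.1, -0.6),  # E_TS below E_IS
]
labels = [classify_ts(*step) for step in example_steps]
fraction = valid_fraction(example_steps)
```

Separating the two failure modes makes it easy to see which side of the reaction (IS or FS) is likely mispredicted.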

In summary, this case study highlights the capability of CARE to generate and evaluate huge networks cheaply while including the possible bond-breaking reactions that could eventually take place on the catalyst surface. Additionally, it demonstrates the power of the microkinetic module in CARE, unlocking the possibility of running simulations of networks containing thousands of reactions.

Discussion

CARE is robust software for evaluating the activity of heterogeneous catalysts under thermal and electrochemical conditions, consisting of a reaction network generator powered by state-of-the-art ML energy estimators and a reactor microkinetic model. Several potential challenges in the field must still be considered.

For the definition of CRNs, we have approached the network completeness with a set of templates, ensuring the inclusion of all potential bond-breaking reactions that lead to all the available intermediate species, even those that might be initially discarded based on chemical intuition. CARE is modular in nature, and allows the user to implement additional templates for increasing the definition of the explored CS. This systematic approach reduces bias in CRNs, but leads to larger reaction lists that can be a bottleneck when analyzing and manipulating the constructed networks. However, there are three definitive advantages in the methodology. The first is that, due to its modularity, CARE can be easily extended to tackle more exotic reactions, such as isomerizations, cycle formations, and those containing heteroatoms (N, S and P), by adding reaction templates and fine-tuning the ML evaluator of choice accordingly. Second, CARE can consider rare events easily, providing the possibility to study their implication in catalyst deactivation71, an aspect that is overlooked and difficult to treat with DFT simulations but crucial for inferring the long-term stability of the catalyst. The third aspect is the substantial reduction in computational cost obtained by accelerating and automating the evaluation of reaction properties with fast ML models, such as GAME-Net-UQ and MLIPs.

Our data-driven energy estimator is fast and relatively cheap due to its size and direct approach for predicting the DFT energy of relaxed structures. The UQ of the predictions provides additional information that can be propagated to reaction properties and MKM, or exploited for active learning purposes and Bayesian optimization. The low computational cost of GAME-Net-UQ and its balanced accuracy for intermediates and TSs remove the need for empirical rules like linear scaling relationships, since GAME-Net-UQ covers a wider diversity in terms of metals, surfaces and species. GAME-Net-UQ allows the energy estimation of thousands of intermediates/TSs in less than 1 CPU hour, thereby being a viable ML strategy for fast property prediction. Additional challenges in this area are linked to the introduction of lateral interactions (high coverage or solvents), more detailed configurational analysis for adsorbates, and expansion of the domain of applicability to structurally complex materials such as alloys and oxides67. These aspects could be addressed by implementing lateral interactions via graph-based strategies72, active learning and Δ-ML schemes73.

As for the challenges in the microkinetic part, the Julia solver implemented in CARE provides the computational scalability required to handle networks containing thousands of reactions. Future improvements would require strategies to include thermal and entropic contributions and a more detailed representation of electrochemical events. Clever network-pruning strategies are needed, particularly for huge networks, to facilitate the convergence of the associated kinetic simulations, as are rate and selectivity control functionalities, which are essential to unveil important catalytic pathways. Finally, more complex reactor models able to capture relevant phenomena at the nanoparticle scale could be implemented.

Regarding CARE, the results highlight the power of direct energy estimators, which bypass the need to perform the structural optimization of intermediates, to provide rankings for the catalytic performance. These models enable a focus on the overall catalytic reactivity, making substantial advances at 1% of the cost of MLIPs. The capability of CARE to reproduce experimental trends showcases its ability to rank potential materials, and it can be used to predict better formulations.

In summary, CARE paves the way for the automated, robust and accurate exploration and characterization of heterogeneous catalytic processes with improved network completeness and reduced bias. Our work opens a path toward the study of processes involving complex compounds such as plastics and biomass at a reasonable computational cost.

Methods

DFT

The DFT simulations for developing GAME-Net-UQ have been performed with the Vienna ab initio simulation package (v. 5.4.4)74. The Perdew–Burke–Ernzerhof75 functional with reparameterized D2 (ref. 76) dispersion correction for metals was used77. Core electrons were represented by projector augmented-wave pseudopotentials78 and valence electrons were represented by plane waves with a kinetic energy cutoff of 450 eV. Electronic convergence was set to 10⁻⁵ eV and the atomic positions were converged until the residual forces fell below 0.03 eV Å⁻¹.

The database provided in GAME-Net-UQ includes eight face-centered cubic, one body-centered cubic and five hexagonal close-packed metal structures, each on its three most stable hkl surface facets (Supplementary Fig. 2). Metal surfaces were modeled by four-to-ten-layer slabs (Supplementary Table 1), where the uppermost half of the layers was fully relaxed and the bottom ones were fixed to the bulk distances. A surface coverage of 0.02 molecules Å⁻² was defined for all the adsorption structures, a reasonable value to neglect lateral interactions. The vacuum between the slabs was set between 13 and 16 Å, and the dipole correction was applied in the z direction79. The Brillouin zone was sampled by a Γ-centered 3 × 3 × 1 k-point mesh generated through the Monkhorst–Pack method80. TSs have been obtained using the improved dimer method81 and validated using DFT frequency analysis displacing adsorbate atoms by 0.015 Å (Supplementary Fig. 6) following the criterion in ref. 82.

MKM

Once the CRNs are constructed, three components are required for running microkinetic simulations.

(1) The stoichiometric matrix \({\boldsymbol{\nu }}\in {{\mathbb{Z}}}^{{n}_{{\rm{S}}},{n}_{{\rm{R}}}}\) represents the reaction network with nS species (including the surface site ‘*’) and nR elementary reactions. νi,j is the stoichiometric coefficient of species i in reaction j. As elementary reactions are mono- or bimolecular, stoichiometric coefficients lie in the interval [–2;+2]. This matrix is intrinsically sparse, and since most of the reactions in the CRN are of the kind A + B → C + D, its sparsity sν follows the relationship (Supplementary Fig. 12)

$${s}_{\nu }=\left(1-\frac{4}{{n}_{{\rm{S}}}}\right)\times 100{\rm{ \% }}.$$
(1)

Compressed sparse row/column formats and integer data types should be preferred to optimize matrix multiplications for CRNs with more than 400 intermediates (sν = 99%).

(2) The operating conditions such as temperature and pressure (and applied potential and electrolyte pH for electrochemical systems) define the thermodynamic and kinetic constants of the elementary reactions (Supplementary Note 8).

(3) The last building block is the reactor model. The default in CARE is a zero-conversion differential plug-flow reactor defined by the system of ODEs:

$$\left\{\begin{array}{l}\frac{{\rm{d}}{\theta }_{i}}{{\rm{d}}t}=\mathop{\sum }\limits_{j=1}^{{n}_{{\rm{R}}}}{\nu }_{i,\,j}{r}_{j}\,\,\,\,\,\,(\mathrm{adsorbed}\,\mathrm{species})\\ \frac{{\rm{d}}{P}_{i}}{{\rm{d}}t}=0\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,(\mathrm{gas}\,\mathrm{species})\end{array}\right.,$$
(2)

where θi is the fractional surface coverage of species i, νi,j is the stoichiometric coefficient of species i in the reaction j and Pi is the partial pressure of the gas-phase reactant i. rj is the net rate of reaction j, defined as

$${r}_{j}=\vec{{r}_{j}}-\overleftarrow{{r}_{j}}=\vec{{k}_{j}}\mathop{\prod }\limits_{i=1}^{{n}_{{\rm{S}}}}{a}_{i}^{| \min ({\nu }_{i,\,j},0)| }-\overleftarrow{{k}_{j}}\mathop{\prod }\limits_{i=1}^{{n}_{{\rm{S}}}}{a}_{i}^{| \max ({\nu }_{i,\,j},0)| },$$
(3)

where \(\vec{{r}_{j}}\) and \(\overleftarrow{{r}_{j}}\) are the rates of reaction j in the forward and reverse directions, respectively; \(\vec{{k}_{j}}\) and \(\overleftarrow{{k}_{j}}\) are the corresponding kinetic coefficients; and ai refers to the activity of species i, corresponding to θi for adsorbed species and Pi for gas molecules. The initial conditions assume an empty surface exposed to the gas mixture defined by the user:

$$\left\{\begin{array}{cc}{\theta }_{* }(t=0)=1 & \,(\mathrm{active}\,\mathrm{site})\\ {\theta }_{k}(t=0)=0 & \,(\mathrm{surface}\,\mathrm{species})\\ {P}_{i}(t=0)={P}_{i}^{0} & i\in \,\,\mathrm{gas}\,\mathrm{reactants}\\ {P}_{j}(t=0)=0 & j\in \,\,\mathrm{gas}\,\mathrm{products}\end{array}\right..$$
(4)
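Equations (2) to (4) can be illustrated on a toy network. The sketch below is not the CARE solver. The three-species network, the rate constants and the CO pressure are assumptions chosen for illustration. Each reaction is stored as a dictionary of its non-zero stoichiometric coefficients, a simple stand-in for column-wise sparse storage; skipping zero entries does not change equation (3), because any activity raised to the power zero equals one.

```python
from math import prod

# Hypothetical network: gas-phase CO, the free site "*" and adsorbed CO*.
species = ["CO(g)", "*", "CO*"]
is_gas = [True, False, False]

# One dict {species index: stoichiometric coefficient} per elementary reaction.
# Reaction 0: CO(g) + * <-> CO*
reactions = [{0: -1, 1: -1, 2: +1}]
k_fwd = [2.0]  # arbitrary illustrative kinetic coefficients
k_rev = [0.5]


def net_rate(column: dict, activities: list, kf: float, kr: float) -> float:
    """Equation (3): forward minus reverse rate of one elementary reaction."""
    forward = kf * prod(activities[i] ** abs(min(nu, 0)) for i, nu in column.items())
    reverse = kr * prod(activities[i] ** abs(max(nu, 0)) for i, nu in column.items())
    return forward - reverse


def rhs(activities: list) -> list:
    """Equation (2): d(theta_i)/dt for adsorbed species, zero for gas species."""
    dydt = [0.0] * len(species)
    for column, kf, kr in zip(reactions, k_fwd, k_rev):
        rate = net_rate(column, activities, kf, kr)
        for i, nu in column.items():
            if not is_gas[i]:
                dydt[i] += nu * rate
    return dydt


# Equation (4): empty surface (theta_* = 1) exposed to an assumed CO pressure of 1.
y0 = [1.0, 1.0, 0.0]
derivatives = rhs(y0)
```

On the empty surface only the adsorption direction contributes, so the free-site coverage initially decreases at the same rate at which CO* builds up, while the gas pressure stays fixed as required by the zero-conversion reactor model.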

To facilitate the convergence of highly stiff microkinetic simulations and to reduce the total ODE integration time, the analytical Jacobian of the ODE is defined and fallback strategies are additionally available. These include (1) removing surface reactions with energy barriers higher than a specified threshold, (2) considering all the reactions as barrierless (Supplementary Fig. 14) and (3) steering the model toward specified target products by removing species from the CRN that are unlikely to be formed based on user intuition. Steady state is reached when the following two conditions are satisfied:

$$\left\{\begin{array}{cc}\mathop{\sum }\limits_{i=1}^{{n}_{{\rm{S}}}}{\theta }_{i}=1 & \,(\mathrm{adsorbed}\,\mathrm{species})\\ \mathop{\sum }\limits_{i=1}^{\mathrm{reactants}}{R}_{i}{n}_{i,k}=\mathop{\sum }\limits_{i=1}^{\mathrm{products}}{R}_{i}{n}_{i,k} & k=\,{\rm{C}},\,\mathrm{H,}\,{\rm{O}}\end{array}\right.,$$
(5)

where Ri is the consumption/formation rate of the gas reactant/product and ni,k is the number of k atoms in the molecule. Here 64-bit floating-point arithmetic is used by default for the microkinetic simulations. For extremely stiff systems, the user has the option to run the simulation with higher precision when using the Julia solver (for example, BigFloat can be used for arbitrary-precision floating-point arithmetic). CARE uses the backward differentiation formula as the default ODE solver. Two essential ODE solver parameters are the absolute and relative tolerances (atol and rtol, respectively), which control the accuracy of the predicted surface coverages at each integration step. Initial atol and rtol values are conservatively set to 10⁻¹⁵ and 10⁻¹², respectively.
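The two steady-state conditions of equation (5) amount to a site balance and an elemental balance over the gas-phase rates. The following sketch shows one possible check. The function name, the sign convention (negative rates for consumption, positive for formation) and the tolerance are assumptions for illustration and are not taken from CARE.

```python
def is_steady_state(coverages, gas_rates, composition, tol=1e-8):
    """Check the site balance and the C/H/O balances of equation (5).

    coverages   -- fractional coverages of all surface species, free site included
    gas_rates   -- {molecule: rate}, negative = consumed, positive = formed
    composition -- {molecule: {element: number of atoms}}
    tol         -- hypothetical tolerance on both conditions
    """
    if abs(sum(coverages) - 1.0) > tol:
        return False
    for element in ("C", "H", "O"):
        consumed = sum(
            -rate * composition[mol].get(element, 0)
            for mol, rate in gas_rates.items()
            if rate < 0
        )
        formed = sum(
            rate * composition[mol].get(element, 0)
            for mol, rate in gas_rates.items()
            if rate > 0
        )
        if abs(consumed - formed) > tol * max(1.0, consumed):
            return False
    return True


# Illustrative methanation stoichiometry: CO + 3 H2 -> CH4 + H2O (invented rates).
composition = {
    "CO": {"C": 1, "O": 1},
    "H2": {"H": 2},
    "CH4": {"C": 1, "H": 4},
    "H2O": {"H": 2, "O": 1},
}
balanced = {"CO": -1.0, "H2": -3.0, "CH4": 1.0, "H2O": 1.0}
unbalanced = {"CO": -1.0, "H2": -3.0, "CH4": 0.5, "H2O": 1.0}

ok = is_steady_state([0.7, 0.2, 0.1], balanced, composition)
not_ok = is_steady_state([0.7, 0.2, 0.1], unbalanced, composition)
```

A relative tolerance on the elemental balance is used here because formation rates in large networks can span many orders of magnitude.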

Computational tools

CARE is written in Python (v. 3.12), and the CRN algorithm is mainly based on RDKit (v. 2023.09.6), ASE (v. 3.24.0)83 and NetworkX (v. 3.2.1)84. Surfaces from the Materials Project have been obtained through mp-api (v. 0.45.3). GAME-Net-UQ has been developed with PyTorch Geometric (v. 2.5.3)85 and PyTorch (v. 2.4.0)86, and it was trained on an NVIDIA RTX A2000 12-GB GPU with CUDA 12.3. MKM functionalities are implemented with SciPy (v. 1.12.0)54 and DifferentialEquations.jl55 (Julia (v. 1.10.2)). Visualizations have been created with Matplotlib (v. 3.8.0), Seaborn (v. 0.12.2) and Inkscape (v. 1.3.2). All runs with CARE have been performed on a 24-core 12th Gen Intel Core i9-12900K CPU and an NVIDIA RTX A2000 12-GB GPU with CUDA 12.2.