Introduction

Heterogeneous catalysis plays a central role in modern industry, driving advancements in energy conversion and environmental sustainability1,2. The understanding and design of catalytic active sites, which are specific surface regions or atom groups that directly affect molecular adsorption3, are crucial because these sites determine the efficiency, selectivity, and stability of catalytic processes. The intricate microstructural characteristics of active sites, arising from the combination of components, dynamic behavior, and spatial positioning, make the analysis of molecular adsorption states highly challenging, even for the simplest yet most critical quantity, the adsorption energy4,5,6,7. High-throughput density functional theory (DFT) calculations and machine learning (ML) methods, commonly referred to as forward design, have been extensively used to establish structure-property relationships and predict adsorption properties8,9,10. These efforts have deepened our fundamental insights into heterogeneous catalysis, exemplified by the establishment of linear scaling relationships and volcano plots in various catalytic reactions11,12,13,14. Once the optimal adsorption state for a specific catalytic reaction is understood, the focus naturally shifts to identifying which types of active sites can achieve this adsorption state beyond the benchmark catalysts. This process, referred to as inverse design, entails a transition from property to structure15. While the concept of inverse design has been around for many years, it has only recently begun to reveal its full potential with the emergence of deep generative models. The variational autoencoder (VAE) and the generative adversarial network, two prominent generative models, have made significant strides in the fields of molecules and materials, facilitating automated “closed-loop” design processes aimed at targeted performance16,17,18,19,20.
Yet, applications in heterogeneous catalysis remain scarce21.

The primary challenge is to accurately represent catalytic active sites, with two key factors contributing to their uncertainty and complexity: the variations in facets, defects, and size, referred to as the coordination effect22, and the random spatial distribution of different elements, known as the ligand effect23. In a real catalyst, the coordination and ligand effects intertwine to create a complex and diverse distribution of catalytic active sites. An ideal representation would encode both coordination and ligand effects, be compatible with gradient-based optimization, and allow active sites to be reconstructed and decoded from the generative model. Popular representations of catalytic active sites based on cheminformatics or graphs have achieved significant progress but still struggle to capture the effect of distant atoms and the overall three-dimensional structural complexity8,24,25,26,27. Another challenge is that the “black box” nature of deep learning often leaves generative models lacking in interpretability28. In catalyst design, we seek not only to identify effective active sites but also to understand the reasons behind their effectiveness. This raises a fundamental question in heterogeneous catalysis: which physical properties of a catalyst surface dictate the chemisorption strength of adsorbates29,30,31,32,33,34? Although generative models can reverse-engineer property-structure relationships to identify new candidate materials, the use of high-dimensional latent spaces to represent data structures often means that the generation process lacks physical intuition, making it difficult to understand the specific mechanisms behind the model’s output35,36. Extracting hidden patterns from the latent space into interpretable formats can lead to testable theories and hypotheses, further advancing scientific understanding28.

In this work, we developed a topology-based VAE framework (PGH-VAEs) to enable the high-resolution representation of catalytic active sites and interpretable inverse design, using *OH adsorption, a key step in the oxygen reduction reaction (ORR), on IrPdPtRhRu high-entropy alloys (HEAs) as an example. HEAs have recently emerged as a promising platform for fine-tuning catalytic properties due to their extensive variety of active site types, stemming from the high variability in local structural composition and coordination37,38,39,40,41. GLMY homology, proposed by Grigor’yan et al., is a generalized homology theory based on path complexes that extends classical homology to directed and non-symmetric structures42,43,44. It enables the topological analysis of complex systems with directionality or asymmetry, making it particularly useful for capturing subtle structural features and structure sensitivity in crystalline structures. We employ persistent GLMY homology (PGH) to achieve a refined characterization of the three-dimensional spatial features of catalytic active sites. A multi-channel VAE with modules dedicated to encoding and decoding the coordination and ligand features is developed, giving the latent design space of active sites substantial physical meaning. Leveraging a semi-supervised learning framework, we obtained a high-precision VAE model using only around 1100 DFT data points, attaining a mean absolute error (MAE) of 0.045 eV in *OH adsorption energy predictions. We further elucidate how coordination and ligand effects, especially those of distant atoms that are not in direct contact with the adsorbate, shape *OH adsorption states, offering targeted optimization strategies for HEA catalysts through inverse design. This proof-of-concept protocol establishes a solid foundation for interpretable inverse design of catalytic active sites and can be extended to other catalytic processes and systems.
The integration of topology-based descriptors with interpretable property-structure relationships renders the ML “black box” more transparent, bringing us a step closer to on-demand catalyst design rather than relying on traditional trial-and-error approaches.

Results

Workflow overview

The overall workflow of the PGH-VAEs is illustrated in Fig. 1. To characterize the intricate microstructure of catalytic active sites, we developed a topology-based descriptor enriched with chemical information, enabling a unified representation of coordination and ligand effects (Fig. 1a). Subsequently, we generated structures for *OH adsorption on HEAs with uniform elemental ratios across various crystal facets. As HEAs represent one of the most complex catalytic systems, employing them as a model enables an effective simulation of the multifaceted complexities found at active sites in real-world catalysts. Constrained by the substantial computational demands of DFT simulations, we are limited to conducting first-principles calculations on only a small selection of structures. To overcome this, we developed a semi-supervised ML model to enhance our dataset. Specifically, we start with a labeled database of adsorption sites with known energies obtained from DFT calculations. A lightweight and efficient ML model is trained on this dataset and subsequently used to predict the adsorption energies of newly generated structures that form the unlabeled database. By randomly generating unlabeled data and using the ML model trained on DFT data to label these samples—specifically by predicting their adsorption energies—we can effectively augment the dataset used for VAE training (Fig. 1b). This process transforms the original labeled data, the predicted adsorption energies, and the newly generated structures into a complete dataset, which is then used to train the VAE. Subsequently, a multi-channel VAE framework was introduced, with its latent space offering strong interpretability, allowing structure-performance relationships to be understood in terms of coordination and ligand effects. To avoid introducing bias into model evaluation, the unlabeled database was only used during training, and all testing was conducted exclusively on original DFT-calculated data. 
Further, the VAEs can generate novel active site structures tailored to specific adsorption energy criteria, offering insightful guidance for catalyst optimization and advancing the design of high-performance catalytic materials (Fig. 1c).

Fig. 1: Overview of feature extraction, dataset construction, and workflow for energy prediction and interface design.

a Schematic illustration of feature construction. Coordination features are extracted using the PGH method, while ligand features are represented based on element properties. b Schematic of dataset construction using DFT and semi-supervised learning. A GBR model was initially trained on DFT-calculated adsorption energies and subsequently employed to predict adsorption energies for additional simulated active sites, enabling the construction of an expanded, pseudo-labeled dataset for model training. c Framework of PGH-VAEs, including modules for encoding, latent space visualization, latent space sampling, and decoding to generate potential active sites.

Active sites identification and representation

Catalytic active sites typically refer to specific regions or groups of atoms on the catalyst surface that directly influence the molecular adsorption state3. To maximize the diversity of active sites, we sampled on various Miller index surfaces, including (111), (100), (110), (211), and (532) of IrPdPtRhRu HEAs (Fig. 1a). These facets were selected because they are among the most commonly studied surfaces in ORR research45,46,47, representing a diverse set of low-index and high-index surfaces that capture a range of atomic coordination environments commonly observed in transition metal catalysts. IrPdPtRhRu HEAs were selected for their stability, ease of synthesis, and demonstrated potential in ORR applications37,48,49. Following previous studies, the bridge site is taken as the adsorption site for the *OH adsorbate40,50. Figure S1 shows all 13 unique bridge sites on these Miller index surfaces with their designations. The bridge site and the first- and second-nearest neighbors of the bridge atoms are considered to constitute the primary chemical environment (the active site) influencing molecular adsorption51,52.

Active sites are primarily influenced by coordination and ligand effects. Coordination effects refer to the spatial arrangement of atoms within the active site, encompassing structural features such as crystal facets, defects, and corner sites. We introduced PGH as a mathematical tool for capturing nuanced structural variations. In its standard form, the n-th homology group is the quotient \({H}_{n}={\rm{Ker}}\,{\partial }_{n}/{\rm{Im}}\,{\partial }_{n+1}\), where \({\rm{Ker}}\) and \({\rm{Im}}\) represent the kernel and image of the boundary operators \({\partial }_{n}\) in PGH, capturing the cycles and boundaries in the topological structure. We include this formulation to illustrate the mathematical foundation of our topological fingerprinting approach. The detailed mathematical principles of PGH are presented in the Methods section. Here, we briefly introduce its logic and applications, using *OH adsorption on the (211) surface, denoted as (211) summit, as an example (Fig. 2a). First, the active site atoms are represented as a colored point cloud, with paths established from bonding and from differences in element properties (group, period, and atomic radius) between points. Once paths are established, the atomic structure is converted into a path complex. The geometric characteristics of this path complex can be captured across various spatial scales through a process called filtration. As the filtration parameter (distance) increases, more paths become visible, namely those whose lengths remain below the filtration threshold. Insets of Fig. 2a show adjacency matrices for path complexes at different filtration parameters. At the same time, the filtration process generates the distance-based persistent GLMY homology (DPGH) fingerprint. Since each topological invariant (i.e., each Betti number) has its own persistence range, the number of recorded features can vary across different structures, leading to fingerprints of inconsistent dimensions.
To resolve this issue, instead of recording each individual barcode, we discretize the continuous filtration parameter with a fixed step size of 0.1 Å and record the Betti number, i.e., the number of topological features present, at each discrete filtration value. These counts are then plotted as a line chart against the filtration parameter (the blue figure in Fig. 2a, b), and the resulting line chart is represented as a feature vector. Additionally, we set the maximum filtration distance to 4 Å, which ensures that the filtration process is complete for all samples. As a result, the input feature vectors for all structures have a consistent dimensionality of 40. By using this statistical summarization of the persistence diagrams, we ensure that the resulting fingerprint vectors have a uniform length, making them compatible with ML models. In algebraic topology, Betti numbers are topological invariants that describe the number of independent features in different dimensions of a space. For a given dimension n, the Betti number is defined as the rank of the nth homology group Hn, and it is typically denoted by βn. In our work, the Betti numbers have the following interpretations: (1) β0 corresponds to the number of independent components (atoms), representing 0-dimensional features. (2) β1 reflects the number of directed cycles composed of 1-paths (directed edges), capturing 1-dimensional topological structures. (3) β2 denotes the number of voids enclosed by 2-paths, representing 2-dimensional features. This notation offers an interpretable representation of topological characteristics relevant to catalytic active sites across multiple geometric scales. The blue figure, representing the DPGH fingerprint, is the line plot of Betti numbers as a function of the filtration parameter, which captures topological invariants of the catalytic site using GLMY homology.
The horizontal axis denotes the filtration parameter (in Å), and the vertical axis shows the Betti numbers in dimensions 0, 1, and 2. A more detailed analysis of the DPGH fingerprint is presented in Supporting Note 1. This filtration process highlights the structural intricacies and allows us to capture the topological features essential for active sites.
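
The fixed-length summarization described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the persistence barcode of one homology dimension is already available as a list of (birth, death) intervals, and it assumes the grid starts at 0 Å (the text fixes only the 0.1 Å step and the 4 Å maximum). The function name `betti_curve` and the example barcode are hypothetical.

```python
import math

def betti_curve(intervals, step=0.1, max_dist=4.0):
    """Count how many persistence intervals are alive at each filtration value.

    intervals: iterable of (birth, death) pairs in angstroms; death may be math.inf.
    Returns a list of length round(max_dist / step), i.e. 40 for the defaults.
    """
    n_bins = int(round(max_dist / step))
    curve = []
    for k in range(n_bins):
        r = k * step  # assumed grid: 0.0, 0.1, ..., 3.9 angstroms
        alive = sum(1 for birth, death in intervals if birth <= r < death)
        curve.append(alive)
    return curve

# Hypothetical H0 barcode: three components present from r = 0,
# two of which merge into the third at 2.75 angstroms.
h0 = [(0.0, math.inf), (0.0, 2.75), (0.0, 2.75)]
beta0 = betti_curve(h0)
mean_beta0 = sum(beta0) / len(beta0)
```

Whatever the number of bars in the barcode, the output always has 40 entries, which is what makes the fingerprint usable as a fixed-size model input.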

Fig. 2: Workflow for coordination and ligand feature extraction of active sites.

a A schematic workflow for extracting the coordination features of the (211) summit bridge active site using PGH. Active site atoms are initially represented as a colored point cloud, with path directions defined by bonding and electronegativity differences. These paths form a path complex, whose geometric features are analyzed across scales using filtration. As the filtration parameter increases, visible paths expand and are recorded, then converted into vectors for machine learning. The blue bar chart is the DPGH fingerprint, which encapsulates the GLMY homology in dimensions 0 (H0), 1 (H1), and 2 (H2). The vertical axis denotes the value of the Betti number, while the horizontal axis represents the filtration parameter in angstroms (Å). b The workflow for extracting the coordination features of the (211) valley bridge active site. c Coordination numbers of neighboring atoms for the (211) summit and (211) valley bridge sites. The two sites share the same coordination environment, with the bridge atoms in both cases comprising one 9-coordinated atom and one 10-coordinated atom. d Illustration of the process for obtaining the ligand features of active sites. The element properties of atoms (group, period, and atomic radius) are arranged as a vector according to their spatial distance from the adsorbate.

The application of PGH reveals distinctions between catalytic sites with similar coordination numbers but different structural features. Figure 2b presents another *OH adsorption configuration, denoted as (211) valley, on the (211) facet with the same neighbor coordination as the (211) summit in Fig. 2a. A representation based on coordination number misses such differences, since in both sites the bridge atoms comprise one 9-coordinated atom and one 10-coordinated atom (Fig. 2c), whereas the evolution of Betti numbers in Fig. 2b effectively differentiates this structure from that in Fig. 2a. Additional results for the other 11 active sites are presented in Fig. S2, offering a comprehensive analysis of structural features by PGH. Finally, as an approximate treatment, the mean values of β0, β1, and β2 in the DPGH fingerprints are computed across the filtration process. These values are combined into a 1 × 3 vector that serves as the coordination feature of the active site. For the ligand effect, we represent it directly based on the element properties of atoms (group, period, and atomic radius), ordered by their spatial distance from the adsorbate (Fig. 2d). The vector representation of the active site is then constructed by concatenating the coordination effect vector with the ligand effect vector, thereby achieving a dual description of the active site’s topological and chemical properties.
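
The concatenated representation can be illustrated with a short sketch. It assumes the three Betti curves are already computed and that each atom is given as a symbol plus Cartesian coordinates; `active_site_vector` and `ELEMENT_PROPS` are hypothetical names. Group and period are standard periodic-table values, while the radii below are approximate metallic radii used only as placeholders, because the radius scale used in the work is not stated in this section.

```python
import math

# (group, period, atomic radius in angstroms). The radii are illustrative
# placeholders, not values taken from the source.
ELEMENT_PROPS = {
    "Ru": (8, 5, 1.34),
    "Rh": (9, 5, 1.34),
    "Pd": (10, 5, 1.37),
    "Ir": (9, 6, 1.36),
    "Pt": (10, 6, 1.39),
}

def active_site_vector(betti_curves, atoms, adsorbate_xyz):
    """Concatenate the coordination and ligand features of one active site.

    betti_curves: three equal-length lists (beta0, beta1, beta2).
    atoms: list of (symbol, (x, y, z)) for the metal atoms of the site.
    """
    # Coordination feature: mean Betti number per dimension (1 x 3 vector).
    coordination = [sum(c) / len(c) for c in betti_curves]
    # Ligand feature: element properties ordered by distance from the adsorbate.
    ordered = sorted(atoms, key=lambda a: math.dist(a[1], adsorbate_xyz))
    ligand = [float(p) for symbol, _ in ordered for p in ELEMENT_PROPS[symbol]]
    return coordination + ligand

# Toy inputs, for illustration only.
curves = [[3, 3, 1, 1], [0, 1, 1, 0], [0, 0, 0, 0]]
atoms = [("Ru", (3.0, 0.0, 0.0)), ("Pt", (1.0, 0.0, 0.0)), ("Pd", (-1.5, 0.0, 0.0))]
vec = active_site_vector(curves, atoms, (0.0, 0.0, 1.0))
```

With three atoms the vector has 3 + 3 × 3 = 12 entries; the first three carry the topology and the rest carry the chemistry, nearest atom first.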

The *OH adsorption dataset

Although there are no established guidelines on the optimal size of training datasets for deep generative models, empirical evidence suggests that larger datasets generally lead to higher-quality generated data16. Calculating the adsorption energy of *OH relies on DFT, which is highly computationally demanding. To address this, we introduce a semi-supervised learning framework to efficiently construct and expand the dataset. We constructed a labeled dataset of 1159 DFT data points of *OH adsorption, randomly sampled to cover each unique bridge site across the various Miller index surfaces and thus a broad range of coordination environments. Notably, the adsorption energy here refers to the difference between the *OH adsorption energy on HEA active sites and that on Pt(111). It is widely recognized that an adsorption energy within 0.1 eV above that of Pt(111) can facilitate *OH desorption, thereby enhancing ORR performance37,53. Using this dataset, we trained a Gradient Boosted Regression (GBR) model with the combined feature representation, achieving a test set MAE of 0.078 eV for *OH adsorption energy prediction (Table S1). To expand the dataset, an additional 3477 structures were pseudo-labeled using the GBR model trained on PGH-derived features. The simulated data were generated by modifying the atomic species within the averaged atomic Cartesian coordinates derived from the DFT-optimized structures. This approach is justified by the observation that *OH adsorption perturbs surface metal atoms by less than 0.1 Å in the DFT dataset, allowing the use of averaged coordinates to represent new structures.
Notably, although the newly generated structures share the same spatial geometry, the directionality of the connections is altered, enabling the PGH to distinguish between different samples within the same local environment based on their topological features. This expanded our labeled dataset to 4636 points. Although 75% of these labels were not obtained via direct DFT calculations, they enhance VAE model convergence without sacrificing accuracy, as detailed below.
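
The dataset expansion can be sketched as below. This is a hedged illustration rather than the authors' pipeline: `build_training_set` is a hypothetical helper that accepts any regressor exposing `fit` and `predict` (a gradient-boosting regressor from a library such as scikit-learn would fit this interface), and `MeanRegressor` is a trivial stand-in so the sketch runs without dependencies. Fitting the labeling model on the DFT training split only is an assumption made here to keep held-out labels out of the pseudo-labels.

```python
import random

def build_training_set(model, X_dft, y_dft, X_unlabeled, test_fraction=0.3, seed=0):
    """Pseudo-label X_unlabeled and add it to the training split only.

    Returns (X_train, y_train, X_test, y_test); the test split is DFT-only.
    """
    idx = list(range(len(X_dft)))
    random.Random(seed).shuffle(idx)
    n_test = int(round(test_fraction * len(idx)))
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    X_train = [X_dft[i] for i in train_idx]
    y_train = [y_dft[i] for i in train_idx]
    X_test = [X_dft[i] for i in test_idx]
    y_test = [y_dft[i] for i in test_idx]

    # Assumption: the labeling model sees only the DFT training split.
    model.fit(X_train, y_train)
    pseudo_labels = list(model.predict(X_unlabeled))

    return X_train + list(X_unlabeled), y_train + pseudo_labels, X_test, y_test


class MeanRegressor:
    """Trivial stand-in for the GBR model, used only to make the sketch runnable."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


# Toy data: 10 "DFT" samples and 30 unlabeled samples (the same 1:3 ratio as 1159:3477).
X_dft = [[float(i)] for i in range(10)]
y_dft = [0.01 * i for i in range(10)]
X_new = [[float(i)] for i in range(100, 130)]
X_tr, y_tr, X_te, y_te = build_training_set(MeanRegressor(), X_dft, y_dft, X_new)
```

The counts in the text follow the same bookkeeping: 1159 + 3477 = 4636 labeled points, of which 3477/4636 = 75% are pseudo-labels.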

The training of PGH-VAEs

The PGH-VAEs are designed to predict and generate catalytic active sites, with the core innovations in their multi-channel architecture and high-precision adsorption energy predictions, which establish interpretability in the latent space and accuracy in inverse design.

In the training process, the model receives feature vectors representing coordination and ligand characteristics of HEA interfaces. These are separated into two distinct encoders (Fig. 3a), each generating Gaussian distributions for the coordination and ligand variables (mean and variance). Random sampling within these distributions provides coordination and ligand variables, which are then input into corresponding decoders (Fig. 3b). This process constructs a structured latent space where each point encodes specific coordination and ligand information (Fig. 3c). In the Generation section of Fig. 3d, the decoders reconstruct the original feature vectors, while the prediction branch combines coordination and ligand features to forecast adsorption energy. PGH-VAEs are trained by minimizing four losses: ligand feature reconstruction error (Lligand), coordination feature reconstruction error (Lcoordination), property (the adsorption energy) prediction error (Lproperty), and Kullback-Leibler divergence (KL divergence) (LKL), yielding a total loss function:

$${L}_{total}={L}_{ligand}+{L}_{coordination}+{L}_{property}+{L}_{KL}.$$
(1)
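
A minimal, dependency-free sketch of Eq. (1) follows. Several details are assumptions not confirmed by this section: mean squared error is used for the two reconstruction terms and a squared error for the property term, the KL term is the usual closed form for a diagonal Gaussian against a standard normal prior, and the optional weights stand in for the dimensionality-based weighting mentioned in the text. Here `mu` and `logvar` are taken to be the concatenated latent parameters of both channels; all function names are hypothetical.

```python
import math
import random

def sample_latent(mu, logvar, rng=random):
    """Reparameterized draw z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]

def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))

def total_loss(lig_rec, lig_in, coord_rec, coord_in, e_pred, e_true, mu, logvar,
               w_ligand=1.0, w_coord=1.0, w_prop=1.0, w_kl=1.0):
    """Weighted form of Eq. (1); all weights equal to 1 recovers the plain sum."""
    return (w_ligand * mse(lig_rec, lig_in)
            + w_coord * mse(coord_rec, coord_in)
            + w_prop * (e_pred - e_true) ** 2
            + w_kl * kl_to_standard_normal(mu, logvar))
```

For example, with perfect reconstructions and a perfect energy prediction, only the KL term remains: a one-dimensional latent with mean 1 and unit variance contributes 0.5.
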
Fig. 3: PGH-VAEs: encoding, generation, and prediction workflow.

a Encode Part: Two encoders process coordination and ligand feature vectors from the HEA interface, outputting mean and variance parameters as Gaussian variables. Ligand and coordination variables (green dashed lines) are randomly sampled. b Latent Space: Ligand and coordination variables are independently clustered and visualized on a 2D scatter plot, with data point colors representing adsorption energy. c Generation Part: Coordination and ligand variables from new sampling points are decoded by two decoders into feature vectors for new HEA interfaces. d Prediction Part: The framework predicts *OH adsorption energy by combining sampled ligand and coordination variables in the property prediction branch.

Initially, we attempted to use only the 1159 DFT-calculated *OH adsorption energies as the model dataset, with a training-to-test set ratio of 7:3. In training, both the MAEs and RMSEs of the property predictions show a sharp initial decrease (Fig. S4), followed by a gradual decline. In testing, however, errors initially decrease but then rise, indicating overfitting, a common issue in computational materials science due to limited data. To address this, we incorporated the expanded dataset of 4636 points generated through semi-supervised learning. The semi-supervised data were added only to the training set, while the test set remained 30% of the original DFT-calculated data, ensuring robust and reliable test accuracy. PGH-VAEs were retrained with the updated dataset; architecture and parameters are provided in Tables S3 and S4. We observed that during the training of PGH-VAEs, LKL typically has a larger magnitude (~10) compared to Lligand (~1), Lcoordination (~1), and Lproperty (~0.1). This behavior is expected and arises from differences in task complexity and output dimensionality. Specifically, LKL is computed over latent distributions across the batch and all latent dimensions, making it inherently larger. In contrast, the prediction branch optimizes a single scalar property, which is easier to fit and naturally yields a smaller loss. To ensure balanced training, we weighted the ligand and coordination reconstruction losses according to their dimensionality54,55. Figure S5 illustrates the MAE and RMSE trends over epochs in training and testing, showing a sharp initial drop followed by a gradual decline, indicating stable training and effective mitigation of the prior overfitting issue. The final results show that PGH-VAEs achieve an MAE of 0.045 eV for adsorption energy prediction, reducing the error by 50% compared to the coordination number-based neural network model for HEA catalysts (Table S5)40.
These results underscore the exceptional performance of PGH-VAEs and highlight the transformative impact of the semi-supervised learning approach in significantly enhancing model performance.

Demonstration of PGH-VAEs on active sites design and optimization

Imbuing deep generative models with interpretability facilitates the understanding of the physical principles underlying materials design. During the prediction and generation process, PGH-VAEs employ a two-dimensional latent space, where each data point represents an input instance. The coordinates of these points in the latent space are derived from the downscaling of coordination and ligand encoders via principal component analysis (PCA). Both ligand variables and coordination variables explain over 95% of the total variance, ensuring effective dimensionality reduction. The results (Fig. 4a) reveal a significant correlation between the relative adsorption energy of samples and the coordination and ligand variables in the latent space. Along the diagonal from the top left to the bottom right, the adsorption energy transitions from negative to positive, indicating that coordination and ligand effects can be utilized to explain variations in adsorption performance, with their influence exhibiting a coupled nature.
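
The reduction of each channel to a single latent coordinate can be illustrated with a dependency-free sketch of the leading principal component. This is an assumption-laden toy: in practice a library PCA routine would normally be used, and `first_principal_component` is a hypothetical helper that uses power iteration on the covariance matrix. It returns the projected scores together with the fraction of total variance explained, the quantity the text reports as exceeding 95%.

```python
def first_principal_component(rows, n_iter=200):
    """Return (scores, explained_ratio) for the leading principal component.

    rows: list of equal-length numeric sequences (one latent vector per sample).
    Assumes the data are not constant and the start vector is not orthogonal
    to the leading eigenvector.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    v = [1.0 / (j + 1) for j in range(d)]  # deliberately asymmetric start vector
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]

    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return scores, eigenvalue / total_variance

# Toy 3-dimensional latent vectors lying on a line, for illustration only.
latent = [(t, 2.0 * t, -1.0 * t) for t in range(10)]
scores, ratio = first_principal_component(latent)
```

Applying this once to the coordination latent variables and once to the ligand latent variables would give each sample one coordinate per channel, i.e. a point in the two-dimensional map.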

Fig. 4: Demonstration of PGH-VAEs on active sites design and optimization.

a Latent space of adsorption energy plotted against coordination and ligand variables derived via PCA. Data points are color-coded based on their adsorption energy values. b Correlation between numerical representation and mean adsorption energy for 13 unique bridge-site structures. The numerical representation of structures is obtained by linear combination of their Betti numbers (β0, β1, and β2). c Correlation between numerical representation and mean adsorption energy for 15 unique bridge-site elemental combinations. The numerical representation of elemental combinations is derived by linear combination of the element properties of bridge atoms. Standard deviations are shown for each point to visualize the variability in adsorption energy within specific structures or elemental combinations. d Percentage distribution of elemental composition for bridge-site atoms, first-nearest neighbors, and second-nearest neighbors in the latent space associated with optimal *OH adsorption. e Percentage distribution of elemental composition for bridge-site atoms, first-nearest neighbors, and second-nearest neighbors in active sites generated via inverse design. f Illustration of active sites generated through inverse design, highlighting bridge-site atoms and distal Ru atoms. g Density of states (DOS) analysis of Ru-doped Pt(111). Panels from left to right show the d-orbital states, d-band centers, and *OH adsorption energies for Pt adsorption sites without Ru doping, with three Ru atoms doped in the first-nearest neighbors, and with three additional Ru atoms doped in the second-nearest neighbors of the bridge site. h, i Heatmaps of the relationship between PtPdRu alloy composition ratios and catalytic activity on the (111) and (211) facets, respectively. Color codes: Ir: green, Pd: pink, Pt: purple, Rh: blue, Ru: yellow, O: red, H: white. In (g), the bridge atoms that adsorb *OH are highlighted in dark purple.

To further distinguish how the two effects regulate adsorption energy, we analyzed their individual correlations with changes in adsorption energy. For the coordination effect, because PGH is used to describe the active sites, we can directly quantify structural features as specific numerical values by linearly combining the average values of β0, β1, and β2, as detailed in Supporting Note 2. The 13 unique active sites show distinct numerical differences, even among bridge sites on the same Miller index surface, showcasing the strength of topological data analysis in capturing three-dimensional structure sensitivity and reducing it to simple numerical representations (Tables S6 and S7). We mapped the numerical representations of these 13 active sites against their corresponding average adsorption energies, revealing a strong linear correlation that underscores the influence of structure sensitivity on adsorption energy (Fig. 4b). The substantial standard deviations observed for each point reveal the considerable tuning space for adsorption energy by varying atomic species within identical structural environments. Furthermore, the weights of the average Betti numbers reveal that β0 (connected components) and β2 (three-dimensional cavities) dominate in the structural representation, underscoring their pivotal roles in determining adsorption energy, whereas the influence of β1 (ring structures) is comparatively minor. Chemically, both β0 and β2 reflect structural density from complementary perspectives: β0 represents the number of independent atomic components, with higher values indicating increased local atomic density, while β2 captures the presence of three-dimensional cavities. A greater number of such cavities implies a higher density of path combinations (chemical bonds) within the local structure, corresponding to a more compact atomic arrangement and a higher degree of bond saturation, thereby facilitating *OH desorption.
Consequently, the 211-valley bridge site and the 111-bridge site (Fig. S1), both of which have the characteristic close-packed structure of FCC metals, exhibit more positive *OH adsorption energies, favoring ORR performance. The (532) surface displays substantial variations in *OH adsorption energy across its various bridge sites, which are captured by the numerical representations provided by PGH. A comparative analysis with the Generalized Coordination Number (GCN)56,57 descriptor shows that the GCN model achieved a correlation coefficient of 0.43 and an R-squared value of 0.18 (Fig. S8), substantially lower than the VAE-based model’s values of 0.96 and 0.93. This highlights PGH’s sensitivity in characterizing subtle microscopic structural features.
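
The two statistics used in this comparison can be computed as in the following sketch. The data are made up purely for illustration and are not values from this work; `pearson_r` and `linear_fit_r2` are hypothetical helper names.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def linear_fit_r2(x, y):
    """Least-squares line y = a*x + b and its coefficient of determination."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sum((u - mx) ** 2 for u in x)
    b = my - a * mx
    ss_res = sum((v - (a * u + b)) ** 2 for u, v in zip(x, y))
    ss_tot = sum((v - my) ** 2 for v in y)
    return a, b, 1.0 - ss_res / ss_tot

# Made-up descriptor values and mean adsorption energies (eV), for illustration only.
descriptor = [0.10, 0.25, 0.40, 0.55, 0.70]
mean_energy = [-0.30, -0.18, -0.12, 0.02, 0.05]
r = pearson_r(descriptor, mean_energy)
slope, intercept, r2 = linear_fit_r2(descriptor, mean_energy)
```

For a one-variable least-squares fit, R² equals r², which is roughly what the reported pairs show (0.43² ≈ 0.18 and 0.96² ≈ 0.92).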

We applied a similar approach to analyze the ligand effect. The 15 unique atomic combinations at the bridge sites were quantified into specific numerical values based on their linear combination of element properties (Supporting Note 3, Tables S8 and S9). Plotting these numerical representations against the corresponding average adsorption energies also revealed a perfect linear correlation, emphasizing that the bridge-site atoms, being in direct contact with the adsorbate, play another pivotal role in determining adsorption energy (Fig. 4c). Despite identical bridge-site atoms, significant standard deviations in adsorption energy suggest that adjustments to the active site structure or distal atoms can further modulate *OH adsorption. Among the bridge-site combinations, Pt–Pt, Pd–Pt, and Pd–Pd exhibit the most positive adsorption energies, suggesting they are the most favorable active sites. This is intuitive, as both Pt and Pd in their metallic states exhibit excellent ORR performance. Based on these findings from the latent space, active sites on the (111) or (211) facets with Pd and Pt forming the bridge sites are identified as the most advantageous for achieving the optimal *OH adsorption state.

Discovering active sites with targeted properties is the core objective of inverse design. Therefore, we conducted ~240 structural samplings in the lower-right region of the latent space, where the optimal *OH adsorption state is observed. The corresponding coordination and ligand environments were decoded to identify the associated active sites. All these active sites exhibited characteristics of the (111) and (211) facets (Fig. S6), in accordance with the best-performing samples in the latent space and validating the reliability of our generative approach. Furthermore, we analyzed the elemental composition of these structures by averaging across all samples (Fig. 4e). Data points in the latent space with adsorption energies greater than that on Pt(111) were also plotted for comparison (Fig. 4d). Both the inversely generated data and the latent space data consistently exhibited a preference for Pt or Pd as the bridge-site atoms. Interestingly, in the inversely generated data, the proportion of Ru atoms in the second and third nearest neighbors is notably high compared to other elements (Fig. 4e, f). This suggests that incorporating Ru within the second and third nearest neighbors of the bridge site favors achieving the optimal *OH adsorption state. Such effects may stem from Ru’s higher electron affinity relative to Ir and Rh, which can strongly influence the electronic structure of neighboring Pt and Pd atoms, thereby modulating the *OH adsorption state. To investigate this, we constructed simplified models of Pt(111) surfaces doped with Ru at various positions. The results (Fig. 4g) reveal that as the Ru content increases in the first- and second-nearest neighbors of the bridge site, the d-band center of the Pt–Pt site directly interacting with *OH shifts significantly downward, facilitating *OH desorption at this site. Incorporating Ir or Rh while maintaining the same configuration results in significantly smaller effects on the d-band center and adsorption energy (Fig. S7). These findings demonstrate that distal atoms, even without direct adsorbate interaction, can modulate adsorption energies via local coordination and electronic effects. This may be attributed to the contribution of void structures (β2) (Fig. 4b and Table S7), as Ru in the second-nearest neighbors affects the bridge site via remote spatial interaction. This insight not only reinforces the robustness of our generative model but also provides a mechanistic basis for tailoring active sites in HEAs.

Based on the results of inverse design, an optimization strategy for IrPdPtRhRu HEAs can be proposed. On one hand, the surface should predominantly expose Pt and Pd sites as the direct *OH adsorption centers; on the other hand, incorporating Ru in the neighboring positions enhances the activity more than incorporating Ir or Rh. To quantify this strategy and provide guidance for experimental synthesis, we systematically explored the compositional ratios of PtPdRu ternary alloys. Alloy compositions were varied from 0.1 to 0.9, yielding 36 distinct concentration gradients. For each gradient, 1000 potential active sites were enumerated, and their *OH adsorption energies were predicted using the VAE’s prediction branch. The relative activity (relative to the catalytic performance of the pure Pt(111) surface) of each site was calculated via the Arrhenius equation, and the cumulative activity for each specific composition was determined (see Methods). Our findings (Fig. 4h, i) reveal that the highest activity is achieved with a Pt:Pd:Ru ratio of 0.7:0.2:0.1 on the (111) facet and 3:3:4 on the (211) facet, with activities 6646 times and 11 times greater than those of the PtPdIrRuRh (111) and (211) facets, respectively. Experimentally, such tailored alloy compositions can be synthesized through advanced techniques, including vapor phase spark discharge, sputtering, and acute chemical reduction, among other approaches38.
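The count of 36 compositions is consistent with a simple grid enumeration. As a minimal sketch, assuming a step of 0.1 and that each of the three fractions is at least 0.1 (both are assumptions, since the grid construction is not spelled out here), the Pt:Pd:Ru grid can be enumerated as follows. The function name is illustrative, and integer tenths are used so that the sum-to-one condition is exact.

```python
from itertools import product

def ternary_grid(steps=10, minimum=1):
    """Enumerate (Pt, Pd, Ru) fractions on a grid of spacing 1/steps.

    Assumption: every fraction is a multiple of 1/steps and is at least
    minimum/steps. Integers are used so the sum-to-one test is exact.
    """
    grid = []
    for n_pt, n_pd in product(range(minimum, steps), repeat=2):
        n_ru = steps - n_pt - n_pd
        if n_ru >= minimum:
            grid.append((n_pt / steps, n_pd / steps, n_ru / steps))
    return grid

compositions = ternary_grid()
print(len(compositions))  # 36
```

Under these assumptions the largest single fraction is 0.8, and the reported optimum for the (111) facet, (0.7, 0.2, 0.1), is one of the grid points.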

Discussion

GLMY topology leverages the relationships between vertices (atoms) and paths (chemical bonds) to represent structures, making it naturally suited for the description of atomic configurations. It retains critical three-dimensional spatial information across multiple scales, overcoming the dimensionality reduction issues often associated with graph and cheminformatics-based representations. Traditional graph-based descriptors, built on vertices and edges, are limited by their inability to capture directionality and high-dimensional interactions, making them insufficient to distinguish topologically distinct environments with similar coordination numbers. Furthermore, GLMY topology translates structural features into quantitative values by extracting topological invariants, such as Betti numbers. These invariants, which capture attributes like connectivity, loops, and voids, provide a clear numerical representation of different structures, effectively quantifying structural differences and laying a foundation for the interpretability of deep generative models. The GLMY topology is inherently adaptable to systems with dissimilar atomic radii, as both vertex weighting and interatomic distances are naturally incorporated into the path complex construction. While the atomic radii of elements studied in this work are relatively similar, the proposed method remains applicable to more diverse systems, where variations in atomic size could play a more pronounced role in shaping topological features. Additionally, when combined with a multi-channel encoding structure, the VAE model organizes the latent space into a two-dimensional landscape of structural and chemical information. This arrangement visualizes regions with optimal *OH adsorption properties, guiding further exploration of catalytic active sites through targeted sampling in the latent space. 
Unlike traditional forward-learning methods that require extensive screening over predefined latent spaces, this inverse design approach selectively focuses on areas of interest, strengthening the insights derived from existing data and revealing new features that can drive catalyst optimization. We employed IrPdPtRhRu HEAs as a case example to demonstrate the framework’s capability. HEAs capture the multi-element interactions, diverse local structures, and dynamic coordination environments commonly seen in real-world catalysts, making them an ideal system for testing the framework’s generalizability and multi-scale structural characterization abilities. Notably, the PGH-VAEs are not limited to this specific elemental combination. Leveraging the multi-scale insights of GLMY topology and the flexibility of its multi-channel VAE, this framework is highly adaptable, with broad potential for application across catalytic systems.

The PGH framework demonstrates significant data and computational efficiency advantages over traditional machine learning potentials (MLPs). While MLP-based approaches often require training on thousands to tens of thousands of DFT-relaxed configurations—as exemplified by recent work on HEA catalysts using over 135,000 adsorption sites58—our method achieves accurate structure-property characterization with only ~1000 descriptors and without iterative optimization steps. Notably, a compelling vision for future development of PGH-VAEs involves integrating pretrained MLPs that capture structure-energy relationships over a broader elemental landscape. Pretrained MLPs, such as DPA-1 and DeePMD-kit, are typically trained on millions of atomic configurations spanning dozens of elemental species. These models offer broad coverage of atomic interactions, although their predictive accuracy for specific systems often remains limited without further fine-tuning. When extending the VAE framework to a broader chemical space, it is essential to construct an additional representative and chemically diverse training set, including new elements. To streamline this process, pretrained MLPs can be leveraged to perform large-scale pre-optimization of newly sampled structures. From these, structurally informative configurations can be selectively refined using high-accuracy DFT calculations. This hybrid strategy not only preserves the precision of DFT where it matters most but also substantially reduces the computational burden associated with dataset generation, thereby enabling broader generalization of the model across chemical spaces without compromising predictive fidelity. The feasibility of this approach has been demonstrated by prior studies, where similar strategies have been applied to efficiently search for low-energy adsorption configurations of adsorbates on diverse surface types59. 
The most challenging aspect of this workflow may lie in identifying representative and meaningful structures from the pre-optimized configurations. Low-energy states, high-uncertainty points, and geometries that are difficult to converge are potential candidates of interest; however, a systematic and reliable selection strategy remains to be fully developed. Such advancement would enhance the model’s design capabilities, paving the way for the rational creation and evaluation of complex, multi-element catalytic materials and ushering in a new paradigm in catalyst design.

Methods

Path complexes and their homology

PGH, an extension of GLMY homology theory, offers a method to quantify the persistence of homology within path complexes across various scales42,43,44.

GLMY homology is a generalized homology theory developed to extend classical topological tools to settings where directionality and asymmetry play essential roles. In classical homology theory, the fundamental building blocks of the studied space are simplices—vertices, edges, triangles, and higher-dimensional analogs—assembled into simplicial complexes, as shown in Fig. 5a. The goal is to analyze the connectivity and higher-dimensional structures within such complexes, often built from undirected graphs or point cloud data.

Fig. 5: Building blocks and topological analysis of a cubic system using simplicial and path complexes.

a Basic building blocks of a simplicial complex. b Basic building blocks of a path complex. c Graph representation of a cubic system with distinct vertices. d Path complex representation of a cubic system with distinct vertices. e Zero-, one-, and two-dimensional Betti numbers of the cubic system with distinct vertices under different connection schemes.

In contrast, as shown in Fig. 5b, GLMY homology is defined on path complexes, where the basic units are directed paths rather than simplices. This framework allows the encoding of directional relationships between vertices, which are especially important in systems where vertex identity matters, such as in crystals or molecules with heterogeneous atomic types.

As shown in Fig. 5c, d, while homology captures the topological features of undirected structures, GLMY homology provides a richer representation by accounting for ordered, vertex-sensitive connections. As a result, it is particularly advantageous for studying crystalline and molecular systems, where the asymmetry and directional nature of local configurations carry essential chemical or physical meaning. By preserving more information about connectivity and composition, GLMY homology offers a more expressive and comprehensive topological descriptor in such contexts.

In algebraic topology, Betti numbers are a sequence of integers that describe the number of independent k-dimensional holes in a topological space. Specifically, the kth Betti number βk is the rank of the kth homology group, which counts the number of k-dimensional cycles that are not boundaries of (k + 1)-dimensional objects. For example, β0 counts connected components (0-dimensional holes), β1 counts independent loops or cycles (1-dimensional holes), and β2 counts voids enclosed by surfaces (2-dimensional holes). As shown in Fig. 5e, the Betti numbers in each dimension of a cube vary with changes in connectivity, capturing the topological features of the path complex. When there are two holes and two disconnected components, β0 = 2 and β1 = 2. When a cavity forms, β1 = 2 and β2 = 1.
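To make the counting concrete, the following minimal sketch computes β0 and β1 for an undirected graph viewed as a one-dimensional complex, for which β1 = |E| − |V| + β0. This is the classical (simplicial) case rather than the path-complex construction used in this work, and the helper name and example graphs are illustrative assumptions.

```python
def graph_betti(num_vertices, edges):
    """Return (beta0, beta1) of an undirected graph with no filled faces."""
    parent = list(range(num_vertices))

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = num_vertices
    for u, w in edges:
        root_u, root_w = find(u), find(w)
        if root_u != root_w:
            parent[root_u] = root_w
            components -= 1
    beta0 = components
    beta1 = len(edges) - num_vertices + beta0  # from the Euler characteristic
    return beta0, beta1

# Two disjoint squares: two components and two loops.
two_squares = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 5), (5, 6), (6, 7), (7, 4)]
print(graph_betti(8, two_squares))  # (2, 2)

# Edge skeleton of a cube: one component and five independent loops.
cube = two_squares + [(0, 4), (1, 5), (2, 6), (3, 7)]
print(graph_betti(8, cube))  # (1, 5)
```

The first example reproduces the β0 = 2, β1 = 2 case described above; β2 cannot be obtained from a graph alone because it requires two-dimensional cells.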

Let V be a nonempty finite set. For a given integer p ≥ 0, an elementary p-path on V is a sequence i0i1i2⋯ip of elements in V. Let \({e}_{{i}_{0}{i}_{1}{i}_{2}\cdots {i}_{p}}\) be the generator corresponding to the elementary p-path i0i1i2⋯ip; then a \({\mathbb{K}}\)-linear space is generated by all the elementary p-paths, which is denoted as Λp = Λp(V). Specifically, we make the convention that Λ−1 = 0. An element ν in Λp can be uniquely written as

$$\nu =\sum _{{i}_{0},{i}_{1},{i}_{2},\cdots \,,{i}_{p}\in V}{a}^{{i}_{0}{i}_{1}{i}_{2}...{i}_{p}}{e}_{{i}_{0}{i}_{1}{i}_{2}...{i}_{p}},\quad {a}^{{i}_{0}{i}_{1}{i}_{2}...{i}_{p}}\in {\mathbb{K}}.$$
(2)

For any integer p ≥ 0, a \({\mathbb{K}}\)-linear map ∂p: Λp → Λp−1 is defined on the generator \({e}_{{i}_{0}{i}_{1}{i}_{2}\cdots {i}_{p}}\) as

$${\partial }_{p}{e}_{{i}_{0}{i}_{1}...{i}_{p}}=\mathop{\sum }\limits_{k=0}^{p}{(-1)}^{k}{e}_{{i}_{0}...\hat{{i}_{k}}...{i}_{p}},\quad p \,>\, 0,$$
(3)

and \({\partial }_{0}{e}_{{i}_{0}}=0\) for p = 0, where \(\hat{{i}_{k}}\) indicates omission of the index ik. It can be verified that ∂p ∘ ∂p+1 = 0. Thus \(\partial ={({\partial }_{p})}_{p}\) is a boundary operator on \({({\Lambda }_{p})}_{p}\).
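As an illustrative check (not part of the original workflow), the boundary operator of Eq. (3) can be written in a few lines. Chains are stored here as dictionaries mapping vertex tuples to coefficients, a representation chosen purely for convenience.

```python
from collections import defaultdict

def boundary(chain):
    """Apply the boundary operator of Eq. (3) to a chain.

    chain: dict mapping an elementary path (tuple of vertices) to its coefficient.
    """
    result = defaultdict(int)
    for path, coeff in chain.items():
        if len(path) == 1:
            continue  # the boundary of a 0-path is zero
        for k in range(len(path)):
            face = path[:k] + path[k + 1:]  # omit the k-th vertex
            result[face] += (-1) ** k * coeff
    # Drop terms whose coefficients cancelled.
    return {path: c for path, c in result.items() if c != 0}

e_abc = {("a", "b", "c"): 1}
print(boundary(e_abc))            # {('b', 'c'): 1, ('a', 'c'): -1, ('a', 'b'): 1}
print(boundary(boundary(e_abc)))  # {}
```

Applying the operator twice returns the empty chain, which is the statement that the composition of two consecutive boundary maps vanishes.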

A path complex on V is a nonempty collection \({\mathcal{P}}\) of elementary paths on V, and it satisfies that \({i}_{0}{i}_{1}{i}_{2}\cdots {i}_{p}\in {\mathcal{P}}\) implies \({i}_{1}{i}_{2}\cdots {i}_{p-1}{i}_{p},{i}_{0}{i}_{1}{i}_{2}\cdots {i}_{p-1}\in {\mathcal{P}}\).

A digraph \(G=\left(V,E\right)\) consists of a set V of vertices and a subset E ⊆ V × V of ordered pairs (v, w) of vertices called arrows. The arrow (v, w) is denoted v → w. The collection {i0i1i2⋯ip | ik → ik+1 for all 0 ≤ k ≤ p−1, p ≥ 0} of paths on G is a path complex, denoted by \({\mathcal{P}}(G)\). The p-paths in \({\mathcal{P}}\) are called allowed p-paths, and the \({\mathbb{K}}\)-linear space spanned by the allowed p-paths is denoted as

$${{\mathcal{A}}}_{p}={{\mathcal{A}}}_{p}({\mathcal{P}})=\left\{\sum _{{i}_{0},{i}_{1},...,{i}_{p}\in V}{a}^{{i}_{0}{i}_{1}...{i}_{p}}{e}_{{i}_{0}{i}_{1}...{i}_{p}}| {i}_{0}{i}_{1}...{i}_{p}\in {\mathcal{P}},{a}^{{i}_{0}{i}_{1}...{i}_{p}}\in {\mathbb{K}}\right\}.$$
(4)

Here, as convention, let \({{\mathcal{A}}}_{-1}=0\) be the null space. The space of ∂-invariant p-paths can be deduced by

$${\Omega }_{-1}=0,\quad {\Omega }_{p}={\Omega }_{p}({\mathcal{P}})=\{x\in {{\mathcal{A}}}_{p}| \partial x\in {{\mathcal{A}}}_{p-1}\},\quad p\ge 0.$$
(5)

Then \({\partial }_{p}{| }_{{\Omega }_{p}}:{\Omega }_{p}\to {\Omega }_{p-1}\) satisfies \({\partial }_{p}{| }_{{\Omega }_{p}}{\circ} {\partial }_{p+1}{| }_{{\Omega }_{p+1}}=0\) and \({({\Omega }_{p})}_{p}\) with the boundary operator \(\partial | ={({\partial }_{p}{| }_{{\Omega }_{p}})}_{p}\) is a sub-chain complex of \({({\Lambda }_{p}(V))}_{p}\). The GLMY homology of a path complex \({\mathcal{P}}\) is defined by

$${H}_{p}({\mathcal{P}};{\mathbb{K}}):= \frac{ker{\partial }_{p}{| }_{{\Omega }_{p}}}{im{\partial }_{p+1}{| }_{{\Omega }_{p+1}}},\quad p\ge 0.$$
(6)

The GLMY homology of a digraph G is that of the path complex \({\mathcal{P}}(G)\). The pth Betti number of the digraph G is the rank of the homology \({H}_{p}(G;{\mathbb{K}})={H}_{p}({\mathcal{P}}(G);{\mathbb{K}})\), denoted as βp(G).
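The definitions in Eqs. (5) and (6) can be turned into a small numerical sketch. It assumes \({\mathbb{K}}={\mathbb{R}}\), digraphs without self-loops, and sizes small enough for dense linear algebra; the function names are illustrative and this is not the implementation used in this work. Writing D_p for the matrix of ∂ on the allowed p-paths and N_p for the rows of D_p that correspond to non-allowed faces, one has dim ker(∂p|Ωp) = dim A_p − rank D_p and rank(∂p|Ωp) = rank D_p − rank N_p, so only matrix ranks are required.

```python
import numpy as np

def allowed_paths(vertices, arrows, p):
    """List the allowed p-paths: consecutive vertices are joined by arrows."""
    paths = [(v,) for v in vertices]
    for _ in range(p):
        paths = [path + (w,) for path in paths
                 for w in vertices if (path[-1], w) in arrows]
    return paths

def boundary_ranks(vertices, arrows, p):
    """Return (rank D_p, rank N_p) for the boundary map on allowed p-paths.

    D_p maps allowed p-paths into all (p-1)-paths; N_p keeps only the rows
    of D_p that belong to non-allowed faces.
    """
    paths = allowed_paths(vertices, arrows, p)
    if p == 0 or not paths:
        return 0, 0
    allowed_faces = set(allowed_paths(vertices, arrows, p - 1))
    row_of, entries = {}, []
    for col, path in enumerate(paths):
        for k in range(p + 1):
            face = path[:k] + path[k + 1:]
            row = row_of.setdefault(face, len(row_of))
            entries.append((row, col, (-1) ** k))
    D = np.zeros((len(row_of), len(paths)))
    for row, col, sign in entries:
        D[row, col] += sign
    bad_rows = [r for face, r in row_of.items() if face not in allowed_faces]
    rank_D = int(np.linalg.matrix_rank(D))
    rank_N = int(np.linalg.matrix_rank(D[bad_rows])) if bad_rows else 0
    return rank_D, rank_N

def path_betti(vertices, arrows, p):
    """p-th Betti number of the GLMY homology of a digraph over the reals."""
    n_allowed = len(allowed_paths(vertices, arrows, p))
    rank_D, _ = boundary_ranks(vertices, arrows, p)
    rank_D_next, rank_N_next = boundary_ranks(vertices, arrows, p + 1)
    return n_allowed - rank_D - (rank_D_next - rank_N_next)

V = ["a", "b", "c"]
cycle = {("a", "b"), ("b", "c"), ("c", "a")}       # directed 3-cycle
transitive = {("a", "b"), ("b", "c"), ("a", "c")}  # a -> c closes a -> b -> c
print([path_betti(V, cycle, p) for p in (0, 1)])       # [1, 1]
print([path_betti(V, transitive, p) for p in (0, 1)])  # [1, 0]
```

The two digraphs share the same underlying undirected triangle, yet only the directed cycle carries a one-dimensional class. This is a small instance of the direction sensitivity that distinguishes path homology from the homology of undirected graphs.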

Let (S, ≤) be an ordered set, which can be regarded as a category with the elements of S as objects and the order relations a ≤ b as morphisms. A filtration of path complexes is a covariant functor \({\mathcal{F}}:(S,\le )\to {\bf{Path}}\) from the category (S, ≤) to the category of path complexes. For each element a ∈ S, \({{\mathcal{F}}}_{a}\) is a path complex. Let \({f}_{a,b}:{{\mathcal{F}}}_{a}\to {{\mathcal{F}}}_{b}\) be the morphism induced by a ≤ b; then fb,c ∘ fa,b = fa,c for a ≤ b ≤ c. The morphism fa,b induces a morphism of GLMY homology

$${\tilde{f}}_{a,b}:{H}_{p}({{\mathcal{F}}}_{a};{\mathbb{K}})\to {H}_{p}({{\mathcal{F}}}_{b};{\mathbb{K}}).$$

The pth (a, b)-persistent GLMY homology of \({\mathcal{F}}\) is defined by

$${H}_{p}^{ab}({\mathcal{F}};{\mathbb{K}})=im({H}_{p}({{\mathcal{F}}}_{a};{\mathbb{K}})\to {H}_{p}({{\mathcal{F}}}_{b};{\mathbb{K}})),\quad p\ge 0.$$
(7)

The (a, b)-persistent Betti number is defined as the rank of \({H}_{p}^{ab}({\mathcal{F}};{\mathbb{K}})\).

In practice, the path complex is usually defined on digraphs. Let Digraph be the category of digraphs and digraph maps. A filtration of digraphs is a covariant functor \({\mathcal{D}}:(S,\le )\to {\bf{Digraph}}\) from the category (S, ≤) to the category Digraph. A filtration of digraphs induces a filtration of path complexes, which results in the PGH of digraphs. Different filtrations can result in different persistence.

Let G = (V, E) be a digraph whose vertex set V represents data points in a metric space (X, ‖·‖). Then, there is a weight function \(d:E\to {\mathbb{R}}\) on the edge set E given by

$$d({x}_{1},{x}_{2})=| | {x}_{1}-{x}_{2}| | ,\quad ({x}_{1},{x}_{2})\in E\subseteq X\times X.$$
(8)

Here, this work specifies the metric space (X, ‖·‖) as the Euclidean space with the L2-norm. Then, let Et = {(x, y) ∈ E | d(x, y) ≤ t} and \({{\mathcal{G}}}_{t}=(V,{E}_{t})\). It follows that \({\mathcal{G}}:({\mathbb{R}},\le )\to {\bf{Digraph}}\), \(t\mapsto {{\mathcal{G}}}_{t}\) is a filtration of digraphs, which leads to a persistence diagram \({\mathcal{D}}({\mathcal{G}})\) of G.
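The distance-based filtration can be sketched as follows. The coordinates and arrows are made-up inputs, the function name is hypothetical, and the rule that assigns arrow directions is taken as given because it lies outside this sketch.

```python
import numpy as np

def digraph_filtration(coords, arrows, thresholds):
    """Return {t: E_t}, where E_t keeps the arrows whose length is at most t.

    coords: dict mapping each vertex to its Cartesian coordinates.
    arrows: iterable of ordered vertex pairs (the edge set E).
    """
    lengths = {
        (u, w): float(np.linalg.norm(np.asarray(coords[u]) - np.asarray(coords[w])))
        for u, w in arrows
    }
    return {t: {e for e, d in lengths.items() if d <= t} for t in thresholds}

# Hypothetical example: three collinear points with arrows of length 1, 2 and 3.
coords = {"a": (0.0, 0.0, 0.0), "b": (1.0, 0.0, 0.0), "c": (3.0, 0.0, 0.0)}
arrows = [("a", "b"), ("b", "c"), ("a", "c")]
filtration = digraph_filtration(coords, arrows, thresholds=[0.5, 1.0, 2.0, 3.0])
for t, edges in filtration.items():
    print(t, sorted(edges))
```

Because the edge sets are nested as t grows, each inclusion induces the maps between homology groups from which the persistent Betti numbers are read off.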

Variational auto-encoders

In PGH-VAEs training, four key losses shape model performance. The reconstruction losses for ligand and coordination features quantify how accurately the decoder recreates these properties from the latent space, ensuring the network captures essential structural information. Minimizing these losses through encoder-decoder optimization improves feature representation and generalization. The property prediction loss, reflecting discrepancies between predicted and actual properties (adsorption energies), is reduced via supervised learning to enhance accuracy. KL divergence ensures the latent space aligns with a smooth prior distribution, enabling meaningful interpolation and sampling. Balancing the KL divergence term weight maintains a trade-off between reconstruction fidelity and latent space regularization.
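A minimal sketch of how the four terms might be combined is given below. The mean-squared-error form of the reconstruction and property losses, the standard normal prior, the single concatenated latent vector, and the weights `w_prop` and `w_kl` are all assumptions made for illustration; they are not the settings used in this work.

```python
import numpy as np

def kl_divergence(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), averaged over the batch."""
    per_sample = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1)
    return float(np.mean(per_sample))

def mse(pred, target):
    """Mean squared error between two arrays of equal shape."""
    return float(np.mean((np.asarray(pred) - np.asarray(target)) ** 2))

def total_loss(ligand, ligand_rec, coord, coord_rec, energy, energy_pred,
               mu, log_var, w_prop=1.0, w_kl=1e-3):
    """Sum of the four loss terms; w_prop and w_kl are placeholder weights."""
    return (mse(ligand_rec, ligand)              # ligand reconstruction
            + mse(coord_rec, coord)              # coordination reconstruction
            + w_prop * mse(energy_pred, energy)  # property prediction
            + w_kl * kl_divergence(mu, log_var)) # latent regularization

# Toy batch of two samples with a two-dimensional latent space.
mu = np.array([[1.0, 0.0], [0.0, 0.0]])
log_var = np.zeros((2, 2))
print(kl_divergence(mu, log_var))  # 0.25
```

Raising `w_kl` pulls the latent codes toward the prior at the cost of reconstruction accuracy, which is the trade-off described above.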

In the sampling process, PGH-VAEs construct a low-dimensional latent space, where each point represents an input data instance. The coordinates of these points are derived from the downscaled ligand and coordination variables, respectively. Within this latent space, our framework identifies regions that may contain potential active sites by analyzing the characteristics of the existing data points. A sample point is then selected from these regions. The decoders reconstruct the ligand and coordination variables into ligand and coordination feature vectors, respectively, generating potential active sites. These feature vectors are combined and input into the property prediction branch to generate the predicted adsorption energy. This process facilitates the design of active sites with optimal adsorption capacity.

DFT calculation data

This work uses the same parameter settings and data as ref. 40. Spin-polarized plane-wave DFT60 calculations were performed for HEA structure optimization using VASP with the projector augmented-wave method and the PBE exchange-correlation functional. A kinetic energy cutoff of 500 eV and a Fermi smearing width of 0.1 eV were used to ensure convergence. Van der Waals interactions were tested on a subset of data using Grimme’s DFT-D3 method, revealing a negligible effect. Supercells contained at least 64 atoms, with a Monkhorst-Pack (2,2,1) k-point grid. Geometry optimizations were terminated when atomic forces fell below 0.02 eV/Å.

Surfaces with various Miller indices were created using ASE and PyMatGen, with lateral dimensions of at least 10 Å to minimize periodic image interactions. Lattice constants were averaged according to Vegard’s law. Surfaces included at least four atomic layers, with the bottom two fixed during relaxation, and a vacuum of 10 Å was added in the vertical direction.

The adsorption energy of *OH is reported relative to that on Pt(111), the most commonly used reference surface:

$$\Delta {E}_{OH}-\Delta {E}_{OH,Pt(111)}=({E}_{* OH}-{E}_{* })-({E}_{* OH,Pt(111)}-{E}_{* ,Pt(111)}).$$
(9)

where E*OH, E*, E*OH,Pt(111), E*,Pt(111) are the electronic energies of *OH-adsorbed interface, clean interface, *OH-adsorbed interface of Pt(111), and clean interface of Pt(111), respectively.

The activity calculation of PtPdRu

For a type of catalytic active site, where the proportion of Pt is rPt, the proportion of Pd is rPd, and the proportion of Ru is rRu, with a total of N such catalytic active sites, the activity j can be determined using the following expression based on the form of the Arrhenius equation37,61:

$$j=\frac{1}{N}\mathop{\sum }\limits_{i=1}^{N}({r}_{Pt}\times {r}_{Pd}\times {r}_{Ru}){e}^{-\frac{| \Delta {E}_{OH}^{i}-\Delta {E}_{target}| }{T{k}_{b}}}.$$
(10)

Here, \(\Delta {E}_{OH}^{i}\) represents the modeled adsorption energy for the ith active site. ΔEtarget is the optimal *OH adsorption energy according to the Sabatier principle, which is 0.1 eV higher than that on Pt(111)37,53. Based on this setting, the activity of Pt-Pd-Ir-Ru-Rh (111) is assigned a j value of 7.0 × 10−9, and the activity of Pt-Pd-Ir-Ru-Rh (211) is assigned a j value of 3.3 × 10−4. kb is Boltzmann’s constant, and T is the absolute temperature. Here, we assume that T = 300 K.
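For illustration, Eq. (10) can be evaluated with a few lines of NumPy. The sketch follows the expression exactly as written, with adsorption energies expressed relative to Pt(111) so that the target is 0.1 eV; the function name and the example energies are hypothetical and are not data from this work.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def activity(delta_e_oh, r_pt, r_pd, r_ru, delta_e_target=0.1, temperature=300.0):
    """Evaluate Eq. (10) for N sites.

    delta_e_oh: *OH adsorption energies relative to Pt(111), in eV.
    r_pt, r_pd, r_ru: elemental proportions entering the prefactor.
    """
    delta_e_oh = np.asarray(delta_e_oh, dtype=float)
    weight = r_pt * r_pd * r_ru
    boltzmann = np.exp(-np.abs(delta_e_oh - delta_e_target) / (K_B * temperature))
    return float(np.mean(weight * boltzmann))

# Hypothetical energies for three sites (illustrative values only).
energies = [0.10, 0.05, 0.30]
print(activity(energies, r_pt=0.7, r_pd=0.2, r_ru=0.1))
```

At 300 K the thermal energy is about 0.026 eV, so a site that misses the target by 0.2 eV contributes less than a thousandth of what a site exactly at the target contributes.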