Fig. 1

Workflow to link synthesis descriptors to structure descriptors in zeolites. a Machine learning models were constructed from experimental records in the literature; the dataset contains synthesis descriptors and the corresponding outcomes. b Synthesis descriptors extracted from the machine learning models were used to map the synthesizable domains of zeolites onto a multidimensional (kinetic) phase diagram. The weight, x_i, indicates the importance of each synthesis descriptor, i, obtained from the machine learning models. The synthesis similarity is represented by the distance between the centers of the synthesis conditions for each phase. c Structure descriptors define the structural similarity in a multidimensional space representing the presence or absence of building units. To link the synthesis descriptors to the structure descriptors quantitatively, the weight, w_j, for each structure descriptor, j, was optimized so that the structural similarity (arrow in c) approximates the synthesis similarity (arrow in b). d A network was constructed by connecting structurally similar zeolites based on the structure descriptors. The resulting clustering was verified with historical data and our experiments.
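
The weight-fitting step in panels b and c can be illustrated with a minimal sketch. It is not the authors' optimization procedure: it assumes squared weighted Euclidean distances, binary (0/1) building-unit descriptors, and an ordinary least-squares fit with negative weights clipped to zero. The function names (`pairwise_sq_distances`, `fit_structure_weights`) and all toy values below are hypothetical.

```python
import itertools
import numpy as np


def pairwise_sq_distances(points, weights):
    """Squared weighted Euclidean distance for every pair of rows in `points`."""
    pairs = list(itertools.combinations(range(len(points)), 2))
    sq = np.array([np.sum(weights * (points[a] - points[b]) ** 2) for a, b in pairs])
    return pairs, sq


def fit_structure_weights(structure, target_sq, pairs):
    """Fit weights w_j so that weighted structural distances approximate
    the target (synthesis) distances for the given phase pairs."""
    # For 0/1 descriptors the squared difference is 0 or 1, so the squared
    # weighted distance is linear in w and can be fitted by least squares.
    design = np.array(
        [(structure[a] - structure[b]) ** 2 for a, b in pairs], dtype=float
    )
    w, *_ = np.linalg.lstsq(design, target_sq, rcond=None)
    # Crude non-negativity step (an assumption, not a true constrained fit).
    return np.clip(w, 0.0, None)


# Hypothetical toy data: 4 phases, 2 synthesis descriptors, 3 building units.
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
x = np.array([0.7, 0.3])  # assumed importance x_i of each synthesis descriptor
structure = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]])

# Synthesis similarity: distance between phase centers, weighted by x_i.
pairs, synth_sq = pairwise_sq_distances(centers, x)
# Structural similarity: fit w_j so structural distances track synthesis ones.
w = fit_structure_weights(structure, synth_sq, pairs)
_, struct_sq = pairwise_sq_distances(structure, w)

print("fitted w:", w)
print("residual:", np.linalg.norm(struct_sq - synth_sq))
```

Because the squared distance is linear in w_j for binary descriptors, the fit reduces to a linear regression over phase pairs; a properly constrained solver would be a natural replacement for the clipping step.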