Fig. 1

Overview of the proposed method. The target data consist of pseudo-binary, pseudo-ternary, and pseudo-quaternary oxide compositions extracted from the Inorganic Crystal Structure Database (ICSD). Each composition is expressed as a set of end members and their compositional ratios. First, Tucker decomposition is applied to the pseudo-binary oxide data to obtain tensor embeddings of the end members. Next, these embeddings are used to encode each composition into a compositional descriptor. A classification model is then trained on the encoded pseudo-binary oxide data to predict whether a composition is registered in the ICSD. Finally, the trained model is applied to the encoded pseudo-ternary and pseudo-quaternary oxide compositions, and its predictions are evaluated against their actual ICSD registrations.
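The encoding step can be illustrated with a minimal sketch. The caption does not specify how end-member embeddings are combined, so the ratio-weighted sum used here is an assumption, not the authors' stated scheme. The embedding values, the names `EMBEDDINGS`, `normalize_ratios`, and `encode_composition`, and the example compositions are all hypothetical; in the proposed method the embeddings would come from the Tucker decomposition of the pseudo-binary data.

```python
from typing import Dict, List

# Hypothetical 2-dimensional end-member embeddings (illustrative values only).
# In the proposed method these would be obtained from Tucker decomposition.
EMBEDDINGS: Dict[str, List[float]] = {
    "BaO": [1.0, 0.0],
    "TiO2": [3.0, 2.0],
    "SrO": [0.0, 4.0],
}


def normalize_ratios(amounts: Dict[str, float]) -> Dict[str, float]:
    """Convert end-member amounts (e.g. 1:1 or 1:1:2) into fractions summing to 1."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("end-member amounts must sum to a positive value")
    return {member: amount / total for member, amount in amounts.items()}


def encode_composition(
    amounts: Dict[str, float], embeddings: Dict[str, List[float]]
) -> List[float]:
    """Assumed encoding: ratio-weighted sum of the end-member embeddings."""
    fractions = normalize_ratios(amounts)
    dim = len(next(iter(embeddings.values())))
    descriptor = [0.0] * dim
    for member, fraction in fractions.items():
        vector = embeddings[member]  # KeyError if the end member has no embedding
        for k in range(dim):
            descriptor[k] += fraction * vector[k]
    return descriptor


# Pseudo-binary example: BaO : TiO2 = 1 : 1
binary_descriptor = encode_composition({"BaO": 1, "TiO2": 1}, EMBEDDINGS)

# Pseudo-ternary example: BaO : SrO : TiO2 = 1 : 1 : 2
ternary_descriptor = encode_composition({"BaO": 1, "SrO": 1, "TiO2": 2}, EMBEDDINGS)
```

Under this assumption the descriptor has the same length regardless of the number of end members, which is what would allow a classifier trained on pseudo-binary descriptors to be applied to pseudo-ternary and pseudo-quaternary compositions.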