Fig. 5: Clustering and performance of MLIPs in different property categories.

The optimal MLIP models on the Pareto fronts for the property pairs b \(|\delta {E}_{{\rm{f}}}^{{\rm{vacancy}}}|\) versus \(|\delta {E}_{{\rm{f}}}^{{\rm{hexagonal}}}|\), d \(|\delta {E}_{{\rm{f}}}^{{\rm{vacancy}}}|\) versus \(|\delta {E}_{{\rm{f}}}^{{\rm{tetrahedral}}}|\), g \(|\Delta {NAC}(|{\delta }_{{\rm{F}}}|,{{\mathcal{D}}}^{{\rm{RE}}-{\rm{I}}})|\) versus \(|\Delta {NAC}(|{\delta }_{{\rm{F}}}|,{{\mathcal{D}}}^{{\rm{RE}}-{\rm{V}}})|\), and i \(|{\sigma }_{{\rm{F}}}^{{\rm{RE}}-{\rm{V}}}|\) versus \(|\Delta {NAC}(|{\delta }_{{\rm{F}}}|,{{\mathcal{D}}}^{{\rm{RE}}-{\rm{I}}})|\). The minimum values of the error metrics are indicated by black lines, and their intersections serve as the reference points for calculating the hypervolume (HV) and inverted generational distance (IGD) scores. The fraction of optimal MLIP models of each MLIP type with the lowest and second-lowest a HV scores and c IGD scores over all property pairs in the defect formation energy category, and correspondingly f, h in the REs category. The clusters of optimal MLIP models, connected according to their similarities (Methods), based on e all four properties in the defect formation energy category and j all eight properties in the REs category.
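As a rough illustration of the quantities named in this caption, the sketch below extracts a two-metric Pareto front, builds the reference point from the per-metric minima (the intersection of the black lines), and evaluates an IGD-style distance. It assumes the textbook IGD definition applied to a reference set containing only that single point; the scoring actually used for the figure is defined in the paper's Methods and may differ. The function names and the error values are hypothetical, chosen only for illustration.

```python
import numpy as np

def pareto_front(errors):
    """Return indices of non-dominated rows; every column is an error to minimize."""
    errors = np.asarray(errors, dtype=float)
    keep = []
    for i, p in enumerate(errors):
        # p is dominated if another row is <= p in every metric and < p in at least one.
        dominated = np.any(np.all(errors <= p, axis=1) & np.any(errors < p, axis=1))
        if not dominated:
            keep.append(i)
    return keep

def ideal_point(errors):
    """Per-metric minima: the intersection of the minimum-value lines."""
    return np.asarray(errors, dtype=float).min(axis=0)

def igd_single_reference(subset, reference):
    """Textbook IGD with a one-point reference set (an assumption, not the paper's
    definition): the distance from the reference point to the nearest model in subset."""
    subset = np.asarray(subset, dtype=float)
    return float(np.min(np.linalg.norm(subset - reference, axis=1)))

# Hypothetical absolute errors for five models on two properties.
errors = np.array([
    [0.10, 0.40],
    [0.20, 0.20],
    [0.40, 0.10],
    [0.30, 0.30],
    [0.50, 0.50],
])

front = pareto_front(errors)      # indices of Pareto-optimal models
ref = ideal_point(errors)         # reference point from the per-metric minima
score = igd_single_reference(errors[front], ref)
print(front, ref, score)
```

With this convention a lower score means the selected models lie closer to the reference point, which matches the caption's ranking by lowest and second-lowest scores.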