Fig. 4: UMAP two-dimensional projection of materials representations for domain identification.
From: Probing out-of-distribution generalization in machine learning for materials

a, b UMAP plots of ALIGNN embeddings learned from the leave-Mg-out and leave-O-out tasks in JARVIS. c UMAP plot of ALIGNN embeddings learned from the leave-H-out task in OQMD. d UMAP plot of ALIGNN embeddings learned by leaving out structures with five or more elements in MP. e UMAP plot of the XGB descriptors for the leave-period-5-out task in MP. f Absolute errors (left y-axis) of the test data as a function of the kernel-density estimate of the training data in the UMAP plot of (e); the solid line denotes the MAEs (right y-axis) for different density intervals. In all UMAP plots, the training data, in-domain test data, and out-of-domain test data are marked in gray, blue, and red, respectively; clusters of out-of-domain test data are circled in (a–c); the R² scores are indicated for the in-domain, out-of-domain, and all-domain test data.
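
The density analysis described for panel f can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the caption does not state the kernel, bandwidth, or binning scheme, so the fixed-bandwidth Gaussian kernel, the equal-width density intervals, and the function names `gaussian_kde_2d` and `mae_by_density` are all assumptions introduced here. The input coordinates are synthetic stand-ins for a 2D UMAP projection.

```python
import numpy as np


def gaussian_kde_2d(train_xy, query_xy, bandwidth=0.5):
    """Fixed-bandwidth isotropic Gaussian KDE of train_xy, evaluated at query_xy.

    Assumption: the kernel and bandwidth are illustrative choices only.
    """
    train_xy = np.asarray(train_xy, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)
    # Pairwise squared distances, shape (n_query, n_train).
    diff = query_xy[:, None, :] - train_xy[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    norm = 2.0 * np.pi * bandwidth ** 2  # normalisation of a 2D Gaussian
    return np.exp(-0.5 * sq_dist / bandwidth ** 2).mean(axis=1) / norm


def mae_by_density(density, abs_err, n_bins=5):
    """Mean absolute error of test points grouped into equal-width density intervals.

    Returns the interval edges and one MAE per interval (NaN for empty intervals).
    """
    density = np.asarray(density, dtype=float)
    abs_err = np.asarray(abs_err, dtype=float)
    edges = np.linspace(density.min(), density.max(), n_bins + 1)
    # Digitising against the interior edges yields bin indices 0..n_bins-1.
    idx = np.digitize(density, edges[1:-1])
    maes = np.array([
        abs_err[idx == b].mean() if np.any(idx == b) else np.nan
        for b in range(n_bins)
    ])
    return edges, maes


# Synthetic stand-ins for 2D UMAP coordinates and per-sample absolute errors.
rng = np.random.default_rng(0)
train_xy = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
test_xy = rng.normal(loc=1.5, scale=1.5, size=(200, 2))
abs_err = rng.random(200)  # placeholder errors, not real model output

test_density = gaussian_kde_2d(train_xy, test_xy)
edges, maes = mae_by_density(test_density, abs_err, n_bins=5)
```

In practice a library estimator such as `scipy.stats.gaussian_kde` would likely replace the hand-written kernel; the point of the sketch is only that each test sample gets a training-density value, and errors are then averaged within density intervals.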