Fig. 7
From: Massive Atomic Diversity: a compact universal dataset for atomistic machine learning

Three-dimensional projections of the MAD dataset and popular benchmarks, using MLP-trained SMAP projections of PET-MAD last-layer embeddings. The grayscale points in the background correspond to the full test subset of the MAD dataset, and the colored points in each panel correspond to a small set of 85 structures randomly selected from each dataset, the same number in every panel. Insets show the histogram of the Euclidean distances between the highlighted structures in each panel, with the histogram of distances within the MAD dataset plotted for reference.
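
The inset histograms can be illustrated with a minimal sketch, which is not the authors' code. It assumes that the distances are taken between all unordered pairs of the 85 highlighted points and that they are measured in the three-dimensional projected space; the caption does not say whether the projection or the original embedding space is used. The data below are random placeholders, and the helper names `pairwise_distances` and `histogram` are hypothetical.

```python
import itertools
import math
import random


def pairwise_distances(points):
    """Return the Euclidean distance for every unordered pair of points."""
    return [math.dist(a, b) for a, b in itertools.combinations(points, 2)]


def histogram(values, n_bins, lo, hi):
    """Count values in n_bins equal-width bins spanning [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v <= hi:
            # Clamp so that v == hi falls into the last bin.
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
    return counts


rng = random.Random(0)

# Placeholder data (assumption): random 3-D points standing in for the
# projected embeddings of one dataset. Real coordinates would come from
# the projection described in the caption.
dataset = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(1000)]

# Randomly select 85 structures, as in each panel of the figure.
subset = rng.sample(dataset, 85)

# Distances between the highlighted structures, then a 20-bin histogram.
distances = pairwise_distances(subset)
counts = histogram(distances, n_bins=20, lo=0.0, hi=max(distances))
```

With 85 structures there are 85 × 84 / 2 = 3570 pairs, so each inset would summarize 3570 distances under this assumption. A dataset with a narrow histogram concentrated at small distances covers a smaller region of the projected space than one with a broad histogram.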