Fig. 2: UMAP embeddings vary when other biomolecular structures are subsampled.

The random, uniform choice of the 19,994 biomolecular structures affects the UMAP embedding. To investigate the effect of subsampling, we drew nine further uniform subsamples of 10,000 structures each from these structures and computed a UMAP embedding for each. Compound classes are color-coded as in Fig. 1. For easier visual inspection, some of the plots have been mirrored, as indicated. Source data are provided as a Source Data file.
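
The repeated subsampling described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that each of the nine subsamples is drawn independently and without replacement, which the caption does not state. The function names `draw_subsamples` and `mirror_horizontally` and the seed are hypothetical. The UMAP step itself is omitted; each index list would select the structures passed to the embedding.

```python
import random

N_STRUCTURES = 19_994    # size of the original subsample (from the caption)
SUBSAMPLE_SIZE = 10_000  # size of each further subsample (from the caption)
N_REPEATS = 9            # number of further subsamples (from the caption)


def draw_subsamples(n_total, size, repeats, seed=0):
    """Return `repeats` index lists, each drawn uniformly without replacement.

    Assumption: draws are independent of each other; the seed is illustrative.
    """
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_total), size)) for _ in range(repeats)]


def mirror_horizontally(points):
    """Negate the first coordinate of each 2D point.

    Mirroring changes only the orientation of a plot, not the distances
    between points, so it is safe to apply for visual comparison.
    """
    return [(-x, y) for x, y in points]


subsamples = draw_subsamples(N_STRUCTURES, SUBSAMPLE_SIZE, N_REPEATS)
```

Because UMAP coordinates have no intrinsic orientation, mirroring a panel in this way makes the nine embeddings easier to compare by eye without altering their structure.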