Fig. 4: Simulation studies demonstrate the effectiveness of proposed scores.
From: Assessing and improving reliability of neighbor embedding methods: a map-continuity perspective

a Perturbation scores identify unreliable embedding points that have reduced uncertainty. Input points from 5-component Gaussian mixture data form separated clusters in the embedding space. t-SNE reduces the perceived uncertainty of input points in the overlapping region (left), as captured by a label-dependent measurement, the entropy difference (middle). Our perturbation scores identify the same unreliable embedding points without label information (right). Singularity scores reveal spurious sub-clusters on Gaussian mixture data (b) and Swiss roll data (c). At a low perplexity, t-SNE creates many spurious sub-clusters. Embedding points with high singularity scores at random locations indicate such spurious structures. Source data are provided as a Source Data file.
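
The idea behind the label-dependent entropy difference in panel a can be illustrated with a minimal sketch. It assumes one plausible reading of the measure: the entropy of a point's class posterior in the input space minus the corresponding entropy in the embedding space, so that a large positive value marks a point whose ambiguity was hidden by the embedding. The function names, the equal-weight isotropic mixture, the two-component setup (the figure uses five), and the toy embedding coordinates are all assumptions made for illustration. They are not the authors' implementation, and no t-SNE run is involved.

```python
import numpy as np

def posterior_entropy(points, means, sigma=1.0):
    """Entropy (in nats) of class posteriors under an equal-weight,
    isotropic Gaussian mixture. Hypothetical helper for illustration."""
    # Squared distance from every point to every component mean: shape (n, k).
    sq_dist = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    log_lik = -sq_dist / (2.0 * sigma ** 2)
    # Numerically stable softmax over components.
    log_lik = log_lik - log_lik.max(axis=1, keepdims=True)
    post = np.exp(log_lik)
    post = post / post.sum(axis=1, keepdims=True)
    # Treat 0 * log(0) as 0 by replacing zero probabilities with 1 inside the log.
    safe = np.where(post > 0, post, 1.0)
    return -(post * np.log(safe)).sum(axis=1)

def entropy_difference(h_input, h_embedding):
    """Assumed definition: input-space entropy minus embedding-space entropy."""
    return np.asarray(h_input) - np.asarray(h_embedding)

# Input space: two overlapping components (toy setup, not the paper's data).
input_means = np.array([[0.0, 0.0], [4.0, 0.0]])
input_points = np.array([
    [2.0, 0.0],  # midpoint between the two means: maximally ambiguous
    [0.0, 0.0],  # at a component mean: nearly unambiguous
])

# Embedding space: made-up coordinates in which the clusters are well
# separated and the ambiguous point has been absorbed into one cluster.
embed_means = np.array([[0.0, 0.0], [40.0, 0.0]])
embed_points = np.array([[1.0, 0.0], [0.0, 0.0]])

h_in = posterior_entropy(input_points, input_means)
h_emb = posterior_entropy(embed_points, embed_means)
diff = entropy_difference(h_in, h_emb)

# The midpoint has input entropy log(2) but near-zero embedding entropy,
# so its entropy difference is large; the point at the mean stays near zero.
print(h_in, h_emb, diff)
```

In this toy setup the point from the overlapping region is the only one with a large entropy difference, which mirrors the pattern the caption describes for the middle plot of panel a. Computing it requires the mixture components, that is, label information, which is what the perturbation score avoids.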