Fig. 3
From: Interpretable dimensionality reduction of single cell transcriptome data with deep generative models

Benchmarking scvis against GPLVM, parametric t-SNE, and PCA in embedding 22,000 synthetic out-of-sample data points. a scvis embedding of 22,000 new data points using the probabilistic mapping function learned from the 2200 training data points; b the estimated log-likelihoods; and c the average K-nearest neighbor classification accuracies for different values of K across 11 runs. The classifiers were trained on the 11 embeddings of the 2200 training points. The numbers at the top are the FDRs (one-sided Mann–Whitney U-test) comparing the K-nearest neighbor classification accuracies of scvis with those of GPLVM (orange, bottom) and those of parametric t-SNE (golden, top). Note that two GPLVM runs produced bad results and were not plotted in the figure. Boxplots denote the medians and the interquartile ranges (IQRs). The whiskers of a boxplot extend to the lowest datum still within 1.5 IQR of the lower quartile and the highest datum still within 1.5 IQR of the upper quartile. d scvis results on the larger dataset with the same perplexity parameter as used in Fig. 2; e scvis log-likelihoods on the larger dataset; and f t-SNE results on the larger dataset
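
The whisker rule stated in the caption can be illustrated with a minimal Python sketch. The function name `tukey_whiskers` and the accuracy values are hypothetical and are not taken from the paper. Quartiles are computed with NumPy's default linear interpolation, which may differ slightly from the plotting software used for the figure.

```python
import numpy as np

def tukey_whiskers(values):
    """Return (lower, upper) whisker positions for a boxplot.

    The lower whisker is the lowest datum still within 1.5 IQR of the
    lower quartile; the upper whisker is the highest datum still within
    1.5 IQR of the upper quartile.
    """
    x = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])  # lower and upper quartiles
    iqr = q3 - q1                        # interquartile range
    lower = x[x >= q1 - 1.5 * iqr].min()
    upper = x[x <= q3 + 1.5 * iqr].max()
    return float(lower), float(upper)

# Hypothetical accuracies from 11 runs, one of which is an outlier.
accuracies = [0.90, 0.91, 0.92, 0.93, 0.94, 0.95,
              0.96, 0.97, 0.98, 0.99, 0.50]
lower, upper = tukey_whiskers(accuracies)
print(lower, upper)
```

In this made-up example the run at 0.50 falls below the lower fence, so it would be drawn as an outlier point rather than stretching the whisker.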