Extended Data Fig. 2: Optimization of scVI parameters.
From: HypoMap—a unified single-cell gene expression atlas of the murine hypothalamus

a, Evaluation metrics calculated on scVI integration results of full HypoMap (384,925 cells). Purity refers to cell type purity only. Cell type separation (asw_norm = average silhouette width) is shown by the point size (see Methods for details on metrics). PCA (orange) mixes the data less well than scVI (pink). b, Evaluation metrics similar to (a), calculated on scVI integration results with comparable hyperparameters using either all cells (light blue) or only neurons (grey) as input. Using all cells as input did not affect the integration performance in mixing and purity, but the cluster separation (asw_norm) was lower. c, Example box plots for detailed evaluation of scVI hyperparameters, visualizing the influence of the number of training rounds (epochs) and hidden layers on the three different metrics. Each point corresponds to an scVI training run on the full HypoMap data. The center of each box plot is the median of all runs, the lower and upper hinges correspond to the first and third quartiles, and the whiskers extend from each hinge to the most extreme value no further than 1.5 times the interquartile range (the distance between the first and third quartiles) from that hinge. Overall, n = 224 scVI runs were compared; the number of runs differs between box plots depending on the parameters. d, UMAP visualization of HypoMap colored by dataset to visualize mixing. e, UMAP visualization of HypoMap colored by mapped cell types for the evaluation of purity (see Methods and Supplementary Table 2 for details).
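
The box plot construction described for panel c can be illustrated with a minimal sketch. It assumes the standard rule in which whiskers reach the most extreme data point within 1.5 times the interquartile range of each hinge. The function name `box_stats` and the input values are hypothetical and are not taken from the HypoMap analysis; quartiles here use NumPy's default linear interpolation, which may differ slightly from the hinges computed by the plotting software used for the figure.

```python
import numpy as np

def box_stats(values):
    """Median, hinges and whisker ends for one box (illustrative helper)."""
    x = np.asarray(values, dtype=float)
    # Hinges approximated by the 25th and 75th percentiles (assumption).
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    lower_whisker = x[x >= lower_fence].min()
    upper_whisker = x[x <= upper_fence].max()
    # Points beyond the whiskers would be drawn individually.
    outliers = x[(x < lower_fence) | (x > upper_fence)]
    return {
        "median": float(median),
        "q1": float(q1),
        "q3": float(q3),
        "lower_whisker": float(lower_whisker),
        "upper_whisker": float(upper_whisker),
        "outliers": outliers.tolist(),
    }

# Made-up metric values for six runs, one of them extreme.
stats = box_stats([1, 2, 3, 4, 5, 20])
print(stats)
```

In this toy input the value 20 lies beyond the upper fence, so the upper whisker stops at 5 and 20 is reported as an outlier.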