Supplementary Figure 5: Deeply sequenced SMART-seq2/C1 mESC data have similar characteristics for batch correction (Kolodziejczyk et al.).
From: A test metric for assessing single-cell RNA-seq batch correction

(a) Illustration of two full-length read datasets with replicates in 2i, LIF and a2i culture conditions (219, 207 and 123 cells, respectively). (b) PCA plots for log(CPM + 1) ComBat-corrected data. (c) Percentage of retained highly variable genes versus kBET acceptance rate (1 − rejection rate) for all combinations of normalization and batch-correction approaches. The best-performing normalization-regression strategies, such as ComBat on log(CPM + 1) data, cluster in the top right corner. Isolated cells, which have no mutual nearest neighbors, occur under some correction models. Seurat’s CCA alignment batch-corrects data only in a latent space, as is done in manifold learning; we therefore could not compute highly variable genes for it and show only kBET values.
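
The quantities named in this caption can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, the natural logarithm is an assumption (the caption does not state the base), and "retained highly variable genes" is assumed here to mean the share of genes called highly variable before correction that are still called highly variable afterwards.

```python
import math


def log_cpm(counts):
    """Return log(CPM + 1) for one cell's raw gene counts.

    Assumption: natural logarithm; the caption does not specify the base.
    """
    total = sum(counts)
    if total == 0:
        # A cell with no reads has CPM 0 for every gene, so log(0 + 1) = 0.
        return [0.0 for _ in counts]
    return [math.log1p(c / total * 1e6) for c in counts]


def acceptance_rate(rejection_rate):
    """kBET acceptance rate, stated in the caption as 1 - rejection rate."""
    if not 0.0 <= rejection_rate <= 1.0:
        raise ValueError("rejection rate must lie in [0, 1]")
    return 1.0 - rejection_rate


def retained_hvg_percentage(hvgs_before, hvgs_after):
    """Percentage of pre-correction HVGs that are still HVGs after correction.

    Assumption: this overlap definition is illustrative, not taken from the source.
    """
    before = set(hvgs_before)
    if not before:
        return 0.0
    return 100.0 * len(before & set(hvgs_after)) / len(before)


if __name__ == "__main__":
    # Toy values only; these are not results from the figure.
    print(log_cpm([3, 0, 7]))
    print(acceptance_rate(0.25))
    print(retained_hvg_percentage(["g1", "g2", "g3", "g4"], ["g1", "g2", "g9"]))
```

Under these assumptions, a strategy in the top right corner of panel (c) is one for which both `acceptance_rate` and `retained_hvg_percentage` are high.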