Figure 4 | Scientific Reports

Figure 4

From: Correspondence analysis for dimension reduction, batch integration, and visualization of single-cell RNA-seq data

The corralm multi-table adaptation of CA integrates count matrices across batches by finding a shared, low-dimensional latent space. (A) Comparison of nine integration workflows on the SCMixology benchmarking dataset, which comprises mixtures of three cell lines (H2228, H1975, and HCC827), each prepared with three library preparation protocols (Dropseq, Celseq2, and 10X) followed by Illumina sequencing. The first column shows results on counts, and the second column shows results on logcounts (where appropriate). corralm is both fast and performant, and can be combined with methods such as Harmony (third row) to further improve performance. (B) Scaled variance (SV) of the batches representing the three SCMixology library preparation platforms, computed on the first three components of the counts and logcounts embeddings presented in Fig. 4A, colored by batch. SV values close to 1 indicate that the embeddings exhibit similar distributions across batches. corralm, Harmony with corralm, and SCTransform exhibit good batch alignment, while Harmony with PCA shows values far from 1, suggesting that the embeddings were not successfully integrated across batches. This panel includes all methods with ranked components. (C) Batch integration of pancreas data. For each of a selected set of methods, the left column shows UMAPs colored by dataset (batch), while the right column shows UMAPs colored by cell type. (D) ASW_cell type (x axis) assesses how well the embedding preserves biological context, while 1 − ASW_batch (y axis) assesses integration. For all methods, these are computed on 8 PCs.
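The two kinds of metric named in panels B and D can be illustrated with a minimal pure-Python sketch. This is not the authors' code or the corral package API. It assumes that ASW denotes the average silhouette width under Euclidean distance, and that scaled variance is the variance of one batch's embedding values divided by the variance of all values along the same component. The function names and the toy data are hypothetical.

```python
from math import dist
from statistics import mean, pvariance


def average_silhouette_width(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)
    if len(groups) < 2:
        raise ValueError("need at least two distinct labels")
    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in groups[lab] if j != i]
        if not own:
            widths.append(0.0)  # common convention for singleton groups
            continue
        # a: mean distance to other members of the point's own group
        a = mean(dist(points[i], points[j]) for j in own)
        # b: smallest mean distance to the members of any other group
        b = min(
            mean(dist(points[i], points[j]) for j in idx)
            for other, idx in groups.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return mean(widths)


def scaled_variance(values, batches):
    """Per-batch variance divided by overall variance (assumed definition)."""
    total = pvariance(values)
    return {
        b: pvariance([v for v, bb in zip(values, batches) if bb == b]) / total
        for b in set(batches)
    }


# Hypothetical toy embedding: two well-separated cell types, batches interleaved.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
cell_type = ["A"] * 4 + ["B"] * 4
batch = ["x", "y"] * 4

asw_cell_type = average_silhouette_width(points, cell_type)
asw_batch = average_silhouette_width(points, batch)
print("ASW cell type:", asw_cell_type)   # high when cell types are separated
print("1 - ASW batch:", 1 - asw_batch)   # high when batches are well mixed

# Scaled variance along the first component, per batch.
first_component = [p[0] for p in points]
print("SV:", scaled_variance(first_component, batch))
```

In this toy case the cell types form tight, distant clusters while the batches are interleaved, so the cell-type ASW is high and the batch ASW is low. That is the pattern a well-integrated embedding would be expected to show in panel D.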
