Fig. 4: Doublet removal by Chord improves the analysis performance on real-world scRNA-Seq data. | Communications Biology

From: Chord: an ensemble machine learning algorithm to identify doublets in single-cell RNA sequencing data

a Doublet detection was performed with Chord on the published lung cancer dataset (ref. 23). The number and proportion of doublets were recorded for each cell type, as labelled in the original paper. b UMAP of the 24,280 cells in this dataset, coloured by cell type (left), by predicted doublet status (middle) and by Seurat-defined cluster (right). c Bar chart showing the number of doublets and total cells in each cluster. d Heatmap of marker genes for doublets (in cluster 10), T cells and plasma cells. e ROGUE value (ref. 24) of each cell type in each sample. A paired t-test was used to compare each cell type in each sample before and after doublet filtration (*p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001). f Bar chart showing the change in the total number of differentially expressed genes (DEGs) before and after doublet removal. DEGs were identified with the Wilcoxon rank-sum test (Seurat), with the logFC threshold varied from 0.25 to 0.75 in steps of 0.05. g RNA UMI counts in each cell type differed significantly between the doublets and singlets predicted by Chord (unpaired t-test, *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001). h Pseudotime analysis of the myeloid cells in the original and filtered data, performed with Slingshot (ref. 20). The trajectories for the original and the filtered data are shown separately.
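
The threshold sweep described for panel f can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the input layout (gene, logFC, adjusted p-value) and the significance cutoff `alpha=0.05` are assumptions made for illustration, and the per-gene statistics are assumed to come from a separate Wilcoxon rank-sum analysis.

```python
def logfc_thresholds(start=0.25, stop=0.75, step=0.05):
    """Return the logFC thresholds from start to stop (inclusive) in fixed steps."""
    n_steps = round((stop - start) / step)
    # Round each value to avoid floating-point drift such as 0.30000000000000004.
    return [round(start + i * step, 2) for i in range(n_steps + 1)]


def count_degs(results, threshold, alpha=0.05):
    """Count genes whose |logFC| meets the threshold and whose adjusted p-value is below alpha.

    `results` is an iterable of (gene, logfc, adjusted_p) tuples.
    The alpha cutoff is an assumption; the caption does not state one.
    """
    return sum(
        1
        for _gene, logfc, p_adj in results
        if abs(logfc) >= threshold and p_adj < alpha
    )


def sweep(results):
    """Map each logFC threshold to the number of DEGs that pass it."""
    return {t: count_degs(results, t) for t in logfc_thresholds()}
```

Running `sweep` once on results from the original data and once on results from the doublet-filtered data would give the two series of DEG counts that a bar chart like panel f compares.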