Fig. 1: scOpen and benchmarking of scATAC-seq imputation methods.
From: Chromatin-accessibility estimation from single-cell ATAC-seq data with scOpen

a scOpen receives as input a sparse peak-by-cell count matrix. After matrix binarization, scOpen performs a TF–IDF transformation followed by NMF for dimension reduction and matrix imputation. The imputed or reduced matrix can then be given as input to scATAC-seq methods for clustering, visualization, and interpretation of regulatory features.
b Memory requirements of imputation/denoising methods on benchmarking datasets. The x-axis represents the number of elements of the input matrix (number of open chromatin (OC) regions × number of cells).
c Same as b for running time requirements.
d Boxplot showing the evaluation of imputation/denoising methods for recovering true peaks. The y-axis indicates the area under the precision-recall curve (AUPR). Methods are ranked by mean AUPR. One asterisk and two asterisks indicate that the method is outperformed by the top-ranked method (scOpen) at significance levels of 0.05 and 0.01, respectively (Wilcoxon Rank Sum test, paired, two-sided, confidence level 0.95; n = 1224 cells for Cell lines, n = 2210 cells for Hematopoiesis, n = 765 cells for T-cells, and n = 10,032 cells for PBMC). The box plot shows the median (central line) and the first and third quartiles (box bounds). The whiskers represent 1.5 × the interquartile range (IQR), and external dots represent outliers (data points beyond 1.5 × IQR from the box).
e Barplots showing the silhouette score (y-axis) for benchmarking datasets.
f Barplots showing the clustering accuracy for distinct imputation methods. The y-axis indicates the mean adjusted Rand Index (ARI). Dots represent individual ARI values of distinct clustering methods. Data are represented as mean ± SD; error bars represent the standard deviation (SD) of ARI. One asterisk and two asterisks indicate that the method is outperformed by the top-ranked method at significance levels of 0.05 and 0.01, respectively (Wilcoxon Rank Sum test, paired, two-sided, confidence level 0.95; n = 8 independent clustering experiments).
Source data for Fig. 1 are provided as a Source Data file.
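
The pipeline in panel a (binarization, TF–IDF transformation, NMF) can be illustrated with a minimal NumPy sketch. This is not the scOpen implementation: the TF–IDF variant, the plain multiplicative-update NMF solver, the function names `binarize`, `tf_idf`, and `nmf`, the rank, and the toy Poisson matrix are all assumptions chosen for illustration only.

```python
import numpy as np

def binarize(counts):
    """Set every non-zero count to 1 (peaks x cells)."""
    return (counts > 0).astype(float)

def tf_idf(binary):
    """One common TF-IDF variant (an assumption, not necessarily scOpen's).

    TF: each cell (column) is divided by its number of open peaks.
    IDF: log(1 + n_cells / number of cells in which the peak is open).
    """
    n_cells = binary.shape[1]
    per_cell = np.maximum(binary.sum(axis=0, keepdims=True), 1.0)
    per_peak = np.maximum(binary.sum(axis=1, keepdims=True), 1.0)
    return (binary / per_cell) * np.log1p(n_cells / per_peak)

def nmf(X, rank, n_iter=200, seed=0, eps=1e-10):
    """Plain NMF with multiplicative updates (swapped in for illustration).

    Returns non-negative W (peaks x rank) and H (rank x cells) with X ~ W @ H.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + eps
    H = rng.random((rank, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy sparse peak-by-cell count matrix (hypothetical data).
rng = np.random.default_rng(1)
counts = rng.poisson(0.3, size=(50, 20))

X = tf_idf(binarize(counts))
W, H = nmf(X, rank=3)

imputed = W @ H  # dense peaks x cells matrix (imputation)
reduced = H      # rank x cells matrix (dimension reduction)
```

In this sketch, `imputed` stands in for the imputed matrix and `reduced` for the low-dimensional representation that the caption describes as input for downstream clustering and visualization.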