Extended Data Fig. 9: Impact of mean gene expression level on cell type prioritization. | Nature Biotechnology

From: Cell type prioritization in single-cell data

Cell type prioritizations were performed using both Augur and a representative single-cell differential expression method, the Wilcoxon rank-sum test, using the entire transcriptome (left column) or genes divided into five quintiles based on mean expression (right columns). Insets show two-sided Pearson correlations throughout. a, Relationship between Augur cell type prioritizations (AUC) and the proportion of differentially expressed genes between two simulated populations of cells (n = 200 cells total), as shown in Extended Data Fig. 1e. The mean and standard deviation of n = 10 independent simulations are shown. b, As in a, but with Augur applied to each quintile of gene expression separately. The AUC remains strongly correlated with the ground-truth perturbation intensity, regardless of mean expression levels (r ≥ 0.92). c, Relationship between Augur cell type prioritizations (AUC) and the location parameter of the differential expression factor log-normal distribution between two simulated populations of cells (n = 200 cells total), as shown in Supplementary Fig. 1f. The mean and standard deviation of n = 10 independent simulations are shown. d, As in c, but with Augur applied to each quintile of gene expression separately. The AUC remains strongly correlated with the ground-truth perturbation intensity, regardless of mean expression levels (r ≥ 0.95). e-f, As in a-b, but showing the number of differentially expressed genes detected by a Wilcoxon rank-sum test at 5% FDR, either across the entire transcriptome, e, or within each expression quintile, f. No differentially expressed genes are detected at 5% FDR outside of the top expression quintile. g-h, As in c-d, but showing the number of differentially expressed genes detected by a Wilcoxon rank-sum test at 5% FDR, either across the entire transcriptome, g, or within each expression quintile, h. No differentially expressed genes are detected at 5% FDR outside of the top expression quintile. 
i, Cell type prioritization in simulated scRNA-seq data from a tissue with 5,000 cells, distributed in eight cell types, with increasingly unequal numbers of cells per type, as quantified by the Gini coefficient and shown in Fig. 1f. The correlation to simulation ground truth (proportion of DE genes) is shown for Augur and a representative test for single-cell DE (Wilcoxon rank-sum test). The mean and standard deviation of n = 10 independent simulations are shown. j, As in i, but with both Augur and the Wilcoxon rank-sum test applied to each quintile of gene expression separately. k, Pearson correlation between Augur cell type prioritizations (AUC) and simulation ground truth (proportion of DE genes) in simulated scRNA-seq data from tissue with eight cell types, subjected to perturbations of varying intensity, as quantified by the location parameter of the differential expression factor log-normal distribution. The mean of n = 10 independent simulations is shown for each perturbation intensity. l, As in k, but with Augur applied to each quintile of gene expression separately. Augur incorporates information from lowly expressed genes even in subtle perturbations. m, Number of differentially expressed genes detected by a Wilcoxon rank-sum test at 5% FDR for each cell type in the Kang et al. dataset⁴, within each expression quintile, confirming the simulations in a-l reflect trends in real data.
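
Two quantities recur throughout this legend: the division of genes into five quintiles by mean expression, and the Gini coefficient used to quantify unequal numbers of cells per type. The following is a minimal Python sketch of both, offered only as an illustration. It is not the authors' code, and the function names `expression_quintiles` and `gini`, the equal-sized rank-based binning, and the handling of ties are all assumptions made here for clarity.

```python
from statistics import mean


def expression_quintiles(expr):
    """Assign each gene to a quintile (1 = lowest, 5 = highest mean expression).

    `expr` maps a gene name to its expression values across cells.
    Assumption: genes are ranked by mean expression and split into five
    bins of (near-)equal size; ties are broken by sort order.
    """
    ranked = sorted(expr, key=lambda gene: mean(expr[gene]))
    n = len(ranked)
    return {gene: (5 * i) // n + 1 for i, gene in enumerate(ranked)}


def gini(counts):
    """Gini coefficient of non-negative counts (0 = perfectly equal)."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical toy input: ten genes, two cells each.
toy_expr = {f"g{i}": [i, i + 1] for i in range(10)}
quintile_of = expression_quintiles(toy_expr)

# 5,000 cells spread evenly over eight cell types gives a Gini coefficient of 0.
balanced_gini = gini([625] * 8)
```

Under these assumptions, a per-quintile analysis such as those in panels b, d, f, h, j and l would amount to restricting the expression matrix to the genes in one quintile before running Augur or the Wilcoxon rank-sum test.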