Fig. 3: Benchmarking the sensitivity of different approaches to cell-type identity.

a Schematic overview of the dataset perturbations. Starting from two cell types in the 68k PBMC dataset (abundant CD19+ B cells and rare CD14+ monocytes), we first generated three datasets by subsampling 2, 5 or 10 monocytes together with 500 B cells. For each dataset, differentially expressed (DE) genes between the two cell types were selected using a stringent criterion. We then replaced a fixed number of non-DE genes with the pre-identified DE genes, varying the number of replaced genes from 1 to the total number of DE genes. b Comparison of the sensitivity of different methods to cell-type identity as the number of replaced DE genes varies.
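
The perturbation scheme in panel a can be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the function names, the dictionary-based count layout, and the reading that the perturbed dataset is a non-DE background in which k gene slots are overwritten by DE-gene profiles are all assumptions made for this example. The stringent DE-selection criterion is not specified in the caption, so the DE gene list is taken as given.

```python
import random


def subsample_two_types(labels, rare, abundant, n_rare, n_abundant, seed=0):
    """Return sorted cell indices: n_rare rare cells plus n_abundant abundant cells.

    `labels` is a list of cell-type labels, one per cell (hypothetical layout).
    """
    rng = random.Random(seed)
    rare_idx = [i for i, lab in enumerate(labels) if lab == rare]
    abundant_idx = [i for i, lab in enumerate(labels) if lab == abundant]
    picked = rng.sample(rare_idx, n_rare) + rng.sample(abundant_idx, n_abundant)
    return sorted(picked)


def replace_non_de_with_de(counts, de_genes, non_de_genes, k, seed=0):
    """Build a perturbed dataset containing exactly k DE genes.

    `counts` maps gene name -> list of per-cell counts (hypothetical layout).
    Assumption: the background is the set of non-DE genes, and k randomly
    chosen non-DE genes are swapped out for k randomly chosen DE genes.
    """
    rng = random.Random(seed)
    slots = rng.sample(non_de_genes, k)      # non-DE genes to be replaced
    chosen_de = rng.sample(de_genes, k)      # DE genes that take their place
    swap = dict(zip(slots, chosen_de))
    # Keep gene order; each replaced slot carries the DE gene's name and profile.
    return {swap.get(g, g): list(counts[swap.get(g, g)]) for g in non_de_genes}


if __name__ == "__main__":
    # Toy labels standing in for annotated cells; one dataset per monocyte count.
    labels = ["B"] * 600 + ["Mono"] * 50
    for n_mono in (2, 5, 10):
        cells = subsample_two_types(labels, "Mono", "B", n_mono, 500)
        print(n_mono, len(cells))
```

Under these assumptions, sweeping `k` from 1 to `len(de_genes)` would yield the series of datasets whose detection sensitivity is compared in panel b.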