Table 1 Overview of existing methods and limitations
From: diffcyt: Differential discovery in high-dimensional cytometry via high-resolution clustering
Method | Short description | Limitations | Ref. |
---|---|---|---|
Citrus | Uses hierarchical clustering and regularized regression or classification models to select predictive features, such as cluster abundances or median expression of functional markers, that are associated with an outcome of interest | • Detected features cannot be ranked by importance • Lasso-regularized models cannot easily detect multiple correlated features • Rare cell populations cannot easily be detected, due to minimum cluster size requirement and computational limitations • Response variable is the clinical outcome variable, which makes it difficult to account for complex experimental designs (including batch effects, paired designs, and continuous covariates) | |
CellCnn | Applies convolutional neural networks in a representation learning framework to detect rare cell populations associated with an outcome of interest; designed specifically for detecting rare cell populations | • Ranking of detected cells cannot be interpreted in terms of statistical significance • Interpretation of detected populations (referred to as filters) can be difficult, since they may be composed of multiple distinct cell populations • Response variable is the clinical outcome variable, which makes it difficult to account for complex experimental designs (including batch effects, paired designs, and continuous covariates) • All protein markers are treated identically; there is no conceptual split between cell type and cell state (or functional) markers | |
cydar | Assigns cells to overlapping hyperspheres in the high-dimensional space; tests for differential abundance between hyperspheres using moderated tests from edgeR [15,16], while controlling the spatial false discovery rate among overlapping hyperspheres | • Rare cell populations cannot easily be detected, due to their relatively small volume in the high-dimensional space • All protein markers are treated identically; there is no conceptual split between cell type and cell state (or functional) markers | |
classic regression-based approach | Automated clustering using FlowSOM [14], followed by manual merging and annotation to define cell populations; differential testing of features such as population abundances or median expression of functional markers using generalized linear mixed models, linear mixed models, or linear models | • Manual merging and annotation step requires expert biological knowledge, and can be time-consuming and subjective • When testing large numbers of clusters, e.g. to detect rare cell populations: loss of statistical power due to multiple testing penalty; no sharing of information across clusters | |
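
The multiple testing penalty noted in the last row can be illustrated with a small numerical sketch. The table does not say which correction is applied, so the Benjamini-Hochberg adjustment used here is an assumption, and the function names `bh_adjust` and `adjusted_signal`, the signal p-value of 0.002, and the evenly spaced null p-values are all hypothetical choices made for illustration only.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: BH is used only as a familiar example of a multiple
    testing correction; the table does not name a specific procedure.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def adjusted_signal(n_clusters, signal_p=0.002):
    """Adjusted p-value of one signal cluster among evenly spread nulls.

    Hypothetical setup: one cluster has raw p-value `signal_p`; the other
    n_clusters - 1 clusters have p-values 1/n, 2/n, ..., (n-1)/n.
    """
    nulls = [i / n_clusters for i in range(1, n_clusters)]
    return bh_adjust([signal_p] + nulls)[0]


few = adjusted_signal(10)    # signal tested alongside 9 other clusters
many = adjusted_signal(100)  # same signal tested alongside 99 other clusters
print(few < 0.05, many < 0.05)
```

Under these assumptions, the same raw p-value passes a 0.05 threshold when 10 clusters are tested but not when 100 are tested, which is the loss of power the table describes when each cluster is tested independently.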