Communications Biology

Table 1 Overview of existing methods and limitations

From: diffcyt: Differential discovery in high-dimensional cytometry via high-resolution clustering

Method	Short description	Limitations	Ref.
Citrus	Uses hierarchical clustering and regularized regression or classification models to select predictive features, such as cluster abundances or median expression of functional markers, that are associated with an outcome of interest	• Detected features cannot be ranked by importance • Lasso-regularized models cannot easily detect multiple correlated features • Rare cell populations cannot easily be detected, due to minimum cluster size requirement and computational limitations • Response variable is the clinical outcome variable, which makes it difficult to account for complex experimental designs (including batch effects, paired designs, and continuous covariates)	⁹
CellCnn	Applies convolutional neural networks in a representation learning framework to detect rare cell populations associated with an outcome of interest; designed specifically for detecting rare cell populations	• Ranking of detected cells cannot be interpreted in terms of statistical significance • Interpretation of detected populations (referred to as filters) can be difficult, since they may be composed of multiple distinct cell populations • Response variable is the clinical outcome variable, which makes it difficult to account for complex experimental designs (including batch effects, paired designs, and continuous covariates) • All protein markers are treated identically; there is no conceptual split between cell type and cell state (or functional) markers	¹⁰
cydar	Assigns cells to overlapping hyperspheres in the high-dimensional space; tests for differential abundance between hyperspheres using moderated tests from edgeR¹⁵,¹⁶, while controlling the spatial false discovery rate among overlapping hyperspheres	• Rare cell populations cannot easily be detected, due to their relatively small volume in the high-dimensional space • All protein markers are treated identically; there is no conceptual split between cell type and cell state (or functional) markers	¹¹
Classic regression-based approach	Automated clustering using FlowSOM¹⁴, followed by manual merging and annotation to define cell populations; differential testing of features such as population abundances or median expression of functional markers using generalized linear mixed models, linear mixed models, or linear models	• Manual merging and annotation step requires expert biological knowledge, and can be time-consuming and subjective • When testing large numbers of clusters, e.g. to detect rare cell populations: loss of statistical power due to multiple testing penalty; no sharing of information across clusters	¹²

Overview of recently developed methods for performing differential analyses in high-dimensional cytometry data. For each method, a short description of the methodology and a summary of limitations is provided.
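
The multiple testing penalty listed for the classic regression-based approach can be illustrated with a minimal sketch. The table does not say which correction procedure is used, so the Benjamini-Hochberg adjustment below is an assumption, and the function names, the raw p-value of 0.004, and the evenly spaced null p-values are hypothetical values chosen only for illustration. The sketch shows how the adjusted p-value of one cluster with a fixed raw p-value grows as more clusters are tested alongside it.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def adjusted_p_for_signal(raw_p, n_clusters):
    """Adjusted p-value of one 'signal' cluster tested among n_clusters.

    Hypothetical setup: all other clusters are nulls whose p-values are
    spread evenly over (0, 1].
    """
    n_nulls = n_clusters - 1
    nulls = [(k + 1) / n_nulls for k in range(n_nulls)]
    return benjamini_hochberg([raw_p] + nulls)[0]


if __name__ == "__main__":
    # Same raw evidence, increasingly fine clustering.
    for n in (10, 100, 1000):
        print(n, round(adjusted_p_for_signal(0.004, n), 4))
```

Under these assumptions, the same raw evidence becomes less significant after adjustment as the clustering resolution increases, which is the loss of power the table refers to.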
