Supplementary Figure 9: Analysis of feature (gene) selection in defined mixtures | Nature Methods

From: Robust enumeration of cell subsets from tissue expression profiles

(a) Results from applying CIBERSORT to a spike series in which the LM22 reference profile for CD8 T cells was spiked into the corresponding reference profile for resting mast cells (MCs–) in even increments (n = 21). Note that the two cell types have highly distinct expression vectors in LM22 (see Supplementary Fig. 2c).

(b) Comparison between genes selected by support vector regression (SVR) to deconvolve the 100% resting mast cell sample but not the 100% CD8 T cell sample, and vice versa. For each unique subset of genes, expression levels in the LM22 signature matrix are further compared between resting mast cells and CD8 T cells. A paired two-sided Wilcoxon signed-rank test was used for within-group comparisons and an unpaired two-sided Wilcoxon rank-sum test was used for between-group comparisons. Data are presented as medians ± interquartile range. Although genes uniquely selected for the 100% CD8 T cell sample are significantly more highly expressed in CD8 T cells than in resting mast cells, the magnitude of the difference is small. Moreover, the opposite pattern is not observed for genes uniquely selected for the 100% resting mast cell sample, suggesting that SVR gene selection is not strongly correlated with the presence or absence of a particular cell subset in the mixture.

(c) Comparison between gene expression levels in LM22 (n = 547 genes) and the frequency with which each gene was selected, if at all, by SVR across the 19 mixtures containing >0% CD8 T cells and >0% resting mast cells (see panel a). Top: comparison with expression levels of CD8 T cells (left) or resting mast cells (right). Bottom: comparison with mean expression levels of CD8 T cells and resting mast cells (left) or of all cell subsets in LM22 (right). Regardless of spike-in composition, the highest correlation between expression and gene selection frequency was observed when considering all cell types in LM22. Statistical concordance in c was determined by linear regression (red lines).
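As an illustration of the spike series in panel a, the sketch below mixes two reference profiles in even increments. It is a minimal example under stated assumptions, not the authors' procedure: the mixing is assumed to be a linear combination in non-log space, the 21 mixtures are assumed to span 0% to 100% in 5% steps, and the two profiles are random placeholders standing in for the LM22 columns. The function name `spike_series` is hypothetical.

```python
import random


def spike_series(profile_a, profile_b, n_mixtures=21):
    """Linearly mix two reference profiles in even increments.

    Returns a list of (fraction_a, mixture) pairs, where fraction_a runs
    from 0 to 1 inclusive and each mixture value is
    fraction_a * a + (1 - fraction_a) * b for the matching gene.
    Assumption: linear mixing in non-log expression space.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genes")
    if n_mixtures < 2:
        raise ValueError("need at least two mixtures")
    steps = n_mixtures - 1
    series = []
    for k in range(n_mixtures):
        frac = k / steps
        mixture = [frac * a + (1.0 - frac) * b
                   for a, b in zip(profile_a, profile_b)]
        series.append((frac, mixture))
    return series


# Placeholder data: random values standing in for the two LM22 columns
# (547 genes). These are NOT real expression values.
rng = random.Random(0)
n_genes = 547
cd8_profile = [rng.uniform(0.0, 1000.0) for _ in range(n_genes)]
mast_profile = [rng.uniform(0.0, 1000.0) for _ in range(n_genes)]

series = spike_series(cd8_profile, mast_profile)

# Mixtures in which both cell types are present (the subset used in panel c).
both_present = [(f, m) for f, m in series if 0.0 < f < 1.0]

print(len(series), len(both_present))  # prints "21 19"
```

Under these assumptions, dropping the two pure samples (0% and 100% CD8 T cells) from the 21 mixtures leaves the 19 mixtures referred to in panel c.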
