Fig. 4 | Communications Biology

Fig. 4

From: Automated subset identification and characterization pipeline for multidimensional flow and mass cytometry data clustering and visualization

Visualization of cluster analysis outcomes. a EPP clustering outcomes displayed using multidimensional scaling (MDS). Each circle represents one subset identified by the EPP approach. The size of each circle scales with the relative frequency of the subset in the sample. Subsets matched between Sample A and Sample B (identified using QFMatch) are highlighted with the same color. The X and Y axes are MDS coordinates. We ran MDS on a mixture of Sample A and Sample B to display both on the same X/Y scale. The relative locations of the identified subsets in MDS space correspond well with the Euclidean distances between the subsets’ (groups’) medians presented in Supplementary Table 4. b QF (quadratic form)-tree built for Sample B. To build this hierarchical tree from individual clusters, we progressively merged clusters using a modified multidimensional quadratic form score (ref. 6) as the measure of dissimilarity: quadratic form + c*DM, where DM is the Euclidean distance between the clusters’ medians and c is a scaling factor that keeps the smallest quadratic form score and the largest DM at the same order of magnitude. The branching diagram places the clusters with the smallest pairwise dissimilarity scores in its lowest branches; each such pair is merged at the next branching level of the QF-tree and treated from then on as one cluster. Dissimilarity scores are then recalculated for all clusters on that branching level, and the merging is repeated until all clusters identified within the sample are merged together. We call this tree-structured data display a QF-tree.
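The merging procedure described for panel b can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: `placeholder_qf` is a hypothetical stand-in for the quadratic form score (the published formula is not given in this caption), the scaling factor c is taken as the smallest quadratic form score divided by the largest DM, and only the single closest pair is merged per step rather than all closest pairs on a branching level. The function names are illustrative.

```python
from itertools import combinations

import numpy as np


def median_distance(a, b):
    """DM: Euclidean distance between the per-marker medians of two clusters."""
    return float(np.linalg.norm(np.median(a, axis=0) - np.median(b, axis=0)))


def placeholder_qf(a, b):
    """Hypothetical stand-in for the quadratic form score (NOT the published formula)."""
    return float(np.sum((a.mean(axis=0) - b.mean(axis=0)) ** 2))


def build_merge_order(clusters, qf=placeholder_qf):
    """Greedily merge clusters by the dissimilarity qf + c * DM.

    clusters: dict mapping a cluster name to an (n_events, n_markers) array.
    Returns a list of (name_1, name_2, dissimilarity) tuples in merge order.
    """
    active = dict(clusters)
    merges = []
    while len(active) > 1:
        pairs = list(combinations(active, 2))
        qf_scores = np.array([qf(active[i], active[j]) for i, j in pairs])
        dms = np.array([median_distance(active[i], active[j]) for i, j in pairs])
        # Assumed scaling: bring the largest DM to the magnitude of the smallest QF score.
        max_dm = dms.max()
        c = qf_scores.min() / max_dm if max_dm > 0 else 0.0
        scores = qf_scores + c * dms
        k = int(np.argmin(scores))
        i, j = pairs[k]
        # Pool the events of the two clusters; scores are recomputed on the next pass.
        active[f"({i}+{j})"] = np.vstack([active.pop(i), active.pop(j)])
        merges.append((i, j, float(scores[k])))
    return merges
```

Calling `build_merge_order` on a dictionary of per-cluster event matrices returns the merge sequence, from which a branching diagram could be drawn. Swapping in a real quadratic form score only requires passing a different `qf` callable.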
