Fig. 2: Characterization of healthy controls as generalized anchors.
From: CytofIn enables integrated analysis of public mass cytometry datasets using generalized anchors

A Seven cohorts of B-lymphoblastic leukemia samples were collected during 2014–2019. The samples were acquired on two different CyTOF instruments using three antibody panels with slight variations in naming conventions, the number of proteins measured, and protein-metal labels. B Effects of ex vivo perturbation with cytokines or small molecules on the mean expression of 36 consensus markers in the healthy control samples. The significance of the overlap in the distribution was quantified by P-values using a two-sided Wilcoxon test across 36 protein markers comparing basal versus perturbation conditions. The perturbation conditions include B-cell receptor cross-linking (BCR), Dasatinib (DAS), thymic stromal lymphopoietin (TSLP), BEZ-235 (BEZ), sodium orthovanadate (PVO4), tofacitinib (TOF), and IL-7 (IL7). C Multidimensional analysis of the 50 healthy controls from 7 cohorts. Shapes represent conditions and colors represent cohorts. D Unsupervised clustering of both healthy and patient samples using an expression similarity network where nodes represent samples and edges represent the cosine similarity between sample mean expressions. At a stringent similarity threshold of 0.9, the healthy samples form a highly connected subcluster distinctly separated from that of the patient samples. E, F Comparison of the node degree (number of connected edges) distributions of the healthy (E, red) and patient (F, blue) subclusters. The healthy subcluster exhibited higher network connectivity (average node degree = 24.4) than the patient subcluster (average node degree = 10.9), as revealed by denser between-node edges, indicating smaller within-sample variations. Source data are provided as a Source Data file.
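The network construction described for panels D–F can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy expression matrix, and the choice to draw an edge when cosine similarity is greater than or equal to the threshold are all assumptions made for illustration. Rows stand for samples and columns for markers.

```python
import numpy as np

def cosine_similarity_matrix(mean_expr):
    """Pairwise cosine similarity between rows (samples) of mean_expr."""
    x = np.asarray(mean_expr, dtype=float)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # guard against division by zero for all-zero rows
    unit = x / norms
    return unit @ unit.T

def similarity_network(mean_expr, threshold=0.9):
    """Boolean adjacency matrix: an edge joins two samples whose cosine
    similarity is >= threshold (the >= rule is an assumption)."""
    adj = cosine_similarity_matrix(mean_expr) >= threshold
    np.fill_diagonal(adj, False)  # no self-loops
    return adj

def node_degrees(adj):
    """Number of edges attached to each node (sample)."""
    return adj.sum(axis=1)

# Hypothetical toy data: three similar samples and one dissimilar sample,
# each described by the mean expression of three markers.
toy_mean_expr = np.array([
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.1],
    [1.1, 1.0, 1.0],
    [1.0, 0.0, 0.0],
])

adjacency = similarity_network(toy_mean_expr, threshold=0.9)
degrees = node_degrees(adjacency)
print(degrees, degrees.mean())
```

In this toy case the three similar samples are mutually connected and the dissimilar sample is isolated, which mirrors in miniature how a tighter group yields a higher average node degree.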