Direct comparison of mass cytometry and single-cell RNA sequencing of human peripheral blood mononuclear cells

Su, Emily Y.; Fread, Kristen; Goggin, Sarah; Zunder, Eli R.; Cahan, Patrick

doi:10.1038/s41597-024-03399-6

Data Descriptor
Open access
Published: 30 May 2024

Scientific Data volume 11, Article number: 559 (2024)

Abstract

Single-cell methods offer a high-resolution approach for characterizing cell populations. Many studies rely on single-cell transcriptomics to draw conclusions regarding cell state and behavior, with the underlying assumption that transcriptomic readouts largely parallel their protein counterparts and subsequent activity. However, the relationship between transcriptomic and proteomic measurements is imprecise, and thus datasets that probe the extent of their concordance will be useful to refine such conclusions. Additionally, novel single-cell analysis tools often lack appropriate gold standard datasets for the purposes of assessment. Integrative (combining the two data modalities) and predictive (using one modality to improve results from the other) approaches in particular, would benefit from transcriptomic and proteomic data from the same sample of cells. For these reasons, we performed single-cell RNA sequencing, mass cytometry, and flow cytometry on a split-sample of human peripheral blood mononuclear cells. We directly compare the proportions of specific cell types resolved by each technique, and further describe the extent to which protein and mRNA measurements correlate within distinct cell types.

Background & Summary

Single-cell techniques have revolutionized the resolution at which biology is studied in the lab. Such approaches overcome issues of averaging and subsequent masking of cell-cell heterogeneity experienced by bulk techniques, and thus have become staples in omics research. In particular, single-cell RNA sequencing (scRNA-seq)^1,2,3, a high-throughput method to profile transcriptomes, has assisted in cell-type identification⁴, trajectory inference (TI) modeling^5,6,7,8, and gene regulatory network reconstruction^9,10,11, to name a few applications. Mass cytometry^12,13,14, a high-throughput cytometry method with the ability to simultaneously measure over 40 parameters, has similar applications in cell-type identification¹⁵, TI modeling¹⁶, as well as differential signaling and cytokine expression studies¹⁴.

Transcriptomic data from scRNA-seq is commonly used as a proxy for studying the proteome, as its genomic-scale readout offers a breadth of detail that proteomic approaches are unable to quantify within a single cell. Though broad expression patterns generally associate well with cellular state, the correlation between individual protein expression and corresponding mRNA may be tenuous and even differ amongst proteins or between different cell types^17,18. These differences can arise from biological sources, including post-transcriptional regulation, or even technical biases, including dropout in scRNA-seq. Thus, improving our understanding of the relationship between measured mRNA and protein content may aid in refining the conclusions drawn from scRNA-seq.

Additionally, integration of scRNA-seq and cytometry is highly enticing as these data modalities are seemingly complementary, with scRNA-seq measuring a large number of features for a relatively low number of observations and vice versa for mass cytometry. Moreover, predictive methods that use one modality to refine the results of the other are equally desirable. Indeed, several such computational approaches have been reported, including COMET¹⁹, which utilizes scRNA-seq data to infer protein marker panels capable of distinguishing specific cell populations. As these approaches are developed, datasets of both readouts from the same population of cells will be useful as a gold standard for validation.

Here, we performed scRNA-seq, mass cytometry, and flow cytometry on a single, split sample of human peripheral blood mononuclear cells (PBMCs). We further compare the ability of scRNA-seq and mass cytometry in resolving distinct cell types and cell states, describe the extent to which protein and mRNA correlate, and quantify and compare cell-type composition based on data from each technique. Because of the split-sample nature of the work, this is a valuable dataset for the purposes of assessing integrative analyses and resolving measured mRNA-protein relationships. In the development of analysis tools capable of isolating and identifying rare populations, PBMCs are an excellent data source given the existence of well-documented subpopulations and small fraction of certain cell types such as dendritic cells. Such data may be further down-sampled or sub-sampled so as to mimic a sample with even rarer populations. Down-sampling may also enable the exploration of properties of each data modality with respect to accuracy and precision in estimating population structure in the presence of technical factors. Finally, while not presented here, we expect that this dataset can be aggregated with other existing PBMC data in meta-analyses that uncover variation in cell type and states in human PBMCs.
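The down-sampling idea mentioned above can be sketched in a few lines. This is a minimal illustration, not part of the published analysis: the function name `downsample_type`, the toy labels, and the 20% retention fraction are all assumptions chosen for the example.

```python
import random
from collections import Counter

def downsample_type(labels, target, keep_fraction, seed=0):
    """Return indices of cells to keep after thinning one cell type.

    labels: list of cell-type labels, one per cell.
    target: label of the population to make rarer.
    keep_fraction: fraction (0..1) of `target` cells to retain.
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    target_idx = [i for i, lab in enumerate(labels) if lab == target]
    n_keep = round(len(target_idx) * keep_fraction)
    kept_target = set(rng.sample(target_idx, n_keep))
    # Keep every non-target cell, plus the sampled target cells, in original order.
    return [i for i, lab in enumerate(labels)
            if lab != target or i in kept_target]

# Toy labels (made up; not the published annotations).
labels = ["CD4 T"] * 60 + ["B"] * 30 + ["DC"] * 10
kept = downsample_type(labels, target="DC", keep_fraction=0.2)
counts = Counter(labels[i] for i in kept)
print(counts)
```

The returned indices can be used to subset either modality, so the same thinning can be applied to the scRNA-seq and mass cytometry annotations independently.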

Methods

Human PBMCs were obtained from a donor, who had provided written informed consent (IRB 15328), at University of Virginia School of Medicine, Heart Center.

Split-sample preparation for scRNA-seq, CyToF, and flow cytometry

PBMCs were thawed in RPMI 1640 with 5% FBS, and incubated at 37 °C for 1 hr for recovery to ground state. 3 × 10⁵ cells were set aside for scRNA-sequencing. The remaining cells (~7.5 × 10⁶) were divided evenly for mass cytometry and flow cytometry. Cells allocated for scRNA-seq were strained and washed with PBS containing 0.4% BSA. Cell concentration was adjusted to ~500 cells/μL before proceeding with the 10x sequencing protocol.

Next, cells allocated for mass cytometry were fixed. Briefly, cells were incubated with cisplatin (10 µM in PBS) and then quenched with cell staining medium (CSM; 0.5% BSA, 0.02% NaN3 in PBS). The cells were strained with a 100 micron nylon strainer before being fixed at room temperature for 10 minutes in 1.6% paraformaldehyde and subsequently stored at −80 °C in CSM. The sample was later thawed and stained with metal-conjugated antibodies. Samples were first blocked with 10% donkey serum, stained with a cocktail of metal-conjugated surface antibodies (Table 1), and then permeabilized with methanol for 10 minutes at 4 °C before being stained for intracellular markers. After staining, samples were incubated with Iridium intercalator overnight at 4 °C for DNA staining before being analyzed on a CyTOF mass cytometer (Standard Biotools). Normalization beads containing Lanthanum-139, Praseodymium-141, Terbium-159, Thulium-169, and Lutetium-175 were added to stained samples to perform normalization as previously described²⁰. Stained samples and normalization bead mixtures were then filtered through a 40 micron filter and analyzed across several runs at a rate of ~250 cells per second on the mass cytometer. After measurement, samples were normalized²⁰ and de-barcoded²¹ to individual FCS files. FCS files were gated for bead removal, debris clean up, and DNA intercalator.

Table 1 CyToF Panel.

Finally, cells allocated for flow cytometry were blocked with FcBlock (BD, Catalog No. 564219) before they were further divided evenly into six tubes. Primary antibody incubation of each tube was as follows: Tubes 1 and 2, no primary antibody; Tubes 3–6, anti-CD3 (Thermo Fisher, Catalog No. MHCD0300), anti-CD19 (Thermo Fisher, Catalog No. 14-0199-80), anti-CD56 (Thermo Fisher, Catalog No. 14-0567-80), and anti-CD14 (Thermo Fisher, Catalog No. 14-0149-80), respectively. Tubes were placed on ice for 30 minutes and washed twice with FACS buffer. Cells were then incubated with secondary antibody (Thermo Fisher, Catalog No. A-11001) for an additional 30 minutes and washed twice before resuspension in FACS buffer. Flow cytometry was carried out on a BD LSR II flow cytometer and analyzed using FlowJo.

scRNA-seq data processing

Quality control filtering, normalization, clustering, and differential gene expression analysis were performed using Scanpy²² (version 1.8.2). Genes were excluded if they were detected in fewer than 3 cells; cells were excluded if their mitochondrial gene content exceeded 10% of their total reads or if they had fewer than 200 unique genes, in order to remove data from any prematurely lysed cells or from ambient RNA. Thresholds were chosen by manually identifying steep changes in the corresponding distributions, in line with currently accepted practices^22,23. Of note, varying these thresholds did not significantly change the results of downstream analysis (Supplementary Fig. 1, Supplementary Table 1). Filtering resulted in 2653 cells and 15998 genes. The data were then normalized and log-transformed, and highly variable genes were identified (3004 genes). The data were then scaled and PCA was performed. Cells were clustered using the Leiden algorithm and visualized on a UMAP embedding (Fig. 1a). Further cell type classification was performed via SingleCellNet²⁴, using sampled data from Zheng et al. as reference data (Fig. 1b). Finally, we annotated cell identity based on these classification results and expression of marker genes (Fig. 1c).
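The filtering rules stated above can be written out explicitly. The following is a plain-Python sketch of the thresholds only, not the authors' Scanpy code: the function name `qc_filter`, the toy count matrix, and the use of the `MT-` prefix to flag mitochondrial genes are assumptions, and both masks are computed on the unfiltered matrix for simplicity. In Scanpy, comparable steps are usually done with `sc.pp.filter_cells` and `sc.pp.filter_genes`.

```python
def qc_filter(counts, gene_names, min_cells=3, min_genes=200, max_mito_frac=0.10):
    """Compute keep/drop masks for cells and genes.

    counts: list of rows, one row of integer counts per cell.
    gene_names: gene symbols, one per column.
    Returns (cell_ok, gene_ok) as lists of booleans.
    """
    n_genes = len(gene_names)
    # A gene is kept if it is detected (count > 0) in at least `min_cells` cells.
    gene_ok = [sum(1 for row in counts if row[j] > 0) >= min_cells
               for j in range(n_genes)]
    # Assumption: mitochondrial genes are identified by the "MT-" prefix.
    is_mito = [g.startswith("MT-") for g in gene_names]
    cell_ok = []
    for row in counts:
        total = sum(row)
        detected = sum(1 for c in row if c > 0)
        mito = sum(c for c, m in zip(row, is_mito) if m)
        mito_frac = mito / total if total else 1.0
        # A cell is kept if it has enough detected genes and mitochondrial
        # content does not exceed the threshold.
        cell_ok.append(detected >= min_genes and mito_frac <= max_mito_frac)
    return cell_ok, gene_ok

# Toy matrix (4 cells x 4 genes); min_genes is lowered so the example is non-trivial.
gene_names = ["CD3D", "CD19", "MT-CO1", "PPBP"]
counts = [
    [5, 0, 1, 0],  # mitochondrial fraction 1/6, above 10%
    [9, 1, 0, 0],  # passes both cell filters
    [4, 5, 1, 0],  # mitochondrial fraction exactly 10%, kept
    [0, 0, 0, 7],  # only one gene detected
]
cell_ok, gene_ok = qc_filter(counts, gene_names, min_cells=3, min_genes=2)
print(cell_ok, gene_ok)
```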

Clusters ‘0’ and ‘1’ expressed CD3D and CD4, and were subsequently annotated as CD4 T cells. Clusters ‘3’ and ‘4’, which expressed CD3D and CD8, classified strongly as CD8 cytotoxic T cells, and were thus annotated as CD8 T cells. Clusters ‘5.0’ and ‘5.1’ expressed CD19, classified strongly as B cells, and were annotated as B cells. Cluster ‘6’, which expressed NCAM1 and KLRD1, classified strongly as natural killer (NK) cells, and was annotated as such. Clusters ‘2.0’ and ‘2.1’ expressed CD14 and CD68, did not express FCGR3A, and classified strongly as monocytes. Both clusters were annotated as CD16- monocytes. Cluster ‘7’ showed markedly lower expression of CD14, high expression of FCGR3A and MS4A7, and classified as monocytes. This cluster was annotated as CD16+ monocytes. Cluster ‘2.2’ did not express CD14 or FCGR3A, but did express CD68, and was annotated as dendritic cells (DC). Finally, the smallest cluster, cluster ‘8’, expressed PPBP, is likely a small group of platelets, and was annotated as megakaryocyte-lineage (Mk). Of note, these annotations can be further divided into finer sub-populations should users choose to refine the clustering, use a different reference dataset for classification, or widen the field of marker genes to analyze.

CyToF data processing

Gating to remove debris and subsequent arcsinh normalization were done on Cytobank. Leiden clustering and UMAP visualization were performed using Scanpy (Fig. 2a). No other normalization or dimension reduction was performed, and cell annotation was based on marker expression (Fig. 2b,c).
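For readers unfamiliar with the transform, mass cytometry intensities are commonly rescaled with the inverse hyperbolic sine, asinh(x / c), which is roughly linear near zero and logarithmic for large values. The sketch below assumes a cofactor c = 5, a common choice for mass cytometry; the cofactor actually used on Cytobank is not stated in the text, and `arcsinh_transform` is a hypothetical helper name.

```python
import math

def arcsinh_transform(values, cofactor=5.0):
    """Apply asinh(x / cofactor) to each raw intensity.

    cofactor=5.0 is an assumed default, not a value taken from the paper.
    """
    return [math.asinh(v / cofactor) for v in values]

# Made-up raw intensities spanning several orders of magnitude.
raw = [0.0, 5.0, 50.0, 500.0]
transformed = arcsinh_transform(raw)
print([round(t, 3) for t in transformed])
```

Because asinh is defined at zero and for negative inputs, no pseudocount is needed, which is one reason it is preferred over a plain log transform for cytometry data.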

Briefly, clusters ‘0’, ‘1’, ‘9.0’, and ‘9.1’ were CD3+ CD4+, and were annotated as CD4 T cells. Clusters ‘2’, ‘5’, ‘6’, and ‘8.0’ were CD3+ and CD8a+, and were annotated as CD8 T cells. Clusters ‘4’ and ‘11’ were CD19+ CD20+ CD79b+ HLADR+, and were annotated as B cells. Cluster ‘14’ exhibited lower levels of CD19 and CD20, but was also annotated as B cells. Clusters ‘3.0’, ‘3.1’, ‘3.2’, and ‘3.4’ were CD56+ and were labeled as NK cells. Cluster ‘3.3’ was both CD56+ and CD3+, and was labeled as NKT cells. Cluster ‘7.0’ was CD14+ and CD16-, and was annotated as CD16- monocytes. Clusters ‘7.1’ and ‘12’ were HLADR+ and CD68+, and were annotated as DCs. Finally, we isolated two populations in the CyToF data that we were unable to resolve or detect in the scRNA-seq population. First, clusters ‘10’ and ‘13’ were CD3+ and both CD4- and CD8a-. We labeled these as double negative T cells (DN T cell). In contrast, cluster ‘15’ was CD3+ CD4+ CD8a+, and was labeled as double positive T cells (DP T cell). Of note, based on our Ab panel, users of this dataset can further divide these broad annotations into finer subpopulations.
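The marker combinations listed above can be expressed as a small rule table. This is only a sketch of the logic: the annotation in the paper was done per cluster from marker expression plots, whereas the function below takes a set of markers already called positive, and the order in which rules are checked (for example, testing B-cell markers before the HLADR+ CD68+ rule) is an assumption made for this example.

```python
def annotate(positive):
    """Map a set of positive markers for one cluster to a broad label.

    positive: set of marker names called positive (thresholding is assumed
    to have been done elsewhere). Rule precedence is an illustrative choice.
    """
    has = positive.__contains__
    if has("CD3"):
        if has("CD56"):
            return "NKT cell"
        if has("CD4") and has("CD8a"):
            return "DP T cell"
        if has("CD4"):
            return "CD4 T cell"
        if has("CD8a"):
            return "CD8 T cell"
        return "DN T cell"
    if has("CD56"):
        return "NK cell"
    if has("CD19") and has("CD20"):
        return "B cell"
    if has("CD14") and not has("CD16"):
        return "CD16- monocyte"
    if has("HLADR") and has("CD68"):
        return "DC"
    return "unassigned"

# Hypothetical marker calls for a few clusters.
examples = {
    "a": {"CD3", "CD4"},
    "b": {"CD3"},
    "c": {"CD3", "CD4", "CD8a"},
    "d": {"CD19", "CD20", "CD79b", "HLADR"},
    "e": {"HLADR", "CD68"},
}
labels = {name: annotate(markers) for name, markers in examples.items()}
print(labels)
```

A rule table like this cannot capture graded cases such as the B-cell cluster with lower CD19 and CD20, which is why manual inspection of marker expression remains part of the annotation.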

Quantifying differences in scRNA-seq and mass cytometry

Based on our cluster annotations, we quantified the percentage of each population in the scRNA-seq, mass cytometry, and flow cytometry data (Fig. 3a,b). Importantly, despite the split-sample nature of the three datasets, there was variation in the proportions of specific cell populations. Notably, while mass cytometry and flow cytometry largely agreed in percentage of T cells, scRNA-seq detected a lower percentage of the same population. As described above, this difference was further exacerbated by the DN and DP T cells that were not detected in the scRNA-seq data. In contrast, the scRNA-seq data exhibited a larger proportion of monocytes than both the mass cytometry and flow cytometry data. To note, we did not resolve a CD16+ monocyte population in the mass cytometry data. Finally, while the scRNA-seq and mass cytometry data exhibited a roughly equal proportion of NK cells, the flow cytometry data had a larger percentage of the same population. To note, differences in cell type percentages measured by scRNA-seq and cytometry have been previously reported in bone marrow mononuclear cells (BMMCs)²⁵.
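A comparison of this kind reduces to counting labels per modality and taking differences in percentage points. The sketch below uses made-up labels, not the published proportions, and `cell_type_percentages` is a hypothetical helper name.

```python
from collections import Counter

def cell_type_percentages(labels):
    """Return {cell type: percentage of all cells} for one modality."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cell_type: 100.0 * n / total for cell_type, n in counts.items()}

# Toy annotations for two modalities (illustrative values only).
rna_labels = ["T"] * 5 + ["Mono"] * 3 + ["B"] * 2
cytof_labels = ["T"] * 7 + ["Mono"] * 1 + ["B"] * 2

rna_pct = cell_type_percentages(rna_labels)
cytof_pct = cell_type_percentages(cytof_labels)

# Difference in percentage points (scRNA-seq minus mass cytometry);
# a type missing from one modality is treated as 0%.
all_types = sorted(set(rna_pct) | set(cytof_pct))
diff = {t: rna_pct.get(t, 0.0) - cytof_pct.get(t, 0.0) for t in all_types}
print(diff)
```

Treating a missing type as 0% matters for populations resolved in only one modality, such as the DN and DP T cells described above.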

To broadly estimate the correlation between scRNA-seq and CyToF measurements, we examined the normalized and log-transformed mass cytometry measurements and compared them to the normalized and log-transformed expression of the corresponding genes in the scRNA-seq data across the different cell types (Fig. 3c). Overall, the correlation between the mass cytometry and scRNA-seq measurements was relatively weak (r² = 0.47–0.66). Taken together, these brief analyses suggest an imprecise concordance between scRNA-seq and mass cytometry measurements. This finding may have broader impacts including suggesting the need for careful consideration in applications such as the identification of rare populations and cell states, which may be obscured using one data modality over another.
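As an illustration of the statistic reported here, the sketch below computes a Pearson correlation and its square between paired protein and mRNA values. The text does not specify exactly how values were paired or aggregated, so the pairing of per-marker averages within one cell type is an assumption, and the numbers are invented.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical mean expression of four markers in one cell type:
# transformed protein signal vs. log-normalized mRNA of the matching genes.
protein = [1.0, 2.0, 3.0, 4.0]
mrna = [1.0, 3.0, 2.0, 4.0]

r = pearson_r(protein, mrna)
r_squared = r ** 2
print(round(r, 2), round(r_squared, 2))
```

Repeating the calculation per cell type gives one r² value per population, which is how a range such as the one reported above can arise.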

Data Records

The CyToF data²⁶ is available at https://flowrepository.org/id/FR-FCM-Z6ZN. The flow cytometry data²⁷ is available at https://flowrepository.org/id/FR-FCM-Z6ZX. The scRNA-seq data is available at GEO under accession GSE225431²⁸.

Technical Validation

Common quality control metrics were calculated in Scanpy (including genes per cell, UMI per cell, and percent mitochondrial gene transcripts), and cells were filtered based on these metrics (see Methods). Subsequent SingleCellNet classification results of the transcriptomic data largely agreed with the marker gene expression and corresponding cell type annotation (Fig. 1). Additionally, the percentages of each cell type apparent in the scRNA-seq, mass cytometry, and flow cytometry datasets are in line with previously established percentages^3,25,26,29,30. Of note, reported PBMC composition proportions vary, likely reflecting both biological and technical variability. For example, one source estimates 70–85% T cells, 5–10% B cells, 5–20% NK cells, 10–20% monocytes, and 1–2% dendritic cells, and another measured 49–77% T cells, 6–17% B cells, 7–35% NK cells, 6–12% monocytes, and 0.6–1.5% dendritic cells. Given that our measurements fall within these ranges, we believe this dataset is an accurate representation of human PBMCs. Further, the split-sample nature of this dataset ensures that RNA-seq and cytometry measurements correspond to the same sample state.

Usage Notes

Both raw counts and processed (normalized, log-transformed, with UMAP coordinates) scRNA-seq data are available as H5AD files. Both are stored as AnnData objects. Cluster annotation can be accessed from the ‘.obs’ slot. For the raw dataset, counts data can be found in the ‘.X’ slot. For the processed dataset, the full normalized, log-transformed expression matrix is stored in the ‘.raw’ slot, while the expression matrix subset for highly variable genes is stored in the ‘.X’ slot.

Similarly, the processed CyToF data is available as an H5AD file. Normalized and asinh-transformed data is stored in the ‘.X’ slot. Cluster annotations can be accessed from the ‘.obs’ slot.
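The slot layout described above can be inspected with a few lines of Python. This is a hedged sketch: `load_h5ad` and `describe_slots` are hypothetical helper names, the file path in the comment is a placeholder rather than the actual file name in the repositories, and the stand-in object exists only so the example runs without the `anndata` package or the data files.

```python
from types import SimpleNamespace

def load_h5ad(path):
    """Read an H5AD file into an AnnData object (requires `anndata`)."""
    import anndata as ad  # imported lazily so the rest runs without it
    return ad.read_h5ad(path)

def describe_slots(adata):
    """Summarize where expression values and annotations are stored."""
    info = {
        "X_shape": tuple(adata.X.shape),          # cells x features in `.X`
        "obs_columns": list(adata.obs.columns),   # per-cell annotations in `.obs`
    }
    raw = getattr(adata, "raw", None)
    if raw is not None:
        # For the processed scRNA-seq file, `.raw` holds the full matrix
        # while `.X` holds only the highly variable genes.
        info["raw_shape"] = tuple(raw.X.shape)
    return info

# With the real data (placeholder path):
#   adata = load_h5ad("path/to/processed_scrna.h5ad")
#   print(describe_slots(adata))

# Stand-in object mimicking the relevant attributes, with made-up shapes
# and a made-up annotation column name.
fake = SimpleNamespace(
    X=SimpleNamespace(shape=(4, 2)),
    obs=SimpleNamespace(columns=["annotation"]),
    raw=SimpleNamespace(X=SimpleNamespace(shape=(4, 10))),
)
summary = describe_slots(fake)
print(summary)
```

Listing the `.obs` columns first is a safe way to find the cluster annotation column, since its exact name is not given in the text.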

References

1. Klein, A. M. et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).
2. Macosko, E. Z. et al. Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. Cell 161, 1202–1214 (2015).
3. Zheng, G. X. Y. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
4. Villani, A.-C. et al. Single-cell RNA-seq reveals new types of human blood dendritic cells, monocytes, and progenitors. Science 356, (2017).
5. Trapnell, C. Defining cell types and states with single-cell genomics. Genome Res. 25, 1491–1498 (2015).
6. Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 14, 979–982 (2017).
7. Street, K. et al. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics 19, 477 (2018).
8. Haghverdi, L., Büttner, M., Wolf, F. A., Buettner, F. & Theis, F. J. Diffusion pseudotime robustly reconstructs lineage branching. Nat. Methods 13, 845–848 (2016).
9. Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods 14, 1083–1086 (2017).
10. Qiu, X. et al. Inferring Causal Gene Regulatory Networks from Coupled Single-Cell Expression Dynamics Using Scribe. Cell Syst. 10, 265–274.e11 (2020).
11. Su, E. Y., Spangler, A., Bian, Q., Kasamoto, J. Y. & Cahan, P. Reconstruction of dynamic regulatory networks reveals signaling-induced topology changes associated with germ layer specification. Stem Cell Rep. 17, 427–442 (2022).
12. Bandura, D. R. et al. Mass cytometry: technique for real time single cell multitarget immunoassay based on inductively coupled plasma time-of-flight mass spectrometry. Anal. Chem. 81, 6813–6822 (2009).
13. Ornatsky, O. et al. Highly multiparametric analysis by mass cytometry. J. Immunol. Methods 361, 1–20 (2010).
14. Bendall, S. C. et al. Single-cell mass cytometry of differential immune and drug responses across a human hematopoietic continuum. Science 332, 687–696 (2011).
15. Levine, J. H. et al. Data-Driven Phenotypic Dissection of AML Reveals Progenitor-like Cells that Correlate with Prognosis. Cell 162, 184–197 (2015).
16. Bendall, S. C. et al. Single-cell trajectory detection uncovers progression and regulatory coordination in human B cell development. Cell 157, 714–725 (2014).
17. Vogel, C. & Marcotte, E. M. Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nat. Rev. Genet. 13, 227–232 (2012).
18. de Sousa Abreu, R., Penalva, L. O., Marcotte, E. M. & Vogel, C. Global signatures of protein and mRNA expression levels. Mol. Biosyst. 5, 1512–1526 (2009).
19. Delaney, C. et al. Combinatorial prediction of marker panels from single-cell transcriptomic data. Mol. Syst. Biol. 15, e9005 (2019).
20. Finck, R. et al. Normalization of mass cytometry data with bead standards. Cytometry A 83, 483–494 (2013).
21. Fread, K. I., Strickland, W. D., Nolan, G. P. & Zunder, E. R. An updated debarcoding tool for mass cytometry with cell type-specific and cell sample-specific stringency adjustment. Pac Symp Biocomput 22, 588–598 (2017).
22. Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
23. Heumos, L. et al. Best practices for single-cell analysis across modalities. Nat. Rev. Genet. 24, 550–572 (2023).
24. Tan, Y. & Cahan, P. SingleCellNet: A Computational Tool to Classify Single Cell RNA-Seq Data Across Platforms and Across Species. Cell Syst. 9, 207–213.e2 (2019).
25. Oetjen, K. A. et al. Human bone marrow assessment by single-cell RNA sequencing, mass cytometry, and flow cytometry. JCI Insight 3, (2018).
26. Zunder, E. FlowRepository https://flowrepository.org/id/FR-FCM-Z6ZN (2023).
27. Zunder, E. FlowRepository https://flowrepository.org/id/FR-FCM-Z6ZX (2023).
28. Su, E. Y., Fread, K., Goggin, S., Zunder, E. R. & Cahan, P. GEO. https://identifiers.org/geo/GSE225431 (2023).
29. Autissier, P., Soulas, C., Burdo, T. H. & Williams, K. C. Evaluation of a 12-color flow cytometry panel to study lymphocyte, monocyte, and dendritic cell subsets in humans. Cytometry A 77, 410–419 (2010).
30. Kleiveland, C. R. in The impact of food bioactives on health (eds. Verhoeckx, K. et al.) 161–167, https://doi.org/10.1007/978-3-319-16104-4_15 (Springer International Publishing, 2015).

Acknowledgements

We thank Coleen A. McNamara for providing the PBMCs and Chantel McSkimming for advice on the antibody panel. Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health to PC under Award No. R35GM124725. EYS was supported by the NSF Graduate Research Fellowship under Grant No. DGE-1746891. ERZ was supported by NIH/NINDS Grant No. 5R01NS111220. KF was supported by NIH/NIGMS Grant No. 5T32GM136615. SG was supported by NIH Grant No. 5T32LM012416.

Author information

These authors contributed equally: Emily Y. Su, Kristen Fread.

Authors and Affiliations

Institute for Cell Engineering, Johns Hopkins School of Medicine, Baltimore, MD, USA
Emily Y. Su & Patrick Cahan
Department of Biomedical Engineering, Johns Hopkins School of Medicine, Baltimore, MD, USA
Emily Y. Su & Patrick Cahan
Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
Kristen Fread, Sarah Goggin & Eli R. Zunder
Department of Molecular Biology and Genetics, Johns Hopkins School of Medicine, Baltimore, MD, USA
Patrick Cahan

Contributions

E.Y.S. and K.F. performed the experiments with assistance from S.G. E.Y.S. wrote the manuscript with assistance from K.F. P.C. and E.R.Z. conceptualized and supervised the project.

Corresponding authors

Correspondence to Eli R. Zunder or Patrick Cahan.

Ethics declarations

Competing interest

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Su, E.Y., Fread, K., Goggin, S. et al. Direct comparison of mass cytometry and single-cell RNA sequencing of human peripheral blood mononuclear cells. Sci Data 11, 559 (2024). https://doi.org/10.1038/s41597-024-03399-6

Received: 31 March 2023
Accepted: 21 May 2024
Published: 30 May 2024
DOI: https://doi.org/10.1038/s41597-024-03399-6
