Supplementary Figure 2: scRNA-seq analysis pipeline and distribution of identified subsets by scRNA-seq, flow cytometry, and protein fluorescence on each cell. | Nature Immunology

From: Defining inflammatory cell states in rheumatoid arthritis joint synovial tissues by integrating single-cell transcriptomics and mass cytometry

Supplementary Figure 2

a, CCA-based integrative pipeline for scRNA-seq analysis. (1) We first select the highly variable genes from both the scRNA-seq and bulk RNA-seq data; (2) we integrate single cells with bulk samples on the basis of the selected genes and use CCA to learn linear projections that maximize the correlation between the two datasets; (3) we then calculate a cell-to-cell similarity matrix from the top 10 canonical variates; (4) we build a k-nearest neighbors (KNN) network from the cell-to-cell similarity matrix and convert it into an adjacency matrix; (5) we cluster the cells by applying the Infomap community detection algorithm to the cell-to-cell adjacency matrix to identify major groups; (6) we visualize the cells with tSNE; (7) we perform differential gene expression analysis on the identified cell type clusters and report three statistics: AUC, percentage of cells with non-zero expression, and fold change; (8) finally, we perform gene set enrichment analysis to find pathways associated with each identified cell cluster. b, Silhouette analysis of 18 scRNA-seq clusters. The silhouette value ranges from −1 to 1, where a value near 1 indicates that a cell is far from neighboring clusters, a value near 0 indicates that a cell is near a decision boundary between clusters, and a negative value indicates that a cell is closer to a neighboring cluster than to its assigned cluster. In each box plot, the box spans the 25th to 75th percentiles and the center line marks the 50th percentile (median). The lower whisker extends to the lowest value greater than the 25th percentile minus 1.5× the interquartile range. The upper whisker extends to the greatest value less than the 75th percentile plus 1.5× the interquartile range. Points are values beyond the whiskers. c, Number of cells in each cell type cluster across donors. d, Cellular composition of major synovial cell types for each donor by flow cytometry. e, Flow cytometry protein fluorescence of cell type markers on each single cell: PDPN, THY1 (CD90), CD45, CD19, CD14, and CD3.
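
The silhouette values and box-plot features described for panel b can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `silhouette_values` and `box_stats` are hypothetical, Euclidean distance is an assumption (the caption does not name the distance used), the quartiles use the standard library's "inclusive" interpolation, and values lying exactly on a fence are treated as inside the whiskers.

```python
import math
import statistics


def silhouette_values(points, labels):
    """Return one silhouette value per point (assumes at least two clusters).

    Euclidean distance is an assumption made for this sketch.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Common convention: a singleton cluster gets a silhouette of 0.
            values.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = statistics.fmean(math.dist(points[i], points[j]) for j in own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            statistics.fmean(math.dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def box_stats(values):
    """Summarize values with the box, whiskers, and outlying points of panel b."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Assumption: values exactly on a fence count as inside the whiskers.
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "low_whisker": min(inside),
        "high_whisker": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }
```

Calling `box_stats(silhouette_values(points, labels))` on the cells of one cluster would give the five-number summary drawn for that cluster. A cell assigned to a distant cluster yields a negative silhouette value, matching the interpretation given in the caption.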
