Extended Data Fig. 7: RNA-seq analysis of CD19+CD20+CD38++ B cells sorted from healthy individuals.

a, Sorting strategy and approach: CD19+CD20+ and CD20+CD38++ B cell populations were isolated by FACS from peripheral blood samples of six healthy donors. RNA-seq libraries were prepared for each isolated sample using the Nugen Ovation SoLo low-input RNA-seq library preparation kit. b, Differential gene expression between the CD20+CD38++ B cells and parental CD19+CD20+ B cells from six paired samples was assessed using DESeq2. The plot shows the log2 fold change versus the log2 of the mean normalized counts across all samples for each gene. TGSig genes are shown in red; genes from the differentially expressed gene set (BH-adjusted two-tailed p value < 1%, log of mean normalized counts > 1; total genes in set: 105) that fall into the top enriched Gene Ontology Biological Process category “Cell Activation” (enrichment analysis performed using ToppGene; ref. 70) are shown in cyan (21 genes). c, Enrichment of the 87 SLE-Sig genes among genes ranked by differential expression between CD19+CD20+CD38++ and CD19+CD20+ cells. The p value shown was computed from the GSEA test. d, Similar to (c), but instead of the SLE-Sig genes, the top k (k = 10 (TGSig), 30, 50) genes correlated with the frequency of CD20+CD38++ B cells are assessed (only the 713 temporally stable genes with TSM ≥ 0.75 were included in the analysis); see also Extended Data Fig. 2. e, Similar to (d), but using the 7,731 genes with TSM ≥ 0.5. A lower, more relaxed TSM cutoff was used to evaluate whether an enrichment signal can be detected when starting with more genes (and therefore potentially more statistical power for the enrichment analysis).
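
The gene-set filter described for panel b (BH-adjusted p value < 1%, log of mean normalized counts > 1) can be illustrated with a minimal sketch. This is not the DESeq2 implementation or the authors' code: the function names `bh_adjust` and `select_de_genes` are hypothetical, the gene labels are placeholders, and the use of log2 for the mean-count threshold is an assumption, since the legend does not state the base.

```python
import math

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def select_de_genes(genes, pvalues, mean_counts, alpha=0.01, min_log_mean=1.0):
    """Keep genes with BH-adjusted p < alpha and log2(mean count) > min_log_mean.

    The log2 base is an assumption; the legend only says "log".
    """
    padj = bh_adjust(pvalues)
    return [
        gene
        for gene, q, mu in zip(genes, padj, mean_counts)
        if q < alpha and mu > 0 and math.log2(mu) > min_log_mean
    ]

# Placeholder genes: only GENE_A passes both thresholds.
selected = select_de_genes(
    ["GENE_A", "GENE_B", "GENE_C"],
    [0.0001, 0.0002, 0.5],
    [100.0, 1.5, 1000.0],
)
```

In practice DESeq2 reports BH-adjusted values directly (its `padj` column), so the adjustment step here only makes the criterion explicit.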