Fig. 2: Development of a single-cell ASPEN analysis pipeline and global enrichment methods.
From: Cell populations in human breast cancers are molecularly and biologically distinct with age

ASPEN relies on parallel adaptations of GSEA and signature scoring to associate gene expression-based enrichment of functional pathways with age. a, The mean gene expression per cell type is matched to donor age and a correlation coefficient is calculated for each gene. The genes with nonzero coefficients are then ranked according to their correlation and GSEA is performed using gene sets of choice (in our case, Hallmark). b, Concurrently, the gene sets are used to assign a signature score to every cell in the single-cell dataset using Seurat. After scoring, the mean signature score for each gene set is calculated per cell type per donor. These mean values are then correlated with donor age. c, The resulting normalized enrichment scores (NES) from a are then plotted as the data point color for each cell type and pathway combination, with red indicating enrichment in older donors, blue indicating enrichment in younger donors and white indicating a failure to achieve statistical significance (false discovery rate (FDR)-adjusted P > 0.05). Irrespective of the correlation direction (coefficient <0 or >0) in b, the magnitude of the correlation of signature score with age is visualized as the size of the data point for each cell type and pathway combination, with point size proportional to the magnitude of correlation (larger circle = more strongly correlated or anticorrelated). d, Box plot showing the distribution of 175 TNBC and 110 ER+ NES values that achieved Padj < 0.05 for each cell type and Hallmark pathway combination from ASPEN (colored circles in c; Fig. 3). An NES > 0 indicates significant enrichment in older patients; an NES < 0 indicates significant enrichment in younger patients. Significance was determined using a two-tailed Student’s t-test on the NES for each breast cancer subtype. The center line indicates the median; the box limits indicate the upper and lower quartiles; the whiskers indicate 1.5 times the interquartile range.
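
The correlation steps in a and b and the dot encoding in c can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the ASPEN implementation: the caption does not name the correlation statistic, so Pearson is assumed here; the function names, the 0.05 default threshold argument and the toy values are hypothetical; the GSEA run and the Seurat signature scoring are not reproduced, and per-cell signature scores are taken as given.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation (assumed statistic); 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_genes_by_age(mean_expr, ages):
    """Panel a: correlate per-donor mean expression with age for one cell type.

    mean_expr maps gene -> list of mean expression values, one per donor,
    in the same donor order as ages. Genes with a zero coefficient are
    dropped; the rest are sorted from most positive to most negative,
    giving a ranked list that could be passed to a preranked GSEA run.
    """
    scored = [(gene, pearson(values, ages)) for gene, values in mean_expr.items()]
    kept = [(gene, r) for gene, r in scored if r != 0.0]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def signature_age_correlation(cell_scores, donor_age):
    """Panel b: average per-cell signature scores per donor, then correlate with age.

    cell_scores is a list of (donor, score) pairs for one cell type and
    one gene set; donor_age maps donor -> age.
    """
    by_donor = {}
    for donor, score in cell_scores:
        by_donor.setdefault(donor, []).append(score)
    donors = sorted(by_donor)
    means = [sum(by_donor[d]) / len(by_donor[d]) for d in donors]
    ages = [donor_age[d] for d in donors]
    return pearson(means, ages)


def dot_encoding(nes, padj, r, alpha=0.05):
    """Panel c: color from the NES sign and significance, size from |r|."""
    if padj > alpha:
        color = "white"  # not significant
    elif nes > 0:
        color = "red"    # enriched in older donors
    else:
        color = "blue"   # enriched in younger donors
    return color, abs(r)


# Toy values for illustration only (not data from the study).
ages = [30, 45, 60, 75]
mean_expr = {
    "GENE_UP": [1.0, 2.0, 3.0, 4.0],
    "GENE_DOWN": [4.0, 3.0, 2.0, 1.0],
    "GENE_FLAT": [2.0, 2.0, 2.0, 2.0],
}
ranked = rank_genes_by_age(mean_expr, ages)

cell_scores = [("D1", 0.1), ("D1", 0.3), ("D2", 0.4), ("D3", 0.6), ("D3", 0.8)]
donor_age = {"D1": 30, "D2": 50, "D3": 70}
r_sig = signature_age_correlation(cell_scores, donor_age)
```

In this sketch the constant gene is excluded from the ranking because its coefficient is zero, and the dot size depends only on the magnitude of the panel b correlation, so a strongly anticorrelated pathway is drawn as large as a strongly correlated one.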