Fig. 3: Generation of islet cell type-specific identity genesets.
From: Generation of human islet cell type-specific identity genesets

A Methodology to determine the optimal genesets from our lists of identity genes. First, 21 incrementally smaller genesets were generated per cell type, each including only genes supported by an increasingly higher number of integrated analyses. To define the best identity geneset, we used GSEA to generate normalized enrichment scores (NES; proxy for geneset sensitivity) and gene retrieval rates (GRR; proxy for geneset specificity) based on differential expression in our γδε-dataset. The optimal geneset was defined as the geneset with the highest combined NES and GRR metric (their product; see panel E). B Geneset sizes for each cell type at every possible intersect level. Genesets were filtered to contain between 40 and 500 genes (darker colours). Genesets outside this range (lighter colours) were not considered for downstream evaluation. C Mean sensitivity score (normalized enrichment; n = 3 independent experiments) for each geneset for each cell type. D Mean specificity score (gene retrieval rate; n = 3 independent experiments) for each geneset for each cell type. E Product of NES and GRR scores (n = 3 independent experiments) for each geneset for each cell type. Per cell type, the highest measured value is indicated by a dotted line, and the corresponding number of integrated analyses is indicated on the x-axis. Genesets were defined to include all genes with at least this number of integrated analyses, resulting in the geneset sizes indicated in the top left corner. F Determination of final geneset sizes by applying the determined number of integrated analyses to the lists of ID genes generated in Fig. 1. For each cell type, the determined cut-off is indicated as a thick black line; numbers marked with a # give the cut-off value determined in panel (E). α-, β-, γ- and δ-cell genesets are shown in red, green, magenta and blue, respectively. Source data are provided as a source data file.
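
The selection logic described in panels A, B and E can be illustrated with a minimal Python sketch. It is not the authors' code. The function names (`build_genesets`, `select_optimal_level`), the input dictionaries and the numbering of intersect levels from 1 to 21 are hypothetical. The sketch also assumes that NES and GRR have already been computed with GSEA and averaged over the three experiments, and that the combined metric is the product of those means.

```python
from typing import Dict, Optional, Set, Tuple


def build_genesets(support: Dict[str, int], max_level: int = 21) -> Dict[int, Set[str]]:
    """Build one geneset per intersect level.

    `support` maps each gene to the number of integrated analyses that
    identified it. Level k keeps the genes found in at least k analyses,
    so the genesets shrink as k grows. Numbering levels 1..max_level is
    an assumption made for illustration.
    """
    return {
        k: {gene for gene, n in support.items() if n >= k}
        for k in range(1, max_level + 1)
    }


def select_optimal_level(
    genesets: Dict[int, Set[str]],
    mean_nes: Dict[int, float],
    mean_grr: Dict[int, float],
    min_size: int = 40,
    max_size: int = 500,
) -> Optional[Tuple[int, float]]:
    """Return (level, score) for the geneset with the highest NES x GRR.

    Genesets outside [min_size, max_size] genes are skipped, mirroring
    the size filter in panel B. Returns None if no geneset qualifies.
    """
    best: Optional[Tuple[int, float]] = None
    for level in sorted(genesets):
        size = len(genesets[level])
        if not (min_size <= size <= max_size):
            continue  # outside the allowed size range
        if level not in mean_nes or level not in mean_grr:
            continue  # no scores available for this level
        score = mean_nes[level] * mean_grr[level]
        if best is None or score > best[1]:
            best = (level, score)
    return best
```

Ties are resolved in favour of the lowest intersect level, which is a choice of this sketch rather than something stated in the caption.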