Fig. 9: Functional bioinformatics analysis of γ-secretase substrates by enrichment analysis and pathway clustering.

a Workflow of functional bioinformatics analysis for human single-span N-out membrane proteins (n = 1534, human N-out proteome). Separate enrichment analyses were performed for all 250 high-confidence (HC) substrates (gray) and the 160 new HC substrates (red), for which an additional network analysis was conducted. b Stacked bar chart showing the relative distribution within the human N-out proteome for proteins from the “New (predicted)” (gray), SUBEXPERT (green), SUBLIT (light blue), and NONSUB (purple) datasets across the four confidence-based substrate classes (see Methods “Confidence-based substrate classes”). The “New (predicted)” dataset comprises all proteins of the human N-out proteome with unknown substrate status. c Bar chart showing the number of proteins with a naturally short ectodomain or known sheddase in the human N-out proteome. d Gene ontology (GO) enrichment analysis results for all HC substrates compared to the human N-out proteome. Top 6 semantic clusters (see Supplementary Methods “Enrichment analysis”) are shown for each GO domain: biological process (BP, orange), molecular function (MF, red), cellular component (CC, green). An enrichment score was computed for each semantic cluster as the mean −log10 P value of its constituent GO terms. e Pathway enrichment analysis results for known substrates (left) and new HC substrates (right) compared to the default g:Profiler background. The Benjamini-Hochberg adjusted −log10 P values are shown for the top 5–6 pathway terms from Reactome (light blue), KEGG (gray), and WikiPathways (gold). New pathway links (i.e., terms not previously linked to γ-secretase or its substrates) are highlighted in light blue. f Map displaying 7 clusters (C1–C7) of pathway terms linked by shared genes. Nodes represent pathway terms, sized by the number of associated new HC substrates and color-coded as in (e). Edges indicate the size of gene set overlaps. See Supplementary Figs. 13–15 for results of further functional bioinformatics analysis. Source data are provided as a Source Data file.
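
The per-cluster enrichment score described for panel (d), the mean −log10 P value of a cluster's constituent GO terms, can be illustrated with a minimal Python sketch. The function name `cluster_enrichment_scores`, the term and cluster labels, and the P values are hypothetical placeholders and are not taken from the study. The caption does not state whether raw or adjusted P values enter the score, so the sketch simply takes whatever P values it is given.

```python
import math
from collections import defaultdict


def cluster_enrichment_scores(term_pvalues, term_to_cluster):
    """Return the mean -log10(P) of the terms assigned to each cluster.

    term_pvalues:    dict mapping a term label to its P value (must be > 0)
    term_to_cluster: dict mapping the same term labels to a cluster label
    """
    grouped = defaultdict(list)
    for term, p_value in term_pvalues.items():
        # math.log10 raises ValueError for p_value <= 0, so zero P values
        # would have to be handled before calling this function.
        grouped[term_to_cluster[term]].append(-math.log10(p_value))
    return {cluster: sum(vals) / len(vals) for cluster, vals in grouped.items()}


# Hypothetical example values (illustration only, not study data).
term_pvalues = {"term_a": 1e-4, "term_b": 1e-2, "term_c": 1e-3}
term_to_cluster = {
    "term_a": "cluster_1",
    "term_b": "cluster_1",
    "term_c": "cluster_2",
}

scores = cluster_enrichment_scores(term_pvalues, term_to_cluster)
for cluster, score in sorted(scores.items()):
    print(f"{cluster}: {score:.2f}")
```

In this toy input, cluster_1 averages the −log10 P values of two terms while cluster_2 contains a single term, so its score equals that term's own −log10 P value.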