Fig. 2: Identification of the physicochemical signature of γ-secretase substrates using CPP. | Nature Communications


From: Charting γ-secretase substrates by explainable AI


a Workflow comprising the identification of substrate features by Comparative Physicochemical Profiling (CPP), the prediction of substrate candidates using machine learning, and the explanation of feature impacts on substrate prediction scores via Shapley Additive exPlanations (SHAP). b–e Results of CPP analysis comparing SUBEXPERT with OTHERS (dataset 1 with TMHMM annotation). Feature importance was obtained by machine learning models trained on SUBEXPERT against non-substrates (NONSUB with NONSUBPRED). Sequence length was set to 40 residues. b CPP profile showing cumulative feature importance per residue. Different sequence regions are indicated, including their total feature importance. c CPP feature map showing the feature value mean differences (SUBEXPERT - OTHERS) per residue position and scale subcategory, classified into 6 categories as provided by AAontology (ref. 35). The cumulative feature importance per scale subcategory is indicated by gray bars (right). The feature importance per residue position and scale subcategory is highlighted by black squares if higher than 0.2%. d Relative occurrence of scale categories per sequence region as shown in (b). e Cumulative feature importance for top 10, 25, 50, and 117 out of 150 features. Source data are provided as a Source Data file.
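
The aggregations named in panels b, c, and e can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy feature list, the function names, and in particular the assumption that a feature's importance is split evenly across the residue positions it covers are all hypothetical and serve only to show the bookkeeping behind a cumulative per-residue profile, the 0.2% highlighting threshold, and a top-k cumulative importance.

```python
from collections import defaultdict

# Hypothetical toy features (not from the paper):
# (scale subcategory, residue positions covered, feature importance in %)
FEATURES = [
    ("Hydrophobicity", [10, 11, 12], 6.0),
    ("Charge", [25], 3.0),
    ("Hydrophobicity", [12, 13], 1.0),
    ("Volume", [30, 31, 32, 33], 0.4),
]


def top_k_cumulative(features, k):
    """Sum the k largest feature importances (compare panel e)."""
    ranked = sorted((imp for _, _, imp in features), reverse=True)
    return sum(ranked[:k])


def importance_per_cell(features):
    """Accumulate importance per (subcategory, position) cell (compare panel c).

    Assumption: each feature's importance is split evenly over its positions.
    """
    cells = defaultdict(float)
    for subcategory, positions, importance in features:
        share = importance / len(positions)
        for pos in positions:
            cells[(subcategory, pos)] += share
    return dict(cells)


def importance_per_residue(cells):
    """Sum the cells over subcategories to get a per-residue profile (compare panel b)."""
    profile = defaultdict(float)
    for (_, pos), value in cells.items():
        profile[pos] += value
    return dict(profile)


def highlighted_cells(cells, threshold=0.2):
    """Return the cells whose importance exceeds the threshold (in %)."""
    return sorted(key for key, value in cells.items() if value > threshold)


cells = importance_per_cell(FEATURES)
profile = importance_per_residue(cells)
marked = highlighted_cells(cells)
top2 = top_k_cumulative(FEATURES, 2)
```

With real data, the feature list would come from the Source Data file; the even split across positions should be replaced by whatever weighting the original analysis uses.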
