Fig. 7: CPP-SHAP analysis for ERBB2.

The validated HC substrate ERBB2 was analyzed with four CPP-SHAP plots using fuzzy labeling (see Methods, “Combining CPP with SHAP”): the CPP-SHAP ranking plot (a) ranks the top 15 features by the absolute value of their impact, which can be positive (red) or negative (blue); the CPP-SHAP profile (b) shows the cumulative feature impact per residue; the CPP heatmap (c) highlights the differences in feature values between ERBB2 and the reference dataset (OTHERS) per scale subcategory and residue; and the CPP-SHAP heatmap (d) shows the feature impact per scale subcategory and residue. Scale categories are taken from AAontology (ref. 35) and are color-coded consistently across panels. The CPP-SHAP analysis results for ERBB2 (88 ± 7% substrate prediction score) explain its prediction score of 76% (red) based on dataset 1 with TMHMM annotation. The CPP-SHAP ranking plot shows that the top 15 features have a predominantly positive impact, for example an increased β-strand tendency in the TMD-C or an increased entropy in the TMD-C anchor. The positive impact of these regions is also visible in the CPP-SHAP profile. The CPP heatmap and the CPP-SHAP heatmap reveal a negative impact of some residues within the TMD-C, such as two glycines, which is attributed to their α-helix-destabilizing effect. For comparison, see the CPP-SHAP analysis for the LC substrate SLC27A1 (Supplementary Fig. 11). Source data are provided as a Source Data file.
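
As a rough illustration of how the quantities in panels a and b could be derived from per-feature impact values, the following minimal sketch ranks features by absolute impact, assigns the red/blue sign color, and sums impact per residue. It is not the CPP-SHAP implementation: the function names, the toy feature names, impact values, and residue positions are hypothetical, and splitting a feature's impact equally across its residues is an assumption made only for this example.

```python
def rank_by_abs_impact(impacts, top_n=15):
    """Return (feature, impact) pairs sorted by |impact|, largest first."""
    ranked = sorted(impacts.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top_n]


def color_for(impact):
    """Sign convention from the caption: positive is red, negative is blue."""
    return "red" if impact >= 0 else "blue"


def cumulative_profile(impacts, feature_positions, seq_len):
    """Sum feature impact per residue.

    Assumption: each feature's impact is split equally across the
    residue positions (0-based) that the feature covers.
    """
    profile = [0.0] * seq_len
    for feature, impact in impacts.items():
        positions = feature_positions[feature]
        share = impact / len(positions)
        for pos in positions:
            profile[pos] += share
    return profile


# Hypothetical toy data, not values from the figure.
impacts = {
    "beta_strand_tmd_c": 2.0,
    "entropy_tmd_c_anchor": 1.0,
    "alpha_helix_glycine": -1.5,
}
feature_positions = {
    "beta_strand_tmd_c": [2, 3],
    "entropy_tmd_c_anchor": [4],
    "alpha_helix_glycine": [3],
}

ranking = rank_by_abs_impact(impacts)
colors = [color_for(impact) for _, impact in ranking]
profile = cumulative_profile(impacts, feature_positions, seq_len=6)
```

In this toy case a residue covered by both a positive and a negative feature ends up with a net negative value in the profile, analogous to individual residues carrying negative impact inside an otherwise positively contributing region.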