Fig. 8: Clustering of confidence-based substrate classes based on feature impact. | Nature Communications

From: Charting γ-secretase substrates by explainable AI

a Heatmap showing hierarchical clustering of dataset 1 (with TMHMM annotation) based on Pearson correlation coefficients for feature impacts (see Supplementary Methods “Clustering based on feature impact”). Dataset classes (top, color code of SUBEXPERT, NONSUB, and NONSUBPRED, according to Fig. 3a) and corresponding confidence-based substrate classes (left, color code according to (b)) are indicated. Five distinct clusters are highlighted by squares. b Five clusters from (a) depicted along the continuum of confidence-based substrate classes (top), ranging from high-confidence (HC) substrates, through low-confidence (LC) substrates and LC non-substrates, to HC non-substrates. The color code indicates the confidence-based substrate classes (Methods “Confidence-based substrate classes”). See Supplementary Fig. 12 for the CPP-SHAP analysis results of the four selected proteins highlighted by asterisks (SPN, BTC, murine IGSF5, and TLR1). Gene names are in uppercase for human and with the first letter capitalized for other organisms (Supplementary Methods “Datasets”). Source data are provided as a Source Data file.
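The clustering step in panel (a) can be illustrated with a minimal sketch. It is not the authors' implementation: the caption only states that clustering is hierarchical and based on Pearson correlation coefficients of feature impacts, so the distance (1 minus Pearson r), the average-linkage rule, the function names `pearson` and `cluster_by_feature_impact`, and the toy proteins `P1` to `P4` with their values are all assumptions made for illustration. The actual procedure is described in Supplementary Methods “Clustering based on feature impact”.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors.

    Assumes neither vector is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_by_feature_impact(impacts, n_clusters):
    """Agglomerative clustering of proteins by feature-impact profiles.

    impacts: dict mapping a protein name to its feature-impact vector.
    Distance is 1 - Pearson r and clusters are merged by average linkage;
    both choices are assumptions, not taken from the paper.
    Returns a sorted list of clusters, each a sorted list of protein names.
    """
    names = list(impacts)
    dist = {
        frozenset((a, b)): 1.0 - pearson(impacts[a], impacts[b])
        for a, b in combinations(names, 2)
    }
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    # Repeatedly merge the two closest clusters until n_clusters remain.
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy feature impacts (made-up numbers, not data from the paper).
toy_impacts = {
    "P1": [0.9, 0.7, -0.2, -0.8],
    "P2": [0.8, 0.6, -0.1, -0.9],
    "P3": [-0.7, -0.9, 0.3, 0.8],
    "P4": [-0.6, -0.8, 0.2, 0.9],
}
clusters = cluster_by_feature_impact(toy_impacts, n_clusters=2)
print(clusters)  # [['P1', 'P2'], ['P3', 'P4']]
```

In this toy input, proteins with similar feature-impact profiles (`P1` and `P2`, `P3` and `P4`) end up in the same cluster, which is the kind of grouping the squares in the heatmap highlight. Calling the function with `n_clusters=5` on a real feature-impact matrix would be the analogue of the five clusters in the figure, under the assumptions stated above.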
