Table 2 Systematic analysis of peculiar results found by the HNN model.

From: Transferring chemical and energetic knowledge between molecular systems with machine learning

First cluster

Second cluster

Expectation

Explanation

High similarity for predictions

Cluster 0 belongs to the extended family, as does cluster 3, but the latter is mainly organized as poly-proline, which introduces a significant structural difference that the HNN model detects

High similarity for predictions

The poly-proline conformation is closely related to the β-sheet, but HNN is able to discern the two secondary structures

High similarity for predictions

See the explanation for clusters 0 and 3

High similarity for predictions

Cluster 2 is mainly organized in α-helix, but part of it has unfolded structures, thus justifying the observed prediction differences

High similarity for predictions

Clusters 2 and 4 are similar in both structure and distribution of dihedral angles; however, HNN considers them different, which represents an outlier. See the “Secondary structure recognition” section for discussion.

High similarity for predictions

Like clusters 2 and 4, clusters 2 and 9 are structurally and energetically similar but are predicted differently by HNN. See the “Secondary structure recognition” section for discussion.

Low similarity for predictions

Part of cluster 3 is organized as β-sheet, similar to cluster 6

High similarity for predictions

Cluster 3 does not have a perfect poly-proline structure, while cluster 7 does, thus justifying the differences in the predictions

High similarity for predictions

See the explanation for clusters 3 and 7

The “Expectation” column contains the outcome expected from our visual comparison of the clustered structures, while the “Explanation” column justifies the deviation from that outcome, highlighting that our model detects subtle differences and similarities between structures in different clusters.