Fig. 5: Self-supervision improves the DiffNet’s ability to organize structural configurations based on their biochemical property.

Histogram showing DiffNet output labels across all simulation frames from M182T and M182S (red – highly stable variants in training set) versus WT and M182V (gray – less stable variants in training set) for a supervised autoencoder (a) and a self-supervised autoencoder (b). c Three key hydrogen bond lengths in helix 9 as a function of the DiffNet output label (n = 1,300,420) (yellow – supervised, black – self-supervised), which ranges from zero for structures associated with low stability to one for structures associated with high stability. The distances are between the carbonyl carbon of the ith residue and the nitrogen of the (i + 4)th residue. Standard error bars are not visible since the standard error is smaller than scatter points.