Fig. 9: Validation.
From: Structure prediction of alternative protein conformations

a TM-score distributions for the validation set vs. training step (n = 307). The medians are marked by black lines and the upper and lower quartiles are in colour. b lDDT CA distributions vs. training step (n = 307). c Pearson correlations between lDDT and plDDT for the alpha carbons (CA, n = 307). The correlation coefficients (R) remain close to 0.9 throughout training. The medians are marked by black lines and the upper and lower quartiles are in colour. d Predicted structures for PDB ID 3BDL from the validation set at different training steps. The TM-scores are 0.59, 0.53 and 0.55 at 10,000, 20,000 and 30,000 steps, respectively. Visually, the predictions become worse after 10,000 training steps, although this is not apparent from the aggregate metrics in (a, b, c). This suggests that the network starts to overfit certain structures at an early stage.
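
The summary statistics in panels a to c can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes that each Pearson R in panel c is computed per structure from per-residue CA lDDT and plDDT values, and that the plotted boxes are the median and quartiles taken across structures at one training step. The function names `per_structure_pearson` and `summarize_distribution` are hypothetical.

```python
import numpy as np

def per_structure_pearson(lddt_ca, plddt_ca):
    """Pearson R between per-residue CA lDDT and plDDT for one structure.

    Assumption: both inputs are equal-length 1D sequences, one value per residue.
    """
    lddt_ca = np.asarray(lddt_ca, dtype=float)
    plddt_ca = np.asarray(plddt_ca, dtype=float)
    # np.corrcoef returns the 2x2 correlation matrix; the off-diagonal entry is R.
    return float(np.corrcoef(lddt_ca, plddt_ca)[0, 1])

def summarize_distribution(values):
    """Lower quartile, median and upper quartile of one score per structure.

    The same summary applies to TM-scores, mean lDDT values, or R values
    collected over the validation set at a given training step.
    """
    q1, median, q3 = np.percentile(np.asarray(values, dtype=float), [25, 50, 75])
    return {"q1": float(q1), "median": float(median), "q3": float(q3)}

if __name__ == "__main__":
    # Toy per-residue values for a single structure (not data from the paper).
    r = per_structure_pearson([0.9, 0.8, 0.7], [90.0, 80.0, 70.0])
    print(f"Pearson R: {r:.2f}")

    # Toy per-structure scores at one training step (not data from the paper).
    print(summarize_distribution([0.1, 0.2, 0.3, 0.4, 0.5]))
```

Calling `summarize_distribution` once per saved checkpoint would give the series of medians and quartiles that a plot against training step requires. Because these are aggregate statistics over the whole validation set, a decline in a single structure such as 3BDL can leave them largely unchanged.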