Fig. 5: Large scale iterative structure predictions for 7418 proteins curated from the PDB.
From: Rapid estimation of protein folding pathways from sequence alone using AlphaFold2

A Structural-feature-based t-SNE embedding colored by four properties: the ratio of β-sheet (left) and α-helix (middle left) content to the sum of both, sequences that AlphaFold2-ab initio folds to within 3 Å of native (middle right), and the positions of SH3-like (red) and ubiquitin-like (black) proteins (right). B Histogram (blue): probability density of sequence length for all selected sequences. Lines: the percentage of proteins folded into a structure less than 3 Å from native decreases with sequence length for predictions by both model 1 (red) and model 2 (gray). C Comparison of models 1 and 2 in terms of the lowest RMSD to the native structure across all iterative predictions. D Distribution of the percentage of each secondary structure type across all selected structures. E Percentage of proteins that fold into a structure less than 3 Å from native versus the amount of each secondary structure type. F, G The lowest RMSD from iterative predictions for SH3-like (F) and ubiquitin-like (G) proteins by model 1 (red) and model 2 (gray). Source data are provided as a Source Data file.