Fig. 3: Ensemble generation strategy.
From: Accurate prediction of protein assembly structure by combining AlphaFold and symmetrical docking

a AF predictions are filtered based on their pLDDT scores, and AFM predictions are filtered based on both their pLDDT and ipTM+pTM scores (AFM's model confidence score: 0.8 · ipTM + 0.2 · pTM), to generate an initial ensemble (red box). b To determine how many residues to trim at the N- and C-termini, the initial ensemble is used collectively to compute per-residue averages of three metrics: connectivity (connectivity %), secondary structure propensity (SS %) and residue pLDDT (Avg. pLDDT). Threshold values are set for the metrics (dotted horizontal lines: 90 for Avg. pLDDT, 70% for connectivity %, and 70% for SS %), and residues are removed from the termini until all of the metrics exceed their respective thresholds (indicated by the red lines). c The final ensemble, created by removing residues as described in b, contains subunits of equal sequence length for the target structure (green box). d Distribution of ensemble RMSD to the monomeric native structure for all benchmark structures, for both local assembly (lighter color) and global assembly (darker color) (n for global/local assembly: 2ZY2: 53/61; 2QQY: 71/79; 3LEO: 121/62; 7Q03: 38/55; 6M8V: 54/57; 6HSB: 42/58; 4DCL: 29/75; 2CC9: 45/57; 4CY9: 54/57; 3WIS: 52/50; 5EKW: 120/60; 5H46: 105/53; 3N1I: 50/56; 6H05: 80/62; 7O63: 88/56; 7OHF: 46/73; 3BXV: 33/50; 7PF9: 37/52; 1HQK: 30/57; 1T0T: 44/54; 4RFT: 1/5; 4V4M: 18/54; 1JH5: 48/59; 6ZLO: 51/50; 1X36: 22/56; 2WQT: 30/58; 7B3Y: 53/61). The “x” symbol shows the mean. Below each subfigure, two examples (underlined) of the structural ensembles superimposed for the local (left) and global (right) assembly are shown. Source data are provided in the Source Data file.
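
The scoring in a and the trimming rule in b can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `afm_confidence` and `trim_bounds` are hypothetical, and two details are assumptions because the caption does not specify them. The first is that a residue passes when each metric is greater than or equal to its threshold. The second is that trimming stops at the first residue from each terminus where all three metrics pass.

```python
from typing import Sequence, Tuple


def afm_confidence(iptm: float, ptm: float) -> float:
    """AFM model confidence as stated in the caption: 0.8 * ipTM + 0.2 * pTM."""
    return 0.8 * iptm + 0.2 * ptm


def trim_bounds(
    avg_plddt: Sequence[float],
    connectivity_pct: Sequence[float],
    ss_pct: Sequence[float],
    plddt_min: float = 90.0,         # caption threshold for Avg. pLDDT
    connectivity_min: float = 70.0,  # caption threshold for connectivity %
    ss_min: float = 70.0,            # caption threshold for SS %
) -> Tuple[int, int]:
    """Return a half-open (start, end) range of residue indices to keep.

    Assumption: residues are stripped from each terminus until the first
    residue at which all three per-residue averages meet their thresholds.
    """
    n = len(avg_plddt)
    if len(connectivity_pct) != n or len(ss_pct) != n:
        raise ValueError("all metric arrays must have the same length")

    def passes(i: int) -> bool:
        return (
            avg_plddt[i] >= plddt_min
            and connectivity_pct[i] >= connectivity_min
            and ss_pct[i] >= ss_min
        )

    # Trim from the N-terminus.
    start = 0
    while start < n and not passes(start):
        start += 1

    # Trim from the C-terminus, never crossing the N-terminal cut.
    end = n
    while end > start and not passes(end - 1):
        end -= 1

    return start, end


if __name__ == "__main__":
    # Toy per-residue averages for a five-residue subunit (made-up values).
    plddt = [50.0, 95.0, 96.0, 97.0, 60.0]
    conn = [80.0, 80.0, 90.0, 90.0, 90.0]
    ss = [10.0, 75.0, 80.0, 80.0, 80.0]
    print(trim_bounds(plddt, conn, ss))
```

Applying the same `(start, end)` range to every model in the initial ensemble would give subunits of equal sequence length, as described in c. If no residue passes, the sketch returns an empty range.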