Extended Data Fig. 1: The predicted protein structures are well-structured, but small orthogroup refinements are required. | Nature

From: The role of metabolism in shaping enzyme structures over 400 million years

a) Mean pLDDT score, as calculated by AlphaFold2, versus the relative, length-normalised amino acid index for all predicted structures. Because protein lengths vary, positions are divided into 100 bins. Error bars denote the standard deviation and are coloured by the coefficient of variation. b, c) Structural alignment-based orthogroup refinement, illustrated on orthogroup 1022. b) Hierarchical clustering of structures using US-align scores. The two resulting clusters are highlighted. Cluster 1 (blue) contains the reference structure S. cerevisiae Gre3p (UniProt: P38715) and is named OG1022_REF_Scer_AF-P38715-F1-model; cluster 2 (red) contains Nit3p (UniProt: P49954) and is named OG1022_REF_Scer_AF-P49954-F1-model. c) Phylogenetic tree based on the US-align structural alignment of the entire original orthogroup 1022. The clusters obtained by hierarchical clustering correspond to structures that group together in the protein tree. d) Mean conservation ratio (CR) versus mean amino acid type CR. The dotted line denotes the identity line; other lines as in e). e) Mean CR versus variability of the octanol-water partition coefficient (KOW). The solid line is the best linear fit and the dashed line denotes the axis median. f) Example of a structural alignment with respect to the reference structure. The bottom panel additionally depicts other features: pLDDT, solvent-accessible surface area (SASA), secondary structure prediction (DSSP), fully conserved residues and fully conserved amino acid types. Residues are coloured by amino acid. Light grey indicates that no residue could be aligned to the reference. Dark grey indicates agreement with the reference structure for the respective amino acid; the darker the grey, the higher the conservation ratio. g) Maximum-normalised violin plots of the per-residue pLDDT distributions, grouped into residues that could be mapped (n = 5,527,262) and residues that could not be mapped (n = 1,454,754).
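The length-normalised binning described for panel a can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `binned_plddt_profile`, the mid-point convention for the relative residue index, and the use of the population standard deviation are all assumptions made here, since the legend specifies only that 100 bins are used and that the mean, standard deviation and coefficient of variation are shown.

```python
import numpy as np


def binned_plddt_profile(plddt_per_protein, n_bins=100):
    """Pool per-residue pLDDT values from proteins of different lengths
    into n_bins bins along the relative (length-normalised) position.

    Hypothetical helper: returns (mean, std, cv) arrays of length n_bins.
    Bins that receive no residues are reported as NaN.
    """
    rel_pos, scores = [], []
    for plddt in plddt_per_protein:
        plddt = np.asarray(plddt, dtype=float)
        n = plddt.size
        if n == 0:
            continue
        # Assumed convention: residue i of n sits at (i + 0.5) / n in (0, 1).
        rel_pos.append((np.arange(n) + 0.5) / n)
        scores.append(plddt)

    mean = np.full(n_bins, np.nan)
    std = np.full(n_bins, np.nan)
    if not scores:
        return mean, std, np.full(n_bins, np.nan)

    rel_pos = np.concatenate(rel_pos)
    scores = np.concatenate(scores)
    # Map each relative position to a bin index in 0 .. n_bins - 1.
    bin_idx = np.minimum((rel_pos * n_bins).astype(int), n_bins - 1)

    for b in range(n_bins):
        vals = scores[bin_idx == b]
        if vals.size:
            mean[b] = vals.mean()
            std[b] = vals.std()  # population SD (assumption)

    # Coefficient of variation = SD / mean, guarded against zero means.
    with np.errstate(divide="ignore", invalid="ignore"):
        cv = std / mean
    return mean, std, cv


if __name__ == "__main__":
    # Toy input: two made-up "proteins" of different lengths.
    toy = [[90.0, 50.0], [70.0, 70.0, 30.0, 30.0]]
    mean, std, cv = binned_plddt_profile(toy, n_bins=2)
    print(mean, std, cv)
```

Normalising by length before binning lets short and long proteins contribute to the same relative positions, for example N-terminal versus C-terminal regions, which is the stated reason for the binning in the legend.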
