Fig. 3: Comparison of three refolded structures (left) and the respective model-designed sequences (right) for proteins with PDB IDs 1NI8, 2HKY and 2P0X.
From: Mask-prior-guided denoising diffusion improves inverse protein folding

a, Refolded tertiary structures of the sequences designed by three models: MapDiff (red), GRADE-IF (orange) and LM-Design (blue). The refolded structures were generated by AlphaFold2 and superposed on the ground-truth structures (purple). For each model and structure, the recovery rate and RMSD value are indicated for foldability comparison. b, Alignment of the three native sequences and the respective model-designed sequences. Secondary structure elements are marked below each sequence: α-helices are shown as red cylinders, β-strands as blue arrows, and loops and disordered regions are unmarked. For the native proteins, the secondary structures were derived from their source PDB files. For the predicted proteins, the secondary structures were assigned by first identifying all interbackbone hydrogen bonds and then searching for hydrogen-bonding patterns that represent helices and strands. The refolded structures and sequence alignments were visualized using the Schrödinger Maestro software (ref. 58). c, Recovery rates for loops and disordered regions (left panel) and α-helix and β-strand regions (right panel) across the three structures. Bars indicate the recovery rates of the three methods (MapDiff, GRADE-IF and LM-Design). The percentage composition of regions for each structure is provided below the panel titles. MapDiff consistently achieves the highest recovery rates across the different categories of regions for the three structures, with an average improvement of 5.1% in loops and disordered regions and 13.4% in α-helix and β-strand regions compared with GRADE-IF. d, Jaccard region intersections between the predicted and ground-truth structures for loops and disordered regions (left panel) versus α-helix and β-strand regions (right panel). The Jaccard index measures the overlap between two sets as the size of their intersection divided by the size of their union, and the results show that MapDiff achieves the highest score for both categories of regions.
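The two quantities reported in panels c and d can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that secondary structure is available as a per-residue string ('H' for helix, 'E' for strand, 'L' for loop or disordered) and that the Jaccard index is computed over sets of residue positions assigned to a region category. The function names, the label alphabet and the toy sequences are hypothetical and chosen only for illustration.

```python
def region_positions(ss, labels):
    """Return the set of residue indices whose secondary-structure label is in `labels`."""
    return {i for i, s in enumerate(ss) if s in labels}


def recovery_rate(native, designed, positions=None):
    """Fraction of positions where the designed residue equals the native residue.

    If `positions` is given, only those indices are scored (for example,
    only loop residues or only helix/strand residues).
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    idx = list(range(len(native))) if positions is None else list(positions)
    if not idx:
        return 0.0  # assumption: an empty region scores zero
    matches = sum(native[i] == designed[i] for i in idx)
    return matches / len(idx)


def jaccard_index(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections of residue positions."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumption: two empty sets are treated as identical
    return len(a & b) / len(union)


# Toy example (made-up sequences and labels, not data from the figure).
native_seq = "MKTAYIAK"
designed_seq = "MKSAYLAK"
native_ss = "LHHHHEEL"     # labels for the ground-truth structure
predicted_ss = "LLHHHEEL"  # labels for the refolded structure

structured_native = region_positions(native_ss, "HE")
structured_pred = region_positions(predicted_ss, "HE")

overall = recovery_rate(native_seq, designed_seq)
structured = recovery_rate(native_seq, designed_seq, structured_native)
overlap = jaccard_index(structured_native, structured_pred)
```

Here `overall` is the whole-sequence recovery rate, `structured` restricts the comparison to native helix and strand positions, and `overlap` compares where helix and strand residues fall in the native and refolded structures. Passing `"L"` instead of `"HE"` gives the corresponding values for loops and disordered regions.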