Fig. 6: Simulations show dominant misfolding mechanism is structurally-interdependent loss and gain of NCLEs.

a Overview of the temperature-quenching simulations: the all-atom crystal structure is coarse-grained (CG) using the \({\rm{C}}_{\alpha}\) model, unfolded at a temperature above the protein’s melting temperature, and then subjected to an instantaneous temperature quench to 310 K. Some trajectories fold properly; others adopt non-native misfolded states. Shown is the crystal structure of the protein encoded by gene P37747 (PDB 1I8T, Chain A). The native non-covalent lasso entanglement (NCLE) loop (red) and thread (blue) are shown in the final natively folded and misfolded states to exemplify changes in NCLE status. b Average misfolding propensity (Eq. 10) compared between the set of proteins predicted to be highly refoldable (low experimentally observed misfolding propensity, \(n=7\)) and the set of proteins predicted to be highly non-refoldable (high experimentally observed misfolding propensity, \(n=4\)). For each protein, 50 trajectories were simulated, using the frames from the last 200 ns. ***, **, and * indicate two-sided \(p\) values below the significance thresholds of 0.001, 0.01, and 0.05, respectively. c Example from the simulations of the loss of a native NCLE, showing the loop (red) and the thread that is lost upon misfolding (blue) (P31142, PDB 1URH, Chain A). d Example of the gain of a non-native NCLE in the protein encoded by gene Q46856 (PDB 1OJ7, Chain D). The misfolded structure shown is taken from the simulations. e Probability that misfolded structures contain only losses of native NCLE(s), only gains of non-native NCLE(s), or both types of change (Eq. 11). For each protein, 50 trajectories were simulated, using the frames from the last 200 ns. f Conditional probability that an observed unique loss of a native NCLE overlaps (is structurally interdependent) with a unique gain of a non-native NCLE in the same structure (Eq. 12). For each protein, 50 trajectories were simulated, using the frames from the last 200 ns.
Data are presented as mean values with 95% confidence intervals.