Table 2 Key factors that can influence the interpretation of structural IDR data
Deviation type | Deviation description | Example | Ref. |
|---|---|---|---|
Deviations from the canonical protein sequence | Definition of the construct used in the experiment | ||
Post-translational modifications (PTMs) (Covalent modification of a residue side chain) | PTMs can change the physicochemical properties of a sequence and thereby alter the structural state, compaction or dynamics of an IDR. The structures of several IDRs have been shown to be modulated by the addition or removal of a PTM. Studies aimed at investigating these mechanisms will characterize modified proteoforms to understand the structural changes. | 4E-BP2 folds into a four-stranded beta structure upon phosphorylation of residues T37 and T46. | |
Substitutions, insertions and deletions (Replacement, addition or removal of residues of the canonical protein sequence) | Substitutions, insertions and deletions can affect local and global physicochemical properties of a region (for example, the charge, hydrophobicity, interaction capacity and size), potentially affecting the structural properties of a protein. Studies altering the protein sequence can enable the testing of the effect of indels, polymorphisms or disease variants, certain PTMs (such as phosphomimetics) or isoforms (by addition or removal of an exon). | A p.F82K substitution in ferricytochrome c induces localized unfolding of a distal site in the ferric state. | |
Tags and labels (Covalent attachment of an entity that enables analysis, identification, purification or solubility of the protein) | Tags and labels can have a measurable influence on the dynamics and stability of the protein they are attached to. The addition of tags is almost always a technical necessity, and its aim is not to measure a biological phenomenon. Tags play three major roles in IDR experiments: (1) purification (for example, FLAG tag), (2) solubility (for example, maltose-binding protein) and (3) experimental readout (for example, fluorescent tags for fluorescence microscopy or paramagnetic tags for NMR). | The addition of a His tag influences the short-time-scale (picosecond) dynamics of myoglobin. | |
Proteolytic cleavage (Cleavage of the protein chain induced by a protease) | Cleavage can disrupt both local structural elements and long-range contacts by increasing the distance between residue pairs. Cleavage also introduces new N and C termini in the protein chain, changing the polarity, solubility and interaction capacity of regions. Many proteins, especially extracellular proteins, are known to undergo cleavage, often in many successive steps. Cleavage products can be created in response to signaling events and often have very different biological activity, interaction capacity and structural states. | Cleavage of the disordered osteopontin removes long-range intramolecular interactions, changing the structural state and the accessibility of the integrin-binding site. | |
Experimental parameters | Parameters of the experimental setup for a sample | ||
pH (pH of the sample) | The pH can affect the strength of ionic and hydrogen bonds and so can modulate the structural state of a protein (ref. 32). Experimental parameters are often tuned to find the optimal conditions for the study of a specific protein, sometimes resulting in the use of non-physiological pH. Furthermore, comparison of a physiological state with a non-physiological pH state can be used to probe the structural properties of the region of interest, for example, by forcing the complete unfolding of a construct with harsh experimental conditions to allow comparison to a ‘ground state’. | NhaA, a sodium proton antiporter of the inner membrane of Escherichia coli, is activated at pH values between 6 and 7, with a maximal activity at pH 8.5, and is inactivated by acidic pH. | |
Temperature (Temperature of the sample) | The temperature has an explicit role in determining the strength of entropic terms in the Gibbs free energy that controls the stability of protein structures and complexes. Thus, changing the temperature can drastically change the stability of folded proteins and the dynamics of IDRs. Changing the temperature of a protein sample in an experiment can serve to explore its folding or unfolding kinetics, stability and oligomerization. For calorimetric techniques, such as differential scanning calorimetry, temperature regulation is what provides the measurable signal. For certain experiments, such as NMR, the temperature is changed for technical reasons to improve the signal-to-noise ratio. | Hsp26 becomes active with increased temperature in a two-step mechanism that first activates the protein and then unfolds it. | |
Pressure (Hydrostatic pressure of the protein sample) | High hydrostatic pressure (HHP) can induce unfolding by breakage of intramolecular interactions and exposure of cavities allowing binding of water. HHP is used to study the structure of partially structured intermediate transition states and the monomeric forms of oligomeric and aggregated proteins. | The 1D 1H NMR spectra support the proposed molten-globule state of Arc repressor under high pressure; moreover, the 1H NMR spectra at a pressure range of 3.5–5 kbar are substantially different from those of the native state (1 bar, 20 °C) and the fully denatured state (1 bar, 70 °C). | |
Force (Mechanical force applied to the protein) | Opposing forces applied to different parts of the protein can mechanically unfold the structure (either partially or completely), converting mechanical signals into biochemical ones. The information most typically obtained is the number of steps in which a protein unfolds (reflecting the number of domains or intermediate structural states) and the force required for unfolding. For proteins undergoing force-induced unfolding in biological settings, these measurements explore their biological function. Atomic force microscopy and high-speed force spectroscopy are used to assess the stability and the folding and unfolding kinetics of proteins. | Mechanical unfolding of TTN-1 and twitchin of Caenorhabditis elegans affects the auto-inhibitory region and the catalytic core of the protein. | |
Redox potential (Redox potential of the sample) | The redox potential affects the behavior of residues, especially that of cysteine. Under oxidizing conditions, cysteines can form disulfide bridges; under reducing conditions, they can coordinate cations. The redox potential is often tuned to find the optimal experimental conditions for the study of a specific protein. Various cellular compartments have drastically different redox potentials (for example, the extracellular space is oxidizing, whereas the cytoplasm is reducing); thus, changing the redox potential in a sample can model various compartments or the transport between them. | The nuclear export signal (NES) of Yap1 is masked by a structured domain held together by disulfide bridges in the oxidized state. In reducing conditions, the domain unfolds, and the NES becomes exposed and functional. | |
Light (Irradiating the protein with visible, UV or infrared light) | Many light-sensitive proteins contain additional chromophores that can undergo structural changes (most often cis–trans isomerization) that consequently alter the structure and/or dynamics of the protein that they are embedded in. Folding or unfolding of photosensitive proteins in response to light is studied by altering the illumination conditions. | Light-induced unfolding of the water-soluble photoactive yellow protein (PYP) allows it to become functionally active and bind partners. | |
Protein concentration (Concentration of the protein being tested in the sample) | Increased protein concentration can promote aggregation, liquid–liquid phase separation and liquid-to-solid phase transition. Consequently, the structural state of an IDR can be concentration dependent. The solubility limit defines the concentration up to which molecules are miscible in solution. If the protein concentration increases beyond that limit, the macromolecule–macromolecule interactions become energetically more favorable than the macromolecule–solvent interactions. | Several phase-separation drivers (for example, FUS and hnRNPA1) can undergo percolation or liquid–liquid phase separation in a concentration-dependent manner. | |
Protein source (Details of the protein purification) | An important element of the experimental setup is how the protein was generated, because its prior history may have a significant effect on its structural state by determining the exact proteoform, including post-translational modifications, partial proteolysis and so on. Best practice is to check the final proteoform used in the structural studies, either by mass spectrometry or, if possible, by the structural experimental method itself (such as NMR structure determination). | Important information includes the cell type in which the protein was expressed (for example, E. coli, yeast, insect cells such as Sf9 or human cells such as HEK-293; not the source genome in which the protein is encoded), the method of extraction (for example, sonication) and subsequent purification, especially if it included an intermittent heat treatment and the application of agents for solubilization and/or denaturation (for example, Tween-20 and urea), protease inhibitors and/or reducing agents. | |
Computational parameters (Details of the parameters used in computational processing) | Complex processing of experimental data is commonly required for data interpretation in the IDP field. Any software used and computational parameters that can influence the results should be described. | The interpretation of the results of raw data post-processing, residue-specific intrinsic disorder prediction, molecular dynamics and integrative structural studies all rely heavily on the software and parameterization that are used. | |
Experimental sample components | Components added to the sample that are required for technical aspects of the experiment | ||
Crowding agents (Addition of crowding agents to a sample to mimic the molecular concentrations found in cells) | Quinary interactions can have a strong effect on both the structural properties and interactions of a protein. Consequently, proteins behave differently in different contexts: for example, in the cell, in high concentrations of crowding agents and in a buffer. Few experiments have been performed to probe the effect of crowding on structure and interactions; however, the limited data available have suggested that the contribution can be significant, and that it is largely protein specific. Biophysical measurements taken in vitro may not reflect the actual dynamics in the cellular milieu; consequently, the crowding agents are added to partially mitigate biases introduced by the non-physiological conditions. | Experiments studying the effects of a range of crowding agents at different concentrations on IDRs from PUMA, Ash1, E1A and p53 reveal that the induced structural changes depend on both protein sequence and the crowding agent used. | |
Solubility agents (High ionic strength, amino acids, organic solvent) | Solubility agents (or hydrotropic agents) are typically small molecules that have both a hydrophobic and a hydrophilic region, and can increase the solubility of proteins by shielding their local hydrophobic regions from the solvent. Molecules added to a sample in a structural analysis to improve the solubility of the protein to be studied may alter its structural state. | Ionic strength and glycerol are used to mirror protein charges or increased repulsions, respectively. Both experimental components are used to keep proteins stable in solution. | |
Folding/unfolding agents (Small molecules, organic solvents, high salt or non-ionic detergents) | Folding and unfolding agents constitute a diverse set of molecules used in the structural characterization of an IDR. They are used to modulate the structural state of a protein by shifting it towards either a folded or an unfolded state. This state is then used as a reference with known properties that can be compared with other states, helping to understand a structural property of the region under investigation. | Several cosolvents are used to perturb protein stability: guanidine hydrochloride (GdnHCl) and urea are used to denature or partially unfold proteins, whereas hexafluoroisopropanol (HFIP) and trifluoroethanol (TFE) induce secondary structure formation. | |
Preservatives (Protease inhibitors, chelating agents and sodium azide) | Protease inhibitors, chelating agents and sodium azide are often used to improve the overall stability of samples (for example, against proteolysis) and might have an impact on the protein’s behavior. | ||
Biological background (Cell lysate, cell extract or in-cell sample) | IDPs are increasingly being investigated in biological backgrounds rather than in vitro. For example, isotopically labeled samples can be specifically studied by NMR in cell lysates, cell extracts (nuclear or cytoplasmic) or even in cells or organelles. Fluorescently labeled proteins can also be studied in cells. | A range of cell lines, cell extracts and organelles have been used to characterize IDPs in their microenvironments. However, specific information, for example on the amount of sample inside cells or on any manipulation of the cells by genetic engineering or drugs, should be reported. Proper controls for intracellular pH and crowding should be provided for these data to be comparable. | |
Biological sample components | Components added to the sample that are directly related to the biological hypothesis being tested | ||
Binding partners (Known or predicted binding partners or ligands) | Binding of an interaction partner, including ions, small molecules, proteins, nucleic acids or lipids/membranes, can modulate the dynamics, compaction or secondary and tertiary structure of an IDR. Many disordered regions will form distinct conformations in the presence of a specific binding partner. These conformational changes can be drastic, shifting the protein from disordered to highly ordered, or to partially ordered with large amounts of residual disorder. In all cases they result in a shift in the sampled conformations. | ||
• Proteins | | In isolation, p27 is disordered with nascent secondary structure. Upon binding to Cdk2–cyclin A complex, p27 becomes ordered. | |
• Nucleic acids | | In isolation, HMG-1 is intrinsically disordered; however, upon binding to DNA the protein becomes ordered and adopts a well-defined conformation in the minor groove. | |
• Lipids or membranes | | The intrinsically disordered N-terminal region of Hsp12 adopts a folded conformation comprising four α-helices upon micelle binding. | |
• Small molecules | | The dynamic KIX domain of the coactivator CBP/p300 can be stabilized by the addition of a small molecule. | |
Cofactors (Metal ions, iron-sulfur (Fe-S) clusters or organic cofactors (vitamins and their derivatives or fatty acids)) | Cofactors acting as cell state signals can heavily modulate the behavior of an IDP. Observed effects include folding and unfolding in the presence or absence of specific metal ions, protein aggregation by negatively charged cofactors compensating positively charged repeat regions, and induction of liquid–liquid phase separation. | The calcium-binding Repeat-in-ToXin (RTX) of Bordetella pertussis adenylate cyclase toxin (CyaA) is disordered in the absence of calcium but folds upon calcium binding. This region acts as a switch integrating the differing calcium concentrations between the extracellular and intracellular environment. | |
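
The four groups of factors in Table 2 could be captured as a structured record stored alongside each measurement, so that unreported factors are easy to spot. The following is a minimal sketch only: the class names, field names and the `unreported_parameters` helper are hypothetical and do not correspond to any published schema. The example values (1 bar, 20 °C) are those quoted in the table for the native state of Arc repressor; all other fields are deliberately left empty.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class ConstructDeviations:
    """Deviations from the canonical protein sequence (hypothetical fields)."""
    ptms: List[str] = field(default_factory=list)
    sequence_changes: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    cleavage_sites: List[str] = field(default_factory=list)


@dataclass
class ExperimentalParameters:
    """Parameters of the experimental setup; None means 'not reported'."""
    ph: Optional[float] = None
    temperature_celsius: Optional[float] = None
    pressure_bar: Optional[float] = None
    redox_state: Optional[str] = None
    protein_concentration_um: Optional[float] = None
    expression_host: Optional[str] = None


@dataclass
class SampleRecord:
    """One sample description, grouped like the table sections."""
    protein: str
    construct: ConstructDeviations = field(default_factory=ConstructDeviations)
    parameters: ExperimentalParameters = field(default_factory=ExperimentalParameters)
    # Technical additives, e.g. crowding agents, denaturants, preservatives.
    technical_components: List[str] = field(default_factory=list)
    # Hypothesis-related additives, e.g. binding partners, cofactors.
    biological_components: List[str] = field(default_factory=list)


def unreported_parameters(record: SampleRecord) -> List[str]:
    """Return the names of experimental parameters that were left unset."""
    return [
        f.name
        for f in fields(record.parameters)
        if getattr(record.parameters, f.name) is None
    ]


# Native-state conditions quoted in the table for Arc repressor.
record = SampleRecord(
    protein="Arc repressor",
    parameters=ExperimentalParameters(temperature_celsius=20.0, pressure_bar=1.0),
)

print(unreported_parameters(record))
```

Keeping "not reported" distinct from a default value (here via `None`) is a design choice of this sketch: it avoids silently assuming physiological conditions when a parameter was simply not recorded.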