Extended Data Fig. 6: Sequence-based predictors of LLPS.
From: Expanding the molecular language of protein liquid–liquid phase separation

(a) Solvation free energy from Wolfenden et al.1 vs. saturation concentrations measured in this work. Each data point represents a unique variant used in this work. Variants differing by only one residue are connected by lines, drawn such that each mutation results in increasing saturation concentration. Dashed lines indicate mutations that follow the trend in which more favorable interaction with solvent leads to lower phase separation propensity, while solid lines indicate mutations that result in less favorable interaction with solvent and lower phase separation propensity. (b) Ratio of the phase separation propensity score for each sequence to the propensity score for WT, calculated using several online sequence-based predictors: DeePhase, PScore, PSpredictor, FuzDrop, LLPhysScore and catGRANULE. Experimental values are shown as black circles. All predictor values are normalized to the WT value to account for the different scales used by the predictors. In all cases, a normalized score above 1 means the sequence is predicted to undergo LLPS more readily than the WT, while a score below 1 indicates a lower propensity to undergo LLPS than the WT. Experimental values are calculated from the saturation concentrations (Csat) measured at 37 °C and are represented as Csat of WT divided by Csat of the variant, such that here too a value above 1 indicates greater phase separation propensity than WT, whereas a value below 1 indicates lower phase separation propensity. (c) Correlation between experimental values and predictor results. All data sets are normalized to the range 0 to 1. Symbols for the predictors are the same as in (b).
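
The normalizations described for panels (b) and (c) can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names, variant labels, and all numerical values are hypothetical and chosen only to make the arithmetic easy to follow. Rescaling to the range 0 to 1 is assumed here to be a min-max rescaling, which the caption does not state explicitly.

```python
def experimental_ratio(csat):
    """Csat of WT divided by Csat of each variant (values above 1: higher propensity than WT)."""
    wt = csat["WT"]
    return {name: wt / value for name, value in csat.items()}


def predictor_ratio(scores):
    """Predictor score of each sequence divided by the predictor score of WT."""
    wt = scores["WT"]
    return {name: value / wt for name, value in scores.items()}


def min_max(values):
    """Rescale a list of numbers to the range 0 to 1 (assumed min-max rescaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant data set")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical saturation concentrations (arbitrary units), not measured data.
csat = {"WT": 2.0, "variant_A": 1.0, "variant_B": 4.0}
# Hypothetical raw scores from a single predictor, not real predictor output.
scores = {"WT": 0.5, "variant_A": 0.75, "variant_B": 0.25}

exp_rel = experimental_ratio(csat)    # panel (b), black circles
pred_rel = predictor_ratio(scores)    # panel (b), predictor symbols

names = list(csat)
exp_scaled = min_max([exp_rel[n] for n in names])    # panel (c), one axis
pred_scaled = min_max([pred_rel[n] for n in names])  # panel (c), other axis
```

In this toy example, variant_A has half the Csat of WT, so its experimental ratio is 2 (greater propensity than WT), while variant_B has twice the Csat of WT and a ratio of 0.5. Dividing by the WT value places the experimental data and every predictor on a common scale in which WT equals 1.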