Fig. 3: Evaluation of methods’ performance on predicting missense mutations that impact natural phase separation (PS) (‘Impact’ mutation). | Nature Communications


From: Decoding Missense Variants by Incorporating Phase Separation via Machine Learning


Abbreviated method names are DeePha (DeePhase), FuzDro (FuzDrop), and catGRA (catGRANULE). a Discriminative power of representative PS prediction methods for ‘Impact’ mutations against random ‘Background’ mutations, comparing absolute score changes pre- and post-mutation (P-values computed by a two-sided Mann–Whitney U test; left, ‘Impact’ mutations, n = 307; right, ‘Background’ mutations, n = 35,000; the boxplot components within each violin plot are, from top to bottom, maximum, upper quartile, median, lower quartile, and minimum). b Performance of representative PS prediction methods in discerning ‘Impact’ mutations from random ‘Background’ mutations (IP task). AUROC is based on the absolute score changes. c–h Model performance in identifying ‘Impact’ mutations. For leave-one-source-out (LOSO) evaluation, 50 replicates of subset sampling from the background dataset were used, and the average AUROC and the average area under the precision-recall curve (AUPR) were computed and visualized. For LOSO AUPR, data are presented as mean values ± SD (standard deviation), and the scatter points represent the distribution across background dataset sampling repeats. c, d Model performance in identifying ‘Impact’ mutations, evaluated using LOSO (left) and an independent test set (right), measured by AUROC (c) and AUPR (d). e, f A parallel evaluation similar to (c, d), but the ‘Background’ mutations were generated with the same IDR:domain ratio as the collected ‘Impact’ samples (weighted sampling). g, h A parallel evaluation similar to (c, d), but the ‘Background’ mutations were generated by matching the frequency of each mutation type to its frequency in the impact dataset (AA weighted sampling). Each mutation type was assigned a weight based on its number of occurrences in the impact dataset, with a minimum weight of 1 to ensure that all mutation types are considered.
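As an illustration of how an AUROC can be derived from absolute score changes (panel b), the sketch below uses the rank-based definition that links AUROC to the Mann–Whitney U statistic: the probability that a randomly chosen ‘Impact’ mutation has a larger absolute score change than a randomly chosen ‘Background’ mutation, with ties counted as one half. The function name and the toy values are hypothetical and are not taken from the article or its source data.

```python
def auroc_from_abs_changes(impact_deltas, background_deltas):
    """AUROC of |score change| for separating 'Impact' from 'Background'.

    Equals U / (n_impact * n_background), where U is the Mann-Whitney
    U statistic of the 'Impact' group; ties contribute 0.5.
    """
    pos = [abs(d) for d in impact_deltas]
    neg = [abs(d) for d in background_deltas]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy (made-up) score changes: post-mutation score minus pre-mutation score.
impact = [0.4, -0.3, 0.05]
background = [0.1, -0.02, 0.0, 0.2]
auroc = auroc_from_abs_changes(impact, background)
```

The pairwise loop is quadratic in the number of mutations, which is fine for a toy example; with sample sizes like those in panel a, a rank-based routine from a statistics library would be the more practical choice.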
Source data are provided as a Source Data file.
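The AA weighted sampling described for panels g and h can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a mutation type is a (reference, alternate) amino-acid pair, that types absent from the impact dataset receive the minimum weight of 1, and that background mutation types are drawn with replacement in proportion to these weights. The function names, the seed, and the toy input are hypothetical.

```python
import random
from collections import Counter
from itertools import permutations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def mutation_type_weights(impact_mutations):
    """Weight each (ref, alt) type by its count in the impact set, minimum 1."""
    counts = Counter(impact_mutations)
    return {
        (ref, alt): max(1, counts.get((ref, alt), 0))
        for ref, alt in permutations(AMINO_ACIDS, 2)
    }


def sample_background_types(weights, k, seed=0):
    """Draw k mutation types with probability proportional to their weight."""
    rng = random.Random(seed)
    types = list(weights)
    return rng.choices(types, weights=[weights[t] for t in types], k=k)


# Toy (made-up) impact set: two R>G substitutions and one P>L substitution.
weights = mutation_type_weights([("R", "G"), ("R", "G"), ("P", "L")])
sampled = sample_background_types(weights, k=5)
```

The minimum weight keeps every one of the 380 possible substitution types in the sampling pool, so types never observed among the ‘Impact’ mutations can still appear in the background set, only less often.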
