Extended Data Fig. 8: Comparisons between TopFit and other methods for mutation effect prediction using Spearman correlation.

a,b, Analogous to Fig. 4a,b, except that TopFit combines the VAE score, eUniRep embedding and PST embedding. All supervised models use 240 labeled training samples. Performance is evaluated by Spearman correlation ρ; for the DeepSequence VAE, the absolute value of ρ is used. The average ρ over n = 20 repeats is shown. The 34 datasets are categorized by the structure modality used: X-ray, nuclear magnetic resonance (NMR), AlphaFold (AF) and cryogenic electron microscopy (EM). a, Dot plots show results across the 34 datasets. b, Each dot plot shows a pairwise comparison between TopFit and one other method. The median difference in average Spearman correlation, Δρ, across all datasets is shown. A one-sided rank-sum test assesses whether TopFit outperforms the VAE score, eUniRep embedding and PST embedding, with P values of 3 × 10⁻⁷, 2 × 10⁻⁷ and 4 × 10⁻⁷, respectively.
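
The statistics named in this legend (Spearman ρ, the median difference Δρ and a one-sided rank-sum test) can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the per-dataset values are made up for illustration, and the rank-sum P value is assumed to come from the normal approximation without tie correction. In practice, `scipy.stats.spearmanr` and `scipy.stats.ranksums` would be the usual choice; the standard library is used here only to keep the example self-contained.

```python
from statistics import NormalDist, mean, median


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 0-based positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def rank_sum_p_greater(a, b):
    """One-sided rank-sum test (normal approximation, no tie correction).

    A small P value supports the alternative that values in `a`
    tend to be larger than values in `b`.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    w = sum(ranks[:n1])  # rank sum of the first sample
    expected = n1 * (n1 + n2 + 1) / 2
    sd = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (w - expected) / sd
    return 1 - NormalDist().cdf(z)


# Hypothetical per-dataset average rho values (illustrative only, not from the paper).
topfit = [0.80, 0.75, 0.70, 0.85, 0.78]
baseline = [0.60, 0.55, 0.65, 0.50, 0.62]

# Median of the per-dataset differences, as reported for each pairwise plot.
delta_rho = median(t - b for t, b in zip(topfit, baseline))

# One-sided test of whether the first method tends to score higher.
p_value = rank_sum_p_greater(topfit, baseline)

# For an unsupervised score whose sign is arbitrary, take the absolute value.
abs_rho = abs(spearman_rho([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]))
```

With only five toy datasets the P value is far larger than those in the legend, which are computed across all 34 datasets.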