Extended Data Fig. 1: Ablation and analysis on μFormer components.
From: Accelerating protein engineering with fitness landscape modelling and reinforcement learning

a) Ablation study evaluating the importance of each component in μFormer. Each value is the change in performance, relative to the full model, after the indicated components are removed. Negative numbers (blue) indicate a loss of performance and positive numbers (red) indicate an improvement. The last row displays the average performance change over 9 proteins. The plus/minus signs at the bottom indicate the presence/removal of the corresponding component. b) Spearman ρ of μFormer, ECNet, and their variants on 3 FLIP GB1 datasets. ECNet w/ μFormer encoder replaces the language model in ECNet with μFormer’s language model. μFormer-S (Methods) is a variant with a model size similar to ECNet. 1-vs-rest: a train-test split where single-point mutants are used for training, and multi-point mutants are reserved for testing. 2-vs-rest: a train-test split where single- and double-point mutants are used for training, and all higher-order mutants are reserved for testing. 3-vs-rest: a train-test split where single-, double-, and triple-point mutants are used for training, and all higher-order mutants are reserved for testing. See Supplementary Notes for details.
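The three splits in panel b follow one rule: variants with at most k mutations go to training and all higher-order variants go to testing. The Python sketch below illustrates that rule and is not the authors' or FLIP's code. The function names `count_mutations` and `k_vs_rest_split` are hypothetical, the toy sequences are made up, and the sketch assumes substitution-only variants of the same length as the wild type. Leaving the wild-type sequence out of both sets is also an assumption, since the caption does not say how it is handled.

```python
from typing import List, Tuple


def count_mutations(wild_type: str, variant: str) -> int:
    """Number of positions at which the variant differs from the wild type."""
    if len(wild_type) != len(variant):
        raise ValueError("sketch assumes substitution-only, equal-length variants")
    return sum(a != b for a, b in zip(wild_type, variant))


def k_vs_rest_split(
    wild_type: str, variants: List[str], k: int
) -> Tuple[List[str], List[str]]:
    """Train on variants with 1..k mutations; test on variants with more than k."""
    train, test = [], []
    for variant in variants:
        n = count_mutations(wild_type, variant)
        if n == 0:
            # Assumption: the wild type itself is left out of both sets.
            continue
        (train if n <= k else test).append(variant)
    return train, test


# Toy data (made up): 1, 2, 3 and 4 substitutions relative to the wild type.
wild_type = "MKTA"
variants = ["AKTA", "AKTC", "AKGC", "ACGC"]

train_1, test_1 = k_vs_rest_split(wild_type, variants, k=1)  # 1-vs-rest
train_2, test_2 = k_vs_rest_split(wild_type, variants, k=2)  # 2-vs-rest
train_3, test_3 = k_vs_rest_split(wild_type, variants, k=3)  # 3-vs-rest
```

With the toy data, 1-vs-rest trains on `AKTA` alone and tests on the other three variants, whereas 3-vs-rest leaves only the quadruple mutant `ACGC` for testing. The Spearman ρ reported in panel b would then be computed between predicted and measured fitness on the test set of each split.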