Fig. 5: Context-specific effect of the F456L substitution.
From: A protein language model for exploring viral fitness landscapes

a Examples of convergent acquisitions of specific substitutions. Nodes indicate acquisition events, and node color denotes the fitness gain at each event. Branch color denotes the presence (gray) or absence (light gray) of the specific substitution in the reconstructed ancestral S protein sequences. b Fitness gain upon F456L in each backbone S protein sequence, inferred by in silico mutational scanning using CoVFit. Variants with available DMS data (shown in (d)) were included in this analysis. c Site-wise immune escape score for the ancestral D614G strain, BA.2, and XBB variants, estimated by mAb escape estimator37 based on Cao’s DMS data24. The five sites with the highest escape scores are annotated. d Effect of F456L on the S protein’s expression (stability) and ACE2-binding affinity, extracted from publicly available DMS data from Taylor and Starr36. Dot color indicates the inferred fitness gain shown in (b). Higher values indicate higher expression and higher ACE2-binding affinity. Source data are provided as a Source Data file.
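
The in silico mutational scanning behind panel b can be illustrated with a minimal sketch. It assumes that fitness gain is the predicted fitness of the mutant sequence minus that of its backbone; the caption does not define the quantity, so this is an assumption. The names `apply_substitution`, `fitness_gain`, and `toy_predict` are hypothetical, and `toy_predict` is a placeholder scorer, not the CoVFit interface.

```python
from typing import Callable


def apply_substitution(seq: str, sub: str) -> str:
    """Apply a substitution written like 'F456L' (1-based position) to seq."""
    ref, pos, alt = sub[0], int(sub[1:-1]), sub[-1]
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[: pos - 1] + alt + seq[pos:]


def fitness_gain(seq: str, sub: str, predict: Callable[[str], float]) -> float:
    """Assumed definition: predicted fitness of mutant minus that of backbone."""
    return predict(apply_substitution(seq, sub)) - predict(seq)


def toy_predict(seq: str) -> float:
    """Hypothetical stand-in for a fitness model; NOT CoVFit."""
    return 0.5 * seq.count("L")


# Toy backbone with F at position 456 (not a real S protein sequence).
backbone = "A" * 455 + "F" + "A" * 10
gain = fitness_gain(backbone, "F456L", toy_predict)
```

Repeating the call over several backbone sequences with the same substitution gives one gain per backbone, which is the kind of comparison panel b presents.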