Fig. 3: Protein language model used for engineering proteins with identified mutations.
From: Integrating protein language models and automatic biofoundry for enhanced protein evolution

a Schematic illustrating the application of a PLM to sample informative mutants at one mutation site, assuming that four amino acids are selected. b Flow chart illustrating the process of PLMeAE Module II. FP, fitness predictor. c Evaluation of various ESM models on the GB1 dataset. d Performance of PLMeAE Module II tested on the GB1 dataset. The violin plots show the distribution of fitness values of the protein variants. Inside each violin, a box-and-whisker diagram is included, where the whiskers represent the minimum and maximum values, the box spans from the first to the third quartile, the central solid line marks the median and the dashed line represents the mean. Source data are provided as a Source Data file.
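As a reading aid for panel d, the quantities drawn inside each violin can be computed as in the minimal sketch below. The function name `box_summary`, the example values, and the choice of the "inclusive" quartile method are assumptions made for illustration; the caption does not state which quartile convention or plotting code was used.

```python
import statistics


def box_summary(values):
    """Summarize one group of fitness values the way the caption describes.

    Hypothetical helper: whiskers are the minimum and maximum, the box spans
    the first to third quartile, and both the median and mean are reported.
    """
    data = sorted(values)
    # Quartile conventions differ between tools; "inclusive" is an assumption.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],          # lower whisker
        "q1": q1,                # bottom of the box
        "median": median,        # central solid line
        "q3": q3,                # top of the box
        "max": data[-1],         # upper whisker
        "mean": statistics.fmean(data),  # dashed line
    }


# Made-up fitness values, not taken from the GB1 dataset.
example = box_summary([0.0, 1.0, 2.0, 3.0, 4.0])
```

Because the whiskers here extend to the extremes rather than to 1.5 times the interquartile range, no points are treated as outliers in this style of plot.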