Fig. 3
From: Medium-sized protein language models perform well at transfer learning on realistic datasets

Impact of model size on transfer learning. (A) LassoCV regression results using mean embeddings from different pLMs for 36 DMS datasets. (B) LassoCV regression results using mean embeddings from different pLMs for 12 targets derived from the PISCES dataset. The y-axis shows the \(R^2\) score averaged over 5-fold cross-validation for each dataset or target. The x-axis shows the number of parameters in each model, with model families distinguished by color: ESM1v (black), ESM-2 (blue), ESM C (green), and AMPLIFY (orange).
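
As a rough illustration of how a single point in such a plot could be computed, the following sketch mean-pools per-residue embeddings and scores a LassoCV regressor by the average \(R^2\) over 5 folds, using scikit-learn. It is not the authors' pipeline: the synthetic embeddings and targets, the helper names `mean_pool` and `mean_cv_r2`, the shuffled outer split, and the inner `cv=3` used to select the Lasso penalty are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold, cross_val_score


def mean_pool(residue_embeddings: np.ndarray) -> np.ndarray:
    """Average per-residue embeddings (length x dim) into one vector (dim,)."""
    return residue_embeddings.mean(axis=0)


def mean_cv_r2(features: np.ndarray, targets: np.ndarray,
               n_splits: int = 5, seed: int = 0) -> float:
    """Mean R^2 of a LassoCV model over an outer k-fold split."""
    outer = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # Assumption: the inner 3-fold CV picks the L1 penalty within each outer fold.
    model = LassoCV(cv=3, max_iter=10_000)
    scores = cross_val_score(model, features, targets, cv=outer, scoring="r2")
    return float(scores.mean())


# Synthetic stand-in for pLM output: 120 variants, 30 residues, 16 dimensions.
rng = np.random.default_rng(0)
n_variants, seq_len, dim = 120, 30, 16
per_residue = rng.normal(size=(n_variants, seq_len, dim))
X = np.stack([mean_pool(e) for e in per_residue])

# Synthetic target that depends on only three embedding dimensions.
true_weights = np.zeros(dim)
true_weights[:3] = [2.0, -1.5, 1.0]
y = X @ true_weights + 0.01 * rng.normal(size=n_variants)

score = mean_cv_r2(X, y)
print(f"mean 5-fold R^2: {score:.3f}")
```

Repeating this for each dataset and each model's embeddings, then plotting the scores against parameter count, would give a figure of the same general form as panels A and B.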