Table 2. Benchmarking of PLMs in TRILL on the SP_final test set against state-of-the-art (SOTA) methods.
From: Benchmarking protein language models for protein crystallization
Model | Method | F1 | Accuracy | MCC | Precision | Recall | AUPR | AUC |
---|---|---|---|---|---|---|---|---|
fDETECT | RF | 0.580 | 0.616 | 0.381 | 0.913 | 0.425 | 0.882 | 0.837 |
DeepCrystal | CNN | 0.788 | 0.759 | 0.53 | 0.876 | 0.716 | 0.877 | 0.874 |
ATTCrys | Multi-Stage CNN | 0.814 | 0.772 | 0.521 | 0.831 | 0.797 | 0.856 | 0.827 |
CLPred | CNN + Bi-LSTM | 0.832 | 0.801 | 0.599 | 0.885 | 0.783 | 0.880 | 0.887 |
ESM2 T6-8M | XGBoost | 0.712 | 0.713 | 0.524 | 0.955 | 0.568 | 0.948 | 0.913 |
ESM2 T12-35M | XGBoost | 0.615 | 0.646 | 0.445 | 0.957 | 0.453 | 0.929 | 0.881 |
ESM2 T30-150M | XGBoost | 0.836 | 0.814 | 0.646 | 0.933 | 0.757 | 0.947 | 0.919 |
ESM2 T33-650M | XGBoost | 0.795 | 0.781 | 0.61 | 0.953 | 0.682 | 0.948 | 0.922 |
ESM2 T36-3B | XGBoost | 0.814 | 0.802 | 0.657 | 0.981 | 0.696 | 0.964 | 0.935 |
Ankh | XGBoost | 0.761 | 0.743 | 0.528 | 0.907 | 0.655 | 0.932 | 0.906 |
Ankh Large | XGBoost | 0.84 | 0.819 | 0.653 | 0.934 | 0.764 | 0.955 | 0.93 |
ProstT5 | XGBoost | 0.829 | 0.81 | 0.648 | 0.948 | 0.736 | 0.957 | 0.94 |
ProtT5-XL | XGBoost | 0.794 | 0.776 | 0.593 | 0.936 | 0.689 | 0.938 | 0.909 |
SaProt-35M | XGBoost | 0.859 | 0.827 | 0.636 | 0.874 | 0.845 | 0.916 | 0.897 |
SaProt-650M | XGBoost | 0.858 | 0.835 | 0.676 | 0.929 | 0.797 | 0.936 | 0.922 |
xTrimoPGLM-1B | XGBoost | 0.879 | 0.857 | 0.711 | 0.932 | 0.831 | 0.954 | 0.932 |
xTrimoPGLM-3B | XGBoost | 0.845 | 0.819 | 0.638 | 0.907 | 0.791 | 0.937 | 0.914 |
xTrimoPGLM-10B | XGBoost | 0.875 | 0.852 | 0.701 | 0.925 | 0.831 | 0.960 | 0.938 |
ESM2 T6-8M | LightGBM | 0.871 | 0.848 | 0.694 | 0.924 | 0.824 | 0.953 | 0.915 |
ESM2 T12-35M | LightGBM | 0.803 | 0.781 | 0.585 | 0.914 | 0.716 | 0.934 | 0.888 |
ESM2 T30-150M | LightGBM | 0.873 | 0.848 | 0.688 | 0.912 | 0.838 | 0.954 | 0.931 |
ESM2 T33-650M | LightGBM | 0.883 | 0.857 | 0.699 | 0.901 | 0.865 | 0.946 | 0.921 |
ESM2 T36-3B | LightGBM | 0.911 | 0.89 | 0.769 | 0.924 | 0.899 | 0.961 | 0.938 |
Ankh | LightGBM | 0.885 | 0.857 | 0.694 | 0.885 | 0.885 | 0.931 | 0.912 |
Ankh Large | LightGBM | 0.876 | 0.848 | 0.681 | 0.894 | 0.858 | 0.954 | 0.929 |
ProstT5 | LightGBM | 0.898 | 0.878 | 0.751 | 0.941 | 0.858 | 0.964 | 0.94 |
ProtT5-XL | LightGBM | 0.873 | 0.848 | 0.688 | 0.912 | 0.838 | 0.952 | 0.927 |
SaProt-35M | LightGBM | 0.836 | 0.806 | 0.606 | 0.886 | 0.791 | 0.917 | 0.902 |
SaProt-650M | LightGBM | 0.871 | 0.848 | 0.694 | 0.924 | 0.824 | 0.939 | 0.924 |
xTrimoPGLM-1B | LightGBM | 0.895 | 0.873 | 0.739 | 0.927 | 0.865 | 0.947 | 0.924 |
xTrimoPGLM-3B | LightGBM | 0.847 | 0.819 | 0.631 | 0.895 | 0.804 | 0.933 | 0.905 |
xTrimoPGLM-10B | LightGBM | 0.864 | 0.835 | 0.658 | 0.892 | 0.838 | 0.934 | 0.917 |
ESM2 T30-150M | CNN + AVG Embed | 0.884 | 0.857 | 0.697 | 0.872 | 0.896 | 0.941 | 0.924 |
ESM2 T36-3B | CNN + AVG Embed | 0.902 | 0.878 | 0.739 | 0.905 | 0.899 | 0.964 | 0.937 |
ProstT5 | CNN + AVG Embed | 0.886 | 0.857 | 0.693 | 0.892 | 0.88 | 0.921 | 0.93 |
ESM2 T30-150M | LSTM + AVG Embed | 0.859 | 0.823 | 0.621 | 0.865 | 0.853 | 0.929 | 0.91 |
ESM2 T36-3B | LSTM + AVG Embed | 0.879 | 0.852 | 0.691 | 0.858 | 0.901 | 0.956 | 0.933 |
ProstT5 | LSTM + AVG Embed | 0.902 | 0.882 | 0.757 | 0.872 | 0.935 | 0.956 | 0.939 |
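
The F1 column can be cross-checked against the precision and recall columns, since F1 is the harmonic mean of the two. The following is a minimal sketch of that check on a few rows copied from the table. The function names, the row selection, and the 0.001 tolerance (chosen to absorb three-decimal rounding) are assumptions made for illustration and are not part of the benchmark itself.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (model, method, reported F1, precision, recall), copied from the table above.
ROWS = [
    ("fDETECT", "RF", 0.580, 0.913, 0.425),
    ("Ankh", "LightGBM", 0.885, 0.885, 0.885),
    ("ESM2 T36-3B", "LightGBM", 0.911, 0.924, 0.899),
    ("ProstT5", "LSTM + AVG Embed", 0.902, 0.872, 0.935),
]


def max_f1_deviation(rows) -> float:
    """Largest absolute gap between reported F1 and F1 recomputed from the row."""
    return max(
        abs(reported - f1_from_precision_recall(precision, recall))
        for _, _, reported, precision, recall in rows
    )


if __name__ == "__main__":
    for model, method, reported, precision, recall in ROWS:
        recomputed = f1_from_precision_recall(precision, recall)
        print(f"{model} ({method}): reported={reported:.3f}, recomputed={recomputed:.3f}")
    # Assumed tolerance: inputs are rounded to three decimals in the table.
    print("all within tolerance:", max_f1_deviation(ROWS) < 1e-3)
```

A row with high precision but low recall, such as fDETECT, ends up with a low F1 because the harmonic mean is pulled toward the smaller of the two values.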