Table 1 Benchmarking of protein language models (PLMs) in TRILL on the balanced test set against state-of-the-art (SOTA) methods.
From: Benchmarking protein language models for protein crystallization
Model | Method | F1 | ACC | MCC | Prec | Rec | AUPR | AUC |
---|---|---|---|---|---|---|---|---|
fDETECT | RF | 0.504 | 0.646 | 0.355 | 0.840 | 0.360 | 0.777 | 0.778 |
DeepCrystal | CNN | 0.822 | 0.828 | 0.658 | 0.851 | 0.795 | 0.886 | 0.903 |
ATTCrys | Multi-Stage CNN | 0.811 | 0.810 | 0.621 | 0.805 | 0.817 | 0.850 | 0.876 |
CLPred | CNN + Bi-LSTM | 0.850 | 0.851 | 0.700 | 0.849 | 0.852 | 0.900 | 0.928 |
ESM2 T6-8M | XGBoost | 0.674 | 0.746 | 0.546 | 0.934 | 0.527 | 0.9 | 0.916 |
ESM2 T12-35M | XGBoost | 0.643 | 0.726 | 0.51 | 0.921 | 0.494 | 0.905 | 0.916 |
ESM2 T30-150M | XGBoost | 0.803 | 0.826 | 0.669 | 0.92 | 0.713 | 0.929 | 0.936 |
ESM2 T33-650M | XGBoost | 0.754 | 0.794 | 0.618 | 0.928 | 0.635 | 0.91 | 0.928 |
ESM2 T36-3B | XGBoost | 0.716 | 0.767 | 0.571 | 0.914 | 0.588 | 0.908 | 0.92 |
Ankh | XGBoost | 0.764 | 0.792 | 0.602 | 0.883 | 0.672 | 0.893 | 0.913 |
Ankh Large | XGBoost | 0.783 | 0.804 | 0.619 | 0.874 | 0.709 | 0.906 | 0.917 |
ProstT5 | XGBoost | 0.761 | 0.791 | 0.6 | 0.885 | 0.667 | 0.907 | 0.924 |
ProtT5-XL | XGBoost | 0.757 | 0.791 | 0.606 | 0.903 | 0.651 | 0.913 | 0.924 |
SaProt-35M | XGBoost | 0.821 | 0.820 | 0.641 | 0.815 | 0.828 | 0.892 | 0.908 |
SaProt-650M | XGBoost | 0.839 | 0.843 | 0.686 | 0.855 | 0.824 | 0.915 | 0.927 |
xTrimoPGLM-1B | XGBoost | 0.826 | 0.830 | 0.660 | 0.843 | 0.809 | 0.900 | 0.916 |
xTrimoPGLM-3B | XGBoost | 0.808 | 0.819 | 0.642 | 0.858 | 0.764 | 0.901 | 0.916 |
xTrimoPGLM-10B | XGBoost | 0.834 | 0.839 | 0.679 | 0.857 | 0.813 | 0.898 | 0.920 |
ESM2 T6-8M | LightGBM | 0.828 | 0.837 | 0.676 | 0.869 | 0.791 | 0.9 | 0.914 |
ESM2 T12-35M | LightGBM | 0.803 | 0.821 | 0.652 | 0.891 | 0.731 | 0.916 | 0.92 |
ESM2 T30-150M | LightGBM | 0.854 | 0.857 | 0.715 | 0.871 | 0.838 | 0.916 | 0.932 |
ESM2 T33-650M | LightGBM | 0.845 | 0.845 | 0.69 | 0.843 | 0.846 | 0.9 | 0.917 |
ESM2 T36-3B | LightGBM | 0.829 | 0.833 | 0.666 | 0.843 | 0.816 | 0.904 | 0.916 |
Ankh | LightGBM | 0.848 | 0.843 | 0.687 | 0.82 | 0.877 | 0.896 | 0.91 |
Ankh Large | LightGBM | 0.831 | 0.832 | 0.663 | 0.83 | 0.833 | 0.907 | 0.918 |
ProstT5 | LightGBM | 0.85 | 0.851 | 0.702 | 0.855 | 0.845 | 0.916 | 0.929 |
ProtT5-XL | LightGBM | 0.838 | 0.842 | 0.685 | 0.86 | 0.817 | 0.919 | 0.928 |
SaProt-35M | LightGBM | 0.821 | 0.825 | 0.650 | 0.838 | 0.804 | 0.894 | 0.913 |
SaProt-650M | LightGBM | 0.848 | 0.849 | 0.699 | 0.853 | 0.843 | 0.913 | 0.927 |
xTrimoPGLM-1B | LightGBM | 0.836 | 0.836 | 0.672 | 0.835 | 0.836 | 0.888 | 0.912 |
xTrimoPGLM-3B | LightGBM | 0.826 | 0.832 | 0.664 | 0.849 | 0.806 | 0.889 | 0.909 |
xTrimoPGLM-10B | LightGBM | 0.820 | 0.827 | 0.656 | 0.854 | 0.788 | 0.899 | 0.919 |
ESM2 T30-150M | CNN + AVG Embed | 0.859 | 0.858 | 0.716 | 0.868 | 0.85 | 0.922 | 0.946 |
ESM2 T36-3B | CNN + AVG Embed | 0.867 | 0.865 | 0.731 | 0.883 | 0.852 | 0.941 | 0.955 |
ProstT5 | CNN + AVG Embed | 0.865 | 0.856 | 0.719 | 0.925 | 0.813 | 0.899 | 0.938 |
ESM2 T30-150M | LSTM + AVG Embed | 0.862 | 0.859 | 0.719 | 0.883 | 0.842 | 0.926 | 0.94 |
ESM2 T36-3B | LSTM + AVG Embed | 0.841 | 0.844 | 0.688 | 0.829 | 0.853 | 0.932 | 0.936 |
ProstT5 | LSTM + AVG Embed | 0.849 | 0.852 | 0.704 | 0.836 | 0.862 | 0.927 | 0.939 |
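
The F1, Prec and Rec columns are linked: under the standard definition, F1 is the harmonic mean of precision and recall. The sketch below assumes the table uses that definition (the source does not state the formula) and recomputes F1 for three rows copied from Table 1. The helper name `f1_from_precision_recall` is hypothetical and is not part of TRILL.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (Prec, Rec, reported F1) copied from three rows of Table 1.
rows = {
    "fDETECT (RF)": (0.840, 0.360, 0.504),
    "CLPred (CNN + Bi-LSTM)": (0.849, 0.852, 0.850),
    "ESM2 T30-150M (LightGBM)": (0.871, 0.838, 0.854),
}

for name, (prec, rec, reported) in rows.items():
    computed = f1_from_precision_recall(prec, rec)
    # Assumption: agreement to within rounding suggests the standard F1 definition.
    print(f"{name}: computed F1 = {computed:.3f}, reported F1 = {reported:.3f}")
```

For these three rows the recomputed value agrees with the reported F1 to three decimal places. The same check can flag transcription errors in other rows, since a large gap between the computed and reported F1 would point to a mistyped Prec, Rec or F1 entry.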