Table 3 Benchmarking of PLMs in TRILL on the TR_final test set against state-of-the-art (SOTA) methods.
From: Benchmarking protein language models for protein crystallization
Model | Method | F1 | ACC | MCC | Prec | Rec | AUPR | AUC |
---|---|---|---|---|---|---|---|---|
fDETECT | RF | 0.747 | 0.841 | 0.663 | 0.918 | 0.631 | 0.768 | 0.887 |
DeepCrystal | CNN | 0.781 | 0.841 | 0.657 | 0.800 | 0.762 | 0.815 | 0.910 |
ATTCrys | Multi-Stage CNN | 0.758 | 0.810 | 0.605 | 0.718 | 0.802 | 0.793 | 0.880 |
CLPred | CNN + Bi-LSTM | 0.807 | 0.854 | 0.690 | 0.787 | 0.829 | 0.865 | 0.930 |
ESM2 T6-8M | XGBoost | 0.729 | 0.835 | 0.648 | 0.926 | 0.602 | 0.911 | 0.944 |
ESM2 T12-35M | XGBoost | 0.692 | 0.819 | 0.616 | 0.932 | 0.551 | 0.901 | 0.939 |
ESM2 T30-150M | XGBoost | 0.816 | 0.875 | 0.73 | 0.9 | 0.746 | 0.933 | 0.96 |
ESM2 T33-650M | XGBoost | 0.772 | 0.854 | 0.685 | 0.912 | 0.668 | 0.917 | 0.954 |
ESM2 T36-3B | XGBoost | 0.783 | 0.863 | 0.708 | 0.94 | 0.671 | 0.925 | 0.955 |
Ankh | XGBoost | 0.756 | 0.839 | 0.649 | 0.858 | 0.676 | 0.875 | 0.932 |
Ankh Large | XGBoost | 0.797 | 0.858 | 0.69 | 0.844 | 0.754 | 0.898 | 0.942 |
ProstT5 | XGBoost | 0.762 | 0.84 | 0.65 | 0.846 | 0.693 | 0.88 | 0.943 |
SaProt-35M | XGBoost | 0.798 | 0.840 | 0.670 | 0.749 | 0.853 | 0.860 | 0.923 |
SaProt-650M | XGBoost | 0.811 | 0.857 | 0.696 | 0.791 | 0.832 | 0.885 | 0.939 |
xTrimoPGLM-1B | XGBoost | 0.814 | 0.858 | 0.700 | 0.788 | 0.842 | 0.879 | 0.938 |
xTrimoPGLM-3B | XGBoost | 0.799 | 0.853 | 0.683 | 0.807 | 0.791 | 0.881 | 0.936 |
xTrimoPGLM-10B | XGBoost | 0.834 | 0.874 | 0.733 | 0.809 | 0.861 | 0.895 | 0.946 |
ProtT5-XL | XGBoost | 0.776 | 0.852 | 0.678 | 0.878 | 0.695 | 0.91 | 0.948 |
ESM2 T6-8M | LightGBM | 0.846 | 0.885 | 0.755 | 0.841 | 0.85 | 0.909 | 0.947 |
ESM2 T12-35M | LightGBM | 0.807 | 0.868 | 0.712 | 0.873 | 0.751 | 0.9 | 0.941 |
ESM2 T30-150M | LightGBM | 0.862 | 0.894 | 0.778 | 0.833 | 0.893 | 0.929 | 0.959 |
ESM2 T33-650M | LightGBM | 0.829 | 0.867 | 0.723 | 0.787 | 0.877 | 0.901 | 0.944 |
ESM2 T36-3B | LightGBM | 0.862 | 0.894 | 0.777 | 0.835 | 0.89 | 0.925 | 0.956 |
Ankh | LightGBM | 0.82 | 0.853 | 0.706 | 0.748 | 0.906 | 0.869 | 0.928 |
Ankh Large | LightGBM | 0.835 | 0.87 | 0.732 | 0.785 | 0.89 | 0.892 | 0.944 |
ProstT5 | LightGBM | 0.839 | 0.875 | 0.739 | 0.799 | 0.882 | 0.903 | 0.949 |
ProtT5-XL | LightGBM | 0.844 | 0.88 | 0.749 | 0.814 | 0.877 | 0.912 | 0.951 |
SaProt-35M | LightGBM | 0.798 | 0.848 | 0.677 | 0.782 | 0.816 | 0.869 | 0.932 |
SaProt-650M | LightGBM | 0.826 | 0.866 | 0.719 | 0.792 | 0.864 | 0.887 | 0.941 |
xTrimoPGLM-1B | LightGBM | 0.806 | 0.849 | 0.685 | 0.766 | 0.850 | 0.868 | 0.936 |
xTrimoPGLM-3B | LightGBM | 0.817 | 0.863 | 0.708 | 0.804 | 0.832 | 0.870 | 0.933 |
xTrimoPGLM-10B | LightGBM | 0.814 | 0.862 | 0.704 | 0.808 | 0.821 | 0.893 | 0.944 |
ESM2 T30-150M | CNN + AVG Embed | 0.847 | 0.88 | 0.752 | 0.896 | 0.803 | 0.915 | 0.96 |
ESM2 T36-3B | CNN + AVG Embed | 0.855 | 0.885 | 0.765 | 0.914 | 0.803 | 0.925 | 0.96 |
ProstT5 | CNN + AVG Embed | 0.835 | 0.862 | 0.735 | 0.949 | 0.746 | 0.872 | 0.959 |
ESM2 T30-150M | LSTM + AVG Embed | 0.84 | 0.873 | 0.741 | 0.906 | 0.783 | 0.913 | 0.958 |
ESM2 T36-3B | LSTM + AVG Embed | 0.842 | 0.878 | 0.745 | 0.874 | 0.811 | 0.933 | 0.96 |
ProstT5 | LSTM + AVG Embed | 0.843 | 0.878 | 0.746 | 0.882 | 0.807 | 0.902 | 0.958 |
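
The F1, Prec and Rec columns can be cross-checked against one another. The sketch below assumes F1 is the standard harmonic mean of precision and recall, which the table itself does not state. The helper name `f1_from_precision_recall` and the 0.002 tolerance are illustrative choices, not taken from the source. The four rows are copied from the table above. Small differences are expected because the reported precision and recall are already rounded.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (model, method, reported F1, Prec, Rec) copied from Table 3
rows = [
    ("fDETECT", "RF", 0.747, 0.918, 0.631),
    ("CLPred", "CNN + Bi-LSTM", 0.807, 0.787, 0.829),
    ("xTrimoPGLM-10B", "XGBoost", 0.834, 0.809, 0.861),
    ("ESM2 T30-150M", "LightGBM", 0.862, 0.833, 0.893),
]

# Illustrative tolerance that absorbs rounding of the reported inputs
TOLERANCE = 0.002

# Recompute F1 for each row and record whether it matches the reported value
checks = []
for model, method, reported_f1, prec, rec in rows:
    recomputed = f1_from_precision_recall(prec, rec)
    consistent = abs(recomputed - reported_f1) < TOLERANCE
    checks.append((model, method, reported_f1, recomputed, consistent))
    print(f"{model} ({method}): reported={reported_f1:.3f} "
          f"recomputed={recomputed:.4f} consistent={consistent}")
```

The same check can be run over any other row by appending its values to `rows`. A row that fails it would suggest a transcription error in one of the three columns, or a different F1 definition than the one assumed here.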