Table 2. Benchmarking of protein language models (PLMs) in TRILL on the SP_final test set against state-of-the-art (SOTA) methods.

From: Benchmarking protein language models for protein crystallization

| Model | Method | F1 | ACC | MCC | Prec | Rec | AUPR | AUC |
|---|---|---|---|---|---|---|---|---|
| fDETECT | RF | 0.580 | 0.616 | 0.381 | 0.913 | 0.425 | 0.882 | 0.837 |
| DeepCrystal | CNN | 0.788 | 0.759 | 0.53 | 0.876 | 0.716 | 0.877 | 0.874 |
| ATTCrys | Multi-Stage CNN | 0.814 | 0.772 | 0.521 | 0.831 | 0.797 | 0.856 | 0.827 |
| CLPred | CNN + Bi-LSTM | 0.832 | 0.801 | 0.599 | 0.885 | 0.783 | 0.880 | 0.887 |
| ESM2 T6-8M | XGBoost | 0.712 | 0.713 | 0.524 | 0.955 | 0.568 | 0.948 | 0.913 |
| ESM2 T12-35M | XGBoost | 0.615 | 0.646 | 0.445 | 0.957 | 0.453 | 0.929 | 0.881 |
| ESM2 T30-150M | XGBoost | 0.836 | 0.814 | 0.646 | 0.933 | 0.757 | 0.947 | 0.919 |
| ESM2 T33-650M | XGBoost | 0.795 | 0.781 | 0.61 | 0.953 | 0.682 | 0.948 | 0.922 |
| ESM2 T36-3B | XGBoost | 0.814 | 0.802 | 0.657 | 0.981 | 0.696 | 0.964 | 0.935 |
| Ankh | XGBoost | 0.761 | 0.743 | 0.528 | 0.907 | 0.655 | 0.932 | 0.906 |
| Ankh Large | XGBoost | 0.84 | 0.819 | 0.653 | 0.934 | 0.764 | 0.955 | 0.93 |
| ProstT5 | XGBoost | 0.829 | 0.81 | 0.648 | 0.948 | 0.736 | 0.957 | 0.94 |
| ProtT5-XL | XGBoost | 0.794 | 0.776 | 0.593 | 0.936 | 0.689 | 0.938 | 0.909 |
| SaProt-35M | XGBoost | 0.859 | 0.827 | 0.636 | 0.874 | 0.845 | 0.916 | 0.897 |
| SaProt-650M | XGBoost | 0.858 | 0.835 | 0.676 | 0.929 | 0.797 | 0.936 | 0.922 |
| xTrimoPGLM-1B | XGBoost | 0.879 | 0.857 | 0.711 | 0.932 | 0.831 | 0.954 | 0.932 |
| xTrimoPGLM-3B | XGBoost | 0.845 | 0.819 | 0.638 | 0.907 | 0.791 | 0.937 | 0.914 |
| xTrimoPGLM-10B | XGBoost | 0.875 | 0.852 | 0.701 | 0.925 | 0.831 | 0.960 | 0.938 |
| ESM2 T6-8M | LightGBM | 0.871 | 0.848 | 0.694 | 0.924 | 0.824 | 0.953 | 0.915 |
| ESM2 T12-35M | LightGBM | 0.803 | 0.781 | 0.585 | 0.914 | 0.716 | 0.934 | 0.888 |
| ESM2 T30-150M | LightGBM | 0.873 | 0.848 | 0.688 | 0.912 | 0.838 | 0.954 | 0.931 |
| ESM2 T33-650M | LightGBM | 0.883 | 0.857 | 0.699 | 0.901 | 0.865 | 0.946 | 0.921 |
| ESM2 T36-3B | LightGBM | 0.911 | 0.89 | 0.769 | 0.924 | 0.899 | 0.961 | 0.938 |
| Ankh | LightGBM | 0.885 | 0.857 | 0.694 | 0.885 | 0.885 | 0.931 | 0.912 |
| Ankh Large | LightGBM | 0.876 | 0.848 | 0.681 | 0.894 | 0.858 | 0.954 | 0.929 |
| ProstT5 | LightGBM | 0.898 | 0.878 | 0.751 | 0.941 | 0.858 | 0.964 | 0.94 |
| ProtT5-XL | LightGBM | 0.873 | 0.848 | 0.688 | 0.912 | 0.838 | 0.952 | 0.927 |
| SaProt-35M | LightGBM | 0.836 | 0.806 | 0.606 | 0.886 | 0.791 | 0.917 | 0.902 |
| SaProt-650M | LightGBM | 0.871 | 0.848 | 0.694 | 0.924 | 0.824 | 0.939 | 0.924 |
| xTrimoPGLM-1B | LightGBM | 0.895 | 0.873 | 0.739 | 0.927 | 0.865 | 0.947 | 0.924 |
| xTrimoPGLM-3B | LightGBM | 0.847 | 0.819 | 0.631 | 0.895 | 0.804 | 0.933 | 0.905 |
| xTrimoPGLM-10B | LightGBM | 0.864 | 0.835 | 0.658 | 0.892 | 0.838 | 0.934 | 0.917 |
| ESM2 T30-150M | CNN + AVG Embed | 0.884 | 0.857 | 0.697 | 0.872 | 0.896 | 0.941 | 0.924 |
| ESM2 T36-3B | CNN + AVG Embed | 0.902 | 0.878 | 0.739 | 0.905 | 0.899 | 0.964 | 0.937 |
| ProstT5 | CNN + AVG Embed | 0.886 | 0.857 | 0.693 | 0.892 | 0.88 | 0.921 | 0.93 |
| ESM2 T30-150M | LSTM + AVG Embed | 0.859 | 0.823 | 0.621 | 0.865 | 0.853 | 0.929 | 0.91 |
| ESM2 T36-3B | LSTM + AVG Embed | 0.879 | 0.852 | 0.691 | 0.858 | 0.901 | 0.956 | 0.933 |
| ProstT5 | LSTM + AVG Embed | 0.902 | 0.882 | 0.757 | 0.872 | 0.935 | 0.956 | 0.939 |

Note: Significant values are in bold.
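
The table does not define its metrics. Assuming the F1 column follows the standard definition (the harmonic mean of precision and recall), the reported F1 values can be cross-checked against the Prec and Rec columns. The sketch below does this for three rows copied from the table. The function name `f1_from_precision_recall` and the `rows` dictionary are illustrative and not part of TRILL or the original paper.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall.

    Assumption: the table's F1 column uses this definition.
    """
    if precision + recall == 0:
        return 0.0  # convention when both precision and recall are zero
    return 2 * precision * recall / (precision + recall)


# (Prec, Rec, reported F1) copied from three rows of the table.
rows = {
    "fDETECT (RF)": (0.913, 0.425, 0.580),
    "DeepCrystal (CNN)": (0.876, 0.716, 0.788),
    "ESM2 T36-3B (LightGBM)": (0.924, 0.899, 0.911),
}

for name, (prec, rec, reported) in rows.items():
    derived = f1_from_precision_recall(prec, rec)
    print(f"{name}: derived F1 = {derived:.3f}, reported F1 = {reported:.3f}")
```

For these three rows the derived and reported values agree to three decimal places, which is consistent with the standard definition. Small differences in other rows could arise because the Prec and Rec inputs are themselves rounded.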