Table 3. Benchmarking of PLMs in TRILL on the TR_final test set against state-of-the-art (SOTA) methods.

From: Benchmarking protein language models for protein crystallization

| Model | Method | F1 | ACC | MCC | Prec | Rec | AUPR | AUC |
|---|---|---|---|---|---|---|---|---|
| fDETECT | RF | 0.747 | 0.841 | 0.663 | 0.918 | 0.631 | 0.768 | 0.887 |
| DeepCrystal | CNN | 0.781 | 0.841 | 0.657 | 0.800 | 0.762 | 0.815 | 0.910 |
| ATTCrys | Multi-Stage CNN | 0.758 | 0.810 | 0.605 | 0.718 | 0.802 | 0.793 | 0.880 |
| CLPred | CNN + Bi-LSTM | 0.807 | 0.854 | 0.690 | 0.787 | 0.829 | 0.865 | 0.930 |
| ESM2 T6-8M | XGBoost | 0.729 | 0.835 | 0.648 | 0.926 | 0.602 | 0.911 | 0.944 |
| ESM2 T12-35M | XGBoost | 0.692 | 0.819 | 0.616 | 0.932 | 0.551 | 0.901 | 0.939 |
| ESM2 T30-150M | XGBoost | 0.816 | 0.875 | 0.730 | 0.900 | 0.746 | 0.933 | 0.960 |
| ESM2 T33-650M | XGBoost | 0.772 | 0.854 | 0.685 | 0.912 | 0.668 | 0.917 | 0.954 |
| ESM2 T36-3B | XGBoost | 0.783 | 0.863 | 0.708 | 0.940 | 0.671 | 0.925 | 0.955 |
| Ankh | XGBoost | 0.756 | 0.839 | 0.649 | 0.858 | 0.676 | 0.875 | 0.932 |
| Ankh Large | XGBoost | 0.797 | 0.858 | 0.690 | 0.844 | 0.754 | 0.898 | 0.942 |
| ProstT5 | XGBoost | 0.762 | 0.840 | 0.650 | 0.846 | 0.693 | 0.880 | 0.943 |
| SaProt-35M | XGBoost | 0.798 | 0.840 | 0.670 | 0.749 | 0.853 | 0.860 | 0.923 |
| SaProt-650M | XGBoost | 0.811 | 0.857 | 0.696 | 0.791 | 0.832 | 0.885 | 0.939 |
| xTrimoPGLM-1B | XGBoost | 0.814 | 0.858 | 0.700 | 0.788 | 0.842 | 0.879 | 0.938 |
| xTrimoPGLM-3B | XGBoost | 0.799 | 0.853 | 0.683 | 0.807 | 0.791 | 0.881 | 0.936 |
| xTrimoPGLM-10B | XGBoost | 0.834 | 0.874 | 0.733 | 0.809 | 0.861 | 0.895 | 0.946 |
| ProtT5-XL | XGBoost | 0.776 | 0.852 | 0.678 | 0.878 | 0.695 | 0.910 | 0.948 |
| ESM2 T6-8M | LightGBM | 0.846 | 0.885 | 0.755 | 0.841 | 0.850 | 0.909 | 0.947 |
| ESM2 T12-35M | LightGBM | 0.807 | 0.868 | 0.712 | 0.873 | 0.751 | 0.900 | 0.941 |
| ESM2 T30-150M | LightGBM | 0.862 | 0.894 | 0.778 | 0.833 | 0.893 | 0.929 | 0.959 |
| ESM2 T33-650M | LightGBM | 0.829 | 0.867 | 0.723 | 0.787 | 0.877 | 0.901 | 0.944 |
| ESM2 T36-3B | LightGBM | 0.862 | 0.894 | 0.777 | 0.835 | 0.890 | 0.925 | 0.956 |
| Ankh | LightGBM | 0.820 | 0.853 | 0.706 | 0.748 | 0.906 | 0.869 | 0.928 |
| Ankh-Large | LightGBM | 0.835 | 0.870 | 0.732 | 0.785 | 0.890 | 0.892 | 0.944 |
| ProstT5 | LightGBM | 0.839 | 0.875 | 0.739 | 0.799 | 0.882 | 0.903 | 0.949 |
| ProtT5-XL | LightGBM | 0.844 | 0.880 | 0.749 | 0.814 | 0.877 | 0.912 | 0.951 |
| SaProt-35M | LightGBM | 0.798 | 0.848 | 0.677 | 0.782 | 0.816 | 0.869 | 0.932 |
| SaProt-650M | LightGBM | 0.826 | 0.866 | 0.719 | 0.792 | 0.864 | 0.887 | 0.941 |
| xTrimoPGLM-1B | LightGBM | 0.806 | 0.849 | 0.685 | 0.766 | 0.850 | 0.868 | 0.936 |
| xTrimoPGLM-3B | LightGBM | 0.817 | 0.863 | 0.708 | 0.804 | 0.832 | 0.870 | 0.933 |
| xTrimoPGLM-10B | LightGBM | 0.814 | 0.862 | 0.704 | 0.808 | 0.821 | 0.893 | 0.944 |
| ESM2 T30-150M | CNN + AVG Embed | 0.847 | 0.880 | 0.752 | 0.896 | 0.803 | 0.915 | 0.960 |
| ESM2 T36-3B | CNN + AVG Embed | 0.855 | 0.885 | 0.765 | 0.914 | 0.803 | 0.925 | 0.960 |
| ProstT5 | CNN + AVG Embed | 0.835 | 0.862 | 0.735 | 0.949 | 0.746 | 0.872 | 0.959 |
| ESM2 T30-150M | LSTM + AVG Embed | 0.840 | 0.873 | 0.741 | 0.906 | 0.783 | 0.913 | 0.958 |
| ESM2 T36-3B | LSTM + AVG Embed | 0.842 | 0.878 | 0.745 | 0.874 | 0.811 | 0.933 | 0.960 |
| ProstT5 | LSTM + AVG Embed | 0.843 | 0.878 | 0.746 | 0.882 | 0.807 | 0.902 | 0.958 |

Significant values are in bold.
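For readers unfamiliar with the column abbreviations, the sketch below shows how the threshold-based metrics in the table (F1, ACC, MCC, Prec, Rec) are conventionally defined from a binary confusion matrix. It is an illustration using the standard textbook formulas, not the evaluation code behind the table. The function names and the confusion-matrix counts are hypothetical, and the code assumes no denominator is zero. AUPR and AUC are omitted because they require per-sample prediction scores, which the table does not provide. As a sanity check, F1 is recomputed from the Prec and Rec values of a few rows; since the table values are rounded to three decimals, agreement is only expected to within rounding.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard threshold-based metrics from confusion-matrix counts.

    Assumes every denominator is nonzero (illustrative sketch only).
    """
    prec = tp / (tp + fp)                      # precision
    rec = tp / (tp + fn)                       # recall (sensitivity)
    f1 = 2 * prec * rec / (prec + rec)         # harmonic mean of Prec and Rec
    acc = (tp + tn) / (tp + fp + tn + fn)      # accuracy
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den        # Matthews correlation coefficient
    return {"F1": f1, "ACC": acc, "MCC": mcc, "Prec": prec, "Rec": rec}


def f1_from_prec_rec(prec, rec):
    """Recompute F1 from precision and recall alone."""
    return 2 * prec * rec / (prec + rec)


# Hypothetical counts chosen for illustration; they are NOT from the paper.
example = binary_metrics(tp=80, fp=20, tn=180, fn=20)
print(example)

# (reported F1, Prec, Rec) copied from four rows of the table.
rows = {
    "fDETECT / RF": (0.747, 0.918, 0.631),
    "DeepCrystal / CNN": (0.781, 0.800, 0.762),
    "CLPred / CNN + Bi-LSTM": (0.807, 0.787, 0.829),
    "ESM2 T30-150M / LightGBM": (0.862, 0.833, 0.893),
}
for name, (reported_f1, prec, rec) in rows.items():
    recomputed = f1_from_prec_rec(prec, rec)
    print(f"{name}: reported F1 = {reported_f1:.3f}, recomputed = {recomputed:.3f}")
```

Small differences between the reported and recomputed F1 are consistent with Prec and Rec having been rounded before publication.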