Table 1. Benchmarking of PLMs in TRILL on the balanced test set against state-of-the-art (SOTA) methods.

From: Benchmarking protein language models for protein crystallization

| Model | Method | F1 | ACC | MCC | Prec | Rec | AUPR | AUC |
|---|---|---|---|---|---|---|---|---|
| fDETECT | RF | 0.504 | 0.646 | 0.355 | 0.840 | 0.360 | 0.777 | 0.778 |
| DeepCrystal | CNN | 0.822 | 0.828 | 0.658 | 0.851 | 0.795 | 0.886 | 0.903 |
| ATTCrys | Multi-Stage CNN | 0.811 | 0.810 | 0.621 | 0.805 | 0.817 | 0.850 | 0.876 |
| CLPred | CNN + Bi-LSTM | 0.850 | 0.851 | 0.700 | 0.849 | 0.852 | 0.900 | 0.928 |
| ESM2 T6-8M | XGBoost | 0.674 | 0.746 | 0.546 | 0.934 | 0.527 | 0.9 | 0.916 |
| ESM2 T12-35M | XGBoost | 0.643 | 0.726 | 0.51 | 0.921 | 0.494 | 0.905 | 0.916 |
| ESM2 T30-150M | XGBoost | 0.803 | 0.826 | 0.669 | 0.92 | 0.713 | 0.929 | 0.936 |
| ESM2 T33-650M | XGBoost | 0.754 | 0.794 | 0.618 | 0.928 | 0.635 | 0.91 | 0.928 |
| ESM2 T36-3B | XGBoost | 0.716 | 0.767 | 0.571 | 0.914 | 0.588 | 0.908 | 0.92 |
| Ankh | XGBoost | 0.764 | 0.792 | 0.602 | 0.883 | 0.672 | 0.893 | 0.913 |
| Ankh Large | XGBoost | 0.783 | 0.804 | 0.619 | 0.874 | 0.709 | 0.906 | 0.917 |
| ProstT5 | XGBoost | 0.761 | 0.791 | 0.6 | 0.885 | 0.667 | 0.907 | 0.924 |
| ProtT5-XL | XGBoost | 0.757 | 0.791 | 0.606 | 0.903 | 0.651 | 0.913 | 0.924 |
| SaProt-35M | XGBoost | 0.821 | 0.820 | 0.641 | 0.815 | 0.828 | 0.892 | 0.908 |
| SaProt-650M | XGBoost | 0.839 | 0.843 | 0.686 | 0.855 | 0.824 | 0.915 | 0.927 |
| xTrimoPGLM-1B | XGBoost | 0.826 | 0.830 | 0.660 | 0.843 | 0.809 | 0.900 | 0.916 |
| xTrimoPGLM-3B | XGBoost | 0.808 | 0.819 | 0.642 | 0.858 | 0.764 | 0.901 | 0.916 |
| xTrimoPGLM-10B | XGBoost | 0.834 | 0.839 | 0.679 | 0.857 | 0.813 | 0.898 | 0.920 |
| ESM2 T6-8M | LightGBM | 0.828 | 0.837 | 0.676 | 0.869 | 0.791 | 0.9 | 0.914 |
| ESM2 T12-35M | LightGBM | 0.803 | 0.821 | 0.652 | 0.891 | 0.731 | 0.916 | 0.92 |
| ESM2 T30-150M | LightGBM | 0.854 | 0.857 | 0.715 | 0.871 | 0.838 | 0.916 | 0.932 |
| ESM2 T33-650M | LightGBM | 0.845 | 0.845 | 0.69 | 0.843 | 0.846 | 0.9 | 0.917 |
| ESM2 T36-3B | LightGBM | 0.829 | 0.833 | 0.666 | 0.843 | 0.816 | 0.904 | 0.916 |
| Ankh | LightGBM | 0.848 | 0.843 | 0.687 | 0.82 | 0.877 | 0.896 | 0.91 |
| Ankh Large | LightGBM | 0.831 | 0.832 | 0.663 | 0.83 | 0.833 | 0.907 | 0.918 |
| ProstT5 | LightGBM | 0.85 | 0.851 | 0.702 | 0.855 | 0.845 | 0.916 | 0.929 |
| ProtT5-XL | LightGBM | 0.838 | 0.842 | 0.685 | 0.86 | 0.817 | 0.919 | 0.928 |
| SaProt-35M | LightGBM | 0.821 | 0.825 | 0.650 | 0.838 | 0.804 | 0.894 | 0.913 |
| SaProt-650M | LightGBM | 0.848 | 0.849 | 0.699 | 0.853 | 0.843 | 0.913 | 0.927 |
| xTrimoPGLM-1B | LightGBM | 0.836 | 0.836 | 0.672 | 0.835 | 0.836 | 0.888 | 0.912 |
| xTrimoPGLM-3B | LightGBM | 0.826 | 0.832 | 0.664 | 0.849 | 0.806 | 0.889 | 0.909 |
| xTrimoPGLM-10B | LightGBM | 0.820 | 0.827 | 0.656 | 0.854 | 0.788 | 0.899 | 0.919 |
| ESM2 T30-150M | CNN + AVG Embed | 0.859 | 0.858 | 0.716 | 0.868 | 0.85 | 0.922 | 0.946 |
| ESM2 T36-3B | CNN + AVG Embed | 0.867 | 0.865 | 0.731 | 0.883 | 0.852 | 0.941 | 0.955 |
| ProstT5 | CNN + AVG Embed | 0.865 | 0.856 | 0.719 | 0.925 | 0.813 | 0.899 | 0.938 |
| ESM2 T30-150M | LSTM + AVG Embed | 0.862 | 0.859 | 0.719 | 0.883 | 0.842 | 0.926 | 0.94 |
| ESM2 T36-3B | LSTM + AVG Embed | 0.841 | 0.844 | 0.688 | 0.829 | 0.853 | 0.932 | 0.936 |
| ProstT5 | LSTM + AVG Embed | 0.849 | 0.852 | 0.704 | 0.836 | 0.862 | 0.927 | 0.939 |

  1. Significant values are in bold.
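
The F1 column can be cross-checked against the Prec and Rec columns, since F1 is conventionally the harmonic mean of precision and recall. The following is a minimal sketch of that check for three rows copied from the table. It assumes the standard F1 definition was used, which the table itself does not state, and the function name `f1_from_precision_recall` is illustrative rather than taken from the source.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (model, method, reported F1, Prec, Rec), copied from Table 1.
rows = [
    ("fDETECT", "RF", 0.504, 0.840, 0.360),
    ("DeepCrystal", "CNN", 0.822, 0.851, 0.795),
    ("CLPred", "CNN + Bi-LSTM", 0.850, 0.849, 0.852),
]

for model, method, reported, prec, rec in rows:
    recomputed = f1_from_precision_recall(prec, rec)
    # Assumption: the standard F1 definition; small gaps are rounding.
    print(f"{model} ({method}): reported {reported:.3f}, recomputed {recomputed:.3f}")
```

Because the tabulated precision and recall are themselves rounded to three decimals, a recomputed value may differ from the reported one in the last digit.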