Table 1 Performance comparisons across different encoding types and model architectures for classification performance on the Atezolizumab dataset

From: The RESP AI model accelerates the identification of tight-binding antibodies

| Encoding type | Model type | No. of hidden-layer weights | MCC on 5× CV, all-class classification | AUC-ROC on 5× CV, RH03 vs rest classification | MCC on the test set, all-class classification |
| --- | --- | --- | --- | --- | --- |
| One-hot (default) | Bayesian NN w/ ordinal regression (BNN-OR) | 84,090 | 0.717 ± 0.009 | 0.967 ± 0.001 | 0.721 |
| Autoencoder | BNN-OR | 12,810 | 0.69 ± 0.006 | 0.966 ± 0.002 | 0.703 |
| UniRep | BNN-OR | 114,930 | 0.62 ± 0.01 | 0.947 ± 0.003 | 0.638 |
| ProtVec | BNN-OR | 3,930 | 0.641 ± 0.003 | 0.852 ± 0.003 | 0.650 |
| ESM-1b | BNN-OR | 39,330 | 0 (model did not converge) | – | – |
| AbLang | BNN-OR | 23,970 | 0.636 ± 0.01 | 0.96 ± 0.002 | 0.664 |
| AntiBertY | BNN-OR | 16,290 | 0.647 ± 0.007 | 0.961 ± 0.002 | 0.650 |
| One-hot (default) | Fully connected net (FCNN) | 84,090 | 0.73 ± 0.01 | 0.973 ± 0.001 | 0.734 |
| Autoencoder | FCNN | 12,810 | 0.731 ± 0.003 | 0.970 ± 0.001 | 0.734 |
| UniRep | FCNN | 114,930 | 0.70 ± 0.01 | 0.963 ± 0.002 | 0.699 |
| ProtVec | FCNN | 3,930 | 0.690 ± 0.003 | 0.962 ± 0.002 | 0.683 |
| ESM-1b | FCNN | 39,330 | 0 (model did not converge) | – | – |
| AbLang | FCNN | 23,970 | 0.715 ± 0.003 | 0.969 ± 0.0008 | 0.719 |
| AntiBertY | FCNN | 16,290 | 0.709 ± 0.008 | 0.968 ± 0.001 | 0.707 |
| One-hot (default) | Random forest | NA | 0.673 ± 0.003 | 0.956 ± 0.003 | 0.672 |
| Autoencoder | Random forest | NA | 0.7 ± 0.009 | 0.960 ± 0.003 | 0.708 |

  1. This table compares encoding types (one-hot, autoencoder, UniRep, ProtVec, etc.) and model types (a random forest, a Bayesian neural network with ordinal regression, and a conventional fully connected network) by classification performance, measured as the Matthews correlation coefficient (MCC) and AUC-ROC.
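
For readers unfamiliar with the metric in the MCC columns, the following is a minimal sketch of one common multiclass form of the Matthews correlation coefficient, together with a mean and standard deviation taken over cross-validation folds. It is not the authors' code. The function names `multiclass_mcc` and `summarize_folds` are hypothetical, the zero-denominator convention is an assumption, and the table does not say whether its "±" values are sample or population standard deviations (the sketch assumes sample).

```python
from collections import Counter
from math import sqrt
from statistics import mean, stdev


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient from two label sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)                                  # total number of samples
    c = sum(t == p for t, p in zip(y_true, y_pred))  # correctly classified samples
    t_counts = Counter(y_true)                       # true count per class
    p_counts = Counter(y_pred)                       # predicted count per class
    classes = set(t_counts) | set(p_counts)
    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in classes)
    denom_sq = (s * s - sum(v * v for v in p_counts.values())) * (
        s * s - sum(v * v for v in t_counts.values())
    )
    # Assumed convention: return 0 when the denominator vanishes,
    # for example when every sample is assigned to a single class.
    return numerator / sqrt(denom_sq) if denom_sq > 0 else 0.0


def summarize_folds(fold_scores):
    """Mean and sample standard deviation of per-fold scores (assumed 'mean ± sd')."""
    return mean(fold_scores), stdev(fold_scores)


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the table.
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 0, 1, 2, 2, 2]
    print(multiclass_mcc(y_true, y_pred))
```

Under the assumed convention, a classifier that assigns every sample to one class scores exactly 0, while perfect agreement scores 1.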