Table 2 Evaluation metric results

From: Ab-initio amino acid sequence design from protein text description with ProtDAT

| Method | Seq-Identity | pLDDT↑ | TM-score↑ | RMSD↓ |
| --- | --- | --- | --- | --- |
| Progen2 | 0.207 | 59.58 ± 20.35 | 0.344 ± 0.238 | 5.494 ± 2.060 |
| ProLLaMA | 0.212 | 41.86 ± 10.30 | 0.228 ± 0.080 | 5.536 ± 0.927 |
| ProtGPT2 | 0.183 | 58.64 ± 17.28 | 0.240 ± 0.188 | 4.601 ± 2.066 |
| ESM-3 | 0.117 | 61.26 ± 17.94 | 0.229 ± 0.106 | 5.276 ± 1.436 |
| ProtDAT(PM1) | 0.304 | 79.91 ± 14.45 | 0.598 ± 0.300 | 3.509 ± 2.223 |
| ProtDAT(PM2) (without MCM) | 0.325 | 58.06 ± 22.62 | 0.572 ± 0.328 | 3.852 ± 2.374 |
| ProtDAT(PM2) | 0.334 | 65.85 ± 21.61 | 0.607 ± 0.300 | 3.477 ± 2.234 |

  1. Protein structure files are predicted with ESMFold. Except for global sequence identity, all metrics are reported as mean ± standard deviation, computed over all proteins generated by the models and frameworks listed in Table 1.
  2. Evaluation results on the Generated-Dataset, which consists of sequences generated by Progen2, ProLLaMA, ProtGPT2, ESM-3, and ProtDAT under two different prompt methods, across the metrics above.
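
The "mean ± standard deviation" entries described in note 1 can be reproduced from per-protein scores with a few lines of Python. This is a minimal sketch, not the authors' evaluation code: the helper name `summarize` and the sample scores are hypothetical, and it assumes the population standard deviation, since the source does not say whether the population or sample form was used.

```python
from statistics import mean, pstdev


def summarize(values, digits=2):
    """Format per-protein metric values as 'mean ± std'.

    Assumption: population standard deviation (pstdev); the source
    does not state which estimator was used.
    """
    if not values:
        raise ValueError("need at least one value")
    return f"{mean(values):.{digits}f} ± {pstdev(values):.{digits}f}"


# Hypothetical per-protein pLDDT scores, not data from the paper.
plddt_scores = [80.0, 60.0, 70.0, 90.0]
print(summarize(plddt_scores))  # prints "75.00 ± 11.18"
```

Swapping `pstdev` for `statistics.stdev` would give the sample standard deviation instead; for the large numbers of generated proteins behind each row, the two differ only slightly.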