Table 7 Comparison of F1 scores on the test set for the ESM-2 token classification model and the GNN model, alongside the F1 scores of ProsperousPlus trained on our dataset and evaluated on our test set. A dash (–) indicates that ProsperousPlus training for the corresponding protease failed with an error before the calculations completed.
From: Prediction of peptide cleavage sites using protein language models and graph neural networks
| MEROPS ID | ESM-2 Token Classification F1 | Graph Neural Network F1 | ProsperousPlus F1 (1:1 undersampling training set) | ProsperousPlus F1 (no undersampling training set) |
|---|---|---|---|---|
| C14.001 | 0.3896 | 0.1613 | 0.0359 | 0.0000 |
| C14.003 | 0.5389 | 0.367 | 0.0837 | – |
| C14.004 | 0.3394 | 0.1393 | 0.0469 | 0.0563 |
| C14.005 | 0.8655 | 0.737 | 0.4547 | 0.6713 |
| M10.003 | 0.5966 | 0.2513 | 0.0853 | – |
| M10.005 | 0.2420 | 0.0887 | 0.0567 | – |
| S01.010 | 0.3456 | 0.1672 | 0.0438 | – |
| S01.217 | 0.7671 | 0.3692 | 0.1405 | 0.2083 |