Table 7. Performance comparison of feature representation methods using their respective optimized CNN architectures for Khib site prediction in the rice dataset. Boldface values indicate the best performance for each metric.
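Feature representation method | ACC (10-fold cross-validation) | F1 (10-fold cross-validation) | MCC (10-fold cross-validation) | AUC (10-fold cross-validation) | ACC (independent test) | F1 (independent test) | MCC (independent test) | AUC (independent test) |
|---|---|---|---|---|---|---|---|---|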
ESM | 0.690 | 0.699 | 0.380 | 0.756 | 0.676 | 0.687 | 0.352 | 0.745 |
One hot | 0.682 | 0.682 | 0.365 | 0.744 | 0.670 | 0.684 | 0.338 | 0.733 |
CTD | 0.703 | 0.735 | 0.421 | 0.761 | 0.708 | 0.751 | 0.425 | 0.762 |
PSSM | 0.725 | 0.736 | 0.453 | 0.789 | 0.733 | 0.746 | 0.464 | 0.807 |
AAP | 0.777 | 0.785 | 0.556 | 0.852 | 0.764 | 0.771 | 0.529 | 0.852 |
BLOSUM | **0.785** | **0.794** | **0.577** | **0.869** | **0.807** | **0.822** | **0.614** | **0.887** |