Table 1 Statistics of the lysine and arginine methylation datasets, covering different types of methylation (mono- and di- for arginine, and mono-, di-, and tri- for lysine).

From: Two-Level Protein Methylation Prediction using structure model-based features

| Dataset | Methylation type | Methyllysine (proteins/sites) | Methylarginine (proteins/sites) |
|---|---|---|---|
| Training set | mono- | 313/465 | 598/883 |
| | di- | 123/172 | 285/479 |
| | tri- | 88/117 | - |
| | Total | 485/721 | 818/1278 |
| Independent test set I | mono- | 77/110 | 159/231 |
| | di- | 30/45 | 69/103 |
| | tri- | 27/32 | - |
| | Total | 121/180 | 205/311 |
| Independent test set II | mono- | 2239/4973 | 206/323 |
| | di- | 7/21 | 110/217 |
| | tri- | 6/6 | - |
| | Total | 2243/4993 | 1700/3416 |
| Structure dataset | Total | 151/218 (#3313) | 99/128 (#1515) |

  1. The training set is used for training models, while the two independent test sets are used for an objective and fair comparison with other existing methods. The experimental structure dataset is collected for analysing structure information of the methylation sites. The number with # indicates the number of negative samples.
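
As a rough illustration of how the structure dataset counts can be read, the sketch below computes the number of sites per protein and the number of negative samples per methylation site. The counts are copied from the "Structure dataset" row of Table 1; the dictionary layout, field names, and function names are hypothetical, and treating each listed site as one positive sample is an assumption made here, not something the table states.

```python
# Counts copied from the "Structure dataset" row of Table 1.
# The field names are illustrative and do not come from the paper.
STRUCTURE_DATASET = {
    "methyllysine": {"proteins": 151, "sites": 218, "negatives": 3313},
    "methylarginine": {"proteins": 99, "sites": 128, "negatives": 1515},
}


def sites_per_protein(row):
    """Average number of methylation sites per protein."""
    return row["sites"] / row["proteins"]


def negatives_per_site(row):
    """Negative samples per methylation site.

    Assumes each listed site counts as one positive sample.
    """
    return row["negatives"] / row["sites"]


if __name__ == "__main__":
    for residue, row in STRUCTURE_DATASET.items():
        print(
            f"{residue}: "
            f"{sites_per_protein(row):.2f} sites/protein, "
            f"{negatives_per_site(row):.1f} negatives/site"
        )
```

Under that assumption, both ratios show that negative samples considerably outnumber methylation sites in the structure dataset.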