Table 7. Subsets of the training data (ASTRAL + CullPDB), partitioned by protein length.
From: Protein Secondary Structure Prediction Based on Data Partition and Semi-Random Subspace Method
| Subset | Protein length L | Number of proteins | Number of amino acids |
|---|---|---|---|
| D1 | (0, 100] | 2260 | 161952 |
| D2 | (100, 200] | 5256 | 774167 |
| D3 | (200, 300] | 3548 | 877583 |
| D4 | (300, 400] | 2382 | 822913 |
| D5 | (400, 500] | 1170 | 519422 |
| D6 | (500, ∞) | 1058 | 707309 |
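
The length intervals in the table are open on the left and closed on the right, so a protein of exactly 100 residues falls in D1 and one of 101 residues in D2. A minimal sketch of that assignment rule follows; the names `UPPER_BOUNDS`, `SUBSETS`, and `subset_for_length` are hypothetical and chosen for illustration, not taken from the paper.

```python
from bisect import bisect_left

# Upper bounds of the intervals (0, 100], (100, 200], ..., (400, 500];
# anything longer than 500 falls in the last subset. Names are illustrative.
UPPER_BOUNDS = [100, 200, 300, 400, 500]
SUBSETS = ["D1", "D2", "D3", "D4", "D5", "D6"]


def subset_for_length(length: int) -> str:
    """Return the subset label whose length interval contains `length`."""
    if length <= 0:
        raise ValueError("protein length must be positive")
    # bisect_left returns the index of the first bound >= length, which
    # matches intervals that are closed on the right.
    return SUBSETS[bisect_left(UPPER_BOUNDS, length)]
```

Using `bisect_left` rather than `bisect_right` is what keeps the boundary lengths (100, 200, and so on) in the lower subset, as the closed right bracket in the table requires.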