Table 2. Details of the benchmark datasets.

From: An accurate alignment-free protein sequence comparator based on physicochemical properties of amino acids

Dataset      Number of sequences  Approximate length (amino acids)  Source
-----------  -------------------  --------------------------------  -------------------------------------
Betaglobin   9                    150                               NCBI
Betaglobin   50                   150                               EM article [2] and fuzzy integral [5]
Betaglobin   88                   150                               Natural vector article [29]
ND5          9                    600                               NCBI
ND6          8                    175                               NCBI
Coronavirus  24                   1500                              NCBI
Coronavirus  50                   1500                              EM article [2] and fuzzy integral [5]
TF           24                   700                               NCBI
AFP          27                   140                               EM article [2]
HRV          114                  2200                              Natural vector article [29]
Xylanases    20                   500                               Fuzzy integral article [5]
Influenza A  1163                 480                               Natural vector article [29]