Table 1 Characteristic features of the analyzed pentapeptide categories.

From: Global pentapeptide statistics are far away from expected distributions

| Dataset | α | Pentapeptide category | abcde | a2bcd | a2b2c | a3bc | a3b2 | a4b | a5 | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| | | # different residues | 5 | 4 | 3 | 3 | 2 | 2 | 1 | |
| | | # permutation classes, g | 15504 | 19380 | 3420 | 3420 | 380 | 380 | 20 | 42504 |
| | | # sequences in each class, m | 120 | 60 | 30 | 20 | 10 | 5 | 1 | |
| | | # sequences in category, g*m | 1860480 | 1162800 | 102600 | 68400 | 3800 | 1900 | 20 | 3200000 |
| SQ | | # peptides | 11826966639 | 10678009933 | 1474713851 | 1250343041 | 137441351 | 118960963 | 16794874 | 25145695663 |
| DM | | # peptides | 8403194988 | 7331367618 | 946224643 | 761468629 | 66880015 | 48819824 | 2475299 | 17560431016 |
| DM | | % peptides | 71.1 | 68.7 | 64.2 | 60.9 | 48.7 | 41.0 | 14.7 | 69.8 |
| DM | | avr. count of peptides per sequence | 4517 | 6305 | 9222 | 11133 | 17600 | 25695 | 123765 | 5488 |
| DM | 0.05 | # outlier peptides | 493138899 | 424035844 | 53363368 | 42424412 | 4159480 | 7698699 | | 1024820702 |
| DM | 0.05 | % outlier peptides | 5.9 | 5.8 | 5.6 | 5.6 | 6.2 | 15.8 | | 5.8 |
| DM | 0.05 | # sequences with high-abundance outliers | 53101 | 39865 | 4056 | 2897 | 203 | 248 | | 100370 |
| DM | 0.05 | # sequences with low-abundance outliers | 0 | 1 | 4 | 12 | 8 | 114 | | 139 |
| DM | 0.05 | # sequences with outliers | 53101 | 39866 | 4060 | 2909 | 211 | 362 | | 100509 |
| DM | 0.05 | % sequences with outliers | 2.9 | 3.4 | 4.0 | 4.3 | 5.6 | 19.1 | | 3.1 |
| DM | 0.05 | # classes with no outliers | 1268 | 4079 | 1298 | 1622 | 229 | 204 | 20 | 8720 |
| DM | 0.05 | % classes with no outliers | 8.2 | 21.0 | 38.0 | 47.4 | 60.3 | 53.7 | | 20.5 |
| DM | 0.001 | # outlier peptides | 280213162 | 220140554 | 26341865 | 15042352 | 1528111 | 5171988 | | 548438032 |
| DM | 0.001 | # sequences with outliers | 23395 | 15522 | 1445 | 814 | 60 | 159 | | 41395 |
| DM | 0.001 | # classes with no outliers | 4442 | 9749 | 2314 | 2714 | 322 | 276 | 20 | 19837 |
| ND | | # peptides | 1945203546 | 1941405338 | 318589904 | 302524159 | 46571312 | 47671781 | 9690386 | 4611656426 |
| ND | | % peptides | 16.4 | 18.2 | 21.6 | 24.2 | 33.9 | 40.1 | 57.7 | 18.1 |
| ND | | avr. count of peptides per sequence | 1046 | 1670 | 3105 | 4423 | 12256 | 25090 | 484519 | 1441 |
| ND | 0.05 | # outlier peptides | 65287845 | 60332804 | 11852693 | 10445205 | 7080780 | 8322425 | | 163321752 |
| ND | 0.05 | % outlier peptides | 3.4 | 3.1 | 3.7 | 3.5 | 15.2 | 17.5 | | 3.5 |
| ND | 0.05 | # sequences with high-abundance outliers | 45401 | 31854 | 3205 | 2331 | 255 | 257 | | 83303 |
| ND | 0.05 | # sequences with low-abundance outliers | 0 | 6 | 9 | 19 | 20 | 84 | | 138 |
| ND | 0.05 | # sequences with outliers | 45401 | 31860 | 3214 | 2350 | 275 | 341 | | 83441 |
| ND | 0.05 | % sequences with outliers | 2.4 | 2.7 | 3.1 | 3.4 | 7.2 | 17.9 | | 2.6 |
| ND | 0.05 | # classes with no outliers | 2400 | 6238 | 1673 | 1956 | 183 | 215 | 20 | 12685 |
| ND | 0.05 | % classes with no outliers | 15.5 | 32.2 | 48.9 | 57.2 | 48.2 | 56.6 | | 29.8 |
| ND | 0.001 | # outlier peptides | 29230408 | 23141426 | 3649728 | 2180996 | 4410942 | 4902831 | | 67516331 |
| ND | 0.001 | # sequences with outliers | 17112 | 10753 | 884 | 535 | 109 | 164 | | 29557 |
| ND | 0.001 | # classes with no outliers | 6940 | 12430 | 2713 | 2951 | 276 | 273 | 20 | 25603 |
| NN | | # peptides | 1309190957 | 1254681724 | 190474037 | 170300897 | 22748621 | 21599223 | 4612762 | 2973608221 |
| NN | | % peptides | 11.1 | 11.8 | 12.9 | 13.6 | 16.6 | 18.2 | 27.5 | 11.7 |
| NN | | avr. count of peptides per sequence | 704 | 1079 | 1856 | 2490 | 5986 | 11368 | 230638 | 929 |
| NN | 0.05 | # outlier peptides | 27845910 | 27168059 | 5035864 | 4734407 | 3100618 | 4456107 | | 72340965 |
| NN | 0.05 | % outlier peptides | 2.1 | 2.2 | 2.6 | 2.8 | 13.6 | 20.6 | | 2.4 |
| NN | 0.05 | # sequences with high-abundance outliers | 28667 | 21968 | 2487 | 1822 | 304 | 284 | | 55532 |
| NN | 0.05 | # sequences with low-abundance outliers | 1 | 10 | 10 | 18 | 12 | 93 | | 144 |
| NN | 0.05 | # sequences with outliers | 28668 | 21978 | 2497 | 1840 | 316 | 377 | | 55676 |
| NN | 0.05 | % sequences with outliers | 1.5 | 1.9 | 2.4 | 2.7 | 8.3 | 19.8 | | 1.7 |
| NN | 0.05 | # classes with no outliers | 4067 | 8309 | 1902 | 2153 | 146 | 202 | 20 | 16799 |
| NN | 0.05 | % classes with no outliers | 26.2 | 42.9 | 55.6 | 63.0 | 38.4 | 53.2 | | 39.5 |
| NN | 0.001 | # outlier peptides | 10275506 | 8847995 | 1465607 | 683925 | 2104751 | 2921002 | | 26298786 |
| NN | 0.001 | # sequences with outliers | 10017 | 6975 | 665 | 355 | 156 | 192 | | 18360 |
| NN | 0.001 | # classes with no outliers | 9279 | 14491 | 2856 | 3105 | 229 | 259 | 20 | 30239 |
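
The combinatorial rows of the table (g, m and g*m) can be reproduced by direct counting. The following is a minimal sketch, assuming an alphabet of 20 standard amino acids and reading each category name as a multiplicity pattern (for example, a2bcd as one residue occurring twice plus three distinct single residues). The names `PATTERNS`, `permutation_classes` and `sequences_per_class` are illustrative and do not come from the article.

```python
from math import comb, factorial

ALPHABET_SIZE = 20  # assumption: the 20 standard amino acids

# Assumed reading of the category names as residue multiplicities.
PATTERNS = {
    "abcde": (1, 1, 1, 1, 1),
    "a2bcd": (2, 1, 1, 1),
    "a2b2c": (2, 2, 1),
    "a3bc": (3, 1, 1),
    "a3b2": (3, 2),
    "a4b": (4, 1),
    "a5": (5,),
}


def permutation_classes(pattern, n=ALPHABET_SIZE):
    """g: number of distinct residue compositions matching the pattern."""
    g = 1
    remaining = n
    # Residues sharing the same multiplicity are interchangeable, so each
    # group of equal multiplicity is chosen as an unordered set.
    for mult in sorted(set(pattern), reverse=True):
        k = pattern.count(mult)
        g *= comb(remaining, k)
        remaining -= k
    return g


def sequences_per_class(pattern):
    """m: number of distinct orderings of one composition (multinomial)."""
    m = factorial(sum(pattern))
    for mult in pattern:
        m //= factorial(mult)
    return m


if __name__ == "__main__":
    total_g = total_gm = 0
    for name, pattern in PATTERNS.items():
        g = permutation_classes(pattern)
        m = sequences_per_class(pattern)
        total_g += g
        total_gm += g * m
        print(f"{name:6s} g={g:6d} m={m:4d} g*m={g * m:8d}")
    # Totals: 42504 classes and 20**5 = 3200000 sequences.
    print("total", total_g, total_gm)
```

Under the same reading, the "avr. count of peptides per sequence" row appears to be the peptide count divided by g*m; for instance, 8403194988 / 1860480 rounds to 4517 for DM abcde.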