Table 2 Number of predictions (associations between a gene function and a COG gene family) obtained using different gene function prediction methods based on neighborhoods.

From: Patterns of diverse gene functions in genomic neighborhoods predict gene function and phenotype

| Dataset | Method | Functions of any IC, precision 0.5 | Functions of any IC, precision 0.8 | General functions (2 < IC ≤ 4), precision 0.5 | General functions (2 < IC ≤ 4), precision 0.8 | Specific functions (IC > 4), precision 0.5 | Specific functions (IC > 4), precision 0.8 |
|---|---|---|---|---|---|---|---|
| Prokaryotes | 10-NN | 31,759 | 4,828 | 3,642 | 1,093 | 5,664 | 1,942 |
| Prokaryotes | GFP | 61,418 | 15,094 | 11,740 | 3,089 | 13,194 | 4,763 |
| Prokaryotes | NFP | 88,579 | 25,635 | 26,448 | 7,247 | 16,804 | 6,228 |
| Fungi | 10-NN | 65,370 | 448 | 17 | 0 | 1,284 | 448 |
| Fungi | GFP | 140,255 | 20,020 | 3 | 2 | 0 | 0 |
| Fungi | NFP | 178,687 | 20,204 | 3,592 | 355 | 7,668 | 1,579 |
| Metazoa | 10-NN | 66,403 | 327 | 912 | 74 | 936 | 253 |
| Metazoa | GFP | 89,992 | 464 | 2,552 | 11 | 390 | 94 |
| Metazoa | NFP | 102,631 | 3,057 | 9,546 | 769 | 988 | 217 |

10-NN, ten nearest neighbors; GFP, Gaussian Field Label Propagation (network-based approach); NFP, neighborhood function profile. IC, information content of GO term, where lower IC signifies more general functions. Bold numbers show the best method for a given combination of dataset, stringency and set of functions. The exhaustive list of annotations obtained by the NFP can be seen in Supplementary Material 2 and novel predictions in Supplementary Material 3.
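The comparison described in the footnote can be checked directly against the counts in the table. The sketch below is illustrative only: it assumes that "best" means the largest number of predictions for a given dataset, precision threshold and set of functions, which the footnote does not state explicitly. The names `COLUMNS`, `TABLE` and `best_methods` are hypothetical and not taken from the source.

```python
# Counts transcribed from Table 2. Each list follows the order in COLUMNS:
# (set of functions, precision threshold).
COLUMNS = [
    ("any IC", 0.5), ("any IC", 0.8),
    ("2 < IC <= 4", 0.5), ("2 < IC <= 4", 0.8),
    ("IC > 4", 0.5), ("IC > 4", 0.8),
]

TABLE = {
    "Prokaryotes": {
        "10-NN": [31759, 4828, 3642, 1093, 5664, 1942],
        "GFP": [61418, 15094, 11740, 3089, 13194, 4763],
        "NFP": [88579, 25635, 26448, 7247, 16804, 6228],
    },
    "Fungi": {
        "10-NN": [65370, 448, 17, 0, 1284, 448],
        "GFP": [140255, 20020, 3, 2, 0, 0],
        "NFP": [178687, 20204, 3592, 355, 7668, 1579],
    },
    "Metazoa": {
        "10-NN": [66403, 327, 912, 74, 936, 253],
        "GFP": [89992, 464, 2552, 11, 390, 94],
        "NFP": [102631, 3057, 9546, 769, 988, 217],
    },
}


def best_methods(table):
    """Return the method with the most predictions for each (dataset, column).

    Assumption: "best" is taken to mean the highest count.
    """
    best = {}
    for dataset, rows in table.items():
        for i, column in enumerate(COLUMNS):
            best[(dataset, column)] = max(rows, key=lambda method: rows[method][i])
    return best


if __name__ == "__main__":
    for (dataset, (functions, precision)), method in best_methods(TABLE).items():
        print(f"{dataset:12s} {functions:12s} precision {precision}: {method}")
```

Under this assumption, NFP has the largest count in every column except specific functions (IC > 4) at precision 0.8 in Metazoa, where 10-NN yields 253 predictions against 217 for NFP.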