Table 1 TOP: BioGRID networks (version 3.4.164, downloaded Sept. 2018), sorted by number of edges.

From: SANA: cross-species prediction of Gene Ontology GO annotations via topological network alignment

| Species | Short name | Common name | Nodes | Edges | Mean degree | Max degree |
|---|---|---|---|---|---|---|
| H. sapiens | HS | Human | 17,200 | 282,181 | 32.8 | 2385 |
| S. cerevisiae | SC | Baker’s yeast | 5984 | 104,962 | 35.1 | 3603 |
| D. melanogaster | DM | Fruit fly | 8728 | 46,364 | 10.6 | 266 |
| A. thaliana | AT | Thale cress | 9364 | 34,725 | 7.42 | 1341 |
| M. musculus | MM | Mouse | 6777 | 18,108 | 5.34 | 1671 |
| S. pombe | SP | Fission yeast | 2811 | 8931 | 6.36 | 298 |
| C. elegans | CE | Round worm | 3194 | 5572 | 3.49 | 181 |
| R. norvegicus | RN | Rat | 2391 | 3554 | 2.97 | 808 |
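The cleaning steps listed in the first footnote to this table, and the summary columns above, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `clean_edges` and `summarize`, the input format (pairs of protein identifiers plus a set of in-species proteins), and the choice to count only proteins that retain at least one edge as nodes are all assumptions made for illustration.

```python
def clean_edges(raw_edges, species_proteins):
    """Return the set of undirected edges kept after cleaning.

    raw_edges: iterable of (protein_a, protein_b) pairs (hypothetical format).
    species_proteins: set of identifiers belonging to the target species.
    """
    kept = set()
    for u, v in raw_edges:
        if u == v:
            continue  # drop self-loops
        if u not in species_proteins or v not in species_proteins:
            continue  # drop interactions with out-of-species proteins
        # frozenset makes (u, v) and (v, u) identical, so the graph is
        # undirected and duplicate edges collapse automatically
        kept.add(frozenset((u, v)))
    return kept


def summarize(edges):
    """Compute the table's columns for a cleaned edge set.

    Assumption: a node is any protein with at least one remaining edge.
    """
    degree = {}
    for edge in edges:
        for node in edge:
            degree[node] = degree.get(node, 0) + 1
    n, m = len(degree), len(edges)
    return {
        "nodes": n,
        "edges": m,
        # each undirected edge contributes to two node degrees
        "mean_degree": 2 * m / n if n else 0.0,
        "max_degree": max(degree.values(), default=0),
    }


# Toy input: one duplicate, one self-loop, one out-of-species partner ("x")
raw = [("a", "b"), ("b", "a"), ("a", "a"), ("b", "c"), ("c", "x")]
stats = summarize(clean_edges(raw, {"a", "b", "c"}))
```

The mean degree of an undirected graph is 2 × edges / nodes, which agrees with the table: for the human network, 2 × 282,181 / 17,200 ≈ 32.8.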

| Code | Description of sequence-based evidence (i.e., disallowed in our predictions) |
|---|---|
| IBA | curated transfer amongst related sequences Based on common Ancestry (derived by sequence comparison) |
| IEA | Electronic Annotation (strong sequence-based evidence not directly traceable to experimental evidence) |
| ISM | Inferred from sequence model |
| ISA | Inferred from sequence alignment |
| ISO | Inferred from Sequence Orthology |
| IGC | Inferred from Genomic Context |
| RCA | Inferred from Reviewed Computational Analysis |
| ISS | Inferred from Sequence or Structural Similarity |

  1. The graphs are undirected; duplicate edges, self-loops and all interactions with proteins outside the specified species were removed.
  2. BOTTOM: Sequence-based GO evidence codes disallowed in “NOSEQ” cases. Note that we are more draconian than is the norm in our interpretation of “sequence-based”: we disallow any code in which sequence could have had any influence, including manually curated sequence comparison. This strictness supports our hypothesis that NAF discovers semantic similarity “in the absence of sequence similarity”.
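As an illustration of the “NOSEQ” restriction, the following Python sketch drops every annotation whose evidence code appears in the BOTTOM table. The annotation format (tuples of protein, GO term, evidence code), the function name `filter_noseq`, and the sample identifiers are hypothetical; the paper's own tooling may represent annotations differently.

```python
# The eight evidence codes listed in the BOTTOM table
SEQUENCE_BASED_CODES = frozenset(
    {"IBA", "IEA", "ISM", "ISA", "ISO", "IGC", "RCA", "ISS"}
)


def filter_noseq(annotations):
    """Keep only annotations whose evidence code is not sequence-based.

    annotations: iterable of (protein, go_term, evidence_code) tuples
    (a hypothetical format chosen for this sketch).
    """
    return [a for a in annotations if a[2] not in SEQUENCE_BASED_CODES]


# Placeholder proteins and GO identifiers, for illustration only
sample = [
    ("P1", "GO:0000001", "IDA"),  # direct assay: kept
    ("P1", "GO:0000002", "IEA"),  # electronic annotation: dropped
    ("P2", "GO:0000003", "ISS"),  # sequence/structural similarity: dropped
    ("P2", "GO:0000004", "IMP"),  # mutant phenotype: kept
]
noseq = filter_noseq(sample)
```

A deny-list mirrors the table directly: any evidence code not named there is retained.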