Table 1. Metrics computed on the 233,756 protein functional clusters (PFCs) from the sequence similarity network of MAG proteins.
From: Towards omics-based predictions of planktonic functional composition from environmental data
PFC size

| Metric | Value |
|---|---|
| Mean | 3.24 |
| Minimum | 2 |
| Maximum | 1,072 |

Functional scores: homogeneity

| Annotation source | Mean homogeneity score | Number of NA values |
|---|---|---|
| EggNOG | 0.94 | 35,818 |
| KEGG | 0.99 | 113,321 |

Functional scores: unknowns quantification

| Annotation source | PFCs only composed of annotated proteins (% of total PFCs) | PFCs with at least one annotated protein (% of total PFCs) | PFCs only composed of unknown proteins (% of total PFCs) |
|---|---|---|---|
| EggNOG | 181,595 (77.7%) | 197,938 (84.7%) | 35,818 (15.3%) |
| KEGG | 91,103 (39.0%) | 120,435 (51.5%) | 113,321 (48.5%) |

Taxonomy scores: homogeneity

| Level | PFCs associated with only 1 taxon at this level | % of total PFCs | % of PFCs with at least one annotation at this level |
|---|---|---|---|
| Phylum | 221,541 | 94.8% | 97.5% |
| Class | 192,095 | 82.2% | 96.8% |
| Order | 144,265 | 61.7% | 93.8% |
| Family | 100,801 | 43.12% | 95.3% |
| Genus | 21,921 | 9.4% | 91.9% |
| MAG | 7,146 | 3.1% | |

Taxonomy scores: unknowns quantification

| Level | Only proteins from annotated MAGs (% of total PFCs) | Only proteins from unannotated MAGs (% of total PFCs) |
|---|---|---|
| Phylum | 220,839 (94.5%) | 6,367 (2.7%) |
| Class | 186,331 (79.7%) | 35,338 (15.1%) |
| Order | 135,046 (57.8%) | 79,921 (34.2%) |
| Family | 88,404 (37.8%) | 128,010 (54.76%) |
| Genus | 13,544 (5.8%) | 209,892 (89.8%) |
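The "% of total PFCs" figures in Table 1 can be recomputed from the reported counts and the total of 233,756 PFCs. The following is a minimal sketch of that arithmetic for the functional "unknowns quantification" counts; the names `TOTAL_PFCS`, `FUNCTIONAL_COUNTS`, `percent_of_total`, and `is_partition` are illustrative assumptions and do not come from the article, and rounding to one decimal place is assumed to match the table's presentation.

```python
# Counts copied from Table 1. All names below are illustrative, not the authors'.
TOTAL_PFCS = 233_756

FUNCTIONAL_COUNTS = {
    "EggNOG": {
        "only_annotated": 181_595,
        "at_least_one_annotated": 197_938,
        "only_unknown": 35_818,
    },
    "KEGG": {
        "only_annotated": 91_103,
        "at_least_one_annotated": 120_435,
        "only_unknown": 113_321,
    },
}


def percent_of_total(count: int, total: int = TOTAL_PFCS) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


def is_partition(counts: dict, total: int = TOTAL_PFCS) -> bool:
    """Check that 'at least one annotated' plus 'only unknown' PFCs equals total."""
    return counts["at_least_one_annotated"] + counts["only_unknown"] == total


if __name__ == "__main__":
    for database, counts in FUNCTIONAL_COUNTS.items():
        for label, count in counts.items():
            print(f"{database:6s} {label:24s} {count:>8,d} {percent_of_total(count):5.1f}%")
        print(f"{database:6s} categories sum to total: {is_partition(counts)}")
```

For both EggNOG and KEGG, the PFCs with at least one annotated protein and the PFCs composed only of unknown proteins add up to the 233,756 total, so these two categories cover every PFC. In the table, the number of NA homogeneity values for each annotation source also equals the count of PFCs composed only of unknown proteins.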