Table 1 Overview of the main results for the experiments using the Rotterdam Study, UK Biobank, and Swedish Schizophrenia Exome Sequencing study data.

From: GenNet framework: interpretable deep learning for predicting phenotypes from genetic data

| Dataset (type) | Trait | Subjects & phenotype: Class I | Subjects & phenotype: Class II | AUC gene test (val) | Top three predictive genes | AUC pathway test (val) | Top predictive pathway: Global level | Top predictive pathway: Middle level | Top predictive pathway: Local level |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Rotterdam Study (genotype array) | Eye color | 4041 Blue | 2250 Other | 0.75 (0.74) | HERC2, OCA2, LAMC1 | 0.50 (0.52) | Organismal Systems (78.4%) | Digestive system (72.6%) | Pancreatic secretion (59.1%) |
| UK Biobank (exome) | Hair color | 1734 Red | 1727 Other | 0.93 (0.94) | MC1R*, SHOC2, DCTN3 | 0.77 (0.77) | Genetic Information Processing (87.4%) | Replication and repair (83.4%) | Fanconi anemia pathway (79.7%) |
|  |  | 3762 Black | 3753 Other | 0.80 (0.82) | OCA2, RPL23AP8, MC1R* | 0.76 (0.78) | Organismal Systems (46.9%) | Endocrine system (18.6%) | Axon guidance (5.0%) |
|  |  | 4501 Blond | 4518 Other | 0.66 (0.65) | OCA2, TC2N, SLC45A2 | 0.58 (0.57) | Organismal Systems (70.2%) | Endocrine system (30.1%) | Adrenergic signaling in cardiomyocytes (4.1%) |
|  | Bipolar disorder | 343 Cases | 347 Controls | 0.60 (0.56) | LINC00266-1, CSMD1, TCERG1L | 0.47 (0.55) | Organismal Systems (76.9%) | Endocrine system (46.5%) | Melanogenesis (32.9%) |
|  | Atrial fibrillation | 192 Cases | 194 Controls | 0.56 (0.59) | **BRINP1, SORBS3, ELMOD3 | 0.57 (0.63) | Organismal Systems (39.6%) | Signal transduction (11.2%) | Cytokine-cytokine receptor interaction (4.4%) |
|  | Coronary Artery Disease | 1563 Cases | 1600 Controls | 0.56 (0.58) | **STARD7-AS1, VWC2L, NSD2 | 0.54 (0.56) | Environmental Information Processing (29.5%) | Signal transduction (27.7%) | PI3K-Akt signaling pathway (4.6%) |
|  | Dementia | 139 Cases | 142 Controls | 0.60 (0.65) | RPL23AP87, CTNNA3, LINC01003 | 0.55 (0.58) | Human Diseases (39.8%) | Signal transduction (22.2%) | Pathways in cancer (5.6%) |
|  | Male balding pattern | 3454 Balding pattern 1 | 3454 Balding pattern 4 | 0.56 (0.57) | NGEF, NKRD18B, SYNJ2 | 0.54 (0.55) | Organismal Systems (34.6%) | Nervous system (9.7%) | Metabolic pathways (8.7%) |
|  | Asthma | 4229 Cases | 4214 Controls | 0.55 (0.57) | HLA-DQB1, HCG9, LINC00266-1 | 0.51 (0.54) | Genetic Information Processing (52.3%) | Folding, sorting and degradation (41.5%) | Ubiquitin mediated proteolysis (22.0%) |
|  | Diabetes | 2557 Cases | 2555 Controls | 0.54 (0.57) | **DNAH10, SNAR-I, PSMD13 | 0.54 (0.54) | Environmental Information Processing (43.5%) | Signal transduction (40.5%) | Ras signaling pathway (7.9%) |
|  | Breast cancer | 1070 Cases | 1082 Controls | 0.51 (0.56) | RPL23AP87, LINC00266-1, HPSE | 0.51 (0.56) | Human Diseases (57.1%) | Infectious diseases: Viral (16.6%) | Pathways in cancer (6.5%) |
| Sweden (exome) | Schizophrenia | 4969 Cases | 6245 Controls | 0.74 (0.73) | ZNF773, PCNT, DYSF | 0.68 (0.67) | Human Diseases (30.8%) | Infectious diseases: Viral (27.3%) | Human papillomavirus infection (11.7%) |

  1. The AUC of the network with gene annotations is shown in bold if the network outperformed or matched LASSO regression. Manhattan plots for the genes can be found in Supplementary Notes 3–5. *MC1R was not annotated but was identified by linkage disequilibrium. **Many genes contributed to the prediction without clear separation between genes (see Supplementary Note 4).