Table 2 Machine Learning Model Results

From: Using genomic context informed genotype data and within-model ancestry adjustment to classify type 2 diabetes

| Model Input | Model Type | Architecture Diagram | Task(s) | T2D Classification AUC (95% CI) | Estimated PC MSE | Estimated PC R² | T2D Classification from Estimated PCs AUC (95% CI) | T2D Classification from Alternate Input AUC (95% CI) |
|---|---|---|---|---|---|---|---|---|
| PCs | NN | Fig. 1a | T2D | 0.56 (0.54–0.56) | --- | --- | --- | --- |
| Genotype | NN | Fig. 1b | PCs, T2D from PCs (stop gradient) | --- | 0.27 | 0.73 | 0.55 (0.54–0.56) | --- |
| PC adjusted PRS | LR | --- | T2D | 0.57 (0.56–0.58) | --- | --- | --- | --- |
| PRS-CS | LR | --- | T2D | 0.59 (0.58–0.60) | --- | --- | --- | --- |
| LDpred2 | LR | --- | T2D | 0.63 (0.62–0.64) | --- | --- | --- | --- |
| Genotype | NN | Fig. 1c | T2D, PCs, T2D from PCs | 0.66 (0.65–0.67) | 0.38 | 0.62 | 0.56 (0.55–0.57) | --- |
| Genotype | NN | Fig. 1d | T2D, PCs (stop gradient) | 0.65 (0.64–0.66) | 1.70 | <0 | --- | --- |
| Genotype | NN | Fig. 1e | T2D, PCs (adversarial) | 0.66 (0.65–0.67) | 2.32 | <0 | --- | --- |
| CID (Alternate: Genotype) | CNN | Fig. 1f | T2D, T2D from genotype, PCs (stop gradient) | 0.62 (0.61–0.63) | 0.69 | <0 | --- | 0.63 (0.62–0.64) |
| CID (Alternate: Genotype) | CNN | Fig. 1g | T2D from CID, T2D from genotype (stop gradient), PCs (stop gradient) | 0.65 (0.64–0.66) | 0.66 | <0 | --- | 0.54 (0.53–0.55) |
| CID (Alternate: Genotype) | CNN | Fig. 1h | T2D from CID, T2D from genotype (adversarial), PCs (adversarial) | 0.59 (0.58–0.60) | 3.30 | <0 | --- | 0.50 (0.50–0.50) |
| Genotype (Alternate: CID) | CNN | Fig. 1i | T2D from genotype, T2D from CID (adversarial), PCs (adversarial) | 0.57 (0.56–0.58) | 3.37 | <0 | --- | 0.50 (0.49–0.51) |
| h + i intermediate layer output | NN | --- | T2D | 0.61 (0.60–0.62) | --- | --- | --- | --- |

PC: Principal Component; NN: Neural Network; T2D: Type 2 Diabetes; CID: Context Informed Data matrix; CNN: Convolutional Neural Network.
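The table reports three kinds of metric: AUC with a 95% CI, MSE, and R². The source does not state how the confidence intervals or the PC metrics were computed, so the following is only a minimal sketch under stated assumptions: a percentile bootstrap for the AUC interval, and the usual definitions of MSE and R². All function names and the synthetic data are hypothetical and are not taken from the paper. The sketch also illustrates why R² can be reported as "<0": R² is negative whenever the predictions have a larger squared error than simply predicting the mean of the targets.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: probability that a random case outscores a random control.

    Ties count as one half. This is the Mann-Whitney formulation of ROC AUC.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for AUC (ASSUMPTION: not necessarily the paper's method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return 1 - ss_res / ss_tot


# Synthetic demo data (hypothetical): scores are weakly informative of the label.
rng = random.Random(1)
labels = [rng.randint(0, 1) for _ in range(300)]
scores = [0.5 * y + rng.gauss(0.0, 1.0) for y in labels]

point = auc(labels, scores)
low, high = bootstrap_auc_ci(labels, scores)
print(f"AUC {point:.2f} ({low:.2f}-{high:.2f})")

# Predictions that are worse than predicting the mean give a negative R².
print(r2([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))  # -3.0
```

The pairwise AUC above is quadratic in the sample size, which keeps the sketch dependency-free; for cohort-scale data a sort-based implementation or a library routine would be the more practical choice.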