Table 2 Disease/outcome agnostic prediction: AUROC scores on different pretraining objectives for the 10 common and 10 uncommon diseases in Table 1

From: TransformEHR: transformer-based encoder-decoder generative model to enhance prediction of disease outcomes using electronic health records

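| Disease/outcome | Prior history | BERT | TransformEHR |
| --- | --- | --- | --- |
| Chronic PTSD | R | 81.04 ±0.11 | 83.73 ±0.07 |
| Chronic PTSD | 0 | 76.74 ±0.17 | 77.95 ±0.12 |
| Type 2 diabetes | R | 85.00 ±0.10 | 85.72 ±0.07 |
| Type 2 diabetes | 0 | 79.97 ±0.04 | 81.84 ±0.05 |
| Hyperlipidemia | R | 86.78 ±0.03 | 88.04 ±0.05 |
| Hyperlipidemia | 0 | 81.28 ±0.08 | 83.42 ±0.08 |
| Loin pain | R | 81.47 ±0.04 | 88.24 ±0.05 |
| Loin pain | 0 | 76.88 ±0.12 | 85.37 ±0.08 |
| Low back pain | R | 85.43 ±0.07 | 86.94 ±0.03 |
| Low back pain | 0 | 80.16 ±0.07 | 82.30 ±0.10 |
| Obstructive sleep apnea | R | 80.74 ±0.17 | 82.25 ±0.16 |
| Obstructive sleep apnea | 0 | 73.06 ±0.08 | 74.69 ±0.19 |
| Depression | R | 86.73 ±0.05 | 87.66 ±0.12 |
| Depression | 0 | 82.60 ±0.12 | 83.85 ±0.11 |
| Obstructive airway disease | R | 83.57 ±0.14 | 86.19 ±0.07 |
| Obstructive airway disease | 0 | 76.99 ±0.08 | 80.27 ±0.07 |
| Gastroesophageal reflux | R | 84.98 ±0.28 | 91.07 ±0.11 |
| Gastroesophageal reflux | 0 | 76.29 ±0.36 | 83.41 ±0.33 |
| Arteriosclerosis | R | 82.21 ±0.06 | 88.79 ±0.10 |
| Arteriosclerosis | 0 | 75.78 ±0.08 | 80.03 ±0.20 |
| Uncommon disease/outcome | 0 | 75.63 ±0.12 | 80.11 ±0.12 |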

  1. Many common diseases are chronic in nature. We therefore study whether prior history of the same disease has an impact on prediction performance, where R is recurrent and 0 is new disease onset. ± represents standard deviation.