Table 1 Prediction accuracy of BrainGENIE (40-PC model) computed via fivefold cross-validation of the GTEx Project (v.8) release across the 12 brain tissues.

From: BrainGENIE: The Brain Gene Expression and Network Imputation Engine

  

| Brain tissue | n donors (paired blood–brain transcriptomes) | All predicted genes: # of measured genes | All predicted genes: % of genes measured in tissue compared to all annotated genes (GRCh38) | All predicted genes: mean training R² | All predicted genes: mean CV R² | All predicted genes: max CV R² | Significantly predicted genes: # of genes | Significantly predicted genes: % of measured genes | Significantly predicted genes: % of genes measured in tissue compared to all annotated genes (GRCh38) | Significantly predicted genes: mean training R² | Significantly predicted genes: mean CV R² | Significantly predicted genes: mean CV Pearson’s r |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Amygdala | 88 | 18,957 | 34 | 0.65 | 0.05 | 0.39 | 4265 | 22 | 8 | 0.70 | 0.13 | 0.36 |
| Anterior cingulate cortex BA24 | 99 | 19,236 | 34 | 0.56 | 0.05 | 0.43 | 6799 | 35 | 12 | 0.59 | 0.11 | 0.33 |
| Caudate basal ganglia | 137 | 20,524 | 37 | 0.46 | 0.06 | 0.38 | 10,772 | 52 | 19 | 0.50 | 0.09 | 0.30 |
| Cerebellum (Fresh frozen) | 131 | 20,540 | 37 | 0.48 | 0.06 | 0.5 | 9573 | 47 | 17 | 0.53 | 0.11 | 0.33 |
| Cerebellum (PAXgene preserved) | 154 | 21,494 | 38 | 0.42 | 0.05 | 0.39 | 10,098 | 47 | 18 | 0.47 | 0.09 | 0.30 |
| Frontal cortex (PAXgene preserved) | 141 | 20,340 | 36 | 0.42 | 0.06 | 0.5 | 10,186 | 50 | 18 | 0.46 | 0.10 | 0.32 |
| Frontal cortex (Fresh frozen) | 125 | 19,983 | 36 | 0.49 | 0.07 | 0.5 | 11,816 | 59 | 21 | 0.53 | 0.11 | 0.33 |
| Hippocampus | 122 | 20,189 | 36 | 0.47 | 0.03 | 0.46 | 4008 | 20 | 7 | 0.53 | 0.10 | 0.32 |
| Hypothalamus | 114 | 20,839 | 37 | 0.50 | 0.04 | 0.56 | 5749 | 28 | 10 | 0.56 | 0.11 | 0.33 |
| Nucleus accumbens basal ganglia | 148 | 20,884 | 37 | 0.43 | 0.05 | 0.46 | 11,252 | 54 | 20 | 0.47 | 0.09 | 0.30 |
| Putamen basal ganglia | 125 | 19,330 | 34 | 0.49 | 0.06 | 0.54 | 10,350 | 54 | 18 | 0.53 | 0.11 | 0.33 |
| Substantia nigra | 86 | 18,882 | 34 | 0.62 | 0.04 | 0.51 | 2947 | 16 | 5 | 0.69 | 0.14 | 0.37 |

  1. Genes expressed in whole blood account for 31% of all genes annotated in the GRCh38 genome assembly. Genes were declared “significantly predicted” if they met both of the following criteria: cross-validation (CV) R² ≥ 0.01 and CV FDR-adjusted p ≤ 0.05. The total number of genes detected in each brain tissue for which BrainGENIE models were trained appears in the third column (“# of measured genes”). The proportion of detected genes per brain tissue that were significantly predicted by BrainGENIE appears in the ninth column (“% of measured genes”).
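
The footnote's significance criteria and the "% of measured genes" column can be illustrated with a minimal Python sketch. This is not BrainGENIE's own code: the function names are hypothetical, the per-gene toy values are invented for illustration, and the FDR-adjusted p-values are assumed to have been computed beforehand. Only the two thresholds and the amygdala counts (4265 of 18,957) come from the table.

```python
def significantly_predicted(cv_r2, cv_fdr_p, r2_min=0.01, fdr_max=0.05):
    """Flag each gene that meets both thresholds stated in the footnote.

    cv_r2    : per-gene cross-validation R^2 values
    cv_fdr_p : per-gene FDR-adjusted p-values (assumed precomputed)
    """
    return [r2 >= r2_min and p <= fdr_max for r2, p in zip(cv_r2, cv_fdr_p)]


def percent_of_measured(n_significant, n_measured):
    """Share of measured genes that were significantly predicted, in percent."""
    return 100.0 * n_significant / n_measured


# Toy per-gene values (hypothetical, not taken from GTEx).
toy_cv_r2 = [0.13, 0.005, 0.20, 0.02]
toy_fdr_p = [0.01, 0.03, 0.20, 0.05]
flags = significantly_predicted(toy_cv_r2, toy_fdr_p)

# Amygdala row of the table: 4265 significant genes out of 18,957 measured.
amygdala_pct = percent_of_measured(4265, 18957)
print(flags, round(amygdala_pct))
```

Rounded to the nearest whole percent, the amygdala ratio matches the value reported in the ninth column for that tissue.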