Table 3 Selected^a CpG-gene-cancer trios suggesting DNA methylation influencing cancer risk by regulating gene expression

From: Integrating muti-omics data to identify tissue-specific DNA methylation biomarkers for cancer risk

| CpG | Chr | Pos (HG19) | Gene | Distance (Mb) | Cytoband | CpG-cancer Dir | CpG-cancer P^b | CpG-gene Dir | CpG-gene P^c | Gene-cancer Dir | Gene-cancer P^d |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Breast cancer | | | | | | | | | | | |
| cg23231268 | 3 | 46,792,462 | CCR9 | 0.848 | 3p21.31 | + | 1.43 × 10^-8 | - | 1.15 × 10^-4 | - | 7.36 × 10^-8 |
| cg18035979 | 5 | 81,575,199 | ATP6AP1L | 0.000 | 5q14.2 | + | 1.21 × 10^-9 | - | 3.05 × 10^-5 | - | 1.95 × 10^-9 |
| cg14587961 | 7 | 99,991,523 | PILRA | Body | 7q22.1 | + | 3.50 × 10^-7 | + | 3.70 × 10^-5 | + | 1.45 × 10^-6 |
| cg07546779 | 8 | 29,495,175 | LEPROTL1 | -0.458 | 8p12 | - | 3.54 × 10^-13 | - | 6.25 × 10^-4 | + | 4.97 × 10^-45 |
| cg02301815 | 17 | 44,249,491 | KANSL1-AS1 | -0.021 | 17q21.31 | + | 1.74 × 10^-8 | - | 2.39 × 10^-6 | - | 6.00 × 10^-11 |
| Colorectal cancer | | | | | | | | | | | |
| cg20019365 | 2 | 219,134,978 | RP11-378A13.1 | 0.013 | 2q35 | + | 5.06 × 10^-13 | + | 7.41 × 10^-5 | + | 4.27 × 10^-19 |
| cg14130039 | 6 | 32,121,225 | HLA-DPA1 | -0.911 | 6p21.32 | - | 4.04 × 10^-10 | - | 3.82 × 10^-4 | + | 0.01 |
| cg12934461 | 15 | 90,792,652 | MAN2A2 | -0.653 | 15q26.1 | + | 9.66 × 10^-9 | - | 6.27 × 10^-4 | - | 5.05 × 10^-130 |
| cg19877683 | 17 | 80,969,515 | FN3KRP | 0.281 | 17q25.3 | - | 7.53 × 10^-8 | - | 1.87 × 10^-5 | + | 8.10 × 10^-25 |
| cg19133199 | 19 | 41,869,409 | B9D2 | Body | 19q13.2 | + | 2.48 × 10^-15 | - | 3.14 × 10^-5 | - | 3.08 × 10^-10 |
| Renal cell cancer | | | | | | | | | | | |
| cg13524857 | 11 | 69,240,192 | CCND1 | 0.216 | 11q13.3 | + | 1.61 × 10^-7 | + | 3.61 × 10^-3 | + | 6.11 × 10^-62 |
| Lung cancer | | | | | | | | | | | |
| cg09476067 | 6 | 30,418,581 | TRIM39 | 0.107 | 6p21.33 | - | 1.42 × 10^-19 | + | 1.51 × 10^-4 | - | 3.29 × 10^-17 |
| cg15732223 | 11 | 118,551,206 | TREH | 0.001 | 11q23.3 | + | 3.46 × 10^-8 | + | 5.44 × 10^-4 | + | 3.18 × 10^-6 |
| cg05651442 | 12 | 52,347,030 | KRT2 | -0.691 | 12q13.13 | + | 7.55 × 10^-8 | - | 7.78 × 10^-4 | - | 8.82 × 10^-79 |
| cg22563815 | 15 | 78,856,949 | CHRNA3 | -0.028 | 15q25 | - | 1.02 × 10^-26 | + | 4.08 × 10^-5 | - | 3.91 × 10^-35 |
| cg26812862 | 17 | 66,011,719 | HELZ | 0.770 | 17q24.3 | + | 1.20 × 10^-9 | - | 1.18 × 10^-3 | - | 6.89 × 10^-27 |
| Ovarian cancer | | | | | | | | | | | |
| cg18750960 | 2 | 177,016,417 | HOXD4 | Body | 2q31.1 | - | 8.85 × 10^-11 | - | 3.74 × 10^-3 | + | 8.65 × 10^-8 |
| cg09087803 | 11 | 32,125,186 | QSER1 | -0.790 | 11p13 | + | 1.18 × 10^-7 | - | 1.97 × 10^-3 | - | 1.99 × 10^-17 |
| cg17117718 | 17 | 43,663,208 | LRRC37A4P | 0.036 | 17q21.31 | + | 2.37 × 10^-10 | - | 1.28 × 10^-12 | - | 1.13 × 10^-13 |
| Prostate cancer | | | | | | | | | | | |
| cg24838316 | 6 | 29,895,260 | ZFP57 | 0.246 | 6p22.1 | - | 6.31 × 10^-11 | + | 1.49 × 10^-4 | - | 5.41 × 10^-6 |
| cg16237302 | 11 | 47,429,196 | ARFGAP2 | 0.231 | 11p11.2 | + | 1.11 × 10^-8 | - | 3.59 × 10^-4 | - | 5.15 × 10^-4 |
| cg00524169 | 19 | 39,138,063 | SAMD4B | -0.695 | 19q13.2 | - | 3.24 × 10^-9 | + | 2.18 × 10^-4 | - | 1.09 × 10^-16 |
| cg15272956 | 20 | 62,332,704 | RTEL1 | 0.005 | 20q13.33 | + | 1.73 × 10^-23 | + | 4.23 × 10^-4 | + | 6.13 × 10^-41 |
| cg05092891 | 21 | 37,535,885 | MORC3 | -0.170 | 21q22.12 | - | 3.65 × 10^-8 | + | 5.70 × 10^-6 | - | 1.34 × 10^-31 |
| Testicular germ cell cancer | | | | | | | | | | | |
| cg23581489 | 6 | 33,164,210 | B3GALT4 | -0.081 | 6p21.32 | + | 2.40 × 10^-9 | - | 1.99 × 10^-4 | - | 4.43 × 10^-97 |
| cg22340370 | 7 | 2,019,882 | MRM2 | -0.254 | 7p22.3 | + | 2.21 × 10^-16 | + | 1.07 × 10^-3 | + | 5.29 × 10^-6 |
| cg13353244 | 16 | 50,099,780 | BRD7 | -0.248 | 16q12.1 | - | 2.12 × 10^-8 | + | 4.69 × 10^-4 | - | 0.02 |
| cg04198914 | 17 | 36,106,025 | C17orf78 | 0.353 | 17q21.2 | - | 1.12 × 10^-19 | + | 4.79 × 10^-4 | - | 2.79 × 10^-77 |

  1. Chr, chromosome; Mb, megabase; Dir, association direction.
  2. a Selected from 854 CpG-gene-cancer trios demonstrating consistent directions of CpG-cancer, CpG-gene, and gene-cancer associations. In each trio, all three associations were statistically significant. Because such trios are numerous, at most five trios in distinct loci are presented for each cancer; all other trios are available in Supplementary Data 21–27.
  3. b P-values of associations between genetically predicted DNA methylation and cancer risk evaluated by applying GTEx-based DNA methylation prediction models to cancer GWAS data using SPrediXcan. Associations with Bonferroni-corrected P < 0.05 were considered significant.
  4. c P-values of associations between tissue DNA methylation and gene expression calculated by linear regression using GTEx data. Associations with false discovery rate (FDR)-corrected P < 0.05 were considered significant.
  5. d P-values of (1) associations between genetically predicted gene expression and cancer risk evaluated by applying GTEx-based gene expression prediction models to cancer GWAS data using SPrediXcan, or (2) differential gene expression between cancer and normal tissues obtained from GEPIA2. Associations or differential expressions with FDR-corrected P < 0.05 were considered significant. For genes with both P-values available, the one from SPrediXcan analysis is presented. All P-values from SPrediXcan analyses are highlighted in bold.
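
The "consistent directions" criterion in footnote a can be illustrated with a small check. This is only a sketch under an assumed reading of that criterion: the CpG-cancer direction should equal the product of the CpG-gene and gene-cancer directions, which holds for every row shown in this table. The function names `sign` and `is_consistent` are hypothetical and are not taken from the paper or from SPrediXcan.

```python
def sign(direction: str) -> int:
    """Map a '+' or '-' direction symbol from the table to +1 or -1."""
    if direction == "+":
        return 1
    if direction == "-":
        return -1
    raise ValueError(f"unexpected direction symbol: {direction!r}")


def is_consistent(cpg_cancer: str, cpg_gene: str, gene_cancer: str) -> bool:
    """Assumed rule: the CpG-cancer direction matches the direction implied
    by chaining the CpG-gene and gene-cancer associations."""
    return sign(cpg_cancer) == sign(cpg_gene) * sign(gene_cancer)


# Three rows copied from the breast cancer block of the table:
# (CpG, gene, CpG-cancer Dir, CpG-gene Dir, gene-cancer Dir)
trios = [
    ("cg23231268", "CCR9", "+", "-", "-"),
    ("cg14587961", "PILRA", "+", "+", "+"),
    ("cg07546779", "LEPROTL1", "-", "-", "+"),
]

for cpg, gene, d_cpg_cancer, d_cpg_gene, d_gene_cancer in trios:
    ok = is_consistent(d_cpg_cancer, d_cpg_gene, d_gene_cancer)
    print(cpg, gene, "consistent" if ok else "inconsistent")
```

For example, cg23231268 is positively associated with breast cancer risk and negatively associated with CCR9 expression, and CCR9 expression is negatively associated with risk, so the two negative links chain to a positive CpG-cancer direction.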