Figure 4: Contribution of metabolism to DNA methylation at cancer loci.
From: Integrative modelling of tumour DNA methylation quantifies the contribution of metabolism

(a) Prediction of ESR1 promoter methylation in test samples of breast cancer. The x axis shows the methylation value at the ESR1 promoter, and the y axis shows the corresponding value predicted by Elastic Net. (b) Prediction of AR promoter methylation in test samples of prostate cancer. Axes are as in a. (c,d) Top 20 variables ranked by variable importance score from the Random Forest model of ESR1 promoter methylation in breast cancer (c) and AR promoter methylation in prostate cancer (d). Variables in the SGOC network (including the methionine cycle enzymes and other SGOC enzymes) are shown in red and all other variables are shown in black. (e) Schematic depicting the ranking of all variables based on the combined results for promoter and gene body methylation at cancer loci. (f,g) Variables that were most predictive of cancer gene methylation on average (top 15%), ranked in order of increasing contribution (variable score=per cent usage by Elastic Net). Green arrows point to previously published factors associated with variations in DNA methylation (positive controls). (Variable names: official gene symbols denote gene expression variables (‘methionine cycle enzymes’, ‘other SGOC enzymes’, ‘transcription factors’, ‘chromatin remodelling factors’ and ‘SAM metabolizing enzymes’), while the suffixes ‘_mut’ and ‘_cn’ following a gene symbol denote ‘mutations’ and ‘copy number variations’, respectively. For ‘clinical factors’, variable names match the descriptors used in the TCGA data files.) See Supplementary Fig. 11 for additional cancer types. (h) Sub-network of SGOC genes contributing to DNA methylation in multiple cancer types (at least four cancers based on Elastic Net models and at least three based on Random Forest models). Red and white nodes represent genes and metabolites, respectively. Solid edges denote direct biochemical links and dashed edges denote indirect biochemical links through enzymatic reactions not shown. Node sizes for the gene nodes correspond to the number of cancer types in which each enzyme contributed significantly to cancer gene methylation. (Phosphoglycerate dehydrogenase (PHGDH)=6, MAT (MAT2B and MAT2A)=5, glycine amidinotransferase (GATM)=5, serine hydroxymethyltransferase 1 and 2 (SHMT1 and SHMT2)=4, sarcosine dehydrogenase (SARDH)=4, alanyl aminopeptidase (ANPEP)=4, L-amino acid oxidase (IL4I1)=4 and gamma-glutamyl hydrolase (GGH)=4.)
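
The variable score in f and g ("per cent usage by Elastic Net") is not defined in the caption itself. One plausible reading, shown here as a minimal sketch and not as the authors' procedure, is the share of fitted Elastic Net models (for example, one per cancer locus) in which a variable receives a nonzero coefficient. The function names, the coefficient values and the `TP53_mut` variable below are hypothetical and serve only to illustrate the calculation and the top-fraction ranking.

```python
from typing import Dict, List


def percent_usage(coefs_per_model: List[Dict[str, float]],
                  tol: float = 0.0) -> Dict[str, float]:
    """Per cent of fitted models in which each variable has a nonzero coefficient.

    ASSUMPTION: "usage" means a coefficient with absolute value above `tol`.
    """
    if not coefs_per_model:
        raise ValueError("need at least one fitted model")
    # Collect every variable name seen in any model.
    variables = sorted({name for coefs in coefs_per_model for name in coefs})
    n_models = len(coefs_per_model)
    usage = {}
    for name in variables:
        n_used = sum(
            1 for coefs in coefs_per_model if abs(coefs.get(name, 0.0)) > tol
        )
        usage[name] = 100.0 * n_used / n_models
    return usage


def top_fraction(scores: Dict[str, float], fraction: float = 0.15) -> List[str]:
    """Names of the top `fraction` of variables, in order of increasing score."""
    n_keep = max(1, round(len(scores) * fraction))
    # Ties are broken alphabetically so the ordering is deterministic.
    ranked = sorted(scores, key=lambda name: (scores[name], name))
    return ranked[-n_keep:]


# Hypothetical coefficients from four Elastic Net fits (illustrative values only).
models = [
    {"PHGDH": 0.4, "MAT2A": 0.0, "SHMT2": 0.1, "TP53_mut": 0.0},
    {"PHGDH": 0.2, "MAT2A": 0.3, "SHMT2": 0.0, "TP53_mut": 0.0},
    {"PHGDH": -0.1, "MAT2A": 0.0, "SHMT2": 0.0, "TP53_mut": 0.0},
    {"PHGDH": 0.5, "MAT2A": 0.2, "SHMT2": 0.0, "TP53_mut": 0.0},
]
usage = percent_usage(models)
top_half = top_fraction(usage, fraction=0.5)
```

With these illustrative inputs, `PHGDH` is used by all four models and `TP53_mut` by none, so `top_half` lists `MAT2A` followed by `PHGDH`, mirroring the increasing-contribution order used in the panels.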