Extended Data Fig. 9: Trans-acting regulatory networks at the CTCF, NFKB1, REST, NFE2, MAD1L1 and ERICH1 loci. | Nature Genetics

Extended Data Fig. 9: Trans-acting regulatory networks at the CTCF, NFKB1, REST, NFE2, MAD1L1 and ERICH1 loci.

From: Genetic variation influencing DNA methylation provides insights into molecular mechanisms regulating genomic function

(a) Circos plots summarising i. the genomic distribution of CpGs associated in trans [inner connections], and ii. known DNA binding sites of the transcription factor encoded in cis [outer ring], for sentinel SNPs at the CTCF, NFKB1, REST and NFE2 loci. Insets show the observed and expected proportions of CpG sites that overlap the respective DNA binding sites, as available for different cell lines (see Methods). FDR < 1.17 × 10⁻² for all cell lines and transcription factors.

(b) Regulatory network of the ERICH1 locus illustrating the connection between SNP rs10103269 (yellow rectangle) and expression of the identified candidate gene ERICH1 (yellow ellipse), which is connected through protein-protein and protein-DNA interactions to methylation at trans-associated CpG sites (beige rectangles). Ellipses represent genes that are encoded at the genetic locus identified by the sentinel or that are part of the protein-protein interaction network. Genes marked with an asterisk (*) show co-expression with the candidate gene. Bold gene names indicate a strong genetic effect of the sentinel on the expression of that gene (eQTL). The fill colour of ellipses represents the random walk score (colour bar legend). The colour of edges connecting genes and CpG sites represents: i. protein-protein interactions (purple), ii. protein-DNA interactions identified by TFBS overlap (green), and iii. proximity (distance < 1 Mb) between genes and SNPs or CpG sites (blue). Thick edges indicate correlation with gene expression; thin edges indicate no correlation. The boxplot shows the effect of the sentinel SNP (rs10103269) in cis on expression of ERICH1, with the p-value from linear regression of expression ~ genotype (n = 1,546 biologically independent samples combined from both cohorts). The centre line indicates the median; lower and upper box limits correspond to the first and third quartiles, respectively; whiskers extend to 1.5 times the interquartile range; outliers are not shown.

(c) MAD1L1 locus pathway analysis. Annotations and symbols are as described in (b).
