Fig. 3: Integrative genetic analysis identifies CREB3 as a protective factor in ALS.

a Experimental design to identify transcription factors (TFs) upstream of the WGCNA turquoise module, whose genes were intersected with TFs from ENCODE. Each TF was then investigated using an integrative approach combining WGCNA-based connectivity, the expression profile of the TF-target genes and the burden of rare variants. b Cluster dendrogram and heatmap showing TF-target gene expression in each of the 30 DE cell populations in ALS patients compared to controls. Row clustering of TFs based on their target gene expression identifies two clusters (blue and orange). The orange cluster (dashed-line rectangle) was further prioritized based on the higher expression level of its target genes in L5-ET. The right-sided heatmap shows WGCNA-based TF connectivity in prioritized cell populations, the association of rare missense variants in each TF with ALS risk, and the target gene expression profile in mCSN at presymptomatic (30–60 d) and symptomatic (90–105 d) stages. The bottom heatmap shows the expression profile of the prioritized TF CREB3 in ALS and FTD patients compared to controls. c Quantile-quantile plot of the meta-analyzed gene burden of rare missense variants in a cohort of 1873 ALS patients and 3926 healthy controls, highlighting CREB3 and known ALS genes. d LocusZoom plot showing the association of the SNP rs11538707 (R119G; ± 500 kb) with ALS in the discovery cohort3 and in the replication cohort of 1873 ALS cases and 3926 healthy controls. The orange dashed line marks the genome-wide significance threshold (p < 5e−08), and colored dots represent LD with the lead variant (red diamond). e Forest plot of the gene burden association of CREB3 together with known ALS genes. Aggregation of rare missense variants in CREB3 confers a reduced risk of ALS (OR = 0.66, 95% CI 0.51–0.87; Firth logistic regression p = 2.9e−03). The genome-wide significance threshold for RVBA after Bonferroni correction is 0.05/14245 = 3.51e−06.
f Forest plot showing the association of the missense variant rs11538707 (R119G) in CREB3 with ALS risk, compared to previously identified associations (see Supplementary Data 25). Error bars represent the 95% confidence intervals of the odds ratios. P-values were calculated using a linear mixed model and are uncorrected genome-wide association values. Genome-wide significance is set at p < 5e−08.
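
The Bonferroni threshold quoted for panel e can be reproduced with a short calculation. The sketch below is illustrative only and is not the authors' analysis code. It assumes that 14245 is the number of gene-level burden tests, which the caption implies but does not state, and the function and variable names are hypothetical.

```python
# Values copied from the caption of panel e.
ALPHA = 0.05            # family-wise error rate
N_TESTS = 14245         # ASSUMPTION: number of gene-level burden tests
CREB3_BURDEN_P = 2.9e-03  # Firth logistic regression p-value for CREB3


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_TESTS)

# Threshold in scientific notation, matching the caption's rounding.
print(f"Bonferroni threshold: {threshold:.2e}")

# Compare the CREB3 burden p-value against the nominal and corrected levels.
print(f"Nominally significant (p < {ALPHA}): {CREB3_BURDEN_P < ALPHA}")
print(f"Passes Bonferroni threshold: {CREB3_BURDEN_P < threshold}")
```

Under these assumptions, the CREB3 burden p-value is below the nominal 0.05 level but above the Bonferroni-corrected threshold.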