Fig. 3: Multivariate regression analysis between five tobacco smoking variables and genomic or epigenomic features in the EAGLE samples. | Nature Communications


From: APOBEC affects tumor evolution and age at onset of lung cancer in smokers


a Distributions of the values of each smoking variable in the 218 EAGLE samples. b Forest plot of the associations between TMB and smoking variables, stratified by LAS (n = 114 tumors) and HAS (n = 84 tumors). P-values and regression coefficients with 95% confidence intervals (CIs) are shown for each category of smoking variables; error bars represent the 95% CIs of the regression coefficients. Significant associations are highlighted in red. Trend test P-values for the associations between TTFC and TMB, adjusted for multiple testing using the Benjamini–Hochberg method (FDRtrend), are included below the forest plots. c Volcano plot of the association between each TTFC category and the mutation status of commonly mutated genes (frequency > 20%). We performed logistic regression analyses between LAS and HAS tumors. The size of each point indicates the overall gene mutation frequency. The red and green dashed lines indicate the significance thresholds P = 0.05 and FDR = 0.05, respectively. d Example of an association between each TTFC category and ZFHX4 mutation frequency, stratified by LAS and HAS subtypes. Trend test P-values (Ptrend) are labeled above each subplot. e Multivariate regression analysis of smoking status and the DNA methylation level at CpG probe cg05575921 within the AHRR gene, conducted in tumor (n = 116 samples) and normal (n = 119 samples) EAGLE tissue samples. The association analyses were performed on all tumors and separately on the LAS and HAS tumor subtypes. Trend test P-values (Ptrend) are labeled above each subplot. f Volcano plots of the associations between smoking variables and methylation levels of known smoking-related CpG probes (n = 116 tumors). Association FDR values (adjusted using the Benjamini–Hochberg method) are shown on the y-axis. The orange dashed line marks the FDR = 0.05 threshold. The CpG probes associated with tobacco smoking are derived from a study (ref. 58) comparing methylation levels between smokers and never smokers in normal lung tissue. The size and color of each point represent the FDR and association direction, respectively. All association analyses are adjusted for the following covariates: age, sex, histology, and tumor purity. All box plots display the median (centerline), the interquartile range (IQR; box), and whiskers extending to 1.5 × IQR (the ggplot2 default). Each data point is plotted individually as a dot. Source data are provided as a Source Data file.
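
Several panels report FDR values obtained with the Benjamini–Hochberg method. As a rough illustration of what that adjustment does, the following minimal Python sketch computes Benjamini–Hochberg adjusted values from a list of raw P-values. It is not the authors' code: the function name `benjamini_hochberg` and the example P-values are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest so that the
    # adjusted values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values, for illustration only (not taken from the figure).
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
fdr = benjamini_hochberg(example_p)
# Flag associations that pass an FDR < 0.05 threshold.
significant = [q < 0.05 for q in fdr]
```

In this toy input, the two smallest P-values remain below an FDR of 0.05 after adjustment, while 0.039 and 0.041, both nominally significant at P < 0.05, do not. This is why the figure draws separate threshold lines for P = 0.05 and FDR = 0.05.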
