Extended Data Fig. 2: Genomic differences between lung cancers from smokers and lung cancers from never-smokers, and landscape of 56 LCINS tumours that exhibit SBS4 activity.
From: The mutagenic forces shaping the genomes of lung cancer in never smokers

a, Differences between smoker and never-smoker lung cancer cases across SBS signatures. Volcano plot (top) indicating the enrichment of SBS signature prevalence in never-smokers (left) and smokers (right) with lung cancer. Statistically significant enrichments were evaluated using multivariable logistic regression models for smoking status, adjusted for age, sex, histology, genetic ancestry, and tumour purity. Firth’s bias-reduced logistic regression was used for models presenting complete or quasi-complete separation. P-values were adjusted for multiple comparisons based on the total number of mutational signatures considered, and adjusted p-values were reported as FDR values. Horizontal lines mark the significance thresholds at FDR levels of 0.05 (dashed orange line) and 0.01 (dashed red line). Bar plot (bottom) indicating prevalence by smoking history. b, Tumour mutational burden differences between SBS4-positive (n = 56) and SBS4-negative (n = 815) LCINS tumours for SBS, DBS, ID, CN segments, and SV events. Statistical significance was evaluated using two-sided Wilcoxon rank sum tests. The line within the box indicates the median, the lower and upper ends of the box indicate the 25th and 75th percentiles, whiskers show 1.5 × interquartile range, and values outside are shown as individual data points. c–e, Mutational signature landscape for SBS (c), DBS (d) and ID (e) mutation types, including absolute and relative number of mutations assigned to each mutational signature, unsupervised clustering based on the signature contributions, and sample-level annotations of sex, genetic ancestry, passive smoking, and accuracy of signature reconstruction based on cosine similarity. f, Driver mutation landscape, including different types of genomic alterations, as well as sample-level annotations of sex, genetic ancestry, histology, and tumour purity.
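As an illustration of the multiple-comparison adjustment mentioned for panel a, the following minimal sketch computes FDR-adjusted p-values. The legend does not name the adjustment procedure, so the Benjamini-Hochberg method used here is an assumption, and the function name and example p-values are hypothetical rather than taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used only as a common FDR procedure; the legend
    does not state which adjustment the authors applied.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the adjusted
    # values stay monotone in the original ranking.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, one per signature tested (not from the paper).
example_p = [0.001, 0.02, 0.03, 0.5]
fdr = benjamini_hochberg(example_p)

# Signatures passing the two thresholds drawn on the volcano plot.
below_005 = [i for i, q in enumerate(fdr) if q < 0.05]
below_001 = [i for i, q in enumerate(fdr) if q < 0.01]
```

Here the number of tests is the length of the input list, which mirrors the legend's statement that the adjustment was based on the total number of mutational signatures considered.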
g, Enrichment of EGFR p.L858R hotspot driver mutations in SBS4-positive tumours from never-smokers compared to smokers, evaluated using multivariable logistic regression adjusted for clinical and epidemiological covariates, including age, sex, genetic ancestry, histology, and tumour purity (n = 5 mutated never-smoker cases, n = 1 mutated smoker case, n = 51 wild-type never-smoker cases, n = 301 wild-type smoker cases). Error bars indicate 95% CIs.
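For orientation, the counts given for panel g can be arranged as a 2 × 2 table and summarised with a crude odds ratio. The sketch below is only an unadjusted illustration with a Wald-type 95% CI; it is not the covariate-adjusted multivariable estimate shown in the figure, and the function name is hypothetical.

```python
import math


def crude_odds_ratio(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald 95% CI for a 2 x 2 table.

    a: never-smokers, mutated     b: never-smokers, wild-type
    c: smokers, mutated           d: smokers, wild-type
    Assumption: no covariate adjustment, unlike the model in panel g.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf's method).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts from the legend: EGFR p.L858R status in SBS4-positive tumours.
or_crude, ci_low, ci_high = crude_odds_ratio(a=5, b=51, c=1, d=301)
print(f"crude OR = {or_crude:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

With only one mutated smoker case, the Wald interval is very wide. Sparse cells of this kind are also where ordinary logistic regression can become unstable, which is the situation bias-reduced approaches such as Firth's are designed for.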