Introduction

Alcohol drinking and tobacco smoking are among the leading causes of death worldwide due to their high prevalence [1, 2]. This widespread use is partially attributed to the addictive effect of alcohol and nicotine on human brain physiology [3,4,5]. Additionally, several analyses observed consistent pleiotropy linking alcohol drinking and tobacco smoking to brain structure and function. For instance, early tobacco smoking initiation was genetically correlated with an increased precuneus surface area and decreased cortical thickness and surface area of the inferior temporal gyrus [6]. Similarly, alcohol drinking showed genetic correlation with an increased total cortical surface area and decreased average cortical thickness [6]. Because genetic information can be used as an anchor for causal inference [7], investigators also explored possible direct effects between drinking/smoking behaviors and brain morphology. For example, smoking initiation and alcohol drinking appear to have a possible causal association with decreased gray matter volume and the multivariable analysis pointed to alcohol drinking as the potential primary driver of this relationship [8]. In another study, genetically predicted global cortical thickness showed an effect on alcohol-drinking behaviors that was independent of neuropsychiatric phenotypes, substance use, trauma, and neurodegeneration [9]. Focusing on specific hypotheses, these previous investigations advanced our understanding of the brain mechanisms contributing to alcohol drinking and tobacco smoking behaviors. However, large-scale datasets allow investigators to expand further the depth of the analyses. Indeed, recent brain-wide pleiotropy analyses provided new insights into the role of brain structure and function on neuropsychiatric and behavioral traits [10,11,12,13].

In the present study, we systematically investigated the pleiotropic mechanisms linking alcohol drinking and tobacco smoking to brain structure and function. Specifically, we integrated genome-wide data generated by the Genome-wide association studies (GWAS) and Sequencing Consortium of Alcohol and Nicotine use (GSCAN; up to 805,431 participants) [14] with information related to 3935 brain imaging-derived phenotypes (IDPs; Supplementary Table 1) obtained from six magnetic resonance imaging (MRI) modalities in the UK Biobank cohort [15]. Our findings highlight the contribution of different pleiotropic mechanisms to the interplay between drinking/smoking behaviors and the human brain.

Materials and methods

Study design

In the present study, we investigated the pleiotropic mechanisms linking alcohol drinking and tobacco smoking behaviors to brain structure and function by applying multiple analytic approaches to large-scale genome-wide datasets (Fig. 1). A global genetic correlation analysis was performed to assess the correlation of genetic effects across the genome between alcohol drinking and tobacco smoking behaviors and brain IDPs. Considering global genetic correlations surviving multiple testing correction, we leveraged the local analysis of [co]variant association (LAVA) approach [16] to identify specific chromosomal regions with statistical evidence of shared genetic effects. To assess the presence of causal relationships underlying the global genetic overlap observed, we applied the latent causal variable (LCV) approach [17] to pairwise combinations of brain IDPs and alcohol/tobacco-related behaviors reaching nominally significant genetic correlation. We used a nominal significance threshold to identify potential targets to follow up with the LCV approach and then applied false discovery rate (FDR) correction to the LCV results to adequately control for multiple testing. To further assess the LCV findings, we tested them using the Mendelian randomization (MR) framework.

Fig. 1: Workflow of the analyses performed.

Data sources

Genome-wide association statistics regarding IDPs were derived from the UK Biobank (UKB). This is a large population-based prospective cohort containing in-depth genetic and health information from over 500,000 participants [18]. In UKB, brain imaging was conducted using six MRI modalities: T1-weighted structural image, T2-weighted fluid-attenuated inversion recovery (T2_FLAIR) structural image, diffusion MRI (dMRI), resting-state functional MRI (rfMRI), task functional MRI (tfMRI), and susceptibility-weighted imaging (SWI). A total of 3935 brain IDPs (Supplementary Table 1) were defined from MRI scans [19]. The GWAS of brain IDPs in up to 33,224 UKB participants of European descent was previously described [15].

Genome-wide association statistics regarding behaviors related to alcohol drinking and tobacco smoking were derived from GSCAN. In its latest GWAS [14], this large collaborative effort meta-analyzed genome-wide information regarding smoking initiation (SmkInit) and the age at which the individual began smoking regularly (AgeSmk), cigarettes smoked per day (CigDay), smoking cessation (SmkCes), and alcoholic drinks per week (DrnkWk). Since brain IDP GWAS data were available only for individuals of European descent, we used publicly available GSCAN GWAS data for the same ancestry group (SmkInit N = 805,431; AgeSmk N = 323,386; CigDay N = 326,497; SmkCes N = 388,313; DrnkWk N = 666,978). The sample overlap due to UKB inclusion in both IDP and GSCAN GWAS does not affect genetic correlation, LAVA, and LCV analyses. However, to avoid potential sample overlap bias in Mendelian randomization (MR) analysis, we also analyzed GSCAN GWAS data excluding the UKB cohort (SmkInit N = 357,235; AgeSmk N = 175,835; CigDay N = 183,196; SmkCes N = 188,701; DrnkWk N = 304,322).

Linkage disequilibrium score regression

Single nucleotide polymorphism (SNP)-based heritability and global genetic correlation were estimated using the linkage disequilibrium score regression (LDSC) method [20]. These analyses were performed using the HapMap 3 reference panel [21] and LD scores derived from European reference populations available from the 1000 Genomes Project [22]. Considering FDR q < 0.05 to account for the number of brain IDPs tested with respect to each GSCAN phenotype, statistically significant genetic correlations were determined with respect to the scale, units, and models defined in the studies that generated the genome-wide association statistics.
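
The FDR thresholding used throughout the analyses can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure, which the text does not name explicitly, and the function names and example p-values are hypothetical rather than taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input p-values.

    Assumption: the FDR correction is Benjamini-Hochberg; the source only
    states "FDR q < 0.05" without naming the procedure.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def significant_at_fdr(p_values, q_threshold=0.05):
    """Flag tests whose q-value falls below the threshold (hypothetical helper)."""
    return [q < q_threshold for q in benjamini_hochberg(p_values)]
```

In this sketch, the correction would be run once per GSCAN phenotype over the p-values of all brain IDPs tested against it, mirroring the per-phenotype accounting described above.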

Local analysis of [co]variant association

Local genetic correlation was assessed across 2495 semi-independent chromosomal regions (~1 Mb window) using LAVA [16]. The LAVA univariate analysis was performed to estimate local SNP-based heritability for each pair of GSCAN phenotype and brain IDP. Considering chromosomal regions with at least nominally significant local SNP-based heritability (p < 0.05), we estimated local genetic correlation between GSCAN phenotypes and brain IDPs using LAVA bivariate analysis. FDR multiple testing correction (FDR q < 0.05) accounting for the number of chromosomal regions tested was applied to define statistically significant local genetic correlations. To further characterize the genomic regions identified using the LAVA approach, we leveraged information available from the GWAS catalog including genetic regions and reports from previous associations [23].

Genetically inferred causal inference

We performed an LCV analysis [17] to estimate whether the global genetic correlation observed between GSCAN phenotypes and brain IDPs was due to possible cause-effect relationships. Considering pairwise combinations that reached at least nominal significance in the LDSC genetic correlation analysis (p < 0.05), we estimated the genetic causality proportion (gcp) between the two traits. The gcp statistic ranges from −1 to 1, where gcp = 0 indicates no genetic causality, gcp = 1 indicates full genetic causality of trait #1 on trait #2, and gcp = −1 indicates full genetic causality of trait #2 on trait #1. In the present study, LCV analyses were performed considering brain IDPs as trait #1 and GSCAN phenotypes as trait #2. Accordingly, a positive gcp estimate indicates a causal effect of brain IDPs on GSCAN phenotypes, while a negative gcp estimate indicates a causal effect in the reverse direction. The sign of the genetically inferred causal effect is defined by the sign of the LCV rho statistic (i.e., rho > 0 corresponds to positive causal effects, while rho < 0 corresponds to negative causal effects). FDR correction accounting for the number of tests performed (FDR q < 0.05) was applied to define statistically significant LCV results.
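
The interpretation rules described above can be summarized in a short sketch. The function name, the returned labels, and the q-value in the usage example are hypothetical; the logic only encodes the conventions stated in the text (brain IDP as trait #1, GSCAN phenotype as trait #2).

```python
def interpret_lcv(gcp, rho, q_value, q_threshold=0.05):
    """Describe an LCV result with the brain IDP as trait #1 and the GSCAN phenotype as trait #2.

    Hypothetical helper: it restates the sign conventions from the text
    and is not part of the LCV software.
    """
    if not -1.0 <= gcp <= 1.0:
        raise ValueError("gcp must lie in [-1, 1]")
    if q_value >= q_threshold:
        return "not significant after FDR correction"
    if gcp == 0:
        return "no genetic causality"
    # Positive gcp: trait #1 (brain IDP) acts on trait #2 (GSCAN phenotype).
    if gcp > 0:
        direction = "brain IDP -> GSCAN phenotype"
    else:
        direction = "GSCAN phenotype -> brain IDP"
    # The sign of rho gives the sign of the inferred effect.
    if rho > 0:
        sign = "positive"
    elif rho < 0:
        sign = "negative"
    else:
        sign = "undetermined"
    return f"{direction} ({sign} effect)"
```

For example, `interpret_lcv(0.38, -0.18, 0.01)` would describe a negative effect of a brain IDP on a GSCAN phenotype, where the q-value of 0.01 is an illustrative placeholder.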

To validate LCV results, we performed an MR analysis. These methods evaluate causal relationships under different assumptions. The LCV approach assumes a single effect-size distribution to examine the presence of a single latent trait between genetically correlated traits. Conversely, MR relies on the three core instrumental variable (IV) assumptions (relevance, independence, and exclusion restriction on IVs). Accordingly, an effect consistent between these two approaches can be considered more reliable. The MR analysis was performed using the TwoSampleMR package [24] to compute inverse variance weighting (IVW) estimates. Considering variants present in both exposure and outcome datasets, genetic instruments for TwoSampleMR analyses were identified considering SNPs with an exposure GWAS P-value threshold of 1 × 10−5 that were LD-independent (r2 = 0.001 within a 10,000-kb window). These were strong IVs for the exposures of interest (F statistics > 10; Supplementary Tables 5–23). Because MR analyses can be biased by sample overlap, this analysis was conducted using UKB brain-IDP GWAS and GSCAN GWAS data excluding UKB participants. Additional details are reported in the STROBE-MR (strengthening the reporting of observational studies in epidemiology using Mendelian randomization) checklist (Supplementary Text).
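
As an illustration of the quantities involved, the sketch below computes a fixed-effect IVW estimate and a per-variant approximate F statistic from summary statistics. It is not the TwoSampleMR implementation, whose standard-error handling may differ; the function names and the approximation F = (beta / SE)^2 are assumptions made for this example.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance weighted estimate and its standard error.

    Sketch under stated assumptions: variants are independent and each
    per-variant ratio is weighted by the inverse outcome variance.
    """
    if not beta_exposure or not (
        len(beta_exposure) == len(beta_outcome) == len(se_outcome)
    ):
        raise ValueError("inputs must be non-empty and of equal length")
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    if denominator == 0:
        raise ValueError("all exposure effects are zero")
    return numerator / denominator, math.sqrt(1.0 / denominator)


def approximate_f_statistics(beta_exposure, se_exposure):
    """Per-variant instrument strength, approximated as (beta / SE)^2."""
    return [(b / se) ** 2 for b, se in zip(beta_exposure, se_exposure)]
```

Under this approximation, variants with F statistics above 10 would be retained as strong instruments, in line with the threshold reported above.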

Results

Considering heritable brain IDPs (SNP-h2 p < 0.05, N = 3723) and applying FDR multiple testing correction (FDR q < 0.05), we identified three genetic correlations linking behaviors related to tobacco smoking with brain structure and function (Supplementary Table 2). AgeSmk showed a negative genetic correlation with the mean thickness of Pole-occipital in the left hemisphere generated by Destrieux (a2009s) parcellation of the white surface (aparc-a2009s lh thickness Pole-occipital, IDP 1219; rg = −0.23, p = 7.74 × 10−6). A positive genetic correlation was observed between SmkCes and the total volume of white matter hyperintensities from T1 and T2_FLAIR images (T2 FLAIR BIANCA WMH volume, IDP 1437; rg = 0.16, p = 1.03 × 10−5). CigDay also showed a positive genetic correlation with the mean second level (L2) of right superior longitudinal fasciculus on fractional anisotropy skeleton from dMRI data (dMRI TBSS L2 Superior longitudinal fasciculus R, IDP 1736; rg = 0.14, p = 1.24 × 10−5). To further investigate the dynamics underlying these relationships, we conducted a local genetic correlation analysis and identified one region (hg38 chr2:35,895,678–36,640,246) showing statistically significant local genetic correlation between AgeSmk and aparc-a2009s lh thickness Pole-occipital (rho = 1, p = 1.01 × 10−5). In this region, 178 genome-wide significant associations (p < 5 × 10−8) were reported in the GWAS catalog [23] (Supplementary Table 3). Among them, several were related to brain-related phenotypes, including smoking initiation (rs62134085 p = 1 × 10−18), educational attainment (rs305191 p = 2 × 10−14), chronotype (rs848552 p = 5 × 10−14), self-reported math ability (rs6708545 p = 1 × 10−10), cognitive performance (rs6728742 p = 1 × 10−8), and cortical thickness (rs1017154 p = 3 × 10−8).

As mentioned above, positive gcp estimates in the LCV analysis indicate a causal effect of brain IDPs on GSCAN phenotypes, while the sign of the genetically inferred causal effect is defined by the sign of the LCV rho statistic (i.e., rho > 0 corresponds to positive causal effects, while rho < 0 corresponds to negative causal effects). Considering nominally significant genetic correlations between brain IDPs and GSCAN phenotypes (Supplementary Table 2), we found 19 relationships linking brain structure and function to behaviors related to alcohol drinking and tobacco smoking (gcp > 0, FDR q < 0.05; Fig. 2, Supplementary Table 4). Among them, 12 were related to brain connectivity analysis derived from rfMRI: three were related to DrnkWk (e.g., partial correlation of edge 363 in rfMRI dimensionality 100, ICA100 edge 363, IDP 2791; gcp = 0.77, P = 1.12 × 10−16, rho = 0.18 ± 0.07), four to AgeSmk (e.g., ICA100 edge 838, IDP 3266, Supplementary Fig. 1; gcp = 0.79, P = 3.72 × 10−23, rho = 0.20 ± 0.08), three to CigDay (e.g., ICA25 edge 184, IDP 2402, Supplementary Fig. 2; gcp = 0.47, P = 2.13 × 10−20, rho = −0.20 ± 0.08), and two to SmkCes (e.g., ICA25 edge 190, IDP 2408; gcp = 0.86, P = 9.43 × 10−22, rho = 0.21 ± 0.07). We observed significant LCV results not related to brain connectivity only with respect to AgeSmk. These included brain IDPs related to cortical thickness (i.e., mean thickness of V1 in the right hemisphere generated by parcellation of the white surface using BA_exvivo parcellation, IDP 1111; gcp = 0.67, P = 3.33 × 10−9, rho = −0.24 ± 0.07), regional brain volumes (i.e., volume of rostral anterior cingulate in the right hemisphere generated by parcellation of the white surface using DKT parcellation, IDP 492; gcp = 0.6, P = 1.45 × 10−7, rho = 0.18 ± 0.06), and cortical areas (e.g., area of orbital-inferior frontal gyrus in the right hemisphere generated by parcellation of the white surface using a2009s parcellation, IDP 958; gcp = −0.67, P = 7.07 × 10−10, rho = 0.22 ± 0.08). Among LCV effects surviving FDR correction, we observed consistent effects in the MR analyses with respect to rfMRI connectivity ICA100 edge 772 (IDP 3200, Fig. 3) on DrnkWk (LCV gcp = 0.38, p = 8.9 × 10−4, rho = −0.18 ± 0.07; IVW beta = −0.04, 95% CI = −0.07 to −0.01). Direction consistency was observed for 12 other significant LCV results (Supplementary Table 24).

Fig. 2: Genetic causal proportion (false discovery rate, FDR q < 0.05) linking brain imaging-derived phenotypes (IDP) to alcohol drinking and tobacco smoking behaviors.

Abbreviations: aparc-a2009s rh area G-front-inf-Orbital (IDP 0958); aparc-DKTatlas rh area rostralanteriorcingulate (IDP 0864); aparc-a2009s rh area S-interm-prim-Jensen (IDP 1000); aparc-Desikan rh area rostralanteriorcingulate (IDP 0707); aparc-a2009s rh area G-subcallosal (IDP 0977); BA-exvivo rh thickness V1 (IDP 1111); aparc-DKTatlas rh volume rostralanteriorcingulate (IDP 0492); ICA100 edge 838 (IDP 3266); ICA25 edge 190 (IDP 2408); ICA25 edge 184 (IDP 2402); ICA100 edge 363 (IDP 2791); ICA100 edge 649 (IDP 3077); ICA100 edge 534 (IDP 2962); ICA100 edge 1438 (IDP 3866); ICA100 edge 628 (IDP 3056); ICA100 edge 974 (IDP 3402); ICA100 edge 772 (IDP 3200); ICA100 edge 162 (IDP 2590); ICA100 edge 505 (IDP 2933). The description of brain IDPs is available in Supplementary Table 1. Full results are available in Supplementary Table 4.

Fig. 3: Brain imaging-derived phenotype (IDP) 3200 reflecting edge 772 of dimensionality 100 separated by spatial Independent Component Analysis (ICA) in resting-state functional magnetic resonance imaging.

Discussion

The present study uncovered new information regarding the contribution of pleiotropic mechanisms to the complex interplay of alcohol drinking and tobacco smoking with brain structure and function. Building on previous studies that reported genetically informed relationships of drinking and smoking behaviors with brain cortical morphology and grey matter volume [6, 8, 9], our brain-wide analyses identified genetic overlaps linking alcohol drinking and tobacco smoking with previously unexplored brain-related phenotypes, including white matter hyperintensities, specific brain substructures, and brain connectivity.

The global genetic correlation analysis identified three smoking-related results surviving multiple testing correction. Specifically, the SmkCes phenotype (current versus former smoker) was genetically correlated with increased total volume of white matter hyperintensities (T2 FLAIR BIANCA WMH volume, IDP 1437). Tobacco smoking has been previously linked to the progression of white matter hyperintensity in a dose-response relationship [25]. Although this previous study did not observe an association between years since quitting tobacco smoking and white matter hyperintensity progression [25], our finding highlights a possible genetic relationship between SmkCes and white matter hyperintensities. This may be because former smokers are more likely to have quit because of health conditions and heavier tobacco use in the past [26]. Additionally, SmkCes is known to be associated with long-term weight gain [27], which can lead to white matter hyperintensities via inflammation [28]. We also observed a positive genetic correlation between CigDay and the morphology of the mean second level of the right superior longitudinal fasciculus (dMRI TBSS L2 Superior longitudinal fasciculus R, IDP 1736). This brain region is part of a brain network involved in spatial awareness and proprioception [29]. Interestingly, the right superior longitudinal fasciculus has been associated with olfactory performance [30, 31]. In this context, the relationship of this brain region with CigDay may be related to chemosensory processing. While there is a well-established relationship between tobacco smoking and olfactory dysfunction [32], our result suggests possible shared genetic mechanisms predisposing individuals with certain chemosensory abilities to tobacco smoking quantity. There was also a negative genetic correlation between AgeSmk and the mean thickness of the occipital pole (aparc-a2009s lh thickness Pole-occipital, IDP 1219). This region is involved in visual attention [33] and appears to play a role in the reactivity toward smoking cues and tobacco craving [34, 35]. In young adults, a reduced mean thickness of the left occipital pole has been associated with exposure to domestic violence [36], which is also a risk factor for tobacco smoking [37]. With respect to the relationship between AgeSmk and IDP 1219, we also observed a significant local genetic correlation in the hg38 chr2:35,895,678–36,640,246 region. In this region, the GWAS catalog [23] reports a number of genome-wide significant associations, including several related to smoking initiation, chronotype, educational attainment, cognitive performance, and cortical thickness. The majority of the associations reported were related to variants mapping to CRIM1 and FEZ2. The CRIM1 gene has been linked to regulatory mechanisms over axon projection targeting [38, 39]. FEZ2 is a member of a hub protein family involved in neuronal development, neurological disorders, viral infection, and autophagy [40]. While these genes were not previously linked to substance use behaviors, their neurodevelopmental function suggests mechanisms pointing to the effect of brain development on tobacco-smoking behaviors later in life.

To understand potential cause-effect relationships linking alcohol drinking and tobacco smoking to brain structure and function, we conducted a genetically informed causal inference analysis, observing convergent results between the LCV and MR methods that support an inverse effect of rfMRI connectivity ICA100 edge 772 (IDP 3200) on the DrnkWk phenotype. Because LCV and MR approaches are based on different assumptions, findings supported by both can be considered more reliable. IDP 3200 reflects the activity of prefrontal and premotor cortex in the left hemisphere and that of superior frontal sulci, superior and inferior precentral sulci, and cingulate sulci (Fig. 3). The left premotor cortex is involved in visual attention and in the integration of visual data at a semantic level [41, 42]. Precentral and cingulate regions are associated with the modulation of reward-seeking motivation [43] and are also involved in the processing of context-related social cues [44,45,46]. Conversely, the right orbitofrontal region is associated with social comprehension [47]. In this context, the effect of IDP 3200 on DrnkWk could reflect the impact of social cognition and decision-making processes on the propensity towards alcohol consumption.

While the other LCV results did not show statistical significance in the MR analysis, we observed directional consistency for 12 of them (Supplementary Table 24). Among these, we observed that CigDay was inversely affected by fluctuations in rfMRI connectivity in edge 184 for dimensionality 25 (IDP 2402) and edge 505 for dimensionality 100 (IDP 2933). IDP 2402 reflects increased activity across the left hemisphere in dorsolateral prefrontal cortex and frontal gyri, decreased activity around right orbitofrontal areas, along with increased activation of the parietal lobe (Supplementary Fig. 2). Being part of the superior longitudinal fasciculus, these brain areas are associated with spatial awareness and proprioception [29] and olfactory performance [30, 31]. While olfactory perception is especially mediated by the orbitofrontal cortex [48], brain lesions localized within the right orbitofrontal cortex have been reported to specifically hinder the formation of conscious olfactory percepts [49]. Interestingly, stimulating the left dorsolateral prefrontal cortex and inhibiting the right orbitofrontal cortex resulted in a risk-averse response in human subjects enrolled in a transcranial direct current stimulation study [50]. IDP 2933 reflects the association of lower activation levels in the ventral posterolateral nucleus with higher ones in the right hemisphere along the posterior cingulate cortex, dorsolateral prefrontal cortex, and frontal gyri (Supplementary Fig. 3). Intriguingly, opposite fluctuation patterns were associated with increased pain in patients experiencing migraine attacks, possibly due to a strengthened network of nociceptive information processing [51]. Nicotine may have anti-nociceptive effects [52] and tobacco smoking has been linked to coping mechanisms related to migraine and chronic head pain [53]. Additionally, the interplay between nociception and olfaction has been proposed at both molecular and functional levels [54, 55]. In this context, our results reinforce the hypothesis of shared genetic mechanisms predisposing individuals with certain chemosensory abilities to tobacco smoking behaviors.

In conclusion, our brain-wide analyses highlighted that different pleiotropic mechanisms likely contribute to the relationship of brain structure and function with alcohol drinking and tobacco smoking, opening new directions in understanding the processes underlying these complex behaviors. However, we also acknowledge a few key limitations. While we leveraged large-scale genome-wide datasets, these were generated including only participants of European descent, because of the lack of large genetic and imaging studies in other human populations. Accordingly, our findings may not be generalizable to other population groups. Another important limitation is related to genetically informed analyses. While we used multiple methods relying on different assumptions, our results may still be affected by unaccounted confounders. For instance, global and local genetic correlation estimates may be affected by certain population dynamics such as assortative mating. Similarly, genetically informed causal inference methods (LCV and MR) can be affected by unexpected violations of their assumptions. Thus, our findings will need to be confirmed by evidence generated by complementary study designs (e.g., prospective studies). Finally, while no genetically inferred effect of alcohol drinking and tobacco smoking on brain IDPs was observed, future studies with larger datasets may be able to characterize this effect direction.