Fig. 2: Estimation of variance explained by GxE for 12 simulated scenarios.

Estimation of variance explained by GxE for 12 genome-wide scenarios from 10 simulations. “None” indicates the absence of a condition. Model: with or without collider features. Dashed red lines indicate the true set G variance (\(R^{2}_{GW}\)) and dashed blue lines indicate the true set GxE variance (\(R^{2}_{GWEI}\)). \(\beta_{GE}\) Conditions: LD > Q3: all \(\beta_{GE}\) effects were sampled from GxE effect SNPs in the highest LD quartile; MAF < Q1: all \(\beta_{GE}\) effects were sampled from GxE effect SNPs in the lowest MAF quartile; a sampling distribution for \(\beta_{GE}\) other than ~N(0,1) is denoted where used. \(R^{2}_{E}\) on Y: outcome variance explained by the exposure. E is continuous unless otherwise stated. E Heritability: additive or heterogeneous. Scenario conditions toggle these parameters: (i) estimation in the null base scenario (\(R^{2}_{GWEI}\) = 0), (ii) estimation in the non-null base scenario (\(R^{2}_{GWEI}\) = 0.1), (iii) estimation when the exposure variance is raised to 0.1, (iv) estimation when \(\beta_{GE}\) is sampled from LD SNPs > Q3, (v) estimation when \(\beta_{GE}\) is sampled from LD SNPs > Q3 and MAF SNPs < Q1, (vi–vii) estimation when the assumptions of standardization for \(\beta_{GE}\) effects were invalidated by generating effects from exponential and beta distributions (positive kurtosis), (viii) estimation in scenario (i) but using a generated dichotomous \(E_{sim}\), (ix) estimation in the collider scenario where \(\beta_{GE}\) and \(\beta_{G}\) effects were randomly selected, (x) estimation in the collider scenario where \(\beta_{GE}\) effects were not an element of (completely non-overlapping with) \(\beta_{G}\) effects, (xi) estimation in the collider scenario where \(\beta_{GE}\) effects are a strict subset of (completely overlapping with) \(\beta_{G}\) effects, and (xii) estimation in the collider scenario where simulated exposures are heritable through additive and heterogeneous genetic effects. Means and 95% confidence intervals are represented by dot and whisker plots per scenario. Each black dot represents a single genome-wide simulation. Simulations were based on quality-controlled UKB data consisting of 325,989 individuals and 1,030,579 SNPs. Source data are provided as a Source Data file.