Causal mediation analysis for time-varying heritable risk factors with Mendelian randomization

Wu, Zixuan; Lewis, Ethan; Zhao, Qingyuan; Wang, Jingshu

doi:10.1038/s41467-025-61648-7

Download PDF

Article
Open access
Published: 28 July 2025

Causal mediation analysis for time-varying heritable risk factors with Mendelian randomization

Nature Communications volume 16, Article number: 6945 (2025) Cite this article

7433 Accesses
5 Citations
15 Altmetric
Metrics details

Subjects

Abstract

Understanding the causal mechanisms of diseases is crucial in clinical research. When randomized experiments are unavailable, Mendelian Randomization (MR) leverages genetic mutations to mitigate confounding. However, most MR analyses assume static risk factors, oversimplifying dynamic risk factor effects. The framework of life-course MR addresses this but struggles with limited GWAS cohort sizes and correlations across time points. We propose FLOW-MR, a computational approach estimating causal structural equations for temporally ordered traits using only GWAS summary statistics. FLOW-MR enables inference on direct, indirect, and path-wise causal effects, demonstrating superior efficiency and reliability, especially with noisy data. By incorporating a spike-and-slab prior, it mitigates challenges from extreme polygenicity and weak instruments. Applying FLOW-MR, we uncovered a childhood-specific protective effect of BMI on breast cancer and analyzed the evolving impacts of BMI, systolic blood pressure, and cholesterol on stroke risk, revealing their causal relationships.

Causal relationship between type 2 diabetes and glioblastoma: bidirectional Mendelian randomization analysis

Article Open access 17 July 2024

Genetically predicted plasma metabolites mediate the causal relationship between gut microbiota and osteosarcoma

Article Open access 01 March 2025

An optimized instrument variable selection approach to improve causality estimation in association studies

Article Open access 01 October 2024

Introduction

Understanding the causal mechanisms behind diseases is a core challenge in clinical research. While randomized controlled trials are considered the gold standard for establishing causality, they can be impractical in certain situations. As a result, Mendelian Randomization (MR) has become increasingly popular as an alternative. MR leverages genetic variations as a form of natural experiment, effectively reducing the influence of unmeasured environmental confounders in epidemiological studies^1,2.

Traditional MR analyses often rely on cross-sectional designs, treating risk factors as constant over time. However, many heritable risk factors vary across the lifespan and may have time-specific effects on disease outcomes^3,4. Ignoring this temporal aspect can lead to simplistic and misleading conclusions⁵. For example, while ecological studies in epidemiology show that vitamin D levels during childhood are associated with multiple sclerosis risk, standard MR approaches, which focus on adult vitamin D levels, have instead pointed to adult vitamin D as the key factor in disease etiology⁶.

In response to these limitations, life-course MR has emerged as a framework that considers how risk factors measured across an individual’s lifetime influence later-life outcomes⁷. Though the idea of incorporating life-long information is appealing, performing life-course MR can be challenging due to the small cohort size in Genome-wide Association Studies (GWAS) of earlier life traits and high auto-correlations of risk factors over time. One commonly used approach is multivariable MR (MVMR), such as MVMR-IVW⁸, which estimates the direct causal effects of the risk factor at each time point. However, MVMR will have limited efficiency and reliability on noisy GWAS data with a small cohort size. Additionally, current MVMR techniques struggle to evaluate indirect causal effects of early-life risk factors that influence outcomes through intermediate factors. Other methods, such as g-estimations of structural mean models^4,9 or functional principal component analysis that aggregates effects across time points¹⁰, offer some alternative solutions but require access to individual-level GWAS data that are often not available on a large scale.

To address these challenges, we propose FLOW-MR (Framework for Life-cOurse pathWay analysis using Mendelian Randomization), a computational method that estimates a full system of structural equations for temporally ordered heritable traits using only GWAS summary statistics. FLOW-MR enables comprehensive causal mediation analysis, allowing for the decomposition of total causal effects into direct, indirect, and path-specific components. As a longitudinal extension of GRAPPLE¹¹, FLOW-MR introduces a novel sufficient condition for causal identifiability that generalizes the well-known InSIDE assumption¹² to longitudinal settings, thereby accommodating both pervasive horizontal pleiotropy and unmeasured mediator-outcome confounding. FLOW-MR can be viewed as a Bayesian analogue of the full-information maximum likelihood (FIML) estimator commonly used in structural equation modeling in economics and social sciences¹³, with the advantage of requiring only summary data. Its hierarchical Bayesian framework allows for flexible modeling of weak and invalid instruments and captures complex interactions between genetic variants and environmental factors.

When benchmarked against existing MVMR methods, FLOW-MR substantially improves the efficiency and robustness of the statistical inference, especially when the GWAS data are noisy and from smaller cohorts. By incorporating a spike-and-slab prior for genetic effects, FLOW-MR can tackle the different polygenic patterns of complex traits and mitigate weak instrument bias. Applying FLOW-MR to publicly available GWAS summary statistics, we identify a protective effect of Body Mass Index (BMI) during childhood on breast cancer risk, confined to a specific developmental period. Our analysis further reveals a complex interplay between BMI, systolic blood pressure (SBP), and low-density cholesterol levels across different life stages and their effects on stroke risk. Our findings suggest that adulthood SBP may be the only factor with a direct causal effect on stroke, while the previously identified causal effect of BMI on SBP^14,15 in adulthood may be attributable to confounding by childhood SBP.

Results

Method overview

Consider a sequence of risk factors (X₁, X₂, …, X_K−1) in temporal order, so that X_i precedes X_j when i < j. Our goal is to understand the causal relationship between these traits and their effects on a later outcome Y (also denoted as X_K for convenience), typically some disease status measured in adulthood. Causal effects among the traits must adhere to the temporal sequence (i.e., only an earlier trait can causally influence a later trait) (Fig. 1a). We allow the genetic variants Z and the unmeasured environmental confounders U to have direct causal effects on any trait at any time point. Associated with this directed acyclic graph (DAG), we assume a series of structural equations for each individual (see Methods).

FLOW-MR requires only GWAS summary statistics for each trait, achieved by projecting the individual-level structural equations onto each single SNP Z_j. Specifically, let γ_kj be the marginal associations between a SNP j and trait X_k, we then have the following linear models:

$${\gamma }_{kj}={\sum }_{l=1}^{k-1}{\beta }_{kl}{\gamma }_{lj}+{\alpha }_{kj},\,\,k=1,\ldots,K.$$

(1)

Here, β_kl is our parameter of interest, representing the direct causal effect of an earlier trait X_l on a later trait X_k. The term α_kj can be viewed as the direct genetic effect of SNP j on trait k. If SNP Z_j is used as an instrument for trait X_k, then α_kj corresponds to the SNP-trait link used for instrumental variable analysis. In contrast, for ${k}^{{\prime} }\ne k$, ${\alpha }_{{k}^{{\prime} }j}$ represents pleiotropic effects of SNP Z_j on trait ${X}_{{k}^{{\prime} }}$.

GWAS summary statistics provide estimates of the marginal associations γ_kj with their standard errors. They can be obtained either from linear regressions or, for binary traits, from logistic regressions, following¹⁶ under the liability threshold model (Supplementary Fig. S1, SI text). For each SNP Z_j and trait X_k, we assume ${\hat{\gamma }}_{kj} \sim {{{\mathcal{N}}}}({\gamma }_{kj},{\delta }_{kj}^{2})$, where ${\hat{\gamma }}_{kj}$ is the observed estimate and ${\delta }_{kj}^{2}$ is the estimated variance. Our approach accommodates sample overlap across traits, thus allowing for correlations among ${\hat{\gamma }}_{kj}$ across different traits k. Following previous MR methods^11,17, we can adaptively estimate this correlation matrix from the GWAS summary data.

Notice that, unlike marginal associations γ_kj, the direct genetic effects α_kj are not observed from the data. Existing literature shows that most SNPs may have non-zero but weak effects on complex traits, while a small subset may be responsible for the core biological process and strongly impact the traits¹¹. Therefore, FLOW-MR assumes the following spike-and-slab distributional assumptions on α_kj:

$${{{{\mathbf{\alpha }}}}}_{kj}| \,{p}_{k},{\sigma }_{k0}^{2},{\sigma }_{k1}^{2}\quad { \sim }^{{{{\rm{ind}}}}}\quad (1-{p}_{k}){{{\mathcal{N}}}}\left(0,{\sigma }_{k0}^{2}\right)+{p}_{k}{{{\mathcal{N}}}}\left(0,{\sigma }_{k1}^{2}\right),\quad k=1,2,\ldots,K,$$

which allows for trait-specific effect sizes. To estimate the model, we use a hierarchical Bayesian framework and Gibbs sampler. The Gibbs sampler enables the convenient use of posterior samples to construct credible intervals for any function of the direct effects ${\{{\beta }_{kl}\}}_{k\ge l}$, including both direct and indirect effects among the traits, as well as the proportions of these effects relative to the total effects.

To address known confounders, we further extend our model to accommodate multiple traits at any given time point (Fig. 1b). Specifically, at each time point k, we assume that there are n_k ≥ 1 exposures, denoted as ${X}_{k1},\ldots,{X}_{k{n}_{k}}$. To facilitate the MR analysis, we assume that no causal relationship exists between traits measured at the same time point. If a known causal direction exists between two traits measured at the same time point k, our method can be still be applied by introducing a pseudo-time point (or stage) k + 1 following k, where the outcome trait of the two traits at time k is moved to stage k + 1. More details on the model, estimation, inference procedures, and key assumptions are provided in Methods and SI Text.

Identifiability of direct and indirect causal effects

Mediation analysis, as conducted in this study, is inherently vulnerable to unmeasured mediator-outcome confounding even when the treatment is fully randomized¹⁸. As illustrated in Fig. 2, with two exposures and an outcome (K = 3), if we rely solely on a set of valid genetic instruments for the first risk factor X₁ (Fig. 2a), it is not possible to separate the direct and indirect effects of X₁ due to unmeasured mediator-outcome confounding between X₂ and Y (see SI Text for a counter-example). In contrast, if each risk factor has its own set of valid instruments (Fig. 2b), it becomes feasible to identify all direct and indirect causal effects. However, since the sequence of risk factors often represents the same factor measured at different time points, finding SNPs that exert their effects exclusively at a specific time point can be challenging.

**Fig. 2: Identifiability of the direct and indirect causal effects between traits under three scenarios.**

FLOW-MR requires a more realistic assumption to account for pleiotropic effects of the SNPs (Fig. 2c). Instead of assigning each risk factor its own set of valid instruments, we allow SNPs to have nonzero direct effects on all risk factors–thereby permitting any given SNP to be an invalid instrument for a particular trait. However, we require that the direct genetic effects ${\alpha }_{1}={({\alpha }_{1j})}_{j=1}^{P},\ldots,{\alpha }_{K}={({\alpha }_{1j})}_{j=1}^{P}$ be orthogonal across traits (see Methods). More generally, if the number of SNPs P is large, it suffices for these direct effects to be independent rather than exactly orthogonal, extending the well-known InSIDE assumption¹² to multiple traits. Notice that we only assume independence among α_kj, allowing the marginal associations γ_kj to remain possibly correlated across traits. This assumption enables FLOW-MR to address both pleiotropic effects and unmeasured mediator-outcome confounding. A formal mathematical statement of the identification conditions and its proof are provided in the SI Text.

Systematic benchmarking with synthetic GWAS summary statistics

We compare FLOW-MR with alternative MVMR approaches to perform life-course MR. Specifically, we benchmark our method against six methods, including MVMR-IVW⁸, MVMR-Egger¹⁹, MVMR-Robust²⁰, GRAPPLE¹¹, MVMR-Horse²¹ and MVMR-cML²². Among these methods, MVMR-IVW is the most widely used in practice, and other methods are designed to deal with pleiotropic effects under different assumptions and weak instruments. To apply an MVMR method, we use a multiple-step approach where at each step k, we estimate the direct causal effects of X₁, ⋯ , X_k−1 on X_k. One limitation of this approach is the challenge of providing reliable statistical inference on indirect and pathwise effects, as estimates across different steps are correlated.

To assess performance, we generate synthetic GWAS summary datasets by soft-thresholding real GWAS summary statistics, allowing the true direct genetic effects to be exactly 0 for some SNPs. Specifically, if we observe ${\hat{\gamma }}_{kj}^{{{{\rm{real}}}}}$ with variance ${\hat{\delta }}_{kj}^{2}$ for a particular SNP j in a real GWAS dataset, we set ${\alpha }_{kj}=\,{\mbox{sign}}\,({\hat{\gamma }}_{kj}^{{{{\rm{real}}}}}){(| {\hat{\gamma }}_{kj}^{{{{\rm{real}}}}}| -0.01)}_{+}$ and randomly shuffle α_kj across j within each trait k to ensure independence across traits.

To specify the causal relationship across the temporally ordered traits, we simulate three scenarios: K = 3, K = 4, and a multivariate case with multiple exposures at each time point (Fig. 3a). We generate synthetic data for earlier traits using childhood GWAS data to mimic real MR mediation analysis, where earlier childhood traits typically have smaller sample sizes than adulthood traits. Specifically, we use childhood BMI data from the Norwegian Mother, Father, and Child Cohort Study (MoBa)²³ and childhood lipid traits from the Avon Longitudinal Study of Parents and Children (ALSPAC) cohort²⁴. Adult GWAS datasets are used to generate summary statistics for the remaining traits. Further details of the simulation design, along with additional sensitivity analyses and evaluation of computation costs, can be found in the SI Text. Supplementary Tables S1, S2 present the number of SNPs and conditional F-statistics²⁵ computed at each p-value threshold as evaluations of instrument strength.

**Fig. 3: Systematic benchmarking results.**

We evaluate the coverage and average lengths of the confidence and credible intervals of the direct and indirect causal effects across different methods (Fig. 3b, c, Supplementary Figs. S2, S3). For the direct causal effects, most MVMR methods suffer from severe under-coverage, largely because of limited capabilities to handle pervasive pleiotropic effects and weak-instrument bias. Specifically, MVMR-IVW does not allow any pleiotropic effects, whereas MVMR-Robust and MVMR-cML assume a plurality of valid genetic instruments for each step–a condition that is violated in our simulations and is also unlikely to hold in real life-course MR analyses. Although MVMR-Horse and MVMR-Egger can accommodate independent and pervasive pleiotropy, they remain under-covered in most scenarios, likely due to biases introduced by (conditionally) weak instruments (Supplementary Table S2) and correlation of SNP-trait associations across risk factors. By contrast, both FLOW-MR and GRAPPLE demonstrate good coverage regardless of SNP strength. Compared to GRAPPLE, FLOW-MR has shorter intervals and offers more efficient analyses. Furthermore, for the indirect causal effects of X₁, only FLOW-MR can provide reliable inference, with credible intervals showing good coverage and reasonable power.

Effect of early life body size on breast cancer

Recent studies suggest that early-life body size may protect against breast cancer^26,27,28. Using MVMR-IVW,³ found that this protective effect is not mediated by adult body size, which itself does not causally influence breast cancer. However, their study relied on a recalled early-life body score from the UK Biobank as a proxy for early-life body size. As noted by the original authors, such coarse measurements can raise concerns about the accuracy of their findings.

To address this, we revisited the analysis using FLOW-MR and childhood BMI GWAS summary statistics. First, we replicate the original study³ using the same dataset: adult BMI from the UK Biobank for adult body size, recalled early-life body score from UK Biobank for early-life body size, and breast cancer data from Michailidou et al.²⁹ for the outcome. Genetic instruments were selected at prespecified p-value thresholds from the GIANT adult BMI dataset and EGG childhood BMI dataset. FLOW-MR, alongside all MVMR methods, successfully replicates the original findings (Fig. 4a, Supplementary Fig. S4a), confirming no causal effect of adult BMI and a direct protective effect of childhood body size on breast cancer.

**Fig. 4: Evaluation of the effect of body size on breast cancer at different ages.**

However, when we replaced the early-life body size trait with GWAS data on 8-year-old BMI from MoBa²³, a direct measurement of childhood body size, MVMR-IVW and MVMR-Egger unexpectedly suggested a protective effect of adult BMI on breast cancer, while childhood BMI showed no direct effect (Fig. 4b, Supplementary Fig. S4b). In contrast, GRAPPLE, MVMR-Robust and MVMR-Horse lost the power to detect any causal effects. Only FLOW-MR replicates the original conclusion that childhood BMI has a protective effect. Although 8-year-old BMI provides a more precise measurement, its use in MR is challenging due to its small sample size (3K samples in the MoBa 2019 cohort compared to half a million in the UK Biobank). This is reflected in the substantial decrease in the conditional F-statistics in the childhood dataset (Supplementary Table S2), indicating a loss of instrument strength due to the limited cohort size of the MoBa study. FLOW-MR demonstrates both efficiency and robustness in this scenario.

We further explored the impact of including 1-year-old BMI from MoBa in our analysis. Given that the GWAS data of 1-year-old BMI and 8-year-old BMI traits are from the same cohort, we incorporate the estimated noise correlation matrix (Supplementary Fig. S4c). Despite significant genetic correlations between the 1-year-old and 8-year-old BMI traits (Supplementary Fig. S4d), FLOW-MR still detected the protective effect of 8-year-old BMI on breast cancer at the p-value threshold 0.001 (Fig. 4c, d). As the direct effect of 1-year-old BMI on breast cancer is insignificant, the analysis revealed that the protective effect of body size on preventing breast cancer is likely confined to a specific period in childhood.

Additionally, we found that 1-year-old BMI directly influences 8-year-old BMI but does not have any additional direct effect on adult BMI. Interestingly, the estimated causal effects of 1-year-old BMI on 8-year-old BMI and 8-year-old BMI on adult BMI appear to be of similar magnitude (Fig. 4d, Supplementary Fig. S4e). Furthermore, FLOW-MR has the power to detect path-wise indirect causal effects, indicating a significant protective effect of 1-year-old BMI against breast cancer, as well as a positive effect on adult BMI, both of which are mediated through 8-year-old BMI (Fig. 4e).

Temporal causal relationships between high blood pressure, BMI and stroke

High blood pressure is a well-established modifiable risk factor for stroke³⁰. Recent univariate MR studies have also suggested that adult BMI may be a potential causal risk factor for stroke and has a significant positive causal effect on high blood pressure in adulthood^14,15,31. Our study aims to disentangle the complex relationships between high blood pressure, BMI, and stroke, focusing on how these causal connections evolve throughout life. We also consider potential confounders, such as lipid traits.

We analyze the causal effects of systolic blood pressure (SBP), BMI, and low-density lipoprotein cholesterol (LDL-C) in both childhood and adulthood on stroke using FLOW-MR. To capture potential causal effects of BMI on SBP, even at the same time point, we introduce additional pseudo-time points (stages) for SBP traits both in childhood and adulthood (Fig. 5a). We utilize GWAS summary statistics from the ALSPAC cohort²⁴ for childhood traits, UK Biobank for adult BMI and SBP, and the Global Lipids Genetics Consortium for adult LDL-C³². Stroke data comes from Malik et al.³³, and SNP selection is based on the GERA datasets^34,35, the GIANT adult BMI dataset, and EGG childhood BMI dataset.

**Fig. 5: Multivariate mediation analysis on low-density lipoprotein cholesterol (LDL-C), body mass index (BMI), systemic blood pressure (SBP) and stroke.**

Our analysis reveals a clear positive causal effect of adulthood SBP on stroke, with no significant direct effects from other traits, including childhood traits and adult BMI (Fig. 5b, Supplementary Fig. S5a). Interestingly, our findings challenge previous univariate MR results (refs. ^14,15, Supplementary Fig. S5b), showing no evidence of a positive causal effect of adult BMI on adult SBP after accounting for childhood SBP. Further MR analyses using only these three continuous traits also suggest that the previously observed association between adult BMI and SBP may be confounded by childhood SBP (Supplementary Fig. S5c). Additionally, we observe a stronger genetic correlation between childhood SBP and adult BMI than between adult SBP and BMI (Supplementary Fig. S5d). These results imply that the impact of adulthood BMI on SBP in earlier studies may have been overstated due to unaccounted confounding factors.

Discussion

We propose FLOW-MR using GWAS summary data for life-course Mendelian Randomization, based on a full structural model for all traits. This method allows users to assess time-varying causal effects of heritable risk factors and distinguish between direct and indirect causal effects. By addressing key challenges in life-course MR—the high genetic correlation of the same trait at different ages and the limited cohort size of age-specific GWAS, FLOW-MR demonstrates superior performance in simulation studies. Unlike other methods, FLOW-MR is not prone to biases arising from using weakly associated SNPs as instruments and can efficiently integrate information across all traits.

For the identification of causal effects in life-course MR, a similar argument has been discussed in ref. ⁷. They show that to identify the causal direct effects of traits, genetic variants must exert different effects on each exposure in the model, and these effects must be linearly independent, meaning the true SNP-trait association matrix must have full rank. This condition is necessary but not sufficient for identifying all direct and indirect effects. For instance, in univariable MR, pleiotropic effects must adhere to specific assumptions (such as the InSIDE assumption) to ensure the identification of a risk factor’s casual effect. Merely assuming that the pleiotropic effects are not perfectly correlated with the marginal SNP-exposure associations is insufficient for identifying the causal effects. While we allow pervasive pleiotropy in our work, we assume that the direct genetic effects of SNPs are independent across traits, so there is no correlated pleiotropy between the traits. Our sensitivity analysis indicates that, although FLOW-MR models the life-course traits as a whole system, the statistical inference for a particular outcome demonstrates robustness to correlated pleiotropy between other traits (Supplementary Fig. S6). In practice, if confounding pathways are known to exist between traits with correlated pleiotropy, we recommend collecting additional GWAS data for these confounding factors and explicitly adjusting for them using FLOW-MR.

To assess the stability of estimates across different instrument selection criteria, in the paper we employed a range of p-value thresholds for selecting genetic instruments in both simulations and case studies, following the recommendations of Wang et al.¹¹. In our simulations, the results remain broadly consistent across the various p-value thresholds. In contrast, larger discrepancies are observed in the real data case studies. This difference may be partly due to the stronger causal signals and the enforced independence of direct genetic effects across traits in the simulation design. Additionally, the number of selected SNPs varies much more substantially in the real data (Supplementary Table S1), which contributes to differences in statistical power across thresholds.

Finally, as noted by previous studies^6,36, life-course MR for mediation analysis also faces several potential limitations. One such limitation is the absence of measurements at critical time points when the exposure exerts a substantial causal effect. However, if a missing time point X_k is not a hidden confounder for any pair of subsequent traits ${X}_{{k}_{1}}$ and ${X}_{{k}_{2}}$, our assumption of independent direct genetic associations remains valid, and the omission of that time point does not bias the estimation of causal effects at observed time points. Another concern is the possibility of reverse causation, where the outcome may causally influence the exposure measured at the latest time point. Although life-course MR typically benefits from the temporal ordering of traits, a challenge arises when the final exposure and outcome are measured concurrently–such as when both are assessed in adulthood. In such cases, if the selected SNPs influence the outcome primarily through their effects on the exposures, MR estimates are generally robust to reverse causality^6,11,37. Our proposed framework inherits this desirable property of MR.

Methods

Structural equations on individual-level data

Denote Z as the vector containing all available genetic variants. Based on the causal DAG (Fig. 1a), we assume the following additive structural equations for each individual:

$${X}_{K}:=Y= {\sum }_{l=1}^{K-1}{\beta }_{Kl}{X}_{l}+{f}_{K}({{{\bf{U}}}},{{{\bf{Z}}}},{{{{\bf{E}}}}}_{Y}),\\ {X}_{k}= {\sum }_{l=1}^{k-1}{\beta }_{kl}{X}_{l}+{f}_{k}({{{\bf{U}}}},{{{\bf{Z}}}},{{{{\bf{E}}}}}_{{X}_{k}}),\,\,k=2,\ldots,K-1\\ {X}_1= {f}_{1}({{{\bf{U}}}},{{{\bf{Z}}}},{{{{\bf{E}}}}}_{{X}_{1}}).$$

(2)

In these equations, the functions f₁( ⋅ ), ⋯ , f_K( ⋅ ) capture the direct genetic and environmental effects on the traits, which may be nonlinear and involve arbitrary interactions. Our primary assumption in (2) is that the causal effect β_kl for any earlier trait X_l(1 ≤l≤ K − 1) on any later trait X_k (l + 1 ≤ k ≤ K) is linear and homogeneous. When the outcome or risk factor traits are binary, we adopt a liability threshold model¹⁶, treating each binary trait X_k as a discretized version of an underlying continuous liability ${X}_{k}^{\star }$, that follows the additive structural equations above (Supplementary Fig. S1). In this context, our method estimates the causal relationships among the latent liabilities. We also further generalize (2) to allow for nonlinear causal effects and interactions among earlier traits, so that our linear model for summary statistics can accommodate more complex causal relationships measured at different time points. Additional details are provided in the SI Text.

Given these structural equations representing the causal relationships among the traits, each β_kl represents the direct causal effect of X_l on X_k whenever l < k. A path-wise causal effect from X_l to X_k along a path ${{{\mathcal{L}}}}=({X}_{{r}_{0}},{X}_{{r}_{1}},\cdots \,,{X}_{{r}_{t-1}},{X}_{{r}_{t}})$ (where ${X}_{{r}_{0}}={X}_{l}$ and ${X}_{{r}_{t}}={X}_{k}$) of length t is defined as

$${\beta }_{kl}^{{{{\mathcal{L}}}}}={\prod }_{i=1}^{t}{\beta }_{{r}_{i}{r}_{i-1}}.$$

The total effect sums up the effect from X_l to X_k through all directed pathways, and the indirect effect is the difference between the total effect and the direct effects^38,39. In other words, the indirect effect ${\beta }_{kl}^{{{{\rm{ind}}}}}$ sums all path-wise causal effects of length t ≥ 2:

$${\beta }_{kl}^{{{{\rm{ind}}}}}={\sum }_{t=2}^{k-l}{\sum}_{{{{\mathcal{L}}}}=\{{X}_{{r}_{0}}={X}_{l},\cdots \,,{X}_{{r}_{t}}={X}_{k}\}}{\beta }_{kl}^{{{{\mathcal{L}}}}}.$$

Model on summary statistics

Denote γ_kj ≡ argmin_γVar[X_k − γZ_j] as the marginal association between continous trait X_k and Z_j, and let ${\alpha }_{kj}\equiv {{\mbox{argmin}}}_{\alpha }{{{\rm{Var}}}}[{f}_{k}(U,{{{\bf{Z}}}},{{{{\bf{E}}}}}_{{X}_{k}})-\alpha {Z}_{j}]$ as the projection of the genetic and environmental direct effects of trait X_k onto SNP Z_j, which can be heuristically understood as the direct genetic effect of SNP j on X_k. Then by projecting the structural Eq. (2) onto a single SNP Z_j, we can obtain a linear model (1) with measurement error on the GWAS summary statistics (for mathematical details, see SI Text). These equations can be more conveniently expressed in matrix form as:

$${{{\mathbf{\Gamma }}}}={{{\bf{B}}}}\cdot {{{\mathbf{\Gamma }}}}+{{{\bf{A}}}},$$

(3)

where P is the number of SNPs and the matrices are defined as

$${{{\mathbf{\Gamma }}}}\equiv \left[\begin{array}{cccc}{\gamma }_{11}&{\gamma }_{12}&\ldots \,&{\gamma }_{1P}\\ {\gamma }_{21}&{\gamma }_{22}&\ldots \,&{\gamma }_{2P}\\ \vdots &\vdots &\ldots \,&\vdots \\ {\gamma }_{K1}&{\gamma }_{K2}&\ldots \,&{\gamma }_{KP}\end{array}\right],\,{{{\bf{A}}}}\equiv \left[\begin{array}{cccc}{\alpha }_{11}&{\alpha }_{12}&\ldots \,&{\alpha }_{1P}\\ {\alpha }_{21}&{\alpha }_{22}&\ldots \,&{\alpha }_{2P}\\ \vdots &\vdots &\ldots \,&\vdots \\ {\alpha }_{K1}&{\alpha }_{K2}&\ldots \,&{\alpha }_{KP}\end{array}\right],\,\\ {{{\bf{B}}}}\equiv \left[\begin{array}{ccccc}0&0&\ldots \,&0&0\\ {\beta }_{21}&0&\ldots \,&0&0\\ \vdots &\vdots &\ldots \,&\vdots &\vdots \\ {\beta }_{K1}&{\beta }_{K2}&\ldots \,&{\beta }_{K(K-1)}&0\end{array}\right].$$

Here, the matrix Γ represents the marginal SNP-trait associations, A denotes the direct effects of the SNPs on each trait and B is the matrix of direct causal relationships of earlier traits on later traits that we would like to infer. Equation (3) can be alternatively represented as

$${{{\mathbf{\Gamma }}}}=({{{\bf{I}}}}+\tilde{{{{\bf{B}}}}}){{{\bf{A}}}},\quad \tilde{{{{\bf{B}}}}}\equiv \left[\begin{array}{ccccc}0&0&\ldots \,&0&0\\ {\tilde{\beta }}_{21}&0&\ldots \,&0&0\\ \vdots &\vdots &\ldots \,&\vdots &\vdots \\ {\tilde{\beta }}_{K1}&{\tilde{\beta }}_{K2}&\ldots \,&{\tilde{\beta }}_{K(K-1)}&0\end{array}\right]$$

(4)

where I is the K × K identity matrix, and $\tilde{{{{\bf{B}}}}}$ is the lower-triangular matrix that solves ${({{{\bf{I}}}}-{{{\bf{B}}}})}^{-1}={{{\bf{I}}}}+\tilde{{{{\bf{B}}}}}.$ Compared to (3), the parametrization in (4) avoids matrix inversion, making it more amenable for statistical estimation. It is easy to show that $\tilde{{{{\bf{B}}}}}$ can be written as a Neumann series: $\tilde{{{{\bf{B}}}}}={{{\bf{B}}}}+{{{{\bf{B}}}}}^{2}+\ldots \,$. Intuitively, the (k, l) entry of the matrix $\tilde{{{{\bf{B}}}}}$ denotes the total causal effects of X_l on X_k through all directed pathways⁴⁰.

Since the GWAS cohort size is typically large, for each SNP Z_j, the summary statistics ${\hat{{{{\mathbf{\Gamma }}}}}}_{j}$ approximately follow a normal distribution ${\hat{{{{\mathbf{\Gamma }}}}}}_{j} \sim {{{\mathcal{N}}}}({{{{\mathbf{\Gamma }}}}}_{j},{{{{\mathbf{\Sigma }}}}}_{j})$, where Γ_j: = (γ_1j, γ_2j, …, γ_Kj) is a vector of marginal associations and Σ_j is a covariance matrix obtained from the GWAS standard errors and a correlation matrix that depends on the extent of sample overlap and the correlation between the traits. This correlation matrix is shared across the SNPs and can be estimated from the non-statistically significant GWAS summary statistics using the method described in ref. ¹¹ (SI Text).

Identification conditions for direct and indirect effects

We provide identification conditions for the matrix B of all direct causal effects from earlier traits to later ones. Once these direct effects are identified, the indirect effects–arising from sums and products of direct effects under our additive structural Eq. (2) become identifiable as well.

Under the relationship $\tilde{{{{\bf{B}}}}}={\left({{{\bf{I}}}}-{{{\bf{B}}}}\right)}^{-1}-{{{\bf{I}}}}$, it suffices to identify the lower-triangular matrix $\tilde{{{{\bf{B}}}}}$. Assuming an infinitely large cohort, the marginal association matrix Γ can be consistently estimated from GWAS data. We then provide conditions on the direct genetic effects A that make $\tilde{{{{\bf{B}}}}}$ uniquely solvable in Eq. (4).

Specifically, we prove that $\tilde{{{{\bf{B}}}}}$ is unique if matrix A lies in the set

$${{{\mathcal{A}}}}:=\{{{{\bf{A}}}}\in {{\mathbb{R}}}^{K\times P}:{{{\bf{A}}}}{{{{\bf{A}}}}}^{\top }={{{\rm{diag}}}}({d}_{1},\ldots,{d}_{K})\,{{{\rm{with}}}}\,\min \{{d}_{1},\ldots,{d}_{K-1}\} > 0\}.$$

Intuitively, rather than assuming that each trait k has its own set of valid SNPs (i.e., there is at least a SNP j where α_jk ≠ 0 and ${\alpha }_{j{k}^{{\prime} }}=0$ for any ${k}^{{\prime} }\ne k$), we require only that the direct genetic effects across traits are mutually orthogonal. Mathematically, this is because $({{{\bf{I}}}}+\tilde{{{{\bf{B}}}}})({{{\bf{A}}}}{{{{\bf{A}}}}}^{\top }){({{{\bf{I}}}}+\tilde{{{{\bf{B}}}}})}^{\top }$ provides a unique LDL decomposition of the identifiable matrix ΓΓ^⊤. To make this assumption more realistic, we further can show that when the number of SNPs P → ∞, it is sufficient for these direct effects to be merely independent across traits rather than strictly orthogonal–an extension of the well-known InSIDE assumption to multiple traits.

Selection of genetic instruments

We assume that we can use separate GWAS datasets to obtain p-values for SNP selection, thereby mitigating potential selection biases^11,41. Upon selecting the SNPs, another set of GWAS datasets is used to obtain the summary statistics for the exposure and outcome traits. The selection p-value for each SNP is defined as the Bonferroni-corrected p-value, computed from K − 1 GWAS summary datasets corresponding to traits X₁ to X_K−1. SNPs are subsequently chosen based on the selection p-values using LD clumping⁴² (with r² cutoff at 0.001), ensuring that selected SNPs are approximately independent. Compared to a stringent p-value threshold (such as 10⁻⁸) for SNP selection, our method allows for a higher threshold (such as 10⁻⁴ or 0.01) and uses SNPs that are weakly associated with the exposure traits. This can generally increase the power of MR analysis.

Model estimation and inference

We use a hierarchical Bayesian framework and Gibbs sampler to infer the direct and indirect causal effects, which can be expressed as functions of the matrix $\tilde{{{{\bf{B}}}}}$. In addition to the spike-and-slab prior on the direct genetic effects, we put Gaussian priors on elements of the matrix $\tilde{{{{\bf{B}}}}}$ and conjugate priors on the hyperparameters:

$${\tilde{\beta }}_{kl}| {\sigma }^{2} { \sim }^{{{{\rm{i}}}}.{{{\rm{i}}}}.{{{\rm{d}}}}.}{{{\mathcal{N}}}}(0,{\sigma }^{2}),\quad 1\le l < k\le K\\ {\sigma }^{2} \sim {{{\rm{InvGamma}}}}(\alpha,\beta )\\ {p}_{k} { \sim }^{{{{\rm{ind}}}}}{{{\rm{Beta}}}}({a}_{k},{b}_{k}),\quad k=1,2,\ldots,K\\ {\sigma }_{k0}^{2} { \sim }^{{{{\rm{ind}}}}}\,{\mbox{InvGamma}}\,({\alpha }_{k}^{0},{\beta }_{k}^{0})\quad k=1,2,\ldots,K\\ {\sigma }_{k1}^{2} { \sim }^{{{{\rm{ind}}}}}\,{\mbox{InvGamma}}\,({\alpha }_{k}^{1},{\beta }_{k}^{1})\quad k=1,2,\ldots,K$$

To estimate the above Bayesian hierarchical model, we employ a Gibbs sampler algorithm which can efficiently generate posterior samples when K is small. Posterior samples of $\tilde{{{{\bf{B}}}}}$ directly give us posterior samples of B, which can then be used to construct credible intervals for any direct and indirect causal effects, or their functions. Additionally, we use a simplified empirical Bayes approach to choose the hyper-parameters $({\alpha }_{k}^{0},{\beta }_{k}^{0}),({\alpha }_{k}^{1},{\beta }_{k}^{1})$ and (a_k, b_k), recognizing their significant impact on the posterior distributions of $\tilde{{{{\bf{B}}}}}$. For further computational and mathematical details, see SI Text.

Extension to multiple traits at each time point

We also extend our model to accommodate multiple traits at any given time point, incorporating two key assumptions to facilitate the MR analysis. First, we assume that there is no causal effect between traits at the same time point. This assumption avoids the need to identify causal directions between any two traits that coexist temporally. If two sets of traits {X_k1, ⋯ , X_km} and {X_k(m+1), ⋯ , X_kM} at the same time point k potentially causally related–with the first set influencing the second–our method can still be applied by introducing an additional pseudo-time point (or stage) k + 1 immediately following k. The second set of traits is then moved to this added stage, and all subsequent stages are shifted accordingly.

The second assumption is a generalization of the independence assumption on direct genetic effects. Specifically, we require that the direct genetic effects α_kj of any SNP j must be independent across all traits, including those measured at the same and at different time points. With these assumptions, we can adapt our model and Gibbs sampler to infer the direct and indirect causal effects.

Benchmarking methods

We benchmarked the competing methods using publicly available software packages. Specifically, GRAPPLE, MVMR-IVW (MVMR; https://github.com/WSpiller/MVMR), MVMR-Egger (MendelianRandomization; https://CRAN.R-project.org/package=MendelianRandomization), MVMR-Robust (https://github.com/aj-grant/robust-mvmr), MVMR-cML (https://github.com/ZhaotongL/MVMR-cML), and MVMR-Horse (https://github.com/aj-grant/mrhorse) were all implemented with their standard procedures recommended in the respective software documentation. We used the same set of selected SNPs for each method.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

GWAS summary statistics are all downloaded from public sources and are processed following the standard workflow in GRAPPLE to select genetic instruments. For adult BMI, GWAS summary statistics is obtained from the GIANT consortium website (https://portals.broadinstitute.org/collaboration/giant/index.php/GIANT_consortium_data_files#2017_Gene_x_Environment_Summary_Statistics) and the UK Biobank Neale’s lab (http://www.nealelab.is/uk-biobank/) using phenotype code 21001. Adult lipid traits (LDL-C, HDL-C, and triglycerides) summary statistics are acquired from the GLGC cohort via the GLGC website (https://csg.sph.umich.edu/willer/public/lipids2013/) and from the GERA cohort through GWAS Catalog with study accession numbers GCST007141, GCST007140, and GCST007142. Breast cancer GWAS summary statistics come from GWAS Catalog with study accession number GCST004988, while those for T2D are downloaded from the DIAGRAM website (https://diagram-consortium.org/downloads.html), and for stroke from GWAS Catalog with study accession number GCST006906. For childhood traits, GWAS summary statistics on 1-year-old BMI and 8-year-old BMI are from the MoBa 2019 cohort, downloadable from their website (https://www.fhi.no/en/ch/studies/moba/for-forskere-artikler/gwas-data-from-moba/#2019). Childhood BMI GWAS data from the EGG Consortium can be found at http://egg-consortium.org/childhood-bmi.html. The ALSPAC datasets for childhood lipid traits, SBP, and BMI are obtained from GWAS Catalog with study accession numbers GCST90104679, GCST90104678, GCST90104680, GCST90104683, GCST90104677. Source data of all figures, along with the codes to reproduce the figures, are provided with this paper and can be accessed at (https://github.com/ZixuanWu1/FLOW-MR-PAPER/tree/main/Source-Data).

Code availability

The R package FLOW-MR for conducting our Bayesian mediation MR analysis is publicly available for installation at (https://github.com/ZixuanWu1/FLOW-MR). The code for all experiments and real data analyses can be accessed at (https://github.com/ZixuanWu1/FLOW-MR-PAPER).

References

Davey Smith, G. & Ebrahim, S. ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int. J. Epidemiol. 32, 1–22 (2003).
Article Google Scholar
Davey Smith, G. & Hemani, G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum. Mol. Genet. 23, R89–R98 (2014).
Article CAS PubMed Central Google Scholar
Richardson, T. G., Sanderson, E., Elsworth, B., Tilling, K. & Smith, G. D. Use of genetic variation to separate the effects of early and later life adiposity on disease risk: Mendelian randomisation study. BMJ 369, m1203 (2020).
Shi, J. et al. Mendelian randomization with repeated measures of a time-varying exposure: an application of structural mean models. Epidemiology 33, 84–94 (2022).
Article PubMed Central Google Scholar
Labrecque, J. A. & Swanson, S. A. Interpretation and potential biases of Mendelian randomization estimates with time-varying exposures. Am. J. Epidemiol. 188, 231–238 (2019).
Article Google Scholar
Sanderson, E., Richardson, T. G., Morris, T. T., Tilling, K. & Davey Smith, G. Estimation of causal effects of a time-varying exposure at multiple time points through multivariable Mendelian randomization. PLoS Genet. 18, e1010290 (2022).
Article CAS PubMed Central Google Scholar
Power, G. M. et al. Methodological approaches, challenges, and opportunities in the application of Mendelian randomisation to lifecourse epidemiology: a systematic literature review. Eur. J. Epidemiol. 39, 501–520 (2023).
Burgess, S., Dudbridge, F. & Thompson, S. G. Re:multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am. J. Epidemiol. 181, 290–291 (2015).
Article Google Scholar
Shi, J. et al. Instrumental variable estimation for a time-varying treatment and a time-to-event outcome via structural nested cumulative failure time models. BMC Med. Res. Methodol. 21, 1–12 (2021).
Article Google Scholar
Cao, Y., Rajan, S. S. & Wei, P. Mendelian randomization analysis of a time-varying exposure for binary disease outcomes using functional data analysis methods. Genet. Epidemiol. 40, 744–755 (2016).
Article PubMed Central Google Scholar
Wang, J. et al. Causal inference for heritable phenotypic risk factors using heterogeneous genetic instruments. PLoS Genet. 17, e1009575 (2021).
Article CAS PubMed Central Google Scholar
Bowden, J., Davey Smith, G. & Burgess, S. Mendelian randomization with invalid instruments: effect estimation and bias detection through egger regression. Int. J. Epidemiol. 44, 512–525 (2015).
Article PubMed Central Google Scholar
Davidson, R., MacKinnon, J. G. et al. Estimation and inference in econometrics, 63 (Oxford, 1993).
Lee, M.-R., Lim, Y.-H. & Hong, Y.-C. Causal association of body mass index with hypertension using a Mendelian randomization design. Medicine 97, e11252 (2018).
Zhao, Q., Wang, J., Bowden, J. & Small, D. Statistical inference in two-sample summary-data Mendelian randomization using robust adjusted profile score. Ann. Stat. 48, 1742–1769 (2018).
Wu, Z. & Wang, J. Interpretation of two-sample mendelian randomization for binary exposures and outcome. Preprint at https://www.biorxiv.org/content/10.1101/2024.06.09.598150v1 (2024).
Morrison, J., Knoblauch, N., Marcus, J. H., Stephens, M. & He, X. Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics. Nat. Genet. 52, 740–747 (2020).
Article CAS PubMed Central Google Scholar
VanderWeele, T. J., Vansteelandt, S. & Robins, J. M. Effect decomposition in the presence of an exposure-induced mediator-outcome confounder. Epidemiology 25, 300 (2014).
Article PubMed Central Google Scholar
Rees, J. M., Wood, A. M. & Burgess, S. Extending the MR-Egger method for multivariable Mendelian randomization to correct for both measured and unmeasured pleiotropy. Stat. Med. 36, 4705–4718 (2017).
Article MathSciNet PubMed Central Google Scholar
Grant, A. J. & Burgess, S. Pleiotropy robust methods for multivariable Mendelian randomization. Stat. Med. 40, 5813–5830 (2021).
Article MathSciNet PubMed Central Google Scholar
Grant, A. J. & Burgess, S. A bayesian approach to Mendelian randomization using summary statistics in the univariable and multivariable settings with correlated pleiotropy. Am. J. Hum. Genet. 111, 165–180 (2024).
Article CAS PubMed Central Google Scholar
Lin, Z., Xue, H. & Pan, W. Robust multivariable Mendelian randomization based on constrained maximum likelihood. Am. J. Hum. Genet. 110, 592–605 (2023).
Article CAS PubMed Central Google Scholar
Helgeland, Ø. et al. Genome-wide association study reveals dynamic role of genetic variation in infant and early childhood growth. Nat. Commun. 10, 4448 (2019).
Article ADS PubMed Central Google Scholar
O’Nunain, K., Sanderson, E., Holmes, M. V., Smith, G. D. & Richardson, T. G. A genome-wide association study of childhood adiposity and blood lipids. Wellcome Open Res. 6, 303 (2023).
Article PubMed Central Google Scholar
Sanderson, E., Spiller, W. & Bowden, J. Testing and correcting for weak and pleiotropic instruments in two-sample multivariable mendelian randomization. Stat. Med. 40, 5434–5452 (2021).
Article MathSciNet PubMed Central Google Scholar
Ooi, B. N. S. et al. The genetic interplay between body mass index, breast size and breast cancer risk: a Mendelian randomization analysis. Int. J. Epidemiol. 48, 781–794 (2019).
Article PubMed Central Google Scholar
Vatten, L. J. & Kvinnsland, S. Body mass index and risk of breast cancer. a prospective study of 23,826 norwegian women. Int. J. Cancer 45, 440–444 (1990).
Article CAS Google Scholar
Guo, Y. et al. Genetically predicted body mass index and breast cancer risk: Mendelian randomization analyses of data from 145,000 women of european descent. PLoS Med. 13, e1002105 (2016).
Article PubMed Central Google Scholar
Michailidou, K. et al. Association analysis identifies 65 new breast cancer risk loci. Nature 551, 92–94 (2017).
Article ADS CAS PubMed Central Google Scholar
Willmot, M., Leonardi-Bee, J. & Bath, P. M. High blood pressure in acute stroke and subsequent outcome: a systematic review. Hypertension 43, 18–24 (2004).
Article CAS Google Scholar
Chen, L. et al. Systematic Mendelian randomization using the human plasma proteome to discover potential therapeutic targets for stroke. Nat. Commun. 13, 6143 (2022).
Article ADS CAS PubMed Central Google Scholar
Willer, C. et al. Discovery and refinement of loci associated with lipid levels. Nat. Genet. 45, 1274–1283 (2013).
Article CAS PubMed Central Google Scholar
Malik, R. et al. Multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes. Nat. Genet. 50, 524–537 (2018).
Article CAS PubMed Central Google Scholar
Hoffmann, T. J. et al. Genome-wide association analyses using electronic health records identify new loci influencing blood pressure variation. Nat. Genet. 49, 54–64 (2017).
Article CAS Google Scholar
Hoffmann, T. J. et al. A large electronic-health-record-based genome-wide study of serum lipids. Nat. Genet. 50, 401–413 (2018).
Article CAS PubMed Central Google Scholar
Tian, H. & Burgess, S. Estimation of time-varying causal effects with multivariable Mendelian randomization: some cautionary notes. Int. J. Epidemiol. 52, 846–857 (2023).
Article PubMed Central Google Scholar
Burgess, S., Swanson, S. A. & Labrecque, J. A. Are Mendelian randomization investigations immune from bias due to reverse causation? Eur. J. Epidemiol. 36, 253–257 (2021).
Article PubMed Central Google Scholar
VanderWeele, T. Explanation in causal inference: methods for mediation and interaction (Oxford University Press, 2015).
Baron, R. M. & Kenny, D. A. The moderator–mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J. Personal. Soc. Psychol. 51, 1173 (1986).
Article CAS Google Scholar
Sullivant, S., Talaska, K. & Draisma, J. Trek separation for gaussian graphical models. Ann. Stat. 38, 1665–1685 (2008).
Zhao, Q., Chen, Y., Wang, J. & Small, D. S. Powerful three-sample genome-wide design and robust statistical inference in summary-data Mendelian randomization. Int. J. Epidemiol. 48, 1478–1492 (2019).
Article Google Scholar
Purcell, S. et al. Plink: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Article CAS PubMed Central Google Scholar

Download references

Acknowledgements

J.W. is partly supported by the National Science Foundation under grants DMS-2113646 and DMS-2238656. Q.Z. is partly supported by EPSRC grant EP/V049968/1. We acknowledge the University of Chicago’s Research Computing Center for their support of this work.

Author information

Authors and Affiliations

Department of Statistics, University of Chicago, Chicago, IL, USA
Zixuan Wu, Ethan Lewis & Jingshu Wang
Statistical Laboratory, Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge, UK
Qingyuan Zhao

Authors

Zixuan Wu
View author publications
Search author on:PubMed Google Scholar
Ethan Lewis
View author publications
Search author on:PubMed Google Scholar
Qingyuan Zhao
View author publications
Search author on:PubMed Google Scholar
Jingshu Wang
View author publications
Search author on:PubMed Google Scholar

Contributions

J.W. supervised the work. J.W. and Q.Z. formulated the problem. J.W., Z.W. and E.L. developed the algorithm. Z.W. and E.L. implemented the method and performed simulations. Z.W. conducted real data analyses. J.W., Q.Z. and Z.W. wrote the paper.

Corresponding author

Correspondence to Jingshu Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (download PDF )

Reporting Summary (download PDF )

Transparent Peer Review file (download PDF )

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Cite this article

Wu, Z., Lewis, E., Zhao, Q. et al. Causal mediation analysis for time-varying heritable risk factors with Mendelian randomization. Nat Commun 16, 6945 (2025). https://doi.org/10.1038/s41467-025-61648-7

Download citation

Received: 20 August 2024
Accepted: 24 June 2025
Published: 28 July 2025
Version of record: 28 July 2025
DOI: https://doi.org/10.1038/s41467-025-61648-7

This article is cited by

Deciphering the causal nexus of mitochondrial biology with breast cancer
- Dongsong Liu
- Hongfang Xu
- Wangjun Yan
Discover Oncology (2026)