Fig. 5: Simulations illustrating the impact of reporting error and/or selective participation on exposure–outcome associations.

a, Directed acyclic graphs illustrating the simulation settings: the ground-truth scenario (no participation bias or reporting error; 1, highlighted in blue) and scenarios in which reporting error (2, highlighted in violet), participation bias (3, highlighted in green) or both (4, highlighted in orange) were present when assessing the effect of BMI on self-reported education (top) and the effect of self-reported education on BMI (bottom). b,c, The impact of the two participatory behaviours (reporting error and selective participation) in each simulated scenario when testing the association between education and BMI, assessed in terms of bias (b; 1 and 2, showing the difference between the estimated coefficient (y axis) and the true value of the exposure–outcome association (grey line; the true causal effect was set to −0.2)) and RMSE (c; 1 and 2, showing RMSE on the y axis, with the grey line indicating RMSE = 0). Data were simulated to mimic the UKBB response rate, such that around 5.5% of the n = 9,000,000 simulated individuals were selected. All error bars represent 95% CIs.
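
The four scenarios and the two summary metrics (bias and RMSE) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation code used for the figure: only the true effect (−0.2) and the approximate 5.5% participation rate come from the caption. The standardized normal variables, the reduced sample size (200,000 rather than 9,000,000), the reporting-error model, the threshold participation model and the names `simulate_once` and `summarise` are all hypothetical choices made for this example.

```python
import numpy as np

TRUE_EFFECT = -0.2   # true causal effect of education on BMI (from the caption)
TARGET_RATE = 0.055  # approximate participation rate (from the caption)


def simulate_once(rng, n=200_000, reporting_error=False, selection=False):
    """Return the OLS slope of BMI on (possibly misreported) education.

    n is far smaller than the 9,000,000 in the caption to keep the sketch fast.
    """
    edu = rng.standard_normal(n)                      # true education (standardized)
    bmi = TRUE_EFFECT * edu + rng.standard_normal(n)  # outcome

    reported = edu
    if reporting_error:
        # Hypothetical error model: misreporting depends on BMI plus random noise.
        reported = edu + 0.3 * bmi + 0.5 * rng.standard_normal(n)

    keep = np.ones(n, dtype=bool)
    if selection:
        # Hypothetical participation model: a liability that rises with education
        # and falls with BMI; only the top ~5.5% of liabilities participate.
        liability = 0.5 * edu - 0.5 * bmi + rng.standard_normal(n)
        keep = liability > np.quantile(liability, 1 - TARGET_RATE)

    x, y = reported[keep], bmi[keep]
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)     # simple-regression slope


def summarise(reporting_error, selection, n_rep=20, seed=0):
    """Bias and RMSE of the slope estimate across n_rep simulated data sets."""
    rng = np.random.default_rng(seed)
    est = np.array([
        simulate_once(rng, reporting_error=reporting_error, selection=selection)
        for _ in range(n_rep)
    ])
    bias = est.mean() - TRUE_EFFECT
    rmse = np.sqrt(np.mean((est - TRUE_EFFECT) ** 2))
    return bias, rmse


if __name__ == "__main__":
    scenarios = {
        "1 ground truth": (False, False),
        "2 reporting error": (True, False),
        "3 participation bias": (False, True),
        "4 both": (True, True),
    }
    for name, (err, sel) in scenarios.items():
        bias, rmse = summarise(err, sel)
        print(f"{name}: bias={bias:+.3f}, RMSE={rmse:.3f}")
```

In this sketch, bias is the mean estimate minus the true effect and RMSE is the root of the mean squared deviation from the true effect, so RMSE is never smaller than the absolute bias. The sizes of the distortions depend entirely on the assumed error and participation models and should not be read as reproducing the values plotted in b and c.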