Fig. 2

A data-driven directed acyclic graph (DAG) that combines evidence-based insights with data-driven modelling to represent the disease trajectory and guide the data analysis strategy of this study. Each node represents an event and displays its relative case frequency (in brackets): the proportion of all included individuals (cases) who experienced the event. Each edge (arrow) represents a time-ordered sequence of two events and displays two statistics: the relative-antecedent frequency (risk), which is the proportion of individuals (cases) with the exposure (antecedent) event who were directly followed by the target event; and the median time (in brackets), in years, from the exposure event to the outcome or competing event. Only statistically significant transitions between PPI and H2B with a transition risk greater than 10% are shown in the graph.
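
The node and edge statistics defined in this caption can be illustrated with a minimal sketch. It is not the study's code: the function name `dag_statistics`, the input layout, and the toy data are hypothetical, and it assumes "directly followed" means the next recorded event in an individual's time-ordered history. Significance testing and the 10% risk threshold are not covered.

```python
from collections import defaultdict
from statistics import median


def dag_statistics(trajectories):
    """Compute node and edge statistics from per-individual event histories.

    trajectories: dict mapping an individual id to a list of
    (event, time_in_years) pairs. This layout is an assumption
    made for illustration only.
    """
    n_total = len(trajectories)
    node_cases = defaultdict(set)   # event -> individuals who experienced it
    edge_cases = defaultdict(set)   # (antecedent, target) -> individuals
    edge_times = defaultdict(list)  # (antecedent, target) -> elapsed years

    for person, events in trajectories.items():
        ordered = sorted(events, key=lambda e: e[1])  # time-order the events
        for event, _ in ordered:
            node_cases[event].add(person)
        # Consecutive pairs are the "directly followed" transitions.
        for (a, t_a), (b, t_b) in zip(ordered, ordered[1:]):
            if person not in edge_cases[(a, b)]:
                edge_cases[(a, b)].add(person)
                edge_times[(a, b)].append(t_b - t_a)

    # Relative case frequency: cases with the event / all included individuals.
    node_freq = {ev: len(cases) / n_total for ev, cases in node_cases.items()}

    # Relative-antecedent frequency (risk): cases making the transition /
    # cases with the antecedent event; plus the median transition time.
    edges = {
        (a, b): {
            "risk": len(cases) / len(node_cases[a]),
            "median_years": median(edge_times[(a, b)]),
        }
        for (a, b), cases in edge_cases.items()
    }
    return node_freq, edges


# Toy data (invented for illustration, not taken from the study).
toy = {
    1: [("PPI", 0.0), ("H2B", 2.0)],
    2: [("PPI", 0.0), ("H2B", 4.0)],
    3: [("PPI", 1.0), ("outcome", 2.0)],
    4: [("H2B", 0.0)],
}
node_freq, edges = dag_statistics(toy)
```

In this toy example, three of the four individuals have a PPI event, and two of those three are directly followed by H2B, so the edge from PPI to H2B carries a risk of 2/3.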