Fig. 3: Generative modelling with Delphi-2M informs future outcomes.
From: Learning the natural history of human disease with generative transformers

a, Schematic of the experiment design. Delphi-2M is used to simulate health trajectories from validation data (n = 63,622 individuals with disease records both before and after 60 years of age) observed until the age of 60 years. A single trajectory is simulated per individual. The simulated trajectories are then compared with the observed outcomes for the same person. b, Delphi-2M-modelled disease rates at ages 70–75 years (x axis) compared with observed rates at the same ages (y axis). c, The fraction of correctly predicted diagnoses (y axis) per 1-year age bin as a function of the years elapsed since the simulation started at age 60 years (x axis). Orange, Delphi-2M; blue, a prediction baseline based on age and sex. d, Simulated (x axis) and observed (y axis) fold changes of disease rates for high versus low smoking, alcohol consumption and BMI groups. The evaluation period covered ages 70–75 years and used simulations started at the age of 60 years. e, The same as b, evaluated for simulations started at birth. f, The AUC values of disease risk prediction (n = 1,334 disease–sex pairs) for Delphi trained on UKB data and on synthetic data sampled from Delphi-2M (Methods). The box plots show the median (centre line), the first to the third quartile (box limits) and the 0.025 and 0.975 quantiles (whiskers).
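
The rate and fold-change comparisons in panels b and d can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the representation of a trajectory as a list of (age, disease code) events, the function names `disease_rate` and `fold_change`, the definition of a rate as the fraction of individuals with at least one diagnosis in the evaluation window, and the toy data. The exact definitions used for the figure are given in the paper's Methods.

```python
from typing import List, Tuple

# Hypothetical representation: one trajectory per person, stored as
# (age_in_years, disease_code) events. Not the paper's data format.
Trajectory = List[Tuple[float, str]]


def disease_rate(trajectories: List[Trajectory], code: str,
                 start_age: float = 70.0, end_age: float = 75.0) -> float:
    """Fraction of individuals with at least one `code` event
    at start_age <= age < end_age (an assumed, simplified rate)."""
    if not trajectories:
        raise ValueError("need at least one trajectory")
    hits = sum(
        any(c == code and start_age <= age < end_age for age, c in traj)
        for traj in trajectories
    )
    return hits / len(trajectories)


def fold_change(high: List[Trajectory], low: List[Trajectory],
                code: str) -> float:
    """Rate in the high-exposure group divided by the rate in the
    low-exposure group; NaN when the low-group rate is zero."""
    low_rate = disease_rate(low, code)
    if low_rate == 0:
        return float("nan")
    return disease_rate(high, code) / low_rate


# Toy data invented for this example. "I21" is used only as a sample code.
high_group = [[(71.0, "I21")], [(72.5, "I21")], [(65.0, "I21")], []]
low_group = [[(73.0, "I21")], [], [], []]

print(disease_rate(high_group, "I21"))             # 0.5
print(fold_change(high_group, low_group, "I21"))   # 2.0
```

Applying the same two functions to the simulated and to the observed trajectories of the same individuals would give one (x, y) point per disease, which is the kind of comparison plotted in panels b and d.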