Fig. 1: Delphi, a modified GPT architecture, models health trajectories.

a, Schematic of health trajectories based on ICD-10 diagnoses, lifestyle and healthy padding tokens, each recorded at a distinct age. b, Training, validation and testing data derived from the UK Biobank (left) and Danish disease registries (right). c, The Delphi model architecture. The red elements indicate changes compared with the underlying GPT-2 model. ‘N ×’ denotes applying the transformer block sequentially N times. d, Example model input (prompt) and output (samples) comprising (age:token) pairs. e, Scaling laws of Delphi, showing the optimal validation loss as a function of the number of model parameters for different training data sizes. f, Ablation results, measured as the cross-entropy difference relative to an age- and sex-based baseline (y axis), for different ages (x axis). g, Accuracy of the predicted time to event. The observed (y axis) and expected (x axis) times to event are shown for each next-token prediction (grey dots). The blue line shows the average across consecutive bins of the x axis.
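Panels c and d summarize the central input convention: each element of a trajectory is an (age:token) pair, so the model must combine a discrete token identity with a continuous age rather than a discrete sequence position. The PyTorch sketch below is a minimal, hypothetical illustration of how such pairs could be embedded and passed through a stack of N causal transformer blocks; the class names (`AgeTokenEmbedding`, `DelphiLikeModel`), the sinusoidal age encoding, the vocabulary size and all hyperparameters are assumptions made for illustration, not implementation details taken from the paper.

```python
# Illustrative sketch only -- not the authors' code.
import math
import torch
import torch.nn as nn


class AgeTokenEmbedding(nn.Module):
    """Embed an (age:token) pair: learned token embedding plus a continuous
    age encoding (used here in place of GPT-2's discrete positional embedding)."""

    def __init__(self, vocab_size: int, d_model: int):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.d_model = d_model

    def age_encoding(self, age_in_days: torch.Tensor) -> torch.Tensor:
        # Sinusoidal encoding of continuous age (hypothetical design choice).
        half = self.d_model // 2
        freqs = torch.exp(-math.log(10000.0) * torch.arange(half) / half)
        ang = age_in_days.unsqueeze(-1) * freqs            # (B, T, half)
        return torch.cat([torch.sin(ang), torch.cos(ang)], dim=-1)

    def forward(self, tokens: torch.Tensor, ages: torch.Tensor) -> torch.Tensor:
        return self.token_emb(tokens) + self.age_encoding(ages)


class DelphiLikeModel(nn.Module):
    """(age:token) embedding -> N causal transformer blocks -> next-token logits."""

    def __init__(self, vocab_size=1300, d_model=128, n_heads=8, n_layers=4):
        super().__init__()
        self.embed = AgeTokenEmbedding(vocab_size, d_model)
        block = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model,
                                           batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(block, num_layers=n_layers)  # 'N x' blocks
        self.head = nn.Linear(d_model, vocab_size)  # logits over the token vocabulary

    def forward(self, tokens: torch.Tensor, ages: torch.Tensor) -> torch.Tensor:
        x = self.embed(tokens, ages)
        causal_mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        return self.head(self.blocks(x, mask=causal_mask))


# Usage: a prompt of (age:token) pairs, as in panel d.
tokens = torch.randint(0, 1300, (1, 16))                  # ICD-10 / lifestyle / padding token ids
ages = torch.sort(torch.rand(1, 16) * 70 * 365).values    # ages in days, increasing along the sequence
logits = DelphiLikeModel()(tokens, ages)                   # (1, 16, vocab_size)
```

If, as panel c suggests, the positional embedding is replaced by an encoding of age, the same GPT-style stack can both predict which token comes next and tie that prediction to a point in the life course, which is what the time-to-event calibration in panel g evaluates.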