Figure 5
From: EHR foundation models improve robustness in the presence of temporal distribution shift

Scaling of GRU- and transformer-based CLMBR with pretraining set size. A more positive slope (more negative for ACE) indicates a greater improvement in performance or robustness (performance in the out-of-distribution [OOD] year group) with increasing pretraining set size. Note that the number of patients includes both the training and validation sets of the pretraining cohort. The asterisk in [09–12*] denotes the subset of patients in [09–12] who were admitted to an inpatient unit during that year range. Raw performance scores and slopes are provided in Supplementary Tables S6 and S7 online, respectively. AUROC Area under the receiver operating characteristic curve; AUPRCC Calibrated area under the precision-recall curve; ACE Absolute calibration error; LR Logistic regression; CLMBR Clinical language model-based representations; GRU Gated recurrent unit; TRANS Transformer; LOS Length of stay; ICU Intensive care unit.