Fig. 5: Comparison of model performance with and without lab features.

Error bars show 95% confidence intervals around the mean. ECG models had higher AUROCs than the baseline model (XGB: Age, Sex, Lab) at all time points, even without lab values, but the difference was smaller for longer-range predictions. Adding lab features significantly improved model performance across all models and time points; however, the gains were small in magnitude (0.99% on average; DeLong test, all p < 0.001). Overall, the DL model with ECG traces, age, sex, and lab features was the best-performing model in the comparison. AUROC area under the receiver operating characteristic curve, DL deep learning, ECG electrocardiogram, XGB XGBoost.