Table 2. Performance of machine learning and statistical models on the Yale New Haven Hospital (YNHH) test cohort and the external validation cohort. Bold marks the best value in each row; \(\uparrow\) indicates that higher is better and \(\downarrow\) that lower is better.
Metric | UNOS | XGBoost | RNN | LSTM | GRU | GRU-D | ODE-RNN |
|---|---|---|---|---|---|---|---|
Yale New Haven Hospital Test Cohort (Temporal Split, After 2021) | | | | | | | |
Accuracy (4-way) \(\uparrow\) | \(0.627 \pm {0.000}\) | \(0.831 \pm {0.000}\) | \(0.816 \pm {0.014}\) | \(0.861 \pm {0.010}\) | \(0.862 \pm {0.021}\) | \(0.856 \pm {0.014}\) | \({\textbf {0.878}} \pm {0.007}\) |
Accuracy (<30 vs. >30) \(\uparrow\) | \(0.627 \pm {0.000}\) | \(0.900 \pm {0.000}\) | \(0.922 \pm {0.020}\) | \(0.947 \pm {0.009}\) | \(0.948 \pm {0.006}\) | \(0.941 \pm {0.005}\) | \({\textbf {0.955}} \pm {0.010}\) |
Accuracy (<60 vs. >60) \(\uparrow\) | \(0.711 \pm {0.000}\) | \(0.928 \pm {0.000}\) | \(0.892 \pm {0.012}\) | \(0.939 \pm {0.007}\) | \(0.946 \pm {0.010}\) | \(0.941 \pm {0.006}\) | \({\textbf {0.953}} \pm {0.003}\) |
Accuracy (<120 vs. >120) \(\uparrow\) | \(0.779 \pm {0.000}\) | \(0.934 \pm {0.000}\) | \(0.903 \pm {0.004}\) | \(0.924 \pm {0.006}\) | \(0.930 \pm {0.011}\) | \(0.929 \pm {0.012}\) | \({\textbf {0.942}} \pm {0.003}\) |
ROC-AUC (<30 vs. >30) \(\uparrow\) | \(0.584 \pm {0.000}\) | \(0.962 \pm {0.000}\) | \(0.952 \pm {0.023}\) | \(0.978 \pm {0.012}\) | \(0.973 \pm {0.010}\) | \(0.972 \pm {0.008}\) | \({\textbf {0.987}} \pm {0.004}\) |
ROC-AUC (<60 vs. >60) \(\uparrow\) | \(0.592 \pm {0.000}\) | \(0.966 \pm {0.000}\) | \(0.932 \pm {0.019}\) | \(0.968 \pm {0.012}\) | \(0.965 \pm {0.011}\) | \(0.963 \pm {0.007}\) | \({\textbf {0.987}} \pm {0.003}\) |
ROC-AUC (<120 vs. >120) \(\uparrow\) | \(0.623 \pm {0.000}\) | \(0.975 \pm {0.000}\) | \(0.922 \pm {0.018}\) | \(0.961 \pm {0.014}\) | \(0.957 \pm {0.011}\) | \(0.951 \pm {0.006}\) | \({\textbf {0.984}} \pm {0.002}\) |
PR-AUC (<30 vs. >30) \(\uparrow\) | \(0.734 \pm {0.000}\) | \(0.972 \pm {0.000}\) | \(0.953 \pm {0.034}\) | \(0.981 \pm {0.014}\) | \(0.971 \pm {0.012}\) | \(0.975 \pm {0.012}\) | \({\textbf {0.987}} \pm {0.003}\) |
PR-AUC (<60 vs. >60) \(\uparrow\) | \(0.799 \pm {0.000}\) | \(0.986 \pm {0.000}\) | \(0.956 \pm {0.024}\) | \(0.982 \pm {0.010}\) | \(0.974 \pm {0.010}\) | \(0.976 \pm {0.008}\) | \({\textbf {0.995}} \pm {0.001}\) |
PR-AUC (<120 vs. >120) \(\uparrow\) | \(0.857 \pm {0.000}\) | \(0.993 \pm {0.000}\) | \(0.962 \pm {0.019}\) | \(0.985 \pm {0.008}\) | \(0.977 \pm {0.010}\) | \(0.974 \pm {0.009}\) | \({\textbf {0.996}} \pm {0.001}\) |
F1 (<30 vs. >30) \(\uparrow\) | \(0.770 \pm {0.000}\) | \(0.929 \pm {0.000}\) | \(0.940 \pm {0.013}\) | \(0.958 \pm {0.007}\) | \(0.960 \pm {0.003}\) | \(0.954 \pm {0.003}\) | \({\textbf {0.967}} \pm {0.007}\) |
F1 (<60 vs. >60) \(\uparrow\) | \(0.831 \pm {0.000}\) | \(0.949 \pm {0.000}\) | \(0.926 \pm {0.008}\) | \(0.957 \pm {0.004}\) | \(0.963 \pm {0.006}\) | \(0.959 \pm {0.003}\) | \({\textbf {0.968}} \pm {0.002}\) |
F1 (<120 vs. >120) \(\uparrow\) | \(0.876 \pm {0.000}\) | \(0.957 \pm {0.000}\) | \(0.935 \pm {0.006}\) | \(0.953 \pm {0.003}\) | \(0.957 \pm {0.006}\) | \(0.953 \pm {0.009}\) | \({\textbf {0.960}} \pm {0.003}\) |
PPV (<30 vs. >30) \(\uparrow\) | \(0.627 \pm {0.000}\) | \(0.882 \pm {0.000}\) | \(0.930 \pm {0.024}\) | \(0.942 \pm {0.010}\) | \(0.945 \pm {0.010}\) | \(0.935 \pm {0.010}\) | \({\textbf {0.963}} \pm {0.010}\) |
PPV (<60 vs. >60) \(\uparrow\) | \(0.711 \pm {0.000}\) | \(0.944 \pm {0.000}\) | \(0.922 \pm {0.010}\) | \(0.945 \pm {0.007}\) | \(0.952 \pm {0.013}\) | \(0.945 \pm {0.008}\) | \({\textbf {0.967}} \pm {0.003}\) |
PPV (<120 vs. >120) \(\uparrow\) | \(0.779 \pm {0.000}\) | \(0.966 \pm {0.000}\) | \(0.925 \pm {0.015}\) | \(0.937 \pm {0.015}\) | \(0.935 \pm {0.018}\) | \(0.927 \pm {0.019}\) | \({\textbf {0.976}} \pm {0.010}\) |
NPV (<30 vs. >30) \(\uparrow\) | \(0.000 \pm {0.000}\) | \({\textbf {0.960}} \pm {0.000}\) | \(0.915 \pm {0.030}\) | \(0.955 \pm {0.018}\) | \(0.956 \pm {0.008}\) | \(0.954 \pm {0.009}\) | \(0.953 \pm {0.013}\) |
NPV (<60 vs. >60) \(\uparrow\) | \(0.000 \pm {0.000}\) | \(0.886 \pm {0.000}\) | \(0.826 \pm {0.023}\) | \(0.922 \pm {0.025}\) | \({\textbf {0.935}} \pm {0.006}\) | \(0.928 \pm {0.011}\) | \(0.922 \pm {0.011}\) |
NPV (<120 vs. >120) \(\uparrow\) | \(0.000 \pm {0.000}\) | \(0.829 \pm {0.000}\) | \(0.797 \pm {0.039}\) | \(0.885 \pm {0.042}\) | \({\textbf {0.919}} \pm {0.048}\) | \(0.914 \pm {0.025}\) | \(0.827 \pm {0.018}\) |
ECE \(\downarrow\) | \(0.054 \pm {0.000}\) | \(0.092 \pm {0.000}\) | \(0.051 \pm {0.004}\) | \(0.048 \pm {0.007}\) | \(0.055 \pm {0.014}\) | \(0.055 \pm {0.008}\) | \({\textbf {0.033}} \pm {0.008}\) |
External Validation Cohort | | | | | | | |
Accuracy (4-way) \(\uparrow\) | \(0.641 \pm {0.000}\) | \(0.785 \pm {0.000}\) | \(0.791 \pm {0.015}\) | \(0.800 \pm {0.015}\) | \(0.821 \pm {0.010}\) | \(0.809 \pm {0.012}\) | \({\textbf {0.866}} \pm {0.010}\) |
Accuracy (<30 vs. >30) \(\uparrow\) | \(0.641 \pm {0.000}\) | \(0.884 \pm {0.000}\) | \(0.923 \pm {0.017}\) | \(0.930 \pm {0.003}\) | \(0.942 \pm {0.001}\) | \(0.934 \pm {0.003}\) | \({\textbf {0.953}} \pm {0.012}\) |
Accuracy (<60 vs. >60) \(\uparrow\) | \(0.713 \pm {0.000}\) | \(0.898 \pm {0.000}\) | \(0.892 \pm {0.014}\) | \(0.913 \pm {0.009}\) | \(0.932 \pm {0.005}\) | \(0.919 \pm {0.008}\) | \({\textbf {0.954}} \pm {0.007}\) |
Accuracy (<120 vs. >120) \(\uparrow\) | \(0.786 \pm {0.000}\) | \(0.883 \pm {0.000}\) | \(0.865 \pm {0.004}\) | \(0.877 \pm {0.012}\) | \(0.892 \pm {0.008}\) | \(0.884 \pm {0.011}\) | \({\textbf {0.934}} \pm {0.005}\) |
ROC-AUC (<30 vs. >30) \(\uparrow\) | \(0.508 \pm {0.000}\) | \(0.943 \pm {0.000}\) | \(0.946 \pm {0.012}\) | \(0.964 \pm {0.003}\) | \(0.966 \pm {0.004}\) | \(0.965 \pm {0.003}\) | \({\textbf {0.989}} \pm {0.005}\) |
ROC-AUC (<60 vs. >60) \(\uparrow\) | \(0.506 \pm {0.000}\) | \(0.952 \pm {0.000}\) | \(0.928 \pm {0.007}\) | \(0.950 \pm {0.005}\) | \(0.955 \pm {0.004}\) | \(0.952 \pm {0.003}\) | \({\textbf {0.987}} \pm {0.003}\) |
ROC-AUC (<120 vs. >120) \(\uparrow\) | \(0.534 \pm {0.000}\) | \(0.945 \pm {0.000}\) | \(0.896 \pm {0.005}\) | \(0.915 \pm {0.006}\) | \(0.926 \pm {0.004}\) | \(0.923 \pm {0.004}\) | \({\textbf {0.971}} \pm {0.004}\) |
PR-AUC (<30 vs. >30) \(\uparrow\) | \(0.695 \pm {0.000}\) | \(0.961 \pm {0.000}\) | \(0.950 \pm {0.018}\) | \(0.971 \pm {0.005}\) | \(0.956 \pm {0.015}\) | \(0.967 \pm {0.008}\) | \({\textbf {0.994}} \pm {0.003}\) |
PR-AUC (<60 vs. >60) \(\uparrow\) | \(0.752 \pm {0.000}\) | \(0.978 \pm {0.000}\) | \(0.951 \pm {0.012}\) | \(0.972 \pm {0.004}\) | \(0.959 \pm {0.012}\) | \(0.967 \pm {0.006}\) | \({\textbf {0.995}} \pm {0.001}\) |
PR-AUC (<120 vs. >120) \(\uparrow\) | \(0.824 \pm {0.000}\) | \(0.985 \pm {0.000}\) | \(0.946 \pm {0.007}\) | \(0.968 \pm {0.005}\) | \(0.960 \pm {0.004}\) | \(0.963 \pm {0.004}\) | \({\textbf {0.993}} \pm {0.001}\) |
F1 (<30 vs. >30) \(\uparrow\) | \(0.781 \pm {0.000}\) | \(0.918 \pm {0.000}\) | \(0.937 \pm {0.013}\) | \(0.946 \pm {0.002}\) | \(0.955 \pm {0.001}\) | \(0.949 \pm {0.003}\) | \({\textbf {0.966}} \pm {0.010}\) |
F1 (<60 vs. >60) \(\uparrow\) | \(0.833 \pm {0.000}\) | \(0.932 \pm {0.000}\) | \(0.926 \pm {0.007}\) | \(0.941 \pm {0.005}\) | \(0.954 \pm {0.003}\) | \(0.945 \pm {0.004}\) | \({\textbf {0.967}} \pm {0.004}\) |
F1 (<120 vs. >120) \(\uparrow\) | \(0.880 \pm {0.000}\) | \(0.926 \pm {0.000}\) | \(0.913 \pm {0.005}\) | \(0.924 \pm {0.006}\) | \(0.933 \pm {0.004}\) | \(0.929 \pm {0.007}\) | \({\textbf {0.955}} \pm {0.003}\) |
PPV (<30 vs. >30) \(\uparrow\) | \(0.641 \pm {0.000}\) | \(0.871 \pm {0.000}\) | \(0.947 \pm {0.013}\) | \(0.937 \pm {0.009}\) | \(0.949 \pm {0.005}\) | \(0.937 \pm {0.007}\) | \({\textbf {0.961}} \pm {0.009}\) |
PPV (<60 vs. >60) \(\uparrow\) | \(0.713 \pm {0.000}\) | \(0.903 \pm {0.000}\) | \(0.925 \pm {0.017}\) | \(0.925 \pm {0.018}\) | \(0.938 \pm {0.007}\) | \(0.922 \pm {0.010}\) | \({\textbf {0.958}} \pm {0.010}\) |
PPV (<120 vs. >120) \(\uparrow\) | \(0.786 \pm {0.000}\) | \(0.905 \pm {0.000}\) | \(0.892 \pm {0.016}\) | \(0.891 \pm {0.022}\) | \(0.902 \pm {0.016}\) | \(0.890 \pm {0.016}\) | \({\textbf {0.966}} \pm {0.011}\) |
NPV (<30 vs. >30) \(\uparrow\) | \(0.000 \pm {0.000}\) | \(0.932 \pm {0.000}\) | \(0.876 \pm {0.032}\) | \(0.917 \pm {0.013}\) | \(0.928 \pm {0.010}\) | \(0.928 \pm {0.015}\) | \({\textbf {0.949}} \pm {0.025}\) |
NPV (<60 vs. >60) \(\uparrow\) | \(0.000 \pm {0.000}\) | \(0.889 \pm {0.000}\) | \(0.820 \pm {0.015}\) | \(0.886 \pm {0.019}\) | \(0.919 \pm {0.008}\) | \(0.910 \pm {0.015}\) | \({\textbf {0.939}} \pm {0.010}\) |
NPV (<120 vs. >120) \(\uparrow\) | \(0.000 \pm {0.000}\) | \(0.769 \pm {0.000}\) | \(0.713 \pm {0.027}\) | \(0.794 \pm {0.020}\) | \(0.836 \pm {0.032}\) | \({\textbf {0.839}} \pm {0.013}\) | \(0.815 \pm {0.015}\) |
ECE \(\downarrow\) | \(0.032 \pm {0.000}\) | \(0.111 \pm {0.000}\) | \(0.035 \pm {0.008}\) | \(0.085 \pm {0.020}\) | \(0.074 \pm {0.007}\) | \(0.074 \pm {0.007}\) | \({\textbf {0.024}} \pm {0.009}\) |
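
As a reading aid for the UNOS column: its binary accuracy equals its PPV, and its NPV is 0.000 at every threshold. This is the pattern a classifier produces when it assigns every patient to the positive class. The source does not state that this is what happens, so the sketch below is only an illustrative check under that assumption. `BinaryCounts`, `safe_div`, and `binary_metrics` are hypothetical names, the 627/373 cohort is invented to mirror the 0.627 figure, and reporting an undefined ratio (0/0) as 0.0 is an assumed convention.

```python
from dataclasses import dataclass


@dataclass
class BinaryCounts:
    """Confusion-matrix counts for one binary threshold (hypothetical helper)."""
    tp: int
    fp: int
    tn: int
    fn: int


def safe_div(num: float, den: float) -> float:
    # Assumed convention: an undefined ratio (zero denominator) is reported as 0.0.
    return num / den if den else 0.0


def binary_metrics(c: BinaryCounts) -> dict:
    ppv = safe_div(c.tp, c.tp + c.fp)      # positive predictive value (precision)
    npv = safe_div(c.tn, c.tn + c.fn)      # negative predictive value
    recall = safe_div(c.tp, c.tp + c.fn)   # sensitivity
    accuracy = safe_div(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    f1 = safe_div(2 * ppv * recall, ppv + recall)
    return {"accuracy": accuracy, "ppv": ppv, "npv": npv, "f1": f1}


# Invented cohort: 627 positives, 373 negatives, every case predicted positive.
always_positive = BinaryCounts(tp=627, fp=373, tn=0, fn=0)
metrics = binary_metrics(always_positive)
print({name: round(value, 3) for name, value in metrics.items()})
```

Under these assumptions, accuracy and PPV both equal the positive-class prevalence, NPV falls back to 0.0 because no case is predicted negative, and F1 comes out near 0.771. That is close to the 0.770 reported for UNOS at the 30 threshold, and the small gap is consistent with rounding of the prevalence.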