Figure 7

Detailed modeling procedure. (A) After the data are split into training and test sets, the training data (Jan 2016–Dec 2018) are further partitioned into two subsets. (B) Candidate models (LR, RF, DT and XGB) are trained over a specified hyper-parameter grid on data from Jan 2016–Dec 2017 and evaluated by AUC-PRC on the validation partition; the parameter set producing the best AUC-PRC score is selected. (C) 1000 bootstrapped sub-samples are generated. (D-1 and D-2) 1000 individual LR, RF, DT and XGB models are trained, along with the Encounter-based Forest-like (EF) ensemble model. (E) The EF ensemble and the selected LR, RF, DT and XGB models are evaluated on 1000 bootstrapped samples drawn with replacement from the test set. (F) Thresholds for Risk index 1, 2 and 3 for the selected EF model are computed from the calibration curve.
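
The bootstrap evaluation in step (E) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `average_precision` and `bootstrap_metric`, the toy labels and scores, and the 95% percentile interval are all assumptions made for illustration. Average precision is used here as one common step-wise estimate of AUC-PRC; the caption does not say which estimator was used.

```python
import random


def average_precision(y_true, y_score):
    """Step-wise estimate of the area under the precision-recall curve."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("sample contains no positive labels")
    # Rank encounters from highest to lowest predicted risk
    # (ties keep their original order; a simplification).
    order = sorted(range(len(y_true)), key=lambda i: y_score[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at each newly recalled positive
    return total / n_pos


def bootstrap_metric(y_true, y_score, n_boot=1000, seed=0):
    """Mean and 95% percentile interval of the metric over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    while len(scores) < n_boot:
        # Draw n test-set indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if sum(yt) == 0:
            continue  # metric undefined without positives; redraw (assumption)
        ys = [y_score[i] for i in idx]
        scores.append(average_precision(yt, ys))
    scores.sort()
    mean = sum(scores) / n_boot
    lower = scores[n_boot * 25 // 1000]
    upper = scores[n_boot * 975 // 1000 - 1]
    return mean, lower, upper


# Toy test set (hypothetical labels and predicted risks).
y_test = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
p_test = [0.9, 0.8, 0.7, 0.65, 0.5, 0.4, 0.35, 0.3, 0.2, 0.1]
mean_ap, ci_low, ci_high = bootstrap_metric(y_test, p_test, n_boot=1000, seed=42)
print(f"AUC-PRC (bootstrap mean): {mean_ap:.3f} [{ci_low:.3f}, {ci_high:.3f}]")
```

Applying the same resampled indices to every candidate model would allow paired comparison of the EF ensemble against LR, RF, DT and XGB, although the caption does not state whether the evaluation was paired.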