Introduction

Peripheral artery disease (PAD) impacts more than 200 million people globally, leading to reduced quality of life, mounting health care costs, amputation, and death1,2,3,4. A distinct subset of PAD is aortoiliac occlusive disease (AIOD), which involves atherosclerosis of the infrarenal aorta and iliac arteries, affecting approximately 14% of the population in general5. Historically, advanced AIOD was treated with open revascularization6. Over the last few decades, endovascular intervention has become an increasingly common less invasive alternative7. Nevertheless, endovascular revascularization of AIOD carries a significant risk of adverse events, with a systematic review of 19 studies demonstrating mortality and complication rates up to 6.7% and 45%, respectively8. Consequently, careful assessment of perioperative risk is recommended by the Global Vascular Guidelines when patients are considered for revascularization9.

There are currently no standardized tools to support prediction of complications following endovascular revascularization for AIOD. The Vascular Quality Initiative (VQI) Cardiac Risk Index (CRI) is limited to open revascularization10. Other tools such as the American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP) online surgical risk calculator11 rely on modeling techniques that require manual input of clinical variables, making them challenging to use in busy medical environments12. Therefore, there is an important need to develop better and more practical risk prediction tools for patients undergoing endovascular aortoiliac revascularization.

Machine learning (ML) is a rapidly evolving technology that enables computers to learn from large amounts of data and make accurate predictions13. Using advanced analytics, ML can learn non-linear, complex relationships between user inputs, such as patient characteristics, and outputs, such as clinical outcomes13. The advantage of newer ML techniques over traditional statistical methods is that they can better model non-linear relationships between covariates and outcomes14, which is common in health care data15. Previously, we built a ML algorithm that can accurately predict 30-day complications following open revascularization for AIOD, which achieved better performance compared to logistic regression16. To complement this model and provide further guidance for the management of patients with advanced AIOD, NSQIP data were used to train ML algorithms that can predict 30-day outcomes following endovascular revascularization for AIOD.

Results

Patients and events

Overall, 7102 patients underwent endovascular revascularization for AIOD in the targeted NSQIP vascular database from 2011 to 2021. The following exclusions were made: intervention performed for aortoiliac aneurysm (n = 162), acute limb ischemia (n = 7), or dissection (n = 17), unreported symptom status (n = 132) or procedure type (n = 58), and concurrent major amputation (n = 7) or surgical bypass (n = 118). Overall, 6601 patients were included. The primary outcome of 30-day major adverse limb event (MALE) or death occurred in 470 (7.1%) individuals. The 30-day secondary endpoints occurred in this distribution: major vascular reintervention (n = 242 [3.7%]), untreated loss of patency (n = 63 [1.0%]), major amputation (n = 113 [1.7%]), death (n = 130 [2.0%]), major adverse cardiovascular event (MACE, n = 314 [4.8%]; composite of myocardial infarction (n = 205), stroke (n = 22), and death (n = 130)), wound complication (n = 236 [3.6%]), bleeding requiring transfusion or secondary procedure (n = 374 [5.7%]), other morbidity (n = 337 [5.1%]; composite of pneumonia (n = 75), unplanned reintubation (n = 67), pulmonary embolism (n = 7), failure to wean from ventilator (n = 47), acute kidney injury (n = 72), urinary tract infection (n = 42), cardiac arrest (n = 40), deep vein thrombosis (n = 30), sepsis (n = 51), septic shock (n = 42), Clostridium difficile infection (n = 14)), non-home discharge (n = 430 [6.5%]), and unplanned readmission (n = 507 [7.7%]). A flowchart summarizing patient selection and outcomes is reported in Fig. 1.

Fig. 1: Flowchart for patient selection and outcomes.
figure 1

Patients undergoing open revascularization were not included in this study. NSQIP National Surgical Quality Improvement Program.

Pre-operative characteristics

In comparison to patients who did not develop 30-day MALE or death, individuals with a primary outcome were older with a greater proportion being Black, residing in nursing homes or other facilities, or transferred from another hospital. A higher percentage of individuals with 30-day MALE or death had congestive heart failure (CHF), insulin-dependent diabetes, end stage renal disease (ESRD) requiring dialysis, and ≥1 high-risk physiologic factor, and were partially or totally dependent functionally. Despite having more cardiovascular comorbidities, a lower proportion of patients with an event received statins or antiplatelets. A greater proportion of patients with 30-day MALE or death had an ankle brachial index (ABI) ≤ 0.39 or no recorded ABI and pedal pulses that were non-palpable, as well as a prior surgical bypass involving the segment being treated. Individuals with a primary outcome were more likely to present with chronic limb threatening ischemia (CLTI), undergo more distal revascularization of the aortoiliac segment (i.e., external iliac rather than common iliac intervention), receive urgent or emergent intervention, and have an American Society of Anesthesiologists (ASA) classification ≥4. Patients who underwent intervention by vascular surgeons were less likely to develop a primary outcome compared to other specialists (Table 1).

Table 1 Pre-operative characteristics of patients undergoing endovascular aortoiliac revascularization with vs. without 30-day major adverse limb event or death

Model performance

Six different ML models were trained and subsequently assessed on testing data for predicting 30-day MALE or death after endovascular revascularization for AIOD. Extreme Gradient Boosting (XGBoost) achieved the top performance with an area under the receiver operating characteristic curve [AUROC] (95% CI) of 0.94 (0.93–0.95) in comparison to random forest [0.92 (0.91–0.93)], radial basis function support vector machine [0.90 (0.89–0.91)], Naïve Bayes [0.85 (0.84–0.87)], multilayer perceptron artificial neural network [0.76 (0.74–0.77)], and logistic regression [0.74 (0.73–0.76)]. XGBoost attained the following secondary performance metrics: accuracy 0.86 (95% CI 0.85–0.88), specificity 0.87, sensitivity 0.86, negative predictive value 0.86, and positive predictive value 0.87 (Table 2).

Table 2 Model performance on testing data for prediction of 30-day major adverse limb event or death following endovascular aortoiliac revascularization using pre-operative features

XGBoost obtained the following AUROC’s (95% CI) for predicting 30-day secondary endpoints: major vascular reintervention [0.86 (0.85–0.87)], untreated loss of patency [0.95 (0.94–0.96)], major amputation [0.97 (0.86–0.98)], death [0.97 (0.96–0.98)], MACE [0.90 (0.89–0.91)], bleeding requiring transfusion or secondary procedure [0.95 (0.94–0.96)], wound complication [0.90 (0.89–0.91)], other morbidity [0.88 (0.87–0.89)], non-home discharge [0.97 (0.96–0.98)], and unplanned readmission [0.88 (0.87–0.89)] (Table 3). Logistic regression had worse performance compared to XGBoost across all primary and secondary outcomes, with AUROC’s ranging from 0.70–0.78.

Table 3 XGBoost performance on testing data for prediction of 30-day primary and secondary outcomes following endovascular aortoiliac revascularization using pre-operative features

Figure 2 illustrates the ROC curve for XGBoost in predicting 30-day MALE or death. The model was well-calibrated, achieving a Brier score of 0.08, which demonstrates excellent agreement between predicted/observed event probabilities (Fig. 3). For XGBoost, the most important predictors of 30-day MALE or death were: (1) CLTI, (2) functional status, (3) ≥ 1 high-risk physiologic factor, (4) pre-operative dialysis, (5) CHF, (6) transferred from another hospital, (7) urgency, (8) diabetes, (9) primary procedure (more distal revascularization of the aortoiliac segment), and (10) absence of pre-operative statin (Fig. 4). Subgroup analysis of feature importance based on symptom status demonstrated that eight of the ten most important features were identical for CLTI patients and individuals who were asymptomatic or claudicants, and the two most important features were functional status and ≥1 high-risk physiologic factor for both subgroups (Supplementary Fig. 1).

Fig. 2: Receiver operating characteristic curve for predicting 30-day major adverse limb event or death following endovascular aortoiliac revascularization using Extreme Gradient Boosting (XGBoost) model.
figure 2

AUROC area under the receiver operating characteristic curve, CI confidence interval.

Fig. 3: Calibration plot with Brier score.
figure 3

Extreme Gradient Boosting (XGBoost) model calibration for predicting 30-day major adverse limb event or death following endovascular aortoiliac revascularization.

Fig. 4: Variable importance scores (gain) for the top 10 predictors of 30-day major adverse limb event or death following endovascular aortoiliac revascularization in the Extreme Gradient Boosting (XGBoost) model.
figure 4

CLTI chronic limb threatening ischemia.

Subgroup analysis

Model performance was excellent across subgroups based on sex, age, race, ethnicity, procedure type, symptom status, urgency, and concurrent infrainguinal endovascular revascularization with a range of AUROC values from 0.92 to 0.95 and no significant differences were found between majority/minority subgroups (Supplementary Figs. 29).

Discussion

Using targeted NSQIP vascular data from 2011 to 2021, which comprised 6601 patients who received endovascular revascularization for AIOD, we built ML algorithms that pre-operatively predict 30-day MALE or death with excellent performance (AUROC 0.94). Our models also predicted 30-day untreated loss of patency, major vascular reintervention, major amputation, death, MACE, wound complication, bleeding, other morbidity, non-home discharge, and unplanned readmission with AUROC’s ranging from 0.86 to 0.97. There were three other notable findings. First, individuals who suffer adverse events following endovascular aortoiliac revascularization are a high-risk cohort with predictive factors pre-operatively. They have more comorbidities and poorer functional status, with a greater proportion having high-risk anatomic and physiologic features. Furthermore, a greater proportion of patients with adverse events had CLTI, underwent more distal revascularization, and required urgent/emergent intervention. Despite these differences, they were less likely to receive optimal medical therapy including antiplatelets and statins. This represents an important opportunity to improve medical management of patients with PAD. Second, we evaluated six different ML models and XGBoost achieved the best predictive performance. Our algorithm was well-calibrated and remained robust across demographic/clinical subgroups. Third, we reported the most important predictors of 30-day MALE/death in our algorithms. These variables can help clinicians understand patient characteristics that lead to specific risk predictions, thereby supporting patient/procedure selection and pre-operative optimization.

Risk prediction tools for patients undergoing endovascular aortoiliac revascularization are limited. Bertges et al.10 used multivariable logistic regression to develop the VQI CRI for predicting myocardial infarction after carotid endarterectomy, lower extremity bypass, and aortic aneurysm repair, achieving an overall AUROC of 0.7510. Notably, the model did not include endovascular aortoiliac revascularization, and therefore, our work fills an important gap in this tool10. Applying ML to a contemporary NSQIP cohort, we achieved an AUROC of 0.94 for predicting MALE or death at 30 days following endovascular revascularization for AIOD. This demonstrates the benefit of leveraging advanced data analytics to build procedure-specific risk prediction models using contemporary datasets. Our model provides accurate risk predictions for a patient population that has often not been included in existing tools10 and complements our previously described ML algorithm for predicting outcomes at 30 days after open revascularization for AIOD16.

Bonde and colleagues (2021) built ML models on a sample of patients in the NSQIP database undergoing more than 2900 different procedures to predict post-operative complications, attaining AUROC’s between 0.85 and 0.8817. Since individuals with PAD are a high-risk population with many vascular comorbidities, the ability for general risk prediction tools to perform well on this cohort could be limited18. By building ML models tailored to individuals receiving endovascular revascularization for AIOD, we attained AUROC’s above 0.90. Our algorithms can also predict clinically relevant limb-related outcomes such as major vascular reintervention and major amputation, which are of importance to interventionalists and vascular surgeons. Consequently, developing ML algorithms specific to endovascular aortoiliac revascularization can increase performance and applicability.

Our findings can be interpreted in several ways. First, individuals who suffer from adverse events after endovascular revascularization for AIOD are a high-risk group with many cardiovascular risk factors19. Multiple societal guidelines recommend statins and antiplatelets for all PAD patients9,20,21,22, yet individuals who had complications in our study were less likely to be taking these medications. The BEST-CLI trial corroborates the fact that the rates of best medical therapy are suboptimal in PAD patients23. Consequently, there remain critical opportunities to improve outcomes for PAD patients by accurately determining their perioperative risk and optimizing them medically before intervention. In our ML algorithm, we found that clinical presentation with CLTI was the most important predictor of 30-day MALE or death, which corroborates existing literature given that these patients generally have more extensive lesions and severe ischemia, increasing the risk of limb loss and adverse events24. Poor pre-operative functional status was also a strong predictive feature for worse outcomes, as it correlates well with the overall health status of the patient and their ability to tolerate physiologically demanding procedures25. Other important predictors included high-risk physiological factors, pre-operative dialysis, pre-operative CHF, among others, highlighting that the overall clinical status of the patient and their pre-existing conditions are important predictors of their post-operative outcomes26. Second, the ML models achieved better performance than existing tools for various reasons. In comparison to logistic regression, newer ML technology can better learn non-linear and complex relationships between covariates and outcomes27,28. This is particularly important in health care data because many demographic/clinical factors can influence patient outcomes29. XGBoost was the top-performing algorithm, which has important advantages over other ML approaches including relatively fewer issues with overfitting in addition to quicker computing while retaining precision30,31,32. Additionally, XGBoost performs well on structured data, which could be the reason it outperformed more complicated algorithms such as neural networks33. These findings are corroborated by previous work demonstrating that ML algorithms can achieve better discrimination than conventional statistical models34,35. Additionally, we used methodology that strengthens model validity including calibration measurement, adherence to reporting standards (TRIPOD + AI statement), and collaboration between computer scientists, biostatisticians, and clinicians34,35. Third, the performance of our algorithms remained excellent across demographic/clinical subpopulations. This is notable given that algorithm bias against underrepresented populations remains an important concern within ML research36. These biases were likely avoided because ACS NSQIP is a multi-national database that captures diverse patient populations with rich sociodemographic information37,38. Fourth, a small proportion of patients in our cohort underwent revascularization for asymptomatic disease (~5%). The reasons for these interventions are unclear from our dataset but may be related to treatment of hemodynamically significant stenoses of previous revascularization procedures, patient preference, poor adherence to guideline-directed revascularization, or coding errors39.

Clinical decision-making can be supported by our ML algorithms in several ways. At the pre-operative stage, individuals predicted to be at elevated risk of adverse events could be assessed further in terms of both non-modifiable and modifiable factors26. Individuals with risks that are non-modifiable could be considered for alternative options including medical management alone, amputation, or close surveillance40,41,42. Those with modifiable risks could benefit from further evaluation and optimization including referrals to anesthesiologists, cardiologists, and/or internal medicine specialists when appropriate43,44. Conversely, low risk patients could be considered for open revascularization, which may be more durable45. Post-operatively, high risk patients could receive increased monitoring on the ward or intensive care unit46. Individuals at elevated risk for unplanned readmission or non-home discharge could receive increased support from allied health to ensure safe discharge planning47. These clinical decisions supported by the ML risk prediction tool may improve outcomes by mitigating complications related to endovascular revascularization for AIOD.

The ML algorithm programming code is openly available on GitHub. This tool may be utilized by clinicians caring for patients under consideration for endovascular revascularization for AIOD. The algorithm could be implemented at the >700 NSQIP-participating centers globally. There is also potential to implement the model at non-NSQIP sites because the input variables are commonly captured for the routine care of vascular patients48. Given the challenges of deploying prediction models into clinical practice, consideration of implementation science principles is critical49. Our ML models have the advantage of providing automated risk predictions, thereby improving practicality in busy clinical settings compared with traditional risk predictors that generally require manual input of variables11. Specifically, our algorithm can autonomously extract a patient’s NSQIP data to make risk predictions. The clinical applicability of our risk prediction tool stems from both its automated nature and highly accurate predictions that can guide vascular specialists in terms of patient/procedure selection, counseling, and peri-operative management to improve care for patients with AIOD being considered for endovascular revascularization. We advocate for dedicated health care data analytics teams at the institution level, as their benefits to patient care have been previously demonstrated and model implementation can be facilitated by these experts50.

Our study has various limitations. First, the algorithms were built using data from NSQIP. Additional studies are needed to evaluate whether generalization of predictive performance can be made to non-NSQIP centers. Second, our dataset captured 30-day endpoints. Assessment of ML algorithms on datasets with longer follow-up periods could support prediction of long-term risk. Third, several variables that may contribute to risk predictions were not captured in our dataset, including lesion characteristics and use of low-dose rivaroxaban, drug eluting technology, post-procedural antiplatelet therapy, intravascular lithotripsy, covered stents, and intravascular ultrasound. Future predictive models developed on datasets that record these variables could improve accuracy. Additionally, some data points were unknown, not reported, or not documented; further refinement of datasets with more complete clinical information may improve model performance. Fourth, the sample size was lower than expected over a 10-year period likely because ACS NSQIP is primarily a surgical database, and procedures performed by interventional radiologists or other non-surgical specialists may be under-captured. Additional investigation of endovascular aortoiliac revascularization performed by non-surgical specialists may increase the sample size for analysis.

Using the NSQIP targeted vascular database, we built ML algorithms that predict 30-day MALE/death and other clinically relevant outcomes following endovascular revascularization for AIOD with excellent performance using pre-operative data. Given that our ML algorithms performed better than existing tools and logistic regression, they have potential for important utility in the peri-operative management of patients being considered for endovascular aortoiliac revascularization to mitigate adverse outcomes. Prospective validation of our prediction models is warranted.

Methods

Design

This was an ML-based prognostic study and findings were reported based on the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis + Artificial Intelligence (TRIPOD + AI) statement51. The methods were based on our previous work to demonstrate the robustness and reproducibility of our ML model development and evaluation process16.

Dataset

The ACS NSQIP database contains data on patients from >700 hospitals in ~15 countries worldwide52. Trained clinical reviewers prospectively collect data from electronic health records and ACS regularly audits the data for accuracy53. Targeted vascular registries in NSQIP were established in 2011 and comprise additional variables and outcomes specific to vascular procedures54. Research ethics board review was not required as the data source was a deidentified registry.

Cohort

All individuals in the NSQIP targeted vascular database who received endovascular revascularization for AIOD (angioplasty or stent of the aorta or iliac arteries) for PAD between 2011 and 2021 were included. Individuals who underwent the procedure for aortoiliac aneurysm, dissection, acute limb ischemia, malignancy, or trauma, patients with unreported symptom status or procedure type, and those who received concurrent major amputation or bypass were excluded.

Features

The input features for the ML models included 37 pre-operative variables. To maximize model performance, all pre-operative NSQIP variables were used, as ML methods excel at handling many inputs. Feature selection was not performed as it worsened model performance, and given the goal of automated predictions, reducing the number of features would decrease predictive performance without a significant change to model run time. Demographic features included age, body mass index, sex, race, and ethnicity. Comorbidities consisted of diabetes, hypertension, smoking status, chronic obstructive pulmonary disease, CHF, ESRD on dialysis, functional status, and high-risk physiologic factor [defined as ≥1 of: (1) ESRD, (2) age above 80, (3) CHF class III or IV (New York Heart Association), (4) ejection fraction below 30%, (5) unstable angina <30 days before intervention, or (6) myocardial infarction <30 days before intervention]. Medications consisted of statins, antiplatelets, and beta blockers. Labs included serum creatinine, blood urea nitrogen, serum sodium, albumin, hematocrit, white blood cell count, platelet count, partial thromboplastin time, and international normalized ratio. Limb hemodynamics based on ABI and toe pressure, and high-risk anatomic factors (defined as a prior bypass or endovascular intervention involving the currently treated segment), were noted. Concurrent procedures documented were minor amputation (below the ankle) and endovascular infrainguinal revascularization. Other characteristics were symptom status [CLTI (defined as tissue loss or rest pain), claudication, or asymptomatic], procedure type, urgency of intervention (elective, urgent, or emergent), ASA class, and specialty of the primary operator. Urgent and emergent interventions, generally performed within 24–48 h and <24 h of admission, respectively, were mostly performed for patients with severe ischemia resulting in CLTI. Isolated internal iliac artery interventions were included and generally performed on patients with lifestyle-limiting buttock claudication, rest pain, and/or tissue loss. Supplementary Table 1 provides a complete list of input features and definitions.

Outcomes

The primary endpoint was 30-day post-procedural MALE (defined as a composite of major vascular reintervention, untreated loss of patency, or major amputation) or death. The definition of major vascular reintervention was thrombectomy/thrombolysis involving the treated segment or a new or revision aortoiliac bypass or interposition graft. The definition of untreated loss of patency was an occlusion of the treated segment with no subsequent revascularization. The definition of major amputation was a transtibial or more proximal amputation of the ipsilateral leg. The definition of death was all-cause mortality. These outcomes and definitions align with the Society for Vascular Surgery reporting standards55.

The 30-day secondary outcomes consisted of individual components of MALE or death, MACE, bleeding requiring transfusion or secondary procedure, wound complication, other morbidity, non-home discharge, and unplanned readmission. MACE was a composite endpoint of myocardial infarction (ischemic changes on electrocardiogram, elevated troponin, or diagnosis by a physician or advanced provider), stroke (neurologic dysfunction exceeding 24 h in the setting of suspected stroke), or death. The definition of wound complication was a non-healing wound at the incision (if present), cellulitis, or dehiscence. Other morbidity was a composite endpoint of unplanned reintubation, failure to wean from the ventilator (cumulative time of ventilator-assisted respirations over 48 h), pneumonia, pulmonary embolism, acute kidney injury (rise in serum creatinine over 2 mg/dL from the pre-operative value or requirement of dialysis in a patient who was not on dialysis pre-operatively), cardiac arrest, urinary tract infection, Clostridium difficile infection, deep vein thrombosis requiring therapy, sepsis, or septic shock. The definition of non-home discharge was discharge to skilled care, rehabilitation, or other facility. These outcomes are defined by the ACS NSQIP data dictionary56.

Model development

We trained 6 ML models to predict 30-day primary and secondary outcomes: Naïve Bayes classifier, XGBoost, radial basis function support vector machine, random forest, multilayer perceptron artificial neural network, and logistic regression. We selected these models because of their demonstrated efficacy in predicting postoperative outcomes57,58,59. We used logistic regression for baseline comparison because traditional risk predictors most commonly use this modeling technique60.

We randomly split our data into two subsets: 70% training and 30% testing61. Testing data was reserved for model evaluation and not used for model training. We performed 10-fold cross-validation and grid search on training data to identify hyperparameters that were optimized for each ML model, including logistic regression62,63. The multivariable logistic regression model was developed and optimized with the same rigor as the advanced ML models with 10-fold cross-validation and grid search62,63. Preliminary analysis showed that the primary endpoint was rare, affecting 470/6601 (7.1%) of patients. To improve class balance, Random Over-Sample Examples (ROSE) was applied to training data64. ROSE uses smoothed bootstrapping to draw new samples from the feature space neighborhood around the minority class and is a commonly used method to facilitate predictive modeling of rare events64. The models were then assessed on test set data and ranked based on the primary discriminatory metric of AUROC. Our best performing model was XGBoost, which had the following optimized hyperparameters for our dataset: number of rounds = 200, maximum tree depth = 3, learning rate = 0.3, gamma = 0, column sample by tree = 0.6, minimum child weight = 1, and subsample = 0.9. The process for selecting these hyperparameters is described in Supplementary Table 2. Once the top-performing ML algorithm for the primary endpoint was identified, the same model was further trained to predict secondary endpoints.

Statistical analysis

Baseline features were reported as means and standard deviations or numbers and proportions. Differences between patients with vs. without 30-day MALE or death were evaluated with independent t-test (continuous variables) or chi-square test (categorical variables). All continuous data were normally distributed. To account for multiple comparisons, Bonferroni correction was used to set statistical significance. AUROC was the primary model evaluation metric as it considers both sensitivity and specificity65. Secondary model evaluation metrics were specificity, sensitivity, negative predictive value, positive predictive value, and accuracy. Model robustness was evaluated using a calibration curve and Brier score, which measures the agreement between predicted/observed event probabilities66. Feature importance was assessed using the variable importance score (gain), which measures the influence of each covariate in contributing to a prediction67. We reported feature importance for 3 groups: the overall cohort, asymptomatic/claudication patients, and individuals with CLTI. Model performance was assessed across subgroups based on sex, age, race, ethnicity, procedure type, symptom status, urgency, and concurrent infrainguinal endovascular revascularization.

Using a sample size calculator validated for prediction models, we determined that to achieve an AUROC over 0.8 with an outcome rate of approximately 7% and 37 input variables, 3815 patients with 268 events was the minimum cohort size required68,69. We included 6601 patients who had 470 primary events, which satisfied this condition. There was < 5% missing data for variables of interest; therefore, available case analysis was applied whereby only non-missing covariates for each patient were considered70,71. This is a valid method for analyzing datasets with minimal missing data70,71. Data were missing completely at random, and imputation was not performed to reflect modeling of real-world data, which generally has missing information that is not imputed72. All subjects included in the study (n = 6601) were used in all analyses. R version 4.3.073 was used for all analyses with the following packages: caret74, xgboost75, ranger76, naivebayes77, e107178, nnet79, and pROC80.