Fig. 2: Schematic diagram illustrating the two-strategy evaluation framework implemented in our study.

The dataset is first split into an 80% discovery subset and a 20% hold-out test set using stratified random sampling at the patient level, so that all data from a given patient fall within a single split and the splits have comparable distributions. Within the discovery subset, stratified 5-fold cross-validation is used for model development and parameter selection. The hold-out test set is then used for an unbiased evaluation of the final model on previously unseen data.
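
The splitting protocol described above can be illustrated with a minimal sketch. It assumes one class label per patient and uses only the Python standard library; the function names, the toy cohort, and the random seed are hypothetical and are not taken from the study's implementation.

```python
import random
from collections import defaultdict


def stratified_patient_split(patient_labels, test_fraction=0.2, seed=0):
    """Split patient IDs into discovery and hold-out sets, stratified by label.

    patient_labels maps patient ID -> a single label per patient (assumption).
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for pid, label in sorted(patient_labels.items()):
        by_label[label].append(pid)

    discovery, test = [], []
    for label in sorted(by_label):
        pids = by_label[label]
        rng.shuffle(pids)
        # Take the same fraction of patients from every label group.
        n_test = round(len(pids) * test_fraction)
        test.extend(pids[:n_test])
        discovery.extend(pids[n_test:])
    return discovery, test


def stratified_kfold(patient_ids, patient_labels, n_folds=5, seed=0):
    """Yield (train_ids, val_ids) pairs with roughly equal label proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for pid in patient_ids:
        by_label[patient_labels[pid]].append(pid)

    # Deal the shuffled patients of each label round-robin into the folds.
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_label):
        pids = by_label[label]
        rng.shuffle(pids)
        for i, pid in enumerate(pids):
            folds[i % n_folds].append(pid)

    for k in range(n_folds):
        val = folds[k]
        train = [pid for j, fold in enumerate(folds) if j != k for pid in fold]
        yield train, val


# Toy cohort: 100 hypothetical patients, 30% with a positive label.
labels = {f"P{i:03d}": int(i % 10 < 3) for i in range(100)}

# 80/20 discovery / hold-out split at the patient level.
discovery_ids, test_ids = stratified_patient_split(labels)

# 5-fold cross-validation inside the discovery subset only;
# the hold-out patients are never seen during model selection.
cv_folds = list(stratified_kfold(discovery_ids, labels))
```

Because the split is made over patient IDs rather than individual records, no patient can contribute data to both the discovery and hold-out sets. Where scikit-learn is available, `train_test_split` with the `stratify` argument and `StratifiedKFold` applied to patient-level labels would be a more conventional choice, with `StratifiedGroupKFold` as an option when labels are defined per record rather than per patient.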