Fig. 3: Ten-fold cross-validation design and model performance evaluation.

a Schematic of the 10-fold strategy. For each fold, one subset was designated as the internal validation set and the remaining nine subsets formed the training set, while the independent validation cohort was kept locked for final external testing. Model training proceeded through three sequential fine-tuning phases with selective freezing of blocks A and B. b ROC curves for the internal 10-fold cross-validation folds. c Bar plot of classification accuracy in internal validation: overall accuracy, accuracy for non-recurrence cases, and accuracy for recurrence cases across folds. d ROC curves for the external validation cohort across all 10 folds. e Classification accuracy in external validation, including overall, non-recurrence, and recurrence-specific accuracy per fold. f Macro and weighted evaluation metrics (precision, recall, F1-score) computed on the external validation set across folds. g PR curves for external validation. PR-AUC is reported for each fold, quantifying the model's performance under class imbalance.
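The split logic described in panel a can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function name, seed, and sample count are assumptions, and the external cohort is deliberately excluded from the indexing, mirroring its locked status.

```python
import random

def ten_fold_splits(n_samples, n_folds=10, seed=0):
    """Shuffle sample indices and partition them into n_folds subsets.

    For each fold, one subset serves as the internal validation set and
    the remaining nine are pooled into the training set. The external
    validation cohort is not indexed here at all: it stays locked until
    final external testing.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        val = folds[k]                                   # internal validation subset
        train = [i for j, fold in enumerate(folds)       # remaining nine subsets
                 if j != k for i in fold]
        yield train, val

# Example: 100 internal samples give 10 folds of 10 validation cases each.
splits = list(ten_fold_splits(100))
```

Each of the 10 train/validation pairs would then drive one pass of the three-phase fine-tuning procedure, producing the per-fold curves shown in panels b through g.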