Table 3 Comparison of the performance of ML models for YOCRC risk stratification in the temporal validation dataset

Model	AUC	Accuracy	Sensitivity (Recall)	Specificity	NPV	Precision (PPV)	F1 score	Brier score
LR	0.799	0.782	0.577	0.790	0.979	0.098	0.167	0.200
RF	0.888	0.779	0.872	0.775	0.994	0.133	0.231	0.163
KNN	0.726	0.648	0.679	0.647	0.981	0.071	0.128	0.244
SVC	0.827	0.757	0.744	0.757	0.987	0.108	0.188	0.192
DT	0.779	0.848	0.705	0.853	0.987	0.159	0.260	0.172
XGBoost	0.892	0.801	0.808	0.801	0.991	0.138	0.236	0.159
AdaBoost	0.887	0.802	0.782	0.803	0.989	0.136	0.231	0.220
Stacking	0.849	0.804	0.821	0.803	0.991	0.141	0.241	0.150

Accuracy \(\left(\frac{TP+TN}{TP+FP+TN+FN}\right)\), Sensitivity (Recall) \(\left(\frac{TP}{TP+FN}\right)\), Specificity \(\left(\frac{TN}{TN+FP}\right)\), Negative predictive value (NPV) \(\left(\frac{TN}{TN+FN}\right)\), Precision or Positive predictive value (PPV) \(\left(\frac{TP}{TP+FP}\right)\), F1 score \(\left(\frac{2\times Precision\times Recall}{Precision+Recall}\right)\), and Brier score \(\left(\frac{1}{n}\sum_{i=1}^{n}(P_{i}-Y_{i})^{2}\right)\).
ML machine learning, YOCRC Young-onset colorectal cancer, AUC area under the curve of ROC, LR logistic regression, RF random forest, KNN k-nearest neighbor, SVC support vector classification, DT decision tree, XGBoost eXtreme Gradient Boosting, AdaBoost Adaptive Boosting, TP true positive, TN true negative, FP false positive, FN false negative, P probability of model prediction, Y actual probability of occurrence (no occurrence recorded as 0), n number of predicted events.
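
The footnote definitions can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `confusion_metrics` and `brier_score` are hypothetical, the counts in the usage note are made up for illustration, and the sketch assumes every denominator is non-zero and that Y is coded 1 for occurrence and 0 for no occurrence.

```python
from typing import Dict, Sequence


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Threshold-based metrics from confusion-matrix counts.

    Assumes all denominators are non-zero (illustrative sketch only).
    """
    precision = tp / (tp + fp)   # PPV
    recall = tp / (tp + fn)      # sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": recall,
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
    }


def brier_score(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """Mean squared difference between predicted probability P and outcome Y.

    Assumes Y is coded 1 (occurrence) or 0 (no occurrence).
    """
    n = len(y_true)
    return sum((p - y) ** 2 for p, y in zip(p_pred, y_true)) / n
```

For example, `confusion_metrics(tp=8, fp=2, tn=85, fn=5)` evaluates all six threshold-based metrics on invented counts, and `brier_score([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.1])` evaluates the Brier score on four invented predictions. Lower Brier scores indicate better-calibrated probabilities, whereas the other metrics depend on the classification threshold.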
