Table 9 Comparison against baselines.
From: RABEM: risk-adaptive Bayesian ensemble model for fraud detection
| Fusion method | Accuracy | F1 score | Precision | Recall | Calibration (ECE, lower is better) | Notes |
|---|---|---|---|---|---|---|
| Bayesian reliability fusion | 99.38% | 0.92 | 89.66% | 99.41% | 0.024 | Strong calibration, best recall |
| Soft voting (mean probabilities) | 98.91% | 0.88 | 87.12% | 95.10% | 0.072 | Good baseline, less calibrated |
| Bagging (random forest) | 98.24% | 0.81 | 85.11% | 90.45% | 0.090 | Lacks probabilistic calibration |
| Stacking (logistic meta) | 98.67% | 0.86 | 87.54% | 92.44% | 0.063 | Better than bagging, still lower recall |
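To make the soft-voting baseline and the ECE column concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: `soft_vote` simply averages each member's positive-class probabilities (the "mean probabilities" baseline in the table), and `expected_calibration_error` is the standard binary reliability-diagram variant of ECE, with an assumed default of 10 equal-width bins.

```python
def soft_vote(prob_lists):
    """Soft-voting baseline: average the positive-class probability
    assigned by each ensemble member, sample by sample."""
    return [sum(ps) / len(ps) for ps in zip(*prob_lists)]

def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE: bin predictions by predicted probability, then take
    the size-weighted mean |observed positive rate - mean probability|."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # Clamp p == 1.0 into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        conf = sum(p for p, _ in b) / len(b)  # mean predicted probability
        acc = sum(y for _, y in b) / len(b)   # observed positive rate
        ece += (len(b) / n) * abs(acc - conf)
    return ece

# Illustrative usage with made-up member probabilities and labels:
fused = soft_vote([[0.9, 0.1, 0.8], [0.8, 0.2, 0.7]])  # [0.85, 0.15, 0.75]
ece = expected_calibration_error(fused, [1, 0, 1])
```

A perfectly calibrated model has ECE near 0, which is why the 0.024 of the Bayesian reliability fusion row indicates better calibration than the 0.063–0.090 of the other fusion methods.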