Table 19 Performance analysis under different class distribution strategies.

From: Novel metaheuristic optimized latent diffusion framework for automated oral disease detection in public health screening

| Training strategy | Rare pathology sensitivity (95% CI) | Common pathology sensitivity (95% CI) | Overall accuracy (95% CI) | Precision (95% CI) | Recall (95% CI) | F1-score (95% CI) | AUC (95% CI) | Clinical utility index (mean ± SD, out of 10) | Cohen’s κ | p-value |
|---|---|---|---|---|---|---|---|---|---|---|
| Artificially balanced (original) | 94.3% (92.8–95.8) | 97.8% (97.2–98.4) | 97.3% (96.8–97.8) | 97.1% (96.5–97.7) | 96.5% (95.9–97.1) | 0.968 (0.963–0.973) | 0.993 (0.989–0.997) | 9.1 ± 0.3 | 0.946 | – |
| Natural prevalence weighted | 79.2% (76.8–81.6) | 98.7% (98.3–99.1) | 91.4% (90.7–92.1) | 95.8% (95.2–96.4) | 91.4% (90.7–92.1) | 0.935 (0.928–0.942) | 0.971 (0.966–0.976) | 8.3 ± 0.4 | 0.828 | < 0.001 |
| Hybrid balanced-weighted | 86.7% (84.9–88.5) | 98.3% (97.8–98.8) | 94.9% (94.3–95.5) | 96.4% (95.9–96.9) | 94.9% (94.3–95.5) | 0.956 (0.951–0.961) | 0.984 (0.980–0.988) | 8.8 ± 0.2 | 0.898 | < 0.001 |
| Cost-sensitive learning | 83.4% (81.4–85.4) | 98.1% (97.6–98.6) | 93.2% (92.5–93.9) | 95.9% (95.3–96.5) | 93.2% (92.5–93.9) | 0.945 (0.939–0.951) | 0.978 (0.973–0.983) | 8.6 ± 0.3 | 0.864 | < 0.001 |
| Focal loss optimization | 85.9% (84.0–87.8) | 97.9% (97.4–98.4) | 94.1% (93.4–94.8) | 96.2% (95.6–96.8) | 94.1% (93.4–94.8) | 0.951 (0.945–0.957) | 0.981 (0.976–0.986) | 8.7 ± 0.3 | 0.882 | < 0.001 |
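
As a quick consistency check on the table, the F1-score column can be recomputed from the precision and recall columns. The sketch below assumes F1 is the standard harmonic mean of the reported precision and recall point estimates; the table does not state the averaging scheme (for example macro or micro), so that is an assumption. The names `TABLE_19`, `f1_from_precision_recall`, and `recompute_f1` are illustrative, not taken from the source.

```python
# Precision, recall and reported F1-score per training strategy, copied from
# Table 19 (precision and recall converted from percentages to fractions).
TABLE_19 = {
    "Artificially balanced (original)": (0.971, 0.965, 0.968),
    "Natural prevalence weighted": (0.958, 0.914, 0.935),
    "Hybrid balanced-weighted": (0.964, 0.949, 0.956),
    "Cost-sensitive learning": (0.959, 0.932, 0.945),
    "Focal loss optimization": (0.962, 0.941, 0.951),
}


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recompute_f1(table: dict) -> dict:
    """Map each strategy to its F1 recomputed from precision and recall, rounded to 3 decimals."""
    return {
        name: round(f1_from_precision_recall(p, r), 3)
        for name, (p, r, _reported) in table.items()
    }


if __name__ == "__main__":
    for name, f1 in recompute_f1(TABLE_19).items():
        reported = TABLE_19[name][2]
        # Assumed tolerance: half a unit in the third decimal place.
        status = "matches" if abs(f1 - reported) < 5e-4 else "differs from"
        print(f"{name}: recomputed F1 = {f1:.3f} ({status} reported {reported:.3f})")
```

Under this assumption, all five rows reproduce the reported F1 point estimates to three decimals. The check covers point estimates only; the confidence intervals cannot be recomputed from the table alone.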