Table 16. Performance comparison with recent SOTA models (NASA + Fire videos datasets).
From: Real time fire and smoke detection using vision transformers and spatiotemporal learning
| Model | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) | AUC-ROC (%) | FPS (GPU) |
|---|---|---|---|---|---|---|
| Proposed hybrid model | 98.8 | 98.6 | 98.3 | 98.4 | 98.9 | 32 |
| YOLOv13 [28] | 97.8 | 97.5 | 97.9 | 97.7 | 98.2 | 28 |
| YOLO-NAS [40] | 98.1 | 98.0 | 97.8 | 97.9 | 98.5 | 29 |
| MobileViT [30] | 96.5 | 96.0 | 96.3 | 96.1 | 97.2 | 35 |
| EfficientViT [25] | 96.8 | 96.4 | 96.7 | 96.5 | 97.4 | 33 |
| FireViTNet [14] | 97.2 | 97.0 | 97.1 | 97.0 | 97.8 | 27 |
| Smoke detection transformer [7] | 97.5 | 97.2 | 97.3 | 97.2 | 98.0 | 26 |
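
The F1-score column can be cross-checked against the precision and recall columns. The sketch below assumes that each reported F1-score is the harmonic mean of the precision and recall in the same row; the table does not say how the authors computed or averaged it, so this is an assumption. The names `ROWS`, `f1_score`, and `max_f1_gap` are illustrative and do not come from the paper.

```python
# Precision, recall, and reported F1-score (all in %) copied from Table 16.
ROWS = {
    "Proposed hybrid model": (98.6, 98.3, 98.4),
    "YOLOv13": (97.5, 97.9, 97.7),
    "YOLO-NAS": (98.0, 97.8, 97.9),
    "MobileViT": (96.0, 96.3, 96.1),
    "EfficientViT": (96.4, 96.7, 96.5),
    "FireViTNet": (97.0, 97.1, 97.0),
    "Smoke detection transformer": (97.2, 97.3, 97.2),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


def max_f1_gap(rows):
    """Largest absolute gap between recomputed and reported F1 (assumed definition)."""
    return max(abs(f1_score(p, r) - f1) for p, r, f1 in rows.values())


if __name__ == "__main__":
    for name, (p, r, reported) in ROWS.items():
        print(f"{name}: recomputed {f1_score(p, r):.2f}, reported {reported}")
    print(f"largest gap: {max_f1_gap(ROWS):.3f} percentage points")
```

Under that assumption, every recomputed value matches the reported column to within 0.1 percentage points, which is what rounding to one decimal place would produce.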