Table 1. Summary of existing literature on non-PPE detection.
| Reference | Target PPE | Dataset | Method | Potential limitation compared with the proposed method |
|---|---|---|---|---|
| Fang et al.13 | Helmet | Custom dataset (not publicly available) | Faster R-CNN | Lower inference speed and detection accuracy |
| Gu et al.14 | Helmet | Custom dataset (not publicly available) | Faster R-CNN with multi-scale training | Lower generalization and limited small-object detection |
| Yang et al.15 | Helmet | Custom dataset (not publicly available) | YOLOv3 with DarkNet53 | Inferior accuracy and weaker performance on small objects |
| Yan and Wang16 | Helmet | Custom dataset (not publicly available) | YOLOv3 with DarkNet53 | Less robust detection performance and slower inference |
| Shen et al.17 | Helmet | Custom dataset (not publicly available) | VGG16-based face detector (face); DenseNet-based classifier (helmet) | Two-stage method with slower inference and limited scalability |
| Nath18 | Helmet and vest | Custom dataset (not publicly available) | YOLOv3 (worker detection); VGG16, ResNet, and Xception (helmet and vest classifiers) | Two-stage classification with lower real-time detection capability |
| Wang et al.19 | Helmet and vest | Custom dataset (not publicly available) | YOLOv5x and YOLOv5s | Lower AP and reduced accuracy on challenging small-scale objects |
| Lee et al.20 | Helmet and vest | Custom dataset (not publicly available) | Mask R-CNN with MobileNetV3 | Slower inference speed due to complex instance segmentation |
| Nguyen et al.21 | Helmet, mask, glove, vest, shoes | Custom dataset (not publicly available) | SHO-based YOLOv5 | Moderate detection accuracy and limited robustness to object scale |
| Park et al.22 | Helmet, mask, glove, vest, shoes | Custom dataset (not publicly available) | YOLOv8, Swin Transformer, Axial Transformer | Slightly slower inference speed and lower overall accuracy |