Table 1 Summary of different methods derived from up-to-date literature.

From: Safety helmet detection methods in heavy machinery factory

| Ref. | Datasets | Method used | Results | Originality | Limitations |
|------|----------|-------------|---------|-------------|-------------|
| 11 | SHWD | Improved YOLOv4 | Accuracy: 92.98%; model size: 41.88 M; detection speed: 43 frames/s | Introduces depthwise-separable convolution to reduce model parameters; embeds a coordinate attention module to enhance feature information; designs a PB module to fuse target information; replaces the CIoU loss with the SIoU loss to improve detection accuracy and speed while shrinking the model (see the convolution and loss-function sketches after the table) | Small targets at long range may still be missed |
| 12 | CUMT-HelmeT | Improved YOLOv5s | Average precision: 87.5% | Fuses the CBAM attention mechanism into YOLOv5s to improve accuracy (see the CBAM sketch after the table); designs a P2 small-target detection layer to enlarge the model's multi-scale receptive field; replaces the CIoU loss with the EIoU loss to improve regression-box accuracy; replaces the ordinary Conv blocks in the backbone with ShuffleNetV2 to obtain a lightweight model | Missed and false detections still occur |
| 13 | KITTI | Improved YOLOv5 | Average recognition accuracy: 95.2% | Adds a 160 × 160 small-target detection head to improve shallow-feature retention; introduces deformable convolution v2 (DCNv2) to improve learning of small moving targets; adds a context augmentation module (CAM) to improve detection of distant small targets; replaces the loss function with EIoU to improve bounding-box localization; adopts the SPPCSPC_group module to improve multi-scale feature fusion | Requires more hardware processing power and memory |
| 14 | TUM | Improved ORB-SLAM with YOLO | Target detection accuracy: 99.3% | Combines the optical-flow method with geometric constraints for secondary judgment and uses static feature points for position estimation; applies a target-tracking algorithm for inter-frame detection correction to handle image blurring | Weak robustness and insufficient computational resources in highly dynamic, complex scenarios |
| 15 | Proprietary dataset | Improved YOLOv5 | Precision: 98.88%; recall: 94.82%; mAP: 98.13% | Adds the CBAM attention module to the feature-extraction layers of the convolutional network to enhance important features and suppress useless ones; adds CARAFE to the feature-fusion layer to dynamically generate adaptive upsampling kernels (see the CARAFE sketch after the table); improves the precision and accuracy of machine-tool recognition | Weak adaptability to complex environments and low computational-resource efficiency |
| 16 | VisDrone, VEDAI | AMEA-YOLO | mAP: 43.4% / 66.2% (two reported settings); parameters: 10.4 M; GFLOPs: 23.7 | Designs the lightweight Ghostone network as the backbone and combines it with FasterNet to accelerate training; uses the enhanced second-order channel attention module (EnhancedSOCA) to better exploit high-resolution image information; designs the GC3 module by introducing the SimAM attention mechanism to further lighten the model; adopts the HardSwish activation function | Weak generalization capacity |
| 18 | Proprietary dataset | Improved YOLOv5 | PSNR: 29.420; SSIM: 0.855; average precision (AP): 79.1% | Designs a double-residual-channel super-resolution (SR) reconstruction module to improve image resolution; proposes a new CSP module for YOLOv5 to reduce information loss and gradient confusion; constructs an end-to-end safety-helmet detection model based on the SR reconstruction network and YOLOv5 | Weak generality and real-time performance |
| 19 | SHEL5K | YOLOv7-WFD | mAP: 92.6%; FPS: 79 | Proposes a DBS module composed of deformable convolution, a batch-normalization layer, and the SiLU activation function to enhance feature extraction; introduces the CARAFE module for feature upsampling to improve reconstruction of detail and structural information; uses the Wise-IoU loss for localization to enhance generalization | Limited computational efficiency and hardware compatibility; dataset balance needs further optimization and the range of application scenarios needs to be expanded |
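Several of the surveyed models lean on a small set of recurring building blocks. Ref. 11, for instance, swaps standard convolutions for depthwise-separable ones to cut parameters. Below is a minimal PyTorch sketch of that substitution; it illustrates the general technique rather than the cited paper's exact code, and the channel sizes (128 → 256) are arbitrary.

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """A depthwise conv (one 3x3 filter per channel) followed by a 1x1
    pointwise conv; together they approximate a standard conv with far
    fewer parameters."""
    def __init__(self, in_ch, out_ch, k=3, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, stride,
                                   padding=k // 2, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Parameter comparison against a standard 3x3 convolution:
std = nn.Conv2d(128, 256, 3, padding=1, bias=False)
dsc = DepthwiseSeparableConv(128, 256)
print(sum(p.numel() for p in std.parameters()))  # 294912
print(sum(p.numel() for p in dsc.parameters()))  # 34432, roughly 8.6x fewer
```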
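Refs. 12 and 15 both insert the CBAM attention module into YOLOv5's feature extractor. The sketch below follows the published CBAM formulation (channel attention, then spatial attention); the reduction ratio of 16 and the 7 × 7 spatial kernel are CBAM's common defaults and may differ from the values the surveyed works actually used.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Re-weights channels using a shared MLP over global average- and
    max-pooled descriptors."""
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(ch, ch // reduction), nn.ReLU(),
                                 nn.Linear(ch // reduction, ch))

    def forward(self, x):
        b, c, _, _ = x.shape
        w = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) +
                          self.mlp(x.amax(dim=(2, 3))))
        return x * w.view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    """Re-weights spatial positions from channel-wise mean and max maps."""
    def __init__(self, k=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, k, padding=k // 2)

    def forward(self, x):
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], 1)
        return x * torch.sigmoid(self.conv(s))

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel, then spatial attention."""
    def __init__(self, ch):
        super().__init__()
        self.ca, self.sa = ChannelAttention(ch), SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))
```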
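Refs. 11 to 13 replace YOLO's default CIoU localization loss with SIoU or EIoU, and ref. 19 uses Wise-IoU; all are variants of the same IoU-plus-penalty pattern. The function below sketches CIoU and EIoU for a single pair of (x1, y1, x2, y2) boxes using the standard published definitions; production implementations are vectorized and detach the CIoU alpha term, which this sketch omits for brevity.

```python
import math
import torch

def ciou_eiou_loss(box1, box2, eps=1e-7):
    """Return (1 - CIoU, 1 - EIoU) for two boxes in (x1, y1, x2, y2) form."""
    # Intersection and union
    iw = (torch.min(box1[2], box2[2]) - torch.max(box1[0], box2[0])).clamp(0)
    ih = (torch.min(box1[3], box2[3]) - torch.max(box1[1], box2[1])).clamp(0)
    inter = iw * ih
    w1, h1 = box1[2] - box1[0], box1[3] - box1[1]
    w2, h2 = box2[2] - box2[0], box2[3] - box2[1]
    iou = inter / (w1 * h1 + w2 * h2 - inter + eps)
    # Smallest enclosing box: side lengths cw, ch; squared diagonal c2
    cw = torch.max(box1[2], box2[2]) - torch.min(box1[0], box2[0])
    ch = torch.max(box1[3], box2[3]) - torch.min(box1[1], box2[1])
    c2 = cw ** 2 + ch ** 2 + eps
    # Squared distance between box centres
    rho2 = ((box1[0] + box1[2] - box2[0] - box2[2]) ** 2 +
            (box1[1] + box1[3] - box2[1] - box2[3]) ** 2) / 4
    # CIoU: centre-distance penalty plus an aspect-ratio consistency term
    v = (4 / math.pi ** 2) * (torch.atan(w2 / h2) - torch.atan(w1 / h1)) ** 2
    alpha = v / (1 - iou + v + eps)
    ciou = iou - rho2 / c2 - alpha * v
    # EIoU: penalises width and height differences directly instead
    eiou = (iou - rho2 / c2
            - (w1 - w2) ** 2 / (cw ** 2 + eps)
            - (h1 - h2) ** 2 / (ch ** 2 + eps))
    return 1 - ciou, 1 - eiou
```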
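Finally, refs. 15 and 19 both adopt CARAFE in place of plain upsampling. Below is a simplified sketch of its two stages, kernel prediction and reassembly; the compressed-channel width (64) and kernel sizes follow the CARAFE paper's defaults, and this version skips the memory optimizations of the official implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CARAFE(nn.Module):
    """Content-aware upsampling: predict a softmax-normalised k x k
    reassembly kernel per output pixel, then blend the matching
    low-resolution neighbourhood with it."""
    def __init__(self, ch, scale=2, k_up=5, k_enc=3, ch_mid=64):
        super().__init__()
        self.scale, self.k_up = scale, k_up
        self.compress = nn.Conv2d(ch, ch_mid, 1)
        self.encode = nn.Conv2d(ch_mid, (scale * k_up) ** 2, k_enc,
                                padding=k_enc // 2)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):
        b, c, h, w = x.shape
        s, k = self.scale, self.k_up
        # Kernel prediction: B x (s*k)^2 x H x W -> B x k^2 x sH x sW
        kern = F.softmax(self.shuffle(self.encode(self.compress(x))), dim=1)
        # Reassembly: every output pixel re-weights the k x k input
        # neighbourhood around its source location
        patches = F.unfold(x, k, padding=k // 2).view(b, c * k * k, h, w)
        patches = F.interpolate(patches, scale_factor=s, mode='nearest')
        patches = patches.view(b, c, k * k, h * s, w * s)
        return (patches * kern.unsqueeze(1)).sum(dim=2)

# Example: upsample a 256-channel feature map from 40x40 to 80x80.
y = CARAFE(256)(torch.randn(1, 256, 40, 40))  # -> (1, 256, 80, 80)
```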