Table 6. Summary of inference speed (FPS) by architecture.

From: Automated non-PPE detection on construction sites using YOLOv10 and transformer architectures for surveillance and body worn cameras with benchmark datasets

| Architecture | Mean | Min | 25% | 50% | 75% | Max |
|---|---|---|---|---|---|---|
| ViT | 34.26 | 34.21 | 34.37 | 34.47 | 34.82 | 35.23 |
| Swin Transformer | 35.32 | 34.78 | 34.56 | 35.26 | 35.67 | 36.35 |
| PVT | 36.54 | 34.16 | 35.43 | 35.55 | 36.74 | 37.13 |