Table 11 Model performance and complexity comparison.

From: Real-time driver drowsiness detection using transformer architectures: a novel deep learning approach

| Model | Accuracy (%) | Parameters | Average inference time (ms/frame) | Average FPS |
|---|---|---|---|---|
| ViT transformer | 99.15 | 85,800,194 | 1021.83 | 0.98 |
| Swin transformer | 99.03 | 27,914,108 | 472.8 | 2.12 |
| InceptionV3 | 98.5 | 23,867,553 | 136.33 | 7.35 |
| InceptionResNetV2 | 98.5 | 55,851,105 | 122.8 | 8.14 |
| MobileNet | 98 | 3,208,001 | 55 | 18.18 |
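
The two timing columns in Table 11 appear to be related by the usual conversion between per-frame latency and throughput, FPS ≈ 1000 / (ms per frame). The following is a minimal sketch of that check, not the authors' measurement code: the relation itself is an assumption (the source does not state how the average FPS was computed), and the names `TABLE_11`, `fps_from_latency`, and `derived_fps` are hypothetical helpers introduced only for illustration. The numbers are copied from the table.

```python
# Values copied from Table 11:
# model -> (average inference time in ms/frame, reported average FPS)
TABLE_11 = {
    "ViT transformer": (1021.83, 0.98),
    "Swin transformer": (472.8, 2.12),
    "InceptionV3": (136.33, 7.35),
    "InceptionResNetV2": (122.8, 8.14),
    "MobileNet": (55.0, 18.18),
}


def fps_from_latency(ms_per_frame: float) -> float:
    """Assumed relation: frames per second = 1000 ms divided by latency per frame."""
    return 1000.0 / ms_per_frame


def derived_fps() -> dict:
    """Recompute FPS for every model from its latency, rounded to two decimals."""
    return {name: round(fps_from_latency(ms), 2) for name, (ms, _) in TABLE_11.items()}


if __name__ == "__main__":
    # Print the derived throughput next to the value reported in the table.
    for name, (ms, reported) in TABLE_11.items():
        derived = fps_from_latency(ms)
        print(f"{name:18s} {ms:8.2f} ms/frame -> {derived:6.2f} FPS (reported {reported})")
```

Under this assumption the derived values agree with the reported column to within about 0.02 FPS. The largest gap is InceptionV3 (roughly 7.34 derived against 7.35 reported), which could reflect rounding or averaging FPS per frame rather than inverting the mean latency.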