Table 1. Comparison of HRTNet with other object detectors on the COCO 2017 val set.
| Model | Epochs | Params (M) | GFLOPs | \({\text{FPS}}_{bs=1}\) | \({\text{AP}}^{val}\) | \({\text{AP}}_{50}^{val}\) | \({\text{AP}}_{75}^{val}\) | \({\text{AP}}_{S}^{val}\) | \({\text{AP}}_{M}^{val}\) | \({\text{AP}}_{L}^{val}\) |
|---|---|---|---|---|---|---|---|---|---|---|
| CNN-based object detector | | | | | | | | | | |
| YOLOv5-s [22] | 300 | 7.2 | 16.5 | 376 | 37.4 | 56.8 | – | – | – | – |
| YOLOv6-v3.0-s [23] | 300 | 18.5 | 45.3 | 339 | 44.3 | 61.2 | 48.7 | 24.8 | 50.4 | 62.5 |
| YOLOv8-s [25] | 500 | 11.2 | 28.6 | 99 | 44.3 | 60.7 | 47.9 | 18.9 | 42.2 | 59.7 |
| Gold-YOLO-s [56] | 300 | 21.5 | 46.0 | 286 | 45.4 | 62.5 | – | 25.3 | 50.2 | 62.5 |
| YOLOv9-s [26] | 500 | 7.2 | 26.7 | 161 | 46.1 | 62.3 | 49.9 | 19.3 | 44.1 | 62.5 |
| YOLOv10-s [27] | 500 | 8.1 | 24.8 | 100 | 46.1 | 61.8 | 49.5 | 19.8 | 43.3 | 61.3 |
| YOLO-MS-s [58] | 300 | 8.1 | 31.2 | – | 46.2 | 63.7 | 50.5 | 26.9 | 50.5 | 63.0 |
| Transformer-based object detector | | | | | | | | | | |
| DETR [33] | 300 | 41 | 86 | 28 | 42.0 | 62.4 | 44.2 | 20.5 | 45.8 | 61.1 |
| DETR-DC5 [33] | 500 | 41 | 187 | 12 | 43.3 | 63.1 | 45.9 | 22.5 | 47.3 | 61.1 |
| Deformable-DETR [34] | 50 | 40 | 173 | 19 | 43.8 | 62.6 | 47.7 | 26.4 | 47.1 | 58.0 |
| Anchor-DETR-DC5 [59] | 50 | 39 | 172 | 16 | 44.2 | 64.7 | 47.5 | 24.7 | 48.2 | 60.6 |
| SMCA-DETR [60] | 36 | 35 | 210 | – | 45.1 | 63.1 | 49.1 | 28.3 | 48.4 | 59.0 |
| RT-DETR-R18 [38] | 72 | 20 | 60 | 217 | 46.4 | 63.7 | – | – | – | – |
| HRTNet (Ours) | 72 | 19.9 | 57.5 | 134 | 46.8 | 63.9 | 50.6 | 23.3 | 44.1 | 61.2 |