Table 5 Ablation Study

From: Localization and recognition of human action in 3D using transformers

Method                                mAP @ 0.5 tIoU
2D Features [38]                      14.5
Joint pos.                            23.4
Early-layer AR feat.                  21.3
Later-layer AR feat.                  20.4
Joint pos. + Early + Later AR feat.   21.4

Effect of different 3D human representations on BABEL-TAL-20 (BT-20). The 2D features obtained from the rendered videos (14.5 mAP) perform worse than the 3D joint features (23.4 mAP). The authors attribute this gap to the 3D representation carrying richer information than the 2D representation. mAP: mean Average Precision; tIoU: temporal Intersection over Union (IoU), here thresholded at 0.5.
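To make the "mAP @ 0.5 tIoU" column concrete, the following is a minimal sketch of how a temporal IoU between a predicted and a ground-truth action segment is commonly computed and thresholded. It assumes segments are given as (start, end) pairs in seconds. The function names `temporal_iou` and `is_true_positive` are hypothetical and are not taken from the paper's code.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments.

    Assumption: each segment is a (start, end) pair with start <= end,
    expressed in the same time unit (e.g. seconds).
    """
    (pred_start, pred_end), (gt_start, gt_end) = pred, gt
    # Length of the overlap; zero when the segments are disjoint.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    # Union is the sum of both lengths minus the shared part.
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    return intersection / union if union > 0 else 0.0


def is_true_positive(pred, gt, threshold=0.5):
    """A prediction counts as a match when its tIoU reaches the threshold."""
    return temporal_iou(pred, gt) >= threshold
```

Under this convention, a prediction of the correct class counts toward average precision only if it overlaps a ground-truth segment with tIoU of at least 0.5; mAP then averages the per-class average precision.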