Table 2 Quantitative evaluations on the BABEL-TAL-20 (BT-20) dataset

We report AP at tIoU thresholds in the range [0.1, 0.9], together with the mAP. LocATe is our single-stage transformer-based approach, while LocATe w/ tricks is the same method enhanced with iterative bounding box refinement and a two-stage decoder⁴⁰. LocATe outperforms the previous method Beyond-Joints, and its gains over the other benchmark methods are particularly large at lower tIoU thresholds. AP Average Precision, tIoU temporal IoU, mAP mean Average Precision.
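To make the metrics in this caption concrete, the following is a minimal sketch of how AP at a given tIoU threshold and the mAP over thresholds could be computed for a single action class. It is not the authors' evaluation code. The function names, the greedy one-to-one matching, the non-interpolated AP, and the 0.1 step between thresholds are all assumptions made for illustration; the actual protocol may differ (for example, by interpolating precision or by also averaging over classes).

```python
def tiou(a, b):
    """Temporal IoU of two (start, end) segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds, gts, thr):
    """Non-interpolated AP at one tIoU threshold (hypothetical helper).

    preds: list of (start, end, score); gts: list of (start, end).
    """
    if not gts:
        return 0.0
    matched = set()
    hits = []
    # Visit predictions from most to least confident.
    for start, end, _score in sorted(preds, key=lambda p: -p[2]):
        # Find the unmatched ground-truth segment with the highest tIoU.
        best, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            overlap = tiou((start, end), gt)
            if overlap > best:
                best, best_j = overlap, j
        # A prediction is a true positive only if that overlap reaches thr.
        if best_j >= 0 and best >= thr:
            matched.add(best_j)
            hits.append(1)
        else:
            hits.append(0)
    # Average the precision at each rank where a true positive occurs.
    ap, tp = 0.0, 0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            tp += 1
            ap += tp / rank
    return ap / len(gts)


def mean_ap(preds, gts, thresholds=None):
    """Mean of AP over tIoU thresholds; the 0.1 step is an assumption."""
    if thresholds is None:
        thresholds = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9
    return sum(average_precision(preds, gts, t) for t in thresholds) / len(thresholds)


# Toy data, invented for illustration only.
preds = [(0.0, 10.0, 0.9), (20.0, 30.0, 0.8)]
gts = [(0.0, 10.0), (22.0, 30.0)]
print(average_precision(preds, gts, 0.5))
print(average_precision(preds, gts, 0.9))
```

In this toy case the second prediction overlaps its ground-truth segment with a tIoU of 0.8, so it counts as correct at a threshold of 0.5 but not at 0.9. This is why AP generally falls as the tIoU threshold rises.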
