Table 1 Quantitative evaluation for camera-only multi-object tracking
From: Towards generalizable and interpretable three-dimensional tracking with inverse neural rendering
Training data unseen | Method | Car AMOTA ↑ | Car AMOTP (m) ↓ | Car Recall ↑ | Car MOTA ↑ | Motorcycle AMOTA ↑ | Motorcycle AMOTP (m) ↓ | Motorcycle Recall ↑ | Motorcycle MOTA ↑ | Modality |
---|---|---|---|---|---|---|---|---|---|---|
× | PF-Track | 0.622 | 0.916 | 0.719 | 0.558 | 0.448 | 1.245 | 0.457 | 0.384 | Camera |
× | QTrack | 0.692 | 0.753 | 0.760 | 0.596 | 0.531 | 1.098 | 0.861 | 0.500 | Camera |
× | QD-3DT | 0.425 | 1.258 | 0.563 | 0.358 | 0.253 | 1.543 | 0.437 | 0.243 | Camera |
✓ | QD-3DT (trained on WOD) | 0.000 | 1.893 | 0.226 | 0.000 | 0.000 | 2.000 | 0.000 | 0.000 | Camera |
× (CP) | CenterTrack | 0.202 | 1.195 | 0.313 | 0.134 | 0.011 | 1.636 | 0.141 | 0.033 | Camera |
✓ (CP) | AB3DMOT | 0.387 | 1.158 | 0.506 | 0.284 | 0.254 | 1.549 | 0.360 | 0.232 | Camera |
✓ (CP) | Inverse neural rendering (ours) | 0.402 | 1.213 | 0.521 | 0.315 | 0.244 | 1.479 | 0.389 | 0.220 | Camera |