Table 3 Performance and inference speed of the pruned (P) and quantized (Q) models with respect to the baselines. Results are reported for the tiny (a), base (b), and large (c) variants on the NYU Depth V2 (left) and KITTI (right) datasets.
From: Efficient attention vision transformers for monocular depth estimation on resource-limited hardware
Model (NYU Depth V2) | RMSE [m] \(\downarrow\) | \(Abs_{Rel}\downarrow\) | \(\delta _1\uparrow\) | \(\delta _2\uparrow\) | \(\delta _3\uparrow\) | XG38 [s] \(\downarrow\) | Model (KITTI) | RMSE [m] \(\downarrow\) | \(Abs_{Rel}\downarrow\) | \(\delta _1\uparrow\) | \(\delta _2\uparrow\) | \(\delta _3\uparrow\) | XG38 [s] \(\downarrow\) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
(a) | |||||||||||||
METER | 0.544 | 0.175 | 0.778 | 0.946 | 0.983 | 16.47 | METER | 5.945 | 7.408 | 0.287 | 0.485 | 0.604 | 18.13 |
METER P | 2.789 | 0.938 | 0.001 | 0.001 | 0.001 | 15.17 | METER P | 10.960 | 1.630 | 0.023 | 0.047 | 0.071 | 20.26 |
METER Q | 1.334 | 0.346 | 0.278 | 0.560 | 0.793 | 17.86 | METER Q | 6.985 | 7.390 | 0.183 | 0.362 | 0.499 | 17.40 |
PXF | 0.392 | 0.114 | 0.880 | 0.982 | 0.996 | 112.41 | PXF | 2.324 | 0.060 | 0.966 | 0.996 | 0.999 | 160.90 |
PXF P | 1.159 | 0.524 | 0.368 | 0.651 | 0.829 | 136.60 | PXF P | 17.172 | 1.609 | 0.112 | 0.225 | 0.337 | 188.94 |
PXF Q | 1.594 | 0.746 | 0.265 | 0.504 | 0.706 | 136.03 | PXF Q | 11.379 | 0.502 | 0.262 | 0.502 | 0.725 | 185.37 |
NeWCRFs | 0.388 | 0.112 | 0.885 | 0.980 | 0.995 | 170.78 | NeWCRFs | 2.373 | 0.059 | 0.965 | 0.995 | 0.999 | 234.65 |
NeWCRFs P | 2.355 | 1.177 | 0.126 | 0.303 | 0.509 | 187.33 | NeWCRFs P | 20.433 | 2.037 | 0.076 | 0.158 | 0.261 | 255.80 |
NeWCRFs Q | 1.820 | 0.845 | 0.235 | 0.465 | 0.662 | 211.43 | NeWCRFs Q | 20.951 | 2.086 | 0.077 | 0.160 | 0.261 | 303.66 |
(b) | |||||||||||||
METER | 0.497 | 0.149 | 0.811 | 0.951 | 0.987 | 17.10 | METER | 5.794 | 6.625 | 0.302 | 0.504 | 0.618 | 22.16 |
METER P | 2.777 | 0.930 | 0.019 | 0.019 | 0.019 | 19.72 | METER P | 11.206 | 1.134 | 0.191 | 0.199 | 0.208 | 29.89 |
METER Q | 1.487 | 0.663 | 0.296 | 0.555 | 0.752 | 19.63 | METER Q | 6.046 | 8.031 | 0.231 | 0.459 | 0.607 | 23.59 |
PXF | 0.338 | 0.096 | 0.918 | 0.988 | 0.997 | 181.09 | PXF | 2.205 | 0.055 | 0.972 | 0.997 | 0.999 | 262.77 |
PXF P | 2.573 | 1.287 | 0.109 | 0.262 | 0.463 | 234.45 | PXF P | 26.822 | 2.704 | 0.055 | 0.116 | 0.188 | 343.31 |
PXF Q | 1.487 | 0.692 | 0.281 | 0.550 | 0.738 | 238.05 | PXF Q | 13.297 | 0.433 | 0.255 | 0.427 | 0.571 | 334.93 |
NeWCRFs | 0.337 | 0.095 | 0.918 | 0.989 | 0.998 | 241.93 | NeWCRFs | 2.185 | 0.054 | 0.972 | 0.997 | 0.999 | 356.12 |
NeWCRFs P | 2.566 | 1.283 | 0.109 | 0.266 | 0.464 | 304.09 | NeWCRFs P | 26.995 | 2.722 | 0.054 | 0.115 | 0.187 | 368.20 |
NeWCRFs Q | 1.870 | 0.908 | 0.202 | 0.428 | 0.635 | 358.69 | NeWCRFs Q | 18.496 | 1.808 | 0.087 | 0.186 | 0.308 | 488.34 |
(c) | |||||||||||||
METER | 0.460 | 0.133 | 0.834 | 0.966 | 0.992 | 21.52 | METER | 5.726 | 7.299 | 0.332 | 0.524 | 0.630 | 34.50 |
METER P | 2.207 | 0.649 | 0.039 | 0.093 | 0.182 | 31.30 | METER P | 10.732 | 2.291 | 0.034 | 0.068 | 0.103 | 51.86 |
METER Q | 1.728 | 0.812 | 0.223 | 0.473 | 0.683 | 26.41 | METER Q | 10.955 | 4.002 | 0.078 | 0.134 | 0.172 | 35.69 |
PXF | 0.324 | 0.091 | 0.928 | 0.991 | 0.998 | 294.74 | PXF | 2.123 | 0.052 | 0.975 | 0.997 | 0.999 | 426.68 |
PXF P | 2.616 | 1.308 | 0.105 | 0.255 | 0.457 | 437.33 | PXF P | 27.298 | 2.752 | 0.053 | 0.113 | 0.184 | 683.33 |
PXF Q | 1.387 | 0.632 | 0.306 | 0.570 | 0.761 | 430.85 | PXF Q | 10.530 | 0.516 | 0.273 | 0.544 | 0.766 | 729.44 |
NeWCRFs | 0.322 | 0.091 | 0.929 | 0.992 | 0.998 | 339.68 | NeWCRFs | 2.072 | 0.052 | 0.975 | 0.997 | 0.999 | 500.17 |
NeWCRFs P | 2.656 | 1.328 | 0.105 | 0.248 | 0.445 | 477.79 | NeWCRFs P | 27.389 | 2.761 | 0.053 | 0.113 | 0.183 | 733.47 |
NeWCRFs Q | 2.327 | 1.144 | 0.145 | 0.329 | 0.531 | 565.98 | NeWCRFs Q | 21.612 | 2.164 | 0.070 | 0.146 | 0.244 | 899.43 |
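
For readers unfamiliar with the column headers, the following is a minimal sketch of how RMSE, \(Abs_{Rel}\), and the threshold accuracies \(\delta_1\), \(\delta_2\), \(\delta_3\) are commonly defined in monocular depth estimation. It is not the evaluation code behind the table. The function name `depth_metrics`, the flat-list input, and the rule of skipping non-positive depths are assumptions made for illustration; dataset-specific crops and depth caps are not modeled.

```python
import math
from typing import Dict, Sequence


def depth_metrics(pred: Sequence[float], gt: Sequence[float]) -> Dict[str, float]:
    """Common depth metrics over paired predicted / ground-truth depths in meters.

    Assumption: pairs with a non-positive ground truth or prediction are
    skipped, which also avoids division by zero in the ratio terms.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not pairs:
        raise ValueError("no valid depth pairs")
    n = len(pairs)

    # Root mean squared error, in the same unit as the inputs.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)

    # Mean absolute error relative to the ground-truth depth.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n

    # delta_i: fraction of pairs with max(p/g, g/p) < 1.25 ** i.
    ratios = [max(p / g, g / p) for p, g in pairs]
    deltas = {
        f"delta_{i}": sum(r < 1.25 ** i for r in ratios) / n for i in (1, 2, 3)
    }

    return {"rmse": rmse, "abs_rel": abs_rel, **deltas}


if __name__ == "__main__":
    # Toy values, not taken from NYU Depth V2 or KITTI.
    print(depth_metrics(pred=[1.0, 1.5, 3.8], gt=[1.0, 1.0, 2.0]))
```

Because the thresholds grow with \(i\), these definitions always give \(\delta_1 \le \delta_2 \le \delta_3\). Lower is better for RMSE and \(Abs_{Rel}\), and higher is better for the \(\delta\) accuracies, matching the arrows in the header.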