Table 2. Comparison of performance on the KITTI dataset.
From: A simple monocular depth estimation network for balancing complexity and accuracy
| Method | Venue | Backbone | \(\delta_1\uparrow\) | \(\delta_2\uparrow\) | \(\delta_3\uparrow\) | AbsRel\(\downarrow\) | RMSE\(\downarrow\) | RMSE log\(\downarrow\) | SqRel\(\downarrow\) | Params\(\downarrow\) |
|---|---|---|---|---|---|---|---|---|---|---|
| DORN [17] | CVPR 2018 | ResNet-101 | 0.932 | 0.984 | 0.994 | 0.072 | 2.727 | 0.120 | 0.307 | – |
| BTS [58] | arXiv 2019 | ResNeXt-101 | 0.956 | 0.993 | 0.998 | 0.059 | 2.756 | 0.096 | 0.245 | 47.0M |
| PWA [59] | AAAI 2021 | DenseNet-161 | 0.956 | 0.994 | 0.999 | 0.062 | 2.708 | 0.096 | – | – |
| AdaBins [12] | CVPR 2021 | EfficientNet-B5 | 0.964 | 0.995 | 0.999 | 0.058 | 2.360 | 0.088 | 0.190 | 78.0M |
| P3Depth [60] | CVPR 2022 | ResNet-101 | 0.953 | 0.993 | 0.998 | 0.071 | 2.842 | 0.103 | 0.270 | 94.2M |
| NeWCRFs [19] | CVPR 2022 | Swin-L | 0.974 | 0.997 | 0.999 | 0.052 | 2.129 | 0.079 | 0.155 | 270.5M |
| LifelongDepth [22] | TNNLS 2023 | ResNet-34 | 0.939 | – | – | 0.070 | 3.286 | – | – | 22.23M |
| DepthFormer [61] | MIR 2023 | Swin-L+R-50-C1 | 0.975 | 0.997 | 0.999 | 0.052 | 2.143 | 0.079 | 0.158 | 273.0M |
| IEBins [18] | NeurIPS 2023 | Swin-T | 0.970 | 0.996 | 0.999 | 0.056 | 2.205 | 0.084 | 0.169 | 90.7M |
| TrapAttention [62] | CVPR 2023 | XCiT-M24 | 0.976 | 0.998 | 0.999 | 0.054 | 1.990 | 0.078 | 0.149 | 94.2M |
| MDEUncertainty [63] | TCSVT 2024 | Swin-L | 0.967 | 0.995 | 0.999 | 0.057 | 2.376 | 0.089 | 0.200 | – |
| SQLDepth [24] | AAAI 2024 | ResNet-50 | 0.962 | 0.993 | 0.998 | 0.058 | 2.925 | 0.095 | 0.289 | 95.5M |
| Metric3Dv2 [29] | ECCV 2024 | ConvNeXt-L | 0.967 | 0.995 | 0.999 | 0.060 | 2.843 | 0.087 | – | 203.24M |
| SimMDE (Ours) | Ours | MSCAN-B | 0.974 | 0.997 | 0.999 | 0.052 | 2.119 | 0.079 | 0.156 | 30.9M |
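
The table does not define its column metrics. The sketch below assumes the definitions commonly used for KITTI depth benchmarks: \(\delta_k\) is the fraction of pixels with \(\max(\hat{d}/d,\, d/\hat{d}) < 1.25^k\), and AbsRel, SqRel, RMSE, and RMSE log are the usual mean error measures over valid pixels. The function name `depth_metrics`, the dictionary keys, and the flat-list input are illustrative choices, not taken from the paper, and the paper's own evaluation protocol (crop, depth cap) may differ.

```python
import math


def depth_metrics(pred, gt):
    """Compute common monocular depth metrics over paired depth values.

    Assumed (standard) definitions, not taken from the source table:
      delta_k  = fraction of pixels with max(pred/gt, gt/pred) < 1.25**k
      abs_rel  = mean(|pred - gt| / gt)
      sq_rel   = mean((pred - gt)**2 / gt)
      rmse     = sqrt(mean((pred - gt)**2))
      rmse_log = sqrt(mean((log(pred) - log(gt))**2))
    """
    # Keep only pixels where both depths are positive (valid ground truth).
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not pairs:
        raise ValueError("no valid (positive) depth pairs")
    n = len(pairs)

    # Threshold accuracies delta_1, delta_2, delta_3 (higher is better).
    ratios = [max(p / g, g / p) for p, g in pairs]
    metrics = {
        f"delta_{k}": sum(r < 1.25 ** k for r in ratios) / n
        for k in (1, 2, 3)
    }

    # Error measures (lower is better).
    metrics["abs_rel"] = sum(abs(p - g) / g for p, g in pairs) / n
    metrics["sq_rel"] = sum((p - g) ** 2 / g for p, g in pairs) / n
    metrics["rmse"] = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    metrics["rmse_log"] = math.sqrt(
        sum((math.log(p) - math.log(g)) ** 2 for p, g in pairs) / n
    )
    return metrics


if __name__ == "__main__":
    # Toy values in metres; real evaluation would use full depth maps.
    print(depth_metrics([2.0, 4.0], [1.0, 4.0]))
```

In practice the inputs would be flattened predicted and ground-truth depth maps, restricted to pixels with valid LiDAR ground truth.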