Table 13 Comparison of methodological approaches: distance estimation.
Approach | Main description (major improvements over the previous approach) | Potential problems |
---|---|---|
1 | 1) Using the area ratio of the two bounding boxes to represent the relative depth (z): \(z = \frac{\max(\text{Area of Bounding Box 1},\ \text{Area of Bounding Box 2})}{\min(\text{Area of Bounding Box 1},\ \text{Area of Bounding Box 2})}\) 2) Combining z with x and y: \(\text{Relative Distance} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}\) | 1) The relationship between the area ratio and the relative depth is not linear, so the area ratio does not represent depth accurately 2) The combination lacks a sound theoretical basis |
2 | 1) Identifying an inverse square root relationship between the area of a bounding box and the depth of the object from the camera, which addresses the non-linearity problem: \(z = \frac{1}{\sqrt{\text{Area of Bounding Box}}}\) 2) Combining z with x and y using a 3D Euclidean formula to make the combination more credible: \(\text{Relative Distance} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}\) | 1) This approach has not been replicated by other researchers and requires many validation rounds, which are time-consuming 2) The inverse square root relationship holds when the video is shot from eye level, but the camera angle varies between videos, so the equation for z becomes inaccurate |
3 | 1) Using MiDaR to obtain a relative depth (z), which is more credible than estimating depth from bounding-box area 2) Combining z with x and y using a 3D Euclidean formula to make the combination more robust: \(\text{Relative Distance} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}\) | 1) In this study’s setting, MiDaR would be inaccurate if the centers of the bounding boxes around the people in a frame overlap |
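
The formulas in Table 13 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the study's implementation: the `Box` type and all function names are hypothetical, x and y are assumed to be the bounding-box centers in pixels, and no rescaling between the pixel coordinates and z is applied because the table does not specify one. For approach 3, the z values would come from the depth estimator instead of from box areas; `relative_distance` accepts any such (x, y, z) points.

```python
import math
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical helper type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def area(self) -> float:
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)

    @property
    def center(self) -> Tuple[float, float]:
        # Assumption: x and y in the table refer to the box center.
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def area_ratio_depth(box1: Box, box2: Box) -> float:
    """Approach 1: relative depth as the larger area divided by the smaller area."""
    return max(box1.area, box2.area) / min(box1.area, box2.area)


def inverse_sqrt_depth(box: Box) -> float:
    """Approach 2: z = 1 / sqrt(area of bounding box)."""
    return 1.0 / math.sqrt(box.area)


def relative_distance(p1: Tuple[float, float, float],
                      p2: Tuple[float, float, float]) -> float:
    """3D Euclidean distance between two (x, y, z) points."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p1, p2)))


def approach2_distance(box1: Box, box2: Box) -> float:
    """Approach 2 end to end: box centers for x and y, inverse-sqrt area for z."""
    p1 = (*box1.center, inverse_sqrt_depth(box1))
    p2 = (*box2.center, inverse_sqrt_depth(box2))
    return relative_distance(p1, p2)
```

Because x and y are in pixels while z from approach 2 is in inverse pixels, the z term contributes very little to the distance unless it is rescaled. The table gives no scaling factor, so the sketch leaves z as computed.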