Pseudo-depth-based deep neural network model for object detection

  • Si-Qi Li1,
  • Wei Feng2,3,4,
  • Bin Liu5,
  • Xin Tong1 &
  • Qiang Li1

Scientific Reports (2026)


We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Engineering
  • Mathematics and computing
  • Optics and photonics

Abstract

Current machine learning methods use only the three-channel color features of optical images for computer vision tasks. However, optical images explicitly present only RGB color and two-dimensional planar shape, so third-dimension spatial features are left unexploited, which limits the achievable recognition performance. To address this issue, we propose a detection scheme that enhances a model's detection capability using four independent features, combining pseudo-depth with the RGB features without adding any hardware sensors. A monocular depth estimation model is first used as a virtual depth sensor to extract pseudo-depth features from the input optical images. The fused depth-RGB features are then fed into the neural network model for object detection training and inference, strengthening its capacity to extract spatial features. Experiments show that the proposed method improves the detection metric mAP₅₀ by 3.8 and 8.0 percentage points on the public M³FD and COCO datasets, respectively. Notably, the scheme can be easily embedded into any machine learning model to improve detection performance.
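
To make the pipeline concrete, the following is a minimal sketch of the pseudo-depth extraction and four-channel fusion described above, assuming MiDaS (loaded via torch.hub) as the virtual depth sensor; the estimator, fusion scheme, and detector used in the paper may differ, and the image path is a placeholder.

    # Sketch: pseudo-depth extraction and RGB-D fusion. MiDaS is assumed as
    # the virtual depth sensor; the paper's actual estimator may differ.
    import cv2
    import numpy as np
    import torch

    # Load a small MiDaS monocular depth model and its matching preprocessing.
    midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
    midas.eval()
    midas_transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
    transform = midas_transforms.small_transform

    # Read an image (placeholder path) and convert BGR -> RGB.
    rgb = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)

    # Predict relative inverse depth and resize it to the input resolution.
    with torch.no_grad():
        pred = midas(transform(rgb))
        depth = torch.nn.functional.interpolate(
            pred.unsqueeze(1), size=rgb.shape[:2],
            mode="bicubic", align_corners=False,
        ).squeeze().cpu().numpy()

    # Normalize the pseudo-depth to [0, 1] and stack it as a fourth channel.
    depth = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)
    rgbd = np.dstack([rgb.astype(np.float32) / 255.0, depth])  # H x W x 4

    # A detector consuming this tensor needs a 4-input-channel stem, e.g.:
    # torch.nn.Conv2d(4, 64, kernel_size=7, stride=2, padding=3, bias=False)

In this sketch, widening the detector's first convolution from three to four input channels is the only architectural change the host model needs to accept the fused input.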


Data availability

The datasets generated and/or analysed during the current study are available in the GitHub repository at https://github.com/htyb275/Pseudo-Depth-Detection.


Acknowledgements

This work is supported by the National Key R&D Program of China (2022YFA1604803), the National Major in High Resolution Earth Observation (68-Y50G07-9001-22/23), and the Natural Science Basic Research Program of Shaanxi (2025JC-YBMS-020).

Author information

Authors and Affiliations

  1. State Key Laboratory of Porous Metal Materials, School of Physical Science and Technology, Northwestern Polytechnical University, Xi’an, 710072, China

    Si-Qi Li, Xin Tong & Qiang Li

  2. School of Information Mechanics and Sensing Engineering, Xidian University, Xi’an, 710071, China

    Wei Feng

  3. Xi’an Key Laboratory of Advanced Remote Sensing, Xi’an, 710071, China

    Wei Feng

  4. Shaanxi Innovation Center for Multi-source Fusion Detection and Recognition, Xi’an, China

    Wei Feng

  5. Shanghai Aerospace Control Technology Institute, Shanghai, 201109, China

    Bin Liu


Contributions

Conceptualization: Q.L., W.F., and B.L.; Methodology: S.Q.L., W.F., Q.L.; Formal analysis and data curation: S.Q.L., Q.L., B.L.; Writing—original draft preparation: S.Q.L., X.T., Q.L.; Writing—review and editing: S.Q.L., W.F., B.L., X.T., and Q.L. All authors reviewed the manuscript.

Corresponding author

Correspondence to Qiang Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Li, SQ., Feng, W., Liu, B. et al. Pseudo-depth-based deep neural network model for object detection. Sci Rep (2026). https://doi.org/10.1038/s41598-026-45310-w


  • Received: 21 January 2026

  • Accepted: 18 March 2026

  • Published: 26 March 2026

  • DOI: https://doi.org/10.1038/s41598-026-45310-w


Keywords

  • Feature enhancement
  • Multispectral object detection
  • Pseudo-depth feature
  • Monocular depth estimation
