Abstract
Early positioning work concentrated on static natural geographic features; with the emergence of geographic information systems and the growing demand for spatial data, the focus has shifted to capturing dynamic objects. However, previous methods typically rely on expensive devices or external calibration objects for attitude measurement. Here we propose a real-time hybrid framework with a dual-phase strategy that leverages the time-series nature of dynamic objects, combining AI detection with mathematical modeling to estimate relative attitudes via efficient singular value decomposition, thus enabling reference-free 3D coordinate recognition. In particular, we enhance the state-of-the-art You Only Look Once version 12 model by incorporating time-series analysis for rapid and precise 2D detection, and the resulting detections serve as input for 2D-to-3D conversion via our singular value decomposition-based solver. Using data from only three off-the-shelf smartphone cameras, the system achieves accurate, reference-free 3D positioning of a flying UAV. Experimental results demonstrate high precision in terms of RMSE, MAE, and R-squared. Under sensor-resource constraints, this AI-mathematics fusion therefore enables real-time 3D coordinate recognition without traditional attitude measurement.
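For readers unfamiliar with the three accuracy metrics named in the abstract, the following is a minimal sketch of their standard textbook definitions, applied to one coordinate axis. It is an illustration only and not the authors' evaluation code: the function names and the sample values are hypothetical, and computing the metrics per axis (rather than pooled over all three axes) is an assumption.

```python
import math


def rmse(truth, pred):
    """Root-mean-square error over paired scalar values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))


def mae(truth, pred):
    """Mean absolute error over paired scalar values."""
    return sum(abs(t - p) for t, p in zip(truth, pred)) / len(truth)


def r_squared(truth, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, pred))
    ss_tot = sum((t - mean_t) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and estimated positions along one axis (metres).
truth = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]

print(f"RMSE = {rmse(truth, pred):.4f}")
print(f"MAE  = {mae(truth, pred):.4f}")
print(f"R^2  = {r_squared(truth, pred):.4f}")
```

Lower RMSE and MAE indicate smaller positioning error, while an R-squared value close to 1 indicates that the estimates track the reference trajectory closely.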
Data availability
The UAV dataset used in this study is publicly available at https://github.com/wordbomb/PoseFree-GeoLocator.
Code availability
The source code is publicly available at https://github.com/wordbomb/PoseFree-GeoLocator. The code was developed using Python with PyTorch v2.3.0 and CUDA 12.1. The benchmark YOLO detector is based on the Ultralytics framework (https://github.com/ultralytics/ultralytics).
Acknowledgements
This work is supported by the National Natural Science Foundation of China (61803047) and the Social Sciences Fund of Jiangsu Province (24XWB004). Ke-ke Shang is supported by the Jiangsu Qing Lan Project and the NJU-China Mobile Joint Research Institute. Michael Small is supported by the Australian Research Council Discovery Grant (DP200102961). Michael Small also acknowledges the support of the Australian Research Council through the Center for Transforming Maintenance through Data Science (IC180100030).
Author information
Contributions
J.Y. (Co-first author) was responsible for experiment design, linear algebra analysis, coding, data analysis, and drafting the manuscript. K.-k.S. (Co-first author & Corresponding author) contributed to analysis, experiment, and simulation design, supervision, and manuscript writing. M.S. provided guidance and assisted in manuscript reviewing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Engineering thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editors: Alessandro Rizzo and Wenjie Wang. A peer review file is available.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Yi, J., Shang, Kk. & Small, M. Bridging mathematical modeling and AI for 3D coordinate recognition of moving objects without external reference and attitude measurement. Commun Eng (2026). https://doi.org/10.1038/s44172-026-00648-x
DOI: https://doi.org/10.1038/s44172-026-00648-x


