Abstract
Autonomous vehicles remain commercially limited largely due to safety performance stagnation. Existing deep learning, heavily reliant on failure data from rare safety-critical events, suffers from the seesaw effect—improvement in some scenarios causes regression in others. We introduce an innovative dense learning approach that prioritizes both informative failures and successes, informed by theoretical findings. Data is sampled proportionally to its contribution to the policy gradient and exposure frequency, excluding non-informative samples. This densifies the training dataset’s information, significantly reducing learning variance without bias, enabling tasks intractable for existing methods. To validate this, we trained a safety-critical driving agent for a highly automated vehicle using mixed reality on an urban test track. Results demonstrate that our approach breaks the performance stagnation, enhancing the model’s overall safety performance by one to two orders of magnitude. This marks a significant stride towards achieving human-level safety and widespread adoption for autonomous vehicles.
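The sampling principle summarized above — draw data in proportion to its policy-gradient contribution and exposure frequency, exclude non-informative samples, and reweight so the estimator stays unbiased — follows the standard importance-sampling recipe. The sketch below is illustrative only, not the authors' released implementation; the function name `densify_batch` and the use of gradient-norm scores as the measure of informativeness are assumptions.

```python
import numpy as np

def densify_batch(samples, grad_norms, exposure_freqs, n_draw, seed=None):
    """Illustrative information-densified sampling (not the released code).

    Each sample is drawn with probability proportional to its estimated
    policy-gradient contribution times its exposure frequency; samples
    with zero score are excluded as non-informative.  The returned
    importance weights 1/(N * p_i) keep a gradient estimator built from
    the densified batch unbiased with respect to the full dataset.
    """
    rng = np.random.default_rng(seed)
    score = np.asarray(grad_norms, float) * np.asarray(exposure_freqs, float)
    idx = np.flatnonzero(score > 0)            # keep informative samples only
    p = score[idx] / score[idx].sum()          # sampling distribution
    pos = rng.choice(len(idx), size=n_draw, p=p)
    weights = 1.0 / (len(score) * p[pos])      # unbiasing importance weights
    return [samples[i] for i in idx[pos]], weights
```

Because zero-score samples contribute nothing to the gradient, dropping them changes no expectation; the 1/(N·p_i) weights then make the weighted sample mean an unbiased estimate of the full-data gradient, while variance shrinks as sampling concentrates on informative data.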
Data availability
The raw datasets that we used for modeling the naturalistic driving environment come from the Safety Pilot Model Deployment (SPMD) program54 and the Integrated Vehicle Based Safety System (IVBSS)55 at the University of Michigan, Ann Arbor, as well as the RounD dataset33. These raw datasets are subject to the data access policies and licensing terms of their respective providers and are therefore not redistributed by the authors. Access to the SPMD and IVBSS datasets can be requested from the University of Michigan Transportation Research Institute, and access to the RounD dataset is available through its original repository, subject to the corresponding usage agreements. The processed data generated in this study, which constitute the minimum dataset necessary to interpret, verify, and extend the findings reported in this article, have been deposited in the Zenodo repository and are publicly available at: https://zenodo.org/records/12735037. Source data supporting the figures and analyses presented in this paper are provided via Zenodo at: https://zenodo.org/records/12784827. To further illustrate the methodology and qualitative performance of the proposed approach, a collection of Supplementary Videos is provided. All Supplementary Videos, including high-resolution files, are publicly available through Zenodo at https://zenodo.org/records/14837884. These videos demonstrate the learning paradigms, the learned safety metric, the intelligent testing environment, and the performance of SafeDriver across simulated and real-world scenarios, including highway, roundabout, and Mcity test track environments, as well as case studies on the nuPlan benchmark.
Code availability
The simulation software SUMO, the NDE models, the intelligent testing environment, the automated driving system Autoware, and the RLlib platform with the implemented PPO algorithm are publicly available, as described in the text and the relevant references8,25,35,50,56. The source code for the dense learning approach for SafeDriver is available at https://zenodo.org/records/12735669, under the DOI https://doi.org/10.5281/zenodo.12735668.
References
“Science: Radio Auto”. Time Magazine. Aug 10, 1925. https://time.com/archive/6653944/science-radio-auto/ (2013).
Hirsch, J. Reality check: $160 billion can’t get autonomous vehicles on road. Automotive News, https://www.autonews.com/mobility-report/autonomous-vehicle-reality-check-after-160-billion-spent (2022).
Society of Automotive Engineers. Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles, https://www.sae.org/standards/content/j3016_202104/ (2021).
Thadani, T. How a robotaxi crash got Cruise’s self-driving cars pulled from Californian roads, Washington Post, https://www.washingtonpost.com/technology/2023/10/28/robotaxi-cruise-crash-driverless-car-san-francisco/ (2023).
Shladover, S. E. & Nowakowski, C. Regulatory challenges for road vehicle automation: lessons from the California experience. Transp. Res. Part A Policy Pr. 122, 125–133 (2019).
Zhang, Y., Kang, B., Hooi, B., Yan, S. & Feng, J. Deep long-tailed learning: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 45, 10795–10816 (2023).
Liu, H. X. & Feng, S. Curse of rarity for autonomous vehicles. Nat. Commun. 15, 4808 (2024).
Feng, S. et al. Dense reinforcement learning for safety validation of autonomous vehicles. Nature 615, 620–627 (2023).
Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).
Silver, D. et al. Mastering the game of go without human knowledge. Nature 550, 354–359 (2017).
Brown, N. & Sandholm, T. Superhuman AI for multiplayer poker. Science 365, 885–890 (2019).
Wurman, P. R. et al. Outracing champion Gran Turismo drivers with deep reinforcement learning. Nature 602, 223–228 (2022).
Cummings, M. L. Rethinking the maturity of artificial intelligence in safety-critical settings. AI Mag. 42, 6–15 (2021).
Menghi, C. et al. ARCH-COMP 2023 category report: falsification. In Proceedings of 10th International Workshop on Applied Verification of Continuous and Hybrid Systems, Vol. 96, (International Federation of Automatic Control (IFAC), 2023).
Karpathy, A., Tesla Inc. System and method for obtaining training data. U.S. Patent Application 17/250,825 (2021).
Stocker, T. F. The seesaw effect. Science 282, 61–62 (1998).
Tang, H., Liu, J., Zhao, M. & Gong, X. Progressive layered extraction (PLE): a novel multi-task learning (MTL) model for personalized recommendations. In Proceedings of the 14th ACM Conference on Recommender Systems, 269–278 (ACM, 2020).
Zheng, S. et al. GPT-Fathom: benchmarking large language models to decipher the evolutionary path towards GPT-4 and beyond. Findings of the Association for Computational Linguistics: NAACL 2024 (2024).
Seshia, S. A., Sadigh, D. & Sastry, S. S. Toward verified artificial intelligence. Commun. ACM 65, 46–55 (2022).
Pek, C., Manzinger, S., Koschi, M. & Althoff, M. Using online verification to prevent autonomous vehicles from causing accidents. Nat. Mach. Intell. 2, 518–528 (2020).
Krasowski, H. et al. Provably safe reinforcement learning: conceptual analysis, survey, and benchmarking. Preprint at https://arxiv.org/abs/2205.06750 (2022).
Brunke, L. et al. Safe learning in robotics: from learning-based control to safe reinforcement learning. Ann. Rev. Control Robot. Autonomous Syst. 5, 411–444 (2022).
Cao, Z. et al. Continuous improvement of self-driving cars using dynamic confidence-aware reinforcement learning. Nat. Mach. Intell. 5, 145–158 (2023).
Thomas, P. S. et al. Preventing undesirable behavior of intelligent machines. Science 366, 999–1004 (2019).
Kato, S. et al. Autoware on board: enabling autonomous vehicles with embedded systems. In 2018 ACM/IEEE 9th International Conference on Cyber-Physical Systems (ICCPS), 287–296 (IEEE, 2018).
Hsu, K. C., Hu, H. & Fisac, J. F. The safety filter: a unified view of safety-critical control in autonomous systems. Ann. Rev. Control Robot. Autonomous Syst. 7, 47–72 (2023).
Tokdar, S. T. & Kass, R. E. Importance sampling: a review. Wiley Interdiscip. Rev. 2, 54–60 (2010).
Yan, X. et al. Learning naturalistic driving environment with statistical realism. Nat. Commun. 14, 2037 (2023).
Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. Proximal policy optimization algorithms. Preprint at https://arxiv.org/abs/1707.06347 (2017).
Kochdumper, N. et al. Provably safe reinforcement learning via action projection using reachability analysis and polynomial zonotopes. IEEE Open J. Control Syst. 2, 79–92 (2023).
Shalev-Shwartz, S., Shammah, S. & Shashua, A. On a formal model of safe and scalable self-driving cars. Preprint at https://arxiv.org/abs/1708.06374 (2017).
Wang, X. & Althoff, M. Safe reinforcement learning for automated vehicles via online reachability analysis. IEEE Trans. Intell. Veh. (2023).
Krajewski, R., Moers, T., Bock, J., Vater, L. & Eckstein, L. The rounD dataset: a drone dataset of road user trajectories at roundabouts in Germany. In 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), 1–6 (IEEE, 2020).
Elbanhawi, M. & Simic, M. Sampling-based robot motion planning: a review. IEEE Access 2, 56–77 (2014).
Lopez, P. et al. Microscopic traffic simulation using SUMO. In International Conference on Intelligent Transportation Systems (ITSC), 2575–2582 (IEEE, 2018).
Treiber, M., Hennecke, A. & Helbing, D. Congested traffic states in empirical observations and microscopic simulations. Phys. Rev. E 62, 1805 (2000).
Caesar, H. et al. nuPlan: a closed-loop ML-based planning benchmark for autonomous vehicles. In CVPR ADP3 Workshop (CVPR, 2021).
Dauner, D., Hallgarten, M., Geiger, A. & Chitta, K. Parting with misconceptions about learning-based vehicle motion planning. In Conference on Robot Learning (CoRL, 2023).
OpenAI. GPT-4 technical report. Preprint at https://arxiv.org/abs/2303.08774 (2023).
Kandpal, N., Deng, H., Roberts, A., Wallace, E. & Raffel, C. Large language models struggle to learn long-tail knowledge. In International Conference on Machine Learning, 15696–15707 (PMLR, 2023).
Sauerbier, J., Bock, J., Weber, H. & Eckstein, L. Definition of scenarios for safety validation of automated driving functions. ATZ Worldw. 121, 42–45 (2019).
Wang, J., Zhang, L., Huang, Y., Zhao, J. & Bella, F. Safety of autonomous vehicles. J. Adv. Transp. 2020, 1–13 (2020).
Zhu, Z. et al. Is sora a world simulator? a comprehensive survey on general world models and beyond. Preprint at https://arxiv.org/abs/2405.03520 (2024).
Owen, A. B. Monte Carlo Theory, Methods and Examples. https://artowen.su.domains/mc/ (2013).
Alain, G., Lamb, A., Sankar, C., Courville, A. & Bengio, Y. Variance reduction in sgd by distributed importance sampling. Preprint at https://arxiv.org/abs/1511.06481 (2015).
Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction (MIT Press, 2018).
Ciosek, K. & Whiteson, S. OFFER: off-environment reinforcement learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 31 (AAAI, 2017).
Johnson, J. M. & Khoshgoftaar, T. M. Survey on deep learning with class imbalance. J. Big Data 6, 1–54 (2019).
Scanlon, J. M. et al. Waymo simulated driving behavior in reconstructed fatal crashes within an autonomous vehicle operating domain. Accid. Anal. Prev. 163, 106454 (2021).
Liang, E. et al. RLlib: Abstractions for distributed reinforcement learning. In International Conference on Machine Learning (PMLR, 2018).
Darweesh, H. et al. Open source integrated planner for autonomous navigation in highly dynamic environments. J. Robot. Mechatron. 29, 668–684 (2017).
Feng, S. et al. Safety assessment of highly automated driving systems in test tracks: a new framework. Accid. Anal. Prev. 144, 105664 (2020).
Chang, A. X. et al. ShapeNet: an information-rich 3D model repository. Preprint at https://arxiv.org/abs/1512.03012 (2015).
Bezzina, D. & Sayer, J. Safety Pilot Model Deployment: Test Conductor Team Report. Report No. DOT HS 812 171. Washington, DC: National Highway Traffic Safety Administration (2014).
Sayer, J. et al. Integrated Vehicle-Based Safety Systems Field Operational Test: Final Program Report. Report No. FHWA-JPO-11-150; UMTRI-2010-36. United States Joint Program Office for Intelligent Transportation Systems (2011).
Feng, S., Yan, X., Sun, H., Feng, Y. & Liu, H. X. Intelligent driving intelligence test for autonomous vehicles with naturalistic and adversarial environment. Nat. Commun. 12, 1–14 (2021).
National Center for Statistics and Analysis. Fatality Analysis Reporting System (FARS) Analytical User’s Manual, 1975-2018. Report No. DOT HS 812 827. Washington, DC: National Highway Traffic Safety Administration. https://www.nhtsa.gov/research-data/fatality-analysis-reporting-system-fars (2019).
Acknowledgements
This research was partially funded by the U.S. Department of Transportation (USDOT) Region 5 University Transportation Center: Center for Connected and Automated Transportation (CCAT) of the University of Michigan (69A3551747105) and the National Science Foundation (CMMI #2223517). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the official policy or position of the U.S. government.
Author information
Authors and Affiliations
Contributions
S.F. and H.L. conceived and led the research program, developed the dense learning approach, and wrote the paper. S.F., H.Z., and H.S. developed the algorithms, designed the experiments, and analyzed the results. H.Z., H.S., L.H., and S.L. implemented the algorithms in simulation, performed the simulation tests, and prepared the simulation results. H.Z., X.Y., B.L., and S.S. implemented the algorithms in the autonomous vehicle, performed the field tests, and prepared the testing results. S.F., J.Y., G.S., and L.W. developed the theoretical analysis. All authors provided feedback during the manuscript revision and results discussions. H.L. approved the submission and accepted responsibility for the overall integrity of the paper.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Matthias Althoff and Mozhgan Nasr Azadani for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Feng, S., Zhu, H., Sun, H. et al. Breaking through safety performance stagnation in autonomous vehicles with dense learning. Nat Commun (2026). https://doi.org/10.1038/s41467-026-69761-x
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41467-026-69761-x