Abstract
Testing and validating automated driving systems require carefully designed test cases that capture the complexity of real-world driving conditions. However, the inherent complexity of driving environments and the rarity of safety-critical situations pose significant challenges to developing reliable and efficient validation frameworks. This paper addresses these issues by selecting appropriate test cases from the largest-scale naturalistic driving study. We introduce a Kernel Test Case Sampling method, which selects cases that satisfy two key criteria: representativeness, ensuring alignment with real-world scenarios, and coverage, capturing high-risk corner cases. We demonstrate the method on large-scale naturalistic driving study data. With only a limited number of selected cases, the method captures long-tailed scenarios while approximating the distribution of naturalistic driving conditions. The sampling framework also enables robust accident-rate estimation, supporting fair comparisons across human driving performance and multiple automated driving systems. The proposed method supports standardized and scalable safety validation of automated driving systems, facilitating accelerated development and deployment while building public trust and regulatory confidence.
Data availability
The scenario feature data and non-PII Strategic Highway Research Program Naturalistic Driving Study data are available under restricted access. Researchers may obtain access by submitting a Data Use License application, which requires specification of the requested data, documentation of Institutional Review Board (IRB) approval, institutional authorization, and a data security plan. Detailed instructions for data access are available at the following links: https://www.trb.org/StrategicHighwayResearchProgram2SHRP2/SHRP2DataSafetyAccess.aspx and https://insight.shrp2nds.us/. The data underlying the figures presented in this paper are provided in the accompanying Source Data file. Source data are provided with this paper.
Code availability
The source code and implementation details are publicly available at Code Ocean through https://doi.org/10.24433/CO.9203840.v1
Acknowledgments
We thank Drs. Miguel Perez, Jon Hankey, Kevin Kefauver and Zac Doerzaph for their valuable guidance on naturalistic driving study data and automated driving systems testing and validation.
Author information
Contributions
F. Guo conceptualized the study and provided the data. C. Qian, J. Xu, and F. Guo developed the methodology. X. Xing validated the findings. C. Qian and J. Xu performed the formal analysis. C. Qian wrote the original draft, and all authors revised and edited the manuscript. F. Guo and X. Xing provided supervision.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Salvatore Cuomo, Francesco Finazzi, Marcos Nieto and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Qian, C., Xu, J., Xing, X. et al. Test case sampling optimization for safety validation of automated driving systems. Nat Commun (2026). https://doi.org/10.1038/s41467-026-69675-8
DOI: https://doi.org/10.1038/s41467-026-69675-8


