Real-world road damage dataset with potholes, cracks, and maintenance holes
  • Article
  • Open access
  • Published: 01 April 2026


  • Enrico Giordani1,
  • Lorenzo Arcioni1,
  • Manuel Gil-Martín2,
  • Gian Luca Foresti3 &
  • Marco Raoul Marini1

Scientific Reports (2026)


We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Engineering
  • Environmental sciences
  • Mathematics and computing
  • Natural hazards

Abstract

Road surface deterioration poses significant challenges to transportation safety, infrastructure longevity, and timely maintenance planning. Existing street-view datasets are often limited by wide-angle distortions that reduce geometric fidelity and hinder reliable damage analysis. This paper introduces the Road Damage Dataset: Potholes, Cracks, and Manholes, a novel dataset designed for robust detection of road-surface damage in urban and rural settings. The dataset was captured using two consumer-grade devices, acquiring diverse views that mimic real-world deployment situations. It contains high-resolution images with three major and often co-occurring road-damage classes: potholes, cracks, and maintenance holes. It includes 2009 hand-labeled images containing 1261 potholes, 2519 cracks, and 957 maintenance holes with verified bounding boxes. All images were post-processed to improve visual quality and remove sensitive information. The dataset includes several districts in Rome (Italy) and nearby semi-urban and rural towns such as Sacrofano, offering more environmental heterogeneity than many existing datasets. Thanks to its varied capture circumstances, viewing angles, and scene contexts, this dataset supports the development of generalizable models for real-world road-damage detection.
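The per-class box counts reported above (1261 potholes, 2519 cracks, 957 maintenance holes) can be re-derived from the downloaded annotations. Below is a minimal sketch assuming the labels follow the standard YOLO text format (one `class_id x_center y_center width height` line per box); the class-id mapping used here is hypothetical, so check the release's own metadata (e.g. its `data.yaml`) for the real one:

```python
from collections import Counter
from pathlib import Path

# Hypothetical class-id mapping; verify against the dataset's own metadata.
CLASS_NAMES = {0: "pothole", 1: "crack", 2: "maintenance_hole"}

def count_boxes(label_dir: str) -> Counter:
    """Count bounding boxes per class across all YOLO-format .txt label files."""
    counts: Counter = Counter()
    for txt in Path(label_dir).glob("*.txt"):
        for line in txt.read_text().splitlines():
            if line.strip():
                class_id = int(line.split()[0])
                counts[CLASS_NAMES.get(class_id, f"class_{class_id}")] += 1
    return counts
```

Running this over the label directory and comparing against the published totals is a quick sanity check that a download is complete.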

Data availability

The datasets generated and analysed during the current study are available in the Road Damage Dataset - Potholes, Cracks, and Manholes repositories on Zenodo41 and on Kaggle at https://www.kaggle.com/datasets/lorenzoarcioni/road-damage-dataset-potholes-cracks-and-manholes. Example code for loading and using the dataset is available at https://github.com/lorenzo-arcioni/Road-Damage-Detection-Dataset-Analysis. The dataset is released under the Creative Commons Attribution 4.0 International license and is freely accessible without registration requirements. The dataset used for training YOLO models is available at https://www.kaggle.com/datasets/lorenzoarcioni/pothole-test. The image annotation tool is available at https://github.com/bnsreenu/digitalsreeni-image-annotator.
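The YOLO training copy of the dataset presumably stores boxes as normalized center coordinates, per the standard YOLO convention (an assumption here, not a documented property of this specific release). Converting such a box back to pixel corners for visualization or cross-checking against the hand-verified bounding boxes is a small utility:

```python
def yolo_to_pixels(box: tuple, img_w: int, img_h: int) -> tuple:
    """Convert a normalized YOLO box (x_center, y_center, width, height)
    to integer pixel corners (x1, y1, x2, y2)."""
    xc, yc, w, h = box
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return (round(x1), round(y1), round(x2), round(y2))
```

For example, a box `(0.5, 0.5, 0.5, 0.5)` on a 100x100 image maps to the pixel rectangle `(25, 25, 75, 75)`.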

Code availability

The Python code used for the curation, analysis, training, and validation of the dataset is publicly available on GitHub, ensuring reproducibility and enabling further methodological and applied research. The repository is accessible at: https://github.com/lorenzo-arcioni/Road-Damage-Detection-Dataset-Analysis.

References

  1. Arya, D. et al. Global road damage detection: State-of-the-art solutions. In 2020 IEEE International Conference on Big Data (Big Data), 5533–5539. https://doi.org/10.1109/BigData50022.2020.9377790 (2020).

  2. Cui, L., Qi, Z., Chen, Z., Meng, F. & Shi, Y. Pavement distress detection using random decision forests. In Data Science (eds Zhang, C. et al.) 95–102 (Springer International Publishing, 2015).

  3. Stricker, R., Eisenbach, M., Sesselmann, M., Debes, K. & Gross, H.-M. Improving visual road condition assessment by extensive experiments on the extended GAPs dataset. In 2019 International Joint Conference on Neural Networks (IJCNN), 1–8. https://doi.org/10.1109/IJCNN.2019.8852257 (2019).

  4. Arya, D., Maeda, H., Ghosh, S. K., Toshniwal, D. & Sekimoto, Y. RDD2022: A multi-national image dataset for automatic road damage detection (2022). arXiv:2209.08538.

  5. Maniat, M., Camp, C. V. & Kashani, A. R. Deep learning-based visual crack detection using Google Street View images. Neural Comput. Appl. 33, 14565–14582. https://doi.org/10.1007/s00521-021-06098-0 (2021).

  6. Lei, X., Liu, C., Li, L. & Wang, G. Automated pavement distress detection and deterioration analysis using street view map. IEEE Access 8, 76163–76172. https://doi.org/10.1109/ACCESS.2020.2989028 (2020).

  7. Ren, M., Zhang, X., Zhi, X., Wei, Y. & Feng, Z. An annotated street view image dataset for automated road damage detection. Sci. Data 11, 407. https://doi.org/10.1038/s41597-024-03263-7 (2024).

  8. Arya, D., Maeda, H., Ghosh, S. K., Toshniwal, D. & Sekimoto, Y. RDD2020: An annotated image dataset for automatic road damage detection using deep learning. Data Brief 36, 107133. https://doi.org/10.1016/j.dib.2021.107133 (2021).

  9. Kortmann, F. et al. Detecting various road damage types in global countries utilizing Faster R-CNN. In 2020 IEEE International Conference on Big Data (Big Data), 5563–5571. https://doi.org/10.1109/BigData50022.2020.9378245 (2020).

  10. Vishwakarma, R. & Vennelakanti, R. CNN model & tuning for global road damage detection. In 2020 IEEE International Conference on Big Data (Big Data), 5609–5615. https://doi.org/10.1109/BigData50022.2020.9377902 (2020).

  11. Pham, V., Pham, C. & Dang, T. Road damage detection and classification with Detectron2 and Faster R-CNN. In 2020 IEEE International Conference on Big Data (Big Data), 5592–5601. https://doi.org/10.1109/BigData50022.2020.9378027 (2020).

  12. Lin, C. et al. DA-RDD: Toward domain adaptive road damage detection across different countries. IEEE Trans. Intell. Transp. Syst. 24, 3091–3103. https://doi.org/10.1109/TITS.2022.3221067 (2023).

  13. Kapp, A., Hoffmann, E., Weigmann, E. & Mihaljević, H. StreetSurfaceVis: A dataset of crowdsourced street-level imagery annotated by road surface type and quality. Sci. Data 12, 92. https://doi.org/10.1038/s41597-024-04295-9 (2025).

  14. Yin, T., Zhang, W., Kou, J. & Liu, N. Promoting automatic detection of road damage: A high-resolution dataset, a new approach, and a new evaluation criterion. IEEE Trans. Autom. Sci. Eng. 22, 2472–2484. https://doi.org/10.1109/TASE.2024.3379945 (2025).

  15. Yang, H. et al. A large-scale image repository for automated pavement distress analysis and degradation trend prediction. Sci. Data 12, 1426. https://doi.org/10.1038/s41597-025-05748-5 (2025).

  16. Zhang, H. et al. A new road damage detection baseline with attention learning. Appl. Sci. https://doi.org/10.3390/app12157594 (2022).

  17. Pham, V., Nguyen, D. & Donan, C. Road damage detection and classification with YOLOv7. In 2022 IEEE International Conference on Big Data (Big Data), 6416–6423. https://doi.org/10.1109/BigData55660.2022.10020856 (2022).

  18. Alfarrarjeh, A., Trivedi, D., Kim, S. H. & Shahabi, C. A deep learning approach for road damage detection from smartphone images. In 2018 IEEE International Conference on Big Data (Big Data), 5201–5204. https://doi.org/10.1109/BigData.2018.8621899 (2018).

  19. Arya, D. et al. Deep learning-based road damage detection and classification for multiple countries. Autom. Constr. 132, 103935. https://doi.org/10.1016/j.autcon.2021.103935 (2021).

  20. Guo, G. & Zhang, Z. Road damage detection algorithm for improved YOLOv5. Sci. Rep. 12, 15523. https://doi.org/10.1038/s41598-022-19674-8 (2022).

  21. Pang, Z. et al. Road surface classification with texture-feature-embedded ResNet for the active suspension systems in complex environments. Adv. Eng. Inform. 71, 104280. https://doi.org/10.1016/j.aei.2025.104280 (2026).

  22. Liu, Y. et al. A non-destructive automatic pavement damage detection scheme based on end-to-end neural networks with multi-level attention mechanism. Eng. Appl. Artif. Intell. 156, 111246. https://doi.org/10.1016/j.engappai.2025.111246 (2025).

  23. Yenni, H. et al. MYCD: Integration of YOLO-CNN and DenseNet for real-time road damage detection based on field images. J. Appl. Data Sci. 7, 384–395. https://doi.org/10.47738/jads.v7i1.1040 (2025).

  24. Arcioni, L. & Giordani, E. Road damage dataset collection route map (2025). https://www.google.com/maps/d/viewer?mid=1WrrMPBqnh6v_GnQmfvKJWaq3R0YPD78&usp=sharing.

  25. Bhattiprolu, S. DigitalSreeni image annotator (2024). https://github.com/bnsreenu/digitalsreeni-image-annotator.

  26. Ravi, N. et al. SAM 2: Segment anything in images and videos (2024). arXiv:2408.00714.

  27. Redmon, J., Divvala, S., Girshick, R. & Farhadi, A. You only look once: Unified, real-time object detection. Preprint at arXiv:1506.02640 (2016).

  28. Jocher, G. & Qiu, J. Ultralytics YOLO11 (2024).

  29. Snoek, J., Larochelle, H. & Adams, R. P. Practical Bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems Vol. 25 (eds Pereira, F. et al.) (Curran Associates, Inc., 2012).

  30. Jocher, G., Chaurasia, A. & Qiu, J. Ultralytics YOLOv8 (2023).

  31. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A. & Chen, L.-C. MobileNetV2: Inverted residuals and linear bottlenecks (2019). arXiv:1801.04381.

  32. Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition (2015). arXiv:1409.1556.

  33. Liu, W. et al. SSD: Single Shot MultiBox Detector 21–37 (Springer International Publishing, 2016).

  34. Ren, S., He, K., Girshick, R. & Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks (2016). arXiv:1506.01497.

  35. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition (2015). arXiv:1512.03385.

  36. Lin, T.-Y. et al. Feature pyramid networks for object detection (2017). arXiv:1612.03144.

  37. Kaggle: Your machine learning and data science community. https://www.kaggle.com/.

  38. Biewald, L. Experiment tracking with Weights and Biases (2020). Software available from wandb.com.

  39. Lin, T. et al. Microsoft COCO: Common objects in context. CoRR abs/1405.0312 (2014). arXiv:1405.0312.

  40. Khosravian, A., Amirkhani, A., Kashiani, H. & Masih-Tehrani, M. Generalizing state-of-the-art object detectors for autonomous vehicles in unseen environments. Expert Syst. Appl. 183, 115417. https://doi.org/10.1016/j.eswa.2021.115417 (2021).

  41. Giordani, E., Arcioni, L., Gil-Martín, M. & Marini, M. R. Road damage dataset: Potholes, cracks and manholes. https://doi.org/10.5281/zenodo.17834373 (2025).


Acknowledgements

Special thanks are extended to Kaggle for offering accessible GPU resources.

Funding

This work was partially supported by project SERICS (PE00000014) under the MUR National Recovery and Resilience Plan (PNRR) funded by the European Union – Next Generation EU, Mission 4, CUP G23C24000790006 (2024-25). This paper was partially supported by the "Piano Strategico Dipartimentale on Artificial Intelligence" (PSD-AI) project (2022-25) at the University of Udine. This research was partially supported by "Ayudas para estancias de movilidad en el extranjero José Castillejo para jóvenes doctores" from the Ministerio de Ciencia, Innovación y Universidades of Spain. Moreover, it was supported by the ASTOUND project (101071191 HORIZON-EIC-2021-PATHFINDERCHALLENGES-01) funded by the European Commission. In addition, it was supported by the Spanish Ministry of Science and Innovation through the projects BeWord, GOMINOLA, and TRUSTBOOST (PID2021-126061OB-C43, PID2020-118112RB-C21 and PID2020-118112RB-C22, PID2023-150584OB-C21 and PID2023-150584OB-C22), funded by MCIN/AEI/10.13039/501100011033 and by the European Union "NextGenerationEU/PRTR".

Author information

Author notes
  1. These authors contributed equally: Enrico Giordani and Lorenzo Arcioni.

Authors and Affiliations

  1. VisionLab, Department of Computer Science, Sapienza University, Rome, 00198, Italy

    Enrico Giordani, Lorenzo Arcioni & Marco Raoul Marini

  2. Grupo de Tecnología del Habla y Aprendizaje Automático (THAU Group), Information Processing and Telecommunications Center, E.T.S.I. de Telecomunicación, Universidad Politécnica de Madrid (UPM), Madrid, 28040, Spain

    Manuel Gil-Martín

  3. Department of Computer Science, Mathematics and Physics, University of Udine, Via delle Scienze 206, Udine, UD 33100, Italy

    Gian Luca Foresti


Contributions

L.A., E.G., and M.R.M. conceived the methodology; L.A. and E.G. conducted the data collection and performed the baseline experiments; G.L.F. supervised the work; and M.G.-M. wrote the original draft of the manuscript. All authors validated and formally analyzed the dataset and reviewed the final manuscript.

Corresponding author

Correspondence to Marco Raoul Marini.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article


Cite this article

Giordani, E., Arcioni, L., Gil-Martín, M. et al. Real-world road damage dataset with potholes, cracks, and maintenance holes. Sci Rep (2026). https://doi.org/10.1038/s41598-026-46679-4


  • Received: 19 December 2025

  • Accepted: 27 March 2026

  • Published: 01 April 2026

  • DOI: https://doi.org/10.1038/s41598-026-46679-4


Keywords

  • Autonomous driving
  • Object detection
  • Road dataset
  • Cracks and damage identification

Scientific Reports (Sci Rep)

ISSN 2045-2322 (online)
