Scientific Reports
Quantifying mean, variability, and uncertainty in indoor radon exposure in Pennsylvania using random forest and quantile regression forest models
  • Article
  • Open access
  • Published: 05 March 2026

  • Heechan Lee1,2,3,
  • Dakotah Maguire2,
  • Jeremy Logan4,
  • Greeshma Agasthya1,
  • Shaheen Dewji1 &
  • Heidi A. Hanson2

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Environmental sciences
  • Risk factors

Abstract

Radon is a naturally occurring radioactive gas that poses a serious health risk as the leading cause of lung cancer in non-smokers. Despite this well-established association with adverse health outcomes, current radon exposure assessments are limited to county-level or average-level estimates, which fail to capture regional variability. This study uses machine learning models, including Random Forest (RF) and Quantile Regression Forest (QRF), to estimate indoor radon concentrations at the ZIP Code Tabulation Area (ZCTA) level and to characterize uncertainties in the model estimates. Incorporating geological, meteorological, and building-specific data, the models aim to improve radon risk assessment by capturing mean exposure, variability, and extreme concentration levels. Processed radon test data (n = 718,111) were analyzed using average, variability, and quantile prediction methods. Models that estimate average radon exposure at the ZCTA level can yield promising model-fit results, but they do not capture the underlying variability of indoor radon exposure within a ZCTA. We use volatility analyses to identify characteristics indicative of high variability in indoor radon exposure. We also show that a QRF model can estimate upper quantiles of residential radon exposure, thereby uncovering localized areas of elevated exposure that are not apparent in mean estimates. The results highlight the need for a deeper characterization of exposure risk and show that regions with moderate average exposure levels can still harbor extreme outliers, with implications for evaluating health risks. Using multiple radon exposure models allows a fuller characterization of radon risk within a geographic area and better identifies high-risk areas. The results provide a foundation for developing mitigation strategies and for examining associations between radon exposure and health outcomes at fine spatial scales. Future research should extend the geographic scope and incorporate additional environmental risk factors to establish a comprehensive framework for risk assessment.
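The distinction the abstract draws between mean exposure, within-area variability, and upper quantiles can be illustrated with a small descriptive sketch. This is not the authors' RF or QRF pipeline: it only computes empirical summaries per area. The ZCTA codes, the test values, the pCi/L unit, and the `summarize` helper are all hypothetical and chosen so that two areas share the same mean.

```python
import statistics

# Hypothetical radon test results (assumed unit: pCi/L), keyed by made-up ZCTA codes.
tests_by_zcta = {
    "00001": [3.0, 3.2, 2.9, 3.1, 3.0, 2.8, 3.1, 3.0],   # homogeneous area
    "00002": [0.8, 1.0, 0.9, 1.1, 0.7, 1.2, 0.9, 17.5],  # same mean, one extreme home
}


def summarize(values):
    """Return the mean, coefficient of variation, and empirical 95th percentile."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = statistics.quantiles(values, n=20, method="inclusive")[-1]
    return {"mean": mean, "cv": sd / mean, "p95": p95}


for zcta, values in tests_by_zcta.items():
    s = summarize(values)
    print(f"{zcta}: mean={s['mean']:.2f}  cv={s['cv']:.2f}  p95={s['p95']:.2f}")
```

In this toy data both areas average about 3.0, yet the 95th percentile is about 3.2 in the first and about 11.8 in the second, which a mean-only map would not reveal. A QRF model targets the conditional analogue of this upper quantile, predicted from covariates instead of read off observed tests.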

Data availability

All datasets used in this study are publicly available from the original data providers: indoor radon measurements from the Pennsylvania Department of Environmental Protection (PA DEP), elevation from the USGS GMTED2010 product, soil characteristics from the USDA NRCS gNATSGO database, geochemical variables from the USGS Geochemical and Mineralogical Survey, hydrologic landscape data from USGS, meteorological variables from the Daymet database, and demographic and housing characteristics from the U.S. Census Bureau (Decennial Census and American Community Survey). Detailed information on data sources and preprocessing workflows is provided in the methods paper (ref. 44). Data are available from the authors upon reasonable request.

References

  1. Wall, B. F. Ionising Radiation Exposure of the Population of the United States: NCRP Report No. 160 (Oxford University Press, 2009).

  2. World Health Organization. WHO Handbook on Indoor Radon: a Public Health Perspective (World Health Organization, 2009).

  3. Tirmarche, M. et al. ICRP publication 115. Lung cancer risk from radon and progeny and statement on radon. Ann. ICRP. 40, 1–64 (2010).

  4. Dong, S. et al. Synergistic effects of particle radioactivity (gross beta activity) and particulate matter ≤2.5 μm aerodynamic diameter on cardiovascular disease mortality. J. Am. Heart Assoc. 11, e025470. https://doi.org/10.1161/JAHA.121.025470 (2022).

  5. Kim, S. H., Park, J. M. & Kim, H. The prevalence of stroke according to indoor radon concentration in South Koreans: nationwide cross section study. Med. (Baltim). 99, e18859. https://doi.org/10.1097/MD.0000000000018859 (2020).

  6. Lee, H. et al. Evaluating county-level lung cancer incidence from environmental radiation exposure, PM2.5, and other exposures with regression and machine learning models. Environ. Geochem. Health. 46, 82. https://doi.org/10.1007/s10653-023-01820-4 (2024).

  7. Al-Zoughool, M. & Krewski, D. Health effects of radon: a review of the literature. Int. J. Radiat. Biol. 85, 57–69. https://doi.org/10.1080/09553000802635054 (2009).

  8. National Research Council. Health Effects of Exposure to Radon: BEIR VI (National Academies, 1999).

  9. Kang, J. K., Seo, S. & Jin, Y. W. Health effects of radon exposure. Yonsei Med. J. 60, 597–603. https://doi.org/10.3349/ymj.2019.60.7.597 (2019).

  10. Richardson, D. B. et al. Mortality among uranium miners in North America and Europe: the pooled uranium miners analysis (PUMA). Int. J. Epidemiol. 50, 633–643. https://doi.org/10.1093/ije/dyaa195 (2021).

  11. Lagarde, F. et al. Glass-based radon-exposure assessment and lung cancer risk. J. Expo. Sci. Environ. Epidemiol. 12, 344–354 (2002).

  12. Park, N. W., Kim, Y., Chang, B. U. & Kwak, G. H. County-level indoor radon concentration mapping and uncertainty assessment in South Korea using geostatistical simulation and environmental factors. J. Environ. Radioact. 208, 106044 (2019).

  13. Fujimoto, K. & Sanada, T. Dependence of indoor radon concentration on the year of house construction. Health Phys. 77, 410–419 (1999).

  14. Smith, B. J. & Field, R. W. Effect of housing factors and surficial uranium on the spatial prediction of residential radon in Iowa. Environmetrics 18, 481–497. https://doi.org/10.1002/env.816 (2006).

  15. Abergel, R. et al. The enduring legacy of Marie Curie: impacts of radium in 21st century radiological and medical sciences. Int. J. Radiat. Biol. 98, 267–275. https://doi.org/10.1080/09553002.2022.2027542 (2022).

  16. Gundersen, L. C. et al. Geology of radon in the United States. (1992).

  17. Otton, J. K. The geology of radon. (1992).

  18. Bulut, H. A. & Şahin, R. Radon, concrete, buildings and human health—A review study. Buildings 14, 510 (2024).

  19. Mustonen, R. Natural radioactivity in and radon exhalation from Finnish building materials. Health Phys. 46, 1195–1203 (1984).

  20. Marcinowski, F., Lucas, R. M. & Yeager, W. M. National and regional distributions of airborne radon concentrations in US homes. Health Phys. 66, 699–706 (1994).

  21. Yazzie, S. A., Davis, S., Seixas, N. & Yost, M. G. Assessing the impact of housing features and environmental factors on home indoor radon concentration levels on the Navajo Nation. Int. J. Environ. Res. Public. Health. 17 https://doi.org/10.3390/ijerph17082813 (2020).

  22. Sun, K., Guo, Q. & Cheng, J. The effect of some soil characteristics on soil radon concentration and radon exhalation from soil surface. J. Nucl. Sci. Technol. 41, 1113–1117. https://doi.org/10.1080/18811248.2004.9726337 (2004).

  23. Mose, D. G. & Mushrush, G. W. Prediction of indoor radon based on soil radon and soil permeability. J. Environ. Sci. Health Part. A. 34, 1253–1266. https://doi.org/10.1080/10934529909376894 (1999).

  24. Hassan, N. M. et al. Radon migration process and its influence factors; review. Japanese J. Health Phys. 44, 218–231 (2009).

  25. Khattak, N., Khan, M. A., Ali, N. & Abbas, S. M. Radon monitoring for geological exploration: A review. J. Himal. Earth Sci. 44, 91–102 (2011).

  26. Nunes, L. J. R., Curado, A., Graca, L., Soares, S. & Lopes, S. I. Impacts of indoor radon on health: A comprehensive review on causes, assessment and remediation strategies. Int. J. Environ. Res. Public. Health. 19 https://doi.org/10.3390/ijerph19073929 (2022).

  27. Şen, G. Y., İçhedef, M., Saç, M. M. & Yener, G. Effect of natural gas usage on indoor radon levels. J. Radioanal. Nucl. Chem. 295, 277–282. https://doi.org/10.1007/s10967-012-1841-8 (2012).

  28. Yang, J. et al. Modeling of radon exhalation from soil influenced by environmental parameters. Sci. Total Environ. 656, 1304–1311. https://doi.org/10.1016/j.scitotenv.2018.11.464 (2019).

  29. Bochicchio, F. et al. Annual average and seasonal variations of residential radon concentration for all the Italian regions. Radiat. Meas. 40, 686–694 (2005).

  30. Miles, J. C., Howarth, C. B. & Hunter, N. Seasonal variation of radon concentrations in UK homes. J. Radiol. Prot. 32, 275–287. https://doi.org/10.1088/0952-4746/32/3/275 (2012).

  31. Porstendorfer, J., Butterweck, G. & Reineking, A. Daily variation of the radon concentration indoors and outdoors and the influence of meteorological parameters. Health Phys. 67, 283–287 (1994).

  32. Rey, J. F. et al. Long-term impacts of weather conditions on indoor radon concentration measurements in Switzerland. Atmosphere 13, 92 (2022).

  33. U.S. Environmental Protection Agency. EPA Maps of Radon Zones and Supporting Documents by State.

  34. Price, P. Predictions and maps of county mean indoor radon concentrations in the mid-Atlantic States. Health Phys. 72, 893–906 (1997).

  35. Price, P. N., Nero, A. V. & Gelman, A. Bayesian prediction of mean indoor radon concentrations for Minnesota counties. Health Phys. 71, 922–936 (1996).

  36. Apte, M., Price, P., Nero, A. & Revzan, K. Predicting New Hampshire indoor radon concentrations from geologic information and other covariates. Environ. Geol. 37, 181–194 (1999).

  37. Casey, J. A. et al. Predictors of indoor radon concentrations in Pennsylvania, 1989–2013. Environ. Health Perspect. 123, 1130–1137. https://doi.org/10.1289/ehp.1409014 (2015).

  38. Kropat, G. et al. Improved predictive mapping of indoor radon concentrations using ensemble regression trees based on automatic clustering of geological units. J. Environ. Radioact. 147, 51–62. https://doi.org/10.1016/j.jenvrad.2015.05.006 (2015).

  39. Nikkila, A. et al. Predicting residential radon concentrations in Finland: model development, validation, and application to childhood leukemia. Scand. J. Work Environ. Health. 46, 278–292. https://doi.org/10.5271/sjweh.3867 (2020).

  40. Dai, D. et al. Confluent impact of housing and geology on indoor radon concentrations in Atlanta, Georgia, United States. Sci. Total Environ. 668, 500–511. https://doi.org/10.1016/j.scitotenv.2019.02.257 (2019).

  41. Li, L. et al. Predicting monthly community-level domestic radon concentrations in the greater Boston area with an ensemble learning model. Environ. Sci. Technol. 55, 7157–7166. https://doi.org/10.1021/acs.est.0c08792 (2021).

  42. Hanson, H. A. et al. Centralized health and exposomic resource (C-HER): Analytic and AI-Ready Data for external exposomic research. Preprint at https://arXiv.org/abs/2511.03750 (2025).

  43. Uber. H3: Uber’s Hexagonal Hierarchical Spatial Index (2018). https://www.uber.com/blog/h3/

  44. Maguire, D., Logan, J., Lee, H. & Hanson, H. Radon exposure dataset. Preprint at https://arXiv.org/abs/2505.09489 (2025).

  45. Van den Bossche, J. et al. geopandas/geopandas: v1.1.1. Zenodo https://doi.org/10.5281/zenodo.15750510 (2025).

  46. The Matplotlib Development Team. Matplotlib: Visualization with Python (v3.10.7). Zenodo (2025). https://doi.org/10.5281/zenodo.17298696

  47. U.S. Census Bureau, U.S. Department of Commerce. 2020 TIGER/Line Shapefiles (2020). https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.2020.html

  48. Pennsylvania Department of Environmental Protection. Radon Test Results September 1986 - Current Annual County Environmental Protection (October 13, 2023). https://data.pa.gov/Energy-and-the-Environment/Radon-Test-Results-September-1986-Current-Annual-C/vkjb-sx3k

  49. Health Resources and Services Administration. UDS Mapper (2023). http://www.udsmapper.org/

  50. Weber, E. et al. LandScan USA (Oak Ridge National Laboratory, 2022).

  51. Danielson, J. J. & Gesch, D. B. Global multi-resolution Terrain Elevation Data 2010 (GMTED2010). Report No. 2331-1258 (US Geological Survey, 2011).

  52. Dahn. H3-Pandas (2021). https://h3-pandas.readthedocs.io/en/latest/

  53. Soil Survey Staff. Gridded National Soil Survey Geographic (gNATSGO) Database for Pennsylvania (2017). https://nrcs.app.box.com/v/soils

  54. Smith, D. B., Solano, F., Woodruff, L. G., Cannon, W. F. & Ellefsen, K. J. Geochemical and mineralogical maps, with interpretation, for soils of the conterminous United States. Report. Reston, VA (2019).

  55. Wieczorek, M. E. & a., L. A.E. (U.S. Geological Survey data release, 2010).

  56. Thornton, M. et al. Daymet: Daily surface weather data on a 1-km grid for North America, version 4 R1. ORNL DAAC, Oak Ridge, Tennessee, USA. Single Pixel Extraction Tool | Daymet (ornl.gov) (2022).

  57. Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).

  58. Rey, S. J. & Anselin, L. In Handbook of Applied Spatial Analysis: Software tools, Methods and Applications 175–193 (Springer, 2009).

  59. U.S. Census Bureau. House Heating Fuel [10]. Decennial Census, DEC State Legislative District Summary File (Sample), Table H040 (2001). https://data.census.gov/table/DECENNIALSLDS.H040?q=house+heating+fuel

  60. U.S. Census Bureau, U.S. Department of Commerce. House Heating Fuel. American Community Survey, ACS 5-Year Estimates Detailed Tables, Table B25040 (2021). https://data.census.gov/table/ACSDT5Y2023.B25040?q=house+heating+fuel

  61. Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).

  62. Johnson, R. A. quantile-forest: A Python package for quantile regression forests. J. Open. Source Softw. 9, 5976 (2024).

  63. Meinshausen, N. & Ridgeway, G. Quantile regression forests. J. Mach. Learn. Res. 7 (2006).

  64. Vaysse, K. & Lagacherie, P. Using quantile regression forest to estimate uncertainty of digital soil mapping products. Geoderma 291, 55–64 (2017).

  65. Maxwell, K., Rajabi, M. & Esterle, J. Spatial interpolation of coal properties using geographic quantile regression forest. Int. J. Coal Geol. 248, 103869 (2021).

  66. Lewis, C. D. Industrial and business forecasting methods: A practical guide to exponential smoothing and curve fitting. (1982).

  67. Moriasi, D. N., Gitau, M. W., Pai, N. & Daggupati, P. Hydrologic and water quality models: performance measures and evaluation criteria. Trans. ASABE. 58, 1763–1785 (2015).

  68. Ajrouche, R. et al. Quantitative health risk assessment of indoor radon: a systematic review. Radiat. Prot. Dosimetry. 177, 69–77 (2017).

  69. Lubin, J. H. & Boice, J. D. Jr Lung cancer risk from residential radon: meta-analysis of eight epidemiologic studies. J. Natl Cancer Inst. 89, 49–57 (1997).

Acknowledgements

This work was supported by the Office of Biological and Environmental Research’s Biological Systems Science Division. This manuscript has been authored by UT-Battelle LLC under Contract No. DE-AC05-00OR22725 with the US Department of Energy and Award AWD-002827 between UT-Battelle and the Georgia Tech Research Corporation. This research used resources of CADES at the Oak Ridge National Laboratory, which is supported by the US Department of Energy’s Office of Science under Contract No. DE-AC05-00OR22725.

Author information

Authors and Affiliations

  1. Nuclear and Radiological Engineering and Medical Physics Programs, George W. Woodruff School of Mechanical Engineering, Georgia Institute of Technology, 770 State Street, Atlanta, GA, 30332, USA

    Heechan Lee, Greeshma Agasthya & Shaheen Dewji

  2. Advanced Computing for Health Sciences Section, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, TN, 37830, USA

    Heechan Lee, Dakotah Maguire & Heidi A. Hanson

  3. Department of Environmental and Occupational Health, University of California, Irvine, 856 Health Sciences Quad, Irvine, CA, 92697, USA

    Heechan Lee

  4. Data Engineering Group, Data and AI Section, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, TN, 37830, USA

    Jeremy Logan

Contributions

HL conducted the study design, processed the data, performed the analyses, interpreted the results, drafted the manuscript, and contributed to its editing. DM and JL conducted analysis design, data processing, and manuscript editing. GA and SD contributed to the study design and manuscript revisions. HH was responsible for study design, project supervision, interpretation of results, and critical revision of the manuscript.

Corresponding author

Correspondence to Heidi A. Hanson.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (DOCX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Lee, H., Maguire, D., Logan, J. et al. Quantifying mean, variability, and uncertainty in indoor radon exposure in Pennsylvania using random forest and quantile regression forest models. Sci Rep (2026). https://doi.org/10.1038/s41598-026-37891-3

  • Received: 09 June 2025

  • Accepted: 27 January 2026

  • Published: 05 March 2026

  • DOI: https://doi.org/10.1038/s41598-026-37891-3

Keywords

  • Radon
  • Geology
  • Prediction model
  • Machine learning
  • ZCTA-level predictions
  • Environmental health
  • Exposome