Abstract
Radon is a naturally occurring radioactive gas that poses a serious health risk as the primary cause of lung cancer in non-smokers. Despite its well-known adverse association with health outcomes, current radon exposure assessments are limited to county-level or average-level estimates, which fail to capture regional variability. This study uses machine learning models, including Random Forest (RF) and Quantile Regression Forest (QRF), to estimate indoor radon concentrations at the ZIP Code Tabulation Area (ZCTA) level and to characterize uncertainties in model estimates. Incorporating geological, meteorological, and building-specific data, the models aim to improve radon risk assessment by capturing mean exposure, variability, and extreme concentration levels. Processed radon test data (n = 718,111) were analyzed using average, variability, and quantile prediction methods. Models that estimate the average radon exposure at the ZCTA level can yield promising model-fit results, but they do not capture the underlying variability of indoor radon exposure within a ZCTA. We use volatility analyses to identify characteristics indicative of high variability in indoor radon exposure. We also show that a QRF model can be used to estimate upper quantiles of residential radon exposure, thereby uncovering localized areas of elevated exposure that were not apparent in mean estimates. The results highlight the need for a deep characterization of exposure risk and show that regions with moderate average exposure levels can still harbor extreme outliers, with implications for evaluating health risks. Using multiple radon exposure models allows for a deeper characterization of radon risk within a geographic area and can better identify high-risk areas. The results from this study provide a foundation for developing mitigation strategies and examining associations between radon exposure and health outcomes at fine scales.
Future research should extend the geographic scope and incorporate additional environmental risk factors to establish a comprehensive framework for risk assessment.
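The contrast between mean and upper-quantile summaries described in the abstract can be illustrated with a minimal sketch. It is not the study's RF or QRF pipeline: the ZCTA labels, the radon values, and the helper `empirical_quantile` are all hypothetical and chosen only to show how two areas with the same mean can differ sharply in their upper tail.

```python
from statistics import mean


def empirical_quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)          # fractional index into the sorted data
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])


# Hypothetical radon test results per ZCTA (illustrative values, not study data).
tests_by_zcta = {
    "ZCTA_A": [2.0] * 10,                 # uniform exposure
    "ZCTA_B": [0.5] * 8 + [8.0, 8.0],     # mostly low, with a few high readings
}

# Summarize each area by its mean and by an upper quantile (90th percentile).
summary = {
    zcta: {"mean": mean(vals), "q90": empirical_quantile(vals, 0.90)}
    for zcta, vals in tests_by_zcta.items()
}

for zcta, stats in summary.items():
    print(zcta, stats)
```

In this toy data both areas share the same mean, so a mean-only map would rank them equally, while the 90th percentile separates them. A QRF generalizes this idea by estimating such conditional quantiles from predictors instead of computing them directly from the observed tests.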
Data availability
All datasets used in this study are publicly available from the original data providers: indoor radon measurements from the Pennsylvania Department of Environmental Protection (PA DEP), elevation from the USGS GMTED2010 product, soil characteristics from the USDA NRCS gNATSGO database, geochemical variables from the USGS Geochemical and Mineralogical Survey, hydrologic landscape data from USGS, meteorological variables from the Daymet database, and demographic and housing characteristics from the U.S. Census Bureau (Decennial Census and American Community Survey). Detailed information on data sources and preprocessing workflows is provided in the method paper (ref. 44). Data are available from the authors upon reasonable request.
References
Wall, B. F. Ionising Radiation Exposure of the Population of the United States: NCRP Report No. 160 (Oxford University Press, 2009).
World Health Organization. WHO Handbook on Indoor Radon: a Public Health Perspective (World Health Organization, 2009).
Tirmarche, M. et al. ICRP publication 115. Lung cancer risk from radon and progeny and statement on radon. Ann. ICRP. 40, 1–64 (2010).
Dong, S. et al. Synergistic effects of particle radioactivity (gross beta activity) and particulate matter ≤2.5 μm aerodynamic diameter on cardiovascular disease mortality. J. Am. Heart Assoc. 11, e025470. https://doi.org/10.1161/JAHA.121.025470 (2022).
Kim, S. H., Park, J. M. & Kim, H. The prevalence of stroke according to indoor radon concentration in South Koreans: nationwide cross section study. Med. (Baltim). 99, e18859. https://doi.org/10.1097/MD.0000000000018859 (2020).
Lee, H. et al. Evaluating county-level lung cancer incidence from environmental radiation exposure, PM(2.5), and other exposures with regression and machine learning models. Environ. Geochem. Health. 46, 82. https://doi.org/10.1007/s10653-023-01820-4 (2024).
Al-Zoughool, M. & Krewski, D. Health effects of radon: a review of the literature. Int. J. Radiat. Biol. 85, 57–69. https://doi.org/10.1080/09553000802635054 (2009).
National Research Council. Health Effects of Exposure to Radon: BEIR VI (National Academies, 1999).
Kang, J. K., Seo, S. & Jin, Y. W. Health effects of radon exposure. Yonsei Med. J. 60, 597–603. https://doi.org/10.3349/ymj.2019.60.7.597 (2019).
Richardson, D. B. et al. Mortality among uranium miners in North America and Europe: the pooled uranium miners analysis (PUMA). Int. J. Epidemiol. 50, 633–643. https://doi.org/10.1093/ije/dyaa195 (2021).
Lagarde, F. et al. Glass-based radon-exposure assessment and lung cancer risk. J. Expo. Sci. Environ. Epidemiol. 12, 344–354 (2002).
Park, N. W., Kim, Y., Chang, B. U. & Kwak, G. H. County-level indoor radon concentration mapping and uncertainty assessment in South Korea using Geostatistical simulation and environmental factors. J. Environ. Radioact. 208, 106044 (2019).
Fujimoto, K. & Sanada, T. Dependence of indoor radon concentration on the year of house construction. Health Phys. 77, 410–419 (1999).
Smith, B. J. & Field, R. W. Effect of housing factors and surficial uranium on the spatial prediction of residential radon in Iowa. Environmetrics 18, 481–497. https://doi.org/10.1002/env.816 (2006).
Abergel, R. et al. The enduring legacy of Marie Curie: impacts of radium in 21st century radiological and medical sciences. Int. J. Radiat. Biol. 98, 267–275. https://doi.org/10.1080/09553002.2022.2027542 (2022).
Gundersen, L. C. et al. Geology of radon in the United States. (1992).
Otton, J. K. The geology of radon. (1992).
Bulut, H. A. & Şahin, R. Radon, concrete, buildings and human health – a review study. Buildings 14, 510 (2024).
Mustonen, R. Natural radioactivity in and radon exhalation from Finnish building materials. Health Phys. 46, 1195–1203 (1984).
Marcinowski, F., Lucas, R. M. & Yeager, W. M. National and regional distributions of airborne radon concentrations in US homes. Health Phys. 66, 699–706 (1994).
Yazzie, S. A., Davis, S., Seixas, N. & Yost, M. G. Assessing the impact of housing features and environmental factors on home indoor radon concentration levels on the Navajo Nation. Int. J. Environ. Res. Public. Health. 17 https://doi.org/10.3390/ijerph17082813 (2020).
Sun, K., Guo, Q. & Cheng, J. The effect of some soil characteristics on soil radon concentration and radon exhalation from soil surface. J. Nucl. Sci. Technol. 41, 1113–1117. https://doi.org/10.1080/18811248.2004.9726337 (2004).
Mose, D. G. & Mushrush, G. W. Prediction of indoor radon based on soil radon and soil permeability. J. Environ. Sci. Health Part. A. 34, 1253–1266. https://doi.org/10.1080/10934529909376894 (1999).
Hassan, N. M. et al. Radon migration process and its influence factors; review. Japanese J. Health Phys. 44, 218–231 (2009).
Khattak, N., Khan, M. A., Ali, N. & Abbas, S. M. Radon monitoring for geological exploration: A review. J. Himal. Earth Sci. 44, 91–102 (2011).
Nunes, L. J. R., Curado, A., Graca, L., Soares, S. & Lopes, S. I. Impacts of indoor radon on health: A comprehensive review on causes, assessment and remediation strategies. Int. J. Environ. Res. Public. Health. 19 https://doi.org/10.3390/ijerph19073929 (2022).
Şen, G. Y., Içhedef, M., Saç, M. M. & Yener, G. Effect of natural gas usage on indoor radon levels. J. Radioanal. Nucl. Chem. 295, 277–282. https://doi.org/10.1007/s10967-012-1841-8 (2012).
Yang, J. et al. Modeling of radon exhalation from soil influenced by environmental parameters. Sci. Total Environ. 656, 1304–1311. https://doi.org/10.1016/j.scitotenv.2018.11.464 (2019).
Bochicchio, F. et al. Annual average and seasonal variations of residential radon concentration for all the Italian regions. Radiat. Meas. 40, 686–694 (2005).
Miles, J. C., Howarth, C. B. & Hunter, N. Seasonal variation of radon concentrations in UK homes. J. Radiol. Prot. 32, 275–287. https://doi.org/10.1088/0952-4746/32/3/275 (2012).
Porstendorfer, J., Butterweck, G. & Reineking, A. Daily variation of the radon concentration indoors and outdoors and the influence of meteorological parameters. Health Phys. 67, 283–287 (1994).
Rey, J. F. et al. Long-term impacts of weather conditions on indoor radon concentration measurements in Switzerland. Atmosphere 13, 92 (2022).
U.S. Environmental Protection Agency. EPA Maps of Radon Zones and Supporting Documents by State.
Price, P. Predictions and maps of county mean indoor radon concentrations in the mid-Atlantic States. Health Phys. 72, 893–906 (1997).
Price, P. N., Nero, A. V. & Gelman, A. Bayesian prediction of mean indoor radon concentrations for Minnesota counties. Health Phys. 71, 922–936 (1996).
Apte, M., Price, P., Nero, A. & Revzan, K. Predicting New Hampshire indoor radon concentrations from geologic information and other covariates. Environ. Geol. 37, 181–194 (1999).
Casey, J. A. et al. Predictors of indoor radon concentrations in Pennsylvania, 1989–2013. Environ. Health Perspect. 123, 1130–1137. https://doi.org/10.1289/ehp.1409014 (2015).
Kropat, G. et al. Improved predictive mapping of indoor radon concentrations using ensemble regression trees based on automatic clustering of geological units. J. Environ. Radioact. 147, 51–62. https://doi.org/10.1016/j.jenvrad.2015.05.006 (2015).
Nikkila, A. et al. Predicting residential radon concentrations in Finland: model development, validation, and application to childhood leukemia. Scand. J. Work Environ. Health. 46, 278–292. https://doi.org/10.5271/sjweh.3867 (2020).
Dai, D. et al. Confluent impact of housing and geology on indoor radon concentrations in Atlanta, Georgia, United States. Sci. Total Environ. 668, 500–511. https://doi.org/10.1016/j.scitotenv.2019.02.257 (2019).
Li, L. et al. Predicting monthly community-level domestic radon concentrations in the greater Boston area with an ensemble learning model. Environ. Sci. Technol. 55, 7157–7166. https://doi.org/10.1021/acs.est.0c08792 (2021).
Hanson, H. A. et al. Centralized health and exposomic resource (C-HER): Analytic and AI-Ready Data for external exposomic research. Preprint at https://arXiv.org/abs/2511.03750 (2025).
Uber. H3: Uber's Hexagonal Hierarchical Spatial Index (2018). https://www.uber.com/blog/h3/
Maguire, D., Logan, J., Lee, H. & Hanson, H. Radon exposure dataset. Preprint at https://arXiv.org/abs/2505.09489 (2025).
Van den Bossche, J. et al. geopandas/geopandas: v1.1.1. Zenodo https://doi.org/10.5281/zenodo.15750510 (2025).
The Matplotlib Development Team. Matplotlib: visualization with Python (v3.10.7). Zenodo https://doi.org/10.5281/zenodo.17298696 (2025).
U.S. Census Bureau, U.S. Department of Commerce. 2020 TIGER/Line Shapefiles (2020). https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.2020.html
Pennsylvania Department of Environmental Protection. Radon Test Results September 1986 - Current Annual County Environmental Protection (October 13, 2023). https://data.pa.gov/Energy-and-the-Environment/Radon-Test-Results-September-1986-Current-Annual-C/vkjb-sx3k
Health Resources and Services Administration. UDS Mapper (2023). http://www.udsmapper.org/
Weber, E. et al. LandScan USA (Oak Ridge National Laboratory, 2022).
Danielson, J. J. & Gesch, D. B. Global multi-resolution Terrain Elevation Data 2010 (GMTED2010). Report No. 2331-1258 (US Geological Survey, 2011).
Dahn. H3-Pandas (2021). https://h3-pandas.readthedocs.io/en/latest/
Soil Survey Staff. Gridded National Soil Survey Geographic (gNATSGO) Database for Pennsylvania (2017). https://nrcs.app.box.com/v/soils
Smith, D. B., Solano, F., Woodruff, L. G., Cannon, W. F. & Ellefsen, K. J. Geochemical and mineralogical maps, with interpretation, for soils of the conterminous United States. Report. Reston, VA (2019).
Wieczorek, M. E. & a., L. A.E. (U.S. Geological Survey data release, 2010).
Thornton, M. et al. Daymet: Daily surface weather data on a 1-km grid for North America, version 4 R1. ORNL DAAC, Oak Ridge, Tennessee, USA. Single Pixel Extraction Tool| Daymet (ornl. gov) (2022).
Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).
Rey, S. J. & Anselin, L. In Handbook of Applied Spatial Analysis: Software tools, Methods and Applications 175–193 (Springer, 2009).
U.S. Census Bureau. HOUSE HEATING FUEL [10]. Decennial Census, DEC State Legislative District Summary File (Sample), Table H040 (2001). https://data.census.gov/table/DECENNIALSLDS.H040?q=house+heating+fuel
U.S. Census Bureau, U.S. Department of Commerce. House Heating Fuel. American Community Survey, ACS 5-Year Estimates Detailed Tables, Table B25040 (2021). https://data.census.gov/table/ACSDT5Y2023.B25040?q=house+heating+fuel
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Johnson, R. A. quantile-forest: A Python package for quantile regression forests. J. Open. Source Softw. 9, 5976 (2024).
Meinshausen, N. & Ridgeway, G. Quantile regression forests. J. Mach. Learn. Res. 7 (2006).
Vaysse, K. & Lagacherie, P. Using quantile regression forest to estimate uncertainty of digital soil mapping products. Geoderma 291, 55–64 (2017).
Maxwell, K., Rajabi, M. & Esterle, J. Spatial interpolation of coal properties using geographic quantile regression forest. Int. J. Coal Geol. 248, 103869 (2021).
Lewis, C. D. Industrial and business forecasting methods: A practical guide to exponential smoothing and curve fitting (1982).
Moriasi, D. N., Gitau, M. W., Pai, N. & Daggupati, P. Hydrologic and water quality models: performance measures and evaluation criteria. Trans. ASABE. 58, 1763–1785 (2015).
Ajrouche, R. et al. Quantitative health risk assessment of indoor radon: a systematic review. Radiat. Prot. Dosimetry. 177, 69–77 (2017).
Lubin, J. H. & Boice, J. D. Jr Lung cancer risk from residential radon: meta-analysis of eight epidemiologic studies. J. Natl Cancer Inst. 89, 49–57 (1997).
Acknowledgements
This work was supported by the Office of Biological and Environmental Research's Biological Systems Science Division. This manuscript has been authored by UT-Battelle LLC under Contract No. DE-AC05-00OR22725 with the US Department of Energy and Award AWD-002827 between UT-Battelle and the Georgia Tech Research Corporation. This research used resources of CADES at the Oak Ridge National Laboratory, which is supported by the US Department of Energy's Office of Science under Contract No. DE-AC05-00OR22725.
Author information
Contributions
HL conducted the study design, processed the data, performed the analyses, interpreted the results, drafted the manuscript, and contributed to its editing. DM and JL conducted analysis design, data processing, and manuscript editing. GA and SD contributed to the study design and manuscript revisions. HH was responsible for study design, project supervision, interpretation of results, and critical revision of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Lee, H., Maguire, D., Logan, J. et al. Quantifying mean, variability, and uncertainty in indoor radon exposure in Pennsylvania using random forest and quantile regression forest models. Sci Rep (2026). https://doi.org/10.1038/s41598-026-37891-3
DOI: https://doi.org/10.1038/s41598-026-37891-3


