Abstract
As environmental hazards become more frequent, it is critically important to understand their health impacts and to identify individuals at disproportionately higher risk. Moderated Multiple Regression (MMR) provides a straightforward approach for investigating population heterogeneity by incorporating interaction terms between hazard exposure and population characteristics into a regression model. However, when vulnerabilities are embedded within complex, high-dimensional covariate spaces, MMR often fails to capture this heterogeneity adequately. Here, we introduce a hybrid method, Regression-Guided Neural Networks (ReGNN), which integrates the flexibility of artificial neural networks (ANNs) within the structural form of a regression model. Briefly, ReGNN embeds an ANN inside a regression equation to generate a latent representation that nonlinearly combines potential sources of heterogeneity and moderates the effect of an environmental hazard. Because the outer layer maintains a regression structure, ReGNN preserves traditional interpretability and, when augmented with Double Machine Learning (DML)-style residualization and cross-fitting, delivers statistically robust inference. Through extensive simulation studies, we demonstrate ReGNN’s effectiveness in modeling complex heterogeneous effects. We further illustrate its utility by applying it to investigate population heterogeneity in the association of air pollution (PM2.5) with cognitive functioning scores. By comparing ReGNN’s results with those from traditional MMR models, we show that ReGNN can uncover patterns of heterogeneity that would otherwise remain hidden.
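The contrast between the two model forms described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that). The function names, the one-hidden-layer architecture, the random weights, and the omission of moderator main effects from the ReGNN outer layer are all assumptions made for illustration. MMR multiplies the exposure by each moderator separately, whereas the ReGNN-style model multiplies the exposure by a single latent index produced by a small network.

```python
import numpy as np

def mmr_design(x, Z):
    """MMR design matrix: intercept, exposure, moderators, exposure-by-moderator products."""
    x = x.reshape(-1, 1)
    return np.hstack([np.ones_like(x), x, Z, x * Z])

def index_network(Z, W1, b1, w2, b2):
    """Hypothetical one-hidden-layer network mapping moderators Z to a scalar latent index."""
    hidden = np.tanh(Z @ W1 + b1)
    return hidden @ w2 + b2

def regnn_predict(x, Z, beta0, beta_x, beta_int, net_params):
    """Regression-style outer layer: the exposure effect is moderated by the latent index.

    Assumption: moderator main effects are left out to keep the sketch short.
    """
    index = index_network(Z, *net_params)
    return beta0 + beta_x * x + beta_int * x * index

rng = np.random.default_rng(0)
n, p, h = 200, 5, 8                      # sample size, moderators, hidden units (illustrative)
x = rng.normal(size=n)                   # exposure, e.g. standardized PM2.5
Z = rng.normal(size=(n, p))              # candidate sources of heterogeneity
net_params = (rng.normal(size=(p, h)), np.zeros(h), rng.normal(size=h), 0.0)

X_mmr = mmr_design(x, Z)                 # one interaction column per moderator
y_hat = regnn_predict(x, Z, 0.5, -0.2, 0.3, net_params)

# Person-specific exposure effect implied by the outer regression layer
effect = -0.2 + 0.3 * index_network(Z, *net_params)
```

In this form the outer coefficients (`beta_x`, `beta_int`) keep their usual regression reading, while `effect` shows how the exposure slope varies across individuals through the latent index.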
Code availability
Code and detailed instructions for running the analyses are available on GitHub: https://github.com/njw0709/ReGNN.
Funding
This work was supported by the USC/UCLA Center on Biodemography and Population Health through grants from the National Institute on Aging, National Institutes of Health (grant numbers P30AG017265 and T32AG000037 to J.A.A. and K99AG090817 to E.Y.C.) and by the Alzheimer’s Association (grant number AARF251473227 to E.Y.C.).
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Nam, J.W., Choi, E.Y., Ailshire, J.A. et al. Unveiling population heterogeneity in health risks posed by environmental hazards using regression-guided neural network. Sci Rep (2026). https://doi.org/10.1038/s41598-026-54345-y
DOI: https://doi.org/10.1038/s41598-026-54345-y


