Introduction

Breast cancer is the most commonly diagnosed malignancy in women worldwide. Well-known risk factors include genetic, behavioral, and reproductive determinants; however, disease incidence continues to rise even without a comparable increase in the prevalence of known risk factors1,2. This points to a poorly understood multifactorial disease etiology, with additional undefined factors likely at play. Recent literature suggests environmental exposures and chemical pollutants may also contribute to breast cancer risk and aggressiveness, with some studies reporting associations between increased breast cancer rates and (1) polycyclic aromatic hydrocarbons (PAHs), (2) nitrogen dioxide (NO2), and (3) fine particulate matter (PM) toxins3,4,5,6,7,8,9,10,11,12,13,14.

However, many of these studies focus on occupational-based exposures rather than those that may occur in other important social contexts, such as area of residence. The few studies that have observed a possible connection between residential exposures to chemicals and increased breast cancer risk15,16,17,18,19,20 are often limited in scope and may not be generalizable to the broader population19,20; for example, one study looking at the impact of hazardous air pollutants on the risk of invasive breast cancer was composed of a patient population that was 95% White and 72% premenopausal20.

Assessing the impact of environmental chemical exposures on breast cancer is important because highly polluted and contaminated regions have been linked with numerous adverse health outcomes. For example, “Cancer Alley”, a region in Louisiana known for its high density of petrochemical industry and industrial air emissions, reports cumulative cancer incidence rates up to 150% higher than levels statewide, across the greater Mississippi Delta Region, and nationally21. These findings are important in the context of breast cancer because many environmental and chemical pollutants, including microplastics and those found in and around superfund and harmful waste sites, such as benzenes, chloroethenes, heavy metals, polychlorinated biphenyls (PCBs), and PAHs, are known endocrine and estrogen disruptors. These chemicals are absorbed into the bloodstream and enter adipose-dense tissues, releasing reactive oxygen species (ROS)-metabolites that alter DNA, contributing to mammary tumor formation22,23,24,25,26,27. Outside of breast cancer, these chemicals are known to promote ROS-mediated epithelial-to-mesenchymal transition (EMT) in lung cancer tumors, resulting in increased rates of lung cancer metastasis28,29. Understanding the role of the environment and related exposures is therefore vital to inform further understanding of breast cancer pathogenesis and pathophysiology.

To assess this relationship between environmental chemical exposure and breast cancer, this study used residential proximity to NPL superfund sites to evaluate the association with increased odds of de novo metastatic versus non-metastatic breast cancer presentation. Residential proximity to NPL sites, approximated by the presence of an NPL superfund site in a patient’s census-designated place (CDP) of residence, was used as the measure of exposure. The primary objective was to assess the impact of residential proximity to superfund sites on the presence of metastatic disease at the time of diagnosis. Due to the estrogen-disrupting biological activity of many pollutants, we hypothesized that patients living in close proximity to superfund sites would have higher odds of metastatic versus non-metastatic disease at the time of diagnosis, even after controlling for sociodemographic factors that may inherently contribute to advanced stage disease.

Methods

Study sample

The Florida Cancer Data System (FCDS) is a statewide, legislatively mandated incident cancer registry maintained by the University of Miami Miller School of Medicine and the Florida Department of Health. The current study utilizes FCDS data to investigate clinical and sociodemographic characteristics, in addition to the stage of disease at diagnosis, among 21,505 breast cancer patients residing within Sylvester Comprehensive Cancer Center’s catchment area (CA) counties, including Broward, Miami-Dade, Monroe, and Palm Beach. Additional inclusion criteria for selected cases encompassed subjects diagnosed between 2015 and 2019, of female sex, and with valid census tract identification numbers. Patients with missing data related to inclusion criteria, primary exposure, or primary outcome were excluded from the study.

Outcome and exposure variables

The primary outcome was the presence of de novo metastatic (versus non-metastatic) disease at the time of diagnosis. Metastatic disease was defined by FCDS data as involving “distant site(s)/node(s)”, specified by the “Summary Stage” variable. Distant sites included, among others, the adrenal glands, bone, lung, ovary, skin overlying the axilla, contralateral breast, sternum, and upper abdomen, while distant nodes may have included contralateral axillary, cervical, infraclavicular, internal mammary, and supraclavicular nodes. This stage information was transformed into a binary variable to indicate metastatic or non-metastatic disease presentation. Metastatic disease was used as the primary endpoint given prior findings of ROS-mediated EMT activity promoting increased rates of tumor metastasis for many environmental chemicals and pollutants28,29.

NPL proximity was the primary exposure of interest. We utilized data provided by the U.S. Environmental Protection Agency (EPA) (https://www.epa.gov/superfund/superfund-national-priorities-list-npl) to geocode NPL-designated sites for CDP linkage to CA counties. This data was then merged with FCDS data based on patient place of residence and a dichotomous variable was created to indicate whether patients were “NPL Proximal” (yes/no). We considered patients who lived within a CDP where at least one NPL site was located as having residential proximity; patients whose CDP did not include at least one NPL site were considered non-proximal. Both NPL sites and patient residences were linked to respective CDPs using a point in polygon spatial join. Due to limited information provided by the EPA, information about the exposure time in which patients lived in NPL site-exposed CDPs was not available.
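The point-in-polygon linkage described above can be illustrated with a minimal sketch. The text does not state which software performed the spatial join, so the ray-casting routine below is only one common way to do it; the CDP names, square boundaries, and coordinates are hypothetical toy values, not study data.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]   # (x, y), e.g. (longitude, latitude)
Polygon = List[Point]         # boundary vertices in order, first vertex not repeated


def point_in_polygon(pt: Point, poly: Polygon) -> bool:
    """Ray-casting test: count boundary crossings of a ray running in +x from pt."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Does this edge straddle the horizontal line through pt?
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_cdp(pt: Point, cdps: Dict[str, Polygon]) -> Optional[str]:
    """Return the name of the first CDP whose boundary contains pt, else None."""
    for name, poly in cdps.items():
        if point_in_polygon(pt, poly):
            return name
    return None


def npl_proximal(residence: Point, npl_sites: List[Point],
                 cdps: Dict[str, Polygon]) -> Optional[bool]:
    """True if the residence's CDP also contains at least one NPL site.

    Returns None when the residence falls outside every CDP (unclassifiable).
    """
    home = assign_cdp(residence, cdps)
    if home is None:
        return None
    return any(assign_cdp(site, cdps) == home for site in npl_sites)


# Hypothetical toy geometry: two adjacent square "CDPs" and one NPL site.
CDPS = {
    "CDP_A": [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)],
    "CDP_B": [(4.0, 0.0), (8.0, 0.0), (8.0, 4.0), (4.0, 4.0)],
}
NPL_SITES = [(1.0, 1.0)]  # falls inside CDP_A
```

Under this definition, a residence anywhere inside CDP_A counts as NPL proximal regardless of its actual distance to the site, while a residence just across the boundary in CDP_B does not, which is the ecologic nature of the exposure measure.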

Covariates of interest

Patient age at diagnosis, race, ethnicity, insurance type, and income were used as covariates for this analysis. Age was treated as a continuous variable for multivariable modeling purposes. Self-identified race was categorized as White, Black, or Other, and ethnicity as Hispanic, non-Hispanic, or unknown. Insurance type and income were considered proxy measures of socioeconomic status (SES) and access to care. Insurance type was categorized as privately insured, non-specified insured, Medicaid, Medicare, Military, uninsured, or unknown. Individual-level income was not available; therefore, we used the U.S. Census Bureau’s American Community Survey (ACS) 2017 five-year estimates of median income, provided at the census tract level, which reflect data collected from 2013 to 2017. This five-year estimate period offers temporal alignment with the study period (2015–2019), capturing socioeconomic conditions likely to precede or coincide with the cancer diagnosis. We felt this provides a more appropriate socioeconomic proxy than later estimates.

Statistical analysis

Study population characteristics were assessed overall and according to NPL proximity. Differences across NPL proximity groups were evaluated with chi-square tests for proportions and t-tests for continuous variables. To evaluate the association between proximity to NPL sites and the presence of metastatic (versus non-metastatic) breast cancer at the time of diagnosis, a multilevel analysis was carried out to account for patients living in census tracts nested within different CDPs. Unadjusted univariable and adjusted multivariable binomial generalized linear mixed-effects models were fitted, with random effects specified at the census tract level for median income and at the CDP level for NPL proximity. Model 1 was unadjusted, Model 2 was adjusted for age, race, and ethnicity, Model 3 was further adjusted for insurance type, and Model 4 was additionally adjusted for tract-level median income (nested within CDPs). These covariates were selected based on optimal model fit in addition to the clinical relevance of covariates of interest based on existing literature and subject matter knowledge. All analyses were conducted in R version 4.3.0 using the “glmer” function from the “lme4” package. Statistical significance was assessed using an alpha level of 0.05.
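One plausible reading of the fully adjusted model (Model 4) is a logistic mixed model with random intercepts for census tracts nested within CDPs. The notation below is an assumption introduced for illustration: patient i lives in tract j within CDP k, and the exact random-effects structure used in the analysis is not fully specified in the text.

```latex
\begin{aligned}
\operatorname{logit}\,\Pr(Y_{ijk}=1)
  &= \beta_0 + \beta_1\,\mathrm{NPL}_{k} + \beta_2\,\mathrm{Age}_{ijk}
   + \boldsymbol{\beta}_3^{\top}\mathbf{Race}_{ijk}
   + \boldsymbol{\beta}_4^{\top}\mathbf{Ethnicity}_{ijk} \\
  &\quad + \boldsymbol{\beta}_5^{\top}\mathbf{Insurance}_{ijk}
   + \beta_6\,\mathrm{Income}_{jk} + u_{jk} + v_{k}, \\
u_{jk} &\sim \mathcal{N}(0,\sigma_u^2), \qquad v_{k} \sim \mathcal{N}(0,\sigma_v^2), \\
\mathrm{aOR}_{\mathrm{NPL}} &= \exp(\beta_1).
\end{aligned}
```

Here Y = 1 denotes de novo metastatic disease. Models 1 to 3 would drop terms from the right-hand side in the order described above. In lme4 formula syntax, nested random intercepts of this kind are typically written as (1 | cdp/tract) with family = binomial, where cdp and tract are hypothetical column names.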

Results

Patient characteristics

A total of 21,505 breast cancer cases were identified from available FCDS data and included in this analysis. The mean age of patients was 62 ± 14 years. The majority were White (n = 16,994; 79%) and non-Hispanic (n = 13,477; 62.7%), followed by Black (n = 3828; 17.8%) and Hispanic (n = 7853; 36.5%). Over 90% (n = 20,122) of patients had non-metastatic disease at the time of diagnosis. Approximately 10% (n = 2178) of patients resided in neighborhoods with at least one NPL site (Table 1).

Table 1 Patient demographics and clinical characteristics.

Overall, race, ethnicity, insurance type, median income, and proportion with metastatic disease at the time of diagnosis were different between NPL proximity groups (p < 0.001). The proportion of metastatic disease at the time of diagnosis was highest (8.5%) amongst those living near NPL sites (Table 1). Spatial variability of de novo metastatic breast cancer diagnosis in relation to NPL site proximity is presented in Fig. 1.

Fig. 1

Superfund sites and de novo metastatic breast cancer. Figure shows census-designated places (CDPs) grouped into quartiles based on the percentage of de novo metastatic breast cancer cases (calculated as total number of de novo metastatic cases divided by total number of breast cancer cases, within a given CDP, during the study period). Red dots indicate the locations of the NPL superfund sites in south Florida included in this study.

Superfund site characteristics

A total of 12 NPL superfund sites were represented in this analysis. The most common site types were Manufacturing/Processing/Maintenance (n = 6; 50%) and Recycling (n = 4; 33%). Groundwater was the most common medium for contaminant spread. Characteristics of the NPL sites represented are displayed in Table 2. Detailed information about the contaminants found within each site is presented in Supplementary Table S1.

Table 2 Superfund site characteristics.

Metastatic breast cancer and superfund site density

Unadjusted Model 1 showed a 43% increase in likelihood of metastatic (versus non-metastatic) breast cancer presentation among patients living near NPL sites (OR 1.43; CI 1.14–1.81; p = 0.002, Table 3); minimally adjusted Model 2, which controlled for age, race, and ethnicity, still resulted in a significant effect, with NPL-proximal status corresponding with a 30% increased likelihood of metastatic cancer at the time of presentation (aOR 1.30; CI 1.07–1.57; p = 0.007). Living near NPL sites was still associated with an approximately 30% higher likelihood of metastatic presentation after additional adjustment for insurance type (aOR 1.29; CI 1.08–1.59; p = 0.005; Model 3). After further controlling for tract-level median income, living near NPL sites was associated with a 27% higher likelihood of de novo metastatic disease presentation (aOR 1.27; CI 1.06–1.51; p = 0.009; Model 4). Interaction terms assessing potential effect modification by insurance type and income were not significant when added to Model 4 based on omnibus p-values (NPL × Insurance p = 0.122; NPL × Income p = 0.103), suggesting that socioeconomic status (SES) and access to care do not modify the effect NPL proximity has on the likelihood of metastatic breast cancer presentation. Of note, Model 4 was also adjusted for air PM exposure (PM2.5, as annual average by place); however, this term was not significant (p = 0.48) and was therefore not included in the final model.
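The arithmetic linking the reported odds ratios, intervals, and p-values can be sketched as follows. This is an illustration only: it assumes the reported intervals are 95% Wald intervals that are symmetric on the log-odds scale, which the text does not state explicitly, and the function names are hypothetical. The percentages quoted above are changes in odds, which approximate changes in probability only when the outcome is uncommon.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def percent_change_in_odds(odds_ratio: float) -> float:
    """Express an odds ratio as a percent change in odds (1.43 -> about 43)."""
    return (odds_ratio - 1.0) * 100.0


def se_from_ci(lower: float, upper: float, level: float = 0.95) -> float:
    """Back out the log-odds standard error from a Wald interval.

    Assumes the interval is exp(log(OR) +/- z * SE) at the given level.
    """
    z = _STD_NORMAL.inv_cdf(0.5 + level / 2.0)
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def wald_p_value(odds_ratio: float, lower: float, upper: float) -> float:
    """Two-sided Wald p-value implied by an odds ratio and its interval."""
    z = math.log(odds_ratio) / se_from_ci(lower, upper)
    return 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z)))


# Reported estimates: (OR, CI lower, CI upper)
REPORTED = {
    "Model 1": (1.43, 1.14, 1.81),
    "Model 4": (1.27, 1.06, 1.51),
}

for label, (or_, lo, hi) in REPORTED.items():
    print(label,
          f"odds change = {percent_change_in_odds(or_):.0f}%",
          f"implied p = {wald_p_value(or_, lo, hi):.3f}")
```

Because the published estimates are rounded to two decimals, the implied p-values only approximate those reported in Table 3; small differences do not indicate an inconsistency.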

Table 3 Metastatic breast cancer and NPL proximity.

Discussion

This study used FCDS data to assess the potential relationship between residential proximity to NPL superfund sites and de novo metastatic (versus non-metastatic) breast cancer presentation, finding that living near NPL sites was associated with an approximately 30% increased likelihood of metastatic breast cancer presentation at the time of diagnosis. This effect remained significant even after adjusting for proxy measures of SES and access to care, including insurance type and income, and was not affected by either factor on subsequent effect modification analyses. These results are important as they are believed to be the first to associate residential and sub-occupational threshold level chemical exposure from NPL sites with de novo metastatic breast cancer presentation, offering new insight into the multifactorial etiology of breast cancer and highlighting an association between environmental chemical pollutants and altered mammary tumor biology.

Genetic and behavioral characteristics are known risk factors for breast cancer, yet incidence continues to rise even in their absence. Recent literature suggests occupational-level exposure to chemical pollutants, such as PAHs, benzenes, NO2, and biphenyls, is associated with increased rates of breast cancer, but the literature remains overall inconclusive3,4,5,6,7,8,9,10,11,31,32. For example, a number of studies have found associations between long-term residential and traffic-related PAHs (in the form of airborne NO2) exposure and increased breast cancer incidence6,33,34,35,36, with some reporting risks increased by as much as 148%, particularly for premenopausal women6. In contrast, others report no association between chemical pollutant exposure and breast cancer incidence. For example, a 2016 study by Hart et al. investigating the association between PM exposure and breast cancer incidence found that PM exposure level did not influence incidence rates31. While a limited number of studies have presented evidence that environmental chemicals may induce altered mammary tumor biological processes capable of accelerating tumor progression and/or metastasis37,38,39, to our knowledge only one study has specifically assessed the association between pollution exposure and breast cancer stage. In that study, performed by Khorrami et al. using 1,148 breast cancer cases from the Cancer Research Center in Iran, the authors found that 10-unit increases in the levels of the air pollutants benzene and o-xylene were associated with 16% and 18% higher odds of stage III/IV versus stage I/II disease, respectively, for postmenopausal women, even after controlling for several known risk factors. Notably, the authors also found that the effect certain pollutants had on the likelihood of later-stage disease was SES-dependent40.
For example, while o-xylene was associated with an 18% higher likelihood for stage III/IV versus I/II disease overall, this was true only for low SES patients; odds for advanced-stage disease did not differ in high SES patients but were nearly 270% higher for low SES patients40. While the aforementioned study does report important findings regarding the association between environmental pollution and advanced breast cancer stage, it differs from the present study in notable ways. The present study, in addition to focusing specifically on de novo metastatic rather than advanced-stage disease, consisted of a notably larger patient population while additionally controlling for the impact of race and ethnicity on stage outcomes. Previous literature highlights that race and ethnicity have a particularly significant impact on breast cancer risk, possibly more than other sociodemographic factors41,42, and our results remained significant even after accounting for race, ethnicity, and other SES factors. The study by Khorrami et al. also was limited to air pollution, whereas the present study looked at NPL-site exposures more broadly, which may have included ground, water, and/or air pollution.

NPL superfund sites were used as the source of exposure in this study for multiple reasons. First, the EPA defines NPL superfund sites as places with known or threatened release of hazardous substances, pollutants, and contaminants that are eligible for long-term investigation and remedial action. Many political, epidemiological, and public health calls for action have recently been made for better control of environmental contamination and pollution emanating from superfund sites. For example, in 2024 the EPA enacted new regulations for the first time in 30 years that will implement stricter policies for industrial pollution from these sites43. These changes are important as superfund site density has been associated with higher overall pan-cancer incidence rates, with a linear relationship between increasing superfund site density and increasing incidence rates44. Furthermore, while the literature regarding superfund sites remains limited, proximity to superfund sites has also been associated with excess cancer mortality across multiple cancer types45. Additionally, NPL sites were used as the primary exposure of interest in this study because, as stated previously, these sites are known to contain many substances that act as endocrine and estrogen disruptors, thereby facilitating their ability to contribute to mammary tumor formation. Many of these chemicals, including phthalates, dioxins, dichlorodiphenyltrichloroethane (DDT) compounds, and polychlorinated biphenyls (PCBs), were present among many of the NPL sites included in this study (Supplementary Table S1). Prior literature suggests that many of these chemicals may drive breast cancer pathophysiology through direct genotoxic action, alteration of mammary gland development and hormone responsiveness, and hormonal tumor production3,46.
Furthermore, as outlined previously, these chemicals can enter the bloodstream and travel to adipose-dense tissues, where they release ROS-metabolites that alter DNA in unfavorable ways22,23,24,25,26,27. While their impact on EMT in mammary tumors has not yet been determined, these chemicals have previously been shown to promote ROS-mediated EMT in lung cancer tumors and increase the risk of lung cancer metastasis. It is therefore plausible that similar biological behaviors occur in mammary tumors, with these compounds similarly able to facilitate an increased risk of metastasis for breast cancer tumors. Furthermore, as seen in Supplementary Table S1, contaminants from included NPL sites may impact patients through multiple routes of exposure, including both groundwater and soil contamination. Ten of the 12 (83%) NPL sites in this analysis relied on groundwater as a primary or contributory medium of contamination; this is particularly important given the unique climate and landscape of our South Florida community. Predicted increases in heavy rain and flooding in coming years, coupled with the uniquely shallow groundwater table, may cause increased leaching of chemicals from these sites, which may further act as an unrecognized source of exposure for individuals living near NPL sites.

While the results of the present study imply an association between residential proximity to NPL superfund sites and an increased likelihood of de novo metastatic breast cancer presentation, it is important to acknowledge that external validation is lacking. The International Agency for Research on Cancer (IARC) classifies outdoor air pollution and PM as Group 1 carcinogens, or agents carcinogenic to humans47,48. While these agents have previously been associated with increased lung, urological, and breast cancer incidence12,13,14,49,50,51,52 and later breast cancer stage at diagnosis40, little is known about the impact of environmental exposures outside of air pollution on breast cancer stage at diagnosis. Furthermore, the evidence available for determining the carcinogenicity of various environmental contaminants remains inadequate3,53. This study is the first to our knowledge to assess the potential association between NPL superfund site proximity and de novo metastatic breast cancer presentation. Future studies across other geographic regions and using other state-, national-, or global-level cancer databases are needed to externally validate our findings and further elucidate breast cancer-specific carcinogenic properties of various contaminants.

Although this study presents important preliminary findings, it is not without limitations. Due to the retrospective design and the limited amount of demographic, clinical, and site-specific information available within FCDS and EPA data, we were unable to control in multivariable modeling for many factors that may inherently alter breast cancer pathogenesis and pathophysiology. This notably included family history of breast cancer and/or presence of genetic mutations predisposing to heightened breast cancer risk. The FCDS database also provided only limited years of receptor status data, and therefore our group was unable to assess the impact of environmental exposures on receptor subtypes. We were also underpowered to perform added sensitivity analyses and to adjust our models using precise “Hazard Ranking System (HRS) scores” of given NPL sites. The HRS score is a scoring system used by the EPA to determine a given site’s potential harm to human health. HRS scores combine the scores from four pathways (groundwater migration, surface water migration, soil exposure, air migration) that contaminants may take to affect human health, using the root mean square of the four pathway scores; site scores range from 0 to 100, with a score of ≥ 28.5 needed to reach NPL-level criteria (https://www.epa.gov/superfund/hazard-ranking-system-hrs; https://www7.nau.edu/itep/main/docs/waste/superfund/Sprfund_SiteAssessmentInfo_EPA.pdf; https://semspub.epa.gov/work/HQ/174041.pdf). Controlling for site-specific scores may have allowed us to further examine the impact of site-specific characteristics on de novo metastatic breast cancer presentation and should be an area of emphasis in future validation studies. In addition, NPL proximity in this study was based on an ecologic and broad-level CDP of residence rather than precise, individual-level patient address.
Using exact patient addresses and mapping distances to NPL sites may have provided a more precise measure of the association between NPL site proximity and metastatic breast cancer presentation, but this detail was not available and is an inherent constraint of using partially deidentified cancer registry data. Similarly, this study is limited by a lack of ability to establish temporality between NPL site exposure and de novo metastatic breast cancer presentation. Importantly, the duration of time in which patients lived in NPL-exposed CDPs was not provided by FCDS data. Understanding exposure history would allow for temporal association between NPL site proximity and disease outcomes, adding to the analysis an understanding of the potential latency between NPL exposure and diagnosis of metastatic breast cancer. Despite the lack of this information, all included NPL sites used in this analysis were designated as NPL sites between 1983 and 2012, while all patient diagnoses occurred between 2015 and 2019. This supports the temporal ordering of exposure preceding outcome. Furthermore, this study measured exposure via NPL proximity rather than through direct histologic analysis of both specific intra-tumor pollutant presence and quantifiable degree of pollutant presence. As this was a retrospective database study without access to tumor samples, precisely measuring these variables was not possible.
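For readers unfamiliar with the HRS score mentioned among the limitations, the sketch below shows how a site score is formed from pathway scores. It follows the EPA's published formula as we understand it, in which the four pathway scores (each on a 0 to 100 scale) are combined as a root mean square rather than a plain sum, which keeps the site score between 0 and 100. The pathway values used here are hypothetical and do not describe any site in this study.

```python
import math

NPL_CUTOFF = 28.5  # minimum HRS site score for NPL eligibility


def hrs_site_score(groundwater: float, surface_water: float,
                   soil: float, air: float) -> float:
    """Combine four pathway scores (each 0-100) into one HRS site score.

    Root mean square: sqrt((S_gw^2 + S_sw^2 + S_s^2 + S_a^2) / 4).
    """
    pathways = (groundwater, surface_water, soil, air)
    for score in pathways:
        if not 0.0 <= score <= 100.0:
            raise ValueError("each pathway score must lie between 0 and 100")
    return math.sqrt(sum(score ** 2 for score in pathways) / 4.0)


def meets_npl_cutoff(site_score: float) -> bool:
    """True when the site score reaches the NPL eligibility cutoff."""
    return site_score >= NPL_CUTOFF


# Hypothetical site driven almost entirely by groundwater contamination.
example = hrs_site_score(groundwater=60.0, surface_water=10.0, soil=0.0, air=0.0)
print(round(example, 1), meets_npl_cutoff(example))
```

Because the pathway scores are squared before averaging, a single heavily contaminated pathway can carry a site over the cutoff on its own, which is relevant here given that groundwater was the dominant medium among the included sites.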

This study was also limited in its ability to account for socioeconomic and access to care measures. Marginalized social status contributes to one’s choice to live, knowingly or unknowingly, near superfund sites and environmentally toxic regions54,55,56,57,58. Superfund sites have previously been linked with higher levels of poverty and non-White and Hispanic populations, in addition to lower levels of education, all of which are direct or proxy-measure risk factors for breast cancer41,42,59. Marginalized social status also contributes to inferior access to care, which itself is associated with higher rates of advanced-stage disease60,61,62,63. Patients with inferior access to care more frequently do not undergo routine mammography or other preventative screening services that allow for earlier detection of disease64,65,66,67,68. It is, therefore, plausible to infer that the higher rates of de novo metastatic disease seen here are more directly related to SES and access to care measures rather than to NPL proximity and environmental exposures. However, we found that individuals living near NPL sites continued to have increased likelihood of metastatic disease presentation even after controlling for income and insurance type. Our results remained significant even after interaction terms assessing potential effect modification by insurance and income were added, suggesting the effect of NPL proximity on metastatic breast cancer presentation is not modified by the level of SES or access to care. While one’s SES and access to and utilization of healthcare are affected by a multitude of factors beyond solely insurance and income, these are commonly used proxy measures and our results held after accounting for both69,70,71,72,73. Nevertheless, future prospective studies should take into account additional measures of SES and access to care.

While the clinical applicability of ecologic studies such as this one remains to be determined, the results nevertheless highlight an important etiologic link in breast cancer development and biology. Future work needs to more specifically determine both how and which pollutants directly impact breast cancer biology and aggressiveness. While it may not be possible to change the environments where patients live, the results of this and future studies may increase public education about factors contributing to breast cancer development and subsequently help shape public health interventions designed to enhance early detection and treatment of breast cancer.

Conclusion

This study found that geographic place of residence proximity to NPL superfund sites was associated with an increased likelihood of metastatic (versus non-metastatic) breast cancer presentation at the time of diagnosis, even after controlling for race, ethnicity, insurance type, and income. These findings suggest an etiologic role of environmental chemical exposure in breast cancer and add further support for the endocrine- and estrogen-disrupting properties of many pollutants. Future work should focus on histologically evaluating for an association between NPL site proximity and breast cancer tumor biology, ascertaining precisely which chemicals are most detrimental and how they influence tumor biology. The findings presented here provide the first known association between residential NPL superfund site proximity and de novo metastatic breast cancer presentation, implying environmental chemical exposure is a crucial component of the multifactorial etiology of breast cancer and an important risk factor to target in future interventions.