Introduction

Air pollution, which broadly refers to contamination of the indoor and outdoor environment by gases, ozone and particulate matter (PM), is known to have multiple deleterious health effects, including potentially carcinogenic properties1,2,3. The health effects of PM smaller than 2.5 microns in diameter (PM2.5) have been increasingly well described. While air pollution exposure is known to be associated with harmful effects on multiple organ systems, including increased incidence of pulmonary disease (e.g. COPD, lung cancer, asthma)4, cardiovascular disease5, and liver disease6, PM2.5 has specifically been linked to certain types of cancer. PM2.5 exposure is associated with increased incidence of breast cancer and lung cancer, particularly in non-smokers1,7,8,9,10,11. Given this, the International Agency for Research on Cancer (IARC) has classified air pollution, particularly PM2.5, as a causal agent (Group 1 carcinogen) for lung cancer.

Although there is significant literature detailing the effects of air pollution on the pulmonary and cardiovascular systems, data linking air pollution to disease development in the upper aerodigestive tract, and specifically to head and neck cancers (HNC), are more limited. Recent studies have demonstrated an association between county-level PM2.5 exposure in the United States (US) and oral cavity cancer incidence12,13. However, few studies have examined the relationship between site-specific HNC incidence and air pollution using appropriate lag models. The tissue of the head and neck may be particularly vulnerable to air pollution given direct contact with these particles and the known strong association between tobacco smoke and the most common HNC, squamous cell carcinoma14. While there are known non-environmental links to these cancers, including the rising incidence of oropharyngeal squamous cell carcinoma (OPSCC) linked to human papillomavirus infection15 and nasopharyngeal carcinomas linked to Epstein-Barr virus infection, air pollution may also play a role in disease development16.

Given this lack of literature, we aimed to evaluate the relationship between PM2.5 exposure and HNCs using national cancer registry data, focusing on overall and site-specific HNC incidence. Lag models were used to account for the time between exposure and disease presentation.

Methods

Data source

This study was approved by the Mass General Brigham Institutional Review Board before database composition and analysis in accordance with the Common Rule and Federalwide Assurances for Institutions within the United States. We examined cancer incidence using data from the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute (NCI). Data were available from 2000 to 2016, of which the years 2002 to 2012 were used in the analysis. The SEER database contains deidentified patient data, and therefore informed consent to participate in this study was not possible or required. We linked these data with county-level air pollution data (i.e., PM2.5) and with alcohol use and smoking data across all available years.

The SEER registries included California, Connecticut, Southeast Michigan, Georgia, Iowa, Kentucky, Louisiana, New Jersey, New Mexico, Seattle (Puget Sound) and Utah. We included data based on International Classification of Diseases-Oncology version 3 (ICD-O-3) codes for oral cavity, oropharynx, larynx, hypopharynx, sinonasal and middle ear cancers (C000-009, C019-069, C090-119, C129, C130-130, C140, C142, C148, C300-301, C310-329). ICD-O-3 codes are site-specific codes used by cancer registries to classify each tumor type by topography and histology, and the above groups were defined by specific anatomic boundaries within the upper aerodigestive tract. The primary outcome of the analysis was “all HNC incidences”, and the individual cancer subtypes were used as secondary outcomes. Age-adjusted incidence rates of HNC were calculated for each county. We also included county-level demographic characteristics from SEER, such as age, sex, race, income and poverty levels, and education level.
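As an illustration of the age adjustment mentioned above, the following is a minimal sketch of direct standardization, one common way to compute age-adjusted rates. The text does not state which method or standard population was used, so the function name, the age groups, the counts, and the weights below are all hypothetical.

```python
def age_adjusted_rate(cases, population, std_weights, per=100_000):
    """Directly standardized incidence rate per `per` persons.

    cases, population: dicts keyed by age group (counts for one county).
    std_weights: dict of standard-population weights by age group
        (hypothetical values here; they are normalized to sum to 1).
    """
    total_weight = sum(std_weights.values())
    rate = 0.0
    for group, weight in std_weights.items():
        pop = population.get(group, 0)
        if pop == 0:
            continue  # no population at risk in this age group
        # Age-specific rate, weighted by the standard population share.
        rate += (weight / total_weight) * cases.get(group, 0) / pop
    return rate * per


# Hypothetical county with three broad age groups (illustration only).
cases = {"0-49": 2, "50-64": 10, "65+": 18}
population = {"0-49": 60_000, "50-64": 25_000, "65+": 15_000}
weights = {"0-49": 0.70, "50-64": 0.18, "65+": 0.12}

adjusted = age_adjusted_rate(cases, population, weights)
crude = 100_000 * sum(cases.values()) / sum(population.values())
```

In this toy county the crude rate is 30 per 100,000, while the adjusted rate is lower (about 23.9 per 100,000) because the hypothetical standard population is younger than the county.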

Air pollution data, specifically PM2.5 concentrations from 1981 to 2016, were obtained at a 0.01 × 0.01 degree resolution from a previous study that combined ground-based monitoring, chemical transport modeling, and satellite-derived PM2.5 estimates17. Using available US county boundary data, we calculated mean annual PM2.5 concentrations at the county level in all 48 contiguous US states using Python grid (Python Software Foundation v3.11, Wilmington, DE [https://www.python.org/]).
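The aggregation from grid cells to counties can be sketched as follows. This is a simplified illustration, not the code used in the study: the `cell_to_county` lookup is a hypothetical stand-in for a point-in-polygon test against county boundaries, and the unweighted mean ignores the small variation in grid-cell area with latitude.

```python
from collections import defaultdict


def county_means(cells, cell_to_county):
    """Average gridded PM2.5 values within each county.

    cells: iterable of (lat, lon, pm25) tuples, one per grid cell.
    cell_to_county: function (lat, lon) -> county ID, or None when the
        cell lies outside every county (hypothetical helper).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, pm25 in cells:
        county = cell_to_county(lat, lon)
        if county is None or pm25 is None:
            continue  # skip cells outside all counties or with no estimate
        sums[county] += pm25
        counts[county] += 1
    return {county: sums[county] / counts[county] for county in sums}


def toy_lookup(lat, lon):
    # Toy geography: two made-up counties split at 100 degrees west.
    return "A" if lon < -100 else "B"


cells = [
    (40.0, -101.0, 8.0),
    (40.0, -100.5, 10.0),
    (40.0, -99.5, 12.0),
    (40.0, -99.0, None),  # missing estimate, ignored
]
means = county_means(cells, toy_lookup)
```

Running this once per year of gridded data would give the annual county-level means described above.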

County-level data on average alcohol use for 2002 to 2010 were taken from a previously published source18, and data for 2011 to 2016 were taken from County Health Rankings, an online database compiling county attributes calculated from the CDC’s Behavioral Risk Factor Surveillance System (BRFSS)19. County-level smoking data for 1996 to 2010 were taken from a previously published source20, and data for 2011 to 2016 were likewise taken from County Health Rankings19.

Statistical analysis

We examined descriptive statistics using median and interquartile range (IQR) for PM2.5, county-level demographic characteristics, alcohol use and smoking data. Next, we examined the association between PM2.5 and HNC incidence using a linear mixed model with county ID as a random effect, while controlling for county-level demographic characteristics (percentage of population aged over 65 years, percentage of population aged under 18 years, percentage of Black residents, percentage of Hispanic residents, median income, and population size), smoking prevalence (percentage of smokers), and alcohol use (percentage of binge drinking). Median income and population size were log transformed because their distributions were skewed. A series of models using PM2.5 at lag times ranging from 0 to 20 years as the exposure was fitted to assess the lagged effect of PM2.5. All analyses were conducted for overall HNC and for each cancer subtype. Line charts were used to summarize the association between lagged PM2.5 and HNC incidence rate. All hypothesis tests were two-sided and p values < 0.05 were considered statistically significant. The analysis was performed using R software version 4.1.0 (R Foundation for Statistical Computing v4.3.2, Vienna, Austria (2020) [https://www.r-project.org/]).
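To make the lag construction concrete, the sketch below pairs each county-year outcome with the PM2.5 value from a fixed number of years earlier, producing one table per lag. It is an illustrative assumption about the data layout rather than the study's code (the analysis itself was run in R), and the function name and toy values are hypothetical. Each lag-specific table would then be fitted separately with the mixed model described above.

```python
def lagged_rows(pm25, incidence, lags):
    """Pair each county-year outcome with PM2.5 from `lag` years earlier.

    pm25: dict mapping (county, year) -> annual mean PM2.5.
    incidence: dict mapping (county, year) -> incidence rate.
    lags: iterable of lag lengths in years.
    Returns a dict mapping lag -> list of (county, year, exposure, rate).
    """
    tables = {}
    for lag in lags:
        rows = []
        for (county, year), rate in sorted(incidence.items()):
            exposure = pm25.get((county, year - lag))
            if exposure is None:
                continue  # exposure year not available for this lag
            rows.append((county, year, exposure, rate))
        tables[lag] = rows
    return tables


# Toy data for one made-up county (values are illustrative only).
pm25 = {("A", 1997): 14.0, ("A", 2002): 12.0, ("A", 2003): 11.5}
incidence = {("A", 2002): 20.1, ("A", 2003): 19.8}

tables = lagged_rows(pm25, incidence, lags=[0, 5])
```

Here the 5-year table keeps only the 2002 outcome, because no 1998 exposure exists in the toy data. With exposure data starting in 1981 and outcomes from 2002 to 2012, every outcome year has an exposure value available even at the 20-year lag.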

Results

Based on the availability of all data, we generated a cross-sectional cohort of county-level data from 2002 to 2012, which includes HNC incidence in over 600 counties across 11 states. Summary statistics of this cohort are detailed in Table 1. Overall, the incidence rates for hypopharyngeal, nasopharyngeal, and sinonasal cancers were lower than those for oral cavity, oropharyngeal, and laryngeal cancers. The PM2.5 concentration averages used in the lag models are summarized in Table 2. Overall, PM2.5 concentration decreased in US counties from 1981 to 2012.

Table 1 Summary Statistics of Cohort.
Table 2 Mean and Median PM2.5 Concentrations for each Year (µg/m3).

There was a significant association between PM2.5 levels and the incidence rate of HNC, with the strongest association at a 5-year lag period (β = 0.24, p value < 0.001). The association was also significant at no lag (β = 0.16, p value = 0.02) and up to a 20-year lag period (β = 0.15, p value < 0.001).

When not adjusted for covariates, all HNC incidences combined and most subsite incidences (except for sinonasal) were associated with PM2.5 levels at all lag periods up to a 20-year lag (Fig. 1). Even when adjusted for covariates, all HNC incidences combined were still significantly associated with PM2.5 concentration up to a 20-year lag period (Fig. 2). This was also true for all combined aerodigestive non-oropharyngeal HNCs and for laryngeal cancers. Oral cavity cancer was also significantly associated with PM2.5 concentrations with covariate adjustment, except at the 0-year lag period. Most beta coefficients were highest around 5 years after presumed exposure and remained elevated up to 20 years after exposure, indicating a lagged association over time. However, after adjusting for covariates, sinonasal, hypopharyngeal, nasopharyngeal, and oropharyngeal cancers were not associated with PM2.5 concentrations at any lag from 0 to 20 years after exposure.

Fig. 1

Unadjusted effect of PM2.5 pollution on head and neck cancer incidence, overall and by subsite. Lag: years. Blue denotes a statistically significant effect and orange denotes a non-significant effect. Statistical significance was defined at the 5% level (p value < 0.05).

Fig. 2

Adjusted effect of PM2.5 pollution on head and neck cancer incidence, overall and by subsite. The model adjusted for percentage of binge drinking, daily smoking, female, race and ethnicity, median income, and population. Lag: years. Blue denotes a statistically significant effect and orange denotes a non-significant effect. Statistical significance was defined at the 5% level (p value < 0.05).

Discussion

This analysis of SEER database county incidence data of HNCs demonstrates an association with exposure to PM2.5 at various time intervals. This association was seen in all HNCs combined and pertinent subsites including non-oropharyngeal sites, oral cavity, and laryngeal cancer. This study contributes to the current body of literature that identifies air pollution as an important modifiable risk factor for the development of HNCs and supports improving air quality standards given the potential impact of air pollution on oncologic disease development.

The strengths of this research include the high number of incident cases included from the SEER registry, as well as the geographic heterogeneity of the counties within the data set. Currently, the SEER registries capture a large representative sample of the US population, making them a favorable choice for epidemiological studies in the US. Furthermore, defined lag time models were used to highlight the carcinogenic effects of air pollution exposure over time. These data cover total head and neck aerodigestive cancer incidence in relation to air pollution and are further delineated by subsite, in contrast to previous, broader studies of cancer incidence.

In previous studies, the latency from pollution exposure to cancer incidence has been heterogeneous, ranging from virtually no lag time to up to 15 years. Latency is important because the rate of carcinogenesis after exposure varies across risk factors, and faulty lag models are prone to exposure misclassification and bias21. Additionally, it is important to model exposure latency given the changes in air quality over time. Air quality, as measured by air pollutant concentration, has improved nationally over the past 40 years. However, studies have shown that even low levels of air pollutants can have deleterious effects on health. Small changes in PM2.5 and ozone concentrations, even below the levels set by the EPA’s National Ambient Air Quality Standards (NAAQS), can have significant effects on mortality22,23. Choosing the correct lag models, including longer latency periods, can help account for these variations and improve accuracy in predicting HNC incidence and defining at-risk patient populations.

Air pollution exposure is an underrecognized social determinant of health. Studies have shown that historically marginalized communities, specifically communities of color and those of lower socioeconomic status, are disproportionately exposed to adverse environmental conditions, including air pollution and heat24. The effects of disproportionate air pollution exposure on disease development and health outcomes in marginalized populations have been described in a range of conditions, including COVID-19 mortality25. Given this and the known disparities in stage at presentation and outcomes in HNC patients, the role of air pollution in pathogenesis should be considered.

The deleterious health effects of air pollution have been well established in the literature, with robust evidence linking air pollution exposure to increased cancer incidence and mortality, particularly in lung cancer7,8,9,10,23,26. These studies mostly focus on PM2.5, but other pollutants, including nitrogen dioxide (NO2) and coarse particulate matter, have been associated with lung cancer as well. For lung cancer studies, mortality was often used as the endpoint of interest. It is important to note that lung cancer mortality and incidence can be grouped together in these previous studies because lung cancer has a high rate of mortality26. In this study, incidence is used as opposed to mortality given that survival rates are relatively high for most HNCs compared to those seen in lung cancer.

Current data on outdoor air pollution and HNC incidence are limited, but there is some evidence linking pharyngeal and laryngeal cancer incidence to indoor air pollution exposure, particularly from the burning of solid fuels for cooking and heating27. Current studies on HNC incidence and outdoor pollution exposure come from areas with high levels of pollution and areas where oral and nasopharyngeal cancers are more common or endemic, such as China and Taiwan. Two recent studies from Taiwan have linked high levels of PM2.5 to oral cancer diagnosed by Taiwan’s biennial screening program for oral cancer. These studies are limited in that they used very short lag times, and their classification of oral cancer also included hypopharyngeal and oropharyngeal cancer28,29. Three other studies, two from Taiwan30,31 and the other from China32, have linked outdoor air pollution including NO2, PM2.5 and coarse particulate matter to nasopharyngeal cancer incidence. Other international studies linking air pollution to HNC mortality were conducted in Brazil, with relatively specific cancer subsite definitions33,34. In that work, cancer-specific mortality was measured instead of incidence, and lag times of only up to 2 years were included. Interestingly, PM2.5 from wildfires was separated from other sources of PM2.5 and found to have a greater effect on cancer mortality. Further research is needed to understand the difference in carcinogenic effects seen with wildfire smoke.

In the US, studies assessing the link between pollution and non-lung cancer incidence are limited. Coleman et al.’s previous study on the association between PM2.5 and cancer incidence rates drew on more than 8 million cases recorded in the SEER registry. Regarding HNC, only nasal and middle ear cancers were well delineated from other head and neck aerodigestive subsites. In that study, “oral cancer” included ICD-10 codes C00-C14, which encompass the major salivary glands, oropharynx, nasopharynx, and hypopharynx12. While our study delineated head and neck subsites more specifically, a limitation of our analysis is that the definitions of tongue cancer could not be easily parsed, and as a result some base of tongue cancers were likely classified as oral cancer in our study. The “oropharyngeal” definition used in our study contained only cancers from the oropharyngeal subsite. Subsites were also grouped differently in another recent study of US pollution and head and neck cancer, which included SEER-derived HNC incidence data with the addition of salivary gland and esophageal subsites from 2011 to 2019. Of note, that study did not account for a lag period13. Additional studies are needed to better understand the association between air pollution and pathogenesis in specific head and neck subsites, especially given the role of viral-mediated carcinogenesis in the oropharynx and nasopharynx.

There are important limitations to this study that must be acknowledged. This study is limited to the effects of PM2.5, whereas other air pollution components, including NO2, PM10, and ozone, have been shown to affect the incidence of other malignancies, specifically lung cancer35. Another major limitation of this study is that the effects of viral-induced carcinogenesis, specifically human papillomavirus (HPV) in the oropharynx and Epstein-Barr virus (EBV) in the nasopharynx, were not considered because of the limited availability of county-level data. The availability of smoking and alcohol data was also limited at the county level before 1996, and other important covariate data regarding occupational exposures and racial and ethnic disparities at the county level were not available for the studied period. In general, ecological studies are limited because it is almost impossible to control for all confounding variables and because they cannot support predictions at the individual level. However, the large representative sample size of the SEER database makes these results more reliable despite population variability.

Conclusions

HNC incidence, both broadly and by subsite, including combined non-oropharyngeal sites, oral cavity, and laryngeal cancer, is associated with exposure to PM2.5 pollution at various time intervals. The effects were most pronounced 5 years after exposure to PM2.5. With increasing emphasis on environmental health and preventive medicine, this study adds to the expanding body of literature that supports improved air quality standards to mitigate the effects of air pollution on the development of cancer, including HNC.