Introduction

In the last decades, pollution – prominently air pollution – has represented a hotspot of investigation and received increasing interest from policy makers, in view of the notable burden that it implies for public health and welfare systems1. Air pollution is defined as a combination of various compounds originating from different anthropogenic and biogenic emission sources, ranging from particulate matter (PM, with different components based on their aerodynamic diameter) to gases (carbon monoxide, nitrogen dioxide, etc.), and organic chemical compounds (e.g., hydrocarbons)2. Not only is chronic exposure to air pollution associated with an increased risk of cardiovascular and respiratory diseases3, but recent findings have also supported a link between air pollution and neurological disorders, including dementia of different origins and Parkinson’s disease (hereafter called PD)4,5. PD represents the second most common neurodegenerative disease, characterized by the accumulation of alpha-synuclein aggregates in dopaminergic neurons of the substantia nigra, which leads to progressive, irreversible loss of these neurons6.

While recent reviews and meta-analyses have reported growing evidence suggesting that chronic exposure to air pollution – especially to particulate matter with aerodynamic diameter <2.5 μm (PM2.5) and nitrogen dioxide (NO2) – is associated with an increased risk of both incident and prevalent neurodegenerative disorders7,8,9,10, these associations are often not replicated, especially for PD, and only partly concordant across the different studies, for several reasons. Among them, the heterogeneity of study settings probably represents one of the main hindrances to concordant results, with very few longitudinal studies carried out in large population cohorts (e.g., refs. 11,12,13). Indeed, studies reported so far were mostly carried out within administrative cohorts (e.g., ref. 14), in case-control (e.g., refs. 15,16,17) and geostatistical studies (e.g., ref. 18). Most of these analyses tested a handful of pollutants simultaneously, but only a few analyzed them together with other potential factors influencing PD risk. Therefore, most of these studies suffer from residual confounding bias due to scarce adjustment for other potential confounders of the association between air pollution and PD risk, like lifestyles and professional covariates. Other studies are mostly based on a case-control approach – due to the relatively low prevalence of PD (about 1% in the general population over 60 years of age)19 – and hence may suffer from reverse causality bias. Overall, studies on the environmental epidemiology of PD are quite scarce7,8,9,10 and only a few of them are based on deeply characterized longitudinal cohorts (e.g., refs. 11,12,13). Such cohorts may make it possible to disentangle the cluster of risk/protective factors for neurodegenerative disorders, including sociodemographic, lifestyle and professional factors.

The main aim of this study was to test the influence of air pollution on the incident risk of PD independent of other potential risk/protective factors. We did this in a general Italian population cohort, with a deep phenotypic assessment, by simultaneously analyzing several air pollutants, along with sociodemographic, lifestyle, and professional exposures, over a 12-year follow-up. We also tested potential mediation of pathways including circulating inflammation, cardiovascular, kidney function, lipid- and platelet-related biomarkers in the link identified, to provide hints on the pathophysiological mechanisms underlying PD.

Results

Associations between air pollution and incident PD

Table 1 reports the main sociodemographic and epidemiological characteristics of the analyzed sample and the comparison to participants excluded from the analysis. The analyzed subjects were younger (55.64 (11.91) vs 62.93 (12.30) years) and showed a generally higher education level and a lower prevalence of chronic health conditions (p < 0.0001).

Table 1 Baseline characteristics of the population under study

After 11.2 (2.0) years (median (IQR)) of follow-up, 213 incident cases of PD were identified, with a standardized incidence rate of 0.61 per 1000 person-years in females (95% Confidence Interval = [0.49–0.75]), 1.04 in males ([0.88–1.24]) and 2.4 in participants ≥65 years ([2.00–2.80]).

A PCA of air pollutants levels available within the Molise region revealed three main principal components scores, explaining >5% of common variance across all environmental variables tested: PC1 (38.2%), PC2 (19.5%) and PC3 (8.5%; Fig. S1). While PC1 was quite clearly tagging PM10 levels, PC2 and PC3 showed moderate to high loadings of ozone (O3), carbon monoxide (CO), sulfur dioxide (SO2), and of nitric oxides (NOx) and of BTX hydrocarbons (benzene, toluene and xylene), respectively, although less clearly (Fig. S2a–c). This pattern was consistent with Spearman’s correlations across all the pollutants tested together, averaged across the years (Fig. S3).

We observed a significant association of PC1 with an increased PD risk (HR [95% CI] = 1.04 [1.02–1.06] per unitary increase of PC1 score), which was stable across models adjusted for professional factors (1.05 [1.02–1.07]), smoking (1.05 [1.03–1.07]) and living in urban settings (1.04 [1.02–1.07]). No other PC score showed significant associations surviving Bonferroni correction for multiple testing of three PC scores (α = 0.05/3 ≈ 0.017; Table 2, Table S1). Since PC1 showed high loadings of PM10 and was therefore clearly tagging the levels of this pollutant, we directly tested average PM10 levels for association with incident PD risk, which revealed relative risks consistent with those observed for PC1. This approach was preferred over testing yearly levels of PM10 as a time-varying exposure, due to the high correlations across the years of follow-up (Fig. S4). Each unitary (µg/m3) increase of PM10 was associated with an 18% (95% CI = (13–24)%) increase of PD risk in the most conservative model. The association was stable in a model further enriched for lifestyles (or their proxies) other than smoking which may influence the outcome, including drinking status, adherence to Mediterranean Diet, physical activity and BMI (19 (13–24)% increase of PD risk per µg/m3 increase of PM10).

Table 2 Results of the main analysis of incident PD risk vs principal components of air pollutants exposure

Sensitivity analyses

Sensitivity analyses removing participants with PD onset in the first 2, 5, and 10 years after the start of follow-up (to remove incident cases possibly due to other chronic exposures not directly linked to air pollution) showed consistent associations, as did the analysis after removal of early onset PD cases (Table S2). Since PM10 showed a bimodal distribution (Fig. S5a–j), to guard against potential biases resulting from departures from normality, we compared participants above and below the median level of PM10 in the analyzed population (11.65 µg/m3; Table S3). Participants exposed to average PM10 concentrations >11.65 µg/m3 showed an approximately 14-fold (9–24) increase of PD risk, compared to subjects exposed to PM10 concentrations ≤11.65 µg/m3 (Fig. 1).

Fig. 1: Kaplan Meier curves of incident PD events vs PM10 quantiles.

Incident Parkinson’s disease events are compared between the two quantiles of exposure to PM10: below the median (11.65 μg/m3; green) and above the median (red).

Mediation analysis

When we tested potential mediations of the association between PM10 levels and incident PD, only three circulating markers showed at least suggestive associations (p < 0.1) with both PM10 levels and incident PD risk, namely platelet distribution width (PDW), cystatin C (cys-C) and lipoprotein a (Lp(a)). Only the latter explained a significant proportion of the association between PM10 levels and incident PD (2.8 (0.9; 8.4)%, p = 0.004), being negatively associated with both incident PD risk and PM10 levels (Table S4). No other putative mediators tested explained a significant proportion of the association between PM10 and PD risk, although plateletcrit approached significance (Table 3).

Table 3 Results of the mediation analysis of circulating blood markers in the association between PM10 and incident Parkinson’s Disease

Discussion

In the present manuscript, we analyzed the relationship between air pollution and incident PD risk in an Italian population, identifying a notable influence of PM10 levels on an increased risk of PD. This influence was independent of several other known risk/protective factors for PD, including lifestyles like smoking20 and professional factors like exposure to toxic compounds. While a link between PM10 and neurodegenerative risk has long been hypothesized and an involvement of PM10 in PD mechanisms has been reported in animal models21, only a few studies supported a clear association with PD, and recent meta-analyses revealed contrasting results and a still unclear relationship7,10,22. A geospatial analysis of PD cases and controls reported a significant difference in the mean annual NO2 and PM10 levels between areas where PD cases were concentrated (hotspots) and those where they were not (coldspots)18. These findings are in line with a large nested case-control study from a Chinese health insurance cohort, where PM10 was the only pollutant, among many others, reported to increase PD risk15. However, other cohort studies11,13,23,24, case-control studies16,17,25,26 and even meta-analyses27,28 found no evidence of association with PM10. Other works identified associations between PM2.5, incident PD risk (although limited to women living in a specific US region)12,13 and clinical events related to PD, like an increased mortality in PD patients10. Conversely, the Rome Longitudinal Study showed no association of PM2.5 and PM10 levels with Parkinson’s disease14, in spite of the evidence reported for vascular dementia29.
Other studies found associations between exposure to air pollutants – particularly PM2.5, PM10, and NO2 - and neuropathological hallmarks of PD, but also with neuroimaging endophenotypes of neurodegeneration like a reduced prefrontal cortex, hippocampal volume, and cortical thickness7. Interestingly, a recent umbrella review and meta-analysis revealed a significant association of PM10 levels with vascular dementia but not with Alzheimer’s Disease risk30. This, along with the inconsistent findings from epidemiological studies on PD and with the scarcity of studies on animal PD models, suggests caution in the interpretation of our findings.

Of note, NO2, one of the pollutants most often reported by previous studies to influence PD risk (refs. 7,12), did not reveal any significant association in our population. This may depend on the different approach used here to investigate air pollution exposure while controlling for collinearity bias – namely through PC scores rather than at single pollutant levels. Still, our finding is not that surprising in view of the inconsistent results published so far for this and other compounds, which may be due to different factors, such as the duration of the exposure, the age of participants, occupational backgrounds and other professional exposures to pollutants and toxic compounds21.

Despite the promising findings on PM10, the molecular mechanism of action of this pollutant on the central nervous system remains largely unclear. Among the potential pathways tested, lipoprotein a was the only putative mediator explaining a significant proportion of their link. Although this proportion is relatively small (2.8%), this still provides interesting insights into the pathophysiology of PD and its relationship with air pollution. Lp(a) is a cardiovascular risk marker with a fundamental role in binding other lipoproteins to transport phospholipids in the blood and may have dual influence on inflammation31. Interestingly, Lp(a) was found to play a fundamental role in the binding of α-synuclein and high density lipoproteins (HDL) in the human plasma, a potential mechanism through which α-synuclein may be transported through the blood brain barrier32, which may explain the inverse association between PD and Lp(a). Alternatively, this relationship may be explained by the protective influence of Lp(a) against diabetes31,33, which is increasingly reported as a PD risk factor34. A handful of studies specifically investigating the relationship between Lp(a), PD and its endophenotypes have reported contrasting results, like weak positive associations with PD symptoms severity, no association with disease staging35 and a negative correlation with depressive symptoms36 and cognitive performance37,38.

To our knowledge, no studies investigated the relationship between exposure to PM10 and Lp(a), which warrants further investigations. Similarly, in light of the large genetic influence on circulating Lp(a) levels33, gene-by-environment interactions should also be investigated, to further clarify how pollution and other exposome layers may influence incident PD risk.

This study presents strengths like the comprehensive approach to multiple air pollutants from diverse sources and the simultaneous assessment of many sociodemographic, lifestyle, and professional factors, which allowed us to identify a clear influence of PM10 on incident PD risk. Moreover, we are not aware of any study testing potential explanations of the link between pollution and any neurodegenerative disorder through a vast range of circulating biomarkers as presented here, tagging different pathways. However, our work also has some limitations. First, the use of land measurements to interpolate exposure maps may not be as precise as maps integrating land and satellite data; moreover, data for the first year of recruitment (2005) were not available in the ARPA Molise database. Still, Molise is a rather rural and upland region which never experiences sudden massive changes in the degree of anthropic activities, hence the levels of air pollutants tend to remain stable across the years. On the one hand this poses questions on the generalizability of our findings, but on the other hand this may explain the large effect sizes observed. Some residual unmeasured confounding may still explain part of this effect size, but this possibility is quite remote in light of the E-values computed (29.05, see URLs)39. Second, the lack of PM2.5 data – which was in fact not available for the period of interest - hindered a deeper analysis of the influence of different components of particulate matter on PD risk. We plan to further investigate these associations using environmental data from other independent European sources, which will be made available in the future. Third, the algorithm used for defining incident PD cases may have classified as cases some subjects affected by related disorders (e.g., parkinsonisms) or with a still unclear diagnosis, a problem often affecting neurological clinical practice.
However, our PD incidence rate was in line with the data reported in other Italian regions40,41 and the diagnosis was clinically confirmed in a subgroup of patients with further clinical data available from our center. Fourth, the simultaneous assessment of environmental exposures and circulating markers does not allow us to unambiguously infer direction of causality between them. Last, other environmental factors like exposure to pesticides (not currently available in our cohort) may help untangle the investigated relationship, although proxies like agricultural activity were not previously associated with PD in our cohort34. We are now working on improving many of these aspects, so as to run more powerful analyses with more precise exposure estimates and longer and validated neurological follow-up in the near future.
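
As a reading aid for the E-value mentioned among the limitations, the sketch below applies the closed-form expression commonly used for a risk ratio, E = RR + sqrt(RR × (RR − 1)). It is not the code behind the online calculator cited in the URLs, and it assumes that a hazard ratio can stand in for a risk ratio, which is usually considered reasonable only when the outcome is rare. The function name `e_value` is introduced here for illustration.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a ratio estimate (closed-form expression for a risk ratio).

    Assumption: hazard ratios are treated as risk ratios (rare outcome).
    """
    if rr <= 0:
        raise ValueError("ratio estimate must be positive")
    # Protective estimates are inverted so that the ratio is >= 1.
    if rr < 1.0:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))


# Illustrative input only: a ratio of 2 needs a confounder associated with
# both exposure and outcome by about 3.41 to be fully explained away.
print(round(e_value(2.0), 3))  # 3.414
```

The larger the E-value, the stronger an unmeasured confounder would have to be to account for the observed association on its own.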

To conclude, we identified a significant direct influence of PM10 levels on incident PD risk in an Italian population cohort. This association was partly explained by lipoprotein a, suggesting that this may play a fundamental role in the link between air pollution and neurodegenerative risk. These findings point towards PM10 as a potential target to lower neurodegenerative risk in the Italian population and warrant further investigations on lipids-related pathways to clarify potential mechanisms underlying PD etiology.

Methods

Population of study

The study population consisted of participants in the Moli-sani project (N = 24,325; 51.9% women, 35−99 years), a prospective cohort of Italian residents recruited from the general population of the Molise Region (Central-Southern Italy), between March 2005 and April 2010. Exclusion criteria were pregnancy at the time of recruitment, inability to understand terms of participation, current poly-traumas (i.e., simultaneous injury to several organs or body systems), coma, or refusal to sign the informed consent42. The Moli-sani Study was approved by the ethical committee of the Catholic University of Rome (approval nr: P99, A-931/03-138-04/CE/2004, 11 February 2004; ClinicalTrials.gov ID: NCT03119142) and all the participants provided written informed consent.

Geolocalization

For this study, participants were geolocalized based on their residence address – available for all the participants - with the use of complementary dedicated software APIs, like Geokettle, QGIS and Here (see URLs). The data were cleaned up manually correcting the errors detected, such as incorrect postal codes. To facilitate the geocoding process, the addresses were transformed by replacing the abbreviations. Using an automated procedure based on the Here API, it was possible to assign latitude and longitude coordinates for all the subjects in the database. A handful of participants with residence outside the Molise region (n = 17) were geolocated accordingly and hence removed from the analysis. The results of the geocoding were verified using the shapefiles of the municipalities’ boundaries of the region. Addresses placed outside the limits of the city of residence were manually geolocated using Google Maps. With this procedure, it was possible to assign latitude and longitude coordinates for 24,303 subjects in the database, with a high level of confidence. The subjects were then linked to air pollution maps (built as described below), which allowed us to estimate the amount of exposure to each pollutant at the physical coordinates where subjects reported their residence.

Exposure: air pollution

We estimated yearly levels of exposure to ten different pollutants, including Nitrogen oxides (NOX, NO, NO2), Ozone (O3), particulate matter with aerodynamic diameter <10 μm (PM10), Sulfur dioxide (SO2), Carbon monoxide (CO) and BTX hydrocarbons (benzene, toluene and xylene) in 2006−2018, applying an ordinary Kriging interpolation algorithm – a geostatistical data interpolation method used to infer unsampled points across the spatial field43 (see URLs) - to fourteen land measurements of air pollutants available from the regional environmental authority (ARPA Molise, see URLs). To apply geostatistical algorithms and display analysis results on maps, we used the ESRI ArcGIS tool, a proprietary geographic information system (GIS; see URLs) which allowed us to display air monitoring stations on the regional Molise map and finally build maps of exposure to different air pollutants (see Fig. 2a, b). PM2.5 was available for a small period of the follow-up range (nine months in 2018), in only four land measurement centers of the network, and was therefore excluded from the analysis. After defining the variogram to evaluate the spatial autocorrelation of observed data, the Kriging algorithm was applied to each pollutant for each year of the follow-up time considered, over participants with physical coordinates available. Algorithm parameters (Semivariogram model, Lag Size, and Nugget) were fine-tuned so as to minimize the Root Mean Squared Standardized Error, and the results were validated through cross-validation. Overall, 130 different exposures (10 pollutants × 13 years) were obtained for 24,303 participants. Toluene levels in 2017 presented a single level of exposure and did not vary across subjects, therefore this variable was removed.
Since Kriging returned exposure intervals which were largely overlapping but with variable limits across the years, to reduce collinearity among variables, we decided to apply a principal component analysis (PCA) to derive latent variables which could capture most of their shared variance. To this end, we computed the mean of the two limits for each level of exposure (or rank) and each pollutant, so as to have point estimates of exposure. The levels of the analyzed pollutants (Fig. S5a–j, Table S3) were comparable to the Italian estimates as reported by the Italy Air Quality Index (AQI) database and often lower than thresholds recommended by the World Health Organization (WHO) air quality guidelines (see URLs).
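
To make the interpolation step more concrete, the following is a minimal pure-Python sketch of ordinary Kriging at a single target location. It is not the ArcGIS implementation used in the study: the exponential semivariogram, its parameters (`nugget`, `sill`, `range_`), the station coordinates and the pollutant values are all hypothetical, and no parameter tuning or cross-validation is attempted.

```python
import math


def exp_variogram(h, nugget=0.0, sill=1.0, range_=10.0):
    """Exponential semivariogram (hypothetical parameters); gamma(0) = 0."""
    if h == 0.0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h / range_))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ordinary_kriging(stations, values, target, variogram=exp_variogram):
    """Return (estimate, weights) at `target`; weights are forced to sum to 1."""
    n = len(stations)

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Kriging system: semivariances between stations plus the unbiasedness row.
    a = [[variogram(dist(stations[i], stations[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(dist(s, target)) for s in stations] + [1.0]
    weights = solve(a, b)[:n]  # last unknown is the Lagrange multiplier
    return sum(w * v for w, v in zip(weights, values)), weights


# Four hypothetical monitoring stations on a square, with made-up PM10 values.
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
pm10 = [10.0, 12.0, 11.0, 14.0]
estimate, weights = ordinary_kriging(stations, pm10, (5.0, 5.0))
print(round(estimate, 2))  # 11.75: the centre is equidistant from all stations
```

In this toy layout the target sits at the centre of the square, so all four stations receive the same weight and the estimate equals their mean; at a station's own coordinates the estimate reproduces the measured value.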

Fig. 2: Geolocalization of participants and air pollutants maps in the Moli-sani cohort.

a Geographical location (based on residence address) of each participant in the Moli-sani cohort. b Example of air pollutant exposure map built in the study, based on the Kriging algorithm (PM10, year 2007). Triangles represent the land monitoring stations used to build the map, whose data are publicly available from the regional environmental authority (Agenzia Regionale per la Protezione Ambientale del Molise - ARPA Molise, see URLs). Gradient colors represent increasing PM10 levels (μg/m3), from low (blue shades) to high (red shades).

We carried out a preliminary Kaiser-Meyer-Olkin test of sampling adequacy, which revealed a very good factorability (MSA = 0.97). A Bartlett’s Test of Sphericity suggested a significant discrepancy with the identity matrix, hence notable reciprocal correlations across the variables, which further supported our choice. PCA was performed through Singular Value Decomposition with orthogonal (varimax) rotation (built-in prcomp() function in R44). The resulting PC scores were then tested as main exposures in survival analyses, as detailed below.
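
The idea of a leading principal component and its share of explained variance can be illustrated with a small sketch. This is not the prcomp() workflow described above: it uses plain power iteration on a correlation matrix instead of Singular Value Decomposition, applies no rotation, extracts only the first component, and runs on three made-up variables whose names (`pm10_2006`, `pm10_2007`, `o3_2006`) are hypothetical.

```python
import math


def standardize(columns):
    """Centre each variable and scale it to unit sample variance."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        out.append([(v - mean) / sd for v in col])
    return out


def correlation_matrix(columns):
    """Pearson correlation matrix of the variables (one list per variable)."""
    z = standardize(columns)
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z]
            for zi in z]


def leading_component(corr, iterations=200):
    """Power iteration: (eigenvalue, loadings) of the first component.

    Toy approach: it assumes the starting vector is not orthogonal to the
    leading eigenvector and that the top eigenvalue is well separated.
    """
    p = len(corr)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        new = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    eigenvalue = sum(vec[i] * sum(corr[i][j] * vec[j] for j in range(p))
                     for i in range(p))
    return eigenvalue, vec


# Made-up data: two perfectly correlated variables and one unrelated variable.
pm10_2006 = [1.0, 2.0, 3.0, 4.0]
pm10_2007 = [3.0, 5.0, 7.0, 9.0]
o3_2006 = [1.0, -1.0, -1.0, 1.0]
corr = correlation_matrix([pm10_2006, pm10_2007, o3_2006])
eigenvalue, loadings = leading_component(corr)
print(round(eigenvalue / len(corr), 3))  # 0.667 of the total variance
```

Here the two correlated variables load equally on the first component while the unrelated one loads close to zero, which mirrors how a component can "tag" a single pollutant measured over several years.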

Outcome: incident PD cases

Ascertainment of the Parkinson’s disease status was carried out through linkage with Electronic Health Records (EHRs) databases using fiscal code of each participant as unique identifier. The Molise regional drug prescription registry was first used to identify patients starting PD therapy during the follow-up. According to the national health service, anyone prescribed a drug is entered into a specific registry maintained by the local health authority. Only patients with chronic use of medications with Anatomical Therapeutic Chemical (ATC) classification N04XX (anti-Parkinson drugs) were considered. A similar method for the identification of PD cases has been used in other studies (e.g., ref. 45).

Then, drug registry identification of cases was integrated through the regional registry of hospital discharge records (HDRs). Hospitalization was defined as any stay lasting ≥24 h in any hospital, clinics, emergency room, or similar, be it private or public. Hospitalizations showing an International Classification of Diseases, 9th revision (ICD-9) code for PD (332.0) in primary and secondary diagnoses were classified as incident cases. The earliest date indicating a PD status in the above mentioned registries was used to establish the onset of PD. Three independent neurologists with experience in movement disorders further validated PD diagnoses by combining information on anti-parkinsonian drugs in use (levodopa, dopamine agonists, MAO-B inhibitors) and hospital discharge records. Movement disorder specialists evaluated prescriptions to ensure that only those with dosages indicative of a progressive neurodegenerative disorder were considered. This included evidence of gradual dose escalation or the addition of other anti-parkinsonian medications over time, reflecting disease progression. Consequently, individuals with chronic low-dose dopamine agonist prescriptions, without signs of dose adjustment or combination therapy, were excluded, thereby minimizing the risk of misclassifying cases of Restless Legs Syndrome. Follow-up of each cohort member continued until the date of the diagnosis of Parkinson’s disease, death, or 31 December 2018.
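
The follow-up rule described above (earliest registry date as onset, with censoring at death or 31 December 2018) can be sketched as follows. The function name, argument names and the example dates are hypothetical, the 365.25-day year is a convention chosen here, and the clinical validation steps are not represented.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

STUDY_END = date(2018, 12, 31)  # administrative end of follow-up stated in the text


def follow_up(recruited: date,
              pd_dates: Iterable[date],
              death: Optional[date] = None,
              study_end: date = STUDY_END) -> Tuple[float, bool]:
    """Return (years of follow-up, PD event flag) for one participant.

    `pd_dates` collects every registry date indicating PD status
    (drug prescriptions, hospital discharges); the earliest one is the onset.
    """
    onset = min(pd_dates, default=None)
    candidates = [d for d in (onset, death, study_end) if d is not None]
    exit_date = min(candidates)
    # An event counts only if PD onset is what ends the follow-up.
    event = onset is not None and onset == exit_date
    years = (exit_date - recruited).days / 365.25
    return years, event


# Hypothetical participant with two registry hits: the earlier date is used.
years, event = follow_up(date(2006, 1, 1), [date(2012, 6, 1), date(2011, 3, 15)])
print(round(years, 1), event)  # 5.2 True
```

A participant without any registry hit and without a recorded death is simply censored at the end of 2018.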

In a subsample (n = 99) of the 213 incident PD cases identified using these criteria, it was possible to further validate PD diagnosis through medical records of the IRCCS Neuromed outpatients Parkinson Center (records of periodic outpatient visits, clinical diaries, discharge letters, DATSCAN examinations, drug prescriptions), which confirmed a PD clinical diagnosis and hence correct classification in 100% of these cases.

Statistical analysis

All statistical analyses of the present manuscript were carried out in R (see URLs)44. After removing prevalent PD cases (n = 52), as well as those participants with missing information on incident events (372), sociodemographic covariates (38) and environmental exposure (22, see above), we finally tested 23,841 subjects for incident PD risk vs air pollution exposure through multivariable Cox Proportional Hazards (PH) regression models46. To test potential influences of air pollution independent of other clusters of risk/protective factors, multivariable models were built, incrementally adjusted for (i) age and sex (Model 1); (ii) education level completed and professional factors like working class and exposure to toxic compounds (Model 2); (iii) lifestyles like smoking (Model 3) and (iv) living in urban or rural settings, since this may affect both the levels of exposures to air pollutants and the access to movement disorders specialists (Model 4). Except for age and sex, these covariates were identified as confounders based on Directed Acyclic Graph (DAG) analysis, through DAGitty v3.1 (see URLs and Fig. S6). All the covariates are described in detail below. The main pollution exposure was initially tested as PC scores, as computed through the PCA over all yearly pollutant levels, which were tested all together in multivariable models. If any PC score showed a significant association, levels of the single pollutant/s tagged by that PC were tested, averaged over the actual years of follow-up for each participant. All the models showed Variance Inflation Factors (VIF) < 2, suggesting negligible collinearity bias.
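
The sample derivation reported above can be checked with a few lines of arithmetic. The sketch assumes that the exclusion groups are counted sequentially and do not overlap, which the text implies but does not state explicitly; the dictionary labels are paraphrases introduced here.

```python
recruited = 24_325  # participants in the Moli-sani cohort

# Exclusions as reported in the text (assumed mutually exclusive).
exclusions = {
    "prevalent PD": 52,
    "missing incident-event information": 372,
    "missing sociodemographic covariates": 38,
    "missing environmental exposure": 22,
}

remaining = recruited
for reason, n in exclusions.items():
    remaining -= n
    print(f"{reason:<38} -{n:>3} -> {remaining:,}")

# The 22 participants without exposure match the gap between the cohort
# size and the 24,303 geocoded subjects.
assert recruited - 24_303 == exclusions["missing environmental exposure"]
assert remaining == 23_841
```

Under this assumption the counts add up exactly to the 23,841 subjects entering the Cox models.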

A sensitivity model with additional lifestyles other than smoking (namely drinking status, adherence to Mediterranean Diet, physical activity in leisure time) or their proxies (e.g., body mass index, BMI) was performed to analyze the stability of the association in enriched models with other exposures potentially influencing the outcome. Similarly, additional sensitivity analyses were carried out, removing participants with PD onset in the first 2, 5 and 10 years after the start of follow-up – to rule out the influence of other chronic exposures not directly linked to air pollution – and removing cases with early onset (below 50, 55 and 60 years of age), which were likely under strong genetic influence.

Definition of covariates

Smoking status was defined as a three-class variable based on cigarette smoking habits of participants: smokers, ex-smokers (i.e., subjects who had quit at least one year before the interview) and never-smokers (reference class).

Leisure-time physical activity was assessed through a structured questionnaire and expressed as daily energy expenditure in metabolic equivalent task-hours (MET-h/day)47.

Food intake was assessed through the validated Italian EPIC food frequency questionnaire48. Adherence to Mediterranean Diet was defined according to the Mediterranean Diet Score, ranging from 0 (low adherence) to 9 (high adherence)49. Then we defined three adherence classes: low (Trichopoulou score 0–3, used as reference), moderate (4–6) and high (7−9). Together with additional questions, the EPIC questionnaire also allowed alcohol consumption habits to be computed, as described by ref. 50. Participants were classified based on the estimated daily alcohol intake in the year before enrollment: life-time abstainers, former drinkers, occasional drinkers and current drinkers who drank 1–12 (reference class), 12.1–24, 24.1–48 and >48 g/day.

Height and weight were measured for each participant, as well as waist circumference (cm), which was measured in the middle between the 12th rib and the iliac crest, while hip circumference (cm) was measured around the buttocks. Then Body mass index (BMI) was calculated, and participants were grouped into three categories: <25, 25–29.9, and ≥30 kg/m2.
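
A minimal sketch of the BMI computation and of the three categories listed above follows. The input units (kilograms and centimetres) and the function names are assumptions made for illustration, since the text reports only the final categories.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m2 (input units are assumed, not from the text)."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value onto the three categories used in the analysis."""
    if value < 25.0:
        return "<25"
    if value < 30.0:
        return "25-29.9"
    return ">=30"


# Hypothetical participant: 70 kg, 175 cm.
print(bmi_category(bmi(70.0, 175.0)))  # <25
```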

Socioeconomic status (SES) information was self-reported and/or collected by a structured questionnaire administered by trained personnel. Educational attainment was defined as the education level completed and subjects were divided into four classes: None/Primary (reference), Lower secondary, Upper secondary and Post-secondary. Current occupational social class was classified based on the Registrar General’s occupation classification scheme, and ranked as previously described by ref. 51. Based on this rank, five occupational classes were defined: professional/managerial, skilled non-manual, skilled manual, partly skilled/unskilled and unemployed/unclassified (reference).

Professional exposure to potentially toxic compounds, like concrete, eternit, aluminum, starchy and paper-derived compounds was also self-reported by each participant (yes/no).

Prevalent diabetes and hyperlipidemia were defined as dichotomous variables (Yes/No), based on the reported and verified use of specific drugs for the treatment of these disorders. Prevalent cardiovascular disease (CVD) and cancer were defined as binary variables, classified into subjects reporting medical history of the disease (possibly supported by medical documentation or by the use of specific drugs), and those with no medical history of the disease.

Urban or rural setting was defined based on the urbanization level of the place of residence, as described by the European Institute of Statistics (EUROSTAT definition) and as defined in the Atlante Statistico dei Comuni, 2014 Edition (https://www.istat.it/notizia/atlante-statistico-dei-comuni/), by the Italian National Institute of Statistics (ISTAT).

Mediation analysis

We tested the role of different biomarkers available in the Moli-sani study as potential mediation pathways in the link between air pollution and PD risk. Circulating inflammation has been hypothesized to play a role in the above-mentioned link7, and was therefore tested through a composite blood-based inflammation index, called INFLA-score, capturing both serum and cellular-circulating inflammation52. Additional non-composite blood markers were also tested, which were previously reported to influence PD risk. These included (i) lipid biomarkers, like total cholesterol, triglycerides, high (HDL) and low density lipoprotein (LDL)53,54,55, lipoprotein a (Lp(a)), apolipoprotein A1 (Apo-A1) and B (Apo-B)32,35,56,57,58; (ii) markers of glucose metabolism59,60: glucose, C-peptide, insulin; (iii) cardiovascular risk markers: N-Terminal Pro-B-Type Natriuretic Peptide (NT-proBNP) and high-sensitivity cardiac troponin I61; (iv) other hormones: testosterone62 and vitamin D63; (v) renal markers: uric acid, albumin, creatinine, cystatin C (Cys-C)64,65,66; (vi) other inflammation markers like high sensitivity C-reactive protein (CRP) and immunity-related markers like total white blood cells (WBC), lymphocytes (LY), neutrophils (NE) and neutrophil-to-lymphocyte ratio (NLR)57,67,68,69,70,71; (vii) red blood cell traits like red cell distribution width (RDW)72, and hemoglobin levels (Hgb)73; (viii) platelet parameters like mean platelet volume (MPV), plateletcrit (PCT), platelet count (Plt) and distribution width (PDW)61,74,75,76.

All these biomarkers were preliminarily screened as putative mediators by testing associations with both the exposure (PM10 levels) and the outcome (incident PD risk). Only three markers showed at least suggestive associations with both the exposure and the outcome (p < 0.1; see Results section) – namely PDW, Lp(a) and Cys-C - and hence underwent a formal mediation analysis. This was carried out through the CMAverse package77, using a counterfactual approach. This framework is more robust than other traditional approaches since it protects against mediator-outcome confounding affected by the exposure, exposure-mediator interaction, and exposure-outcome confounding78. The counterfactual approach allows simultaneous estimation of the direct (the PM10-incident PD association not due to the mediator), indirect (association due to the mediator), and total effects (the sum of the direct and indirect effects). Using the cmest() function (regression-based method), we estimated these effects in fully adjusted Cox PH regressions (Model 4), for each putative mediator separately. Confidence intervals for the indirect effects were estimated through a bootstrap-based approach, over 1000 bootstrap samples with replacement. Where significant exposure-mediator interactions were detected (p < 0.05, for Cys-C), the direct, indirect and total effects were estimated under this assumption (option EMint = T). The proportion of association explained by each putative mediator was calculated as Direct Effect × (Indirect Effect - 1)/(Total Effect - 1), and proportions with p < 0.05 were considered significant.
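
The proportion-explained formula given above can be written out as a small worked example. The sketch assumes that the direct, indirect and total effects are expressed as hazard ratios and that, on this ratio scale, the total effect equals the product of the direct and indirect effects when it is not supplied; the numerical inputs are purely illustrative and are not estimates from this study.

```python
from typing import Optional


def proportion_mediated(direct_hr: float,
                        indirect_hr: float,
                        total_hr: Optional[float] = None) -> float:
    """Direct x (Indirect - 1) / (Total - 1), with effects on the ratio scale.

    Assumption: if `total_hr` is omitted, it is taken as direct x indirect.
    """
    if total_hr is None:
        total_hr = direct_hr * indirect_hr
    if total_hr == 1.0:
        raise ValueError("proportion is undefined when the total effect is 1")
    return direct_hr * (indirect_hr - 1.0) / (total_hr - 1.0)


# Illustrative values only: direct HR 1.15, indirect HR 1.005.
print(round(proportion_mediated(1.15, 1.005), 3))  # 0.037
```

With these made-up inputs, roughly 3.7% of the total association would be attributed to the mediator, which shows how a small indirect effect translates into a small proportion even when the direct effect is sizeable.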

Measurement of circulating biomarkers

Blood samples were obtained between 7 and 9 AM from participants who had fasted overnight and had refrained from smoking for at least 6 hours (h). Biochemical analyses were performed in the centralized Moli-sani laboratory, as described elsewhere79. All haemochromocytometric analyses (white blood cell and platelet counts, granulocyte % and lymphocyte %) were performed by cell counter (Coulter HMX, Beckman Coulter, Milan, Italy) within 3 h from venipuncture. CRP was measured in fresh serum, by a latex particle-enhanced immunoturbidimetric assay (ILab 350 Instrumentation Laboratory, Milan, Italy).

Analyses of creatinine, apolipoproteins, C-peptide, cys-C, insulin, Lp(a), NT-proBNP, testosterone, troponin I and vitamin D were performed within the BiomarCaRE Consortium and have been described in detail elsewhere (https://cordis.europa.eu/project/id/278913/reporting/it). Main information and descriptive statistics for all the circulating biomarkers currently available in the Moli-sani study are reported in ref. 80.

URLs

ARPA Molise: https://www.arpamolise.it/, Kriging algorithm: https://desktop.arcgis.com/en/arcmap/10.3/tools/3d-analyst-toolbox/how-kriging-works.htm, GeoKettle: http://www.geokettle.org/, HERE: https://www.here.com/, QGis: https://www.qgis.org/en/site/, Italy Air Quality Index (AQI) database: https://www.aqi.in/dashboard/italy, World Health Organization (WHO) air quality guidelines: https://www.eea.europa.eu/publications/status-of-air-quality-in-Europe-2022/europes-air-quality-status-2022/world-health-organization-who-air, R: https://www.r-project.org/, DAGitty: https://www.dagitty.net/, E-value online calculator: https://www.evalue-calculator.com/.