Abstract
Using data from the 2005–2008 National Health and Nutrition Examination Survey (NHANES), we collected information on 2572 subjects and used a generalized linear model to investigate the association between urinary heavy metal levels and glaucoma risk. In addition, we developed an individualized risk prediction model using machine learning algorithms and interpreted the model through feature importance analysis, accumulated local effects analysis, and interaction effects. We found a significant association between log-transformed arsenic (As) metabolites, especially arsenocholine (AC), and glaucoma after adjusting for a series of confounders, including urinary creatinine (β = 1.090, 95% CI: 0.313–1.835). The Shapley Additive Explanations (SHAP) analysis and clinical risk scores also indicated that As metabolites contributed more to predicted glaucoma risk than other variables. This study applied machine learning for the first time to explore the relationship between heavy metals and glaucoma while analyzing the effects of multiple heavy metal exposures on the disease, improving predictive power compared with conventional models. Our results provide insights into the potential role of heavy metals in the pathogenesis of glaucoma, may facilitate the discovery of new biomarkers for early diagnosis, risk assessment, and timely treatment of glaucoma, and may guide public health measures to reduce heavy metal exposure.
Introduction
Glaucoma is an optic neuropathy characterized by depression of the optic disc, apoptosis of retinal ganglion cells, and loss of visual acuity1. The global prevalence of glaucoma is approximately 3.5% in people aged 40 to 80 years. With the growth of the global elderly population, the number of glaucoma patients is expected to reach 111.8 million by 2040, making it one of the leading causes of irreversible visual disability worldwide and posing a major public health challenge2. Based on ocular pathologic features, glaucoma can be subdivided into open-angle glaucoma (OAG) and closed-angle glaucoma, with primary open-angle glaucoma being particularly common in middle-aged and older adults3. Since most glaucoma is the result of damage to the optic nerve caused by elevated intraocular pressure (IOP), pharmacologic and surgical treatments are almost always aimed at lowering the IOP4, but patients with normotensive glaucoma still exist, and the damage to their vision cannot be reversed. Prompt identification of other risk factors and early intervention in high-risk populations may provide new perspectives of understanding in the diagnosis and treatment of glaucoma.
Heavy metals are environmental pollutants that are widely distributed in the natural environment and are also classed as endocrine disruptors; they are highly stable, resistant to degradation, bioaccumulative, and biotoxic5. Studies6 have demonstrated that whole-body heavy metal levels accumulate with aging and that visual acuity reflects optic nerve function, suggesting that toxic heavy metals may also be associated with optic nerve damage in middle-aged and older adults. Experimental studies have shown that arsenic exposure impairs the development of optic nerve axons7.
Currently, among the epidemiologic studies focusing on the association between heavy metals and glaucoma, Lee et al.8 conducted a cross-sectional study of 5,198 patients with open-angle glaucoma and normal subjects in Korea, and found that log-transformed blood cadmium levels were positively correlated with the prevalence of OAG with low intraocular pressure (OR = 1.41, 95% CI: 1.03–1.93). Meanwhile, there was no significant correlation between log-transformed levels of blood cadmium, mercury, and lead and disease prevalence in patients with high IOP. Another Korean study, covering 2,680 people, found no association between blood cadmium or lead levels or urinary arsenic levels and glaucoma diagnosis9. Other studies have focused on the effects of the intake of essential metal elements and vitamins on glaucoma, as these can directly or indirectly affect optic nerve trophic status, and are considered to be potential influences on the development of glaucoma2,4,10. However, most studies ignore the chemical changes in heavy metal metabolism in vivo, and dietary intake does not fully reflect the actual level of heavy metal exposure in the body. Considering the cumulative damage caused by long-term exposure to such heavy metals, the combined effects on the optic nerve and even the entire ocular microenvironment should be taken into account. Moreover, the traditional statistical methods used in published studies are usually applicable to small data sets, and it is difficult to explain the complex patterns of action in high-dimensional data11. Machine learning (ML) methods are applicable to large-scale data sets, and can automatically adjust the model to optimize performance through hyper-parameter tuning, cross-validation12. For example, the XGBoost algorithm has higher accuracy than traditional models in prediction tasks, and performs particularly well with data with complex or unknown patterns13.
Because the pathogenesis of glaucoma remains unclear, and on the basis of the published literature, we hypothesized that heavy metal exposure is closely associated with glaucoma and that urinary heavy metal levels may be a potential predictor of the disease. We therefore first tested this hypothesis by generalized linear regression analysis in participants of the 2005–2008 National Health and Nutrition Examination Survey (NHANES). We then constructed a prediction model based on machine learning algorithms to explore the relationship between heavy metal exposure and glaucoma from different perspectives, and applied the model to randomly selected subjects with different risk factors to generate risk scores of clinical predictive value. Such early screening can help control the progression of glaucoma and raise public awareness of protection, contributing to a reduction in the socioeconomic burden and adverse public health events associated with this serious and irreversible eye disease.
Methods
Study population from NHANES 2005–2008
NHANES is administered by the Centers for Disease Control and Prevention (CDC). The survey is conducted in two-year cycles on a nationally representative sample, with representativeness ensured through standardized sampling and weighting schemes. Data for our study were obtained from the 2005–2008 NHANES through a retrospective survey of U.S. residents that included questionnaires, physical examination, imaging, serology, and urine testing. Beginning in 2005, NHANES performed additional ophthalmologic examinations such as Humphrey Matrix frequency-doubling technology (FDT) perimetry testing (N30-5 FDT; Carl Zeiss Meditec) and optic disc photographs, which were also used to assess participants’ glaucoma diagnosis. The NHANES data are publicly available and do not include personally identifiable information about the participants. Ethical approval for the NHANES study was obtained from the Research Ethics Review Board of the National Center for Health Statistics. All participants provided written informed consent.
Exposure variable assessment
In addition to the commonly studied lead (Pb), mercury (Hg), and cadmium (Cd), and in line with previous studies14,15 on arsenic (As) and its metabolites in relation to human health, we included dimethylated arsenical (DMA), monomethylated arsenical (MMA), arsenite III (As3), arsenate V (As5), arsenobetaine (AB), and arsenocholine (AC) in the analysis because these arsenic species have been reported to have cytotoxic and genotoxic properties. Data on a total of 20 urinary heavy metals were eventually collected. Urine samples are widely accepted as biomarkers of heavy metal exposure. Urinary heavy metals have a longer half-life than plasma measurements and require little special handling. Urinary heavy metal levels were measured in urine samples collected during home medical examinations, processed in a standardized laboratory, and quantified by solid phase extraction-high performance liquid chromatography-turbo ion spray ionization-tandem mass spectrometry (SPE-HPLC-TCI-MS/MS). More detailed laboratory procedures can be found in the NHANES Laboratory/Medical Technician Procedures Manual16. The limit of detection (LOD) for each heavy metal was set according to the standards for the laboratory test methods used by NHANES, and values below the LOD were replaced with LOD/2.
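The handling of non-detects described above, followed by the log transformation applied later in the statistical analysis, can be sketched as follows. This is a minimal illustration only: the function names, the sample concentrations, and the LOD of 0.4 ng/ml are hypothetical, and the natural logarithm is an assumption because the base of the logarithm is not stated.

```python
import math

def impute_below_lod(value, lod):
    """Return LOD/2 when a measurement falls below the limit of detection."""
    return lod / 2 if value < lod else value

def log_transform(values, lods):
    """Impute non-detects as LOD/2, then take the natural log (assumed base)."""
    return [math.log(impute_below_lod(v, l)) for v, l in zip(values, lods)]

# Hypothetical urinary concentrations (ng/ml) with a hypothetical LOD of 0.4 ng/ml.
concentrations = [0.1, 0.4, 2.5]
lods = [0.4, 0.4, 0.4]
log_levels = log_transform(concentrations, lods)
```

Only the first value is below the LOD, so it is the only one replaced before the transformation.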
Glaucoma evaluation standards
Participants were divided into a glaucoma group (n = 162) and a non-glaucoma group (n = 2410) according to the Rotterdam criteria for clinical diagnosis, which are recognized as an objective basis for diagnosing glaucoma4. The appearance of the optic nerve was assessed from optic nerve photographs, and visual field defects were assessed by FDT perimetry. The researchers dilated the participant’s pupil with dilating medication and captured clear images of the optic disk. At least two images of each eye were usually captured and saved to ensure image quality and accuracy. The FDT examination was performed as follows: one of the participant’s eyes was covered while the other was tested. The participant was instructed to look at a fixed point in the center of the device, which presented flickering stimuli at different locations in the visual field with gradual changes in frequency and contrast. After the locations at which the participant saw the stimulus had been recorded, the same process was repeated on the other eye17. A participant was classified as having glaucoma if at least one eye had 2 or more abnormal locations on the N30-5 FDT on 2 tests of the same eye, together with a cup-to-disk ratio (CDR) in one eye, or a CDR asymmetry between the two eyes, above the 97.5th percentile of the normal NHANES population17. We did not analyze subgroups such as open-angle and closed-angle glaucoma because no information on glaucoma subtypes was available in the 2005–2008 NHANES database.
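One possible reading of this classification rule is sketched below. Everything in it is an assumption made for illustration: the `EyeExam` structure and `has_glaucoma` function are hypothetical, the numeric cut-offs are placeholders for the 97.5th-percentile values (which are not given here), and the sketch does not require the field defect and the disc abnormality to occur in the same eye.

```python
from dataclasses import dataclass

@dataclass
class EyeExam:
    fdt_abnormal_test1: int  # abnormal FDT locations on the first test
    fdt_abnormal_test2: int  # abnormal FDT locations on the repeat test
    cdr: float               # cup-to-disk ratio from the disc photograph

def has_glaucoma(right, left, cdr_cutoff, asymmetry_cutoff):
    """Field defect in at least one eye AND an abnormal disc finding."""
    def field_defect(eye):
        # 2 or more abnormal locations on both tests of the same eye
        return eye.fdt_abnormal_test1 >= 2 and eye.fdt_abnormal_test2 >= 2

    any_field_defect = field_defect(right) or field_defect(left)
    disc_abnormal = (
        max(right.cdr, left.cdr) > cdr_cutoff
        or abs(right.cdr - left.cdr) > asymmetry_cutoff
    )
    return any_field_defect and disc_abnormal

# Placeholder cut-offs standing in for the population 97.5th percentiles.
case = has_glaucoma(EyeExam(3, 2, 0.8), EyeExam(0, 0, 0.4),
                    cdr_cutoff=0.7, asymmetry_cutoff=0.2)
non_case = has_glaucoma(EyeExam(3, 2, 0.4), EyeExam(0, 0, 0.4),
                        cdr_cutoff=0.7, asymmetry_cutoff=0.2)
```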
Correlation of heavy metals with glaucoma
In the 2005–2008 NHANES sample, participants with both optic disk photographs and FDT visual field data were older than 40 years. Basic demographic data, including age, sex, and ethnicity, as well as laboratory tests and imaging findings, were obtained through electronic medical record system. Participants were generally asked in the home interview whether they had higher education, alcohol consumption and smoking in the past year, which are known risk factors for glaucoma18,19. Diagnostic information on medical history, such as hypertension, was obtained by self-report from the interviews. We wanted to adjust for these confounders as much as possible in the multivariate model to avoid interfering with the results.
To explore the association between heavy metals and glaucoma, we estimated regression coefficients (β) and 95% confidence intervals (CI) for heavy metal levels using multivariate generalized linear models. We selected covariates on the basis of the existing literature and the results of multicollinearity detection, and adjusted for them in successive models2,8. Model 1 was unadjusted, Model 2 was adjusted for age and sex, and Model 3 was further adjusted for ethnicity (non-Hispanic white, non-Hispanic black, Mexican American, or other), education (less than high school, high school, or more than high school), marital status (married or cohabiting with a partner, unmarried), smoking history (current smoking, never smoking, former smoking), alcohol consumption status (light, moderate, never, former, heavy), body mass index (BMI) (kg/m2), healthy eating index (HEI) score, history of diabetes mellitus (yes or no), hypertension (yes or no), hyperlipidemia (yes or no), chronic renal disease (yes or no), and urinary creatinine level.
Building and evaluating machine learning models
The dataset was divided into a training set (80%, n = 2058) and a test set (20%, n = 514); the test set was not used to fit the models. The neural network was trained with settings chosen to promote convergence and stability. Specifically, the model was trained for up to 300 epochs with early stopping: training was terminated when performance on the test set had not improved for 20 consecutive epochs, to prevent overfitting. The initial learning rate was set to 0.001 and adjusted dynamically by a learning rate scheduler, which reduced it by a factor of 0.1 when the test set loss stopped decreasing. The optimizer was Adam (Adaptive Moment Estimation), which combines momentum with an adaptive learning rate to achieve efficient and stable convergence. The loss function was binary cross-entropy, which minimizes the difference between the predicted probability and the true label. The training batch size was set to 64. To avoid overfitting, we used 5-fold cross-validation together with regularization techniques (e.g., L1 and L2 regularization) to further enhance the stability and generalization ability of the model. Given the relatively small number of glaucoma cases (glaucoma = 162 vs. non-glaucoma = 2410), we used the Synthetic Minority Over-sampling Technique (SMOTE) to oversample the minority class during training and undersampled the majority class. This helped to balance the class distribution in the training set, thus reducing the impact of class imbalance on model training.
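The split sizes and the core idea of SMOTE can be illustrated with a short sketch. It is a simplified stand-in rather than the procedure actually used: the `smote` function, the choice of k = 3 neighbours, and the toy two-feature minority class are hypothetical, and the synthetic points are generated from training data only.

```python
import random

n_total = 2572
n_train = round(0.8 * n_total)  # training set size
n_test = n_total - n_train      # held-out test set size

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical minority-class training rows (e.g. two log-transformed metals).
minority_train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority_train, n_new=6)
```

Because each synthetic point lies on a segment between two real minority samples, it stays within the range spanned by the observed cases.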
Eleven machine learning models, namely Neural Network (NN), Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), Gaussian Process (GP), Gradient Boosting Machine (GBM), Logistic Regression (LR), Naive Bayes (NB), XGBoost (XGB), C5.0 Decision Trees (C5.0), k-nearest neighbor (KNN), and Random Forest (RF), were trained on the training set, with hyperparameters tuned for optimal performance. The predictive performance of the models was compared mainly through the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, and was validated on cross-validation sets, to evaluate the ability of each model to discriminate between individuals with and without glaucoma. In addition to the AUC, we examined a series of parameters derived from the confusion matrix (apparent prevalence, true prevalence, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (PLR), and negative likelihood ratio (NLR)) to assess the potential risk of the models in practical applications.
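The evaluation measures listed above follow from the four cells of a confusion matrix, and the AUC can be read as the probability that a randomly chosen case receives a higher score than a randomly chosen non-case. The sketch below uses standard definitions; the function names and the counts (chosen to sum to the test set size of 514) are hypothetical and are not results from this study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Confusion-matrix summaries using their standard definitions."""
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "apparent_prevalence": (tp + fp) / n,  # proportion predicted positive
        "true_prevalence": (tp + fn) / n,      # proportion truly positive
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "plr": sensitivity / (1 - specificity),
        "nlr": (1 - sensitivity) / specificity,
    }

def auc(scores_pos, scores_neg):
    """Rank-based AUC: share of case/non-case pairs ranked correctly,
    with ties counted as one half."""
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in scores_pos
        for q in scores_neg
    )
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical counts for a 514-person test set.
metrics = diagnostic_metrics(tp=24, fp=48, fn=8, tn=434)
example_auc = auc([0.9, 0.8], [0.1, 0.8])
```

Note that the likelihood ratios are undefined when specificity equals 1 or 0, a case this sketch does not handle.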
After constructing the predictive model with the best-performing machine learning algorithm, we explored the association between heavy metal exposure and glaucoma. We started with feature importance analysis using the Local Interpretable Model-agnostic Explanations (LIME) method and the Shapley Additive Explanations (SHAP) method to assess which variables had the greatest impact on glaucoma risk prediction. Partial dependence plot (PDP) analyses, accumulated local effects (ALE) analyses, and interaction analyses were also used to interpret the model. In PDP analysis, the average change in predicted glaucoma risk is calculated by gradually varying the value of a selected target feature from low to high while keeping the values of the other variables constant. In contrast, ALE partitions the range of each feature into intervals and measures how changes in the feature within each interval contribute to the model output, focusing on local trends and avoiding the bias that PDP can show when features are correlated. We applied the final prediction model to the clinical setting to provide a comprehensive assessment and prediction of glaucoma risk for any given subject, based on exposure to various heavy metal pollutants combined with other covariates, and to identify possible high-risk populations in advance for early intervention.
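The difference between PDP and ALE can be seen in a small sketch. The `predict` function is a hypothetical toy risk model with an interaction term, and the three data rows are invented so that the two features rise together; none of this reproduces the fitted model. The ALE here is left uncentred for brevity.

```python
def predict(row):
    """Hypothetical risk model: additive effect of x0 plus an x0*x1 interaction."""
    x0, x1 = row
    return 0.1 * x0 + 0.05 * x0 * x1

def partial_dependence(data, feature, grid):
    """Mean prediction when `feature` is forced to each grid value in every row."""
    curve = []
    for value in grid:
        preds = []
        for row in data:
            modified = list(row)
            modified[feature] = value
            preds.append(predict(modified))
        curve.append(sum(preds) / len(preds))
    return curve

def accumulated_local_effects(data, feature, edges):
    """Uncentred ALE: accumulate the mean prediction change across each bin,
    using only the rows whose feature value actually falls in that bin."""
    ale, total = [0.0], 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = [
            r for r in data
            if lo <= r[feature] < hi or (hi == edges[-1] and r[feature] == hi)
        ]
        diffs = []
        for row in in_bin:
            low_row, high_row = list(row), list(row)
            low_row[feature], high_row[feature] = lo, hi
            diffs.append(predict(high_row) - predict(low_row))
        total += sum(diffs) / len(diffs) if diffs else 0.0
        ale.append(total)
    return ale

# Toy data in which x0 and x1 increase together (correlated features).
data = [(0.5, 0.0), (1.5, 2.0), (2.5, 4.0)]
grid = [0.0, 1.0, 2.0, 3.0]
pdp = partial_dependence(data, feature=0, grid=grid)
ale = accumulated_local_effects(data, feature=0, edges=grid)
```

In this toy case the PDP averages over combinations of x0 and x1 that never occur together in the data, whereas the ALE uses only the x1 values observed near each x0 interval, so the two curves differ in shape.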
Statistical analysis
We explored the continuous distribution of urinary heavy metals in the subjects and also stratified the concentrations into low, medium, and high levels to clarify concentration trends. Because the heavy metal distributions were skewed, we log-transformed all heavy metal data to minimize potential bias. In addition, because urinary heavy metal levels are strongly influenced by metabolism in vivo, we adjusted for urinary creatinine, an important confounding variable, to reduce the measurement error caused by differences in urinary dilution between individuals and to make the results more comparable, so that they more accurately reflect the relationship between heavy metals and glaucoma. We performed Pearson correlation analyses to initially explore the correlations between each heavy metal and other demographic variables, and applied variance inflation factors (VIF) to identify variables with multicollinearity.
Descriptive statistics were used to show differences in baseline characteristics between the glaucoma and non-glaucoma populations. Age, BMI, energy intake, and HEI score were analyzed as continuous variables and presented as mean ± standard deviation. Sex, ethnicity, and marital status were presented as categorical variables in %. We also calculated p-values for between-group differences; values below 0.05 were considered statistically significant. All analyses were performed in R 4.3.2 with ‘randomForest’, ‘pROC’, ‘stats’, ‘epiR’, ‘ggplot2’, ‘dplyr’ and other R packages.
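The link between the Pearson coefficient and the VIF can be made concrete for the simplest case of two predictors, where the VIF reduces to 1/(1 − r²). The sketch below is illustrative only: the function names and sample values are hypothetical, and a full VIF for many covariates would require regressing each covariate on all the others.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def vif_from_r(r):
    """VIF = 1 / (1 - R^2); with only two predictors, R^2 equals r^2."""
    return 1.0 / (1.0 - r ** 2)

# Hypothetical log-transformed levels of two metals in five subjects.
metal_a = [0.2, 0.5, 0.9, 1.4, 2.0]
metal_b = [0.1, 0.6, 0.8, 1.5, 1.9]
r_ab = pearson_r(metal_a, metal_b)
vif_ab = vif_from_r(r_ab)
```

Under this two-predictor formula, a pairwise correlation of 0.9 already corresponds to a VIF above 5, and the VIF grows rapidly as r approaches 1.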
Results
Characteristics of the inclusive subjects
A total of 2572 noninstitutionalized U.S. civilians were included in our study; their baseline characteristics are shown in Table 1. Glaucoma patients were more likely to be older, to have higher energy intake, and to be former smokers. Self-reported chronic kidney disease, diabetes mellitus, and hypertension were more prevalent in glaucoma patients. No heavy metal level differed significantly between the two groups. However, the exposure concentrations of Ur (9.66 vs. 9.49) and Pt (9.48 vs. 9.51) were high relative to the other heavy metal variables, and Mo (0.87 vs. 0.87) was relatively low in both glaucoma and non-glaucoma patients. The tertile or dichotomous distribution of urinary heavy metals in glaucoma patients is shown in Table S1. Pb was present in high concentrations (56 vs. 58) in glaucoma patients, whereas Tl (66 vs. 58), DMA (64 vs. 56), Pt (152 vs. 10), and As5 (160 vs. 2) were concentrated in the low range, and other heavy metals were detected mainly at medium concentrations.
The Pearson analysis including 20 heavy metals and other variables
We explored the correlations between the 20 heavy metals by Pearson correlation analysis, presented as Pearson correlation coefficients (r) in Fig. 1; the correlations between the heavy metals and the baseline characteristics of the participants are shown in Figure S1. The heavy metals were correlated with one another to varying degrees, with the strongest correlations among As-related compounds (0.69 to 0.98). As5 was strongly correlated with As3 (r = 0.98) and AC (r = 0.94), and TM showed similarly strong correlations with As3, As5, and AC (r = 0.96, 0.96, and 0.97, respectively). The correlations between the heavy metals and the socio-demographic variables were weak (-0.21 to 0.27). The VIF values of the included covariates were all below 10, indicating no severe multicollinearity between the variables.
Association between heavy metals and risk of glaucoma
In generalized linear regression analysis of urinary heavy metal levels, the in vivo metabolites of As were associated with glaucoma (Table 2). In multivariate Model 3, after adjustment for a range of confounders such as ethnicity, smoking and drinking history, hypertension, diabetes mellitus, and chronic kidney disease, AC (β = 1.090, 95% CI: 0.313 to 1.835), MMA (β = 0.526, 95% CI: 0.017 to 1.028), and Cs (β = -0.453, 95% CI: -0.908 to -0.001) showed statistically significant associations between increasing exposure concentration and glaucoma risk. The association was most pronounced for AC and was consistent in both the crude model (β = 1.363, 95% CI: 0.626 to 2.078) and Model 2, adjusted only for age and sex (β = 1.104, 95% CI: 0.328 to 1.850). However, Pb (β = -0.098), Hg (β = -0.138), and Cd (β = -0.054), which have been frequently mentioned in previous studies, were not significantly associated with glaucoma, and their coefficients were all negative.
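If the generalized linear model used a logit link for the binary glaucoma outcome, which is an assumption because the link function is not stated, each coefficient can be converted to an odds ratio by exponentiation. The helper below is a hypothetical illustration applied to the Model 3 estimate for AC.

```python
import math

def to_odds_ratio(beta, ci_low, ci_high):
    """Exponentiate a log-odds coefficient and its confidence limits
    (valid only under the assumed logit link)."""
    return tuple(round(math.exp(v), 2) for v in (beta, ci_low, ci_high))

# Model 3 estimate for AC reported in the text above.
or_ac = to_odds_ratio(1.090, 0.313, 1.835)
print(or_ac)  # (2.97, 1.37, 6.27)
```

Under that assumption, a one-unit increase in log-transformed AC would correspond to roughly threefold higher odds of glaucoma, with a wide interval.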
Glaucoma risk prediction model based on machine learning algorithms
We fitted each of the 11 machine learning algorithms to predict the risk of glaucoma; ROC curves based on the training set are shown in Figure S2, and the results on the test set are shown in Fig. 2. The RF, XGBoost, and NN models all showed good accuracy in identifying the risk of glaucoma, with an average AUC of 1.000, higher than that of the other models: C5.0 (0.999), GBM (0.997), SVM (0.983), KNN (0.935), MLP (0.912), NB (0.791), LR (0.736), and GP (0.653). The parameters of the confusion matrix also indicated that the constructed models were robust, as detailed in Table S2. XGBoost, C5.0, RF, and KNN performed similarly in terms of sensitivity, specificity, NPV, FPR, PLR, and NLR. Combining these results with the applicability conditions of each model, we found that XGBoost performed well and was the most suitable for subsequent feature interpretation and risk prediction.
Interpretation of model features
In the summary plot based on the SHAP method (Fig. 3A), age, Ba, Cd, and DMA contributed the most to predicted glaucoma risk. Figure 3B shows the feature importance of the 20 heavy metals calculated by the LIME method, sorted in descending order of their influence on glaucoma. AB, Ba, MMA, As, and AC had the greatest influence on model predictions, indicating that they may be strongly associated with glaucoma. Examining the effect of each heavy metal on glaucoma risk from these different perspectives, we found that, apart from age, the contributions of heavy metals, especially As and its metabolites AC, AB, DMA, and MMA, were significantly higher than those of the other demographic variables, indicating that heavy metals, represented by As, are important risk factors for glaucoma.
Interpretation of model importance by critical variables. We used the (A) LIME method and the (B) SHAP method to make feature importance bar plots for all variables. The contribution of each heavy metal to the prediction was expressed using absolute SHAP values. Larger values indicate a larger contribution of that variable to the predicted risk of disease. (C) One subject (No. 10) was randomly selected for risk scoring of the various risk factors.
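The additive property behind SHAP, namely that per-feature contributions sum to the difference between an individual's prediction and a baseline prediction, can be demonstrated by computing exact Shapley values by brute force for a tiny model. This sketch is not the algorithm used by SHAP software for tree models: the `risk_model` function, its coefficients, and the baseline values are hypothetical, and "missing" features are simply set to the baseline.

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over every order in which features are switched from baseline to instance."""
    n = len(instance)
    contributions = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for j in order:
            current[j] = instance[j]
            value = predict(current)
            contributions[j] += value - previous
            previous = value
    return [c / len(orders) for c in contributions]

def risk_model(x):
    """Hypothetical score with an age effect, an As5 effect, and an As5*Ba interaction."""
    age, as5, ba = x
    return 0.02 * age + 0.3 * as5 + 0.1 * as5 * ba

instance = [70.0, 1.0, 2.0]  # the subject being explained
baseline = [60.0, 0.0, 0.0]  # reference values
phi = shapley_values(risk_model, instance, baseline)
```

The interaction term is split evenly between the two features involved, and the three contributions sum to the gap between the subject's score and the baseline score.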
PDP analysis showed that increased exposure to DMA, Tl, and Ur at high concentrations was associated with greater risk of glaucoma (Figure S3), while most of the heavy metals were negatively associated with the risk of glaucoma at very low concentrations. The ALE analysis showed similar results: the accumulated local effects of DMA, Tl, and Ur on glaucoma risk at high concentrations were more pronounced than those of other heavy metals (Figure S4). Interaction analyses showed that Pt, Ur, and Ba had stronger interaction effects on glaucoma risk, whereas As metabolites generally had weaker interaction effects (Figure S5).
Personalized prediction of Glaucoma risk
In Fig. 3C, we present the features and corresponding risk scores of a randomly selected subject (No. 10), ranked in order of importance. The plot indicated that As3 and As5 contributed significantly more to the predicted glaucoma risk than the other features. For example, in Fig. 3C, a log-transformed As5 exposure concentration of -3.29 ng/ml contributed 0.0382 to the predicted risk of glaucoma.
Discussion
Our study used 2005–2008 data from the NHANES database and identified associations between urinary heavy metals such as Cs, AC, and MMA and glaucoma after multivariate adjustment in generalized linear analysis, with AC maintaining a significant positive association with glaucoma in all three models. We further screened and constructed a machine learning prediction model with excellent performance. By combining the interaction analysis with the feature importance analysis, the PDP analysis, and the ALE analysis, we found that As and its metabolites had a greater effect on predicted glaucoma risk than the other heavy metals. In addition, based on this model, we scored the different risk factors of a randomly selected subject to predict the risk of glaucoma.
The evidence from previous epidemiologic studies on heavy metals and glaucoma is inconsistent. A cross-sectional study in Korea reported a positive association between blood cadmium levels and the risk of glaucoma, but no similar association was found for blood lead and mercury8. Another study, also conducted in South Korea and covering 2680 individuals, reported associations between blood manganese (Mn) (OR, 0.44; 95% CI, 0.21–0.92) and mercury (OR, 1.01; 95% CI, 1.00–1.03) levels and the prevalence of glaucoma, while no association was found between blood cadmium, lead, or urinary arsenic levels and glaucoma9. A Turkish investigation found blood Mn and Hg to be the most potent toxic metals affecting pseudoexfoliation syndrome, but no similar relationship was found in patients with pseudoexfoliation glaucoma20. One study detected higher levels of cadmium in aqueous humor samples from glaucoma patients than in those from normal subjects21. Previous studies have shown a significant association between increased cadmium burden and increased incidence of glaucoma, which may be attributed to cadmium promoting oxidative stress and thereby increasing damage to optic nerve axons8. Other experimental studies have shown that lead interferes with calcium homeostasis by acting as a competitive inhibitor and affecting action potential transmission22. Liu et al.23 reduced intraocular iron and calcium ion concentrations with chelating agents and found that this promoted retinal ganglion cell survival without affecting IOP, suggesting that heavy metal ions may influence glaucoma pathogenesis by regulating the physiological activity of retinal ganglion cells rather than by promoting IOP elevation. Although the research evidence for optic neurotoxicity of As is very limited, deleterious effects on the retina have been demonstrated in other eye diseases24,25.
DMA, MMA, AC, and AB are all major organic products of arsenic methylation, are generally excreted through the urine, and are considered to be among the less toxic arsenic compounds26. However, it has been suggested that they are potential carcinogens that promote the production of free radicals, which cause lipid peroxidation leading to cellular damage27. In this study, the association between AC and glaucoma remained significant after adjustment for a series of confounders such as age and sex (β = 1.090, 95% CI: 0.313–1.835). In the prediction model based on the XGBoost algorithm, the feature importance analysis revealed that DMA and AC contributed more to the risk of glaucoma than other heavy metals, indicating that the ocular toxicity of As metabolites deserves public attention, an aspect that has often been overlooked in previous studies. Given the low toxicity and metabolic transformation characteristics of AC, AB, and DMA, high levels of these compounds may reflect an excessive As load in the body or a markedly weakened capacity for metabolic transformation, leaving the body unable to excrete excess toxicants. At the same time, the cumulative effects and systemic toxicity of arsenicals may together lead to neuronal damage and degenerative lesions of the optic nerve through oxidative stress and inflammatory mechanisms, resulting in a much higher risk of glaucoma.
The SHAP and LIME results suggested that As and its metabolites exposure may be potential risk factors for the development of glaucoma, especially in individuals chronically exposed to high levels. Therefore, clinical knowledge of a patient’s history of environmental exposures, especially heavy metal exposures, may provide a new entry point for early screening of glaucoma. Regular glaucoma screening should be considered for high-risk groups, especially those in areas that may be chronically exposed to arsenic-containing drinking water. In addition, policies to improve the control of heavy metal contamination in water sources and soils are critical. We suggest that policy makers should strengthen the monitoring of environmental pollution and develop health interventions for populations at high risk of heavy metal exposure based on research findings.
XGBoost uses the Gradient Boosting Decision Tree (GBDT) method, which improves predictive power by combining multiple weak classifiers. It can capture nonlinear relationships and complex interactions between features, which is particularly important when dealing with the complex relationship between heavy metal exposure and glaucoma risk12. During training, it weights features according to their importance, so that irrelevant or redundant features have little influence on the model28. In contrast, other models (e.g., support vector machine and logistic regression), although also capable of feature selection, perform less well with high-dimensional features or complex interactions29. In addition, in our dataset, which contained relatively few glaucoma cases, XGBoost was able to improve predictive performance by adjusting the class weights or the sampling strategy. Permutation feature importance (PFI) analysis helps researchers identify the most important heavy metal exposures and helps policy makers focus resources on priority monitoring and interventions30. The effect of age on glaucoma is well established. An authoritative review this year31 suggested that aging is one of the most serious risk factors for glaucoma, and our findings are consistent with this. Our findings suggested that, in addition to age, AB and DMA exposure were important in the PFI analysis, indicating that monitoring As metabolite levels in the elderly and other sensitive populations may help prevent glaucoma. Since heavy metals are serious pollutants of water sources, regular blood and urine screening of residents near factories, or more efficient treatment of metal contaminants to reduce individual As exposure, would be beneficial in controlling the development of glaucoma.
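Assuming PFI here refers to permutation feature importance, the idea can be sketched as follows: shuffle one feature at a time and record how much the model's accuracy drops. The toy `model`, the data, and the number of repeats are hypothetical and stand in for the fitted XGBoost model.

```python
import random

def model(row):
    """Hypothetical classifier: predicts glaucoma (1) when the first feature
    exceeds 0.5 and ignores the second feature entirely."""
    return 1 if row[0] > 0.5 else 0

def accuracy(rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(rows, labels, feature, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    base = accuracy(rows, labels)
    drops = []
    for _ in range(n_repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        shuffled = [list(r) for r in rows]
        for r, v in zip(shuffled, column):
            r[feature] = v
        drops.append(base - accuracy(shuffled, labels))
    return sum(drops) / n_repeats

# Toy data: the label depends only on the first feature.
rows = [[i / 10, (i * 7 % 10) / 10] for i in range(10)]
labels = [1 if r[0] > 0.5 else 0 for r in rows]
pfi = [permutation_importance(rows, labels, f) for f in (0, 1)]
```

Shuffling the feature the model relies on degrades accuracy, whereas shuffling the ignored feature leaves it unchanged, which is what makes the measure usable for ranking exposures.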
Results of PDP analysis showed that DMA, Tl, and Ur had small effects on glaucoma risk at low levels, but the risk increased significantly when the concentrations exceeded certain thresholds, which could be very helpful for national public health departments to set safe exposure limits. ALE analysis further quantified the local contribution of each variable to the prediction results, avoiding the data distribution bias possibly presented in PDP analysis. In addition, modeling the risk of glaucoma at different heavy metal exposure levels can help decision makers to develop more rational public health policies and interventions.
The main strength of our study is that it is the first epidemiological study to apply machine learning modeling to the relationship between heavy metals and glaucoma. We have preliminarily demonstrated an association between multiple in vivo metabolites of As and glaucoma; this is the first systematic study of the association between As and glaucoma risk from the perspective of heavy metal exposure, and it provides new insights into the potential roles of heavy metals in the pathogenesis of glaucoma. We used multiple machine learning algorithms to predict glaucoma risk and analyzed multiple heavy metal exposures jointly, improving predictive power compared with traditional statistical models while avoiding the limitations of single-metal analyses. The data were obtained from NHANES, a large, authoritative, and reliable database with broad population coverage, which lends robustness to the results. Our findings not only provide candidate biomarkers for early clinical intervention and assessment of glaucoma risk, but also lay the foundation for future studies on the toxicological mechanisms of these heavy metals, further deepening the understanding of glaucoma pathogenesis.
However, our study still has some limitations. First, its cross-sectional design does not allow us to establish a causal relationship between heavy metals and glaucoma. Future longitudinal cohort studies are needed to track temporal changes in heavy metal exposure, with regular follow-up of urinary heavy metal concentrations, their metabolite levels, and glaucoma development in individuals at different time points, so that causality can be clarified. Moreover, large prospective cohorts in specific high-exposure areas or high-risk populations, combined with repeated measurements and time-dependent modeling, would be more reliable for assessing the long-term effects of environmental exposures on glaucoma. Second, glaucoma onset is closely related to IOP, an important characteristic of the disease, so IOP should be adjusted for as a confounder to make the results more convincing. However, the data available in NHANES limited the adjustments we could make, and we hope that future studies will address this gap. Third, we used heavy metal concentrations in urine as a biomarker of long-term exposure, but urine does not always accurately reflect long-term exposure, especially given the different accumulation properties of different metals. Finally, although we used multiple machine learning algorithms for analysis and screening, model performance may be affected by data imbalance, and future studies should validate these models on other datasets.
Conclusion
Using follow-up data from the NHANES database, we explored the association between urinary heavy metal levels and glaucoma risk with a generalized linear model and constructed well-performing machine learning models for risk prediction. The results showed a significant association between in vivo metabolites of As, such as arsenocholine, and glaucoma, and a stronger promotion of glaucoma by high levels of dimethylated arsenical; the individualized glaucoma predictions gave similar results. These findings provide policy makers with a scientific basis for the association between endocrine disruptors and age-related eye diseases and contribute to raising public awareness of, and preventive measures against, As exposure.
Data availability
Publicly available datasets were analyzed in this study. These data can be found here: https://www.cdc.gov/nchs/nhanes/index.htm.
References
Bou Ghanem, G. O., Wareham, L. K. & Calkins, D. J. Addressing neurodegeneration in glaucoma: Mechanisms, challenges, and treatments. Prog. Retin. Eye Res. 100, 101261 (2024).
Zhang, Y. et al. Association between dietary calcium, potassium, and magnesium consumption and glaucoma. PLoS One 18(10), e0292883 (2023).
Hamel, A. R. et al. Integrating genetic regulation and single-cell expression with GWAS prioritizes causal genes and cell types for glaucoma. Nat. Commun. 15(1), 396 (2024).
Wang, Y. E., Tseng, V. L., Yu, F., Caprioli, J. & Coleman, A. L. Association of dietary fatty acid intake with glaucoma in the United States. JAMA Ophthalmol. 136(2), 141–147 (2018).
Rahman, Z. An overview on heavy metal resistant microorganisms for simultaneous treatment of multiple chemical pollutants at co-contaminated sites, and their multipurpose application. J. Hazard Mater. 396, 122682 (2020).
He, X. et al. Iron homeostasis and toxicity in retinal degeneration. Prog. Retin. Eye Res. 26(6), 649–673 (2007).
Rai, N. K. et al. Exposure to As, Cd and Pb-mixture impairs myelin and axon development in rat brain, optic nerve and retina. Toxicol. Appl. Pharmacol. 273(2), 242–258 (2013).
Lee, S. H. et al. Three toxic heavy metals in open-angle glaucoma with low-teen and high-teen intraocular pressure: A cross-sectional study from South Korea. PLoS One 11(10), e0164983 (2016).
Lin, S. C., Singh, K. & Lin, S. C. Association between body levels of trace metals and glaucoma prevalence. JAMA Ophthalmol. 133(10), 1144–1150 (2015).
Liu, Z. et al. Relationship between high dose intake of vitamin B12 and glaucoma: Evidence from NHANES 2005–2008 among United States adults. Front. Nutr. 10, 1130032 (2023).
Wang, T., Yang, J., Han, Y. & Wang, Y. Unveiling the intricate connection between per- and polyfluoroalkyl substances and prostate hyperplasia. Sci. Total Environ. 932, 173085 (2024).
Li, X. et al. Development of an interpretable machine learning model associated with heavy metals’ exposure to identify coronary heart disease among US adults via SHAP: Findings of the US NHANES from 2003 to 2018. Chemosphere 311(Pt 1), 137039 (2023).
Wang, S., Zhou, Y., You, X., Wang, B. & Du, L. Quantification of the antagonistic and synergistic effects of Pb(2+), Cu(2+), and Zn(2+) bioaccumulation by living Bacillus subtilis biomass using XGBoost and SHAP. J. Hazard Mater. 446, 130635 (2023).
Sun, X. et al. Arsenic (As) oxidation by core endosphere microbiome mediates As speciation in Pteris vittata roots. J. Hazard Mater. 454, 131458 (2023).
Wang, X. et al. Insights into deep decline of As(III) leachability induced by As(III) partial oxidation during lime stabilization of As-Ca sludge. J. Hazard Mater. 424(Pt C), 127575 (2022).
Inokuchi, M., Matsuo, N., Takayama, J. I. & Hasegawa, T. WHO 2006 Child Growth Standards overestimate short stature and underestimate overweight in Japanese children. J. Pediatr. Endocrinol. Metab. 31(1), 33–38 (2018).
Tseng, V. L., Lee, G. Y., Shaikh, Y., Yu, F. & Coleman, A. L. The association between glaucoma and immunoglobulin E antibody response to indoor allergens. Am. J. Ophthalmol. 159(5), 986–993 (2015).
Yang, D. L. et al. Indoor air pollution and human ocular diseases: Associated contaminants and underlying pathological mechanisms. Chemosphere 311(Pt 2), 137037 (2023).
Sano, K. et al. Association between alcohol consumption patterns and glaucoma in Japan. J. Glaucoma 32(11), 968–975 (2023).
Ceylan, O. M., Can Demirdogen, B., Mumcuoglu, T. & Aykut, O. Evaluation of essential and toxic trace elements in pseudoexfoliation syndrome and pseudoexfoliation glaucoma. Biol. Trace Elem. Res. 153(1–3), 28–34 (2013).
Panteli, V. S., Kanellopoulou, D. G., Gartaganis, S. P. & Koutsoukos, P. G. Application of anodic stripping voltammetry for zinc, copper, and cadmium quantification in the aqueous humor: Implications of pseudoexfoliation syndrome. Biol. Trace Elem. Res. 132(1–3), 9–18 (2009).
Vennam, S. et al. Heavy metal toxicity and the aetiology of glaucoma. Eye (Lond). 34(1), 129–137 (2020).
Liu, P. et al. Metal chelator combined with permeability enhancer ameliorates oxidative stress-associated neurodegeneration in rat eyes with elevated intraocular pressure. Free Radic. Biol. Med. 69, 289–299 (2014).
Aberami, S. et al. Elemental concentrations in Choroid-RPE and retina of human eyes with age-related macular degeneration. Exp. Eye Res. 186, 107718 (2019).
Aschner, M. et al. Retinal toxicity of heavy metals and its involvement in retinal pathology. Food Chem. Toxicol. 188, 114685 (2024).
Zhang, Y. et al. Independent and combined associations of urinary arsenic exposure and serum sex steroid hormones among 6–19-year old children and adolescents in NHANES 2013–2016. Sci. Total Environ. 863, 160883 (2023).
Nigra, A. E., Moon, K. A., Jones, M. R., Sanchez, T. R. & Navas-Acien, A. Urinary arsenic and heart disease mortality in NHANES 2003–2014. Environ. Res. 200, 111387 (2021).
Ha, A. et al. Deep-learning-based prediction of glaucoma conversion in normotensive glaucoma suspects. Br. J. Ophthalmol. 108(7), 927–932 (2024).
Wang, S. Y., Ravindranath, R., Stein, J. D. & SOURCE Consortium. Prediction models for glaucoma in a multicenter electronic health records consortium: The sight outcomes research collaborative. Ophthalmol. Sci. 4(3), 100445 (2024).
Zhao, B. et al. Prediction heavy metals accumulation risk in rice using machine learning and mapping pollution risk. J. Hazard Mater. 448, 130879 (2023).
Zhang, Y., Huang, S., Xie, B. & Zhong, Y. Aging, cellular senescence, and glaucoma. Aging Dis. 15(2), 546–564 (2024).
Acknowledgements
We thank the Department of Ophthalmology of the Second Affiliated Hospital of Anhui Medical University for their collaborative and logistical work.
Funding
This study was supported by the Research Fund Project of Anhui Institute of Translational Medicine (2023zhyx-C72) and the Second Affiliated Hospital of Anhui Medical University Research Program in 2024 (2024AH050789).
Author information
Contributions
X.C.W. conceived and designed the study. G.C. and R.H. analyzed and interpreted the patient data. Y.T.G. and J.W.L. provided methodological support and software usage. T.C.X. was responsible for writing the original draft. Z.X.J. and H.T.L. reviewed and substantively revised the manuscript. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wang, X., Chen, G., He, R. et al. Machine learning prediction of glaucoma by heavy metal exposure: results from the National Health and Nutrition Examination Survey 2005 to 2008. Sci Rep 15, 4891 (2025). https://doi.org/10.1038/s41598-025-88698-7