Predicting skeletal fluorosis severity using machine learning across diverse fluoride-exposed populations in China

Long, Hongjiang; Zeng, Jiayi; Wei, Shaofeng; Hu, Ting; Lu, Yun; Jia, Qingzhen; Li, Jinshu; Ye, Hongbing; Gao, Yanhui; Zhang, Zhenting; Luo, Peng

doi:10.1038/s41598-026-43429-4

Article
Open access
Published: 17 March 2026

Hongjiang Long¹^na1, Jiayi Zeng¹^na1, Shaofeng Wei¹, Ting Hu¹, Yun Lu¹, Qingzhen Jia², Jinshu Li³, Hongbing Ye⁴, Yanhui Gao⁵, Zhenting Zhang¹ & Peng Luo¹ (ORCID: orcid.org/0000-0001-5800-8783)

Scientific Reports volume 16, Article number: 13858 (2026)

Abstract

Skeletal fluorosis (SF) is a chronic metabolic bone disease resulting from long-term excessive fluoride exposure, affecting millions worldwide. Conventional diagnosis relies on radiographic evidence, which often detects the disease only at advanced stages, limiting opportunities for early intervention and prevention. A predictive model was developed to assess the severity of SF using comprehensive predictors, including demographic, environmental, and biomonitoring data, from 1,309 individuals across three major fluoride-endemic regions in China, representing coal-burning, drinking-water, and brick-tea fluoride exposure. After variable selection using least absolute shrinkage and selection operator (LASSO) regression, five machine learning algorithms were trained and validated. Model performance was primarily evaluated using the area under the receiver operating characteristic curve (AUC). SHapley Additive exPlanations (SHAP) were applied to enhance model interpretability. The Random Forest model achieved the best predictive performance (AUC = 0.875 in the training set; 0.832 in the test set). SHAP analysis identified pain score, joint function, age, and urinary fluoride (UF) concentration as the most influential predictors of SF severity. The model also captured regional differences in exposure and severity patterns across the three fluoride sources. This interpretable machine learning framework provides a robust tool for early risk screening and severity stratification of SF in high-risk populations. By enabling timely identification of individuals at risk of progression, the model serves as a foundation for targeted public health interventions and highlights the utility of data-driven methods in large-scale environmental health surveillance.

Introduction

Skeletal fluorosis (SF) is a neglected chronic disease that poses a significant public health burden in many low- and middle-income countries, with an estimated 100 million people affected globally, particularly in India, Africa, and China^1,2. Fluoride occurs naturally in water, air, soil, and food; however, exposure levels exceeding the World Health Organization (WHO) threshold of 1.5 mg/L can lead to serious metabolic and skeletal disorders^3,4,5. In China, fluorosis predominantly arises from three environmental exposure pathways: coal-burning, drinking-water contamination, and consumption of fluoride-rich brick tea. These pathways affect 28 provincial-level administrative regions and over 70,000 villages nationwide⁶. Although each exposure route leads to distinct fluoride absorption patterns, the majority of fluoride is ultimately deposited in mineralized tissues, cumulatively resulting in progressive skeletal damage over time⁷.

Clinically, early-stage SF presents with subtle symptoms such as joint pain and stiffness, which may progress to severe skeletal deformities and disability if left untreated⁸. Current diagnostic criteria rely heavily on radiographic evidence, which typically becomes apparent only after significant and often irreversible skeletal damage has occurred. Such diagnostic delays greatly undermine the effectiveness of early interventions⁹.

Recent advances in machine learning (ML) offer promising opportunities to overcome these diagnostic limitations through predictive modeling. ML algorithms can integrate heterogeneous clinical and environmental data to detect early, subtle indicators of SF progression, potentially outperforming conventional diagnostic approaches¹⁰. In this study, we aimed to develop and validate an ML-based predictive framework that integrates environmental fluoride exposure data with clinical biomarkers. We utilized data from three fluoride-endemic regions in China, each representing a unique environmental fluoride exposure route, and evaluated the performance of several ML algorithms. The inclusion of diverse exposure patterns enhances the generalizability of the model across different affected populations. This predictive tool enables early identification of high-risk individuals and provides a valuable approach for targeted public health interventions to reduce the burden and progression of SF.

Method

Study design and participants

Data and participants for this study were drawn from the China Fluorosis Cohort (CFC), initiated in 2022. Based on the national classification criteria for endemic fluorosis areas (GB/T 17018–2011) and historical surveillance data, three provinces with distinct sources of fluoride exposure were selected: Shanxi (drinking-water type), Guizhou (coal-burning type), and Sichuan (brick-tea type)^11,12,13. A multi-stage stratified cluster sampling strategy was used to recruit participants. In the first stage, 1–2 endemic areas were selected for each exposure type. In the second stage, the local Centers for Disease Control and Prevention (CDC) selected 3–6 communities/villages in each district based on population size. In the final stage, the diagnosis and grading of SF were conducted by a team of two radiologists and three orthopedic specialists based on standardized X-ray images and the People’s Republic of China Health Industry Standard for Endemic Fluorosis (WS/T192-2021). The same team was responsible for reading all radiographs from the three study regions. All cases were reviewed independently, and any diagnostic discrepancies were resolved through expert consensus. This process led to the inclusion of 1,789 individuals with varying degrees of SF severity.

All procedures in this study adhered to the ethical standards of our university, the National Research Committee, and the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The study was approved by the Ethics Committee of Guizhou Medical University (Ethics Approval No. 2023-189), and all participants provided written informed consent.

Data collection and preprocessing

Data were collected through a multimodal framework combining an epidemiological survey, clinical examinations, and laboratory tests. A structured questionnaire designed by the research team for the CFC was administered through face-to-face interviews by master’s-level graduate students with medical backgrounds. All interviewers received centralized, standardized training prior to fieldwork to ensure consistency in question delivery and interpretation. To maintain data quality, a random sample of 20% of completed questionnaires was cross-checked each evening by an experienced quality control team. Feedback and retraining were provided as necessary, and questionnaires deemed invalid were excluded from the final analysis.

On the final day of fieldwork at each community site, standardized physical examinations were conducted. Local health center nurses were responsible for measuring anthropometric and physiological indicators, including height, weight, and blood pressure. Orthopedic specialists evaluated the functional status and pain levels of major limb joints (shoulders, elbows, knees, and hips) using standardized clinical protocols.

Blood and urine samples were collected from all participants during the survey. These specimens were used to analyze a comprehensive panel of laboratory markers, including liver function, kidney function, lipid levels, electrolytes, bone metabolism indicators, and inflammatory markers.

Participants who did not undergo clinical examination or refused to participate in laboratory tests were excluded. Variables with more than 20% missing data were removed, and individuals with over 50% missing values were also excluded from the analysis. For variables with missing data less than or equal to 10%, multiple imputation was performed using the mice package in R to reduce potential bias. The final study dataset comprised 1,309 individuals and was randomly divided into training and test sets in a 7:3 ratio. Min-max scaling was applied to continuous variables to ensure consistent scaling between the training and test sets, and categorical variables were converted to dummy variables by one-hot encoding to match the input requirements of the ML models.
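The split, scaling, and encoding steps described above can be sketched as follows. This is a minimal Python illustration, not the authors' R pipeline. The function names (`split_train_test`, `fit_min_max`, `apply_min_max`, `one_hot`), the seed, and the toy ages are all hypothetical. It also assumes that the scaling parameters are estimated on the training set only and then reused for the test set, which is one common reading of "consistent scaling" but is not stated in the text.

```python
import random


def split_train_test(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of rows and split it; the seed is an arbitrary choice."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_min_max(values):
    """Learn the scaling parameters (min, max) from training values only."""
    return min(values), max(values)


def apply_min_max(values, params):
    """Scale with previously fitted parameters; test values may fall outside [0, 1]."""
    lo, hi = params
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def one_hot(values, categories):
    """Encode each value as a 0/1 indicator list over a fixed category order."""
    return [[1 if v == c else 0 for c in categories] for v in values]


# Toy data (hypothetical ages and the three study provinces)
ages = [40, 55, 60, 70, 58, 66, 49, 73, 62, 51]
train_ages, test_ages = split_train_test(ages)
params = fit_min_max(train_ages)            # fitted on the training set only
train_scaled = apply_min_max(train_ages, params)
test_scaled = apply_min_max(test_ages, params)
provinces = ["Guizhou", "Shanxi", "Sichuan"]
encoded = one_hot(["Shanxi", "Guizhou"], provinces)
```

Fitting the minimum and maximum on the training rows alone keeps information from the test set out of model development, at the cost of test values occasionally falling slightly outside the 0 to 1 range.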

Feature selection and ML model building

To optimize feature selection and improve model interpretability, least absolute shrinkage and selection operator (LASSO) regression was used to select variables in the training set. Through L1 regularization, the coefficients of non-critical features were shrunk to zero, which reduced the influence of multicollinearity and yielded a sparse feature set. Initially, 80 candidate variables were included, and the standardized feature matrix and the dichotomous dependent variable (SF severity, coded 0/1) were input into the model. LASSO regression was conducted using the cv.glmnet() function in the glmnet R package, with 10-fold cross-validation. The regularization path of the penalty parameter was generated automatically using the default data-adaptive sequence provided by glmnet, which spans a logarithmically spaced range from maximal to minimal regularization. A fixed random seed was applied to ensure the reproducibility of fold assignments. Finally, the 22 features with non-zero coefficients at the penalty value that minimized the cross-validation error (lambda.min) were retained for subsequent analysis.
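For reference, the objective minimized by L1-penalized logistic regression can be written as below. This is the standard lasso form for a binary outcome as documented for glmnet's binomial family; it is shown as a sketch under the assumption that the 0/1 severity outcome was fitted with that family, which the text implies but does not state. Here \(y_i \in \{0, 1\}\) is the SF severity label, \(x_i\) the standardized feature vector of participant \(i\), \(N\) the number of training participants, and \(\lambda\) the penalty parameter chosen by cross-validation.

\(\min_{\beta_0,\,\beta}\; -\frac{1}{N}\sum_{i=1}^{N}\left[ y_i\left(\beta_0 + x_i^{\top}\beta\right) - \log\left(1 + e^{\beta_0 + x_i^{\top}\beta}\right) \right] + \lambda \lVert \beta \rVert_1\)

Larger values of \(\lambda\) drive more coefficients in \(\beta\) exactly to zero; the features retained are those whose coefficients remain non-zero at the selected \(\lambda\).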

Using these selected features, ML algorithms, including Random Forest (RF), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), and Decision Tree (DT), were constructed to develop prediction models for SF severity. To ensure model stability and prevent information leakage, all model training and hyperparameter tuning were conducted strictly within the training set using a grid-search strategy combined with 10-fold cross-validation. For each algorithm, a predefined and sufficiently wide hyperparameter space was systematically explored to identify the optimal configuration that maximized the area under the receiver operating characteristic curve (AUC). The search ranges were carefully calibrated to ensure that the finalized optimal values did not reside on the extreme boundaries of the predefined grids, thereby confirming the thoroughness of the optimization process. All evaluated parameters, their respective search ranges, and the final optimized values are detailed in Table 1.
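The tuning procedure described above (a grid of candidate hyperparameters, each scored by cross-validation inside the training set, followed by a check that the winner is not on the edge of the grid) can be sketched in Python as follows. This is an illustrative outline, not the authors' R code. The helper names, the seed, the parameter names `mtry` and `ntree`, their candidate values, and the toy scoring function are all hypothetical stand-ins; Table 1 holds the actual grids.

```python
import itertools
import random


def kfold_indices(n_samples, k=10, seed=2024):
    """Return k disjoint validation index lists that together cover range(n_samples)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed gives reproducible folds
    return [indices[i::k] for i in range(k)]


def grid_search(param_grid, folds, score_fold):
    """Return (best_params, best_mean_score) over the Cartesian product of the grid.

    score_fold(params, train_idx, valid_idx) is supplied by the caller; it should
    fit a model on train_idx and return a validation score such as AUC.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for i, valid_idx in enumerate(folds):
            train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
            scores.append(score_fold(params, train_idx, valid_idx))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def on_grid_boundary(param_grid, best_params):
    """List parameters whose selected value sits at either end of its numeric grid."""
    return [name for name, value in best_params.items()
            if value in (min(param_grid[name]), max(param_grid[name]))]


# Toy stand-in for "fit a model and compute AUC"; peaks at mtry=4, ntree=500.
def toy_score(params, train_idx, valid_idx):
    return 1.0 - abs(params["mtry"] - 4) * 0.05 - abs(params["ntree"] - 500) / 10000


grid = {"mtry": [2, 4, 6, 8], "ntree": [100, 500, 1000]}
folds = kfold_indices(917, k=10)       # 917 = training-set size reported in the study
best, score = grid_search(grid, folds, toy_score)
edge_params = on_grid_boundary(grid, best)  # empty list: no value on a grid edge
```

A non-empty result from `on_grid_boundary` would suggest widening that parameter's range and rerunning the search, which mirrors the boundary check the text describes.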

Table 1 Hyperparameters for grid search tuning in each model.

After model optimization, the final models were applied exclusively to the independent test set to evaluate their generalization performance. Model performance was evaluated using AUC together with accuracy (ACC), sensitivity (SENS), specificity (SPEC), and F1 score (Eqs. 1–4). Among these metrics, AUC was predefined as the primary criterion for model comparison and selection, while the remaining indicators provided complementary assessments of classification performance. For the training set, performance metrics were reported as the mean values across the 10-fold cross-validation, whereas for the test set, metrics were calculated directly from the independent predictions. Furthermore, calibration curves and decision curve analysis (DCA) were conducted on the test set to assess the agreement between predicted probabilities and observed outcomes, as well as to quantify the net clinical benefit across a range of risk thresholds. The threshold-dependent metrics are defined as follows, where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, and Precision = TP/(TP + FP):

(1)
\(\text{Accuracy} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{FP} + \text{FN} + \text{TN}}\)
(2)
\(\text{Sensitivity} = \frac{\text{TP}}{\text{TP} + \text{FN}}\)
(3)
\(\text{Specificity} = \frac{\text{TN}}{\text{TN} + \text{FP}}\)
(4)
\(\text{F1 Score} = 2 \cdot \frac{\text{Precision} \cdot \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}}\)
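Equations (1) to (4), together with AUC, can be computed from labels and predictions as in the Python sketch below. It is illustrative only: the function names and toy labels are hypothetical, the label 1 is assumed to mark the positive class, and every denominator is assumed to be non-zero. AUC is computed here through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half), which is equivalent to the area under the ROC curve.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for binary labels."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity, specificity, and F1 as in Eqs. (1)-(4)."""
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def auc_from_scores(y_true, scores, positive=1):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels: 1 = positive class, 0 = negative class (hypothetical data)
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 0, 0, 1]
metrics = classification_metrics(*confusion_counts(y_true, y_pred))
auc = auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise AUC loop is quadratic in sample size, which is fine for a cohort of this size; a sort-based version would be preferable for much larger datasets.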

Model explanatory analysis

SHapley Additive exPlanations (SHAP) values were used for feature importance analysis. Based on the Shapley value from cooperative game theory, SHAP quantifies the contribution of each feature to the model's predictions and visualizes both the direction of each variable's influence and its relative importance.
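The Shapley value idea behind SHAP can be illustrated with an exact computation on a tiny model. The Python sketch below averages each feature's marginal contribution over all feature orderings, replacing features that have not yet been "revealed" with a baseline value. This replace-with-baseline value function, the toy model, and the input values are assumptions made for illustration; SHAP implementations for tree ensembles use more efficient estimators and a background dataset rather than brute-force enumeration.

```python
import itertools


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance, averaged over all feature orderings."""
    n = len(instance)
    contributions = [0.0] * n
    orderings = list(itertools.permutations(range(n)))
    for order in orderings:
        current = list(baseline)          # start with every feature at baseline
        previous = predict(current)
        for feature in order:
            current[feature] = instance[feature]   # reveal one feature
            updated = predict(current)
            contributions[feature] += updated - previous
            previous = updated
    return [c / len(orderings) for c in contributions]


# Hypothetical toy model with an interaction between its two inputs
def toy_model(x):
    return 2 * x[0] + 3 * x[1] + x[0] * x[1]


phi = shapley_values(toy_model, instance=[1, 2], baseline=[0, 0])
# phi == [3.0, 7.0]; the values sum to toy_model([1, 2]) - toy_model([0, 0]) = 10
```

The interaction term is split equally between the two features, and the contributions add up to the difference between the prediction and the baseline prediction, which is the additivity property that makes SHAP plots interpretable.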

Statistical analysis

All statistical analyses were performed using R (version 4.4.2) and SAS (version 9.4). The normality of continuous variables was assessed using the Kolmogorov-Smirnov test, which revealed that all continuous variables were non-normally distributed. Therefore, continuous variables were expressed as median and interquartile range (IQR) and compared using the Mann-Whitney U test or the Kruskal-Wallis test. Categorical variables were expressed as counts and percentages and compared using the chi-square test. All tests were two-sided, with P < 0.05 considered statistically significant.

Result

Characteristics of participants

The training set included 917 participants (600 with mild SF and 317 with moderate-to-severe SF), and the test set consisted of 392 participants (254 mild, 138 moderate-to-severe). No significant differences in baseline characteristics were observed between the training and test sets (external P > 0.05), indicating that the data split preserved the representativeness of the overall sample. In the overall cohort, patients with moderate-to-severe fluorosis were significantly older than those with mild fluorosis (median age ~68 vs. 58 years in the training set, P < 0.001), a trend that was also observed in the test set (P < 0.001). In addition, several demographic characteristics were associated with fluorosis severity. Notably, living area (region of residence), ethnicity, education level, and annual income differed significantly between mild and moderate/severe groups (P < 0.001 for each), whereas there was no significant difference in severity between male and female participants. The distribution of disease severity varied markedly by region. The coal-burning fluorosis region, Guizhou, had a much higher proportion of moderate-to-severe cases compared to the brick-tea fluorosis region, Sichuan, with the drinking-water fluorosis region, Shanxi, in between (overall P < 0.001 for living area). For example, in the training set, approximately half of the participants from Guizhou had moderate-to-severe SF, versus only around 15% of those from Sichuan (brick-tea area), with Shanxi around 40%. A similar pattern was seen in the test set. These differences reflect the varying intensity of environmental fluoride exposure across regions.

As for bone health–related indices, bone mineral density and a bone mass index measure were significantly lower in participants with moderate-to-severe SF than in those with mild SF. Conversely, parathyroid hormone (PTH) levels were significantly higher in the moderate/severe group than in the mild group (median 52.8 vs. 48.3 pg/mL, P = 0.002), whereas β-CrossLaps (a marker of bone resorption) levels were significantly lower in moderate/severe compared to mild cases (0.45 vs. 0.52 ng/mL, P = 0.001). These differences in bone turnover markers and bone indices were consistent in both the training and test sets (P < 0.05 in each set).

In terms of general biochemical and hematological indices, we also observed significant differences between severity groups. Total bilirubin, urinary fluoride (UF), white blood cell count (WBC), neutrophil count (NEUT), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), and multiple inflammatory cytokines (e.g., IL-5, IFN-α, IL-2, IL-1β, IFN-γ, and IL-4) all differed significantly between mild and moderate/severe SF. Notably, UF levels in patients with moderate-to-severe SF were substantially higher than those in mild cases (median 2.04 vs. 1.30 mg/L, P < 0.001). Interestingly, several inflammatory cytokines were higher in mild fluorosis than in moderate/severe fluorosis. The trends for all these biomarkers were similar in both the training and test cohorts. Detailed data on the characteristics and group comparisons can be found in Table 2.

Table 2 Participant characteristics of the training and test set.

Feature selection

The LASSO regression identified 22 non-zero coefficient features from the 80 candidate variables that were significantly associated with the severity of SF, and the trends in the logarithmic values of the coefficient paths versus the regularization parameter λ are shown in Supplementary material 1.

The final selected predictors included demographic characteristics (Age, Living area, Average sleep time), skeletal health indicators (Bone Density, Bone Mass Index), joint mobility measures (Elbow, Spinal, Knee, Shoulder function), pain level score, bone metabolism markers (PTH, Osteocalcin, PINP, β-CrossLaps, Vitamin D), inflammatory marker (IL-4), hematological indices (WBC, HCT, MCV, MCHC), liver function indicators (Total bilirubin), and an environmental exposure marker (UF).

Evaluation and performance of ML models

We assessed the predictive performance of five ML classifiers using multiple evaluation metrics for predicting SF severity. AUC was predefined as the primary metric for model comparison, and the RF model achieved the highest cross-validated AUC in the training set among all evaluated classifiers. On the training set, the RF model achieved an AUC of 0.875 (95% CI: 0.851–0.913), along with favorable sensitivity (0.903; 95% CI: 0.871–0.936) and F1 score (0.855; 95% CI: 0.832–0.879). It also achieved an accuracy of 0.797 (95% CI: 0.765–0.829) (Fig. 1A; Table 3). When applied to the independent test set, the RF model maintained comparable performance with an AUC of 0.832, accuracy of 0.793, and F1 score of 0.841, indicating stable predictive performance and good generalizability (Fig. 1B; Table 4).

Calibration and decision curve analyses were used as complementary evaluations. Both RF and XGBoost showed good calibration and higher net benefit across decision thresholds (Fig. 1C,D). As RF consistently demonstrated the highest AUC during model development and stable performance in the independent test set, it was selected for subsequent interpretive analysis.
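The net benefit plotted in a decision curve can be computed as in the Python sketch below. It uses the standard decision curve definition, net benefit = TP/N - (FP/N) x pt/(1 - pt) at threshold probability pt, and compares the model against the "treat all" reference strategy. The function names, labels, and predicted probabilities are hypothetical toy values, not study data, and the study's own DCA was run in R.

```python
def net_benefit(y_true, probabilities, threshold):
    """Net benefit of flagging everyone whose predicted probability is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probabilities) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probabilities) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of flagging every individual regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data: 1 = moderate-to-severe, 0 = mild (hypothetical labels and probabilities)
y_true = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
probs = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1, 0.3, 0.4, 0.3, 0.2]

model_nb = net_benefit(y_true, probs, threshold=0.5)     # 3 TP, 1 FP out of 10
treat_all_nb = net_benefit_treat_all(y_true, threshold=0.5)
# "Treat none" has a net benefit of 0 at every threshold by definition.
```

A model is useful at a given threshold when its net benefit exceeds both reference strategies; repeating the calculation over a range of thresholds traces the decision curve.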

Table 3 Model evaluation metrics in the training set.

Table 4 Model evaluation metrics in the test set.

To further explore the contribution of individual features to predicting SF severity, we conducted SHAP value analysis using the RF model. The SHAP analysis identified pain level score, knee function, age, shoulder function, and UF levels as the top five predictors with the greatest impact on the model’s predictions (Fig. 2A). Pain score emerged as the most influential predictor, exhibiting the highest SHAP values (> 0.06), strongly indicative of its critical role in differentiating SF severity. Other key factors, such as bone mass index, place of residence, total bilirubin, and bone metabolism markers (PINP, osteocalcin), also significantly influenced severity predictions.

The SHAP summary plot (beeswarm plot, Fig. 2B) visualizes how each feature contributes to model predictions. Specifically, lower pain scores were highly consistent with the prediction of mild SF, and individuals with no significant activity limitations in knee and shoulder function were similarly categorized as mild SF. Conversely, older age was associated with a greater predicted likelihood of moderate-to-severe SF. In addition, lower bone mass index values and higher levels of bone metabolic markers (PINP and osteocalcin) were also associated with greater SF severity.

SHAP dependence plots for the most important predictors, including pain score and UF concentration, were further analyzed (Fig. 2C and D). UF exhibited a non-linear relationship with fluorosis severity prediction: initial increases in UF concentration sharply raised SHAP values, signifying increased disease severity risk, but beyond a certain threshold, additional increases produced diminishing predictive impact. In contrast, the dependence plot for pain score revealed a more complex pattern; low pain scores were consistently associated with mild fluorosis predictions (negative SHAP values), while higher scores correlated positively and increasingly with predictions of moderate-to-severe SF. These results underline the pivotal role of pain severity and environmental fluoride exposure, captured through UF levels, as predictors of SF severity.

Discussion

This study presents a robust and interpretable ML model for predicting the severity of SF, integrating clinical biomarkers and environmental fluoride exposure data from three distinct endemic regions in China, representing coal-burning (Guizhou), drinking-water (Shanxi), and brick-tea consumption (Sichuan) exposure pathways. Among the five ML models tested, the RF model demonstrated the highest predictive performance (AUC = 0.875 in training and 0.832 in test), highlighting the utility of ML approaches in capturing the complex interplay between environmental exposure and disease progression. This represents an advancement beyond traditional radiographic severity assessment, offering a complementary data-driven perspective.

The selection of performance metrics is crucial when evaluating ML models in epidemiological contexts where class imbalance is common. Such an imbalance is frequently encountered in studies of chronic diseases, cancer prognosis, and disease severity stratification. In many of these settings, researchers may apply resampling techniques such as SMOTE or other balancing strategies during model training to address skewed class distributions^14,15. Importantly, however, these preprocessing strategies do not alter the predominant use of AUC as the primary evaluation metric in medical classification research^16,17. In this study, the dataset reflects the real-world distribution of SF severity, where moderate-to-severe cases are naturally less frequent than mild cases. Our objective was not to artificially rebalance the data, but to preserve this distribution to better reflect practical screening scenarios in endemic populations. While some researchers advocate for Precision-Recall (PR) curves in imbalanced settings, recent evidence suggests that AUC maintains robust discriminative power and remains highly competitive with PR-AUC in many medical classification scenarios^18,19. Nevertheless, we acknowledge that relying solely on AUC may obscure the practical implications of false positives under specific decision thresholds. To address this, we complemented AUC with threshold-dependent metrics such as the F1-score, which balances precision and recall, as well as calibration and decision curve analyses to provide a more practically interpretable evaluation of model performance.

Pain level emerged as the most influential predictor of SF severity in our model, followed by knee function, age, shoulder function, and UF. This finding aligns closely with the clinical progression of SF, which typically begins with joint pain, stiffness, and limited mobility, and may eventually lead to deformities due to ossification of ligaments and the formation of bony outgrowths^7,20. Deformities of the knees, hips, and spine are especially common in moderate-to-severe cases^21,22. SHAP analysis revealed a clear gradient between higher pain scores and the predicted probability of severe SF. This suggests that pain, while often considered a subjective symptom, carries substantial risk stratification and severity discrimination value. Previous studies have demonstrated strong correlations between pain level and both SF severity (r = 0.71, P < 0.001) and patient age (r = 0.79, P < 0.001)²³. This relationship indicates that pain may reflect not only current disease status but also cumulative exposure and biological vulnerability associated with aging.

In addition, pain scores are closely linked to joint function, particularly in the knees—a site frequently affected by fluorosis-related osteoarticular damage. A study emphasized the utility of pain scores in assessing functional impairment and disease burden in fluorosis-induced osteoarthritis²⁴. This supports our finding that knee and shoulder function were among the top-ranking features in the predictive model. As such, pain scoring and joint function may serve as a practical and cost-effective indicator for early risk identification in endemic communities.

UF, as a direct biomarker of environmental fluoride exposure and body burden, also played a key role in our model^23,25. Elevated UF levels have been shown to significantly increase the risk of SF (OR = 7.25)²⁶, and our SHAP analysis confirmed that individuals with higher UF levels were more likely to be classified as having moderate-to-severe SF. Interestingly, the relationship between UF and SF severity appeared to follow a nonlinear threshold effect—suggesting that once UF levels surpass a critical value (e.g., 4 mg/L), disease progression may accelerate more rapidly and irreversibly. This highlights the public health importance of early intervention in high-exposure communities before fluoride accumulation crosses harmful thresholds. Multiple studies have also affirmed the value of UF as a non-invasive, objective biomarker for both individual-level risk assessment and community-level exposure surveillance^27,28.

Together, these findings support a comprehensive model in which subjective symptoms (e.g., pain), functional measures (e.g., joint mobility), and objective biomarkers (e.g., UF) jointly contribute to SF severity assessment, risk evaluation, and prevention. The integration of such multidimensional indicators could significantly improve early detection and risk stratification strategies in endemic settings.

Beyond joint symptoms and fluoride exposure, alterations in bone metabolism also emerged as key indicators of SF severity. In particular, elevated levels of two bone formation markers, PINP and osteocalcin, were strongly associated with moderate-to-severe disease in our model^29,30. PINP, an early marker of type I collagen synthesis, reflects enhanced osteoblast proliferation and collagen production, while osteocalcin, secreted by mature osteoblasts, indicates active matrix mineralization^31,32. The concurrent elevation of both markers points to a pathophysiological state of high bone turnover in SF patients. This metabolic profile underscores the involvement of systemic bone remodeling processes in the pathogenesis of SF, beyond the localized symptoms and fluoride burden. These findings align with the recognized pathological features of SF, which involve heightened osteogenic activity accompanied by compromised structural integrity of the newly formed bone. Although bone anabolism is accelerated, the newly formed bone tends to be structurally abnormal, characterized by increased density but higher brittleness, which ultimately compromises bone integrity³³. Our results reinforce this paradoxical relationship: despite elevated bone formation markers, individuals with moderate-to-severe SF had significantly lower bone mass index and bone density than those with milder disease. This “high turnover, low quality” state of bone metabolism represents a hallmark of osteofluorosis and likely contributes to increased fracture risk, skeletal deformities, and impaired mobility^8,34.

Notably, bone metabolic disturbances were evident not only between healthy individuals and SF patients, but also among patients with differing levels of disease severity, indicating a progressive decline in bone quality along the disease continuum. Moreover, such imbalances appear to persist regardless of specific exposure pathways, highlighting the need to incorporate bone quality assessment into early screening and population-level health monitoring of SF. Evaluating markers like PINP and osteocalcin could aid in identifying individuals at risk for progressive bone damage, thereby informing preventive strategies and guiding the development of targeted therapeutic interventions aimed at restoring bone homeostasis.

Our results reveal a gradient association between fluoride exposure pathways and SF severity, with coal-burning areas showing the highest proportion of moderate-to-severe cases, followed by drinking-water and brick-tea regions. This pattern likely stems from differences in exposure complexity, cumulative dose, and intervention effectiveness. Coal-burning fluorosis in Guizhou is particularly severe due to combined dietary and inhalation exposure. Locally used coal and clay contain high fluoride levels (78 mg/kg and 1116 mg/kg, respectively), with coal-clay briquettes reaching 313 mg/kg³⁵. Roasting staple foods like corn leads to surface fluoride accumulation, raising levels to 20.26 mg/kg, 16 times that of unroasted samples³⁶. Despite stove renovation efforts, traditional practices persist, limiting control efficacy²⁵. In contrast, drinking-water-type fluorosis, though globally widespread (≥ 1.5 mg/L exposure in ~200 million people; 72.07 million in China)¹², has declined due to national interventions like the “Three Five-Year Plans” and the “Three-Year Action Plan”^37,38,39,40. Brick-tea-type fluorosis, prevalent in Sichuan, is caused by long-term consumption of fluoride-rich Camellia sinensis products⁴¹. Though it leads to chronic accumulation, symptoms are generally milder and reversible upon cessation, making behavioral intervention and education the primary prevention strategy^42,43. In summary, the severity of SF correlates with the intensity and multiplicity of fluoride exposure routes. These findings emphasize the need for tailored interventions, ranging from technological and environmental to behavioral measures, based on specific exposure contexts.

This study has several notable strengths. It is the first to construct an interpretable predictive model for SF severity that integrates three major fluoride exposure pathways (coal-burning, drinking-water, and brick-tea types), enhancing both the predictive utility and generalizability of the findings across endemic regions. The identification of key early-stage indicators, such as pain level scores, joint function, and UF, underscores their evaluative value throughout the disease continuum, extending beyond the distinction between patients and healthy individuals to the differentiation among varying degrees of severity. These insights support more precise risk stratification and provide an evidence base for targeted, context-specific public health interventions in fluoride-affected populations worldwide.

Nonetheless, several limitations should be acknowledged. First, the relatively low specificity observed in some models may be attributable to the substantial overlap in clinical and biochemical profiles between mild and moderate-to-severe SF groups, as well as the inherently non-linear and non-separable nature of disease severity progression. These characteristics may increase the likelihood of false-positive classifications when distinguishing between severity levels. Second, the study did not achieve precise quantification of individual-level fluoride exposure. Although regional fluoride types were considered, environmental monitoring data (e.g., from air, water, and food) were not directly integrated, limiting the granularity of exposure assessment. Third, key nutritional factors such as calcium and vitamin D intake, which are known to modulate fluoride toxicity, were not included in the model, potentially affecting the comprehensiveness of disease prediction. Finally, while age was identified as a strong predictor, it does not fully capture cumulative fluoride exposure or distinguish age-related bone degeneration from fluoride-specific skeletal damage. Longitudinal cohort studies with detailed dietary and exposure tracking are needed to disentangle these effects and validate the model’s applicability over time.

Conclusions

This study developed and validated an interpretable machine learning framework for predicting the severity of SF by integrating multi-dimensional predictors, including demographic factors, environmental exposure levels, and key biomarkers. The model provides a practical tool for early identification and risk assessment in affected populations. By supporting timely diagnosis and stratified management, it holds potential value for guiding public health interventions in fluoride-endemic regions. Further work is needed to optimize the model’s performance and evaluate its broader applicability in diverse geographical contexts.

Data availability

The data described in the manuscript will be made available upon request, pending application to and approval by the corresponding author (P.L.).

Code availability

The R code used for data preprocessing, feature selection, ML model building, and SHAP analysis in this study is available as Supplementary Material 2. The script contains the complete analysis pipeline needed to reproduce the findings described in this study.

Abbreviations

SF:: Skeletal fluorosis
UF:: Urinary fluoride
PINP:: Procollagen I N-terminal propeptide
PTH:: Parathyroid hormone
WBC:: White blood cell count
NEUT:: Neutrophil count
RBC:: Red blood cell count
HCT:: Red blood cell hematocrit
MCV:: Mean corpuscular volume
MCH:: Mean corpuscular hemoglobin
MCHC:: Mean corpuscular hemoglobin concentration
PDW:: Platelet distribution width
PCT:: Platelet crit
SVM:: Support vector machine
XGBoost:: Extreme gradient boosting
RF:: Random forest
KNN:: K-nearest neighbors
DT:: Decision tree

Acknowledgements

We sincerely thank all the participants of the China Fluorosis Cohort (CFC) for their valuable contributions to this study. We would also like to thank the local Centers for Disease Control and Prevention (CDC) teams in Shanxi, Guizhou, and Sichuan, as well as the faculty at Harbin Medical University for their support in participant recruitment and data collection. Finally, we would like to thank each of the participating doctors and community workers for their help during the implementation of the project!

Funding

This work was supported by the National Key Research and Development Program of China (Grant no: 2022YFC2503003) and Guizhou Provincial Major Scientific and Technological Program (2024-015).

Author information

These authors contributed equally: Hongjiang Long and Jiayi Zeng.

Authors and Affiliations

Key Laboratory of Environmental Pollution Monitoring and Disease Control, School of Public Health, Ministry of Education, Guizhou Medical University, Guiyang, 561113, China
Hongjiang Long, Jiayi Zeng, Shaofeng Wei, Ting Hu, Yun Lu, Zhenting Zhang & Peng Luo
Shanxi Provincial Institute of Endemic Disease Control and Prevention, Linfen, 041000, China
Qingzhen Jia
Sichuan Provincial Center for Disease Control and Prevention, Chengdu, 610044, China
Jinshu Li
Guizhou Provincial Center for Disease Control and Prevention, Guiyang, 550004, China
Hongbing Ye
Harbin Medical University, Harbin, 150081, China
Yanhui Gao

Contributions

H.L. and J.Z. wrote the main manuscript text and performed data analysis with model fitting. S.W., T.H., and Y.L. conducted data pre-processing and verification. Q.J., J.L., and H.Y. performed field investigations and data collection. Y.G. prepared visualizations and reviewed the manuscript. Z.Z. and P.L. supervised the study and revised the manuscript. All authors reviewed and approved the final manuscript.

Corresponding authors

Correspondence to Zhenting Zhang or Peng Luo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethics approval and consent to participate

This study was approved by the Ethics Committee of Guizhou Medical University (Approval No. 2023-189). Written informed consent was obtained from all participants.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (DOCX)

Supplementary Material 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Long, H., Zeng, J., Wei, S. et al. Predicting skeletal fluorosis severity using machine learning across diverse fluoride-exposed populations in China. Sci Rep 16, 13858 (2026). https://doi.org/10.1038/s41598-026-43429-4

Received: 25 April 2025
Accepted: 04 March 2026
Published: 17 March 2026
Version of record: 30 April 2026
DOI: https://doi.org/10.1038/s41598-026-43429-4
