Introduction

Obesity has become a significant public health issue worldwide, not only reducing individuals’ quality of life but also serving as a major risk factor for conditions such as diabetes, hypertension, and cardiovascular diseases1,2,3. In China, approximately one-fifth of children and half of adults are overweight or obese, making it the country with the highest number of overweight or obese people in the world4. However, most Chinese adults perceive obesity merely as a cosmetic problem rather than a life-threatening health issue, resulting in inadequate attention to obesity prevention and management5. To address the obesity epidemic, healthcare professionals need to be able to identify individuals at high risk of obesity, provide the necessary health education, and formulate personalized health management goals6. In this context, finding effective methods to predict obesity risk and identify individual obesity-related factors is crucial for effective prevention and management of obesity.

Individuals’ lifestyle and blood biochemical indicators are an important basis for predicting obesity risk7, and researchers have mostly used traditional statistical methods such as correlation analysis for this purpose8,9. With the development of artificial intelligence technology, machine learning, as a powerful tool for data analysis and prediction10, has provided new opportunities for predicting the risk of chronic diseases11,12,13. For example, Islam et al. predicted hypertension risk using four machine learning algorithms, including Random Forest, after screening key risk factors with the Least Absolute Shrinkage and Selection Operator (LASSO)14. Artzi et al. predicted gestational diabetes in the early stages of pregnancy based on electronic health records15. Davide et al. used neural network models to predict the incidence of various neonatal diseases16. These studies suggest that machine learning techniques can analyze individuals’ lifestyle and blood biochemical indicators to discover correlations and trends, and thereby predict individual obesity risk. However, previous studies often only constructed machine learning models, and these models had a ‘black box’ characteristic, meaning it was difficult to explain which variables determined the prediction results. This posed challenges to formulating intervention measures to reduce adverse events and greatly limited the application of research results17,18.

To overcome these limitations, accurately predict individual obesity risk, and provide reasonable explanations for the prediction results, this study evaluated the performance of Random Forest (RF), XGBoost (XGB), Support Vector Machine (SVM), LightGBM (LGB), Decision Tree (DT), Gradient Boosting Tree (GBT), Multi-Layer Perceptron (MLP), K-Nearest Neighbors (KNN), Backpropagation Neural Network (BPNN), and Logistic Regression (LR) models in obesity prediction. We then obtained the feature importance of the best-performing model and combined it with SHapley Additive exPlanations (SHAP) to develop an online obesity risk prediction system (Fig. 1). The system is deployed on a webpage and accepts input through several means to predict the risk probability of class 1 obesity in non-obese populations or class 2 obesity in class 1 obese populations. The system also uses SHAP to explain individual prediction results and to rank the importance of features influencing obesity risk, making the results interpretable. The system is therefore accurate, comprehensive, and practical; it can assist physicians in achieving personalized, comprehensive health management for obesity and has broad potential for application and development.

Fig. 1

A flowchart describing the general framework of the study.

Results

Research subjects

The dataset contains electronic medical examination records of 1678 individuals. Table S1 shows the distribution of lifestyle and blood biochemical indicators. According to BMI, 865 individuals (51.5%) belonged to the non-obese group, 716 individuals (42.6%) belonged to the class 1 obese group, and 97 individuals (5.7%) belonged to the class 2 obese group. The dataset comprised 68 variables, including gender, age, family medical history, eight lifestyle variables, two blood pressure variables, and 43 blood biochemical variables.

Classification performance

The receiver operating characteristic (ROC) curve is a tool for evaluating classification models; the area under the curve (AUC) summarizes the model’s performance, with a larger AUC indicating better performance. As shown in Fig. 2, the AUC values of the macro-average and micro-average ROC curves for the RF, XGB, LGB, GBT, BPNN, and LR models were all around 0.95, indicating good predictive performance with high true positive rates and low false positive rates. Accuracy is the ratio of correctly classified samples to total samples. The XGB model had the highest accuracy of all models (Fig. 3). We also calculated the precision and recall of each model, together with the F1-score, which is the harmonic mean of precision and recall (Table S2). The XGB model also had the highest average F1-score (Fig. 3). These results indicate that the XGB model had the best overall performance among the ten machine learning models.

Fig. 2

The ROC curves of different machine learning models for predicting the different obesity classes. The blue curve represents non-obesity (class 0), the yellow curve represents class 1 obesity (class 1), and the red curve represents class 2 obesity (class 2). The micro-average curve is a weighted average of true positive and false positive rates for all samples, represented by a red dashed line. The macro-average curve is the average of the ROC curves for the three classes, represented by a blue dashed line.

Fig. 3

The prediction accuracy and F1-score of different machine learning models. In the figure, the blue bar represents accuracy, and the yellow bar represents the F1-score.

We analyzed the misclassifications produced by the XGB model using the confusion matrix (Fig. 4). Among the 262 non-obese individuals, 83.97% were correctly classified, while the remaining 16.03% were misclassified as class 1 obese. Among the 208 class 1 obese individuals, 83.65% were correctly classified, 15.38% were misclassified as non-obese, and 0.97% were misclassified as class 2 obese. Among the 34 class 2 obese individuals, 61.76% were correctly classified, while the remaining 38.24% were misclassified as class 1 obese.
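The percentages above can be related to per-class counts and to the precision, recall, and F1-score used elsewhere in this paper. The sketch below assumes a count matrix back-calculated from the reported row totals (262, 208, and 34) and percentages; it is a reconstruction for illustration, not the study's raw output, and the derived precision, F1, and accuracy values are not figures reported by the authors.

```python
import numpy as np

# Rows: actual class (0 = non-obese, 1 = class 1 obese, 2 = class 2 obese).
# Columns: class predicted by the model.
# ASSUMPTION: counts are back-calculated from the reported row totals and percentages.
cm = np.array([
    [220, 42, 0],
    [32, 174, 2],
    [0, 13, 21],
])

row_totals = cm.sum(axis=1)   # number of individuals actually in each class
col_totals = cm.sum(axis=0)   # number of individuals predicted into each class
true_pos = np.diag(cm)        # correctly classified individuals per class

recall = true_pos / row_totals      # share of each actual class that was recovered
precision = true_pos / col_totals   # share of each predicted class that was correct
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of the two
accuracy = true_pos.sum() / cm.sum()

# Row-normalised matrix in percent, the form in which the results are described.
row_percent = np.round(100 * cm / row_totals[:, None], 2)

for cls in range(3):
    print(f"class {cls}: recall={recall[cls]:.4f} "
          f"precision={precision[cls]:.4f} F1={f1[cls]:.4f}")
print(f"accuracy={accuracy:.4f}")
```

Row-normalising the matrix gives recall per class, which is why the diagonal percentages quoted above can be read directly as the recall of the XGB model for each obesity class.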

Fig. 4

The confusion matrix of the XGB model classification results on the test set. The vertical coordinates class 0, class 1, and class 2 correspond to the actual number of people who are non-obese, class 1 obese, and class 2 obese, respectively; the horizontal coordinates class 0, class 1, and class 2 correspond to the number of people classified by the XGB model as non-obese, class 1 obese, and class 2 obese, respectively.

Feature importance

The five most important features of the XGB model obtained by the importance_scores algorithm were ‘Hip circumference’, ‘Chest circumference’, ‘Body fat mass’, ‘Diet’, and ‘Triglycerides’. These are similar to the contributions of each feature to the prediction results obtained by SHAP (Fig. 5). SHAP is an interpretable artificial intelligence method based on game theory; it adopts the concept of the Shapley value from cooperative game theory to assign an importance score to each feature, representing the contribution of that feature to the risk prediction model19,20. In the SHAP plot, the y-axis shows the features ranked by the strength of their impact on the model, and the color of the data points represents the feature values, where red corresponds to high values and blue corresponds to low values. We found that hip circumference was the most important feature in predicting individual obesity risk. Additionally, body fat percentage, chest circumference, triglycerides, dietary habits, and glycated hemoglobin A1 were also important features in predicting obesity class. Among them, dietary habits had higher predictive value for class 2 obesity and some predictive value for class 1 obesity but no predictive value for non-obesity.
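To make the Shapley value idea concrete, the following sketch computes exact Shapley values for a single sample by enumerating every feature coalition. It is only an illustration under stated assumptions: the function toy_risk, the three feature names, the sample values, and the baseline are all hypothetical, and features absent from a coalition are simply replaced by a baseline value. SHAP libraries typically rely on faster model-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all feature coalitions.

    Features outside a coalition take their baseline value (an assumption of
    this sketch; other definitions of the 'absent feature' value exist).
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                gain = value(set(subset) | {i}) - value(set(subset))
                phi[i] += weight * gain
    return phi


# HYPOTHETICAL toy risk score over three made-up inputs; not the study's model.
def toy_risk(v):
    hip, triglycerides, diet = v
    return 0.02 * hip + 0.10 * triglycerides + 0.05 * hip * diet


x = [100.0, 2.0, 1.0]         # sample to explain (illustrative values)
baseline = [90.0, 1.5, 0.0]   # reference values (illustrative)

phi = shapley_values(toy_risk, x, baseline)
print("contributions:", [round(p, 4) for p in phi])
print("sum of contributions:", round(sum(phi), 4))
print("prediction minus baseline:", round(toy_risk(x) - toy_risk(baseline), 4))
```

The contributions add up to the difference between the prediction for the sample and the prediction for the baseline, which is the property that lets a per-individual SHAP plot split one risk estimate into positive and negative feature contributions.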

Fig. 5

The importance of features for the different obesity classes. (A) Feature importance for the non-obese class, (B) feature importance for the class 1 obese class, and (C) feature importance for the class 2 obese class.

Risk prediction system

In the visualized prediction system, the left side is the information input area, where information can be entered item by item (Fig. 6A) or imported from a file (Fig. 6B). When entering information item by item, continuous variables (e.g., age) are set by dragging a slider, and categorical variables (e.g., gender) are selected by clicking. The right side of the system is the output window. The upper part displays the obesity class corresponding to the current BMI and the prediction result for the next obesity class, while the lower part provides a personalized analysis of the result to support the development of intervention strategies.

Fig. 6

Visualization of the obesity risk prediction system. (A) Using the prediction system by entering information item by item; (B) using the prediction system by importing an information file.

Discussion

Accurate and interpretable risk assessment is crucial for obesity prevention and intervention21,22. The proposed obesity prediction system in this study is an important health management tool that can assist physicians in deciding whether to intervene and develop personalized intervention plans.

We first assessed the performance of ten machine learning models in obesity prediction based on health examination data and established an obesity risk assessment model with good predictive performance. In contrast to similar studies23,24,25,26, we did not simply classify the population into obese and non-obese groups; we further divided the obese population into class 1 and class 2 obesity. Additionally, the predictive accuracy of our trained machine learning models on the test set was higher than that in similar studies. This may be because our best model was selected from a larger pool of machine learning models and because we employed a Monte Carlo Cross-Validation algorithm during the training process. The ten machine learning models we used included tree models, deep learning models, and traditional statistical models, and the tree model XGB demonstrated the best predictive performance. This may be because traditional statistical models like LR are more suitable for linear or normally distributed problems, while deep learning models are better suited for image or natural language processing tasks and have poorer predictive performance on small-sample tabular data27.

On the basis of the XGB model, we constructed a visual obesity risk prediction system using the SHAP algorithm, making the output of the machine learning model interpretable. In this study, in addition to incorporating features such as age, gender, lifestyle, and blood routine, we also included body composition data such as total body water, protein content, and basal metabolic rate as variables. According to the SHAP interpretation results (Fig. 5C), bone mineral content was an important predictor of class 2 obesity. This is consistent with the findings of Hwaung et al., which indicate that obesity is characterized not only by excessive adipose tissue but also by changes in characteristics such as protein content in skin and visceral organs28. Among the variables related to blood biochemistry, elevated triglyceride levels increased the risk of class 1 obesity, while elevated glycated hemoglobin and uric acid levels increased the risk of class 2 obesity. Conversely, decreased triglyceride and glycated hemoglobin A1 levels increased the probability of non-obesity (Fig. 5). These findings are in line with the research by Jeon et al., which identified triglycerides, glycated hemoglobin, and uric acid as important features for assessing obesity risk24. The importance of indicators such as triglycerides and glycated hemoglobin may reflect the close relationship between high triglyceride levels and insulin resistance: insulin resistance can lead to fluctuations in blood sugar levels, which further stimulate excessive secretion of insulin, and elevated insulin levels can promote fat synthesis and storage29,30. Therefore, when predicting obesity risk, it is necessary to consider not only common features such as lifestyle but also body composition and blood biochemical indicators in order to provide early intervention for individuals at high risk of obesity.

In this study, we constructed an obesity risk prediction system based on the XGB model and SHAP technology, which is accessible on a webpage. To explain how the system is used, we present an example in Fig. 6. After information is entered into the left-hand input interface, the system indicates that the BMI of the examined individual does not currently reach the obesity level, but the risk probability of class 1 obesity is 34.77%. In the SHAP plot below, the length of each feature bar indicates the strength of its influence on the risk probability, where red represents positive influence and blue represents negative influence. Factors contributing positively to the risk of class 1 obesity include hip circumference, alcohol consumption, and glycated hemoglobin A1, while factors contributing negatively include triglycerides, lymphocyte percentage, red blood cell distribution width, and diet. Therefore, targeted control of factors such as hip circumference, alcohol consumption, and glycated hemoglobin A1 can reduce the risk of class 1 obesity. Our obesity risk prediction system accepts input through both individual entries and file import, providing personalized obesity risk assessment for examinees in a more convenient and user-friendly manner and laying the foundation for the practical application of future prediction systems.

We are aware that our study has some limitations. Although we randomly divided the dataset into a training set and a test set, the results may still be influenced by the fact that the data came from a single source. Additionally, the class 2 obese population was relatively small in the dataset; although we applied the SMOTE algorithm for over-sampling during model training, the F1-score of the best model for class 2 obesity was still lower than for the other two classes, indicating relatively lower predictive ability for class 2 obesity. Lastly, the obesity risk prediction system developed in this study was based on the XGB model. While the XGB model demonstrated the best overall performance, its recall for the non-obese class was lower than that of the LGB and BPNN models, suggesting a potential need to improve the model’s predictive performance for the non-obese class through optimization algorithms or other approaches.

Conclusions

In summary, we have constructed an obesity risk assessment model with good predictive performance based on health management center examination data and ten machine learning models. Using SHAP to interpret the model’s output results, we have built an obesity risk prediction system accessible on a webpage. This system not only predicts the risk probability of class 1 obesity in non-obese populations but also forecasts the risk probability of class 2 obesity in class 1 obese populations, enabling comprehensive management of obesity issues. Additionally, our constructed obesity risk prediction system can output the important factors influencing individual obesity risk through SHAP technology, aiding healthcare professionals in providing targeted obesity management and achieving personalized services.

Methods

Study population

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of Hangzhou Normal University Affiliated Hospital (protocol code 2023E2-KS-110, July 2023). Because the research involved previously collected anonymous data and posed no direct risk of individual privacy disclosure, the Ethics Committee of Hangzhou Normal University Affiliated Hospital waived the requirement for informed consent, in accordance with Article 39 of the ‘Ethical Review of Biomedical Research Involving Humans’ in China. Throughout the research process, we rigorously adhered to ethical principles, maintained transparency and integrity, and complied with all relevant regulations. This study used an anonymized dataset from the Health Management Center of Hangzhou Normal University Affiliated Hospital, covering the period from May 31, 2022, to May 31, 2023. Individuals who undergo health check-ups are usually those seeking active health management and prevention, and they are not necessarily patients who have already been diagnosed with specific diseases. The study population comprised individuals aged 18 and above who underwent health examinations, excluding pregnant women and individuals with physical disabilities, resulting in a dataset of 1678 health examination records. According to the Asia-Pacific criteria, we divided the study population into non-obese individuals (BMI < 25, class 0), class 1 obese individuals (25 ≤ BMI < 30, class 1), and class 2 obese individuals (BMI ≥ 30, class 2)31. The dataset contained 42.6% class 1 obese individuals and 5.7% class 2 obese individuals.
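The class assignment described above can be written as a small function. This is a minimal sketch of the stated thresholds only; the function names bmi and obesity_class are illustrative, and BMI is assumed to be expressed in kg/m².

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def obesity_class(bmi_value: float) -> int:
    """Map a BMI value to the three classes used in this study.

    0 = non-obese (BMI < 25), 1 = class 1 obese (25 <= BMI < 30),
    2 = class 2 obese (BMI >= 30).
    """
    if bmi_value < 25:
        return 0
    if bmi_value < 30:
        return 1
    return 2


# Illustrative values only.
for weight, height in [(60.0, 1.70), (78.0, 1.70), (95.0, 1.70)]:
    value = bmi(weight, height)
    print(f"{weight} kg, {height} m -> BMI {value:.1f}, class {obesity_class(value)}")
```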

Data preprocessing

The collected data included information on individuals’ age, gender, lifestyle, blood routine, and biochemical test results, involving a total of 80 variables. Of these, 12 variables had a missing rate above 30%, 9 variables had a missing rate below 30%, and the remaining variables had no missing values. To ensure data quality, we used the Multiple Imputation by Chained Equations (MICE) algorithm to impute variables with a missing rate below 30% and removed variables with a missing rate over 30%, ultimately retaining 68 variables32.
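A preprocessing step of this kind might look like the sketch below. It is not the study's code: the column names and synthetic data are hypothetical, and scikit-learn's IterativeImputer is used as a stand-in for MICE (it follows the chained-equations idea but returns a single imputed dataset rather than multiple imputations). How a variable with exactly 30% missing values was handled is not stated in the text, so keeping it is an assumption here.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables the import below)
from sklearn.impute import IterativeImputer


def drop_and_impute(df: pd.DataFrame, max_missing: float = 0.30, seed: int = 0) -> pd.DataFrame:
    """Drop columns whose missing rate exceeds max_missing, then impute the rest."""
    missing_rate = df.isna().mean()                  # fraction of missing values per column
    kept = df.loc[:, missing_rate <= max_missing]    # ASSUMPTION: exactly 30% is kept
    imputer = IterativeImputer(max_iter=10, random_state=seed)
    filled = imputer.fit_transform(kept)             # chained-equations style imputation
    return pd.DataFrame(filled, columns=kept.columns, index=kept.index)


# Synthetic example with HYPOTHETICAL column names.
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "age": rng.integers(18, 80, size=n).astype(float),
    "triglycerides": rng.normal(1.5, 0.5, size=n),
    "rarely_measured": rng.normal(0.0, 1.0, size=n),
})
df.loc[rng.choice(n, size=10, replace=False), "triglycerides"] = np.nan     # 10% missing
df.loc[rng.choice(n, size=60, replace=False), "rarely_measured"] = np.nan   # 60% missing

clean = drop_and_impute(df)
print(clean.columns.tolist())
print(int(clean.isna().sum().sum()))
```

In this toy example the column with 60% missing values is dropped and the column with 10% missing values is imputed, mirroring the rule that reduced the original 80 variables to 68.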

Machine learning models

Machine learning models were built using Python 3.11. The dataset was randomly divided into a training set (70%) and a test set (30%). To address the imbalanced class distribution in the training set, the Synthetic Minority Over-sampling Technique (SMOTE) was applied. RF, XGB, SVM, LGB, DT, GBT, MLP, KNN, BPNN, and LR models were constructed, and the model parameters were selected using Monte Carlo Cross-Validation.
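As an illustration of the procedure described above, the following sketch chains a stratified 70/30 split, a simplified SMOTE-style oversampler, and Monte Carlo cross-validation (repeated random train/validation splits). Everything specific here is an assumption made for illustration: the synthetic data, the helper names oversample_to_balance and monte_carlo_cv, the decision tree and its max_depth candidates, the number of repeats, and the choice to oversample inside each split. The study does not report these details and may have used a library implementation of SMOTE.

```python
import numpy as np
from sklearn.base import clone
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedShuffleSplit, train_test_split
from sklearn.tree import DecisionTreeClassifier


def oversample_to_balance(X, y, k=5, seed=0):
    """Simplified SMOTE-style oversampling: interpolate between a minority sample
    and one of its k nearest same-class neighbours until all classes are equal.
    Each class needs at least two samples."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    X_parts, y_parts = [X], [y]
    for cls, count in zip(classes, counts):
        n_new = counts.max() - count
        if n_new == 0:
            continue
        Xc = X[y == cls]
        dist = np.linalg.norm(Xc[:, None, :] - Xc[None, :, :], axis=2)
        np.fill_diagonal(dist, np.inf)                 # a point is not its own neighbour
        kk = min(k, len(Xc) - 1)
        neighbours = np.argsort(dist, axis=1)[:, :kk]
        base = rng.integers(0, len(Xc), size=n_new)
        picked = neighbours[base, rng.integers(0, kk, size=n_new)]
        gap = rng.random((n_new, 1))                   # position along the connecting segment
        X_parts.append(Xc[base] + gap * (Xc[picked] - Xc[base]))
        y_parts.append(np.full(n_new, cls))
    return np.vstack(X_parts), np.concatenate(y_parts)


def monte_carlo_cv(model, X, y, n_repeats=10, val_size=0.3, seed=0):
    """Mean macro F1 over repeated random stratified train/validation splits."""
    splitter = StratifiedShuffleSplit(n_splits=n_repeats, test_size=val_size, random_state=seed)
    scores = []
    for train_idx, val_idx in splitter.split(X, y):
        # ASSUMPTION: oversample only the training part of each split.
        X_bal, y_bal = oversample_to_balance(X[train_idx], y[train_idx], seed=seed)
        fitted = clone(model).fit(X_bal, y_bal)
        scores.append(f1_score(y[val_idx], fitted.predict(X[val_idx]), average="macro"))
    return float(np.mean(scores))


# Synthetic, imbalanced three-class data (illustrative only).
rng = np.random.default_rng(0)
class_sizes = {0: 155, 1: 128, 2: 17}
X = np.vstack([rng.normal(loc=cls, scale=1.0, size=(n, 4)) for cls, n in class_sizes.items()])
y = np.concatenate([np.full(n, cls) for cls, n in class_sizes.items()])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

candidates = (2, 4, 8)   # hypothetical hyperparameter grid
cv_scores = {
    depth: monte_carlo_cv(DecisionTreeClassifier(max_depth=depth, random_state=0), X_train, y_train)
    for depth in candidates
}
best_depth = max(cv_scores, key=cv_scores.get)
print(cv_scores, best_depth)
```

Oversampling only the training portion of each split keeps synthetic samples out of the validation data, so the validation score reflects the original class imbalance. The held-out test set is left untouched until the final evaluation.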

Model evaluation

To evaluate the performance of the machine learning models, we calculated the Accuracy, Precision, Recall, and F1-score of the models in classifying non-obese, class 1 obese, and class 2 obese individuals in the test set33. We also plotted macro-average and micro-average ROC curves.
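For the three-class setting, macro- and micro-averaged AUC values can be obtained from one-vs-rest ROC analysis, as sketched below. The labels and predicted probabilities are synthetic, and the helper name macro_micro_auc is illustrative. Note one assumption: the macro value here is the mean of the per-class AUCs, which can differ slightly from the area under a macro-averaged ROC curve such as the one plotted in a figure.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import label_binarize


def macro_micro_auc(y_true, proba, classes=(0, 1, 2)):
    """One-vs-rest AUC per class, plus macro and micro averages."""
    y_bin = label_binarize(y_true, classes=list(classes))   # shape (n_samples, n_classes)
    per_class = [roc_auc_score(y_bin[:, i], proba[:, i]) for i in range(len(classes))]
    macro = float(np.mean(per_class))                        # unweighted mean over classes
    # Micro-average: pool every (sample, class) decision into one binary problem.
    micro = float(roc_auc_score(y_bin.ravel(), proba.ravel()))
    return per_class, macro, micro


# Synthetic labels and probabilities (illustrative only).
rng = np.random.default_rng(0)
y_true = rng.choice(3, size=500, p=[0.52, 0.42, 0.06])
logits = rng.normal(size=(500, 3))
logits[np.arange(500), y_true] += 2.0                        # make the true class more likely
proba = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

per_class, macro, micro = macro_micro_auc(y_true, proba)
print([round(a, 3) for a in per_class], round(macro, 3), round(micro, 3))
```

Because the micro-average pools all samples, it is dominated by the larger classes, whereas the macro-average gives the small class 2 group the same weight as the others.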

Visualized system

Based on the above analysis, we selected the machine learning model with the best predictive performance as the optimal model. We calculated its feature importance index through the importance_scores algorithm and SHAP, respectively, and constructed an online prediction system34.