Abstract
Hypoglycemia is a serious complication in individuals with type 2 diabetes mellitus (T2DM). Identifying who is most at risk remains challenging due to the non-linear relationships between hypoglycemia and its associated risk factors. The objective of this study is to evaluate the importance and impact of risk factors related to the incidence of hypoglycemia through an explainable machine learning method. This prospective study enrolled 1306 adults with T2DM at a specialized diabetes center. Over three months, participants were asked to perform self-monitoring of blood glucose and record hypoglycemic events. Nine clinically relevant features were analyzed using five machine learning models. Model performance was evaluated with several metrics. The SHapley Additive exPlanation method was used to elucidate how each covariate influenced the risk of hypoglycemia. Overall, 419 participants (32.08%) reported at least one hypoglycemic episode. Our findings highlight the non-linear nature of hypoglycemia risk in individuals with T2DM. Insulin therapy, diabetes duration (> 13.7 years), and eGFR (< 60.2 mL/min/1.73 m²) were the most important predictors of hypoglycemia, followed by age, HbA1c, triglycerides, total cholesterol, gender, and BMI.
Introduction
Type 2 diabetes mellitus (T2DM) affects a substantial portion of the global population, with its prevalence rapidly increasing, particularly in low- and middle-income countries (LMICs). Recent data from the IDF Diabetes Atlas reveal that nearly 81% of the 537 million adults aged 20–79 years diagnosed with diabetes in 2021 were residents of LMICs, including various countries in the Middle East. T2DM places a considerable burden on patients and healthcare providers worldwide, largely due to its many complications1. Strict blood glucose management through glucose-lowering medications is recognized as an effective approach to reduce long-term risks, such as nephropathy, retinopathy, neuropathy, and all-cause mortality2,3. However, the threat of hypoglycemia, a well-known side effect of insulin therapy, significantly limits the use of insulin or insulin secretagogues4.
Predicting hypoglycemic events in T2DM remains challenging, as these episodes are driven by multiple factors and are highly variable. Certain conditions, such as advanced age, female sex, chronic kidney disease, depression, and cognitive impairment, are known to increase an individual’s susceptibility to hypoglycemia5,6.
In individuals with T2DM using insulin or sulfonylureas, the frequency of hypoglycemic events is estimated at around 2.0 events/100 person-years. Moreover, each hypoglycemic event not only carries its own physical and psychological consequences but also increases the likelihood of future events7.
These events are associated with a greater risk of dementia, adverse cardiovascular events, and mortality. They also prompt fear and anxiety, which undermine diabetes self-management and severely impact patients’ quality of life. Despite these serious consequences, hypoglycemia remains the most preventable event among common diabetes-related complications8,9.
Most studies focus on severe hypoglycemia, which often leads individuals to seek medical attention at healthcare facilities. Yet, many vulnerable patients experience level 1 and level 2 hypoglycemia, which may not prompt hospital visits but still significantly affects their health and daily lives10,11,12. Recent guidelines recommend that individuals with diabetes use continuous glucose monitoring (CGM) systems, as effective self-management tools, to detect level 1 and level 2 hypoglycemia as well as asymptomatic and nocturnal hypoglycemia13. CGM offers a significant advantage over traditional fingerstick-based self-monitoring of blood glucose (SMBG) by providing continuous data about blood glucose levels and their trends over time. However, because of their high cost, CGM systems remain unaffordable in LMICs, so people with diabetes in these regions rely on SMBG to track their glucose levels14.
Historically, non-severe hypoglycemic events have been identified through retrospective self-reported questionnaires that depend on patients’ ability to recognize their symptoms15. The present study addressed this limitation by adopting a prospective design, which allows for a more objective collection of data on hypoglycemic episodes. There is also limited knowledge of how various aspects of diabetes management, such as lipid profiles and HbA1c levels, impact the incidence of hypoglycemia. By exploring these relationships, the study offers valuable insights for clinical care.
Machine learning (ML) methods have shown strong performance in disease modeling and classification, which has sparked growing interest among clinicians16,17. In the context of hypoglycemia prediction in individuals with T2DM, Agraz et al. applied ML algorithms to a dataset composed mostly of White participants. They used only a limited number of variables for model development, and the two proposed models did not achieve high performance on certain metrics18. While other studies have developed high-performing models using electronic health records from hospitalized patients, these models are not directly applicable to the daily lives of individuals with T2DM19,20. In contrast, our prospective cohort study of outpatients, though based on a smaller dataset, employs an explainable ML approach to identify and understand the key factors influencing hypoglycemia in everyday settings.
The complex mathematical processes behind ML approaches can make their decision-making hard to interpret. To address this issue, the SHapley Additive exPlanation (SHAP) method has emerged as a valuable tool for explaining ML model predictions at the individual level, enabling the visualization of how specific feature values influence the model’s outputs21.
In this prospective study, we applied ML models combined with interpretable methods to identify the most critical biochemical and clinical predictors of hypoglycemia in individuals with T2DM from a Middle Eastern country. We aimed to provide robust predictive performance while also clearly interpreting the relationships between these predictors and hypoglycemic outcomes. An overview of the study workflow is presented in Fig. 1.
Methods
Study population
This prospective study was conducted at the Yazd Diabetes Research Center in Iran, a specialized outpatient clinic dedicated to the follow-up care of individuals with diabetes. Adults with a prior diagnosis of T2DM who visited the center for routine follow-up between November 2022 and May 2023 were included. Eligibility criteria required an age above 30 years, at least 12 months since T2DM diagnosis, and no documented history of cognitive impairment or dementia. Exclusion criteria encompassed serious complications (e.g., severe diabetic foot or advanced cardiopulmonary failure), pregnancy, irregular attendance noted in medical records, hospitalization or receiving new medications during the study follow-up, or any changes in diabetes management. Examples of management changes included modifications to the type or dose of oral antidiabetic medications, adjustments to insulin type or dose, or a switch from oral medications to insulin therapy during the last specialist visit.
Ethical statement
This investigation was carried out in accordance with the Declaration of Helsinki and approved by the Ethics Committee of Shahid Sadoughi University of Medical Sciences (Ethics No.: IR.SSU.REC.1400.256). Written informed consent was obtained from all participants.
Definition of hypoglycemia
According to the American Diabetes Association, hypoglycemia is divided into three levels. Level 1 is defined as a measured glucose level between 54 mg/dL (3 mmol/L) and 70 mg/dL (3.9 mmol/L), even when no symptoms are observed. Level 2 is marked by a glucose level below 54 mg/dL (3 mmol/L), which is the recognized threshold for neuroglycopenic symptoms. Level 3 is a severe episode with altered mental or physical functioning that requires external assistance, regardless of the measured glucose level22.
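The three-level definition above can be expressed as a small classification rule. The following is a minimal sketch, not code from the study: the function name and the `needed_assistance` flag are hypothetical, and the handling of readings at exactly 54 or 70 mg/dL (level 1 taken as at least 54 and below 70 mg/dL) is an assumption about the boundary cases.

```python
from typing import Optional


def classify_hypoglycemia(glucose_mg_dl: Optional[float],
                          needed_assistance: bool = False) -> int:
    """Return the hypoglycemia level: 0 (none), 1, 2, or 3.

    Level 3 is defined by the need for external assistance, regardless of
    the measured glucose, so it is checked before any glucose threshold.
    """
    if needed_assistance:
        return 3
    if glucose_mg_dl is None:
        raise ValueError("a glucose reading is required for levels 1 and 2")
    if glucose_mg_dl < 54:
        return 2  # below the neuroglycopenic threshold
    if glucose_mg_dl < 70:
        return 1  # assumed range: 54 <= glucose < 70 mg/dL
    return 0


if __name__ == "__main__":
    for reading in (90, 65, 50):
        print(reading, "mg/dL -> level", classify_hypoglycemia(reading))
    print("assisted episode -> level", classify_hypoglycemia(None, True))
```

Checking the assistance flag first mirrors the definition, in which a level 3 episode does not depend on the glucose reading.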
Data collection
Experienced nurses provided guidance to each participant on recognizing hypoglycemia-related symptoms. Upon experiencing such symptoms, participants were instructed to use their SMBG devices to measure blood glucose levels before starting any self-treatment, except in cases of level 3 episodes that required immediate help. They recorded all events classified as level 3 or any episode with a blood glucose reading below 70 mg/dL using a personalized recording method, either on paper or electronically, that included the date and corresponding glucose level. Data from the SMBG devices, which stored approximately 1000 recent tests, or from the associated mobile application on participants’ smartphones were used to verify self-reported non-severe events. Participants returned to the center three months after their initial visit to report their recorded hypoglycemic episodes.
Blood samples were drawn after an eight-hour fast to measure parameters such as HbA1c, total cholesterol, triglycerides, and serum creatinine (for the calculation of eGFR using the CKD-EPI equation).
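As an illustration of the eGFR calculation, the sketch below implements the 2021 race-free CKD-EPI creatinine equation. The text does not state which version of the CKD-EPI equation was used, so the choice of the 2021 version is an assumption, and the function name is hypothetical.

```python
def egfr_ckd_epi_2021(serum_creatinine_mg_dl: float,
                      age_years: float,
                      female: bool) -> float:
    """Estimate GFR (mL/min/1.73 m^2) with the CKD-EPI 2021 creatinine equation.

    Assumption: the study does not say which CKD-EPI version it applied;
    the 2021 race-free version is used here for illustration only.
    """
    kappa = 0.7 if female else 0.9          # sex-specific creatinine scale
    alpha = -0.241 if female else -0.302    # exponent below the scale point
    ratio = serum_creatinine_mg_dl / kappa
    egfr = (142.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.200
            * 0.9938 ** age_years)
    if female:
        egfr *= 1.012
    return egfr


if __name__ == "__main__":
    # Hypothetical inputs, not study data.
    print(round(egfr_ckd_epi_2021(0.9, 50, female=False), 1))
    print(round(egfr_ckd_epi_2021(2.0, 50, female=False), 1))
```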
We selected clinical and biochemical features with the highest relevance to hypoglycemia in individuals with T2DM by combining clinical expertise with evidence from previous studies23,24,25,26. Specifically, the study included nine variables: age, sex, diabetes duration, treatment type (insulin versus oral medications), BMI, eGFR, HbA1c, triglycerides, and total cholesterol.
Model development
Some features in the dataset showed randomly occurring missing values, each accounting for less than 20% of the data. To address these missing values, we employed the IterativeImputer method27. No significant correlation was observed between any pair of features (Figure S.1), which indicates the dataset’s suitability for use in an ML workflow.
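A minimal sketch of the imputation step is shown below, using scikit-learn's `IterativeImputer` on synthetic data. The data, the 10% missingness rate, and the settings (`max_iter=10`, `random_state=0`) are assumptions for illustration; the study's actual imputer configuration is not reported.

```python
import numpy as np
# IterativeImputer is experimental; this import must come before the next one.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)

# Synthetic feature matrix (not study data); column 3 is correlated with
# column 0 so the imputer has some signal to model.
X = rng.normal(size=(200, 4))
X[:, 3] = 0.8 * X[:, 0] + rng.normal(scale=0.3, size=200)

# Remove about 10% of the entries completely at random.
mask = rng.random(X.shape) < 0.10
X_missing = X.copy()
X_missing[mask] = np.nan

# Each feature with missing values is modeled from the other features.
imputer = IterativeImputer(max_iter=10, random_state=0)
X_imputed = imputer.fit_transform(X_missing)

print("missing before:", int(np.isnan(X_missing).sum()))
print("missing after: ", int(np.isnan(X_imputed).sum()))
```

Observed entries pass through unchanged; only the missing cells are filled.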
The dataset was divided into training and test sets using a 70:30 split through a stratified random sampling approach to ensure that each set maintained balanced class proportions. Each patient was assigned exclusively to one set to prevent any overlap between the training and test datasets. The training set supported model development, and the test set was used for internal validation. To ensure consistency in the measurement units of clinical features and to reduce potential bias during training and prediction, the StandardScaler method was employed. This technique standardizes the data, resulting in each variable having a mean of 0 and a standard deviation of 1.
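The split and standardization described above might look like the following sketch on synthetic data. Fitting the scaler on the training set only and reusing it for the test set is an assumption about the workflow, as are the random seed and the synthetic feature values.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in for nine clinical features and a binary outcome
# with roughly one third positives (not study data).
X = rng.normal(loc=50.0, scale=10.0, size=(1000, 9))
y = (rng.random(1000) < 0.32).astype(int)

# 70:30 split; stratify=y keeps the class proportions similar in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

# Assumption: the scaler is fitted on the training data only, then applied
# unchanged to the test data so no test information enters training.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

print("train positives:", round(float(y_train.mean()), 3))
print("test positives: ", round(float(y_test.mean()), 3))
```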
To predict hypoglycemic episodes, we developed five supervised machine learning classifiers: logistic regression (LR), K-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and eXtreme gradient boosting (XGBoost). Each model was trained using the training dataset, with hyperparameter optimization via the GridSearchCV approach. This process employed RepeatedStratifiedKFold cross-validation with five folds and 10 repeats to ensure robust parameter selection. During tuning, we optimized the area under the receiver operating characteristic curve as the primary performance metric.
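The tuning procedure can be sketched as follows for one of the five classifiers. The synthetic data and the small parameter grid are assumptions chosen to keep the example fast; the study's actual search grids are not reported here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold

# Synthetic training data with about 32% positives (not study data).
X, y = make_classification(
    n_samples=300, n_features=9, n_informative=5,
    weights=[0.68, 0.32], random_state=0,
)

# Five folds repeated 10 times gives 50 train/validation splits per candidate.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)

# Hypothetical grid, deliberately tiny for illustration.
param_grid = {"n_estimators": [50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="roc_auc",  # AUROC is the tuning objective
    cv=cv,
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("mean cross-validated AUROC:", round(search.best_score_, 3))
```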
We evaluated five candidate models using a tenfold cross-validation approach on the test set. The performance of each model was assessed using the Area Under the Precision-Recall Curves (AUPRC), F1-score (the harmonic mean of precision and recall), the Area Under the Receiver Operating Characteristic Curve (AUROC), accuracy, and the Brier score. The Brier score of a model measures the mean squared difference between predicted probabilities and actual outcomes; thus, a lower score shows a better agreement between predictions and observations.
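To make two of these metrics concrete, the sketch below computes the Brier score and the F1-score by hand on a four-sample toy example. The values are illustrative only, and the 0.5 decision threshold used for the F1-score is an assumption.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def f1_at_threshold(y_true, y_prob, threshold=0.5):
    """Harmonic mean of precision and recall at a fixed decision threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(y_true, y_prob) if p < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy outcomes and predicted probabilities (not study data).
    y_true = [1, 0, 1, 0]
    y_prob = [0.9, 0.6, 0.4, 0.3]
    print("Brier score:", brier_score(y_true, y_prob))
    print("F1-score:   ", f1_at_threshold(y_true, y_prob))
```

In this toy case one positive is missed and one negative is flagged, so precision and recall are both 0.5 and the F1-score is 0.5.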
ROC curves and precision-recall curves of the models were plotted, with the equal error rate (EER) marked on each curve. AUROC scores between models were compared through DeLong’s tests. Pairwise comparisons for other metrics among the models were performed by the bootstrapping method with 1000 replicates.
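One way to run such a bootstrap comparison is sketched below for the Brier score. The exact resampling scheme used in the study is not described, so the paired resampling of test cases and the percentile interval are assumptions, and all names are hypothetical.

```python
import random


def brier(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def paired_bootstrap_ci(y_true, prob_a, prob_b, metric,
                        n_replicates=1000, seed=0):
    """95% percentile interval for metric(model A) - metric(model B).

    The same resampled cases are used for both models in each replicate,
    so the comparison is paired.
    """
    rng = random.Random(seed)
    n = len(y_true)
    diffs = []
    for _ in range(n_replicates):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        y_s = [y_true[i] for i in idx]
        a_s = [prob_a[i] for i in idx]
        b_s = [prob_b[i] for i in idx]
        diffs.append(metric(y_s, a_s) - metric(y_s, b_s))
    diffs.sort()
    lower = diffs[int(0.025 * n_replicates)]
    upper = diffs[int(0.975 * n_replicates) - 1]
    return lower, upper


if __name__ == "__main__":
    # Toy data: model A is sharp and correct, model B always predicts 0.5.
    y_true = [1 if i % 3 == 0 else 0 for i in range(90)]
    prob_a = [0.9 if y else 0.1 for y in y_true]
    prob_b = [0.5] * len(y_true)
    print(paired_bootstrap_ci(y_true, prob_a, prob_b, brier))
```

If the interval excludes zero, the difference would be read as significant at roughly the 5% level. The Brier score is convenient here because it is defined for every resample, unlike rank-based metrics that need both classes present.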
Model explanation
The SHAP method was applied to interpret the final model’s predictions. This technique quantifies the impact of each feature on hypoglycemic episodes, clarifying how different factors contribute to the model’s outputs.
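The idea behind SHAP can be illustrated by computing exact Shapley values for a small hand-written risk function. This is a sketch of the underlying attribution principle, not the SHAP library or the study's model: the toy risk function, its coefficients, and the use of a single baseline patient in place of a background dataset are all assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one sample relative to one baseline.

    Features outside a coalition are set to their baseline value. The cost
    grows exponentially with the number of features, so this suits only
    small illustrative cases.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical risk score with an insulin x low-eGFR interaction."""
    insulin, duration_years, egfr = z
    low_egfr = 1 if egfr < 60 else 0
    return 0.10 + 0.30 * insulin + 0.01 * duration_years + 0.15 * insulin * low_egfr


if __name__ == "__main__":
    patient = [1, 20, 45]   # on insulin, 20 years of diabetes, eGFR 45
    baseline = [0, 8, 90]   # reference patient (assumed)
    phi = shapley_values(toy_risk, patient, baseline)
    for name, contribution in zip(["insulin", "duration", "eGFR"], phi):
        print(f"{name}: {contribution:+.3f}")
    print("sum:", round(sum(phi), 3),
          "difference:", round(toy_risk(patient) - toy_risk(baseline), 3))
```

The contributions add up to the difference between the patient's score and the baseline score, and the interaction term is shared equally between the two features involved.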
Statistical analysis
All ML workflows were executed in Python version 3.11.7 (https://www.python.org), along with scikit-learn (version 1.6.1), Pandas (version 2.2.3), and NumPy (version 2.1.3) packages. Continuous variables were reported as medians (interquartile ranges) and compared between the two outcomes using the Mann–Whitney U test. Categorical variables were expressed as frequencies (percentages) and analyzed using the Chi-square test. A two-tailed P-value less than 0.05 was considered statistically significant.
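The two group-comparison tests could be run as in the sketch below. The use of SciPy is an assumption, since it is not among the packages listed above, and the ages and the 2x2 table are hypothetical values, not study data.

```python
from scipy.stats import chi2_contingency, mannwhitneyu

ALPHA = 0.05  # two-tailed significance threshold stated in the text

# Hypothetical ages (years) in two outcome groups; not study data.
age_hypo = [66, 71, 59, 63, 74, 68, 61, 70]
age_no_hypo = [55, 60, 52, 64, 58, 49, 57, 62]
u_result = mannwhitneyu(age_hypo, age_no_hypo, alternative="two-sided")

# Hypothetical 2x2 table: rows are insulin therapy (yes, no),
# columns are hypoglycemia (yes, no).
table = [[30, 20], [25, 75]]
chi2, p_chi2, dof, expected = chi2_contingency(table)

print("Mann-Whitney U p-value:", round(u_result.pvalue, 4))
print("Chi-square p-value:    ", round(p_chi2, 6), "significant:", p_chi2 < ALPHA)
```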
Results
Participant characteristics
Of the 1537 volunteers initially enrolled, 212 (13.79%) were lost to follow-up because of undocumented hypoglycemic events or an unwillingness to continue. Nineteen persons (1.23%) were excluded due to receiving new medications or hospitalization. Finally, 1,306 participants were retained for analysis. Among the total cohort, 684 (52.37%) were female, and 622 (47.63%) were male. The cohort’s characteristics, according to the presence or absence of hypoglycemia, are summarized in Table 1.
Hypoglycemia episodes
During three months of follow-up, 419 out of 1,306 participants (32.08%) experienced at least one hypoglycemic episode. Among them, 397 (30.3%) had only nonsevere episodes, whereas 22 (1.78%) reported severe hypoglycemia for at least one episode.
Information about oral antidiabetic medications and insulin therapy for the hypoglycemia and non-hypoglycemia groups is shown in Table 1. Individuals on insulin therapy were classified as receiving intensive insulin therapy (IIT) if they used either basal-bolus regimens (multiple daily injections) or premixed insulin regimens administered two or three times per day.
The hypoglycemia group received significantly more IIT or sulfonylureas compared with the non-hypoglycemia group. The percentage of those receiving basal insulin alongside oral medications did not significantly differ between the two groups. Individuals treated only with DPP4 inhibitors and/or SGLT2 inhibitors did not report any hypoglycemic events.
Model development and performance comparison
Table 2 presents the performance metrics of the RF, XGBoost, SVC, LR, and KNN. It is important to note that, unlike the other metrics, where higher values indicate better performance, a lower Brier score signifies improved performance. This score numerically represents how accurately a model is calibrated. The calibration curve for each model illustrates the mapping of the model’s predictions to the actual outcomes.
The top three models were identified as RF, XGBoost, and SVC. Table 3 presents the results of pairwise statistical tests comparing the metrics among these models. The findings indicated that RF and XGBoost outperformed SVC with respect to both the AUPRC and the Brier score.
Figure 2 shows the ROC curves, precision-recall curves, and calibration curves of all the models. In terms of AUROC values, pairwise DeLong tests confirmed that there was no statistically significant difference among the top three models: RF (AUC = 0.94 ± 0.01), XGBoost (AUC = 0.94 ± 0.01), and SVC (AUC = 0.93 ± 0.01). At the equal error rate (EER), the point at which the false-positive rate equals the false-negative rate, RF outperformed the others with a misclassification rate of 14%, compared with 16% for both XGBoost and SVC. This indicates that RF misclassified fewer cases than the other two models when both error types are weighted equally.
Fig. 2. Performance comparison of the models for predicting hypoglycemia. A: The ROC curve, with the AUC score presented as Mean ± SD and shaded areas representing the upper and lower 95% confidence intervals. Equal Error Rates (EER) are indicated with “⨉,” where the rate of false positives is equal to the rate of false negatives. B: The precision-recall curves, with AUPRC shown as Mean ± SD and shaded for the upper and lower 95% confidence intervals, similarly indicating Equal Error Rates with “⨉.” C: The calibration curves, with the Brier score represented as Mean ± SD along with 95% confidence intervals (CI).
On precision–recall curves, RF and XGBoost achieved the highest AUPRCs, 0.88 ± 0.05 and 0.89 ± 0.02, respectively. A non-parametric bootstrap test showed no significant difference between these values. We then located the threshold at which precision equals recall. At this point, RF reached 81% precision and 81% recall, while XGBoost reached 77% for both. Thus, although their overall AUPRCs are nearly identical, RF offered a slightly better balance of precision and recall when errors are weighted equally.
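The equal-error point on the ROC curve and the point where precision equals recall can both be located with a simple threshold sweep. The sketch below uses a toy example with hypothetical function names and values, and it assumes both classes are present in the data.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Count TP, FP, FN, TN when predicting positive for p >= threshold."""
    tp = fp = fn = tn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def equal_error_point(y_true, y_prob):
    """Threshold where FPR and FNR are closest: (threshold, fpr, fnr)."""
    best = None
    for t in sorted(set(y_prob)):
        tp, fp, fn, tn = confusion_counts(y_true, y_prob, t)
        fpr, fnr = fp / (fp + tn), fn / (fn + tp)
        gap = abs(fpr - fnr)
        if best is None or gap < best[0]:
            best = (gap, t, fpr, fnr)
    return best[1:]


def break_even_point(y_true, y_prob):
    """Threshold where precision and recall are closest."""
    best = None
    for t in sorted(set(y_prob)):
        tp, fp, fn, tn = confusion_counts(y_true, y_prob, t)
        # tp + fp >= 1 because the sample scoring exactly t is predicted positive.
        precision, recall = tp / (tp + fp), tp / (tp + fn)
        gap = abs(precision - recall)
        if best is None or gap < best[0]:
            best = (gap, t, precision, recall)
    return best[1:]


if __name__ == "__main__":
    # Toy outcomes and scores (not study data).
    y_true = [0, 0, 0, 1, 0, 1, 1, 1]
    y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    print("EER point:       ", equal_error_point(y_true, y_prob))
    print("break-even point:", break_even_point(y_true, y_prob))
```

In this balanced toy example both points fall at the same threshold; with imbalanced classes they generally differ, because precision equals recall when the false-positive and false-negative counts match, whereas the equal error rate requires the two rates to match.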
Calibration curves alongside Brier scores are shown in Fig. 2C. The RF model achieved the best Brier score of 0.090 ± 0.012, closely followed by the XGBoost model, which scored 0.094 ± 0.013. However, the difference in performance between these two models was not statistically significant.
Table 3 shows that there was no statistically significant difference between the RF and XGBoost models across different metrics. Nevertheless, because the RF model demonstrated better EER in both the ROC curves and precision-recall curves, it was selected for subsequent analyses. Despite this, the same interpretation methods applied to the RF model were also used for the XGBoost model, and the results are provided in the Supplementary file.
The high AUC (0.94 ± 0.01) of the RF showed that the model distinguishes reliably between hypoglycemic and non-hypoglycemic cases, and the strong Brier score (0.090 ± 0.012) demonstrated that the predicted probabilities match actual outcomes well. These metrics provide confidence in the SHAP-based feature importance ranking.
The confusion matrix of the RF model is shown in Fig. 3.
Importance of covariates
We used the SHAP method to interpret the final RF model’s predictions by measuring each variable’s contribution to the outcome, which enabled us to identify and interpret the most important risk factors for hypoglycemic events. In practical terms, the SHAP-derived feature importance provides clinicians with insights into which variables most strongly influence the risk of hypoglycemia. To assess the reliability of this ranking, we also applied permutation importance, an alternative method for ranking feature importance, to the RF model.
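Permutation importance can be sketched in a few lines: shuffle one feature at a time and record how much the model's score drops. The example below uses a toy rule-based model and accuracy as the score; the scoring metric and repeat count used in the study are not reported, so these choices and all names are assumptions.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model reproduces the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance_sketch(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:]
                      for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical classifier that looks at feature 0 only."""
    return 1 if row[0] > 0.5 else 0


if __name__ == "__main__":
    data_rng = random.Random(1)
    X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
    y = [toy_model(row) for row in X]
    print(permutation_importance_sketch(toy_model, X, y))
```

Shuffling the feature the model ignores leaves accuracy unchanged, so its importance is zero, while shuffling the feature the model depends on produces a large drop.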
The SHAP summary plot (Fig. 4) ranks features in descending order based on their average absolute SHAP values, each point representing an individual sample. Point color indicates the feature value, where warmer colors (red) indicate higher values and cooler colors (blue) denote lower values. The SHAP dependence plots (Fig. 5) illustrate how the actual value of each feature relates to its SHAP value. Positive SHAP values mean that the feature’s actual value increases the likelihood of hypoglycemia.
Fig. 4. Model explanation by the SHAP method. SHAP summary plot of the RF model shows the overall importance of features and their impact on model predictions. Red dots represent higher feature values, except for the “Insulin therapy” and “Female” features, where red dots indicate the presence of insulin therapy and female gender, respectively.
A comparison of the SHAP summary plots for the RF model (Fig. 4) and the XGBoost model (Figure S.2), along with the results from applying permutation importance to the RF (Table 4) and XGBoost (Table S.1), identified the five most important predictors common to both models. In order of importance, these were insulin therapy, diabetes duration, eGFR, age, and HbA1c.
Diabetes duration exceeding 13.7 years, eGFR below 60.2 mL/min/1.73 m², age older than 59 years, and HbA1c below 10% were associated with an increased risk of hypoglycemia.
Although the RF and XGBoost models ranked the remaining covariates differently, both agreed that female gender and lower levels of triglycerides and total cholesterol elevated the probability of hypoglycemia (Fig. 4 and Figure S.2).
Discussion
This prospective study evaluated and compared various ML models to analyze hypoglycemia events in the daily lives of individuals with T2DM receiving outpatient care. By leveraging a robust ML prediction model in conjunction with routine laboratory assessments, we identified and ranked a set of risk factors associated with hypoglycemia.
Our investigation was based on SMBG devices, including glucometers, test strips, and lancets. Although this method is less reliable than CGM systems, it is more available and affordable in LMICs because CGM systems are not widely supplied by governments or found in private pharmacies in these regions14. Additionally, recent meta-analyses have shown that while CGM systems can improve overall glycemic control and lead to a significant reduction in HbA1C levels among individuals with T2DM, they do not show a significant advantage in detecting severe hypoglycemic events or in reducing glucose variability when compared with SMBG devices. Also, CGM has been associated with an increased risk of certain adverse events, including symptoms at the sensor insertion site, infections, headaches, and gastrointestinal issues28,29. Altogether, these findings suggest that using SMBG devices is a valuable approach for capturing symptomatic hypoglycemic events in the diabetic population of middle-income countries.
Incidence of hypoglycemia
After a three-month follow-up period, 30.3% of participants reported at least one non-severe hypoglycemic episode, while 1.78% experienced at least one severe event. This rate is lower than what has been reported in previous studies with similar follow-up durations in Europe30,31. In the Middle East region (Israel, Lebanon, and Saudi Arabia), the HAT study found a 39.1% rate of hypoglycemia in patients with type 2 diabetes over four weeks32. Differences in study design may explain our lower estimates. Specifically, the HAT population consisted entirely of insulin-treated patients, whereas most of our participants were on oral antidiabetic medications. Additionally, our participants were asked to self-monitor blood glucose before recording a hypoglycemic event, which likely improved the accuracy of our data. Moreover, during the study period, none of our participants required hospitalization or started any new medications that could have affected their glycemic control.
Importance of covariates
The results of this study showed that insulin treatment, diabetes duration, and eGFR are the three most important predictors of hypoglycemia, followed by age and HbA1c. These factors have also been identified as important predictors in a CGM-based cohort study33.
The present study found that individuals with a history of T2DM exceeding 13.7 years demonstrated a higher likelihood of experiencing hypoglycemia. This association has been consistently reported in both SMBG and CGM-based studies25,26,34. Recent ML modeling using data from CGM systems introduced this factor as a significant demographic predictor for hypoglycemia35. Participants with a longer duration of T2DM are more likely to receive multiple daily insulin injections, a well-established cause of hypoglycemia36.
Impaired renal function is another major contributor to hypoglycemia occurrence, as reported in SMBG-based investigations24,25,37,38. Reduced eGFR changes the metabolism and clearance of diabetic medications, leading to prolonged retention and action of insulin or insulin secretagogues in the body39. Additionally, conditions frequently associated with impaired renal function, such as anorexia, malnutrition, and chronic or recurrent infections, further elevate the likelihood of hypoglycemic events by compounding the body’s inability to maintain stable glucose levels40,41.
In this study, eGFR was the third significant risk factor, with values below 60.2 mL/min/1.73 m² substantially increasing the probability of hypoglycemia. A previous study based on the CGM method indicated that eGFR below 45 mL/min/1.73 m² was associated with higher hypoglycemia risk42, while another CGM-based study reported 60 mL/min/1.73 m² as a threshold for this factor43. Our finding aligns with a large cohort study of individuals with T2DM, which identified eGFR levels below 60 mL/min/1.73 m² as an independent predictor of outpatient hypoglycemia events38. This threshold has been recognized by the American Diabetes Association Professional Practice Committee as a risk factor for hypoglycemia13, further validating the robustness of our ML model and its outcomes.
We found that HbA1C levels exceeding 10% were associated with a reduced frequency of self-reported hypoglycemia. Research on the relationship between HbA1c and hypoglycemia has produced mixed findings; some studies suggested that lower HbA1c levels increase the risk of hypoglycemia44, while others reported the opposite or found no significant association26,30,45. This study suggested that elevated HbA1c levels (> 10%) may be linked to impaired awareness of hypoglycemia (IAH), a condition marked by a diminished ability to detect hypoglycemic symptoms and a reduced sympathetic response to low blood glucose, thereby masking typical adrenergic signs.
Patients with poor metabolic control usually require complex insulin regimens and experience rapid rises in blood insulin concentration over a short period. This results in the storage of large amounts of glucose, followed by hypoglycemia46. Repeated episodes of hypoglycemia impair counterregulatory mechanisms, disrupt the release of key hormones, and weaken the sympathoadrenal response47, which causes IAH to become a persistent complication of such therapy48.
Recurrent hypoglycemia, a high daily pill burden, and complex treatment protocols are known predictors of poor treatment adherence among those with T2DM49. Patients with poor medication compliance have demonstrated more susceptibility to IAH50. A similar rationale may explain the lipid profile data, where elevated triglycerides and total cholesterol levels were associated with fewer reported hypoglycemia events. This trend could suggest poor treatment adherence and the reinforcement of IAH in this subgroup. Consistent with these findings, Mellor et al. reported that lower triglyceride and total cholesterol levels were associated with more frequent hypoglycemia events in their model based on a large population25.
In this study, participants aged over 59 years experienced a higher incidence of hypoglycemia episodes. Notably, individuals aged 81 years and older reported fewer occurrences of hypoglycemic events. This finding suggests that the older subgroup may have a diminished ability to recognize hypoglycemic symptoms, likely due to reduced bodily sensitivity to stimulation47,51.
The final RF model identified BMI as a less important predictor for hypoglycemia. Previous studies have reported mixed findings on the relationship between BMI and hypoglycemia. For example, the UKPDS trial observed more frequent hypoglycemia in non-obese participants52, and the PREDICTIVE™ study associated lower BMI with an increased incidence of hypoglycemia53. Conversely, Malkani et al. found no significant association between BMI and hypoglycemia during a 3-month study period24. We found that individuals with a BMI higher than 34 kg/m² reported fewer events. The attenuated response of the autonomic nervous system, reduced sympathetic activation, and diminished parasympathetic inhibition in individuals with obesity and high insulin resistance may act as a buffer against sudden glucose fluctuations, potentially preventing symptomatic hypoglycemia triggered by sharp glucose declines. Furthermore, a higher glycemic setpoint, which reduces the likelihood of reaching critically low glucose levels, along with an exaggerated cortisol and glucagon response during hypoglycemia, could decrease the perception of classic hypoglycemia symptoms54.
Our findings showed the female gender as a weak risk factor. A previous study utilizing CGM also found that gender plays a minimal role in predicting hypoglycemia35. The result is similar to reports suggesting females have a higher risk of experiencing hypoglycemia5,55,56,57. One possible mechanism for the higher risk in females is their weaker counterregulatory response to hypoglycemic episodes compared to males58. Factors like socioeconomic status, study participation patterns, comorbidities, or concurrent medications may play a role in the sex-based differences in hypoglycemia risk and warrant further exploration.
Strengths and limitations
This is the first prospective study conducted in Iran to investigate hypoglycemia in the T2DM population. We did not rely on questionnaires, which are subject to recall bias. Instead, participants were educated to recognize hypoglycemic symptoms and instructed to perform SMBG before recording any hypoglycemic events. The study developed an ML model to assess the importance and effect of different risk factors on the occurrence of hypoglycemia.
The reliability of self-reported hypoglycemia in this study depended on participants’ understanding of hypoglycemia symptoms and their use of personal glucometers to document episodes. Despite comprehensive education provided during the study, individual variations in symptom awareness and interpretation could have influenced reporting accuracy. Additionally, because CGM systems are largely unavailable and unaffordable in LMICs, we relied on SMBG readings, which may have missed asymptomatic or nocturnal events. In future work, however, applying advanced machine-learning methods to reconstruct virtual CGM profiles prospectively—from the very start of data collection—could not only enhance detection accuracy but also help interpret threshold effects more reliably59.
The lack of data on participants’ prior hypoglycemia experiences represented a limitation, as we did not collect this information due to concerns about the reliability of their recollection and understanding.
Since the patients did not receive any new medications during the three-month follow-up, no significant changes in their lipid profiles and serum creatinine levels were expected. Thus, second-visit laboratory tests were used for analysis in this study.
The final ML model demonstrated good performance scores; however, the absence of an external dataset limited our ability to perform external validation, leaving us reliant solely on internal validation. This constraint may have influenced the generalizability of our findings.
The investigation of IAH was not a primary objective of this study. We did not use specific questionnaires or tools designed to assess IAH, which limits the conclusions that can be drawn about its association with the covariates analyzed. Future research should use targeted assessments to further explore these relationships and validate the hypotheses generated by this study.
Conclusion
The analysis revealed that hypoglycemia was more frequently reported in those with intensive insulin therapy, a diabetes duration exceeding 13.7 years, and eGFR below 60.2 mL/min/1.73 m². Other covariates, such as age, HbA1c, triglycerides, total cholesterol, and BMI, showed weaker, threshold-dependent effects on hypoglycemia. Female gender was also weakly related to hypoglycemia events.
Data availability
The data that support the findings of this study are not openly available due to reasons of sensitivity and are available from the corresponding author upon reasonable request.
Abbreviations
- ML: Machine learning
- T2DM: Type 2 diabetes mellitus
- SHAP: SHapley Additive exPlanation
- SMBG: Self-monitoring of blood glucose
- LR: Logistic regression
- KNN: K-nearest neighbor
- SVM: Support vector machine
- RF: Random forest
- XGBoost: eXtreme gradient boosting
- AUC: Area under the receiver operating characteristic curve
- IIT: Intensive insulin therapy
References
Magliano, D. J. & Boyko, E. J. IDF Diabetes Atlas 10th edition scientific committee. IDF Diabetes Atlas (International Diabetes Federation, Brussels, 2021).
Holman, R. R., Paul, S. K., Bethel, M. A., Matthews, D. R. & Neil, H. A. 10-year follow-up of intensive glucose control in type 2 diabetes. N. Engl. J. Med. 359(15), 1577–1589 (2008).
Lind, M., Imberg, H., Coleman, R. L., Nerman, O. & Holman, R. R. Historical HbA(1c) values may explain the type 2 diabetes legacy effect: UKPDS 88. Diabet. Care 44(10), 2231–2237 (2021).
de Galan, B. E. et al. Reducing the burden of hypoglycaemia in people with diabetes through increased understanding: Design of the Hypoglycaemia REdefining SOLutions for better liVEs (Hypo-RESOLVE) project. Diabet. Med. 37(6), 1066–1073 (2020).
Galindo, R. J. et al. Hypoglycemic and hyperglycemic crises among U.S. adults with diabetes and end-stage kidney disease: Population-based study, 2013–2017. Diabet. Care 45(1), 100–107 (2022).
Lee, A. K. et al. Risk factors for severe hypoglycemia in black and white adults with diabetes: The atherosclerosis risk in communities (ARIC) study. Diabet. Care 40(12), 1661–1667 (2017).
Lipska, K. J. et al. Trends in drug utilization, glycemic control, and rates of severe hypoglycemia, 2006–2013. Diabet. Care 40(4), 468–475 (2017).
Amiel, S. A. The consequences of hypoglycaemia. Diabetologia 64(5), 963–970 (2021).
Fidler, C., Elmelund Christensen, T. & Gillard, S. Hypoglycemia: An overview of fear of hypoglycemia, quality-of-life, and impact on costs. J. Med. Econ. 14(5), 646–655 (2011).
Karter, A. J., Moffet, H. H., Liu, J. Y. & Lipska, K. J. Surveillance of hypoglycemia-limitations of emergency department and hospital utilization data. JAMA Intern. Med. 178(7), 987–988 (2018).
Mattishent, K. & Loke, Y. K. Detection of asymptomatic drug-induced hypoglycemia using continuous glucose monitoring in older people - Systematic review. J. Diabet. Complicat. 32(8), 805–812 (2018).
Pilla, S. J. et al. Hospital utilization for hypoglycemia among patients with type 2 diabetes using pooled data from six health systems. BMJ Open Diabet. Res. Care 9(Suppl 1), e002153 (2021).
Glycemic Goals and Hypoglycemia. Standards of care in diabetes-2024. Diabet. Care 47(Suppl 1), S111–S125 (2024).
Ewen, M. et al. Availability, prices and affordability of self-monitoring blood glucose devices: Surveys in six low-income and middle-income countries. BMJ Public Health 3(1), e001128 (2025).
Au, N. H. et al. Real-world incidence and risk factors for daytime and nocturnal non-severe hypoglycemia in adults with type 2 diabetes mellitus on insulin and/or secretagogues (InHypo-DM Study, Canada). Can. J. Diabet. 46(2), 196-203.e2 (2022).
Montaser, E., Farhy, L. S. & Kovatchev, B. P. Novel detection and progression markers for diabetes based on continuous glucose monitoring data dynamics. J. Clin. Endocrinol. Metab. 110(1), 254–262 (2024).
Montaser, E. & Shah, V. N. Prediction of incident diabetic retinopathy in adults with type 1 diabetes using machine learning approach: An exploratory study. J. Diabet. Sci. Technol., 19322968241292369 (2024).
Agraz, M., Deng, Y., Karniadakis, G. E. & Mantzoros, C. S. Enhancing severe hypoglycemia prediction in type 2 diabetes mellitus through multi-view co-training machine learning model for imbalanced dataset. Sci. Rep. 14(1), 22741 (2024).
Shi, M. et al. A novel electronic health record-based, machine-learning model to predict severe hypoglycemia leading to hospitalizations in older adults with diabetes: A territory-wide cohort and modeling study. PLoS Med. 21(4), e1004369 (2024).
Witte, H., Nakas, C., Bally, L. & Leichtle, A. B. Machine learning prediction of hypoglycemia and hyperglycemia from electronic health records: Algorithm development and validation. JMIR Form. Res. 6(7), e36176 (2022).
Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Preprint at arXiv:1705.07874 (2017).
Agiostratidou, G. et al. Standardizing clinically meaningful outcome measures beyond HbA(1c) for type 1 diabetes: A consensus report of the American association of clinical endocrinologists, the American association of diabetes educators, the American diabetes association, the endocrine society, JDRF international, the Leona M. and Harry B. Helmsley charitable trust, the pediatric endocrine society, and the T1D Exchange. Diabet. Care 40(12), 1622–1630 (2017).
Heller, S. R., Peyrot, M., Oates, S. K. & Taylor, A. D. Hypoglycemia in patient with type 2 diabetes treated with insulin: It can happen. BMJ Open Diabet. Res. Care 8(1), e001194 (2020).
Malkani, S. & Kotwal, A. Frequency and predictors of self-reported hypoglycemia in insulin-treated diabetes. J. Diabet. Res. 2017, 7425925 (2017).
Mellor, J. et al. Risk factors and prediction of hypoglycaemia using the Hypo-RESOLVE cohort: A secondary analysis of pooled data from insulin clinical trials. Diabetologia 67(8), 1588–1601 (2024).
Silbert, R., Salcido-Montenegro, A., Rodriguez-Gutierrez, R., Katabi, A. & McCoy, R. G. Hypoglycemia among patients with type 2 diabetes: Epidemiology, risk factors, and prevention strategies. Curr. Diabet. Rep. 18(8), 53 (2018).
Van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45, 1–67 (2011).
Jancev, M. et al. Continuous glucose monitoring in adults with type 2 diabetes: A systematic review and meta-analysis. Diabetologia 67(5), 798–810 (2024).
Seidu, S., Kunutsor, S. K., Ajjan, R. A. & Choudhary, P. Efficacy and safety of continuous glucose monitoring and intermittently scanned continuous glucose monitoring in patients with type 2 diabetes: A systematic review and meta-analysis of interventional evidence. Diabet. Care 47(1), 169–179 (2024).
Chantzaras, A. & Yfantopoulos, J. Evaluating the incidence and risk factors associated with mild and severe hypoglycemia in insulin-treated type 2 diabetes. Value Health Reg. Issues 30, 9–17 (2022).
Rombopoulos, G., Hatzikou, M., Latsou, D. & Yfantopoulos, J. The prevalence of hypoglycemia and its impact on the quality of life (QoL) of type 2 diabetes mellitus patients (The HYPO Study). Hormones (Athens) 12(4), 550–558 (2013).
Khunti, K. et al. Rates and predictors of hypoglycaemia in 27 585 people from 24 countries with insulin-treated type 1 and type 2 diabetes: The global HAT study. Diabet. Obes. Metab. 18(9), 907–915 (2016).
Cichosz, S. L., Olesen, S. S. & Jensen, M. H. Explainable machine-learning models to predict weekly risk of hyperglycemia, hypoglycemia, and glycemic variability in patients with type 1 diabetes based on continuous glucose monitoring. J. Diabet. Sci. Technol., 19322968241286907 (2024).
Monnier, L., Wojtusciszyn, A., Colette, C. & Owens, D. The contribution of glucose variability to asymptomatic hypoglycemia in persons with type 2 diabetes. Diabet. Technol. Ther. 13(8), 813–818 (2011).
Duckworth, C. et al. Explainable machine learning for real-time hypoglycemia and hyperglycemia prediction and personalized control recommendations. J. Diabet. Sci. Technol. 18(1), 113–123 (2024).
Emral, R. et al. Self-reported hypoglycemia in insulin-treated patients with diabetes: Results from an international survey on 7289 patients from nine countries. Diabet. Res. Clin. Pract. 134, 17–28 (2017).
Borzì, V. et al. Risk factors for hypoglycemia in patients with type 2 diabetes, hospitalized in internal medicine wards: Findings from the FADOI-DIAMOND study. Diabet. Res. Clin. Pract. 115, 24–30 (2016).
Kaewput, W. et al. Incidence and risk factors associated with outpatient hypoglycemia in patients with type 2 diabetes and chronic kidney disease: A nationwide study. Endocr. Res. 45(4), 217–225 (2020).
Biesenbach, G., Raml, A., Schmekal, B. & Eichbauer-Sturm, G. Decreased insulin requirement in relation to GFR in nephropathic Type 1 and insulin-treated Type 2 diabetic patients. Diabet. Med. 20(8), 642–645 (2003).
Alsahli, M. & Gerich, J. E. Hypoglycemia in patients with diabetes and renal disease. J. Clin. Med. 4(5), 948–964 (2015).
Snyder, R. W. & Berns, J. S. Use of insulin and oral hypoglycemic medications in patients with diabetes mellitus and advanced kidney disease. Semin. Dial. 17(5), 365–370 (2004).
Gómez, A. M. et al. Clinical factors associated with high glycemic variability defined by coefficient of variation in patients with type 2 diabetes. Med. Devices (Auckl). 14, 97–103 (2021).
Guzmán, G. et al. Glycemic control and hypoglycemia in patients treated with insulin pump therapy: An observational study. J. Diabet. Res. 2020, 1581726 (2020).
Tschöpe, D. et al. Antidiabetic pharmacotherapy and anamnestic hypoglycemia in a large cohort of type 2 diabetic patients—An analysis of the DiaRegis registry. Cardiovasc. Diabetol. 10, 66 (2011).
Chen, N. C., Chen, C. L. & Shen, F. C. The risk factors of severe hypoglycemia in older patients with dementia and type 2 diabetes mellitus. J. Pers. Med. 12(1), 67 (2022).
Almigbal, T. H. Association between knowledge of hypoglycemia and likelihood of experiencing hypoglycemia among patients with insulin-treated diabetes mellitus. Diabet. Metab. Syndr. Obes. 14, 3821–3829 (2021).
Lamounier, R. N. et al. Hypoglycemia incidence and awareness among insulin-treated patients with diabetes: The HAT study in Brazil. Diabetol. Metab. Syndr. 10, 83 (2018).
Graveling, A. J. & Frier, B. M. Impaired awareness of hypoglycaemia: A review. Diabet. Metab. 36(Suppl 3), S64-74 (2010).
Peyrot, M., Barnett, A. H., Meneghini, L. F. & Schumm-Draeger, P. M. Factors associated with injection omission/non-adherence in the global attitudes of patients and physicians in insulin therapy study. Diabet. Obes. Metab. 14(12), 1081–1087 (2012).
Gomez-Peralta, F. et al. Adherence to antidiabetic treatment and impaired hypoglycemia awareness in type 2 diabetes mellitus assessed in Spanish community pharmacies: The ADHIFAC study. BMJ Open Diabet. Res. Care 9(2), e002148 (2021).
Bremer, J. P., Jauch-Chara, K., Hallschmid, M., Schmid, S. & Schultes, B. Hypoglycemia unawareness in older compared with middle-aged patients with type 2 diabetes. Diabet. Care 32(8), 1513–1517 (2009).
Wright, A. D., Cull, C. A., Macleod, K. M. & Holman, R. R. Hypoglycemia in Type 2 diabetic patients randomized to and maintained on monotherapy with diet, sulfonylurea, metformin, or insulin for 6 years from diagnosis: UKPDS73. J. Diabet. Complicat. 20(6), 395–401 (2006).
Lüddeke, H. J. et al. PREDICTIVE- a global, prospective observational study to evaluate insulin detemir treatment in types 1 and 2 diabetes: Baseline characteristics and predictors of hypoglycaemia from the European cohort. Diabet. Obes. Metab. 9(3), 428–434 (2007).
Lundqvist, M. H. et al. Altered hormonal and autonomic nerve responses to hypo- and hyperglycaemia are found in overweight and insulin-resistant individuals and may contribute to the development of type 2 diabetes. Diabetologia 64(3), 641–655 (2021).
Haukka, J. et al. Evaluation of the incidence and risk of hypoglycemic coma associated with selection of basal insulin in the treatment of diabetes: A Finnish register linkage study. Pharmacoepidemiol. Drug Saf. 22(12), 1326–1335 (2013).
Huang, B. et al. Hypoglycemia unawareness identified by continuous glucose monitoring system is frequent in outpatients with type 2 diabetes without receiving intensive therapeutic interventions. Diabetol. Metab. Syndr. 14(1), 180 (2022).
Lee, S. E. et al. Trends and risk factors in severe hypoglycemia among individuals with type 2 diabetes in Korea. Diabet. Res. Clin. Pract. 178, 108946 (2021).
Diamond, M. P. et al. Gender influences counterregulatory hormone responses to hypoglycemia. Metabolism 42(12), 1568–1572 (1993).
Kovatchev, B. P. et al. The Virtual DCCT: Adding continuous glucose monitoring to a landmark clinical trial for prediction of microvascular complications. Diabet. Technol. Ther. 27(3), 209–216 (2025).
Acknowledgements
We thank all study participants, their relatives, the members of the nursing team, and the project management team at Yazd diabetic research center, particularly Dr. Nasim Namiranian.
Author information
Contributions
R.A. and A.M. contributed to the initial analysis plan, including the literature review on important risk factors of hypoglycemia that informed this analysis. M.S. was involved in data cleaning, harmonization, quality control, coding the data analysis, interpreting the results, and writing the first draft of the manuscript. S.B. double-checked the accuracy of the analysis code. P.M.V. and S.B. revised the manuscript critically for important intellectual content. P.H. was involved in collecting information from the participants. All authors edited, reviewed, and approved the final version of the manuscript. R.A. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Ethics declarations
Competing interest
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Shabestari, M., Mehrabbeik, A., Barbieri, S. et al. Predictive factors of hypoglycemia in type 2 diabetes: a prospective study using machine learning. Sci Rep 15, 18143 (2025). https://doi.org/10.1038/s41598-025-03030-7