Introduction

Ischemic stroke (IS) is the most common type of stroke worldwide, accounting for 68% of all cases. According to the 2020 Global Burden of Disease study, IS affects 7.59 million individuals and is a leading cause of death, accounting for 49% of stroke deaths worldwide1. Although preventive measures and treatments for the disease continue to be updated, the high morbidity and mortality of IS position it as a major threat to human health. Therefore, greater attention should be paid to the prognostic management of IS patients in clinical research.

The main risk factors for IS include hypertension (HTN), atrial fibrillation, smoking, drinking, hyperlipidemia (HLD), and most importantly, diabetes. Controlling blood glucose levels and improving the status of other related risk factors (such as blood pressure and blood lipids) are key to preventing IS and improving patient outcomes2.

The concept of the hemoglobin glycation index (HGI) was first proposed in 2002 by Hempe et al.3, who defined it as the difference between the observed glycated hemoglobin (HbA1c) level and the HbA1c level predicted from a linear regression of HbA1c on fasting plasma glucose (FPG) (HGI = observed HbA1c − predicted HbA1c). Because it captures interindividual variation in hemoglobin glycation that FPG does not explain, the index may characterize an individual’s blood glucose control more fully than HbA1c or FPG alone.

A high HGI has been shown to be a risk factor for cardiovascular disease4,5,6. In the ACCORD trial, patients with high HGI were predicted to be at increased cardiovascular disease risk due to excessive hypoglycemia, whereas patients with low HGI were expected to benefit more from the use of treatment regimens for cardiovascular diseases7. This difference suggests that the HGI is not only a marker of blood glucose control but also may be closely related to the development of other complications. A study of patients with type 2 diabetes mellitus (T2DM) and acute IS revealed a U-shaped association between HGI and patient outcomes; that is, both low and high HGI levels were associated with increased risks of adverse outcomes8. However, another study based on data from the Medical Information Mart for Intensive Care (MIMIC)-IV database revealed that a low HGI was consistently associated with poor outcomes in patients with severe IS, whereas a high HGI may be a protective factor in the short term but may increase the risk of death in the long term9. In addition, recent research10 demonstrated that the HGI is a significant prognostic factor in acute ischemic stroke patients, with both low and high HGI levels associated with increased mortality risk, shedding light on its potential relevance to stroke outcomes.

The HGI could serve as a new predictor of all-cause mortality in IS patients by integrating the quality of blood glucose control, the extent of insulin resistance and individual biological differences. However, most existing studies are limited to critically ill populations and do not assess hospitalized patients outside the ICU setting. In contrast, our study includes a broader inpatient population, enhancing the external validity and clinical relevance of the findings. Moreover, we applied mediation analysis to explore the indirect effect of HGI between age and mortality and utilized machine learning approaches to construct predictive models.

Methods

Data source

In this study, data from the MIMIC-IV database (version 3.1) were used11. MIMIC-IV is an extensive, deidentified collection of health records containing information on more than 65,000 intensive care unit (ICU) patients and more than 200,000 emergency department visits in Boston, Massachusetts, USA, between 2008 and 202212. One author (Chuxin Lyu) completed the required training course and was granted access to the MIMIC-IV database (Certification Number: 61738903).

Study population

Inpatients diagnosed with IS were identified from the MIMIC-IV database according to International Classification of Diseases (ICD) codes13 (see Supplementary Table S1 for the ICD-9 and ICD-10 codes of the diseases of interest). Notably, the included IS patients were not necessarily newly diagnosed at the time of hospitalization, and IS was not always the main reason for their hospitalization. Some patients, for example, may have been admitted to the hospital for other reasons and been diagnosed with IS during hospitalization. For patients with multiple admissions, only the data from the first hospitalization were included for analysis. Only general ward inpatients were included in this study to ensure the consistency of baseline physiological status and to minimize the confounding impact of intensive interventions.

The inclusion criteria were as follows (Fig. 1):

Fig. 1 Flowchart of patient inclusion and exclusion.

Age ≥ 18 years on first admission to the hospital;

The exclusion criteria were as follows:

Patients with missing data for more than 20% of the key baseline variables (i.e., < 80% data completeness), including demographic information, medical history, laboratory results, and outcome variables;

Patients with missing FPG or HbA1c data;

Patients with data values outside plausible physiological ranges (measurements more than four standard deviations from the mean or values incompatible with life).

Data extraction and definitions

The baseline characteristics of the patients were extracted from the database with PostgreSQL. Potential confounding variables included the following: (1) demographic characteristics: age and sex; (2) comorbidities: HTN, chronic kidney disease (CKD), HLD, T2DM, ischemic heart disease (IHD), and chronic obstructive pulmonary disease (COPD); and (3) laboratory test data: blood urea nitrogen (BUN), creatinine (Cr), triglycerides (TG), low-density lipoprotein cholesterol (LDL-C), FPG, and HbA1c levels; white blood cell (WBC) and platelet (PLT) counts; and hemoglobin (Hb) level. For laboratory data, the first values recorded after admission were used. The HGI was calculated via the method proposed by Hempe et al.7: the baseline FPG and HbA1c data of all individuals were used to fit the linear relationship between FPG and HbA1c in the study cohort, the predicted HbA1c was calculated from the fitted equation (predicted HbA1c = 0.0082 × FPG + 4.8386), and the HGI was taken as the difference between the observed and predicted HbA1c levels. The relationship between the HGI and HbA1c level is shown in Fig. 2. Mortality was assessed from the first day of hospitalization. The primary endpoints of this study were 30-day and 1-year mortality.
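The HGI calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names `fit_line` and `hgi` are hypothetical, the example patient values are invented, and the unit of FPG (mg/dL is assumed for the example) is not stated in the text. Only the slope and intercept are taken from the reported equation.

```python
from statistics import fmean

def fit_line(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def hgi(observed_hba1c, fpg, slope=0.0082, intercept=4.8386):
    """HGI = observed HbA1c - predicted HbA1c.

    Defaults are the cohort equation reported in the text;
    the FPG unit is an assumption (not stated in the source).
    """
    predicted_hba1c = slope * fpg + intercept
    return observed_hba1c - predicted_hba1c

# Hypothetical patient, not study data: HbA1c 6.5%, FPG 120.
example_hgi = hgi(observed_hba1c=6.5, fpg=120.0)

# Toy cohort to show how a slope and intercept would be re-estimated.
toy_slope, toy_intercept = fit_line([100.0, 150.0, 200.0], [5.5, 6.0, 6.5])
```

In a new cohort, the regression would normally be refitted with `fit_line` (or an equivalent routine) before computing the HGI, because the coefficients are population specific.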

Fig. 2 Association between HGI and HbA1c.

Statistical analysis

Categorical variables are presented as percentages. Continuous variables were first subjected to a normality test; normally distributed data are expressed as the mean and standard deviation (SD), whereas nonnormally distributed data are expressed as the median and interquartile range (IQR). Student’s t test, the Mann–Whitney U test, the chi-square test, or Fisher’s exact test was used for between-group comparisons, as appropriate. The missing data rates were 0.4% for Hb, 0.5% for WBC, 0.5% for PLT and 0.1% for Cr. Owing to missing values for some variables and the nonnormal distribution of many variables, the median imputation method was used to process missing data. Variables with excessive missing data were excluded from the analysis.

Logistic regression and Cox proportional hazards regression analyses were used to investigate the associations of age and the HGI with mortality. First, univariable analyses were performed for each potential predictive factor. The least absolute shrinkage and selection operator (LASSO) regression method was then used to identify factors associated with death in IS patients, and the factors identified with LASSO regression were used as independent variables in multivariable logistic regression and Cox proportional hazards regression models. We applied logistic regression to evaluate the association between the HGI and binary endpoints (30-day and 1-year mortality) and Cox regression to capture time-to-event information and hazard dynamics. This dual approach enables comparison between static and dynamic prognostic values, which is particularly relevant in stroke cohorts with variable follow-up times. The performance of these two multivariable models was evaluated via receiver operating characteristic (ROC) curve analysis. For survival analysis, Kaplan–Meier (K–M) curves were plotted, and the restricted mean survival time (RMST) was calculated. Restricted cubic spline (RCS) curves were used to study potential nonlinear relationships of age and the HGI with mortality. Models were adjusted for all aforementioned covariates.

Mediation analysis was performed to determine whether the HGI had a mediating role in the relationship between the exposure variable (age) and mortality. These analyses were adjusted for all aforementioned covariates except age. To increase the robustness of the analysis, bootstrapping with 1000 resamplings was performed. The results included the size of the indirect path effect, the proportion of the mediating effect, and the related P values.

All the statistical analyses were performed with R software (version 4.4.1). A P value of less than 0.05 (two-sided) was considered to indicate statistical significance. The results of the multivariable logistic regression and Cox regression analyses are expressed as odds ratios (ORs) or hazard ratios (HRs) with 95% confidence intervals (CIs).
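The median imputation step mentioned in this section can be illustrated as follows. This is a sketch in Python under stated assumptions, not the R code used in the study: the function name `impute_median`, the use of `None` to mark a missing value, and the example Hb values are all hypothetical.

```python
from statistics import median

def impute_median(values):
    """Replace missing entries (None) with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a variable with no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]

# Hypothetical hemoglobin values (g/dL) with two missing entries.
hb = [13.2, None, 11.8, 12.5, None, 14.1]
hb_imputed = impute_median(hb)
```

The median is a reasonable fill value for skewed laboratory variables because, unlike the mean, it is not pulled toward extreme measurements; with missing rates of 0.5% or less, as reported here, the choice of imputation method is unlikely to change the results materially.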

Machine learning (ML)

Variables selected by LASSO analysis were incorporated into machine learning (ML) algorithms. The dataset was divided into a training set and a test set at a ratio of 7:3. The training set was used for model construction, and the test set was used for model evaluation. The support vector machine (SVM), extreme gradient boosting (XGB), random forest (RF) and decision tree (DT) algorithms were used to construct models on the basis of the selected variables and predict the 30-day mortality risk of IS patients. Hyperparameters were tuned during model development. The ROC curves and the corresponding area under the curve (AUC) were used to evaluate the model performance. The clinical practicality of the models was evaluated with decision curve analysis (DCA). The performance of the best model in the test set was further interpreted via SHapley Additive exPlanations (SHAP) values, which provide insight into the importance and ranking of each variable included in the model. SHAP values show the positive or negative impact of each variable on the model prediction; the screening threshold was set to 0.05. In addition, the performance of the optimal model was evaluated via bootstrapping.
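The 7:3 partition described above might look like the following in Python. This is only a sketch: the text does not say whether the split was stratified by outcome or which random seed was used, so a simple seeded random shuffle is assumed, and the function name `split_70_30` and the seed value are hypothetical.

```python
import random

def split_70_30(records, seed=42):
    """Randomly split records into a 70% training set and a 30% test set.

    The seed is an arbitrary illustrative choice; a plain (non-stratified)
    shuffle is assumed because the source does not specify the method.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Patient indices standing in for the 3269 included patients.
train_ids, test_ids = split_70_30(range(3269))
```

With 3269 patients, this yields 2288 training and 981 test records. When the outcome is rare, as 30-day mortality is, a stratified split is often preferred so that both sets contain a similar proportion of deaths.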

Results

Baseline patient characteristics

A total of 3269 IS patients were included in this study. Table 1 shows the baseline characteristics of the participants at the 30-day and 1-year follow-ups. In this cohort, compared with the 30-day survival group, the patients in the 30-day nonsurvival group were significantly older, and the incidences of CKD and IHD were significantly greater. In terms of laboratory indicators, compared with the 30-day survival group, the 30-day nonsurvival group demonstrated a significantly lower HGI, significantly greater levels of BUN, Cr, and FPG, a significantly greater WBC count, and significantly lower LDL-C and Hb levels. Similar trends were observed at the 1-year follow-up. Specifically, patients in the 1-year nonsurvival group were older than those in the survival group. The incidences of CKD, IHD, and COPD in the nonsurvival group were significantly greater than those in the survival group, whereas the prevalence of HTN was significantly lower than that in the survival group. In terms of laboratory indicators, compared with the survival group, the nonsurvival group again demonstrated a significantly lower HGI, significantly greater levels of BUN, Cr, and FPG, a significantly lower WBC count, and significantly lower LDL-C and Hb levels.

Table 1 Patient baseline information.

Primary outcomes of logistic regression analyses

Analysis of factors associated with 30-day mortality

Univariable and multivariable logistic regression analyses (Table 2) both revealed that age (univariable: OR = 1.06, 95% CI: 1.05–1.07, P < 0.001; multivariable: OR = 1.05, 95% CI: 1.04–1.07, P < 0.001) was positively associated with 30-day mortality, whereas the HGI (univariable: OR = 0.57, 95% CI: 0.46–0.71, P < 0.001; multivariable: OR = 0.44, 95% CI: 0.34–0.57, P < 0.001) was negatively associated with 30-day mortality.

Table 2 Logistic analyses: Risk factors of mortality.

LASSO regression analysis confirmed that age, the HGI, Hb level, WBC count, and BUN level were significant predictors of 30-day and 1-year mortality. A multivariable logistic regression model was subsequently constructed, and the results revealed that age (OR = 1.05, 95% CI: 1.04–1.06, P < 0.001) was positively correlated with 30-day mortality, whereas the HGI (OR = 0.51, 95% CI: 0.40–0.65, P < 0.001) was inversely correlated with 30-day mortality.

Analysis of the factors associated with 1-year mortality

Both univariable and multivariable logistic regression analyses revealed that age (univariable: OR = 1.06, 95% CI: 1.05–1.07, P < 0.001; multivariable: OR = 1.07, 95% CI: 1.06–1.08, P < 0.001) was positively correlated with 1-year mortality, whereas the HGI (univariable: OR = 0.79, 95% CI: 0.67–0.92, P = 0.003; multivariable: OR = 0.69, 95% CI: 0.56–0.84, P < 0.001) was negatively correlated with 1-year mortality.

LASSO logistic regression analysis further confirmed that age (OR = 1.07, 95% CI: 1.06–1.07, P < 0.001) and the presence of COPD (OR = 1.46, 95% CI: 1.09–1.96, P = 0.011) were positively correlated with 1-year mortality, whereas the HGI (OR = 0.70, 95% CI: 0.58–0.83, P = 0.001) was negatively correlated with 1-year mortality.

Stratified analysis of the HGI and its relationship with mortality risk

To further investigate the association between the HGI and mortality risk, the HGI was subjected to stratified analysis by quartile (Q1–Q4) (Table 3). Model 1 was unadjusted; Model 2 was adjusted for age; and Model 3 was adjusted for age, Hb level, WBC count, and BUN level. Compared with the Q1 group (−0.55 to −0.33), the Q3 (0.09 to 0.27) and Q4 (0.27 to 2.13) groups had a significantly lower 30-day mortality risk [Q3 group: OR = 0.53 (95% CI: 0.38–0.73, P < 0.001) in Model 1, OR = 0.40 (95% CI: 0.29–0.56, P < 0.001) in Model 2, and OR = 0.50 (95% CI: 0.35–0.71, P < 0.001) in Model 3; Q4 group: OR = 0.60 (95% CI: 0.44–0.82, P = 0.002) in Model 1, OR = 0.47 (95% CI: 0.34–0.65, P < 0.001) in Model 2, and OR = 0.51 (95% CI: 0.36–0.71, P < 0.001) in Model 3]. A trend test revealed that as the HGI increased (i.e., from the Q1 group to the Q4 group), the mortality risk decreased significantly (P for trend < 0.0001). A similar pattern was observed at the 1-year follow-up: the mortality risk tended to decrease from the Q1 group to the Q4 group.
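Assigning patients to HGI quartile groups, as in the stratified analysis above, can be sketched as follows. The function name `quartile_groups`, the `inclusive` quantile method, the convention that a value equal to a cut point falls in the upper group, and the example HGI values are all assumptions for illustration; the text does not specify how ties or cut points were handled.

```python
from bisect import bisect_right
from statistics import quantiles

def quartile_groups(values):
    """Label each value Q1..Q4 using the three quartile cut points."""
    cuts = quantiles(values, n=4, method="inclusive")
    # bisect_right counts how many cut points are <= the value,
    # so a value equal to a cut point goes to the upper group.
    labels = ["Q%d" % (bisect_right(cuts, v) + 1) for v in values]
    return labels, cuts

# Hypothetical HGI values, not study data.
hgi_values = [-0.5, -0.3, -0.1, 0.0, 0.1, 0.2, 0.4, 1.0]
labels, cuts = quartile_groups(hgi_values)
```

The resulting labels can then be entered into a regression model as a categorical variable with Q1 as the reference group, which is how the ORs for Q3 and Q4 above are expressed.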

Table 3 The association between HGI levels and mortality by logistic regression analyses.

Results of Cox regression analysis

To identify prognostic indicators for 30-day and 1-year mortality, univariable and multivariable Cox regression analyses were performed. As shown in Table 4, in the analysis of 30-day mortality, age (univariable: HR = 1.05, 95% CI: 1.04–1.06, P < 0.001; multivariable: HR = 1.05, 95% CI: 1.04–1.06, P < 0.001) was an independent risk factor, whereas the HGI (univariable: HR = 0.59, 95% CI: 0.48–0.72, P < 0.001; multivariable: HR = 0.47, 95% CI: 0.38–0.59, P < 0.001) was an independent protective factor for the IS patients. For the 1-year mortality risk, Cox regression analysis revealed a similar trend; age (univariable: HR = 1.05, 95% CI: 1.05–1.06, P < 0.001; multivariable: HR = 1.05, 95% CI: 1.05–1.06, P < 0.001) was an independent risk factor, whereas the HGI (univariable: HR = 0.79, 95% CI: 0.68–0.91, P = 0.001; multivariable: HR = 0.63, 95% CI: 0.54–0.74, P < 0.001) and Hb level (HR = 0.79, 95% CI: 0.76–0.82, P < 0.001) were independent protective factors for the IS patients.

Table 4 Cox analyses: Risk factors of mortality.

After screening the variables with LASSO regression analysis, age (30 days: HR = 1.05, 95% CI: 1.04–1.06, P < 0.001; 1 year: HR = 1.05, 95% CI: 1.04–1.06, P < 0.001) remained an independent risk factor for both 30-day and 1-year mortality for the IS patients, whereas the HGI (30 days: HR = 0.54, 95% CI: 0.44–0.66, P < 0.001; 1 year: HR = 0.68, 95% CI: 0.58–0.78, P < 0.001) and Hb level (HR = 0.87, 95% CI: 0.82–0.92, P < 0.001) remained independent protective factors against 30-day and 1-year mortality for the IS patients.

Kaplan–Meier (K–M) survival analysis

K–M survival analysis (Fig. 3a) revealed that the HGI was significantly associated with 30-day mortality in IS patients (log-rank test, P < 0.001). During the 30-day follow-up period, the mortality risk in the low-HGI group was significantly greater than that in the high-HGI group. This survival difference persisted at the 1-year follow-up (log-rank test, P < 0.001; Fig. 3b).

Fig. 3 KM survival curves of age and HGI levels for 30-day and 1-year mortality in IS patients.

Moreover, significant differences were observed in both the 30-day and 1-year survival rates between patients in the older and younger groups (log-rank test, P < 0.001). Specifically, survival curve analysis (Fig. 3c, d) revealed that the mortality risk in the older group was significantly greater than that in the younger group.

Restricted mean survival time

RMST analysis was employed to evaluate the association between the HGI and 30-day and 1-year mortality in IS patients (Fig. 4). Patients were divided into a high-HGI group (Arm 1) and a normal-HGI group (Arm 0) for comparative analysis.

Fig. 4 RMST analysis of age and HGI levels for 30-day and 1-year mortality in IS patients.

During the 30-day follow-up period, the average survival of patients in the high-HGI group was 26.70 days (95% CI: 25.88–27.51), whereas that of patients in the normal-HGI group was 28.17 days (95% CI: 27.95–28.40). The RMST difference between the two groups was −1.48 days (95% CI: −2.32 to −0.63, P = 0.001). According to RMST analysis, the 30-day survival of patients in the high-HGI group was significantly shorter than that of patients in the normal-HGI group.

This survival difference was greater at the 1-year follow-up. The average survival of the high-HGI group was 272.75 days (95% CI: 257.49–288.02), whereas that of the normal-HGI group was 309.94 days (95% CI: 305.55–314.32).
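The RMST is the area under the Kaplan–Meier survival curve up to a fixed horizon (30 or 365 days here). The following Python sketch shows that calculation on a toy dataset; it is not the authors' R code, the function names `km_curve` and `rmst` are hypothetical, and the four example patients are invented. The last line simply subtracts the two 1-year means reported above.

```python
def km_curve(times, events):
    """Kaplan-Meier estimate; returns [(event_time, survival_probability), ...].

    events[i] is 1 if patient i died at times[i] and 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve

def rmst(times, events, tau):
    """Area under the Kaplan-Meier step curve from time 0 to tau."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in km_curve(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)

# Toy data: two deaths (days 5 and 10), two patients censored at day 30.
toy_rmst = rmst([5, 10, 30, 30], [1, 1, 0, 0], tau=30)

# Difference implied by the reported 1-year group means (high minus normal).
diff_1y = round(272.75 - 309.94, 2)
```

For the toy data the curve steps from 1.0 to 0.75 at day 5 and to 0.5 at day 10, so the area is 5 + 3.75 + 10 = 18.75 days. The reported 1-year means imply a difference of about −37.19 days between the high-HGI and normal-HGI groups.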

Nonlinear analyses

To further investigate the association between HGI and the outcomes of IS patients, RCS analysis was used to assess the nonlinear association between the HGI and the 30-day and 1-year mortality of patients (Fig. 5).

Fig. 5 RCS analysis of HGI levels for 30-day and 1-year mortality in IS patients.

RCS analysis revealed a nonlinear relationship between the HGI and both the 30-day and 1-year mortality risk. Low HGI values were associated with markedly elevated mortality risk, and the risk gradually decreased as the HGI increased.

A significant nonlinear relationship was observed between age and the mortality risk in IS patients (Fig. 6). The RCS prediction plots revealed that with increasing age, the 30-day mortality risk in IS patients significantly increased, with a greater growth rate observed in the older group (> 80 years). In the 1-year mortality analysis, the association between age and mortality risk was also significant.

Fig. 6 RCS analysis of age for 30-day and 1-year mortality in IS patients.

Mediating effect of HGI on age and mortality in IS patients

Mediation analysis was performed to assess the mediating effect of the HGI on the relationship between age and the 30-day and 1-year mortality of IS patients (Fig. 7). To make the effect estimates easier to interpret, age was divided by 10, so that effects are expressed per 10-year increase. As shown in Table 5 and Fig. 8, in the analysis of 30-day mortality, the total effect of age on mortality was 0.0017 (95% CI: 0.0010 to 0.0029, P < 0.001), indicating that the 30-day mortality risk increased significantly with age. The average direct effect (ADE) was 0.0019 (95% CI: 0.0010 to 0.0032, P < 0.001), whereas the average causal mediation effect (ACME) via the HGI was −0.0001 (95% CI: −0.0003 to 0.0001, P < 0.001). The proportion mediated by the HGI was −0.0805 (95% CI: −0.1276 to −0.0494, P < 0.001); the negative sign indicates that, as a mediating variable, the HGI has an inhibitory effect on the positive relationship between age and mortality. Similar results were observed for the 1-year mortality assessment, indicating that the inhibitory effect of the HGI remained significant in the assessment of long-term mortality risk.

Fig. 7 Mediational models.

Table 5 Mediating effect of HGI on age (per 10-year increase) and the mortality of IS patients.

Fig. 8 Mediating effect of HGI on age (per 10-year increase) and the mortality of IS patients.

Establishment and validation of the prediction models

Multiple ML algorithms, including the SVM, XGB, RF and DT algorithms, were used to construct models to predict the 30-day mortality of IS patients. The models were evaluated via ROC curve analysis and DCA.

In the ROC curve analysis, the four ML models all achieved good predictive performance. As shown in Fig. 9, the AUC values differed to some extent across the models; the RF and SVM models showed the highest overall accuracy (76.66% and 76.76%, respectively), indicating relatively strong discriminative capability.

Fig. 9 The ROC curves for the four models.

Additional performance metrics for the four models are presented in Table 6. Among them, the RF model outperformed the others in terms of recall (0.437) and F1-score (0.469), and also achieved the highest G-mean (0.616), suggesting that it maintains better balance in identifying both positive and negative classes and offers more robust classification performance overall.
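For readers less familiar with these metrics, the sketch below shows how recall, F1-score and G-mean follow from a confusion matrix. The function name `classification_metrics` and the counts are hypothetical and chosen only for illustration; they are not the study's confusion matrix. G-mean is taken here as the geometric mean of sensitivity and specificity, which is the usual definition but is an assumption about how the authors computed it.

```python
from math import sqrt

def classification_metrics(tp, fp, fn, tn):
    """Common binary-classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)            # sensitivity: deaths correctly flagged
    precision = tp / (tp + fp)         # flagged patients who actually died
    specificity = tn / (tn + fp)       # survivors correctly not flagged
    f1 = 2 * precision * recall / (precision + recall)
    g_mean = sqrt(recall * specificity)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "f1": f1,
        "g_mean": g_mean,
        "accuracy": accuracy,
    }

# Hypothetical imbalanced test set: 100 deaths among 1000 patients.
metrics = classification_metrics(tp=40, fp=40, fn=60, tn=860)
```

In this invented example the accuracy is 0.90 even though only 40% of deaths are detected, which illustrates why recall, F1-score and G-mean are more informative than accuracy when the outcome is rare.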

Table 6 Comparison of the performance of the ML models.

To explain the contribution of each feature to the performance of the ML models, the SHAP method was employed. As shown in Fig. 10, age was the most important feature for predicting the 30-day mortality of IS patients, followed by the BUN level, Hb level, WBC count, and HGI.

Fig. 10 The results of SHAP analysis.

The SHAP dual-coordinate line graph and histogram clearly show the ranking of the relative importance of each feature, with age having the most significant effect. The SHAP ripple plot further revealed the direction and magnitude of the impact of the changes in the feature on the prediction results and indicated that age was positively correlated with mortality risk.

A detailed analysis of the feature screening table and the SHAP result matrix showed that age had the highest SHAP value for predicting mortality (mean approximately 0.0319), indicating that age contributed most to the model predictions. Although the HGI was not the most important predictor, it interacted with age, Hb level and other factors and thus added supplementary value to the prediction model.

Discussion

Through the analysis of the data of 3269 IS patients in the MIMIC-IV database, this study revealed the important role of the HGI and age in the prediction of mortality risk in IS patients. The results revealed that the HGI was an independent protective factor for the 30-day and 1-year mortality in IS patients, whereas age was the main risk factor. In addition, this is the first study to reveal that the HGI has a mediating effect on the relationship between age and mortality, which provides a new perspective for understanding the association between abnormal glucose metabolism and the outcomes of patients with IS.

Zhang et al.4 showed that an increased HGI was associated with an increased probability of cardiovascular diseases and total mortality risk in T2DM patients. Ahn et al.5 analyzed the data of 1248 South Korean adults and revealed that a high HGI was independently associated with a greater risk of cardiovascular diseases, cerebrovascular diseases, impaired glucose metabolism, and peripheral arterial disease. However, contrary to these studies, in this study, both K–M curve and Cox multivariable regression analyses revealed that the HGI was negatively correlated with mortality rate in IS patients, instead serving as a protective factor for survival. Previous studies have shown that a low HGI was associated with increased mortality in cardiovascular patients14,15. Our results are partially consistent with those of Huang et al.9, who similarly employed the MIMIC-IV database and found that a low HGI was associated with poor outcomes in patients with severe IS. However, unlike their ICU-focused cohort, our study was based on all hospitalized IS patients in the MIMIC-IV database, which includes both ICU and general ward patients. This expands the applicability of our conclusions to a broader clinical population.

Notably, RCS analysis revealed a non-linear relationship between the HGI and mortality risk, with low-HGI patients having a significantly increased risk of mortality. Low HGI values arise from various factors, such as shortening or renewal of the erythrocyte lifespan and the glucose gradient across the human erythrocyte membrane16. Stress hyperglycemia, a common condition in IS, may lead to higher FPG values and lower HGI values17. This acute glucose metabolism disorder under the action of stressors18 is mainly due to high catabolism caused by the activation of the hypothalamus–pituitary–adrenal axis and the release of various counterregulatory hormones, such as glucocorticoids and catecholamines19, and is closely associated with insulin resistance20. Multiple studies have shown that stress hyperglycemia is associated with poor outcomes in IS patients21,22,23, which indirectly supports the correlation between the HGI and the outcomes of IS patients observed in this study.

Age has been extensively confirmed in previous studies to be a strong predictor of mortality in IS patients24,25,26. In this study, the nonlinear relationship between age and mortality risk was elucidated through RCS curve analysis, which showed that the mortality risk was lowest at 24 years of age, whereas older age (> 80 years) was associated with a rapid increase in the mortality risk. This age effect may reflect the multiple challenges faced by elderly patients, including decreased immune function, an increased incidence of complications, and reduced vascular function26,27,28.

This study explored the mediating role of the HGI in the relationship between age and mortality in IS patients. Mediation analysis revealed that the average mediation effect was 0.0358 (95% CI: 0.0269–0.0457, P < 0.001), indicating that the HGI has a significant mediating effect on the relationship between age and mortality29. Age indirectly reduced the mortality risk by increasing the HGI (when the HGI was low), thus showing that the HGI exerted an inhibitory effect. Specifically, when other factors remain unchanged, age directly increased the mortality risk, but simultaneously, by affecting the HGI, age slightly reduced the mortality risk. This negative mediating effect means that, without the mediating effect of HGI, the effect of age on mortality may be stronger. In other words, the HGI partially alleviates the adverse effects of aging on the mortality risk, which may be associated with adaptive changes in metabolic regulation in elderly patients30, such as individual changes in blood glucose control or glycation.

Interestingly, we observed a negative mediating effect of HGI in the association between age and 30-day and 1-year mortality in ischemic stroke patients. Although HGI has traditionally been associated with increased levels of AGEs and oxidative stress, our findings suggest a potentially paradoxical role in elderly populations. One possible explanation lies in age-related changes in glycemic physiology. With aging, some patients may exhibit altered hemoglobin glycation dynamics or better adaptation to chronic hyperglycemia, potentially leading to an HGI profile that reflects stable glycemic control rather than acute hyperglycemic toxicity. This is in line with evidence showing that in high glycemic variability settings, elevated HbA1c does not necessarily predict mortality31.

In this context, a moderately elevated HGI could indicate reduced glycemic variability or a more predictable glycation process—both of which have been associated with improved vascular outcomes and lower mortality. This hypothesis is partially supported by recent studies indicating that the prognostic implications of HGI are not uniformly adverse and may depend on the interplay between age, comorbidities, and baseline glucose control strategies.

However, we acknowledge that these proposed mechanisms remain speculative and require validation through experimental studies. The biological plausibility of our findings needs further investigation through mechanistic research, including studies examining glucose transporter expression, neuroinflammatory markers, and cellular stress responses in relation to HGI levels across different age groups.

In this study, multiple ML algorithms were used to successfully construct prediction models for 30-day mortality in IS patients. The RF model showed the best performance, which is consistent with the ability of the algorithm to capture complex nonlinear relationships among variables. SHAP analysis further confirmed that age was the most important predictor of mortality, followed by BUN level, Hb level, WBC count, and the HGI. The identification of these features provides not only a reference for clinical risk assessment but also potential targets for early intervention.

The DCA results confirmed that these prediction models were superior to the "treat-all" or "treat-none" strategies under multiple decision thresholds, indicating that the models can offer tangible clinical benefits in managing IS patients. The integration of these models into clinical decision support systems could improve the risk stratification and individualized management of IS patients.

Although the F1-score of the RF model was relatively modest (0.469), it is important to contextualize this metric within the broader scope of clinical predictive modeling. Similar performance levels have been reported in ICU prediction studies where class imbalance and heterogeneous patient populations limit precision-recall optimization. Notably, prior studies have validated the utility of RF models with moderate F1-scores in supporting decision-making via model explainability and net clinical benefit32,33. Furthermore, in our analysis, the RF model demonstrated the highest G-mean and favorable SHAP interpretability, suggesting practical value in clinical stratification even under imperfect classification scores.

Study limitations

This study has several limitations. First, as a retrospective study, the results may be limited by inherent selection and information biases. Second, although known confounding factors were adjusted for through multivariable analysis, unmeasured confounding factors may have affected the results. Third, this study is based on the data from a single-center database, which may limit the external validity of the results. Fourth, the calculation of the HGI is based on the value first measured at the time of admission and therefore fails to reflect the dynamic changes in the HGI over time. Lastly, although our mediation findings suggest a possible protective role of HGI in certain age strata, the biological mechanisms remain speculative. Further mechanistic and longitudinal studies are necessary to clarify whether this association reflects a true pathophysiological adaptation or confounding by unmeasured metabolic variables.

Clinical significance and future directions

The results of this study may have great clinical importance. First, as a simple and easy-to-obtain indicator, the HGI could be included in the routine assessments of IS patients, especially elderly patients. Second, the identification of the optimal HGI threshold could provide a reference for the clinical identification of high-risk patients. Third, an understanding of the interaction between the HGI and age can be helpful for developing more accurate individualized treatment strategies. In the future, studies with larger sample sizes will be conducted to understand the relationship between the dynamic changes in the HGI and patient outcomes and to determine the mechanism of action of this relationship to support the findings of this study.

Conclusions

In this study, the HGI was found to be an independent protective factor against 30-day and 1-year mortality in IS patients and to act alongside age in affecting patient outcomes. A nonlinear relationship was observed between the HGI and mortality risk, with excessively low HGI values associated with a significantly increased risk of mortality. The mediating role of the HGI in the relationship between age and mortality provides a new perspective for understanding the metabolic characteristics of elderly IS patients. The ML-based prediction models further confirmed the clinical predictive value of the HGI. These findings provide new ideas and tools for risk assessment and individualized management in IS patients.