Introduction

Type 2 diabetes (T2D), a long-term metabolic disease marked by sustained hyperglycemia, has emerged as a global health epidemic. Over 500 million individuals aged 20–79 years, accounting for approximately 10.5% of this age group, were living with diabetes worldwide in 2021, based on the International Diabetes Federation1. The global prevalence of diabetes is projected to surpass 700 million by 20451. T2D substantially increases the risk of a wide range of complications, such as heightened mortality, cardiovascular disease, diabetic retinopathy, poor stroke outcomes, and chronic kidney disease2,3,4,5. Consequently, the early detection of individuals at high risk for T2D is essential for implementing effective public health interventions and preventing disease progression.

In recent years, metabolic indices that integrate lipid and glycemic parameters have garnered significant attention as practical tools for predicting T2D risk. In particular, the triglyceride-glucose index (TyG-i) has been extensively studied and validated as both a marker of risk and a reliable predictor for insulin resistance and T2D6,7,8,9,10,11,12. However, the TyG-i has a notable limitation: it does not account for long-term glycemic control, which is a critical component of T2D pathophysiology. Glycated hemoglobin (HbA1c), a well-established marker reflecting average blood glucose concentrations over the past 2–3 months, offers a more comprehensive glycemic status assessment than fasting glucose alone13. Despite its clinical importance, few studies have investigated the integration of HbA1c into metabolic indices for T2D risk prediction.

To address this gap, we developed a novel metabolic marker, the triglyceride-glycated hemoglobin index (TyH-i), which combines triglyceride (TG) levels and HbA1c to capture better the interplay between lipid metabolism and long-term glycemic control. This study aimed to assess the correlation between TyH-i and T2D risk in a large Japanese cohort and to compare its predictive performance with that of the widely used TyG-i.

Methods

Data source and study participants

The present study analyzed data from the NAGALA database, a longitudinal cohort study conducted at Murakami Memorial Hospital in Gifu, Japan, over the period from 1994 to 2016. The dataset was obtained from the Dryad Digital Repository, and all analyses complied with the repository’s data usage policies. Murakami Memorial Hospital’s ethics committee approved the original research, and each participant provided written informed consent14. For the secondary analysis, no additional approval from the institutional review board was necessary. Additionally, our study was conducted in compliance with the principles of the Declaration of Helsinki, with all procedures carried out in accordance with applicable guidelines and regulations.

The study population included individuals who underwent at least two health checkups at Murakami Memorial Hospital14. Participants were excluded based on criteria from the original study, including pre-existing liver disease, excessive alcohol consumption at baseline, use of any medication, missing data, a confirmed diagnosis of diabetes at baseline, missing fasting plasma glucose (FPG) values, or unexplained withdrawal from the study. Of the initial 20,944 participants, 15,464 remained after applying the exclusion criteria (Fig. 1).

Fig. 1
figure 1

Study Population.

Data collection

Baseline data were collected during routine health checkups and included demographic, clinical, and laboratory variables by trained staff. Covariates included sex, age, body mass index (BMI), systolic blood pressure (SBP), diastolic blood pressure (DBP), alcohol intake, smoking status, exercise habits, total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), aspartate aminotransferase (AST), alanine aminotransferase (ALT), gamma-glutamyl transferase (GGT), and FPG. Basic demographic information was collected through a structured questionnaire, while all blood examinations were performed using the MODULAR ANALYTICS system (Hitachi High-Technologies Corp. Ltd, Tokyo, Japan). The TyH-i was determined using the formula: TyH-i = Ln[HbA1c (%) × TG (mg/dL)/2]. Similarly, the TyG-i was determined using the formula: TyG-i = Ln[FPG (mg/dL) × TG (mg/dL)/2]7.

Outcome measures

T2D was diagnosed based on self-reported medical history, HbA1c levels ≥ 6.5%, or FPG levels ≥ 7.0 mmol/L15.

Statistical analysis

Baseline characteristics of the study population were summarized using descriptive statistics. Continuous data are reported as means ± standard deviations (SD) or medians with interquartile ranges (IQR), while categorical variables are reported as frequencies (%). Differences between groups were assessed using the chi-square test for categorical data, and the Kruskal-Wallis H test or one-way ANOVA for continuous data.

In adherence to the STROBE guidelines, three sequential multivariate Cox proportional hazard regression models were constructed to calculate hazard ratios (HR) and 95% confidence intervals (CI) for evaluating the effect of TyH-i levels on T2D. Model 1 was an unadjusted model with no covariates included. Model 2 adjusted for alcoholic intake, sex, exercise habits, age, BMI, smoking status, and SBP. Model 3 further expanded on Model 2 by incorporating additional adjustments for ALT, TC, AST, HDL-C, GGT, and FPG. Additionally, collinearity analysis identified DBP as collinear with other covariates, leading to its exclusion from the final multivariate Cox proportional hazards regression model (Table S1).

The robustness of the results was evaluated through sensitivity analyses. Considering the well-established associations between advanced age and obesity with T2D16,17, sensitivity analyses were performed by excluding participants aged ≥ 60 years or those with BMI ≥ 25 kg/m2 to assess the strength of the association between TyH-i and T2D risk.

This study employed Cox regression with smooth curve fitting and cubic spline functions to investigate the potential non-linear correlation between TyH-i and T2D. If non-linearity was detected, inflection points were identified using a recursive algorithm. This process involved dividing the dataset into two segments at an arbitrary starting point within the range of TyH-i values. Cox proportional hazards regression models were then applied separately to the left and right segments of the inflection point. The log-likelihood ratio test was used to determine the best-fitting model for assessing the association between TyH-i levels and T2D.

Further stratified analyses were conducted based on alcoholic intake, sex, exercise habits, age, BMI, and smoking status. Potential interactions among the stratified groups were evaluated using the likelihood ratio test.

The predictive performance of TyH-i was compared with the TyG-i using receiver operating characteristic (ROC) curve analysis, with the area under the curve (AUC) as the primary measure of discrimination. Model calibration was assessed using calibration plots generated by bootstrap resampling (1,000 iterations) and the Hosmer-Lemeshow goodness-of-fit (H-L) test. Model selection criteria, including Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), were calculated for each model to quantify goodness of fit and parsimony. To further investigate the incremental predictive value of the TyH index over the TyG index, we calculated both the categorical and continuous Net Reclassification Improvement (NRI) and the Integrated Discrimination Improvement (IDI). Moreover, we used a cohort of 8304 Chinese participants from the China Health and Retirement Longitudinal Study (CHARLS) for the external validation.

All results were reported following the STROBE statement guidelines. Statistical computations were performed utilizing the Empower-Stats software Version 5.2 (www.empowerstats.net, X&Y solutions, Inc. Boston, Massachusetts). Statistical significance was defined as a two-sided P-value below the threshold of 0.05.

Results

Participant characteristics

A total of 15,464 participants without T2D at baseline were included in the present study, with a mean age of 43.71 ± 8.90 years, of whom 54.51% (n = 8,430) were male. Table 1 presents the baseline characteristics stratified by TyH-i quartiles. In the highest TyH-i quartile (Q4), participants demonstrated higher ALT, TC, DBP, TG, FPG, SBP, AST, BMI, alcohol intake, GGT, and TyG-i. In addition, this group exhibited lower levels of HDL-C, a higher proportion of males and smokers, and a lower proportion of individuals engaging in exercise habits.

Table 1 The characteristics of participants.

The incidence rate of T2D

Over a median follow-up period of 5.39 years, a total of 373 individuals were diagnosed with T2D, resulting in a cumulative incidence of 2.41% (95% CI: 2.17–2.65) and an incidence rate of 3.99 per 1,000 person-years (Table 2). The cumulative incidence and incidence rate of T2D increased progressively across quartiles of TyH-i. Participants in the highest quartile (Q4) exhibited the highest cumulative incidence (5.79%, 95% CI: 5.06–6.53) and incidence rate (9.51 per 1,000 person-years), whereas those in the lowest quartile (Q1) had the lowest cumulative incidence (0.62%, 95% CI: 0.37–0.87) and incidence rate (1.07 per 1,000 person-years). A significant upward trend was observed across the TyH-i quartiles.

Table 2 Incidence rate of incident type 2 diabetes.

Additionally, Fig. 2 illustrates the Kaplan-Meier cumulative hazard curves for T2D stratified by TyH-i quartiles. The analysis revealed a clear gradient in T2D risk across the quartiles, with higher TyH-i levels associated with a significantly greater cumulative hazard of T2D (log-rank test, P < 0.0001). Participants in the highest TyH-i quartile (Q4) exhibited the steepest increase in T2D risk over time, while those in the lowest quartile (Q1) had the lowest cumulative hazard.

Fig. 2
figure 2

Kaplan-Meier Cumulative Hazard Curve for type 2 diabetes based on TyH-i quartiles (log-rank, P < 0.0001).

The results of the effect of TyH-i levels on T2D

Table 3 presents the effect of TyH-i levels on T2D across three models. In the unadjusted model (Model 1), elevated TyH-i levels were positively linked to T2D (HR: 4.00, 95% CI: 3.42–4.69, P < 0.0001). This correlation remained robust in Model 2 (HR: 2.41, 95% CI: 1.99–2.92, P < 0.0001), which adjusted for alcoholic intake, sex, exercise habits, age, BMI, smoking status, and SBP. In Model 3, which additionally accounted for ALT, TC, AST, HDL-C, GGT, and FPG, TyH-i remained independently associated with T2D risk (HR: 1.55, 95% CI: 1.22–1.97, P = 0.0003). Notably, in Model 3, individuals in the Q4 group exhibited a 21% higher risk of T2D (HR: (HR: 1.21, 95% CI: 0.74–2.00) than those in the Q1 group.

Table 3 Relationship between TyH-i and incident type 2 diabetes in different models.

Sensitive analysis

Sensitivity analyses were conducted to confirm the robustness of the findings (Table 4). In participants with BMI < 25 kg/m2 (Model 4), TyH-i remained significantly linked to T2D risk (HR: 1.57, 95% CI: 1.14–2.15, P = 0.0053). Similarly, in participants younger than 60 years (Model 5), the association persisted (HR: 1.59, 95% CI: 1.24–2.04, P = 0.0003).

Table 4 Relationship between TyH-i and incident type 2 diabetes in different sensitivity analyses.

Nonlinear relationship between TyH-i and T2D

Table 5 and Fig. 3 illustrate a nonlinear association between TyH-i and T2D. An inflection point for TyH-i was identified at 4.92. Below this threshold, no significant association was observed between TyH-i and T2D (HR: 0.61, 95% CI: 0.28–1.34, P = 0.2161). However, above the inflection point, TyH-i was strongly and positively linked to an elevated risk of T2D (HR: 1.73, 95% CI: 1.35–2.23, P < 0.0001).

Table 5 The result of the two-piecewise Cox proportional hazards regression model.
Fig. 3
figure 3

The nonlinear relationship between TyH-i and incident type 2 diabetes. The nonlinear relationship was observed after adjusting for multiple confounding factors, including sex, age, BMI, alcoholic intake, smoking status, exercise habits, SBP, ALT, AST, GGT, TC, HDL-C, and FPG.

Subgroup analysis

The effect of TyH-i on T2D risk was further analyzed in prespecified and exploratory subgroups (Table 6). No significant interactions were observed between TyH-i and subgroup characteristics, including age, sex, alcohol intake, smoking status, exercise habits, and BMI (all P for interaction > 0.05).

Table 6 Effect size of TyH-i on type 2 diabetes in prespecified and exploratory subgroups.

Diagnostic performance of TyH-i in identifying T2D

The diagnostic performance of TyH-i in identifying T2D was assessed through ROC curve analysis (Table 7; Fig. 4). The AUC for TyH-i was 0.751 (95% CI: 0.726–0.776), demonstrating good discriminatory ability. The optimal threshold for TyH-i was identified as 5.380, corresponding to a specificity of 67.0% and a sensitivity of 73.2%, with a resulting Youden Index of 0.402. The AUC for TyG-i was 0.750 (95% CI: 0.726–0.775). The optimal threshold for TyG-i was determined to be 8.196, with a specificity of 62.0% and a sensitivity of 77.5%, resulting in a Youden Index of 0.395. TyH-i and TyG-i possess comparable discriminatory ability in predicting incident T2D (AUC: 0.751 vs. 0.750, P = 0.8873). Moreover, TyH-i yielded lower AIC and BIC values than TyG-i (AIC: 3211.60 vs. 3211.91; BIC: 3226.89 vs. 3227.20) (Table 8). Calibration of the TyH-i and TyG-i models was assessed using calibration plots generated from 1,000 bootstrap resamples and the Hosmer-Lemeshow goodness-of-fit test (Fig. S1 and S2). The Hosmer-Lemeshow test yielded non-significant P-values (P = 0.1873 for TyH-i and P = 0.2164 for TyG-i). These findings demonstrate that both the TyH-i and TyG-i models exhibit excellent calibration and can reliably estimate the risk of T2D. Furthermore, external validation was conducted in a cohort of 8,304 Chinese participants. The AUC was 0.608 (95%CI: 0.591–0.626) for TyH-i and 0.605 (95%CI: 0.587–0.623) for TyG-i (Table S2 and Fig. S3). At the optimal cutoff, the specificity and sensitivity for TyH-i were 57.3% and 60.5%, respectively, while for TyG-i, the corresponding values were 48.0% and 67.8%.

Table 7 Areas under the receiver operating characteristic curves for each evaluated parameter in identifying type 2 diabetes.
Fig. 4
figure 4

The TyH-i for predicting type 2 diabetes in all participants by ROC analyses.

Table 8 Comparative analysis of TyH-i and TyG-i for type 2 diabetes risk prediction: evaluation using net reclassification improvement (NRI) and integrated discrimination improvement (IDI).

Discussion

This retrospective study investigated the correlation between TyH-i and T2D in a population of 15,464 participants without T2D at baseline. Our findings revealed that higher TyH-i levels remained significantly linked to a higher risk of T2D, even after adjusting for confounders. TyH-i and the TyG-i demonstrated comparable discriminatory abilities. Sensitivity and subgroup analyses consistently confirmed the robust association between TyH-i and T2D risk. These findings suggest that TyH-i could be a valuable and practical indicator for identifying individuals at elevated risk of developing T2D.

Insulin resistance is a fundamental feature in the pathogenesis of T2D and plays a critical role in the progression from normoglycemia to prediabetes and, ultimately, to diabetes18. TG and HbA1c, the two components of the TyH-i, are strongly linked to insulin resistance and metabolic dysfunction19,20,21,22. Previous studies have extensively investigated the relationships between TG, HbA1c, and T2D. For instance, a retrospective cohort analysis of over 30,000 participants demonstrated, through multivariate analyses, that elevated triglyceride levels within the normal range were linked to a higher risk of diabetes23. Similarly, a cross-sectional study of 20,108 diabetes patients found a J-shaped correlation between triglyceride levels and poor glycemic control after adjusting for confounding factors24. Furthermore, a prospective cohort study of 11,092 non-diabetic adults revealed that higher HbA1c levels, even within the nondiabetic range, were significantly linked to a higher risk of developing cardiovascular disease and diabetes, as shown by multifactorial outcome analyses25. Additionally, a systematic review encompassing 16 studies with a total of 44,203 participants found that HbA1c levels were strongly predictive of future diabetes, with individuals in the prediabetic range (5.7–6.4%) exhibiting a significantly higher risk of developing diabetes compared to those with lower HbA1c levels26. These findings underscore the critical roles of TG and HbA1c in metabolic dysfunction and their potential utility in predicting diabetes risk. Although there is no prior data on the association between TyH-i and the likelihood of developing T2D, an increase in TyH-i reflects either an elevation in triglyceride levels or an increase in HbA1c. Consequently, our results support the existing evidence that TyH-i is independently linked to T2D risk. Moreover, our sensitivity analysis demonstrated that the association between TyH-i and T2D risk remained robust in individuals with age < 60 years or BMI < 25 kg/m2. A higher TyH-i during the follow-up period indicates an elevated risk of T2D, underscoring the importance of early lifestyle interventions to proactively reduce the incidence of T2D.

A key finding of our study was the identification of a significant nonlinear relationship between TyH-i and T2D risk. We identified an inflection point at a TyH-i value (TyH-i = 4.92) through two-piecewise Cox proportional hazards regression analysis. Below this threshold, the correlation between TyH-i and T2D was not statistically significant. However, above this threshold, each unit increase in TyH-i was linked to a 73% higher risk of T2D. These results suggest that elevated TyH-i could be a valuable indicator for identifying individuals at an elevated risk of hyperglycemia, highlighting the importance of early lifestyle interventions to improve long-term health outcomes.

The observed association between TyH-i and T2D risk can be attributed to several underlying biological mechanisms. TyH-i represents the combined effects of triglycerides and glycated hemoglobin, both of which are strongly associated with insulin resistance and glucose dysregulation. Elevated triglyceride levels, a hallmark of metabolic syndrome, are linked to impaired insulin signaling and pancreatic β-cell dysfunction, both of which play critical roles in the pathogenesis of diabetes27,28. Similarly, glycated hemoglobin reflects chronic hyperglycemia and serves as a well-established predictor of T2D13,15.

The current study possesses several notable strengths. First, it is based on a large cohort with a long follow-up period, allowing for a robust assessment of the correlation between TyH-i and T2D risk. Second, the use of multiple statistical models and sensitivity analyses enhances the reliability and generalizability of our findings. Third, the comparison of TyH-i with TyG-i provides valuable insights into the relative utility of these indices in predicting T2D.

However, several limitations should be acknowledged. First, although we adjusted for a series of confounders, residual confounding could not be entirely excluded. Second, the study population was performed in a Japanese population, which might limit the generalisability of our results to other populations. Third, our study is a retrospective cohort study, and further experimental or interventional studies are required to confirm a causal correlation between TyH-i and T2D risk. Fourth, all participants who were taking any medication at baseline were excluded from our study. We acknowledge that this exclusion criterion may have impacted the representativeness of our sample. Unfortunately, the DATADRYAD database, which was the source of our data, does not provide information regarding the characteristics of the excluded population. In future research, we aim to validate the association between TyH-i and T2D in more diverse populations, including individuals who are taking medications at baseline.

Conclusion

Our study demonstrates that TyH-i is independently associated with an increased risk of T2D and exhibits a nonlinear relationship with T2D risk, with a steeper increase observed at higher TyH-i levels. These findings highlight the potential of TyH-i as a valuable and practical marker for T2D risk stratification.