Introduction

As global warming progresses, the incidence of heat-related illnesses has been reported to increase yearly, with an estimated 500,000 additional deaths worldwide each year1,2. Heat-related illnesses encompass a continuum that includes heat edema, heat syncope, heat cramps, heat exhaustion, and the most severe form, heatstroke3. Clinically, heatstroke is characterized by central nervous system dysfunction, multiorgan failure, and extreme hyperthermia (usually > 40.5 °C)4.

Heatstroke can be classified as either classic or exertional, depending on its cause. Both types involve an imbalance between the body’s heat production and heat dissipation, though their underlying mechanisms differ5. Classic heatstroke (CHS) results from passive exposure to environmental heat and inadequate heat-dissipation mechanisms. In contrast, exertional heatstroke (EHS) occurs due to exposure to a hot environment during physical exercise, when excessive metabolic heat production overwhelms the body’s physiological heat-loss mechanisms6. EHS mainly affects athletes, military personnel, firefighters, and occupational workers. For CHS, older adults are particularly vulnerable, especially those with common age-associated chronic health conditions (e.g., cardiovascular disease, hypertension, obesity, type 2 diabetes, chronic kidney disease)7. Heatstroke is a life-threatening condition that can progress to multiple organ failure and is associated with a reported 28-day mortality rate of up to 58%8. Currently, the primary treatment modalities for heatstroke include cooling (temperature control), rehydration therapy, and hemodialysis. While significant research has been conducted on symptomatic therapies for heatstroke, most of these approaches remain at various preclinical stages3. Therefore, management of heatstroke primarily focuses on prevention and early intervention to halt disease progression.

Rhabdomyolysis and heat-induced inflammatory damage both significantly elevate the risk of acute kidney injury (AKI) in heatstroke patients3. Previous studies have demonstrated that heatstroke complicated by AKI is associated with higher hospitalization costs and worse clinical outcomes9. Despite this urgency, AKI typically manifests in the later stages of the disease, and there remains a lack of studies specifically addressing the early prediction of AKI in heatstroke patients. To bridge this gap, this study introduces machine learning models designed to predict AKI incidence in heatstroke patients using clinical data obtained during the first 24 h of hospitalization.

Methods

Patients and study design

Data were collected from 55 hospitals in China between 2008 and 2024. After applying the inclusion and exclusion criteria, a total of 290 patients with heatstroke were enrolled in the study (Fig. 1)10. The inclusion criteria for this study were as follows: (1) a history of exposure to high-temperature environments and/or participation in high-intensity manual labor; (2) an axillary temperature above 39 °C; (3) evidence of central nervous system dysfunction, including symptoms such as delirium, coma, impaired consciousness, or disorientation5,6; and (4) a hospital stay of more than 24 h.

Fig. 1

Flowchart of patient enrollment and group allocation. A total of 511 patients were enrolled from 55 hospitals between 2008 and 2024. After excluding 97 patients under 18 years old, 89 patients with pre-existing comorbidities prior to heatstroke onset, and 35 patients with more than 30% missing data, 290 patients were included in the study. The included patients were then divided into two groups: CHS and EHS.

The exclusion criteria were as follows: (1) age under 18 years; (2) pre-existing comorbidities prior to heatstroke onset, including diabetes, cerebral infarction, pulmonary infection, chronic kidney disease, and dementia; and (3) more than 30% missing data in the patient's records.

Ethical considerations

The study was conducted by the PLA General Hospital and received approval from the ethics committees of all participating institutions. Each patient underwent comprehensive, condition-specific treatment, which included body cooling, fluid administration, and anti-inflammatory measures. For those diagnosed with rhabdomyolysis and AKI, organ support was provided as needed, in accordance with clinical guidelines. This support included appropriate hydration, urine alkalization, and, when necessary, continuous renal replacement therapy (CRRT), along with other interventions9.

Definitions

1. AKI: defined according to the Kidney Disease: Improving Global Outcomes (KDIGO) criteria as any one of the following: (1) an increase in serum creatinine (Scr) by ≥ 26.5 μmol/L (≥ 0.3 mg/dL) within 48 h; (2) an increase in Scr to ≥ 1.5 times the baseline within 7 days; (3) urine output < 0.5 mL/kg/h for 6 h11.

2. Rhabdomyolysis: characterized by acute muscle weakness, myalgia, and muscle swelling combined with a creatine kinase (CK) level > 1000 IU/L or > 5 × the upper limit of normal, which is the standard definition of rhabdomyolysis. The additional presence of myoglobinuria and AKI indicates a severe type of rhabdomyolysis12,13.

3. Sequential organ failure assessment (SOFA) score: a validated tool used to quantify the extent of organ dysfunction in critically ill patients. It evaluates six organ systems (respiratory, cardiovascular, hepatic, coagulation, renal, and neurological), each assigned a score ranging from 0 (normal function) to 4 (severe dysfunction). The total SOFA score, ranging from 0 to 24, reflects the overall severity of organ failure, with higher scores associated with increased mortality (Supplementary Table 1)14.

4. Disseminated intravascular coagulation (DIC): diagnosed based on a combination of clinical manifestations and laboratory findings that reflect systemic activation of the coagulation cascade. The International Society on Thrombosis and Haemostasis (ISTH) has proposed a widely accepted scoring system to identify overt DIC. The scoring system incorporates four parameters: platelet count, prolongation of prothrombin time (PT), levels of fibrin-related markers (such as D-dimer or fibrin degradation products), and fibrinogen concentration (Fib). Each parameter is assigned a score, and a cumulative score of ≥ 5 is considered indicative of overt DIC (Supplementary Table 2)15.

5. Effective cooling: defined as the reduction of core body temperature to below 38.5 °C within 30–60 min of initiating treatment. This threshold is widely accepted to prevent irreversible neurological injury and multi-organ dysfunction. Commonly employed cooling strategies include cold-water immersion, evaporative cooling, ice blanket therapy, intravascular temperature management, and extracorporeal methods such as cold hemodialysis or high-flow continuous hemodiafiltration5,16.
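As an illustration of the AKI definition above, the three KDIGO conditions can be written as a simple rule. The following is a minimal sketch, not the study's actual code: the record type `RenalObservation` and its field names are hypothetical, and the urine-output field is assumed to hold the mean output over the lowest 6 h window.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RenalObservation:
    """Hypothetical per-patient summary; field names are illustrative only."""
    baseline_scr_umol_l: float          # baseline serum creatinine
    max_scr_48h_umol_l: float           # highest Scr within 48 h
    max_scr_7d_umol_l: float            # highest Scr within 7 days
    min_urine_ml_kg_h_6h: Optional[float] = None  # lowest 6 h mean urine output


def meets_kdigo_aki(obs: RenalObservation) -> bool:
    """Return True if any one of the three KDIGO conditions is met."""
    # (1) absolute rise of at least 26.5 umol/L (0.3 mg/dL) within 48 h
    rise_48h = (obs.max_scr_48h_umol_l - obs.baseline_scr_umol_l) >= 26.5
    # (2) Scr reaching at least 1.5 times baseline within 7 days
    ratio_7d = obs.max_scr_7d_umol_l >= 1.5 * obs.baseline_scr_umol_l
    # (3) urine output below 0.5 mL/kg/h for 6 h (skipped if not recorded)
    oliguria = (obs.min_urine_ml_kg_h_6h is not None
                and obs.min_urine_ml_kg_h_6h < 0.5)
    return rise_48h or ratio_7d or oliguria
```

Treating a missing urine-output record as "criterion not met" is a design choice of this sketch; a real pipeline might instead flag such patients for manual review.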

Statistical analysis

All statistical analyses were performed using RStudio (version 2024.12.1) running R version 4.4.2 (R Core Team, 2024, https://www.r-project.org/). Prior to any analysis, data preprocessing was carried out to handle missing values and ensure data quality. Variables with more than 30% missing values were excluded from further analysis. For variables with less than 30% missingness, multiple imputation was performed using the “mice” package (version 3.18.0) to reduce bias and maximize statistical power17,18,19.
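The variable-level missingness filter described above can be sketched as follows. This is an illustrative Python example rather than the R code used in the study; the function names and the column-oriented dictionary layout are assumptions, and the multiple imputation step performed with "mice" is not reproduced here.

```python
def missing_fraction(values):
    """Fraction of entries recorded as missing (None) in one variable."""
    return sum(v is None for v in values) / len(values)


def drop_sparse_variables(table, threshold=0.30):
    """Keep only variables whose missing fraction does not exceed the threshold.

    `table` maps a variable name to its list of per-patient values.
    """
    return {name: column for name, column in table.items()
            if missing_fraction(column) <= threshold}
```

For example, a variable missing in 4 of 10 patients would be dropped, whereas one missing in 3 of 10 would be kept and passed on to imputation.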

Continuous variables were first assessed for normality using the Shapiro–Wilk test. Normally distributed variables are reported as means with standard deviations (mean ± SD), while non-normally distributed variables are expressed as medians with interquartile ranges (median [Q1, Q3]). Categorical variables are presented as frequencies and percentages. For group comparisons, the two-independent-samples t-test was applied to normally distributed continuous variables, and the Mann–Whitney U test was used for non-normally distributed data. The chi-square (χ2) test was employed to assess associations between categorical variables. A two-tailed p value < 0.05 was considered statistically significant.

Univariate logistic regression analyses were then conducted for all candidate predictor variables to explore their associations with the outcome of interest. Variables with p values < 0.05 in the univariate analysis were considered for inclusion in the multivariable logistic regression model. Prior to multivariable modeling, multicollinearity was assessed using the variance inflation factor (VIF); variables with a VIF greater than 5 were considered to have potential multicollinearity and were reviewed accordingly.
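For readers unfamiliar with the VIF, the following sketch shows one standard way to compute it: the VIF of each predictor equals the corresponding diagonal element of the inverse of the predictors' correlation matrix. The code is an illustration in plain Python with hypothetical function names; it is not the routine used in the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """Map each predictor name to its VIF (diagonal of the inverse correlation matrix)."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}
```

With only two predictors this reduces to 1 / (1 − r²), so a pairwise correlation of 0.8 gives a VIF of about 2.78, below the threshold of 5 used here.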

To further evaluate the predictive performance of significant variables, receiver operating characteristic (ROC) curve analyses were conducted for those identified as significant in univariate testing. Variables with an area under the curve (AUC) ≥ 0.7 were retained as candidates for model development20. The Youden Index was used to determine the optimal cutoff points by maximizing the sum of sensitivity and specificity.
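The AUC and the Youden Index for a single predictor can be illustrated with a short sketch. The function names are hypothetical, and the example assumes that higher values indicate higher risk; for predictors where lower values indicate risk (such as a minimum value), the sign of the predictor would need to be flipped first.

```python
def auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def roc_points(scores, labels):
    """List (threshold, sensitivity, specificity), calling score >= threshold positive."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((threshold, tp / n_pos, 1 - fp / n_neg))
    return points


def youden_cutoff(scores, labels):
    """Return the (threshold, sensitivity, specificity) maximizing J = sens + spec - 1."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)
```

Maximizing J is equivalent to maximizing the sum of sensitivity and specificity, since the two differ only by the constant 1.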

In this study, certain variables were derived to reflect the most clinically relevant values observed within the first 24 h of hospital admission. Specifically, variable suffixes such as “_min” and “_max” denote the minimum and maximum values, respectively, of clinical parameters recorded during that period. The choice of using either the minimum or maximum value for a given parameter was informed by both clinical expertise and relevant literature, reflecting the characteristic physiological trajectories observed in patients with heat stroke following disease onset.

Machine learning model construction

Logistic regression and five machine learning algorithms, namely support vector machine (SVM)21,22, XGBoost23,24, k-nearest neighbor (kNN)25,26, Naive Bayes27,28, and decision tree (DT)29,30, were implemented to develop early warning models for AKI. Model development and evaluation were conducted using R 4.4.2 (R Core Team, 2024, https://www.r-project.org/). These machine learning methods were selected due to their wide applicability, robustness, and capacity to address classification challenges in medical data. The multivariable logistic regression model was constructed using a stepwise backward elimination approach, with a threshold of p < 0.05 for retention in the final model. Model discrimination was evaluated using AUC, and calibration was assessed with the Hosmer–Lemeshow goodness-of-fit test. Logistic regression served as a benchmark against which the performance of the other machine learning algorithms was evaluated.

Model evaluation

To prevent overfitting, we applied 20-fold cross-validation (CV). The dataset was randomly partitioned into 20 subsets; each subset was used once as the test set while the remaining 19 subsets formed the training set. The final evaluation metric was the average of the performance across all folds. Model performance was assessed using standard classification metrics, including accuracy, precision, sensitivity, specificity, F1-score, and AUC. These are widely accepted evaluation measures in machine learning and are used here following conventional definitions. An ROC curve was also plotted to visually compare the classification performance of different models.
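The partitioning scheme and the threshold-based metrics described above can be sketched as follows. This is a plain-Python illustration with hypothetical function names, not the R workflow used in the study; the random seed and the way samples are dealt into folds are assumptions.

```python
import random


def k_fold_indices(n_samples, k=20, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # near-equal fold sizes
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels coded as 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)
    return {
        "accuracy": _ratio(tp + tn, len(y_true)),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": _ratio(tn, tn + fp),
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }
```

With 290 patients and 20 folds, each test fold in this sketch holds 14 or 15 patients, and the reported metric would be the mean of the per-fold values.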

Feature importance

For logistic regression, the fitted coefficients are reported as an equation, whereas for the machine learning algorithms the predictors are ranked according to their importance. Model interpretation was conducted using SHapley Additive exPlanations (SHAP), which provides a unified approach for calculating the contribution and influence of each feature on the final predictions. The SHAP values indicate how much each predictor contributes, either positively or negatively, to the predicted outcome31.
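To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a small model by enumerating all feature subsets, replacing "absent" features with a fixed baseline value. This is a simplified illustration under that baseline assumption, with hypothetical names; it is not the SHAP library and not the procedure used in the study, and it is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley attribution of predict(instance) - predict(baseline).

    `predict` takes a dict of feature values; `instance` and `baseline`
    are dicts with the same keys.
    """
    names = list(instance)
    n = len(names)

    def value(subset):
        # Features in `subset` take the patient's value, the rest the baseline.
        mixed = {k: (instance[k] if k in subset else baseline[k]) for k in names}
        return predict(mixed)

    contributions = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {name}) - value(members))
        contributions[name] = total
    return contributions
```

The contributions sum to the difference between the prediction for the patient and the prediction at the baseline, which is the additivity property that makes the values interpretable as positive or negative pushes on the prediction.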

Results

Demographic characteristics and baseline clinical data

A multicenter dataset comprising 511 patients diagnosed with heatstroke was established using clinical records collected from 55 hospitals across China between 2008 and 2024. Detailed data were collected for each patient, including demographic characteristics, medical history, clinical symptoms and signs, laboratory test results, imaging findings, diagnostic information, treatments and medication use, surgical and therapeutic interventions, follow-up, and clinical outcomes.

After applying predefined inclusion and exclusion criteria, a total of 290 patients with heatstroke were included in the final analysis. Among them, 263 were male; the median age was 25 [21, 41] years and the mean body mass index (BMI) was 23.66 ± 2.61. Occupational distribution showed that 89 patients (30.69%) were workers and 64 (22.07%) were farmers, followed by 38 unemployed individuals (13.10%), 35 athletes (12.07%), 26 students and teachers (8.97%), 23 retired individuals (7.93%), and 15 police officers or firefighters (5.17%).

Among the 290 cases, 90 (31.03%) were diagnosed with classic heatstroke (CHS) and 200 (68.97%) with exertional heatstroke (EHS). Rhabdomyolysis was observed in 78 EHS patients (39%). Among those, 57 cases (73.08%) developed acute kidney injury (AKI). In contrast, the incidence of AKI among EHS patients without rhabdomyolysis was 37.59%. In total, 117 of the 200 EHS patients (58.5%) developed AKI, compared with 51 of the 90 CHS patients (56.67%). Overall, AKI occurred in 168 of the 290 heatstroke patients (57.93%) and 28 patients (9.66%) died during hospitalization.

There were no significant differences in sex, age, or BMI between patients with and without AKI. However, the incidence of rhabdomyolysis and the proportion of patients receiving CRRT were significantly higher in the AKI group. In addition, cooling measures appeared to be less effective among patients who developed AKI (Table 1).

Table 1 Demographic and clinical characteristics of patients with heat stroke.

Univariate analysis

This study incorporated data from the first 24 h of hospitalization, comprising a total of 53 commonly used clinical indicators, including admission temperature, heart rate, respiratory rate, mean arterial pressure (MAP), SOFA score, GCS score, arterial blood gas analysis, complete blood count, coagulation parameters, biochemical markers, and myocardial injury biomarkers. Univariate analysis was performed to assess whether these indicators differed significantly between patients who developed AKI and those who did not. Based on the results of the univariate analysis, 40 indicators that showed significant differences between the two groups were selected for further analysis (Supplementary Table 3).

The AKI group exhibited significantly higher body temperature (41.23 ± 2.17 °C vs. 38.35 ± 1.66 °C, p < 0.001) and higher heart rate (89.00 [76.75, 100.00] bpm vs. 80.00 [68.00, 102.00] bpm, p = 0.045) compared to the non-AKI group. Additionally, Scr and blood urea nitrogen (BUN) levels were markedly elevated in the AKI group (p < 0.001). Coagulation and inflammatory markers, including platelet count (PLT), thrombin time (TT), prothrombin time (PT), activated partial thromboplastin time (APTT), fibrinogen (Fib), prothrombin activity (PTA), international normalized ratio (INR), D-dimer, procalcitonin (PCT), neutrophil count (Neu), white blood cell count (WBC), lymphocyte count (Lym), lactate dehydrogenase (LDH), and interleukin-6 (IL-6), were all significantly higher in the AKI group (p ≤ 0.001). Liver function markers, including aspartate aminotransferase (AST), alanine aminotransferase (ALT), total bilirubin (TBIL), direct bilirubin (DBIL), and albumin (ALB), also indicated greater severity in patients with AKI (p ≤ 0.001). Furthermore, biomarkers associated with rhabdomyolysis, such as creatine kinase (CK) (p = 0.002), CK-MB, and myoglobin (Mb), were significantly elevated (p < 0.001).

Receiver operating characteristic curve analysis

Furthermore, ROC curve analysis was conducted for the 40 variables that showed statistical significance in the univariate analysis. Among these, seven variables demonstrated acceptable discriminatory performance, with an AUC ≥ 0.70. Specifically, the results were as follows: HCT_min (AUC = 0.710, 95% CI 0.649–0.772), Scr_max (AUC = 0.798, 95% CI 0.744–0.851), BUN_max (AUC = 0.704, 95% CI 0.643–0.765), Mb_max (AUC = 0.705, 95% CI 0.640–0.770), TnT_max (AUC = 0.709, 95% CI 0.627–0.792), D-Dimer_max (AUC = 0.755, 95% CI 0.694–0.816), and IL-6_max (AUC = 0.704, 95% CI 0.617–0.791), as shown in Fig. 2A.

Fig. 2

ROC curves and correlation analysis of candidate predictors. (A) Receiver operating characteristic (ROC) curves for individual laboratory indicators in predicting acute kidney injury (AKI). The corresponding area under the curve (AUC) values and 95% confidence intervals are indicated for Hematocrit (HCT), Serum Creatinine (Scr), Blood Urea Nitrogen (BUN), Myoglobin (Mb), Cardiac Troponin T (TnT), Interleukin-6 (IL-6) and D-Dimer. (B) Heatmap of Pearson correlation coefficients among candidate predictor variables. The color gradient reflects the strength and direction of pairwise correlations, ranging from − 1 (strong negative correlation) to + 1 (strong positive correlation). This visualization complements the variance inflation factor (VIF) analysis by illustrating potential collinearity between variables.

Multicollinearity assessment

Considering that serum creatinine is part of the diagnostic criteria for AKI and may therefore artificially enhance model performance, we excluded this variable from subsequent analyses. Multicollinearity among the independent variables was assessed using the VIF.

Several variables exhibited moderate to high multicollinearity, with VIF values exceeding commonly accepted thresholds (Supplementary Table 4). Notably, K+_max (VIF = 120.246), GCS score_min (VIF = 38.269), Cl_min (VIF = 31.314), Lac_max (VIF = 21.404), and INR_max (VIF = 20.672) demonstrated substantial collinearity. Other variables, including pH_min, BUN_max, Na+_min, resp_rate, MAP_max, Hb_min, and AST_max, also showed VIF values greater than 10, indicating moderate multicollinearity. A total of 35 variables with VIF < 5 were considered acceptable and retained for subsequent multivariable modeling. Figure 2B presents a heatmap of the Pearson correlation coefficients among candidate predictor variables.

Model performance

Based on acceptable discrimination (AUC ≥ 0.70) and the absence of significant multicollinearity (VIF < 5), five key predictors were selected for inclusion in the construction of the AKI early warning model: HCT, D-dimer, IL-6, TnT, and Mb. Multiple imputation was first performed to handle missing values in these five variables (Fig. 3A).

Fig. 3

Model evaluation, feature importance, and data completeness analysis across multiple algorithms. (A) Proportion of missing values for the five selected features: hematocrit (HCT), D-dimer, IL-6, troponin T (TnT), and myoglobin (Mb). (B) Receiver operating characteristic (ROC) curves for different classification models in predicting acute kidney injury (AKI). (C–H) Feature importance rankings derived from different machine learning models: (C) decision tree, (D) kNN, (E) logistic regression, (F) Naive Bayes, (G) SVM, and (H) XGBoost (based on SHAP values).

Among all models evaluated, the kNN and SVM algorithms demonstrated the highest discriminative performance, with AUCs of 0.934 (95% CI 0.909–0.959) and 0.924 (95% CI 0.886–0.962), respectively, indicating excellent discrimination. The XGBoost model also performed well, achieving an AUC of 0.863 (95% CI 0.842–0.884), followed closely by Naive Bayes, with an AUC of 0.851 (95% CI 0.808–0.893). In contrast, logistic regression yielded a relatively lower AUC of 0.753 (95% CI 0.697–0.808), though it maintained acceptable discriminative ability (Fig. 3B). The corresponding metrics are presented in Supplementary Table 5. The final logistic regression model can be expressed as follows:

$$\begin{aligned} \log \left( \frac{P(AKI)}{1 - P(AKI)} \right) &= 0.1618 - 0.0602 \times HCT_{min} + 0.0001 \times IL\_6_{max} - 0.0017 \times TnT_{max} \\ &\quad + 0.0005 \times Mb_{max} - 0.0001 \times D\_Dimer_{max} \end{aligned}$$
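To show how the equation above maps predictor values to a predicted probability, the following sketch applies the inverse logit to the reported coefficients. The dictionary keys and function name are hypothetical, the logarithm is assumed to be natural, and each input is assumed to be expressed in the same units as in the study dataset, which are not restated here.

```python
from math import exp

# Coefficients copied from the reported logistic regression equation.
INTERCEPT = 0.1618
COEFFICIENTS = {
    "HCT_min": -0.0602,
    "IL_6_max": 0.0001,
    "TnT_max": -0.0017,
    "Mb_max": 0.0005,
    "D_Dimer_max": -0.0001,
}


def predicted_aki_probability(values):
    """Convert the linear predictor (log-odds) into a probability of AKI."""
    log_odds = INTERCEPT + sum(COEFFICIENTS[name] * values[name]
                               for name in COEFFICIENTS)
    return 1.0 / (1.0 + exp(-log_odds))
```

Because the coefficient for HCT_min is negative, a lower HCT_min yields a higher predicted probability when the other inputs are held fixed.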

The models' predictive performance was comprehensively evaluated using multiple metrics, including AUC with 95% CI, accuracy, specificity, precision, sensitivity, and F1 score, as summarized in Table 2. Among the models tested, the kNN algorithm demonstrated the best overall performance, with an AUC of 0.934 [0.909–0.959], accuracy of 0.841 [0.800–0.879], specificity of 0.851 [0.798–0.903], precision of 0.803 [0.733–0.873], sensitivity of 0.828 [0.758–0.891], and an F1 score of 0.814 [0.757–0.870].

Table 2 Classification performance metrics of various machine learning algorithms.

Figure 3C–H illustrates the feature importance for logistic regression and the five machine learning models in predicting AKI: decision tree, kNN, Naive Bayes, SVM, and XGBoost (SHAP-based). Across all models, HCT, TnT, and Mb consistently emerged as the most important features, followed by D-dimer and IL-6.

We observed that the increase in HCT levels was evident within the first 24 h following the onset of heat stroke, with an HCT of 2.66 [2.31, 4.40] in the 290 patients. The HCT levels in patients who developed AKI after heat stroke were significantly lower than in those without AKI (patients with AKI: HCT = 2.52 [2.21, 4.15]; patients without AKI: HCT = 3.42 [2.40, 4.40]; p < 0.001). Based on the machine learning results, our findings suggest that a lower HCT level within the first 24 h of hospital admission serves as an important indicator for predicting subsequent acute increases in Scr or oliguria during hospitalization. Additionally, D-dimer emerged as a significant predictor of AKI development. Patients with AKI exhibited significantly higher levels of D-dimer and Mb compared to those without AKI (patients with AKI: D-dimer = 2.65 [0.83, 10.23], Mb = 557.95 [341.33, 2295.50]; patients without AKI: D-dimer = 1.08 [0.50, 3.78], Mb = 422.00 [233.40, 924.90]; p < 0.001).

Discussion

After applying the inclusion and exclusion criteria, a total of 290 heatstroke patients from 55 hospitals in China, admitted between 2008 and 2024, were enrolled in the study. AKI was defined according to the KDIGO criteria, that is, an acute increase in Scr or oliguria during hospitalization, and occurred in 57.93% of the patients.

Currently available evidence suggests that clinical research focusing on risk assessment, prediction, and identification of risk factors for AKI in patients with heatstroke remains limited. Most existing studies have relatively small sample sizes, typically ranging from 58 to 187 patients, which may constrain the generalizability of their findings32,33. Previous studies have reported that the combination of serum Mb and lactate dehydrogenase (LDH) can effectively predict AKI in heat stroke patients with concomitant rhabdomyolysis34, achieving an AUC of up to 0.9116. Additionally, the lowest platelet count recorded during hospitalization35 demonstrated predictive value for AKI, with an AUC of 0.73. Other identified independent risk factors for AKI in patients with EHS include elevated lymphocyte and neutrophil counts, D-dimer levels, and Mb ≥ 1000 ng/mL, and Mb has been found to be a more reliable predictor of AKI than CK9,36. However, to the best of our knowledge, no studies to date have established an early warning model for AKI in heatstroke patients using machine learning algorithms based on early-phase clinical data.

Based on the first 24 h of hospitalization data, the AKI group exhibited significantly higher temperature and heart rate compared to the non-AKI group. Additionally, patients with AKI showed significantly elevated levels of kidney function indicators, coagulation and inflammatory markers, as well as more severe liver dysfunction and rhabdomyolysis. We performed logistic regression and five machine learning algorithms to predict AKI occurrence during hospitalization based on data from the first 24 h. Among the models tested, the kNN algorithm demonstrated the best overall performance, with an AUC of 0.934 [0.909–0.959], and key predictors included TnT, HCT, D-dimer, and Mb.

Although previous studies have suggested that AKI typically manifests in the middle to late stages of heatstroke progression, emerging evidence indicates that renal impairment may begin much earlier in the disease course. Notably, elevated Scr levels observed in the early phase are associated with a higher likelihood of a rapid rise in Scr and the subsequent development of oliguria, suggesting early subclinical kidney injury and a more aggressive renal trajectory during hospitalization37.

Coagulation dysfunction is a common complication in patients with heat stroke and shares pathophysiological similarities with sepsis-associated coagulopathy36. Previous studies have demonstrated that D-dimer levels are predictive of AKI in various clinical settings, including patients with intra-abdominal infections, those admitted to intensive care units (ICUs), individuals with ST-segment elevation myocardial infarction (STEMI), and patients with sepsis38,39,40. D-dimer is a fibrin degradation product that indicates ongoing fibrinolysis following the activation of coagulation. Its marked elevation in heatstroke may signal systemic endothelial injury, excessive thrombin generation, and a hypercoagulable state, all of which can result in microvascular thrombosis41. These microthrombi may occlude renal capillaries and small arterioles, impairing renal perfusion and oxygen delivery, thereby promoting ischemic tubular injury and contributing to the development or worsening of AKI9,42.

Furthermore, heat stroke is associated with a systemic inflammatory response similar to sepsis, leading to cytokine-mediated endothelial activation and dysfunction43. This promotes the expression of tissue factor, amplifies the coagulation cascade, and inhibits natural anticoagulant pathways (e.g., antithrombin III, protein C)44. The resulting imbalance between coagulation and fibrinolysis may aggravate microvascular thrombosis and inflammatory injury within renal tissue45. D-dimer may therefore serve not only as a biomarker of disease severity and AKI risk, but also as a potential therapeutic target. Modulating the coagulation pathway—such as through the use of anticoagulants or targeted therapies to prevent microthrombosis—could potentially mitigate renal injury.

Our findings identified Mb as a critical predictor of AKI in patients with heat stroke, aligning with prior studies. The underlying mechanism is primarily attributed to rhabdomyolysis-induced myoglobinemia, which leads to the accumulation of Mb in renal tubules. Myoglobin, especially under conditions of hypovolemia and acidic urine, can cause direct oxidative damage to tubular epithelial cells and promote tubular obstruction through cast formation, ultimately contributing to AKI onset46.

In addition, HCT was among the top-ranking variables in our predictive model. Previous research has shown that reduced HCT during cardiopulmonary bypass is significantly associated with higher AKI risk in cardiac surgery patients47,48. The proposed mechanism involves hemodilution-related reductions in oxygen-carrying capacity, resulting in renal hypoxia and impaired oxygen delivery to the tubular cells49. In the context of heat stroke, where volume depletion, hyperthermia, and systemic inflammation already compromise renal perfusion, a low HCT may further exacerbate renal ischemia, thereby increasing susceptibility to AKI47. These observations suggest that both Mb and HCT are not only valuable predictive markers but also reflect distinct yet converging pathophysiological pathways leading to heatstroke–related renal injury.

In patients with heat stroke complicated by AKI, CRRT is widely acknowledged as a consensus-driven therapeutic strategy50. In our study, 55.36% of patients diagnosed with AKI received CRRT, reflecting adherence to this clinical recommendation. Interestingly, a notable proportion of patients without AKI (18.45%) were also treated with CRRT. In clinical practice, beyond the presence of AKI, the initiation of CRRT should be considered in heatstroke patients presenting with any of the following conditions: persistent core body temperature above 40 °C unresponsive to standard cooling interventions; rhabdomyolysis or other signs of organ dysfunction; or severe electrolyte imbalances or metabolic acidosis51,52. Hemodialysis (HD), continuous hemodiafiltration (CHDF), and continuous plasma diafiltration (CPDF) have been reported as effective adjunctive therapies in the management of heatstroke, not only for their capacity to reduce core body temperature, but also for their ability to support impaired organ function53,54. When conventional cooling methods, such as gastric lavage with cold water or intravenous infusion of cold saline, fail to achieve adequate temperature control, the use of cold dialysate in HD or high-flow cold CHDF can facilitate rapid core temperature reduction54. Cold hemodialysis (cold HD) is generally appropriate for patients requiring rapid cooling, while high-flow cold CHDF may be more suitable for critically ill patients with multi-organ involvement, as it provides continuous temperature regulation alongside organ support51,55. In addition, extracorporeal blood purification helps to remove metabolic byproducts and inflammatory mediators, thereby alleviating organ burden and promoting functional recovery56.

However, our study has several limitations. This study integrated electronic medical records collected from multiple hospitals. To ensure data consistency and standardization across centers, we performed unified unit conversions for all relevant clinical variables. However, inter-institutional variations in laboratory testing methods and equipment may still introduce inherent heterogeneity in the dataset. Given the nature of heatstroke as a condition that predominantly occurs under extreme environmental conditions, its incidence remains relatively low, which poses challenges for patient recruitment. Although the sample size in our study exceeds that of most existing retrospective clinical studies on heatstroke, it remains relatively limited and may affect the robustness of model performance. Previous research has demonstrated that traditional machine learning algorithms can still achieve satisfactory performance even when applied to relatively limited datasets. Nevertheless, results obtained from ensemble-based algorithms such as XGBoost should be interpreted with caution. As a retrospective analysis, the findings of this study warrant further validation through well-designed, large-scale prospective studies to rigorously assess the predictive performance of our models, particularly regarding the early identification of AKI in patients with heatstroke.