Introduction

In the field of critical care medicine, managing severe acidosis remains a pivotal challenge with significant implications for patient outcomes1,2,3,4. Severe acidosis can precipitate detrimental physiological effects across multiple organ systems and increase mortality risk5,6,7. Continuous kidney replacement therapy (CKRT) is recognized for its ability to manage acid–base imbalances and correct severe acidosis in critically ill patients. Current guidelines recommend CKRT in cases of uncorrected severe acidosis, a practice widely accepted among nephrologists8,9,10,11. However, research quantitatively measuring the extent to which CKRT improves survival outcomes for patients with severe acidosis, as well as how its effectiveness may vary across individual patients, is insufficient.

Causal inference analysis is a powerful statistical method for identifying causes of disease and treatment effects. With recent advances in deep learning, deep learning-based causal inference has produced significant results, especially in environments where conducting randomized controlled trials is challenging12,13. The purpose of this study was to quantitatively assess how CKRT affects the probability of in-hospital mortality in patients with severe acidosis following admission to the intensive care unit (ICU) by employing the Generative Adversarial Nets for inference of Individualized Treatment Effects (GANITE), a deep learning-based causal inference model14,15. Furthermore, this study aimed to utilize the GANITE model and statistical methods to identify the characteristics of patients who derive the greatest benefit from CKRT, thereby facilitating better patient selection.

Methods

Study population

This study used the Medical Information Mart for Intensive Care III (MIMIC-III) database, a comprehensive collection of deidentified health-related data from over forty thousand patients who stayed in the critical care units of Beth Israel Deaconess Medical Center between 2001 and 201216. Because the analysis relied on a public database, it was exempt from approval by the institutional review board of Seoul National University Hospital (no. 2405-041-1535). We selected patients who had vital sign records, urine output data, and laboratory data collected within 48 h of ICU admission and who exhibited a pH less than 7.2 at least once during that period. Patients who died within 48 h of ICU admission were excluded because they did not have sufficient time to receive CKRT for a meaningful evaluation of its therapeutic effects. The remaining patients were randomly split into training and test datasets at a ratio of 0.85:0.15.
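The cohort rules and the random split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the `seed`, and the handling of a death at exactly 48 h are assumptions, and the resulting counts need not match the split reported in the Results.

```python
import random


def is_eligible(min_ph_48h, hours_to_death):
    """Inclusion rule from the text: pH < 7.2 at least once in the first 48 h
    and alive at the end of that period.

    hours_to_death is None for patients who did not die (hypothetical encoding).
    Treating a death at exactly 48 h as "within 48 h" is an assumption.
    """
    survived_48h = hours_to_death is None or hours_to_death > 48
    return min_ph_48h is not None and min_ph_48h < 7.2 and survived_48h


def split_patients(patient_ids, test_fraction=0.15, seed=0):
    """Randomly split unique patient IDs so that no patient is in both sets."""
    ids = sorted(set(patient_ids))  # deduplicate; sort so the seed is reproducible
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return set(ids[n_test:]), set(ids[:n_test])


# Hypothetical integer IDs standing in for the eligible patients.
train_ids, test_ids = split_patients(range(1891))
```

Splitting by patient rather than by time window keeps all windows from one patient on the same side of the split, which avoids leakage between the training and test datasets.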

Variables and outcomes for analysis

The treatment variable was defined as initiation of CKRT within 48 h of ICU admission, with the comparator being deferral beyond 48 h or no CKRT. Demographic data included sex and age, while vital sign data included systolic blood pressure (SBP), diastolic blood pressure (DBP), mean arterial pressure (MAP), heart rate (HR), respiratory rate (RR), peripheral oxygen saturation (SpO2), and body temperature. Blood laboratory data included albumin, anion gap, bicarbonate, total bilirubin, creatinine, hemoglobin, white blood cell and platelet counts, sodium, potassium, prothrombin time, and pH. Additional variables included hourly urine output and the fraction of inspired oxygen. The outcome was in-hospital mortality occurring after the initial 48-hour period following ICU admission.

Data generation

Time-series data were generated for all patients at one-hour intervals for a total of 48 h. The vital sign, urine output, and laboratory data were assigned to the nearest hour post-ICU admission up to 48 h; any hour without a measurement was imputed with the last available value, and variables not measured at all were imputed as zero. For each patient, time-series data from the first occurrence of a pH < 7.2 through the subsequent 48 h were analyzed using a four-hour time window. If the first occurrence of pH < 7.2 was within the first four hours after admission, missing data within this window were padded with zeros.
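One reading of this preprocessing is a trailing four-hour window that ends at each hour from the first pH < 7.2 onward. The sketch below follows that reading. The function names, the toy values, and the choice to fill hours before a variable's first measurement with zero are assumptions rather than details given in the text.

```python
def forward_fill(series):
    """Carry the last observed value forward over unmeasured hours.

    Hours before the first measurement, and variables never measured,
    become 0.0 (the former is an assumption).
    """
    filled, last = [], 0.0
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def trailing_windows(hourly, ph_index, width=4, threshold=7.2):
    """Return one `width`-hour window ending at each hour from onset onward.

    hourly: list of per-hour feature rows, already forward-filled.
    Windows that reach back before admission are padded with zero rows.
    """
    n_features = len(hourly[0])
    # A pH of 0.0 is an imputed placeholder, so it must not count as acidosis.
    onset = next(
        (t for t, row in enumerate(hourly) if 0 < row[ph_index] < threshold),
        None,
    )
    if onset is None:
        return []
    windows = []
    for end in range(onset, len(hourly)):
        start = end - width + 1
        pad = [[0.0] * n_features for _ in range(max(0, -start))]
        windows.append(pad + hourly[max(0, start):end + 1])
    return windows


# Toy six-hour record with two variables (pH, heart rate); None = not measured.
ph = forward_fill([None, 7.3, 7.1, None, 7.25, 7.0])
hr = forward_fill([80, None, 90, None, None, 100])
hourly = [list(row) for row in zip(ph, hr)]
windows = trailing_windows(hourly, ph_index=0)
```

In this toy record the first pH < 7.2 falls at the third hour, so the first window needs one zero row of padding to reach four hours.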

Development of the deep learning-based causal inference model

GANITE, a deep learning-based causal inference model, was used14. Because the GANITE model is traditionally applied to one-dimensional tabular data, modifications were necessary to handle the time-series input. The network component was changed from a simple deep neural network to a long short-term memory (LSTM) network to accommodate time-series data17. The vector representing the variables remained a one-dimensional vector as per the original model design. We adopted a causal inference deep learning approach because the decision to initiate CKRT in severe acidosis depends on high-dimensional, non-linear, and time-dependent relationships among early hemodynamics, acid–base status, renal function, and oxygenation. Compared with conventional regression or propensity-based methods that assume restrictive functional forms, the GANITE architecture, augmented with LSTM layers, can learn hourly trajectories and complex interactions, enabling individual-level treatment-effect estimation and explicit heterogeneity of treatment effect under a clinically actionable, timing-defined policy (initiation within 48 h vs. deferral/no use). Under standard identification assumptions (consistency, conditional exchangeability given observed early covariates, and positivity), this framework yields policy-relevant, model-based estimates intended to complement, not replace, clinical judgment. The model performance was evaluated on the training and test datasets by comparing the predicted in-hospital mortality probabilities with the observed outcomes and calculating the accuracy, area under the receiver operating characteristic curve (AUROC), and F1-score. The 95% confidence interval for the AUROC was determined using bootstrapping with 1000 iterations. Calibration performance was assessed through calibration plots. Additionally, the average treatment effect (ATE) of CKRT within 48 h of ICU admission and the conditional average treatment effect (CATE) for patients who underwent CKRT within this timeframe were examined.
The treatment effect referred to the impact of a specified treatment (i.e., the administration of CKRT within 48 h of ICU admission) on an outcome, which was the probability of in-hospital mortality occurring after the initial 48 h. The ATE addresses a policy question: for each patient, the model estimates the counterfactual risk of in-hospital mortality if CKRT were initiated within 48 h and if not, conditional on observed first-48-hour covariates; the ATE is the average of these individual differences across the study population. The CATE for patients who underwent CKRT is the corresponding contrast restricted to patients who actually received CKRT, characterizing heterogeneity of predicted benefit within the treated subgroup. Additionally, we analyzed the CATE based on creatinine, urine output, MAP, potassium, and pH measured at the last time point to determine if there were statistically significant differences. These findings were then visualized using bar plots. Subsequently, total time-series data were categorized based on whether CKRT treatment decreased or increased mortality risk, and differences in variables between these groups were analyzed. Additionally, a multivariable logistic regression analysis was conducted to identify variables that may indicate a reduction in mortality risk when CKRT treatment was applied. A P value < 0.05 was considered to indicate statistical significance.
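The ATE and CATE definitions above reduce to simple averages once a model has produced two predicted risks per patient. The sketch below assumes those predictions already exist. The variable names and the toy numbers are hypothetical and are not taken from the study.

```python
from statistics import mean


def individual_effects(risk_if_treated, risk_if_untreated):
    """Per-patient treatment effect: predicted mortality risk with CKRT
    within 48 h minus predicted risk without it. Negative values mean
    CKRT is predicted to lower risk."""
    return [y1 - y0 for y1, y0 in zip(risk_if_treated, risk_if_untreated)]


def average_effect(effects, mask=None):
    """Mean effect over everyone (ATE) or over the rows where mask is True
    (a CATE, here for patients who actually received CKRT)."""
    if mask is None:
        return mean(effects)
    return mean(e for e, keep in zip(effects, mask) if keep)


# Toy predictions for four patients (hypothetical values).
risk_if_treated = [0.30, 0.50, 0.20, 0.60]
risk_if_untreated = [0.40, 0.30, 0.10, 0.70]
received_ckrt = [True, False, False, True]

effects = individual_effects(risk_if_treated, risk_if_untreated)
ate = average_effect(effects)                        # whole cohort
cate_treated = average_effect(effects, received_ckrt)  # treated patients only
```

The toy values are chosen so that the cohort-wide average is positive while the average among treated patients is negative, which shows that the two quantities can differ in sign when treatment is concentrated in patients predicted to benefit.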

Results

Study population and data generation

Among the patients who had available data collected within 48 h of ICU admission, 1,891 exhibited a pH lower than 7.2 at least once. These patients were randomly divided into training (n = 1,601) and test (n = 290) datasets (Fig. 1 and Supplementary Table S1). Of the 1,891 patients with severe acidosis, 535 (28.3%) experienced in-hospital mortality after 48 h of ICU admission (Table 1). The average age of the patients was 61.2 ± 0.4 years, with those in the mortality group being older (64.3 ± 0.7 years) than those who survived (59.9 ± 0.5 years). Males accounted for 54% of the patients, with no difference in the proportion of males between patients who did and did not die. Serum creatinine levels were higher in the mortality group (2.36 ± 0.08 mg/dL) than in the no mortality group (1.76 ± 0.09 mg/dL). The hourly urine output at admission was lower in the mortality group (55.29 ± 3.60 ml/hr) than in the no mortality group (122.68 ± 3.80 ml/hr). Time-series data were generated for all patients at one-hour intervals for a total of 48 h. The training dataset yielded 15,679 four-hour time-series windows, while the test dataset yielded 2,921 windows.

Fig. 1

Flow diagram of the study population. MIMIC-III, Medical Information Mart for Intensive Care III; ICU, intensive care unit.

Table 1 Baseline characteristics of the study patients.

Deep learning-based causal inference model

The GANITE model was developed, and its performance in predicting in-hospital mortality after the initial 48 h following ICU admission showed AUROCs of 0.887 (0.880 to 0.893) and 0.824 (0.804 to 0.843), accuracies of 0.883 and 0.841, and F1-scores of 0.811 and 0.776 for the training and test datasets, respectively (Table 2; Fig. 2). Calibration plots indicated good calibration performance in both the training and test datasets (Fig. 2). The ATE of CKRT within 48 h after ICU admission was 14.9% (14.6 to 15.3%) in the entire dataset, meaning that, on average, the application of CKRT within this timeframe was predicted to increase the probability of in-hospital mortality by 14.9 percentage points across all patients (Table 3). In contrast, the CATE for patients who underwent CKRT was −13.1% (−14.4% to −11.9%), indicating that for those who received CKRT, the predicted probability of in-hospital mortality decreased by an average of 13.1 percentage points. Similar ATEs and CATEs were observed in both the training and test datasets.
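An interval of the kind reported for the AUROC can be obtained with a percentile bootstrap. The sketch below is one common way to do this and is not necessarily the procedure the authors used: the text does not state whether resampling was done per window or per patient, so resampling individual rows is an assumption, as are the function names and toy data.

```python
import random


def auroc(labels, scores):
    """Probability that a random positive is scored above a random negative,
    with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class; draw again
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[round(alpha / 2 * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy labels (1 = died in hospital) and predicted risks; hypothetical values.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.55, 0.60]
point_estimate = auroc(labels, scores)
ci_low, ci_high = bootstrap_auroc_ci(labels, scores, n_boot=1000)
```

Because each patient contributes several overlapping windows, resampling whole patients instead of rows would give wider and arguably more honest intervals.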

Table 2 Model performance parameters.
Fig. 2

Model performance and calibration. A, Receiver operating characteristic curve in the training dataset. B, Calibration plot in the training dataset. C, Receiver operating characteristic curve in the test dataset. D, Calibration plot in the test dataset.

Table 3 Average treatment effects and conditional average treatment effects according to continuous kidney replacement therapy status.

CATE by various features

When patients were divided into two groups using a specific cutoff for a single variable, the application of CKRT was predicted to increase the probability of in-hospital mortality in all groups except for the group with a potassium level > 6 mmol/L (Fig. 3). However, when groups were defined jointly by pH and one other variable, the application of CKRT was predicted to reduce the probability of in-hospital mortality in patients with a high pH combined with another condition such as high creatinine, low hourly urine output, low MAP, or high potassium. Specifically, the group with a creatinine level < 3 mg/dL and a pH < 7 showed the largest reduction in in-hospital mortality risk, with a CATE of −20%.
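Subgroup CATEs of this kind are averages of the per-window predicted effects within each stratum. The sketch below shows the single-variable and joint groupings, using the potassium, pH, and creatinine cutoffs mentioned in the text. The record layout, the group labels, and all values are hypothetical.

```python
from statistics import mean


def cate_by_group(records, group_fn):
    """Average the predicted treatment effect within each group.

    records: dicts holding last-time-point values and a predicted "effect".
    group_fn: maps a record to a hashable group label.
    """
    groups = {}
    for rec in records:
        groups.setdefault(group_fn(rec), []).append(rec["effect"])
    return {label: mean(values) for label, values in groups.items()}


# Toy records (hypothetical); effect < 0 means predicted benefit from CKRT.
records = [
    {"ph": 7.10, "creatinine": 3.5, "potassium": 6.4, "effect": -0.20},
    {"ph": 7.15, "creatinine": 1.0, "potassium": 4.0, "effect": 0.10},
    {"ph": 6.95, "creatinine": 3.2, "potassium": 5.0, "effect": 0.05},
    {"ph": 6.90, "creatinine": 0.9, "potassium": 6.5, "effect": -0.10},
]

# Single-variable grouping on potassium.
by_potassium = cate_by_group(
    records, lambda r: "K>6" if r["potassium"] > 6 else "K<=6"
)

# Joint grouping on pH and creatinine.
by_ph_and_cr = cate_by_group(
    records,
    lambda r: (
        "pH>=7" if r["ph"] >= 7 else "pH<7",
        "Cr>=3" if r["creatinine"] >= 3 else "Cr<3",
    ),
)
```

A joint grouping can reveal a beneficial stratum that is hidden when each variable is examined alone, because the single-variable averages mix patients with opposite predicted effects.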

Fig. 3

Conditional average treatment effect according to patient conditions. A, Comparison of the effect according to the pH, creatinine, urine output, mean arterial pressure (MAP), and potassium levels. B, Comparison of the effect according to the combination of pH and other variables. Cr, creatinine; UO, urine output.

Logistic regression for decreasing mortality risk after CKRT application

When the data were divided into groups based on whether the treatment effect of CKRT was predicted to decrease the risk of in-hospital mortality (negative treatment effect) or not (positive treatment effect), all variables differed between the two groups in the entire dataset (Table 4). The group predicted to have a decrease in in-hospital mortality risk had higher pH values, higher creatinine levels, and lower hourly urine outputs than did the group predicted to have an increase in in-hospital mortality risk. Similar trends were observed in both the training and test datasets (Supplementary Table S2). A multivariable logistic regression analysis, using the predictions of decreased mortality risk by applying CKRT as the dependent variable, showed trends where an older age and a low BP were associated with a reduction in predicted mortality risk (Table 5). Furthermore, high levels of creatinine, pH, anion gap, bicarbonate, and potassium, as well as a low hourly urine output, were associated with a reduced probability of mortality after CKRT application.
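The regression step can be illustrated by labelling each window by the sign of its predicted treatment effect and regressing that label on patient variables. The sketch below fits a logistic regression by plain gradient descent on two standardized covariates. It is an illustration under stated assumptions, not the authors' analysis: the helper names and data are hypothetical, standard errors and P values are omitted, and correlation between windows from the same patient is ignored.

```python
import math
from statistics import mean, pstdev


def standardize(X):
    """Scale each column to zero mean and unit (population) variance."""
    cols = list(zip(*X))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in X]


def predict(w, xi):
    """Predicted probability for one row; w[0] is the intercept."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(w, X, y):
    return sum(
        math.log(predict(w, xi)) if yi else math.log(1.0 - predict(w, xi))
        for xi, yi in zip(X, y)
    )


def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Batch gradient descent on the mean negative log-likelihood."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(n_iter):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Toy data (hypothetical): creatinine (mg/dL), urine output (ml/hr), predicted effect.
rows = [(0.8, 120), (1.0, 100), (1.5, 40), (2.0, 90), (2.5, 110),
        (3.0, 30), (3.5, 95), (4.0, 20), (3.8, 25), (0.9, 115)]
effects = [0.08, 0.05, -0.04, 0.06, 0.03, -0.10, -0.02, -0.12, 0.01, -0.01]

benefit = [1 if e < 0 else 0 for e in effects]  # dependent variable
Xs = standardize(rows)
w = fit_logistic(Xs, benefit)
odds_ratios = [math.exp(c) for c in w[1:]]  # per one-SD increase in each covariate
```

Note that the dependent variable here is itself a model prediction, so the regression describes which covariates the model associates with benefit rather than an independently observed outcome.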

Table 4 Baseline characteristics of the two groups with predicted increased and decreased mortality risk after the application of continuous kidney replacement therapy for the entire dataset.
Table 5 Factors related to reduced mortality after the application of continuous kidney replacement therapy.

Discussion

The present study leveraged real-time data from patients with severe acidosis, a critical challenge within the intensive care field, to predict mortality probabilities and assess the effect of CKRT on in-hospital mortality outcomes. The use of a deep learning-based causal inference model enabled not only the accurate forecasting of mortality risk but also the quantification of the therapeutic effects of CKRT. This innovative method facilitated a nuanced exploration of CKRT and its correlation with patient survival rates, thereby offering a comprehensive view that surpasses traditional predictive models. Furthermore, we used real-time clinical data and advanced causal inference techniques to aid clinicians in identifying high-risk ICU patients to receive timely interventions. This pioneering study could thus set a new benchmark for applying machine learning and deep learning in critical care settings, highlighting the model’s contribution to clinical decision-making in the face of severe acidosis challenges18.

Although numerous studies have underscored the importance of CKRT for patients with severe acidosis, few have directly examined the effect of CKRT on mortality under identical conditions through causal inference. Typically, randomization is deemed the most robust method for examining causal relationships; however, it leads to ethical and practical challenges in critical care, especially among severely ill patients19,20,21,22. The present study addressed these limitations by employing deep learning to simulate a variable-controlled environment, offering insights into the direct impact of CKRT on mortality. This method not only showcases the potential of deep learning in addressing the limitations of traditional methods but also underscores the significance of using advanced analytical techniques in critical care research.

The present model not only quantified the impact of CKRT on mortality with specific probability figures but also demonstrated good calibration performance, indicating how well the predicted probabilities aligned with the actual occurrence of outcomes. This performance, as indicated by the calibration curve, ensures that the predicted probabilities of the outcome are reliable. Furthermore, the model enabled the examination of individual patient responses to CKRT, moving beyond general trends and average effects. These detailed insights are invaluable for clinical decision-making, allowing health care providers to customize interventions based on the estimated benefits of CKRT for each patient23,24.

In this study, the analysis of the deep learning-based causal inference model indicated a model-predicted increase of 14.9 percentage points in in-hospital mortality when CKRT was initiated within 48 h across the overall population. However, among patients who actually received CKRT, there was a model-predicted decrease of 13.1 percentage points. CKRT-related complications (such as blood cell damage, nutritional loss, and vascular access problems) could contribute to worse outcomes when CKRT is applied without clear indications25; however, the observed contrast may also reflect non-selective early initiation extending treatment to patients unlikely to benefit, as well as residual and unmeasured confounding and limited overlap. These results suggest that if CKRT is not selectively applied for patients who need it, outcomes may worsen. Utilizing our model to determine the optimal patients and timing for the application of CKRT could improve outcomes in the management of severe acidosis26. This tailored approach aligns with the principles of precision medicine by emphasizing customized health care and facilitating individualized decisions on when to initiate CKRT based on existing evidence27,28,29. The present study exemplifies this approach, illustrating how advanced analytical tools and models can substantially contribute to precision medicine in critical care. However, these model-based, policy-level estimates should be interpreted as hypothesis-generating rather than definitive causal effects, and prospective, multicenter evaluation is warranted.

In contrast to previous research, the present study utilized 1-hour interval time-series data to capture rapid physiological changes in the ICU setting. These high-resolution data significantly enhanced the model performance by accounting for the dynamic and often volatile nature of patient status in the ICU setting. Furthermore, the use of such finely granulated data underscores the practicality of using the model in real clinical settings, enabling real-time application and thereby increasing its clinical relevance. This advancement not only boosts the potential for machine learning in critical care but also sets the stage for future innovations in patient monitoring and treatment optimization30,31.

Our study revealed that patients who benefited from a reduction in mortality risk due to the application of CKRT within 48 h of ICU admission were older than those who did not benefit as much, suggesting a greater efficacy of CKRT in reducing mortality risk among elderly patients. This finding may be attributed to the fact that elderly patients often have diminished physiological reserve and are less able to compensate for severe metabolic disturbances32. CKRT may provide a more controlled and gradual correction of acidosis, which could be particularly beneficial for older patients who might not tolerate rapid shifts in acid-base balance. Our findings also indicated that the implementation of CKRT in patients with high levels of creatinine and potassium, as well as low urine output and BP, is associated with a greater probability of mortality risk reduction. This outcome aligns with current guidelines on CKRT indications for deteriorating kidney function or hyperkalemia, confirming the appropriateness of these criteria for initiating CKRT9,27. In patients with low BP, the benefits of CKRT may be particularly significant due to the complex interplay between hypotension and acidosis. Hypotension can be both a consequence of severe acidosis and a contributing factor that exacerbates kidney injury, leading to worsening acidosis33,34. This creates a vicious cycle where acidosis and hypotension mutually worsen each other, increasing mortality risk35,36. Therefore, correcting acidosis through CKRT can break this cycle, potentially leading to greater mortality reduction in patients with low BP. Interestingly, a high pH increased the likelihood of reduction in mortality risk by CKRT, suggesting that initiating CKRT before the progression of severe acidosis might be highly effective for patient outcomes. 
Initiating CKRT at an earlier stage, when the patient is less acidotic, might prevent the cascade of worsening organ dysfunction associated with severe acidosis, thereby improving outcomes37. This insight may underscore the importance of early intervention with CKRT in patients with severe acidosis, potentially before acidosis reaches critical levels, which could optimize patient outcomes. Similarly, previous studies have shown that an early initiation of CKRT is associated with better outcomes26,38,39,40.

This study has certain limitations. The reliance on data from a single center and the exclusion of patients who experienced mortality within the initial 48-hour period after ICU admission may restrict the generalizability of the findings. These factors could introduce selection bias and limit the applicability of the results to broader patient populations. Adverse events were not adjudicated; therefore, causal attribution of mortality to CKRT-related complications is beyond the scope of this observational analysis. Additionally, our study focused on short-term outcomes, specifically in-hospital mortality, and did not account for long-term effects or recovery. While these long-term outcomes are crucial for a comprehensive assessment of CKRT’s impact, our ICU-specific dataset limited our ability to evaluate them. Although a causal inference method was employed, clinical trials with randomization are necessary to build upon these findings. To address these concerns, future studies should include data from multiple clinical settings, incorporate randomized controlled trials, and evaluate long-term outcomes. Moreover, not all clinically relevant covariates were captured or encoded, and the reported treatment effects are model-based predictions rather than directly observed contrasts. Accordingly, the estimates may be susceptible to residual and unmeasured confounding and model misspecification.

In conclusion, the present study makes a significant contribution to the existing body of knowledge on the effectiveness of CKRT in severe acidosis scenarios by leveraging the potential of the deep learning-based causal inference model. These findings underscore the importance of utilizing advanced analytical tools to guide clinical decisions, with the ultimate goal of enhancing patient care and outcomes in the ICU.