Abstract
Sepsis-induced immunosuppression leads to poor prognosis. Circulating lymphocyte count (LC), an easily accessible clinical marker, closely reflects immune status in sepsis. This study aimed to perform immune phenotyping of sepsis patients using dynamic LC for early identification of high-risk individuals. A latent class trajectory model (LCTM) was used to analyze dynamic LC trajectories based on repeated measurements: at least two LC measurements within the first 24 h after sepsis diagnosis, followed by two more between day 2 and day 7. Survival differences among subphenotypes were assessed using Kaplan–Meier curves and Cox regression. Feature selection was conducted via the Boruta algorithm, and a high-precision machine learning model was developed to predict the target trajectory. Model interpretability was ensured through SHapley Additive exPlanations (SHAP). The predictive performance of the model for ICU mortality was assessed using the receiver operating characteristic (ROC) curve. The derivation cohort included 2085 sepsis patients from the China Multicenter Sepsis database, and the external validation cohort included 1299 sepsis patients. We identified four trajectory patterns of LC dynamics, among which the persistent lymphopenia (PL) subgroup exhibited the highest disease severity and poorest prognosis. The trajectory model demonstrated consistent patterns in external validation. Six machine learning models were compared to determine the best model for identifying the PL subphenotype, and an online prediction tool was developed for clinical application. Incorporating the PL trajectory subphenotype significantly improved the predictive performance for ICU mortality. Dynamic LC trajectories effectively capture immunological heterogeneity in sepsis, encompassing immunocompromised and immunocompetent hosts.
These findings underscore the importance of early identification of patients with persistent lymphopenia to better target populations for future sepsis immunotherapy.
Introduction
Sepsis is a dysregulated host response to infection, leading to organ dysfunction and high mortality1. Increasing evidence suggests that pro-inflammatory activation and immunosuppression coexist from the early stages of sepsis2. Sepsis-induced immunosuppression is characterized by extensive lymphocyte apoptosis, downregulation of HLA-DR expression, and impaired immune function3. Although previous studies have attempted to treat sepsis with immunomodulatory agents, most large randomized clinical trials have failed to demonstrate significant benefit4. Septic patients exhibit substantial heterogeneity in the immune responses, driven by the differences in disease severity and pre-existing immune status5,6.
A key manifestation of sepsis-induced immunosuppression is lymphocyte apoptosis, resulting in decreased peripheral lymphocyte counts, which are consistently associated with secondary infections, persistent organ dysfunction, and increased mortality. Persistent lymphopenia has therefore emerged as a clinically relevant marker of impaired immune status7,8,9. Although previous studies have attempted to establish dynamic lymphocyte count trajectories in sepsis, most have significant limitations. For example, a prospective multicenter study excluded immunocompromised hosts and assessed lymphocyte dynamics only within the first 72 h after sepsis diagnosis10. Another study using the MIMIC database excluded patients with immune-related comorbidities, effectively removing approximately 50% of septic patients11. However, in clinical practice, many septic patients have underlying conditions such as malignancy or organ transplantation, which lead to compromised baseline immune function12. While persistent lymphopenia has been associated with adverse outcomes in critically ill patients13, our study builds upon this evidence by focusing on patients with sepsis and validating the association in a nationwide multicenter cohort.
We utilised the latent class trajectory model (LCTM) to identify distinct 7-day lymphocyte trajectories using data from a nationwide, prospective multicenter sepsis cohort. Model stability was validated in an independent external cohort. We further compared the subphenotypes regarding inflammatory responses, coagulation function, and clinical outcomes. Finally, we developed machine learning models for the early identification of the persistent lymphopenia (PL) subphenotype and created a web-based application to facilitate clinical translation. Our goal is to provide a foundation for the early recognition of patients with persistent lymphopenia, enabling precision interventions and improved clinical outcomes.
Methods
Study design and participants
This study included a derivation cohort and an external validation cohort (Fig. 1). The derivation cohort was derived from the China Multicenter Sepsis (CMS) database, hosted by the First Hospital of China Medical University. The CMS database included adult septic patients from January 1, 2023, to December 31, 2024 (n = 2,655), involving 27 intensive care units (ICUs) from tertiary university hospitals (details in Supplementary Table 1). Moreover, the external validation cohort included adult sepsis patients (n = 1,484) admitted to the ICU of Peking Union Medical College Hospital from January 1, 2023, to December 31, 2024. Sepsis was defined according to the Sepsis-3 criteria1. The inclusion criteria were as follows: (1) aged ≥ 18 years; (2) had at least two measurements of lymphocyte count (LC) within the first 24 h after sepsis diagnosis, followed by two more between day 2 and day 7. The study has been performed in accordance with the Declaration of Helsinki and was approved by the Research and Ethics Committee of the First Affiliated Hospital of China Medical University (Approval number: [2022] 2022–502-2, Shenyang, China) and the Institutional Review Board of Peking Union Medical College Hospital (Approval number: JS-3480D). The other 26 participating ICUs obtained their respective ethical approvals. Written informed consent was obtained from all participants and/or their legal surrogates before enrollment.
Flow diagram of participant enrollment with a dynamic modeling framework for lymphocyte count (LC) trajectories.
Data collection
Baseline data included demographic characteristics (age, sex), site of infection (lung, abdomen, urinary, bloodstream, skin, nervous system, and others), and preexisting conditions (hypertension, diabetes, coronary heart disease, chronic obstructive pulmonary disease, cancer, and immunocompromised). Immunocompromised hosts were defined by the presence of any of the following conditions: (a) acquired immunodeficiency syndrome; (b) malignant neoplasms treated with radiotherapy or chemotherapy within the preceding three months; (c) receipt of allogeneic bone marrow or hematopoietic stem cell transplantation; (d) receipt of solid organ transplantation; (e) autoimmune disease or ongoing immunosuppressive therapy; (f) glucocorticoid therapy (prednisolone > 20 mg/day or equivalent for > 2 weeks) within the preceding three months; and (g) chronic viral hepatitis. In the derivation and external validation cohort analyses, the initial measurements of vital signs and laboratory tests within the first 24 h of sepsis diagnosis were recorded. The laboratory variables included white blood cell count (WBC), LC, procalcitonin (PCT), C-reactive protein (CRP), platelet count (PLT), prothrombin time (PT), international normalized ratio (INR), activated partial thromboplastin time (APTT), fibrinogen (Fg), D-dimer (DD), fibrin degradation products (FDP), the PaO2/FiO2 ratio, lactic acid (Lac), creatinine (Cr), and total bilirubin (TBIL). The Acute Physiology and Chronic Health Evaluation (APACHE) II score, Sequential Organ Failure Assessment (SOFA) score, and outcomes were recorded.
Latent class trajectory model
The Latent Class Trajectory Model (LCTM)14 is a finite mixture model for longitudinal data. It groups individuals into hidden (“latent”) classes based on similar change patterns over time and fits a polynomial regression within each class to model their trajectory. The optimal number of classes was determined using a lower Akaike information criterion (AIC), Bayesian information criterion (BIC), and higher entropy, indicating a better model fit (Supplementary Table 4). Additionally, to increase the clinical relevance and reliability of the statistical analysis, a minimum subgroup size of 2% of the total cohort was established, along with the average posterior probability of group membership ≥ 70% for all subphenotypes15,16. Furthermore, we balanced model performance with clinical interpretability to select the final trajectory model. The chosen model was subsequently evaluated on the external validation cohort. Individual LC data from each patient in the validation cohort were fitted to the polynomial mixture components of the trained LCTM.
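The class-selection criteria described above can be illustrated with a minimal sketch. It assumes that each candidate model yields a log-likelihood, a parameter count, and a matrix of posterior class-membership probabilities; the function names and the use of the number of subjects as the BIC sample size are illustrative assumptions, not the study's actual code (the study fitted its models in R).

```python
import math

def information_criteria(log_lik, n_params, n_subjects):
    """Return (AIC, BIC); lower values indicate better fit."""
    aic = 2 * n_params - 2 * log_lik
    bic = n_params * math.log(n_subjects) - 2 * log_lik
    return aic, bic

def class_diagnostics(posteriors):
    """posteriors: one row per patient, one probability per latent class."""
    n = len(posteriors)
    k = len(posteriors[0])
    # Assign each patient to the class with the highest posterior probability.
    assigned = [max(range(k), key=row.__getitem__) for row in posteriors]
    proportions = [assigned.count(c) / n for c in range(k)]
    # Average posterior probability among the patients assigned to each class.
    appa = []
    for c in range(k):
        members = [row[c] for row, a in zip(posteriors, assigned) if a == c]
        appa.append(sum(members) / len(members) if members else float("nan"))
    # Relative entropy: 1 means perfectly separated classes, 0 means uninformative.
    raw = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    rel_entropy = 1 - raw / (n * math.log(k)) if k > 1 else 1.0
    return proportions, appa, rel_entropy

def passes_thresholds(proportions, appa, min_size=0.02, min_appa=0.70):
    """Apply the minimum class size (2%) and posterior probability (70%) rules."""
    return all(p >= min_size for p in proportions) and all(a >= min_appa for a in appa)
```

A candidate model would be retained only if `passes_thresholds` returns `True`; among the retained models, AIC, BIC, entropy, and clinical interpretability are then weighed together.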
Model construction for the identification of subphenotype 1
We first selected clinical parameters that showed significant differences in subphenotype comparisons. The Boruta algorithm17 was then employed to determine the final feature set for modeling by comparing their importance to permuted “shadow” features, iteratively confirming or rejecting features for optimization. The derivation cohort was randomly split into training and test sets at a 7:3 ratio. Subsequently, six machine learning models, including logistic regression (LR), random forest (RF), support vector machine (SVM), extreme gradient boosting (XGB), multilayer perceptron (MLP), and light gradient boosting machine (LightGBM), were used to predict subphenotype 1 in sepsis patients. For each model, hyperparameter tuning was conducted using 10 rounds of tenfold cross-validation combined with Bayesian optimization based on the selected feature subset. The area under the receiver operating characteristic curve (AUC) was calculated for each model performance evaluation. Additionally, calibration curves and decision curve analysis (DCA) were used to evaluate the model’s performance. The Shapley Additive Explanations (SHAP) algorithm was also used for model interpretation. It ranks how vital each input feature is and clarifies predictions using SHAP values. To facilitate clinical translation, we developed a web-based calculator using a Streamlit application to identify Subphenotype 1 based on the best model.
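To make the idea behind SHAP values concrete, the following sketch computes exact Shapley values for a small model by enumerating feature coalitions, with absent features set to a baseline value. This is a didactic substitute for the `shap` package used in the study, not its implementation; `toy_score` and its coefficients are hypothetical and are not the fitted random forest.

```python
from itertools import combinations
from math import factorial

def exact_shapley(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance` relative to `baseline`.

    Features outside a coalition take their baseline value. The cost grows
    as 2**n_features, so this is only suitable for small illustrations.
    """
    n = len(instance)

    def value(coalition):
        x = [instance[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_score(x):
    """Hypothetical risk score over (LC, age, immunocompromised flag)."""
    lc, age, immunocompromised = x
    return 0.5 - 0.3 * lc + 0.004 * age + 0.1 * immunocompromised

# Contributions for one hypothetical patient relative to a reference patient.
contributions = exact_shapley(toy_score, [0.4, 75, 1], [1.5, 60, 0])
```

The contributions sum to the difference between the prediction for the patient and the prediction for the baseline, which is the additivity property that SHAP plots rely on.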
Statistical analysis
Outliers in continuous variables were first screened using box plot analysis. An observation was considered an outlier if it exceeded the upper quartile (Q3) by more than 1.5 times the interquartile range (IQR) or fell below the lower quartile (Q1) by the same criterion. Identified outliers were then addressed using winsorization, in which extreme values were replaced with the corresponding upper or lower boundary defined by the 1.5 × IQR rule18. This procedure minimized the influence of extreme values while preserving the overall sample size. For the LC trajectories, no missing data processing was required because the LCTM inherently accommodates missing values through maximum likelihood estimation during model fitting. For the clinical characteristics, the missing rate of each variable was calculated and reported (Supplementary Table 2). To ensure the accuracy of the results, we excluded variables with missing rates over 30%19. Based on this criterion, only fibrin degradation products (FDP) was removed.
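The outlier handling and missing-rate screening described above can be sketched as follows. The quartile convention (`method="inclusive"`) and the function names are assumptions made for illustration; the study does not state which quartile definition was used.

```python
import statistics

def winsorize_iqr(values, k=1.5):
    """Clamp values outside [Q1 - k*IQR, Q3 + k*IQR] to the nearest fence."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lower), upper) for v in values]

def missing_rate(column):
    """Fraction of entries recorded as None."""
    return sum(v is None for v in column) / len(column)

def keep_variable(column, max_missing=0.30):
    """Keep a variable only if its missing rate does not exceed 30%."""
    return missing_rate(column) <= max_missing
```

For example, with the values 1 to 8 plus a single extreme value of 100, the upper fence is 13, so only the extreme value is replaced and the sample size is unchanged.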
Continuous variables were presented as median (interquartile range [IQR]) and compared across subphenotypes using one-way ANOVA (if assumptions of normality and homogeneity of variance were met) or the Kruskal–Wallis test otherwise. Categorical variables, expressed as counts (percentages), were compared using the chi-squared test. We compared 28-day mortality in the derivation and validation cohorts via Kaplan‒Meier plots and estimated hazard ratios (HRs) across subphenotypes using Cox proportional-hazards regression models.
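As an illustration of the survival analysis, the Kaplan–Meier estimator can be written in a few lines. This is a minimal sketch assuming follow-up times in days and an event indicator (1 = death, 0 = censored); it omits confidence intervals and the log-rank comparison between subphenotypes, and the function name is illustrative.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where at least one death occurs."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= removed
    return curve
```

Applying the function separately to each subphenotype gives the stepwise curves that a Kaplan–Meier plot displays.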
Independent predictors of ICU mortality were identified through univariable and multivariable logistic regression analyses. Two prediction models were constructed: a baseline model containing only conventional risk factors and an enhanced model incorporating the trajectory of persistent lymphopenia (PL). The AUCs of the two models were compared using DeLong’s test, with p < 0.05 considered statistically significant. All analyses were conducted in R (v4.3.0) and Python (v3.12.7). Key packages included, in R: ‘lcmm’ (v2.2.0) for latent class trajectory modeling, ‘mice’ (v3.16.0) for multiple imputation, and ‘Boruta’ (v8.0.0) for feature selection; and in Python: ‘scikit-learn’ (v1.5.1) for model development, ‘scikit-optimize’ (v0.10.2) for hyperparameter tuning, ‘xgboost’ (v3.0.2) and ‘lightgbm’ (v4.6.0) for gradient-boosted classifiers, and ‘shap’ (v0.48.0) for model explanation.
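For readers unfamiliar with DeLong’s test, the sketch below compares two correlated AUCs computed on the same patients. It is a straightforward O(m·n) illustration of the method under the assumption of binary labels (1 = died, 0 = survived) and is not the code used in the study; the function names are hypothetical.

```python
import math

def _psi(x, y):
    """Pairwise kernel: 1 if the positive outranks the negative, 0.5 on ties."""
    return 1.0 if x > y else 0.5 if x == y else 0.0

def _components(pos, neg):
    """Return the AUC and its per-positive and per-negative structural components."""
    v10 = [sum(_psi(p, q) for q in neg) / len(neg) for p in pos]
    v01 = [sum(_psi(p, q) for p in pos) / len(pos) for q in neg]
    return sum(v10) / len(v10), v10, v01

def _cov(a, b):
    """Sample covariance of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def delong_test(labels, scores_a, scores_b):
    """Return (AUC_a, AUC_b, two-sided p value) for two models on the same cases."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    auc_a, v10_a, v01_a = _components([scores_a[i] for i in pos], [scores_a[i] for i in neg])
    auc_b, v10_b, v01_b = _components([scores_b[i] for i in pos], [scores_b[i] for i in neg])
    # Variance of the AUC difference from the component covariances.
    var = ((_cov(v10_a, v10_a) + _cov(v10_b, v10_b) - 2 * _cov(v10_a, v10_b)) / len(pos)
           + (_cov(v01_a, v01_a) + _cov(v01_b, v01_b) - 2 * _cov(v01_a, v01_b)) / len(neg))
    if var <= 0:
        return auc_a, auc_b, float("nan")  # Test is undefined without variance.
    z = (auc_a - auc_b) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # Two-sided normal p value.
    return auc_a, auc_b, p_value
```

In this setting, `scores_a` would hold the predicted probabilities from the baseline model and `scores_b` those from the model that adds the PL trajectory.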
Results
Patient characteristics across subphenotypes in the derivation and external validation cohorts
A total of 2,085 patients from CMS were included in the derivation cohort, and 1,299 patients from PUMCH were included in the external validation cohort according to the inclusion criteria (Fig. 1). Baseline characteristics and outcomes are detailed in Supplementary Table 3.
Based on the model fit statistics from LCTMs with 1 to 6 classes (Supplementary Table 4), the optimal four-class model with fixed-effect coefficients identified four lymphocyte trajectory subphenotypes in the derivation and external validation cohorts (Fig. 2 and Supplementary Table 5):
Latent Class Trajectory Model of lymphocyte count (LC) trajectories in the derivation and external validation cohorts. (A) The LC trajectories and the pie chart demonstrate the proportions of four subphenotypes in the derivation cohort. (B) The LC trajectories and the pie chart demonstrate the proportions of four subphenotypes in the external validation cohort. The shaded area of the curve represents each trajectory’s 95% confidence interval. Subphenotype 1 represents the persistent lymphopenia (PL, colored in blue), subphenotype 2 represents a low baseline LC followed by slowly increasing lymphocyte (SIL, colored in yellow), subphenotype 3 represents normal lymphocyte (NL, colored in green), and subphenotype 4 shows a normal initial lymphocyte count, followed by a rapid decline (RDL, colored in red).
PL (Persistent Lymphopenia, n = 589, 28.21%): Started with an initial LC count < 0.8 × 10⁹/L and remained persistently low.
SIL (Slowly Increasing Lymphocytes, n = 839, 40.26%): Started with a low baseline LC count (< 0.8 × 10⁹/L) followed by a slow increase to the normal range (0.8–3.2 × 10⁹/L).
NL (Normal Lymphocytes, n = 372, 17.65%): Maintained LC counts within the normal range (0.8–3.2 × 10⁹/L).
RDL (Rapidly Declining Lymphocytes, n = 285, 13.68%): Had an initial normal LC count with a rapid decline.
The clinical characteristics among subphenotypes in both cohorts are presented in Table 1. Compared to other subphenotypes, PL patients were the oldest, had a higher incidence of pre-existing immunocompromised status and pulmonary infections, and a lower incidence of urinary tract infections. Correspondingly, PL had the highest SOFA and APACHE II scores, as well as the highest ICU and 28-day mortality. In contrast, NL patients were the youngest, with the mildest disease severity and the lowest mortality. Consistent with the derivation cohort, the external validation cohort showed similar demographic patterns and clinical outcomes across all subphenotypes.
Association of subphenotypes with coagulation and inflammation variables
Several laboratory variables were significantly different among the four subphenotypes. As for coagulation, PL patients had the lowest PLT counts and more prolonged PT, INR, and APTT. Although PL patients exhibited the lowest WBC and LC counts, they did not have a higher inflammatory response. By contrast, the SIL subphenotype showed the highest levels of the inflammation indicators PCT and CRP (Fig. 3A, Supplementary Table 6). The laboratory values in the external validation cohort generally followed a similar pattern (Fig. 3B, Supplementary Table 7).
Laboratory parameters demonstrating differences across four subphenotypes are presented in the derivation cohort (A) and external validation cohort (B). The levels of laboratory parameters represent the mean and standard error of the mean (SEM). p values are derived from Kruskal–Wallis testing for significant differences between subphenotypes for laboratory values.
Clinical outcomes across subphenotypes
To assess the association of subphenotypes with the clinical outcomes, the Kaplan–Meier curve demonstrated that PL patients had the highest 28-day mortality, while NL patients had the lowest. Compared to PL, both SIL and NL subphenotypes were associated with a significantly lower mortality, whereas RDL showed no difference in either cohort (Fig. 4). Cox regression analysis confirmed these survival differences. In unadjusted models, with PL as reference, SIL and NL were associated with significantly reduced hazard ratios for 28-day mortality, while RDL showed no significant difference. After adjusting for age, sex, SOFA, APACHE II, Immunocompromised status, and pulmonary infection, SIL and NL remained at a significantly lower risk of mortality than PL. In contrast, RDL remained statistically similar to PL. The external validation cohort confirmed these findings (Table 2A).
Kaplan–Meier survival curves of the four subphenotypes in the derivation cohort (A) and external validation cohort (B). PL patients have the highest 28-day mortality among the subphenotypes, while NL patients have the lowest 28-day mortality. Statistical significance was determined by the log-rank test (***p < 0.001).
Consistent with 28-day mortality, ICU mortality was significantly lower in the SIL and NL subphenotypes than in PL, but not in RDL (Table 2B). PL patients showed minimal improvement in the SOFA score compared to the other subphenotypes (Supplementary Fig. 1).
Identification of subphenotype 1 (PL) among sepsis patients using multiple machine learning models
Given that persistent lymphopenia (PL) septic patients exhibit the poorest clinical outcomes, we aimed to identify this high-risk subphenotype early. The Boruta strategy was applied to select features to predict PL patients in sepsis. The chosen features included LC, WBC, Age, PCT, PLT, Urinary, DD, Fg, and preexisting immunocompromised status (Fig. 5A). Six machine learning models were employed using the selected features to identify the PL patients, and the hyperparameters of each model were tuned using Bayesian optimization (Supplementary Table 8). Integrating the selected features across the training, test, and external validation sets confirmed the superior predictive performance of the Random Forest (RF) model, which was therefore selected as the final model. It achieved AUROCs of 0.983 (training set), 0.841 (test set), and 0.848 (external validation set), outperforming other models in stability and accuracy (Fig. 5B–D, Supplementary Table 9). Additionally, calibration curves and decision curve analysis (DCA) were performed (Supplementary Fig. 2A-B).
(A) Feature selection process for early identification of PL (subphenotype 1) patients with sepsis using Boruta’s algorithm. The x-axis displays the variables, and the y-axis shows the Z-values for each variable. Green boxplots indicate confirmed features (significantly higher importance than the maximum shadow feature), red boxplots show rejected features, and yellow boxplots represent tentative features. Blue boxplots depict the min/mean/max range of shadow features. (B,C) ROC curves compare six predictive models in the Training Set and Test Set within the derivation cohort. (D) ROC curves for the external validation cohort. (E) The characteristic attributes and importance ranking of features in the Random Forest model are shown. The x-axis shows SHAP values, and each line represents a feature. Red dots indicate higher SHAP values, while blue dots indicate lower ones. (F) Explanation of the prediction results for a patient with persistent lymphopenia.
We applied SHAP to interpret the optimal prediction model and identify the most influential features. The SHAP swarm and summary bar plots indicated that lower LC, WBC, PLT, and PCT, along with older age and immunocompromised status, were the six most important features for distinguishing subphenotype 1 (Fig. 5E and Supplementary Fig. 2C). An explanation of the prediction for a specific instance of a persistent lymphopenia patient was shown in Fig. 5F. We implemented the visualization and basic application of the prediction model through a deployable web platform (https://rf-model-gf6efgvsvmcwrsj2i6cer9.streamlit.app/) using Streamlit and uploaded the source code to GitHub (Supplementary Fig. 3).
Identification of subphenotype 1 enhanced the predictive performance for ICU mortality
Given that PL patients had the worst outcomes, early identification of this high-risk subphenotype could be valuable for predicting clinical outcomes. Univariate and multivariate logistic regression analyses identified age, SOFA score, Lac, temperature, pulmonary source of infection, and immunocompromised status (Set 1) as independent risk factors for ICU mortality (Table 3). The predictive performance for ICU mortality was significantly improved by the addition of the PL subphenotype to Set 1 (AUC increased from 0.702 to 0.722; p = 0.031; Fig. 6A). In the external validation cohort, the model’s AUC rose from 0.641 to 0.697 after incorporating the PL subphenotype (p = 0.0001; Fig. 6B). Although statistically significant, this improvement in discrimination was modest.
Comparison of ROC curves for ICU mortality prediction in the derivation cohort (A) and external validation cohort (B). The red curve represents the baseline model (Set1), which utilizes selected features, including pulmonary infection, Immunocompromised, Age, Temperature, Lac, and SOFA, as determined by both univariate and multivariate analyses. The blue curve illustrates the enhanced model (Set2), which incorporates the PL trajectory into the baseline model. The DeLong test compares the AUCs of the two models, revealing a significant improvement in predictive performance for the enhanced model over the baseline model in the derivation cohort (p = 0.031) and external validation cohort (p = 0.0001).
Discussion
This study, based on data from the China Multicenter Sepsis (CMS) database, used latent class trajectory modeling (LCTM) to describe dynamic changes in peripheral blood lymphocytes among patients with sepsis. External validation in an independent cohort from Peking Union Medical College Hospital further confirmed the reproducibility of our findings. We identified four distinct immune subphenotypes based on LC trajectories, revealing significant immunological heterogeneity in sepsis. Among them, patients with persistent lymphopenia (PL) exhibited the highest rates of immunosuppression, more frequent pulmonary infections, greater disease severity, and the poorest outcomes. We developed multiple machine learning models, among which the random forest performed best. SHAP analysis established the clinical interpretability of this model. To translate this into practice, we created an online tool to identify high-risk PL patients early. The incorporation of PL trajectory significantly enhanced the prognostic accuracy for ICU mortality.
Previous studies have assessed immune status using cross-sectional lymphocyte counts20, which have also observed poor outcomes in patients with sustained lymphopenia, further supporting the link between persistent lymphocyte loss and higher mortality. However, this method does not capture the dynamic change of the immune system, which can vary significantly with disease progression. In this study, we aimed to address this limitation by using a trajectory modeling approach to analyze longitudinal changes in lymphocyte counts. The patients with persistent lymphopenia indicated ongoing immunosuppression, consistent with the poor prognosis. These results align with a 2022 single-center study that identified lymphocyte trajectory subphenotypes and found the worst outcomes in the PL group13. However, that study included a general ICU population from a single center and did not stratify patients by reasons for ICU admission. Our study addressed these limitations by focusing on sepsis patients and validating lymphocyte count trajectories in a nationwide multicenter cohort with an independent external cohort.
Recent studies suggest that immune responses in sepsis are dynamic and potentially modifiable by anti-inflammatory treatment, showing that blood immune endotypes in COVID-19 pneumonia frequently shift over time and can be favorably influenced by targeted immunotherapy21. Within this framework, our lymphocyte trajectory classes likely reflect an evolving immune state shaped by both underlying host condition and intensive care interventions. Future work integrating longitudinal immunoregulatory treatment information and immune phenotyping may further clarify how specific therapies influence lymphocyte trajectories and their prognostic significance.
Notably, our findings differed from two previous multicenter studies that reported alternative prognostic patterns10,11. A prior single-center investigation developed lymphocyte trajectory classes and reported that patients with a “rapidly decreasing” trajectory had the worst outcomes10. However, their trajectories were derived from a single-center cohort, and immunosuppressed patients were excluded, which substantially limited the generalizability of their prognostic model and contributed to inconsistencies with our findings. Another study identified that patients with persistent lymphopenia had the highest mortality11. Yet this trajectory model was not externally validated, raising concerns about its stability across populations, and immunosuppressed patients were again excluded. Moreover, their analysis categorized septic patients into only three classes, whereas our study identified four distinct trajectory classes, providing a more nuanced representation of host immune phenotypes.
In contrast to both prior studies, we included patients with pre-existing immunocompromised status, such as individuals with cancer, organ transplants, or HIV, who are at particularly high risk for sepsis and death22,23,24. Their inclusion may explain why the PL phenotype appeared to have the worst prognosis in our study. Moreover, this inclusion enhanced the real-world applicability of our model, as these patients represent a significant portion of the sepsis population.
This study aimed not only to identify a high-risk subphenotype through simple lymphocyte counts but also to build a reliable machine learning model for the early prediction of patients with prolonged immune suppression and poor outcomes. The model was then converted into a user-friendly online calculator designed for clinical use. Beyond lymphocyte count, our research identified several key factors for predicting persistent lymphopenia in sepsis patients, including WBC, age, PCT, PLT, and preexisting immunocompromised status. Age is an inherent risk factor for sepsis severity and has been confirmed by many previous studies25. The aging process impairs immune function, specifically by reducing T-cell, B-cell26, and innate immune cell activity27, collectively lowering antigen presentation, phagocytic capacity, and cellular chemotaxis. As a result, elderly patients are more vulnerable to progressing into an immunosuppressed state after sepsis. In addition, preexisting immunocompromised status, which can be seen in patients with prior conditions like organ transplants, chronic steroid use, or cancers, has consistently been linked to a higher risk of developing sepsis-related immune paralysis. Studies indicate that these patients begin with a compromised immune response, which further deteriorates during sepsis, thereby predisposing them to persistent immune deficiency, higher mortality, and recurrent infections28. Furthermore, the decrease in WBC in PL patients indicates a state of immunosuppression. This decline results from a reduced immune response and less proliferation of immune cells, leading to fewer circulating white blood cells. This decreased cellularity hampers the host’s ability to fight infections effectively. While our model mainly depends on features selected through the Boruta algorithm, all these indicators are easy to interpret clinically. This interpretability supports the value of our online calculator as a practical tool for bedside decision-making.
Sepsis-induced immunosuppression has increasingly been recognized as a key factor influencing patient outcomes29. Early clinical monitoring emphasizes inflammatory or infection-related biomarkers such as WBC, CRP, and PCT28. However, immune responses in sepsis are highly heterogeneous. Some develop overwhelming hyperinflammation, while others enter a state of predominant immunosuppression, which is often associated with worse long-term outcomes. Early identification of this immunosuppression group remains a major clinical challenge27. Recent studies have emphasized the value of circulating immune cell parameters, including LC28, neutrophil-to-lymphocyte ratio (NLR)29, and monocytic HLA-DR expression (mHLA-DR)30, in assessing the immune status of septic patients. Our results support the importance of monitoring LC dynamically. In the PL group, lymphocyte counts stayed persistently low, indicating severe immunosuppression31. Consequently, traditional inflammatory markers (WBC, CRP, PCT) did not show significant abnormalities in PL patients. Similarly, patients in the rapidly decreasing lymphocyte (RDL) group also had relatively normal baseline inflammatory markers, yet their declining LC trajectory was strongly linked to poor prognosis. These findings demonstrate that relying solely on baseline inflammatory markers is inadequate to fully understand immune dysfunction in sepsis.
PL patients show coagulation abnormalities, such as lower platelet counts and longer clotting times. In immunosuppression, impaired pathogen clearance keeps the coagulation system active, raising the risk of disseminated intravascular coagulation (DIC). Evidence indicates that DIC occurs more frequently and is more severe in immunosuppressed patients32. Collectively, the PL phenotype signals ongoing immunosuppression and a profound immunity-coagulation imbalance, contributing to the high mortality of sepsis.
Although previous studies have reported an association between persistent lymphopenia and mortality in sepsis, our study provides several additional contributions. First, we identified the persistent lymphopenia subgroup at an early stage by integrating lymphocyte counts with routinely available clinical variables reflecting immune, inflammatory, and coagulation status, and translated this approach into a user-friendly web-based calculator to facilitate clinical use. Second, unlike many prior studies, we included patients with preexisting immunocompromised conditions and adjusted for their immunocompromised status, thereby enhancing the real-world applicability of our findings. Finally, the model was derived from a prospective, multicenter cohort and validated in an independent, external cohort, thereby strengthening its generalizability and robustness.
The study had several limitations. First, the generalizability of our findings may be limited by the single-center validation cohort and the exclusively Chinese population of our study. Previous studies have used the MIMIC database for external validation, so we did not repeat that step. Second, our classification relied exclusively on lymphocyte count dynamics and did not incorporate functional tests of lymphocyte subsets or other immune markers. Lymphocyte count was chosen as it is a standard, readily available component of routine blood tests, simplifying the clinical application of our findings. Third, the online prediction tool we developed has not yet been tested prospectively in actual clinical workflows, and future clinical studies are needed to confirm its practical usefulness and clinical value. Fourth, although adding the PL subphenotype produced a small but statistically significant increase in AUC in the external validation cohort, the model’s overall discrimination remained modest, likely due to the substantial biological and clinical heterogeneity of sepsis. Our ICU mortality model was not intended as a definitive bedside tool, but rather to demonstrate that the PL trajectory is a significant prognostic marker and that early identification of this pattern may be valuable. These findings suggest that future ICU mortality prediction models should consider including the PL trajectory as a candidate predictor.
Conclusion
By applying latent class trajectory modeling (LCTM) to a national, multicenter prospective cohort and an external validation cohort, we identified distinct LC trajectories, revealing the heterogeneity of immune responses in sepsis. The early identification of patients with persistent lymphopenia is crucial for assessing disease severity and prognosis. Future research should focus on developing individualized immunomodulatory strategies tailored to these specific immune subphenotypes.
Data availability
The data are available from the corresponding author upon reasonable request.
Abbreviations
- LC: Lymphocyte count
- PL: Persistent lymphopenia
- ICU: Intensive care unit
- LCTM: Latent class trajectory modelling
- SHAP: SHapley Additive exPlanations
- CMS: China Multicenter Sepsis database
- PUMCH: Peking Union Medical College Hospital
- AIC: Akaike information criterion
- BIC: Bayesian information criterion
- SIL: Slowly increasing lymphocytes
- NL: Normal lymphocytes
- RDL: Rapidly declining lymphocytes
- HR: Heart rate
- RR: Respiratory rate
- MAP: Mean arterial pressure
- T: Temperature
- WBC: White blood cell count
- PCT: Procalcitonin
- CRP: C-reactive protein
- PLT: Platelet count
- PT: Prothrombin time
- INR: International normalized ratio
- APTT: Activated partial thromboplastin time
- Fg: Fibrinogen
- DD: D-Dimer
- FDP: Fibrin degradation products
- PaO2/FiO2: Oxygenation index
- Lac: Lactic acid
- Cr: Creatinine
- TBIL: Total bilirubin
- OR: Odds ratio
- DIC: Disseminated intravascular coagulation
References
Singer, M. et al. The third international consensus definitions for sepsis and septic shock (Sepsis-3). JAMA 315, 801. https://doi.org/10.1001/jama.2016.0287 (2016).
Hotchkiss, R. S., Monneret, G. & Payen, D. Sepsis-induced immunosuppression: From cellular dysfunctions to immunotherapy. Nat. Rev. Immunol. 13, 862–874. https://doi.org/10.1038/nri3552 (2013).
Giamarellos-Bourboulis, E. J. et al. The pathophysiology of sepsis and precision-medicine-based immunotherapy. Nat. Immunol. 25, 19–28. https://doi.org/10.1038/s41590-023-01660-5 (2024).
Rhodes, A. et al. Surviving sepsis campaign: International guidelines for management of sepsis and septic shock: 2016. Crit. Care Med. 45, 486–552. https://doi.org/10.1097/CCM.0000000000002255 (2017).
Gogos, C. A., Drosou, E., Bassaris, H. P. & Skoutelis, A. Pro- versus anti-inflammatory cytokine profile in patients with severe sepsis: A marker for prognosis and future therapeutic options. J. Infect. Dis. 181, 176–180. https://doi.org/10.1086/315214 (2000).
Boomer, J. S. et al. Immunosuppression in patients who die of sepsis and multiple organ failure. JAMA 306, 2594. https://doi.org/10.1001/jama.2011.1829 (2011).
Drewry, A. M. et al. Persistent lymphopenia after diagnosis of sepsis predicts mortality. Shock 42, 383–391. https://doi.org/10.1097/SHK.0000000000000234 (2014).
Adrie, C. et al. (on behalf of the OUTCOMEREA study group). Persistent lymphopenia is a risk factor for ICU-acquired infections and for death in ICU patients with sustained hypotension at admission. Ann. Intensive Care. 7, 30. https://doi.org/10.1186/s13613-017-0242-0 (2017).
Podd, B. S. et al. Early, persistent lymphopenia is associated with prolonged multiple organ failure and mortality in septic children. Crit. Care Med. 51, 1766–1776. https://doi.org/10.1097/CCM.0000000000005993 (2023).
Li, D. et al. Dynamic changes in peripheral blood lymphocyte trajectory predict the clinical outcomes of sepsis. Front. Immunol. 16, 1431066. https://doi.org/10.3389/fimmu.2025.1431066 (2025).
Yang, J., Ma, B. & Tong, H. Lymphocyte count trajectories are associated with the prognosis of sepsis patients. Crit. Care. 28, 399. https://doi.org/10.1186/s13054-024-05186-6 (2024).
Weng, L. et al. National incidence and mortality of hospitalized sepsis in China. Crit. Care. 27, 84. https://doi.org/10.1186/s13054-023-04385-x (2023).
Pei, F. et al. Lymphocyte trajectories are associated with prognosis in critically ill patients: A convenient way to monitor immune status. Front. Med. 9, 953103. https://doi.org/10.3389/fmed.2022.953103 (2022).
Proust-Lima, C., Philipps, V. & Liquet, B. Estimation of extended mixed models using latent classes and latent processes: The R package lcmm. J. Stat. Softw. https://doi.org/10.18637/jss.v078.i02 (2017).
Mirza, S. S. et al. 10-year trajectories of depressive symptoms and risk of dementia: A population-based study. Lancet Psychiatry 3, 628–635. https://doi.org/10.1016/S2215-0366(16)00097-3 (2016).
Lampousi, A.-M., Möller, J., Liang, Y., Berglind, D. & Forsell, Y. Latent class growth modelling for the evaluation of intervention outcomes: Example from a physical activity intervention. J. Behav. Med. 44, 622–629. https://doi.org/10.1007/s10865-021-00216-y (2021).
Kursa, M. B. & Rudnicki, W. R. Feature Selection with the Boruta Package. J. Stat. Softw. https://doi.org/10.18637/jss.v036.i11 (2010).
Hou, F. et al. Development and validation of an interpretable machine learning model for predicting the risk of distant metastasis in papillary thyroid cancer: a multicenter study. eClinicalMedicine. 77, 102913. https://doi.org/10.1016/j.eclinm.2024.102913 (2024).
Bhavani, S. V. et al. Distinct immune profiles and clinical outcomes in sepsis subphenotypes based on temperature trajectories. Intensive Care Med. 50, 2094–2104. https://doi.org/10.1007/s00134-024-07669-0 (2024).
Jing, J. et al. Characteristics and clinical prognosis of septic patients with persistent lymphopenia. J. Intensive Care Med. 39, 733–741. https://doi.org/10.1177/08850666241226877 (2024).
Kyriazopoulou, E. et al. Transitions of blood immune endotypes and improved outcome by anakinra in COVID-19 pneumonia: an analysis of the SAVE-MORE randomized controlled trial. Crit. Care. 28, 73. https://doi.org/10.1186/s13054-024-04852-z (2024).
Huson, M. A. M. et al. The impact of HIV co-infection on the genomic response to sepsis. PLoS ONE 11, e0148955. https://doi.org/10.1371/journal.pone.0148955 (2016).
Williams, J. C., Ford, M. L. & Coopersmith, C. M. Cancer and sepsis. Clin Sci. 137, 881–893. https://doi.org/10.1042/CS20220713 (2023).
Feng, T., Feng, X., Jiang, C., Huang, C. & Liu, B. Sepsis risk factors associated with HIV-1 patients undergoing surgery. Emerg. Microbes Infect. 4, 1–6. https://doi.org/10.1038/emi.2015.59 (2015).
Banerjee, D. & Opal, S. M. Age, exercise, and the outcome of sepsis. Crit. Care. 21, 286. https://doi.org/10.1186/s13054-017-1840-9 (2017).
Kumar, D. S., Song-bai, Z., Shi-jin, X. & Kalionis, B. Senescent remodeling of the immune system and its contribution to the predisposition of the elderly to infections. Chin. Med. J. (Engl). 125(18), 3325–3331. https://doi.org/10.3760/cma.j.issn.0366-6999.2012.18.023 (2012).
Gomez, C. R., Nomellini, V., Faunce, D. E. & Kovacs, E. J. Innate immunity and aging. Exp. Gerontol. 43, 718–728. https://doi.org/10.1016/j.exger.2008.05.016 (2008).
Deinhardt-Emmer, S. et al. Sepsis in patients who are immunocompromised: Diagnostic challenges and future therapies. Lancet Respir. Med. 13, 623–637. https://doi.org/10.1016/S2213-2600(25)00124-9 (2025).
Torres, L. K., Pickkers, P. & Van Der Poll, T. Sepsis-induced immunosuppression. Annu. Rev. Physiol. 84, 157–181. https://doi.org/10.1146/annurev-physiol-061121-040214 (2022).
Döcke, W.-D. et al. Monitoring temporary immunodepression by flow cytometric measurement of monocytic HLA-DR expression: A multicenter standardized study. Clin. Chem. 51, 2341–2347. https://doi.org/10.1373/clinchem.2005.052639 (2005).
Wang, Z., Zhang, W., Chen, L., Lu, X. & Tu, Y. Lymphopenia in sepsis: A narrative review. Crit. Care. 28, 315. https://doi.org/10.1186/s13054-024-05099-4 (2024).
Sun, Y. et al. Immunosuppression correlates with the deterioration of sepsis-induced disseminated intravascular coagulation. Shock 61, 666–674. https://doi.org/10.1097/SHK.0000000000002069 (2024).
Acknowledgements
We thank all the collaborators from CMS for providing cases: Peking Union Medical College Hospital: Furong Liu, Li Weng; Zhongda Hospital Southeast University: Jianfeng Xie, Xi Chen; The Second Affiliated Hospital of Kunming Medical University: Qingqing Huang, Jinxi Yue; The First Hospital of Jilin University: Dong Zhang, Yuting Li, Yao Fu; Beijing Friendship Hospital: Meili Duan, Mengya Zhao; West China Hospital of Sichuan University: Yan Kang, Jun Guo, Xue Zhang; The First Hospital of Qinhuangdao: Xiujuan Liu, Tianzhi Liu; The First Affiliated Hospital of Zhejiang University: Hongliu Cai, Xie Zheng, Yiqi Zhang; Qilu Hospital (Qingdao) of Shandong University: Dawei Wu, Huichan Bu; The First Affiliated Hospital of Chongqing Medical University: Fachun Zhou, Shijing Tian; Tianjin First Central Hospital: Yongqiang Wang, Hongmei Gao, Hua Xu; Henan Provincial People’s Hospital: Bingyu Qin, Shi Qiu; The Affiliated Hospital of Qingdao University: Jinyan Xing, Ying Liu; Xiangya Third Hospital: Kai Zhao, Yanjun Zhong, Xin Jin; Zhongshan Hospital: Ming Zhong, Yiqi Qian; The First Affiliated Hospital of Harbin Medical University: Xianglin Meng; Shengjing Hospital affiliated to China Medical University: Bin Zang, Yang Zhao; The First Affiliated Hospital of Kunming Medical University: Haiying Wu, Li Wang.
Funding
This work was supported by the National Key R&D Program of China (No. 2022YFC2304605) and the Natural Science Foundation of Liaoning Province (No. 2024-MSLH-543 and No. 2024-MS-03).
Author information
Contributions
S.H. and Y.S. designed and performed the research, analyzed and interpreted the data, and wrote the manuscript. L.L. and C.W. collected and analyzed the original data. X.L., Y.L., and X.M. supervised the experiments. Y.S. had full access to all data in the study and took responsibility for the integrity of the data and the accuracy of the data analysis. All authors contributed to the editing of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics approval
The study was approved by the Research and Ethics Committee of the First Affiliated Hospital of China Medical University ([2022] 2022-502-2, Shenyang, China) and the institutional review board of Peking Union Medical College Hospital (Approval number JS-3480D). The other participating ICUs obtained their respective ethical approvals.
Informed consent
Written informed consent was obtained from each participant or their legal surrogates before enrollment.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Huang, S., Liu, L., Wang, C. et al. A machine learning-based prediction model for poor prognosis in sepsis using lymphocyte count: a national, multicenter prospective cohort. Sci Rep 16, 3816 (2026). https://doi.org/10.1038/s41598-025-33980-x