Introduction

Cardiac arrest (CA) is a critical global health concern, affecting over 1 million individuals annually with persistently high mortality and morbidity rates [1]. Despite advancements in cardiopulmonary resuscitation (CPR) and advanced life support, the overall hospital discharge rate remains under 10% [2]. Even among those achieving return of spontaneous circulation (ROSC), nearly two-thirds experience severe neurological deficits, and about one-third remain comatose after 72 h [3]. Thus, preserving neurological function has become a focal point in post-resuscitation care, making early, accurate prognostication essential for guiding targeted interventions or decisions on withdrawing life support. The European Resuscitation Council (ERC) and the European Society of Intensive Care Medicine (ESICM) advocate for a multi-modal approach, combining clinical assessments, neurophysiology, biomarkers, and neuroimaging [4]. However, although these methods are highly specific, they often lack sensitivity, resulting in prognostic uncertainty in up to 68% of cases [5]. This emphasizes the need for more sensitive tools to better predict neurological outcomes and improve patient care.

Electroencephalography (EEG) is widely utilized in post-cardiac arrest monitoring due to its real-time, portable, and quantitative capabilities [6]. Recent studies suggest that quantitative EEG (QEEG) features can predict neurological outcomes accurately; for example, Lee et al. reported an AUC of 0.88 [7]. Research by Ghassemi et al. highlights that temporal EEG features, which incorporate dynamic patterns, enhance prognostic models, outperforming static features (AUC = 0.83 vs. 0.79) [8]. Additionally, biomarkers like alpha-power spectral density have demonstrated strong predictive value for positive outcomes, achieving an AUC of 0.903 in studies by Kim et al. [9]. However, using EEG as a standalone prognostic tool in clinical settings is limited due to variability in feature selection, patient-specific factors (e.g., comorbidities, medication), and technical challenges like artifact contamination. These issues impact its consistency and generalizability in clinical workflows, suggesting that an integrative approach incorporating additional physiological indicators may be necessary for a more comprehensive assessment of neurological outcomes.

Recent research has increasingly focused on integrating electrocardiography (ECG), especially heart rate variability (HRV) analysis, with EEG to enhance the prediction of neurological outcomes after cardiac arrest. HRV metrics, reflecting autonomic nervous system (ANS) activity, have been linked to brain function, with specific markers such as very-low-frequency (VLF), low-frequency (LF) power, and the LF/HF ratio correlating with improved neurological outcomes [10,11]. Studies have also identified ECG abnormalities like QTc prolongation and arrhythmias as predictors of severe brain injury and higher mortality rates, particularly in traumatic brain injury (TBI) patients [12,13]. However, the limitation of ECG-based methods is that they primarily offer indirect measures of brain function by reflecting ANS activity, rather than capturing direct neurophysiological changes, especially during critical phases like ischemia and reperfusion [14]. To bridge this gap, recent studies have explored brain-heart coupling, examining the interplay between EEG frequency bands and HRV. For instance, Hermann et al. found that disrupted EEG-HRV interactions were significantly associated with poor neurological outcomes (p < 0.02), highlighting the potential of coupling these modalities [15]. Although these findings suggest the potential of coupling studies, such research typically focuses on specific interaction patterns without leveraging the full range of data available from a comprehensive multi-modal approach. To date, no studies have used both EEG and ECG features at a granular level to predict neurological outcomes post-cardiac arrest. Such a multi-modal integration could provide more nuanced patient insights, leading to more accurate and holistic assessments.

Despite the significant potential of integrating EEG and ECG data to enhance prognostic accuracy, a critical challenge lies in the interpretability of the sophisticated machine learning models employed for such analyses. While these models often achieve impressive predictive performance, their inherent “black box” nature limits their clinical applicability, particularly in intensive care units (ICUs) where crucial decisions—such as whether to initiate or withdraw life-sustaining treatments—require transparency and clear justifications. This opacity may hinder clinicians’ trust and acceptance, thereby impeding the integration of these models into routine practice.

In this study, we developed a multi-modal prognostic model that synthesizes EEG, ECG, and clinical data to improve outcome predictions following cardiac arrest. Our approach goes beyond mere predictive accuracy by incorporating SHapley Additive Explanations (SHAP) to elucidate the contribution of each feature, thereby enhancing model interpretability. This integration of interpretability ensures that the model is not only accurate but also transparent, fostering clinician trust. By leveraging the combined strengths of multi-modal data, our framework provides a robust, interpretable solution that aligns with clinical needs, ultimately supporting evidence-based decision-making in critical care settings. This approach holds significant promise for enhancing prognostic accuracy while maintaining the transparency necessary for clinical adoption.

Materials and methods

Datasets

This study utilized data from the International Cardiac Arrest Research Consortium (I-CARE) database [16], encompassing clinical, EEG, and ECG recordings from comatose patients post-cardiac arrest, collected across seven academic institutions in the United States and Europe. The cohort included adult patients who experienced either in-hospital or out-of-hospital cardiac arrest, achieved return of spontaneous circulation (ROSC), but remained comatose (Glasgow Coma Score ≤ 8). EEG monitoring commenced within hours of cardiac arrest and was sustained based on clinical requirements, with ECG recordings incorporated when available. Clinical data encompassed demographics, cardiac arrest specifics, ROSC duration, and targeted temperature management parameters, while neurological outcomes were evaluated using the Cerebral Performance Category (CPC) scale. Outcome data were acquired through phone interviews conducted at six months or chart review within three to six months post-ROSC. For a detailed workflow, refer to Fig. 1.

Fig. 1
Comprehensive workflow diagram of data processing and analytical pipeline. Note: the FLF here represents feature level fusion, which includes EEG + ECG + clinical features.

Ethical statement

The study received approval from the Institutional Review Boards (IRBs) of all participating institutions, including the Partners Healthcare IRB (#2013P001024) for the institutions in the United States. Due to the retrospective nature of the study and the use of anonymized data, informed consent was waived.

Pre-processing

EEG data were imported into the EEGLAB environment and re-referenced to a common average [17]. ECG data were synchronized and merged with EEG data for multi-modal analysis. A band-pass filter (0.5 to 30 Hz) was applied to remove artifacts, and the data were down-sampled to 100 Hz [18]. The combined dataset underwent manual artifact removal by trained reviewers. ECG-specific processing involved an additional band-pass filter (0.5 to 20 Hz), with QRS complexes detected using the Pan-Tompkins algorithm. Manual corrections of R-peaks and linear interpolation of ectopic beats were performed to refine R-R intervals [19].
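The filtering and down-sampling steps can be sketched in Python with SciPy. This is only an illustration under stated assumptions, not the EEGLAB pipeline used in the study: the 500 Hz input rate, the fourth-order Butterworth design, and the helper names `bandpass` and `preprocess_eeg` are all hypothetical. QRS detection and manual artifact review are not covered.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, resample_poly


def bandpass(x, fs, low, high, order=4):
    """Zero-phase Butterworth band-pass filter along the last axis."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x, axis=-1)


def preprocess_eeg(eeg, fs, target_fs=100):
    """Re-reference, filter, and down-sample EEG of shape (n_channels, n_samples).

    fs and target_fs must be integers (assumed here: 500 Hz in, 100 Hz out).
    """
    # Common average reference: subtract the across-channel mean at each sample.
    eeg = eeg - eeg.mean(axis=0, keepdims=True)
    # 0.5 to 30 Hz pass band, as described in the text.
    eeg = bandpass(eeg, fs, 0.5, 30.0)
    # Polyphase resampling applies its own anti-aliasing filter.
    return resample_poly(eeg, up=target_fs, down=fs, axis=-1)


# Toy usage: four channels, 10 s of synthetic signal at an assumed 500 Hz.
fs = 500
t = np.arange(10 * fs) / fs
rng = np.random.default_rng(0)
eeg = np.vstack([np.sin(2 * np.pi * f * t) for f in (2.0, 6.0, 10.0, 20.0)])
eeg = eeg + 0.5 * np.sin(2 * np.pi * 60.0 * t) + 0.1 * rng.standard_normal(eeg.shape)
eeg_clean = preprocess_eeg(eeg, fs)

# The ECG channel would get the narrower 0.5 to 20 Hz band before QRS detection.
ecg_filtered = bandpass(rng.standard_normal(t.size), fs, 0.5, 20.0)
```

Zero-phase filtering (`sosfiltfilt`) is chosen in this sketch so that the filter does not shift waveform timing, which matters when EEG and ECG are later aligned.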

Feature engineering

The feature engineering process in this study comprises three key steps: unimodal feature extraction, feature selection, and feature fusion [20]. These steps are designed to optimize the multi-modal model for predicting neurological outcomes in post-cardiac arrest patients.

Feature extraction

Clean 20-minute segments of EEG and ECG data were used for analysis. EEG features included time-domain (mean, variance), frequency-domain (power spectral density, band power, coherence), and nonlinear metrics (Higuchi fractal dimension, Shannon entropy, Lempel-Ziv complexity). ECG features focused on HRV metrics such as NN50, pNN50, HF, LF, SD1, SD2, and indices like cardiac sympathetic index (CSI) and cardiac vagal index (CVI). Clinical features (age, sex, ROSC duration, hypothermia status, shockable rhythm) were also integrated to enhance the model’s predictive capabilities [21,22,23].
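Two of the feature families named above can be sketched in a few lines of NumPy. The function names `hrv_features` and `higuchi_fd` are hypothetical, and the formulas are the commonly used textbook definitions rather than the study's exact implementation: SD1 and SD2 are derived from the variances of the R-R series and its successive differences, CSI and CVI follow the Toichi definitions (an assumption), and `kmax=8` for the Higuchi fractal dimension is an arbitrary choice.

```python
import numpy as np


def hrv_features(rr_ms):
    """Time-domain and Poincare HRV features from R-R intervals in milliseconds."""
    rr = np.asarray(rr_ms, dtype=float)
    diff = np.diff(rr)
    nn50 = int(np.sum(np.abs(diff) > 50.0))
    var_diff = np.var(diff, ddof=1)
    sd1 = np.sqrt(0.5 * var_diff)
    # Guard against small negative values from rounding.
    sd2 = np.sqrt(max(2.0 * np.var(rr, ddof=1) - 0.5 * var_diff, 0.0))
    return {
        "NN50": nn50,
        "pNN50": 100.0 * nn50 / diff.size,
        "SD1": sd1,
        "SD2": sd2,
        "CSI": sd2 / sd1,                   # assumed Toichi form: L/T, L = 4*SD2, T = 4*SD1
        "CVI": np.log10(16.0 * sd1 * sd2),  # assumed Toichi form: log10(L*T)
    }


def higuchi_fd(x, kmax=8):
    """Higuchi fractal dimension of a 1-D signal (must not be constant)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    ks = np.arange(1, kmax + 1)
    lengths = []
    for k in ks:
        per_offset = []
        for m in range(k):
            idx = np.arange(m, n, k)          # sub-series x[m], x[m+k], ...
            steps = idx.size - 1
            if steps < 1:
                continue
            dist = np.abs(np.diff(x[idx])).sum()
            norm = (n - 1) / (steps * k)      # Higuchi normalisation factor
            per_offset.append(dist * norm / k)
        lengths.append(np.mean(per_offset))
    # FD is the slope of log L(k) against log(1/k).
    slope, _ = np.polyfit(np.log(1.0 / ks), np.log(lengths), 1)
    return slope


features = hrv_features([800, 810, 870, 880, 820, 815])
fd_line = higuchi_fd(np.arange(1000.0))
fd_noise = higuchi_fd(np.random.default_rng(0).standard_normal(2000))
```

As a sanity check on the sketch, a straight line has a fractal dimension of 1, while white noise should come out close to 2.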

Feature selection

To optimize classification performance, this study employed a multi-step feature engineering strategy encompassing polynomial feature expansion and L1 regularization (Lasso) logistic regression [24]. Initially, polynomial expansion was applied to generate interaction and quadratic terms, enabling the model to capture non-linear interactions between physiological signals (EEG, ECG) and clinical features. Subsequently, L1 regularization was utilized for feature selection across both unimodal (EEG or ECG) and multi-modal (EEG + ECG) datasets, effectively shrinking irrelevant feature coefficients to zero and retaining only the most informative predictors. For unimodal datasets, this approach filtered redundant features, while for multi-modal datasets, principal component analysis (PCA) was employed post-L1 selection to reduce dimensionality while preserving 95% of the variance, thereby mitigating multicollinearity and enhancing computational efficiency [25,26].
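The selection strategy described here (polynomial expansion, L1-penalized logistic regression, then PCA retaining 95% of the variance) maps naturally onto a scikit-learn pipeline. The sketch below is illustrative only: the synthetic data, the scaling steps, and the regularization strength `C=0.5` are assumptions, not values reported in the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in for a fused feature matrix (hypothetical data).
X, y = make_classification(n_samples=200, n_features=8, n_informative=4,
                           random_state=0)

selector = Pipeline([
    ("scale", StandardScaler()),
    # Degree-2 expansion adds squared and pairwise interaction terms.
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),
    ("rescale", StandardScaler()),
    # The L1 penalty drives uninformative coefficients to exactly zero;
    # SelectFromModel keeps only the features with non-zero weight.
    ("l1", SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear", C=0.5))),
    # A float n_components keeps the fewest components explaining >= 95% variance.
    ("pca", PCA(n_components=0.95)),
])

Z = selector.fit_transform(X, y)
n_expanded = selector.named_steps["poly"].n_output_features_
n_selected = int(selector.named_steps["l1"].get_support().sum())
explained = selector.named_steps["pca"].explained_variance_ratio_.sum()
```

Wrapping the steps in a single `Pipeline` means the expansion, selection, and projection are all re-fitted on each training fold during cross-validation, which avoids leaking information from held-out patients into the selected feature set.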

Feature fusion

The study implemented feature-level fusion (FLF) to integrate EEG, ECG, and clinical data into a unified feature vector [27,28]. By amalgamating features from distinct modalities, FLF captures complementary information, thereby enhancing the model’s ability to discern complex interdependencies between neurological, cardiac, and clinical parameters. This integrative approach is intended to improve the predictive accuracy of neurological outcomes post-cardiac arrest and to provide a comprehensive and robust assessment framework.
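In its simplest form, feature-level fusion amounts to concatenating the per-patient feature blocks column-wise. The helper `fuse_features` below is a hypothetical illustration that assumes the three blocks are already row-aligned, with one row per patient in the same order.

```python
import numpy as np


def fuse_features(eeg, ecg, clinical):
    """Concatenate modality-specific feature blocks into one matrix."""
    blocks = [np.asarray(b, dtype=float) for b in (eeg, ecg, clinical)]
    if len({b.shape[0] for b in blocks}) != 1:
        raise ValueError("all modalities must cover the same patients, in the same order")
    return np.hstack(blocks)


# Toy example: 5 patients with 3 EEG, 2 ECG, and 4 clinical features.
rng = np.random.default_rng(0)
fused = fuse_features(rng.random((5, 3)), rng.random((5, 2)), rng.random((5, 4)))
```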

Classification

Data balancing

To address the class imbalance inherent in the dataset, where patients with a CPC score of 1–2 (good neurological outcomes) are significantly fewer than those with a CPC score of 3–5 (poor neurological outcomes), the Synthetic Minority Over-sampling Technique (SMOTE) was employed during the training process. SMOTE, implemented using Python’s sklearn toolkit [29], was used to generate synthetic samples for the minority class (CPC 1–2) with K = 5 nearest neighbors. This technique balances the positive and negative classes in the training data, improving the model’s ability to learn from both classes equally. To avoid overfitting and data leakage, SMOTE was applied independently to each training fold and never to the validation folds, ensuring that synthetic samples did not influence the validation process.
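The per-fold oversampling logic can be sketched as follows. A widely used SMOTE implementation is available in the imbalanced-learn package; the hand-rolled `smote_oversample` helper below is a hypothetical, simplified version of the interpolation step, written so that it depends only on NumPy and scikit-learn. It assumes a binary problem and more than `k` minority samples in each training fold.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import NearestNeighbors


def smote_oversample(X, y, minority_label, k=5, seed=0):
    """Add synthetic minority samples until both classes have equal size."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    X_min = X[y == minority_label]
    n_new = int(np.sum(y != minority_label)) - len(X_min)
    if n_new <= 0:
        return X, y
    # Ask for k + 1 neighbours because each point is its own nearest neighbour.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    neigh = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    partner = neigh[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    # Each synthetic point lies on the segment between a sample and a neighbour.
    synthetic = X_min[base] + gap * (X_min[partner] - X_min[base])
    return np.vstack([X, synthetic]), np.concatenate([y, np.full(n_new, minority_label)])


# Toy imbalanced data: 80 majority (label 0) and 20 minority (label 1) samples.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 4))
y = np.array([0] * 80 + [1] * 20)

fold_counts = []
for train_idx, val_idx in StratifiedKFold(5, shuffle=True, random_state=0).split(X, y):
    # Oversample the training fold only; the validation fold stays untouched.
    X_tr, y_tr = smote_oversample(X[train_idx], y[train_idx], minority_label=1)
    X_val, y_val = X[val_idx], y[val_idx]
    fold_counts.append(np.bincount(y_tr))
```

Keeping the oversampling inside the loop is the key design point: synthetic samples are interpolated only from training-fold patients, so nothing derived from a validation patient can reach the model.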

Model selection and hyperparameter tuning

In this study, we systematically evaluated a range of machine learning algorithms, including Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting (GBOOST). These algorithms are well-established for their ability to handle multi-modal data and tackle complex classification tasks [30].

LR was selected as a baseline model due to its simplicity, interpretability, and effectiveness in linear classification problems. It serves as a reference for comparing more complex models, particularly when feature interactions are limited. SVM, on the other hand, was chosen for its capacity to define decision boundaries in high-dimensional spaces and its robustness in managing non-linear relationships. This is crucial for complex data types like EEG and ECG, as demonstrated in emotion recognition tasks involving multi-modal physiological signals [31]. RF, an ensemble learning technique, was employed for its ability to mitigate overfitting and perform automatic feature selection, making it ideal for handling multi-dimensional data. Its success in seizure classification and epilepsy localization further highlights its suitability for this study [32]. Lastly, GBOOST was incorporated for its capacity to build weak learners sequentially, progressively correcting errors from previous models. It is particularly effective in tasks involving intricate patterns and interactions, such as those found in multi-modal datasets used in seizure prediction and emotion recognition [33,34].

The dataset was split into a training set and a test set using stratified sampling in an 80:20 ratio to preserve the original class distribution. This ensures proportional representation of each class, minimizing potential biases during model training and evaluation. The models were trained on the 80% training set and evaluated on the 20% test set to rigorously assess their generalization capabilities. Given the class imbalance in the dataset, all models incorporated a class_weight='balanced' strategy to address this issue. Additionally, the performance of each classifier was optimized using the Optuna framework for hyperparameter tuning. The Tree-structured Parzen Estimator (TPE) algorithm, employed by Optuna, efficiently explores the hyperparameter space and identifies optimal configurations [35].
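A stratified 80:20 split with balanced class weights might look like the sketch below. The synthetic data and the logistic regression settings are assumptions for illustration, and the Optuna tuning step is left out to keep the example dependent on scikit-learn alone.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight

# Synthetic, imbalanced stand-in for the fused feature matrix (hypothetical).
X, y = make_classification(n_samples=300, n_features=10, weights=[0.67, 0.33],
                           random_state=0)

# stratify=y keeps the class proportions (nearly) identical in both partitions.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# "balanced" weights are n_samples / (n_classes * class_count), so the
# minority class receives the larger weight.
weights = compute_class_weight("balanced", classes=np.unique(y_tr), y=y_tr)

clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
```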

Performance evaluation metrics

Model robustness was assessed using 10-fold cross-validation with each fold serving as an independent test set once. This iterative process minimized data leakage by ensuring patient data was restricted to either the training or test fold. Key performance metrics, including AUC-ROC, accuracy, sensitivity, specificity, and F1 score, were reported with 95% confidence intervals to validate the model’s generalizability.
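A 10-fold evaluation reporting these metrics with 95% confidence intervals can be sketched as follows. The text does not state how the intervals were derived, so the t-based interval over fold scores in the hypothetical `mean_ci` helper is an assumption, as are the synthetic data and the random forest settings. When several segments come from the same patient, a grouped splitter such as scikit-learn's `StratifiedGroupKFold` would be the natural way to keep each patient within a single fold.

```python
import numpy as np
from scipy import stats
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import StratifiedKFold, cross_validate

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

scoring = {
    "auc": "roc_auc",
    "accuracy": "accuracy",
    "sensitivity": "recall",                                   # recall of class 1
    "specificity": make_scorer(recall_score, pos_label=0),     # recall of class 0
    "f1": "f1",
}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
model = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                               random_state=0)
scores = cross_validate(model, X, y, cv=cv, scoring=scoring)


def mean_ci(values, level=0.95):
    """Mean and t-based confidence interval across folds (an assumed method)."""
    values = np.asarray(values, dtype=float)
    half = stats.t.ppf(0.5 + level / 2, df=values.size - 1) * stats.sem(values)
    return values.mean(), values.mean() - half, values.mean() + half


summary = {name: mean_ci(scores[f"test_{name}"]) for name in scoring}
```

Note that a t-interval on fold scores is not bounded by 1, so for metrics near the ceiling a bootstrap interval may be a better choice.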

Interpretation

In this study, SHAP analysis was used to interpret model predictions, clarifying the influence of specific features on patient outcomes and enhancing clinical applicability. SHAP, grounded in cooperative game theory, assigns each feature a contribution score indicating its effect on the prediction. Positive SHAP values increase the predicted likelihood of poor outcomes, while negative values suggest favorable prognoses. By providing both global and individual prediction insights, SHAP supports clinicians in comprehending model behavior and its implications for patient care [36].
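The game-theoretic definition behind SHAP can be made concrete with a brute-force computation of exact Shapley values. This is a didactic sketch, not the algorithm used by the SHAP library (which relies on far more efficient, model-specific estimators): the hypothetical `exact_shapley` helper enumerates every feature subset, so it is only practical for a handful of features, and it assumes that "absent" features are filled in from a background sample.

```python
import itertools
import math

import numpy as np


def exact_shapley(predict, x, background):
    """Exact Shapley values for one sample x.

    predict: callable mapping an (n, d) array to n predictions.
    background: (n, d) reference data used to fill in absent features.
    """
    x = np.asarray(x, dtype=float)
    background = np.asarray(background, dtype=float)
    d = x.size

    def value(subset):
        # Expected prediction when only the features in `subset` are fixed to x.
        data = background.copy()
        idx = list(subset)
        if idx:
            data[:, idx] = x[idx]
        return predict(data).mean()

    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            # Shapley weight |S|! (d - |S| - 1)! / d! for coalitions of this size.
            w = math.factorial(size) * math.factorial(d - size - 1) / math.factorial(d)
            for s in itertools.combinations(others, size):
                phi[i] += w * (value(s + (i,)) - value(s))
    return phi


# Toy linear model: here the Shapley value of feature i is w_i * (x_i - mean_i).
rng = np.random.default_rng(0)
w, b = np.array([2.0, -1.0, 0.5, 0.0]), 0.3


def predict(X):
    return X @ w + b


background = rng.standard_normal((50, 4))
x = np.array([1.0, 2.0, -1.0, 3.0])
phi = exact_shapley(predict, x, background)
```

The contributions sum to the difference between the prediction for `x` and the average prediction over the background data, which is the additivity property that makes per-patient explanations directly comparable to the model output.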

Study endpoints

The primary endpoint of this study is the classification of neurological outcomes based on the CPC scale, which ranges from 1 (full recovery) to 5 (death). A good neurological outcome is defined as a CPC score of 1 or 2, while a poor neurological outcome corresponds to a CPC score of 3, 4, or 5. The classification task is framed as a binary prediction problem, where the model distinguishes between good (CPC 1–2) and poor (CPC 3–5) outcomes using features derived from EEG, ECG, and clinical data.

Statistical analysis

To assess differences among groups based on the 16 SHAP-identified features, robust statistical methods were applied. Continuous variables were analyzed using the Mann-Whitney U test, and associations with categorical variables, such as “Shockable Rhythm,” were evaluated using the Chi-Square test. To mitigate type I errors from multiple comparisons, the Benjamini-Hochberg correction was used with a significance threshold of p < 0.05. Conducted in SPSS, these analyses validated the discriminative power of key features, reinforcing the predictive model’s reliability and interpretability.
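Although the analyses were run in SPSS, the same sequence of tests can be sketched in Python with SciPy. The data below are synthetic and the contingency table counts are made up for illustration; the `benjamini_hochberg` helper is a hypothetical hand-written implementation of the standard step-up adjustment.

```python
import numpy as np
from scipy.stats import chi2_contingency, mannwhitneyu


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(adjusted_sorted, 0.0, 1.0)
    return adjusted


rng = np.random.default_rng(0)

# Continuous feature compared between outcome groups (synthetic values).
good = rng.normal(1.0, 1.0, size=40)
poor = rng.normal(0.0, 1.0, size=40)
p_continuous = mannwhitneyu(good, poor, alternative="two-sided").pvalue

# Categorical feature as a 2x2 table: rows = outcome group, columns = yes/no
# (made-up counts, not taken from the study).
table = np.array([[30, 10], [20, 40]])
p_categorical = chi2_contingency(table)[1]

adjusted = benjamini_hochberg([p_continuous, p_categorical])
significant = adjusted < 0.05
```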

Results

Population description

In this study, data were selectively included based on the study objectives, retaining only patients who had both EEG and ECG data available. As a result, a total of 277 patients were included in the analysis (see Supporting Table 1 for details). The mean age was 61.04 ± 16.52 years, with 69.2% female. Of these, 88 patients had good neurological outcomes, while 178 had poor outcomes. The good outcome group was younger (mean age 56.49 ± 14.04) compared to the poor outcome group (63.15 ± 17.18). Shockable rhythms were more common in the good outcome group (69.3% vs. 31.6%), with similar rates of targeted temperature management (27.3% vs. 27%).

Classification performance

After pre-processing, three types of features were extracted: EEG, ECG, and clinical data. These features were then combined into three distinct scenarios for comparative analysis: (1) all features combined (EEG + ECG + clinical), (2) EEG features alone, and (3) ECG features alone. We employed four machine learning models: Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting (GBOOST) to evaluate the classification performance across these scenarios.

Fig. 2
AUC scores for RF (A), LR (B), SVM (C), and GBOOST (D) models across 12–96 h. The three feature sets are EEG + ECG + Clinical (FLF) (blue), EEG only (orange), and ECG only (green).

Across all models, the multi-modal integration of EEG + ECG + clinical features (FLF) consistently provides the best performance, with AUC scores ranging from 0.75 to 1.0, and in some time periods, reaching a perfect score of 1.0.

Specifically, Fig. 2A (LR model) shows that the multi-modal approach yields high AUC values, especially in the 12–24 h and 49–72 h time windows. In contrast, EEG features alone exhibit more variability in performance, with AUC values ranging from 0.5 to 0.9, and ECG features alone generally demonstrate limited predictive power, with AUCs consistently below 0.6. Similarly, Fig. 2B (SVM model) demonstrates that the multi-modal approach (FLF) consistently outperforms both EEG only and ECG only features, with AUC scores between 0.8 and 1.0 across most time windows. EEG only features show moderate AUC values around 0.7 to 0.8 during the 25–48 h and 49–72 h windows, while ECG only consistently performs poorly, with AUC values typically below 0.6. In Fig. 2C (RF model), the multi-modal integration of EEG + ECG + clinical features (FLF) shows substantial variability, occasionally achieving AUC values close to 1.0, particularly in the 25–48 h time window. EEG features alone perform consistently with AUCs above 0.7 during the 49–72 h window, although their performance fluctuates in other periods. ECG features alone show AUC scores generally below 0.6, reflecting their limited utility as standalone predictors. Finally, Fig. 2D (GBOOST model) illustrates that the FLF approach again achieves superior AUC scores, particularly within the 12–48 h windows, where AUCs range from 0.75 to 1.0. EEG features alone show moderate performance, with AUC values around 0.7, while ECG features alone consistently exhibit weak predictive power, with AUCs remaining below 0.6 across all time windows.

Table 1 Performance comparison of machine learning models (SVM, LR, RF, GBOOST) at 24-hour and 36-hour time points with EEG, ECG, and FLF features.

Although performance across all time points is relevant, the 24-hour and 36-hour time windows are highlighted due to their consistently superior performance across all models using the multi-modal integration of EEG, ECG, and clinical features (FLF). These time points achieved the highest accuracy scores, consistently approaching 1.0, which demonstrates the clear advantage of the FLF approach in predictive modeling. As shown in Table 1, we emphasize these two time points by presenting additional metrics, further supporting the robustness of FLF.

At the 24-hour time window, FLF outperforms both EEG-only and ECG-only features across all models. The SVM model achieves an accuracy of 0.853, surpassing both EEG-only (accuracy: 0.756) and ECG-only (accuracy: 0.536) models. Similarly, the LR model shows a significant performance boost with FLF, attaining an accuracy of 0.829, again outperforming EEG-only (accuracy: 0.707) and ECG-only (accuracy: 0.585). The RF model also demonstrates the strength of FLF, with an accuracy of 0.707, compared to lower performance from EEG-only (accuracy: 0.609) and ECG-only (accuracy: 0.585). The GBOOST model follows a similar trend, with FLF yielding an accuracy of 0.731, surpassing both EEG-only (accuracy: 0.658) and ECG-only (accuracy: 0.512). At the 36-hour time point, FLF consistently achieves perfect performance across all models, with accuracy scores reaching 1.0. These results further highlight the superiority of the multi-modal approach in predicting outcomes. This reinforces the crucial role of multi-modal data fusion in improving predictive power, particularly at the 24-hour and 36-hour time points. For a comprehensive analysis across all time points, including those not highlighted in this summary, please refer to the complete dataset in the submitted Excel file.

SHAP analysis

In this study, SHAP analysis was employed to assess the contribution of individual features from a multi-modal dataset (including EEG, ECG, and clinical data) to the predictions made by the machine learning model. To ensure scientific rigor and reliability of the results, the Random Forest (RF) model was chosen as the primary analysis tool, owing to its superior performance across various evaluation metrics, such as accuracy and AUC. By utilizing the RF model, SHAP analysis provided valuable insights into how different physiological and clinical variables drive model predictions. Figure 3 highlights that “Shockable Rhythm” had the highest mean SHAP value (~0.17), indicating its strong influence on outcomes, while features like HiguchiFD-2, HiguchiFD-7, and SD2 had moderate effects (mean SHAP values of 0.02 to 0.03). The summary plot further illustrates the consistent positive impact of “Shockable Rhythm,” while other features, such as HiguchiFD metrics, displayed context-dependent effects ranging between −0.1 and 0.1. Minimal influence was observed for features like LF and CVI, with SHAP values below 0.01.

Overall, the SHAP analysis confirms the dominant role of clinical features, particularly “Shockable Rhythm,” in model predictions, while EEG and ECG data offer complementary context-specific contributions. These results demonstrate that integrating multi-modal data significantly enhances model robustness and predictive accuracy in clinical decision-making.

Fig. 3
SHAP-based ranking of the top 16 features: (a) SHAP feature ranking; (b) SHAP summary plot. Note: the x-axis shows the Shapley values and the y-axis lists the features in ranked order. Each blue dot corresponds to a lower value of the feature for an individual sample, while red dots indicate higher feature values.

Statistical comparison of differences

Statistical tests validated the significance of key features identified via SHAP analysis. According to Supporting Table 2, Mann-Whitney U and Chi-Square tests showed that while some ECG features (e.g., SD2, Mean_RR) lacked statistical significance (p > 0.05), several EEG features (e.g., HiguchiFD_2, HiguchiFD_7) were highly significant (adjusted p < 0.001). Additionally, the clinical feature “Shockable Rhythm” was associated with better outcomes (p = 0.004). These findings emphasize that integrating EEG and clinical data significantly enhances predictive accuracy, even when ECG features alone are less informative, thereby improving model robustness and clinical applicability.

Discussion

In this study, we developed a machine learning model integrating multi-modal data (EEG, ECG, and clinical information) to predict neurological outcomes in post-cardiac arrest patients. The results demonstrate that combining all multi-modal features significantly improves predictive accuracy compared to single-modality approaches. Notably, during early critical time windows (12–24 and 25–48 h), the model achieved AUC values up to 1.0, outperforming models based solely on EEG (AUC 0.7–0.8) or ECG (AUC < 0.6). This highlights the importance of early, comprehensive assessments for guiding clinical interventions.

Model performance evaluation

This study builds upon previous research that has employed machine learning models to predict neurological outcomes in patients following cardiac arrest. Earlier studies, such as those by Maschke et al. and Amorim et al., have primarily focused on developing predictive models using EEG data alone. These single-modality approaches demonstrated moderate predictive performance, with AUC values typically ranging from 0.7 to 0.8 [37,38]. Similarly, studies by Sung et al. and Kim et al. investigated ECG-based features, like HRV, for outcome prediction; however, their models generally yielded lower AUC scores, often below 0.7 [9,39]. Our study extends these findings by integrating multi-modal data (EEG, ECG, and clinical features) into a unified machine learning framework. This multi-modal approach allows for a more comprehensive representation of the patient’s condition by leveraging the complementary strengths of each data type, thereby enhancing overall prediction accuracy. The results show that our model, incorporating all multi-modal features, achieved higher AUC values, ranging from 0.75 to 1.0, compared to single-modality models. This underscores the advantage of multi-modal integration in capturing a broader spectrum of factors that influence neurological outcomes.

A particularly notable finding in our study is the model’s performance across different time windows. The classification analyses revealed that integrating multi-modal data significantly enhances predictive accuracy, especially in the early time windows (12–24 h and 25–48 h), where AUC values reached 1.0 and 0.85, respectively. This is a substantial improvement over single-modality models based on EEG (AUC 0.7–0.8) or ECG (AUC below 0.6). Our results align with and extend the findings of Uslenghi et al., who emphasized the importance of early time-window analysis for reliable predictions in post-cardiac arrest patients [40]. The ability of our model to achieve higher predictive accuracy in early time windows underscores the critical value of early multi-modal assessments in clinical decision-making. Moreover, our approach’s ability to capture the temporal dynamics of various physiological signals is critical, as highlighted by Ghassemi et al. and Dai et al., who noted that the predictive value of certain physiological markers, such as EEG complexity, may change over time due to evolving pathophysiological states [8,41].

By employing advanced machine learning models such as LR, SVM, RF, and GBOOST, our approach handles complex, high-dimensional data better than traditional methods. These models excel at capturing non-linear relationships, aligning with advancements in AI-driven critical care as noted by Callaway et al. and Nolan et al. [42,43]. Overall, integrating multi-modal data and focusing on early assessments significantly enhances prediction accuracy, offering a robust foundation for future research and clinical applications.

Explainable results

To elucidate the contribution of each feature to predicting neurological outcomes after cardiac arrest, we employed SHAP to interpret their impact on the model’s predictions. “Shockable Rhythm,” which refers to the initial presentation of shockable rhythms like ventricular fibrillation (VF) or pulseless ventricular tachycardia (VT), emerged as the most influential feature with the highest mean SHAP value. This finding is consistent with established clinical guidelines by Callaway et al. and Nolan et al., which emphasize that patients with initial shockable rhythms have a significantly better prognosis than those with non-shockable rhythms like asystole or pulseless electrical activity (PEA) [42,43]. The prominence of Shockable Rhythm in our model underscores the crucial role of timely defibrillation and early resuscitation interventions in improving neurological outcomes, reinforcing its value in clinical decision-making [44].

Our SHAP analysis also highlighted the importance of EEG-based Higuchi Fractal Dimension (HiguchiFD) metrics and HRV measures as significant predictors. HiguchiFD, which quantifies EEG signal complexity, has been validated as a marker for brain activity and injury severity after resuscitation [21]. Notable HiguchiFD features in our model, such as HiguchiFD-2 (channel FP2) and HiguchiFD-7 (channel P3), point to the relevance of specific brain regions. The FP2 electrode, associated with the frontal lobe, is linked to cognitive functions and decision-making processes crucial for recovery post-brain injury [45]. Similarly, P3 in the parietal lobe is related to sensory integration and working memory, which are vital for functional recovery and neural reorganization after cardiac arrest [46]. The significant SHAP values for these features suggest that neural complexity in these regions reflects key processes of neuroplasticity and network adaptation critical to recovery.

HRV features, including SD2, mean_RR, and SDNN, also emerged as important indicators, reflecting autonomic regulation during post-arrest recovery. Reduced HRV is a known predictor of poor outcomes, indicating autonomic dysfunction and impaired recovery potential [47,48]. Our findings, where mean_RR and SDNN show substantial SHAP values, reinforce the value of integrating HRV metrics for early risk stratification and improving the model’s predictive capability. By combining clinical indicators like the initial shockable rhythm with EEG complexity and HRV measures, our model offers a robust, multi-modal approach to neurological prognostication, supporting more personalized and effective interventions for cardiac arrest patients.

Clinical relevance

This study highlights the efficacy of integrating multi-modal data (EEG, ECG, and clinical variables) to enhance early-stage neurological outcome predictions in post-cardiac arrest patients, aligning with the European Resuscitation Council’s guidelines for a multi-modal prognostic approach [43]. Utilizing SHAP analysis significantly improves model interpretability, allowing clinicians to better understand each feature’s contribution to predictions [49]. This transparency is crucial for fostering trust in AI-assisted decision-making, particularly in critical interventions where the decision to continue or withdraw life-sustaining therapies is based on individualized risk assessments [45]. The methodology extends beyond cardiac arrest care, with potential applications in stroke management and anesthesia, by leveraging comprehensive physiological data to inform more tailored treatment strategies [50].

Limitations and future directions

While our study yielded promising outcomes, certain limitations must be addressed for broader validation. The relatively small sample size restricts the generalizability and statistical power of our findings, highlighting the need for larger, diverse cohorts in future research. Although multi-modal data integration has improved predictive accuracy, challenges remain in synchronizing diverse data streams. Future studies should explore advanced data fusion methods to better capture temporal dynamics. The current use of static time windows may not fully capture dynamic physiological changes; thus, adaptive time-windowing techniques that adjust model parameters in real-time could enhance personalized interventions. Furthermore, incorporating additional signals like blood pressure variability and respiratory patterns could further improve neurological outcome assessments. Addressing these gaps will enhance the framework’s precision and clinical relevance in critical care settings.

Conclusion

Leveraging advanced machine learning models allowed us to effectively capture the intricate, nonlinear interactions between various features, establishing a solid foundation for precise and timely predictions in critical care environments. Multi-modal integration plays a crucial role in optimizing patient prognosis and tailoring interventions in post-cardiac arrest management, with the potential to improve patient outcomes and promote more personalized care.