Development and validation comparison of multiple models for perioperative neurocognitive disorders during hip arthroplasty

Wang, Gang; Xie, Yi; Bai, XiaRui; Zhang, Yuming; Guo, Jiao

doi:10.1038/s41598-025-93324-7

Download PDF

Article
Open access
Published: 19 March 2025

Development and validation comparison of multiple models for perioperative neurocognitive disorders during hip arthroplasty

Gang Wang¹^na1,
Yi Xie²^na1,
XiaRui Bai¹,
Yuming Zhang¹ &
…
Jiao Guo ORCID: orcid.org/0000-0002-3505-9825^1,3

Scientific Reports volume 15, Article number: 9393 (2025) Cite this article

2151 Accesses
Metrics details

Abstract

This study aims to develop optimal predictive models for perioperative neurocognitive disorders (PND) in hip arthroplasty patients, thereby advancing clinical practice. Data from all hip arthroplasty patients in the MIMIC-IV database were utilized to predict PND. With 62 variables, we applied multiple logistic regression, artificial neural network (ANN), Naive Bayes, support vector machine, and decision tree (XgBoost) algorithms to forecast PND. Feature analysis, receiver operating characteristic curve (ROC) and calibration curve plotting, and sensitivity, specificity, and F-measure β = 1 (F1-score) assessments were conducted on both training and validation sets for classifying models’ effectiveness. Brier score and Index of prediction accuracy (IPA) were employed to compare prediction capabilities in both sets. Among 3,292 hip arthroplasty patients in the MIMIC database, 331 developed PND. Five models using different algorithms were constructed. After thorough comparison and validation, the ANN model emerged as the most effective model. Performance metrics on the training set for the ANN model were: ROC: 0.954, Accuracy: 0.938, Precision: 0.758, F1-score: 0.657, Brier Score: 0.048, IPA: 90.8%. On the validation set, the ANN model performed as follows: ROC: 0.857, Accuracy: 0.903, Precision: 0.539, F1-score: 0.432, Brier Score: 0.071, IPA: 71.4%. An online visualization tool was developed (https://xyyy.pythonanywhere.com/).

Analysis of risk factors related to perioperative neurocognitive disorders in elderly patients with hip fractures

Article Open access 09 February 2025

Automated identification of hip arthroplasty implants using artificial intelligence

Article Open access 16 July 2022

Prediction of osteoporosis from simple hip radiography using deep learning algorithm

Article Open access 07 October 2021

Introduction

In 2018, the International Perioperative Cognitive Expert Group formally introduced the term “perioperative neurocognitive disorders (PND)” to replace cognitive change associated with surgery and anesthesia. This inclusive nomenclature encompasses a spectrum of cognitive alterations, including preoperative cognitive decline, early postoperative delirium, delayed neurocognitive recovery, postoperative cognitive dysfunction, and various degrees of cognitive deficits ranging from mild to severe¹. Perioperative neurocognitive disorders have lasting effects on patients and families beyond surgery. Many with mild cognitive issues develop Alzheimer’s within three years, raising dementia risk 12 times and increasing postoperative mortality significantly^2,3. An editorial revealed that 65% of Australian patients fear lasting cognitive issues after surgery⁴. This highlights the need for more research into perioperative neurocognitive disorders due to heightened patient concern.

Perioperative neurocognitive disorders vary in prevalence among different surgeries. A major study found a notable 19.6% rate of cognitive impairment after total hip arthroplasty, ranking it highest outside cardiac procedures⁵. Another study found a significant 54% detection of neurocognitive issues in 99 patients after total joint replacement⁶. Evidently, cognitive impairment is prevalent throughout the entire joint replacement process.

Existing research have primarily focused on risk factors, with recent findings highlighting hronic illnesses, alcohol intake, auditory deficits, infections, high Apache II scores at admission, sedative and analgesic use, catheter insertion, and sleep issues as independent risks^7,8. Furthermore, certain studies have delved into some laboratory values, revealing connections between perioperative cognitive dysfunction and serum lipid, uric acid, perioperative lactate, and inflammation-related factors. These links suggest potential predictive factors^9,10,11. Systematic reviews also analyze clinical parameters associated with cognitive impairment after hip arthroplasty. Findings suggest elderliness, lower BMI, smoking, visual impairment, preoperative delirium, dementia, hypertension, and COPD as significant risk factors¹². In addition, previous studies mainly focus on predictive factors in the general population, lacking tailored predictive analytics. There’s a crucial need for precise research to deliver personalized predictions for tailored interventions. Anticipating cognitive issues during surgery not only benefits patients but also helps healthcare professionals improve monitoring and care processes¹³.

Built upon the MIMIC-IV database, this study involves data from over 3000 hip arthroplasty patients, including 30+ clinical variables like demographics, lab tests, vital signs, complications, medications, and treatments. Its main goal is to create a predictive model for neurocognitive issues during hip arthroplasty using machine learning. After thorough validation and comparison, we’ll determine the most effective model for precise predictions. Also, we’ll develop an online tool to identify patients at risk of neurocognitive disorder during hip arthroplasty.

Methods

Data source

This study utilized patient data from the extensive MIMIC-IV database to develop and validate a predictive model for perioperative neurocognitive disorders following hip arthroplasty. Covering clinical data from over 190,000 patients and 450,000 hospital admissions at the Beth Israel Deaconess Medical Center between 2008 and 2019, the MIMIC-IV database is a valuable and publicly accessible clinical resource. Researchers can share data after completing a training program. Importantly, patient identifiers in MIMIC-IV have been meticulously removed to the fullest extent possible, ensuring the utmost protection of patient privacy.

Throughout this research, a team member, duly authorized to access the MIMIC database, employed the MIMIC-IV repository to compile a dataset that comprehensively encompasses inpatient data concerning perioperative neurocognitive disorders during individuals undergoing hip joint surgeries. This dataset incorporates an array of detailed information, encompassing demographics, laboratory findings, perioperative medication history, vital sign recordings, surgical interventions, treatment, disease diagnoses, medication regimens, and subsequent survival outcomes.

Participants and outcomes

We gathered information from all patients admitted for hip joint replacement or revision in the MIMIC-IV database, which covered surgeries conducted through various approaches, including both partial and total hip joint replacement procedures. The identification of these surgeries was based on codes from the 9th edition revision (ICD-9) and the 10th edition revision of the MIMIC-IV database (see Supplementary Appendix S1).

The main outcome of our study was on perioperative cognitive disorders occurring during hospitalization. In accordance with the guidelines outlined in the 5th edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), we classified delirium, mild cognitive impairment (MCI) and dementia as neurocognitive disorders in our study¹⁴. Data related to cognitive disorders were extracted using both ICD-9 and ICD-10 codes (see Supplementary Appendix S2).

Predictive variables

Following a comprehensive review of previously identified risk factors associated with cognitive impairment, we extracted the relevant risk factor variables from the MIMIC database, along with other pertinent variables. The included variables encompassed various domains: (1) Demographic data, including age, gender, admission type, marital status, ICU stay time, and in-hospital mortality rate; (2) Vital signs, including heart rate, systolic blood pressure, diastolic blood pressure, respiratory rate, temperature, and pulse oximetry (SpO₂); (3) Laboratory variables, including creatinine, urea nitrogen, hemoglobin, platelet count, international normalized ratio (INR), prothrombin time (PT), white blood cell count, lymphocytes, neutrophils, uric acid, total cholesterol, triglycerides, low-density lipoprotein (LDL), high-density lipoprotein (HDL), sodium, potassium, pH, glucose, and lactate; (4) Identification of complications based on ICD-9 and ICD-10 codes, including hypertension, diabetes, cardiovascular diseases, chronic bronchopulmonary disease, liver disease, kidney disease, craniocerebral disease, and sleep disorders(see Supplementary Appendix S3-S10); (5) Medications: including analgesics and sedatives (morphine, fentanyl, midazolam, propofol, etomidate, dexmedetomidine) and cardiovascular drugs (norepinephrine, epinephrine, nitroglycerin, nifedipine, captopril, hydrocortisone, dobutamine, and dopamine); (6) Scoring systems, including Simplified Acute Physiology Score II (SAPS II), Sequential Organ Failure Assessment (SOFA) score, and Glasgow Coma Scale (GCS); (7) Treatment strategies, including mechanical ventilation, renal replacement therapy, blood transfusion (see Supplementary Appendix S11). For laboratory data, vital signs, and related information, we opted for the earliest available results or the first postoperative results.

Data acquisition and missing management

In this study, we extracted various patient data from the MIMIC-IV database. Demographic information was obtained from the admissions and patients tables, while laboratory results came from the lab events table. We collected complications data from the “diagnoses_icd” table and medication records from the prescriptions table. Scores for the Sequential Organ Failure Assessment (SOFA), Simplified Acute Physiology Score II (SAPS II), and Glasgow Coma Scale (GCS) were retrieved from the corresponding derived tables: sofa, SAPS II, and GCS. To identify patients with perioperative cognitive issues, we queried the “d_icd_diagnoses” and “diagnoses_icd” tables. Each record in the MIMIC-IV database is uniquely identified by the “hadm_id” (hospital admission ID), which we used to link the different data sources for further analysis.

During data mining, we obtained 62 variables from the MIMIC database. Two variables lacked data and were excluded. Variables with over 30% missing data were also omitted. Missing values in continuous variables were filled with median values, and categorical variables were encoded using ordinal techniques. Consequently, 27 variables were removed during model building, including parameters like heart rate, blood pressure, respiratory rate, oxygen saturation, temperature, and various medications and treatments such as captopril, dobutamine, dopamine, dexmedetomidine, epinephrine, etomidate, fentanyl, hydrocortisone, midazolam, morphine, nifedipine, nitroglycerin, propofol, mechanical ventilation, renal replacement therapy, and transfusion.

Sample size

In the MIMIC-IV database, 3,292 patients underwent hip arthroplasty, with 331 developing perioperative neurocognitive disorders. This subset formed our study cohort, which was then randomly split into a development group and a validation group in a 7:3 ratio. To improve the stability and robustness of model performance, we incorporated fivefold cross-validation in the model development and evaluation process, addressing potential variability due to random seed selection. Adhering to the guideline advocating for a minimum of 10 occurrences per predictive variable in machine learning algorithms, the inclusion of 3,292 patients in our study cohort met the recommended sample size criteria for rigorous statistical analysis¹⁵.

Statistical analyses

The normality of the data was examined through the Shapiro–Wilk test. Continuous variables are presented as mean ± standard deviation for normally distributed data, or as median [P25, P75] for non-normally distributed data, while categorical variables are expressed as frequencies (percentages). To assess disparities between the PND and non-PND groups within both the development and validation sets, the Pearson chi-square test was employed to analyze distinctions among categorical variable groups. For continuous variables exhibiting non-normal distribution or heterogeneity, the Mann–Whitney U test or Kruskal–Wallis test was utilized accordingly.

Model development and validation

In this study, we developed five models employing various methodologies, encompassing multiple logistic regression (MLR), artificial neural network (ANN), Naive Bayes (NB), support vector machine (SVM), and decision tree (XgBoost) algorithms. For the MLR and SVM models, we applied the Least Absolute Shrinkage and Selection Operator (LASSO) regression for variable selection, with λ set at 1. This procedure led to the reduction of variables from 35 to 22 (see Supplementary Appendix S12). Subsequently, these selected predictive variables were utilized as inputs for the MLR and SVM to facilitate the machine learning process aimed at identifying pivotal features. Conversely, in the case of the remaining models developed using different algorithms, feature selection involved inputting all predictive variables, allowing the algorithms to autonomously discern significant features.

We conducted feature analysis plots and generated Receiver Operating Characteristic (ROC) curves for the five predictive models. The Area Under the Curve (AUC) from the ROC curves was evaluated to compare their performance. Additionally, sensitivity, specificity, and F1 score were assessed for each model on both the development and validation sets to ascertain their classification efficacy. Calibration curves were constructed to examine the alignment between predicted probabilities and actual outcomes in both datasets. Furthermore, the Brier score and the index of prediction accuracy (IPA) were computed to compare the calibration and prediction capabilities of the models. Finally, Decision Curve Analysis (DCA) plots were generated to estimate and compare the potential clinical utility of the models.

Results

Details of included cases and correlation among variables

The MIMIC-IV database encompasses records of 3,272 patients who underwent either hip arthroplasty or revision surgery. Among these, 330 cases experienced perioperative neurocognitive disorders (PND), while 2,942 cases did not. All cases were incorporated into the final analysis, comprising 264 PND cases and 2,353 non-PND cases in the training set, and 66 PND cases and 589 non-PND cases in the validation set. Table 1 provides a comprehensive overview of the characteristics exhibited by PND and non-PND patients across distinct datasets.

Table 1 The characteristics of PND and non-PND patients from train set and test set.

Full size table

Correlation analysis was undertaken on all variables included in the study. The interrelations among these variables are depicted graphically as a heatmap in Fig. 1. It is evident from the heatmap that there exists a correlation between high density lipoprotein (HDL), low density lipoprotein (LDL), and total cholesterol. Furthermore, a statistically significant correlation is observed between the percentages of lymphocytes and neutrophils.

The key features of models established by different algorithms

Figure 2 illustrates the essential features and their weight rankings in models established by each algorithm. Notably, models developed through different algorithms prioritize distinct key features. In the multivariate logistic regression model (Fig. 2a), the top 10 crucial variables include age, craniocerebral disease, serum creatinine, triglycerides, hemoglobin, international normalized ratio (INR), prothrombin time (PT), SOFA score, lymphocytes, and neutrophils. For the ANN predictive model (Fig. 2b), the algorithm assigns the highest weights to the following 10 variables: age, gender, serum creatinine, GCS score, and whether the patient has diabetes, hypertension, chronic disease, kidney disease, craniocerebral disease, or sleep disorder. Support vector machines (SVM), acclaimed for their classification prowess, stand out in delineating data into distinct categories by identifying the optimal hyperplane. In our SVM-based predictive model (Fig. 2c), the algorithm identifies the top 10 most crucial features, such as age, hemoglobin, SOFA score, sodium, potassium, PT, and indicators of craniocerebral disease, hypertension, liver disease, or kidney disease.

In the Naive Bayes model, the conditional probabilities of each feature under different categories are derived from the training data. By analyzing these logarithmic probabilities, the algorithm can assess the contribution of each feature to the classification task. Figure 2d illustrates the top 10 most crucial features for the occurrence of PND and non-PND cases in the Naive Bayes model. For PND cases, the most important features include hemoglobin, sodium, craniocerebral disease, age, GCS score, pH, neutrophils, total cholesterol, HDL, and LDL. Conversely, for non-PND cases, the top 10 features comprise potassium, age, sodium, pH, neutrophils, GCS score, total cholesterol, HDL, LDL, and the patient’s admission type. In practice, we obtain logarithmic probabilities for each feature under each category. A higher logarithmic probability for a specific category indicates a higher likelihood of the feature under that given category. In the decision tree (Xgboost) model, each node represents a feature, with branches signifying different values of that feature, and leaf nodes indicating the model’s prediction for input samples (Fig. 2e). Various splitting criteria, such as the Gini coefficient, are employed to determine the optimal feature for reducing data uncertainty through splitting. This process occurs at each node of the tree. Key nodes in the decision tree include age, the presence of craniocerebral disease, neutrophils (%), lactate, GCS score, admission type, the presence of chronic bronchopulmonary disease, hemoglobin, white blood cell count, and serum sodium. These nodes play pivotal roles in guiding the model’s predictions by effectively partitioning the data based on their respective features.

Model performance and comparation

As illustrated in Fig. 3 and summarized in Table 2, the artificial neural network (ANN)-based model exhibits superior performance compared to others on the training dataset, demonstrating a larger area under the Receiver Operating Characteristic (ROC) curve (AUC: 0.954; 95% CI 0.947–0.961). With an accuracy of 93.8%, precision of 75.8%, and an F1-score of 65.7%, the ANN model showcases notable predictive capabilities, surpassing models founded on multiple logistic regression (MLR), decision tree, support vector machine (SVM), and Naive Bayes (NB). However, it is noteworthy that the SVM-based model also presents commendable performance, notably excelling in accuracy and precision metrics (AUC: 0.879; 95% CI: 0.874–0.884; accuracy: 91.6%, precision: 65.0%, F1-score: 44.1%). These findings indicate that both artificial neural network (ANN) and support vector machine (SVM) models exhibit robust potential for predictive accuracy, with the ANN model demonstrating slightly superior overall performance. Moreover, upon assessing model performance on the validation dataset, the SVM-derived model emerges as noteworthy (AUC: 0.865; 95% CI 0.834–0.897), surpassing alternative algorithms with an accuracy of 91.1%, precision of 61.2%, and an F1-score of 41.7%.

Table 2 Models performance by different algorithms.

Full size table

In terms of calibration, we generated calibration plots and computed Brier Scores along with the index of prediction accuracy (IPA). Table 2 illustrates that the Brier Scores for all five predictive models consistently remain below 0.1, indicating outstanding accuracy. Notably, on the training dataset, models based on the decision tree, ANN and SVM algorithms show lower Brier Scores and higher IPA values compared to others (decision tree: 0.060; SVM: 0.064; ANN: 0.048; IPA, decision tree: 77.4%; SVM: 75.7%; ANN: 90.8%). Similarly, on the validation dataset, models derived from the decision tree and SVM algorithms demonstrate relatively superior accuracy (decision tree: 0.070; SVM: 0.067; ANN: 0.071; IPA, decision tree: 68.9%; SVM: 73.0%; ANN: 71.4%). These results underscore the remarkable accuracy and calibration performance of the ANN and SVM models across both training and validation datasets (Fig. 4a, c).

Based on the training dataset, we constructed Decision Curve Analysis (DCA) plots for the five models and performed a comparative assessment. The DCA plots reveal that within the predictive risk threshold range of 0% to 80%, models based on artificial neural networks (ANN) outperform others. This indicates that, in this risk range, patients would benefit more from the use of ANN-based predictive models than from alternative approaches.

The decreasing trend observed in the DCA lines suggests that, as the threshold probability increases beyond a certain point, the net benefit of using the predictive models diminishes. This implies that higher threshold probabilities may lead to more false positive predictions, thereby reducing the model’s clinical utility. The points where the DCA curves cross the x- and y-axes are critical: when a model’s curve crosses the y-axis, it shows the net benefit of treating all patients (i.e., the model has no predictive value). When the curve crosses the x-axis, it suggests the threshold probability beyond which the model’s predictions are no longer useful, as the net benefit becomes negative. Similar findings are observed in the validation dataset, where for predictive risk thresholds between 0 and 80%, ANN-based models again show superior performance compared to other models (Fig. 4b, d).

Constructure online tool

In summary, considering the model’s performance, DCA, and predictive accuracy, the ANN-based predictive model stands out as superior, showing greater potential for clinical use. Using this ANN model, we’ve created an online tool (https://xyyy.pythonanywhere.com/) to identify high-risk patients for PND during hip arthroplasty.

Discussion

In this investigation, we developed a rapid and convenient online tool based on a dataset of 3272 patients undergoing hip arthroplasty. The primary objective of this tool is to effectively identify high-risk patients susceptible to perioperative cognitive disorders (PND), thereby aiding in their perioperative management. Our findings underscore the significance of various clinical factors, including craniocerebral diseases, age, gender, hemoglobin, liver disease, kidney disease, admission type, GCS, sleep disorder and hypertension as pivotal features for predicting the onset of PND in patients undergoing hip joint replacement surgery.

Age and gender are widely acknowledged as a crucial risk factor for the occurrence of PND in patients during the perioperative period^16,17,18. Our study’s advanced model confirms that both factors significantly influence PND incidence, with older patients facing a higher risk compared to younger ones, and female patients exhibiting a greater likelihood of developing PND than their male counterparts. Additionally, our analysis identifies craniocerebral diseases as the primary predictors of PND, supporting clinical findings that vascular mechanisms are integral to PND-associated cognitive issues¹⁹. The Glasgow Coma Scale (GCS), essential for assessing cognitive functions and mental states, is also highlighted as a pivotal predictor of PND according to our data analysis²⁰. Furthermore, our findings indicate that renal and hepatic diseases, as well as sleep disorders, substantially impact PND risk through mechanisms like disrupted drug metabolism and toxin accumulation. Meta-analyses of existing cohort studies establish a strong correlation between chronic kidney disease and functional brain disorders, with individuals having elevated urine albumin-to-creatinine ratios facing increased risks of Alzheimer’s disease and vascular dementia²¹. Additionally, previous research underscores a significant rise in sleep disorders among individuals experiencing cognitive decline, reinforcing the profound link between sleep disturbances and cognitive deterioration²².

Observational studies have shown that individuals with hypertension are more susceptible to cognitive dysfunctions²³. Extensive research, including retrospective analyses and meta-analyses, has identified hemoglobin levels as a predictive factor for postoperative delirium in orthopedic patients, with lower hemoglobin levels associated with an increased likelihood of this condition^24,25,26. Consistent with prior studies, our results affirm that hypertension and low hemoglobin are key predictors of PND.

In the past, several studies have proposed similar models. However, these studies were based on a limited amount of data for model development. These studies had small sample sizes and incorporated numerous variables, which could potentially introduce statistical biases^27,28,29. Our study is built upon an extensive dataset comprising 3272 cases, from which we meticulously curated a comprehensive set of variables, totaling over 30. These variables were carefully selected to facilitate both screening and deep learning processes, aiming to develop a robust model targeting perioperative cognitive disorders in patients undergoing hip replacement or revision surgery. Furthermore, we employed a diverse array of five different methods for variable selection, including filter, wrapper, and embedded techniques, in contrast to the prevalent reliance on LASSO regression seen in most studies^27,30. Lastly, our study employed a variety of algorithms for model prediction and construction. Through thorough comparisons, we identified and developed the optimal model (ANN-based predictive model). Notably, the resulting model is not only highly effective but also straightforward, convenient, and user-friendly.

Nonetheless, our research does have some limitations. Firstly, being a retrospective study, it is contingent upon the completeness of the collected data. Initially, we selected over 60 variables; however, due to a high rate of missing data for certain variables, deletions were necessary. Furthermore, additional validation using external datasets is warranted for this study.

Conclusion

In summary, we have developed a simple and convenient big data-based visualization online tool for perioperative screening of high-risk patients for postoperative neurocognitive disorders (PND) in hip replacement or revision surgeries. This tool aims to assist healthcare professionals in making decisions regarding perioperative management strategies.

Data availability

All data included in this study are available upon request by contact with the corresponding author.

Abbreviations

PND:: Perioperative neurocognitive disorders
MLR:: Multiple logistic regression
ANN:: Artificial neural network
NB:: Naive Bayes
SVM:: Support vector machine
XgBoost:: Decision tree
ROC:: Receiver operating characteristic
IPA:: Prediction accuracy
DCA:: Decision curve analysis
SOFA:: Sequential Organ Failure Assessment
ASA:: American Society of Anesthesiologists
LASSO:: The Least Absolute Shrinkage and Selection Operator
AUC:: The area under the curve

References

Evered, L. et al. Recommendations for the nomenclature of cognitive change associated with anaesthesia and surgery-2018. Br. J. Anaesth. 121, 1005–1012. https://doi.org/10.1016/j.bja.2017.11.087 (2018).
Article CAS PubMed PubMed Central MATH Google Scholar
Goldstein, D. P. et al. Cognitive impairment and delirium in older patients undergoing major head and neck surgery. Otolaryngol. Head Neck Surg. 167, 97–99. https://doi.org/10.1177/01945998211045293 (2022).
Article PubMed MATH Google Scholar
Cao, E., Ma, D., Nayak, S. & Duong, T. Q. Deep learning combining FDG-PET and neurocognitive data accurately predicts MCI conversion to Alzheimer’s dementia 3-year post MCI diagnosis. Neurobiol Dis 187, 106310. https://doi.org/10.1016/j.nbd.2023.106310 (2023).
Article CAS PubMed Google Scholar
Rowley, P. et al. What do people expect of general anaesthesia?. Br. J. Anaesth. 118, 486–488. https://doi.org/10.1093/bja/aex040 (2017).
Article CAS PubMed MATH Google Scholar
Liu, F. et al. Subjective cognitive decline may mediate the occurrence of postoperative delirium by P-tau undergoing total hip replacement: The PNDABLE study. Front Aging Neurosci. 14, 978297. https://doi.org/10.3389/fnagi.2022.978297 (2022).
Article CAS PubMed PubMed Central Google Scholar
Labaste, F. et al. Postoperative delirium is a risk factor of institutionalization after hip fracture: an observational cohort study. Front Med. (Lausanne) 10, 1165734. https://doi.org/10.3389/fmed.2023.1165734 (2023).
Article PubMed Google Scholar
Wang, Y. et al. Identification of risk factors for postoperative delirium in elderly patients with hip fractures by a risk stratification index model: A retrospective study. Brain Behav. 11, e32420. https://doi.org/10.1002/brb3.2420 (2021).
Article PubMed PubMed Central Google Scholar
Fan, H. et al. Development and validation of a dynamic delirium prediction rule in patients admitted to the Intensive Care Units (DYNAMIC-ICU): A prospective cohort study. Int. J. Nurs. Stud. 93, 64–73. https://doi.org/10.1016/j.ijnurstu.2018.10.008 (2019).
Article PubMed MATH Google Scholar
Wang, F. et al. Potential value of serum uric acid in the identification of postoperative delirium in geriatric patients undergoing knee replacement. Front Aging Neurosci. 14, 909738. https://doi.org/10.3389/fnagi.2022.909738 (2022).
Article CAS PubMed PubMed Central Google Scholar
Lin, Y. et al. Potential value of serum lipid in the identication of postoperative delirium undergoing knee/hip arthroplasty: The perioperative neurocognitive disorder and biomarker lifestyle study. Front Psychiatry 13, 870317. https://doi.org/10.3389/fpsyt.2022.870317 (2022).
Article PubMed PubMed Central Google Scholar
Wang, R. et al. Perioperative platelet count in peripheral blood is associated with the early stage of PND after major orthopedic surgery: A prospective observational study. BMC Geriatr 22, 200. https://doi.org/10.1186/s12877-022-02899-7 (2022).
Article CAS PubMed PubMed Central MATH Google Scholar
Wu, J., Yin, Y., Jin, M. & Li, B. The risk factors for postoperative delirium in adult patients after hip fracture surgery: A systematic review and meta-analysis. Int. J. Geriatr. Psychiatry 36, 3–14. https://doi.org/10.1002/gps.5408 (2021).
Article PubMed Google Scholar
Dilmen, O. K., Meco, B. C., Evered, L. A. & Radtke, F. M. Postoperative neurocognitive disorders: A clinical guide. J. Clin. Anesth. 92, 111320. https://doi.org/10.1016/j.jclinane.2023.111320 (2024).
Article PubMed MATH Google Scholar
Sachdev, P. S. et al. Classifying neurocognitive disorders: the DSM-5 approach. Nat. Rev. Neurol. 10, 634–642. https://doi.org/10.1038/nrneurol.2014.181 (2014).
Article PubMed MATH Google Scholar
Collins, G. S., Reitsma, J. B., Altman, D. G. & Moons, K. G. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD statement. BMJ 350, g7594. https://doi.org/10.1136/bmj.g7594 (2015).
Article CAS PubMed Google Scholar
Chaiwat, O. et al. Postoperative delirium in critically ill surgical patients: incidence, risk factors, and predictive scores. BMC Anesthesiol. 19, 39. https://doi.org/10.1186/s12871-019-0694-x (2019).
Article PubMed PubMed Central MATH Google Scholar
Kim, E. M., Li, G. & Kim, M. Development of a risk score to predict postoperative delirium in patients with hip fracture. Anesth Analg. 130, 79–86. https://doi.org/10.1213/ane.0000000000004386 (2020).
Article PubMed PubMed Central MATH Google Scholar
Haight, T. N. & Marsh, E. B. Identifying delirium early after stroke: A new prediction tool for the intensive care unit. J. Stroke Cerebrovasc. Dis. 29, 105219. https://doi.org/10.1016/j.jstrokecerebrovasdis.2020.105219 (2020).
Article PubMed MATH Google Scholar
Mendyk-Bordet, A. M. et al. Endothelial dysfunction and pre-existing cognitive disorders in stroke patients. Biomolecules 14, 721. https://doi.org/10.3390/biom14060721 (2024).
Article CAS PubMed PubMed Central Google Scholar
Lelis, R. G. et al. Apolipoprotein E4 genotype increases the risk of postoperative cognitive dysfunction in patients undergoing coronary artery bypass graft surgery. J. Cardiovasc. Surg. (Torino) 47, 451–456 (2006).
CAS PubMed Google Scholar
Tang, X. et al. Association of kidney function and brain health: A systematic review and meta-analysis of cohort studies. Ageing Res. Rev. 82, 101762. https://doi.org/10.1016/j.arr.2022.101762 (2022).
Article CAS PubMed MATH Google Scholar
Pistacchi, M., Gioulis, M., Contin, F., Sanson, F. & Marsala, S. Z. Sleep disturbance and cognitive disorder: Epidemiological analysis in a cohort of 263 patients. Neurol. Sci. 35, 1955–1962. https://doi.org/10.1007/s10072-014-1870-x (2014).
Article PubMed Google Scholar
Butters, A. F. et al. Brain health clinics—An evolving clinical pathway?. J. Prev. Alzheimers Dis. 12, 100051. https://doi.org/10.1016/j.tjpad.2024.100051 (2025).
Article PubMed MATH Google Scholar
Kim, N. et al. Analysis of the incidence and risk factors of postoperative delirium in patients with degenerative cervical myelopathy. Neurospine 19, 323–333. https://doi.org/10.14245/ns.2142778.389 (2022).
Article PubMed PubMed Central MATH Google Scholar
Mariz, J. et al. Risk and clinical-outcome indicators of delirium in an emergency department intermediate care unit (EDIMCU): an observational prospective study. BMC Emerg. Med. 13, 2. https://doi.org/10.1186/1471-227x-13-2 (2013).
Article ADS PubMed PubMed Central MATH Google Scholar
Zhang, H. J., Ma, X. H., Ye, J. B., Liu, C. Z. & Zhou, Z. Y. Systematic review and meta-analysis of risk factor for postoperative delirium following spinal surgery. J. Orthop. Surg. Res. 15, 509. https://doi.org/10.1186/s13018-020-02035-4 (2020).
Article PubMed PubMed Central MATH Google Scholar
Chen, D. et al. Predicting postoperative delirium after hip arthroplasty for elderly patients using machine learning. Aging Clin. Exp. Res. 35, 1241–1251. https://doi.org/10.1007/s40520-023-02399-7 (2023).
Article ADS PubMed MATH Google Scholar
Choi, J. Y. et al. Development and validation of a prognostic classification model predicting postoperative adverse outcomes in older surgical patients using a machine learning algorithm: Retrospective observational network study. J. Med. Internet Res. 25, e42259. https://doi.org/10.2196/42259 (2023).
Article PubMed PubMed Central Google Scholar
Song, A. L. et al. Dynamic nomogram for predicting the risk of perioperative neurocognitive disorders in adults. Anesth. Analg. 137, 1257–1269. https://doi.org/10.1213/ane.0000000000006746 (2023).
Article CAS PubMed PubMed Central MATH Google Scholar
Li, Y. et al. Development and validation of prediction models for neurocognitive disorders in adult patients admitted to the ICU with sleep disturbance. CNS Neurosci. Ther. 28, 554–565. https://doi.org/10.1111/cns.13772 (2022).
Article CAS PubMed MATH Google Scholar

Download references

Acknowledgements

Special thanks to Jianbang Wu for the enhancement of the figures and tables. This research is funded by the Scientific and Technological Talent Support Program at Shaanxi Provincial People’s Hospital: (2023JY-37); Science and Technology Talent Support Program of Shaanxi Provincial People’s Hospital (2023BJ-02); Research Incubation Fund of Shaanxi Provincial People’s Hospital (2023YJY-01); Research Incubation Fund of Shaanxi Provincial People’s Hospital (2023YJY-32). The achievement was also supported by funding from the Shaanxi Province Health and Wellness Discipline Leader Visiting Scholar Program.

Author information

Gang Wang and Yi Xie share first authorship, they contributed equally to this work.

Authors and Affiliations

Department of Anesthesiology, Department of Emergency Surgery, Shaanxi Provincial People’s Hospital, Youyi West Road No. 256, Beilin District, Xi’an, 710000, China
Gang Wang, XiaRui Bai, Yuming Zhang & Jiao Guo
Faculty of Business, The Hong Kong Polytechnic University, Kowloon, China
Yi Xie
Xi’an Medical University, HanGuang North Road No. 74, Beilin District, Xi’an, 710000, China
Jiao Guo

Authors

Gang Wang
View author publications
Search author on:PubMed Google Scholar
Yi Xie
View author publications
Search author on:PubMed Google Scholar
XiaRui Bai
View author publications
Search author on:PubMed Google Scholar
Yuming Zhang
View author publications
Search author on:PubMed Google Scholar
Jiao Guo
View author publications
Search author on:PubMed Google Scholar

Contributions

Xie Yi contributed to data extraction and analysis, while Guo Jiao contributed to paper writing and data analysis. Wang Gang oversaw the paper revision, and Bai Xiarui contributed to the research design. Zhang Yuming provided financial support in this study. Each author contributed important intellectual content during manuscript drafting or revision and accepts personal accountability for the overall work, and agrees to ensure that questions pertaining to the accuracy or integrity of any portion of the work are appropriately investigated and resolved.

Corresponding author

Correspondence to Jiao Guo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Supplementary Material 3

Supplementary Material 4

Supplementary Material 5

Supplementary Material 6

Supplementary Material 7

Supplementary Material 8

Supplementary Material 9

Supplementary Material 10

Supplementary Material 11

Supplementary Material 12

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Cite this article

Wang, G., Xie, Y., Bai, X. et al. Development and validation comparison of multiple models for perioperative neurocognitive disorders during hip arthroplasty. Sci Rep 15, 9393 (2025). https://doi.org/10.1038/s41598-025-93324-7

Download citation

Received: 05 April 2024
Accepted: 06 March 2025
Published: 19 March 2025
Version of record: 19 March 2025
DOI: https://doi.org/10.1038/s41598-025-93324-7