Introduction

Total knee arthroplasty (TKA) has been recognized as one of the most significant advancements in effectively treating end-stage knee diseases since the 20th century1,2,3. With the increasing number of TKA procedures, there is growing attention to perioperative anemia, blood loss and blood transfusion4,5. Therefore, effective perioperative blood management is essential for promoting recovery and reducing the social and economic burden in TKA patients6,7.

Coagulation and bleeding are the two primary components of perioperative blood management, working together to maintain hemostatic balance8. However, the prevention of bleeding remains a concern among orthopedic surgeons treating patients who receive TKA9,10,11. Although regular use of anticoagulants has reduced the incidence of perioperative venous thromboembolism (VTE) to as low as 0.5%–1% in TKA patients12, postoperative blood loss still poses considerable management challenges. Perioperative blood loss in major orthopedic surgeries such as TKA can reach 1200 ml13, with over 53.2% of patients developing moderate or severe hemorrhagic anemia14. Inadequate management of perioperative bleeding can result in a series of bleeding-related complications, including ecchymosis, systemic hemorrhage, and coagulation disorders, significantly compromising postoperative functional recovery and patient satisfaction15,16. Therefore, to better navigate this trade-off, there is an urgent need for tools that help surgeons accurately predict bleeding events in patients undergoing anticoagulation therapy.

As one of the most common signs of postoperative bleeding events, postoperative ecchymosis has an incidence as high as 30–40%15,17,18,19. Defined as subcutaneous bleeding exceeding 3 mm in diameter, ecchymosis often accompanies wound swelling or severe swelling and pain in the lower limb, exacerbating local inflammation and signifying an increased tendency for further blood loss20,21,22. Meanwhile, postoperative ecchymosis generally indicates a hypocoagulable state18. Failure to promptly detect and intervene may significantly increase the risk of secondary bleeding, potentially leading to systemic coagulopathy, major hemorrhage, and even mortality23,24. Therefore, predicting postoperative ecchymosis can provide a time window for personalized perioperative blood management strategies, reducing the rates of postoperative anemia, blood loss and blood transfusion. Although several recent studies have explored the risk factors for postoperative ecchymosis17,19,25, the integration of these factors remains incomplete and lacks prospective evidence, and predictive accuracy and effectiveness need further improvement.

With the rapid development of machine learning (ML), its application in precise clinical diagnostics and treatments has gained increasing popularity26,27,28. Unlike traditional statistical methods, ML combines computer technology and artificial intelligence for data organization and mining, offering significant advantages in managing complex issues such as over-fitting and non-linearity in big data, often resulting in higher predictive accuracy29,30. To leverage ML for developing a predictive model for postoperative bleeding, using postoperative ecchymosis as an indicator, we designed this prospective cohort study to collect perioperative data from patients undergoing TKA for model development. Various ML techniques were then employed to validate this model and prospectively assess its accuracy. The predictive model is intended to offer a critical time window for personalized perioperative blood management strategies, supporting rapid recovery for TKA patients.

Participants and methods

Participants

From June 2023 to March 2024, patients diagnosed with end-stage osteoarthritis and scheduled for TKA at our medical center were screened based on strict inclusion and exclusion criteria. Inclusion criteria included: (1) Patients who received unilateral primary TKA due to end-stage knee diseases. (2) Age ≥ 18 years. Exclusion criteria included: (1) Patients with severe cardiovascular and cerebrovascular diseases (myocardial infarction, cerebral infarction, etc.); (2) Patients with significant bleeding tendencies (such as gastrointestinal active bleeding, cerebral hemorrhage, etc.); (3) Patients with systemic or local venous thromboembolism (such as deep vein thrombosis, pulmonary embolism, etc.); (4) Patients with previously defined coagulation dysfunction; (5) Patients with severe neuromuscular system diseases; (6) Patients scheduled for bilateral TKA or revision TKA; (7) Pregnant or breastfeeding patients. To maintain the precision of the trial and minimize variation, all surgical procedures were performed by the same experienced surgical team.

Surgical procedure

Following patient selection based on defined inclusion and exclusion criteria, all TKA surgeries were consistently performed by the same experienced surgical team, which included two senior surgeons with more than 20 years of experience to ensure standardization. All procedures strictly followed a standardized perioperative management protocol that included aspects such as anesthesia, pain control, medication, rehabilitation, and nursing. To ensure data integrity, all patients underwent identical blood tests and examinations consistently before and after surgery.

Each patient was positioned in a supine posture with a pneumatic tourniquet applied to the thigh of the affected leg. The surgeries adhered to standardized TKA protocols, involving precise bone cutting and fitting a uniform type of knee joint prosthesis. Intraoperative X-ray imaging ensured accurate positioning of the prosthesis before the wound was methodically sutured. A local 1 g dose of tranexamic acid (TXA) was administered, and no postoperative drainage tubes were used. Furthermore, each patient received a 1.5 g intravenous dose of TXA every 12 h post-surgery to manage bleeding.

General perioperative management

Anesthesia Protocol: All patients underwent TKA surgery using the same anesthetic agents, which included midazolam and propofol for sedation, remifentanil for pain relief, and vecuronium for muscle relaxation.

Pain Management Protocol: Post-surgical pain management for all patients was managed with a patient-controlled intravenous analgesia pump containing 0.1 mg of sufentanil and 100 mg of flurbiprofen axetil.

Infection Prevention Medication: From postoperative Day 1 to Day 3 (POD1-POD3), all patients received 1.5 g of cefuroxime every 12 h to prevent infection.

Mechanical and Drug Prophylaxis for VTE: To prevent VTE, intermittent pneumatic compression was administered twice daily, with each session lasting 20 min. Meanwhile, each patient received a once-daily postoperative subcutaneous injection of 4,000 IU of low-molecular-weight heparin.

Nursing Care Protocol: All enrolled patients were managed perioperatively by the same nursing team, receiving consistent intravenous fluid types and dosages, with meticulous monitoring and recording of input and output over 24 h.

Nutritional Management: On the morning of surgery, all patients consumed an equal dose of short peptides plus whole protein enteral nutrition. Their intake and output were rigorously monitored, and their nutritional status, including albumin and prealbumin levels, was closely observed before and after surgery.

Observation and grouping

All patients who underwent TKA surgery were closely monitored postoperatively for the occurrence of ecchymosis around the incision and thigh area and grouped accordingly. If ecchymosis with subcutaneous bleeding exceeding 3 mm in diameter was observed, its location and size were precisely documented, and the patient was categorized into the ecchymosis group. Patients without ecchymosis were classified into the non-ecchymosis group. All enrolled patients underwent the same blood tests and other relevant examinations pre-operatively and on POD1.

Blood loss calculation

The hidden blood loss (HBL) of every patient was calculated according to the formulas of Nadler et al.31 and Gross32:

1): Patient blood volume (PBV, mL) = [k1 × height (m)³ + k2 × weight (kg) + k3] × 1000, where k1 = 0.3669, k2 = 0.03219, and k3 = 0.6041 for male, and k1 = 0.3561, k2 = 0.03308, and k3 = 0.1833 for female.

2): Total blood loss (TBL, mL) = PBV × (preoperative Hct - postoperative Hct)/preoperative Hct.

3): Intraoperative blood loss (IBL) = volume collected by suction - volume of irrigation + weight of blood-soaked gauze, where 1 g of gauze = 1 ml.
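The three formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example patient are hypothetical, hematocrit is assumed to be entered as a fraction, and the final step from TBL and IBL to HBL is not spelled out in the text, so the sketch stops at the quantities that are defined.

```python
# Nadler coefficients (k1, k2, k3) by sex, exactly as listed in formula 1).
NADLER = {
    "male": (0.3669, 0.03219, 0.6041),
    "female": (0.3561, 0.03308, 0.1833),
}


def patient_blood_volume_ml(sex, height_m, weight_kg):
    """Formula 1): PBV in mL from height (m) and weight (kg)."""
    k1, k2, k3 = NADLER[sex]
    return (k1 * height_m ** 3 + k2 * weight_kg + k3) * 1000.0


def total_blood_loss_ml(pbv_ml, hct_pre, hct_post):
    """Formula 2): TBL in mL; Hct values are fractions (assumption)."""
    return pbv_ml * (hct_pre - hct_post) / hct_pre


def intraoperative_blood_loss_ml(suction_ml, irrigation_ml, gauze_gain_g):
    """Formula 3): IBL in mL, counting 1 g of blood-soaked gauze as 1 mL."""
    return suction_ml - irrigation_ml + gauze_gain_g * 1.0


# Hypothetical patient, for illustration only (not study data).
pbv = patient_blood_volume_ml("female", height_m=1.60, weight_kg=60.0)
tbl = total_blood_loss_ml(pbv, hct_pre=0.40, hct_post=0.34)
ibl = intraoperative_blood_loss_ml(suction_ml=400.0, irrigation_ml=300.0,
                                   gauze_gain_g=50.0)
```

Keeping each formula in its own function makes it easy to check every term against the definitions above and to swap in a different hematocrit denominator if another variant of the Gross formula is preferred.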

Feature selection

Fifty-six features were included. Patient demographics and surgical details comprised age, gender, BMI, surgical duration, intraoperative blood loss, and TBL; laboratory indicators were measured both preoperatively and on POD1. Routine hematology parameters included hemoglobin (Hb), hematocrit (HCT), platelet count (PLT), mean platelet volume (MPV), and platelet distribution width (PDW). Nutritional indicators included total protein, prealbumin, and albumin. The coagulation function indicators comprised prothrombin time (PT), activated partial thromboplastin time (APTT), thrombin time (TT), international normalized ratio (INR), fibrinogen, fibrinogen degradation products (FDP), and D-dimer. Thromboelastography (TEG) measured several critical parameters: reaction time (R), clot formation time (K), alpha angle (angle), maximum amplitude (MA), lysis at 30 min (LY30), and the coagulation index (CI), a composite measure of overall coagulation status derived from R, K, angle, and MA values33. The change in coagulation index was calculated as XCI = CI.POD1 − CI.PAD (PAD, postadmission day).

First, in the training set, univariate logistic regression was applied to screen for potential risk factors with a significance threshold of P < 0.05. This initial screening identified 15 candidate variables. Subsequently, to refine this list and select the most stable and important features, we employed three distinct machine learning-based algorithms on the training set: Least Absolute Shrinkage and Selection Operator (LASSO), the Boruta algorithm, and Random Forest-Recursive Feature Elimination (RF-RFE). The final set of key risk factors was determined by taking the intersection of the variables selected by all three of these methods.
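The final intersection step can be illustrated with a minimal Python sketch. The helper name and the per-method feature lists below are hypothetical and shown only to make the logic concrete; they are not the actual outputs of LASSO, Boruta, or RF-RFE in this study.

```python
def intersect_selected(*selections):
    """Return the features chosen by every method, in the order of the first list."""
    if not selections:
        return []
    common = set(selections[0]).intersection(*(set(s) for s in selections[1:]))
    return [feature for feature in selections[0] if feature in common]


# Hypothetical per-method selections (illustration only, not study output).
lasso_selected = ["Prealbumin.POD1", "CI.POD1", "XCI", "FDP.POD1", "TBL.POD1"]
boruta_selected = ["CI.POD1", "XCI", "FDP.POD1", "TBL.POD1",
                   "Prealbumin.POD1", "Hb.POD1"]
rf_rfe_selected = ["XCI", "CI.POD1", "TBL.POD1", "FDP.POD1", "Prealbumin.POD1"]

final_features = intersect_selected(lasso_selected, boruta_selected,
                                    rf_rfe_selected)
```

Requiring agreement among all methods is a conservative rule: a feature kept by only one or two selectors (here the hypothetical "Hb.POD1") is dropped.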

Model development

All TKA patients were assigned to the training and validation sets by admission date in a 7:3 temporal split defined at the onset of the study (first seven months for training, subsequent three months for validation). This temporal split was intentionally chosen over a random split to simulate a real-world clinical application scenario, where a model is built on historical data and validated on future patients. This approach provides a more rigorous assessment of the model’s generalizability and robustness over time and helps prevent temporal data leakage. It is important to note that no variables, including sex, were explicitly matched or stratified during this split. The distribution of all baseline characteristics was a result of the natural variation within these consecutively enrolled temporal cohorts.
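A temporal split of this kind can be sketched as follows. The cutoff date of 1 January 2024 is inferred from the stated enrollment window (June 2023 to March 2024, seven months then three months); the record layout and the field name "admitted" are assumptions made for illustration.

```python
from datetime import date

# Assumed cutoff: first seven months (Jun-Dec 2023) train, next three validate.
CUTOFF = date(2024, 1, 1)


def temporal_split(records, cutoff=CUTOFF):
    """Split records by admission date; no shuffling, matching, or stratification."""
    train = [r for r in records if r["admitted"] < cutoff]
    valid = [r for r in records if r["admitted"] >= cutoff]
    return train, valid


# Hypothetical patients (illustration only).
patients = [
    {"id": 1, "admitted": date(2023, 6, 15)},
    {"id": 2, "admitted": date(2023, 12, 31)},
    {"id": 3, "admitted": date(2024, 1, 1)},
    {"id": 4, "admitted": date(2024, 3, 20)},
]
train_set, validation_set = temporal_split(patients)
```

Because membership depends only on the admission date, no information from later patients can influence the model built on the earlier cohort.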

LASSO regression was primarily utilized for feature selection. It identified essential features by modifying the optimization objective function with a penalty term to encourage sparsity, thereby reducing many feature weights to zero. Meanwhile, LASSO was used to exclude non-significant features, thus constructing sparse and interpretable models to prevent overfitting34.
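For reference, the penalized objective described above can be written in its standard textbook form for a binary outcome. The notation is introduced here for illustration and is not taken from the study: y_i is the ecchymosis indicator, x_i the feature vector of patient i, m the number of candidate features, and λ the penalty strength.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}
\left\{
  -\frac{1}{n}\sum_{i=1}^{n}
  \left[ y_i \log p_i + (1 - y_i)\log(1 - p_i) \right]
  + \lambda \sum_{j=1}^{m} \lvert \beta_j \rvert
\right\},
\qquad
p_i = \frac{1}{1 + e^{-(\beta_0 + x_i^{\top}\beta)}}
```

As λ increases, more coefficients are driven exactly to zero, which is what makes the method usable as a feature selector.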

The Boruta algorithm was applied to evaluate feature importance with random forests by comparing each real feature against randomly permuted “shadow” features and retaining only the features that consistently outperform them35.

RF-RFE was used to integrate the inherent feature selection capability of Random Forest into a recursive feature elimination strategy. By iteratively training the model and eliminating the least important features, RF-RFE optimized feature subsets to enhance model performance, reduce computational overhead, and improve model interpretability36.

After these three ML techniques were applied to the training set to filter essential variables, their results were intersected to identify the critical risk factors. Using these risk factors, the predictive model was generated by multivariate logistic regression analysis and visualized as a nomogram.

Model evaluation and validation

Several key statistical tools were utilized to comprehensively assess our predictive model. The Receiver Operating Characteristic (ROC) curve was employed to evaluate the model’s ability to differentiate between two types of outcomes—whether or not ecchymosis occurred after TKA. Area Under the Curve (AUC) was applied to indicate diagnostic accuracy, reported with 95% confidence intervals (CI). Additionally, calibration curves were utilized to test the model’s predictive accuracy. Decision Curve Analysis (DCA) was applied to balance patient benefits and risks while assessing the practicality of clinical decisions37.
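As an illustration of what the AUC measures, it can be computed directly as the probability that a randomly chosen ecchymosis patient receives a higher predicted score than a randomly chosen non-ecchymosis patient, with ties counted as one half. The sketch below uses this pairwise definition; the function name and the toy data are hypothetical, and the study itself used R packages rather than this code.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example (illustration only): three of four pairs are ordered correctly.
toy_auc = auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```

The pairwise form is quadratic in sample size, which is acceptable for cohorts of a few hundred patients.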

To benchmark the performance of our primary logistic regression model and validate the robustness of the selected features, we also constructed and evaluated four additional prediction models on the validation set using the same set of five key features. This selection was deliberate to encompass a range of model architectures and strengths: Random Forest (RF) was chosen for its robustness against overfitting, its ability to model complex non-linear relationships without strong assumptions about the data distribution, and its inherent provision of feature importance measures38. EXtreme Gradient Boosting (XGBoost) was selected due to its high predictive performance and efficiency in handling structured/tabular data. Its gradient-boosting framework is particularly effective at capturing intricate patterns through sequential model correction39. Support Vector Machines (SVM) were employed for their effectiveness in high-dimensional spaces and their ability to find a clear margin of separation (hyperplane) between classes, which is beneficial for binary classification tasks like ours40. Light Gradient Boosting Machine (LGBM) was utilized for its computational speed and memory efficiency, especially on large datasets, while maintaining high accuracy comparable to other boosting algorithms. For comparative purposes, a Logistic Regression (LM) model was also included as a well-understood and interpretable baseline, against which the performance of the more complex ML models could be benchmarked. The evaluation focused on the comparison of ROC curves, the analysis of feature importance distribution, and the assessment of SHAP values. Additionally, the performance metrics for each model—AUC, Accuracy, Sensitivity, Specificity, F1 Score, Recall, and Precision—were compared and summarized to validate the efficacy of the predictive model and identify the optimal ML approach.
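The threshold-based metrics listed above all derive from the same 2 × 2 confusion matrix, as the following sketch shows. The function name and the toy labels are hypothetical. Note that recall and sensitivity are the same quantity, so the two are reported under one key.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity (recall), specificity, precision and F1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Guard against empty denominators (e.g. no predicted positives).
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity_recall": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy labels (illustration only): 1 = ecchymosis, 0 = no ecchymosis.
metrics = classification_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                                 [1, 1, 0, 1, 0, 0, 0, 0])
```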

Data analyses

Data processing was conducted using R software version 4.2.3 in this study. Missing values were addressed using Multiple Imputation by Chained Equations (MICE), and variables with more than 5% missing data were excluded from the analysis41. Descriptive analyses were performed on all patients included in the study. Continuous variables following a normal distribution were represented by mean ± standard deviation (SD) and analyzed using the t-test. For non-normally distributed continuous variables, the median (interquartile range) [M (P25, P75)] was used, with group comparisons conducted via the Mann-Whitney U test. Categorical variables were presented as n (%) and analyzed using the chi-square test, with Fisher’s exact test applied when chi-square conditions were not met. All statistical tests were two-sided, and a P-value of < 0.05 was considered statistically significant. Multiple ML methods were implemented in R with packages in the supplementary materials.

Ethical considerations

This prospective cohort study was conducted in accordance with the CONSORT standards42. Ethical approval was granted by the local Ethics Committee, and the study was registered with the Chinese Clinical Trial Registry (registration No. ChiCTR2400080173) before participants were enrolled. All eligible patients were fully informed of the benefits and risks associated with the study before signing the informed consent form to participate, and were advised that they could withdraw from the study at any point if they experienced any discomfort. All methods were performed in accordance with relevant guidelines.

Results

Baseline data

From June 2023 to March 2024, 468 patients undergoing TKA were screened. After excluding 52 patients according to the inclusion and exclusion criteria, 416 were prospectively analyzed in this study (Fig. 1). Among these patients, 135 developed ecchymosis after TKA, constituting 32.5% of the total cohort. Ecchymosis of varying degrees and sizes was observed in areas such as the popliteal fossa, groin, lateral thigh, and around the surgical wound (Fig. 2A–F). Patients with post-TKA ecchymosis had significantly lower American Knee Society (AKS) scores than those without ecchymosis (P < 0.001, Fig. 2G), significantly higher TBL on POD1 (P < 0.001, Fig. 2H), and a significantly lower CI (P < 0.001, Fig. 2I). These findings indicate a hypocoagulable state and a higher risk of postoperative bleeding in patients with post-TKA ecchymosis. Consequently, constructing a post-TKA ecchymosis predictive model is necessary.

Fig. 1 The flowchart of this study. TKA, total knee arthroplasty. UKA, unicompartmental knee arthroplasty. RF, random forest. SVM, support vector machine. Xgboost, extreme gradient boosting. Lgbm, lightgbm.

Fig. 2 Appearance of ecchymosis on different body parts and their different scores compared to non-ecchymosis population. (A-F): Ecchymosis of varying degrees and sizes appeared in different areas. (A): Popliteal ecchymosis. (B): Ecchymosis at the base of the thigh. (C): Ecchymosis on the calf. (D): Ecchymosis on the lateral thigh. (E): Ecchymosis at the wound site. (F): Ecchymosis adjacent to the wound. (G): Lower AKS scores in ecchymosis patients. (H): Higher TBL in ecchymosis patients. (I): Lower CI values in ecchymosis patients. AKS, American Knee Society. TBL, total blood loss. CI, coagulation index. ***, P < 0.001.

The construction of the predictive model was based on the training and validation sets. According to the admission dates, data from 312 patients (non-ecchymosis group: 205, ecchymosis group: 107) from June to December 2023 were assigned to the training set, while data from 104 patients (non-ecchymosis group: 76, ecchymosis group: 28) from January to March 2024 were allocated to the validation set (Fig. 1). Baseline data for patients in the training and validation sets are presented in Table 1, while complete baseline data for all participants are available in Table S1. The rationale for the division into training and validation sets is detailed in Table S2.

Table 1 Comparison results of general clinical characteristics on training set and validation set.

Feature selection

Based on the training set, 15 potential risk factors were initially identified using univariate logistic regression with a significance threshold of P < 0.05; details can be found in Table 2 and Figure S1. The results of variable filtering and feature importance selection using three ML algorithms (LASSO, Boruta, RF-RFE) are displayed in Fig. 3. The results of all three methods were consistent, identifying five key factors on POD1: lower prealbumin (P = 0.006), CI (P = 0.002), and XCI (P < 0.001), and elevated FDP (P < 0.001) and TBL (P = 0.009), as detailed in Table 3.

Table 2 Single factor logistic regression analysis in training set.

Fig. 3 Three machine learning methods for variable selection. (A&B): Log (λ) value and the five variables selected by LASSO. (C&D): The results selected using the Boruta. (E&F): The RF-RFE selected five variables with the minimum error, and the final distribution of variable importance.

Table 3 Multiple logistic regression analysis in training set.

Construction of prediction model

Three ML algorithms (LASSO, Boruta, and RF-RFE) were applied to verify the five key variables essential for constructing the predictive model. The model’s coefficients were visualized in a nomogram (Fig. 4A), which allows clinicians to input a patient’s clinical data and rapidly compute the probability of postoperative ecchymosis in TKA patients. Because the model is a multivariate logistic regression, the fitted formula gives the linear predictor (logit) of ecchymosis: logit(P) = 1.186066 − 0.017569 × Prealbumin.POD1 + 0.002506 × TBL.POD1 − 0.463744 × CI.POD1 − 1.163408 × XCI + 0.112593 × FDP.POD1, and the predicted probability is P = 1/(1 + e^(−logit(P))). Meanwhile, the prediction model has been developed into a dynamic nomogram and made available on a website, allowing real-time online prediction of the probability of ecchymosis after TKA (Fig. 4B). The specific information about the online version of the prediction model can be found in Figures S2 & S3. Subsequently, the five individual variables were compared with our nomogram in the training and validation sets (Fig. 4C and D). The ROC curves show that the nomogram’s AUC values are superior to those of the individual variables, supporting the effectiveness of our prediction model. Finally, to assess the model’s discriminatory power and consistency, we applied the model’s formula to calculate the ecchymosis probability for every patient in both the training and validation sets. The model-assigned probability scores were significantly higher for patients in the observed ecchymosis group than for those in the non-ecchymosis group (Fig. 4E and F). This consistent pattern across both datasets indicates that the model reliably separates the two outcome groups, supporting its repeatability and clinical utility.
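The reported coefficients can be applied to a single patient as sketched below. Two assumptions are made and should be checked against the original model output: first, that the published formula is the logit of a logistic regression, so a logistic transform is needed to obtain a probability; second, that inputs are entered in the same units used to fit the model, which the text does not state. The example patient values are hypothetical.

```python
import math

# Coefficients as reported in the text; "intercept" is the constant term.
COEFFICIENTS = {
    "intercept": 1.186066,
    "Prealbumin.POD1": -0.017569,
    "TBL.POD1": 0.002506,
    "CI.POD1": -0.463744,
    "XCI": -1.163408,
    "FDP.POD1": 0.112593,
}


def linear_predictor(patient):
    """Sum of intercept and coefficient-weighted inputs (logit scale, assumed)."""
    return COEFFICIENTS["intercept"] + sum(
        coef * patient[name]
        for name, coef in COEFFICIENTS.items()
        if name != "intercept"
    )


def ecchymosis_probability(patient):
    """Logistic transform of the linear predictor (assumed model form)."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(patient)))


# Hypothetical patient; values and units are illustrative assumptions only.
example = {"Prealbumin.POD1": 200.0, "TBL.POD1": 500.0,
           "CI.POD1": -1.0, "XCI": -1.5, "FDP.POD1": 5.0}
example_logit = linear_predictor(example)
example_probability = ecchymosis_probability(example)
```

With these signs, a lower prealbumin, a lower CI, and a larger postoperative drop in CI (more negative XCI) all raise the predicted probability, as do higher TBL and FDP.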

Fig. 4 Construction of the nomogram. (A): The nomogram for predicting ecchymosis. (B): The registration on the website for the dynamic nomogram. (C): The ROC curves of the nomogram and other variables in the training set. (D): The ROC curves of the nomogram and other variables in the validation set. (E): The nomoscore between non-ecchymosis group and ecchymosis group in the training set. (F): The nomoscore between non-ecchymosis group and ecchymosis group in the validation set.

Evaluation of the prediction model

The initial validation of our predictive model demonstrated robust performance. The AUC for the training cohort was 0.927 (95% CI, 0.891–0.962) (Fig. 5A), while the AUC for the validation cohort reached 0.954 (95% CI, 0.910–0.998) (Fig. 5B). Calibration plots and Hosmer-Lemeshow tests (P > 0.05) indicated good consistency between the predicted probabilities of post-TKA ecchymosis and the observed probabilities in both the training and validation sets (Fig. 5C and D). To assess the clinical application of our nomogram, Decision Curve Analysis (DCA) was conducted, revealing that the nomogram provided greater net benefits than other models (Fig. 5E and F). This evidence supports the model’s accuracy and clinical utility in predicting the risk of ecchymosis post-TKA.
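The net benefit plotted in a decision curve follows the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold probability. The sketch below implements that textbook definition together with the "treat all" reference strategy; the function names and toy data are hypothetical and do not reproduce the study's curves.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: every patient is treated as high risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy predictions (illustration only): the model ranks both events on top.
toy_probs = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 1, 0, 0]
model_nb = net_benefit(toy_probs, toy_labels, threshold=0.5)
treat_all_nb = net_benefit_treat_all(toy_labels, threshold=0.5)
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero, the net benefit of treating no one.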

Fig. 5 Evaluation of the prediction model in the training and validation sets. (A): The ROC curve of the training set. CI, confidence interval. (B): The ROC curve of the validation set. (C): The calibration plot of the training set. (D): The calibration plot of the validation set. (E): The DCA curve of the training set. (F): The DCA curve of the validation set.

Model validation based on multiple ML algorithms

The predictive performance of the five key features was further substantiated by building four independent prediction models on the validation set using different ML algorithms. RF, SVM, XGBoost, and LGBM models were used to reassess feature importance, draw ROC curves, and create SHAP value plots. These analyses consistently confirmed the strong predictive capability of the model (Fig. 6).

Fig. 6 Validation of the prediction model using multiple machine learning methods. (A): Variable importance of the validation set by Random Forest. (B): ROC curve of the validation set by Random Forest. (C): SHAP diagram of the validation set by Random Forest. (D): Variable importance of the validation set by Support Vector Machine. (E): ROC curve of the validation set by Support Vector Machine. (F): SHAP diagram of the validation set by Support Vector Machine. (G): Variable importance of the validation set by Extreme Gradient Boosting. (H): ROC curve of the validation set by Extreme Gradient Boosting. (I): SHAP diagram of the validation set by Extreme Gradient Boosting. (J): Variable importance of the validation set by Lightgbm. (K): ROC curve of the validation set by Lightgbm. (L): SHAP diagram of the validation set by Lightgbm. (M): Comparison of ROC curves among various machine learning methods. (N): Model performance of the various machine learning methods.

All of the ML methods achieved AUC values exceeding 0.9, indicating excellent model performance. Specifically, the AUC scores were as follows: LGBM: 0.949 (95% CI, 0.907–0.992); LM: 0.954 (95% CI, 0.910–0.998); RF: 0.954 (95% CI, 0.913–0.996); SVM: 0.959 (95% CI, 0.919–0.998); XGBoost: 0.969 (95% CI, 0.939–0.999). Details comparing the effectiveness of each ML method are shown in Table 4 and Fig. 6. In summary, the predictive model achieved satisfactory results across various ML validations, demonstrating its robustness and effectiveness in clinical settings.

Table 4 All the ML prediction models’ outcomes in the validation set.

Discussion

Osteoarthritis is one of the most debilitating diseases worldwide, and TKA remains one of the most effective treatments for end-stage osteoarthritis43. Researchers have long explored bleeding and coagulation management after TKA. It is well known that maintaining a delicate balance between coagulation and bleeding is challenging. Hence, personalized perioperative blood management was implemented to decrease blood loss without increasing the risk of VTE10,12,17,44,45. As a representative clinical sign of minor bleeding, newly formed ecchymosis also indicates a hypocoagulable state and deserves attention when establishing blood management schemes15,18,45. In the present study, taking ecchymosis as the “barometer” of postoperative bleeding, we constructed a post-TKA ecchymosis predictive model using ML procedures, which could be applied to predicting postoperative bleeding and devising postoperative anticoagulation and hemostasis strategies.

The clinical significance of predicting ecchymosis extends beyond its appearance. As our results showed, patients with ecchymosis had significantly lower early postoperative AKS scores (Fig. 2G). This suggests that ecchymosis, often accompanied by swelling and pain, can directly impede early functional recovery by discouraging active participation in physical therapy and limiting joint mobilization. Furthermore, extensive ecchymosis can cause significant patient anxiety and dissatisfaction, potentially affecting the overall perception of surgical success. Therefore, the ability to predict ecchymosis is not merely about anticipating a cosmetic issue, but about proactively managing a factor that can compromise early rehabilitation and patient-reported outcomes.

Postoperative anemia is still one of the most common complications following TKA44,46. Increased hidden blood loss is a primary cause of postoperative anemia; therefore, balancing coagulation and bleeding is essential to decrease postoperative blood loss. Clinically, anticoagulation therapy in the early postoperative period remains controversial for ecchymosis patients, as it undoubtedly increases the risk of bleeding-related complications15. In a study from Germany, 42.9% of patients experienced bleeding complications after using anticoagulants, with 6.1% suffering from major bleeding, posing significant challenges for clinicians47. Similarly, for patients prone to bleeding, such as those with ecchymosis, anticoagulation raises the risk of developing bleeding, while discontinuing it in the short term raises concerns about an increased risk of thrombosis. Therefore, earlier and accurate prediction of ecchymosis and intervention in risk factors can significantly benefit postoperative blood management and maintain the balance between coagulation and bleeding.

The rapid development of artificial intelligence has promoted advancements across various industries. Machine learning (ML), a branch of artificial intelligence, excels at managing, processing, and analyzing multidimensional data from large datasets efficiently and accurately, which can be beneficial in clinical practice48. Unlike traditional statistical methods, ML adeptly identifies complex, multidimensional, and nonlinear relationships between clinical features in larger datasets, enhancing predictive performance. Consequently, it has achieved considerable success across various medical fields49.

In the field of perioperative blood management after orthopedic surgery, numerous scoring systems and scales have been developed to assess the risk of VTE50,51,52. However, predictive models that can anticipate and manage bleeding tendencies remain rare. To address this gap, this prospective observational cohort study was designed to predict postoperative ecchymosis in patients undergoing TKA. Patients were closely monitored and grouped based on the presence of ecchymosis around the surgical wound and thigh, ensuring accurate data collection. Furthermore, the data were divided into a training set and a validation set in a 7:3 ratio based on chronological order. This setup enabled us to build the predictive model using the training set and subsequently conduct a temporal validation using the held-out future cohort. It evaluates the model’s performance in a manner that closely mimics a prospective clinical application, thereby providing a more realistic estimate of the model’s accuracy upon deployment.

In selecting features for the predictive model, a comprehensive range of perioperative indicators was included, encompassing patient demographics, surgical details, nutritional status, coagulation profiles, TEG data, etc. Three variable selection techniques (LASSO, RF-RFE, and Boruta) were employed and identified five critical risk factors, enhancing the reliability and persuasiveness of the predictive model. To further validate the predictive model, multiple ML algorithms were employed, achieving strong results across key performance metrics such as AUC, sensitivity, specificity, F1 score, and recall. These outcomes not only support the accuracy and feasibility of our predictive model but also highlight its potential applicability in clinical decisions.

Ecchymosis is regarded as a manifestation of subcutaneous hematoma, with current studies confirming that increased perioperative blood loss is linked to a higher likelihood of ecchymosis15,17. Correspondingly, postoperative ecchymosis could serve as a window into systemic coagulation status, as previously reported18. CI, a TEG parameter that reflects overall systemic coagulation status, is more sensitive than traditional coagulation profiles and is widely used33. A lower CI typically indicates a tendency for bleeding, which aligns with our findings. Moreover, our findings suggest that a larger decline in CI from admission to POD1 (that is, a lower XCI) is also a risk factor for postoperative ecchymosis. This could be attributed to XCI’s capacity to objectively reflect changes in coagulation states, thus influencing perioperative blood loss. Additionally, as a nutritional marker closely related to nutritional reserves and liver function, reduced prealbumin levels were observed more frequently in patients with ecchymosis. It has been reported that insufficient nutrient reserves can impair the synthetic function of liver cells, directly leading to lowered levels of prealbumin and coagulation function53,54. Overall, the nutritional status of patients offers new insights into the treatment and management of ecchymosis. Blood clot formation and fibrinolysis are two complementary actions of coagulation55. Maintaining a balance between fibrinolysis and coagulation is crucial for hemostasis without excessive thrombosis risk. Excessive activation of fibrinolysis generally indicates a bleeding tendency56. In the present study, we found that patients with ecchymosis exhibited higher levels of FDP on POD1, indicating heightened fibrinolysis. These results further support the rationale of our model, as its AUC values exceed those of each individual risk factor (Fig. 4C and D), consistently across both the training and validation sets.

Regarding the natural history of this condition, postoperative ecchymosis typically follows a self-limiting course. Based on clinical observation and previous reports, the visible bruising often begins to fade within 1–2 weeks and usually resolves completely within 2–4 weeks, following the natural process of hemoglobin breakdown and reabsorption. However, the impact of its initial presence on the critical early recovery phase remains a concern. While our model effectively identifies patients at risk, the critical next step is to investigate how this early postoperative event influences long-term functional outcomes and overall prognosis after TKA. Future prospective studies with longer follow-up periods are essential to determine if ecchymosis is an independent risk factor for slower recovery, reduced range of motion, or lower patient satisfaction scores at 6 or 12 months. Such research would definitively establish the full clinical importance of our predictive model and guide the development of targeted interventions for high-risk patients.

This study has some limitations. Firstly, as a single-center study, there is potential bias in patient selection, and external validation with data from other medical institutions could strengthen our predictive model’s credibility. Secondly, the TBL for each patient was calculated using a formula, which could be influenced by the volume of perioperative fluids administered. To mitigate this, a uniform fluid regimen was implemented for all participants, with daily fluid intake and output meticulously recorded. Despite these limitations, this study is still a carefully designed prospective cohort study. All enrolled patients underwent consistent blood testing and related examinations, and were closely monitored for the occurrence of ecchymosis, ensuring the accuracy and reliability of our data.

Conclusion

In the present study, three ML algorithms were employed to screen features and identified five independent risk factors contributing to ecchymosis: reduced prealbumin, CI, and XCI, and increased FDP and TBL on POD1. Based on these factors, a predictive model for the occurrence of ecchymosis after TKA was developed and validated using various ML techniques. The implementation of this model in clinical practice may enable a more precise assessment of postoperative bleeding risks and thereby improve blood management and facilitate recovery through early intervention on identified risk factors.