Introduction

Acute Cerebral Venous Thrombosis (CVT) represents an uncommon disorder, accounting for approximately 0.5–1.5% of all stroke cases1,2. Its clinical presentation varies widely, ranging from chronic headaches to severe encephalopathy, often accompanied by epilepsy and loss of consciousness3,4. Due to its rarity and diverse manifestations, CVT is frequently overlooked during initial consultations, leading to diagnostic delays3,5,6. Failure to promptly recognize and administer anticoagulation therapy can result in clot progression, increasing the risk of coma or death. Therefore, early and accurate identification of CVT is critical to improving patient outcomes7.

Approximately 40% of CVT cases are associated with intracranial hemorrhage, which is an independent risk factor for poor prognosis. Patients presenting with acute intraparenchymal hematomas typically undergo nonenhanced CT (NECT) as an initial diagnostic evaluation in the emergency setting8,9. CVT-related hematomas can involve all lobes of the brain, including the basal ganglia, making it challenging for radiologists to distinguish them from hematomas caused by hypertension (HP), cerebral amyloid angiopathy (CAA), and arteriovenous malformation (AVM) based solely on visual assessment of NECT scans. Despite the diagnostic complexity, studies on the CT characteristics of CVT-related hematomas remain limited. Small juxtacortical hemorrhages, though recognized as key indicators, are observed in only a subset of patients10. Given these challenges, developing advanced analytical tools to characterize NECT features of CVT-related hematomas is imperative. Such tools could enhance radiologists’ ability to accurately identify CVT-related hemorrhages, thereby facilitating timely venography and arteriography when necessary8.

Radiomics is a non-invasive imaging technique that extracts quantitative features related to shape, intensity, and texture from radiographic images, enabling detailed computational analysis11,12,13. Its efficacy in differentiating tumor genotypes and predicting disease outcomes prior to surgery has been well established14. Unlike conventional statistical models, machine learning algorithms can effectively handle complex, high-dimensional relationships among multiple variables15,16,17. Consequently, integrating radiomics with machine learning methods has led to significant advancements in predictive modeling. Radiomics has been successfully used in the diagnosis of intracerebral hemorrhage (ICH), including differentiating primary from secondary ICH, classifying AVM-related hematomas, and distinguishing intracranial contrast extravasation from hemorrhage18,19,20,21,22. The integration of radiomics and machine learning therefore offers a novel approach to the differential diagnosis of cerebral hemorrhage. Such a method could be applied in the emergency room to automatically identify intraparenchymal hemorrhage caused by CVT, and could improve diagnostic accuracy for a condition that is prone to misdiagnosis. This study aims to characterize the radiomic signatures of CVT-related hematomas and develop a predictive model capable of distinguishing them from other hematoma types in NECT scans.

Materials and methods

Patient selection and image acquisition

This study analyzed 473 cases of intraparenchymal hemorrhage from two medical centers between January 2013 and January 2021. Patients were eligible if they met the following criteria: (1) acute, non-traumatic intracerebral hemorrhage confined to the parenchyma, (2) NECT performed within 24 hours of symptom onset, and (3) confirmation of the hematoma etiology via MR venography (MRV), CT angiography (CTA), or digital subtraction angiography (DSA). Patients were excluded if they met any of the following criteria: (1) substantial intraventricular or subarachnoid hemorrhage, (2) hematomas located in the cerebellum or brainstem, or (3) inadequate imaging quality. After applying these criteria, 275 patients were included in the final analysis. The training cohort consisted of 192 patients from Center 1 (46 with confirmed CVT and 146 with HP, CAA, or AVM), while the external validation cohort comprised 83 patients from Center 2 (24 with confirmed CVT and 59 with HP, CAA, or AVM). The patient enrollment process is shown in Figure 1.

Fig. 1

The flow chart of the study shows the recruitment pathway for patients.

Each patient underwent NECT imaging using a Philips 128-layer micro-fat CT scanner with a biplane configuration. The scanning protocol encompassed the entire head, from the skull vertex to the skull base, using acquisition parameters of 120 kV, 300 mA, 64 × 0.625 mm detector collimation, 512 × 512 matrix size, 0.8 beam pitch, 0.5 s rotation time, and a slice thickness of 2.5 mm. Image quality was assessed, and scans with severe motion artifacts were excluded. Since voxel dimensions may vary across institutions due to differences in imaging techniques, all images were resampled to a voxel size of 0.5 × 0.5 × 5 mm³ to standardize image resolution across datasets. This study was approved by the local Ethics Committee of the 900TH Hospital (No. 2023-039). Due to the anonymity of the data collected from the retrospective cohorts, the need for informed consent was waived by the local Ethics Committee of the 900TH Hospital. All procedures in this study adhered to the ethical guidelines and rules outlined in the current version of the Declaration of Helsinki.

Hematoma segmentation and feature extraction

The radiomics analysis in this work followed a five-step process: hematoma segmentation, radiomics feature extraction and fusion, radiomics feature selection, model development, and model comparison (Figure 2). Hematoma segmentation was performed manually on a slice-by-slice basis using 3D Slicer (version 5.4.0, https://www.slicer.org/) to delineate the regions of interest (ROIs). Two experienced neurosurgeons (referred to as Reader A and Reader B), blinded to clinical information and final diagnoses, independently conducted the segmentation. To evaluate intra-rater and inter-rater reliability, thirty randomly selected cases were re-segmented. For intra-rater reliability, Reader A repeated the segmentation after one month; for inter-rater reliability, Reader B independently segmented the same cases. The agreement of the extracted features was then quantified using intraclass correlation coefficients (ICCs) for both analyses. ICC values exceeding 0.80 were considered indicative of strong agreement.
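The reproducibility check described above can be illustrated with a small sketch. The text does not state which ICC form was used, so this example assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1); the function name `icc_2_1` and the toy ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per case and one column per reader.

    Assumption: two-way random effects, absolute agreement, single measurement.
    The source text does not specify the ICC form.
    """
    n = len(ratings)        # number of cases
    k = len(ratings[0])     # number of readers (or repeated sessions)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-case mean square
    msc = ss_cols / (k - 1)                 # between-reader mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Toy values of one feature measured on five cases by two readers (made up).
feature_values = [[10.1, 10.3], [7.9, 8.0], [12.4, 12.1], [5.2, 5.5], [9.0, 9.1]]
icc = icc_2_1(feature_values)
is_reproducible = icc > 0.80   # threshold taken from the text
```

In practice the same calculation would be repeated for every extracted feature, once for the intra-rater pair and once for the inter-rater pair.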

Fig. 2

The workflow of the radiomics model construction.

Conventional radiomics (Rad) features were extracted using PyRadiomics (https://pyradiomics.readthedocs.io/en/latest/), encompassing geometric, intensity-based, texture, and filter-based features. Geometric attributes described the 3D shape of the hematoma, while texture features captured spatial intensity distributions using multiple statistical descriptors. The texture feature classes included the gray-level co-occurrence matrix (GLCM), gray-level dependence matrix (GLDM), gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), and neighborhood gray-tone difference matrix (NGTDM). In total, 1,834 radiomics features were extracted per patient (Table S1).

DL features were extracted using ResNet101, a convolutional neural network (CNN) architecture. To help the model generalize across patient populations with notable variability, we applied transfer learning, initializing the model with weights pre-trained on the ImageNet database. For each patient, the slice containing the largest hematoma was selected as input for model training. DL features were obtained from the penultimate layer of the ResNet101 model (Table S2). Given the high dimensionality of CNN-derived features, principal component analysis (PCA) was used to reduce the feature set to 512 dimensions for subsequent analysis.

Conventional radiomics and DL feature selection

We used the synthetic minority over-sampling technique (SMOTE) to balance the minority class in the training set. All extracted features were standardized using the z-score method before feature selection. Both types of features were then filtered in three steps. First, the Mann–Whitney U test was applied to all features, keeping only those with a P-value less than 0.05. Second, to mitigate collinearity, we examined the correlations between features using Pearson’s correlation coefficient and excluded one feature from each pair with a correlation coefficient above 0.9. Finally, least absolute shrinkage and selection operator (LASSO) regression was applied to the conventional Rad and DL features in the training set. The LASSO model applies a regularization weight (λ) that shrinks regression coefficients toward zero, effectively eliminating many irrelevant features. The optimal λ was chosen by tenfold cross-validation as the value yielding the lowest cross-validation error. The features that retained nonzero coefficients were used in the regression model and combined to create the Rad signature and the DL signature.
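Two of the steps above, z-score standardization and the Pearson collinearity filter, can be sketched in plain Python. This is only an illustration: the text does not say which member of a correlated pair was discarded, so the sketch assumes the feature listed first is kept; the feature names and values are hypothetical, and SMOTE, the Mann–Whitney U filter, and LASSO are not shown.

```python
import math


def zscore(values):
    """Standardize one feature column to zero mean and unit variance."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]   # assumes the column is not constant


def pearson(x, y):
    """Pearson correlation coefficient between two equally long columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_collinear(features, threshold=0.9):
    """Return the names of features kept after removing highly correlated ones.

    Assumption: for each pair with |r| above the threshold, the feature that
    appears earlier in the mapping is kept and the later one is dropped.
    """
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature table: name -> one value per patient.
raw = {
    "feature_a": [1.0, 2.0, 3.0, 4.0],
    "feature_b": [2.0, 4.0, 6.0, 8.1],   # nearly proportional to feature_a
    "feature_c": [4.0, 1.0, 3.0, 2.0],
}
standardized = {name: zscore(col) for name, col in raw.items()}
selected = drop_collinear(standardized, threshold=0.9)
```

Because Pearson's coefficient is unchanged by z-scoring, the filter gives the same result on raw or standardized columns; standardization matters for the subsequent LASSO step, where the penalty depends on feature scale.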

This study explores a hybrid approach that integrates the two aforementioned feature types to develop the Deep Learning Radiomics (DLR) signature. Early fusion was applied to combine features extracted from both the conventional radiomics method and the DL framework, creating a comprehensive feature set. Feature selection was then performed using the same statistical and regression techniques.
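Early fusion, as used here, amounts to concatenating the two feature vectors of each patient before any selection takes place. A minimal sketch follows; the prefixes and feature names are hypothetical and only keep the origin of each feature traceable.

```python
def early_fusion(rad_features, dl_features):
    """Concatenate handcrafted and deep learning features for one patient.

    Prefixes are an illustrative convention (an assumption), used to avoid
    name collisions and to keep the source of each feature visible.
    """
    fused = {}
    for name, value in rad_features.items():
        fused["rad_" + name] = value
    for name, value in dl_features.items():
        fused["dl_" + name] = value
    return fused


# Hypothetical per-patient feature dictionaries.
patient_rad = {"shape_volume": 1.2, "glcm_contrast": 0.7}
patient_dl = {"pc_1": -0.3, "pc_2": 0.9}
patient_fused = early_fusion(patient_rad, patient_dl)
```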

Development of the three radiomics models

Following attribute selection, two types of signatures were developed: a conventional Rad signature based on handcrafted features and a DL signature derived from DL features. Additionally, a DLR signature was constructed by integrating both feature types. In this study, the Random Forest (RF) algorithm was employed to develop the signature models. RF is ideal for managing high-dimensional data and large-scale datasets, demonstrating strong robustness against outliers and noise by aggregating the predictions of multiple decision trees, thus minimizing the likelihood of overfitting. Furthermore, it provides feature importance estimates, facilitating the identification of the most essential attributes affecting the model’s predictions. Optimal hyperparameters were identified through a grid search approach coupled with fivefold cross-validation within the training cohort.

Development of the DLRN

To enhance clinical applicability, Chi-square tests were performed on all clinical features to identify those with significant associations. Epilepsy was selected as a distinguishing clinical factor. Subsequently, a DLR nomogram (DLRN) was created by combining the DLR signature with the clinical signature through logistic regression. The effectiveness of the DLRN model in diagnosis was assessed using an external validation group.
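A logistic-regression nomogram of this kind maps a linear combination of its inputs to a probability. The sketch below shows the form of that calculation only. The fitted coefficients are not reported in the text, so the intercept and weights here are placeholder values chosen for illustration, and `dlrn_probability` is a hypothetical name.

```python
import math

# Placeholder coefficients (assumptions for illustration, NOT the fitted model).
INTERCEPT = -2.0
COEF_DLR = 4.0
COEF_EPILEPSY = 1.0


def dlrn_probability(dlr_score, has_epilepsy,
                     intercept=INTERCEPT,
                     coef_dlr=COEF_DLR,
                     coef_epilepsy=COEF_EPILEPSY):
    """Predicted probability that a hematoma is CVT-related.

    dlr_score: output of the DLR signature (assumed here to lie in [0, 1]).
    has_epilepsy: True if seizures were present on admission.
    """
    linear = intercept + coef_dlr * dlr_score + coef_epilepsy * (1 if has_epilepsy else 0)
    return 1.0 / (1.0 + math.exp(-linear))


p_without = dlrn_probability(0.5, has_epilepsy=False)
p_with = dlrn_probability(0.5, has_epilepsy=True)
```

With a positive epilepsy coefficient, the presence of seizures raises the predicted probability for the same DLR score, which is the behaviour a nomogram axis for epilepsy would display graphically.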

Performance evaluation and model comparison

The discriminative ability of the radiomics models was assessed using receiver operating characteristic (ROC) analysis, together with scalar metrics including AUC, sensitivity, specificity, and accuracy at the optimal ROC cut-off point. These metrics were compared across all cohorts for the three radiomics signatures (Rad, DL, and DLR) and the DLRN. Fivefold cross-validation was used in the training cohort to support a fair evaluation, while the external validation cohort was held out for independent performance evaluation. Model calibration was analyzed using calibration curves and further assessed with the Hosmer-Lemeshow goodness-of-fit test. Additionally, Decision Curve Analysis (DCA) was performed to assess the clinical utility of the models, providing insight into their potential benefit in a clinical setting.
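The scalar metrics named above can be computed directly from labels and predicted scores. The following sketch assumes that the "optimal ROC point" is the threshold maximizing Youden's index (sensitivity + specificity − 1), which the text does not state explicitly; the labels and scores are made up.

```python
def auc(labels, scores):
    """AUC as the probability that a positive case scores above a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def metrics_at(labels, scores, threshold):
    """Sensitivity, specificity, and accuracy when score >= threshold means CVT."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(labels)
    return sensitivity, specificity, accuracy


def youden_threshold(labels, scores):
    """Threshold maximizing sensitivity + specificity - 1 (assumed criterion)."""
    def youden(t):
        sens, spec, _ = metrics_at(labels, scores, t)
        return sens + spec - 1.0
    return max(sorted(set(scores)), key=youden)


# Made-up example: 1 = CVT-related hematoma, 0 = other etiology.
y_true = [1, 1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.5, 0.3, 0.2]
model_auc = auc(y_true, y_score)
cutoff = youden_threshold(y_true, y_score)
sens, spec, acc = metrics_at(y_true, y_score, cutoff)
```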

Statistical analysis

To compare patient characteristics across groups, independent t-tests were used for normally distributed variables, whereas Mann–Whitney U tests were used for non-normally distributed variables, which are reported as medians with interquartile ranges. Chi-square tests were used to compare categorical variables. ROC curves were used to evaluate predictive performance. All analyses were performed using SPSS software (version 24.0) and the One-key AI platform (https://www.medai.icu), which is built on PyTorch 1.8.0. A two-sided p-value ≤ 0.05 was considered statistically significant.

Results

Patient characteristics

In the training dataset (n = 192), 24% (46 cases) of hematomas were attributed to CVT, while 76% (146 cases) resulted from other causes. In the test dataset (n = 83), 29% (24 cases) of hematomas were associated with CVT, whereas 71% (59 cases) were linked to other etiologies. Table 1 summarizes the baseline clinical characteristics, indicating that the presence of epileptic seizures upon admission was the only statistically significant factor (P = 0.001). No significant differences were observed in age (P = 0.565), hypertension (P = 0.824), or headache (P = 0.174).

Table 1 Baseline patient characteristics in the training cohort and external test cohort.

Feature selection and model development

After the feature selection process, 15 conventional radiomics features with nonzero coefficients were retained for further analysis and the development of the Rad signature (Supplementary Figure S1). For the fusion features, Figure 3 presents the final selection of 16 handcrafted features and 3 DL features used to develop the DLR signature. The two most relevant features, identified by the highest coefficients after selection, originated from the Rad feature set and the DL feature set, respectively. Finally, the DLRN model was established by integrating the radiomics features with a clinical characteristic (presence of epilepsy) to distinguish CVT-related intraparenchymal hematomas using multivariate logistic regression (Figure 4).

Fig. 3

The histogram of the fusion radiomics feature importance score based on the final 16 selected handcrafted and 3 deep learning features. DL = deep learning; LHL = low-high-low-pass filtered image; HLH = high-low-high-pass filtered image; HHL = high-high-low-pass filtered image; HHH = high-high-high-pass filtered image; GLDM = gray-level dependence matrix; GLSZM = gray-level size zone matrix; GLCM = gray-level co-occurrence matrix; GLRLM = gray-level run length matrix; NGTDM = neighborhood gray-tone difference matrix.

Fig. 4

Combining deep learning radiomics signature with epilepsy, a deep learning radiomic based nomogram (DLRN) was presented to distinguish intraparenchymal hematomas related to CVT from those resulting from other causes.

Comparison of performance between radiomics models and DLRN

Table 2 presents the scalar performance metrics of the three radiomics models and the DLRN in both the derivation and external test cohorts. In the external test cohort, the DL signature achieved an AUC of 0.831 (95% CI, 0.718–0.943), accuracy of 0.771, sensitivity of 0.833, and specificity of 0.746, a higher AUC than the traditional Rad signature, which achieved an AUC of 0.791 (95% CI, 0.664–0.918), accuracy of 0.783, sensitivity of 0.708, and specificity of 0.814. After integrating handcrafted radiomics and DL features, the DLR signature performed better, achieving an AUC of 0.904 (95% CI, 0.821–0.988), accuracy of 0.892, sensitivity of 0.750, and specificity of 0.949 in the external test cohort. The DLRN performed best: in the external test cohort, it achieved an AUC of 0.911 (95% CI, 0.827–0.996), accuracy of 0.904, sensitivity of 0.792, and specificity of 0.949, outperforming all other radiomics models, as shown in Figure 5. The DLRN emerged as the most accurate diagnostic tool based on AUC analysis, maintaining consistently high performance across both cohorts. These results suggest that integrating multiple feature extraction approaches enhances overall accuracy. In both the derivation and test datasets, the DCA indicated that the DLRN provided greater clinical benefit than the traditional Rad signature, DL signature, and fusion DLR signature (Figure 6). Moreover, the calibration curves showed a high level of concordance between the DLRN predictions and the observed risk of CVT-related hematomas in the test cohort (Supplementary Figure S2).
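As a quick consistency check on the external test cohort figures, the reported sensitivity and specificity can be converted back into case counts using the cohort composition given earlier (24 CVT and 59 non-CVT patients), and the accuracy recomputed from those counts. The counts below are inferred by rounding and are not stated in the text; the function name is hypothetical.

```python
N_CVT, N_OTHER = 24, 59   # external test cohort composition from the text

# Reported (sensitivity, specificity, accuracy) in the external test cohort.
reported = {
    "Rad": (0.708, 0.814, 0.783),
    "DL": (0.833, 0.746, 0.771),
    "DLR": (0.750, 0.949, 0.892),
    "DLRN": (0.792, 0.949, 0.904),
}


def implied_counts(sensitivity, specificity, n_pos=N_CVT, n_neg=N_OTHER):
    """Nearest whole-number true positives and true negatives (an inference)."""
    return round(sensitivity * n_pos), round(specificity * n_neg)


implied_accuracy = {}
for model, (sens, spec, acc) in reported.items():
    tp, tn = implied_counts(sens, spec)
    implied_accuracy[model] = (tp + tn) / (N_CVT + N_OTHER)
    print(f"{model}: TP={tp}/{N_CVT}, TN={tn}/{N_OTHER}, "
          f"implied accuracy={implied_accuracy[model]:.3f}, reported={acc:.3f}")
```

Under this reading, the accuracy recomputed from the implied counts agrees with the reported accuracy to three decimal places for all four models, and the DLRN differs from the DLR signature by one additional correctly identified CVT case.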

Table 2 Performance of predicting radiomics signatures and DLRN in the training cohort and external test cohort.
Fig. 5

The receiver operating characteristic curves (ROCs) of three distinct radiomics models and DLRN in derivation (left) and external test cohort (right).

Fig. 6

Decision curves of the three radiomics models and the DLRN in the derivation cohort (left) and external test cohort (right). The X-axis represents the threshold probability. The Y-axis represents the net benefit. The grey and black thin lines represent the assumptions that all patients have CVT or that no patient has CVT, respectively. Compared with the other models, the DLRN yields the maximal net benefit, with the highest area under the curve, in distinguishing CVT-related intraparenchymal hematomas.
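For readers unfamiliar with decision curves, the net benefit plotted on the Y-axis is conventionally defined as TP/N − (FP/N) × pt/(1 − pt), where pt is the threshold probability. The sketch below uses this standard definition, which is assumed rather than quoted from the text, together with made-up predictions.

```python
def net_benefit(labels, probabilities, pt):
    """Net benefit of treating patients whose predicted probability is >= pt."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probabilities) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(labels, probabilities) if p >= pt and y == 0)
    return tp / n - (fp / n) * (pt / (1.0 - pt))


def net_benefit_treat_all(labels, pt):
    """Reference curve: every patient is assumed to have CVT."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * (pt / (1.0 - pt))


# The "no patient has CVT" reference line has a net benefit of 0 at every threshold.
y_true = [1, 1, 0, 0]            # made-up labels
y_prob = [0.9, 0.6, 0.4, 0.1]    # made-up model probabilities
nb_model = net_benefit(y_true, y_prob, pt=0.5)
nb_all = net_benefit_treat_all(y_true, pt=0.5)
```

A model is useful at a given threshold when its net benefit exceeds both reference lines, which is how the curves in this figure are read.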

Discussion

A noninvasive and rapid method for distinguishing CVT-related intraparenchymal hematomas is of significant clinical importance. In this study, we developed and validated a NECT-based DL radiomics nomogram (DLRN) to differentiate CVT-related hematomas from those of other causes. The DLRN model achieved the best performance in both the training and test cohorts, indicating its feasibility as a rapid adjunctive tool for identifying CVT-related intraparenchymal hematomas in emergency settings.

Recent studies have reported a yearly increase in the incidence of CVT in the United States, reaching 26.3 per million by 201623,24. Furthermore, a retrospective cohort study revealed that the incidence of CVT increased to 42.8 per million within two weeks of a COVID-19 diagnosis, with higher in-hospital mortality rates observed among CVT patients with COVID-1925,26,27,28. Additionally, a small case study identified a 51.6% misdiagnosis rate for acute CVT when diagnosed solely based on emergency head CT scans29. This high misdiagnosis rate may be attributed to nonspecific neurological symptoms presented by CVT patients, as well as the limited exposure of many physicians to CVT cases7,30. In clinical practice, CVT patients presenting with hematomas are often more severely ill, and their hematomas are frequently confused with those of other etiologies, making diagnosis particularly challenging in emergency settings. The DLRN model may serve as an auxiliary diagnostic tool to mitigate this issue. Current clinical guidelines advocate immediate heparin therapy for acute CVT, which contrasts sharply with the treatment principles for arterial cerebral hemorrhage4. Early identification of CVT-related hematomas is therefore essential for optimizing therapeutic strategies and determining the necessity of early anticoagulation therapy. By supporting prompt identification of CVT, the DLRN model could help guide subsequent treatment decisions.

The high-density triangle sign and cord-like sign are key NECT-based indicators for predicting venous sinus thrombosis. These features, derived from the thrombus itself, can independently aid in diagnosing venous sinus thrombosis and cortical vein thrombosis31,32,33,34. However, these signs appear in only a limited proportion of patients, and their specificity for diagnosing cortical vein thrombosis remains insufficient35,36,37. Research on the NECT imaging characteristics of CVT-induced brain parenchymal hemorrhage is relatively limited. CVT-induced hemorrhage is caused by severe venous hypertension due to impaired collateral venous drainage, leading to blood product extravasation, a mechanism that distinguishes it from arterial bleeding38. The characteristics of CVT-related bleeding include: (1) a higher plasma content and lower cellular component compared to arterial bleeding, resulting in softer hematomas that are more prone to dissolution, so that venous hematomas exhibit lower CT attenuation and a cloud-like appearance with less distinct boundaries; and (2) a relatively slow bleeding rate, with hematomas spreading along white matter tracts and often scattered at the cortex-white matter junction. Jonathan M et al. defined small hemorrhages positioned within the white matter directly beneath the cortex as juxta-cortical hemorrhages (JCH), which represent a unique characteristic of CVT and are infrequently detected in other disorders10. Although studies highlight key differences between venous and arterial bleeding, visual assessment alone remains unreliable29. In this research, we first utilized traditional radiomics methods to construct a machine learning-based Rad model and evaluated its classification performance. The Rad model demonstrated strong discriminative ability in the training dataset but exhibited lower generalizability in the test dataset. To improve model performance, we subsequently extracted DL features from the slice with the maximum ROI and integrated them with conventional Rad features.

Traditional radiomics and DL methods each offer distinct advantages in medical image analysis. Traditional radiomics enables precise quantitative analysis of various imaging features with strong interpretability, yet it lacks the capability to capture higher-level abstract features. In contrast, DL exhibits powerful adaptive learning capabilities, allowing it to extract high-level, abstract features, making it particularly effective for handling complex image analysis tasks39. However, DL models demand high computational resources and suffer from reduced interpretability, often referred to as the “black-box” problem40. Recent studies suggest that integrating traditional radiomics with DL enhances prediction performance compared to models based solely on a single feature category. For instance, Hongbo Zhang et al. developed a multi-modal MRI-based radiomics model to predict telomerase reverse transcriptase (TERT) promoter mutations in glioblastoma (GBM) patients. Their findings demonstrated that a feature fusion model combining radiomic and DL features significantly outperformed models relying on either feature type alone41. This research is the first to construct and confirm a DL-based radiomics model for identifying intracerebral hematomas following cerebral venous sinus thrombosis using non-contrast CT images from a multicenter cohort. In our analysis, integrating DL features with traditional radiomic features resulted in the DLR signature, which significantly improved performance. These findings indicate that feature fusion modeling enhances radiomics-based prediction models compared to single-feature approaches, further supporting their potential for advancing diagnostic imaging analysis.

The nomogram model enables the integration of multiple variables, transforming complex statistical prediction models into intuitive graphical representations. This visualization enhances the interpretability of relationships and interactions among variables, improving clinical applicability. Prior investigations have shown the effectiveness of DLRNs in various clinical applications. Yingmei Zheng et al. developed a contrast-enhanced CT-based nomogram by integrating traditional radiomics, DL features, and clinical factors to predict the histological differentiation grade of head and neck squamous cell carcinoma42. Similarly, Hao Zhou et al. constructed a DLRN to assess the histological risk categorization of thymic epithelial tumors (TETs), achieving high predictive accuracy43. In our study, we found that epileptic seizures occurred significantly more frequently in patients with CVT-related cerebral parenchymal hemorrhage compared to those with hematomas of other etiologies, consistent with previous findings. Therefore, epilepsy status was incorporated as a key clinical variable alongside fused radiomics features to construct a DLRN model for predicting whether an intraparenchymal hemorrhage is secondary to CVT. This model achieved the best performance, surpassing the predictive accuracy of the model based solely on fused radiomics features. Furthermore, decision curve analysis (DCA) indicated that the DLRN model provided the greatest clinical benefit in distinguishing CVT-related intracranial hematomas, highlighting its potential as a non-invasive tool for rapid diagnosis and treatment guidance in emergency settings.

Our study has several limitations. First, we extracted DL features from the single image slice containing the largest hematoma ROI, instead of employing a 3D whole-hematoma volume, which may limit the model’s ability to capture spatial heterogeneity. Only radiomic features from the hematoma region were extracted, whereas the perilesional area was overlooked. Future analyses should include regions surrounding the lesion to improve the model’s performance and interpretability. Second, the non-CVT intraparenchymal hematoma group did not encompass all types of intracranial hemorrhage, meaning the dataset may not fully represent real-world clinical scenarios, potentially influencing the model’s generalizability. Third, the manual 3D segmentation of hematomas was time-intensive, highlighting the need for a semi-automated or fully automated segmentation method to enhance clinical feasibility. Fourth, the predictive model was specifically designed to identify CVT patients with concomitant cerebral hemorrhage but cannot identify CVT patients without hemorrhage, limiting its diagnostic scope. Finally, the DLRN model is not yet compatible with clinical PACS, necessitating the development of an integrated and user-friendly software solution for seamless clinical deployment.

Conclusion

In summary, this study developed and validated a DL radiomics signature based on NECT images to differentiate CVT-related intraparenchymal hematomas from those caused by other etiologies. This approach demonstrated superior performance compared to models relying solely on conventional radiomic attributes. Subsequently, we constructed a DLRN by integrating fused radiomic features with clinical characteristics, achieving the highest diagnostic accuracy among all models. This innovative, non-invasive model holds significant potential for rapid and accurate identification of CVT-related intraparenchymal hematomas in emergency settings, aiding in timely clinical decision-making and treatment optimization.