Introduction

Pulmonary embolism (PE) is a potentially life-threatening condition associated with considerable morbidity and mortality despite advances in diagnostic and therapeutic strategies1,2. Identifying patients at high risk for recurrent PE remains a major clinical challenge, as recurrence can lead to progressive pulmonary vascular obstruction, right heart failure, and increased long-term mortality3,4. Existing prediction tools for PE risk stratification largely rely on clinical characteristics, coagulation biomarkers, and scoring systems such as the Wells score and the revised Geneva score5. More recently, several studies have developed nomogram-based prediction models incorporating multiple clinical and laboratory variables to improve individualized recurrence risk assessment. For example, Wang et al.6 developed a nomogram for predicting pulmonary embolism risk in oncology patients using five features including neutrophil count and D-dimer, achieving AUCs of 0.758 and 0.702 in the training and validation cohorts, respectively. Similarly, Liu et al.7 evaluated elderly hospitalized patients and identified smoking status and dyspnea as significant predictors, demonstrating improved performance compared with traditional scoring systems8. However, these models have largely overlooked body composition parameters, which are increasingly recognized as important prognostic indicators in cardiovascular and thromboembolic diseases9. Emerging evidence suggests that skeletal muscle and adipose tissue significantly influence systemic inflammation, metabolic regulation, and hemodynamic stability—all of which may contribute to thrombus burden and recurrence risk10,11. CT-derived measurements such as skeletal muscle area (SMA), pectoralis muscle area (PMA), and subcutaneous adipose tissue area (SATA) have been shown to predict outcomes in conditions including chronic obstructive pulmonary disease, malignancy, and heart failure12,13. 
Notably, PMA was analyzed separately from global SMA, given its anatomical prominence at the T4 level, high measurement reproducibility, and emerging role as a specific marker of cardiopulmonary dysfunction14. These metrics reflect physiological reserve, nutritional status, and inflammatory phenotype, offering potential value for improving PE risk stratification15. Despite this, the prognostic relevance of muscle and fat parameters in recurrent PE has not been systematically investigated. Given their increasing availability through routine CT imaging and their potential biological link to thromboembolic processes, incorporating body composition into recurrence prediction models may enhance individualized patient assessment. Therefore, this study aimed to develop and validate an integrative nomogram incorporating CT-derived body composition parameters for predicting recurrent pulmonary embolism.

Materials and methods

Participant acquisition

We retrospectively analyzed patients who underwent lung perfusion scans for suspected pulmonary embolism (PE) between Month 2019 and Month 2023. The diagnosis of PE was established according to the 2019 European Society of Cardiology (ESC) Guidelines for the Diagnosis and Management of Acute Pulmonary Embolism, which require visualization of intraluminal filling defects on computed tomography pulmonary angiography (CTPA) or high-probability findings on ventilation/perfusion (V/Q) scanning1. Inclusion criteria were: (1) age ≥ 18 years; (2) first-time diagnosis of PE confirmed by lung perfusion scan or CTPA; (3) follow-up at our hospital after initial diagnosis under standard anticoagulation therapy; and (4) availability of complete pre- and post-treatment imaging data. Exclusion criteria were: (1) PE not diagnosed by lung perfusion scan or CTPA; (2) history of chronic pulmonary embolism, chronic thromboembolic pulmonary hypertension, or gout; (3) presence of significant hepatic or renal impairment, coagulation abnormalities, or other major comorbidities; and (4) missing clinical data or loss to follow-up. A total of 184 patients fulfilled the study criteria (Fig. 1).

Fig. 1 Flowchart of patient selection for analysis.

Image acquisition and processing

Imaging was performed on an Optima NM/CT 670 dual-head SPECT/CT scanner (GE Medical Systems, Israel) equipped with a low-energy universal parallel-hole collimator. Patients were positioned supine with their arms raised above the head and were instructed to breathe deeply while inhaling 555 MBq of a nebulized 99mTc-labeled agent through a nebulizer inhaler (Beijing Senco Pharmaceuticals) driven by oxygen at a flow rate of 5–8 L/min for 5–10 min. Lung ventilation planar imaging followed immediately in eight positions: anterior, posterior, left and right lateral, left and right anterior oblique, and left and right posterior oblique. Planar acquisition parameters were: energy peak 140 keV, window width 20%, magnification 1.23, and 100–150 K counts per frame for each detector at 1.0–1.5 K/s. Lung ventilation tomography parameters were: field of view covering both lungs, matrix 128 × 128, 360° detector rotation, one frame every 6°, 5 s per frame, and 60 frames in total. Lung perfusion imaging was then performed. 99mTc-MAA (185 MBq) was injected into a dorsal vein of the lower limb, the lower limb veins were visualized first, and lung perfusion planar imaging was then acquired in the same eight positions at 500 K counts per frame for each detector and 4.0–5.0 K/s; the remaining parameters matched those used for lung ventilation tomography. Patients were instructed to breathe steadily during SPECT/CT lung perfusion acquisition to reduce respiratory motion artifacts. After the SPECT acquisition, patients held their breath for a low-dose CT scan with the same field of view, a tube voltage of 120 kV, a tube current of 2 mA, and a slice thickness of 3 mm. The SPECT and CT images were then reconstructed and fused to obtain the SPECT/CT lung perfusion fusion images.

CT body analysis

Body composition was measured on the CT component of the SPECT/CT examinations; because these measurements correlate strongly with stature, areas were also normalized to height as described below. Analyses were conducted on a single axial slice at the level of the fourth thoracic vertebra (T4) using Slice-O-Matic software (version 5.0; TomoVision, Montreal, Canada) (Fig. 2). Tissue segmentation was performed based on Hounsfield Unit (HU) thresholds: skeletal muscle was segmented using a threshold of −29 to +150 HU, and the total area was recorded as the skeletal muscle area (SMA). At the T4 level, the SMA primarily includes the pectoralis major, pectoralis minor, serratus anterior, and intercostal muscles, as well as the trapezius and rhomboids in the dorsal region16,17. Subcutaneous and intermuscular adipose tissue were segmented using a threshold of −190 to −30 HU, and the area was automatically calculated as the subcutaneous adipose tissue area (SATA). To account for the influence of body height, SMA and SATA were divided by the square of height (m²) to derive the skeletal muscle index (SMI) and subcutaneous adipose tissue index (SATI), respectively, following established methodologies in the field. Furthermore, the mean HU values for each tissue compartment were automatically extracted to quantify tissue density, recorded as skeletal muscle density (SMD) and subcutaneous adipose tissue density (SATD). For specific analysis of thoracic muscles, the pectoralis major and minor were separately delineated and combined to calculate the pectoralis muscle area (PMA) and its density (PMAD). This approach was justified for two primary reasons. First, the pectoralis muscles are the most prominent and representative parietal muscles at the T4 level, offering a high signal-to-noise ratio and excellent measurement reproducibility. 
Second, compared to shoulder girdle muscles such as the supraspinatus and infraspinatus, which are often obscured by the scapulae and vary in appearance with patient positioning, the pectoralis muscles have a fixed anatomical location. They are clearly defined and easily identifiable on standard supine CT images, providing a more reliable and operator-independent indicator of localized muscle quality. Therefore, the separate analysis of PMA was not intended to exclude other muscles but to establish an optimal, standardized parameter for assessing muscle quality specifically at the T4 level.
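The threshold-based area, density, and height-normalized index calculations described above can be sketched as follows. This is only an illustration under stated assumptions, not the Slice-O-Matic workflow: the function names, the toy HU grid, the pixel spacing, and the height are hypothetical, and the actual analysis applies each HU window within manually reviewed anatomical regions rather than to the whole slice.

```python
# HU windows are taken from the text; all other values are illustrative.
MUSCLE_HU = (-29, 150)
FAT_HU = (-190, -30)

def tissue_area_and_density(hu_slice, hu_range, pixel_spacing_mm):
    """Return (area in cm^2, mean HU) of pixels whose HU lies in hu_range (inclusive)."""
    lo, hi = hu_range
    values = [hu for row in hu_slice for hu in row if lo <= hu <= hi]
    # One pixel covers (row spacing x column spacing) mm^2; 100 mm^2 = 1 cm^2.
    pixel_area_cm2 = pixel_spacing_mm[0] * pixel_spacing_mm[1] / 100.0
    area = len(values) * pixel_area_cm2
    density = sum(values) / len(values) if values else float("nan")
    return area, density

def height_index(area_cm2, height_m):
    """Normalize an area to height squared (cm^2/m^2)."""
    return area_cm2 / height_m ** 2

# Hypothetical 3 x 3 "slice" with 10 mm pixels (1 cm^2 per pixel).
toy_slice = [[-100, 40, 60],
             [-50, 55, 300],
             [-1000, 45, -120]]
sma, smd = tissue_area_and_density(toy_slice, MUSCLE_HU, (10.0, 10.0))
sata, satd = tissue_area_and_density(toy_slice, FAT_HU, (10.0, 10.0))
smi = height_index(sma, 2.0)  # hypothetical height of 2.0 m
```

In this toy grid, four pixels fall in the muscle window and three in the fat window, so the areas are 4 cm² and 3 cm². Bone (300 HU) and air (−1000 HU) are excluded by both windows.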

Fig. 2 Imaging findings in a 58-year-old man with recurrent pulmonary embolism (PE). (a) Skeletal muscle area (SMA) at the fourth thoracic vertebra: 187.1 cm². (b) Pectoralis muscle area (PMA): 37.16 cm². (c) Subcutaneous adipose tissue area (SATA): 48.94 cm². Follow-up computed tomography pulmonary angiography (CTPA) performed 19 months later revealed a new embolism.

Observer variability analysis

All CT-based body composition measurements were obtained using semi-automated segmentation tools in Slice-O-Matic, with manual adjustments as needed and independent review by two experienced radiologists. To evaluate reproducibility, a random subset of 30 patients was selected. Intra-observer variability was assessed by having the same reader repeat the segmentations after a washout period while blinded to prior results, and inter-observer variability was determined from independent segmentations performed by a second reader. Reliability was quantified using intraclass correlation coefficients (ICC) based on a two-way random-effects model for absolute agreement, supplemented by intra-observer coefficients of variation (CV) to capture relative variability. Detailed intra- and inter-observer reliability metrics are presented in Supplementary Table 2 and Supplementary Fig. 3.
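For readers unfamiliar with the reliability statistic, the two-way random-effects, absolute-agreement ICC can be computed from the ANOVA mean squares as ICC(2,1) = (MSR − MSE) / (MSR + (k − 1)·MSE + k·(MSC − MSE)/n). The sketch below assumes the single-measure form; the text does not state whether single or average measures were reported, and the function name and toy ratings are hypothetical.

```python
def icc_2_1(ratings):
    """Single-measure, two-way random-effects, absolute-agreement ICC.

    ratings: list of n subjects, each a list of k readers' measurements.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between readers
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical areas (cm^2) from two readers on three patients.
identical = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
offset = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]  # reader 2 is always 1 cm^2 higher
```

Because the absolute-agreement form penalizes systematic differences between readers, the `offset` example yields an ICC of 2/3 even though the two readers rank the patients identically, whereas `identical` yields 1.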

Follow-up information and outcome measures

Follow-up information was obtained through outpatient visits or telephone interviews.

The end of follow-up was defined as the date of the last contact (July 2024) or the occurrence of recurrent PE, whichever came first. Recurrent PE was diagnosed using spiral CT or lung scintigraphy, requiring either an intraluminal filling defect in at least one segmental or larger pulmonary artery or a segmental perfusion defect with normal ventilation indicating a ventilation–perfusion mismatch. All cases of suspected recurrent PE were adjudicated independently by a clinician and a radiologist blinded to the study data.

Statistical analysis

Patients were randomly allocated into training and validation cohorts at a 7:3 ratio using stratified sampling based on recurrence status. Continuous variables, presented as median (interquartile range), were assessed for normality, and group comparisons were performed using the chi-square test or Fisher’s exact test for categorical variables and the Student’s t-test or Mann–Whitney U test for continuous variables, as appropriate. In the training cohort, least absolute shrinkage and selection operator (LASSO) regression was employed to select the most informative predictors, which were subsequently incorporated into a multivariable logistic regression model to construct a predictive nomogram. The model’s performance was evaluated using receiver operating characteristic (ROC) curves, area under the curve (AUC), calibration plots, and decision curve analysis (DCA) to assess discrimination, calibration, and clinical net benefit. A post-hoc power analysis was performed based on the observed data (total n = 184 with 61 recurrence events and 27 candidate predictors), which demonstrated adequate statistical power (1-β = 0.81 at α = 0.05) to detect moderate effects, supporting the robustness of our findings (detailed results provided in Supplementary Table S1 and Figure S1). A two-sided p-value < 0.05 was considered statistically significant, and all analyses were conducted using R software (version 4.4.2).
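The stratified 7:3 allocation can be illustrated with a short sketch. The study used R, so this Python version is an assumption-laden illustration rather than the authors' code: the function name, the seed, and the per-stratum rounding rule are hypothetical. The totals (61 recurrence events among 184 patients) come from the power-analysis sentence above.

```python
import random

def stratified_split(labels, train_frac=0.7, seed=2024):
    """Split indices into train/validation, preserving the outcome ratio.

    Each outcome class is shuffled and cut separately (assumed rounding rule:
    round(class size * train_frac) goes to the training set).
    """
    rng = random.Random(seed)
    train, valid = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_frac)
        train += idx[:cut]
        valid += idx[cut:]
    return sorted(train), sorted(valid)

# Outcome vector built from the reported totals: 61 recurrences, 123 without.
labels = [1] * 61 + [0] * 123
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))  # 129 55
```

With these totals the sketch produces cohorts of 129 and 55 patients regardless of the seed, which coincides with the cohort sizes reported in the Results. The per-cohort event counts it produces follow from the assumed rounding rule and should not be read as the study's actual figures.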

Ethics approval and consent to participate

All procedures were conducted in accordance with the ethical standards of the institutional research committee and with the Declaration of Helsinki (1964) and its later amendments. The study protocol was approved by the Ethics Committee of the First Hospital of Shanxi Medical University (Approval ID: 2024–126), which waived the requirement for informed consent due to the retrospective nature of the study.

Results

Patient characteristics

The baseline characteristics of the 184 patients are summarized in Table 1, with 129 assigned to the training cohort and 55 to the internal test (validation) cohort. The median follow-up time for the entire cohort was 34 months (IQR, 21.5–52 months), with comparable durations between the training (32 months) and validation (36 months) cohorts (p = 0.164). During follow-up, all-cause mortality occurred in 15 patients (8.2%): 9 (7.0%) in the training cohort and 6 (10.9%) in the validation cohort (p = 0.550). Sex distribution, age, and BMI showed no significant differences between cohorts. Body composition measurements, including SMA, SMAI, SMAD, SATA, SATI, and SATD, were largely comparable between cohorts, with only PMA exhibiting a statistically significant but modest difference (p = 0.021). Pulmonary embolism characteristics were similarly distributed, with no significant differences in PE type, the presence of central PE, right ventricular dysfunction indicators, or concomitant deep vein thrombosis. Laboratory parameters, including white blood cell count, urea, creatinine, glucose, and uric acid, did not differ significantly, and D-dimer levels showed similar ordinal distributions between the groups. The only notable clinical imbalance was observed in sPESI, with a higher proportion of patients in the validation cohort classified as sPESI ≥ 1 (70.9% vs. 48.1%, p = 0.004). Apart from the differences in PMA and sPESI, the two cohorts were comparable in baseline characteristics and follow-up outcomes, supporting their suitability for subsequent model development and validation.

Table 1 BMI, body mass index; SMA, skeletal muscle area at T4 level; SMAI, skeletal muscle index; SMAD, skeletal muscle density; PMA, pectoralis muscle area; PMI, pectoralis muscle index; PMD, pectoralis muscle density; SATA, subcutaneous adipose tissue area; SATI, subcutaneous adipose tissue index; SATD, subcutaneous adipose tissue density; RVD, right ventricular dysfunction; DVT, deep vein thrombosis; WBC, white blood cell count; Cre, creatinine; Glu, glucose; UA, uric acid; HU, Hounsfield units; FEU, fibrinogen equivalent units. Data are presented as n (%), median (interquartile range), or mean ± standard deviation, as appropriate. PE type was classified based on the most proximal thrombus location: segmental (involving segmental arteries), lobar (involving lobar arteries), or central (involving the main or left/right pulmonary arteries). RVD indicators included right ventricular dilatation (RV/LV ratio > 1.0), septal bowing, or pulmonary artery dilatation on computed tomography pulmonary angiography.

Predictive model

Prior to model construction, correlation analysis was performed to exclude variables whose pairwise correlation coefficient exceeded 0.7, in order to mitigate multicollinearity (Supplementary Figure S2). The following preselected predictors were included in the LASSO regression analysis: PE_type, Central_PE, RVD_indicators, DVT_present, Urea, Cre, Glu, UA, D_dimer, WBC, sex, BMI, age, sPESI, SMA, SMAD, PMA, PMAI, PMAD, SAT, and SATD. LASSO regression (Fig. 3) was used to identify the most informative predictors from this set. Figure 3a shows the 10-fold cross-validation for selecting the optimal penalty parameter (λ), where the vertical dashed line indicates the λ value (0.0348) corresponding to the minimum mean cross-validated error. Figure 3b presents the LASSO coefficient path plot, illustrating the shrinkage of feature coefficients as the penalty parameter λ increases, with features selected at the optimal λ highlighted. This process identified the following variables with non-zero coefficients as the most predictive features: DVT_present, WBC, BMI, SMAD, PMA, SAT, and SATD. These variables were subsequently incorporated into the multivariable logistic regression model to construct the final predictive nomogram (Fig. 4a).
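The correlation-based pre-filter can be sketched as below. The text specifies only the 0.7 cutoff, so several details here are assumptions: the use of Pearson correlation, the absolute value, and the greedy rule that keeps the first variable of a correlated pair and drops the later one. The function and variable names (`x1`, `x2`, `x3`) and the data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def drop_collinear(columns, threshold=0.7):
    """Greedy filter: keep a variable only if |r| <= threshold with all kept so far."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical candidate predictors; x2 is nearly a multiple of x1.
candidates = {
    "x1": [1, 2, 3, 4, 5],
    "x2": [2, 4, 6, 8, 11],
    "x3": [3, 1, 2, 1, 3],
}
kept = drop_collinear(candidates)  # ['x1', 'x3']
```

Which member of a correlated pair is retained depends on the column order under this rule; in practice that choice is often guided by clinical relevance or measurement reliability instead.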

Fig. 3 The LASSO algorithm and 10-fold cross-validation were used to extract the optimal subset. (a) Ten-fold cross-validation plot for selecting the optimal penalty parameter (λ) in the LASSO model. The vertical dashed line indicates the λ value (0.0348) corresponding to the minimum mean cross-validated error. (b) LASSO coefficient path plot showing the shrinkage of feature coefficients as the penalty parameter λ increases. Features selected at the optimal λ are highlighted.

Predictive model performance

The ROC curves of the model in both cohorts are shown in Fig. 4b. The predictive model achieved an AUC of 0.757 (95% CI: 0.673–0.840) in the training cohort and 0.679 (95% CI: 0.531–0.826) in the validation cohort, indicating moderate discriminatory performance, with lower discrimination and a wider confidence interval in the validation cohort.
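As a reminder of what the AUC measures, it equals the probability that a randomly chosen patient with recurrence receives a higher predicted risk than a randomly chosen patient without recurrence, with ties counted as one half. The sketch below computes it directly from that definition; the function name and the toy predictions are hypothetical, and the confidence intervals reported above would require an additional method (for example, bootstrap or DeLong) that is not shown.

```python
def auc(scores, labels):
    """AUC as the fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted risks for two patients with recurrence and two without.
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
toy_auc = auc(toy_scores, toy_labels)  # 3 of 4 pairs ranked correctly -> 0.75
```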

Fig. 4 (a) Nomogram prediction model; (b) ROC curves of the nomogram prediction model.

The calibration performance of the nomogram was evaluated in both the training and validation cohorts (Fig. 5 and Supplementary Table S4). The calibration plots demonstrated good agreement between the predicted probability of recurrence and the observed outcome in both datasets. Specifically, the curve for the training cohort (Fig. 5a) closely approximated the ideal reference line, indicating accurate risk estimation. Calibration was maintained in the validation cohort (Fig. 5b), where the curve remained relatively close to the ideal line. Together, these results indicate that the probabilities predicted by the nomogram are reasonably well calibrated against observed recurrence in both cohorts.

Fig. 5 (a) Calibration curve of the nomogram prediction model for the training cohort; (b) calibration curve of the nomogram prediction model for the internal test cohort.

Decision curve analysis (DCA) was performed to evaluate the clinical utility of the nomogram in both the training (Fig. 6a) and validation (Fig. 6b) cohorts. Across a clinically relevant threshold probability range of 0–0.6, the nomogram demonstrated a consistently higher net benefit than both the treat-all and treat-none strategies in both datasets. This indicates that, within this range, using the nomogram identifies patients at elevated risk who may benefit from intervention while limiting unnecessary treatment in low-risk individuals. Although the net benefit gradually decreased at higher threshold probabilities, reflecting the inherent trade-off between sensitivity and specificity, the nomogram outperformed the default strategies across most clinically meaningful thresholds in both cohorts. These findings suggest that using the nomogram to guide clinical decision-making can enhance individualized risk stratification. Overall, the similar DCA profiles in the training and validation sets support the potential utility of the nomogram in patient management following pulmonary embolism.
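The net benefit plotted in a decision curve follows the standard definition NB(pt) = TP/n − (FP/n) · pt/(1 − pt), where pt is the threshold probability and patients with predicted risk at or above pt are classified as high risk. The sketch below evaluates this for a model and for the treat-all strategy (treat-none has a net benefit of zero by definition). The function names and toy data are hypothetical, and thresholds must be below 1.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating everyone: prevalence minus weighted non-events."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical predicted risks and observed recurrences for four patients.
toy_probs = [0.8, 0.6, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
nb_model = net_benefit(toy_probs, toy_labels, 0.2)     # 2/4 - (1/4) * 0.25
nb_all = net_benefit_treat_all(toy_labels, 0.2)        # 0.5 - 0.5 * 0.25
```

At a threshold of 0.2 the toy model spares one non-event from treatment without missing any event, so its net benefit (0.4375) exceeds that of treating everyone (0.375). A decision curve repeats this comparison across the full range of thresholds.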

Fig. 6 (a) Decision curve analysis of the nomogram in the training cohort; (b) decision curve analysis of the nomogram in the internal test cohort.

Discussion

In this retrospective study, we developed and validated a nomogram incorporating CT-derived body composition parameters to predict recurrent PE. The model demonstrated moderate discriminatory ability in both the training (AUC = 0.757) and validation (AUC = 0.679) cohorts, suggesting that integrating body composition metrics may improve recurrence risk assessment beyond traditional clinical variables. Our focus on readily available CT-based parameters was motivated by growing evidence that standard anthropometric measures such as BMI may not fully capture individual differences in muscle mass or fat distribution. In particular, skeletal muscle and adipose tissue distribution have been increasingly recognized as prognostic indicators in cardiopulmonary and thromboembolic diseases. For example, a recent multicenter study showed that CT-defined PMA and muscle density were associated with 30-day mortality in patients with acute PE11,18. Furthermore, in chronic lung disease populations, CT-derived PMA (rather than BMI) has been more strongly correlated with clinically relevant measures such as exercise tolerance, dyspnea, and functional capacity, underscoring its value as a surrogate of “physiological reserve.” Similarly, in oncology, lower PMA measured on chest CT has been associated with worse overall survival in non-small cell lung cancer. Beyond muscle mass, adipose tissue distribution also merits consideration. A recent large study using MRI from the UK Biobank demonstrated that increased visceral adipose tissue (VAT) volume was associated with a significantly elevated risk of venous thromboembolism (VTE), and that VAT correlated more strongly with VTE risk than BMI19,20. These findings support the conceptual framework that adiposity — particularly visceral fat — contributes to a pro-thrombotic milieu, likely via chronic inflammation, endothelial dysfunction, impaired fibrinolysis, and altered metabolic homeostasis. 
Indeed, adipose tissue is known to secrete pro-inflammatory adipokines and coagulation-modulating factors, which can increase thrombosis risk even in individuals with normal BMI21. Given this background, our decision to include CT-derived body composition parameters (muscle and fat) in a PE recurrence risk model appears biologically plausible and methodologically justified22. In our cohort, although no single body composition variable reached independent statistical significance, their combined inclusion meaningfully improved model discrimination and calibration, suggesting a cumulative effect rather than a single dominant factor — consistent with the multifactorial nature of VTE/PE recurrence23,24. Moreover, decision curve analysis (DCA) demonstrated that the nomogram provided a net benefit across a broad range of clinically relevant risk thresholds, indicating its potential to inform individualized clinical decision-making (e.g., who might benefit from closer follow-up or extended prophylaxis), while minimizing overtreatment in low-risk patients25. Nevertheless, several limitations warrant discussion. First, our study design was retrospective and conducted at a single center, which may introduce selection bias and limit generalizability. Second, the modest sample size may reduce statistical power, especially for detecting the independent effect of individual body composition parameters — this could explain why no single metric reached significance despite biologic plausibility. Third, we did not evaluate longitudinal changes in body composition, which may carry additional prognostic information; for example, muscle loss over time (sarcopenia progression) or fat redistribution could influence recurrence risk. Fourth, while our nomogram shows promise, the absence of external validation means we cannot yet assume its applicability in different populations or geographic settings. Therefore, we consider this work a necessary first step. 
Future research should include prospective, multicenter cohorts with larger sample sizes, capture longitudinal body composition changes, and — ideally — integrate other prognostic markers (e.g., inflammatory biomarkers, clot burden metrics, genetic or biomarker data) to build a more comprehensive and generalizable predictive model. In conclusion, our study supports the concept that CT-derived body composition parameters reflect underlying physiological vulnerability (muscle depletion, unfavorable adiposity) that may predispose to PE recurrence. The integrative nomogram combining these imaging markers with clinical variables demonstrates moderate predictive performance and favorable decision-analytic properties, highlighting a potentially valuable, readily accessible tool for individualized risk stratification and management of PE survivors.

Conclusion

The integrative nomogram combining clinical variables with CT-derived body composition parameters demonstrates modest but meaningful predictive capability for pulmonary embolism recurrence. While individual predictors showed limited standalone value, their combination provides a practical approach to risk stratification that leverages routinely available imaging data. This model represents a step toward more personalized assessment of recurrence risk, though further validation is required before clinical implementation.