Introduction

Breast cancer (BC) is the most commonly diagnosed malignancy among women worldwide1,2. According to the 2022 Global Cancer Statistics, BC ranks second in incidence among all cancers and fourth in cancer-related mortality3. Early detection and accurate staging are critical to improving outcomes. Axillary lymph node metastasis (ALNM) is a key marker of disease progression and is closely linked to clinical stage, treatment selection, and prognosis4,5. Thus, ALNM is central to therapeutic decision-making and an important predictor of survival and recurrence risk.

Sentinel lymph node biopsy (SLNB) is the current reference standard for assessing ALNM in BC. SLNB is minimally invasive and has a favorable safety profile; however, as with any surgical procedure, complications such as lymphedema, infection, and sensory disturbance can occur6,7. SLNB provides high diagnostic accuracy, particularly a high negative predictive value: when the sentinel node is negative, further axillary surgery or evaluation is usually unnecessary. When the sentinel node is positive, additional assessment of regional lymph nodes may be warranted8,9. In this context, predictive models, including the one proposed in this study, can serve as complementary tools to support preoperative planning and identify patients at higher risk for ALNM.

Despite numerous ALNM prediction models based on clinicopathologic and imaging features10,11, several limitations persist. Most studies are single-center, retrospective analyses with modest sample sizes; many models rely on a single modality rather than integrating multidimensional data, which restricts accuracy and generalizability; and external validation is often lacking, limiting applicability across populations. To address these gaps, we developed an interpretable, high-precision ALNM prediction model that integrates clinical variables, radiomics features, and tumor markers in a cohort of more than 1,000 BC patients from two hospitals. Key predictors were selected using least absolute shrinkage and selection operator (LASSO) regression and incorporated into a multivariable logistic regression model. We conducted both internal and external validation to evaluate robustness and generalizability and assessed clinical utility using decision curve analysis (DCA).

Methods

Patients

We retrospectively reviewed clinical and pathological data for 1,307 patients with BC who underwent surgery at Tengzhou Central People’s Hospital between January 2019 and December 2023. Patients were randomly assigned to a training set (n=914) and an internal validation set (n=393) in a 7:3 ratio. An external validation cohort comprised 61 BC patients who underwent surgery at Zaozhuang Municipal Hospital from January to April 2025. All patients had complete pathological and clinical laboratory records. Data collected included demographic characteristics, laboratory results, tumor size, lymph node status, pathological type, histological grade, and other relevant variables. Inclusion criteria: (1) pathological diagnosis of BC; (2) unilateral, stage I-III disease; (3) availability of complete clinical, ultrasound, CT, and pathological data; (4) axillary lymph nodes negative or suspicious for metastasis on ultrasound and/or CT; (5) receipt of neoadjuvant therapy and standard surgical treatment for BC. Exclusion criteria: (1) incomplete clinical or pathological data; (2) ductal carcinoma in situ; (3) stage IV disease; (4) occult BC; (5) inflammatory BC; (6) bilateral BC. From eligible cases, we extracted the following variables: sex, age, tumor size, pathological type, histological grade (I-III), molecular subtype, ER, PR, HER2, Ki-67, P53, suspicious axillary lymph nodes on ultrasound, suspicious axillary lymph nodes on CT, CEA, CA15-3, CA125, and ALNM. The study flowchart is shown in Figure 1. This study complied with the Declaration of Helsinki and applicable ethical regulations. Given its retrospective design and use of anonymized data without identifiable personal information, the Ethics Committee of Tengzhou Central People’s Hospital waived the requirement for institutional review board approval and informed consent.

Fig. 1

Flow chart of the study.
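The 7:3 random assignment described in the Patients subsection can be illustrated with a short Python sketch. This is not the authors' code (their analyses were run in R); the function name `split_7_3`, the seed, and the use of integer patient indices are assumptions made for illustration only.

```python
import random


def split_7_3(patient_ids, train_fraction=0.7, seed=2024):
    """Randomly partition patient IDs into training and internal validation sets.

    The seed is a hypothetical value chosen only to make the sketch reproducible.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = int(len(ids) * train_fraction)  # floor of 70% of the cohort
    return ids[:n_train], ids[n_train:]


# 1,307 patients, indexed 0..1306 as stand-ins for real identifiers
train_ids, valid_ids = split_7_3(range(1307))
print(len(train_ids), len(valid_ids))  # 914 393
```

Taking the floor of 70% of 1,307 gives 914 training patients and leaves 393 for internal validation, matching the cohort sizes reported above.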

Data preprocessing

Categorical variables were encoded for modeling as follows: Age: ≤35 years=1; 36–45 years=2; 46–59 years=3; ≥60 years=4. Ultrasound axillary nodes: suspicious=1; normal=0. CT axillary nodes: suspicious=1; normal=0. Lymph node metastasis: positive=1; negative=0. Molecular subtype: Luminal A=1; Luminal B (HER2-)=2; Luminal B (HER2+)=3; HER2-enriched=4; triple-negative=5. Histological grade: well differentiated=1; moderately differentiated=2; poorly differentiated=3. Tumor size: T1=1; T2=2; T3=3. Pathology: invasive ductal carcinoma=1; invasive lobular carcinoma=2; other=3. Sex: male=1; female=2. Receptor status: ER/PR/HER2 positive=1; negative=0. Ki-67: ≤14%=1; >14%=2. P53: positive or mutant=1; negative or wild-type=0. Tumor markers (CEA, CA15-3, CA125): elevated=1; normal=0.
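The coding scheme above can be written as an explicit codebook. The following Python sketch is illustrative only: the field names (for example `us_axillary_nodes`, `ki67_percent`) and the helper functions are hypothetical, while the numeric codes are those listed in the text.

```python
def encode_age(age_years):
    """Age bands: <=35 -> 1; 36-45 -> 2; 46-59 -> 3; >=60 -> 4."""
    if age_years <= 35:
        return 1
    if age_years <= 45:
        return 2
    if age_years <= 59:
        return 3
    return 4


def encode_ki67(percent):
    """Ki-67: <=14% -> 1; >14% -> 2."""
    return 1 if percent <= 14 else 2


NODE_STATUS = {"normal": 0, "suspicious": 1}
RECEPTOR = {"negative": 0, "positive": 1}
MARKER = {"normal": 0, "elevated": 1}

# Field names are hypothetical; codes follow the scheme stated in the text.
CODEBOOK = {
    "us_axillary_nodes": NODE_STATUS,
    "ct_axillary_nodes": NODE_STATUS,
    "alnm": {"negative": 0, "positive": 1},
    "molecular_subtype": {
        "Luminal A": 1, "Luminal B (HER2-)": 2, "Luminal B (HER2+)": 3,
        "HER2-enriched": 4, "triple-negative": 5,
    },
    "histological_grade": {"well": 1, "moderate": 2, "poor": 3},
    "tumor_size": {"T1": 1, "T2": 2, "T3": 3},
    "pathology": {"invasive ductal carcinoma": 1,
                  "invasive lobular carcinoma": 2, "other": 3},
    "sex": {"male": 1, "female": 2},
    "er": RECEPTOR, "pr": RECEPTOR, "her2": RECEPTOR,
    "p53": {"negative": 0, "wild-type": 0, "positive": 1, "mutant": 1},
    "cea": MARKER, "ca15_3": MARKER, "ca125": MARKER,
}


def encode_patient(record):
    """Map one raw record (dict) to its numeric codes."""
    encoded = {}
    for field, value in record.items():
        if field == "age":
            encoded["age_group"] = encode_age(value)
        elif field == "ki67_percent":
            encoded["ki67"] = encode_ki67(value)
        else:
            encoded[field] = CODEBOOK[field][value]
    return encoded


example = encode_patient({
    "age": 52, "sex": "female", "tumor_size": "T2", "er": "positive",
    "us_axillary_nodes": "suspicious", "ct_axillary_nodes": "normal",
    "ki67_percent": 30,
})
print(example)
```

Keeping the mapping in one table makes the direction of each code explicit, which matters when interpreting the sign of a regression coefficient (for example, ER positive=1, so a positive coefficient corresponds to higher risk in ER-positive disease).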

Evaluation of relevant parameters

Tumor size was measured by ultrasound. Imaging-based assessment of ALNM followed standardized criteria: Ultrasound: nodes were considered suspicious if any of the following were present—cortical thickness >2 mm; round/oval shape with a full contour; eccentric cortical thickening or reduced/absent medulla; loss of the fatty hilum; and/or heterogeneous echogenicity. CT: nodes were considered suspicious if they showed heterogeneous parenchymal thickening, round or irregular/lobulated morphology, heterogeneous enhancement, and/or loss of the fatty hilum12. All imaging studies were interpreted by experienced radiologists who were blinded to pathological findings and used consensus protocols to ensure consistency; they participated in regular training updates. Axillary node positivity was defined as the presence of cancer cells on pathological examination. ER, PR, and Ki-67 expression were assessed by immunohistochemistry13. Tumor markers (CEA, CA15-3, CA125) were used as auxiliary indicators, with values above the reference range considered positive. Lymph node metastasis was confirmed with standard hematoxylin–eosin staining.

Statistical analysis

We used LASSO regression for variable selection and shrinkage. By penalizing model coefficients and shrinking some to zero, LASSO minimizes prediction error and retains variables with nonzero coefficients most strongly associated with the outcome. LASSO was implemented in R, and the optimal penalty parameter (lambda.1se) was chosen via 10-fold cross-validation based on the binomial deviance14,15. Candidate predictors included sex, age, tumor size, pathological type, histological grade, molecular subtype, ER, PR, HER2, Ki-67, P53, suspicious axillary nodes on ultrasound, suspicious axillary nodes on CT, CEA, CA15-3, and CA125. Variables with nonzero coefficients were entered into a multivariable logistic regression model. Odds ratios (ORs) with 95% confidence intervals (CIs) and two-tailed P values were reported.
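The lambda.1se criterion can be stated concretely: among all penalty values whose mean cross-validated deviance lies within one standard error of the minimum, the largest (most strongly regularizing) value is chosen. The Python sketch below illustrates only this selection rule; the function name is hypothetical and the cross-validation values are made-up numbers, not results from this study.

```python
def select_lambda_1se(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from cross-validation summaries.

    lambdas : candidate penalty values
    cv_mean : mean cross-validated binomial deviance for each lambda
    cv_se   : standard error of that mean for each lambda
    """
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    # Largest penalty whose deviance is still within one SE of the minimum
    within_one_se = [lam for lam, m in zip(lambdas, cv_mean) if m <= threshold]
    return lambdas[i_min], max(within_one_se)


# Illustrative (made-up) cross-validation curve
lambdas = [0.200, 0.100, 0.050, 0.020, 0.010]
cv_mean = [1.30, 1.16, 1.12, 1.10, 1.11]
cv_se = [0.03, 0.03, 0.03, 0.03, 0.03]

lam_min, lam_1se = select_lambda_1se(lambdas, cv_mean, cv_se)
print(lam_min, lam_1se)  # 0.02 0.05
```

Because lambda.1se is at least as large as the deviance-minimizing penalty, it tends to retain fewer predictors, which is consistent with the compact four-variable model reported in the Results.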

A nomogram was constructed from the final model. Model performance was evaluated in the training, internal validation, and external validation cohorts. Discrimination was quantified using the concordance index (C-index; range, 0.5–1.0; higher values indicate better performance)16 and the area under the receiver operating characteristic curve (AUC)17. Calibration was assessed with calibration plots comparing predicted and observed ALNM18. Clinical utility was evaluated using DCA19. All analyses were performed in R version 4.1.3 (http://www.r-project.org).
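For a binary outcome such as ALNM, the C-index and the AUC measure the same quantity: the proportion of (metastasis-positive, metastasis-negative) patient pairs in which the positive patient receives the higher predicted probability, with ties counted as one half. The following Python sketch shows this pairwise definition; the function name and the toy data are illustrative assumptions, not study data.

```python
def c_index_binary(y_true, y_score):
    """Concordance index for a binary outcome (equal to the ROC AUC).

    Counts positive/negative pairs in which the positive case has the
    higher score; tied scores contribute 0.5.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Toy example: 3 node-positive and 3 node-negative patients
y = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
print(round(c_index_binary(y, scores), 3))  # 0.889
```

In the toy example, 8 of the 9 positive/negative pairs are ranked correctly, giving 8/9. This equivalence explains why identical C-index and AUC values are expected within each cohort.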

Results

Clinical characteristics

A total of 1,368 patients with BC were included: 914 in the training cohort, 393 in the internal validation cohort, and 61 in the external validation cohort. The overall rate of ALNM was 45.98%. ALNM positivity rates were 46.06% in the training cohort, 46.06% in the internal validation cohort, and 44.26% in the external validation cohort (Table 1).

Table 1. Characteristics of the study cohorts.

LASSO and multivariable logistic regression

In the training cohort, LASSO regression identified four predictors with nonzero coefficients associated with ALNM: ER (0.130), suspicious axillary lymph nodes on ultrasound (1.242), suspicious axillary lymph nodes on CT (1.475), and tumor size (0.005). Multivariable logistic regression confirmed the statistical significance of ER (P = 8.44×10^−6), suspicious axillary lymph nodes on ultrasound (P = 5.37×10^−13), suspicious axillary lymph nodes on CT (P = 9.11×10^−13), and tumor size (P = 0.004) (Table 2).

Table 2. LASSO and multivariable logistic regression.

Nomogram development

Based on the LASSO-selected variables, we developed a nomogram to estimate the probability of ALNM. Tumor size, ER status, and the presence of suspicious axillary lymph nodes on ultrasound and CT contributed most to risk prediction. Lower predicted risk of ALNM was associated with smaller tumor size, ER-negative status, and normal-appearing axillary lymph nodes on both ultrasound and CT (Figure 2A). The nomogram was constructed using the regplot package to facilitate individualized risk estimation (Figure 2B). For example, a patient with a smaller tumor, ER-negative status, and normal axillary lymph nodes on ultrasound and CT had a total score of 38.4, corresponding to a predicted ALNM probability of 30.5%. The coefficient paths versus L1 norm, log lambda, and deviance explained demonstrated progressive coefficient shrinkage consistent with LASSO regularization (Figure 2C). The regularization path further illustrated how changes in lambda affected model fit, with notable shifts in binomial deviance at specific penalty strengths (Figure 2D).

Fig. 2

(A) Multivariable nomogram for predicting the risk of ALNM in breast cancer patients. (B) Static nomogram for predicting ALNM in breast cancer patients. (C) Coefficient plot of the LASSO model from the training cohort. (D) Coefficient plot of the LASSO model with 10-fold cross-validation from the training cohort.
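A nomogram of this kind is a graphical representation of the fitted logistic regression model. As a hedged sketch of the general form (the symbols below are notation introduced here for illustration; the fitted intercept and coefficients are not reproduced), the predicted probability of ALNM for a patient with coded predictors is:

```latex
\begin{aligned}
\eta &= \beta_0
      + \beta_{\mathrm{size}}\, x_{\mathrm{size}}
      + \beta_{\mathrm{ER}}\, x_{\mathrm{ER}}
      + \beta_{\mathrm{US}}\, x_{\mathrm{US}}
      + \beta_{\mathrm{CT}}\, x_{\mathrm{CT}}, \\[4pt]
\Pr(\mathrm{ALNM} = 1 \mid x) &= \frac{1}{1 + e^{-\eta}} .
\end{aligned}
```

Here $x_{\mathrm{size}} \in \{1, 2, 3\}$ and $x_{\mathrm{ER}}, x_{\mathrm{US}}, x_{\mathrm{CT}} \in \{0, 1\}$ follow the coding given in the Data preprocessing subsection. The point scale on the nomogram is a linear rescaling of each term $\beta_j x_j$, so the total score is a monotone function of $\eta$ and maps to a single predicted probability.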

Validation of the nomogram

The nomogram showed strong discriminatory performance, with concordance indices (C-indices) of 0.81, 0.74, and 0.84 in the training, internal validation, and external validation cohorts, respectively. Areas under the receiver operating characteristic curve (AUCs) were 0.81, 0.74, and 0.84 for the respective cohorts (Figures 3A-C). Calibration plots indicated good agreement between predicted and observed probabilities in the training cohort, with slightly reduced agreement in the internal and external cohorts, likely reflecting smaller sample sizes (Figures 4A-C).

Fig. 3

ROC curves showing the predictive performance of the nomogram for ALNM in patients with breast cancer. (A) ROC curve in the training cohort. (B) ROC curve in the internal validation cohort. (C) ROC curve in the external validation cohort.

Fig. 4

Calibration curves of the nomogram for predicting ALNM in the training cohort (A), internal validation cohort (B), and external validation cohort (C).

Clinical utility

DCA demonstrated that, across threshold probabilities corresponding to cost-benefit ratios from 1:100 to 4:1, the model provided a higher net benefit than the treat-all or treat-none strategies. In the training cohort (Figure 5A), the model’s net benefit gradually declined as the threshold increased, while the “all” and “none” strategies remained near zero. In the internal validation cohort (Figure 5B), the decline in net benefit was slower at certain higher thresholds, suggesting more stable performance in specific ranges. In the external validation cohort (Figure 5C), net benefit fluctuated at higher thresholds, indicating greater sensitivity to threshold selection. Overall, despite a gradual decrease at higher thresholds, the model retained clinically meaningful net benefit within selected cost-benefit ranges.

Fig. 5

Decision curve analysis (DCA) of the nomogram for predicting ALNM in the training cohort (A), internal validation cohort (B), and external validation cohort (C).
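The net benefit plotted in a decision curve can be computed directly from predicted probabilities and observed outcomes. The Python sketch below uses the standard definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), and assumes that a cost-benefit ratio a:b denotes threshold odds, so that pt = a/(a + b). Under that assumption, ratios of 1:100 and 4:1 correspond to threshold probabilities of about 0.01 and 0.80. Function names and toy data are illustrative, not taken from the study.

```python
def ratio_to_threshold(cost, benefit):
    """Convert a cost:benefit ratio (threshold odds) to a threshold probability."""
    return cost / (cost + benefit)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on patients with predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit when every patient is treated as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data: 2 node-positive and 2 node-negative patients
y = [1, 1, 0, 0]
prob = [0.9, 0.6, 0.7, 0.1]

print(net_benefit(y, prob, 0.5))         # 0.25
print(net_benefit_treat_all(y, 0.5))     # 0.0
print(ratio_to_threshold(4, 1))          # 0.8
```

The treat-none strategy has a net benefit of zero at every threshold, so a model is useful at a given threshold only if its net benefit exceeds both zero and the treat-all value.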

Discussion

ALNM is a major determinant of prognosis and a cornerstone of therapeutic decision-making in BC. Tumor cells disseminate to axillary nodes via lymphatic channels, forming secondary foci that accelerate disease progression and correlate with higher recurrence and poorer outcomes20. Historically, axillary lymph node dissection (ALND) was routinely performed when preoperative nodal status was uncertain to reduce local recurrence. However, Soran et al. reported that indiscriminate ALND may disrupt the local immune microenvironment and facilitate distant spread, underscoring the importance of accurate preoperative assessment21. Our model, derived using LASSO and multivariable logistic regression and integrating clinicopathologic variables, imaging findings, and tumor markers, offers a high-precision and low-risk tool for preoperative evaluation of ALNM.

In this cohort, conventional serum tumor markers (CEA, CA15-3, CA125) were not significant predictors of ALNM, suggesting limited sensitivity for nodal involvement. Histological grade also did not retain independent significance after adjustment, in contrast to findings by Achouri et al.22. This discrepancy may reflect the dominant predictive contribution of imaging assessments (ultrasound and CT) in our model, which could attenuate the effect of histological grade. Nevertheless, we observed a higher ALNM rate in grade III tumors compared with grades I-II, consistent with Gao et al., who linked higher grade to more aggressive biology and increased nodal metastasis23. Other variables (PR, HER2, Ki-67, and pathological type) were not statistically significant, a result aligned with several international studies22,24,25,26. Vascular invasion, although prognostically relevant, could not be incorporated because it relies on postoperative histopathology and thus is unavailable preoperatively. Notably, the model performed best in the external validation cohort (C-index=0.84; AUC=0.84), slightly exceeding performance in the development and internal validation cohorts. This may reflect the broader patient spectrum in the external cohort, enhancing generalizability.

Using LASSO, we identified four independent predictors of ALNM: tumor size, ER positivity, suspicious axillary lymph nodes on ultrasound, and suspicious axillary lymph nodes on CT. These predictors showed robust and independent associations with ALNM, in line with prior reports24,27. In our cohort, larger tumors and ER positivity were associated with higher odds of ALNM; accordingly, the nomogram indicates lower predicted risk for smaller tumors and ER-negative status. Imaging strengthened predictive accuracy: ultrasound provides rapid, low-cost evaluation with high specificity28, whereas CT offers detailed assessment of nodal morphology and enhancement patterns29. Suspicious nodes on either modality were strongly associated with ALNM, consistent with Riedel et al., highlighting the central role of imaging in preoperative risk stratification30.

These macroscopic features likely reflect underlying tumor biology. Emerging studies implicate dysregulated molecular pathways in tumor progression and nodal spread. For example, alterations in sphingolipid metabolism-related genes have been linked to BC outcomes31, and DBNDD1 expression has been associated with prognosis and immune biomarkers in invasive BC32, suggesting that lipid metabolism and microenvironmental interactions may modulate metastatic behavior. Although our model emphasizes readily available clinical and imaging data for practicality, future integration of such molecular markers could enhance mechanistic insight and further improve predictive performance.

Compared with prior models—such as those excluding imaging data10 or MRI-based radiomics models limited by cost and availability11—our approach combines routinely obtainable clinicopathologic and imaging variables, achieving strong discrimination (training C-index=0.81; external C-index=0.84) and broad applicability across diverse clinical settings. The inclusion of both internal and external validation strengthens the reliability and implementability of the tool.

This study has limitations. First, its retrospective design may introduce selection bias, potentially affecting generalizability; prospective validation is warranted. Second, the external validation sample size was relatively small, which may affect the stability of estimates. Third, lymph node metastasis was assessed with hematoxylin–eosin staining alone; absence of immunohistochemical evaluation could miss micrometastases—particularly in nodes that appear normal on ultrasound or CT—thereby affecting the model’s sensitivity. Finally, we did not provide a formal risk stratification scheme that aggregates key predictors (e.g., tumor size, CT-detected suspicious nodes, ER status) into clinically actionable risk tiers. Future work should prioritize large, multicenter, and multi-regional prospective studies; incorporate high-sensitivity diagnostic techniques such as immunohistochemistry; and develop standardized risk strata to facilitate decision-making and maximize clinical utility.

Conclusion

We present an interpretable, externally validated nomogram that predicts ALNM preoperatively using tumor size, ER status, and ultrasound/CT findings, with robust discrimination and clinical benefit. Prospective multicenter studies with high-sensitivity pathology and integrated risk tiers are needed to optimize generalizability and applicability.