Introduction

Triple-negative breast cancer (TNBC) is a heterogeneous disease characterized by the absence of estrogen receptor, progesterone receptor, and human epidermal growth factor receptor 2, accounting for 15–20% of all breast cancer cases1,2. Compared to other subtypes, TNBC is associated with an earlier age of onset, larger tumor size, higher recurrence rate, and more aggressive biology, resulting in a poorer prognosis2,3. The lack of endocrine and targeted therapies makes reducing recurrence and metastasis risk an urgent challenge for oncologists. Postmastectomy radiotherapy (PMRT) is a key treatment strategy for TNBC patients, as it improves locoregional recurrence-free survival (LRFS) and overall survival (OS)4,5.

In general, PMRT is recommended for breast cancer patients with ≥ 4 involved axillary lymph nodes (N2–3) or tumor size > 5 cm (T3–4)6. However, the role of PMRT in TNBC patients specifically has not been thoroughly evaluated7,8. Furthermore, PMRT is linked with adverse effects, including financial burden and long-term toxicities9, highlighting the need for more precise methods to identify TNBC patients who would benefit from PMRT.

Previous studies have demonstrated associations between eligibility for PMRT and advanced T/N stage, younger age, estrogen receptor-negative status, and lymphovascular invasion (LVI)10,11. Our team has also reported that these clinicopathological risk factors can aid in personalizing PMRT decisions for breast cancer patients12,13. Meanwhile, hematoxylin and eosin (H&E)-stained whole slide images (WSIs) are widely used for predicting patient prognosis14,15, and there is growing interest in using deep learning models based on WSIs to predict radiotherapy response16,17. This suggests that histopathological characteristics derived from WSIs could provide valuable insights into PMRT outcomes and prognosis.

In this study, we aimed to employ machine learning to extract prognostic features from H&E-stained WSIs and combine them with key clinicopathological parameters to calculate a risk score. This risk score is designed to support PMRT decision-making for TNBC patients.

Materials and methods

Study population

Patients diagnosed with TNBC between March 2011 and May 2019 were recruited from the Breast Cancer Information Management System18,19,20 at West China Hospital, Sichuan University (WCH), and The Cancer Genome Atlas (TCGA) database. Inclusion criteria for the WCH cohort were as follows: (a) TNBC diagnosis with unilateral cancer; (b) no distant metastasis confirmed by imaging and pathological examination before treatment; (c) availability of at least one WSI of an H&E-stained, formalin-fixed, paraffin-embedded tissue section. For patients undergoing neoadjuvant therapy, biopsy WSIs were used. The patient selection flowchart for each experiment is presented in Fig. 1a. Non-neoadjuvant TNBC patients from TCGA with WSIs and information on age, T/N stage (AJCC 8th edition), surgical procedure, and PMRT were included for external validation of the prognostic and PMRT decision-making models.

Fig. 1

The patient selection process and schematic diagram of the experimental design. (a) Flow chart of the patient selection process. (b) Schematic diagram of the experimental design (created with BioRender.com). WSIs were first divided into small tiles for energy analysis. High-energy tiles were retained for tumor cell segmentation and quantification of cell features. The image score was then calculated by an XGBoost model using important features selected by Cox regression analysis. Similar methods were applied to the clinicopathological score (containing age, T stage, and N stage) and the risk score (containing the image score, age, T stage, and N stage). Finally, patients were divided into “high-risk” and “low-risk” groups according to the risk score, corresponding to patients who could or could not benefit from PMRT, respectively. (c) The top 80 high-energy tiles (top) and the low-energy tiles (bottom), calculated using the Sobel filter. Scale bar: 100 μm. Mask-RCNN, mask region convolutional neural network; PMRT, postmastectomy radiotherapy.

Ethics approval and consent to participate

This study was conducted following the Declaration of Helsinki and was approved by the Ethics Committee of West China Hospital, Sichuan University (No. 1308 in 2022), and a waiver of informed consent was granted.

Follow-up and outcomes

The follow-up duration ranged from 3 to 218 months. The endpoints included 5-year invasive disease-free survival (iDFS), OS, and LRFS. Event definitions followed a previously published article21. Notably, LRFS data were unavailable for the TCGA cohort. All time-to-event endpoints were measured in months from the date of pathological diagnosis.

Data partition

Three sets of eXtreme Gradient Boosting (XGBoost) experiments were conducted: (1) extracting WSI features to calculate the image score, (2) calculating the clinicopathological score using clinicopathological prognostic factors, and (3) combining the image score and prognostic factors to compute the risk score for prognostic prediction and PMRT benefit-risk stratification. The WCH dataset was randomly divided into training and validation cohorts at a ratio of 6:4, ensuring that tiles from the same patient remained in the same cohort. The TCGA dataset was used as the test cohort. The data partition is shown in Supplementary Fig. S1a, b.
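The patient-level 6:4 partition described above can be illustrated with a minimal sketch. It assumes a hypothetical mapping from tile IDs to patient IDs; the function name, seed, and toy data are illustrative and are not taken from the study's code. The point is only that the random split is made over patients, so all tiles of one patient land in the same cohort.

```python
import random

def split_patients(tile_to_patient, train_fraction=0.6, seed=0):
    """Randomly assign patients (not tiles) to training/validation cohorts.

    tile_to_patient: dict mapping a tile ID to its patient ID (hypothetical layout).
    Returns two lists of tile IDs: (training tiles, validation tiles).
    """
    patients = sorted(set(tile_to_patient.values()))
    rng = random.Random(seed)
    rng.shuffle(patients)
    n_train = round(len(patients) * train_fraction)
    train_patients = set(patients[:n_train])
    # Every tile follows its patient, so no patient is split across cohorts.
    train_tiles = [t for t, p in tile_to_patient.items() if p in train_patients]
    valid_tiles = [t for t, p in tile_to_patient.items() if p not in train_patients]
    return train_tiles, valid_tiles

# Toy example (not study data): 10 patients with 3 tiles each.
tiles = {f"P{p:02d}_tile{k}": f"P{p:02d}" for p in range(10) for k in range(3)}
train_tiles, valid_tiles = split_patients(tiles)
```

Splitting at the tile level instead would let tiles from one patient appear in both cohorts, which would be expected to inflate validation performance.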

WSI feature extraction, the image score and clinicopathological score calculation

The study framework is depicted in Fig. 1b. Initially, WSIs were divided into 2000 × 2000-pixel tiles using the OpenSlide Python package. Energy values for each tile were calculated using an efficient implementation of the Sobel filter22. High-energy tiles (enriched in tumor cells; Fig. 1c), confirmed by pathologists, were used for tumor cell segmentation via a mask region convolutional neural network23. For each of 126 tumor cell features, seven statistical metrics (mean, median, variance, maximum, minimum, quartile, and standard deviation) were calculated with the scikit-image (skimage) image processing library24, generating 882 variables (126 × 7). Following univariate Cox regression, 255 variables (P < 0.01) were retained for subsequent steps to enhance robustness and minimize overfitting. In this step, the false discovery rate (FDR) was used to assess the reliability of the univariate Cox regression analysis. These variables, along with tumor cell count and energy value, were used as input data, and the occurrence and timing of invasive recurrence within 5 years were used as the outcome label to establish the XGBoost model and output the image score (XGBoost model 1; see the equation below). XGBoost was selected because it has been shown to be effective in predicting disease risk25. The patient-level image score was defined as the median of the tile-level scores for that patient.

$$Z = H\left(x_i\right) = \sum_{t=1}^{T} f_t\left(x_i\right)$$

where x_i represents the independent variables, f_t(x_i) is the output of the t-th tree, and T is the total number of trees. A similar methodology was employed to generate each patient’s clinicopathological score (XGBoost model 2). During implementation of the XGBoost model, important parameters (Supplementary Methods; Supplementary Table S1) were tuned using Grid Search26 to optimize performance. Further details regarding this algorithm can be found in our previously published article27.
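The tile-level steps described in this subsection (Sobel energy, additive tree scoring, and the patient-level median) can be sketched in a few lines. This is a simplified, standard-library illustration under stated assumptions, not the study's implementation: "energy" is assumed here to be the mean Sobel gradient magnitude, the energy threshold stands in for the pathologist-confirmed tile selection, and the two depth-1 trees, their feature names, and all values are hypothetical. A trained XGBoost model would additionally involve a base score and learned leaf weights.

```python
import math
from statistics import median

# 3x3 Sobel kernels for horizontal (GX) and vertical (GY) intensity gradients.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_energy(pixels):
    """Mean Sobel gradient magnitude over the interior of a grayscale tile.

    pixels: list of equal-length rows of intensities (at least 3x3).
    Defining energy as the mean magnitude is an assumption of this sketch.
    """
    h, w = len(pixels), len(pixels[0])
    total = 0.0
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    v = pixels[r + dr - 1][c + dc - 1]
                    gx += GX[dr][dc] * v
                    gy += GY[dr][dc] * v
            total += math.hypot(gx, gy)
    return total / ((h - 2) * (w - 2))

def stump(feature, threshold, left, right):
    """Depth-1 tree f_t: returns `left` if x[feature] < threshold, else `right`."""
    return lambda x: left if x[feature] < threshold else right

def ensemble_score(trees, x):
    """Z = H(x_i) = sum over t = 1..T of f_t(x_i)."""
    return sum(tree(x) for tree in trees)

def patient_image_score(trees, tiles, min_energy):
    """Median tile score over tiles whose energy is at least min_energy.

    min_energy is a hypothetical threshold; at least one tile must pass it.
    """
    kept = [t for t in tiles if sobel_energy(t["pixels"]) >= min_energy]
    return median(ensemble_score(trees, t["features"]) for t in kept)

# Toy data (not from the study): a flat background patch and a sharp-edged patch.
flat = [[10] * 6 for _ in range(6)]
edge = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
trees = [stump("nuclear_area_mean", 50.0, -0.2, 0.3), stump("cell_count", 400, -0.1, 0.4)]
tiles = [
    {"pixels": edge, "features": {"nuclear_area_mean": 62.0, "cell_count": 520}},
    {"pixels": edge, "features": {"nuclear_area_mean": 41.0, "cell_count": 610}},
    {"pixels": edge, "features": {"nuclear_area_mean": 45.0, "cell_count": 180}},
    {"pixels": flat, "features": {"nuclear_area_mean": 70.0, "cell_count": 900}},
]
score = patient_image_score(trees, tiles, min_energy=100.0)
```

In this toy case the flat tile has zero energy and is excluded, so the patient-level score is the median of the three remaining tile scores.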

The risk score calculation and PMRT benefit-risk stratification

The risk score (XGBoost model 3; Fig. 1b) was computed by integrating the image score with clinicopathological prognostic factors, using 5-year iDFS as the outcome. Patients with unknown T/N stage or those undergoing breast-conserving surgery (BCS) were not included in this experiment (Fig. 1a). The optimal risk score cutoff was determined using the Youden index from survival receiver operating characteristic (ROC) curves in R. The cutoff aimed to satisfy two criteria simultaneously: (1) PMRT significantly improved outcomes in the “high-risk” group, with a smaller P value indicating stronger significance; and (2) no significant outcome differences between PMRT and non-PMRT groups in the “low-risk” group, with the P value being as large as possible. Specifically, time-dependent ROC curves were constructed to guide the design of a composite scoring function that simultaneously satisfied both P value criteria. Candidate cutoffs were iteratively evaluated by stratifying patients into high- and low-risk groups, followed by survival analysis comparing PMRT-treated versus untreated subgroups. The final cutoff was selected through a comprehensive search across all candidate thresholds to optimize stratification robustness and clinical interpretability.
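The Youden-index step of the cutoff search can be illustrated as follows. This sketch is a deliberate simplification: it treats the 5-year outcome as a binary label and ignores censoring, whereas the analysis described above used time-dependent ROC curves in R together with log-rank comparisons of PMRT versus no-PMRT subgroups. The function name and toy data are hypothetical, and both outcome classes are assumed to be present.

```python
def youden_cutoff(scores, events):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    scores: risk scores; events: 1 if the event occurred, else 0.
    A patient is called "high-risk" when score >= cutoff.
    Assumes at least one event and one non-event.
    """
    n_pos = sum(events)
    n_neg = len(events) - n_pos
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, e in zip(scores, events) if s >= cutoff and e == 1)
        tn = sum(1 for s, e in zip(scores, events) if s < cutoff and e == 0)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:  # ties keep the lowest cutoff
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Toy data (not from the study).
scores = [0.25, 0.40, 0.55, 0.61, 0.70, 0.90, 1.20, 1.80]
events = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(scores, events)
```

In a fuller version, each candidate cutoff would also be scored by the two log-rank P values described above, and the cutoff would be chosen on that composite criterion rather than on J alone.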

Statistical analysis

Clinicopathological variables analyzed included age, menopausal status, tumor grade, T/N stage, Ki-67 index, LVI, and treatment details. Cox proportional hazards regression and Kaplan–Meier analyses were performed to evaluate survival outcomes. XGBoost-based SHapley Additive exPlanations (SHAP; XGBoost model 4) were used to assess variable importance28. Cox regression analysis was performed in R 4.1.2; the chi-square test or Fisher’s exact test for baseline characteristics and Kaplan–Meier analysis were performed in GraphPad Prism 8.0.2; and SHAP plots were generated in Python 3.7.12. P < 0.05 was considered statistically significant.
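For readers less familiar with the Kaplan–Meier estimator used above, a minimal product-limit sketch is given below. The study itself used GraphPad Prism for this analysis; the function and toy data here are illustrative only.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times: follow-up in months; events: 1 = event observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = removed = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored patients both leave the risk set
    return curve

# Toy data (not from the study).
times = [6, 12, 12, 20, 30, 45, 60, 60]
events = [1, 0, 1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
```

Censored patients contribute to the risk set up to their last follow-up but never lower the survival estimate themselves.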

Results

Baseline characteristics

Table 1 summarizes the detailed characteristics of the study participants. The preliminary analysis included 608 TNBC patients from the WCH cohort, with a median age of 48 years (range: 23–81). Among these, 212 patients (34.87%) received PMRT, 103 patients (16.94%) received neoadjuvant chemotherapy (NACT), 601 patients (98.85%) received adjuvant chemotherapy (ACT), and 54 patients (8.88%) underwent BCS. The chemotherapy regimens of patients receiving PMRT and those not receiving PMRT (no-PMRT) are shown in Supplementary Table S2. Before 2020, most patients received two-dimensional radiotherapy, while a substantial number of patients received intensity-modulated radiotherapy after 2020. The decision to administer PMRT was made jointly by the physician and the patient, in accordance with the guidelines of the Chinese Society of Clinical Oncology (CSCO) and the European Society for Radiotherapy and Oncology (ESTRO). The prescribed dose was 46–50 Gy in 25 fractions, targeting the ipsilateral chest wall as well as the supraclavicular and infraclavicular regions. In the independent test cohort, 182 TNBC patients from TCGA were included, with a median age of 55 years (range: 27–90). In this cohort, 92 patients (50.55%) received PMRT, 74 patients (40.66%) received ACT, and 49 patients (26.92%) underwent BCS. Each patient contributed 1–7 H&E-stained WSIs, resulting in 1,417 WSIs and 25,187 tiles used in the analysis. Supplementary Tables S3 and S4 present the detailed characteristics of PMRT and no-PMRT patients in the WCH cohort and TCGA cohort, respectively.

Table 1 Baseline characteristics of TNBC patients included in this study (N = 790).

The image score emerges as a negative prognostic predictor for TNBC patients

During a mean follow-up period of 84.2 (range: 3–218) months, 95 patients (15.63%) experienced invasive recurrence, 21 patients (3.45%) experienced locoregional recurrence, and 67 patients (11.02%) died within 5 years. Univariate Cox analysis identified advanced T stage, N stage, and NACT as significant predictors of invasive recurrence [T2–4 vs. T1: hazard ratio (HR) = 1.88, 95% confidence interval (CI) 1.27–2.78, P = 0.002; N1–3 vs. N0: HR = 2.36, 95% CI 1.72–3.24, P < 0.001; NACT: HR = 2.39, 95% CI 1.54–3.72, P < 0.001] and mortality (T2–4 vs. T1: HR = 2.07, 95% CI 1.25–3.42, P = 0.004; N1–3 vs. N0: HR = 3.00, 95% CI 1.99–4.52, P < 0.001; NACT: HR = 3.50, 95% CI 2.13–5.76, P < 0.001) within 5 years (Table 2). Meanwhile, the image score demonstrated a strong correlation with poor outcomes in both the WCH validation cohort (iDFS: HR = 8.39, 95% CI 3.70–19.04, P < 0.001; OS: HR = 9.60, 95% CI 3.73–24.66, P < 0.001; LRFS: HR = 5.78, 95% CI 1.31–25.54, P = 0.021) and TCGA test cohort (iDFS: HR = 3.32, 95% CI 1.59–6.92, P = 0.001; OS: HR = 2.86, 95% CI 1.16–7.05, P = 0.022) (Fig. 2a).

Table 2 The univariate Cox proportional hazard regression model of TNBC patients (WCH; N = 608).
Fig. 2

The image score for prognosis analysis. (a) Association between the image score and survival in univariate Cox regression. Relative importance of variables according to SHAP for 5-year OS (b, c), LRFS (d, e), and iDFS (f, g). PMRT, postmastectomy radiotherapy; LVI, lymphovascular invasion; SHAP, SHapley Additive exPlanations.

Although age showed no significant correlation with survival, it was included in the multivariable Cox analysis alongside T stage, N stage, NACT, and the image score because it is often of concern to clinicians (Table 3). Multivariable analysis confirmed the image score as an independent predictor of poorer iDFS (HR = 8.80, 95% CI 4.85–15.96, P < 0.001), OS (HR = 9.02, 95% CI 4.24–19.17, P < 0.001), and LRFS (HR = 8.39, 95% CI 1.96–35.97, P = 0.004). In addition, the SHAP diagrams in Fig. 2b–g indicated that the image score held the highest predictive importance, followed by N stage and other clinicopathological variables.

Table 3 The multivariate Cox proportional hazard regression model of TNBC patients (WCH; N = 608).

Further employing SHAP analysis to visualize the contributions of clinicopathological features to the image score revealed that age, T-stage, and N-stage were the most influential features (Supplementary Fig. S2). Patients with advanced N-stage exhibited significantly higher image scores (Supplementary Fig. S3). Visual comparison of WSI tiles corresponding to the highest- and lowest-scoring cases showed that tiles with high image score exhibited a greater degree of nuclear atypia, increased mitotic figures, and larger, more prominent perinuclear halos. In contrast, tiles with low image score displayed more uniform nuclear morphology (Supplementary Fig. S4).

The risk score outperforms the clinicopathological score in prognostic prediction

To evaluate whether WSI features improve prognostic prediction, the predictive performance of the clinicopathological score (comprising age, T stage, and N stage) and the risk score was compared using the area under the ROC curve (AUC), as shown in Fig. 3. The risk score consistently outperformed the clinicopathological score in both the WCH validation cohort (OS: 0.874 vs. 0.807; LRFS: 0.957 vs. 0.870; iDFS: 0.805 vs. 0.756) and the TCGA test cohort (OS: 0.685 vs. 0.617; iDFS: 0.683 vs. 0.589). The image score alone also showed slightly higher predictive accuracy than the clinicopathological score (Fig. 3a–e).
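As a reminder of what the AUC comparison measures, the sketch below computes the AUC as the probability that a randomly chosen patient with the event has a higher score than a randomly chosen patient without it (ties count one half). It treats the 5-year outcome as binary and ignores censoring, so it is a simplification of the survival ROC analysis reported here; the two score vectors are toy values, not study data.

```python
def auc(scores, events):
    """AUC via pairwise comparison of event and non-event cases (ties count 0.5)."""
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (not from the study): the same outcomes scored by two hypothetical models.
events = [0, 0, 0, 1, 0, 1, 1, 1]
clinical_score = [0.2, 0.5, 0.3, 0.4, 0.6, 0.5, 0.7, 0.9]
risk_score = [0.25, 0.40, 0.55, 0.61, 0.70, 0.90, 1.20, 1.80]

auc_clinical = auc(clinical_score, events)
auc_risk = auc(risk_score, events)
```

With these toy values the second score ranks event cases above non-event cases more often, which is the sense in which one score "outperforms" another in an AUC comparison.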

Fig. 3

Comparison of the performance of the image score, clinicopathological score, and risk score in predicting survival. The ROC curves for predicting 5-year OS (a), LRFS (b), iDFS (c) of the WCH validation cohort (n = 209) and 5-year OS (d), iDFS (e) of TCGA test cohort (n = 182). AUC area under the curve.

Identifying PMRT candidates according to the risk score

In view of the toxicity of PMRT and its variable efficacy among patients, we sought to use the risk score to identify those most likely to benefit from this treatment. Because few T1–2N0 patients received PMRT (7/281), we performed 1:1 exact matching based on T stage, N stage, and age, ultimately including 14 T1–2N0 patients (Fig. 1a). This resulted in 254 patients from the WCH cohort being enrolled in the PMRT benefit-risk stratification experiment (Supplementary Fig. S1c). Patients were stratified into “high-risk” (n = 169) and “low-risk” (n = 85) groups based on an optimal risk score cutoff of 0.611 (range: 0.222–2.328). Kaplan‒Meier analysis revealed that PMRT significantly improved OS (P = 0.011) and LRFS (P = 0.015) in the “high-risk” group but provided no significant benefit in the “low-risk” group (OS: P = 0.343; LRFS: P = 0.779) (Fig. 4a–d). Similar findings for OS were observed in the TCGA cohort, which comprised 133 patients after excluding those treated with BCS (Fig. 4e, f). Although “high-risk” patients in the WCH cohort demonstrated improved iDFS with PMRT (P = 0.002), the TCGA cohort showed only a trend towards improvement (P = 0.088; Supplementary Fig. S5). This aligns with prior findings that PMRT predominantly improves local control and OS29,30.
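The 1:1 exact matching step can be sketched as below. Whether age was matched on the exact value or within bands is not stated in the text, so matching on identical values is an assumption of this sketch; the record layout, function name, and toy patients are likewise hypothetical.

```python
import random

def exact_match_1to1(treated, controls, keys, seed=0):
    """Pair each treated patient with one unused control sharing identical `keys`.

    Returns a list of (treated_id, control_id) pairs; treated patients without
    an exact match are left out.
    """
    rng = random.Random(seed)
    pool = {}
    for c in controls:
        pool.setdefault(tuple(c[k] for k in keys), []).append(c["id"])
    pairs = []
    for t in treated:
        candidates = pool.get(tuple(t[k] for k in keys), [])
        if candidates:
            # Pick one control at random and remove it so it is used only once.
            pairs.append((t["id"], candidates.pop(rng.randrange(len(candidates)))))
    return pairs

# Toy data (not from the study).
treated = [
    {"id": "A", "T": "T1", "N": "N0", "age": 45},
    {"id": "B", "T": "T2", "N": "N0", "age": 52},
    {"id": "E", "T": "T2", "N": "N0", "age": 70},
]
controls = [
    {"id": "C1", "T": "T1", "N": "N0", "age": 45},
    {"id": "C2", "T": "T2", "N": "N0", "age": 52},
    {"id": "C3", "T": "T2", "N": "N0", "age": 52},
    {"id": "C4", "T": "T1", "N": "N0", "age": 60},
]
pairs = exact_match_1to1(treated, controls, keys=("T", "N", "age"))
```

In this toy example patient E has no exact match and is dropped, which is the usual cost of exact matching in small subgroups.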

Fig. 4

Kaplan‒Meier survival curves of 5-year OS and LRFS in the PMRT and no-PMRT groups. 5-year OS of the “high-risk” (a) and “low-risk” (b) patients in the WCH cohort. 5-year LRFS of the “high-risk” (c) and “low-risk” (d) patients in the WCH cohort. 5-year OS of the “high-risk” (e) and “low-risk” (f) patients in the TCGA cohort. PMRT, postmastectomy radiotherapy; no-PMRT, did not receive postmastectomy radiotherapy. P values were determined by the log-rank test.

Discussion

As is well known, histopathology has long been the gold standard for disease diagnosis, with tumor cellular characteristics closely linked to tumor biology. In this study, we extracted features and computed image scores from H&E-stained WSIs of 608 TNBC patients from the WCH cohort and 182 TNBC patients from TCGA. This image score emerged as an independent predictor of 5-year OS, LRFS, and iDFS. More importantly, the risk score, which integrates the image score with clinicopathological factors, demonstrated superior performance in predicting prognosis compared to traditional characteristics. It also successfully identified “high-risk” individuals likely to benefit from PMRT, underscoring its potential utility in guiding treatment decisions.

Compared with traditional pathological characteristics, WSIs of cancerous tissues have surfaced as a new crucial resource for clinical diagnosis, treatment response assessment, and prognosis prediction31. By leveraging the pattern discernment capabilities of machine learning algorithms and voluminous, multiscale information inherent in WSIs, our team has previously demonstrated the utility of these features for prognosis prediction and validated their performance on publicly available datasets, including the National Lung Screening Trial32, breast cancer, and low-grade glioma datasets from TCGA33,34. Moreover, WSI features have also shown robust prognostic prediction potential across a variety of diseases, including hepatocellular carcinoma35, bladder cancer36, cervical cancer37, and diabetic kidney disease38.

In the present study, the image score outperformed T/N stage and NACT as a prognostic predictor. This superiority may be attributed to the comprehensive information encapsulated in WSIs, such as details on cell mitosis39, tumor-stromal ratio40, tumor grade41, and even gene expression42. In addition, our proposed risk score surpassed a recently reported model43 that combined clinicopathological features and WSI data to predict recurrence (local, regional, and distant) and death in TNBC patients and achieved an optimal AUC of 0.77. Thus, the risk score we developed serves as a reliable tool for identifying TNBC patients who may benefit from PMRT, offering personalized treatment options.

It is worth mentioning that the role of the 21-gene Oncotype DX recurrence score in PMRT decision-making has attracted much attention44,45. However, it is specifically designed for estrogen receptor-positive breast cancer patients. For TNBC and other subtypes, our previous studies emphasized the importance of clinicopathological risk factors in PMRT benefit stratification12,13. Consistent with these findings, this study reaffirmed the predictive value of tumor size and lymph node involvement in TNBC patients. Notably, our results align with previous studies of similar design46,47,48. Wang et al.46 analyzed 1,474 postmastectomy T1–2N1 patients and identified age ≤ 40 years, histological grade 3, 2–3 positive lymph nodes, and T2 stage (> 3 cm) as independent predictors of poor locoregional control. Xu et al.48 also proposed that age ≤ 40 years, 3 positive lymph nodes, T2 stage, and Ki-67 index > 20% were independent risk factors for predicting breast cancer-specific survival in T1–2N0–1 postmastectomy patients. These studies further suggested that patients with ≥ 2 risk factors might benefit from PMRT46,48. Our investigation found that 90.17% (211/234) of “high-risk” TNBC patients, as classified by the risk score, had ≥ 2 of the following risk factors: age ≤ 40 years, T2–4 stage, and > 1 positive lymph node. In contrast, 68.63% (105/153) of “low-risk” patients had ≤ 1 risk factor. These findings highlight the potential of the WSI-based risk score to enable precise and personalized PMRT decision-making.

This study has several limitations. The independent TCGA cohort lacked LRFS data, and the models were trained using data from a single center, which may limit generalizability. To address this issue, we will continue collecting samples prospectively for validation. Data heterogeneity may also have modestly affected model performance. First, features extracted from biopsy WSIs have lower representational capacity than the global patterns observable in surgical resection specimens. Second, although patients from different diagnostic and treatment years were randomly distributed across the training and validation sets, evolving radiotherapy regimens may have introduced subtle confounding effects. Future validation should employ standardized samples and contemporary radiotherapy guidelines to enhance scenario-specific utility.

Despite these constraints, our work provides innovative prognostic tools and valuable insights for PMRT application in TNBC patients. Additionally, this model leverages existing patient-specific digital H&E-stained postoperative pathology slides and electronic medical record (EMR)-derived age and T/N stage data through a fully automated pipeline. Upon finalization of the pathology report, artificial intelligence analysis is triggered, generating real-time risk scores that are either embedded within the pathology report or directly transmitted to the multidisciplinary team platform. The model enables analysis within routine clinical timeframes at near-zero marginal cost and integrates seamlessly with existing workflows.

Conclusions

Compared to traditional clinicopathological parameters, the WSI-based risk score demonstrated improved prognostic prediction performance. Following further validation, it holds promise as a tool for precisely identifying TNBC patients suitable for PMRT, thereby facilitating the development of more targeted and effective treatment strategies.