Introduction

Breast cancer remains a significant global health challenge, and advancements in treatment modalities continually shape the landscape of patient care1,2. In the context of early breast cancer (EBC), the advent of neoadjuvant systemic therapy (NST) has transformed the landscape of treatment options, presenting a dynamic interplay between surgical modalities and adjuvant therapies3,4. In recent years, several observational studies have consistently indicated superior survival outcomes in patients treated with breast-conserving surgery plus radiotherapy (BCS + RT) compared to those treated with mastectomy5,6,7,8,9,10. However, the survival impact of type of local therapy following modern NST is unclear11,12,13. A critical decision facing clinicians involves the choice between BCS + RT and mastectomy for patients who have undergone NST.

The present study addresses this clinical dilemma by comparing the long-term outcomes associated with BCS + RT and mastectomy in EBC patients post-NST. A cohort of 13,958 patients spanning 2010 to 2018 was analyzed, shedding light on the evolving trends in treatment preferences during this period. With the progression of imaging techniques for treatment response assessment and advancements in localizing breast lesions, there has been a notable rise in the utilization of BCS among patients undergoing NST3,14,15,16,17,18.

Beyond traditional survival analyses, our investigation incorporates a series of machine learning approaches aimed at enhancing personalized treatment decisions19. Leveraging data from the Surveillance, Epidemiology, and End Results (SEER)-Medicare database, we not only assess overall and breast cancer-specific survival but also introduce a Random Survival Forest (RSF) model. This model, based on ten prognostic variables, aims to predict long-term outcomes more accurately than conventional approaches, thereby offering a valuable tool for clinicians facing surgical treatment decisions. A cloud-based recommendation system extends the impact of our study. By visualizing survival curves and deploying this system on the internet, we bridge the gap between research findings and real-world clinical applications. This user-friendly tool facilitates dynamic and data-driven decision-making, empowering clinicians to tailor treatment plans to individual patient profiles.

Methods

Study population

We acquired the dataset from the SEER 17 Registries, Nov 2022 Sub database and conducted analysis utilizing SEER*Stat 8.4.0 software. Given the public nature of the data, which contain no personally identifiable patient information, our retrospective cohort study obtained approval from the Institutional Review Board of the Chongqing Health Center for Women and Children. The Board waived the requirement for informed consent. Inclusion criteria encompassed female breast cancer patients diagnosed specifically with IDC/ILC, with tumor stages T1–3, N0–3, M0, aged between 20 and 79 years, undergoing either breast-conserving surgery or mastectomy after receiving NST, with treatment initiated within 6 months of diagnosis. Exclusion criteria included patients receiving non-beam radiation or BCS without subsequent radiotherapy, those receiving preoperative or intraoperative radiotherapy, or an unspecified radiotherapy sequence, as well as patients who died within 6 months of diagnosis. Additionally, patients with incomplete data on neoadjuvant therapy or missing values in the Site-Specific Codes for Neoadjuvant Therapy Treatment Effect were excluded. The detailed procedure for data filtering is illustrated in Fig. 1. Our study adhered to the principles of the Declaration of Helsinki (2013 revision).

Fig. 1

The flowchart of developing models.

Study variables

EBC refers to cancer confined within the breast, with or without involvement of regional lymph nodes, and without spread to distant organs. Patients were categorized as having received NST if they underwent systemic treatment before surgery, or both before and after surgery. Clinical and pathologic data included surgery type, age, race, marital status, median household income, rural–urban status, histological type, tumor site, grade, T stage, N stage, molecular subtype, and response to neoadjuvant therapy. The primary endpoint was breast cancer-specific survival (BCSS), with overall survival (OS) considered as the secondary outcome.

Statistical analysis

We conducted both univariate and multivariate analyses to evaluate mortality risk among patients and to identify independent prognostic factors. Multiple imputation by chained equations was employed to handle missing values and maintain model stability20. To minimize potential selection biases and balance the baseline characteristics between the two surgical groups, we performed propensity score matching (PSM). Variables used for matching included age, race, tumor grade, T stage, N stage, estrogen receptor (ER) status, progesterone receptor (PR) status, HER2 status, marital status, and year of diagnosis. By incorporating the year of diagnosis into the PSM, we aimed to account for changes in diagnostic technologies and treatment strategies over time, ensuring a more balanced comparison between the BCS and mastectomy cohorts. Matching was conducted using a 1:1 nearest-neighbor approach without replacement, with a caliper width of 0.002. Categorical measurements were compared using the Chi-squared test. Survival analysis was conducted using Kaplan–Meier (KM) analysis and the log-rank test. All tests were two-sided, and statistical significance was defined as a P value less than 0.05. Statistical software R version 4.1 and Python 3.11 were used for all analyses.
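The matching step described above can be illustrated with a minimal sketch. It assumes that propensity scores have already been estimated (for example, by logistic regression on the matching variables) and that the caliper of 0.002 is applied to the raw score; the text does not state whether the raw score or its logit was used. The function name, the greedy matching order, and the patient identifiers are hypothetical and are not taken from the study's code.

```python
def match_nearest_neighbor(treated, control, caliper=0.002):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, control: dicts mapping a patient id to a propensity score.
    Returns a list of (treated_id, control_id) pairs whose scores differ
    by no more than `caliper`. Assumption: the caliper is on the raw score.
    """
    available = dict(control)  # controls that have not been matched yet
    pairs = []
    for t_id, t_score in treated.items():
        if not available:
            break
        # Closest remaining control by absolute score distance
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # "without replacement": use each control once
    return pairs
```

Treated patients with no control inside the caliper are left unmatched, which is how a narrow caliper reduces the matched sample size in exchange for closer balance.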

Machine learning model design

In this study, patients were randomly allocated into training and testing datasets in a 7:3 ratio. Variables independently associated with the primary outcome were identified through least absolute shrinkage and selection operator (LASSO) regression, as well as univariate and multivariate Cox regression analyses. These variables were then used to construct various machine learning models, including RSF, Rpart, Xgboost, Glmboost, Survctree, and Survsvm. To handle categorical features effectively, one-hot encoding was utilized to represent the different categorical values in a binary manner. Hyperparameters were optimized through tenfold cross-validation and Bayesian optimization to maximize the concordance index (C-index), which measures the proportion of comparable patient pairs whose predicted risks are correctly ordered; higher C-index values indicate better model performance. The RSF model's hyperparameters were fine-tuned within specified ranges: number of trees (100–500), number of variables randomly sampled at each split (2–10), and minimum node size (2–10). Furthermore, the selected optimal model was compared with the traditional Cox model to assess generalizability. Model performance was assessed using calibration plots, decision curve analysis (DCA), and receiver operating characteristic (ROC) curves. Construction of the machine learning models relied on the Mlr3, randomForestSRC, and scikit-survival packages. Additionally, a user-friendly web-based prediction tool was developed to facilitate clinical application.
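To make the C-index definition concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is an illustration only, not the implementation used in the study, which relied on the packages named above; the function and argument names are hypothetical. A pair is treated as comparable when the patient with the shorter follow-up time experienced the event, and ties in predicted risk count as half-concordant.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times: follow-up times; events: 1 if the event occurred, 0 if censored;
    risk_scores: higher value means higher predicted risk (shorter survival).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable pair: patient i had the event strictly before time j
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # correctly ordered
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tie in predicted risk
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the models are compared.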

Results

Description of population

A total of 13,958 patients were included in this analysis. BCS + RT was used for the treatment of 9028 patients (64.7%) and mastectomy for the treatment of 4930 patients (35.3%). We observed a steady increase in BCS rates and a corresponding decrease in mastectomy rates among EBC patients after NST from 2010 to 2018 (Supplementary Fig. S1). The clinical-pathological characteristics of the patients are presented in Supplementary Table S1. The patients' ages were predominantly distributed in the range of 40–59 years, constituting 54.2% of the total, followed by the 60–79 age group, accounting for 34.9%, and patients aged 20–39, comprising 10.9%. In terms of race, white patients constituted 70% of the total. Regarding marital status, 57.3% of patients were non-single, while 38.6% were single. Approximately 51.4% of patients had a relatively favorable economic status, with an annual median household income exceeding $70,000. Overall, 91.6% of patients resided in metropolitan areas. The most common histological type was invasive ductal carcinoma (IDC), representing 90.9% of cases. Histological grades were predominantly Grade II (33.2%) and Grade III/IV (53.5%). The most frequent tumor location was the upper outer quadrant. In terms of molecular subtypes, HR+/HER2− accounted for 39.7%, followed by HR−/HER2− (23.7%), HR+/HER2+ (22.8%), and HR−/HER2+ (11.3%). Tumor staging revealed 20.4%, 60.6%, and 18.9% for T1–T3 stages, respectively; and 43.0%, 41.6%, 9.9%, and 5.5% for N0–N3 stages, respectively. Regarding neoadjuvant therapy treatment effect, complete response (CR) was observed in 22.9%, partial response (PR) in 25.7%, overall response (OR) combined in 17.9%, and no response (NR) in 5.2% of cases.

Survival analysis

We performed univariate and multivariable Cox regression to identify significant variables affecting OS and BCSS in EBC patients after NST, namely age, race, marital status, rural–urban status, grade, tumor site, T stage, N stage, molecular subtype, and response to neoadjuvant therapy (Supplementary Table S2). To adjust for confounding factors, we performed 1:1 PSM on all the above variables as well as the year of diagnosis (Table 1). After PSM, there were 3715 patients in each group, with a mean follow-up time of 63.2 ± 32.4 months. The KM survival curves showed that the BCS + RT group had better BCSS (p < 0.001) and OS (p < 0.001) than the mastectomy group (Fig. 2). We also performed sub-analyses on the clinicopathological factors of interest, namely T stage, N stage, and response to neoadjuvant therapy. In patients with T1 stage, there was no significant difference in BCSS between the two groups (p = 0.11), and there was a minor difference in OS (p = 0.02). However, among patients with T2 or T3 stage, both BCSS and OS showed significant differences (p < 0.001). Additionally, regardless of lymph node metastasis, significant differences were observed in both BCSS and OS between the BCS + RT and mastectomy groups (p < 0.001 or p = 0.046). Furthermore, for patients with different responses to NST (CR, PR, OR, and NR), both BCSS and OS were significantly higher in the BCS + RT group than in the mastectomy group (p = 0.01 and p < 0.01 for CR, p < 0.001 and p = 0.003 for PR, p < 0.001 and p < 0.001 for OR, and p = 0.001 and p = 0.002 for NR).
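The KM curves compared here rest on the product-limit estimator, which can be sketched in a few lines. This is an illustrative implementation under simple assumptions (right-censoring only, no confidence intervals) with hypothetical names; it is not the code used to produce Fig. 2.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times: follow-up times; events: 1 if the event occurred, 0 if censored.
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # events and censored patients leave the risk set
    return curve
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is why the estimator remains valid with incomplete follow-up.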

Table 1 Comparison of baseline characteristics before and after PSM.

Fig. 2

Kaplan–Meier survival analysis for EBC treated with NST followed by BCS + RT or mastectomy. (A) PSM-adjusted BCSS based on the type of surgery. (B) PSM-adjusted OS based on the type of surgery. EBC early breast cancer. NST neoadjuvant systemic therapy. BCS + RT breast-conserving surgery plus radiotherapy. BCSS breast cancer-specific survival. OS overall survival. PSM propensity score matching.

Machine learning models

Cox regression (Supplementary Table S2) and LASSO regression (Supplementary Fig. S2) analyses showed that surgery, race, marital status, rural–urban status, grade, tumor site, T stage, N stage, molecular subtype, and response to neoadjuvant therapy were prognostic variables independently associated with the primary outcome, namely BCSS. Six models, namely RSF, Rpart, Xgboost, Glmboost, Survctree, and Survsvm, were established based on the ten identified prognostic variables. We divided the patients into training and test sets in a 7:3 ratio, and to ensure the stability of the model, tenfold cross-validation and Bayesian optimization were used in the training set to select the optimal hyperparameters.

In the training and validation cohorts, the predictive performance of the RSF model (C-index 0.847 and 0.795) was better than that of the other models: Rpart (0.725 and 0.707), Xgboost (0.762 and 0.727), Glmboost (0.748 and 0.788), Survctree (0.764 and 0.766), and Survsvm (0.777 and 0.790). The tuned hyperparameters of the RSF model were ntree = 277, mtry = 2, and nodesize = 7. Additionally, when compared to the traditional Cox model (0.749 and 0.782), the RSF model continued to exhibit higher predictive performance in both the training and validation cohorts. We then further evaluated the accuracy of the RSF model. RSF performed well in terms of AUC at 3, 5, and 10 years in the training cohort (0.878, 0.849, and 0.825, respectively) (Fig. 3A) and in the validation cohort (0.822, 0.783, and 0.713, respectively) (Fig. 3B). The agreement between RSF predictions and observed outcomes was assessed using calibration plots. The 3-, 5-, and 10-year calibration plots showed good agreement between the predicted and actual values in the training and validation cohorts (Fig. 3C,D). DCA was applied to calculate a clinical “net benefit” for the prediction model, and the result of DCA indicated that the RSF model had a better net benefit at most threshold probabilities (Fig. 4A,B). We also assessed the ranking of clinical characteristics in terms of importance in the model. The results showed that N stage, response to neoadjuvant therapy, molecular subtype, grade, surgery type, and T stage were the top six determinants of patient survival (Fig. 5). Furthermore, we determined the optimal cutoff value for risk scores based on the survival curve and used this value to stratify patients into high-risk (risk score > 21.56) and low-risk (risk score ≤ 21.56) groups (Supplementary Fig. S3). The results of the KM analysis and log-rank test between the high-risk group and low-risk group are presented in Fig. 5, demonstrating a significant difference between the two groups (p < 0.001) (Supplementary Fig. S4).
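The “net benefit” quantity underlying DCA can be illustrated with a short sketch. It uses the standard binary-outcome definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. This is a simplification: for censored survival outcomes such as those analyzed here, DCA implementations typically estimate the true- and false-positive rates from Kaplan–Meier quantities, and the function below (with hypothetical names) does not do that.

```python
def net_benefit(predicted_risk, had_event, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    predicted_risk: predicted event probabilities in [0, 1];
    had_event: 1 if the event occurred by the time horizon, else 0;
    threshold: threshold probability pt, strictly between 0 and 1.
    Assumption: outcomes are fully observed (no censoring).
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(predicted_risk)
    flagged = [(p, y) for p, y in zip(predicted_risk, had_event) if p >= threshold]
    true_pos = sum(1 for _, y in flagged if y)
    false_pos = len(flagged) - true_pos
    # False positives are penalized by the odds of the threshold probability
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)
```

A decision curve is obtained by evaluating this quantity over a range of thresholds and comparing the model against the "treat all" and "treat none" strategies.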

Fig. 3

ROC curves and calibration plots for the RSF model. ROC curves at 3, 5, and 10 years for the RSF model in the training cohort (A) and the validation cohort (B). Calibration plots at 3, 5, and 10 years in the training (C) and validation (D) cohorts. ROC receiver operating characteristic, RSF random survival forest.

Fig. 4

The DCA curves of the RSF model. (A) The 3-year, 5-year and 10-year DCA curves of the RSF model in the training cohort. (B) The 3-year, 5-year and 10-year DCA curves of the RSF model in the validation cohort. DCA decision curve analysis, RSF random survival forest.

Fig. 5

Variable importance and error rate curve of RSF. RSF random survival forest.

Since the RSF model performed better than the traditional Cox model, it can be used not only to predict the survival function of an individual patient but also to offer the surgeon a reference based on predictions for different treatment plans. We therefore deployed the recommender system on the Internet. It can be accessed with a browser at [https://jhren.shinyapps.io/shinyapp1]: the user inputs the current clinicopathologic characteristics of one patient and clicks the predict button (Fig. 6).

Fig. 6

The RSF model incorporates an input field featuring the current clinicopathologic characteristics of one patient, along with an output field showcasing patient risk scores and survival curves. RSF random survival forest.

Discussion

Comparative effectiveness of surgical modalities

The findings of our study underscore the superiority of BCS + RT over mastectomy in terms of both BCSS (p < 0.001) and OS (p < 0.001) in EBC patients post-NST. The observed improvement in outcomes aligns with several observational studies that have consistently reported favorable survival outcomes associated with BCS + RT in comparison to mastectomy21,22,23. Additionally, in a meta-analysis including 16 studies with a combined total of 3531 patients, Sun et al. reported no significant difference in local recurrence and regional recurrence (p = 0.26 and p = 0.03), while they found a lower distant recurrence (p < 0.01), a higher disease-free survival (p < 0.01) and a higher OS (p < 0.01) with BCS + RT compared with mastectomy24. This could be due to various factors, including the biological behavior of the residual disease, the impact of radiotherapy, and the overall management strategy associated with BCS + RT. Furthermore, a rigorous and detailed postoperative follow-up regimen for these patients might facilitate the early detection and management of potential recurrences25. The evolving landscape of breast cancer management, particularly in the realm of EBC following NST, necessitates a nuanced understanding of the comparative effectiveness of treatment modalities26,27. Notably, our analysis, after PSM to balance relevant covariates, reinforces the robustness of these findings.

In this analysis, several factors associated with survival outcomes were identified, namely age, race, marital status, rural–urban status, grade, tumor site, T stage, N stage, molecular subtype, and response to neoadjuvant therapy. This is in accordance with several previous studies22,28,29. Considering these factors, we conducted subgroup analyses for T stage, N stage, and response to neoadjuvant therapy. Firstly, in our analysis of T1 stage tumors, we observed no statistically significant difference in BCSS between patients who underwent BCS + RT and those who underwent mastectomy. Although there was some difference in OS, it was less pronounced than the differences observed in T2 and T3 stages. We reasoned that patients at T1 stage generally present with smaller tumors, characterized by limited local invasion. In such cases, both surgical approaches may effectively control the disease. As the tumor size increases, BCS + RT may provide better local control, improving long-term outcomes for patients. We acknowledge that patients with T1 tumors and node-negative status are not typically recommended for NST in routine clinical practice. However, these patients might receive NST due to specific tumor biology, patient preference, or other clinical factors. Including these patients ensures a comprehensive reflection of diverse treatment decisions in real-world clinical settings. According to NCCN guidelines30, neoadjuvant therapy may be considered in patients with triple-negative or HER2-positive breast cancer even if the tumor size is less than 2 cm and the axillary lymph nodes are negative. Treatment response provides important prognostic and adjuvant therapy information at an individual patient level, particularly in patients with triple-negative or HER2-positive breast cancer. In the T1 subgroup, there was no significant difference in BCSS and only a modest difference in OS between the two groups, further supporting the idea that BCS + RT may provide better local control in patients with a higher tumor burden. Secondly, it is noteworthy that in the analysis of different subgroups based on N stage, we observed significant differences between the two surgical approaches. This may reflect the impact of lymph node involvement on surgical choices. BCS + RT might have an advantage in controlling lymph node involvement more comprehensively, especially in patients with higher N stages, resulting in better survival outcomes. Lastly, the survival advantage of BCS + RT was consistent across the subgroups defined by response to NST, even in those who did not achieve a complete response. The survival benefit observed across all response categories to NST highlights the importance of considering surgical options beyond the extent of tumor response. Understanding the specific scenarios in which one surgical modality may confer a survival advantage over the other enables a more nuanced and personalized approach to breast cancer management.

Machine learning augmentation

The choice between BCS + RT and mastectomy remains a complex decision, influenced by clinical factors, patient preferences, and the evolving landscape of therapeutic options. However, there is a lack of accurate prediction models in the clinic. As a result, a more accurate and powerful model is needed. To our knowledge, the current study is the largest one to analyze the choice of surgical procedures in EBC patients following NST. Beyond traditional survival analyses, our study introduces six machine learning models to predict long-term outcomes for EBC patients post-NST. The RSF model, based on ten prognostic variables identified through Cox regression and LASSO regression, outperforms other machine learning models, including Rpart, Xgboost, Glmboost, Survctree, and Survsvm, in terms of C-index in both training and validation cohorts.

The RSF algorithm31, first proposed in 2008, demonstrates superior predictive performance compared to the classical Cox model, highlighting the potential of machine learning to improve prognostic accuracy. This extension of traditional survival analysis leverages ensemble learning by constructing multiple survival trees. Through random sampling and feature selection, each tree predicts survival outcomes, and their collective results enhance robustness and reduce overfitting. This method integrates the advantages of random forests, offering a powerful tool for predicting time-to-event outcomes. Implementation involves random sampling during tree construction, yielding a diverse set of survival trees. The final prediction is an aggregation of individual tree predictions. Notably, the model's ability to handle high-dimensional data, capture non-linear relationships, and account for complex interactions positions it as a valuable tool for clinicians navigating the nuanced landscape of breast cancer treatment decisions32.

In addition, the importance of predictors can be calculated on the basis of the model to identify the factors that are closely related to prognosis for EBC after NST. This information might facilitate surgical management and reduce the medical burden. We observed that clinical features, including N stage, response to neoadjuvant therapy, molecular subtype, grade, surgery type, and T stage, in descending order of importance, play significant roles in long-term prognosis, in line with prior studies22,28,29.

The RSF risk stratification enables the evaluation of a patient's prognosis according to their clinicopathological profile. The high-risk cut-off (risk score > 21.56) was determined using the calibration curve to identify patients with lower predicted survival rates. This cut-off serves to distinguish between patients with different survival probabilities and to provide actionable information for surgeons and patients when considering surgical options. When applying the model, if the 'surgical type' variable for a high-risk patient is changed to BCS + RT, the RSF model may predict longer survival or move them to a lower-risk category. This indicates that opting for BCS + RT instead of mastectomy could potentially benefit these patients. Through individualized survival probability curves, the prognosis is presented with greater precision, providing a more detailed perspective on patients' outcomes. However, this change in surgical approach might not necessarily lower the patient's predicted risk score, as it also depends on other clinicopathologic features.
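The comparison described above, in which the same patient is scored under each surgical option and then assigned to a risk group, can be sketched as follows. The cutoff of 21.56 is the value reported in this study. Everything else is hypothetical: `toy_predict_risk` is an invented stand-in with made-up coefficients, not the fitted RSF, and the field names are illustrative. In practice the fitted model's prediction function would be passed in its place.

```python
CUTOFF = 21.56  # high-risk threshold reported in the study


def risk_group(score, cutoff=CUTOFF):
    """Assign the risk group: scores above the cutoff are high-risk."""
    return "high" if score > cutoff else "low"


def compare_surgery_options(patient, predict_risk):
    """Score one patient under each surgery type, holding all else fixed."""
    results = {}
    for surgery in ("BCS + RT", "Mastectomy"):
        scenario = dict(patient, surgery=surgery)  # copy with surgery swapped
        score = predict_risk(scenario)
        results[surgery] = (score, risk_group(score))
    return results


def toy_predict_risk(p):
    """Hypothetical stand-in for the fitted RSF; coefficients are made up."""
    score = 10.0 + 4.0 * p["n_stage"] + 3.0 * p["t_stage"]
    if p["surgery"] == "Mastectomy":
        score += 2.0
    return score
```

With this toy scorer, a patient close to the cutoff can change risk group when the surgery type is switched, whereas a patient with advanced T and N stage stays in the high-risk group under both options, mirroring the caveat that the other clinicopathologic features also determine the predicted risk.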

Web-based recommendation system

To bridge the gap between research findings and real-world clinical applications, we developed a cloud-based recommendation system. This system, accessible through a web interface, facilitates dynamic and data-driven decision-making by visualizing survival curves for each treatment plan. By deploying this system on the internet, we empower clinicians to make informed and personalized treatment decisions based on individual patient profiles.

Study limitation

The retrospective nature of our study relies on data extracted from SEER. While SEER provides a wealth of information, it is essential to recognize inherent limitations. Variability in data collection methods, potential coding errors, and the absence of certain clinical variables may introduce biases or limit the granularity of our analysis. Despite employing PSM to mitigate confounding factors, inherent selection biases may persist. Unmeasured or inadequately controlled variables, such as patient menopausal status or detailed information on the neoadjuvant therapy, could impact the observed outcomes. The retrospective design introduces challenges in fully accounting for all relevant clinical variables. Additionally, the study predominantly includes patients from the United States, potentially limiting the applicability of results to diverse healthcare settings. Furthermore, both training and test sets are from the same database, possibly introducing overlap and compromising the model's generalization capabilities. This implication should be interpreted with caution, as the model's predictions need external validation in other cohorts to ensure its reliability and generalizability. We strongly recommend further validation studies before applying the web-based prediction model to other patient populations.

Conclusions

Our study's findings carry substantial clinical implications for EBC patients post-NST, providing evidence in support of the efficacy of BCS + RT over mastectomy. The integration of machine learning models, particularly the RSF, introduces a new dimension to prognostic predictions, offering clinicians a powerful tool for personalized treatment decisions. Future directions may involve refining and expanding the machine learning model, incorporating additional relevant variables, and validating its performance in diverse patient populations. Further research could explore the impact of emerging therapies and evolving treatment paradigms on the comparative effectiveness of surgical modalities in EBC.

In conclusion, our study contributes to the evolving discourse on breast cancer management by providing robust evidence in favor of BCS + RT over mastectomy in EBC patients post-NST. The integration of machine learning augments prognostic predictions, and the web-based recommendation system facilitates the translation of research findings into actionable insights for clinicians.