Abstract
HER2-low positivity is reported to play a substantial role in neoadjuvant chemotherapy for breast cancer, but its role in adjuvant chemotherapy remains unclear. We aimed to explore the role of HER2-low positivity in breast cancer patients who received adjuvant chemotherapy only. We evaluated 3214 patients from SRRSH Hospital and 16,273 patients from CHLP Hospital. All patients were diagnosed with primary breast cancer and received adjuvant chemotherapy only. HER2 status was defined according to ASCO/CAP guidelines. Multivariable Cox models and machine learning models were applied in the analysis of overall survival (OS). A total of 1009 HER2-zero, 1399 HER2-low and 806 HER2-positive patients were included in the SRRSH cohort, and 5662 HER2-zero, 6471 HER2-low and 4140 HER2-positive patients in the CHLP cohort. HER2-low patients showed significantly better OS and a reduced risk of death compared with HER2-zero patients in both the SRRSH cohort (log-rank p = 0.033, HR = 0.62, 95% CI 0.39–0.97, p = 0.037) and the CHLP cohort (log-rank p < 0.001, HR = 0.63, 95% CI 0.55–0.73, p < 0.001). In subgroup analysis, HER2-low patients had a significantly reduced risk of death compared with HER2-zero patients in the hormone receptor-positive subset in both cohorts, but not in the hormone receptor-negative subset. The multi-layer perceptron (MLP) model demonstrated the best performance among all the predictive models in both cohorts. Our study indicates that HER2-low patients had better survival than HER2-zero patients in the adjuvant-chemotherapy-only setting. These findings emphasize the significance of HER2 status in the adjuvant setting in breast cancer and point to more precise diagnostic and therapeutic strategies in the future.
Introduction
Breast cancer (BC) is one of the most prevalent malignancies in women worldwide and in China1,2,3. HER2 is a prototype oncogene and a major therapeutic target in breast cancer, with a significant response rate in HER2-positive patients. HER2 status is commonly determined by immunohistochemistry (IHC) and in situ hybridization (ISH) according to the guidelines for HER2 testing, which define IHC3+ or IHC2+ with ISH+ as HER2-positive4. HER2-positive BC accounts for only 15–20% of all BC cases, while HER2-low (IHC2+ with ISH- or IHC1+) and HER2-zero (IHC0) tumors comprise the remaining cases5. HER2-targeted therapy has achieved great success in the past decades5; however, no effective targeted therapy has been established for HER2-low or HER2-zero patients.
Recently, an antibody-drug conjugate (ADC), trastuzumab deruxtecan (T-DXd), demonstrated promising effects in treating metastatic HER2-low BC patients6. The success of the DESTINY-Breast04 trial has renewed our knowledge of breast cancer subtyping and brought HER2 status to the forefront. The DESTINY-Breast06 trial showed that T-DXd could also prolong PFS in HER2-ultralow metastatic breast cancer patients7. Could HER2-low BC be a novel subtype and receive distinct therapy? A recent pooled analysis that integrated four prospective neoadjuvant clinical trials indicated that HER2-low BC could be a novel subgroup of breast cancer with distinct responses to treatment and prognosis8. Hormone receptor (HoR) status was also reported as an important factor influencing the outcome of HER2-low BC patients in the neoadjuvant setting8,9. Another retrospective study, which analyzed 7371 cases of HER2-low and HER2-zero BC, indicated that HER2-low BC manifested less aggressive clinical phenotypes than HER2-zero BC. Similarly, HoR-negative cases had poorer survival than HoR-positive cases among HER2-low BCs10. However, how HER2 status affects the outcomes of BC patients who received adjuvant chemotherapy remains unknown.
The molecular characteristics of HER2-low status could significantly affect the treatment strategy and outcomes of BC patients. Shao et al.’s study is the largest original multi-omics study to date on the molecular portrait of HER2-low BC, which indicated that HoR status plays a pivotal role in this subtype. Further, it also characterized HER2-low BC more profoundly among triple negative breast cancer (TNBC) and luminal BC11. Likewise, another study indicated that HER2-low BCs with different HoR status had distinct profiles of somatic mutations12. In contrast, a genomic study revealed that HER2-low BC had significantly higher rate of ERBB2 alleles and lower rate of ERBB2 hemi-deletions than HER2-zero BC, but the two subtypes had no difference in tumor mutation burden (TMB)13. Therefore, though no consensus was reached in the molecular landscape of HER2-low BC, HoR status was a substantial factor influencing molecular portrait of HER2-low BC, which could be linked to the efficacy of treatment.
Machine learning models have been revolutionizing the treatment of BC. A recently published study successfully utilized a machine learning model incorporating clinical, pathological, genomic and transcriptomic profiles of pre-treatment tumor of BC to predict the efficacy of neoadjuvant chemotherapy14. Also, another retrospective study constructed a 21-gene predictive model for distant metastasis of BC employing public datasets15. However, all the existing studies were associated with the prediction of response to neoadjuvant chemotherapy or cancer metastasis. No machine learning models have been established so far to predict the outcomes of HER2-low BC patients who received adjuvant chemotherapy.
In this study, we aim to explore the clinical characteristics and outcomes of HER2-low BC compared to HER2-positive BC and HER2-zero BC patients who received adjuvant chemotherapy. We implemented a three-phase, retrospective study that involved two independent cohorts. First, we evaluated the outcomes of patients with different HER2 subtypes who received adjuvant chemotherapy in SRRSH cohort. Second, we validated the findings in CHLP cohort. Third, we constructed multi-variable predictive models utilizing machine learning methodologies based on the data of CHLP cohort, and further validated the models in SRRSH cohort. The final model was visualized and uploaded to our website.
Results
Study populations
In total, we recruited 3214 and 16,273 BC patients from the SRRSH and CHLP cohorts, respectively. All patients were female. In the SRRSH cohort, 806 (25.0%) patients had HER2-positive tumors, 1399 (43.6%) had HER2-low tumors, and 1009 (31.4%) had HER2-zero tumors (Table 1). Patient age did not differ significantly among the three HER2 subtypes when treated as a continuous variable, but a significant difference was found when age was treated as a categorical variable (p = 0.028). Significant differences among the three subtypes were detected for N stage, histologic type, ER status, PgR status, Ki67 index, and molecular subtype. Significant differences were also found for treatments, namely radiation therapy, endocrine therapy, and HER2-targeted therapy. No differences were found for tumor stage, intravascular cancer thrombus, or T stage. The median follow-up period was 67.1 months.
In CHLP cohort, 4140 (25.4%) patients had HER2-positive tumor, 6471 (39.8%) patients had HER2-low tumor, and 5662 (34.8%) patients had HER2-zero tumor (Table 1). Significant differences were found among three HER2 subtypes for age, tumor stage, T stage, N stage, intravascular cancer thrombus, histologic type, ER status, PgR status, and molecular subtypes. Treatments like radiation therapy, endocrine therapy, HER2 target therapy were also different among the three subtypes. The median follow-up period was 63.9 months.
The impact of HER2 status on outcomes of BC patients
HER2-low patients demonstrated significantly better OS than HER2-zero and HER2-positive patients in Kaplan-Meier survival analysis in the SRRSH cohort (log-rank p = 0.033, Fig. 1A). In the CHLP cohort, HER2-low and HER2-positive patients showed similar OS, and both had significantly better OS than HER2-zero patients (log-rank p < 0.001, Fig. 1B). In the multivariable Cox model, HER2-low patients had a significantly decreased risk of death compared to HER2-zero patients in both cohorts (HR = 0.62, 95% CI 0.39–0.97, p = 0.037, and HR = 0.63, 95% CI 0.55–0.73, p < 0.001, respectively) (Fig. 2A, B and Table 2). HER2-positive patients demonstrated a significantly decreased risk of death compared to HER2-zero patients in the CHLP cohort (HR = 0.51, 95% CI 0.42–0.61, p < 0.001), but not in the SRRSH cohort (Table 2). We also analyzed the impact of HER2 status on DFS of the BC patients enrolled in the SRRSH cohort using a multivariable Cox model. HER2-low patients showed no difference in the risk of recurrence compared to HER2-zero patients (p = 0.60), and the same was true for HER2-positive patients (p = 0.65) (Table S3).
Forest plots of multivariable Cox proportional hazards analysis for risk of death (A, B) or recurrence of disease (C) in HER2-low tumors compared with HER2-zero tumors. Multivariable analysis included the clinicopathological factors indicated in the “Methods” section. HR, hazard ratio. *p value for interaction between HER2 status and hormone receptor status.
In HoR-positive tumors, HER2-low patients tended to have better OS than HER2-zero and HER2-positive patients in the SRRSH cohort, although the difference did not reach statistical significance (log-rank p = 0.067, Fig. 1C), whereas HER2-low and HER2-positive patients had significantly better OS than HER2-zero patients in the CHLP cohort (log-rank p < 0.001, Fig. 1D). In the multivariable Cox model, HER2-low patients had a significantly decreased risk of death compared to HER2-zero patients in both cohorts (HR = 0.59, 95% CI 0.35–0.99, p = 0.045, and HR = 0.63, 95% CI 0.53–0.76, p < 0.001, respectively) (Fig. 2A, B and Table S4). Similarly, HER2-positive patients showed a significantly decreased risk of death compared to HER2-zero patients in the CHLP cohort (HR = 0.63, 95% CI 0.50–0.79, p < 0.001), but not in the SRRSH cohort (HR = 0.49, 95% CI 0.22–1.10, p = 0.083) (Table S4).
In HoR-negative patients, those with HER2-low tumors showed no difference in OS compared to those with HER2-zero or HER2-positive tumors in the SRRSH cohort (log-rank p = 0.50, Fig. 1E). In the CHLP cohort, however, HER2-low patients had significantly poorer OS than HER2-positive patients, while both had better OS than HER2-zero patients (log-rank p < 0.001, Fig. 1F). Multivariable Cox modeling indicated that HER2-low patients showed no significant difference in risk of death compared to HER2-zero patients in either cohort (HR = 0.45, 95% CI 0.19–1.10, p = 0.079, and HR = 0.91, 95% CI 0.71–1.15, p = 0.42, respectively) (Fig. 2A, B and Table S4). Nevertheless, HER2-positive patients showed a significantly lower risk of death than HER2-zero patients in the CHLP cohort (HR = 0.36, 95% CI 0.27–0.48, p < 0.001), but not in the SRRSH cohort (Table S4).
Machine learning integrated multiple clinic-pathological features
To optimize predictive performance, we used four machine learning models and statistical (Cox regression-based) models to integrate the clinic-pathological features in both cohorts. In the CHLP cohort, all the models showed excellent performance in predicting OS in the evaluation and test sets (AUC = 0.80 and 0.81, respectively), while the MLP model demonstrated the highest performance in the SRRSH cohort among all the models (AUC = 0.73) (Table S1). In the HoR-positive subgroup, the AUC of the MLP model was 0.81 in the evaluation set, 0.76 in the test set, and 0.73 in the SRRSH cohort. In the HoR-negative subgroup, the AUC was 0.81 in the evaluation set, 0.79 in the test set, and 0.64 in the SRRSH cohort (Table S1). The ROC curves of the four models are depicted in Fig. 3. The MLP model had the best performance in both cohorts, with an AUC of 0.82 in the CHLP cohort and 0.73 in the SRRSH cohort (Fig. 3A, B). In the HoR-positive subgroup, the AUCs were 0.81 and 0.73 for the CHLP and SRRSH cohorts, respectively (Fig. 3C, D), while in the HoR-negative subgroup, the AUCs were 0.81 and 0.64 for the CHLP and SRRSH cohorts, respectively (Fig. 3E, F).
A, B In the CHLP cohort, all the models showed excellent performance in predicting OS (AUC score = 0.82) except the statistical model (AUC score = 0.69), while the MLP model demonstrated the highest performance in the SRRSH cohort among all the models (AUC score = 0.73); C, D In HoR-positive patients, the MLP model demonstrated the best performance in predicting OS. The AUC scores were 0.81 and 0.73 for the CHLP and SRRSH cohorts, respectively. E, F In HoR-negative patients, the MLP model also demonstrated the best performance in predicting OS. The AUC scores were 0.81 and 0.64 for the CHLP and SRRSH cohorts, respectively.
Furthermore, to explore the association between each variable and OS, we generated SHAP values for each variable under the MLP model in the CHLP cohort. HER2 status had a strong association with OS in all patients (SHAP value = 0.063) (Fig. 4A and Table S5). In the HoR-positive subgroup, stage showed the strongest association with OS (SHAP value = 0.042), while HER2 status had a value of 0.037 (Fig. 4B and Table S5). In the HoR-negative subgroup, Ki-67 index had the strongest association with OS (SHAP value = 0.069), and HER2 status had a value of 0.041 (Fig. 4C and Table S5). In addition, HER2-zero patients had the highest risk of death (SHAP value = 0.095), whereas HER2-low and HER2-positive patients had SHAP values of −0.043 and −0.054, respectively (Fig. 4D). Similar patterns were also identified in the HoR-positive and HoR-negative subgroups (Fig. 4E, F).
The SHAP values for each variable under the MLP model in the CHLP cohort. A HER2 status has the closest association with OS in all patients (SHAP value = 0.063); B In HoR-positive tumors, HER2 status showed a value of 0.037; C In HoR-negative tumors, HER2 status had a SHAP value of 0.041. D HER2-zero patients had the highest risk of death (SHAP value = 0.095), whereas HER2-low and HER2-positive patients had SHAP values of −0.043 and −0.054, respectively; E, F In subgroup analysis, HER2-zero patients had the highest risk of death compared with HER2-low and HER2-positive patients in both HoR-positive and HoR-negative tumors.
Discussion
In this study, we systematically evaluated the impact of HER2 status on the outcomes of BC patients who only received adjuvant chemotherapy for the first time by employing a three-phase, multi-cohort retrospective study. HER2-low patients showed significantly better OS compared to HER2-zero patients in both the SRRSH and CHLP cohorts, and HER2-low patients also demonstrated significantly reduced risk of death compared to HER2-zero patients in both cohorts. In subgroup analysis, HER2-low and HER2-positive patients had significantly reduced risk of death compared to HER2-zero patients in HoR-positive subset, whereas HER2-low patients had no difference of death risk compared to HER2-zero patients in both cohorts in HoR-negative subset. Furthermore, we optimized machine learning models to predict the OS of the patients. The MLP model demonstrated the best performance among all the models in both cohorts. The SHAP value of HER2 status in the MLP model demonstrated a strong association with the OS of BC patients from CHLP cohort.
The impact of HER2-low positivity on the survival of BC patients who received adjuvant chemotherapy has not been reported to date. In the current study, we showed that HER2-low patients had significantly improved OS compared to HER2-zero patients in the adjuvant setting. This aligns with the results of a pooled analysis of four prospective neoadjuvant chemotherapy trials in Europe8. A recently published meta-analysis also suggested that HER2-low positivity was associated with improved pathological complete response (pCR) rate, DFS and OS in BC patients who received neoadjuvant chemotherapy16. Our findings suggest that HER2-low positivity is also a favorable prognostic feature in the adjuvant setting. Interestingly, our subgroup analysis was discordant with results from the neoadjuvant setting. We showed that the OS benefit of HER2-low positivity was observed only in HoR-positive patients, not in HoR-negative patients, in both cohorts. This contrasts with the findings of a European study, which suggested that HER2-low positivity was correlated with improved OS in HoR-negative patients, but not in HoR-positive patients8. Despite this discordance, more studies support our findings11,17,18,19. A retrospective study involving 1,136,016 BC patients from the National Cancer Database demonstrated that HER2-low patients had significantly better OS than HER2-zero patients in the HoR-positive subgroup, but not in the TNBC subgroup17. Another study showed that HER2-low positivity was significantly associated with favorable OS in high genomic risk patients (RS > 25)19. The discordance may derive from racial disparity and inter-patient heterogeneity, as the studies supporting our results were mostly from Asian or mixed populations, such as Chinese11, Korean18, American17, or Jewish19 populations, whereas the study with contrasting results was from Germany8. The crosstalk between HER2 signaling and ER signaling may also contribute to the difference. Shao et al. reported that HER2-low patients showed less loss/deletion in 17q peaks than HER2-zero patients in the HoR-positive subset, which may partly explain the molecular mechanisms underlying the survival advantage. Therefore, our findings suggest a survival benefit of HER2-low patients over HER2-zero patients who received adjuvant chemotherapy only, and this benefit was mainly restricted to HoR-positive patients. Intensified targeted therapies like T-DXd may be applied in HER2-low and HoR-negative BC patients. Future research is warranted to explore the different roles of HER2-low positivity in neoadjuvant and adjuvant chemotherapy.
HER2-low BC exhibits significant molecular heterogeneity, which may contribute to the discrepancy in survival and response to chemotherapy. Over 50% of HER2-low patients had luminal BC and approximately 20% had TNBC, while HER2-zero patients had a higher proportion of TNBC (around 25%) and a similar proportion of luminal BC. This was consistent with another Chinese cohort with multi-omics data11. That multi-omics study showed that HER2-low tumors were more distinct from HER2-zero tumors in HoR-negative BC, being less similar to basal-like tumors. HoR-negative, HER2-low tumors had significantly more PIK3CA mutations, FGFR4/PTK6/ERBB4 overexpression and lipid metabolism activation. In HoR-positive, HER2-low tumors, less loss/deletion in 17q peaks was identified compared to HER2-zero tumors. These peaks are located in the NF1, NBR1, and BRCA1 genes, which may influence survival and treatment efficacy20,21,22. Nevertheless, another comprehensive genomic study involving 1039 HER2-negative metastatic BC patients indicated that HER2-low tumors differed from HER2-zero tumors only in a higher rate of ERBB2 copy count and a lower rate of ERBB2 hemi-deletion13, implying that HoR status was the major determinant of the biological behavior of the tumor. However, no prognosis or treatment data were provided in that study, which reduced its clinical value. Our study and existing molecular studies provide clues for precision therapies such as T-DXd in HER2-low tumors, though further exploration is warranted.
We successfully applied multiple machine learning models to predict the OS of participating BC patients in this study. Among them, the MLP model demonstrated the best performance in predicting OS. The SHAP values based on the MLP model also emphasized the importance of HER2-low positivity. No prior study has used multiple machine learning models to optimize survival prediction in breast cancer. A recent meta-analysis utilized four model building strategies to predict the 10-year risk of breast cancer related mortality in a population-based cohort study. The results showed that regression-based methods had better and more consistent performance compared with machine learning approaches23. This does not align with our finding that the MLP model’s predictive performance was significantly better than that of the Cox regression model in the CHLP cohort. Regarding model generalization, the MLP model also showed a slight advantage in the SRRSH cohort. In addition, machine learning models have been applied to predict pCR using multi-omics data14, immunotherapy response using radiomic data24, and distant metastasis of breast cancer using transcriptomic data15. Our study was the first to use multiple machine learning models across two independent cohorts for predicting survival in HER2-low BC patients, which highlights the potential of artificial intelligence in the precise treatment of BC.
Our study has significant strengths, including large populations across multiple independent cohorts of BC patients, a rigorous study design, and the application of machine learning models. However, we also acknowledge several limitations. First, we mainly analyzed the OS of the participants. DFS data were not available in the CHLP cohort, which restricted our exploration, especially in HoR-positive BC patients, in whom death may lag years behind recurrence. We analyzed DFS in the SRRSH cohort, which showed no difference between HER2-low and HER2-zero patients. This may be attributed to the high proportion of bone and lymph node recurrences, as breast cancer patients with bone-only metastasis usually have better survival than patients with other recurrence sites25. DFS data in the CHLP cohort are under chart review to validate this result. Second, we did not include tumor grade in the models because of too many missing values in both cohorts. Tumor grade is one of the known factors affecting the outcomes of BC patients, and the lack of this variable may affect the robustness of our models. Thus, our results should be interpreted with caution. Third, bias may exist in the evaluation of HER2 expression between the two cohorts, especially in the low range (IHC 0 or 1+). We therefore had the slides reevaluated by two independent pathologists in both cohorts, which may reduce the bias. However, regional variation and inter-observer differences in classifying HER2 IHC intensity may still exist given the large population, which again calls for cautious interpretation. Lastly, we did not separately identify ER-low (ER 1–10%) cancers within the HoR-positive subgroup, although this group of patients may have distinct biological features and clinical outcomes25.
Our study provided new knowledge regarding the impact of HER2-low positivity on survival of BC patients who received adjuvant chemotherapy only, drawn from two independent cohorts. HER2-low patients had significantly better OS compared to HER2-zero patients, especially in the HoR-positive subgroup. We also established and optimized machine learning models to predict the survival of BC patients, which demonstrated high robustness compared with the regression-based model. These findings could help us to stratify our patients more precisely, and they may also inform new targeted therapeutic strategies for BC patients.
Methods
Ethics
This study was performed in accordance with the Declaration of Helsinki. The study was approved and supervised by the Institutional Review Board of SRRSH (IRB#: 20210910-30) and CHLP (IRB#: 20241347-2). All the patients enrolled in this study were fully informed and gave their consent, and received periodical follow-up.
Study design, clinical cohorts, and central pathology
An overview of the study design is depicted in Fig. S1. The study followed the REMARK guideline26. BC patients who were treated in the Surgical Oncology Department of SRRSH from January 1, 1996 to December 31, 2023 were recruited into the SRRSH cohort, and those treated in the Breast Oncology Department of CHLP from 2005 to 2023 were recruited into the CHLP cohort. In total, 3220 and 16,273 BC patients who received adjuvant chemotherapy participated in this study from the SRRSH and CHLP cohorts, respectively. The enrollment criteria were as follows: (1) willingness to participate in the study and signed written informed consent; (2) unilateral invasive cancer in patients who received only surgery and adjuvant chemotherapy; (3) central pathological evaluation of ER, PR, HER2, and Ki67 at the Department of Pathology of SRRSH or CHLP; (4) no distant metastasis at the time of diagnosis; (5) no neoadjuvant chemotherapy.
The demographic features of the SRRSH and CHLP cohorts are listed in Table 1. The clinic-pathological variables, including age, tumor stage, intravascular cancer thrombus, histological subtype, ER, PR, HER2, Ki67 level and treatment modalities, were collected from electronic medical records and the central pathology department. The primary tumor specimens were obtained during surgery and underwent central pathology evaluation for histology and ER, PR, HER2, and Ki67 examination by the Department of Pathology at SRRSH or CHLP, respectively. ER/PR positivity was defined as ≥1% positively stained tumor cells on IHC staining27. All the pathology results were independently reviewed and confirmed by two experienced pathologists in accordance with the 5th World Health Organization (WHO) Classification of Tumors of the Breast28. The tumor stage was defined according to the latest edition of the TNM classification of the American Joint Committee on Cancer (AJCC). Stage III/IV was defined as late stage, and stage I/II was defined as early stage. OS was defined as the period from the time of surgery to the time of death from any cause or the time of last follow-up. DFS was defined as the period from the time of surgery to the time of relapse or death from any cause. Recurrence included local, regional and distant recurrence.
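The endpoint definitions above can be sketched as a small helper. This is a minimal illustration, not the authors' code: the function names, the month conversion factor, and the rule that patients without an event are censored at last follow-up are assumptions made for the example.

```python
from datetime import date
from typing import Optional, Tuple

# Assumption: average month length used to convert days to months.
DAYS_PER_MONTH = 30.4375


def months_between(start: date, end: date) -> float:
    """Elapsed time between two dates, expressed in months."""
    return (end - start).days / DAYS_PER_MONTH


def overall_survival(surgery: date, death: Optional[date],
                     last_follow_up: date) -> Tuple[float, int]:
    """Return (OS in months, event flag); 1 = death from any cause, 0 = censored."""
    if death is not None:
        return months_between(surgery, death), 1
    return months_between(surgery, last_follow_up), 0


def disease_free_survival(surgery: date, relapse: Optional[date],
                          death: Optional[date],
                          last_follow_up: date) -> Tuple[float, int]:
    """Return (DFS in months, event flag); the event is relapse or death, whichever is first."""
    event_dates = [d for d in (relapse, death) if d is not None]
    if event_dates:
        return months_between(surgery, min(event_dates)), 1
    return months_between(surgery, last_follow_up), 0


def stage_group(stage: str) -> str:
    """Stage I/II is early stage and stage III/IV is late stage, as defined in the text."""
    return "late" if stage in ("III", "IV") else "early"
```

For example, a patient operated on 1 January 2015 and alive at follow-up on 1 January 2020 contributes about 60 months of OS with an event flag of 0.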
Evaluation and rescoring of HER2 status
All the evaluations of HER2 status in this study were in accordance with the 2018 ASCO/CAP guidelines29. Briefly, the IHC staining of HER2 positivity was performed utilizing Ventana BenchMark Ultra automatic stainer and the Ventana Ultra View universal DAB detection kit (Ventana Medical System Inc., Roche Tucson, USA). We exclusively used antibodies purchased from Roche Ventana. Subsequently, a dual-probe fluorescence in situ hybridization (FISH) test was performed for those samples with equivocal IHC results (2+) using the PathVysion HER2 DNA probe Kit (Vysis Inc. in Downers Grove, IL) on the same specimen as the IHC test. All the IHC and FISH results were evaluated by two independent experienced pathologists.
The threshold for HER2 overexpression (HER2 3+) changed from 30% to 10% in 201330,31, which affected the identification of HER2 2+. However, the criteria for HER2 1+ and HER2-zero have remained unchanged since 1998. Therefore, two independent experienced pathologists reevaluated HER2 3+ versus HER2 2+ for patients enrolled from 2007 to 2013.
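The three HER2 categories used throughout this study (IHC 3+ or IHC 2+ with ISH+ as HER2-positive; IHC 2+ with ISH- or IHC 1+ as HER2-low; IHC 0 as HER2-zero) can be written as a short decision rule. The function below is an illustrative sketch; its name and signature are hypothetical and it does not reproduce the full ASCO/CAP scoring algorithm.

```python
from typing import Optional


def classify_her2(ihc: int, ish_positive: Optional[bool] = None) -> str:
    """Map an IHC score (0-3) and an optional ISH result to a HER2 category."""
    if ihc not in (0, 1, 2, 3):
        raise ValueError("IHC score must be 0, 1, 2 or 3")
    if ihc == 3:
        return "HER2-positive"
    if ihc == 2:
        # IHC 2+ is equivocal, so the category depends on the ISH result.
        if ish_positive is None:
            raise ValueError("IHC 2+ requires an ISH result")
        return "HER2-positive" if ish_positive else "HER2-low"
    if ihc == 1:
        return "HER2-low"
    return "HER2-zero"
```

Raising an error for an IHC 2+ sample without an ISH result, rather than guessing a category, mirrors the requirement that equivocal cases undergo FISH testing.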
Machine learning prediction models
Four machine learning models, namely Random Forest, Extreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), and Multi-Layer Perceptron (MLP), were employed to predict the 5-year survival outcome. The models were implemented using functions from three Python packages: scikit-learn, xgboost, and catboost. The independent variables of the models were as follows: HER2 status (HER2-zero (0) vs. HER2-low (1) vs. HER2-positive (2)), age, stage (Early (0) vs. Late (1)), Ki67 status (<15% (0) vs. 15%–35% (1) vs. >35% (2)), histologic type (Invasive ductal (0) vs. Invasive lobular (1) vs. Others (2)), intravascular cancer thrombus (No (0) vs. Yes (1)), endocrine therapy (No (0) vs. Yes (1)), target therapy (No (0) vs. Yes (1)), and radiation therapy (No (0) vs. Yes (1)). Among these, age was sequentially encoded into the model as 0 to 5 according to the variable binning rules in Table 1. The dependent variable of the model, namely the 5-year survival outcome, was encoded as survival (0) and death (1).
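The variable coding described above can be sketched as a function that turns one patient record into the nine-element feature vector. The record keys and helper names are hypothetical, and the age cut points are placeholders only, since the actual binning rules are given in Table 1 and are not reproduced here.

```python
from bisect import bisect_right
from typing import Dict, List

# HYPOTHETICAL age cut points producing codes 0..5; the real bins are in Table 1.
AGE_BIN_EDGES = [30, 40, 50, 60, 70]

HER2_CODES = {"HER2-zero": 0, "HER2-low": 1, "HER2-positive": 2}
HISTOLOGY_CODES = {"invasive ductal": 0, "invasive lobular": 1}  # anything else -> 2


def encode_ki67(percent: float) -> int:
    """<15% -> 0, 15%-35% -> 1, >35% -> 2, as listed in the text."""
    if percent < 15:
        return 0
    if percent <= 35:
        return 1
    return 2


def encode_patient(record: Dict) -> List[int]:
    """Encode one patient as the feature vector fed to the models."""
    return [
        HER2_CODES[record["her2"]],
        bisect_right(AGE_BIN_EDGES, record["age"]),      # ordinal age code 0..5
        1 if record["stage"] in ("III", "IV") else 0,    # early (0) vs. late (1)
        encode_ki67(record["ki67_percent"]),
        HISTOLOGY_CODES.get(record["histology"], 2),
        int(record["thrombus"]),                         # intravascular cancer thrombus
        int(record["endocrine"]),
        int(record["target"]),
        int(record["radiation"]),
    ]


example = {
    "her2": "HER2-low", "age": 45, "stage": "II", "ki67_percent": 20,
    "histology": "invasive ductal", "thrombus": False,
    "endocrine": True, "target": False, "radiation": True,
}
features = encode_patient(example)
```

With the placeholder age bins, the example record is encoded as `[1, 2, 0, 1, 0, 0, 1, 0, 1]`.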
In the CHLP cohort and two subgroups (HoR positive and HoR negative), the dataset was randomly split into evaluation and test sets in a ratio of 8:2. Only complete observations were used in the analysis. Sample sizes for each dataset are provided in the ‘Sample’ row of Table S1. In the evaluation set, the models utilized a random search strategy within a finite parameter space to search for the optimal parameter combinations through five-fold cross-validation. The model’s generalizability was evaluated by first comparing the CHLP and SRRSH cohorts using t-tests and then validating it on the SRRSH cohort (Table S2). A sensitivity analysis was conducted by randomly splitting the dataset 10 times with a fixed split ratio. The mean and variance of the model’s performance were reported.
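The sensitivity analysis described above (ten random 8:2 splits, reporting the mean and variance of performance) can be sketched in plain Python. This is an illustration under stated assumptions: the data are synthetic, `fit_rate_model` is a toy stand-in for the real classifiers, and the random search with five-fold cross-validation is omitted for brevity.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (death, survivor) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def split_indices(n: int, test_fraction: float,
                  rng: random.Random) -> Tuple[List[int], List[int]]:
    """Shuffle indices and return (evaluation indices, test indices)."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def repeated_split_auc(X: List[List[int]], y: List[int], fit: Callable,
                       n_repeats: int = 10, test_fraction: float = 0.2,
                       seed: int = 0) -> Tuple[float, float]:
    """Refit on each random split and return the mean and variance of the test AUC."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_repeats):
        train, test = split_indices(len(y), test_fraction, rng)
        predict = fit([X[i] for i in train], [y[i] for i in train])
        aucs.append(auc([y[i] for i in test], [predict(X[i]) for i in test]))
    return statistics.mean(aucs), statistics.variance(aucs)


def fit_rate_model(X_train: List[List[int]], y_train: List[int]) -> Callable:
    """Toy model: predicts the training death rate observed for each stage code."""
    totals, deaths = {}, {}
    for row, label in zip(X_train, y_train):
        totals[row[0]] = totals.get(row[0], 0) + 1
        deaths[row[0]] = deaths.get(row[0], 0) + label
    overall = sum(y_train) / len(y_train)
    return lambda row: deaths[row[0]] / totals[row[0]] if row[0] in totals else overall


# Synthetic data: one binary feature (late stage) that raises the death probability.
data_rng = random.Random(42)
X = [[data_rng.randint(0, 1)] for _ in range(200)]
y = [1 if data_rng.random() < (0.5 if row[0] else 0.2) else 0 for row in X]
mean_auc, var_auc = repeated_split_auc(X, y, fit_rate_model)
```

Reporting the variance alongside the mean shows how much the test AUC depends on the particular split, which a single 8:2 partition cannot reveal.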
We performed attribution analysis on the optimized models using SHapley Additive exPlanations (SHAP) on the entire dataset32. SHAP values show the contribution of each feature to the model's prediction, implicitly controlling for other features. A positive SHAP value indicates that the feature pushes the prediction toward the death outcome, whereas a negative value indicates that it pushes the prediction toward survival.
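To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a toy two-feature risk model by enumerating all feature subsets. It is not the SHAP library and not the study's MLP model: the risk function, its coefficients, and the single all-zero baseline are assumptions chosen for illustration, whereas SHAP implementations typically approximate these values against a background dataset.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence, Set


def shapley_values(model: Callable[[List[float]], float],
                   x: Sequence[float],
                   baseline: Sequence[float]) -> List[float]:
    """Exact Shapley attribution of model(x) - model(baseline) to each feature."""
    n = len(x)

    def value(subset: Set[int]) -> float:
        # Features in `subset` take the patient's values; the rest stay at baseline.
        return model([x[i] if i in subset else baseline[i] for i in range(n)])

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(features: List[float]) -> float:
    """HYPOTHETICAL death risk from two binary features: HER2-zero and late stage."""
    her2_zero, late_stage = features
    return 0.10 + 0.05 * her2_zero + 0.08 * late_stage + 0.04 * her2_zero * late_stage


phi = shapley_values(toy_risk, x=[1, 1], baseline=[0, 0])
```

Here both attributions are positive (about 0.07 for HER2-zero status and 0.10 for late stage), and they sum to the difference between the patient's predicted risk (0.27) and the baseline risk (0.10), with the interaction term shared equally between the two features.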
Statistical analysis
Pearson’s χ2 test (for categorical parameters with more than two categories), the Mann–Whitney test (for continuous parameters), and Fisher’s exact test (for binary parameters) were employed to evaluate the associations between HER2 status and clinic-pathological parameters.
Multivariable Cox proportional hazards models including the same variables as the machine learning models were used to report hazard ratios (HRs) with 95% CIs for OS. Similarly, a multivariable Cox proportional hazards model involving the same variables was used to report the HR and 95% CI for DFS in the SRRSH cohort. Missing data were not imputed in any regression model.
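In a Cox model the hazard ratio is the exponential of the regression coefficient, and its 95% CI is obtained by exponentiating the coefficient plus or minus 1.96 standard errors. The sketch below works backwards from a reported HR and CI to the implied standard error and Wald p value; it is an arithmetic check under a normal approximation, not the authors' analysis, and the function name is hypothetical.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_ci(hr: float, lower: float, upper: float) -> Tuple[float, float, float, float]:
    """Recover (log HR, standard error, z statistic, two-sided p) from an HR and its 95% CI."""
    beta = math.log(hr)
    # The CI spans 2 * 1.96 standard errors on the log scale.
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p under the normal approximation
    return beta, se, z, p


# HER2-low vs. HER2-zero in the SRRSH cohort, as reported in the Results.
beta, se, z, p = wald_from_ci(0.62, 0.39, 0.97)
```

For the SRRSH estimate (HR = 0.62, 95% CI 0.39–0.97) this gives a p value of roughly 0.04, in line with the reported p = 0.037 once rounding of the published HR and CI is taken into account.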
Survival curves of OS and DFS were estimated using the Kaplan–Meier method, and a two-sided log-rank test was applied to evaluate the difference between groups according to HER2 status. Subgroup analysis was performed according to HoR status. The models for HoR-positive patients involved all the variables, whereas the models for HoR-negative patients had all the variables except endocrine therapy. The Receiver Operating Characteristic (ROC) curve was used to evaluate the prediction performance of the models. The generalization performance was validated on the test set of the CHLP cohort and on the entire dataset of the SRRSH cohort. The machine learning model with the best generalization performance was uploaded to a cloud service and served as the core model for the online prediction website (https://s481r82389.imdo.co/home).
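The Kaplan–Meier estimate multiplies, at each observed death time, the fraction of at-risk patients who survive that time. A minimal sketch follows; the follow-up times and event flags are invented for illustration and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct time with at least one death (event = 1)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= removed
    return curve


# Invented follow-up in months; 1 = death, 0 = censored at last follow-up.
curve = kaplan_meier([6, 12, 12, 30, 48, 60], [1, 1, 0, 1, 0, 0])
```

In this example the estimated survival drops to 5/6 at 6 months, 2/3 at 12 months, and 4/9 at 30 months; censored patients reduce the risk set without producing a step.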
All statistical tests were two-sided and the threshold of significance was set at 5%. The statistical analyses were performed using SPSS Statistics 27 (SPSS Inc., Chicago, US) and Excel (Microsoft, WA).
Data availability
The datasets generated and/or analyzed during the current study are not publicly available due to patients’ privacy protection, but are available from the corresponding author on reasonable request.
Code availability
The code for this study is available on GitHub and can be accessed at https://github.com/Maxin-C/Her2-Low-ML.git.
References
Siegel, R. L., Miller, K. D., Wagle, N. S. & Jemal, A. Cancer statistics, 2023. CA Cancer J. Clin. 73, 17–48 (2023).
Xia, C. et al. Cancer statistics in China and United States, 2022: profiles, trends, and determinants. Chin. Med. J. 135, 584–590 (2022).
Sung, H. et al. Global Cancer Statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 71, 209–249 (2021).
Rakha, E. A. & Ellis, I. O. Breast cancer: updated guideline recommendations for HER2 testing. Nat. Rev. Clin. Oncol. 11, 8–9 (2014).
Loibl, S. & Gianni, L. HER2-positive breast cancer. Lancet 389, 2415–2429 (2017).
Modi, S. et al. Trastuzumab deruxtecan in previously treated HER2-low advanced breast cancer. N. Engl. J. Med. 387, 9–20 (2022).
Bardia, A. et al. Trastuzumab deruxtecan after endocrine therapy in metastatic breast cancer. N. Engl. J. Med. 391, 2110–2122 (2024).
Denkert, C. et al. Clinical and molecular characteristics of HER2-low-positive breast cancer: pooled analysis of individual patient data from four prospective, neoadjuvant clinical trials. Lancet Oncol. 22, 1151–1161 (2021).
Poschke, P. et al. Clinical characteristics and prognosis of HER2-0 and HER2-low-positive breast cancer patients: real-world data from patients treated with neoadjuvant chemotherapy. Cancers 15, https://doi.org/10.3390/cancers15194678 (2023).
Dai, Q. et al. Prognostic impact of HER2-low and HER2-zero in resectable breast cancer with different hormone receptor status: a landmark analysis of real-world data from the National Cancer Center of China. Target Oncol. 19, 81–93 (2024).
Dai, L. J. et al. Molecular features and clinical implications of the heterogeneity in Chinese patients with HER2-low breast cancer. Nat. Commun. 14, 5112 (2023).
Han, B. Y. et al. Clinical sequencing defines the somatic and germline mutation landscapes of Chinese HER2-Low Breast Cancer. Cancer Lett. 588, 216763 (2024).
Tarantino, P. et al. Comprehensive genomic characterization of HER2-low and HER2-0 breast cancer. Nat. Commun. 14, 7496 (2023).
Sammut, S. J. et al. Multi-omic machine learning predictor of breast cancer therapy response. Nature 601, 623–629 (2022).
Duan, H. et al. Machine learning-based prediction model for distant metastasis of breast cancer. Comput. Biol. Med. 169, 107943 (2024).
Molinelli, C. et al. Prognostic value of HER2-low status in breast cancer: a systematic review and meta-analysis. ESMO Open 8, 101592 (2023).
Peiffer, D. S. et al. Clinicopathologic characteristics and prognosis of ERBB2-low breast cancer among patients in the national cancer database. JAMA Oncol. 9, 500–510 (2023).
Won, H. S. et al. Clinical significance of HER2-low expression in early breast cancer: a nationwide study from the Korean Breast Cancer Society. Breast Cancer Res. 24, 22 (2022).
Mutai, R. et al. Prognostic impact of HER2-low expression in hormone receptor positive early breast cancer. Breast 60, 62–69 (2021).
Dischinger, P. S. et al. NF1 deficiency correlates with estrogen receptor signaling and diminished survival in breast cancer. NPJ Breast Cancer 4, 29 (2018).
Marsh, T. et al. Autophagic degradation of NBR1 restricts metastatic outgrowth during mammary tumor progression. Dev. Cell 52, 591–604.e596 (2020).
Rennert, G. et al. Clinical outcomes of breast cancer in carriers of BRCA1 and BRCA2 mutations. N. Engl. J. Med. 357, 115–123 (2007).
Clift, A. K. et al. Development and internal-external validation of statistical and machine learning models for breast cancer prognostication: cohort study. BMJ 381, e073800 (2023).
Zhao, J. et al. Radiomic and clinical data integration using machine learning predict the efficacy of anti-PD-1 antibodies-based combinational treatment in advanced breast cancer: a multicentered study. J. Immunother. Cancer 11, https://doi.org/10.1136/jitc-2022-006514 (2023).
Massa, D. et al. Immune and gene-expression profiling in estrogen receptor low and negative early breast cancer. J. Natl. Cancer Inst. 116, 1914–1927 (2024).
McShane, L. M. et al. REporting recommendations for tumour MARKer prognostic studies (REMARK). Br. J. Cancer 93, 387–391 (2005).
Allison, K. H. et al. Estrogen and progesterone receptor testing in breast cancer: ASCO/CAP guideline update. J. Clin. Oncol. 38, 1346–1366 (2020).
Tan, P. H. et al. The 2019 World Health Organization classification of tumours of the breast. Histopathology 77, 181–185 (2020).
Wolff, A. C. et al. Human epidermal growth factor receptor 2 testing in breast cancer: American Society of Clinical Oncology/College of American Pathologists clinical practice guideline focused update. J. Clin. Oncol. 36, 2105–2122 (2018).
Wolff, A. C. et al. Recommendations for human epidermal growth factor receptor 2 testing in breast cancer: American Society of Clinical Oncology/College of American Pathologists clinical practice guideline update. J. Clin. Oncol. 31, 3997–4013 (2013).
Wolff, A. C. et al. American Society of Clinical Oncology/College of American Pathologists guideline recommendations for human epidermal growth factor receptor 2 testing in breast cancer. J. Clin. Oncol. 25, 118–145 (2007).
Lundberg, S. M. & Lee, S. I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 30 https://arxiv.org/abs/1705.07874 (2017).
Acknowledgements
We acknowledge all members of the field team for maintaining the clinical databases and following up the participants of this study. We thank all the patients, clinicians and pathologists who participated in the clinical study. This work was funded by the Nature and Science Fund Public Program of Zhejiang Province (LGF22H160008) and the Open Project of Zhejiang Provincial Key Laboratory of Intelligent Preventive Medicine (1-4-2020E10004). The funders played no role in study design, data collection, analysis and interpretation of data, or the writing of this manuscript.
Author information
Contributions
The study was designed by Q.W. H.Z., Y.W., Y.C., L.C., and W.L. contributed to data acquisition. Biostatistical analysis was done by Z.C. and Q.W. Patient recruitment, sample collection and data collection were done by Q.W., Y.W., HM.Z., H.Z., X.Y., Z.J., X.L., B.G., W.H., L.W., J.Z., Q.D., H.L., and JC.Z. All authors had full access to all the data in the study and interpreted the data. The first draft of the report was written by Q.W. Verification of the underlying data was done by Z.C., Q.W., and H.Z. The decision to submit the report for publication was made by all the authors. All authors contributed to the review of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wang, Q., Wang, Y., Chen, Z. et al. Clinical characteristics and predictive models of HER2-low breast cancer patients who only received adjuvant chemotherapy: a real-world retrospective multicenter study. npj Precis. Onc. 9, 208 (2025). https://doi.org/10.1038/s41698-025-00998-3
DOI: https://doi.org/10.1038/s41698-025-00998-3






