Introduction

Triple-negative breast cancer (TNBC) is a highly heterogeneous disease defined by the absence of estrogen receptor (ER) and progesterone receptor (PR) expression, and human epidermal growth factor receptor 2 (HER2) overexpression/ERBB2 gene amplification. The lack of effective targets, coupled with a more aggressive behavior compared to other breast cancer (BC) subtypes, has led to poor clinical outcomes for patients with TNBC1. Immunogenic and transcriptomic profiling have highlighted the considerable intrinsic heterogeneity of this subtype2,3,4; Lehmann et al. classified TNBC into distinct prognostic subtypes based on gene expression data3,4, and our group identified four epigenetic subtypes based on DNA methylation (DNAm) profiling5. Understanding the complex and varied molecular underpinnings of TNBC is essential for improving the management and outcomes of patients with this aggressive disease.

The American Society of Clinical Oncology/College of American Pathologists (ASCO/CAP) established guidelines to characterize ER, PR, and HER2 positivity, which essentially defines TNBC, based on the ability of these biomarkers to predict response to targeted therapies6,7. HER2 protein expression of 0 or 1+ by immunohistochemistry (IHC), or IHC 2+ with ERBB2 gene non-amplified by in situ hybridization (ISH), is considered negative and predictive of non-response to traditional anti-HER2 therapies. Newer antibody-drug conjugates (ADCs), however, have demonstrated efficacy against BC with low expression of HER2 (IHC 1+ or 2+/ERBB2 non-amplified)8,9. The DESTINY Breast04 clinical trial showed that treatment with trastuzumab deruxtecan (T-DXd) resulted in a 50% reduction in the risk of progression and a 36% reduction in the risk of death for patients with HER2-low BC treated in the metastatic setting10. The therapeutic efficacy of T-DXd was consistent regardless of hormone receptor status; based on these results, the FDA approved T-DXd to treat patients with HER2 IHC 1+ or 2+/ERBB2 non-amplified, metastatic BC.

Although HER2-low expression is now recognized as a predictive biomarker for ADC therapy, the relevance of HER2-low BC as a distinct entity at the molecular and clinical levels is still unclear. Studies have produced conflicting results when comparing HER2-low and HER2-zero BC11,12,13,14. Schettini et al.12 demonstrated that hormone receptor status was the primary driver of tumor biology rather than HER2 expression in HER2-negative tumors. Additionally, HER2 expression appeared to be closely associated with ER levels11,15,16,17,18. However, the clinical and molecular relevance of HER2-low expression in hormone receptor-negative tumors has been marginally explored. In a cohort study of over 30,000 patients with TNBC, we demonstrated that HER2-low tumors had lower rates of pathologic complete response (pCR) to neoadjuvant chemotherapy (NAC) compared to HER2-zero tumors, and HER2-low status remained significantly associated with lower rates of pCR after adjustment for clinicopathologic covariates19.

In the current study, we generated genome-wide DNAm profiles from clinically annotated TNBC tumors stratified according to HER2 status from a multi-institutional cohort of patients treated at Duke, UCLA, and Scripps Health (DUS). We further integrated genomic, epigenomic, transcriptomic, and proteomic profiling from TNBC patients enrolled in The Cancer Genome Atlas (TCGA), the Sweden Cancerome Analysis Network - Breast (SCAN-B), and the I-SPY2 Clinical Trial. By investigating the biological pathways and gene networks that differentiate HER2-low from HER2-zero TNBC across multiple datasets, we aimed to clarify the role of HER2-low expression in influencing tumor phenotype in patients with hormone receptor-negative BC.

Results

Genomic landscape and HER2 activation in HER2-low TNBC

We analyzed whole-exome single-nucleotide variant (SNV) data from 95 TNBC cases from the TCGA cohort (31 HER2-zero and 64 HER2-low). Consistent with previous findings, TP53 was the most frequently mutated gene in both groups, followed by TTN, a gene known for its extraordinary length and elevated baseline mutation frequency. Although the difference was not statistically significant, HER2-low TNBC tumors exhibited a higher mutation frequency of the PIK3CA gene compared to HER2-zero tumors (14% vs. 3%, P = 0.16; Fig. 1a). No significant differences were observed in the mutation frequencies of 20 genes involved in the HER2 pathway (Supplementary Fig. 1a). However, HER2-low tumors displayed a significantly lower tumor mutational burden (TMB) than HER2-zero tumors (P = 0.02; Fig. 1b). Interestingly, HER2-low tumors exhibited a lower frequency of loss of heterozygosity (LOH) of the ERBB2 gene (14% vs 35%; P = 0.04) and a higher frequency of diploidy (44% vs 27%; P = 0.04; Fig. 1c). In concordance, the level of ERBB2 expression was higher in HER2-low tumors (P = 0.02; Supplementary Fig. 1b). Consistently, the levels of HER2 protein (log2 FC = 0.21; P = 0.04) and phosphorylation of tyrosine 1248 (Y1248; log2 FC = 0.14; P = 0.04) were higher in HER2-low tumors (Fig. 1d). Together, these findings demonstrate that HER2-low tumors display distinct biological features compared to HER2-zero tumors, including higher ERBB2 expression, higher HER2 protein levels, and elevated HER2 activation. While both subtypes are classified as HER2-negative per clinical guidelines, the observed HER2 activation in HER2-low tumors suggests potentially distinct biological behavior, with implications for clinical outcomes and response to targeted therapies.

Fig. 1: Comparative molecular characterization of HER2-zero (n = 31) and HER2-low (n = 64) TNBC tumors.

a The OncoPrint view of the top eight mutated genes. Genes are sorted by the frequency of variants in the overall study population. The percentages listed show the frequency of alterations in HER2-zero and HER2-low, respectively. b TMB levels in HER2-zero and HER2-low TNBC tumors. TMB is represented as log10 of mutations per Mb sequenced. HER2-zero TNBC is shown in purple and HER2-low in turquoise. The Wilcoxon rank-sum test was used to test statistical differences: *P < 0.05. c Percentage of ERBB2 gene copy number category for each cohort. Gain: >2 copies, diploid: 2 copies, and loss of heterozygosity: =1 copy. HER2-zero TNBC is shown in purple and HER2-low in turquoise. The Chi-square test was used to test statistical differences. d Boxplots showing the protein levels of HER2 and phosphorylated HER2 (pHER2-Y1248) in HER2-zero and HER2-low TNBC tumors. HER2-zero TNBC is shown in purple and HER2-low in turquoise. Student’s t-test and Wilcoxon rank-sum test were used respectively to test statistical differences: *P < 0.05. TNBC triple-negative breast cancer, TMB tumor mutational burden, LOH loss of heterozygosity.

Epigenetic profiling reveals hypermethylation of immune-related pathways in HER2-low TNBC

To identify epigenetically regulated mechanisms of HER2-low TNBC, we compared genome-wide DNAm profiles between HER2-low and HER2-zero tumors using TCGA data. Our analysis identified 5287 differentially methylated sites (DMS) (absolute differential mean β value > 10%, P < 0.05; Fig. 2a). Among those, HER2-low tumors exhibited hypermethylation in 4357 genomic regions and hypomethylation in 930 genomic regions (P < 0.001). Enrichment analyses of hypermethylated genomic regions revealed significant involvement of immune-related pathways, including lymphocyte regulation, T-cell signaling, alpha-beta T-cell activation, and T-cell receptor signaling (P < 0.001; Fig. 2b).
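The thresholds stated above (absolute difference in mean β value > 10%, P < 0.05) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the function name `call_dms` is hypothetical, the β values in the usage note are invented, and the P-value is assumed to be precomputed, since the text does not name the statistical test used per site.

```python
from statistics import mean

DELTA_BETA_CUTOFF = 0.10  # absolute difference in mean beta (10%), from the text
P_CUTOFF = 0.05           # nominal P-value threshold, from the text


def call_dms(beta_low, beta_zero, p_value):
    """Label one CpG site as 'hyper', 'hypo', or None (not a DMS).

    beta_low / beta_zero: beta values (0..1) for HER2-low / HER2-zero tumors.
    p_value: precomputed P-value for the site (assumption: the test that
    produced it is outside the scope of this sketch).
    """
    # Direction is defined relative to HER2-zero, as in the volcano plot.
    delta = mean(beta_low) - mean(beta_zero)
    if abs(delta) <= DELTA_BETA_CUTOFF or p_value >= P_CUTOFF:
        return None
    return "hyper" if delta > 0 else "hypo"
```

For example, with invented values, `call_dms([0.8, 0.9], [0.5, 0.6], 0.01)` labels the site as hypermethylated in HER2-low tumors, whereas a site with a mean difference below 10% is not called regardless of its P-value.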

Fig. 2: DNA methylation differences between HER2-zero and HER2-low TNBC and their overlap with emQTLs.

a Volcano plot displaying the differential mean β value (x-axis) and the negative log10 of the P-values (y-axis). Compared to HER2-zero, the HER2-low cohort had 5287 DMS, of which 4357 were hypermethylated (red) and 930 were hypomethylated (green). b The top 10 enriched GO biological processes identified using hypermethylated DMS from the TCGA cohort. The x-axis shows the negative log10 of the P-values. c UCSC Genome Browser (GRCh38/hg38) view of differential methylation across the 6p22.1 to 6p21.32 genomic region. The TCGA cohort is shown at the top, alongside our multi-institutional cohort below, with chromatin state segmentations defined by the HMM from ENCODE. Genes are colored according to HLA type: orange represents type I HLAs, while blue represents type II HLAs. Chromatin segmentation colors are as follows: promoters (red), enhancers (yellow), insulators (blue), heterochromatin (gray), and transcriptionally active regions (green). d Percentage of DMS in emQTL Biclusters associated with various cellular processes. e Venn diagram illustrating the overlap of DMS included in the immune response, Bicluster 5. The diagram includes the OR of being differentially methylated and part of the immune-related Bicluster, along with the P-value and 95% confidence intervals. TNBC triple-negative breast cancer, emQTL expression-methylation Quantitative Trait Loci, DMS differentially methylated sites, GO gene ontology, OR odds ratio.

A detailed genome-wide analysis uncovered a prominent cluster of hypermethylated sites localized at the 6p22.1 to 6p21.32 (3.5 Mb) genomic region. This region encodes multiple HLA genes of the major histocompatibility complex (MHC) class I and II, which are critical for antigen processing and presentation (Supplementary Fig. 2). Genome-wide DNAm profiles from patients in our DUS cohort confirmed the significant hypermethylation of HLA genes in HER2-low TNBC tumors (Fig. 2c). In addition, we found in both the DUS and TCGA cohorts that the hypermethylated sites were primarily affecting transcriptionally active regions throughout the HLA loci (Fig. 2c), supporting a repressive impact of these epigenetic alterations in HER2-low tumors.

Analysis of genome-wide hypermethylated sites in HER2-low tumors from our DUS cohort also revealed significant hypermethylation of genomic regions linked to CD4-positive or CD8-positive T-cell lineage commitment, antigen presentation, and leukocyte migration (P < 0.05; Supplementary Fig. 3), concordant with our findings from the TCGA cohort.

Functional implications of epigenetic alterations in HER2-low TNBC

We then examined the overlap between DMS observed in HER2-low TNBC tumors with expression-methylation Quantitative Trait Loci (emQTLs) to identify biologic pathways influenced by these methylation changes. Using curated sets of CpG sites with demonstrated regulatory effects on gene expression (Biclusters; Supplementary Data Table 1)20, we observed that a large percentage of DMS in HER2-low TNBC tumors fell within Bicluster 5, a cluster specifically related to the regulation of the immune response in BC tumors (Fig. 2d). Specifically, 884 DMS were identified in Bicluster 5, of which 883 were hypermethylated, suggesting widespread epigenetic silencing of immune-related genes in HER2-low tumors. This association was statistically significant, with sites differentially methylated between HER2-low and HER2-zero tumors showing 8.77-fold higher odds of being linked to Bicluster 5 (OR = 8.77, 95% CI: 8.12 to 9.46; P < 0.001; Fig. 2e) compared to other genomic regions. The immune-related genes within Bicluster 5 are involved in critical processes such as antigen presentation, T-cell activation, and leukocyte recruitment, key components required for effective anti-tumor immunity.
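To make the odds ratio above concrete, the following Python sketch computes an OR and an approximate 95% CI from a 2×2 table of CpG sites (DMS or not, in Bicluster 5 or not). It uses the Wald (log-OR) approximation, which is a simplification swapped in for illustration; the study reports using the R package questionR, whose estimator may differ. The function name and the counts in the usage note are hypothetical and are not the study's data.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI for a 2x2 table.

    a: DMS inside the bicluster        b: DMS outside the bicluster
    c: non-DMS inside the bicluster    d: non-DMS outside the bicluster
    Uses the Wald interval on the log scale (an assumption of this sketch).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

With invented counts, `odds_ratio_wald(10, 20, 5, 40)` gives an OR of 4.0 with a confidence interval that brackets it; an interval that excludes 1 corresponds to a significant association at the 5% level.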

Transcriptomic alterations in immune-related pathways in HER2-low TNBC tumors

Supporting the findings from the epigenetic analyses from our DUS cohort and TCGA, gene expression differences between HER2-zero and HER2-low TNBC in the TCGA cohort were predominantly noted in immune-related functions (adjusted P < 0.001; Fig. 3a). Specifically, the comparison of global gene expression changes revealed that the adaptive immune response pathways were the most significantly affected in HER2-low tumors (Supplementary Fig. 4a). Among 928 differentially expressed genes (DEGs) (absolute log2 fold change > 0.5; adjusted P < 0.05), 326 genes were significantly downregulated, and 602 were upregulated in HER2-low tumors (Supplementary Fig. 4b). In concordance with our findings in the epigenetic analysis of these tumors, downregulated genes were enriched in pathways critical for effective immune responses, such as leukocyte activation, T-cell signaling, and adaptive immune response processes (adjusted P < 0.001; Fig. 3b), supporting a decrease in immune activity in HER2-low TNBC tumors.
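The DEG criteria above (absolute log2 fold change > 0.5; adjusted P < 0.05) amount to a simple filter on per-gene results. The Python sketch below illustrates that filter only; the function name, the input layout, and the gene labels in the usage note are hypothetical, and it assumes that a positive log2 fold change means higher expression in HER2-low tumors.

```python
def split_degs(results, lfc_cutoff=0.5, padj_cutoff=0.05):
    """Split differential expression results into up- and downregulated genes.

    results: iterable of (gene, log2_fold_change, adjusted_p) tuples.
    Assumption: positive log2 fold change = higher in HER2-low tumors.
    Genes without an adjusted P-value (None) are skipped.
    """
    up, down = [], []
    for gene, lfc, padj in results:
        if padj is None or padj >= padj_cutoff or abs(lfc) <= lfc_cutoff:
            continue  # fails one of the two thresholds from the text
        (up if lfc > 0 else down).append(gene)
    return up, down
```

For instance, given the invented rows `[("GENE_A", 1.2, 0.001), ("GENE_B", -0.8, 0.01), ("GENE_C", 0.3, 0.001)]`, only GENE_A is returned as upregulated and GENE_B as downregulated; GENE_C fails the fold-change threshold.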

Fig. 3: Analysis of biological processes associated with gene expression differences between HER2-low and HER2-zero TNBC.

a The top 10 GO biological processes dysregulated in HER2-low TNBC tumors through GSEA. Pathways are sorted by NES (x-axis). b The top 10 molecular pathways associated with genes downregulated in HER2-low TNBC. The ratio (x-axis) represents the proportion of genes downregulated per total number of genes in the pathway. c Cascade plot representing differential gene expression of HLA genes between HER2-low and HER2-zero TNBC tumors in the TCGA cohort. The red color illustrates statistical significance. Wald test from the DESeq2 pipeline was used to assess statistical differences, P < 0.05. d Upset plot showing the immune-related pathways assessed using GSEA that are exclusively or commonly significant across TCGA (n = 411) and the validation datasets I-SPY2 (n = 182) and SCAN-B (n = 11) between HER2-low and HER2-zero TNBC tumors. The intersections indicate the number of pathways shared among the respective datasets. TNBC triple-negative breast cancer, GO gene ontology, GSEA gene set enrichment analysis, NES normalized enrichment score, HLA human leukocyte antigen.

A deeper analysis of HLA gene expression demonstrated significant downregulation of HLA genes (Fig. 3c), consistent with the observed hypermethylation of sites encoding HLA genes of the MHC in HER2-low tumors. To validate the robustness of these findings, we analyzed the transcriptomic data from SCAN-B and I-SPY2, which comprised 186 HER2-low and 163 HER2-zero TNBC tumors. Gene set enrichment analysis (GSEA) of gene expression profiles of tumors in these additional cohorts corroborated the immune suppression seen in HER2-low tumors, with consistent downregulation of key immune pathways, including lymphocyte activation and antigen presentation (Fig. 3d and Supplementary Fig. 5). Notably, HLA genes exhibited consistent trends of downregulation in HER2-low tumors, reinforcing the immune-depleted antigenic landscape identified in the TCGA cohort (Supplementary Fig. 6).

Immune cell depletion in the tumor microenvironment of HER2-low TNBC tumors

To understand how the molecular alterations identified in HER2-low TNBC impact the tumor microenvironment (TME), we performed a deconvolution analysis of cellular composition using the xCell algorithm. This analysis revealed significant reductions in multiple immune cell populations in HER2-low tumors, indicating an immune-depleted TME. Specifically, HER2-low tumors exhibited significantly lower levels of activated dendritic cells (aDC, P = 0.022), B-cells (P = 0.015), CD4+ memory T-cells (P = 0.010), CD8+ central memory T-cells (Tcm, P = 0.011), class-switched memory B-cells (P = 0.002), macrophages (P = 0.046), M1 macrophages (P = 0.023), mast cells (P = 0.003), memory B-cells (P = 0.004), monocytes (P = 0.020), and Th2 cells (P = 0.017; Fig. 4a).

Fig. 4: Immune-evasive phenotype of the tumor microenvironment of HER2-low TNBC.

a Boxplots representing significant differences in the composition of TME immune cells. HER2-zero TNBC is shown in purple and HER2-low in turquoise. Statistical differences are represented as *P < 0.05, **P < 0.01. b Boxplots showing the immune and stroma scores. Scores were calculated using the xCell algorithm. HER2-zero TNBC is shown in purple and HER2-low in turquoise. The Wilcoxon rank-sum test was used to test statistical differences, which are represented as **P < 0.01; ns non-significant differences. TNBC triple-negative breast cancer, TME tumor microenvironment, aDC activated dendritic cells, CD8+ Tcm CD8+ central memory T-cells, Csm B-cells Class-switched memory B-cells, Th2 cells Type 2 T-helper cells.

Notably, the significant depletion of M1 macrophages, which play a pivotal role in anti-tumor activity through direct cytotoxic activity and promotion of T-cell responses, suggests a shift towards a less immune-active TME in HER2-low tumors. This depletion of immune infiltrates was further reflected in a lower immune score in HER2-low (P = 0.002, Fig. 4b). In contrast, the stromal scores did not show a significant difference between the two groups, suggesting that the observed differences are specific to the immune components of the tumors rather than generalized changes in the stroma.

Association of immune-related genes affected in HER2-low tumors with survival outcomes in TNBC

To evaluate the clinical implications of the observed molecular and TME alterations in HER2-low TNBC, we assessed the association of genes enriched in immune regulatory pathways (Bicluster 5) with survival. Of the 853 genes in this bicluster, 268 were differentially expressed in HER2-low TNBC tumors (Supplementary Data Table 2). Of these, 260 genes were significantly downregulated, while only eight were upregulated (Fig. 5a), suggesting a clear trend of epigenetic repression.

Fig. 5: Association of immune-evasive signature with prognosis for TNBC tumors.

a Volcano plot representing the log2 fold change of gene expression (x-axis) and the negative log10 of the P-value (y-axis) of genes included in the immune response Bicluster 5. 260 genes were downregulated (blue) and 8 were upregulated (red) in HER2-low TNBC tumors. b Differential expression of five genes from the Bicluster 5 selected for survival analyses. Wald test P-value from DESeq2 analysis was used to test statistical differences: ***P < 0.001. c RFS and d OS rates for ER-negative and HER2-negative patients stratified based on the expression of the five-gene immune-evasive signature. The Kaplan–Meier curves were obtained using the Kaplan–Meier Plotter web tool and include HR and 95% CI estimated from the associated unadjusted Cox proportional hazards model and the log-rank test p value. TNBC triple-negative breast cancer, ER estrogen receptor, OS overall survival, RFS relapse-free survival, HR hazard ratios, CI confidence intervals.

We further investigated the prognostic significance of these genes by focusing on the 10 most significantly altered genes and their association with survival outcomes. Survival analyses in an independent cohort of 846 TNBC patients revealed that eight of these genes were significantly associated with worse relapse-free survival (RFS) (Supplementary Fig. 7). Due to redundancy in the predictive potential among these eight genes, we refined the analysis based on P values and predictive strength for survival outcomes to a smaller set of five genes, all of which had notably lower expression in HER2-low TNBC: SPIB, SH3KBP1, HLA-G, BIRC3, and VCAM1 (Fig. 5b). This five-gene signature demonstrated robust prognostic value, showing strong associations with both RFS (HR = 0.52; 95% CI, 0.38 to 0.72; P < 0.001; Fig. 5c) and OS (HR = 0.36; 95% CI, 0.21 to 0.61; P < 0.001; Fig. 5d) with higher expression of these genes resulting in improved survival outcomes. Importantly, this association was maintained regardless of treatment status, indicating a consistent prognostic impact (Supplementary Fig. 8).
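As an illustration of how a multi-gene signature can be related to survival, the Python sketch below scores each patient by the mean expression of the five genes and computes a Kaplan–Meier survival curve for a group. It is only a sketch under stated assumptions: the mean-expression score, the function names, and the data in the usage note are hypothetical, and the study itself used the Kaplan–Meier Plotter web tool, whose scoring and cutoff selection may differ.

```python
from statistics import mean

SIGNATURE = ("SPIB", "SH3KBP1", "HLA-G", "BIRC3", "VCAM1")  # genes named in the text


def signature_score(expression):
    """Mean expression of the five signature genes (assumed scoring rule).

    expression: dict mapping gene symbol -> expression value for one patient.
    """
    return mean(expression[gene] for gene in SIGNATURE)


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) at each event time.

    times: follow-up times; events: 1 if the event occurred, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed  # events and censored patients both leave the risk set
    return curve
```

With invented follow-up data, `kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])` steps from 0.75 at time 1 to 0.375 at time 3 and 0.0 at time 4; the censored patient at time 2 reduces the risk set without lowering the curve.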

Discussion

In this exploratory study, we demonstrated that HER2-low expression in TNBC is associated with an immune-evasive phenotype compared to HER2-zero in both the tumor and TME. These findings suggest that HER2-low expression in hormone receptor-negative BC may confer unique biological characteristics to the tumor, potentially representing a distinct molecular subtype compared to HER2-zero. Our analysis uncovered a coordinated epigenetic repression of immune-related pathways, characterized by the hypermethylation of genes critical for antigen presentation and immune activation, particularly the HLA gene family, with a corresponding decrease in HLA gene expression and downregulation of genes involved in lymphocyte and leukocyte activation pathways. This silencing appears to confer an immune-evasive phenotype to HER2-low TNBC by decreasing both tumor immunogenicity and immune system activation. Data examining HER2-low BC and immune response are limited, but our findings align with those of others21,22; Hu et al.21 demonstrated that immunoglobulin-related genes (IGKC, IGHG1, IGHG4, and IGLC2) and immune activation pathways were significantly upregulated in HER2-zero TNBC compared to HER2-low. Misetic et al. demonstrated an inverse correlation between ERBB2 expression and immune response in BC, agnostic of subtype, with increasing ERBB2 oncogene levels associated with a decrease in Tumor Inflammation Signature23. Others have similarly demonstrated that pathways contributing to increased immunity were enriched in HER2-zero tumors22. Our study complements these findings with novel epigenetic data from independent TNBC cohorts, enhancing the robustness of the association between HER2-low expression and reduced immune activity in TNBC.

In addition to an immune-evasive tumor profile, we observed a suppressed immune landscape in the TME with a lower density of tumor-infiltrating lymphocytes (TILs) and M1 macrophages in HER2-low TNBC based on gene expression differences. van den Ende et al. similarly demonstrated an association between lower density of TILs and HER2-low status after adjustment for clinicopathologic variables (P = 0.04)22; this association is notable given that a higher density of TILs has consistently been reported to correlate with pCR in patients with TNBC24,25,26,27. We and others have shown that HER2-low TNBC tumors appear to have lower rates of pCR to NAC than HER2-zero17,19,28,29. The lower density of TILs and the overall immunosuppressed landscape with depletion of key immune cell types, particularly M1 macrophages and memory T and B-cells, may contribute to the poorer response to chemotherapy observed in HER2-low tumors in these studies. Interestingly, Yam et al.30 demonstrated that patients with HER2-zero tumors who did not achieve a pCR had downregulation in immune response pathway genes, further supporting an association between immune activation and response to therapy.

Since 2021, chemotherapy with the addition of pembrolizumab has become the standard of care for the treatment of patients with stage II and III TNBC in the neoadjuvant setting. The addition of immune checkpoint inhibitor therapy was shown to significantly improve pCR rates31,32 and event-free survival33, and to provide a clinically meaningful improvement in OS34,35. However, the impact of HER2-low expression on response to immunotherapy is currently unknown. Evidence is conflicting regarding HER2-low BC as a distinct biological subtype18,36,37,38,39,40, but our findings suggest that HER2-low tumors may have a tendency toward immune escape, which raises potential concerns about the impact of this phenotype on the efficacy of immunotherapy. While promising, the addition of immunotherapy is not without drawbacks, including immune-related toxicities41, which highlights the need for predictive biomarkers to identify patients who are unlikely to benefit from these therapies.

Of interest, we have shown that HER2-low TNBC tumors have a lower TMB compared to HER2-zero, and TMB has been shown to be a predictive biomarker for response to immunotherapy in patients with TNBC42. Tumors with high TMB appear to benefit from immunotherapy due to the increased number of neoantigens that stimulate a robust anti-tumor immune response43. Pembrolizumab is currently approved for the treatment of patients with solid tumors characterized by high TMB, including BC. However, the lower TMB and the overall immunosuppressive landscape noted in HER2-low TNBC in our study suggest that these characteristics may influence the response of this subtype to immunotherapy, findings which necessitate further research.

The clinical efficacy of the novel antibody-drug conjugate, trastuzumab deruxtecan (T-DXd), in patients with HER2-low BC in the DESTINY Breast04 trial10,44 supports that low expression of HER2 may be a therapeutically actionable target for patients with TNBC. T-DXd is currently FDA-approved for metastatic HER2-low, hormone receptor-positive BC but not for HER2-low TNBC, although T-DXd has shown benefit in this latter group; ESMO guidelines do recommend T-DXd as third-line therapy for TNBC patients with metastatic disease. Despite being clinically classified as HER2-negative, we confirmed increased ERBB2 gene expression and HER2-activated protein phosphorylation in HER2-low TNBC. This observed HER2 signaling activity in HER2-low tumors supports that HER2-low status may serve not only as a biomarker of immune suppression but also as a potential therapeutic target for TNBC tumors. Future research into alternative or synergistic therapies may prove to be beneficial for patients with this purported distinct subtype.

Finally, based on the differentially expressed immune-related genes noted in HER2-low TNBC, we identified a five-gene signature (SPIB, SH3KBP1, HLA-G, BIRC3, and VCAM1) prognostic of survival in patients with TNBC. Lower expression of these genes, as seen in HER2-low TNBC, correlated with worse RFS and OS in both treated and untreated patients. The prognostic relevance of immune-related gene networks, as noted in this gene signature, highlights the clinical relevance of immune suppression in outcomes and suggests that restoring immune activity could be a promising therapeutic strategy for patients with this distinct TNBC subtype. Additionally, future studies are warranted to assess the utility of this gene signature in predicting response to immunotherapy.

Of significance in our study, we validated our findings across multiple independent datasets, including our multi-institutional DUS cohort, the TCGA cohort, the SCAN-B dataset, and the I-SPY2 clinical trial. We confirmed the immune-evasive features of HER2-low tumors, including the hypermethylation of immune-related genes, specifically the hypermethylation and downregulation of HLA gene expression, and the suppression of immune activation pathways. This validation ensures the robustness and reproducibility of our findings and suggests that the immune suppression observed in HER2-low tumors is a biologically consistent phenomenon across populations in patients with TNBC. Our findings support that HER2-low TNBC tumors develop a distinct immune-evasive phenotype through epigenetic silencing of immune regulatory networks; such epigenetic reprogramming may influence tumor behavior and responsiveness of HER2-low TNBC to systemic therapies.

While our study provides valuable insights into HER2-low TNBC, it also has limitations. The retrospective design introduces potential biases, and HER2 IHC scoring (e.g., 0 vs. 1+) may be subject to inter-observer variability, potentially affecting HER2 classification across cohorts. The absence of centralized HER2 reassessment across datasets may have led to misclassification, potentially diminishing the biological distinction between HER2-low and HER2-zero tumors. The validity of retrospective survival analyses is further limited by the lack of treatment annotation in some datasets, which may differ across cohorts. Another limitation is the relatively small sample size in the epigenetic profiling cohorts, including both our cohort and the TCGA dataset, which may reduce the statistical power of the analyses. Nevertheless, the use of centralized pathology assessments in TCGA, the consistency of findings across SCAN-B and I-SPY2, and the inclusion of our well-annotated, multi-institutional DUS cohort strengthen the overall validity of our results.

Our data provide compelling evidence that HER2-low TNBC represents a unique BC subtype with molecular and immune-related features imparting a distinct biology compared to HER2-zero. The clinical impact of HER2-low expression on response to immunotherapy is of particular relevance and is currently under study. Future research is also needed to confirm the causal role of epigenetic and transcriptomic alterations in HER2-low immune evasion and the potential impact on treatment outcomes.

Methods

Ethics approval and consent to participate

This study was approved by the Duke Institutional Review Board (IRB) and included a waiver of informed consent for the retrospective analysis of clinicopathologic data and tissue samples from Duke (Pro00110053). For the UCLA specimens, the study was approved by the UCLA IRB (18-001023), which also included a waiver of informed consent for the retrospective analysis of tissue samples. Samples from the Scripps Clinic Biorepository were obtained from patients who consented to the use of their tissue for research purposes. Additional data for this cross-sectional study were gathered from publicly available datasets, including TCGA, NCBI-Gene Expression Omnibus (GEO), and European Genome-phenome Archive (EGA), and did not require IRB approval or patient informed consent per the Code of Federal Regulations (45 CFR §46.104).

Inclusion and ethics statement

This study was conducted following ethical guidelines and principles for research. All research involving human material or human data was conducted in accordance with the ethical principles outlined in the Declaration of Helsinki. All publicly available datasets used in this study were obtained from established repositories with appropriate ethical approvals. The newly generated data adheres to ethical standards. Our research team is composed of scientists from diverse geographic, institutional, and disciplinary backgrounds, fostering a collaborative and inclusive approach to scientific discovery. We recognize the importance of global representation in cancer research and strive to ensure equitable contributions from all team members.

Patient selection and classification criteria

We retrospectively identified a multi-institutional cohort of patients with treatment-naïve TNBC (DUS; n = 63). Clinical data were curated using pathology reports to confirm patients with invasive TNBC, defined by ER, PR, and HER2 negativity according to ASCO/CAP guidelines6,7. HER2 status was further subcategorized into HER2-zero (IHC score of 0) and HER2-low (IHC 1+ or 2+, ERBB2 gene non-amplified by ISH). Specimens were obtained primarily from core needle biopsies of the breast to ensure that tumors were untreated, given that the majority of patients with TNBC received NAC. Tumor purity was evaluated using the LUMP algorithm described by Aran et al.45. Only samples with tumor purity above 50% were included for epigenetic analyses. After applying these criteria and performing quality control of the DNAm data, 44 patients with primary invasive TNBC were included in the study from the DUS cohort, comprising 27 HER2-low and 17 HER2-zero. Clinicopathologic variables were matched between the two groups (Table 1). Data from 462 female patients adhering to the same classification criteria were included in this study from TCGA (n = 113), SCAN-B (n = 63), and the I-SPY2 Clinical Trial (n = 286). Detailed information about the study cohorts is shown in Supplementary Fig. 9.
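The classification criteria above can be summarized as a small decision rule. The Python sketch below is illustrative only: the function names and the string encoding of IHC scores are hypothetical, and the "HER2-positive" branch (IHC 3+, or IHC 2+ with ERBB2 amplification) is included for completeness even though such tumors were excluded from this TNBC cohort.

```python
def classify_her2(ihc_score, ish_amplified=None):
    """Map an IHC score and ISH result to a HER2 category.

    ihc_score: one of "0", "1+", "2+", "3+" (hypothetical encoding).
    ish_amplified: True/False for ERBB2 amplification by ISH, or None if
    ISH was not performed (only required when IHC is 2+).
    """
    if ihc_score == "0":
        return "HER2-zero"
    if ihc_score == "1+":
        return "HER2-low"
    if ihc_score == "2+":
        if ish_amplified is None:
            raise ValueError("IHC 2+ requires an ISH result")
        return "HER2-positive" if ish_amplified else "HER2-low"
    if ihc_score == "3+":
        return "HER2-positive"
    raise ValueError(f"unrecognized IHC score: {ihc_score!r}")


def eligible_for_methylation(tumor_purity, cutoff=0.50):
    """Keep samples with estimated tumor purity above 50%, as in the text."""
    return tumor_purity > cutoff
```

Under this rule, an IHC 2+ tumor without ERBB2 amplification is HER2-low, and a sample with an estimated purity of exactly 0.50 is excluded from epigenetic analyses because the criterion is strictly "above 50%".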

Table 1 Demographic and clinicopathologic features of the Duke-UCLA-Scripps multi-institutional cohort according to HER2 status

Tissue processing and DNA purification

Representative formalin-fixed paraffin-embedded (FFPE) tissue sections for each block specimen from the DUS cohort were selected. A breast pathologist reviewed tissue slides stained with hematoxylin & eosin and identified and labeled tumor-enriched areas. After deparaffinization, a weak hematoxylin staining was performed on serial 10 µm-thick tissue sections, and tumor tissues were microdissected by needle from consecutive FFPE slides. Genomic DNA was isolated using the Quick-DNA FFPE Miniprep Kit (D3067; Zymo Research, CA). DNA concentration and purity were quantified using a Qubit 3.0 Fluorometer (Q33216; Thermo Fisher Scientific, CA), and the DNA was bisulfite-converted using the EZ DNA Methylation-Direct kit (D5021; Zymo Research, CA) according to the manufacturer’s recommendations.

Genome-wide DNA methylation profiling and data processing

DNAm was analyzed using the Infinium Methylation EPIC v2.0 BeadChip (Illumina, Inc.) following the manufacturer’s protocol. Data normalization and quality control were performed using the R package ChAMP (v.2.36.0)46. Batch effects were tested and corrected using the ComBat function in ChAMP, and probes with a detection p value above 0.01 were excluded. Probes targeting non-CG dinucleotides, single-nucleotide polymorphisms, and chromosome Y were removed from downstream analyses.
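The probe-level exclusions can be illustrated with a short sketch. This is not the ChAMP implementation; it assumes a probe is dropped when any sample exceeds the detection p value cutoff (the text does not state the per-sample rule), and the annotation fields `is_cpg`, `overlaps_snp`, and `chromosome` are hypothetical names.

```python
DETECTION_P_CUTOFF = 0.01


def keep_probe(detection_pvalues, is_cpg, overlaps_snp, chromosome,
               cutoff=DETECTION_P_CUTOFF):
    """Return True if a probe passes all filters described in the text.

    Assumption: one sample with a detection p value above the cutoff is
    enough to exclude the probe.
    """
    if any(p > cutoff for p in detection_pvalues):
        return False
    if not is_cpg or overlaps_snp:
        return False
    return chromosome not in {"chrY", "Y"}


def filter_probes(probes):
    """probes: dict of probe ID -> dict of keyword arguments for keep_probe."""
    return [pid for pid, info in probes.items() if keep_probe(**info)]
```

The design keeps each exclusion as a separate condition so that the number of probes removed by each criterion could be tallied independently if needed.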

Public data access and processing

TCGA datasets, including copy number alterations (Genotyping array; Affymetrix SNP 6.0), SNV (Whole-exome DNA sequencing), DNAm (HumanMethylation450 array; Illumina), transcriptome profiling (RNA sequencing), and protein expression (Reverse Phase Protein Array) from HER2-low (n = 75) and HER2-zero (n = 38) TNBC tumors were compiled and downloaded from the National Cancer Institute Genomic Data Commons portal using the TCGAbiolinks R package (v. 2.28.4)47. Data from the SCAN-B and I-SPY2 cohorts were retrieved from the NCBI-GEO (GSE81538 and GSE194040). Gene expression data from 186 HER2-low TNBC (n = 29, SCAN-B and n = 157, I-SPY2) and 163 HER2-zero TNBC samples (n = 34, SCAN-B and n = 129, I-SPY2) were included.

For survival analysis, gene expression data from 30 additional cohorts comprising 846 female patients with ER−/HER2− BC tumors were obtained from the NCBI-GEO and the European Genome-phenome Archive (EGA) repositories (Supplementary Data Table 3).

Data processing and bioinformatic analyses

The maftools R package (v. 2.16.0) was employed to analyze SNV data. Tumor mutational burden (TMB) was calculated as the number of mutations per Mb sequenced48. Copy number variations were categorized into Gain (copy number > 2), Diploid (copy number = 2), or Loss of Heterozygosity (LOH; copy number = 1). All CpG sites with a differential mean β value > 10% and P value < 0.05 were considered DMS. Expression-methylation Quantitative Trait Loci (emQTL) included in biclusters associated with cell processes were obtained from the study by Ankill et al.20. The odds ratio (OR) with 95% confidence intervals (CI) was calculated using the R package questionR (v. 0.7.8). Chromatin states from mammary epithelial cells were retrieved from the Broad ChromHMM wgEncodeBroadHmm track available in the UCSC Genome Browser, based on the hg19 genome assembly49. These states were then converted to the hg38 assembly using the liftOver tool in the Galaxy platform50. Subsequently, the data were summarized into biologically relevant states: promoters, enhancers, insulators, heterochromatin, and active transcription.
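The three thresholds defined in this paragraph (TMB, copy number category, and DMS) reduce to simple rules, sketched below. This is an illustration rather than the analysis code; the function names are hypothetical, the "Unclassified" label for copy numbers below 1 is an assumption because the text does not assign them a category, and the β difference is taken as an absolute value on the assumption that both hyper- and hypomethylated sites count as DMS.

```python
def tumor_mutational_burden(n_mutations: int, megabases_sequenced: float) -> float:
    """TMB as the number of mutations per megabase sequenced."""
    if megabases_sequenced <= 0:
        raise ValueError("megabases_sequenced must be positive")
    return n_mutations / megabases_sequenced


def copy_number_state(copy_number: int) -> str:
    """Categorize a copy number using the thresholds given in the text."""
    if copy_number > 2:
        return "Gain"
    if copy_number == 2:
        return "Diploid"
    if copy_number == 1:
        return "LOH"
    return "Unclassified"  # assumption: values below 1 are not covered by the text


def is_dms(mean_beta_a: float, mean_beta_b: float, p_value: float,
           delta_cutoff: float = 0.10, p_cutoff: float = 0.05) -> bool:
    """Flag a CpG site whose mean beta difference exceeds 10% with P < 0.05."""
    return abs(mean_beta_a - mean_beta_b) > delta_cutoff and p_value < p_cutoff
```

For instance, 76 mutations over a hypothetical 38 Mb of sequenced territory would give a TMB of 2.0 mutations per Mb.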

RNA sequencing data from TCGA were processed using the DESeq2 R package51. Genes with a base mean of 25 reads or fewer were discarded. DEGs were defined as genes with an absolute log2 fold change (FC) > 0.5 and an adjusted P value < 0.05, corrected for multiple testing using the Benjamini–Hochberg False Discovery Rate method. GSEA and pathway enrichment analyses of DEGs and DMS were conducted using the clusterProfiler R package (v. 4.8.3)52. Gene sets, including the HER2 pathway, were sourced from the MSigDB database using the msigdbr R package (v. 7.5.1). For data representation, raw RNA-seq counts were normalized using DESeq2 size factors to account for differences in library depth, and then log₂-transformed using a pseudocount of 1.
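The DEG criteria and the display transformation can be sketched as follows. This is a plain-Python illustration and not DESeq2 itself: the Benjamini–Hochberg step is the textbook procedure (DESeq2 additionally applies independent filtering, which is not reproduced here), and the function names are hypothetical.

```python
import math


def log2_normalized(count: float, size_factor: float) -> float:
    """Size-factor-normalized count, log2-transformed with a pseudocount of 1."""
    return math.log2(count / size_factor + 1.0)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def is_deg(base_mean: float, log2_fc: float, padj: float,
           min_base_mean: float = 25.0, lfc_cutoff: float = 0.5,
           alpha: float = 0.05) -> bool:
    """Apply the base mean, fold change, and adjusted P value thresholds."""
    return base_mean > min_base_mean and abs(log2_fc) > lfc_cutoff and padj < alpha
```

As a worked case, a raw count of 30 in a library with a size factor of 2 gives log2(30/2 + 1) = log2(16) = 4.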

The xCell algorithm was used to deconvolute the cellular composition of tumor samples using RNA sequencing data53. Cell-type enrichment analyses, including the generation of the Immune and Stroma scores, were performed using the xCell R package (v. 1.1.0). For data representation, the ggplot2 (v. 3.4.4), circlize (v. 0.4.16)54, and UpSet (v. 1.4.0)55 R packages were used. The tidyverse collection of R packages (v. 2.0.0) was used for data management. Additional survival analyses were performed using the Kaplan–Meier plotter tool56 in March 2024, involving a cohort of 846 female TNBC patients. To ensure comparability across datasets and minimize batch effects, the platform applied standardized normalization procedures, including MAS5 and scaling methods, as previously described by Györffy et al.57. Overall survival (OS) and RFS were compared according to the expression levels of genes included in the Gene Chip BC database.

Statistical analyses

Categorical clinicopathologic variables were compared using Fisher’s exact test or the Chi-square test. Age was presented as median and interquartile range (IQR) and as mean and standard deviation (SD), and was compared using the Student’s t-test. The Shapiro–Wilk test was used to assess the normality of continuous data. The Wilcoxon rank-sum test was used to determine differences in TMB, phosphorylated HER2 levels, DMS, TME scores, and cell composition between the HER2-low and HER2-zero TNBC tumors. Additionally, the two-sided Student’s t-test was applied to determine differences in non-phosphorylated HER2 protein levels and RNA data from the SCAN-B and I-SPY2 cohorts. Log-rank tests were used to compare OS and RFS between the low and high gene expression groups. Additionally, Cox Proportional Hazards modeling was conducted to calculate hazard ratios (HR) between groups. For HR calculation, the low expression group served as the reference group. Unless otherwise specified, all computational analyses were performed using R software (v.4.3.1). P values adjusted for multiple comparisons are indicated with “adjusted”; all other P values are not adjusted for multiple comparisons.
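The log-rank comparison between low and high expression groups can be illustrated with the textbook two-group statistic. The sketch below is not the code used by the Kaplan–Meier plotter or by the R analyses; the function name and the input encoding (1 = event observed, 0 = censored) are assumptions made for illustration.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p value).

    events: 1 = event observed, 0 = censored.
    """
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t in each group.
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        # Events occurring exactly at time t in each group.
        d_a = sum(1 for x, e in zip(times_a, events_a) if e and x == t)
        d_b = sum(1 for x, e in zip(times_b, events_b) if e and x == t)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    if variance == 0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value
```

The p value uses the identity that, for one degree of freedom, the chi-square upper tail equals erfc(sqrt(x/2)), which avoids any dependency beyond the standard library.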