Abstract
The efficacy of immune checkpoint inhibitors combined with chemotherapy varies among breast cancer subtypes and is particularly limited in hormone receptor-positive (HR+) breast cancers. Here, we analyze pre-, on-, and post-treatment biopsies from 20 female patients with stage II-III HR+ breast cancer who participated in a clinical trial of neoadjuvant chemo-immunotherapy with nab-paclitaxel and pembrolizumab. Through single-nucleus RNA and ATAC sequencing of these tumor biopsies, we identified gene expression metaprograms (MPs) associated with differential therapy responses. We show that favorable responders exhibit increased activity in pathways related to tumor state transition, T cell effector functions, and pro-inflammatory macrophage states. Unfavorable responders demonstrate increased tumor estrogen signaling and immunosuppressive tumor-immune interactions. In this work, we highlight the interplay between tumor and microenvironmental cells in treatment-naïve and treatment-exposed HR+ breast cancers and reveal that pivotal shifts in tumor cell, macrophage, and T cell states may mediate response to chemo-immunotherapy.
Introduction
Breast cancer, which can be clinically stratified into hormone receptor-positive (HR+), HER2-positive (HER2+), and triple-negative (TNBC) subtypes, is the leading cause of cancer-related death in women1. The use of immune checkpoint inhibitors (ICIs), both alone and in combination with chemotherapy, has thus far yielded mixed results in the treatment of HR+ breast tumors2,3,4. Although anti-Programmed Cell Death Protein 1 (PD-1) therapy combined with chemotherapy is an effective standard therapy for some TNBC patients, the molecular drivers of early-stage ICI response remain unclear, as PD-L1 immunohistochemistry lacks sufficient predictive power5,6. The factors governing the induction and maintenance of antitumor responses in the setting of ICIs, as well as conventional cytotoxic chemotherapies used in HR+ breast cancer management, are likely multifaceted and span both tumor-intrinsic and microenvironmental processes. Moreover, tumor-intrinsic characteristics significantly influence the priming, activation, and recruitment of T cells to the tumor microenvironment, which are critical for an immune response in the context of ICIs. Yet, the specific interactions between tumor and microenvironmental cells that may result in therapeutic response to ICI plus chemotherapy in breast cancer are not well understood.
Multiomic single-nucleus sequencing, which couples single-nucleus transcriptome and chromatin accessibility profiling, offers opportunities to study the microenvironment of tumors and uncover molecular programs underlying response or resistance to therapy in breast cancer7,8. Previous studies have focused on elucidating mechanisms driving therapeutic response by characterizing the clonal expansion of T cells in breast cancer8 and examining the immune cell states in the TNBC subtype7,9,10. However, the intricate tumor-intrinsic properties and tumor-immune interactions that emerge during combination ICI and chemotherapy treatment, particularly in the context of combination anti-PD-1 and taxane chemotherapy in early-stage HR+ breast cancer, remain largely unknown.
In this work, we examine the tumor-intrinsic and -extrinsic microenvironmental dynamics underlying therapeutic response to combination anti-PD-1 + taxane chemotherapy in HR+ breast cancer by performing and analyzing single-nucleus multiome sequencing (snRNA/ATAC-seq) on tumors biopsied before, during, and after neoadjuvant nab-paclitaxel + pembrolizumab treatment from 20 patients with early-stage HR+ breast cancer. Our findings suggest that key shifts in tumor, macrophage, and T cell states may underlie response to chemo-immunotherapy.
Results
Single-nucleus transcriptome and chromatin accessibility profiling
We conducted a pilot study (NCT02999477) of changes in PD-L1 expression that occur during preoperative treatment with nab-paclitaxel + pembrolizumab in HR+ breast cancer11 (Fig. 1A). Eligible patients had treatment-naïve stage II-III HR+ breast cancer. Briefly, patient participants were randomized 1:1 to receive a 2-week window of nab-paclitaxel (Arm 1: Chemo → Combo) or pembrolizumab (Arm 2: ICI → Combo). Patients randomized to Arm 1: Chemo → Combo received nab-paclitaxel 125 mg/m² via intravenous (IV) infusion once a week for two weeks. Patients randomized to Arm 2: ICI → Combo received one dose of 200 mg pembrolizumab via IV infusion. After this window phase, all patients received weekly nab-paclitaxel 125 mg/m² in combination with pembrolizumab 200 mg once every three weeks. Total planned neoadjuvant therapy for all patients was 12 doses of weekly nab-paclitaxel and five doses of every-3-week pembrolizumab. Approximately one-half of patients experienced an incomplete clinical response to the trial regimen and received additional standard of care neoadjuvant chemotherapy with Adriamycin/Cyclophosphamide (AC) for four cycles prior to surgery. Patients were subsequently categorized as “favorable responders” with residual cancer burden (RCB) 0-I, or “unfavorable responders” with RCB II-III, consistent with the associated clinical trial evaluation11.
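The response grouping described above reduces to a simple mapping from RCB class to response label. The following is a minimal sketch of that mapping; the function name `response_group` and the string labels are illustrative assumptions and are not taken from the trial's analysis code.

```python
def response_group(rcb_class: str) -> str:
    """Map an RCB class ("0", "I", "II", "III") to the response group used in the text.

    Hypothetical helper for illustration only: RCB 0-I is treated as a
    favorable response and RCB II-III as an unfavorable response.
    """
    label = rcb_class.strip().upper()
    if label in {"0", "I"}:
        return "favorable"
    if label in {"II", "III"}:
        return "unfavorable"
    raise ValueError(f"unrecognized RCB class: {rcb_class!r}")


if __name__ == "__main__":
    for rcb in ["0", "I", "II", "III"]:
        print(rcb, "->", response_group(rcb))
```

Raising an error on unrecognized labels, rather than returning a default group, avoids silently misclassifying a patient whose RCB was recorded in an unexpected format.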
A The study design (Created in BioRender. Miler-jones, L. (2025) https://BioRender.com/fv15ktr). B Genomic and clinical overview of the hormone receptor-positive (HR+) breast cancer sample. Each column represents a tumor sample. Tumors are ordered by residual cancer burden (RCB) response (RCB 0–I or RCB II-III) and within each subgroup ordered by treatment arm. “Stage” denotes the breast cancer stage. “BluePrint” refers to the molecular subtype assessed by MammaPrint. The presence of bulk RNA-seq and bulk WES data is indicated by a black box for each tumor. The type of 10x Genomics assay used for each tumor is denoted by different colors. An additional four cycles of neoadjuvant Adriamycin/cyclophosphamide (AC) are marked by a black box for each patient. Somatic mutations in genes frequently mutated in breast cancer and common copy number alterations are displayed for each tumor. Sample indicators connect samples from the same patients. C UMAP representation of transcriptional (left) and epigenetic landscape (right) of sequenced cells. Top: UMAPs of broad cell-type annotation. Bottom: Transcriptionally based UMAPs displaying canonical marker gene expression. D Proportion of cellular compartments in each sample. Samples are colored by the RCB response. Source data are provided as a Source Data file.
In both treatment arms, treatment-naïve tumor biopsies were collected within 14 days of treatment initiation (“Baseline”). Where feasible, additional biopsies were collected at week 3, day 1, between window monotherapy and combination therapy (“W3D1”; “On-monotherapy”); at week 7, day 1, during combination therapy (“W7D1”; “On-combination”); after the completion of combination therapy (“Pre-surgery”); and at the time of surgery (“Surgery”). Tumor biopsies collected from both treatment arms were profiled using established protocols for single-nucleus RNA-seq (snRNA-seq) or single-nucleus multiome sequencing (paired snRNA/ATAC-seq) (Methods). In total, we successfully generated single-nucleus transcriptomes, with or without chromatin accessibility profiles, for 40 tumor biopsies from 20 patients (Fig. 1B). Eight tumor biopsies from three patients were profiled using snRNA-seq and 32 tumor biopsies from 17 patients were profiled using multiome sequencing (Supplementary Data 1A). Additionally, 31 of 40 tumor biopsies were also profiled with both bulk RNA-seq and whole-exome sequencing (WES).
From the snRNA-seq data, we detected 249,379 total nuclei, with an average of 1583 unique genes expressed per nucleus (Supplementary Fig. 1A and Supplementary Data 1; Methods). Clustering and annotation analysis (Methods) identified cell types assigned to breast epithelial cells, immune cells, and stromal cells (Fig. 1C). None of these three major cell classes were unique to any specific patient, library construction method, treatment, or treatment timepoint (Fig. 1D and Supplementary Fig. 1B). Regarding the snATAC-seq data, we performed peak calling12 for each annotated cell type per individual patient tumor biopsy and quantified the number of fragments on the consensus peak regions across all patient tumor biopsy samples in the cohort. After standard quality control (Methods), we captured 22,125 nuclei with high-quality data for both their snRNA-seq and snATAC-seq profiles (Fig. 1C).
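The idea of quantifying fragments on consensus peak regions can be illustrated with a small sketch. It assumes that consensus regions are formed by taking the union of per-sample peaks and merging any that overlap, and that a fragment is counted when it overlaps a region; the actual consensus procedure and peak caller settings are those described in the Methods, and the function names below are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def merge_peaks(peaks: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching peaks into consensus regions (assumed rule)."""
    merged: List[Interval] = []
    for start, end in sorted(peaks):
        if merged and start <= merged[-1][1]:
            # Extend the previous region when the new peak overlaps or touches it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def count_fragments(consensus: List[Interval], fragments: List[Interval]) -> List[int]:
    """Count fragments that overlap each consensus region by at least one base."""
    counts = [0] * len(consensus)
    for f_start, f_end in fragments:
        for i, (p_start, p_end) in enumerate(consensus):
            if f_start < p_end and p_start < f_end:
                counts[i] += 1
    return counts


if __name__ == "__main__":
    # Toy peaks pooled from several per-sample, per-cell-type peak calls.
    pooled = [(100, 200), (150, 300), (290, 320), (500, 600)]
    consensus = merge_peaks(pooled)
    print(consensus)
    print(count_fragments(consensus, [(310, 350), (320, 400), (590, 700)]))
```

The nested loop is quadratic and is kept only for readability; a genome-scale implementation would use sorted sweeps or an interval tree.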
Tumor programs linked to combination therapy response
To dissect the tumor-intrinsic molecular programs that contribute to differential therapeutic response to combination therapy in HR+ breast cancer, we first identified malignant cells using single-cell variational aneuploidy analysis with SCEVAN13 (Supplementary Fig. 2A–C; Methods). In order to identify common molecular programs across malignant cells from multiple patients derived at multiple sampling timepoints, we next employed a non-negative matrix factorization (NMF)-based intratumor heterogeneity detection method14,15 to identify gene expression metaprograms (MPs). Briefly, for each MP, a set of genes most shared among the clustered programs is defined as the MP gene signature, and among the malignant cells across our patient tumor biopsies, we identified 11 MPs (Methods). The genes comprising two MPs, MP1 (mitochondrial and ribosomal genes) and MP10 (genes having high correlation with the number of detected unique molecular identifiers (UMIs); Pearson correlation coefficient = 0.75), were suggestive of suboptimal data quality or sequencing library size, respectively, and were thereby excluded from downstream analyses.
The remaining MPs were annotated based on functional pathway enrichment analyses of their gene signatures (Fig. 2A, B; Methods). Four MPs resembled those identified in a previous multi-cancer intratumor heterogeneity study15, including “cell cycle” (MP2), epithelial to mesenchymal transition (EMT) I or “EMT-I” (MP3), “Interferon-MHC II” (MP4), and “Stress” (MP5) (Supplementary Fig. 2D, E). Most MPs could be further categorized based on shared biological processes. For example, in addition to the widely shared MP3 (EMT-I) among patient tumors, the less commonly shared MP6 (EMT-II) and MP11 (EMT-III) were not only enriched for the established EMT MSigDB Hallmark pathway but also had unique enrichments of apoptosis and protein secretion pathways, respectively (Supplementary Data 2). This finding suggests that, in addition to a broadly shared general EMT state among most tumors in this cohort, subsets of tumors exhibited variations of EMT-like programs that reflect distinct biological processes16,17 of the EMT process in these tumors. We also identified two estrogen receptor (ER)-related MPs, MP7 (ER-I) and MP8 (ER-II), along with MP9, which is associated with a previously identified prognostic feature (Apelin, a peptide hormone)18 in breast cancer (Supplementary Fig. 2F–H).
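Two ingredients of the metaprogram construction described above can be sketched briefly: the similarity between NMF programs based on their top genes, and the definition of an MP signature as the genes most shared among clustered programs. The sketch below assumes that similarity is a Jaccard index on gene sets and that "most shared" means ranking genes by how many programs in a cluster contain them; gene names are placeholders, and the exact thresholds and clustering procedure are those given in the Methods.

```python
from collections import Counter
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity index between two gene sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def metaprogram_signature(programs: List[Set[str]], top_n: int) -> List[str]:
    """Rank genes by the number of clustered programs containing them.

    Ties are broken alphabetically so the result is deterministic
    (an assumption made for this sketch).
    """
    counts = Counter(gene for program in programs for gene in program)
    ranked = sorted(counts, key=lambda gene: (-counts[gene], gene))
    return ranked[:top_n]


if __name__ == "__main__":
    # Placeholder programs; real programs would hold the top genes of each NMF factor.
    p1 = {"G1", "G2", "G3"}
    p2 = {"G1", "G2", "G4"}
    p3 = {"G1", "G3", "G5"}
    print(jaccard(p1, p2))
    print(metaprogram_signature([p1, p2, p3], top_n=2))
```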
A Heatmap displaying pairwise Jaccard similarity indices among robust non-negative matrix factorization (rNMF) programs based on their top 50 genes. Programs are clustered into nine metaprograms (MPs); MPs, timepoint and patient information are labeled at the top. The central scatterplot illustrates Pearson correlation of rNMFs with cellular complexity. B Heatmap depicting gene membership within MPs, with rows as top representative genes, columns as MPs, and functional annotations on the right. C Distribution of ER-I-related MP7 and EMT-III-related MP11 signature level in tumor cells from pretreatment biopsies. P values from two-sided Wald tests on linear mixed model coefficients (patient as random effect; Methods), without multiple-comparison adjustment. Data presented as median with nested quantile ranges and boxes narrowing toward distribution tails to show extreme values. D Baseline comparison of the relative abundance of ER-I-related MP7 and EMT-III-related MP11 states in the tumor population between favorable responders (R; N(pt) = 3) and unfavorable responders (NR; N(pt) = 9), with p values from two-sided Mann–Whitney–Wilcoxon test. Data presented as median with interquartile range (first and third quartiles). E Dotplot of scaled transcription factor expression (color) and target region enrichment scores (dot size). Each row shows a gene regulatory network (GRN), named after the transcription factor and its target region; columns represent four tumor cell states. F PROGENy-inferred activity in ER-I (MP7) and EMT-III (MP11) tumor cell states. Barplots (left) show pathway activity scores relative to other tumor states. X-axis (right) shows gene weights in the p53 (top) and TGFβ (bottom) pathways; Y-axis shows t-values from differential expression comparing MP7 or MP11 to other states. G ER-I (MP7) and EMT-III (MP11) abundance in tumors with or without TP53 mutations from pretreatment biopsies. P values from two-sided Mann–Whitney–Wilcoxon test. 
Data presented as median with interquartile range (first and third quartiles). H ER-I-related MP7 and EMT-III-related MP11 gene signatures relative to the overall tumor signature in bulk RNA-Seq data from TCGA primary HR+ breast cancer. Patients grouped by TP53 mutation status, with significance tested by two-sided Mann–Whitney–Wilcoxon test. Data presented as median with interquartile range (first and third quartiles). Source data are provided as a Source Data file.
Malignant cells were then categorized based on their relative signature scores of these MPs (Methods; Supplementary Data 1B). Among baseline tumors, favorable responders (RCB = 0–I) harbored a higher enrichment of MP11 (EMT-III) compared with unfavorable responders (RCB = II-III) (Fig. 2C and Supplementary Fig. 2I). In contrast, favorable responders exhibited a lower MP7 (ER-I) enrichment than unfavorable responders (Fig. 2C and Supplementary Fig. 2I). When comparing the cellular abundance between the two clinical groups, we observed a higher proportion of EMT-III-related MP11 and a lower proportion of ER-I-related MP7 among malignant cells in favorable responders than in unfavorable responders (Fig. 2D). Longitudinal analysis (Supplementary Fig. 2J–L) revealed that EMT-III-related MP11 levels significantly decreased during combination therapy, independent of response (Supplementary Fig. 2L). However, ER-I-related MP7 exhibited opposing trends between the two clinical groups when comparing on-combination to on-monotherapy tumors: combination therapy elevated ER-I-related MP7 in unfavorable responders while reducing it in favorable responders (Supplementary Fig. 2L).
To validate this finding, we projected these two MP gene signatures onto bulk RNA-seq data from (i) 25 baseline pretreatment biopsies from this same trial11 and (ii) 13 pretreatment breast biopsies from a cohort of patients with ER+ metastatic breast cancer treated with eribulin and pembrolizumab on a clinical trial19. In the corresponding bulk RNA-seq data from this trial, early-stage HR+ breast cancer patients with favorable responses exhibited higher EMT-III-related MP11 and lower ER-I-related MP7 gene signature scores than patients with unfavorable responses (Supplementary Fig. 2M). Similarly, metastatic ER+ breast cancer patients experiencing partial response to eribulin and pembrolizumab harbored higher EMT-III-related MP11 and lower ER-I-related MP7 gene signature scores than patients with progressive disease (PD; Supplementary Fig. 2L). Additionally, of the 12 baseline tumor samples in our study, nine underwent BluePrint subtyping (an 80-gene molecular assay classifying early-stage breast cancer into Basal, Luminal, and HER2 subtypes). Among these, the basal subtype exhibited a higher proportion of EMT-III-related MP11 (Supplementary Fig. 2O). Taken together, these findings suggest that EMT-III-related MP11 and ER-I-related MP7 may represent tumor-intrinsic ER+ breast cancer gene signatures associated with response and resistance to preoperative ICI plus chemotherapy, respectively.
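The signature-based steps described above (scoring a profile for each MP, assigning a cell to its highest-scoring state, and summarizing the abundance of a state) can be sketched as follows. This is a simplified illustration that assumes a score equal to the mean expression of signature genes minus the mean over all measured genes; the scoring scheme actually used is defined in the Methods, and the gene names are placeholders rather than members of the MP7 or MP11 signatures.

```python
from statistics import mean
from typing import Dict, List

Expression = Dict[str, float]  # gene -> normalized expression for one cell or bulk sample


def signature_score(expr: Expression, signature: List[str]) -> float:
    """Mean expression of signature genes minus the mean over all measured genes."""
    present = [expr[gene] for gene in signature if gene in expr]
    if not present:
        raise ValueError("no signature genes were measured in this profile")
    return mean(present) - mean(list(expr.values()))


def assign_state(expr: Expression, signatures: Dict[str, List[str]]) -> str:
    """Label a profile with the metaprogram that has the highest relative score."""
    return max(signatures, key=lambda name: signature_score(expr, signatures[name]))


def state_fraction(cells: List[Expression], signatures: Dict[str, List[str]], state: str) -> float:
    """Fraction of cells assigned to a given state."""
    labels = [assign_state(cell, signatures) for cell in cells]
    return labels.count(state) / len(labels)


if __name__ == "__main__":
    signatures = {"MP7": ["A", "B"], "MP11": ["C", "D"]}  # placeholder genes
    cells = [
        {"A": 2.0, "B": 2.0, "C": 0.0, "D": 0.0},
        {"A": 0.0, "B": 1.0, "C": 3.0, "D": 2.0},
        {"A": 0.0, "B": 0.0, "C": 1.0, "D": 1.0},
    ]
    print([assign_state(cell, signatures) for cell in cells])
    print(state_fraction(cells, signatures, "MP11"))
```

The same `signature_score` function applies unchanged to a bulk RNA-seq sample, which is the sense in which a single-nucleus signature can be projected onto bulk data.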
To infer epigenetic gene regulation specific to malignant cells harboring a high relative gene signature score for MP11 and MP7, we next evaluated enhancer-driven gene regulatory networks (GRNs) from malignant cells with paired single-nucleus transcriptome and chromatin accessibility data via SCENIC+20. TCF7L1/2 transcription factor (TF) regulons, which are a key component of Wnt-β-catenin signaling21,22 and regulate the EMT process in epithelial cells23,24,25, were specific to EMT-III-related MP11-high malignant cells (Fig. 2E and Supplementary Fig. 2R–Q). Regulons of GATA3 and ESR1, two essential TFs that regulate maintenance of hormone sensitivity in breast cancer tumor cells26,27, were enriched in ER-I-related MP7-high malignant cells (Fig. 2E and Supplementary Fig. 2R, S). These results indicated a phenotypic consistency between the molecular gene expression programs and their predicted epigenetic GRNs.
Lastly, we applied PROGENy28 to malignant cell gene expression profiles to estimate their pathway-responsive gene activities. Transforming growth factor-β (TGF-β) signaling, a key driver of the EMT process, was highly activated in EMT-III-related MP11-high malignant cells (Fig. 2F), whereas p53 signaling was the most activated pathway in ER-I-related MP7-high malignant cells. To further investigate the relationship between p53 signaling and the ER-I-related MP7 signature, we analyzed the cellular abundance of various tumor cell states in patients with and without somatic TP53 mutations. Patients with somatic TP53 mutations tended to have a higher abundance of EMT-III-related MP11-high malignant cells and lower abundance of ER-I-related MP7-high malignant cells (Fig. 2G), which was further confirmed via companion analysis of these MP signature scores in early-stage ER+ primary breast cancer patients from The Cancer Genome Atlas (TCGA) stratified by TP53 mutation status29 (Fig. 2H). Taken together, these observations link tumor-intrinsic transcriptional metaprograms to distinct biological and clinical states in this trial setting, suggesting that the p53 and TGF-β pathways play divergent roles in HR+ breast cancer tumor cells to potentially influence response to preoperative ICI plus chemotherapy.
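Footprint-based pathway scoring of the kind PROGENy performs is commonly described as a weighted sum of the expression of pathway-responsive genes. The sketch below illustrates that idea under stated assumptions: the gene names and weights are placeholders, not PROGENy's model coefficients, and the function names are hypothetical. It also omits the scaling and permutation steps that the published tool may apply.

```python
from statistics import mean
from typing import Dict, List

Expression = Dict[str, float]  # gene -> normalized expression for one cell


def pathway_activity(expr: Expression, weights: Dict[str, float]) -> float:
    """Weighted sum of expression over pathway-responsive genes.

    Genes missing from the profile contribute zero (an assumption of this sketch).
    """
    return sum(weight * expr.get(gene, 0.0) for gene, weight in weights.items())


def activity_difference(group_a: List[Expression], group_b: List[Expression],
                        weights: Dict[str, float]) -> float:
    """Difference in mean pathway activity between two groups of cells."""
    mean_a = mean([pathway_activity(cell, weights) for cell in group_a])
    mean_b = mean([pathway_activity(cell, weights) for cell in group_b])
    return mean_a - mean_b


if __name__ == "__main__":
    weights = {"G1": 2.0, "G2": -1.0}  # placeholder responsive genes and weights
    state_cells = [{"G1": 1.5, "G2": 1.0}, {"G1": 2.5, "G2": 1.0}]
    other_cells = [{"G1": 0.5, "G2": 1.0}, {"G1": 0.5}]
    print(activity_difference(state_cells, other_cells, weights))
```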
CD8+ T cell states responsive to combination therapy
To complement the investigation of malignant cell molecular programs associated with differential therapeutic response to neoadjuvant nab-paclitaxel + pembrolizumab combination therapy, we next evaluated the contribution of CD8+ T cells to this therapeutic response within our patient cohort. We identified a total of 18,772 T cells, of which 17,230 were CD8+ T cells (Supplementary Fig. 3A). Using the same methods as those applied to tumor cells, we identified that CD8+ T cell subpopulations were characterized by three MPs (Fig. 3A, Supplementary Fig. 3B, C, and Supplementary Data 1C) that recapitulate known transcriptional profiles of previously defined CD8+ T cell states30: naive, cytotoxic, and exhausted cell states (Supplementary Fig. 3D, E). For example, increased expression of exhaustion-associated markers such as PDCD1 (PD-1), HAVCR2 (TIM-3), and ENTPD1 (CD39) was detected in a subset of exhausted CD8+ T cells, while CCR7 (CCR7) and IL7R (IL-7R) were upregulated in naive subsets, and PRF1 (Perforin 1) and GNLY (Granulysin) were elevated in cytotoxic subsets (Supplementary Fig. 3D). We also evaluated cellular signaling activities stimulated by extracellular cytokines across our different CD8+ T cell populations with CytoSig31. Through this analysis, we observed that CD8+ T cells exhibiting exhaustion also had significantly elevated interleukin-15 (IL-15) cytokine response scores compared to the other two CD8+ T cell populations (Fig. 3B, C). IL-15 is produced by various cell types and is traditionally recognized for its role in promoting cytotoxic effector functions32; however, this finding implies that the enrichment of exhausted CD8+ T cells may prevent antitumor response even in the setting of IL-15 exposure by additional cells in the microenvironment.
A Heatmap illustrating average gene expression from three metaprograms (MPs) among three distinct CD8+ T cell states. Rows represent gene names. Top: cell state, patient, and timepoint information. B Differential cytokine activity between cytotoxic and exhausted CD8+ T cell states, analyzed using the two-sided Mann–Whitney–Wilcoxon test. The Y-axis displays negative log10 p values; the X-axis shows t-values. C Boxplot depicting distribution of predictive IL-15 cytokine activity across three CD8+ T cell states. Cytotoxic, naive, and exhausted CD8+ T cell states were detected in 38, 38, and 35 samples, respectively. Each point represents a sample, with p values calculated using the two-sided Mann–Whitney–Wilcoxon test. Data presented as median with interquartile range (first and third quartiles). D Violin plots illustrating shift in distribution of three MP gene signatures from baseline to monotherapy in favorable and unfavorable responders with biopsies at both timepoints. Top: CD8+ T cells during chemotherapy vs. pretreatment CD8+ T cells. Bottom: CD8+ T cells during Pembrolizumab treatment vs. pretreatment CD8+ T cells. P values were calculated via two-sided Wald tests on linear mixed model coefficients (Methods: Differential gene signature analysis), with patient ID modeled as a random effect. No adjustment made for multiple comparison. E Violin plots displaying shift in distribution of three MP gene signatures from monotherapy to combination therapy in favorable and unfavorable responders with biopsies at both timepoints. P values calculated via two-sided Wald tests on linear mixed model coefficients (Methods: Differential gene signature analysis), with patient ID modeled as a random effect. No adjustment made for multiple comparison. F Violin plots showing shift in distribution of three MP gene signatures from combination therapy to post-combination therapy (with AC treatment) in favorable and unfavorable responders with biopsies at both timepoints.
P values calculated via two-sided Wald tests on linear mixed model coefficients (Methods: Differential gene signature analysis), with patient ID modeled as a random effect. No adjustment made for multiple comparison. RCB residual cancer burden. Source data are provided as a Source Data file.
Given these findings, we proceeded to compare the three CD8+ T cell population states through trial therapy (Fig. 3D, E and Supplementary Fig. 3F–H). Exhaustion gene scores initially decreased between on-monotherapy and on-combination therapy time points (Fig. 3E and Supplementary Fig. 3G), and cytotoxic gene scores subsequently increased between on-combination therapy and pre-surgery time points (Fig. 3F and Supplementary Fig. 3G, H), seemingly consistent with the anticipated effects of ICI in augmenting CD8+ T cell effector states. While this pattern was seen in both favorable and unfavorable responders, there also were notable differences between the response groups in both baseline scores and dynamics. At baseline, tumor biopsies from favorable responders exhibited numerically higher naive, cytotoxic effector, and exhausted gene signature scores than patients with an unfavorable response (Supplementary Fig. 3F). Over the course of treatment, only unfavorable responders demonstrated a late enrichment of the exhausted gene signature in the pre-surgery samples (Fig. 3F and Supplementary Fig. 3G, H), suggesting a less sustained therapeutic benefit in this group.
Macrophage state dynamics during ICI plus chemotherapy
In addition to T cells, macrophages play an important role in mediating immune surveillance and have been implicated in contributing to selective ICI response in other solid tumors33,34. We thus complemented our microenvironmental analysis by investigating macrophages in the microenvironment of tumors from this trial cohort. Within the myeloid lineage, we identified two subclusters corresponding to distinct cell populations defined by the hierarchical clustering significance method sc-SHC35. Among all patient tumor biopsies, one of the myeloid subclusters displayed high expression of canonical macrophage markers (CD68 and CD163)36 and was thus annotated as the macrophage sub-compartment (Supplementary Fig. 4A and Supplementary Data 1D). Utilizing the same approaches applied to other cell compartments, we identified five macrophage subpopulations defined by six biologically relevant MPs. While five of the MPs could be assigned to a specific macrophage subpopulation, the presentation-related MP8 was broadly expressed among multiple macrophage subpopulations, including those characterized by MP4 (lipid), MP6 (adhesion), and MP7 (interferon) (Fig. 4A and Supplementary Fig. 4B, C; Methods). Macrophages with a high MP gene signature score for either MP1 (endocytosis) or MP4 (lipid processing) displayed high concomitant expression of canonical anti-inflammatory markers (MERTK, MRC1) (Supplementary Fig. 4D). Conversely, macrophages with a high relative MP7 (interferon) gene signature score possessed high expression of pro-inflammatory markers (IL1B and CXCL10).
A Heatmap displaying average gene expression from six metaprograms (MPs) among five macrophage cell states. Rows represent gene names. Top: cell state, patient, and timepoint information. B Pairwise Pearson correlation among the six MP gene signatures. P values calculated with two-sided Student’s t-distribution. C Boxplot illustrating distribution of six MP gene signatures in macrophages during monotherapy, comparing 2802 macrophage cells from two favorable responders and 1990 macrophage cells from seven unfavorable responders. P values were calculated via Wald tests on linear mixed model coefficients (Methods: Differential gene signature analysis), with patient ID modeled as a random effect. No adjustment made for multiple comparison. Data presented as median with multiple nested quantile ranges (50–75%, 75–87.5%, 87.5–93.75%, 93.75–96.875%, and 96.875–100%), with boxes narrowing toward distribution tails to show extreme values. D Violin plots showing shift in distribution of six MP gene signatures from baseline to monotherapy in favorable and unfavorable responders with biopsies at both timepoints. Top: Macrophages during chemotherapy vs. pretreatment macrophages. Bottom: Macrophages during Pembrolizumab treatment vs. pretreatment macrophages. P values were calculated via two-sided Wald tests on linear mixed model coefficients (Methods: Differential gene signature analysis), with patient ID modeled as a random effect. No adjustment made for multiple comparison. E Violin plots depicting shift in distribution of six MP gene signatures from monotherapy to combination therapy in favorable and unfavorable responders with biopsies at both timepoints. P values calculated via two-sided Wald tests on linear mixed model coefficients (Methods: Differential gene signature analysis), with patient ID modeled as a random effect. No adjustment made for multiple comparison. 
F Violin plots illustrating shift in distribution of six MP gene signatures from combination therapy to post-combination therapy (with AC treatment) in favorable and unfavorable responders with biopsies at both timepoints. P values calculated via two-sided Wald tests on linear mixed model coefficients (Methods: Differential gene signature analysis), with patient ID modeled as a random effect. No adjustment made for multiple comparison. RCB residual cancer burden. Source data are provided as a Source Data file.
We next evaluated whether these macrophage populations mapped to previously reported macrophage states37,38,39. Interestingly, the endocytosis-related MP1 gene signature score was negatively correlated with other MP gene signature scores among macrophage cells (Fig. 4B), whereas interferon-related MP7 was positively correlated with other MP gene signature scores except endocytosis-related MP1. As previous studies have reported that M2-like macrophages are endocytotic40, we investigated the association of these two MPs with the M1/M2 axis. The endocytosis-related MP1 gene signature score exhibited a significant positive correlation with literature-derived gene signatures37,38,39 representing the canonical M2-like macrophage phenotype, while the interferon-related MP7 gene signature score had a significant positive correlation with literature-derived gene signatures representing the canonical M1-like macrophage phenotype (Supplementary Fig. 4E). Companion analysis of enhancer-driven gene regulatory networks (Supplementary Fig. 4F) identified that the TCF12-DAB2 regulon41,42,43 is specifically enriched in the endocytosis-related MP1 cell state (Supplementary Fig. 4G), and previous studies have shown that DAB2 regulates macrophage polarization44. Growing evidence suggests that macrophage differentiation exists along a spectrum of phenotypic states, moving beyond the rigid classification of purely antitumor M1 or pro-tumor M2 subtypes45. Our results suggest that endocytosis and interferon-related pathways may in part characterize two functional branches of dynamic transcriptional programming in macrophages in this specific context.
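The pairwise correlations between MP signature scores discussed above rest on the Pearson coefficient and its associated t-statistic, t = r·sqrt((n − 2)/(1 − r²)), which is referred to a Student's t-distribution with n − 2 degrees of freedom. A minimal standard-library sketch follows; the score vectors are toy values, and converting the statistic to a p value would additionally require a t-distribution function from a statistics library, which is left out here.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long score vectors."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two vectors of equal length with at least 3 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / sqrt(sxx * syy)


def correlation_t_statistic(r: float, n: int) -> float:
    """t-statistic for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        raise ValueError("t-statistic is unbounded for a perfect correlation")
    return r * sqrt((n - 2) / (1.0 - r * r))


if __name__ == "__main__":
    # Toy per-cell scores for two metaprograms (not study data).
    mp_a_scores = [1.0, 2.0, 3.0, 4.0]
    mp_b_scores = [4.0, 3.0, 2.0, 0.0]
    r = pearson_r(mp_a_scores, mp_b_scores)
    print(r, correlation_t_statistic(r, len(mp_a_scores)))
```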
Although we did not observe any statistically significant association between macrophage MP signature scores and therapy response in pretreatment biopsies (Supplementary Fig. 4H), we found that the interferon-related MP7 gene signature was enriched in on-treatment tumors from favorable responders after monotherapy (Fig. 4C). Moreover, in tumor biopsies from unfavorable responders collected after exposure to either monotherapy, we observed a decrease in endocytosis-related MP1 gene signature score compared to pretreatment baseline tumor biopsies (Fig. 4D). Similarly, we detected an increase in the gene signature score of two other MPs, secretion-related MP5 and adhesion-related MP6, both of which were anti-correlated with endocytosis-related MP1 gene signature scores among macrophages (Fig. 4D). In contrast, on-treatment tumor biopsies from favorable responders showed an increase in endocytosis-related MP1, interferon-related MP7, and presentation-related MP8 after chemotherapy. Next, when combination therapy was added after either monotherapy window, the gene signature scores for these four MPs (MP5, MP6, MP7, and MP8) across macrophages were relatively diminished, but the endocytosis-related MP1 gene signature score was elevated in both favorable and unfavorable responders (Fig. 4E). However, when comparing the on-combination tumor biopsies to those after combination therapy, we observed an elevation in adhesion-related MP6 and antigen presentation-related MP8 gene signature scores across macrophages, whereas the endocytosis-related MP1 gene signature decreased in favorable responders but increased in unfavorable responders (Fig. 4F and Supplementary Fig. 4I, J). Collectively, these findings suggest that both anti-PD-1 therapy and chemotherapy, when administered as monotherapy in HR+ breast cancer, may enhance a pro-inflammatory response in microenvironmental macrophages. Furthermore, exposure to both treatments in a combination therapy setting appears to induce polarization toward metaprograms associated with increased endocytosis activity only in unfavorable responders, with a decrease in this feature observed in favorable responders during the course of combination therapy.
Tumor-immune crosstalk linked to combination therapy response
Given the contributions of microenvironmental cells to therapeutic response, along with prior studies demonstrating synergistic interactions between malignant and immune cells that influence tumor progression46,47, we next conducted a focused investigation into the predicted tumor-immune interactions that may be operative among HR+ breast cancer patients as stratified by therapy response via MultiNicheNetR48 (Supplementary Data 3A). We first predicted interactions in pretreatment baseline biopsies (Fig. 5A–D). From this analysis, we observed predicted TIGIT receptor signaling from CD8+ T cells with Nectin family ligands on malignant cells in pretreatment tumor biopsies from both favorable and unfavorable responders (Fig. 5A). However, inferred CD8+ T cell PD-1 signaling activity in pretreatment baseline tumor biopsies was higher in favorable responders than in unfavorable responders (Supplementary Fig. 5A, B), and predicted CD86/CTLA-4 interaction was numerically higher in favorable responders (Fig. 5B). By contrast, predicted tumor-expressed WBP1 and FAM200A ligand signaling49 to the CTLA-4 immune checkpoint receptor on CD8+ T cells was only enriched in pretreatment baseline tumor biopsies from unfavorable responders (Fig. 5A, B). Additionally, in assessing interactions between malignant cells and macrophages, we found that the malignant cell-expressed ligand signaling to the ITGAV and NRP1/2 receptors on macrophages was more prevalent in pretreatment tumor biopsies from favorable responders (Fig. 5C, D). Conversely, in pretreatment tumor samples from unfavorable responders to the combination therapy, we observed enrichment of CLDN1 and CXCR4 signaling interactions between malignant cells and macrophages (Fig. 5C, D).
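To make the notion of a ligand-receptor interaction metric concrete, the sketch below computes three simple quantities for one ligand-receptor pair: the fraction of sender cells expressing the ligand, the fraction of receiver cells expressing the receptor, and their product as a crude joint measure. This is an illustrative assumption only. It treats ligand and receptor expression as independent, uses a hypothetical detection threshold, and does not reproduce the MultiNicheNetR prioritization score, which combines several criteria.

```python
from typing import Dict, Sequence


def fraction_expressing(values: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of cells whose expression exceeds a detection threshold."""
    if not values:
        raise ValueError("no cells provided")
    return sum(1 for v in values if v > threshold) / len(values)


def lr_metrics(ligand_in_sender: Sequence[float],
               receptor_in_receiver: Sequence[float]) -> Dict[str, float]:
    """Ligand, receptor, and joint expression fractions for one L-R pair."""
    p_ligand = fraction_expressing(ligand_in_sender)
    p_receptor = fraction_expressing(receptor_in_receiver)
    # Joint value assumes independence between sender and receiver populations.
    return {"ligand": p_ligand, "receptor": p_receptor, "both": p_ligand * p_receptor}


if __name__ == "__main__":
    # Toy expression values: ligand in tumor cells, receptor in CD8+ T cells.
    tumor_ligand = [0.0, 1.0, 2.0, 0.0]
    tcell_receptor = [1.0, 1.0, 0.0, 1.0, 1.0]
    print(lr_metrics(tumor_ligand, tcell_receptor))
```

Computing these fractions per sample, rather than pooling cells across patients, is what allows a between-group comparison in which each point represents one patient biopsy.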
A Rank plots displaying baseline ligand-receptor (L–R) interactions between tumors and CD8 + T cells, sorted by their MultiNicheNetR prioritization score. Top: Predominant L–R interactions in favorable responders. Bottom: Predominant L–R interactions in unfavorable responders. Annotations highlight L–R interactions involving immune checkpoint receptors on CD8 + T cells that rank within the top 50 for each group. (Top illustration was created in BioRender. Miler-jones, L. (2025) https://BioRender.com/e6aheln). B Boxplots illustrating the distribution of three L–R interaction metrics for tumors expressing ligands that interact with the CTLA-4 receptor on CD8 + T cells, comparing favorable and unfavorable responders. Left: Probability of producing both ligand and receptor; Middle: Probability of producing the ligand; Right: Probability of producing the receptor. Data are presented as median with interquartile range (first and third quartiles). C Rank plots displaying baseline ligand-receptor (L–R) interactions between tumors and macrophages, sorted by their MultiNicheNetR prioritization score. Top: Predominant L–R interactions in favorable responders. Bottom: Predominant L–R interactions in unfavorable responders. Annotations highlight the top five L–R interactions for each group. (Top illustration was created in BioRender. Miler-jones, L. (2025) https://BioRender.com/e6aheln). D Boxplots illustrating the distribution of three L–R interaction metrics for tumors expressing ligands that interact with receptors on macrophages, comparing favorable and unfavorable responders. Left: Probability of producing both ligand and receptor; Middle: Probability of producing the ligand; Right: Probability of producing the receptor. Data are presented as median with interquartile range (first and third quartiles). E Heatmap illustrating the changes in tumor-CD8 + T cell L–R interactions from pretreatment (BL) to on-combination therapy (W7D1). The top five altered L–R interactions for each group are displayed. F Heatmap depicting the changes in tumor-macrophage L–R interactions from pretreatment (BL) to on-combination therapy (W7D1). The top five altered L–R interactions for each group are shown. RCB residual cancer burden. Source data are provided as a Source Data file.
Next, we sought to further dissect the impact of combination therapy on tumor-immune interactions in HR+ breast cancer by comparing predicted signaling interactions between pretreatment baseline (BL) and on-combination therapy (W7D1) tumor biopsies, further partitioned into favorable and unfavorable responders (Supplementary Data 3B). Compared to pretreatment baseline tumor biopsies, on-combination therapy tumor biopsies exhibited augmented malignant cell-CD8 + T cell interactions critical for tumor recognition (HLA-B-CD8A) and CD8 + T cell cytotoxic effector function (ZG16B-CXCR4; ref. 50) only in favorable responders (Fig. 5E). Malignant cells from on-combination biopsies of favorable responders were predicted to interact with pro-inflammatory macrophages (Fig. 5F), and these macrophages also demonstrated increased expression of canonical pro-inflammatory genes (Supplementary Fig. 5C) and enrichment of the interferon-related MP7 signature (Supplementary Fig. 5D). Conversely, predicted interactions between SFTPD and LAIR1, a receptor known to inhibit T cell activation51, from malignant cells to CD8 + T cells were more prevalent in on-combination therapy tumor biopsies from unfavorable responders (Fig. 5E), potentially suggesting suppression of CD8 + T cell activation by tumors. Moreover, VEGFA-NRP1/2 signaling from malignant cells to macrophages was enriched in on-combination therapy biopsies from unfavorable responders (Fig. 5F). This aligns with prior evidence showing that macrophages expressing Neuropilin-1 (NRP1) or NRP2 drive tumor progression52,53,54. In summary, these observations underscore the dynamic state reprogramming of malignant and immune cells, along with predicted interactions among them, during neoadjuvant nab-paclitaxel + pembrolizumab combination therapy in early-stage HR+ breast cancer.
Discussion
In this study, we conducted a comprehensive analysis of single-nucleus transcriptomic and chromatin accessibility landscapes in early-stage HR+ breast cancer patients on a clinical trial examining neoadjuvant nab-paclitaxel + pembrolizumab combination therapy. Prior studies have both characterized distinct immune/stromal microenvironments across breast cancer subtypes55 and defined the lineage-specific progenitors of luminal and basal tumor cells in primary disease56. A prior window-of-opportunity study8 investigated changes in the tumor microenvironment of breast cancer, specifically focusing on its relationship to T cell expansion. However, the mechanisms by which tumor-intrinsic properties and immune-tumor interactions modulate therapeutic efficacy during chemotherapy, PD-1 blockade, or combination regimens remain poorly understood. Therefore, our study extends these investigations by assessing the dynamic influence of taxane chemotherapy and anti-PD-1 immunotherapy, both as monotherapies and in combination, on tumor-intrinsic and -extrinsic microenvironmental features in patient tumors. From this analysis, we identified cell type-specific molecular programs, GRNs, and signaling interactions that are associated with patient responses to the combination of taxane chemotherapy and anti-PD-1 therapy. Prior research has indicated that certain malignant cell gene signatures, such as cell cycle and EMT, are shared among malignant cells across different cancer types15,55, thereby providing a valuable framework for identifying common therapeutic targets and understanding tumor progression mechanisms. Additionally, high immune infiltration in lung adenocarcinoma has been strongly linked to the induction of EMT, which is associated with increased expression of inhibitory immune checkpoints like CTLA-4 and PD-L157.
A pan-cancer study58 further demonstrated that the crosstalk between immune evasion and EMT correlates with responses to immune checkpoint blockade across several cancer types. Utilizing an NMF-based approach15, we detected several malignant cell molecular programs that recurred among patients in our cohort. Notably, two of these metaprograms, EMT-III-related MP11 and ER-I-related MP7, demonstrated a significant correlation with response and resistance to the combination therapy, respectively. In addition, the EMT-III gene signature was associated with mutations in TP53, and malignant cells with a high EMT-III or ER-I gene signature score displayed enrichment of TGF-β or estrogen signaling pathway activities, respectively. Lastly, by integrating paired gene expression and chromatin accessibility data, we determined that the EMT-III program was predominantly regulated by TCF7L1/2 TF regulons, whereas the ER-I program was regulated by ESR1 and GATA3 TF regulons59. Previous studies have revealed a positive correlation between EMT and PD-L1 expression in breast cancer patients60. Collectively, these findings suggest that a tumor specific EMT program may serve as a potential biomarker for combination immunotherapy and chemotherapy in breast cancer, with potential relevance to other cancer types.
Immune cell states and their relative abundance have also been implicated as predictive biomarkers for ICI efficacy. In our study, we detected relatively high naïve, cytotoxic effector, and exhausted CD8 + T cell states in baseline tumor biopsies from patients who favorably responded to the therapy, whereas these cell populations were detected at relatively low levels in patients with an unfavorable response. Furthermore, favorable responders exhibited an increase in the CD8 + T cell cytotoxic effector state and a decrease in the CD8 + T cell exhausted state in on-combination therapy tumor biopsies, suggesting a more robust antitumor T cell response compared to unfavorable responders. In evaluating macrophage state dynamics, our findings suggest that neoadjuvant exposure to either taxane chemotherapy or anti-PD-1 as monotherapies may independently elevate pro-inflammatory macrophage phenotypes, but that the combination of these two therapies may promote a shift of these monotherapy-induced inflammatory macrophages toward an anti-inflammatory macrophage phenotype in tumors resistant to therapy.
Our findings also reveal a difference between predicted tumor-immune cell signaling interactions in pretreatment baseline tumor biopsies versus on-combination therapy biopsies, suggestive of a shift during therapy that may influence patient responses. For example, we detected a higher number of interactions between tumor and CD8 + T cells via PD-1 in favorable responders compared to unfavorable responders at baseline. We also found that pretreatment interactions between tumor cells and CTLA-4 on CD8 + T cells were more prevalent in unfavorable responders. In addition, macrophages expressing NRP1/2 receptors have been linked to immunosuppression, and upon evaluating signaling activity between pretreatment baseline and on-combination therapy tumor biopsies, we noted enhanced tumor-macrophage interactions via NRP1/2-VEGFA in on-combination therapy biopsies from unfavorable responders.
Our study has several limitations. Due to the unique trial design and limited tissue access, our analyses are constrained by the small sample size. Consequently, these findings would benefit from further experimental investigations into the specific cell-type relationships observed, as well as additional external clinical validation, where feasible, with tumors treated with similar therapies and analyzed using the same sequencing approach. Additionally, spatial heterogeneity within tumor regions inherently restricts our ability to comprehensively determine the cell types and states identified. This issue may be further investigated with companion spatially resolved transcriptomics from future trial and real-world samples and cohorts.
In summary, our study provides a comprehensive overview of the tumor-intrinsic and -extrinsic microenvironmental dynamics following a combination of taxane chemotherapy and anti-PD-1 therapy in the neoadjuvant setting of HR+ breast cancer. We identified malignant cell molecular programs, such as ER-I, and immune cell states (i.e., exhausted CD8 + T cell states) and interactions (i.e., malignant-T cell signaling interactions via CTLA-4) that correlate with unfavorable responses to the combination therapy. Our study thereby offers valuable insights into potential predictive biomarkers for therapeutic response in early-stage HR+ breast cancer patients. Further research, including longitudinal studies on patient survival, is necessary to validate the associations identified in this study.
Methods
Sample collection and processing
Patient population and sample selection
This study was approved by the institutional review board of Dana-Farber/Harvard Cancer Center and was conducted in accordance with the principles of the Declaration of Helsinki. As breast cancer is a leading cause of cancer death among women, only female patients were enrolled. Participants did not receive compensation and provided information on their self-identified gender. Tumor biopsies were collected from patients enrolled in the randomized and open-label clinical trial NCT02999477, which enrolled 32 patients. All participants provided written informed consent before undergoing any study-related procedures. Among the 32 patients, two withdrew before receiving study therapy, and one was found ineligible due to HER2 positivity after a single dose of pembrolizumab. Of the remaining 29 patients eligible for efficacy analysis, only 20 were ultimately evaluated. This discrepancy arose for three reasons: the single-cell sequencing study was initiated prior to full trial enrollment (six trial patients enrolled after the single-cell sequencing efforts began and were therefore not included in the single-cell cohort), single-cell profiling failed at all timepoints for one patient, and baseline biopsies were absent for two patients. Briefly, baseline biopsies were available for 21 patients for single-cell sequencing; however, due to a high rate of library construction failure, only 12 patients had successful library construction and single-cell profiling. Baseline patient and disease characteristics for these 12 patients are presented in Supplementary Data 4. The median age was 42 years, with 66.7% having clinical stage II breast cancer and 75% of tumors exhibiting pure ductal histology. Additionally, 16.7% of tumors were classified as HR low-positive. These characteristics align with the 29 patients included in the efficacy analysis outlined in Ada et al., 2024 (co-submitted).
For W3D1 biopsies, 14 patient samples were available, with successful library construction and single-cell profiling achieved for nine patients. Similarly, for W7D1 biopsies, 12 patient samples were available, of which eight underwent successful library construction and single-cell profiling. Among pre-surgery biopsies, five samples were available, with four successfully processed. For surgery biopsies, ten samples were available, with successful library construction achieved for seven. Overall, despite variability in sample availability and processing success, the patient characteristics of those with successful single-cell profiling remained consistent with those in the efficacy analysis (Supplementary Data 4).
Nuclei isolation
Nuclei isolation was performed as previously described61. Low-retention microcentrifuge tubes (Fisher Scientific, Hampton, NH, USA) were used throughout the procedure to minimize nuclei loss. Briefly, tissues were manually dissociated by chopping with fine spring scissors for 10 minutes, then homogenized in TST solution. The homogenate was filtered through a 30 µm MACS SmartStrainer (Miltenyi Biotec, Germany) and centrifuged at 500×g for 10 min at 4 °C to pellet the nuclei. The nuclei pellet was resuspended in a lysis buffer to permeabilize the nuclei, followed by another centrifugation at 500×g for 10 min at 4 °C. The final nuclei pellet was resuspended in 150 µl of 10x Genomics Diluted Nuclei Buffer. Trypan blue-stained nuclei were then counted manually using INCYTO C-Chip Neubauer Improved Disposable Hemacytometers (VWR International Ltd., Radnor, PA, USA).
Single-nuclei RNA-sequencing (snRNA-seq)
Following the 10x Genomics protocol62, a maximum of 10,000 nuclei per sample were loaded into each channel of the Chromium Next GEM Chip K for processing on the 10x Chromium Controller (10x Genomics, Pleasanton, CA, USA). This was followed by cDNA generation and library construction according to the manufacturer’s instructions (Chromium Next GEM Single Cell 5ʹ Reagent Kits v2 User Guide, Rev E). The resulting libraries were then normalized and pooled for sequencing on an Illumina NovaSeq system (Illumina, Inc., San Diego, CA, USA) with run parameters set to 26, 10, 10, and 90.
Multiomic GEX and ATAC profiling (MO)
Following the 10x Genomics protocol62, we loaded around 16,000 nuclei per sample per channel of the Chromium Next GEM Chip J for processing on the 10x Chromium Controller (10x Genomics, Pleasanton, CA, USA). This was followed by transposition or cDNA generation and library construction according to the manufacturer’s instructions (Chromium Next GEM Single Cell Multiome ATAC + Gene Expression User Guide, Rev F). The resulting libraries were then normalized and pooled for sequencing on two NovaSeq SP-100 flow cells (Illumina, Inc., San Diego, CA, USA).
Single-nuclei gene expression analysis
We utilized CellRanger v6.0.1 to generate raw gene expression matrices for snRNA-seq and CellRanger Arc v2.0.0 for MO profiling, quantifying reads aligned to the GRCh38-2020-A reference genome on a per-cell basis. CellBender (v0.3.0)63 was then applied to the raw expression matrix to eliminate technical artifacts arising from ambient RNA and random barcode swapping. Cells with fewer than 250 expressed genes, fewer than 500 UMIs, or greater than 5% mitochondrial gene counts were excluded from further analysis. Subsequently, we employed Scrublet64 to remove potential doublets from the filtered gene expression matrices. After these quality control steps, a total of 249,379 cells remained for downstream analysis (Python 3.8.12 and R 4.1).
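The three per-cell thresholds above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `CellQC` record, the `passes_qc` helper, and the toy barcodes are hypothetical, and only the cutoffs (250 genes, 500 UMIs, 5% mitochondrial counts) come from the text.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Hypothetical per-cell summary; field names are illustrative."""
    barcode: str
    n_genes: int      # genes with at least one count
    n_umis: int       # total UMI counts
    mito_counts: int  # UMI counts assigned to mitochondrial genes


def passes_qc(cell, min_genes=250, min_umis=500, max_mito_frac=0.05):
    """Keep a cell only if it clears all three thresholds stated in the text."""
    if cell.n_umis == 0:
        return False
    mito_frac = cell.mito_counts / cell.n_umis
    return (
        cell.n_genes >= min_genes
        and cell.n_umis >= min_umis
        and mito_frac <= max_mito_frac
    )


# Toy cells (made-up values) that each probe one threshold.
cells = [
    CellQC("AAAC", n_genes=1200, n_umis=3000, mito_counts=60),  # 2% mito, kept
    CellQC("AAAG", n_genes=200, n_umis=900, mito_counts=10),    # too few genes
    CellQC("AAAT", n_genes=800, n_umis=450, mito_counts=5),     # too few UMIs
    CellQC("AACC", n_genes=900, n_umis=2000, mito_counts=300),  # 15% mito
]
kept = [c.barcode for c in cells if passes_qc(c)]
print(kept)
```

Ambient RNA correction and doublet removal are separate steps in the text and are not represented here.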
There were seven sequencing batches in total (Supplementary Data 1A). For each sequencing batch, Leiden clustering and cell compartment annotation were performed on their aggregated expression matrices with Scanpy (1.9.1)65. Unless specified, default Scanpy parameters were employed. Briefly, for each batch, 2000 highly variable genes were identified and used for the principal component analysis (PCA). The top 50 principal components (PCs) were then used to compute the nearest neighbors distance matrix, with a local neighborhood size of 10 for constructing the neighborhood graph. We applied Leiden clustering at various resolutions (0.1, 0.3, and 0.5) to the neighborhood graph. For each clustering resolution, the Scanpy rank_genes_groups function was used to characterize the highly expressed genes per cluster. Clusters were annotated by comparing their highly expressed genes to canonical cell type marker genes (Supplementary Data 5), with the Jaccard index quantifying the overlap. The resolution yielding the fewest uncharacterized clusters was selected. Cell types were then grouped into their respective lineages prior to concatenating data from the entire cohort (Supplementary Data 1E). Lastly, we repeated the PCA analysis on 2000 highly variable genes from the integrated expression matrix, constructed a neighborhood graph, and performed uniform manifold approximation and projection (UMAP) to embed the neighborhood graph into a two-dimensional space for visualization.
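The Jaccard-based annotation step can be sketched as follows. This is an illustration under assumptions: the marker sets below are short, commonly used examples rather than the lists in Supplementary Data 5, and the `annotate_cluster` helper and its "uncharacterized" fallback for zero overlap are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def annotate_cluster(cluster_genes, marker_sets, min_score=0.0):
    """Return the best-matching cell type and all scores for one cluster."""
    scores = {ct: jaccard(cluster_genes, genes) for ct, genes in marker_sets.items()}
    best = max(scores, key=scores.get)
    if scores[best] <= min_score:
        return "uncharacterized", scores
    return best, scores


# Illustrative marker sets (assumed, not the study's lists).
marker_sets = {
    "T cell": {"CD3D", "CD3E", "CD8A", "CD2"},
    "Macrophage": {"CD68", "CD163", "CSF1R", "C1QA"},
    "Epithelial": {"EPCAM", "KRT8", "KRT18", "KRT19"},
}

label, scores = annotate_cluster(["CD3D", "CD3E", "CD8A", "GZMK"], marker_sets)
unknown_label, _ = annotate_cluster(["ALB", "APOA1"], marker_sets)
print(label, round(scores["T cell"], 2), unknown_label)
```

Counting the "uncharacterized" labels at each Leiden resolution would then give the quantity used in the text to choose a resolution.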
Malignant cell detection through copy number analysis
Malignant cells were distinguished from normal cells in each individual sample using SCEVAN (1.0.1)13, with immune cells as the reference. Subsequently, we designated malignant cells originating from the epithelial compartment as tumor cells. As this tool also infers the copy number profile of malignant cells, we confirmed that the common copy number alterations detected through WES were also present in the copy number profile derived from the single-cell expression matrix (Supplementary Fig. 2B).
Recurrent programs detection and metaprograms generation
Within each cell type, we filtered out samples with fewer than 100 cells and adopted the methodologies outlined in previous studies14,15. In brief, we employed sparse non-negative matrix factorization (sNMF) implemented by nimfa (1.4.0) on a given cell type from an individual sample, exploring a range of k values from 4 to 10, and characterized the resulting NMF programs by their top 50 defining genes. We then assessed the degree of overlap for the NMF programs both within and across samples, retaining those programs with fewer than 35 overlapping genes within the same sample and more than ten overlapping genes between different samples. These selected NMF programs were designated as robust NMF programs. Finally, we applied a customized clustering method15 to these robust NMF programs, resulting in the identification of clusters referred to as metaprograms (MPs).
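The overlap filter can be sketched as below, taking the sentence above at face value. The interpretation is an assumption: a program is kept if it shares fewer than `max_within` genes with every other program from the same sample and more than `min_across` genes with at least one program from a different sample. The referenced methods14,15 contain further steps that are not reproduced, and the `select_robust_programs` helper is hypothetical. The toy example uses 5-gene programs with scaled-down thresholds; the text's values (35 and 10, on top-50 gene lists) are the defaults.

```python
def select_robust_programs(programs, max_within=35, min_across=10):
    """programs maps (sample_id, program_id) -> iterable of top genes.

    Returns the keys that satisfy the within-sample and across-sample
    overlap criteria as read literally from the text.
    """
    gene_sets = {key: set(genes) for key, genes in programs.items()}
    robust = []
    for key, genes in gene_sets.items():
        sample = key[0]
        within = [len(genes & other) for k, other in gene_sets.items()
                  if k != key and k[0] == sample]
        across = [len(genes & other) for k, other in gene_sets.items()
                  if k[0] != sample]
        if all(n < max_within for n in within) and any(n > min_across for n in across):
            robust.append(key)
    return robust


# Toy programs with made-up gene names.
programs = {
    ("P1", "a"): ["g1", "g2", "g3", "g4", "g5"],
    ("P1", "b"): ["g6", "g7", "g8", "g9", "g10"],
    ("P2", "a"): ["g1", "g2", "g3", "g11", "g12"],    # shares 3 genes with P1/a
    ("P2", "b"): ["g13", "g14", "g15", "g16", "g17"],  # recurs nowhere
}
robust = select_robust_programs(programs, max_within=4, min_across=2)
print(robust)
```

In the toy data only the two programs that recur across patients survive, which is the behavior the recurrence criterion is meant to capture.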
Cell state annotation
Within each cell type, cells were further classified into distinct cell states using MPs previously identified for that specific cell type. For each cell type, including CD8 T cells, macrophages, and tumor cells, we quantified cell-type-specific MPs using the Vision (3.0.1)66 method, generating an N (cell count) × M (MP count) signature matrix per cell type. Due to the incomparable score ranges of MPs, we applied Gaussian mixture modeling (GMM) with hard labeling via the signature scoring Python package67. Briefly, GMM was fitted to each MP scoring matrix with the number of mixtures equal to the number of identified MPs. Pearson’s correlation coefficient between MPs and GMM clusters was calculated to generate an M × C similarity matrix. Subsequently, GMM mixtures were labeled according to the MP with which they exhibited the highest similarity. In the final step, each cell was classified under the MP that corresponded to the GMM mixture with the highest probability in the GMM (Supplementary Fig. 6).
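The mixture-labeling and hard-assignment steps can be illustrated with a small sketch. It assumes the GMM has already been fitted and its per-cell posterior probabilities are available as a plain N × C table; the fitting itself and the scoring package cited in the text are not reproduced. The `pearson` and `label_cells` helpers and the toy numbers are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def label_cells(mp_scores, posteriors, mp_names):
    """mp_scores: N x M signature scores; posteriors: N x C mixture probabilities."""
    n_mps, n_mix = len(mp_names), len(posteriors[0])
    mp_cols = [[row[m] for row in mp_scores] for m in range(n_mps)]
    mix_cols = [[row[c] for row in posteriors] for c in range(n_mix)]
    # M x C similarity matrix between MP scores and mixture memberships.
    similarity = [[pearson(mp_cols[m], mix_cols[c]) for c in range(n_mix)]
                  for m in range(n_mps)]
    # Name each mixture after the MP it correlates with most strongly.
    mix_label = [mp_names[max(range(n_mps), key=lambda m: similarity[m][c])]
                 for c in range(n_mix)]
    # Hard-assign each cell to its most probable mixture.
    return [mix_label[max(range(n_mix), key=lambda c: row[c])] for row in posteriors]


mp_names = ["MP1", "MP2"]
mp_scores = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.7], [0.2, 0.9]]
# Mixture order is arbitrary: here mixture 0 tracks MP2 and mixture 1 tracks MP1.
posteriors = [[0.05, 0.95], [0.10, 0.90], [0.90, 0.10], [0.80, 0.20]]
cell_states = label_cells(mp_scores, posteriors, mp_names)
print(cell_states)
```

Labeling mixtures by correlation rather than by index matters because the component order returned by a mixture model carries no meaning.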
Differential gene expression and pathway activity analysis
We utilized the decoupler (1.6.0)68 Python package for the differential gene expression (DEG) analysis and pathway activity inference. To identify DEGs associated with specific cell states, we generated pseudobulk samples69 for each cell state within each patient sample. DEGs for each cell state within the same cell type were then calculated using DESeq270,71. To account for the unequal number of samples across patients, we included patient ID as a design factor in our analysis to minimize patient-driven biases.
For the estimation of signaling activity levels within each cell state, we utilized PROGENy (1.0.6)28, which is based on a compendium of large-scale public signaling perturbation experiments that define responsive genes and interaction weights for pathways. Briefly, the cell state DEGs statistic was fitted into a multivariate linear model. This model predicted the observed gene expression using the PROGENy Pathway-Gene interaction weights. The resulting t-values of the model coefficients served as the activity scores for each cell state’s signaling pathways.
Gene signatures on bulk RNA-sequencing data
Initially, we established a set of signature genes for each cell state, selecting the top 100 genes that were differentially expressed (with adjusted P value <0.05 and a log fold change >1) in comparison to other cells detected in our cohort. In accordance with the guidelines provided by Vision (3.0.1)66, we normalized the bulk RNA-sequencing raw count matrix by dividing by the total reads per sample and then multiplying by the median total reads across the cohort. Next, cell state signature scores were calculated on the scaled count matrix via Vision. Lastly, the resulting cell state signature scores were further normalized against the average scores of cell states within the same cell type, as the signatures being investigated were pertinent to individual cell types.
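The depth normalization described above (divide by each sample's total reads, then multiply by the cohort median of total reads) can be written out in a few lines. This is a sketch of the stated arithmetic only; the `scale_to_median_depth` helper, the sample names, and the counts are hypothetical, and signature scoring itself is not shown.

```python
from statistics import median


def scale_to_median_depth(counts):
    """counts maps sample -> {gene: raw count}; returns depth-scaled values."""
    totals = {sample: sum(genes.values()) for sample, genes in counts.items()}
    if any(total == 0 for total in totals.values()):
        raise ValueError("every sample needs at least one read")
    target = median(totals.values())  # median library size across the cohort
    return {
        sample: {gene: c / totals[sample] * target for gene, c in genes.items()}
        for sample, genes in counts.items()
    }


# Toy cohort with library sizes 40, 100, and 200 (median 100).
raw = {
    "S1": {"GENE_A": 10, "GENE_B": 30},
    "S2": {"GENE_A": 50, "GENE_B": 50},
    "S3": {"GENE_A": 100, "GENE_B": 100},
}
scaled = scale_to_median_depth(raw)
print(scaled["S1"], scaled["S3"])
```

After scaling, every sample has the same total, so differences between samples reflect relative rather than absolute read counts.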
Differential gene signature analysis
The gene signature scores were compared between two groups of interest to identify signatures with significant differences. Changes in cell type-specific signatures over treatment time were modeled using a linear mixed-effects regression model (1):
\(y={\beta }_{0}+{\beta }_{1}t+Zd+\epsilon\) (1)
where \(y\) is the observed signature score for a signature across all cells in the given cell type, \({\beta }_{0}\) is the intercept, \({\beta }_{1}\) is the fixed effect of treatment time \(t\) on signature level, \(Z\) is a binary design matrix indicating whether cells are from the same patient, \(d\) is a vector of random effects for patient, which is normally distributed with mean zero and represents the deviation of each patient's mean signature score from the overall mean, and \(\epsilon\) is the random error. To ensure robust statistical modeling of within-patient dynamics while mitigating bias from incomplete longitudinal data, we included only patients with samples at both timepoints when performing this analysis. To identify signatures that significantly differentiate favorable responders from unfavorable responders, we applied a similar formula, where \({\beta }_{1}\) is the fixed effect of therapy response \(t\) on signature level.
Cell-to-cell communication of single-nuclei gene expression data
Given the requisite minimum number of cells and samples per condition for the application of MultiNicheNetR (2.0.0)48, our analysis was concentrated on the cellular interactions at the cell type level, specifically between tumors, macrophages, and CD8 T cells. The algorithm operates by identifying ligand-receptor interactions that exhibit differential expression and activity between specified conditions, as detailed in the referenced literature48. We investigated the interactions that were differentially activated between favorable and unfavorable responders using their baseline samples (Supplementary Data 3A). Subsequently, we explored the changes in cell-cell communication dynamics from baseline to W7D1, focusing on differences between favorable and unfavorable responders (Supplementary Data 3B).
Single nuclei ATAC-sequencing data processing
We generated a fragment bed file for each sample, where each line corresponds to a unique ATAC-seq fragment identified by the assay, using CellRanger Arc v2.0.0. We then proceeded with the SCENIC+ (1.0.1.dev6+ge5ba6fc)20 workflow for our downstream analysis. First, leveraging the cell type labels derived from gene expression analysis, we constructed pseudobulk fragment bed files for each cell type and conducted consensus peak calling12,20. Upon establishing the consensus peak regions, we counted the number of fragments overlapping the consensus peak regions. Cells with a fraction of fragments in peaks <0.45, log number of unique fragments <3.3, or transcription start site enrichment score <5 were removed. This resulted in 22,125 high-quality cells (Supplementary Data 1F).
Next, we applied topic modeling to the entire cohort, as well as tumor cells and macrophages individually, using LDA with a collapsed Gibbs sampler. We selected models with 30, 15, and 10 topics for the entire cohort, tumor cells, and macrophages, respectively, based on the stabilization of metrics72,73,74 and log-likelihood. Finally, we conducted dimensionality reduction using UMAP on all topics identified across the entire cohort.
Topic binarization
We followed the recommendations described in ref. 20 to generate the required region sets for gene regulation network inference analysis. Briefly, we binarized topic regions using two methods: Otsu75 and ntop (the top 3000 regions per topic).
Differentially accessible regions detection
We imputed the region accessibility exploiting the cell-topic and topic-region probabilities and normalized the probability values with the default scale factor 1,000,000. To speed up the hypothesis testing step, we identified highly variable regions and then identified differentially accessible regions per cell state using the Wilcoxon rank-sum test with log fold change >1.5 and false discovery rate <0.05.
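The imputation step can be read as a matrix product of the two topic-model outputs, scaled by the stated factor. The sketch below shows that arithmetic on toy numbers; it is an illustration of the idea rather than the SCENIC+ implementation, and the `impute_accessibility` helper and the probability tables are hypothetical.

```python
def impute_accessibility(cell_topic, topic_region, scale_factor=1_000_000):
    """Return an N x R table of imputed accessibility.

    cell_topic:   N x K table, topic probabilities per cell
    topic_region: K x R table, region probabilities per topic
    """
    n_topics = len(topic_region)
    n_regions = len(topic_region[0])
    return [
        [scale_factor * sum(row[k] * topic_region[k][r] for k in range(n_topics))
         for r in range(n_regions)]
        for row in cell_topic
    ]


# Two cells, two topics, three regions (made-up probabilities).
cell_topic = [[1.0, 0.0],   # cell 0 is described entirely by topic 0
              [0.5, 0.5]]   # cell 1 is an even mixture of both topics
topic_region = [[0.7, 0.2, 0.1],
                [0.1, 0.1, 0.8]]
imputed = impute_accessibility(cell_topic, topic_region)
print([[round(v) for v in row] for row in imputed])
```

Because each row of both inputs sums to one, each cell's imputed values sum to the scale factor, which keeps cells comparable before the rank-sum test.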
Motif enrichment analysis
We used the score and ranking database v10 motif collection (SCENIC+ motif collection). Motif enrichment was performed using both the cisTarget and DEM algorithms on binarized topic regions, the top 3000 regions per topic, and cell-type-based differentially accessible regions. The motif enrichment analysis was run both including promoters, defined as regions within 500 bp up- or downstream of the TSS of each gene, and excluding them.
Gene regulation network inference
The raw gene expression count matrix, imputed accessibility and motif enrichment results were used as input into the SCENIC+ workflow, keeping 22,125 cells with both high-quality ATAC-seq and RNA-seq profiles. The SCENIC+ workflows were run using default parameters20.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Raw sequencing reads of all single-cell experiments (snRNA-seq and multiomic GEX and ATAC profiling) of this study have been deposited under restricted access in the database of Genotypes and Phenotypes (dbGaP) with data accession number phs002419.v2. This is controlled-access data, which may be accessed by researchers who submit a research-based application and receive approval from the Data Access Committee (DAC). The Source data, including processed datasets, are publicly available at 10.5281/zenodo.15799565. The publicly available TCGA bulk RNA-Seq data and corresponding clinical information used in this study were downloaded from https://gdc.cancer.gov/about-data/publications/pancanatlas. Bulk RNA-Seq data and clinical annotations from metastatic HR+ breast cancer were retrieved from the Source Data of the referenced study19. The remaining data are available within the article, Supplementary Information or Source Data file. Source data for confidential variables are excluded. Additional data or material requests will be reviewed by the senior authors for intellectual property or confidentiality considerations.
Code availability
Code applied to the analyses in this study can be found on GitHub at: https://github.com/jingxinfu/hr_brca_single_cell. The permanent version76 is deposited in Zenodo (https://zenodo.org/records/16912125).
References
Bray, F. et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 74, 229–263 (2024).
Tolaney, S. M. et al. Effect of eribulin with or without pembrolizumab on progression-free survival for patients with hormone receptor–positive, ERBB2-negative metastatic breast cancer: a randomized clinical trial. JAMA Oncol. 6, 1598–1605 (2020).
Loi, S. et al. A phase III trial of nivolumab with neoadjuvant chemotherapy and adjuvant endocrine therapy in ER+/HER2- primary breast cancer: CheckMate 7FL. J. Clin. Oncol. 38, TPS604–TPS604 (2020).
Cardoso, F. et al. KEYNOTE-756: randomized, double-blind, phase 3 study of pembrolizumab vs placebo combined with neoadjuvant chemotherapy and adjuvant endocrine therapy for high-risk, early-stage estrogen receptor–positive, human epidermal growth factor receptor 2–negative (ER+/HER2−) breast cancer. J. Clin. Oncol. 37, TPS601–TPS601 (2019).
Schmid, P. et al. Pembrolizumab for early triple-negative breast cancer. N. Engl. J. Med. 382, 810–821 (2020).
Schmid, P. et al. LBA18 pembrolizumab or placebo plus chemotherapy followed by pembrolizumab or placebo for early-stage TNBC: updated EFS results from the phase III KEYNOTE-522 study. Ann. Oncol. 34, S1257 (2023).
Zhang, Y. et al. Single-cell analyses reveal key immune cell subsets associated with response to PD-L1 blockade in triple-negative breast cancer. Cancer Cell 39, 1578–1593.e8 (2021).
Bassez, A. et al. A single-cell map of intratumoral changes during anti-PD1 treatment of patients with breast cancer. Nat. Med. 27, 820–832 (2021).
Karaayvaz, M. et al. Unravelling subclonal heterogeneity and aggressive disease states in TNBC through single-cell RNA-seq. Nat. Commun. 9, 3588 (2018).
Wu, S. Z. et al. Stromal cell diversity associated with immune evasion in human triple-negative breast cancer. EMBO J. 39, e104063 (2020).
Waks, A. G. et al. Neoadjuvant nab-paclitaxel and pembrolizumab in hormone-receptor positive breast cancer: Efficacy, safety, and predictive biomarker investigation. Nat Cancer. Companion paper, In Review.
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
De Falco, A., Caruso, F., Su, X.-D., Iavarone, A. & Ceccarelli, M. A variational algorithm to detect the clonal copy number substructure of tumors from scRNA-seq data. Nat. Commun. 14, 1074 (2023).
Kinker, G. S. et al. Pan-cancer single-cell RNA-seq identifies recurring programs of cellular heterogeneity. Nat. Genet. 52, 1208–1218 (2020).
Gavish, A. et al. Hallmarks of transcriptional intratumour heterogeneity across a thousand tumours. Nature 618, 598–606 (2023).
Thiery, J. P., Acloque, H., Huang, R. Y. J. & Nieto, M. A. Epithelial-mesenchymal transitions in development and disease. Cell 139, 871–890 (2009).
Zou, J. et al. Secreted TGF-beta-induced protein promotes aggressive progression in bladder cancer cells. Cancer Manag. Res. 11, 6995–7006 (2019).
Hu, D. et al. Apelin is associated with clinicopathological parameters and prognosis in breast cancer patients. Arch. Gynecol. Obstet. 306, 1185–1195 (2022).
Keenan, T. E. et al. Molecular correlates of response to eribulin and pembrolizumab in hormone receptor-positive metastatic breast cancer. Nat. Commun. 12, 5563 (2021).
Bravo Gonzalez-Blas, C. et al. SCENIC+: single-cell multiomic inference of enhancers and gene regulatory networks. Nat. Methods 20, 1355–1367 (2023).
Clevers, H. Wnt/beta-catenin signaling in development and disease. Cell 127, 469–480 (2006).
Hammond, E. et al. The Wnt effector transcription factor 7-like 2 positively regulates oligodendrocyte differentiation in a manner independent of Wnt/beta-catenin signaling. J. Neurosci. 35, 5007–5022 (2015).
Sanchez-Tillo, E. et al. beta-catenin/TCF4 complex induces the epithelial-to-mesenchymal transition (EMT)-activator ZEB1 to regulate tumor invasiveness. Proc. Natl Acad. Sci. USA 108, 19204–19209 (2011).
Karve, K., Netherton, S., Deng, L., Bonni, A. & Bonni, S. Regulation of epithelial-mesenchymal transition and organoid morphogenesis by a novel TGFbeta-TCF7L2 isoform-specific signaling pathway. Cell Death Dis. 11, 704 (2020).
Korinek, V. et al. Depletion of epithelial stem-cell compartments in the small intestine of mice lacking Tcf-4. Nat. Genet. 19, 379–383 (1998).
Theodorou, V., Stark, R., Menon, S. & Carroll, J. S. GATA3 acts upstream of FOXA1 in mediating ESR1 binding by shaping enhancer accessibility. Genome Res. 23, 12–22 (2013).
Tian, D. et al. A kinetic model identifies phosphorylated estrogen receptor-α (ERα) as a critical regulator of ERα dynamics in breast cancer. FASEB J. 29, 2022–2031 (2015).
Schubert, M. et al. Perturbation-response genes reveal signaling footprints in cancer gene expression. Nat. Commun. 9, 20 (2018).
Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature 490, 61–70 (2012).
Oliveira, G. et al. Phenotype, specificity and avidity of antitumour CD8(+) T cells in melanoma. Nature 596, 119–125 (2021).
Jiang, P. et al. Systematic investigation of cytokine signaling activity at the tissue and single-cell levels. Nat. Methods 18, 1181–1191 (2021).
Tagaya, Y., Bamford, R. N., DeFilippis, A. P. & Waldmann, T. A. IL-15: a pleiotropic cytokine with diverse receptor/signaling pathways whose expression is controlled at multiple levels. Immunity 4, 329–336 (1996).
House, I. G. et al. Macrophage-derived CXCL9 and CXCL10 are required for antitumor immune responses following immune checkpoint blockade. Clin. Cancer Res. 26, 487–504 (2020).
Li, H. et al. The allergy mediator histamine confers resistance to immunotherapy in cancer patients via activation of the macrophage histamine receptor H1. Cancer Cell 40, 36–52.e9 (2022).
Grabski, I. N., Street, K. & Irizarry, R. A. Significance analysis for clustering with single-cell RNA-sequencing data. Nat. Methods 20, 1196–1202 (2023).
Cheng, S. et al. A pan-cancer single-cell transcriptional atlas of tumor infiltrating myeloid cells. Cell 184, 792–809.e23 (2021).
Jablonski, K. A. et al. Novel markers to delineate murine M1 and M2 macrophages. PLoS ONE 10, e0145342 (2015).
Becker, M. et al. Integrated transcriptomics establish macrophage polarization signatures and have potential applications for clinical health and disease. Sci. Rep. 5, 13351 (2015).
Bi, K. et al. Tumor and immune reprogramming during immunotherapy in advanced renal cell carcinoma. Cancer Cell 39, 649–661.e5 (2021).
Tarique, A. A. et al. Phenotypic, functional, and plasticity features of classical and alternatively activated human macrophages. Am. J. Respir. Cell Mol. Biol. 53, 676–688 (2015).
Gertz, J. et al. Distinct properties of cell-type-specific and shared transcription factor binding sites. Mol. Cell 52, 25–36 (2013).
Zheng, R. et al. Cistrome data browser: expanded datasets and new tools for gene regulatory analysis. Nucleic Acids Res. 47, D729–D735 (2019).
Mei, S. et al. Cistrome data browser: a data portal for ChIP-Seq and chromatin accessibility data in human and mouse. Nucleic Acids Res. 45, D658–D662 (2017).
Adamson, S. E. et al. Disabled homolog 2 controls macrophage phenotypic polarization and adipose tissue inflammation. J. Clin. Invest. 126, 1311–1322 (2016).
Nasir, I. et al. Tumor macrophage functional heterogeneity can inform the development of novel cancer therapies. Trends Immunol. 44, 971–985 (2023).
Michaud, D. E. & Guerriero, J. L. Myeloid cells pave the metastatic road in breast cancer. Cancer Res. 84, 181–183 (2024).
Kos, K. et al. Tumor-educated Tregs drive organ-specific metastasis in breast cancer by impairing NK cells in the lymph node niche. Cell Rep. 38, 110447 (2022).
Browaeys, R. et al. MultiNicheNet: a flexible framework for differential cell-cell communication analysis from multi-sample multi-condition single-cell transcriptomics data. Preprint at bioRxiv https://doi.org/10.1101/2023.06.13.544751 (2023).
Peng, R. & Deng, M. Mapping the protein-protein interactome in the tumor immune microenvironment. Antib. Ther. 6, 311–321 (2023).
Khan, A. B. et al. Redirection to the bone marrow improves T cell persistence and antitumor functions. J. Clin. Invest. 128, 2010–2024 (2018).
Maasho, K. et al. The inhibitory leukocyte-associated Ig-like receptor-1 (LAIR-1) is expressed at high levels by human naive T cells and inhibits TCR mediated activation. Mol. Immunol. 42, 1521–1530 (2005).
Fantin, A. et al. Tissue macrophages act as cellular chaperones for vascular anastomosis downstream of VEGF-mediated endothelial tip cell induction. Blood 116, 829–840 (2010).
Casazza, A. et al. Impeding macrophage entry into hypoxic tumor areas by Sema3A/Nrp1 signaling blockade inhibits angiogenesis and restores antitumor immunity. Cancer Cell 24, 695–709 (2013).
Pellet-Many, C., Fernando-Sayers, J. & Zachary, I. BS20 macrophage-specific deletion of neuropilin-2 inhibits inflammatory signalling and attenuates atherosclerotic plaque development in apoe-deficient mice. BMJ 109 A260.1–A260 (2023).
Wu, S. Z. et al. A single-cell and spatially resolved atlas of human breast cancers. Nat. Genet. 53, 1334–1347 (2021).
Iglesia, M. D. et al. Differential chromatin accessibility and transcriptional dynamics define breast cancer subtypes and their lineages. Nat. Cancer 5, 1713–1736 (2024).
Lou, Y. et al. Epithelial-mesenchymal transition is associated with a distinct tumor microenvironment including elevation of inflammatory signals and multiple immune checkpoints in lung adenocarcinoma. Clin. Cancer Res. 22, 3630–3642 (2016).
Wang, G. et al. The pan-cancer landscape of crosstalk between epithelial-mesenchymal transition and immune evasion relevant to prognosis and immunotherapy response. NPJ Precis. Oncol. 5, 56 (2021).
Hermida-Prado, F. et al. Endocrine therapy synergizes with SMAC mimetics to potentiate antigen presentation and tumor regression in hormone receptor-positive breast cancer. Cancer Res. 83, 3284–3304 (2023).
Alsuliman, A. et al. Bidirectional crosstalk between PD-L1 expression and epithelial to mesenchymal transition: significance in claudin-low breast cancer cells. Mol. Cancer 14, 149 (2015).
Slyper, M. et al. A single-cell and single-nucleus RNA-Seq toolbox for fresh and frozen human tumors. Nat. Med. 26, 1307 (2020).
10x Genomics. Cell preparation for single cell protocols, Rev D. https://cdn.10xgenomics.com/image/upload/v1686678481/support-documents/CG00053_Handbook_CellPreparation_SingleCellProtocols_Rev_D.pdf.
Fleming, S. J. et al. Unsupervised removal of systematic background noise from droplet-based single-cell experiments using CellBender. Nat. Methods 20, 1323–1335 (2023).
Wolock, S. L., Lopez, R. & Klein, A. M. Scrublet: computational identification of cell doublets in single-cell transcriptomic data. Cell Syst. 8, 281–291.e9 (2019).
Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
DeTomaso, D. et al. Functional interpretation of single cell similarity maps. Nat. Commun. 10, 4376 (2019).
Ciernik, L., Kraft, A., Barkmann, F., Yates, J. & Boeva, V. ANS: adjusted neighborhood scoring to improve assessment of gene signatures in single-cell RNA-seq data. Preprint at bioRxiv https://doi.org/10.1101/2023.09.20.558114 (2023).
Badia-I-Mompel, P. et al. decoupleR: ensemble of computational methods to infer biological activities from omics data. Bioinform. Adv. 2, vbac016 (2022).
Squair, J. W. et al. Confronting false discoveries in single-cell differential expression. Nat. Commun. 12, 5692 (2021).
Muzellec, B., Teleńczuk, M., Cabeli, V. & Andreux, M. PyDESeq2: a python package for bulk RNA-seq differential expression analysis. Bioinformatics 39, btad547 (2023).
Love, M., Anders, S. & Huber, W. Differential analysis of count data–the DESeq2 package. Genome Biol. 15, 550 (2014).
Arun, R., Suresh, V., Veni Madhavan, C. E. & Narasimha Murthy, M. N. On finding the natural number of topics with latent Dirichlet allocation: some observations. In Proc. 14th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining 391–402 (Springer, 2010).
Cao, J., Xia, T., Li, J., Zhang, Y. & Tang, S. A density-based method for adaptive LDA model selection. Neurocomputing 72, 1775–1781 (2009).
Mimno, D., Wallach, H. M., Talley, E., Leenders, M. & McCallum, A. Optimizing semantic coherence in topic models. In Proc. Conference on Empirical Methods in Natural Language Processing 262–272 (ACM, 2011).
Otsu, N. A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9, 62–66 (1979).
Fu, J. Cellular reprogramming during anti-PD-1 and chemotherapy treatment in early-stage primary hormone receptor-positive breast cancer. Zenodo https://doi.org/10.5281/ZENODO.16912125 (2025).
Acknowledgements
This study was supported by the NIH/NCI, the Monell Foundation Fund, and the Susan G Komen leadership grant (SAC21204 to E.A.M.) for data sequencing and correlative analysis, and by funding from Merck for the overall trial. The authors acknowledge Jayda Cavanaugh, Mark D. Connell, Nina Listopadzki, Aurelia Reynolds, and Eric Van Baak for their help with single-cell sequencing, and Valerie Hope Goldstein for editorial assistance. E.A.M. acknowledges the Rob and Karen Hale Distinguished Chair in Surgical Oncology. J.P. acknowledges salary support from NIH grant R50CA265182 outside of this project.
Author information
Contributions
D.L.A., S.M.T., and E.M.V.A. served as supervising senior authors on this manuscript. Study design: J.F., A.G.W., S.M.T., and E.M.V.A. Data gathering and sequencing: J.F., A.G.W., E.P., T.K., E.S., S.V., M.B., A.N., A.R.T., J.P., M.D., E.W., R.J., and E.A.M. Data analysis: J.F., A.G.W., B.T., K.B., S.C., T.P., D.L.A., S.M.T., and E.M.V.A. Manuscript writing: J.F. and A.G.W. Critical review of the manuscript: all authors.
Ethics declarations
Competing interests
AGW reports consulting or advisory roles for AstraZeneca and AMBRX; speaker’s honoraria from AstraZeneca; and research support (to institution) from Genentech, Gilead, Macrogenics, and Merck. RJ reports consulting or advisory roles for Eli Lilly, AstraZeneca, Pfizer, Novartis, Carrick Therapeutics, GE Health, and Luminex; and research funding from Pfizer, Eli Lilly, and Novartis. EAM reports compensated service on scientific advisory boards for AstraZeneca, BioNTech, and Merck; uncompensated service on steering committees for Bristol Myers Squibb and Roche/Genentech; speakers' honoraria and travel support from Merck Sharp & Dohme; and institutional research support from Roche/Genentech (via SU2C grant) and Gilead. EAM also reports research funding from Susan Komen for the Cure, for which she serves as a Scientific Advisor, and uncompensated participation as a member of the American Society of Clinical Oncology Board of Directors. SMT reports consulting or advisory roles for Novartis, Pfizer/SeaGen, Merck, Eli Lilly, AstraZeneca, Genentech/Roche, Eisai, Bristol Myers Squibb/Systimmune, Daiichi Sankyo, Gilead, Blueprint Medicines, Reveal Genomics, Sumitovant Biopharma, Artios Pharma, Menarini/Stemline, Aadi Bio, Bayer, Jazz Pharmaceuticals, Natera, Tango Therapeutics, eFFECTOR, Hengrui USA, Cullinan Oncology, Circle Pharma, Arvinas, BioNTech, Launch Therapeutics, Zuellig Pharma, Johnson&Johnson/Ambrx, Bicycle Therapeutics, BeiGene Therapeutics, Mersana, Summit Therapeutics, Avenzo Therapeutics, Aktis Oncology, Celcuity, Boehringer Ingelheim, Samsung Bioepis, Olema Pharmaceuticals, Tempus, and Boundless Bio; research funding from Genentech/Roche, Merck, Exelixis, Pfizer, Lilly, Novartis, Bristol Myers Squibb, AstraZeneca, NanoString Technologies, Gilead, SeaGen, OncoPep, Daiichi Sankyo, Menarini/Stemline, Jazz Pharmaceuticals, and Olema Pharmaceuticals; and travel support from Lilly, Gilead, Jazz Pharmaceuticals, Pfizer, Arvinas, and Roche. 
EMVA reports advisory or consulting roles for Enara Bio, Manifold Bio, Monte Rosa, Novartis Institute for Biomedical Research, Serinus Bio, and TracerDx; research funding from Novartis, BMS, Sanofi, and NextPoint; equity in Tango Therapeutics, Genome Medical, Genomic Life, Enara Bio, Manifold Bio, Microsoft, Monte Rosa, Riva Therapeutics, Serinus Bio, Syapse, and TracerDx; institutional patents filed on chromatin mutations and immunotherapy response, and methods for clinical interpretation; intermittent legal consulting on patents for Foley Hoag; and editorial board service for Science Advances. TEK reports becoming an employee at Merck after her contributions to this manuscript. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Gong Tang, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access: This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Fu, J., Waks, A.G., Pimenta, E. et al. Cellular reprogramming during anti-PD-1 and chemotherapy treatment in early-stage primary hormone receptor-positive breast cancer. Nat Commun 16, 10704 (2025). https://doi.org/10.1038/s41467-025-66659-y
DOI: https://doi.org/10.1038/s41467-025-66659-y