Quantification of stromal tumor-infiltrating lymphocytes (sTILs) on hematoxylin and eosin (H&E)-stained slides using established approaches1 is an effective surrogate for host anti-tumor immunity2. sTILs are prognostic and associated with response to neoadjuvant chemotherapy (NAC) in primary triple negative breast cancer (TNBC) and HER2+ breast cancer3,4,5,6,7,8,9, yet their role in metastatic breast cancer (MBC) is less well defined. Microtubule-targeting agents are a mainstay of MBC treatment. Cancer and Leukemia Group B (CALGB) 40502 (Alliance) was a randomized phase III trial of 799 patients treated in the first-line setting that compared microtubule-targeting chemotherapies (nab-paclitaxel, ixabepilone, or paclitaxel), with or without bevacizumab10. In this non-protocol-specified analysis, we hypothesized that sTILs quantity is significantly associated with progression-free survival (PFS) in MBC patients receiving first-line chemotherapy. In the primary analyses, sTILs were evaluated as <5% (low) versus ≥5% (high), the cut point used in the few prior analyses in MBC11,12,13,14, with sensitivity analyses incorporating sTILs as a continuous variable.

Stromal TILs are lower in distant metastatic sites

The sTILs distribution was skewed toward low values, with 373/582 (64.1%) with sTILs <5% and 155/582 (26.6%) with sTILs ≥5% (Fig. 1B). 68/582 (11.7%) samples demonstrated sTILs ≥50%, frequently considered ‘lymphocyte predominant’. Overall, samples from primary sites had higher average sTILs relative to locoregional recurrence or metastatic sites (mean 13.3% vs. 8.4%, respectively; Wilcoxon–Mann–Whitney p = 3e-4; Fig. 1C), and distant metastatic sites had the lowest mean sTILs (primary breast 12.7%, primary LN 22.9%, LRR 13.4%, distant metastasis 6.5%; Kruskal–Wallis p = 1e-5; Fig. 1D). Among distant metastatic sites, mean sTILs ranged from 1.3% in bone (least) to 9.5% in lung (greatest non-LN) and 20.4% in LN (Kruskal–Wallis p = 1.4e-5; Fig. 1E). Of note, TIL assessment in LN tissue is less reliable because it is affected by the presence of native nodal lymphocytes.

Fig. 1: Study design and descriptive analyses of stromal TILs in CALGB 40502.

A CONSORT diagram. B Distribution of stromal TILs across all evaluable slides (n = 582). C Stromal TILs in combined primary samples (primary breast or lymph node) versus combined recurrent/metastatic samples (locoregional recurrence or distant metastatic sites). D Stromal TILs in primary breast versus primary lymph node versus locoregional recurrence versus distant metastatic sites. E Stromal TILs in distant metastatic sites. For C–E, the box indicates the 25th–75th percentile with the median as center line, and whiskers indicate the 95th percentile above/below the median. The mean value is indicated below each variable. F Stromal TILs in paired primary breast cancer:metastasis or locoregional recurrence (LRR) within individual patients.

Evaluable slides from both the primary tumor and an LRR or metastasis within the same patient were available for 100 unique patients. Among paired samples, primary tumors had significantly higher sTILs (mean 10.5%) relative to LRR or distant metastasis (mean 7.7%; Wilcoxon signed-rank p = 0.008; Fig. 1F). Sensitivity analyses evaluating only primary:distant metastasis pairs showed a difference in the same direction that remained statistically significant (mean sTILs primary 7.6% vs. distant metastasis 6.3%; p = 0.011; data not shown).

Stromal TILs and association with survival outcomes

For all survival analyses, one slide per individual, the slide with the highest sTILs value, was included (recurrent or metastatic site: 77/443, 17.4%; primary tumor: 366/443, 82.6%). Most patients had maximum sTILs <5% (259/443; 58.5%), with no significant difference in distribution by treatment arm, age, race, BMI, or inferred menopausal status (age > 55) (Table 1). There was a significantly greater proportion of high sTILs (≥5%) among hormone receptor-negative patients (73/114; 64.0%) than among hormone receptor-positive patients (111/328; 33.8%). sTILs also varied by DFI (time from completion of adjuvant therapy to metastatic diagnosis): low sTILs (<5%) were found in a greater proportion of patients with de novo MBC (35/51; 68.6%) and with DFI < 1 year (58/85; 68.2%). Covariate data were missing for one individual.
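The per-patient selection and dichotomization described above can be sketched as follows. This is a minimal illustration, assuming slide-level data arrive as (patient ID, sTILs percent) pairs; the function names and the example values are hypothetical and are not taken from the study's analysis code.

```python
from typing import Dict, Iterable, Tuple

# Cut point stated in the text: sTILs < 5% is "low", >= 5% is "high".
THRESHOLD_PCT = 5.0


def max_stils_per_patient(slides: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Keep, for each patient, the highest sTILs value across that patient's slides."""
    best: Dict[str, float] = {}
    for patient_id, stils_pct in slides:
        if patient_id not in best or stils_pct > best[patient_id]:
            best[patient_id] = stils_pct
    return best


def classify(stils_pct: float) -> str:
    """Dichotomize a sTILs percentage at the 5% cut point."""
    return "high" if stils_pct >= THRESHOLD_PCT else "low"


# Hypothetical slide-level values (illustration only, not study data).
slides = [("P1", 1.0), ("P1", 10.0), ("P2", 2.0), ("P3", 5.0), ("P3", 0.0)]
groups = {pid: classify(value) for pid, value in max_stils_per_patient(slides).items()}
print(groups)
```

Because the maximum is taken before dichotomizing, a patient with any slide at or above 5% is classified as high, which matches the "highest sTILs value" rule stated above.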

Table 1 Cohort characteristics

For the primary objective, in Cox proportional hazards models controlling for treatment arm, low sTILs (versus high) were significantly associated with worse PFS (HR 1.34; 95% CI 1.1–1.63, p = 0.004) and OS (HR 1.32; 95% CI 1.07–1.63, p = 0.009; Table 2). When controlling for both treatment arm and hormone receptor status, the association of low versus high sTILs showed similar trends but did not reach statistical significance for PFS (HR 1.2; 95% CI 0.97–1.47, p = 0.09) or OS (HR 1.14; 95% CI 0.91–1.43, p = 0.2; Table 2). When controlling for treatment arm, hormone receptor status, and clinicopathologic variables, the association was in the same direction but further attenuated for PFS (HR 1.17; 95% CI 0.94–1.44, p = 0.20) and OS (HR 1.06; 95% CI 0.84–1.34, p = 0.6; Table 2). There was no significant interaction between sTILs and treatment arm (all p-interaction >0.05). Sensitivity analyses restricted to breast primary samples (sTILs as low vs. high and as a continuous variable) and, separately, to distant metastases, controlling for treatment arm alone or for both treatment arm and hormone receptor status, showed trends similar to the primary analyses but did not reach significance for PFS or OS (Supplementary Tables 1–3).
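As a reading aid, the relationship between a hazard ratio, its 95% confidence interval, and a two-sided Wald p-value can be sketched as below. This is a back-calculation under a normal approximation on the log-hazard scale, not the study's analysis; the function name is hypothetical, and because the published HR and CI are rounded, the result only approximates a reported p-value.

```python
import math


def wald_p_from_hr_ci(hr: float, lower: float, upper: float,
                      z_crit: float = 1.959964) -> tuple:
    """Approximate the Wald z statistic and two-sided p-value from an HR and its 95% CI.

    Assumes the CI is symmetric on the log scale: log(HR) +/- z_crit * SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# PFS, adjusted for treatment arm only (values as reported in the text).
z, p = wald_p_from_hr_ci(1.34, 1.10, 1.63)
print(f"z = {z:.2f}, p = {p:.4f}")
```

For these inputs the approximation lands close to the reported p = 0.004; small discrepancies are expected from rounding of the published estimates.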

Table 2 Association of sTILs and progression-free and overall survival

To our knowledge, this is the first study to examine the association of sTILs with outcomes in MBC among patients receiving chemotherapy. Currently, there is no reliable biomarker of immune activation that is applied clinically across centers; however, sTILs can readily be determined from a routine H&E slide or image and, for TNBC and HER2+ breast cancers, appear to be consistently associated with response to NAC and prognosis among primary breast cancers. By evaluating primary:metastasis pairs from the same patient, we observed a significant decline in average sTILs from primary tumor to metastasis. This difference varied from patient to patient and could also be affected by the site of metastasis evaluated, given the differences in sTILs by site, as has been seen in other studies15,16.

In this study, we found that sTILs were associated with PFS in the metastatic setting, but the association was attenuated after controlling for HR status. This attenuation in part reflects variations in sTILs between HR-negative and HR-positive tumors. Immune checkpoint inhibitor therapy is rational in mTNBC patients with pre-existing TILs12,13, yet the clinical challenge remains how to increase TILs and anti-tumor immunity in the metastatic setting. Given the extensive literature supporting an association of sTILs with outcome and/or treatment response in primary3,4,5,6,7,8,9 and metastatic12,13,14 settings, a rational next step is the design of prospective trials incorporating sTILs in patient stratification or treatment determination.

A limitation of this study is that sTILs were enumerated on a mix of primary tumors, primary LN, LRR, and distant metastases, which adds heterogeneity to the evaluable data. Given the significantly lower sTILs in LRR/metastatic specimens compared to primary specimens, dedicated analyses of sTILs and PFS using only distant metastases in HR-negative MBC patients are needed.

In conclusion, immune activation measured by sTILs is significantly lower in metastatic than primary breast cancers and varies by metastatic site. We demonstrated that sTILs were associated with PFS and OS in chemotherapy-treated MBC, but the association of sTILs with outcome did not persist after controlling for hormone receptor status.

Methods

Study population

Clinical data were locked as of March 31, 2021. Of 788 patients receiving treatment on CALGB 40502, 690 H&E slides from 484 unique patients (62.7%) were submitted (Fig. 1A). Relative to the overall intention-to-treat population, patients with available tissue were balanced across arms and baseline characteristics (Table 1). None of the analyzed patients had HER2-positive breast cancer, which comprised only 2% of the CALGB 40502 study population. Protocol-specific written informed consent was obtained from participants, and the protocol was approved by the National Cancer Institute’s Institutional Review Board. The informed consent document complies with federal and institutional guidelines, including the Declaration of Helsinki, for collection and use of data and samples. Overall, 582 slides were evaluable from 443 unique patients: 390 primary breast cancer, 26 primary lymph node (LN), 45 locoregional recurrence (LRR), and 121 distant metastasis (Fig. 1A).

Tumor-infiltrating lymphocyte enumeration

TILs were enumerated in accordance with International TILs Working Group methods1,17. Briefly, one section (4–5 μm, magnification ×200–400) per sample was evaluated and TILs were reported for the stromal compartment (percent stromal TILs by study pathologist, RS). TILs were evaluated within the borders of the invasive tumor on full sections (preferred) or core biopsies. All analyses in this study used stromal TILs as the predefined TIL biomarker. As noted above, for the primary endpoint, sTILs were evaluated as <5% (low) versus ≥5% (high) based on the few existing studies11,12,13,14, with sensitivity analyses incorporating sTILs as a continuous variable.

Statistical analyses

The association between sTILs (low/high) and baseline characteristics of patients with evaluable sTILs was examined using two-sample t-tests or rank sum tests for continuous variables and chi-squared or Fisher’s exact tests for categorical variables. Cox regression models were fit in the TIL-evaluable cohort for the endpoints of PFS and OS to test the prognostic value of TILs, adjusting for (1) treatment arm alone, (2) treatment arm and hormone receptor status, or (3) treatment arm, hormone receptor status, body mass index (BMI), race, age at diagnosis, and disease-free interval (DFI). Proportional hazards assumptions were verified with Schoenfeld residuals. Data collection and statistical analyses were conducted by the Alliance Statistics and Data Management Center.
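To make the categorical comparison concrete, the sketch below applies a Pearson chi-squared test to the 2×2 table of hormone receptor status by sTILs group, using the counts reported in the Results (73/114 hormone receptor-negative and 111/328 hormone receptor-positive patients with high sTILs). It is an illustration only: no continuity correction is applied, the function name is hypothetical, and whether the study used the chi-squared or Fisher's exact test for this comparison is not stated.

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-squared test (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the upper tail of chi-squared equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Rows: hormone receptor-negative, hormone receptor-positive.
# Columns: high sTILs (>= 5%), low sTILs (< 5%).
hr_neg_high, hr_neg_low = 73, 114 - 73
hr_pos_high, hr_pos_low = 111, 328 - 111

chi2, p = chi2_2x2(hr_neg_high, hr_neg_low, hr_pos_high, hr_pos_low)
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

The closed-form 2×2 statistic avoids any third-party dependency; for small expected cell counts, Fisher's exact test (also named above) would be the more appropriate choice.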