Introduction

Upper gastrointestinal tract cancers, such as gastric adenocarcinoma (GAC) and esophageal squamous cell carcinoma (ESCC), are among the seven most common cancers in the world1,2. In China, they are usually diagnosed at a late stage, which results in a poor prognosis. Current treatment guidelines recommend programmed cell death protein 1 (PD-1) inhibitors combined with chemotherapy as first-line treatment for GAC and ESCC3,4,5,6. However, after failure of first-line therapy, there are few options for second-line or later-line treatment, and outcomes remain unsatisfactory7,8.

For second-line chemotherapy with irinotecan or a taxane, with or without ramucirumab, the prognosis of advanced GAC or ESCC remains poor, with a median overall survival (OS) of 8–10 months9,10,11,12,13. In the third and subsequent lines, the overall response rate (ORR) drops to 3%, with a median OS of just 18.5 weeks14. Additionally, chemotherapy can lead to treatment-related pain, psychological distress, and adverse reactions. Nivolumab and pembrolizumab were approved for advanced GAC in the third-line or later-line setting based on the results of the ATTRACTION-2 and KEYNOTE-059 trials, which showed a median OS of 5.26 months and 5.6 months, respectively15,16. One subsequent randomized controlled trial of a PD-1 inhibitor, however, failed to show a survival benefit compared with standard chemotherapy in second-line treatment, with an ORR of 16% in patients with a PD-L1 combined positive score (CPS) ≥ 1 (ref. 17). For patients with advanced ESCC, the results of the KEYNOTE-181, ATTRACTION-3, and ESCORT trials provide robust evidence for the use of PD-1 inhibitors in the second-line setting18,19,20. However, an obvious limitation of PD-1 inhibitor monotherapy in advanced ESCC is its low ORR (around 20%). Therefore, given the poor performance status of patients with advanced GAC and ESCC and their limited tolerance of cytotoxic therapy, novel chemotherapy-free regimens for second-line or later-line treatment are urgently needed.

Recent findings underscore the complex interplay between vascular endothelial growth factor (VEGF)/VEGF receptor (VEGFR) signaling and immunosuppression in the tumor microenvironment (TME), suggesting a theoretical basis for the combined use of anti-angiogenic therapy and immune-checkpoint inhibitors (ICIs) such as anti-PD-1/PD-L1 antibodies. VEGF not only contributes to local and systemic immunosuppressive effects in cancer development but also inhibits the mobilization, trafficking, proliferation, and effector function of cytotoxic T lymphocytes (CTLs)21. Additionally, VEGF suppresses dendritic cell maturation and enhances the recruitment and proliferation of immunosuppressive cell subsets such as T regulatory cells (Tregs), myeloid-derived suppressor cells (MDSCs), and M2-type tumor-associated macrophages (M2-TAMs)22,23,24. This is further exacerbated by VEGF-induced abnormal angiogenesis, which leads to hypoxia in the TME and promotes the secretion of soluble chemotactic factors that create a highly immunosuppressive environment24. Therefore, strategies that target angiogenesis to normalize the vasculature could restore normal immune functions and reduce immunosuppression in the TME, thereby enhancing the efficacy of immunotherapies. Supporting this, preclinical evidence and subsequent clinical studies have demonstrated the superiority of combining anti-angiogenic agents with immunotherapy across various malignancies, showing not only efficacy but also favorable tolerability25,26,27,28. This indicates that simultaneously blocking angiogenesis and immune checkpoints could be an effective and safe approach for cancer treatment.

Anlotinib hydrochloride is a multi-target receptor tyrosine kinase inhibitor that targets VEGFR, fibroblast growth factor receptor (FGFR), platelet-derived growth factor receptor (PDGFR), and c-kit. Preclinical studies have shown that anlotinib hydrochloride can facilitate tumor vessel normalization, improve the tumor immune microenvironment, and promote antitumor immunity through immunomodulatory effects on both innate and adaptive immune cells29,30. Moreover, the combination of anlotinib with PD-1 blockade has shown favorable antitumor activity with tolerable toxicity across multiple types of cancer, including non-small cell lung cancer (NSCLC), soft-tissue sarcoma, epithelial ovarian cancer, and hepatocellular carcinoma31,32,33,34. However, the safety and efficacy of anlotinib combined with anti-PD-1 therapy in patients with advanced GAC and ESCC remain unknown.

In this work, we conducted a phase 2 trial to evaluate the antitumor activity and safety of anlotinib combined with nivolumab as second-line or later-line treatment in patients with GAC or ESCC. This decision was motivated by the shared therapeutic challenges in the advanced stages of these two cancers, including limited effective second- or later-line options and poor overall survival with existing therapies. Although the molecular and immunological profiles of GAC and ESCC differ, certain patients with either cancer respond to anti-PD-1 or anti-angiogenic therapy. This suggests the presence of common vulnerabilities that could be targeted by combining anti-PD-1 and TKI treatment. ESCC was included to explore the potential applicability of this combination therapy across different types of upper gastrointestinal tract cancers. We also explored candidate biomarkers to predict the efficacy of the combination therapy (anlotinib hydrochloride plus nivolumab) in patients with advanced GAC.

Results

Patient enrollment and characteristics at baseline

In the initial stage of the study, 4 patients achieved a PR, allowing the trial to progress to the second stage. Ultimately, 45 patients in the GAC cohort and 3 patients in the ESCC cohort were recruited at the Fudan University Shanghai Cancer Center from December 30, 2020, to September 30, 2022 (Fig. 1B). Efficacy and safety were evaluated in all patients (n = 48). At the data cutoff on 10 January 2023, the median follow-up time was 15.0 months (95% CI 9.88–20.12). Forty-two patients discontinued study treatment owing to disease progression (PD, n = 31), adverse events (AE, n = 6), death (n = 1), or consent withdrawal (n = 4), while 6 patients were still receiving the study regimen.

Fig. 1: The flow chart of this study.

A Schematic overview of the therapeutic regimen. The illustration was prepared by Figdraw. B Clinical trial profile. ORR Overall Response Rate; PFS Progression-Free Survival; OS Overall Survival; DCR Disease Control Rate; GAC Gastric Adenocarcinoma; ESCC Esophageal Squamous Cell Carcinoma; AEs Adverse Events.

Among all 48 patients enrolled, the median age was 59.0 years (range 30–72 years), with 36 (75%) male patients and 12 (25%) female patients. Thirty-one (64.6%) patients had received one line of treatment before enrollment, and 17 (35.4%) had received at least two lines. Detailed data are shown in Supplementary Table 1.

Therapeutic efficacy

All 48 patients (45 patients with GAC and 3 patients with ESCC) were evaluated for efficacy. The ORR was 29.2% (14/48, 95% CI, 17.0–44.1), and the DCR was 64.6% (31/48, 95% CI, 49.5–77.8). Two (4.2%) patients achieved a best response of complete response (CR) and 12 (25.0%) patients achieved a partial response (PR), while 17 (35.4%) had stable disease (SD) and 17 (35.4%) had progressive disease (PD), including 2 patients with hyper-PD (Table 1 and Fig. 2A–C). The median PFS and OS were 4.0 months (95% CI, 2.6–5.4) and 11.1 months (95% CI, 5.7–16.5), respectively (Table 1, Fig. 2D and Fig. 2E). For patients treated in the second line, the ORR was 32.3%, the DCR was 61.3%, and the median PFS and OS were 4 months and 14.5 months, respectively. For patients treated in the third line or later, the ORR, DCR, median PFS, and median OS were 23.5%, 70.6%, 3.8 months, and 7.6 months, respectively (Table 1). Efficacy in the GAC cohort is shown in Supplementary Table 2.
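The confidence intervals quoted for ORR and DCR can be recomputed from the raw counts. The text does not state which interval method was used, so the sketch below assumes the exact (Clopper–Pearson) binomial interval, a common choice for single-arm response rates; the function names are illustrative and not taken from the study's analysis code.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_increasing(g, lo: float = 0.0, hi: float = 1.0, iters: int = 60) -> float:
    """Bisection root of a function g that is increasing on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) interval for x responders out of n patients.

    Assumption: the paper's interval method is not stated; this is the
    Clopper-Pearson definition, solved numerically with the standard library.
    """
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _root_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2
    )
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _root_increasing(
        lambda p: alpha / 2 - binom_cdf(x, n, p)
    )
    return lower, upper


# Counts taken from the text: 14/48 responders (ORR), 31/48 with disease control (DCR).
for label, x in (("ORR", 14), ("DCR", 31)):
    lo, hi = clopper_pearson(x, 48)
    print(f"{label} {x}/48 = {100 * x / 48:.1f}% (95% CI {100 * lo:.1f}-{100 * hi:.1f})")
```

If the exact method was indeed used, the printed bounds should land close to the intervals reported above; an approximate method such as the Wilson interval would give somewhat narrower limits.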

Table 1 The response analysis and survival time of all patients
Fig. 2: Efficacy evaluation.

A Waterfall plot of the best percentage change in target lesion diameter from baseline within the efficacy-evaluable population (n = 48). Tumor measurements and response evaluations were performed according to irRECIST. B Spider plot of tumor burden changes from baseline within the efficacy-evaluable population (n = 48). C Swimmer plot of treatment durations and survival (n = 48). Each bar represents one patient. D Kaplan–Meier curve of progression-free survival (PFS) in the full analysis set of the 48 patients. The median PFS was 4.0 months. Shaded areas represent 95% confidence intervals. E Kaplan–Meier curve of overall survival (OS) in the full analysis set of the 48 patients. The median OS was 11.1 months. Shaded areas represent 95% confidence intervals. CR Complete Response; PR Partial Response; SD Stable Disease; PD Progressive Disease.

In exploratory subgroup analyses, the ORR and PFS were broadly consistent across subgroups, including patients with an ECOG performance status (ECOG PS) of 0, patients with intestinal-type GAC, and those with a CPS ≥1 (Supplementary Fig. 1 and Supplementary Fig. 2). Notably, patients with an ECOG PS of 0, intestinal-type GAC, or a CPS ≥5 had a longer OS (Supplementary Fig. 3).

Safety

Among the 48 patients in this study, 46 (95.8%) had treatment-related adverse events (TRAEs). TRAEs with an incidence of more than 10% were hypertension (n = 8), hypothyroidism (n = 9), liver dysfunction (n = 6), and rash (n = 6). Grade 3–4 TRAEs were hypothyroidism (n = 1), liver dysfunction (n = 1), palmar-plantar erythrodysesthesia (n = 1), rash (n = 1), hypophysitis (n = 1), and esophageal-tracheal fistula (n = 1). Three patients with peritoneal metastasis developed intestinal obstruction, and 1 patient developed intestinal perforation. TRAEs of hypertension (n = 8), thrombocytopenia (n = 2), and proteinuria (n = 1) were mainly associated with anlotinib hydrochloride. Details are provided in Table 2.

Table 2 The treatment-related adverse events (TRAEs) of 48 patients

Comprehensive genomic profiling (tissue and ctDNA)

Comprehensive genomic profiling was predefined in the study design. Genomic analysis was performed on peripheral blood serum samples from 95.6% (43/45) of patients with GAC and on tumor tissue samples from 73.3% (33/45) of patients with GAC in this study. As shown in Fig. 3A and B, the most frequently altered genes in ctDNA at baseline, in both responders (CR + PR) and non-responders (SD + PD), were TP53, NCOR2, LRP1B, and MUC16. Additionally, TP53, LRP1B, and MUC16 mutations showed good concordance between tissue and ctDNA (Supplementary Fig. 4). To study the relationship between pretreatment ctDNA levels and survival outcomes, a 5% variant allele frequency (VAF) cut-off was used. Notably, pretreatment mean VAF was significantly associated with OS (Fig. 3E). However, pretreatment VAF did not correlate with ORR or PFS (Fig. 3C and D). Interestingly, the VAF of ctDNA tended to increase at disease progression compared with baseline (n = 5) (Fig. 3F). Taken together, the results suggest that pretreatment ctDNA VAF may be a prognostic factor for survival with the combined treatment of anlotinib plus nivolumab.
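The VAF-stratified survival comparison described above can be sketched as follows: patients are split at a mean VAF of 5% (≥5% versus <5%, as in Fig. 3D and E), and the two groups are compared with a two-sided log-rank test. This is a minimal standard-library illustration, not the study's analysis code. The record fields (`mean_vaf`, `months`, `event`), the encoding of VAF as a fraction, and the toy cohort are all hypothetical; the toy values are synthetic and are not trial data.

```python
from math import erfc, sqrt

VAF_CUTOFF = 0.05  # 5% mean-VAF threshold from the text, expressed as a fraction (assumption)


def split_by_vaf(patients, cutoff=VAF_CUTOFF):
    """Split patient records into (high, low) groups: >= cutoff versus < cutoff."""
    high = [pt for pt in patients if pt["mean_vaf"] >= cutoff]
    low = [pt for pt in patients if pt["mean_vaf"] < cutoff]
    return high, low


def logrank(group_a, group_b):
    """Two-sided log-rank test for two lists of (time, event) pairs.

    event is 1 for an observed death and 0 for censoring.
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    pooled = [(t, e, 0) for t, e in group_a] + [(t, e, 1) for t, e in group_b]
    event_times = sorted({t for t, e, _ in pooled if e})
    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        # Numbers at risk just before time t in each group.
        n_a = sum(1 for ti, _, g in pooled if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in pooled if ti >= t and g == 1)
        # Events occurring exactly at time t in each group.
        d_a = sum(1 for ti, e, g in pooled if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in pooled if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp**2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Synthetic toy cohort, for illustration only (NOT data from the trial).
toy_cohort = [
    {"mean_vaf": 0.12, "months": 3.0, "event": 1},
    {"mean_vaf": 0.08, "months": 5.5, "event": 1},
    {"mean_vaf": 0.06, "months": 7.0, "event": 1},
    {"mean_vaf": 0.03, "months": 9.0, "event": 1},
    {"mean_vaf": 0.01, "months": 14.0, "event": 0},
    {"mean_vaf": 0.02, "months": 16.5, "event": 1},
]

high, low = split_by_vaf(toy_cohort)
chi2, p_value = logrank(
    [(pt["months"], pt["event"]) for pt in high],
    [(pt["months"], pt["event"]) for pt in low],
)
print(f"high-VAF n = {len(high)}, low-VAF n = {len(low)}, chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

In practice a survival library would also provide Kaplan–Meier curves and confidence bands; the hand-written test here only makes the stratify-then-compare logic explicit.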

Fig. 3: The relationship between mutational landscape of ctDNA and therapeutic efficacy.

A Overall frequency of top 40 gene alterations at baseline (n = 43). B Overall frequency of top 40 gene alterations between responders (n = 12) and non-responders (n = 31). C Pretreatment mean-VAF stratified by objective response (non-responders; n = 31 vs. responders; n = 12). Differences between groups were evaluated using the Wilcoxon rank-sum test. D Kaplan–Meier analysis of PFS in patients stratified by ≥5% (n = 15) versus <5% (n = 28) pretreatment mean-VAF. Groups were compared using the log-rank test. E Kaplan–Meier analysis of OS in patients stratified by ≥5% (n = 15) versus <5% (n = 28) pretreatment mean-VAF. Groups were compared using the log-rank test. F Changes of mean-VAF levels before treatment and after treatment progression (n = 5 individuals). A paired t-test was used to compare the VAF levels before and after treatment progression. ns not significant; R responders; NR non-responders; PFS Progression-Free Survival; OS Overall Survival; VAF variant allele frequencies. All tests were two-sided for statistical evaluation. No adjustments for multiple comparisons were made. Each box plot shows the median as the central line, with the box boundaries representing the first and third quartiles (25th and 75th percentiles), and the whiskers indicating the range from minimum to maximum values. Source data are provided as a Source Data file.

We further stratified the enrolled patients with GAC by their previous treatment regimens into three categories: those who had previously received fluoropyrimidine-based and taxane-based regimens (n = 18), those treated with platinum-based and fluoropyrimidine-based regimens (n = 18), and those who had received fluoropyrimidine-based, taxane-based, and platinum-based regimens (n = 7). Pretreatment VAF levels and survival outcomes, including PFS and OS, did not differ significantly across these treatment groups (Supplementary Fig. 5A–C). Additionally, using a VAF cutoff of 5%, there were no notable survival differences between patients with high and low VAFs within each treatment group (Supplementary Fig. 5D). The absence of significant differences may be due to the small sample size, which could hinder the detection of meaningful differences.

Plasma proteomics profiling by Olink technology

The plasma proteomics analysis was performed as a post-hoc exploratory analysis. Olink proteomic profiling of key immune and inflammation pathways was performed in 64.4% (29/45) of patients with GAC to compare responders with non-responders and to find possible pretreatment biomarkers that can predict the efficacy of nivolumab plus anlotinib hydrochloride. Compared with non-responders, responders had lower expression levels of IL-6, IL-8, CXCL5, CXCL1, CSF1, CD4, PD-L1, CCL20, MMP7, PTN, MUC16, and ANGPT2 (Fig. 4A and B). Survival analysis was performed to identify associations between these proteins and survival time, which revealed that IL-6, CSF1, MMP7, and chemokines including IL-8, CXCL5, CXCL1, and CCL20 were associated with poorer survival (Fig. 4C). We further compared serum protein levels at baseline and at disease progression, which showed an increasing trend for IL-6, IL-8, MUC16, CXCL1, and CSF-1 at disease progression (Fig. 4D). Cross-referencing MUC16 levels identified in the plasma proteomics with those detected in genomic profiling revealed a correlation across platforms, highlighting its potential as a multi-faceted indicator in our therapeutic assessment. Together, these results indicate that the clinical efficacy of this combined treatment in patients with GAC may be attributed to complex systemic immune responses, a hypothesis that needs to be confirmed by further research.

Fig. 4: The serum immune proteomics are associated with prognosis and treatment efficacy.

A, B Comparison of pretreatment serum protein levels in responders and non-responders (responder n = 7; non-responder n = 22). C Association of the expression levels of selected serum proteins with survival time (n = 29). D Changes of serum protein expression levels before and after disease progression (n = 5 individuals). For paired samples, a paired t-test was used if the differences between the two samples met normality; otherwise, the Wilcoxon signed-rank test was applied. All tests were two-sided for statistical evaluation. No adjustments for multiple comparisons were made. Each box plot shows the median as the central line, with the box boundaries representing the first and third quartiles (25th and 75th percentiles), and the whiskers indicating the range from minimum to maximum values. Source data are provided as a Source Data file.

Multiplex immunofluorescence

Multiplex immunofluorescence was also conducted as a post-hoc exploratory analysis. To examine the biological significance of the differentially expressed proteins, we explored their relationship with tumor-infiltrating immune cells, which are crucial players in anti-tumor immunity. Our aim was to elucidate the mechanisms behind the combined therapy in GAC. The results revealed significant correlations between CD8+ T cells, macrophages, and the majority of differentially expressed proteins, including IL-6, IL-8, CXCL5, CXCL1, CSF1, CD4, PD-L1, MUC16, and ANGPT2 (n = 27; Fig. 5A and B). This emphasizes the pivotal role of CD8+ T cells and macrophages in the context of the combined treatment. The expression patterns of MUC16 observed in the genomic profiling and plasma proteomics showed consistent trends here, contributing to the understanding of its role in modulating immune responses and patient outcomes. Prompted by these findings, we used multiplex immunofluorescence to assess the abundance of cytotoxic CD8+ T cells and CD68+ macrophages in tumor tissues of patients with GAC. The proportions of CD68+PD-L1+ macrophages, CD68+PD-1 macrophages, CD68+PD-L1+PD-1+ macrophages, and CD8+PD-L1+ T cells in the stroma region were substantially higher in responders than in non-responders (Supplementary Fig. 6). Conversely, responders had lower levels of CD68+PD-L1 T cells in the tumor region than non-responders (Supplementary Fig. 7). When the tumor and stroma regions were considered together, the levels of CD68+PD-L1+ macrophages, CD68+PD-L1+PD-1+ macrophages, and CD8+PD-L1+ T cells were all elevated in responders (Fig. 5C). Importantly, the presence of CD68+PD-L1+PD-1+ macrophages was associated with improved survival time, suggesting a potential contribution of these macrophages to the response to the combined regimen (Fig. 5D and E).

Fig. 5: The association of changes in the composition of the infiltrating immune cells with prognosis and treatment efficacy.

A Correlation heatmap of differentially expressed proteins and immune cells. Spearman correlation test was conducted, with statistical significance determined by the Benjamini-Hochberg procedure. B Representative images of multiplex immunofluorescence from responders and non-responders. The staining of different protein markers is indicated by different colors. C Comparison of indicated immune cells abundance in total region of responders (n = 7) and non-responders (n = 20). Differences between groups were evaluated using the Wilcoxon rank-sum test. D Kaplan–Meier plots of PFS stratified by above or below the median percentage for total CD68+PD-L1+PD-1+ cells. E Kaplan–Meier plots stratified by total CD68+PD-L1+PD-1+ for OS. *, P < 0.05; **, P < 0.01; NS not significant; R responders; NR non-responders; PFS Progression-Free Survival; OS Overall Survival. All tests were two-sided for statistical evaluation. No adjustments for multiple comparisons were made. Each box plot shows the median as the central line, with the box boundaries representing the first and third quartiles (25th and 75th percentiles), and the whiskers indicating the range from minimum to maximum values. Source data and exact p values are provided as a Source Data file.

Gut microbiome analysis

Gut microbiome analysis was predefined in the study design. To investigate the potential relationship between the gut microbiome and treatment efficacy, we conducted metagenomic sequencing on the baseline fecal samples from 40 patients with GAC. Differences in bacterial diversity between the groups were assessed using α-diversity. The α-diversity analysis revealed a higher number of observed species in the baseline gut microbiome of responders, although the difference between the groups did not reach statistical significance (Fig. 6A). Subsequently, principal coordinates analysis (PCoA) was conducted to assess the overall gut microbial structure at the genus level. Likewise, no discernible difference in microbiome composition between responders and non-responders was observed (Fig. 6B). To further characterize the taxonomic profile of differentially abundant gut microbes, MetaPhlAn3 annotation was used to evaluate microbial abundance at the species level across groups. In total, fifteen taxa at the species level were identified (Fig. 6C). At the genus level, the feces of the responder group showed features of gut normobiosis. For instance, several well-known commensals, such as Faecalibacterium and Parabacteroides, were increased in the feces of responders (Fig. 6D). Among the differentially abundant taxa, seven species (Pyramidobacter_piscolens, Aggregatibacter_segnis, Anaerotruncus_colihominis, Intestinimonas_butyriciproducens, Bacteroides_xylanisolvens, Bacteroides_fragilis, Bacteroides_sp_CAG_443) were enriched in non-responders (Fig. 6E).

Fig. 6: Gut microbiome analysis of baseline fecal samples.

A The Observed species of gut microbiota between responders (n = 12) and non-responders (n = 28). B Principal component analysis plot based on Metagenome sequencing of the fecal content from responders (n = 12) and non-responders (n = 28). C Component proportion of bacterial phylum in each group. D Comparison of relative abundance of bacterial genus level between responders (n = 12) and non-responders (n = 28). E Heat map of differentially abundant taxa in intestinal type (n = 16) vs non-intestinal type (n = 21), responders (n = 10) vs non-responders (n = 27), PFS ≥ 4 (n = 19) vs <4 months (n = 18). R responders; NR non-responders; PFS Progression-Free Survival; OS Overall Survival. All tests were two-sided for statistical evaluation. No adjustments for multiple comparisons were made. Each box plot shows the median as the central line, with the box boundaries representing the first and third quartiles (25th and 75th percentiles), and the whiskers indicating the range from minimum to maximum values. Source data are provided as a Source Data file.

Discussion

In this phase 2 clinical trial, 48 Chinese patients (45 with GAC and 3 with ESCC) were treated with anlotinib hydrochloride combined with nivolumab as second-line or salvage therapy. Studies have shown that Asian patients often derive greater benefit from immunotherapeutic approaches such as immune checkpoint inhibitors, particularly in gastric and gastroesophageal junction cancers, than their non-Asian counterparts35. This enhanced response could be linked to specific molecular and clinical profiles prevalent among these populations. For example, distinct patterns in tumor immune signatures, including differences in T-cell gene expression, have been observed and may play a crucial role in modulating the effectiveness of treatments such as PD-1/PD-L1 inhibitors across geographic and ethnic groups36,37. The trial achieved an ORR of 29.2%, a DCR of 64.6%, a median PFS of 4.0 months, and a median OS of 11.1 months, along with a manageable toxicity profile. These results are consistent with previous studies showing that the combination of a VEGFR inhibitor and an anti-PD-1 antibody has synergistic anti-tumor activity38,39,40. Interestingly, anlotinib hydrochloride plus nivolumab yielded a higher ORR and longer PFS and OS in patients with an ECOG PS of 0, Lauren intestinal-type GAC, or a PD-L1 CPS of ≥5, which may act as potential prognostic factors for treatment outcomes. While cross-trial comparisons should be interpreted cautiously, the results of our study are encouraging.

The combination of VEGFR TKIs with checkpoint inhibitors has shown promising early efficacy signals in single-arm studies of various tumor types, including gastric, gastroesophageal junction, and esophageal cancers (GC/GEJC/E). A prior phase 2 study enrolled 29 patients with advanced GC for treatment with lenvatinib plus pembrolizumab in the first-line or second-line setting41. This study reported an ORR of 69%, with a median PFS of 7.1 months and OS not yet reached. The combination therapy had a favorable safety profile, with no grade 4 or 5 TRAEs reported and grade 3 events occurring in 48% of patients. Furthermore, the combination of apatinib with SHR-1210 (an anti-PD-1 antibody) was assessed in a phase Ia/Ib, multi-cohort trial involving patients with advanced gastric and hepatocellular cancer42. Of the 33 patients treated with the recommended phase II dose (RP2D), 60.6% experienced grade ≥3 TRAEs. In the GC/GEJC cohort, the ORR was 16% (4/25), with a DCR of 78%. The efficacy and tolerability of the apatinib and SHR-1210 combination were also shown in second-line therapy for advanced ESCC. Fifty-two patients were enrolled and included in the analyses; 34.6% of patients had a confirmed objective response, while 44% experienced grade 3 or worse TRAEs43. In the phase Ib CAMILLA trial, Saeed et al. evaluated cabozantinib plus durvalumab, primarily in advanced gastroesophageal cancer and other GI malignancies44. The overall incidence of grade 3–4 TRAEs was 34%. Specifically, in patients with G/GEJ/E adenocarcinoma (n = 10), the ORR and DCR were 30% and 80%, respectively, with mPFS and mOS of 4.6 and 7.2 months. By comparison, our study showed similar efficacy to these combination regimens but a lower incidence of grade 3–4 TRAEs (16.7%). Notably, our regimen did not lead to any life-threatening or fatal adverse events, indicating a controllable safety profile. A majority of patients (79.2%) experienced only grade 1–2 TRAEs.
The most frequently observed adverse events associated with anlotinib hydrochloride were hypertension, thrombocytopenia, and proteinuria, all of grade 1 or 2 severity. TRAEs that occurred in over 10% of participants also included hypothyroidism, liver dysfunction, and rash. These were similar to the safety profile observed in the phase Ib/II trial of anlotinib hydrochloride plus penpulimab in patients with unresectable hepatocellular carcinoma34. It is important to note that three patients with peritoneal metastasis experienced intestinal obstruction, possibly due to the peritoneal metastasis itself or as an effect of anlotinib hydrochloride treatment. Overall, our findings suggest that the nivolumab plus anlotinib hydrochloride regimen offers meaningful therapeutic effects with a manageable side-effect profile.

In the GAC cohort, 45 patients received anlotinib hydrochloride and nivolumab as second-line or later-line therapy. This regimen achieved an ORR of 28.9%, a DCR of 62.2%, a median PFS of 3.8 months, and a median OS of 10 months. Patients treated with anlotinib hydrochloride and nivolumab as second-line therapy in our study had an ORR of 32.1% and a median OS of 11.9 months, which were higher than the ORR (28%) and median OS (9.6 months) of patients treated with ramucirumab plus paclitaxel in the RAINBOW trial12. However, the median PFS of 3.0 months here was shorter than the 4.4 months in the RAINBOW trial, which may result from differences between the patient cohorts enrolled in the two trials and the relatively limited sample size of our study. In the third and later lines, our study showed an ORR of 23.5%, a median PFS of 3.8 months, and a median OS of 7.6 months. These results compare favorably with those of nivolumab in the ATTRACTION-2 study (ORR of 11.2%, median PFS of 1.61 months, and median OS of 5.26 months)15, as well as apatinib (ORR of 2.84%, median PFS of 2.6 months, and median OS of 6.5 months)45. Furthermore, to explore prognostic and predictive biomarkers of response to the combined regimen in patients with GAC, we incorporated an exploratory endpoint of tumor biomarker analysis including ctDNA, serum protein, immune infiltration, and microbiota composition.

In this study, genomic analysis in the majority of patients with GAC revealed a high frequency of mutations in genes such as TP53, NCOR2, LRP1B, and MUC16. Notably, these mutations showed good concordance between tissue samples and ctDNA, supporting the reliability of ctDNA as a reflection of the tumor’s genomic landscape46,47. This is further supported by our use of ctDNA to explore the relationship between the genomic landscape and treatment outcomes, highlighting the capability of ctDNA to capture the genetic diversity and dynamics of tumors during treatment. Research across various populations has validated that ctDNA profiling can effectively detect genomic alterations and is particularly valuable for monitoring treatment responses and disease progression. Although high ctDNA levels often indicate a poor prognosis, the specific variant allele frequency (VAF) cutoffs and their clinical relevance can vary48,49,50. We employed a 5% VAF cutoff to assess pretreatment ctDNA levels51. The significant association of pretreatment mean VAF with OS, but not with ORR or PFS, suggests that pretreatment ctDNA levels are indicative of survival outcomes but might not reliably predict treatment response or disease progression in the short term. Furthermore, the increase in mean VAF at disease progression is intriguing and aligns with similar trends in other studies52,53. This suggests that monitoring ctDNA levels over time could provide valuable insights into disease dynamics and potentially guide treatment decisions. Additionally, the influence of prior therapies on our results and future research directions cannot be overlooked. Our comparative analysis of treatment outcomes and variant allele frequencies among patients with different chemotherapy backgrounds suggests that previous treatments do not significantly alter the initial tumor genetic landscape as gauged by VAF.
This finding supports the notion that similar survival outcomes may be expected across different treatment groups, emphasizing the importance of considering historical treatment data when interpreting clinical results.

In light of the predominant focus on the tumor microenvironment (TME) in previous cancer immunology studies, our investigation aimed to assess the associations between serum immune proteomics and tumor-infiltrating immune cells in the TME. Notably, our analysis of peripheral blood samples revealed differential expression of key proteins such as IL-6, IL-8, CXCL5, CXCL1, CSF1, CD4, PD-L1, CCL20, MMP7, PTN, MUC16, and ANGPT2 between responders and non-responders. Survival analysis further supported the clinical relevance of these proteins: IL-6, CSF1, MMP7, and certain chemokines, such as IL-8, CXCL5, CXCL1, and CCL20, were associated with unfavorable survival outcomes. Interestingly, at disease progression, the elevation of serum proteins including IL-6, IL-8, MUC16, CXCL1, and CSF-1 pointed to a dynamic immune response, which may help in understanding the mechanisms underlying resistance to nivolumab plus anlotinib hydrochloride therapy in GAC. These findings are consistent with recent studies suggesting that the expression levels of these proteins may serve as indicators of an immune-suppressive TME that is less responsive to immune checkpoint blockade therapy54,55,56,57,58. Additionally, the enrichment of specific immune cell subsets, particularly cytotoxic CD8+ T cells and CD68+ macrophages, and the expression of immune checkpoints were investigated for their correlation with the efficacy of the combined treatment. Interestingly, the association of CD68+PD-L1+PD-1+ macrophages with improved survival was a promising finding, suggesting that these macrophages may be potential indicators of a positive therapeutic response.

Currently, pre-clinical and clinical studies have reported that the gut microbiota is a key player in the immune response against tumors and could affect the efficacy of immunotherapies, especially ICIs59,60,61. However, no direct association has been established between any particular fecal bacteria and treatment response in GAC. Nonetheless, subtle alterations in the composition of gut bacteria have been observed in individuals who responded positively to treatment. The normal gut bacterial balance observed in the feces of responders, with an increased abundance of the commensal genera Faecalibacterium and Parabacteroides, is particularly intriguing. Previous studies have reported the association of these genera with anti-inflammatory properties and a healthy gut environment62,63,64. The increased abundance of these genera in responders might suggest a role in modulating the immune environment to favor a positive treatment response, which deserves further exploration.

The study is limited by its single-arm design without a control group and by underpowered subgroup analyses due to the relatively small sample size. Additionally, the inclusion of patients with different histologies (45 with GAC and 3 with ESCC) treated in varied settings (second-line or salvage therapy) introduces significant heterogeneity. This diversity could notably influence both the efficacy outcomes and their interpretation. Because the study was conducted exclusively in patients with advanced gastric cancer in China, its findings highlight the need for further investigation into the efficacy of these treatments among non-Chinese populations, similar to the results reported for Asian versus non-Asian populations in the CheckMate-649 trial3. The limited number of ESCC cases, while not providing robust statistical insights, still yields preliminary data that are crucial for shaping future research and possibly broadening the applicability of our findings. Furthermore, the exploratory nature of our biomarker analyses, conducted solely within the GAC cohort, also reflects variability inherent in clinical research. These analyses were performed in different subsets of patients: genomic profiles from 43 patients, plasma proteomics from 29, immunofluorescence assessments dependent on available tissue samples, and gut microbiome analyses involving 40 patients. Constraints such as sample availability, quality, and patient consent impacted these analyses, while the COVID-19 pandemic further complicated sample collection and processing. These factors collectively introduce potential biases and limit the generalizability of our results, underscoring the need for careful interpretation. Despite these limitations, the study highlights the potential of anlotinib hydrochloride and nivolumab as a promising treatment option for advanced GAC and ESCC. The significance of ctDNA and serum immune proteomics, identified as key areas for further study, emphasizes their prospective impact on future therapeutic strategies, marking them as deserving of deeper investigation in the quest to refine and enhance cancer treatment.

In summary, the combination of Nivolumab and Anlotinib hydrochloride has shown notable efficacy and a manageable safety profile in patients with GAC who have failed first-line therapy. Initial results from a limited cohort of patients with ESCC indicate potential benefits, but further studies with larger and more representative samples are crucial to confirm efficacy and safety for this subgroup. Moreover, given the minimal adverse reactions and good tolerability, the safety and effectiveness of this regimen should be further explored in ‘real-world’ clinical practice settings, especially in patients with ECOG-PS 2. Additionally, tumor biomarkers were explored to identify patients with GAC who are responsive to this treatment approach. Future multi-center, large-scale studies are warranted to confirm our conclusions.

Methods

The study was approved by the Ethics Committee of Zhongshan Hospital Affiliated to Fudan University (approval number: B2020-066R). The study design adhered to all current regulations for the use of human study participants and was carried out in accordance with the criteria set by the Declaration of Helsinki. All patients provided written informed consent before participation and received financial compensation for blood and fecal sample collection. Patients were enrolled from December 30, 2020, to September 30, 2022.

Study design and population

The OASIS trial, conducted at Zhongshan Hospital Affiliated to Fudan University, is an open-label, single-center, prospective phase 2 clinical study evaluating the clinical activity and safety of Anlotinib hydrochloride combined with Nivolumab as second-line or later-line treatment in patients with unresectable or metastatic GAC or ESCC. The study was registered at https://clinicaltrials.gov/ on August 05, 2020, prior to patient enrollment (ClinicalTrials.gov identifier: NCT04503967). Enrolled patients were aged 18 to 75 years, had a measurable target lesion as defined by the Immune-related Response Evaluation Criteria In Solid Tumors (irRECIST), and had not received local treatment such as radiotherapy for the target lesion. Patients were also required to have an Eastern Cooperative Oncology Group performance status (ECOG PS) of ≤1 and adequate organ function. The detailed Study Protocol is available in the Supplementary Information.

Enrolled patients received intravenous Nivolumab 360 mg on day 1 plus oral Anlotinib hydrochloride 12 mg once daily on days 1–14, repeated every 3 weeks until disease progression, unacceptable toxicity, death, or withdrawal of consent. The study design diagram is shown in Fig. 1A. Tumor assessment was performed at baseline, every 6 weeks (±7 days) until week 48, and every 12 weeks (±7 days) thereafter according to irRECIST. Adverse events were graded according to the National Cancer Institute Common Terminology Criteria for Adverse Events (version 5.0).
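
The dosing and assessment timetable described above can be sketched in a few lines of Python. This is only an illustration: the function names and the follow-up horizon passed to `assessment_weeks` are assumptions introduced here, not protocol elements, and the ±7-day window is ignored.

```python
def assessment_weeks(last_week):
    """Nominal tumor-assessment weeks after baseline (week 0).

    Every 6 weeks up to and including week 48, then every 12 weeks.
    `last_week` is a hypothetical follow-up horizon, not a protocol value.
    """
    weeks = []
    week = 6
    while week <= last_week:
        weeks.append(week)
        # Switch from a 6-week to a 12-week interval once week 48 is reached.
        week += 6 if week < 48 else 12
    return weeks


def anlotinib_dosing_days(cycle):
    """Study days (day 1 = first dose) on which oral anlotinib is taken in a
    given 21-day cycle: days 1-14 of the cycle, followed by 7 days off."""
    start = (cycle - 1) * 21 + 1
    return list(range(start, start + 14))


if __name__ == "__main__":
    print(assessment_weeks(72))
    print(anlotinib_dosing_days(2))
```

Nivolumab is given on day 1 of each cycle, which corresponds to the first element returned by `anlotinib_dosing_days`.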

Endpoints

The primary endpoint of this study was objective response rate (ORR). The secondary endpoints were progression-free survival (PFS), overall survival (OS), disease control rate (DCR), and safety. All endpoints were analyzed locally by investigators. ORR was defined as the proportion of patients with a best objective response of complete response (CR) or partial response (PR), while DCR was defined as the proportion of patients with CR, PR, or stable disease (SD) according to irRECIST. Confirmations of CR and PR were conducted within 4 to 6 weeks. PFS was calculated from the date of study enrollment until disease progression or death due to any cause, whichever occurred first, and OS was calculated from study enrollment until the date of death or loss to follow-up.
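
As a minimal sketch of these endpoint definitions, the following Python computes ORR and DCR from a list of best overall responses and PFS from event dates. The function names and the example inputs are hypothetical, and censoring patients without an event at their last follow-up is an assumption made for illustration; the text above does not specify the censoring rule.

```python
from collections import Counter
from datetime import date


def response_rates(best_responses):
    """Return (ORR, DCR) from best overall responses coded CR/PR/SD/PD."""
    if not best_responses:
        raise ValueError("at least one evaluable patient is required")
    counts = Counter(best_responses)
    n = len(best_responses)
    orr = (counts["CR"] + counts["PR"]) / n                  # CR or PR
    dcr = (counts["CR"] + counts["PR"] + counts["SD"]) / n   # CR, PR, or SD
    return orr, dcr


def pfs_days(enrollment, progression=None, death=None, last_follow_up=None):
    """Return (days, event_observed) for progression-free survival.

    The event is progression or death from any cause, whichever comes first.
    Without an event, the patient is censored at `last_follow_up`
    (an assumption of this sketch).
    """
    events = [d for d in (progression, death) if d is not None]
    if events:
        return (min(events) - enrollment).days, True
    return (last_follow_up - enrollment).days, False


if __name__ == "__main__":
    # Hypothetical inputs, not study data.
    print(response_rates(["CR", "PR", "SD", "PD"]))
    print(pfs_days(date(2021, 1, 1), progression=date(2021, 3, 1)))
```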

Biomarker analyses

Because the data were derived predominantly from the GAC cohort, biomarker analyses were performed in patients with GAC. Due to the limited number of patients with ESCC (n = 3), extensive biomarker analyses were not feasible for this subgroup.

Baseline tumor biopsies from metastatic or recurrent lesions were obtained no more than 6 months prior to enrollment; if the most recent biopsy had been conducted more than 6 months prior to enrollment or if there was evidence of significant disease progression, a new biopsy was performed to ensure that the analysis reflected the current state of the tumor. Additionally, peripheral blood samples were obtained at both baseline and the time of definite disease progression. Fecal samples were also collected at baseline. Details of the genomic analysis of tissues and circulating tumor DNA (ctDNA), serum immune proteomics, multiplex immunofluorescence, and shotgun metagenomic sequencing are provided in the following sections.

Genomic analysis for tissues and ctDNA (circulating tumor DNA)

Genomic analysis was performed using Oseq, a clinical test designed to detect mutations, copy number alterations, and select gene fusions among 688 cancer-associated genes. Genomic DNA (100 ng) from formalin-fixed paraffin-embedded (FFPE) tissue and ctDNA (30 ng) from plasma samples were processed to generate bar-coded libraries according to the manufacturer’s instructions (Integrated DNA Technologies, Coralville, IA, USA). Exons were then captured using IDT custom-designed probes (Integrated DNA Technologies, Coralville, IA, USA). A control library was constructed using DNA from white blood cells to filter germline mutations. DNA sequencing was performed on the MGISEQ-2000 sequencing system (MGI, Shenzhen, China). Genomic alterations including base substitutions, short insertions and deletions, copy number alterations, and gene fusions were detected using a customized analysis pipeline modified from the Genome Analysis Toolkit. Data interpretation focused on genomic alterations associated with clinically available targeted treatment options according to the standards and guidelines of the NCCN, the Association for Molecular Pathology (AMP), the American Society of Clinical Oncology (ASCO) and the College of American Pathologists (CAP).
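
The idea behind the white-blood-cell control can be illustrated with a deliberately simplified sketch: variants also detected in the matched control are treated as germline and removed. This is not the customized pipeline used in the study; the function name, the tuple representation of a variant, and the coordinates are all hypothetical, and a production pipeline would also weigh read depth and allele frequency.

```python
def filter_germline(sample_variants, wbc_variants):
    """Keep only variants that are absent from the matched WBC control.

    Each variant is a (chromosome, position, ref, alt) tuple; this
    representation is an assumption made for illustration.
    """
    germline = set(wbc_variants)
    return [v for v in sample_variants if v not in germline]


if __name__ == "__main__":
    # Hypothetical coordinates, not study data.
    tumor = [("chr17", 100, "C", "T"), ("chr2", 200, "G", "A")]
    wbc = [("chr2", 200, "G", "A")]
    print(filter_germline(tumor, wbc))
```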

Serum immune proteomics

Proteins were measured using the Olink® Multiplex panel* (Olink Proteomics AB, Uppsala, Sweden) according to the manufacturer’s instructions. The Proximity Extension Assay (PEA) technology used for the Olink protocol has been well described65 and enables 92 analytes to be analyzed simultaneously, using 1 µL of each sample. In brief, pairs of oligonucleotide-labeled antibody probes bind to their targeted protein, and if the two probes are brought in close proximity, the oligonucleotides will hybridize in a pair-wise manner. The addition of a DNA polymerase leads to a proximity-dependent DNA polymerization event, generating a unique PCR target sequence. The resulting DNA sequence is subsequently detected and quantified using a microfluidic real-time PCR instrument (Biomark HD, Fluidigm). The resulting Ct-data is then quality controlled and normalized using a set of internal and external controls. The final assay read-out is presented in Normalized Protein eXpression (NPX) values, which is an arbitrary unit on a log2-scale where a high value corresponds to a higher protein expression.

The internal controls are designed to mimic and monitor the different steps of the PEA. They consist of two incubation/immuno controls, an extension control, and a detection control. The internal controls are introduced to all samples as well as to the external controls and are used for quality control and normalization of the data. The external controls consist of a negative control used to calculate the limit of detection (LOD), as well as a triplicate of interplate controls (IPCs) that are used for data normalization. Quality control of the data is performed in two steps: First, the run is quality controlled by calculating the standard deviation for the detection control and the incubation/Immuno controls. The standard deviation should be below 0.2 for a run to pass quality control. Secondly, each sample is quality controlled by comparing the results for the detection control and one of the incubation controls against the run median.
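
The run-level quality check and the log2 interpretation of NPX can be sketched as follows. The 0.2 threshold comes from the description above; the use of the sample standard deviation, the function names, and the example values are assumptions made for illustration and do not reproduce the manufacturer’s software.

```python
import statistics

RUN_SD_LIMIT = 0.2  # threshold stated in the text above


def run_passes_qc(detection_ctrl, incubation_ctrl):
    """Run-level QC: the standard deviation of each internal control across
    the samples of a run must stay below the limit. The sample standard
    deviation is used here (an assumption)."""
    return (statistics.stdev(detection_ctrl) < RUN_SD_LIMIT
            and statistics.stdev(incubation_ctrl) < RUN_SD_LIMIT)


def deviation_from_run_median(sample_value, run_values):
    """Sample-level QC input: how far one sample's control value lies
    from the run median."""
    return sample_value - statistics.median(run_values)


def fold_change(npx_a, npx_b):
    """NPX is on a log2 scale, so a difference of 1 NPX is a two-fold
    difference in relative protein level."""
    return 2 ** (npx_a - npx_b)


if __name__ == "__main__":
    # Hypothetical control values, not study data.
    print(run_passes_qc([10.0, 10.1, 9.9], [8.0, 8.05, 7.95]))
    print(fold_change(5.0, 4.0))
```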

All assay validation data (detection limits, intra- and inter-assay precision data, etc.) are available on manufacturer’s website (www.olink.com). The full dataset is available in the Source Data for the Olink analysis.

Multiplex immunofluorescence

Multiplex immunofluorescence (mIF) was performed on 4 µm thick formalin-fixed, paraffin-embedded (FFPE) tissue sections according to standard procedures. Briefly, tissue sections were stained consecutively with antibodies against the following: CD68 (clone KP1, ZM-0060, Zsbio; dilution 1:100 with 1× Plus Automation Amplification Diluent), CD8 (clone 144B, ab17147, Abcam; dilution 1:25 with NaCl Tryptone Broth (TPS)), PD-1 (clone CAL20, ab237728, Abcam; dilution 1:100 with TPS), and PD-L1 (clone SP142, Ventana; ready-to-use dilution). The fully stained slides were scanned using the TissueFAXS imaging system (TissueGnostics, Austria). The scanned images were then analyzed using StrataQuest software (version 7.1.129, TissueGnostics, Austria) to identify tissue and cell types and quantify protein expression. The number of positive cells for each marker was determined based on the threshold value for that specific marker. The percentage of positive cells was then calculated as the ratio of positive cells to the total number of cells in the tissue section.

Shotgun metagenomic sequencing

Fecal specimens were self-collected in the hospital using a sterile container with a spoon. After sampling, all containers were rapidly placed in a cooler with dry ice and delivered to the laboratory. Approximately 1 g of stool was then transferred from the sterile container into a 5 ml freezing tube by experienced technicians and stored immediately at −80 °C until processing. Microbial DNA was extracted from fecal samples using the MagPure Fast Stool DNA KF Kit B. DNA nanoball (DNB)-based library construction and combinatorial probe-anchor synthesis (cPAS)-based shotgun metagenomic sequencing with 100 bp paired-end (PE) reads were applied to all samples (MGI, Shenzhen, China).

Statistical analysis

The primary endpoint of this study was ORR, and the sample size was determined using Simon’s two-stage design. Let π be the true response rate of the study regimen. According to previous reports, the maximum ineffective response rate π0 was set to 15%, and the minimum effective response rate π1 was set to 30% (π1 > π0). If π ≤ π0, the regimen is considered ineffective and the trial can be terminated early; if π ≥ π1, the regimen is considered effective and can advance to the next stage of development. With a significance level α of 0.05 and a power of 80%, the Minimax design required 23 subjects to be recruited during the first stage. If a response occurred in 3 or fewer of these subjects, the study would be terminated. Otherwise, enrollment would continue to a planned total of 48 subjects in the second stage, with the regimen being considered effective if more than 11 of the 48 subjects responded.
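
The operating characteristics of this two-stage design can be checked with exact binomial probabilities. The sketch below uses the design parameters given above (23 subjects and a cutoff of 3 responses in stage 1; 48 subjects and a cutoff of 11 responses overall); the function names are introduced here for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); 0 when k is negative."""
    if k < 0:
        return 0.0
    return sum(binom_pmf(i, n, p) for i in range(min(k, n) + 1))


def prob_early_termination(p, n1=23, r1=3):
    """Probability of stopping after stage 1 (r1 or fewer responses)."""
    return binom_cdf(r1, n1, p)


def prob_declare_effective(p, n1=23, r1=3, n=48, r=11):
    """Probability of passing stage 1 and observing more than r responses
    among all n subjects, given a true response rate p."""
    n2 = n - n1
    total = 0.0
    for x1 in range(r1 + 1, n1 + 1):
        # Stage 2 must contribute more than (r - x1) responses.
        total += binom_pmf(x1, n1, p) * (1.0 - binom_cdf(r - x1, n2, p))
    return total


if __name__ == "__main__":
    print("type I error at pi0 = 0.15:", prob_declare_effective(0.15))
    print("power at pi1 = 0.30:", prob_declare_effective(0.30))
    print("early termination at pi0:", prob_early_termination(0.15))
```

Evaluating `prob_declare_effective` at π0 gives the type I error of the design, and evaluating it at π1 gives the power.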

All patients who were administered at least one dose of the study regimen and underwent at least one post-baseline efficacy evaluation were considered the efficacy-evaluable population. Similarly, patients who received at least one dose of the study treatment and had at least one post-baseline safety evaluation were included in the safety population. The Kaplan–Meier method was utilized to estimate survival probability, while the log-rank test was employed to compare groups. The Clopper-Pearson method was used to calculate the 95% confidence intervals (CIs) for ORR or DCR. Furthermore, hazard ratios (HR) for PFS and OS, along with their respective 95% CIs, were estimated using the Cox proportional hazards model.
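
For reference, a Clopper-Pearson (exact) interval can be obtained by inverting the binomial tail probabilities. The sketch below does this by bisection using only the standard library; the function names and the example counts are hypothetical, and statistical packages such as SAS or R compute the same interval from beta-distribution quantiles.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(func, target, increasing):
    """Solve func(p) = target for p in [0, 1], with func monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, conf=0.95):
    """Exact two-sided confidence interval for x responders out of n."""
    alpha = 1 - conf
    # Lower limit: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _bisect(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    # Upper limit: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _bisect(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not study results.
    print(clopper_pearson(10, 40))
```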

For further analysis, subgroup comparisons between responders and non-responders were conducted using the exact Wilcoxon rank-sum test. The relationship between potential biomarkers and radiographic response was assessed via Spearman’s rank correlation coefficient. Additionally, biomarker levels before and after treatment were compared using a paired t-test to evaluate treatment effects.
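
The exact Wilcoxon rank-sum test can be illustrated by enumerating every possible assignment of the pooled ranks to the two groups. This is a sketch for small samples only, since the number of assignments grows combinatorially; the function names and example values are hypothetical, and the two-sided P value is defined here by distance from the expected rank sum, which is one of several conventions and may differ from a given package when ties are present.

```python
from itertools import combinations


def midranks(values):
    """Ranks of `values` (1-based), with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def exact_rank_sum_p(group_a, group_b):
    """Two-sided exact P value for the rank sum of group_a."""
    pooled = list(group_a) + list(group_b)
    ranks = midranks(pooled)
    n_a = len(group_a)
    expected = n_a * (len(pooled) + 1) / 2
    observed = abs(sum(ranks[:n_a]) - expected)
    hits = total = 0
    # Enumerate every way of choosing which pooled ranks belong to group A.
    for idx in combinations(range(len(pooled)), n_a):
        total += 1
        if abs(sum(ranks[i] for i in idx) - expected) >= observed - 1e-9:
            hits += 1
    return hits / total


if __name__ == "__main__":
    # Hypothetical biomarker values for responders and non-responders.
    print(exact_rank_sum_p([1.2, 2.4, 3.1], [4.0, 5.5, 6.3]))
```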

Statistical analysis was performed using SAS 9.4 and R version 4.3.2. The measurements were taken from distinct samples. All statistical tests were two-tailed, and a significance level of P < 0.05 was considered statistically significant. Plots were generated using R version 4.3.2.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.