Table 1 Summary of the selected studies discussed in the present review that investigate molecular stratification of esophageal adenocarcinoma, Barrett’s esophagus or a combination of both.

From: Molecular stratification of esophageal adenocarcinoma: implications for prognosis and treatment strategy

Study

Number of samples

Source

Tumor content in the samples

Other samples included in the clustering

Validation dataset

Number of subtypes

Type of data used for stratification

Feature selection for stratification

Method used for sample clustering

Signature

Main findings

Improvement of prognostic accuracy

Kim et al., 2010, PLoS One [26]

75 EAC

in-house

Not provided

-

-

3

Expression microarrays

Genes with an expression ratio at least two-fold different relative to the reference in at least 8 tissues

Unsupervised hierarchical clustering

-

Among the genes whose expression was significantly associated with prognosis, two (SPARC and SPP1) were highlighted as a potential prognostic biomarker signature.

Pinto et al., 2024, Mol Oncol [27]

201 EAC

TCGA and GSE72872

≥60% for TCGA samples [20] and ≥50% for GEO samples [34]

STAD, COAD and READ

201 EAC samples from TCGA and GSE72872 (split)

4 (out of 6 pan-GI)

DNA methylation (450k)

DVPs between tumor and normal samples

Consensus hierarchical clustering

-

The four subtypes in which EAC patients were identified span a range of distinct DNA methylation levels and agree with the distribution of CIMP status enrichment.

The normal-like subtype showed the poorest prognosis of all subtypes.

Improvement of prognostic accuracy: CIMP-like subtypes

Kaz et al., 2011, Epigenetics [33]

29 EAC

in-house

≥75%

-

-

2 (high and low methylation epigenotypes)

DNA methylation (Golden Gate)

Probes located in CpG islands and in the neighborhood of the transcription starting site

Unsupervised hierarchical clustering

-

First genome-wide study indicating higher similarity of DNA methylation profiles between BE and EAC than between each one of them and the normal squamous esophagus.

Krause et al., 2016, Carcinogenesis [34]

125 EAC and 19 BE

in-house

≥50%

BE / -

89 EAC from TCGA

2 (non-CIMP and CIMP-like)

DNA methylation (450k)

CpG island-located most variable probes in EAC samples, and not highly methylated in normal squamous esophagus

Unsupervised hierarchical clustering

-

The clustering separated EAC and BE from normal squamous esophagus, but not EAC from BE.

A group of EAC patients with a CIMP-like methylation pattern was proposed.

Patients with the most hypermethylated tumors exhibited significantly poorer survival outcomes compared to all the other tumors.

Liu et al., 2018, Cancer Cell [35]

79 EAC

TCGA

≥60% [20]

STAD, COAD and READ

-

4 (out of 7 pan-GI)

CIMP-high, GEA CIMP-low and two non-CIMP

DNA methylation (450k)

Gene promoter loci unmethylated in normal tissues and leukocytes (mean β < 0.2) and methylated (β > 0.3) in more than 5% samples in at least one of the GI adenocarcinoma types.

Unsupervised hierarchical clustering

-

EAC patients were also included in hypermutated-single-nucleotide variants, genome stable and mostly chromosomal unstable subgroups defined by multiplatform analyses.

Sánchez-Vega et al., 2017, World J Gastrointest Oncol [36]

87 EAC

TCGA

≥60% [20]

-

-

3 (CIMP+, CIMP-intermediate and CIMP-)

DNA methylation (450k)

Identification of CpG island-located DMPs between tumor and normal samples, among those with high variance across samples (standard deviation ≥ 0.1), low methylation in control samples (average β < 0.05) and increased methylation in tumor samples (average β > 0.25).

Unsupervised hierarchical clustering

-

No significant association was observed between MLH1 promoter hypermethylation and CIMP categories.

Improvement of prognostic accuracy: classification models

Lan et al., 2021, Medicine [40]

75 EAC

TCGA (split into test and training set)

≥60% [20]

-

43 EAC from GEO (GSE72874)

2 (low-/high-risk)

Expression (RNA-seq)

Identification of DEGs between tumor and normal samples. Lasso penalized Cox regression was then applied to identify prognostic-related mRNAs.

Median survival risk score (calculated based on the expression levels of the gene signature)

5 genes (SLC26A9, SINHCAF, MICB, KRT19 and MT1X)

The 5-mRNA signature was promising as a biomarker for predicting 3-year survival rate of EAC in the internal test set, the entire TCGA set, and the external test set.

The sensitivity and specificity of the mRNA signature performed numerically better than the TNM stage for 1-, 2-, and 3-year prognostic evaluation of EAC.

Mao et al., 2024, Transl Cancer Res [41]

80 EAC

GEO (GSE13898 and GSE26886, for DEG identification) and TCGA (for DEG identification and model construction)

≥60% for TCGA samples [20]; not provided for GEO samples

-

-

2 (low-/high-risk)

Expression (RNA-seq)

Identification of DEGs between tumor and normal or adjacent non-cancerous tissues. Cox analysis together with Akaike information criterion were then performed to find DEGs associated with prognosis.

Median survival risk score (calculated based on the expression levels of the gene signature)

4 genes (ALAD, ABLIM3, IL17RB and IFI6)

Multivariate Cox regression analyses suggested that the four-gene signature served as an independent factor in overall survival prediction. Stage stratified analysis showed that the four-gene signature had better predictive performance for patients with advanced tumor stage (III and IV).

Chen et al., 2021, Biomed Res Int [42]

78 EAC

TCGA

≥60% [20]

-

-

2 (low-/high-risk)

Expression (RNA-seq) and DNA methylation (450k)

Identification of methylation-driven genes by comparing DNA methylation status of tumor and normal samples and correlating it with transcriptomic data. Lasso penalized Cox regression was then applied to identify prognostic-related features.

Median survival risk score (calculated based on the expression levels of the gene signature)

4 methylation-driven genes (GPBAR1, OLFM4, FOXI2 and CASP10)

Multivariate Cox regression analyses showed that the prognostic risk score was an independent prognostic factor.

Li et al., 2019, Aging [43]

79 EAC

TCGA (not clear)

≥60% [20]

-

-

2 (low-/high-risk)

DNA methylation (450k)

Lasso-Cox model applied to differentially methylated CpG sites between EAC and normal samples to identify prognostic-related features.

Median survival risk score (calculated based on the DNA methylation levels of the gene signature)

3 CpG sites (cg01192745, cg19801256 and cg18276155)

The 3-CpG prognostic methylation classifier was an independent risk factor by multivariate Cox regression adjusting for clinical risk factors. The classifier improved the predictive ability of the TNM staging system.

Personalized treatment selection: Subtypes with potential actionable targets

Secrier et al., 2016, Nat Genet [48]

129 EAC

OCCAMS / ICGC

>70%

-

87 EAC samples from EGAS00001000750 and ICGC

3 (DNA damage repair impaired, mutagenic and C > A/T dominant)

Mutational signatures obtained from whole-genome sequencing via non-negative matrix factorization

Six mutational signatures: S1 (age), S2 (APOBEC), S3 (BRCA), S17, S17B and S18-like

Consensus clustering

-

Drug sensitivity assays in EAC cell lines showed that the subtypes can be a basis for therapy selection.

Guo et al., 2018, BMC Genomics [49]

215 EAC (from three independent cohorts) and 15 BE

GEO (GSE13898, GSE19417) and TCGA (independently analysed and in a meta-analysis)

≥60% for TCGA samples [20]; not provided for GEO samples

1) BE and normal esophageal tissues; 2) squamous esophageal carcinoma and gastric carcinoma

215 EAC samples from three independent cohorts: GSE13898, GSE19417 and TCGA (independently analysed and in a meta-analysis)

2 (Subtype I, gastric-like; and Subtype II, squamous-like)

Expression (microarrays and RNA-seq)

Standard deviation

Consensus hierarchical clustering (performed on each of the three datasets independently and in a meta-analysis)

-

The subtypes showed distinct expression patterns and mutation profiles.

The EAC gastric-like subtype II exhibited gene expression patterns closely resembling those found in BE.

The study suggested that subtype II EAC patients might be more likely to respond to chemotherapy. However, therapeutic information was available for only a limited number of patients.

Yu et al., 2019, Gut [50]

23 EAC

in-house

>70%

-

87 EAC from TCGA

4 (high-, intermediate-, low- and minimal-methylator)

DNA methylation (450k)

Most variable probes among EAC

Recursively partitioned mixture model clustering

-

Cell lines representative of each subtype responded differently to anti-cancer chemotherapies.

Jammula et al., 2020, Gastroenterology [51]

285 EAC and 150 BE

OCCAMS / ICGC

>70%

BE

19 BE and 125 EAC samples from GEO (GSE72872)

4

DNA methylation (EPIC)

Optimal metagenes obtained by non-negative matrix factorization

Non-negative matrix factorization with k-means clustering

-

Subtype 2 was enriched in BE samples. Methylation profiles of BE and EAC were more similar to each other than to normal esophagus. Subtype 3 was associated with the shortest patient survival.

Sundar et al., 2019, Eur J Cancer [54]

229 EAC

in-house (MRC OE02 trial)

≥30%

-

13 in-house EAC samples

2

DNA methylation (Illumina GoldenGate Cancer Panel I)

Cox proportional hazard analysis to select probes predictive of survival in the chemotherapy+surgery arm

Non-negative matrix factorization

Non-negative matrix factorization metagene signature involving 11 probes.

In cluster 1, patients in the chemotherapy+surgery arm had significantly better overall survival, appearing to benefit from chemotherapy. In cluster 2, patients in the chemotherapy+surgery arm exhibited worse survival than patients in the surgery-only arm, suggesting that they may not derive any survival benefit from neoadjuvant chemotherapy.

The identified epigenetic signature may serve as a predictive biomarker for neoadjuvant chemotherapy (cisplatin + fluorouracil) benefit in EAC.

Hoadley et al., 2018, Cell [55]

Not provided

TCGA

≥60% [20]

ESCC and 32 other cancer types

-

7 (out of 28 pan-cancer)

C2: BRCA (HER2 amp)

C4: Pan-GI (CRC)

C10: Pan-SCC

C13: Mixed (Chr 8 del)

C18: Pan-GI (MSI)

C20: Mixed (Stromal/Immune)

C25: Pan-SCC (Chr 11 amp)

Copy number, DNA methylation, mRNA and miRNA expression

Variable (depending on the type of data)

Multi-platform integrative clustering with iCluster

-

Cell-of-origin influences, but does not fully determine, tumor classification.

Mutation frequencies and mutational signatures varied among the clusters, as well as the enriched pathways.

Personalized treatment selection: Immunological profiling

Ling et al., 2022, Pharmaceuticals [61]

223 EAC

TCGA and GEO (GSE72874, GSE92396 and GSE13898)

≥60% for TCGA samples [20]; not provided for GEO samples

-

44, 45 and 48 EAC samples from GEO (GSE72874, GSE13898 and GSE19417, respectively)

2

Gene expression (mRNA and lncRNA) and TME scores

Identification of DEGs (mRNA and lncRNA) with a significant prognostic value and TME scores with median absolute deviation > 0.5

IntNMF (Integrative Clustering of Multiple Genomic Dataset) and consensus clustering

50 subtype-specific DEGs

The authors developed a classifier to stratify samples into two subtypes based on 50 subtype-specific signature genes.

Stratified survival analyses based on age and clinical stage subgroups confirmed the prognostic value of the two EAC subtypes.

The subtypes showed differences in prognosis and in tumor microenvironment landscape.

Naeini et al., 2023, Nat Commun [62]

68 EAC

in-house (mostly DOCTOR clinical trial)

Variable

-

78 EAC from TCGA

4 immune clusters (immune hot, immune cold, immune suppressed and immune moderate)

18 immune cell proportions (deconvoluted from RNA-seq data)

18 immune cell proportions (deconvoluted from RNA-seq data)

Unsupervised k-means clustering

-

The four immune clusters were associated with both overall survival and progression-free survival.

Thorsson et al., 2018, Immunity [64]

76 EAC

TCGA

≥60%

ESCC and 29 other cancer types

76 EAC samples from TCGA (split)

5 (out of 6 pan-cancer)

C1: wound healing,

C2: IFN-γ dominant,

C3: inflammatory,

C4: lymphocyte depleted, and

C6: TGF-β dominant

Expression (RNA-seq)

5 cancer immune expression signatures

Consensus clustering of the pairwise correlation of the signature scores

5 cancer immune expression signatures

Immunogenomic features were predictive of outcome, with overall survival and progression-free interval differing between immune subtypes both within and across cancer types.

Personalized treatment selection: Immune-based classification models for improved mortality risk assessment

Yang et al., 2023, Med Sci Monit [65]

78 EAC

TCGA

≥60% [20]

-

64 EAC samples from GEO (GSE13898)

2 (low-/high-risk)

Expression (RNA-seq)

Identification of DEGs between tumor and normal tissue samples, combined with the immune-related gene list from the ImmPort database. Weighted correlation network analysis was performed on this list, followed by a Cox regression model to identify prognostic-related genes.

Median survival risk score (calculated based on the expression levels of the gene signature)

4 genes (UNC93B1, HSPA14, AR and FGF13)

Multivariate Cox regression analyses showed that the prognostic risk score was an independent prognostic factor.

The results suggested that the high-risk group is more suitable for immunotherapy, which may provide a reference value for the treatment of EAC patients.

Zhang et al., 2021, BMC Bioinformatics [66]

80 EAC

TCGA

≥60% [20]

-

48 EAC samples from GEO (GSE72873)

2 (low-/high-risk)

Expression (RNA-seq)

Identification of DEGs between tumor and normal tissue samples, combined with the immune-related gene list from the ImmPort database. Cox regression analysis was then performed on immune-related DEGs to identify prognostic-related genes.

Median survival risk score (calculated based on the expression levels of the gene signature)

12 immune-related genes (ADRM1, CXCL1, SEMG1, CCL26, CCL24, AREG, IL23A, UCN2, FGFR4, IL17RB, TNFRSF11A, and TNFRSF21)

Multivariate Cox analyses and a nomogram indicated that a combined analysis of the risk score, sex, M stage, and tumor stage can accurately predict survival prognosis.

The survival difference between high- and low-risk groups remained significant when patients were stratified by age and tumor stage.

The signature constructed for EAC patients proved unsuitable for ESCC patients.

Elucidation of the mechanisms underlying EAC carcinogenesis

Krause et al., 2016, Carcinogenesis [34]

BE stratified in combination with EAC, as above.

Jammula et al., 2020, Gastroenterology [51]

BE stratified in combination with EAC, as above.

Guo et al., 2018, BMC Genomics [49]

BE stratified in combination with EAC, as above.

Nones et al., 2014, Nat Commun [14]

22 EAC

in-house

≥50%

-

-

Unstable genome, scattered and complex localized

Whole-genome sequencing and single-nucleotide polymorphism-arrays

Structural variants

-

-

The number of structural variants and their genomic distribution revealed considerable inter-tumor heterogeneity.

Yu et al., 2019, Gut [50]

59 BE

in-house

>70%

-

-

4 (high-, intermediate-, low- and minimal-methylator)

DNA methylation (450k)

Most variable probes among EAC

Recursively partitioned mixture model clustering

-

The four subtypes mirrored those identified in EAC in the same study.

Kaz et al., 2011, Epigenetics [33]

29 BE

in-house

≥75%

-

-

2 (high and low methylation epigenotypes)

DNA methylation (Golden Gate)

Probes located in CpG islands and in the neighborhood of the transcription starting site

Unsupervised hierarchical clustering

-

The two subtypes mirrored those identified in EAC in the same study, as well as CIMP groups in other cancer types.

BE Barrett’s esophagus, CIMP CpG island methylator phenotype, COAD colon adenocarcinoma, DEG differentially expressed gene, DVP differentially variable probe, EAC esophageal adenocarcinoma, ESCC esophageal squamous cell carcinoma, GEA gastroesophageal adenocarcinoma, GEO Gene Expression Omnibus, GI gastrointestinal, ICGC International Cancer Genome Consortium, READ rectal adenocarcinoma, STAD stomach adenocarcinoma, TCGA The Cancer Genome Atlas, TME tumor microenvironment, TNM tumor-node-metastasis.
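
Several of the classification models summarized in Table 1 assign patients to low- and high-risk groups by splitting a signature-based risk score at the cohort median. The sketch below illustrates that general idea only. The gene names, coefficients and expression values are hypothetical, the linear form of the score (sum of coefficient times expression) is an assumption about how such scores are typically built from a Cox model, and the rule "strictly above the median is high-risk" is also an assumption, since the table does not state how ties or the cutoff itself are handled in each study.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Linear risk score per sample: sum of coefficient * expression
    over the signature genes (assumed form, not taken from any study)."""
    return {
        sample: sum(coefficients[gene] * values[gene] for gene in coefficients)
        for sample, values in expression.items()
    }


def median_split(scores):
    """Label a sample 'high' if its score is strictly above the cohort
    median, otherwise 'low' (tie handling is an assumption)."""
    cutoff = median(scores.values())
    return {s: ("high" if v > cutoff else "low") for s, v in scores.items()}


# Hypothetical signature coefficients and expression values, illustration only.
coefficients = {"GENE_A": 0.8, "GENE_B": -0.5}
expression = {
    "S1": {"GENE_A": 2.0, "GENE_B": 1.0},
    "S2": {"GENE_A": 0.5, "GENE_B": 3.0},
    "S3": {"GENE_A": 1.5, "GENE_B": 0.5},
    "S4": {"GENE_A": 0.2, "GENE_B": 0.1},
}

scores = risk_scores(expression, coefficients)
groups = median_split(scores)
for sample in sorted(groups):
    print(sample, round(scores[sample], 2), groups[sample])
```

In practice the coefficients would come from the fitted survival model and the cutoff would usually be fixed on the training cohort before being applied to a validation dataset.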