Introduction

Mismatch repair deficiency is characterized by loss of expression of one or more mismatch repair proteins MLH1, MSH2, MSH6, and PMS2 and is a feature of approximately 20% of endometrial cancers [1,2,3,4,5,6,7,8]. The consequences of defective mismatch repair machinery include hypermutation and insertion and deletion errors in genomic DNA repeats, leading to microsatellite instability. Mismatch repair deficiency is a defining feature of the microsatellite instability molecular subclass of endometrial cancers as classified by the Cancer Genome Atlas project [9].

The most common mechanism of mismatch repair deficiency in sporadic endometrial cancer is epigenetic silencing of MLH1 by promoter hypermethylation [10, 11]. Mismatch repair deficiency may also result from germline or somatic mutation of the mismatch repair genes. Pathogenic germline variants establish a diagnosis of Lynch syndrome, which predisposes patients to cancers of colorectal, endometrial, and other sites [12].

Recently, mismatch repair deficiency has been demonstrated to be an important biomarker for therapy selection with immune checkpoint inhibitors. Endometrial cancers with microsatellite instability have been associated with elevated mutational burden, increased tumor-infiltrating lymphocytes, and elevated programmed cell death protein-1 (PD-1) and programmed death-ligand 1 (PD-L1) expression in tumor-associated lymphocytes [13,14,15]. Mismatch repair deficiency has been associated with clinical response to immunotherapy in multiple tumor types [16], and in May 2017, the Food and Drug Administration granted accelerated approval of pembrolizumab for advanced solid tumors with mismatch repair deficiency or microsatellite instability, regardless of pathological diagnosis [17].

In addition to identifying driver mutations, broad cancer sequencing panels can use sequencing data to identify patterns within incidentally sequenced passenger mutations. Our laboratory and others have developed next-generation sequencing algorithms to detect phenotypic features of mismatch repair deficiency and microsatellite instability from cancer panel sequencing data [18,19,20,21]. These metrics have demonstrated high concordance with mismatch repair protein immunohistochemistry and microsatellite analysis by polymerase chain reaction in colorectal cancer.

Because colorectal cancer is the most common cancer type with mismatch repair deficiency, most sequencing algorithms are validated predominantly on this cancer type, and the efficacy of sequencing-based detection of mismatch repair deficiency in endometrial cancers is not as well established. Endometrial cancers are biologically different from colorectal cancers, and endometrial cancers with mismatch repair deficiency acquire mutations in different microsatellite regions compared to colorectal cancers [22]. Therefore, we hypothesize that any mismatch repair deficiency detection algorithm trained on colorectal cancers needs to be adjusted to improve accuracy in endometrial cancers.

Here, we test our algorithm for mismatch repair deficiency detection in a cohort of endometrial cancers. Through this study, we hope to better understand the molecular phenotype of endometrial cancers with mismatch repair deficiency and the utility of next-generation sequencing to inform clinical decision making.

Materials and methods

Patient selection and enrollment

Study volunteers were prospectively enrolled via Profile, an institutional study for cancer genotyping, and provided written consent for cancer sequencing at the Dana-Farber Cancer Institute [23]. Sequencing and immunohistochemistry were performed at Brigham and Women’s Hospital. Universal screening for MLH1, MSH2, MSH6, and PMS2 protein expression was clinically performed for endometrial cancers at our institution [1]. Endometrial cancers that had undergone targeted next-generation sequencing were identified and matched to corresponding surgical pathology reports. A total of 259 patients who had both sequencing and screening by immunohistochemistry were included in the study. Pathological diagnoses included endometrioid (184), serous (39), clear cell (5), undifferentiated (7), mixed (7), unclassified high-grade Mullerian carcinoma (1), and carcinosarcoma (16). This project was approved by the Dana-Farber Cancer Institute Institutional Review Board and the Partners Human Research Committee.

Mismatch repair immunohistochemistry

Immunohistochemistry was conducted on 4-μm-thick formalin-fixed, paraffin-embedded tissue sections with mouse monoclonal antibodies against MLH1, MSH2, PMS2, and MSH6 and the Envision Plus Detection System, as previously described [20]. Briefly, the following antibodies and dilutions were applied: (1) MSH2 monoclonal antibody (Calbiochem-EMD Millipore, clone NA27) 1:200; (2) MSH6 monoclonal antibody (BD Bioscience, clone PU29) 1:50; (3) MLH1 monoclonal antibody (Novocastra, clone NCL-L-MLH1) 1:75; and (4) PMS2 monoclonal antibody (Cell Marque, clone MRQ-28) 1:100. Antigen retrieval for all stains used heat-induced epitope retrieval in a pressure cooker with either 10 mM citrate buffer at pH 6.1 (MSH6, MLH1, PMS2) or EDTA at pH 8.0 (MSH2). Internal positive controls (stroma, vasculature) and appropriate negative controls were used.

Targeted next-generation sequencing

Targeted next-generation sequencing was performed using OncoPanel, which has been previously described [24]. In brief, DNA was isolated from formalin-fixed, paraffin-embedded or frozen, optimal cutting temperature-embedded tissue in regions with at least 20% cancer cell nuclei. At least 50 ng DNA was used for library preparation, and hybrid capture with a custom RNA bait set by Agilent SureSelect (Agilent Technologies, Santa Clara, CA) was performed to enrich for exons of cancer-associated genes. Three versions of the assay were used, depending on the time of enrollment, including coding regions of 275, 298, or 447 genes encompassing 757,787, 831,033, or 1,315,708 bp of the genome, respectively. Of the 259 specimens, 39 were tested with version 1, 170 with version 2, and 50 with version 3. Sequencing was performed on the HiSeq 2500 System (Illumina, San Diego, CA). Informatics was performed using a custom pipeline, and insertion and deletion mutations were called using Indelocator (Broad Institute, Cambridge, MA).

Mismatch repair deficiency detection algorithm

The mismatch repair algorithm identified single-nucleotide insertion and deletion events in DNA mononucleotide repeat regions of at least four consecutive nucleotides, and these were designated mismatch repair deficiency-associated events. Such events were previously demonstrated to be enriched in colorectal cancers with mismatch repair deficiency [21]. The total number of mismatch repair deficiency-associated events was normalized to the size of the panel to establish the number of events per megabase (indel/Mb) for each case. Indel/Mb values were compared to immunohistochemistry results to establish thresholds for mismatch repair-deficient, -proficient, and -indeterminate cancers by sequencing.
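
The counting and normalization steps described above can be sketched as follows. This is a minimal illustration and not the laboratory's pipeline: the event representation (position, kind, bases), the helper names, and the rule for deciding whether an indel falls inside a repeat are all assumptions made for the example. Only the four-nucleotide minimum repeat length and the panel sizes come from the text.

```python
import re

# Panel sizes in base pairs for assay versions 1-3, as reported in the text.
PANEL_SIZE_BP = {1: 757_787, 2: 831_033, 3: 1_315_708}

MIN_REPEAT_LENGTH = 4  # at least four consecutive identical nucleotides


def homopolymer_runs(sequence, min_length=MIN_REPEAT_LENGTH):
    """Return (start, end, base) for each mononucleotide run of >= min_length.

    Coordinates are 0-based, with end exclusive.
    """
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [(m.start(), m.end(), m.group(1))
            for m in pattern.finditer(sequence.upper())]


def is_mmr_associated(event, runs):
    """Hypothetical rule: a 1-bp indel of the repeated base inside a run.

    event is (pos, kind, bases): kind is "ins" or "del"; pos is the 0-based
    reference coordinate of the deleted base, or the coordinate before which
    the base is inserted.
    """
    pos, kind, bases = event
    if len(bases) != 1:
        return False  # only single-nucleotide events are counted
    for start, end, base in runs:
        if bases.upper() != base:
            continue
        if kind == "del" and start <= pos < end:
            return True
        if kind == "ins" and start <= pos <= end:
            return True
    return False


def count_mmr_events(events, sequence):
    """Count events that satisfy the hypothetical rule above."""
    runs = homopolymer_runs(sequence)
    return sum(1 for event in events if is_mmr_associated(event, runs))


def indels_per_mb(event_count, panel_size_bp):
    """Normalize an event count to events per megabase of panel territory."""
    return event_count / (panel_size_bp / 1_000_000)


# Toy reference with two qualifying runs (AAAAA and CCCCC) and one run of
# three T bases that is too short to qualify.
reference = "ACGTAAAAACGTTTGCCCCCA"
events = [(6, "del", "A"), (12, "del", "T"), (20, "ins", "C"),
          (5, "del", "AA"), (16, "ins", "G")]
print(count_mmr_events(events, reference))
print(round(indels_per_mb(5, PANEL_SIZE_BP[2]), 2))
```

In this toy input, only the first and third events qualify. The others fall in a run that is too short, span two bases, or insert a base that differs from the repeat. A production implementation would read variant calls and target intervals from files rather than a single string.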

The accuracy of sequencing for the detection of mismatch repair deficiency was calculated as follows: sensitivity, the percentage of all cases with loss of expression of at least one mismatch repair protein by immunohistochemistry that are classified as mismatch repair deficient by sequencing; specificity, the percentage of all cases with retained expression of all mismatch repair proteins by immunohistochemistry that are classified as mismatch repair proficient by sequencing; positive predictive value, the percentage of all cases classified as mismatch repair deficient by sequencing that have loss of at least one mismatch repair protein by immunohistochemistry; and negative predictive value, the percentage of all cases classified as mismatch repair proficient by sequencing that have retained expression of all mismatch repair proteins by immunohistochemistry.
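
The four definitions above translate directly into code. The sketch below is illustrative only: the function name, the input format (pairs of an immunohistochemistry loss flag and a sequencing call), and the example cases are hypothetical. Following the wording of the definitions, sensitivity and specificity use all cases with the given immunohistochemistry result as the denominator, including any that sequencing leaves indeterminate.

```python
def accuracy_metrics(cases):
    """Compute sensitivity, specificity, PPV, and NPV.

    cases: iterable of (ihc_loss, seq_call) pairs, where ihc_loss is True when
    at least one mismatch repair protein is lost by immunohistochemistry and
    seq_call is "deficient", "proficient", or "indeterminate".
    """
    cases = list(cases)
    ihc_loss_total = sum(1 for loss, _ in cases if loss)
    ihc_retained_total = sum(1 for loss, _ in cases if not loss)
    seq_deficient_total = sum(1 for _, call in cases if call == "deficient")
    seq_proficient_total = sum(1 for _, call in cases if call == "proficient")
    true_pos = sum(1 for loss, call in cases if loss and call == "deficient")
    true_neg = sum(1 for loss, call in cases
                   if not loss and call == "proficient")

    def ratio(numerator, denominator):
        # Return None rather than dividing by zero for an empty category.
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(true_pos, ihc_loss_total),
        "specificity": ratio(true_neg, ihc_retained_total),
        "ppv": ratio(true_pos, seq_deficient_total),
        "npv": ratio(true_neg, seq_proficient_total),
    }


# Hypothetical ten-case example (not study data).
example_cases = (
    [(True, "deficient")] * 3
    + [(True, "proficient")]
    + [(False, "proficient")] * 5
    + [(False, "indeterminate")]
)
metrics = accuracy_metrics(example_cases)
print(metrics)
```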

Statistical analysis

Two-sided Mann–Whitney U test was used to compare distributions of continuous variables with significance set at P < 0.05.
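
For readers unfamiliar with the test, the following is a minimal pure-Python sketch of a two-sided Mann-Whitney U test. The text does not state which software was used, so the choices here are assumptions: a normal approximation with tie correction and continuity correction, which is reasonable for large samples. The function name is hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using a normal approximation.

    Returns (U statistic for sample x, approximate two-sided P value).
    """
    n1, n2 = len(x), len(y)
    combined = sorted((value, group)
                      for group, sample in enumerate((x, y))
                      for value in sample)

    # Assign average ranks to tied values and accumulate the tie term.
    ranks = [0.0] * len(combined)
    tie_term = 0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(rank for rank, (_, group) in zip(ranks, combined)
                     if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical; no evidence of a difference

    # Continuity-corrected z score, floored at zero.
    z = max((abs(u_x - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    p_value = math.erfc(z / math.sqrt(2))  # two-sided tail probability
    return u_x, p_value


u_stat, p_value = mann_whitney_u([1, 2, 3], [4, 5, 6])
print(u_stat, round(p_value, 3))
```

With samples this small, an exact test would be preferable. The approximation is shown only to make the calculation explicit.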

Results

Insertion and deletion events in mismatch repair-deficient endometrial cancers

The number of mismatch repair deficiency-associated events per megabase for each endometrial cancer was calculated. The distribution of the number of events in mismatch repair-deficient and mismatch repair-proficient endometrial cancers, as determined by immunohistochemistry, is shown in Fig. 1.

Fig. 1 Distribution of mismatch repair-deficient and -proficient endometrial cancers with respect to sequencing findings. Mismatch repair deficiency is assessed by immunohistochemical expression of MLH1, MSH2, MSH6, and PMS2. The x-axis shows the number of insertion and deletion events in mononucleotide repeats per megabase per case by sequencing, rounded to the nearest integer. Each bar represents the number of mismatch repair-deficient or -proficient endometrial cancers with that number of events

Of the 259 endometrial cancers in the study, 63 (24%) had loss of immunohistochemical expression of at least one mismatch repair protein. The median number of mismatch repair deficiency-associated events was 6.0 indel/Mb (mean 5.8, standard deviation 4.0, range 0.0–16.0) for mismatch repair-deficient cancers compared to 0.0 indel/Mb (mean 0.3, standard deviation 2.2, range 0.0–29.6) for mismatch repair-proficient cancers (P < 0.0001).

Of the 63 endometrial cancers with mismatch repair deficiency, 47 (74.6%) had >2.5 indel/Mb by sequencing, and 54 of 63 (85.7%) endometrial cancers with mismatch repair deficiency had >1.5 indel/Mb by sequencing. In contrast, 171 of 196 (87.2%) endometrial cancers with intact immunohistochemical staining had no mismatch repair deficiency-associated events detected, and 190 of 196 (96.9%) had <1.5 indel/Mb by sequencing.

Concordance of sequencing interpretation with immunohistochemistry

Using the distribution shown in Fig. 1, a single threshold of 1.5 indel/Mb achieved the highest concordance between sequencing and immunohistochemistry, with agreement in 244 of 259 cases (94%). Of the 63 cancers with loss of expression of at least one mismatch repair protein by immunohistochemistry, 54 had greater than 1.5 indel/Mb by sequencing (sensitivity 54/63, 86%). Of the 196 cancers with intact expression of all mismatch repair proteins by immunohistochemistry, 190 had less than 1.5 indel/Mb and were therefore mismatch repair proficient by sequencing (specificity 190/196, 97%).

However, we recognized that cancers with 1.5 to 2.5 indel/Mb were difficult to classify, with 7 of 12 cases in this group exhibiting mismatch repair protein loss by immunohistochemistry. Therefore, we further divided our cohort into three categories based on sequencing interpretation: ‘deficient’ (>2.5 indel/Mb), ‘proficient’ (<1.5 indel/Mb), and ‘indeterminate’ (1.5–2.5 indel/Mb). Of the 48 cancers predicted to be deficient by sequencing, 47 had loss of expression of a mismatch repair protein by immunohistochemistry (positive predictive value 47/48, 98%). Of the 199 cancers predicted to be proficient by sequencing, 190 had retained expression of mismatch repair proteins by immunohistochemistry (negative predictive value 190/199, 95%, Table 1).
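
The three-category interpretation and the predictive values above can be reproduced with a short sketch. The function name is hypothetical, and treating values of exactly 1.5 or 2.5 indel/Mb as indeterminate is an assumption. The counts are derived from the figures reported in the text (for example, the nine proficient-by-sequencing cases with protein loss are 199 minus 190).

```python
DEFICIENT_ABOVE = 2.5   # indel/Mb; values above this are called deficient
PROFICIENT_BELOW = 1.5  # indel/Mb; values below this are called proficient


def classify_by_sequencing(indel_per_mb):
    """Map an indel/Mb value to a sequencing interpretation.

    Boundary values (exactly 1.5 or 2.5) are assumed to be indeterminate.
    """
    if indel_per_mb > DEFICIENT_ABOVE:
        return "deficient"
    if indel_per_mb < PROFICIENT_BELOW:
        return "proficient"
    return "indeterminate"


# (IHC loss, IHC retained) counts per sequencing category, derived from the
# numbers reported in the text.
counts = {
    "deficient": (47, 1),
    "indeterminate": (7, 5),
    "proficient": (9, 190),
}

ihc_loss_total = sum(loss for loss, _ in counts.values())
ihc_retained_total = sum(retained for _, retained in counts.values())

ppv = counts["deficient"][0] / sum(counts["deficient"])
npv = counts["proficient"][1] / sum(counts["proficient"])

print(ihc_loss_total, ihc_retained_total)
print(round(ppv, 3), round(npv, 3))
```

The column totals recover the 63 cancers with protein loss and the 196 with retained expression, which is a useful consistency check on the derived counts.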

Table 1 Next-generation sequencing determination of mismatch repair deficiency, defined by number of indel events in mononucleotide repeats, compared to immunohistochemistry, defined by loss of nuclear expression of MLH1, MSH2, MSH6, or PMS2

Review of discordant cases

The only cancer predicted to be deficient by sequencing with intact mismatch repair protein expression was an undifferentiated carcinoma with the highest number of mismatch repair deficiency-associated events (29.6 indel/Mb) in the cohort. Sequencing identified a POLE p.V411L mutation and an elevated number of single-nucleotide variants, and the phenotype is consistent with a POLE-associated ultramutated endometrial cancer.

For the nine cases predicted to be proficient by sequencing despite loss of expression of at least one mismatch repair protein, the discordance may be due to several factors. Two of the nine cancers exhibited isolated loss of MSH6 expression. Three of the nine cancers showed low variant allele fractions of less than 10% for pathogenic somatic mutations, consistent with specimens with low tumor purity. No definitive explanation for discordance between sequencing and immunohistochemistry could be identified for the remaining four specimens.

Discussion

Phenotypic mismatch repair deficiency can be assessed in the pathology laboratory through immunohistochemical evaluation for nuclear protein expression in tumor cells. Although phenotypic screening for Lynch syndrome by immunohistochemistry is universally recommended for colorectal cancer in consensus guidelines [25], such guidelines do not exist for endometrial cancer [26]. However, studies have suggested the added value of universal Lynch syndrome screening in patients with endometrial cancer [1, 3, 5, 27].

As sequencing costs have declined, multiple cancer panel next-generation sequencing assays have been developed to profile genomic alterations for clinical use [24, 28,29,30]. These assays aim to identify driver events in cancer development to help guide therapy selection and are major decision points in multi-institutional basket trials for targeted cancer therapy [31, 32]. As next-generation sequencing technology is rapidly adopted into clinical practice in molecular laboratories [33], it is rational to use a single set of sequencing data for multiple clinical indications, including screening for mismatch repair deficiency.

We have previously described an algorithm counting mismatch repair deficiency-associated events and demonstrated that this simple metric achieved 96% sensitivity and 99% specificity compared to mismatch repair immunohistochemistry in colorectal cancer [21]. The application of a similar metric in endometrial cancer demonstrates a poorer level of performance in this study, achieving at best 86% sensitivity and 97% specificity. In addition, 5% of cancers classified as proficient by sequencing (9 of 199) showed mismatch repair protein loss by immunohistochemistry.

Potential explanations for false negative results include a higher frequency of endometrial cancers with isolated MSH6 loss. Germline MSH6 mutations are known to confer a higher risk for endometrial cancers compared to colorectal cancers [34], and affected cancers frequently exhibit microsatellite instability-low status [35]. Another potential limitation is low tumor purity in endometrial cancer specimens, which are susceptible to contamination with nonneoplastic cells from endometrial stroma and myometrium.

Mismatch repair deficiency is known to confer different molecular phenotypes in endometrial cancers compared to colorectal cancers. Before the invention of next-generation sequencing technology, investigators recognized that some recurrent microsatellite alterations in colorectal cancers were not seen in endometrial cancers with mismatch repair deficiency [36]. Analysis of targeted exonic microsatellites demonstrated fewer mutations in endometrial cancers compared to colorectal cancers with mismatch repair deficiency [37].

Similar findings have since been described in whole-exome sequencing data. The only paper to perform extensive direct comparisons between mismatch repair-deficient colorectal and endometrial cancers used the Cancer Genome Atlas data to demonstrate that colorectal cancers had more microsatellite instability events compared to endometrial cancers [22]. Using a cutoff of greater than 50 events, the authors’ methodology achieved 100% accuracy in distinguishing microsatellite instability-high versus microsatellite instability-low or microsatellite stable colorectal cancer but only 83% sensitivity (25 of 30) and 97% specificity (97 of 100) in endometrial cancer. These figures are comparable to our current classification using a much more limited targeted panel. Another article evaluating the correlation between whole-exome sequencing data and microsatellite instability across many cancer types demonstrated discordance in 11 of 171 microsatellite instability-high cancers, and 10 of the 11 discordant cases were endometrial cancers [38].

Our current findings and the literature highlight phenotypic differences in genomic alterations across tumor types as a result of mismatch repair deficiency. The unique biology and molecular phenotype of endometrial cancers may limit accurate detection of mismatch repair deficiency in this disease compared to colorectal cancer, and algorithms using next-generation sequencing data to broadly screen for mismatch repair deficiency may need to be modified for endometrial cancer. In the setting of routine clinical screening, immunohistochemistry remains an accurate and cost-effective method for mismatch repair deficiency screening in endometrial cancer.

Despite these limitations, our algorithm applied to a limited targeted cancer gene panel still performs well with 94.2% concordance compared to immunohistochemistry. As targeted next-generation sequencing becomes available for clinical cancer care, the ability to obtain more actionable information, including determination of mismatch repair deficiency, from sequencing data will help increase clinical benefit relative to sequencing cost. Based on these results and prior work [20, 21], we have since adopted this methodology to screen for mismatch repair deficiency in all cancers undergoing next-generation sequencing in our laboratory, and we encourage other laboratories to explore similar validations in endometrial cancers and other cancer types.