  • Analysis
Comprehensive analysis of cancer-associated somatic mutations in class I HLA genes

This article has been updated

Abstract

Detection of somatic mutations in human leukocyte antigen (HLA) genes using whole-exome sequencing (WES) is hampered by the high polymorphism of the HLA loci, which prevents alignment of sequencing reads to the human reference genome. We describe a computational pipeline that enables accurate inference of germline alleles of class I HLA-A, B and C genes and subsequent detection of mutations in these genes using the inferred alleles as a reference. Analysis of WES data from 7,930 pairs of tumor and healthy tissue from the same patient revealed 298 nonsilent HLA mutations in tumors from 266 patients. These 298 mutations are enriched for likely functional mutations, including putative loss-of-function events. Recurrence of mutations suggested that these 'hotspot' sites were positively selected. Cancers with recurrent somatic HLA mutations were associated with upregulation of signatures of cytolytic activity characteristic of tumor infiltration by effector lymphocytes, supporting immune evasion by altered HLA function as a contributory mechanism in cancer.

Figure 1: Development and validation of Polysolver for inference of MHC class I type.
Figure 2: Polysolver for the detection of somatic mutations in MHC class I alleles across cancers.
Figure 3: Distribution of HLA mutations across cancers and across functional domains and tumor types.
Figure 4: Distribution of MHC class I mutations and evidence of positive functional selection.

Change history

  • 01 October 2015

    In the version of this article initially published online, there were errors in three equations in the first page of Online Methods, in the section “Allele inference”: “= ei/3 otherwise” should have been on a separate line; the equation beginning with “P (D = dk)” was missing an equal sign immediately after this expression; and in the equation starting with Lm, the fifth summation sign was missing “k = 1”. On p. 3, under the first subheading on the right-hand side, “ovarian cancer (n = 432)” should have read “thyroid cancer (n = 486).” In addition, the citation for Supplementary Software was missing. The errors and omission have been corrected for the print, PDF and HTML versions of this article.

Acknowledgements

C.J.W. is a Scholar of the Leukemia and Lymphoma Society and acknowledges support from the Blavatnik Family Foundation, American Association for Cancer Research (AACR) (SU2C Innovative Research Grant), National Heart, Lung, and Blood Institute (NHLBI) (1R01HL103532-01) and National Cancer Institute (NCI) (1R01CA155010-01A1). This work has made extensive use of data generated by TCGA, a project of the National Cancer Institute and National Human Genome Research Institute. We thank E. Hodis for providing access to the melanoma data. We would also like to thank C. McCowan (Broad Technology Labs), T. Shea (Broad Technology Labs), S. Young (Broad Technology Labs) and M. Weiand (Pacific Biosciences) for their help in setting up, performing and analyzing data using Pacific Biosciences RSII instruments. We are grateful to E. Fritsch for critical reading of the manuscript and providing valuable feedback.

Author information

Contributions

C.J.W. proposed the initial idea of using exome data for HLA typing. S.A.S., G.G., C.J.W., P.M.D. and K.C. conceived and designed Polysolver and the mutation detection pipeline. G.T. and A.K. developed the ethnicity inference module. M.S.L. and S.A.S. performed the mutation significance analysis. V.B., C.J.W. and S.A.S. mapped the contact residue mutations. M.S.R. and N.H. performed the gene expression analysis. J.S., W.J.L., S.S. and J.L.D. performed the experimental validation. C.S. helped with data access and management. C.J.W., S.A.S., G.G. and M.R. wrote the manuscript. C.J.W. and G.G. led the project.

Corresponding authors

Correspondence to Catherine J Wu or Gad Getz.

Ethics declarations

Competing interests

A patent application has been filed on this work by the Broad Institute with S.A.S., C.J.W. and G.G. as authors. G.G. is an inventor of Mutect and MutSig, which were used in this work. C.J.W. and N.H. are founders of Neon Therapeutics.

Integrated supplementary information

Supplementary Figure 1 GC%, coverage and informative sites in HLA genes in 8 CLL samples.

(a) A significant negative correlation was observed between GC content and exome coverage (1-way ANOVA, P = 1.6 × 10⁻⁷). Mapping was carried out using BWA with the following parameters: aln task, -q 5 -l 32 -k 2 -o 1; sampe task, -a 300. (b) GC-rich regions of HLA genes have a relative over-abundance of informative (variant) sites (1-way ANOVA, P = 0.0197). (c) Detailed view of GC%, coverage and informative site density in each HLA gene from 1 representative CLL sample. Top row: the x-axis represents the chr6 location. The mid-panel dashed black segments represent exons. GC% (green) decreases in the 5′→3′ direction (HLA-B and HLA-C are located on the negative strand). Coverage (blue) has an opposite trend and increases in the 5′→3′ direction. The informative site density (red) was evaluated as the number of variant sites located in a 50 bp window, and tracked with GC%. Bottom row: the coverage distribution at the variant positions in each of HLA-A, -B and -C.
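The per-window quantities plotted in panel (c) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, and the use of consecutive, non-overlapping 50 bp windows is an assumption, since the caption does not say whether the window slides.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def window_profile(seq, variant_positions, window=50):
    """Return (start, GC%, variant count) for consecutive non-overlapping windows.

    Positions are 0-based offsets into `seq`. Non-overlapping windows are an
    assumption made for this sketch, not a detail taken from the caption.
    """
    profile = []
    for start in range(0, len(seq), window):
        end = min(start + window, len(seq))
        n_variants = sum(start <= pos < end for pos in variant_positions)
        profile.append((start, gc_percent(seq[start:end]), n_variants))
    return profile


# Toy 100 bp sequence: a GC-only half followed by an AT-only half.
toy_seq = "GC" * 25 + "AT" * 25
toy_profile = window_profile(toy_seq, [3, 10, 49, 75])
```

In this toy input the GC-rich window also carries more variant sites, which mirrors the trend the caption describes but is built into the example rather than derived from data.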

Supplementary Figure 2 Specificity of different tag length libraries for retrieval of HLA reads.

A broad range of tag length libraries were evaluated for their specificity for HLA-A, -B and -C genes. Since we had 76-mer paired-end reads, we selected a 38-mer tag library, which ensured 100% sensitivity in the context of downstream processing with 23.3% specificity for class I HLA genes.
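Tag-based read retrieval of this kind can be sketched in Python as follows. This is an illustration under stated assumptions, not the Polysolver implementation: the function names are hypothetical, and matching on exact forward-strand k-mers only is an assumption. One observation that may explain the choice of 38, though the caption does not state it, is that a 76-mer read with a single mismatch always retains at least one intact 38-mer.

```python
def build_tag_library(allele_seqs, k=38):
    """Collect every k-mer (tag) present in the reference allele sequences."""
    tags = set()
    for seq in allele_seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            tags.add(seq[i:i + k])
    return tags


def read_matches(read, tags, k=38):
    """True if any k-mer of the read is an exact match to a tag.

    Reads shorter than k never match. Forward-strand matching only is an
    assumption of this sketch.
    """
    read = read.upper()
    return any(read[i:i + k] in tags for i in range(len(read) - k + 1))


# Toy usage with short tags so the example is easy to follow.
toy_tags = build_tag_library(["ACGTACGGTT"], k=4)
keep = read_matches("TTACGGAA", toy_tags, k=4)   # shares the 4-mer "TACG"
drop = read_matches("AAAAAAAA", toy_tags, k=4)   # shares no 4-mer
```

Shorter tags would retrieve more reads but with lower specificity, which is the trade-off the figure evaluates.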

Supplementary Figure 3 Ethnicity inference using PCA (HapMap samples).

Ethnicities of 132 of 133 HapMap samples were inferred correctly based on their projection in the 2-dimensional space defined by the first two principal components. The colored icons show the clustering of the 1,398 training samples belonging to four different ethnic groups. The black icons depict the projection of 132 HapMap samples in this space (NA12878 was removed from the PCA step as an outlier). Among the 132 projected samples, the success rate for attributing the correct ethnicity was 100%.
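Projecting a sample onto two precomputed principal components and assigning it a group label can be sketched in Python as below. The caption does not state the classification rule, so the nearest-centroid assignment is an assumption, and all function names and the toy numbers are hypothetical.

```python
import math


def project(genotype, mean, components):
    """Project a genotype vector onto the given principal components."""
    centered = [g - m for g, m in zip(genotype, mean)]
    return tuple(sum(c * w for c, w in zip(centered, pc)) for pc in components)


def centroids(labelled_points):
    """Mean 2-D position of the training samples in each labelled group."""
    sums = {}
    for label, (x, y) in labelled_points:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def nearest_label(point, cents):
    """Assign the label of the closest centroid (assumed rule, see lead-in)."""
    return min(cents, key=lambda label: math.dist(point, cents[label]))


# Toy data: three markers, two axis-aligned "principal components".
toy_point = project([2, 0, 1], mean=[1, 1, 1], components=[[1, 0, 0], [0, 1, 0]])
toy_cents = centroids([("A", (0, 0)), ("A", (2, 0)), ("B", (10, 10))])
toy_label = nearest_label(toy_point, toy_cents)
```

In practice the mean vector and components would come from a PCA fitted on the training samples; they are hard-coded here only to keep the sketch self-contained.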

Supplementary Figure 4 Characteristics of HLA mutations detected by POLYSOLVER across 7,930 samples.

(a) Allelic frequencies of all 298 detected HLA somatic changes. The median allele fraction across somatic changes was 33% (interquartile range: 16–58%). Most of these mutations are likely heterozygous. (b) Frequency of HLA mutations in samples. 240 of 266 (90.2%) samples with HLA mutations only had a single somatic event, 20 had two and 6 samples (4 colon, 1 stomach and 1 uterine) had 3 distinct HLA mutations. (c) Frequency of cases per recurrently mutated site. 57 of 64 recurrently mutated sites were defined as recurrent on the basis of 2 to 4 specimens with a mutation at the same site. Residues 25, 299, 7 and 209 were found to be highly recurrent, with 7, 9, 11 and 24 distinct individuals harboring mutations at these four positions, respectively. (d) Length-normalized distribution of HLA mutations across functional domains. A strong preference of potentially loss-of-function events (nonsense, frameshift indels, splice site mutations) for exon 1 is observed.
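The counts in panel (b) can be reconciled with the totals quoted elsewhere (266 patients, 298 mutations) by a short arithmetic check. The Python sketch below uses only numbers stated in the caption; the function name is hypothetical.

```python
def reconcile(samples_by_mutation_count):
    """Return (total samples, total mutations, % of samples with one event)."""
    total_samples = sum(samples_by_mutation_count.values())
    total_mutations = sum(
        n_mut * n_samples for n_mut, n_samples in samples_by_mutation_count.items()
    )
    single_pct = round(100 * samples_by_mutation_count.get(1, 0) / total_samples, 1)
    return total_samples, total_mutations, single_pct


# Counts from panel (b): 240 samples with one event, 20 with two, 6 with three.
panel_b = {1: 240, 2: 20, 3: 6}
totals = reconcile(panel_b)
```

The result, 266 samples and 240 + 40 + 18 = 298 mutations with 90.2% single-event samples, agrees with the figures given in the abstract and in the caption.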

Supplementary Figure 5 Genes with significantly reduced expression in HLA mutant samples across tumor types.

More than 80 genes were identified pan-cancer (P < 10⁻¹⁰); however, a coherent theme was not evident among them.

Supplementary information

Supplementary Figures and Notes

Supplementary Figures 1–5 and Supplementary Notes 1–5 (PDF 3631 kb)

Supplementary Tables

Supplementary Tables 1–16 (XLSX 7128 kb)

Supplementary Software

Supplementary Software (ZIP 80387 kb)

Cite this article

Shukla, S., Rooney, M., Rajasagi, M. et al. Comprehensive analysis of cancer-associated somatic mutations in class I HLA genes. Nat Biotechnol 33, 1152–1158 (2015). https://doi.org/10.1038/nbt.3344

  • DOI: https://doi.org/10.1038/nbt.3344
