The National Cancer Institute (NCI) Genomic Data Commons (GDC) contains more than 2.9 petabytes of genomic and associated clinical data from more than 60 NCI-funded and other contributed cancer genomics research projects. The GDC consists of five applications over a common data model and a common application programming interface.
Change history
18 May 2021
A Correction to this paper has been published: https://doi.org/10.1038/s41588-021-00883-2
Acknowledgements
This project was funded in part with Federal funds from the National Cancer Institute, National Institutes of Health, agreement 14X050 and task order T02 under agreement 17X147 under contract HHSN261200800001E. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products or organizations imply endorsement by the US Government. The project is grateful for the contributions of S. Marechek and E. Miller, both of whom have passed away.
Author information
Contributions
The GDC software was developed and tested by A. Khurana, A. Kadam, A.W., A.H., A.C., A.Z., B.F.C., B.L.W., B.R., B.B., C.F.B., C.W., C.D., C.K.Y., C.Y., C.P.R., F. Gomez, F. Gerthoffert, F.C., G.L.G., I.M., J.C.A., J.J.P., J.B., J.A.M., J.P., J. Spring, J. Sislow, J.T.Y., J.S.M., J.Z., J.H.B.B., K.W., K.H., K.R., K.C., K.M., K.M.J.K., K.A.S., L.L., L.X., M.A., M.W.M., M.Y.P., M.S.F., M. Ford, M. Fukuma, P.L.P., P.-M.D., P.M., R.P., R.A., R.L.G., R.B., R.J., R.O.O., S.R., S.S., S.W., S.J., S.A., T.N., T.I.G., V.E.K., V.F., W.P.W., Y.T. and Y.Z. Bioinformatics, data curation and data modeling were performed by A.P.K., D.P.M., F.M.O., J.H.S., J.Z., K.M.H., L.S., M.A.J., M.L.F., R.L.G., R.B., S.L., T.M.L., T.I.G., V.F., W.P.W., Z.Z. and Z.W. The project managers were A.P.H., B.I., C.K.Y., D.S.G., E.M., F.G., H.D.T., H.S., J.L., J.C.Z., L.Y., L. Stein, L. Staudt, M.A.J., M.L.F., M.T., R.L.G., S.S.G., S.G., T.D., T.J.S., V.F. and Z.W. The manuscript was written and revised by A.P.H., D.S.G., J.C.Z., L. Staudt, R.L.G. and Z.Z.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information: Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work.
Cite this article
Heath, A.P., Ferretti, V., Agrawal, S. et al. The NCI Genomic Data Commons. Nat Genet 53, 257–262 (2021). https://doi.org/10.1038/s41588-021-00791-5
DOI: https://doi.org/10.1038/s41588-021-00791-5