Abstract
Next-generation sequencing (NGS) is routinely used for constitutional genetic analysis. However, cross-contamination between samples is a major risk that can affect the results of the analysis. We have developed ART-DeCo, a tool that uses the allelic ratio (AR) of the single nucleotide polymorphisms (SNPs) sequenced with the regions of interest. When a sample is contaminated by DNA with a different genotype, unexpected ARs are obtained. These are used first to detect contamination with a screening test, and then to identify and quantify the contaminant. Following optimization, ART-DeCo was applied to 2222 constitutional DNA samples. The screening test was positive for 191 samples. In 33 cases (contamination percentages: 1.3% to 29.2%), the contaminant was identified and was mostly located in adjacent wells. Three other positive cases were due to barcoding errors or a mixture of two DNA samples. Interestingly, the remaining contaminated sample corresponded to a bone marrow transplant recipient. Lastly, no contaminant was identified in 154 weakly positive (<4%) samples, which were considered irrelevant to constitutional genetic analysis. ART-DeCo lends itself to mandatory quality control procedures and also highlights the delicate steps of library preparation, leading to improvements in practice. Importantly, ART-DeCo can be implemented in any NGS workflow, from gene panel to genome-wide analyses. ART-DeCo is available at https://sourceforge.net/projects/ngs-art-deco/.
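The principle described in the abstract, that a contaminating genotype shifts SNP allelic ratios away from the values expected for a diploid sample (0, 0.5 or 1), can be illustrated with a minimal Python sketch. This is not ART-DeCo's implementation. The function names, the tolerance of 0.1, the 20% screening cut-off and the simple linear mixture model are all assumptions chosen for illustration.

```python
from statistics import median


def allelic_ratio(ref_reads, alt_reads):
    """Fraction of reads carrying the alternate allele at one SNP."""
    depth = ref_reads + alt_reads
    if depth == 0:
        raise ValueError("SNP has no coverage")
    return alt_reads / depth


def is_unexpected(ar, tol=0.1):
    """True if the AR is farther than `tol` from 0, 0.5 and 1 (assumed tolerance)."""
    return all(abs(ar - expected) > tol for expected in (0.0, 0.5, 1.0))


def screen(snp_counts, tol=0.1, max_unexpected_fraction=0.2):
    """Hypothetical screening rule: flag the sample if too many SNPs have unexpected ARs."""
    ratios = [allelic_ratio(ref, alt) for ref, alt in snp_counts]
    unexpected = sum(is_unexpected(ar, tol) for ar in ratios)
    return unexpected / len(ratios) > max_unexpected_fraction


def estimate_fraction(ratios, sample_dosage, contaminant_dosage):
    """Estimate the contaminant fraction c for one candidate contaminant.

    Assumed model: AR = (1 - c) * gs / 2 + c * gc / 2, where gs and gc are the
    alternate-allele dosages (0, 1 or 2) of the sample and the contaminant.
    Only SNPs where the two genotypes differ are informative.
    """
    estimates = []
    for ar, gs, gc in zip(ratios, sample_dosage, contaminant_dosage):
        if gs == gc:
            continue  # identical genotypes carry no information about c
        estimates.append((ar - gs / 2) / (gc / 2 - gs / 2))
    if not estimates:
        raise ValueError("no informative SNPs")
    return median(estimates)


# Illustrative read counts (ref, alt), not data from the study.
clean = [(100, 0), (50, 50), (0, 100), (48, 52)]
mixed = [(80, 20), (40, 60), (20, 80)]  # simulated 20% contamination

clean_flag = screen(clean)
mixed_flag = screen(mixed)
fraction = estimate_fraction(
    [allelic_ratio(ref, alt) for ref, alt in mixed],
    sample_dosage=[0, 1, 2],
    contaminant_dosage=[2, 2, 0],
)
```

In this sketch the median is used so that a few noisy SNPs do not dominate the estimate. A real workflow would also need to account for sequencing depth, reference bias and genotyping errors, none of which are modelled here.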
Acknowledgements
This work was supported by grant ANR-10-EQPX-03 from the Agence Nationale de la Recherche (Investissements d’Avenir).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Fiévet, A., Bernard, V., Tenreiro, H. et al. ART-DeCo: easy tool for detection and characterization of cross-contamination of DNA samples in diagnostic next-generation sequencing analysis. Eur J Hum Genet 27, 792–800 (2019). https://doi.org/10.1038/s41431-018-0317-x
DOI: https://doi.org/10.1038/s41431-018-0317-x
This article is cited by
- Assessment of gene–disease associations and recommendations for genetic testing for somatic variants in vascular anomalies by VASCERN-VASCA. Orphanet Journal of Rare Diseases (2024)
- Computational analysis of cancer genome sequencing data. Nature Reviews Genetics (2022)
- Contaminant DNA in bacterial sequencing experiments is a major source of false genetic variability. BMC Biology (2020)