Nature Precedings
Variant calling comparison CASAVA1.8 and GATK
  • Manuscript
  • Open access
  • Published: 18 July 2011

  • Denis Bauer (1)

Nature Precedings (2011)

Abstract

This work addresses whether the new CASAVA1.8, which boasts improvements such as local realignment of reads, is on par with the well-accepted pipeline of BWA mapping, duplicate removal, local realignment, recalibration and variant calling using GATK. We therefore run both methods on chromosome 21 of a Yoruba trio and compare the results to the genotypes identified by the 1000 Genomes Project. We find that the mapping performance is the same for CASAVA1.8 and the academic pipeline, resulting in a mean coverage of about 22×. CASAVA1.8 and GATK both call about 70,000 SNPs per individual, of which 80% overlap between CASAVA1.8, GATK and the 1000 Genomes Project. This stands in contrast to the indel calling performance, where CASAVA1.8 calls about 12,000 indels while GATK calls 16,000. Furthermore, CASAVA1.8 has a higher Mendelian error rate and frequently reports more than one alternative allele per locus, indicating a non-optimal alignment. We conclude that CASAVA1.8 has come a long way and can be considered a mature SNP calling approach. However, its indel call set does not match the quality delivered by the newly incorporated Dindel algorithm of GATK. It hence remains best practice to use CASAVA1.8 for producing fastq files and to switch at this stage to the academic tools for mapping, alignment improvement and variant calling.
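
The two comparison metrics named in the abstract, call-set overlap and Mendelian error rate in a trio, can be illustrated with a minimal Python sketch. This is not the author's code: the function names, the toy variants and the choice to measure overlap as the three-way intersection divided by the union are all assumptions made for illustration, and the abstract does not state which denominator underlies its 80% figure.

```python
from itertools import product


def three_way_overlap(a, b, c):
    """Fraction of all distinct calls (the union) found in all three sets.

    Each variant is a hashable key such as (chrom, pos, ref, alt).
    Using the union as denominator is an assumption of this sketch.
    """
    union = a | b | c
    if not union:
        return 0.0
    return len(a & b & c) / len(union)


def is_mendelian_consistent(child, mother, father):
    """True if the child's genotype can be formed from one maternal
    and one paternal allele (diploid, autosomal site)."""
    return any(
        sorted((m, f)) == sorted(child)
        for m, f in product(mother, father)
    )


def mendelian_error_rate(trio_genotypes):
    """Share of sites whose (child, mother, father) genotypes violate
    Mendelian inheritance."""
    sites = list(trio_genotypes)
    if not sites:
        return 0.0
    errors = sum(
        not is_mendelian_consistent(child, mother, father)
        for child, mother, father in sites
    )
    return errors / len(sites)


# Hypothetical toy call sets; positions and alleles are made up.
casava = {("chr21", 100, "A", "G"), ("chr21", 200, "C", "T"), ("chr21", 300, "G", "A")}
gatk = {("chr21", 100, "A", "G"), ("chr21", 200, "C", "T"), ("chr21", 400, "T", "C")}
kg = {("chr21", 100, "A", "G"), ("chr21", 200, "C", "T"),
      ("chr21", 300, "G", "A"), ("chr21", 400, "T", "C")}

# Hypothetical trio genotypes per site: (child, mother, father).
trio = [
    (("A", "G"), ("A", "A"), ("G", "G")),  # consistent
    (("G", "G"), ("A", "A"), ("A", "G")),  # error: mother carries no G
    (("A", "A"), ("A", "G"), ("A", "A")),  # consistent
]

overlap = three_way_overlap(casava, gatk, kg)
error_rate = mendelian_error_rate(trio)
print(f"three-way overlap: {overlap:.2f}")
print(f"Mendelian error rate: {error_rate:.2f}")
```

In practice the variant keys and genotypes would be parsed from each caller's output rather than typed in by hand, and indels would need normalising before comparison, since two callers can represent the same indel at different positions.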

Author information

Authors and Affiliations

  1. Queensland Brain Institute

    Denis Bauer

Corresponding author

Correspondence to Denis Bauer.

Rights and permissions

Creative Commons Attribution 3.0 License.

About this article

Cite this article

Bauer, D. Variant calling comparison CASAVA1.8 and GATK. Nat Prec (2011). https://doi.org/10.1038/npre.2011.6107.1

  • Received: 18 July 2011

  • Accepted: 18 July 2011

  • Published: 18 July 2011

  • DOI: https://doi.org/10.1038/npre.2011.6107.1

Keywords

  • CASAVA1.8
  • GATK
  • SNP
  • indels
  • variant calling
  • ngs
  • second generation sequencing

This article is cited by

  • QTL mapping and identification of genes associated with the resistance to Acanthoscelides obtectus in cultivated common bean using a high-density genetic linkage map

    • Xiaoming Li
    • Yongsheng Tang
    • Shumin Wang

    BMC Plant Biology (2022)

  • Detailed simulation of cancer exome sequencing data reveals differences and common limitations of variant callers

    • Ariane L. Hofmann
    • Jonas Behr
    • Niko Beerenwinkel

    BMC Bioinformatics (2017)
