Table 1 CNV calling tools included in the study.
From: Benchmarking germline CNV calling tools from exome sequencing data
Tool | Algorithm detail | Features (specifics) | Year |
---|---|---|---|
CANOES | Negative binomial distribution, regression-based normalization (GC-content), HMM | At least 15 samples, average targets 6, distance between targets 70 kb, average rate of CNV occurrence in the exome 10⁻⁸ | 2014 |
CLAMMS | GC-content and average depth normalization, custom reference set using kNN, mixture model, HMM | 0.3 < GC < 0.7, mappability > 0.75 | 2015 |
cn.MOPS | GC-content and sample normalization, mixture of Poissons model and Bayesian approach | At least 6 samples, minimum segments 5 | 2012 |
CNVkit | In-target and off-target regions, bias (GC-content, repeat-masked fraction, target density) correction using rolling median, CBS | Exclude poorly mappable regions | 2016 |
CODEX | Log-linear decomposition-based normalization, Poisson likelihood-based segmentation | 0.2 < GC < 0.8, target length > 20 bp, median target coverage > 20×, mappability > 0.9 | 2015 |
CoNIFER | Singular value decomposition normalization, ±1.5 SVD-ZRPKM threshold | At least 50 samples, probes with median RPKM across samples > 1, samples with a standard deviation of SVD-ZRPKM < 0.5 | 2012 |
CONTRA | Base-level log-ratios, GC-content and library-size correction, calling of significant regions based on a normal distribution, CBS for large variations | Include regions at least 10 bp long with coverage > 10 | 2012 |
DeAnnCNV | Web server, GC normalization, HMM on the log ratio of read counts | CNV evidence threshold > 80 | 2015 |
EXCAVATOR2 | In-target and off-target regions, 3-step normalization (GC-content, mappability, region length), segmentation with shifting level model, FastCall algorithm | Read mapq > 1, minimum number of targets in CNV 4 | 2016 |
exomeCopy | Negative binomial distribution, HMM using background read depth and positional covariates (GC-content, length) | mapq > 1, minimum overlap to include a read in a region 1 bp, median value for background, transition probability to CNV 1e-4, transition probability to normal state 0.05 | 2011 |
ExomeDepth | Beta-binomial distribution, optimized reference set, HMM | Read mapq > 20, max distance between target border and the middle of a paired read to include the read in a region 300 bp, transition probability to CNV 0.0001, expected CNV length 50 kb | 2012 |
ExonDel | Deletion in exome or genes of interest, GC-content median correction, calling by comparing to median depth within the gene | Read mapq > 20, base quality > 20, min percent of covered bp for each exon 0.1, max number of exons in CNV 9 | 2014 |
FishingCNV | PCA of RPKM, CBS on the test sample, comparing segment coverage against control set distribution | Read mapq > 15, base quality 10, RPKM > 3, FDR-adjusted p-value 0.05 | 2013 |
HMZDelFinder | Deletions only, exon and sample filtering, regions with RPKM < 0.65 called as deletions, AOH filtering based on VCF, prioritization based on Z-score | Mean RPKM > 7 across samples, deletion frequency < 0.5%, exclude the 2% of samples with the highest number of deletions | 2017 |
PatternCNV | Log2-transformed RPKM standardization, average and variability pattern training from control samples, bin smoothing within exons | Bin size 10, mapq > 20 | 2014 |
XHMM | Gaussian distribution, PCA normalization, HMM | At least 50 samples, 0.1 < GC < 0.9, 10 bp < target < 10 kbp, mean coverage > 10× across all samples, average targets 6, distance between targets 70 kb, average rate of CNV occurrence in the exome 10⁻⁸ | 2012 |
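
Several of the target-level filters listed in Table 1 are simple threshold checks. The sketch below illustrates how the CODEX and CLAMMS thresholds from the table could be applied to a list of targets. The `Target` record, its field names, and the two helper functions are hypothetical and introduced only for illustration; they are not part of either tool's interface, and the tools themselves may apply these filters differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """One exome capture target (hypothetical record, not any tool's format)."""
    gc: float               # GC fraction, 0.0 to 1.0
    length_bp: int          # target length in base pairs
    median_coverage: float  # median read depth across samples
    mappability: float      # mappability score, 0.0 to 1.0


def passes_codex_like(t: Target) -> bool:
    """Thresholds from the CODEX row of Table 1 (illustrative only)."""
    return (
        0.2 < t.gc < 0.8
        and t.length_bp > 20
        and t.median_coverage > 20
        and t.mappability > 0.9
    )


def passes_clamms_like(t: Target) -> bool:
    """Thresholds from the CLAMMS row of Table 1 (illustrative only)."""
    return 0.3 < t.gc < 0.7 and t.mappability > 0.75


# Made-up example targets, chosen so that the two filters disagree.
targets = [
    Target(gc=0.45, length_bp=150, median_coverage=60.0, mappability=0.99),
    Target(gc=0.25, length_bp=150, median_coverage=60.0, mappability=0.99),
    Target(gc=0.45, length_bp=150, median_coverage=60.0, mappability=0.80),
]

codex_kept = [t for t in targets if passes_codex_like(t)]
clamms_kept = [t for t in targets if passes_clamms_like(t)]
print(len(codex_kept), len(clamms_kept))
```

The second target passes only the CODEX-style GC window, while the third passes only the CLAMMS-style mappability cutoff, which shows how differing defaults can leave each tool with a different set of callable targets.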