Fig. 3

Small insertion/deletion variants are observed in distinct proportions of sequence reads. A, Variants were separated by class and by whether they were novel or previously reported in dbSNP, then plotted according to the percentage of sequence reads in which the variant was observed. Dashed lines represent 25% and 75% of sequence reads. Substitution variants largely clustered near 50% or 100% of sequence reads, as would be expected for heterozygous or homozygous variants, respectively. Known insertion/deletion variants (Ins/Del) also tended to cluster near 50% or 100% of sequence reads. However, novel small insertion/deletion variants had a much wider distribution, with a large proportion of variants detected in fewer than 25% of the sequence reads. B, The depth of coverage, defined as the number of sequence reads in the alignment at a given variant, is shown for the different types of sequence variants. No obvious differences in total depth of coverage were observed between the classes of variants. C–F, Known substitutions, known insertions/deletions, novel substitutions, and novel insertions/deletions were plotted according to the total read depth and the percentage of reads in which the variant was observed. The novel insertion/deletion variants clearly follow a distinct pattern: a large proportion are detected in fewer than 25% of sequence reads, yet there is no clear relationship between depth of coverage and the percentage of reads in which the variant was observed.
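
The quantity on the percentage axis can be illustrated with a minimal sketch, assuming each variant comes with a count of supporting reads and a total read depth. The function names, bin labels, and example counts below are hypothetical and are not taken from the figure or from the authors' pipeline; only the 25% and 75% thresholds correspond to the dashed lines described in the caption.

```python
def variant_read_percentage(variant_reads: int, total_depth: int) -> float:
    """Percentage of aligned reads at a site that carry the variant."""
    if total_depth <= 0:
        raise ValueError("total_depth must be positive")
    if not 0 <= variant_reads <= total_depth:
        raise ValueError("variant_reads must be between 0 and total_depth")
    return 100.0 * variant_reads / total_depth


def read_fraction_bin(pct: float, low: float = 25.0, high: float = 75.0) -> str:
    """Place a percentage relative to the 25% and 75% dashed lines.

    The bin labels are illustrative assumptions, not terms from the source.
    """
    if pct < low:
        return "below_25"
    if pct > high:
        return "above_75"
    return "25_to_75"


# Made-up example counts (variant reads, total depth), for illustration only.
examples = [(12, 24), (30, 30), (3, 30)]
for variant_reads, depth in examples:
    pct = variant_read_percentage(variant_reads, depth)
    print(variant_reads, depth, pct, read_fraction_bin(pct))
```

Under these assumptions, a heterozygous variant would be expected to fall in the middle bin and a homozygous variant in the upper bin, while the novel insertion/deletion variants described above would often fall in the lower bin.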