Table 1 The genomic context in which a variant is found can be used as preliminary functional analysis

From: Principles for the post-GWAS functional characterization of cancer risk loci

| Classification | Approximate percentagesᵃ | Approximate numbersᵃ |
| --- | --- | --- |
| Intronic | 40 | 1,047 |
| Intergenic | 32 | 838 |
| Within non-coding sequence of a gene | 10 | 262 |
| Upstream | 8 | 210 |
| Downstream | 4 | 105 |
| Non-synonymous coding | 3 | 79 |
| 3′ untranslated region | ∼1 | 26 |
| Synonymous coding | ∼1 | 26 |
| 5′ untranslated region | | |
| Regulatory region | | |
| Nonsense-mediated decay transcript | | |
| Unknown | ∼1 | 26 |
| Splice site | | |
| Gained stop codon | | |
| Frameshift in a coding sequence | | |

The table broadly summarizes the genomic context of disease- and trait-associated SNPs annotated in the Catalog of Genome-Wide Association Studies (http://www.genome.gov/gwastudies/) as of December 9th, 2010: 1,212 published genome-wide associations with P < 5 × 10⁻⁸ for 210 traits, totaling 2,619 SNPs. Most of the SNPs are located in intergenic and intronic positions, but a small percentage are located upstream and downstream of genes, as well as in regulatory regions and splice sites. SNPs in these locations can be analyzed in more detail using more specific bioinformatics tools.

ᵃ Values are indicative and dependent on genomic boundaries used.
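
As a rough consistency check, the approximate percentages can be recomputed from the approximate numbers and the stated total of 2,619 SNPs. The sketch below is illustrative only: the names `counts`, `TOTAL_SNPS` and `approximate_percentages` are assumptions introduced here, and rows whose cells are blank in the table are left out rather than guessed.

```python
# Approximate SNP counts per classification, copied from the table.
# Rows with blank cells in the table are omitted on purpose.
counts = {
    "Intronic": 1047,
    "Intergenic": 838,
    "Within non-coding sequence of a gene": 262,
    "Upstream": 210,
    "Downstream": 105,
    "Non-synonymous coding": 79,
    "3' untranslated region": 26,
    "Synonymous coding": 26,
    "Unknown": 26,
}

# Total number of SNPs stated in the table note.
TOTAL_SNPS = 2619


def approximate_percentages(counts, total):
    """Return each count as a percentage of total, rounded to a whole number."""
    return {name: round(100 * n / total) for name, n in counts.items()}


percentages = approximate_percentages(counts, TOTAL_SNPS)

if __name__ == "__main__":
    for name, pct in percentages.items():
        print(f"{name}: {counts[name]} SNPs, about {pct}%")
```

The listed counts sum to the stated total, and the rounded percentages agree with the table's percentage column, which is consistent with the values being indicative rather than exact.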