Abstract
Recent advances in high-throughput sequencing technologies make it increasingly efficient to sequence large cohorts for many complex traits. We discuss here a class of sequence-based association tests for family-based designs that corresponds naturally to previously proposed population-based tests, including the classical burden and variance-component tests. This framework allows a direct comparison of the power of sequence-based association tests under family-based and population-based designs. We show that, for dichotomous traits, using family-based controls yields power similar to that of the population-based design (although at a higher sequencing cost for the family-based design), whereas for continuous traits (in random samples, with no ascertainment) the population-based design can be substantially more powerful. A possible disadvantage of population-based designs is that they can lead to increased false-positive rates in the presence of population stratification, whereas family-based designs are robust to population stratification. We also present an application to a small exome-sequencing family-based study of autism spectrum disorders. The tests are implemented in publicly available software.
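To make the two test families named in the abstract concrete, here is a minimal Python sketch of the standard burden and variance-component (SKAT-type) statistics built from per-variant scores. It is an illustration under stated assumptions, not the authors' software. The function names, the toy data, and the unit weights are hypothetical. The family-based variant shown is the usual trio construction, which centres each offspring genotype on its expectation given the parents; the abstract does not spell out this construction. P-value computation is omitted.

```python
def score_vector(genotypes, residuals, expected=None):
    """Per-variant scores S_j = sum_i r_i * (G_ij - E_ij).

    genotypes: list of rows, one per subject, holding minor-allele counts.
    residuals: trait residuals r_i (e.g. y_i minus the sample mean).
    expected:  optional matrix E_ij. For a trio design (assumption) this
               is the mean of the two parental genotypes. None means no
               centring, as in a population-based score.
    """
    n_variants = len(genotypes[0])
    scores = []
    for j in range(n_variants):
        s = 0.0
        for i, row in enumerate(genotypes):
            e = expected[i][j] if expected is not None else 0.0
            s += residuals[i] * (row[j] - e)
        scores.append(s)
    return scores


def burden_stat(scores, weights):
    """Burden statistic: square of the weighted sum of scores."""
    return sum(w * s for w, s in zip(weights, scores)) ** 2


def variance_component_stat(scores, weights):
    """Variance-component statistic: sum of squared weighted scores."""
    return sum((w * s) ** 2 for w, s in zip(weights, scores))


# Toy population-based example (hypothetical data): 4 subjects, 2 variants.
G = [[1, 0], [0, 1], [0, 0], [1, 0]]
y = [2.0, 0.0, 1.0, 1.0]
y_mean = sum(y) / len(y)
resid = [v - y_mean for v in y]
weights = [1.0, 1.0]  # hypothetical unit weights

pop_scores = score_vector(G, resid)
burden = burden_stat(pop_scores, weights)
vc = variance_component_stat(pop_scores, weights)

# Toy trio example (hypothetical data): two affected offspring, with the
# expected genotype taken as the parental mean and a trait offset of 0.
offspring = [[1, 0], [1, 1]]
parental_mean = [[0.5, 0.0], [0.5, 0.5]]
fam_scores = score_vector(offspring, [1.0, 1.0], expected=parental_mean)
fam_burden = burden_stat(fam_scores, weights)
fam_vc = variance_component_stat(fam_scores, weights)
```

In the toy population data the two variants have scores of opposite sign, so the burden statistic cancels to zero while the variance-component statistic does not. This is the usual reason given for preferring variance-component tests when a region contains both risk and protective variants.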
Acknowledgements
The research was partially supported by National Science Foundation grant DMS-1100279 and National Institutes of Health grants R01MH095797 and 1R03HG005908 (to II-L), a Seaver Foundation grant and National Institutes of Health grants MH089025 and (to JDB) and National Institutes of Health grants R37 CA076404 and P01CA134294 (to SL and XL).
Ethics declarations
Competing interests
The authors declare no conflict of interest.
Additional information
Supplementary Information accompanies the paper on the European Journal of Human Genetics website.
About this article
Cite this article
Ionita-Laza, I., Lee, S., Makarov, V. et al. Family-based association tests for sequence data, and comparisons with population-based association tests. Eur J Hum Genet 21, 1158–1162 (2013). https://doi.org/10.1038/ejhg.2012.308
DOI: https://doi.org/10.1038/ejhg.2012.308