Abstract
We have previously shown that the selection of haplotype tag single nucleotide polymorphisms (htSNPs) and their statistical analysis in a multi-locus transmission/disequilibrium test (TDT) results in a more cost-effective genotyping strategy in disease association studies of genes by minimising redundancy due to linkage disequilibrium between SNPs. Further savings can be achieved by the use of a two-stage genotyping strategy. This approach is illustrated here in conjunction with the multi-locus TDT in determining whether common alleles of the immune regulatory genes RANK and its ligand TRANCE (RANKL) are associated with type 1 diabetes (T1D). A saving of approximately 75% of potential genotyping reactions could be made with minimal loss of power. There was little evidence from our analysis for association between the TRANCE and RANK genes and T1D in the populations tested.
Acknowledgements
The Wellcome Trust and the Juvenile Diabetes Research Foundation International have funded this work. We thank Vin Everett, Geoff Dolman and Neil Walker for data management and the DNA team for sample preparation. Diabetes UK and the Human Biological Data Interchange are acknowledged for multiplex family collections. We also thank the Norwegian Study Group for Childhood Diabetes, Dag Undlien and Kjersti Ronningen for the collection and provision of Norwegian samples.
Appendix A. Adjusting χ2 tests for stopping for futility in multi-stage association studies
We consider a study in which the test is carried out in a series of n stages, stopping for futility at each stage if the results, thus far, do not achieve a given nominal significance level. For a conventional frequentist interpretation, the nominal significance level at the end of the study should be corrected for the intermediate analyses.
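χ² tests are generated by quadratic forms $T = u^{T}V^{-1}u$, where $u$ is (asymptotically) multivariate normal with variance $V$. The test statistic $T$ then follows a non-central χ² distribution with $v$ df, where $v$ is the rank of $V$, and non-centrality parameter $\eta = \mu^{T}V^{-1}\mu$, where $\mu = E(u)$. If the test is carried out in a series of $n$ stages, involving proportions $p_1,\dots,p_n$ of the total available sample, the score vector decomposes into independent contributions $u = u_1 + u_2 + \cdots + u_n$, where

$$u_i \sim N(p_i\mu,\; p_iV), \qquad i = 1,\dots,n.$$

Writing $u_{[k]}$, $p_{[k]}$ for the partial sums

$$u_{[k]} = \sum_{i=1}^{k} u_i, \qquad p_{[k]} = \sum_{i=1}^{k} p_i,$$

the test statistic carried out after stage $k$ is

$$T_k = \frac{1}{p_{[k]}}\,u_{[k]}^{T}V^{-1}u_{[k]}.$$

The distribution of $T_k$, conditional upon the history of previous results $u_1,\dots,u_{k-1}$, is that of $p_k\chi^2/p_{[k]}$, where $\chi^2$ is a non-central χ² variate with $v$ df and non-centrality parameter

$$\eta_k = \frac{1}{p_k}\,(u_{[k-1]} + p_k\mu)^{T}V^{-1}(u_{[k-1]} + p_k\mu).$$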
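The conditional distribution stated here can be checked in one step. The sketch below assumes that each stage contribution is independent with $u_k \sim N(p_k\mu, p_kV)$ and that $T_k = u_{[k]}^{T}V^{-1}u_{[k]}/p_{[k]}$; the symbol $\eta_k$ for the stage-$k$ non-centrality parameter is introduced for illustration. The last line gives the special case under the null hypothesis ($\mu = 0$), where the history enters only through the previous test statistic.

```latex
\begin{aligned}
u_{[k]} \mid u_1,\dots,u_{k-1}
  &\sim N\!\left(u_{[k-1]} + p_k\mu,\; p_kV\right),\\[4pt]
T_k = \frac{1}{p_{[k]}}\,u_{[k]}^{T}V^{-1}u_{[k]}
  &= \frac{p_k}{p_{[k]}}\;u_{[k]}^{T}\,(p_kV)^{-1}\,u_{[k]},\\[4pt]
\eta_k\Big|_{\mu=0}
  &= \frac{1}{p_k}\,u_{[k-1]}^{T}V^{-1}u_{[k-1]}
   = \frac{p_{[k-1]}}{p_k}\,T_{k-1}.
\end{aligned}
```

The quadratic form on the right of the second line is a non-central χ² variate with $v$ df, which gives the scaling factor $p_k/p_{[k]}$. For $k = 1$ the partial sum $u_{[0]}$ is zero, so under the null the first-stage statistic is an ordinary central χ² variate.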
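We stop after stage $k$ if $T_k$ fails to exceed a critical value $c_k$. The probability of exceeding this critical value conditional upon reaching stage $k$ is

$$\Pr(T_k > c_k \mid T_1 > c_1,\dots,T_{k-1} > c_{k-1}) = \int \Pr(T_k > c_k \mid u_1,\dots,u_{k-1})\; dF(u_1,\dots,u_{k-1} \mid T_1 > c_1,\dots,T_{k-1} > c_{k-1}),$$

where $F$ denotes the joint distribution of the earlier stage contributions given that the study has reached stage $k$.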
This integral is intractable but may be approximated by simulation.
An accurate and efficient Monte Carlo method for calculating the overall probability of rejection is to simulate sequences of score vectors subject to the stopping rule described. The length of such sequences will vary from one to $n$. The probability of exceeding the test criterion after stage 2, conditional upon reaching stage 2, may then be calculated by averaging $\Pr(T_2 > c_2 \mid u_1)$ over all simulated values of $u_1$. Similarly, the probability of exceeding the test criterion after stage 3, conditional upon reaching stage 3, may be calculated by averaging $\Pr(T_3 > c_3 \mid u_1, u_2)$ over all simulated pairs of values $(u_1, u_2)$. In this manner, the complete sequence of conditional probabilities can be estimated. When generating the sequences of score vectors we may, without loss of generality, take the $v$ elements of $u_i$ to be independent variates. The overall probability of rejecting the null hypothesis is given by the cumulative product

$$\Pr(\text{reject}) = \prod_{k=1}^{n} \Pr(T_k > c_k \mid T_1 > c_1,\dots,T_{k-1} > c_{k-1}).$$
These calculations are implemented in the R language by the program Nstage.
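The simulation scheme described in this appendix can be sketched in a few lines of Python. This is not the Nstage program; it is a minimal illustration under stated assumptions: the null hypothesis holds ($\mu = 0$), $V$ is taken to be the identity so that each stage contribution has independent $N(0, p_k)$ elements, and $T_k$ is the squared length of the partial sum divided by $p_{[k]}$. It also swaps in a cruder estimator: each conditional probability is estimated by counting the simulated sequences that exceed $c_k$ among those that reached stage $k$, rather than by averaging the non-central χ² tail probability over the simulated histories. The function name `simulate_staged_test` and its parameters are hypothetical.

```python
import math
import random


def simulate_staged_test(proportions, critical_values, df, n_sim=20000, seed=1):
    """Estimate stage-wise conditional and overall rejection probabilities
    under the null for a multi-stage chi-squared test that stops for
    futility whenever T_k <= c_k.

    proportions     -- p_1..p_n, fraction of the total sample used at each stage
    critical_values -- c_1..c_n, values T_k must exceed to continue (or reject)
    df              -- degrees of freedom v of the test
    """
    rng = random.Random(seed)
    n_stages = len(proportions)
    reached = [0] * n_stages  # sequences that arrived at stage k
    passed = [0] * n_stages   # of those, sequences with T_k > c_k

    for _ in range(n_sim):
        u_sum = [0.0] * df    # partial sum u_[k]; assumption: V = identity
        p_sum = 0.0           # partial sum p_[k]
        for k in range(n_stages):
            reached[k] += 1
            sd = math.sqrt(proportions[k])
            # Stage contribution u_k ~ N(0, p_k I) under the null hypothesis
            u_sum = [u + rng.gauss(0.0, sd) for u in u_sum]
            p_sum += proportions[k]
            t_k = sum(u * u for u in u_sum) / p_sum
            if t_k <= critical_values[k]:
                break         # stop for futility
            passed[k] += 1

    # Conditional exceedance probabilities and their cumulative product
    conditional = []
    overall = 1.0
    for k in range(n_stages):
        prob = passed[k] / reached[k] if reached[k] else 0.0
        conditional.append(prob)
        overall *= prob
    return conditional, overall


# Example: two equal stages with 2 df. For 2 df the chi-squared tail is
# exp(-x / 2), so the critical value for nominal level a is -2 * log(a).
c1 = -2.0 * math.log(0.30)  # continue to stage 2 only if nominal p < 0.30
c2 = -2.0 * math.log(0.05)  # final test at nominal level 0.05
stage_probs, overall_prob = simulate_staged_test([0.5, 0.5], [c1, c2], df=2)
print(stage_probs, overall_prob)
```

Because a sequence must pass every stage to reject, the overall probability estimated this way cannot exceed the nominal level of the final test, which is why the final nominal significance level needs correcting for the intermediate analyses. Averaging the exact conditional tail probabilities, as the appendix describes, would reduce the Monte Carlo error for the same number of simulated sequences.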
Cite this article
Lowe, C., Cooper, J., Chapman, J. et al. Cost-effective analysis of candidate genes using htSNPs: a staged approach. Genes Immun 5, 301–305 (2004). https://doi.org/10.1038/sj.gene.6364064