Multi-SNP Haplotype Analysis Methods for Association Analysis
This chapter reviews the rationale for using haplotypes in association testing, discusses the statistical issues that haplotype uncertainty introduces into the analysis, and gives practical guidance for testing haplotype associations with a phenotype or outcome trait, first for candidate gene regions and then genome-wide. Haplotypes are of interest for two reasons. First, a haplotype may be in stronger linkage disequilibrium (LD) with a causal variant than any single measured SNP, so haplotype analysis may extend the coverage provided by the genotyped SNPs beyond that of single-SNP analysis. Second, haplotypes may themselves be the causal variants of interest, and some well-supported examples of this have appeared in the literature. The chapter discusses three approaches to incorporating SNP haplotypes into generalized linear regression models: (1) a simple substitution method that uses imputed haplotypes; (2) simultaneous maximum likelihood (ML) estimation of all parameters, including haplotype frequencies and regression parameters; and (3) a simplified approximation to full ML for case–control data. Examples of each approach are given for a haplotype analysis of a candidate gene. We compare the behavior of the approximation-based methods and show that in most instances the simpler methods hold up well in practice. We also describe the practical implementation of genome-wide haplotype risk estimation and discuss several shortcuts that reduce what would otherwise be a very heavy computational burden.
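As a rough illustration of what the substitution approach (1) involves, the sketch below estimates haplotype frequencies from unphased genotypes with a basic EM algorithm and then computes each individual's expected haplotype counts ("dosages"), which could be used as covariates in a regression model. It is a minimal sketch under stated assumptions, not the chapter's implementation: it assumes biallelic SNPs coded 0/1/2, Hardy-Weinberg equilibrium, no missing data, and a region small enough to enumerate all haplotypes. The function names `compatible_pairs`, `em_haplotype_frequencies`, and `expected_dosage` are hypothetical.

```python
from itertools import product


def compatible_pairs(genotype):
    """Ordered haplotype pairs (h1, h2) whose allele sums match the genotype."""
    haps = list(product((0, 1), repeat=len(genotype)))
    return [
        (h1, h2)
        for h1 in haps
        for h2 in haps
        if all(a + b == g for a, b, g in zip(h1, h2, genotype))
    ]


def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """EM estimate of haplotype frequencies (assumes Hardy-Weinberg equilibrium)."""
    haps = list(product((0, 1), repeat=len(genotypes[0])))
    freq = {h: 1.0 / len(haps) for h in haps}  # uniform starting values
    pairs = [compatible_pairs(g) for g in genotypes]
    for _ in range(n_iter):
        counts = {h: 0.0 for h in haps}
        # E-step: distribute each individual over its compatible pairs.
        for plist in pairs:
            weights = [freq[h1] * freq[h2] for h1, h2 in plist]
            total = sum(weights)
            for (h1, h2), w in zip(plist, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: each individual carries two haplotypes.
        new_freq = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
        delta = max(abs(new_freq[h] - freq[h]) for h in haps)
        freq = new_freq
        if delta < tol:
            break
    return freq


def expected_dosage(genotype, freq):
    """Expected copies (0 to 2) of each haplotype given an unphased genotype."""
    plist = compatible_pairs(genotype)
    weights = [freq[h1] * freq[h2] for h1, h2 in plist]
    total = sum(weights)
    if total == 0:
        raise ValueError("genotype is incompatible with the estimated frequencies")
    dosage = {h: 0.0 for h in freq}
    for (h1, h2), w in zip(plist, weights):
        dosage[h1] += w / total
        dosage[h2] += w / total
    return dosage


if __name__ == "__main__":
    # Toy data: two SNPs, genotypes coded as counts of the minor allele.
    sample = [(0, 0), (0, 0), (1, 1), (2, 2), (0, 1), (1, 0)]
    freq = em_haplotype_frequencies(sample)
    print(freq)
    print(expected_dosage((1, 1), freq))  # the only phase-ambiguous genotype
```

In this sketch the dosages replace the unknown true haplotype counts in the regression, which ignores the uncertainty in phase; the full ML approach (2) instead estimates the haplotype frequencies and regression parameters jointly.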