Statistical Analysis of Genomic Data
In this chapter we describe methods for the statistical analysis of GWAS data. The goal is to quantify the evidence for genomic effects associated with trait variation while avoiding spurious associations that arise from poorly quantified evidence or from population structure.
Single marker analysis and imputation are discussed in Sect. 1, and a Bayesian multi-locus analysis using the BayesQTLBIC R package (1, 2) is described in Sect. 2. The multi-locus analysis, applied in a genomic window, enables local inference of the QTL genetic architecture and is an alternative to imputation. Multi-locus analysis with BayesQTLBIC, including calculation of posterior probabilities for alternative models, posterior probabilities for the number of QTL, marginal probabilities for markers, and Bayes factors for individual chromosomes, is demonstrated for simulated QTL data. Methods for correcting for population structure, and the possible effects of population structure on power, are discussed in Sect. 3. Section 4 considers analysis combining information from linkage and linkage disequilibrium when sampling from a pedigree. Section 5 considers combining information from two different studies, showing that data from an existing QTL mapping family can be profitably used in combination with an association study: prior odds are higher for candidate genes that map into a QTL region identified in the QTL mapping family, and, optionally, the number of markers genotyped in the association study can be reduced. Examples using R and the R packages BayesQTLBIC and ncdf are given.