Xiaoquan Wen, Matthew Stephens
(Submitted on 4 Nov 2011 (v1), last revised 8 Nov 2011 (this version, v2))
In genetic association analyses, it is often desirable to analyze data from multiple potentially heterogeneous subgroups. The amount of expected heterogeneity can vary from modest (as might typically be expected in a meta-analysis of multiple studies of the same phenotype, for example) to large (e.g. a strong gene-environment interaction, where the environmental exposure defines discrete subgroups). Here, we consider a flexible set of Bayesian models and priors that can capture these different levels of heterogeneity. We provide accurate numerical approaches to compute approximate Bayes Factors for these different models, and also some simple analytic forms which have natural interpretations and, in some cases, close connections with standard frequentist test statistics. These approximations also have the convenient feature that they require only summary-level data from each subgroup (in the simplest case, a point estimate for the genetic effect, and its standard error, from each subgroup). We illustrate the flexibility of these approaches on three examples: an analysis of a potential gene-environment interaction for a recombination phenotype, a large-scale meta-analysis of genome-wide association data from the Global Lipids consortium, and a cross-population analysis for expression quantitative trait loci (eQTLs).
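To make the summary-level input concrete, the following is a minimal sketch of an approximate Bayes Factor computed from per-subgroup point estimates and standard errors alone. It is not the authors' formulation: it assumes the simplest limiting case of no heterogeneity (one shared effect across subgroups), a normal sampling distribution for each estimate, and a zero-mean normal prior on the shared effect. The function name `fixed_effect_log10_abf` and the `prior_sd` parameter, including its default value, are hypothetical choices made for illustration.

```python
import math


def fixed_effect_log10_abf(estimates, std_errors, prior_sd=0.2):
    """Log10 approximate Bayes Factor (alternative vs. null) from summary data.

    Assumptions (illustrative, not taken from the paper):
      - each estimate_s ~ Normal(beta, std_error_s^2) with a shared beta
      - prior under the alternative: beta ~ Normal(0, prior_sd^2)
      - null hypothesis: beta = 0
    """
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need one standard error per estimate")

    # Inverse-variance weights, as in a standard fixed-effect meta-analysis.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)

    # Pooled estimate and its sampling variance.
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_var = 1.0 / total_weight

    # Squared Z statistic of the pooled estimate (the usual frequentist test).
    z_squared = pooled ** 2 / pooled_var

    # Ratio of marginal densities Normal(0, V + W) / Normal(0, V) at the
    # pooled estimate, with V = pooled_var and W = prior_sd^2.
    shrink = prior_sd ** 2 / (pooled_var + prior_sd ** 2)
    log_abf = 0.5 * math.log(1.0 - shrink) + 0.5 * z_squared * shrink

    return log_abf / math.log(10.0)


if __name__ == "__main__":
    # Made-up summary statistics for three subgroups.
    betas = [0.10, 0.12, 0.09]
    ses = [0.03, 0.04, 0.05]
    print(fixed_effect_log10_abf(betas, ses))
```

In this sketch the Bayes Factor depends on the data only through the pooled Z statistic and its variance, which is one way to see the kind of connection to standard frequentist test statistics that the abstract mentions. Models that allow heterogeneity between subgroups would need an additional prior on how effects vary across subgroups, which this sketch deliberately leaves out.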