Unbiased statistical testing of shared genetic control for potentially related traits
Chris Wallace
(Submitted on 23 Jan 2013)
Integration of data from genomewide single nucleotide polymorphism (SNP) association studies of different traits should allow researchers to disentangle the genetics of potentially related traits within individually associated regions. Methods have ranged from visual comparison of association $p$ values for each trait to formal statistical colocalisation testing of individual regions, which requires selection of a set of SNPs summarizing the association in a region. We show that the SNP selection method greatly affects type 1 error rates, with all published studies to date having used SNP selection methods that result in substantially biased inference. The primary reasons are twofold: random variation in the presence of linkage disequilibrium means selected SNPs do not fully capture the association signal, and selecting SNPs on the basis of significance leads to biased effect size estimates.
We show that unbiased inference can be made either by avoiding variable selection and instead testing the most informative principal components or by integrating over variable selection using Bayesian model averaging. Application to data from Graves’ disease and Hashimoto’s thyroiditis reveals a common genetic signature across seven regions shared between the diseases, and indicates that for five out of six regions which have been significantly associated with one disease and not the other, the lack of evidence in one disease represents genuine absence of association rather than lack of power.
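The second source of bias named above, selecting SNPs on the basis of significance, can be illustrated with a small simulation. The sketch below is not the paper's analysis: it assumes a hypothetical region in which every SNP has the same true effect and independent estimation error (real SNPs in linkage disequilibrium would be correlated), and the function name and parameter values are chosen purely for illustration.

```python
import numpy as np

def selection_bias_demo(true_beta=0.1, se=0.05, n_snps=20, n_reps=5000, seed=0):
    """Compare the average effect estimate over all SNPs with the average
    estimate of the SNP picked as most significant in each replicate.

    Assumption (illustrative only): all SNPs share one true effect and
    their estimates have independent normal errors with equal standard error.
    """
    rng = np.random.default_rng(seed)
    # One row per simulated study, one column per SNP in the region.
    estimates = rng.normal(true_beta, se, size=(n_reps, n_snps))
    # With equal standard errors, the most significant SNP is the one
    # with the largest absolute estimate.
    top = np.abs(estimates).argmax(axis=1)
    selected = estimates[np.arange(n_reps), top]
    return estimates.mean(), selected.mean()

mean_all, mean_selected = selection_bias_demo()
print(f"mean estimate, all SNPs:      {mean_all:.3f}")
print(f"mean estimate, selected SNPs: {mean_selected:.3f}")
```

Under these assumptions the unselected estimates average close to the true effect, while the estimates of the selected SNPs are inflated. This is the kind of bias that avoiding variable selection, or averaging over it, is intended to remove.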