Approaching allelic probabilities and Genome-Wide Association Studies from beta distributions
José Santiago García-Cremades, Angel del Río, José A. García, Javier Gayán, Antonio González-Pérez, Agustín Ruiz, O. Sotolongo-Grau, Manuel Ruiz-Marín
(Submitted on 25 Feb 2014)
In this paper we propose a model for the distribution of allelic probabilities, intended to generate populations as reliably as possible. Our objective was to develop a model that allows allelic probabilities to be simulated with different observed truncation and degrees of noise. We also introduce a completely new approach to analyzing a genome-wide association study (GWAS) dataset, starting from a new test of association with a statistical distribution and two effect sizes for each genotype. The new methodological approach was applied to a real data set, together with a Monte Carlo experiment that showed the power performance of the new method. Finally, we compared the new method, based on the beta distribution, with the conventional method (based on the chi-squared distribution) using the kappa agreement index and a principal component analysis (PCA). Both analyses show that the two approaches differ in which single nucleotide polymorphisms (SNPs) they select as associated.
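
As an illustration of the kappa agreement index mentioned in the abstract, the following is a minimal sketch of Cohen's kappa for two binary selections of SNPs. It is not the authors' implementation: the function name `cohen_kappa` and the two example call vectors are hypothetical, and it assumes each method yields one yes/no call per SNP (1 = selected as associated, 0 = not selected).

```python
def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length sequences of binary (0/1) calls."""
    if len(calls_a) != len(calls_b) or len(calls_a) == 0:
        raise ValueError("call vectors must be non-empty and of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of SNPs on which both methods make the same call.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement, from each method's marginal selection rate.
    p_a = sum(calls_a) / n
    p_b = sum(calls_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both methods make the same constant call; treat as perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical calls for eight SNPs (illustrative values, not data from the paper).
beta_calls = [1, 1, 0, 0, 1, 0, 0, 0]   # assumed selections from a beta-based test
chi2_calls = [1, 0, 0, 0, 1, 0, 1, 0]   # assumed selections from a chi-squared test
kappa = cohen_kappa(beta_calls, chi2_calls)
print(f"kappa = {kappa:.3f}")
```

A kappa of 1 means the two methods select exactly the same SNPs, while values near 0 mean they agree no more often than chance would predict.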