Multiple Quantitative Trait Analysis Using Bayesian Networks

Marco Scutari, Phil Howell, David J. Balding, Ian Mackay
(Submitted on 12 Feb 2014)

Models for genome-wide prediction and association studies usually target a single phenotypic trait. However, in animal and plant genetics it is common to record information on multiple phenotypes for each individual that will be genotyped. Modeling traits individually disregards the fact that they are most likely associated due to pleiotropy and shared biological basis, thus providing only a partial, confounded view of genetic effects and phenotypic interactions. In this paper we use data from a Multiparent Advanced Generation Inter-Cross (MAGIC) winter wheat population to explore Bayesian networks as a convenient and interpretable framework for the simultaneous modeling of multiple quantitative traits. We show that they are equivalent to multivariate genetic best linear unbiased prediction (GBLUP), and that they outperform single-trait elastic net and single-trait GBLUP in predictive performance. Finally, we discuss their relationship with other additive-effects models and their advantages in inference and interpretation. MAGIC populations provide an ideal setting for this kind of investigation because the very low population structure and large sample size result in predictive models with good power and limited confounding due to relatedness.

Estimating the evolution of human life history traits in age-structured populations

Ryan Baldini

I propose a method that estimates the selection response of all vital rates in an age-structured population. I assume that vital rates are determined by the additive genetic contributions of many loci. The method uses all relatedness information in the sample to inform its estimates of genetic parameters, via an MCMC Bayesian framework. One can use the results to estimate the selection response of any life history trait that is a function of the vital rates, including the age at first reproduction, total lifetime fertility, survival to adulthood, and others. This method closely ties the empirical analysis of life history evolution to dynamically complete models of natural selection, and therefore enjoys some theoretical advantages over other methods. I demonstrate the method on a simulated model of evolution with two age classes. Finally, I discuss how the method can be extended to more complicated cases.

Extensive epistasis within the MHC contributes to the genetic architecture of celiac disease

Ben Goudey, Gad Abraham, Eder Kikianty, Qiao Wang, Dave Rawlinson, Fan Shi, Izhak Haviv, Linda Stern, Adam Kowalczyk, Michael Inouye

Epistasis has long been thought to contribute to the genetic aetiology of complex diseases, yet few robust epistatic interactions in humans have been detected. We have conducted exhaustive genome-wide scans for pairwise epistasis in five independent celiac disease (CeD) case-control studies, using a rapid model-free approach to examine over 500 billion SNP pairs in total. We found extensive epistasis within the MHC region with 7,270 statistically significant pairs achieving stringent replication criteria across multiple studies. These robust epistatic pairs partially tagged CeD risk HLA haplotypes, and replicable evidence for epistatic SNPs outside the MHC was not observed. Both within and between European populations, we observed striking consistency of epistatic models and epistatic model distribution, thus providing empirical estimates of their frequencies in a complex disease. Within the UK population, models of CeD comprised of both epistatic and additive single-SNP effects increased explained CeD variance by approximately 1% over those of single SNPs. Further analysis showed that additive SNP effects tag epistatic effects (and vice versa), sometimes involving SNPs separated by a megabase or more. These findings show that the genetic architecture of CeD consists of overlapping additive and epistatic components, indicating that the genetic architecture of CeD, and potentially other common autoimmune diseases, is more complex than previously thought.
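To give a sense of scale for an exhaustive pairwise scan, note that the number of unordered SNP pairs grows quadratically with the number of SNPs. The sketch below is illustrative only: the SNP count is a hypothetical round figure, not one reported by the study, and the helper name is invented for this example.

```python
from math import comb


def snp_pair_count(n_snps: int) -> int:
    """Return the number of unordered SNP pairs in an exhaustive pairwise scan."""
    return comb(n_snps, 2)


# Hypothetical genotyping-array size, chosen only for illustration.
n_snps = 500_000
pairs = snp_pair_count(n_snps)
print(f"{n_snps:,} SNPs -> {pairs:,} unordered pairs")
```

Under this assumed array size, a single study already yields on the order of 10^11 pairs, which is why the speed of the per-pair test matters for scans of this kind.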

Genome-Wide Introgression Revealed Pervasive Hybrid Incompatibilities (HI) between Caenorhabditis species

Yu Bi, Xiaoliang Ren, Cheung Yan, Jiaofang Shao, Dongying Xie, Zhongying Zhao

Systematic characterization of hybrid incompatibility (HI) between related species remains the key to understanding speciation. The genetic basis of HI has been intensively studied in Drosophila species, but remains largely unknown in other species, including nematodes. This is mainly due to the lack of a sister species with which C. elegans can mate and produce viable progeny. The recent discovery of a C. briggsae sister species, C. sp.9, opened up the possibility of dissecting the genetic basis of HI in nematode species. However, a paucity of molecular and genetic tools has prevented the precise mapping of HI loci between the two species. To systematically isolate the HI loci between the nematode species pair, we first generated 96 chromosomally integrated, independent GFP insertions in the C. briggsae genome. We next mapped the GFP insertion sites to defined locations using a method we had developed earlier. The dominant and visible markers facilitated the directional crossing of their linked genomic sequences into C. sp.9. We then backcrossed each individual marker into C. sp.9 for at least 15 generations and produced 111 independent introgression lines, which together represent most of the C. briggsae genome. We finally dissected the HI patterns by scoring embryonic lethality, larval arrest, sex ratio, fertility, male sterility and inviability in a subset of the introgression lines, and identified pervasive HIs between the two species. The study produced a genome-wide landscape of HI between nematode species for the first time. The initial crossing results confirmed Haldane's rule, and the fertility data from homozygous introgressions supported the rule of large X effect. The large collection of introgression lines allows mapping of numerous HI loci into defined genomic regions between C. briggsae and C. sp.9, thus facilitating further characterization of their genetic and molecular mechanisms. Importantly, the study permits comparative analysis of speciation genetics between nematodes and other species.

Cross-phenotype meta-analysis reveals large-scale trans-eQTLs mediating patterns of transcriptional co-regulation

Boel Brynedal, Towfique Raj, Barbara E Stranger, Robert Bjornson, Benjamin M Neale, Benjamin F Voight, Chris Cotsapas
(Submitted on 7 Feb 2014)

Genetic variation affecting gene regulation is a central driver of phenotypic differences between individuals and can be used to uncover how biological processes are organized in a cell. Although detecting cis-eQTLs is now routine, trans-eQTLs have proven more challenging to find due to the modest variance explained and the multiple-testing burden of testing millions of SNPs for association with thousands of transcripts. Here, we successfully map trans-eQTLs with the complementary approach of looking for SNPs associated with the expression of multiple genes simultaneously. We find 732 trans-eQTLs that replicate across two continental populations; each trans-eQTL controls large groups of target transcripts (regulons), which are part of interacting networks controlled by transcription factors. We are thus able to uncover co-regulated gene sets and begin describing the cell circuitry of gene regulation.

Genetic variants associated with motion sickness point to roles for inner ear development, neurological processes, and glucose homeostasis

Bethann S Hromatka, Joyce Y Tung, Amy K Kiefer, Chuong B Do, David A Hinds, Nicholas Eriksson

Roughly one in three individuals is highly susceptible to motion sickness, and yet the underlying causes of this condition are not well understood. Despite high heritability, no associated genetic factors have been discovered to date. Here, we conducted the first genome-wide association study on motion sickness in 80,494 individuals from the 23andMe database who were surveyed about car sickness. Thirty-five single-nucleotide polymorphisms (SNPs) were associated with motion sickness at a genome-wide-significant level (p < 5e-8). Many of these SNPs are near genes involved in balance, and eye, ear, and cranial development (e.g., PVRL3, TSHZ1, MUTED, HOXB3, HOXD3). Other SNPs may affect motion sickness through nearby genes with roles in the nervous system, glucose homeostasis, or hypoxia. We show that several of these SNPs display sex-specific effects, with effects up to three times stronger in women. We searched for comorbid phenotypes with motion sickness, confirming associations with known comorbidities including migraines, postoperative nausea and vomiting (PONV), vertigo, and morning sickness, and observing new associations with altitude sickness and many gastrointestinal conditions. We also show that two of these related phenotypes (PONV and migraines) share underlying genetic factors with motion sickness. These results point to the importance of the nervous system in motion sickness and suggest a role for glucose levels in motion-induced nausea and vomiting, a finding that may provide insight into other nausea-related phenotypes such as PONV. They also highlight personal characteristics (e.g., being a poor sleeper) that correlate with motion sickness, findings that could help identify risk factors or treatments.
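The genome-wide significance level quoted above (5e-8) is conventionally motivated as a Bonferroni correction of a 0.05 family-wise error rate over roughly one million independent tests. That rationale is an assumption about the common convention, not something the abstract states, and the function names below are hypothetical. A minimal sketch:

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Assumed convention: 0.05 family-wise error rate over ~1e6 independent tests.
GENOME_WIDE = bonferroni_threshold(0.05, 1_000_000)


def is_genome_wide_significant(p_value: float, threshold: float = GENOME_WIDE) -> bool:
    """Return True when a SNP's p-value falls below the threshold."""
    return p_value < threshold
```

For example, `is_genome_wide_significant(1e-9)` is true, while a p-value of 1e-6 does not pass under this threshold.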

The causal meaning of genomic predictors and how it affects the construction and comparison of genome-enabled selection models

Bruno D Valente, Gota Morota, Guilherme JM Rosa, Daniel Gianola, Kent Weigel

The additive genetic effect is arguably the most important quantity inferred in animal and plant breeding analyses. The term effect indicates that it represents causal information, which is different from standard statistical concepts such as regression coefficients and association. The process of inferring causal information is also different from standard statistical learning, as the former requires causal (i.e. non-statistical) assumptions and involves extra complexities. Remarkably, the task of inferring genetic effects is largely seen as a standard regression/prediction problem, contradicting its label. This widely accepted analysis approach is by itself insufficient for causal learning, suggesting that causality is not the point for selection. Given this incongruence, it is important to verify whether genomic predictors need to represent causal effects to be relevant for selection decisions, especially because applying regression studies to answer causal questions may lead to wrong conclusions. The answer to this question determines whether genomic selection models should be constructed aiming for maximum genomic predictive ability or for identifiability of genetic causal effects. Here, we demonstrate that selection relies on a causal effect from genotype to phenotype, and that genomic predictors are only useful for selection if they distinguish such an effect from other sources of association. Conversely, genomic predictors capturing non-causal signals provide information that is less relevant for selection regardless of the resulting predictive ability. Focusing on covariate choice decisions, simulated examples are used to show that predictive ability, which is the criterion normally used to compare models, may not indicate the quality of genomic predictors for selection. Additionally, we propose using alternative criteria to construct models aiming for the identification of the genetic causal effects.

Genomic architecture of human neuroanatomical diversity

Roberto Toro, Jean-Baptiste Poline, Guillaume Huguet, Eva Loth, Vincent Frouin, Tobias Banaschewski, Gareth J Barker, Arun Bokde, Christian Büchel, Fabiana Carvalho, Patricia Conrod, Mira Fauth-Bühler, Herta Flor, Jürgen Gallinat, Hugh Garavan, Penny Gowloan, Andreas Heinz, Bernd Ittermann, Claire Lawrence, Hervé Lemaître, Karl Mann, Frauke Nees, Tomáš Paus, Zdenka Pausova, Marcella Rietschel, Trevor Robbins, Michael Smolka, Andreas Ströhle, Gunter Schumann, Thomas Bourgeron

Human brain anatomy is strikingly diverse and highly heritable: genetic factors may explain up to 80% of its variability. Prior studies have tried to detect genetic variants with a large effect on neuroanatomical diversity, but those currently identified account for <5% of the variance. Here we show, based on our analyses of neuroimaging and whole-genome genotyping data from 1,765 subjects, that up to 54% of this heritability is captured by large numbers of single nucleotide polymorphisms of small effect spread throughout the genome, especially within genes and close regulatory regions. The genetic bases of neuroanatomical diversity appear to be relatively independent of those of body size (height), but shared with those of verbal intelligence scores. The study of this genomic architecture should help us better understand brain evolution and disease.

A Robust Model-free Approach for Rare Variants Association Studies Incorporating Gene-Gene and Gene-Environmental Interactions

Ruixue Fan, Shaw-Hwa Lo
(Submitted on 2 Dec 2013)

A growing body of evidence suggests that rare variants with much lower minor allele frequencies play significant roles in disease etiology. Advances in next-generation sequencing technologies will lead to many more rare variants association studies. Several statistical methods have been proposed to assess the effect of rare variants by aggregating information from multiple loci across a genetic region and testing the association between the phenotype and aggregated genotype. One limitation of existing methods is that they only look into the marginal effects of rare variants but do not systematically take into account effects due to interactions among rare variants and between rare variants and environmental factors. In this article, we propose the summation of partition approach (SPA), a robust model-free method that is designed specifically for detecting both marginal effects and effects due to gene-gene (G-G) and gene-environmental (G-E) interactions for rare variants association studies. SPA has three advantages. First, it accounts for the interaction information and gains considerable power in the presence of unknown and complicated G-G or G-E interactions. Second, it does not sacrifice the marginal detection power; when rare variants have only marginal effects, it is comparable with the most competitive method in the current literature. Third, it is easy to extend and can incorporate more complex interactions; other practitioners and scientists can readily tailor the procedure to fit their own studies. Our simulation studies show that SPA is considerably more powerful than many existing methods in the presence of G-G and G-E interactions.

Natural Allelic Variations of Xenobiotic Enzymes Pleiotropically Affect Sexual Dimorphism in Oryzias latipes

Takafumi Katsumura, Shoji Oda, Shigeki Nakagome, Tsunehiko Hanihara, Hiroshi Kataoka, Hiroshi Mitani, Shoji Kawamura, Hiroki Oota

Sexual dimorphisms, which are phenotypic differences between males and females, are driven by sexual selection [1, 2]. Interestingly, sexually selected traits show geographic variations within species despite strong directional selective pressures [3, 4]. However, genetic factors that regulate varied sexual differences remain unknown. In this study, we show that polymorphisms in cytochrome P450 (CYP) 1B1, which encodes a xenobiotic-metabolising enzyme, are associated with local differences of sexual dimorphisms in the anal fin morphology of medaka fish (Oryzias latipes). High and low activity CYP1B1 alleles increased and decreased differences in anal fin sizes, respectively. Behavioural and phylogenetic analyses suggest maintenance of the high activity allele by sexual selection, whereas the low activity allele may have evolved by positive selection due to by-product effects of CYP1B1. The present data can elucidate evolutionary mechanisms behind genetic variations in sexual dimorphism and indicate pleiotropic effects of xenobiotic enzymes.