The Genetic Architecture of Gene Expression Levels in Wild Baboons

Jenny Tung, Xiang Zhou, Susan C Alberts, Matthew Stephens, Yoav Gilad

Gene expression variation is well documented in human populations and its genetic architecture has been extensively explored. However, we still know little about the genetic architecture of gene expression variation in other species, particularly our closest living relatives, the nonhuman primates. To address this gap, we performed an RNA sequencing (RNA-seq)-based study of 63 wild baboons, members of the intensively studied Amboseli baboon population in Kenya. Our study design allowed us to measure gene expression levels and identify genetic variants using the same data set, enabling us to perform complementary mapping of putative cis-acting expression quantitative trait loci (eQTL) and measurements of allele-specific expression (ASE) levels. We discovered substantial evidence for genetic effects on gene expression levels in this population. Surprisingly, we found more power to detect individual eQTL in the baboons relative to a HapMap human data set of comparable size, probably as a result of greater genetic variation, enrichment of SNPs with high minor allele frequencies, and longer-range linkage disequilibrium in the baboons. eQTL were most likely to be identified for lineage-specific, rapidly evolving genes. Interestingly, genes with eQTL significantly overlapped between the baboon and human data sets, suggesting that some genes may tolerate more genetic perturbation than others, and that this property may be conserved across species. Finally, we used a Bayesian sparse linear mixed model to partition genetic, demographic, and early environmental contributions to variation in gene expression levels. We found a strong genetic contribution to gene expression levels for almost all genes, while individual demographic and environmental effects tended to be more modest. 
Together, our results establish the feasibility of eQTL mapping using RNA-seq data alone, and act as an important first step towards understanding the genetic architecture of gene expression variation in nonhuman primates.
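As a rough illustration of what testing a single candidate cis-eQTL involves, the sketch below regresses expression on genotype dosage with ordinary least squares. This is a simplified stand-in and not the authors' pipeline: the function name `eqtl_slope`, the 0/1/2 dosage coding, and the toy values are all assumptions made for the example, and a real analysis would also account for relatedness, covariates, and multiple testing.

```python
def eqtl_slope(genotypes, expression):
    """Regress expression on genotype dosage with ordinary least squares.

    genotypes  -- per-individual dosage of the minor allele (0, 1, or 2); assumed coding
    expression -- per-individual normalized expression level for one gene
    Returns (slope, r_squared).
    """
    n = len(genotypes)
    if n != len(expression) or n < 3:
        raise ValueError("need matching vectors with at least 3 individuals")

    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))

    if sxx == 0:
        raise ValueError("monomorphic SNP: no genotype variation to test")

    slope = sxy / sxx  # additive effect per copy of the allele
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, r_squared


# Toy data (hypothetical): expression rises by one unit per allele copy.
slope, r_squared = eqtl_slope([0, 1, 2, 0, 1, 2], [1.0, 2.0, 3.0, 1.0, 2.0, 3.0])
```

The slope estimates the additive effect of the variant, and `r_squared` is the share of expression variance it explains in this toy sample.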

Sampling through time and phylodynamic inference with coalescent and birth-death models

Erik M. Volz, Simon DW Frost
(Submitted on 28 Aug 2014)

Many population genetic models have been developed for the purpose of inferring population size and growth rates from random samples of genetic data. We examine two popular approaches to this problem, the coalescent and the birth-death-sampling model, in the context of estimating population size and birth rates in a population growing exponentially according to the birth-death branching process. For sequences sampled at a single time, we find that the coalescent and the birth-death-sampling model give virtually indistinguishable results in terms of the growth rates and fraction of the population sampled, even when sampling from a small population. For sequences sampled at multiple time points, we find that the birth-death model estimators are subject to large bias if the sampling process is misspecified. Since birth-death-sampling models incorporate a model of the sampling process, we show how much of the statistical power of birth-death-sampling models arises from the sequence of sample times and not from the genealogical tree. This motivates the development of a new coalescent estimator, which is augmented with a model of the known sampling process and is potentially more precise than the coalescent that does not use sample time information.

C. elegans harbors pervasive cryptic genetic variation for embryogenesis

Annalise Paaby, Amelia White, David Riccardi, Kristin Gunsalus, Fabio Piano, Matthew Rockman

Conditionally functional mutations are an important class of natural genetic variation, yet little is known about their prevalence in natural populations or their contribution to disease risk. Here, we describe a vast reserve of cryptic genetic variation, alleles that are normally silent but which affect phenotype when the function of other genes is perturbed, in the gene networks of C. elegans embryogenesis. We find evidence that cryptic-effect loci are ubiquitous and segregate at intermediate frequencies in the wild. The cryptic alleles demonstrate low developmental pleiotropy, in that specific, rather than general, perturbations are required to reveal them. Our findings underscore the importance of genetic background in characterizing gene function and provide a model for the expression of conditionally functional effects that may be fundamental in basic mechanisms of trait evolution and the genetic basis of disease susceptibility.

Determination of Nonlinear Genetic Architecture using Compressed Sensing

Chiu Man Ho, Stephen D.H. Hsu
(Submitted on 27 Aug 2014)

We introduce a statistical method that can reconstruct nonlinear genetic models (i.e., including epistasis, or gene-gene interactions) from phenotype-genotype (GWAS) data. The computational and data resource requirements are similar to those necessary for reconstruction of linear genetic models (or identification of gene-trait associations), assuming a condition of generalized sparsity, which limits the total number of gene-gene interactions. An example of a sparse nonlinear model is one in which a typical locus interacts with several or even many others, but only a small subset of all possible interactions exists. It seems plausible that most genetic architectures fall in this category. Our method uses a generalization of compressed sensing (L1-penalized regression) applied to nonlinear functions of the sensing matrix. We give theoretical arguments suggesting that the method is nearly optimal in performance, and demonstrate its effectiveness on broad classes of nonlinear genetic models using both real and simulated human genomes.
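To make the idea of L1-penalized regression on nonlinear functions of the genotype matrix concrete, here is a minimal sketch that expands genotypes into main effects plus pairwise products and fits a lasso by coordinate descent. It is not the authors' algorithm: the function names, the -1/+1 genotype coding, the penalty value, and the toy phenotype are all assumptions chosen so the example stays small and deterministic.

```python
from itertools import combinations, product


def expand_features(genotype):
    """Main effects followed by all pairwise products g_i * g_j with i < j."""
    pairs = [genotype[i] * genotype[j]
             for i, j in combinations(range(len(genotype)), 2)]
    return list(genotype) + pairs


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; the L1 proximal step."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(rows, y, penalty, sweeps=200):
    """Minimize (1/2n) * ||y - Xb||^2 + penalty * ||b||_1 by cyclic updates."""
    n, p = len(rows), len(rows[0])
    coef = [0.0] * p
    residual = list(y)  # residual = y - Xb, kept up to date as coef changes
    scale = [sum(row[j] ** 2 for row in rows) / n for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if scale[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(row[j] * (res + row[j] * coef[j])
                      for row, res in zip(rows, residual)) / n
            new = soft_threshold(rho, penalty) / scale[j]
            delta = new - coef[j]
            if delta != 0.0:
                for i, row in enumerate(rows):
                    residual[i] -= row[j] * delta
                coef[j] = new
    return coef


def demo():
    # Hypothetical toy data: every -1/+1 genotype at 4 loci, and a noise-free
    # phenotype with one main effect (locus 0) and one interaction (loci 1 x 2).
    genotypes = list(product((-1, 1), repeat=4))
    rows = [expand_features(g) for g in genotypes]
    y = [2.0 * g[0] + 3.0 * g[1] * g[2] for g in genotypes]
    return lasso_coordinate_descent(rows, y, penalty=0.1)
```

In this idealized design the feature columns are orthogonal, so the fit keeps only the locus-0 main effect and the locus 1 x 2 interaction, each shrunk by the penalty, and sets the other eight coefficients to zero. Real genotype data are correlated and noisy, so recovery there depends on the sparsity and sample-size conditions the abstract refers to.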

Genomic and transcriptomic insights into the regulation of snake venom production

Adam D Hargreaves, Martin T Swain, Matthew J Hegarty, Darren W Logan, John F Mulley

The gene regulatory mechanisms underlying the rapid replenishment of snake venom following expenditure are currently unknown. Using a comparative transcriptomic approach we find that venomous and non-venomous species produce similar numbers of secreted products in their venom or salivary glands and that only one transcription factor (Tbx3) is expressed in venom glands but not salivary glands. We also find evidence for temporal variation in venom production. We have generated a draft genome sequence for the painted saw-scaled viper, Echis coloratus, and identified conserved transcription factor binding sites in the upstream regions of venom genes. We find binding sites to be conserved across members of the same gene family, but not between gene families, indicating that multiple gene regulatory networks are involved in venom production. Finally, we suggest that negative regulation may be important for rapid activation of the venom replenishment cycle.

Fixation in large populations: a continuous view of a discrete problem

Fabio A. C. C. Chalub, Max O. Souza
(Submitted on 27 Aug 2014)

We study fixation in large but finite populations with two types and dynamics governed by birth-death processes. By considering a restricted class of such processes, which includes most classical evolutionary processes, we derive a continuous approximation for the probability of fixation that is valid beyond the weak-selection (WS) limit. Indeed, in the derivation three regimes naturally appear: selection-driven, balanced, and quasi-neutral; the latter two require WS, while the first can appear with or without WS. From the continuous approximations, we then obtain asymptotic approximations for evolutions with at most one equilibrium in the selection-driven regime, which does not preclude weak selection. As an application, we show that the fixation pattern for the Hawk and Dove game satisfies what we term the one-half law: if the Evolutionarily Stable Strategy (ESS) is outside a small interval around $1/2$, the fixation is of dominance type. We also show that outside of the weak-selection regime the dynamics of large populations can have very little resemblance to the infinite population case, and we give results for the case of two equilibria. Finally, we present a continuous restatement of the definition of an ESS$_N$ strategy that is valid for large populations. We then present two applications of this restatement: we obtain a definition valid in the quasi-neutral regime that recovers the one-third law under linear fitness and, as a generalisation, we introduce the concept of critical frequency.
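For reference, the discrete quantity that such continuous approximations target can be computed exactly for small populations. The sketch below uses the standard closed form for the absorption probability of a birth-death chain, phi_i = (1 + sum_{k=1}^{i-1} prod_{j=1}^{k} gamma_j) / (1 + sum_{k=1}^{N-1} prod_{j=1}^{k} gamma_j), where gamma_j = T^-_j / T^+_j is the ratio of backward to forward transition probabilities in state j. It is not the authors' approximation; the function name and the constant-fitness Moran example (gamma_j = 1/r) are assumptions made for illustration.

```python
def fixation_probability(n_total, start, ratio):
    """Exact fixation probability of type A in a two-type birth-death chain.

    n_total -- population size N (states 0 and N are absorbing)
    start   -- initial number of type-A individuals
    ratio   -- function j -> T^-_j / T^+_j for 1 <= j <= N - 1
    """
    if not 0 <= start <= n_total:
        raise ValueError("start must lie between 0 and n_total")

    # terms[k] = prod_{j=1}^{k} gamma_j, with terms[0] = 1 (empty product).
    terms = [1.0]
    running = 1.0
    for j in range(1, n_total):
        running *= ratio(j)
        terms.append(running)

    # Numerator sums terms[0..start-1]; denominator sums all N terms.
    return sum(terms[:start]) / sum(terms)


# Neutral drift: every gamma_j equals 1, so fixation probability is start / N.
neutral = fixation_probability(10, 3, lambda j: 1.0)

# Moran process with constant relative fitness r = 2 (assumed example):
# gamma_j = 1 / r, which gives (1 - 1/r) / (1 - 1/r**N) for a single mutant.
advantageous = fixation_probability(10, 1, lambda j: 0.5)
```

Passing a frequency-dependent `ratio`, for example one built from a game's payoff matrix, gives the exact finite-population values against which a large-population approximation can be compared.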

Sexual dimorphism in epigenomic responses of stem cells to extreme fetal growth

Fabien Delahaye, Neil Ari Wijetunga, Hye J Heo, Jessica N Tozour, Yong Mei Zhao, John M Greally, Francine H Einstein

Extreme fetal growth is associated with increased susceptibility to a range of adult diseases through an unknown mechanism of cellular memory. We tested whether heritable epigenetic processes in long-lived CD34+ hematopoietic stem/progenitor cells (HSPCs) showed evidence for re-programming associated with the extremes of fetal growth. Here we show that both fetal growth restriction and over-growth are associated with global shifts towards DNA hypermethylation, targeting cis-regulatory elements in proximity to genes involved in glucose homeostasis and stem cell function. A sexually dimorphic response was found: intrauterine growth restriction (IUGR) was associated with substantially greater epigenetic dysregulation in males, whereas large for gestational age (LGA) growth predominantly affected females. The findings are consistent with extreme fetal growth interacting with variable fetal susceptibility to influence cellular aging and metabolic characteristics through epigenetic mechanisms, potentially generating biomarkers that could identify infants at higher risk for chronic disease later in life.