Ebola virus is evolving but not changing: no evidence for functional change in EBOV from 1976 to the 2014 outbreak

Abayomi S Olabode, Xiaowei Jiang, David L Robertson, Simon C Lovell
doi: http://dx.doi.org/10.1101/014480

The Ebola epidemic is having a devastating impact in West Africa. Sequencing of Ebola viruses from infected individuals has revealed extensive genetic variation, leading to speculation that the virus may be adapting to the human host and accounting for the scale of the 2014 outbreak. We show that so far there is no evidence for adaptation of EBOV to humans. We analyze the putatively functional changes associated with the current and previous Ebola outbreaks, and find no significant molecular changes. Observed amino acid replacements have minimal effect on protein structure, being neither stabilizing nor destabilizing. Replacements are not found in regions of the proteins associated with known functions and tend to occur in disordered regions. This observation indicates that the difference between the current and previous outbreaks is not due to the observed evolutionary change of the virus. Instead, epidemiological factors must be responsible for the unprecedented spread of EBOV.

Exploring the genetic patterns of complex diseases via the integrative genome-wide approach

Ben Teng, Can Yang, Jiming Liu, Zhipeng Cai, Xiang Wan
(Submitted on 26 Jan 2015)

Motivation: Genome-wide association studies (GWASs), which assay more than a million single nucleotide polymorphisms (SNPs) in thousands of individuals, have been widely used to identify genetic risk variants for complex diseases. However, most of the variants that have been identified contribute relatively small increments of risk and explain only a small portion of the genetic variation in complex diseases. This is the so-called missing heritability problem. Evidence has indicated that many complex diseases are genetically related, meaning these diseases share common genetic risk variants. Therefore, exploring the genetic correlations across multiple related studies could be a promising strategy for removing spurious associations and identifying underlying genetic risk variants, thereby helping to explain the missing heritability in complex diseases. Results: We present a general and robust method to identify genetic patterns from multiple large-scale genomic datasets. We treat the summary statistics as a matrix and demonstrate that genetic patterns will form a low-rank matrix plus a sparse component. Hence, we formulate the problem as a matrix recovery problem, where we aim to discover risk variants shared by multiple diseases/traits and those for each individual disease/trait. We propose a convex formulation for matrix recovery and an efficient algorithm to solve the problem. We demonstrate the advantages of our method using both synthesized datasets and real datasets. The experimental results show that our method can successfully reconstruct both the shared and the individual genetic patterns from summary statistics and achieve better performance compared with alternative methods under a wide range of scenarios.
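
For orientation, a low-rank-plus-sparse recovery of this kind is often written as the convex program below (principal component pursuit). This is a generic textbook form, not necessarily the authors' objective: the symbols Z (the SNP-by-trait matrix of summary statistics), L (the low-rank shared pattern), S (the sparse trait-specific pattern) and the weight \lambda are all assumptions introduced here for illustration.

```latex
% Generic principal component pursuit; notation is assumed, not taken from the paper.
\begin{aligned}
\min_{L,\,S}\quad & \|L\|_{*} + \lambda \,\|S\|_{1} \\
\text{subject to}\quad & Z = L + S ,
\end{aligned}
\qquad
\|L\|_{*} = \sum_{i} \sigma_i(L), \quad
\|S\|_{1} = \sum_{j,k} |S_{jk}| .
```

The nuclear norm \|L\|_{*} is the usual convex surrogate for rank, and the entrywise \ell_1 norm is the usual convex surrogate for sparsity, which is what makes the recovery problem convex.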

Origins of cattle on Chirikof Island, Alaska

Jared E. Decker, Jeremy F. Taylor, Matthew A. Cronin, Leeson J. Alexander, Juha Kantanen, Ann Millbrooke, Robert D. Schnabel, Michael D. MacNeil
doi: http://dx.doi.org/10.1101/014415

Feral livestock may harbor genetic variation of commercial, scientific, historical or esthetic value. Origins and uniqueness of feral cattle on Chirikof Island, Alaska are uncertain. The island is now part of the Alaska Maritime Wildlife Refuge and Federal wildlife managers want grazing to cease, presumably leading to demise of the cattle. Here we characterize the Chirikof Island cattle relative to extant breeds and discern their origins. Our analyses support the inference that Russian cattle arrived first on Chirikof Island, then approximately 95 years ago the first European taurine cattle were introduced to the island, and finally Hereford cattle were introduced about 40 years ago. While clearly Bos taurus taurus, the Chirikof Island cattle appear at least as distinct as other recognized breeds. Further, this mixture of European and East-Asian cattle is unique compared to other North American breeds and we find evidence that natural selection in the relatively harsh environment of Chirikof Island has further impacted their genetic architecture. These results provide an objective basis for decisions regarding conservation of the Chirikof Island cattle.

Evolution of Conditional Cooperativity Between HOXA11 and FOXO1 Through Allosteric Regulation

Mauris C. Nnamani, Soumya Ganguly, Vincent J. Lynch, Laura S. Mizoue, Yingchun Tong, Heather Darling, Monika Fuxreiter, Jens Meiler, Gunter P. Wagner
doi: http://dx.doi.org/10.1101/014381

Transcription factors (TFs) play multiple roles in different cells and stages of development. Given this multitude of functional roles it has been assumed that TFs are evolutionarily highly constrained. Here we investigate the molecular mechanisms for the origin of a derived functional interaction between two TFs that play a key role in mammalian pregnancy, HOXA11 and FOXO1. We have previously shown that the regulatory role of HOXA11 in mammalian endometrial stromal cells requires an interaction with FOXO1, and that the physical interaction between these proteins evolved long before their functional cooperativity. Through a combination of functional, biochemical, and structural approaches, we demonstrate that the derived functional cooperativity between HOXA11 and FOXO1 is due to derived allosteric regulation of HOXA11 by FOXO1. This study shows that TF function can evolve through changes affecting the functional output of a pre-existing protein complex.

Ancestry specific association mapping in admixed populations

Line Skotte, Thorfinn Sand S Korneliussen, Ida Moltke, Anders Albrechtsen
doi: http://dx.doi.org/10.1101/014001

As recently demonstrated in several genetic association studies, historically small and isolated populations can offer increased statistical power due to extended linkage disequilibrium and increased genetic drift over many generations. However, many such populations, like the Greenlandic Inuit population, have recently experienced substantial admixture with other populations, which can complicate the association studies. One important complication is that most current methods for performing association testing are based on the assumption that the effect of the tested genetic marker is the same regardless of ancestry. This is a reasonable assumption for a causal variant, but may not hold for the genetic markers that are tested in association studies, which are usually not causal. The effects of non-causal genetic markers depend on how strongly their presence correlates with the presence of the causal marker, and this may vary between ancestral populations because of different linkage disequilibrium patterns and allele frequencies. Motivated by this, we here introduce a new statistical method for association testing in recently admixed populations, where the effect sizes are allowed to depend on the ancestry of the allele. Our method does not rely on accurate inference of local ancestry, yet using simulations we show that in some scenarios it gives a dramatic increase in statistical power to detect associations. In addition, the method allows for testing for a difference in effect size between ancestral populations, which can be used to determine if a SNP is causal. We demonstrate the usefulness of the method on data from the Greenlandic population.

Alternative splicing QTLs in European and African populations using Altrans, a novel method for splice junction quantification

Halit Ongen, Emmanouil T Dermitzakis
doi: http://dx.doi.org/10.1101/014126

With the advent of RNA-sequencing technology we now have the power to detect different types of alternative splicing and how DNA variation affects splicing. However, given the short read lengths used in most population-based RNA-sequencing experiments, quantifying transcripts accurately remains a challenge. Here we present a novel method, Altrans, for discovery of alternative splicing quantitative trait loci (asQTLs). To assess the performance of Altrans we compared it to Cufflinks, a well-established transcript quantification method. Simulations show that in the presence of transcripts absent from the annotation, Altrans performs better in quantifications than Cufflinks. We have applied Altrans and Cufflinks to the Geuvadis dataset, which comprises samples from European and African populations, and discovered (FDR = 1%) 1806 and 243 asQTLs with Altrans, and 1596 and 288 asQTLs with Cufflinks for Europeans and Africans, respectively. Although Cufflinks results replicated better across the two populations, this is likely due to the increased sensitivity of Altrans in detecting harder-to-detect associations. We show that, by discovering a set of asQTLs in a smaller subset of European samples and replicating these in the remaining larger subset of Europeans, both methods achieve similar replication levels (94% and 98% replication in Altrans and Cufflinks, respectively). We find that method-specific asQTLs are largely due to different types of alternative splicing events detected by each method. We overlapped the asQTLs with biochemically active regions of the genome and observed significant enrichments for many functional marks and variants in splicing regions, highlighting the biological relevance of the asQTLs identified. Altogether, we present a novel approach for discovering asQTLs that is a more direct assessment of splicing compared to other methods and is complementary to other transcript quantification methods.

Geometric constraints dominate the antigenic evolution of influenza H3N2 hemagglutinin

Austin G Meyer, Claus O Wilke
doi: http://dx.doi.org/10.1101/014183

We have carried out a comprehensive analysis of the determinants of human influenza A H3 hemagglutinin evolution, considering three distinct predictors of evolutionary variation at individual sites: solvent accessibility (as a proxy for protein fold stability and/or conservation), experimental epitope sites (as a proxy for host immune bias), and proximity to the receptor-binding region (as a proxy for protein function). We have found that these three predictors individually explain approximately 15% of the variation in site-wise dN/dS. However, the solvent accessibility and proximity predictors seem largely independent of each other, while the epitope sites are not. In combination, solvent accessibility and proximity explain 32% of the variation in dN/dS. Incorporating experimental epitope sites into the model adds only an additional 2 percentage points. We have also found that the historical H3 epitope sites, which date back to the 1980s and 1990s, show only weak overlap with the latest experimental epitope data, and we have defined a novel set of four epitope groups which are experimentally supported and cluster in 3D space. Finally, sites with dN/dS > 1, i.e., the sites most likely driving seasonal immune escape, are not correctly predicted by either historical or experimental epitope sites, but only by proximity to the receptor-binding region. In summary, proximity to the receptor-binding region, rather than host immune bias, seems to be the primary determinant of H3 immune-escape evolution.

Integrating crop growth models with whole genome prediction through approximate Bayesian computation

Frank Technow, Carlos D. Messina, L. Radu Totir, Mark Cooper
doi: http://dx.doi.org/10.1101/014100

Genomic selection, enabled by whole genome prediction (WGP) methods, is revolutionizing plant breeding. Existing WGP methods have been shown to deliver accurate predictions in the most common settings, such as prediction of across-environment performance for traits with additive gene effects. However, prediction of traits with non-additive gene effects and prediction of genotype by environment interaction (GxE) continue to be challenging. Previous attempts to increase prediction accuracy for these particularly difficult tasks employed prediction methods that are purely statistical in nature. Augmenting the statistical methods with biological knowledge has been largely overlooked thus far. Crop growth models (CGMs) attempt to represent the functional relationships between plant physiology and the environment in the formation of yield and similar output traits of interest. Thus, they can explain the impact of GxE and certain types of non-additive gene effects on the expressed phenotype. Approximate Bayesian computation (ABC), a novel and powerful computational procedure, allows the incorporation of CGMs directly into the estimation of whole genome marker effects in WGP. Here we provide a proof of concept study for this novel approach and demonstrate its use with a simulated data set. We show that this novel approach can be considerably more accurate than the benchmark WGP method GBLUP in predicting performance in environments represented in the estimation set as well as in previously unobserved environments for traits determined by non-additive gene effects. We conclude that this proof of concept demonstrates that using ABC for incorporating biological knowledge in the form of CGMs into WGP is a very promising novel approach to improving prediction accuracy for some of the most challenging scenarios of interest to applied geneticists.
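
To make the ABC idea concrete, here is a minimal rejection-sampling sketch in Python. It is not the authors' procedure: the function `simulate_trait` is a hypothetical one-parameter stand-in for a crop growth model, and the flat prior, tolerance, and sample sizes are arbitrary choices made only for illustration. The point it shows is the core ABC loop: draw a parameter from the prior, push it through the simulator, and keep it only if the simulated summary lands close to the observed one.

```python
import random


def simulate_trait(effect, n_plants, rng):
    """Hypothetical stand-in for a crop growth model: mean of noisy plant-level values."""
    return sum(rng.gauss(effect, 1.0) for _ in range(n_plants)) / n_plants


def abc_rejection(observed, n_draws, tolerance, n_plants, rng):
    """Return prior draws whose simulated summary is within `tolerance` of `observed`."""
    accepted = []
    for _ in range(n_draws):
        candidate = rng.uniform(-5.0, 5.0)  # flat prior (an assumption)
        simulated = simulate_trait(candidate, n_plants, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(candidate)
    return accepted


rng = random.Random(0)  # fixed seed so the sketch is reproducible
true_effect = 2.0       # value used to generate the toy "observed" data
observed = simulate_trait(true_effect, 25, rng)

# Accepted draws approximate the posterior of the effect given the observed summary.
posterior = abc_rejection(observed, n_draws=5000, tolerance=0.1, n_plants=25, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
print(f"accepted {len(posterior)} of 5000 draws; posterior mean = {posterior_mean:.2f}")
```

No likelihood is ever evaluated here, only simulated. That property is what lets a mechanistic simulator such as a CGM sit inside the estimation of marker effects.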

Empirical determinants of adaptive mutations in yeast experimental evolution

Celia Payen, Anna B Sunshine, Giang T Ong, Jamie L Pogachar, Wei Zhao, Maitreya J Dunham
doi: http://dx.doi.org/10.1101/014068

High-throughput sequencing technologies have enabled expansion of the scope of genetic screens to identify mutations that underlie quantitative phenotypes, such as fitness improvements that occur during the course of experimental evolution. This new capability has allowed us to describe the relationship between fitness and genotype at a level never possible before, and ask deeper questions, such as how genome structure, available mutation spectrum, and other factors drive evolution. Here we combined functional genomics and experimental evolution to first map on a genome scale the distribution of potential beneficial mutations available as a first step to an evolving population and then compare these to the mutations actually observed in order to define the constraints acting upon evolution. We first constructed a single-step fitness landscape for the yeast genome by using barcoded gene deletion and overexpression collections, competitive growth in continuous culture, and barcode sequencing. By quantifying the relative fitness effects of thousands of single-gene amplifications or deletions simultaneously we revealed the presence of hundreds of accessible evolutionary paths. To determine the actual mutation spectrum used in evolution, we built a catalog of >1000 mutations selected during experimental evolution. By combining both datasets, we were able to ask how and why evolution is constrained. We identified adaptive mutations in laboratory evolved populations, derived mutational signatures in a variety of conditions and ploidy states, and determined that half of the mutations accumulated positively affect cellular fitness. We also uncovered hundreds of potential beneficial mutations never observed in the mutational spectrum derived from the experimental evolution catalog and found that those adaptive mutations become accessible in the absence of the dominant adaptive solution. This comprehensive functional screen explored the set of potential adaptive mutations on one genetic background, and allows us for the first time at this scale to compare the mutational path with the actual, spontaneously derived spectrum of mutations.

Feller’s Contributions to Mathematical Biology

Ellen Baake, Anton Wakolbinger
(Submitted on 21 Jan 2015)

This is a review of William Feller’s important contributions to mathematical biology. The seminal paper [Feller 1951] “Diffusion processes in genetics” was particularly influential on the development of stochastic processes at the interface with evolutionary biology, and interesting ideas in this direction (including a first characterization of what is nowadays known as “Feller’s branching diffusion”) had already taken shape in the paper [Feller 1939] (written in German) “The foundations of a probabilistic treatment of Volterra’s theory of the struggle for life”. Feller’s article “On fitness and the cost of natural selection” [Feller 1967] contains a critical analysis of the concept of “genetic load”.