Gaussian process test for high-throughput sequencing time series: application to experimental evolution
Hande Topa, Ágnes Jónás, Robert Kofler, Carolin Kosiol, Antti Honkela
Comments: 26 pages, 13 figures
Subjects: Populations and Evolution (q-bio.PE); Genomics (q-bio.GN); Quantitative Methods (q-bio.QM); Applications (stat.AP)
Motivation: Recent advances in high-throughput sequencing (HTS) have made it possible to monitor genomes in great detail. New experiments not only use HTS to measure genomic features at one time point but to monitor them changing over time with the aim of identifying significant changes in their abundance. In population genetics, for example, allele frequencies are monitored over time to detect significant frequency changes that indicate selection pressures. Previous attempts at analysing data from HTS experiments have been limited as they could not simultaneously include data at intermediate time points, replicate experiments and sources of uncertainty specific to HTS such as sequencing depth.
Results: We present the beta-binomial Gaussian process (BBGP) model for ranking features with significant non-random variation in abundance over time. The features are assumed to represent proportions, such as proportion of an alternative allele in a population. We use the beta-binomial model to capture the uncertainty arising from finite sequencing depth and combine with a Gaussian process model over the time series. In simulations that mimic the features of experimental evolution data, the proposed method clearly outperforms classical testing in average precision of finding selected alleles. We also present results on real data from Drosophila experimental evolution experiment in temperature adaptation.
Availability: R software implementing the test is available at https://github.com/handetopa/BBGP.
An experimentally determined evolutionary model dramatically improves phylogenetic fit
Jesse D Bloom
All modern approaches to molecular phylogenetics require a quantitative model for how genes evolve. Unfortunately, existing evolutionary models do not realistically represent the site-heterogeneous selection that governs actual sequence change. Attempts to remedy this problem have involved augmenting these models with a burgeoning number of free parameters. Here I demonstrate an alternative: experimental determination of a parameter-free evolutionary model via mutagenesis, functional selection, and deep sequencing. Using this strategy, I create an evolutionary model for influenza nucleoprotein that describes the gene phylogeny far better than existing models with dozens or even hundreds of free parameters. High-throughput experimental strategies such as the one employed here provide fundamentally new information that has the potential to transform the sensitivity of phylogenetic analyses.
Global Epistasis Makes Adaptation Predictable Despite Sequence-Level Stochasticity
Sergey Kryazhimskiy, Daniel Paul Rice, Elizabeth Jerison, Michael M Desai
Epistasis can make adaptation highly unpredictable, rendering evolutionary trajectories contingent on the chance effects of initial mutations. We used experimental evolution in Saccharomyces cerevisiae to quantify this effect, finding dramatic differences in adaptability between 64 closely related genotypes. Despite these differences, sequencing of 105 evolved clones showed no significant effect of initial genotype on future sequence-level evolution. Instead, reconstruction experiments revealed a consistent pattern of diminishing returns epistasis. Our results suggest that many beneficial mutations affecting a variety of biological processes are globally coupled: they interact strongly, but only through their combined effect on fitness. Sequence-level adaptation is thus highly stochastic. Nevertheless, fitness evolution is strikingly predictable because differences in adaptability are determined only by global fitness-mediated epistasis, not by the identity of individual mutations.
Biophysical Fitness Landscapes for Transcription Factor Binding Sites
Allan Haldane, Michael Manhart, Alexandre V. Morozov
(Submitted on 3 Dec 2013)
Evolutionary trajectories and phenotypic states available to cell populations are ultimately dictated by intermolecular interactions between DNA, RNA, proteins, and other molecular species. Here we study how evolution of gene regulation in a single-cell eukaryote S. cerevisiae is affected by the interactions between transcription factors (TFs) and their cognate genomic sites. Our study is informed by high-throughput in vitro measurements of TF-DNA binding interactions and by a comprehensive collection of genomic binding sites. Using an evolutionary model for monomorphic populations evolving on a fitness landscape, we infer fitness as a function of TF-DNA binding energy for a collection of 12 yeast TFs, and show that the shape of the predicted fitness functions is in broad agreement with a simple thermodynamic model of two-state TF-DNA binding. However, the effective temperature of the model is not always equal to the physical temperature, indicating selection pressures in addition to biophysical constraints caused by TF-DNA interactions. We find little statistical support for the fitness landscape in which each position in the binding site evolves independently, showing that epistasis is common in evolution of gene regulation. Finally, by correlating TF-DNA binding energies with biological properties of the sites or the genes they regulate, we are able to rule out several scenarios of site-specific selection, under which binding sites of the same TF would experience a spectrum of selection pressures depending on their position in the genome. These findings argue for the existence of universal fitness landscapes which shape evolution of all sites for a given TF, and whose properties are determined in part by the physics of protein-DNA interactions.
Genome-wide targets of selection: female response to experimental removal of sexual selection in Drosophila melanogaster
Paolo Innocenti, Ilona Flis, Edward H Morrow
Despite the common assumption that promiscuity should in general be favored in males, but not in females, to date there is no consensus on the general impact of multiple mating on female fitness. Notably, very little is known about the genetic and physiological features underlying the female response to sexual selection pressures. By combining an experimental evolution approach with genomic techniques, we investigated the effects of single and multiple matings on female fecundity and gene expression. We experimentally manipulated the mating system in replicate populations of Drosophila melanogaster by removing sexual selection, with the aim of testing differences in short term post-mating effects of females evolved under different mating strategies. We show that monogamous females suffer decreased fecundity, a decrease that was partially recovered by experimentally reversing the selection pressure back to the ancestral promiscuous state. The post-mating gene expression profiles of monogamous females differ significantly from promiscuous females, involving 9% of the genes tested. These transcripts are active in several tissues, mainly ovaries, neural tissues and midgut, and are involved in metabolic processes, reproduction and signaling pathways. Our results demonstrate how the female post-mating response can evolve under different mating systems, and provide novel insights into the genes targeted by sexual selection in females, by identifying a list of candidate genes responsible for the decrease in female fecundity in the absence of promiscuity.
The first steps of adaptation of Escherichia coli to the gut are dominated by soft sweeps
João Barroso-Batista, Ana Sousa, Marta Lourenço, Marie-Louise Bergman, Jocelyne Demengeot, Karina B. Xavier, Isabel Gordo
(Submitted on 11 Nov 2013)
The accumulation of adaptive mutations is essential for survival in novel environments. However, in clonal populations with a high mutational supply, the power of natural selection is expected to be limited. This is due to clonal interference – the competition of clones carrying different beneficial mutations – which leads to the loss of many small effect mutations and fixation of large effect ones. If interference is abundant, then mechanisms for horizontal transfer of genes, which allow the immediate combination of beneficial alleles in a single background, are expected to evolve. However, the relevance of interference in natural complex environments, such as the gut, is poorly known. To address this issue, we studied the invasion of beneficial mutations responsible for Escherichia coli’s adaptation to the mouse gut and demonstrate the pervasiveness of clonal interference. The observed dynamics of change in frequency of beneficial mutations are consistent with soft sweeps, where a similar adaptive mutation arises repeatedly on different haplotypes without reaching fixation. The genetic basis of the adaptive mutations revealed a striking parallelism in independently evolving populations. This was mainly characterized by the insertion of transposable elements in both coding and regulatory regions of a few genes. Interestingly in most populations, we observed a complete phenotypic sweep without loss of genetic variation. The intense clonal interference during adaptation to the gut environment, here demonstrated, may be important for our understanding of the levels of strain diversity of E. coli inhabiting the human gut microbiota and of its recombination rate.
Some mathematical tools for the Lenski experiment
Bernard Ycart (LJK), Agnès Hamon (LJK), Joël Gaffé (LAPM), Dominique Schneider (LAPM)
(Submitted on 2 Oct 2013)
The Lenski experiment is a long term daily reproduction of Escherichia coli, that has evidenced phenotypic and genetic evolutions along the years. Some mathematical models, that could be usefull in understanding the results of that experiment, are reviewed here: stochastic and deterministic growth, mutation appearance and fixation, competition of species.