The McDonald-Kreitman Test and its Extensions under Frequent Adaptation: Problems and Solutions

Philipp W. Messer, Dmitri A. Petrov
(Submitted on 1 Nov 2012)

Population genomic studies have shown that genetic draft and background selection can profoundly affect the genome-wide patterns of molecular variation. We performed forward simulations under realistic gene-structure and selection scenarios to investigate whether such linkage effects impinge on the ability of the McDonald-Kreitman (MK) test to infer the rate of positive selection (\alpha) from polymorphism and divergence data. We find that in the presence of slightly deleterious mutations, MK estimates of \alpha severely underestimate the true rate of adaptation even if all polymorphisms with population frequencies under 50% are excluded. Furthermore, already under intermediate rates of adaptation, genetic draft substantially distorts the site frequency spectra at neutral and functional sites from the expectations under mutation-selection-drift balance. MK-type approaches that first infer demography from synonymous sites and then use the inferred demography to correct the estimation of \alpha obtain almost the correct \alpha in our simulations. However, these approaches typically infer a severe past population expansion although there was no such expansion in the simulations, casting doubt on the accuracy of methods that infer demography from synonymous polymorphism data. We suggest a simple asymptotic extension of the MK test that should yield accurate estimates of \alpha even in the presence of linkage effects.
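For readers unfamiliar with the quantity being estimated, here is a minimal sketch of the standard MK-type estimator, \alpha = 1 - (D_s P_n) / (D_n P_s), together with the frequency cutoff the abstract mentions. It is not the authors' simulation code or their asymptotic extension. The function names, argument names, and example counts are hypothetical and chosen only for illustration.

```python
def mk_alpha(dn, ds, pn, ps):
    """Standard MK-type estimate of alpha.

    dn, ds: counts of nonsynonymous and synonymous substitutions (divergence).
    pn, ps: counts of nonsynonymous and synonymous polymorphisms.
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero to estimate alpha")
    return 1.0 - (ds * pn) / (dn * ps)


def mk_alpha_with_cutoff(dn, ds, nonsyn_freqs, syn_freqs, min_freq=0.0):
    """Same estimate, counting only polymorphisms at frequency >= min_freq.

    nonsyn_freqs, syn_freqs: population frequencies of the derived allele
    at each polymorphic nonsynonymous / synonymous site (hypothetical input).
    """
    pn = sum(1 for f in nonsyn_freqs if f >= min_freq)
    ps = sum(1 for f in syn_freqs if f >= min_freq)
    return mk_alpha(dn, ds, pn, ps)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(mk_alpha(dn=40, ds=80, pn=10, ps=40))
    print(mk_alpha_with_cutoff(40, 80,
                               nonsyn_freqs=[0.05, 0.1, 0.6, 0.7],
                               syn_freqs=[0.1, 0.2, 0.55, 0.6, 0.8, 0.9],
                               min_freq=0.5))
```

Raising `min_freq` discards low-frequency polymorphisms, which are the ones most likely to be slightly deleterious. The abstract reports that even a 50% cutoff leaves a substantial downward bias in the authors' simulations.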

Most viewed on Haldane’s Sieve: October 2012

The most viewed preprints on Haldane’s Sieve in October 2012 were:

Using haplotype differentiation among hierarchically structured populations for the detection of selection signatures

Marìa Inès Fariello, Simon Boitard, Hugo Naya, Magali SanCristobal, Bertrand Servin
(Submitted on 29 Oct 2012)

The detection of molecular signatures of selection is one of the major concerns of modern population genetics. A widely used strategy in this context is to compare samples from several populations, and to look for genomic regions with outstanding genetic differentiation between these populations. Genetic differentiation is generally based on allele frequency differences between populations, which are measured by Fst or related statistics. Here we introduce a new statistic, denoted hapFLK, which focuses instead on the differences of haplotype frequencies between populations. In contrast to most existing statistics, hapFLK accounts for the hierarchical structure of the sampled populations. Using computer simulations, we show that each of these two features – the use of haplotype information and of the hierarchical structure of populations – significantly improves the detection power of selected loci, and that combining them in the hapFLK statistic provides even greater power. We also show that hapFLK is robust with respect to bottlenecks and migration and improves over existing approaches in many situations. Finally, we apply hapFLK to a set of six sheep breeds from Northern Europe, and identify seven regions under selection, which include already reported regions but also several new ones. We propose a method to help identifying the population(s) under selection in a detected region, which reveals that in many of these regions selection most likely occurred in more than one population. Furthermore, several of the detected regions correspond to incomplete sweeps, where the favourable haplotype is only at intermediate frequency in the population(s) under selection.
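As background for the allele-frequency baseline that the abstract contrasts with hapFLK, the sketch below computes a simple Fst at one biallelic locus as (H_T - H_S) / H_T. It is not hapFLK and ignores both haplotypes and population hierarchy. It assumes equally weighted populations, and the function name and example frequencies are hypothetical.

```python
def fst_biallelic(allele_freqs):
    """Simple Fst at one biallelic locus from per-population allele frequencies.

    allele_freqs: frequency of one allele in each population; populations
    are weighted equally (an assumption of this sketch).
    """
    n = len(allele_freqs)
    if n < 2:
        raise ValueError("need at least two populations")
    # Mean expected heterozygosity within populations.
    h_s = sum(2.0 * p * (1.0 - p) for p in allele_freqs) / n
    # Expected heterozygosity of the pooled population.
    p_bar = sum(allele_freqs) / n
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        raise ValueError("locus is monomorphic across all populations")
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Two populations with strongly different frequencies (made-up values).
    print(fst_biallelic([0.2, 0.8]))
```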

The Baldwin effect under multi-peaked fitness landscapes: Phenotypic fluctuation accelerates evolutionary rate

Nen Saito, Shuji Ishihara, Kunihiko Kaneko
(Submitted on 19 Oct 2012)

Phenotypic fluctuations and plasticity can generally affect the course of evolution, a process known as the Baldwin effect. Several studies have recast this effect and claimed that phenotypic plasticity accelerates evolutionary rate (the Baldwin expediting effect); however, the validity of this claim is still controversial. In this study, we investigate the evolutionary population dynamics of a quantitative genetic model under a multi-peaked fitness landscape, in order to evaluate the validity of the effect. We provide analytical expressions for the evolutionary rate and average population fitness. Our results indicate that under a multi-peaked fitness landscape, phenotypic fluctuation always accelerates evolutionary rate, but it decreases the average fitness. As an extreme case of the trade-off between the rate of evolution and average fitness, phenotypic fluctuation is shown to accelerate the error catastrophe, in which a population fails to sustain a high-fitness peak. In the context of our findings, we discuss the role of phenotypic plasticity in adaptive evolution.

Plump Cutthroat Trout and Thin Rainbow Trout in a Lentic Ecosystem

Joshua Courtney, Jessica Abbott, Kerri Schmidt, Michael Courtney
(Submitted on 17 Oct 2012)

Background: Much has been written about introduced rainbow trout (Oncorhynchus mykiss) interbreeding with and outcompeting cutthroat trout (Oncorhynchus clarkii). However, the specific mechanisms by which rainbow trout and their hybrids outcompete cutthroat trout have not been thoroughly explored, and the published data are limited to lotic ecosystems. Materials and Methods: Samples of rainbow trout and cutthroat trout were obtained from a lentic ecosystem by angling. The total length and weight of each fish were measured and the relative weight of each fish was computed (Anderson R.O., Neumann R.M. 1996. Length, Weight, and Associated Structural Indices, Pp. 447-481. In: Murphy B.E. and Willis D.W. (eds.) Fisheries Techniques, second edition. American Fisheries Society.), along with the mean and uncertainty in the mean for each species. Data from an independent source (K.D. Carlander, 1969. Handbook of Freshwater Fishery Biology, Volume One, Iowa University Press, Ames.) were also used to generate mean weight-length curves, as well as 25th and 75th percentile curves for each species to allow further comparison. Results: The mean relative weight of the rainbow trout was 72.5 (+/- 2.1), whereas the mean relative weight of the cutthroat trout was 101.0 (+/- 4.9). The rainbow trout were thin; 80% weighed below the 25th percentile. The cutthroat trout were plump; 86% weighed above the 75th percentile, and 29% were above the heaviest recorded specimens at a given length in the Carlander (1969) data set. Conclusion: These data cast doubt on the hypothesis that rainbow trout are strong food competitors with cutthroat trout in lentic ecosystems. On the contrary, in the lake under study, the cutthroat trout seem to be outcompeting rainbow trout for the available food.
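The relative weight used in this abstract is conventionally defined as 100 times the ratio of a fish's weight to a length-specific standard weight, with the standard weight given by a log-linear equation, log10(Ws) = a + b log10(L). The sketch below assumes that form. The coefficients and measurements in the example are placeholders rather than published species values or the authors' data, and the "uncertainty in the mean" is taken to be the standard error, which is also an assumption.

```python
import math
from statistics import mean, stdev


def standard_weight(length_mm, intercept, slope):
    """Standard weight Ws from log10(Ws) = intercept + slope * log10(length)."""
    return 10 ** (intercept + slope * math.log10(length_mm))


def relative_weight(weight_g, length_mm, intercept, slope):
    """Relative weight Wr = 100 * W / Ws."""
    return 100.0 * weight_g / standard_weight(length_mm, intercept, slope)


def mean_and_sem(values):
    """Sample mean and standard error of the mean (needs >= 2 values)."""
    return mean(values), stdev(values) / math.sqrt(len(values))


if __name__ == "__main__":
    # Placeholder coefficients and (weight_g, length_mm) pairs, illustration only.
    a, b = -5.0, 3.0
    fish = [(250.0, 300.0), (300.0, 310.0), (280.0, 305.0)]
    wr = [relative_weight(w, length, a, b) for w, length in fish]
    print(mean_and_sem(wr))
```

A relative weight near 100 indicates a fish at the standard weight for its length; values well below 100 indicate a thin fish.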

Integrated Nested Laplace Approximation for Bayesian Nonparametric Phylodynamics

Julia A. Palacios, Vladimir N. Minin
(Submitted on 16 Oct 2012)

The goal of phylodynamics, an area at the intersection of phylogenetics and population genetics, is to reconstruct population size dynamics from genetic data. Recently, a series of nonparametric Bayesian methods has been proposed for such demographic reconstructions. These methods rely on prior specifications based on Gaussian processes and proceed by approximating the posterior distribution of population size trajectories via Markov chain Monte Carlo (MCMC) methods. In this paper, we adapt an integrated nested Laplace approximation (INLA), a recently proposed approximate Bayesian inference method for latent Gaussian models, to the estimation of population size trajectories. We show that when a genealogy of sampled individuals can be reliably estimated from genetic data, INLA enjoys high accuracy and can replace MCMC entirely. We demonstrate significant computational efficiency over the state-of-the-art MCMC methods. We illustrate INLA-based population size inference using simulations and genealogies of hepatitis C and human influenza viruses.

Quantitative analyses of empirical fitness landscapes

Ivan G. Szendro, Martijn F. Schenk, Jasper Franke, Joachim Krug, J. Arjan G. M. de Visser
(Submitted on 20 Feb 2012 (v1), last revised 17 Oct 2012 (this version, v2))

The concept of a fitness landscape is a powerful metaphor that offers insight into various aspects of evolutionary processes and guidance for the study of evolution. Until recently, empirical evidence on the ruggedness of these landscapes was lacking, but since it became feasible to construct all possible genotypes containing combinations of a limited set of mutations, the number of studies has grown to a point where a classification of landscapes becomes possible. The aim of this review is to identify measures of epistasis that allow a meaningful comparison of fitness landscapes and then apply them to the empirical landscapes to discern factors that affect ruggedness. The various measures of epistasis that have been proposed in the literature appear to be equivalent. Our comparison shows that the ruggedness of the empirical landscape is affected by whether the included mutations are beneficial or deleterious and by whether intra- or intergenic epistasis is involved. Finally, the empirical landscapes are compared to landscapes generated with the Rough Mt. Fuji model. Despite the simplicity of this model, it captures the features of the experimental landscapes remarkably well.

Fluctuations in fitness distributions and the effects of weak linked selection on sequence evolution

Benjamin H. Good, Michael M. Desai
(Submitted on 15 Oct 2012)

Evolutionary dynamics and patterns of molecular evolution are strongly influenced by selection on linked regions of the genome, but our quantitative understanding of these effects remains incomplete. Recent work has focused on predicting the distribution of fitness within an evolving population, and this forms the basis for several methods that leverage the fitness distribution to predict the patterns of genetic diversity when selection is strong. However, in weakly selected populations random fluctuations due to genetic drift are more severe, and neither the distribution of fitness nor the sequence diversity within the population is well understood. Here, we briefly review the motivations behind the fitness-distribution picture, and summarize the general approaches that have been used to analyze this distribution in the strong-selection regime. We then extend these approaches to the case of weak selection, by outlining a perturbative treatment of selection at a large number of linked sites. This allows us to quantify the stochastic behavior of the fitness distribution and yields exact analytical predictions for the sequence diversity and substitution rate in the limit that selection is weak.

A 454 survey of the community composition and core microbiome of the common bed bug, Cimex lectularius, reveals significant microbial community structure across an urban landscape

Matthew Meriweather, Sara Matthews, Rita Rio, Regina S Baucom
(Submitted on 13 Oct 2012)

Elucidating the spatial dynamics and core constituents of the microbial communities found in association with arthropod hosts is of crucial importance for insects that may vector human or agricultural pathogens. The hematophagous Cimex lectularius, known as the common bed bug, has made a recent resurgence in North America, as well as worldwide, potentially owing to increased travel and resistance to insecticides. A comprehensive survey of the bed bug microbiome has not been performed to date, nor has an assessment of the spatial dynamics of its microbiome. Here we present a survey of bed bug microbial communities by amplifying the V4-V6 hypervariable region of the 16S rDNA gene followed by 454 Titanium sequencing using 31 individuals from eight natural populations collected from residences in Cincinnati, OH. Across all samples, 97% of the microbial community is made up of two dominant OTUs identified as the α-proteobacterium Wolbachia and an unnamed γ-proteobacterium from the Enterobacteriaceae. Microbial communities varied among host populations for measures of community diversity and exhibited significant population structure. We also uncovered a strong negative correlation in the abundance of the two dominant OTUs, suggesting they may fulfill similar roles as nutritional mutualists. This broad survey represents the most comprehensive assessment, to date, of the microbes that associate with bed bugs, and uncovers evidence for potential antagonism between the two dominant members of the bed bug microbiome.

Species Identification and Unbiased Profiling of Complex Microbial Communities Using Shotgun Illumina Sequencing of 16S rRNA Amplicon Sequences

Swee Hoe Ong, Vinutha Uppoor Kukkillaya, Andreas Wilm, Christophe Lay, Eliza Xin Pei Ho, Louie Low, Martin Lloyd Hibberd, Niranjan Nagarajan
(Submitted on 12 Oct 2012)

The high throughput and cost-effectiveness afforded by short-read sequencing technologies, in principle, enable researchers to perform 16S rRNA profiling of complex microbial communities at unprecedented depth and resolution. Existing Illumina sequencing protocols are, however, limited by the fraction of the 16S rRNA gene that is interrogated and therefore limit the resolution and quality of the profiling. To address this, we present the design of a novel protocol for shotgun Illumina sequencing of the bacterial 16S rRNA gene, optimized to capture more than 90% of sequences in the Greengenes database and with nearly twice the resolution of existing protocols. Using several in silico and experimental datasets, we demonstrate that despite the presence of multiple variable and conserved regions, the resulting shotgun sequences can be used to accurately quantify the diversity of complex microbial communities. The reconstruction of a significant fraction of the 16S rRNA gene also enabled high precision (>90%) in species-level identification thereby opening up potential application of this approach for clinical microbial characterization.