Defensive complexity and the phylogenetic conservation of immune control

Erick Chastain, Rustom Antia, Carl T. Bergstrom
(Submitted on 13 Nov 2012)

One strategy for winning a coevolutionary struggle is to evolve rapidly. Most of the literature on host-pathogen coevolution focuses on this phenomenon, and looks for consequent evidence of coevolutionary arms races. An alternative strategy, less often considered in the literature, is to deter rapid evolutionary change by the opponent. To study how this can be done, we construct an evolutionary game between a controller that must process information, and an adversary that can tamper with this information processing. In this game, a species can foil its antagonist by processing information in a way that is hard for the antagonist to manipulate. We show that the structure of the information processing system induces a fitness landscape on which the adversary population evolves. Complex processing logic can carve long, deep fitness valleys that slow adaptive evolution in the adversary population. We suggest that this type of defensive complexity on the part of the vertebrate adaptive immune system may be an important element of coevolutionary dynamics between pathogens and their vertebrate hosts. Furthermore, we cite evidence that the immune control logic is phylogenetically conserved in mammalian lineages. Thus our model of defensive complexity suggests a new hypothesis for the lower rates of evolution for immune control logic compared to other immune structures.

BayesHammer: Bayesian clustering for error correction in single-cell sequencing

Sergey I. Nikolenko, Anton I. Korobeynikov, Max A. Alekseyev
(Submitted on 12 Nov 2012)

Error correction of sequenced reads remains a difficult task, especially in single-cell sequencing projects with extremely non-uniform coverage. While existing error correction tools designed for standard (multi-cell) sequencing data usually come up short in single-cell sequencing projects, the algorithms actually used for single-cell error correction have so far been very simplistic.
We introduce several novel algorithms based on Hamming graphs and Bayesian subclustering in our new error correction tool BayesHammer. While BayesHammer was designed for single-cell sequencing, we demonstrate that it also improves on existing error correction tools for multi-cell sequencing data while working much faster on real-life datasets. We benchmark BayesHammer on both $k$-mer counts and actual assembly results with the SPAdes genome assembler.
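
To make the Hamming-graph idea concrete, here is a minimal sketch, not BayesHammer's actual implementation: it joins k-mers that differ in at most `tau` positions and returns the connected components, which are the candidate clusters an error corrector might then subdivide. The function names and the `tau` parameter are assumptions made for illustration, and the all-pairs comparison is only practical for toy inputs.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def hamming_components(kmers, tau=1):
    """Group k-mers into connected components of the Hamming graph.

    Two k-mers share an edge when they differ in at most `tau` positions.
    `tau` is an illustrative parameter, not a documented BayesHammer default.
    """
    kmers = sorted(set(kmers))
    parent = {k: k for k in kmers}  # union-find forest

    def find(k):
        # Follow parent links to the root, shortening the path as we go.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    # Quadratic all-pairs scan: fine for a sketch, too slow for real data.
    for a, b in combinations(kmers, 2):
        if hamming(a, b) <= tau:
            parent[find(a)] = find(b)

    groups = {}
    for k in kmers:
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values())


clusters = hamming_components(["ACGT", "ACGA", "ACCA", "TTTT"], tau=1)
print(clusters)
```

Here `ACCA`, `ACGA` and `ACGT` end up in one component because each neighbouring pair differs at a single position, while `TTTT` stays on its own.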

Exact results for fixation probability of bithermal evolutionary graphs

Bahram Houchmandzadeh (LIPhy), Marcel Vallade (LIPhy)
(Submitted on 12 Nov 2012)

One of the most fundamental concepts of evolutionary dynamics is the “fixation” probability, i.e. the probability that a mutant spreads through the whole population. Most natural communities are geographically structured into habitats exchanging individuals among each other and can be modeled by an evolutionary graph (EG), where directed links weight the probability for the offspring of one individual to replace another individual in the community. Very few exact analytical results are known for EGs. We show here how, by using fixed-point techniques for the probability generating function, we can uncover a large class of graphs, which we term bithermal, for which the exact fixation probability can be simply computed.
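
For orientation, the classical baseline that results like this extend is the fixation probability of a single mutant with relative fitness r in a well-mixed Moran population of size N, ρ = (1 − 1/r) / (1 − 1/r^N), which reduces to 1/N for a neutral mutant. The sketch below evaluates only that textbook formula; it is not the bithermal result of the paper, and the function name is an assumption.

```python
def moran_fixation_probability(r: float, n: int) -> float:
    """Fixation probability of one mutant in a well-mixed Moran process.

    r: relative fitness of the mutant (the resident has fitness 1).
    n: population size.
    """
    if n < 1:
        raise ValueError("population size must be at least 1")
    if r <= 0:
        raise ValueError("relative fitness must be positive")
    if r == 1.0:
        # Neutral limit of the general formula.
        return 1.0 / n
    return (1.0 - 1.0 / r) / (1.0 - r ** (-n))


for fitness in (0.9, 1.0, 1.1):
    print(fitness, moran_fixation_probability(fitness, 100))
```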

Improved haplotyping of rare variants using next-generation sequence data

Fouad Zakharia, Carlos Bustamante
(Submitted on 9 Nov 2012)

Accurate identification of haplotypes in sequenced human genomes can provide invaluable information about population demography and fine-scale correlations along the genome, thus empowering both population genomic and medical association studies. Yet phasing unrelated individuals remains a challenging problem. Incorporating available data from high-throughput sequencing into traditional statistical phasing approaches is a promising avenue to alleviate this problem. We present a novel statistical method that expands on an existing graphical haplotype reconstruction method (shapeIT) to incorporate phasing information from paired-end read data. The algorithm harnesses the haplotype graph information estimated by shapeIT from genotypes across the population and refines haplotype likelihoods for a given individual to be compatible with the sequencing data. Applying the method to HapMap individuals genotyped on the Affymetrix Axiom chip at 7,745,081 SNPs and on a trio sequenced by Complete Genomics, we found that the inclusion of paired-end read data significantly improved phasing, with reductions in switch error on the order of 4-15% against shapeIT across all panels. As expected, the improvements were found to be most significant at sites harboring rare variants; furthermore, we found that longer read sizes and higher throughput translated to greater decreases in switching error, as did higher variance in the size of the insert separating the two reads, suggesting that multi-platform next-generation sequencing may be exploited to yield particularly accurate haplotypes. Overall, the phasing improvements afforded by this new method highlight the power of integrating sequencing read information and population genotype data for reconstructing haplotypes in unrelated individuals.
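
Since the headline metric here is switch error, a small sketch of one common way to compute it may help. The definition assumed below (the abstract does not spell one out) counts, over an individual's heterozygous sites, how often the inferred phase flips relative to the truth between consecutive sites, and divides by the number of consecutive pairs. Haplotypes are given as 0/1 strings for one of the two chromosomes; the function name is hypothetical.

```python
def switch_error_rate(truth: str, inferred: str) -> float:
    """Fraction of consecutive heterozygous-site pairs with a phase switch.

    truth, inferred: one haplotype each, as 0/1 strings restricted to
    heterozygous sites (the other haplotype is the complement).
    """
    if len(truth) != len(inferred):
        raise ValueError("haplotypes must cover the same sites")
    if len(truth) < 2:
        raise ValueError("need at least two heterozygous sites")
    # True where the inferred allele matches the true allele on this strand.
    agree = [t == p for t, p in zip(truth, inferred)]
    # A switch occurs whenever agreement changes between neighbouring sites.
    switches = sum(a != b for a, b in zip(agree, agree[1:]))
    return switches / (len(truth) - 1)


print(switch_error_rate("010101", "010010"))  # one flip after the third site
print(switch_error_rate("010101", "101010"))  # fully complementary: no switch
```

Under this definition a haplotype that is the exact complement of the truth scores zero, because the labelling of the two chromosomes is arbitrary.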

The effect of multiple paternity on genetic diversity during and after colonisation

M. Rafajlovic, A. Eriksson, A. Rimark, S. H. Saltin, G. Charrier, M. Panova, C. André, K. Johannesson, B. Mehlig
(Submitted on 5 Nov 2012)

In metapopulations, genetic variation of local populations is influenced by the genetic content of the founders, and of migrants following establishment. We analyse the effect of multiple paternity on genetic diversity using a model in which the highly promiscuous marine snail Littorina saxatilis expands from a mainland to colonise initially empty islands of an archipelago. Migrant females carry a large number of eggs fertilised by 1–10 mates. We quantify the genetic diversity of the population in terms of its heterozygosity: initially during the transient colonisation process, and at long times when the population has reached an equilibrium state with migration. During colonisation, multiple paternity increases the heterozygosity by 10–300% in comparison with the case of single paternity. The equilibrium state, by contrast, is less strongly affected: multiple paternity gives rise to 10–50% higher heterozygosity compared with single paternity. Further, we find that far from the mainland, new mutations spreading from the mainland cause bursts of high genetic diversity separated by long periods of low diversity. This effect is boosted by multiple paternity. We conclude that multiple paternity facilitates colonisation and maintenance of small populations, whether or not this is the main cause for the evolution of extreme promiscuity in Littorina saxatilis.
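
As a reminder of what a heterozygosity measure looks like in practice, the sketch below computes the standard expected heterozygosity at one locus, H = 1 − Σ p_i², from a sample of allele labels. This is the textbook definition and is assumed here for illustration; the paper may use a different estimator, and the function name is hypothetical.

```python
from collections import Counter


def expected_heterozygosity(alleles) -> float:
    """Expected heterozygosity H = 1 - sum(p_i^2) at a single locus.

    alleles: iterable of allele labels, one per sampled gene copy.
    """
    alleles = list(alleles)
    n = len(alleles)
    if n == 0:
        raise ValueError("need at least one sampled allele")
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


print(expected_heterozygosity(["A", "A", "a", "a"]))  # two alleles at equal frequency
print(expected_heterozygosity(["A", "A", "A", "A"]))  # monomorphic sample
```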

Genomic mutation rates that neutralize adaptive evolution and natural selection

Philip Gerrish, Alexandre Colato, Paul Sniegowski
(Submitted on 5 Nov 2012)

When mutation rates are low, natural selection remains effective, and increasing the mutation rate can give rise to an increase in adaptation rate. When mutation rates are high to begin with, however, increasing the mutation rate may have a detrimental effect because of the overwhelming presence of deleterious mutations. Indeed, if mutation rates are high enough: 1) adaptation rate can become negative despite the continued availability of adaptive and/or compensatory mutations, or 2) natural selection may be disabled because adaptive and/or compensatory mutations, whether established or newly arising, are eroded by excessive mutation and decline in frequency. We apply these two criteria to a standard model of asexual adaptive evolution and derive mathematical expressions (some new, some old in new guise) delineating the mutation rates under which either adaptive evolution or natural selection is neutralized. The expressions are simple and require no a priori knowledge of organism- and/or environment-specific parameters. Our discussion connects these results to each other and to previous theory, showing convergence or equivalence of the different results in most cases.

Response to Horizontal gene transfer may explain variation in θs

Inigo Martincorena, Nicholas M. Luscombe
(Submitted on 5 Nov 2012)

In a short article submitted to arXiv [1], Maddamsetti et al. argue that the variation in the neutral mutation rate among genes in Escherichia coli that we recently reported [2] might be explained by horizontal gene transfer (HGT). To support their argument they present a reanalysis of synonymous diversity in 10 E. coli strains together with an analysis of a collection of 1,069 synonymous mutations found in repair-deficient strains in a long-term in vitro evolution experiment. Here we respond to this communication. Briefly, we explain that HGT was carefully accounted for in our study by multiple independent phylogenetic and population genetic approaches, and we show that there is no new evidence of HGT affecting our results. We also argue that caution must be exercised when comparing mutations from repair-deficient strains to data from wild-type strains, as these conditions are dominated by different mutational processes. Finally, we reanalyse Maddamsetti et al.'s collection of mutations from a long-term in vitro experiment and we report preliminary evidence of non-random variation of the mutation rate in these repair-deficient strains.

Our paper: The McDonald-Kreitman Test and its Extensions under Frequent Adaptation: Problems and Solutions

For our next guest post, Philipp Messer and Dmitri Petrov write about their paper, The McDonald-Kreitman Test and its Extensions under Frequent Adaptation: Problems and Solutions, arXived here.

The McDonald-Kreitman (MK) test is the basis of most modern approaches to measure the rate of adaptation from population genomic data. This test was used to argue that in some organisms, such as Drosophila, the rate of adaptation is surprisingly high. However, the MK test, and in fact most of the current machinery of population genetics, relies on the assumption that adaptation is rare so that the effects of selective sweeps on linked variation can be neglected. We test this assumption using a powerful forward simulation and show that the MK test is severely biased even when the rate of adaptation is only moderate. The biases arise from the complex linkage effects between slightly deleterious and strongly advantageous mutations. In order to deal with these biases, we suggest a new robust approach based on a simple asymptotic extension of the MK test.

We further show that already under very moderate amounts of adaptation, linkage effects from recurrent selective sweeps can profoundly affect key population genetic parameters, such as the fixation probabilities of deleterious mutations and the frequency distributions of polymorphisms. In synonymous polymorphism data, these linkage effects leave signatures that can easily be mistaken for the signatures of recent, severe population expansion.

The bigger claim of our paper is that the effects of linked selection cannot be simply swept under the rug by introducing effective parameters, such as effective population size or effective strength of selection, and then using these effective parameters in formulae derived from the diffusion approximation under the assumption of free recombination. Given that most of our estimates of the key evolutionary parameters are still obtained from methods based on this paradigm, we argue that it is crucial to verify whether they are robust to linkage effects.

Philipp Messer and Dmitri Petrov

Inference of Admixture Parameters in Human Populations Using Weighted Linkage Disequilibrium

Po-Ru Loh, Mark Lipson, Nick Patterson, Priya Moorjani, Joseph K Pickrell, David Reich, Bonnie Berger
(Submitted on 1 Nov 2012)

Long-range migrations and the resulting admixture between populations have been an important force shaping human genetic diversity. Most existing methods for detecting and reconstructing historical admixture events are based on allele frequency divergences or patterns of ancestry segments in chromosomes of admixed individuals. An emerging new approach harnesses the exponential decay of admixture-induced linkage disequilibrium (LD) as a function of genetic distance. Here, we comprehensively develop LD-based inference into a versatile tool for investigating admixture. We present a new weighted LD statistic that can be used to infer mixture proportions as well as dates with fewer constraints on reference populations than previous methods. We define an LD-based three-population test for admixture and identify scenarios in which it can detect admixture that previous formal tests cannot. We further show that we can discover phylogenetic relationships between populations by comparing weighted LD curves obtained using a suite of references. Finally, we describe several improvements to the computation and fitting of weighted LD curves that greatly increase the robustness and speed of the computation. We implement all of these advances in a software package, ALDER, which we validate in simulations and apply to test for admixture among all populations from the Human Genome Diversity Project (HGDP), highlighting insights into the admixture history of Central African Pygmies, Sardinians, and Japanese.
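
The exponential decay mentioned above can be illustrated with a toy fit. Under a single-pulse admixture model, admixture LD at genetic distance d (in Morgans) is expected to fall off roughly as A·exp(−n·d), where n is the number of generations since admixture. The sketch below recovers n from a log-linear least-squares fit. It is not ALDER's procedure: it ignores the weighting, any affine term and all noise handling, and the function name and inputs are assumptions.

```python
import math


def estimate_generations(distances_morgans, ld_values) -> float:
    """Estimate generations since admixture from an LD decay curve.

    Fits log(LD) = log(A) - n * d by ordinary least squares and returns n.
    Non-positive LD values are skipped because their log is undefined.
    """
    pairs = [(d, math.log(v)) for d, v in zip(distances_morgans, ld_values) if v > 0]
    if len(pairs) < 2:
        raise ValueError("need at least two positive LD values")
    mean_d = sum(d for d, _ in pairs) / len(pairs)
    mean_y = sum(y for _, y in pairs) / len(pairs)
    sxx = sum((d - mean_d) ** 2 for d, _ in pairs)
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in pairs)
    if sxx == 0:
        raise ValueError("distances must not all be equal")
    return -sxy / sxx  # the slope is -n


# Synthetic, noise-free curve for a hypothetical admixture 50 generations ago.
distances = [0.005 * i for i in range(1, 11)]
curve = [0.3 * math.exp(-50 * d) for d in distances]
print(estimate_generations(distances, curve))
```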

The McDonald-Kreitman Test and its Extensions under Frequent Adaptation: Problems and Solutions

Philipp W. Messer, Dmitri A. Petrov
(Submitted on 1 Nov 2012)

Population genomic studies have shown that genetic draft and background selection can profoundly affect the genome-wide patterns of molecular variation. We performed forward simulations under realistic gene-structure and selection scenarios to investigate whether such linkage effects impinge on the ability of the McDonald-Kreitman (MK) test to infer the rate of positive selection ($\alpha$) from polymorphism and divergence data. We find that in the presence of slightly deleterious mutations, MK estimates of $\alpha$ severely underestimate the true rate of adaptation even if all polymorphisms with population frequencies under 50% are excluded. Furthermore, already under intermediate rates of adaptation, genetic draft substantially distorts the site frequency spectra at neutral and functional sites from the expectations under mutation-selection-drift balance. MK-type approaches that first infer demography from synonymous sites and then use the inferred demography to correct the estimation of $\alpha$ obtain almost the correct $\alpha$ in our simulations. However, these approaches typically infer a severe past population expansion although there was no such expansion in the simulations, casting doubt on the accuracy of methods that infer demography from synonymous polymorphism data. We suggest a simple asymptotic extension of the MK test that should yield accurate estimates of $\alpha$ even in the presence of linkage effects.
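
For readers unfamiliar with the estimator being discussed, the sketch below computes the standard MK-based estimate α = 1 − (Ds·Pn)/(Dn·Ps) from counts of nonsynonymous and synonymous divergence (Dn, Ds) and polymorphism (Pn, Ps), with an optional frequency cutoff of the kind the abstract mentions. It shows only the classic estimator, not the asymptotic extension the authors propose; all counts and frequencies in the example are made up, and the function names are hypothetical.

```python
def mk_alpha(d_n: int, d_s: int, p_n: int, p_s: int) -> float:
    """Classic MK estimate of the fraction of adaptive substitutions."""
    if d_n <= 0 or p_s <= 0:
        raise ValueError("d_n and p_s must be positive")
    return 1.0 - (d_s * p_n) / (d_n * p_s)


def mk_alpha_with_cutoff(d_n, d_s, freqs_n, freqs_s, min_freq=0.0) -> float:
    """MK estimate that ignores polymorphisms below a frequency cutoff.

    freqs_n, freqs_s: derived-allele frequencies of nonsynonymous and
    synonymous polymorphisms; only those >= min_freq are counted.
    """
    p_n = sum(f >= min_freq for f in freqs_n)
    p_s = sum(f >= min_freq for f in freqs_s)
    return mk_alpha(d_n, d_s, p_n, p_s)


# Made-up data: low-frequency nonsynonymous variants inflate Pn.
freqs_n = [0.05, 0.1, 0.6]
freqs_s = [0.05, 0.2, 0.55, 0.7, 0.9]
print(mk_alpha_with_cutoff(10, 20, freqs_n, freqs_s))                # all polymorphisms
print(mk_alpha_with_cutoff(10, 20, freqs_n, freqs_s, min_freq=0.5))  # common variants only
```

In this toy data the estimate rises once rare nonsynonymous polymorphisms are excluded, which is the motivation for frequency cutoffs; the abstract's point is that such cutoffs alone do not remove the bias.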