Genealogies of rapidly adapting populations
Richard A. Neher, Oskar Hallatschek
(Submitted on 15 Aug 2012)

The genetic diversity of a species is shaped by its recent evolutionary history and can be used to infer demographic events or selective sweeps. Most inference methods are based on the null hypothesis that natural selection is a weak evolutionary force. However, many species, particularly pathogens, are under continuous pressure to adapt in response to changing environments. A statistical framework for inference from diversity data of such populations is currently lacking. Toward this goal, we explore the properties of genealogies that emerge from models of continual adaptation. We show that lineages trace back to a small pool of highly fit ancestors, in which simultaneous coalescence of more than two lineages frequently occurs. While such multiple mergers are unlikely under the neutral coalescent, they create a unique genetic footprint in adapting populations. The site frequency spectrum of derived neutral alleles, for example, is non-monotonic and has a peak at high frequencies, whereas Tajima’s D becomes more and more negative with increasing sample size. Since multiple merger coalescents emerge in various evolutionary scenarios characterized by sustained selection pressures, we argue that they should be considered as null-models for adapting populations.
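The two summary statistics mentioned in the abstract can be explored with a short simulation. The sketch below is not the authors' code. It assumes the Bolthausen-Sznitman coalescent as one standard example of a multiple-merger process (the abstract does not name a specific one), and it uses the textbook definition of Tajima's D computed from an unfolded site frequency spectrum. The function names `bsc_branch_lengths`, `mean_sfs`, and `tajimas_d` are hypothetical.

```python
import math
import random


def bsc_branch_lengths(n, rng):
    """Simulate one Bolthausen-Sznitman genealogy for n samples.

    Returns a list L where L[i] is the total branch length that
    subtends exactly i sampled leaves (i = 1..n-1); L[0] is unused.
    """
    sizes = [1] * n          # number of sampled leaves below each lineage
    lengths = [0.0] * n
    while len(sizes) > 1:
        b = len(sizes)
        # With b lineages the total merger rate is b - 1.
        dt = rng.expovariate(b - 1)
        for s in sizes:
            lengths[s] += dt
        # A merger of k lineages has probability b / ((b - 1) k (k - 1)).
        ks = list(range(2, b + 1))
        weights = [b / ((b - 1) * k * (k - 1)) for k in ks]
        k = rng.choices(ks, weights=weights)[0]
        merged = set(rng.sample(range(b), k))
        new_size = sum(sizes[i] for i in merged)
        sizes = [s for i, s in enumerate(sizes) if i not in merged]
        sizes.append(new_size)
    return lengths


def mean_sfs(n, reps, seed=0):
    """Average branch length per leaf count over reps genealogies.

    Entry i-1 is proportional to the expected number of derived
    neutral alleles carried by i of the n samples.
    """
    rng = random.Random(seed)
    total = [0.0] * n
    for _ in range(reps):
        for i, x in enumerate(bsc_branch_lengths(n, rng)):
            total[i] += x
    return [t / reps for t in total[1:]]


def tajimas_d(sfs):
    """Tajima's D from an unfolded SFS; sfs[i-1] counts sites at frequency i/n."""
    n = len(sfs) + 1
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    seg = sum(sfs)                               # number of segregating sites
    pairs = n * (n - 1) / 2
    pi = sum(x * i * (n - i) for i, x in enumerate(sfs, start=1)) / pairs
    return (pi - seg / a1) / math.sqrt(e1 * seg + e2 * seg * (seg - 1))
```

Plotting `mean_sfs(n, reps)` against allele count for a few values of `n` is one way to look at the shape of the spectrum under this assumed process. Multiplying the result by a mutation rate gives an expected spectrum that can be passed to `tajimas_d` for comparison with the neutral expectation, for which D is close to zero.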

Genetic Diversity and the Structure of Genealogies in Rapidly Adapting Populations.

Michael M. Desai, Aleksandra M. Walczak, Daniel S. Fisher
(Submitted on 16 Aug 2012)
Positive selection distorts the structure of genealogies and hence alters patterns of genetic variation within a population. Most analyses of these distortions focus on the signatures of hitchhiking due to hard or soft selective sweeps at a single genetic locus. However, in linked regions of rapidly adapting genomes, multiple beneficial mutations at different loci can segregate simultaneously within the population, an effect known as clonal interference. This produces a subtle interplay between hitchhiking and interference effects, which in turn leaves a unique signature of rapid adaptation on genetic variation, both at the selected sites and at linked neutral loci. Here, we introduce an effective coalescent theory (a “fitness-class coalescent”) that describes how positive selection at many perfectly linked sites alters the structure of genealogies. We use this theory to calculate several simple statistics describing genetic variation within a rapidly adapting population, and to implement efficient backwards-time coalescent simulations, which can be used to predict how clonal interference alters the expected patterns of molecular evolution.
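To make the notion of a fitness class concrete, here is a minimal forward-time sketch. It is not the fitness-class coalescent itself, which runs backwards in time and whose details are not given in the abstract. It assumes that a fitness class is the set of individuals carrying the same number k of beneficial mutations, that every beneficial mutation has the same effect `s`, and that mutations arise at rate `u_b` per individual per generation. These parameter names and the function `evolve_fitness_classes` are hypothetical.

```python
import random
from collections import Counter


def evolve_fitness_classes(pop_size, generations, s, u_b, seed=0):
    """Track how many individuals carry k beneficial mutations.

    Each generation, offspring choose a parental class with weight
    (class size) * (1 + s)^k, then gain one mutation with probability u_b.
    Returns a Counter mapping k to the number of individuals in class k.
    """
    rng = random.Random(seed)
    classes = Counter({0: pop_size})
    for _ in range(generations):
        ks = list(classes)
        k_min = min(ks)
        # Fitness relative to the least-loaded class keeps the weights bounded.
        weights = [classes[k] * (1 + s) ** (k - k_min) for k in ks]
        offspring = rng.choices(ks, weights=weights, k=pop_size)
        classes = Counter(
            k + 1 if rng.random() < u_b else k for k in offspring
        )
    return classes


if __name__ == "__main__":
    result = evolve_fitness_classes(pop_size=1000, generations=200, s=0.02, u_b=0.01)
    for k in sorted(result):
        print(k, result[k])
```

When several classes are occupied at once, beneficial mutations on different backgrounds compete with one another, which is the clonal interference regime the abstract describes.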

How to infer relative fitness from a sample of genomic sequences
Adel Dayarian, Boris I Shraiman
(Submitted on 29 Aug 2012)

Mounting evidence suggests that natural populations can harbor extensive fitness diversity with numerous genomic loci under selection. It is also known that genealogical trees for populations under selection are quantifiably different from those expected under neutral evolution and described statistically by Kingman’s coalescent. While differences in the statistical structure of genealogies have long been used as a test for the presence of selection, the full extent of the information that they contain has not been exploited. Here we shall demonstrate that the shape of the reconstructed genealogical tree for a moderately large number of random genomic samples taken from a fitness-diverse, but otherwise unstructured, asexual population can be used to predict the relative fitness of individuals within the sample. To achieve this we define a heuristic algorithm, which we test in silico using simulations of a Wright-Fisher model for a realistic range of mutation rates and selection strengths. Our inferred fitness ranking is based on a linear discriminator which identifies rapidly coalescing lineages in the reconstructed tree. The inferred fitness ranking correlates strongly with the actual fitness: individuals ranked in the top 10% fall within the top 20% fittest, with a false discovery rate of 0.1-0.3 depending on the mutation/selection parameters. The ranking also makes it possible to predict the common genotype of the future population. While the inference accuracy increases monotonically with sample size, sample sizes of 200 nearly saturate the performance. We propose that our approach can be used for inferring relative fitness of genomes obtained in single-cell sequencing of tumors and in monitoring viral outbreaks.
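The evaluation described in the abstract can be sketched in two parts: a Wright-Fisher simulation that produces a fitness-diverse asexual population, and a check of how often individuals ranked in the top 10% are not among the top 20% fittest. The code below is an illustration only. It does not implement the authors' tree-based linear discriminator, which the abstract does not specify. The mutation model (each offspring gains or loses `s` in log-fitness with probability `mu`) and the names `wright_fisher` and `false_discovery_rate` are assumptions.

```python
import math
import random


def wright_fisher(pop_size, generations, mu, s, seed=0):
    """Asexual Wright-Fisher model; returns the final log-fitness values.

    Assumed mutation model: with probability mu an offspring's
    log-fitness changes by +s or -s (equally likely).
    """
    rng = random.Random(seed)
    log_fitness = [0.0] * pop_size
    for _ in range(generations):
        top = max(log_fitness)
        # Parents are drawn with probability proportional to exp(log-fitness).
        weights = [math.exp(x - top) for x in log_fitness]
        parents = rng.choices(range(pop_size), weights=weights, k=pop_size)
        log_fitness = [
            log_fitness[p] + (rng.choice((s, -s)) if rng.random() < mu else 0.0)
            for p in parents
        ]
    return log_fitness


def false_discovery_rate(scores, fitness, top_ranked=0.1, top_fit=0.2):
    """Fraction of the top-ranked individuals that are NOT among the fittest.

    scores: inferred ranking scores (higher means predicted fitter).
    fitness: true fitness values, in the same order as scores.
    """
    n = len(scores)
    k_rank = max(1, round(top_ranked * n))
    k_fit = max(1, round(top_fit * n))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k_rank]
    fittest = set(sorted(range(n), key=lambda i: fitness[i], reverse=True)[:k_fit])
    misses = sum(1 for i in ranked if i not in fittest)
    return misses / k_rank
```

With any scoring method in hand, `false_discovery_rate(scores, wright_fisher(...))` gives a number comparable in spirit to the 0.1-0.3 range quoted in the abstract, though the values will depend on the scoring method and parameters chosen.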

The impact of deleterious passenger mutations on cancer progression.

(arXiv:1208.6068v1 [q-bio.PE])
by Christopher D McFarland, Gregory V Kryukov, Shamil Sunyaev, Leonid Mirny

Cancer progression is driven by a small number of genetic alterations accumulating in a neoplasm. These few driver alterations reside in a cancer genome alongside tens of thousands of other mutations that are widely believed to have no role in cancer and are termed passengers. Many passengers, however, fall within protein-coding genes and other functional elements and can possibly have deleterious effects on cancer cells. Here we investigate the potential of mildly deleterious passengers to accumulate and alter the course of neoplastic progression. Our approach combines evolutionary simulations of cancer progression with the analysis of cancer sequencing data. In our simulations, individual cells stochastically divide, acquire advantageous driver and deleterious passenger mutations, or die. Surprisingly, despite selection against them, passengers accumulate and largely evade selection during progression. Although individually weak, the collective burden of passengers alters the course of progression, leading to several phenomena observed in oncology that cannot be explained by a traditional driver-centric view. We tested predictions of the model using cancer genomic data. We find that many passenger mutations are likely to be damaging and that, in agreement with the model, they have largely evaded purifying selection. Finally, we used our model to explore cancer treatments that exploit the load of passengers by either 1) increasing the mutation rate; or 2) exacerbating their deleterious effects. While both approaches lead to cancer regression, the latter leads to less frequent relapse. Our results suggest a new framework for understanding cancer progression as a balance of driver and passenger mutations.
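The kind of simulation described, in which cells divide, mutate, or die, can be sketched as a discrete-time birth-death process. This is a toy version under stated assumptions, not the authors' model. The multiplicative fitness form, the density-dependent death probability, and every parameter name and default value (`capacity`, `s_driver`, `s_passenger`, `mu_driver`, `mu_passenger`) are hypothetical choices made for illustration.

```python
import random


def simulate_tumor(steps, n0=100, capacity=200, s_driver=0.1,
                   s_passenger=0.01, mu_driver=1e-3, mu_passenger=1e-1,
                   seed=0):
    """Toy birth-death simulation with driver and passenger mutations.

    Each cell is a (drivers, passengers) pair. Per step a cell dies with
    a probability that grows with population size; survivors divide with
    a probability that grows with drivers and shrinks with passengers.
    Returns the final list of cells and the population size after each step.
    """
    rng = random.Random(seed)
    cells = [(0, 0)] * n0
    history = []
    for _ in range(steps):
        n = len(cells)
        if n == 0:
            break  # population went extinct
        p_death = min(1.0, 0.5 * n / capacity)
        next_cells = []
        for drivers, passengers in cells:
            if rng.random() < p_death:
                continue  # cell dies
            next_cells.append((drivers, passengers))
            fitness = (1 + s_driver) ** drivers / (1 + s_passenger) ** passengers
            if rng.random() < min(1.0, 0.5 * fitness):
                # The daughter cell may pick up one new mutation of each type.
                new_d = drivers + int(rng.random() < mu_driver)
                new_p = passengers + int(rng.random() < mu_passenger)
                next_cells.append((new_d, new_p))
        cells = next_cells
        history.append(len(cells))
    return cells, history


if __name__ == "__main__":
    final_cells, sizes = simulate_tumor(steps=500)
    if final_cells:
        mean_p = sum(p for _, p in final_cells) / len(final_cells)
        print("final size:", len(final_cells), "mean passengers:", mean_p)
```

Raising `mu_passenger` or `s_passenger` in this toy corresponds loosely to the two treatment strategies listed in the abstract, although any quantitative comparison would need the actual model.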