[This author post is by Richard Neher on his paper with Oskar Hallatschek: Genealogies of rapidly adapting populations arXived here.]
That selection distorts genealogies is a well-known fact, but properties of genealogies shaped by selection are poorly understood. We set out to investigate genealogies in a simple model of rapid adaptation in asexuals: the fitness of individuals is changed by small amounts through frequent mutation, while the overall population size is kept constant by a carrying capacity. We simulated the model and tracked genealogies.
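For readers who want to play with this kind of model, here is a minimal sketch in Python. It is not the simulation code used in the paper: it assumes discrete generations, an exactly fixed population size (a Wright-Fisher-style stand-in for the carrying capacity), and a small Gaussian change in log-fitness for every offspring. The function names and the parameters `pop_size`, `generations`, and `mut_sd` are hypothetical, chosen for illustration only.

```python
import math
import random


def simulate_genealogy(pop_size=200, generations=100, mut_sd=0.02, seed=1):
    """Simulate an asexual population and record who descends from whom.

    Returns a list `parents` where parents[t][i] is the index (in generation t)
    of the parent of individual i in generation t + 1.
    Assumption: log-fitness changes by a Gaussian step of width mut_sd per birth.
    """
    rng = random.Random(seed)
    fitness = [0.0] * pop_size  # log-fitness of each individual
    parents = []
    for _ in range(generations):
        mean = sum(fitness) / pop_size
        # Parents are drawn with probability proportional to exp(relative fitness).
        weights = [math.exp(x - mean) for x in fitness]
        chosen = rng.choices(range(pop_size), weights=weights, k=pop_size)
        # Offspring inherit the parental fitness plus a small mutational change.
        fitness = [fitness[p] + rng.gauss(0.0, mut_sd) for p in chosen]
        parents.append(chosen)
    return parents


def trace_lineages(parents, sample):
    """Follow sampled individuals of the final generation back in time.

    Returns the number of distinct ancestral lineages after 0, 1, 2, ...
    generations back. A drop by more than one within a single generation
    means that several lineages merged (almost) simultaneously.
    """
    lineages = set(sample)
    counts = [len(lineages)]
    for chosen in reversed(parents):
        lineages = {chosen[i] for i in lineages}
        counts.append(len(lineages))
    return counts
```

Calling `trace_lineages(simulate_genealogy(), range(20))` gives the lineage count of a sample of 20 individuals going back in time; setting `mut_sd=0` removes fitness differences and gives a neutral comparison.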
The genealogies we found have two striking features incompatible with the standard neutral coalescent: (i) Many lineages merge almost simultaneously. (ii) Forward in time, the trees often branch very asymmetrically, i.e., almost the entire population descends from one branch while the other branches share the remaining minority. Using branching process approximations and a mapping to range expansion problems (see Brunet et al. (2007)), we show that the genealogies are similar to those expected from the Bolthausen-Sznitman coalescent (BSC), a special case of multiple merger coalescents. Very similar conclusions have been reached in another recent preprint by Desai, Walczak and Fisher. The BSC is well studied and we can build on many results from the mathematical literature.
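To get a feeling for how different the BSC is from Kingman's coalescent, it helps to look at its merger rates. The sketch below uses the standard definition from the mathematical literature (the BSC is the Lambda-coalescent with a uniform measure on [0, 1]), under which any particular set of k out of b lineages merges at rate (k-2)!(b-k)!/(b-1)!, in the natural time units of that definition. The function names are hypothetical and the code is only an illustration, not something taken from the paper.

```python
from fractions import Fraction
from math import comb, factorial


def bsc_rate(b, k):
    """Rate at which one particular set of k out of b lineages merges in the BSC."""
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    return Fraction(factorial(k - 2) * factorial(b - k), factorial(b - 1))


def merger_size_distribution(b):
    """Probability that the next merger among b lineages involves exactly k of them."""
    rates = {k: comb(b, k) * bsc_rate(b, k) for k in range(2, b + 1)}
    total = sum(rates.values())  # total merger rate, equal to b - 1
    return {k: rate / total for k, rate in rates.items()}
```

With 10 lineages, `merger_size_distribution(10)[2]` is 5/9, so a bit less than half of all merger events join three or more lineages at once. In Kingman's coalescent every merger is pairwise.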
The difference between Kingman and multiple merger coalescence is closely related to the distinct stochastic properties of genetic drift and draft. While drift describes short-term fluctuations in offspring number, which are bounded, draft refers to stochasticity through linked selection. Draft can result in fluctuations of the same order as the population size. Even if very rare, such large fluctuations are important. Lumping drift and draft together and labeling the result "effective population size" is rarely helpful and often confusing.
Why should we care? We often want to learn about past dynamics from snapshots of populations (sequence samples). To this end, we compare the diversity patterns in the sample to model predictions and infer model parameters. If we use an inappropriate model, we get meaningless answers. Furthermore, some events that are very unlikely under Kingman’s coalescent are quite common when multiple mergers are allowed. Consider for example a lone haplotype in a large sample that connects to the root of the tree. This is very unlikely in neutral coalescent models and one might take it as evidence for immigration from a diverged population. If multiple mergers dominate coalescence, this does not come as a surprise. Similarly, an excess of singletons is not necessarily evidence for expanding populations or deleterious mutations but might be due to draft. I wonder whether more potential pitfalls of this sort exist.
Richard Neher