Complex patterns of local adaptation in teosinte

Tanja Pyhäjärvi, Matthew B. Hufford, Sofiane Mezmouk, Jeffrey Ross-Ibarra
(Submitted on 3 Aug 2012)

Populations of widely distributed species often encounter and adapt to specific environmental conditions. However, comprehensive characterization of the genetic basis of adaptation is demanding, requiring genome-wide genotype data, multiple sampled populations, and a good understanding of population structure. We have used environmental and high-density genotype data to describe the genetic basis of local adaptation in 21 populations of teosinte, the wild ancestor of maize. We found that altitude, dispersal events and admixture among subspecies formed a complex hierarchical genetic structure within teosinte. Patterns of linkage disequilibrium revealed four mega-base scale inversions that segregated among populations and had altitudinal clines. Based on patterns of differentiation and correlation with environmental variation, inversions and nongenic regions play an important role in local adaptation of teosinte. Further, we note that strongly differentiated individual populations can bias the identification of adaptive loci. The role of inversions in local adaptation has been predicted by theory and requires attention as genome-wide data become available for additional plant species. These results also suggest a potentially important role for noncoding variation, especially in large plant genomes in which the gene space represents a fraction of the entire genome.

Finite populations with frequency-dependent selection: a genealogical approach

Peter Pfaffelhuber, Benedikt Vogt
(Submitted on 28 Jul 2012)

Evolutionary models for populations of constant size are frequently studied using the Moran model, the Wright-Fisher model, or their diffusion limits. When evolution is neutral, a random genealogy given through Kingman’s coalescent is used in order to understand basic properties of such models. Here, we address the use of a genealogical perspective for models with weak frequency-dependent selection, i.e. α := Ns is small, where s is the fitness advantage of a fit individual and N is the population size. When computing fixation probabilities, this leads either to the approach proposed by Rousset (2003), who argues how to use Kingman’s coalescent for weak selection, or to extensions of the ancestral selection graph of Neuhauser and Krone (1997) and Neuhauser (1999). As an application, we re-derive the one-third law of evolutionary game theory (Nowak et al., 2004). In addition, we provide the approximate distribution of the genealogical distance of two randomly sampled individuals under linear frequency-dependence.
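The one-third law mentioned in the abstract can be checked numerically. The sketch below does not follow the genealogical derivation of the paper; it assumes the standard frequency-dependent Moran process with a 2x2 payoff matrix (a, b, c, d) and selection intensity w, and evaluates the exact fixation probability of a single mutant. The function names and the example payoffs are illustrative choices, not taken from the paper.

```python
def unstable_equilibrium(a, b, c, d):
    """Frequency x* of strategy A at the unstable mixed equilibrium.

    Payoffs: A vs A -> a, A vs B -> b, B vs A -> c, B vs B -> d.
    Meaningful for coordination games (a > c and d > b).
    """
    return (d - b) / (a - b - c + d)


def fixation_probability(a, b, c, d, N, w):
    """Exact fixation probability of one A mutant among N - 1 B residents.

    Assumes a Moran process in which fitness is 1 - w + w * payoff,
    with payoffs averaged over the N - 1 other individuals.
    """
    total = 1.0    # running value of 1 + sum_k prod_{i<=k} g_i / f_i
    product = 1.0  # running product of g_i / f_i
    for i in range(1, N):
        payoff_a = (a * (i - 1) + b * (N - i)) / (N - 1)
        payoff_b = (c * i + d * (N - i - 1)) / (N - 1)
        f_i = 1 - w + w * payoff_a  # fitness of A when there are i copies of A
        g_i = 1 - w + w * payoff_b  # fitness of B when there are i copies of A
        product *= g_i / f_i
        total += product
    return 1.0 / total


if __name__ == "__main__":
    N, w = 100, 1e-5  # weak selection: N * w is much smaller than 1
    for payoffs in [(4, 1, 1, 2), (3, 1, 1, 3)]:
        x_star = unstable_equilibrium(*payoffs)
        rho = fixation_probability(*payoffs, N, w)
        print(payoffs, "x* =", x_star, "favoured:", rho > 1 / N)
```

Under weak selection in a large population, the mutant fixes with probability above the neutral value 1/N when x* is below 1/3, which is the case for the first payoff set (x* = 1/4) and not for the second (x* = 1/2).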

Our paper: Genealogies of rapidly adapting populations

[This author post is by Richard Neher on his paper with Oskar Hallatschek: Genealogies of rapidly adapting populations, arXived here.]


That selection distorts genealogies is a well-known fact, but properties of genealogies shaped by selection are poorly understood. We set out to investigate genealogies in a simple model of rapid adaptation in asexuals: The fitness of individuals is changed by small amounts through frequent mutations, while the overall population size is kept constant by a carrying capacity. We simulated the model and tracked genealogies.

The genealogies we found have two striking features incompatible with the standard neutral coalescent: (i) Many lineages merge almost simultaneously. (ii) Forward in time, the trees often branch very asymmetrically, i.e., almost the entire population descends from one branch while the other branches share the remaining minority. Using branching process approximations and a mapping to range expansion problems (see Brunet et al. (2007)), we show that the genealogies are similar to those expected from the Bolthausen-Sznitman coalescent (BSC), a special case of multiple merger coalescents. Very similar conclusions have been reached in another recent preprint by Desai, Walczak and Fisher. The BSC is well studied and we can build on many results from the mathematical literature.
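To make the contrast with Kingman's coalescent concrete, here is a minimal sketch of the Bolthausen-Sznitman coalescent itself (not of the adaptation model we simulated). It relies on the known BSC rates: any particular set of k out of b lineages merges at rate (k-2)!(b-k)!/(b-1)!, so mergers of size k occur at total rate b/(k(k-1)). The function names are made up for this illustration, and time is measured in the coalescent's own units.

```python
import random


def bsc_merger_rates(b):
    """Total rate at which exactly k of b lineages merge, for k = 2..b.

    Each specific k-subset merges at rate (k-2)!(b-k)!/(b-1)!; multiplying
    by the number of such subsets, C(b, k), gives b / (k * (k - 1)).
    """
    return {k: b / (k * (k - 1)) for k in range(2, b + 1)}


def simulate_lineage_counts(n, rng):
    """Follow the number of ancestral lineages of a sample of size n.

    Returns a list of (time, lineages) pairs that ends with one lineage.
    """
    t, b = 0.0, n
    history = [(t, b)]
    while b > 1:
        rates = bsc_merger_rates(b)
        total = sum(rates.values())  # equals b - 1
        t += rng.expovariate(total)  # waiting time until the next merger
        k = rng.choices(list(rates), weights=list(rates.values()))[0]
        b -= k - 1                   # k lineages are replaced by one ancestor
        history.append((t, b))
    return history


if __name__ == "__main__":
    for time, lineages in simulate_lineage_counts(50, random.Random(1)):
        print(f"{time:.3f}  {lineages}")
```

Under Kingman's coalescent every event reduces the lineage count by exactly one; in the output of this sketch the count occasionally drops by many lineages at once, which is feature (i) above.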

The difference between Kingman and multiple merger coalescence is closely related to the distinct stochastic properties of genetic drift and draft. While drift describes short-term fluctuations in offspring number, which are bounded, draft refers to stochasticity through linked selection. Draft can result in fluctuations of the same order as the population size. Even if very rare, such large fluctuations are important. Lumping drift and draft together and labeling the result as effective population size is rarely helpful and often confusing.

Why should we care? We often want to learn about past dynamics from snapshots of populations (sequence samples). To this end, we compare the diversity patterns in the sample to model predictions and infer model parameters. If we use an inappropriate model, we get meaningless answers. Furthermore, some events that are very unlikely under Kingman’s coalescent are quite common when multiple mergers are allowed. Consider for example a lone haplotype in a large sample that connects to the root of the tree. This is very unlikely in neutral coalescent models and one might take it as evidence for immigration from a diverged population. If multiple mergers dominate coalescence, this does not come as a surprise. Similarly, an excess of singletons is not necessarily evidence for expanding populations or deleterious mutations but might be due to draft. I wonder whether more potential pitfalls of this sort exist.

Richard Neher

The date of interbreeding between Neandertals and modern humans

Sriram Sankararaman, Nick Patterson, Heng Li, Svante Pääbo, David Reich
(Submitted on 10 Aug 2012)

Comparisons of DNA sequences between Neandertals and present-day humans have shown that Neandertals share more genetic variants with non-Africans than with Africans. This could be due to interbreeding between Neandertals and modern humans when the two groups met subsequent to the emergence of modern humans outside Africa. However, it could also be due to population structure that antedates the origin of Neandertal ancestors in Africa. We measure the extent of linkage disequilibrium (LD) in the genomes of present-day Europeans and find that the last gene flow from Neandertals (or their relatives) into Europeans likely occurred 37,000-86,000 years before the present (BP), and most likely 47,000-65,000 years ago. This supports the recent interbreeding hypothesis, and suggests that interbreeding may have occurred when modern humans carrying Upper Paleolithic technologies encountered Neandertals as they expanded out of Africa.

Genealogies of rapidly adapting populations

Richard A. Neher, Oskar Hallatschek
(Submitted on 15 Aug 2012)

The genetic diversity of a species is shaped by its recent evolutionary history and can be used to infer demographic events or selective sweeps. Most inference methods are based on the null hypothesis that natural selection is a weak evolutionary force. However, many species, particularly pathogens, are under continuous pressure to adapt in response to changing environments. A statistical framework for inference from diversity data of such populations is currently lacking. Toward this goal, we explore the properties of genealogies that emerge from models of continual adaptation. We show that lineages trace back to a small pool of highly fit ancestors, in which simultaneous coalescence of more than two lineages frequently occurs. While such multiple mergers are unlikely under the neutral coalescent, they create a unique genetic footprint in adapting populations. The site frequency spectrum of derived neutral alleles, for example, is non-monotonic and has a peak at high frequencies, whereas Tajima’s D becomes more and more negative with increasing sample size. Since multiple merger coalescents emerge in various evolutionary scenarios characterized by sustained selection pressures, we argue that they should be considered as null-models for adapting populations.
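For readers who want to see how the site frequency spectrum feeds into Tajima's D, the following is a minimal sketch using the standard formulas of Tajima (1989). It is not code from the paper; the function name and the toy spectra are illustrative assumptions.

```python
from math import sqrt


def tajimas_d(sfs):
    """Tajima's D from an unfolded site frequency spectrum.

    sfs[i - 1] is the number of segregating sites at which the derived
    allele is carried by i of the n sampled sequences, for i = 1..n-1.
    """
    n = len(sfs) + 1
    S = sum(sfs)
    if S == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")

    # Mean number of pairwise differences implied by the spectrum.
    pairs = n * (n - 1) / 2
    pi = sum(x * i * (n - i) for i, x in enumerate(sfs, start=1)) / pairs

    # Normalising constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


if __name__ == "__main__":
    neutral = [12 / i for i in range(1, 10)]  # expected neutral shape, n = 10
    singletons = [10] + [0] * 8               # excess of rare variants
    print(tajimas_d(neutral), tajimas_d(singletons))
```

A spectrum proportional to 1/i, the neutral expectation, gives D close to zero, while an excess of singletons pushes D negative.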

Genetic Diversity and the Structure of Genealogies in Rapidly Adapting Populations.

Michael M. Desai, Aleksandra M. Walczak, Daniel S. Fisher
(Submitted on 16 Aug 2012)

Positive selection distorts the structure of genealogies and hence alters patterns of genetic variation within a population. Most analyses of these distortions focus on the signatures of hitchhiking due to hard or soft selective sweeps at a single genetic locus. However, in linked regions of rapidly adapting genomes, multiple beneficial mutations at different loci can segregate simultaneously within the population, an effect known as clonal interference. This leads to a subtle interplay between hitchhiking and interference effects, which leads to a unique signature of rapid adaptation on genetic variation both at the selected sites and at linked neutral loci. Here, we introduce an effective coalescent theory (a “fitness-class coalescent”) that describes how positive selection at many perfectly linked sites alters the structure of genealogies. We use this theory to calculate several simple statistics describing genetic variation within a rapidly adapting population, and to implement efficient backwards-time coalescent simulations which can be used to predict how clonal interference alters the expected patterns of molecular evolution.

How to infer relative fitness from a sample of genomic sequences

Adel Dayarian, Boris I Shraiman
(Submitted on 29 Aug 2012)

Mounting evidence suggests that natural populations can harbor extensive fitness diversity with numerous genomic loci under selection. It is also known that genealogical trees for populations under selection are quantifiably different from those expected under neutral evolution and described statistically by Kingman’s coalescent. While differences in the statistical structure of genealogies have long been used as a test for the presence of selection, the full extent of the information that they contain has not been exploited. Here we shall demonstrate that the shape of the reconstructed genealogical tree for a moderately large number of random genomic samples taken from a fitness-diverse but otherwise unstructured asexual population can be used to predict the relative fitness of individuals within the sample. To achieve this we define a heuristic algorithm, which we test in silico using simulations of a Wright-Fisher model for a realistic range of mutation rates and selection strength. Our inferred fitness ranking is based on a linear discriminator which identifies rapidly coalescing lineages in the reconstructed tree. Inferred fitness ranking correlates strongly with the actual fitness, with the top 10% ranked being in the top 20% fittest with a false discovery rate of 0.1-0.3 depending on the mutation/selection parameters. The ranking also makes it possible to predict the common genotype of the future population. While the inference accuracy increases monotonically with sample size, sample sizes of 200 nearly saturate the performance. We propose that our approach can be used for inferring relative fitness of genomes obtained in single-cell sequencing of tumors and in monitoring viral outbreaks.

The impact of deleterious passenger mutations on cancer progression.

(arXiv:1208.6068v1 [q-bio.PE])
by Christopher D McFarland, Gregory V Kryukov, Shamil Sunyaev, Leonid Mirny

Cancer progression is driven by a small number of genetic alterations accumulating in a neoplasm. These few driver alterations reside in a cancer genome alongside tens of thousands of other mutations that are widely believed to have no role in cancer and are termed passengers. Many passengers, however, fall within protein coding genes and other functional elements and can possibly have deleterious effects on cancer cells. Here we investigate the potential of mildly deleterious passengers to accumulate and alter the course of neoplastic progression. Our approach combines evolutionary simulations of cancer progression with the analysis of cancer sequencing data. In our simulations, individual cells stochastically divide, acquire advantageous driver and deleterious passenger mutations, or die. Surprisingly, despite selection against them, passengers accumulate and largely evade selection during progression. Although individually weak, the collective burden of passengers alters the course of progression, leading to several phenomena observed in oncology that cannot be explained by a traditional driver-centric view. We tested predictions of the model using cancer genomic data. We find that many passenger mutations are likely to be damaging and that, in agreement with the model, they have largely evaded purifying selection. Finally, we used our model to explore cancer treatments that exploit the load of passengers by either 1) increasing the mutation rate; or 2) exacerbating their deleterious effects. While both approaches lead to cancer regression, the latter leads to less frequent relapse. Our results suggest a new framework for understanding cancer progression as a balance of driver and passenger mutations.

The genetic prehistory of southern Africa

Joseph K. Pickrell, Nick Patterson, Chiara Barbieri, Falko Berthold, Linda Gerlach, Mark Lipson, Po-Ru Loh, Tom Güldemann, Blesswell Kure, Sununguko Wata Mpoloka, Hirosi Nakagawa, Christfried Naumann, Joanna L. Mountain, Carlos D. Bustamante, Bonnie Berger, Brenna M. Henn, Mark Stoneking, David Reich, Brigitte Pakendorf
(Submitted on 23 Jul 2012)

The hunter-gatherer populations of southern and eastern Africa are known to harbor some of the most ancient human lineages, but their historical relationships are poorly understood. We report data from 22 populations analyzed at over half a million single nucleotide polymorphisms (SNPs), using a genome-wide array designed for studies of history. The southern Africans (here called Khoisan) fall into two groups, loosely corresponding to the northwestern and southeastern Kalahari, which we show separated within the last 30,000 years. All individuals derive at least a few percent of their genomes from admixture with non-Khoisan populations that began 1,200 years ago. In addition, the Hadza, an east African hunter-gatherer population that speaks a language with click consonants, derive about a quarter of their ancestry from admixture with a population related to the Khoisan, implying an ancient genetic link between southern and eastern Africa.

The geography of recent genetic ancestry across Europe

Peter Ralph, Graham Coop
(Submitted on 16 Jul 2012 (v1), last revised 19 Jul 2012 (this version, v2))

The recent genealogical history of human populations is a complex mosaic formed by individual migration, large-scale population movements, and other demographic events. Population genomics datasets can provide a window into this recent history, as rare traces of recent shared genetic ancestry are detectable due to long segments of shared genomic material. We make use of genomic data for 2,257 Europeans (the POPRES dataset) to conduct one of the first surveys of recent genealogical ancestry over the past three thousand years at a continental scale. We detected 1.9 million shared genomic segments, and used the lengths of these to infer the distribution of shared ancestors across time and geography. We find that a pair of modern Europeans living in neighboring populations share around 10-50 genetic common ancestors from the last 1500 years, and upwards of 500 genetic ancestors from the previous 1000 years. These numbers drop off exponentially with geographic distance, but since genetic ancestry is rare, individuals from opposite ends of Europe are still expected to share millions of common genealogical ancestors over the last 1000 years. There is substantial regional variation in the number of shared genetic ancestors: especially high numbers of common ancestors between many eastern populations likely date to the Slavic and/or Hunnic expansions, while much lower levels of common ancestry in the Italian and Iberian peninsulas may indicate weaker demographic effects of Germanic expansions into these areas and/or more stably structured populations. Recent shared ancestry in modern Europeans is ubiquitous, and clearly shows the impact of both small-scale migration and large historical events. Population genomic datasets have considerable power to uncover recent demographic history, and will allow a much fuller picture of the close genealogical kinship of individuals across the world.