The deleterious mutation load is insensitive to recent population history

Yuval B. Simons, Michael C. Turchin, Jonathan K. Pritchard, Guy Sella
(Submitted on 9 May 2013)

Human populations have undergone dramatic changes in population size in the past 100,000 years, including a severe bottleneck of non-African populations and recent explosive population growth. There is currently great interest in how these demographic events may have affected the burden of deleterious mutations in individuals and the allele frequency spectrum of disease mutations in populations. Here we use population genetic models to show that, contrary to previous conjectures, recent human demography has likely had very little impact on the average burden of deleterious mutations carried by individuals. This prediction is supported by exome sequence data showing that African American and European American individuals carry very similar burdens of damaging mutations. We next consider whether recent population growth has increased the importance of very rare mutations in complex traits. Our analysis predicts that for most classes of disease variants, rare alleles are unlikely to contribute a large fraction of the total genetic variance, and that the impact of recent growth is likely to be modest. However, for diseases that have a direct impact on fitness, strongly deleterious rare mutations likely do play important roles, and the impact of very rare mutations will be far greater as a result of recent growth. In summary, demographic history has dramatically impacted patterns of variation in different human populations, but these changes have likely had little impact on either genetic load or on the importance of rare variants for most complex traits.

Statistical Physics of Evolutionary Trajectories on Fitness Landscapes

Michael Manhart, Alexandre V. Morozov
(Submitted on 6 May 2013)

Random walks on multidimensional nonlinear landscapes are of interest in many areas of science and engineering. In particular, properties of adaptive trajectories on fitness landscapes determine population fates and thus play a central role in evolutionary theory. The topography of fitness landscapes and its effect on evolutionary dynamics have been extensively studied in the literature. We will survey the current research knowledge in this field, focusing on a recently developed systematic approach to characterizing path lengths, mean first-passage times, and other statistics of the path ensemble. This approach, based on general techniques from statistical physics, is applicable to landscapes of arbitrary complexity and structure. It is especially well-suited to quantifying the diversity of stochastic trajectories and repeatability of evolutionary events. We demonstrate this methodology using a biophysical model of protein evolution that describes how proteins maintain stability while evolving new functions.

Critical case stochastic phylogenetic tree model via the Laplace transform

Krzysztof Bartoszek, Michal Krzeminski
(Submitted on 30 Apr 2013)

Birth-and-death models are now a common mathematical tool to describe branching patterns observed in real-world phylogenetic trees. Liggett and Schinazi (2009) is one such example. The authors propose a simple birth-and-death model that is compatible with phylogenetic trees of both influenza and HIV, depending on the birth rate parameter. An interesting special case of this model is the critical case where the birth rate equals the death rate. This is a non-trivial situation and to study its asymptotic behaviour we employed the Laplace transform. With this we correct the proof of Liggett and Schinazi (2009) in the critical case.

The Expected Linkage Disequilibrium in Finite Populations Revisited

Ulrike Ober, Alexander Malinowski, Martin Schlather, Henner Simianer
(Submitted on 17 Apr 2013)

The expected level of linkage disequilibrium (LD) in a finite ideal population at equilibrium is of relevance for many applications in population and quantitative genetics. Several recursion formulae have been proposed during the last decades, whose derivations mostly contain heuristic parts and therefore remain mathematically questionable. We propose a more justifiable approach, including an alternative recursion formula for the expected LD. Since the exact formula depends on the distribution of allele frequencies in a very complicated manner, we suggest an approximate solution and analyze its validity extensively in a simulation study. Compared to the widely used formula of Sved, the proposed formula performs better for all parameter constellations considered. We then analyze the expected LD at equilibrium using the theory on discrete-time Markov chains based on the linear recursion formula, with equilibrium being defined as the steady-state of the chain, which finally leads to a formula for the effective population size N_e. An additional analysis considers the effect of non-exactness of a recursion formula on the steady-state, demonstrating that the resulting error in expected LD can be substantial. In an application to the HapMap data of two human populations we illustrate the dependency of the N_e-estimate on the distribution of minor allele frequencies (MAFs), showing that the estimated N_e can vary by up to 30% when a uniform instead of a skewed distribution of MAFs is taken as a basis to select SNPs for the analyses. Our analyses provide new insights into the mathematical complexity of the problem studied.
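For orientation, the formula of Sved referred to above is commonly written as E[r^2] ≈ 1 / (1 + 4 N_e c), where c is the recombination rate between the two loci. The sketch below evaluates that approximation and inverts it to obtain N_e from an observed mean r^2. It does not implement the alternative recursion proposed in the paper, and the function names are illustrative assumptions.

```python
def sved_expected_r2(n_e: float, c: float) -> float:
    """Sved's approximation for expected LD: E[r^2] ~ 1 / (1 + 4 * N_e * c)."""
    if n_e <= 0 or c < 0:
        raise ValueError("n_e must be positive and c non-negative")
    return 1.0 / (1.0 + 4.0 * n_e * c)


def n_e_from_r2(r2: float, c: float) -> float:
    """Invert Sved's approximation: N_e ~ (1 / r^2 - 1) / (4 * c)."""
    if not 0.0 < r2 <= 1.0 or c <= 0:
        raise ValueError("r2 must lie in (0, 1] and c must be positive")
    return (1.0 / r2 - 1.0) / (4.0 * c)


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the functions.
    r2 = sved_expected_r2(n_e=1000, c=0.001)
    print(r2, n_e_from_r2(r2, c=0.001))
```

Because the inversion is only as good as the approximation it inverts, any N_e obtained this way inherits the error of the underlying formula, which is the issue the abstract raises.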

Identifiability of a Coalescent-based Population Tree Model

Arindam RoyChoudhury
(Submitted on 12 Apr 2013)

Identifiability of evolutionary tree models has been a recent topic of discussion and some models have been shown to be non-identifiable. A coalescent-based rooted population tree model, originally proposed by Nielsen et al. 1998 [2], has been used by many authors in the last few years and is a simple tool to accurately model the changes in allele frequencies in the tree. However, the identifiability of this model has never been proven. Here we prove this model to be identifiable by showing that the model parameters can be expressed as functions of the probability distributions of subsamples. This is a step toward proving the consistency of the maximum likelihood estimator of the population tree based on this model.

The Maintenance of Sex: Ronald Fisher meets the Red Queen

David Green, Chris Mason
(Submitted on 10 Apr 2013)

Sex in higher diploids carries a two-fold cost of males that should reduce its fitness relative to cloning and result in extinction. Instead, sex is widespread and it is clonal species that face early obsolescence. One possible reason is that sex is an adaptation to resist parasites. We use computer simulations of finite populations to model a Red Queen in which a parasitic haploid mounts a negative frequency-dependent attack on a diploid host. Both host and parasite populations generate novel alleles by mutation and have access to large allele spaces. Sex outcompetes cloning by two overlapping mechanisms. First, sexual diploids adopt advantageous homozygous mutations more rapidly than clonal diploids under conditions of lag load. This rate advantage can offset the lesser fecundity of sex. Second, a relative advantage to sex emerges under host mutation rates that are fast enough to retain fitness in a rapidly mutating parasite environment and increase host polymorphism and polyclonality. Polyclonal populations disproportionately experience interference with selection at high mutation rates, both between and within loci, slowing clonal population adaptation to a changing parasite environment and reducing clonal population fitness relative to sex. This effect increases markedly with the number of loci under independent selection. Rates of parasite mutation exist that not only allow sex to survive despite the two-fold cost of males but which enable sexual and clonal populations to have equal fitness and co-exist. Since all higher organisms carry parasitic loads, the model is of general applicability.

Change in Recessive Lethal Alleles Frequency in Inbred Populations

Arindam RoyChoudhury
(Submitted on 10 Apr 2013)

In a population practicing consanguineous marriage, rare recessive lethal alleles (RRLA) have higher chances of affecting phenotypes. As inbreeding causes more homozygosity and subsequently more deaths, the loss of individuals with RRLA decreases the frequency of these alleles. Although this phenomenon is well studied in general, here some hitherto unstudied cases are presented. An analytical formula for the RRLA frequency is presented for an infinite monoecious population practicing several different types of inbreeding. In finite dioecious populations, it is found that more severe inbreeding leads to quicker RRLA losses, making the upcoming generations healthier. A population of size 10,000 practicing 30% half-sib marriages loses more than 95% of its RRLA in 100 generations; a population practicing 30% cousin marriages loses about 75% of its RRLA. Our findings also suggest that given enough resources to grow, a small inbred population will be able to rebound while losing the RRLA.
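The purging effect described above can be illustrated with the standard textbook recursion for a recessive lethal allele in an infinite population with a fixed inbreeding coefficient F. This is a minimal sketch under that assumption, not the analytical formula or the finite-population simulations of the paper; the function names and the use of a single constant F are illustrative choices. With allele frequency q and p = 1 - q, zygotes are homozygous for the lethal allele with frequency q^2 + Fpq and heterozygous with frequency 2pq(1 - F), so after the homozygotes die the new frequency is q' = pq(1 - F) / (1 - q^2 - Fpq).

```python
def next_lethal_freq(q: float, f: float) -> float:
    """One generation of selection against a recessive lethal allele.

    Zygote frequencies under inbreeding coefficient f (p = 1 - q):
      lethal homozygotes: q^2 + f*p*q   (none survive)
      heterozygotes:      2*p*q*(1 - f) (each carries one lethal copy)
    """
    p = 1.0 - q
    survivors = 1.0 - (q * q + f * p * q)
    return p * q * (1.0 - f) / survivors


def lethal_freq_trajectory(q0: float, f: float, generations: int) -> list:
    """Allele frequency at generations 0..generations for a constant f."""
    trajectory = [q0]
    for _ in range(generations):
        trajectory.append(next_lethal_freq(trajectory[-1], f))
    return trajectory


if __name__ == "__main__":
    # Hypothetical starting frequency; f = 0 is random mating.
    for f in (0.0, 0.02, 0.05):
        q_end = lethal_freq_trajectory(0.01, f, 100)[-1]
        print(f"F = {f:.2f}: q after 100 generations = {q_end:.6f}")
```

Under random mating (F = 0) the recursion reduces to q' = q / (1 + q), so a rare allele declines only slowly; any F > 0 exposes more copies to selection each generation and speeds up the loss.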

The causal meaning of Fisher’s average effect

James J. Lee, Carson C. Chow
(Submitted on 6 Apr 2013)

In order to formulate the Fundamental Theorem of Natural Selection, Fisher defined the average excess and average effect of a gene substitution. Finding these notions to be somewhat opaque, some authors have recommended reformulating Fisher’s ideas in terms of covariance and regression, which are classical concepts of statistics. We argue that Fisher intended his two averages to express a distinction between correlation and causation. On this view the average effect is a specific weighted average of the actual phenotypic changes that result from physically changing the allelic states of homologous genes. We show that the statistical and causal conceptions of the average effect, perceived as inconsistent by Falconer, can be reconciled if certain relationships between the genotype frequencies and non-additive residuals are conserved. There are certain theory-internal considerations favoring Fisher’s original formulation in terms of causality; for example, the frequency-weighted mean of the average effects equaling zero at each locus becomes a derivable consequence rather than an arbitrary constraint. More broadly, Fisher’s distinction between correlation and causation is of critical importance to gene-trait mapping studies and the foundations of evolutionary biology.

Minimal clade size in the Bolthausen-Sznitman coalescent

Fabian Freund, Arno Siri-Jégousse
(Submitted on 14 Jan 2013 (v1), last revised 6 Mar 2013 (this version, v2))

This article shows the asymptotics of the distribution and moments of the size $X_n$ of the minimal clade of a randomly chosen individual in a Bolthausen-Sznitman $n$-coalescent for $n\to\infty$. The Bolthausen-Sznitman $n$-coalescent is a Markov process taking states in the set of partitions of $\{1,\ldots,n\}$, where $1,\ldots,n$ are referred to as individuals. The minimal clade of an individual is the equivalence class the individual is in at the time of the first coalescence event this individual participates in. The main tool used is the connection of the Bolthausen-Sznitman $n$-coalescent with random recursive trees introduced by Goldschmidt and Martin. This connection shows that $X_n-1$ is distributed as the number $M_n$ of all individuals not in the equivalence class of individual 1 shortly before the time of the last coalescence event. Both functionals are distributed like the size $RT_{n-1}$ of a uniformly chosen table in a standard Chinese restaurant process with $n-1$ customers. We give exact formulae for these distributions. Using the asymptotics of $M_n$ shown by Goldschmidt and Martin, we see that $(\log n)^{-1}\log X_n$ converges in distribution to the uniform distribution on $[0,1]$ for $n\to\infty$. We provide the complementary information that $\frac{\log n}{n^k}E(X_n^k)\to \frac{1}{k}$ for $n\to\infty$, which is also true for $M_n$ and $RT_n$.
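The table-size functional mentioned above can be sampled directly. The sketch below simulates a standard Chinese restaurant process, in which the customer arriving after k others opens a new table with probability 1/(k+1) and otherwise joins an existing table with probability proportional to its size, and then draws the size of a table chosen uniformly among the occupied tables. It is a Monte Carlo illustration only and does not reproduce the exact formulae of the article; the function names are assumptions made for this example.

```python
import random


def chinese_restaurant_tables(n: int, rng: random.Random) -> list:
    """Seat n customers in a standard Chinese restaurant process.

    Returns the list of table sizes after all n customers are seated.
    """
    sizes = []
    for seated in range(n):
        # u is uniform on {0, ..., seated}; the single value u == seated
        # opens a new table, so that happens with probability 1/(seated+1).
        u = rng.randrange(seated + 1)
        if u == seated:
            sizes.append(1)
            continue
        # Otherwise u falls into one table's block of the cumulative
        # sizes, which selects that table proportionally to its size.
        cumulative = 0
        for i, size in enumerate(sizes):
            cumulative += size
            if u < cumulative:
                sizes[i] += 1
                break
    return sizes


def uniform_table_size(n: int, rng: random.Random) -> int:
    """Size of a table chosen uniformly among the occupied tables."""
    return rng.choice(chinese_restaurant_tables(n, rng))


if __name__ == "__main__":
    rng = random.Random(2013)
    n = 1000  # the abstract's RT_{n-1} would use n - 1 customers
    samples = [uniform_table_size(n - 1, rng) for _ in range(200)]
    print(sum(samples) / len(samples))
```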

The consequences of gene flow for local adaptation and differentiation: A two-locus two-deme model

Ada Akerman, Reinhard Bürger
(Submitted on 6 Mar 2013)

We consider a population subdivided into two demes connected by migration in which selection acts in opposite directions. We explore the effects of recombination and migration on the maintenance of multilocus polymorphism, on local adaptation, and on differentiation by employing a deterministic model with genic selection on two linked diallelic loci (i.e., no dominance or epistasis). For the following cases, we characterize explicitly the possible equilibrium configurations: weak, strong, highly asymmetric, and super-symmetric migration, no or weak recombination, and independent or strongly recombining loci. For independent loci (linkage equilibrium) and for completely linked loci, we derive the possible bifurcation patterns as functions of the total migration rate, assuming all other parameters are fixed but arbitrary. For these and other cases, we determine analytically the maximum migration rate below which a stable fully polymorphic equilibrium exists. In this case, differentiation and local adaptation are maintained. Their degree is quantified by a new multilocus version of $F_{ST}$ and by the migration load, respectively. In addition, we investigate the invasion conditions of locally beneficial mutants and show that linkage to a locus that is already in migration-selection balance facilitates invasion. Hence, loci of much smaller effect can invade than predicted by one-locus theory if linkage is sufficiently tight. We study how this minimum amount of linkage admitting invasion depends on the migration pattern. This suggests the emergence of clusters of locally beneficial mutations, which may form 'genomic islands of divergence'. Finally, the influence of linkage and two-way migration on the effective migration rate at a linked neutral locus is explored. Numerical work complements our analytical results.