The general recombination equation in continuous time and its solution

Ellen Baake, Michael Baake, Majid Salamat
(Submitted on 4 Sep 2014)

The process of recombination in population genetics, in its deterministic limit, leads to a nonlinear ODE in the Banach space of finite measures on a locally compact product space. It has an embedding into a larger family of nonlinear ODEs that permits a systematic analysis with lattice-theoretic methods for general partitions of finite sets. We discuss this type of system, reduce it to an equivalent finite-dimensional nonlinear problem, and solve the latter recursively for generic sets of parameters. We also briefly discuss the singular cases, and how to extend the solution to this situation.

Tracing the genetic origin of Europe’s first farmers reveals insights into their social organization

Anna Szécsényi-Nagy, Guido Brandt, Victoria Keerl, János Jakucs, Wolfgang Haak, Sabine Möller-Rieker, Kitti Köhler, Balázs Mende, Marc Fecher, Krisztián Oross, Tibor Paluch, Anett Osztás, Viktória Kiss, György Pálfi, Erika Molnár, Katalin Sebők, András Czene, Tibor Paluch, Mario Šlaus, Mario Novak, Nives Pećina-Šlaus, Brigitta Ősz, Vanda Voicsek, Krisztina Somogyi, Gábor Tóth, Bernd Kromer, Eszter Bánffy, Kurt Alt

Farming was established in Central Europe by the Linearbandkeramik culture (LBK), a well-investigated archaeological horizon, which emerged in the Carpathian Basin, in today’s Hungary. However, the genetic background of the LBK genesis has not been revealed yet. Here we present 9 Y chromosomal and 84 mitochondrial DNA profiles from Mesolithic, Neolithic Starčevo and LBK sites (7th/6th millennium BC) from the Carpathian Basin and south-eastern Europe. We detect genetic continuity of both maternal and paternal elements during the initial spread of agriculture, and confirm the substantial genetic impact of early farming south-eastern European and Carpathian Basin cultures on Central European populations of the 6th-4th millennium BC. Our comprehensive Y chromosomal and mitochondrial DNA population genetic analyses demonstrate a clear affinity of the early farmers to the modern Near East and Caucasus, tracing the expansion from that region through south-eastern Europe and the Carpathian Basin into Central Europe. Our results also reveal contrasting patterns for male and female genetic diversity in the European Neolithic, suggesting a patrilineal descent system and patrilocal residential rules among the early farmers.

Conservation of expression regulation throughout the animal kingdom

Michael Kuhn, Andreas Beyer
doi: http://dx.doi.org/10.1101/007252

Gene expression programs have been found to be highly conserved between closely related species, especially when comparing the same tissue types between species. Such analysis is, however, much more challenging over larger evolutionary distances when complementary tissues cannot readily be defined. Here, we present the first cross-species mapping of tissue-specific and developmental gene expression patterns across a wide range of animals, including many non-model species. Importantly, our approach does not require the definition of homologous tissues. In our survey of 32 datasets across 23 species, we detected conserved expression programs on all taxonomic levels, both within animals and between the animals and their closest unicellular relatives, the choanoflagellates. We found that the rate of change in tissue expression patterns is a property of gene families. Subsequently, we used the conservation of expression programs as a means to identify neofunctionalization of gene duplication products. We found 1206 duplication events where one of the two genes kept the expression program of the original gene, whereas the other copy adopted a novel expression program. We corroborated such potential neofunctionalizations using independent network information: the duplication product with the more conserved expression pattern shared more interaction partners with the non-duplicated reference gene than the more divergent duplication product. Our findings open new avenues of study for the comparison and transfer of knowledge between different species.

Looking down in the ancestral selection graph: A probabilistic approach to the common ancestor type distribution

Ute Lenz, Sandra Kluth, Ellen Baake, Anton Wakolbinger
(Submitted on 2 Sep 2014)

In a (two-type) Wright-Fisher diffusion with directional selection and two-way mutation, let x denote today’s frequency of the beneficial type, and given x, let h(x) be the probability that, among all individuals of today’s population, the individual whose progeny will eventually take over in the population is of the beneficial type. Fearnhead [Fearnhead, P., 2002. The common ancestor at a nonneutral locus. J. Appl. Probab. 39, 38-54] and Taylor [Taylor, J. E., 2007. The common ancestor process for a Wright-Fisher diffusion. Electron. J. Probab. 12, 808-847] obtained a series representation for h(x). We develop a construction that contains elements of both the ancestral selection graph and the lookdown construction and includes pruning of certain lines upon mutation. Besides interest in its own right, this construction allows a transparent derivation of the series coefficients of h(x) and gives them a probabilistic meaning.

Estimating the temporal and spatial extent of gene flow among sympatric lizard populations (genus Sceloporus) in the southern Mexican highlands

Jared A Grummer, Martha L. Calderón, Adrián Nieto Montes-de Oca, Eric N Smith, Fausto Mendez-de la Cruz, Adam Leache
doi: http://dx.doi.org/10.1101/008623

Interspecific gene flow is pervasive throughout the tree of life. Although detecting gene flow between populations has been facilitated by new analytical approaches, determining the timing and geography of hybridization has remained difficult, particularly for historical gene flow. A geographically explicit phylogenetic approach is needed to determine the ancestral population overlap. In this study, we performed population genetic analyses, species delimitation, simulations, and a recently developed approach of species tree diffusion to infer the phylogeographic history, timing and geographic extent of gene flow in the Sceloporus spinosus group. The two species in this group, S. spinosus and S. horridus, are distributed in eastern and western portions of Mexico, respectively, but populations of these species are sympatric in the southern Mexican highlands. We generated data consisting of three mitochondrial genes and eight nuclear loci for 148 and 68 individuals, respectively. We delimited six lineages in this group, but found strong evidence of mito-nuclear discordance in sympatric populations of S. spinosus and S. horridus owing to mitochondrial introgression. We used coalescent simulations to differentiate ancestral gene flow from secondary contact, but found mixed support for these two models. Bayesian phylogeography indicated more than 60% range overlap between ancestral S. spinosus and S. horridus populations since the time of their divergence. Isolation-migration analyses, however, revealed near-zero levels of gene flow between these ancestral populations. Taken together, the results from simulations and empirical data indicate that, despite a long history of sympatry between these two species, gene flow in this group has occurred only recently.

Continuous and Discontinuous Phase Transitions in Quantitative Genetics: the role of stabilizing selective pressure

Annalisa Fierro, Sergio Cocozza, Antonella Monticelli, Giovanni Scala, Gennaro Miele
(Submitted on 2 Sep 2014)

Using the tools of statistical mechanics, we have analyzed the evolution of a population of N diploid hermaphrodites in a random mating regime. The population evolves under the effect of drift, selective pressure in the form of viability on an additive polygenic trait, and mutation. The analysis allows us to determine a phase diagram in the plane of mutation rate and strength of selection. The involved pattern of phase transitions is characterized by a line of critical points for weak selective pressure (smaller than a threshold), whereas discontinuous phase transitions characterized by metastable hysteresis are observed for strong selective pressure. A finite size scaling analysis suggests the analogy between our system and the mean field Ising model for selective pressure approaching the threshold from weaker values. In this framework, the mutation rate, which allows the system to explore the accessible microscopic states, is the parameter controlling the transition from large heterozygosity (disordered phase) to small heterozygosity (ordered phase).

Rate and cost of adaptation in the Drosophila genome

Stephan Schiffels, Michael Lässig, Ville Mustonen
doi: http://dx.doi.org/10.1101/008680

Recent studies have consistently inferred high rates of adaptive molecular evolution between Drosophila species. At the same time, the Drosophila genome evolves under different rates of recombination, which results in partial genetic linkage between alleles at neighboring genomic loci. Here we analyze how linkage correlations affect adaptive evolution. We develop a new inference method for adaptation that takes into account the effect on an allele at a focal site caused by neighboring deleterious alleles (background selection) and by neighboring adaptive substitutions (hitchhiking). Using complete genome sequence data and fine-scale recombination maps, we infer a highly heterogeneous scenario of adaptation in Drosophila. In high-recombining regions, about 50% of all amino acid substitutions are adaptive, together with about 20% of all substitutions in proximal intergenic regions. In low-recombining regions, only a small fraction of the amino acid substitutions are adaptive, while hitchhiking accounts for the majority of these changes. Hitchhiking of deleterious alleles generates a substantial collateral cost of adaptation, leading to a fitness decline of about 30/2N per gene and per million years in the lowest-recombining regions. Our results show how recombination shapes rate and efficacy of the adaptive dynamics in eukaryotic genomes.

Segregation distorters are not a primary source of Dobzhansky-Muller incompatibilities in house mouse hybrids

Russ Corbett-Detig, Emily Jacobs-Palmer, Daniel Hartl, Hopi Hoekstra
doi: http://dx.doi.org/10.1101/008672

Understanding the molecular basis of species formation is an important goal in evolutionary genetics, and Dobzhansky-Muller incompatibilities are thought to be a common source of postzygotic reproductive isolation between closely related lineages. However, the evolutionary forces that lead to the accumulation of such incompatibilities between diverging taxa are poorly understood. Segregation distorters are an important source of Dobzhansky-Muller incompatibilities between Drosophila species and crop plants, but it remains unclear if the contribution of these selfish genetic elements to reproductive isolation is prevalent in other species. Here, we genotype millions of single nucleotide polymorphisms across the genome from viable sperm of first-generation hybrid male progeny in a cross between Mus musculus castaneus and M. m. domesticus, two subspecies of rodent in the earliest stages of speciation. We then search for a skew in the allele frequencies of the gametes and show that segregation distorters are not measurable contributors to observed infertility in these hybrid males, despite sufficient statistical power to detect even weak segregation distortion with our novel method. Thus, reduced hybrid male fertility in crosses between these nascent species is attributable to other evolutionary forces.

An algorithm for constructing principal geodesics in phylogenetic treespace

Tom M. W. Nye
(Submitted on 2 Sep 2014)

Most phylogenetic analyses result in a sample of trees, but summarizing and visualizing these samples can be challenging. Consensus trees often provide limited information about a sample, and so methods such as consensus networks, clustering and multidimensional scaling have been developed and applied to tree samples. This paper describes a stochastic algorithm for constructing a principal geodesic or line through treespace which is analogous to the first principal component in standard Principal Components Analysis. A principal geodesic summarizes the most variable features of a sample of trees, in terms of both tree topology and branch lengths, and it can be visualized as an animation of smoothly changing trees. The algorithm performs a stochastic search through parameter space for a geodesic which minimises the sum of squared projected distances of the data points. This procedure aims to identify the globally optimal principal geodesic, though convergence to locally optimal geodesics is possible. The methodology is illustrated by constructing principal geodesics for experimental and simulated data sets, demonstrating the insight into samples of trees that can be gained and how the method improves on a previously published approach. A Java package called GeoPhytter for constructing and visualising principal geodesics is freely available from http://www.ncl.ac.uk/~ntmwn/geophytter.

MINI REVIEW: Statistical methods for detecting differentially methylated loci and regions

Mark D Robinson, Abdullah Kahraman, Charity W Law, Helen Lindsay, Malgorzata Nowicka, Lukas M Weber, Xiaobei Zhou
doi: http://dx.doi.org/10.1101/007120

DNA methylation, and specifically the reversible addition of methyl groups at CpG dinucleotides genome-wide, represents an important layer that is associated with the regulation of gene expression. In particular, aberrations in the methylation status have been noted across a diverse set of pathological states, including cancer. With the rapid development and uptake of large scale sequencing of short DNA fragments, there has been an explosion of data analytic methods for processing and discovering changes in DNA methylation across diverse data types. In this mini-review, we aim to condense many of the salient challenges, such as experimental design, statistical methods for differential methylation detection and critical considerations such as cell type composition and the potential confounding that can arise from batch effects, into a compact and accessible format. Our main interests, from a statistical perspective, include the practical use of empirical Bayes or hierarchical models, which have been shown to be immensely powerful and flexible in genomics, and the procedures by which false discoveries are controlled. Of course, there are many critical platform-specific data preprocessing aspects that we do not discuss here. In addition, we do not make formal performance comparisons of the methods, but rather describe the commonly used statistical models and many of the pertinent issues; we make some recommendations for further study.