Lateral Gene Transfer from the Dead

Szöllősi GJ, Eric Tannier, Nicolas Lartillot, Vincent Daubin
(Submitted on 19 Nov 2012)

In phylogenetic studies, the evolution of molecular sequences is assumed to have taken place along the phylogeny traced by the ancestors of extant species. In the presence of lateral gene transfer (LGT), however, this may not be the case, because the species lineage from which a gene was transferred may have gone extinct or not have been sampled. Because it is not feasible to specify or reconstruct the complete phylogeny of all species, we must describe the evolution of genes outside the represented phylogeny by modelling the speciation dynamics that gave rise to the complete phylogeny. We demonstrate that if the number of sampled species is small compared to the total number of existing species, the overwhelming majority of gene transfers involve speciation to, and evolution along, extinct or unsampled lineages. We show that the evolution of genes along extinct or unsampled lineages can, to a good approximation, be treated as that of independently evolving lineages described by a few global parameters. Using this result, we derive an algorithm to calculate the probability of a gene tree and recover the maximum likelihood reconciliation given the phylogeny of the sampled species. Examining 473 near-universal gene families from 36 cyanobacteria, we find that nearly a third of transfer events (28%) appear to have topological signatures of evolution along extinct species, but only approximately 6% of transfers trace their ancestry to before the common ancestor of the sampled cyanobacteria.

Journal policy change: Genome Research will consider preprints

We have been alerted to a change in the Genome Research author guidelines, which now read:

The journal only accepts papers that present original research that has not been published previously. Conference presentations or posting unrefereed manuscripts on not-for-profit community preprint servers will not be considered prior publication. Authors are responsible for updating the archived preprint with the journal reference (including DOI), and a link to the published article on the Genome Research website upon publication. [emphasis added]

We look forward to discussing preprints targeted for Genome Research at Haldane’s Sieve.

The genetic architecture of adaptations to high altitude in Ethiopia

Gorka Alkorta-Aranburu, Cynthia M. Beall, David B. Witonsky, Amha Gebremedhin, Jonathan K. Pritchard, Anna Di Rienzo
(Submitted on 13 Nov 2012)

Although hypoxia is a major stress on physiological processes, several human populations have survived for millennia at high altitudes, suggesting that they have adapted to hypoxic conditions. This hypothesis was recently corroborated by studies of Tibetan highlanders, which showed that polymorphisms in candidate genes show signatures of natural selection as well as well-replicated association signals for variation in hemoglobin levels. We extended genomic analysis to two Ethiopian ethnic groups: Amhara and Oromo. For each ethnic group, we sampled low and high altitude residents, thus allowing genetic and phenotypic comparisons across altitudes and across ethnic groups. Genome-wide SNP genotype data were collected in these samples by using Illumina arrays. We find that variants associated with hemoglobin variation among Tibetans or other variants at the same loci do not influence the trait in Ethiopians. However, in the Amhara, SNP rs10803083 is associated with hemoglobin levels at genome-wide levels of significance. No significant genotype association was observed for oxygen saturation levels in either ethnic group. Approaches based on allele frequency divergence did not detect outliers in candidate hypoxia genes, but the most differentiated variants between high- and lowlanders have a clear role in pathogen defense. Interestingly, a significant excess of allele frequency divergence was consistently detected for genes involved in cell cycle control, DNA damage and repair, thus pointing to new pathways for high altitude adaptations. Finally, a comparison of CpG methylation levels between high- and lowlanders found several significant signals at individual genes in the Oromo.

Defensive complexity and the phylogenetic conservation of immune control

Erick Chastain, Rustom Antia, Carl T. Bergstrom
(Submitted on 13 Nov 2012)

One strategy for winning a coevolutionary struggle is to evolve rapidly. Most of the literature on host-pathogen coevolution focuses on this phenomenon, and looks for consequent evidence of coevolutionary arms races. An alternative strategy, less often considered in the literature, is to deter rapid evolutionary change by the opponent. To study how this can be done, we construct an evolutionary game between a controller that must process information, and an adversary that can tamper with this information processing. In this game, a species can foil its antagonist by processing information in a way that is hard for the antagonist to manipulate. We show that the structure of the information processing system induces a fitness landscape on which the adversary population evolves. Complex processing logic can carve long, deep fitness valleys that slow adaptive evolution in the adversary population. We suggest that this type of defensive complexity on the part of the vertebrate adaptive immune system may be an important element of coevolutionary dynamics between pathogens and their vertebrate hosts. Furthermore, we cite evidence that the immune control logic is phylogenetically conserved in mammalian lineages. Thus our model of defensive complexity suggests a new hypothesis for the lower rates of evolution for immune control logic compared to other immune structures.

BayesHammer: Bayesian clustering for error correction in single-cell sequencing

Sergey I. Nikolenko, Anton I. Korobeynikov, Max A. Alekseyev
(Submitted on 12 Nov 2012)

Error correction of sequenced reads remains a difficult task, especially in single-cell sequencing projects with extremely non-uniform coverage. While existing error correction tools designed for standard (multi-cell) sequencing data usually come up short in single-cell sequencing projects, algorithms actually used for single-cell error correction have been so far very simplistic.
We introduce several novel algorithms based on Hamming graphs and Bayesian subclustering in our new error correction tool BayesHammer. While BayesHammer was designed for single-cell sequencing, we demonstrate that it also improves on existing error correction tools for multi-cell sequencing data while working much faster on real-life datasets. We benchmark BayesHammer on both k-mer counts and actual assembly results with the SPAdes genome assembler.
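
To make the term "Hamming graph" concrete, here is a minimal sketch, not BayesHammer's actual implementation: k-mers are vertices, two k-mers are joined when they differ in at most `tau` positions, and connected components serve as candidate clusters. The function names, the threshold `tau` and the toy k-mers are assumptions made for illustration, and the Bayesian subclustering step mentioned in the abstract is not shown.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("k-mers must have the same length")
    return sum(x != y for x, y in zip(a, b))


def hamming_components(kmers, tau=1):
    """Connected components of the graph joining k-mers within distance tau.

    Illustrative only: all pairs are compared (quadratic time), which would
    be far too slow for real sequencing data.
    """
    kmers = list(dict.fromkeys(kmers))  # drop duplicates, keep order
    parent = {k: k for k in kmers}      # union-find forest

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    for a, b in combinations(kmers, 2):
        if hamming(a, b) <= tau:
            parent[find(a)] = find(b)

    groups = {}
    for k in kmers:
        groups.setdefault(find(k), []).append(k)
    return sorted(sorted(g) for g in groups.values())


# Toy input (made up): two pairs of near-identical k-mers and one singleton.
clusters = hamming_components(["ACGT", "ACGA", "TTTT", "TTTA", "GGGG"])
print(clusters)
```

In an error-correction setting, each component would group a presumably correct k-mer with its likely erroneous variants; how the centre of each cluster is chosen is exactly what the paper's Bayesian step addresses and is outside this sketch.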

Exact results for fixation probability of bithermal evolutionary graphs

Bahram Houchmandzadeh (LIPhy), Marcel Vallade (LIPhy)
(Submitted on 12 Nov 2012)

One of the most fundamental concepts of evolutionary dynamics is the “fixation” probability, i.e. the probability that a mutant spreads through the whole population. Most natural communities are geographically structured into habitats exchanging individuals among each other and can be modeled by an evolutionary graph (EG), where directed links weight the probability for the offspring of one individual to replace another individual in the community. Very few exact analytical results are known for EGs. We show here how, by using the technique of the fixed point of the probability generating function, we can uncover a large class of graphs, which we term bithermal, for which the exact fixation probability can be simply computed.
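
As a point of reference for what a "fixation probability" looks like in the simplest case, the sketch below evaluates the classical closed form for a single mutant of relative fitness r in a well-mixed Moran population of size N, namely (1 - 1/r) / (1 - 1/r^N), with the neutral limit 1/N. This is the textbook unstructured baseline, not the bithermal result of the paper; the function name and the example values are assumptions made for illustration.

```python
def moran_fixation_probability(r: float, n: int) -> float:
    """Fixation probability of one mutant with relative fitness r
    in a well-mixed Moran population of n individuals."""
    if n < 1:
        raise ValueError("population size must be at least 1")
    if r <= 0:
        raise ValueError("relative fitness must be positive")
    if r == 1.0:
        # Neutral mutant: every individual is equally likely to take over.
        return 1.0 / n
    return (1.0 - 1.0 / r) / (1.0 - r ** (-n))


# Illustrative values only: neutral, advantageous and deleterious mutants.
for fitness in (1.0, 1.1, 0.9):
    print(fitness, moran_fixation_probability(fitness, 100))
```

Results for structured populations, such as those in the abstract, are typically compared against this well-mixed value.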

The effect of multiple paternity on genetic diversity during and after colonisation

M. Rafajlovic, A. Eriksson, A. Rimark, S. H. Saltin, G. Charrier, M. Panova, C. André, K. Johannesson, B. Mehlig
(Submitted on 5 Nov 2012)

In metapopulations, genetic variation of local populations is influenced by the genetic content of the founders, and of migrants following establishment. We analyse the effect of multiple paternity on genetic diversity using a model in which the highly promiscuous marine snail Littorina saxatilis expands from a mainland to colonise initially empty islands of an archipelago. Migrant females carry a large number of eggs fertilised by 1 – 10 mates. We quantify the genetic diversity of the population in terms of its heterozygosity: initially during the transient colonisation process, and at long times when the population has reached an equilibrium state with migration. During colonisation, multiple paternity increases the heterozygosity by 10 – 300 % in comparison with the case of single paternity. The equilibrium state, by contrast, is less strongly affected: multiple paternity gives rise to 10 – 50 % higher heterozygosity compared with single paternity. Further we find that far from the mainland, new mutations spreading from the mainland cause bursts of high genetic diversity separated by long periods of low diversity. This effect is boosted by multiple paternity. We conclude that multiple paternity facilitates colonisation and maintenance of small populations, whether or not this is the main cause for the evolution of extreme promiscuity in Littorina saxatilis.

Genomic mutation rates that neutralize adaptive evolution and natural selection

Philip Gerrish, Alexandre Colato, Paul Sniegowski
(Submitted on 5 Nov 2012)

When mutation rates are low, natural selection remains effective, and increasing the mutation rate can give rise to an increase in adaptation rate. When mutation rates are high to begin with, however, increasing the mutation rate may have a detrimental effect because of the overwhelming presence of deleterious mutations. Indeed, if mutation rates are high enough: 1) adaptation rate can become negative despite the continued availability of adaptive and/or compensatory mutations, or 2) natural selection may be disabled because adaptive and/or compensatory mutations (whether established or newly arising) are eroded by excessive mutation and decline in frequency. We apply these two criteria to a standard model of asexual adaptive evolution and derive mathematical expressions (some new, some old in new guise) delineating the mutation rates under which either adaptive evolution or natural selection is neutralized. The expressions are simple and require no a priori knowledge of organism- and/or environment-specific parameters. Our discussion connects these results to each other and to previous theory, showing convergence or equivalence of the different results in most cases.

Response to Horizontal gene transfer may explain variation in θs

Inigo Martincorena, Nicholas M. Luscombe
(Submitted on 5 Nov 2012)

In a short article submitted to arXiv [1], Maddamsetti et al. argue that the variation in the neutral mutation rate among genes in Escherichia coli that we recently reported [2] might be explained by horizontal gene transfer (HGT). To support their argument they present a reanalysis of synonymous diversity in 10 E. coli strains together with an analysis of a collection of 1,069 synonymous mutations found in repair-deficient strains in a long-term in vitro evolution experiment. Here we respond to this communication. Briefly, we explain that HGT was carefully accounted for in our study by multiple independent phylogenetic and population genetic approaches, and we show that there is no new evidence of HGT affecting our results. We also argue that caution must be exercised when comparing mutations from repair-deficient strains to data from wild-type strains, as these conditions are dominated by different mutational processes. Finally, we reanalyse Maddamsetti’s collection of mutations from a long-term in vitro experiment and we report preliminary evidence of non-random variation of the mutation rate in these repair-deficient strains.

Inference of Admixture Parameters in Human Populations Using Weighted Linkage Disequilibrium

Po-Ru Loh, Mark Lipson, Nick Patterson, Priya Moorjani, Joseph K Pickrell, David Reich, Bonnie Berger
(Submitted on 1 Nov 2012)

Long-range migrations and the resulting admixture between populations have been an important force shaping human genetic diversity. Most existing methods for detecting and reconstructing historical admixture events are based on allele frequency divergences or patterns of ancestry segments in chromosomes of admixed individuals. An emerging new approach harnesses the exponential decay of admixture-induced linkage disequilibrium (LD) as a function of genetic distance. Here, we comprehensively develop LD-based inference into a versatile tool for investigating admixture. We present a new weighted LD statistic that can be used to infer mixture proportions as well as dates with fewer constraints on reference populations than previous methods. We define an LD-based three-population test for admixture and identify scenarios in which it can detect admixture that previous formal tests cannot. We further show that we can discover phylogenetic relationships between populations by comparing weighted LD curves obtained using a suite of references. Finally, we describe several improvements to the computation and fitting of weighted LD curves that greatly increase the robustness and speed of the computation. We implement all of these advances in a software package, ALDER, which we validate in simulations and apply to test for admixture among all populations from the Human Genome Diversity Project (HGDP), highlighting insights into the admixture history of Central African Pygmies, Sardinians, and Japanese.
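
The exponential decay of admixture LD mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest single-pulse model, a(d) = A·exp(-n·d), where d is genetic distance in Morgans, n is the number of generations since admixture and A is an amplitude; the function name and the noise-free synthetic curve are made up for illustration. This is not ALDER's fitting procedure, which the abstract describes only in general terms.

```python
import math


def fit_exponential_decay(distances_morgans, ld_values):
    """Fit ld = A * exp(-n * d) by least squares on log(ld).

    Returns (n, A). Assumes all ld values are positive and that the
    curve has no additive offset; both are simplifying assumptions.
    """
    if len(distances_morgans) != len(ld_values) or len(ld_values) < 2:
        raise ValueError("need at least two (distance, LD) pairs")
    if any(v <= 0 for v in ld_values):
        raise ValueError("log-linear fit requires positive LD values")

    logs = [math.log(v) for v in ld_values]
    m = len(logs)
    mean_d = sum(distances_morgans) / m
    mean_y = sum(logs) / m
    sxx = sum((d - mean_d) ** 2 for d in distances_morgans)
    sxy = sum((d - mean_d) * (y - mean_y)
              for d, y in zip(distances_morgans, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_d
    return -slope, math.exp(intercept)


# Synthetic, noise-free curve: 50 generations, amplitude 0.1 (made-up values).
distances = [0.005 * i for i in range(1, 21)]
curve = [0.1 * math.exp(-50 * d) for d in distances]
generations, amplitude = fit_exponential_decay(distances, curve)
print(round(generations, 3), round(amplitude, 3))
```

With real data the curve is noisy and may change sign, so a log-linear fit like this one would not be appropriate; a nonlinear least-squares fit with an additive constant is the more usual choice.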