Similar efficacies of selection shape mitochondrial and nuclear genes in Drosophila melanogaster and Homo sapiens

Brandon S. Cooper, Chad Burrus, Chao Ji, Matthew W. Hahn, Kristi L. Montooth
doi: http://dx.doi.org/10.1101/010355

Deleterious mutations contribute to polymorphism even when selection effectively prevents their fixation. The efficacy of selection in removing deleterious mitochondrial mutations from populations depends on the effective population size (Ne) of the mtDNA, and the degree to which a lack of recombination magnifies the effects of linked selection. Using complete mitochondrial genomes from Drosophila melanogaster and nuclear data available from the same samples, we re-examine the hypothesis that non-recombining animal mtDNA harbors an excess of deleterious polymorphisms relative to the nuclear genome. We find no evidence of recombination in the mitochondrial genome, and the much-reduced level of mitochondrial synonymous polymorphism relative to nuclear genes is consistent with a reduction in Ne. Nevertheless, we find that the neutrality index (NI), a measure of the excess of nonsynonymous polymorphism relative to the neutral expectation, is not significantly different between mitochondrial and nuclear loci. Reanalysis of published data from Homo sapiens reveals the same lack of a difference between the two genomes, though small samples in previous studies had suggested a strong difference in both species. Thus, despite a smaller Ne, mitochondrial loci of both flies and humans appear to experience similar efficacies of selection as do loci in the recombining nuclear genome.
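For readers unfamiliar with the neutrality index mentioned in the abstract above, here is a minimal sketch of its usual definition from a McDonald-Kreitman table: NI = (Pn/Ps) / (Dn/Ds), where Pn and Ps are nonsynonymous and synonymous polymorphism counts and Dn and Ds are the corresponding fixed differences. The function name and the example counts are hypothetical, and the abstract does not say which estimator or bias correction the authors used.

```python
def neutrality_index(pn, ps, dn, ds):
    """Return NI = (Pn/Ps) / (Dn/Ds) for one McDonald-Kreitman table.

    pn, ps: nonsynonymous / synonymous polymorphism counts
    dn, ds: nonsynonymous / synonymous fixed-difference counts
    NI > 1 indicates an excess of nonsynonymous polymorphism
    relative to the neutral expectation.
    """
    if ps == 0 or dn == 0:
        # The ratio is undefined when either denominator term is zero.
        raise ValueError("NI is undefined when Ps or Dn is zero")
    # Algebraically the same as (pn / ps) / (dn / ds).
    return (pn * ds) / (ps * dn)


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_ni = neutrality_index(pn=10, ps=20, dn=5, ds=20)
print(example_ni)
```

With these made-up counts, the polymorphism ratio (10/20) is twice the divergence ratio (5/20), so NI comes out above 1.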

Recent evolution of the mutation rate and spectrum in Europeans

Kelley Harris
doi: http://dx.doi.org/10.1101/010314

As humans dispersed out of Africa, they adapted to new environmental challenges including changes in exposure to mutagenic solar radiation. This raises the possibility that different populations experienced different selective pressures affecting genome integrity. Prior work has uncovered divergent selection in tropical versus temperate latitudes on eQTLs that regulate the DNA damage response, as well as evidence that the human mutation rate per year has changed at least 2-fold since we shared a common ancestor with chimpanzees. Here, I present evidence that the rate of a particular mutation type has recently increased in the European lineage, rising in frequency by 50% during the 30,000–50,000 years since Europeans diverged from Asians. A comparison of single nucleotide polymorphisms (SNPs) private to Africa, Asia, and Europe in the 1000 Genomes data reveals that private European variation is enriched for the transition 5’-TCC-3’→5’-TTC-3’. Although it is not clear whether UV played a causal role in changing the European mutational spectrum, 5’-TCC-3’→5’-TTC-3’ is known to be the most common somatic mutation present in melanoma skin cancers, as well as the mutation most frequently induced in vitro by UV. Regardless of its causality, this change indicates that DNA replication fidelity has not remained stable even since the origin of modern humans and might have changed numerous times during our recent evolutionary history.
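To make the idea of a context-dependent mutation spectrum concrete, the sketch below classifies SNPs by their trinucleotide context and reports the fraction falling in each class. It assumes each SNP is given as an (ancestral 3-mer, derived central base) pair and folds strand-complementary changes together so that the ancestral central base is always C or T. That folding is a common convention rather than something stated in the abstract, and all names and example data are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse-complement a DNA string made of A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def mutation_class(context, derived):
    """Label a SNP such as 'TCC>TTC' from its ancestral 3-mer and derived base.

    The central base of `context` is the site that mutates. Classes are
    folded across strands so the ancestral central base is C or T.
    """
    context, derived = context.upper(), derived.upper()
    if len(context) != 3 or len(derived) != 1 or derived == context[1]:
        raise ValueError("need a 3-mer context and a different derived base")
    mutated = context[0] + derived + context[2]
    if context[1] in "AG":
        # Report the equivalent change on the opposite strand.
        context, mutated = revcomp(context), revcomp(mutated)
    return f"{context}>{mutated}"


def spectrum(snps):
    """Return the fraction of SNPs in each folded trinucleotide class."""
    counts = Counter(mutation_class(c, d) for c, d in snps)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Hypothetical private SNPs; GGA>GAA is TCC>TTC read on the other strand.
toy_snps = [("TCC", "T"), ("GGA", "A"), ("ACG", "T"), ("TTA", "C")]
print(spectrum(toy_snps))
```

Comparing such per-class fractions between SNP sets private to different populations is one simple way to look for the kind of enrichment the abstract describes; the statistical testing used in the paper is not shown here.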

On the role of epistasis in adaptation

David M. McCandlish, Jakub Otwinowski, Joshua B. Plotkin
Subjects: Populations and Evolution (q-bio.PE)

Although the role of epistasis in evolution has received considerable attention from experimentalists and theorists alike, it is unknown which aspects of adaptation are in fact sensitive to epistasis. Here, we address this question by comparing the evolutionary dynamics on all finite epistatic landscapes versus all finite non-epistatic landscapes, under weak mutation. We first analyze the fitness trajectory — that is, the time course of the expected fitness of a population. We show that for any epistatic fitness landscape and choice of starting genotype, there always exists a non-epistatic fitness landscape and starting genotype that produces the exact same fitness trajectory. Thus, surprisingly, the presence or absence of epistasis is irrelevant to the first-order dynamics of adaptation. On the other hand, we show that the time evolution of the variance in fitness across replicate populations can be sensitive to epistasis: some epistatic fitness landscapes produce variance trajectories that cannot be produced by any non-epistatic landscape. Likewise, the mean substitution trajectory — that is, the expected number of mutations that fix over time — is also sensitive to epistasis. These results on identifiability have direct implications for efforts to infer epistasis from the types of data often measured in experimental populations.

Quantification of GC-biased gene conversion in the human genome

Sylvain Glemin, Peter F Arndt, Philipp W Messer, Dmitri Petrov, Nicolas Galtier, Laurent Duret
doi: http://dx.doi.org/10.1101/010173

Many lines of evidence indicate GC-biased gene conversion (gBGC) has a major impact on the evolution of mammalian genomes. However, up to now, this process had not been properly quantified. In principle, the strength of gBGC can be measured from the analysis of derived allele frequency (DAF) spectra. However, this approach is sensitive to a number of confounding factors. In particular, we show by simulations that the inference is pervasively affected by polymorphism polarization errors, especially at hypermutable sites, and spatial heterogeneity in gBGC strength. Here we propose a new method to quantify gBGC from DAF spectra, incorporating polarization errors and taking spatial heterogeneity into account. This method is very general in that it does not require any prior knowledge about the source of polarization errors and also provides information about mutation patterns. We apply this approach to human polymorphism data from the 1000 Genomes Project. We show that the strength of gBGC does not differ between hypermutable CpG sites and non-CpG sites, suggesting that in humans gBGC is not caused by the base-excision repair machinery. We further find that the impact of gBGC is concentrated primarily within recombination hotspots: genome-wide, the strength of gBGC is in the nearly neutral area, but 2% of the human genome is subject to strong gBGC, with population-scaled gBGC coefficients above 5. Given that the location of recombination hotspots evolves very rapidly, our analysis predicts that in the long term, a large fraction of the genome is affected by short episodes of strong gBGC.

STACEY: species delimitation and phylogeny estimation under the multispecies coalescent

Graham R Jones
doi: http://dx.doi.org/10.1101/010199

This article describes a new package called STACEY for BEAST2 which is capable of both species delimitation and species tree estimation using DNA sequences from multiple loci. The focus in this article is on species delimitation. STACEY is based on the multispecies coalescent model, and builds on earlier software (DISSECT), which uses a 'birth-death-collapse' prior to deal with delimitations without the need for reversible-jump Markov chain Monte Carlo moves. Like DISSECT, it requires no a priori assignment of individuals to species or populations, and no guide tree. This paper introduces two innovations. The first is a new model for the populations along the branches of the species tree, and the second is a new MCMC move for exploring the posterior when the multispecies coalescent model is assumed. The main benefit of STACEY over DISSECT is much better convergence. Current practice, using a pipeline approach to species delimitation under the multispecies coalescent, has been shown to have major problems on simulated data. The same simulated data set is used to demonstrate the accuracy and efficiency of STACEY.

The role of standing variation in geographic convergent adaptation

Peter L. Ralph, Graham Coop
doi: http://dx.doi.org/10.1101/009803

The extent to which populations experiencing shared selective pressures adapt through a shared genetic response is relevant to many questions in evolutionary biology. In a number of well-studied traits and species, it appears that convergent evolution within species is common. In this paper, we explore how standing, deleterious genetic variation contributes to convergent genetic responses in a geographically spread population, extending our previous work on the topic. Geographically limited dispersal slows the spread of each selected allele, hence allowing other alleles (newly arisen mutants or present as standing variation) to spread before any one comes to dominate the population. When such alleles meet, their progress is substantially slowed; if the alleles are selectively equivalent, they mix slowly, dividing the species range into a random tessellation, which can be well understood by analogy to a Poisson process model of crystallization. In this framework, we derive the geographic scale over which a typical allele is expected to dominate, the time it takes the species to adapt as a whole, and the proportion of adaptive alleles that arise from standing variation. Finally, we explore how negative pleiotropic effects of alleles before an environment change can bias the subset of alleles that get to contribute to a species' adaptive response. We apply the results to the many geographically localized G6PD deficiency alleles thought to confer resistance to malaria, whose large mutational target size and deleterious effects make them likely candidates to have been present as deleterious standing variation. We find that the numbers and geographic spread of these alleles match our predictions reasonably well, which suggests that these arose both from standing variation and new mutations since the advent of malaria. Our results suggest that much of adaptation may be geographically local even when selection pressures are widespread. We close by discussing the implications of these results for arguments of species coherence and the nature of divergence between species.

Thinking too positive? Revisiting current methods of population-genetic selection inference

Claudia Bank, Gregory B Ewing, Anna Ferrer-Admettla, Matthieu Foll, Jeffrey D Jensen
doi: http://dx.doi.org/10.1101/009654

In the age of next-generation sequencing, the availability of increasing amounts and quality of data at decreasing cost ought to allow for a better understanding of how natural selection is shaping the genome than ever before. Yet, alternative forces such as demography and background selection obscure the footprints of positive selection that we would like to identify. Here, we illustrate recent developments in this area, and outline a roadmap for improved selection inference. We argue (1) that the development and obligatory use of advanced simulation tools is necessary for improved identification of selected loci, (2) that genomic information from multiple time points will enhance the power of inference, and (3) that results from experimental evolution should be utilized to better inform population-genomic studies.

On the prospect of identifying adaptive loci in recently bottlenecked populations

Yu-Ping Poh, Vera S Domingues, Hopi Hoekstra, Jeffrey Jensen
doi: http://dx.doi.org/10.1101/009456

Identifying adaptively important loci in recently bottlenecked populations—be it natural selection acting on a population following the colonization of novel habitats in the wild, or artificial selection during the domestication of a breed—remains a major challenge. Here we report the results of a simulation study examining the performance of available population-genetic tools for identifying genomic regions under selection. To illustrate our findings, we examined the interplay between selection and demography in two species of Peromyscus mice, for which we have independent evidence of selection acting on phenotype as well as functional evidence identifying the underlying genotype. With this unusual information, we tested whether population-genetic-based approaches could have been utilized to identify the adaptive locus. Contrary to published claims, we conclude that the use of the background site frequency spectrum as a null model is largely ineffective in bottlenecked populations. Results are quantified both for site frequency spectrum and linkage disequilibrium-based predictions, and are found to hold true across a large parameter space that encompasses many species and populations currently under study. These results suggest that the genomic footprint left by selection on both new and standing variation in strongly bottlenecked populations will be difficult, if not impossible, to find using current approaches.

On the unfounded enthusiasm for soft selective sweeps

Jeffrey D. Jensen
doi: http://dx.doi.org/10.1101/009563

Underlying any understanding of the mode, tempo, and relative importance of the adaptive process in the evolution of natural populations is the notion of whether adaptation is mutation-limited. Two very different population genetic models have recently been proposed in which the rate of adaptation is not strongly limited by the rate at which newly arising beneficial mutations enter the population. This review discusses the theoretical underpinnings and requirements of these models, as well as the experimental insights on the parameters of relevance. Importantly, empirical and experimental evidence to date challenges the recent enthusiasm for invoking these models to explain observed patterns of variation in humans and Drosophila.

Origins and impacts of new exons

Jason Merkin*, Ping Chen*, Sampsa Hautaniemi, Christopher Burge
doi: http://dx.doi.org/10.1101/009282

Mammalian genes are typically broken into several protein-coding and non-coding exons, but the evolutionary origins and functions of new exons are not well understood. Here, we analyzed patterns of exon gain using deep cDNA sequencing data from several mammals and one bird, identifying thousands of species- and lineage-specific exons. While exons conserved across mammals are mostly protein-coding and constitutively spliced, species-specific exons were mostly located in 5′ untranslated regions and alternatively spliced. New exons most often derived from unique intronic sequence rather than repetitive elements, and were associated with upstream intronic deletions, increased nucleosome occupancy and RNA polymerase II pausing. Surprisingly, exon gain was associated with increased gene expression, but only in tissues where the exon was included, suggesting that splicing enhances steady-state mRNA levels and that changes in splicing represent a major contributor to the evolution of gene expression.