First-step Mutations during Adaptation to Thermal Stress Shift the Expression of Thousands of Genes Back toward the Pre-stressed State

Alejandra Rodriguez-Verdugo, Olivier Tenaillon, Brandon Gaut
doi: http://dx.doi.org/10.1101/022905

The temporal change of phenotypes during the adaptive process remains largely unexplored, as do the genetic changes that affect these phenotypic changes. Here we focused on three mutations that rose to high frequency in the early stages of adaptation within 12 Escherichia coli populations subjected to thermal stress (42°C). All of the mutations were in the rpoB gene, which encodes the RNA polymerase beta subunit. For each mutation, we measured the growth curves and gene expression (mRNAseq) of clones at 42°C. We also compared growth and gene expression to their ancestor under unstressed (37°C) and stressed conditions (42°C). Each of the three mutations changed the expression of hundreds of genes and conferred large fitness advantages, apparently through the restoration of global gene expression from the stressed towards the pre-stressed state. Finally, we compared the phenotypic characteristics of one mutant, I572L, to two high-temperature adapted clones that have this mutation plus additional background mutations. The background mutations increased fitness, but they did not substantially change gene expression. We conclude that early mutations in a global transcriptional regulator cause extensive changes in gene expression, many of which are likely under positive selection for their effect in restoring the pre-stress physiology.

Purging of deleterious variants in Italian founder populations with extended autozygosity

Massimiliano Cocca, Marc Pybus, Pier Francesco Palamara, Erik Garrison, Michela Traglia, Cinzia F Sala, Sheila Ulivi, Yasin Memari, Anja Kolb-Kokocinski, Richard Durbin, Paolo Gasparini, Daniela Toniolo, Nicole Soranzo, Vincenza Colonna
doi: http://dx.doi.org/10.1101/022947

Purging through inbreeding defines the process through which deleterious alleles can be removed from populations by natural selection when exposed in homozygosis through the occurrence of consanguineous marriage. In this study we carried out low-read depth (4-10x) whole-genome sequencing in 568 individuals from three Italian founder populations, and compared it to data from other Italian and European populations from the 1000 Genomes Project. We show depletion of homozygous genotypes at potentially detrimental sites in the founder populations compared to outbred populations and observe patterns consistent with consanguinity driving the accelerated purging of highly deleterious mutations.

Genetic loci with parent of origin effects cause hybrid seed lethality between Mimulus species

Austin G Garner, Amanda M Kenney, Lila Fishman, Andrea L Sweigart
doi: http://dx.doi.org/10.1101/022863

The classic finding in both flowering plants and mammals that hybrid lethality often depends on parent of origin effects suggests that divergence in the underlying loci might be an important source of hybrid incompatibilities between species. In flowering plants, there is now good evidence from diverse taxa that seed lethality arising from interploidy crosses is often caused by endosperm defects associated with deregulated imprinted genes. A similar seed lethality phenotype occurs in many crosses between closely related diploid species, but the genetic basis of this form of early-acting F1 postzygotic reproductive isolation is largely unknown. Here, we show that F1 hybrid seed lethality is an exceptionally strong isolating barrier between two closely related Mimulus species, M. guttatus and M. tilingii, with reciprocal crosses producing less than 1% viable seeds. Using a powerful crossing design and high-resolution genetic mapping, we identify both maternally- and paternally-derived loci that contribute to hybrid seed incompatibility. Strikingly, these two sets of loci are largely non-overlapping, providing strong evidence that genes with parent of origin effects are the primary driver of F1 hybrid seed lethality between M. guttatus and M. tilingii. We find a highly polygenic basis for both parental components of hybrid seed lethality suggesting that multiple incompatibility loci have accumulated to cause strong postzygotic isolation between these closely related species. Our genetic mapping experiment also reveals hybrid transmission ratio distortion and chromosomal differentiation, two additional correlates of functional and genomic divergence between species.

Early modern human dispersal from Africa: genomic evidence for multiple waves of migration

Francesca Tassi, Silvia Ghirotto, Massimo Mezzavilla, Sibelle Torres Vilaça, Lisa De Santi, Guido Barbujani
doi: http://dx.doi.org/10.1101/022889

Background. Anthropological and genetic data agree in indicating the African continent as the main place of origin for modern humans. However, it is unclear whether early modern humans left Africa through a single, major process, dispersing simultaneously over Asia and Europe, or in two main waves, first through the Arabian Peninsula into Southern Asia and Oceania, and later through a Northern route crossing the Levant. Results. Here we show that accurate genomic estimates of the divergence times between European and African populations are more recent than those between Australo-Melanesia and Africa, and incompatible with the effects of a single dispersal. This difference cannot possibly be accounted for by the effects of hybridization with archaic human forms in Australo-Melanesia. Furthermore, in several populations of Asia we found evidence for relatively recent genetic admixture events, which could have obscured the signatures of the earliest processes. Conclusions. We conclude that the hypothesis of a single major human dispersal from Africa appears hardly compatible with the observed historical and geographical patterns of genome diversity, and that Australo-Melanesian populations seem still to retain a genomic signature of a more ancient divergence from Africa.

A comparative study of SVDquartets and other coalescent-based species tree estimation methods

Jed Chou, Ashu Gupta, Shashank Yaduvanshi, Ruth Davidson, Mike Nute, Siavash Mirarab, Tandy Warnow
doi: http://dx.doi.org/10.1101/022855

Background: Species tree estimation is challenging in the presence of incomplete lineage sorting (ILS), which can make gene trees different from the species tree. Because ILS is expected to occur and the standard concatenation approach can return incorrect trees with high support in the presence of ILS, “coalescent-based” summary methods (which first estimate gene trees and then combine gene trees into a species tree) have been developed that have theoretical guarantees of robustness to arbitrarily high amounts of ILS. Some studies have suggested that summary methods should only be used on “c-genes” (i.e., recombination-free loci) that can be extremely short (sometimes fewer than 100 sites). However, gene trees estimated on short alignments can have high estimation error, and summary methods tend to have high error on short c-genes. To address this problem, Chifman and Kubatko introduced SVDquartets, a new coalescent-based method. SVDquartets takes multi-locus unlinked single-site data, infers the quartet trees for all subsets of four species, and then combines the set of quartet trees into a species tree using a quartet amalgamation heuristic. Yet, the accuracy of SVDquartets relative to leading coalescent-based methods has not been assessed. Results: We compared SVDquartets to two leading coalescent-based methods (ASTRAL-2 and NJst), and to concatenation using maximum likelihood. We used a collection of simulated datasets, varying ILS levels, numbers of taxa, and number of sites per locus. Although SVDquartets was sometimes more accurate than ASTRAL-2 and NJst, most often the best results were obtained using ASTRAL-2, even on the shortest gene sequence alignments we explored (with only 10 sites per locus). Finally, concatenation was the most accurate of all methods under low ILS conditions. Conclusions: ASTRAL-2 generally had the best accuracy under higher ILS conditions, and concatenation had the best accuracy under the lowest ILS conditions. However, SVDquartets was competitive with the best methods under conditions with low ILS and small numbers of sites per locus. The good performance under many conditions of ASTRAL-2 in comparison to SVDquartets is surprising given the known vulnerability of ASTRAL-2 and similar methods to short gene sequences.
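
To make the "all subsets of four species" step concrete, here is a minimal Python sketch that only enumerates the quartets and the three candidate unrooted topologies for each. It is not the SVDquartets implementation: the species labels and function names are hypothetical, and both the step that picks one topology per quartet and the quartet amalgamation heuristic are left out.

```python
from itertools import combinations
from math import comb


def quartet_topologies(quartet):
    """Return the three possible unrooted topologies (splits) for four taxa."""
    a, b, c, d = quartet
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


def enumerate_quartets(taxa):
    """Map every four-taxon subset to its three candidate topologies."""
    return {q: quartet_topologies(q) for q in combinations(sorted(taxa), 4)}


# Hypothetical species labels, used only for illustration.
taxa = ["A", "B", "C", "D", "E", "F"]
quartets = enumerate_quartets(taxa)

# The number of quartets is "n choose 4", so it grows quickly with taxa.
print(len(quartets), comb(len(taxa), 4))
```

The count grows as n choose 4, which is one reason the number of taxa is a natural variable in a comparison like the one described above.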

Author post: Adaptive evolution is substantially impeded by Hill-Robertson interference in Drosophila

This guest post is by David Castellano and Adam Eyre-Walker on their preprint (with co-authors) Adaptive evolution is substantially impeded by Hill-Robertson interference in Drosophila.

Our paper “Adaptive evolution is substantially impeded by Hill-Robertson interference in Drosophila”, in which we investigate the role of both the rate of recombination and the mutation rate on the rate of adaptive amino acid substitutions, has been available at bioRxiv (http://dx.doi.org/10.1101/021600) since 27 June.

Population genetics theory predicts that the rate of adaptive evolution should depend upon the rate of recombination; genes with low rates of recombination will suffer from Hill-Robertson interference (HRi), in which selected mutations interfere with each other (see the figure below). A newly arising advantageous mutation may find itself in competition for fixation with another advantageous mutation at a linked locus on another chromosome in the population, or in linkage disequilibrium with deleterious mutations, which will reduce its probability of fixation if it cannot recombine away from them.

A schematic HRi example among adaptive alleles (left) and among adaptive and deleterious alleles (right).

Likewise, it is expected that genes with higher mutation rates will undergo more adaptive evolution than genes with low mutation rates. More interestingly, an interaction between the rate of recombination and the rate of mutation is also expected; HRi should be more prevalent in genes with high mutation rates and low rates of recombination. No attempt has been made so far to quantify the overall impact of HRi on the rate of adaptive evolution for any given genome. In our paper we propose a way to quantify the number of adaptive substitutions lost due to HRi: approximately 27% of all adaptive mutations that would have gone to fixation since the split between D. melanogaster and D. yakuba, had there been free recombination, are lost due to HRi. Moreover, we are able to estimate how the fraction of adaptive amino acid substitutions lost to HRi depends on a gene’s mutation rate. In agreement with our expectations, genes with high mutation rates lose a significantly higher proportion of adaptive substitutions than genes with low mutation rates (43% vs 11%, respectively).
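
One way to read these percentages is as the shortfall of observed adaptive substitutions relative to the number expected under free recombination. The Python sketch below shows only that arithmetic. The function name and the counts are hypothetical, picked so that the output matches the quoted percentages, and it does not reproduce the estimation procedure used in the paper.

```python
def fraction_lost(expected_free_recombination, observed):
    """Fraction of adaptive substitutions lost, relative to the number
    expected if recombination were free (both arguments are counts)."""
    if expected_free_recombination <= 0:
        raise ValueError("expected count must be positive")
    return 1.0 - observed / expected_free_recombination


# Hypothetical counts, chosen only to match the percentages in the post.
genome_wide = fraction_lost(100, 73)
high_mutation_rate = fraction_lost(100, 57)
low_mutation_rate = fraction_lost(100, 89)

print(round(genome_wide, 2))
print(round(high_mutation_rate, 2), round(low_mutation_rate, 2))
```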

An open question is to what extent HRi affects rates of adaptive evolution in other species. Moreover, the loss of adaptive substitutions to HRi can potentially tell us something important about the strength of selection acting on some advantageous mutations, since weakly selected mutations are those that are most likely to be affected by HRi. This will require further analysis and population genetic modeling, but in combination with other sources of information (for example, the dip in diversity around non-synonymous substitutions, the site frequency spectrum, and the high-frequency variants that are left by selective sweeps), it may be possible to infer much more about the distribution of fitness effects (DFE) of advantageous mutations than previously thought.

It will be of great interest to do similar analyses to those performed here in other species.

Comments very welcome!
David and Adam

A tree metric using structure and length to capture distinct phylogenetic signals

Michelle Kendall, Caroline Colijn
Subjects: Populations and Evolution (q-bio.PE)

Phylogenetic trees are a central tool in understanding evolution. They are typically inferred from sequence data, and capture evolutionary relationships through time. It is essential to be able to compare trees from different data sources (e.g. several genes from the same organisms) and different inference methods. We propose a new metric for robust, quantitative comparison of rooted, labeled trees. It enables clear visualizations of tree space, gives meaningful comparisons between trees, and can detect distinct islands of tree topologies in posterior distributions of trees. This makes it possible to select well-supported summary trees. We demonstrate our approach on Dengue fever phylogenies.

MuCor: Mutation Aggregation and Correlation

Karl W Kroll, Ann-Katherin Eisfeld, Gerard Lozanski, Clara D Bloomfield, John C Byrd, James S Blachly
doi: http://dx.doi.org/10.1101/022780

Motivation: There are many tools for variant calling and effect prediction, but little to tie together large sample groups. Aggregating, sorting, and summarizing variants and effects across a cohort is often done with ad hoc scripts that must be re-written for every new project. In response, we have written MuCor, a tool to gather variants from a variety of input formats (including multiple files per sample), perform database lookups and frequency calculations, and write many report types. In addition to use in large studies with numerous samples, MuCor can also be employed to directly compare variant calls from the same sample across two or more platforms, parameters, or pipelines. A companion utility, DepthGauge, measures coverage at regions of interest to increase confidence in calls. Availability: Source code is freely available at https://github.com/blachlylab Contact: james.blachly@osumc.edu Supplementary data: Supplementary data, including detailed documentation, are available online.
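
As a rough illustration of the aggregation step described above, the Python sketch below counts how many samples carry each variant and reports a cohort frequency. It is not MuCor's code or interface: the record layout, function name, and example calls are all hypothetical.

```python
from collections import defaultdict


def aggregate_variants(calls):
    """Summarize variant calls across a cohort.

    calls: iterable of (sample, chrom, pos, ref, alt) tuples.
    Repeated calls for the same sample (e.g. from several input files)
    are counted once, because samples are collected in a set.
    """
    samples_by_variant = defaultdict(set)
    all_samples = set()
    for sample, chrom, pos, ref, alt in calls:
        all_samples.add(sample)
        samples_by_variant[(chrom, pos, ref, alt)].add(sample)

    # Assumption: the cohort is the set of samples with at least one call.
    n_samples = len(all_samples)
    report = []
    for variant, samples in sorted(samples_by_variant.items()):
        report.append({
            "variant": variant,
            "samples": sorted(samples),
            "count": len(samples),
            "frequency": len(samples) / n_samples,
        })
    return report


# Hypothetical calls; the third record repeats sample s1 from a second file.
calls = [
    ("s1", "chr1", 100, "A", "G"),
    ("s2", "chr1", 100, "A", "G"),
    ("s1", "chr1", 100, "A", "G"),
    ("s3", "chr2", 500, "C", "T"),
]
report = aggregate_variants(calls)
for row in report:
    print(row["variant"], row["count"], round(row["frequency"], 2))
```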

Adaptive divergence in the bovine genome

William Barendse, Sean McWilliam, Rowan J Bunch, Blair E Harrison
doi: http://dx.doi.org/10.1101/022764

Cattle diverged during the Pleistocene into two subspecies, one in temperate and one in tropical environments. Here we have used next generation sequencing of the indicine subspecies of cattle and compared it to the taurine subspecies. Although 23.8 million single nucleotide polymorphisms (SNP) were found, the number of fixed amino acid substitutions between the taurine and indicine subspecies was low and consistent with the Haldane predictions for adaptive selection rather than with Neutral Theory. We noted 33 regions of enhanced divergence of nonsynonymous SNP between the subspecies, which included an increased rate of deleterious variants. Signals of positive selection were found for genes associated with immunity, including the Bovine Major Histocompatibility Complex, which also showed an increased rate of deleterious amino acid variants. The genes important in sensing the environment, especially the olfactory system, showed a network wide signal of positive selection.

What’s in your next-generation sequence data? An exploration of unmapped DNA and RNA sequence reads from the bovine reference individual

Lynsey K. Whitacre, Polyana C. Tizioto, JaeWoo Kim, Tad S. Sonstegard, Steven G. Schroeder, Leeson J. Alexander, Juan F. Medrano, Robert D. Schnabel, Jeremy F. Taylor, Jared E. Decker
doi: http://dx.doi.org/10.1101/022731

Next-generation sequencing projects commonly commence by aligning reads to a reference genome assembly. While improvements in alignment algorithms and computational hardware have greatly enhanced the efficiency and accuracy of alignments, a significant percentage of reads often remain unmapped. We generated de novo assemblies of unmapped reads from the DNA and RNA sequencing of the Bos taurus reference individual and identified the closest matching sequence to each contig by alignment to the NCBI non-redundant nucleotide database using BLAST. As expected, many of these contigs represent vertebrate sequence that is absent, incomplete, or misassembled in the UMD3.1 reference assembly. However, numerous additional contigs represent invertebrate species. Most prominent were several species of Spirurid nematodes and a blood-borne parasite, Babesia bigemina. These species are not known to infect taurine cattle and the reference animal appears to have been host to unsequenced sister species. We demonstrate the importance of exploring unmapped reads to ascertain sequences that are either absent or misassembled in the reference assembly and for detecting sequences indicative of infectious or symbiotic organisms.
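
For readers who want to look at their own unmapped reads, a first step is to pull them out of the alignment output. The Python sketch below does this for SAM-formatted text by checking the 0x4 bit of the FLAG field, which marks a segment as unmapped. It is not the authors' pipeline: the function name and example records are hypothetical, and the assembly and BLAST steps are not shown.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_reads(sam_lines):
    """Yield (read_name, sequence) for SAM records with the 0x4 bit set."""
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # SAM columns: QNAME is field 1, FLAG is field 2, SEQ is field 10.
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if flag & UNMAPPED_FLAG:
            yield name, seq


# Hypothetical SAM records: r1 is mapped, r2 and r3 are unmapped.
sam = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\tIIII",
    "r3\t77\t*\t0\t0\t*\t*\t0\t0\tGGCC\tIIII",
]
unmapped = list(unmapped_reads(sam))
print(unmapped)
```

The reads collected this way would then be the input to a de novo assembler, with the resulting contigs searched against a sequence database.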