Coalescence with background and balancing selection in systems with bi- and uniparental reproduction: contrasting partial asexuality and selfing

Aneil Agrawal, Matthew Hartfield
doi: http://dx.doi.org/10.1101/022996

Uniparental reproduction in diploids, via asexual reproduction or selfing, reduces the independence with which separate loci are transmitted across generations. This is expected to increase the extent to which a neutral marker is affected by selection elsewhere in the genome. Such effects have previously been quantified in coalescent models involving selfing. Here we examine the effects of background selection and balancing selection in diploids capable of both sexual and asexual reproduction (i.e., partial asexuality). We find that the effect of background selection on reducing coalescent time (and effective population size) can be orders of magnitude greater when rates of sex are low than when sex is common. This is because asexuality enhances the effects of background selection through both a recombination effect and a segregation effect. We show that there are several reasons that the strength of background selection differs between systems with partial asexuality and those with comparable levels of uniparental reproduction via selfing. Expectations for reductions in Ne via background selection have been verified using stochastic simulations. In contrast to background selection, balancing selection increases the coalescent time for a linked neutral site. With partial asexuality, the effect of balancing selection is somewhat dependent upon the mode of selection (e.g., heterozygote advantage vs. negative frequency dependent selection) in a manner that does not apply to selfing. This is because the frequency of heterozygotes, which are required for recombination onto alternative genetic backgrounds, is more dependent on the pattern of selection with partial asexuality than with selfing.

ASTRID: Accurate Species TRees from Internode Distances

Pranjal Vachaspati, Tandy Warnow
doi: http://dx.doi.org/10.1101/023036

Background: Incomplete lineage sorting (ILS), modelled by the multi-species coalescent (MSC), is known to create discordance between gene trees and species trees, and to lead to inaccurate species tree estimates unless appropriate methods are used to estimate the species tree. While many statistically consistent methods have been developed to estimate the species tree in the presence of ILS, only ASTRAL-2 and NJst have been shown to have good accuracy on large datasets. Yet, NJst is generally slower and less accurate than ASTRAL-2, and cannot run on some datasets. Results: We have redesigned NJst to enable it to run on all datasets, and we have expanded its design space so that it can be used with different distance-based tree estimation methods. The resultant method, ASTRID, is statistically consistent under the MSC model, and has accuracy that is competitive with ASTRAL-2. Furthermore, ASTRID is much faster than ASTRAL-2, completing in minutes on some datasets for which ASTRAL-2 used hours. Conclusions: ASTRID is a new coalescent-based method for species tree estimation that is competitive with the best current method in terms of accuracy, while being much faster. ASTRID is available in open source form on GitHub.
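
As a rough illustration of the "internode distances" in the ASTRID title, the Python sketch below averages leaf-to-leaf path lengths across a set of gene trees. It rests on assumptions not stated in the abstract: gene trees are written as nested tuples of taxon names (a hypothetical encoding), a top-level node with three children stands for an unrooted tree, and the distance between two taxa is taken to be the number of edges on the path between them. The function names are illustrative and are not ASTRID's API. The step in which a distance-based method builds a tree from the averaged matrix is not shown.

```python
from collections import defaultdict
from itertools import combinations


def leaf_depths(node, pair_dist):
    """Return {leaf: edges from node to leaf}; record leaf-to-leaf path lengths."""
    if isinstance(node, str):  # a leaf is just a taxon name
        return {node: 0}
    child_maps = []
    for child in node:
        depths = leaf_depths(child, pair_dist)
        # One extra edge connects each child subtree to the current node.
        child_maps.append({leaf: d + 1 for leaf, d in depths.items()})
    # Leaves in different child subtrees are joined through the current node.
    for left, right in combinations(child_maps, 2):
        for a, da in left.items():
            for b, db in right.items():
                pair_dist[frozenset((a, b))] = da + db
    merged = {}
    for depths in child_maps:
        merged.update(depths)
    return merged


def average_internode_distances(gene_trees):
    """Average edge-count distance per taxon pair over trees containing both taxa."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tree in gene_trees:
        pair_dist = {}
        leaf_depths(tree, pair_dist)
        for pair, dist in pair_dist.items():
            totals[pair] += dist
            counts[pair] += 1
    return {pair: totals[pair] / counts[pair] for pair in totals}


# Two toy gene trees on the same four taxa with conflicting topologies.
trees = [("A", "B", ("C", "D")), ("A", "C", ("B", "D"))]
distances = average_internode_distances(trees)
```

Averaging only over the trees that contain both taxa is one simple way to tolerate gene trees with missing taxa; whether ASTRID handles missing data this way is not stated in the abstract.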

Estimating K in Genetic Mixture Models

Robert Verity, Richard Nichols
doi: http://dx.doi.org/10.1101/022988

A key quantity in the analysis of structured populations is the parameter K, which describes the number of subpopulations that make up the total population. Inference of K ideally proceeds via the model evidence, which is equivalent to the likelihood of the model. However, the evidence in favour of a particular value of K cannot usually be computed exactly, and instead programs such as STRUCTURE make use of simple heuristic estimators to approximate this quantity. We show – using simulated data sets small enough that the true evidence can be computed exactly – that these simple heuristics often fail to estimate the true evidence, and that this can lead to incorrect conclusions about K. Our proposed solution is to use thermodynamic integration (TI) to estimate the model evidence. After outlining the TI methodology we demonstrate the effectiveness of this approach using a range of simulated data sets. We find that TI can be used to obtain estimates of the model evidence that are orders of magnitude more accurate and precise than those based on simple heuristics. Furthermore, estimates of K based on these values are found to be more reliable than those based on a suite of model comparison statistics. Our solution is implemented for models both with and without admixture in the software TrueK.
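
To make the idea of thermodynamic integration concrete, here is a minimal Python sketch on a toy model where the evidence is known in closed form. It is not the genetic mixture model of the abstract and not the TrueK implementation. The assumptions are: data are drawn from a normal distribution with unknown mean and unit variance, the prior on the mean is standard normal, and the expected log-likelihood under each power posterior is computed analytically (in practice it would be estimated by MCMC at each temperature). The log evidence is the integral of that expectation over the inverse temperature beta from 0 to 1, approximated here with the trapezoid rule on a power-law ladder. All function names and the ladder settings are illustrative.

```python
import math


def expected_log_lik(beta, xs):
    """E[log L] under the power posterior (likelihood^beta * prior), in closed form."""
    n = len(xs)
    precision = beta * n + 1.0            # power-posterior precision of the mean
    mean = beta * sum(xs) / precision     # power-posterior mean
    sq = sum((x - mean) ** 2 for x in xs)
    # E[(x - theta)^2] = (x - mean)^2 + 1/precision, summed over the data.
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * (sq + n / precision)


def ti_log_evidence(xs, rungs=500, power=5.0):
    """Trapezoid-rule estimate of the integral of E_beta[log L] over beta in [0, 1]."""
    betas = [(k / rungs) ** power for k in range(rungs + 1)]  # dense near beta = 0
    vals = [expected_log_lik(b, xs) for b in betas]
    return sum(
        0.5 * (vals[k] + vals[k + 1]) * (betas[k + 1] - betas[k])
        for k in range(rungs)
    )


def exact_log_evidence(xs):
    """Closed-form log marginal likelihood for the same toy model."""
    n = len(xs)
    s = sum(xs)
    quad = sum(x * x for x in xs) - s * s / (n + 1)
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * math.log(n + 1) - 0.5 * quad


data = [0.3, -1.2, 0.8, 2.1, 0.5]
estimate = ti_log_evidence(data)
exact = exact_log_evidence(data)
```

Comparing `estimate` with `exact` mirrors, on a much simpler model, the abstract's strategy of checking estimators on data sets small enough for the true evidence to be computed exactly.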

Adaptation to temporally fluctuating environments by the evolution of maternal effects

Snigdhadip Dey, Steve Proulx, Henrique Teotonio
doi: http://dx.doi.org/10.1101/023044

Most organisms live in ever-challenging, temporally fluctuating environments. Theory suggests that the evolution of anticipatory (or deterministic) maternal effects underlies adaptation to environments that regularly fluctuate every other generation because of selection for increased offspring performance. Evolution of maternal bet-hedging reproductive strategies that randomize offspring phenotypes is in turn expected to underlie adaptation to irregularly fluctuating environments. Although maternal effects are ubiquitous, their adaptive significance is unknown since they can easily evolve as a correlated response to selection for increased maternal performance. Using the nematode Caenorhabditis elegans, we show the experimental evolution of maternal provisioning of offspring with glycogen, in populations facing a novel anoxia hatching environment every other generation. As expected with the evolution of deterministic maternal effects, improved embryo hatching survival under anoxia evolved at the expense of fecundity and glycogen provisioning when mothers experienced anoxia early in life. Unexpectedly, populations facing an irregularly fluctuating anoxia hatching environment failed to evolve maternal bet-hedging reproductive strategies. Instead, adaptation in these populations should have occurred through the evolution of balancing trade-offs over multiple generations, since they evolved reduced fitness over successive generations in anoxia but did not go extinct during experimental evolution. Mathematical modelling confirms our conclusion that adaptation to a wide range of patterns of environmental fluctuations hinges on the existence of deterministic maternal effects, and that they are generally much more likely to contribute to adaptation than maternal bet-hedging reproductive strategies.

Plant reproductive development is characterised by a transcriptomic evolutionary bulge

Toni I Gossmann, Dounia Saleh, Marc W Schmid, Michael A Spence, Karl Schmid
doi: http://dx.doi.org/10.1101/022939

Reproductive traits in plants tend to evolve rapidly due to various causes that include plant-pollinator coevolution and pollen competition, but the genomic basis of reproductive trait evolution is still largely unknown. To characterise evolutionary patterns of genome-wide gene expression in reproductive tissues and to compare them to developmental stages of the sporophyte, we analysed evolutionary conservation and genetic diversity of protein-coding genes using microarray-based transcriptome data from three plant species, Arabidopsis thaliana, rice (Oryza sativa) and soybean (Glycine max). In all three species a significant shift in gene expression occurs during gametogenesis in which genes of younger evolutionary age and higher genetic diversity contribute significantly more to the transcriptome than in other stages. We refer to this phenomenon as an ‘evolutionary bulge’ during plant reproductive development because it differentiates the gametophyte from the sporophyte. The extent of the bulge pattern is much stronger than the transcriptomic hourglass, which postulates that during early embryo development an increased proportion of ancient and conserved genes contribute to the total transcriptome. In the three plant species, we observed an hourglass pattern only in A. thaliana but not in rice or soybean, which suggests that unlike the evolutionary bulge of reproductive genes the transcriptomic hourglass is not a general pattern of plant embryogenesis, which is consistent with the absence of a morphologically defined phylotypic stage in plant development.

First-step Mutations during Adaptation to Thermal Stress Shift the Expression of Thousands of Genes Back toward the Pre-stressed State

Alejandra Rodriguez-Verdugo, Olivier Tenaillon, Brandon Gaut
doi: http://dx.doi.org/10.1101/022905

The temporal change of phenotypes during the adaptive process remains largely unexplored, as do the genetic changes that affect these phenotypic changes. Here we focused on three mutations that rose to high frequency in the early stages of adaptation within 12 Escherichia coli populations subjected to thermal stress (42°C). All of the mutations were in the rpoB gene, which encodes the RNA polymerase beta subunit. For each mutation, we measured the growth curves and gene expression (mRNAseq) of clones at 42°C. We also compared growth and gene expression to their ancestor under unstressed (37°C) and stressed conditions (42°C). Each of the three mutations changed the expression of hundreds of genes and conferred large fitness advantages, apparently through the restoration of global gene expression from the stressed towards the pre-stressed state. Finally, we compared the phenotypic characteristics of one mutant, I572L, to two high-temperature adapted clones that have this mutation plus additional background mutations. The background mutations increased fitness, but they did not substantially change gene expression. We conclude that early mutations in a global transcriptional regulator cause extensive changes in gene expression, many of which are likely under positive selection for their effect in restoring the pre-stress physiology.

Purging of deleterious variants in Italian founder populations with extended autozygosity

Massimiliano Cocca, Marc Pybus, Pier Francesco Palamara, Erik Garrison, Michela Traglia, Cinzia F Sala, Sheila Ulivi, Yasin Memari, Anja Kolb-Kokocinski, Richard Durbin, Paolo Gasparini, Daniela Toniolo, Nicole Soranzo, Vincenza Colonna
doi: http://dx.doi.org/10.1101/022947

Purging through inbreeding defines the process through which deleterious alleles can be removed from populations by natural selection when exposed in homozygosis through the occurrence of consanguineous marriage. In this study we carried out low-read depth (4-10x) whole-genome sequencing in 568 individuals from three Italian founder populations, and compared it to data from other Italian and European populations from the 1000 Genomes Project. We show depletion of homozygous genotypes at potentially detrimental sites in the founder populations compared to outbred populations and observe patterns consistent with consanguinity driving the accelerated purging of highly deleterious mutations.

Genetic loci with parent of origin effects cause hybrid seed lethality between Mimulus species

Austin G Garner, Amanda M Kenney, Lila Fishman, Andrea L Sweigart
doi: http://dx.doi.org/10.1101/022863

The classic finding in both flowering plants and mammals that hybrid lethality often depends on parent of origin effects suggests that divergence in the underlying loci might be an important source of hybrid incompatibilities between species. In flowering plants, there is now good evidence from diverse taxa that seed lethality arising from interploidy crosses is often caused by endosperm defects associated with deregulated imprinted genes. A similar seed lethality phenotype occurs in many crosses between closely related diploid species, but the genetic basis of this form of early-acting F1 postzygotic reproductive isolation is largely unknown. Here, we show that F1 hybrid seed lethality is an exceptionally strong isolating barrier between two closely related Mimulus species, M. guttatus and M. tilingii, with reciprocal crosses producing less than 1% viable seeds. Using a powerful crossing design and high-resolution genetic mapping, we identify both maternally- and paternally-derived loci that contribute to hybrid seed incompatibility. Strikingly, these two sets of loci are largely non-overlapping, providing strong evidence that genes with parent of origin effects are the primary driver of F1 hybrid seed lethality between M. guttatus and M. tilingii. We find a highly polygenic basis for both parental components of hybrid seed lethality suggesting that multiple incompatibility loci have accumulated to cause strong postzygotic isolation between these closely related species. Our genetic mapping experiment also reveals hybrid transmission ratio distortion and chromosomal differentiation, two additional correlates of functional and genomic divergence between species.

Early modern human dispersal from Africa: genomic evidence for multiple waves of migration

Francesca Tassi, Silvia Ghirotto, Massimo Mezzavilla, Sibelle Torres Vilaça, Lisa De Santi, Guido Barbujani
doi: http://dx.doi.org/10.1101/022889

Background. Anthropological and genetic data agree in indicating the African continent as the main place of origin for modern humans. However, it is unclear whether early modern humans left Africa through a single, major process, dispersing simultaneously over Asia and Europe, or in two main waves, first through the Arabian Peninsula into Southern Asia and Oceania, and later through a Northern route crossing the Levant. Results. Here we show that accurate genomic estimates of the divergence times between European and African populations are more recent than those between Australo-Melanesia and Africa, and incompatible with the effects of a single dispersal. This difference cannot possibly be accounted for by the effects of hybridization with archaic human forms in Australo-Melanesia. Furthermore, in several populations of Asia we found evidence for relatively recent genetic admixture events, which could have obscured the signatures of the earliest processes. Conclusions. We conclude that the hypothesis of a single major human dispersal from Africa appears hardly compatible with the observed historical and geographical patterns of genome diversity, and that Australo-Melanesian populations seem still to retain a genomic signature of a more ancient divergence from Africa.

A comparative study of SVDquartets and other coalescent-based species tree estimation methods

Jed Chou, Ashu Gupta, Shashank Yaduvanshi, Ruth Davidson, Mike Nute, Siavash Mirarab, Tandy Warnow
doi: http://dx.doi.org/10.1101/022855

Background: Species tree estimation is challenging in the presence of incomplete lineage sorting (ILS), which can make gene trees different from the species tree. Because ILS is expected to occur and the standard concatenation approach can return incorrect trees with high support in the presence of ILS, “coalescent-based” summary methods (which first estimate gene trees and then combine gene trees into a species tree) have been developed that have theoretical guarantees of robustness to arbitrarily high amounts of ILS. Some studies have suggested that summary methods should only be used on “c-genes” (i.e., recombination-free loci) that can be extremely short (sometimes fewer than 100 sites). However, gene trees estimated on short alignments can have high estimation error, and summary methods tend to have high error on short c-genes. To address this problem, Chifman and Kubatko introduced SVDquartets, a new coalescent-based method. SVDquartets takes multi-locus unlinked single-site data, infers the quartet trees for all subsets of four species, and then combines the set of quartet trees into a species tree using a quartet amalgamation heuristic. Yet, the relative accuracy of SVDquartets to leading coalescent-based methods has not been assessed. Results: We compared SVDquartets to two leading coalescent-based methods (ASTRAL-2 and NJst), and to concatenation using maximum likelihood. We used a collection of simulated datasets, varying ILS levels, numbers of taxa, and number of sites per locus. Although SVDquartets was sometimes more accurate than ASTRAL-2 and NJst, most often the best results were obtained using ASTRAL-2, even on the shortest gene sequence alignments we explored (with only 10 sites per locus). Finally, concatenation was the most accurate of all methods under low ILS conditions. Conclusions: ASTRAL-2 generally had the best accuracy under higher ILS conditions, and concatenation had the best accuracy under the lowest ILS conditions. However, SVDquartets was competitive with the best methods under conditions with low ILS and small numbers of sites per locus. The good performance under many conditions of ASTRAL-2 in comparison to SVDquartets is surprising given the known vulnerability of ASTRAL-2 and similar methods to short gene sequences.
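
The phrase "all subsets of four species" implies a workload that grows quickly with the number of taxa. The Python sketch below only enumerates that search space: every four-species subset together with its three possible unrooted quartet topologies. It does not implement the SVD-based scoring that SVDquartets uses to choose among the three, nor the amalgamation step. The function name and the species labels are hypothetical.

```python
from itertools import combinations
from math import comb


def quartet_candidates(species):
    """Yield each four-species subset with its three unrooted topologies (splits)."""
    for a, b, c, d in combinations(sorted(species), 4):
        splits = [
            ((a, b), (c, d)),  # ab | cd
            ((a, c), (b, d)),  # ac | bd
            ((a, d), (b, c)),  # ad | bc
        ]
        yield (a, b, c, d), splits


species = ["sp1", "sp2", "sp3", "sp4", "sp5"]
candidates = list(quartet_candidates(species))
n_subsets = comb(len(species), 4)  # number of four-species subsets
```

Because the number of subsets is "n choose 4", it scales with the fourth power of the number of species, which is why a heuristic is needed for the final amalgamation into a single species tree.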