Dis-integrating the fly: A mutational perspective on phenotypic integration and covariation

Annat Haber, Ian Dworkin
doi: http://dx.doi.org/10.1101/023333

The structure of environmentally induced phenotypic covariation can influence the effective strength and magnitude of natural selection. Yet our understanding of the factors that contribute to and influence the evolutionary lability of such covariation is poor. Most studies have examined either environmental variation, without accounting for covariation, or examined phenotypic and genetic covariation, without distinguishing the environmental component. In this study we examined the effect of mutational perturbations on different properties of environmental covariation, as well as mean shape. We use strains of Drosophila melanogaster bearing well-characterized mutations known to influence wing shape, as well as naturally-derived strains, all reared under carefully-controlled conditions and with the same genetic background. We find that mean shape changes more freely than the covariance structure, and that different properties of the covariance matrix change independently from each other. The perturbations affect matrix orientation more than they affect matrix size or eccentricity. Yet, mutational effects on matrix orientation do not cluster according to the developmental pathway that they target. These results suggest that it might be useful to consider a more general concept of ‘decanalization’, involving all aspects of variation and covariation.

Long-term natural selection affects patterns of neutral divergence on the X chromosome more than the autosomes.

Melissa Ann Wilson Sayres, Pooja Narang
doi: http://dx.doi.org/10.1101/023234

Natural selection reduces neutral population genetic diversity near coding regions of the genome because recombination has not had time to unlink selected alleles from nearby neutral regions. For ten sub-species of great apes, including human, we show that long-term selection affects estimates of divergence on the X differently from the autosomes. Divergence increases with increasing distance from genes on both the X chromosome and autosomes, but increases faster on the X chromosome than on the autosomes, resulting in increasing ratios of X/A divergence in putatively neutral regions. Similarly, divergence is reduced more on the X chromosome than on the autosomes in neutral regions near conserved regulatory elements. Consequently, estimates of male mutation bias, which rely on comparing neutral divergence between the X and autosomes, are twice as high in neutral regions near genes as in regions far from genes. Our results suggest that filters for putatively neutral genomic regions should differ between the X and autosomes.
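
To make the link between X/A divergence and male mutation bias concrete, here is a minimal sketch. It assumes the classic relation in which the X chromosome spends two thirds of its time in females, so that X/A = (2/3)(2 + α)/(1 + α), where α is the male-to-female mutation rate ratio. The abstract does not state which estimator the authors used, and the X/A ratios below are hypothetical illustration values, not results from the study.

```python
def male_mutation_bias(x_to_a_ratio: float) -> float:
    """Solve X/A = (2/3) * (2 + alpha) / (1 + alpha) for alpha.

    Assumption: this textbook relation is used here for illustration;
    the preprint's exact estimator is not given in the abstract.
    """
    r = x_to_a_ratio
    # alpha >= 0 requires 2/3 < X/A <= 4/3 under this relation.
    if not (2.0 / 3.0 < r <= 4.0 / 3.0):
        raise ValueError("X/A ratio must satisfy 2/3 < X/A <= 4/3")
    return (4.0 - 3.0 * r) / (3.0 * r - 2.0)


# Hypothetical X/A divergence ratios (NOT values from the study).
alpha_near_genes = male_mutation_bias(0.80)
alpha_far_from_genes = male_mutation_bias(0.90)

print(f"alpha near genes:     {alpha_near_genes:.2f}")
print(f"alpha far from genes: {alpha_far_from_genes:.2f}")
```

Under this relation a modest drop in the X/A ratio near genes produces a large rise in the inferred α, which is why the choice of putatively neutral regions matters so much for this estimate.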

Dating ancient human samples using the recombination clock

Priya Moorjani, Sriram Sankararaman, Qiaomei Fu, Molly Przeworski, Nick J Patterson, David E. Reich
doi: http://dx.doi.org/10.1101/023341

The study of human evolution has been revolutionized by inferences from ancient DNA analyses. Key to these is the reliable estimation of the age of ancient specimens. The current best practice is radiocarbon dating, which relies on characterizing the decay of the radioactive carbon isotope (14C), and is applicable for dating samples up to 50,000 years old. Here, we introduce a new genetic method that uses a recombination clock for dating. The key idea is that an ancient genome has evolved less than the genomes of extant individuals. Thus, given a molecular clock provided by the steady accumulation of recombination events, one can infer the age of the ancient genome based on the number of missing years of evolution. To implement this idea, we take advantage of the shared history of Neanderthal gene flow into non-Africans that occurred around 50,000 years ago. Using the Neanderthal ancestry decay patterns, we estimate the Neanderthal admixture time for both ancient and extant samples. The difference in these admixture dates then provides an estimate of the age of the ancient genome. We show that our method provides reliable results in simulations. We apply our method to date five ancient Eurasian genomes with radiocarbon dates ranging from 12,000 to 45,000 years and recover consistent age estimates. Our method provides a complementary approach for dating ancient human samples and is applicable to ancient non-African genomes with Neanderthal ancestry. Extensions of this methodology that use older shared events may be able to date ancient genomes that fall beyond the radiocarbon frontier.
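
The arithmetic behind the recombination clock can be sketched as follows. This is an illustration only, not the authors' implementation: it assumes a single pulse of admixture, under which ancestry covariance between sites decays as exp(-t·d) with genetic distance d (in Morgans) after t generations, and it assumes a generation time of 29 years. The function names and all numbers are hypothetical.

```python
import math


def fit_admixture_generations(distances_morgans, covariances):
    """Estimate generations since admixture from an exponential decay curve.

    Fits log(covariance) = intercept - t * distance by least squares and
    returns t. Assumes a single-pulse admixture model (an assumption here).
    """
    xs = list(distances_morgans)
    ys = [math.log(c) for c in covariances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def sample_age_years(t_extant, t_ancient, years_per_generation=29.0):
    """Age of the ancient sample = missing generations * generation time."""
    missing_generations = t_extant - t_ancient
    if missing_generations < 0:
        raise ValueError("ancient sample cannot post-date the extant sample")
    return missing_generations * years_per_generation


# Hypothetical, noise-free decay curves on a 0.05 cM to 1 cM grid.
distances = [0.0005 * k for k in range(1, 21)]
extant_curve = [0.02 * math.exp(-1700 * d) for d in distances]
ancient_curve = [0.02 * math.exp(-300 * d) for d in distances]

t_extant = fit_admixture_generations(distances, extant_curve)
t_ancient = fit_admixture_generations(distances, ancient_curve)
print(f"estimated age: {sample_age_years(t_extant, t_ancient):.0f} years")
```

Because the ancient genome has had fewer generations of recombination since the shared admixture event, its ancestry covariance decays more slowly with distance, and the difference between the two fitted dates is the sample's age in generations.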

The genetic basis of cone serotiny in Pinus contorta as a function of mixed-severity and stand-replacement fire regimes

Mike Feduck, Philippe Henry, Richard Winder, David Dunn, René I Alfaro, Lara vanAkker, Brad Hawkes
doi: http://dx.doi.org/10.1101/023267

Wildfires and mountain pine beetle (MPB) attacks are important contributors to the development of stand structure in lodgepole pine, and major drivers of its evolution. The historical pattern of these events has been correlated with variation in cone serotiny (possessing cones that remain closed and retain seeds until opened by fire) across the Rocky Mountain region of Western North America. As climate change brings about a marked increase in the size, intensity, and severity of wildfires, it is becoming increasingly important to study the genetic basis of serotiny as an adaptation to wildfire. Knowledge gleaned from these studies would have direct implications for future forest management. In this study, we collected physical data and DNA samples from 122 trees in two different areas in the IDF-dk of British Columbia: multi-cohort stands (Cariboo-Chilcotin) with a history of mixed-severity fire and frequent MPB disturbances, and single-cohort stands (Logan Lake) with a history of stand-replacing (crown) fire and infrequent MPB disturbances. We used QuantiNemo to construct simulated populations of lodgepole pine at five different growth rates, and compared the statistical outputs to physical data, then ran a random forest analysis to shed light on sources of variation in serotiny. We also sequenced 39 SNPs, of which 23 failed or were monomorphic. The 16 informative SNPs were used to calculate HO and HE (observed and expected heterozygosity), which were included alongside genotypes for a second random forest analysis. Our best random forest model explained 33% of variation in serotiny, using simulation and physical variables. Our results highlight the need for more investigation into this matter, using more extensive approaches, and also consideration of alternative mechanisms of heredity such as epigenetics.
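
For readers unfamiliar with HO and HE, the sketch below shows one common way to compute observed and expected heterozygosity for a single SNP. It uses the uncorrected expected heterozygosity, 1 minus the sum of squared allele frequencies. The abstract does not say which estimator or software the authors used, and the genotypes are made up for illustration.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (H_O, H_E) for one SNP.

    genotypes: list of (allele, allele) tuples, one per diploid individual.
    H_O is the fraction of heterozygous individuals; H_E is
    1 - sum(p_i ** 2) over allele frequencies p_i (no small-sample correction).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("need at least one genotype")
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    h_exp = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    return h_obs, h_exp


# Hypothetical genotypes for four trees at one SNP.
example = [("A", "A"), ("A", "A"), ("A", "G"), ("G", "G")]
h_obs, h_exp = heterozygosity(example)
print(h_obs, h_exp)  # 0.25 0.46875
```

A monomorphic SNP gives HO = HE = 0 under this calculation, which is consistent with such markers being dropped as uninformative.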

Phylogenetic effective sample size

Krzysztof Bartoszek
doi: http://dx.doi.org/10.1101/023242

In this paper I address the question – how large is a phylogenetic sample? I propose a definition of a phylogenetic effective sample size for Brownian motion and Ornstein-Uhlenbeck processes – the regression effective sample size. I discuss how mutual information can be used to define an effective sample size in the non-normal process case and compare these two definitions to an existing concept of effective sample size (the mean effective sample size). Through a simulation study I find that the AICc is robust if one corrects for the number of species or effective number of species. Lastly I discuss how the concept of the phylogenetic effective sample size can be useful for biodiversity quantification, identification of interesting clades and deciding on the importance of phylogenetic correlations.
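
As a rough illustration of the existing concept mentioned above, the sketch below computes a mean effective sample size as the sum of the entries of the inverse correlation matrix of the tip values, 1ᵀR⁻¹1. That formula is an assumption taken from the wider literature, not from this abstract, and the regression and mutual-information definitions proposed in the paper are not reproduced here. The four-species tree and its correlations are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mean_effective_sample_size(correlation):
    """Return 1^T R^{-1} 1 for a tip correlation matrix R (assumed definition)."""
    ones = [1.0] * len(correlation)
    return sum(solve_linear_system(correlation, ones))


# Hypothetical tree ((A,B),(C,D)) under Brownian motion: sister species share
# half of the tree height (correlation 0.5), the two clades share nothing.
s = 0.5
tip_correlation = [
    [1.0, s, 0.0, 0.0],
    [s, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, s],
    [0.0, 0.0, s, 1.0],
]
print(mean_effective_sample_size(tip_correlation))  # about 2.67, fewer than 4 tips
```

With uncorrelated tips the value equals the number of species, and as sister species become perfectly correlated it approaches the number of independent clades (two in this example).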

A probabilistic method for identifying sex-linked genes using RNA-seq-derived genotyping data

Aline Muyle, Jos Käfer, Niklaus Zemp, Sylvain Mousset, Franck Picard, Gabriel AB Marais
doi: http://dx.doi.org/10.1101/023358

The genetic basis of sex determination remains unknown for the vast majority of organisms with separate sexes. A key question is whether a species has sex chromosomes (SC). SC presence indicates genetic sex determination, and their sequencing may help identify the sex-determining genes and understand the molecular mechanisms of sex determination. Identifying SC, especially homomorphic SC, can be difficult. Sequencing SC is also very challenging, in particular the repeat-rich non-recombining regions. A novel approach for identifying sex-linked genes and SC, which uses RNA-seq to genotype male and female individuals and study sex-linkage, has recently been proposed. This approach entails a modest sequencing effort and does not require prior genomic or genetic resources, and is thus particularly suited to studying non-model organisms. Applying this approach to many organisms is, however, difficult due to the lack of an appropriate statistically grounded pipeline to analyse the data. Here we propose a model-based method to infer sex-linkage using a maximum likelihood framework and genotyping data from a full-sib family, which can be obtained for most organisms that can be grown in the lab and for economically important animals/plants. Our method works on any type of SC (XY, ZW, UV) and has been embedded in a pipeline that includes a genotyper specifically developed for RNA-seq data. Validation on empirical and simulated data indicates that our pipeline is particularly relevant to study SC of recent or intermediate age but can return useful information in old systems as well; it is available as a Galaxy workflow.

Coalescence with background and balancing selection in systems with bi- and uniparental reproduction: contrasting partial asexuality and selfing

Aneil Agrawal, Matthew Hartfield
doi: http://dx.doi.org/10.1101/022996

Uniparental reproduction in diploids, via asexual reproduction or selfing, reduces the independence with which separate loci are transmitted across generations. This is expected to increase the extent to which a neutral marker is affected by selection elsewhere in the genome. Such effects have previously been quantified in coalescent models involving selfing. Here we examine the effects of background selection and balancing selection in diploids capable of both sexual and asexual reproduction (i.e., partial asexuality). We find that the effect of background selection on reducing coalescent time (and effective population size) can be orders of magnitude greater when rates of sex are low than when sex is common. This is because asexuality enhances the effects of background selection through both a recombination effect and a segregation effect. We show that there are several reasons that the strength of background selection differs between systems with partial asexuality and those with comparable levels of uniparental reproduction via selfing. Expectations for reductions in Ne via background selection have been verified using stochastic simulations. In contrast to background selection, balancing selection increases the coalescent time for a linked neutral site. With partial asexuality, the effect of balancing selection is somewhat dependent upon the mode of selection (e.g., heterozygote advantage vs. negative frequency-dependent selection) in a manner that does not apply to selfing. This is because the frequency of heterozygotes, which are required for recombination onto alternative genetic backgrounds, is more dependent on the pattern of selection with partial asexuality than with selfing.

ASTRID: Accurate Species TRees from Internode Distances

Pranjal Vachaspati, Tandy Warnow
doi: http://dx.doi.org/10.1101/023036

Background: Incomplete lineage sorting (ILS), modelled by the multi-species coalescent (MSC), is known to create discordance between gene trees and species trees, and to lead to inaccurate species tree estimates unless appropriate methods are used to estimate the species tree. While many statistically consistent methods have been developed to estimate the species tree in the presence of ILS, only ASTRAL-2 and NJst have been shown to have good accuracy on large datasets. Yet, NJst is generally slower and less accurate than ASTRAL-2, and cannot run on some datasets. Results: We have redesigned NJst to enable it to run on all datasets, and we have expanded its design space so that it can be used with different distance-based tree estimation methods. The resultant method, ASTRID, is statistically consistent under the MSC model, and has accuracy that is competitive with ASTRAL-2. Furthermore, ASTRID is much faster than ASTRAL-2, completing in minutes on some datasets for which ASTRAL-2 used hours. Conclusions: ASTRID is a new coalescent-based method for species tree estimation that is competitive with the best current method in terms of accuracy, while being much faster. ASTRID is available in open source form on GitHub.

Estimating K in Genetic Mixture Models

Robert Verity, Richard Nichols
doi: http://dx.doi.org/10.1101/022988

A key quantity in the analysis of structured populations is the parameter K, which describes the number of subpopulations that make up the total population. Inference of K ideally proceeds via the model evidence, which is equivalent to the likelihood of the model. However, the evidence in favour of a particular value of K cannot usually be computed exactly, and instead programs such as STRUCTURE make use of simple heuristic estimators to approximate this quantity. We show – using simulated data sets small enough that the true evidence can be computed exactly – that these simple heuristics often fail to estimate the true evidence, and that this can lead to incorrect conclusions about K. Our proposed solution is to use thermodynamic integration (TI) to estimate the model evidence. After outlining the TI methodology we demonstrate the effectiveness of this approach using a range of simulated data sets. We find that TI can be used to obtain estimates of the model evidence that are orders of magnitude more accurate and precise than those based on simple heuristics. Furthermore, estimates of K based on these values are found to be more reliable than those based on a suite of model comparison statistics. Our solution is implemented for models both with and without admixture in the software TrueK.
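
The identity behind thermodynamic integration can be checked on a toy model. For a power posterior proportional to likelihood^β × prior, the log evidence equals the integral over β from 0 to 1 of the expected log-likelihood under that power posterior. The sketch below is not the TrueK implementation: it uses a hypothetical model with four discrete parameter values so that both the expectations and the true evidence can be computed exactly, whereas real applications estimate each expectation by MCMC. All numbers and function names are illustrative assumptions.

```python
import math


def expected_log_lik(log_liks, log_priors, beta):
    """E[log L] under the power posterior proportional to L**beta * prior."""
    log_weights = [beta * ll + lp for ll, lp in zip(log_liks, log_priors)]
    shift = max(log_weights)  # subtract the max for numerical stability
    weights = [math.exp(w - shift) for w in log_weights]
    return sum(w * ll for w, ll in zip(weights, log_liks)) / sum(weights)


def ti_log_evidence(log_liks, log_priors, n_rungs=201):
    """Trapezoid-rule estimate of the integral of E_beta[log L] over [0, 1]."""
    betas = [i / (n_rungs - 1) for i in range(n_rungs)]
    values = [expected_log_lik(log_liks, log_priors, b) for b in betas]
    return sum(
        0.5 * (values[i] + values[i + 1]) * (betas[i + 1] - betas[i])
        for i in range(n_rungs - 1)
    )


def exact_log_evidence(log_liks, log_priors):
    """log of sum(prior * likelihood), computed with the log-sum-exp trick."""
    terms = [ll + lp for ll, lp in zip(log_liks, log_priors)]
    shift = max(terms)
    return shift + math.log(sum(math.exp(t - shift) for t in terms))


# Hypothetical log-likelihoods for four parameter values with a uniform prior.
toy_log_liks = [-12.0, -10.0, -9.5, -11.0]
toy_log_priors = [math.log(0.25)] * 4

print("TI estimate:", ti_log_evidence(toy_log_liks, toy_log_priors))
print("exact      :", exact_log_evidence(toy_log_liks, toy_log_priors))
```

To compare values of K, one such log evidence would be computed per candidate K and the values compared directly. The number of β rungs (201 here) is an arbitrary choice that controls the discretisation error of the integral.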

Adaptation to temporally fluctuating environments by the evolution of maternal effects

Snigdhadip Dey, Steve Proulx, Henrique Teotonio
doi: http://dx.doi.org/10.1101/023044

Most organisms live in ever-challenging temporally fluctuating environments. Theory suggests that the evolution of anticipatory (or deterministic) maternal effects underlies adaptation to environments that regularly fluctuate every other generation because of selection for increased offspring performance. Evolution of maternal bet-hedging reproductive strategies that randomize offspring phenotypes is in turn expected to underlie adaptation to irregularly fluctuating environments. Although maternal effects are ubiquitous, their adaptive significance is unknown since they can easily evolve as a correlated response to selection for increased maternal performance. Using the nematode Caenorhabditis elegans, we show the experimental evolution of maternal provisioning of offspring with glycogen, in populations facing a novel anoxia hatching environment every other generation. As expected with the evolution of deterministic maternal effects, improved embryo hatching survival under anoxia evolved at the expense of fecundity and glycogen provisioning when mothers experienced anoxia early in life. Unexpectedly, populations facing an irregularly fluctuating anoxia hatching environment failed to evolve maternal bet-hedging reproductive strategies. Instead, adaptation in these populations should have occurred through the evolution of balancing trade-offs over multiple generations, since they evolved reduced fitness over successive generations in anoxia but did not go extinct during experimental evolution. Mathematical modelling confirms our conclusion that adaptation to a wide range of patterns of environmental fluctuations hinges on the existence of deterministic maternal effects, and that they are generally much more likely to contribute to adaptation than maternal bet-hedging reproductive strategies.