Evolution of molecular phenotypes under stabilizing selection

Armita Nourmohammad, Stephan Schiffels, Michael Laessig
(Submitted on 17 Jan 2013)

Molecular phenotypes are important links between genomic information and organismic functions, fitness, and evolution. Complex phenotypes, which are also called quantitative traits, often depend on multiple genomic loci. Their evolution builds on genome evolution in a complicated way, which involves selection, genetic drift, mutations and recombination. Here we develop a coarse-grained evolutionary statistics for phenotypes, which decouples from details of the underlying genotypes. We derive approximate evolution equations for the distribution of phenotype values within and across populations. This dynamics covers evolutionary processes at high and low recombination rates, that is, it applies to sexual and asexual populations. In a fitness landscape with a single optimal phenotype value, the phenotypic diversity within populations and the divergence between populations reach evolutionary equilibria, which describe stabilizing selection. We compute the equilibrium distributions of both quantities analytically and we show that the ratio of mean divergence and diversity depends on the strength of selection in a universal way: it is largely independent of the phenotype’s genomic encoding and of the recombination rate. This establishes a new method for the inference of selection on molecular phenotypes beyond the genome level. We discuss the implications of our findings for the predictability of evolutionary processes.

Efficient Identification of Equivalences in Dynamic Graphs and Pedigree Structures

Hoyt Koepke, Elizabeth Thompson
(Submitted on 16 Jan 2013)

We propose a new framework for designing test and query functions for complex structures that vary across a given parameter such as genetic marker position. The operations we are interested in include equality testing, set operations, isolating unique states, duplication counting, or finding equivalence classes under identifiability constraints. A motivating application is locating equivalence classes in identity-by-descent (IBD) graphs, graph structures in pedigree analysis that change over genetic marker location. The nodes of these graphs are unlabeled and identified only by their connecting edges, a constraint easily handled by our approach. The general framework introduced is powerful enough to build a range of testing functions for IBD graphs, dynamic populations, and other structures using a minimal set of operations. The theoretical and algorithmic properties of our approach are analyzed and proved. Computational results on several simulations demonstrate the effectiveness of our approach.

Mandated data archiving greatly improves access to research data

Timothy H. Vines, Rose L. Andrew, Dan G. Bock, Michelle T. Franklin, Kimberly J. Gilbert, Nolan C. Kane, Jean-Sébastien Moore, Brook T. Moyers, Sébastien Renaut, Diana J. Rennison, Thor Veen, Sam Yeaman
(Submitted on 16 Jan 2013)

The data underlying scientific papers should be accessible to researchers both now and in the future, but how best can we ensure that these data are available? Here we examine the effectiveness of four approaches to data archiving: no stated archiving policy, recommending (but not requiring) archiving, and two versions of mandating data deposition at acceptance. We control for differences between data types by trying to obtain data from papers that use a single, widespread population genetic analysis, STRUCTURE. At one extreme, we found that mandated data archiving policies that require the inclusion of a data availability statement in the manuscript improve the odds of finding the data online almost a thousand-fold compared to having no policy. However, archiving rates at journals with less stringent policies were only very slightly higher than those with no policy at all. We also assessed the effectiveness of asking for data directly from authors and obtained over half of the requested datasets, albeit with about 8 days delay and some disagreement with authors. Given the long term benefits of data accessibility to the academic community, we believe that journal based mandatory data archiving policies and mandatory data availability statements should be more widely adopted.

Loss of amyloid disaggregases during the evolution of Metazoa

Albert Erives, Jan Fassler
(Submitted on 15 Jan 2013)

In yeast, phenotypic adaptations can evolve by natural selection of conformational variant prions and their variant amyloid fibers. This system requires the Hsp104 disaggregase, which fragments amyloid fibers into smaller seed prions that are passed on to mitotic descendants and meiotic spores. Interestingly, Hsp104 is found in diverse eukaryotes except metazoans. To investigate whether a prion-based transmission “genetics” was incompatible with the evolution of Metazoa, we identify genes conserved in fungi and choanoflagellates but lost in animals. We show that both eukaryotic clpB amyloid disaggregases, HSP104 and its nuclear-encoded mitochondrial endo-ortholog HSP78, were lost in the stem-metazoan lineage along with only a small number of other relevant genes. We show that these gene losses are not unrelated historical accidents because these loci comprise a very small regulon devoted to prion transmission in yeast. We propose that evolution of developmental asymmetric cell-specifications necessitated the evolutionary deprecation of the ancient clpB system.

Strong Purifying Selection at Synonymous Sites in D. melanogaster

David S. Lawrie, Philipp W. Messer, Ruth Hershberg, Dmitri A. Petrov
(Submitted on 15 Jan 2013)

Synonymous sites are generally assumed to be subject to weak selective constraint. For this reason, they are often neglected as a possible source of important functional variation. We use site frequency spectra from deep population sequencing data to show that, contrary to this expectation, 22% of four-fold synonymous (4D) sites in D. melanogaster evolve under very strong selective constraint while few, if any, appear to be under weak constraint. Linking polymorphism with divergence data, we further find that the fraction of synonymous sites exposed to strong purifying selection is higher for those positions that show slower evolution on the Drosophila phylogeny. The function underlying the inferred strong constraint appears to be separate from splicing enhancers, nucleosome positioning, and the translational optimization generating canonical codon bias. The fraction of synonymous sites under strong constraint within a gene correlates well with gene expression, particularly in the mid-late embryo, pupae, and adult developmental stages. Genes enriched in strongly constrained synonymous sites tend to be particularly functionally important and are often involved in key developmental pathways. Given that the observed widespread constraint acting on synonymous sites is likely not limited to Drosophila, the role of synonymous sites in genetic disease and adaptation should be reevaluated.
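The site frequency spectrum mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' inference method; it only shows how an unfolded spectrum is tabulated from a sample. The function name `site_frequency_spectrum` and the toy `sample` matrix are hypothetical, and alleles are assumed to be coded 0 (ancestral) or 1 (derived). Strong purifying selection keeps deleterious variants rare, so it tends to shift weight toward the low-frequency classes of such a spectrum.

```python
from collections import Counter


def site_frequency_spectrum(haplotypes):
    """Return the unfolded SFS as a list where entry k is the number of
    sites at which exactly k sampled chromosomes carry the derived allele."""
    n = len(haplotypes)          # number of sampled chromosomes
    n_sites = len(haplotypes[0])  # number of sites
    # Count derived alleles (coded 1) at each site, then tally the counts.
    counts = Counter(sum(h[s] for h in haplotypes) for s in range(n_sites))
    # Entries 0 and n are monomorphic classes; 1..n-1 are segregating classes.
    return [counts.get(k, 0) for k in range(n + 1)]


# Hypothetical toy sample: 4 chromosomes (rows) by 5 sites (columns).
sample = [
    [0, 1, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0],
]
sfs = site_frequency_spectrum(sample)
print(sfs)
```

In this toy sample one site is a singleton, one is a doubleton, one is fixed for the derived allele, and two carry no derived alleles.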

Does your gene need a background check? How genetic background impacts the analysis of mutations, genes, and evolution

Chris H. Chandler, Sudarshan Chari, Ian Dworkin
(Submitted on 12 Jan 2013)

The premise of genetic analysis is that a causal link exists between phenotypic and allelic variation. Yet it has long been documented that mutant phenotypes are not a simple result of a single DNA lesion, but rather are due to interactions of the focal allele with other genes and the environment. Although an experimentally rigorous approach, focusing on individual mutations and isogenic control strains, has facilitated amazing progress within genetics and related fields, a glimpse back suggests that a vast complexity has been omitted from our current understanding of allelic effects. Armed with traditional genetic analyses and the foundational knowledge they have provided, we argue that the time and tools are ripe to return to the under-explored aspects of gene function and embrace the context-dependent nature of genetic effects. We assert that a broad understanding of genetic effects and the evolutionary dynamics of alleles requires identifying how mutational outcomes depend upon the wild-type genetic background. Furthermore, we discuss how best to exploit genetic background effects to broaden genetic research programs.

SLiM: Simulating Evolution with Selection and Linkage

Philipp W. Messer
(Submitted on 14 Jan 2013)

SLiM is an efficient forward population genetic simulation designed for studying the effects of linkage and selection on a chromosome-wide scale. The program can incorporate complex scenarios of demography and population substructure, various models for selection and dominance of new mutations, arbitrary gene and chromosomal structure, and user-defined recombination maps.
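To give a sense of what a forward simulation does, here is a minimal sketch of a haploid, single-locus Wright-Fisher model with selection. It is not SLiM and does not use SLiM's interface; it omits linkage, dominance, demography and recombination maps, which are the features the program is designed for. The function `wright_fisher` and its parameters (`pop_size`, `s`, `p0`, `generations`, `seed`) are assumptions made for illustration.

```python
import random


def wright_fisher(pop_size, s, p0, generations, seed=0):
    """Simulate the frequency of a mutant allele forward in time.

    pop_size: number of haploid individuals (constant)
    s: selection coefficient of the mutant (relative fitness 1 + s, s > -1)
    p0: initial mutant frequency
    Returns the list of frequencies, one per generation including the start.
    """
    rng = random.Random(seed)
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Selection: reweight the mutant frequency by relative fitness.
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
        # Drift: binomial sampling of the next generation.
        mutants = sum(1 for _ in range(pop_size) if rng.random() < p_sel)
        p = mutants / pop_size
        trajectory.append(p)
    return trajectory


# Hypothetical run: a beneficial mutant starting at 10% frequency.
traj = wright_fisher(pop_size=200, s=0.05, p0=0.1, generations=50, seed=1)
print(traj[-1])
```

Each generation applies selection deterministically and then resamples the population, so repeated runs with different seeds show how drift and selection jointly decide whether the mutant fixes or is lost.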

Consider public archiving for your dissertation

This guest post is by Carl Boettiger (@cboettig). Carl is a postdoc with interests in theoretical and applied ecology, evolution, and phylogenetics. He’s a supporter of open access and open science, and recently posted his PhD thesis to figshare (see discussion with him on the merits of theses on figshare and University archives here).

As researchers we spend an immense amount of time generating products other than papers. While we go to great lengths to see that our papers are published in just the right place to be seen by our colleagues (fretting about the different impact factors, perceived audience, editorial boards, open access policies, and many other factors that determine just how a paper will see the light of day), other products of our labors largely languish on forgotten hard drives from long ago.

Among the items that receive a considerable investment of blood, sweat and tears, not only in producing them but in formatting them just right, is the PhD dissertation. As much of this work will no doubt eventually make its way into various formal publications, if it hasn’t already, it is easy to view the process more as ritual than as practical, its only outcome being another dusty black cover to grace the darkest shelves of the University library and the office of any adviser over fifty. Yet dissertations have more practical uses than serving as bookends.

A dissertation is frequently the first place certain results see the light of day, and it may offer a more accessible introduction, with a more complete review of background material, than a published paper, thanks to the long-hand monograph style that seems to be out of vogue in the peer-reviewed literature. Dissertation acknowledgements often provide a wonderful snapshot of the toils of a PhD in recognizing contributions and support. And while the published results may appear only in journals requiring subscriptions, the author can almost always still release the original thesis as open access to gain the potential benefits of a larger readership.[1]

While some dissertations have been important references to me during my own PhD and beyond, they aren’t always easy to find; for me, authors’ webpages have been a more common source than University or publisher catalogs. Meanwhile, many other researchers do not even mention their dissertations on their own websites. Today, there are better and easier alternatives for sharing your dissertation.

An increasing recognition of other products of research has led to a proliferation of possible outlets for sharing research materials. Repositories such as arXiv and Figshare are indexed by Google Scholar, provide reliable persistent storage, and assign permanent identifiers or DOIs that make the work easy to cite or link to.

[1]: e.g. see:
1. Gargouri, Y. et al. Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research. PLoS ONE 5, e13636 (2010).
2. Eysenbach, G. Citation advantage of open access articles. PLoS Biology 4, e157 (2006).

Improving the Efficiency of Genomic Selection

Marco Scutari, Ian Mackay, David J. Balding
(Submitted on 10 Jan 2013)

We investigate two approaches to increase the efficiency of phenotypic prediction from genome-wide markers, which is a key step for genomic selection (GS) in plant and animal breeding. The first approach is feature selection based on Markov blankets, which provide a theoretically-sound framework for identifying non-informative markers. Fitting GS models using only the informative markers results in simpler models, which may allow cost savings from reduced genotyping. We show that this is accompanied by no loss, and possibly a small gain, in predictive power for four GS models: partial least squares (PLS), ridge regression, LASSO and elastic net. The second approach is the choice of kinship coefficients for genomic best linear unbiased prediction (GBLUP). We compare kinships based on different combinations of centring and scaling of marker genotypes, and a newly proposed kinship measure that adjusts for linkage disequilibrium (LD). We illustrate the use of both approaches and examine their performances using three real-world data sets from plant and animal genetics. We find that elastic net with feature selection and GBLUP using LD-adjusted kinships performed similarly well, and were the best-performing methods in our study.
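As a rough illustration of what a kinship matrix built from centred and scaled marker genotypes looks like, the sketch below standardizes each marker across individuals and averages the cross-products. This is one common textbook construction, assumed here for illustration; it is not the LD-adjusted measure proposed in the paper, and the function name `kinship` and the toy `genotypes` are hypothetical. Genotypes are assumed to be allele counts (0, 1 or 2).

```python
def kinship(genotypes):
    """Kinship from centred and scaled markers: K = Z Z^T / m.

    genotypes: list of individuals, each a list of allele counts per marker.
    Markers with no variation are dropped because they cannot be scaled.
    """
    n = len(genotypes)
    columns = list(zip(*genotypes))
    means = [sum(col) / n for col in columns]
    sds = [
        (sum((x - mu) ** 2 for x in col) / n) ** 0.5
        for col, mu in zip(columns, means)
    ]
    keep = [j for j, sd in enumerate(sds) if sd > 0]
    if not keep:
        raise ValueError("no polymorphic markers")
    # Z: centred and scaled genotypes, one row per individual.
    z = [[(ind[j] - means[j]) / sds[j] for j in keep] for ind in genotypes]
    m = len(keep)
    return [[sum(a * b for a, b in zip(zi, zj)) / m for zj in z] for zi in z]


# Hypothetical toy data: 4 individuals by 3 markers.
genotypes = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [0, 2, 2],
]
K = kinship(genotypes)
for row in K:
    print([round(v, 3) for v in row])
```

With this scaling the matrix is symmetric, its rows sum to zero because the markers are centred, and its diagonal averages to one. Changing the centring or scaling step changes the resulting coefficients, which is the kind of choice the abstract compares.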

A genome-wide survey of genetic variation in gorillas using reduced representation sequencing

Aylwyn Scally, Bryndis Yngvadottir, Yali Xue, Qasim Ayub, Richard Durbin, Chris Tyler-Smith
(Submitted on 9 Jan 2013)

All non-human great apes are endangered in the wild, and it is therefore important to gain an understanding of their demography and genetic diversity. To date, however, genetic studies within these species have largely been confined to mitochondrial DNA and a small number of other loci. Here, we present a genome-wide survey of genetic variation in gorillas using a reduced representation sequencing approach, focusing on the two lowland subspecies. We identify 3,274,491 polymorphic sites in 14 individuals: 12 western lowland gorillas (Gorilla gorilla gorilla) and 2 eastern lowland gorillas (Gorilla beringei graueri). We find that the two species are genetically distinct, based on levels of heterozygosity and patterns of allele sharing. Focusing on the western lowland population, we observe evidence for population substructure, and a deficit of rare genetic variants suggesting a recent episode of population contraction. In western lowland gorillas, there is an elevation of variation towards telomeres and centromeres on the chromosomal scale. On a finer scale, we find substantial variation in genetic diversity, including a marked reduction close to the major histocompatibility locus, perhaps indicative of recent strong selection there. These findings suggest that despite their maintaining an overall level of genetic diversity equal to or greater than that of humans, population decline, perhaps associated with disease, has been a significant factor in recent and long-term pressures on wild gorilla populations.