Flexible isoform-level differential expression analysis with Ballgown

Alyssa C Frazee, Geo Pertea, Andrew E Jaffe, Ben Langmead, Steven L Salzberg, Jeffrey T Leek

We have built a statistical package called Ballgown for estimating differential expression of genes, transcripts, or exons from RNA sequencing experiments. Ballgown is designed to work with the popular Cufflinks transcript assembly software and uses well-motivated statistical methods to provide estimates of changes in expression. It permits statistical analysis at the transcript level for a wide variety of experimental designs, allows adjustment for confounders, and handles studies with continuous covariates. Ballgown provides improved statistical significance estimates as compared to the Cuffdiff differential expression tool included with Cufflinks. We demonstrate the flexibility of the Ballgown package by re-analyzing 667 samples from the GEUVADIS study to identify transcript-level eQTLs and identify non-linear artifacts in transcript data. Our package is freely available from: https://github.com/alyssafrazee/ballgown

The distribution of the quasispecies for the Wright-Fisher model on the sharp peak landscape

Joseba Dalmau
(Submitted on 27 Mar 2014)

We consider the classical Wright-Fisher model with mutation and selection. Mutations occur independently in each locus, and selection is performed according to the sharp peak landscape. In the asymptotic regime studied in [3], a quasispecies is formed. We find explicitly the distribution of this quasispecies, which turns out to be the same distribution as for the Moran model.
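The model described above can be illustrated with a small finite-population simulation. This is only a sketch under stated assumptions, not the asymptotic analysis of the paper: genotypes are taken to be binary strings, the master sequence is arbitrarily labeled as all zeros, and the names `ell` (number of loci), `sigma` (fitness of the master sequence, all others having fitness 1), `q` (per-locus mutation probability), and `m` (population size) are illustrative choices.

```python
import random


def wright_fisher_step(pop, ell, sigma, q, rng):
    """One generation: fitness-proportional resampling, then per-locus mutation.

    pop is a list of genotypes, each a tuple of 0/1 values of length ell.
    Sharp peak landscape (assumed labeling): the all-zeros master sequence
    has fitness sigma, every other genotype has fitness 1.
    """
    master = (0,) * ell
    weights = [sigma if g == master else 1.0 for g in pop]
    # Selection: each offspring picks a parent with probability
    # proportional to the parent's fitness.
    parents = rng.choices(pop, weights=weights, k=len(pop))
    # Mutation: each locus flips independently with probability q.
    return [tuple(b ^ (rng.random() < q) for b in g) for g in parents]


def simulate(m, ell, sigma, q, generations, seed=0):
    """Run the chain from an all-master population.

    Returns counts[k] = number of individuals at Hamming distance k
    from the master sequence after the last generation.
    """
    rng = random.Random(seed)
    pop = [(0,) * ell for _ in range(m)]
    for _ in range(generations):
        pop = wright_fisher_step(pop, ell, sigma, q, rng)
    counts = [0] * (ell + 1)
    for g in pop:
        counts[sum(g)] += 1
    return counts


if __name__ == "__main__":
    print(simulate(m=200, ell=10, sigma=3.0, q=0.02, generations=100))
```

Grouping individuals by Hamming distance from the master sequence is natural here because fitness on the sharp peak landscape depends only on whether that distance is zero.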

Selscan: an efficient multi-threaded program to perform EHH-based scans for positive selection

Zachary A Szpiech, Ryan D Hernandez
(Submitted on 26 Mar 2014)

Haplotype-based scans to detect natural selection are useful to identify recent or ongoing positive selection in genomes. As both real and simulated genomic datasets grow larger, spanning thousands of samples and millions of markers, there is a need for a fast and efficient implementation of these scans for general use. Here we present selscan, an efficient multi-threaded application that implements Extended Haplotype Homozygosity (EHH), Integrated Haplotype Score (iHS), and Cross-population Extended Haplotype Homozygosity (XPEHH). selscan performs extremely well on both simulated and real data and is over an order of magnitude faster than existing available implementations. It calculates iHS on chromosome 22 (22,147 loci) across 204 CEU haplotypes in 502s on one thread (77s on 16 threads) and calculates XPEHH for the same data relative to 210 YRI haplotypes in 907s on one thread (107s on 16 threads). Source code and binaries (Windows, OSX and Linux) are available at this https URL.
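For readers unfamiliar with the statistic, EHH can be sketched in a few lines. The example below is not selscan's code or interface; it assumes the commonly used pairwise definition, in which EHH is the probability that two haplotypes drawn at random from those carrying the same core allele are identical over the whole interval from the core locus to a given marker. The function name `ehh` and its arguments are hypothetical.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, end):
    """Extended haplotype homozygosity between marker indices core and end.

    haplotypes: equal-length sequences (e.g. strings of '0'/'1'), assumed
    to all carry the same allele at the core locus.
    Returns the fraction of haplotype pairs that are identical at every
    marker from core to end, inclusive.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    lo, hi = min(core, end), max(core, end)
    # Group haplotypes by their allelic state over the interval.
    groups = Counter(tuple(h[lo:hi + 1]) for h in haplotypes)
    # Identical pairs within each group, over all possible pairs.
    return sum(comb(c, 2) for c in groups.values()) / comb(n, 2)


if __name__ == "__main__":
    haps = ["0101", "0101", "0111", "0100"]  # all carry '1' at index 1
    for end in range(4):
        print(end, ehh(haps, core=1, end=end))
```

EHH equals 1 at the core itself and can only stay the same or decrease as the interval is extended, because extending the interval can only split groups of identical haplotypes. Statistics such as iHS and XPEHH are built from these decay curves.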

Weedy Adaptation in Setaria spp.: VI. S. faberi Seed hull shape as soil germination signal antenna

J.L. Donnelly, D.C. Adams, J. Dekker
(Submitted on 27 Mar 2014)

Ecological selection forces for weedy and domesticated traits have influenced the evolution of seed shape in Setaria, resulting in similarity in seed shape that reflects similarity in ecological function rather than phylogenetic relatedness. We examined seeds from two diploid subspecies of Setaria viridis, consisting of one weedy subspecies and two races of the domesticated subspecies, and from four other polyploid weedy species of Setaria. We quantified seed shape from the silhouettes of the seeds in two separate views. Differences in shape were compared to ecological role (weed vs. crop), and the evolutionary trajectory of shape change by phylogenetic grouping from a single reference species was calculated. Idealized three-dimensional models were created to examine the differences in shape relative to surface area and volume. All populations were significantly different in shape, with crops easily distinguished from weeds, regardless of relatedness between the taxa. Trajectory of shape change varied by view, but separated crops from weeds and phylogenetic groupings. Three-dimensional models gave further evidence of differences in shape reflecting adaptation for environmental exploitation. The selective forces for weedy and domesticated traits have exceeded phylogenetic constraints, resulting in seed shape similarity due to ecological role rather than phylogenetic relatedness. Seed shape and surface-to-volume ratio likely reflect the importance of the water film that accumulates on the seed surface when in contact with soil particles. Seed shape may also be a mechanism of niche separation between taxa.

Multidimensional epistasis and the transitory advantage of sex

Stefan Nowak, Johannes Neidhart, Ivan G. Szendro, Joachim Krug
(Submitted on 25 Mar 2014)

Identifying and quantifying the benefits of sex and recombination is a long-standing problem in evolutionary theory. In particular, contradictory claims have been made about the existence of a benefit of recombination on high-dimensional fitness landscapes in the presence of sign epistasis. Here we present a comparative numerical study of sexual and asexual evolutionary dynamics on tunably rugged model landscapes, paying special attention to the temporal development of the evolutionary advantage of recombination and the link between population diversity and the rate of adaptation. We show that the adaptive advantage of recombination on static rugged landscapes is strictly transitory. At early times an advantage of recombination through the Fisher-Muller effect is generally observed, but this effect is reversed at longer times by the much more efficient trapping of recombining populations at local fitness peaks. These findings are explained by means of well-established results for a setup with only two loci. In accordance with the Red Queen hypothesis, the transitory advantage can be prolonged indefinitely in fluctuating environments, and it is maximal when the environment fluctuates on the same time scale on which trapping at local optima typically occurs.

Range Expansion of Heterogeneous Populations

Matthias Reiter (1), Steffen Rulands (1), Erwin Frey (1 contributed equally)
(Submitted on 25 Mar 2014)

Risk spreading in bacterial populations is generally regarded as a strategy to maximize survival. Here, we study its role during range expansion of a genetically diverse population where growth and motility are two alternative traits. We find that during the initial expansion phase fast growing cells do have a selective advantage. By contrast, asymptotically, generalists balancing motility and reproduction are evolutionarily most successful. These findings are rationalized by a set of coupled Fisher equations complemented by stochastic simulations.

High burden of private mutations due to explosive human population growth and purifying selection

Feng Gao, Alon Keinan
(Submitted on 22 Mar 2014)

Recent studies have shown that human populations have experienced a complex demographic history, including a recent epoch of rapid population growth that led to an excess in the proportion of rare genetic variants in humans today. This excess can impact the burden of private mutations for each individual, defined here as the proportion of heterozygous variants in each newly sequenced individual that are novel compared to another large sample of sequenced individuals. We calculated the burden of private mutations predicted by different demographic models, and compared it with empirical estimates based on data from the NHLBI Exome Sequencing Project and data from the Neutral Regions (NR) dataset. We observed a significant excess in the proportion of private mutations in the empirical data compared with models of demographic history without a recent epoch of population growth. Incorporating recent growth into the model provides a much improved fit to empirical observations. This phenomenon becomes more marked for larger sample sizes. The proportion of private mutations is additionally increased by purifying selection, which differentially affects mutations of different functional annotations. These results have important implications for the design and analysis of sequencing-based association studies of complex human disease as they pertain to private and very rare variants.
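The definition of the burden of private mutations given above can be written as a short computation. This is a minimal sketch of the definition only, not the authors' pipeline or their demographic modeling; the function name `private_burden` and the representation of variants as hashable identifiers are assumptions made for illustration.

```python
def private_burden(het_variants, reference_variants):
    """Proportion of an individual's heterozygous variants that are novel.

    het_variants: identifiers of heterozygous variants found in the newly
    sequenced individual.
    reference_variants: identifiers of all variants observed in the large
    comparison sample.
    """
    het = set(het_variants)
    if not het:
        raise ValueError("individual has no heterozygous variants")
    # A variant is private if it is absent from the comparison sample.
    novel = het - set(reference_variants)
    return len(novel) / len(het)


if __name__ == "__main__":
    individual = {"v1", "v2", "v3", "v4"}  # made-up identifiers
    reference = {"v1", "v2", "v9"}
    print(private_burden(individual, reference))
```

For a fixed individual, this proportion can only stay the same or fall as more variants are added to the comparison sample, so the size of that sample has to be stated alongside any reported value.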