SimPhy: Phylogenomic Simulation of Gene, Locus and Species Trees

Diego Mallo, Leonardo de Oliveira Martins, David Posada
doi: http://dx.doi.org/10.1101/021709
We present SimPhy, a fast and flexible software package for simulating multiple gene families evolving under incomplete lineage sorting, gene duplication and loss, and horizontal gene transfer (all three potentially leading to species tree/gene tree discordance), as well as gene conversion. SimPhy implements a hierarchical phylogenetic model in which the evolution of species, locus and gene trees is governed by global and local parameters (e.g., genome-wide, species-specific, locus-specific) that can be fixed or sampled from a priori statistical distributions. SimPhy also incorporates comprehensive models of substitution rate variation among lineages (uncorrelated relaxed clocks) and can simulate partitioned nucleotide, codon and protein multilocus sequence alignments under a wide range of substitution models using the program INDELible. We validate SimPhy’s output using theoretical expectations and other programs, and show that it scales extremely well with complex models and/or large trees, being an order of magnitude faster than the most similar program (DLCoal-Sim). In addition, we demonstrate how SimPhy can help to understand interactions among different evolutionary processes by conducting a simulation study that characterizes the systematic overestimation of the duplication time when using standard reconciliation methods. SimPhy is available at https://github.com/adamallo/SimPhy, where users can find the source code, pre-compiled executables, a detailed manual and example cases.

Mendelian randomization: a premature burial?

George Davey Smith
doi: http://dx.doi.org/10.1101/021386
Mendelian randomization is a promising approach to help improve causal inference in observational studies, with widespread potential applications, including the prioritization of pharmacotherapeutic targets for evaluation in RCTs. From its initial proposal, the limitations of Mendelian randomization approaches have been widely recognised and discussed, and recently Pickrell has reiterated these [1]. However, this critique did not acknowledge recent developments in both methodological and empirical research, nor did it recognise many future opportunities for application of the Mendelian randomization approach. These issues are briefly reviewed here.

Evolution in spatial and spatiotemporal variable metapopulations changes a herbivore’s host plant range

Annelies De Roissart, Nicky Wybouw, David Renault, Thomas Van Leeuwen, Dries Bonte
doi: http://dx.doi.org/10.1101/021683

The persistence and dynamics of populations largely depend on the way they are configured and integrated into space and the ensuing eco-evolutionary dynamics. We manipulated spatial and temporal variation in patch size in replicated experimental metapopulations of the herbivore mite Tetranychus urticae. Evolution over approximately 30 generations in the spatially and spatiotemporally variable metapopulations induced a significant divergence in life history traits, physiological endpoints and gene expression, but also a remarkable convergence relative to the stable reference patchy metapopulation in traits related to size and fecundity and in its transcriptional regulation. The observed evolutionary dynamics are tightly linked to demographic changes, more specifically frequent episodes of resource shortage, and increased the reproductive performance of mites on tomato, a challenging host plant. This points towards a general, adaptive stress response in stable spatial variable and spatiotemporal variable metapopulations that pre-adapts a herbivore arthropod to novel environmental stressors.

Collective Fluctuations in models of adaptation

Oskar Hallatschek, Lukas Geyrhofer
Subjects: Populations and Evolution (q-bio.PE); Statistical Mechanics (cond-mat.stat-mech); Biological Physics (physics.bio-ph)

The dynamics of adaptation is difficult to predict because it is highly stochastic even in large populations. The uncertainty emerges from number fluctuations, called genetic drift, arising in the small number of particularly fit individuals of the population. Random genetic drift in this evolutionary vanguard also limits the speed of adaptation, which diverges in deterministic models that ignore these chance effects. Several approaches have been developed to analyze the crucial role of noise in the expected dynamics of adaptation, including the mean fitness of the entire population, or the fate of newly arising beneficial or deleterious mutations. However, very little is known about how genetic drift causes fluctuations to emerge on the population level, including fitness distribution variations and speed variations. Yet, these phenomena control the replicability of evolution experiments and are key to a truly predictive understanding of evolutionary processes. Here, we develop an exact approach to these emergent fluctuations by a combination of computational and analytical methods. We show, analytically, that the infinite hierarchy of moment equations can be closed at any order by a suitable choice of a dynamical constraint. This constraint regulates (rather than fixes) the population size, accounting for resource limitations. The resulting linear equations, which can be accurately solved numerically, exhibit fluctuation-induced terms that amplify short-distance correlations and suppress long-distance ones. Importantly, by accounting for the dynamics of sub-populations, we provide a systematic route to key population genetic quantities, such as fixation probabilities and decay rates of the genetic diversity.

BGT: efficient and flexible genotype query across many samples

Heng Li
Subjects: Genomics (q-bio.GN)

Summary: BGT is a compact format, a fast command line tool and a simple web application for efficient and convenient query of whole-genome genotypes and frequencies across tens to hundreds of thousands of samples. On real data, it encodes the haplotypes of 32,488 samples across 39.2 million SNPs into a 7.4GB database and decodes a couple of hundred million genotypes per CPU second. The high performance enables real-time responses to complex queries.
Availability and implementation: https://github.com/lh3/bgt
Contact: hengli@broadinstitute.org
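
To put the figures quoted in the summary in perspective, the short calculation below estimates the average storage cost per genotype. It is only a back-of-the-envelope sketch: it assumes that "7.4GB" means decimal gigabytes and that the whole database is attributable to genotype data, neither of which is stated in the summary. The variable names are illustrative.

```python
# Figures quoted in the BGT summary above.
samples = 32_488
snps = 39.2e6
db_bytes = 7.4e9  # assumption: "7.4GB" read as 7.4 * 10^9 bytes

# One diploid genotype per sample per SNP.
genotypes = samples * snps

# Average storage cost, assuming the whole database holds genotype data.
bits_per_genotype = db_bytes * 8 / genotypes
bytes_per_snp = db_bytes / snps

print(f"{genotypes:.3e} genotypes in total")
print(f"{bits_per_genotype:.3f} bits per genotype on average")
print(f"{bytes_per_snp:.0f} bytes per SNP across all samples")
```

Under these assumptions the average comes out well below one bit per genotype, which illustrates why the summary calls the format compact.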

The power of single molecule real-time sequencing technology in the de novo assembly of a eukaryotic genome

Hiroaki Sakai, Naito Ken, Eri Ogiso-Tanaka, Yu Takahashi, Kohtaro Iseki, Chiaki Muto, Kazuhito Satou, Kuniko Teruya, Akino Shiroma, Makiko Shimoji, Takashi Hirano, Takeshi Itoh, Akito Kaga, Norihiko Tomooka
doi: http://dx.doi.org/10.1101/021634

Second-generation sequencers (SGS) have been game-changing, achieving cost-effective whole genome sequencing in many non-model organisms. However, a large portion of these genomes still remains unassembled. We reconstructed the azuki bean (Vigna angularis) genome using single molecule real-time (SMRT) sequencing technology and achieved the best contiguity and coverage among currently assembled legume crops. The SMRT-based assembly produced contigs 100 times longer, with a 100 times smaller amount of gaps, than the SGS-based assemblies. A detailed comparison between the assemblies revealed that the SMRT-based assembly enabled a more comprehensive gene annotation than the SGS-based assemblies, in which thousands of genes were missing or fragmented. A chromosome-scale assembly was generated based on the high-density genetic map, covering 86% of the azuki bean genome. We demonstrated that SMRT technology, though it still needs to be assisted by SGS data, can achieve a near-complete assembly of a eukaryotic genome.

Evolution of organismal stoichiometry in a 50,000-generation experiment with Escherichia coli

Caroline B. Turner, Brian D. Wade, Justin R. Meyer, Richard E. Lenski
doi: http://dx.doi.org/10.1101/021360

Organismal stoichiometry refers to the relative proportion of chemical elements in the biomass of organisms, and it can have important effects on ecological interactions from population to ecosystem scales. Although stoichiometry has been studied extensively from an ecological perspective, little is known about rates and directions of evolutionary changes in elemental composition in response to nutrient limitation. We measured carbon, nitrogen, and phosphorus content of Escherichia coli evolved under controlled carbon-limited conditions for 50,000 generations. The bacteria evolved higher relative nitrogen and phosphorus content, consistent with selection for increased use of the more abundant elements. Total carbon assimilated also increased, indicating more efficient use of the limiting element. Altogether, our study shows that stoichiometry evolved over a relatively short time-period, and that it did so in a predictable direction given the carbon-limiting environment.