Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences

Heng Li

Motivation: Single Molecule Real-Time (SMRT) sequencing technology and Oxford Nanopore Technologies (ONT) produce reads over 10 kbp in length, which have enabled high-quality genome assembly at an affordable cost. However, at present, long reads have an error rate as high as 10-15%. Complex and computationally intensive pipelines are required to assemble such reads.
Results: We present a new mapper, minimap, and a de novo assembler, miniasm, for efficiently mapping and assembling SMRT and ONT reads without an error correction stage. They can often assemble a sequencing run of bacterial data into a single contig in a few minutes, and assemble 45-fold C. elegans data in 9 minutes, orders of magnitude faster than the existing pipelines. We also introduce a pairwise read mapping format (PAF) and a graphical fragment assembly format (GFA), and demonstrate the interoperability between our tools and existing ones.
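
For readers unfamiliar with PAF, the following is a minimal sketch of reading one record. It is not taken from the abstract: the 12 tab-separated mandatory columns follow the publicly documented PAF layout, while the names `PafRecord`, `parse_paf_line` and `approx_identity`, and the example line itself, are illustrative assumptions.

```python
from typing import NamedTuple


class PafRecord(NamedTuple):
    """The 12 mandatory PAF columns (field names here are illustrative)."""
    query_name: str
    query_length: int
    query_start: int      # 0-based
    query_end: int
    strand: str           # "+" or "-"
    target_name: str
    target_length: int
    target_start: int     # 0-based
    target_end: int
    residue_matches: int  # number of matching bases in the mapping
    block_length: int     # total length of the mapping block
    mapping_quality: int  # 0-255, where 255 means "missing"


def parse_paf_line(line: str) -> PafRecord:
    """Parse one PAF line; optional tag columns after the 12th are ignored."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError(f"expected at least 12 columns, got {len(fields)}")
    return PafRecord(
        fields[0], int(fields[1]), int(fields[2]), int(fields[3]), fields[4],
        fields[5], int(fields[6]), int(fields[7]), int(fields[8]),
        int(fields[9]), int(fields[10]), int(fields[11]),
    )


def approx_identity(rec: PafRecord) -> float:
    """Rough identity estimate: matching bases divided by block length."""
    return rec.residue_matches / rec.block_length


# Made-up example record, not real mapper output.
example = "read1\t10000\t100\t9000\t+\tread2\t12000\t500\t9500\t7000\t9000\t255"
record = parse_paf_line(example)
print(record.query_name, record.target_name, round(approx_identity(record), 3))
```

The ratio computed by `approx_identity` is only a crude proxy. For an approximate mapper that does not perform base-level alignment, the matching-base count is itself an estimate.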

Mutation is a sufficient and robust predictor of genetic variation for mitotic spindle traits in C. elegans

Reza Farhadifar, Jose Miguel Ponciano, Erik C Andersen, Daniel J Needleman, Charles F Baer

Across-cohort QC analyses of genome-wide association study summary statistics from complex traits

Guo-Bo Chen, Sang Hong Lee, Matthew R Robinson, Maciej Trzaskowski, Zhi-Xiang Zhu, Thomas W Winkler, Felix R Day, Damien C Croteau-Chonka, Andrew R Wood, Adam E Locke, Zoltan Kutalik, Ruth J F Loos, Timothy M Frayling, Joel N Hirschhorn, Jian Yang, Naomi R Wray, GIANT, Peter M Visscher

Random and non-random mating populations: Evolutionary dynamics in meiotic drive

Bijan Sarkar

Game-theoretic tools are used to analyze a one-locus continuous selection model of sex-specific meiotic drive, taking into account the nonequivalence of the viabilities of reciprocal heterozygotes that may be observed at an imprinted locus. The model draws attention to the role of different types of viability selection in examining the stability of the polymorphic equilibrium. A bridge between population genetics and evolutionary game theory is built by applying the concept of the Fundamental Theorem of Natural Selection. In addition to pointing out the influences of male and female segregation ratios on selection, the configuration structure reveals some notable results, e.g., Hardy-Weinberg frequencies hold in replicator dynamics, faster evolution occurs when the variance in fitness is maximized, a mixed Evolutionarily Stable Strategy (ESS) exists in asymmetric games, and evolution tends toward not only a 1:1 sex ratio but also a 1:1 ratio of different alleles at a particular gene locus. Through the construction of replicator dynamics in the group selection framework, our selection model redefines the basis of game theory to incorporate non-random mating, where a mating parameter associated with population structure depends on the social structure. The model also shows that the number of polymorphic equilibria depends on the algebraic expression of the population structure.

Evolutionary dynamics of a quantitative trait in a finite asexual population.

Florence Debarre, Sarah Otto

Phenotypic robustness determines genetic regulation of complex traits

Anupama Yadav, Kaustubh Dhole, Himanshu Sinha

Trait evolution in adaptive radiations: modelling and measuring interspecific competition on phylogenies

Magnus Clarke, Gavin H Thomas, Robert P Freckleton

A New Statistical Framework for Genetic Pleiotropic Analysis of High Dimensional Phenotype Data

Panpan Wang, Mohammad Rahman, Li Jin, Momiao Xiong
(Submitted on 3 Dec 2015)

Widely used genetic pleiotropic analyses of multiple phenotypes are often designed for examining the relationship between common variants and a few phenotypes. They are not suited to data that combine high dimensional phenotypes with high dimensional genotypes (next-generation sequencing). To overcome these limitations, we develop sparse structural equation models (SEMs) as a general framework for a new paradigm of genetic analysis of multiple phenotypes. To incorporate both common and rare variants into the analysis, we extend the traditional multivariate SEMs to sparse functional SEMs. To deal with high dimensional phenotype and genotype data, we employ functional data analysis and the alternating direction method of multipliers (ADMM) to reduce data dimension and improve computational efficiency. Using large-scale simulations, we show that the proposed methods have higher power to detect true causal genetic pleiotropic structure than other existing methods. Simulations also demonstrate that the gene-based pleiotropic analysis has higher power than the single variant-based pleiotropic analysis. The proposed method is applied to exome sequence data from the NHLBI Exome Sequencing Project (ESP) with 11 phenotypes, which identifies a network with 137 genes connected to 11 phenotypes and 341 edges. Among them, 114 genes showed pleiotropic genetic effects and 45 genes were reported to be associated with phenotypes in the analysis or other cardiovascular disease (CVD) related phenotypes in the literature.

Bayesian non-parametric inference for Λ-coalescents: consistency and a parametric method

Jere Koskela, Paul A. Jenkins, Dario Spanò
(Submitted on 3 Dec 2015)

We investigate Bayesian non-parametric inference for Λ-coalescent processes parametrised by probability measures on the unit interval, and provide an implementable, provably consistent MCMC inference algorithm. We give verifiable criteria on the prior for posterior consistency when observations form a time series, and prove that any non-trivial prior is inconsistent when all observations are contemporaneous. We then show that the likelihood given a data set of size n∈ℕ is constant across Λ-measures whose leading n−2 moments agree, and focus on inferring truncated sequences of moments. We provide a large class of functionals which can be extremised using finite computation given a credibility region of posterior truncated moment sequences, and a pseudo-marginal Metropolis-Hastings algorithm for sampling the posterior. Finally, we compare the efficiency of the exact and noisy pseudo-marginal algorithms with and without delayed acceptance acceleration using a simulation study.

Efficient recycled algorithms for quantitative trait models on phylogenies

Gordon Hiscott, Colin Fox, Matthew Parry, David Bryant
(Submitted on 2 Dec 2015)

We present an efficient and flexible method for computing likelihoods of phenotypic traits on a phylogeny. The method does not resort to Monte-Carlo computation but instead blends Felsenstein’s discrete character pruning algorithm with methods for numerical quadrature. It is not limited to Gaussian models and adapts readily to model uncertainty in the observed trait values. We demonstrate the framework by developing efficient algorithms for likelihood calculation and ancestral state reconstruction under Wright’s threshold model, applying our methods to a dataset of trait data for extrafloral nectaries (EFNs) across a phylogeny of 839 Fabales species.
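
As background for the pruning step mentioned above, here is a minimal sketch of Felsenstein's pruning algorithm for a discrete character. It is not the authors' method: it covers only the classical discrete recursion, under an assumed symmetric two-state Markov model, and leaves out the numerical quadrature that the abstract combines it with. The tree encoding, function names and branch lengths are all illustrative assumptions.

```python
import math


def transition(t, rate=1.0):
    """2x2 transition matrix of a symmetric two-state Markov model over time t."""
    same = 0.5 + 0.5 * math.exp(-2.0 * rate * t)
    diff = 1.0 - same
    return [[same, diff], [diff, same]]


def partial_likelihoods(node, tip_states):
    """Return [L(0), L(1)]: probability of the tip data below `node`
    conditional on `node` being in state 0 or 1.

    A tip is a string (its name); an internal node is a list of
    (child, branch_length) pairs.
    """
    if isinstance(node, str):
        observed = tip_states[node]
        return [1.0 if s == observed else 0.0 for s in (0, 1)]
    result = [1.0, 1.0]
    for child, branch_length in node:
        child_l = partial_likelihoods(child, tip_states)
        p = transition(branch_length)
        for s in (0, 1):
            # Sum over the child's possible states, then multiply across children.
            result[s] *= sum(p[s][c] * child_l[c] for c in (0, 1))
    return result


def tree_likelihood(tree, tip_states, root_prior=(0.5, 0.5)):
    """Likelihood of the tip states, averaging over the root state."""
    root_l = partial_likelihoods(tree, tip_states)
    return sum(pi * li for pi, li in zip(root_prior, root_l))


# Hypothetical three-tip tree ((A:0.1, B:0.1):0.2, C:0.3).
tree = [([("A", 0.1), ("B", 0.1)], 0.2), ("C", 0.3)]
print(tree_likelihood(tree, {"A": 0, "B": 0, "C": 1}))
```

Each node is visited once, so the cost grows linearly with the number of tips. For a continuous trait, the inner sum over child states would become an integral, which is presumably where numerical quadrature comes in.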