AGOUTI: improving genome assembly and annotation using transcriptome data

Simo V Zhang, Luting Zhuo, Matthew W Hahn

Coevolution leaves a stronger imprint on interactions than on community structure

Timothee Poisot, Daniel Stouffer

The role of deleterious substitutions in crop genomes

Thomas J Y Kono, Fengli Fu, Mohsen Mohammadi, Paul J Hoffman, Chaochih Liu, Robert M Stupar, Kevin P Smith, Peter Tiffin, Justin C Fay, Peter L Morrell

Adaptive Protein Evolution in Animals and the Effective Population Size Hypothesis

Nicolas Galtier

Efficient coalescent simulation and genealogical analysis for large sample sizes

Jerome Kelleher, Gil McVean, Alison M Etheridge

Bayesian identification of bacterial strains from sequencing data

Aravind Sankar, Brandon Malone, Sion Bayliss, Ben Pascoe, Guillaume Méric, Matthew D. Hitchings, Samuel K. Sheppard, Edward J. Feil, Jukka Corander, Antti Honkela

Recent advances in DNA sequencing technology have made it possible to rapidly assay the diversity of a bacterial species present in a sample obtained from a hospital patient or an environmental source. For several applications it is important to accurately identify the presence, and estimate the relative abundances, of the target organisms from short sequence reads obtained from a sample. This task is particularly challenging when the set of interest includes very closely related organisms, such as different strains of pathogenic bacteria, which can vary considerably in virulence, resistance and spread. Using advanced Bayesian statistical modelling and computation techniques, we introduce a novel method for bacterial identification that is shown to outperform the currently leading pipeline for this purpose. Our approach enables fast and accurate sequence-based identification of bacterial strains while using only modest computational resources. Hence it provides a useful tool for a wide spectrum of applications, including rapid clinical diagnostics to distinguish among closely related strains causing nosocomial infections. The software implementation is available at this https URL

Hot RAD: A Tool for Analysis of Next-Gen RAD Tag Data

Lauren A. Assour, Nicholas LaRosa, Scott J. Emrich

Restriction site Associated DNA (RAD) tagging (also known as RAD-seq, among other names) is an emerging method for analyzing an organism’s genome without completely sequencing it. It can be applied to a non-model organism that lacks a reference genome, though this raises the problem of how to begin data analysis on unmapped and unannotated reads. Our program, Hot RAD, presents a straightforward and easy-to-use method that takes raw, RAD-tagged Illumina data and produces consensus contigs or sequence stacks using a distributed framework, creating a basis on which to begin analyzing an organism’s DNA. The graphical user interface (GUI) of our tool makes it easy for those not familiar with the command line to take raw sequence files and produce usable data in a timely manner.