A general approximation for the dynamics of quantitative traits

Katarína Boďová, Gašper Tkačik, Nicholas H. Barton

Selection, mutation and random drift affect the dynamics of allele frequencies and consequently of quantitative traits. While the macroscopic dynamics of quantitative traits can be measured, the underlying allele frequencies are typically unobserved. Can we understand how the macroscopic observables evolve without following these microscopic processes? The problem has previously been studied by analogy with statistical mechanics: the allele frequency distribution at each time is approximated by the stationary form, which maximises entropy. We explore the limitations of this method when mutation is small (4Nμ<1), so that populations are typically close to fixation, and we extend the theory in this regime to account for changes in mutation strength. We consider a single diallelic locus under either directional selection or over-dominance, and then generalise to multiple unlinked biallelic loci with unequal effects. We find that the maximum entropy approximation is remarkably accurate, even when mutation and selection change rapidly.

Rawcopy: Improved copy number analysis with Affymetrix arrays

Markus Mayrhofer, Bjorn Viklund, Anders Isaksson

Homomorphic ZW Chromosomes in a Wild Strawberry Show Distinctive Recombination Heterogeneity but a Small Sex-Determining Region

Jacob Tennessen, Rajanikanth Govindarajulu, Aaron Liston, Tia-Lynn Ashman

Inferring chimpanzee Y chromosome history and amplicon diversity from whole genome sequencing

Matthew Oetjens, Feichen Shen, Zhengting Zou, Jeffrey Kidd

Flowr: Robust and efficient pipelines using a simple language-agnostic approach

Sahil Seth, Samir Amin, Xingzhi Song, Xizeng Mao, Huandong Sun, Andrew Futreal, Jianhua Zhang

Machine learning for metagenomics: methods and tools

Hayssam Soueidan, Macha Nikolski

While genomics is the research field concerned with the study of the genome of any organism, metagenomics is the term for research that focuses on many genomes at the same time, as is typical in some areas of environmental study. Metagenomics calls for computational methods that make it possible to understand the genetic composition and activities of communities of species so complex that they can only be sampled, never completely characterized.
Machine learning currently offers some of the most computationally efficient tools for building predictive models to classify biological data. Biological applications span the entire spectrum of machine learning problems, including supervised learning, unsupervised learning (or clustering), and model construction. Moreover, most biological data, metagenomic data included, are both unbalanced and heterogeneous, and thus exemplify the current challenges of machine learning in the era of Big Data.
The goal of this review is to examine the contribution of machine learning techniques to metagenomics, that is, to answer the question “to what extent does machine learning contribute to the study of microbial communities and environmental samples?” We first briefly introduce the scientific fundamentals of machine learning. In the following sections we illustrate how these techniques help answer questions in metagenomic data analysis. We describe a number of methods and tools to this end, though not exhaustively. Finally, we speculate on possible future directions of this research.

Efficient genome-wide sequencing and low coverage pedigree analysis from non-invasively collected samples

Noah Snyder-Mackler, William H Majoros, Michael L Yuan, Amanda O Shaver, Jacob B Gordon, Gisela H Kopp, Stephen A Schlebusch, Jeffrey D Wall, Susan C Alberts, Sayan Mukherjee, Xiang Zhou, Jenny Tung

Hard, soft and just right: variations in linked selection and recombination drive genomic divergence during speciation of aspens

Jing Wang, Nathaniel R Street, Douglas G Scofield, Pär Ingvarsson

Inferring the correlated fitness effects of nonsynonymous mutations at the same site using triallelic population genomics

Aaron P Ragsdale, Alec J Coffman, PingHsun Hsieh, Travis J Struck, Ryan N Gutenkunst

New thoughts on an old riddle: what determines genetic diversity within and between species?

Shi Huang

The question of what determines genetic diversity both between and within species has long remained unsolved by modern evolutionary theory (MET). However, this has not deterred researchers from using MET to interpret genetic diversity. Here we examine the two key experimental observations of genetic diversity made in the 1960s, one between species and the other within a population of a species, that directly contributed to the development of MET. The interpretations of these observations, as well as the assumptions of MET, are widely known to be inadequate. We review recent progress on an alternative framework, the maximum genetic diversity (MGD) hypothesis, which uses axioms and natural selection to explain the vast majority of genetic diversity as being at an optimum equilibrium that is largely determined by organismal complexity. The MGD hypothesis fully absorbs the proven virtues of MET and considers its assumptions relevant only to a much more limited scope. This new synthesis has accounted for the much-overlooked phenomenon of progression towards higher complexity and, more importantly, has been instrumental in directing productive research into both evolutionary and biomedical problems.