The most parsimonious tree for random data

Mareike Fischer, Michelle Galla, Lina Herbst, Mike Steel
(Submitted on 1 Jun 2014)

Applying a method to reconstruct a phylogenetic tree from random data provides a way to detect whether that method has an inherent bias towards certain tree ‘shapes’. For maximum parsimony, applied to a sequence of random 2-state data, each possible binary phylogenetic tree has exactly the same distribution for its parsimony score. Despite this pleasing and slightly surprising symmetry, some binary phylogenetic trees are more likely than others to be a most parsimonious (MP) tree for a sequence of k such characters, as we show. For k=2, and unrooted binary trees on six taxa, any tree with a caterpillar shape has a higher chance of being an MP tree than any tree with a symmetric shape. On the other hand, if we take any two binary trees, on any number of taxa, we prove that this bias between the two trees vanishes as the number of characters grows. However, again there is a twist: MP trees on six taxa are more likely to have certain shapes than a uniform distribution on binary phylogenetic trees predicts, and this difference does not appear to dissipate as k grows.
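To make the setup in the abstract above concrete, here is a minimal sketch of scoring random 2-state characters on the two six-taxon tree shapes. It uses the standard Fitch small-parsimony algorithm, which is an assumption on our part: the abstract does not say how the authors computed scores. The names `fitch_score`, `random_characters`, `total_score`, `CATERPILLAR` and `SYMMETRIC` are hypothetical, and the nested tuples are rooted stand-ins for the unrooted trees (the parsimony score does not depend on where the root is placed).

```python
import random

# Rooted stand-ins for the two unrooted six-taxon shapes (leaves are 0..5).
# Illustrative labelings only; the abstract does not fix any labeling.
CATERPILLAR = ((((0, 1), 2), 3), (4, 5))
SYMMETRIC = ((0, 1), ((2, 3), (4, 5)))


def fitch_score(tree, character):
    """Parsimony score of one character on a binary tree (Fitch's algorithm).

    `tree` is an int leaf or a pair of subtrees; `character[i]` is the
    state of leaf i.
    """
    def visit(node):
        if isinstance(node, int):
            return {character[node]}, 0
        left_states, left_cost = visit(node[0])
        right_states, right_cost = visit(node[1])
        common = left_states & right_states
        if common:
            # Children agree on at least one state: no extra change needed.
            return common, left_cost + right_cost
        # Children disagree: one change is required at this node.
        return left_states | right_states, left_cost + right_cost + 1

    return visit(tree)[1]


def random_characters(k, n_taxa=6, seed=0):
    """Draw k independent 2-state characters, each leaf state uniform on {0, 1}."""
    rng = random.Random(seed)
    return [tuple(rng.randint(0, 1) for _ in range(n_taxa)) for _ in range(k)]


def total_score(tree, characters):
    """Parsimony score of a sequence of characters: the sum of per-character scores."""
    return sum(fitch_score(tree, c) for c in characters)


chars = random_characters(k=2)
print(total_score(CATERPILLAR, chars), total_score(SYMMETRIC, chars))
```

Estimating how often each shape is an MP tree would additionally require scoring every labeled binary tree on the six taxa for each random draw and recording which trees attain the minimum; that enumeration is left out of this sketch.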

A field test for frequency-dependent selection on mimetic colour patterns in Heliconius butterflies

Patricio Alejandro Salazar Carrión, Martin Stevens, Robert T. Jones, Imogen Ogilvie, Chris Jiggins

Müllerian mimicry, the similarity among unpalatable species, is thought to evolve by frequency-dependent selection. Accordingly, phenotypes that become established in an area are positively selected because predators have learnt to avoid these forms, while introduced phenotypes are eliminated because predators have not yet learnt to associate these other forms with unprofitability. We tested this prediction in two areas where different colour morphs of the mimetic species Heliconius erato and H. melpomene have become established, as well as in the hybrid zone between these morphs. In each area we tested for selection on three colour patterns: the two parental and the most common hybrid. We recorded bird predation on butterfly models with paper wings, matching the appearance of each morph to bird vision, and plasticine bodies. We did not detect differences in survival between colour morphs, but all morphs were attacked more often in the hybrid zone. This finding is consistent with recent evidence from controlled experiments with captive birds, which suggests that the effectiveness of warning signals decreases when a large signal diversity is available to predators. This is likely to occur in the hybrid zone where over twenty hybrid phenotypes coexist.

Phylogenetic Identification and Functional Characterization of Orthologs and Paralogs across Human, Mouse, Fly, and Worm

Yi-Chieh Wu, Mukul S Bansal, Matthew D Rasmussen, Javier Herrero, Manolis Kellis

Model organisms can serve the biological and medical community by enabling the study of conserved gene families and pathways in experimentally tractable systems. Their use, however, hinges on the ability to reliably identify evolutionary orthologs and paralogs, which can be a great challenge at both small and large evolutionary distances. Here, we present a phylogenomics-based approach for the identification of orthologous and paralogous genes in human, mouse, fly, and worm, which forms the foundation of the comparative analyses of the modENCODE and mouse ENCODE projects. We study a median of 16,101 genes across 2 mammalian genomes (human, mouse), 12 Drosophila genomes, 5 Caenorhabditis genomes, and an outgroup yeast genome, and demonstrate that accurate inference of evolutionary relationships and events across these species must account for frequent gene-tree topology errors due to both incomplete lineage sorting and insufficient phylogenetic signal. Furthermore, we show that integration of two separate phylogenomic pipelines yields increased accuracy, suggesting that their sources of error are independent. Finally, we leverage the resulting annotation of homologous genes to study the functional impact of gene duplication and loss in the context of rich gene expression and functional genomic datasets of the modENCODE, mouse ENCODE, and human ENCODE projects.

Most viewed on Haldane’s Sieve: May 2014

The most viewed posts on Haldane’s Sieve in May 2014 were: