Detection of adaptive shifts on phylogenies using shifted stochastic processes on a tree

Paul Bastide, Mahendra Mariadassou, Stéphane Robin
doi: http://dx.doi.org/10.1101/023804

Comparative and evolutionary ecologists are interested in the distribution of quantitative traits among related species. The classical framework for these distributions consists of a random process running along the branches of a phylogenetic tree relating the species. We consider shifts in the process parameters, which reveal fast adaptation to changes of ecological niches. We show that models with shifts are not identifiable in general. Constraining the models to be parsimonious in the number of shifts partially alleviates the problem, but several evolutionary scenarios can still provide the same joint distribution for the extant species. We provide a recursive algorithm to enumerate all the equivalent scenarios and to count the effectively different scenarios. We introduce an incomplete-data framework and develop a maximum likelihood estimation procedure based on the EM algorithm. Finally, we propose a model selection procedure, based on the number of effectively different scenarios, to estimate the number of shifts and prove an oracle inequality.
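To make the identifiability problem concrete, here is a minimal sketch, not the authors' algorithm. It assumes a Brownian-motion-like process in which a shift on a branch adds a constant to the expected trait value of every descendant, and in which the root value is a free parameter. The four-tip tree, the node names and the function `tip_means` are all hypothetical. Under these assumptions, two different one-shift scenarios give identical expected values at the tips, and since the shifts here only affect the means, the extant species cannot tell the scenarios apart.

```python
def tip_means(children, root, root_value, shifts):
    """Expected trait value at each tip: the root value plus every
    shift met on the path from the root to that tip.

    children: dict mapping a node to the list of its child nodes
    shifts:   dict mapping a node to the shift on the branch leading to it
    """
    means = {}
    stack = [(root, root_value)]
    while stack:
        node, value = stack.pop()
        kids = children.get(node, [])
        if not kids:  # a node without children is a tip
            means[node] = value
        for kid in kids:
            stack.append((kid, value + shifts.get(kid, 0.0)))
    return means


# Hypothetical tree: root -> A, B; A -> t1, t2; B -> t3, t4
children = {"root": ["A", "B"], "A": ["t1", "t2"], "B": ["t3", "t4"]}

# Scenario 1: root value 0, one shift of +2 on the branch leading to A.
scenario_1 = tip_means(children, "root", 0.0, {"A": 2.0})
# Scenario 2: root value 2, one shift of -2 on the branch leading to B.
scenario_2 = tip_means(children, "root", 2.0, {"B": -2.0})

print(scenario_1 == scenario_2)
```

Both scenarios use a single shift, so a parsimony constraint alone does not pick one over the other.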

Most viewed on Haldane’s Sieve: July 2015

The most viewed posts on Haldane’s Sieve this month were:

The constant philopater hypothesis: a new life history invariant for dispersal evolution

António M. M. Rodrigues, Andy Gardner
doi: http://dx.doi.org/10.1101/023655
Life history invariants play a pivotal role in the study of social adaptation: they provide theoretical hypotheses that can be empirically tested, and benchmark frameworks against which new theoretical developments can be understood. Here we derive a novel invariant for dispersal evolution: the “constant philopater hypothesis” (CPH). Specifically, we find that, irrespective of variation in maternal fecundity, all mothers are favoured to produce exactly the same number of philopatric offspring, with high-fecundity mothers investing proportionally more, and low-fecundity mothers investing proportionally less, into dispersing offspring. This result holds for female and male dispersal, under haploid, diploid and haplodiploid modes of inheritance, irrespective of the sex ratio, local resource availability, and whether mother or offspring controls the latter’s dispersal propensity. We explore the implications of this result for evolutionary conflicts of interest – and the exchange and withholding of contextual information – both within and between families, and we show that the CPH is the fundamental invariant that underpins and explains a wider family of invariance relationships that emerge from the study of social evolution.

Multilocus sex determination revealed in two populations of gynodioecious wild strawberry, Fragaria vesca subsp. bracteata

Tia-Lynn Ashman, Jacob Tennessen, Rebecca Dalton, Rajanikanth Govindarajulu, Mathew Koski, Aaron Liston
doi: http://dx.doi.org/10.1101/023713

Gynodioecy, the coexistence of females and hermaphrodites, occurs in 20% of angiosperm families and often enables transitions between hermaphroditism and dioecy. Clarifying mechanisms of sex determination in gynodioecious species can thus illuminate sexual system evolution. Genetic determination of gynodioecy, however, can be complex and is not fully characterized in any wild species. We used targeted sequence capture to genetically map a novel nuclear contributor to male sterility in a self-pollinated hermaphrodite of Fragaria vesca subsp. bracteata from the southern portion of its range. To understand its interaction with another identified locus and possibly additional loci, we performed crosses within and between two populations separated by 2000 km, phenotyped the progeny and sequenced candidate markers at both sex-determining loci. The newly mapped locus contains a high density of pentatricopeptide repeat genes, a class commonly involved in restoration of fertility caused by cytoplasmic male sterility. Examination of all crosses revealed three unlinked epistatically interacting loci that determine sexual phenotype and vary in frequency between populations. Fragaria vesca subsp. bracteata represents the first wild gynodioecious species with genomic evidence of both cytoplasmic and nuclear genes in sex determination. We propose a model for the interactions between these loci and new hypotheses for the evolution of sex determining chromosomes in the subdioecious and dioecious Fragaria.

Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines

John G. Cleary, Ross Braithwaite, Kurt Gaastra, Brian S Hilbush, Stuart Inglis, Sean A Irvine, Alan Jackson, Richard Littin, Mehul Rathod, David Ware, Justin M. Zook, Len Trigg, Francisco M. M. De La Vega
doi: http://dx.doi.org/10.1101/023754
To evaluate and compare the performance of variant calling methods and their confidence scores, comparisons between a test call set and a “gold standard” need to be carried out. Unfortunately, these comparisons are not straightforward with the current Variant Call Files (VCF), which are the standard output of most variant calling algorithms for high-throughput sequencing data. Comparisons of VCFs are often confounded by the different representations of indels, MNPs, and combinations thereof with SNVs in complex regions of the genome, resulting in misleading results. A variant caller is inherently a classification method designed to score putative variants with confidence scores that could permit controlling the rate of false positives (FP) or false negatives (FN) for a given application. Receiver operating characteristic (ROC) curves and the area under the ROC curve (AUC) are efficient metrics to evaluate a test call set versus a gold standard. However, in the case of VCF data this also requires a special accounting to deal with discrepant representations. We developed a novel algorithm for comparing variant call sets that deals with complex call representation discrepancies through a dynamic programming method that minimizes false positives and negatives globally across the entire call sets for accurate performance evaluation of VCFs.
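As an illustration of the ROC and AUC metrics mentioned above, here is a minimal sketch in plain Python. It is not the authors' comparison algorithm: it assumes each call has already been labelled as a true or false positive against the gold standard, which is the step the paper's method addresses. The function names and the toy scores are hypothetical, and the input is assumed to contain at least one call of each label.

```python
def roc_points(scores, labels):
    """Sweep a threshold from the highest score down and return the
    list of (false positive rate, true positive rate) points.

    labels: 1 for a call matching the gold standard, 0 otherwise.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Calls with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy confidence scores and labels for four calls.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
print(auc(roc_points(scores, labels)))  # 0.75
```

In this toy set, three of the four (true call, false call) pairs are ranked correctly, which matches the AUC of 0.75.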

Genome-Wide Scan for Adaptive Divergence and Association with Population-Specific Covariates

Mathieu Gautier
doi: http://dx.doi.org/10.1101/023721

In population genomics studies, accounting for the neutral covariance structure across population allele frequencies is critical to improve the robustness of genome-wide scan approaches. Elaborating on the BayEnv model, this study investigates several modeling extensions i) to improve the estimation accuracy of the population covariance matrix and all the related measures; ii) to identify significantly overly differentiated SNPs based on a calibration procedure of the XtX statistics; and iii) to consider alternative covariate models for analyses of association with population-specific covariables. In particular, the auxiliary variable model makes it possible to deal with multiple testing issues and, provided the relative marker positions are available, to capture some linkage disequilibrium information. A comprehensive simulation study is further carried out to investigate and compare the performance of the different models. For illustration purposes, genotyping data on 18 French cattle breeds are also analyzed, leading to the identification of thirteen strong signatures of selection. Among these, four (surrounding the KITLG, KIT, EDN3 and ALB genes) contained SNPs strongly associated with the piebald coloration pattern while a fifth (surrounding PLAG1) could be associated with morphological differences across the populations. Finally, analysis of Pool-Seq data from 12 populations of Littorina saxatilis living in two different ecotypes illustrates how the proposed framework might help address relevant ecological questions in non-model species. Overall, the proposed methods define a robust Bayesian framework to characterize adaptive genetic differentiation across populations. The BayPass program implementing the different models is available at http://www1.montpellier.inra.fr/CBGP/software/baypass/.