An explicit Poisson-Kolmogorov-Smirnov test for the molecular clock in phylogenies

Fernando Marcon, Fernando Antoneli, Marcelo R. S. Briones

(Submitted on 21 May 2015)

Divergence date estimates are central to understanding evolutionary processes and depend, in the case of molecular phylogenies, on tests for the molecular clock. Tests for global and local clocks generally compare a clock-constrained tree with a non-clock tree (e.g. the likelihood ratio test). These tests verify the homogeneity of evolutionary rates among taxa and usually employ the chi-square test for rejection/acceptance of the “clock-like” phylogeny. The paradox is that the molecular clock hypothesis, as proposed, is a Poisson process and therefore non-homogeneous. Here we propose a method for testing the molecular clock in phylogenies that is built upon the assumption of a Poisson stochastic process that accommodates rate heterogeneity and is based on ensembles of trees inferred by the Bayesian method. The observed distribution of branch lengths (number of substitutions) is obtained from the ensemble of trees sampled after burn-in in the Bayesian search. The parameter λ of the expected Poisson distribution is given by the average branch length of this ensemble. The goodness-of-fit test is performed using a modified Kolmogorov-Smirnov test for Poisson distributions. The method introduced here uses a large number of statistically equivalent phylogenies to obtain the observed distribution. This circumvents problems of small sample size (lack of power and lack of information), because the power of the test approaches unity asymptotically. Also, the observed distribution is very robust in the sense that, for a sufficient number of trees (700), the empirical distribution stabilizes. Therefore, the estimated parameter λ, used to define the expected distribution, is essentially independent of sample size.
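
The goodness-of-fit step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the Kolmogorov-Smirnov test is modified, so this sketch assumes one common approach for discrete data with an estimated parameter, namely a parametric bootstrap of the KS statistic. It also assumes that branch lengths are supplied as non-negative integer substitution counts and that λ is moderate (the simple sampler used here is not suited to very large λ). The function names `ks_statistic`, `sample_poisson` and `poisson_ks_test` are hypothetical.

```python
import math
import random


def poisson_cdf_table(lam, kmax):
    """Return [F(0), ..., F(kmax)] for a Poisson(lam) distribution."""
    pmf = math.exp(-lam)          # P(X = 0)
    cdf, total = [], 0.0
    for k in range(kmax + 1):
        total += pmf
        cdf.append(min(total, 1.0))
        pmf *= lam / (k + 1)      # recurrence: P(k+1) = P(k) * lam / (k+1)
    return cdf


def ks_statistic(counts, lam):
    """Largest gap between the empirical CDF and the Poisson(lam) CDF.

    Both CDFs are step functions that jump only at integers, so it is
    enough to compare them at k = 0, ..., max(counts).
    """
    n = len(counts)
    kmax = max(counts)
    freq = [0] * (kmax + 1)
    for c in counts:
        freq[c] += 1
    expected = poisson_cdf_table(lam, kmax)
    d, seen = 0.0, 0
    for k in range(kmax + 1):
        seen += freq[k]
        d = max(d, abs(seen / n - expected[k]))
    return d


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def poisson_ks_test(counts, n_boot=500, seed=0):
    """Return (lambda_hat, D, bootstrap p-value) for integer counts.

    Assumption: the null distribution of D is approximated by a
    parametric bootstrap, re-estimating lambda in every replicate.
    """
    rng = random.Random(seed)
    n = len(counts)
    lam = sum(counts) / n                      # lambda = average branch length
    d_obs = ks_statistic(counts, lam)
    exceed = 0
    for _ in range(n_boot):
        sim = [sample_poisson(lam, rng) for _ in range(n)]
        if ks_statistic(sim, sum(sim) / n) >= d_obs:
            exceed += 1
    return lam, d_obs, (exceed + 1) / (n_boot + 1)
```

In use, `counts` would hold the branch lengths pooled from the post burn-in trees, for example `lam, d, p = poisson_ks_test(branch_counts)`. A small p-value would suggest that the branch lengths depart from the Poisson expectation under the assumptions stated above.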