Efficient Bayesian species tree inference under the multi-species coalescent
Bruce Rannala, Ziheng Yang
A method was developed for Bayesian inference of species phylogeny using the multi-species coalescent model. To improve the mixing properties of the Markov chain Monte Carlo (MCMC) algorithm that traverses the space of species trees, we implement two efficient MCMC proposals: the first is based on the Subtree Pruning and Regrafting (SPR) algorithm and the second is based on a novel node-slider algorithm. Like the Nearest-Neighbor Interchange (NNI) algorithm we implemented previously, both algorithms propose changes to the species tree while simultaneously altering the gene trees at multiple genetic loci, so that conflicts with the newly proposed species tree are avoided automatically. The method integrates over gene trees, naturally accounting for the uncertainty in gene tree topology and branch lengths given the sequence data. A simulation study was performed to examine the statistical properties of the new method. We found that it has excellent statistical performance, inferring the correct species tree with near certainty when analyzing 10 loci. The prior on species trees has some impact, particularly for small numbers of loci. An empirical dataset (for rattlesnakes) was reanalyzed. While the 18 nuclear loci and one mitochondrial locus support largely consistent species trees under the multi-species coalescent model, estimates of parameters suggest drastically different evolutionary dynamics between the nuclear and mitochondrial loci.
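
The integration over gene trees described above can be written out as a sketch in the usual multi-species coalescent notation. The symbols here are assumptions introduced for illustration and do not appear in the abstract: S is the species tree topology, \Theta collects the divergence times and population size parameters, G_i is the gene tree (topology and coalescent times) at locus i, X_i is the sequence alignment at locus i, and L is the number of loci, which are assumed independent given S and \Theta.

```latex
% Marginal posterior of the species tree and its parameters.
% The integral over G_i stands for a sum over gene tree topologies
% together with an integral over coalescent times at locus i.
f(S, \Theta \mid X) \;\propto\; f(S)\, f(\Theta \mid S)
  \prod_{i=1}^{L} \int f(G_i \mid S, \Theta)\, f(X_i \mid G_i)\, \mathrm{d}G_i
```

In an MCMC setting this integral is typically not evaluated directly; the chain samples S, \Theta, and the G_i jointly, and the marginal posterior of S is read off from the sampled species trees.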