Coalescent histories for lodgepole species trees

Filippo Disanto, Noah A. Rosenberg

Subjects: Populations and Evolution (q-bio.PE); Combinatorics (math.CO)

Coalescent histories are combinatorial structures that describe for a given gene tree and species tree the possible lists of branches of the species tree on which the gene tree coalescences take place. Properties of the number of coalescent histories for gene trees and species trees affect a variety of probabilistic calculations in mathematical phylogenetics. Exact and asymptotic evaluations of the number of coalescent histories, however, are known only in a limited number of cases. Here we introduce a particular family of species trees, the \emph{lodgepole} species trees $(\lambda_n)_{n\geq 0}$, in which tree $\lambda_n$ has $m=2n+1$ taxa. We determine the number of coalescent histories for the lodgepole species trees, in the case that the gene tree matches the species tree, showing that this number grows with $m!!$ in the number of taxa $m$. This computation demonstrates the existence of tree families in which the growth in the number of coalescent histories is faster than exponential. Further, it provides a substantial improvement on the lower bound for the ratio of the largest number of matching coalescent histories to the smallest number of matching coalescent histories for trees with $m$ taxa, increasing a previous bound of $(\sqrt{\pi} / 32)[(5m-12)/(4m-6)] m \sqrt{m}$ to $[ \sqrt{m-1}/(4 \sqrt{e}) ]^{m}$. We discuss the implications of our enumerative results for phylogenetic computations.
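
To give a sense of scale for the quantities quoted above, the following minimal sketch evaluates $m!!$ and the two lower bounds on the ratio for a few odd values $m=2n+1$. It only evaluates the closed-form expressions stated in the abstract and does not enumerate coalescent histories. The function names and the use of $2^m$ as an exponential reference are assumptions made for illustration.

```python
import math


def double_factorial(m: int) -> int:
    """Return m!! = m * (m - 2) * (m - 4) * ... down to 1 or 2."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def previous_bound(m: int) -> float:
    """Earlier lower bound: (sqrt(pi)/32) * [(5m-12)/(4m-6)] * m * sqrt(m)."""
    return (math.sqrt(math.pi) / 32) * ((5 * m - 12) / (4 * m - 6)) * m * math.sqrt(m)


def new_bound(m: int) -> float:
    """Lower bound stated in the abstract: [sqrt(m-1) / (4 sqrt(e))]^m."""
    return (math.sqrt(m - 1) / (4 * math.sqrt(math.e))) ** m


if __name__ == "__main__":
    # Illustrative values of n only; m = 2n + 1 is the number of taxa.
    for n in (2, 10, 30, 50):
        m = 2 * n + 1
        print(
            f"m={m:3d}  m!!={double_factorial(m):.3e}  2^m={2 ** m:.3e}  "
            f"previous={previous_bound(m):.3e}  new={new_bound(m):.3e}"
        )
```

The base $\sqrt{m-1}/(4\sqrt{e})$ exceeds 1 only when $m-1 > 16e \approx 43.5$, so the new bound is smaller than 1 for small $m$ and the improvement over the polynomial bound is an asymptotic one.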