Erik M. Volz, Simon DW Frost
(Submitted on 28 Aug 2014)
Many population genetic models have been developed for the purpose of inferring population size and growth rates from random samples of genetic data. We examine two popular approaches to this problem, the coalescent and the birth-death-sampling model, in the context of estimating population size and birth rates in a population growing exponentially according to the birth-death branching process. For sequences sampled at a single time, we found the coalescent and the birth-death-sampling model gave virtually indistinguishable results in terms of the growth rates and fraction of the population sampled, even when sampling from a small population. For sequences sampled at multiple time points, we find that the birth-death model estimators are subject to large bias if the sampling process is misspecified. Since birth-death-sampling models incorporate a model of the sampling process, we show how much of the statistical power of birth-death-sampling models arises from the sequence of sample times and not from the genealogical tree. This motivates the development of a new coalescent estimator, which is augmented with a model of the known sampling process and is potentially more precise than the coalescent that does not use sample time information.