An exact algorithm and efficient importance sampling for computing two-locus likelihoods under variable population size

John A. Kamm, Jeffrey P. Spence, Jeffrey Chan, Yun S. Song

Two-locus sampling probabilities have played a central role in devising an efficient composite likelihood method for estimating fine-scale recombination rates. Due to mathematical and computational challenges, these sampling probabilities are typically computed under the unrealistic assumption of a constant population size, and simulation studies have shown that resulting recombination rate estimates can be severely biased in certain cases of historical population size changes. To alleviate this problem, we develop here two distinct methods to compute the sampling probability for variable population size functions that are piecewise constant. The first is a novel formula that can be evaluated by numerically exponentiating a large but sparse matrix. The second method is importance sampling on genealogies, based on a characterization of the optimal proposal distribution that extends previous results to the variable-size setting. The resulting proposal distribution is highly efficient, with an average effective sample size (ESS) of nearly 98% per sample. Through a simulation study, we show that accounting for population size changes improves inference of recombination rates.
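As a rough illustration of the efficiency measure quoted above, the following minimal sketch computes a per-sample ESS from a list of importance weights. It assumes the commonly used definition ESS = (sum of w)^2 / (sum of w^2), divided by the number of samples; the abstract does not state which definition is used. The function names `ess_per_sample` and `likelihood_estimate` are hypothetical and are not taken from the authors' software.

```python
def likelihood_estimate(weights):
    """Importance sampling estimate: the mean of the importance weights.

    Each weight is assumed to be (target probability) / (proposal probability)
    for one sampled genealogy.
    """
    if not weights:
        raise ValueError("need at least one weight")
    return sum(weights) / len(weights)


def ess_per_sample(weights):
    """Normalized effective sample size, a value in (0, 1].

    Uses the common definition ESS = (sum w)^2 / (sum w^2), divided by n.
    This is an assumed definition, not one confirmed by the abstract.
    """
    n = len(weights)
    if n == 0:
        raise ValueError("need at least one weight")
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        raise ValueError("all weights are zero")
    return (total * total) / (n * total_sq)


if __name__ == "__main__":
    # Equal weights: the proposal matches the target, so ESS per sample is 1.
    print(ess_per_sample([0.5, 0.5, 0.5, 0.5]))
    # One dominant weight: most samples contribute nothing.
    print(ess_per_sample([1.0, 0.0, 0.0, 0.0]))
```

Under this definition, a value near 1 means the weights are almost equal, so nearly every sampled genealogy contributes fully to the likelihood estimate.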