Efficient computation of the joint sample frequency spectra for multiple populations

John A. Kamm, Jonathan Terhorst, Yun S. Song
(Submitted on 3 Mar 2015)

A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences. In particular, recently there has been growing interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. Although much methodological progress has been made, existing SFS-based inference methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable efficient computation of the expected joint SFS for multiple populations related by a complex demographic model with arbitrary population size histories (including piecewise exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study involving tens of populations, we demonstrate our improvements to numerical stability and computational complexity.
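To make the statistic concrete, here is a minimal sketch of how an observed joint SFS could be tabulated from data. It is not momi's code or API, and it does not compute the expected joint SFS that the paper is about. The function name `joint_sfs`, its arguments, and the toy data are all hypothetical, and it assumes the derived (mutant) allele has already been identified at each site.

```python
from collections import Counter


def joint_sfs(derived_counts, sample_sizes):
    """Tabulate an observed joint SFS (illustrative helper, not from momi).

    derived_counts: iterable with one tuple per site, giving the number of
        derived alleles observed in each population's sample.
    sample_sizes: number of sampled sequences in each population.

    Returns a Counter mapping each configuration of derived-allele counts
    to the number of polymorphic sites showing that configuration.
    """
    sample_sizes = tuple(sample_sizes)
    all_ancestral = tuple(0 for _ in sample_sizes)
    sfs = Counter()
    for site in derived_counts:
        config = tuple(site)
        if len(config) != len(sample_sizes):
            raise ValueError("each site needs one count per population")
        if any(c < 0 or c > n for c, n in zip(config, sample_sizes)):
            raise ValueError("count outside the range 0..sample size")
        # Skip monomorphic sites: the derived allele is absent from, or
        # fixed in, the combined sample.
        if config == all_ancestral or config == sample_sizes:
            continue
        sfs[config] += 1
    return sfs


# Hypothetical toy data: six sites, two populations with 4 and 2 samples.
sites = [(1, 0), (0, 2), (1, 0), (3, 1), (0, 0), (4, 2)]
sfs = joint_sfs(sites, sample_sizes=(4, 2))
```

Each key of `sfs` is one entry of the joint SFS. With D populations and n samples per population the table has on the order of (n + 1)^D entries, which is one reason computing its expectation under a demographic model becomes costly as more populations are added.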

3 thoughts on “Efficient computation of the joint sample frequency spectra for multiple populations”

    • Thanks, Ryan. momi is still under development. Our current implementation efficiently computes the expected joint SFS for multiple populations, and the next natural step is to turn our approach into a useful inference method by utilizing the expected SFS in a likelihood framework. We will be releasing our software package, possibly even before publication, once we have a stable optimization implementation that can efficiently perform parameter estimation and model selection.

      Obviously the name of our software is in tribute to your method.
