Anthony J. Geneva, Christina A. Muirhead, LeAnne M. Lovato, Sarah B. Kingan, Daniel Garrigan
(Submitted on 6 Mar 2014)
The study of complex speciation, or speciation with gene flow, requires the identification of genomic regions that are either unusually divergent or that have experienced recent gene flow. Furthermore, the rapid growth of population genomic datasets relevant to studying complex speciation requires that analytical tools be scalable to the level of whole-genome analysis. We present a simple sequence measure, Gmin, which is specifically designed to identify regions of diverging genomes that are candidates for recent gene flow. Gmin is defined as the ratio of the minimum number of nucleotide differences between sequences from two different populations to the average number of between-population differences. We compare the sensitivity of Gmin to that of the widely used index of population differentiation, Fst. Extensive computer simulations demonstrate that Gmin has greater sensitivity and specificity for detecting gene flow than Fst. Additionally, the sensitivity of Gmin for detecting gene flow is robust with respect to both the population mutation and recombination rates, suggesting that it is flexible and can be applied to a variety of biological scenarios. Finally, a scan of Gmin across the X chromosome of Drosophila melanogaster identifies candidate regions of introgression between sub-Saharan African and cosmopolitan populations that were missed by other methods. These results demonstrate that Gmin is a biologically straightforward, yet powerful, alternative to Fst, as well as to more computationally intensive model-based methods for detecting gene flow.
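
The definition of Gmin given in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes aligned, equal-length sequences with no missing data, counts differences as simple mismatches, and computes the statistic for a single window. The function names `hamming` and `gmin` and the toy sequences are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Count the nucleotide differences between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def gmin(pop1, pop2):
    """Ratio of the minimum to the mean between-population pairwise difference.

    Assumption: every sequence in pop1 is compared with every sequence
    in pop2; within-population comparisons are not used.
    """
    if not pop1 or not pop2:
        raise ValueError("both populations need at least one sequence")
    diffs = [hamming(s, t) for s, t in product(pop1, pop2)]
    mean_diff = sum(diffs) / len(diffs)
    if mean_diff == 0:
        # All between-population pairs are identical; the ratio is undefined.
        return float("nan")
    return min(diffs) / mean_diff


# Toy window: one pair of sequences is much closer than the average pair.
pop_a = ["AAAA", "AAAT"]
pop_b = ["TTTT", "AATT"]
print(gmin(pop_a, pop_b))
```

In this toy window the four between-population comparisons differ at 4, 2, 3 and 1 sites, so the minimum is 1 and the mean is 2.5. A low ratio like this arises when at least one between-population pair is much more similar than the average pair, which is the pattern the abstract associates with candidate regions of recent gene flow.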