Demography and the age of rare variants
Iain Mathieson, Gil McVean
(Submitted on 16 Jan 2014)
Recently, large whole-genome sequencing projects have provided access to much of the rare variation in human populations. This variation is highly informative about population structure and recent demography. In this paper, we show how the age of rare variants can be estimated from patterns of haplotype sharing and how this information can detect and quantify historical relationships between populations. We investigate the distribution of the age of variants occurring exactly twice in the sample (f2 variants) in a worldwide sample sequenced by the 1000 Genomes Project, revealing enormous variation across populations. The median age of f2 variants shared within continents is 50 to 160 generations for Europe and Asia, and 170 to 320 generations for Africa. Variants shared between continents are much older, with median ages ranging from 320 to 670 generations between Europe and Asia, and from 1,000 to 2,400 generations between African and non-African populations. The distribution of the ages of variants shared across populations is informative about their demography, revealing recent bottlenecks, ancient splits, and more modern connections between populations. We see the signature of selection in the observation that functional variants are significantly younger than nonfunctional variants of the same frequency. This approach is relatively insensitive to mutation rate and complements other nonparametric methods for demographic inference.
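
As a rough illustration of why haplotype sharing carries information about age, the sketch below uses a deliberately simplified textbook model, not the estimator developed in the paper. It assumes that two haplotypes with a common ancestor t generations ago share a segment whose genetic length L (in Morgans) is the sum of two independent exponential distances, one on each side of the variant, each with rate 2t, so that L follows a Gamma distribution with shape 2 and rate 2t. Under that assumption the maximum-likelihood age is simply 1/L. The function names are hypothetical, and the sketch ignores phasing errors, uncertainty in detecting segment ends, and mutational information.

```python
import math


def log_likelihood(t_generations: float, shared_length_morgans: float) -> float:
    """Log density of the shared length under L ~ Gamma(shape=2, rate=2t).

    This is an assumed, simplified model used only for illustration.
    """
    if t_generations <= 0 or shared_length_morgans <= 0:
        raise ValueError("age and shared length must be positive")
    rate = 2.0 * t_generations
    # Gamma(shape=2, rate) density: rate^2 * L * exp(-rate * L)
    return (2.0 * math.log(rate)
            + math.log(shared_length_morgans)
            - rate * shared_length_morgans)


def age_mle(shared_length_morgans: float) -> float:
    """Maximum-likelihood age in generations under the assumed model.

    Setting d/dt [2*log(2t) - 2*t*L] = 2/t - 2L to zero gives t = 1/L.
    """
    if shared_length_morgans <= 0:
        raise ValueError("shared length must be positive")
    return 1.0 / shared_length_morgans


if __name__ == "__main__":
    # Hypothetical example: a shared haplotype of 1 cM (0.01 Morgans).
    length = 0.01
    print(age_mle(length))
```

Under this simplified model, shorter shared haplotypes imply older variants, which is the qualitative pattern behind the contrast between within-continent and between-continent ages reported above.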