Population genetics of identity by descent
Pier Francesco Palamara, Ph.D. thesis
Recent improvements in high-throughput genotyping and sequencing technologies have afforded the collection of massive, genome-wide datasets of DNA information from hundreds of thousands of individuals. These datasets, in turn, provide unprecedented opportunities to reconstruct the history of human populations and detect genotype-phenotype associations. Recently developed computational methods can identify long-range chromosomal segments that are identical across samples and that were transmitted from common ancestors who lived tens to hundreds of generations in the past. These segments reveal genealogical relationships that are typically unknown to the carrying individuals. In this work, we demonstrate that such identical-by-descent (IBD) segments are informative about a number of relevant population genetics features: they enable the inference of past population size fluctuations and migration events, and they carry the genomic signature of natural selection. We derive a mathematical model, based on coalescent theory, that allows for a quantitative description of IBD sharing across purportedly unrelated individuals, and we develop inference procedures for the reconstruction of recent demographic events, where classical methodologies are statistically underpowered. We analyze IBD sharing in several contemporary human populations, including representative communities of the Jewish Diaspora, Kenyan Maasai samples, and individuals from several Dutch provinces, in all cases retrieving evidence of fine-scale demographic events from recent history. Finally, we expand the presented model to describe the distribution of sites within IBD-shared segments that harbor mutation events, showing how these may be used for the inference of mutation rates in humans and other species.
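To make the link between IBD sharing and population size concrete, the following is a minimal sketch of a textbook-style coalescent calculation, not necessarily the exact model developed in the thesis. It assumes a constant population with pairwise coalescence rate 1/n per generation, and a shared segment whose genetic length, given a common ancestor t generations ago, is the sum of two exponential distances with rate 2t per Morgan. Under these assumptions the expected fraction of the genome shared in segments of at least u Morgans has a closed form, which the sketch checks against direct numerical integration. All function and parameter names are hypothetical.

```python
import math


def segment_survival(u, t):
    """P(segment length >= u Morgans | common ancestor t generations ago).

    Assumption: length is the sum of two exponentials with rate 2t (Erlang-2).
    """
    return (1.0 + 2.0 * t * u) * math.exp(-2.0 * t * u)


def expected_ibd_fraction(u, n):
    """Closed form under the assumed constant coalescence rate 1/n."""
    return (1.0 + 4.0 * u * n) / (1.0 + 2.0 * u * n) ** 2


def expected_ibd_fraction_numeric(u, n, steps=200_000):
    """Trapezoidal integral of coalescent density times segment survival."""
    # Integrate out to 50 decay lengths of the integrand; the tail is negligible.
    t_max = 50.0 / (1.0 / n + 2.0 * u)
    h = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-t / n) / n * segment_survival(u, t)
    return total * h


if __name__ == "__main__":
    # Illustrative values only: n = 10,000 and a 1 cM (0.01 Morgan) threshold.
    print(expected_ibd_fraction(0.01, 10_000))
    print(expected_ibd_fraction_numeric(0.01, 10_000))
```

Under these assumptions, larger values of n give a smaller expected shared fraction at any fixed length threshold, which is the kind of dependence that makes IBD sharing informative about recent population size.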