Conflations of short IBD blocks can bias inferred length of IBD
Charleston W.K. Chiang, Peter Ralph, John Novembre
Comments: 12 figures, 1 table
Subjects: Populations and Evolution (q-bio.PE)
Identity-by-descent (IBD) is a fundamental concept in genetics with many applications. Often, segments between two haplotypes are said to be IBD if they are inherited from a recent shared common ancestor without intervening recombination. Long IBD blocks (> 1cM) can be efficiently detected by a number of computer programs using high-density SNP array data from a population sample. However, all programs detect IBD based on contiguous segments of identity-by-state, so an inferred block can in fact be the conflation of several smaller, nearby IBD blocks. We quantified this effect using coalescent simulations, finding that nearly 40% of inferred blocks 1-2cM long are conflations of two or more shorter blocks, under demographic scenarios typical for modern humans. This biases the inferred IBD block length distribution, and so can affect downstream inferences. We observed this conflation effect universally across different IBD detection programs and human demographic histories, and found inference of segments longer than 2cM to be much more reliable (less than 5% conflation rate). We then present and analyze a novel estimator of the de novo mutation rate using IBD blocks, and demonstrate that the biased length distribution of the IBD segments due to conflation can strongly affect this estimator if the conflation is not modeled. Thus, the conflation effect should be carefully considered, especially as methods to detect shorter IBD blocks using sequencing data are being developed.
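To make the conflation effect concrete, the following is a minimal toy sketch in Python, not the simulation or detection procedure used in the paper. It assumes a deliberately simplified rule: two true IBD blocks are reported as one inferred block whenever the gap between them is at most `max_gap` cM. The function names, the `max_gap` threshold, and the block coordinates are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Block = Tuple[float, float]          # (start_cM, end_cM) of a true IBD block
Inferred = Tuple[float, float, int]  # (start_cM, end_cM, number of true blocks merged)


def conflate(blocks: List[Block], max_gap: float) -> List[Inferred]:
    """Merge true blocks whose separating gap is <= max_gap (toy assumption)."""
    merged: List[Inferred] = []
    for start, end in sorted(blocks):
        if merged and start - merged[-1][1] <= max_gap:
            prev_start, prev_end, n = merged[-1]
            # Extend the previous inferred block and count one more true block.
            merged[-1] = (prev_start, max(prev_end, end), n + 1)
        else:
            merged.append((start, end, 1))
    return merged


def conflation_rate(inferred: List[Inferred], min_len: float, max_len: float) -> float:
    """Fraction of inferred blocks with length in [min_len, max_len) built from >1 true block."""
    in_bin = [b for b in inferred if min_len <= b[1] - b[0] < max_len]
    if not in_bin:
        return 0.0
    return sum(1 for b in in_bin if b[2] > 1) / len(in_bin)


# Hypothetical true blocks on one chromosome, in cM (illustrative values only).
true_blocks = [(0.0, 0.6), (0.65, 1.3), (5.0, 6.2),
               (10.0, 10.4), (20.0, 20.7), (20.75, 21.5)]

inferred = conflate(true_blocks, max_gap=0.1)
rate = conflation_rate(inferred, min_len=1.0, max_len=2.0)
print(f"inferred blocks: {len(inferred)}, conflation rate in 1-2cM bin: {rate:.2f}")
```

In this toy input every true block that gets merged is shorter than 1cM, yet the merged blocks fall in the 1-2cM bin, which is how conflation can inflate the apparent number of longer blocks and shift the inferred length distribution.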