Linkage Disequilibrium and Inversion-Typing of the Drosophila melanogaster Genome Reference Panel
David Houle, Eladio J. Marquez
We calculated the linkage disequilibrium (LD) between all pairs of variants in the Drosophila Genome Reference Panel (DGRP), and make available the list of all highly correlated SNPs for use in association studies. Seventy-three percent of variant SNPs are correlated at r² > 0.5 with at least one other SNP, and the mean number of correlated SNPs per variant over the whole genome is 64.9. Disequilibrium between distant SNPs is also common when minor allele frequency (MAF) is low: 24% of SNPs with MAF < 0.1 are highly correlated with SNPs more than 100 kb distant. Although SNPs within regions with polymorphic inversions are highly correlated with somewhat larger numbers of SNPs, and these correlated SNPs are on average farther away, the probability that a SNP in such regions is highly correlated with at least one other SNP is very similar to that for SNPs outside inversions. Previous karyotyping of the DGRP lines has been inconsistent, and we used LD and genotype to investigate these discrepancies. When previous studies agreed on inversion karyotype, our analysis was almost perfectly concordant with those assignments. In discordant cases, and for inversion heterozygotes, our results suggest errors in two previous analyses, or discordance between genotype and karyotype. Heterozygosities of chromosome arms are in many cases surprisingly highly correlated, suggesting strong epistatic selection during the inbreeding and maintenance of the DGRP lines.
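
As an illustration of the pairwise statistic referred to above, the following is a minimal sketch of how r² between two biallelic SNPs might be computed. It is not the authors' pipeline. It assumes each inbred line is homozygous at both sites and coded 0 (major allele) or 1 (minor allele), with None marking a missing call; the function name `ld_r2` and this coding are hypothetical choices made for the example. The 0.5 cutoff in the usage lines is the "highly correlated" threshold quoted in the abstract.

```python
from typing import Optional, Sequence


def ld_r2(snp_a: Sequence[Optional[int]],
          snp_b: Sequence[Optional[int]]) -> Optional[float]:
    """Squared correlation (r^2) between two biallelic SNPs.

    Assumed coding (hypothetical, for illustration only): one entry per
    line, 0 or 1 for the two alleles, None for a missing genotype call.
    Returns None when r^2 is undefined (no shared calls, or either SNP
    monomorphic among the lines called at both sites).
    """
    # Keep only lines with a call at both sites.
    pairs = [(a, b) for a, b in zip(snp_a, snp_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        return None

    p_a = sum(a for a, _ in pairs) / n                    # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in pairs) / n                    # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in pairs if a == 1 and b == 1) / n  # frequency of the 1-1 haplotype

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None  # at least one site is monomorphic in this subset

    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom


# Usage: flag a pair as highly correlated at the abstract's r^2 > 0.5 threshold.
snp_a = [0, 0, 1, 1, 0, 1, None, 1]
snp_b = [0, 0, 1, 1, 0, 1, 1, 0]
r2 = ld_r2(snp_a, snp_b)
highly_correlated = r2 is not None and r2 > 0.5
```

With 0/1 coding this quantity equals the squared Pearson correlation of the two genotype vectors. Handling of residual heterozygous calls, which the abstract indicates exist in the DGRP lines, is deliberately left out of this sketch.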