Detecting the structure of haplotypes, local ancestry and excessive local European ancestry in Mexicans
(Submitted on 5 Apr 2013)
We present a two-layer hidden Markov model to detect the structure of haplotypes for unrelated individuals. This allows modeling two scales of linkage disequilibrium (one within a group of haplotypes and one between groups), thereby taking advantage of rich haplotype information to infer local ancestry for admixed individuals. Our method outperforms competing state-of-the-art methods, particularly for regions of small ancestral tract lengths. Applying our method to Mexican samples in HapMap3, we found five coding regions, ranging from $0.3$ to $1.3$ megabases (Mb) in length, that exhibit excessive European ancestry (average dosage > 1.6). A particularly interesting region of 1.1 Mb (with average dosage 1.95) is located on chromosome 2p23 and harbors two genes, PXDN and MYT1L, both of which are associated with autism and schizophrenia. In light of the low prevalence of autism in Hispanics, this region warrants special attention. We confirmed our findings using Mexican samples from the 1000 Genomes Project. A software package implementing the methods described in the paper is freely available at this http URL.
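The "average dosage > 1.6" criterion can be illustrated with a minimal sketch. It is not the paper's software. It assumes that each individual's local European ancestry dosage at a marker is a value between 0 and 2 (the expected number of European-ancestry haplotype copies), and that a region is a maximal run of consecutive markers whose sample-average dosage exceeds the threshold. The function names `average_dosage` and `excess_regions` and the toy data are hypothetical.

```python
from typing import List, Tuple


def average_dosage(dosages: List[List[float]]) -> List[float]:
    """Per-marker mean dosage; dosages[i][m] is individual i's dosage (0..2) at marker m."""
    n_individuals = len(dosages)
    n_markers = len(dosages[0])
    return [sum(ind[m] for ind in dosages) / n_individuals for m in range(n_markers)]


def excess_regions(
    avg: List[float], positions: List[int], threshold: float = 1.6
) -> List[Tuple[int, int]]:
    """Return (start_bp, end_bp) for each maximal run of markers with avg > threshold."""
    regions = []
    start = None  # index of the first marker in the current run, if any
    for i, value in enumerate(avg):
        if value > threshold:
            if start is None:
                start = i
        elif start is not None:
            regions.append((positions[start], positions[i - 1]))
            start = None
    if start is not None:  # close a run that extends to the last marker
        regions.append((positions[start], positions[-1]))
    return regions


# Toy data (assumed, not from the paper): 3 individuals, 4 markers.
dosages = [
    [1.0, 2.0, 2.0, 1.0],
    [1.0, 1.8, 2.0, 0.0],
    [0.0, 1.9, 1.7, 1.0],
]
positions = [100, 200, 300, 400]  # base-pair coordinates of the markers

regions = excess_regions(average_dosage(dosages), positions)
print(regions)
```

In this toy input only the two middle markers have a sample-average dosage above 1.6, so a single region spanning positions 200 to 300 is reported.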