A Pleiotropy-Informed Bayesian False Discovery Rate adapted to a Shared Control Design Finds New Disease Associations From GWAS Summary Statistics

James Liley, Chris Wallace
doi: http://dx.doi.org/10.1101/014886

Genome-wide association studies (GWAS) have been successful in identifying single nucleotide polymorphisms (SNPs) associated with many traits and diseases. However, at existing sample sizes, these variants explain only part of the estimated heritability. Leverage of GWAS results from related phenotypes may improve detection without the need for larger datasets. The Bayesian conditional false discovery rate (cFDR) constitutes an upper bound on the expected false discovery rate (FDR) across a set of SNPs whose p values for two diseases are both less than two disease-specific thresholds. Calculation of the cFDR requires only summary statistics and has several advantages over traditional GWAS analysis. However, existing methods require distinct control samples between studies. Here, we extend the technique to allow for some or all controls to be shared, increasing applicability. Several different SNP sets can be defined with the same cFDR value, and we show that the expected FDR across the union of these sets may exceed expected FDR in any single set. We describe a procedure to establish an upper bound for the expected FDR among the union of such sets of SNPs. We apply our technique to pairwise analysis of p values from ten autoimmune diseases with variable sharing of controls, enabling discovery of 59 SNP-disease associations which do not reach GWAS significance after genomic control in individual datasets. Most of the SNPs we highlight have previously been confirmed using replication studies or larger GWAS, a useful validation of our technique; we report eight SNP-disease associations across five diseases not previously declared. Our technique extends and strengthens the previous algorithm, and establishes robust limits on the expected FDR. This approach can improve SNP detection in GWAS, and give insight into shared aetiology between phenotypically related conditions.
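For readers unfamiliar with the quantity, the sketch below illustrates the standard empirical cFDR estimate used in earlier work with independent controls: cFDR(p_i | p_j) is approximated by p_i divided by the empirical proportion of SNPs with principal p value at most p_i among those with conditional p value at most p_j. It is an illustrative sketch only. It does not implement the shared-control adjustment or the bound on FDR across the union of SNP sets described in the abstract, and the function name and toy p values are hypothetical.

```python
def empirical_cfdr(p_i, p_j, pvals_i, pvals_j):
    """Empirical cFDR(p_i | p_j) for paired p-value lists (one pair per SNP).

    Assumed estimator (independent controls, not the paper's extension):
        p_i * #{SNPs with P_j <= p_j} / #{SNPs with P_i <= p_i and P_j <= p_j}
    """
    # SNPs passing the threshold for the conditional phenotype
    n_j = sum(1 for q in pvals_j if q <= p_j)
    # SNPs passing both thresholds
    n_ij = sum(1 for a, b in zip(pvals_i, pvals_j) if a <= p_i and b <= p_j)
    if n_ij == 0:
        return 1.0  # no SNPs in the region, so return the uninformative bound
    return min(1.0, p_i * n_j / n_ij)


# Toy data: four SNPs, p values for the principal and conditional phenotypes
toy_i = [0.001, 0.2, 0.5, 0.9]
toy_j = [0.01, 0.02, 0.6, 0.8]
estimate = empirical_cfdr(0.001, 0.02, toy_i, toy_j)
```

In the toy data, two SNPs have a conditional p value of at most 0.02 and one of them also has a principal p value of at most 0.001, so the estimate is 0.001 × 2 / 1 = 0.002.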

Natural Selection Shapes the Mosaic Ancestry of the Drosophila Genetic Reference Panel and the D. melanogaster Reference Genome

John E Pool
doi: http://dx.doi.org/10.1101/014837

North American populations of Drosophila melanogaster are thought to derive from both European and African source populations, but despite their importance for genetic research, patterns of admixture along their genomes are essentially undocumented. Here, I infer geographic ancestry along genomes of the Drosophila Genetic Reference Panel (DGRP) and the D. melanogaster reference genome. Overall, the proportion of African ancestry was estimated to be 20% for the DGRP and 9% for the reference genome. Based on the size of admixture tracts and the approximate timing of admixture, I estimate that the DGRP population underwent roughly 13.9 generations per year. Notably, ancestry levels varied strikingly among genomic regions, with significantly less African introgression on the X chromosome, in regions of high recombination, and at genes involved in specific processes such as circadian rhythm. An important role for natural selection during the admixture process was further supported by a genome-wide signal of ancestry disequilibrium, in that many between-chromosome pairs of loci showed a deficiency of Africa-Europe allele combinations. These results support the hypothesis that admixture between partially genetically isolated Drosophila populations led to natural selection against incompatible genetic variants, and that this process is ongoing. The ancestry blocks inferred here may be relevant for the performance of reference alignment in this species, and may bolster the design and interpretation of many population genetic and association mapping studies.

Author post: Imperfect drug penetration leads to spatial monotherapy and rapid evolution of multi-drug resistance

This guest post is by Pleuni Pennings about her paper (with co-authors) Imperfect drug penetration leads to spatial monotherapy and rapid evolution of multi-drug resistance, bioRxived here. This is a cross-post from Pleuni’s blog.

Almost three years ago, in early 2012, I attended a talk by Martin Nowak. He talked about cancer and one of the things he said was that treatment with multiple drugs at the same time is a good idea because it helps prevent the evolution of drug resistance. Specifically, he explained, when treatment is with multiple drugs, the pathogen (tumor cells in the case of cancer) needs to acquire multiple resistance mutations at the same time in order to escape drug pressure.

As I listened to Martin Nowak’s talk, I was thinking of HIV, not cancer. At that time, I had already spent about two years working on drug resistance in HIV. Treatment of HIV is always with multiple drugs, for the same reason that Martin Nowak highlighted in his talk: it helps prevent the evolution of drug resistance.

However, as I read the HIV drug resistance literature and analyzed sequence data from HIV patients, I found evidence that drug resistance mutations in HIV tend to accumulate one at a time. This is contrary to the generally accepted idea that the pathogen must acquire resistance mutations simultaneously.

There seemed to be a clear mismatch between data and theory. Data show mutations are acquired one at a time, and theory says mutations must be acquired simultaneously. One of the two must be wrong, and it can’t be the data![1]

Interesting!

After Martin Nowak’s talk, I went up to him and told him how I thought data didn’t fit the theory. Martin’s response: “Oh, that is interesting!” (Imagine this being said with an Austrian accent). “Let’s meet and talk about it.”

So, we met. Logically, Alison Hill and Daniel Rosenbloom, then grad students in Martin’s group, were there too. I had already met with Alison and Daniel many times, since they were also working on drug resistance in HIV. John Wakeley (my advisor at Harvard) came to the meeting too.

Between the five of us, we brainstormed and fairly quickly realized that one solution to the conundrum was to assume that a patient’s body consists of different compartments and that each drug may not penetrate into every compartment. Maybe we found this solution quickly because Alison and Daniel had already been thinking about the issue of drug penetration in the context of another project. A body compartment that has only one drug instead of two or three would allow a pathogen that has acquired one drug resistance mutation to replicate. If a pathogen with just one mutation has a place to replicate, this makes it possible for the pathogen to acquire resistance mutations one at a time.

We decided to start a collaboration to analyze a formal model to see whether our intuition was correct. Over the following three years, there were some personnel changes and several moves, graduations and new jobs. Stefany Moreno joined the project as a student from the European MEME Master’s program when she spent a semester in Martin’s group. When I moved to Stanford, Dmitri Petrov became involved in the project. Next, Alison and Daniel each got their PhD and started postdocs (Alison at Harvard, Daniel at Columbia), Stefany got her MSc and started a PhD in Groningen, and I had a baby and became an assistant professor at SFSU. No one would have been surprised if the project had never been finished! But we stuck with it and after many hours of work, especially by the first authors Alison and Stefany, and countless Google Hangout meetings, we can now confidently say that our initial intuition from that meeting in 2012 was correct. Compartments with imperfect drug penetration indeed allow pathogens to acquire drug resistance one mutation at a time. And, importantly, the evolution of multi-drug resistance can happen fast if mutations can be acquired one at a time, much faster than when simultaneous mutations are needed.

Our manuscript can be found on the BioRxiv (link). It is entitled “Imperfect drug penetration leads to spatial monotherapy and rapid evolution of multi-drug resistance.” We hope you find it useful!

[1]Of course, it could be my interpretation of the data!

Molecular evolutionary consequences of island colonisation

Jennifer James, Robert Lanfear, Adam Eyre-Walker
doi: http://dx.doi.org/10.1101/014811

Island endemics are likely to experience population bottlenecks; they also have restricted ranges. Therefore we expect island species to have small effective population sizes (Ne) and reduced genetic diversity compared to their mainland counterparts. As a consequence, island species may have inefficient selection and reduced adaptive potential. We used both polymorphisms and substitutions to address these predictions, improving on the approach of recent studies that only used substitution data. This allowed us to directly test the assumption that island species have small values of Ne. We found that island species had significantly less genetic diversity than mainland species; however, this pattern could be attributed to a subset of island species that had undergone a recent population bottleneck. When these species were excluded from the analysis, island and mainland species had similar levels of genetic diversity, despite island species occupying considerably smaller areas than their mainland counterparts. We also found no overall difference between island and mainland species in terms of effectiveness of selection or mutation rate. Our evidence suggests that island colonisation has no lasting impact on molecular evolution. This surprising result highlights gaps in our knowledge of the relationship between census and effective population size.

Most viewed on Haldane’s Sieve: January 2015

The most viewed posts on Haldane’s Sieve this month were:

Permutation Testing in the Presence of Polygenic Variation

Mark Abney
doi: http://dx.doi.org/10.1101/014571

This article discusses problems with and solutions to performing valid permutation tests for quantitative trait loci in the presence of polygenic effects. Although permutation testing is a popular approach for determining statistical significance of a test statistic with an unknown distribution–for instance, the maximum of multiple correlated statistics or some omnibus test statistic for a gene, gene-set or pathway–naive application of permutations may result in an invalid test. The risk of performing an invalid permutation test is particularly acute in complex trait mapping where polygenicity may combine with a structured population resulting from the presence of families, cryptic relatedness, admixture or population stratification. I give both analytical derivations and a conceptual understanding of why typical permutation procedures fail and suggest an alternative permutation based algorithm, MVNpermute, that succeeds. In particular, I examine the case where a linear mixed model is used to analyze a quantitative trait and show that both phenotype and genotype permutations may result in an invalid permutation test. I provide a formula that predicts the amount of inflation of the type 1 error rate depending on the degree of misspecification of the covariance structure of the polygenic effect and the heritability of the trait. I validate this formula by doing simulations, showing that the permutation distribution matches the theoretical expectation, and that my suggested permutation based test obtains the correct null distribution. Finally, I discuss situations where naive permutations of the phenotype or genotype are valid and the applicability of the results to other test statistics.

Mediated pleiotropy between psychiatric disorders and autoimmune disorders revealed by integrative analysis of multiple GWAS

Qian Wang, Can Yang, Joel Gelernter, Hongyu Zhao
doi: http://dx.doi.org/10.1101/014530

Epidemiological observations and molecular-level experiments have indicated that brain disorders in the realm of psychiatry may be influenced by immune dysregulation. However, the degree of genetic overlap between immune disorders and psychiatric disorders has not been well established. We investigated this issue by integrative analysis of genome-wide association studies (GWAS) of 18 complex human traits/diseases (five psychiatric disorders, seven autoimmune disorders, and others) and multiple genome-wide annotation resources (central nervous system genes, immune-related expression quantitative trait loci (eQTL) and DNase I hypersensitive sites from 98 cell lines). We detected pleiotropy in 24 of the 35 psychiatric-autoimmune disorder pairs, with statistical significance as strong as p=3.9e-285 (schizophrenia-rheumatoid arthritis). Strong enrichment (>1.4 fold) of immune-related eQTL was observed in four psychiatric disorders. Genomic regions responsible for pleiotropy between psychiatric disorders and autoimmune disorders were detected. The MHC region on chromosome 6 appears to be the most important (it was previously noted (1-3) as a confluence between schizophrenia and immune disorder risk regions), along with many other regions, such as cytoband 1p13.2. We also found that most alleles shared between schizophrenia and Crohn’s disease have the same effect direction, with a similar trend found for other disorder pairs, such as bipolar-Crohn’s disease. Our results offer a novel bird’s-eye view of the genetic relationship and demonstrate strong evidence for mediated pleiotropy between psychiatric disorders and autoimmune disorders. Our findings might open new routes for prevention and treatment strategies for these disorders based on a new appreciation of the importance of immunological mechanisms in mediating risk.

An Atlas of Genetic Correlations across Human Diseases and Traits

Brendan Bulik-Sullivan, Hilary K Finucane, Verneri Anttila, Alexander Gusev, Felix R Day, ReproGen Consortium, Psychiatric Genomics Consortium, Genetic Consortium for Anorexia Nervosa of the Wellcome Trust Consortium 3, John R.B. Perry, Nick Patterson, Elise Robinson, Mark J Daly, Alkes L Price, Benjamin M Neale
doi: http://dx.doi.org/10.1101/014498

Identifying genetic correlations between complex traits and diseases can provide useful etiological insights and help prioritize likely causal relationships. The major challenges preventing estimation of genetic correlation from genome-wide association study (GWAS) data with current methods are the lack of availability of individual genotype data and widespread sample overlap among meta-analyses. We circumvent these difficulties by introducing a technique for estimating genetic correlation that requires only GWAS summary statistics and is not biased by sample overlap. We use our method to estimate 300 genetic correlations among 25 traits, totaling more than 1.5 million unique phenotype measurements. Our results include genetic correlations between anorexia nervosa and schizophrenia/body mass index and associations between educational attainment and several diseases. These results highlight the power of a polygenic modeling framework, since there currently are no genome-wide significant SNPs for anorexia nervosa and only three for educational attainment.

A Single Gene Causes an Interspecific Difference in Pigmentation in Drosophila

Yasir H. Ahmed-Braimah, Andrea L. Sweigart
doi: http://dx.doi.org/10.1101/014464

The genetic basis of species differences remains understudied. Studies in insects have contributed significantly to our understanding of morphological evolution. Pigmentation traits in particular have received a great deal of attention and several genes in the insect pigmentation pathway have been implicated in inter- and intraspecific differences. Nonetheless, much remains unknown about many of the genes in this pathway and their potential role in understudied taxa. Here we genetically analyze the puparium color difference between members of the Virilis group of Drosophila. The puparium of Drosophila virilis is black, while those of D. americana, D. novamexicana, and D. lummei are brown. We used a series of backcross hybrid populations between D. americana and D. virilis to map the genomic interval responsible for the difference between this species pair. First, we show that the pupal case color difference is caused by a single Mendelizing factor, which we ultimately map to an ~11kb region on chromosome 5. The mapped interval includes only the first exon and regulatory region(s) of the dopamine N-acetyltransferase gene (Dat). This gene encodes an enzyme that is known to play a part in the insect pigmentation pathway. Second, we show that this gene is highly expressed at the onset of pupation in light-brown taxa (D. americana and D. novamexicana) relative to D. virilis, but not in the dark-brown D. lummei. Finally, we examine the role of Dat in adult pigmentation between D. americana (heavily melanized) and D. novamexicana (lightly melanized) and find no discernible effect of this gene in adults. Our results demonstrate that a single gene is entirely or almost entirely responsible for a morphological difference between species.

A consistency lemma in statistical phylogenetics

Mike Steel
(Submitted on 26 Jan 2015)

This short note provides a simple formal proof of a folklore result in statistical phylogenetics concerning the convergence of bootstrap support for a tree and its edges.