A genetic variant near olfactory receptor genes influences cilantro preference

Nicholas Eriksson, Shirley Wu, Chuong B. Do, Amy K. Kiefer, Joyce Y. Tung, Joanna L. Mountain, David A. Hinds, Uta Francke
(Submitted on 10 Sep 2012)

The leaves of the Coriandrum sativum plant, known as cilantro or coriander, are widely used in many cuisines around the world. However, far from being a benign culinary herb, cilantro can be polarizing: many people love it while others claim that it tastes or smells foul, often like soap or dirt. This soapy or pungent aroma is largely attributed to several aldehydes present in cilantro. Cilantro preference is suspected to have a genetic component, yet to date nothing is known about specific mechanisms. Here we present the results of a genome-wide association study among 14,604 participants of European ancestry who reported whether cilantro tasted soapy, with replication in a distinct set of 11,851 participants who declared whether they liked cilantro. We find a single nucleotide polymorphism (SNP) significantly associated with soapy-taste detection that is confirmed in the cilantro preference group. This SNP, rs72921001 (p=6.4e-9, odds ratio 0.81 per A allele), lies within a cluster of olfactory receptor genes on chromosome 11. Among these olfactory receptor genes is OR6A2, which has a high binding specificity for several of the aldehydes that give cilantro its characteristic odor. We also estimate the heritability of cilantro soapy-taste detection in our cohort, showing that the heritability tagged by common SNPs is low, about 0.087. These results confirm that there is a genetic component to cilantro taste perception and suggest that cilantro dislike may stem from genetic variants in olfactory receptors. We propose that OR6A2 may be the olfactory receptor that contributes to the detection of a soapy smell from cilantro in European populations.
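
As a rough illustration of what "odds ratio 0.81 per A allele" implies, the sketch below assumes the usual log-additive (multiplicative) model, in which each extra copy of the allele multiplies the odds by the same factor. That model is an assumption made here for illustration; the abstract does not spell out the model used, and the function name is hypothetical.

```python
def odds_ratio_for_copies(per_allele_or: float, copies: int) -> float:
    """Odds ratio relative to carrying zero copies of the allele.

    Assumes log-odds add up per allele copy (an illustrative assumption,
    not something stated in the abstract).
    """
    if copies not in (0, 1, 2):
        raise ValueError("a diploid genotype carries 0, 1 or 2 copies")
    return per_allele_or ** copies


if __name__ == "__main__":
    # 0.81 is the per-A-allele odds ratio quoted in the abstract.
    for copies in (0, 1, 2):
        print(copies, round(odds_ratio_for_copies(0.81, copies), 4))
```

Under this assumption, a carrier of two A alleles would have about 0.81 × 0.81 ≈ 0.66 times the odds of reporting a soapy taste compared with a non-carrier.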

Our paper: A faster-X effect for gene expression in Drosophila embryos

[This author post is by Alex Kalinka and Pavel Tomancak on their paper, An excess of gene expression divergence on the X chromosome in Drosophila embryos: implications for the faster-X hypothesis, posted to the arXiv here.]

We have been working towards publishing our study of gene expression evolution on the X chromosome in Drosophila embryos since the beginning of March this year. Recently, Casey Bergman suggested that we upload our manuscript to the arXiv, and after we did so, we were kindly invited by Graham Coop to write a guest post about our work for Haldane's Sieve.

It makes sense to post here since the roots of our study go back to Haldane in 1924 [1]; he recognised that the unusual inheritance pattern of the X chromosome, in which a single copy is present in the heterogametic sex versus two copies in the homogametic sex, could in turn lead to unusual evolutionary patterns on the X relative to the autosomes. If, for example, a beneficial mutation is recessive, then it would be more exposed to natural selection in the heterogametic sex where, relative to an equivalent autosomal allele, it would spend less time being masked by the dominant, less beneficial allele [1]. The prediction that adaptive evolution might proceed more quickly on the X than on the autosomes has been dubbed the faster-X hypothesis. However, the X chromosome might also be expected to evolve more rapidly for non-adaptive reasons. In each mating pair there will be 3 copies of the X chromosome versus 4 copies of each autosome, which might in turn lead to a lower chromosomal effective population size for the X thereby increasing the strength of random genetic drift.
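
The "3 copies versus 4 copies" count can be written out as a small calculation. This is a minimal sketch assuming an XY system in which females carry two X chromosomes and males one; the function name is hypothetical, and the simple copy-number ratio ignores factors such as unequal variance in reproductive success between the sexes.

```python
from fractions import Fraction


def x_to_autosome_copy_ratio(n_females: int, n_males: int) -> Fraction:
    """Ratio of X-chromosome copies to copies of any one autosome.

    Assumes XX females and XY males (illustrative assumption).
    """
    if n_females < 0 or n_males < 0 or n_females + n_males == 0:
        raise ValueError("need a non-empty population with non-negative counts")
    x_copies = 2 * n_females + n_males
    autosome_copies = 2 * (n_females + n_males)
    return Fraction(x_copies, autosome_copies)


if __name__ == "__main__":
    # One mating pair: 3 X copies against 4 copies of each autosome.
    print(x_to_autosome_copy_ratio(1, 1))  # 3/4
```

With an even sex ratio the ratio stays at 3/4, which is where the expectation of a lower effective population size for the X comes from.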

While some studies have reported evidence for a faster-X effect for adaptive protein evolution in Drosophila, other studies have reported that there is no difference between the X and the autosomes, and to date the evidence is somewhat inconclusive. As we focused our study on gene expression, we had an opportunity to relax the implicit assumption that the majority of adaptive evolution occurs in coding regions. To help disentangle adaptive and non-adaptive evolutionary signatures in our data, we used both between-species measures of gene expression divergence and within-species measures of gene expression variation using inbred strains of D. melanogaster generated by the Drosophila Genetic Reference Panel (DGRP).

We found an excess of gene expression divergence on the X chromosome between six Drosophila species (a mean increase of ~20%). In contrast, we found that the X exhibits a significantly lower level of gene expression variation between inbred strains of D. melanogaster (a mean decrease of ~10%). Taken together, these results suggest that the divergence that we find between species is not driven by a relaxation of selective constraint on the X chromosome. To further explore whether such a signature could be driven by the hemizygosity of the X, we analysed gene expression in mutation accumulation lines of D. melanogaster. If the single copy of the X in males is driving the excess of expression divergence that we found on the X, then we would expect to find an excess of expression variation between lines that have independently accumulated mutations. In fact, we found the opposite was the case – the X chromosome displayed a significantly lower rate of mutation accumulation than the autosomes suggesting that the hemizygosity of the X alone is not sufficient to drive a higher rate of fixation of gene expression mutations.

Overall, we argue that the excess divergence we find on the X is best understood within the framework of the faster-X hypothesis. In support of this interpretation, we find that there is an excess of gene expression divergence on Muller's D element along the branch leading to the obscura sub-group; Muller's D element segregates as a neo-X chromosome in the obscura sub-group, and hence provides a powerful, independent test of faster evolution of the X chromosome.

Several questions remain, however, and we hope that our findings will help to stimulate further research into the details underpinning the differences we find on the X. In particular, work needs to be done to discover the genetics underlying divergence on the X, such as the relative importance of cis versus trans-acting factors, and, crucially, we need to develop a better understanding of how variation in gene expression impacts organismal fitness. Research into the latter question is essential if we are to bridge the conspicuous gaps between sequence variation, gene expression variation, and organismal fitness.

Dave Gerrard initially found elevated gene expression divergence on the X in the course of analysing data for our developmental hourglass paper, and spoke about his findings at the 43rd population genetics conference; that was more than two years ago. Since then we collected new data, and it took a while to put the paper together although it is still not certain that it will be published in a traditional journal. The arXiv is a great way to let the scientific community know about your results before the academic process runs its course. We only regret we didn't make use of this excellent outlet back in March.

[1] Haldane JBS (1924) A mathematical theory of natural and artificial selection. Part I. Trans Camb Phil Soc 23: 19-41.

An experimental test for genetic constraints in Drosophila melanogaster

Ian Dworkin, David Tack, Jarrod Hadfield
(Submitted on 7 Sep 2012)

In addition to natural selection, adaptive evolution requires genetic variation to proceed. Yet the G-matrix may have limited ‘genetic degrees of freedom’, with certain combinations of trait values unavailable to evolution. Such limitations are often referred to as genetic constraints. Unfortunately, clear predictions about when to expect constraints are rarely available. Therefore, we developed an experimental system that provides specific predictions regarding constraints. Such tests are important as disagreements persist regarding the evidence for genetic constraints, possibly due to differences in methodology, study system or both. Numerous measures of genetic constraints have been suggested, and generally focus on whether some axes of G have eigenvalues ≈ 0, indicating a lack of genetic variance. The mutation Ultrabithorax1 causes a mild homeotic transformation of segmental identity. We predicted that this mutation would induce a genetic constraint due to this homeosis. We measured genetic co-variation for a set of traits in a panel of strains with and without Ubx1. As expected, Ubx1 induced homeotic transformations, and altered patterns of allometry. Yet, no changes in correlational structure or in the distribution of eigenvalues of G were observed. We discuss the role of using genetic manipulations to refine hypotheses of constraints in natural systems.

Polygenic Modeling with Bayesian Sparse Linear Mixed Models

Xiang Zhou, Peter Carbonetto, Matthew Stephens
(Submitted on 6 Sep 2012)

Both linear mixed models (LMMs) and sparse regression models are widely used in genetics applications, including, recently, polygenic modeling. These two approaches make very different assumptions, so are expected to perform well in different situations. However, in practice, for a given data set one typically does not know which assumptions will be more accurate. Motivated by this, we consider a hybrid of the two, which we refer to as a “Bayesian sparse linear mixed model” (BSLMM) that includes both these models as special cases. We address several key computational and statistical issues that arise when applying BSLMM, including appropriate prior specification for the hyper-parameters, and a novel Markov chain Monte Carlo algorithm for posterior inference. We apply BSLMM and compare it with other methods for two polygenic modeling applications: estimating the proportion of variance in phenotypes explained (PVE) by available genotypes, and phenotype (or breeding value) prediction. For estimating PVE, we demonstrate that BSLMM combines the advantages of both standard LMMs and sparse regression modeling. For phenotype prediction it considerably outperforms either of the other two methods, as well as several other large-scale regression methods previously suggested for this problem. Software implementing our method is freely available from this http URL

Our paper: Population genomics of sub-Saharan Drosophila melanogaster: African diversity and non-African admixture

[This author post is by John Pool on his paper: Population genomics of sub-Saharan Drosophila melanogaster: African diversity and non-African admixture arXived here.]

We are in the process of publishing this analysis of >100 sequenced Drosophila melanogaster genomes (largely haploid genomes at >25X depth).  These genomes come from more than 20 geographic locations, largely within sub-Saharan Africa, where the species is thought to originate.  Truth be told, this sampling scheme was somewhat accidental – we wanted to identify a population representing a “center of genetic diversity” for the species, which for us involved sequencing small numbers of genomes from many different population samples (some from previous lab stocks, others from newly collected lines).  Ultimately we did find the sample we were looking for, and we are in the process of sequencing ~300 genomes from this Zambian population.  Still, it seemed more than worthwhile to analyze the “geographic scatter” of genomes we had obtained from across sub-Saharan Africa (as well as one small sample from Europe).

Our ambitions for this paper were largely descriptive – a preliminary analysis of genetic variation within and among the sampled populations.  We envisioned being able to compare diversity levels and genetic structure across Africa (much as I once did with a dramatically smaller data set), and to identify specific loci with signatures of selection.  And we were able to do that.  We found the highest levels of genetic diversity in and around Zambia, raising the prospect of a southern-central African origin for D. melanogaster.  We found low-to-moderate levels of genetic structure across most of sub-Saharan Africa, with only Ethiopian populations showing stronger genetic differentiation (along with some morphological differentiation, but that’s another story).  Analyses of allele frequencies within and between populations revealed a substantial number of loci with evidence of recent natural selection – many GO categories enriched for such outliers pertained to gene regulation, much as we had observed in another recent population genomic analysis.

Of course that’s how we normally think of natural selection’s influence on genetic variation – specific beneficial mutations leading to selective sweeps (whether hard or soft, partial or complete), each one influencing diversity on a limited genomic scale.  And at least in species with large outbreeding populations like Drosophila, recurrent hitchhiking may be common enough to affect diversity at random sites in the genome (e.g. 1, 2, 3).  So we weren’t surprised to find sweep signals.  The bigger surprise to us was finding evidence that specific episodes of natural selection had affected genetic variation on the scale of whole chromosome arms or the entire genome.

The first major surprise concerned genomic patterns of non-African admixture in African D. melanogaster populations.  The occurrence of such introgression had been documented before, and there were previous findings that non-African genotypes were associated with urban environments in Africa, and that admixture levels could vary within the genome. We developed a hidden Markov model approach to detect admixed chromosomal regions (based simply on the reduced diversity found in populations outside sub-Saharan Africa).  Whereas we tend to think of admixture as a selectively neutral force, the genomic patterns of admixture we observed did not seem consistent with passive gene flow.  Non-African genotypes had displaced large portions of the gene pool of presumably quite large African populations, and this had occurred within a very short time (judging by the megabase scale of admixture tracts).  Levels of admixture across the genome showed both broad-scale heterogeneity (chromosomal differences) and relatively narrow “spikes” of admixture.  These peaks of admixture quite often overlapped with outliers for high FST between Africa and Europe, as would be expected if these regions contained functional differences between populations for which introgressing non-African alleles may now be favored in some African environments (e.g. modernizing cities).  
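
To give a feel for how a hidden Markov model can pick out admixed tracts from reduced diversity, here is a toy two-state sketch with Viterbi decoding. It is not the model from the manuscript: the binary "low diversity" observation per genomic window, the function name, and every probability below are hypothetical values chosen only for illustration.

```python
import math


def viterbi_admixture(low_diversity, p_stay=0.95,
                      p_low_given_admixed=0.9, p_low_given_african=0.1,
                      p_admixed_start=0.5):
    """Most likely ancestry path over genomic windows.

    low_diversity: sequence of 0/1 flags, 1 = window shows reduced diversity.
    Returns a list of states, 0 = African ancestry, 1 = non-African admixture.
    All parameter values are illustrative assumptions.
    """
    if not low_diversity:
        return []
    states = (0, 1)

    def emit(state, obs):
        p_low = p_low_given_admixed if state == 1 else p_low_given_african
        return math.log(p_low if obs else 1.0 - p_low)

    def trans(prev, nxt):
        return math.log(p_stay if prev == nxt else 1.0 - p_stay)

    start = (math.log(1.0 - p_admixed_start), math.log(p_admixed_start))
    score = [start[s] + emit(s, low_diversity[0]) for s in states]
    back = []  # back[i][s] = best predecessor of state s at window i + 1
    for obs in low_diversity[1:]:
        new_score, pointers = [], []
        for s in states:
            best_prev = max(states, key=lambda r: score[r] + trans(r, s))
            pointers.append(best_prev)
            new_score.append(score[best_prev] + trans(best_prev, s) + emit(s, obs))
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    windows = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
    print(viterbi_admixture(windows))
```

Because switching states is penalised, a run of several low-diversity windows is called as an admixture tract, whereas a single isolated low-diversity window is treated as noise. A realistic implementation would use continuous diversity measurements and parameters estimated from the data.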

The second surprise came as we documented population genetic patterns associated with polymorphic inversions (as further analyzed in a forthcoming paper by Russ Corbett-Detig and Dan Hartl).  It was already known that inversions tend to differ in frequency between D. melanogaster populations, but theory and most empirical data suggested that only diversity around the inversion breakpoints should be affected.  Instead, we observed some African populations in which elevated inversion frequencies were associated with notable reductions in diversity for entire chromosome arms (and ultimately affecting genome-wide average diversity), consistent with directional selection on rearrangements or linked loci.  Perhaps more surprisingly, most inversions found in the non-African sample (France) served to substantially increase diversity across whole chromosome arms (by up to 29% in the case of inversions on arm 3R), and by 12% genome-wide.  Here, we can only suggest that selection may have acted to favor inverted chromosomes that recently originated from a more genetically diverse (e.g. African or African-admixed) population.  Accounting for these inversions substantially alters chromosomal diversity ratios between African and European populations.

Hence, we may have the curious situation of natural selection driving introgression in both directions across the sub-Saharan/cosmopolitan population genetic divide in D. melanogaster.

You can find our draft manuscript here, supplemental items here, and the data here.

I’m definitely glad we were able to post a draft at arXiv – it was time to communicate our findings to the research community (especially to facilitate our colleagues’ analysis and publication plans for this data set), and there’s really no downside to us as authors.  I also appreciate the chance to post here at Haldane’s Sieve, and it would be great to discuss any aspect of our draft.

John Pool

Complex patterns of local adaptation in teosinte

Tanja Pyhäjärvi, Matthew B. Hufford, Sofiane Mezmouk, Jeffrey Ross-Ibarra
(Submitted on 3 Aug 2012)

Populations of widely distributed species often encounter and adapt to specific environmental conditions. However, comprehensive characterization of the genetic basis of adaptation is demanding, requiring genome-wide genotype data, multiple sampled populations, and a good understanding of population structure. We have used environmental and high-density genotype data to describe the genetic basis of local adaptation in 21 populations of teosinte, the wild ancestor of maize. We found that altitude, dispersal events and admixture among subspecies formed a complex hierarchical genetic structure within teosinte. Patterns of linkage disequilibrium revealed four mega-base scale inversions that segregated among populations and had altitudinal clines. Based on patterns of differentiation and correlation with environmental variation, inversions and nongenic regions play an important role in local adaptation of teosinte. Further, we note that strongly differentiated individual populations can bias the identification of adaptive loci. The role of inversions in local adaptation has been predicted by theory and requires attention as genome-wide data become available for additional plant species. These results also suggest a potentially important role for noncoding variation, especially in large plant genomes in which the gene space represents a fraction of the entire genome.

An excess of gene expression divergence on the X chromosome in Drosophila embryos: implications for the faster-X hypothesis

Melek A. Kayserili, Dave T. Gerrard, Pavel Tomancak, Alex T. Kalinka
(Submitted on 5 Sep 2012)

The X chromosome is present as a single copy in the heterogametic sex, and this hemizygosity is expected to drive unusual patterns of evolution on the X relative to the autosomes. For example, the hemizygosity of the X may lead to a lower chromosomal effective population size compared to the autosomes, suggesting that the X might be more strongly affected by genetic drift. However, the X may also experience stronger positive selection than the autosomes because recessive beneficial mutations will be more visible to selection on the X where they will spend less time being masked by the dominant, less beneficial allele – a proposal known as the faster-X hypothesis. Thus, empirical studies demonstrating increased genetic divergence on the X chromosome could be indicative of either adaptive or non-adaptive evolution. We measured gene expression in Drosophila species and in D. melanogaster inbred strains for both embryos and adults. In the embryos we found that expression divergence is on average more than 20% higher for genes on the X chromosome relative to the autosomes, but in contrast, in the inbred strains gene expression variation is significantly lower on the X chromosome. Furthermore, expression divergence of genes on Muller’s D element is significantly greater along the branch leading to the obscura sub-group, in which this element segregates as a neo-X chromosome. In the adults, divergence is greatest on the X chromosome for males, but not for females, yet in both sexes inbred strains harbour the lowest level of gene expression variation on the X chromosome. We consider different explanations for our results and conclude that they are most consistent within the framework of the faster-X hypothesis.

Analysis of DNA sequence variation within marine species using Beta-coalescents

Matthias Steinrücken, Matthias Birkner, Jochen Blath
(Submitted on 4 Sep 2012)

We apply recently developed inference methods based on general coalescent processes to DNA sequence data obtained from various marine species. Several of these species are believed to exhibit so-called shallow gene genealogies, potentially due to extreme reproductive behaviour, e.g. via Hedgecock’s “reproduction sweepstakes”. Besides the data analysis, in particular the inference of mutation rates and the estimation of the (real) time to the most recent common ancestor, we briefly address the question whether the genealogies might be adequately described by so-called Beta coalescents (as opposed to Kingman’s coalescent), allowing multiple mergers of genealogies. The choice of the underlying coalescent model for the genealogy has drastic implications for the estimation of the above quantities, in particular the real-time embedding of the genealogy.

Finite populations with frequency-dependent selection: a genealogical approach

Peter Pfaffelhuber, Benedikt Vogt
(Submitted on 28 Jul 2012)

Evolutionary models for populations of constant size are frequently studied using the Moran model, the Wright-Fisher model, or their diffusion limits. When evolution is neutral, a random genealogy given through Kingman’s coalescent is used in order to understand basic properties of such models. Here, we address the use of a genealogical perspective for models with weak frequency-dependent selection, i.e. N s =: α is small, where s is the fitness advantage of a fit individual and N is the population size. When computing fixation probabilities, this leads either to the approach proposed by Rousset (2003), who argues how to use Kingman’s coalescent for weak selection, or to extensions of the ancestral selection graph of Neuhauser and Krone (1997) and Neuhauser (1999). As an application, we re-derive the one-third law of evolutionary game theory (Nowak et al., 2004). In addition, we provide the approximate distribution of the genealogical distance of two randomly sampled individuals under linear frequency-dependence.

Journal policy change: ESA journals will consider preprints

Scott Collins, the president of the Ecological Society of America, announced today on Twitter that all ESA journals will now consider papers for publication that have previously been posted on preprint servers like arXiv. We look forward to discussing preprints headed to ESA journals here on Haldane’s Sieve.

[Update] See also here.