Author post: Patterns of positive selection in seven ant genomes

This guest post is by Julien Roux on Roux et al. “Patterns of positive selection in seven ant genomes”, arXived here.

The publication of the honeybee genome in 2006 can be considered the birth date of "sociogenomics", a research field whose agenda is to understand social life in molecular terms. Recently, this field has entered a period of rapid discovery with the publication of full genome sequences of multiple Hymenoptera species. In particular, the release of seven ant genome sequences gave us the opportunity to look for the molecular origins of some of the spectacular adaptations of the ant lineage, through patterns of positive selection on amino-acid substitutions in ant genes. We used rigorous methods, inspired by the Selectome database, to detect episodic positive selection while controlling for false positives. All data are publicly available for people to reuse.

An original aspect of our paper is that we analyzed not only ant genomes, but also data from 10 species of bees and 12 species of flies with the same methods to permit an unbiased comparison of positive selection patterns between lineages. For example, immune genes were enriched for positive selection signal in all three lineages. This may not look surprising since these are classical hits of positive selection scans, but it was previously hypothesized that the evolution of social hygienic behaviors in ants and bees may have relaxed the selective pressure on immune genes. Our analysis indicates that this effect is either absent or relatively small.

Other hypotheses have been put forward in relation to the evolution of sociality in Hymenoptera. Notably, it was proposed that the challenges of social life in the colonies should be reflected by increased positive selection signal on neurogenesis genes. Similarly, because communication is mostly based on chemical signals in colonies of social insects, it was suggested that increased positive selection should be observed on olfactory receptors compared to non-social insects. Our results question both these hypotheses, since we observed that increased positive selection on these classes of genes does not coincide with (but predated) the evolution of sociality in Hymenoptera.

Finally, the comparison between the three lineages allowed us to pinpoint some patterns that were most likely specific to the ant lineage. We found less positive selection on genes related to metabolism in ants compared to bees and flies. We think this could be the sign of relaxed selection on these genes, possibly in relation to the substantial reduction in metabolic needs that came with the loss of flight in ant workers. By contrast, we identified a robust pattern of directional selection specific to the ant lineage on genes functioning in the mitochondria. Several pieces of evidence suggest that this pattern might be linked to the remarkable lifespan extension that evolved in the ant lineage. Queens of some ant species can indeed live up to 100 times longer than solitary insects (that is, up to 30 years!). Positive selection possibly played a role in optimizing the activity of mitochondria, where the respiratory chain is the primary source of reactive oxygen species (ROS), an important proximate cause of aging, thus contributing to the evolution of increased lifespan in ants.

In conclusion, protein-level episodic positive selection appears to have played an important role in the evolution of social insects, notably regarding strong mitochondrial adaptation in ants.

Most viewed on Haldane’s Sieve: November 2013

The most viewed posts on Haldane’s Sieve in November 2013 were:

Agriculture driving male expansion in Neolithic Time

Chuan-Chao Wang, Yunzhi Huang, Shao-Qing Wen, Chun Chen, Li Jin, Hui Li
(Submitted on 27 Nov 2013)

The emergence of agriculture is suggested to have driven extensive human population growth. However, genetic evidence from maternal mitochondrial genomes suggests major population expansions began before the emergence of agriculture. Therefore, the role that agriculture played in initial population expansions remains controversial. Here, we analyzed a set of globally distributed whole Y chromosome and mitochondrial genomes of 526 male samples from the 1000 Genomes Project. We found that most major paternal lineage expansions coalesced in Neolithic Time. The estimated effective population sizes through time revealed strong evidence for a 10- to 100-fold increase in population growth of males with the advent of agriculture. This sex-biased Neolithic expansion might result from the reduction in hunting-related mortality of males.

Population genetics and substitution models of adaptive evolution

Mario dos Reis
(Submitted on 26 Nov 2013)

The ratio of non-synonymous to synonymous substitutions ω (= dN/dS) has been widely used as a measure of adaptive evolution in protein coding genes. Omega can be defined in terms of population genetics parameters as the fixation ratio of selected vs. neutral mutants. Here it is argued that approaches based on the infinite sites model are not appropriate to define ω for single codon locations. Simple models of amino acid substitution with reversible mutation and selection are analysed, and used to define ω under several evolutionary scenarios. In most practical cases ω < 1 [...] ω > 1 can be sometimes expected for single locations at equilibrium. An example with influenza data is discussed.
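To make the "fixation ratio of selected vs. neutral mutants" concrete, here is a minimal Python sketch. It uses the classical weak-selection approximation, in which a mutant with scaled selection coefficient S fixes at S / (1 - exp(-S)) times the neutral rate. This formula, the function name `fixation_ratio` and the meaning attached to S are assumptions brought in for illustration; the abstract gives no formula. Note also that the sketch covers a single class of mutants with one fixed S, which is not the site-level ω that the abstract goes on to define.

```python
import math


def fixation_ratio(S: float) -> float:
    """Fixation rate of a mutant with scaled selection coefficient S,
    relative to a neutral mutant: S / (1 - exp(-S)).

    Assumption: classical weak-selection approximation; the function
    name and this parameterisation are illustrative, not from the paper.
    """
    if abs(S) < 1e-8:
        return 1.0  # neutral limit: S / (1 - exp(-S)) -> 1 as S -> 0
    if S < -700.0:
        return 0.0  # strongly deleterious; also avoids overflow in expm1
    # -expm1(-S) equals 1 - exp(-S) but stays accurate for small |S|
    return S / -math.expm1(-S)


if __name__ == "__main__":
    for S in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(f"S = {S:+.1f}  ratio = {fixation_ratio(S):.3f}")
```

Under this approximation the ratio is below 1 for deleterious mutants (S < 0), exactly 1 for neutral ones, and above 1 for advantageous ones (S > 0).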

Population genetic consequences of the Allee effect and the role of offspring-number variation

Meike J. Wittmann, Wilfried Gabriel, Dirk Metzler
(Submitted on 21 Nov 2013)

A strong demographic Allee effect in which the expected population growth rate is negative below a certain critical population size can cause high extinction probabilities in small introduced populations. However, many species are repeatedly introduced to the same location and eventually one population may overcome the Allee effect by chance. With the help of stochastic models, we investigate how much genetic diversity such successful populations harbour on average and how this depends on offspring-number variation, an important source of stochastic variability in population size. We find that with increasing variability, the Allee effect increasingly promotes genetic diversity in successful populations. Successful Allee-effect populations with highly variable population dynamics escape rapidly from the region of small population sizes and do not linger around the critical population size. Therefore, they are exposed to relatively little genetic drift. We show that here—unlike in classical population genetics models—the role of offspring-number variation cannot be accounted for by an effective-population-size correction. Thus, our results highlight the importance of detailed biological knowledge, in this case on the probability distribution of family sizes, when predicting the evolutionary potential of newly founded populations or when using genetic data to reconstruct their demographic history.
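As a rough illustration of what a strong demographic Allee effect with stochastic offspring numbers can look like, here is a minimal Python sketch. The growth function, the parameter names (`r`, `K`, `a`) and their values, and the choice of Poisson-distributed offspring are all assumptions made for this example; the abstract does not specify the authors' model, and Poisson is only one of many possible offspring-number distributions.

```python
import math
import random


def expected_next_size(n, r=0.1, K=1000.0, a=50.0):
    """Expected size next generation (hypothetical growth function).

    a is the critical size and K the carrying capacity: the population
    is expected to shrink below a and to grow between a and K.
    """
    if n <= 0:
        return 0.0
    return n * math.exp(r * (1.0 - n / K) * (1.0 - a / n))


def poisson(lam, rng):
    """Draw a Poisson variate by Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate(n0, generations, seed=0):
    """Follow one introduced population; stops early on extinction."""
    rng = random.Random(seed)
    path = [n0]
    n = n0
    for _ in range(generations):
        if n == 0:
            break
        per_capita = expected_next_size(n) / n
        # Each individual independently leaves a Poisson number of offspring.
        n = sum(poisson(per_capita, rng) for _ in range(n))
        path.append(n)
    return path


if __name__ == "__main__":
    # Founders start slightly above the critical size a = 50.
    for seed in range(5):
        print(simulate(60, 30, seed=seed))
```

Replacing `poisson` with a distribution that has the same mean but a larger variance would be one way to explore the role of offspring-number variation discussed in the abstract.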

Calibrated birth-death phylogenetic time-tree priors for Bayesian inference

Joseph Heled, Alexei J. Drummond
(Submitted on 19 Nov 2013)

Here we introduce a general class of multiple-calibration birth-death tree priors for use in Bayesian phylogenetic inference. All tree priors in this class separate ancestral node heights into a set of “calibrated nodes” and “uncalibrated nodes” such that the marginal distribution of the calibrated nodes is user-specified whereas the density ratio of the birth-death prior is retained for trees with equal values for the calibrated nodes. We describe two formulations, one in which the calibration information informs the prior on ranked tree topologies, through the (conditional) prior, and the other which factorizes the prior on divergence times and ranked topologies, thus allowing uniform, or any arbitrary prior distribution on ranked topologies. While the first of these formulations has some attractive properties, the algorithm we present for computing its prior density is computationally intensive. On the other hand, the second formulation is always computationally efficient. We demonstrate the utility of the new class of multiple-calibration tree priors using both small simulations and a real-world analysis and compare the results to existing schemes. The two new calibrated tree priors described in this paper offer greater flexibility and control of prior specification in calibrated time-tree inference and divergence time dating, and will remove the need for indirect approaches to the assessment of the combined effect of calibration densities and tree process priors in Bayesian phylogenetic inference.

On the concept of biological function, junk DNA and the gospels of ENCODE and Graur et al.

Claudiu I Bandea

In a recent article entitled “On the immortality of television sets: “function” in the human genome according to the evolution-free gospel of ENCODE”, Graur et al. dismantle ENCODE’s evidence and conclusion that 80% of the human genome is functional. However, the article by Graur et al. contains assumptions and statements that are questionable. Primarily, the authors limit their evaluation of DNA’s biological functions to informational roles, sidestepping putative non-informational functions. Here, I bring forward an old hypothesis on the evolution of genome size and on the role of so-called ‘junk DNA’ (jDNA), which might explain the C-value enigma. According to this hypothesis, jDNA functions as a defense mechanism against insertion mutagenesis by endogenous and exogenous inserting elements such as retroviruses, thereby protecting informational DNA sequences from inactivation or alteration of their expression. Notably, this model couples the mechanisms and the selective forces responsible for the origin of jDNA with its putative protective biological function, which represents a classic case of ‘fighting fire with fire.’ One of the key tenets of this theory is that in humans and many other species, jDNA serves as a protective mechanism against insertional oncogenic transformation. As an adaptive defense mechanism, the amount of protective DNA varies from one species to another based on the rate of its origin, insertional mutagenesis activity, and evolutionary constraints on genome size.

Validity of covariance models for the analysis of geographical variation

Gilles Guillot, René Schilling, Emilio Porcu, Moreno Bevilacqua
(Submitted on 17 Nov 2013)

Due to the availability of large molecular data-sets, covariance models are increasingly used to describe the structure of genetic variation as an alternative to more heavily parametrised biological models. We focus here on a class of parametric covariance models that have received sustained attention lately and show that the conditions under which they are valid mathematical models have been overlooked so far. We provide rigorous results for the construction of valid covariance models in this family. We also outline how to construct alternative covariance models for the analysis of geographical variation that are both mathematically well behaved and easily implementable.

Data Mining of Online Genealogy Datasets for Revealing Lifespan Patterns in Human Population

Michael Fire, Yuval Elovici
(Submitted on 18 Nov 2013)

Online genealogy datasets contain extensive information about millions of people and their past and present family connections. This vast amount of data can assist in identifying various patterns in human population. In this study, we present methods and algorithms which can assist in identifying variations in lifespan distributions of human population in the past centuries, in detecting social and genetic features which correlate with human lifespan, and in constructing predictive models of human lifespan based on various features which can easily be extracted from genealogy datasets.
We have evaluated the presented methods and algorithms on a large online genealogy dataset with over a million profiles and over 8.8 million connections, all of which were collected from the WikiTree website. Our findings indicate that significant but small positive correlations exist between the parents’ lifespan and their children’s lifespan. Additionally, we found slightly higher and significant correlations between the lifespans of spouses. We also discovered a very small positive and significant correlation between longevity and reproductive success in males, and a small and significant negative correlation between longevity and reproductive success in females. Moreover, our machine learning algorithms presented better than random classification results in predicting which people who outlive the age of 50 will also outlive the age of 80.
We believe that this study will be the first of many studies which utilize the wealth of data on human populations, existing in online genealogy datasets, to better understand factors which influence human lifespan. Understanding these factors can assist scientists in providing solutions for successful aging.

Genetic diversity in introduced populations with Allee effect

Meike J. Wittmann, Wilfried Gabriel, Dirk Metzler
(Submitted on 18 Nov 2013)

A phenomenon that strongly influences the demography of small introduced populations and thereby potentially their genetic diversity is the Allee effect, a reduction in population growth rates at small population sizes. We take a stochastic modeling approach to investigate levels of genetic diversity in populations that successfully overcame a strong demographic Allee effect, a scenario in which populations smaller than a certain critical size are expected to decline. Our results indicate that compared to successful populations without Allee effect, successful Allee-effect populations tend to 1) derive from larger founder population sizes and thus have a higher initial amount of genetic variation, 2) spend fewer generations at small population sizes where genetic drift is particularly strong, and 3) spend more time around the critical population size and thus experience more drift there. Altogether, the Allee effect can either increase or decrease genetic diversity, depending on the average founder population size. In the case of multiple introduction events, there is an additional increase in diversity because Allee-effect populations tend to derive from a larger number of introduction events than other populations. Finally, we show that given genetic data from sufficiently many populations, we can statistically infer the critical population size.