Positive selection drives faster-Z evolution in silkmoths

Timothy B. Sackton (1), Russell B. Corbett-Detig (1), Javaregowda Nagaraju (2), R. Lakshmi Vaishna (2), Kallare P. Arunkumar (2), Daniel L. Hartl (1) ((1) Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, USA, (2) Centre of Excellence for Genetics and Genomics of Silkmoths, Laboratory of Molecular Genetics, Centre for DNA Fingerprinting and Diagnostics, Hyderabad, India)
(Submitted on 29 Apr 2013)

Genes linked to X or Z chromosomes, which are hemizygous in the heterogametic sex, are predicted to evolve at different rates than those on autosomes. This faster-X effect can arise either as a consequence of hemizygosity, which leads to more efficient selection for recessive beneficial mutations in the heterogametic sex, or as a consequence of reduced effective population size on the hemizygous chromosome, which leads to increased fixation of weakly deleterious mutations due to random genetic drift. Empirical results to date have suggested that, while the overall pattern across taxa is complicated, in general systems with male-heterogamy show a faster-X effect primarily attributable to more efficient selection, whereas systems with female-heterogamy show a faster-Z effect primarily attributable to increased drift. However, to date only a single female-heterogametic taxon has been investigated. In order to test the generality of the faster-Z pattern seen in birds, we sequenced the genome of the Lepidopteran insect Bombyx huttoni, a close outgroup of the domesticated silkmoth Bombyx mori. We show that silkmoths experience faster-Z evolution, but unlike in birds, the faster-Z effect appears to be attributable to more efficient positive selection in females. These results suggest that female-heterogamy alone is unlikely to be sufficient to explain the reduced efficacy of selection on the bird Z chromosome. Instead, it is likely that a combination of patterns of dosage compensation and overall effective population size, among other factors, influence patterns of faster-Z evolution.

Inferring non-neutral regulatory change in pathways from transcriptional profiling data

Joshua G. Schraiber, Yulia Mostovoy, Tiffany Y. Hsu, Rachel B. Brem
(Submitted on 19 Apr 2013)

An outstanding question in comparative genomics is the evolutionary importance of gene expression differences between species. Rigorous molecular-evolution methods to infer evidence for natural selection from transcriptional profiling data are at a premium in the field, and to date, phylogenetic approaches have not been well-suited to address the question in the small sets of taxa profiled in standard surveys of gene expression. To meet this challenge, we have developed a strategy to infer evolutionary histories from expression data by analyzing suites of genes of common function. In a manner conceptually similar to molecular-evolution models in which the evolutionary rates of DNA sequence at multiple loci follow a gamma distribution, we modeled expression of the genes of an a priori-defined pathway with rates drawn from an inverse-gamma distribution. We then developed a fitting strategy to infer the parameters of this distribution from expression measurements, and to identify gene groups whose expression patterns were consistent with evolutionary constraint or rapid evolution in particular species. Simulations confirmed the power and accuracy of our inference method. As an experimental testbed for our approach, we generated and analyzed transcriptional profiles of four Saccharomyces yeasts. The results revealed pathways with signatures of constrained and accelerated regulatory evolution in individual yeasts, and across the phylogeny, highlighting the prevalence of pathway-level expression change during the divergence of yeast species. We anticipate that our pathway-based phylogenetic approach will be of broad utility in the search to understand the evolutionary relevance of regulatory change.
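
To make the "rates drawn from an inverse-gamma distribution" idea concrete, here is a minimal simulation sketch. It is not the authors' code or fitting procedure. It assumes, purely for illustration, that each gene's expression evolves by Brownian motion along a single branch with its own rate; the function name and the parameters `alpha`, `beta`, and `branch_length` are hypothetical.

```python
import random
import statistics

def sample_expression_changes(n_genes, alpha, beta, branch_length, seed=0):
    """Draw per-gene expression changes along one branch.

    Assumed model (an illustration, not taken from the paper): each gene
    evolves by Brownian motion with its own rate sigma2, where
    sigma2 ~ Inverse-Gamma(alpha, beta), so the change along a branch of
    length t is Normal(0, sigma2 * t).
    """
    rng = random.Random(seed)
    changes = []
    for _ in range(n_genes):
        # If G ~ Gamma(shape=alpha, scale=1/beta), then 1/G ~ Inverse-Gamma(alpha, beta).
        sigma2 = 1.0 / rng.gammavariate(alpha, 1.0 / beta)
        changes.append(rng.gauss(0.0, (sigma2 * branch_length) ** 0.5))
    return changes

changes = sample_expression_changes(n_genes=20000, alpha=3.0, beta=2.0, branch_length=1.0)
# For alpha > 1 the mean rate is beta / (alpha - 1), so the variance of the
# simulated changes should be close to branch_length * beta / (alpha - 1).
print(statistics.pvariance(changes))
```

Under these assumptions the per-gene changes follow a heavy-tailed (Student-t-like) distribution rather than a normal one, which is one way a pathway can hold a few fast-evolving genes alongside many slowly evolving ones.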

Our paper: Clusters of microRNAs emerge by new hairpins in existing transcripts

This guest post is by Antonio Marco (@antonio_marco_c) on his paper Marco et al. Clusters of microRNAs emerge by new hairpins in existing transcripts arXived here.

MicroRNAs are short regulatory sequences involved in virtually all biological processes. MicroRNAs are often organized in genomic clusters that produce polycistronic transcripts. It is well-known that protein-coding polycistronic transcripts are almost absent in animals (with a few exceptions in nematodes and ascidians). So where do these microRNA clusters come from, and why are they so prevalent? We tackle these questions in our paper “Clusters of microRNAs emerge by new hairpins in existing transcripts”, recently deposited in arXiv.

We envisioned several possible scenarios for the origin of polycistronic microRNAs: First, polycistronic microRNAs can emerge by genomic rearrangements that bring together pre-existing microRNAs. As in bacterial operons, the clustering of microRNAs with related functions can be advantageous, and the fusion of related microRNAs may be positively selected. We call this the ‘put together’ model. Alternatively, multiple microRNAs could become polycistronic as a by-product of genome reduction (this is analogous to Caenorhabditis elegans operons). This is the ‘left together’ model. A third model, called ‘tandem duplication’, implies that polycistronic microRNAs emerge by tandem duplication of single sequences. Lastly, new microRNAs can emerge de novo in already existing microRNA transcripts. We named this the ‘new hairpin’ model, since a novel microRNA first requires the formation of a hairpin-like structure in the transcript.

By reconstructing the evolutionary history of Drosophila melanogaster microRNAs, we observed that the majority of microRNA clusters emerged by the formation of new microRNA precursors in existing transcribed microRNA genes (‘new hairpin’ model). We also found that gene duplication generated a minority of the clusters (‘tandem duplication’). However, we didn’t see any instance of fusion of pre-existing microRNA genes. Moreover, clusters rarely split or suffer rearrangements. Once a microRNA cluster is formed, it stays as a cluster or it is lost as a whole.

We propose a model for the origin and evolution of microRNA clusters. Polycistronic microRNAs are an extreme case of genetic linkage, in which a microRNA is typically a few nucleotides away from another microRNA. Once a cluster is formed, the linkage is so tight that recombination is dramatically reduced between these loci. We suggest that, because of strong selective interference between loci (Hill-Robertson effect), a microRNA under selective pressure strongly influences the evolutionary fate of any neighbouring microRNA. Even slightly deleterious microRNAs may be maintained in a population if selection in one microRNA of the cluster is strong enough. Currently, we are analysing polymorphism data to test the validity of our model in actual Drosophila populations.

In summary, we suggest that clusters of microRNAs emerge by non-adaptive mechanisms and they are maintained as a consequence of tight linkage.

Integrating influenza antigenic dynamics with molecular evolution

Trevor Bedford, Marc A. Suchard, Philippe Lemey, Gytis Dudas, Victoria Gregory, Alan J. Hay, John W. McCauley, Colin A. Russell, Derek J. Smith, Andrew Rambaut
(Submitted on 12 Apr 2013)

Influenza viruses undergo continual antigenic evolution allowing mutant viruses to evade immunity acquired by the host population to previous virus strains. Antigenic phenotype is often assessed through pairwise measurement of cross-reactivity between influenza strains using the hemagglutination inhibition (HI) assay. Here, we extend previous approaches to antigenic cartography, which seeks to place strains on an antigenic map, such that distances on this map best recapitulate titers observed across multiple HI assays. In our model, we simultaneously characterize antigenic and genetic evolution by including an evolutionary model in which antigenic location diffuses over a shared virus phylogeny. Using HI data for four lineages of influenza, encompassing influenza A subtypes H3N2 and H1N1, and influenza B lineages Victoria and Yamagata, we determine average rates of antigenic drift for each lineage, as well as year-to-year variability in the rate of drift. Through comparison with epidemiological data, we demonstrate a year-to-year correlation between drift and incidence and present evidence that antigenic drift mediates interference between influenza lineages. We investigate the selective underpinnings for differing antigenic dynamics across lineages and show that A/H3N2 benefits from both a higher influx of new antigenic mutations and also from more efficient conversion of antigenic variation into fixed differences. This work does much to elucidate the antigenic dynamics of influenza lineages, but also allows for substantial future advances in investigating the dynamics of influenza and other antigenically-variable pathogens by providing a model that intimately combines molecular and antigenic evolution.

A Model-Based Analysis of GC-Biased Gene Conversion in the Human and Chimpanzee Genomes

John A. Capra, Melissa J. Hubisz, Dennis Kostka, Katherine S. Pollard, Adam Siepel
(Submitted on 9 Mar 2013)

GC-biased gene conversion (gBGC) is a recombination-associated process that favors the fixation of G/C alleles over A/T alleles. In mammals, gBGC is hypothesized to contribute to variation in GC content, rapidly evolving sequences, and the fixation of deleterious mutations, but its prevalence and general functional consequences remain poorly understood. gBGC is difficult to incorporate into models of molecular evolution and so far has primarily been studied using summary statistics from genomic comparisons. Here, we introduce a new probabilistic model that captures the joint effects of natural selection and gBGC on nucleotide substitution patterns, while allowing for correlations along the genome in these effects. We implemented our model in a computer program, called phastBias, that can accurately detect gBGC tracts ~1 kilobase or longer in simulated sequence alignments. When applied to real primate genome sequences, phastBias predicts gBGC tracts that cover roughly 0.3% of the human and chimpanzee genomes and account for 1.2% of human-chimpanzee nucleotide differences. These tracts fall in clusters, particularly in subtelomeric regions; they are enriched for recombination hotspots and fast-evolving sequences; and they display an ongoing fixation preference for G and C alleles. We also find some evidence that they contribute to the fixation of deleterious alleles, including an enrichment for disease-associated polymorphisms. These tracts provide a unique window into historical recombination processes along the human and chimpanzee lineages; they supply additional evidence of long-term conservation of megabase-scale recombination rates accompanied by rapid turnover of hotspots. Together, these findings shed new light on the evolutionary, functional, and disease implications of gBGC. The phastBias program and our predicted tracts are freely available.
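
As a rough illustration of how gBGC "favors the fixation of G/C alleles over A/T alleles", the sketch below uses the standard population-genetic approximation in which a conversion bias acts like weak directional selection. The symbol `B` (a population-scaled bias, for example 4*Ne*b in a diploid population) and the function names are assumptions made for this example; phastBias's own parameterization may differ.

```python
import math

def relative_fixation_rate(B):
    """Fixation rate relative to a neutral allele for scaled advantage B.

    Standard weak-selection approximation: B / (1 - exp(-B)).
    B is an assumed population-scaled bias (e.g. 4*Ne*b for diploids).
    """
    if abs(B) < 1e-9:
        return 1.0  # neutral limit of the expression
    return B / (1.0 - math.exp(-B))

def gbgc_rates(B):
    """Relative fixation rates for weak-to-strong (A/T -> G/C) and
    strong-to-weak (G/C -> A/T) mutations under a gBGC bias B."""
    weak_to_strong = relative_fixation_rate(B)
    strong_to_weak = relative_fixation_rate(-B)
    return weak_to_strong, strong_to_weak

for B in (0.0, 1.0, 3.0):
    ws, sw = gbgc_rates(B)
    print(B, ws, sw)
```

In this approximation the ratio of the two rates is exp(B), so even a modest bias tilts substitutions toward G and C regardless of whether the favored allele is beneficial, neutral, or mildly deleterious.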

Deleterious synonymous mutations hitchhike to high frequency in HIV-1 env evolution

Fabio Zanini, Richard A. Neher
(Submitted on 4 Mar 2013)

Intrapatient HIV-1 evolution is dominated by selection on the protein level in the arms race with the adaptive immune system. When cytotoxic CD8+ T-cells or neutralizing antibodies target a new epitope, the virus often escapes via nonsynonymous mutations that impair recognition. Synonymous mutations do not affect this interplay and are often assumed to be neutral. We analyze longitudinal intrapatient data from the C2-V5 part of the envelope gene (env) and observe that synonymous derived alleles rarely fix even though they often reach high frequencies in the viral population. We find that synonymous mutations that disrupt base pairs in RNA stems flanking the variable loops of gp120 are more likely to be lost than other synonymous changes, hinting at a direct fitness effect of these stem-loop structures in the HIV-1 RNA. Computational modeling indicates that these synonymous mutations have a (Malthusian) selection coefficient of the order of -0.002 and that they are brought up to high frequency by hitchhiking on neighboring beneficial nonsynonymous alleles. The patterns of fixation of nonsynonymous mutations estimated from the longitudinal data and comparisons with computer models suggest that escape mutations in C2-V5 are only transiently beneficial, either because the immune system is catching up or because of competition between equivalent escapes.

Our paper: Epistasis not needed to explain low dN/dS

This guest post is by Joshua Plotkin on his group’s paper McCandlish et al. Epistasis not needed to explain low dN/dS arXived here.

Our lab has recently begun to post research pre-prints on arXiv. All members of the group enthusiastically support this trend, both within our own group and within the broader scientific community. The merits of sharing pre-prints have been described elsewhere. The benefits of pre-prints are so immediately apparent, I feel, that there is no need to add further verses to the praises that have already been sung.

Recently, however, my research group and I faced an unusual and difficult question: whether we should post a pre-print that does not describe primary research, but rather is a critique of a recent paper published by another group – a paper on the role of epistasis in molecular evolution from the group led by Fyodor Kondrashov. My group and I have never before written such a commentary; and so I faced this choice with some uncertainty. Here are some thoughts on our group’s decision to write the commentary and to post it to arXiv.

Kondrashov’s group is at the vanguard of contemporary research in molecular evolution. In this particular paper from his group, Breen et al. contend that epistasis is “pervasive throughout protein evolution”; a view that I mostly support and indeed have expressed, in a more limited scope, in several publications and commentaries (e.g. here, here, and here). However, in discussing the paper by Breen et al. over lunch, our research group came to the consensus that their argument is logically flawed. Breen et al. reached their conclusion because the dN/dS values observed in some genes are much lower than their expectation in the absence of epistasis. But when calculating the expected dN/dS ratio in the absence of epistasis, Breen et al. assumed that all amino acids observed in a protein alignment at any particular position have equal fitness. This assumption is unrealistic because, simply, some amino acids may be more fit than others. When we relaxed this unrealistic assumption, we found that the observed dN/dS values and the observed patterns of amino acid diversity at each site are perfectly consistent with a non-epistatic model of protein evolution, for all the nuclear and chloroplast genes in the Breen et al. dataset (but, interestingly, not for their mitochondrial genes).

In an ideal world, scientific disagreements would be resolved by straightforward transactions based solely on logic and data. But in reality, such disagreements inevitably involve intellectual biases, not to mention personalities, politics, reputations, et cetera. In fact, we (my research group and I) are colleagues and admirers of Kondrashov and his comrades (these two papers of his are among our favorites). Why risk our collegiality by publishing a critique on arXiv?

The answer is two-fold. First, we are passionate about understanding molecular evolution, both as individuals and within the context of a scientific community, and we believe this exchange will advance that understanding. Second, we have had extensive email correspondence with Fedya about the scientific issues at hand. This correspondence has been completely open and straightforward: we have shared our computer code so that Fedya can reproduce our analyses; and Fedya has agreed with our critique, in principle, although he has some reservations and may appreciate subtleties of his data that we do not. In any case, I feel that the scientific exchange has been honest, and it will hopefully avoid the snark that sometimes accompanies such disagreements, and focus instead on the scientific issues at stake.

I wish to thank Graham Coop for inviting me to contribute to Haldane’s Sieve. And thanks of course to my co-authors, including our own fearless leader, David McCandlish.

—Joshua B. Plotkin

N.B.: This blog post is meant as an exchange among scientific colleagues, and not as an advertisement to the media.

Epistasis not needed to explain low dN/dS

In Response to “Epistasis as the primary factor in molecular evolution” by Breen et al. Nature 490, 535-538 (2012)
David M. McCandlish, Etienne Rajon, Premal Shah, Yang Ding, Joshua B. Plotkin
(Submitted on 20 Dec 2012)

An important question in molecular evolution is whether an amino acid that occurs at a given position makes an independent contribution to fitness, or whether its effect depends on the state of other loci in the organism’s genome, a phenomenon known as epistasis. In a recent letter to Nature, Breen et al. (2012) argued that epistasis must be “pervasive throughout protein evolution” because the observed ratio between the per-site rates of non-synonymous and synonymous substitutions (dN/dS) is much lower than would be expected in the absence of epistasis. However, when calculating the expected dN/dS ratio in the absence of epistasis, Breen et al. assumed that all amino acids observed in a protein alignment at any particular position have equal fitness. Here, we relax this unrealistic assumption and show that any dN/dS value can in principle be achieved at a site, without epistasis. Furthermore, for all nuclear and chloroplast genes in the Breen et al. dataset, we show that the observed dN/dS values and the observed patterns of amino acid diversity at each site are jointly consistent with a non-epistatic model of protein evolution.
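
The core of the argument can be illustrated with a toy calculation. The sketch below is not the analysis from the paper. It assumes a single site in the weak-mutation regime, equal mutation rates between all amino acids, and fixed, site-specific scaled fitnesses with no epistasis; the function names and the example fitness values are hypothetical.

```python
import math

def fixation_factor(S):
    """Fixation rate relative to neutral for a scaled fitness difference S.

    Standard weak-selection approximation; intended for moderate |S|.
    """
    if abs(S) < 1e-9:
        return 1.0  # neutral limit
    return S / (1.0 - math.exp(-S))

def expected_dnds(scaled_fitness):
    """Expected nonsynonymous rate relative to neutral at one site.

    Assumptions (for illustration only): symmetric mutation between all
    listed amino acids, so the stationary frequency of amino acid i is
    proportional to exp(F_i); no dependence on any other site.
    """
    n = len(scaled_fitness)
    if n < 2:
        raise ValueError("need at least two amino acids")
    top = max(scaled_fitness)
    weights = [math.exp(F - top) for F in scaled_fitness]
    total = sum(weights)
    omega = 0.0
    for i, Fi in enumerate(scaled_fitness):
        pi_i = weights[i] / total
        # Sum of relative fixation rates to every other amino acid.
        rate_out = sum(fixation_factor(Fj - Fi)
                       for j, Fj in enumerate(scaled_fitness) if j != i)
        # Divide by (n - 1) so that equal fitnesses give exactly 1.
        omega += pi_i * rate_out / (n - 1)
    return omega

print(expected_dnds([0.0, 0.0, 0.0]))    # equal fitness
print(expected_dnds([0.0, -2.0, -4.0]))  # unequal fitness, no epistasis
print(expected_dnds([0.0, -4.0, -8.0]))  # stronger differences
```

With equal fitnesses the ratio is 1, the value that an equal-fitness calculation would take as the non-epistatic expectation for these amino acids. As the fitness differences grow, the ratio in this toy model falls toward zero even though every amino acid is still permitted at the site and no epistasis is involved.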