Coordinated Evolution of Influenza A Surface Proteins

Alexey D. Neverov, Sergey Kryazhimskiy, Joshua B. Plotkin, Georgii A. Bazykin
doi: http://dx.doi.org/10.1101/008235

Surface proteins hemagglutinin (HA) and neuraminidase (NA) of the human influenza A virus evolve under selection pressure to escape the human adaptive immune response and antiviral drug treatments. In addition to these external selection pressures, some mutations in HA are known to affect the adaptive landscape of NA, and vice versa, because these two proteins are physiologically interlinked. However, the extent to which evolution of one protein affects the evolution of the other is unknown. Here we develop a novel phylogenetic method for detecting the signatures of such genetic interactions between mutations in different genes, that is, inter-gene epistasis. Using this method, we show that influenza surface proteins evolve in a coordinated way, with substitutions in HA affecting substitutions in NA and vice versa, at many sites. Of particular interest is our finding that the oseltamivir-resistance mutations in NA in subtype H1N1 were likely facilitated by prior mutations in HA. Our results illustrate that the adaptive landscape of a viral protein is remarkably sensitive to its genomic context and, more generally, imply that the evolution of any single protein must be understood within the context of the entire evolving genome.

Protein folding and binding can emerge as evolutionary spandrels through structural coupling

Michael Manhart, Alexandre V Morozov
doi: http://dx.doi.org/10.1101/008250

Binding interactions between proteins and other molecules mediate numerous cellular processes, including metabolism, signaling, and regulation of gene expression. These interactions evolve in response to changes in the protein’s chemical or physical environment (such as the addition of an antibiotic), or when genes duplicate and diverge. Several recent studies have shown the importance of folding stability in constraining protein evolution. Here we investigate how structural coupling between protein folding and binding — the fact that most proteins can only bind their targets when folded — gives rise to evolutionary coupling between the traits of folding stability and binding strength. Using biophysical and evolutionary modeling, we show how these protein traits can emerge as evolutionary “spandrels” even if they do not confer an intrinsic fitness advantage. In particular, proteins can evolve strong binding interactions that have no functional role but merely serve to stabilize the protein if misfolding is deleterious. Furthermore, such proteins may have divergent fates, evolving to bind or not bind their targets depending on random mutation events. These observations may explain the abundance of apparently nonfunctional interactions among proteins observed in high-throughput assays. In contrast, for proteins with both functional binding and deleterious misfolding, evolution may be highly predictable at the level of biophysical traits: adaptive paths are tightly constrained to first gain extra folding stability and then partially lose it as the new binding function is developed. These findings have important consequences for our understanding of fundamental evolutionary principles of both natural and engineered proteins.

Genome-wide Comparative Analysis Reveals Possible Common Ancestors of NBS Domain Containing Genes in Hybrid Citrus sinensis Genome and Original Citrus clementina Genome

Yunsheng Wang, Lijuan Zhou, Dazhi Li, Amy Lawton-Rauh, Pradip K. Srimani, Liangying Dai, Yongping Duan, Feng Luo
doi: http://dx.doi.org/10.1101/008219

Background: Recently available whole-genome sequences of three citrus genomes, one Citrus clementina and two Citrus sinensis, have made it possible to understand the features of candidate disease resistance genes with a nucleotide-binding site (NBS) domain in Citrus and how NBS genes differ between hybrid and original Citrus species. Results: We identified and re-annotated NBS genes from the three citrus genomes and found similar numbers of NBS genes in each. Phylogenetic analysis of all citrus NBS genes across the three genomes showed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different groups contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found that most clades include NBS genes from all three citrus genomes. This suggests that NBS genes in the three citrus genomes may come from shared ancestral origins. We also mapped the re-sequenced reads of three pomelo and three mandarin orange genomes onto the Citrus sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both the pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin may explain why the NBS genes in their hybrid Citrus sinensis are similar to those in Citrus clementina in this study. Furthermore, sequence variation among citrus NBS genes was shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different citrus genomes. Conclusions: Our comparative analyses yield valuable insight into the structure, evolution and organization of NBS genes in Citrus genomes. Citrus genomes contain significantly more NBS genes than other plant species. NBS genes in the hybrid C. sinensis genomes are very similar to those in the progenitor C. clementina genome, and they may be derived from common ancestral gene copies. Furthermore, our comprehensive analysis showed that there are three groups of plant NBS genes, with CC-containing NBS genes divided into two groups.

The genomic landscape of polymorphic human nuclear mitochondrial insertions

Gargi Dayama, Sarah B Emery, Jeffrey M Kidd, Ryan E Mills
doi: http://dx.doi.org/10.1101/008144

The transfer of mitochondrial genetic material into the nuclear genomes of eukaryotes is a well-established phenomenon. Many studies over the past decade have utilized reference genome sequences of numerous species to characterize the prevalence and contribution of nuclear mitochondrial insertions to human diseases. The recent advancement of high throughput sequencing technologies has enabled the interrogation of genomic variation at a much finer scale, and now allows for an exploration into the diversity of polymorphic nuclear mitochondrial insertions (NumtS) in human populations. We have developed an approach to discover and genotype previously undiscovered Numt insertions using whole genome, paired-end sequencing data. We have applied this method to almost a thousand individuals in twenty populations from the 1000 Genomes Project and other data sets and identified 138 novel sites of Numt insertions, extending our current knowledge of existing Numt locations in the human genome by almost 20%. Most of the newly identified NumtS were found in less than 1% of the samples we examined, suggesting that they occur infrequently in nature or have been rapidly removed by purifying selection. We find that recent Numt insertions are derived from throughout the mitochondrial genome, including the D-loop, and have integration biases consistent with previous studies on older, fixed NumtS in the reference genome. We have further determined the complete inserted sequence for a subset of these events to define their age and origin of insertion as well as their potential impact on studies of mitochondrial heteroplasmy.

Exact solutions for the selection-mutation equilibrium in the Crow-Kimura evolutionary model

Yuri S. Semenov, Artem S. Novozhilov
(Submitted on 19 Aug 2014)

We reformulate the eigenvalue problem for the selection–mutation equilibrium distribution in the case of a haploid asexually reproduced population in the form of an equation for an unknown probability generating function of this distribution. The special form of this equation in the infinite sequence limit allows us to obtain analytically the steady state distributions for a number of particular cases of the fitness landscape. The general approach is illustrated by examples and theoretical findings are compared with numerical calculations.
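For readers who want a feel for the kind of numerical calculation such results can be checked against, here is a minimal sketch. It is not the authors' generating-function method. It assumes the commonly used Hamming-class form of the Crow-Kimura model, in which a sequence of length N is described only by its number k of mutated sites, with Malthusian fitness r_k and per-site mutation rate mu, and it finds the equilibrium distribution by plain power iteration. The function name and all parameter values are hypothetical.

```python
from math import comb


def crow_kimura_equilibrium(fitness, mu, iterations=5000):
    """Equilibrium class frequencies p_0..p_N (assumed Hamming-class model).

    fitness[k] is the Malthusian fitness r_k of class k, and mu is the
    per-site mutation rate. The assumed dynamics are
        dp_k/dt = (r_k - mean_r) p_k
                  + mu * ((N-k+1) p_{k-1} + (k+1) p_{k+1} - N p_k),
    so the equilibrium is the leading eigenvector of the selection plus
    mutation matrix. A constant shift keeps every diagonal entry positive,
    which lets simple power iteration converge to that eigenvector.
    """
    n = len(fitness) - 1
    shift = mu * n - min(fitness) + 1.0
    p = [1.0 / (n + 1)] * (n + 1)  # start from a uniform distribution
    for _ in range(iterations):
        q = []
        for k in range(n + 1):
            value = (fitness[k] - mu * n + shift) * p[k]
            if k > 0:
                value += mu * (n - k + 1) * p[k - 1]  # mutation from class k-1
            if k < n:
                value += mu * (k + 1) * p[k + 1]      # mutation from class k+1
            q.append(value)
        total = sum(q)
        p = [v / total for v in q]  # renormalise to a probability vector
    return p


# Illustrative values only: a flat landscape and a single-peak landscape.
flat = crow_kimura_equilibrium([0.0] * 11, mu=1.0)
peak = crow_kimura_equilibrium([1.0] + [0.0] * 10, mu=1.0)
print("flat landscape:", [round(x, 4) for x in flat])
print("single peak:   ", [round(x, 4) for x in peak])
```

On a flat landscape the model reduces to pure mutation, so the result should match the binomial distribution with parameters N and 1/2, which makes a convenient sanity check for the sketch.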

An amino acid polymorphism in the Drosophila insulin receptor demonstrates pleiotropic and adaptive function in life history traits

Annalise B. Paaby, Alan O. Bergland, Emily L. Behrman, Paul S. Schmidt
doi: http://dx.doi.org/10.1101/008193

Finding the specific nucleotides that underlie adaptive variation is a major goal in evolutionary biology, but polygenic traits pose a challenge because the complex genotype-phenotype relationship can obscure the effects of individual alleles. However, natural selection working in large wild populations can shift allele frequencies and indicate functional regions of the genome. Previously, we showed that the two most common alleles of a complex amino acid insertion-deletion polymorphism in the Drosophila insulin receptor show independent, parallel clines in frequency across the North American and Australian continents. Here, we report that the cline is stable over at least a five-year period and that the polymorphism also demonstrates temporal shifts in allele frequency concurrent with seasonal change. We tested the alleles for effects on levels of insulin signaling, fecundity, development time, body size, stress tolerance, and lifespan. We find that the alleles are associated with predictable differences in these traits, consistent with patterns of Drosophila life history variation across geography that likely reflect adaptation to the heterogeneous climatic environment. These results implicate insulin signaling as a major mediator of life history adaptation in Drosophila, and suggest that life history tradeoffs can be explained by extensive pleiotropy at a single locus.

The impact of macroscopic epistasis on long-term evolutionary dynamics

Benjamin H. Good, Michael M. Desai
(Submitted on 18 Aug 2014)

Genetic interactions can strongly influence the fitness effects of individual mutations, yet the impact of these epistatic interactions on evolutionary dynamics remains poorly understood. Here we investigate the evolutionary role of epistasis over 50,000 generations in a well-studied laboratory evolution experiment in E. coli. The extensive duration of this experiment provides a unique window into the effects of epistasis during long-term adaptation to a constant environment. Guided by analytical results in the weak-mutation limit, we develop a computational framework to assess the compatibility of a given epistatic model with the observed patterns of fitness gain and mutation accumulation through time. We find that the average fitness trajectory alone provides little power to distinguish between competing models, including those that lack any direct epistatic interactions between mutations. However, when combined with the mutation trajectory, these observables place strong constraints on the set of possible models of epistasis, ruling out most existing explanations of the data. Instead, we find the strongest support for a “two-epoch” model of adaptation, in which an initial burst of diminishing returns epistasis is followed by a steady accumulation of mutations under a constant distribution of fitness effects. Our results highlight the need for additional DNA sequencing of these populations, as well as for more sophisticated models of epistasis that are compatible with all of the experimental data.

CloudSTRUCTURE: infer population STRUCTURE on the cloud

Liya Wang, Doreen Ware
(Submitted on 18 Aug 2014)

We present CloudSTRUCTURE, an application for running parallel analyses with the population genetics program STRUCTURE. The HPC-ready application, powered by iPlant cyber-infrastructure, provides a fast (by parallelization) and convenient (through a user-friendly GUI) way to calculate likelihood values across multiple values of K (number of genetic groups) and numbers of iterations. The results are automatically summarized for easier determination of the K value that best fits the data. In addition, CloudSTRUCTURE will reformat STRUCTURE output for use in downstream programs, such as TASSEL for association analysis with population structure effects stratified.

Genome sequencing of the perciform fish Larimichthys crocea provides insights into stress adaptation

Jingqun Ao, Yinnan Mu, Li-Xin Xiang, DingDing Fan, MingJi Feng, Shicui Zhang, Qiong Shi, Lv-Yun Zhu, Ting Li, Yang Ding, Li Nie, Qiuhua Li, Wei-ren Dong, Liang Jiang, Bing Sun, XinHui Zhang, Mingyu Li, Hai-Qi Zhang, ShangBo Xie, YaBing Zhu, XuanTing Jiang, Xianhui Wang, Pengfei Mu, Wei Chen, Zhen Yue, Zhuo Wang, Jun Wang, Jian-Zhong Shao, Xinhua Chen
doi: http://dx.doi.org/10.1101/008136

The large yellow croaker Larimichthys crocea (L. crocea) is one of the most economically important marine fish in China and East Asian countries. It also exhibits peculiar behavioral and physiological characteristics, in particular sensitivity to various environmental stresses such as hypoxia and air exposure. These traits may render L. crocea a good model for investigating the response mechanisms to environmental stress. To understand the molecular and genetic mechanisms underlying the adaptation and response of L. crocea to environmental stress, we sequenced and assembled the genome of L. crocea using a bacterial artificial chromosome and whole-genome shotgun hierarchical strategy. The final genome assembly was 679 Mb, with a contig N50 of 63.11 kb and a scaffold N50 of 1.03 Mb, containing 25,401 protein-coding genes. Gene families underlying adaptive behaviours, such as vision-related crystallins, olfactory receptors, and auditory sense-related genes, were significantly expanded in the genome of L. crocea relative to those of other vertebrates. Transcriptome analyses of the hypoxia-exposed L. crocea brain revealed new aspects of neuro-endocrine-immune/metabolism regulatory networks that may help the fish to avoid cerebral inflammatory injury and maintain energy balance under hypoxia. Proteomics data demonstrate that the skin mucus of air-exposed L. crocea had a complex composition, with an unexpectedly high number of proteins (3,209), suggesting multiple protective mechanisms involving antioxidant functions, oxygen transport, immune defence, and osmotic and ionic regulation. Our results provide novel insights into the mechanisms of fish adaptation and response to hypoxia and air exposure.
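The contig and scaffold N50 values quoted above follow the standard definition: the length L such that sequences of length L or longer together make up at least half of the total assembly. A minimal sketch of that calculation is below. The function name and the toy lengths are made up for illustration and are not taken from the L. crocea assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest, and the N50 is the
    length of the sequence at which the running total first reaches at
    least half of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70: 80 + 70 = 150 is half of the total 300
```

Because N50 is weighted by length, a handful of long scaffolds can raise it substantially, which is one reason the scaffold N50 reported above is much larger than the contig N50.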

Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Sang Hoon Lee, Robyn Ffrancon, Daniel M. Abrams, Beom Jun Kim, Mason A. Porter
doi: http://dx.doi.org/10.1101/000257

The study of human mobility is both of fundamental importance and of great potential value. For example, it can be leveraged to facilitate efficient city planning and improve prevention strategies when faced with epidemics. The newfound wealth of rich sources of data (including banknote flows, mobile phone records, and transportation data) has led to an explosion of attempts to characterize modern human mobility. Unfortunately, the dearth of comparable historical data makes it much more difficult to study human mobility patterns from the past. In this paper, we present such an analysis: we demonstrate that the data record from Korean family books (called “jokbo”) can be used to estimate migration patterns via marriages from the past 750 years. We apply two generative models of long-term human mobility to quantify the relevance of geographical information to human marriage records in the data, and we find that the wide variety in the geographical distributions of the clans poses interesting challenges for the direct application of these models. Using the different geographical distributions of clans, we quantify the “ergodicity” of clans in terms of how widely and uniformly they have spread across Korea, and we compare these results to those obtained using surname data from the Czech Republic. To examine population flow in more detail, we also construct and examine a population-flow network between regions. Based on the correlation between ergodicity and migration patterns in Korea, we identify two different types of migration patterns: diffusive and convective. We expect the analysis of diffusive versus convective effects in population flows to be widely applicable to the study of mobility and migration patterns across different cultures.