The impact of macroscopic epistasis on long-term evolutionary dynamics

Benjamin H. Good, Michael M. Desai
(Submitted on 18 Aug 2014)

Genetic interactions can strongly influence the fitness effects of individual mutations, yet the impact of these epistatic interactions on evolutionary dynamics remains poorly understood. Here we investigate the evolutionary role of epistasis over 50,000 generations in a well-studied laboratory evolution experiment in E. coli. The extensive duration of this experiment provides a unique window into the effects of epistasis during long-term adaptation to a constant environment. Guided by analytical results in the weak-mutation limit, we develop a computational framework to assess the compatibility of a given epistatic model with the observed patterns of fitness gain and mutation accumulation through time. We find that the average fitness trajectory alone provides little power to distinguish between competing models, including those that lack any direct epistatic interactions between mutations. However, when combined with the mutation trajectory, these observables place strong constraints on the set of possible models of epistasis, ruling out most existing explanations of the data. Instead, we find the strongest support for a “two-epoch” model of adaptation, in which an initial burst of diminishing returns epistasis is followed by a steady accumulation of mutations under a constant distribution of fitness effects. Our results highlight the need for additional DNA sequencing of these populations, as well as for more sophisticated models of epistasis that are compatible with all of the experimental data.

CloudSTRUCTURE: infer population STRUCTURE on the cloud

Liya Wang, Doreen Ware
(Submitted on 18 Aug 2014)

We present CloudSTRUCTURE, an application for running parallel analyses with the population genetics program STRUCTURE. The HPC-ready application, powered by iPlant cyber-infrastructure, provides a fast (by parallelization) and convenient (through a user-friendly GUI) way to calculate likelihood values across multiple values of K (number of genetic groups) and numbers of iterations. The results are automatically summarized for easier determination of the K value that best fits the data. In addition, CloudSTRUCTURE will reformat STRUCTURE output for use in downstream programs, such as TASSEL for association analysis with population structure effects stratified.
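
As a rough illustration of what summarizing likelihood values across K can look like, the sketch below averages log-likelihoods over replicate runs for each K and reports the K with the highest mean. This is a simple heuristic chosen for illustration, not a description of how CloudSTRUCTURE itself summarizes results; the function names, the use of replicate runs, and the toy values are all assumptions.

```python
from statistics import mean, stdev


def summarize_by_k(log_likelihoods):
    """Map each K to (mean, sample standard deviation) of its log-likelihoods.

    `log_likelihoods` is a hypothetical dict: K -> list of log-likelihood
    values, one per replicate run at that K.
    """
    summary = {}
    for k, values in log_likelihoods.items():
        # stdev needs at least two values; report 0.0 for a single run.
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[k] = (mean(values), spread)
    return summary


def best_k_by_mean(summary):
    """Return the K whose mean log-likelihood is highest (one simple heuristic)."""
    return max(summary, key=lambda k: summary[k][0])


# Toy values invented for illustration; they are not real STRUCTURE output.
toy_runs = {
    1: [-5200.0, -5210.0, -5190.0],
    2: [-4800.0, -4820.0, -4810.0],
    3: [-4700.0, -4690.0, -4710.0],
    4: [-4750.0, -4900.0, -4720.0],
}

summary = summarize_by_k(toy_runs)
for k in sorted(summary):
    avg, spread = summary[k]
    print(f"K={k}: mean={avg:.1f}, sd={spread:.1f}")
print("K with highest mean log-likelihood:", best_k_by_mean(summary))
```

Reporting the spread alongside the mean matters because a K whose runs disagree strongly (K=4 in the toy data) is a weaker candidate than its mean alone suggests.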

Genome sequencing of the perciform fish Larimichthys crocea provides insights into stress adaptation

Jingqun Ao, Yinnan Mu, Li-Xin Xiang, DingDing Fan, MingJi Feng, Shicui Zhang, Qiong Shi, Lv-Yun Zhu, Ting Li, Yang Ding, Li Nie, Qiuhua Li, Wei-ren Dong, Liang Jiang, Bing Sun, XinHui Zhang, Mingyu Li, Hai-Qi Zhang, ShangBo Xie, YaBing Zhu, XuanTing Jiang, Xianhui Wang, Pengfei Mu, Wei Chen, Zhen Yue, Zhuo Wang, Jun Wang, Jian-Zhong Shao, Xinhua Chen
doi: http://dx.doi.org/10.1101/008136

The large yellow croaker Larimichthys crocea (L. crocea) is one of the most economically important marine fish in China and East Asian countries. It also exhibits peculiar behavioral and physiological characteristics, notably a high sensitivity to various environmental stresses such as hypoxia and air exposure. These traits may render L. crocea a good model for investigating the response mechanisms to environmental stress. To understand the molecular and genetic mechanisms underlying the adaptation and response of L. crocea to environmental stress, we sequenced and assembled the genome of L. crocea using a bacterial artificial chromosome and whole-genome shotgun hierarchical strategy. The final genome assembly was 679 Mb, with a contig N50 of 63.11 kb and a scaffold N50 of 1.03 Mb, containing 25,401 protein-coding genes. Gene families underlying adaptive behaviours, such as vision-related crystallins, olfactory receptors, and auditory sense-related genes, were significantly expanded in the genome of L. crocea relative to those of other vertebrates. Transcriptome analyses of the hypoxia-exposed L. crocea brain revealed new aspects of neuro-endocrine-immune/metabolism regulatory networks that may help the fish to avoid cerebral inflammatory injury and maintain energy balance under hypoxia. Proteomics data demonstrate that skin mucus of the air-exposed L. crocea had a complex composition, with an unexpectedly high number of proteins (3,209), suggesting multiple protective mechanisms involving antioxidant functions, oxygen transport, immune defence, and osmotic and ionic regulation. Our results provide novel insights into the mechanisms of fish adaptation and response to hypoxia and air exposure.
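
For readers unfamiliar with the N50 statistic quoted above, here is a minimal sketch of the standard definition: sort the sequence lengths from longest to shortest and return the length at which the running total first reaches half of the assembled bases. The function name and the toy contig lengths are illustrative assumptions and have nothing to do with the L. crocea assembly itself.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths in kb (hypothetical values, total 300 kb).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half of 300, so this prints 70
```

The same function applies to contigs and scaffolds alike; only the input lengths differ.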

Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Sang Hoon Lee, Robyn Ffrancon, Daniel M. Abrams, Beom Jun Kim, Mason A. Porter
doi: http://dx.doi.org/10.1101/000257

The study of human mobility is both of fundamental importance and of great potential value. For example, it can be leveraged to facilitate efficient city planning and improve prevention strategies when faced with epidemics. The newfound wealth of rich data sources (including banknote flows, mobile phone records, and transportation data) has led to an explosion of attempts to characterize modern human mobility. Unfortunately, the dearth of comparable historical data makes it much more difficult to study human mobility patterns from the past. In this paper, we present such an analysis: we demonstrate that the data record from Korean family books (called “jokbo”) can be used to estimate migration patterns via marriages from the past 750 years. We apply two generative models of long-term human mobility to quantify the relevance of geographical information to human marriage records in the data, and we find that the wide variety in the geographical distributions of the clans poses interesting challenges for the direct application of these models. Using the different geographical distributions of clans, we quantify the “ergodicity” of clans in terms of how widely and uniformly they have spread across Korea, and we compare these results to those obtained using surname data from the Czech Republic. To examine population flow in more detail, we also construct and examine a population-flow network between regions. Based on the correlation between ergodicity and migration patterns in Korea, we identify two different types of migration patterns: diffusive and convective. We expect the analysis of diffusive versus convective effects in population flows to be widely applicable to the study of mobility and migration patterns across different cultures.

On the genetic architecture of intelligence and other quantitative traits

Stephen D.H. Hsu
(Submitted on 14 Aug 2014)

How do genes affect cognitive ability or other human quantitative traits such as height or disease risk? Progress on this challenging question is likely to be significant in the near future. I begin with a brief review of psychometric measurements of intelligence, introducing the idea of a “general factor” or g score. The main results concern the stability, validity (predictive power), and heritability of adult g. The largest component of genetic variance for both height and intelligence is additive (linear), leading to important simplifications in predictive modeling and statistical estimation. Due mainly to the rapidly decreasing cost of genotyping, it is possible that within the coming decade researchers will identify loci which account for a significant fraction of total g variation. In the case of height, analogous efforts are well under way. I describe some unpublished results concerning the genetic architecture of height and cognitive ability, which suggest that roughly 10k moderately rare causal variants of mostly negative effect are responsible for normal population variation. Using results from Compressed Sensing (L1-penalized regression), I estimate the statistical power required to characterize both linear and nonlinear models for quantitative traits. The main unknown parameter s (sparsity) is the number of loci which account for the bulk of the genetic variation. The required sample size is of order 100s (that is, 100 times the sparsity s), or roughly a million in the case of cognitive ability.
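
The closing estimate can be checked with simple arithmetic: taking the sparsity s to be the roughly 10k causal variants mentioned in the abstract, a sample size of order 100 times s comes to about one million. The sketch below only reproduces that multiplication. The function name is hypothetical, and the factor of 100 is taken from the abstract as an order-of-magnitude constant, not a precise bound.

```python
def required_sample_size(sparsity, factor=100):
    """Order-of-magnitude sample size, assuming n is about factor * sparsity.

    `factor=100` mirrors the "of order 100s" statement in the abstract;
    it is a rough constant, not an exact requirement.
    """
    if sparsity <= 0:
        raise ValueError("sparsity must be positive")
    return factor * sparsity


# s of roughly 10k causal variants, as stated in the abstract.
s = 10_000
print(required_sample_size(s))  # 100 * 10,000 = 1,000,000
```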

Seasonality in the migration and establishment of H3N2 Influenza lineages with epidemic growth and decline

Daniel Zinder, Trevor Bedford, Edward B. Baskerville, Robert J. Woods, Manojit Roy, Mercedes Pascual
(Submitted on 15 Aug 2014)

Background: Influenza A/H3N2 has been circulating in humans since 1968, causing considerable morbidity and mortality. Although H3N2 incidence is highly seasonal, how such seasonality contributes to global phylogeographic migration dynamics has not yet been established. In this study, we incorporate time-varying migration rates in a Bayesian MCMC framework, focusing initially on migration within China, and to and from North America, as case studies, and later on global communities.
Results: Both global migration and migration between and within large geographic regions are clearly seasonal. On a global level, windows of immigration (in-migration) map to the seasonal timing of epidemic spread, while windows of emigration (out-migration) map to epidemic decline. Seasonal patterns also affect the probability that local lineages go extinct and fail to contribute to long-term viral evolution. The probability that a region will contribute to long-term viral evolution as a part of the trunk of the phylogenetic tree increases in the absence of deep troughs and with reduced incidence variability.
Conclusions: Seasonal migration and rapid turnover within regions are sustained by the invasion of ‘fertile epidemic grounds’ at the end of older epidemics. Thus, the current emphasis on connectivity, including air travel, should be complemented with a better understanding of the conditions and timing required for successful establishment. This will better our understanding of seasonal drivers, improve predictions, and improve vaccine updating by identifying strains that not only escape immunity but also have the seasonal opportunity to establish and spread. Further work is also needed on additional conditions that contribute to the persistence and long-term evolution of influenza within the human population, such as spatial heterogeneity with respect to climate and seasonality.

Understanding Admixture Fractions

Mason Liang, Rasmus Nielsen
doi: http://dx.doi.org/10.1101/008078

Estimation of admixture fractions has become one of the most commonly used computational tools in population genomics. However, there is remarkably little population genetic theory on their statistical properties. We develop theoretical results that can accurately predict means and variances of admixture proportions within a population using models with recombination and genetic drift. Based on established theory on measures of multilocus disequilibrium, we show that there is a set of recurrence relations that can be used to derive expectations for higher moments of the admixture fraction distribution. We obtain closed-form solutions for some special cases. Using these results, we develop a method for estimating admixture parameters from estimated admixture proportions obtained from programs such as Structure or Admixture. We apply this method to HapMap data and find that the population history of African Americans, as expected, is not best explained by a single admixture event between people of European and African ancestry. A model of constant gene flow for the past 11 generations until 2 generations ago gives a better fit.