Epigenetic Modifications are Associated with Inter-species Gene Expression Variation in Primates

Xiang Zhou, Carolyn Cain, Marsha Myrthil, Noah Lewellen, Katelyn Michelini, Emily Davenport, Matthew Stephens, Jonathan Pritchard, Yoav Gilad

Changes in gene regulation have long been thought to play an important role in evolution and speciation, especially in primates. Over the past decade, comparative genomic studies have revealed extensive inter-species differences in gene expression levels, yet we know much less about the extent to which regulatory mechanisms differ between species. To begin addressing this gap, we performed a comparative epigenetic study in primate lymphoblastoid cell lines (LCLs) to query the contribution of RNA polymerase II (Pol II) and four histone modifications (H3K4me1, H3K4me3, H3K27ac, and H3K27me3) to inter-species variation in gene expression levels. We found that inter-species differences in mark enrichment near transcription start sites are associated with inter-species differences in the corresponding gene expression level significantly more often than expected by chance alone. Interestingly, we also found that first-order interactions among the histone marks and Pol II do not markedly contribute to the degree of association between the marks and inter-species variation in gene expression levels, suggesting that the marginal effects of the five marks dominate this contribution.

The Role of Migration in the Evolution of Phenotypic Switching

Oana Carja, Robert E Furrow, Marc W Feldman

Stochastic switching is an example of phenotypic bet-hedging, where an individual can switch between different phenotypic states in a fluctuating environment. Although the evolution of stochastic switching has been studied when the environment varies temporally, there has been little theoretical work on the evolution of phenotypic switching under both spatially and temporally fluctuating selection pressures. Here we use a population genetic model to explore the interaction of temporal and spatial variation in the evolutionary dynamics of phenotypic switching. We find that spatial variation in selection is important; when selection pressures are similar across space, migration can decrease the rate of switching, but when selection pressures differ spatially, increasing migration between demes can facilitate the evolution of higher rates of switching. These results may help explain the diverse array of non-genetic contributions to phenotypic variability and phenotypic inheritance observed in both wild and experimental populations.

Identifying recombination hotspots using population genetic data

Adam Auton, Simon Myers, Gil McVean
(Submitted on 17 Mar 2014)

Motivation: Recombination rates vary considerably at the fine scale within mammalian genomes, with the majority of recombination occurring within hotspots of ~2 kb in width. We present a method for inferring the location of recombination hotspots from patterns of linkage disequilibrium within samples of population genetic data.
Results: Using simulations, we show that our method has hotspot detection power of approximately 50-60%, depending on the magnitude of the hotspot. The false positive rate is between 0.24 and 0.56 false positives per Mb for data typical of humans.
Availability: this http URL

Horizontal Transfers and Gene Losses in the phospholipid pathway of Bartonella reveal clues about early ecological niches

Qiyun Zhu, Michael Kosoy, Kevin J Olival, Katharina Dittmar

Bartonellae are mammalian pathogens vectored by blood-feeding arthropods. Although they are of increasing medical importance, little is known about their ecological past, and host associations are underexplored. Previous studies suggest that horizontal gene transfers influenced ecological niche colonization through the acquisition of host pathogenicity genes. Here we expand these analyses to metabolic pathways of 28 Bartonella genomes, and experimentally explore the distribution of bartonellae in 21 species of blood-feeding arthropods. Across genomes, repeated gene losses and horizontal gains in the phospholipid pathway were found. The evolutionary timing of these patterns suggests functional consequences likely leading to an early intracellular lifestyle for stem bartonellae. Comparative phylogenomic analyses reveal three independent lineage-specific reacquisitions of a core metabolic gene, NAD(P)H-dependent glycerol-3-phosphate dehydrogenase (gpsA), from Gammaproteobacteria and Epsilonproteobacteria. Transferred genes are significantly closely related to invertebrate Arsenophonus- and Serratia-like endosymbionts, and mammalian Helicobacter-like pathogens, supporting a cellular association with arthropods and mammals at the base of extant bartonellae. Our studies suggest that the horizontal reacquisitions had a key impact on the lineage-specific ecological and functional evolution of bartonellae.

Analysis of stop-gain and frameshift variants in human innate immunity genes

Antonio Rausell, Pejman Mohammadi, Paul J McLaren, Ioannis Xenarios, Jacques Fellay, Amalio Telenti

Loss-of-function variants in innate immunity genes are associated with Mendelian disorders in the form of primary immunodeficiencies. Recent resequencing projects report that stop-gains and frameshifts are collectively prevalent in humans and could be responsible for some of the inter-individual variability in innate immune response. Current computational approaches evaluating loss-of-function in genes carrying these variants rely on gene-level characteristics such as evolutionary conservation and functional redundancy across the genome. However, innate immunity genes represent a particular case because they are more likely to be under positive selection and duplicated. To create a ranking of severity applicable to innate immunity genes, we first evaluated 17,764 stop-gain and 13,915 frameshift variants from the NHLBI Exome Sequencing Project and 1000 Genomes Project. Sequence-based features such as loss of functional domains, isoform-specific truncation and nonsense-mediated decay were found to correlate with variant allele frequency and were validated with gene expression data. We integrated these features in a Bayesian classification scheme and benchmarked its use in predicting pathogenic variants against OMIM disease stop-gains and frameshifts. The classification scheme was applied in the assessment of 335 stop-gains and 236 frameshifts affecting 227 interferon-stimulated genes. The sequence-based score ranks variants in innate immunity genes according to their potential to cause disease, and complements existing gene-based pathogenicity scores.

Markov mutation models on Yule trees: pairwise species comparisons

Willem H. Mulder, Forrest W. Crawford
Subjects: Populations and Evolution (q-bio.PE)

Efforts to reconstruct phylogenetic trees and understand evolutionary processes depend fundamentally on stochastic models of speciation and mutation. The simplest continuous-time model for speciation in phylogenetic trees is the Yule process, in which new species are “born” from existing lineages at a constant rate. Recent work has illuminated some of the structural properties of Yule trees, but it remains mostly unknown how these properties affect sequence and trait patterns observed at the tips of the phylogenetic tree. Understanding the interplay between speciation and mutation under simple models of evolution is essential for deriving valid phylogenetic inference methods and gives insight into the optimal design of phylogenetic studies. In this work, we derive the probability distribution of interspecies covariance under Brownian motion and Ornstein-Uhlenbeck processes on a Yule tree. We compute the probability distribution of the number of mutations shared between two randomly chosen taxa in a Yule tree under several mutation models. These results suggest summary measures of phylogenetic information content, illuminate the correlation between site patterns in sequences or traits of related organisms, and provide heuristics for experimental design and reconstruction of phylogenetic trees.
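
As an illustration of the Yule process described in this abstract, here is a minimal sketch (not the authors' code) that simulates how many lineages exist at time t when each lineage speciates at a constant rate. The function name `yule_tip_count` and its parameters are hypothetical, chosen for this example only. Under these assumptions the expected number of lineages starting from a single one is exp(rate * t).

```python
import math
import random


def yule_tip_count(rate, t, rng):
    """Simulate the number of lineages alive at time t in a Yule (pure-birth) process.

    Assumes a single starting lineage and rate > 0. With n lineages present,
    the waiting time to the next speciation is exponential with rate n * rate.
    """
    n = 1
    elapsed = 0.0
    while True:
        elapsed += rng.expovariate(n * rate)
        if elapsed > t:
            return n
        n += 1


if __name__ == "__main__":
    rng = random.Random(0)
    counts = [yule_tip_count(1.0, 1.0, rng) for _ in range(20000)]
    print("simulated mean:", sum(counts) / len(counts))
    print("theoretical mean:", math.exp(1.0))
```

The sketch tracks only lineage counts, not tree shape or mutations, so it does not reproduce the covariance or shared-mutation distributions derived in the paper.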

Gaussian process test for high-throughput sequencing time series: application to experimental evolution

Hande Topa, Ágnes Jónás, Robert Kofler, Carolin Kosiol, Antti Honkela
Comments: 26 pages, 13 figures
Subjects: Populations and Evolution (q-bio.PE); Genomics (q-bio.GN); Quantitative Methods (q-bio.QM); Applications (stat.AP)

Motivation: Recent advances in high-throughput sequencing (HTS) have made it possible to monitor genomes in great detail. New experiments use HTS not only to measure genomic features at one time point but also to monitor them changing over time, with the aim of identifying significant changes in their abundance. In population genetics, for example, allele frequencies are monitored over time to detect significant frequency changes that indicate selection pressures. Previous attempts at analysing data from HTS experiments have been limited as they could not simultaneously include data at intermediate time points, replicate experiments and sources of uncertainty specific to HTS such as sequencing depth.
Results: We present the beta-binomial Gaussian process (BBGP) model for ranking features with significant non-random variation in abundance over time. The features are assumed to represent proportions, such as the proportion of an alternative allele in a population. We use the beta-binomial model to capture the uncertainty arising from finite sequencing depth and combine it with a Gaussian process model over the time series. In simulations that mimic the features of experimental evolution data, the proposed method clearly outperforms classical testing in average precision of finding selected alleles. We also present results on real data from a Drosophila experimental evolution experiment on temperature adaptation.
Availability: R software implementing the test is available at https://github.com/handetopa/BBGP.
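
To make the beta-binomial component of this abstract concrete, here is a minimal sketch of the beta-binomial probability mass function, which models the number of reads carrying an allele out of n reads when the underlying allele proportion is itself Beta-distributed. It is not taken from the BBGP package; the function names and the alpha/beta parameterisation are assumptions made for this example, and the Gaussian process part of the model is not shown.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_logpmf(k, n, alpha, beta):
    """Log-probability of k successes in n trials under a beta-binomial model.

    The success probability is assumed to follow Beta(alpha, beta),
    with alpha > 0 and beta > 0.
    """
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)


if __name__ == "__main__":
    # Hypothetical example: 7 alternative-allele reads out of 20 reads in total.
    print(math.exp(beta_binomial_logpmf(7, 20, 2.0, 5.0)))
```

Compared with a plain binomial, the extra variance of the beta-binomial grows as alpha + beta shrinks, which is one way to express that low sequencing depth leaves the true proportion uncertain.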

Genetic influences on translation in yeast

Frank W. Albert, Dale Muzzey, Jonathan Weissman, Leonid Kruglyak
(Submitted on 13 Mar 2014)

Heritable differences in gene expression between individuals are an important source of phenotypic variation. The question of how closely the effects of genetic variation on protein levels mirror those on mRNA levels remains open. Here, we addressed this question by using ribosome profiling to examine how genetic differences between two strains of the yeast S. cerevisiae affect translation. Strain differences in translation were observed for hundreds of genes, more than half as many as showed genetic differences in mRNA levels. Similarly, allele-specific measurements in the diploid hybrid between the two strains revealed roughly half as many cis-acting effects on translation as were observed for mRNA levels. In both the parents and the hybrid, strong effects on translation were rare, such that the direction of an mRNA difference was typically reflected in a concordant footprint difference. The relative importance of cis- and trans-acting variation on footprint levels was similar to that for mRNA levels. Across all expressed genes, there was a tendency for translation to reinforce mRNA differences more often than to buffer them, resulting in footprint differences with greater magnitudes than the mRNA differences. A reanalysis of two earlier studies that reported translational buffering between two yeast species showed that translational reinforcement is in fact more common between these species, consistent with our results. Finally, we catalogued instances of premature translation termination in the two yeast strains. Overall, genetic variation clearly influences translation, but primarily does so by subtly modulating differences in mRNA levels. Translation does not appear to create strong discrepancies between genetic influences on mRNA and protein levels.

Predicting discovery rates of genomic features

Simon Gravel, NHLBI GO Exome Sequencing Project
(Submitted on 13 Mar 2014)

Successful sequencing experiments require judicious sample selection. However, this selection must often be performed on the basis of limited preliminary data. Predicting the statistical properties of the final sample based on preliminary data can be challenging, because numerous uncertain model assumptions may be involved. Here, we ask whether we can predict “omics” variation across many samples by sequencing only a fraction of them. In the infinite-genome limit, we find that a pilot study sequencing 5% of a population is sufficient to predict the number of genetic variants in the entire population within 6% of the correct value, using an estimator agnostic to demography, selection, or population structure. To reach similar accuracy in a finite genome with millions of polymorphisms, the pilot study would require about 15% of the population. We present computationally efficient jackknife and linear programming methods that exhibit substantially less bias than the state of the art when applied to simulated data and sub-sampled 1000 Genomes Project data. Extrapolating based on the NHLBI Exome Sequencing Project data, we predict that 7.2% of sites in the capture region would be variable in a sample of 50,000 African-Americans, and 8.8% in a European sample of equal size. Finally, we show how the linear programming method can also predict discovery rates of various genomic features, such as the number of transcription factor binding sites across different cell types.
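
For readers unfamiliar with jackknife extrapolation, the sketch below shows the classical first-order jackknife estimate of the total number of classes, here read as variants, from incidence data. This is a textbook estimator swapped in for illustration, not the estimator developed in the paper, and the function name and inputs are hypothetical. It adds to the observed count a correction proportional to the number of variants seen in exactly one sample.

```python
def first_order_jackknife(incidence_counts, n_samples):
    """Classical first-order jackknife estimate of the total number of variants.

    incidence_counts: for each variant, the number of samples in which it was seen.
    n_samples: number of samples in the pilot study (must be at least 1).
    Estimate = observed + singletons * (n_samples - 1) / n_samples.
    """
    observed = sum(1 for c in incidence_counts if c > 0)
    singletons = sum(1 for c in incidence_counts if c == 1)
    return observed + singletons * (n_samples - 1) / n_samples


if __name__ == "__main__":
    # Hypothetical pilot data: five variants seen in 1, 1, 2, 3 and 5 of 10 samples.
    print(first_order_jackknife([1, 1, 2, 3, 5], 10))
```

The first-order form is known to be biased when many variants are rare, which is part of why higher-order and alternative estimators such as those in the abstract are of interest.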

Increased genetic diversity improves crop yield stability under climate variability: a computational study on sunflower

Pierre Casadebaig (1), Ronan Trépos (2), Victor Picheny (2), Nicolas B. Langlade (3), Patrick Vincourt (3), Philippe Debaeke (1) ((1) INRA, UMR1248 AGIR, Castanet-Tolosan, France, (2) INRA, UR875 MIAT, Castanet-Tolosan, France, (3) INRA, UMR441 LIPM, Castanet-Tolosan, France)
(Submitted on 12 Mar 2014)

A crop can be represented as a biotechnical system in which components are either chosen (cultivar, management) or given (soil, climate) and whose combination generates highly variable stress patterns and yield responses. Here, we used modeling and simulation to predict the crop phenotypic plasticity resulting from the interaction of plant traits (G), climatic variability (E) and management actions (M). We designed two in silico experiments that compared existing and virtual sunflower cultivars (Helianthus annuus L.) in a target population of cropping environments by simulating a range of indicators of crop performance. Optimization methods were then used to search for GEM combinations that matched desired crop specifications. Computational experiments showed that the fit of particular cultivars to specific environments increases gradually with knowledge of pedo-climatic conditions. At the regional scale, tuning the choice of cultivar affected crop performance with the same magnitude as the yearly genetic progress made by breeding. When considering virtual genetic material, designed by recombining plant traits, cultivar choice had a greater positive impact on crop performance and stability. Results suggested that breeding for key traits conferring plant plasticity improved the global adaptation capacity of cultivars, whereas increasing genetic diversity made it possible to choose cultivars with distinctive traits that were better adapted to specific conditions. Consequently, breeding genetic material that is both plastic and diverse may improve the yield stability of agricultural systems exposed to climatic variability. We argue that process-based modeling could help enhance spatial management of cultivated genetic diversity and could be integrated into functional breeding approaches.