Mitochondria, mutations and sex: a new hypothesis for the evolution of sex based on mitochondrial mutational erosion

Justin Havird , Matthew D Hall , Damian Dowling
doi: http://dx.doi.org/10.1101/019125

The evolution of sex in eukaryotes represents a paradox, given the “two-fold” fitness cost it incurs. We hypothesize that the mutational dynamics of the mitochondrial genome would have favoured the evolution of sexual reproduction. Mitochondrial DNA (mtDNA) exhibits a high mutation rate across most eukaryote taxa, and several lines of evidence suggest this high rate is an ancestral character. This seems inexplicable given mtDNA-encoded genes underlie the expression of life’s most salient functions, including energy conversion. We propose that negative metabolic effects linked to mitochondrial mutation accumulation would have invoked selection for sexual recombination between divergent host nuclear genomes in early eukaryote lineages. This would provide a mechanism by which recombinant host genotypes could be rapidly shuffled and screened for the presence of compensatory modifiers that offset mtDNA-induced harm. Under this hypothesis, recombination provides the genetic variation necessary for compensatory nuclear coadaptation to keep pace with mitochondrial mutation accumulation.

Long-term survival of duplicate genes despite absence of subfunctionalized expression.

Xun Lan , Jonathan K Pritchard
doi: http://dx.doi.org/10.1101/019166

Gene duplication is a fundamental process in genome evolution. However, young duplicates are frequently degraded into pseudogenes by loss-of-function mutations. One standard model proposes that the main path for duplicate genes to avoid mutational destruction is by rapidly evolving subfunctionalized expression profiles. We examined this hypothesis using RNA-seq data from 46 human tissues. Surprisingly, we find that sub- or neofunctionalization of expression evolves very slowly, and is rare among duplications that arose within the placental mammals. Most mammalian duplicates are located in tandem and have highly correlated expression profiles, likely due to shared regulation, thus impeding subfunctionalization. Moreover, we also find that a large fraction of duplicate gene pairs exhibit a striking asymmetric pattern in which one gene has consistently higher expression. These asymmetrically expressed duplicates (AEDs) may persist for tens of millions of years, even though the lower-expressed copies tend to evolve under reduced selective constraint and are associated with fewer human diseases than their duplicate partners. We suggest that dosage-sharing of expression, rather than subfunctionalization, is more likely to be the initial factor enabling survival of duplicate gene pairs.

A basic mathematical model for the Lenski experiment and the deceleration of the relative fitness

Adrián González Casanova, Noemi Kurt, Anton Wakolbinger, Linglong Yuan
Subjects: Probability (math.PR); Populations and Evolution (q-bio.PE)

The Lenski experiment investigates the long-term evolution of bacterial populations. Its design allows the direct comparison of the reproductive fitness of an evolved strain with its founder ancestor. It was observed by Wiser et al. (2013) that the mean fitness over time increases sublinearly, a behaviour which is commonly attributed to effects like clonal interference or epistasis. In this paper we present an individual-based probabilistic model that captures essential features of the design of the Lenski experiment. We assume that each beneficial mutation increases the individual reproduction rate by a fixed amount, which corresponds to the absence of epistasis in the continuous-time (intraday) part of the model, but leads to an epistatic effect in the discrete-time (interday) part of the model. Using an approximation by near-critical Galton-Watson processes, we prove that under some assumptions on the model parameters which exclude clonal interference, the relative fitness process converges, after suitable rescaling, in the large population limit to a power law function.
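The contrast between the intraday and interday parts of the model can be sketched with a little arithmetic. This is a minimal illustration under simplifying assumptions that are not taken from the paper: growth is deterministic and exponential, each day ends once the resident population has grown by a fixed factor `gamma`, and a rare mutant does not affect the length of the day. The function names and parameter values are hypothetical.

```python
import math

def day_length(r, gamma=100.0):
    """Time the resident (growth rate r) needs to grow by the factor gamma."""
    return math.log(gamma) / r

def daily_log_advantage(r, rho, gamma=100.0):
    """Extra log-growth per day of a mutant with rate r + rho over a resident with rate r.

    Within the day the mutant always gains the same additive rate rho,
    but the day lasts ln(gamma) / r, so the per-day advantage is
    rho * ln(gamma) / r, which shrinks as the resident rate r increases.
    """
    t = day_length(r, gamma)
    return (r + rho) * t - r * t

# The same increment rho is worth less per day on a fitter background.
rho = 0.1
for step in range(5):
    r = 1.0 + step * rho
    print(f"resident rate {r:.1f}: daily log advantage {daily_log_advantage(r, rho):.4f}")
```

Under these assumptions, each successive mutation of fixed size yields a smaller per-day advantage, which is one intuition for why relative fitness might grow sublinearly even without clonal interference.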

A vision for ubiquitous sequencing

Yaniv Erlich
doi: http://dx.doi.org/10.1101/019018

Genomics has recently celebrated reaching the $1000 genome milestone, making affordable DNA sequencing a reality. With this goal of the sequencing revolution achieved, the next goal can be ushered in by the advent of sequencing sensors: miniaturized sequencing devices that are manufactured for real-time applications and deployed in large quantities at low cost. The first part of this manuscript envisions applications, in a range of domains, that will benefit from moving the sequencers to the samples. In the second part, the manuscript outlines the critical barriers that need to be addressed in order to reach the goal of ubiquitous sequencing sensors.

Rail-RNA: Scalable analysis of RNA-seq splicing and coverage

Abhinav Nellore , Leonardo Collado-Torres , Andrew E Jaffe , James Morton , Jacob Pritt , José Alquicira-Hernández , Jeffrey T Leek , Ben Langmead
doi: http://dx.doi.org/10.1101/019067

RNA sequencing (RNA-seq) experiments now span hundreds to thousands of samples. A source of frustration for investigators analyzing a given dataset is the inability to rapidly and reproducibly align its samples jointly. Current spliced alignment software is designed to analyze each sample separately. Consequently, no information is gained from analyzing multiple samples together, and it is difficult to reproduce the exact analysis without access to original computing resources. We describe Rail-RNA, a cloud-enabled spliced aligner that analyzes many samples at once. Rail-RNA eliminates redundant work across samples, making it more efficient as samples are added. For many samples, Rail-RNA is more accurate than annotation-assisted aligners. We use Rail-RNA to align 666 RNA-seq samples from the GEUVADIS project on Amazon Web Services in 12 hours for US$0.69 per sample. Rail-RNA produces alignments and base-resolution bigWig coverage files, ready for use with downstream packages for reproducible statistical analysis. We identify 290,416 expressed regions in the GEUVADIS samples, including 21,224 that map to intergenic sequence. We show that both annotated and unannotated (novel) expressed regions exhibit consistent patterns of variation across populations and with respect to known technological confounders. Rail-RNA is open-source software available at http://rail.bio .

Bayesian Inference of Divergence Times and Feeding Evolution in Grey Mullets (Mugilidae)

Francesco Santini , Michael R. May , Giorgio Carnevale , Brian R. Moore
doi: http://dx.doi.org/10.1101/019075

Grey mullets (Mugilidae, Ovalentariae) are coastal fishes found in near-shore environments of tropical, subtropical, and temperate regions within marine, brackish, and freshwater habitats throughout the world. This group is noteworthy both for the highly conserved morphology of its members (which complicates species identification and delimitation) and for the uncommon herbivorous or detritivorous diet of most mullets. In this study, we first attempt to identify the number of mullet species, and then, for the resulting species, estimate a densely sampled time-calibrated phylogeny using three mitochondrial gene regions and three fossil calibrations. Our results identify two major subgroups of mullets that diverged in the Paleocene/Early Eocene, followed by an Eocene/Oligocene radiation across both tropical and subtropical habitats. We use this phylogeny to explore the evolution of feeding preference in mullets, which indicates multiple independent origins of both herbivorous and detritivorous diets within this group. We also explore correlations between feeding preference and other variables, including body size, habitat (marine, brackish, or freshwater), and geographic distribution (tropical, subtropical, or temperate). Our analyses reveal: (1) a positive correlation between trophic index and habitat (with herbivorous and/or detritivorous species predominantly occurring in marine habitats); (2) a negative correlation between trophic index and geographic distribution (with herbivorous species occurring predominantly in subtropical and temperate regions); and (3) a negative correlation between body size and geographic distribution (with larger species occurring predominantly in subtropical and temperate regions).

GWGGI: software for genome-wide gene-gene interaction analysis

Changshuai Wei, Qing Lu
Journal-ref: BMC Genetics 2014, 15:101
Subjects: Quantitative Methods (q-bio.QM); Data Structures and Algorithms (cs.DS); Genomics (q-bio.GN); Applications (stat.AP)

Background: While the importance of gene-gene interactions in human diseases has been well recognized, identifying them has been a great challenge, especially through association studies with millions of genetic markers and thousands of individuals. Computationally efficient and powerful tools are greatly needed for the identification of new gene-gene interactions in high-dimensional association studies. Results: We develop C++ software for genome-wide gene-gene interaction analyses (GWGGI). GWGGI utilizes tree-based algorithms to search a large number of genetic markers for a disease-associated joint association with the consideration of high-order interactions, and then uses non-parametric statistics to test the joint association. The package includes two functions, Likelihood Ratio Mann-Whitney (LRMW) and Tree Assembling Mann-Whitney (TAMW). We optimize the data storage and computational efficiency of the software, making it feasible to run the genome-wide analysis on a personal computer. The use of GWGGI was demonstrated using two real data sets with nearly 500k genetic markers. Conclusion: Through the empirical study, we demonstrated that the genome-wide gene-gene interaction analysis using GWGGI could be accomplished within a reasonable time on a personal computer (i.e., ~3.5 hours for LRMW and ~10 hours for TAMW). We also showed that LRMW was suitable to detect interaction among a small number of genetic variants with moderate-to-strong marginal effect, while TAMW was useful to detect interaction among a larger number of low-marginal-effect genetic variants.
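The Mann-Whitney idea behind the joint association test can be illustrated with a small sketch. It computes the U statistic between per-individual risk scores for cases and controls. The scores, the function name `mann_whitney_u`, and the half-credit treatment of ties are assumptions made for illustration; this is not GWGGI's interface, and how the software derives its scores from the trees is not shown here.

```python
def mann_whitney_u(case_scores, control_scores):
    """Count case/control pairs in which the case has the higher score.

    Ties contribute one half. Dividing U by the number of pairs gives
    the probability that a random case outscores a random control.
    """
    u = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                u += 1.0
            elif case == control:
                u += 0.5
    return u

# Hypothetical risk scores (e.g. predicted by some multi-marker model).
cases = [3, 4, 5]
controls = [1, 2, 3]

u = mann_whitney_u(cases, controls)
auc = u / (len(cases) * len(controls))
print(u, auc)
```

A value of `auc` near 0.5 would indicate that the scores do not separate cases from controls, while values near 1 indicate strong separation. The quadratic pairwise loop is chosen for clarity; a rank-based computation would be preferable for large samples.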

Trees Assembling Mann Whitney Approach for Detecting Genome-wide Joint Association among Low Marginal Effect loci

Changshuai Wei, Daniel J. Schaid, Qing Lu
Journal-ref: Genet Epidemiol. 2013 Jan;37(1):84-91
Subjects: Quantitative Methods (q-bio.QM); Computation (stat.CO); Machine Learning (stat.ML)

Common complex diseases are likely influenced by the interplay of hundreds, or even thousands, of genetic variants. Converging evidence shows that genetic variants with low marginal effects (LME) play an important role in disease development. Despite their potential significance, discovering LME genetic variants and assessing their joint association on high dimensional data (e.g., genome wide association studies) remain a great challenge. To facilitate joint association analysis among a large ensemble of LME genetic variants, we proposed a computationally efficient and powerful approach, which we call Trees Assembling Mann Whitney (TAMW). Through simulation studies and an empirical data application, we found that TAMW outperformed multifactor dimensionality reduction (MDR) and the likelihood ratio based Mann Whitney approach (LRMW) when the underlying complex disease involves multiple LME loci and their interactions. For instance, in a simulation with 20 interacting LME loci, TAMW attained a higher power (power=0.931) than both MDR (power=0.599) and LRMW (power=0.704). In an empirical study of 29 known Crohn’s disease (CD) loci, TAMW also identified a stronger joint association with CD than those detected by MDR and LRMW. Finally, we applied TAMW to the Wellcome Trust CD GWAS to conduct a genome wide analysis. The analysis of 459K single nucleotide polymorphisms was completed in 40 hours using parallel computing, and revealed a joint association predisposing to CD (p-value=2.763e-19). Further analysis of the newly discovered association suggested that 13 genes, such as ATG16L1 and LACC1, may play an important role in CD pathophysiological and etiological processes.

Reconstructing A/B compartments as revealed by Hi-C using long-range correlations in epigenetic data

Jean-Philippe Fortin , Kasper D Hansen
doi: http://dx.doi.org/10.1101/019000

Analysis of Hi-C data has shown that the genome can be divided into two compartments called A/B compartments. These compartments are cell-type specific and are associated with open and closed chromatin. We show that A/B compartments can be reliably estimated using epigenetic data from two different platforms, the Illumina 450k DNA methylation microarray and DNase hypersensitivity sequencing. We do this by exploiting the fact that the structure of long range correlations differs between open and closed compartments. This work makes A/B compartments readily available in a wide variety of cell types, including many human cancers.
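One common way to turn a bin-by-bin correlation matrix into two compartment labels is to take the sign of its leading eigenvector. The sketch below applies that idea to a toy matrix of per-bin signal across samples. It is an assumption that this resembles the authors' pipeline; the data values, function names, and use of power iteration are all illustrative, and real data would need binning, normalization, and a rule for deciding which sign corresponds to the open compartment.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def correlation_matrix(rows):
    """Correlation between every pair of bins (rows) across samples (columns)."""
    return [[pearson(a, b) for b in rows] for a in rows]

def leading_eigenvector(matrix, iterations=200):
    """Power iteration; the start vector must not be orthogonal to the answer."""
    n = len(matrix)
    v = [1.0] + [0.0] * (n - 1)
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

# Toy signal for four genomic bins across five samples (made-up numbers).
bins = [
    [0.10, 0.40, 0.20, 0.50, 0.30],
    [0.20, 0.50, 0.25, 0.60, 0.35],
    [0.90, 0.60, 0.80, 0.50, 0.70],
    [0.80, 0.55, 0.70, 0.40, 0.65],
]

vector = leading_eigenvector(correlation_matrix(bins))
# The sign of an eigenvector is arbitrary, so the names "A" and "B" here are too.
labels = ["A" if component > 0 else "B" for component in vector]
print(labels)
```

In this toy example the first two bins receive one label and the last two the other, because their signals vary together within each pair and in opposite directions between pairs.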

Negative Niche Construction Favors the Evolution of Cooperation

Brian D Connelly , Katherine J Dickinson , Sarah P Hammarlund , Benjamin Kerr
doi: http://dx.doi.org/10.1101/018994

By benefitting others at a cost to themselves, cooperators face an ever present threat from defectors—individuals that avail themselves of the cooperative benefit without contributing. A longstanding challenge to evolutionary biology is to understand the mechanisms that support the many instances of cooperation that nevertheless exist. Hammarlund et al. recently demonstrated that cooperation can persist by hitchhiking along with beneficial non-social adaptations. Importantly, cooperators play an active role in this process. In spatially-structured environments, clustered cooperator populations reach greater densities, which creates more mutational opportunities to gain beneficial non-social adaptations. Cooperation rises in abundance by association with these adaptations. However, once adaptive opportunities have been exhausted, the ride abruptly ends as cooperators are displaced by adapted defectors. Using an agent-based model, we demonstrate that the selective feedback that is created as populations construct their local niches can maintain cooperation indefinitely. This cooperator success depends specifically on negative niche construction, which acts as a perpetual source of adaptive opportunities. As populations adapt, they alter their environment in ways that reveal additional opportunities for adaptation. Despite being independent of niche construction in our model, cooperation feeds this cycle. By reaching larger densities, populations of cooperators are better able to adapt to changes in their constructed niche and successfully respond to the constant threat posed by defectors. We relate these findings to previous studies from the niche construction literature and discuss how this model could be extended to provide a greater understanding of how cooperation evolves in the complex environments in which it is found.