How complexity originates: The evolution of animal eyes

Todd H Oakley , Daniel I Speiser
doi: http://dx.doi.org/10.1101/017129

Learning how complex traits like eyes originate is fundamental for understanding evolution. Here, we first sketch historical perspectives on trait origins and argue that new technologies offer key new insights. Next, we articulate four open questions about trait origins. To address them, we define a research program to break complex traits into components and study the individual evolutionary histories of those parts. By doing so, we can learn when the parts came together and perhaps understand why they stayed together. We apply the approach to five structural innovations critical for complex eyes, reviewing the history of the parts of each of those innovations. Photoreceptors evolved within animals by bricolage, recombining genes that originated far earlier. Multiple genes used in eyes today had ancestral roles in stress responses. We hypothesize that photo-stress could have increased the chance those genes were expressed together in places on animals where light was abundant.

Large-Scale Search of Transcriptomic Read Sets with Sequence Bloom Trees

Brad Solomon , Carleton Kingsford
doi: http://dx.doi.org/10.1101/017087

Enormous databases of short-read RNA-seq experiments such as the NIH Sequence Read Archive (SRA) are now available. However, these collections remain difficult to use due to the inability to search for a particular expressed sequence. A natural question is which of these experiments contain sequences that indicate the expression of a particular sequence such as a gene isoform, lncRNA, or uORF. However, at present this is a computationally demanding question at the scale of these databases. We introduce an indexing scheme, the Sequence Bloom Tree (SBT), to support sequence-based querying of terabase-scale collections of thousands of short-read sequencing experiments. We apply SBT to the problem of finding conditions under which query transcripts are expressed. Our experiments are conducted on a set of 2652 publicly available RNA-seq experiments contained in the NIH SRA for breast, blood, and brain tissues, comprising 5 terabytes of sequence. SBTs of this size can be queried for a 1000 nt sequence in 19 minutes using less than 300 MB of RAM, over 100 times faster than standard usage of SRA-BLAST and 119 times faster than STAR. SBTs allow for fast identification of experiments with expressed novel isoforms, even if these isoforms were unknown at the time the SBT was built. We also provide some theoretical guidance about appropriate parameter selection in SBT and propose a sampling-based scheme for potentially scaling SBT to even larger collections of files. While SBT can handle any set of reads, we demonstrate the effectiveness of SBT by searching a large collection of blood, brain, and breast RNA-seq files for all 214,293 known human transcripts to identify tissue-specific transcripts. The implementation used in the experiments below is in C++ and is available as open source at http://www.cs.cmu.edu/~ckingsf/software/bloomtree.
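
The abstract does not spell out the data structure, so the following is only a toy sketch of the general idea behind a tree of Bloom filters over k-mers, not the authors' C++ implementation. It assumes that each leaf stores the k-mers of one experiment, that an internal node stores the union of its children, and that a query descends only into subtrees containing at least a fraction theta of the query's k-mers. The names Node, query, K, M, H and theta, and all parameter values, are hypothetical choices for illustration.

```python
import hashlib

K = 5          # toy k-mer length (assumption; real indexes use longer k-mers)
M = 1 << 16    # number of bits per Bloom filter (assumption)
H = 3          # number of hash functions (assumption)


def kmers(seq, k=K):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def positions(kmer, m=M, h=H):
    """Map a k-mer to h bit positions using salted SHA-256 digests."""
    out = []
    for i in range(h):
        digest = hashlib.sha256(f"{i}:{kmer}".encode()).digest()
        out.append(int.from_bytes(digest[:8], "big") % m)
    return out


class Node:
    """A tree node holding a Bloom filter stored as a Python integer bitset."""

    def __init__(self, name=None, reads=(), children=()):
        self.name = name
        self.children = list(children)
        self.bits = 0
        # Leaf content: insert every k-mer of every read.
        for read in reads:
            for km in kmers(read):
                for p in positions(km):
                    self.bits |= 1 << p
        # Internal node: the filter is the union (bitwise OR) of the children.
        for child in self.children:
            self.bits |= child.bits

    def contains(self, kmer):
        """Bloom membership test: may give false positives, never false negatives."""
        return all((self.bits >> p) & 1 for p in positions(kmer))


def query(node, seq, theta=0.8):
    """Return names of leaves whose filter holds >= theta of the query k-mers."""
    qk = kmers(seq)
    hits = sum(node.contains(km) for km in qk)
    if not qk or hits < theta * len(qk):
        return []  # prune this whole subtree
    if not node.children:
        return [node.name]
    found = []
    for child in node.children:
        found.extend(query(child, seq, theta))
    return found
```

A small usage example: build two leaves with `Node("A", reads=[...])` and `Node("B", reads=[...])`, join them with `Node(children=[a, b])`, and call `query(root, some_sequence)`. Because a parent filter is a superset of its children, a query that fails at a parent cannot succeed below it, which is what makes pruning safe in this sketch.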

Adaptation, Clonal Interference, and Frequency-Dependent Interactions in a Long-Term Evolution Experiment with Escherichia coli

Rohan Maddamsetti , Richard E. Lenski , Jeffrey E. Barrick
doi: http://dx.doi.org/10.1101/017020

Twelve replicate populations of Escherichia coli have been evolving in the laboratory for more than 25 years and 60,000 generations. We analyzed bacteria from whole-population samples frozen every 500 generations through 20,000 generations for one well-studied population, called Ara-1. By tracking 42 known mutations in these samples, we reconstructed the history of this population's genotypic evolution over this period. The evolutionary dynamics of Ara-1 show strong evidence of selective sweeps as well as clonal interference between competing lineages bearing different beneficial mutations. In some cases, sets of several mutations approached fixation simultaneously, often conveying no information about their order of origination; we present several possible explanations for the existence of these mutational cohorts. Against a backdrop of rapid selective sweeps both earlier and later, we found that two clades coexisted for over 6000 generations before one drove the other extinct. In that time, at least nine mutations arose in the clade that prevailed. We found evidence that the clades evolved a frequency-dependent interaction, which prevented the competitive exclusion of either clade, but which eventually collapsed as beneficial mutations accumulated in the clade that prevailed. Clonal interference and frequency dependence can occur even in the simplest microbial populations. Furthermore, frequency dependence may generate dynamics that extend the period of coexistence that would otherwise be sustained by clonal interference alone.

Threshold trait architecture of Hsp90-buffered variation

Charles C Carey , Kristen F Gorman , Becky Howsmon , Charles Kooperberg , Aaron K Aragaki , Suzannah Rutherford
doi: http://dx.doi.org/10.1101/016980

Common genetic variants buffered by Hsp90 are candidates for human diseases of signaling such as cancer. Like cancer, morphological abnormalities buffered by Hsp90 are discrete threshold traits with a continuous underlying basis of liability determining their probability of occurrence. QTL and deletion maps for one of the most frequent Hsp90-dependent abnormalities in Drosophila, deformed eye (dfe), were replicated across three genetically related artificial selection lines using strategies dependent on proximity to the dfe threshold and the direction of genetic and environmental effects. Up to 17 dfe loci (QTL) linked by 7 interactions were detected based on the ability of small recombinant regions of an unaffected and completely homozygous control genotype to dominantly suppress or enhance dfe penetrance at its threshold in groups of isogenic recombinant flies, and over 20 deletions increased dfe penetrance from a low expected value in one or more lines, identifying a complex network of genes responsible for the dfe phenotype. Replicated comparisons of these whole-genome mapping approaches identified several QTL regions narrowly defined by deletions and 4 candidate genes, with additional uncorrelated QTL and deletions highlighting differences between the approaches and the need for caution in attributing the effect of deletions directly to QTL genes.

RNAseq in the mosquito maxillary palp: a little antennal RNA goes a long way

David C. Rinker , Xiaofan Zhou , Ronald Jason Pitts , Antonis Rokas , LJ Zwiebel
doi: http://dx.doi.org/10.1101/016998

A comparative transcriptomic study of mosquito olfactory tissues recently published in BMC Genomics (Hodges et al., 2014) reported several novel findings that have broad implications for the field of insect olfaction. In this brief commentary, we outline why the conclusions of Hodges et al. are problematic under the current models of insect olfaction and then contrast their findings with those of other RNAseq-based studies of mosquito olfactory tissues. We also generated a new RNAseq data set from the maxillary palp of Anopheles gambiae in an effort to replicate the novel results of Hodges et al. but were unable to reproduce their results. Instead, our new RNAseq data support the more straightforward explanation that the novel findings of Hodges et al. were a consequence of contamination by antennal RNA. In summary, we find strong evidence to suggest that the conclusions of Hodges et al. were spurious, and that at least some of their RNAseq data sets were irrevocably compromised by cross-contamination between samples.

Selective strolls: fixation and extinction in diploids are slower for weakly selected mutations than for neutral ones

Fabrizio Mafessoni , Michael Lachmann
doi: http://dx.doi.org/10.1101/016881

In finite populations, an allele disappears or reaches fixation due to two main forces, selection and drift. Selection is generally thought to accelerate the process: a selected mutation will reach fixation faster than a neutral one, and a disadvantageous one will quickly disappear from the population. We show that even in simple diploid populations, this is often not true. Dominance and recessivity unexpectedly slow down the evolutionary process for weakly selected alleles. In particular, slightly advantageous dominant and mildly deleterious recessive mutations reach fixation more slowly than neutral ones. This phenomenon determines genetic signatures opposite to those expected under strong selection, such as increased instead of decreased genetic diversity around the selected site. Furthermore, we characterize a new phenomenon: mildly deleterious recessive alleles, thought to represent the vast majority of newly arising mutations, survive in a population longer than neutral ones before being lost. Hence, natural selection is less effective than previously thought at rapidly removing slightly deleterious mutations, contributing to their observed persistence in present populations. Consequently, low-frequency slightly deleterious mutations are on average older than neutral ones.
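
For readers who want to experiment with the kind of model the abstract refers to, here is a minimal sketch of a diploid Wright-Fisher simulation with dominance. It is not the authors' analysis: it assumes the textbook genotype fitnesses 1+s, 1+hs and 1, binomial resampling of 2N allele copies each generation, and a single starting copy of the mutant. The function names next_freq and simulate, and the parameters n, s and h, are hypothetical.

```python
import random


def next_freq(p, s, h):
    """Expected mutant frequency after selection in a diploid population.

    Genotype fitnesses (assumed): AA = 1 + s, Aa = 1 + h*s, aa = 1,
    where p is the current frequency of allele A.
    """
    q = 1.0 - p
    w_bar = p * p * (1 + s) + 2 * p * q * (1 + h * s) + q * q
    return (p * p * (1 + s) + p * q * (1 + h * s)) / w_bar


def simulate(n, s, h, rng):
    """Follow one new mutant until loss or fixation.

    n is the number of diploid individuals (2n allele copies).
    Returns (fixed, generations), where fixed is True if the mutant fixed.
    """
    two_n = 2 * n
    count = 1  # a single new mutant copy
    generations = 0
    while 0 < count < two_n:
        p_sel = next_freq(count / two_n, s, h)
        # Binomial sampling of 2n copies from the post-selection frequency.
        count = sum(rng.random() < p_sel for _ in range(two_n))
        generations += 1
    return count == two_n, generations


def conditional_times(n, s, h, reps, seed=0):
    """Collect absorption times separately for runs that fixed and runs that were lost."""
    rng = random.Random(seed)
    fix_times, loss_times = [], []
    for _ in range(reps):
        fixed, gens = simulate(n, s, h, rng)
        (fix_times if fixed else loss_times).append(gens)
    return fix_times, loss_times
```

Comparing the mean of `fix_times` for a small positive s with h near 1 against the neutral case s = 0 is one way to probe the slowdown the abstract describes, although many replicates are needed because fixation of a single new copy is rare.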

Variation in rural African gut microbiomes is strongly shaped by parasitism and diet

Elise R Morton , Joshua Lynch , Alain Froment , Sophie Lafosse , Evelyne Heyer , Molly Przeworski , Ran Blekhman , Laure Segurel
doi: http://dx.doi.org/10.1101/016949

The human gut microbiome is influenced by its host’s nutrition and health status, and represents an interesting adaptive phenotype under the influence of metabolic and immune constraints. Previous studies contrasting rural populations in developing countries to urban industrialized ones have shown that geography is an important factor associated with the gut microbiome; however, studies have yet to disentangle the effects of factors such as climate, diet, host genetics, hygiene and parasitism. Here, we focus on fine-scale comparisons of African rural populations in order to (i) contrast the gut microbiomes of populations that inhabit similar environments but have different traditional subsistence modes and (ii) evaluate the effect of parasitism on microbiome composition and structure. We sampled rural Pygmy hunter-gatherers as well as Bantu individuals from both farming and fishing populations in Southwest Cameroon and found that the presence of Entamoeba is strongly correlated with microbial composition and diversity. Using a random forest classifier model, we show that an individual’s infection status can be predicted with 79% accuracy based on his/her gut microbiome composition. We identified multiple taxa that differ significantly in frequency between infected and uninfected individuals, and found that alpha diversity is significantly higher in infected individuals, while beta-diversity is reduced. Subsistence mode was another factor significantly associated with microbial composition, notably with some taxa previously shown to differ between Hadza East African hunter-gatherers and Italians also discriminating Pygmy hunter-gatherers from neighboring farming or fishing populations in Cameroon. In conclusion, these results provide evidence for a strong relationship between human gut parasites and the microbiome, and highlight how sensitive this microbial ecosystem is to subtle changes in host nutrition.

The origins of a novel butterfly wing patterning gene from within a family of conserved cell cycle regulators

Nicola Nadeau , Carolina Pardo-Diaz , Annabel Whibley , Megan Ann Supple , Richard Wallbank , Grace C. Wu , Luana Maroja , Laura Ferguson , Heather Hines , Camilo Salazar , Richard ffrench-Constant , Mathieu Joron , William Owen McMillan , Chris Jiggins
doi: http://dx.doi.org/10.1101/016006

A major challenge in evolutionary biology is to understand the origins of novel structures. The wing patterns of butterflies and moths are derived phenotypes unique to the Lepidoptera. Here we identify a gene that we name poikilomousa (poik), which regulates colour pattern switches in the mimetic Heliconius butterflies. Strong associations between phenotypic variation and DNA sequence variation are seen in three different Heliconius species, in addition to associations between gene expression and colour pattern. Colour pattern variants are also associated with differences in splicing of poik transcripts. poik is a member of the conserved fizzy family of cell cycle regulators. It belongs to a faster evolving subfamily, the closest functionally characterised orthologue being the cortex gene in Drosophila, a female germ-line specific protein involved in meiosis. poik appears to have adopted a novel function in the Lepidoptera and become a major target for natural selection acting on colour and pattern variation in this group.

Recombining without hotspots: A comprehensive evolutionary portrait of recombination in two closely related species of Drosophila

Caiti Smukowski Heil , Chris Ellison , Matthew Dubin , Mohamed Noor
doi: http://dx.doi.org/10.1101/016972

Meiotic recombination rate varies across the genome within and between individuals, populations, and species in virtually all taxa studied. In almost every species, this variation takes the form of discrete recombination hotspots, determined in Metazoans by a protein called PRDM9. Hotspots and their determinants have a profound effect on the genomic landscape, and share certain features that extend across the tree of life. Drosophila, in contrast, are anomalous in their absence of hotspots, PRDM9, and other species-specific differences in the determination of recombination. To better understand the evolution of meiosis and general patterns of recombination across diverse taxa, we present what may be the most comprehensive portrait of recombination to date, combining contemporary recombination estimates from each of two sister species along with historic estimates of recombination using linkage-disequilibrium-based approaches derived from sequence data from both species. Using Drosophila pseudoobscura and Drosophila miranda as a model system, we compare recombination rate between species at multiple scales, and we replicate the pattern seen in human-chimpanzee that recombination rate is conserved at broad scales and more divergent at finer scales. We also find evidence of a species-wide recombination modifier, resulting in both a present and historic genome wide elevation of recombination rates in D. miranda, and identify broad scale effects on recombination from the presence of an inter-species inversion. Finally, we reveal an unprecedented view of the distribution of recombination in D. pseudoobscura, illustrating patterns of linked selection and where recombination is taking place. Overall, by combining these estimation approaches, we highlight key similarities and differences in recombination between Drosophila and other organisms.

Repeatability of evolution on epistatic landscapes

Benedikt Bauer , Chaitanya S Gokhale
doi: http://dx.doi.org/10.1101/016782

Evolution is a dynamic process. The two classical forces of evolution are mutation and selection. Assuming small mutation rates, evolution can be predicted based solely on the fitness differences between phenotypes. Predicting an evolutionary process under varying mutation rates as well as varying fitness is still an open question. Experimental procedures, however, do include these complexities along with fluctuating population sizes and stochastic events such as extinctions. We investigate the mutational path probabilities of systems having epistatic effects on both fitness and mutation rates using a theoretical and computational framework. In contrast to previous models, we do not limit ourselves to the typical strong selection, weak mutation (SSWM) regime or to fixed population sizes. Rather, we allow epistatic interactions to also affect mutation rates. This can lead to qualitatively non-trivial dynamics. Pathways that are negligible in the SSWM regime can overcome fitness valleys and become accessible. This finding has the potential to extend the traditional predictions based on the SSWM foundation and bring us closer to what is observed in experimental systems.
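
As a point of reference for the SSWM baseline that the abstract contrasts itself with, the sketch below computes mutational path probabilities on a toy two-locus landscape. It is not the authors' framework: it assumes a fixed haploid population size, forward mutations only, and that each step is chosen with probability proportional to a genotype-dependent mutation rate times a Kimura-style fixation probability. The names fixation_prob, step_probs and path_prob, the landscape values, and the mutation rate function are all hypothetical.

```python
import math


def fixation_prob(s, n):
    """Kimura-style approximation for one new mutant with selection coefficient s
    in a haploid population of size n (assumed form; neutral limit is 1/n)."""
    if abs(s) < 1e-12:
        return 1.0 / n
    return (1 - math.exp(-2 * s)) / (1 - math.exp(-2 * n * s))


def forward_neighbors(g):
    """Genotypes reachable from bit string g by gaining exactly one mutation."""
    return [g[:i] + "1" + g[i + 1:] for i, bit in enumerate(g) if bit == "0"]


def step_probs(g, fitness, mu, n):
    """Probability of each next genotype: weight = mutation rate * fixation probability."""
    weights = {}
    for j in forward_neighbors(g):
        s = fitness[j] / fitness[g] - 1.0
        weights[j] = mu(g, j) * fixation_prob(s, n)
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def path_prob(path, fitness, mu, n):
    """Probability of a whole path as the product of its step probabilities."""
    prob = 1.0
    for g, j in zip(path, path[1:]):
        prob *= step_probs(g, fitness, mu, n)[j]
    return prob


# Toy landscape (hypothetical values) and two mutation-rate models.
FITNESS = {"00": 1.0, "01": 1.1, "10": 1.2, "11": 1.5}


def uniform_mu(g, j):
    return 1e-6


def biased_mu(g, j):
    # Hypothetical epistasis in mutation rate: the step 00 -> 01 is 10x more frequent.
    return 1e-5 if (g, j) == ("00", "01") else 1e-6
```

With the uniform rate, the path through "10" is favored because it has the larger first-step fitness gain; with the biased rate, the path through "01" becomes more likely even though the fitness landscape is unchanged, which illustrates why mutation-rate epistasis can change which pathways are taken.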