Interfertile oaks in an island environment. II. Limited hybridization between Quercus alnifolia Poech and Q. coccifera L. in a mixed stand

Posted on June 12, 2013 by Joe Pickrell

Charalambos Neophytou, Filippos A. Aravanopoulos, Siegfried Fink, Aikaterini Dounavi
(Submitted on 11 Jun 2013)

Hybridization and introgression between Quercus alnifolia Poech and Q. coccifera L. are studied by analyzing morphological traits and nuclear and chloroplast DNA markers. The study site is a mixed stand in the Troodos Mountains (Cyprus), and the analyzed material includes both adult trees and progenies of specific mother trees. Multivariate analysis of morphological traits shows that the two species can be well distinguished using simple leaf morphometric parameters. A lower genetic diversity in Q. alnifolia than in Q. coccifera and a high interspecific differentiation between the two species are supported by an analysis of nuclear and chloroplast microsatellites. The intermediacy of the four designated hybrids is verified by both leaf morphometric and genetic data. Analysis of progeny arrays provides evidence that interspecific crossings are rare. This finding is further supported by limited introgression of chloroplast genomes. Reproductive barriers (e.g. asynchronous phenology, post-zygotic incompatibilities) might account for this result. A directionality of interspecific gene flow is indicated by a genetic assignment analysis of effective pollen clouds, with Q. alnifolia acting as pollen donor. Differences in flowering phenology and species distribution in the stand may have influenced the direction of gene flow and the genetic differentiation among effective pollen clouds of different mother trees within species.

Our paper: Effect of Genetic Variation in a Drosophila Model of Misfolded Human Proinsulin

Posted on June 10, 2013 by Joe Pickrell

This guest post is by Bin He on two preprints, Genetic Complexity in a Drosophila Model of Diabetes-Associated Misfolded Human Proinsulin and Effect of Genetic Variation in a Drosophila Model of Misfolded Human Proinsulin, arXived here and here, respectively. This is a cross-post from Bin’s blog.

Here we describe a pair of papers, both of which have been posted by Joe on this blog in the past month. But since they are intimately connected, we would like to write an additional post to explain the rationales behind them and the major findings therein.

The central questions in these two papers concern the genetic architecture of complex traits, such as those underlying common human disorders. We took a model organism approach to complement human studies, which are becoming more and more powerful thanks to successful community collaboration but are still limited in several respects, including mapping resolution and the ability to perform experimental validation.

Another important idea underlying this project is that decanalization of a trait may have released genetic variation, which subsequently contributed to the disease variability we see today. In this sense, our fly model of misfolded human proinsulin may be viewed as an external perturbation which, by exhausting the organism’s buffering capacity, reveals normally cryptic genetic variation. Under this view, our model has general relevance to many human disorders.

To perform this study, we first established a fly model of a disease-associated human mutant proinsulin, which was the subject of our first paper “Genetic Complexity in a Drosophila Model of Diabetes-Associated Misfolded Human Proinsulin”.

We’d like to bring out several points. First, regarding the etiology of the disease phenotype in our fly model, we believe it is mainly due to the physical properties of the mutant protein rather than the biological function of human proinsulin. Although Drosophila also has insulin-like proteins, their sequences and functions differ substantially from those of the human homolog. Consistent with this view, when we made a transgenic fly expressing wild-type human proinsulin in the developing eye and other imaginal discs, we observed no visible changes at either the phenotypic or the transcriptional level. We thus propose that our fly model represents a general class of human diseases associated with unfolded or misfolded proteins.

In the first paper, we also described how phenotypic severity varies when the transgene is placed on different wild-derived genetic backgrounds. A series of experiments ruled out possible confounding factors, such as correlations induced by natural variability in eye size or different levels of transgene expression.

We were exploring the idea of using natural variation in the fly to identify loci underlying a complex disease trait. We did so by crossing the transgenic line carrying the Mendelian disease mutation to a panel of wild-derived inbred lines, and asked whether the severity of the disease depends on the genetic background. The answer is a definite yes: the phenotype, quantified by eye size, ranges from 10% to 80% of wild type. (The mutant human proinsulin was expressed in the eye disc during development, causing neurodegeneration. We used the eye because it is dispensable in lab conditions and its phenotype is easy to measure.) We then conducted a GWAS, which led to the identification of sfl and, through genetic tests, of the heparan sulfate (HS) biosynthetic pathway. One unique advantage of our system is its ultra-high mapping resolution: we localized the association signal to a ~400 bp LD block within one of the introns of sfl, allowing us to test specific hypotheses about the molecular mechanisms of the associated variants. Pyrosequencing analysis revealed an allele-specific expression difference due to the intronic variation, but also highlighted genetic heterogeneity even within that locus, with additional cis-variants influencing the expression level. Overall, we believe that our fly model system is a powerful complementary approach to the genetic study of complex traits. Its high mapping resolution and rich molecular and genetic toolkits allow faster, in-depth characterization of disease-associated variation.
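To make the idea of an association scan across inbred lines concrete, here is a minimal sketch in Python. It is not the pipeline used in the papers: the marker names, the toy eye-size values, and the choice of Welch's t statistic as the per-marker test are all assumptions made for illustration. Each marker splits the lines into two allele classes, and the scan asks which marker best separates the phenotype.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def scan(genotypes, phenotype):
    """Return {marker: t} comparing lines carrying allele 0 with lines carrying allele 1."""
    results = {}
    for marker, alleles in genotypes.items():
        ref = [p for g, p in zip(alleles, phenotype) if g == 0]
        alt = [p for g, p in zip(alleles, phenotype) if g == 1]
        # Skip markers where one allele class is too small to estimate a variance.
        if len(ref) < 2 or len(alt) < 2:
            continue
        results[marker] = welch_t(ref, alt)
    return results


# Hypothetical toy data: mean eye size of each line's F1, as a fraction of wild type.
eye_size = [0.10, 0.15, 0.20, 0.25, 0.60, 0.70, 0.75, 0.80]
# Hypothetical biallelic markers, one 0/1 genotype per inbred line.
genotypes = {
    "marker_A": [0, 0, 0, 0, 1, 1, 1, 1],  # tracks the phenotype
    "marker_B": [0, 1, 0, 1, 0, 1, 0, 1],  # unrelated to the phenotype
}

scores = scan(genotypes, eye_size)
top = max(scores, key=lambda m: abs(scores[m]))
print(top, scores)
```

A real analysis would also need to handle multiple testing and population structure, which this sketch ignores.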

Bin Z. He
Kreitman Lab, Dept of Ecology and Evolution, University of Chicago
current address: O’Shea Lab, FAS Center for Systems Biology, Harvard University / HHMI

On the accumulation of deleterious mutations during range expansions

Posted on June 10, 2013 by Joe Pickrell

Stephan Peischl, Isabelle Dupanloup, Mark Kirkpatrick, Laurent Excoffier
(Submitted on 7 Jun 2013)

We investigate the effect of spatial range expansions on the evolution of fitness when beneficial and deleterious mutations co-segregate. We perform individual-based simulations of a uniform linear habitat and complement them with analytical approximations for the evolution of mean fitness at the edge of the expansion. We find that deleterious mutations accumulate steadily on the wave front during range expansions, thus creating an expansion load. Reduced fitness due to the expansion load is not restricted to the wave front but occurs over a large proportion of newly colonized habitats. The expansion load can persist and represent a major fraction of the total mutation load thousands of generations after the expansion. Our results extend qualitatively and quantitatively to two-dimensional expansions. The phenomenon of expansion load may explain growing evidence that populations that have recently expanded, including humans, show an excess of deleterious mutations. To test the predictions of our model, we analyze patterns of neutral and non-neutral genetic diversity in humans and find an excellent fit between theory and data.

Low-bandwidth and non-compute intensive remote identification of microbes from raw sequencing reads

Posted on June 10, 2013 by Joe Pickrell

Laurent Gautier, Ole Lund
(Submitted on 6 Jun 2013)

Cheap high-throughput DNA sequencing may soon become routine not only for human genomes but also for practically anything requiring the identification of living organisms from their DNA: tracking of infectious agents, control of food products, bioreactors, or environmental samples.
We propose a novel general approach to the analysis of sequencing data in which the reference genome does not have to be specified. Using a distributed architecture we are able to query a remote server for hints about what the reference might be, transferring a relatively small amount of data, and the hints can be used for more computationally-demanding work.
Our system consists of a server with known reference DNA indexed, and a client with raw sequencing reads. The client sends a sample of unidentified reads, and in return receives a list of matching references known to the server. Sequences for the references can be retrieved and used for exhaustive computation on the reads, such as alignment.
To demonstrate this approach we have implemented a web server, indexing tens of thousands of publicly available genomes and genomic regions from various organisms and returning lists of matching hits from query sequencing reads. We have also implemented two clients, one of them running in a web browser, in order to demonstrate that gigabytes of raw sequencing reads of unknown origin could be identified without the need to transfer a very large volume of data, and on modestly powered computing devices.
A web access is available at this http URL. The source code for a python command-line client, a server, and supplementary data is available at this http URL.
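The client/server exchange described in the abstract can be sketched as follows. This is an illustrative toy, not the authors' implementation: the abstract does not say how the server matches reads, so the k-mer index, the value of `K`, the reservoir sampling on the client, and every function name below are assumptions.

```python
import random
from collections import Counter

K = 8  # hypothetical k-mer length; a real index would likely use longer k-mers


def kmers(seq, k=K):
    """Set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k=K):
    """Server side: map each k-mer to the names of the references containing it."""
    index = {}
    for name, seq in references.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(name)
    return index


def sample_reads(reads, n, seed=0):
    """Client side: reservoir-sample n reads in a single pass over the input."""
    rng = random.Random(seed)
    sample = []
    for i, read in enumerate(reads):
        if i < n:
            sample.append(read)
        else:
            j = rng.randint(0, i)
            if j < n:
                sample[j] = read
    return sample


def query(index, reads, k=K):
    """Server side: rank references by how many sampled reads share a k-mer with them."""
    hits = Counter()
    for read in reads:
        matched = set()
        for km in kmers(read, k):
            matched |= index.get(km, set())
        hits.update(matched)
    return hits.most_common()


# Toy references and reads (made up for this example).
references = {
    "ref_A": "ACGTTGCAAGCTTAGGCTAACGGATCCGATTACA",
    "ref_B": "TTTTGGGGCCCCAAAATTTTGGGGCCCCAAAAGG",
}
reads = [references["ref_A"][i:i + 12] for i in range(0, 20, 2)]

index = build_index(references)
sample = sample_reads(reads, 5)   # only this small sample would cross the network
ranked = query(index, sample)
print(ranked)
```

The point of the design is that only the sample and the short list of hits are transferred; the client can then fetch the matching reference sequences and run alignment locally.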

SPATA: A Seeding and Patching Algorithm for Hybrid Transcriptome Assembly

Posted on June 7, 2013 by Joe Pickrell

Tin Chi Nguyen, Zhiyu Zhao, Dongxiao Zhu
(Submitted on 6 Jun 2013)

Transcriptome assembly from RNA-Seq reads is an active area of bioinformatics research. The ever-declining cost and the increasing depth of RNA-Seq have provided unprecedented opportunities to better identify expressed transcripts. However, the nonlinear transcript structures and the ultra-high throughput of RNA-Seq reads pose significant algorithmic and computational challenges to the existing transcriptome assembly approaches, either reference-guided or de novo. While reference-guided approaches offer good sensitivity, they rely on alignment results of the splice-aware aligners and are thus unsuitable for species with incomplete reference genomes. In contrast, de novo approaches do not depend on the reference genome but face a computationally daunting task derived from the complexity of the graph built for the whole transcriptome. In response to these challenges, we present a hybrid approach to exploit an incomplete reference genome without relying on splice-aware aligners. We have designed a split-and-align procedure to efficiently localize the reads to individual genomic loci, which is followed by an accurate de novo assembly to assemble reads falling into each locus. Using extensive simulation data, we demonstrate a high accuracy and precision in transcriptome reconstruction by comparing to selected transcriptome assembly tools. Our method is implemented in assemblySAM, a GUI software freely available at this http URL.

Hide and seek: placing and finding an optimal tree for thousands of homoplasy-rich sequences

Posted on June 7, 2013 by Joe Pickrell

Dietrich Radel, Andreas Sand, Mike Steel
(Submitted on 6 Jun 2013)

Finding optimal evolutionary trees from sequence data is typically an intractable problem, and there is usually no way of knowing how close to optimal the best tree from some search truly is. The problem would seem to be particularly acute when we have many taxa and when that data has high levels of homoplasy, in which the individual characters require many changes to fit on the best tree. However, a recent mathematical result has provided a precise tool to generate a small number of high-homoplasy characters for any given tree, so that this tree is provably the optimal tree under the maximum parsimony criterion. This provides, for the first time, a rigorous way to test tree search algorithms on homoplasy-rich data, where we know in advance what the ‘best’ tree is. In this short note we consider just one search program (TNT) but show that it is able to locate the globally optimal tree correctly for 32,768 taxa, even though the characters in the dataset require, on average, 1148 state-changes each to fit on this tree, and the number of characters is only 57.
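For readers unfamiliar with what "state-changes to fit on a tree" means, the count for a single character on a fixed tree can be computed with Fitch's algorithm. The sketch below is a generic illustration of that count on a made-up eight-taxon tree. It is not the construction from the paper, and it is not how TNT searches tree space. The tree, taxon names, and character states are hypothetical.

```python
def fitch_score(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree:   nested 2-tuples with leaf names (strings) at the tips
    states: dict mapping each leaf name to its observed character state
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: its state set is the observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:                         # children agree on at least one state
            return common
        changes += 1                       # disagreement costs one change
        return left | right

    visit(tree)
    return changes


# Hypothetical balanced tree on eight taxa.
tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))

# A character that fits the tree cleanly: one change separates the two halves.
clean = dict(zip("abcdefgh", [0, 0, 0, 0, 1, 1, 1, 1]))
# A homoplasy-rich character: the states alternate within every pair of sister taxa.
noisy = dict(zip("abcdefgh", [0, 1, 0, 1, 0, 1, 0, 1]))

print(fitch_score(tree, clean), fitch_score(tree, noisy))
```

The parsimony score of a tree is the sum of this count over all characters, and the search problem is to find the tree that minimizes that sum.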

Density behavior of spatial birth-and-death stochastic evolution of mutating genotypes under selection rates

Posted on June 7, 2013 by Joe Pickrell

Dmitri Finkelshtein, Yuri Kondratiev, Oleksandr Kutoviy, Stanislav Molchanov, Elena Zhizhina
(Submitted on 5 Jun 2013)

We consider birth-and-death stochastic evolution of genotypes with different lengths. The genotypes may mutate, which produces a stochastic change of lengths according to a free diffusion law. The birth and death rates are length dependent, which corresponds to a selection effect. We study the asymptotic behavior of the density for an infinite collection of genotypes. The cases of space-homogeneous and space-heterogeneous densities are considered.

Genetic Complexity in a Drosophila Model of Diabetes-Associated Misfolded Human Proinsulin

Posted on June 4, 2013 by Joe Pickrell

Soo-Young Park, Michael Z. Ludwig, Natalia A. Tamarina, Bin Z. He, Sarah H. Carl, Desiree A. Dickerson, Levi Barse, Bharath Arun, Calvin Williams, Cecelia M. Miles, Louis H. Philipson, Donald F. Steiner, Graeme I. Bell, Martin Kreitman
(Submitted on 31 May 2013)

Here we use Drosophila melanogaster to create a genetic model of human permanent neonatal diabetes mellitus and present experimental results describing dimensions of this complexity. The approach involves the transgenic expression of a misfolded mutant of human preproinsulin, hINSC96Y, which is a cause of the disease. When expressed in fly imaginal discs, hINSC96Y causes a reduction of adult structures, including the eye, wing and notum. Eye imaginal discs exhibit defects in both the structure and arrangement of ommatidia. In the wing, expression of hINSC96Y leads to ectopic expression of veins and mechano-sensory organs, indicating disruption of wild type signaling processes regulating cell fates. These readily measurable disease phenotypes are sensitive to temperature, gene dose and sex. Mutant (but not wild type) proinsulin expression in the eye imaginal disc induces IRE1-mediated Xbp1 alternative splicing, a signal for endoplasmic reticulum stress response activation, and produces global change in gene expression. Mutant hINS transgene tester strains, when crossed to stocks from the Drosophila Genetic Reference Panel, produce F1 adults with a continuous range of disease phenotypes and large broad-sense heritability. Surprisingly, the severity of mutant hINS-induced disease in the eye is not correlated with that in the notum in these crosses, nor with eye reduction phenotypes caused by the expression of two dominant eye mutants acting in two different eye development pathways, Drop (Dr) or Lobe (L), when crossed into the same genetic backgrounds. The tissue specificity of genetic variability for mutant hINS-induced disease thus has its own distinct signature. The genetic dominance of disease-specific phenotypic variability makes this approach amenable to genome-wide association study (GWAS) in a simple F1 screen of natural variation.

Genome Sequencing Highlights Genes Under Selection and the Dynamic Early History of Dogs

Posted on June 3, 2013 by Joe Pickrell

Adam H. Freedman, Rena M. Schweizer, Ilan Gronau, Eunjung Han, Diego Ortega-Del Vecchyo, Pedro M. Silva, Marco Galaverni, Zhenxin Fan, Peter Marx, Belen Lorente-Galdos, Holly Beale, Oscar Ramirez, Farhad Hormozdiari, Can Alkan, Carles Vilà, Kevin Squire, Eli Geffen, Josip Kusak, Adam R. Boyko, Heidi G. Parker, Clarence Lee, Vasisht Tadigotla, Adam Siepel, Carlos D. Bustamante, Timothy T. Harkins, Stanley F. Nelson, Elaine A. Ostrander, Tomas Marques-Bonet, Robert K. Wayne, John Novembre
(Submitted on 31 May 2013)

To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we analyzed novel high-quality genome sequences of three gray wolves, one from each of three putative centers of dog domestication, two ancient dog lineages (Basenji and Dingo) and a golden jackal as an outgroup. We find dogs and wolves diverged through a dynamic process involving population bottlenecks in both lineages and post-divergence gene flow, which confounds previous inferences of dog origins. In dogs, the domestication bottleneck was severe, involving a 17 to 49-fold reduction in population size, a much stronger bottleneck than estimated previously from less intensive sequencing efforts. A sharp bottleneck in wolves occurred soon after their divergence from dogs, implying that the pool of diversity from which dogs arose was far larger than represented by modern wolf populations. Conditional on mutation rate, we narrow the plausible range for the date of initial dog domestication to an interval from 11 to 16 thousand years ago. This period predates the rise of agriculture, implying that the earliest dogs arose alongside hunter-gatherers rather than agriculturists. Regarding the geographic origin of dogs, we find that, surprisingly, none of the extant wolf lineages from putative domestication centers are more closely related to dogs, and the sampled wolves instead form a sister monophyletic clade. This result, in combination with our finding of dog-wolf admixture during the process of domestication, suggests a re-evaluation of past hypotheses of dog origin is necessary. Finally, we also detect signatures of selection, including evidence for selection on genes implicated in morphology, metabolism, and neural development. Uniquely, we find support for selective sweeps at regulatory sites, suggesting gene regulatory changes played a critical role in dog domestication.

Haldane's Sieve

Discussing preprints in population and evolutionary genetics

Author Archives: Joe Pickrell
