Hybridization capture using RAD probes (hyRAD), a new tool for performing genomic analyses on museum collection specimens.

Posted on September 2, 2015 by schraib

Tomasz Suchan, Camille Pitteloud, Nadezhda Gerasimova, Anna Kostikova, Nils Arrigo, Mila Pajkovic, Michał Ronikier, Nadir Alvarez

bioRxiv doi: http://dx.doi.org/10.1101/025551

In recent years, many protocols aimed at reproducibly sequencing reduced-genome subsets in non-model organisms have been published. Among them, RAD-sequencing is one of the most widely used. It relies on digesting DNA with specific restriction enzymes and performing size selection on the resulting fragments. Despite its utility, this method is of limited use with degraded DNA samples, such as those isolated from museum specimens, as these are less likely to harbor fragments long enough to comprise two restriction sites, which hampers both ligation of the required technical sequences and size selection of the resulting fragments. In addition, RAD-sequencing proves suboptimal when applied at evolutionary scales larger than the intra-specific level, as polymorphisms in the restriction sites cause loci dropout. Here, we address both of these limitations with a novel method called hybridization RAD (hyRAD). In this method, biotinylated RAD fragments, covering a random fraction of the genome, are used as baits for capturing homologous fragments from samples processed through a classical genomic shotgun sequencing protocol. This simple and cost-effective approach allows sequencing orthologous sequences even from highly degraded DNA samples, opening new avenues of research in the field of museum genomics. Not relying on the presence of restriction sites, it improves among-sample loci coverage and can be applied to broader phylogenetic scales. In a trial study, hyRAD allowed us to obtain a large set of orthologous loci from fresh and museum samples of a non-model butterfly species, with over 10,000 single nucleotide polymorphisms present in all eight analyzed specimens, including 58-year-old museum samples.

SFS_CODE: More Efficient and Flexible Forward Simulations

Posted on September 2, 2015 by schraib

Ryan D. Hernandez, Lawrence H. Uricchio

bioRxiv doi: http://dx.doi.org/10.1101/025064

SUMMARY: Modern implementations of forward population genetic simulations are efficient and flexible, enabling the exploration of complex models that may otherwise be intractable. Here we describe an updated version of SFS_CODE, which has increased efficiency and includes many novel features. Among these features are an arbitrary model of dominance, the ability to simulate partial and soft selective sweeps, and the ability to track the trajectories of mutations and/or ancestries across multiple populations under complex models that are not possible under a coalescent framework. We also release sfs_coder, a Python wrapper to SFS_CODE allowing the user to easily generate command lines for common models of demography, selection, and human genome structure, as well as parse and simulate phenotypes from SFS_CODE output. Availability and Implementation: Our open source software is written in C and Python, and is available under the GNU General Public License at http://sfscode.sourceforge.net. Contact: ryan.hernandez@ucsf.edu Supplementary information: Detailed usage information is available from the project website at http://sfscode.sourceforge.net.

Teaser: Individualized benchmarking and optimization of read mapping results for NGS data

Posted on September 2, 2015 by schraib

doi: http://dx.doi.org/10.1101/025858

Mapping reads to a genome remains challenging, especially for non-model organisms with poorer quality assemblies, or for organisms with higher rates of mutations. While most research has focused on speeding up the mapping process, little attention has been paid to optimizing the choice of mapper and parameters for a user’s dataset. Here we present Teaser, which assists in these choices through rapid automated benchmarking of different mappers and parameter settings for individualized data. Within minutes, Teaser completes a quantitative evaluation of an ensemble of mapping algorithms and parameters. Using Teaser, we demonstrate how Bowtie2 can be optimized for different data.

Fitness-valley crossing with generalized parent-offspring transmission

Posted on September 2, 2015 by schraib

Matthew Osmond, Sarah P Otto

bioRxiv doi: http://dx.doi.org/10.1101/025502

Simple and ubiquitous gene interactions create rugged fitness landscapes composed of coadapted gene complexes separated by “valleys” of low fitness. Crossing such fitness valleys allows a population to escape suboptimal local fitness peaks to become better adapted. This is the premise of Sewall Wright’s shifting balance process. Here we generalize the theory of fitness-valley crossing in the two-locus, biallelic case by allowing bias in parent-offspring transmission. This generalization extends the existing mathematical framework to genetic systems with segregation distortion and uniparental inheritance. Our results are also flexible enough to provide insight into shifts between alternate stable states in cultural systems with “transmission valleys”. Using a semi-deterministic analysis and a stochastic diffusion approximation, we focus on the limiting step in valley crossing: the first appearance of the genotype on the new fitness peak whose lineage will eventually fix. We then apply our results to specific cases of segregation distortion, uniparental inheritance, and cultural transmission. Segregation distortion favouring mutant alleles facilitates crossing most when recombination and mutation are rare, i.e., scenarios where crossing is otherwise unlikely. Interactions with more mutable genes (e.g., uniparentally inherited cytoplasmic elements) substantially reduce crossing times. Despite component traits being passed on poorly in the previous cultural background, small advantages in the transmission of a new combination of cultural traits can greatly facilitate a cultural transition. While peak shifts are unlikely under many of the common assumptions of population genetic theory, relaxing some of these assumptions can promote fitness-valley crossing.

Statistical Inference of a Convergent Antibody Repertoire Response to Influenza Vaccine

Posted on September 2, 2015 by schraib

Nicolas Strauli, Ryan Hernandez

bioRxiv doi: http://dx.doi.org/10.1101/025098

Background: Vaccines dramatically affect an individual’s adaptive immune system, and thus provide an excellent means to study human immunity. Upon vaccination, the B cells that express antibodies (Abs) that happen to bind the vaccine are stimulated to proliferate and undergo mutagenesis at their Ab locus. This process may alter the composition of B cell lineages within an individual, which are known collectively as the antibody repertoire (AbR). Antibodies are also highly expressed in whole blood, potentially enabling unbiased RNA sequencing technologies to query this diversity. Less is known about the diversity of AbR responses across individuals to a given vaccine and whether individuals tend to yield a similar response to the same antigenic stimulus. Methods: Here we implement a bioinformatic pipeline that extracts the AbR information from a time-series RNA-seq dataset of 5 patients who were administered a seasonal trivalent influenza vaccine (TIV). We harness the detailed time-series nature of this dataset and use methods based on functional data analysis (FDA) to identify the B cell lineages that respond to the vaccine. We then design and implement rigorous statistical tests to ask whether these patients exhibit a convergent AbR response to the same TIV. Results: We find that high-resolution time-series data can be used to help identify the Ab lineages that respond to an antigenic stimulus, and that this response can exhibit a convergent nature across patients inoculated with the same vaccine. However, correlations in AbR diversity among individuals prior to inoculation can confound inference of a convergent signal unless they are taken into account. Conclusions: We developed a framework to identify the elements of an AbR that respond to an antigen. This information could be used to understand the diversity of different immune responses in different individuals, as well as to gauge the effectiveness of the immune response to a given stimulus within an individual. We also present a framework for testing a convergent hypothesis between AbRs, a hypothesis that is more difficult to test than previously appreciated. Our discovery of a convergent signal suggests that similar epitopes do select for antibodies with similar sequence characteristics.

Allele-specific expression reveals interactions between genetic variation and environment

Posted on September 2, 2015 by schraib

doi: http://dx.doi.org/10.1101/025874

The impact of environment on human health is dramatic, with major risk factors including substance use, diet and exercise. However, identifying interactions between the environment and an individual’s genetic background (GxE) has been hampered by statistical and computational challenges. By combining RNA sequencing of whole blood and extensive environmental annotations collected from 922 individuals, we have evaluated GxE interactions at a cellular level. We have developed EAGLE, a hierarchical Bayesian model for identifying GxE interactions based on association between environment and allele-specific expression (ASE). EAGLE increases power by leveraging the controlled, within-sample comparison of environmental impact on different genetic backgrounds provided by ASE, while also taking into account technical covariates and over-dispersion of sequencing read counts. EAGLE identifies 35 GxE interactions, a substantial increase over standard GxE testing. Among EAGLE hits are variants that modulate response to smoking, exercise and blood pressure medication. Further, application of EAGLE identifies GxE interactions to infection response that replicate results reported in vitro, demonstrating the power of EAGLE to accurately identify GxE candidates from large RNA sequencing studies.

On the widespread and critical impact of systematic bias and batch effects in single-cell RNA-Seq data

Posted on September 2, 2015 by schraib

Stephanie C Hicks, Mingxiang Teng, Rafael A Irizarry

bioRxiv doi: http://dx.doi.org/10.1101/025528

Single-cell RNA-Sequencing (scRNA-Seq) has become the most widely used high-throughput method for transcription profiling of individual cells. Systematic errors, including batch effects, have been widely reported as a major challenge in high-throughput technologies. Surprisingly, these issues have received minimal attention in published studies based on scRNA-Seq technology. We examined data from five published studies and found that systematic errors can explain a substantial percentage of observed cell-to-cell expression variability. Specifically, we found that the proportion of genes reported as expressed explains a substantial part of observed variability and that this quantity varies systematically across experimental batches. Furthermore, we found that the implemented experimental designs confounded outcomes of interest with batch effects, a design that can bring into question some of the conclusions of these studies. Finally, we propose a simple experimental design that can ameliorate the effect that these systematic errors have on downstream results.

Shift and adapt: the costs and benefits of karyotype variations

Posted on September 2, 2015 by schraib

Aleeza C Gerstein, Judith Berman

bioRxiv doi: http://dx.doi.org/10.1101/025171

Variation is the spice of life or, in the case of evolution, variation is the necessary material on which selection can act to enable adaptation. Karyotypic variation in ploidy (the number of homologous chromosome sets) and aneuploidy (imbalance in the number of chromosomes) is fundamentally different from other types of genomic variants. Karyotypic variation emerges through different molecular mechanisms than other mutational events, and unlike mutations that alter the genome at the base pair level, rapid reversion to the wild type chromosome number is often possible. Although karyotypic variation has long been noted and discussed by biologists, interest in the importance of karyotypic variants in evolutionary processes has spiked in recent years, and much remains to be discovered about how karyotypic variants are produced and subsequently selected.

Protein homeostasis imposes a barrier on functional integration of horizontally transferred genes in bacteria

Posted on September 2, 2015 by schraib

Shimon Bershtein, Adrian Serohijos, Sanchari Bhattacharyya, Michael Manhart, Jeong-Mo Choi, Wangmen Mu, Jingwen Zhou, Eugene Shakhnovich

bioRxiv doi: http://dx.doi.org/10.1101/025841

Horizontal gene transfer (HGT) plays a central role in bacterial evolution, yet the molecular and cellular constraints on functional integration of the foreign genes are poorly understood. Here we performed inter-species replacement of the chromosomal folA gene, encoding the essential metabolic enzyme dihydrofolate reductase (DHFR), with orthologs from 35 other mesophilic bacteria. The orthologous inter-species replacements caused a marked drop (in the range 10-90%) in bacterial growth rate despite the fact that most orthologous DHFRs are as stable as E. coli DHFR at 37°C and are more catalytically active than E. coli DHFR. Although phylogenetic distance between E. coli and orthologous DHFRs as well as their individual molecular properties correlate poorly with growth rates, the product of the intracellular DHFR abundance and catalytic activity (kcat/KM) correlates strongly with growth rates, indicating that the drop in DHFR abundance constitutes the major fitness barrier to HGT. Serial propagation of the orthologous strains for ~600 generations dramatically improved growth rates by largely alleviating the fitness barriers. Whole genome sequencing and global proteome quantification revealed that the evolved strains with the largest fitness improvements have accumulated mutations that inactivated the ATP-dependent Lon protease, causing an increase in the intracellular DHFR abundance. In one case DHFR abundance increased further due to mutations accumulated in the folA promoter, but only after the lon inactivating mutations were fixed in the population. Thus, by apparently distinguishing between self and non-self proteins, protein homeostasis imposes an immediate and global barrier to the functional integration of foreign genes by decreasing the intracellular abundance of their products. Once this barrier is alleviated, more fine-tuned evolution occurs to adjust the function/expression of the transferred proteins to the constraints imposed by the intracellular environment of the host organism.

Further genetic diversification in multiple tumors and an evolutionary perspective on therapeutics

Posted on September 2, 2015 by schraib

Yong Tao, Zheng Hu, Shaoping Ling, Shiou-Hwie Yeh, Weiwei Zhai, Ke Chen, Chunyan Li, Yu Wang, Kaile Wang, Hurng-Yi Wang, Eric A Hungate, Kenan Onel, Jiang Liu, Changqing Zeng, Richard R Hudson, Pei-Jer Chen, Xuemei Lu, Chung-I Wu

bioRxiv doi: http://dx.doi.org/10.1101/025429

The genetic diversity within a single tumor can be extremely large, possibly with mutations at all coding sites (Ling et al. 2015). In this study, we analyzed 12 cases of multiple hepatocellular carcinoma (HCC) tumors by sequencing and genotyping several samples from each case. In 10 cases, tumors are clonally related by a process of cell migration and colonization. They permit a detailed analysis of the evolutionary forces (mutation, migration, drift and natural selection) that influence the genetic diversity both within and between tumors. In 23 inter-tumor comparisons, the descendant tumor usually shows a higher growth rate than the parent tumor. In contrast, neutral diversity dominates within-tumor observations such that adaptively growing clones are rarely found. The apparent adaptive evolution between tumors can be explained by the inherent bias for detecting larger tumors that have a growth advantage. Beyond these tumors are a far larger number of clones which, growing at a neutral rate and too small to see, can nevertheless be verified by molecular means. Given that the estimated genetic diversity is often very large, therapeutic strategies need to take into account the pre-existence of many drug-resistance mutations. Importantly, these mutations are expected to be in the very low frequency range in the primary tumors (and to become frequent in the relapses, as is indeed reported (1-3)). In conclusion, tumors may often harbor a very large number of mutations in the very low frequency range. This duality provides both a challenge and an opportunity for designing strategies against drug resistance (4-8).

Haldane's Sieve

Discussing preprints in population and evolutionary genetics
