Positive selection on a regulatory insertion-deletion polymorphism in FADS2 influences apparent endogenous synthesis of arachidonic acid

Posted on March 9, 2016 by schraib

Kumar S.D. Kothapalli, Kaixiong Ye, Maithili S. Gadgil, Susan E. Carlson, Kimberly O. O’Brien, Ji Yao Zhang, Hui Gyu Park, Kinsley Ojukwu, James Zou, Stephanie S. Hyon, Kalpana S. Joshi, Alon Keinan, J. Thomas Brenna

bioRxiv doi: http://dx.doi.org/10.1101/042549

Long chain polyunsaturated fatty acids (LCPUFA) are bioactive components of membrane phospholipids and serve as substrates for signaling molecules. LCPUFA can be obtained directly from animal foods or synthesized endogenously from 18-carbon precursors via the FADS2-encoded enzyme. Vegans rely almost exclusively on endogenous synthesis to generate LCPUFA, and we hypothesized that an adaptive genetic polymorphism would confer an advantage. The rs66698963 polymorphism, a 22 bp insertion-deletion within FADS2, is associated with basal FADS1 expression and with coordinated induction of FADS1 and FADS2 in vitro. Here we determined rs66698963 genotype frequencies in 234 individuals from a primarily vegetarian Indian population and 311 individuals from the U.S. A much higher I/I genotype frequency was found in Indians (68%) than in the U.S. (18%). Analysis using 1000 Genomes Project data confirmed our observation, revealing global I/I genotype frequencies of 70% in South Asians, 53% in Africans, 29% in East Asians, and 17% in Europeans. Tests based on population divergence, site frequency spectrum and long-range haplotype consistently point to positive selection encompassing rs66698963 in South Asian, African and some East Asian populations. Basal plasma phospholipid arachidonic acid status was 8% greater in I/I compared to D/D individuals. The biochemical pathway product-precursor difference, arachidonic acid minus linoleic acid, was 31% and 13% greater for I/I and I/D compared to D/D, respectively. Our study is consistent with previous in vitro data suggesting that the insertion allele enhances n-6 LCPUFA synthesis and may confer an adaptive advantage in South Asians because of the traditional plant-based diet practice.

Horizontal transfer in bacterial Methionyl tRNA synthetase is very common shown by Genus and phyla level phylogenetic analysis.

Posted on March 9, 2016 by schraib

Prabhakar Ghorpade, Avinash Pange, Bhaskar Sharma

bioRxiv doi: http://dx.doi.org/10.1101/042366

Methionyl tRNA synthetase (MetG) is a single-copy informational gene in Salmonella typhimurium. Informational genes are more conserved than operational genes. In this study we analyzed horizontal gene transfer (HGT) events within MetG sequences of different bacterial genera. A species tree based on 16S rRNA sequences of the same genera was drawn and evaluated against the generally accepted species tree of the bacteria. The MetG phylogenetic tree was then evaluated against the 16S rRNA tree and HGT events were identified. Phylum-level trees were built in the same way and HGT events identified. In total, 24 HGT events were identified between genera and 11 within phyla. MetG is considered a conserved gene, so finding this many HGT events indicates that horizontal gene transfer is very common in this gene. Manual tree making for phyla could help in understanding phylogenetic relationships within very large trees.

Accelerating Wright-Fisher Forward Simulations on the Graphics Processing Unit

Posted on March 9, 2016 by schraib

David S. Lawrie

bioRxiv doi: http://dx.doi.org/10.1101/042622

Forward Wright-Fisher simulations are powerful in their ability to model complex demography and selection scenarios, but suffer from slow execution on the CPU, thus limiting their usefulness. The single-locus Wright-Fisher forward algorithm is, however, exceedingly parallelizable, with many steps which are so-called embarrassingly parallel, consisting of a vast number of individual computations that are all independent of each other and thus capable of being performed concurrently. The rise of modern Graphics Processing Units (GPUs) and programming languages designed to leverage the inherent parallel nature of these processors has allowed researchers to dramatically speed up many programs that have such high arithmetic intensity and intrinsic concurrency. The presented GPU Optimized Wright-Fisher simulation, or GO Fish for short, can be used to simulate arbitrary selection and demographic scenarios while running over 340-fold faster than its serial counterpart on the CPU. Even modest GPU hardware can achieve an impressive speedup of well over two orders of magnitude. With simulations so accelerated, one can not only do quick parametric bootstrapping of previously estimated parameters, but also use simulated results to calculate the likelihoods and summary statistics of demographic and selection models against real polymorphism data – all without restricting the demographic and selection scenarios that can be modeled or requiring approximations to the single-locus forward algorithm for efficiency. Further, as many of the parallel programming techniques used in this simulation can be applied to other computationally intensive algorithms important in population genetics, GO Fish serves as an exciting template for future research into accelerating computation in evolution. Code available (soon) at: https://github.com/DL42/GOFish
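
The independence of sites that makes the single-locus algorithm easy to parallelize can be illustrated with a minimal serial sketch in plain Python. This is not the GO Fish code: the function names, the genic selection model (fitness 1 + s for the derived allele), and the Bernoulli-sum binomial sampler are all assumptions made for brevity. The point is only that the per-site update inside the list comprehension never reads another site's state, so each site could be handed to its own GPU thread.

```python
import random


def next_generation(freq, pop_size, sel_coeff, rng):
    """Advance one biallelic site by one Wright-Fisher generation."""
    # Selection (assumed genic model): derived allele has fitness 1 + s.
    expected = freq * (1.0 + sel_coeff) / (1.0 + sel_coeff * freq)
    # Drift: binomial sampling of 2N gene copies, done here as a sum of
    # Bernoulli trials so that only the standard library is needed.
    copies = 2 * pop_size
    count = sum(1 for _ in range(copies) if rng.random() < expected)
    return count / copies


def simulate(freqs, pop_size, sel_coeff, generations, seed=0):
    """Evolve a list of independent sites forward in time."""
    rng = random.Random(seed)
    freqs = list(freqs)
    for _ in range(generations):
        # Each site is updated independently of every other site.
        freqs = [next_generation(f, pop_size, sel_coeff, rng) for f in freqs]
    return freqs


if __name__ == "__main__":
    start = [0.0, 1.0, 0.5, 0.1]
    print(simulate(start, pop_size=50, sel_coeff=0.01, generations=20))
```

Sites that start lost (0.0) or fixed (1.0) stay that way in this sketch because it includes no mutation step.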

Does linked selection explain the narrow range of genetic diversity across species?

Posted on March 9, 2016 by schraib

Graham Coop

bioRxiv doi: http://dx.doi.org/10.1101/042598

The relatively narrow range of genetic polymorphism levels across species has been a major source of debate since the inception of molecular population genetics. Recently, Corbett-Detig et al. found evidence that linked selection strongly constrains levels of polymorphism in species with large census sizes. Here I reexamine this claim and find weak support for this conclusion. While linked selection is an important determinant of polymorphism levels along the genome in many species, we currently lack compelling evidence that it is a major determinant of polymorphism levels among obligately sexual species.

Natural selection and genetic diversity in the butterfly Heliconius melpomene

Posted on March 9, 2016 by schraib

Simon Henry Martin, Markus Moest, Wiliam J Palmer, Camilo Salazar, W. Owen McMillan, Francis M Jiggins, Chris D Jiggins

bioRxiv doi: http://dx.doi.org/10.1101/042796

A combination of selective and neutral evolutionary forces shapes patterns of genetic diversity in nature. Among the insects, most previous analyses of the roles of drift and selection in shaping variation across the genome have focused on the genus Drosophila. A more complete understanding of these forces will come from analysing other taxa that differ in population demography and other aspects of biology. We have analysed diversity and signatures of selection in the neotropical Heliconius butterflies using resequenced genomes from 58 wild-caught individuals of H. melpomene, and another 21 resequenced genomes representing 11 related species. By comparing intra-specific diversity and inter-specific divergence, we estimate that 31% of amino acid substitutions between Heliconius species are adaptive. Diversity at putatively neutral sites is negatively correlated with gene density and positively correlated with recombination rate, indicating widespread linked selection. This process also manifests in significantly reduced diversity on longer chromosomes, consistent with lower recombination rates. Genetic hitchhiking around beneficial non-synonymous mutations has also had a significant impact on genetic variation in this species, but evidence for strong selective sweeps was limited overall. We did however identify two regions where distinct haplotypes have swept in different populations, leading to increased population differentiation. On the whole, our study suggests that positive selection is less pervasive in these butterflies than in fruit flies, a fact that curiously results in very similar levels of neutral diversity in these very different insects.
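
One common way to turn a comparison of polymorphism and divergence into a proportion of adaptive substitutions is the McDonald-Kreitman style estimator, alpha = 1 - (Ds * Pn) / (Dn * Ps). The abstract does not say which estimator the authors used, so the sketch below is only an illustration of the general idea; the function name and the counts in the usage line are hypothetical and are not data from the preprint.

```python
def mk_alpha(dn, ds, pn, ps):
    """Proportion of adaptive nonsynonymous substitutions (simple MK estimator).

    dn, ds: nonsynonymous / synonymous fixed differences between species
    pn, ps: nonsynonymous / synonymous polymorphisms within a species
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero")
    # Under neutrality Dn/Ds is expected to equal Pn/Ps; an excess of Dn
    # relative to that expectation is attributed to adaptive substitutions.
    return 1.0 - (ds * pn) / (dn * ps)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(mk_alpha(dn=60, ds=120, pn=40, ps=100))
```

This simple form is known to be biased by slightly deleterious polymorphisms, which is one reason published analyses often use more elaborate methods.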

Application of database-independent approach to assess the quality of OTU picking methods

Posted on March 9, 2016 by schraib

Patrick D Schloss

bioRxiv doi: http://dx.doi.org/10.1101/042812

Assigning 16S rRNA gene sequences to operational taxonomic units (OTUs) allows microbial ecologists to overcome the inconsistencies and biases within bacterial taxonomy and provides a strategy for clustering similar sequences that do not have representatives in a reference database. I have applied the Matthews correlation coefficient to assess the ability of 15 reference-independent and -dependent clustering algorithms to assign sequences to OTUs. This metric quantifies the ability of an algorithm to reflect the relationships between sequences without the use of a reference and can be applied to any dataset or method. The most consistently robust method was the average neighbor algorithm; however, for some datasets other algorithms matched its performance.
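
As a rough sketch of how the Matthews correlation coefficient can score an OTU assignment without a reference, each pair of sequences can be classified by whether the two sequences are within a distance threshold and whether they were placed in the same OTU. The abstract does not spell out this pairwise scheme, so treat it as an assumption; the function name, the `frozenset`-keyed distance dictionary, and the 0.03 threshold are also hypothetical choices for the example.

```python
from itertools import combinations
from math import sqrt


def otu_mcc(distances, otus, threshold=0.03):
    """Matthews correlation coefficient for a set of OTU assignments.

    distances: dict mapping frozenset({seq_a, seq_b}) -> pairwise distance
    otus:      dict mapping sequence name -> OTU label
    """
    tp = fp = fn = tn = 0
    for a, b in combinations(sorted(otus), 2):
        close = distances[frozenset((a, b))] <= threshold
        same = otus[a] == otus[b]
        if close and same:
            tp += 1  # similar sequences clustered together
        elif same:
            fp += 1  # dissimilar sequences clustered together
        elif close:
            fn += 1  # similar sequences split apart
        else:
            tn += 1  # dissimilar sequences kept apart
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    names = ["A", "B", "C", "D"]
    near = {frozenset(("A", "B")), frozenset(("C", "D"))}
    dist = {frozenset(p): (0.01 if frozenset(p) in near else 0.20)
            for p in combinations(names, 2)}
    print(otu_mcc(dist, {"A": 1, "B": 1, "C": 2, "D": 2}))
    print(otu_mcc(dist, {"A": 1, "C": 1, "B": 2, "D": 2}))
```

A value of 1 means every pair is handled consistently with the threshold, and values at or below 0 mean the clustering is no better than chance at reflecting the pairwise distances.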

A fast and accurate method for detection of IBD shared haplotypes in genome-wide SNP data

Posted on March 9, 2016 by schraib

Douglas W. Bjelland, Uday Lingala, Piyush Patel, Matt Jones, Matthew C. Keller

bioRxiv doi: http://dx.doi.org/10.1101/042879

Identical by descent (IBD) segments are used to understand a number of fundamental issues in genetics. IBD segments are typically detected using long stretches of identical alleles between haplotypes in whole-genome SNP data. Phase or SNP call errors in genomic data can degrade accuracy of IBD detection and lead to false positive calls, false negative calls, and under- or overextension of true IBD segments. Furthermore, the number of comparisons increases quadratically with sample size, requiring high computational efficiency. We developed a new IBD segment detection program, FISHR (Find IBD Shared Haplotypes Rapidly), in an attempt to accurately detect IBD segments and to better estimate their endpoints using an algorithm that is fast enough to be deployed on very large whole-genome SNP datasets. We compared the performance of FISHR to three leading IBD segment detection programs: GERMLINE, refinedIBD, and HaploScore. Using simulated and real genomic sequence data, we show that FISHR is slightly more accurate than all the other programs at detecting long (greater than 3 cM) IBD segments but slightly less accurate than refinedIBD at detecting short (1 cM) IBD segments. Moreover, FISHR outperforms all the other programs in determining the true endpoints of IBD segments, which is important for several reasons. FISHR takes two to four times longer than GERMLINE to run, whereas both GERMLINE and FISHR were orders of magnitude faster than refinedIBD and HaploScore. Overall, FISHR provides accurate IBD detection in unrelated individuals and is computationally efficient enough to be used on large SNP datasets of more than 20,000 individuals.
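
The basic idea of finding long stretches of identical alleles between two haplotypes, and the way a single phase or call error breaks such a stretch, can be sketched as follows. This is a naive exact-match scan, not the FISHR algorithm; the function name and the use of a SNP count rather than a genetic distance in cM as the length cutoff are assumptions for the example.

```python
def shared_segments(hap_a, hap_b, min_snps):
    """Return (start, end) index pairs of exact-match runs of >= min_snps SNPs.

    hap_a, hap_b: equal-length sequences of alleles (e.g. strings of 0/1)
    end is exclusive, so a segment covers indices start .. end - 1.
    """
    segments = []
    start = None
    n = min(len(hap_a), len(hap_b))
    for i in range(n):
        if hap_a[i] == hap_b[i]:
            if start is None:
                start = i  # a new run of matching alleles begins
        else:
            # A mismatch ends the current run; keep it only if long enough.
            if start is not None and i - start >= min_snps:
                segments.append((start, i))
            start = None
    if start is not None and n - start >= min_snps:
        segments.append((start, n))
    return segments


if __name__ == "__main__":
    a = "0101100110"
    b = "0101000111"
    print(shared_segments(a, b, min_snps=4))
    print(shared_segments(a, b, min_snps=5))
```

In the usage example a single mismatch at index 4 splits what might be one nine-SNP segment into two four-SNP runs, which disappear entirely once the cutoff is raised to five. This is the kind of false negative or truncated endpoint that error-tolerant methods try to avoid.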

SCOTTI: Efficient Reconstruction of Transmission within Outbreaks with the Structured Coalescent

Posted on March 9, 2016 by schraib

Nicola De Maio, Chieh-Hsi Wu, Daniel J Wilson
(Submitted on 7 Mar 2016)

Exploiting pathogen genomes to reconstruct transmission represents a powerful tool in the fight against infectious disease. However, their interpretation rests on a number of simplifying assumptions that regularly ignore important complexities of real data, in particular within-host evolution and non-sampled patients.
Here we propose a new approach to transmission inference called SCOTTI (Structured COalescent Transmission Tree Inference). This method is based on a statistical framework that models each host as a distinct population, and transmissions between hosts as migration events. Our computationally efficient implementation of this model enables the inference of host-to-host transmission while accommodating within-host evolution and non-sampled hosts. SCOTTI is distributed as an open source package for the phylogenetic software BEAST2.
We show that SCOTTI can generally infer transmission events even in the presence of considerable within-host variation, can account for the uncertainty associated with the possible presence of non-sampled hosts, and can efficiently use data from multiple samples of the same host, although there is some reduction in accuracy when samples are collected very close to the infection time.
We illustrate the features of our approach by investigating transmission from genetic and epidemiological data in a Foot and Mouth Disease Virus (FMDV) veterinary outbreak in England and a Klebsiella pneumoniae outbreak in a Nepali neonatal unit. Transmission histories inferred with SCOTTI will be important in devising effective measures to prevent and halt transmission.

The Genetic Architecture of Quantitative Traits Cannot Be Inferred From Variance Component Analysis

Posted on March 4, 2016 by schraib

Wen Huang, Trudy F.C. Mackay

bioRxiv doi: http://dx.doi.org/10.1101/041434

Classical quantitative genetic analyses estimate additive and non-additive genetic and environmental components of variance from phenotypes of related individuals. The genetic variance components are defined in terms of genotypic values reflecting underlying genetic architecture (additive, dominance and epistatic genotypic effects) and allele frequencies. However, the dependency of the definition of genetic variance components on the underlying genetic models is not often appreciated. Here, we show how the partitioning of additive and non-additive genetic variation is affected by the genetic models and parameterization of allelic effects. We show that arbitrarily defined variance components often capture a substantial fraction of total genetic variation regardless of the underlying genetic architecture in simulated and real data. Therefore, variance component analysis cannot be used to infer genetic architecture of quantitative traits. The genetic basis of quantitative trait variation in a natural population can only be defined empirically using high resolution mapping methods followed by detailed characterization of QTL effects.
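
A textbook single-locus calculation shows why variance components depend on allele frequencies as well as on genotypic effects. With genotypic values +a, d and -a for A1A1, A1A2 and A2A2 and allele frequencies p and q = 1 - p, the standard expressions are V_A = 2pq[a + d(q - p)]^2 and V_D = (2pqd)^2. The sketch below uses these formulas under Hardy-Weinberg proportions; it is not taken from the preprint, and the function names and parameter values are chosen only for illustration.

```python
def variance_components(p, a, d):
    """Additive and dominance variance at one biallelic locus (HWE assumed)."""
    q = 1.0 - p
    alpha = a + d * (q - p)            # average effect of allele substitution
    v_a = 2.0 * p * q * alpha ** 2     # additive variance
    v_d = (2.0 * p * q * d) ** 2       # dominance variance
    return v_a, v_d


def total_genetic_variance(p, a, d):
    """Variance of genotypic values computed directly from genotype frequencies."""
    q = 1.0 - p
    freqs = (p * p, 2.0 * p * q, q * q)
    values = (a, d, -a)
    mean = sum(f * v for f, v in zip(freqs, values))
    return sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))


if __name__ == "__main__":
    # Complete dominance (d = a): the genotypic effects are far from additive.
    for p in (0.1, 0.5, 0.9):
        v_a, v_d = variance_components(p, a=1.0, d=1.0)
        print(p, v_a / (v_a + v_d))
```

Even with complete dominance, most of the genetic variance is additive when the dominant allele is rare, and the additive share shrinks as that allele becomes common. A large additive variance component therefore does not by itself imply additive gene action.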

VcfR: an R package to manipulate and visualize VCF format data

Posted on March 4, 2016 by schraib

Brian J Knaus, Niklaus J Grunwald

bioRxiv doi: http://dx.doi.org/10.1101/041277

Software to call single nucleotide polymorphisms or related genetic variants has converged on the variant call format (VCF) as the output format of choice. This has created a need for tools to work with VCF files. While a growing number of software packages can read VCF data, many only extract the genotypes without including the data associated with each genotype that describes its quality. We created the R package vcfR to address this issue. We developed a VCF file exploration tool implemented in the R language because R provides an interactive experience and an environment that is commonly used for genetic data analysis. Functions to read and write VCF files into R as well as functions to extract portions of the data and to plot summary statistics of the data are implemented. VcfR further provides the ability to visualize how various parameterizations of the data affect the results. Additional tools are included to integrate sequence (FASTA) and annotation data (GFF) for visualization of genomic regions such as chromosomes. Conversion functions translate data from the vcfR data structure to formats used by other R genetics packages. Computationally intensive functions are implemented in C++ to improve performance. Use of these tools is intended to facilitate VCF data exploration, including intuitive methods for data quality control and easy export to other R packages for further analysis. VcfR thus provides essential, novel tools currently not available in R.
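
To make concrete what "the data associated with each genotype" means, the sketch below splits one VCF data line into its fixed fields and per-sample values, keeping every key named in the FORMAT column (for example GQ and DP) alongside the genotype. It is written in Python rather than R, it is not the vcfR API, and the function names and the example record are hypothetical.

```python
def parse_vcf_record(line, sample_names):
    """Parse one tab-separated VCF data line into a nested dictionary."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt = fields[:5]
    keys = fields[8].split(":")  # FORMAT column, e.g. GT:GQ:DP
    samples = {}
    for name, entry in zip(sample_names, fields[9:]):
        # Keep every per-genotype value, not just the genotype call.
        samples[name] = dict(zip(keys, entry.split(":")))
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt.split(","),
        "samples": samples,
    }


def low_quality_samples(record, min_gq):
    """Names of samples whose genotype quality (GQ) is missing or below min_gq."""
    flagged = []
    for name, values in record["samples"].items():
        gq = values.get("GQ", ".")
        if gq == "." or int(gq) < min_gq:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    line = "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:GQ:DP\t0/1:99:30\t1/1:12:4"
    record = parse_vcf_record(line, ["s1", "s2"])
    print(record["samples"])
    print(low_quality_samples(record, min_gq=20))
```

A reader that kept only the GT values would have no way to tell that the second sample's call rests on low depth and low genotype quality.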

Haldane's Sieve

Discussing preprints in population and evolutionary genetics
