Association mapping reveals the role of mutation-selection balance in the maintenance of genomic variation for gene expression.

Emily Josephs, Young Wha Lee, John R. Stinchcombe, Stephen I. Wright

The evolutionary forces that maintain genetic variation for quantitative traits within populations remain unknown. One hypothesis suggests that variation is maintained by a balance between new mutations and their removal by selection and drift. Theory predicts that this mutation-selection balance will result in an excess of low-frequency variants and a negative correlation between minor allele frequency and selection coefficients. Here, we test these predictions using the genetic loci associated with total expression variation (‘eQTLs’) and allele-specific expression variation (‘aseQTLs’) mapped within a single population of the plant Capsella grandiflora. In addition to finding eQTLs and aseQTLs for a large fraction of genes, we show that alleles at these loci are rarer than expected and exhibit a negative correlation between effect size and frequency. Overall, our results show that mutation-selection balance is the dominant contributor to genomic variation for expression within a single, outcrossing population.

Extensive de novo mutation rate variation between individuals and across the genome of Chlamydomonas reinhardtii

Rob W Ness, Andrew D Morgan, Radhakrishnan B Vasanthakrishnan, Nick Colegrave, Peter D Keightley

Describing the process of spontaneous mutation is fundamental for understanding the genetic basis of disease, the threat posed by declining population size in conservation biology, and much of evolutionary biology. However, directly studying spontaneous mutation is difficult because of the rarity of de novo mutations. Mutation accumulation (MA) experiments overcome this by allowing mutations to build up over many generations in the near absence of natural selection. In this study, we sequenced the genomes of 85 MA lines derived from six genetically diverse wild strains of the green alga Chlamydomonas reinhardtii. We identified 6,843 spontaneous mutations, more than any other study of spontaneous mutation. We observed seven-fold variation in the mutation rate among strains and found that mutator genotypes arose, increasing the mutation rate dramatically in some replicates. We also found evidence for fine-scale heterogeneity in the mutation rate, driven largely by the sequence flanking mutated sites, and by clusters of multiple mutations at closely linked sites. There was little evidence, however, for mutation rate heterogeneity between chromosomes or over large genomic regions of 200 kbp. Using logistic regression, we generated a predictive model of the mutability of sites based on their genomic properties, including local GC content, gene expression level and local sequence context. Our model accurately predicted the average mutation rate and natural levels of genetic diversity of sites across the genome. Notably, trinucleotides vary 17-fold in rate between the most mutable and least mutable sites. Our results uncover a rich heterogeneity in the process of spontaneous mutation both among individuals and across the genome.

Phen-Gen: Combining Phenotype and Genotype to Analyze Rare Disorders

Asif Javed, Saloni Agrawal, Pauline Ng

We introduce Phen-Gen, a method that combines a patient’s disease symptoms and sequencing data with prior domain knowledge to identify the causative gene(s) for rare disorders. Simulations reveal that the causal variant is ranked first in 88% of cases when it is coding, which is a 52% advantage over a genotype-only approach and outperforms existing methods by 13-58%. If disease etiology is unknown, the causal variant is assigned top rank in 71% of simulations.

Catch me if you can: Adaptation from standing genetic variation to a moving phenotypic optimum

Sebastian Matuszewski, Joachim Hermisson, Michael Kopp

Adaptation lies at the heart of Darwinian evolution. Accordingly, numerous studies have tried to provide a formal framework for the description of the adaptive process. Out of these, two complementary modelling approaches have emerged: while so-called adaptive-walk models consider adaptation from the successive fixation of de novo mutations only, quantitative genetic models assume that adaptation proceeds exclusively from pre-existing standing genetic variation. The latter approach, however, has focused on short-term evolution of population means and variances rather than on the statistical properties of adaptive substitutions. Our aim is to combine these two approaches by describing the ecological and genetic factors that determine the genetic basis of adaptation from standing genetic variation in terms of the effect-size distribution of individual alleles. Specifically, we consider the evolution of a quantitative trait in response to a gradually changing environment. By means of analytical approximations, we derive the distribution of adaptive substitutions from standing genetic variation, that is, the distribution of the phenotypic effects of those alleles from the standing variation that become fixed during adaptation. Our results are checked against individual-based simulations. We find that, compared to adaptation from de novo mutations, (i) adaptation from standing variation proceeds by the fixation of more alleles of small effect; (ii) populations that adapt from standing genetic variation can traverse larger distances in phenotype space and, thus, have a higher potential for adaptation if the rate of environmental change is fast rather than slow.

Quality assessment for different haplotyping methods and GWAS sensitivity to phasing errors

Giovanni Busonera, Marco Cogoni, Gianluigi Zanetti

In this report we present a multimarker association tool (Flash) based on a novel algorithm to generate haplotypes from raw genotype data. It belongs to the entropy-minimization class of methods and consists of a two-stage deterministic-heuristic part and an optional stochastic optimization. The algorithm scales to very large datasets and runs faster than competing tools such as BEAGLE and MACH while maintaining comparable accuracy. The quality of the results is assessed by comparing switch error. Finally, the haplotypes are used to perform a haplotype-based genome-wide association study (GWAS). The association results are compared with a multimarker and a single-SNP association test performed with Plink. Our experiments confirm that the multimarker association test can be more powerful than the single-SNP test, as stated in the literature. Moreover, Flash and Plink give similar results for the multimarker association test, but Flash reduces computation time by about an order of magnitude when using haplotypes of 5 SNPs.
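
For readers unfamiliar with the switch error used above to assess phasing quality, the following is a minimal sketch of one common way to compute it. It is not taken from Flash; the function name and the input convention (one haplotype per individual, restricted to heterozygous sites, alleles coded 0/1) are assumptions made for illustration.

```python
def switch_error_rate(true_hap, inferred_hap):
    """Fraction of adjacent heterozygous-site pairs whose relative phase differs.

    Both arguments are equal-length sequences of 0/1 alleles describing ONE
    haplotype of the same individual at heterozygous sites only (an assumed
    input convention, not a format defined by any particular tool).
    """
    if len(true_hap) != len(inferred_hap):
        raise ValueError("haplotypes must cover the same sites")
    if len(true_hap) < 2:
        return 0.0  # no adjacent pair, so no switch can occur

    # At a heterozygous site the inferred allele either matches the true
    # haplotype or matches its complement.
    matches = [t == i for t, i in zip(true_hap, inferred_hap)]

    # A switch is a change of match status between consecutive sites.
    switches = sum(1 for a, b in zip(matches, matches[1:]) if a != b)
    return switches / (len(matches) - 1)
```

Under this definition a fully complemented haplotype scores zero, because it describes the same phasing seen from the other chromosome copy.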

Pervasive adaptation of gene expression in Drosophila

Armita Nourmohammad, Joachim Rambeau, Torsten Held, Johannes Berg, Michael Lassig
(Submitted on 23 Feb 2015)

Gene expression levels are important molecular quantitative traits that link genotypes to molecular functions and fitness. In Drosophila, population-genetic studies in recent years have revealed substantial adaptive evolution at the genomic level. However, the evolutionary modes of gene expression have remained controversial. Here we present evidence that adaptation dominates the evolution of gene expression levels in flies. We show that 64% of the observed expression divergence across seven Drosophila species is due to adaptive changes driven by directional selection. Our results are derived from the variation of expression within species and the time-resolved divergence across a family of related species, using a new inference method for selection. We identify functional classes of adaptively regulated genes, as well as sex-specific adaptation occurring predominantly in males. Our analysis opens a new avenue to map system-wide selection on molecular quantitative traits independently of their genetic basis.

Calibrating the Human Mutation Rate via Ancestral Recombination Density in Diploid Genomes

Mark Lipson, Po-Ru Loh, Sriram Sankararaman, Nick Patterson, Bonnie Berger, David Reich

The human mutation rate is an essential parameter for studying the evolution of our species, interpreting present-day genetic variation, and understanding the incidence of genetic disease. Nevertheless, our current estimates of the rate are uncertain. Classical methods based on sequence divergence have yielded significantly larger values than more recent approaches based on counting de novo mutations in family pedigrees. Here, we propose a new method that uses the fine-scale human recombination map to calibrate the rate of accumulation of mutations. By comparing local heterozygosity levels in diploid genomes to the genetic distance scale over which these levels change, we are able to estimate a long-term mutation rate averaged over hundreds or thousands of generations. We infer a rate of (1.65 ± 0.10) × 10^(-8) mutations per base per generation, which falls in between phylogenetic and pedigree-based estimates, and we suggest possible mechanisms to reconcile our estimate with previous studies. Our results support intermediate-age divergences among human populations and between humans and other great apes.

Differential Evolution Approach to Detect Recent Admixture

Konstantin Kozlov, Dmitry Chebotarov, Mehedi Hassan, Petr Triska, Martin Triska, Pavel Flegontov, Tatiana V Tatarinova

The genetic structure of human populations is extraordinarily complex and of fundamental importance to studies of anthropology, evolution, and medicine. As increasingly many individuals are of mixed origin, there is an unmet need for tools that can infer multiple origins. Misclassification of such individuals can lead to incorrect and costly misinterpretations of genomic data, primarily in disease studies and drug trials. We present an advanced tool to infer ancestry that can identify the biogeographic origins of highly mixed individuals. reAdmix can incorporate an individual’s knowledge of their ancestors (e.g. having some ancestors from Turkey or a Scottish grandmother). reAdmix is an online tool available at

Chromosome-scale shotgun assembly using an in vitro method for long-range linkage

Nicholas H. Putnam, Brendan O’Connell, Jonathan C. Stites, Brandon J. Rice, Andrew Fields, Paul D. Hartley, Charles W. Sugnet, David Haussler, Daniel S. Rokhsar, Richard E. Green
Subjects: Genomics (q-bio.GN); Biomolecules (q-bio.BM)

Long-range and highly accurate de novo assembly from short-read data is one of the most pressing challenges in genomics. Recently, it has been shown that read pairs generated by proximity ligation of DNA in chromatin of living tissue can address this problem. These data dramatically increase the scaffold contiguity of assemblies and provide haplotype phasing information. Here, we describe a simpler approach (“Chicago”) based on in vitro reconstituted chromatin. We generated two Chicago datasets with human DNA and used a new software pipeline (“HiRise”) to construct a highly accurate de novo assembly and scaffolding of a human genome with scaffold N50 of 30 Mb. We also demonstrated the utility of Chicago for improving existing assemblies by re-assembling and scaffolding the genome of the American alligator. With a single library and one lane of Illumina HiSeq sequencing, we increased the scaffold N50 of the American alligator from 508 kb to 10 Mb. Our method uses established molecular biology procedures and can be used to analyze any genome, as it requires only about 5 micrograms of DNA as the starting material.
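
The scaffold N50 values quoted above (30 Mb for human, 508 kb to 10 Mb for alligator) use the standard N50 statistic: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that calculation follows; it is an illustration only, not part of HiRise, and the function name is hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one scaffold length")

    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    half_total = sum(ordered) / 2

    running = 0
    for length in ordered:
        running += length
        # The first scaffold that brings the running total to at least
        # half of the assembly defines the N50.
        if running >= half_total:
            return length
```

For example, scaffolds of lengths 2, 2, 2, 3, 3, 4, 8 and 8 total 32, and the two longest already cover 16, so the N50 is 8.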

Genetic evidence for an origin of the Armenians from Bronze Age mixing of multiple populations

Marc Haber, Massimo Mezzavilla, Yali Xue, David Comas, Paolo Gasparini, Pierre Zalloua, Chris Tyler-Smith

The Armenians are a culturally isolated population who historically inhabited a region in the Near East bounded by the Mediterranean and Black seas and the Caucasus, but remain underrepresented in genetic studies and have a complex history including a major geographic displacement during World War One. Here, we analyse genome-wide variation in 173 Armenians and compare them to 78 other worldwide populations. We find that Armenians form a distinctive cluster linking the Near East, Europe, and the Caucasus. We show that Armenian diversity can be explained by several mixtures of Eurasian populations that occurred between ~3,000 and ~2,000 BCE, a period characterized by major population migrations after the domestication of the horse, appearance of chariots, and the rise of advanced civilizations in the Near East. However, genetic signals of population mixture cease after ~1,200 BCE when Bronze Age civilizations in the Eastern Mediterranean world suddenly and violently collapsed. Armenians have since remained isolated and genetic structure within the population developed ~500 years ago when Armenia was divided between the Ottomans and the Safavid Empire in Iran. Finally, we show that Armenians have higher genetic affinity to Neolithic Europeans than other present-day Near Easterners, and that 29% of the Armenian ancestry may originate from an ancestral population best represented by Neolithic Europeans.