Abundant contribution of short tandem repeats to gene expression variation in humans

Melissa Gymrek, Thomas Willems, Haoyang Zeng, Barak Markus, Mark J Daly, Alkes L Price, Jonathan Pritchard, Yaniv Erlich
doi: http://dx.doi.org/10.1101/017459

Expression quantitative trait loci (eQTLs) are a key tool to dissect cellular processes mediating complex diseases. However, little is known about the role of repetitive elements as eQTLs. We report a genome-wide survey of the contribution of Short Tandem Repeats (STRs), one of the most polymorphic and abundant repeat classes, to gene expression in humans. Our survey identified 2,060 significant expression STRs (eSTRs). These eSTRs were replicable in orthogonal populations and expression assays. We used variance partitioning to disentangle the contribution of eSTRs from linked SNPs and indels and found that eSTRs contribute 10%-15% of the cis-heritability mediated by all common variants. Functional genomic analyses showed that eSTRs are enriched in conserved regions, co-localize with regulatory elements, and are predicted to modulate histone modifications. Our results show that eSTRs provide a novel set of regulatory variants and highlight the contribution of repeats to the genetic architecture of quantitative human traits.
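As a rough illustration of what testing an STR for association with expression can look like, the sketch below regresses expression on STR dosage by ordinary least squares. This is not the authors' pipeline: the abstract does not state the test used, and both the function name `str_expression_slope` and the definition of dosage as the summed repeat length of an individual's two alleles are assumptions made for the example.

```python
def str_expression_slope(dosages, expression):
    """Least-squares slope and r^2 of expression regressed on STR dosage.

    dosages: per-individual STR dosage (assumed here to be the summed
             repeat length of the two alleles; hypothetical encoding).
    expression: per-individual expression values for one gene.
    """
    n = len(dosages)
    if n != len(expression) or n < 2:
        raise ValueError("need two equal-length samples of size >= 2")
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("dosage and expression must both vary")
    slope = sxy / sxx
    r_squared = sxy * sxy / (sxx * syy)
    return slope, r_squared
```

A real analysis would also adjust for covariates such as population structure and would assess significance genome-wide; the sketch covers only the per-locus effect estimate.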

Rate of Adaptive Evolution under Blending Inheritance

Alan R. Rogers
(Submitted on 1 Apr 2015)

In a population of size N, adaptive evolution is 2N times faster under Mendelian inheritance than under the 19th-century theory of blending inheritance.

Neanderthal Genomics Suggests a Pleistocene Time Frame for the First Epidemiologic Transition

Charlotte Jane Houldcroft, Simon Underdown
doi: http://dx.doi.org/10.1101/017343

High quality Altai Neanderthal and Denisovan genomes are revealing which regions of archaic hominin DNA have persisted in the modern human genome. A number of these regions are associated with response to infection and immunity, with a suggestion that derived Neanderthal alleles found in modern Europeans and East Asians may be associated with autoimmunity. Independent sources of DNA-based evidence allow a re-evaluation of the nature and timing of the first epidemiologic transition. By combining skeletal, archaeological and genetic evidence we question whether the first epidemiologic transition in Eurasia was as tightly tied to the onset of the Holocene as has previously been assumed. There is clear evidence to suggest that this transition began before the appearance of agriculture and occurred over a timescale of tens of thousands of years. The transfer of pathogens between human species may also have played a role in the extinction of the Neanderthals.

XWAS: a software toolset for genetic data analysis and association studies of the X chromosome

Feng Gao, Diana Chang, Arjun Biddanda, Li Ma, Yingjie Guo, Zilu Zhou, Alon Keinan
doi: http://dx.doi.org/10.1101/009795

XWAS is a new software toolset for the analysis of the X chromosome in association studies and similar studies. The X chromosome plays an important role in human disease, especially in diseases with sexually dimorphic characteristics. Its analysis needs special attention because its unique inheritance pattern leads to analytical complications; as a result, the majority of genome-wide association studies (GWAS) either do not consider X or mishandle it with GWAS toolsets designed for non-sex chromosomes. Hence, XWAS fills the need for tools that are specially designed for analysis of X. Following extensive, stringent, and X-specific quality control, XWAS offers an array of statistical tests of association, including: (1) the standard test between a SNP (single nucleotide polymorphism) and disease risk, including after first stratifying individuals by sex, (2) a test for a differential effect of a SNP on disease between males and females, (3) motivated by X-inactivation, a test for higher variance of a trait in heterozygous females as compared to homozygous females, and (4) for all tests, a version that allows for combining evidence across all SNPs in a whole gene. We applied the toolset analysis pipeline to 16 GWAS datasets of immune-related disorders and to 7 risk factors of coronary artery disease, and discovered several new X-linked genetic associations. XWAS will provide the tools and incentive for others to incorporate the X chromosome into GWAS, hence enabling discoveries of novel loci implicated in many diseases and in their sexual dimorphism.

Most viewed on Haldane’s Sieve: March 2015

The most viewed posts on Haldane’s Sieve this month were:

Network analysis of genome-wide selective constraint reveals a gene network active in early fetal brain intolerant of mutation

Jinmyung Choi, Parisa Shooshtari, Kaitlin E Samocha, Mark J Daly, Chris Cotsapas
doi: http://dx.doi.org/10.1101/017277

Using robust, integrated analysis of multiple genomic datasets, we show that genes depleted for non-synonymous de novo mutations form a subnetwork of 72 members under strong selective constraint. We further show this subnetwork is preferentially expressed in the early development of the human hippocampus and is enriched for genes mutated in neurological, but not other, Mendelian disorders. We thus conclude that carefully orchestrated developmental processes are under strong constraint in early brain development, and perturbations caused by mutation have adverse outcomes subject to strong purifying selection. Our findings demonstrate that selective forces can act on groups of genes involved in the same process, supporting the notion that adaptation can act coordinately on multiple genes. Our approach provides a statistically robust, interpretable way to identify the tissues and developmental times where groups of disease genes are active. Our findings highlight the importance of considering the interactions between genes when analyzing genome-wide sequence data.

RiboDiff: Detecting Changes of Translation Efficiency from Ribosome Footprints

Yi Zhong, Theofanis Karaletsos, Philipp Drewe, Vipin Thankam T Sreedharan, Kamini Singh, Hans-Guido Wendel, Gunnar Rätsch
doi: http://dx.doi.org/10.1101/017111

Motivation: Deep sequencing based ribosome footprint profiling can provide novel insights into the regulatory mechanisms of protein translation. However, the observed ribosome profile is fundamentally confounded by transcriptional activity. In order to decipher principles of translation regulation, tools that can reliably detect changes in translation efficiency in case-control studies are needed. Results: We present a statistical framework and analysis tool, RiboDiff, to detect genes with changes in translation efficiency across experimental treatments. RiboDiff uses generalized linear models to estimate the over-dispersion of RNA-Seq and ribosome profiling measurements separately, and performs a statistical test for differential translation efficiency using both mRNA abundance and ribosome occupancy. Availability: Source code and documentation are available at http://github.com/ratschlab/ribodiff. Supplementary Material can be found at http://bioweb.me/ribo.
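To make the quantity being tested concrete, the sketch below computes a point estimate of the change in translation efficiency for one gene, taking translation efficiency as ribosome footprint abundance divided by mRNA abundance. It is not RiboDiff's method or API: the tool fits generalized linear models with over-dispersion, whereas this is only the underlying ratio. The function name `log2_te_change`, the pseudocount, and the assumption that counts are already normalized for library size are all introduced for illustration.

```python
import math


def log2_te_change(ribo_ctrl, rna_ctrl, ribo_treat, rna_treat, pseudocount=0.5):
    """Log2 change in translation efficiency (treatment vs. control).

    Inputs are assumed to be library-size-normalized counts for one gene.
    The pseudocount (a hypothetical default) guards against division by zero.
    """
    # Translation efficiency: footprint abundance relative to mRNA abundance.
    te_ctrl = (ribo_ctrl + pseudocount) / (rna_ctrl + pseudocount)
    te_treat = (ribo_treat + pseudocount) / (rna_treat + pseudocount)
    return math.log2(te_treat / te_ctrl)
```

A gene whose footprints and mRNA both double yields a change near zero, which is the confounding by transcription that the abstract describes: only a shift in the ratio indicates a change in translation efficiency.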

New Routes to Phylogeography

Nicola De Maio, Chieh-Hsi Wu, Kathleen M O’Reilly, Daniel Wilson
(Submitted on 27 Mar 2015)

Phylogeographic methods aim to infer migration trends and the history of sampled lineages from genetic data. Applications of phylogeography are broad, and in the context of pathogens include the reconstruction of transmission histories and the origin and emergence of outbreaks. Phylogeographic inference based on bottom-up population genetics models is computationally expensive, and as a result faster alternatives based on the evolution of discrete traits have become popular. In this paper, we show that inference of migration rates and root locations based on discrete trait models is extremely unreliable and sensitive to biased sampling. To address this problem, we introduce BASTA (BAyesian STructured coalescent Approximation), a new approach implemented in BEAST2 that combines the accuracy of methods based on the structured coalescent with the computational efficiency required to handle more than just a few populations. We illustrate the potentially severe implications of poor model choice for phylogeographic analyses by investigating the zoonotic transmission of Ebola virus. Whereas the structured coalescent analysis correctly infers that successive human Ebola outbreaks have been seeded by a large unsampled non-human reservoir population, the discrete trait analysis implausibly concludes that undetected human-to-human transmission has allowed the virus to persist over the past four decades. As genomics takes on an increasingly prominent role informing the control and prevention of infectious diseases, it will be vital that phylogeographic inference provides robust insights into transmission history.

Long live the alien: studying the fate of the genomic diversity along the long-term dynamics of an extremely successful invader, the crested porcupine.

Emiliano Trucchi, Benoit Facon, Paolo Gratton, Emiliano Mori, Nils Chr Stenseth, Sissel Jentoft
doi: http://dx.doi.org/10.1101/016493

Describing long-term evolutionary trajectories of alien species is a fundamental, although rarely possible, step to understand the pivotal drivers of successful invasions. Here, we tackled this task by investigating the genetic structure of the crested porcupine (Hystrix cristata), whose invasion of Italy started about 1500 years ago. Using genome-wide RAD markers, we explored the demographic processes that shaped, and are shaping, the gene pool of the expanding invasive populations and compared their genetic diversity with that of native and invasive populations of both African porcupine species (crested and Cape, H. africaeaustralis). Through coalescence-based demographic reconstructions, we demonstrated that the bottleneck at introduction was mild and did not severely affect the reservoir of genetic diversity. Our data also highlighted a marked geographic structure in the invasive populations, indicating that they are likely the result of multiple introduction events. Nevertheless, both the invasive populations and their source show a lower level of diversity relative to other native populations from Sub-Saharan and South Africa, suggesting that demographic history before introduction may have played a role in forging a successful invader. Finally, we showed that the current spatial expansion at the northern boundary of the range is following a leading-edge model characterized by a general reduction of genetic diversity towards the edge of the expanding range. Consistently, random fixation of alleles through gene-surfing seems a more likely explanation than adaptive divergence for the distribution of the few outlier loci with highly divergent frequencies between core and newly colonized areas.

Adaptation, Clonal Interference, and Frequency-Dependent Interactions in a Long-Term Evolution Experiment with Escherichia coli

Rohan Maddamsetti, Richard E. Lenski, Jeffrey E. Barrick
doi: http://dx.doi.org/10.1101/017020

Twelve replicate populations of Escherichia coli have been evolving in the laboratory for more than 25 years and 60,000 generations. We analyzed bacteria from whole-population samples frozen every 500 generations through 20,000 generations for one well-studied population, called Ara-1. By tracking 42 known mutations in these samples, we reconstructed the history of this population's genotypic evolution over this period. The evolutionary dynamics of Ara-1 show strong evidence of selective sweeps as well as clonal interference between competing lineages bearing different beneficial mutations. In some cases, sets of several mutations approached fixation simultaneously, often conveying no information about their order of origination; we present several possible explanations for the existence of these mutational cohorts. Against a backdrop of rapid selective sweeps both earlier and later, we found that two clades coexisted for over 6000 generations before one drove the other extinct. In that time, at least nine mutations arose in the clade that prevailed. We found evidence that the clades evolved a frequency-dependent interaction, which prevented the competitive exclusion of either clade, but which eventually collapsed as beneficial mutations accumulated in the clade that prevailed. Clonal interference and frequency dependence can occur even in the simplest microbial populations. Furthermore, frequency dependence may generate dynamics that extend the period of coexistence that would otherwise be sustained by clonal interference alone.