A high-throughput RNA-seq approach to profile transcriptional responses

Gregory A Moyerbrailean , Gordon O Davis , Chris T Harvey , Donovan Watza , Xiaoquan Wen , Roger Pique-Regi , Francesca Luca
doi: http://dx.doi.org/10.1101/018416

In recent years, different technologies have been used to measure genome-wide gene expression levels and to study the transcriptome across many types of tissues and in response to in vitro treatments. However, a full understanding of gene regulation in any given cellular and environmental context combination is still missing. This is partly because analyzing tissue/environment-specific gene expression generally implies screening a large number of cellular conditions and samples, without prior knowledge of which conditions are most informative (e.g. some cell types may not respond to certain treatments). To circumvent these challenges, we have established a new two-step high-throughput and cost-effective RNA-seq approach: the first step consists of gene expression screening of a large number of conditions, while the second step focuses on deep sequencing of the most relevant conditions (e.g. largest number of differentially expressed genes). This study design allows for a fast and economical screen in step one, with a more profitable allocation of resources for the deep sequencing of re-pooled libraries in step two. We have applied this approach to study the response to 26 treatments in three lymphoblastoid cell line samples and we show that it is applicable for other high-throughput transcriptome profiling requiring iterative refinement or screening.

Methods for distinguishing between protein-coding and long noncoding RNAs and the elusive biological purpose of translation of long noncoding RNAs

Gali Housman , Igor Ulitsky
doi: http://dx.doi.org/10.1101/017889

Long noncoding RNAs (lncRNAs) are a diverse class of RNAs with increasingly appreciated functions in vertebrates, yet much of their biology remains poorly understood. In particular, it is unclear to what extent the current catalog of over 10,000 distinct annotated lncRNAs is indeed devoid of genes coding for proteins. Here we review the available computational and experimental schemes for distinguishing between protein-coding and long noncoding RNAs, along with their recent genome-wide applications. We conclude that the model most consistent with available data is that a large number of mammalian lncRNAs undergo translation, but only a very small minority of such translation events result in stable and functional peptides. The outcome of the majority of the translation events and their potential biological purposes remain an intriguing topic for future investigation.

Analysis of allele-specific expression reveals cis-regulatory changes associated with a recent mating system shift and floral adaptation in Capsella

Kim A Steige , Johan Reimegård , Daniel Koenig , Douglas G Scofield , Tanja Slotte
doi: http://dx.doi.org/10.1101/017749

Cis-regulatory changes have long been suggested to contribute to organismal adaptation. While cis-regulatory changes can now be identified on a transcriptome-wide scale, in most cases the adaptive significance and mechanistic basis of rapid cis-regulatory divergence remain unclear. Here, we have characterized cis-regulatory changes associated with recent adaptive floral evolution in the selfing plant Capsella rubella, which diverged from the outcrosser Capsella grandiflora less than 200 kya. We assessed allele-specific expression (ASE) in leaves and flower buds at a total of 18,452 genes in three interspecific F1 C. grandiflora x C. rubella hybrids. After accounting for technical variation and read-mapping biases using genomic reads, we estimate that an average of 44% of these genes show evidence of ASE; however, only 6% show strong allelic expression biases. Flower buds, but not leaves, show an enrichment of genes with ASE in genomic regions responsible for phenotypic divergence between C. rubella and C. grandiflora. We further detected an excess of heterozygous transposable element (TE) insertions in the vicinity of genes with ASE, and TE insertions targeted by uniquely mapping 24-nt small RNAs were associated with reduced allelic expression of nearby genes. Our results suggest that cis-regulatory changes have been important for recent adaptive floral evolution in Capsella and that differences in TE dynamics between selfing and outcrossing species could be an important mechanism underlying rapid regulatory divergence.

Adaptive evolution of anti-viral siRNAi genes in bumblebees

Sophie Helbing , Michael Lattorff
doi: http://dx.doi.org/10.1101/017681

The high density of frequently interacting and closely related individuals in social insects enhances pathogen transmission and establishment within colonies. Group-mediated behavior supporting immune defenses tends to decrease selection acting on immune genes. Along with low effective population sizes, this will result in relaxed constraint and rapid evolution of genes of the immune system. Here we show that sociality is the main driver of selection in antiviral siRNAi genes in social bumblebees compared to their socially parasitic cuckoo bumblebees that lack a worker caste. RNAi genes show frequent positive selection at the codon level, additionally supported by the occurrence of parallel evolution, and their evolutionary rate is linked to their pathway-specific position, with genes directly interacting with viruses showing the highest rates of molecular evolution. We suggest that higher pathogen load in social insects indeed drives adaptive evolution of immune genes, if not compensated by behavior.

Mycobacterial infection induces a specific human innate immune response

John D Blischak , Ludovic Tailleux , Amy Mitrano , Luis B Barreiro , Yoav Gilad
doi: http://dx.doi.org/10.1101/017483

The innate immune system provides the first response to pathogen infection and orchestrates the activation of the adaptive immune system. Though a large component of the innate immune response is common to all infections, pathogen-specific responses have been documented as well. The innate immune response is thought to be especially critical for fighting infection with Mycobacterium tuberculosis (MTB), the causative agent of tuberculosis (TB). While TB can be deadly, only 5-10% of individuals infected with MTB develop active disease. The risk for disease susceptibility is, at least partly, heritable. Studies of inter-individual variation in the innate immune response to MTB infection may therefore shed light on the genetic basis for variation in susceptibility to TB. Yet, to date, we still do not know which properties of the innate immune response are specific to MTB infection and which represent a general response to pathogen infection. To begin addressing this gap, we infected macrophages with eight different bacteria, including different MTB strains and related mycobacteria, and studied the transcriptional response to infection. Although the ensuing gene regulatory responses were largely consistent across the bacterial infection treatments, we were able to identify a novel subset of genes whose regulation was affected specifically by infection with mycobacteria. Genetic variants that are associated with regulatory differences in these genes should be considered candidate loci for explaining inter-individual variation in susceptibility to TB.

Abundant contribution of short tandem repeats to gene expression variation in humans

Melissa Gymrek , Thomas Willems , Haoyang Zeng , Barak Markus , Mark J Daly , Alkes L Price , Jonathan Pritchard , Yaniv Erlich
doi: http://dx.doi.org/10.1101/017459

Expression quantitative trait loci (eQTLs) are a key tool to dissect cellular processes mediating complex diseases. However, little is known about the role of repetitive elements as eQTLs. We report a genome-wide survey of the contribution of Short Tandem Repeats (STRs), one of the most polymorphic and abundant repeat classes, to gene expression in humans. Our survey identified 2,060 significant expression STRs (eSTRs). These eSTRs were replicable in orthogonal populations and expression assays. We used variance partitioning to disentangle the contribution of eSTRs from linked SNPs and indels and found that eSTRs contribute 10%-15% of the cis-heritability mediated by all common variants. Functional genomic analyses showed that eSTRs are enriched in conserved regions, co-localize with regulatory elements, and are predicted to modulate histone modifications. Our results show that eSTRs provide a novel set of regulatory variants and highlight the contribution of repeats to the genetic architecture of quantitative human traits.

Entire genome transcription across evolutionary time exposes non-coding DNA to de novo gene emergence

Rafik Neme , Diethard Tautz
doi: http://dx.doi.org/10.1101/017152

Even in the best-studied mammalian genomes, less than 5% of the total genome length is annotated as exonic. However, deep sequencing analysis in humans has shown that around 40% of the genome may be covered by poly-adenylated non-coding transcripts occurring at low levels. Their functional significance is unclear, and it has been disputed whether they should be considered noise of the transcriptional machinery. We propose that if such transcripts show some evolutionary stability they will serve as substrates for de novo gene evolution, i.e. gene emergence out of non-coding DNA. Here, we characterize the phylogenetic turnover of low-level poly-adenylated transcripts in a comprehensive sampling of populations, sub-species and species of the genus Mus, spanning a phylogenetic distance of about 10 Myr. We find evidence for more evolutionarily stable gains of transcription than losses among closely related taxa, balanced by a loss of older transcripts across the whole phylogeny. We show that adding taxa increases the genomic transcript coverage and that no major transcript-free islands exist over time. This suggests that the entire genome can be transcribed into poly-adenylated RNA when viewed at an evolutionary time scale. Thus, any part of the “non-coding” genome can become subject to evolutionary functionalization via de novo gene evolution.