PrediXcan: Trait Mapping Using Human Transcriptome Regulation

Eric R Gamazon, Heather E Wheeler, Kaanan Shah, Sahar V Mozaffari, Keston Aquino-Michaels, Robert J Carroll, Anne E Eyler, Joshua C Denny, Dan L Nicolae, Nancy J Cox, Hae Kyung Im, GTEx Consortium
doi: http://dx.doi.org/10.1101/020164

Genome-wide association studies (GWAS) have identified thousands of variants robustly associated with complex traits. However, the biological mechanisms underlying these associations are, in general, not well understood. We propose a gene-based association method called PrediXcan that directly tests the molecular mechanisms through which genetic variation affects phenotype. The approach estimates the component of gene expression determined by an individual’s genetic profile and correlates the “imputed” gene expression with the phenotype under investigation to identify genes involved in the etiology of the phenotype. The genetically regulated gene expression is estimated using whole-genome tissue-dependent prediction models trained with reference transcriptome datasets. PrediXcan enjoys the benefits of gene-based approaches, such as reduced multiple testing burden, more comprehensive annotation of gene function compared to that derived from single variants, and a principled approach to the design of follow-up experiments, while also integrating knowledge of regulatory function. Since no actual expression data are used in the analysis of GWAS data – only in silico expression – reverse causality problems are largely avoided. PrediXcan harnesses reference transcriptome data for disease mapping studies. Our results demonstrate that PrediXcan can detect known and novel genes associated with disease traits and provide insights into the mechanism of these associations.
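The two-stage idea in this abstract (train an expression predictor on a reference panel, then correlate imputed expression with phenotype in a GWAS cohort) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not name the prediction model, so plain ridge regression is swapped in here, and the data, function names, and `ridge_penalty` value are all hypothetical.

```python
import numpy as np


def fit_expression_model(genotypes, expression, ridge_penalty=1.0):
    """Fit ridge weights mapping SNP dosages to one gene's expression.

    Ridge regression is an assumption made for this sketch.
    """
    X = genotypes - genotypes.mean(axis=0)
    y = expression - expression.mean()
    n_snps = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge_penalty * np.eye(n_snps), X.T @ y)


def impute_expression(genotypes, weights):
    """Predict the genetically regulated expression component from genotypes."""
    return (genotypes - genotypes.mean(axis=0)) @ weights


def association(predicted_expression, phenotype):
    """Pearson correlation between imputed expression and phenotype."""
    return float(np.corrcoef(predicted_expression, phenotype)[0, 1])


# Simulated data (hypothetical): 5 SNPs with dosages 0/1/2.
rng = np.random.default_rng(0)
true_weights = np.array([0.8, -0.5, 0.0, 0.3, 0.0])

# Stage 1: reference panel with both genotypes and measured expression.
ref_genotypes = rng.integers(0, 3, size=(200, 5)).astype(float)
ref_expression = ref_genotypes @ true_weights + rng.normal(0, 0.5, 200)
weights = fit_expression_model(ref_genotypes, ref_expression)

# Stage 2: GWAS cohort with genotypes and phenotype only.
gwas_genotypes = rng.integers(0, 3, size=(1000, 5)).astype(float)
latent_expression = gwas_genotypes @ true_weights  # never observed in practice
phenotype = 0.6 * latent_expression + rng.normal(0, 1.0, 1000)

predicted = impute_expression(gwas_genotypes, weights)
r = association(predicted, phenotype)
print(f"correlation between imputed expression and phenotype: {r:.3f}")
```

Because the GWAS stage uses only genotype-derived expression, the phenotype cannot feed back into the predictor, which is the reverse-causality point the abstract makes.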

A Unified Architecture of Transcriptional Regulatory Elements

Robin Andersson, Albin Sandelin, Charles G Danko
doi: http://dx.doi.org/10.1101/019844

Gene expression is precisely controlled in time and space through the integration of signals that act at gene promoters and gene-distal enhancers. Classically, promoters and enhancers are considered separate classes of regulatory elements, often distinguished by histone modifications. However, recent studies have revealed broad similarities between enhancers and promoters, blurring the distinction: active enhancers often initiate transcription, and some gene promoters have the potential of enhancing transcriptional output of other promoters. Here, we propose a model in which promoters and enhancers are considered a single class of functional element, with a unified architecture for transcription initiation. The context of interacting regulatory elements, and surrounding sequences, determine local transcriptional output as well as the enhancer and promoter activities of individual elements.

Determining Exon Connectivity in Complex mRNAs by Nanopore Sequencing

Mohan Bolisetty, Gopinath Rajadinakaran, Brenton Graveley
doi: http://dx.doi.org/10.1101/019752

Though powerful, short-read high throughput RNA sequencing is limited in its ability to directly measure exon connectivity in mRNAs containing multiple alternative exons located farther apart than the maximum read lengths. Here, we use the Oxford Nanopore MinION™ sequencer to identify 7,899 ‘full-length’ isoforms expressed from four Drosophila genes, Dscam1, MRP, Mhc, and Rdl. These results demonstrate that nanopore sequencing can be used to deconvolute individual isoforms and that it has the potential to be an important method for comprehensive transcriptome characterization.

Dynamics of Wolbachia pipientis gene expression across the Drosophila melanogaster life cycle

Florence Gutzwiller, Catarina R. Carmo, Danny E. Miller, Danny W. Rice, Irene L. Newton, Luis Teixeira, Casey M. Bergman
(Submitted on 21 May 2015)

Symbiotic interactions between microbes and their multicellular hosts have manifold impacts on molecular, cellular and organismal biology. To identify candidate bacterial genes involved in maintaining endosymbiotic associations with insect hosts, we analyzed genome-wide patterns of gene expression in the alpha-proteobacterium Wolbachia pipientis across the life cycle of Drosophila melanogaster using public data from the modENCODE project that were generated in a Wolbachia-infected version of the ISO1 reference strain. We find that the majority of Wolbachia genes are expressed at detectable levels in D. melanogaster across the entire life cycle, but that only 7.8% of 1195 Wolbachia genes exhibit robust stage- or sex-specific expression differences when studied in the “holo-organism” context. Wolbachia genes that are differentially expressed during development are typically up-regulated after D. melanogaster embryogenesis, and include many bacterial membrane, secretion system and ankyrin-repeat containing proteins. Sex-biased genes are often organised as small operons of uncharacterised genes and are mainly up-regulated in adult male D. melanogaster in an age-dependent manner, suggesting a potential role in cytoplasmic incompatibility. Our results indicate that large changes in Wolbachia gene expression across the Drosophila life cycle are relatively rare when assayed across all host tissues, but that candidate genes to understand host-microbe interaction in facultative endosymbionts can be successfully identified using holo-organism expression profiling. Our work also shows that mining public gene expression data in D. melanogaster provides a rich set of resources to probe the functional basis of the Wolbachia-Drosophila symbiosis and annotate the transcriptional outputs of the Wolbachia genome.

Near-optimal RNA-Seq quantification

Nicolas Bray, Harold Pimentel, Páll Melsted, Lior Pachter
Subjects: Quantitative Methods (q-bio.QM); Computational Engineering, Finance, and Science (cs.CE); Data Structures and Algorithms (cs.DS); Genomics (q-bio.GN)

We present a novel approach to RNA-Seq quantification that is near optimal in speed and accuracy. Software implementing the approach, called kallisto, can be used to analyze 30 million unaligned RNA-Seq reads in less than 5 minutes on a standard laptop computer while providing results as accurate as those of the best existing tools. This removes a major computational bottleneck in RNA-Seq analysis.

A high-throughput RNA-seq approach to profile transcriptional responses

Gregory A Moyerbrailean, Gordon O Davis, Chris T Harvey, Donovan Watza, Xiaoquan Wen, Roger Pique-Regi, Francesca Luca
doi: http://dx.doi.org/10.1101/018416

In recent years, different technologies have been used to measure genome-wide gene expression levels and to study the transcriptome across many types of tissues and in response to in vitro treatments. However, a full understanding of gene regulation in any given cellular and environmental context combination is still missing. This is partly because analyzing tissue/environment-specific gene expression generally implies screening a large number of cellular conditions and samples, without prior knowledge of which conditions are most informative (e.g. some cell types may not respond to certain treatments). To circumvent these challenges, we have established a new two-step high-throughput and cost-effective RNA-seq approach: the first step consists of gene expression screening of a large number of conditions, while the second step focuses on deep sequencing of the most relevant conditions (e.g. largest number of differentially expressed genes). This study design allows for a fast and economical screen in step one, with a more profitable allocation of resources for the deep sequencing of re-pooled libraries in step two. We have applied this approach to study the response to 26 treatments in three lymphoblastoid cell line samples and we show that it is applicable for other high-throughput transcriptome profiling requiring iterative refinement or screening.

Methods for distinguishing between protein-coding and long noncoding RNAs and the elusive biological purpose of translation of long noncoding RNAs

Gali Housman, Igor Ulitsky
doi: http://dx.doi.org/10.1101/017889

Long noncoding RNAs (lncRNAs) are a diverse class of RNAs with increasingly appreciated functions in vertebrates, yet much of their biology remains poorly understood. In particular, it is unclear to what extent the current catalog of over 10,000 distinct annotated lncRNAs is indeed devoid of genes coding for proteins. Here we review the available computational and experimental schemes for distinguishing between protein-coding and long noncoding RNAs, along with their recent genome-wide applications. We conclude that the model most consistent with available data is that a large number of mammalian lncRNAs undergo translation, but only a very small minority of such translation events result in stable and functional peptides. The outcome of the majority of the translation events and their potential biological purposes remain an intriguing topic for future investigation.

Analysis of allele-specific expression reveals cis-regulatory changes associated with a recent mating system shift and floral adaptation in Capsella

Kim A Steige, Johan Reimegård, Daniel Koenig, Douglas G Scofield, Tanja Slotte
doi: http://dx.doi.org/10.1101/017749

Cis-regulatory changes have long been suggested to contribute to organismal adaptation. While cis-regulatory changes can now be identified on a transcriptome-wide scale, in most cases the adaptive significance and mechanistic basis of rapid cis-regulatory divergence remain unclear. Here, we have characterized cis-regulatory changes associated with recent adaptive floral evolution in the selfing plant Capsella rubella, which diverged from the outcrosser Capsella grandiflora less than 200 kya. We assessed allele-specific expression (ASE) in leaves and flower buds at a total of 18,452 genes in three interspecific F1 C. grandiflora x C. rubella hybrids. After accounting for technical variation and read-mapping biases using genomic reads, we estimate that an average of 44% of these genes show evidence of ASE; however, only 6% show strong allelic expression biases. Flower buds, but not leaves, show an enrichment of genes with ASE in genomic regions responsible for phenotypic divergence between C. rubella and C. grandiflora. We further detected an excess of heterozygous transposable element (TE) insertions in the vicinity of genes with ASE, and TE insertions targeted by uniquely mapping 24-nt small RNAs were associated with reduced allelic expression of nearby genes. Our results suggest that cis-regulatory changes have been important for recent adaptive floral evolution in Capsella and that differences in TE dynamics between selfing and outcrossing species could be an important mechanism underlying rapid regulatory divergence.
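The idea of calibrating ASE calls against genomic reads can be sketched with a simple exact binomial test, in which the allelic ratio seen in genomic DNA reads (rather than a fixed 0.5) serves as the null expectation for RNA reads. The abstract does not say which statistical model the authors used, so this test, the function names, and the read counts below are all assumptions made for illustration.

```python
from math import comb


def binomial_two_sided_p(k, n, p):
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count k."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = probs[k] * (1 + 1e-7)  # small tolerance for float ties
    return min(1.0, sum(q for q in probs if q <= threshold))


def ase_p_value(rna_ref, rna_alt, dna_ref, dna_alt):
    """Test RNA allelic counts against the ratio observed in genomic reads.

    Using the genomic ratio as the null absorbs read-mapping bias that
    affects both DNA and RNA reads (an assumption of this sketch).
    """
    null_ratio = dna_ref / (dna_ref + dna_alt)
    return binomial_two_sided_p(rna_ref, rna_ref + rna_alt, null_ratio)


# Hypothetical read counts for two genes in one F1 hybrid.
balanced = ase_p_value(rna_ref=52, rna_alt=48, dna_ref=510, dna_alt=490)
skewed = ase_p_value(rna_ref=80, rna_alt=20, dna_ref=510, dna_alt=490)
print(f"balanced gene p = {balanced:.3f}, skewed gene p = {skewed:.2e}")
```

A plain binomial test ignores overdispersion between replicates, so a real analysis would likely need a model that also accounts for the technical variation the abstract mentions.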

Adaptive evolution of anti-viral siRNAi genes in bumblebees

Sophie Helbing, Michael Lattorff
doi: http://dx.doi.org/10.1101/017681

The high density of frequently interacting and closely related individuals in social insects enhances pathogen transmission and establishment within colonies. Group-mediated behavior supporting immune defenses tends to decrease selection acting on immune genes. Along with low effective population sizes, this will result in relaxed constraint and rapid evolution of genes of the immune system. Here we show that sociality is the main driver of selection in antiviral siRNAi genes in social bumblebees compared to their socially parasitic cuckoo bumblebees that lack a worker caste. RNAi genes show frequent positive selection at the codon level, additionally supported by the occurrence of parallel evolution, and their evolutionary rate is linked to their pathway-specific position, with genes directly interacting with viruses showing the highest rates of molecular evolution. We suggest that higher pathogen load in social insects does indeed drive adaptive evolution of immune genes, if not compensated by behavior.

Mycobacterial infection induces a specific human innate immune response

John D Blischak, Ludovic Tailleux, Amy Mitrano, Luis B Barreiro, Yoav Gilad
doi: http://dx.doi.org/10.1101/017483

The innate immune system provides the first response to pathogen infection and orchestrates the activation of the adaptive immune system. Though a large component of the innate immune response is common to all infections, pathogen-specific responses have been documented as well. The innate immune response is thought to be especially critical for fighting infection with Mycobacterium tuberculosis (MTB), the causative agent of tuberculosis (TB). While TB can be deadly, only 5-10% of individuals infected with MTB develop active disease. The risk for disease susceptibility is, at least partly, heritable. Studies of inter-individual variation in the innate immune response to MTB infection may therefore shed light on the genetic basis for variation in susceptibility to TB. Yet, to date, we still do not know which properties of the innate immune response are specific to MTB infection and which represent a general response to pathogen infection. To begin addressing this gap, we infected macrophages with eight different bacteria, including different MTB strains and related mycobacteria, and studied the transcriptional response to infection. Although the ensuing gene regulatory responses were largely consistent across the bacterial infection treatments, we were able to identify a novel subset of genes whose regulation was affected specifically by infection with mycobacteria. Genetic variants that are associated with regulatory differences in these genes should be considered candidate loci for explaining inter-individual susceptibility to TB.