Maternal microRNAs in Drosophila eggs: selection against target sites in maternal protein-coding transcripts

Antonio Marco
doi: http://dx.doi.org/10.1101/012757

In animals, before the zygotic genome is expressed, the egg already contains gene products deposited by the mother. These maternal products are crucial during the initial steps of development. In Drosophila melanogaster a large number of maternal products are found in the oocyte, some of which are indispensable. Many of these products are RNA molecules, such as gene transcripts and ribosomal RNAs. Recently, microRNAs – small RNA gene regulators – have been detected early during development and are important in these initial steps. The presence of some microRNAs in unfertilized eggs has been reported, but whether they have a functional impact in the egg or early embryo has not been explored. To characterize a maternal microRNA set, I have extracted and sequenced small RNAs from Drosophila unfertilized eggs. The unfertilized egg is rich in small RNAs, particularly in ribosomal RNAs, and contains multiple microRNA products. I further validated two of these microRNAs by qPCR and also showed that these are not present in eggs from mothers without Dicer-1 activity. Maternal microRNAs are often encoded within the introns of maternal genes, suggesting that many maternal microRNAs are the product of transcriptional hitch-hiking. Comparative genomics and population data suggest that maternally deposited transcripts tend to avoid target sites for maternally deposited microRNAs. A potential role of the maternal microRNA mir-9c in maternal-to-zygotic transition is also discussed. In conclusion, maternal microRNAs in Drosophila have a functional impact on maternal protein-coding transcripts.

Reproductive workers show queen-like gene expression in an intermediately eusocial insect, the buff-tailed bumble bee Bombus terrestris

Mark Christian Harrison, Robert L Hammond, Eamonn B Mallon
doi: http://dx.doi.org/10.1101/012500

Bumble bees represent a taxon with an intermediate level of eusociality within Hymenoptera. The clear division of reproduction between a single founding queen and the largely sterile workers is characteristic of highly eusocial species, whereas the morphological similarity between the bumble bee queen and the workers is typical of more primitively eusocial hymenopterans. Also, unlike other highly eusocial hymenopterans, division of labour among worker sub-castes is plastic and not predetermined by morphology or age. We conducted a differential expression analysis based on RNA-seq data from 11 combinations of developmental stage and caste to investigate how a single genome can produce the distinct castes of queens, workers and males in the buff-tailed bumble bee Bombus terrestris. Based on expression patterns, we found males to be the most distinct of all adult castes (2,411 transcripts differentially expressed compared to non-reproductive workers). However, only relatively few transcripts were differentially expressed between males and workers during development (larvae: 71, pupae: 162). This indicates the need for more distinct expression patterns to control behaviour and physiology in adults compared to those required to create different morphologies. Among the female castes, the expression of over ten times more transcripts differed significantly between reproductive workers and their non-reproductive sisters than when comparing reproductive workers to the mother queen. This suggests a strong shift towards a more queen-like behaviour and physiology when a worker becomes fertile. This is in contrast to findings for higher eusocial species, in which reproductive workers are more similar to non-reproductive workers than to the queen.

The genomic signature of social interactions regulating honey bee caste development

Svjetlana Vojvodic, Brian R Johnson, Brock Harpur, Clement Kent, Amro Zayed, Kirk E Anderson, Timothy Linksvayer
doi: http://dx.doi.org/10.1101/012385

Social evolution theory posits the existence of genes expressed in one individual that affect the traits and fitness of social partners. The archetypal example of reproductive altruism, honey bee reproductive caste, involves strict social regulation of larval caste fate by care-giving nurses. However, the contribution of nurse-expressed genes, which are prime socially-acting candidate genes, to the caste developmental program and to caste evolution remains mostly unknown. We experimentally induced new queen production by removing the current colony queen, and we used RNA sequencing to study the gene expression profiles of both developing larvae and their care-giving nurses before and after queen removal. By comparing the gene expression profiles of queen-destined larvae and their nurses with those of worker-destined larvae and their nurses in queen-present and queen-absent conditions, we identified larval and nurse genes associated with larval caste development and with queen presence. Of 950 differentially-expressed genes associated with larval caste development, 82% were expressed in larvae and 18% were expressed in nurses. Behavioral and physiological evidence suggests that nurses may specialize, in the short term, in feeding queen- versus worker-destined larvae. Estimated selection coefficients indicated that both nurse and larval genes associated with caste are rapidly evolving, especially those genes associated with worker development. Of the 1863 differentially-expressed genes associated with queen presence, 90% were expressed in nurses. Altogether, our results suggest that socially-acting genes play important roles in both the expression and evolution of socially-influenced traits like caste.

Evaluating intra- and inter-individual variation in the human placental transcriptome

David A Hughes, Martin Kircher, Zhisong He, Song Guo, Genevieve L Fairbrother, Carlos S Moreno, Philipp Khaitovich, Mark Stoneking
doi: http://dx.doi.org/10.1101/012468

Background: Gene expression variation is a phenotypic trait of particular interest as it represents the initial link between genotype and other phenotypes. Analyzing how such variation apportions among and within groups allows for the evaluation of how genetic and environmental factors influence such traits. It also provides opportunities to identify genes and pathways that may have been influenced by non-neutral processes. Here we use a population genetics framework and next generation sequencing to evaluate how gene expression variation is apportioned among four human groups in a natural biological tissue, the placenta. Results: We estimate that on average, 33.2%, 58.9% and 7.8% of the placental transcriptome is explained by variation within individuals, among individuals and among human groups, respectively. Additionally, when technical and biological traits are included in models of gene expression they account for roughly 2% of total gene expression variation. Notably, the variation that is significantly different among groups is enriched in biological pathways associated with immune response, cell signaling and metabolism. Many biological traits demonstrated correlated changes in expression in numerous pathways of potential interest to clinicians and evolutionary biologists. Finally, we estimate that the majority of the human placental transcriptome (65% of expressed genes) exhibits expression profiles consistent with neutrality; the remainder are consistent with stabilizing selection (26%), directional selection (4.9%), or diversifying selection (4.8%). Conclusion: We apportion placental gene expression variation into individual, population and biological trait factors and identify how each influences the transcriptome. Additionally, we advance methods to associate expression profiles with different forms of selection.

Local and systemic gene expression responses to a white syndrome-like disease in a reef building coral, Acropora hyacinthus.

Rachel M Wright, Galina V Aglyamova, Eli Meyer, Mikhail V Matz
doi: http://dx.doi.org/10.1101/012211

Background: Corals are capable of launching diverse immune defenses at the site of direct contact with pathogens, but the molecular mechanisms of this activity and the colony-wide effects of such stressors remain poorly understood. Here we compared gene expression profiles in eight healthy Acropora hyacinthus colonies against eight colonies exhibiting white syndrome-like symptoms, all collected from a natural reef environment near Palau. Two types of tissues were sampled from diseased corals: visibly affected and apparently healthy tissues. Results: Tag-based RNA-Seq followed by weighted gene co-expression network analysis identified groups of co-regulated differentially expressed genes between all disease states (diseased, ahead of the lesion, and healthy). Most of the differentially expressed genes were found between tissues at the lesions and asymptomatic (healthy and ahead of the lesion) tissues. These genes were related to innate immunity, oxidative stress responses, lipid metabolism, and calcification. Network analysis also revealed groups of genes regulated specifically in the tissues from diseased colonies that were not yet showing obvious symptoms of disease, indicating a systemic response to infection. Conclusions: These observations suggest that tissues ahead of the lesion of disease progression exist in a transitional state between health and lesion appearance. Alternatively, these gene expression profiles capture physiological differences between colonies with varying disease susceptibilities.

msCentipede: Modeling heterogeneity across genomic sites improves accuracy in the inference of transcription factor binding

Anil Raj, Heejung Shim, Yoav Gilad, Jonathan K Pritchard, Matthew Stephens
doi: http://dx.doi.org/10.1101/012013

Motivation: Understanding global gene regulation depends critically on accurate annotation of regulatory elements that are functional in a given cell type. CENTIPEDE, a powerful, probabilistic framework for identifying transcription factor binding sites from tissue-specific DNase I cleavage patterns and genomic sequence content, leverages the hypersensitivity of factor-bound chromatin and the information in the DNase I spatial cleavage profile characteristic of each DNA binding protein to accurately infer functional factor binding sites. However, the model for the spatial profile in this framework underestimates the substantial variation in the DNase I cleavage profiles across factor-bound genomic locations and across replicate measurements of chromatin accessibility. Results: In this work, we adapt a multi-scale modeling framework for inhomogeneous Poisson processes to better model the underlying variation in DNase I cleavage patterns across genomic locations bound by a transcription factor. In addition to modeling variation, we also model spatial structure in the heterogeneity in DNase I cleavage patterns for each factor. Using DNase-seq measurements assayed in a lymphoblastoid cell line, we demonstrate the improved performance of this model for several transcription factors by comparing against the ChIP-seq peaks for those factors. Finally, we propose an extension to this framework that allows for a more flexible background model and evaluate the additional gain in accuracy achieved when the background model parameters are estimated using DNase-seq data from naked DNA. The proposed model can also be applied to paired-end ATAC-seq and DNase-seq data in a straightforward manner. Availability: msCentipede, a Python implementation of an algorithm to infer transcription factor binding using this model, is made available at https://github.com/rajanil/msCentipede
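
As an illustration of what a multi-scale treatment of Poisson counts can look like, the sketch below recursively splits a vector of cleavage counts into a total and, at each scale, the share of every parent window that falls in its left half. This is a generic device for multi-scale Poisson models, not code taken from msCentipede; the abstract does not spell out the decomposition, and the function names `multiscale_decompose` and `reconstruct` and the example counts are hypothetical.

```python
import numpy as np

def multiscale_decompose(counts):
    """Split a count vector of length 2**J into a total and per-scale
    (parent, left_child) count arrays, ordered coarsest scale first."""
    counts = np.asarray(counts, dtype=int)
    n = counts.size
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    levels = []
    current = counts
    while current.size > 1:
        left = current[0::2]    # even positions: left child of each pair
        right = current[1::2]   # odd positions: right child of each pair
        parent = left + right   # counts at the next coarser scale
        levels.append((parent, left))
        current = parent
    return int(current[0]), levels[::-1]

def reconstruct(levels):
    """Invert multiscale_decompose, showing that no information is lost."""
    current = None
    for parent, left in levels:
        current = np.empty(2 * parent.size, dtype=int)
        current[0::2] = left
        current[1::2] = parent - left
    return current

# Hypothetical cleavage counts in an 8-bp window (illustrative values only).
counts = [3, 1, 0, 2, 5, 0, 1, 0]
total, levels = multiscale_decompose(counts)
```

For independent Poisson counts, the total is Poisson and each left-child count is binomial given its parent, so a model of this kind can place a separate distribution on the left-half probability at every scale and position. How msCentipede parameterizes those distributions is described in the preprint itself.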

The developmental transcriptome of contrasting Arctic charr (Salvelinus alpinus) morphs

Jóhannes Gudbrandsson, Ehsan P Ahi, Kalina H Kapralova, Sigrídur R Franzdottir, Bjarni K Kristjánsson, Sophie S Steinhaeuser, Ísak M Jóhannesson, Valerie H Maier, Sigurdur S Snorrason, Zophonías O Jónsson, Arnar Pálsson
doi: http://dx.doi.org/10.1101/011361

Species showing repeated evolution of similar traits can help illuminate the molecular and developmental basis of diverging traits and specific adaptations. Following the last glacial period, dwarfism and specialized bottom feeding morphology evolved rapidly in several landlocked Arctic charr (Salvelinus alpinus) populations in Iceland. In order to study the genetic divergence between small benthic morphs and larger morphs with limnetic morphotype, we conducted an RNA-seq transcriptome analysis of developing charr. We sequenced mRNA from whole embryos at four stages in early development of two stocks with very different morphologies, the small benthic (SB) charr from Lake Thingvallavatn and Holar aquaculture (AC) charr. The data reveal significant differences in expression of several biological pathways during charr development. There is also a difference between SB- and AC-charr in mitochondrial genes involved in energy metabolism and blood coagulation genes. We confirmed expression difference of five genes in whole embryos with qPCR, including lysozyme and natterin, which was previously identified as a fish-toxin of a lectin family that may be a putative immunopeptide. We verified differential expression of 7 genes in developing heads, and the expression associated consistently with benthic vs. limnetic charr (studied in 4 morphs total). Comparison of single nucleotide polymorphism (SNP) frequencies reveals extensive genetic differentiation between the SB- and AC-charr (60 fixed SNPs and around 1300 differing more than 50% in frequency). In SB-charr the high frequency derived SNPs are in genes related to translation and oxidative processes. Curiously, several derived SNPs reside in the 12S and 16S mitochondrial ribosomal RNA genes, including a base highly conserved among fishes. The data implicate multiple genes and molecular pathways in divergence of small benthic charr and/or the response of aquaculture charr to domestication.
Functional, genetic and population genetic studies on more freshwater and anadromous populations are needed to confirm the specific loci and mutations relating to specific ecological or domestication traits in Arctic charr.

Enhanced Transcriptome Maps from Multiple Mouse Tissues Reveal Evolutionary Constraint in Gene Expression for Thousands of Genes

Dmitri Pervouchine, Sarah Djebali, Alessandra Breschi, Carrie A Davis, Pablo Prieto Barja, Alex Dobin, Andrea Tanzer, Julien Lagarde, Chris Zaleski, Lei-Hoon See, Meagan Fastuca, Jorg Drenkow, Huaien Wang, Giovanni Bussotti, Baikang Pei, Suganthi Balasubramanian, Jean Monlong, Arif Harmanci, Mark Gerstein, Michael A Beer, Cedric Notredame, Roderic Guigo, Thomas R Gingeras
doi: http://dx.doi.org/10.1101/010884

We characterized by RNA-seq the transcriptional profiles of a large and heterogeneous collection of mouse tissues, augmenting the mouse transcriptome with thousands of novel transcript candidates. Comparison with transcriptome profiles obtained in human cell lines reveals substantial conservation of transcriptional programs, and uncovers a distinct class of genes with levels of expression, across cell types and species, that have been constrained early in vertebrate evolution. This core set of genes captures a substantial and constant fraction of the transcriptional output of mammalian cells, and participates in basic functional and structural housekeeping processes common to all cell types. Perturbation of these constrained genes is associated with significant phenotypes including embryonic lethality and cancer. Evolutionary constraint in gene expression levels is not reflected in the conservation of the genomic sequences, but it is associated with strong and conserved epigenetic marking, as well as with a characteristic post-transcriptional regulatory program in which sub-cellular localization and alternative splicing play comparatively large roles.

Genome-wide comparative analysis reveals human-mouse regulatory landscape and evolution

Olgert Denas, Richard Sandstrom, Yong Cheng, Kathryn Beal, Javier Herrero, Ross Hardison, James Taylor
doi: http://dx.doi.org/10.1101/010926

Background: Because species-specific gene expression is driven by species-specific regulation, understanding the relationship between sequence and function of the regulatory regions in different species will help elucidate how differences among species arise. Despite active experimental and computational research, the relationships among sequence, conservation, and function are still poorly understood. Results: We compared transcription factor occupied segments (TFos) for 116 human and 35 mouse TFs in 546 human and 125 mouse cell types and tissues from the Human and the Mouse ENCODE projects. We based the map between human and mouse TFos on a one-to-one nucleotide cross-species mapper, bnMapper, that utilizes whole genome alignments (WGA). Our analysis shows that TFos are under evolutionary constraint, but a substantial portion (25.1% of mouse and 25.85% of human on average) of the TFos does not have a homologous sequence on the other species; this portion varies among cell types and TFs. Furthermore, 47.67% and 57.01% of the homologous TFos sequence shows binding activity on the other species for human and mouse respectively. However, 79.87% and 69.22% is repurposed such that it binds the same TF in different cells or different TFs in the same cells. Remarkably, within the set of TFos not showing conservation of occupancy, the corresponding genome regions in the other species are preferred locations of novel TFos. These events suggest that a substantial amount of functional regulatory sequences is exapted from other biochemically active genomic material. Despite substantial repurposing of TFos, we did not find substantial changes in their predicted target genes, suggesting that CRMs buffer evolutionary events allowing little or no change in the TF – target gene associations. Thus, the small portion of TFos with strictly conserved occupancy underestimates the degree of conservation of regulatory interactions. 
Conclusion: We mapped regulatory sequences from an extensive number of TFs and cell types between human and mouse. A comparative analysis of this correspondence unveiled the extent of the shared regulatory sequence across TFs and cell types under study. Importantly, a large part of the shared regulatory sequence is repurposed on the other species. This sequence, fueled by turnover events, provides a strong case for exaptation in regulatory elements.

Sharing and specificity of co-expression networks across 35 human tissues

Emma Pierson, GTEx Consortium, Daphne Koller, Alexis Battle, Sara Mostafavi
doi: http://dx.doi.org/10.1101/010843

To understand the regulation of tissue-specific gene expression, the GTEx Consortium generated RNA-seq expression data for more than thirty distinct human tissues. This data provides an opportunity for deriving shared and tissue-specific gene regulatory networks on the basis of co-expression between genes. However, a small number of samples are available for a majority of the tissues, and therefore statistical inference of networks in this setting is highly underpowered. To address this problem, we infer tissue-specific gene co-expression networks for 35 tissues in the GTEx dataset using a novel algorithm, GNAT, that uses a hierarchy of tissues to share data between related tissues. We show that this transfer learning approach increases the accuracy with which networks are learned. Analysis of these networks reveals that tissue-specific transcription factors are hubs that preferentially connect to genes with tissue-specific functions. Additionally, we observe that genes with tissue-specific functions lie at the peripheries of our networks. We identify numerous modules enriched for Gene Ontology functions, and show that modules conserved across tissues are especially likely to have functions common to all tissues, while modules that are upregulated in a particular tissue are often instrumental to tissue-specific function. Finally, we provide a web tool which allows exploration of gene function and regulation in a tissue-specific manner.