FORGE : A tool to discover cell specific enrichments of GWAS associated SNPs in regulatory regions.

Ian Dunham, Eugene Kulesha, Valentina Iotchkova, Sandro Morganella, Ewan Birney
doi: http://dx.doi.org/10.1101/013045

Genome-wide association studies provide an unbiased discovery mechanism for numerous human diseases. However, a frustration in the analysis of GWAS is that the majority of variants discovered do not directly alter protein-coding genes. We have developed a simple analysis approach that detects the tissue-specific regulatory component of a set of GWAS SNPs by identifying enrichment of overlap with DNase I hotspots from diverse tissue samples. Functional element Overlap analysis of the Results of GWAS Experiments (FORGE) is available as a web tool and as standalone software and provides tabular and graphical summaries of the enrichments. Conducting FORGE analysis on SNP sets for 260 phenotypes available from the GWAS catalogue reveals numerous overlap enrichments with tissue-specific components reflecting the known aetiology of the phenotypes as well as revealing other unforeseen tissue involvements that may lead to mechanistic insights for disease.
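The idea of "enrichment of overlap" can be illustrated with a minimal Python sketch. It is not FORGE's actual implementation: the function names, the use of background SNP sets for comparison, and the z-score summary are all assumptions made for illustration. It counts how many SNP positions in a test set fall inside hotspot intervals for one tissue and compares that count with the counts from background sets.

```python
import bisect
import statistics


def count_overlaps(positions, intervals):
    """Count positions inside any of the sorted, non-overlapping [start, end) intervals."""
    starts = [start for start, _ in intervals]
    hits = 0
    for pos in positions:
        # Find the last interval whose start is <= pos, then check its end.
        i = bisect.bisect_right(starts, pos) - 1
        if i >= 0 and pos < intervals[i][1]:
            hits += 1
    return hits


def overlap_z_score(test_positions, background_sets, intervals):
    """Hypothetical enrichment score: z-score of the observed overlap count
    against the overlap counts of the background SNP sets.
    Returns None when the background counts do not vary."""
    observed = count_overlaps(test_positions, intervals)
    background = [count_overlaps(b, intervals) for b in background_sets]
    mean = statistics.mean(background)
    sd = statistics.pstdev(background)
    if sd == 0:
        return None
    return (observed - mean) / sd


# Toy data on a single chromosome (made up for illustration).
hotspots = [(100, 200), (500, 600)]
gwas_snps = [150, 550, 590, 900]
backgrounds = [[10, 150, 300, 700], [20, 30, 40, 50], [120, 510, 800, 900]]
z = overlap_z_score(gwas_snps, backgrounds, hotspots)
```

In practice, one such score would be computed per tissue sample, so that tissues with unusually high overlap stand out. How the background sets are chosen matters a great deal and is left open here.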

Genetic Analysis of Substrain Divergence in NOD Mice

Petr Simecek, Gary A Churchill, Hyuna Yang, Lucy B Rowe, Lieselotte Herberg, David V Serreze, Edward H Leiter
doi: http://dx.doi.org/10.1101/013037

The NOD mouse is a polygenic model for type 1 diabetes that is characterized by insulitis, a leukocytic infiltration of the pancreatic islets. During the ~35 years since the original inbred strain was developed in Japan, NOD substrains have been established at different laboratories around the world. Although environmental differences among NOD colonies capable of impacting diabetes incidence have been recognized, differences arising from genetic divergence have not previously been analyzed. We illustrate the importance of intersubstrain genetic differences by showing a difference in diabetes incidence between two substrains (NOD/ShiLtJ and NOD/Bom) maintained in a common environment. We use both Mouse Diversity Array and Whole Exome Capture Sequencing platforms to identify genetic differences distinguishing 5 NOD substrains. We describe 64 SNPs and 2 short indels that differ in coding regions of the 5 NOD substrains. A 100 kb deletion on Chromosome 3 distinguishes NOD/ShiLtJ and NOD/ShiLtDvs from 3 other substrains, while a 111 kb deletion in the Icam2 gene on Chromosome 11 is unique to the NOD/ShiLtDvs genome. The extent of genetic divergence for NOD substrains is compared to similar studies for C57BL/6 and BALB/c substrains. As mutations are fixed to homozygosity by continued inbreeding, significant differences in substrain phenotypes are to be expected. These results emphasize the importance of using embryo freezing methods to minimize genetic drift within substrains.

Y Chromosome of Aisin Gioro, the Imperial House of Qing Dynasty

Shi Yan, Harumasa Tachibana, Lan-Hai Wei, Ge Yu, Shao-Qing Wen, Chuan-Chao Wang
(Submitted on 19 Dec 2014)

The House of Aisin Gioro is the imperial family of the last dynasty in Chinese history, the Qing Dynasty (1644–1911). The Aisin Gioro family originated from Jurchen tribes and developed the Manchu people before they conquered China. By investigating the Y chromosomal short tandem repeats (STRs) of 7 modern male individuals who claim to belong to the Aisin Gioro family (3 of whom have full pedigree records), we found that 3 of them (2 of whom have full pedigrees, with Nurgaci as their most recent common ancestor) show a very close relationship (1–2 steps of difference across 17 STRs) and that the haplotype is rare. We therefore conclude that this haplotype is the Y chromosome of the House of Aisin Gioro. Further tests of single nucleotide polymorphisms (SNPs) indicate that they belong to Haplogroup C3b2b1*-M401(xF5483), although their Y-STR results are distant from the “star cluster”, which also belongs to the same haplogroup. This study forms the basis for pedigree research on the imperial family of the Qing Dynasty by means of genetics.

Using Bayesian multilevel whole-genome regression models for partial pooling of estimation sets in genomic prediction

Frank Technow, L. Radu Totir
doi: http://dx.doi.org/10.1101/012971

Estimation set size is an important determinant of genomic prediction accuracy. Plant breeding programs are characterized by a high degree of structuring, particularly into populations. This hampers establishment of large estimation sets for each population. Pooling populations increases estimation set size but ignores unique genetic characteristics of each. A possible solution is partial pooling with multilevel models, which allows estimating population-specific marker effects while still leveraging information across populations. We developed a Bayesian multilevel whole-genome regression model and compared its performance to that of the popular BayesA model applied to each population separately (no pooling) and to the joined data set (complete pooling). As an example, we analyzed a wide array of traits from the nested association mapping maize population. There we show that for small population sizes (e.g., < 50), partial pooling increased prediction accuracy over no or complete pooling for populations represented in the estimation set. No pooling was superior, however, when populations were large. In another example data set of interconnected biparental maize populations, either partial or complete pooling was superior, depending on the trait. A simulation showed that no pooling is superior when differences in genetic effects among populations are large and partial pooling when they are intermediate. With small differences, partial and complete pooling achieved equally high accuracy. For prediction of new populations, partial and complete pooling had very similar accuracy in all cases. We conclude that partial pooling with multilevel models can maximize the potential of pooling by making optimal use of information in pooled estimation sets.
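To make the contrast between no, complete, and partial pooling concrete, here is a toy Python sketch. It is not the authors' Bayesian whole-genome regression model. It assumes a simple normal-normal shrinkage rule with known variances, applied to a single effect per population, and all names are hypothetical. Each population's own estimate is pulled toward the pooled mean, and the pull is stronger for small populations.

```python
def partial_pool(group_means, group_sizes, within_var, between_var):
    """Shrink each group's mean toward the size-weighted grand mean.

    Assumes a known within-group variance and between-group variance
    (illustrative simplification). The weight on a group's own mean is
    n / (n + k), where k = within_var / between_var.
    """
    total = sum(group_sizes)
    grand_mean = sum(m * n for m, n in zip(group_means, group_sizes)) / total
    k = within_var / between_var
    estimates = []
    for m, n in zip(group_means, group_sizes):
        weight = n / (n + k)  # near 1 for large groups: behaves like no pooling
        estimates.append(weight * m + (1 - weight) * grand_mean)
    return estimates


# Two populations with different effects (made-up numbers).
small = partial_pool([1.0, 3.0], [10, 10], within_var=10.0, between_var=1.0)
large = partial_pool([1.0, 3.0], [1000, 1000], within_var=10.0, between_var=1.0)
```

With 10 individuals per population, the estimates move halfway toward the pooled mean of 2.0. With 1000 individuals, they stay close to the per-population values. This mirrors the qualitative pattern in the abstract, where the benefit of partial pooling shows up mainly for small populations.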

Imperfect drug penetration leads to spatial monotherapy and rapid evolution of multi-drug resistance

Stefany Moreno-Gamez, Alison L Hill, Daniel I.S. Rosenbloom, Dmitri A. Petrov, Martin A Nowak, Pleuni Pennings
doi: http://dx.doi.org/10.1101/013003

Infections with rapidly evolving pathogens are often treated using combinations of drugs with different mechanisms of action. One of the major goals of combination therapy is to reduce the risk of drug resistance emerging during a patient’s treatment. While this strategy generally has significant benefits over monotherapy, it may also select for multi-drug resistant strains, which present an important clinical and public health problem. For many antimicrobial treatment regimes, individual drugs have imperfect penetration throughout the body, so there may be regions where only one drug reaches an effective concentration. Here we propose that mismatched drug coverage can greatly speed up the evolution of multi-drug resistance by allowing mutations to accumulate in a stepwise fashion. We develop a mathematical model of within-host pathogen evolution under spatially heterogeneous drug coverage and demonstrate that even very small single-drug compartments lead to dramatically higher resistance risk. We find that it is often better to use drug combinations with matched penetration profiles, although there may be a trade-off between preventing eventual treatment failure due to resistance in this way, and temporarily reducing pathogen levels systemically. Our results show that drugs with the most extensive distribution are likely to be the most vulnerable to resistance. We conclude that optimal combination treatments should be designed to prevent this spatial effective monotherapy. These results are widely applicable to diverse microbial infections including viruses, bacteria and parasites.

Common binding by redundant group B Sox proteins is evolutionarily conserved in Drosophila

Sarah H Carl, Steven Russell
doi: http://dx.doi.org/10.1101/012872

Background: Group B Sox proteins are a highly conserved group of transcription factors that act extensively to coordinate nervous system development in higher metazoans while showing both co-expression and functional redundancy across a broad group of taxa. In Drosophila melanogaster, the two group B Sox proteins Dichaete and SoxNeuro show widespread common binding across the genome. While some instances of functional compensation have been observed in Drosophila, the function of common binding and the extent of its evolutionary conservation are not known. Results: We used DamID-seq to examine the genome-wide binding patterns of Dichaete and SoxNeuro in four species of Drosophila. Through a quantitative comparison of Dichaete binding, we evaluated the rate of binding site turnover across the genome as well as at specific functional sites. We also examined the presence of Sox motifs within binding intervals and the correlation between sequence conservation and binding conservation. To determine whether common binding between Dichaete and SoxNeuro is conserved, we performed a detailed analysis of the binding patterns of both factors in two species. Conclusion: We find that, while the regulatory networks driven by Dichaete and SoxNeuro are largely conserved across the drosophilids studied, binding site turnover is widespread and correlated with phylogenetic distance. Nonetheless, binding is preferentially conserved at known cis-regulatory modules and core, independently verified binding sites. We observed the strongest binding conservation at sites that are commonly bound by Dichaete and SoxNeuro, suggesting that these sites are functionally important. Our analysis provides insights into the evolution of group B Sox function, highlighting the specific conservation of shared binding sites and suggesting alternative sources of neofunctionalisation between paralogous family members.

Modeling and quantifying frequency-dependent fitness in microbial populations with cross-feeding interactions

Noah Ribeck, Richard E. Lenski
doi: http://dx.doi.org/10.1101/012807

Coexistence of multiple populations by frequency-dependent selection is common in nature, and it often arises even in well-mixed experiments with microbes. If ecology is to be incorporated into models of population genetics, then it is important to represent accurately the functional form of frequency-dependent interactions. However, measuring this functional form is problematic for traditional fitness assays, which assume a constant fitness difference between competitors over the course of an assay. Here, we present a theoretical framework for measuring the functional form of frequency-dependent fitness by accounting for changes in abundance and relative fitness during a competition assay. Using two examples of ecological coexistence that arose in a long-term evolution experiment with Escherichia coli, we illustrate accurate quantification of the functional form of frequency-dependent relative fitness. Using a Monod-type model of growth dynamics, we show that two ecotypes in a typical cross-feeding interaction, such as when one bacterial population uses a byproduct generated by another, yield relative fitness that is linear with relative frequency.
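A linear relationship between relative fitness and frequency is enough to produce stable coexistence when the slope is negative. The Python sketch below illustrates this under assumptions that are not taken from the preprint: a discrete-generation update, a relative fitness of the form w(f) = w0 + slope * f, and made-up parameter values.

```python
def next_frequency(f, w0, slope):
    """One generation of selection with frequency-dependent relative fitness.

    w is the fitness of the focal ecotype relative to its competitor
    (whose fitness is fixed at 1). The linear form is an assumption
    made for illustration.
    """
    w = w0 + slope * f
    return f * w / (f * w + (1.0 - f))


def equilibrium_frequency(w0, slope):
    """Frequency at which relative fitness equals 1 (neither type gains)."""
    return (1.0 - w0) / slope


def simulate(f0, w0, slope, generations):
    """Iterate the update rule and return the final frequency."""
    f = f0
    for _ in range(generations):
        f = next_frequency(f, w0, slope)
    return f


# The focal ecotype is fitter when rare (w = 1.2 at f = 0) and less fit
# when common (w = 0.8 at f = 1), so both types persist.
f_star = equilibrium_frequency(w0=1.2, slope=-0.4)
from_rare = simulate(0.1, w0=1.2, slope=-0.4, generations=500)
from_common = simulate(0.9, w0=1.2, slope=-0.4, generations=500)
```

Starting from either a rare or a common focal ecotype, the frequency approaches the same interior equilibrium, which is the signature of negative frequency dependence.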

Maternal microRNAs in Drosophila eggs: selection against target sites in maternal protein-coding transcripts

Antonio Marco
doi: http://dx.doi.org/10.1101/012757

In animals, before the zygotic genome is expressed, the egg already contains gene products deposited by the mother. These maternal products are crucial during the initial steps of development. In Drosophila melanogaster a large number of maternal products are found in the oocyte, some of which are indispensable. Many of these products are RNA molecules, such as gene transcripts and ribosomal RNAs. Recently, microRNAs (small RNA gene regulators) have been detected early during development and are important in these initial steps. The presence of some microRNAs in unfertilized eggs has been reported, but whether they have a functional impact in the egg or early embryo has not been explored. To characterize a maternal microRNA set, I have extracted and sequenced small RNAs from Drosophila unfertilized eggs. The unfertilized egg is rich in small RNAs, particularly in ribosomal RNAs, and contains multiple microRNA products. I further validated two of these microRNAs by qPCR and also showed that these are not present in eggs from mothers without Dicer-1 activity. Maternal microRNAs are often encoded within the intron of maternal genes, suggesting that many maternal microRNAs are the product of transcriptional hitch-hiking. Comparative genomics and population data suggest that maternally deposited transcripts tend to avoid target sites for maternally deposited microRNAs. A potential role of the maternal microRNA mir-9c in maternal-to-zygotic transition is also discussed. In conclusion, maternal microRNAs in Drosophila have a functional impact on maternal protein-coding transcripts.

Fitness costs in spatially structured environments

Florence Debarre
doi: http://dx.doi.org/10.1101/012740

The clustering of individuals that results from limited dispersal is a double-edged sword: while it allows for local interactions to be mostly among related individuals, it also results in increased local competition. Here I show that, because they mitigate local competition, fitness costs such as reduced fecundity or reduced survival are less costly in spatially structured environments than in non-spatial settings. I first present a simple demographic example to illustrate how spatial structure weakens selection against fitness costs. Then, I illustrate the importance of disentangling the evolution of a trait from the evolution of potential associated costs, using an example taken from a recent study investigating the effect of spatial structure on the evolution of host defence. Indeed, in this example, the differences between spatial and non-spatial selection gradients are entirely due to differences in the fitness costs, thereby undermining interpretations of the results made in terms of the trait only. This illustrates the need to consider fitness costs as proper traits in both theoretical and empirical studies.

Origin and cross-century dynamics of an avian hybrid zone

Andrea Morales-Rozo, Elkin A. Tenorio, Matthew D. Carling, Carlos Daniel Cadena
doi: http://dx.doi.org/10.1101/012856

Background: Characterizations of the dynamics of hybrid zones in space and time can give insights about traits and processes important in population divergence and speciation. We characterized a hybrid zone between tanagers in the genus Ramphocelus (Aves, Thraupidae) located in southwestern Colombia. We tested whether this hybrid zone originated as a result of secondary contact or of primary differentiation, and described its dynamics across time using spatial analyses of molecular, morphological, and coloration data in combination with paleodistribution modeling. Results: Models of potential historical distributions based on climatic data and genetic signatures of demographic expansion suggested that the hybrid zone originated following secondary contact between populations that expanded their ranges out of isolated areas in the Quaternary. Concordant patterns of variation in phenotypic characters across the hybrid zone and its narrow extent are suggestive of a tension zone, maintained by a balance between dispersal and selection against hybrids. Estimates of phenotypic cline parameters obtained using specimens collected over nearly a century revealed that, in recent decades, the zone has moved to the east and to higher elevations, and has become narrower. Genetic variation was not clearly structured along the hybrid zone, but comparisons between historical and contemporary specimens suggested that temporal changes in its genetic makeup may also have occurred. Conclusions: Our data suggest that the hybrid zone resulted from secondary contact between populations. The observed changes in the hybrid zone may be a result of sexual selection, asymmetric gene flow, or environmental change.