Driven to Extinction: On the Probability of Evolutionary Rescue from Sex-Ratio Meiotic Drive

Robert Unckless, Andrew Clark
doi: http://dx.doi.org/10.1101/018820

Many evolutionary processes result in sufficiently low mean fitness that they pose a risk of species extinction. Sex-ratio meiotic drive was recognized by W.D. Hamilton (1967) to pose such a risk, because as the driving sex chromosome becomes common, the opposite sex becomes rare. We expand on Hamilton’s classic model by allowing for the escape from extinction due to evolution of suppressors of X and Y drivers. We explore how the two systems differ in their probability of escape from extinction. Several novel conclusions are evident, including a) that extinction time scales approximately with the log of population size, so that even large populations may go extinct quickly, b) that extinction risk is driven by the relationship between female fecundity and drive strength, c) that anisogamy, together with the fact that X and Y drive skew sex ratios in opposite directions, means that systems with Y drive are much more likely to go extinct than those with X drive, and d) that suppressors are most likely to become established when the strength of drive is intermediate, since weak drive leads to weak selection for suppression and strong drive leads to rapid extinction.

Controlling False Positive Rates in Methods for Differential Gene Expression Analysis using RNA-Seq Data

David M Rocke, Luyao Ruan, J. Jared Gossett, Blythe Durbin-Johnson, Sharon Aviran
doi: http://dx.doi.org/10.1101/018739

We review existing methods for the analysis of RNA-Seq data and place them in a common framework of a sequence of tasks that are usually part of the process. We show that many existing methods produce large numbers of false positives in cases where the null hypothesis is true by construction and where actual data from RNA-Seq studies are used, as opposed to simulations that make specific assumptions about the nature of the data. We show that some of those mathematical assumptions about the data are likely one cause of the false positives, and define a general structure that is not apparently subject to these problems. The best performance was shown by limma-voom and by some simple methods composed of easily understandable steps.
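The idea of a null hypothesis that is "true by construction" can be sketched as follows: take samples that all come from a single condition, split them at random into two pseudo-groups, run a test per gene, and count how often it rejects. A calibrated test should reject in roughly a fraction `alpha` of genes. This is only an illustration under stated assumptions: the function names are hypothetical, the large-sample z-test is a stand-in and not one of the methods evaluated in the paper, and the input here is synthetic Gaussian data, whereas the abstract stresses the use of real RNA-Seq data.

```python
import random
from statistics import NormalDist, mean, stdev


def z_test_pvalue(a, b):
    """Two-sided p-value from a large-sample z-test on the difference in means.

    Stand-in test for illustration only (assumption, not from the paper).
    """
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def null_false_positive_rate(expr, alpha=0.05, seed=0):
    """Fraction of genes called significant when no true difference exists.

    expr: list of per-gene lists of values, every sample drawn from ONE
    condition, so any rejection is a false positive by construction.
    """
    rng = random.Random(seed)
    sample_ids = list(range(len(expr[0])))
    rng.shuffle(sample_ids)  # random split into two pseudo-groups
    half = len(sample_ids) // 2
    group_a, group_b = sample_ids[:half], sample_ids[half:]

    hits = 0
    for gene in expr:
        a = [gene[i] for i in group_a]
        b = [gene[i] for i in group_b]
        if z_test_pvalue(a, b) < alpha:
            hits += 1
    return hits / len(expr)


# Synthetic stand-in for a single-condition expression matrix
# (2000 genes x 60 samples); real data would replace this.
_rng = random.Random(1)
fake_expr = [[_rng.gauss(5.0, 1.0) for _ in range(60)] for _ in range(2000)]
rate = null_false_positive_rate(fake_expr)
print(f"false positive rate at alpha=0.05: {rate:.3f}")
```

A rate far above `alpha` on data like this would indicate that the test's distributional assumptions do not hold for the data at hand.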

Fine-mapping cellular QTLs with RASQUAL and ATAC-seq

Natsuhiko Kumasaka, Andrew Knights, Daniel Gaffney
doi: http://dx.doi.org/10.1101/018788

When cellular traits are measured using high-throughput DNA sequencing, quantitative trait loci (QTLs) manifest at two levels: population-level differences between individuals and allelic differences between cis-haplotypes within individuals. We present RASQUAL (Robust Allele Specific QUAntitation and quality controL), a novel statistical approach for association mapping that integrates genetic effects and robust modelling of biases in next generation sequencing (NGS) data within a single, probabilistic framework. RASQUAL substantially improves causal variant localisation and sensitivity of association detection over existing methods in RNA-seq, DNaseI-seq and ChIP-seq data. We illustrate how RASQUAL can be used to maximise association detection by generating the first map of chromatin accessibility QTLs (caQTLs) in a European population using ATAC-seq. Despite a modest sample size, we identified 2,706 independent caQTLs (FDR 10%) and illustrate how RASQUAL’s improved causal variant localisation provides powerful information for fine-mapping disease-associated variants. We also map “multipeak” caQTLs, identical genetic associations found across multiple, independent open chromatin regions, and illustrate how genetic signals in ATAC-seq data can be used to link distal regulatory elements with gene promoters. Our results highlight how joint modelling of population and allele-specific genetic signals can improve functional interpretation of noncoding variation.

The “Gini index” in genetics: measuring genetic architecture complexity of quantitative traits

Xia Shen
doi: http://dx.doi.org/10.1101/018713

Genetic architecture is a general term that is used and discussed very often in complex trait genetics. It refers to the number of functional loci involved in explaining variation in a complex trait and to the distribution of genetic effects across these loci. Understanding the complexity of the genetic architecture of complex traits is essential for evaluating the potential power of mapping functional loci and of predicting complex traits. However, there has been no quantitative measure of genetic architecture complexity, which makes it difficult to link results from genetic data analysis to this term. Inspired by the “Gini index” for measuring income distribution in economics, I develop a genetic architecture score (“GA score”) to measure genetic architecture complexity. Simulations indicate that the GA score is an effective measure of the complexity of the genetic architecture of complex traits.
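For readers unfamiliar with the economic measure that inspired the score, here is a minimal sketch of the standard Gini coefficient applied to a hypothetical list of per-locus effect sizes. The abstract does not define the GA score itself, so this is not the author's method; the function name `gini` and the example values are assumptions made for illustration.

```python
def gini(values):
    """Standard Gini coefficient of a list of non-negative quantities.

    Returns 0 when all values are equal and approaches 1 when a single
    value accounts for nearly the whole total. Applying it to per-locus
    effect sizes is an illustrative assumption, not the paper's GA score.
    """
    xs = sorted(abs(v) for v in values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero value")
    # Rank-weighted sum over values sorted in ascending order.
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical effect-size profiles over four loci.
even_effects = [1.0, 1.0, 1.0, 1.0]    # effects spread evenly
single_locus = [0.0, 0.0, 0.0, 1.0]    # one locus carries everything

print(gini(even_effects))   # evenly spread effects: no inequality
print(gini(single_locus))   # maximal inequality for n = 4, i.e. (n - 1) / n
```

Under this reading, a value near 0 corresponds to many loci of similar effect and a value near 1 to a trait dominated by a few large-effect loci.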

Detecting recent selective sweeps while controlling for mutation rate and background selection

Christian D. Huber, Michael DeGiorgio, Ines Hellmann, Rasmus Nielsen
doi: http://dx.doi.org/10.1101/018697

A composite likelihood ratio test implemented in the program SweepFinder is a commonly used method for scanning a genome for recent selective sweeps. SweepFinder uses information on the spatial pattern of the site frequency spectrum (SFS) around the selected locus. To avoid confounding effects of background selection and variation in the mutation process along the genome, the method is typically applied only to sites that are variable within species. However, the power to detect and localize selective sweeps can be greatly improved if invariable sites are also included in the analysis. In the spirit of a Hudson-Kreitman-Aguadé test, we suggest adding fixed differences relative to an outgroup to account for variation in mutation rate, thereby facilitating more robust and powerful analyses. We also develop a method for including background selection, modeled as a local reduction in the effective population size. Using simulations, we show that these advances lead to a gain in power while maintaining robustness to mutation rate variation. Furthermore, the new method provides more precise localization of the causative mutation than methods using the spatial pattern of segregating sites alone.

Surveying the relative impact of mRNA features on local ribosome profiling read density in 28 datasets.

Patrick O’Connor, Dmitry Andreev, Pavel Baranov
doi: http://dx.doi.org/10.1101/018762

Ribosome profiling is a promising technology for exploring gene expression. However, ribosome profiling data are characterized by a substantial number of outliers due to technical and biological factors. Here we introduce a simple computational method, Ribo-seq Unit Step Transformation (RUST), for the characterization of ribosome profiling data. We show that RUST is robust and outperforms conventional normalization techniques in the presence of sporadic noise. We used RUST to analyse 28 publicly available ribosome profiling datasets obtained from mammalian cells and tissues and from yeast. This revealed substantial protocol-dependent variation in the composition of footprint libraries. We selected a high-quality dataset to explore the mRNA features that affect local decoding rates and found that the identity of the amino acid encoded by the codon in the A-site is the major contributing factor, followed by the identity of the codon itself and then the amino acid in the P-site. We also found that bulky amino acids slow down ribosome movement when they occur within the peptide tunnel, and that proline residues may decrease or increase ribosome velocities depending on the context in which they occur. Moreover, we show that a few parameters obtained with RUST are sufficient for predicting experimental densities with high accuracy. Due to its robustness and low computational demand, RUST could be used for quick routine characterization of ribosome profiling datasets to assess their quality, as well as for the analysis of the relative impact of mRNA sequence features on local decoding rates.

Most viewed on Haldane’s Sieve: April 2014

The most viewed posts this month were:

Distinct nucleosome distribution patterns in two structurally and functionally differentiated nuclei of a unicellular eukaryote

Jie Xiong, Shan Gao, Wen Dui, Wentao Yang, Xiao Chen, Sean D Taverna, Ronald E. Pearlman, Wendy Ashlock, Wei Miao, Yifan Liu
doi: http://dx.doi.org/10.1101/018754

The ciliate protozoan Tetrahymena thermophila contains two types of structurally and functionally differentiated nuclei: the transcriptionally active somatic macronucleus (MAC) and the transcriptionally silent germ-line micronucleus (MIC). Here we demonstrate that MAC features well-positioned nucleosomes downstream of transcription start sites (TSS), likely connected with promoter-proximal pausing of RNA polymerase II, as well as in exonic regions flanking both the 5′ and 3′ splice sites. In contrast, nucleosomes in MIC are more delocalized. Nucleosome occupancies in MAC and MIC are nonetheless highly correlated with each other and with predictions based upon DNA sequence features. Arrays of well-positioned nucleosomes are often correlated with GC content oscillations, suggesting significant contributions from cis-determinants. We propose that cis- and trans-determinants may coordinately accommodate some well-positioned nucleosomes with important functions, driven by a process in which positioned nucleosomes shape the mutational landscape of associated DNA sequences, while the DNA sequences in turn reinforce nucleosome positioning.

Standing genetic variation as a major contributor to adaptation in the Virginia chicken lines selection experiment

Zheya Sheng, Mats E Pettersson, Christa F Honaker, Paul B Siegel, Örjan Carlborg
doi: http://dx.doi.org/10.1101/018721

Artificial selection has, for decades, provided a powerful approach to study the genetics of adaptation. Using selective-sweep mapping, it is possible to identify genomic regions in populations where the allele-frequencies have diverged during selection. To avoid misleading signatures of selection, it is necessary to show that a sweep has an effect on the selected trait before it can be considered adaptive. Here, we confirm candidate selective-sweeps on a genome-wide scale in one of the longest, on-going bi-directional selection experiments in vertebrates, the Virginia high and low body-weight selected chicken lines. The candidate selective-sweeps represent standing genetic variants originating from the common base-population. Using a deep-intercross between the selected lines, 16 of 99 evaluated regions were confirmed to contain adaptive selective-sweeps based on their association with the selected trait, 56-day body-weight. Although individual additive effects were small, the fixation for alternative alleles in the high and low body-weight lines across these loci contributed at least 40% of the divergence between them and about half of the additive genetic variance present within and between the lines after 40 generations of selection. The genetic variance contributed by the sweeps corresponds to about 85% of the additive genetic variance of the base-population, illustrating that these loci were major contributors to the realised selection-response. Thus, the gradual, continued, long-term selection response in the Virginia lines was likely due to considerable standing genetic variation in a highly polygenic genetic architecture in the base-population, with contributions from a steady release of selectable genetic variation from new mutations and epistasis throughout the course of selection.

The complex admixture history and recent southern origins of Siberian populations

Irina Pugach, Rostislav Matveev, Viktor Spitsyn, Sergey Makarov, Innokentiy Novgorodov, Vladimir Osakovsky, Mark Stoneking, Brigitte Pakendorf
doi: http://dx.doi.org/10.1101/018770

Although Siberia was inhabited by modern humans at an early stage, there is still debate over whether this area remained habitable during the extremely cold period of the Last Glacial Maximum or whether it was subsequently repopulated by peoples with a recent shared ancestry. Previous studies of the genetic history of Siberian populations were hampered by the extensive admixture that appears to have taken place among these populations, since commonly used methods assume a tree-like population history and at most single admixture events. We therefore developed a new method based on the covariance of ancestry components, which we validated with simulated data, in order to investigate this potentially complex admixture history and to distinguish the effects of shared ancestry from prehistoric migrations and contact. We furthermore adapted a previously devised method of admixture dating for use with multiple events of gene flow, and applied these methods to whole-genome genotype data from over 500 individuals belonging to 20 different Siberian ethnolinguistic groups. The results of these analyses indicate that there have indeed been multiple layers of admixture detectable in most of the Siberian populations, with considerable differences in the admixture histories of individual populations, and with the earliest events dated to not more than 4500 years ago. Furthermore, most of the populations of Siberia included here, even those settled far to the north, can be shown to have a southern origin. These results provide support for a recent population replacement in this region, with the northward expansions of different populations possibly being driven partly by the advent of pastoralism, especially reindeer domestication. These newly developed methods to analyse multiple admixture events should aid in the investigation of similarly complex population histories elsewhere.