Author post: Segregation distorters are not a primary source of Dobzhansky-Muller incompatibilities in house mouse hybrids

This guest post is by Russ Corbett-Detig, Emily Jacobs-Palmer, and Hopi Hoekstra (@hopihoekstra) on their paper Corbett-Detig et al Segregation distorters are not a primary source of Dobzhansky-Muller incompatibilities in house mouse hybrids bioRxived here.

What are segregation distorters and how can they contribute to reproductive isolation?

Within an individual, somatic cells are typically genetic clones of one another; in contrast, haploid gametes are related to their compatriots at only half of all loci on average, opening doors to intra-individual competition and conflict. Eggs and sperm may express selfish genetic elements called segregation distorters (SDs) that disable or destroy competitor gametes carrying unrelated alleles. The resulting transmission advantage attained by SDs allows them to invade populations without improving the fitness of individuals that harbor them. Indeed, SDs often negatively impact carriers’ fitness because such hosts transmit fewer fit (or viable) gametes. Hence natural selection favors the evolution of alleles that suppress distortion and thereby restore fertility.

Coevolution of SDs and their suppressors can in turn contribute to the evolution of reproductive isolation between diverging lineages. How? If two populations become temporarily isolated from one another, SDs and later their accompanying suppressors may arise and eventually fix in one isolated population, possibly multiple times over. Should the two populations then encounter each other again, the sperm of hybrid males, for example, will contain one or more distorters without the appropriate suppressors, and these males will suffer decreased fertility. Over time, gene flow may be substantially and perhaps permanently hindered, leading to the formation of two reproductively isolated species.

In some Drosophila species pairs, and in many crop plants, it is clear that the coevolution of SDs and their suppressors is a major, even primary, contributor to the evolution of reproductive isolation between diverging lineages. At present, however, the relative importance of SD-suppressor systems to reproductive isolation across broader taxonomic swathes of sexually reproducing organisms (e.g. mammals) is largely unexplored.

Our solution to the practical challenges of studying SDs

[Figure: Supplemental Figure S1]

The primary impediment to addressing this important question in evolutionary biology is practical, not conceptual. Conventionally, researchers detect SD-suppressor systems by crossing two strains to produce a large second-generation hybrid population; they then genotype these hybrids at a set of markers across the genome to identify loci that show substantive deviations from 50:50 Mendelian ratios, which mark putative SDs. This traditional approach suffers from two major pitfalls. First, for many organisms it is not feasible to raise and genotype enough hybrids (hundreds to thousands) to have sufficient statistical power to detect SDs, especially those with weaker effects. Second, by genotyping these second-generation hybrids, rather than the gametes of their parents, one conflates SD with hybrid inviability, and it can be very difficult to disentangle these two factors.
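To make the power problem concrete, here is a minimal sketch of a test for deviation from a 50:50 ratio at a single marker. It uses a normal approximation to the binomial; the function name `transmission_z_test` and the example counts are illustrative assumptions, not taken from the paper.

```python
import math


def transmission_z_test(n_allele_a, n_total):
    """Two-sided normal-approximation test of H0: allele A is transmitted 50% of the time.

    Hypothetical helper for illustration; returns (z, p_value).
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    p_hat = n_allele_a / n_total
    # Standard error of a proportion under the null hypothesis p = 0.5
    se = math.sqrt(0.25 / n_total)
    z = (p_hat - 0.5) / se
    # Two-sided tail probability of a standard normal: P(|Z| > |z|)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # The same 55:45 distortion, observed with 100 versus 1,000 allele counts
    for n_a, n in [(55, 100), (550, 1000)]:
        z, p = transmission_z_test(n_a, n)
        print(f"{n_a}/{n}: z = {z:.2f}, p = {p:.4f}")
```

With these made-up counts, a modest 55:45 distortion is not significant at the usual 0.05 level with 100 observations but is with 1,000, which is why weak distorters demand so many hybrids.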

How can we circumvent these challenges? In this work, we develop an alternative approach. We first obtain high-quality, motile sperm from first-generation hybrid males (generated from two strains with available genome sequences), and then sequence these sperm in bulk alongside a somatic ‘control’ tissue. We then contrast the relative representation of the parental chromosomes in windows across the genome in both samples, searching for regions where the sperm sample shows more DNA copies of one parental haplotype but the somatic sample does not. Importantly, this approach is very general, and it can easily be applied to any number of interspecific or intraspecific crosses where it is possible to obtain large quantities of viable gametes.
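The sperm-versus-soma contrast can be sketched in a few lines. The example below is only an illustration under stated assumptions: it takes per-window read counts already assigned to each parental haplotype, compares the two tissues with a pooled two-proportion z-test, and applies a Bonferroni correction. The `WindowCounts` type, the function names, the choice of test, and the counts are all hypothetical and are not the statistical procedure used in the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class WindowCounts:
    """Reads supporting parental haplotype A or B in one genomic window (hypothetical layout)."""
    name: str
    sperm_a: int
    sperm_b: int
    soma_a: int
    soma_b: int


def two_proportion_z(w):
    """Compare the haplotype-A fraction in sperm against soma; returns (z, p_value)."""
    n_sperm = w.sperm_a + w.sperm_b
    n_soma = w.soma_a + w.soma_b
    p_sperm = w.sperm_a / n_sperm
    p_soma = w.soma_a / n_soma
    pooled = (w.sperm_a + w.soma_a) / (n_sperm + n_soma)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_sperm + 1 / n_soma))
    if se == 0:
        # All reads support one haplotype in both tissues: no evidence of a difference
        return 0.0, 1.0
    z = (p_sperm - p_soma) / se
    return z, math.erfc(abs(z) / math.sqrt(2))


def flag_windows(windows, alpha=0.05):
    """Names of windows whose sperm ratio differs from soma after Bonferroni correction."""
    threshold = alpha / len(windows)
    return [w.name for w in windows if two_proportion_z(w)[1] < threshold]


if __name__ == "__main__":
    windows = [
        WindowCounts("chr1:0-1Mb", 5000, 5000, 5000, 5000),
        WindowCounts("chr1:1-2Mb", 5600, 4400, 5010, 4990),
    ]
    print(flag_windows(windows))
```

Using the somatic sample as the baseline, rather than an idealized 50:50, means that mapping biases affecting both tissues equally cancel out of the comparison.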

Little evidence for SDs in house mouse hybrids

We apply this method to a nascent pair of Mus musculus subspecies, M. m. castaneus and M. m. domesticus. We chose these subspecies because hybrid males formed in this cross are known to be partially reproductively dysfunctional. Nonetheless, using our novel method we find no evidence supporting the presence of SDs (that is, no genomic regions showing a statistical deviation from 50:50 compared to control tissue), despite strong statistical power to detect them. We conclude that SDs do not contribute appreciably to the evolution of reproductive isolation in this nascent species pair. Instead, reproductive isolation in these mammalian subspecies likely stems from other incompatibilities in spermatogenesis or ejaculate production unrelated to SD-suppressor coevolution.

So what’s next? Because this approach—bulk sequencing of sperm from hybrid males—can be used on almost any pair of interfertile taxa, we can begin to better understand the prevalence of SD and its role in speciation in a wide diversity of species.

Increasing evolvability of local adaptation during range expansion.

Marleen M. P. Cobben, Alexander Kubisch
doi: http://dx.doi.org/10.1101/008979

Increasing dispersal under range expansion increases invasion speed, which implies that a species needs to adapt more rapidly to newly experienced local conditions. However, due to iterated founder effects, local genetic diversity under range expansion is low. Evolvability (the evolution of mutation rates) has been reported to possibly be an adaptive trait itself. Thus, we expect that increased dispersal during range expansion may raise the evolvability of local adaptation, and thus increase the survival of expanding populations. We have studied this phenomenon with a spatially explicit individual-based metapopulation model of a sexually reproducing species with discrete generations, expanding into an elevational gradient. Our results show that evolvability is likely to evolve as a result of spatial variation experienced under range expansion. In addition, we show that different spatial phenomena associated with range expansion, in this case spatial sorting / kin selection and priority effects, can reinforce each other.

The Sea Lamprey Meiotic Map Resolves Ancient Vertebrate Genome Duplications

Jeramiah Smith
doi: http://dx.doi.org/10.1101/008953

Gene and genome duplications serve as an important reservoir of material for the evolution of new biological functions. It is generally accepted that many genes present in vertebrate genomes owe their origin to two whole genome duplications that occurred deep in the ancestry of the vertebrate lineage. However, details regarding the timing and outcome of these duplications are not well resolved. We present high-density meiotic and comparative genomic maps for the sea lamprey, a representative of an ancient lineage that diverged from all other vertebrates approximately 550 million years ago. Linkage analyses yielded a total of 95 linkage groups, similar to the estimated number of germline chromosomes (1N ~ 99), spanning a total of 5,570.25 cM. Comparative mapping data yield strong support for one ancient whole genome duplication but do not strongly support a hypothetical second event. Rather, these comparative maps reveal several evolutionarily independent segmental duplications occurring over the last 600+ million years of chordate evolution. This refined history of vertebrate genome duplication should permit more precise investigations into the evolution of vertebrate gene functions.

Generation of a Panel of Induced Pluripotent Stem Cells From Chimpanzees: a Resource for Comparative Functional Genomics

Irene Gallego Romero, Bryan J Pavlovic, Irene Hernando-Herraez, Nicholas E Banovich, Courtney L Kagan, Jonathan E Burnett, Constance H Huang, Amy Mitrano, Claudia I Chavarria, Inbar F Ben-Nun, Yingchun Li, Karen Sabatini, Trevor Leonardo, Mana Parast, Tomas Marques-Bonet, Louise C Laurent, Jeanne F Loring, Yoav Gilad
doi: http://dx.doi.org/10.1101/008862

Comparative genomics studies in primates are extremely restricted because we only have access to a few types of cell lines from non-human apes and to a limited collection of frozen tissues. In order to gain better insight into regulatory processes that underlie variation in complex phenotypes, we must have access to faithful model systems for a wide range of tissues and cell types. To facilitate this, we have generated a panel of 7 fully characterized chimpanzee (Pan troglodytes) induced pluripotent stem cell (iPSC) lines derived from fibroblasts of healthy donors. All lines appear to be free of integration from exogenous reprogramming vectors, can be maintained using standard iPSC culture techniques, and have proliferative and differentiation potential similar to human and mouse lines. To begin demonstrating the utility of comparative iPSC panels, we collected RNA sequencing data and methylation profiles from the chimpanzee iPSCs and their corresponding fibroblast precursors, as well as from 7 human iPSCs and their precursors, which were of multiple cell type and population origins. Overall, we observed much less regulatory variation within species in the iPSCs than in the somatic precursors, indicating that the reprogramming process has erased many of the differences observed between somatic cells of different origins. We identified 4,918 differentially expressed genes and 3,598 differentially methylated regions between iPSCs of the two species, many of which are novel inter-species differences that were not observed between the somatic cells of the two species. Our panel will help realise the potential of iPSCs in primate studies, and in combination with genomic technologies, transform studies of comparative evolution.

Butter: High-precision genomic alignment of small RNA-seq data

Michael J Axtell

Eukaryotes produce large numbers of small non-coding RNAs that act as specificity determinants for various gene-regulatory complexes. These include microRNAs (miRNAs), endogenous short interfering RNAs (siRNAs), and Piwi-associated RNAs (piRNAs). These RNAs can be discovered, annotated, and quantified using small RNA-seq, a variant RNA-seq method based on highly parallel sequencing. Alignment to a reference genome is a critical step in analysis of small RNA-seq data. Because of their small size (20-30 nts depending on the organism and sub-type) and tendency to originate from multi-gene families or repetitive regions, reads that align equally well to more than one genomic location are very common. Typical methods to deal with multi-mapped small RNA-seq reads sacrifice either precision or sensitivity. The tool ‘butter’ balances precision and sensitivity by placing multi-mapped reads using an iterative approach, where the decision between possible locations is dictated by the local densities of more confidently aligned reads. Butter displays superior performance relative to other small RNA-seq aligners. Treatment of multi-mapped small RNA-seq reads has substantial impacts on downstream analyses, including quantification of MIRNA paralogs, and discovery of endogenous siRNA loci. Butter is freely available under a GNU general public license.
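The idea of letting confidently aligned reads guide the placement of ambiguous ones can be illustrated with a toy example. The sketch below is an assumption-laden illustration and is not Butter's actual algorithm: positions are plain integers on a single sequence, density is a count of anchored reads within a fixed radius, and the names `local_density` and `place_multimapped` are hypothetical.

```python
from bisect import bisect_left, bisect_right, insort


def local_density(sorted_positions, pos, radius):
    """Number of anchored reads within `radius` bases of `pos`."""
    lo = bisect_left(sorted_positions, pos - radius)
    hi = bisect_right(sorted_positions, pos + radius)
    return hi - lo


def place_multimapped(unique_positions, multi_reads, radius=100):
    """Iteratively assign each multi-mapped read to its densest candidate location.

    unique_positions: positions of uniquely aligned reads (the initial anchors).
    multi_reads: dict mapping read id -> list of candidate positions.
    Reads with no nearby anchors, or with tied candidates, stay unplaced (None).
    """
    anchors = sorted(unique_positions)
    placements = {}
    pending = dict(multi_reads)
    progress = True
    while pending and progress:
        progress = False
        for read_id, candidates in list(pending.items()):
            if not candidates:
                continue
            densities = [local_density(anchors, c, radius) for c in candidates]
            best = max(densities)
            winners = [c for c, d in zip(candidates, densities) if d == best]
            if best > 0 and len(winners) == 1:
                placements[read_id] = winners[0]
                insort(anchors, winners[0])  # a placed read becomes an anchor
                del pending[read_id]
                progress = True
    for read_id in pending:
        placements[read_id] = None
    return placements


if __name__ == "__main__":
    unique = [100, 120, 130, 5000]
    multi = {"r1": [110, 5010], "r2": [5150, 40000], "r3": [5060, 20000]}
    print(place_multimapped(unique, multi))
```

In this toy input, read `r2` has no anchors near either candidate on the first pass and is only placed after `r3` has been assigned nearby, which is the point of iterating.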

Concerning RNA-Guided Gene Drives for the Alteration of Wild Populations

Kevin M Esvelt, Andrea L Smidler, Flaminia Catteruccia, George M Church

Gene drives may be capable of addressing ecological problems by altering entire populations of wild organisms, but their use has remained largely theoretical due to technical constraints. Here we consider the potential for RNA-guided gene drives based on the CRISPR nuclease Cas9 to serve as a general method for spreading altered traits through wild populations over many generations. We detail likely capabilities, discuss limitations, and provide novel precautionary strategies to control the spread of gene drives and reverse genomic changes. The ability to edit populations of sexual species would offer substantial benefits to humanity and the environment. For example, RNA-guided gene drives could potentially prevent the spread of disease, support agriculture by reversing pesticide and herbicide resistance in insects and weeds, and control damaging invasive species. However, the possibility of unwanted ecological effects and near-certainty of spread across political borders demand careful assessment of each potential application. We call for thoughtful, inclusive, and well-informed public discussions to explore the responsible use of this currently theoretical technology.

No evidence that sex and transposable elements drive genome size variation in evening primroses

J Arvid Agren, Stephan Greiner, Marc TJ Johnson, Stephen I Wright

Genome size varies dramatically across species, but despite an abundance of attention there is little agreement on the relative contributions of selective and neutral processes in governing this variation. The rate of sexual reproduction can potentially play an important role in genome size evolution because of its effect on the efficacy of selection and transmission of transposable elements. Here, we used a phylogenetic comparative approach and whole genome sequencing to investigate the contribution of sex and transposable element content to genome size variation in the evening primrose (Oenothera) genus. We determined genome size using flow cytometry from 30 Oenothera species of varying reproductive system and find that variation in sexual/asexual reproduction cannot explain the almost two-fold variation in genome size. Moreover, using whole genome sequences of three species of varying genome sizes and reproductive system, we found that genome size was not associated with transposable element abundance; instead the larger genomes had a higher abundance of simple sequence repeats. Although it has long been clear that sexual reproduction may affect various aspects of genome evolution in general and transposable element evolution in particular, it does not appear to have played a major role in the evening primroses.

Interpretation and approximation tools for big, dense Markov chain transition matrices in ecology and evolution

Katja Reichel, Valentin Bahier, Cédric Midoux, Jean-Pierre Masson, Solenn Stoeckel
Comments: 8 pages, 4 figures, supplement: 2 figures, visual abstract, highlights, source code
Subjects: Quantitative Methods (q-bio.QM); Populations and Evolution (q-bio.PE)

Markov chains are a common framework for individual-based state and time discrete models in ecology and evolution. Their use, however, is largely limited to systems with a low number of states, since the transition matrices involved pose considerable challenges as their size and their density increase. Big, dense transition matrices may easily defy both the computer’s memory and the scientists’ ability to interpret them, due to the very high amount of information they contain; yet approximations using other types of models are not always the best solution.
We propose a set of methods to overcome the difficulties associated with big, dense Markov chain transition matrices. Using a population genetic model as an example, we demonstrate how big matrices can be transformed into clear and easily interpretable graphs with the help of network analysis. Moreover, we describe an algorithm to save computer memory by substituting the original matrix with a sparse approximate while preserving all its mathematically important properties. In the same model example, we manage to store about 90% less data while keeping more than 99% of the information contained in the matrix and a closely corresponding dominant eigenvector.
Our approach is an example of how numerical limitations on the number of states in a Markov chain can be overcome. By facilitating the use of state-rich Markov chain models, it may help them become a valuable supplement to the diversity of models currently employed in biology.
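As a rough illustration of replacing a dense transition matrix with a sparse approximation, the sketch below drops small transition probabilities, renormalizes each row so it still sums to one, and estimates the stationary distribution (the dominant left eigenvector) by power iteration. This simple thresholding scheme, the function names, and the 3-state example matrix are assumptions made for illustration; they are not the algorithm described by the authors.

```python
def sparsify_rows(matrix, threshold):
    """Drop entries below `threshold` and renormalize rows; returns a list of {column: prob} dicts."""
    sparse = []
    for row in matrix:
        kept = {j: p for j, p in enumerate(row) if p >= threshold}
        if not kept:
            # Keep at least the largest entry so the row stays a valid distribution
            j = max(range(len(row)), key=row.__getitem__)
            kept = {j: row[j]}
        total = sum(kept.values())
        sparse.append({j: p / total for j, p in kept.items()})
    return sparse


def stationary(sparse, iterations=1000):
    """Power iteration for the stationary distribution of a row-stochastic sparse matrix."""
    n = len(sparse)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [0.0] * n
        for i, row in enumerate(sparse):
            for j, p in row.items():
                nxt[j] += dist[i] * p
        dist = nxt
    return dist


if __name__ == "__main__":
    dense = [
        [0.90, 0.09, 0.01],
        [0.05, 0.90, 0.05],
        [0.01, 0.09, 0.90],
    ]
    approx = sparsify_rows(dense, threshold=0.02)
    stored = sum(len(row) for row in approx)
    print(f"stored entries: {stored} of {len(dense) ** 2}")
    print([round(x, 4) for x in stationary(approx)])
```

Storing each row as a dictionary of its surviving entries is what saves memory; comparing the stationary distributions of the dense and sparse versions gives one check on how much the approximation has changed the chain.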

Redefining Genomic Privacy: Trust and Empowerment

Arvind Narayanan, Kenneth Yocum, David Glazer, Nita Farahany, Maynard Olson, Lincoln D. Stein, James B. Williams, Jan A. Witkowski, Robert C. Kain, Yaniv Erlich

Fulfilling the promise of the genetic revolution requires the analysis of large datasets containing information from thousands to millions of participants. However, sharing human genomic data requires protecting subjects from potential harm. Current models rely on de-identification techniques that treat privacy versus data utility as a zero-sum game. Instead we propose using trust-enabling techniques to create a solution where researchers and participants both win. To do so we introduce three principles that facilitate trust in genetic research and outline one possible framework built upon those principles. Our hope is that such trust-centric frameworks provide a sustainable solution that reconciles genetic privacy with data sharing and facilitates genetic research.

Author post: Inferring human population size and separation history from multiple genome sequences

This guest post is by Stephan Schiffels (@stschiff) on his paper with Richard Durbin Inferring human population size and separation history from multiple genome sequences biorxived here

In our paper, we study genome sequences to learn about human history and how human populations are related to each other. Remarkably, we only need a few individuals for this, because once we look sufficiently many generations into the past, every single genome contains fragments from a very large number of ancestors. This means that given only two genomes, say one individual from Africa and one individual from Europe, we typically find shared fragments from common ancestors (great great … great grandparents) from 2,000 or more generations ago. This trace of shared segments in our genomes can be detected and enables us to make inferences about human history.

A few years ago, Heng Li and Richard Durbin introduced the PSMC method, which is based on estimating this shared common ancestry in a single diploid genome to infer population sizes. We have now introduced a major extension to this approach, called MSMC (Multiple Sequentially Markovian Coalescent), which is able to find and date traces of shared ancestry across multiple genome sequences. This is generally a hard problem because of the complex way in which sequences relate to each other through recombination and mutation (see an excellent blog post by Adam Siepel). In our method, we therefore chose to focus only on the pair of segments that coalesces first, i.e. the pair that shares the most recent common ancestor of all pairs. Because of ancestral recombinations, this pair changes along the sequences.

Consider again the example of an African and a European individual, each of them carrying two copies of a chromosome. In one part of their genomes, the most recent ancestor of any two chromosomes may be shared between the two European chromosomes, in other parts it may be shared between the two African chromosomes, and in some cases it may actually be found across a European and an African chromosome. The relative frequency of how often we observe each of the three cases, and the distribution of times to the most recent common ancestor, give information about when the separation happened, and how long it took for the ancestral people to part fully from each other. In the case of West-Africans and Europeans, we found that the two populations started to separate from each other (at least genetically) long before the known out-of-Africa emigration 50,000 years ago. And we see the same thing if we compare West-Africans to Asians or Americans instead of Europeans. We can also see clearly how ancestors of Native Americans separated from Asians around 20,000 years ago, consistently preceding the known first arrival of people in the New World around 15,000 years ago.

Our method can also estimate effective population size changes through time. One consequence of looking only for the first common ancestor is that we can now look into a much more recent past than was previously possible with similar methods, such as PSMC. For example, we can now see a deep bottleneck in Native American ancestors around 15,000 years ago, which fits with the separation and immigration history described above, and we can see recent expansions that are consistent with the spread of agriculture in Africa.

We believe that MSMC is a useful tool for estimating population history from whole genome sequences. But more ideas and development are still needed to expand this approach to more genomes and to look into the past even more recently than 2,000 years ago, which is our current limit with MSMC. Closely related approaches are currently being developed by Yun Song, Thomas Mailund and others, which will complement MSMC. This is a great time to work in this field, given that many more high-quality individual genome sequences are being generated, in many cases from populations that we have not covered at all in our paper. All of this will help to greatly expand our knowledge of human population history.