Sequencing of 15,622 gene-bearing BACs reveals new features of the barley genome

María Muñoz-Amatriaín , Stefano Lonardi , MingCheng Luo , Kavitha Madishetty , Jan Svensson , Matthew Moscou , Steve Wanamaker , Tao Jiang , Andris Kleinhofs , Gary Muehlbauer , Roger Wise , Nils Stein , Yaqin Ma , Edmundo Rodriguez , Dave Kudrna , Prasanna R Bhat , Shiaoman Chao , Pascal Condamine , Shane Heinen , Josh Resnik , Rod Wing , Heather N Witt , Matthew Alpert , Marco Beccuti , Serdar Bozdag , Francesca Cordero , Hamid Mirebrahim , Rachid Ounit , Yonghui Wu , Frank You , Jie Zheng , Hana Šimková , Jaroslav Doležel , Jane Grimwood , Jeremy Schmutz , Denisa Duma , Lothar Altschmied , Tom Blake , Phil Bregitzer , Laurel Cooper , Muharrem Dilbirligi , Anders Falk , Leila Feiz , Andreas Graner , Perry Gustafson , Patrick Hayes , Peggy Lemaux , Jafar Mammadov , Timothy Close
doi: http://dx.doi.org/10.1101/018978

Barley (Hordeum vulgare L.) possesses a large and highly repetitive genome of 5.1 Gb that has hindered the development of a complete sequence. In 2012, the International Barley Sequencing Consortium released a resource integrating whole-genome shotgun sequences with a physical and genetic framework. However, since only 6,278 BACs in the physical map were sequenced, detailed fine structure was limited. To gain access to the gene-containing portion of the barley genome at high resolution, we identified and sequenced 15,622 BACs representing the minimal tiling path of 72,052 physically mapped gene-bearing BACs. This generated about 1.7 Gb of genomic sequence containing 17,386 annotated barley genes. Exploration of the sequenced BACs revealed that although distal ends of chromosomes contain most of the gene-enriched BACs and are characterized by high rates of recombination, there are also gene-dense regions with suppressed recombination. Knowledge of these deviant regions is relevant to trait introgression, genome-wide association studies, genomic selection model development and map-based cloning strategies. Sequences and their gene and SNP annotations can be accessed and exported via http://harvest-web.org/hweb/utilmenu.wc or through the software HarvEST:Barley (download from harvest.ucr.edu). In the latter, we have implemented a synteny viewer between barley and Aegilops tauschii to aid in comparative genome analysis.

Theoretical consequences of the Mutagenic Chain Reaction for manipulating natural populations

Robert Unckless , Philipp Messer , Andrew Clark
doi: http://dx.doi.org/10.1101/018986

The use of recombinant genetic technologies for population manipulation has mostly remained an abstract idea due to the lack of a suitable means to drive novel gene constructs to high frequency in populations. Recently Gantz and Bier showed that the use of CRISPR/Cas9 technology could provide an artificial drive mechanism, the so-called Mutagenic Chain Reaction (MCR), which could lead to rapid fixation of even a deleterious introduced allele. We establish the equivalence of this system to models of meiotic drive and review the results of simple models showing that, when there is a fitness cost to the MCR allele, an internal equilibrium exists that is usually unstable. Introductions must be at a frequency above this critical point for the successful invasion of the MCR allele. These modeling results have important implications for application of MCR in natural populations.
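The threshold behaviour described in this abstract can be illustrated with a minimal deterministic sketch. This is not the authors' model: it assumes random mating, conversion of a fraction `c` of heterozygotes into drive homozygotes, a fitness cost `s` for drive homozygotes, and a dominance coefficient `h` for heterozygotes. All function and parameter names are hypothetical. Under these assumptions, with `s = 0.6` and `c = 1`, the recursion has an internal equilibrium at a drive frequency of 1/3, and introductions below or above it move in opposite directions.

```python
def next_freq(p, s, c, h=0.0):
    """Drive-allele frequency after one generation of a toy model.

    p: current drive-allele frequency
    s: fitness cost to drive homozygotes (assumed parameter)
    c: fraction of heterozygotes converted to drive homozygotes (assumed)
    h: dominance of the cost in unconverted heterozygotes (assumed)
    """
    q = 1.0 - p
    # Genotype frequencies after random mating and conversion.
    dd = p * p + 2.0 * p * q * c      # drive homozygotes
    dw = 2.0 * p * q * (1.0 - c)      # unconverted heterozygotes
    ww = q * q                        # wild-type homozygotes
    # Viability selection.
    w_dd = 1.0 - s
    w_dw = 1.0 - h * s
    mean_w = dd * w_dd + dw * w_dw + ww
    return (dd * w_dd + 0.5 * dw * w_dw) / mean_w


def iterate(p0, s, c, generations=200, h=0.0):
    """Iterate the recursion from an introduction frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_freq(p, s, c, h)
    return p


if __name__ == "__main__":
    s, c = 0.6, 1.0
    for p0 in (0.30, 0.36):
        print(f"introduced at {p0:.2f} -> frequency after 200 generations: "
              f"{iterate(p0, s, c):.4f}")
```

In this sketch an introduction at 0.30 declines toward loss while one at 0.36 rises toward fixation, which is the sense in which the internal equilibrium is unstable. Other parameter choices, or a different ordering of conversion and selection, would shift the critical frequency.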

A Chronological Atlas of Natural Selection in the Human Genome during the Past Half-million Years

Hang Zhou , Sile Hu , Rostislav Matveev , Qianhui Yu , Jing Li , Philipp Khaitovich , Li Jin , Michael Lachmann , Mark Stoneking , Qiaomei Fu , Kun Tang
doi: http://dx.doi.org/10.1101/018929

The spatiotemporal distribution of recent human adaptation is a long-standing question. We developed a new coalescent-based method that collectively assigned human genome regions to modes of neutrality or to positive, negative, or balancing selection. Most importantly, selection times were estimated for all positive selection signals; these ranged over the last half million years, spanning the emergence of anatomically modern humans (AMH). These selection time estimates were further supported by analyses of the genome sequences from three ancient AMHs and the Neanderthals. A series of brain function-related genes were found to carry signals of ancient selective sweeps, which may have defined the evolution of cognitive abilities either before Neanderthal divergence or during the emergence of AMH. In particular, signals of brain evolution in AMH are strongly related to Alzheimer’s disease pathways. In conclusion, this study reports a chronological atlas of natural selection in humans.

Driven to Extinction: On the Probability of Evolutionary Rescue from Sex-Ratio Meiotic Drive

Robert Unckless , Andrew Clark
doi: http://dx.doi.org/10.1101/018820

Many evolutionary processes result in sufficiently low mean fitness that they pose a risk of species extinction. Sex-ratio meiotic drive was recognized by W.D. Hamilton (1967) to pose such a risk, because as the driving sex chromosome becomes common, the opposite sex becomes rare. We expand on Hamilton’s classic model by allowing for the escape from extinction due to evolution of suppressors of X and Y drivers. We explore differences in the two systems in their probability of escape from extinction. Several novel conclusions are evident, including a) that extinction time scales approximately with the log of population size, so that even large populations may go extinct quickly, b) that extinction risk is driven by the relationship between female fecundity and drive strength, c) that anisogamy, together with the fact that X and Y drive skew sex ratios in opposite directions, means that systems with Y drive are much more likely to go extinct than those with X drive, and d) that suppressors are most likely to become established when the strength of drive is intermediate, since weak drive leads to weak selection for suppression and strong drive leads to rapid extinction.
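The logarithmic scaling of extinction time in point a) can be made concrete with a toy deterministic sketch. It is a deliberately stripped-down Y-drive scenario, not the authors' model: it assumes complete drive (drive-carrying males sire only drive-carrying sons), a single drive male introduced into a population of size `n`, random single mating, a fixed `fecundity` per female, fractional individuals, and no suppressors, stochasticity, or density regulation. All names are hypothetical.

```python
def generations_to_extinction(n, fecundity=2.0, max_generations=10_000):
    """Generations until fewer than one female remains in a toy Y-drive model.

    n: initial population size, half female and half male (assumed, n > 2)
    fecundity: offspring per mated female (assumed parameter)
    """
    females = n / 2.0
    normal_males = n / 2.0 - 1.0
    drive_males = 1.0  # one male carrying a completely driving Y
    for generation in range(1, max_generations + 1):
        # Fraction of females whose mate carries the driving Y.
        d = drive_males / (normal_males + drive_males)
        broods_normal = females * (1.0 - d) * fecundity
        broods_drive = females * d * fecundity
        # Normal fathers give an even sex ratio; drive fathers give only drive sons.
        females = broods_normal / 2.0
        normal_males = broods_normal / 2.0
        drive_males = broods_drive
        if females < 1.0:
            return generation
    return None


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        print(f"N = {n:>13,d}: extinct after "
              f"{generations_to_extinction(n)} generations")
```

In this sketch the ratio of drive to normal males doubles every generation, so a millionfold increase in population size adds only a few dozen generations before females disappear. The full model in the paper, which includes suppressors and X drive, will differ in its details.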

Twisted trees and inconsistency of tree estimation when gaps are treated as missing data — the impact of model mis-specification in distance corrections

Emily Jane McTavish, Mike Steel, Mark T. Holder
Comments: 29 pages, 3 figures
Subjects: Populations and Evolution (q-bio.PE)

Statistically consistent estimation of phylogenetic trees or gene trees is possible if pairwise sequence dissimilarities can be converted to a set of distances that are proportional to the true evolutionary distances. Susko et al. (2004) reported some strikingly broad results about the forms of inconsistency in tree estimation that can arise if corrected distances are not proportional to the true distances. They showed that if the corrected distance is a concave function of the true distance, then inconsistency due to long branch attraction will occur. If these functions are convex, then two “long branch repulsion” trees will be preferred over the true tree, though these two incorrect trees are expected to be tied as the preferred tree. Here we extend their results, and demonstrate the existence of a tree shape (which we refer to as a “twisted Farris-zone” tree) for which a single incorrect tree topology will be guaranteed to be preferred if the corrected distance function is convex. We also report that the standard practice of treating gaps in sequence alignments as missing data is sufficient to produce non-linear corrected distance functions if the substitution process is not independent of the insertion/deletion process. Taken together, these results imply inconsistent tree inference under mild conditions. For example, if some positions in a sequence are constrained to be free of substitutions and insertion/deletion events while the remaining sites evolve with independent substitutions and insertion/deletion events, then the distances obtained by treating gaps as missing data can support an incorrect tree topology even given an unlimited amount of data.
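The concave case attributed to Susko et al. can be illustrated with a small four-taxon sketch. The choices here are assumptions for illustration and are not taken from the paper: the branch lengths, the use of the four-point comparison as a stand-in for a distance-based tree estimator, and the use of the expected uncorrected p-distance under the Jukes-Cantor model as the concave function of true distance. The true tree is AB|CD, with long terminal branches leading to A and C.

```python
import math

# Assumed branch lengths (expected substitutions per site).
LONG, SHORT, INTERNAL = 1.0, 0.05, 0.05

# True path lengths on the tree ((A:LONG, B:SHORT), (C:LONG, D:SHORT)).
TRUE_DISTANCES = {
    ("A", "B"): LONG + SHORT,
    ("C", "D"): LONG + SHORT,
    ("A", "C"): 2 * LONG + INTERNAL,
    ("B", "D"): 2 * SHORT + INTERNAL,
    ("A", "D"): LONG + SHORT + INTERNAL,
    ("B", "C"): LONG + SHORT + INTERNAL,
}


def identity(d):
    """Distances that equal the true distances."""
    return d


def jc_p_distance(d):
    """Expected proportion of differing sites under Jukes-Cantor (concave in d)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def preferred_split(transform):
    """Return the split with the smallest four-point sum of transformed distances."""
    d = {pair: transform(x) for pair, x in TRUE_DISTANCES.items()}
    sums = {
        "AB|CD": d[("A", "B")] + d[("C", "D")],
        "AC|BD": d[("A", "C")] + d[("B", "D")],
        "AD|BC": d[("A", "D")] + d[("B", "C")],
    }
    return min(sums, key=sums.get)


if __name__ == "__main__":
    print("true distances      ->", preferred_split(identity))
    print("concave distortion  ->", preferred_split(jc_p_distance))
```

With these branch lengths the undistorted distances favour the true split, while the concave distortion favours grouping the two long branches together, which is the long branch attraction pattern. The convex and "twisted Farris-zone" cases in the paper are not covered by this sketch.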

Selection for Intermediate Genotypes Enables a Key Innovation in Phage Lambda

Alita Burmeister , Richard Lenski , Justin Meyer
doi: http://dx.doi.org/10.1101/018606

The evolution of qualitatively new functions is fundamental for shaping the diversity of life. Such innovations are rare because they require multiple coordinated changes. We sought to understand the evolutionary processes involved in a particular key innovation, whereby phage λ evolved the ability to exploit a novel receptor, OmpF, on the surface of Escherichia coli cells. Previous work has shown that this transition repeatedly evolves in the laboratory, despite requiring four mutations in specific regions of a single gene. Here we examine how this innovation evolved by studying six intermediate genotypes that arose during independent transitions to use OmpF. In particular, we tested whether these genotypes were favored by selection, and how a coevolved change in the hosts influenced the fitness of the phage genotypes. To do so, we measured the fitness of the intermediate types relative to the ancestral λ when competing for either ancestral or coevolved host cells. All six intermediates had improved fitness on at least one host, and four had higher fitness on the coevolved host than on the ancestral host. These results show that the evolution of the phage’s new ability to use OmpF was repeatable because the intermediate genotypes were adaptive and, in many cases, because coevolution of the host favored their emergence.

Proteins linked to autosomal dominant and autosomal recessive disorders harbor characteristic rare missense mutation distribution patterns

Tychele Turner , Christopher Douville , Dewey Kim , Peter D Stenson , David N Cooper , Aravinda Chakravarti , Rachel Karchin
doi: http://dx.doi.org/10.1101/018648

The role of rare missense variants in disease causation remains difficult to interpret. We explore whether the clustering pattern of rare missense variants (MAF<0.01) in a protein is associated with mode of inheritance. Mutations in genes associated with autosomal dominant (AD) conditions are known to result in either loss or gain of function, whereas mutations in genes associated with autosomal recessive (AR) conditions invariably result in loss of function. Loss-of-function mutations tend to be distributed uniformly along protein sequence, while gain-of-function mutations tend to localize to key regions. It has not previously been ascertained whether these patterns hold in general for rare missense mutations. We consider the extent to which rare missense variants are located within annotated protein domains and whether they form clusters, using a new unbiased method called CLUstering by Mutation Position (CLUMP). These approaches quantified a significant difference in clustering between AD and AR diseases. Proteins linked to AD diseases exhibited more clustering of rare missense mutations than those linked to AR diseases (Wilcoxon P=5.7×10⁻⁴, permutation P=8.4×10⁻⁴). Rare missense mutations in proteins linked to either AD or AR diseases were more clustered than controls (1000G) (Wilcoxon P=2.8×10⁻¹⁵ for AD and P=4.5×10⁻⁴ for AR, permutation P=3.1×10⁻¹² for AD and P=0.03 for AR). Differences in clustering patterns persisted even after removal of the most prominent genes. Testing for such non-random patterns may reveal novel aspects of disease etiology in large sample studies.