Successful asexual lineages of the Irish potato Famine pathogen are triploid

Ying Li, Qian Zhou, Kun Qian, Theo van der Lee, Sanwen Huang
doi: http://dx.doi.org/10.1101/024596
The oomycete Phytophthora infestans was the causal agent of the Irish Great Famine and is a recurring threat to global food security. The pathogen can reproduce both sexually and asexually and has the potential to adapt to both abiotic and biotic environments. Although the A1 and A2 mating types coexist in many regions, the vast majority of isolates belong to a few clonal, asexual lineages. Like other oomycetes, P. infestans is thought to be diploid during the vegetative phase of its life cycle, but trisomy has been observed to correlate with virulence and the mating type locus, and polyploidy can occur in some isolates. The frequency of polyploidy in nature and the relationship between ploidy level and sexuality remain unknown. Here we discovered that the sexuality of P. infestans isolates correlates with ploidy by comparing microsatellite fingerprinting, genome-wide polymorphism, DNA quantity, and chromosome numbers. The sexual progeny of P. infestans in nature are diploid, whereas the asexual lineages are mostly triploid, including the successful clonal lineages US-1 and 13_A2. This study reveals polyploidization as an extra evolutionary risk to this notorious plant destroyer.

S/HIC: Robust identification of soft and hard sweeps using machine learning

Daniel R Schrider, Andrew D Kern
doi: http://dx.doi.org/10.1101/024547
Detecting the targets of adaptive natural selection from whole genome sequencing data is a central problem for population genetics. However, to date most methods have shown sub-optimal performance under realistic demographic scenarios. Moreover, over the past decade there has been a renewed interest in determining the importance of selection from standing variation in adaptation of natural populations, yet very few methods for inferring this model of adaptation at the genome scale have been introduced. Here we introduce a new method, S/HIC, which uses supervised machine learning to precisely infer the location of both hard and soft selective sweeps. We show that S/HIC has unrivaled accuracy for detecting sweeps under demographic histories that are relevant to human populations, and distinguishing sweeps from linked as well as neutrally evolving regions. Moreover we show that S/HIC is uniquely robust among its competitors to model misspecification. Thus even if the true demographic model of a population differs catastrophically from that specified by the user, S/HIC still retains impressive discriminatory power. Finally we apply S/HIC to the case of resequencing data from human chromosome 18 in a European population sample and demonstrate that we can reliably recover selective sweeps that have been identified earlier using less specific and sensitive methods.

Genomic study of the Ket: a Paleo-Eskimo-related ethnic group with significant ancient North Eurasian ancestry

Pavel Flegontov, Piya Changmai, Anastassiya Zidkova, Maria D. Logacheva, Olga Flegontova, Mikhail S. Gelfand, Evgeny S. Gerasimov, Ekaterina V. Khrameeva, Olga P. Konovalova, Tatiana Neretina, Yuri V. Nikolsky, George Starostin, Vita V. Stepanova, Igor V. Travinsky, Martin Tříska, Petr Tříska, Tatiana V. Tatarinova
doi: http://dx.doi.org/10.1101/024554
The Kets, an ethnic group in the Yenisei River basin, Russia, are considered the last nomadic hunter-gatherers of Siberia, and the Ket language has no transparent affiliation with any language family. We investigated connections between the Kets and Siberian and North American populations, with emphasis on the Mal’ta and Paleo-Eskimo ancient genomes, using original data from 46 unrelated samples of Kets and 42 samples of their neighboring ethnic groups (Uralic-speaking Nganasans, Enets, and Selkups). We genotyped over 130,000 autosomal SNPs, determined mitochondrial and Y-chromosomal haplogroups, and performed high-coverage genome sequencing of two Ket individuals. We established that the Kets belong to the cluster of Siberian populations related to Paleo-Eskimos. Unlike other members of this cluster (Nganasans, Ulchi, Yukaghirs, and Evens), Kets and closely related Selkups have a high degree of Mal’ta ancestry. Implications of these findings for the linguistic hypothesis uniting Ket and Na-Dene languages into a language macrofamily are discussed.

Genomic analysis of allele-specific expression in the mouse liver

Ashutosh K Pandey, Robert W Williams
doi: http://dx.doi.org/10.1101/024588

Genetic differences in gene expression contribute significantly to phenotypic diversity and differences in disease susceptibility. In fact, the great majority of causal variants highlighted by genome-wide association are in non-coding regions that modulate expression. In order to quantify the extent of allelic differences in expression, we analyzed liver transcriptomes of isogenic F1 hybrid mice. Allele-specific expression (ASE) effects are pervasive and are detected in over 50% of assayed genes. Genes with strong ASE do not differ from those with no ASE with respect to their length or promoter complexity. However, they have a higher density of sequence variants, higher functional redundancy, and lower evolutionary conservation compared to genes with no ASE. Fifty percent of genes with no ASE are categorized as house-keeping genes. In contrast, the high ASE set may be critical in phenotype canalization. There is significant overlap between genes that exhibit ASE and those that exhibit strong cis expression quantitative trait loci (cis eQTLs) identified using large genetic expression data sets. Eighty percent of genes with cis eQTLs also have strong ASE effects. Conversely, 40% of genes with ASE effects are associated with strong cis eQTLs. Cis-acting variation detected at the protein level is also detected at the transcript level, but the converse is not true. ASE is a highly sensitive and direct method to quantify cis-acting variation in gene expression and complements and extends classic cis eQTL analysis. ASE differences can be combined with coding variants to produce a key resource of functional variants for precision medicine and genome-to-phenome mapping.
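
The abstract does not say how ASE was called, so the following is only an illustrative sketch of one common approach: an exact two-sided binomial test of whether reads from the two parental alleles of an F1 hybrid depart from a 50:50 ratio. The function name and the read counts are hypothetical and are not taken from the paper.

```python
from math import comb


def ase_binomial_pvalue(ref_reads: int, alt_reads: int) -> float:
    """Exact two-sided binomial test of allelic balance (null: ratio = 0.5).

    Assumption: reads at a heterozygous site are independent draws from
    the two alleles. This is an illustrative test, not the paper's method.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no informative reads, so no evidence of imbalance
    observed = comb(n, ref_reads)
    # Under ratio = 0.5 every outcome k has probability comb(n, k) / 2**n,
    # so "as or more extreme" means a coefficient no larger than observed.
    extreme = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return extreme / 2 ** n


# Hypothetical gene with 30 reads from one allele and 10 from the other.
print(ase_binomial_pvalue(30, 10))
```

In practice such per-gene p-values would still need correction for multiple testing and for mapping bias toward the reference allele, neither of which is handled here.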

Fast and efficient QTL mapper for thousands of molecular phenotypes

Halit Ongen, Alfonso Buil, Andrew Brown, Emmanouil Dermitzakis, Olivier Delaneau
doi: http://dx.doi.org/10.1101/022301

Motivation: In order to discover quantitative trait loci (QTLs), multi-dimensional genomic data sets combining DNA-seq and ChIP-/RNA-seq require methods that rapidly correlate tens of thousands of molecular phenotypes with millions of genetic variants while appropriately controlling for multiple testing. Results: We have developed FastQTL, a method that implements a popular cis-QTL mapping strategy in a user- and cluster-friendly tool. FastQTL also proposes an efficient permutation procedure to control for multiple testing. The outcome of permutations is modeled using beta distributions trained from a few permutations, from which adjusted p-values can be estimated at any level of significance with little computational cost. The Geuvadis & GTEx pilot data sets can now be easily analyzed an order of magnitude faster than with previous approaches. Availability: Source code, binaries and comprehensive documentation of FastQTL are freely available to download at http://fastqtl.sourceforge.net/
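
To make the beta-approximation idea concrete, here is a minimal sketch under a simplifying assumption that is not stated in the abstract: the smallest nominal p-value per phenotype under the null is modeled as Beta(1, n), where n acts as an effective number of independent tests. FastQTL's actual model and fitting procedure may differ, and the function names below are hypothetical and are not part of the FastQTL tool.

```python
import math


def fit_effective_tests(null_min_pvalues):
    """Maximum-likelihood estimate of n for a Beta(1, n) null model.

    null_min_pvalues: the smallest nominal p-value from each permutation,
    each in [0, 1). For Beta(1, n) the MLE is n = -m / sum(log(1 - p_i)).
    """
    m = len(null_min_pvalues)
    log_sum = sum(math.log1p(-p) for p in null_min_pvalues)
    if m == 0 or log_sum == 0.0:
        raise ValueError("need at least one permutation p-value above 0")
    return -m / log_sum


def adjusted_pvalue(nominal_p, n_eff):
    """Beta(1, n_eff) CDF at nominal_p, i.e. 1 - (1 - p) ** n_eff."""
    if nominal_p >= 1.0:
        return 1.0
    return -math.expm1(n_eff * math.log1p(-nominal_p))


# Hypothetical minimum p-values from five permutations of one phenotype.
null_minima = [0.12, 0.03, 0.21, 0.08, 0.05]
n_eff = fit_effective_tests(null_minima)
print(n_eff, adjusted_pvalue(0.001, n_eff))
```

The point of the sketch is the one the abstract makes: once the null distribution is fitted from a few permutations, an adjusted p-value can be evaluated for any nominal p-value, however small, without running more permutations.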

Impact of the X chromosome and sex on regulatory variation

Kimberly R Kukurba, Princy Parsana, Kevin S Smith, Zachary Zappala, David A Knowles, Marie-Julie Favé, Xin Li, Xiaowei Zhu, James B Potash, Myrna M Weissman, Jianxin Shi, Anshul Kundaje, Douglas F Levinson, Philip Awadalla, Sara Mostafavi, Alexis Battle, Stephen B Montgomery
doi: http://dx.doi.org/10.1101/024117

The X chromosome, with its unique mode of inheritance, contributes to differences between the sexes at a molecular level, including sex-specific gene expression and sex-specific impact of genetic variation. We have conducted an analysis of the impact of both sex and the X chromosome on patterns of gene expression identified through transcriptome sequencing of whole blood from 922 individuals. We found that genes on the X chromosome are more likely to have sex-specific expression than autosomal genes. Furthermore, we identified a depletion of regulatory variants on the X chromosome, especially among genes under high selective constraint. In contrast, we discovered an enrichment of sex-specific regulatory variants on the X chromosome. To resolve the molecular mechanisms underlying such effects, we generated and connected sex-specific chromatin accessibility to sex-specific expression and regulatory variation. As sex-specific regulatory variants can inform sex differences in genetic disease prevalence, we have integrated our data with genome-wide association study data for multiple immune traits to identify traits with significant sex biases. Together, our study provides genome-wide insight into how the X chromosome and sex shape human gene regulation and disease.

Genome variation and meiotic recombination in Plasmodium falciparum: insights from deep sequencing of genetic crosses

Alistair Miles, Zamin Iqbal, Paul Vauterin, Richard Pearson, Susana Campino, Michel Theron, Kelda Gould, Daniel Mead, Eleanor Drury, John O’Brien, Valentin Ruano Rubio, Bronwyn MacInnis, Jonathan Mwangi, Upeka Samarakoon, Lisa Ranford-Cartwright, Michael Ferdig, Karen Hayton, Xinzhuan Su, Thomas Wellems, Julian Rayner, Gil McVean, Dominic Kwiatkowski
doi: http://dx.doi.org/10.1101/024182

The malaria parasite Plasmodium falciparum has a great capacity for evolutionary adaptation to evade host immunity and develop drug resistance. Current understanding of parasite evolution is impeded by the fact that a large fraction of the genome is either highly repetitive or highly variable, and thus difficult to analyse using short read technologies. Here we describe a resource of deep sequencing data on parents and progeny from genetic crosses, which has enabled us to perform the first integrated analysis of SNP, INDEL and complex polymorphisms, using Mendelian error rates as an indicator of genotypic accuracy. These data reveal that INDELs are exceptionally abundant and the dominant mode of polymorphism within the core genome. We analyse patterns of meiotic recombination, including the relative contribution of crossover and non-crossover events, and we observe several instances of recombination that modify copy number variants associated with drug resistance. We describe a novel web application that allows these data to be explored in detail.
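
The use of Mendelian error rates as an accuracy indicator can be illustrated with a short sketch. It assumes haploid genotype calls, so each progeny allele in a cross should match one of the two parents, and it counts a call that matches neither parent as an error. The data layout and function name are hypothetical, and the authors' actual pipeline is not described in the abstract.

```python
def mendelian_error_rate(parent_a, parent_b, progeny):
    """Fraction of called progeny genotypes that match neither parent.

    Each argument is a list of haploid allele calls, one per site, in the
    same site order. None marks a missing call. Sites with any missing
    call are skipped (an assumption made for this sketch).
    """
    errors = 0
    called = 0
    for a, b, child in zip(parent_a, parent_b, progeny):
        if a is None or b is None or child is None:
            continue
        called += 1
        if child != a and child != b:
            errors += 1  # allele not inherited from either parent
    if called == 0:
        raise ValueError("no sites with calls in both parents and progeny")
    return errors / called


# Hypothetical calls at four sites; the third site is a Mendelian error.
rate = mendelian_error_rate(
    ["A", "C", "G", "T"],
    ["A", "T", "G", "C"],
    ["A", "T", "A", None],
)
print(rate)
```

A low rate across many progeny suggests the variant calls are accurate. A high rate points to genotyping error, since true de novo mutations are expected to be rare.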