Strong Purifying Selection at Synonymous Sites in D. melanogaster

David S. Lawrie, Philipp W. Messer, Ruth Hershberg, Dmitri A. Petrov
(Submitted on 15 Jan 2013)

Synonymous sites are generally assumed to be subject to weak selective constraint. For this reason, they are often neglected as a possible source of important functional variation. We use site frequency spectra from deep population sequencing data to show that, contrary to this expectation, 22% of four-fold synonymous (4D) sites in D. melanogaster evolve under very strong selective constraint while few, if any, appear to be under weak constraint. Linking polymorphism with divergence data, we further find that the fraction of synonymous sites exposed to strong purifying selection is higher for those positions that show slower evolution on the Drosophila phylogeny. The function underlying the inferred strong constraint appears to be separate from splicing enhancers, nucleosome positioning, and the translational optimization generating canonical codon bias. The fraction of synonymous sites under strong constraint within a gene correlates well with gene expression, particularly in the mid-late embryo, pupae, and adult developmental stages. Genes enriched in strongly constrained synonymous sites tend to be particularly functionally important and are often involved in key developmental pathways. Given that the observed widespread constraint acting on synonymous sites is likely not limited to Drosophila, the role of synonymous sites in genetic disease and adaptation should be reevaluated.
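As a rough illustration of the data summary the abstract refers to, a site frequency spectrum (SFS) simply counts how many polymorphic sites carry the derived allele on 1, 2, ..., n-1 of the n sampled chromosomes. The sketch below is not the authors' inference method, which the abstract does not describe; the function name and the toy counts are hypothetical.

```python
from collections import Counter


def site_frequency_spectrum(derived_counts, n_chromosomes):
    """Unfolded SFS: entry i-1 is the number of sites where the derived
    allele appears on exactly i of the n sampled chromosomes (i = 1..n-1).
    Sites with 0 or n derived copies are monomorphic and are ignored."""
    counts = Counter(derived_counts)
    return [counts.get(i, 0) for i in range(1, n_chromosomes)]


# Hypothetical toy data: derived-allele counts at 10 sites, 6 chromosomes sampled.
derived_counts = [1, 1, 2, 1, 5, 3, 1, 2, 0, 6]
sfs = site_frequency_spectrum(derived_counts, n_chromosomes=6)
print(sfs)
```

Purifying selection tends to shift such a spectrum toward rare variants, which is why the shape of the SFS carries information about constraint.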

A genome-wide survey of genetic variation in gorillas using reduced representation sequencing

Aylwyn Scally, Bryndis Yngvadottir, Yali Xue, Qasim Ayub, Richard Durbin, Chris Tyler-Smith
(Submitted on 9 Jan 2013)

All non-human great apes are endangered in the wild, and it is therefore important to gain an understanding of their demography and genetic diversity. To date, however, genetic studies within these species have largely been confined to mitochondrial DNA and a small number of other loci. Here, we present a genome-wide survey of genetic variation in gorillas using a reduced representation sequencing approach, focusing on the two lowland subspecies. We identify 3,274,491 polymorphic sites in 14 individuals: 12 western lowland gorillas (Gorilla gorilla gorilla) and 2 eastern lowland gorillas (Gorilla beringei graueri). We find that the two species are genetically distinct, based on levels of heterozygosity and patterns of allele sharing. Focusing on the western lowland population, we observe evidence for population substructure, and a deficit of rare genetic variants suggesting a recent episode of population contraction. In western lowland gorillas, there is an elevation of variation towards telomeres and centromeres on the chromosomal scale. On a finer scale, we find substantial variation in genetic diversity, including a marked reduction close to the major histocompatibility locus, perhaps indicative of recent strong selection there. These findings suggest that despite their maintaining an overall level of genetic diversity equal to or greater than that of humans, population decline, perhaps associated with disease, has been a significant factor in recent and long-term pressures on wild gorilla populations.

The GenoChip: A New Tool for Genetic Anthropology

Eran Elhaik, Elliott Greenspan, Sean Staats, Thomas Krahn, Chris Tyler-Smith, Yali Xue, Sergio Tofanelli, Paolo Francalacci, Francesco Cucca, Luca Pagani, Li Jin, Hui Li, Theodore G. Schurr, Bennett Greenspan, R. Spencer Wells, the Genographic Consortium
(Submitted on 17 Dec 2012)

The Genographic Project is an international effort using genetic data to chart human migratory history. The project is non-profit and non-medical, and through its Legacy Fund supports locally led efforts to preserve indigenous and traditional cultures. In its second phase, the project is focusing on markers from across the entire genome to obtain a more complete understanding of human genetic variation. Although many commercial arrays exist for genome-wide SNP genotyping, they were designed for medical genetic studies and contain medically related markers that are not appropriate for global population genetic studies. GenoChip, the Genographic Project’s new genotyping array, was designed to resolve these issues and enable higher-resolution research into outstanding questions in genetic anthropology. We developed novel methods to identify ancestry informative markers (AIMs) and genomic regions that may be enriched with alleles shared with ancestral hominins. Overall, we collected and ascertained AIMs from over 450 populations. Containing an unprecedented number of Y-chromosomal and mtDNA SNPs and over 130,000 SNPs from the autosomes and X-chromosome, the chip was carefully vetted to avoid inclusion of medically relevant markers. The GenoChip results were successfully validated. To demonstrate its capabilities, we compared the FST distributions of GenoChip SNPs to those of two commercial arrays for three continental populations. While all arrays yielded similarly shaped (inverse J) FST distributions, the GenoChip autosomal and X-chromosomal distributions had the highest mean FST, attesting to its ability to discern subpopulations. The GenoChip is a dedicated genotyping platform for genetic anthropology and promises to be the most powerful tool available for assessing population structure and migration history.
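For readers unfamiliar with FST, the sketch below computes a simple per-SNP Wright's FST from allele frequencies in two equally weighted populations, then averages it over SNPs. This is only one of several FST estimators and is an assumption for illustration; the abstract does not say which estimator the authors used, and their comparison involved three continental populations rather than two. The function name and frequencies are hypothetical.

```python
def wright_fst(p1, p2):
    """Per-SNP FST = (H_T - H_S) / H_T for two equally weighted populations,
    where p1 and p2 are the frequencies of the same allele in each population.
    Returns None when the SNP is monomorphic overall (H_T == 0)."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                      # expected heterozygosity, pooled
    if h_t == 0:
        return None
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population heterozygosity
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies for four SNPs in two populations.
snps = [(0.1, 0.9), (0.5, 0.5), (0.2, 0.4), (1.0, 1.0)]
values = [f for f in (wright_fst(a, b) for a, b in snps) if f is not None]
mean_fst = sum(values) / len(values)
print(values, mean_fst)
```

A higher mean FST across an array's SNPs indicates that its markers differ more in frequency between populations, which is the sense in which the abstract says the array can discern subpopulations.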

Comment on “Evidence of Abundant and Purifying Selection in Humans for Recently Acquired Regulatory Functions”

Nicolas Bray, Lior Pachter
(Submitted on 13 Dec 2012)

Ward and Kellis (Reports, September 5 2012) identify regulatory regions in the human genome exhibiting lineage-specific constraint and estimate the extent of purifying selection. There is no statistical rationale for the examples they highlight, and their estimates of the fraction of the genome under constraint are biased by arbitrary designations of completely constrained regions.

Detection of selective sweeps in cattle using genome-wide SNP data

Holly R. Ramey, Jared E. Decker, Stephanie D. McKay, Megan M. Rolf, Robert D. Schnabel, Jeremy F. Taylor
(Submitted on 11 Dec 2012)

The domestication and subsequent selection by humans to create breeds of cattle undoubtedly altered the patterning of variation within their genomes. Strong selection to fix advantageous large-effect mutations underlying domesticability, breed characteristics or productivity created selective sweeps in which variation was lost in the chromosomal region flanking the selected allele. Selective sweeps have been identified in the genomes of many species including humans, dogs, horses, and chickens. We attempt to identify regions of the bovine genome that have been subjected to selective sweeps. Two datasets were used for the discovery and validation of selective sweeps via the fixation of alleles at a series of contiguous SNP loci. BovineSNP50 data were used to identify 28 putative sweep regions among 14 cattle breeds. Affymetrix BOS 1 prescreening assay data for five breeds were used to identify 114 regions and validate 5 regions identified using the BovineSNP50 data. Many genes are located within these regions; however, phenotypes that we predict to have historically been under strong selection include horned-polled, coat color, stature, ear morphology, and behavior. The identified selective sweeps represent recent events associated with breed formation rather than ancient events associated with domestication. No sweep regions were shared between indicine and taurine breeds reflecting their divergent selection histories. A primary finding of this study is the sensitivity of results to assay resolution. Despite the bias towards common SNPs in the BovineSNP50 design, false positive sweep regions appear to be common due to the limited resolution of the assay. This assay design bias leads to the detection of breed-specific sweep regions, or regions shared by a small number of breeds, restricting the suite of selected phenotypes detected to primarily those associated with breed characteristics.
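The abstract describes calling sweeps from the fixation of alleles at a series of contiguous SNP loci. A minimal sketch of that idea, assuming a list of within-breed allele frequencies ordered along a chromosome, is to scan for runs of consecutive fixed SNPs. The run-length threshold, the tolerance, and the function name are hypothetical and are not taken from the paper.

```python
def fixed_runs(allele_freqs, min_run=3, tol=0.0):
    """Return (start, end) index pairs, inclusive, for every run of at least
    `min_run` consecutive SNPs whose allele frequency is fixed, i.e.
    <= tol or >= 1 - tol."""
    runs = []
    start = None
    for i, freq in enumerate(allele_freqs):
        fixed = freq <= tol or freq >= 1.0 - tol
        if fixed and start is None:
            start = i                        # a new run of fixed SNPs begins
        elif not fixed and start is not None:
            if i - start >= min_run:
                runs.append((start, i - 1))  # run ended at the previous SNP
            start = None
    # Close a run that extends to the end of the chromosome.
    if start is not None and len(allele_freqs) - start >= min_run:
        runs.append((start, len(allele_freqs) - 1))
    return runs


# Hypothetical allele frequencies at 12 ordered SNPs within one breed.
freqs = [0.4, 1.0, 1.0, 0.0, 1.0, 0.3, 0.0, 0.0, 0.7, 1.0, 1.0, 1.0]
runs = fixed_runs(freqs, min_run=3)
print(runs)
```

With a sparse assay, a short run of fixed SNPs can arise by chance, which is consistent with the abstract's point that results are sensitive to assay resolution.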

Our paper: Oh sister, where art thou? Indirect fitness benefit could maintain a host defense trait

This guest post is by Pleuni Pennings on the paper “Oh sister, where art thou? Indirect fitness benefit could maintain a host defense trait”, available from the arXiv here. This is cross-posted from her website here.

Tobias Pamminger, Susanne Foitzik, Dirk Metzler and I analyzed the small scale spatial structure of ants of the species Temnothorax longispinosus. These ants are the host of a slavemaking ant. The slavemakers go on raids, and steal young from the host species to work as slaves in their nests. We wanted to know whether the slaves still have relatives in the nearby nests. If they do, then their behavior – which influences the slavemakers – could have an effect on their relatives and therefore on their indirect fitness.

To find out if slaves are related to their neighbours, we collected lots of ant nests (they nest in acorns), both in New York and in West Virginia, marked exactly where we found them and genotyped them at six microsatellites.

[Photograph by Andreas Gros: Temnothorax longispinosus in acorn]

We put little flags at the exact location of an ant nest to measure the distances between the nests.

[Figure: microsatellite data]

This is one of the figures from the manuscript. Plot R (from West Virginia) is shown to demonstrate the distribution of colonies within a plot and to show the distribution of alleles of one of the six microsatellite loci (GT1) among colonies. Each colony is represented by a pie-diagram with the frequencies of different GT1 alleles amongst the genotyped individuals of the colony. R3 is a slavemaker nest (we genotyped the slaves, not the slavemakers) and shares most of its alleles with the free nest R7. R13 and R15 are free living host colonies in close proximity and appear to be related.

Our main conclusion is that the enslaved ants are indeed related to their neighbors. The manuscript can be found on the arXiv here: http://arxiv.org/abs/1212.0790

The manuscript was peer-reviewed at Peerage of Science, a new and very useful community of scientists who agree to review each other's papers fairly. See http://www.peerageofscience.org/

The manuscript is part of Tobias Pamminger’s PhD thesis. Tobias defends his thesis this week in Mainz!! Congrats Tobias!

Tobias came up with the awesome title for the paper “Oh sister, where art thou? Indirect fitness benefit could maintain a host defense trait.”

Reconstructing Roma history from genome-wide data

Priya Moorjani, Nick Patterson, Po-Ru Loh, Mark Lipson, Péter Kisfali, Bela I Melegh, Michael Bonin, Ľudevít Kádaši, Olaf Rieß, Bonnie Berger, David Reich, Béla Melegh
(Submitted on 7 Dec 2012)

The Roma people, living throughout Europe, are a diverse population linked by the Romani language and culture. Previous linguistic and genetic studies have suggested that the Roma migrated into Europe from South Asia about 1000-1500 years ago. Genetic inferences about Roma history have mostly focused on the Y chromosome and mitochondrial DNA. To explore what additional information can be learned from genome-wide data, we analyzed data from six Roma groups that we genotyped at hundreds of thousands of single nucleotide polymorphisms (SNPs). We estimate that the Roma harbor about 80% West Eurasian ancestry, deriving from a combination of European and South Asian sources, and that the date of admixture of South Asian and European ancestry was about 850 years ago. We provide evidence for Eastern Europe being a major source of European ancestry, and North-west India being a major source of the South Asian ancestry in the Roma. By computing allele sharing as a measure of linkage disequilibrium, we estimate that the migration of Roma out of the Indian subcontinent was accompanied by a severe founder event, which we hypothesize was followed by a major demographic expansion once the population arrived in Europe.

Oh sister, where art thou? Indirect fitness benefit could maintain a host defense trait

Tobias Pamminger, Susanne Foitzik, Dirk Metzler, Pleuni S. Pennings
(Submitted on 4 Dec 2012)

Population structure can affect the evolution of parasite virulence and host defense, a hypothesis that has been confirmed by studies focusing on large spatial scales. In contrast, we examine the small scale population structure of a host species and investigate whether it could explain the evolution of a defense trait against slavemaking ants. Slavemaking ants steal worker brood from host colonies, which will later serve as slaves to rear parasite offspring. The host species Temnothorax longispinosus has evolved an effective post-enslavement defense mechanism; instead of taking care of the slavemaker young, these slaves kill a high proportion of the parasite offspring. Because slaves never reproduce, they were thought to be trapped in an evolutionary dead end without the possibility of evolving such defense traits. Using detailed microsatellite data on a small spatial scale we can demonstrate that slaves can gain indirect fitness benefits by reducing parasite pressure on nearby host colonies, because these are often closely related to the slaves. Our genetic analyses indicate that polydomy, i.e., the occupation of several nest sites by a single colony, is sufficient to explain the elevated relatedness values between slaves and the surrounding host colonies, which may benefit from the slaves’ rebellion behavior.

Deep-sequencing of the Peach Latent Mosaic Viroid Reveals New Aspects of Population Heterogeneity

Jean-Pierre Sehi Glouzon, François Bolduc, Rafael Najmanovich, Shengrui Wang, Jean-Pierre Perreault
(Submitted on 3 Dec 2012)

Viroids are small circular single-stranded infectious RNAs that are characterized by a relatively high mutation level. Knowledge of their sequence heterogeneity remains largely elusive, and, as yet, no strategy attempting to address this question from a population dynamics point of view is in place. In order to address these important questions, a GF305 indicator peach tree was infected with a single variant of the Avsunviroidae family member Peach latent mosaic viroid (PLMVd). Six months post-inoculation, full-length circular conformers of PLMVd were isolated, deep-sequenced and the resulting sequences analyzed using an original bioinformatics scheme specifically designed and developed in order to evaluate the richness of a given sequence population. Two distinct libraries were analyzed, and yielded 1125 and 1061 different PLMVd variants respectively, making this study the most productive to date (by more than an order of magnitude) in terms of the reporting of novel viroid sequences. Sequence variants exhibiting up to ~20% of mutations relative to the inoculated viroid were retrieved, clearly illustrating the high divergence dynamic inside a unique population. Using a novel hierarchical clustering algorithm, the different variants obtained were grouped into either 7 or 8 clusters depending on the library being analyzed. Most of the sequences contained, on average, between 4.6 and 6.3 mutations relative to the variant used initially to inoculate the plant. Interestingly, it was possible to reconstitute the sequence evolution between these clusters. On top of providing a reliable pipeline for the treatment of viroid deep-sequencing, this study sheds new light on the importance of the sequence variation that may take place in a viroid population and which may result in the formation of a quasi-species.

ZRT1 harbors an excess of nonsynonymous polymorphism and shows evidence of balancing selection in Saccharomyces cerevisiae

Elizabeth K. Engle, Justin C. Fay
(Submitted on 1 Dec 2012)

Estimates of the fraction of nucleotide substitutions driven by positive selection vary widely across different species. Accounting for different estimates of positive selection has been difficult, in part because selection on polymorphism within a species is known to obscure a signal of positive selection between species. While methods have been developed to control for the confounding effects of negative selection against deleterious polymorphism, the impact of balancing selection on estimates of positive selection has not been assessed. In Saccharomyces cerevisiae, there is no signal of positive selection within protein coding sequences as the ratio of nonsynonymous to synonymous polymorphism is higher than that of divergence. To investigate the impact of balancing selection on estimates of positive selection we examined five genes with high rates of nonsynonymous polymorphism in S. cerevisiae relative to divergence from S. paradoxus. One of the genes, a high affinity zinc transporter ZRT1, shows an elevated rate of synonymous polymorphism indicative of balancing selection. The high rate of synonymous polymorphism coincides with nonsynonymous divergence between three haplotype groups, which we find to be functionally indistinguishable. We conclude that balancing selection is not likely to be a common cause of genes harboring a large excess of nonsynonymous polymorphism in yeast.
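The comparison of nonsynonymous to synonymous ratios in polymorphism and in divergence can be made concrete with the neutrality index from the McDonald-Kreitman framework. The sketch below is a generic illustration with hypothetical counts, not the authors' analysis or their data.

```python
def neutrality_index(pn, ps, dn, ds):
    """Neutrality index NI = (Pn / Ps) / (Dn / Ds), computed as
    (Pn * Ds) / (Ps * Dn), where Pn and Ps are counts of nonsynonymous and
    synonymous polymorphisms, and Dn and Ds are the corresponding counts of
    fixed differences between species."""
    if ps == 0 or dn == 0:
        raise ValueError("Ps and Dn must be non-zero to compute NI")
    return (pn * ds) / (ps * dn)


# Hypothetical counts for one gene.
ni = neutrality_index(pn=20, ps=40, dn=10, ds=50)
alpha = 1 - ni  # estimated fraction of substitutions driven by positive selection
print(ni, alpha)
```

An NI above 1 means the nonsynonymous to synonymous ratio is higher in polymorphism than in divergence, the pattern the abstract reports for S. cerevisiae, and it yields a negative estimate of the fraction of adaptive substitutions.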