A pooling-based approach to mapping genetic variants associated with DNA methylation

Irene Miriam Kaplow, Julia L MacIsaac, Sarah M Mah, Lisa M McEwen, Michael S Kobor, Hunter B Fraser
doi: http://dx.doi.org/10.1101/013649

DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover less than 2% of the genome and cannot account for allele-specific methylation (ASM). Other studies have performed whole-genome bisulfite sequencing on a few individuals, but these lack statistical power to identify variants associated with DNA methylation. We present a novel approach in which bisulfite-treated DNA from many individuals is sequenced together in a single pool, resulting in a truly genome-wide map of DNA methylation. Compared to methods that do not account for ASM, our approach increases statistical power to detect associations while sharply reducing cost, effort, and experimental variability. As a proof of concept, we generated deep sequencing data from a pool of 60 human cell lines; we evaluated almost twice as many CpGs as the largest microarray studies and identified over 2,000 genetic variants associated with DNA methylation. We found that these variants are highly enriched for associations with chromatin accessibility and CTCF binding but are less likely to be associated with traits indirectly linked to DNA, such as gene expression and disease phenotypes. In summary, our approach allows genome-wide mapping of genetic variants associated with DNA methylation in any tissue of any species, without the need for individual-level genotype or methylation data.
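To make the pooled design concrete: a bisulfite read that covers both a heterozygous variant and a nearby CpG reports an allele and a methylation state together, so pooled reads can be tabulated by allele and methylation status without knowing which individual they came from. The abstract does not say which statistical test is used, so the sketch below is only an assumed illustration using a two-sided Fisher's exact test on a 2x2 table of read counts. The function name and the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical, not taken from the abstract):
        a = reads with allele 1, CpG methylated
        b = reads with allele 1, CpG unmethylated
        c = reads with allele 2, CpG methylated
        d = reads with allele 2, CpG unmethylated
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when the row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more probable than the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Hypothetical pooled read counts at one variant/CpG pair.
p_linked = fisher_exact_two_sided(40, 10, 12, 38)
p_unlinked = fisher_exact_two_sided(25, 25, 24, 26)
```

With these made-up counts the first pair shows a strong allele-methylation association and the second does not; a real analysis would also need to handle bisulfite conversion at the variant itself, sequencing errors, and multiple testing.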

SWS2 visual pigment evolution as a test of historically contingent patterns of plumage color evolution in Warblers

Natasha Bloch, James M Morrow, Belinda SW Chang, Trevor D Price
doi: http://dx.doi.org/10.1101/013573

Distantly related clades that occupy similar environments may differ due to the lasting imprint of their ancestors – historical contingency. The New World warblers (Parulidae) and Old World warblers (Phylloscopidae) are ecologically similar clades that differ strikingly in plumage coloration. We studied genetic and functional evolution of the short-wavelength sensitive visual pigments (SWS2 and SWS1) to ask if altered color perception could contribute to the plumage color differences between clades. We show SWS2 is short-wavelength shifted in birds that occupy open environments, such as finches, compared to those in closed environments, including warblers. Sequencing of opsin genes and phylogenetic reconstructions indicate New World warblers were derived from a finch-like form that colonized from the Old World 15-20 Ma. During this process the SWS2 gene accumulated 6 substitutions in branches leading to New World warblers, inviting the hypothesis that passage through a finch-like ancestor resulted in SWS2 evolution. In fact, we show spectral tuning remained similar across warblers as well as the finch ancestor. Results reject the hypothesis of historical contingency based on opsin spectral tuning, but point to evolution of other aspects of visual pigment function. Using the approach outlined here, historical contingency becomes a generally testable theory in systems where genotype and phenotype can be connected.

Independent molecular basis of convergent highland adaptation in maize

Shohei Takuno, Peter Ralph, Kelly Swarts, Rob J Elshire, Jeffrey C Glaubitz, Edward S. Buckler, Matthew B Hufford, Jeff Ross-Ibarra
doi: http://dx.doi.org/10.1101/013607

Convergent evolution occurs when multiple species/subpopulations adapt to similar environments via similar phenotypes. We investigate here the molecular basis of convergent adaptation in maize to highland climates in Mexico and South America using genome-wide SNP data. Taking advantage of archaeological data on the arrival of maize to the highlands, we infer demographic models for both populations, identifying evidence of a strong bottleneck and rapid expansion in South America. We use these models to then identify loci showing an excess of differentiation as a means of identifying putative targets of natural selection, and compare our results to expectations from recently developed theory on convergent adaptation. Consistent with predictions across a wide array of parameter space, we see limited evidence for convergent evolution at the nucleotide level in spite of strong similarities in overall phenotypes. Instead, we show that selection appears to have predominantly acted on standing genetic variation, and that introgression from wild teosinte populations appears to have played a role in highland adaptation in Mexican maize.

A Spatial Framework for Understanding Population Structure and Admixture.

Gideon Bradburd, Peter L. Ralph, Graham Coop
doi: http://dx.doi.org/10.1101/013474

Geographic patterns of genetic variation within modern populations, produced by complex histories of migration, can be difficult to infer and visually summarize. A general consequence of geographically limited dispersal is that samples from nearby locations tend to be more closely related than samples from distant locations, and so genetic covariance often recapitulates geographic proximity. We use genome-wide polymorphism data to build “geogenetic maps”, which, when applied to stationary populations, produce a map of the geographic positions of the populations, but with distances distorted to reflect historical rates of gene flow. In the underlying model, allele frequency covariance is a decreasing function of geogenetic distance, and nonlocal gene flow such as admixture can be identified as anomalously strong covariance over long distances. This admixture is explicitly co-estimated and depicted as arrows, from the source of admixture to the recipient, on the geogenetic map. We demonstrate the utility of this method on a circum-Tibetan sampling of the greenish warbler (Phylloscopus trochiloides), in which we find evidence for gene flow between the adjacent, terminal populations of the ring species. We also analyze a global sampling of human populations, for which we largely recover the geography of the sampling, with support for significant histories of admixture in many samples. This new tool for understanding and visualizing patterns of population structure is implemented in a Bayesian framework in the program SpaceMix.

The effect of the dispersal kernel on isolation-by-distance in a continuous population


Tara N. Furstenau, Reed A. Cartwright
Comments: 18 pages (main); 4 pages (supp)
Subjects: Populations and Evolution (q-bio.PE)

Under models of isolation-by-distance, population structure is determined by the probability of identity-by-descent between pairs of genes according to the geographic distance between them. Well-established analytical results indicate that the relationship between geographical and genetic distance depends mostly on the neighborhood size of the population, $N_b = 4{\pi}{\sigma}^2 D_e$, which represents a standardized measure of dispersal. To test this prediction, we model local dispersal of haploid individuals on a two-dimensional torus using four dispersal kernels: Rayleigh, exponential, half-normal and triangular. When neighborhood size is held constant, the distributions produce similar patterns of isolation-by-distance, confirming predictions. Considering this, we propose that the triangular distribution is the appropriate null distribution for isolation-by-distance studies. Under the triangular distribution, dispersal is uniform within an area of $4{\pi}{\sigma}^2$ (i.e. the neighborhood area), which suggests that the common description of neighborhood size as a measure of a local panmictic population is valid for popular families of dispersal distributions. We further show how to draw from the triangular distribution efficiently and argue that it should be utilized in other studies in which computational efficiency is important.
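As a rough illustration of the neighborhood-size formula and of dispersal that is uniform within an area of $4{\pi}{\sigma}^2$, here is a minimal Python sketch. It assumes that ${\sigma}^2$ is the variance of parent-offspring displacement along one axis and that direction is uniform; the function names are hypothetical. The sampler uses a standard inverse-CDF draw for a distance density that rises linearly up to $2{\sigma}$, which is not necessarily the efficient procedure the authors describe.

```python
import math
import random


def neighborhood_size(sigma, density):
    """Neighborhood size N_b = 4 * pi * sigma^2 * D_e."""
    return 4.0 * math.pi * sigma ** 2 * density


def draw_triangular_offset(sigma, rng):
    """Draw an (x, y) offset uniformly over a disc of area 4 * pi * sigma^2.

    The disc radius is 2 * sigma. The distance r then has the linearly
    increasing (triangular) density f(r) = 2r / radius^2 on [0, radius],
    which the inverse-CDF transform r = radius * sqrt(U) reproduces.
    """
    radius = 2.0 * sigma
    r = radius * math.sqrt(rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return r * math.cos(theta), r * math.sin(theta)


def axial_variance(offsets):
    """Empirical variance of the x-coordinates of the offsets."""
    xs = [x for x, _ in offsets]
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


rng = random.Random(1)
sigma = 1.5
offsets = [draw_triangular_offset(sigma, rng) for _ in range(100_000)]
```

Under these assumptions the mean squared distance is $2{\sigma}^2$, so the variance along each axis should come out close to ${\sigma}^2$, which is the quantity that enters $N_b$.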

Geographic range size is predicted by plant mating system

Dena Grossenbacher, Ryan Briscoe Runquist, Emma Goldberg, Yaniv Brandvain
doi: http://dx.doi.org/10.1101/013417

Species ranges vary enormously, and even closest relatives may differ in range size by several orders of magnitude. With data from hundreds of species spanning 20 genera and generic sections, we show that plant species that autonomously reproduce via self-pollination consistently have larger geographic ranges than their close relatives that generally require two parents for reproduction. Further analyses strongly implicate autonomous fertilization in causing this relationship, as it is not driven by traits such as polyploidy or annual life history whose evolution is sometimes correlated with the transition to autonomous self-fertilization. Furthermore, we find that selfers occur at higher maximum latitudes and that disparity in range size between selfers and outcrossers increases with time since their separation. Together, these results show that autonomous reproduction – a critical biological trait that eliminates mate limitation and thus potentially increases the probability of establishment – increases range size.

Sifting through 2014 on Haldane’s Sieve

2014 was the second full year of Haldane’s Sieve, which we started in 2012 to bring attention to preprints in population and evolutionary genetics. This year we had over 100,000 visitors from across the globe; the most viewed posts were: