An annotated consensus genetic map for Pinus taeda L. and extent of linkage disequilibrium in three genotype-phenotype discovery populations

Jared W. Westbrook, Vikram E. Chhatre, Le-Shin Wu, Srikar Chamala, Leandro Gomide Neves, Patricio Muñoz, Pedro J Martínez-García, David B. Neale, Matias Kirst, Keithanne Mockaitis, C. Dana Nelson, Gary F. Peter, John M. Davis, Craig S. Echt
doi: http://dx.doi.org/10.1101/012625

A consensus genetic map for Pinus taeda (loblolly pine) was constructed by merging three previously published maps with a map from a pseudo-backcross between P. taeda and P. elliottii (slash pine). The consensus map positioned 4981 markers via genotyping of 1251 individuals from four pedigrees. It is the densest linkage map for a conifer to date. Average marker spacing was 0.48 centiMorgans and total map length was 2372 centiMorgans. Functional predictions for 4762 markers for expressed sequence tags were improved by alignment to full-length P. taeda transcripts. Alignments to the P. taeda genome mapped 4225 scaffold sequences onto linkage groups. The consensus genetic map was used to compare the extent of genome-wide linkage disequilibrium in an association population of distantly related P. taeda individuals (ADEPT2), a multiple-family pedigree used for genomic selection studies (CCLONES), and a full-sib quantitative trait locus mapping population (BC1). Weak linkage disequilibrium was observed in CCLONES and ADEPT2. The average squared correlation, R2, between genotypes at SNPs less than one centiMorgan apart was less than 0.05 in both populations, and R2 did not decay substantially with genetic distance. By contrast, strong and extended linkage disequilibrium was observed among BC1 full-sibs, where average R2 decayed from 0.8 to less than 0.1 over 53 centiMorgans. The consensus map and analysis of linkage disequilibrium establish a foundation for comparative association and quantitative trait locus mapping between genotype-phenotype discovery populations.
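
The linkage disequilibrium measure in this abstract, the squared correlation between genotypes at pairs of SNPs, can be illustrated with a minimal Python sketch. It assumes genotypes coded as 0/1/2 allele counts and map positions in centiMorgans on a single linkage group. The function names and toy data are hypothetical and are not taken from the preprint, whose actual pipeline may differ.

```python
from itertools import combinations


def genotype_r2(g1, g2):
    """Squared Pearson correlation between two genotype vectors.

    Genotypes are assumed to be allele counts (0, 1 or 2), one per individual.
    Returns None when either SNP is monomorphic (correlation undefined).
    """
    n = len(g1)
    m1 = sum(g1) / n
    m2 = sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        return None
    return cov * cov / (v1 * v2)


def mean_r2_within(genotypes, positions_cm, max_dist_cm):
    """Average r2 over all SNP pairs closer than max_dist_cm.

    genotypes[i] is the genotype vector of SNP i and positions_cm[i] its
    map position; all SNPs are assumed to lie on one linkage group.
    """
    values = []
    for i, j in combinations(range(len(positions_cm)), 2):
        if abs(positions_cm[i] - positions_cm[j]) < max_dist_cm:
            r2 = genotype_r2(genotypes[i], genotypes[j])
            if r2 is not None:
                values.append(r2)
    return sum(values) / len(values) if values else None


# Toy data (hypothetical): six individuals, three SNPs.
snp_a = [0, 1, 2, 0, 1, 2]
snp_b = [0, 1, 2, 0, 1, 2]  # identical to snp_a: complete LD
snp_c = [1, 1, 1, 0, 0, 0]  # uncorrelated with snp_a
```

Binning SNP pairs by map distance and averaging r2 within each bin would give a decay curve of the kind the abstract describes for BC1.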

Testing for genetic associations in arbitrarily structured populations

Minsun Song, Wei Hao, John D. Storey
doi: http://dx.doi.org/10.1101/012682

We present a new statistical test of association between a trait (either quantitative or binary) and genetic markers, which we theoretically and practically prove to be robust to arbitrarily complex population structure. The statistical test involves a set of parameters that can be directly estimated from large-scale genotyping data, such as that measured in genome-wide association studies (GWAS). We also derive a new set of methodologies, called a genotype-conditional association test (GCAT), shown to provide accurate association tests in populations with complex structures, manifested in both the genetic and environmental contributions to the trait. We demonstrate the proposed method on a large simulation study and on the Northern Finland Birth Cohort study. In the Finland study, we identify several new significant loci that other methods do not detect. Our proposed framework provides a substantially different approach to the problem from existing methods. We provide some discussion on its similarities and differences with the linear mixed model and principal component approaches.

The competition between simple and complex evolutionary trajectories in asexual populations

Ian E. Ochs, Michael M. Desai
Comments: 8 pages, 3 figures
Subjects: Populations and Evolution (q-bio.PE)

On rugged fitness landscapes where sign epistasis is common, adaptation can often involve either individually beneficial “uphill” mutations or more complex mutational trajectories involving fitness valleys or plateaus. The dynamics of the evolutionary process determine the probability that evolution will take any specific path among a variety of competing possible trajectories. Understanding this evolutionary choice is essential if we are to understand the outcomes and predictability of adaptation on rugged landscapes. We present a simple model to analyze the probability that evolution will eschew immediately uphill paths in favor of crossing fitness valleys or plateaus that lead to higher fitness but less accessible genotypes. We calculate how this probability depends on the population size, mutation rates, and relevant selection pressures, and compare our analytical results to Wright-Fisher simulations. We find that the probability of valley crossing depends nonmonotonically on population size: intermediate-sized populations are most likely to follow a “greedy” strategy of acquiring immediately beneficial mutations even if they lead to evolutionary dead ends, while larger and smaller populations are more likely to cross fitness valleys to reach distant advantageous genotypes. We explicitly identify the boundaries between these different regimes in terms of the relevant evolutionary parameters. Above a certain threshold population size, we show that the degree of evolutionary “foresight” depends only on a single simple combination of the relevant parameters.
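
The kind of Wright-Fisher simulation mentioned in this abstract can be sketched in a few lines of Python. This is an illustrative toy, not the authors' model. It assumes four genotypes: wild type, an "uphill" single mutant that is a dead end, a valley (or plateau) intermediate, and a distant peak reachable only through the intermediate. All parameter names (`mu_up`, `mu_valley`, `mu_peak`, `s_up`, `delta`, `s_peak`) are hypothetical.

```python
import random


def simulate_trajectory(n, mu_up, mu_valley, mu_peak, s_up, delta, s_peak,
                        rng, max_gen=100_000):
    """Run one Wright-Fisher replicate and report which path fixed.

    Genotypes (an assumed toy landscape):
      0 wild type (fitness 1), 1 uphill dead end (1 + s_up),
      2 valley intermediate (1 - delta), 3 distant peak (1 + s_peak).
    Returns "uphill", "valley" or "unresolved".
    """
    fitness = [1.0, 1.0 + s_up, 1.0 - delta, 1.0 + s_peak]
    counts = [n, 0, 0, 0]
    for _ in range(max_gen):
        if counts[1] == n:
            return "uphill"
        if counts[3] == n:
            return "valley"
        # Selection: reweight genotype frequencies by fitness.
        weights = [c * w for c, w in zip(counts, fitness)]
        total = sum(weights)
        f = [w / total for w in weights]
        # Mutation: 0 -> 1, 0 -> 2 and 2 -> 3, with no back mutation.
        probs = [
            f[0] * (1.0 - mu_up - mu_valley),
            f[1] + f[0] * mu_up,
            f[2] * (1.0 - mu_peak) + f[0] * mu_valley,
            f[3] + f[2] * mu_peak,
        ]
        # Drift: multinomial resampling of n offspring.
        sample = rng.choices(range(4), weights=probs, k=n)
        counts = [sample.count(g) for g in range(4)]
    return "unresolved"


def valley_crossing_fraction(replicates, seed=0, **params):
    """Fraction of replicates in which the distant peak fixed."""
    rng = random.Random(seed)
    outcomes = [simulate_trajectory(rng=rng, **params)
                for _ in range(replicates)]
    return outcomes.count("valley") / replicates
```

Sweeping `n` over several orders of magnitude with the other parameters held fixed, and plotting `valley_crossing_fraction` against `n`, is one way to explore how the outcome depends on population size in this toy setting.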

Reproductive workers show queen-like gene expression in an intermediately eusocial insect, the buff-tailed bumble bee Bombus terrestris

Mark Christian Harrison, Robert L Hammond, Eamonn B Mallon
doi: http://dx.doi.org/10.1101/012500

Bumble bees represent a taxon with an intermediate level of eusociality within Hymenoptera. The clear division of reproduction between a single founding queen and the largely sterile workers is characteristic for highly eusocial species, whereas the morphological similarity between the bumble bee queen and the workers is typical for more primitively eusocial hymenopterans. Also, unlike other highly eusocial hymenopterans, division of labour among worker sub-castes is plastic and not predetermined by morphology or age. We conducted a differential expression analysis based on RNA-seq data from 11 combinations of developmental stage and caste to investigate how a single genome can produce the distinct castes of queens, workers and males in the buff-tailed bumble bee Bombus terrestris. Based on expression patterns, we found males to be the most distinct of all adult castes (2,411 transcripts differentially expressed compared to non-reproductive workers). However, only relatively few transcripts were differentially expressed between males and workers during development (larvae: 71, pupae: 162). This indicates the need for more distinct expression patterns to control behaviour and physiology in adults compared to those required to create different morphologies. Among the female castes, the expression of over ten times more transcripts differed significantly between reproductive workers and their non-reproductive sisters than when comparing reproductive workers to the mother queen. This suggests a strong shift towards a more queen-like behaviour and physiology when a worker becomes fertile. This is in contrast to findings for higher eusocial species, in which reproductive workers are more similar to non-reproductive workers than the queen.

An experimental test of the relationship between melanism and desiccation survival in insects

Subhash Rajpurohit, Lisa Marie Peterson, Andrew Orr, Anthony J. Marlon, Allen G Gibbs
doi: http://dx.doi.org/10.1101/012369

We used experimental evolution to test the “melanism-desiccation” hypothesis, which proposes that dark cuticle in several Drosophila species is an adaptation for increased desiccation tolerance. We selected for dark and light body pigmentation in replicated populations of D. melanogaster and assayed traits related to water balance. We also scored pigmentation and desiccation tolerance in populations selected for desiccation survival. Populations in both selection regimes showed large differences in the traits directly under selection. However, after over 40 generations of pigmentation selection, dark-selected populations were not more desiccation-tolerant than light-selected and control populations, nor did we find significant changes in carbohydrate amounts that could affect desiccation resistance. Body pigmentation of desiccation-selected populations did not differ from control populations after over 140 generations of selection. Our results do not support an important role for melanization in Drosophila water balance.

DensiTree 2: Seeing Trees Through the Forest

Remco Bouckaert, Joseph Heled
doi: http://dx.doi.org/10.1101/012401

Motivation: Phylogenetic analyses such as Bayesian MCMC or bootstrapping result in a collection of trees. Trees are discrete objects and it is generally difficult to get a mental grip on a distribution over trees. Visualisation tools like DensiTree can give good intuition on tree distributions. DensiTree works by drawing all trees in the set transparently, thus highlighting areas where the trees in the set agree. In this way, both uncertainty in clade heights and uncertainty in topology can be visualised. In our experience, a vanilla DensiTree can turn out to be misleading in that it shows too much uncertainty due to wrongly ordering taxa or due to unlucky placement of internal nodes. Results: DensiTree is extended to allow visualisation of meta-data associated with branches such as population size and evolutionary rates. Furthermore, geographic locations of taxa can be shown on a map, making it easy to visually check whether there is some geographic pattern in a phylogeny. Taxa orderings have a large impact on the layout of the tree set, and advances have been made in finding better orderings resulting in significantly more informative visualisations. We also explored various methods for positioning internal nodes, which can improve the quality of the image. Together, these advances make it easier to comprehend distributions over trees. Availability: DensiTree is freely available from http://compevol.auckland.ac.nz/software/.

The genomic signature of social interactions regulating honey bee caste development

Svjetlana Vojvodic, Brian R Johnson, Brock Harpur, Clement Kent, Amro Zayed, Kirk E Anderson, Timothy Linksvayer
doi: http://dx.doi.org/10.1101/012385

Social evolution theory posits the existence of genes expressed in one individual that affect the traits and fitness of social partners. The archetypal example of reproductive altruism, honey bee reproductive caste, involves strict social regulation of larval caste fate by care-giving nurses. However, the contribution of nurse-expressed genes, which are prime socially-acting candidate genes, to the caste developmental program and to caste evolution remains mostly unknown. We experimentally induced new queen production by removing the current colony queen, and we used RNA sequencing to study the gene expression profiles of both developing larvae and their care-giving nurses before and after queen removal. By comparing the gene expression profiles between both queen-destined larvae and their nurses to worker-destined larvae and their nurses in queen-present and queen-absent conditions, we identified larval and nurse genes associated with larval caste development and with queen presence. Of 950 differentially-expressed genes associated with larval caste development, 82% were expressed in larvae and 18% were expressed in nurses. Behavioral and physiological evidence suggests that nurses may specialize in the short term feeding queen- versus worker-destined larvae. Estimated selection coefficients indicated that both nurse and larval genes associated with caste are rapidly evolving, especially those genes associated with worker development. Of the 1863 differentially-expressed genes associated with queen presence, 90% were expressed in nurses. Altogether, our results suggest that socially-acting genes play important roles in both the expression and evolution of socially-influenced traits like caste.

Evaluating intra- and inter-individual variation in the human placental transcriptome

David A Hughes, Martin Kircher, Zhisong He, Song Guo, Genevieve L Fairbrother, Carlos S Moreno, Philipp Khaitovich, Mark Stoneking
doi: http://dx.doi.org/10.1101/012468

Background: Gene expression variation is a phenotypic trait of particular interest as it represents the initial link between genotype and other phenotypes. Analyzing how such variation apportions among and within groups allows for the evaluation of how genetic and environmental factors influence such traits. It also provides opportunities to identify genes and pathways that may have been influenced by non-neutral processes. Here we use a population genetics framework and next generation sequencing to evaluate how gene expression variation is apportioned among four human groups in a natural biological tissue, the placenta. Results: We estimate that on average, 33.2%, 58.9% and 7.8% of the placental transcriptome is explained by variation within individuals, among individuals and among human groups, respectively. Additionally, when technical and biological traits are included in models of gene expression they account for roughly 2% of total gene expression variation. Notably, the variation that is significantly different among groups is enriched in biological pathways associated with immune response, cell signaling and metabolism. Many biological traits demonstrated correlated changes in expression in numerous pathways of potential interest to clinicians and evolutionary biologists. Finally, we estimate that the majority of the human placental transcriptome (65% of expressed genes) exhibits expression profiles consistent with neutrality; the remainder are consistent with stabilizing selection (26%), directional selection (4.9%), or diversifying selection (4.8%). Conclusion: We apportion placental gene expression variation into individual, population and biological trait factors and identify how each influences the transcriptome. Additionally, we advance methods to associate expression profiles with different forms of selection.

Synthesis of phylogeny and taxonomy into a comprehensive tree of life

Karen A Cranston, Open Tree of Life
doi: http://dx.doi.org/10.1101/012260

Reconstructing the phylogenetic relationships that unite all biological lineages (the tree of life) is a grand challenge of biology. However, the paucity of readily available homologous character data across disparately related lineages renders direct phylogenetic inference currently untenable. Our best recourse towards realizing the tree of life is therefore the synthesis of existing collective phylogenetic knowledge available from the wealth of published primary phylogenetic hypotheses, together with taxonomic hierarchy information for unsampled taxa. We combined phylogenetic and taxonomic data to produce a draft tree of life, the Open Tree of Life, containing 2.3 million tips. Realization of this draft tree required the assembly of two resources that should prove valuable to the community: 1) a novel comprehensive global reference taxonomy, and 2) a database of published phylogenetic trees mapped to this common taxonomy. Our open source framework facilitates community comment and contribution, enabling a continuously updatable tree when new phylogenetic and taxonomic data become digitally available. While data coverage and phylogenetic conflict across the Open Tree of Life illuminate significant gaps in both the underlying data available for phylogenetic reconstruction and the publication of trees as digital objects, the tree provides a compelling starting point from which we can continue to improve through community contributions. Having a comprehensive tree of life will fuel fundamental research on the nature of biological diversity, ultimately providing up-to-date phylogenies for downstream applications in comparative biology, ecology, conservation biology, climate change studies, agriculture, and genomics.

SpeedSeq: Ultra-fast personal genome analysis and interpretation

Colby Chiang, Ryan M Layer, Gregory G Faust, Michael R Lindberg, David B Rose, Erik P Garrison, Gabor T Marth, Aaron R Quinlan, Ira M Hall
doi: http://dx.doi.org/10.1101/012179

Comprehensive interpretation of human genome sequencing data is a challenging bioinformatic problem that typically requires weeks of analysis, with extensive hands-on expert involvement. This informatics bottleneck inflates genome sequencing costs, poses a computational burden for large-scale projects, and impedes the adoption of time-critical clinical applications such as personalized cancer profiling and newborn disease diagnosis, where the actionable timeframe can measure in hours or days. We developed SpeedSeq, an open-source genome analysis platform that vastly reduces computing time. SpeedSeq accomplishes read alignment, duplicate removal, variant detection and functional annotation of a 50X human genome in <24 hours, even using one low-cost server. SpeedSeq offers competitive or superior performance to current methods for detecting germline and somatic single nucleotide variants (SNVs), indels, and structural variants (SVs) and includes novel functionality for SV genotyping, SV annotation, fusion gene detection, and rapid identification of actionable mutations. SpeedSeq will help bring timely genome analysis into the clinical realm.