One year at Haldane’s Sieve

We started Haldane’s Sieve back in August 2012, so we’ve just passed our one-year anniversary. You can read our first post on our motivations for starting the blog here. We are pretty happy with how well Haldane’s Sieve has done at promoting preprints, and a preprint culture more generally, in population and evolutionary genetics and genomics.

Overall we have published 430 posts, the majority of which have been abstracts of arXived papers. It’s been great to see so many people starting to experiment with preprinting their work.

We’ve also had 41 guest posts by authors blogging about their papers (see here). This has been a really nice side effect of Haldane’s Sieve; we have gotten more researchers blogging about their work. The main aim of these “our paper” posts has been to allow authors to write about their paper in a more informal setting than a paper, to reach out to other researchers for feedback and to start to publicize their papers to the population and evolutionary genetics and genomics communities.

Over the past year Haldane’s Sieve has had over 600 comments. The majority of preprints have passed without comment, which is fine by us. Not all preprints need commentary, and a reasonable fraction are likely to have little long-term impact (like many papers). However, all of the abstracts posted at Haldane’s Sieve have been visited multiple times (the top ones hundreds of times), and the majority have been tweeted. Thus all of the preprints have received attention, and have likely had many more sets of eyes on them, and earlier, than if they’d never been preprinted.

Some of the preprints get significant amounts of attention, comments, and feedback (both online and offline), which is really heartening to see. We think that many papers have been improved thanks to appearing on the arXiv and at Haldane’s Sieve. Thanks to everyone for their comments. It would be great to have more; remember that they do not have to be substantial and could be as simple as asking for clarification on a figure legend. We try to make sure that the authors of preprints get notified about comments, however minor. Every comment helps to improve preprints, to encourage others to preprint their papers, and to build a culture of preprint comments more generally.

Encouragingly, during the past year Genetics, Genome Research, and MBE have all changed their preprint policies to allow the submission of previously preprinted articles (see here). It is great to see preprints are starting to gain more acceptance in evolutionary genetics and genomics.

Here’s hoping for another good year. We are thinking about extending Haldane’s Sieve in a few different ways over the coming year.

Haldane’s Sieve sifts through 2012

We started Haldane’s Sieve back in August 2012 to promote a preprint culture in evolutionary genetics (see here for more details). Since starting we’ve had ~150 posts, the vast majority of which have been preprint abstracts. We’ve had over 30,000 views from all over the world. During this time we’ve also seen more journals adopting favorable policies towards preprints, in particular Genetics and Genome Research, reflecting a growing recognition that preprint archives are a natural stage in the publication process. Overall it has been great to see the support for Haldane’s Sieve from so many people; we hope that it, and preprints more generally, will go from strength to strength in 2013.

Below are our top 10 most viewed pages of 2012. Each one of these has received hundreds of views. One noticeable trend is that many of them are the “Our paper” posts, which suggests that writing a blurb about your paper for Haldane’s Sieve is a great way to bring it more attention. Let us know if you want to write a post on your preprint article, or a quick post on a preprint you’ve enjoyed.

  1. Horizontal gene transfer may explain variation in θs. Maddamsetti et al. respond to a recent paper by Martincorena et al. The attention garnered by this post is undoubtedly due to its lively comment section. Martincorena et al. themselves responded with a preprint here.
  2. Our paper: The genetic prehistory of southern Africa. Pickrell et al. write about their preprint. Their published paper is out at Nature Communications.
  3. Thoughts on: Finding the sources of missing heritability in a yeast cross. Joe Pickrell’s post about Bloom et al.
  4. Our paper: The geography of recent genetic ancestry across Europe. Peter Ralph and Graham Coop write about their arXived paper.
  5. Thoughts on: The date of interbreeding between Neandertals and modern humans. Graham Coop’s post on Sankararaman et al.’s paper. The authors’ post on their paper (Our paper: The date of interbreeding between Neandertals and modern humans) also made our top 10. The paper was published in PLoS Genetics.
  6. Our paper: Population genomics of the Wolbachia endosymbiont in Drosophila melanogaster. Casey Bergman’s post on his group’s paper by Richardson et al. The paper was published in PLoS Genetics.
  7. Our paper: A genetic variant near olfactory receptor genes influences cilantro preference. Nick Eriksson’s post about 23andMe’s preprint. The paper appeared in Flavour.
  8. Species Identification and Unbiased Profiling of Complex Microbial Communities Using Shotgun Illumina Sequencing of 16S rRNA Amplicon Sequences. Ong et al.
  9. Our paper: Population genomics of sub-Saharan Drosophila melanogaster: African diversity and non-African admixture. John Pool’s post on Pool et al. The paper appeared in PLoS Genetics.
  10. Blood ties: ABO is a trans-species polymorphism in primates. Ségurel et al.’s paper, which Laure Ségurel posted about here. The paper came out in PNAS.

Welcome to Haldane’s sieve

The ease of communication facilitated by the Internet has dramatically affected the process of scientific communication in many fields. Most notably, many physics, math, and economics communities have adopted a system in which new research papers are immediately distributed throughout the world prior to formal evaluation in the form of peer review. This system allows for rapid distribution of “bleeding edge” results among all the experts in a field, allowing them to see and build upon the most recent advances.

This practice has historically been uncommon in biology, where results are instead made available to the community (including many people qualified to judge them) only after a delay, typically six months to a year, during which a paper is reviewed, formatted, and published. We believe this is unfortunate. However, there is growing pressure in some parts of biology (in particular our fields of evolutionary and population genetics) to follow physics and math in posting papers to preprint servers ahead of formal publication.

Some authors have a variety of reasonable concerns about posting their papers to preprint servers. In particular, one worry is that, in a morass of online content, their work will not reach the relevant audience. Others see no benefit in posting their papers prior to review if they will not receive useful feedback. The goal of Haldane’s Sieve is to partially remedy these issues. We aim to provide a simple feed of preprints in the fields of evolutionary and population genetics (though we may later expand to other fields). Thus, instead of checking arXiv, PeerJ, or Figshare for relevant preprints, readers in these fields could simply check Haldane’s Sieve.

What to expect

As described above, most posts to Haldane’s Sieve will be basic descriptions of relevant preprints, with little to no commentary. All posts will have comment sections where discussion of the papers will be welcome. A second type of post will be detailed comments on a preprint of particular interest to a contributor. These posts could take the style of a journal review, or may simply be some brief comments. We hope they will provide useful feedback to the authors of the preprint. Finally, there will be posts by authors of preprints in which they describe their work and place it in broader context.

We ask the commenters to remember that by submitting articles to preprint servers the authors (often biologists) are taking a somewhat unusual step. Therefore, comments should be phrased in a constructive manner to aid the authors.

Authors: Our choice of what to post reflects our interests and knowledge, so we will only post a biased subset of evolutionary, population, and statistical genetics preprints that attract our interest. We will endeavor to be somewhat thorough but we will doubtless miss some interesting preprints, especially if they are not in the quantitative biology arXiv subfield. If you want us to link to your preprint, please drop us a line; our emails can easily be found via our University sites. Alternatively, send a tweet to @Haldanessieve.

Why “Haldane’s Sieve”?

A brief explanation of the name of this site is perhaps in order. When a new beneficial allele arises in a population, the probability that it eventually reaches fixation is influenced by a number of factors. One of these is the dominance coefficient of the allele. The dominance coefficient matters because early in the life of the allele, while it is at low frequency, it is almost always present in the population in heterozygous form. Therefore, all else being equal, dominant beneficial alleles can increase in frequency due to selection faster than recessive alleles, increasing their probability of eventual fixation (or establishment in the population). This effect was noted by Haldane (Haldane 1924, 1927) and has become known as “Haldane’s sieve” (Turner 1981; Charlesworth 1992). Analogously, we seek to increase the exposure of interesting papers early in their lifespan, hopefully increasing the probability that they reach their target audience.
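For readers who like to see the numbers, here is a minimal sketch of the effect described above. It uses the classic branching-process approximation in which a single new beneficial allele, whose heterozygous carriers have fitness 1 + hs, fixes with probability of roughly 2hs. That approximation assumes a large, randomly mating population and a small selection coefficient s, and it breaks down as the dominance coefficient h approaches zero. The function name and the example values of h and s are our own illustrative choices, not taken from any of the papers cited above.

```python
def approx_fixation_probability(h: float, s: float) -> float:
    """Approximate fixation probability of a single new beneficial allele.

    Assumes heterozygote fitness 1 + h*s, a large randomly mating
    population, and small s. This is the 2*h*s approximation only;
    it is not reliable for fully recessive alleles (h close to 0).
    """
    if not 0.0 <= h <= 1.0:
        raise ValueError("dominance coefficient h must lie in [0, 1]")
    if s < 0.0:
        raise ValueError("selection coefficient s must be non-negative")
    return 2.0 * h * s


if __name__ == "__main__":
    s = 0.01  # illustrative selection coefficient (an assumption)
    examples = [("dominant", 1.0), ("additive", 0.5), ("mostly recessive", 0.1)]
    for label, h in examples:
        p = approx_fixation_probability(h, s)
        print(f"{label:>16} (h={h}): P(fix) is roughly {p:.4f}")
```

Under these assumptions a fully dominant allele is ten times more likely to fix than one with h = 0.1 and the same s, which is the sieve in action.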

A nameless wit has pointed out to us that preprints would really count as standing variation in this analogy and might therefore not be subject to Haldane’s sieve (see Orr and Betancourt, Genetics 2001). We leave it to the reader to decide whether the analogy holds.

Image

The image of Haldane is from Wikipedia.
The image of the sieve is from fdctsevilla, who kindly shares it under a Creative Commons 2.0 license. It’s surprisingly difficult to find a usable picture of a sieve.

Graham Coop and Joe Pickrell