Controlling False Positive Rates in Methods for Differential Gene Expression Analysis using RNA-Seq Data

David M Rocke, Luyao Ruan, J. Jared Gossett, Blythe Durbin-Johnson, Sharon Aviran
doi: http://dx.doi.org/10.1101/018739

We review existing methods for the analysis of RNA-Seq data and place them in a common framework of a sequence of tasks that are usually part of the process. We show that many existing methods produce large numbers of false positives in cases where the null hypothesis is true by construction and where actual data from RNA-Seq studies are used, as opposed to simulations that make specific assumptions about the nature of the data. We show that some of those mathematical assumptions about the data are likely one cause of the false positives, and we define a general structure that is apparently not subject to these problems. The best performance was shown by limma-voom and by some simple methods composed of easily understandable steps.
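
The idea of a null hypothesis that is "true by construction" can be sketched as follows: take samples that all come from one condition, split them at random into two mock groups, run a test on every gene, and count how many genes are called significant. This is only a minimal illustration and not the authors' pipeline. The function names, the toy test (a Welch-style statistic on log2 counts referred to a standard normal), and the randomly generated counts are all assumptions made so the sketch runs; the paper uses actual RNA-Seq data and established methods.

```python
import math
import random
from statistics import NormalDist, mean, variance


def split_null_groups(n_samples, rng):
    """Randomly split sample indices from ONE condition into two equal mock groups."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    half = n_samples // 2
    return idx[:half], idx[half:2 * half]


def toy_pvalue(x, y):
    """Placeholder test (hypothetical): Welch-style statistic on log2(count + 1),
    compared against a standard normal. Stands in for a real RNA-Seq method."""
    lx = [math.log2(v + 1) for v in x]
    ly = [math.log2(v + 1) for v in y]
    se = math.sqrt(variance(lx) / len(lx) + variance(ly) / len(ly))
    if se == 0:
        return 1.0  # no variation at all: nothing to call significant
    z = (mean(lx) - mean(ly)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def null_false_positive_rate(counts, test_fn, alpha, rng):
    """counts: one list per gene, one value per sample, all from the same condition.
    Returns the fraction of genes with p < alpha under a random mock split."""
    g1, g2 = split_null_groups(len(counts[0]), rng)
    pvals = [test_fn([row[i] for i in g1], [row[i] for i in g2]) for row in counts]
    return sum(p < alpha for p in pvals) / len(pvals)


rng = random.Random(0)
# Toy counts only, generated so the sketch runs; not real RNA-Seq data.
toy_counts = [[rng.randint(0, 200) for _ in range(6)] for _ in range(500)]
rate = null_false_positive_rate(toy_counts, toy_pvalue, 0.05, rng)
print(f"fraction called significant under the null: {rate:.3f}")
```

Because no gene truly differs between the mock groups, a well-calibrated test should call roughly a fraction `alpha` of genes significant; a fraction well above that indicates inflated false positives. Any per-gene test with the same two-argument shape can be passed in place of `toy_pvalue`.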
