Polyester: simulating RNA-seq datasets with differential transcript expression

Alyssa C Frazee, Andrew E Jaffe, Ben Langmead, Jeffrey Leek

Developing statistical methods for differential expression analysis of RNA sequencing (RNA-seq) data requires software tools to assess accuracy and error rate control. Because the true differential expression status is often unknown in experimental datasets, artificially constructed datasets must be used instead, obtained either by running costly spike-in experiments or by simulating RNA-seq data. Polyester is an R package designed to simulate RNA-seq data, beginning with an experimental design and ending with collections of RNA-seq reads. Its main advantage is the ability to simulate isoform-level differential expression at the read level, across biological replicates, for a variety of experimental designs. Differential expression signal can be simulated with either built-in or user-defined statistical models. Polyester is available on GitHub at https://github.com/alyssafrazee/polyester.

