Bayesian genome assembly and assessment by Markov Chain Monte Carlo sampling

Mark Howison, Felipe Zapata, Erika J. Edwards, Casey W. Dunn
(Submitted on 6 Aug 2013)

Most genome assemblers provide a point estimate of the true genome sequence, chosen from among many alternative hypotheses that are supported by the data. We present a Markov Chain Monte Carlo approach to sequence assembly that instead generates a distribution of assembly hypotheses with quantified probabilities. This statistically explicit Bayesian approach to assembly allows the investigator to evaluate alternative assembly hypotheses in a unified framework and to propagate uncertainty about genome assembly to downstream analyses. We implement this approach in a prototype assembler and illustrate its application to the genome of the bacteriophage $\Phi$X174.
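
To make the idea of sampling assembly hypotheses concrete, here is a minimal toy sketch in Python. It is not the authors' prototype assembler, and the abstract does not describe their model or proposal moves. Everything in it is an assumption made for illustration: the hypothetical functions `log_likelihood` and `sample_assemblies`, the fixed set of candidate sequences, the uniform prior over candidates, the per-base error model, and the example reads. It runs a Metropolis sampler over the candidates and reports how often each one is visited, which approximates its posterior probability.

```python
import math
import random
from collections import Counter


def log_likelihood(assembly, reads, error_rate=0.1):
    """Toy log P(reads | assembly): score each read at its best-matching offset.

    Assumes every read is no longer than the assembly and that errors are
    independent substitutions (an illustrative model, not the paper's).
    """
    total = 0.0
    for read in reads:
        best = -math.inf
        for start in range(len(assembly) - len(read) + 1):
            window = assembly[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(window, read))
            matches = len(read) - mismatches
            score = (matches * math.log(1.0 - error_rate)
                     + mismatches * math.log(error_rate / 3.0))
            best = max(best, score)
        total += best
    return total


def sample_assemblies(candidates, reads, n_steps=20000, seed=0):
    """Metropolis sampling over a fixed candidate set with a uniform prior."""
    rng = random.Random(seed)
    # With a uniform prior, the log posterior equals the log likelihood
    # up to an additive constant, which cancels in the acceptance ratio.
    log_post = [log_likelihood(c, reads) for c in candidates]
    current = rng.randrange(len(candidates))
    visits = Counter()
    for _ in range(n_steps):
        proposal = rng.randrange(len(candidates))  # symmetric proposal
        log_ratio = log_post[proposal] - log_post[current]
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        visits[current] += 1
    # Visit frequencies approximate the posterior over hypotheses.
    return {c: visits[i] / n_steps for i, c in enumerate(candidates)}


# Hypothetical candidate assemblies and reads, chosen only for illustration.
candidates = ["ACGTACGT", "ACGTTCGT", "ACGAACGT"]
reads = ["CGTAC", "GTTCG", "ACGT", "TACGT"]
posterior = sample_assemblies(candidates, reads)
for sequence, probability in posterior.items():
    print(sequence, round(probability, 3))
```

In this sketch the output is a probability for every candidate rather than a single best sequence, so a downstream analysis could be repeated across candidates and weighted accordingly. A real assembler would have to propose new hypotheses rather than choose from a short fixed list.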
