Optimal Assembly for High Throughput Shotgun Sequencing
Guy Bresler, Ma’ayan Bresler, David Tse
(Submitted on 1 Jan 2013)
We present a framework for the design of optimal assembly algorithms for shotgun sequencing under the criterion of complete reconstruction. We derive a lower bound on the read length and the coverage depth required for reconstruction in terms of the repeat statistics of the genome. We design a de Bruijn graph-based assembly algorithm that comes very close to achieving the lower bound for the repeat statistics of a wide range of sequenced genomes.
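
For readers unfamiliar with the two quantities the abstract refers to, the following is a minimal illustrative sketch, not the authors' algorithm. It builds a generic textbook de Bruijn graph from a set of reads and computes coverage depth under the common definition (number of reads times read length, divided by genome length). The function names, the choice of k, and the toy reads are assumptions introduced here for illustration only.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph from reads (illustrative sketch).

    Nodes are (k-1)-mers; each distinct k-mer seen in the reads adds one
    edge from its (k-1)-length prefix to its (k-1)-length suffix.
    """
    graph = defaultdict(list)
    seen = set()  # distinct k-mers already added as edges
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in seen:
                continue
            seen.add(kmer)
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def coverage_depth(num_reads, read_length, genome_length):
    """Coverage depth: average number of reads covering each base."""
    return num_reads * read_length / genome_length


if __name__ == "__main__":
    toy_reads = ["ACGTAC", "GTACGG"]  # hypothetical reads
    for node, successors in de_bruijn_graph(toy_reads, k=3).items():
        print(node, "->", successors)
    print(coverage_depth(num_reads=1000, read_length=100, genome_length=10000))
```

A node with more than one successor, such as one created by a repeated substring, is where an assembler must decide which path to follow; this is the sense in which repeat statistics constrain the read length needed for complete reconstruction.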