An algebraic framework to sample the rearrangement histories of a cancer metagenome with double cut and join, duplication and deletion events

Daniel R. Zerbino, Benedict Paten, Glenn Hickey, David Haussler
(Submitted on 22 Mar 2013)

Algorithms to study structural variants (SVs) in whole genome sequencing (WGS) cancer datasets are currently unable to sample the entire space of rearrangements while allowing for copy number variations (CNVs). In addition, rearrangement theory has up to now focused on fully assembled genomes, not on fragmentary observations of mixed genome populations. This limits the applicability of current methods to actual cancer datasets, which are produced from short read sequencing of a heterogeneous population of cells. We show how basic linear algebra can be used to describe and sample the set of possible sequences of SVs, extending the double cut and join (DCJ) model to the analysis of metagenomes. We also describe a functional pipeline which was run on simulated as well as experimental cancer datasets.
