Utilization of high throughput genome sequencing technology for large scale single nucleotide polymorphism discovery in red deer and Canadian elk
Deer farming is a significant international industry. Genetic improvement using genomic tools requires an ordered array of DNA variants and their flanking sequences across the genome. This work reports a comparative assembly of the deer genome and the subsequent identification of DNA variants. Next-generation sequencing, combined with an existing bovine reference genome, enabled the deer genome to be assembled sufficiently for large-scale SNP discovery. In total, 28 Gbp of sequence data were generated from seven Cervus elaphus (European red deer and Canadian elk) individuals. After the sequence was aligned to the bovine reference genome build UMD 3.0 and the reads were binned into 1 Mbp groups, the reads were assembled and analyzed for SNPs. More than 99% of the non-repetitive fraction of the bovine genome was covered by deer chromosomal scaffolds. We identified 1.8 million SNPs meeting the Illumina Infinium II SNP chip technical threshold. Markers on the published red deer × Père David's deer linkage map were aligned to both UMD 3.0 and the new deer chromosomal scaffolds. This enabled deer linkage groups to be assigned to deer chromosomal scaffolds, although the mapping locations remain based on bovine order. Genotyping of 270 SNPs on a Sequenom MS system showed that 88% of the SNPs identified could be amplified. In addition, inheritance patterns showed no evidence of departure from Hardy-Weinberg equilibrium. A comparative assembly of the deer genome, alignment with existing deer genetic linkage groups, and SNP discovery have been successfully completed and validated, facilitating the application of genomic technologies for subsequent deer genetic improvement.
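The binning step described above (grouping aligned reads into 1 Mbp windows of the bovine reference before assembly) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `bin_reads`, the tuple layout `(read_id, chromosome, start)`, and the example reads are all hypothetical, and the sketch assumes 0-based start coordinates on the reference.

```python
from collections import defaultdict

BIN_SIZE = 1_000_000  # 1 Mbp windows, as described in the text


def bin_reads(alignments, bin_size=BIN_SIZE):
    """Group read IDs by (chromosome, bin index) of their aligned start.

    `alignments` is an iterable of (read_id, chromosome, start) tuples,
    where `start` is assumed to be a 0-based reference coordinate.
    """
    bins = defaultdict(list)
    for read_id, chrom, start in alignments:
        # Integer division maps positions 0..999,999 to bin 0, and so on.
        bins[(chrom, start // bin_size)].append(read_id)
    return dict(bins)


# Hypothetical alignments, for illustration only.
example = [
    ("read1", "chr1", 10),
    ("read2", "chr1", 999_999),
    ("read3", "chr1", 1_000_000),
    ("read4", "chr2", 2_500_000),
]
binned = bin_reads(example)
```

Each resulting group could then be handed to an assembler independently, which keeps the per-bin assembly problem small. A read that spans a bin boundary is assigned here by its start position only; a real pipeline would need an explicit policy for such reads.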