Released

Thesis

Detection of copy number variants in sequencing data.

MPS-Authors

Neubert, Kerstin
Bioinformatics (Ralf Herwig), Dept. of Vertebrate Genomics (Head: Hans Lehrach), Max Planck Institute for Molecular Genetics, Max Planck Society;

Fulltext (public)

neubert_master2010.pdf
(Any fulltext), 2MB

Citation

Neubert, K. (2010). Detection of copy number variants in sequencing data. Master Thesis, Freie Universität Berlin, Berlin.


Cite as: https://hdl.handle.net/11858/00-001M-0000-0010-7A5F-E
Abstract
In this work, a program for the detection of copy number variants (CNVs) in sequencing data based on depth of coverage (DOC), copyDOC, was implemented in C++. The individual steps of the pipeline (the acquisition of DOC signals in windows, event calling, and merging) are implemented using generic programming techniques that enable the future integration of other algorithms into the pipeline. Furthermore, a testing environment, the copySim platform, was implemented for testing and evaluating different algorithms. CopyDOC was successfully applied to synthetic and real data using constant-sized windows. Dynamic windows, which adapt to the local mappability of the sequence, are implemented in the pipeline but could not be tested in this work. They might be advantageous in datasets that contain uniquely mapped reads. However, CNVs have been shown to be overrepresented in segmental duplications (Nguyen et al. 2006; Cooper et al. 2007), and a general exclusion of multireads might make those CNVs difficult to ascertain. In the application of copyDOC to a 1000 Genomes dataset, the overlap of predicted variants was considerably higher using multireads than using uniquely mapped reads. Thus, there is a requirement for tools that can handle multireads. Further improvements of copyDOC might be made to the CNV calling algorithm and the merging step. For example, the program workflow could be tested with a direct comparison of the DOC signals in two datasets via log ratios instead of applying a t-test to the DOC signals in the two datasets. CopyDOC and copySim could be used as a platform for the implementation and evaluation of further CNV detection algorithms.
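
The three pipeline steps named in the abstract (DOC signals in constant-sized windows, event calling, merging), combined with the suggested log-ratio comparison of two datasets, might look roughly like the following C++ sketch. It is not taken from copyDOC: all function and type names (`windowCounts`, `log2Ratios`, `callAndMerge`, `Event`, `Call`) are hypothetical, read start positions stand in for a full depth-of-coverage signal, and the pseudocount and the fixed threshold are assumptions of this illustration. It also assumes both datasets have comparable overall sequencing depth, so no normalization is applied.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; none of this is taken from copyDOC itself.

// Step 1: count reads per constant-sized window, using each read's start
// position as a simplified stand-in for depth of coverage.
std::vector<double> windowCounts(const std::vector<std::size_t>& readStarts,
                                 std::size_t seqLength, std::size_t windowSize) {
    assert(windowSize > 0);
    std::vector<double> counts((seqLength + windowSize - 1) / windowSize, 0.0);
    for (std::size_t pos : readStarts) {
        if (pos < seqLength) counts[pos / windowSize] += 1.0;
    }
    return counts;
}

// Per-window log2 ratio of sample to reference counts. The pseudocount
// (an assumption of this sketch) avoids division by zero in empty windows.
std::vector<double> log2Ratios(const std::vector<double>& sample,
                               const std::vector<double>& reference,
                               double pseudocount = 1.0) {
    assert(sample.size() == reference.size());
    std::vector<double> ratios(sample.size());
    for (std::size_t i = 0; i < sample.size(); ++i) {
        ratios[i] = std::log2((sample[i] + pseudocount) /
                              (reference[i] + pseudocount));
    }
    return ratios;
}

enum class Call { Neutral, Gain, Loss };

struct Event {
    std::size_t firstWindow;
    std::size_t lastWindow;  // inclusive
    Call call;
};

// Steps 2 and 3: call each window against a fixed threshold, then merge
// directly adjacent windows that received the same non-neutral call.
std::vector<Event> callAndMerge(const std::vector<double>& ratios,
                                double threshold) {
    std::vector<Event> events;
    for (std::size_t i = 0; i < ratios.size(); ++i) {
        Call c = Call::Neutral;
        if (ratios[i] >= threshold) c = Call::Gain;
        else if (ratios[i] <= -threshold) c = Call::Loss;
        if (c == Call::Neutral) continue;

        if (!events.empty() && events.back().call == c &&
            events.back().lastWindow + 1 == i) {
            events.back().lastWindow = i;  // extend the running event
        } else {
            events.push_back({i, i, c});   // start a new event
        }
    }
    return events;
}
```

A caller would compute `windowCounts` for the sample and the reference with the same sequence length and window size, pass both to `log2Ratios`, and hand the result to `callAndMerge` with a threshold of its choosing. A t-test-based caller, as used in the thesis, would replace only the thresholding inside `callAndMerge`; the windowing and merging steps could stay the same.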