Statistical analysis of high-throughput sequencing count data

Love, Michael I.

IMPRS for Computational Biology and Scientific Computing - IMPRS-CBSC (Kirsten Kelleher), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;
Freie Universität Berlin, Fachbereich Mathematik und Informatik;

Citation

Love, M. I. (2014). Statistical analysis of high-throughput sequencing count data. PhD Thesis.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0018-D0ED-F

Abstract

All of the work presented in this thesis grew out of collaborations with other researchers. For each chapter, I briefly summarize my contribution and acknowledge the contributions of others.

Chapter 2 presents a conceptual framework for modeling read counts using various distributions. These ideas grew out of conversations with Ho-Ryun Chung at the Max Planck Institute for Molecular Genetics (MPIMG) in Berlin and Simon Anders at the European Molecular Biology Laboratory (EMBL) in Heidelberg.

Chapter 3 was published in Statistical Applications in Genetics and Molecular Biology [1]. The idea for detecting copy number variants in exome-enriched sequencing data was proposed by Stefan Haas, and various methods were tested and evaluated together with Alena van Bommel. My contribution was developing the hidden Markov model, implementing the software and testing its performance. I wish to acknowledge the X-linked intellectual disabilities project team at MPIMG, including H.-Hilger Ropers, Vera Kalscheuer, Ruping Sun, Anne-Katrin Emde, Wei Chen, Hao Hu and Tomasz Zemojtel, who provided helpful discussions.

Chapter 4 resulted from a five-month visit to the group of Wolfgang Huber at EMBL in Heidelberg. Simon Anders proposed the idea of incorporating priors for dispersion and log fold change into the DESeq framework. My contribution was to implement these new statistical methods as a new package, DESeq2, with closer integration with core Bioconductor packages. I would like to acknowledge all the members of the Huber group for helpful discussions.

Chapter 5 resulted from a collaboration with the Transcriptional Regulation Group of Sebastiaan Meijsing at the MPIMG. I would like to thank Stephan Starick, who initially proposed investigating the interaction between the glucocorticoid receptor and the chromatin landscape. My contribution was the statistical analysis presented in the chapter. Sebastiaan Meijsing provided valuable feedback during the evolution of the project. I wish to acknowledge the contributions of Morgane Thomas-Chollier, Katja Borzym, Sam Cooper and Ho-Ryun Chung.