English
 
Help Privacy Policy Disclaimer
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT

Released

Journal Article

Improved Prediction of Non-methylated Islands in Vertebrates Highlights Different Characteristic Sequence Patterns

MPS-Authors
/persons/resource/persons73755

Huska,  M.
IMPRS for Computational Biology and Scientific Computing - IMPRS-CBSC (Kirsten Kelleher), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;

/persons/resource/persons50613

Vingron,  M.
Gene regulation (Martin Vingron), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;

External Resource
Fulltext (restricted access)
There are currently no full texts shared for your IP range.
Fulltext (public)

Huska.pdf
(Publisher version), 3MB

Supplementary Material (public)
There is no public supplementary material available
Citation

Huska, M., & Vingron, M. (2016). Improved Prediction of Non-methylated Islands in Vertebrates Highlights Different Characteristic Sequence Patterns. PLoS Computational Biology, 12(12): e1005249. doi:10.1371/journal.pcbi.1005249.


Cite as: https://hdl.handle.net/11858/00-001M-0000-002D-4773-9
Abstract
Non-methylated islands (NMIs) of DNA are genomic regions that are important for gene regulation and development. A recent study of genome-wide non-methylation data in vertebrates by Long et al. (eLife 2013;2:e00348) has shown that many experimentally identified non-methylated regions do not overlap with classically defined CpG islands which are computationally predicted using simple DNA sequence features. This is especially true in cold-blooded vertebrates such as Danio rerio (zebrafish). In order to investigate how predictive DNA sequence is of a region's methylation status, we applied a supervised learning approach using a spectrum kernel support vector machine, to see if a more complex model and supervised learning can be used to improve non-methylated island prediction and to understand the sequence properties of these regions. We demonstrate that DNA sequence is highly predictive of methylation status, and that in contrast to existing CpG island prediction methods our method is able to provide more useful predictions of NMIs genome-wide in all vertebrate organisms that were studied. Our results also show that in cold-blooded vertebrates (Anolis carolinensis, Xenopus tropicalis and Danio rerio) where genome-wide classical CpG island predictions consist primarily of false positives, longer primarily AT-rich DNA sequence features are able to identify these regions much more accurately.