
Journal Article

Evaluation of methods for modeling transcription factor sequence specificity

MPS-Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons50380

Kielbasa, S. M.
Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons50613

Vingron, M.
Gene regulation (Martin Vingron), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;

Fulltext (public)

Weirauch et al.pdf
(Publisher version), 2MB

Citation

Weirauch, M. T., Cote, A., Norel, R., Annala, M., Zhao, Y., Riley, T. R., et al. (2013). Evaluation of methods for modeling transcription factor sequence specificity. Nature Biotechnology, 31(2), 126-134. doi:10.1038/nbt.2486.


Cite as: http://hdl.handle.net/11858/00-001M-0000-0014-7B8E-1
Abstract
Genomic analyses often involve scanning for potential transcription factor (TF) binding sites using models of the sequence specificity of DNA binding proteins. Many approaches have been developed to model and learn a protein's DNA-binding specificity, but these methods have not been systematically compared. Here we applied 26 such approaches to in vitro protein binding microarray data for 66 mouse TFs belonging to various families. For nine TFs, we also scored the resulting motif models on in vivo data, and found that the best in vitro-derived motifs performed similarly to motifs derived from the in vivo data. Our results indicate that simple models based on mononucleotide position weight matrices trained by the best methods perform similarly to more complex models for most TFs examined, but fall short in specific cases (<10% of the TFs examined here). In addition, the best-performing motifs typically have relatively low information content, consistent with widespread degeneracy in eukaryotic TF sequence preferences.
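
The two ideas the abstract relies on, scanning a sequence with a mononucleotide position weight matrix and measuring a motif's information content, can be illustrated with a minimal Python sketch. This is not the evaluation code used in the study: the toy matrix, the uniform background of 0.25 per base, the pseudocount, and all function names are assumptions chosen for illustration only.

```python
import math

BASES = "ACGT"  # column order assumed for every matrix row below


def information_content(ppm, background=0.25):
    """Total information content in bits of a position probability matrix.

    Each row of `ppm` holds the probabilities of A, C, G, T at one position.
    A uniform background is an assumption of this sketch.
    """
    total = 0.0
    for column in ppm:
        for p in column:
            if p > 0:  # treat 0 * log(0) as 0
                total += p * math.log2(p / background)
    return total


def log_odds(ppm, background=0.25, pseudocount=0.01):
    """Convert probabilities to log-odds weights, smoothing to avoid log(0)."""
    pwm = []
    for column in ppm:
        smoothed = [(p + pseudocount) / (1 + 4 * pseudocount) for p in column]
        pwm.append([math.log2(p / background) for p in smoothed])
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; return (start, score) pairs."""
    width = len(pwm)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][BASES.index(base)] for i, base in enumerate(window))
        scores.append((start, score))
    return scores


# Hypothetical 3-position motif: prefers A, then G, then has no preference.
toy_ppm = [
    [0.80, 0.10, 0.05, 0.05],
    [0.10, 0.10, 0.70, 0.10],
    [0.25, 0.25, 0.25, 0.25],
]

best_start, best_score = max(scan("TTAGCTT", log_odds(toy_ppm)), key=lambda s: s[1])
print(best_start, round(best_score, 2), round(information_content(toy_ppm), 2))
```

In this toy case the third position contributes zero bits, which is the sense in which a degenerate position lowers a motif's information content without changing which window scores highest.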