A complete workflow for the analysis of full-size ChIP-seq (and similar) data sets using peak-motifs

Thomas-Chollier, Morgane; Darbo, Elodie; Herrmann, Carl; Defrance, Matthieu; Thieffry, Denis; van Helden, Jacques

doi:10.1038/nprot.2012.088


Journal Article

MPS-Authors

Thomas-Chollier, Morgane
Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society

Citation

Thomas-Chollier, M., Darbo, E., Herrmann, C., Defrance, M., Thieffry, D., & van Helden, J. (2012). A complete workflow for the analysis of full-size ChIP-seq (and similar) data sets using peak-motifs. Nature Protocols, 7(8), 1551-1568. doi:10.1038/nprot.2012.088.

Cite as: https://hdl.handle.net/11858/00-001M-0000-000E-E86C-F

Abstract

This protocol explains how to use the online integrated pipeline 'peak-motifs' (http://rsat.ulb.ac.be/rsat/) to predict motifs and binding sites in full-size peak sets obtained by chromatin immunoprecipitation-sequencing (ChIP-seq) or related technologies. The workflow combines four time- and memory-efficient motif discovery algorithms to extract significant motifs from the sequences. Discovered motifs are compared with databases of known motifs to identify potentially bound transcription factors. Sequences are scanned to predict transcription factor binding sites and to analyze their enrichment and positional distribution relative to peak centers. Peaks and binding sites are exported as BED tracks that can be uploaded into the University of California Santa Cruz (UCSC) genome browser for visualization in the genomic context. This protocol is illustrated with the analysis of a set of 6,000 peaks (8 Mb in total) bound by the Drosophila transcription factor Krüppel. The complete workflow takes about 25 min of computational time on the Regulatory Sequence Analysis Tools (RSAT) Web server. This protocol can be followed in about 1 h.
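
The last two steps of the abstract (positions of sites relative to peak centers, and export as BED tracks) can be illustrated with a minimal Python sketch. It is not the peak-motifs implementation: the `Peak` and `Site` types, the function names, the track name and the example coordinates are all hypothetical. It assumes only the standard BED convention of 0-based start and exclusive end coordinates, and that each predicted site is reported as an offset within its peak sequence.

```python
from typing import Iterable, NamedTuple


class Peak(NamedTuple):
    """A peak in BED-style coordinates (hypothetical helper type)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


class Site(NamedTuple):
    """A predicted binding site located inside a peak (hypothetical helper type)."""
    peak: Peak
    offset: int  # 0-based start of the site within the peak sequence
    length: int  # site length in bp
    strand: str  # "+" or "-"
    name: str    # e.g. the identifier of the matching motif
    score: int   # BED scores are integers between 0 and 1000


def site_to_bed(site: Site) -> str:
    """Convert a peak-relative site into one tab-separated BED line."""
    start = site.peak.start + site.offset
    end = start + site.length
    fields = [site.peak.chrom, str(start), str(end),
              site.name, str(site.score), site.strand]
    return "\t".join(fields)


def distance_to_center(site: Site) -> float:
    """Signed distance (bp) from the peak center to the site center."""
    peak_center = (site.peak.start + site.peak.end) / 2
    site_center = site.peak.start + site.offset + site.length / 2
    return site_center - peak_center


def bed_track(sites: Iterable[Site], track_name: str = "predicted_sites") -> str:
    """Build the text of a BED custom track: a track line, then one line per site."""
    header = f'track name="{track_name}" description="Predicted binding sites"'
    return "\n".join([header] + [site_to_bed(s) for s in sites]) + "\n"


# Illustrative values only; they are not taken from the Krüppel data set.
example_peak = Peak("chr2R", 1000, 1200)
example_site = Site(example_peak, offset=90, length=10,
                    strand="+", name="motif1", score=500)
```

Calling `bed_track([example_site])` returns text that could be saved to a file and loaded as a custom track, while `distance_to_center` gives the values one would plot to inspect whether sites concentrate near peak centers.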