Journal Article

Identifying protein complexes directly from high-throughput TAP data with Markov random fields

MPS-Authors

Rungsarityotin, Wasinee
Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;


Krause, Roland
Department of Cellular Microbiology, Max Planck Institute for Infection Biology;
Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;


Schliep, Alexander
Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;

Fulltext (public)

BMC_Bioinformatics_2007_8_482.pdf
(Publisher version), 2MB

Citation

Rungsarityotin, W., Krause, R., Schödl, A., & Schliep, A. (2007). Identifying protein complexes directly from high-throughput TAP data with Markov random fields. BMC Bioinformatics, 8(Article Number: 482), 1-19. doi:10.1186/1471-2105-8-482.


Cite as: https://hdl.handle.net/11858/00-001M-0000-000E-C23F-8
Abstract
Background: Predicting protein complexes from experimental data remains a challenge due to the limited resolution and stochastic errors of high-throughput methods. Current algorithms for reconstructing the complexes typically rely on a two-step process: they first construct an interaction graph from the data, predominantly using heuristics, and then cluster its vertices to identify protein complexes.

Results: We propose a model-based identification of protein complexes directly from the experimental observations. Our model of protein complexes, based on Markov random fields, explicitly incorporates false negative and false positive errors and is highly robust to noise. A model-based quality score for the resulting clusters allows us to identify reliable predictions in the complete data set. Comparisons with prior work on reference data sets show favorable results, particularly for larger unfiltered data sets. Additional information on predictions, including the source code under the GNU Public License, can be found at http://algorithmics.molgen.mpg.de/Static/Supplements/ProteinComplexes.

Conclusion: We can identify complexes in the data obtained from high-throughput experiments without prior elimination of proteins or weak interactions. The few parameters of our model, which does not rely on heuristics, can be estimated using maximum likelihood without a reference data set. This is particularly important for protein complex studies in organisms that do not have an established reference frame of known protein complexes.
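
To make the idea of an explicit error model concrete, the following is a minimal Python sketch and not the authors' Markov random field. It assumes a deliberately simplified setting: every pair of proteins is either observed to interact or not, pairs in the same complex are missed with a false negative rate `fn`, and pairs in different complexes are reported with a false positive rate `fp`. The function names, the pairwise observation format, and the toy data are all hypothetical; the published model and its source code are available at the URL given in the abstract.

```python
import math
from itertools import combinations


def _rate(numerator, denominator, eps=1e-6):
    """Clamped ratio so that log() never sees 0 or 1; 0.5 if undefined."""
    if denominator == 0:
        return 0.5
    return min(max(numerator / denominator, eps), 1.0 - eps)


def estimate_error_rates(observed, clusters):
    """Maximum-likelihood fp and fn for a fixed cluster assignment.

    observed: set of frozenset({a, b}) pairs reported as interacting
    clusters: dict mapping protein name -> cluster id (hypothetical format)
    """
    same = same_missed = diff = diff_seen = 0
    for a, b in combinations(sorted(clusters), 2):
        seen = frozenset((a, b)) in observed
        if clusters[a] == clusters[b]:
            same += 1
            same_missed += not seen   # false negative
        else:
            diff += 1
            diff_seen += seen         # false positive
    return _rate(diff_seen, diff), _rate(same_missed, same)


def log_likelihood(observed, clusters, fp, fn):
    """Log-probability of the observations under the simplified error model."""
    ll = 0.0
    for a, b in combinations(sorted(clusters), 2):
        seen = frozenset((a, b)) in observed
        if clusters[a] == clusters[b]:
            p = (1.0 - fn) if seen else fn
        else:
            p = fp if seen else (1.0 - fp)
        ll += math.log(p)
    return ll


# Toy data (invented for illustration only).
observed = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("A", "D")]}
one_complex = {"A": 0, "B": 0, "C": 0, "D": 1}
all_separate = {"A": 0, "B": 1, "C": 2, "D": 3}

for name, clusters in [("one_complex", one_complex), ("all_separate", all_separate)]:
    fp, fn = estimate_error_rates(observed, clusters)
    print(name, round(log_likelihood(observed, clusters, fp, fn), 3))
```

Comparing the two log-likelihoods shows how candidate assignments can be scored against the raw observations without first building a thresholded interaction graph, with the error rates estimated from the data rather than from a reference set.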