

Released

Journal Article

On Subset Seeds for Protein Alignment.

MPS-Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons50545

Szczurek, Ewa
Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;

Citation

Roytberg, M., Gambin, A., Noé, L., Lasota, S., Furletova, E., Szczurek, E., et al. (2009). On Subset Seeds for Protein Alignment. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 6(3), 483-494. doi:http://doi.ieeecomputersociety.org/10.1109/TCBB.2009.4.


Cite as: http://hdl.handle.net/11858/00-001M-0000-0010-7D17-5
Abstract
We apply the concept of subset seeds to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal sensitivity/selectivity trade-offs. We propose several different design methods and use them to construct several alphabets. We then perform a comparative analysis of seeds built over those alphabets and compare them with the standard Blastp seeding method, as well as with the family of vector seeds. While the formalism of subset seeds is less expressive (but less costly to implement) than the cumulative principle used in Blastp and vector seeds, our seeds show a similar or even better performance than Blastp on Bernoulli models of proteins compatible with the common BLOSUM62 matrix. Finally, we perform a large-scale benchmarking of our seeds against several main databases of protein alignments. Here again, the results show a comparable or better performance of our seeds versus Blastp.
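As a rough illustration of the subset-seed idea mentioned in the abstract, the sketch below treats each seed letter as a set of accepted residue pairs and reports a hit only when every aligned pair is accepted by the letter at its position. There is no cumulative score or threshold, which is the sense in which such seeds are simpler than the Blastp approach. The residue groups, the letter symbols `#`, `@`, `_`, and the function names are hypothetical choices made for this example; they are not the alphabets or seeds designed in the paper.

```python
# Minimal sketch of subset-seed matching on protein words.
# ASSUMPTION: the groups below are an illustrative partition only,
# not one of the seed alphabets constructed in the paper.
AMINO = "ACDEFGHIKLMNPQRSTVWY"
GROUPS = ["ILMV", "FWY", "KR", "DE", "NQ", "ST"]


def pairs_from_groups(groups):
    """Return all residue pairs (a, b) where a and b share a group."""
    return {(a, b) for g in groups for a in g for b in g}


# Each seed letter is a set of accepted residue pairs.
IDENTITY = {(a, a) for a in AMINO}                # '#': exact match only
GROUPED = IDENTITY | pairs_from_groups(GROUPS)    # '@': same residue or same group
ANY = {(a, b) for a in AMINO for b in AMINO}      # '_': any pair is accepted

LETTERS = {"#": IDENTITY, "@": GROUPED, "_": ANY}


def seed_hits(seed, s, t):
    """True if every aligned pair (s[i], t[i]) is accepted by seed[i]."""
    if not (len(seed) == len(s) == len(t)):
        raise ValueError("seed and both words must have the same length")
    return all((a, b) in LETTERS[c] for c, a, b in zip(seed, s, t))


if __name__ == "__main__":
    print(seed_hits("#@_#", "KLAW", "KVGW"))  # L/V share a group, A/G fall on the wildcard
    print(seed_hits("#@_#", "KLAW", "KDGW"))  # L/D do not share a group
```

Because each position is checked independently against a fixed set of pairs, a hit test of this kind reduces to per-position lookups, which is one way to read the abstract's remark that subset seeds are less costly to implement than a cumulative score.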