ausblenden:
Schlagwörter:
-
Zusammenfassung:
A key tool in protein function discovery is the ability to rank databases of proteins given a query amino acid sequence. The most successful method so far is a web-based tool called PSI-BLAST which uses heuristic alignment of a profile built using the large unlabeled database. It has been shown that such use of global information via an unlabeled data improves over a local measure derived from a basic pairwise alignment such as performed by PSI-BLAST's predecessor, BLAST. In this article we
look at ways of leveraging techniques from the field of machine learning for the problem of ranking. We show how clustering and semi-supervised learning techniques, which aim to capture global structure in data, can significantly improve over PSI-BLAST.