

Journal Article

Confidence measures for protein fold recognition

MPG Authors
There are no MPG authors in this publication
Citation

Sommer, I., Zien, A., von Ohsen, N., Zimmer, R., & Lengauer, T. (2002). Confidence measures for protein fold recognition. Bioinformatics, 18(6), 802-812. doi:10.1093/bioinformatics/18.6.802.


Cite link: https://hdl.handle.net/11858/00-001M-0000-0013-DF96-F
Abstract
Motivation: We present an extensive evaluation of different methods and criteria to detect remote homologs of a given protein sequence. We investigate two associated problems: first, to develop a sensitive searching method to identify possible candidates and, second, to assign a confidence to the putative candidates in order to select the best one.

For searching methods where the score distributions are known, p-values are used as a confidence measure with great success. For cases where such theoretical backing is absent, we propose empirical approximations to p-values for searching procedures.
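
The abstract does not state how the empirical approximations are constructed. As a rough illustration only, one generic way to approximate a p-value without a known score distribution is a rank-based estimate against a sample of background scores. In the sketch below, the function name `empirical_p_value`, the idea of a background sample from unrelated proteins, the add-one smoothing, and the convention that higher scores are better are all assumptions, not details taken from the paper.

```python
from bisect import bisect_left


def empirical_p_value(score, background_scores):
    """Rank-based p-value estimate (illustrative, not the paper's method).

    Assumes higher scores are better and that background_scores were
    obtained from comparisons expected to be unrelated.
    """
    ordered = sorted(background_scores)
    # Count background scores greater than or equal to the observed score.
    n_at_least = len(ordered) - bisect_left(ordered, score)
    # Add-one smoothing keeps the estimate strictly above zero.
    return (n_at_least + 1) / (len(ordered) + 1)


if __name__ == "__main__":
    background = [1.0, 2.0, 3.0, 4.0]
    print(empirical_p_value(10.0, background))  # score above all background values
    print(empirical_p_value(3.0, background))   # score inside the background range
```

With this kind of estimate, the smallest reportable p-value is bounded by the size of the background sample, so larger samples are needed to resolve highly significant hits.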

Results: As a baseline, we review the performance of different methods for detecting remote protein folds (sequence alignment and threading, with and without sequence profiles, global and local). The analysis is performed on a large representative set of protein structures.

For fold recognition, we find that methods using sequence profiles generally perform better than methods using plain sequences, and that threading methods perform better than sequence alignment methods.

To assess the quality of the predictions made, we establish and compare several confidence measures, including raw scores, z-scores, raw score gaps, z-score gaps, and different methods of p-value estimation. We work our way from the theoretically well-backed local scores towards more exploratory global and threading scores.
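
The measures named above can be illustrated with textbook definitions. The following is a minimal sketch, assuming that a query has been scored against a list of candidate folds, that higher scores are better, that z-scores are taken relative to the mean and standard deviation of that score list, and that a "gap" is the difference between the best and second-best value. The paper's exact definitions may differ, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Standardize scores by the mean and standard deviation of the list."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores identical: no candidate stands out.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def score_gap(scores):
    """Difference between the best and the second-best score."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to compute a gap")
    best, second = sorted(scores, reverse=True)[:2]
    return best - second


def z_score_gap(scores):
    """Score gap computed on standardized scores."""
    return score_gap(z_scores(scores))


if __name__ == "__main__":
    raw = [10.0, 4.0, 2.0, 0.0]  # made-up scores for one query
    print(score_gap(raw))
    print(z_score_gap(raw))
```

A gap-based measure only needs the ranked score list for a single query, which is consistent with the abstract's remark that such measures are computationally cheaper than p-value estimation.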

The methods for assessing the statistical significance of predictions are compared using specificity-sensitivity plots. For local alignment techniques we find that p-value methods work best, although computationally cheaper methods such as those based on score gaps achieve similar performance. For global methods, where no theory is available, methods based on score gaps work best.
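
A specificity-sensitivity plot of this kind can be sketched by sweeping an acceptance threshold over the confidence values. The abstract does not define the two quantities, so the sketch below assumes one common convention in fold recognition benchmarks: sensitivity is the fraction of all correct predictions that are accepted, and specificity is the fraction of accepted predictions that are correct. The function name and the input layout are hypothetical, and tied confidence values are not treated specially.

```python
def specificity_sensitivity_curve(predictions):
    """Return (threshold, sensitivity, specificity) points.

    predictions: list of (confidence, is_correct) pairs, one per query,
    where higher confidence means a more trusted prediction.
    """
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    total_correct = sum(1 for _, ok in ranked if ok)
    points = []
    accepted_correct = 0
    # Accept predictions one at a time, from most to least confident.
    for accepted, (confidence, ok) in enumerate(ranked, start=1):
        if ok:
            accepted_correct += 1
        sensitivity = accepted_correct / total_correct if total_correct else 0.0
        specificity = accepted_correct / accepted
        points.append((confidence, sensitivity, specificity))
    return points


if __name__ == "__main__":
    demo = [(0.9, True), (0.8, False), (0.7, True)]  # made-up data
    for point in specificity_sensitivity_curve(demo):
        print(point)
```

Under these assumptions, a better confidence measure yields a curve that keeps specificity high while sensitivity increases.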

By using the score gap functions as the measure of confidence we improve the more powerful fold recognition methods for which p-values are unavailable.
Availability: The benchmark set is available upon request.