Keywords:
-
Abstract:
High sequence identity between two proteins (e.g. > 60%) is strong evidence
for high structural similarity. However, internal shifts in one of the two
proteins can sometimes give rise to unexpectedly high structural differences.
This, in turn, causes unreliable structure predictions when two such proteins
are used in homology modeling. Here, we perform a computational analysis of
helix shifts and we show that their occurrence can be predicted with
statistical learning methods.
Our results indicate that helix shifts increase the RMS error by a factor of 2.6
compared to protein pairs without a helix shift. Although helix shifts
are rare (1.6% of helices and a commensurately higher number of proteins are
affected), they therefore pose a significant problem for reliable structure
prediction systems. In this paper, we prototype a new approach for model
quality assessment and demonstrate that it can successfully warn against helix
shifts. A support vector machine trained on a wide range of sequence and
structure properties predicts the occurrence of helix shifts with a sensitivity
of 74.2% and a specificity of 83.6%. On an equalized test dataset, this
corresponds to an accuracy of 78.9%. Projected to the full dataset, it
translates to an accuracy of 83.4%.
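
The two accuracy figures can be related to the stated sensitivity and specificity by a short calculation. The sketch below is an illustration, not the authors' evaluation code. It assumes that accuracy is the prevalence-weighted mean of sensitivity and specificity, that the equalized test set contains 50% positives, and that the prevalence on the full dataset equals the 1.6% helix shift rate quoted above. The function name `accuracy` is hypothetical.

```python
def accuracy(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Overall accuracy as the prevalence-weighted mean of
    sensitivity (true positive rate) and specificity (true negative rate)."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity


# Values taken from the abstract.
SENSITIVITY = 0.742
SPECIFICITY = 0.836

# Assumption: the equalized test set has equal numbers of shifted and unshifted helices.
balanced = accuracy(SENSITIVITY, SPECIFICITY, prevalence=0.5)

# Assumption: 1.6% of helices in the full dataset are shifted.
projected = accuracy(SENSITIVITY, SPECIFICITY, prevalence=0.016)

print(f"equalized test set: {balanced:.1%}")
print(f"full dataset:       {projected:.1%}")
```

Under these assumptions the calculation reproduces the reported 78.9% and 83.4%. The projected accuracy is higher because unshifted helices dominate the full dataset, so the specificity carries almost all of the weight.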
Our analysis shows that helix shift detection is a valuable building block for
highly reliable structure prediction systems. Furthermore, the
statistical-learning-based approach to helix shift detection that we employ here is
orthogonal to well-established model quality assessment methods (which use
geometric constraint checking or mean force potentials). Therefore, a further
increase of prediction accuracy is expected from the combination of these
methods.