T-cell recognition is a critical step in regulating the immune response. Activation
of cytotoxic T-cells requires MHC class I molecules to form complexes with specific
peptides and present them on the cell surface. Identification of
potential ligands to MHC is therefore important for understanding disease
pathogenesis and aiding vaccine design.
Despite years of effort in the field, reliable prediction of MHC ligands
remains a difficult task. It is reported that only one out of 100 to 200
potential binders actually binds. Methods based on sequence data alone are fast
but fail to capture all binding patterns, while structure-based methods are
more promising but far too slow for large-scale screening of protein sequences.
In this work, we propose a new approach to the prediction problem. It is based on
the assumption that peptide binding is the aggregate effect of contributions
from the independent binding of individual residues. The compatibility of each
amino acid with the MHC binding pockets is examined thoroughly by molecular
dynamics simulation.
Values of the energy terms important for binding are collected from the generated
ensembles and used to produce an allele-specific scoring matrix. Each
entry in this matrix represents how favorable an amino acid is at a given
binding position with respect to a particular "feature". Prediction models based on
machine learning techniques are then trained to discriminate binders from
non-binders.
Our method is compared with two other sequence-based methods on HLA-A*0201
9-mer sequences. Three publicly available data sets are used: MHCPEP,
SYFPEITHI, and the HXB2 genome. Overall, our method improves prediction
accuracy and achieves higher specificity. Its robustness to
different sizes and ratios of training data demonstrates that it can provide
reliable predictions while depending less on sequence data. The method also
generalizes better in cross-allele predictions. For predicting
the bound conformations of peptides, our preliminary approach based on energy
minimization gives a satisfactory backbone RMSD of 1.7 to 1.88 Å relative to
the crystal structures.
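
The independence assumption described above can be illustrated with a minimal
sketch, in which a 9-mer is scored as the sum of per-position matrix entries.
Everything here is hypothetical: the function names, the single scalar per
entry, and the placeholder matrix values are illustrative assumptions and are
not derived from molecular dynamics simulation. In the method itself, several
energy-based features per entry are passed to a trained machine learning model
rather than simply summed.

```python
# Minimal sketch of additive, position-independent peptide scoring.
# All names and numeric values are illustrative assumptions.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PEPTIDE_LENGTH = 9


def make_toy_matrix():
    """Build a placeholder 9 x 20 matrix: one dict per peptide position.

    Entries are 0.0 except for a few arbitrary bonuses at positions 2 and 9.
    These numbers are made up; they stand in for simulation-derived values.
    """
    matrix = [{aa: 0.0 for aa in AMINO_ACIDS} for _ in range(PEPTIDE_LENGTH)]
    matrix[1]["L"] = 1.0  # position 2 (0-based index 1)
    matrix[1]["M"] = 0.5
    matrix[8]["V"] = 1.0  # position 9 (0-based index 8)
    matrix[8]["L"] = 0.5
    return matrix


def score_peptide(peptide, matrix):
    """Sum the matrix entry of each residue at its own position."""
    if len(peptide) != len(matrix):
        raise ValueError("peptide length must match the matrix length")
    return sum(matrix[i][aa] for i, aa in enumerate(peptide))


def scan_protein(sequence, matrix):
    """Score every overlapping window of a protein sequence.

    Returns a list of (start_index, peptide, score) tuples.
    """
    n = len(matrix)
    return [
        (start, sequence[start:start + n],
         score_peptide(sequence[start:start + n], matrix))
        for start in range(len(sequence) - n + 1)
    ]


if __name__ == "__main__":
    toy = make_toy_matrix()
    for start, peptide, score in scan_protein("AGLAAAAAAVA", toy):
        print(start, peptide, score)
```

Because each position contributes independently, scoring a peptide costs one
table lookup per residue, which is what makes this kind of matrix suitable for
scanning whole protein sequences.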