Graph Kernels for Chemical Informatics

Ralaivola, L; Swamidass, JS; Saigo, H; Baldi, P

doi:10.1016/j.neunet.2005.07.009

Released

Journal Article

MPG Authors

No MPG authors are listed for this publication

External Resources

https://www.sciencedirect.com/science/article/pii/S0893608005001693
(Publisher version)

Full Texts (open access)

No open-access full texts are available in PuRe

Supplementary Material (open access)

No open-access supplementary materials are available

Citation

Ralaivola, L., Swamidass, J., Saigo, H., & Baldi, P. (2005). Graph Kernels for Chemical Informatics. Neural Networks, 18(8), 1093-1110. doi:10.1016/j.neunet.2005.07.009.

Citation link: https://hdl.handle.net/11858/00-001M-0000-0013-D6AB-5

Abstract

Increased availability of large repositories of chemical compounds is creating new
challenges and opportunities for the application of machine learning methods to
problems in computational chemistry and chemical informatics. Because chemical
compounds are often represented by the graph of their covalent bonds, machine
learning methods in this domain must be capable of processing graphical structures
with variable size. Here we first briefly review the literature on graph kernels and
then introduce three new kernels (Tanimoto, MinMax, Hybrid) based on the idea
of molecular fingerprints and counting labeled paths of depth up to d using depth-first
search from each possible vertex. The kernels are applied to three classification
problems to predict mutagenicity, toxicity, and anti-cancer activity on three publicly
available data sets. The kernels achieve performances at least comparable, and most
often superior, to those previously reported in the literature, reaching accuracies of
91.5% on the Mutag dataset, 65-67% on the PTC (Predictive Toxicology Challenge)
dataset, and 72% on the NCI (National Cancer Institute) dataset. Properties and
tradeoffs of these kernels, as well as other proposed kernels that leverage 1D or 3D
representations of molecules, are briefly discussed.
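
The path-counting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a molecule is given as a list of atom labels plus a dictionary of bond labels (a hypothetical toy encoding), that only simple paths (no repeated atoms) are counted, that the Tanimoto kernel is computed on path presence, and that the MinMax kernel is computed on path counts. The Hybrid kernel is left out because the abstract does not define it. All function and variable names are illustrative.

```python
from collections import Counter


def path_counts(atoms, bonds, d):
    """Count labeled simple paths with at most d bonds, starting a
    depth-first search from every atom (assumed toy encoding)."""
    # Adjacency list: atom index -> list of (neighbor index, bond label).
    adj = {i: [] for i in range(len(atoms))}
    for (i, j), label in bonds.items():
        adj[i].append((j, label))
        adj[j].append((i, label))

    counts = Counter()

    def dfs(v, visited, labels, depth):
        # Record the path that ends at the current atom.
        counts[tuple(labels)] += 1
        if depth == d:
            return
        for w, bond in adj[v]:
            if w not in visited:  # assumption: simple paths only
                dfs(w, visited | {w}, labels + [bond, atoms[w]], depth + 1)

    for start in range(len(atoms)):
        dfs(start, {start}, [atoms[start]], 0)
    return counts


def tanimoto_kernel(cu, cv):
    """Shared paths divided by paths present in either molecule."""
    su, sv = set(cu), set(cv)
    union = len(su | sv)
    return len(su & sv) / union if union else 0.0


def minmax_kernel(cu, cv):
    """Sum of per-path minimum counts over sum of per-path maximum counts."""
    keys = set(cu) | set(cv)
    num = sum(min(cu[k], cv[k]) for k in keys)
    den = sum(max(cu[k], cv[k]) for k in keys)
    return num / den if den else 0.0


# Heavy-atom skeletons of ethanol (C-C-O) and acetaldehyde (C-C=O).
ethanol = path_counts(["C", "C", "O"], {(0, 1): "-", (1, 2): "-"}, d=2)
acetaldehyde = path_counts(["C", "C", "O"], {(0, 1): "-", (1, 2): "="}, d=2)

print(tanimoto_kernel(ethanol, acetaldehyde))
print(minmax_kernel(ethanol, acetaldehyde))
```

For these two toy graphs with d = 2, each molecule yields seven distinct labeled paths, three of which are shared, so the Tanimoto value is 3/11 and the MinMax value is 5/13. Either function could be used to fill a Gram matrix for a kernel classifier such as a support vector machine.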