Journal Article

Decision tree supported substructure prediction of metabolites from GC-MS profiles

MPS-Authors
Hummel,  J.
BioinformaticsCRG, Cooperative Research Groups, Max Planck Institute of Molecular Plant Physiology, Max Planck Society;
BioinformaticsCIG, Infrastructure Groups and Service Units, Max Planck Institute of Molecular Plant Physiology, Max Planck Society;

Strehmel,  N.
Applied Metabolome Analysis, Department Willmitzer, Max Planck Institute of Molecular Plant Physiology, Max Planck Society;

Selbig,  J.
BioinformaticsCRG, Cooperative Research Groups, Max Planck Institute of Molecular Plant Physiology, Max Planck Society;

Walther,  D.
BioinformaticsCIG, Infrastructure Groups and Service Units, Max Planck Institute of Molecular Plant Physiology, Max Planck Society;

Kopka,  J.
Applied Metabolome Analysis, Department Willmitzer, Max Planck Institute of Molecular Plant Physiology, Max Planck Society;

Citation

Hummel, J., Strehmel, N., Selbig, J., Walther, D., & Kopka, J. (2010). Decision tree supported substructure prediction of metabolites from GC-MS profiles. Metabolomics, 6(2), 322-333. doi:10.1007/s11306-010-0198-7.


Cite as: https://hdl.handle.net/11858/00-001M-0000-0014-23E9-B
Abstract
Gas chromatography coupled to mass spectrometry (GC-MS) is one of the most widespread routine technologies applied to the large-scale screening and discovery of novel metabolic biomarkers. However, the majority of mass spectral tags (MSTs) currently remain unidentified because the authenticated pure reference substances required for compound identification by GC-MS are lacking. Here, we used the information on reference compounds stored in the Golm Metabolome Database (GMD) to apply supervised machine learning approaches to the classification and identification of unidentified MSTs without relying on library searches. Non-annotated MSTs with mass spectral and retention index (RI) information, together with data on already identified metabolites and reference substances, have been archived in the GMD. Structural feature extraction was applied to subdivide the metabolite space contained in the GMD and to define the prediction target classes. Decision tree (DT)-based prediction of the most frequent substructures from mass spectral features and RI information is demonstrated to result in highly sensitive and specific detection of substructures contained in the compounds. The underlying set of DTs can be inspected by the user and is made available for batch processing via SOAP (Simple Object Access Protocol)-based web services. The GMD mass spectral library with the integrated DTs is freely accessible for non-commercial use at http://gmd.mpimp-golm.mpg.de/. All matching and structure search functionalities are available as SOAP-based web services. An XML + HTTP interface, which follows Representational State Transfer (REST) principles, facilitates read-only access to database entities.
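
To make the idea of DT-based substructure prediction concrete, the following is a minimal sketch in Python, not the authors' implementation. It grows a small Gini-based binary decision tree on invented toy data and reports sensitivity and specificity for one hypothetical substructure class. The feature layout (two relative ion intensities plus an RI value), the chosen m/z channels, all numeric values, and every function name are assumptions made purely for illustration; none of them comes from the GMD or from the article.

```python
from collections import Counter

# Toy, made-up training data (NOT taken from the GMD). Each row is a
# hypothetical feature vector:
# [relative intensity at m/z 73, relative intensity at m/z 147, retention index].
# Label 1 means the hypothetical target substructure is present, 0 means absent.
X = [
    [0.90, 0.40, 1500.0],
    [0.80, 0.55, 1620.0],
    [0.85, 0.10, 1580.0],
    [0.20, 0.05, 1100.0],
    [0.10, 0.60, 1250.0],
    [0.30, 0.02, 2100.0],
    [0.95, 0.50, 1900.0],
    [0.15, 0.08, 1700.0],
]
y = [1, 1, 1, 0, 0, 0, 1, 0]


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return the (feature index, threshold) pair that lowers the weighted
    Gini impurity the most, or None if no split improves on the parent."""
    best, best_score = None, gini(y)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        # Candidate thresholds are midpoints between neighbouring values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(X, y, depth=0, max_depth=3):
    """Recursively grow a binary tree; leaves hold the majority class label."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return Counter(y).most_common(1)[0][0]
    f, t = split
    left = [(row, lab) for row, lab in zip(X, y) if row[f] <= t]
    right = [(row, lab) for row, lab in zip(X, y) if row[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [l for _, l in left],
                           depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [l for _, l in right],
                            depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the tree from the root to a leaf and return the leaf label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


tree = build_tree(X, y)
predictions = [predict(tree, row) for row in X]
sensitivity, specificity = sensitivity_specificity(y, predictions)
print(tree)
print("sensitivity:", sensitivity, "specificity:", specificity)
```

Because the tree is a plain nested dictionary of feature indices and thresholds, it can be printed and read directly, which mirrors the point that the DTs can be inspected by the user. The scores here are computed on the training rows themselves and are therefore optimistic; a realistic evaluation would use held-out data or cross-validation.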