de.mpg.escidoc.pubman.appbase.FacesBean
English
 
Help Guide Privacy Policy Disclaimer Contact us
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT

Released

Conference Paper

Multilingual Text Classification using Ontologies

MPS-Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons44292

de Melo,  Gerard
Databases and Information Systems, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons45482

Siersdorfer,  Stefan
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Locator
There are no locators available
Fulltext (public)
There are no public fulltexts available
Supplementary Material (public)
There is no public supplementary material available
Citation

de Melo, G., & Siersdorfer, S. (2007). Multilingual Text Classification using Ontologies. In G. Amati, C. Carpineto, & G. Romano (Eds.), Advances in Information Retrieval: 29th European Conference on IR Research, ECIR 2007 (pp. 541-548). Berlin, Germany: Springer.


Cite as: http://hdl.handle.net/11858/00-001M-0000-000F-1FF1-8
Abstract
In this paper, we investigate strategies for automatically classifying documents in different languages thematically, geographically or according to other criteria. A novel linguistically motivated text representation scheme is presented that can be used with machine learning algorithms in order to learn classifications from pre-classified examples and then automatically classify documents that might be provided in entirely different languages. Our approach makes use of ontologies and lexical resources but goes beyond a simple mapping from terms to concepts by fully exploiting the external knowledge manifested in such resources and mapping to entire regions of concepts. For this, a graph traversal algorithm is used to explore related concepts that might be relevant. Extensive testing has shown that our methods lead to significant improvements compared to existing approaches.