
Released

Conference Paper

Learning Word-to-Concept Mappings for Automatic Text Classification

MPG Authors

Ifrim, Georgiana
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Theobald, Martin
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Weikum, Gerhard
Databases and Information Systems, MPI for Informatics, Max Planck Society;

De Raedt, Luc
Max Planck Society;

Wrobel, Stefan
Max Planck Society;

Citation

Ifrim, G., Theobald, M., & Weikum, G. (2005). Learning Word-to-Concept Mappings for Automatic Text Classification. In Proceedings of the 22nd International Conference on Machine Learning - Learning in Web Search (LWS 2005) (pp. 18-26). Bonn, Germany: ICMLW4-LWS2005.


Citation link: https://hdl.handle.net/11858/00-001M-0000-000F-26F0-F
Abstract
For both classification and retrieval of natural language text documents, the standard document representation is a term vector, where a term is simply a morphological normal form of the corresponding word. A potentially better approach is to map every word onto a concept, that is, its proper word sense, and to use this additional information in the learning process. In this paper we address the problem of automatically classifying natural language text documents. We investigate the effect of word-to-concept mappings and word sense disambiguation techniques on classification accuracy. We use the WordNet thesaurus as a background knowledge base and propose a generative language model approach to document classification. We show experimental results comparing the performance of our model with Naive Bayes and SVM classifiers.
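
To make the contrast between a term vector and a concept-based representation concrete, the following is a minimal sketch in Python. It is not the model proposed in the paper: the tiny `LEXICON`, the concept identifiers, and the overlap-based sense selection are all hypothetical stand-ins for WordNet and for a real word sense disambiguation step, chosen only so that the example runs without external data.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> {concept id: cue words for that sense}.
# It stands in for a thesaurus such as WordNet; all entries are invented.
LEXICON = {
    "bank": {
        "bank.finance": {"money", "loan", "account", "deposit"},
        "bank.river": {"river", "water", "shore", "fishing"},
    },
}


def term_vector(tokens):
    """Standard representation: counts of (already normalized) terms."""
    return Counter(tokens)


def map_to_concept(word, context):
    """Pick the sense whose cue words overlap most with the context.

    Words that are missing from the lexicon are returned unchanged.
    """
    senses = LEXICON.get(word)
    if not senses:
        return word
    # sorted() makes tie-breaking deterministic (alphabetical by concept id).
    return max(sorted(senses), key=lambda concept: len(senses[concept] & context))


def concept_vector(tokens):
    """Concept-based representation: counts of disambiguated concepts."""
    context = set(tokens)
    return Counter(map_to_concept(t, context - {t}) for t in tokens)


doc_a = "the bank approved the loan".split()
doc_b = "fishing from the river bank".split()

# In the term vectors both documents share the ambiguous feature "bank";
# in the concept vectors the two occurrences become distinct features.
print(term_vector(doc_a)["bank"], term_vector(doc_b)["bank"])
print(sorted(concept_vector(doc_a)))
print(sorted(concept_vector(doc_b)))
```

In this sketch, a classifier trained on the concept vectors would no longer treat the financial and the river sense of "bank" as the same feature, which is the kind of additional information the abstract refers to.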