de.mpg.escidoc.pubman.appbase.FacesBean
English
 
Help Guide Disclaimer Contact us Login
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT

Released

Report

Towards a Universal Wordnet by Learning from Combined Evidenc

MPS-Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons44292

de Melo,  Gerard
Databases and Information Systems, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons45720

Weikum,  Gerhard
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Locator
There are no locators available
Fulltext (public)

mpi-i-2009-5-005.pdf
(Any fulltext), 718KB

Supplementary Material (public)
There is no public supplementary material available
Citation

de Melo, G., & Weikum, G.(2009). Towards a Universal Wordnet by Learning from Combined Evidenc (MPI-I-2009-5-005). Saarbrücken: Max-Planck-Institut für Informatik.


Cite as: http://hdl.handle.net/11858/00-001M-0000-0014-665C-5
Abstract
Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification.