Efficient Knowledge Management for Named Entities from Text

Dutta, Sourav

doi:10.22028/D291-26701

Dutta, S. (2017). Efficient Knowledge Management for Named Entities from Text. PhD Thesis, Universität des Saarlandes, Saarbrücken.

Item is released

Basic information

Item permalink: https://hdl.handle.net/11858/00-001M-0000-002C-A793-E
Version permalink: https://hdl.handle.net/21.11116/0000-000C-6C99-F

Document type: Thesis

Creators

Creators:
Dutta, Sourav¹,², Author
Weikum, Gerhard¹, Advisor
Nejdl, Wolfgang³, Referee
Berberich, Klaus¹, Referee

Affiliations:
1 Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018
2 International Max Planck Research School, MPI for Informatics, Max Planck Society, Campus E1 4, 66123 Saarbrücken, DE, ou_1116551
3 External Organizations, ou_persistent22

Content

Keywords: -

Abstract: The evolution of search from keywords to entities has necessitated the efficient harvesting and management of entity-centric information for constructing knowledge bases catering to various applications such as semantic search, question answering, and information retrieval. The vast amounts of natural language texts available across diverse domains on the Web provide rich sources for discovering facts about named entities such as people, places, and organizations.

A key challenge, in this regard, entails the need for precise identification and disambiguation of entities across documents for extraction of attributes/relations and their proper representation in knowledge bases. Additionally, the applicability of such repositories not only involves the quality and accuracy of the stored information, but also storage management and query processing efficiency. This dissertation aims to tackle the above problems by presenting efficient approaches for entity-centric knowledge acquisition from texts and its representation in knowledge repositories.

This dissertation presents a robust approach for identifying text phrases pertaining to the same named entity across huge corpora, and their disambiguation to canonical entities present in a knowledge base, by using enriched semantic contexts and link validation encapsulated in a hierarchical clustering framework. This work further presents language and consistency features for classification models to compute the credibility of obtained textual facts, ensuring quality of the extracted information. Finally, an encoding algorithm, using frequent term detection and improved data locality, to represent entities for enhanced knowledge base storage and query performance is presented.

Details

Language: eng - English

Dates: Submitted: 2016; Accepted: 2017-03-09; Published online: 2017-03-10; Published: 2017

Publication status: Published

Pages: xv, 134 p.

Publishing info: Saarbrücken : Universität des Saarlandes

Table of contents: -

Peer review: -

Identifiers (DOI, ISBN, etc.):
BibTeX citation key: duttaphd17
URN: urn:nbn:de:bsz:291-scidok-67924
DOI: 10.22028/D291-26701
Other: hdl:20.500.11880/26757

Degree: PhD
