Efficient and Self-Tuning Incremental Query Expansion for Top-k Query Processing

Theobald, Martin; Schenkel, Ralf; Weikum, Gerhard; Baeza-Yates, Ricardo A.; Ziviani, Nivio; Marchionini, Gary; Moffat, Alistair; Tait, John

Theobald, M., Schenkel, R., & Weikum, G. (2005). Efficient and Self-Tuning Incremental Query Expansion for Top-k Query Processing. In 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2005) (pp. 242-249). New York, USA: ACM.

Item is Released

Basic information

Item permalink: https://hdl.handle.net/11858/00-001M-0000-000F-2659-5
Version permalink: https://hdl.handle.net/11858/00-001M-0000-000F-265A-3

Genre: Conference Paper

Files

TheobaldSTW.pdf (Any fulltext), 521KB

File permalink:
-

File name:
TheobaldSTW.pdf

Description:
-

OA-Status:

Visibility:
Private

MIME type / checksum:
application/pdf

Technical metadata:

Copyright date:
-

Copyright info:
-

CC license:
-

Creators

Creators:
Theobald, Martin¹, Author
Schenkel, Ralf¹, Author
Weikum, Gerhard¹, Author
Baeza-Yates, Ricardo A., Editor
Ziviani, Nivio, Editor
Marchionini, Gary, Editor
Moffat, Alistair, Editor
Tait, John, Editor

Affiliations:
¹ Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018

Content

Keywords: -

Abstract: We present a novel approach for efficient and self-tuning query expansion that is embedded into a top-k query processor with candidate pruning. Traditional query expansion methods select expansion terms whose thematic similarity to the original query terms is above some specified threshold, thus generating a disjunctive query with much higher dimensionality. This poses three major problems: 1) the need for hand-tuning the expansion threshold, 2) the potential topic dilution with overly aggressive expansion, and 3) the drastically increased execution cost of a high-dimensional query. The method developed in this paper addresses all three problems by dynamically and incrementally merging the inverted lists for the potential expansion terms with the lists for the original query terms. A priority queue is used for maintaining result candidates, the pruning of candidates is based on Fagin's family of top-k algorithms, and optionally probabilistic estimators of candidate scores can be used for additional pruning. Experiments on the TREC collections for the 2004 Robust and Terabyte tracks demonstrate the increased efficiency, effectiveness, and scalability of our approach.
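
Illustrative note (not part of the record): the abstract describes lazily merging the inverted lists of expansion terms with those of the original query terms and pruning candidates in the style of Fagin's top-k algorithms. The Python sketch below shows one way such a scheme could look; it is not the authors' implementation. All names (`merged_expansion_list`, `top_k`) are hypothetical. It assumes that each posting list is sorted by descending score, that similarities are positive, that a document's score for an expanded term is its highest similarity-weighted score over the expansions, and that per-term scores are summed. It recomputes the candidate ranking each round with `heapq.nlargest` instead of maintaining a persistent priority queue, and it omits the probabilistic estimators.

```python
import heapq


def merged_expansion_list(lists_with_sim):
    """Lazily merge the posting lists of one query term and its expansions.

    lists_with_sim: list of (similarity, postings); postings is a list of
    (doc_id, score) sorted by descending score (assumed input layout).
    Yields (weighted_score, doc_id) in descending order. Each document is
    emitted once, at its highest weighted score.
    """
    def weighted(sim, postings):
        for doc, score in postings:
            yield (sim * score, doc)

    streams = [weighted(sim, postings) for sim, postings in lists_with_sim]
    emitted = set()
    # heapq.merge pulls from each stream only as far as needed.
    for score, doc in heapq.merge(*streams, key=lambda e: e[0], reverse=True):
        if doc not in emitted:
            emitted.add(doc)
            yield (score, doc)


def top_k(virtual_lists, k):
    """Sorted-access-only top-k over virtual lists (one per query term).

    Returns up to k (doc_id, lower_bound_score) pairs. Scanning stops once
    no other candidate and no unseen document can still overtake the
    current k-th lower bound, so returned scores may be partial sums.
    """
    iters = [iter(v) for v in virtual_lists]
    m = len(iters)
    high = [float("inf")] * m      # upper bound on unread scores per list
    seen = {}                      # doc_id -> {list index: score}
    active = set(range(m))

    def ranked():
        worst = {d: sum(s.values()) for d, s in seen.items()}
        return worst, heapq.nlargest(k, worst.items(), key=lambda kv: kv[1])

    while active:
        for i in sorted(active):   # one round-robin pass over open lists
            entry = next(iters[i], None)
            if entry is None:      # list exhausted: contributes nothing more
                high[i] = 0.0
                active.discard(i)
                continue
            score, doc = entry
            high[i] = score
            seen.setdefault(doc, {})[i] = score

        worst, top = ranked()
        if len(top) < k:
            continue
        min_k = top[-1][1]
        top_docs = {d for d, _ in top}
        # Best possible final score of any candidate outside the top k.
        best_other = max(
            (worst[d] + sum(high[i] for i in range(m) if i not in seen[d])
             for d in seen if d not in top_docs),
            default=0.0,
        )
        # sum(high) bounds the score of a document not seen in any list.
        if min_k >= max(best_other, sum(high)):
            break

    return ranked()[1]
```

In this sketch an original query term is passed as a one-element group with similarity 1.0, so original and expanded terms share the same interface; because the expansion lists are merged on demand, low-similarity expansions are only read if the scan gets that far.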

Details

Language(s): eng - English

Dates: Modified: 2006-04-14; Published: 2005

Publication status: Published

Pages: -

Publishing info: New York, USA : ACM

Table of contents: -

Peer review: -

Identifiers (DOI, ISBN, etc.): eDoc: 278885
Other: Local-ID: C1256DBF005F876D-5AD4A59C97D8DFFAC1256FE0002DD302-TheobaldSW05

Degree: -

Legal case

Project information

Source 1

Title: 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2005)

Source genre: Proceedings

Creators:

Affiliations:

Publisher, place: New York, USA : ACM

Pages: -
Volume / issue: -
Sequence number: -
Start / end page: 242 - 249
Identifiers (ISBN, ISSN, DOI, etc.): ISBN: 1-59593-034-5
