  Efficient and Self-Tuning Incremental Query Expansion for Top-k Query Processing

Theobald, M., Schenkel, R., & Weikum, G. (2005). Efficient and Self-Tuning Incremental Query Expansion for Top-k Query Processing. In 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2005) (pp. 242-249). New York, USA: ACM.


Files

TheobaldSTW.pdf (Any fulltext), 521KB
 
File Permalink: -
Name: TheobaldSTW.pdf
Description: -
OA-Status:
Visibility: Private
MIME-Type / Checksum: application/pdf
Technical Metadata:
Copyright Date: -
Copyright Info: -
License: -

Creators

Creators:
Theobald, Martin (1), Author
Schenkel, Ralf (1), Author
Weikum, Gerhard (1), Author
Baeza-Yates, Ricardo A., Editor
Ziviani, Nivio, Editor
Marchionini, Gary, Editor
Moffat, Alistair, Editor
Tait, John, Editor
Affiliations:
(1) Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018

Content

Free keywords: -
 Abstract: We present a novel approach for efficient and self-tuning query expansion that is embedded into a top-k query processor with candidate pruning. Traditional query expansion methods select expansion terms whose thematic similarity to the original query terms is above some specified threshold, thus generating a disjunctive query with much higher dimensionality. This poses three major problems: 1) the need for hand-tuning the expansion threshold, 2) the potential topic dilution with overly aggressive expansion, and 3) the drastically increased execution cost of a high-dimensional query. The method developed in this paper addresses all three problems by dynamically and incrementally merging the inverted lists for the potential expansion terms with the lists for the original query terms. A priority queue is used for maintaining result candidates, the pruning of candidates is based on Fagin's family of top-k algorithms, and optionally probabilistic estimators of candidate scores can be used for additional pruning. Experiments on the TREC collections for the 2004 Robust and Terabyte tracks demonstrate the increased efficiency, effectiveness, and scalability of our approach.
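The incremental merging of inverted lists mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: each inverted list is sorted by descending score, scores lie in [0, 1], the combined score of a posting is the expansion term's similarity times its score, and a document reached through several expansion terms keeps only its best combined score. The names `incremental_merge` and `Posting` are hypothetical.

```python
import heapq
from typing import Iterator, List, Set, Tuple

# (doc_id, score); every list is assumed sorted by descending score in [0, 1].
Posting = Tuple[str, float]


def incremental_merge(
    expansions: List[Tuple[float, List[Posting]]],
) -> Iterator[Tuple[str, float]]:
    """Yield (doc_id, similarity * score) in non-increasing score order.

    `expansions` holds one (similarity, inverted_list) pair per expansion
    term. A list is opened only when its upper bound (its similarity,
    because scores are assumed to be at most 1) could beat the best
    posting already in the queue.
    """
    # Visit unopened lists in descending order of their upper bound.
    pending = sorted(
        range(len(expansions)), key=lambda i: expansions[i][0], reverse=True
    )
    next_pending = 0
    heap: List[Tuple[float, int, int]] = []  # (-combined, list index, position)
    seen: Set[str] = set()

    while True:
        # Open further lists only while they might hold the next-best posting.
        while next_pending < len(pending):
            i = pending[next_pending]
            sim, plist = expansions[i]
            if heap and -heap[0][0] >= sim:
                break  # no unopened list can beat the current queue head
            next_pending += 1
            if plist:
                heapq.heappush(heap, (-sim * plist[0][1], i, 0))
        if not heap:
            return

        neg_score, i, pos = heapq.heappop(heap)
        sim, plist = expansions[i]
        if pos + 1 < len(plist):
            # Advance the cursor of the list that just produced a posting.
            heapq.heappush(heap, (-sim * plist[pos + 1][1], i, pos + 1))

        doc_id = plist[pos][0]
        if doc_id not in seen:  # first occurrence carries the best score
            seen.add(doc_id)
            yield doc_id, -neg_score
```

A top-k processor could consume this generator lazily, for example with `itertools.islice(incremental_merge(lists), k)`, so that lists for weakly related expansion terms are never opened if the top-k result is settled earlier. The threshold-based candidate pruning and the probabilistic score estimators described in the abstract are not modeled here.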

Details

Language(s): eng - English
Dates: 2006-04-14; 2005
 Publication Status: Issued
 Pages: -
 Publishing info: New York, USA : ACM
 Table of Contents: -
 Rev. Type: -
 Identifiers: eDoc: 278885
Other: Local-ID: C1256DBF005F876D-5AD4A59C97D8DFFAC1256FE0002DD302-TheobaldSW05
 Degree: -

Event

Title: Untitled Event
Place of Event: Salvador, Brazil
Start-/End Date: 2005-08-15

Source 1

Title: 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2005)
Source Genre: Proceedings
 Creator(s):
Affiliations:
Publ. Info: New York, USA : ACM
Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: 242 - 249
Identifier: ISBN: 1-59593-034-5