de.mpg.escidoc.pubman.appbase.FacesBean
English
 
Help Guide Disclaimer Contact us Login
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT

Released

Conference Paper

Output-Sensitive Autocompletion Search

MPS-Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons44076

Bast,  Holger
Algorithms and Complexity, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons45065

Mortensen,  Christian Worm
Algorithms and Complexity, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons45715

Weber,  Ingmar
Algorithms and Complexity, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons44412

Ferragina,  Paolo
Algorithms and Complexity, MPI for Informatics, Max Planck Society;

Locator
There are no locators available
Fulltext (public)
There are no public fulltexts available
Supplementary Material (public)
There is no public supplementary material available
Citation

Bast, H., Mortensen, C. W., & Weber, I. (2006). Output-Sensitive Autocompletion Search. In String Processing and Information Retrieval : 13th International Conference, SPIRE 2006 (pp. 150-162). Berlin, Germany: Springer.


Cite as: http://hdl.handle.net/11858/00-001M-0000-000F-23AA-7
Abstract
We consider the following autocompletion search scenario: imagine a user of a search engine typing a query; then with every keystroke display those completions of the last query word that would lead to the best hits, and also display the best such hits. The following problem is at the core of this feature: for a fixed document collection, given a set D of documents, and an alphabetical range W of words, compute the set of all word-in-document pairs (w ,d) from the collection such that w ∈W and d∈D. We present a new data structure with the help of which such autocompletion queries can be processed, on the average, in time linear in the input plus output size, independent of the size of the underlying document collection. At the same time, our data structure uses no more space than an inverted index. Actual query processing times on a large test collection correlate almost perfectly with our theoretical bound.