Conference Paper

Discovering and Exploiting Keyword and Attribute-Value Co-occurrences to Improve P2P Routing Indices

MPS-Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons45041

Michel,  Sebastian
Databases and Information Systems, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons44113

Bender,  Matthias
Databases and Information Systems, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons45636

Triantafillou,  Peter
Databases and Information Systems, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons45720

Weikum,  Gerhard
Databases and Information Systems, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons45808

Zimmer,  Christian
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Citation

Michel, S., Bender, M., Ntarmos, N., Triantafillou, P., Weikum, G., & Zimmer, C. (2006). Discovering and Exploiting Keyword and Attribute-Value Co-occurrences to Improve P2P Routing Indices. In ACM 15th Conference on Information and Knowledge Management (CIKM2006) (pp. 172-181). New York, USA: ACM.


Cite as: http://hdl.handle.net/11858/00-001M-0000-000F-2293-1
Abstract
Peer-to-Peer (P2P) search requires intelligent decisions for query routing: selecting the best peers to which a given query, initiated at some peer, should be forwarded for retrieving additional search results. These decisions are based on statistical summaries for each peer, which are usually organized on a per-keyword basis and managed in a distributed directory of routing indices. Such architectures disregard the possible correlations among keywords. Together with the coarse granularity of per-peer summaries, which are mandated for scalability, this limitation may lead to poor search result quality. This paper develops and evaluates two solutions to this problem, sk-STAT based on single-key statistics only, and mk-STAT based on additional multi-key statistics. For both cases, hash sketch synopses are used to compactly represent a peer's data items and are efficiently disseminated in the P2P network to form a decentralized directory. Experimental studies with Gnutella and Web data demonstrate the viability and the trade-offs of the approaches.
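
The abstract does not spell out how the hash sketch synopses are built, so the following is only a minimal sketch of the general idea, assuming a Flajolet-Martin style construction with stochastic averaging. The class name `HashSketch`, the helper `estimate_overlap`, the use of SHA-1, and the parameter defaults are illustrative assumptions, not the paper's sk-STAT or mk-STAT. It shows two properties that make such synopses attractive for a distributed directory: sketches of two item sets can be merged with a bitwise OR, and a per-keyword co-occurrence count can be roughly approximated from single-key sketches by inclusion-exclusion.

```python
import hashlib

PHI = 0.77351  # Flajolet-Martin correction constant


class HashSketch:
    """Illustrative Flajolet-Martin style synopsis of a set of item ids."""

    def __init__(self, num_bitmaps=256, bits=32):
        # num_bitmaps should be a power of two so that bucket choice and
        # the remaining hash bits are independent.
        self.num_bitmaps = num_bitmaps
        self.bits = bits
        self.bitmaps = [0] * num_bitmaps

    def _hash(self, item):
        # 64-bit hash derived from SHA-1 (an arbitrary choice for this sketch).
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, item):
        h = self._hash(item)
        bucket = h % self.num_bitmaps
        rest = h // self.num_bitmaps
        # Index of the least-significant 1 bit of the remaining hash bits.
        rho = (rest & -rest).bit_length() - 1 if rest else self.bits - 1
        self.bitmaps[bucket] |= 1 << min(rho, self.bits - 1)

    def union(self, other):
        # The sketch of a set union is the bitwise OR of the two sketches.
        merged = HashSketch(self.num_bitmaps, self.bits)
        merged.bitmaps = [a | b for a, b in zip(self.bitmaps, other.bitmaps)]
        return merged

    def estimate(self):
        # Average the position of the lowest unset bit over all bitmaps.
        total = 0
        for bitmap in self.bitmaps:
            r = 0
            while bitmap & (1 << r):
                r += 1
            total += r
        return self.num_bitmaps / PHI * 2 ** (total / self.num_bitmaps)


def estimate_overlap(sketch_a, sketch_b):
    """Approximate |A and B| as |A| + |B| - |A or B| (noisy for small overlaps)."""
    union_size = sketch_a.union(sketch_b).estimate()
    return sketch_a.estimate() + sketch_b.estimate() - union_size
```

As a hypothetical usage, a peer could keep one `HashSketch` per keyword over the ids of its items containing that keyword, and `estimate_overlap` on two such sketches would then give a rough count of items containing both keywords. Because the estimate is a difference of noisy quantities, its relative error grows as the true overlap shrinks, which is one plausible motivation for maintaining dedicated multi-key statistics as well.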