Keywords:
-
Abstract:
Peer-to-Peer (P2P) search requires intelligent decisions for
{\em query routing}: selecting the best peers to which a given query,
initiated at some peer, should be forwarded for retrieving additional
search results. These decisions are based on statistical summaries
for each peer, which are usually organized on a per-keyword basis and
managed in a distributed directory of routing indices.
Such architectures disregard possible correlations among keywords.
Together with the coarse granularity of per-peer summaries, which
scalability mandates, this limitation may lead to poor search result quality.
This paper develops and evaluates two solutions to this problem:
{\em sk-STAT}, based on single-key statistics only, and {\em mk-STAT},
based on additional multi-key statistics. In both cases, hash sketch
synopses compactly represent a peer's data items and are efficiently
disseminated in the P2P network to form a decentralized
directory. Experimental studies with Gnutella and Web data demonstrate
the viability and the trade-offs of the approaches.
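
To make the role of hash sketches more concrete, the following is a minimal illustration in Python. It assumes a Flajolet-Martin-style construction with stochastic averaging; the abstract does not specify the synopsis layout, so the class name `HashSketch`, the parameters `num_bitmaps` and `width`, and the helper `estimate_overlap` are hypothetical and are not the paper's sk-STAT or mk-STAT procedures. The sketch only shows why such synopses suit a decentralized directory: two sketches merge by bitwise OR, and the overlap of two item sets can be roughly approximated by inclusion-exclusion.

```python
import hashlib

PHI = 0.77351  # Flajolet-Martin bias-correction constant


def _hash64(item: str) -> int:
    """Map an item to a 64-bit integer (SHA-1 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")


def _rho(x: int, width: int) -> int:
    """0-based position of the least-significant 1 bit, capped at width - 1."""
    if x == 0:
        return width - 1
    return min((x & -x).bit_length() - 1, width - 1)


class HashSketch:
    """Hypothetical hash sketch: num_bitmaps bitmaps of `width` bits each."""

    def __init__(self, num_bitmaps: int = 64, width: int = 32):
        self.m = num_bitmaps
        self.width = width
        self.bitmaps = [0] * num_bitmaps

    def add(self, item: str) -> None:
        h = _hash64(item)
        bucket = h % self.m   # which bitmap receives this item
        rest = h // self.m    # remaining hash bits choose the bit position
        self.bitmaps[bucket] |= 1 << _rho(rest, self.width)

    def union(self, other: "HashSketch") -> "HashSketch":
        """Sketch of the union of both item sets: a bitwise OR per bitmap."""
        merged = HashSketch(self.m, self.width)
        merged.bitmaps = [a | b for a, b in zip(self.bitmaps, other.bitmaps)]
        return merged

    def estimate(self) -> float:
        """Approximate number of distinct items added so far."""
        total = 0
        for bitmap in self.bitmaps:
            r = 0
            while bitmap & (1 << r):  # index of the lowest zero bit
                r += 1
            total += r
        return self.m / PHI * 2 ** (total / self.m)


def estimate_overlap(a: HashSketch, b: HashSketch) -> float:
    """Rough |A ∩ B| via inclusion-exclusion; the error can be large."""
    return a.estimate() + b.estimate() - a.union(b).estimate()


if __name__ == "__main__":
    peer_a, peer_b = HashSketch(), HashSketch()
    for i in range(20000):
        peer_a.add(f"doc{i}")
    for i in range(10000, 30000):
        peer_b.add(f"doc{i}")
    print("distinct items at peer A (approx.):", round(peer_a.estimate()))
    print("distinct items in A ∪ B (approx.): ", round(peer_a.union(peer_b).estimate()))
    print("overlap A ∩ B (rough):             ", round(estimate_overlap(peer_a, peer_b)))
```

Because the union is exact at the bitmap level, a directory node could combine synopses from many peers without seeing the underlying items. The inclusion-exclusion overlap, by contrast, inherits the estimation error of all three terms, so it should be read as a coarse signal only.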