Abstract:
PageRank-style (PR) link analyses are a cornerstone of Web
search engines and Web mining, but they are computationally
expensive. Recently, various techniques have been proposed
for speeding up these analyses by distributing the link
graph among multiple sites. However, none of these advanced
methods is suitable for a fully decentralized PR computation
in a peer-to-peer (P2P) network with autonomous
peers, where each peer can independently crawl Web fragments
according to the user's thematic interests. In such
a setting the graph fragments that different peers have locally
available or know about may arbitrarily overlap among
peers, creating additional complexity for the PR computation.
This paper presents the JXP algorithm for dynamically
and collaboratively computing PR scores of Web pages that
are arbitrarily distributed in a P2P network. The algorithm
runs at every peer, and it works by combining locally computed
PR scores with random meetings among the peers in
the network. It scales as the number of peers in the
network grows, and both experiments and theoretical arguments
show that JXP scores converge to the true PR scores
that one would obtain by a centralized computation.
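
For orientation, the centralized computation that JXP scores are compared against can be sketched as a standard power iteration over the full link graph. This is a minimal sketch and not the JXP algorithm itself: the function name `pagerank`, the damping factor 0.85, the uniform handling of dangling pages, and the four-page toy graph are all illustrative assumptions that do not come from the paper.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank over a dict {page: [pages it links to]}.

    Assumed conventions (not taken from the paper): uniform random jump,
    and the score of pages without out-links is spread over all pages.
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}  # start from the uniform distribution

    for _ in range(max_iter):
        # Total score currently held by dangling pages (no out-links).
        dangling = sum(scores[p] for p in pages if not links.get(p))
        # Every page receives the random-jump share plus its dangling share.
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Each page passes its damped score evenly along its out-links.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new[q] += damping * scores[p] / len(outs)
        delta = sum(abs(new[p] - scores[p]) for p in pages)
        scores = new
        if delta < tol:  # stop once the L1 change is negligible
            break
    return scores


# Hypothetical toy graph: "d" has no in-links, "c" is linked from everyone.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_graph)
```

In the P2P setting described above, no peer holds the whole of `links`; each peer would run a computation of this kind only on its own, possibly overlapping fragment and refine the result through meetings with other peers.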