Abstract:
The promises inherent in users coming together to form data-sharing network
communities bring to the foreground new problems formulated over such dynamic,
ever-growing computing, storage, and networking infrastructures. A key open
challenge is to harness these highly distributed resources toward the
development of an ultra-scalable, efficient search engine. From a technical
viewpoint, any
acceptable solution must fully exploit all available resources, which dictates
the removal of any centralized points of control, since these can also readily lead
to performance bottlenecks and reliability/availability problems. Equally
importantly, however, a highly distributed solution can also facilitate
pluralism in informing users about internet content, which is crucial in order
to preclude the formation of information-resource monopolies and the biased
visibility of content from economically powerful sources. To meet these
challenges, the work described here puts forward MINERVA$\infty$, a novel
search engine architecture, designed for scalability and efficiency.
MINERVA$\infty$ encompasses a suite of novel algorithms, including algorithms
for creating data networks of interest, placing data on network nodes, and load
balancing, as well as top-k algorithms for retrieving data at query time and
replication algorithms for expediting top-k query processing.
We have implemented the proposed architecture and we report on our extensive
experiments with real-world, web-crawled, and synthetic data and queries,
showcasing the scalability and efficiency traits of MINERVA$\infty$.