Conference Paper

Efficient peer-to-peer semantic overlay networks based on statistical language models

MPG Authors
Linari, Alessandro
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Weikum, Gerhard
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Citation

Linari, A., & Weikum, G. (2006). Efficient peer-to-peer semantic overlay networks based on statistical language models. In P2PIR '06: Proceedings of the International Workshop on Information Retrieval in Peer-to-peer Networks (pp. 9-16). New York, USA: ACM.


Citation link: https://hdl.handle.net/11858/00-001M-0000-000F-22AA-F
Abstract
In this paper we address the query routing problem in peer-to-peer (P2P) information retrieval. Our system builds on the idea of a Semantic Overlay Network (SON), in which each peer becomes a neighbor of a small number of peers chosen from those most similar to it. Peers in the network are represented by statistical language models derived from their local data collections. Instead of using the non-metric Kullback-Leibler divergence to compute the similarity between them, we use a symmetrized and "metricized" related measure, the square root of the Jensen-Shannon divergence, which lets us map the problem to a metric search problem. The search strategy exploits the triangle inequality to prune the search space efficiently and relies on a priority queue to visit the most promising peers first. To keep communication costs low and to compare language models efficiently, we devise a compression technique that builds on Bloom filters and histograms, and we provide error bounds for the approximation and a cost analysis for the algorithms used to build and maintain the SON.
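
To illustrate the two ideas named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes each peer's language model is a plain dictionary mapping terms to probabilities, uses base-2 logarithms, and replaces the paper's SON search with a generic pivot-based nearest-neighbor search. The names `kl_divergence`, `js_distance`, and `nearest_peer` are hypothetical. The sketch shows how the square root of the Jensen-Shannon divergence can be computed and how the triangle inequality yields a lower bound, |d(q, pivot) - d(pivot, c)| <= d(q, c), that a priority queue can use to visit promising peers first and stop early.

```python
import heapq
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with zero probability in p contribute nothing."""
    return sum(pv * math.log2(pv / q[t]) for t, pv in p.items() if pv > 0)


def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence of two term distributions."""
    terms = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the union of both vocabularies.
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in terms}
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(jsd, 0.0))  # clamp tiny negative rounding errors


def nearest_peer(query, pivot, candidates, pivot_dist):
    """Return (name, distance, evaluations) for the candidate closest to query.

    Assumption: pivot_dist[name] holds the precomputed js_distance between
    the pivot and candidates[name], so lower bounds cost no extra comparisons.
    """
    d_qp = js_distance(query, pivot)
    # Triangle inequality: |d(q, pivot) - d(pivot, c)| is a lower bound on d(q, c).
    heap = [(abs(d_qp - pivot_dist[name]), name) for name in candidates]
    heapq.heapify(heap)
    best_name, best_dist, evaluations = None, math.inf, 0
    while heap:
        lower_bound, name = heapq.heappop(heap)
        if lower_bound >= best_dist:
            break  # no remaining candidate can beat the current best
        dist = js_distance(query, candidates[name])
        evaluations += 1
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist, evaluations


# Toy language models (hypothetical data, for illustration only).
pivot = {"p2p": 0.5, "routing": 0.5}
candidates = {
    "overlay_peer": {"p2p": 0.5, "overlay": 0.5},
    "music_peer": {"music": 1.0},
    "routing_peer": {"p2p": 0.55, "routing": 0.45},
}
pivot_dist = {name: js_distance(pivot, lm) for name, lm in candidates.items()}
query = {"p2p": 0.6, "routing": 0.4}
best_name, best_dist, evaluations = nearest_peer(query, pivot, candidates, pivot_dist)
```

Because the lower bound never exceeds the true distance, stopping as soon as the smallest remaining bound reaches the current best distance does not change the result; it only avoids full language-model comparisons. The Bloom-filter and histogram compression mentioned in the abstract is not covered by this sketch.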