
Record


Released

Conference Paper

Efficient peer-to-peer semantic overlay networks based on statistical language models

MPG Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons44926

Linari, Alessandro
Databases and Information Systems, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons45720

Weikum, Gerhard
Databases and Information Systems, MPI for Informatics, Max Planck Society;

External Resources
No external resources are available
Full Texts (freely accessible)
No freely accessible full texts are available
Supplementary Material (freely accessible)
No freely accessible supplementary materials are available
Citation

Linari, A., & Weikum, G. (2006). Efficient peer-to-peer semantic overlay networks based on statistical language models. In P2PIR '06: Proceedings of the International Workshop on Information Retrieval in Peer-to-peer Networks (pp. 9-16). New York, USA: ACM.


Citation link: http://hdl.handle.net/11858/00-001M-0000-000F-22AA-F
Abstract
In this paper we address the query routing problem in peer-to-peer (P2P) information retrieval. Our system builds on the idea of a Semantic Overlay Network (SON), in which each peer becomes a neighbor of a small number of peers chosen from among those most similar to it. Peers in the network are represented by a statistical language model derived from their local data collections. Instead of using the non-metric Kullback-Leibler divergence to compute the similarity between them, we use a symmetrized and "metricized" related measure, the square root of the Jensen-Shannon divergence, which lets us map the problem to a metric search problem. The search strategy exploits the triangle inequality to efficiently prune the search space and relies on a priority queue to visit the most promising peers first. To keep communication costs low and to compare language models efficiently, we devise a compression technique that builds on Bloom filters and histograms, and we provide error bounds for the approximation and a cost analysis for the algorithms used to build and maintain the SON.
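As an illustration of the two ideas named in the abstract (the square root of the Jensen-Shannon divergence as a metric, and triangle-inequality pruning driven by a priority queue), here is a minimal sketch. It is not the authors' algorithm: the function names `js_distance` and `nearest_peers`, the use of a single pivot peer with precomputed distances, base-2 logarithms, and language models stored as plain term-probability dictionaries are all assumptions made for this example. The Bloom-filter and histogram compression is not shown.

```python
import heapq
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; q must be positive wherever p is positive."""
    return sum(pw * math.log2(pw / q[w]) for w, pw in p.items() if pw > 0)


def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence between two term distributions."""
    vocab = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the joint vocabulary.
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(jsd, 0.0))  # clamp tiny negative rounding errors


def nearest_peers(query, pivot, peers, k):
    """Return the k peers closest to `query`, plus the number of distances evaluated.

    `peers` maps a peer name to (model, distance from that model to `pivot`).
    The triangle inequality gives d(query, x) >= |d(query, pivot) - d(pivot, x)|,
    so that lower bound orders the priority queue and justifies early stopping.
    """
    d_qp = js_distance(query, pivot)
    queue = [(abs(d_qp - d_px), name) for name, (_, d_px) in peers.items()]
    heapq.heapify(queue)

    best = []  # max-heap of the k best so far, stored as (-distance, name)
    evaluated = 0
    while queue:
        bound, name = heapq.heappop(queue)
        if len(best) == k and bound >= -best[0][0]:
            break  # no remaining peer can beat the current k-th best
        d = js_distance(query, peers[name][0])
        evaluated += 1
        heapq.heappush(best, (-d, name))
        if len(best) > k:
            heapq.heappop(best)
    return sorted((-nd, name) for nd, name in best), evaluated
```

Because candidates are visited in order of their lower bound, the loop can stop as soon as the smallest remaining bound is no better than the current k-th best distance. This shortcut is only valid for a true metric, which is why a symmetric measure satisfying the triangle inequality is needed in place of the Kullback-Leibler divergence.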