TopX 2.0 at the INEX 2009 Ad-Hoc and Efficiency Tracks

Theobald, Martin; Aji, Ablimit; Schenkel, Ralf

Conference Paper

MPG Authors

Theobald, Martin
Databases and Information Systems, MPI for Informatics, Max Planck Society

Aji, Ablimit
Databases and Information Systems, MPI for Informatics, Max Planck Society

Schenkel, Ralf
Databases and Information Systems, MPI for Informatics, Max Planck Society

Citation

Theobald, M., Aji, A., & Schenkel, R. (2009). TopX 2.0 at the INEX 2009 Ad-Hoc and Efficiency Tracks. In S. Geva, J. Kamps, & A. Trotman (Eds.), INEX 2009 Workshop Preproceedings (pp. 198-208). Berlin: Springer.

Citation link: https://hdl.handle.net/11858/00-001M-0000-000F-195F-F

Abstract

This paper presents the results of our INEX 2009 Ad-hoc and Efficiency track experiments. While our scoring model remained almost unchanged compared to previous years, we focused on a complete redesign of our XML indexing component to meet the increased need for scalability that came with the new 2009 INEX Wikipedia collection, which is about 10 times larger than the previous INEX collection. TopX now supports a CAS-specific distributed index structure, with a completely parallel execution of all indexing steps, including parsing, sampling of term statistics for our element-specific BM25 ranking model, as well as sorting and compressing the index lists for our final inverted block-index. Overall, TopX ranked among the top 3 systems in both the Ad-hoc and Efficiency tracks, with a maximum value of 0.61 for iP[0.01] and 0.29 for MAiP in focused retrieval mode at the Ad-hoc track. At the Efficiency track, our fastest runs achieved an average runtime of 72 ms per CO query and 235 ms per CAS query.
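
For readers unfamiliar with BM25, the following is a minimal sketch of the textbook Okapi BM25 formula applied to a single retrieval unit such as an XML element. It is not the element-specific variant used in TopX, which the abstract does not specify. The function name, the argument names, and the defaults k1 = 1.2 and b = 0.75 are illustrative assumptions.

```python
import math


def bm25_score(query_terms, unit_tf, unit_len, avg_len, df, n_units,
               k1=1.2, b=0.75):
    """Textbook Okapi BM25 score of one retrieval unit (hypothetical helper).

    unit_tf : dict mapping term -> frequency inside this unit
    unit_len: length of this unit in tokens
    avg_len : average unit length in the collection
    df      : dict mapping term -> number of units containing the term
    n_units : total number of units in the collection
    k1, b   : common default parameters, assumed here (not taken from the paper)
    """
    score = 0.0
    for term in query_terms:
        tf = unit_tf.get(term, 0)
        if tf == 0:
            continue  # terms absent from the unit contribute nothing
        n_t = df.get(term, 0)
        # Smoothed inverse frequency; the "1 +" keeps the weight positive.
        idf = math.log(1.0 + (n_units - n_t + 0.5) / (n_t + 0.5))
        # Term-frequency saturation with length normalization.
        denom = tf + k1 * (1.0 - b + b * unit_len / avg_len)
        score += idf * tf * (k1 + 1.0) / denom
    return score
```

Under these assumptions, a unit scores higher when it contains a query term more often, when the term is rarer in the collection, and when the unit is shorter than average.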