RDF-3X: a RISC-style Engine for RDF

Neumann, Thomas; Weikum, Gerhard

doi:10.1145/1453856.1453927

アイテム詳細

登録内容を編集ファイル形式で保存

一時保存へ追加

タグ情報を表示リリース履歴を表示詳細要約

公開

学術論文

RDF-3X: a RISC-style Engine for RDF

MPS-Authors

/persons/resource/persons127842

Neumann, Thomas
Databases and Information Systems, MPI for Informatics, Max Planck Society;

/persons/resource/persons45720

Weikum, Gerhard
Databases and Information Systems, MPI for Informatics, Max Planck Society;

External Resource

There are no locators available

Fulltext (restricted access)

There are currently no full texts shared for your IP range.

フルテキスト (公開)

公開されているフルテキストはありません

付随資料 (公開)

There is no public supplementary material available

引用

Neumann, T., & Weikum, G. (2008). RDF-3X: a RISC-style Engine for RDF. Proceedings of the VLDB Endowment, 1(1), 647-659. doi:10.1145/1453856.1453927.

引用: https://hdl.handle.net/11858/00-001M-0000-000F-1CC1-E

要旨

RDF is a data representation format for schema-free structured information that is gaining momentum in the context of Semantic-Web corpora, life sciences, and also Web 2.0 platforms. The ``pay-as-you-go'' nature of RDF and the flexible pattern-matching capabilities of its query language SPARQL entail efficiency and scalability challenges for complex queries including long join paths. This paper presents the RDF-3X engine, an implementation of SPARQL that achieves excellent performance by pursuing a RISC-style architecture with a streamlined architecture and carefully designed, puristic data structures and operations. The salient points of RDF-3X are: 1) a generic solution for storing and indexing RDF triples that completely eliminates the need for physical-design tuning, 2) a powerful yet simple query processor that leverages fast merge joins to the largest possible extent, and 3) a query optimizer for choosing optimal join orders using a cost model based on statistical synopses for entire join paths. The performance of RDF-3X, in comparison to the previously best state-of-the-art systems, has been measured on several large-scale datasets with more than 50 million RDF triples and benchmark queries that include pattern matching and long join paths in the underlying data graphs.