Efficient Querying and Learning in Probabilistic and Temporal Databases

Dylla, Maximilian

doi:10.22028/D291-26567

Thesis

MPS-Authors

Dylla, Maximilian
Databases and Information Systems, MPI for Informatics, Max Planck Society;
International Max Planck Research School, MPI for Informatics, Max Planck Society;

External Resource

http://scidok.sulb.uni-saarland.de/volltexte/2014/5814/
(Any fulltext)

http://scidok.sulb.uni-saarland.de/doku/lic_ohne_pod.php?la=de
(Copyright transfer agreement)

Fulltext (public)

phd_thesis_maximilian_dylla.pdf
(Postprint), 4MB

Citation

Dylla, M. (2014). Efficient Querying and Learning in Probabilistic and Temporal Databases. PhD Thesis, Universität des Saarlandes, Saarbrücken. doi:10.22028/D291-26567.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0024-3C44-E

Abstract

Probabilistic databases store, query, and manage large amounts of uncertain information. This thesis advances the state of the art in probabilistic databases in three ways:
1. We present a closed and complete data model for temporal probabilistic databases and analyze its complexity. Queries are posed via temporal deduction rules which induce lineage formulas capturing both time and uncertainty.
2. We devise a methodology for computing the top-k most probable query answers. It is based on first-order lineage formulas representing sets of answer candidates. Theoretically derived probability bounds on these formulas enable pruning low-probability answers.
3. We introduce the problem of learning tuple probabilities, which allows updating and cleaning probabilistic databases. We study its complexity, characterize its solutions, cast it into an optimization problem, and devise an approximation algorithm based on stochastic gradient descent.
All of the above contributions support consistency constraints and are evaluated experimentally.
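
To make the notions of lineage formulas and learned tuple probabilities more concrete, the following is a minimal Python sketch and not the thesis's implementation. It assumes a tuple-independent database, propositional lineage encoded as nested tuples, brute-force enumeration of possible worlds (exponential, for illustration only), and a squared-error objective. All names (`probability`, `gradient`, `learn`) and the encoding are hypothetical choices made for this example. The gradient uses the fact that, under tuple independence, the probability of a formula is linear in each tuple probability, so its partial derivative is P(formula | t true) - P(formula | t false).

```python
import itertools
import random

# Lineage formulas as nested tuples (hypothetical encoding):
#   ("var", name), ("not", f), ("and", f, g, ...), ("or", f, g, ...)

def evaluate(formula, world):
    """Truth value of a lineage formula in one possible world."""
    op = formula[0]
    if op == "var":
        return world[formula[1]]
    if op == "not":
        return not evaluate(formula[1], world)
    if op == "and":
        return all(evaluate(f, world) for f in formula[1:])
    if op == "or":
        return any(evaluate(f, world) for f in formula[1:])
    raise ValueError(f"unknown operator: {op}")

def variables(formula):
    """Set of tuple identifiers occurring in the formula."""
    if formula[0] == "var":
        return {formula[1]}
    return set().union(*(variables(f) for f in formula[1:]))

def probability(formula, probs):
    """Sum the weights of all possible worlds that satisfy the formula."""
    names = sorted(variables(formula))
    total = 0.0
    for bits in itertools.product([False, True], repeat=len(names)):
        world = dict(zip(names, bits))
        if evaluate(formula, world):
            weight = 1.0
            for n in names:
                weight *= probs[n] if world[n] else 1.0 - probs[n]
            total += weight
    return total

def gradient(formula, probs, name):
    """Partial derivative of P(formula) with respect to probs[name]."""
    return (probability(formula, {**probs, name: 1.0})
            - probability(formula, {**probs, name: 0.0}))

def learn(labels, probs, learnable, rate=0.5, steps=2000, seed=0):
    """Stochastic gradient descent on squared error (assumed objective).

    labels: list of (formula, target probability) pairs.
    learnable: tuple identifiers whose probabilities may change.
    """
    rng = random.Random(seed)
    probs = dict(probs)
    for _ in range(steps):
        formula, target = rng.choice(labels)
        error = probability(formula, probs) - target
        used = variables(formula)
        # Compute all gradients first, then apply the updates together.
        grads = {n: 2.0 * error * gradient(formula, probs, n)
                 for n in learnable if n in used}
        for n, g in grads.items():
            # Clamp so that every value stays a valid probability.
            probs[n] = min(1.0, max(0.0, probs[n] - rate * g))
    return probs

# Lineage of an answer derived either from t1 and t2 jointly, or from t3.
phi = ("or", ("and", ("var", "t1"), ("var", "t2")), ("var", "t3"))
print(probability(phi, {"t1": 0.5, "t2": 0.5, "t3": 0.2}))  # approximately 0.4

# Learn P(t2) so that the answer with lineage (t1 and t2) has probability 0.4.
label = (("and", ("var", "t1"), ("var", "t2")), 0.4)
print(learn([label], {"t1": 0.8, "t2": 0.1}, ["t2"], steps=200))
```

In this toy setting the learned value of `t2` approaches 0.5, since 0.8 × 0.5 = 0.4. The sketch leaves out temporal annotations, consistency constraints, and the probability bounds used for top-k pruning.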