Abstract:
We examine the problem of producing the optimal evaluation order for queries
containing joins, selections, and maps. Specifically, we look at the case where
common subexpressions involving expensive UDF calls can be factored out. First,
we show that ignoring factorization during optimization can lead to plans that
are far from the best possible plan: the difference in cost between the best plan
considering factorization and the best plan not considering factorization can
easily reach several orders of magnitude. Then, we introduce optimization
strategies that produce optimal left-deep and bushy plans when factorization
is taken into account. Experiments (1) confirm that factorization is a critical
issue when it comes to generating optimal plans and (2) show that
considering factorization does not make plan generation significantly more
expensive.