Alon Y. Levy , Yehoshua Sagiv , Semantic Query Optimization in Datalog Programs Proceedings of the 14th ACM SIGACT-SIGMOD-SIGARTSymposium on Principles of Database Systems, San Jose, CA 1995
Abstract: Semantic query optimization refers to the process of using integrity constraints (ic's) in order to optimize the evaluation of queries. The process is well understood in the case of unions of select-project-join queries (i.e., non-recursive datalog). For arbitrary datalog programs, however, the issue has largely remained an unsolved problem. This paper studies this problem and shows when semantic query optimization can be completely done in recursive rules provided that order constraints and negated EDB subgoals appear only in the recursive rules, but not in the ic's. If either order constraints or negated EDB subgoals are introduced in ic's, then the problem of semantic query optimization becomes undecidable. The algorithm we describe also enables extending semantic query optimization to SQL queries involving aggregation and duplicates, for which previous techniques are not applicable. Since semantic query optimization is closely related to the containment problem of a datalog program in a union of conjunctive queries, our results also imply new decidability and undecidability results for that problem when order constraints and negated EDB subgoals are used.