souffle  2.0.2-371-g6315b36
MaterializeAggregationQueries.h
Go to the documentation of this file.
1 /*
2  * Souffle - A Datalog Compiler
3  * Copyright (c) 2015, Oracle and/or its affiliates. All rights reserved
4  * Licensed under the Universal Permissive License v 1.0 as shown at:
5  * - https://opensource.org/licenses/UPL
6  * - <souffle root>/licenses/SOUFFLE-UPL.txt
7  */
8 
9 /************************************************************************
10  *
11  * @file MaterializeAggregationQueries.h
12  *
13  * Transformation pass to create artificial relations for bodies of
14  * aggregation functions consisting of more than a single atom.
15  *
16  ***********************************************************************/
17 
18 #pragma once
19 
20 #include "ast/Aggregator.h"
21 #include "ast/TranslationUnit.h"
23 #include <string>
24 
25 namespace souffle::ast::transform {
26 /**
27  * Transformation pass to create artificial relations for bodies of
28  * aggregation functions consisting of more than a single atom.
29  */
30 class MaterializeAggregationQueriesTransformer : public Transformer {
31 public:
32  std::string getName() const override {
33  return "MaterializeAggregationQueriesTransformer";
34  }
35 
36  /**
37  * Creates artificial relations for bodies of aggregation functions
38  * consisting of more than a single atom, in the given program.
39  *
40  * @param program the program to be processed
41  * @return whether the program was modified
42  */
43  static bool materializeAggregationQueries(TranslationUnit& translationUnit);
44 
47  }
48 
49 private:
50  bool transform(TranslationUnit& translationUnit) override {
51  return materializeAggregationQueries(translationUnit);
52  }
53  /**
54  * Modify the aggClause by adding in grounding literals for every
55  * variable that appears in the clause ungrounded. The source of literals
56  * to copy from is the originalClause.
57  **/
58  static void groundInjectedParameters(const TranslationUnit& translationUnit, Clause& aggClause,
59  const Clause& originalClause, const Aggregator& aggregate);
60 
61  /**
62  * A test determining whether the body of a given aggregation needs to be
63  * 'outlined' into an independent relation or can be kept inline.
64  */
65  static bool needsMaterializedRelation(const Aggregator& agg);
66  /**
67  * Whatever variables have been left unnamed
68  * have significance for a count aggregate. They NEED to
69  * be in the head of the materialized atom that will replace
70  * this aggregate subclause. Therefore, we give them
71  * dummy names (_n rather than just _) so that the count
72  * will be correct.
73  **/
74  static void instantiateUnnamedVariables(Clause& aggClause);
75  /**
76  * When we materialise an aggregate subclause,
77  * it's a good question which variables belong in the
78  * head of that materialised clause and which don't.
79  *
80  * For a count aggregate for example, it's important that
81  * all variables (even unnamed ones) appearing in the subclause are given a space
82  * in the head. For min and max aggregates, we know that the entire subclause
83  * (as is the case for any aggregate) will need to appear as the body of the materialised rule,
84  * but what goes in the head is a bit less straightforward. We could get away with only
85  * exporting the column that we are taking the maximum over, because this is the only value
86  * that we need to retrieve. The same can be said for sum and mean.
87  *
88  * BUT there is a caveat: We support the construct of witnesses to an aggregate calculation.
89  * A witness is some variable present in the aggregate subclause that is exported
90  * *ungrounded* to the base scope. If we do save all configurations of the variables
91  * in the aggregate subclause (as we are doing when we materialise it), then we can
92  * reuse this materialised subclause to retrieve the witness almost instantly.
93  *
94  * (This is assuming that we have a good enough machinery to recognise when
95  * rules are exactly the same, because the witness transformation actually has to take place BEFORE
96  * the materialisation, meaning that there will be two separate relations that actually will represent
97  * exactly the same relation semantically, and we would want those to be recognised as a single relation)
98  *
99  * So not only then (assuming we have good "equality of rules" machinery) does using the "count"
100  * aggregate technique (just giving each variable a place in the head) allow us to save and retrieve
101  * witnesses easily, but it also just makes the treatment of aggregates uniform, which I suppose is
102  * simpler, nicer, more aesthetic maybe.
103  *
104  * Giving min/max/mean/sum aggregate materialised relations only a single column would be far more
105  * efficient, but then we would lose the ability to retrieve witnesses. So this was the tradeoff, and
106  * losing witnesses is just not an option. We NEED them to be able to give the user witness functionality.
107  *
108  * To clarify, we don't want to include local variables of any inner aggregates appearing in this
109  * aggregate. My reasoning for deciding this was really just a hunch that I felt like local variables of
110  *an inner aggregate should be totally irrelevant. So we include "immediate" local variables, not local
111  * variables of any inner aggregate.
112  *
113  * The other thing we must include is any injected variables. Depending on the assignment of the injected
114  * variable, the aggregate's value may change, but so, still, why would it be important to include the
115  * injected variable in the head then?
116  *
117  * A(x, n) :- B(x), n = sum y : { C(y, x) }.
118  *
119  * Because we need the assignment of x in the outer scope to COINCIDE with the assignment of x in the
120  * inner scope. The only thing visible to the outer scope is whatever is in the head. That is good enough
121  * justification. And this applies to any type of aggregate.
122  *
123  * So the overall conclusion is that what we should include are:
124  * * injected variables
125  * * *immediate* local variables
126  **/
127  static std::set<std::string> distinguishHeadArguments(
128  const TranslationUnit& tu, const Clause& clause, const Aggregator& aggregate);
129 };
130 
131 } // namespace souffle::ast::transform
souffle::ast::transform::MaterializeAggregationQueriesTransformer::groundInjectedParameters
static void groundInjectedParameters(const TranslationUnit &translationUnit, Clause &aggClause, const Clause &originalClause, const Aggregator &aggregate)
Modify the aggClause by adding in grounding literals for every variable that appears in the clause un...
Definition: MaterializeAggregationQueries.cpp:114
TranslationUnit.h
Transformer.h
souffle::ast::Clause
Intermediate representation of a horn clause.
Definition: Clause.h:51
souffle::ast::transform::MaterializeAggregationQueriesTransformer::needsMaterializedRelation
static bool needsMaterializedRelation(const Aggregator &agg)
A test determining whether the body of a given aggregation needs to be 'outlined' into an independent...
Definition: MaterializeAggregationQueries.cpp:357
souffle::ast::transform::MaterializeAggregationQueriesTransformer::instantiateUnnamedVariables
static void instantiateUnnamedVariables(Clause &aggClause)
Whatever variables have been left unnamed have significance for a count aggregate.
Definition: MaterializeAggregationQueries.cpp:53
souffle::ast::TranslationUnit
Translation unit class for the translation pipeline.
Definition: TranslationUnit.h:51
souffle::ast::transform
Definition: Program.h:45
souffle::ast::transform::MaterializeAggregationQueriesTransformer::materializeAggregationQueries
static bool materializeAggregationQueries(TranslationUnit &translationUnit)
Creates artificial relations for bodies of aggregation functions consisting of more than a single ato...
Definition: MaterializeAggregationQueries.cpp:249
souffle::ast::transform::MaterializeAggregationQueriesTransformer::transform
bool transform(TranslationUnit &translationUnit) override
Definition: MaterializeAggregationQueries.h:64
Aggregator.h
souffle::ast::Aggregator
Defines the aggregator class.
Definition: Aggregator.h:53
souffle::ast::transform::MaterializeAggregationQueriesTransformer
Transformation pass to create artificial relations for bodies of aggregation functions consisting of ...
Definition: MaterializeAggregationQueries.h:37
souffle::ast::transform::MaterializeAggregationQueriesTransformer::distinguishHeadArguments
static std::set< std::string > distinguishHeadArguments(const TranslationUnit &tu, const Clause &clause, const Aggregator &aggregate)
When we materialise an aggregate subclause, it's a good question which variables belong in the head o...
Definition: MaterializeAggregationQueries.cpp:77
souffle::ast::transform::MaterializeAggregationQueriesTransformer::getName
std::string getName() const override
Definition: MaterializeAggregationQueries.h:46
souffle::ast::transform::MaterializeAggregationQueriesTransformer::clone
MaterializeAggregationQueriesTransformer * clone() const override
Definition: MaterializeAggregationQueries.h:59