The paper Building a Join Optimizer for Soufflé has
been published in LOPSTR 2022: The 32nd International Symposium on Logic-Based Program Synthesis and Transformation.
The paper describes the design of a feedback-directed join optimizer for compiling Datalog engines
such as Soufflé. The join optimizer automatically finds high-performance join orders that are competitive
with hand-tuned orders found by experts. As a result, Datalog programmers can unlock high-performance
execution without understanding the engine’s internals.
The work is available in the main branch of Soufflé and documented in the Auto-Tuning section of the Soufflé wiki.
Datalog has grown in popularity as a domain-specific language (DSL) for real-world
applications. Crucial to its resurgence has been the advent of high-performance Datalog
compilers, including Soufflé. Yet this high performance is unobtainable for users unless
they provide performance hints such as join orders for rules.
In this paper, we develop a join optimizer for Soufflé that automatically computes
high-quality join orders using a feedback-directed optimization strategy:
In a profiling stage, the compiler obtains join size estimates, and in a join
ordering stage, an offline join optimizer derives cost-optimal join orders. The
performance of the automatically optimized joins is demonstrated using complex
real-world applications, including DOOP, DDISASM, and VPC, surpassing the performance
of un-tuned join orders by a geometric mean speedup of 12.07×.