# Relation Representation in Souffle

Relations can be represented using different internal data structures in Soufflé, each exhibiting different performance characteristics. By default, the B-tree is used to store tuples of a relation. However, this default selection can be overridden by users, by specifying a relation qualifier. Currently, the possible data structures are B-tree, Brie, and Eqrel (for equivalence relations).

## B-tree Relations

The B-tree data structure is used by default, however the *direct* flavour (see below) can be forced by adding the `btree`

qualifier to a relation declaration:

```
.decl A(x:number, y:symbol) btree
```

The B-tree is a good default data structure that performs reasonably well in general use cases. In Soufflé, two flavours of the B-tree are offered. The first is a *direct* version, where tuple data is stored directly in the B-tree. The second is an *indirect* version, where the B-tree stores pointers to an external table data structure which holds the actual tuple data.

By default, a relation without qualifiers will be *direct* if it has arity ≤ 6, and *indirect* if it has arity > 6. This choice is made because larger arity tuples take more space in the B-tree data structure, so cache coherence can be improved by using pointers instead.

Note, however, that adding the `btree`

qualifier to the relation declaration forces the *direct* B-tree.

More details on the Soufflé B-tree can be found in this paper.

## Brie Relations

The Brie data structure is a specialised data structure designed for *dense* data. It can be used by adding the `brie`

qualifier to a relation declaration.

```
.decl A(x:number, y:symbol) brie
```

The Brie data structure is a specialised form of a trie, or prefix tree. The benefits of using a prefix tree-style data structure for dense data can be illustrated as follows.

Consider a 3-arity relation containing tuples

```
(0,0,0), (0,0,1), (0,0,2)
```

In this case, all tuples share the common prefix `0, 0`

, and differ only in the final digit. This data can be described as `dense`

, since a common prefix reduces redundant stored information in the prefix tree. On the other hand, a relation containing

```
(0,0,0), (1,2,3), (4,5,6)
```

is not dense, and the prefix tree provides no benefit.

The Brie data structure in Soufflé similarly provides a performance benefit for highly dense data. However, it is slower than the B-tree in the average case, and so its use must be considered carefully.

More details on the Brie can be found in this paper.

## Equivalence Relations

An equivalence relation is a special kind of binary relation which exhibits three properties: *reflexivity*, *symmetry*, and *transitivity*. An equivalence relation could be expressed in Datalog as follows:

```
.decl equivalence(x:number, y:number)
equivalence(a, b) :- rel1(a), rel2(b). // every element of rel1 is equivalent to every element of rel2
equivalence(a, a) :- equivalence(a, _). // reflexivity
equivalence(a, b) :- equivalence(b, a). // symmetry
equivalence(a, c) :- equivalence(a, b), equivalence(b, c). // transitivity
```

However, this expression of an equivalence relation would require a quadratic number of tuples to be stored. In Soufflé, the Eqrel data structure provides a linear representation of equivalence relations, by using a union-find based algorithm. Eqrel can be used by adding the `eqrel`

qualifier to a relation declaration.

The following example demonstrates the use of Eqrel, and is semantically equivalent to the example above. However, using Eqrel carries at best a quadratic speed and memory improvement over the above example.

```
.decl eqrel_fast(x : number, y : number) eqrel
eqrel_fast(a,b) :- rel1(a), rel2(b).
```

More details on Eqrel can be found in this paper.

## Nullaries

Nullary relations are special relations. Their arity is zero, i.e., they don’t have attributes. They are defined as

```
.decl A()
```

These relations are either empty or contain the empty tuple `()`

. Internatlly, they are implemented
as a boolean variable.

Semantically, they can be seen as a proposition rather than a relation.

## Auto Selection

In Soufflé, the default data structure is the B-tree, with the *direct* version for relations with arity ≤ 6, or the *indirect* version for relations with arity > 6.