11. Transactions¶

Operations in a database are normally grouped into a transaction, which is a unit of work that must be executed atomically and in apparent isolation from other transactions.

Each transaction in RDFox operates on a single data store — that is, transactions cannot span several data stores within a single server or data stores in different servers.

A transaction can be rolled back (i.e., aborted without changing the data store) or committed.

Transactions in RDFox satisfy the well-known ACID properties:

Atomicity: If an operation inside a transaction starts changing the store but then fails in the middle, the transaction will be rolled back and hence an operation in a transaction cannot be partially executed.
Consistency: A transaction can only bring the store from a consistent state to another consistent state. In RDFox, consistency means that 1) every implicit fact that logically follows from the given rules and explicit facts has been materialized and 2) each constraint defined on the data store content is satisfied (see Section 11.2).
Isolation: Transactions appear to be executed as if no other transaction was being executing at the same time.
Durability: The effect of a committed transaction is never lost; in particular, once a transaction has been committed RDFox ensures that the state of the relevant data store will be persisted on disk. Durability is configurable in RDFox: see Section 13 for details.

11.1. Types of Transactions¶

Transactions in RDFox can be read/write or read-only. A data store can be updated only by a read/write transaction. Changes made by a read/write transaction are immediately visible to the transaction, including any new facts derived from reasoning (if the user chooses to).

Example Read-write transaction

Consider again our usual family example. Let us first initialize a store in the shell and load the data as we did in the Getting Started guide:

dstore create family
active family
import data.ttl
set output out

We can now group in a read/write transaction a write operation consisting of a rule importation and a read operation consisting of a query as follows:

begin
import ! [?p, :hasChild, ?c] :- [?c, :hasParent, ?p] .
select ?p ?c where { ?p :hasChild ?c }
commit

Reasoning will happen after the rule is imported and hence the query results will reflect the changes in the materialization. In particular, we will obtain the following query results:

:lois :meg .
:peter :meg .
:peter :chris .
:lois :stewie .

If an operation in a transaction starts changing the store but fails in the middle, the transaction will be rolled back. For instance, in the previous transaction an error could happen in the middle of the import operation when the instruction has started changing the store but has not finished (thus leaving the database in an inconsistent state). In this case, the transaction will be rolled back.

In contrast, if an operation throws an exception without having changed the store, then there is no rollback. The idea is that if an operation throws an exception but it does not change the store, then you can just continue because you know that the failing operation failed in its entirety. For instance, consider the following read/write transaction where the second rule being imported contains syntax errors.

begin
import ! [?p, :hasDescendant, ?c] :- [?c, :hasParent, ?p] .
import ! [?x, :marriedTo ?y] - [?y, :marriedTo, ?x] .
commit

The first rule will be imported into the store and the importation of the second rule will fail. The transaction, however, will commit since the second importation operation has failed before it has actually made any changes to the store. Indeed, if we now run the query

select ?x ?y where {?x :hasDescendant ?y}

we will obtain four answers, showing that the first rule has taken effect.

Read-only transactions are only allowed to query a data store and cannot update its contents in any way. Their use is demonstrated in the following example.

Example Read-only transactions

Building on the previous example, we can write in the shell a transaction consisting of two queries over the store.

begin read
select ?p ?n where { ?p rdf:type :Person . ?p :forename ?n }
select ?x ?y where { ?x :marriedTo ?y }
commit

We will obtain the result of both queries as a result. Attempting to update the store in a read-only transaction as given below will immediately lead to an error in RDFox indicating that read-only transactions do not support updates.

begin read
import ! [?p, :hasChild, ?c] :- [?p, :hasDescendant, ?c] .
commit

11.2. Constraining Data Store Content¶

Transactions in which the default RDF graph contains at least one instance of the class <https://rdfox.com/vocabulary#ConstraintViolation> cannot be committed. Since RDFox runs incremental materialization prior to committing each Read/Write transaction, rules which derive an instance of the above class act as constraints on a data store’s content.

When an attempt to commit a transaction fails due to constraint violations, the resulting error message will include up to ten properties of up to ten of the violations to aid diagnosis of the problem.

The following examples use the prefix rdfox: to represent <https://rdfox.com/vocabulary#> and : to represent <http://example.com/>.

Example Mandatory property constraint

The following rule prevents instances of class foaf:Person from being added to the default graph unless they have at least one foaf:mbox property.

[?person, a, rdfox:ConstraintViolation] :-
    [?person, a, foaf:Person],
    NOT EXIST ?mbox IN [?person, foaf:mbox, ?mbox] .

With this rule loaded, attempting to import the following triples will fail with the message shown underneath.

:alice a foaf:Person; foaf:name "Alice" .

The transaction could not be committed because it would have introduced
the following constraint violation:

<http://example.com/alice> <http://xmlns.com/foaf/0.1/name> "Alice";
    rdf:type <http://xmlns.com/foaf/0.1/Person> .

Although it is possible to make existing resources members of the constraint violation class, as in the example above, more informative failure messages can be obtained by ensuring that each separate violation has a unique resource to represent it. The built-in tuple-table SKOLEM can be used to generate blank nodes for this purpose.

Once each violation has its own resource, it is safe to add further atoms to the rule head to associate with the violation any additional information that will help the reader of the error message understand what is wrong.

Example Improved mandatory property constraint

As in the previous example, the following rule prevents insertion of foaf:Person instances with no foaf:mbox property but this time using SKOLEM.

PREFIX rdfox: <https://rdfox.com/vocabulary#>
[?v, a, rdfox:ConstraintViolation],
[?v, :mboxMissingFrom, ?person],
[?v, :constraintDescription, "Every foaf:Person must have at least one foaf:mbox property."] :-
    [?person, a, foaf:Person],
    NOT EXIST ?mbox IN [?person, foaf:mbox, ?mbox],
    SKOLEM("MissingMbox", ?person, ?v) .

The rule head classifies the blank node bound to ?v via the SKOLEM tuple table as a constraint violation and gives it additional properties identifying the deficient foaf:Person node and describing the constraint it violates in natural language. With this rule loaded, the failure message for importing the same data as in the previous example is:

The transaction could not be committed because it would have introduced
the following constraint violation:

_:__05TWlzc2luZ01ib3gA_02aHR0cDovL2V4YW1wbGUuY29tL2FsaWNlAA-- <http://example.com/constraintDescription> "Every foaf:Person must have at least one foaf:mbox property.";
  <http://example.com/mboxMissingFrom> <http://example.com/alice> .

11.3. Concurrent Execution of Transactions¶

At each point in time the following transactions can be active in a data store:

a single read/write transaction; or
multiple read-only transactions.

As a result of this model, the common issues associated with concurrent execution of transactions in databases (e.g., “dirty reads”) cannot occur in RDFox. In particular, RDFox achieves the serializability isolation level without the need to implement any mechanism (such as locking) to prevent concurrency anomalies.