7. Transactions in RDFox

Operations in a database are normally grouped into a transaction, which is a unit of work that must be executed atomically and in apparent isolation from other transactions.

Each transaction in RDFox operates on a single data store; that is, a transaction cannot span several data stores, whether they belong to the same server or to different servers.

A transaction can be rolled back (i.e., aborted without changing the data store) or committed.

Transactions in RDFox satisfy the well-known ACID properties:

Atomicity

If an operation inside a transaction starts changing the store but then fails partway through, the transaction is rolled back; hence, no operation in a transaction can be partially executed.

Consistency

A transaction can only bring the store from a consistent state to another consistent state. In RDFox, consistency means that 1) every implicit fact that logically follows from the given rules and explicit facts has been materialized and 2) each constraint defined on the data store content is satisfied (see Section 7.2).

Isolation

Transactions appear to be executed as if no other transaction were executing at the same time.

Durability

The effect of a committed transaction is never lost: once a transaction has been committed, RDFox ensures that the state of the relevant data store is persisted on disk. Durability is configurable in RDFox: a data store can be configured so that its content is persisted whenever a transaction commits, and persistence is disabled by default.

7.1. Types of Transactions

Transactions in RDFox can be of one of the following three types:

  • Read/write.

  • Interruptible read-only.

  • Read-only.

A data store can be updated only by a read/write transaction. Changes made by a read/write transaction are immediately visible within that transaction, including, if the user so chooses, any new facts derived by reasoning.

Example Read-write transaction

Consider again our usual family example. Let us first initialize a store in the shell and load the data as we did in the Getting Started guide:

dstore create family par-complex-nn
active family
import data.ttl
set output out

We can now group in a read/write transaction a write operation consisting of a rule importation and a read operation consisting of a query as follows:

begin
import ! [?p, :hasChild, ?c] :- [?c, :hasParent, ?p] .
select ?p ?c where { ?p :hasChild ?c }
commit

Reasoning will happen after the rule is imported and hence the query results will reflect the changes in the materialization. In particular, we will obtain the following query results:

:lois :meg .
:peter :meg .
:peter :chris .
:lois :stewie .

If an operation in a transaction starts changing the store but fails in the middle, the transaction will be rolled back. For instance, in the previous transaction an error could happen in the middle of the import operation when the instruction has started changing the store but has not finished (thus leaving the database in an inconsistent state). In this case, the transaction will be rolled back.

In contrast, if an operation throws an exception without having changed the store, no rollback takes place: the failing operation has failed in its entirety, so the transaction can simply continue. For instance, consider the following read/write transaction, in which the second rule being imported contains syntax errors.

begin
import ! [?p, :hasDescendant, ?c] :- [?c, :hasParent, ?p] .
import ! (?x, :marriedTo ?y] - [?y, :marriedTo, ?x] .
commit

The first rule will be imported into the store and the importation of the second rule will fail. The transaction, however, will commit since the second importation operation has failed before it has actually made any changes to the store. Indeed, if we now run the query

select ?x ?y where {?x :hasDescendant ?y}

we will obtain four answers, showing that the first rule has taken effect.
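
Example Rolling back a read/write transaction

A transaction can also be abandoned explicitly rather than committed. The following is a minimal sketch, assuming that the shell offers a rollback command as the counterpart of commit; the :hasAncestor property is a hypothetical name used only for illustration.

begin
import ! [?c, :hasAncestor, ?p] :- [?c, :hasParent, ?p] .
select ?c ?p where { ?c :hasAncestor ?p }
rollback

Under that assumption, the query inside the transaction should see the derived :hasAncestor facts, whereas the same query issued after the rollback should return no answers, because none of the changes made by the transaction are kept.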

Read-only transactions are only allowed to query a data store and cannot update its contents in any way. There are two types of read-only transactions, and the distinction between them is determined by how they can be concurrently executed with other transactions, as explained in the following section.

Example Read-only transactions

Building on the previous example, we can write in the shell a transaction consisting of two queries over the store.

begin read
select ?p ?n where { ?p rdf:type :Person . ?p :forename ?n }
select ?x ?y where { ?x :marriedTo ?y }
commit

We will obtain the results of both queries. Attempting to update the store in a read-only transaction, as shown below, will immediately lead to an error in RDFox indicating that read-only transactions do not support updates.

begin read
import ! [?p, :hasChild, ?c] :- [?p, :hasDescendant, ?c] .
commit

7.2. Constraining Data Store Content

Any transaction that would add instances of the class <http://oxfordsemantic.tech/RDFox#ConstraintViolation> to the default graph will fail. Since RDFox runs incremental materialization prior to committing each read/write transaction, rules that derive an instance of the above class act as constraints on a data store’s content.

When an attempt to commit a transaction fails due to constraint violations, the resulting error message will include up to ten properties of up to ten of the violations to aid diagnosis of the problem.

The following examples use the prefix rdfox: to represent <http://oxfordsemantic.tech/RDFox#> and : to represent <http://example.com/>.

Example Mandatory property constraint

The following rule prevents instances of class foaf:Person from being added to the default graph unless they have at least one foaf:mbox property.

[?person, a, rdfox:ConstraintViolation] :-
    [?person, a, foaf:Person],
    NOT EXIST ?mbox IN [?person, foaf:mbox, ?mbox] .

With this rule loaded, attempting to import the following triples will fail with the message shown underneath.

:alice a foaf:Person; foaf:name "Alice" .

The transaction could not be committed because it would have introduced the following constraint violation:

<http://example.com/alice> <http://xmlns.com/foaf/0.1/name> "Alice";
    rdf:type <http://xmlns.com/foaf/0.1/Person> .
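
For comparison, a variant of the same data that also supplies a foaf:mbox value should be accepted, because the body of the rule no longer matches :alice. The mailbox address below is a made-up value used only for illustration.

:alice a foaf:Person; foaf:name "Alice"; foaf:mbox <mailto:alice@example.com> .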

Although it is possible to make existing individuals members of the constraint violation class, as in the example above, more informative failure messages can be obtained by using the SKOLEM function to create blank nodes representing the actual violations. These nodes can be given properties to make the failure message more informative and introducing them ensures that each violation is printed separately.

Example Improved mandatory property constraint

As in the previous example, the following rule prevents insertion of foaf:Person instances with no foaf:mbox property but this time using SKOLEM.

[?v, a, <http://oxfordsemantic.tech/RDFox#ConstraintViolation>],
[?v, :mboxMissingFrom, ?person],
[?v, :constraintDescription, "Every foaf:Person must have at least one foaf:mbox property."] :-
    [?person, a, foaf:Person],
    NOT EXIST ?mbox IN [?person, foaf:mbox, ?mbox],
    BIND(SKOLEM("MissingMbox", ?person) AS ?v) .

The rule head classifies the blank node created by SKOLEM as a constraint violation and gives it additional properties identifying the deficient foaf:Person node and describing the constraint it violates in natural language. With this rule loaded, the failure message for importing the same data as in the previous example is:

The transaction could not be committed because it would have introduced the following constraint violation:

[] <http://example.com/constraintDescription> "Every foaf:Person must have at least one foaf:mbox property.";
   <http://example.com/mboxMissingFrom> <http://example.com/alice> .
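
Example Constraint on a relationship

The same pattern can be reused for other kinds of constraint. The following sketch, which relies only on constructs shown in the rules above, rejects any individual that is married to itself; the properties :marriedTo and :selfMarriage and the SKOLEM label "SelfMarriage" are hypothetical names chosen for illustration.

[?v, a, rdfox:ConstraintViolation],
[?v, :selfMarriage, ?x],
[?v, :constraintDescription, "Nobody can be married to themselves."] :-
    [?x, :marriedTo, ?x],
    BIND(SKOLEM("SelfMarriage", ?x) AS ?v) .

With such a rule loaded, a transaction that adds a triple such as :alice :marriedTo :alice . would be expected to fail at commit time, while triples relating two distinct individuals would be unaffected.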

7.3. Concurrent Execution of Transactions

At each point in time the following transactions can be active in a data store:

  • a single read/write transaction; or

  • multiple read-only transactions (interruptible or not).

As a result of this model, the common issues associated with concurrent execution of transactions in databases (e.g., “dirty reads”) cannot occur in RDFox. In particular, RDFox achieves the serializability isolation level without the need to implement any mechanism (such as locking) to prevent concurrency anomalies.

A read/write transaction cannot be started on a data store until every other read/write transaction and every non-interruptible read-only transaction on that data store has finished. In contrast, a read/write transaction can be started while interruptible read-only transactions are active on the data store. Depending on whether and which operations are in progress on an interruptible read-only transaction, starting the read/write transaction might be delayed until those operations finish.

Crucially, however, the existence of a cursor on an interruptible read-only transaction does not prevent the transaction from being interrupted immediately. If a transaction is interrupted, any subsequent operations on that transaction (or on a cursor associated with the transaction) will fail with an appropriate error. Thus, interruptible read-only transactions can be convenient when an application needs to present query results using a cursor, but holding an open cursor should not prevent updates from happening.

7.4. Persistence on Disk

RDFox’s persistence mechanism ensures durability of transactions by saving data store content to disk whenever a transaction commits.

While it is possible to use the built-in save and load commands to achieve durability, the persistence mechanism differs in the following key ways.

  • Persistence is built into RDFox transactions. This ensures that content is persisted atomically. If saving to disk fails (e.g., because the disk is full), the entire transaction is rolled back; therefore, it is not possible to have a situation where the in-memory content is different from that stored on disk.

  • Persistence works at the server level rather than the data store level and this allows the complete server configuration to be restored.

  • As a data store is modified, only the changes or deltas are saved, which increases efficiency in practice.

Persistence in RDFox is configurable. If we configure the server to be persisted, then we can configure each store in the server individually; hence, it is possible to have within the same server data stores that are configured to be persisted and others that are not. In contrast, it is not possible to enable persistence on a data store without it being enabled on the server.

Upon restart, RDFox restores the server and recreates the contents of all data stores in the server that were configured to be persisted.

Example

The command

./RDFox shell .

initializes an RDFox server where persistence is enabled by default and set to file. (Note that, in contrast, the command ./RDFox sandbox initializes the server with all persistence-related options set to off by default.)

Persistence can be controlled in the shell using the persist-ds (persist data store) option, which can be applied on a per data store basis. In particular, the shell command

dstore create store-name par-complex-nn persist-ds off

creates in the server a data store named store-name whose content is not saved to disk. The content of this data store will be lost when RDFox exits.
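
Conversely, a data store can be asked explicitly to persist its content. The following sketch assumes that persist-ds accepts the same file value that the server-level setting described above uses; persisted-store is a hypothetical store name.

dstore create persisted-store par-complex-nn persist-ds file

Under that assumption, the content of persisted-store would be recreated when RDFox is restarted, provided that persistence is also enabled on the server.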