7. Transactions in RDFox
Operations in a database are normally grouped into a transaction, which is a unit of work that must be executed atomically and in apparent isolation from other transactions.
Each transaction in RDFox operates on a single data store — that is, transactions cannot span several data stores within a single server or data stores in different servers.
A transaction can be rolled back (i.e., aborted without changing the data store) or committed.
Transactions in RDFox satisfy the well-known ACID properties:
- Atomicity
If an operation inside a transaction starts changing the store but then fails in the middle, the transaction will be rolled back and hence an operation in a transaction cannot be partially executed.
- Consistency
A transaction can only bring the store from a consistent state to another consistent state. In RDFox, consistency means that 1) every implicit fact that logically follows from the given rules and explicit facts has been materialized and 2) each constraint defined on the data store content is satisfied (see Section 7.2).
- Isolation
Each transaction appears to execute as if no other transaction were executing at the same time.
- Durability
The effect of a committed transaction is never lost; in particular, once a transaction has been committed, RDFox ensures that the state of the relevant data store is persisted on disk. Durability is configurable in RDFox: a data store can be configured so that its content is persisted whenever a transaction commits, and by default persistence is disabled (see Section 7.4).
7.1. Types of Transactions
Transactions in RDFox can be of one of the following three types:
Read/write.
Interruptible read-only.
Read-only.
A data store can be updated only by a read/write transaction. Changes made by a read/write transaction are immediately visible within that transaction, including, if the user so chooses, any new facts derived by reasoning.
Example Read-write transaction
Consider again our usual family example. Let us first initialize a store in the shell and load the data as we did in the Getting Started guide:
dstore create family par-complex-nn
active family
import data.ttl
set output out
We can now group in a read/write transaction a write operation consisting of a rule importation and a read operation consisting of a query as follows:
begin
import ! [?p, :hasChild, ?c] :- [?c, :hasParent, ?p] .
select ?p ?c where { ?p :hasChild ?c }
commit
Reasoning will happen after the rule is imported and hence the query results will reflect the changes in the materialization. In particular, we will obtain the following query results:
:lois :meg .
:peter :meg .
:peter :chris .
:lois :stewie .
If an operation in a transaction starts changing the store but fails in the middle, the transaction will be rolled back. For instance, in the previous transaction an error could happen in the middle of the import operation when the instruction has started changing the store but has not finished (thus leaving the database in an inconsistent state). In this case, the transaction will be rolled back.
In contrast, if an operation throws an exception without having changed the store, then there is no rollback. The idea is that if an operation throws an exception but it does not change the store, then you can just continue because you know that the failing operation failed in its entirety. For instance, consider the following read/write transaction where the second rule being imported contains syntax errors.
begin
import ! [?p, :hasDescendant, ?c] :- [?c, :hasParent, ?p] .
import ! [?x, :marriedTo ?y] - [?y, :marriedTo, ?x] .
commit
The first rule will be imported into the store and the importation of the second rule will fail. The transaction, however, will commit since the second importation operation has failed before it has actually made any changes to the store. Indeed, if we now run the query
select ?x ?y where {?x :hasDescendant ?y}
we will obtain four answers, showing that the first rule has taken effect.
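The difference between the two failure cases above can be sketched with a toy model. This is only an illustration in Python, not RDFox's API or implementation: the class ToyStore, its methods, and the rule that a "malformed" fact is any tuple that is not a triple are all assumptions made for the example.

```python
class ToyStore:
    """In-memory set of facts with a very small transaction model (illustrative only)."""

    def __init__(self):
        self.facts = set()
        self._snapshot = None  # copy of the facts taken when a transaction begins

    @property
    def in_transaction(self):
        return self._snapshot is not None

    def begin(self):
        self._snapshot = set(self.facts)

    def commit(self):
        self._snapshot = None

    def rollback(self):
        # Restore the state saved at begin() and close the transaction.
        self.facts = self._snapshot
        self._snapshot = None

    def apply(self, new_facts):
        """Add facts one by one; a malformed fact raises ValueError."""
        changed = False
        try:
            for fact in new_facts:
                if len(fact) != 3:
                    raise ValueError(f"malformed fact: {fact!r}")
                self.facts.add(fact)
                changed = True
        except ValueError:
            if changed:
                # The operation failed after modifying the store: undo everything.
                self.rollback()
            # Otherwise the store is untouched and the transaction stays open.
            raise
```

In this sketch, an operation that fails before adding anything leaves the transaction open so that it can still be committed, whereas an operation that fails after adding at least one fact rolls the whole transaction back, including the effects of earlier operations.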
Read-only transactions are only allowed to query a data store and cannot update its contents in any way. There are two types of read-only transactions, and the distinction between them is determined by how they can be concurrently executed with other transactions, as explained in Section 7.3.
Example Read-only transactions
Building on the previous example, we can write in the shell a transaction consisting of two queries over the store.
begin read
select ?p ?n where { ?p rdf:type :Person . ?p :forename ?n }
select ?x ?y where { ?x :marriedTo ?y }
commit
The shell will print the results of both queries. Attempting to update the store in a read-only transaction, as shown below, will immediately lead to an error in RDFox indicating that read-only transactions do not support updates.
begin read
import ! [?p, :hasChild, ?c] :- [?p, :hasDescendant, ?c] .
commit
7.2. Constraining Data Store Content
Transactions in which the default RDF graph contains at least one instance of the class <http://oxfordsemantic.tech/RDFox#ConstraintViolation> cannot be committed. Since RDFox runs incremental materialization prior to committing each read/write transaction, rules that derive an instance of the above class act as constraints on a data store’s content.
When an attempt to commit a transaction fails due to constraint violations, the resulting error message will include up to ten properties of up to ten of the violations to aid diagnosis of the problem.
The following examples use the prefix rdfox: to represent <http://oxfordsemantic.tech/RDFox#> and : to represent <http://example.com/>.
Example Mandatory property constraint
The following rule prevents instances of class foaf:Person from being added to the default graph unless they have at least one foaf:mbox property.
[?person, a, rdfox:ConstraintViolation] :-
[?person, a, foaf:Person],
NOT EXIST ?mbox IN [?person, foaf:mbox, ?mbox] .
With this rule loaded, attempting to import the following triples will fail with the message shown underneath.
:alice a foaf:Person; foaf:name "Alice" .
The transaction could not be committed because it would have introduced the following constraint violation:
<http://example.com/alice> <http://xmlns.com/foaf/0.1/name> "Alice";
rdf:type <http://xmlns.com/foaf/0.1/Person> .
Although it is possible to make existing resources members of the constraint violation class, as in the example above, more informative failure messages can be obtained by ensuring that each separate violation has a unique resource to represent it. The proprietary built-in function CONSTRAINT_VIOLATION (see Section 4.2.1.2) is provided to generate these resources. Although its use is not mandatory, it generates blank node identifiers that are shorter and more human-readable than those generated by the similar SKOLEM function, leading to more readable error messages.
Once each violation has its own resource, it is safe to add further atoms to the rule head to associate with the violation any additional information that will help the reader of the error message understand what is wrong.
Example Improved mandatory property constraint
As in the previous example, the following rule prevents insertion of foaf:Person instances with no foaf:mbox property, but this time using CONSTRAINT_VIOLATION.
[?v, a, <http://oxfordsemantic.tech/RDFox#ConstraintViolation>],
[?v, :mboxMissingFrom, ?person],
[?v, :constraintDescription, "Every foaf:Person must have at least one foaf:mbox property."] :-
[?person, a, foaf:Person],
NOT EXIST ?mbox IN [?person, foaf:mbox, ?mbox],
BIND(CONSTRAINT_VIOLATION("MissingMbox", ?person) AS ?v) .
The rule head classifies the blank node created by CONSTRAINT_VIOLATION as a constraint violation and gives it additional properties identifying the deficient foaf:Person node and describing the constraint it violates in natural language. With this rule loaded, the failure message for importing the same data as in the previous example is:
The transaction could not be committed because it would have introduced the following constraint violation:
_:ConstraintViolation-FB4C4AFAE9E5657A832BF992EA1388653C4B668B25F0690065A61C2CABD0B3C <http://example.com/constraintDescription> "Every foaf:Person must have at least one foaf:mbox property.";
<http://example.com/mboxMissingFrom> <http://example.com/alice> .
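The commit-time check described in this section can be pictured with a small Python model. It is a sketch under stated assumptions, not RDFox code: derive_violations stands in for materializing the mandatory-property rule, try_commit stands in for the commit step, and the error text is invented for the example rather than taken from RDFox.

```python
RDF_TYPE = "rdf:type"
VIOLATION = "rdfox:ConstraintViolation"


def derive_violations(facts):
    """Toy stand-in for the rule: every foaf:Person needs at least one foaf:mbox."""
    people = {s for (s, p, o) in facts if p == RDF_TYPE and o == "foaf:Person"}
    with_mbox = {s for (s, p, o) in facts if p == "foaf:mbox"}
    return {(person, RDF_TYPE, VIOLATION) for person in people - with_mbox}


def try_commit(committed, pending):
    """Return the new committed state, or raise if it would contain a violation."""
    candidate = committed | pending
    violations = derive_violations(candidate)
    if violations:
        offenders = sorted(s for (s, _, _) in violations)
        # Hypothetical message; the committed state is left unchanged.
        raise RuntimeError(f"constraint violation for: {offenders}")
    return candidate


alice = {(":alice", RDF_TYPE, "foaf:Person"), (":alice", "foaf:name", "Alice")}
mbox = {(":alice", "foaf:mbox", "mailto:alice@example.com")}
state = try_commit(set(), alice | mbox)  # succeeds because :alice has a mailbox
```

Calling try_commit(set(), alice) without the mailbox triple raises instead of returning a new state, which mirrors how the violating import in the examples above is refused as a whole.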
7.3. Concurrent Execution of Transactions
At each point in time the following transactions can be active in a data store:
a single read/write transaction; or
multiple read-only transactions (interruptible or not).
As a result of this model, the common issues associated with concurrent execution of transactions in databases (e.g., “dirty reads”) cannot occur in RDFox. In particular, RDFox achieves the serializability isolation level without the need to implement any mechanism (such as locking) to prevent concurrency anomalies.
A read/write transaction cannot be started on a data store until all other read/write transactions and all non-interruptible read-only transactions on that data store have finished. In contrast, a read/write transaction can be started while interruptible read-only transactions are active on the data store, in which case those transactions are interrupted. Depending on whether and which operations are active on an interruptible read-only transaction, starting the read/write transaction might be delayed until these operations finish.
Crucially, however, the existence of a cursor on an interruptible read-only transaction does not prevent the transaction from being interrupted immediately. If a transaction is interrupted, any subsequent operations on that transaction (or on a cursor associated with the transaction) will fail with an appropriate error. Thus, interruptible read-only transactions can be convenient when an application needs to present query results using a cursor, but holding an open cursor should not prevent updates from happening.
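The admission rules of this section can be summarized in a toy scheduler. The Python sketch below is an assumption-laden model, not RDFox's implementation: the ToyScheduler class and its method names are invented, a False return value stands for "must wait", and the model ignores the delay that in-flight operations on an interruptible transaction may cause.

```python
from dataclasses import dataclass, field


@dataclass
class ToyScheduler:
    """Toy admission rules for a single data store (illustrative only)."""

    writer_active: bool = False
    plain_readers: set = field(default_factory=set)   # non-interruptible readers
    interruptible: set = field(default_factory=set)   # interruptible readers
    interrupted: set = field(default_factory=set)     # readers cut off by a writer

    def begin_read(self, tx_id, interruptible=False):
        if self.writer_active:
            return False  # readers and a writer are never active together
        (self.interruptible if interruptible else self.plain_readers).add(tx_id)
        return True

    def begin_write(self):
        if self.writer_active or self.plain_readers:
            return False  # wait for the writer or the non-interruptible readers
        # Interruptible readers do not block the writer; they are interrupted.
        self.interrupted |= self.interruptible
        self.interruptible.clear()
        self.writer_active = True
        return True

    def end_read(self, tx_id):
        self.plain_readers.discard(tx_id)
        self.interruptible.discard(tx_id)

    def end_write(self):
        self.writer_active = False

    def can_use(self, tx_id):
        """False once a read-only transaction has ended or been interrupted."""
        return tx_id in self.plain_readers or tx_id in self.interruptible
```

For example, after begin_read("r1", interruptible=True) a call to begin_write() succeeds and can_use("r1") becomes False, whereas with a non-interruptible reader active begin_write() returns False until end_read is called.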
7.4. Persistence on Disk
RDFox’s persistence mechanism ensures durability of transactions by saving data store content to disk whenever a transaction commits.
While it is possible to use the built-in save and load commands to achieve durability, the persistence mechanism differs in the following key ways.
Persistence is built into RDFox transactions. This ensures that content is persisted atomically. If saving to disk fails, for example because the disk is full, the entire transaction is rolled back; therefore, it is not possible to have a situation where the in-memory content is different from that stored on disk.
Persistence works at the server level rather than the data store level and this allows the complete server configuration to be restored.
As a data store is modified, only the changes or deltas are saved, which increases efficiency in practice.
Persistence in RDFox is configurable. If we configure the server to be persisted, then we can configure each store in the server individually; hence, it is possible to have within the same server data stores that are configured to be persisted and others that are not. In contrast, it is not possible to enable persistence on a data store without it being enabled on the server.
Upon restart, RDFox will restart the server and recreate the contents of all the data stores in the server that were configured to be persisted.
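The combination of delta-based saving, rollback on a failed save, and recovery on restart can be illustrated with a small append-only log. This Python sketch makes no claim about RDFox's on-disk format or internals: the ToyPersistentStore class, the JSON-lines log, and the file name passed to it are all assumptions chosen to keep the example short.

```python
import json
import os


class ToyPersistentStore:
    """Facts kept in memory and mirrored to an append-only delta log (illustrative)."""

    def __init__(self, path):
        self.path = path
        self.facts = set()
        # On restart, rebuild the in-memory content by replaying every saved delta.
        if os.path.exists(path):
            with open(path, encoding="utf-8") as log:
                for line in log:
                    self._apply(json.loads(line))

    def _apply(self, delta):
        self.facts |= {tuple(f) for f in delta["added"]}
        self.facts -= {tuple(f) for f in delta["removed"]}

    def commit(self, added=(), removed=()):
        """Save only the changes; update memory only if the save succeeded."""
        delta = {
            "added": [list(f) for f in added],
            "removed": [list(f) for f in removed],
        }
        with open(self.path, "a", encoding="utf-8") as log:
            log.write(json.dumps(delta) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # Reached only when writing raised no OSError, so memory never gets ahead of disk.
        self._apply(delta)
```

Because the in-memory set is updated only after the delta has been written, a failed write (reported as an OSError in this sketch) leaves memory and disk in agreement, and constructing a new ToyPersistentStore on the same path reproduces the committed content.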
Example
The command
./RDFox shell .
initializes an RDFox server where persistence is enabled by default and set to file. (Note that, in contrast, the command ./RDFox sandbox initializes the server with all persistence-related options set to off by default.)
Persistence can be controlled in the shell using the persist-ds (persist data store) option, which can be applied on a per data store basis. In particular, the shell command
dstore create store-name par-complex-nn persist-ds off
creates in the server a data store named store-name whose content is not saved to disk. The content of this data store will be lost when RDFox exits.