16. APIs
Programmatic control of RDFox can be gained remotely via a RESTful API exposed through an HTTP endpoint or in-memory via Java.
This section describes the functionality provided in both APIs for managing the different information elements of the system. It should be read as a reference for the Java and REST APIs in RDFox, and it assumes an understanding of the structure of RDFox as described in Section 4.
16.1. Basics of the Java API
The Java API provides access to RDFox via connections to a server and/or its data stores. A connection encapsulates the identity of the object being connected to, as well as the credentials of the user making the connection. The following example demonstrates a typical life cycle of a connection.
String serverURL = ...;
String roleName = ...;
String password = ...;
ServerConnection sConn = ConnectionFactory.newServerConnection(serverURL, roleName, password);
// Use the server connection...
String dataStoreName = ...;
DataStoreConnection dsConn = sConn.newDataStoreConnection(dataStoreName);
// Use the data store connection...
dsConn.close();
sConn.close();
Both server and data store connections must be closed after use in order to release system resources. There is no requirement that a server connection be closed after a data store connection; that is, the two connections are independent.
For convenience, one can connect to a data store directly.
String serverURL = ...;
String dataStoreName = ...;
String roleName = ...;
String password = ...;
DataStoreConnection dsConn = ConnectionFactory.newDataStoreConnection(serverURL, dataStoreName, roleName, password);
// Use the data store connection...
dsConn.close();
All connections are single-threaded — that is, they can safely be used only from one thread at a time. Using the same connection from multiple threads results in undefined behavior and can lead to a system crash (although the server itself will not be corrupted provided that the containing process survives the crash). To use RDFox concurrently, one should use a distinct connection per execution thread.
RDFox provides various APIs for adding and deleting facts and rules. All updates are performed within the context of a transaction, which ensures that either all changes are performed as a unit, or no changes are performed at all. The transaction API is described in more detail in Section 16.11.
Adding or deleting facts or rules might require adjusting the inferred facts. In most cases, RDFox achieves this by using highly optimized incremental reasoning algorithms, whose aim is to update the derived facts while minimizing the amount of work. This process is automatically initiated before a query is evaluated in a transaction; thus, each query evaluated in a transaction always sees the results of prior updates made on the transaction. To promote performance, incremental reasoning is initiated only when a query is issued or a transaction is committed; thus, if several updates are issued before a transaction is committed, incremental reasoning is run only once.
It is generally good practice to add all rules before the facts, or to add rules and facts in an arbitrary order but grouped in a single transaction. This will usually increase the performance of the first reasoning operation.
16.2. Basics of the RESTful API
The RESTful API is available whenever the RDFox Endpoint is listening. Please refer to Section 19 for details of how to configure, start and stop the endpoint.
The endpoint provides access to one RDFox server via the following API keys.
/ : management of the server (GET/PATCH)
    /commands : remote submission of shell commands (POST)
    /connections : management of server connections (GET/POST)
        /<SRVCONN> : management of a server connection (GET/PATCH/DELETE)
    /datastores : listing available data stores (GET)
        /<DSTRNAME> : management of a data store (GET/PATCH/POST/DELETE)
            /connections : management of data store connections (GET/POST)
                /<DSCONN> : management of a data store connection (GET/PATCH/DELETE)
                    /cursors : management of transaction cursors (GET/POST)
                        /<CURSID> : management of a cursor (GET/PATCH/DELETE)
                    /transaction : management of the connection transaction (GET/POST)
            /content : data store content (GET/PATCH/PUT/POST/DELETE)
            /datasources : listing available data sources (GET)
                /<DSRCNAME> : management of a data source (GET/POST/DELETE)
                    /tables : listing available data source tables (GET)
                        /<DTNAME> : information about a data source table (GET)
                            /data : sampling facts of a data source table (GET)
            /explanation : explanation of the reasoning process (GET)
            /prefixes : prefixes of the data store (GET/PUT/PATCH)
            /sparql : data store SPARQL endpoint (GET/POST)
            /stats : listing the available statistics (GET/PUT)
                /<STNAME> : management of the statistics (GET/PUT/POST/DELETE)
            /tupletables : listing available tuple tables (GET)
                /<TTNAME> : management of a tuple table (GET/POST/DELETE)
    /health : checking that the endpoint is healthy (GET)
    /password : changing the password of the authenticated role (PUT)
    /roles : listing roles (GET)
        /<ROLENAME> : management of a role (POST/DELETE)
            /privileges : management of a role's privileges (GET/PATCH)
            /memberships : management of a role's memberships (GET/PATCH)
            /members : listing a role's members (GET)
    /shells : management of shells (GET/POST)
        /<SHELL> : management of a shell (GET/PATCH/DELETE)
16.2.1. Authentication
The RESTful API supports basic HTTP authentication. For example, to supply role name Aladdin with password OpenSesame, one should include the following header in the request:
Authorization: Basic QWxhZGRpbjpPcGVuU2VzYW1l
Since the RESTful API is stateless, this header should be included with each call; that is, the role name and password are not kept between calls.
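The header value above is the Base64 encoding of the role name and password joined by a colon, as defined by HTTP basic authentication. The following minimal sketch shows how a client might compute it using only the JDK; the class and method names are illustrative and are not part of the RDFox API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BasicAuthHeader {
    /** Builds the value of the Authorization header for HTTP basic authentication. */
    static String basicAuth(String roleName, String password) {
        // Basic authentication encodes "<role name>:<password>" using Base64.
        String credentials = roleName + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Prints: Authorization: Basic QWxhZGRpbjpPcGVuU2VzYW1l
        System.out.println("Authorization: " + basicAuth("Aladdin", "OpenSesame"));
    }
}
```

Because Base64 is an encoding rather than encryption, the credentials are only as well protected as the transport that carries them.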
When no Authorization header is present in a RESTful API call, the call is processed with the role name guest and password guest. To prevent anonymous access via the RESTful API, the guest role can be deleted.
RDFox also supports a proprietary RDFox authentication scheme intended for use with explicit connection management. This feature is described in Section 16.2.5.
16.2.2. Arguments Involving Key-Value Pairs
Several API calls take a set of key-value pairs as arguments. In the RESTful API, these can be encoded into the query string, or into the request body using the application/x-www-form-urlencoded content type for PATCH/POST/PUT requests; if a request requires both a message body and request parameters, then the request parameters must be part of the query string. Some REST API requests require specific HTTP parameters (such as ‘operation’ for data store content addition/deletion), so only the keys not explicitly mentioned in the documentation for the REST request are interpreted by RDFox as key-value parameters (which are universal across all RDFox APIs).
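As an illustration, the following sketch encodes a set of key-value pairs into a query string using only the JDK (Java 10 or later). The class and method names are hypothetical, and the sample keys are taken from request examples elsewhere in this section. Note that URLEncoder implements form encoding, so the same string can also serve as an application/x-www-form-urlencoded request body.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class QueryStringBuilder {
    /** Encodes the key-value pairs as key1=val1&key2=val2 with percent-encoding. */
    static String toQueryString(Map<String, String> parameters) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> entry : parameters.entrySet()) {
            joiner.add(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> parameters = new LinkedHashMap<>();
        parameters.put("operation", "set");
        parameters.put("base-iri", "http://example.com/");
        // Prints: /datastores/myStore?operation=set&base-iri=http%3A%2F%2Fexample.com%2F
        System.out.println("/datastores/myStore?" + toQueryString(parameters));
    }
}
```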
16.2.3. Treating GET Results as Answers to SPARQL Queries
Many RESTful API calls return information about various parts of the data store. For example, one can list all data stores in a server, all data sources in a data store, and so on. In order to avoid introducing additional formats, the output of all such requests is formatted as answers to certain SPARQL queries. (This does not mean that such a query can be evaluated through a SPARQL endpoint; rather, it only means that the same result format is reused to represent query results.)
Answers of such queries can be serialized using any of the supported query answer formats (see Section 9.1.2) apart from application/sparql-results+resourceid.
Content negotiation determines the format to be used, as usual in the SPARQL 1.1 protocol. The examples in this document use the CSV format for simplicity. All such calls accept an optional parameter with name filter, whose value must be a SPARQL 1.1 FILTER expression. If a filter expression is specified, it is evaluated for each answer in the list, and only those answers on which the expression returns true are returned.
16.2.4. RESTful Connections and Transactions
Just like in the Java API, each RESTful API request is evaluated within the context of a server or a data store connection. The RESTful endpoint provides two ways of associating a connection with each request.
If no connection management headers are present in the HTTP request, each request will be evaluated in the context of a fresh connection. This provides users with a convenient way of using the RESTful API without any complication with connection management, which is arguably not natural in a connectionless protocol such as HTTP.
By including a connection HTTP request parameter, users can specify that the request should be evaluated within a specific connection. In such a case, a connection can be understood as a session: creating a connection requires checking the caller’s credentials, and subsequent requests on this connection are performed with the credentials associated with the connection. Moreover, connections can be used to support user-controlled transactions. Finally, the RESTful API provides calls for managing server and data store connections.
Most RESTful API calls are evaluated inside a read-only or a read/write transaction, which is started implicitly whenever the underlying connection is not already associated with a transaction. Depending on the workload, starting a transaction may take a long period of time. In order to prevent API calls from being blocked indefinitely, the RESTful API will cancel a request and report an error if the transaction cannot be acquired within a predetermined time period (which is currently hard-coded to two seconds).
16.2.5. Explicit Connection Management
The /connections key can be used to manage server connections, and the /datastores/<DSTRNAME>/connections key is used to manage connections to data store <DSTRNAME>. Both provide exactly the same API, so all examples in the rest of this section are presented for the latter connection type. All examples assume that a data store called myStore has been created in the server.
The following request creates a connection to the data store called myStore. The connection is identified by an identifier, which is returned in the Location response header. The newly created connection is associated with the role specified in the request; that is, if provided, the Authorization header specifies the role name and password, and otherwise the guest role is used. The response will also contain an RDFox-Authentication-Token header, which will contain another random value that can be used for authentication on the connection as described below; since this value is used for authentication, measures should be taken to keep it secret.
Request
POST /datastores/myStore/connections HTTP/1.1
Host: localhost
Response
HTTP/1.1 201 Created
RDFox-Authentication-Token: 11111222223333344444
Location: /datastores/myStore/connections/01234567890123456789
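The exchange above could be driven from Java with the JDK's java.net.http package (Java 11 or later). The following is a sketch only: the base URL http://localhost, the class name, and the method names are assumptions made for illustration. The request is built but not sent, so the example runs without an endpoint.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class CreateConnectionSketch {
    /** Builds the POST request that asks the endpoint for a new data store connection. */
    static HttpRequest createConnectionRequest(String baseUrl, String dataStoreName) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/datastores/" + dataStoreName + "/connections"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    /** Extracts the connection identifier, i.e. the last path segment of the Location header. */
    static String connectionIdFromLocation(String location) {
        return location.substring(location.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // "http://localhost" is an assumed endpoint address.
        HttpRequest request = createConnectionRequest("http://localhost", "myStore");
        System.out.println(request.method() + " " + request.uri());
        // Prints: 01234567890123456789
        System.out.println(connectionIdFromLocation(
                "/datastores/myStore/connections/01234567890123456789"));
    }
}
```

After sending the request with HttpClient.send, the Location and RDFox-Authentication-Token values can be read from the response with response.headers().firstValue(...).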
Any RESTful API request that requires a data store connection can now be performed on a specific connection by including the connection ID as the value of the connection request parameter. For example, the following request will import the data into the data store using the connection created above.
Request
POST /datastores/myStore/content?connection=01234567890123456789 HTTP/1.1
Host: localhost
[The facts/rules to be added in a format supported by RDFox]
Response
HTTP/1.1 200 OK
[Response body as usual]
All such requests are performed with the role associated with the connection. RDFox provides two ways of making sure that such requests are indeed issued by the appropriate user.
One can use basic authentication to supply a role name and password. For the request to succeed, the role name must match the name of the role logged into the connection, and the password must be valid for the role at the time the request is serviced.
Alternatively, one can use the RDFox authentication scheme by including the header Authorization: RDFox <token>, where <token> is the authentication token returned when the connection was created. For example, the above request can be issued as follows:
Request
POST /datastores/myStore/content?connection=01234567890123456789 HTTP/1.1
Host: localhost
Authorization: RDFox 11111222223333344444
[The facts/rules to be added in a format supported by RDFox]
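A client might assemble such a token-authenticated request as sketched below, again using java.net.http (Java 11 or later). The base URL, the class and method names, and the sample triple are assumptions for illustration; the connection ID and token are the placeholder values used in the examples above.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class TokenAuthRequestSketch {
    /** Builds a content POST that runs on an existing connection and authenticates by token. */
    static HttpRequest addContentRequest(String baseUrl, String dataStoreName,
                                         String connectionId, String token, String body) {
        URI uri = URI.create(baseUrl + "/datastores/" + dataStoreName
                + "/content?connection=" + connectionId);
        return HttpRequest.newBuilder()
                .uri(uri)
                // The RDFox scheme carries the token returned when the connection was created.
                .header("Authorization", "RDFox " + token)
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical content; any format supported by RDFox could be sent here.
        String body = "<http://example.com/s> <http://example.com/p> <http://example.com/o> .";
        HttpRequest request = addContentRequest("http://localhost", "myStore",
                "01234567890123456789", "11111222223333344444", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().firstValue("Authorization").orElse("(none)"));
    }
}
```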
Created connections can be managed using the /datastores/<DSTRNAME>/connections/<DSCONN> key. A GET request on the connection provides information about the connection. The response is written as the output of a SPARQL query that binds the variable ?Property to the property name, variable ?Value to the property value, and variable ?Mutable to true if the value of the property can be changed and to false otherwise. At present, role-name is the only property associated with the connection, and its value reflects the name of the role associated with the connection.
Request
GET /datastores/myStore/connections/01234567890123456789 HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value,Mutable
role-name,guest,false
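Responses of this Property,Value,Mutable shape recur throughout this section, so a client may wish to load them into a map. The following is a deliberately simple sketch: it assumes that no cell contains a comma, quote, or line break, which a general CSV parser would have to handle. The class and method names are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PropertyCsvParser {
    /** Maps each property name to its value, ignoring the header row and the Mutable column. */
    static Map<String, String> parseProperties(String csv) {
        Map<String, String> properties = new LinkedHashMap<>();
        String[] lines = csv.split("\\r?\\n");
        for (int i = 1; i < lines.length; i++) { // line 0 is the header row
            if (lines[i].isEmpty()) {
                continue;
            }
            // Simplification: cells are assumed not to contain commas or quotes.
            String[] cells = lines[i].split(",", -1);
            if (cells.length >= 2) {
                properties.put(cells[0], cells[1]);
            }
        }
        return properties;
    }

    public static void main(String[] args) {
        String csv = "Property,Value,Mutable\nrole-name,guest,false\n";
        // Prints: guest
        System.out.println(parseProperties(csv).get("role-name"));
    }
}
```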
A connection can be deleted using a DELETE request.
Request
DELETE /datastores/myStore/connections/01234567890123456789 HTTP/1.1
Host: localhost
Authorization: RDFox 11111222223333344444
Response
HTTP/1.1 204 No Content
A PATCH request can be used to check the password of the role associated with the connection, to interrupt another request currently running on the connection, or to duplicate the connection. The type of request is specified in the operation request parameter. When checking the role password, the request body specifies the password to be checked. The remaining connection operations accept no parameters and the request body must be empty.
Request
PATCH /datastores/myStore/connections/01234567890123456789?operation=check-password HTTP/1.1
Host: localhost
Password
Response
HTTP/1.1 204 No Content
Request
PATCH /datastores/myStore/connections/01234567890123456789?operation=interrupt HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Request
PATCH /datastores/myStore/connections/01234567890123456789?operation=duplicate HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Location: /datastores/myStore/connections/98765432109876543210
Finally, GET on /datastores/myStore/connections lists the connections to data store myStore. The response is written as the output of a SPARQL query that binds the variable ?Name to the connection identifier.
Request
GET /datastores/myStore/connections HTTP/1.1
Host: localhost
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Name
01234567890123456789
98765432109876543210
Server connections are managed in exactly the same way.
16.2.5.1. Connections and Concurrency
Access to connections in the RESTful API is serialized: if two requests attempt to access the same connection at the same time, one of the requests will fail in order to safeguard the integrity of the RDFox server. To use RDFox concurrently from multiple requests, one should use distinct connections. Without explicit connection management, this is automatically achieved by creating a temporary connection to service each request.
16.2.5.2. Connection Expiry
Since the RESTful API is connectionless, there is no way to associate a data store or server connection with a physical network connection to the server. In order to avoid situations where a connection is created but never deleted, the RESTful API will delete a connection if it has not been used (i.e., no HTTP request accessed it) for a period longer than the value of the object-keep-alive-time endpoint parameter. That is, a connection will remain valid for at least that much time (but it may actually remain valid slightly longer).
16.3. Managing Servers
This section describes the API calls responsible for managing an RDFox server.
16.3.1. Retrieving Server Properties
The following request retrieves standard properties of a server. The response is written as the output of a SPARQL query that binds the variable ?Property to the property name, variable ?Value to the property value, and variable ?Mutable to true if the value of the property can be changed and to false otherwise. The names of all properties specified at the time the server was created carry the prefix parameters. (including the trailing dot) so that they can be identified in the output.
Request
GET / HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value,Mutable
version,5.7,false
git-SHA,74840e53a04b64af24538483163d3aabb7a04c7e,false
max-memory,100000000,true
available-memory,80000000,false
num-threads,8,true
parameters.max-memory,100000000,false
The Java API provides various getter functions on ServerConnection to retrieve the properties of a server.
Java API
int numThreads = sConn.getNumberOfThreads();
// ...
16.3.2. Setting Server Properties
The following request updates the server properties using the values specified in the request. Only property names returned in a GET call from the previous section are supported, and only mutable properties can be changed.
Request
PATCH /?operation=set&num-threads=5 HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
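For clients written against java.net.http (Java 11 or later), note that HttpRequest.Builder offers no dedicated PATCH shortcut, so the generic method(...) call is used instead. The sketch below builds the request shown above without sending it; the base URL and the class and method names are assumptions for illustration.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class SetServerPropertySketch {
    /** Builds PATCH /?operation=set&<name>=<value> with an empty request body. */
    static HttpRequest setServerProperty(String baseUrl, String name, String value) {
        String query = "operation=set&"
                + URLEncoder.encode(name, StandardCharsets.UTF_8) + "="
                + URLEncoder.encode(value, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/?" + query))
                .method("PATCH", HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        // "http://localhost" is an assumed endpoint address.
        HttpRequest request = setServerProperty("http://localhost", "num-threads", "5");
        // Prints: PATCH http://localhost/?operation=set&num-threads=5
        System.out.println(request.method() + " " + request.uri());
    }
}
```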
The Java API provides various setter functions on ServerConnection to set the properties of a server.
Java API
int numberOfThreads = ...;
sConn.setNumberOfThreads(numberOfThreads);
16.4. Managing Data Stores
This section describes the API calls responsible for managing data stores of an RDFox server.
16.4.1. Listing Available Data Stores
The following request retrieves the list of data stores available at a server. The response is written as an output of a SPARQL query that binds variable ?Name to the names of the available data stores.
Request
GET /datastores HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Name
myStore
yourStore
theirStore
Java API
List<DataStoreInfo> dataStoreInfos = sConn.listDataStores();
16.4.2. Creating a Data Store
The following request creates a new data store. The data store name is specified as part of the request URL. Various data store parameters can be provided using the mechanism described in Section 16.2.2. The type parameter must be provided, and it specifies the type of the new data store. The location of the new store is returned in the Location header.
Request
POST /datastores/myStore?type=parallel-nn&key1=val1&key2=val2 HTTP/1.1
Host: localhost
Response
HTTP/1.1 201 Created
Location: /datastores/myStore
Java API
Map<String, String> parameters = new HashMap<String, String>();
parameters.put("key1", "val1");
parameters.put("key2", "val2");
sConn.createDataStore("myStore", "parallel-nn", parameters);
16.4.3. Deleting a Data Store
The following request deletes a data store.
Request
DELETE /datastores/myStore HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
sConn.deleteDataStore("myStore");
Deleting a data store invalidates all connections to it — that is, any request made on the connection will result in an error. However, all connections to the deleted data store must still be explicitly closed in order to release all system resources.
16.4.4. Retrieving Data Store Properties
The following request retrieves standard properties of the data store. The response is written as an output of a SPARQL query that binds the variable ?Property to the property name, variable ?Value to the property value, and variable ?Mutable to true if the value of the property can be changed and to false otherwise. The exact properties and their values are dependent on the type of the data store. The names of all properties specified at the time the data store was created carry the prefix parameters. (including the trailing dot) so that they can be identified in the output.
Request
GET /datastores/myStore HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value,Mutable
name,TestDataStore,false
unique-id,01234567890,false
data-store-version,2,false
parameters.by-levels,true,false
parameters.equality,off,false
parameters.use-DRed,false,false
concurrent,true,false
equality-axiomatization,off,false
generation-counter,0,false
base-iri,https://rdfox.com/default-base-iri/,true
requires-incremental-reasoning,false,false
The Java API provides various getter functions on DataStoreConnection to retrieve the basic properties of a data store.
Java API
String type = dsConn.getType();
String uniqueID = dsConn.getUniqueID();
// ...
16.4.5. Setting Properties and Manipulating a Data Store
The PATCH request can be used to alter the content of a data store or how the data store is persisted. The query parameter operation should be set to one of the operation values given in the table below.
Operation | Description |
---|---|
 | Removes all facts, axioms and rules from the data store. Equivalent to the shell command clear. |
clear-rules-explicate-facts | Clears all rules and makes all facts explicit. Equivalent to the shell command clear rules-explicate-facts. |
 | Compacts all facts in the data store, reclaiming the space used by the deleted facts in the process and persistent storage. Please refer to Section 13.1.3 for details. |
 | Recompiles the rules in the current data store according to the current statistics. Equivalent to the shell command recompilerules. |
 | Performs a full, from-scratch materialization within the data store. Equivalent to the shell command remat. |
set | Sets a mutable data store property. The property name and value are specified as |
 | Explicitly updates the set of materialized facts in the data store. Unlike |
For example, clearing rules and explicating facts can be achieved as follows.
Request
PATCH /datastores/myStore?operation=clear-rules-explicate-facts HTTP/1.1
Host: localhost
Accept: */*
Response
HTTP/1.1 204 No Content
Java API
dsConn.clearRulesExplicateFacts();
The base IRI of the data store can be set to http://example.com/ as follows.
Request
PATCH /datastores/myStore?operation=set&base-iri=http%3A%2F%2Fexample.com%2F HTTP/1.1
Host: localhost
Accept: */*
Response
HTTP/1.1 204 No Content
Java API
dsConn.setBaseIRI("http://example.com/");
16.4.6. In-Depth Diagnostic Information
RDFox can report extensive diagnostic information about its internal components, which is often useful during performance tuning and debugging. Please note that this call is intended for diagnostic purposes only. The information provided is determined by RDFox internals and is likely to change in future versions of RDFox. Thus, applications should not rely on this information being stable.
Diagnostic information can be retrieved at the level of the server (by querying the / key) or for a specific data store (by querying the appropriate subkey of /datastores). In either case, the component-info request parameter can be specified with values short or extended to determine whether a shortened or an extended report should be returned. The result is organized in a tree of hierarchical components. The component at the root of this tree represents a data store, and it contains a number of subcomponents that represent various parts of the data store. For example, there is a subcomponent for each tuple table, a subcomponent for each registered data source, and so on. The structure of the component tree is determined by the data store type. The state of each component in the tree is described using a list of property/value pairs, where values can be strings or numeric values.
To output this complex data structure, the RESTful API converts the component tree into a list as follows. Each component in the tree is assigned an integer component ID using depth-first traversal (with root being assigned ID one). Then, the tree is serialized as a result of a query containing three variables. In each result to the query, variable ?ComponentID contains the ID of the component, variable ?Property contains the name of the property describing the component with the ID stored in ?ComponentID, and variable ?Value represents the property value. For each component, the result contains a row with ?Property="Component name" and where ?Value contains the name of the component. Finally, for each component other than the root, the result contains a row with ?Property="Parent component ID" and where ?Value contains the ID of the parent component.
Request
GET /datastores/myStore?component-info=extended HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
ComponentID,Property,Value
1,Component name,RDFStore
1,Name,TestDataStore
1,Unique ID,0123456789
1,Type,parallel-nn
1,Concurrent,yes
... etc ...
2,Component name,Parameters
2,Parent component ID,1
2,by-levels,true
2,equality,off
2,use-DRed,false
... etc ...
In the above example, diagnostic information is requested for data store myStore. The root of the result is the component with ID 1, which represents the data store. Properties such as Name, Unique ID, and so on provide information about the data store. The component with ID 2 is a subcomponent of the data store. It provides information about the parameters that the data store was created with, such as by-levels and equality.
Java API
ComponentInfo componentInfo = sConn.getComponentInfo(true);
... or ...
ComponentInfo componentInfo = dsConn.getComponentInfo(true);
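A client of the RESTful API can rebuild the component tree from the flat ComponentID/Property/Value rows described above. The sketch below assumes the rows have already been parsed into string triples; the Component class and all names are illustrative, and they are unrelated to the ComponentInfo class of the Java API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ComponentTreeSketch {
    static final class Component {
        final int id;
        String name;
        Integer parentId; // stays null for the root component
        final Map<String, String> properties = new LinkedHashMap<>();
        final List<Component> children = new ArrayList<>();

        Component(int id) {
            this.id = id;
        }
    }

    /** Rebuilds the tree; each row is {ComponentID, Property, Value}. Returns the root. */
    static Component buildTree(List<String[]> rows) {
        Map<Integer, Component> byId = new LinkedHashMap<>();
        for (String[] row : rows) {
            Component component = byId.computeIfAbsent(Integer.parseInt(row[0]), Component::new);
            switch (row[1]) {
                case "Component name":
                    component.name = row[2];
                    break;
                case "Parent component ID":
                    component.parentId = Integer.valueOf(row[2]);
                    break;
                default:
                    component.properties.put(row[1], row[2]);
            }
        }
        // Link children to parents once every component has been created.
        Component root = null;
        for (Component component : byId.values()) {
            if (component.parentId == null) {
                root = component;
            } else {
                Component parent = byId.get(component.parentId);
                if (parent != null) {
                    parent.children.add(component);
                }
            }
        }
        return root;
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
                new String[] {"1", "Component name", "RDFStore"},
                new String[] {"1", "Name", "TestDataStore"},
                new String[] {"2", "Component name", "Parameters"},
                new String[] {"2", "Parent component ID", "1"},
                new String[] {"2", "by-levels", "true"});
        Component root = buildTree(rows);
        System.out.println(root.name + " has " + root.children.size() + " subcomponent(s)");
    }
}
```

Linking is done in a second pass so that the result does not depend on the order in which rows arrive.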
16.5. Managing Data Store Prefixes
As explained in Section 4, each data store keeps track of a base IRI and a set of prefixes, which provide the defaults for content import/export and query evaluation operations. The base IRI is managed as a data store property as explained in Section 16.4.4 and Section 16.4.5. In contrast, data store prefixes are managed as explained in this section.
16.5.1. Retrieving Prefixes
The following request retrieves the prefixes of the data store. The response is written as an output of a SPARQL query that binds the variable ?PrefixName to the prefix name and variable ?PrefixIRI to the prefix IRI.
Request
GET /datastores/myStore/prefixes HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
PrefixName,PrefixIRI
:,http://example.com/
owl:,http://www.w3.org/2002/07/owl#
rdf:,http://www.w3.org/1999/02/22-rdf-syntax-ns#
... etc ...
Java API
Prefixes prefixes = dsConn.getPrefixes();
16.5.2. Setting Prefixes in Bulk
The following request replaces the prefixes of the data store with the ones specified in the request. The request content can be any of the formats supported by RDFox that contains just the prefixes (and no facts, rules, or axioms).
Request
PUT /datastores/myStore/prefixes HTTP/1.1
Host: localhost
@prefix p: <http://new.prefix/> .
Response
HTTP/1.1 204 No Content
Java API
Prefixes prefixes = ...;
dsConn.setPrefixes(prefixes);
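When calling the RESTful API, the request body for a bulk update can be generated from a map of prefix names to prefix IRIs. The following sketch emits Turtle @prefix declarations like the one in the request above; it assumes that each prefix name already ends with a colon and that the IRIs need no escaping. The class and method names are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PrefixBodySketch {
    /** Serializes the prefixes as Turtle declarations, one per line. */
    static String toTurtlePrefixes(Map<String, String> prefixes) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : prefixes.entrySet()) {
            // Keys are prefix names including the trailing colon, for example "p:".
            body.append("@prefix ").append(entry.getKey())
                .append(" <").append(entry.getValue()).append("> .\n");
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> prefixes = new LinkedHashMap<>();
        prefixes.put("p:", "http://new.prefix/");
        // Prints: @prefix p: <http://new.prefix/> .
        System.out.print(toTurtlePrefixes(prefixes));
    }
}
```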
16.5.3. Setting One Prefix
One prefix of the data store can be set using a PATCH request with the operation=set request parameter. The prefix name and prefix IRI are specified using the prefix-name and prefix-iri request parameters, respectively. The response is written as an output of a SPARQL query that binds the variable ?Changed to the Boolean value reflecting whether the set of data store prefixes was changed or not.
The following request sets the prefix name pn: to the prefix IRI http://example.com/.
Request
PATCH /datastores/myStore/prefixes?operation=set&prefix-name=pn:&prefix-iri=http%3A%2F%2Fexample.com%2F HTTP/1.1
Host: localhost
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Changed
true
Java API
Prefixes prefixes = ...;
boolean changed = dsConn.setPrefix("pn:", "http://example.com/");
16.5.4. Unsetting One Prefix
One prefix of the data store can be unset using a PATCH request with the operation=unset request parameter. The prefix name is specified using the prefix-name request parameter. The response is written as an output of a SPARQL query that binds the variable ?Changed to the Boolean value reflecting whether the set of data store prefixes was changed or not.
The following request unsets the prefix name pn:.
Request
PATCH /datastores/myStore/prefixes?operation=unset&prefix-name=pn: HTTP/1.1
Host: localhost
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Changed
true
Java API
Prefixes prefixes = ...;
boolean changed = dsConn.unsetPrefix("pn:");
16.6. Managing Data Store Content
The facts, rules, and axioms are collectively called the content of the data store, and they can be managed using the /content key. All modification is transactional; that is, a transaction is started before the call and it is committed (if modification is successful) or rolled back (if there is an error) before the call returns. All reasoning (if any is needed) is performed before the transaction is committed. The /content key implements the SPARQL 1.1 Graph Store HTTP Protocol. However, since this protocol does not support incremental deletion, the /content key also supports a proprietary extension for incremental updates.
All of these functions are applied to a data store as a whole; that is, the default and graph=uri request parameters of the SPARQL 1.1 Graph Store HTTP Protocol are ignored. A subset of the data store (e.g., just one named graph) can be retrieved using CONSTRUCT queries, and arbitrary modifications can be implemented using DELETE/INSERT updates. The /content key also implements a proprietary protocol extension, which can be used to receive errors and/or warnings while the content is being parsed, as well as summary information about the size of the import.
The formats that RDFox supports for encoding triples and/or rules are described in Section 8.1 and are identified using MIME types. In RESTful API calls that retrieve data store content, the format determines which part of the content is being retrieved or updated. For example, a request to output data store content using the Turtle format (MIME type text/turtle) retrieves all triples from the default graph, whereas a request to output the content using the datalog format (MIME type application/x.datalog) retrieves all rules and no triples. As another example, an incremental addition request that uses the Turtle format will update the triples in the default graph.
RDFox can usually detect the format of input data, so the Content-Type specification in update requests can generally be omitted. However, if the Content-Type header is present, it must match the type of the content or the update is rejected.
16.6.1. Retrieving Data Store Content
The following request retrieves the content of the data store. The media type specified using the Accept header determines which subset of the store is retrieved. Depending on the format, the mechanism described in Section 16.2.2 can be used to supply request parameters that customize the exported data.
When retrieving facts in the application/n-triples, text/turtle, application/n-quads, or application/trig formats, the only supported parameter is fact-domain, and its value is the fact domain (Section 6.2) that determines which facts are exported. The default fact domain is explicit.
When retrieving rules in the application/x.datalog format, the only supported parameter is rule-domain, and its value is the rule domain (Section 4.1.4) that determines which rules are exported. The default rule domain is user.
Request
GET /datastores/myStore/content?fact-domain=all HTTP/1.1
Host: localhost
Accept: text/turtle; charset=UTF-8
Response
HTTP/1.1 200 OK
[The content of the store formatted according to the Turtle 1.1 standard]
Java API
OutputStream output = ...;
Map<String, String> parameters = ...;
dsConn.exportData(output, "text/turtle", parameters);
16.6.2. Incrementally Adding Data Store Content
The PATCH request can be used to incrementally add content to a data store. The query parameter operation should be set to add-content. The type of content added is determined in the following way:
If the Content-Type header is absent, then the type of content is inferred automatically from the supplied content.
If the Content-Type header is present, then the supplied request body must be of that type, or the request is rejected.
If the Content-Type header has value text/uri-list, then the body of the request is interpreted as a newline-delimited list of IRIs specifying the location of the content to be added. RDFox will add the content by dereferencing the listed IRIs. At least one IRI must be present; moreover, if more than one IRI is specified, all IRIs are imported in parallel. If the proprietary header Imported-Content-Type is present, then the content of all IRIs in the list must be of that type, or the request is rejected. Otherwise, the type for each IRI will be inferred automatically.
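As an illustration of the text/uri-list case, the following sketch builds a request body from a list of locations and enforces the rule that at least one IRI must be present. The class UriListBody is a hypothetical helper, not an RDFox class, and the example.com locations in the usage note are placeholders.

```java
import java.net.URI;
import java.util.List;
import java.util.StringJoiner;

final class UriListBody {
    // Value to send in the Content-Type header of the PATCH request.
    static final String CONTENT_TYPE = "text/uri-list";

    // Produces a newline-delimited list of IRIs, one per line.
    static String of(List<URI> locations) {
        if (locations.isEmpty())
            throw new IllegalArgumentException("At least one IRI must be present.");
        StringJoiner body = new StringJoiner("\n");
        for (URI location : locations)
            body.add(location.toASCIIString());
        return body.toString();
    }
}
```

For example, passing the two locations https://example.com/a.ttl and https://example.com/b.ttl produces a two-line body that can be sent with the Content-Type header set to UriListBody.CONTENT_TYPE.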
Query parameter default-graph
can be used to specify the name of the
default graph. That is, if this parameter is specified, then triples that would
normally be imported into the default graph will instead be imported into the
graph with the specified name.
RDFox will provide information about this operation as follows.
If the Accept header identifies a SPARQL answer format, then the response body is structured as an answer to a SPARQL query with variables ?Type, ?Line, ?Column, ?Description, and ?Value. For each error or warning, an answer is emitted where the value of ?Type identifies the notification type (e.g., "error" or "warning", but other notification types are possible too), the values of ?Line and ?Column may identify the place in the input where the error was detected, and the value of ?Description describes the error or warning. Moreover, the following answers summarize information about the importation:
For each prefix definition encountered during importation, one answer is emitted where the value of ?Type is "prefix", the value of ?Description is the prefix name (which ends with :), and the value of ?Value is the prefix URI. This allows the client to retrieve the prefixes from the submitted input.
An answer with ?Type equal to "information", ?Description equal to "#aborted", and ?Value a Boolean value specifies whether the import was aborted prematurely.
Answers with ?Type equal to "information", ?Description equal to "#errors" and "#warnings", and ?Value integers specify the number of errors and warnings, respectively, encountered during import.
Answers with ?Type equal to "information", ?Description equal to "#processed-facts" and "#changed-facts", and ?Value integers specify the number of facts processed in the input and facts actually added to or deleted from the data store, respectively.
Answers with ?Type equal to "information", ?Description equal to "#processed-rules" and "#changed-rules", and ?Value integers specify the number of rules processed in the input and rules actually added to or deleted from the data store, respectively.
Answers with ?Type equal to "information", ?Description equal to "#processed-axioms" and "#changed-axioms", and ?Value integers specify the number of axioms processed in the input and axioms actually added to or deleted from the data store, respectively.
If the Accept header is either absent or has value text/plain, then the Content-Type header of the response is set to text/plain, and the response body contains a human-readable description of the same information as in the previous case.
RDFox also uses a proprietary header Notify-Immediately
to determine how to
return information about the operation to the client, which also determines the
status codes used.
If the request does not include the Notify-Immediately header, then the entire request is processed before the response is returned to the client. The response will indicate success or failure by using one of the following status codes (which are compatible with the SPARQL 1.1 Graph Store HTTP Protocol):
400 Bad Request indicates that at least one error has been encountered,
204 No Content indicates that no additional information is provided so the response body is empty, and
200 OK indicates that no errors have been encountered, but the response body contains additional information (which can be information about warnings, or summary information in the extended format).
If the request includes the Notify-Immediately: true header, then notifications about errors and warnings are sent to the client as soon as they are available, possibly even before the client has finished sending the request body, thus allowing the client to take appropriate action early on. For example, a client may decide to stop sending the rest of the request body after receiving an error. This option increases the flexibility of the RESTful API, but at the expense of added complexity.
The client must keep reading the notifications while it is still sending the request body. In particular, the notifications produced and sent eagerly by RDFox can fill the TCP/IP buffers on the sender and receiver side, in which case RDFox will wait for the client to read the notifications and thus free the buffers. But then, if the client is not reading the notifications, a deadlock will occur where the client is waiting for RDFox to process the request content, and RDFox is waiting for the client to read the notifications.
If a warning is generated before an error, RDFox must start producing the response without knowing whether the entire operation will succeed (i.e., errors can be generated later during the process). In such situations, RDFox uses the 202 Accepted status code in the response to indicate that the status of the operation is not yet known. In such situations, the operation succeeds if and only if the response body contains no errors.
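A client can map the status codes described above to an outcome as in the following minimal sketch. The class ImportOutcome and its enumeration are hypothetical names introduced for illustration; any status code not discussed in this section is reported as UNSPECIFIED rather than guessed at.

```java
final class ImportOutcome {
    enum Kind { SUCCEEDED, FAILED, INSPECT_BODY, UNSPECIFIED }

    // Classifies the status code of a response to a content update request.
    static Kind fromStatus(int statusCode) {
        switch (statusCode) {
            case 200: // no errors; the body holds warnings or summary information
            case 204: // no errors; the body is empty
                return Kind.SUCCEEDED;
            case 400: // at least one error was encountered
                return Kind.FAILED;
            case 202: // status not yet known: succeeds only if the body lists no errors
                return Kind.INSPECT_BODY;
            default:  // not covered by the rules in this section
                return Kind.UNSPECIFIED;
        }
    }
}
```

When the result is INSPECT_BODY, the client has to read the whole response body and check it for error notifications before treating the operation as successful.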
The following is an example of a successful request that follows the SPARQL 1.1 Graph Store HTTP Protocol.
Request
PATCH /datastores/myStore/content?operation=add-content HTTP/1.1
Host: localhost
[The facts/rules to be added in a format supported by RDFox]
Response
HTTP/1.1 200 OK
prefix: pref: = http://www.test.com/test#
information: #aborted = false
information: #errors = 0
information: #warnings = 0
information: #processed-facts = 9
information: #changed-facts = 8
information: #processed-rules = 0
information: #changed-rules = 0
information: #processed-axioms = 0
information: #changed-axioms = 0
The following is an example of an unsuccessful request where errors are returned in text format.
Request
PATCH /datastores/myStore/content?operation=add-content HTTP/1.1
Host: localhost
a b c .
Response
HTTP/1.1 400 Bad Request
Content-Type: text/plain; charset=UTF-8
Transfer-Encoding: chunked
XX
error: line 1: column 3: Resource expected.
information: #aborted = false
information: #errors = 1
information: #warnings = 0
information: #processed-facts = 0
information: #changed-facts = 0
information: #processed-rules = 0
information: #changed-rules = 0
information: #processed-axioms = 0
information: #changed-axioms = 0
0
The following is an example of a request where errors are returned in a SPARQL answer format.
Request
PATCH /datastores/myStore/content?operation=add-content HTTP/1.1
Content-Type: text/csv
@prefix pref: <http://www.test.com/test#> .
pref:a pref:b pref:c .
a b c .
Response
HTTP/1.1 400 Bad Request
Content-Type: text/csv; charset=UTF-8
Transfer-Encoding: chunked
XX
Type,Line,Column,Description,Value
error,3,3,Resource expected.,
prefix,,,pref:,http://www.test.com/test#
information,,,#aborted,false
information,,,#errors,1
information,,,#warnings,0
information,,,#processed-facts,1
information,,,#changed-facts,1
information,,,#processed-rules,0
information,,,#changed-rules,0
information,,,#processed-axioms,0
information,,,#changed-axioms,0
0
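A response body in the text/csv format shown above can be turned into a structured report as in the following sketch. The class ImportReport is a hypothetical helper, not part of RDFox; it assumes that the chunked transfer encoding has already been decoded and that no field spans several lines. A production client would probably use a full CSV parser instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ImportReport {
    final List<String> errors = new ArrayList<>();
    final List<String> warnings = new ArrayList<>();
    final Map<String, String> prefixes = new LinkedHashMap<>();
    final Map<String, String> information = new LinkedHashMap<>();

    // The operation succeeded if and only if no error was reported.
    boolean succeeded() {
        return errors.isEmpty();
    }

    // Parses rows with the columns Type,Line,Column,Description,Value.
    static ImportReport parse(String csv) {
        ImportReport report = new ImportReport();
        String[] lines = csv.split("\r?\n");
        for (int index = 1; index < lines.length; ++index) { // index 0 is the header
            List<String> fields = splitCsvLine(lines[index]);
            if (fields.size() < 5)
                continue;
            String description = fields.get(3);
            String located = "line " + fields.get(1) + ", column " + fields.get(2) + ": " + description;
            switch (fields.get(0)) {
                case "error": report.errors.add(located); break;
                case "warning": report.warnings.add(located); break;
                case "prefix": report.prefixes.put(description, fields.get(4)); break;
                case "information": report.information.put(description, fields.get(4)); break;
                default: break; // other notification types are ignored in this sketch
            }
        }
        return report;
    }

    // Splits one CSV line, honoring double-quoted fields and doubled quotes.
    private static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean quoted = false;
        for (int i = 0; i < line.length(); ++i) {
            char c = line.charAt(i);
            if (quoted) {
                if (c != '"')
                    current.append(c);
                else if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"');
                    ++i;
                } else
                    quoted = false;
            } else if (c == '"')
                quoted = true;
            else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else
                current.append(c);
        }
        fields.add(current.toString());
        return fields;
    }
}
```

For the response above, succeeded() returns false, and information.get("#changed-facts") returns the string "1".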
In the Java API, notifications are received by passing an instance implementing
the ImportNotificationMonitor
interface.
Java API
InputStream input = ...;
ImportNotificationMonitor importNotificationMonitor = ...;
ImportResult result = dsConn.importData(UpdateType.ADDITION, input, "", importNotificationMonitor);
16.6.3. Adding Content and Updating Prefixes¶
When incrementally adding content to a data store, it is possible to instruct
RDFox to set the data store prefixes to the final set of prefixes after import.
The request for doing so is the same as in Section 16.6.2, with the
difference that the operation
request parameter needs to be set to
add-content-update-prefixes
. This can be useful because the prefixes
included in the content then do not need to be explicitly provided in further
API calls to RDFox. In the Java API, this option is specified using the
ADDITION_UPDATE_PREFIXES
update type.
16.6.4. Incrementally Deleting Data Store Content¶
The following request incrementally deletes content from a data store. The
request and response formats follow the same structure as in the case of
incremental addition; however, the operation query parameter should be set to
delete-content. The default-graph query parameter can be used to
specify the name of the default graph in the same way as in incremental
addition.
Request
PATCH /datastores/myStore/content?operation=delete-content HTTP/1.1
Host: localhost
[The facts/rules to be deleted in a format supported by RDFox]
Response
HTTP/1.1 204 No Content
Java API
InputStream input = ...;
ImportNotificationMonitor importNotificationMonitor = ...;
ImportResult result = dsConn.importData(UpdateType.DELETION, input, importNotificationMonitor);
16.6.5. Deleting All Data Store Content¶
The following request clears all data store content — that is, it removes all triples (in the default graph and all named graphs), all facts in all tuple tables, and all rules.
Request
DELETE /datastores/myStore/content HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
dsConn.clear();
16.6.6. Replacing All Data Store Content¶
The following request clears all data store content — that is, it removes all
triples (in the default graph and all named graphs), all facts in all tuple
tables, and all rules — and then adds the specified content to the data store.
The request and response formats follow the same structure as in the case of
incremental addition. Query parameter default-graph
can be used to
specify the name of the default graph in the same way as in incremental
addition.
Request
PUT /datastores/myStore/content HTTP/1.1
Host: localhost
[The facts/rules in a format supported by RDFox]
Response
HTTP/1.1 204 No Content
The Java API does not have a separate ‘replace content’ primitive.
16.6.7. Adding/Deleting OWL Axioms From Triples¶
As explained in Section 10.6, RDFox can be instructed to analyze the triples of one named graph, parse them into a set of OWL axioms, and add these axioms to another named graph. An analogous operation can be used to remove the axioms from a named graph. The named graph being analyzed and the named graph to which the axioms are added may, but need not, be the same.
In the RESTful API, the operation is invoked using the PATCH
verb. The
source and destination graphs are specified using the source-graph
and
destination-graph
query parameters, respectively. If either of the two
parameters is omitted, the default graph is used as a default. The
operation
query parameter can be set to add-axioms
or
delete-axioms
. Finally, the assertions
query parameter can be set to
true
, in which case ABox assertions are extracted as well, or to false
,
in which case only the TBox (i.e., schema) axioms are extracted. For example,
the following request imports the axioms from triples in the named graph called
SG and stores the axioms into the named graph called DG.
Request
PATCH /datastores/myStore/content?operation=add-axioms&source-graph=SG&destination-graph=DG HTTP/1.1
Host: localhost
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Transfer-Encoding: chunked
XX
information,,,#processed-axioms,2
information,,,#changed-axioms,2
0
Java API
dsConn.importAxiomsFromTriples("SG", false, "DG", UpdateType.ADDITION);
16.7. Managing Data Sources¶
RDFox can access external data stored in different kinds of data sources. Currently, a data source can be a CSV/TSV file, a PostgreSQL database, an ODBC database, or an Apache Solr index. For an overview of how RDFox manages data sources, see Section 7.
None of the modification functions described in this section is transactional: they are applied immediately, and in fact their invocation fails if the connection has an active transaction. Consequently, there is no way to roll back the effects of these functions.
16.7.1. Listing the Registered Data Sources¶
The following request retrieves the list of data sources registered with a data
store. The response is written as an output of a SPARQL query that binds
variable ?Name
to the name of the data source, variable ?Type
to the
data source type, variable ?Parameters
to a string describing the data
source parameters (with all key-value pairs concatenated as in a query string),
and variable ?NumberOfTables
to the number of tables in the data source.
Request
GET /datastores/myStore/datasources HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Name,Type,Parameters,NumberOfTables
F1,PostgreSQL,connection-string=postgresql://user:pw@localhost:5432/DB,2
DBpedia,DelimitedFile,"file=/table.csv&delimiter=,",1
[...]
Java API
List<DataSourceInfo> dataSourceInfos = dsConn.listDataSources();
16.7.2. Registering a Data Source¶
The following request registers a new data source. The data source name is encoded in the URI. Various data source parameters can be provided using the mechanism described in Section 16.2.2.
Request
POST /datastores/myStore/datasources/mySource?type=PostgreSQL&key1=val1&key2=val2 HTTP/1.1
Host: localhost
Response
HTTP/1.1 201 Created
Location: /datastores/myStore/datasources/mySource
Java API
Map<String, String> parameters = new HashMap<String, String>();
parameters.put("key1", "val1");
parameters.put("key2", "val2");
dsConn.registerDataSource("mySource", "PostgreSQL", parameters);
16.7.3. Deregistering a Data Source¶
The following request deregisters a data source. The request succeeds only if no tuple tables are mounted on the data source. Thus, to deregister a data source, one must first delete all rules mentioning any tuple tables of the data source, and then delete all tuple tables mounted from the data source.
Request
DELETE /datastores/myStore/datasources/mySource HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
dsConn.deregisterDataSource("mySource");
16.7.4. Retrieving Information About a Data Source¶
The following request retrieves information about a data source. The response
is written as an output of a SPARQL query that binds variables ?Property
and ?Value. The exact properties and values supported depend on the
data source. The names of all parameters specified at the time the data source
was registered are prefixed with parameters.
Request
GET /datastores/myStore/datasources/mySource HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value
name,mySource
type,PostgreSQL
tables,3
... etc ...
The DataSourceInfo
class encapsulates information about a data source in
the Java API. Instances of this class are immutable.
Java API
DataSourceInfo dataSourceInfo = dsConn.describeDataSource("mySource");
16.7.5. Listing the Data Source Tables of a Data Source¶
The following request retrieves the list of data source tables of a data
source. The response is written as an output of a SPARQL query that binds
variable ?Name
to the name of a data source table, variable
?NumberOfColumns
to the number of columns in the table, and variable
?Columns
to a percent-encoded string describing the table columns using the
form name1=dt1&name2=dt2&...
where namei
is the column name, and
dti
is the column datatype.
Request
GET /datastores/myStore/datasources/mySource/tables HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Name,NumberOfColumns,Columns
drivers,2,id=integer&name=string
constructors,3,key=integer&name=string&address=string
Java API
List<DataSourceTableInfo> dataSourceTableInfos = dsConn.listDataSourceTables("mySource");
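The value bound to ?Columns can be decoded on the client as in the following sketch, which splits the string on & and = and then percent-decodes each part. The class ColumnsDescriptor is a hypothetical helper and not part of the RDFox Java API.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class ColumnsDescriptor {
    // Parses name1=dt1&name2=dt2&... into a map from column name to datatype.
    // A LinkedHashMap keeps the columns in the order in which they are listed.
    static Map<String, String> parse(String columns) {
        Map<String, String> result = new LinkedHashMap<>();
        if (columns.isEmpty())
            return result;
        for (String pair : columns.split("&")) {
            int separator = pair.indexOf('=');
            String name = separator < 0 ? pair : pair.substring(0, separator);
            String datatype = separator < 0 ? "" : pair.substring(separator + 1);
            // Decoding happens after splitting, so encoded '&' and '=' stay intact.
            result.put(decode(name), decode(datatype));
        }
        return result;
    }

    private static String decode(String text) {
        return URLDecoder.decode(text, StandardCharsets.UTF_8);
    }
}
```

Applied to the value key=integer&name=string&address=string from the response above, the sketch returns a map with the keys key, name, and address, in that order.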
16.7.6. Retrieving Information About a Data Source Table¶
The following request retrieves information about a data source table. The
response is written as an output of a SPARQL query that binds variable
?Column
to the integer referencing a column of a data source, variable
?Name
to the column name, and variable ?Datatype
to the name of the
RDFox datatype that best corresponds to the datatype of the column in the
data source.
Request
GET /datastores/myStore/datasources/mySource/tables/drivers HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Column,Name,Datatype
1,id,http://www.w3.org/2001/XMLSchema#integer
2,first_name,http://www.w3.org/2001/XMLSchema#string
3,last_name,http://www.w3.org/2001/XMLSchema#string
... etc ...
The DataSourceTableInfo
class encapsulates information about a data source
table in the Java API. Instances of this class are immutable.
Java API
DataSourceTableInfo dataSourceTableInfo = dsConn.describeDataSourceTable("mySource", "drivers");
16.7.7. Sampling a Data Source Table¶
The following request retrieves a sample of data from a data source table. The
response is written as an output of a SPARQL query that binds the variables
corresponding to column names to the values in the columns. The limit=n
request parameter determines how many rows are to be returned. RDFox
supports a configurable, system-wide maximum limit on the number of returned
rows, which can be used to avoid accidentally requesting large portions of a
data source. The main purpose of this API is not to provide access to the data,
but only to provide a sample of the data so that clients can see roughly what the
source contains and then mount the corresponding tuple table.
Request
GET /datastores/myStore/datasources/mySource/tables/drivers/data?limit=20 HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
id,first_name,last_name
1,Ayrton,Senna
2,Michael,Schumacher
... etc ...
Data from data source tables is returned using cursors in the Java API. These cursors are always full — that is, all relevant data is retrieved before the call finishes. The result is unaffected by the transaction that may be associated with the connection: RDFox does not support transactions over data sources.
Java API
Cursor data = dsConn.getDataSourceTableData("mySource", "drivers", 20);
16.8. Managing Tuple Tables¶
Both types of tuple tables are managed using the same API, which is described in this section. None of the modification functions described in this section is transactional: they are applied immediately, and in fact their invocation fails if the connection has an active transaction. Consequently, there is no way to roll back the effects of these functions.
16.8.1. Listing the Available Tuple Tables¶
The following request retrieves the list of tuple tables currently available in
a data store. The response is written as an output of a SPARQL query that binds
variable ?Name
to the name of the tuple table, variable ?Type
to a
string reflecting the type of the tuple table, variable ?ID
to a unique
integer ID of the tuple table, and variables ?MinArity and ?MaxArity
to the minimum and maximum numbers of arguments of atoms that refer to the
tuple table, respectively.
Request
GET /datastores/myStore/tupletables HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Name,Type,ID,MinArity,MaxArity
DefaultTriples,memory,1,3,3
[...]
Java API
List<TupleTableInfo> tupleTableInfos = dsConn.listTupleTables();
16.8.2. Creating a Tuple Table¶
The following request creates a new tuple table, which can be either an in-memory tuple table or a tuple table backed by a data source. Creating a tuple table requires specifying the table name as part of the URI. Various tuple table parameters can be provided using the mechanism described in Section 16.2.2. For more details see Section 6.
Request
POST /datastores/myStore/tupletables/myTable?key1=val1&key2=val2 HTTP/1.1
Host: localhost
Response
HTTP/1.1 201 Created
Location: /datastores/myStore/tupletables/myTable
Java API
Map<String, String> parameters = new HashMap<String, String>();
parameters.put("key1", "val1");
parameters.put("key2", "val2");
dsConn.createTupleTable("myTable", parameters);
16.8.3. Deleting a Tuple Table¶
The following request deletes a tuple table, which can be either an in-memory tuple table or a tuple table backed by a data source. The request succeeds only if the tuple table is not used in a rule currently loaded in the data store.
Request
DELETE /datastores/myStore/tupletables/myTable HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
dsConn.deleteTupleTable("myTable");
16.8.4. Retrieving Information About a Tuple Table¶
The following request retrieves information about a tuple table. The response
is written as an output of a SPARQL query that binds variables ?Property
and ?Value
. The exact properties and values are determined by the tuple
table type.
Request
GET /datastores/myStore/tupletables/myTable HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value
name,DefaultTriples
type,memory
ID,1
min-arity,3
max-arity,3
... etc ...
The TupleTableInfo
class encapsulates information about a tuple table in
the Java API. Instances of this class are immutable.
Java API
TupleTableInfo tupleTableInfo = dsConn.describeTupleTable("myTable");
16.9. Managing Statistics¶
Like most databases, RDFox relies on various statistics about the data it contains. These are mainly used for query planning: when determining how to efficiently evaluate a query, RDFox consults information gathered from the data in a data store in order to estimate which query evaluation plan is more likely to be efficient. These statistics can be managed explicitly through the core and REST APIs. Configuring the available statistics is largely of interest to system administrators. Moreover, after large updates (e.g., after a large amount of data is added to the system), it is advisable to update the statistics, that is, to request RDFox to recompute all summaries from the data currently available in the system.
16.9.1. Listing the Available Statistics¶
The following request retrieves the list of statistics currently available in a
data store. The response is written as an output of a SPARQL query that binds
variable ?Name
to the name of the statistics, and variable ?Parameters
to a string describing the statistics parameters (with all key-value pairs
concatenated as in a query string).
Request
GET /datastores/myStore/stats HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Name,Parameters
column-counts,
[...]
Java API
List<StatisticsInfo> statisticsInfos = dsConn.listStatistics();
16.9.2. Creating Statistics¶
The following request creates new statistics. The mechanism described in
Section 16.2.2 can be used to supply request parameters
that customize the newly created object. The location of the new statistics is
returned in the Location
header.
Request
POST /datastores/myStore/stats/column-counts HTTP/1.1
Host: localhost
Response
HTTP/1.1 201 Created
Location: /datastores/myStore/stats/column-counts
Java API
Map<String, String> parameters = new HashMap<String, String>();
dsConn.createStatistics("column-counts", parameters);
16.9.3. Deleting Statistics¶
The following request deletes the statistics with the given name.
Request
DELETE /datastores/myStore/stats/column-counts HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
dsConn.deleteStatistics("column-counts");
16.9.4. Retrieving Information About Statistics¶
The following request retrieves information about statistics. The response is
written as an output of a SPARQL query that binds variables ?Property
and
?Value
. The exact properties and values are determined by the statistics.
Request
GET /datastores/myStore/stats/column-counts HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value
name,column-counts
The StatisticsInfo class encapsulates information about the statistics in
the Java API. Instances of this class are immutable.
Java API
StatisticsInfo statisticsInfo = dsConn.describeStatistics("column-counts");
16.9.5. Updating Statistics¶
The following request updates all statistics currently present in the data store.
Request
PUT /datastores/myStore/stats HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
StatisticsInfo statisticsInfo = dsConn.updateStatistics();
The following request updates only the statistics with the given name.
Request
PUT /datastores/myStore/stats/column-counts HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
StatisticsInfo statisticsInfo = dsConn.updateStatistics("column-counts");
16.10. Evaluating Queries¶
The /sparql
key exposes a SPARQL 1.1 endpoint implemented exactly as in the
specification. Both GET
and POST
request methods are supported.
Moreover, SELECT/ASK
and CONSTRUCT
query operations, and
DELETE/INSERT
update operations, are supported. Query evaluation in RDFox
can be influenced using a number of parameters, which can be specified using
the mechanism described in Section 16.2.2. The query
result is encoded according to the required format, and a request fails if the
format does not match the query type (e.g., if a request specifies a SELECT
query and the Turtle answer format).
The base
query parameter can be used to specify an IRI that will be used as
the default base IRI when processing the query.
In order to prevent the RDFox endpoint from hanging for a long time, query
evaluation requests can be subjected to time limits. Endpoint configuration
options query-time-limit
and allow-query-time-limit-override
and
RDFox-proprietary HTTP header Query-Time-Limit
can be used to configure
these limits, and they are described in Section 19.2.
The following is an example of a query request.
Request
GET /datastores/myStore/sparql?query=SELECT+%3FX+%3FY+%3FZ+WHERE+{+%3FX+%3FY+%3FZ+} HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
X,Y,Z
[...result of the query...]
Java API
Map<String, String> compilationParameters = new HashMap<String, String>();
// SPARQL evaluation supports a bunch of parameters that govern how
// SPARQL statements are compiled.
compilationParameters.put(..., ...);
// SPARQL evaluation can return either a set of results (in the case of
// SELECT/ASK/CONSTRUCT queries) or nothing (in the case of UPDATEs).
// It seems useful to have an API that can evaluate any SPARQL string,
// regardless of whether it contains a query or update. Therefore, the
// following function requires two parameters:
//
// * an output stream to which answers are written (if there are any),
// * the name of a SPARQL answer format.
//
OutputStream output = ...;
dsConn.evaluateStatement("SELECT ?S ?P ?O WHERE { ?S ?P ?O }", compilationParameters, output, "text/csv");
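When a query is sent in a GET request as in the REST example of this section, the query text has to be encoded before it is placed into the query parameter. The following minimal sketch does this with the Java standard library; the class SparqlRequestTarget is a hypothetical helper, and the data store name is assumed to need no escaping.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class SparqlRequestTarget {
    // Builds the request target for a GET request to the /sparql key.
    // Assumption: the data store name contains no characters that need escaping.
    static String forQuery(String dataStoreName, String query) {
        // Form encoding turns spaces into '+' and reserved characters
        // such as '?', '{', and '}' into percent escapes.
        String encodedQuery = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return "/datastores/" + dataStoreName + "/sparql?query=" + encodedQuery;
    }
}
```

Note that URLEncoder also escapes the braces as %7B and %7D, whereas the request shown earlier writes them literally.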
SPARQL supports pagination of query results using OFFSET
and LIMIT
query clauses; however, evaluating the same query while varying its
OFFSET/LIMIT
clauses may be inefficient because the query in each request
is evaluated from scratch.
In the RESTful API, including the offset=m;limit=n parameters into a query
request has the same effect as adding the OFFSET m LIMIT n clauses to the
query. However, doing the former can be more efficient when
a user makes a query request with offset=m1;limit=n1,
the same user makes another request for exactly the same query (i.e., a query that is character-for-character identical to the previous one) with offset=m2;limit=n2 where m2 = m1 + n1, and
the data store has not been updated between these two requests.
RDFox provides no hard efficiency guarantees, but will try to process
requests containing offset=m;limit=n
as efficiently as possible.
Therefore, applications should use this approach to result pagination
whenever possible. The endpoint.object-keep-alive-time
option specifies
the rough amount of time between two such requests for the same query
during which RDFox will aim to speed up query evaluation.
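Under the standard SPARQL semantics, OFFSET m skips the first m answers, so the page that directly follows a page with offset m and limit n starts at offset m + n. The following sketch computes such consecutive pages; the class PageWindow is a hypothetical client-side helper and not part of RDFox.

```java
final class PageWindow {
    final long offset;
    final long limit;

    PageWindow(long offset, long limit) {
        if (offset < 0 || limit <= 0)
            throw new IllegalArgumentException("offset must be >= 0 and limit must be > 0");
        this.offset = offset;
        this.limit = limit;
    }

    // Returns the window that starts right after the last answer of this one.
    PageWindow next() {
        return new PageWindow(offset + limit, limit);
    }
}
```

A client would send the same query text in each request and take the offset and limit values from successive windows, starting with new PageWindow(0, n).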
SPARQL queries can be long in some applications, so sending the same query multiple times can be a considerable source of overhead. In such cases, applications can consider using cursors (see Section 16.12), where a query is submitted for execution just once.
16.11. Working with Transactions¶
16.11.1. Transactions in the Java API¶
In the Java API, each transaction is associated with one data store connection.
The DataStoreConnection
class provides beginTransaction()
,
commitTransaction()
, and rollbackTransaction()
functions, which
respectively start, commit, and roll back a transaction.
If no transaction is associated with a connection, then data store modification functions and query evaluation functions start a transaction that is committed or rolled back before the function finishes. In contrast, if a transaction is started on a connection when a modification/query function is called, then the operation is evaluated within the context of that transaction.
A transaction remains open in the Java API as long as it is not explicitly committed or rolled back. Closing a connection with a running transaction will roll back the transaction first.
Data store connections are single-threaded objects: attempting to use the same object in parallel from multiple threads will result in unpredictable behavior and is likely to crash the system. (However, the same data store connection object can be used from different threads at distinct time points — that is, there is no affinity between connection objects and threads.) In order to access RDFox concurrently, one should use distinct connections, each running a separate transaction.
16.11.2. Transactions in the RESTful API¶
The RESTful API follows the same principles and associates transactions with
data store connections. To use transactions in the RESTful API, one must
explicitly create a connection (see
Section 16.2.5). To start, commit, or roll back a
transaction, one can issue a PATCH
request to the
/datastores/<DSTRNAME>/connections/<DSCONN>/transaction
key with the
operation
request parameter set to begin-read-only-transaction
,
begin-read-write-transaction
, commit-transaction
, or
rollback-transaction
. After this, any operation evaluated on this
connection (which can be achieved by including the connection
request
parameter) will be evaluated inside the transaction associated with the
connection.
For example, the following sequence of requests creates a connection to the
myStore
data store, starts a read/write transaction on the new connection,
imports data twice, and commits the transaction. Please note that, although the
transaction has been committed, the connection persists after the last request.
Request
POST /datastores/myStore/connections HTTP/1.1
Host: localhost
Response
HTTP/1.1 201 Created
Location: /datastores/myStore/connections/01234567890123456789
Request
PATCH /datastores/myStore/connections/01234567890123456789/transaction?operation=begin-read-write-transaction HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Request
POST /datastores/myStore/content?connection=01234567890123456789 HTTP/1.1
Host: localhost
[First batch of facts/rules]
Response
HTTP/1.1 200 OK
[Response body as usual]
Request
POST /datastores/myStore/content?connection=01234567890123456789 HTTP/1.1
Host: localhost
[Second batch of facts/rules]
Response
HTTP/1.1 200 OK
[Response body as usual]
Request
PATCH /datastores/myStore/connections/01234567890123456789/transaction?operation=commit-transaction HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
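The transaction requests in this sequence can be prepared with the java.net.http client that ships with Java 11 and later, as in the following sketch. The class TransactionRequests is a hypothetical helper rather than part of RDFox, the endpoint address is an assumption, and the sketch only builds the requests without sending them.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class TransactionRequests {
    // Extracts the connection identifier from a Location header value such as
    // /datastores/myStore/connections/01234567890123456789.
    static String connectionId(String location) {
        return location.substring(location.lastIndexOf('/') + 1);
    }

    // Builds a bodiless PATCH request to the transaction key of a connection.
    // The operation is one of the values listed above, for example
    // begin-read-write-transaction or commit-transaction.
    static HttpRequest transactionOperation(URI endpoint, String dataStoreName,
                                            String connectionId, String operation) {
        URI target = endpoint.resolve("/datastores/" + dataStoreName
            + "/connections/" + connectionId
            + "/transaction?operation=" + operation);
        return HttpRequest.newBuilder(target)
            .method("PATCH", HttpRequest.BodyPublishers.noBody())
            .build();
    }
}
```

The resulting request can be passed to HttpClient.send, and subsequent content requests join the transaction by including the same identifier in the connection request parameter.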
At any point, one can see whether a connection is associated with a transaction as follows. The response is written as the output of a SPARQL query that binds the variable ?Property to the property name, variable ?Value to the property value, and variable ?Mutable to true if the value of the property can be changed and to false otherwise.
Request
GET /datastores/myStore/connections/01234567890123456789/transaction HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value,Mutable
transaction-state,read-write,false
transaction-requires-rollback,false,false
last-transaction-data-store-version,5,false
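A client might turn such a CSV response into a lookup table before inspecting the transaction state. The following is a small sketch under the assumption that values are simple unquoted strings, as in the sample response; the class name PropertyListing is hypothetical, and a real client should use a proper CSV parser if values can contain commas or quotes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PropertyListing {
    // Parses a "Property,Value,Mutable" CSV body into a property-to-value map.
    // Assumption: no quoted fields and no embedded commas or line breaks.
    static Map<String, String> parse(String csvBody) {
        Map<String, String> properties = new LinkedHashMap<>();
        String[] lines = csvBody.split("\\R");
        // Start at 1 to skip the header row.
        for (int i = 1; i < lines.length; i++) {
            String[] columns = lines[i].split(",", -1);
            if (columns.length >= 2) {
                properties.put(columns[0], columns[1]);
            }
        }
        return properties;
    }
}
```

With the sample response above, looking up transaction-state in the resulting map yields read-write.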
16.12. Cursors¶
As already mentioned in Section 16.10, RDFox supports efficient APIs for paginating query results using cursors, which provide a view into the results of a query evaluated on a ‘frozen’ snapshot of data. The concept of cursors is used in slightly different ways in the Java and the RESTful APIs, so this section discusses first the former and then the latter.
16.12.1. Cursors in the Java API¶
The Java API uses cursors to provide access to answers to queries. A cursor goes through the following life cycle.
When a cursor is created, it is in an unopened state.
Before it is used, a cursor must be opened, which positions the cursor on the first answer tuple, or at the answer end if there are no answer tuples. Opening the cursor returns the multiplicity of the current answer, or zero if there are no answers.
Advancing a cursor returns the multiplicity of the next row. Cursors cannot go backwards — all movement is forward.
A cursor can at any point be reopened, in which case the query underlying the cursor is reevaluated afresh. By creating cursors for queries that are evaluated many times, applications can speed up query processing by avoiding the overhead of parsing and compiling the query in each request.
When a cursor is no longer needed, it must be closed so that any resources associated with it can be released. This must be done even when cursors are read to the end. In Java, the Cursor class implements the AutoCloseable interface so that it can be used in a try-with-resources statement.
Rows have multiplicities because SPARQL has bag semantics: if an answer contains the same tuple n times, it can be more efficient to return the tuple once and report that the tuple’s multiplicity is n. The Java API supports cursors for SELECT/ASK and CONSTRUCT queries. A cursor for a CONSTRUCT query behaves as a cursor for a SELECT/ASK query returning variables ?S, ?P, and ?O for each constructed triple.
Each cursor is associated with a data store connection that it is created on. Moreover, all operations on a cursor are evaluated in the context of a connection transaction. For example, if a transaction is running on the connection when a cursor is opened, then opening the cursor is performed within this transaction. Moreover, if no transaction is running on the connection when a cursor is opened, a temporary read-only transaction is started, the cursor is opened, and the transaction is rolled back. A cursor is advanced analogously, possibly starting a temporary transaction each time it is advanced.
The use of temporary transactions opens a potential consistency problem, which is illustrated by the following sequence of actions.
Create a cursor on a connection not associated with a transaction.
Open the cursor (which implicitly creates a temporary transaction for the duration of the operation).
Modify the content on the data store using a different connection. Since the cursor’s connection is not associated with a transaction, modification is possible, and it can affect the results of the query produced by the cursor.
Advance the cursor. At this point, RDFox will detect that the data store has changed since the cursor was opened and, to inform the user of this fact, it will throw StaleCursorException. In this way, RDFox prevents users from possibly overlooking the effects of updates applied to the data store while the cursor is being used. Please note that RDFox will throw StaleCursorException even if the update does not affect the cursor’s result; that is, RDFox’s consistency mechanism is pessimistic.
Please note that StaleCursorException can happen only if the cursor uses temporary transactions in open and advance. In other words, if the cursor is opened and advanced within a single, uninterrupted transaction, then StaleCursorException cannot happen.
Cursors are typically used in the Java API as follows.
Map<String, String> parameters = new HashMap<String, String>();
// Initialize parameters that govern query evaluation.
parameters.put(..., ...);
// Create the cursor.
Cursor crs = dsConn.createCursor("SELECT ?X ?Y ?Z WHERE { ?X ?Y ?Z }", parameters);
// Opening and advancing both return a multiplicity; zero signals the end of the answers.
for (long multiplicity = crs.open(); multiplicity != 0; multiplicity = crs.advance()) {
    // Read the current answer
}
// Close the cursor to release its resources, even though it was read to the end.
crs.close();
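The life cycle and the role of multiplicities can also be illustrated without a running server. The sketch below is not RDFox code: AnswerCursor is a hypothetical stand-in that mirrors only the open/advance/close shape described above, and ListCursor is an in-memory implementation added purely to make the example runnable. It shows how try-with-resources guarantees that the cursor is closed and how multiplicities add up to the total number of answers under bag semantics.

```java
import java.util.List;

// Hypothetical stand-in for a cursor: only the open/advance/close shape described above.
interface AnswerCursor extends AutoCloseable {
    long open();     // multiplicity of the first answer, or 0 if there are no answers
    long advance();  // multiplicity of the next answer, or 0 at the end of the answers
    @Override void close();
}

// In-memory cursor over precomputed multiplicities, used only to make the sketch runnable.
final class ListCursor implements AnswerCursor {
    private final List<Long> multiplicities;
    private int position = -1;
    boolean closed = false;

    ListCursor(List<Long> multiplicities) { this.multiplicities = multiplicities; }

    @Override public long open() { position = 0; return current(); }
    @Override public long advance() { position++; return current(); }
    @Override public void close() { closed = true; }

    private long current() {
        return position < multiplicities.size() ? multiplicities.get(position) : 0;
    }
}

final class CursorDemo {
    // Counts answers under bag semantics: a tuple with multiplicity n counts n times.
    static long countAnswers(AnswerCursor cursor) {
        long total = 0;
        // try-with-resources closes the cursor even if iteration throws.
        try (AnswerCursor c = cursor) {
            for (long multiplicity = c.open(); multiplicity != 0; multiplicity = c.advance()) {
                total += multiplicity;
            }
        }
        return total;
    }
}
```

For a cursor that yields three distinct tuples with multiplicities 2, 1, and 3, countAnswers visits three rows but reports six answers.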
16.12.2. Cursors in the RESTful API¶
The RESTful API supports efficient query result pagination using the offset=m;limit=n request parameters (see Section 16.10). However, this style of result pagination requires resending the same query in each request, which can be inefficient. Moreover, applications relying on the RESTful API might also benefit from precompiling common queries into cursors that are managed explicitly.
To support such use cases, the RESTful API supports explicit cursor management that mimics the Java cursor API. Each cursor is identified by an ID exposed under the /datastores/<DSTRNAME>/connections/<DSCONN>/cursors key; note that this arrangement reflects the fact that each cursor is associated with a specific data store connection. When a data store connection is deleted, all cursors associated with the connection are deleted as well. Each cursor exposed by the RESTful API maintains its position, and there is an API allowing users to query the current cursor position.
16.12.2.1. Listing Available Cursors¶
The following request retrieves the list of cursors available on a data store connection. The response is written as an output of a SPARQL query that binds variable ?CursorID to the cursor ID.
Request
GET /datastores/myStore/connections/01234567890123456789/cursors HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
CursorID
CRS101
CRS102
16.12.2.2. Creating a Cursor¶
A cursor is created by submitting the query to the /cursors key using the POST method of the SPARQL 1.1 Protocol. The location of the new cursor is returned in the Location header.
The base query parameter can be used to specify an IRI that will be used as the default base IRI when processing the query.
Request
POST /datastores/myStore/connections/01234567890123456789/cursors HTTP/1.1
Host: localhost
Content-Type: application/sparql-query
Content-Length: 34
SELECT ?X ?Y ?Z WHERE { ?X ?Y ?Z }
Response
HTTP/1.1 201 Created
Location: /datastores/myStore/connections/01234567890123456789/cursors/CRS101
16.12.2.3. Opening and Advancing a Cursor¶
A PATCH request on the cursor opens or advances the cursor; to distinguish the two, the operation request parameter must be included with value open or advance. Moreover, the request can include the limit=n parameter determining how many rows should be returned; if this parameter is absent, all remaining rows are returned. Parameter limit=0 can be used to specify that no answers should be returned (and so the request just validates the cursor). The request updates the cursor position and so is not idempotent; consequently, the request method is PATCH. In all such cases, the request must specify an Accept header to determine the format of the returned data. Different requests on the same cursor can request different result formats.
Request
PATCH /datastores/myStore/connections/01234567890123456789/cursors/CRS101?operation=open&limit=10 HTTP/1.1
Host: localhost
Accept: text/csv
Response
HTTP/1.1 200 OK
[The first 10 answers to the query in CSV format]
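Paging through a cursor then amounts to one open request followed by as many advance requests as needed. The following sketch only builds these requests using the standard java.net.http classes; the class name CursorPaging and the choice of text/csv as the result format are assumptions made for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class CursorPaging {
    // Builds the PATCH request for one page of answers: the first page opens
    // the cursor, and every later page advances it.
    static HttpRequest pageRequest(String cursorUrl, boolean firstPage, int limit) {
        String operation = firstPage ? "open" : "advance";
        URI uri = URI.create(cursorUrl + "?operation=" + operation + "&limit=" + limit);
        return HttpRequest.newBuilder(uri)
                .header("Accept", "text/csv") // the Accept header selects the result format
                .method("PATCH", HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```

Because each request moves the cursor forward, a client should not blindly resend a page request after a timeout; querying the cursor position first is the safer option.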
16.12.2.4. Retrieving Cursor Information¶
The following request retrieves information about a specific cursor. The response is written as an output of a SPARQL query that binds variable ?Property to the name of a cursor property, and variable ?Value to the property value.
Request
GET /datastores/myStore/connections/01234567890123456789/cursors/CRS101 HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value
ID,CRS101
position,10
16.12.2.5. Deleting a Cursor¶
The following request closes/deletes the cursor.
Request
DELETE /datastores/myStore/connections/01234567890123456789/cursors/CRS101 HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
16.13. Explaining Fact Derivation¶
RDFox allows clients to supply a fact and retrieve an explanation of how that fact is derived from the facts explicitly given in the input using rules. An explanation consists of one or more chains of rule applications that derive the target fact.
In the RESTful API, an explanation of how a fact is derived can be retrieved using the /explanation key. This key takes the following request parameters.
The mandatory fact request parameter specifies the fact for which the explanation is to be produced. The fact should be written using the standard Datalog syntax. IRIs in the fact can be abbreviated using the prefixes of the data store.
The optional type parameter specifies the type of explanation that is to be retrieved.
The value of shortest specifies that one shortest explanation should be returned. That is, the explanation shows one shortest way in which the specified fact can be derived from the facts explicitly given in the input using rules. This is the default value for the type parameter.
The value of to-explicit specifies that all possible ways to derive the specified fact from explicit facts should be returned.
The value of exhaustive specifies that all possible ways to derive the specified fact should be returned. This option differs from to-explicit and shortest in that, if explicit facts are also derived by rules, these inferences are explained as well; in contrast, with to-explicit and shortest, the explicit facts are not further explained.
The optional max-distance-from-root numeric parameter can be used to specify the depth to which an explanation should be explored. That is, facts that require more than this number of rule applications are not explained themselves. This parameter can be used to limit the size of the explanation produced by RDFox. If a value for this parameter is not specified, the maximum depth is unlimited.
The optional max-rule-instances-per-fact numeric parameter can be used to specify the maximum number of ways of deriving a single fact that RDFox should explore. This parameter can be used to limit the size of the explanation produced by RDFox. If a value for this parameter is not specified, the number of rule instances explored per fact is unlimited.
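Since facts written in Datalog syntax contain characters such as colons and square brackets, the fact parameter should be percent-encoded when it is placed in a URL. The sketch below shows one way to assemble such a URL. It assumes that the /explanation key sits directly under the data store URL (for example, /datastores/myStore/explanation), which the text above does not state explicitly; the class name ExplanationUri is likewise hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class ExplanationUri {
    // Builds the explanation URL for a fact. 'type' and 'maxDistanceFromRoot'
    // are optional and are omitted from the query string when null.
    // Assumption: the explanation key is located at <dataStoreUrl>/explanation.
    static URI build(String dataStoreUrl, String fact, String type, Integer maxDistanceFromRoot) {
        StringBuilder query = new StringBuilder("fact=")
                .append(URLEncoder.encode(fact, StandardCharsets.UTF_8));
        if (type != null) {
            query.append("&type=").append(type);
        }
        if (maxDistanceFromRoot != null) {
            query.append("&max-distance-from-root=").append(maxDistanceFromRoot);
        }
        return URI.create(dataStoreUrl + "/explanation?" + query);
    }
}
```

For the fact :A[:a], the encoded parameter becomes fact=%3AA%5B%3Aa%5D.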
The Java API takes the same parameters, but coded as method arguments. The resulting explanation is serialized as a JSON object using the format discussed below. The IRIs of facts and constants in the output are serialized using the prefixes of the data store. In the RESTful API, the content type of the output is set to application/x.explanation+json.
The output format is described using the following example. Let us assume that a data store contains the following triples.
:a :R :b .
:a :R :c .
:a rdf:type :B .
:a rdf:type :D .
Moreover, let us assume that the data store contains the following rules.
:A[?X] :- :R[?X,?Y] .
:A[?X] :- :B[?X], NOT :C[?X] .
:B[?X] :- :D[?X] .
Clearly, fact :A[:a] is derived in this data store using the following three rule instances:
:A[:a] :- :R[:a,:b] .
:A[:a] :- :R[:a,:c] .
:A[:a] :- :B[:a], NOT :C[:a] .
Moreover, fact :B[:a] is explicitly given in the input, but it is also derived by the following rule instance.
:B[:a] :- :D[:a] .
When asked to produce a shortest explanation for fact :A[:a], RDFox might return the following JSON object.
{ "prefixes": {
":": "http://example.com/",
"rdf:": "http://www.w3.org/1999/02/22-rdf-syntax-ns#" },
"complete": true,
"facts": {
"0": {
"fact": ":A[:a]",
"type": "derived",
"distance-from-root": 0,
"proof-height": 1,
"rule-instances-complete": true,
"shortest-proof-rule-instance": 0,
"rule-instances": [
{ "rule": ":A[?X] :- :B[?X], NOT :C[?X] .",
"head-atom-index": 0,
"substitution": {
"X": { "type": "uri-abbrev", "value": ":a" } },
"body-facts": [ "1", null ] } ] },
"1": {
"fact": ":B[:a]",
"type": "explicit",
"distance-from-root": 1,
"proof-height": 0,
"rule-instances-complete": true,
"shortest-proof-rule-instance": null,
"rule-instances": [ ] }
} }
Each explanation consists of a number of facts, each of which is identified by a numeric identifier. The facts JSON object contains all facts indexed by their identifier. The fact with the zero identifier is the root of the explanation; that is, this is the fact for which the explanation was requested. Each fact has the following properties.
The fact key contains the fact using Datalog syntax. Each argument is encoded in the same way as a binding object in the application/x.sparql-results+json-abbrev query answer format.
The type key describes the kind of fact: explicit means that the fact was explicitly given in the input; derived means that the fact was not explicitly given in the input but was derived by one or more rules; and false means that the fact is neither explicitly given nor derived. The value of false is possible only for the root fact.
The distance-from-root key specifies how far a particular fact is from the root: this value is zero for the root fact, the facts that are used to derive the root fact have distance one, and so on.
The proof-height key specifies the least number of rule applications needed to derive a fact. This number is zero for explicit facts, it is one for facts derived from the explicit facts using one rule application, and so on. The value of this key can be null when the max-distance-from-root and max-rule-instances-per-fact parameters prevent RDFox from exploring all rule instances necessary to identify the shortest proof.
The rule-instances-complete key contains a Boolean value specifying whether RDFox has explored all rule instances that derive this fact. In particular, this value will be false if the value of distance-from-root is larger than the max-distance-from-root parameter, or if more than max-rule-instances-per-fact rule instances derive the fact.
The rule-instances key contains an array of rule instances that derive the given fact. Each rule instance is a JSON object with the structure specified below.
The shortest-proof-rule-instance key contains a numeric zero-based index of the rule instance from the rule-instances array that belongs to a shortest proof for this fact. The value of this key can be null when the max-distance-from-root and max-rule-instances-per-fact parameters prevent RDFox from exploring all rule instances necessary to identify the shortest proof.
Each rule instance is encoded as a JSON object with the following structure.
The rule key contains the rule written using Datalog syntax. Any IRI inside the rule is serialized using the prefixes used in the request.
The head-atom-index key contains the index of the head atom of the rule that participated in the inference. That is, the fact that mentions this rule instance is derived by the head atom of the rule with the specified index.
The substitution key contains a JSON object specifying the substitution that is used in the inference. A substitution is a mapping of variables to values, and it is encoded in the same way as a binding object in the application/x.sparql-results+json-abbrev query answer format.
The body-facts key contains an array of IDs of facts that are matched to the body formulas of the rule in the inference. In our example, there is one instance of rule :A[?X] :- :B[?X], NOT :C[?X] . that contains [ "1", null ] as the value for body-facts, which should be interpreted as follows. The first element of the array is "1", which means that the first formula in the rule body is an atom that is matched to the fact whose ID is 1. The second element of the array is null, which means that the second formula in the rule body is not an atom and is thus not directly matched to facts.
Since only a shortest explanation is required, the above JSON object presents just one way to derive fact :A[:a]. Note that several rule instances derive this fact in the same number of steps, but RDFox selects just one of them.
With the to-explicit option, RDFox produces the following explanation. The structure of the JSON object is the same as before, but all possible ways to derive :A[:a] from explicit facts are shown. Note, however, that fact :B[:a] is not further explained since it is an explicit fact.
{ "prefixes": {
":": "http://example.com/",
"rdf:": "http://www.w3.org/1999/02/22-rdf-syntax-ns#" },
"complete": true,
"facts": {
"0": {
"fact": ":A[:a]",
"type": "derived",
"distance-from-root": 0,
"proof-height": 1,
"rule-instances-complete": true,
"shortest-proof-rule-instance": 1,
"rule-instances": [
{ "rule": ":A[?X] :- :R[?X,?Y] .",
"head-atom-index": 0,
"substitution": {
"X": { "type": "uri-abbrev", "value": ":a" },
"Y": { "type": "uri-abbrev", "value": ":b" } },
"body-facts": [ "1" ] },
{ "rule": ":A[?X] :- :R[?X,?Y] .",
"head-atom-index": 0,
"substitution": {
"X": { "type": "uri-abbrev", "value": ":a" },
"Y": { "type": "uri-abbrev", "value": ":c" } },
"body-facts": [ "2" ] },
{ "rule": ":A[?X] :- :B[?X], NOT :C[?X] .",
"head-atom-index": 0,
"substitution": {
"X": { "type": "uri-abbrev", "value": ":a" } },
"body-facts": [ "3", null ] } ] },
"1": {
"fact": ":R[:a,:b]",
"type": "explicit",
"distance-from-root": 1,
"proof-height": 0,
"rule-instances-complete": true,
"shortest-proof-rule-instance": null,
"rule-instances": [ ] },
"2": {
"fact": ":R[:a,:c]",
"type": "explicit",
"distance-from-root": 1,
"proof-height": 0,
"rule-instances-complete": true,
"shortest-proof-rule-instance": null,
"rule-instances": [ ] },
"3": {
"fact": ":B[:a]",
"type": "explicit",
"distance-from-root": 1,
"proof-height": 0,
"rule-instances-complete": true,
"shortest-proof-rule-instance": null,
"rule-instances": [ ] }
} }
With the exhaustive option, RDFox produces the following explanation. The main difference from the previous case is that fact :B[:a] is explained even though it is explicitly given in the input.
{ "prefixes": {
":": "http://example.com/",
"rdf:": "http://www.w3.org/1999/02/22-rdf-syntax-ns#" },
"complete": true,
"facts": {
"0": {
"fact": ":A[:a]",
"type": "derived",
"distance-from-root": 0,
"proof-height": 1,
"rule-instances-complete": true,
"shortest-proof-rule-instance": 1,
"rule-instances": [
{ "rule": ":A[?X] :- :R[?X,?Y] .",
"head-atom-index": 0,
"substitution": {
"X": { "type": "uri-abbrev", "value": ":a" },
"Y": { "type": "uri-abbrev", "value": ":b" } },
"body-facts": [ "1" ] },
{ "rule": ":A[?X] :- :R[?X,?Y] .",
"head-atom-index": 0,
"substitution": {
"X": { "type": "uri-abbrev", "value": ":a" },
"Y": { "type": "uri-abbrev", "value": ":c" } },
"body-facts": [ "2" ] },
{ "rule": ":A[?X] :- :B[?X], NOT :C[?X] .",
"head-atom-index": 0,
"substitution": {
"X": { "type": "uri-abbrev", "value": ":a" } },
"body-facts": [ "3", null ] } ] },
"1": {
"fact": ":R[:a,:b]",
"type": "explicit",
"distance-from-root": 1,
"proof-height": 0,
"rule-instances-complete": true,
"shortest-proof-rule-instance": null,
"rule-instances": [ ] },
"2": {
"fact": ":R[:a,:c]",
"type": "explicit",
"distance-from-root": 1,
"proof-height": 0,
"rule-instances-complete": true,
"shortest-proof-rule-instance": null,
"rule-instances": [ ] },
"3": {
"fact": ":B[:a]",
"type": "explicit",
"distance-from-root": 1,
"proof-height": 0,
"rule-instances-complete": true,
"shortest-proof-rule-instance": null,
"rule-instances": [
{ "rule": ":B[?X] :- :D[?X] .",
"head-atom-index": 0,
"substitution": {
"X": { "type": "uri-abbrev", "value": ":a" } },
"body-facts": [ "4" ] } ] },
"4": {
"fact": ":D[:a]",
"type": "explicit",
"distance-from-root": 2,
"proof-height": 0,
"rule-instances-complete": true,
"shortest-proof-rule-instance": null,
"rule-instances": [ ] }
} }
16.14. Handling Concurrent Updates¶
Many applications of RDFox need to gracefully handle concurrent updates by different users. Although RDFox provides transactions to ensure consistency of parallel updates, such a construct may be insufficient in examples such as the following.
A graph-like visualization of an RDF dataset may initially show just a handful of RDF resources, and allow users to interactively explore and expand the neighborhood of each resource. Clearly, it is desirable to show to each user a consistent view of the data at any given point in time. However, opening a read-only transaction for the duration of each user’s interaction is ill-advised as it would prevent the data store from being updated. Instead, the application might want to detect when the underlying data has changed and notify the user and/or refresh the view appropriately.
A common pattern for updating data in applications involves reading current data, showing the data to the user and allowing them to make changes, and then writing the data back to the data store. Such operations involve user interaction, which can take significant time. As a result, it is usually not desirable to wrap the entire operation in a read/write transaction. Instead, many applications use ‘optimistic’ concurrency control, where the update succeeds only if the data store was not updated since the data was shown to the user; otherwise, the entire process is restarted.
This section describes aspects of the RDFox API that aim to address these two problems.
16.14.1. Detecting Updates¶
Each data store is associated with a 20-digit ID that, at any point in time, uniquely identifies a data store in the server. Also, RDFox aims to assign different unique IDs to data stores created at different points in time.
Each data store also maintains a data store version, which is a positive integer. Each time a data store is updated, the version of the data store is incremented.
Jointly, a data store unique ID and a data store version identify a particular state of a data store. That is, if these values are the same between two API calls, then we know that the data store has not been updated. The converse does not hold necessarily: even if an update request fails, the version may be incremented before the failure is detected. Nevertheless, differing unique ID and/or versions between two API calls indicate with a high degree of probability that the data store has been updated.
16.14.1.1. Java API¶
The DataStoreConnection class in the Java API provides the getUniqueID() and getDataStoreVersion() methods, which allow users to retrieve the unique ID and the data store version, respectively. Moreover, the getLastTransactionDataStoreVersion() method returns the version of the data store at the point the last transaction was successfully evaluated on this connection (or 0 if no transaction has been evaluated on this connection). Thus, by recording the unique ID and the data store version and comparing them with the values in subsequent requests, an application can detect that a data store has been updated. Note that the unique ID of a data store never changes during the lifetime of a data store. Moreover, each data store connection is associated with just one data store, and so a unique ID can never change on one connection.
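The record-and-compare approach can be captured in a small helper. The sketch below is independent of RDFox: ChangeDetector is a hypothetical class, and the unique ID is held as a string purely for illustration. An application would feed it the values obtained from the connection each time it wants to check for updates.

```java
final class ChangeDetector {
    private String lastUniqueId; // null until the first observation
    private long lastVersion;

    // Records the given state and reports whether it differs from the state
    // recorded by the previous call. The first call has nothing to compare
    // against and therefore reports no change.
    boolean hasChanged(String uniqueId, long version) {
        boolean changed = lastUniqueId != null
                && (!lastUniqueId.equals(uniqueId) || lastVersion != version);
        lastUniqueId = uniqueId;
        lastVersion = version;
        return changed;
    }
}
```

As noted above, a reported change indicates only that the data store has probably been updated, since the version may also advance when an update fails.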
16.14.1.2. RESTful API¶
The RESTful API exposes these pieces of information as ETags, which are strings of the form "uniqueID-version". For HTTP requests operating on a data store or any of its parts, the response will contain a header of the form ETag: "uniqueID-version" indicating the version of the data store upon the request’s completion. The following example illustrates this using data import.
Request
POST /datastores/myStore/content HTTP/1.1
Host: localhost
[The facts/rules to be added in a format supported by RDFox]
Response
HTTP/1.1 200 OK
ETag: "01234567890123456789-2"
[Response body as usual]
In HTTP, an ETag is considered specific to each resource. In RDFox, however, the same ETag is generated for all resources of a particular data store. Thus, in the above example, ETag "01234567890123456789-2" applies not only to resource /datastores/myStore/content, but also to /datastores/myStore and every resource underneath (such as, say, /datastores/myStore/content/tupletables). In other words, importing the data into the data store using /datastores/myStore/content changes the ETags of all parts of the data store.
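A client that wants to work with the unique ID and the version separately can split the ETag at its last hyphen. The following is a minimal sketch that assumes the "uniqueID-version" form described above with a numeric version; the class name DataStoreETag is hypothetical.

```java
final class DataStoreETag {
    final String uniqueId;
    final long version;

    private DataStoreETag(String uniqueId, long version) {
        this.uniqueId = uniqueId;
        this.version = version;
    }

    // Parses an ETag header value such as "01234567890123456789-2", including
    // the surrounding double quotes, into its unique ID and version parts.
    static DataStoreETag parse(String headerValue) {
        String value = headerValue.trim();
        if (value.length() < 2 || !value.startsWith("\"") || !value.endsWith("\"")) {
            throw new IllegalArgumentException("ETag must be a quoted string: " + headerValue);
        }
        String inner = value.substring(1, value.length() - 1);
        int dash = inner.lastIndexOf('-');
        if (dash <= 0) {
            throw new IllegalArgumentException("ETag is not of the form uniqueID-version: " + headerValue);
        }
        // Long.parseLong rejects a non-numeric version with a NumberFormatException.
        return new DataStoreETag(inner.substring(0, dash), Long.parseLong(inner.substring(dash + 1)));
    }
}
```

Applications that only need to detect changes can skip parsing altogether and compare ETag strings for equality.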
In HTTP, it is customary to return an ETag only on successful responses. In RDFox, however, an ETag will be returned even in some error responses. This is in order to keep the user informed about the current data store version as much as possible. Specifically, most requests are processed as follows.
A request is first checked to conform to the RESTful API syntax. For example, some requests must specify certain request parameters, some requests may not admit a request body, and so on. In most cases, an ETag will not be sent if a request cannot be validated properly. The rationale behind this is that syntactically malformed requests do not match to well-defined RDFox operations.
A request is then submitted for execution. If this step fails (e.g., because the data in an update request is malformed), an ETag will in most cases be sent in the error response. The rationale behind this is that the request matches to well-defined RDFox operations, and so knowing the current data store version might actually be used to recover from failure.
Since RDFox forms ETags by combining the data store’s unique ID and version number, neither of which change during the lifetime of a read-write transaction, returning ETags while such a transaction remains open would lead to the same ETag value being returned for consecutive requests even if the response body were different. Such behavior would violate the basic principles of ETags. To avoid this, RDFox does not send ETags in response to requests that use a data store connection with an open read-write transaction. The exception to this rule is when the request has closed the transaction either by committing it or rolling it back. See Section 16.11.2 for details of how transactions are managed in the RESTful API.
16.14.2. Conditional Requests¶
RDFox can evaluate all operations conditionally — that is, an operation succeeds only if the data store unique ID and version match specific values before the request is processed. Note that a naive solution, where a user reads and compares the data store version before each request, is incorrect: a data store version can change in the interval between the user reading the version and issuing the request. RDFox addresses this by integrating these checks with its transaction processing.
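The difference between the naive approach and an integrated check can be seen in a small in-memory model. The sketch below does not use RDFox at all: VersionedStore is a hypothetical stand-in for a data store whose version check and write happen atomically, and OptimisticUpdate shows the read, edit, and conditional-write loop that optimistic concurrency control relies on.

```java
import java.util.function.UnaryOperator;

// In-memory stand-in for a versioned data store; not part of RDFox.
final class VersionedStore {
    private long version = 1;
    private String content = "";

    synchronized long version() { return version; }
    synchronized String content() { return content; }

    // The version check and the write happen under one lock, so no other
    // update can slip in between them. Returns false on a version mismatch.
    synchronized boolean updateIfVersionMatches(long expectedVersion, String newContent) {
        if (version != expectedVersion) {
            return false;
        }
        content = newContent;
        version++;
        return true;
    }
}

final class OptimisticUpdate {
    // Reads the current state, computes the edited content, and writes it back
    // only if the version is unchanged; otherwise the process restarts.
    // Returns the number of attempts that were needed.
    static int apply(VersionedStore store, UnaryOperator<String> edit, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            long seenVersion = store.version();
            String edited = edit.apply(store.content());
            if (store.updateIfVersionMatches(seenVersion, edited)) {
                return attempt;
            }
        }
        throw new IllegalStateException("Update did not succeed after " + maxAttempts + " attempts");
    }
}
```

If the check in updateIfVersionMatches were performed by the caller in a separate step, another writer could change the store between the check and the write, which is exactly the race described above.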
16.14.2.1. Java API¶
To support version checking, the DataStoreConnection class in the Java API provides the setNextOperationMustMatchDataStoreVersion() and setNextOperationMustNotMatchDataStoreVersion() methods. Both methods take an integer argument, which configures the connection to expect or not expect a specific version on the next operation. Please note that version validation is not done in these methods themselves; rather, the version is validated on the next operation executed that uses the connection. If the validation fails, the operation will throw a DataStoreVersionDoesNotMatchException or DataStoreVersionMatchesException, respectively.
DataStoreConnection dsConn = ...;
// Use the data store connection...
...
// Save the data store version after the last transaction.
long savedDataStoreVersion = dsConn.getDataStoreVersionAfterLastOperation();
// Use the data store connection some more...
...
// Configure the connection to expect savedDataStoreVersion in next transaction.
// The following call will not check the version!
dsConn.setNextOperationMustMatchDataStoreVersion(savedDataStoreVersion);
// The following call fails if the version at the point of execution is
// different from savedDataStoreVersion.
dsConn.importData(...);
// The following switches data store validation off.
dsConn.setNextOperationMustMatchDataStoreVersion(0);
Once an expected version has been set on the connection, the value remains active until setNextOperationMustMatchDataStoreVersion() is called with argument 0. Moreover, if the connection is configured to expect a particular version and a data store update is successful, the data store version will be incremented and the resulting value will be set as the next expected data store version. In this way, users can process subsequent updates on the connection without having to update an expected version.
The setNextOperationMustNotMatchDataStoreVersion() method is analogous, but it configures the connection to not accept a specific version. This can be used, for example, to avoid reevaluating a complex query unless the data store has changed. Successful updates will not change this parameter of the connection.
16.14.2.2. RESTful API¶
In the RESTful API, conditional requests are supported using standard HTTP If-Match and If-None-Match headers. Specifically, if a request contains an If-Match header with a particular ETag, the request will succeed only if the version of the data store matches the ETag when the request is executed. This is illustrated by the following example, where the request (presumably) fails because of a version mismatch.
Request
POST /datastores/myStore/content HTTP/1.1
Host: localhost
If-Match: "01234567890123456789-2"
[The facts/rules to be added in a format supported by RDFox]
Response
HTTP/1.1 412 Precondition Failed
ETag: "01234567890123456789-5"
Content-Type: text/plain; charset=UTF-8
Content-Length: XX
DataStoreVersionDoesNotMatchException: Data store version is 5, which is different from the expected version 2.
Note that the response in the above example contains the current ETag of the data store. Thus, an application can try to use this ETag in any subsequent request, which will succeed only if the data store has not been modified in the meantime.
The If-None-Match header is analogous, but it ensures that the request succeeds only if the version is different from the given one. This is illustrated by the following request.
Request
GET /datastores/myStore/sparql?query=SELECT+%3FX+%3FY+%3FZ+WHERE+{+%3FX+%3FY+%3FZ+} HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
If-None-Match: "01234567890123456789-5"
Response
HTTP/1.1 304 Not Modified
ETag: "01234567890123456789-5"
Content-Type: text/plain; charset=UTF-8
Content-Length: XX
DataStoreVersionMatchesException: Data store version is equal to 5.
Conditional requests in RDFox differ from HTTP in the following minor ways.
ETags are opaque values in HTTP that must match exactly. RDFox, however, allows for partial matches. In particular, the value of the If-Match and If-None-Match headers can have the form "uniqueID-version" where uniqueID and version can either be specific values or the wildcard character *. Thus, "01234567890123456789-*" matches any data store whose unique ID is 01234567890123456789, regardless of the current data store version. Analogously, "*-5" matches any data store whose version is 5, and "*-*" matches any data store.
HTTP allows one to specify more than one ETag in the If-Match or If-None-Match headers. However, RDFox will reject such requests: the allowed values for these headers are * (which means ‘match any’ in HTTP) or a single ETag (possibly containing wildcard characters as explained above).
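The wildcard rules above can be modeled on the client side, for example to predict whether a conditional request is expected to pass. The sketch below is an illustration of the matching rules as described in this section and not RDFox’s own implementation; the class name ETagPattern is hypothetical.

```java
final class ETagPattern {
    // Returns true if a header value of the form "uniqueID-version" (where either
    // part may be the wildcard *) or the bare value * matches the given state.
    static boolean matches(String headerValue, String uniqueId, long version) {
        String pattern = headerValue.trim();
        if (pattern.equals("*")) {
            return true; // bare * matches anything
        }
        // Strip the surrounding double quotes if present.
        if (pattern.length() >= 2 && pattern.startsWith("\"") && pattern.endsWith("\"")) {
            pattern = pattern.substring(1, pattern.length() - 1);
        }
        int dash = pattern.lastIndexOf('-');
        if (dash < 0) {
            return false; // not of the form uniqueID-version
        }
        String idPart = pattern.substring(0, dash);
        String versionPart = pattern.substring(dash + 1);
        boolean idMatches = idPart.equals("*") || idPart.equals(uniqueId);
        boolean versionMatches = versionPart.equals("*") || versionPart.equals(Long.toString(version));
        return idMatches && versionMatches;
    }
}
```

For instance, the pattern "*-5" matches a data store at version 5 regardless of its unique ID, but does not match the same data store once its version has moved to 6.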
Conditional requests may not be used in combination with explicitly opened read-write transactions for the same reason that ETags are not included in responses under such circumstances (see Section 16.14.1.2).
16.15. Managing Roles¶
This section describes the API calls responsible for managing the roles defined within an RDFox server. For an introduction to RDFox’s access control model see Section 12.
16.15.1. Listing Roles¶
The following request retrieves the list of roles defined within the server. The response is written as an output of a SPARQL query that binds variable ?Name to the names of the available roles.
Request
GET /roles HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Name
admin
group
user1
Java API
List<String> roleNames = sConn.listRoles();
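For clients that talk to the REST endpoint from Java rather than through the Java API, the request and response above might be handled as in the following sketch. It uses the JDK's java.net.http classes; the class name RoleListing and the base URL passed in are assumptions for illustration, and the parser assumes role names that need no CSV quoting.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for the GET /roles call shown above. */
final class RoleListing {
    private RoleListing() {}

    /** Builds (but does not send) the request that lists roles as CSV. */
    static HttpRequest listRolesRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/roles"))
                .header("Accept", "text/csv; charset=UTF-8")
                .GET()
                .build();
    }

    /** Extracts role names from a single-column CSV body whose first line is the header. */
    static List<String> parseRoleNames(String csvBody) {
        List<String> names = new ArrayList<>();
        String[] lines = csvBody.split("\r?\n");
        for (int i = 1; i < lines.length; i++) { // index 0 is the "Name" header
            if (!lines[i].isEmpty()) {
                names.add(lines[i]);
            }
        }
        return names;
    }
}
```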
16.15.2. Creating a Role¶
The following request creates a new role. The role name is specified as part of the request URL and the password as a text/plain body. The type query parameter can be used to specify whether the role should be created using a password (if type is not specified or set to password) or password hash (if type is set to hash). The location of the new role is returned in the Location header.
Request
POST /roles/user2 HTTP/1.1
Host: localhost
Content-Type: text/plain
Content-Length: 14
user2's secret
Response
HTTP/1.1 201 Created
Location: /roles/user2
Java API
sConn.createRole("user2", "user2's secret");
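As a rough sketch, the REST request above could be assembled from Java as follows, including the optional type=hash variant. The class name CreateRoleRequest and the base URL are assumptions, and the sketch assumes a role name that needs no URL escaping.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical builder for the POST /roles/{name} request described above. */
final class CreateRoleRequest {
    private CreateRoleRequest() {}

    /**
     * @param secret       the password, or the password hash if secretIsHash is true
     * @param secretIsHash adds the type=hash query parameter when true
     */
    static HttpRequest build(String baseUrl, String roleName, String secret, boolean secretIsHash) {
        // Assumption: roleName contains only characters that are safe in a URL path.
        String url = baseUrl + "/roles/" + roleName + (secretIsHash ? "?type=hash" : "");
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "text/plain")
                .POST(HttpRequest.BodyPublishers.ofString(secret)) // sent as UTF-8 bytes
                .build();
    }
}
```

The request is only built here; sending it with an HttpClient and checking for the 201 Created status and the Location header is left to the caller.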
16.15.3. Deleting a Role¶
The following request deletes a role.
Request
DELETE /roles/user2 HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
sConn.deleteRole("user2");
16.15.4. Listing Role Information¶
The following request lists information about an existing role. The response is written as an output of a SPARQL query that returns one answer per property of the role. For each answer, the variable ?Property contains the name of the property and the variable ?Value holds its value.
Request
GET /roles/admin HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value
name,admin
password-hash,"$argon2i$v=19$m=208651,t=3,p=16$xNWat7TDiKEGGU2W66u/Pw$h9ObPGi855ypuDBI7Nr2zeWAa6f2VBmIrFRs32gEXHY"
Java API
String passwordHash = sConn.getRolePasswordHash("user1");
16.15.5. Listing Privileges¶
The following request lists the privileges of an existing role. The response is written as an output of a SPARQL query that returns one answer per resource specifier over which the role has any privileges. For each answer, the variable ?AllowedAccessTypes contains a comma-separated list of access types the role is allowed to perform over the resources specified by the resource specifier in the ?ResourceSpecifier variable.
Request
GET /roles/user1/privileges HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
ResourceSpecifier,AllowedAccessTypes
>datastores,"read,write"
|roles,read
Java API
Map<String, Byte> privileges = sConn.listPrivileges("user1");
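Because the access types are themselves comma-separated, the second CSV column is quoted whenever a role has more than one access type. The following sketch shows one way a REST client might parse such a response; the class name PrivilegesCsv is hypothetical, and the parser handles only the quoting seen in the example above plus doubled quotes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical parser for the two-column CSV returned by GET /roles/{name}/privileges. */
final class PrivilegesCsv {
    private PrivilegesCsv() {}

    /** Splits one CSV line into fields, honouring double-quoted fields and "" escapes. */
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    current.append(c);
                } else if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // escaped quote inside a quoted field
                    i++;
                } else {
                    inQuotes = false;
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    /** Maps each resource specifier to its list of allowed access types. */
    static Map<String, List<String>> parse(String csvBody) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        String[] lines = csvBody.split("\r?\n");
        for (int i = 1; i < lines.length; i++) { // index 0 is the header row
            if (lines[i].isEmpty()) {
                continue;
            }
            List<String> fields = splitLine(lines[i]);
            result.put(fields.get(0), List.of(fields.get(1).split(",")));
        }
        return result;
    }
}
```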
16.15.6. Granting Privileges to a Role¶
The following request grants the read and write privileges over the data store list to an existing role.
Request
PATCH /roles/user1/privileges?operation=grant HTTP/1.1
Host: localhost
Content-Length: 54
Content-Type: application/x-www-form-urlencoded
Accept: text/csv; charset=UTF-8
resource-specifier=|datastores&access-types=read,write
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Changed
true
Java API
boolean changed = sConn.grantPrivileges("user1", "|datastores", (byte)(ServerConnection.ACCESS_TYPE_READ | ServerConnection.ACCESS_TYPE_WRITE));
16.15.7. Revoking Privileges from a Role¶
The following request revokes the write privilege over the data store list from an existing role.
Request
PATCH /roles/user1/privileges?operation=revoke HTTP/1.1
Host: localhost
Content-Length: 49
Content-Type: application/x-www-form-urlencoded
Accept: text/csv; charset=UTF-8
resource-specifier=|datastores&access-types=write
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Changed
true
Java API
boolean changed = sConn.revokePrivileges("user1", "|datastores", (byte)(ServerConnection.ACCESS_TYPE_WRITE));
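The grant and revoke requests share the same form body, which could be built as in the sketch below. Note one difference from the examples above: this sketch percent-encodes the values, as strict application/x-www-form-urlencoded encoding requires, whereas the examples send | and , literally. Whether the endpoint treats both forms identically is an assumption that should be checked; the class name PrivilegeForm is hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical builder for the grant/revoke privileges form body. */
final class PrivilegeForm {
    private PrivilegeForm() {}

    /** Returns resource-specifier=...&access-types=... with percent-encoded values. */
    static String body(String resourceSpecifier, String... accessTypes) {
        return "resource-specifier=" + encode(resourceSpecifier)
                + "&access-types=" + encode(String.join(",", accessTypes));
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

The same body is used with operation=grant and operation=revoke; only the query parameter of the PATCH URL differs.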
16.15.8. Listing Memberships¶
The following request lists the roles of which the specified role is a member. The response is written as an output of a SPARQL query that binds variable ?Name to the names of the role’s super roles.
Request
GET /roles/user1/memberships HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Name
group
Java API
List<String> memberships = sConn.listRoleMemberships("user1");
16.15.9. Granting Memberships¶
The following request grants membership of the role group to an existing role.
Request
PATCH /roles/user1/memberships?operation=add HTTP/1.1
Host: localhost
Content-Length: 21
Content-Type: application/x-www-form-urlencoded
Accept: text/csv; charset=UTF-8
super-role-name=group
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Changed
true
Java API
boolean changed = sConn.grantRole("user1", "group");
16.15.10. Revoking Memberships¶
The following request revokes membership of the role group from an existing role.
Request
PATCH /roles/user1/memberships?operation=delete HTTP/1.1
Host: localhost
Content-Length: 21
Content-Type: application/x-www-form-urlencoded
Accept: text/csv; charset=UTF-8
super-role-name=group
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Changed
true
Java API
boolean changed = sConn.revokeRole("user1", "group");
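Granting and revoking memberships differ only in the operation query parameter (add or delete), so a REST client might build both requests with one helper, as sketched below. The class name MembershipRequest and the base URL are assumptions, and role names are assumed to need no URL or form escaping.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical builder for PATCH /roles/{name}/memberships requests. */
final class MembershipRequest {
    private MembershipRequest() {}

    /** Builds a request that grants (grant == true) or revokes membership of superRoleName. */
    static HttpRequest build(String baseUrl, String roleName, String superRoleName, boolean grant) {
        String operation = grant ? "add" : "delete";
        URI uri = URI.create(baseUrl + "/roles/" + roleName + "/memberships?operation=" + operation);
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .header("Accept", "text/csv; charset=UTF-8")
                // java.net.http has no PATCH shortcut, so the method is named explicitly.
                .method("PATCH", HttpRequest.BodyPublishers.ofString("super-role-name=" + superRoleName))
                .build();
    }
}
```

The HTTP client derives Content-Length from the body publisher, so the length does not have to be computed by hand.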
16.15.11. Listing Members¶
The following request lists the roles which are members of the specified role. The response is written as an output of a SPARQL query that binds variable ?Name to the names of the role’s members.
Request
GET /roles/group/members HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Name
user1
Java API
List<String> members = sConn.listRoleMembers("group");
16.15.12. Changing Passwords¶
The following request sets the password for the authenticated role. The request body must contain the old and the new passwords separated by a single \n (LF) character.
Request
PUT /password HTTP/1.1
Host: localhost
Authorization: Basic dXNlcjE6dXNlcjE=
Content-Type: text/plain
Content-Length: 41
user1's old password
user1's new password
Response
HTTP/1.1 204 No Content
Java API
sConn.changeRolePassword("user1's old password", "user1's new password");
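The body format of the REST request can be sketched as follows. The class name PasswordChangeBody is hypothetical, and rejecting passwords that contain a line feed is a precaution of this sketch (the separator would otherwise be ambiguous), not a documented RDFox rule.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that builds the body of the PUT /password request. */
final class PasswordChangeBody {
    private PasswordChangeBody() {}

    /** Returns the UTF-8 bytes of oldPassword, a single line feed, and newPassword. */
    static byte[] build(String oldPassword, String newPassword) {
        // Precaution of this sketch: a line feed inside either password would
        // make the position of the separator ambiguous.
        if (oldPassword.indexOf('\n') >= 0 || newPassword.indexOf('\n') >= 0) {
            throw new IllegalArgumentException("Passwords must not contain a line feed in this sketch");
        }
        return (oldPassword + "\n" + newPassword).getBytes(StandardCharsets.UTF_8);
    }
}
```

The length of the returned array is the value to send as Content-Length.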
16.16. Checking Endpoint Health¶
The following request may be used to test that the endpoint is healthy (able to respond to requests). Note that no authorization is required, irrespective of the server’s access control policy.
Request
GET /health HTTP/1.1
Response
HTTP/1.1 204 No Content
There is no equivalent of this API in Java.
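A Java program can still call the HTTP endpoint directly. The following sketch uses the JDK's java.net.http client; the class name HealthCheck, the base URL, and the five-second timeout are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical health probe for the GET /health request shown above. */
final class HealthCheck {
    private HealthCheck() {}

    /** Builds the unauthenticated GET /health request. */
    static HttpRequest request(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/health"))
                .timeout(Duration.ofSeconds(5)) // assumed timeout, adjust as needed
                .GET()
                .build();
    }

    /** Returns true only if the endpoint answers 204 No Content. */
    static boolean isHealthy(HttpClient client, String baseUrl) {
        try {
            HttpResponse<Void> response =
                    client.send(request(baseUrl), HttpResponse.BodyHandlers.discarding());
            return response.statusCode() == 204;
        } catch (IOException e) {
            return false; // connection refused, timeout, and similar failures
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }
}
```

A caller might run HealthCheck.isHealthy(HttpClient.newHttpClient(), "http://localhost:12110") from a monitoring loop.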
16.17. Remote Shell Execution¶
The RESTful API includes support for using the RDFox Shell remotely. The rest of this section describes the basic API, whereas Section 18.2 describes how to use the RDFox executable as a client for these APIs.
16.17.1. Basic Remote Command Execution¶
To execute a shell script, one may POST it to the /commands path and read the output of the commands returned in the response body. For example, the following request sets the active data store and runs the compact command on it.
Request
POST /commands HTTP/1.1
Content-Type: text/plain
Content-Length: 22
active myStore
compact
Response
HTTP/1.1 200 OK
Server: RDFox Endpoint
Content-Type: text/plain; charset=UTF-8
Transfer-Encoding: chunked
Data store connection 'myStore' is active.
The current data store was compacted in 0.002000 s.
Warning
Shell commands that require RDFox to access files, such as those that import or export content, redirect shell output or invoke shell scripts, could be used in conjunction with the remote shell API, or other APIs, to probe for the existence of files in the server’s file system. To guard against disclosing file and directory names that are unrelated to RDFox’s operation in this way, RDFox’s core API implementations will only open files whose paths are within the directory specified by the sandbox-directory server parameter. Administrators should set this parameter to the deepest (that is, furthest from the root) node that contains all of the paths that users of the remote shell or other APIs should be able to access. See Section 4.3 for more details.
16.17.1.1. Limitations¶
The remote shell API does not support commands that prompt for interaction. Such commands fall into one of two categories. The first category is commands that require confirmation such as dstore delete myStore. These commands can be used remotely so long as the keyword force is appended to prevent them from prompting. The second category is commands that prompt for a password. Commands in this category cannot be used via the remote shell.
Commands for managing the endpoint (endpoint and daemon) cannot be used via the remote shell.
16.17.2. Explicit Shell Creation¶
By default, RDFox will create an instance of the remote shell for each RESTful API request sent to the /commands path. This means that the state maintained by a shell instance, such as variables, connections and prefix definitions, is discarded at the end of each request. To overcome this and allow the shell instance state to be preserved from one call to the next, one can explicitly manage remote shell instances and associate them with command execution requests. The following sequence of requests demonstrates this.
First, a new shell is created by submitting an empty POST request to the /shells path:
Request
POST /shells HTTP/1.1
Content-Type: text/plain
Response
HTTP/1.1 201 Created
RDFox-Authentication-Token: 98765432109876543210
Location: /shells/01234567890123456789
Content-Length: 0
On success, the URL of the newly created remote shell instance and an authentication token specific to the shell instance are returned via the Location and RDFox-Authentication-Token response headers respectively.
As with explicit connection management, the newly created remote shell instance is associated with the role that created it: either the role specified in the Authorization request header if basic authentication is used, or the guest role if the Authorization request header is missing.
Once created, a remote shell instance may be used in calls to the /commands API by specifying the final segment of its URL as the shell request parameter.
Requests to use the remote shell instance must be authorized in one of two ways. The first option is to use the standard Basic authentication scheme with the name and password of the role that created the shell instance (instances created by the guest role can be used without authentication). Alternatively, the authentication token returned in the RDFox-Authentication-Token response header can be used as in the following example:
Request
POST /commands?shell=01234567890123456789 HTTP/1.1
Content-Type: text/plain
Content-Length: 34
Authorization: RDFox 98765432109876543210
set myVariableName myVariableValue
Response
HTTP/1.1 200 OK
Server: RDFox Endpoint
Content-Type: text/plain; charset=UTF-8
Transfer-Encoding: chunked
myVariableName = "myVariableValue"
By submitting subsequent requests with the same remote shell instance, we see that shell state set in one request is still available in subsequent requests:
Request
POST /commands?shell=01234567890123456789 HTTP/1.1
Content-Type: text/plain
Content-Length: 22
Authorization: RDFox 98765432109876543210
echo $(myVariableName)
Response
HTTP/1.1 200 OK
Server: RDFox Endpoint
Content-Type: text/plain; charset=UTF-8
Transfer-Encoding: chunked
myVariableValue
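The two steps above, deriving the shell parameter from the Location header and then authorizing with the token, might look as follows in a Java client. The class name ShellCommands and the base URL are assumptions for illustration; the header names and the RDFox authorization scheme are taken from the examples above.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper for sending commands to an explicitly created shell. */
final class ShellCommands {
    private ShellCommands() {}

    /** Returns the final path segment of a Location value such as /shells/0123. */
    static String shellIdFromLocation(String location) {
        return location.substring(location.lastIndexOf('/') + 1);
    }

    /** Builds a POST /commands request bound to the given shell and token. */
    static HttpRequest commandsRequest(String baseUrl, String shellId, String token, String script) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/commands?shell=" + shellId))
                .header("Content-Type", "text/plain")
                .header("Authorization", "RDFox " + token) // token from RDFox-Authentication-Token
                .POST(HttpRequest.BodyPublishers.ofString(script))
                .build();
    }
}
```

Reusing the same shell ID and token for every request is what keeps variables and connections alive between calls.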
16.17.3. Interrupting Remote Shell Execution¶
Commands submitted to an explicitly created remote shell instance can be interrupted by submitting a PATCH request to the full URL of the instance with the operation request parameter set to interrupt.
Request
PATCH /shells/01234567890123456789?operation=interrupt HTTP/1.1
Host: localhost:12110
Authorization: RDFox 98765432109876543210
Response
HTTP/1.1 204 No Content
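A minimal sketch of building the interrupt request from Java is shown below. The class name ShellInterrupt and the base URL are assumptions. Since the /commands request being interrupted is still in flight, a client would typically send the interrupt from a different thread or via an asynchronous call.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical builder for the PATCH request that interrupts a remote shell. */
final class ShellInterrupt {
    private ShellInterrupt() {}

    /**
     * @param shellLocation the Location value returned when the shell was created,
     *                      for example /shells/01234567890123456789
     */
    static HttpRequest build(String baseUrl, String shellLocation, String token) {
        return HttpRequest.newBuilder(URI.create(baseUrl + shellLocation + "?operation=interrupt"))
                .header("Authorization", "RDFox " + token)
                .method("PATCH", HttpRequest.BodyPublishers.noBody()) // the request has no body
                .build();
    }
}
```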
16.17.4. Inspecting the Status of the Remote Shell Instance¶
Requests to the /commands API that are correctly formatted and carry valid credentials will always receive a 200 OK response code, irrespective of errors arising from command execution. Clients can, however, receive additional information about the status of the shell at the end of each request through HTTP trailers. To opt in to receiving trailers, clients must add the TE: trailers request header. RDFox will then transmit trailers including the RDFox-Shell-Status trailer:
Request
POST /commands?shell=01234567890123456789 HTTP/1.1
Content-Type: text/plain
Content-Length: 4
TE: trailers
quit
Response (with raw response body)
HTTP/1.1 200 OK
Trailer: RDFox-Final-Status-Code, RDFox-Error, RDFox-Shell-Status
Content-Type: text/plain; charset=UTF-8
Transfer-Encoding: chunked
0
RDFox-Shell-Status: quit
RDFox-Final-Status-Code: 200
RDFox-Error: ""
The possible values for the RDFox-Shell-Status trailer are as follows:
running
The remote shell instance is running and available for further requests.
quit
The remote shell instance has exited and is not available for any further requests. This status is returned when the shell has exited due to the commands in the request (as in the example above) and when an implicitly-created shell has successfully executed all of the commands in the request it was created for.
aborted-duplicate
Execution of the commands in the request was aborted due to an attempt to create a resource with a name that is already in use. The remote shell instance has exited and is not available for any further requests. This status will only be returned when the on-error variable of the shell is set to stop. See Section 15.5 for more details of how errors are handled by the shell.
aborted-non-duplicate
Execution of the commands in the request was aborted due to an error other than an attempt to create a resource with a name that is already in use. The remote shell instance has exited and is not available for any further requests. This status will only be returned when the on-error variable of the shell is set to stop or continue-if-exists. See Section 15.5 for more details of how errors are handled by the shell.
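A client could model these status values as in the following sketch. It assumes the trailer lines have already been obtained as text from an HTTP client that exposes trailers; the names ShellStatus and Trailers are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** The four RDFox-Shell-Status values listed above (hypothetical client-side model). */
enum ShellStatus {
    RUNNING("running"),
    QUIT("quit"),
    ABORTED_DUPLICATE("aborted-duplicate"),
    ABORTED_NON_DUPLICATE("aborted-non-duplicate");

    private final String trailerValue;

    ShellStatus(String trailerValue) {
        this.trailerValue = trailerValue;
    }

    /** Only a running shell is available for further requests. */
    boolean acceptsFurtherRequests() {
        return this == RUNNING;
    }

    static ShellStatus fromTrailerValue(String value) {
        for (ShellStatus status : values()) {
            if (status.trailerValue.equals(value.trim())) {
                return status;
            }
        }
        throw new IllegalArgumentException("Unknown RDFox-Shell-Status: " + value);
    }
}

/** Parses raw "Name: value" trailer lines into a map. */
final class Trailers {
    private Trailers() {}

    static Map<String, String> parse(String rawTrailerBlock) {
        Map<String, String> trailers = new LinkedHashMap<>();
        for (String line : rawTrailerBlock.split("\r?\n")) {
            int colon = line.indexOf(':');
            if (colon > 0) { // skip blank or malformed lines
                trailers.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
            }
        }
        return trailers;
    }
}
```

With this in place, a client can decide whether to reuse a shell instance by checking acceptsFurtherRequests() on the parsed status.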
16.17.5. Deletion of Remote Shell Instances¶
Remote shell instances created using the /shells path will be automatically garbage collected by the endpoint once they have been unused for the endpoint’s object-keep-alive-time (see Section 19.2). Alternatively, they may be deleted explicitly as demonstrated by the following request:
Request
DELETE /shells/01234567890123456789 HTTP/1.1
Host: localhost:12110
Response
HTTP/1.1 204 No Content