8. Programmatic Access to RDFox¶
Programmatic control of RDFox can be gained remotely via a RESTful API exposed through an HTTP endpoint or in-memory via Java.
This section describes the functionality provided by both APIs for managing the different information elements of the system. It should be understood as a reference for the Java and REST APIs of RDFox, and it assumes familiarity with the structure of RDFox as described in Section 6.
8.1. Basics of the Java API¶
The Java API provides access to RDFox via connections to a server and/or its data stores. A connection encapsulates the identity of the object being connected to, as well as the credentials of the user making the connection. The following example demonstrates a typical life cycle of a connection.
String serverURL = ...;
String roleName = ...;
String password = ...;
ServerConnection sConn = ConnectionFactory.newServerConnection(serverURL, roleName, password);
// Use the server connection...
String dataStoreName = ...;
DataStoreConnection dsConn = sConn.newDataStoreConnection(dataStoreName);
// Use the data store connection...
dsConn.close();
sConn.close();
Both server and data store connections must be closed after use in order to release system resources. Connections need not be closed in any particular order: a server connection may be closed before or after any data store connections obtained from it, as the two kinds of connections are independent.
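Assuming that the connection classes implement AutoCloseable (which their close() methods suggest, but verify against the API documentation of your RDFox version), a try-with-resources block is a convenient way to guarantee that both connections are released even if an exception is thrown:

String serverURL = ...;
String roleName = ...;
String password = ...;
String dataStoreName = ...;
// Both connections are closed automatically, in reverse order of creation.
try (ServerConnection sConn = ConnectionFactory.newServerConnection(serverURL, roleName, password);
     DataStoreConnection dsConn = sConn.newDataStoreConnection(dataStoreName)) {
    // Use the connections...
}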
For convenience, one can connect to a data store directly.
String serverURL = ...;
String dataStoreName = ...;
String roleName = ...;
String password = ...;
DataStoreConnection dsConn = ConnectionFactory.newDataStoreConnection(serverURL, dataStoreName, roleName, password);
// Use the data store connection...
dsConn.close();
All connections are single-threaded — that is, they can safely be used only from one thread at a time. Using the same connection from multiple threads results in undefined behavior and can lead to a system crash (although the server itself will not be corrupted provided that the containing process survives the crash). To use RDFox concurrently, one should use a distinct connection per execution thread.
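The following sketch illustrates this pattern with the standard java.util.concurrent executor framework; the connection details and the work performed by each task are placeholders:

String serverURL = ...;
String dataStoreName = ...;
String roleName = ...;
String password = ...;
ExecutorService executor = Executors.newFixedThreadPool(4);
for (int i = 0; i < 4; ++i) {
    executor.submit(() -> {
        // Each task creates, uses, and closes its own connection; connections are never shared.
        DataStoreConnection dsConn = ConnectionFactory.newDataStoreConnection(serverURL, dataStoreName, roleName, password);
        try {
            // ... evaluate queries or perform updates on dsConn ...
        } finally {
            dsConn.close();
        }
        return null;
    });
}
executor.shutdown();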
8.2. Basics of the RESTful API¶
Each RESTful endpoint can be started using the endpoint shell command; please refer to Section 10.1.13 for details about this command and the endpoint configuration options. The endpoint provides access to one RDFox server via the following API keys.
/                          : management of the server (GET/PATCH)
    /datastores            : listing available data stores (GET)
        /<DSTRNAME>        : management of a data store (GET/PATCH/POST/DELETE)
            /content       : data store content (GET/PATCH/PUT/POST/DELETE)
            /datasources   : listing available data sources (GET)
                /<DSRCNAME>    : management of a data source (GET/POST/DELETE)
                    /tables    : listing available data source tables (GET)
                        /<DTNAME>  : information about a data source table (GET)
                            /data  : sampling facts of a data source table (GET)
            /dictionary    : the data store dictionary (GET)
            /sparql        : data store SPARQL endpoint (GET/POST)
            /stats         : listing the available statistics (GET/PUT)
                /<STNAME>  : management of the statistics (GET/PUT/POST/DELETE)
            /transactions  : management of transactions (GET/POST)
                /<TXID>    : management of a transaction (GET/POST/PATCH/DELETE)
                    /content   : transactional updates of facts/rules (GET/PATCH/PUT/POST/DELETE)
                    /cursors   : management of transaction cursors (GET/POST)
                        /<CURSID>  : management of a cursor (GET/POST/DELETE)
                    /sparql    : transactional querying (GET/POST)
            /tupletables   : listing available tuple tables (GET)
                /<TTNAME>  : management of a tuple table (GET/POST/DELETE)
    /roles                 : listing roles (GET)
        /<ROLENAME>        : management of a role (POST/DELETE)
            /privileges    : management of a role's privileges (GET/PATCH)
            /memberships   : management of a role's memberships (GET/PATCH)
            /members       : listing a role's members (GET)
    /password              : changing the password of the authenticated role (PUT)
8.3. Managing Servers¶
This section describes the API calls responsible for managing an RDFox server.
8.3.1. Retrieving Server Properties¶
The following request retrieves standard properties of a server. The response is written as the output of a SPARQL query that binds the variable ?Property to the property name, variable ?Value to the property value, and variable ?Mutable to true if the value of the property can be changed and to false otherwise. The names of all properties specified at the time the server was created are prefixed with parameters. so that they can be identified in the output.
Request
GET / HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value,Mutable
num-threads,8,true
parameters.max-memory,100000000,false
The Java API provides various getter functions on ServerConnection to retrieve the properties of a server.
Java API
int numThreads = sConn.getNumberOfThreads();
// ...
8.3.2. Setting Server Properties¶
The following request updates the server properties using the values specified in the request. Only property names returned by the GET call described in the previous section are supported, and only mutable properties can be changed.
Request
PATCH /?num-threads=5 HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
The Java API provides various setter functions on ServerConnection to set the properties of a server.
Java API
int numberOfThreads = ...;
sConn.setNumberOfThreads(numberOfThreads);
8.4. Managing Data Stores¶
This section describes the API calls responsible for managing data stores of an RDFox server.
8.4.1. Listing Available Data Stores¶
The following request retrieves the list of data stores available at a server. The response is written as an output of a SPARQL query that binds variable ?Name to the names of the available data stores.
Request
GET /datastores HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Name
myStore
yourStore
theirStore
Java API
List<DataSourceInfo> dataStoreInfos = sConn.listDataStores();
8.4.2. Creating a Data Store¶
The following request creates a new data store. The data store name is specified as part of the request URL, and key-value pairs can be supplied as request parameters to determine various data store options. The type parameter must be provided, and it specifies the type of the new data store. The location of the new store is returned in the Location header.
Request
POST /datastores/myStore?type=par-complex-nn;key1=val1;key2=val2 HTTP/1.1
Host: localhost
Response
HTTP/1.1 201 CREATED
Location: /datastores/myStore
Java API
Map<String, String> parameters = new HashMap<String, String>();
parameters.put("key1", "val1");
parameters.put("key2", "val2");
sConn.createDataStore("myStore", "par-complex-nn", parameters);
8.4.3. Deleting a Data Store¶
The following request deletes a data store.
Request
DELETE /datastores/myStore HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
sConn.deleteDataStore("myStore");
Deleting a data store invalidates all connections to it — that is, any request made on the connection will result in an error. However, all connections to the deleted data store must still be explicitly closed in order to release all system resources.
8.4.4. Retrieving Data Store Properties¶
The following request retrieves standard properties of the data store. The response is written as an output of a SPARQL query that binds the variable ?Property to the property name, variable ?Value to the property value, and variable ?Mutable to true if the value of the property can be changed and to false otherwise. The exact properties and their values are dependent on the type of the data store. The names of all properties specified at the time the data store was created are prefixed with parameters. so that they can be identified in the output.
Request
GET /datastores/myStore HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value,Mutable
name,TestDataStore,false
unique-id,01234567890,false
type,par-complex-nn,false
parameters.by-levels,true,false
parameters.equality,off,false
parameters.use-DRed,false,false
concurrent,true,false
equality-axiomatization,off,false
generation-counter,0,false
requires-incremental-reasoning,false,false
The Java API provides various getter functions on DataStoreConnection to retrieve the basic properties of a data store.
Java API
String type = dsConn.getType();
String uniqueID = dsConn.getUniqueID();
// ...
8.4.5. Manage Data Store Content¶
The PATCH request can be used to alter the content of a data store or how the data store is persisted. The query parameter command should be set to one of the command values given in the table below.

Command | Description
---|---
clear | Removes all facts, axioms, and rules from the data store. Equivalent to the shell command clear.
clear-rules-explicate-facts | Clears all rules and makes all facts explicit. Equivalent to the shell command clear rules-explicate-facts.
clear-facts-keep-rules | Clears all facts but keeps all rules currently loaded into the data store. Equivalent to the shell command clear facts-keep-rules.
compact | Compacts all facts in the data store, reclaiming the space used by deleted facts in the process memory and in persistent storage. Equivalent to the shell command compact.
recompile-rules | Recompiles the rules in the current data store according to the current statistics. Equivalent to the shell command recompilerules.
Request
PATCH /datastores/myStore?command=clear-rules-explicate-facts HTTP/1.1
Host: localhost
Accept: */*
Response
HTTP/1.1 204 No Content
Java API
dsConn.clearRulesExplicateFacts();
8.4.6. Retrieving and Modifying Data Store Content¶
The content of a data store can be modified using the /content key. All modification is transactional — that is, a transaction is started before the call and it is committed (if modification is successful) or rolled back (if there is an error) before the call returns. All reasoning (if any is needed) is performed before the transaction is committed. The /content key implements the SPARQL 1.1 Graph Store HTTP Protocol. However, since this protocol does not support incremental deletion, the /content key also supports a proprietary extension for incremental updates. All of these functions are applied to a data store as a whole — that is, the default and graph=uri request parameters of the SPARQL 1.1 Graph Store HTTP Protocol are ignored. A subset of the data store (e.g., just one named graph) can be retrieved using CONSTRUCT queries, and arbitrary updates can be implemented using DELETE/INSERT queries. The /content key also implements a proprietary protocol extension, which can be used to receive errors and/or warnings while the content is being parsed, as well as summary information about the size of the import.
The formats that RDFox supports for encoding triples and/or rules are described in Section 8.9 and are identified using MIME types. In RESTful API calls that retrieve or update data store content, the format determines which part of the content is being retrieved or updated. For example, a request to output data store content using the Turtle format (MIME type text/turtle) retrieves all triples from the default graph, whereas a request to output the content using the datalog format (MIME type application/x.datalog) retrieves all rules and no triples. As another example, an incremental addition request that uses the Turtle format will update the triples in the default graph.
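For instance, using the exportData call shown in the next subsection, one could retrieve just the rules of a data store via the Java API; the following is a sketch that simply selects the datalog MIME type:

Prefixes prefixes = ...;
OutputStream output = ...;
Map<String, String> parameters = ...;
// The datalog format contains rules and no triples, so this exports only the rules.
dsConn.exportData(prefixes, output, "application/x.datalog", parameters);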
RDFox can reliably detect the format of input data, and so the Content-Type specification in update requests can be omitted. However, if the Content-Type header is present, it must match the type of the content or the update is rejected.
8.4.6.1. Retrieving Data Store Content¶
The following request retrieves the content of the data store. The media type specified using the Accept header determines which subset of the store is retrieved. Depending on the format, different request parameters can be specified to customize the data returned.

When retrieving facts in the application/n-triples, text/turtle, application/n-quads, or application/trig formats, the only supported parameter is fact-domain, and its value is the fact domain (Section 6.5.2) that determines which facts are exported. The default fact domain is EDB.

When retrieving OWL 2 axioms in the text/owl-functional format, the only supported parameter is axiom-domain, and its value is the axiom domain (Section 6.3) that determines which axioms are exported. The default axiom domain is user.

When retrieving rules in the application/x.datalog format, the only supported parameter is rule-domain, and its value is the rule domain (Section 6.4) that determines which rules are exported. The default rule domain is user.
Request
GET /datastores/myStore/content?fact-domain=IDB HTTP/1.1
Host: localhost
Accept: text/turtle; charset=UTF-8
Response
HTTP/1.1 200 OK
[The content of the store formatted according to the Turtle 1.1 standard]
Java API
Prefixes prefixes = ...;
OutputStream output = ...;
Map<String, String> parameters = ...;
dsConn.exportData(prefixes, output, "text/turtle", parameters);
8.4.6.2. Incrementally Adding Data Store Content¶
The PATCH request can be used to incrementally add content to a data store. The query parameter mode should be set to add. The type of content added is determined in one of the following two ways:

if the Content-Type header is absent, then the type of content is inferred automatically from the supplied content; and

if the Content-Type header is present, then the supplied content must be of that type, or the request is rejected.
RDFox will provide information about this operation as follows.

If the Accept header identifies a SPARQL answer format, the response body is structured as an answer to a SPARQL query with variables ?Type, ?Line, ?Column, ?Description, and ?Value. For each error or warning, an answer is emitted where the value of ?Type identifies the notification type (e.g., "error" or "warning", but other notification types are possible too), the values of ?Line and ?Column may identify the place in the input where the error was detected, and the value of ?Description describes the error or warning. Moreover, the following answers summarize information about the importation:

For each prefix definition encountered during importation, one answer is emitted where the value of ?Type is "prefix", the value of ?Description is the prefix name (which ends with :), and the value of ?Value is the prefix URI. This allows the client to retrieve the prefixes from the submitted input.

In addition, answers with ?Type equal to "information", ?Description equal to "#items", "#errors", or "#warnings", and the value of ?Value containing an integer are generated to report the number of items (i.e., facts, rules, or axioms) processed in the input, as well as the numbers of errors and warnings encountered. Only nonzero values are reported. This allows the client to determine how the data store was updated.

If the Accept header is either absent or has value text/plain, then the Content-Type header of the response is set to text/plain, and the response body contains a human-readable description of the same information as in the previous case.
RDFox also uses a proprietary header Notify-Immediately to determine how to return information about the operation to the client, which also determines the status codes used.

If the request does not include the Notify-Immediately header, then the entire request is processed before the response is returned to the client. The response will indicate success or failure by using one of the following status codes (which are compatible with the SPARQL 1.1 Graph Store HTTP Protocol): 400 Bad Request indicates that at least one error has been encountered; 204 No Content indicates that no additional information is provided, so the response body is empty; and 200 OK indicates that no errors have been encountered, but the response body contains additional information (which can be information about warnings, or summary information in the extended format).

If the request includes the Notify-Immediately: true header, then notifications about errors and warnings are sent to the client as soon as they are available, possibly even before the client has finished sending the request body, thus allowing the client to take appropriate action early on. For example, a client may decide to stop sending the rest of the request body after receiving an error. This option increases the flexibility of the RESTful API, but at the expense of added complexity. The client must keep reading the notifications while it is still sending the request body. In particular, the notifications produced and sent eagerly by RDFox can fill the TCP/IP buffers on the sender and receiver side, in which case RDFox will wait for the client to read the notifications and thus free the buffers. If the client is not reading the notifications, a deadlock will occur where the client is waiting for RDFox to process the request content, and RDFox is waiting for the client to read the notifications.

If a warning is generated before an error, RDFox must start producing the response without knowing whether the entire operation will succeed (i.e., errors can be generated later during the process). In such situations, RDFox uses the 202 Accepted status code in the response to indicate that the status of the operation is not yet known. The operation then succeeds if and only if the response body contains no errors.
The following is an example of a successful request that follows the SPARQL 1.1 Graph Store HTTP Protocol.
Request
PATCH /datastores/myStore/content?mode=add HTTP/1.1
Host: localhost
[The facts/rules to be added in a format supported by RDFox]
Response
HTTP/1.1 200 OK
prefix: pref: = http://www.test.com/test#
information: #items = 9
The following is an example of an unsuccessful request where errors are returned in text format.
Request
PATCH /datastores/myStore/content?mode=add HTTP/1.1
Host: localhost
a b c .
Response
HTTP/1.1 400 Bad Request
Content-Type: text/plain; charset=UTF-8
Transfer-Encoding: chunked
XX
error: line 1: column 3: Resource expected.
information: #items = 9
information: #errors = 2
0
The following is an example of a request where errors are returned in a SPARQL answer format.
Request
PATCH /datastores/myStore/content?mode=add HTTP/1.1
Content-Type: text/csv
@prefix pref: <http://www.test.com/test#> .
pref:a pref:b pref:c .
a b c .
Response
HTTP/1.1 400 Bad Request
Content-Type: text/csv; charset=UTF-8
Transfer-Encoding: chunked
XX
Type,Line,Column,Description,Value
error,3,3,Resource expected.,
prefix,,,pref:,http://www.test.com/test#
information,,,#items,0
information,,,#errors,1
0
In the Java API, notifications are received by passing an instance implementing the ImportNotificationMonitor interface.
Java API
InputStream input = ...;
ImportNotificationMonitor importNotificationMonitor = ...;
ImportResult result = dsConn.importData(UpdateType.ADD, prefixes, input, "", importNotificationMonitor);
8.4.6.3. Incrementally Deleting Data Store Content¶
The following request incrementally deletes content from a data store. The request and response formats follow the same structure as in the case of incremental addition; however, the mode query parameter should be set to delete.
Request
PATCH /datastores/myStore/content?mode=delete HTTP/1.1
Host: localhost
[The facts/rules to be deleted in a format supported by RDFox]
Response
HTTP/1.1 204 No Content
Java API
InputStream input = ...;
ImportNotificationMonitor importNotificationMonitor = ...;
ImportResult result = dsConn.importData(UpdateType.DELETE, prefixes, input, importNotificationMonitor);
8.4.6.4. Deleting All Data Store Content¶
The following request clears all data store content — that is, it removes all triples (in the default graph and all named graphs), all facts in all tuple tables, and all rules.
Request
DELETE /datastores/myStore/content HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
dsConn.clear();
8.4.6.5. Replacing All Data Store Content¶
The following request clears all data store content — that is, it removes all triples (in the default graph and all named graphs), all facts in all tuple tables, and all rules — and then adds the specified content to the data store. The request and response formats follow the same structure as in the case of incremental addition.
Request
PUT /datastores/myStore/content HTTP/1.1
Host: localhost
[The facts/rules in a format supported by RDFox]
Response
HTTP/1.1 204 No Content
The Java API does not have a separate ‘replace content’ primitive.
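A comparable effect can be obtained in the Java API by combining the existing primitives inside one read-write transaction, roughly as in the following sketch (the TransactionType.READ_WRITE constant is assumed here by analogy with TransactionType.READ_ONLY used later in this section; error handling is minimal):

Prefixes prefixes = ...;
InputStream input = ...;
ImportNotificationMonitor importNotificationMonitor = ...;
dsConn.begin(TransactionType.READ_WRITE);
boolean committed = false;
try {
    // Remove all existing content and import the replacement content atomically.
    dsConn.clear();
    dsConn.importData(UpdateType.ADD, prefixes, input, importNotificationMonitor);
    dsConn.commit();
    committed = true;
} finally {
    if (!committed)
        dsConn.rollback();
}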
8.4.6.6. Determining the Number of Threads¶
A parallel data store provides a parameter that determines the number of threads that the store will use for importation of data and reasoning. This parameter is initially always one, and it can be changed as needed while the store is being used.
8.4.6.7. Unique Identifier of a Data Store¶
Upon creation, each data store is assigned an identifier that is, with a high degree of probability, unique across servers. Clients can use this identifier to ensure that they are referring to the same data store in different API calls.
8.4.7. In-Depth Diagnostic Information¶
RDFox can report extensive diagnostic information about its internal components, which is often useful during performance tuning and debugging. Please note that this call is intended for diagnostic purposes only. The information provided is determined by RDFox internals and is likely to change in future versions of RDFox. Thus, applications should not rely on this information being stable.
Diagnostic information can be retrieved at the level of the server (by querying the / key) or for a specific data store (by querying the appropriate subkey of /datastores). In either case, the component-info request parameter can be specified with values short or extended to determine whether a shortened or an extended report should be returned. The result is organized in a tree of hierarchical components. The component at the root of this tree represents a data store, and it contains a number of subcomponents that represent various parts of the data store. For example, there is a subcomponent representing the data store dictionary, a subcomponent for each tuple table, a subcomponent for each attached data source, and so on. The structure of the component tree is determined by the data store type. The state of each component in the tree is described using a list of property/value pairs, where values can be strings or numeric values.

To output this complex data structure, the RESTful API converts the component tree into a list as follows. Each component in the tree is assigned an integer component ID using depth-first traversal (with the root being assigned ID one). Then, the tree is serialized as a result of a query containing three variables. In each result to the query, variable ?ComponentID contains the ID of the component, variable ?Property contains the name of the property describing the component with the ID stored in ?ComponentID, and variable ?Value represents the property value. For each component, the result contains a row with ?Property="Component name" and where ?Value contains the name of the component. Finally, for each component other than the root, the result contains a row with ?Property="Parent component ID" and where ?Value contains the ID of the parent component.
Request
GET /datastores/myStore?component-info=extended HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
ComponentID,Property,Value
1,Component name,RDFStore
1,Name,TestDataStore
1,Unique ID,0123456789
1,Type,par-complex-nn
1,Concurrent,yes
... etc ...
2,Component name,Parameters
2,Parent component ID,1
2,by-levels,true
2,equality,off
2,use-DRed,false
... etc ...
3,Component name,Dictionary
3,Parent component ID,1
3,Resource mapping size,704
3,Aggregate size,12586653
... etc ...
In the above example, diagnostic information is requested for data store myStore. The root result is the component with ID 1, which represents the data store. Properties such as Name, Unique ID, and so on provide information about the data store. The component with ID 2 is a subcomponent of the data store. It provides information about the parameters that the data store was created with, such as by-levels and equality. Analogously, the subcomponent with ID 3 of the data store provides information about the data store dictionary, such as Resource mapping size and Aggregate size.
Java API
ComponentInfo componentInfo = sConn.getComponentInfo(true);
// ... or ...
ComponentInfo componentInfo = dsConn.getComponentInfo(true);
8.4.8. Managing Statistics¶
Like most databases, RDFox uses various statistics about the data it contains during its operation. These are mainly used for query planning: when determining how to evaluate a query efficiently, RDFox consults information gathered from the data in a data store in order to estimate which query evaluation plan is more likely to be efficient. These statistics can be managed explicitly through the Java and REST APIs. Configuring the available statistics is largely of interest to system administrators. Moreover, after large updates (e.g., after a large amount of data is added to the system), it is advisable to update the statistics — that is, to request RDFox to recompute all summaries from the data currently available in the system.
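As an illustration, a bulk import followed by a refresh of all statistics might look as follows in the Java API (a sketch reusing the importData and updateStatistics calls described elsewhere in this section):

Prefixes prefixes = ...;
InputStream input = ...;
ImportNotificationMonitor importNotificationMonitor = ...;
dsConn.importData(UpdateType.ADD, prefixes, input, importNotificationMonitor);
// Recompute all statistics so that the query planner works with up-to-date summaries.
dsConn.updateStatistics();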
8.4.8.1. Listing the Available Statistics¶
The following request retrieves the list of statistics currently available in a data store. The response is written as an output of a SPARQL query that binds variable ?Name to the name of the statistics, and variable ?Parameters to a string describing the statistics parameters (with all key-value pairs concatenated as in a query string).
Request
GET /datastores/myStore/stats HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Name,Parameters
column-counts,
[...]
Java API
List<StatisticsInfo> statisticsInfos = dsConn.listStatistics();
8.4.8.2. Creating Statistics¶
The following request creates new statistics. One can supply to the request a number of key-value pairs that govern how the statistics are generated. The location of the new statistics is returned in the Location header.
Request
POST /datastores/myStore/stats/column-counts HTTP/1.1
Host: localhost
Response
HTTP/1.1 201 CREATED
Location: /datastores/myStore/stats/column-counts
Java API
Map<String, String> parameters = new HashMap<String, String>();
dsConn.createStatistics("column-counts", parameters);
8.4.8.3. Deleting Statistics¶
The following request deletes the statistics with the given name.
Request
DELETE /datastores/myStore/stats/column-counts HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
dsConn.deleteStatistics("column-counts");
8.4.8.4. Retrieving Information About Statistics¶
The following request retrieves information about statistics. The
response is written as an output of a SPARQL query that binds variables
?Property
and ?Value
. The exact properties and values are
determined by the statistics.
Request
GET /datastores/myStore/stats/column-counts HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value
name,column-counts
The StatisticsInfo class encapsulates information about the statistics in the Java API. Instances of this class are immutable.
Java API
StatisticsInfo statisticsInfo = dsConn.describeStatistics("column-counts");
8.4.9. Updating Statistics¶
The following request updates all statistics currently present in the data store.
Request
PUT /datastores/myStore/stats HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
StatisticsInfo statisticsInfo = dsConn.updateStatistics();
The following request updates only the statistics with the given name.
Request
PUT /datastores/myStore/stats/column-counts HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
StatisticsInfo statisticsInfo = dsConn.updateStatistics("column-counts");
8.4.10. Accessing the Dictionary¶
The main purpose of a data store dictionary is to resolve resources (i.e., IRIs, strings, integers, dates, and so on) into integer resource IDs, which are then used internally to store facts. These IDs are immutable as long as a data store is not reinitialized. Although many users never need to use the dictionary directly, exposing the dictionary allows for more efficient communication. In particular, in the proprietary SPARQL answer format with MIME type application/sparql-results+resourceid, each query answer is written out as a sequence of 64-bit resource IDs, each corresponding to a resource in the dictionary. By caching resource IDs between calls, an application can considerably reduce the communication overheads.
Resource IDs can be resolved in the RESTful API by submitting a GET request to the /dictionary key of a data store. The IDs to be resolved are submitted as a comma-separated list in the id or ids request parameter (i.e., the two parameters are synonyms). The result of dictionary lookup is written as an output of a SPARQL query that binds variable ?ResourceID to the resource ID, variable ?LexicalForm to the lexical form of the resource, and variable ?DatatypeID to an ID representing the datatype. The order of the resources in the response matches the order of the resources in the ids parameter. Moreover, if a particular resource ID is not in the dictionary, then the resource ID occurs in the answer with ?LexicalForm and ?DatatypeID unbound. Finally, datatypes are encoded as IDs as shown in the following table (note that datatype ID with value zero stands for a special datatype whose only value represents the unbound value):
DatatypeID | Datatype
---|---
0 | Unbound value
1 | Blank node
2 | IRI reference
3 | http://www.w3.org/2000/01/rdf-schema#Literal
4 | http://www.w3.org/2001/XMLSchema#anyURI
5 | http://www.w3.org/2001/XMLSchema#string
6 | http://www.w3.org/1999/02/22-rdf-syntax-ns#PlainLiteral
7 | http://www.w3.org/2001/XMLSchema#boolean
8 | http://www.w3.org/2001/XMLSchema#dateTime
9 | http://www.w3.org/2001/XMLSchema#dateTimeStamp
10 | http://www.w3.org/2001/XMLSchema#time
11 | http://www.w3.org/2001/XMLSchema#date
12 | http://www.w3.org/2001/XMLSchema#gYearMonth
13 | http://www.w3.org/2001/XMLSchema#gYear
14 | http://www.w3.org/2001/XMLSchema#gMonthDay
15 | http://www.w3.org/2001/XMLSchema#gDay
16 | http://www.w3.org/2001/XMLSchema#gMonth
17 | http://www.w3.org/2001/XMLSchema#duration
18 | http://www.w3.org/2001/XMLSchema#yearMonthDuration
19 | http://www.w3.org/2001/XMLSchema#dayTimeDuration
20 | http://www.w3.org/2001/XMLSchema#double
21 | http://www.w3.org/2001/XMLSchema#float
22 | http://www.w3.org/2001/XMLSchema#decimal
23 | http://www.w3.org/2001/XMLSchema#integer
24 | http://www.w3.org/2001/XMLSchema#nonNegativeInteger
25 | http://www.w3.org/2001/XMLSchema#nonPositiveInteger
26 | http://www.w3.org/2001/XMLSchema#negativeInteger
27 | http://www.w3.org/2001/XMLSchema#positiveInteger
28 | http://www.w3.org/2001/XMLSchema#long
29 | http://www.w3.org/2001/XMLSchema#int
30 | http://www.w3.org/2001/XMLSchema#short
31 | http://www.w3.org/2001/XMLSchema#byte
32 | http://www.w3.org/2001/XMLSchema#unsignedLong
33 | http://www.w3.org/2001/XMLSchema#unsignedInt
34 | http://www.w3.org/2001/XMLSchema#unsignedShort
35 | http://www.w3.org/2001/XMLSchema#unsignedByte
Request
GET /datastores/myStore/dictionary?ids=3,9,6,5000 HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
ResourceID,LexicalForm,DatatypeID
3,Peter,2
9,42,5
6,Stewie,3
5000,,
Java API
long[] resourceIDs = new long[] { 3, 6, 9, 5000 };
GroundTerm[] groundTerms = new GroundTerm[4];
dsConn.getGroundTerms(resourceIDs, groundTerms);
8.5. Managing Data Sources¶
RDFox can access external data stored in different kinds of data sources. Currently, a data source can be a CSV/TSV file, a PostgreSQL database, or an ODBC database. Data sources are managed as described in this section. The modification functions described in this section are not transactional: they are applied immediately, and in fact their invocation fails if the connection has an active transaction. Consequently, there is no way to roll back the effects of these functions.
8.5.1. Listing the Available Data Sources¶
The following request retrieves the list of data sources of a data store. The response is written as an output of a SPARQL query that binds variable ?Name to the name of the data source, variable ?Type to the data source type, variable ?Parameters to a string describing the data source parameters (with all key-value pairs concatenated as in a query string), and variable ?NumberOfTables to the number of tables in the data source.
Request
GET /datastores/myStore/datasources HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Name,Type,Parameters,NumberOfTables
F1,PostgreSQL,connection-string=postgresql://user:pw@localhost:5432/DB,2
DBpedia,DelimitedFile,"file=/table.csv&delimiter=,",1
[...]
Java API
List<DataSourceInfo> dataSourceInfos = dsConn.listDataSources();
8.5.2. Creating a Data Source¶
The following request creates a new data source. The data source name is encoded in the request URL, and the request also accepts parameters that specify the data source type and a number of key-value pairs determining the data source options.
Request
POST /datastores/myStore/datasources/mySource?type=PostgreSQL;key1=val1;key2=val2 HTTP/1.1
Host: localhost
Response
HTTP/1.1 201 CREATED
Location: /datastores/myStore/datasources/mySource
Java API
Map<String, String> parameters = new HashMap<String, String>();
parameters.put("key1", "val1");
parameters.put("key2", "val2");
dsConn.createDataSource("mySource", "PostgreSQL", parameters);
8.5.3. Deleting a Data Source¶
The following request deletes a data source. The request succeeds only if no tuple tables are mounted from the data source. Thus, to delete a data source, one must first delete all rules mentioning any tuple tables of the data source, and then delete all tuple tables mounted from the data source.
Request
DELETE /datastores/myStore/datasources/mySource HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
dsConn.deleteDataSource("mySource");
8.5.4. Retrieving Information About a Data Source¶
The following request retrieves information about a data source. The
response is written as an output of a SPARQL query that binds variables
?Property
and ?Value
. What exact properties and values are
supported depends on the data source. The names of all parameters
specified at the time the tuple table was created are prefixed with
parameters.
Request
GET /datastores/myStore/datasources/mySource HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value
name,mySource
type,PostgreSQL
tables,3
... etc ...
The DataSourceInfo class encapsulates information about a data source in the Java API. Instances of this class are immutable.
Java API
DataSourceInfo dataSourceInfo = dsConn.describeDataSource("mySource");
8.5.5. Listing the Data Source Tables of a Data Source¶
The following request retrieves the list of data source tables of a data source. The response is written as an output of a SPARQL query that binds variable ?Name to the name of a data source table, variable ?NumberOfColumns to the number of columns in the table, and variable ?Columns to a percent-encoded string describing the table columns using the form name1=dt1&name2=dt2&... where namei is the column name, and dti is the column datatype.
Request
GET /datastores/myStore/datasources/mySource/tables HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Name,NumberOfColumns,Columns
drivers,2,id=integer&name=string
constructors,3,key=integer&name=string&address=string
Java API
List<DataSourceTableInfo> dataSourceTableInfos = dsConn.listDataSourceTables("mySource");
8.5.6. Retrieving Information About a Data Source Table¶
The following request retrieves information about a data source table. The response is written as an output of a SPARQL query that binds variable ?Column to the integer referencing a column of a data source, variable ?Name to the column name, and variable ?Datatype to the name of the RDFox datatype that best corresponds to the datatype of the column in the data source.
Request
GET /datastores/myStore/datasources/mySource/tables/drivers HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Column,Name,Datatype
1,id,http://www.w3.org/2001/XMLSchema#name
2,first_name,http://www.w3.org/2001/XMLSchema#string
3,last_name,http://www.w3.org/2001/XMLSchema#string
... etc ...
The DataSourceTableInfo class encapsulates information about a data source table in the Java API. Instances of this class are immutable.
Java API
DataSourceTableInfo dataSourceTableInfo = dsConn.describeDataSourceTable("mySource", "drivers");
8.5.7. Sampling a Data Source Table¶
The following request retrieves a sample of data from a data source table. The response is written as an output of a SPARQL query that binds the variables corresponding to the column names to the values in the columns. The limit=n request parameter determines how many rows are to be returned. RDFox supports a configurable, system-wide maximum limit on the number of returned rows, which can be used to avoid accidentally requesting large portions of a data source. The main purpose of this API is not to provide access to the data, but only to provide a sample of the data so that clients can see roughly what the source contains and then mount the corresponding tuple table.
Request
GET /datastores/myStore/datasources/mySource/tables/drivers/data?limit=20 HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
id,first_name,last_name
1,Ayrton,Senna
2,Michael,Schumacher
... etc ...
Data from data source tables is returned using cursors in the Java API. These cursors are always full — that is, all relevant data is retrieved before the call finishes. The result is unaffected by the transaction that may be associated with the connection: RDFox does not support transactions over data sources.
Java API
Cursor data = dsConn.getDataSourceTableData("mySource", "drivers", 20);
8.6. Managing Tuple Tables¶
Both types of tuple tables are managed using the same API, which is described in this section. The modification functions described in this section are not transactional: they are applied immediately, and in fact their invocation fails if the connection has an active transaction. Consequently, there is no way to roll back the effects of these functions.
8.6.1. Listing the Available Tuple Tables¶
The following request retrieves the list of tuple tables currently available in a data store. The response is written as an output of a SPARQL query that binds variable ?Name to the name of the tuple table, variable ?ID to a unique integer ID of the tuple table, variable ?NumberOfColumns to the number of columns of the tuple table, variable ?Columns to a string containing the percent-encoded column names separated by &, and optionally variable ?DataSource to the name of the data source if the tuple table is backed by a data source.
Request
GET /datastores/myStore/tupletables HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Name,ID,NumberOfColumns,Columns,DataSource
internal:triple,1,3,s&p&o
[...]
Java API
List<TupleTableInfo> tupleTableInfos = dsConn.listTupleTables();
8.6.2. Creating a Tuple Table¶
The following request creates a new tuple table, which can be either an in-memory tuple table or a tuple table backed by a data source. Creating a tuple table requires specifying the table name as part of the URI, and supplying request parameters that determine the table arity and a number of key-value pairs that either specify indexing options for in-memory tuple tables, or parameters used to mount a tuple table from a data source.
Request
POST /datastores/myStore/tupletables/myTable?arity=2;key1=val1;key2=val2 HTTP/1.1
Host: localhost
Response
HTTP/1.1 201 CREATED
Location: /datastores/myStore/tupletables/myTable
Java API
Map<String, String> parameters = new HashMap<String, String>();
parameters.put("key1", "val1");
parameters.put("key2", "val2");
dsConn.createTupleTable("myTable", parameters);
8.6.3. Deleting a Tuple Table¶
The following request deletes a tuple table, which can be either an in-memory tuple table or a tuple table backed by a data source. The request succeeds only if the tuple table is not used in any rule currently loaded into the data store.
Request
DELETE /datastores/myStore/tupletables/myTable HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
dsConn.deleteTupleTable("myTable");
8.6.4. Retrieving Information About a Tuple Table¶
The following request retrieves information about a tuple table. The response is written as an output of a SPARQL query that binds variables ?Property and ?Value. The exact properties and values are determined by the tuple table type.
Request
GET /datastores/myStore/tupletables/myTable HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value
name,internal:triple
ID,1
number-of-columns,3
... etc ...
The TupleTableInfo class encapsulates information about a tuple table in the Java API. Instances of this class are immutable.
Java API
TupleTableInfo tupleTableInfo = dsConn.describeTupleTable("myTable");
8.7. Working with Facts, Rules, and Queries¶
RDFox provides various APIs for adding and deleting facts and rules. All updates are performed within the context of a transaction, which ensures that either all changes are performed as a unit, or no changes are performed at all. The transaction API is described in more detail in Section 8.8.
Adding or deleting facts or rules might require adjusting the inferred facts. In most cases, RDFox achieves this by using highly optimized incremental reasoning algorithms, whose aim is to update the derived facts while minimizing the amount of work. This process is automatically initiated before a query is evaluated in a transaction; thus, each query evaluated in a transaction always sees the results of prior updates made on the transaction. To promote performance, incremental reasoning is initiated only when a query is issued or a transaction is committed; thus, if several updates are issued before a transaction is committed, incremental reasoning is run only once.
It is possible to explicitly request the inferred facts to be updated, but this option is mainly used to analyze reasoning performance. In particular, the command line of RDFox provides various tools that can analyze the reasoning process and present a number of statistics that can be used to spot performance hot spots.
It is generally good practice to add all rules before the facts, or to add rules and facts in an arbitrary order but grouped in a single transaction. This will usually increase the performance of the first reasoning operation.
8.7.1. Authentication¶
The RESTful API supports basic HTTP authentication. For example, to supply role name Aladdin with password OpenSesame, one should include the following header in the request:
Authorization: Basic QWxhZGRpbjpPcGVuU2VzYW1l
Since the RESTful API is stateless, this header should be included with each call — that is, the role name and password are not kept between calls.
When no Authorization header is present in a RESTful API call, the call is processed with the role name guest and password guest. To prevent anonymous access via the RESTful API, the guest role can be deleted.
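For illustration only, the following sketch issues an authenticated RESTful call using the standard java.net.http client rather than the RDFox Java API; the endpoint URL (including the port) is a placeholder and should be replaced by the address of the running endpoint:

// Inside a method declared with "throws Exception".
String credentials = Base64.getEncoder()
    .encodeToString("Aladdin:OpenSesame".getBytes(StandardCharsets.UTF_8));
HttpClient client = HttpClient.newHttpClient();
HttpRequest request = HttpRequest.newBuilder()
    .uri(URI.create("http://localhost:12110/datastores"))
    .header("Authorization", "Basic " + credentials)
    .header("Accept", "text/csv; charset=UTF-8")
    .GET()
    .build();
HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
System.out.println(response.statusCode());
System.out.println(response.body());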
8.7.2. Key-Value Pairs as Arguments¶
Several API calls take a set of key-value pairs as arguments. In the RESTful API, these can be encoded into the query string, or into the request body using the application/x-www-form-urlencoded content type for PATCH/POST/PUT requests. If a request requires both a message body and request parameters, then the request parameters must be part of the query string.
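As a sketch, such key-value pairs can be percent-encoded with the standard library (java.net.URLEncoder and java.util.stream) before being appended to the query string or sent as an application/x-www-form-urlencoded body; the parameter names below are purely illustrative:

Map<String, String> parameters = new LinkedHashMap<>();
parameters.put("type", "par-complex-nn");
parameters.put("key1", "val1");
// Join the pairs as key=value separated by '&', percent-encoding both parts.
String encoded = parameters.entrySet().stream()
    .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
    .collect(Collectors.joining("&"));
// "encoded" can now be appended after '?' in the request URL or used as the request body.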
8.7.3. Treating GET Results as Answers to SPARQL Queries¶
Many RESTful API calls return information about various parts of the data store. For example, one can list all data stores in a server, all data sources in a data store, and so on. In order to avoid introducing additional formats, the output of all such requests are formatted as answers to certain SPARQL queries. (This does not mean that such a query can be evaluated through a SPARQL endpoint; rather, it only means that the same result format is reused to represent query results.)
Answers of such queries can be serialized using any of the supported query answer formats (see Section 8.9.2) apart from application/sparql-results+resourceid. Content negotiation determines the format to be used, as usual in the SPARQL 1.1 protocol. The examples in this document use the CSV format for simplicity. All such calls accept an optional parameter with name filter, whose value must be a SPARQL 1.1 FILTER expression. If a filter expression is specified, it is evaluated for each answer in the list, and only those answers on which the expression returns true are returned.
8.7.4. Issues Surrounding Concurrent Access¶
Most RESTful API calls require acquiring a lock inside RDFox, and certain calls (e.g., loading a large amount of data) may hold this lock for long periods of time. In order to prevent API calls from being blocked indefinitely, the RESTful API will cancel a request and report an error if the lock cannot be acquired within a predetermined time period (currently hard-coded to two seconds).
8.7.5. The Data-Store-ID Header¶
The RESTful API can be accessed concurrently by different threads, which can lead to the following problem. Assume that client A is performing a series of RESTful API calls on a particular data store, and client B simultaneously deletes the data store and creates another data store with the same name. In such a case, client A might want to detect that the data store actually changed, but it cannot do so using the data store name.
To detect such situations, each data store is assigned a unique identifier when it is created. Client A can initially record this identifier, and it can supply it in subsequent RESTful API calls as the value of the proprietary Data-Store-ID header. RDFox will reject each call where the supplied identifier does not match the identifier of the data store that the call operates on, thus allowing the client to detect that the data store changed.
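A sketch of this pattern follows: the identifier is obtained via the getUniqueID call of the Java API (or from the unique-id property returned by the GET call of Section 8.4.4) and then attached to later RESTful calls; the endpoint URL, port, and query are placeholders:

String uniqueID = dsConn.getUniqueID();
String encodedQuery = ...; // a percent-encoded SPARQL query
HttpRequest request = HttpRequest.newBuilder()
    .uri(URI.create("http://localhost:12110/datastores/myStore/sparql?query=" + encodedQuery))
    .header("Data-Store-ID", uniqueID)
    .header("Accept", "text/csv; charset=UTF-8")
    .GET()
    .build();
// RDFox rejects the call if myStore has meanwhile been replaced by a different
// data store with the same name, because the recorded identifier no longer matches.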
8.8. Managing Transactions¶
In the Java API, each transaction is associated with one data store connection. The DataStoreConnection class provides begin(), commit(), and rollback() functions, which respectively start, commit, and roll back a transaction.
If no transaction is associated with a connection, then data store modification functions and query evaluation functions start a transaction that is committed or rolled back before the function finishes. In contrast, if a transaction is started on a connection when a modification/query function is called, then the operation is evaluated within the context of that transaction.
A transaction remains open in the Java API as long as it is not explicitly committed or rolled back. Closing a connection with a running transaction will roll back the transaction first.
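For example, a read-only transaction can be used to evaluate several queries against the same snapshot of the data; the sketch below reuses the begin, evaluateQuery, and rollback calls described in this and later sections, with the prefixes, parameters, and output stream as placeholders:

Prefixes prefixes = ...;
Map<String, String> parameters = ...;
OutputStream output = ...;
dsConn.begin(TransactionType.READ_ONLY);
try {
    // Both queries see the same state of the data store.
    dsConn.evaluateQuery(prefixes, "SELECT ?X WHERE { ?X a ?Y }", parameters, output, "text/csv", null);
    dsConn.evaluateQuery(prefixes, "SELECT ?Y WHERE { ?X a ?Y }", parameters, output, "text/csv", null);
} finally {
    // A read-only transaction makes no changes, so rolling back simply releases it.
    dsConn.rollback();
}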
Data store connections are single-threaded objects: attempting to use the same object in parallel from multiple threads will result in unpredictable behavior and is likely to crash the system. (However, the same data store connection object can be used from different threads at distinct time points — that is, there is no affinity between connection objects and threads.) In order to access RDFox concurrently, one should use distinct connections, each running a separate transaction.
Since the RESTful API is connectionless, there is no way to associate a transaction with a connection. Instead, each running transaction is identified with a unique ID, and the list of currently running transactions can be retrieved from the /transactions key. Access to transactions in the RESTful API is serialized: if two requests attempt to access the same transaction, one of them will fail. To use RDFox concurrently, one should use distinct transactions. In order to avoid situations where a transaction is created but never committed or rolled back, the RESTful API will roll back a transaction if it has not been used (i.e., no HTTP request accessed it) for a period longer than the value of the endpoint.object-expiry-time endpoint parameter. That is, a transaction will remain valid for at least that much time (but it may actually remain valid slightly longer).
8.8.1. Listing Currently Running Transactions¶
The following request lists the transactions currently running on a data store. The response is written as an output of a SPARQL query that binds variable ?TransactionID to the ID of a transaction, and variable ?Type to a string specifying the transaction type.
Request
GET /datastores/myStore/transactions HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
TransactionID,Type
TX1,read-write
TX2,read-only
[... other transactions ...]
The Java API provides no equivalent functionality: transactions are local to data store connections and there is no way to enumerate active connections.
8.8.2. Creating a Transaction¶
The following request creates a transaction. The type request parameter specifies the type of transaction started and can be interruptible-read-only, read-only, or read-write. If this parameter is omitted, read-write is used as the default. The created transaction ID is returned in the Location header. A transaction ID is unique for each run of an RDFox server.
Request
POST /datastores/myStore/transactions?type=read-only HTTP/1.1
Host: localhost
Response
HTTP/1.1 201 CREATED
Location: /datastores/myStore/transactions/TX101
Java API
TransactionType transactionType = TransactionType.READ_ONLY;
dsConn.begin(transactionType);
8.8.3. Committing a Transaction¶
The following request commits a transaction. Doing so invalidates the corresponding transaction ID.
Request
POST /datastores/myStore/transactions/TX101 HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
dsConn.commit();
8.8.4. Rolling Back a Transaction¶
The following request rolls back a transaction. Doing so invalidates the corresponding transaction ID.
Request
DELETE /datastores/myStore/transactions/TX101 HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content
Java API
dsConn.rollback();
8.8.5. Retrieving Information About a Transaction¶
The following request retrieves information about a transaction. The response is written as an output of a SPARQL query that binds variable ?Property to the property name, variable ?Value to the property value, and variable ?Mutable to true if the value of the property can be changed and to false otherwise.
Request
GET /datastores/myStore/transactions/TX101 HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value,Mutable
ID,TX101,false
type,interruptible-read-only,true
... etc ...
Transaction properties can be read off the data store connection in the Java API.
Java API
DataStoreConnection.TransactionState state = dsConn.getTransactionState();
8.8.6. Changing Transaction Properties¶
The following request changes transaction properties. At present, only the type property can be changed.
Request
PATCH /datastores/myStore/transactions/TX101?type=interruptible-read-only HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 204 No Content
Transaction properties can be changed using the data store connection in the Java API.
Java API
dsConn.setTransactionState(DataStoreConnection.INTERRUPTIBLE_READ_ONLY);
8.8.7. Modifying and Querying Data in a Transaction¶
In the Java API, transactional updates and queries are achieved by wrapping the relevant calls into corresponding begin()/commit() calls.

In the RESTful API, each transaction exposes the /content key that provides an API similar to the /content key of a data store — that is, all the same HTTP verbs are supported in exactly the same way, but all operations are executed within the context of the given transaction. For example, the following request schedules facts to be updated within a transaction.
Request
PATCH /datastores/myStore/transactions/TX101/content?mode=add HTTP/1.1
Host: localhost
[The facts/rules to be added in a format supported by RDFox]
Response
HTTP/1.1 204 No Content
In the RESTful API, each transaction also exposes the /sparql key that provides an API similar to the /sparql key of a data store, but where queries are evaluated within the context of the given transaction. Queries can be either retrieval (i.e., SELECT/ASK/CONSTRUCT) or update (i.e., DELETE/INSERT) queries. This key also supports offset=m;limit=n request parameters, as described in Section 8.10.
8.9. Supported Formats and Their MIME Types¶
Various API calls either require input or produce output in one of the several formats, each of which is identified by a MIME type. Each format falls in one of the following two groups.
8.9.1. Formats Encoding Data Store Content¶
The first group contains formats that encode the content of a data store, such as triples and/or rules. These formats can be used in API calls that update the data store, or that return data store content. In the RESTful API, these calls are available using the /content API keys. Specifically, the following formats are supported.

The N-Triples format has MIME type application/n-triples.
The Turtle format has MIME type text/turtle.
The N-Quads format has MIME type application/n-quads.
The TriG format has MIME type application/trig.
The OWL 2 Functional-Style Syntax format has the MIME type text/owl-functional.
RDFox uses a proprietary format described in Section 5.4 to capture datalog rules and facts. The MIME type of this format is application/x.datalog.
8.9.2. Formats Encoding SPARQL Query Results¶
The second group contains formats that encode results of SPARQL queries.
Data in these formats is produced by API calls on the /sparql
keys,
as well as GET
API calls that retrieve various information (see
Section 8.7.3).
The SPARQL 1.1 TSV Format has MIME type text/tab-separated-values.
The SPARQL 1.1 CSV Format has MIME type text/csv.
The SPARQL 1.1 XML Format has MIME type application/sparql-results+xml.
The SPARQL 1.1 JSON Format has MIME type application/sparql-results+json.
The proprietary format with MIME type application/x.sparql-results+turtle outputs each query answer in a single line that resembles Turtle. If the query has exactly three answer variables, then query answers in this format can be passed to API calls that expect Turtle data.
Proprietary formats with MIME types text/x.tab-separated-values-abbrev, text/x.csv-abbrev, application/x.sparql-results+xml-abbrev, application/x.sparql-results+json-abbrev, and application/x.sparql-results+turtle-abbrev follow the same structure as the formats mentioned above, with the difference that all IRIs are abbreviated using prefixes supplied in the query. Hence, these formats provide a more user-friendly representation of query results.
The proprietary format with MIME type application/x.sparql-results+resourceid is a simple binary format designed to speed up RDFox usage in client-server scenarios. The output of a query is serialized as follows. First, the number of answer variables is output as a 64-bit value. Next, the name of each variable is output: a 64-bit length of the variable name encoded in UTF-8, followed by the UTF-8 encoding of the variable name (without a zero terminator). Next, for each query answer, a nonzero 64-bit answer multiplicity (i.e., an integer specifying how many times a particular row appears in the answer) is output, followed by a 64-bit value for each answer variable (where a value of zero means that the answer variable is unbound, and a nonzero value identifies a resource that can be resolved to a value using the dictionary). Finally, after all answers have been output, a single 64-bit zero multiplicity is output, thus signaling the end of the answer set. By caching dictionary values on the client side, one can considerably reduce network communication overheads. This format is not available in GET calls that retrieve various lists (see Section 8.7.3). A sketch of client-side parsing of this format is shown after this list.
The proprietary format with MIME type application/x.sparql-results+null simply discards all answers. This can be useful in situations such as query benchmarking, where one may want to measure the speed of query processing without taking into account the often considerable overhead of serializing query results and transporting them over the network.
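The following minimal Java sketch shows how a client might decode the application/x.sparql-results+resourceid stream described above. It is not part of the RDFox Java API; in particular, the byte order of the 64-bit values (shown here as little-endian) and the class and method names are assumptions made for illustration.
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class ResourceIdResultsReader {

    // Reads one 64-bit value; little-endian byte order is an assumption, not stated in the documentation.
    private static long readLong(DataInputStream in) throws IOException {
        byte[] buffer = new byte[8];
        in.readFully(buffer);
        return ByteBuffer.wrap(buffer).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    public static void read(InputStream stream) throws IOException {
        DataInputStream in = new DataInputStream(stream);
        // The stream starts with the number of answer variables.
        int variableCount = (int)readLong(in);
        String[] variableNames = new String[variableCount];
        for (int i = 0; i < variableCount; ++i) {
            // Each variable name is a 64-bit length followed by that many UTF-8 bytes (no zero terminator).
            byte[] nameBytes = new byte[(int)readLong(in)];
            in.readFully(nameBytes);
            variableNames[i] = new String(nameBytes, StandardCharsets.UTF_8);
        }
        // Each answer is a nonzero 64-bit multiplicity followed by one 64-bit value per variable;
        // a zero multiplicity signals the end of the answer set.
        for (long multiplicity = readLong(in); multiplicity != 0; multiplicity = readLong(in)) {
            for (int i = 0; i < variableCount; ++i) {
                long resourceID = readLong(in); // zero means the variable is unbound in this answer
                // Nonzero IDs can be resolved to values using the dictionary (ideally via a client-side cache).
            }
        }
    }
}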
8.10. Evaluating Queries¶
The /sparql key exposes a SPARQL 1.1 endpoint implemented exactly as in the specification. Both GET and POST request methods are supported. Moreover, SELECT/ASK, CONSTRUCT, and DELETE/INSERT queries are supported. Query evaluation in RDFox can be influenced using a number of parameters, which can be passed as key-value pairs. The query result is encoded according to the requested format, and a request fails if the format does not match the query type (e.g., if a request specifies a SELECT query and the Turtle answer format).
The following is an example of a query request.
Request
GET /datastores/myStore/sparql?query=SELECT+%3FX+%3FY+%3FZ+WHERE+{+%3FX+%3FY+%3FZ+} HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
X,Y,Z
[...result of the query...]
Java API
Map<String, String> parameters = new HashMap<String, String>();
// Query evaluation supports a number of parameters that govern how
// queries are compiled. The RESTful API uses the default parameters,
// whereas the Java API allows finer control.
parameters.put(..., ...);
// While one can specify a set of prefixes in a query, it is often useful
// in applications to maintain a global set of prefixes that does not need
// to be explicitly set and parsed every time. Thus, query evaluation accepts
// a set of prefixes that can be used in the query and that will be used to
// serialize the results.
Prefixes prefixes = ...;
// The final two parameters determine the output format. Query evaluation
// can return these kinds of results:
//
// * a set of rows in case of SELECT/ASK queries,
// * an RDF graph in case of CONSTRUCT queries, or
// * nothing in the case of UPDATE queries.
//
// It is useful to have a single API that can evaluate any query, regardless
// of its type. Therefore, the following function takes three parameters:
//
// * an output stream to which answers are written (if there are any),
// * the name of a SPARQL answer format (in case of SELECT/ASK queries), and
// * the name of an RDF format (in case of CONSTRUCT queries).
//
// If the caller is sure of the query type, they can supply unused parameters as null.
OutputStream output = ...;
dsConn.evaluateQuery(prefixes, "SELECT ?X ?Y ?Z WHERE { ?X ?Y ?Z }", parameters, output, "text/csv", null);
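For comparison, a CONSTRUCT query can be evaluated through the same method by supplying an RDF format name and passing the (unused) answer format as null; the following call is a sketch that reuses the objects set up above.
dsConn.evaluateQuery(prefixes, "CONSTRUCT { ?X ?Y ?Z } WHERE { ?X ?Y ?Z }", parameters, output, null, "text/turtle");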
SPARQL supports pagination of query results using OFFSET and LIMIT query clauses; however, evaluating the same query while varying its OFFSET/LIMIT clauses may be inefficient because the query in each request is evaluated from scratch.
In the RESTful API, including the offset=m;limit=n parameters into a query request has the same effect as adding the OFFSET m LIMIT n clauses to the query. However, doing the former can be more efficient when
a user makes a query request with offset=m1;limit=n1,
the same user makes another request for exactly the same query (i.e., a query that is character-for-character identical to the previous one) with offset=m2;limit=n2 where m2 = m1 + n1, and
the data store has not been updated between these two requests.
RDFox provides no hard efficiency guarantees, but it will try to process requests containing offset=m;limit=n as efficiently as possible. Therefore, applications should use this approach to result pagination whenever possible. The endpoint.object-expiry-time option specifies the approximate amount of time between two such requests for the same query during which RDFox will aim to speed up query evaluation.
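As a sketch of this pattern (assuming that the offset and limit parameters are passed as ordinary query-string parameters alongside the query, and using an illustrative page size of 100), an application might page through the answers with consecutive requests such as the following. Because the second request repeats the first query verbatim and starts where the first left off, RDFox can attempt to resume evaluation rather than start from scratch, provided the data store has not been updated in between.
Request
GET /datastores/myStore/sparql?query=SELECT+%3FX+%3FY+%3FZ+WHERE+{+%3FX+%3FY+%3FZ+}&offset=0&limit=100 HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Request
GET /datastores/myStore/sparql?query=SELECT+%3FX+%3FY+%3FZ+WHERE+{+%3FX+%3FY+%3FZ+}&offset=100&limit=100 HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8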
SPARQL queries can be long in some applications, so sending the same query multiple times can be a considerable source of overhead. In such cases, applications can consider using cursors (see Section 8.11), where a query is submitted for execution just once. These APIs, however, must be used within a transaction and are thus described in Section 7.
8.11. Cursors¶
As already mentioned in Section 8.10, RDFox supports efficient APIs for paginating query results using cursors, which provide a view into the results of a query evaluated on a ‘frozen’ snapshot of data. The concept of cursors is used in slightly different ways in the Java and the RESTful APIs, so this section discusses first the former and then the latter.
8.11.1. Cursors in the Java API¶
The Java API uses cursors to provide access to answers to queries. A cursor goes through the following life cycle.
When a cursor is created, it is in an unopened state.
Before it is used, a cursor must be opened, which positions the cursor on the first answer tuple, or at the answer end (if there are no answer tuples). Opening the cursor returns the multiplicity of the current answer, or zero if there are no answers.
Advancing a cursor returns the multiplicity of the next row. Cursors cannot go backwards — all movement is forward.
When a cursor will no longer be used, it must be closed so that any resources associated with it can be released. This must be done even when cursors are read to the end. In Java, the Cursor class implements the AutoCloseable interface so that it can be used in a try-with-resources statement.
Rows have multiplicities because SPARQL has bag semantics, and if an answer contains the same tuple n times, it can be more efficient to return the tuple once and say that the tuple’s multiplicity is n. The Java API supports cursors for SELECT/ASK and CONSTRUCT queries. A cursor for a CONSTRUCT query behaves as a cursor for a SELECT/ASK query returning variables ?S, ?P, and ?O for each constructed triple.
Regardless of the query type, the Java API supports two types of cursors, both implementing the same interface so they can be used by the same code.
Evaluating a query without a transaction returns a full cursor that contains all answers — that is, the query is evaluated in full and the answers are stored in the cursor before the corresponding query evaluation call finishes.
Evaluating a query in a transaction returns a live cursor, which tries to produce query answers only when they are accessed. A live cursor is typically returned whenever the query does not contain ordering or aggregation. Hence, live cursors provide an efficient mechanism for paginating query results since query answers are produced on demand. In particular, if an answer is never accessed, no time is wasted in producing it. A live cursor can return answers as long as the transaction that created it remains active (i.e., it is not committed or rolled back). Using a cursor created from a transaction that has been committed results in undefined behavior and can cause crashes. A live cursor can be safely closed after the transaction it was created on has been committed; moreover, committing or rolling back a transaction does not automatically close all associated cursors.
Cursors (full or live) are typically used in the Java API as follows.
Map<String, String> parameters = new HashMap<String, String>();
// Initialise parameters that govern query evaluation.
parameters.put(..., ...);
// Initialise the prefixes that may be used in the query and in the serialization of results.
Prefixes prefixes = ...;
// Create the cursor.
Cursor crs = dsConn.createCursor(prefixes, "SELECT ?X ?Y ?Z WHERE { ?X ?Y ?Z }", parameters);
for (long multiplicity = crs.open(); multiplicity != 0; multiplicity = crs.advance()) {
// Read the current answer
}
crs.close();
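Since Cursor implements AutoCloseable, the same loop can also be written using a try-with-resources statement, which guarantees that the cursor is closed even if processing the answers throws an exception. The following is a sketch that reuses the objects prepared above.
try (Cursor crs = dsConn.createCursor(prefixes, "SELECT ?X ?Y ?Z WHERE { ?X ?Y ?Z }", parameters)) {
    for (long multiplicity = crs.open(); multiplicity != 0; multiplicity = crs.advance()) {
        // Read the current answer
    }
}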
8.11.2. Cursors in the RESTful API¶
The RESTful API supports efficient query result pagination using the
offset=m;limit=n
request parameters (see
Section 8.10). However, this style of result
pagination requires resending the same query in each request, which can
be inefficient. Therefore, the RESTful API also explicitly exposes
cursors. (Full cursors are not exposed because their functionality is
identical to the standard query evaluation APIs.) Since live cursors are
associated with the transaction they are created on, cursor management
is performed using the /cursors
key under the URL representing the
corresponding transaction. Each open cursor is identified by an ID
exposed under the /cursors
key. Moreover, transactions are owned by
users, so cursors are (indirectly) owned by users as well; thus, if a
user accessing a cursor is not the same as the user that created the
cursor, the 401 Unauthorized
response code is returned. When a
transaction is committed or rolled back, all cursors associated with the
transaction are deleted. Each cursor exposed by the RESTful API
maintains its position, and there is an API allowing users to query the
current cursor position.
8.11.2.1. Listing Available Cursors¶
The following request retrieves the list of cursors available on a
server transaction. The response is written as an output of a SPARQL
query that binds variable ?CursorID
to the cursor ID.
Request
GET /datastores/myStore/transactions/TX101/cursors HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
CursorID
CRS101
CRS102
8.11.2.2. Creating a Cursor¶
A cursor is created by submitting the query to the /cursors key using the POST method of the SPARQL 1.1 Protocol. If the cursor contains no answers, the response includes the proprietary X-Cursor-Exhausted: true header and the 204 No Content status code, and the cursor is not created. Otherwise, the location of the new cursor is returned in the Location header. In order to reduce the number of round trips to the server, this request also returns the first batch of answers. The request can include the limit=n parameter determining how many rows should be returned; if this parameter is omitted, then all remaining answers are returned. The request can include limit=0 to specify that no answers should be returned, in which case the response contains the 201 Created status code. In all such cases, the request must specify an Accept header to determine the format of the returned data.
Request
POST /datastores/myStore/transactions/TX101/cursors?limit=0 HTTP/1.1
Host: localhost
Content-Type: application/sparql-query
Content-Length: 34
SELECT ?X ?Y ?Z WHERE { ?X ?Y ?Z }
Response
HTTP/1.1 201 Created
Location: /datastores/myStore/transactions/TX101/cursors/CRS101
8.11.2.3. Reading a Batch of Query Answers From a Cursor¶
The following request reads the next batch of answers from a cursor. The request can include the limit=n parameter determining how many rows should be returned; if this parameter is omitted, then all remaining answers are returned. The request updates the cursor position and so is not idempotent; consequently, the request method is POST. If the cursor has no further answers to return, the response includes the proprietary X-Cursor-Exhausted: true header and the 204 No Content status code, and the cursor is automatically deleted. The request can include limit=0 to specify that no answers should be returned, in which case the response contains the 204 No Content status code. In all such cases, the request must specify an Accept header to determine the format of the returned data. Different requests on the same cursor can request different result formats.
Request
POST /datastores/myStore/transactions/TX101/cursors/CRS101?limit=10 HTTP/1.1
Host: localhost
Accept: text/csv
Response
HTTP/1.1 200 OK
[The first 10 answers to the query in CSV format]
8.11.2.4. Retrieving Cursor Information¶
The following request retrieves information about a specific cursor. The response is written as an output of a SPARQL query that binds variable ?Property to the name of a cursor property, and variable ?Value to the property value.
Request
GET /datastores/myStore/transactions/TX101/cursors/CRS101 HTTP/1.1
Host: localhost
Accept: text/csv; charset=UTF-8
Response
HTTP/1.1 200 OK
Content-Type: text/csv; charset=UTF-8
Property,Value
ID,CRS101
position,10
8.11.2.5. Closing a Cursor¶
The following request closes/deletes the cursor.
Request
DELETE /datastores/myStore/transactions/TX101/cursors/CRS101 HTTP/1.1
Host: localhost
Response
HTTP/1.1 204 No Content