5. Querying RDFox
RDFox supports most SPARQL 1.1 Query Language features, as well as the most commonly needed parts of SPARQL 1.1 Update. It also implements a few proprietary built-in functions that are not part of SPARQL 1.1.
In this section, we describe in detail the support in RDFox for the SPARQL 1.1 standard specification and the built-in functions that are specific to RDFox.
Finally, we describe the functionality implemented in RDFox for monitoring query execution.
5.1. SPARQL 1.1 Support
The SPARQL 1.1 specification provides a suite of languages and protocols for querying and manipulating RDF graph data.
5.1.1. Query Language
The core of the specification is the SPARQL 1.1 Query Language, which specifies the syntax and semantics of allowed queries. The SPARQL 1.1 query language extends the previous version of SPARQL with a number of features that are important for applications, including nested subqueries, aggregation, negation, creating values by expressions, named graphs, and property paths.
RDFox supports the SPARQL 1.1 query language in full, with three exceptions: property paths, the variant of the BNODE function that takes a single string argument (whose evaluation semantics is unclear), and the nonnormative DESCRIBE query form. The functionality provided by property paths is, however, already covered to a large extent by Datalog rules (as will be described in Section 6.4). The intended semantics of BNODE and DESCRIBE is, in our opinion, not sufficiently specified in the standard, which is why those features have not yet been implemented in RDFox.
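As a brief illustration of the SPARQL 1.1 features mentioned above, the following sketch combines aggregation with negation. The prefix IRI is hypothetical, and the :hasParent property is borrowed from the example data used later in this section; the query counts the children of every parent who has no recorded parent of their own.

```sparql
# Hypothetical prefix; substitute the one used by your data.
PREFIX : <https://example.com/>

SELECT ?parent (COUNT(?child) AS ?children)
WHERE {
  ?child :hasParent ?parent .
  # Negation: keep only parents without a recorded parent of their own.
  FILTER NOT EXISTS { ?parent :hasParent ?grandparent }
}
GROUP BY ?parent
```

The query uses only standard SPARQL 1.1 constructs and avoids property paths, which RDFox does not support.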
5.1.2. Query Answer Formats
Results of SELECT queries in SPARQL 1.1 are often represented in tabular form in applications. So that query results can be easily exchanged in a machine-readable format, the SPARQL 1.1 specification describes four common exchange formats in three different documents: XML, JSON, CSV, and TSV. All of these formats are fully supported in RDFox (see Section 13.9.2 for further details).
5.1.3. Update Language
RDFox fully supports the SPARQL 1.1 update language. In particular, this language allows users to insert triples into a store, delete triples from a store, load an RDF graph into a store, clear an RDF graph in a store, create a new RDF graph in a store, drop an RDF graph from a store, copy, move, or add the content of one RDF graph in a store to another, and perform a group of update operations as a single action.
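The following sketch shows how several of these operations can be grouped into a single update request using standard SPARQL 1.1 Update syntax. The prefix IRI, the :Backup graph name, and the data values are hypothetical and serve only as an illustration.

```sparql
# Hypothetical prefix and names, used for illustration only.
PREFIX : <https://example.com/>

# Insert two triples into the default graph.
INSERT DATA {
  :meg a :Person ;
       :forename "Meg" .
} ;

# Replace every :forename value "Meg" with "Megan".
DELETE { ?p :forename "Meg" }
INSERT { ?p :forename "Megan" }
WHERE  { ?p :forename "Meg" } ;

# Create a named graph and copy the content of the default graph into it.
CREATE GRAPH :Backup ;
COPY DEFAULT TO GRAPH :Backup
```

The operations are separated by semicolons and are submitted together as one request.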
5.2. Built-In Functions
RDFox supports a wide range of built-in operators and functions that can be used during query answering and reasoning. Concretely, RDFox supports all SPARQL 1.1 Functions and Operators, with the exception of the variant of the BNODE function that takes a single string argument, whose semantics is unclear. In addition to that, RDFox also supports most of the XPath and XQuery Functions and Operators. Finally, RDFox also provides a number of proprietary functions that are useful in practice.
There is a large overlap between the functions and operators defined in SPARQL 1.1 and those defined in XPath and XQuery. As a result, most functions supported by RDFox can be accessed using their short names, as specified in SPARQL 1.1, as well as their IRI names, as specified in XPath and XQuery. RDFox also provides a short name for many of the XPath and XQuery functions that have no SPARQL equivalent.
RDFox often provides additional overloads for the functions and operators from the SPARQL and the XPath and XQuery specifications, which improves usability. For example, one can extract the year of an xsd:gYearMonth value, and one can also compare, without restrictions, two xsd:duration values according to the partial order on durations. All such extensions are documented where the respective functions and operators are introduced.
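The two extensions mentioned above could be exercised with a query along the following lines. This is only a sketch: the literal values are arbitrary, and the exact lexical form of the results is not shown because it has not been checked against RDFox.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?year ?shorter WHERE {
  # YEAR applied to an xsd:gYearMonth value (an RDFox extension).
  BIND(YEAR("2024-03"^^xsd:gYearMonth) AS ?year)
  # Comparison of two plain xsd:duration values under the partial order.
  BIND(("P1M"^^xsd:duration < "P32D"^^xsd:duration) AS ?shorter)
}
```

Under the XML Schema partial order a month never lasts longer than 31 days, so the comparison is expected to hold. A pair such as P1M and P30D is not ordered by this partial order.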
A full list of the RDFox built-in functions and operators is given in the following sections. Functions will be presented with all their available names. If present, short names will precede the IRI names. If a function name is part of the SPARQL 1.1 specification or the XPath and XQuery specification, the name will be given as a link pointing to the respective part of the specification. We will assume the following prefix definitions.
[Table of prefix definitions: contents not preserved]
All RDFox functions and operators can be used in SPARQL queries. Furthermore, all RDFox functions and operators can also be used in rules, with the exception of NOW, RAND, UUID, and STRUUID, whose values are not determined by their arguments.
5.2.1. Operators
The RDFox operators are listed in the following table and discussed in detail next.
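Operator | Description
! | Logical not
&& | Logical and
|| | Logical or
< | Less than
<= | Less than or equal to
= | Equal to
!= | Not equal to
>= | Greater than or equal to
> | Greater than
+ (unary) | Unary plus
- (unary) | Unary minus
+ | Addition
- | Subtraction
* | Multiplication
/ | Division
idiv | Integer division
mod | Modulo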
The Boolean operators in RDFox are the logical not (!), logical and (&&), and logical or (||), which behave as defined in SPARQL 1.1.
The comparison operators in RDFox are <, <=, =, !=, >=, and >. These operators have been significantly extended when compared to the respective operators in SPARQL 1.1 and in XPath/XQuery. When compared to SPARQL, the comparison operators in RDFox have additional overloads for all the date/time and duration datatypes. When compared to XPath/XQuery, which also defines such overloads, RDFox has the following differences.
In RDFox, xsd:duration values are compared according to the partial order defined in the XML Schema specification. In contrast, in XPath/XQuery, the operators <, <=, >=, and > are only defined for the subtypes xsd:dayTimeDuration and xsd:yearMonthDuration.
Similarly, in RDFox date and time values are compared according to the partial order on dates and times defined in XML Schema. In contrast, XPath/XQuery imposes various restrictions on the allowed comparisons.
The mathematical operators in RDFox are the unary + and - operators, and the binary addition +, subtraction -, multiplication *, division /, integer division idiv, and modulo mod operators. SPARQL 1.1 defines only a subset of the above operators and their overloads in comparison to XPath/XQuery (e.g., idiv and mod are not in SPARQL 1.1). RDFox extends the XPath/XQuery behavior of these operators as outlined next.
The unary + and - operators have been extended to the datatype xsd:duration.
The binary subtraction operator - has been extended to all compatible date and time datatypes. The result of such an operation is an xsd:duration.
The binary addition and subtraction operators have been extended so that durations can be added to and subtracted from values of any of the date and time datatypes.
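The extensions listed above could be used as in the following sketch. The literal values are arbitrary, and the results are not shown because their exact lexical form has not been checked against RDFox.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?elapsed ?due ?earlier ?negated WHERE {
  # Subtracting one date from another yields a duration.
  BIND(("2024-03-15"^^xsd:date - "2024-01-01"^^xsd:date) AS ?elapsed)
  # Adding a duration to a date/time value.
  BIND(("2024-01-01T09:00:00Z"^^xsd:dateTime + "P1DT12H"^^xsd:duration) AS ?due)
  # Subtracting a duration from a date/time value.
  BIND(("2024-01-01T09:00:00Z"^^xsd:dateTime - "PT30M"^^xsd:duration) AS ?earlier)
  # Unary minus applied to a duration.
  BIND((- "P1D"^^xsd:duration) AS ?negated)
}
```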
5.2.2. Functions on Terms
The following table lists the RDF functions on terms. Most of the functions behave as specified in SPARQL 1.1. One difference is the addition of the two Boolean functions isInteger and isDecimal. The function isInteger returns true if the argument has one of the integer datatypes, while the isDecimal function returns true if its argument is of type xsd:decimal.
Another change concerns the functions IRI and URI, which take an optional second argument that specifies the base against which the first argument is resolved. If not provided, the default base is used. Furthermore, the BOUND function has been extended to operate on arbitrary expressions. The function returns "true"^^xsd:boolean when the input expression can be successfully evaluated. In particular, when the expression is a variable, its evaluation succeeds when the variable is bound.
isInteger |
isDecimal |
||
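The following sketch exercises the RDFox-specific behavior described in this section. Several details are assumptions rather than documented facts: that xsd:long counts as one of the integer datatypes, that the base passed to IRI may be given as a string, and that a failing cast is a suitable example of an expression that cannot be evaluated.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?isInt ?isDec ?resolved ?castOk WHERE {
  # Assumption: xsd:long is one of the integer datatypes.
  BIND(isInteger("42"^^xsd:long) AS ?isInt)
  BIND(isDecimal("4.2"^^xsd:decimal) AS ?isDec)
  # Assumption: the base may be supplied as a string; the IRI is hypothetical.
  BIND(IRI("people/meg", "https://example.com/") AS ?resolved)
  # BOUND over an arbitrary expression: the cast below should fail to evaluate.
  BIND(BOUND(xsd:integer("abc")) AS ?castOk)
}
```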
5.2.3. Constructor Functions
RDFox has a number of constructor functions that allow users to create a value of a particular type. The following table lists the constructor functions defined in SPARQL 1.1.
xsd:anyURI |
xsd:boolean |
xsd:date |
xsd:dateTime |
xsd:dateTimeStamp |
xsd:dayTimeDuration |
xsd:decimal |
xsd:double |
xsd:duration |
xsd:float |
xsd:gDay |
xsd:gMonth |
xsd:gMonthDay |
xsd:gYear |
xsd:gYearMonth |
xsd:integer |
xsd:string |
xsd:time |
xsd:yearMonthDuration |
RDFox additionally provides the following constructor functions for the date/time datatypes. The offset parameter in all functions is optional except in the case of DATE_TIME_STAMP.
DURATION ( year, month, day, hours, minutes, seconds ) | Constructs an xsd:duration value.
YEAR_MONTH_DURATION ( year, month ) | Constructs an xsd:yearMonthDuration value.
DAY_TIME_DURATION ( day, hours, minutes, seconds ) | Constructs an xsd:dayTimeDuration value.
DATE_TIME ( year, month, day, hour, minute, second, offset ) | Constructs an xsd:dateTime value.
DATE_TIME_STAMP ( year, month, day, hour, minute, second, offset ) | Constructs an xsd:dateTimeStamp value.
TIME ( hour, minute, second, offset ) | Constructs an xsd:time value.
DATE ( year, month, day, offset ) | Constructs an xsd:date value.
G_DAY ( day, offset ) | Constructs an xsd:gDay value.
G_MONTH ( month, offset ) | Constructs an xsd:gMonth value.
G_MONTH_DAY ( month, day, offset ) | Constructs an xsd:gMonthDay value.
G_YEAR_MONTH ( year, month, offset ) | Constructs an xsd:gYearMonth value.
G_YEAR ( year, offset ) | Constructs an xsd:gYear value.
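A minimal sketch of how these constructors might be called is given below. It assumes that the components are passed as plain integers and omits the optional offset argument, since its unit is not stated here. DATE_TIME_STAMP is left out because its offset is mandatory.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?cast ?date ?dateTime ?duration WHERE {
  # Standard SPARQL 1.1 constructor (cast from a lexical form).
  BIND(xsd:date("2024-03-15") AS ?cast)
  # RDFox constructors; integer arguments are an assumption.
  BIND(DATE(2024, 3, 15) AS ?date)
  BIND(DATE_TIME(2024, 3, 15, 9, 30, 0) AS ?dateTime)
  BIND(DAY_TIME_DURATION(1, 12, 0, 0) AS ?duration)
}
```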
5.2.4. IRI and String Functions
The IRI and string functions in SPARQL 1.1 and XPath/XQuery almost fully overlap, and RDFox provides full support for them.
ESCAPE_HTML_URI ( fn:escape-html-uri ) |
IRI_TO_URI ( fn:iri-to-uri ) |
|
CONTAINS ( fn:contains ) |
STRENDS ( fn:ends-with ) |
LCASE ( fn:lower-case ) |
SUBSTR ( fn:substring ) |
||
UCASE ( fn:upper-case ) |
||
REGEX ( fn:matches ) |
||
REPLACE ( fn:replace ) |
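Since most functions are available under both a short name and an IRI name, the same function can be called in two ways, as in the following sketch. The fn: prefix IRI shown is the one used by the SPARQL 1.1 specification; that RDFox assumes the same IRI is an assumption here.

```sparql
# Assumption: fn: is bound to the XPath functions namespace used by SPARQL 1.1.
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>

SELECT ?short ?long ?hit WHERE {
  # Short name, as specified in SPARQL 1.1.
  BIND(UCASE("rdfox") AS ?short)
  # IRI name of the same function, as specified in XPath and XQuery.
  BIND(fn:upper-case("rdfox") AS ?long)
  BIND(CONTAINS("datalog rules", "log") AS ?hit)
}
```

Both ?short and ?long are expected to be bound to the same upper-cased string.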
5.2.5. Hash Functions
The RDFox hash functions are the ones specified in SPARQL 1.1, namely MD5, SHA1, SHA256, SHA384, and SHA512.
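As a small illustration, the following query computes two digests of the string "abc" using the standard SPARQL 1.1 function names. The values in the comments are the well-known digests of that string, which SPARQL returns as lower-case hexadecimal strings.

```sparql
SELECT ?md5 ?sha1 WHERE {
  # MD5 of "abc" is 900150983cd24fb0d6963f7d28e17f72
  BIND(MD5("abc") AS ?md5)
  # SHA-1 of "abc" is a9993e364706816aba3e25717850c26c9cd0d89d
  BIND(SHA1("abc") AS ?sha1)
}
```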
5.2.6. Mathematical Functions
RDFox supports all mathematical functions from SPARQL 1.1 and most of the mathematical functions in XPath/XQuery. RDFox also provides additional functions that are useful in practice. The only nonstandard functions are MAXFN and MINFN, which take any number of arguments and return the maximum and the minimum value, respectively.
MINFN |
MAXFN |
CEIL ( fn:ceiling ) |
|
PI ( math:pi ) |
POW ( math:pow ) |
SQRT ( math:sqrt ) |
||
LOG ( math:log ) |
LOG10 ( math:log10 ) |
LOG2 |
|
EXP ( math:exp ) |
EXP10 ( math:exp10 ) |
EXP2 |
SIN ( math:sin ) |
COS ( math:cos ) |
ASIN ( math:asin ) |
ACOS ( math:acos ) |
TAN ( math:tan ) |
|
ATAN ( math:atan ) |
ATAN2 ( math:atan2 ) |
SINH |
COSH |
TANH |
ASINH |
ACOSH |
ATANH |
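The following sketch calls a few of the functions listed above by their short names. The argument values are arbitrary, and the datatypes of the results are not shown because they have not been checked against RDFox.

```sparql
SELECT ?largest ?smallest ?hypotenuse WHERE {
  # MAXFN and MINFN accept any number of arguments.
  BIND(MAXFN(3, 7, 5) AS ?largest)
  BIND(MINFN(3, 7, 5) AS ?smallest)
  # Length of the hypotenuse of a right triangle with legs 3 and 4.
  BIND(SQRT(POW(3, 2) + POW(4, 2)) AS ?hypotenuse)
}
```

Here ?largest and ?smallest are expected to be the numbers 7 and 3, respectively.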
5.2.7. Date and Time Functions
This section describes the RDFox functions on dates, times and durations. RDFox supports all SPARQL 1.1 date time functions and most XPath/XQuery date time functions. In RDFox, many of these functions have been extended to apply to all date/time values.
NOW | Returns an xsd:dateTime value representing the current moment in time.
YEAR | Returns the year component of a date/time value. Extends fn:year-from-dateTime and fn:year-from-date to all date/time datatypes with a valid year component.
MONTH | Returns the month component of a date/time value. Extends fn:month-from-dateTime and fn:month-from-date to date/time datatypes with a valid month component.
DAY | Returns the day component of a date/time value. Extends fn:day-from-dateTime and fn:day-from-date to date/time datatypes with a valid day component.
YEARS | Returns the number of years in a duration: fn:years-from-duration.
MONTHS | Returns the number of months in a duration: fn:months-from-duration.
DAYS | Returns the number of days in a duration: fn:days-from-duration.
HOURS | Returns the hours component of a date/time value. Extends fn:hours-from-dateTime, fn:hours-from-time, and fn:hours-from-duration to date/time datatypes with a valid hours component.
MINUTES | Returns the minutes component of a date/time value. Extends fn:minutes-from-dateTime, fn:minutes-from-time, and fn:minutes-from-duration to date/time datatypes with a valid minutes component.
SECONDS | Returns the seconds component of a date/time value. Extends fn:seconds-from-dateTime, fn:seconds-from-time, and fn:seconds-from-duration to date/time datatypes with a valid seconds component.
TIMEZONE | Returns the timezone component of a date/time value. Extends fn:timezone-from-dateTime, fn:timezone-from-date, and fn:timezone-from-time to date/time datatypes with a valid timezone component.
TZ | Returns the timezone component of a date/time value as a simple literal. Extended to date/time values with a valid timezone component.
DURATION_MONTHS | Returns the number of months in the internal representation
DURATION_SECONDS | Returns the number of seconds in the internal representation
TIME_ON_TIMELINE | Returns the decimal number representing a date/time value on the timeline.
TO_TIMEZONE | Adjusts the time zone of a date/time value. An extension of fn:adjust-dateTime-to-timezone, fn:adjust-date-to-timezone, and fn:adjust-time-to-timezone to all date/time datatypes. The function takes two arguments: a mandatory date/time value and an optional timezone value. If the timezone value is provided, the function returns a date/time value in the specified timezone that is equivalent to the first argument. If the timezone is not provided, the function returns the local value of its first argument with no timezone.
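A sketch of how the component accessors described in this section might be combined is given below. The literal values are arbitrary, and the results are not shown because their datatypes have not been checked against RDFox.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?years ?months ?hours ?tz ?local WHERE {
  BIND("P1Y2M3DT4H"^^xsd:duration AS ?d)
  BIND("2024-03-15T09:30:00+01:00"^^xsd:dateTime AS ?t)
  # Components of a duration.
  BIND(YEARS(?d) AS ?years)
  BIND(MONTHS(?d) AS ?months)
  BIND(HOURS(?d) AS ?hours)
  # Timezone of a date/time value as a simple literal.
  BIND(TZ(?t) AS ?tz)
  # With one argument, TO_TIMEZONE returns the local value without a timezone.
  BIND(TO_TIMEZONE(?t) AS ?local)
}
```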
5.3. Querying Tuple Tables
RDFox organizes information in a data store using tuple tables, as described in more detail in Section 4. Briefly, tuple tables include named graphs, but can also represent data stored in external data sources. RDFox provides proprietary TT expressions to access data stored in tuple tables, as shown in the following example.
Example: Assume that a binary tuple table :EmployeeName is mounted from an external data source (e.g., a database), and that it contains pairs that relate employee IDs to their names. The following query retrieves all pairs whose ID is contained in the :Manager class.
SELECT ?id ?name
WHERE {
    ?id rdf:type :Manager .
    TT :EmployeeName { ?id ?name }
}
In the above example, TT :EmployeeName { ?id ?name } retrieves all pairs of IDs and names stored in the :EmployeeName tuple table. Since the tuple table is binary, only two terms (i.e., ?id and ?name) are allowed to occur inside the TT expression. This is analogous to named graphs, where GRAPH :G { ?X ?Y ?Z } accesses all triples in a named graph :G. The differences from GRAPH expressions can be summarized as follows.
The number of terms inside TT must match the arity (i.e., the number of positions) of the tuple table.
Each TT expression represents exactly one reference to a tuple table. For example, to retrieve pairs of employee IDs with the same name, one can use TT :EmployeeName { ?id1 ?name } . TT :EmployeeName { ?id2 ?name } (whereas TT :EmployeeName { ?id1 ?name . ?id2 ?name } is syntactically invalid).
Variables cannot be used in place of tuple table names. For example, TT ?T { ?id ?name } is syntactically invalid.
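Putting the second point into a complete query, the following sketch retrieves pairs of distinct employee IDs that share a name. The prefix IRI is hypothetical, and the :EmployeeName tuple table is assumed to be mounted as in the example above.

```sparql
# Hypothetical prefix; :EmployeeName is the tuple table from the example above.
PREFIX : <https://example.com/>

SELECT ?id1 ?id2 ?name WHERE {
  # Each TT expression is one reference to the tuple table.
  TT :EmployeeName { ?id1 ?name } .
  TT :EmployeeName { ?id2 ?name } .
  # Exclude the trivial pairs in which both IDs are the same.
  FILTER(?id1 != ?id2)
}
```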
5.4. Monitoring Query Execution
RDFox implements functionality for monitoring the execution of queries. In particular, users can gain access to query plans generated by the RDFox query optimizer as well as to useful statistics about the execution of such plans.
Suppose that we initialize a data store with the example data in our Getting Started guide. The following shell command provides access to the query plans produced by RDFox:
set query.explain true
Now, let’s issue the following SPARQL query against the store:
SELECT ?p ?n WHERE { ?p rdf:type :Person . ?p :forename ?n. ?p :hasParent ?z }
which returns the following answers:
:meg "Meg" .
:stewie "Stewie" .
:chris "Chris" .
:meg "Meg" .
The shell now also displays the query plan that was actually executed:
QUERY ?p ?n QueryIterator
    PROJECT ?n ?p { --> ?n ?p }
        CONJUNCTION { --> ?n ?p ?z } NestedIndexLoopJoinIterator
            [?p, :hasParent, ?z] { --> ?p ?z } TripleTableIterator
            [?p, rdf:type, :Person] { ?p ?z --> ?p ?z } TripleTableIterator
            [?p, :forename, ?n] { ?p ?z --> ?n ?p ?z } TripleTableIterator
The query plan is executed top-down in a depth-first manner, and we can think of solution variable bindings as being generated one at a time. It is useful to walk through the execution of the plan for a single solution binding in more detail.
When we first “visit” the PROJECT block, we have not yet obtained any variable bindings (hence the empty space on the left of the “-->” symbol); in contrast, by the time we have finished executing the subplan underneath, we will have obtained a binding of variables ?n and ?p and hence an answer to the query (as reflected on the right-hand side of the “-->” symbol). Similarly, when we first visit the CONJUNCTION block, which performs the join of the query, we have an empty binding and, by the time we return from it, we will have a binding for ?n, ?p and ?z. The join is also performed top-down. First, we obtain a binding for ?p and ?z by matching the triple pattern [?p, :hasParent, ?z]. We then consider the second triple pattern [?p, rdf:type, :Person], which checks the value already bound to ?p without binding any new variables, and finally the third triple pattern [?p, :forename, ?n], which extends the binding by providing also a value for variable ?n.
Let us consider a slightly more complex query, which uses the OPTIONAL operator in SPARQL.
SELECT ?p ?n WHERE { ?p rdf:type :Person . ?p :hasParent ?z . OPTIONAL { ?p :forename ?n } }
RDFox will execute the following query plan:
QUERY ?p ?n QueryIterator
    PROJECT ?n ?p { --> ?p | ?n }
        OPTIONAL { --> ?p ?z | ?n } OptionalIterator
            CONJUNCTION { --> ?p ?z } NestedIndexLoopJoinIterator
                [?p, :hasParent, ?z] { --> ?p ?z } TripleTableIterator
                [?p, rdf:type, :Person] { ?p ?z --> ?p ?z } TripleTableIterator
            FILTER true
            [?p, :forename, ?n] { ?p ?z --> ?n ?p ?z } TripleTableIterator
The important difference to notice in this plan is the use of the “|” symbol. The variables on the left-hand side of “|” are always bound by the corresponding block, whereas those on the right-hand side may or may not be bound.
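The effect of a possibly unbound variable can be observed directly in the query answers. The following variant of the query is a sketch with a hypothetical prefix IRI; it uses the standard BOUND function to report, for each answer, whether the OPTIONAL part supplied a value for ?n.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
# Hypothetical prefix; substitute the one used by your data.
PREFIX : <https://example.com/>

SELECT ?p (BOUND(?n) AS ?hasForename) WHERE {
  ?p rdf:type :Person .
  ?p :hasParent ?z .
  # ?n stays unbound for persons without a :forename triple.
  OPTIONAL { ?p :forename ?n }
}
```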