Warning: This document is for an old version of RDFox. The latest version is 7.2.

5. Querying RDFox

RDFox supports most SPARQL 1.1 Query Language features, and it fully supports the SPARQL 1.1 Update Language. It also implements a few proprietary built-in functions that are not part of SPARQL 1.1.

In this section, we describe in detail the support in RDFox for the SPARQL 1.1 standard specification, and the built-in functions that are specific to RDFox.

Finally, we also describe the functionality implemented in RDFox for monitoring query execution.

5.1. SPARQL 1.1 Support

The SPARQL 1.1 specification provides a suite of languages and protocols for querying and manipulating RDF graph data.

5.1.1. Query Language

The core of the specification is the SPARQL 1.1 Query Language, which specifies the syntax and semantics of allowed queries. The SPARQL 1.1 query language extends the previous version of SPARQL with a number of important features for applications, including nested subqueries, aggregation, negation, creating values by expressions, named graphs, and property paths.

RDFox provides full support of the SPARQL 1.1 query language with the exception of the variant of the BNODE function that takes a single string argument and the nonnormative DESCRIBE query form. The intended semantics of BNODE and DESCRIBE is, in our opinion, not sufficiently specified in the standard, which is why those features have not yet been implemented in RDFox.

Although RDFox supports SPARQL 1.1 property paths, in many situations it might be beneficial to use reasoning instead. This is explained further in Section 6.5.3.
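As a minimal sketch of a property path, the following query follows :hasParent one or more times to retrieve each person together with all of their ancestors. The : prefix is assumed to be bound already, as in the other examples in this section, and the :hasParent property is borrowed from the Getting Started data used in Section 5.4.

```sparql
# Transitive closure of :hasParent expressed as a SPARQL 1.1 property path.
# The ':' prefix is assumed to be declared by the environment (e.g., the shell).
SELECT ?p ?ancestor
WHERE {
    ?p :hasParent+ ?ancestor
}
```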

5.1.2. Query Answer Formats

Results of SELECT queries in SPARQL 1.1 are often represented in tabular form in applications. In order for query results to be easily exchanged in a machine-readable format, the SPARQL 1.1 specification describes four common exchange formats in three different documents: XML, JSON, CSV, and TSV. All of these formats are fully supported in RDFox (see Section 14.3.2 for further details).

5.1.3. Update Language

RDFox fully supports the SPARQL 1.1 update language. In particular, this language allows users to insert triples into a store, delete triples from a store, load an RDF graph into a store, clear an RDF graph in a store, create a new RDF graph in a store, drop an RDF graph from a store, copy, move, or add the content of one RDF graph to another, and perform a group of update operations as a single action.
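The following is a small sketch of a SPARQL 1.1 update request that chains several of these operations using the standard ";" separator. The resource names (:peter, :lois, :backup) are hypothetical, and the : prefix is assumed to be bound already.

```sparql
# All names below are illustrative; ':' is assumed to be declared already.

# Insert two triples into the default graph.
INSERT DATA {
    :peter :forename "Peter" .
    :peter :hasSpouse :lois
} ;

# Delete every :hasSpouse triple from the default graph.
DELETE WHERE { ?s :hasSpouse ?o } ;

# Copy the default graph into the named graph :backup, then clear the original.
COPY DEFAULT TO GRAPH :backup ;
CLEAR DEFAULT
```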

5.2. Built-In Functions

RDFox supports a wide range of built-in operators and functions that can be used during query answering and reasoning. Concretely, RDFox supports all SPARQL 1.1 Functions and Operators, with the exception of the variant of the BNODE function that takes a single string argument, whose semantics is unclear. In addition to that, RDFox also supports most of the XPath and XQuery Functions and Operators. Finally, RDFox also provides a number of proprietary functions that are useful in practice.

There is a large overlap between the functions and operators defined in SPARQL 1.1 and those defined in XPath and XQuery. As a result, most functions supported by RDFox can be accessed using their short names, as specified in SPARQL 1.1, as well as their IRI names, as specified in XPath and XQuery. RDFox also provides a short name for many of the XPath and XQuery functions that have no SPARQL equivalent.
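As an illustration of the two naming schemes, the sketch below calls the same upper-casing function once by its SPARQL short name and once by its XPath/XQuery IRI name. If both names resolve to the same function, as described above, ?short and ?long bind to the same value. The fn: prefix is the one defined later in this section.

```sparql
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>

SELECT ?short ?long
WHERE {
    # Short name from SPARQL 1.1.
    BIND(UCASE("rdfox") AS ?short)
    # IRI name from XPath/XQuery.
    BIND(fn:upper-case("rdfox") AS ?long)
}
```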

RDFox often provides additional overloads for the functions and operators from the SPARQL and the XPath and XQuery specifications, which improves usability. So, for example, one can extract the year of an xsd:gYearMonth value, and can also compare without restrictions two xsd:duration values according to the partial order on durations. All such extensions are documented where the respective functions and operators are introduced.

A full list of the RDFox built-in functions and operators is given in the following sections. Functions will be presented with all their available names. If present, short names will precede the IRI names. If a function name is part of the SPARQL 1.1 specification or the XPath and XQuery specification, the name will be given as a link pointing to the respective part of the specification. We will assume the following prefix definitions.

XPath/XQuery Function Name Prefixes

xsd:      <http://www.w3.org/2001/XMLSchema#>

fn:      <http://www.w3.org/2005/xpath-functions#>

math:      <http://www.w3.org/2005/xpath-functions/math#>

All RDFox functions and operators can be used in SPARQL queries. Furthermore, all RDFox functions and operators can also be used in rules, with the exception of NOW, RAND, UUID, and STRUUID, whose values are not determined by their arguments.

5.2.1. Operators

The RDFox operators are listed in the following table and discussed in detail next.

RDFox Operators

!      unary logical-not

&&      binary logical-and

||      binary logical-or

<=      binary less-equal-than

>=      binary greater-equal-than

=      binary equal

<      binary less-than

>      binary greater-than

!=      binary not-equal

+      binary add

*      binary multiply

+      unary plus

-      binary subtract

/      binary divide

-      unary minus

idiv      integer division

mod      modulo operator

The Boolean operators in RDFox are the logical not ( ! ), logical and ( && ), and logical or ( || ), which behave as defined in SPARQL 1.1.

The comparison operators in RDFox are <, <=, =, !=, >= and >. These operators have been significantly extended when compared to the respective operators in SPARQL 1.1 and in XPath/XQuery. When compared to SPARQL, the comparison operators in RDFox have additional overloads for all the date/time and duration datatypes. When compared to XPath/XQuery, which also defines such overloads, RDFox has the following differences.

  • Operators = and != can be used to compare IRIs, blank nodes, and literals regardless of their type. For example, it is possible to compare an IRI with a blank node, or to compare two literals of different datatypes.

  • In RDFox, xsd:duration values are compared according to the partial order defined in the XML Schema specification. In contrast, in XPath/XQuery, the operators <, <=, >=, and > are only defined for the subtypes xsd:dayTimeDuration and xsd:yearMonthDuration.

  • Similarly, in RDFox date and time values are compared according to the partial order on dates and times defined in the XML Schema specification. In contrast, XPath/XQuery imposes various restrictions on the allowed comparisons.

The mathematical operators in RDFox are the unary + and - operators, and the binary addition +, subtraction -, multiplication *, division /, integer division idiv, and modulo mod operators.

SPARQL 1.1 defines only a subset of the above operators and their overloads in comparison to XPath/XQuery (e.g. idiv and mod are not in SPARQL 1.1). RDFox extends the XPath/XQuery behavior of these operators as outlined next.

  • The unary + and - operators have been extended to the datatype xsd:duration.

  • The binary subtraction operator - has been extended to all compatible date and time datatypes. The result of such an operation is an xsd:duration.

  • The binary addition and subtraction operators have been extended so that durations can be added to and subtracted from values of any of the date and time datatypes.
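The extensions listed above might be exercised as in the following sketch. The literal values are arbitrary, and the query relies on the RDFox-specific overloads described in this section, so other SPARQL engines may reject or mis-evaluate it.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?elapsed ?deadline ?shorter
WHERE {
    # Subtracting two dates; per the text above, the result is an xsd:duration.
    BIND("2024-03-01"^^xsd:date - "2024-01-01"^^xsd:date AS ?elapsed)
    # Adding a duration to a date/time value.
    BIND("2024-01-01T09:00:00Z"^^xsd:dateTime + "P1DT2H"^^xsd:duration AS ?deadline)
    # Comparing two general xsd:duration values under the partial order.
    BIND("P1M"^^xsd:duration < "P32D"^^xsd:duration AS ?shorter)
}
```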

5.2.2. Functions on Terms

The following table lists the RDFox functions on terms. Most of the functions behave as specified in SPARQL 1.1. One difference is the addition of the two Boolean functions isInteger and isDecimal: isInteger returns true if its argument has one of the integer datatypes, while isDecimal returns true if its argument is of type xsd:decimal. Another change concerns the functions IRI and URI, which take an optional second argument that specifies the base against which the first argument is resolved. If not provided, the default base is used. Furthermore, the BOUND function has been extended to operate on arbitrary expressions: it returns "true"^^xsd:boolean when the input expression can be successfully evaluated. In particular, when the expression is a variable, its evaluation succeeds when the variable is bound.

Functions on terms

BOUND

isIRI

isURI

isBlank

isLiteral

isNumeric

isInteger

isDecimal

sameTerm

IRI

STR

URI

BNODE

STRDT

STRLANG

UUID

STRUUID

LANG

DATATYPE
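The RDFox-specific behavior described at the start of this section could be tried out with a query along the following lines. This is only a sketch: passing the base to IRI as a string is an assumption, and the example base https://example.com/ is hypothetical.

```sparql
SELECT ?x ?isInt ?isDec ?evaluates ?resolved
WHERE {
    # An integer, a decimal, and a plain string.
    VALUES ?x { 42 4.2 "42" }
    BIND(isInteger(?x) AS ?isInt)
    BIND(isDecimal(?x) AS ?isDec)
    # BOUND applied to an expression rather than a variable (RDFox extension).
    BIND(BOUND(?x + 1) AS ?evaluates)
    # Two-argument IRI: the second argument is assumed to be accepted as a string.
    BIND(IRI("employee/1", "https://example.com/") AS ?resolved)
}
```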

5.2.3. Constructor Functions

RDFox has a number of constructor functions that allow users to create a value of a particular type. The following table lists the constructor functions defined in SPARQL 1.1.

RDFox Constructor Functions

xsd:anyURI

xsd:boolean

xsd:date

xsd:dateTime

xsd:dateTimeStamp

xsd:dayTimeDuration

xsd:decimal

xsd:double

xsd:duration

xsd:float

xsd:gDay

xsd:gMonth

xsd:gMonthDay

xsd:gYear

xsd:gYearMonth

xsd:integer

xsd:string

xsd:time

xsd:yearMonthDuration

RDFox additionally provides the following constructor functions for the date/time datatypes. The offset parameter in all functions is optional except in the case of DATE_TIME_STAMP.

RDFox Constructor Functions for Date/Time datatypes.

DURATION ( year, month, day, hours, minutes, seconds )

Constructs an xsd:duration value.

YEAR_MONTH_DURATION ( year, month )

Constructs an xsd:duration value.

DAY_TIME_DURATION ( day, hours, minutes, seconds )

Constructs an xsd:duration value.

DATE_TIME ( year, month, day, hour, minute, second, offset )

Constructs an xsd:dateTime value.

DATE_TIME_STAMP ( year, month, day, hour, minute, second, offset )

Constructs an xsd:dateTimeStamp value.

TIME ( hour, minute, second, offset )

Constructs an xsd:time value.

DATE ( year, month, day, offset )

Constructs an xsd:date value.

G_DAY ( day, offset )

Constructs an xsd:gDay value.

G_MONTH ( month, offset )

Constructs an xsd:gMonth value.

G_MONTH_DAY ( month, day, offset )

Constructs an xsd:gMonthDay value.

G_YEAR_MONTH ( year, month, offset )

Constructs an xsd:gYearMonth value.

G_YEAR ( year, offset )

Constructs an xsd:gYear value.
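A possible use of these constructors is sketched below. It assumes that the components are passed as plain numeric literals and omits the optional offset argument, since its expected form is not described here.

```sparql
SELECT ?meeting ?day ?month ?span
WHERE {
    # 15 January 2024 at 10:30:00, with the optional offset omitted.
    BIND(DATE_TIME(2024, 1, 15, 10, 30, 0) AS ?meeting)
    BIND(DATE(2024, 1, 15) AS ?day)
    BIND(G_YEAR_MONTH(2024, 1) AS ?month)
    # 1 year, 2 months, 3 days, 4 hours, 5 minutes, 6 seconds.
    BIND(DURATION(1, 2, 3, 4, 5, 6) AS ?span)
}
```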

5.2.4. IRI and String Functions

The IRI and string functions in SPARQL 1.1 and XPath/XQuery almost fully overlap, and RDFox provides full support for them.

IRI and String Functions

ENCODE_FOR_URI      ( fn:encode-for-uri )

ESCAPE_HTML_URI      ( fn:escape-html-uri )

IRI_TO_URI      ( fn:iri-to-uri )

CONTAINS      ( fn:contains )

STRENDS      ( fn:ends-with )

LCASE      ( fn:lower-case )

STRSTARTS      ( fn:starts-with )

STRLEN      ( fn:string-length )

SUBSTR      ( fn:substring )

STRAFTER      ( fn:substring-after )

STRBEFORE      ( fn:substring-before )

UCASE      ( fn:upper-case )

CONCAT      ( fn:concat )

langMatches

REGEX      ( fn:matches )

REPLACE      ( fn:replace )

5.2.5. Hash Functions

The RDFox hash functions are the ones specified in SPARQL 1.1 and are given in the following table.

RDFox Hash Functions

MD5

SHA1

SHA256

SHA384

SHA512

5.2.6. Mathematical Functions

RDFox supports all mathematical functions from SPARQL 1.1 and most of the mathematical functions in XPath/XQuery. RDFox also provides additional functions that are useful in practice. The only nonstandard functions are MAXFN and MINFN, which take any number of arguments and return the maximum and the minimum value, respectively.

Basic Functions

IF

COALESCE

MINFN

MAXFN

ABS      ( fn:abs )

ROUND      ( fn:round )

CEIL      ( fn:ceiling )

FLOOR      ( fn:floor )

RAND

PI      ( math:pi )

Exponential and Power Functions

POW      ( math:pow )

SQRT      ( math:sqrt )

CBRT

HYPOT

LOG      ( math:log )

LOG10      ( math:log10 )

LOG2     

EXP      ( math:exp )

EXP10      ( math:exp10 )

EXP2

Trigonometric Functions

SIN      ( math:sin )

COS      ( math:cos )

ASIN      ( math:asin )

ACOS      ( math:acos )

TAN      ( math:tan )

ATAN      ( math:atan )

ATAN2      ( math:atan2 )

Hyperbolic Functions

SINH

COSH

TANH

ASINH

ACOSH

ATANH

Error and Gamma Functions

ERF

ERFC

GAMMA

LGAMMA
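As a brief sketch of the nonstandard MINFN and MAXFN functions alongside a function that has both a short name and an IRI name, consider the following query. The argument values are arbitrary.

```sparql
PREFIX math: <http://www.w3.org/2005/xpath-functions/math#>

SELECT ?lo ?hi ?root ?rootByIri
WHERE {
    # MINFN and MAXFN accept any number of arguments.
    BIND(MINFN(3, 1, 2) AS ?lo)
    BIND(MAXFN(3, 1, 2) AS ?hi)
    # The same square-root function by short name and by IRI name.
    BIND(SQRT(16) AS ?root)
    BIND(math:sqrt(16) AS ?rootByIri)
}
```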

5.2.7. Date and Time Functions

This section describes the RDFox functions on dates, times and durations. RDFox supports all SPARQL 1.1 date time functions and most XPath/XQuery date time functions. In RDFox, many of these functions have been extended to apply to all date/time values.

RDFox Date Time Functions

NOW

Returns an xsd:dateTime value representing the current moment in time.

YEAR

Returns the year component of a date/time value. Extends fn:year-from-dateTime and fn:year-from-date to all date/time datatypes with a valid year component.

MONTH

Returns the month component of a date/time value. Extends fn:month-from-dateTime and fn:month-from-date to date/time datatypes with a valid month component.

DAY

Returns the day component of a date/time value. Extends fn:day-from-dateTime and fn:day-from-date to date/time datatypes with a valid day component.

YEARS

Returns the number of years in a duration: fn:years-from-duration.

MONTHS

Returns the number of months in a duration: fn:months-from-duration.

DAYS

Returns the number of days in a duration: fn:days-from-duration.

HOURS

Returns the hours component of a date/time value. Extends fn:hours-from-dateTime, fn:hours-from-time, and fn:hours-from-duration to date/time datatypes with a valid hours component.

MINUTES

Returns the minutes component of a date/time value. Extends fn:minutes-from-dateTime, fn:minutes-from-time, and fn:minutes-from-duration to date/time datatypes with a valid minutes component.

SECONDS

Returns the seconds component of a date/time value. Extends fn:seconds-from-dateTime, fn:seconds-from-time, and fn:seconds-from-duration to date/time datatypes with a valid seconds component.

TIMEZONE

Returns the timezone component of a date/time value. Extends fn:timezone-from-dateTime, fn:timezone-from-date, and fn:timezone-from-time to date/time datatypes with a valid timezone component.

TZ

Returns the timezone component of a date/time value as a simple literal. Extended to date/time values with a valid timezone component.

DURATION_MONTHS

Returns the number of months in the internal representation ($months, $seconds) of an xsd:duration value.

DURATION_SECONDS

Returns the number of seconds in the internal representation ($months, $seconds) of an xsd:duration value.

TIME_ON_TIMELINE

Returns the decimal number representing a date time value on the timeline. Note that this is the number of seconds elapsed since 0001-01-01T00:00:00 in accordance with the W3C XML Schema Definition.

TO_TIMEZONE

Adjusts the time zone of a date/time value. An extension of fn:adjust-dateTime-to-timezone, fn:adjust-date-to-timezone, and fn:adjust-time-to-timezone to all date/time datatypes. The function takes two arguments: a mandatory date/time value and an optional timezone value. If the timezone value is provided, the function returns a date/time value in the specified timezone that is equivalent to the first argument. If the timezone is not provided, the function returns the local value of its first argument with no timezone.
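Some of the functions in the table above can be combined as in the following sketch. The literal values are arbitrary, and TO_TIMEZONE is called with a single argument so that no assumption about the form of the timezone argument is needed.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?y ?m ?durMonths ?durSeconds ?local
WHERE {
    # YEAR and MONTH applied to an xsd:gYearMonth value (RDFox extension).
    BIND("2024-05"^^xsd:gYearMonth AS ?ym)
    BIND(YEAR(?ym) AS ?y)
    BIND(MONTH(?ym) AS ?m)
    # P1Y2MT3H corresponds to 14 months and 10800 seconds in the
    # ($months, $seconds) representation of XML Schema.
    BIND("P1Y2MT3H"^^xsd:duration AS ?d)
    BIND(DURATION_MONTHS(?d) AS ?durMonths)
    BIND(DURATION_SECONDS(?d) AS ?durSeconds)
    # One-argument form: local value of the argument with no timezone.
    BIND(TO_TIMEZONE("2024-05-01T12:00:00+02:00"^^xsd:dateTime) AS ?local)
}
```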

5.3. Querying Tuple Tables

RDFox organizes information in a data store using tuple tables, as described in more detail in Section 4. Briefly, tuple tables include named graphs, but can also represent data stored in external data sources. RDFox provides two ways of referring to tuple tables in queries: one uses the proprietary operator TT and the other uses the reserved IRI rdfox:TT.

Querying Tuple Tables Using TT Expressions

RDFox provides a proprietary TT operator to access data stored in tuple tables, as shown in the following example.

Example: Assume that a binary tuple table :EmployeeName is mounted from an external data source (e.g., a database), and that it contains pairs that relate employee IDs to their names. The following query retrieves all pairs whose ID is contained in the :Manager class.

SELECT ?id ?name
WHERE {
    ?id rdf:type :Manager .
    TT :EmployeeName { ?id ?name }
}

In the above example, TT :EmployeeName { ?id ?name } retrieves all pairs of IDs and names stored in the :EmployeeName tuple table. Since the tuple table is binary, only two terms (i.e., ?id and ?name) are allowed to occur inside the TT expression. This is analogous to named graphs, where GRAPH :G { ?X ?Y ?Z } accesses all triples in a named graph :G. The differences from GRAPH expressions can be summarized as follows.

  • The number of terms inside TT must match the arity (i.e., the number of positions) of the tuple table.

  • Each TT expression represents exactly one reference to a tuple table. For example, to retrieve pairs of employee IDs with the same name, one can use TT :EmployeeName { ?id1 ?name } . TT :EmployeeName { ?id2 ?name } (whereas TT :EmployeeName { ?id1 ?name . ?id2 ?name } is syntactically invalid).

  • Variables cannot be used in place of tuple table names. For example, TT ?T { ?id ?name } is syntactically invalid.
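Putting the second point into a complete query, a sketch that retrieves pairs of distinct employee IDs sharing a name might look as follows. The FILTER is an addition for illustration, and the : prefix is assumed to be bound as in the example above.

```sparql
# Self-join on the binary tuple table :EmployeeName.
# Each TT expression is one reference to the tuple table.
SELECT ?id1 ?id2 ?name
WHERE {
    TT :EmployeeName { ?id1 ?name } .
    TT :EmployeeName { ?id2 ?name }
    # Exclude the trivial pairing of an ID with itself.
    FILTER(?id1 != ?id2)
}
```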

Querying Tuple Tables Using rdfox:TT

The use of the proprietary operator TT is a syntactic extension of the SPARQL language and may result in queries being rejected by third-party libraries. To address this, RDFox provides an additional method for accessing tuple tables, which stays within the syntactic rules of the SPARQL language. This method specifies tuple table atoms using the reserved IRI rdfox:TT and RDF Collections, as shown in the following example.

Example: Consider again the tuple table :EmployeeName. We can retrieve all pairs whose ID is contained in the :Manager class using the following query.

SELECT ?id ?name
WHERE {
    ?id rdf:type :Manager .
    rdfox:TT :EmployeeName (?id ?name)
}

The reserved IRI rdfox:TT links the tuple table name :EmployeeName with the tuple table arguments encoded as an RDF Collection (i.e., (?id ?name)). The query can also be given in its equivalent expanded form.

SELECT ?id ?name
WHERE {
    ?id rdf:type :Manager .
    rdfox:TT :EmployeeName _:b0 .
    _:b0 rdf:first ?id ;
         rdf:rest _:b1.
    _:b1 rdf:first ?name ;
         rdf:rest rdf:nil.
}

In its expanded form, a tuple table atom is encoded using multiple triple patterns and, as a result, its position in the query body may be ambiguous. To avoid ambiguity, RDFox determines the position of the tuple table atom using the position of the triple pattern with subject rdfox:TT. In the above example, the position of the tuple table atom is determined by the position of the triple pattern rdfox:TT :EmployeeName _:b0 .

The following restrictions apply to the use of rdfox:TT.

  • The object of the triple pattern with subject rdfox:TT should be a well-formed RDF collection.

  • The RDF collection that encodes the tuple table arguments should be of size equal to the arity of the tuple table.

  • Variables cannot be used in place of tuple table names. For example, the use of rdfox:TT ?T (?id ?name) is invalid.
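For comparison, a query that joins the tuple table with itself can be sketched in this syntax as well, using one rdfox:TT triple pattern per reference. The FILTER is an addition for illustration, and the : and rdfox: prefixes are assumed to be bound already, as in the examples above.

```sparql
# Pairs of distinct employee IDs that share a name, in standard SPARQL syntax.
# Each rdfox:TT triple pattern is one tuple table atom of arity two.
SELECT ?id1 ?id2 ?name
WHERE {
    rdfox:TT :EmployeeName (?id1 ?name) .
    rdfox:TT :EmployeeName (?id2 ?name) .
    FILTER(?id1 != ?id2)
}
```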

5.4. Monitoring Query Execution

RDFox implements functionality for monitoring the execution of queries. In particular, users can gain access to query plans generated by the RDFox query optimizer as well as to useful statistics about the execution of such plans.

Suppose that we initialize a data store with the example data in our Getting Started guide. The following shell command provides access to the query plans produced by RDFox:

set query.explain true

Now, let’s issue the following SPARQL query against the store

SELECT ?p ?n WHERE { ?p rdf:type :Person . ?p :forename ?n. ?p :hasParent ?z }

which returns the following answers:

:meg "Meg" .
:stewie "Stewie" .
:chris "Chris" .
:meg "Meg" .

The shell now also displays the query plan that was actually executed:

QUERY ?p ?n                                                            QueryIterator
    PROJECT ?n ?p                      {          -->    ?n ?p }
        CONJUNCTION                    {          -->    ?n ?p ?z }    NestedIndexLoopJoinIterator
            [?p, :hasParent, ?z]       {          -->    ?p ?z }       TripleTableIterator
            [?p, rdf:type, :Person]    { ?p ?z    -->    ?p ?z }       TripleTableIterator
            [?p, :forename, ?n]        { ?p ?z    -->    ?n ?p ?z }    TripleTableIterator

The query plan is executed top-down in a depth-first manner, and we can think of solution variable bindings as being generated one at a time. It is useful to go through the execution of the plan for a given solution binding in more detail.

When we first “visit” the PROJECT block, we haven’t obtained any variable bindings yet (hence the empty space left of the “-->” symbol); in contrast, by the time we have finished executing the subplan underneath, we will have obtained a binding of variables ?n and ?p and hence an answer to the query (as reflected on the right-hand side of the “-->” symbol). Similarly, when we first visit the CONJUNCTION block, which performs the join of the query, we have an empty binding and, by the time we return from it, we will have a binding for ?n, ?p and ?z. The join is also performed top-down. First, we obtain a binding for ?p and ?z by matching the triple pattern [?p, :hasParent, ?z]. We then consider the second triple pattern [?p, rdf:type, :Person] and finally the third triple pattern [?p, :forename, ?n], which extends the binding by also providing a value for variable ?n.

Let us consider a slightly more complex query, which uses the OPTIONAL operator in SPARQL.

SELECT ?p ?n WHERE { ?p rdf:type :Person . ?p :hasParent ?z . OPTIONAL { ?p :forename ?n } }

RDFox will execute the following query plan:

QUERY ?p ?n                                                                  QueryIterator
    PROJECT ?n ?p                          {          -->    ?p | ?n }
        OPTIONAL                           {          -->    ?p ?z | ?n }    OptionalIterator
            CONJUNCTION                    {          -->    ?p ?z }         NestedIndexLoopJoinIterator
                [?p, :hasParent, ?z]       {          -->    ?p ?z }         TripleTableIterator
                [?p, rdf:type, :Person]    { ?p ?z    -->    ?p ?z }         TripleTableIterator
            FILTER true
                [?p, :forename, ?n]        { ?p ?z    -->    ?n ?p ?z }      TripleTableIterator

The important difference to notice in this plan is the use of the “|” symbol. The variables on the left-hand side of “|” are always bound by the corresponding block, whereas those indicated on the right-hand side may or may not be returned.