Warning: This document is for an old version of RDFox. The latest version is 7.2.

9. Querying

RDFox supports most SPARQL 1.1 Query Language features, and it fully supports the SPARQL 1.1 Update Language. It also implements a few proprietary built-in functions that are not part of SPARQL 1.1.

In this section, we describe in detail the support in RDFox for the SPARQL 1.1 standard specification, and the built-in functions that are specific to RDFox.

Finally, we also describe the functionality implemented in RDFox for monitoring query execution.

9.1. SPARQL 1.1 Support

The SPARQL 1.1 specification provides a suite of languages and protocols for querying and manipulating RDF graph data.

9.1.1. Query Language

The core of the specification is the SPARQL 1.1 Query Language, which specifies the syntax and semantics of allowed queries. The SPARQL 1.1 query language extends the previous version of SPARQL with a number of important features for applications, including nested subqueries, aggregation, negation, creating values by expressions, named graphs, and property paths. RDFox also implements a modification to the semantics of aggregate queries with empty GROUP BY clauses; this variant of the semantics was found critical for providing intuitive answers to queries such as SELECT (COUNT(*) AS ?C) WHERE { ?X ?Y ?Z } and has been adopted by most, if not all, RDF systems on the market.

RDFox provides full support of the SPARQL 1.1 query language with the exception of the variant of the BNODE function that takes a single string argument and the nonnormative DESCRIBE query form. The intended semantics of BNODE and DESCRIBE is, in our opinion, not sufficiently specified in the standard, which is why those features have not yet been implemented in RDFox.

Although RDFox supports SPARQL 1.1 property paths, in many situations it might be beneficial to use reasoning instead. This is explained further in Section 10.5.3.
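As an illustration of the kind of property path query this paragraph refers to, the following minimal sketch retrieves all ancestors of one individual. The names :meg and :hasParent are borrowed from the Getting Started data used in Section 9.6; the namespace bound to the empty prefix is an assumption made only for this example.

```sparql
PREFIX : <https://example.com/>   # hypothetical namespace

# All ancestors of :meg, following :hasParent one or more times.
SELECT ?ancestor
WHERE {
    :meg :hasParent+ ?ancestor
}
```

When such a transitive relation is queried often, materializing it with a rule, as discussed in Section 10.5.3, may be the better choice.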

9.1.2. Query Answer Formats

RDFox supports the following formats for encoding the answers to SPARQL queries.

  • The SPARQL 1.1 TSV Format has MIME type text/tab-separated-values.

  • The SPARQL 1.1 CSV Format has MIME type text/csv.

  • The SPARQL 1.1 XML Format has MIME type application/sparql-results+xml.

  • The SPARQL 1.1 JSON Format has MIME type application/sparql-results+json.

  • The proprietary format with MIME type application/x.sparql-results+turtle outputs each query answer in a single line that resembles Turtle. If the query has exactly three answer variables, then query answers in this format can be passed to API calls that expect Turtle data.

  • Proprietary formats with MIME types text/x.tab-separated-values-abbrev, text/x.csv-abbrev, application/x.sparql-results+xml-abbrev, application/x.sparql-results+json-abbrev, and application/x.sparql-results+turtle-abbrev follow the same structure as the formats mentioned above, with the difference that all IRIs are abbreviated using prefixes of the data store that the query was evaluated over. Hence, these formats provide a more user-friendly representation of query results.

  • The proprietary format with MIME type application/x.sparql-results+null simply discards all answers. This can be useful in situations such as query benchmarking, where one may want to measure the speed of query processing without taking into account the often considerable overhead of serializing query results and transporting them over the network.

  • Each format from Section 8.1 for triples/quads can be used as a query answer format for queries that return variables ?S, ?P, ?O, and optionally ?G. In such a case, each query answer is serialized as one triple/quad (where an answer is interpreted as a quad whenever variable ?G is bound).
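The last item above can be sketched with a query whose answer variables are exactly ?S, ?P, ?O, and ?G, so that each answer can be serialized as a quad in one of the formats from Section 8.1. This is only an illustration of the variable naming convention described above, not a prescribed export method.

```sparql
# Every answer binds ?S, ?P, ?O, and ?G, so it can be written out as one quad.
SELECT ?S ?P ?O ?G
WHERE {
    GRAPH ?G { ?S ?P ?O }
}
```

Dropping ?G from the SELECT clause and the GRAPH pattern would yield answers that are serialized as triples instead.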

9.1.3. Update Language

RDFox fully supports the SPARQL 1.1 update language. In particular, this language allows users to insert triples into a store, delete triples from a store, load an RDF graph into a store, clear an RDF graph in a store, create a new RDF graph in a store, drop an RDF graph from a store, copy, move, or add the content of one graph to another, and perform a group of update operations as a single action.
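A minimal sketch of a single update request that groups several of the operations listed above is shown next. It uses only standard SPARQL 1.1 Update syntax; the namespace, the property :forename, and the graph name :backup are assumptions made for illustration.

```sparql
PREFIX : <https://example.com/>   # hypothetical namespace

# Three operations submitted as one request, separated by ';'.
INSERT DATA { :peter :forename "Peter" } ;

# Replace one forename value with another.
DELETE { ?p :forename ?old }
INSERT { ?p :forename "Pete" }
WHERE  { ?p :forename ?old . FILTER(?old = "Peter") } ;

# Copy the content of the default graph into the named graph :backup.
COPY DEFAULT TO GRAPH :backup
```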

9.2. Built-In Functions

RDFox supports a wide range of built-in operators and functions that can be used during query answering and reasoning. Concretely, RDFox supports all SPARQL 1.1 Functions and Operators, with the exception of the variant of the BNODE function that takes a single string argument, whose semantics is unclear. In addition to that, RDFox also supports most of the XPath and XQuery Functions and Operators. Finally, RDFox also provides a number of proprietary functions that are useful in practice.

There is a large overlap between the functions and operators defined in SPARQL 1.1 and XPath and XQuery. As a result, most functions supported by RDFox can be accessed using their short names, as specified in SPARQL 1.1, as well as their IRI names, as specified in XPath and XQuery. RDFox also provides a short name for many of the XPath and XQuery functions that have no SPARQL equivalent.

RDFox often provides additional overloads for the functions and operators from the SPARQL and the XPath and XQuery specifications, which improves usability. So, for example, one can extract the year of an xsd:gYearMonth value, and can also compare without restrictions two xsd:duration values according to the partial order on durations. All such extensions are documented where the respective functions and operators are introduced.

A full list of the RDFox built-in functions and operators is given in the following sections. Functions will be presented with all their available names. If present, short names will precede the IRI names. If a function name is part of the SPARQL 1.1 specification or the XPath and XQuery specification, the name will be given as a link pointing to the respective part of the specification. We will assume the following prefix definitions.

XPath/XQuery Function Name Prefixes

xsd:      <http://www.w3.org/2001/XMLSchema#>

fn:      <http://www.w3.org/2005/xpath-functions#>

math:      <http://www.w3.org/2005/xpath-functions/math#>

All RDFox functions and operators can be used in SPARQL queries. Furthermore, all RDFox functions and operators can also be used in rules, with the exception of NOW, RAND, UUID, and STRUUID, whose values are not determined by their arguments.

9.2.1. Operators

The RDFox operators are listed in the following table and discussed in detail next.

RDFox Operators

!      unary logical-not

&&      binary logical-and

||      binary logical-or

<=      binary less-equal-than

>=      binary greater-equal-than

=      binary equal

<      binary less-than

>      binary greater-than

!=      binary not-equal

+      binary add

*      binary multiply

+      unary plus

-      binary subtract

/      binary divide

-      unary minus

idiv      integer division

mod      modulo operator

The Boolean operators in RDFox are the logical not ( ! ), logical and ( && ), and logical or ( || ), which behave as defined in SPARQL 1.1.

The comparison operators in RDFox are <, <=, =, !=, >= and >. These operators have been significantly extended when compared to the respective operators in SPARQL 1.1 and in XPath/XQuery. When compared to SPARQL, the comparison operators in RDFox have additional overloads for all the date/time and duration datatypes. When compared to XPath/XQuery, which also defines such overloads, RDFox has the following differences.

  • Operators = and != can be used to compare IRIs, blank nodes, and literals regardless of their type. For example, it is possible to compare an IRI with a blank node, or to compare two literals of different datatypes.

  • In RDFox, xsd:duration values are compared according to the partial order defined in the XML Schema specification. In contrast, in XPath/XQuery, the operators <, <=, >=, and > are only defined for the subtypes xsd:dayTimeDuration and xsd:yearMonthDuration.

  • Similarly, in RDFox, date and time values are compared according to the partial order on dates and times defined in the XML Schema specification. In contrast, XPath/XQuery imposes various restrictions on the allowed comparisons.
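The extended comparisons described above can be sketched as follows. The example compares two general xsd:duration values and tests an IRI and a literal for inequality; the IRI is a placeholder chosen for illustration. Note that some pairs of durations, such as P1M and P30D, are not ordered by the XML Schema partial order, so the example deliberately uses a pair that is.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?longer ?different
WHERE {
    # One year versus eleven months: ordered under the XML Schema order on durations.
    BIND("P1Y"^^xsd:duration > "P11M"^^xsd:duration AS ?longer)
    # Terms of different kinds (an IRI and a literal) tested for inequality.
    BIND(<https://example.com/a> != "a" AS ?different)
}
```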

The mathematical operators in RDFox are the unary + and - operators, and the binary addition +, subtraction -, multiplication *, division /, integer division idiv, and modulo mod operators.

SPARQL 1.1 defines only a subset of the above operators and their overloads in comparison to XPath/XQuery (e.g. idiv and mod are not in SPARQL 1.1). RDFox extends the XPath/XQuery behavior of these operators as outlined next.

  • The unary + and - operators have been extended to the datatype xsd:duration.

  • The binary subtraction operator - has been extended to all compatible date and time datatypes. The result of such an operation is an xsd:duration.

  • The binary addition and subtraction operators have been extended so that durations can be added to and subtracted from values of any of the date and time datatypes.
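A minimal sketch of the date/time arithmetic described in the list above is given next. The literal values are arbitrary and serve only to show the shape of the expressions.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?elapsed ?deadline
WHERE {
    # Subtracting two dates yields an xsd:duration.
    BIND("2024-03-01"^^xsd:date - "2024-02-01"^^xsd:date AS ?elapsed)
    # A duration can be added to a date/time value.
    BIND("2024-02-01T09:00:00Z"^^xsd:dateTime + "P1DT12H"^^xsd:duration AS ?deadline)
}
```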

9.2.2. Functions on Terms

The following table lists the RDF functions on terms. Most of the functions behave as specified in SPARQL 1.1. One difference is the addition of the two Boolean functions isInteger and isDecimal. The function isInteger returns true if its argument has one of the integer datatypes, while the function isDecimal returns true if its argument is of type xsd:decimal. Another change concerns the functions IRI and URI, which take an optional second argument that specifies the base against which the first argument is resolved. If the second argument is not provided, the default base is used. Furthermore, the BOUND function has been extended to operate on arbitrary expressions. The function returns "true"^^xsd:boolean when the input expression can be successfully evaluated. In particular, when the expression is a variable, its evaluation succeeds when the variable is bound. Finally, the function STREX extends the SPARQL function STR with the ability to compute a string representation of arbitrary RDF terms.

Functions on terms

BOUND

isIRI

isURI

isBlank

isLiteral

isNumeric

isInteger

isDecimal

sameTerm

IRI

STR

STREX

URI

BNODE

STRDT

STRLANG

UUID

STRUUID

LANG

DATATYPE
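The RDFox-specific functions on terms described above might be used as in the following sketch. The property :value and the namespace are hypothetical; the comments restate the behavior described in this section rather than verified output.

```sparql
PREFIX : <https://example.com/>   # hypothetical namespace

SELECT ?v ?isInt ?isDec ?ok ?text
WHERE {
    ?s :value ?v .
    BIND(isInteger(?v) AS ?isInt)   # true if ?v has one of the integer datatypes
    BIND(isDecimal(?v) AS ?isDec)   # true if ?v is of type xsd:decimal
    BIND(BOUND(?v + 1) AS ?ok)      # true when the expression ?v + 1 can be evaluated
    BIND(STREX(?s) AS ?text)        # string representation of an arbitrary term
}
```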

9.2.3. Constructor Functions

RDFox has a number of constructor functions that allow users to create a value of a particular type. The following table lists the constructor functions defined in SPARQL 1.1.

RDFox Constructor Functions

xsd:anyURI

xsd:boolean

xsd:date

xsd:dateTime

xsd:dateTimeStamp

xsd:dayTimeDuration

xsd:decimal

xsd:double

xsd:duration

xsd:float

xsd:gDay

xsd:gMonth

xsd:gMonthDay

xsd:gYear

xsd:gYearMonth

xsd:integer

xsd:string

xsd:time

xsd:yearMonthDuration

RDFox additionally provides the following constructor functions for the date/time datatypes. The offset parameter in all functions is optional except in the case of DATE_TIME_STAMP.

RDFox Constructor Functions for Date/Time datatypes.

DURATION ( year, month, day, hours, minutes, seconds )

Constructs an xsd:duration value.

YEAR_MONTH_DURATION ( year, month )

Constructs an xsd:duration value.

DAY_TIME_DURATION ( day, hours, minutes, seconds )

Constructs an xsd:duration value.

DATE_TIME ( year, month, day, hour, minute, second, offset )

Constructs an xsd:dateTime value.

DATE_TIME_STAMP ( year, month, day, hour, minute, second, offset )

Constructs an xsd:dateTimeStamp value.

TIME ( hour, minute, second, offset )

Constructs an xsd:time value.

DATE ( year, month, day, offset )

Constructs an xsd:date value.

G_DAY ( day, offset )

Constructs an xsd:gDay value.

G_MONTH ( month, offset )

Constructs an xsd:gMonth value.

G_MONTH_DAY ( month, day, offset )

Constructs an xsd:gMonthDay value.

G_YEAR_MONTH ( year, month, offset )

Constructs an xsd:gYearMonth value.

G_YEAR ( year, offset )

Constructs an xsd:gYear value.
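A minimal sketch of the date/time constructor functions follows. It assumes that the components are passed as integer literals in the order given by the signatures above and that the optional offset is simply omitted; both points are assumptions, since this section does not spell out the argument types.

```sparql
SELECT ?start ?length ?end
WHERE {
    # 31 January 2024, 12:00:00, with the optional offset left out.
    BIND(DATE_TIME(2024, 1, 31, 12, 0, 0) AS ?start)
    # A duration of one day and twelve hours.
    BIND(DAY_TIME_DURATION(1, 12, 0, 0) AS ?length)
    # Durations can be added to date/time values (see Section 9.2.1).
    BIND(?start + ?length AS ?end)
}
```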

9.2.4. IRI and String Functions

The IRI and string functions in SPARQL 1.1 and XPath/XQuery almost fully overlap, and RDFox provides full support for them, with the exception of fn:contains, which only supports the variant with two arguments.

IRI and String Functions

ENCODE_FOR_URI      ( fn:encode-for-uri )

ESCAPE_HTML_URI      ( fn:escape-html-uri )

IRI_TO_URI      ( fn:iri-to-uri )

CONTAINS      ( fn:contains )

STRENDS      ( fn:ends-with )

LCASE      ( fn:lower-case )

STRSTARTS      ( fn:starts-with )

STRLEN      ( fn:string-length )

SUBSTR      ( fn:substring )

STRAFTER      ( fn:substring-after )

STRBEFORE      ( fn:substring-before )

UCASE      ( fn:upper-case )

CONCAT      ( fn:concat )

langMatches

REGEX      ( fn:matches )

REPLACE      ( fn:replace )
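As noted in Section 9.2, functions in this table can be called either by their SPARQL short name or by their XPath/XQuery IRI name. The following sketch calls the same function both ways; the property :forename is taken from the Getting Started data, and the namespace is a placeholder.

```sparql
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
PREFIX :   <https://example.com/>   # hypothetical namespace

SELECT ?n ?len1 ?len2
WHERE {
    ?p :forename ?n .
    BIND(STRLEN(?n) AS ?len1)             # SPARQL short name
    BIND(fn:string-length(?n) AS ?len2)   # XPath/XQuery IRI name
    FILTER(STRSTARTS(?n, "M"))            # keep only names starting with "M"
}
```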

9.2.5. Hash Functions

The RDFox hash functions are the ones specified in SPARQL 1.1 and are given in the following table.

RDFox Hash Functions

MD5

SHA1

SHA256

SHA384

SHA512

9.2.6. Mathematical Functions

RDFox supports all mathematical functions from SPARQL 1.1 and most of the mathematical functions in XPath/XQuery. RDFox also provides additional functions that are useful in practice. The only nonstandard functions are MAXFN and MINFN, which take any number of arguments and return the maximum and the minimum value, respectively.

Basic Functions

IF

COALESCE

MINFN

MAXFN

ABS      ( fn:abs )

ROUND      ( fn:round )

CEIL      ( fn:ceiling )

FLOOR      ( fn:floor )

RAND

PI      ( math:pi )

Exponential and Power Functions

POW      ( math:pow )

SQRT      ( math:sqrt )

CBRT

HYPOT

LOG      ( math:log )

LOG10      ( math:log10 )

LOG2     

EXP      ( math:exp )

EXP10      ( math:exp10 )

EXP2

Trigonometric Functions

SIN      ( math:sin )

COS      ( math:cos )

ASIN      ( math:asin )

ACOS      ( math:acos )

TAN      ( math:tan )

ATAN      ( math:atan )

ATAN2      ( math:atan2 )

Hyperbolic Functions

SINH

COSH

TANH

ASINH

ACOSH

ATANH

Error and Gamma Functions

ERF

ERFC

GAMMA

LGAMMA
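The variadic functions MINFN and MAXFN could be used as in the sketch below, which picks the lowest and highest of three values per item. The properties :priceA, :priceB, and :priceC and the namespace are hypothetical.

```sparql
PREFIX : <https://example.com/>   # hypothetical namespace

SELECT ?item ?lowest ?highest
WHERE {
    ?item :priceA ?a ;
          :priceB ?b ;
          :priceC ?c .
    BIND(MINFN(?a, ?b, ?c) AS ?lowest)    # minimum of the three arguments
    BIND(MAXFN(?a, ?b, ?c) AS ?highest)   # maximum of the three arguments
}
```

Unlike the aggregates MIN and MAX from Section 9.3, these functions operate on the arguments of a single answer rather than on a set of answers.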

9.2.7. Date and Time Functions

This section describes the RDFox functions on dates, times and durations. RDFox supports all SPARQL 1.1 date time functions and most XPath/XQuery date time functions. In RDFox, many of these functions have been extended to apply to all date/time values.

RDFox Date Time Functions

NOW

Returns an xsd:dateTime value representing the current moment in time.

YEAR

Returns the year component of a date/time value. Extends fn:year-from-dateTime and fn:year-from-date to all date/time datatypes with a valid year component.

MONTH

Returns the month component of a date/time value. Extends fn:month-from-dateTime and fn:month-from-date to date/time datatypes with a valid month component.

DAY

Returns the day component of a date/time value. Extends fn:day-from-dateTime and fn:day-from-date to date/time datatypes with a valid day component.

YEARS

Returns the number of years in a duration: fn:years-from-duration.

MONTHS

Returns the number of months in a duration: fn:months-from-duration.

DAYS

Returns the number of days in a duration: fn:days-from-duration.

HOURS

Returns the hours component of a date/time value. Extends fn:hours-from-dateTime, fn:hours-from-time, and fn:hours-from-duration to date/time datatypes with a valid hours component.

MINUTES

Returns the minutes component of a date/time value. Extends fn:minutes-from-dateTime, fn:minutes-from-time, and fn:minutes-from-duration to date/time datatypes with a valid minutes component.

SECONDS

Returns the seconds component of a date/time value. Extends fn:seconds-from-dateTime, fn:seconds-from-time, and fn:seconds-from-duration to date/time datatypes with a valid seconds component.

TIMEZONE

Returns the timezone component of a date/time value. Extends fn:timezone-from-dateTime, fn:timezone-from-date, and fn:timezone-from-time to date/time datatypes with a valid timezone component.

TZ

Returns the timezone component of a date/time value as a simple literal. Extended to date/time values with a valid timezone component.

DURATION_MONTHS

Returns the number of months in the internal representation ($months, $seconds) of an xsd:duration value.

DURATION_SECONDS

Returns the number of seconds in the internal representation ($months, $seconds) of an xsd:duration value.

TIME_ON_TIMELINE

Returns the decimal number representing a date/time value on the timeline. Note that this is the number of seconds elapsed since 0001-01-01T00:00:00 in accordance with the W3C XML Schema Definition.

TO_TIMEZONE

Adjusts the time zone of a date/time value. An extension of fn:adjust-dateTime-to-timezone, fn:adjust-date-to-timezone, and fn:adjust-time-to-timezone to all date/time datatypes. The function takes two arguments: a mandatory date/time value and an optional timezone value. If the timezone value is provided, the function returns a date/time value in the specified timezone that is equivalent to the first argument. If the timezone is not provided, the function returns the local value of its first argument with no timezone.
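The extended component accessors can be sketched as follows. The example extracts the year of an xsd:gYearMonth value, which Section 9.2 mentions as one of the additional overloads, together with components of an xsd:time value. The literals are arbitrary.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?y ?m ?h ?tz
WHERE {
    BIND(YEAR("2024-05"^^xsd:gYearMonth) AS ?y)     # year component of a gYearMonth
    BIND(MONTH("2024-05"^^xsd:gYearMonth) AS ?m)    # month component of a gYearMonth
    BIND(HOURS("09:30:00+01:00"^^xsd:time) AS ?h)   # hours component of a time
    BIND(TZ("09:30:00+01:00"^^xsd:time) AS ?tz)     # timezone as a simple literal
}
```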

9.2.8. The ROLE() Function

RDFox supports a proprietary built-in function for use in audit logging: ROLE. This function returns an xsd:string containing the name of the role that owns the connection on which SPARQL evaluation is taking place.
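A minimal sketch of how ROLE might be combined with NOW to produce the ingredients of an audit record is shown below. How such values are stored is left open here; the query only shows the two function calls.

```sparql
SELECT ?role ?when
WHERE {
    BIND(ROLE() AS ?role)   # name of the role that owns the current connection
    BIND(NOW() AS ?when)    # current moment in time
}
```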

9.3. Aggregate Functions

Aggregate functions differ from normal functions in that they perform computations on sets of values rather than on individual values. In SPARQL, aggregate functions are applied either to the set of results of a given query or to a subset of results obtained using the GROUP BY construct. RDFox supports the following list of aggregate functions, which includes all aggregate functions in SPARQL 1.1.

RDFox Aggregate Functions

COUNT(V)

Returns the number of results in which V is defined (i.e. not UNDEF).

SUM(V)

Returns the sum of all values V in the set of results.

AVG(V)

Returns the average of all values V in the set of results.

MUL(V)

Returns the product of all values V in the set of results.

MIN(V)

Returns the smallest value V in the set of results.

COUNT_MIN(V)

Returns the number of results that have the smallest value of V.

SAMPLE_ARGMIN(A, V)

Returns the value of A in the first result that has the smallest value of V.

MIN_ARGMIN(A, V)

Returns the smallest value of A in the results that have the smallest value of V.

MAX_ARGMIN(A, V)

Returns the largest value of A in the results that have the smallest value of V.

MAX(V)

Returns the largest value V in the set of results.

COUNT_MAX(V)

Returns the number of results that have the largest value of V.

SAMPLE_ARGMAX(A, V)

Returns the value of A in the first result that has the largest value of V.

MIN_ARGMAX(A, V)

Returns the smallest value of A in the results that have the largest value of V.

MAX_ARGMAX(A, V)

Returns the largest value of A in the results that have the largest value of V.

GROUP_CONCAT(V; SEPARATOR=…)

Returns the concatenation (in some order) of the values V in the set of results using the given separator.

SAMPLE(V)

Returns the first defined value of V in the set of results, if one exists, or UNDEF, otherwise.
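The proprietary aggregates can be combined with GROUP BY in the same way as the standard ones. The following sketch assumes that they are written using the call forms shown in the table above; the properties :category and :price and the namespace are hypothetical.

```sparql
PREFIX : <https://example.com/>   # hypothetical namespace

# For each category: the lowest price, how many products share it,
# and one product that has it.
SELECT ?category
       (MIN(?price) AS ?lowestPrice)
       (COUNT_MIN(?price) AS ?numCheapest)
       (SAMPLE_ARGMIN(?product, ?price) AS ?aCheapestProduct)
WHERE {
    ?product :category ?category ;
             :price ?price .
}
GROUP BY ?category
```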

9.4. Querying Tuple Tables

RDFox organizes information in a data store using tuple tables, as described in more detail in Section 4. There are two ways of referring to tuple tables in queries: one uses the proprietary operator TT and the other uses the reserved IRI rdfox:TT.

Querying Tuple Tables Using TT Expressions

RDFox provides a proprietary TT operator to access data stored in tuple tables, as shown in the following example.

Example: Assume that a binary tuple table EmployeeName is mounted from an external data source (e.g., a database), and that it contains pairs that relate employee IDs to their names. The following query retrieves all pairs whose ID is contained in the :Manager class.

SELECT ?id ?name
WHERE {
    ?id rdf:type :Manager .
    TT EmployeeName { ?id ?name }
}

In the above example, TT EmployeeName { ?id ?name } retrieves all pairs of IDs and names stored in the EmployeeName tuple table. Since the tuple table is binary, only two terms (i.e., ?id and ?name) are allowed to occur inside the TT expression. This is analogous to named graphs, where GRAPH :G { ?X ?Y ?Z } accesses all triples in a named graph :G. The differences from GRAPH expressions can be summarized as follows.

  • The number of terms inside TT must match the arity (i.e., the number of positions) of the tuple table.

  • Each TT expression represents exactly one reference to a tuple table. For example, to retrieve pairs of employee IDs with the same name, one can use TT EmployeeName { ?id1 ?name } . TT EmployeeName { ?id2 ?name } (whereas TT EmployeeName { ?id1 ?name . ?id2 ?name } is syntactically invalid).

  • Variables cannot be used in place of tuple table names. For example, TT ?T { ?id ?name } is syntactically invalid.
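The self-join mentioned in the list above can be written out as a complete query, as in the following sketch. It reuses the EmployeeName tuple table from the example; the FILTER is an addition made here so that an employee is not paired with itself.

```sparql
# Pairs of distinct employee IDs that share the same name.
SELECT ?id1 ?id2 ?name
WHERE {
    TT EmployeeName { ?id1 ?name } .
    TT EmployeeName { ?id2 ?name } .
    FILTER(?id1 != ?id2)
}
```

Each pair is returned in both orders; a stricter filter would be needed to keep only one of them.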

Querying Tuple Tables Using rdfox:TT

The use of the proprietary operator TT is a syntactic extension of the SPARQL language and may result in queries being rejected by third-party libraries. To address this, RDFox provides an additional method for accessing tuple tables, which stays within the syntactic rules of the SPARQL language. This method specifies tuple table atoms using the reserved IRI rdfox:TT and RDF Collections, as shown in the following example.

Example: Consider again the tuple table EmployeeName. We can retrieve all pairs whose ID is contained in the :Manager class using the following query.

SELECT ?id ?name
WHERE {
    ?id rdf:type :Manager .
    (?id ?name) rdfox:TT "EmployeeName"
}

The reserved IRI rdfox:TT links the tuple table name EmployeeName with the tuple table arguments encoded as an RDF Collection (i.e., (?id ?name)). The query can also be given in its equivalent expanded form.

SELECT ?id ?name
WHERE {
    ?id rdf:type :Manager .
    _:b0 rdfox:TT "EmployeeName" .
    _:b0 rdf:first ?id ;
         rdf:rest _:b1.
    _:b1 rdf:first ?name ;
         rdf:rest rdf:nil.
}

In its expanded form, a tuple table atom is encoded using multiple triple patterns and, as a result, its position in the query body may be ambiguous. To avoid ambiguity, RDFox determines the position of the tuple table atom using the position of the triple pattern with predicate rdfox:TT. In the above example, the position of the tuple table atom is determined by the position of the triple pattern _:b0 rdfox:TT "EmployeeName" .

The following restrictions apply to the use of rdfox:TT.

  • The subject of the triple pattern with predicate rdfox:TT should be a well-formed RDF collection.

  • The RDF collection that encodes the tuple table arguments should be of size equal to the arity of the tuple table.

  • Variables cannot be used in place of tuple table names. For example, the use of (?id ?name) rdfox:TT ?T is invalid.

9.5. Distinguishing Explicit and Derived Facts in Queries

As described in Section 10, RDFox supports reasoning, by which it can use Datalog rules to derive additional facts from the facts explicitly given in the input. Sometimes, it can be useful to distinguish in the query whether a fact has been explicitly given in the input, or whether it has been derived by a rule. To facilitate this, RDFox provides a proprietary extension of the SPARQL 1.1 syntax. The following example demonstrates this.

Example: Assume that the following triples are loaded into the data store.

:brian rdf:type :LivingThing .
:peter rdf:type :Person .

If the data store contains the rule :LivingThing[?X] :- :Person[?X] ., then the following triple is going to be derived.

:peter rdf:type :LivingThing .

The following query retrieves all facts of the form x rdf:type :LivingThing, together with a Boolean flag which is set to true if the fact is explicitly given in the input.

SELECT ?id ?e
WHERE {
    ?id rdf:type :LivingThing EXPLICIT ?e
}

On our example data, this query returns one answer where ?id is mapped to :brian and ?e is mapped to true, and another answer where ?id is mapped to :peter and ?e is mapped to false.

The TT pattern provides an analogous extension. Moreover, variables bound by EXPLICIT can be used in the rest of the query like any other variable; for example, they can be used in other triple patterns to facilitate joins.

Example: The query for employees from Section 9.4 can be modified as follows to compute the join between managers and their names, while requiring the two facts to be either both explicit or both derived.

SELECT ?id ?name
WHERE {
    ?id rdf:type :Manager EXPLICIT ?e .
    TT EmployeeName { ?id ?name EXPLICIT ?e }
}
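Because a variable bound by EXPLICIT behaves like any other variable, it can also be tested in a FILTER. The following sketch restricts the earlier :LivingThing query to derived facts only; the namespace bound to the empty prefix is an assumption. On the example data above, it would be expected to return only :peter.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX :    <https://example.com/>   # hypothetical namespace

# Only facts that were derived by a rule, not given explicitly.
SELECT ?id
WHERE {
    ?id rdf:type :LivingThing EXPLICIT ?e .
    FILTER(?e = false)
}
```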

9.6. Monitoring Query Evaluation

RDFox provides different ways of analyzing query evaluation. In particular, users can gain access to query plans generated by the RDFox query optimizer as well as to useful statistics about the execution of such plans.

Suppose that we initialize a data store with the example data from our Getting Started guide. Furthermore, consider the SPARQL query

SELECT ?p ?n WHERE { ?p rdf:type :Person . ?p :forename ?n. ?p :hasParent ?z }

which returns the following four answers:

:meg "Meg" .
:stewie "Stewie" .
:chris "Chris" .
:meg "Meg" .

Next we discuss two ways of analyzing how RDFox evaluates the above query using query plans and query profiling, respectively.

9.6.1. Query Plans

Query plans provide insights into the order in which RDFox evaluates the different parts of the query. They are determined by the query optimizer in RDFox based on the shape of the query and the statistical properties of the data present in the store at the time of planning. To see the query plan that RDFox uses to evaluate a given query, one needs to set the shell variable query.explain to true, as shown next.

set query.explain true

The next time we evaluate the above query, the shell will also display the following query plan.

QUERY ?p ?n                                                            QueryIterator
    PROJECT ?n ?p                      {          -->    ?n ?p }
        CONJUNCTION                    {          -->    ?n ?p ?z }    NestedIndexLoopJoinIterator
            [?p, :hasParent, ?z]       {          -->    ?p ?z }       TripleTableIterator
            [?p, rdf:type, :Person]    { ?p ?z    -->    ?p ?z }       TripleTableIterator
            [?p, :forename, ?n]        { ?p ?z    -->    ?n ?p ?z }    TripleTableIterator

The query plan is executed top-down in a depth-first-search manner and we can think of solution variable bindings as being generated one-at-a-time. It is useful to go in more detail through the execution of the plan for a given solution binding.

When we first “visit” the PROJECT block, we have not yet obtained any variable bindings (hence the empty space to the left of the “-->” symbol); in contrast, by the time we have finished executing the subplan underneath, we will have obtained a binding of variables ?n and ?p and hence an answer to the query (as reflected on the right-hand side of the “-->” symbol). Similarly, when we first visit the CONJUNCTION block, which performs the join of the query, we have an empty binding and, by the time we return from it, we will have a binding for ?n, ?p, and ?z. The join is also performed top-down. First, we obtain a binding for ?p and ?z by matching the triple pattern [?p, :hasParent, ?z]. We then consider the second triple pattern [?p, rdf:type, :Person], which binds no new variables, and finally the third triple pattern [?p, :forename, ?n], which extends the binding with a value for variable ?n.

Let us consider a slightly more complex query, which uses the OPTIONAL operator in SPARQL.

SELECT ?p ?n WHERE { ?p rdf:type :Person . ?p :hasParent ?z . OPTIONAL { ?p :forename ?n } }

RDFox will execute the following query plan:

QUERY ?p ?n                                                                  QueryIterator
    PROJECT ?n ?p                          {          -->    ?p | ?n }
        OPTIONAL                           {          -->    ?p ?z | ?n }    OptionalIterator
            CONJUNCTION                    {          -->    ?p ?z }         NestedIndexLoopJoinIterator
                [?p, :hasParent, ?z]       {          -->    ?p ?z }         TripleTableIterator
                [?p, rdf:type, :Person]    { ?p ?z    -->    ?p ?z }         TripleTableIterator
            FILTER true
                [?p, :forename, ?n]        { ?p ?z    -->    ?n ?p ?z }      TripleTableIterator

The important difference to notice in this plan is the use of the “|” symbol. The variables on the left-hand side of “|” are always bound by the corresponding block, whereas those on the right-hand side may or may not be bound.

9.6.2. Query Profiling

In addition to the ordering of the plan nodes of a given query specified in a query plan, users can also obtain runtime information about the number of operations performed on each of them. This information could be useful when debugging potential performance issues with query evaluation.

To obtain such runtime information, one needs to enable a query profiler using the following command.

set query.monitor profile

When evaluating our original query

SELECT ?p ?n WHERE { ?p rdf:type :Person . ?p :forename ?n. ?p :hasParent ?z }

with the query profiler enabled, RDFox will print the following output.

== QUERY EVALUATION STATISTICS ==

Statistics after 0 (s)
+-----------------------------------------------------------------------------------------------------------------------------------------
| Sample Count   Iterator Open   Iterator Advance    Plan Node
+-----------------------------------------------------------------------------------------------------------------------------------------
|            0               0                  0    QUERY ?p ?n                                                            QueryIterator
|            0               0                  0        PROJECT ?n ?p                      {          -->    ?n ?p }
|            0               1                  4            CONJUNCTION                    {          -->    ?n ?p ?z }    NestedIndexLoopJoinIterator
|            0               1                  4                [?p, :hasParent, ?z]       {          -->    ?p ?z }       TripleTableIterator
|            0               4                  4                [?p, rdf:type, :Person]    { ?p ?z    -->    ?p ?z }       TripleTableIterator
|            0               4                  4                [?p, :forename, ?n]        { ?p ?z    -->    ?n ?p ?z }    TripleTableIterator
+-----------------------------------------------------------------------------------------------------------------------------------------

=================================

The output is an enriched version of the query plan: it adds sampling information about query evaluation and the number of times the iterator of each plan node was accessed, that is, opened and advanced. For example, the iterator for [?p, :hasParent, ?z] was opened once (successfully) and advanced four times (three times successfully and once unsuccessfully), shown as 1 and 4 in the Iterator Open and Iterator Advance columns. For each of the four successful operations on the iterator of [?p, :hasParent, ?z], the iterator for [?p, rdf:type, :Person] is opened once (each time successfully) and advanced once (each time unsuccessfully), shown as 4 and 4. The statistics for the other plan nodes are determined in the same way, with the exception of the top-level plan nodes. The statistics for those plan nodes are unavailable due to their special handling, but they can be inferred from the number of query results: the iterators of these plan nodes are opened once and advanced as many times as there are query results.

In addition, the output contains the Sample Count column, which indicates the relative time spent evaluating each plan node: the more samples a plan node has, the more time has been spent on its evaluation. This column is useful for queries with non-trivial evaluation time. The sampling frequency of the profiler is controlled by the shell parameter query.profiler.sampling-frequency.

By default, the query profiler prints statistics only once, at the end of query evaluation. Alternatively, it can be configured to print statistics at a given frequency using the shell variable log-frequency. This can be useful when fixing problematic queries with long evaluation times.

9.7. Access Control

Like all operations, the evaluation of SPARQL requests (i.e. queries and updates) is subject to the rules of RDFox’s access control system (see Section 12). When specifying access control policies at the granularity of named graphs, it is important to be aware that triples in unreadable named graphs are silently skipped during query evaluation. For a more detailed explanation of this see Section 12.2.4.2.