2. RDFox Features and Requirements

2.1. RDFox Features

RDFox® provides the following main functionality:

  • RDFox can import RDF triples, rules, and OWL 2 and SWRL axioms either programmatically or from files of certain formats (see Section 8.2 for details). RDF data can be validated using the SHACL constraint language. Additionally, RDFox can access information from external data sources, such as CSV files, relational databases, or Apache Solr (see Section 7).

  • Triples, rules and axioms can be exported into a number of different formats (see Section 8.3 for details). Furthermore, the contents of the system can be incrementally saved into a binary file, which can later be loaded to restore the system’s state.

  • RDFox can answer SPARQL 1.1 queries (see Section 9) and provides functionality for monitoring query answering and accessing query plans.

  • RDFox supports materialization-based reasoning, where all triples that logically follow from the facts and rules in the system are materialized as new triples (see Section 10) . Materializations can be incrementally updated, which means that reasoning does not need to be performed from scratch once the information in the system is updated. Furthermore, the results of reasoning can be explained, which means that RDFox is able to return proofs for any new fact added to the store through materialization.

  • RDFox supports ACID transactional updates (see Section 11 for further details on transactions).

  • Individual information elements in the system can be assigned different access permissions for different users (see Section 12 for further details on access control).

2.2. Software Archive

RDFox is distributed as an archive containing the following files and directories:

  • RDFox (macOS/Linux) or RDFox.exe (Windows): a stand-alone executable that can be used to run RDFox on the command line.

  • lib: a directory containing the following libraries:

    • JRDFox.jar: the Java API to the RDFox engine.

    • libRDFox.dylib (macOS), libRDFox.so (Linux), or libRDFox.dll (Windows): a dynamic/shared library that implements the C and the Java Native APIs of RDFox.

    • libRDFox.lib (Windows only): the import library needed for linking libRDFox.dll on Windows.

    • libRDFox-static.a (macOS and Linux) or libRDFox-static.lib (Windows): a static library that implements the C API of RDFox.

  • include: a directory containing include files providing access to the C and C++ APIs.

  • examples: a directory containing demonstration programs that show how to call RDFox as a library.

    • C: a directory containing a C source file demonstrating how to call RDFox via the experimental C API. The directory also contains scripts compile-shared-and-run.sh and compile-static-and-run.sh on macOS and Linux, and scripts compile-shared-and-run.bat and compile-static-and-run.bat on Windows, which can be used to build and run the demo. On macOS and Linux, the scripts assumes a C17 compliant version of gcc is available on the path. On Windows, the scripts assume that vcvars64.bat has been executed in the shell prior to execution.

    • C++: a directory containing a C++ source file demonstrating how to use RDFox via the C++ API. The directory also contains scripts compile-shared-and-run.sh and compile-static-and-run.sh on macOS and Linux, and scripts compile-shared-and-run.bat and compile-static-and-run.bat on Windows, which can be used to build and run the demo. On macOS and Linux, the script assumes a version of g++ supporting C++11 is available on the path. On Windows, the script assumes that vcvars64.bat has been executed in the shell prior to execution.

    • Java: a directory containing source code for a program demonstrating how to call RDFox via the Java API. The examples/Java/build.xml Apache Ant script can be used to compile and run the program.

2.3. Interfaces

Users and developers can interact with RDFox through the following interfaces:

CLI

RDFox comes with a built-in shell that can be used to interact with and control the RDFox Server. The shell can be launched together with an RDFox Server instance using the shell or sandbox modes of the executable. Alternatively the remote executable mode can be used to connect to and use the shell interface of a remote RDFox Server. See Section 15 for details.

RESTful API

When RDFox’s endpoint is running, clients can interact with the associated RDFox server via a RESTful API. For details of the RESTful API, see Section 16. For details of how to configure and start the endpoint, see Section 19.

Java API

RDFox can be embedded into Java applications and called via the Java API described in Section 16 and Section 17. To use JRDFox in your project, simply add JRDFox.jar to your classpath, and make sure that the path to the dynamic library is correctly specified when starting your program using the following JVM option:

-Djava.library.path=<path to the dynamic library>
C API (EXPERIMENTAL)

RDFox can be dynamically loaded and called through a C API.

GUI

As well as serving the REST API, the RDFox endpoint serves the RDFox Console, a browser-based user interface supporting basic querying and visualization of data store content. When the endpoint is running, the Console can be loaded by visiting http[s]:<hostname>:<port>/console/ where <hostname> and <port> are the host name and port number at which the endpoint can be reached.

2.4. System Requirements

2.4.1. Software

2.4.1.1. Operating Systems

RDFox supports the following operating system versions:

Windows

Windows 10 or higher

Mac

macOS 10.14 or higher

Linux
  • Ubuntu 18.04 or higher

  • Amazon Linux 2 or higher

Additionally, RDFox can be run using Docker. See Section 22 for details.

2.4.1.2. Third-party Software

Some RDFox features depend on dynamic-link libraries (DLL) from the list of the third-party software packages below. In each case, the DLL or DLLs are loaded on-demand the first time the dependent functionality is accessed within a session. This means that RDFox can be deployed in the absence of these packages if the dependent functionality is not needed.

OpenSSL

Used to implement TLS for RDFox’s HTTP client and server code, as well as for persistence and session encryption. The search paths used to locate the DLLs from this package when the endpoint is starting can be specified via the RDFOX_LIBCRYPTO_PATH and RDFOX_LIBSSL_PATH environment variables. If the environment variables are not set, the default values shown in the following table are used.

Platform

libcrypto search path

libssl search path

Windows

libcrypto-3-x64.dll

libssl-3-x64.dll

macOS

libcrypto.3.dylib

libssl.3.dylib

Linux

libcrypto.so

libssl.so

The resolved libraries must have version v3.0.0 or higher.

libpq

Used to access PostgreSQL data sources. The search path used to locate libpq when registering one of these data sources can be specified via the RDFOX_LIBPQ_PATH environment variable. If the environment variable is not set, the default value shown in the following table is used.

Platform

libpq search path

Windows

libpq.dll

macOS

libpq.dylib

Linux

libpq.so

The resolved library should be of a version that matches that of the PostgreSQL server being connected to. The current release was built and tested with both library and server from PostgreSQL 14, however it will also work with a wider range of versions, both higher and lower. Please test your configuration and contact OST support as needed.

iODBC or unixODBC

Used to access external data sources via ODBC. The search path used to locate the DLL that will manage drivers for accessing the ODBC source can be specified via the RDFOX_ODBC_DRIVER_MANAGER_PATH environment variable. If the environment variable is not set, RDFox will attempt to use the default search paths shown in the following table to load unixODBC and, if that fails and the platform is not Windows, iODBC.

Platform

unixODBC search path

iODBC search path

Windows

odbc32.dll

(not supported)

macOS

libodbc.dylib

libiodbc.dylib

Linux

libodbc.so

libiodbc.so

Although iODBC can be used, unixODBC is recommended. The current release was built and tested with unixODBC v2.3, however it will work with a wider range of versions, both higher and lower. Please test your configuration and contact OST support as needed.

libsqlite3

Used to access SQLite data sources. The search path used to locate the SQLite library can be specified via the RDFOX_LIBSQLITE_PATH environment variable. If the environment variable is not set, RDFox will attempt to use the default search paths shown in the following table.

Platform

libsqlite3 search path

Windows

libsqlite3.dll

macOS

libsqlite3.dylib

Linux

libsqlite3.so

The resolved library should be of a version that matches that of the SQLite file being connected to. The current release was built and tested with SQLite v3.49.1, however it will also work with a wider range of versions, both higher and lower. Please test your configuration and contact OST support as needed.

Lucene

Used to access Lucene data sources. The search path used to locate the Lucene libraries when registering a Lucene data source can be specified via the server parameter jvm.options or the environment variable RDFOX_JVM_OPTIONS. The JVM options should include -Djava.class.path of the required Lucene libraries. The separator between the classpath is : (On Windows, ;). Any other JVM options can be specified as needed using the | separator. If a Lucene data source is used with JRDFox, the required Lucene libraries must be included in the classpath of the JVM running JRDFox. The required Lucene libraries for -Djava.class.path are shown in the following table with explanations of their purpose.

Lucene library search path

Description

lucene-core-<version>.jar

(Required) Lucene core library.

lucene-queryparser-<version>.jar

(Required) Query parsers and parsing framework.

lucene-backward-codecs-<version>.jar

(Optional) This library can be used to access Lucene indexes created with older major versions of Lucene.

The resolved libraries must have version v9.6.0 or higher.

libjvm

Used to access Lucene data sources with RDFox executable. The search path used to locate the JVM library can be specified via the RDFOX_LIBJVM_PATH environment variable. This is required to embed the Java Virtual Machine (JVM) within RDFox. If the environment variable is not set, RDFox will attempt to use the default search paths shown in the following table.

Platform

libjvm search path

Windows

jvm.dll

macOS

libjvm.dylib

Linux

libjvm.so

The resolved library should match the version of the Java Runtime Environment (JRE) in use. This release was built and tested with Java 17, but it is also compatible with Java 11 or higher. Please test your configuration and contact OST support if needed.

For a list of other third-party components used within RDFox, see Acknowledgments.

2.4.1.3. License Key

Creating an RDFox Server requires a time-limited license key issued by Oxford Semantic Technologies. At server creation time, RDFox will search the following locations, in the order shown, for the license key:

  • the value of the license.content server parameter, if set

  • [RDFox executable only] the value of the RDFOX_LICENSE_CONTENT environment variable, if set

  • the content of the file specified via the license.file server parameter, if set

  • [RDFox executable only] the content of the file specified via the RDFOX_LICENSE_FILE environment variable, if set

  • [RDFox executable only] the content of the file RDFox.lic in the directory containing the running executable, if the file exists

  • the content of the file at the default value for the license.file server parameter, if the file exists

If a candidate key is found in one location, the remaining locations will not be checked even if the candidate turns out to be invalid or expired. See Section 4.3 for details of how to specify server parameters such as license.content and license.file.

2.4.2. Hardware

This section describes the hardware requirements for running RDFox.

2.4.2.1. Memory

RDFox is a main-memory data store and as such its performance is heavily dependent on access to a suitable amount of memory. The amount of memory required for a given application can be broken down into the following two components.

  • Fact storage cost is the amount of memory required to store the facts (triples or quads) including both those imported explicitly and those added by materialization. This component depends on the number of facts and other characteristics of the data set such as how many unique resources it contains and the size of those resources.

  • Operating memory cost is the amount of memory required for operations such as querying, reasoning, compaction, and assorted other activities. This component is proportional to the fact storage cost but the exact proportion varies considerably with the characteristics of the workload.

Fact storage costs typically vary between 45 and 85 bytes per fact. One should provision an additional 10-100% of this for operating memory costs. The following workload characteristics will usually increase the operating memory costs:

  • high numbers of queries evaluated concurrently with updates,

  • queries that return large result sets with the ORDER BY or DISTINCT keywords, and

  • using large and/or complicated sets of Datalog rules.

A special case with high operating memory costs is that of HA replicas with the highly-available setting for the persistence.snapshot-restore-mode server parameter (see Section 4.3). These should reserve at least 100% of the fact storage cost as additional memory, as this is required to restore the data store from a snapshot.

While the above figures can be useful to estimate the memory requirements, an application should always be tested thoroughly to determine the actual memory requirements.

In general, it is recommended to ensure that the complete memory requirements of an application are met without relying on memory paging because RDFox’s performance will degrade significantly if the OS swaps the RDFox process’s memory pages in and out of disk. It can, however, be useful to enable a suitably sized swap file to avoid RDFox processes being killed by the operating system during compaction if memory requirements increase suddenly but temporarily. This is relevant for all servers that perform compaction, as well as HA replicas that restore the new snapshots and which use with the highly-available setting for the persistence.snapshot-restore-mode.

2.4.2.2. Disk Space

When RDFox is configured for Persistence, data will also be saved to disk, ready to be loaded in subsequent sessions. The underlying file system must satisfy the system requirements documented in Section 13.2.1 for the chosen persistence option, and have 40-60 bytes of disk space per triple. This includes enough space to store the data itself and some working free space that is needed for operations such as compaction and upgrade.

Recovering from low-memory or low-disk-space conditions can be complex, so it is vital to monitor these metrics and take action before they become exhausted. Regular compaction of data stores can help minimize a server’s memory and disk space usage.