2. RDFox Features and Requirements
2.1. RDFox Features
RDFox® provides the following main functionality:
RDFox can import RDF triples, rules, and OWL 2 and SWRL axioms either programmatically or from files of certain formats (see Section 8.2 for details). RDF data can be validated using the SHACL constraint language. Additionally, RDFox can access information from external data sources, such as CSV files, relational databases, or Apache Solr (see Section 7).
Triples, rules and axioms can be exported into a number of different formats (see Section 8.3 for details). Furthermore, the contents of the system can be incrementally saved into a binary file, which can later be loaded to restore the system’s state.
RDFox can answer SPARQL 1.1 queries (see Section 9) and provides functionality for monitoring query answering and accessing query plans.
RDFox supports materialization-based reasoning, where all triples that logically follow from the facts and rules in the system are materialized as new triples (see Section 10). Materializations can be incrementally updated, which means that reasoning does not need to be performed from scratch once the information in the system is updated. Furthermore, the results of reasoning can be explained, which means that RDFox is able to return proofs for any new fact added to the store through materialization.
RDFox supports ACID transactional updates (see Section 11 for further details on transactions).
Individual information elements in the system can be assigned different access permissions for different users (see Section 12 for further details on access control).
2.2. Software Archive
RDFox is distributed as an archive containing the following files and directories:
- RDFox (macOS/Linux) or RDFox.exe (Windows): a stand-alone executable that can be used to run RDFox on the command line.
- lib: a directory containing the following libraries:
  - JRDFox.jar: the Java API to the RDFox engine.
  - libRDFox.dylib (macOS), libRDFox.so (Linux), or libRDFox.dll (Windows): a dynamic/shared library that implements the C and the Java Native APIs of RDFox.
  - libRDFox.lib (Windows only): the import library needed for linking libRDFox.dll on Windows.
  - libRDFox-static.a (macOS and Linux) or libRDFox-static.lib (Windows): a static library that implements the C API of RDFox.
- include: a directory containing include files providing access to the C and C++ APIs.
- examples: a directory containing demonstration programs that show how to call RDFox as a library.
  - C: a directory containing a C source file demonstrating how to call RDFox via the experimental C API. The directory also contains the scripts compile-shared-and-run.sh and compile-static-and-run.sh on macOS and Linux, and the scripts compile-shared-and-run.bat and compile-static-and-run.bat on Windows, which can be used to build and run the demo. On macOS and Linux, the scripts assume that a C17-compliant version of gcc is available on the path. On Windows, the scripts assume that vcvars64.bat has been executed in the shell prior to execution.
  - C++: a directory containing a C++ source file demonstrating how to use RDFox via the C++ API. The directory also contains the scripts compile-shared-and-run.sh and compile-static-and-run.sh on macOS and Linux, and the scripts compile-shared-and-run.bat and compile-static-and-run.bat on Windows, which can be used to build and run the demo. On macOS and Linux, the scripts assume that a version of g++ supporting C++11 is available on the path. On Windows, the scripts assume that vcvars64.bat has been executed in the shell prior to execution.
  - Java: a directory containing source code for a program demonstrating how to call RDFox via the Java API. The examples/Java/build.xml Apache Ant script can be used to compile and run the program.
2.3. Interfaces
Users and developers can interact with RDFox through the following interfaces:
- CLI
RDFox comes with a built-in shell that can be used to interact with and control the RDFox Server. The shell can be launched together with an RDFox Server instance using the shell or sandbox modes of the executable. Alternatively, the remote executable mode can be used to connect to and use the shell interface of a remote RDFox Server. See Section 15 for details.
- RESTful API
When RDFox’s endpoint is running, clients can interact with the associated RDFox server via a RESTful API. For details of the RESTful API, see Section 16. For details of how to configure and start the endpoint, see Section 19.
- Java API
RDFox can be embedded into Java applications and called via the Java API described in Section 16 and Section 17. To use JRDFox in your project, add JRDFox.jar to your classpath, and make sure that the path to the dynamic library is correctly specified when starting your program using the following JVM option: -Djava.library.path=<path to the dynamic library>
- C API (EXPERIMENTAL)
RDFox can be dynamically loaded and called through a C API.
- GUI
As well as serving the REST API, the RDFox endpoint serves the RDFox Console, a browser-based user interface supporting basic querying and visualization of data store content. When the endpoint is running, the Console can be loaded by visiting http[s]://<hostname>:<port>/console/ where <hostname> and <port> are the host name and port number at which the endpoint can be reached.
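As an illustration of the Java API and GUI entries above, the following minimal sketch assembles a `java` command line that passes the JVM option mentioned there and builds the Console address. The function names, the example paths, the main class `MyApp`, and the port `8080` are all hypothetical placeholders, and the URL is assumed to follow the usual `scheme://host:port` form.

```python
import os


def jrdfox_java_command(jar_path, native_lib_dir, main_class, extra_classpath=()):
    """Assemble a `java` command line for an application that embeds JRDFox.

    All argument values are supplied by the caller; nothing here is RDFox API.
    """
    # Classpath entries are joined with the platform separator (":" or ";").
    classpath = os.pathsep.join([jar_path, *extra_classpath])
    return [
        "java",
        f"-Djava.library.path={native_lib_dir}",  # directory holding the dynamic library
        "-cp",
        classpath,
        main_class,
    ]


def console_url(hostname, port, use_tls=False):
    """Build the address at which the endpoint serves the Console."""
    scheme = "https" if use_tls else "http"
    return f"{scheme}://{hostname}:{port}/console/"


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(" ".join(jrdfox_java_command("lib/JRDFox.jar", "lib", "MyApp", ["build/classes"])))
    print(console_url("localhost", 8080))
```

The command is returned as a list so that it can be passed to `subprocess.run` without shell quoting concerns.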
2.4. System Requirements
2.4.1. Software
2.4.1.1. Operating Systems
RDFox supports the following operating system versions:
- Windows
Windows 10 or higher
- Mac
macOS 10.14 or higher
- Linux
Ubuntu 18.04 or higher
Amazon Linux 2 or higher
Additionally, RDFox can be run using Docker. See Section 22 for details.
2.4.1.2. Third-party Software
Some RDFox features depend on dynamic-link libraries (DLLs) from the third-party software packages listed below. In each case, the DLL or DLLs are loaded on demand the first time the dependent functionality is accessed within a session. This means that RDFox can be deployed in the absence of these packages if the dependent functionality is not needed.
- OpenSSL
Used to implement TLS for RDFox’s HTTP client and server code, as well as for persistence and session encryption. The search paths used to locate the DLLs from this package when the endpoint is starting can be specified via the RDFOX_LIBCRYPTO_PATH and RDFOX_LIBSSL_PATH environment variables. If the environment variables are not set, the default values shown in the following table are used.

| Platform | libcrypto search path | libssl search path |
|----------|-----------------------|--------------------|
| Windows  | libcrypto-3-x64.dll   | libssl-3-x64.dll   |
| macOS    | libcrypto.3.dylib     | libssl.3.dylib     |
| Linux    | libcrypto.so          | libssl.so          |

The resolved libraries must have version v3.0.0 or higher.
- libpq
Used to access PostgreSQL data sources. The search path used to locate libpq when registering one of these data sources can be specified via the RDFOX_LIBPQ_PATH environment variable. If the environment variable is not set, the default value shown in the following table is used.

| Platform | libpq search path |
|----------|-------------------|
| Windows  | libpq.dll         |
| macOS    | libpq.dylib       |
| Linux    | libpq.so          |

The resolved library should be of a version that matches that of the PostgreSQL server being connected to. The current release was built and tested with both library and server from PostgreSQL 14; however, it will also work with a wider range of versions, both higher and lower. Please test your configuration and contact OST support as needed.
- iODBC or unixODBC
Used to access external data sources via ODBC. The search path used to locate the DLL that will manage drivers for accessing the ODBC source can be specified via the RDFOX_ODBC_DRIVER_MANAGER_PATH environment variable. If the environment variable is not set, RDFox will attempt to use the default search paths shown in the following table to load unixODBC and, if that fails and the platform is not Windows, iODBC.

| Platform | unixODBC search path | iODBC search path |
|----------|----------------------|-------------------|
| Windows  | odbc32.dll           | (not supported)   |
| macOS    | libodbc.dylib        | libiodbc.dylib    |
| Linux    | libodbc.so           | libiodbc.so       |

Although iODBC can be used, unixODBC is recommended. The current release was built and tested with unixODBC v2.3; however, it will work with a wider range of versions, both higher and lower. Please test your configuration and contact OST support as needed.
- libsqlite3
Used to access SQLite data sources. The search path used to locate the SQLite library can be specified via the RDFOX_LIBSQLITE_PATH environment variable. If the environment variable is not set, RDFox will attempt to use the default search paths shown in the following table.

| Platform | libsqlite3 search path |
|----------|------------------------|
| Windows  | libsqlite3.dll         |
| macOS    | libsqlite3.dylib       |
| Linux    | libsqlite3.so          |

The resolved library should be of a version that matches that of the SQLite file being connected to. The current release was built and tested with SQLite v3.49.1; however, it will also work with a wider range of versions, both higher and lower. Please test your configuration and contact OST support as needed.
- Lucene
Used to access Lucene data sources. The search path used to locate the Lucene libraries when registering a Lucene data source can be specified via the server parameter jvm.options or the environment variable RDFOX_JVM_OPTIONS. The JVM options should include a -Djava.class.path option listing the required Lucene libraries. Classpath entries are separated by : (; on Windows). Any other JVM options can be added as needed, using | as the separator between options. If a Lucene data source is used with JRDFox, the required Lucene libraries must be included in the classpath of the JVM running JRDFox. The required Lucene libraries for -Djava.class.path are shown in the following table with explanations of their purpose.

| Lucene library search path           | Description |
|--------------------------------------|-------------|
| lucene-core-<version>.jar            | (Required) Lucene core library. |
| lucene-queryparser-<version>.jar     | (Required) Query parsers and parsing framework. |
| lucene-backward-codecs-<version>.jar | (Optional) This library can be used to access Lucene indexes created with older major versions of Lucene. |

The resolved libraries must have version v9.6.0 or higher.
- libjvm
Used to access Lucene data sources with the RDFox executable. The search path used to locate the JVM library can be specified via the RDFOX_LIBJVM_PATH environment variable. This is required to embed the Java Virtual Machine (JVM) within RDFox. If the environment variable is not set, RDFox will attempt to use the default search paths shown in the following table.

| Platform | libjvm search path |
|----------|--------------------|
| Windows  | jvm.dll            |
| macOS    | libjvm.dylib       |
| Linux    | libjvm.so          |

The resolved library should match the version of the Java Runtime Environment (JRE) in use. This release was built and tested with Java 17, but it is also compatible with Java 11 or higher. Please test your configuration and contact OST support if needed.
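The override-then-default lookup described for the native libraries in this section can be modelled with the short sketch below. It only mirrors the documented behaviour and is not RDFox's implementation; the function and dictionary names are hypothetical, and treating an empty environment variable as unset is an assumption.

```python
import os
import sys

# Default search paths taken from the tables in this section, in the order tried.
DEFAULT_SEARCH_PATHS = {
    "RDFOX_LIBCRYPTO_PATH": {"windows": ["libcrypto-3-x64.dll"], "macos": ["libcrypto.3.dylib"], "linux": ["libcrypto.so"]},
    "RDFOX_LIBSSL_PATH": {"windows": ["libssl-3-x64.dll"], "macos": ["libssl.3.dylib"], "linux": ["libssl.so"]},
    "RDFOX_LIBPQ_PATH": {"windows": ["libpq.dll"], "macos": ["libpq.dylib"], "linux": ["libpq.so"]},
    # unixODBC is tried first; iODBC is a fallback on non-Windows platforms.
    "RDFOX_ODBC_DRIVER_MANAGER_PATH": {"windows": ["odbc32.dll"], "macos": ["libodbc.dylib", "libiodbc.dylib"], "linux": ["libodbc.so", "libiodbc.so"]},
    "RDFOX_LIBSQLITE_PATH": {"windows": ["libsqlite3.dll"], "macos": ["libsqlite3.dylib"], "linux": ["libsqlite3.so"]},
    "RDFOX_LIBJVM_PATH": {"windows": ["jvm.dll"], "macos": ["libjvm.dylib"], "linux": ["libjvm.so"]},
}


def current_platform():
    """Map sys.platform onto the three platform names used in the tables."""
    if sys.platform.startswith("win"):
        return "windows"
    if sys.platform == "darwin":
        return "macos"
    return "linux"


def candidate_paths(variable, env=None, platform=None):
    """Return the search paths that would be tried for `variable`, in order."""
    env = os.environ if env is None else env
    override = env.get(variable)
    if override:  # assumption: an empty value counts as "not set"
        return [override]
    return list(DEFAULT_SEARCH_PATHS[variable][platform or current_platform()])


if __name__ == "__main__":
    for name in DEFAULT_SEARCH_PATHS:
        print(name, "->", candidate_paths(name))
```

To check whether a candidate resolves on a given machine, each entry could be passed to `ctypes.CDLL`, which raises `OSError` when the library cannot be loaded; whether that matches the loader behaviour RDFox sees is something to confirm by testing.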
For a list of other third-party components used within RDFox, see Acknowledgments.
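For the Lucene entry above, the value passed via jvm.options or RDFOX_JVM_OPTIONS combines a classpath with any further JVM options. The sketch below shows one way to assemble such a value, assuming that options are joined with `|` and classpath entries with `:` (or `;` on Windows) as described. The function name and the jar locations are hypothetical.

```python
def lucene_jvm_options(jar_paths, extra_options=(), windows=False):
    """Build a JVM options string containing -Djava.class.path for Lucene."""
    classpath_separator = ";" if windows else ":"
    options = ["-Djava.class.path=" + classpath_separator.join(jar_paths)]
    options.extend(extra_options)
    # Individual JVM options are separated by "|".
    return "|".join(options)


if __name__ == "__main__":
    # Hypothetical install location; substitute the real jar paths.
    jars = [
        "/opt/lucene/lucene-core-9.6.0.jar",
        "/opt/lucene/lucene-queryparser-9.6.0.jar",
    ]
    print(lucene_jvm_options(jars, ["-Xmx1g"]))
```

The resulting string could then be exported as RDFOX_JVM_OPTIONS before starting the executable, or supplied as the jvm.options server parameter.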
2.4.1.3. License Key
Creating an RDFox Server requires a time-limited license key issued by Oxford Semantic Technologies. At server creation time, RDFox will search the following locations, in the order shown, for the license key:
1. the value of the license.content server parameter, if set
2. [RDFox executable only] the value of the RDFOX_LICENSE_CONTENT environment variable, if set
3. the content of the file specified via the license.file server parameter, if set
4. [RDFox executable only] the content of the file specified via the RDFOX_LICENSE_FILE environment variable, if set
5. [RDFox executable only] the content of the file RDFox.lic in the directory containing the running executable, if the file exists
6. the content of the file at the default value for the license.file server parameter, if the file exists

If a candidate key is found in one location, the remaining locations will not be checked even if the candidate turns out to be invalid or expired. See Section 4.3 for details of how to specify server parameters such as license.content and license.file.
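The search order above can be summarised as a first-match lookup. The following sketch models it for illustration only: the function name and its arguments are hypothetical, the default license file location must be supplied by the caller, and the behaviour when a configured file is missing (here, an exception from `read_text`) is an assumption rather than documented RDFox behaviour.

```python
from pathlib import Path


def find_license_candidate(server_params, env, executable_dir,
                           default_license_file, is_executable=True):
    """Return (source, key) for the first location that yields a candidate key.

    The first hit wins; later locations are not consulted even if the
    candidate later proves invalid or expired. Returns None if nothing is found.
    """
    if server_params.get("license.content"):
        return "license.content server parameter", server_params["license.content"]
    if is_executable and env.get("RDFOX_LICENSE_CONTENT"):
        return "RDFOX_LICENSE_CONTENT environment variable", env["RDFOX_LICENSE_CONTENT"]
    if server_params.get("license.file"):
        return "license.file server parameter", Path(server_params["license.file"]).read_text()
    if is_executable and env.get("RDFOX_LICENSE_FILE"):
        return "RDFOX_LICENSE_FILE environment variable", Path(env["RDFOX_LICENSE_FILE"]).read_text()
    if is_executable:
        local = Path(executable_dir) / "RDFox.lic"
        if local.is_file():
            return "RDFox.lic next to the executable", local.read_text()
    default = Path(default_license_file)
    if default.is_file():
        return "default license.file location", default.read_text()
    return None


if __name__ == "__main__":
    # The parameter value takes precedence over the environment variable.
    print(find_license_candidate({"license.content": "KEY-A"},
                                 {"RDFOX_LICENSE_CONTENT": "KEY-B"},
                                 ".", "no-such-dir/RDFox.lic"))
```

A practical consequence of the first-match rule is that a stale key in a higher-priority location hides a valid key in a lower-priority one, so it is worth checking the locations in this order when diagnosing license errors.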
2.4.2. Hardware
This section describes the hardware requirements for running RDFox.
2.4.2.1. Memory
RDFox is a main-memory data store and as such its performance is heavily dependent on access to a suitable amount of memory. The amount of memory required for a given application can be broken down into the following two components.
Fact storage cost is the amount of memory required to store the facts (triples or quads) including both those imported explicitly and those added by materialization. This component depends on the number of facts and other characteristics of the data set such as how many unique resources it contains and the size of those resources.
Operating memory cost is the amount of memory required for operations such as querying, reasoning, compaction, and assorted other activities. This component is proportional to the fact storage cost but the exact proportion varies considerably with the characteristics of the workload.
Fact storage costs typically vary between 45 and 85 bytes per fact. One should provision an additional 10-100% of this for operating memory costs. The following workload characteristics will usually increase the operating memory costs:
- high numbers of queries evaluated concurrently with updates,
- queries that return large result sets with the ORDER BY or DISTINCT keywords, and
- using large and/or complicated sets of Datalog rules.
A special case with high operating memory costs is that of HA replicas with the highly-available setting for the persistence.snapshot-restore-mode server parameter (see Section 4.3). These should reserve at least 100% of the fact storage cost as additional memory, as this is required to restore the data store from a snapshot.
While the above figures can be useful to estimate the memory requirements, an application should always be tested thoroughly to determine the actual memory requirements.
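As a rough worked example of the figures above, the sketch below turns a fact count into a memory range using 45 to 85 bytes per fact plus 10% to 100% for operating memory, or at least 100% for HA replicas using the highly-available restore mode. The function name is hypothetical and the result is a planning estimate only, not a substitute for testing.

```python
def estimate_memory_bytes(num_facts, ha_highly_available=False):
    """Return a (low, high) estimate of the memory needed, in bytes.

    For HA replicas in highly-available restore mode the operating overhead
    is "at least 100%", so `high` should be read as a lower bound.
    """
    storage_low = num_facts * 45    # bytes per fact, lower figure
    storage_high = num_facts * 85   # bytes per fact, upper figure
    overhead_low_pct = 100 if ha_highly_available else 10
    overhead_high_pct = 100
    low = storage_low * (100 + overhead_low_pct) // 100
    high = storage_high * (100 + overhead_high_pct) // 100
    return low, high


if __name__ == "__main__":
    low, high = estimate_memory_bytes(1_000_000_000)
    print(f"{low / 1e9:.1f} GB to {high / 1e9:.1f} GB")
```

For one billion facts this gives roughly 49.5 GB to 170 GB for a standard server, and at least 90 GB for an HA replica in highly-available restore mode.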
In general, it is recommended to ensure that the complete memory requirements of an application are met without relying on memory paging because RDFox’s performance will degrade significantly if the OS swaps the RDFox process’s memory pages to and from disk. It can, however, be useful to enable a suitably sized swap file to avoid RDFox processes being killed by the operating system during compaction if memory requirements increase suddenly but temporarily. This is relevant for all servers that perform compaction, as well as HA replicas that restore new snapshots and use the highly-available setting for the persistence.snapshot-restore-mode server parameter.
2.4.2.2. Disk Space
When RDFox is configured for Persistence, data will also be saved to disk, ready to be loaded in subsequent sessions. The underlying file system must satisfy the system requirements documented in Section 13.2.1 for the chosen persistence option, and have 40-60 bytes of disk space per triple. This includes enough space to store the data itself and some working free space that is needed for operations such as compaction and upgrade.
Recovering from low-memory or low-disk-space conditions can be complex, so it is vital to monitor these metrics and take action before they become exhausted. Regular compaction of data stores can help minimize a server’s memory and disk space usage.
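A simple pre-flight check along these lines can compare the 40-60 bytes per triple guideline with the free space on the volume that will hold the persisted data. This is a minimal sketch: the function names are hypothetical, and the check uses the upper figure of 60 bytes per triple as a conservative choice.

```python
import shutil


def required_disk_bytes(num_triples):
    """Return the (low, high) disk space guideline in bytes."""
    return num_triples * 40, num_triples * 60


def has_disk_headroom(path, num_triples):
    """Report whether the volume containing `path` has enough free space.

    Uses the upper guideline figure so that working space for operations
    such as compaction and upgrade is accounted for.
    """
    _, high = required_disk_bytes(num_triples)
    return shutil.disk_usage(path).free >= high


if __name__ == "__main__":
    print(required_disk_bytes(1_000_000_000))
    print(has_disk_headroom(".", 1_000_000_000))
```

Running such a check periodically, rather than only at deployment time, fits the advice to act before disk space is exhausted.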