4. RDFox Concepts
RDFox uses the concepts shown in the following diagram to structure the information loaded in the system. This section discusses the basic ideas behind these concepts.
4.1. Servers
Each RDFox instance is associated with a distinct server object, which acts as a top-level container for all information stored in the RDFox instance. A server’s job is to keep track of global configuration options of the RDFox instance, manage data stores that contain various data, and manage roles that identify users capable of interacting with the server. A server instance is created whenever RDFox is started via the command line or via the Java API; in either case, there is a way to specify server options such as the number of threads and the maximum amount of memory that the server should use. The list of all options is given in Section 7.
4.2. Data Stores
A server is divided into data stores. Each data store is identified by a user-defined name that is unique within the server. A data store acts as a container for data that logically belongs together. Many applications will use one data store to store their data; thus, several applications can use one server, while keeping their data separate. Some applications can use more than one data store, but it is key to remember that all queries and rules are evaluated within the context of a single data store. Thus, all information that an application wishes to access in a single query should be loaded into one data store.
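The relationship between a server and its data stores can be pictured with a small toy model. This is only an illustrative sketch: `ToyServer`, `ToyDataStore`, `create_data_store`, and `match` are hypothetical names invented here and are not part of any RDFox API. It shows two points from the paragraph above: data store names are unique within a server, and a query sees only the data of the one data store it is evaluated in.

```python
class ToyDataStore:
    """Hypothetical stand-in for a data store: a named container of triples."""

    def __init__(self, name):
        self.name = name
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return the triples matching a pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple for triple in self.triples
            if all(p is None or p == v for p, v in zip(pattern, triple))
        )


class ToyServer:
    """Hypothetical stand-in for a server: maps unique names to data stores."""

    def __init__(self):
        self.data_stores = {}

    def create_data_store(self, name):
        # Names must be unique within the server.
        if name in self.data_stores:
            raise ValueError(f"data store {name!r} already exists")
        self.data_stores[name] = ToyDataStore(name)
        return self.data_stores[name]


server = ToyServer()
hr = server.create_data_store("hr")
sales = server.create_data_store("sales")
hr.add(":Peter", "rdf:type", ":Professor")
sales.add(":Order1", ":placedBy", ":Peter")

# Each query is evaluated against exactly one data store.
hr_results = hr.match(subject=":Peter")
sales_results = sales.match(obj=":Peter")
```

In this sketch, a query that needs both the `hr` and the `sales` triples cannot be answered, which is why all data an application wants to combine in one query belongs in one data store.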
RDFox provides many types of data stores, and each type is identified by a unique type name (e.g., par-complex-nn). Data store types differ in their maximum capacity, and some support concurrent operations where others do not. Moreover, each data store can be customized via a number of parameters; for example, a data store can be configured to use the implicit semantics of owl:sameAs or not. All parameters are specified as a list of key-value pairs when the data store is created, and they cannot be changed subsequently. Data store types and parameters are described in detail in Section 8.
A data store contains a dictionary, which keeps track of all resources (i.e., IRIs, blank nodes, and literals) occurring in the data loaded in the data store. The dictionary assigns to each resource a unique integer called a resource ID. This mapping is usually of no concern for clients, unless clients aim to optimize communication with an RDFox server by retrieving resource IDs instead of resources themselves.
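The idea of a dictionary that maps resources to integer IDs can be sketched as follows. This is a toy model under stated assumptions, not RDFox's implementation: the class `ToyDictionary`, its methods `resolve` and `lookup`, and the choice to number IDs from 0 are all invented for illustration.

```python
class ToyDictionary:
    """Hypothetical resource dictionary: one unique integer ID per resource."""

    def __init__(self):
        self._id_by_resource = {}
        self._resource_by_id = []

    def resolve(self, resource):
        """Return the ID of a resource, assigning a fresh ID on first sight."""
        if resource not in self._id_by_resource:
            self._id_by_resource[resource] = len(self._resource_by_id)
            self._resource_by_id.append(resource)
        return self._id_by_resource[resource]

    def lookup(self, resource_id):
        """Return the resource that a previously assigned ID stands for."""
        return self._resource_by_id[resource_id]


dictionary = ToyDictionary()
triple = (":Peter", "rdf:type", ":Professor")

# Encode the triple as IDs, then decode it back into resources.
encoded = tuple(dictionary.resolve(resource) for resource in triple)
decoded = tuple(dictionary.lookup(resource_id) for resource_id in encoded)
```

A client that exchanges `encoded` rather than `decoded` sends small integers instead of full resource strings, which is the kind of communication saving the paragraph above alludes to.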
RDF data in a data store is further stored in several tuple tables. A data store can also reference a number of data sources, which provide access to data in formats other than RDF, such as relational databases or CSV files. Moreover, a data store can contain OWL axioms and rules, which jointly provide inference rules that are to be used for reasoning within a data store. Finally, a data store can contain statistics modules, which keep summaries of the data loaded in the data store that are useful for query planning.
Each data store is assigned a data store ID that is, with high probability, unique across servers. Clients can use this identifier to ensure that they are referring to the same data store in different API calls.
4.3. Tuple Tables
A data store can contain several tuple tables, which are containers for actual data. The data of a tuple table is a collection of items called facts, and each fact can be understood as a list of RDF resources (i.e., IRIs, blank nodes, or literals). Facts with just three components are commonly called triples. Each tuple table is identified with a name that must be unique within a data store. Moreover, each tuple table has a minimal and maximal arity, which are numbers determining the smallest and the largest numbers of RDF resources in a fact stored in the tuple table. In most cases, the minimal and maximal arity are the same, in which case they are called just arity.
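The notions of fact and arity can be made concrete with a short sketch. The class `ToyTupleTable` and its method `add_fact` are hypothetical and exist only to illustrate the description above; they say nothing about how RDFox stores facts internally.

```python
class ToyTupleTable:
    """Hypothetical tuple table: a named set of facts with bounded arity."""

    def __init__(self, name, min_arity, max_arity):
        if min_arity > max_arity:
            raise ValueError("min_arity must not exceed max_arity")
        self.name = name
        self.min_arity = min_arity
        self.max_arity = max_arity
        self.facts = set()

    def add_fact(self, *resources):
        """Store a fact if its number of resources lies within the arity bounds."""
        if not self.min_arity <= len(resources) <= self.max_arity:
            raise ValueError(
                f"{self.name} accepts facts with {self.min_arity} to "
                f"{self.max_arity} resources, got {len(resources)}"
            )
        self.facts.add(tuple(resources))


# Minimal and maximal arity coincide, so this table has arity 3 and holds triples.
triples = ToyTupleTable("triples", min_arity=3, max_arity=3)
triples.add_fact(":Peter", "rdf:type", ":Professor")
```

Calling `triples.add_fact(":Peter", "rdf:type")` raises `ValueError` in this sketch, because a fact with two resources falls outside the table's arity.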
RDFox uses the very general concept of a tuple table to represent many different kinds of data containers.
In-memory tuple tables are used to store triples of the default and the named graphs, and they are described in more detail in Section 9.4.
Data source tuple tables provide a ‘virtual view’ over data in non-RDF data sources, such as CSV files, relational databases, or an Apache Solr index. Importing external data is explained in detail in Section 10.
Built-in tuple tables contain some well-known facts that can be useful in various applications of RDFox, and they are described in more detail in Section 9.5.
Each fact in a tuple table is associated with one or more fact domains. Intuitively, the domain of a fact reflects how a fact was added to a tuple table — that is, whether a fact was explicitly introduced by the user or derived through reasoning, and so on. Fact domains are described in more detail in Section 9.2.
4.4. Data Sources
To support accessing data in formats other than RDF, one or more data sources can be registered with a data store. Registering a data source requires specifying a number of parameters that govern how the data is accessed. Each data source is identified by a name that is unique within the data store. Access to the actual data is provided by data source tuple tables, which are created by referencing previously registered data sources. The process of accessing external data sources is described in more detail in Section 10.
4.5. OWL Axioms
To support reasoning with OWL ontologies, one can import OWL axioms into a data store. For example, an OWL axiom can be used to state that the :Professor class is a subclass of the :Person class; if such an axiom is imported into a data store, RDFox will automatically infer that each instance of :Professor is also an instance of :Person. RDFox associates a separate set of axioms with each named graph, and it provides APIs for adding or removing axioms in either the Functional-Style Syntax (FSS) or the RDF-based syntax.
4.6. Rules
To support general reasoning, one can import Datalog rules into a data store. Rules can intuitively be understood as “if-then” statements expressing general truths about a domain of interest. For example, the rule :Person[?X] :- :Professor[?X] . states that every professor is a person. If such a rule is added to a data store that also contains the triples :Peter rdf:type :Professor . and :Paul rdf:type :Professor . then RDFox will automatically derive the triples :Peter rdf:type :Person . and :Paul rdf:type :Person . as well. RDFox also supports incremental reasoning: if the triple :Paul rdf:type :Professor . is later removed from the data store, RDFox will automatically retract :Paul rdf:type :Person . too. RDFox provides ways to add and remove rules, as well as to retrieve the rules in a data store. The rule language of RDFox, the provided reasoning support, and examples of use of reasoning in practical applications are discussed in detail in Section 6.
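The effect of the professor rule, including what happens when an explicit triple is removed, can be imitated with a few lines of Python. This is a deliberately naive sketch: the function `materialize` is hypothetical, it hard-codes the single rule from the example, and it recomputes all derived triples from scratch after every change. RDFox updates its derivations incrementally instead; only the observable outcome is meant to match.

```python
RDF_TYPE = "rdf:type"


def materialize(explicit):
    """Apply :Person[?X] :- :Professor[?X] . until no new triples appear."""
    facts = set(explicit)
    changed = True
    while changed:
        changed = False
        for subject, predicate, obj in list(facts):
            if predicate == RDF_TYPE and obj == ":Professor":
                derived = (subject, RDF_TYPE, ":Person")
                if derived not in facts:
                    facts.add(derived)
                    changed = True
    return facts


explicit = {
    (":Peter", RDF_TYPE, ":Professor"),
    (":Paul", RDF_TYPE, ":Professor"),
}
before = materialize(explicit)

# Removing an explicit triple also removes everything that depended only on it.
explicit.remove((":Paul", RDF_TYPE, ":Professor"))
after = materialize(explicit)
```

After the removal, `after` still contains the derived triple for `:Peter` but no longer the one for `:Paul`, mirroring the behavior described above.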
Each rule in a data store is associated with one or more of the following three rule domains.
The user rule domain contains the rules that a user added explicitly, either by using the RDFox API or by importing a Datalog document.
The axioms rule domain contains rules obtained by translating all OWL axioms in the data store (in both axiom domains), thus allowing RDFox to perform OWL reasoning. Only axioms that conform to the OWL 2 RL profile are translated. Rules in this rule domain cannot be manipulated directly by the user; rather, a user can add/remove triples and/or axioms in the user axiom domain, and RDFox will automatically adjust the rules in the axioms rule domain.
The internal rule domain contains rules that RDFox uses internally. The exact internal rules used depend on the data store configuration, and they are discussed in more detail in Section 6.6.6.
A rule can belong to multiple rule domains. For example, a rule could be added by the user to the user rule domain, and it could also be obtained from OWL axioms and thus be added to the axioms rule domain.
4.7. Roles
A server can contain several roles, each representing an actor (or a type of actor) allowed to access a server. Performing actions on a server or its parts requires first authenticating as one of that server’s roles. The access control model of RDFox is described in detail in Section 11.
4.8. Connections
All RDFox Core functionality is accessed via connections. Communication with servers is mediated by server connections while communication with data stores is mediated by data store connections.
Obtaining an initial server or data store connection requires authenticating as one of the server’s roles. Thereafter, clients can create additional connections using the same role without re-authenticating, either by duplicating an existing connection or by creating data store connections from a server connection.
In general, connections are designed for use from one thread at a time. That is, while connections have no affinity to particular threads, calling connection methods concurrently from more than one thread will generally result in undefined behavior including crashes. Specific exceptions to this rule exist to support duplication of connections, password verification and the interruption of operations active on a particular connection.
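One common way to respect the one-thread-at-a-time rule is to give each worker thread its own connection. The sketch below illustrates that pattern with a toy class: `ToyConnection` and its methods `duplicate`, `run`, and `close` are hypothetical stand-ins invented here, not the RDFox API, and `run` merely doubles a number in place of real work.

```python
import threading


class ToyConnection:
    """Hypothetical connection; a stand-in, not the RDFox API."""

    def __init__(self, role):
        self.role = role
        self.closed = False

    def duplicate(self):
        """Create another connection that uses the same role."""
        return ToyConnection(self.role)

    def run(self, value):
        """Placeholder for real work performed over the connection."""
        if self.closed:
            raise RuntimeError("connection is closed")
        return value * 2

    def close(self):
        self.closed = True


def worker(connection, value, results, index):
    # Each worker is the only thread that ever touches its connection.
    try:
        results[index] = connection.run(value)
    finally:
        connection.close()


initial = ToyConnection("admin")
values = [1, 2, 3]
results = [None] * len(values)

# Duplicate up front, then hand each duplicate to exactly one thread.
duplicates = [initial.duplicate() for _ in values]
threads = [
    threading.Thread(target=worker, args=(connection, value, results, index))
    for index, (connection, value) in enumerate(zip(duplicates, values))
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
initial.close()
```

Because no connection is shared between threads, no locking is needed around `run`, and every connection is closed as soon as its owner has finished with it.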
Note
A data store cannot be deleted while there are open connections to it. For this reason, and to minimise resource usage, connections should be closed when no longer needed.
While the above characteristics of connections are true regardless of which RDFox interface (shell, endpoint or Java) is being used, each interface has its own specific mechanisms for managing connections. For details of connection management in the shell see Section 16.2.1. For details of how connections are created by the endpoint see Section 14.2.4 and Section 14.2.5. For information about how to create connections in Java, see the API documentation for the ConnectionFactory class.