8. Import and Export¶
In order to apply RDFox’s querying and reasoning capabilities to a given data
set, the data must first be loaded into RDFox together with any applicable
rules and axioms. Small amounts of data may be loaded using SPARQL INSERT
DATA
statements, but in most cases it is preferable to import the content
from one of the supported formats. Later, it may be necessary to extract
content from RDFox in one of these formats, and the most efficient way of
achieving this is to use the export operation. This section begins by listing
the formats supported by RDFox for import and export and then looks at the
specifics of each of these operations in turn.
8.1. Data Store Content Formats¶
The following formats can be imported into or exported from RDFox data stores.
The N-Triples format has MIME type
application/n-triples
.The Turtle format has MIME type
text/turtle
.The N-Quads format has MIME type
application/n-quads
.The TriG format has MIME type
application/trig
.The OWL 2 Functional-Style Syntax format has the MIME type
text/owl-functional
.RDFox supports proprietary formats
application/x.gen-n-triples
,text/x.gen-turtle
,application/x.gen-n-quads
, andapplication/x.gen-trig
which extend the corresponding standard formats by support for generalized triples and quads (i.e., triples/quads where any component can be an IRI, a blank node, or a literal).RDFox uses a proprietary format described in Section 10.4 to capture datalog rules and facts. The MIME type of this format is
application/x.datalog
.
Note
The list of MIME types supported for import and export of data store content
can also be viewed using the help
command in the RDFox shell.
8.2. Import¶
This section describes how facts, rules and axioms may be imported into RDFox data stores. The goal is to help users understand those aspects of import operations that are true no matter which interface (such as the RDFox shell, REST or Java) is used. If you are looking for details of how to perform an import using a specific interface, please refer to Section 15.2.23 for the RDFox shell or to Section 16.8 for the REST or Java APIs. Reading this section first will help you make sense of the various options available in those sections.
RDFox uses the concept of domains to segregate content according to how it
came to be present in the system. Facts are imported into the explicit
fact domain. Rules are imported into the user
rule domain.
8.2.1. Import Parameters¶
This section describes the parameters of each import operation. Other aspects
of import are determined by the configuration of the data store that is
receiving the content; for details, please refer to the data store parameters
in Section 5.4.1 whose names begin with import.
.
- Target Default Graph
According to the RDF 1.1 standards, an RDF dataset includes a default graph, which has no name, and zero or more named graphs. While only some of the formats supported by RDFox have a syntax for specifying named graphs, all of them have a syntax for the default graph. For each import operation, RDFox allows the user to specify a named graph to act as the default graph. For facts, this determines the graph that receives the imported content’s default graph triples. For rules, it determines the graph that default graph atoms refer to. For OWL axioms, it determines in which graph the axioms will be applied. Note that the data store parameter
default-graph-name
, if set, supplies a default value for this import parameter. If no value is available for the target default graph, default graph triples are written to the data store’sDefaultTriples
tuple table if it exists or cause an error to be raise if not. See also Section 5.2.- Add or Delete
RDFox can import content for the purpose of adding it to or deleting it from a data store. When content is being added, it is possible to also add any prefix definitions found in the content to the data store’s prefixes. These defintions override any existing definitions where there is a conflict.
- Content Sources
For each import operation, the source or sources of the content to add or delete must be specified. All interfaces provide ways to supply the content directly, or as a list of file paths or URIs. Please refer to the
allowed-schemes-on-load
andsandbox-directory
server parameters that, for security purposes, restrict the URI schemes and file paths that can be used in an import operation.- Content Format
It is possible to explicitly specify the format of the content being imported or allow RDFox to “guess” the format by inspecting the content. When a format is specified using its MIME type, an error will result if the content in any of the specified sources does not conform to that format. If the content type is not specified explicitly, RDFox will try to parse the beginning of the content from each specified content source in each of its supported formats to determine which format does not result in an error, and then proceeds to import the content in that format.
Note
When importing data in N-Quads, the content type (
application/n-quads
) must be specified explicitly.
8.2.2. Parallelization of Imports¶
RDFox will always attempt to parallelize the import of facts using the available server threads (see the num-threads server parameter). If only one source is given, the available threads will be divided between parsing the content and loading the parsed facts into tuple tables. If multiple sources are given, each will be assigned to a thread that is then responsible for both parsing and loading facts from that source.
8.3. Export¶
This section describes how facts, rules and axioms may be exported from RDFox data stores. The goal is to help users understand those aspects of the export operation which are true no matter which interface (RDFox shell, REST or Java) is used. If you are looking for details of how to perform an export in the RDFox shell or using an API, please refer to Section 15.2.20 or Section 16.8, respectively. Reading this section first will help you make sense of the various options available in those sections.
8.3.1. Export Parameters¶
This section describes the parameters that control the behaviour of export operations.
- Destination
For each export operation, the user must specify the destination. All interfaces provide ways to either receive the content directly or specify the path to which RDFox should write the output file. See the
sandbox-directory
server parameter which restricts, for security purposes, the file paths to which a server can write.- Content Format
The format for the export must be specified using its MIME type. Some interfaces satisfy this requirement by having their own default value. The choice of the format determines which portion of the data store content is exported. For RDF formats that do not support named graphs (specifically Turtle, N-Triples and their generalized equivalents), only the triples from the default graph will be exported (exporting specific named graphs into these formats can be achieved using a query that has variables
?S
,?P
, and?O
). For RDF formats that do support named graphs (specifically TriG, N-Quads and their generalized equivalents), both the default graph triples and named-graph quads are exported. If the Datalog format is specified, the data store’s rules are exported. If OWL 2 Functional-Style syntax is specified, the data store’s axioms are exported.- Fact or Rule Domain
RDFox supports restricting imports by fact or rule domain. When exporting RDF, the triples or quads from the
explicit
domain are exported by default. This can be overridden by specifying a single fact domain. When exporting Datalog, the rules from theuser
domain are exported by default but this can be overridden by specifying a space-separated list of rule domains.
8.4. Access Control¶
Like all operations, import and export are subject to the rules of RDFox’s access control system (see Section 12). When specifying access control policies at the granularity of named graphs, it is important to be aware that triples in unreadable named graphs are silently skipped during export. For a more detailed explanation of this see Section 12.2.4.2.