LODKit is a collection of Linked Open Data related Python functionalities.
LODKit is available on PyPI:
pip install lodkit

The lodkit.ttl triple constructor implements a Turtle-inspired functional DSL for RDF Graph generation.
lodkit.ttl aims to emulate RDF Turtle syntax by featuring Python equivalents for Turtle Predicate Lists, Object Lists, Blank Nodes and Collections, and is recursive/composable on all code paths.
lodkit.ttl implements the Iterable[lodkit.types.Triple] protocol and exposes a to_graph method for convenient construction of an rdflib.Graph instance.
The following examples show features of the lodkit.ttl triple constructor and display the equivalent RDF graph for comparison.
The lodkit.ttl constructor takes a triple subject and an arbitrary number of triple predicate-object constellations as input; this aims to emulate Turtle Predicate List notation.
The constructor accepts any RDFLib-compliant triple object in the object position; plain strings are interpreted as rdflib.Literal.
from lodkit import ttl
from rdflib import Namespace

ex = Namespace("https://example.com/")

triples = ttl(
    ex.s,
    (ex.p, ex.o),
    (ex.p2, "literal")
)

@prefix ex: <https://example.com/> .

ex:s ex:p ex:o ;
    ex:p2 "literal" .

Predicate-object constellation arguments in lodkit.ttl can be of arbitrary length; the first element is interpreted as the triple predicate, and all succeeding elements are interpreted as a Turtle Object List.
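The expansion rule can be illustrated without lodkit or RDFLib. The following is a minimal sketch, not lodkit's implementation: `expand` is a hypothetical helper, and plain strings stand in for RDFLib terms. It shows how a subject plus predicate-object constellations of arbitrary length flatten into one triple per object.

```python
from collections.abc import Iterator

# Hypothetical stand-in for lodkit.types.Triple: plain strings instead of RDFLib terms.
Triple = tuple[str, str, str]


def expand(subject: str, *constellations: tuple[str, ...]) -> Iterator[Triple]:
    """Yield one triple per object of every predicate-object constellation."""
    for predicate, *objects in constellations:
        # The first element is the predicate, the rest form the object list.
        for obj in objects:
            yield (subject, predicate, obj)


triples = list(
    expand(
        "ex:s",
        ("ex:p", "ex:o1", "ex:o2", "literal"),
        ("ex:p2", "ex:o3"),
    )
)
```

Nested structures such as blank nodes and collections are deliberately left out of this sketch.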
triples = ttl(
    ex.s,
    (ex.p, ex.o1, ex.o2, "literal")
)

@prefix ex: <https://example.com/> .

ex:s ex:p ex:o1, ex:o2, "literal" .

Python lists (of predicate-object constellations) in the object position of predicate-object constellations are interpreted as Turtle Blank Nodes.
triples = ttl(
    ex.s,
    (
        ex.p, [
            (ex.p2, ex.o),
            (ex.p3, "1", "2")
        ]
    )
)

@prefix ex: <https://example.com/> .

ex:s ex:p [
    ex:p2 ex:o ;
    ex:p3 "1", "2"
] .

Python tuples in the object position of predicate-object constellations are interpreted as Turtle Collections:
triples = ttl(
    ex.s,
    (ex.p, (ex.o, "1", "2", "3"))
)

@prefix ex: <https://example.com/> .

ex:s ex:p ( ex:o "1" "2" "3" ) .

One of the strengths of lodkit.ttl is that it is recursive on all code paths.
To demonstrate the composability of the lodkit.ttl constructor, one could e.g. define a lodkit.ttl object that has another lodkit.ttl object and a blank node with an object list and yet another lodkit.ttl object (in a single element RDF Collection) defined within an RDF Collection:
triples = ttl(
    ex.s,
    (
        ex.p,
        (
            ttl(ex.s2, (ex.p2, "1")),
            [
                (ex.p3, "2", "3"),
                (ex.p4, (ttl(ex.s3, (ex.p5, "4")),))
            ],
        ),
    ),
)

@prefix ex: <https://example.com/> .

ex:s ex:p (
    ex:s2
    [
        ex:p3 "2", "3" ;
        ex:p4 ( ex:s3 )
    ]
) .

ex:s2 ex:p2 "1" .
ex:s3 ex:p5 "4" .

This is actually a relatively simple example. Triple objects in the lodkit.ttl constructor can be arbitrarily nested.
lodkit.ttl is pretty recursive! :)
As mentioned, lodkit.ttl implements the Iterable[lodkit.types.Triple] protocol; arbitrary lodkit.ttl instances can therefore be chained to create highly modular and scalable triple generation pipelines.
A minimal example of such a (layered) triple pipeline could look like this:
import itertools
from collections.abc import Iterator

from lodkit import ttl
from lodkit.types import Triple


# Schematic example: conditional, s, p, o and the elided parts are placeholders.
class TripleGenerator:
    def triple_generator_1(self) -> Iterator[Triple]:
        if conditional:
            yield (s, p, o)
        yield from ttl(s, ...)

    # more triple generator method definitions
    ...

    def __iter__(self) -> Iterator[Triple]:
        return itertools.chain(
            self.triple_generator_1(),
            self.triple_generator_2(),
            self.triple_generator_3(),
            ...
        )


triples: Iterator[Triple] = itertools.chain(TripleGenerator(), ...)

LODKit provides a TripleChain class for convenient triple chain construction. Also see Building Triple Chains.
lodkit.TripleChain is a simple itertools.chain subclass that implements a fluent chain interface for arbitrary successive chaining and a to_graph method for deriving an rdflib.Graph from a given chain.
Note that, unlike lodkit.ttl, TripleChain is an Iterator and can be exhausted, e.g. by calling TripleChain.to_graph.
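The difference between a re-iterable object and a one-shot iterator can be sketched with the standard library alone. The example below is an illustration under assumptions, not lodkit's code: `FluentChain` is a hypothetical itertools.chain subclass whose `chain` method mimics the fluent interface described above, and integers stand in for triples.

```python
import itertools
from collections.abc import Iterable


class FluentChain(itertools.chain):
    """Hypothetical itertools.chain subclass with a fluent chain() method."""

    def chain(self, *others: Iterable) -> "FluentChain":
        # Wrap the current (lazy) chain together with further iterables.
        return FluentChain(self, *others)


reusable = [1, 2]  # a list is an Iterable and can be traversed repeatedly
chained = FluentChain(reusable, [3]).chain([4, 5])

first_pass = list(chained)   # consumes the iterator
second_pass = list(chained)  # the iterator is exhausted by now
```

After the first pass the chain yields nothing more, while the underlying list can still be iterated again. This is why a chain that has already been consumed, for instance by building a graph from it, needs to be rebuilt before reuse.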
from collections.abc import Iterator

from lodkit import TripleChain, ttl
from lodkit.types import Triple
from rdflib import Graph, Namespace

ex = Namespace("https://example.com/")

triples = ttl(ex.s, (ex.p, "1", "2", "3"))
more_triples = ttl(ex.s, (ex.p2, [(ex.p3, ex.o)]))
yet_more_triples = ttl(ex.s, (ex.p3, ex.o))


def any_iterable_of_triples() -> Iterator[Triple]:
    yield (ex.s, ex.p, ex.o)


triple_chain = (
    TripleChain(triples, more_triples)
    .chain(yet_more_triples)
    .chain(any_iterable_of_triples())
)

ex_graph = Graph()
ex_graph.bind("ex", ex)

graph: Graph = triple_chain.to_graph(graph=ex_graph)
print(graph.serialize())

@prefix ex: <https://example.com/> .

ex:s ex:p ex:o,
        "1",
        "2",
        "3" ;
    ex:p2 [ ex:p3 ex:o ] ;
    ex:p3 ex:o .

lodkit.RDFImporter is a custom importer for parsing RDF files into rdflib.Graph objects.
Assuming graphs/some_graph.ttl exists in the import path, lodkit.RDFImporter makes it possible to import the RDF file like a module:
from graphs import some_graph

type(some_graph)  # <class 'rdflib.graph.Graph'>

RDF import functionality is available after registering lodkit.RDFImporter with the import machinery, e.g. by calling lodkit.enable_rdf_import.
lodkit.types defines several useful types for working with RDFLib-based Python code.
The URIConstructor class provides namespaced URI constructor functionality.
A URIConstructor is initialized given a namespace.
Calls to the initialized object will construct rdflib.URIRefs for that namespace.
If a hash_value argument of type str | bytes is provided, the URIRef is generated with the SHA-256 hash of the hash_value argument as its last URI component;
otherwise, a URIRef with a unique component is generated using UUID4.
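The two code paths can be sketched with hashlib and uuid from the standard library. This is an illustrative sketch under assumptions, not the lodkit implementation: `SketchURIConstructor` is a hypothetical name, it returns plain strings rather than rdflib.URIRef objects, and encoding str input as UTF-8 and rendering the hash as a hex digest are both assumptions.

```python
import hashlib
import uuid
from typing import Optional, Union


class SketchURIConstructor:
    """Hypothetical stand-in that returns plain strings instead of rdflib.URIRef."""

    def __init__(self, namespace: str) -> None:
        self.namespace = namespace

    def __call__(self, hash_value: Optional[Union[str, bytes]] = None) -> str:
        if hash_value is None:
            # No input given: fall back to a random, unique component.
            return f"{self.namespace}{uuid.uuid4()}"
        # Assumption: str input is encoded as UTF-8 before hashing.
        data = hash_value.encode("utf-8") if isinstance(hash_value, str) else hash_value
        # Assumption: the hash is rendered as a hex digest.
        return f"{self.namespace}{hashlib.sha256(data).hexdigest()}"


make_sketch_uri = SketchURIConstructor("https://example.com/")

random_uri = make_sketch_uri()
stable_uri = make_sketch_uri("test")
```

Hash-based URIs are deterministic, so the same input always maps to the same identifier, whereas each call without an argument produces a fresh one.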
make_uri = URIConstructor("https://example.com/")

make_uri()  # rdflib.URIRef('https://example.com/<UUID4>')
make_uri("test")  # rdflib.URIRef('https://example.com/<sha256>')
make_uri("test") == make_uri("test")  # True

ClosedOntologyNamespace is an rdflib.namespace.ClosedNamespace-inspired utility class that constructs an immutable (closed) mapping of RDF term names to IRIs based on an Ontology or generally an RDF graph source.
Given a lodkit.types.GraphParseSource or an rdflib.Graph, a MappingProxyType[str, rdflib.URIRef] mapping is created and stored in ClosedOntologyNamespace.mapping by
- Querying the RDF source for RDF class and property definitions (RDF/RDFS/OWL class/property type assertions and OWL named individual assertions)
- Deriving RDF term names by extracting the last IRI component delimited by #, / or : for generating the RDF term name -> IRI mapping.
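The name derivation step can be sketched as follows. This is a minimal illustration under assumptions, not lodkit's code: `term_name` is a hypothetical helper, the IRIs are listed by hand instead of being queried from a graph, and the urn:example IRI is made up for the example.

```python
import re
from types import MappingProxyType


def term_name(iri: str) -> str:
    """Return the last IRI component after '#', '/' or ':'."""
    return re.split(r"[#/:]", iri)[-1]


# Hand-picked IRIs; a real source would be queried for class/property definitions.
iris = [
    "http://www.cidoc-crm.org/cidoc-crm/E52_Time-Span",
    "http://www.w3.org/2002/07/owl#Thing",
    "urn:example:Person",
]

# MappingProxyType gives a read-only view, which makes the namespace "closed".
mapping = MappingProxyType({term_name(iri): iri for iri in iris})
```

Names that are not valid Python identifiers, such as `E52_Time-Span`, can only be reached through item lookup on such a mapping.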
Namespace members are accessible as both attributes and items of a given ClosedOntologyNamespace instance, i.e. attribute and item access is routed to ClosedOntologyNamespace.mapping.
For dictionary operations over the namespace mapping, the public ClosedOntologyNamespace.mapping can be accessed directly.
The following example loads a remote Ontology and accesses namespace members using attribute and item lookup.
from lodkit import ClosedOntologyNamespace

crm = ClosedOntologyNamespace(
    source="https://cidoc-crm.org/rdfs/7.1.3/CIDOC_CRM_v7.1.3.rdf"
)

crm.E92_Spacetime_Volume  # URIRef('http://www.cidoc-crm.org/cidoc-crm/E92_Spacetime_Volume')
crm["E52_Time-Span"]  # URIRef('http://www.cidoc-crm.org/cidoc-crm/E52_Time-Span')

crm.E21_Author  # AttributeError
crm["E21-Person"]  # AttributeError

Note that lookup failure for both attribute and item access on ClosedOntologyNamespace objects raises an AttributeError!
In the case of RDF term names conflicting with class namespace names, the class namespace names take precedence for attribute access; conflicting RDF terms are still accessible via item lookup or through the ClosedOntologyNamespace.mapping proxy.
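The lookup behavior described above can be approximated in a few lines. The class below is a hypothetical sketch, not lodkit's implementation: `SketchNamespace` takes a ready-made dict instead of an RDF source, plain strings stand in for rdflib.URIRef, and the `https://example.com/mapping` term is invented to provoke a name conflict.

```python
from types import MappingProxyType


class SketchNamespace:
    """Hypothetical closed namespace; not lodkit's implementation."""

    def __init__(self, terms: dict[str, str]) -> None:
        self.mapping = MappingProxyType(dict(terms))

    def __getattr__(self, name: str) -> str:
        # Only called when normal attribute lookup fails,
        # so names already defined on the object take precedence.
        try:
            return self.mapping[name]
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, name: str) -> str:
        # Item access goes straight to the mapping but also raises AttributeError.
        try:
            return self.mapping[name]
        except KeyError:
            raise AttributeError(name) from None


ns = SketchNamespace(
    {
        "E21_Person": "http://www.cidoc-crm.org/cidoc-crm/E21_Person",
        "mapping": "https://example.com/mapping",  # conflicts with the attribute name
    }
)

person = ns.E21_Person         # routed through __getattr__
conflicting = ns["mapping"]    # the term is still reachable via item lookup
proxy = ns.mapping             # the attribute wins over the term of the same name
```

Because `__getattr__` is only a fallback, the precedence rule falls out of Python's normal attribute resolution without extra bookkeeping.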
Note that currently ClosedOntologyNamespace is a highly dynamic runtime construct and does not support static analysis and IDE completion for namespace entries.