From 3e2eb2d1db93f8949c81b3deaceb0f359c1b49f0 Mon Sep 17 00:00:00 2001 From: Pieter Colpaert Date: Thu, 5 Jun 2025 12:01:56 +0200 Subject: [PATCH 01/11] Clean up vocabulary prefixes and change cc to dct --- tree.ttl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tree.ttl b/tree.ttl index de2594d..9a7d84f 100644 --- a/tree.ttl +++ b/tree.ttl @@ -22,7 +22,7 @@ tree: a foaf:Document ; foaf:primaryTopic tree:Ontology; - cc:license ; + dct:license ; dct:creator . foaf:name "Pieter Colpaert"; foaf:mbox "pieter.colpaert@ugent.be". @@ -213,4 +213,4 @@ tree:timeQuery a rdf:Property ; rdfs:label "Time Query"@en; rdfs:comment "Will search for elements starting from a certain timestamp"@en; rdfs:domain tiles:Node; - rdfs:range xsd:dateTime. + rdfs:range xsd:dateTime. \ No newline at end of file From d9353124557bbab1dbf1cba83ccd3592fe38d79b Mon Sep 17 00:00:00 2001 From: Pieter Colpaert Date: Thu, 5 Jun 2025 12:22:00 +0200 Subject: [PATCH 02/11] Rewrite of the MEA and split between TREE MEA and Shape Topologies (generalized) --- 01-tree-specification.bs | 66 ++++++++++++++++++++++++++++++++-------- 1 file changed, 54 insertions(+), 12 deletions(-) diff --git a/01-tree-specification.bs b/01-tree-specification.bs index 7f2e461..1394d15 100644 --- a/01-tree-specification.bs +++ b/01-tree-specification.bs @@ -114,7 +114,9 @@ A `tree:search` form is an IRI template, that when filled out with the right par A search tree is the -- in this document -- implicit concept of a set of interlinked `tree:Node`s publishing a `tree:Collection`. It will adhere to a certain growth or tree balancing strategy. -In one tree, completeness MUST be guaranteed, unless indicated otherwise (as is possible in LDES using a retention policy). +In one tree, completeness MUST be guaranteed, unless explicitly indicated otherwise. 
+ +Note: [Linked Data Event Streams](https://w3id.org/ldes/specification) is a specialization of TREE that can indicate incompleteness of a search tree using a retention policy. # Initialization # {#init} @@ -134,19 +136,59 @@ This report also explains how clients MAY implement support for extracting conte Note: Having an identifier for the collection has become mandatory: without it you can otherwise not define completeness. -# The member extraction algorithm # {#member-extraction-algorithm} +# The Member Extraction Algorithm # {#member-extraction-algorithm} + +
+RDF does not natively provide a way to reference a set of triples or quads, and can thus not unambiguously point to a member `M` without additional explanation. +While [named graphs may be interpreted in a technology-specific stack as a reference to a set of triples](https://www.w3.org/TR/2014/NOTE-rdf11-datasets-20140225/), also other implementations exist that use it for different purposes. +This makes it more complex: we both want to introduce a TREE-specific interpretation for named graph, as well as support existing datasets that already use named graphs for a different purpose. +Therefore, this specification introduces a pragmatic set of decisions bundled in the Member Extraction Algorithm. +Implementing this algorithm provides interoperability across clients that want to extract members in a common way. +This way, it provides a way for a data provider to be clear about their intention towards a domain-agnostic processor. +
+ +Depending on its goals, a client MAY implement this algorithm to extract a set of quads that were intended as the member quads by the data provider. +On the one hand, there is the TREE specific MEA that supports subject-based star patterns, named graphs and doing an extra HTTP request for out-of-band members. +On the other hand, there is the generalized MEA that extends the TREE MEA with support for including more complex patterns, called shape topologies, and for partially out-of-band members. +Additionally, there is also the TREE profile algorithm that provides a syntactic trick to get to lowest overhead possible for extracting members. + +Note: The MEA is indeed not mandatory for all TREE clients. For example, a client interested in autocompletion might be interested in only extracting the text literals. A client focused on SPARQL querying will know what quads it wants to select. However, a client that is built to domain agnostically do an operation on top of the TREE collection and pass it on to a next step, might want to understand the full package of quads that was intended by the data provider. + +## The TREE Member Extraction Algorithm ## {##tree-member-extraction-algorithm} + +The TREE MEA is a combination of subject-based star patterns (cfr. [[!CBD]]) and named graphs. +First we introduce a couple of symbols: + * `N` is the set of all named graphs used in the current page, including the default graph. + * `Nb` is the set of blank nodes in `N`, excluding the default graph. + * `f` is the first focus node of a member we are extracting found in a ` tree:member ?f` triple. + * `F` is the set of all member focus nodes matching the ` tree:member ?f` pattern. + +A client extracting members MUST iterate over all terms `f` in `F`. +`f` can be either a blank node, a named node, or a triple term. +If `f` is a triple term, the triple itself is the member. +If `f` is a literal an error MUST be returned. 
+ +In the case it is an IRI or a blank node, the client MUST look for further triples as follows: + 1. A set of named graphs that need to be ignored `I` MUST be created. These are named graphs that will be explicitly used to package triples to be entirely part of `M`. `I` is the union of `Nb` and `F`. + 2. For each `g` in `N \ ( Nb ∪ F)`, resolve the quad pattern `GRAPH g { f ?p ?o }` and add those quads to `M`. + 3. For every object `o` that is a result of `?o` in step 2 that is a blank node, + - repeat step 2 with that blank node as `f`. + - solve the quad pattern `GRAPH o { ?s ?p ?o }` and add all quads to `M`. + 4. resolve the quad pattern `GRAPH f { ?s ?p ?o }` and add all quads to `M`. + 5. If `M` is still an empty set of quads, dereference `f` and perform the algorithm on that response again. + +Issue: In step 5: should we perform the algorithm again on the quads in the response, or should we just add all quads found in the response and thus use the triples in the file as a package? + +## The Generalized Member Extraction Algorithm ## {##generalized-member-extraction-algorithm} + +Note: As the Shape Topologies algorithm requires more advanced processing, we discourage publishers from relying on it as it will have a negative effect on client performance. On top of that, while the TREE MEA is straightforward to implement, the generalized MEA is more tedious for developers, which results in a large support for the TREE MEA among client tooling, while the Shape Topologies algorithm may get omitted. It does however has functionalities that are not supported by the more simple TREE MEA. -The member extraction algorithm allows a data publisher to define their members in different ways: - 1. As in the examples above: all quads with the object of the `tree:member` quads as a subject (and recursively the quads of their blank nodes) are by default included (see also [[!CBD]]), except when they would explicitly not be included in case 3, when the shape would be closed. 
- 2. Out of band / in band: - - when no quads of a member have been found, the member will be dereferenced. This allows to publish the member on a separate page. - - part of the member can be maintained elsewhere when a shape is defined (see 3) - 3. By defining a more complex shape with `tree:shape`, also nested entities can be included in the member - 4. By putting the triples in a named graph of the object of `tree:member`, all these triples will be matched. +When on this `tree:Node`, `tree:shapeTopology` has been set to true, the algorithm in the [Shape Topology report](https://w3id.org/tree/specification/shape-topologies) MUST be used to extract the set of quads as intended by the server using the `tree:shape`. +This is an extension of the TREE Member Extraction Algorithm in which a SHACL shape can indicate: + 1. Partially out of band members + 2. Based on the shape, include or exclude a more specific set of quads. It does however not perform full validation of the members. -Depending on the goals of the client, it MAY implement the member extraction algorithm to fetch all triples about the entity as intended by the server. -The method used within TREE is combination of Concise Bounded Descriptions [[!CBD]], named graphs and the topology of a shape (deducted from the `tree:shape`). -The full algorithm is specified in the [shape topologies](https://w3id.org/tree/specification/shape-topologies) report. +A client MAY also provide a configuration flag to still extract it using the Shape Topology algorithm in case the `tree:shapeTopology` property has not been set. 
+ # Traversing the search tree # {#traversing} From f5144574f4c2ca61cf4b81146e00d2ff82313a17 Mon Sep 17 00:00:00 2001 From: Pieter Colpaert Date: Thu, 5 Jun 2025 12:25:40 +0200 Subject: [PATCH 03/11] Added shapeTopology term to the vocabulary --- tree.ttl | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/tree.ttl b/tree.ttl index 9a7d84f..52b0091 100644 --- a/tree.ttl +++ b/tree.ttl @@ -1,21 +1,11 @@ @prefix tree: . @prefix tiles: . -@prefix cc: . @prefix dct: . @prefix foaf: . -@prefix gsp: . -@prefix locn: . @prefix owl: . -@prefix prov: . @prefix rdf: . @prefix rdfs: . -@prefix schema: . @prefix sh: . -@prefix voaf: . -@prefix vs: . -@prefix wdrs: . -@prefix xhtm: . -@prefix xml: . @prefix xsd: . @prefix hydra: . @prefix dcat: . @@ -184,6 +174,12 @@ tree:conditionalImport a rdf:Property ; rdfs:label "Import conditionally"@en ; rdfs:comment "Imports a file in order being able to evaluate a tree:path correctly"@en ; rdfs:range tree:ConditionalImport . + +tree:shapeTopology a rdf:Property ; + rdfs:label "Shape Topology"@en; + rdfs:comment "A boolean to trigger a client to use the Shape Topology algorithm instead of the TREE Member Extraction Algorithm for extracting the member quads."@en; + rdfs:range xsd:boolean ; + rdfs:domain tree:Node . 
###### Properties for the Tiles ontology ###### Mind that tiles prefix is just a synonym for the tree prefix From e9e613d37e196cd91c77e9d0efcd0f3b7f7c1b80 Mon Sep 17 00:00:00 2001 From: Pieter Colpaert Date: Thu, 5 Jun 2025 12:27:14 +0200 Subject: [PATCH 04/11] Added shapeTopology term to the vocabulary.md --- vocabulary.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/vocabulary.md b/vocabulary.md index d02044f..79e8ffb 100644 --- a/vocabulary.md +++ b/vocabulary.md @@ -123,6 +123,13 @@ Links to the collection’s items that are the sh:targetNodes of th **Domain**: tree:Collection +### tree:shapeTopology ### {#shapeTopology} + +A boolean: if set to true, the client MUST apply the shape topology algorithm for extracting the members. + +**Domain**: the root tree:Node +**Range**: `xsd:boolean` + ### tree:import ### {#import} Imports a document containing triples needed for complying to the SHACL shape, or for evaluating the relation. From fae5e2aa76d7103c327e6d46c25a0db8c2d6958f Mon Sep 17 00:00:00 2001 From: Pieter Colpaert Date: Thu, 5 Jun 2025 12:27:45 +0200 Subject: [PATCH 05/11] Shape Topology on root node only --- 01-tree-specification.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/01-tree-specification.bs b/01-tree-specification.bs index 1394d15..cf96b50 100644 --- a/01-tree-specification.bs +++ b/01-tree-specification.bs @@ -183,7 +183,7 @@ Issue: In step 5: should we perform the algorithm again on the quads in the resp Note: As the Shape Topologies algorithm requires more advanced processing, we discourage publishers from relying on it as it will have a negative effect on client performance. On top of that, while the TREE MEA is straightforward to implement, the generalized MEA is more tedious for developers, which results in a large support for the TREE MEA among client tooling, while the Shape Topologies algorithm may get omitted. It does however has functionalities that are not supported by the more simple TREE MEA. 
-When on this `tree:Node`, `tree:shapeTopology` has been set to true, the algorithm in the [Shape Topology report](https://w3id.org/tree/specification/shape-topologies) MUST be used to extract the set of quads as intended by the server using the `tree:shape`. +When on the root `tree:Node`, `tree:shapeTopology` has been set to true, the algorithm in the [Shape Topology report](https://w3id.org/tree/specification/shape-topologies) MUST be used to extract the set of quads as intended by the server using the `tree:shape`. This is an extension of the TREE Member Extraction Algorithm in which a SHACL shape can indicate: 1. Partially out of band members 2. Based on the shape, include or exclude a more specific set of quads. It does however not perform full validation of the members. From bb018c772b27848d65794b73aa2bc7b301edebb2 Mon Sep 17 00:00:00 2001 From: Pieter Colpaert Date: Thu, 5 Jun 2025 12:31:13 +0200 Subject: [PATCH 06/11] Mention Profile Algorithm in main spec --- 01-tree-specification.bs | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/01-tree-specification.bs b/01-tree-specification.bs index cf96b50..8a44f49 100644 --- a/01-tree-specification.bs +++ b/01-tree-specification.bs @@ -190,6 +190,10 @@ This is an extension of the TREE Member Extraction Algorithm in which a SHACL sh A client MAY also provide a configuration flag to still extract it using the Shape Topology algorithm in case the `tree:shapeTopology` property has not been set. +## The TREE profile algorithm ## {##profile-based-algorithm} + +The profile algorithm, available in [a separate report](https://w3id.org/tree/specification/profile), is a syntactic trick that MAY be implemented by a client to minimize the overhead of the MEA. + # Traversing the search tree # {#traversing} After dereferencing a `tree:Node`, a client MUST extract all (zero or more) `tree:Relation` descriptions from the page. 
From b3f91be56c6828da85ae8b586376b8684db177db Mon Sep 17 00:00:00 2001 From: Pieter Colpaert Date: Fri, 6 Jun 2025 09:48:44 +0200 Subject: [PATCH 07/11] Fixed id --- 01-tree-specification.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/01-tree-specification.bs b/01-tree-specification.bs index ddbcde8..aaf8e65 100644 --- a/01-tree-specification.bs +++ b/01-tree-specification.bs @@ -190,7 +190,7 @@ This is an extension of the TREE Member Extraction Algorithm in which a SHACL sh A client MAY also provide a configuration flag to still extract it using the Shape Topology algorithm in case the `tree:shapeTopology` property has not been set. -## The TREE profile algorithm ## {##profile-based-algorithm} +## The TREE profile algorithm ## {#profile-based-algorithm} The profile algorithm, available in [a separate report](https://w3id.org/tree/specification/profile), is a syntactic trick that MAY be implemented by a client to minimize the overhead of the MEA. From b0f626b9ac3083727e31bba1488d712a9e47b40a Mon Sep 17 00:00:00 2001 From: Pieter Colpaert Date: Fri, 6 Jun 2025 09:50:57 +0200 Subject: [PATCH 08/11] Fixed id --- 01-tree-specification.bs | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/01-tree-specification.bs b/01-tree-specification.bs index aaf8e65..fb34cb2 100644 --- a/01-tree-specification.bs +++ b/01-tree-specification.bs @@ -154,7 +154,7 @@ Additionally, there is also the TREE profile algorithm that provides a syntactic Note: The MEA is indeed not mandatory for all TREE clients. For example, a client interested in autocompletion might be interested in only extracting the text literals. A client focused on SPARQL querying will know what quads it wants to select. However, a client that is built to domain agnostically do an operation on top of the TREE collection and pass it on to a next step, might want to understand the full package of quads that was intended by the data provider. 
-## The TREE Member Extraction Algorithm ## {##tree-member-extraction-algorithm} +## The TREE Member Extraction Algorithm ## {#tree-member-extraction-algorithm} The TREE MEA is a combination of subject-based star patterns (cfr. [[!CBD]]) and named graphs. First we introduce a couple of symbols: @@ -179,7 +179,7 @@ In the case it is an IRI or a blank node, the client MUST look for further tripl Issue: In step 5: should we perform the algorithm again on the quads in the response, or should we just add all quads found in the response and thus use the triples in the file as a package? -## The Generalized Member Extraction Algorithm ## {##generalized-member-extraction-algorithm} +## The Generalized Member Extraction Algorithm ## {#generalized-member-extraction-algorithm} Note: As the Shape Topologies algorithm requires more advanced processing, we discourage publishers from relying on it as it will have a negative effect on client performance. On top of that, while the TREE MEA is straightforward to implement, the generalized MEA is more tedious for developers, which results in a large support for the TREE MEA among client tooling, while the Shape Topologies algorithm may get omitted. It does however has functionalities that are not supported by the more simple TREE MEA. From a1c1300cb04430d1e2026adbb9ca0279b0098d66 Mon Sep 17 00:00:00 2001 From: Pieter Colpaert Date: Wed, 11 Jun 2025 13:13:11 +0200 Subject: [PATCH 09/11] Update 01-tree-specification.bs Co-authored-by: Ieben Smessaert --- 01-tree-specification.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/01-tree-specification.bs b/01-tree-specification.bs index fb34cb2..efad5a7 100644 --- a/01-tree-specification.bs +++ b/01-tree-specification.bs @@ -150,7 +150,7 @@ This way, it provides a way for a data provider to be clear about their intentio Depending on its goals, a client MAY implement this algorithm to extract a set of quads that were intended as the member quads by the data provider. 
On the one hand, there is the TREE specific MEA that supports subject-based star patterns, named graphs and doing an extra HTTP request for out-of-band members. On the other hand, there is the generalized MEA that extends the TREE MEA with support for including more complex patterns, called shape topologies, and for partially out-of-band members. -Additionally, there is also the TREE profile algorithm that provides a syntactic trick to get to lowest overhead possible for extracting members. +Additionally, there is also the TREE profile algorithm that provides a syntactic trick to get the lowest overhead possible for extracting members. Note: The MEA is indeed not mandatory for all TREE clients. For example, a client interested in autocompletion might be interested in only extracting the text literals. A client focused on SPARQL querying will know what quads it wants to select. However, a client that is built to domain agnostically do an operation on top of the TREE collection and pass it on to a next step, might want to understand the full package of quads that was intended by the data provider. From fad2f329accd86f003c46bbf1962bfd259d5d401 Mon Sep 17 00:00:00 2001 From: Pieter Colpaert Date: Wed, 11 Jun 2025 13:17:02 +0200 Subject: [PATCH 10/11] Update 01-tree-specification.bs Co-authored-by: Ieben Smessaert --- 01-tree-specification.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/01-tree-specification.bs b/01-tree-specification.bs index efad5a7..15e6c90 100644 --- a/01-tree-specification.bs +++ b/01-tree-specification.bs @@ -183,7 +183,7 @@ Issue: In step 5: should we perform the algorithm again on the quads in the resp Note: As the Shape Topologies algorithm requires more advanced processing, we discourage publishers from relying on it as it will have a negative effect on client performance. 
On top of that, while the TREE MEA is straightforward to implement, the generalized MEA is more tedious for developers, which results in a large support for the TREE MEA among client tooling, while the Shape Topologies algorithm may get omitted. It does however has functionalities that are not supported by the more simple TREE MEA. -When on the root `tree:Node`, `tree:shapeTopology` has been set to true, the algorithm in the [Shape Topology report](https://w3id.org/tree/specification/shape-topologies) MUST be used to extract the set of quads as intended by the server using the `tree:shape`. +When `tree:shapeTopology` has been set to true on the root `tree:Node`, the algorithm in the [Shape Topology report](https://w3id.org/tree/specification/shape-topologies) MUST be used to extract the set of quads as intended by the server using the `tree:shape`. This is an extension of the TREE Member Extraction Algorithm in which a SHACL shape can indicate: 1. Partially out of band members 2. Based on the shape, include or exclude a more specific set of quads. It does however not perform full validation of the members. From c50f26f3c953926e2d88335f8f2a85080ca40f2f Mon Sep 17 00:00:00 2001 From: Pieter Colpaert Date: Wed, 11 Jun 2025 13:20:01 +0200 Subject: [PATCH 11/11] Fixing @smessie comments --- 01-tree-specification.bs | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/01-tree-specification.bs b/01-tree-specification.bs index efad5a7..52c6df7 100644 --- a/01-tree-specification.bs +++ b/01-tree-specification.bs @@ -160,7 +160,7 @@ The TREE MEA is a combination of subject-based star patterns (cfr. [[!CBD]]) and First we introduce a couple of symbols: * `N` is the set of all named graphs used in the current page, including the default graph. * `Nb` is the set of blank nodes in `N`, excluding the default graph. - * `f` is the first focus node of a member we are extracting found in a ` tree:member ?f` triple. 
+ * `f` is the focus node of a member we are extracting, as found in a ` tree:member ?f` triple. * `F` is the set of all member focus nodes matching the ` tree:member ?f` pattern. A client extracting members MUST iterate over all terms `f` in `F`. @@ -173,22 +173,22 @@ In the case it is an IRI or a blank node, the client MUST look for further tripl 2. For each `g` in `N \ ( Nb ∪ F)`, resolve the quad pattern `GRAPH g { f ?p ?o }` and add those quads to `M`. 3. For every object `o` that is a result of `?o` in step 2 that is a blank node, - repeat step 2 with that blank node as `f`. - - solve the quad pattern `GRAPH o { ?s ?p ?o }` and add all quads to `M`. - 4. resolve the quad pattern `GRAPH f { ?s ?p ?o }` and add all quads to `M`. + - solve the quad pattern `GRAPH o { ?s ?p ?o }` and add all quads to `M`. + 4. Resolve the quad pattern `GRAPH f { ?s ?p ?o }` and add all quads to `M`. 5. If `M` is still an empty set of quads, dereference `f` and perform the algorithm on that response again. Issue: In step 5: should we perform the algorithm again on the quads in the response, or should we just add all quads found in the response and thus use the triples in the file as a package? ## The Generalized Member Extraction Algorithm ## {#generalized-member-extraction-algorithm} -Note: As the Shape Topologies algorithm requires more advanced processing, we discourage publishers from relying on it as it will have a negative effect on client performance. On top of that, while the TREE MEA is straightforward to implement, the generalized MEA is more tedious for developers, which results in a large support for the TREE MEA among client tooling, while the Shape Topologies algorithm may get omitted. It does however has functionalities that are not supported by the more simple TREE MEA. +Note: As the Shape Topologies algorithm requires more advanced processing, we discourage publishers from relying on it as it will have a negative effect on client performance. 
On top of that, while the TREE MEA is straightforward to implement, the generalized MEA is more tedious for developers, which results in a large support for the TREE MEA among client tooling, while the Shape Topologies algorithm may get omitted. It does however support more complex scenarios that are not supported by the TREE MEA. When on the root `tree:Node`, `tree:shapeTopology` has been set to true, the algorithm in the [Shape Topology report](https://w3id.org/tree/specification/shape-topologies) MUST be used to extract the set of quads as intended by the server using the `tree:shape`. This is an extension of the TREE Member Extraction Algorithm in which a SHACL shape can indicate: 1. Partially out of band members 2. Based on the shape, include or exclude a more specific set of quads. It does however not perform full validation of the members. -A client MAY also provide a configuration flag to still extract it using the Shape Topology algorithm in case the `tree:shapeTopology` property has not been set. +A client MAY provide the end-user with a way to override the default behavior and still perform the Shape Topology algorithm in case the `tree:shapeTopology` property has not been set. ## The TREE profile algorithm ## {#profile-based-algorithm}
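Reviewer note: steps 1 to 4 of the TREE Member Extraction Algorithm, as they read after PATCH 11/11, can be sketched roughly as follows. This is a minimal illustration and not the reference implementation. It assumes quads are plain `(subject, predicate, object, graph)` tuples of strings, that blank nodes start with `_:`, that literals start with a double quote, and that the empty string stands for the default graph. The names `extract_member`, `find_focus_nodes` and `SAMPLE_QUADS` are hypothetical, and the `tree:member` IRI is written out under the assumption that the `tree:` prefix resolves to `https://w3id.org/tree#`. Step 3 is read as recursive, triple terms are not covered, and step 5 (dereferencing `f`) is left to the caller because the Issue on that step is still open.

```python
# Assumption: the tree: prefix expands to https://w3id.org/tree#
TREE_MEMBER = "https://w3id.org/tree#member"
DEFAULT_GRAPH = ""  # assumption: the empty string stands for the default graph


def is_blank(term):
    return term.startswith("_:")


def is_literal(term):
    return term.startswith('"')


def find_focus_nodes(quads):
    """F: every object of a `?s tree:member ?f` quad."""
    return {o for (_, p, o, _) in quads if p == TREE_MEMBER}


def extract_member(quads, f, focus_nodes):
    """Collect the member quads M for focus node f (steps 1 to 4)."""
    if is_literal(f):
        raise ValueError("a literal cannot be a member focus node")

    graphs = {g for (_, _, _, g) in quads} | {DEFAULT_GRAPH}  # N
    blank_graphs = {g for g in graphs if is_blank(g)}         # Nb
    ignored = blank_graphs | set(focus_nodes)                 # I (step 1)
    searchable = graphs - ignored                             # N \ (Nb ∪ F)

    member = set()
    todo, seen = [f], set()
    while todo:
        node = todo.pop()
        if node in seen:
            continue  # guards against blank-node cycles
        seen.add(node)
        for quad in quads:
            s, _, o, g = quad
            # Step 2: star pattern around the node in the non-ignored graphs.
            if s == node and g in searchable:
                member.add(quad)
                if is_blank(o):
                    todo.append(o)  # step 3: follow blank-node objects
            # Step 4 (node is f) or step 3, second bullet (node is a blank
            # node): take every quad in the graph named after the node.
            if g == node:
                member.add(quad)
    return member  # an empty set means the caller may dereference f (step 5)


SAMPLE_QUADS = [
    ("ex:C", TREE_MEMBER, "ex:m1", DEFAULT_GRAPH),
    ("ex:C", TREE_MEMBER, "ex:m2", DEFAULT_GRAPH),
    ("ex:m1", "ex:name", '"One"', DEFAULT_GRAPH),
    ("ex:m1", "ex:address", "_:b0", DEFAULT_GRAPH),
    ("_:b0", "ex:city", '"Ghent"', DEFAULT_GRAPH),
    ("ex:x", "ex:p", "ex:y", "ex:m2"),         # packaged in the graph named ex:m2
    ("ex:m1", "ex:hidden", '"no"', "ex:m2"),   # belongs to ex:m2, not to ex:m1
]
```

With this sample, `ex:m1` is extracted through the star pattern and its blank node, while `ex:m2` is extracted entirely through its named graph. The quad about `ex:m1` inside the `ex:m2` graph stays with `ex:m2`, which is the effect of the ignore set `I` in step 1.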