This is a follow-up to another question I posted about using SPARQL for RDF to write a query that filters out all vertices that have any edges to other vertices that are not in a list of specified values, for which I received a working answer.
Here is a visual representation of the graph I'm working with, which contains nodes of two separate RDF types (:package and :platform). In this graph, packages (:Package_A, :Package_B, :Package_C, and :Package_D) have outgoing edges to each platform that they require, and the platforms are :Platform_1 and :Platform_2.
In this follow-up, packages can now also depend on other packages, and this is represented on the graph using a :requires edge from a :package to another :package when it has a dependency on it.
In the above graph, Package_A requires both Package_B and Package_C, while both Package_B and Package_C require Package_D.
Here is the data that creates this graph:
INSERT DATA {
  :Package_A rdf:type :package .
  :Package_B rdf:type :package .
  :Package_C rdf:type :package .
  :Package_D rdf:type :package .
  :Platform_1 rdf:type :platform .
  :Platform_2 rdf:type :platform .
  :Package_A :platform :Platform_1 .
  :Package_B :platform :Platform_1 .
  :Package_C :platform :Platform_1 .
  :Package_D :platform :Platform_1 .
  :Package_D :platform :Platform_2 .
  :Package_A :requires :Package_B .
  :Package_A :requires :Package_C .
  :Package_B :requires :Package_D .
  :Package_C :requires :Package_D .
}
I'm able to query this graph to filter out all :package vertices that have any edges to other vertices that are not in a list of specified values.
For example, in the case of this specified singleton list: [:Platform_1], the following query filters out Package_D since it has :platform edges to both Platform_1 and Platform_2 (Platform_2 is not in the specified list). Package_A, Package_B, and Package_C are returned since their :platform edges only lead to Platform_1.
SELECT * {
  ?package a :package .
  FILTER NOT EXISTS {
    ?package :platform ?platform .
    FILTER (?platform NOT IN (:Platform_1))
  }
}
I'm now trying to expand this query so that it also filters out any vertex that :requires any other package (either directly or transitively) that meets the same criteria and has an outgoing edge to a value that is not in the specified list.
This means that in the case of the list [:Platform_1], all packages should be filtered out: Package_D has a :platform edge to Platform_2, which is not in the specified list, and every other package has at least a transitive :requires on Package_D.
In the case of the list [:Platform_1, :Platform_2], all package vertices should be returned since all package vertices only have :platform edges connecting to Platform_1 and Platform_2.
I have attempted to use the * property path operator to get all transitive :requires paths, and to apply the same filter as above to it. However, this does not work when specifying the list [:Platform_1], as the results of this query still contain Package_A, Package_B, and Package_C:
SELECT * {
  ?package a :package .
  FILTER NOT EXISTS {
    ?package :requires* ?requires .
    ?package :platform ?platform .
    FILTER (?platform NOT IN (:Platform_1))
  }
}
Does anyone have any ideas on how I could construct a query that filters out all vertices that have any edges to other vertices that are not in a list of specified values AND also filters out any vertex that has a :requires edge (either direct or transitive) that leads to another vertex that meets the same criteria?
A very thought-provoking question. In your query, you want to filter out any packages that depend transitively on other packages that have a platform 'outside' a list of approved platforms.
The following query will work:
SELECT * {
  ?package a :package .
  FILTER NOT EXISTS {
    ?package :requires* ?requirement .
    ?requirement :platform ?platform .
    FILTER (?platform NOT IN (:Platform_1))
  }
}
The difference from your query is that it is the requirement that has the platform, as opposed to the package. Because * also matches a path of length zero, ?requirement can be the package itself, so the pattern covers the case where there are no requirements (i.e. Package_D) as well as the platform check on the package itself.
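To see the zero-length match for yourself, a small diagnostic query along these lines may help. It is only a sketch: the PREFIX declaration is an assumption (the question does not show which namespace the empty prefix is bound to), so substitute your own.

```sparql
# Assumption: the empty prefix is bound to this placeholder namespace.
PREFIX : <http://example.org/>

# List every package together with everything it reaches via zero or more
# :requires edges. The zero-length path pairs each package with itself.
SELECT ?package ?requirement {
  ?package a :package .
  ?package :requires* ?requirement .
}
ORDER BY ?package ?requirement
```

Against the data in the question, each package should appear paired with itself, and Package_A should additionally be paired with Package_B, Package_C, and Package_D. Package_D should appear only once for Package_A, even though it is reachable through both Package_B and Package_C, because * does not count duplicate paths.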
For brevity, you may express the above query like this when you are not interested in the list of requirements:
SELECT * {
  ?package a :package .
  FILTER NOT EXISTS {
    ?package :requires*/:platform ?platform . # Notice the /
    FILTER (?platform NOT IN (:Platform_1))
  }
}
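For the second case in the question, the list [:Platform_1, :Platform_2], the same pattern should carry over by extending the NOT IN list. The following is a sketch under the same assumption as before: the PREFIX line is a placeholder for whatever namespace your data actually uses.

```sparql
# Assumption: the empty prefix is bound to this placeholder namespace.
PREFIX : <http://example.org/>

# Keep only packages for which no platform reachable through the package
# itself or any of its direct or transitive requirements falls outside the list.
SELECT ?package {
  ?package a :package .
  FILTER NOT EXISTS {
    ?package :requires*/:platform ?platform .
    FILTER (?platform NOT IN (:Platform_1, :Platform_2))
  }
}
```

With the data from the question, no package reaches a platform outside this list, so all four packages should be returned.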