I would like to create a SPARQL query that contains two counts.
The query should get the 'neighbours of neighbours' of A (A → B → C, where A is the start node), and should report, for each C, how many paths there are from A to C and how many "inlinks" there are to C from anywhere. The result set should be as follows:
C | #C | C_INLINKS
--------------------------
A | 2 | 123
B | 3 | 234
Where #C is the number of paths to C from starting node A.
I can create the counts separately, but I don't know how to combine these:
Count neighbours of neighbours:
select ?c (count(?c) as ?countc) WHERE {
<http://dbpedia.org/resource/AFC_Ajax> ?p1 ?b.
?b ?p2 ?c.
FILTER (regex(str(?c), '^http://dbpedia.org/resource/'))
}
GROUP BY ?c
ORDER BY DESC(?countc)
LIMIT 100
Count inlinks to neighbours of neighbours:
select ?c (count(?inlink) as ?inlinks) WHERE {
<http://dbpedia.org/resource/AFC_Ajax> ?p1 ?b.
?b ?p2 ?c.
?inlink ?p3 ?c
FILTER (regex(str(?c), '^http://dbpedia.org/resource/'))
}
GROUP BY ?c
ORDER BY DESC(?inlinks)
LIMIT 100
Is it possible to combine these two queries? Thank you!
The counts you're trying to extract require you to group by different things. group by lets you specify what you're counting with respect to. E.g., when you say select (count(?x) as ?xn) {...} group by ?y, you're asking "how many ?x's appear for each value of ?y?" The counts you're looking for are "how many C's per A" and "how many inlinks per C". That means that in one case you'd need to group by ?a and in the other you'd need to group by ?c. However, since you've got a fixed ?a here, this is a little bit easier: you can group by ?c for both counts.

Because the ?inlink pattern is joined with the path patterns, each path to ?c shows up once for every inlink of ?c (and each inlink once for every path), so both counts need distinct. Counting the distinct paths (?p1, ?b, ?p2) is a little bit tricky, since when you do count(distinct …), you can only have a single expression for …. However, you can be sneaky and count distinct concat(str(?p1),str(?b),str(?p2)), which is a single expression and should be unique for each ?p1 ?b ?p2 combination. Then I think you'd be looking for a query like this:
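# Predefined on the DBpedia endpoint; declared here so the query is self-contained.
prefix dbpedia: <http://dbpedia.org/resource/>

select ?c
       (count(distinct concat(str(?p1),str(?b),str(?p2))) as ?n_paths)
       (count(distinct ?inlink) as ?n_inlink)
where {
  dbpedia:AFC_Ajax ?p1 ?b . ?b ?p2 ?c .   # two-step paths A → B → C
  ?inlink ?p ?c                           # anything that links to ?c
  filter strstarts(str(?c),str(dbpedia:))
}
group by ?c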
c                                                        n_paths  n_inlink
----------------------------------------------------------------------------
http://dbpedia.org/resource/AFC_Ajax                     32       540
http://dbpedia.org/resource/Category:AFC_Ajax_players    17       484
http://dbpedia.org/resource/Category:Living_people       17       659447
http://dbpedia.org/resource/Category:Eredivisie_players  13       2232
http://dbpedia.org/resource/Category:Dutch_footballers   12       2141
http://dbpedia.org/resource/Category:1994_births         6        3605
…
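To see why both counts need distinct, here is a small Python sketch that imitates the joined graph pattern on a toy graph. It is only an illustration under stated assumptions: the triples, the node names (A, B1, C1, ...) and the helper functions solutions and count_per_c are made up for this example and have nothing to do with DBpedia or any SPARQL engine.

```python
from collections import defaultdict

# Hypothetical toy graph as (subject, predicate, object) triples; not DBpedia data.
TRIPLES = {
    ("A", "p", "B1"),
    ("A", "q", "B1"),
    ("A", "p", "B2"),
    ("B1", "r", "C1"),
    ("B2", "r", "C1"),
    ("B2", "r", "C2"),
    ("X", "s", "C1"),
    ("Y", "s", "C2"),
}


def solutions(start):
    """Yield one (p1, b, p2, c, inlink) row per match of the joined pattern:
    start ?p1 ?b . ?b ?p2 ?c . ?inlink ?p ?c"""
    for s1, p1, b in TRIPLES:
        if s1 != start:
            continue
        for s2, p2, c in TRIPLES:
            if s2 != b:
                continue
            for inlink, _p, o in TRIPLES:
                if o == c:
                    yield p1, b, p2, c, inlink


def count_per_c(start):
    """Group the rows by c and return {c: (n_rows, n_paths, n_inlink)}."""
    rows = defaultdict(int)      # plain count(*): inflated by the join
    paths = defaultdict(set)     # distinct (p1, b, p2) combinations
    inlinks = defaultdict(set)   # distinct inlink subjects
    for p1, b, p2, c, inlink in solutions(start):
        rows[c] += 1
        paths[c].add((p1, b, p2))
        inlinks[c].add(inlink)
    return {c: (rows[c], len(paths[c]), len(inlinks[c])) for c in rows}


if __name__ == "__main__":
    for c, counts in sorted(count_per_c("A").items()):
        print(c, counts)
```

In this toy graph C1 has 3 paths from A and 3 inlinks, but the join produces 3 × 3 = 9 rows for it, so a plain count would report 9 for both columns. The sketch keeps paths as tuples, which sidesteps a weakness of the concat trick: concatenated strings without a separator could in principle collide for different ?p1 ?b ?p2 combinations.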