I want to retrieve blank nodes with a SPARQL query, using DBpedia as my dataset. For example, the following query returns a count of about 3.4 million results.
PREFIX prop: <http://dbpedia.org/property/>
select (count(?x) as ?count) where {
  ?x prop:name ?y
}
When I use the DISTINCT solution modifier, I get approximately 2.2 million results.
PREFIX prop: <http://dbpedia.org/property/>
select (count(DISTINCT ?x) as ?count) where {
  ?x prop:name ?y
}
I have two questions:
A query like this could be used to retrieve (up to 10) blank nodes:
select ?bnode where {
  ?bnode ?p ?o
  filter(isBlank(?bnode))
}
limit 10
However, I get no results. It doesn't look like there are blank nodes (as subjects, anyhow) in the DBpedia data.
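The query above only looks at the subject position. If you want to check the object position as well, a minimal sketch along the same lines follows; the variable names are just illustrative, and I'm not claiming anything about what it returns on DBpedia. Because the triple pattern is completely unconstrained, a public endpoint may well time out on it.

```sparql
# Look for blank nodes used as objects rather than subjects.
select ?s ?p ?bnode where {
  ?s ?p ?bnode
  filter(isBlank(?bnode))
}
limit 10
```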
The reason that your queries return a different number of results is that some values of ?x have more than one name. A query like your first one:
select (count(?x) as ?count) where { ?x prop:name ?y }
on data like:
<somePerson> prop:name "Jim" .
<somePerson> prop:name "James" .
would produce 2, since there are two ways to match ?x prop:name ?y. ?x is bound to <somePerson> in both of them, but ?y is bound to different names. In a query like your second one:
select (count(DISTINCT ?x) as ?count) where { ?x prop:name ?y }
you're explicitly only counting the distinct values of ?x, and there's only one of those in my sample data. This is one way that you can end up with different numbers of results, and it doesn't require any blank nodes.
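If you want to see which subjects account for the difference, one way to sketch it is to group by ?x and keep only the groups with more than one name. The alias ?names is my own choice, and the LIMIT is there only to keep the output small; on a dataset the size of DBpedia the aggregation may be slow or hit the endpoint's timeout.

```sparql
PREFIX prop: <http://dbpedia.org/property/>

# List subjects that have more than one prop:name value,
# together with how many values each one has.
select ?x (count(?y) as ?names) where {
  ?x prop:name ?y
}
group by ?x
having (count(?y) > 1)
order by desc(?names)
limit 10
```

On the two-triple sample data above, this would return a single row binding ?x to <somePerson> and ?names to 2.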