I'm trying to link some local data with DBpedia to extract information about countries' economic stats. How can I handle alternate property paths of differing lengths? The language field is wrapped in OPTIONAL
so that the query doesn't drop a country that happens to have no language listed, but I am getting blank language
columns for resources that do have languages listed.
For instance, http://dbpedia.org/page/Netherlands, http://dbpedia.org/page/Ireland, and http://dbpedia.org/page/Italy record the languages spoken very differently, ranging from a plain string to different predicates that reference a resource:
Netherlands:
Ireland:
Italy:
Here's a (stripped-down) example query that kind of works, but is not great:
SELECT DISTINCT
  ?countryName
  ?dbEntry
  (GROUP_CONCAT(DISTINCT ?dbLanguage; separator=", ") AS ?languages)
WHERE
{
  ?dbEntry a dbo:Place ;
    rdfs:label | dbo:longName ?countryName .
  # For some reason, stacking two OPTIONALs and BINDing is all that seems to work here, and still not 100%
  OPTIONAL {
    ?dbEntry dbo:language / foaf:name ?dbofLanguage .
    BIND(?dbofLanguage AS ?dbLanguage) .
  }
  OPTIONAL {
    ?dbEntry dbp:languages ?dbpLanguage .
    BIND(?dbpLanguage AS ?dbLanguage) .
  }
  FILTER (STR(?countryName) IN ("Netherlands", "Italy", "Ireland")) .
}
GROUP BY ?countryName ?dbEntry
LIMIT 3
You'll see the results come back formatted entirely differently:
I'd like to write something like
OPTIONAL {
  ?dbEntry (dbo:language / foaf:name) | (dbp:languages / rdfs:label) | dbp:languages ?language
}
but I'm thinking SPARQL doesn't support anything that complex yet? (I get zero results)
Edited to correct the query, having realized your issue:
SELECT DISTINCT ?countryName
                ?dbEntry
                ( GROUP_CONCAT ( DISTINCT ?language ; separator=", " ) AS ?languages )
WHERE
  {
    ?dbEntry a dbo:Place ;
             rdfs:label | dbo:longName ?countryName .
    OPTIONAL
      {
        ?dbEntry ( dbo:language / foaf:name ) | ( dbp:languages / rdfs:label ) | ( dbp:languages ) ?language
        FILTER isLiteral ( ?language )
      }
    FILTER ( STR ( ?countryName ) IN ( "Netherlands" , "Italy" , "Ireland" ) ) .
  }
GROUP BY ?countryName ?dbEntry
Note: these properties (and thus your query) will change drastically in the next version of DBpedia. Check out the current DBpedia Live page on Ireland, for example.
This appears to do what you want, with just a little more property path syntax (the ? operator on rdfs:label following dbp:languages):
SELECT DISTINCT ?countryName
                ?dbEntry
                ( GROUP_CONCAT ( DISTINCT ?language ; separator=", " ) AS ?languages )
WHERE
  {
    ?dbEntry a dbo:Place ;
             rdfs:label | dbo:longName ?countryName .
    OPTIONAL
      {
        ?dbEntry ( dbo:language / foaf:name ) | ( dbp:languages / rdfs:label? ) ?language
      }
    FILTER ( STR ( ?countryName ) IN ( "Netherlands" , "Italy" , "Ireland" ) ) .
  }
GROUP BY ?countryName ?dbEntry
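For reference, rdfs:label? is a zero-or-one path step, so dbp:languages / rdfs:label? matches both the dbp:languages value itself and, where that value is a resource, its rdfs:label. Below is a minimal sketch of the same OPTIONAL written out as explicit UNION branches, which may be easier to debug one branch at a time. It assumes the endpoint predefines the dbo:, dbp:, rdfs:, and foaf: prefixes, as the queries above do; the isLiteral filter is an optional addition, carried over from the earlier query, that drops resource IRIs.

```sparql
SELECT DISTINCT ?countryName
                ?dbEntry
                ( GROUP_CONCAT ( DISTINCT ?language ; separator=", " ) AS ?languages )
WHERE
  {
    ?dbEntry a dbo:Place ;
             rdfs:label | dbo:longName ?countryName .
    OPTIONAL
      {
        # Every branch binds the same variable, so all matches are collected.
        { ?dbEntry dbo:language / foaf:name ?language }    # resource, then its name
        UNION
        { ?dbEntry dbp:languages ?language }               # zero rdfs:label steps
        UNION
        { ?dbEntry dbp:languages / rdfs:label ?language }  # one rdfs:label step
        # Optional: keep only literal values, dropping resource IRIs.
        FILTER isLiteral ( ?language )
      }
    FILTER ( STR ( ?countryName ) IN ( "Netherlands" , "Italy" , "Ireland" ) ) .
  }
GROUP BY ?countryName ?dbEntry
```

This also differs from stacking two OPTIONALs that BIND the same variable: there, once the first OPTIONAL has bound the variable, the second can only contribute a row whose value is identical, whereas the UNION branches all feed ?language.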