By searching on Google and SO, I came up with the following SPARQL query to find the world's largest cities for rudimentary geocoding:
SELECT ?city ?cityLabel ?countryLabel ?iso ?population ?gps
WHERE {
?city wdt:P31 wd:Q515 . hint:Prior hint:runFirst true .
?city wdt:P17 ?country .
?country wdt:P297 ?iso .
?city wdt:P625 ?gps .
?city wdt:P1082 ?population .
FILTER (?population > 100000) .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?population)
LIMIT 5000
For some reason, the result set does not include Paris (France) but includes smaller cities in France. What am I doing wrong?
Thank you!
?city wdt:P31 wd:Q515 .
This triple pattern excludes Paris (Q90), because it’s not an instance of city (Q515).
It’s an instance of subclasses of city (Q515), though. For example: capital city (Q5119).
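If you want to check this yourself, a small query along these lines should list the classes Paris is a direct instance of. This is only a sketch for the Wikidata Query Service, where the `wd:`, `wdt:`, `wikibase:` and `bd:` prefixes are predefined; the classes returned depend on the current state of the data, so they may differ from what I describe here.

```sparql
# List the classes that Paris (Q90) is a direct instance of (P31)
SELECT ?class ?classLabel
WHERE {
  wd:Q90 wdt:P31 ?class .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

If city (Q515) is not among the results, the original triple pattern cannot match Paris.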
To find all items that are instances of city (Q515) or of a subclass of city (Q515), you can use a property path:
wdt:P31/wdt:P279*
/: SequencePath
*: ZeroOrMorePath

As a city can be an instance of multiple city subclasses, you might want to make the results distinct, otherwise these cities will appear multiple times in the result:
SELECT DISTINCT ?city ?cityLabel #etc.
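Putting both changes together, the full query might look roughly like the following. Treat it as a sketch rather than a tested solution. I have left out the `hint:Prior hint:runFirst true` hint from the question because I am not certain how it interacts with a property path. Traversing the subclass tree is also more expensive than the plain `wdt:P31` match, so the query may run slower or even time out on the public endpoint.

```sparql
SELECT DISTINCT ?city ?cityLabel ?countryLabel ?iso ?population ?gps
WHERE {
  # Instance of city (Q515) or of any direct or indirect subclass of it
  ?city wdt:P31/wdt:P279* wd:Q515 .
  ?city wdt:P17 ?country .
  ?country wdt:P297 ?iso .
  ?city wdt:P625 ?gps .
  ?city wdt:P1082 ?population .
  FILTER (?population > 100000)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?population)
LIMIT 5000
```

Note that `DISTINCT` only collapses rows that are identical in every selected column, so a city with several population or coordinate statements can still appear more than once.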