
DBpedia/SPARQL: get population and lat/lng of all cities, towns and villages in the UK


I'm entering the following query at http://dbpedia.org/sparql:

PREFIX geo:  <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?s ?name ?value ?lat ?lng
WHERE { 
    ?s a <http://dbpedia.org/ontology/PopulatedPlace> .
    ?s <http://dbpedia.org/property/name> ?name .
    ?s <http://dbpedia.org/property/populationTotal> ?value .
    FILTER (?lng > -8.64 && ?lng < 2.1 && ?lat < 61.1 && ?lat > 49.35)
    ?s geo:lat ?lat .
    ?s geo:long ?lng .
}

(The bounding box is intended to cover the UK. The other option is to add ?s <http://dbpedia.org/ontology/country> <http://dbpedia.org/resource/United_Kingdom> . to the query, but some places might not have been tagged with the UK as their country.)

The problem is that it only returns around 290 places. Using population instead of populationTotal gives 1588 places, and I can't figure out which of the two properties is semantically the right one to use.

Is this a limitation with the underlying data, or is there something that could be improved in the way I'm formulating the query?

Note: this question is mainly academic now, as I got the info from http://download.geonames.org/export/dump/GB.zip. I'd still much prefer to use open data and the semantic web, so I'm posting this question to see whether there is something I was missing, or whether there is a shortcoming in how the data is being scraped from Wikipedia that I could help fix.


Solution

  • Your query only returns locations that have a value for populationTotal. For example, if Town A has a populationTotal of "10,000" and Town B has no populationTotal triple at all, only Town A will be returned.

    If you want to return all locations in the UK, wrap the population triple pattern in OPTIONAL. The query below returns every location that matches the other patterns, with ?value bound only for the locations that have population data.

    PREFIX geo:  <http://www.w3.org/2003/01/geo/wgs84_pos#>
    SELECT ?s ?name ?value ?lat ?lng
    WHERE { 
        ?s a <http://dbpedia.org/ontology/PopulatedPlace> .
        ?s <http://dbpedia.org/property/name> ?name .
        OPTIONAL { ?s <http://dbpedia.org/property/populationTotal> ?value . }
        FILTER (?lng > -8.64 && ?lng < 2.1 && ?lat < 61.1 && ?lat > 49.35)
        ?s geo:lat ?lat .
        ?s geo:long ?lng .
    }
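
If you want to run the OPTIONAL query from a script rather than the web form, the following is a minimal Python sketch, not a definitive client. It assumes the endpoint accepts the query and format URL parameters and returns the standard SPARQL JSON results format; the function names (build_url, rows_from_results, fetch_places) are made up for illustration. The point it shows is that a variable left unbound by OPTIONAL is simply missing from a result row, so the code has to check for it.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint named in the question

QUERY = """
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?s ?name ?value ?lat ?lng
WHERE {
    ?s a <http://dbpedia.org/ontology/PopulatedPlace> .
    ?s <http://dbpedia.org/property/name> ?name .
    ?s geo:lat ?lat .
    ?s geo:long ?lng .
    OPTIONAL { ?s <http://dbpedia.org/property/populationTotal> ?value . }
    FILTER (?lng > -8.64 && ?lng < 2.1 && ?lat < 61.1 && ?lat > 49.35)
}
"""


def build_url(query, endpoint=ENDPOINT):
    """Build a GET URL for the endpoint (assumes 'query' and 'format' parameters)."""
    params = urllib.parse.urlencode(
        {"query": query, "format": "application/sparql-results+json"}
    )
    return endpoint + "?" + params


def rows_from_results(results):
    """Flatten SPARQL JSON results into dicts; population is None when unbound."""
    rows = []
    for binding in results["results"]["bindings"]:
        rows.append({
            "uri": binding["s"]["value"],
            "name": binding["name"]["value"],
            # OPTIONAL variables with no match are absent from the binding.
            "population": binding["value"]["value"] if "value" in binding else None,
            "lat": float(binding["lat"]["value"]),
            "lng": float(binding["lng"]["value"]),
        })
    return rows


def fetch_places(query=QUERY):
    """Send the query over HTTP and return the flattened rows (needs network access)."""
    with urllib.request.urlopen(build_url(query), timeout=60) as response:
        return rows_from_results(json.load(response))
```

Calling fetch_places() should then give one dict per result row, with population set to None for places that have no populationTotal value. The population is deliberately left as a string here, since the raw property values are not guaranteed to be clean integers.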