Search code examples
google-bigquerygisopenstreetmap

how to extract all countries geometry from Openstreet map dataset in BigQuery


I am using this query to extract the geometry of all countries using OSM, it works ok, but I am sure, it is creating a duplicated as I am using flag as a reference, some places have a flag, but they are not really countries

SELECT feature_type, osm_id, osm_timestamp, geometry,ar.key,ar.value,
  FROM `bigquery-public-data.geo_openstreetmap.planet_features`,UNNEST(all_tags) ar
   where ('boundary', 'administrative') IN (SELECT (key, value) FROM UNNEST(all_tags))
   and(feature_type="polygon" or feature_type= "multipolygon")
   AND ('flag') IN (SELECT (key) FROM UNNEST(all_tags)) and ar.key="name" order by st_area(geometry) desc

sorry forget to say, some tags are missing, for example if you select admin_level = 2, countries like the USA get dropped


Solution

  • We can a list of all countries and their geometries by merging 2 tables:

    SELECT features.feature_type, features.osm_id
      , ARRAY(
         SELECT DISTINCT AS STRUCT * FROM UNNEST(features.all_tags||relations.all_tags)
         WHERE key IN('int_name', 'name')
         ORDER BY 1 LIMIT 1 
      ) features
     , ROUND(ST_AREA(geometry)/1e6,1) area
    FROM
      `bigquery-public-data.geo_openstreetmap.planet_features` AS features,
      `bigquery-public-data.geo_openstreetmap.planet_relations` AS relations
    WHERE ('boundary','administrative') IN (SELECT (key,value) FROM UNNEST(features.all_tags))
    AND ('admin_level','2') IN (SELECT (key,value) FROM UNNEST(relations.all_tags))
    AND feature_type = 'multipolygon'
    AND relations.id=SAFE_CAST(features.osm_id AS INT64)
    ORDER BY area
    

    enter image description here

    (ref)