Tags: python, geojson, topojson, vega, altair

How can I make a map using GeoJSON data in Altair?


I'm very new to mapping, and to Altair/Vega. There's an example in the Altair documentation for how to make a map starting with an outline of US states, which is created basically with:

states = alt.topo_feature(data.us_10m.url, feature='states')

# US states background
background = alt.Chart(states).mark_geoshape(
    fill='lightgray',
    stroke='white'
)

but I want to plot points in the British Isles, instead. Since there are only US and World maps in the vega data collections, I would have to create my own GeoJSON, no?

So I tried getting GeoJSON for the British Isles from a world map, by running some of the command-line commands from this blog post, namely,

ogr2ogr -f GeoJSON -where "adm0_a3 IN ('GBR','IRL','IMN','GGY','JEY','GBA')" subunits.json ne_10m_admin_0_map_subunits/ne_10m_admin_0_map_subunits.shp

This seems to have created a GeoJSON file, subunits.json, which probably represents the British Isles. But how can I get this into Altair? Or is there another way to make a map of the British Isles using Altair?


Solution

  • The example you refer to is using topojson structured data, while you have geojson structured data. So you probably need:

    # remote geojson data object
    url_geojson = 'https://raw.githubusercontent.com/mattijn/datasets/master/two_polygons.geo.json'
    data_geojson_remote = alt.Data(url=url_geojson, format=alt.DataFormat(property='features',type='json'))
    
    # chart object
    alt.Chart(data_geojson_remote).mark_geoshape(
    ).encode(
        color="properties.name:N"
    ).project(
        type='identity', reflectY=True
    )
    


    Update: GeoDataFrames (geopandas) are directly supported since Altair version 3.3.0, as are any objects that support the __geo_interface__.
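
For a local file such as the subunits.json from the question, one option is to load it with the standard json module and pass it to Altair as inline data. The following is a minimal sketch, not a definitive recipe: the helper names load_feature_collection and geojson_chart are made up for illustration, the tiny demo file only stands in for the real subunits.json, and the mercator projection is an assumption that suits longitude/latitude data.

```python
import json
import os
import tempfile


def load_feature_collection(path):
    """Read a GeoJSON file and check that it is a FeatureCollection."""
    with open(path, encoding="utf-8") as f:
        collection = json.load(f)
    if collection.get("type") != "FeatureCollection":
        raise ValueError(f"expected a FeatureCollection, got {collection.get('type')!r}")
    return collection


def geojson_chart(collection, projection="mercator"):
    """Build a geoshape chart from an in-memory FeatureCollection (hypothetical helper)."""
    import altair as alt  # imported here so the loader above works without Altair

    # Same pattern as the inline GeoJSON variant: point Altair at the nested 'features' list.
    data = alt.InlineData(
        values=collection,
        format=alt.DataFormat(property="features", type="json"),
    )
    return alt.Chart(data).mark_geoshape(
        fill="lightgray", stroke="white"
    ).project(type=projection)


# Stand-in for the real subunits.json: one small polygon written to a temp directory.
demo = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "demo"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[-6, 50], [2, 50], [2, 59], [-6, 59], [-6, 50]]],
            },
        }
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "subunits.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(demo, f)
    collection = load_feature_collection(path)

print(len(collection["features"]), "feature(s) loaded")

# With the real file, the chart would be built like this:
# chart = geojson_chart(load_feature_collection("subunits.json"))
```

Inline data is embedded in the chart specification, so a detailed 10m file makes the resulting chart correspondingly large.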


    For more insight read on!

    The following variants are discussed below:

    1. Inline GeoJSON
    2. Inline TopoJSON
    3. TopoJSON from URL
    4. GeoJSON from URL

    Along the way, the differences between geojson- and topojson-structured JSON files and their usage within Altair are explained.

    import geojson
    import topojson
    import pprint
    import altair as alt
    

    Inline GeoJSON

    We start with creating a collection containing two features, namely two adjacent polygons.

    Example of the two polygons that we will create in the GeoJSON data format:

    [Figure: FeatureCollection with two Features]

    feature_1 = geojson.Feature(
        geometry=geojson.Polygon([[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]),
        properties={"name":"abc"}
    )
    feature_2 = geojson.Feature(
        geometry=geojson.Polygon([[[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]]]),
        properties={"name":"def"}
    )
    var_geojson = geojson.FeatureCollection([feature_1, feature_2])
    

    Inspect the created GeoJSON by pretty-printing the variable var_geojson:

    pprint.pprint(var_geojson)
    
    {'features': [{'geometry': {'coordinates': [[[0, 0],
                                                 [1, 0],
                                                 [1, 1],
                                                 [0, 1],
                                                 [0, 0]]],
                                'type': 'Polygon'},
                   'properties': {'name': 'abc'},
                   'type': 'Feature'},
                  {'geometry': {'coordinates': [[[1, 0],
                                                 [2, 0],
                                                 [2, 1],
                                                 [1, 1],
                                                 [1, 0]]],
                                'type': 'Polygon'},
                   'properties': {'name': 'def'},
                   'type': 'Feature'}],
     'type': 'FeatureCollection'}
    

    As can be seen, the two Polygon Features are nested within the features object and the geometry is part of each feature.

    Altair can parse nested JSON objects using the property key within format. For example:

    # inline geojson data object
    data_geojson = alt.InlineData(values=var_geojson, format=alt.DataFormat(property='features',type='json')) 
    
    # chart object
    alt.Chart(data_geojson).mark_geoshape(
    ).encode(
        color="properties.name:N"
    ).project(
        type='identity', reflectY=True
    )
    


    Inline TopoJSON

    TopoJSON is an extension of GeoJSON in which the geometry of each feature refers to a top-level object named arcs. This makes it possible to apply a hash function to the geometry, so that each shared arc is stored only once.

    We can convert the var_geojson variable into a topojson file format structure:

    # to_dict() returns a Python dict (to_json() returns a JSON string)
    var_topojson = topojson.Topology(var_geojson, prequantize=False).to_dict()
    var_topojson
    
    {'arcs': [[[1.0, 1.0], [0.0, 1.0], [0.0, 0.0], [1.0, 0.0]],
              [[1.0, 0.0], [2.0, 0.0], [2.0, 1.0], [1.0, 1.0]],
              [[1.0, 1.0], [1.0, 0.0]]],
     'objects': {'data': {'geometries': [{'arcs': [[-3, 0]],
                                          'properties': {'name': 'abc'},
                                          'type': 'Polygon'},
                                         {'arcs': [[1, 2]],
                                          'properties': {'name': 'def'},
                                          'type': 'Polygon'}],
                          'type': 'GeometryCollection'}},
     'type': 'Topology'}
    

    The nested geometry objects are now replaced by arcs, which refer by index to the top-level arcs object. Instead of a single FeatureCollection we can now have multiple objects; our converted FeatureCollection is stored under the key data as a GeometryCollection.
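
To make the arc indexing concrete, the sketch below rebuilds the two polygon rings from the arcs printed above. It assumes an unquantized topology (as produced with prequantize=False); quantized files delta-encode their arcs and need an extra transform step that is not shown here. The function name decode_ring is made up for illustration.

```python
def decode_ring(arc_indexes, arcs):
    """Stitch one polygon ring together from unquantized TopoJSON arcs."""
    ring = []
    for index in arc_indexes:
        if index >= 0:
            points = arcs[index]
        else:
            # A negative index i means arc ~i (that is, -i - 1) traversed in reverse.
            points = arcs[~index][::-1]
        # Consecutive arcs share an endpoint, so skip the duplicated first point.
        ring.extend(points if not ring else points[1:])
    return ring


# The top-level arcs from the topology shown above.
arcs = [
    [[1.0, 1.0], [0.0, 1.0], [0.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [2.0, 0.0], [2.0, 1.0], [1.0, 1.0]],
    [[1.0, 1.0], [1.0, 0.0]],
]

ring_abc = decode_ring([-3, 0], arcs)  # polygon 'abc'
ring_def = decode_ring([1, 2], arcs)   # polygon 'def'

print(ring_abc)
print(ring_def)
```

Both polygons use arc 2, the edge between (1, 1) and (1, 0): 'def' walks it forwards and 'abc' walks it in reverse (index -3), which is why the shared border is stored only once.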

    NOTE: the key-name data is arbitrary and differs in each dataset.

    Altair can parse the nested data object in a topojson-structured file using the feature key within format, while declaring the type as topojson. For example:

    # inline topojson data object
    data_topojson = alt.InlineData(values=var_topojson, format=alt.DataFormat(feature='data',type='topojson')) 
    
    # chart object
    alt.Chart(data_topojson).mark_geoshape(
    ).encode(
        color="properties.name:N"
    ).project(
        type='identity', reflectY=True
    )
    


    TopoJSON from URL

    There is also a shorthand to extract the objects from a topojson file if the file is accessible by URL:

    alt.topo_feature(url, feature)
    

    Altair example where a topojson file is referenced by URL:

    # remote topojson data object
    url_topojson = 'https://raw.githubusercontent.com/mattijn/datasets/master/two_polygons.topo.json'
    data_topojson_remote = alt.topo_feature(url=url_topojson, feature='data')
    
    # chart object
    alt.Chart(data_topojson_remote).mark_geoshape(
    ).encode(
        color="properties.name:N"
    ).project(
        type='identity', reflectY=True
    )
    


    GeoJSON from URL

    For geojson files accessible by URL there is no such shorthand; they should be linked as follows:

    alt.Data(url, format)
    

    Altair example where a geojson file is referenced by URL:

    # remote geojson data object
    url_geojson = 'https://raw.githubusercontent.com/mattijn/datasets/master/two_polygons.geo.json'
    data_geojson_remote = alt.Data(url=url_geojson, format=alt.DataFormat(property='features',type='json'))
    
    # chart object
    alt.Chart(data_geojson_remote).mark_geoshape(
    ).encode(
        color="properties.name:N"
    ).project(
        type='identity', reflectY=True
    )
    
