Datashader: categorical colormapping of GeoDataFrames


Installed packages

  • datashader 0.13.0
  • holoviews 1.14.4
  • geoviews 1.9.1
  • bokeh 2.3.2

What I'm trying to do

I'm trying to recreate a choropleth map of a large GeoDataFrame with Datashader, with one color mapped to each category. I'm following this example on the Pipeline page as well as this and this SO post; they all differ slightly in syntax, and all use points rather than polygons as the example.

Reproducible code sample

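Below is a small sample of the full dataset.

import datashader as ds
import datashader.transfer_functions as tf
import geopandas as gpd
import pandas as pd
from spatialpandas import GeoDataFrame

d = {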
    'geometry': {
        0: 'POLYGON ((13.80961103741604 51.04076975651729, 13.80965521888065 51.04079016168103, 13.80963851766593 51.04080454197601, 13.80959433642561 51.04078412781548, 13.80961103741604 51.04076975651729))',
        1729: 'POLYGON ((13.80839606906416 51.03845025070634, 13.80827635138927 51.03836030644977, 13.80840483855695 51.03829244374037, 13.80852462026795 51.03838211873356, 13.80839606906416 51.03845025070634))',
        2646: 'POLYGON ((13.80894179055831 51.04544128170094, 13.80952887156242 51.0450399782091, 13.80954152432486 51.04504668985658, 13.80896834397535 51.04545611172818, 13.80894179055831 51.04544128170094))'
    },
    'category': {
        0: 'Within_500m',
        1729: 'Outside_500m',
        2646: 'River/stream'
    }
}

# Build a GeoPandas GeoDataFrame, parsing the WKT strings into geometries
df = pd.DataFrame(d)
gdf = gpd.GeoDataFrame(df, geometry=gpd.GeoSeries.from_wkt(df['geometry']), crs="EPSG:4326")

# Categorical aggregators need a categorical dtype
gdf['category'] = gdf['category'].astype('category')

# Convert to a spatialpandas GeoDataFrame for Canvas.polygons
spatialpdGDF = GeoDataFrame(gdf)

# One color per category
color_key = {'Within_500m': 'red', 'Outside_500m': 'lightgrey', 'River/stream': 'lightblue'}
canvas = ds.Canvas(plot_width=1000, plot_height=1000)
agg = canvas.polygons(spatialpdGDF, 'geometry', agg=ds.count_cat('category'))
tf.shade(agg, color_key=color_key)

Expected behaviour

I would expect all polys to be rasterized and displayed in a single color for each of the categories.

Observed behaviour

The full dataset results in an almost white image; only some outlines are very faintly visible.

If I change the background color, some of the polys stand out more, though even the title is only faintly visible.

tf.Images(tf.set_background(tf.shade(agg, color_key=color_key, name="Custom color key"), "black"))

Does this have to do with Datashader calculating, as the Pipeline notebook mentions, "the transparency and color of each pixel according to each category’s contribution to that pixel"? But since each category is the sole contributor to each pixel (i.e. there is no spatial overlap with other categories in this case), why does the alpha seem to be set so low that one cannot see anything? I also tried the agg=ds.by('category') aggregator with the same result.

Incidentally, if I delete the 'category' column (which otherwise causes an "input must be numeric" error) and use GeoViews in combination with HoloViews' rasterize, I can visualise the polys in a single color without problems. However, I haven't figured out how to use this approach to plot multiple datashaded GDFs with different color mappings on the same Bokeh or Matplotlib plot (the usual HoloViews "overlay multiplication" does not work in that case).

import geoviews as gv
from holoviews.operation.datashader import rasterize

gv.extension('bokeh')

del gdf['category']

rasterize(gv.Polygons(gdf)).opts(cmap=['red'])

Solution

  • Try agg=ds.by('category', ds.any()), which records only whether a pixel is covered by a category and ignores how many polygons overlap there. ds.count_cat('category') is now an alias for ds.by('category', ds.count()), but as of Datashader 0.12.1 you are no longer limited to count, and can, for example, use any to discard information about overlaps.
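
To make the difference between the two reductions concrete, here is a toy sketch in plain Python. It is not Datashader code and does not reproduce Datashader's rasterizer or shading: the function rasterize_rects, the pixel-aligned rectangles and the 6x4 grid are all made up for illustration. It only shows what a per-category "count" keeps (how many shapes cover each pixel) and what a per-category "any" keeps (whether a pixel is covered at all).

```python
# Toy illustration only: a plain-Python stand-in for per-category reductions.
# rasterize_rects is a hypothetical helper, not part of Datashader.

def rasterize_rects(rects, width, height, reduction):
    """Aggregate axis-aligned rectangles onto a grid, one layer per category.

    rects: iterable of (category, x0, y0, x1, y1) in pixel units,
           with x1 and y1 exclusive.
    reduction: "count" stores how many shapes cover a pixel,
               "any" stores 1 if at least one shape covers it.
    """
    if reduction not in ("count", "any"):
        raise ValueError("reduction must be 'count' or 'any'")
    layers = {}
    for category, x0, y0, x1, y1 in rects:
        layer = layers.setdefault(category, [[0] * width for _ in range(height)])
        # Clip the rectangle to the grid before filling it in
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                if reduction == "count":
                    layer[y][x] += 1
                else:
                    layer[y][x] = 1
    return layers


def layer_max(layer):
    """Largest value stored in one category layer."""
    return max(max(row) for row in layer)


# Two overlapping shapes in one category, one shape in another
rects = [
    ("Within_500m", 0, 0, 3, 3),
    ("Within_500m", 1, 1, 4, 4),   # overlaps the first rectangle
    ("Outside_500m", 4, 0, 6, 2),
]

counts = rasterize_rects(rects, 6, 4, "count")
covered = rasterize_rects(rects, 6, 4, "any")

print(layer_max(counts["Within_500m"]))   # 2: overlap is recorded
print(layer_max(covered["Within_500m"]))  # 1: overlap is discarded
```

With "count", pixels where shapes of the same category overlap get larger values than pixels covered once; with "any", every covered pixel carries the same value. Only that overlap information is what the suggestion above discards.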