Tags: python, plot, holoviews, histogram2d, datashader

Datashader: change color for each date


For a scatterplot with datashader, I want to incorporate the notion of time into the plot, potentially by using color.

Currently,

import numpy as np
import pandas as pd
import seaborn as sns

date_values = ['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04']
result = []
for d in date_values:
    print(d)
    df = pd.DataFrame(np.random.randn(10000, 2), columns=list('AB'))
    df.columns = ['value_foo', 'value_bar']
    df['dt'] = d
    df['dt'] = pd.to_datetime(df['dt'])
    result.append(df)

df = pd.concat(result)
display(df.head())

import holoviews as hv
import holoviews.operation.datashader as hd
hv.extension("bokeh", "matplotlib") 

import datashader as ds
import datashader.transfer_functions as tf


cvs = ds.Canvas().points(df, 'value_foo', 'value_bar')
from colorcet import fire
#tf.set_background(tf.shade(cvs, cmap=fire),"black")
tf.shade(cvs)

#sns.jointplot(x="value_foo", y="value_bar", data=df, hue='dt')

Gives: [image]

However, the different dates are now indistinguishable. How can I include the date information (for example, using color) when plotting?


Solution

  • Datashader can colorize by any categorical column. Here you have only four distinct dates, which work well as categories once the column has a categorical dtype (count_cat expects one). If you have many dates, first bin them into a suitable set of date ranges (e.g. fewer than 256 total values if you use a 256-color colormap).

    Either way, once you have a categorical column c, pass agg=ds.count_cat('c') to your .points() call, and you should get a plot colorized by date.

    See the 'pickup_hour' plot in https://examples.pyviz.org/nyc_taxi/ for examples.
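
    As a rough sketch of the steps above, the code below rebuilds the sample data from the question, derives a categorical column from the dates, and aggregates with count_cat. The helper names (make_data, add_date_category, shade_by_date), the new column name dt_cat, and the 256-category threshold are illustrative assumptions, not part of datashader. Equal-width binning via pd.cut is only one possible way to reduce many dates to a manageable number of ranges.

```python
import numpy as np
import pandas as pd


def make_data(dates, n_per_date=10_000, seed=0):
    """Rebuild the question's sample frame: random points tagged with a date."""
    rng = np.random.default_rng(seed)
    frames = []
    for d in dates:
        frame = pd.DataFrame(rng.standard_normal((n_per_date, 2)),
                             columns=["value_foo", "value_bar"])
        frame["dt"] = pd.Timestamp(d)
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


def add_date_category(df, column="dt", max_categories=256):
    """Return a copy with a categorical 'dt_cat' column (hypothetical name).

    With few distinct dates, each date becomes its own category; otherwise
    the dates are binned into at most `max_categories` equal-width ranges.
    """
    out = df.copy()
    if out[column].nunique() <= max_categories:
        labels = out[column].dt.strftime("%Y-%m-%d")
    else:
        # pd.cut accepts datetime data and produces time intervals
        labels = pd.cut(out[column], bins=max_categories).astype(str)
    # count_cat needs a pandas categorical dtype, not datetime or plain strings
    out["dt_cat"] = labels.astype("category")
    return out


def shade_by_date(df, x="value_foo", y="value_bar", cat="dt_cat"):
    """Aggregate per category and shade with datashader's default color key."""
    # Imported here so the pandas-only helpers work without datashader installed
    import datashader as ds
    import datashader.transfer_functions as tf

    canvas = ds.Canvas(plot_width=600, plot_height=600)
    agg = canvas.points(df, x, y, agg=ds.count_cat(cat))
    return tf.shade(agg)


dates = ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"]
df = add_date_category(make_data(dates))
print(df["dt_cat"].cat.categories.tolist())
```

    In a notebook, ending a cell with shade_by_date(df) should display the image with one color per date. With many categories, the default color key may not have enough colors, in which case passing an explicit color_key to tf.shade is likely needed.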