Tags: python, pandas, dataframe, bar-chart, bokeh

Create nested Bar graph in Bokeh from a DataFrame


I have an existing DataFrame that is grouped by job title and by year. I want to create a nested bar chart in Bokeh from it, but I am not sure what to pass to Bokeh to plot it properly.

The dataframe:

                        size
fromJobtitle      year
CEO               2000   236
                  2001   479
                  2002     4
Director          2000    42
                  2001   609
                  2002   188
Employee          1998    23
                  1999   365
                  2000  2393
                  2001  5806
                  2002   817
In House Lawyer   2000     5
                  2001    54
Manager           1999     8
                  2000   979
                  2001  2173
                  2002   141
Managing Director 1998     2
                  1999    14
                  2000   130
                  2001   199
                  2002    11
President         1999    31
                  2000   202
                  2001   558
                  2002   198
Trader            1999     5
                  2000   336
                  2001   494
                  2002    61
Unknown           1999   591
                  2000  2960
                  2001  3959
                  2002   673
Vice President    1999    49
                  2000  2040
                  2001  3836
                  2002   370

An example of the desired output: [image: example graph]


Solution

  • I assume you have a DataFrame df with three columns: fromJobtitle, year, and size. If you have a MultiIndex, reset the index first. To use FactorRange from Bokeh, we need a list of tuples of two strings (this is important: numeric values such as ints or floats won't work), like

    [('CEO', '2000'), ('CEO', '2001'), ('CEO', '2002'), ...] 
    

    and so on.

    This can be done with

    df['x'] = list(zip(df['fromJobtitle'], df['year'].astype(str)))
    

    That is the hard part. Bokeh does the rest for you.

    from bokeh.plotting import show, figure, output_notebook
    from bokeh.models import FactorRange
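    output_notebook()  # render inline in a Jupyter notebook
    
    # The nested x-axis is built from the (job title, year) tuples.
    p = figure(
        x_range=FactorRange(*list(df["x"])),
        width=1400
    )
    # One bar per tuple, with its height taken from the size column.
    p.vbar(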
        x="x",
        top="size",
        width=0.9,
        source=df,
    )
    
    show(p)
    

    This is the generated figure: [image: bar plot]
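
The data-preparation step described above can be sketched end to end as follows. This is only a minimal illustration: it assumes the grouped frame has a two-level MultiIndex named fromJobtitle and year, as in the question, and the names grouped and factors as well as the five-row sample are made up for the example.

```python
import pandas as pd

# A small subset of the question's data, rebuilt by hand as a MultiIndex
# frame (assumption: the real frame comes from a groupby with the same
# index level names).
index = pd.MultiIndex.from_tuples(
    [("CEO", 2000), ("CEO", 2001), ("CEO", 2002),
     ("Director", 2000), ("Director", 2001)],
    names=["fromJobtitle", "year"],
)
grouped = pd.DataFrame({"size": [236, 479, 4, 42, 609]}, index=index)

# Move the two index levels into ordinary columns.
df = grouped.reset_index()

# Build the (job title, year) tuples; the year is cast to str because
# both parts of each factor have to be strings.
df["x"] = list(zip(df["fromJobtitle"], df["year"].astype(str)))

factors = list(df["x"])
print(factors[:3])  # [('CEO', '2000'), ('CEO', '2001'), ('CEO', '2002')]
```

The resulting df can then be passed to the plotting code above unchanged, with factors being the same list that is unpacked into FactorRange.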