Tags: python, matplotlib, python-ggplot

Why won't geom_bar() change fill color like it's supposed to in python ggplot?


I'm trying to follow the demo here: http://blog.yhat.com/posts/aggregating-and-plotting-time-series-in-python.html but I can't reproduce the second figure: my bars don't come out filled by variable.

Mine looks like this: [image: badmeat]

I'm using Win 8 with Python 2.7, the latest ggplot master from github (0.6.6 I think, but pip is telling me it's 0.6.5), pandas 0.16.2, numpy 1.8.1, and matplotlib 1.4.3. I think I've correctly reproduced the code from the demo:

import numpy as np
import pandas as pd
import matplotlib.pylab as plt
from ggplot import *

def floor_decade(date_value):
    "Takes a date. Returns the decade."
    return (date_value.year // 10) * 10

meat2 = meat.dropna(thresh=800, axis=1) # drop columns that have fewer than 800 observations
ts = meat2.set_index(['date'])

by_decade = ts.groupby(floor_decade).sum()

by_decade.index.name = 'year'

by_decade = by_decade.reset_index()

p1 = ggplot(by_decade, aes('year', weight='beef')) + \
    geom_bar() + \
    scale_y_continuous(labels='comma') + \
    ggtitle('Head of Cattle Slaughtered by Decade')

p1.draw()
plt.show()

by_decade_long = pd.melt(by_decade, id_vars="year")

p2 = ggplot(aes(x='year', weight='value', colour='variable'), data=by_decade_long) + \
    geom_bar() + \
    ggtitle("Meat Production by Decade")

p2.draw()
plt.show()

Solution

  • You are close. Map the variable to the fill aesthetic in aes() instead of colour. fill sets the color of the bar interiors, whereas colour only sets the color of the bar outlines. You can still control the outlines by passing colour as a geom_bar parameter. The following shows both:

    p2 = ggplot(aes(x='year', weight='value', fill='variable'), data=by_decade_long) + \
        geom_bar(colour='black') + \
        ggtitle("Meat Production by Decade")

    [image: Bar Chart Result]

    Source: I just went through this same struggle learning ggplot for python.
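
The fill versus colour distinction can also be seen outside ggplot. The following is a minimal sketch in plain matplotlib, not the ggplot API: it assumes that `fill` corresponds roughly to a bar's face color and `colour` to its edge color. The data values and the names `RED`, `BLACK`, `GREY`, `edge_bars`, and `fill_bars` are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical totals per decade, NOT the real meat data
decades = [1990, 2000, 2010]
values = [3.0, 5.0, 4.0]

# RGBA tuples used for illustration
RED = (1.0, 0.0, 0.0, 1.0)
BLACK = (0.0, 0.0, 0.0, 1.0)
GREY = (0.5, 0.5, 0.5, 1.0)

fig, (ax_edge, ax_fill) = plt.subplots(1, 2, figsize=(8, 3))

# Roughly what mapping `colour` does: only the outline changes
edge_bars = ax_edge.bar(decades, values, width=8, color=GREY, edgecolor=RED)
ax_edge.set_title("outline colored (like colour)")

# Roughly what mapping `fill` does: the interior changes,
# and a black outline is added as in geom_bar(colour='black')
fill_bars = ax_fill.bar(decades, values, width=8, color=RED, edgecolor=BLACK)
ax_fill.set_title("interior colored (like fill)")
```

Calling `fig.savefig(...)` or `plt.show()` on the result should show grey bars with red outlines on the left and red bars with black outlines on the right, which matches the difference between the question's plot and the one in the demo.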