
Plotnine bar plot order by variable


I have a question on ordering bar plots. For example:

http://pythonplot.com/#bar-counts

(ggplot(mpg) +
aes(x='manufacturer') +
geom_bar(size=20) +
coord_flip() +
ggtitle('Number of Cars by Make')
)

How can I order the bars by "mpg"?


Solution

  • Thanks to has2k1: https://github.com/has2k1/plotnine/issues/94

    If the x mapping is an ordered categorical, it is respected.

    import pandas as pd  # needed for pd.Categorical below
    from plydata import *
    from plotnine import *
    from plotnine.data import mpg
    
    # Count the rows per manufacturer and sort by the count (see the plydata
    # documentation, or do the same thing with plain pandas)
    m_categories = (
        mpg
        >> count('manufacturer', sort=True)
        >> pull('manufacturer')
    )
    
    df = mpg.copy()
    df['manufacturer'] = pd.Categorical(df['manufacturer'], categories=m_categories, ordered=True)
    
    (ggplot(df) +
     aes(x='manufacturer') +
     geom_bar(size=20) +
     coord_flip() +
     ggtitle('Number of Cars by Make')
    )
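
    The comment above mentions doing the same thing with plain pandas. A minimal sketch of that, assuming `value_counts()` (which sorts by count in descending order by default) is an acceptable stand-in for the plydata `count`/`pull` pipeline. The helper names `order_by_count` and `make_plot` are made up for this illustration and are not part of plotnine or plydata:

```python
import pandas as pd


def order_by_count(df, column):
    """Return a copy of df where `column` is an ordered categorical,
    with categories sorted from most to least frequent."""
    # value_counts() sorts descending by count, so its index gives the order.
    categories = df[column].value_counts().index.tolist()
    out = df.copy()
    out[column] = pd.Categorical(out[column], categories=categories, ordered=True)
    return out


def make_plot():
    """Build the bar plot from the question with manufacturers ordered by count."""
    # plotnine is imported here so order_by_count() stays usable without it.
    from plotnine import ggplot, aes, geom_bar, coord_flip, ggtitle
    from plotnine.data import mpg

    df = order_by_count(mpg, 'manufacturer')
    return (
        ggplot(df)
        + aes(x='manufacturer')
        + geom_bar()
        + coord_flip()
        + ggtitle('Number of Cars by Make')
    )
```

    Calling `make_plot()` returns the plot object. To reverse the bar order, reverse the category list (for example `categories[::-1]`) before building the categorical.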