Search code examples
pythonpandasseabornbar-chartgrouped-bar-chart

Pandas Dataframe to Seaborn Grouped Barchart


I have the following dataframe which I have obtained from a larger dataframe which lists the worst 10 "Benchmark Returns" and their corresponding portfolio returns and dates:

enter image description here

I've managed to create a Seaborn bar plot which lists Benchmark Returns against their corresponding dates with this script:

import pandas as pd
import seaborn as sns

df = pd.read_csv('L:\\My Documents\\Desktop\\Data NEW.csv', parse_dates = True)

df = df.nsmallest(10, columns = 'Benchmark Returns')
df = df[['Date', 'Benchmark Returns', 'Portfolio Returns']]
p6 = sns.barplot(x = 'Date', y = 'Benchmark Returns', data = df)
p6.set(ylabel = 'Return (%)')
for x_ticks in p6.get_xticklabels():
    x_ticks.set_rotation(90)

And it produces this plot:

enter image description here

However, what I'd like is a grouped bar plot that contains both Benchmark Returns and Portfolio Returns, where two different colors are used to distinguish between these two categories.

I've tried several different methods but nothing seems to work.


Solution

  • The trick is to transform the pandas df from wide to long format

    Step 1: Prepare data

    import seaborn as sns
    
    np.random.seed(123)
    index = np.random.randint(1,100,10)
    
    x1 = pd.date_range('2000-01-01','2015-01-01').map(lambda t: t.strftime('%Y-%m-%d'))
    dts = np.random.choice(x1,10)
    
    benchmark = np.random.randn(10)
    portfolio = np.random.randn(10)
    
    df = pd.DataFrame({'Index': index,
                       'Dates': dts,
                       'Benchmark': benchmark,
                       'Portfolio': portfolio},
                        columns = ['Index','Dates','Benchmark','Portfolio'])
    

    Step 2: From "wide" to "long" format

    df1 = pd.melt(df, id_vars=['Index','Dates']).sort_values(['variable','value'])
    df1
    
        Index   Dates   variable    value
    9   48  2012-06-13  Benchmark   -1.410301
    1   93  2002-07-31  Benchmark   -1.301489
    8   97  2005-01-21  Benchmark   -1.100985
    0   67  2011-06-01  Benchmark   0.126526
    4   84  2003-09-25  Benchmark   0.465645
    3   18  2009-07-13  Benchmark   0.522742
    5   58  2007-12-04  Benchmark   0.724915
    7   98  2002-12-28  Benchmark   0.746581
    6   87  2009-02-07  Benchmark   1.495827
    2   99  2000-04-21  Benchmark   2.207427
    16  87  2009-02-07  Portfolio   -2.750224
    14  84  2003-09-25  Portfolio   -1.855637
    15  58  2007-12-04  Portfolio   -1.779455
    19  48  2012-06-13  Portfolio   -1.774134
    11  93  2002-07-31  Portfolio   -0.984868
    12  99  2000-04-21  Portfolio   -0.748569
    10  67  2011-06-01  Portfolio   -0.747651
    18  97  2005-01-21  Portfolio   -0.695981
    17  98  2002-12-28  Portfolio   -0.234158
    13  18  2009-07-13  Portfolio   0.240367
    

    Step 3: Plot

    sns.barplot(x='Dates', y='value', hue='variable', data=df1)
    plt.xticks(rotation=90)
    plt.ylabel('Returns')
    plt.title('Portfolio vs Benchmark Returns');
    

    enter image description here