Tags: python, pandas, matplotlib, seaborn, bar-chart

How to make multiple plots with seaborn from a wide dataframe


I'm currently learning about data visualization using seaborn, and I came across a problem that I couldn't find a solution to.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

I have this data:

index col1 col2 col3 col4 col5 col6 col7 col8
1990 0 4 7 3 7 0 6 6
1991 1 7 5 0 8 1 8 4
1992 0 5 0 1 9 1 7 2
1993 2 7 0 0 6 1 2 7
1994 4 1 5 5 8 1 6 3
1995 7 0 6 4 8 0 5 7
1996 5 1 1 4 6 1 7 4
1997 0 4 7 5 5 1 8 5
1998 1 3 7 0 7 0 7 1
1999 5 7 1 1 6 0 8 5
2000 3 8 5 0 3 0 6 3
2001 6 0 4 1 7 1 2 7

For each year, I want to make a barplot/histplot with the column names (col1, col2 .. col8) on one axis and that year's values on the other. For 1990, for example, the data would look like this:

col? val
col1 0
col2 4
col3 7
col4 3
col5 7
col6 0
col7 6
col8 6

and plot them for each year from 1990 to 2001.

g = sns.FacetGrid(df, col=df.index.value_counts())
g.map(sns.histplot, df.columns)

This is the code I've written so far. I looked at FacetGrid but couldn't get it working for my case. Any feedback is appreciated.


Solution

  • Imports and Test DataFrame

    • Tested in python 3.11.3, pandas 2.0.2, matplotlib 3.7.1, seaborn 0.12.2
    import pandas as pd
    import seaborn as sns
    
    # sample dataframe
    data = {1990: {'col1': 0, 'col2': 4, 'col3': 7, 'col4': 3, 'col5': 7, 'col6': 0, 'col7': 6, 'col8': 6}, 1991: {'col1': 1, 'col2': 7, 'col3': 5, 'col4': 0, 'col5': 8, 'col6': 1, 'col7': 8, 'col8': 4}, 1992: {'col1': 0, 'col2': 5, 'col3': 0, 'col4': 1, 'col5': 9, 'col6': 1, 'col7': 7, 'col8': 2}, 1993: {'col1': 2, 'col2': 7, 'col3': 0, 'col4': 0, 'col5': 6, 'col6': 1, 'col7': 2, 'col8': 7}, 1994: {'col1': 4, 'col2': 1, 'col3': 5, 'col4': 5, 'col5': 8, 'col6': 1, 'col7': 6, 'col8': 3}, 1995: {'col1': 7, 'col2': 0, 'col3': 6, 'col4': 4, 'col5': 8, 'col6': 0, 'col7': 5, 'col8': 7}, 1996: {'col1': 5, 'col2': 1, 'col3': 1, 'col4': 4, 'col5': 6, 'col6': 1, 'col7': 7, 'col8': 4}, 1997: {'col1': 0, 'col2': 4, 'col3': 7, 'col4': 5, 'col5': 5, 'col6': 1, 'col7': 8, 'col8': 5}, 1998: {'col1': 1, 'col2': 3, 'col3': 7, 'col4': 0, 'col5': 7, 'col6': 0, 'col7': 7, 'col8': 1}, 1999: {'col1': 5, 'col2': 7, 'col3': 1, 'col4': 1, 'col5': 6, 'col6': 0, 'col7': 8, 'col8': 5}, 2000: {'col1': 3, 'col2': 8, 'col3': 5, 'col4': 0, 'col5': 3, 'col6': 0, 'col7': 6, 'col8': 3}, 2001: {'col1': 6, 'col2': 0, 'col3': 4, 'col4': 1, 'col5': 7, 'col6': 1, 'col7': 2, 'col8': 7}}
    df = pd.DataFrame.from_dict(data, orient='index')
    
    # display(df.head())
          col1  col2  col3  col4  col5  col6  col7  col8
    1990     0     4     7     3     7     0     6     6
    1991     1     7     5     0     8     1     8     4
    1992     0     5     0     1     9     1     7     2
    1993     2     7     0     0     6     1     2     7
    1994     4     1     5     5     8     1     6     3
    

    Plotting with seaborn.catplot

    # convert the wide dataframe to a long format with melt
    dfm = df.melt(ignore_index=False).reset_index(names=['Year'])
    
    # display(dfm.head())
       Year variable  value
    0  1990     col1      0
    1  1991     col1      1
    2  1992     col1      0
    3  1993     col1      2
    4  1994     col1      4
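As a quick check that the long format matches the per-year table in the question, the sketch below filters the melted frame to a single year. It uses only the first three years of the sample data to stay short, and it reshapes with `rename_axis` plus `melt(id_vars=...)` instead of `reset_index(names=...)`, on the assumption that you may be on a pandas version older than 1.5, where the `names` argument is not available. The variable name `one_year` is only illustrative.

```python
import pandas as pd

# first three years of the sample data from the question
df = pd.DataFrame(
    {'col1': [0, 1, 0], 'col2': [4, 7, 5], 'col3': [7, 5, 0], 'col4': [3, 0, 1],
     'col5': [7, 8, 9], 'col6': [0, 1, 1], 'col7': [6, 8, 7], 'col8': [6, 4, 2]},
    index=[1990, 1991, 1992],
)

# name the index, move it into a column, then melt the remaining columns
dfm = df.rename_axis('Year').reset_index().melt(id_vars='Year')

# rows for a single year: one row per original column
one_year = dfm[dfm['Year'] == 1990]
print(one_year[['variable', 'value']].to_string(index=False))
```

Each year in `dfm` now has one row per original column, which is the shape seaborn's faceting functions expect.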
    
    # plot with catplot and kind='bar'
    g = sns.catplot(data=dfm, kind='bar', col='Year', col_wrap=4, x='variable', y='value', height=3)
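Since the question started from `FacetGrid`, here is a minimal sketch of roughly the same figure built with `FacetGrid` directly. The key difference from the attempt in the question is that `col` takes the name of a column in long-form data (`'Year'`), not a Series of counts. Only three years of the sample data are used here, and passing `order` explicitly is my own precaution so that every facet uses the same category order.

```python
import pandas as pd
import seaborn as sns

# first three years of the sample data from the question
df = pd.DataFrame(
    {'col1': [0, 1, 0], 'col2': [4, 7, 5], 'col3': [7, 5, 0], 'col4': [3, 0, 1],
     'col5': [7, 8, 9], 'col6': [0, 1, 1], 'col7': [6, 8, 7], 'col8': [6, 4, 2]},
    index=[1990, 1991, 1992],
)
dfm = df.rename_axis('Year').reset_index().melt(id_vars='Year')

# one facet per year; 'Year' must be a column of the long dataframe
g = sns.FacetGrid(dfm, col='Year', col_wrap=4, height=3)

# draw a bar plot in every facet, with a fixed category order
g.map_dataframe(sns.barplot, x='variable', y='value', order=list(df.columns))

# use the year alone as each facet title
g.set_titles('{col_name}')
```

`catplot` is a figure-level wrapper that sets up a `FacetGrid` for you, so the one-liner above is usually the more convenient choice.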
    


    Plotting with pandas.DataFrame.plot

    • While you have asked about seaborn, the dataframe in the OP has all the years in the index, so the easiest way to plot the data is to transpose the dataframe with .T, and then use pandas.DataFrame.plot with subplots=True.
      • Set ylim=(0, 30) if needed.
    # display(df.T.head())
          1990  1991  1992  1993  1994  1995  1996  1997  1998  1999  2000  2001
    col1     0     1     0     2     4     7     5     0     1     5     3     6
    col2     4     7     5     7     1     0     1     4     3     7     8     0
    col3     7     5     0     0     5     6     1     7     7     1     5     4
    col4     3     0     1     0     5     4     4     5     0     1     0     1
    col5     7     8     9     6     8     8     6     5     7     6     3     7
    
    # transpose and plot
    axes = df.T.plot(kind='bar', subplots=True, layout=[3, 4], figsize=(15, 7), legend=False, rot=0)
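With `subplots=True` and a `layout`, `DataFrame.plot` returns a 2D array of matplotlib Axes, so the panels can be adjusted afterwards. The sketch below uses three years of the sample data and passes `ylim` so that every panel has the same y-range; the value `(0, 10)` is an assumption chosen to fit this sample (its maximum is 9), not a value from the original answer.

```python
import pandas as pd

# first three years of the sample data from the question
df = pd.DataFrame(
    {'col1': [0, 1, 0], 'col2': [4, 7, 5], 'col3': [7, 5, 0], 'col4': [3, 0, 1],
     'col5': [7, 8, 9], 'col6': [0, 1, 1], 'col7': [6, 8, 7], 'col8': [6, 4, 2]},
    index=[1990, 1991, 1992],
)

# transpose so each year is a column, then draw one subplot per year
axes = df.T.plot(kind='bar', subplots=True, layout=(1, 3), figsize=(9, 3),
                 legend=False, rot=0, ylim=(0, 10))

# axes is a 2D numpy array of Axes; tidy the spacing via the parent figure
axes.flat[0].figure.tight_layout()
```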
    
