Tags: python, pandas, dataframe, matplotlib, data-analysis

How to pick pairs of columns and plot them against each other in a pandas bar chart?


I have imported the required libraries:

import pandas as pd
import scipy.stats
import matplotlib.pyplot as plt
import psycopg2 as pg

I have a data frame as follows:

df = pd.DataFrame({'non_read_avg': [0.58], 'non_write_avg': [0.75], 'non_mat_avg':[0.45], 'non_rwm_avg':[0.14],
          'rel_read_avg': [0.68], 'rel_write_avg': [0.70], 'rel_mat_avg':[0.75], 'rel_rwm_avg':[0.34]})

I want to select pairs of columns from this data frame and plot them against each other:

df.plot(x=['non_read_avg','rel_read_avg', 'non_write_avg','rel_write_avg'], kind='bar')

I then want to label the first pair ('non_read_avg', 'rel_read_avg') as 'Reading' on the x axis and the second pair ('non_write_avg', 'rel_write_avg') as 'Writing'. Each pair should use two colors: one for 'religious' (say blue) and one for 'non-religious' (say green). I need to do this for every column in the data frame, so that the graph shows pairs of bars side by side, each pair with its own label. Is this possible?


Solution

  • It's a nice data-wrangling problem. My suggestion would be to organise the data differently to start with.

    df_rearanged = pd.DataFrame({
        'rel' : [0.68, 0.70, 0.75, 0.34],
        'non' : [0.58, 0.75, 0.45, 0.14]
        },index = ['read', 'write', 'mat', 'rwm']
    )
    

    This way, the row labels carry the subject and the column labels carry the group, so the dataframe holds all the information you need.
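
If you would rather not retype the numbers by hand, the same layout can probably be built from the original one-row frame. This is only a sketch: it assumes every column follows the `<group>_<subject>_avg` naming used in the question, and the names `subjects` and `reshaped` are made up for illustration.

```python
import pandas as pd

# The one-row frame from the question.
df = pd.DataFrame({'non_read_avg': [0.58], 'non_write_avg': [0.75], 'non_mat_avg': [0.45], 'non_rwm_avg': [0.14],
                   'rel_read_avg': [0.68], 'rel_write_avg': [0.70], 'rel_mat_avg': [0.75], 'rel_rwm_avg': [0.34]})

# Assumption: columns are named '<group>_<subject>_avg' with these four subjects.
subjects = ['read', 'write', 'mat', 'rwm']

# One row per subject, one column per group, values taken from the single row of df.
reshaped = pd.DataFrame(
    {
        'rel': [df[f'rel_{s}_avg'].iloc[0] for s in subjects],
        'non': [df[f'non_{s}_avg'].iloc[0] for s in subjects],
    },
    index=subjects,
)
print(reshaped)
```

The result has the same shape and values as the hand-written frame above, so it can be plotted the same way.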


    The plotting then becomes dead easy and gives exactly what you want: pandas draws one group of bars per row and one colored bar per column.

    df_rearanged.plot(kind='bar')
    plt.show()
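
The question also asks for readable x-axis labels and a blue/green color pair. One possible way to get there is to rename the index and columns before plotting and pass a `color` list. Treat this as a sketch: 'Reading' and 'Writing' come from the question, while 'Maths', 'RWM', and the y-axis label are guesses, and `labels` and `plot_df` are illustrative names.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same rearranged frame as in the answer above.
df_rearanged = pd.DataFrame(
    {'rel': [0.68, 0.70, 0.75, 0.34], 'non': [0.58, 0.75, 0.45, 0.14]},
    index=['read', 'write', 'mat', 'rwm'],
)

# 'Reading' and 'Writing' are from the question; 'Maths' and 'RWM' are assumed display names.
labels = {'read': 'Reading', 'write': 'Writing', 'mat': 'Maths', 'rwm': 'RWM'}

# Renamed row labels become the x-axis tick labels; renamed columns become the legend entries.
plot_df = df_rearanged.rename(index=labels, columns={'rel': 'religious', 'non': 'non-religious'})

# One color per column: blue for 'religious', green for 'non-religious'. rot=0 keeps the tick labels horizontal.
ax = plot_df.plot(kind='bar', color=['blue', 'green'], rot=0)
ax.set_ylabel('average')  # assumed axis label, based on the '_avg' column suffix
```

Calling `plt.show()` afterwards displays the figure, as in the snippet above.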
    


    I hope this helps