python pandas matplotlib seaborn facet-grid

How to keep the index when using pd.melt and merge to create a DataFrame for Seaborn and matplotlib


I am trying to draw subplots from two DataFrames (predicted and observed) that have exactly the same structure: the same columns and the same default index.

The code below creates a new index when the frames are melted with pd.melt and concatenated. As you can see in the figure, the index of the orange line changes from 0-5 to 6-11.

I was wondering if someone could fix the code below so that the orange line keeps the same index:

import pandas as pd 
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

actual = pd.DataFrame({'a': [5, 8, 9, 6, 7, 2],
                       'b': [89, 22, 44, 6, 44, 1]})
predicted = pd.DataFrame({'a': [7, 2, 13, 18, 20, 2],
                       'b': [9, 20, 4, 16, 40, 11]})

# Creating a tidy-dataframe to input under seaborn
merged = pd.concat([pd.melt(actual), pd.melt(predicted)]).reset_index()
merged['category'] = ''
merged.loc[:len(actual)*2,'category'] = 'actual'
merged.loc[len(actual)*2:,'category'] = 'predicted'

g = sns.FacetGrid(merged, col="category", hue="variable")
g.map(plt.plot, "index", "value", alpha=.7)
g.add_legend();

[Figure: output of the code above]


Solution

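  • The orange line ('variable' == 'b') doesn't have an index of 0-5 because of how you used melt. pd.melt stacks the columns on top of each other and assigns a fresh index, so in pd.melt(actual) the 'a' rows sit at 0-5 and the 'b' rows at 6-11, which, IIUC, is not what you are expecting.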

    Here is how I would rearrange the dataframe:

    merged = pd.concat([actual, predicted], keys=['actual', 'predicted'])
    merged.index.names = ['category', 'index']
    merged = merged.reset_index()
    merged = pd.melt(merged, id_vars=['category', 'index'], value_vars=['a', 'b'])
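
To check the rearrangement above, here is a minimal pandas-only sketch that reuses the data from the question. The names melted, b_positions and index_per_group are introduced here purely for illustration; the sketch first shows where melt places the 'b' rows, then confirms that each category/variable group keeps the original index after the rearrangement.

```python
import pandas as pd

actual = pd.DataFrame({'a': [5, 8, 9, 6, 7, 2],
                       'b': [89, 22, 44, 6, 44, 1]})
predicted = pd.DataFrame({'a': [7, 2, 13, 18, 20, 2],
                          'b': [9, 20, 4, 16, 40, 11]})

# melt() on its own discards the original row labels, so the 'b' rows
# end up at positions 6-11 of the melted frame.
melted = pd.melt(actual)
b_positions = melted.index[melted['variable'] == 'b'].tolist()
print(b_positions)

# Rearrangement from the answer: turn the row label into an 'index'
# column first, then melt with it as an id variable.
merged = pd.concat([actual, predicted], keys=['actual', 'predicted'])
merged.index.names = ['category', 'index']
merged = merged.reset_index()
merged = pd.melt(merged, id_vars=['category', 'index'], value_vars=['a', 'b'])

# Collect the 'index' values of every (category, variable) group.
index_per_group = merged.groupby(['category', 'variable'])['index'].apply(list)
print(index_per_group)
```

Since merged still has the columns category, index, variable and value, the FacetGrid and g.map calls from the question should work on it unchanged.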