Tags: dataframe, matplotlib, plot, scatter-plot, multi-index

Scatter plot from a multi-indexed DataFrame


I am trying to make a scatter plot from a multi-indexed DataFrame, but I don't understand how to specify the x-axis properly. The x-axis should be the index level containing the percentages for P1, P2 and P3, and the y-axis should be the 'ZCR' columns from S1 and S2.

Here is the structure of my multi-indexed DataFrame:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Create the structure of a new multi-indexed df

idx = pd.MultiIndex(
    levels=[['P1', 'P2', 'P3'], 
            ['20%', '40%', '60%', '80%', '90%', '100%']], 
    codes=[[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2], 
           [0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5,]])

cols = pd.MultiIndex(levels=[['S1', 'S2'], 
                             ['MF', 'FSD', 'RMS', 'PPF', 'NPF', 'pk-pk', 'CF', 'ZCR', 'MZCI', 'ZCISD']], 
                     codes=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 
                            [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])

df = pd.DataFrame(np.random.rand(18, 20),
                 columns = cols,
                 index = idx)

I managed to make a regular line plot, because the x-axis is taken automatically from the index:

df.loc[:, (slice(None), 'ZCR')].unstack(level=0).plot()
plt.show()

I have tried several variations with plt.scatter(), but couldn't get the desired output. Any help would be appreciated!


Solution

  • I'm not sure this is exactly what you want for the x-values, but you could use the 'xs' method, which selects a particular level in a MultiIndex.

    Using the df from your code above, the only new line is the call to the 'xs' method:

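    # Take the 'ZCR' column of every sensor (level 1 of the columns) and draw markers only
    df.xs('ZCR', axis=1, level=1).plot(marker='x', linestyle='None')
    plt.show()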
    

    Resulting chart: [image: ZCR scatter values]

    related post: pandas multiindex - how to select second level when using columns?
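
If an explicit plt.scatter() call is preferred over the marker-only line plot, the same 'xs' selection can feed it directly. The following is only a minimal sketch, assuming the percentages in index level 1 should be treated as numeric x-values, so the points for P1, P2 and P3 share the same six x-positions. The helper names make_example_df and scatter_zcr are hypothetical, and from_product is used here as a shorter way to build the same labels as in the question.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def make_example_df(seed=0):
    """Random frame with the same row and column labels as in the question."""
    idx = pd.MultiIndex.from_product(
        [["P1", "P2", "P3"], ["20%", "40%", "60%", "80%", "90%", "100%"]]
    )
    cols = pd.MultiIndex.from_product(
        [["S1", "S2"],
         ["MF", "FSD", "RMS", "PPF", "NPF", "pk-pk", "CF", "ZCR", "MZCI", "ZCISD"]]
    )
    rng = np.random.default_rng(seed)
    return pd.DataFrame(rng.random((18, 20)), index=idx, columns=cols)


def scatter_zcr(df, ax=None):
    """Scatter the 'ZCR' values of each sensor against the percentage level."""
    if ax is None:
        _, ax = plt.subplots()
    # Keep only 'ZCR'; the remaining columns are the sensors S1 and S2.
    zcr = df.xs("ZCR", axis=1, level=1)
    # Assumption: x should be numeric, so '20%' becomes 20.0, and so on.
    pct = zcr.index.get_level_values(1).str.rstrip("%").astype(float)
    for sensor in zcr.columns:
        ax.scatter(pct, zcr[sensor], label=sensor)
    ax.set_xlabel("%")
    ax.set_ylabel("ZCR")
    ax.legend()
    return ax
```

Calling scatter_zcr(make_example_df()) followed by plt.show() draws one colour per sensor. To tell P1, P2 and P3 apart as well, one option would be to loop over zcr.groupby(level=0) and pass a different marker per group.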