Search code examples
pythonpandaspandas-melt

How do I further melt horizontal values into vertical values?


I have a dataframe which has horizontal identifiers (yes and no) and values, and I want to melt it into vertical values into each yes. Here is a snippet of my dataframe:

option        Region                       Store Name       option1 option2   option3 option4    profit
0            Region 1                           Store 1        Y       Y          N        N     48.1575
1            Region 1                           Store 2        N       Y          N        Y     74.7667
2            Region 1                           Store 3        N       Y          N        Y     102.35
3            Region 2                           Store 4        N       Y          N        Y     114.59
4            Region 2                           Store 5        N       Y          N        Y     99.705
5            Region 2                           Store 6        N       Y          N        Y     105.07

The answer is need to get is:

option        Region                       Store Name       options    profit
0            Region 1                           Store 1     option1     48.1575
1            Region 1                           Store 1     option2     48.1575
2            Region 1                           Store 2     option2     74.7667
3            Region 1                           Store 2     option4     74.7667

Essentially, I need to unstack the customer options tables, assign the same profit to everything with a yes, and drop everything with a no.

So far, the function I used is:

e1 = pd.melt(sales_dist_e, id_vars=['Area', 'Store Name'], var_name='option').set_index(['Area', 'Store Name', 'optionx']).squeeze().unstack().reset_index() which was mostly derived from this previous related question, but I can't seem to make it work with my current example.


Solution

  • IIUC, does this work?

    df.melt(['option', 'Region', 'Store Name', 'profit'], var_name='options')\
      .query("value == 'Y'")\
      .drop('value', axis=1)\
      .sort_values('profit')
    

    Output:

        option   Region Store Name    profit  options
    0        0  Region1    Store 1   48.1575  option1
    6        0  Region1    Store 1   48.1575  option2
    7        1  Region1    Store 2   74.7667  option2
    19       1  Region1    Store 2   74.7667  option4
    10       4  Region2    Store 5   99.7050  option2
    22       4  Region2    Store 5   99.7050  option4
    8        2  Region1    Store 3  102.3500  option2
    20       2  Region1    Store 3  102.3500  option4
    11       5  Region2    Store 6  105.0700  option2
    23       5  Region2    Store 6  105.0700  option4
    9        3  Region2    Store 4  114.5900  option2
    21       3  Region2    Store 4  114.5900  option4