Search code examples
pythonpandasdataframenumpypandas-groupby

Pandas melt to copy values and insert new column


I have two dataframes like as shown below

test_id,status,revenue,cnt_days,age     
1,passed,234.54,3,21          
2,passed,543.21,5,29
11,failed,21.3,4,35
15,failed,2098.21,6,57             
51,passed,232,21,80     
75,failed,123.87,32,43

df1 = pd.read_clipboard(sep=',')


test_id,var_name,score,sign
1,revenue,10,pos
1,cnt_days,5,neg
1,age,15,pos
2,revenue,11,pos
2,cnt_days,3,neg
2,age,25,pos

df2 = pd.read_clipboard(sep=',')

I would like to do the below

a) Bring the value of variables revenue, cnt_days, age from df1 and store it in df2 under a new column var_value. We copy only these 3 variables because they are present under df2[var_name]

For ex: We have df1 column names stored as values in df2 under var_name.

Now, I would like to bring their values and store it under var_value for each matching test_id and corresponding column name.

I was trying something like below

out_df = df1.merge(df2,on='test_ids').melt(var_name='var_name')
out_df.drop_duplicates()

But this results in incorrect output.

I expect my output to be like as below

enter image description here


Solution

  • Because in ouput is not column status fors rmove it by drop, then use DataFrame.melt and add to df2 by left join in DataFrame.merge:

    out_df = df2.merge(df1.drop('status',1)
                          .melt('test_id', var_name='var_name', value_name='var_value'),
                       how='left')
    print (out_df)
       test_id  var_name  score sign  var_value
    0        1   revenue     10  pos     234.54
    1        1  cnt_days      5  neg       3.00
    2        1       age     15  pos      21.00
    3        2   revenue     11  pos     543.21
    4        2  cnt_days      3  neg       5.00
    5        2       age     25  pos      29.00
    

    If order of columns is important:

    out_df.insert(2, 'var_value', out_df.pop('var_value'))
    print (out_df)
       test_id  var_name  var_value  score sign
    0        1   revenue     234.54     10  pos
    1        1  cnt_days       3.00      5  neg
    2        1       age      21.00     15  pos
    3        2   revenue     543.21     11  pos
    4        2  cnt_days       5.00      3  neg
    5        2       age      29.00     25  pos