python pandas melt

Reference melt variable column to calculate another column in dataframe


Using python/pandas I have used the melt() function to transform my data

Person  Score1  Score2  V1  V2
A   1   4   6   8
B   2   5   3   6
C   3   6   4   7

into the form

  Person variable  value  V1  V2
0      A   Score1      1   6  8
1      B   Score1      2   3  6
2      C   Score1      3   4  7
3      A   Score2      4   6  8
4      B   Score2      5   3  6
5      C   Score2      6   4  7

I now want to add another column V where

V = V1 if variable == Score1, else V = V2 if variable == Score2

resulting in:

  Person variable  value  V
0      A   Score1      1  6
1      B   Score1      2  3
2      C   Score1      3  4
3      A   Score2      4  8
4      B   Score2      5  6
5      C   Score2      6  7

I tried using var_name to name the variable attribute, but it doesn't seem to really define it, so I'm struggling to use it to calculate the values for the V column. Any ideas?


Solution

  • use np.where

    import numpy as np
    
    # take V1 on the Score1 rows and V2 on every other row
    df['V'] = np.where(df['variable'] == 'Score1', df['V1'], df['V2'])
    
    # if you want to drop the V1 and V2 columns afterwards
    # df.drop(['V1', 'V2'], axis=1, inplace=True)
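
For anyone who wants to run this end to end, here is a minimal sketch built from the sample data in the question. The names wide and result are only illustrative, and the melt() call is an assumption about how the data was reshaped (the question does not show it); V1 and V2 are kept as id columns so they repeat on every melted row.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
wide = pd.DataFrame({
    "Person": ["A", "B", "C"],
    "Score1": [1, 2, 3],
    "Score2": [4, 5, 6],
    "V1": [6, 3, 4],
    "V2": [8, 6, 7],
})

# Assumed reshape: keep V1 and V2 as id columns so each melted row still carries them
df = wide.melt(id_vars=["Person", "V1", "V2"], value_vars=["Score1", "Score2"])

# Pick V1 on the Score1 rows and V2 on all remaining rows
df["V"] = np.where(df["variable"] == "Score1", df["V1"], df["V2"])

# Drop the helper columns, leaving Person, variable, value, V
result = df.drop(columns=["V1", "V2"])
print(result)
```

With only two score columns a single np.where is enough. If there were more (Score3 with V3, and so on), np.select or a lookup keyed on the variable name would probably scale better.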