python pandas pivot unpivot

Make pandas df in wide format and unconcatenate values to different columns


Sorry, I'm having a bit of trouble explaining the problem in the title.

By accident we pivoted our Pandas Dataframe to this:

df = pd.DataFrame(np.array([[1,1,2], [1,2,1], [2,1,2], [2,2,2],[3,1,3]]),columns=['id', '3s', 'score'])

id   3s  score
1    1   2
1    2   1
2    1   2
2    2   2
3    1   3

But we need to unstack this so that df looks like this again (the original version). Each value of '3s' 'unpivots' into its own set of 3 ordered columns holding 0s and 1s, with the 1s filled in from the left according to 'score'. So if we had '3s' = 2 with 'score' = 2, the values would be [1,1,0] (2 out of 3, in order) in columns ['4','5','6'] (the second set of 3) for the corresponding id.

df2 = pd.DataFrame(np.array([[1,1,1,0,1,0,0], [2,1,1,0,1,1,0], [3,1,1,1,np.nan,np.nan,np.nan] ]),columns=['id', '1', '2','3','4','5','6'])

id   1   2   3   4   5   6
1    1   1   0   1   0   0
2    1   1   0   1   1   0
3    1   1   1   NaN NaN NaN
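
To make the rule concrete, here is a small sketch of how a single ('3s', 'score') pair maps to its block of columns. The helper name `expand_row` and its `width` parameter are made up for illustration and are not part of our actual code.

```python
def expand_row(group, score, width=3):
    """Return {column_name: 0 or 1} for one ('3s', 'score') pair (hypothetical helper)."""
    # First column of this group's block: group 1 -> '1', group 2 -> '4', ...
    first = (group - 1) * width + 1
    # The first `score` columns of the block get 1, the rest get 0
    return {str(first + k): int(k < score) for k in range(width)}


print(expand_row(2, 2))  # {'4': 1, '5': 1, '6': 0}
```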

Any help greatly appreciated! (please save me)


Solution

  • This should do the trick:

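    # One block of 3 indicator columns per '3s' value (group 1 -> '1'..'3', group 2 -> '4'..'6')
    for gr in df.groupby('3s').groups:
        for i in range(1, 4):
            # 1 where the row belongs to this group and its score reaches position i
            df[str(i + (gr - 1) * 3)] = np.where(df['3s'].eq(gr) & df['score'].ge(i), 1, 0)
    # Collapse to one row per id; max keeps a 1 from whichever row set it
    df = df.drop(['3s', 'score'], axis=1).groupby('id').max().reset_index()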
    

    Output:

       id  1  2  3  4  5  6
    0   1  1  1  0  1  0  0
    1   2  1  1  0  1  1  0
    2   3  1  1  1  0  0  0
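
Note that this output has 0 in columns 4 to 6 for id 3, whereas the desired df2 in the question has NaN there, because id 3 has no row with '3s' = 2. If that distinction matters, one possible variant is sketched below. It assumes each (id, '3s') pair appears at most once, which `DataFrame.pivot` requires, and the names `wide`, `blocks` and `result` are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.array([[1, 1, 2], [1, 2, 1], [2, 1, 2], [2, 2, 2], [3, 1, 3]]),
    columns=['id', '3s', 'score'],
)

# One row per id, one column per '3s' value; missing (id, '3s') pairs become NaN
wide = df.pivot(index='id', columns='3s', values='score')

blocks = {}
for gr in wide.columns:
    score = wide[gr]
    for i in range(1, 4):
        col = str(i + (gr - 1) * 3)
        # 1.0 if score >= i, 0.0 otherwise, NaN where this id has no row for the group
        blocks[col] = score.ge(i).astype(float).where(score.notna())

result = pd.DataFrame(blocks, index=wide.index).reset_index()
print(result)
```

The indicator columns come out as floats here, since NaN cannot be stored in a plain integer column.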