sorry, I have a bit of a trouble explaining the problem in title
By accident we pivoted our Pandas Dataframe to this:
df = pd.DataFrame(np.array([[1,1,2], [1,2,1], [2,1,2], [2,2,2],[3,1,3]]),columns=['id', '3s', 'score'])
id 3s score
1 1 2
1 2 1
2 1 2
2 2 2
3 1 3
But we need to unstack this so df will look like this (the original version): The '3s' column 'unpivots' to the discrete set by 3 ordered columns with 0s and 1s, which add in order. So if we had '3s'= 2
with 'score'= 2
the values will be [1,1,0]
(2 out of 3 in order) in columns ['4','5','6']
(second set of 3s) for corresponding id
df2 = pd.DataFrame(np.array([[1,1,1,0,1,0,0], [2,1,1,0,1,1,0], [3,1,1,1,np.nan,np.nan,np.nan] ]),columns=['id', '1', '2','3','4','5','6'])
id 1 2 3 4 5 6
1 1 1 0 1 0 0
2 1 1 0 1 1 0
3 1 1 1
Any help greatly appreciated! (please save me)
This should do the trick:
for gr in df.groupby('3s').groups:
for i in range(1,4):
df[str(i+(gr-1)*3)]=np.where((df['3s'].eq(gr))&(df['score'].ge(i)), 1,0)
df=df.drop(['3s', 'score'], axis=1).groupby('id').max().reset_index()
Output:
id 1 2 3 4 5 6
0 1 1 1 0 1 0 0
1 2 1 1 0 1 1 0
2 3 1 1 1 0 0 0