I have a sample dataframe as given below.
import pandas as pd
import numpy as np
NaN = np.nan
data = {'ID':['A','A','A','A','A','A','A','A','A','C','C','C','C','C','C','C','C'],
'Week': ['Week1','Week1','Week1','Week1','Week2','Week2','Week2','Week2','Week3',
'Week1','Week1','Week1','Week1','Week2','Week2','Week2','Week2'],
'Risk':['High','','','','','','','','','High','','','','','','',''],
'Testing':[NaN,'Pos',NaN,'Neg',NaN,NaN,NaN,NaN,'Pos', NaN,
NaN,NaN,'Negative',NaN,NaN,NaN,'Positive'],
'CloseContact': [NaN, 'True', NaN, NaN, 'False',NaN, NaN, 'False', 'True',
NaN, NaN, 'False', NaN, 'True','True','False', NaN ]}
df1 = pd.DataFrame(data)
df1
Now, 2 columns have to be created CC1 and CC2. For each ID, for each week(important), CC1 will get the first non null value of 'CloseContact' column and CC2 will get the second non null value of 'CloseContact' column.
The final dataframe should shoild look like the image shown below.
Any help is greatly appreciated. Thank you.
Like your previous question:
mi = pd.MultiIndex.from_product([df1['ID'].unique(), df1['Week'].unique()],
names=['ID', 'Week'])
out = df1.loc[df1['CloseContact'].notna()] \
.groupby(['ID', 'Week'])['CloseContact'] \
.apply(lambda x: x.head(2).tolist()) \
.apply(pd.Series).rename(columns={0: 'CC1', 1: 'CC2'}) \
.reindex(mi).reset_index()
Output:
>>> out
ID Week CC1 CC2
0 A Week1 True NaN
1 A Week2 False False
2 A Week3 True NaN
3 C Week1 False NaN
4 C Week2 True True
5 C Week3 NaN NaN