I have a requirement where I need to convert the rows of a dataframe column to columns, however I am facing an issue after GROUPBY. Below is a set of 3 users that can have types between type1 to type6.
user_id1 type4
user_id1 type6
user_id1 type1
user_id1 type2
user_id1 type1
user_id1 type6
user_id2 type1
user_id2 type2
user_id2 type2
user_id2 type1
user_id2 type3
user_id2 type4
user_id2 type5
user_id2 type6
user_id2 type2
user_id2 type6
user_id3 type1
user_id3 type2
user_id3 type3
user_id3 type2
The output I am expecting is -
user_id type1 type2 type3 type4 type5 type6
user_id1 2 1 0 1 0 2
user_id2 2 3 1 1 1 2
user_id3 1 2 1 0 0 0
I tried to do a groupby on the type and got the count.But not sure how to convert to column especially the missing types should be populated with 0.
Thanks a lot for your time.
What you need to use is pivot_table from pandas. You can specify what rows and columns you need, fill_value states what do you want to do with empty values and aggfunc len counts.
I'm not sure what your DataSeries looks like, but you need sth like this:
pd.pivot_table(data, index='user_id', columns='type', aggfunc=len, fill_value=0)