Search code examples
pythonpython-3.xpandassklearn-pandas

Python - convert rows to columns after group by and populate zeroes for non matching rows


I have a requirement where I need to convert the rows of a dataframe column to columns, however I am facing an issue after GROUPBY. Below is a set of 3 users that can have types between type1 to type6.

user_id1    type4
user_id1    type6
user_id1    type1
user_id1    type2
user_id1    type1
user_id1    type6
user_id2    type1
user_id2    type2
user_id2    type2
user_id2    type1
user_id2    type3
user_id2    type4
user_id2    type5
user_id2    type6
user_id2    type2
user_id2    type6
user_id3    type1
user_id3    type2
user_id3    type3
user_id3    type2

The output I am expecting is -

user_id   type1 type2   type3   type4   type5   type6
user_id1    2    1       0       1       0       2
user_id2    2    3       1       1       1       2
user_id3    1    2       1       0       0       0

I tried to do a groupby on the type and got the count.But not sure how to convert to column especially the missing types should be populated with 0.

Thanks a lot for your time.


Solution

  • What you need to use is pivot_table from pandas. You can specify what rows and columns you need, fill_value states what do you want to do with empty values and aggfunc len counts.

    I'm not sure what your DataSeries looks like, but you need sth like this:

    pd.pivot_table(data, index='user_id', columns='type', aggfunc=len, fill_value=0)