Search code examples
pythonpandasdata-analysis

Python Pandas groupby multiple columns


thank you for your help.

I have data that looks like this:

city,  room_type
A, X
A, Y
A, Z
B, X
B, Y
B, Y

I want my end result to look like this:

city, count(X), count(Y), count(z) 
A,  1, 1, 1
B,  1, 2, 0

I am grouping by city and I want to show the count of each room_type in each city.

Any way to do this with python pandas? Thank you.

I learned SQL years ago and think that it may have been possible. I'm sure python can do the same. Thanks!


Solution

  • You can use crosstab with rename columns:

    df = pd.crosstab(df.city, df.room_type).rename(columns=lambda x: 'count({})'.format(x))
    print (df)
    room_type  count(X)  count(Y)  count(Z)
    city                                   
    A                 1         1         1
    B                 1         2         0
    

    Another solutions with groupby and size or value_counts, for reshape is used unstack:

    df = df.groupby(['city', 'room_type']).size().unstack(fill_value=0)
           .rename(columns=lambda x: 'count({})'.format(x))
    print (df)
    room_type  count(X)  count(Y)  count(Z)
    city                                   
    A                 1         1         1
    B                 1         2         0
    

    df = df.groupby('city')['room_type'].value_counts().unstack(fill_value=0)
           .rename(columns=lambda x: 'count({})'.format(x))
    print (df)
    room_type  count(X)  count(Y)  count(Z)
    city                                   
    A                 1         1         1
    B                 1         2         0