Search code examples
pythonpandasdataframepandas-groupbyaggregation

Aggregating a dataframe by a String column in pandas


I have a dataframe which looks like this:

dfB
name        value        country
benzene     spice        Australia
benzene     spice        Australia
benzene     spice        Australia
benzene     herbs        Australia
benzene     herbs        Americas
benzene     anise        Poland
methyl      herbs
methyl      herbs        Americas        
methyl      spice        Americas
alcohol     spice        Germany
alcohol     spice        Germany

I want to createa a different dataframe which is an aggregation of the country column, something like this:

dfB
name        value        country        count
benzene     spice        Australia      3
benzene     herbs        Australia      1
benzene     herbs        Americas       1
benzene     anise        Poland         1
methyl      herbs                       1
methyl      herbs        Americas       1 
methyl      spice        Americas       1
alcohol     spice        Germany        2

The idea is to aggregare the country column and create a count for the country column for each unique "name" and "value" combination. If there is a blank or Nan it should also he treated differently.

I tried using groupby:

grouped = dfB.groupby(["name", "value", "country"]).agg({"country": "count"})

but it does not seem to create the dataframe how I intend to. How can I do this?


Solution

  • Use value_counts or groupby to not modify order:

    out = dfB.value_counts(["name", "value", "country"], sort=False, dropna=False) \
             .rename('count').reset_index()
    out.loc[out['country'].isna(), 'count'] = 1
    
    out1 = dfB.groupby(["name", "value", "country"], sort=False, dropna=False) \
             .size().reset_index(name='count')
    out1.loc[out1['country'].isna(), 'count'] = 1
    
    >>> out
          name  value    country  count
    0  alcohol  spice    Germany      2
    1  benzene  anise     Poland      1
    2  benzene  herbs   Americas      1
    3  benzene  herbs  Australia      1
    4  benzene  spice  Australia      3
    5   methyl  herbs   Americas      1
    6   methyl  herbs        NaN      1
    7   methyl  spice   Americas      1
    
    >>> out1
          name  value    country  count
    0  benzene  spice  Australia      3
    1  benzene  herbs  Australia      1
    2  benzene  herbs   Americas      1
    3  benzene  anise     Poland      1
    4   methyl  herbs        NaN      1
    5   methyl  herbs   Americas      1
    6   methyl  spice   Americas      1
    7  alcohol  spice    Germany      2