Enums in a pandas DataFrame: is it not possible to group by an enum column?


I just learned about enums and thought that they would fit something I'm coding. But when I run this code, I get an error. Am I trying to do something I shouldn't be doing or is this a bug?

When I try to group by a column that contains enum members, I get this error: TypeError: '<' not supported between instances of 'CarBrand' and 'CarBrand'

The code:

import pandas as pd
from enum import Enum

class CarBrand(Enum):
    VOLVO = 'Volvo'
    BMW = 'BMW'

data = {
    'brand': [CarBrand.VOLVO,
              CarBrand.VOLVO, 
              CarBrand.BMW],
    'price': [35000, 
              37000, 
              45000]
}

df = pd.DataFrame(data)
sum_per_brand = df.groupby('brand').sum('price')
print(sum_per_brand)

This is the output I was expecting:

brand  price
BMW    45000
VOLVO  72000


Solution

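  • pd.DataFrame.groupby sorts the group keys by default, and sorting compares the keys with <, which plain Enum members do not support. That is where the TypeError comes from. It works if you skip the sort with sort=False (the groups then appear in order of first occurrence):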

    sum_per_brand = df.groupby('brand', sort=False).sum('price')
    

    Alternatively, you could use a datatype which supports sorting (like CategoricalDtype).
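
The categorical alternative could look roughly like the sketch below. It assumes you are fine with storing the enum member names (strings) in the column instead of the enum objects themselves. The name brand_dtype, the alphabetical category order, and the use of ['price'].sum() with observed=True are choices made for this illustration and are not part of the original answer.

```python
import pandas as pd
from enum import Enum


class CarBrand(Enum):
    VOLVO = 'Volvo'
    BMW = 'BMW'


df = pd.DataFrame({
    'brand': [CarBrand.VOLVO, CarBrand.VOLVO, CarBrand.BMW],
    'price': [35000, 37000, 45000],
})

# Assumption: an ordered categorical built from the member names,
# sorted alphabetically so that BMW comes before VOLVO.
brand_dtype = pd.CategoricalDtype(
    categories=sorted(member.name for member in CarBrand),
    ordered=True,
)

# Replace each enum member with its name, then cast to the categorical dtype.
df['brand'] = df['brand'].map(lambda member: member.name).astype(brand_dtype)

# Categoricals sort by category order, so the default sort=True works here.
# observed=True keeps only the categories that actually occur in the data.
sum_per_brand = df.groupby('brand', observed=True)['price'].sum()
print(sum_per_brand)
```

Because the categories are ordered, the groups come out as BMW and then VOLVO, which matches the order the question expected. If you need the enum member back, CarBrand['BMW'] looks it up by name.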