python pandas data-science categories

combining categories failure


I am trying to combine the heating type categories in my dataset, so that the ones that appear fewer than 2000 times are merged into "Other". However, when I execute the code, I keep getting this error: "Cannot do inplace boolean setting on mixed-types with a non np.nan value"

I tried the code this way:

heats = tidy_housing_cleaned['heatingType'].value_counts()
heating_mask = tidy_housing_cleaned.isin(heats[heats < 2000].index)
tidy_housing_cleaned[heating_mask] = 'Other'

Data: [image: heats]

Error: [image: error]

Has anyone seen this before?


Solution

  • TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value

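    I suppose you see this error because heating_mask is built from the whole DataFrame (tidy_housing_cleaned.isin(...)) and tidy_housing_cleaned has more than one column, so the assignment tries to write 'Other' into columns of mixed types. Restricting the operation to the heatingType column avoids it; loc, replace, and mask all work.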

    loc

    index = heating_mask[heating_mask['heatingType']].index
    tidy_housing_cleaned.loc[index,'heatingType'] = 'Other'
    

    replace

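    # assign the result back: inplace=True on a selected column may not update the DataFrame
    tidy_housing_cleaned['heatingType'] = tidy_housing_cleaned['heatingType'].replace(
        list(heats[heats < 2000].index),
        "Other")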
    

    mask

    # assumes every heatingType value appears in heats (no missing values)
    tidy_housing_cleaned['heatingType'] = tidy_housing_cleaned['heatingType'].mask(
        (heats[tidy_housing_cleaned['heatingType']] < 2000).values,
        other='Other')
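
To see the difference between the two kinds of mask, here is a minimal sketch on made-up data. The DataFrame `df`, its `rooms` column, and the lowered `threshold` of 2 are assumptions for illustration only and do not come from the original dataset. It compares the DataFrame-wide mask from the question with a column-level mask, then applies the `loc` approach.

```python
import pandas as pd

# Hypothetical toy data standing in for tidy_housing_cleaned; the threshold is
# lowered from 2000 to 2 so the effect is visible on a handful of rows.
df = pd.DataFrame({
    "heatingType": ["gas", "gas", "gas", "oil", "oil", "solar", "wood"],
    "rooms": [3, 2, 4, 5, 1, 2, 3],
})
threshold = 2

heats = df["heatingType"].value_counts()
rare = heats[heats < threshold].index  # categories seen fewer than `threshold` times

# DataFrame.isin returns a boolean DataFrame covering every column,
# which is what the failing assignment was indexing with.
frame_mask = df.isin(rare)

# Series.isin returns a boolean Series for the one column of interest.
column_mask = df["heatingType"].isin(rare)

# Assign through .loc so only 'heatingType' is modified.
df.loc[column_mask, "heatingType"] = "Other"

print(df["heatingType"].value_counts())
```

Building the mask from the column alone keeps the assignment away from the other columns, so the dtype of `rooms` never comes into play.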