python-3.x pandas data-cleaning

Categorise column based upon another column values




I have a DataFrame as follows:
col1 num agg_col
12  200   0
13  300   0
14  400   0
15  500   0
16  600   0
17  700   0

I am trying to populate agg_col based on the values in col1.
For instance, if col1 is 12-14, then populate agg_col with 1; if col1 is 15-16,
populate agg_col with 2; if col1 is 17, populate agg_col with 3.

I wrote the following Python code:

df.loc[(df['col1'] >= 12) & (df['col1'] <= 14), 'agg_col'] = 1


But I am stuck here and cannot proceed. Please help!


Solution

  • Take a look at pd.cut:

    pd.cut(df.col1,[0,14,16,17],labels=[1,2,3])
    Out[988]:
    0    1
    1    1
    2    1
    3    2
    4    2
    5    3
    Name: col1, dtype: category
    Categories (3, int64): [1 < 2 < 3]
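
To write the result back into agg_col, something like the following sketch may help. It rebuilds the sample frame from the question, and the bin edges [11, 14, 16, 17] are an assumption chosen to match the ranges 12-14, 15-16 and 17 (pd.cut bins are right-closed by default). The second half, using a hypothetical copy named alt, shows the same mapping with boolean masks and .loc, closer to the original attempt.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "col1": [12, 13, 14, 15, 16, 17],
    "num": [200, 300, 400, 500, 600, 700],
    "agg_col": 0,
})

# Right-closed bins: (11, 14] -> 1, (14, 16] -> 2, (16, 17] -> 3.
# Values outside the bins would become NaN, and astype(int) would then fail.
bins = [11, 14, 16, 17]
df["agg_col"] = pd.cut(df["col1"], bins=bins, labels=[1, 2, 3]).astype(int)

# Same mapping with boolean masks; Series.between is inclusive on both ends.
alt = df.copy()
alt["agg_col"] = 0
alt.loc[alt["col1"].between(12, 14), "agg_col"] = 1
alt.loc[alt["col1"].between(15, 16), "agg_col"] = 2
alt.loc[alt["col1"] == 17, "agg_col"] = 3

print(df)
```

pd.cut tends to scale better when there are many ranges, while the .loc version is easier to adapt when the conditions are not contiguous intervals.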