
Pandas: replace tuple-like values from pd.cut with an integer


I have a dataframe like the one shown below:

  count
  (1.386, 3.045]
  (1.386, 3.045]
  (0.692, 1.386]
  (1.386, 3.045]
  (1.386, 3.045]
  (1.386, 3.045]
  (1.386, 3.045]
  (0.692, 1.386]

I would like to create an integer label for each interval.

The dataframe above is the result of a pd.cut call like the one below:

pd.cut(t['count'],bins=p_breaks,labels=[1,2,3,4,5],include_lowest=True,duplicates='drop')

but it raised an error.

So I removed the labels argument and got output like the following:

  (1.386, 3.045]
  (1.386, 3.045]
  (0.692, 1.386]
  (1.386, 3.045]
  (1.386, 3.045]
  (1.386, 3.045]
  (1.386, 3.045]
  (0.692, 1.386]

Now I would like to replace these intervals with integers, so I tried the following:

t['count'].replace((0.692, 1.386),1)
t['count'].replace((1.386, 3.045),2)

I expect my output to look like this:

count
2
2
1
2
2
2
2
1

Solution

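  • There is no need to use replace. pd.cut returns a categorical column whose values are pd.Interval objects, not tuples, so a tuple passed to replace does not match them. Instead, use .cat.codes to get the zero-based ordinal position of each interval, and add 1 so the labels start at 1: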

    t['count'] = pd.cut(t['count'], bins=p_breaks, duplicates='drop', include_lowest=True).cat.codes + 1
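
As a minimal, self-contained sketch of the same idea: the question does not show the original values or the contents of p_breaks, so the data and bin edges below are made up purely to reproduce the intervals from the question. The column name label is also just an illustrative choice.

```python
import pandas as pd

# Hypothetical data and bin edges (assumptions, not the asker's real values).
t = pd.DataFrame({"count": [2.0, 3.0, 1.0, 2.5, 1.5, 2.9, 3.045, 0.8]})
p_breaks = [0.692, 1.386, 3.045]

# Without labels, pd.cut returns a categorical column of pd.Interval values.
binned = pd.cut(t["count"], bins=p_breaks, include_lowest=True, duplicates="drop")

# .cat.codes gives each interval's zero-based position among the categories;
# adding 1 makes the labels start at 1.
t["label"] = binned.cat.codes + 1

print(t)
```

One thing to keep in mind: values that fall outside every bin become NaN in the binned column, and .cat.codes reports them as -1, so after adding 1 they would show up as 0 rather than as a missing value.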