I want to split a set of numbers from 100 to 200 (both inclusive) into bins. The intervals I need are: [100, 135), [135, 160), [160, 175), [175, 190), [190, 200]. So far I have not found a function that does exactly this.
I have tried the pd.cut function with the right parameter set to False, but the resulting intervals were: [100, 135), [135, 160), [160, 175), [175, 190), [190, 200). The difference is that I need the last interval to include 200 (so [190, 200], not [190, 200)).
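The behaviour described above can be reproduced with a minimal sketch. The three sample values and the variable names are chosen only for illustration and are not part of the original data:

```python
import pandas as pd

bins = [100, 135, 160, 175, 190, 200]

# Illustrative values: the lower edge, a value inside the last bin, and the upper edge.
values = pd.Series([100.0, 199.9, 200.0])

# right=False makes every bin left-closed and right-open, including the last one,
# so a value exactly equal to the upper edge (200) falls outside all bins.
binned = pd.cut(values, bins=bins, right=False)
print(binned)
```

The last element comes back as NaN because 200 is excluded from the right-open final bin.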
Example
import pandas as pd
s = pd.Series(range(1000, 2004)).div(10)
s:
0 100.0
1 100.1
2 100.2
3 100.3
4 100.4
...
999 199.9
1000 200.0 <-- exactly 200
1001 200.1
1002 200.2
1003 200.3
Length: 1004, dtype: float64
Code
How about handling the case where the value is exactly 200 separately, by applying a boolean mask to the result of the pd.cut function?
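bins = [100, 135, 160, 175, 190, 200]
labels = ['[100, 135)', '[135, 160)', '[160, 175)', '[175, 190)', '[190, 200]']
# right=False leaves the last bin open on the right, so flag the values equal to 200
cond = s.eq(200)
# bin everything else, then assign the last label where the value is exactly 200
out = pd.cut(s, bins=bins, labels=labels, right=False).mask(cond, '[190, 200]')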
out:
0 [100, 135)
1 [100, 135)
2 [100, 135)
3 [100, 135)
4 [100, 135)
...
999 [190, 200]
1000 [190, 200] <-- exactly 200
1001 NaN
1002 NaN
1003 NaN
Length: 1004, dtype: category
Categories (5, object): ['[100, 135)' < '[135, 160)' < '[160, 175)' < '[175, 190)' < '[190, 200]']
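An alternative that may be worth considering, shown here only as a sketch and not as part of the answer above, is to move the last bin edge to the next representable float above 200 with np.nextafter. With right=False the final bin then covers 200 itself while still excluding 200.1 and above. The explicit labels are kept so that the adjusted edge does not show up in the category names.

```python
import numpy as np
import pandas as pd

s = pd.Series(range(1000, 2004)).div(10)

# Assumption: nudging the upper edge by one floating-point step is acceptable
# for this data. np.nextafter(200.0, np.inf) is the smallest float greater than 200.
upper = np.nextafter(200.0, np.inf)
bins = [100, 135, 160, 175, 190, upper]
labels = ['[100, 135)', '[135, 160)', '[160, 175)', '[175, 190)', '[190, 200]']

# Every bin is still left-closed and right-open, but the last one now ends
# just above 200, so 200 itself is included.
out = pd.cut(s, bins=bins, labels=labels, right=False)
print(out.iloc[998:1003])
```

This avoids the separate mask step, at the cost of relying on a floating-point detail, so the masking approach may be easier to read.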