python pandas dataframe range binning

Return the lower or upper bound of a range after binning in Python


I bin column A of the following DataFrame using pd.cut:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0,100,size=(5, 4)), columns=list('ABCD'))
print(df)
newDF = pd.cut(df.A, 2, precision=0)
print(newDF)

    A   B   C   D
0  83  43  99  85
1   6  57  44  45
2   5  72  10  53
3  24  50  23  18
4  75  25  96  27

0    (44.0, 83.0]
1     (5.0, 44.0]
2     (5.0, 44.0]
3     (5.0, 44.0]
4    (44.0, 83.0]

Is there any way to return only the lower bound or the upper bound of each range instead of the whole range? For the data above, the lower bounds would be:

0    44.0
1    5.0
2    5.0
3    5.0
4    44.0

Solution

  • Use Series.map:

    pd.cut(df.A, 2, precision=0).map(lambda x: x.left)
    

    or use pd.IntervalIndex:

    s = pd.cut(df.A, 2, precision=0)
    pd.Series(data=pd.IntervalIndex(s).left, index = s.index)
    

    #print(df)
    #
    #
    #    A   B   C   D
    #0  26  70  28   2
    #1  49  42  56  28
    #2  48  26  40  19
    #3   3  50  17   3
    #4  20  34  54  42
    #
    #
    #pd.cut(df.A, 2, precision=0).map(lambda x: x.left)
    #
    #0     3.0
    #1    26.0
    #2    26.0
    #3     3.0
    #4     3.0
    #Name: A, dtype: category
    #Categories (2, float64): [3.0 < 26.0]
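
The question also asks about the upper bound, and the map result above comes back with a category dtype. The following is a minimal sketch of how both edges might be pulled out as plain floats. It hard-codes column A from the question so the result is reproducible, and the names binned, intervals, bounds and upper_via_map are illustrative only, not part of the original answer.

```python
import pandas as pd

# Column A from the question, hard-coded instead of random so the run is repeatable.
df = pd.DataFrame({"A": [83, 6, 5, 24, 75]})

binned = pd.cut(df["A"], 2, precision=0)

# An IntervalIndex exposes the left and right edges of every interval.
intervals = pd.IntervalIndex(binned)
bounds = pd.DataFrame(
    {
        "lower": intervals.left.to_numpy(),
        "upper": intervals.right.to_numpy(),
    },
    index=binned.index,
)
print(bounds)

# The map-based variant works for the upper edge too; astype(float)
# converts the categorical result into an ordinary float Series.
upper_via_map = binned.map(lambda iv: iv.right).astype(float)
print(upper_via_map)
```

Note that with precision=0 the edges are the rounded values shown in the interval labels, so the lower column should line up with the output requested in the question.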