python pandas dataframe binning

Binning data in a pandas dataframe into intervals


I have a data frame that holds speed records (1 million entries). I need to bin them into specific intervals, given here by cw_bins. The code below does this, but to access each bin I have to change the index into cw_x by hand. I could write a for loop to save the bins into separate lists, but I would like to know whether pandas offers a better way.

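import pandas as pd

cw_bins = [14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0]
# group the rows by the speed interval they fall into
groups = filter_power.groupby(pd.cut(filter_power.Speed, cw_bins))
cw_x = list(groups)  # list of (interval, sub-frame) pairs
cw_y = cw_x[4]       # fifth (interval, sub-frame) pair
cw_z = cw_y[1]       # the rows belonging to that interval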

Solution

  • What about using numpy.digitize?

    import numpy as np

    cw_bins = [14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0]

    # create a column holding the bin index of each speed value
    # (0 = below 14.0, 7 = 20.0 or above)
    df['binned_values'] = np.digitize(df.Speed.values, bins=cw_bins)

    # for example, get all records with 16.0 <= Speed < 17.0
    df[df['binned_values'] == 3]
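To avoid picking bins one index at a time, the binned column can be split into every bin at once. The following is only a sketch under stated assumptions: the Speed values are made up for illustration, and the names frames_by_bin and by_interval are hypothetical. It also shows that pd.cut with right=False uses the same left-closed intervals as np.digitize, which is worth checking because pd.cut defaults to right-closed intervals.

```python
import numpy as np
import pandas as pd

# Made-up sample standing in for the real speed records (assumption).
df = pd.DataFrame(
    {"Speed": [13.5, 14.0, 14.2, 16.0, 16.9, 17.0, 19.99, 20.0, 25.0]}
)
cw_bins = [14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0]

# With the default right=False, index i means cw_bins[i-1] <= x < cw_bins[i];
# 0 is below the first edge and len(cw_bins) is at or above the last edge.
df["binned_values"] = np.digitize(df["Speed"].to_numpy(), bins=cw_bins)

# One sub-frame per bin index, built in a single pass over the groups.
frames_by_bin = {idx: frame for idx, frame in df.groupby("binned_values")}

# pandas-only variant: right=False gives the same left-closed intervals.
# Values outside the bin range become NaN and are left out of the groups.
df["interval"] = pd.cut(df["Speed"], bins=cw_bins, right=False)
by_interval = {
    interval: frame
    for interval, frame in df.groupby("interval", observed=True)
}

# All records with 16.0 <= Speed < 17.0, looked up both ways.
print(frames_by_bin[3])
print(by_interval[pd.Interval(16.0, 17.0, closed="left")])
```

Keying the dictionary by interval rather than by position means a lookup still finds the right bin when some bins contain no rows.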