Grouping values in a pandas column


I have a pandas DataFrame that contains a score column such as

score
0.1
0.15
0.2
0.3
0.35
0.4
0.5

etc

I want to group these values into buckets of width 0.2, rounding each score up to the top of its bucket: if a score is between 0.1 and 0.2 (inclusive), the value for that row in score will be 0.2; if it is above 0.2 and up to 0.4, the value will be 0.4.

So, for example, if the max score is 1, I will have 5 score buckets: 0.2, 0.4, 0.6, 0.8 and 1.

desired output:

score
0.2
0.2
0.2
0.4
0.4
0.4
0.6

Solution

  • You can first define a function that does the rounding for you:

    import numpy as np

    def custom_round(x, base):
        # Round x up to the nearest multiple of base (e.g. 0.3 -> 0.4 for base 0.2)
        return base * np.ceil(x / base)
    

    Then use .apply() to apply the function to your column:

    df.score.apply(lambda x: custom_round(x, base=.2))
    

    Output:

    0    0.2
    1    0.2
    2    0.2
    3    0.4
    4    0.4
    5    0.4
    6    0.6
    Name: score, dtype: float64
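
As a possible variation, the same rounding can be applied to the whole column at once instead of row by row with .apply(). The sketch below is only an illustration under a few assumptions: the helper name bucket_up and its decimals parameter are hypothetical (they are not part of pandas or NumPy), and the extra rounding steps are there to guard against floating-point noise such as 3 * 0.2 evaluating to 0.6000000000000001.

```python
import numpy as np
import pandas as pd


def bucket_up(scores: pd.Series, base: float = 0.2, decimals: int = 10) -> pd.Series:
    """Round every value up to the nearest multiple of `base` (hypothetical helper)."""
    # Round the ratio first so a value sitting exactly on a bucket edge is not
    # pushed into the next bucket by a tiny floating-point error.
    steps = np.ceil((scores / base).round(decimals))
    # Round the product as well to drop artifacts like 0.6000000000000001.
    return (steps * base).round(decimals)


# Sample data taken from the question.
df = pd.DataFrame({"score": [0.1, 0.15, 0.2, 0.3, 0.35, 0.4, 0.5]})
df["score_bucket"] = bucket_up(df["score"], base=0.2)
print(df)
```

On the sample data this should give the same buckets as the .apply() version. Working on the whole Series avoids a Python-level function call per row, which tends to matter on large frames.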