python pandas dataframe percentage

Percentage decrease based on column value


My dataframe looks like this:

question   timeSpent
a          5354
b          2344
c          2555
d          5200
e          3567

I want to add an extra column, Score, which contains values between 0 and 1. The greater the timeSpent (expressed in seconds) is, the closer to 0 the Score. If the time spent is smaller, then the Score approaches 1.

Suppose the Score is 1 if timeSpent is less than or equal to 2500. It then drops by 20% for every 100 seconds that pass. Once timeSpent reaches 5500 or more, the Score stays at 0.

So for 2600, the score would be 0.8, for 2700 the score would be 0.64 etc.

I wrote if-else statements for every interval, but I think there must be a quicker way to do it.


Solution

  • You can create a function that calculates the score and apply it to every timeSpent value:

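    import pandas as pd

    def get_score(num):
        # Full score at or below 2500 seconds; zero at 5500 seconds or more.
        if num <= 2500:
            return 1
        if num >= 5500:
            return 0
        # Multiply by 0.8 once per completed 100 seconds past 2500.
        x = 1
        for _ in range((num - 2500) // 100):
            x *= 0.8
        return x

    df = pd.DataFrame({'question': ['a', 'b', 'c', 'd', 'e'], 'timeSpent': [5354, 2344, 2555, 5200, 3567]})
    df['Score'] = df.timeSpent.apply(get_score)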
    

    Output:

      question  timeSpent     Score
    0        a       5354  0.001934
    1        b       2344  1.000000
    2        c       2555  1.000000
    3        d       5200  0.002418
    4        e       3567  0.107374
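
If the dataframe is large, the same rule can probably be written without a Python-level loop. The following is a minimal sketch, not part of the answer above: it assumes NumPy is available, and `score_vectorized` is a hypothetical helper name chosen for illustration. It counts the completed 100-second steps past 2500, raises 0.8 to that power, and forces the result to 0 from 5500 upward.

```python
import numpy as np
import pandas as pd


def score_vectorized(time_spent: pd.Series) -> pd.Series:
    """Hypothetical helper: the get_score rule applied to a whole column."""
    # Completed 100-second steps past 2500; values below 2500 are clipped to 0 steps.
    steps = ((time_spent - 2500) // 100).clip(lower=0)
    # 0.8 ** steps equals 1 at or below 2500; 5500 and above is forced to 0.
    scores = np.where(time_spent >= 5500, 0.0, 0.8 ** steps)
    return pd.Series(scores, index=time_spent.index, name="Score")


df = pd.DataFrame({
    "question": ["a", "b", "c", "d", "e"],
    "timeSpent": [5354, 2344, 2555, 5200, 3567],
})
df["Score"] = score_vectorized(df["timeSpent"])
```

This should produce the same Score column as the apply-based version, since both floor the elapsed time to whole 100-second steps before applying the 20% reduction.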