
How to access the index value in an apply/lambda function when the index is a PeriodIndex?


When I need the index value inside an apply/lambda combination, I use the name attribute of the row. With a PeriodIndex, however, this doesn't seem to work. In the code below, I try to compute the completion rate at each row, relative to the 4-hour period that the row falls in.

import pandas as pd

p4h = pd.period_range(start='2020-02-01 00:00', end='2020-02-04 00:00', freq='4h')
p1h = pd.period_range(start='2020-02-01 00:00', end='2020-02-04 00:00', freq='1h')

df = p1h.to_series()
p4h_st_as_series = p4h.start_time.to_series()

df['OpenPI'] = df.apply(lambda x:
                   p4h.to_series().loc[p4h_st_as_series.index <=
                             x.start_time].index[-1])

completion = df.apply(lambda row: ((row.name.end_time - row['OpenPI'].start_time)
                          /(row['OpenPI'].end_time - row['OpenPI'].start_time)))

Result:

>>> AttributeError: 'Period' object has no attribute 'name'

Does anyone have any idea what is going wrong?

Thanks for your help!


Solution

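  • Working code below. Two things were wrong. First, p1h.to_series() produces a Series, so apply passes each Period value to the lambda, and a Period has no name attribute. Second, axis=1 was missing. Building a DataFrame with to_frame() and calling apply with axis=1 passes each row as a Series whose name is that row's index label.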

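    import pandas as pd

    # Name the indexes so that to_frame() yields a named column.
    p4h = pd.period_range(start='2020-02-01 00:00', end='2020-02-04 00:00', freq='4h', name='p4h')
    p1h = pd.period_range(start='2020-02-01 00:00', end='2020-02-04 00:00', freq='1h', name='p1h')

    # Use a DataFrame (not a Series) so that apply(axis=1) passes whole rows.
    df = p1h.to_frame()
    p4h_st_as_series = p4h.start_time.to_series()

    # For each 1-hour row, pick the last 4-hour period starting at or before it.
    # With axis=1, x is a row (a Series) and x.name is its index label, a Period.
    df['OpenPI'] = df.apply(lambda x:
                   p4h.to_series().loc[p4h_st_as_series.index <=
                             x.name.start_time].index[-1], axis=1)

    # Fraction of the enclosing 4-hour period elapsed by the end of each row's hour.
    completion = df.apply(lambda row: ((row.name.end_time - row.OpenPI.start_time)
                          /(row.OpenPI.end_time - row.OpenPI.start_time)), axis=1)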
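
To make the difference between the two attempts concrete, here is a minimal sketch of how apply behaves on a Series versus on a DataFrame with axis=1. It uses a shortened three-period index, and the names s, series_error, row_labels and start_times are only illustrative; they do not appear in the question.

```python
import pandas as pd

p1h = pd.period_range(start='2020-02-01 00:00', periods=3, freq='1h', name='p1h')

# Series.apply calls the function once per element, so the lambda
# receives a bare Period, which has no `name` attribute.
s = p1h.to_series()
try:
    s.apply(lambda x: x.name)
    series_error = None
except AttributeError as exc:
    series_error = str(exc)

# DataFrame.apply(axis=1) calls the function once per row. Each row is a
# Series whose `name` is that row's index label (here, a Period).
df = p1h.to_frame()
row_labels = df.apply(lambda row: row.name, axis=1)
start_times = df.apply(lambda row: row.name.start_time, axis=1)

print(series_error)
print(row_labels)
print(start_times)
```

So the PeriodIndex itself is not the problem: row.name only exists when the lambda receives a row, which requires a DataFrame and axis=1.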